Similar literature
1.
Principal component analysis (PCA) was used to analyse the behaviour of a chromatographic separation as its scale increased. Three 4.6 mm diameter columns, identical in every respect except for column length (25, 15 and 5 cm), were used to generate the data from a test system based on the reversed-phase HPLC separation of crude erythromycin on a polystyrene matrix (PLRP 1000) having a particle diameter of 8 µm and a pore diameter of 100 nm. The species were separated with an isocratic solvent composed of 45/55 acetonitrile/water at about pH 7. An experimental design technique was used to investigate the effects of four process variables (load volume, load concentration, temperature and pH of buffer) on the chromatogram shapes. Following appropriate pre-processing of the chromatographic data, subsets of critical chromatograms were selected which sufficiently characterised the entire data set. From this subset, the corresponding runs were performed on the different sized columns and principal component models were generated for each. At 5 and 15 cm a single principal component was sufficient to characterise all the variance in the chromatograms which the range of process variables introduced, but at 25 cm two principal components were required, particularly to characterise the chromatograms with small loads. Excellent correlations were observed between the first principal components at the three scales. The possibility of predicting the separations on the 25 cm column from an analysis of the separations observed at 5 cm was investigated. The study revealed that good predictions could be made at high loads (>92%), but the model was not effective at low loads because of the need to incorporate a second principal component which was not defined by the range of variables applied to the 5 cm column.

2.
The work reported in this paper examines the use of principal component analysis (PCA), a technique of multivariate statistics, to facilitate the extraction of meaningful diagnostic information from a data set of chromatographic traces. Two data sets mimicking archived production records were analysed using PCA. In the first, a full-factorial experimental design approach was used to generate the data. In the second, the chromatograms were generated by adjusting just one of the process variables at a time. Database mining was achieved through the generation of both gross and disjoint principal component (PC) models. PCA provided easily interpretable 2-dimensional diagnostic plots revealing clusters of chromatograms obtained under similar operating conditions. PCA methods can be used to detect and diagnose changes in process conditions; however, the results show that a PCA model may require recalibration if an equipment change is made. We conclude that PCA methods may be useful for the diagnosis of subtle deviations from process specification not readily distinguishable to the operator.

3.
The use of univariate statistical techniques on multivariate electromyography data can fail to uncover important relationships between variables. Principal components analysis (PCA) is a multivariate statistical technique that can be used as a data exploration tool, both by classifying participants and simplifying data structures. Past research using this technique has focused on discriminating between "patients" and "normals". This investigation explored the use of PCA on electromyography data from healthy participants, with the objective of elucidating any between-participant differences in the multivariate patterns of muscle coactivation. Results indicated that, even between healthy participants, quantitative and qualitative differences in muscle coactivation patterns exist and that, in the context of the lower torso, a large portion (>70%) of the empirically determined muscle activation could be synthesized in a theoretical three-parameter control model.

4.
Metabolomics data are typically complex and high dimensional. Multivariate dimension-reducing techniques have thus been developed for analysing metabolomics data to disclose underlying relationships, with principal component analysis (PCA) as the technique most commonly applied. Despite its widespread use in metabolomics, PCA has shortcomings that limit its applicability. Several attempts have been made to overcome these limitations, and we describe an advanced disjoint PCA (DPCA) model, termed concurrent class analysis and abbreviated as CONCA. CONCA is a new model, and is unique in linking DPCA models to a traditional PCA model. This is accomplished by restructuring the input data matrix, applying DPCA group models to the restructured data, and combining the DPCA models in order to replicate a traditional PCA. We applied the CONCA model to a metabolomics data set on isovaleric acidaemia (IVA), a rare inherited metabolic disorder. The outcome showed that three of the variables with high discrimination value identified through the CONCA analysis are prominent organic acid biomarkers for IVA. Moreover, three further minor metabolites associated with the disease, and two as a consequence of treatment, were likewise identified as important discriminatory variables. The benefit of the CONCA model thus is its ability to disclose information concerning each individual group and to identify the variables important in discrimination (VIDs) which are also responsible for group separation.

5.
《Biosensors》1986,2(5):269-286
This paper describes the use of rapid chromatographic separation systems to monitor the level of specific proteins in various bioprocesses such as downstream processing and fermentation. In these monitoring systems, samples of the liquid are continuously extracted from the process and the proteins resolved from one another by a rapid chromatographic separation. The peak on the chromatogram corresponding to the protein of interest is identified and quantified to obtain on-line information on the level of that protein in the bioprocess. There are a number of advantages in using affinity separations as the rapid chromatographic principle. In particular, the use of immobilised monoclonal antibodies potentially allows a chromatographic sensor to be constructed for any protein against which a suitable antibody can be raised. The potential of this technique is illustrated with various examples, including measurement of the levels of monoclonal antibody in tissue culture supernatant using immobilised Protein A as the affinity adsorbent. A discussion of the inherent limitations of this type of protein biosensor is also included.

6.
The paper presents two analyses of the MALDI-TOF mass spectrometry dataset. Both analyses use the support vector machine as a tool to build a prediction model. The first analysis, which is our contribution to the competition, uses the given spectra data without further processing. In the second analysis, we employed an additional preprocessing step consisting of peak detection, peak alignment and feature selection based on statistical tests. The experimental results suggest that the preprocessing step with feature selection improves prediction accuracy.

8.
In a wide variety of biotechnological and medical applications it is necessary to separate different cell populations from one another. A promising approach to cell separations is demonstrated to be the adoption of chromatographic techniques conducted in expanded beds. The high voidage between the adsorbent beads in an expanded bed allows for the efficient capture of particulate entities such as cells together with washing and subsequent elution without entrapment and loss. In addition, the combination of a gentle hydrodynamic environment, a high surface area and low mixing within the expanded bed makes this technique highly favourable. A model system for the separation of two types of microbial cells using STREAMLINE DEAE adsorbent in expanded bed procedures has been investigated. The use of a less selective ligand such as an ion exchange group, which is often characterised by gentle elution procedures, has been investigated as an alternative to affinity ligands whose strong binding characteristics can result in harsh elution procedures with consequent loss of yield and cell viability. Expanded bed experiments have demonstrated selective and high capacity capture of cells from feedstocks containing either a single type of cell or a mixture of Saccharomyces cerevisiae and Escherichia coli cells. The capture, washing and elution phases of the separation have been studied with respect to capacity, selectivity and yield of released cells. In these procedures, separation of cell types is achieved by the presence of multiple equilibrium stages within the expanded bed. The results show the potential for carrying out cell separations in expanded beds as an alternative to immunomagnetic cell separations. The combination of these recently developed technologies promises to be a powerful yet economical technique for cell separations involving simple equipment that can readily be scaled up.

9.
Metabolic footprinting has been applied as a non-invasive approach to study the behaviour and responses of cultured cells to a range of genetic and environmental perturbations. Gas chromatography interfaced with time-of-flight mass spectrometry (GC-ToF-MS) has become a powerful tool for the analysis of metabolome-derived samples. Generally, two data analysis strategies are used to interrogate and understand the biological patterns within the multi-dimensional data. The first strategy, the more common one, uses multivariate analysis after chromatographic and mass spectral deconvolution, and the second strategy directly employs multivariate analysis of non-deconvoluted data. Here, the two strategies have been assessed for the separation and classification of metabolic footprints (exometabolomes) of two strains of Candida albicans grown on three different carbon sources (glycerol, glucose and galactose). We describe a semi-automated approach that simultaneously processes all samples using the chromatographic dimension data with principal components analysis (PCA), which can include data pre-processing before PCA. The preprocessed and non-deconvoluted total ion chromatogram (TIC) data showed good separation of classes defined by growth on different carbon sources; when comparing the two strains grown on the same carbon source, separation was achieved for strains grown on glucose and glycerol after preprocessing. The discrimination observed is greater for preprocessed and non-deconvoluted TIC data than for preprocessed and non-deconvoluted single ion chromatogram data. The results from the proposed approach were compared with those produced by MZmine. The results from MZmine data depicted separations in PCA space according to carbon source, but no separation was seen when studying strains grown on the same carbon source. Our research showed that the non-deconvoluted strategy is suitable for fast comparison of large sets of GC-MS data, although it will not directly provide biological information. The non-deconvoluted strategy can avoid the problems of analyzing complex samples using deconvolution software.

10.
This paper examines the selection of the appropriate representation of chromatogram data prior to using principal component analysis (PCA), a multivariate statistical technique, for the diagnosis of chromatogram data sets. The effects of four process variables (flow rate, temperature, loading concentration and loading volume) were investigated for a size exclusion chromatography system used to separate three components (monomer, dimer, trimer). The study showed that major positional shifts in the elution peaks that result when running the separation at different flow rates caused the effects of other variables to be masked if the PCA is performed using elapsed time as the comparative basis. Two alternative methods of representing the data in chromatograms are proposed. In the first, data were converted to a volumetric basis prior to performing the PCA, while in the second, having made this transformation, the data were adjusted to account for the total material loaded during each separation. Two datasets were analysed to demonstrate the approaches. The results show that, by appropriate selection of the basis prior to the analysis, significantly greater process insight can be gained from the PCA, and they demonstrate the importance of pre-processing prior to such analysis.
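
The change of basis described in this abstract can be illustrated with a minimal sketch. It assumes a constant flow rate within each run, so that elution volume is simply flow rate times elapsed time, and it assumes that the total material loaded equals loading concentration times loading volume. The function names, the units and the linear resampling onto a shared volume grid are illustrative choices, not details taken from the paper.

```python
from bisect import bisect_right


def to_volume_basis(times_min, signal, flow_ml_per_min):
    """Re-express a trace on an elution-volume axis (assumes constant flow)."""
    volumes = [t * flow_ml_per_min for t in times_min]
    return volumes, list(signal)


def normalise_by_load(signal, load_conc_mg_per_ml, load_volume_ml):
    """Scale a trace by the total mass loaded (assumed: concentration x volume)."""
    total_load_mg = load_conc_mg_per_ml * load_volume_ml
    return [s / total_load_mg for s in signal]


def resample(volumes, signal, grid):
    """Linearly interpolate a trace onto a shared volume grid.

    Grid points outside the measured range are clamped to the end values.
    """
    out = []
    for v in grid:
        if v <= volumes[0]:
            out.append(signal[0])
        elif v >= volumes[-1]:
            out.append(signal[-1])
        else:
            i = bisect_right(volumes, v) - 1
            frac = (v - volumes[i]) / (volumes[i + 1] - volumes[i])
            out.append(signal[i] + frac * (signal[i + 1] - signal[i]))
    return out


# Two hypothetical runs of the same separation at different flow rates.
slow_times = [0.0, 1.0, 2.0, 3.0, 4.0]    # minutes, at 1 mL/min
slow_signal = [0.0, 0.0, 1.0, 0.0, 0.0]   # peak elutes at 2 min
fast_times = [0.0, 0.5, 1.0, 1.5, 2.0]    # minutes, at 2 mL/min
fast_signal = [0.0, 0.0, 1.0, 0.0, 0.0]   # peak elutes at 1 min

grid = [0.0, 1.0, 2.0, 3.0, 4.0]          # shared axis in mL
slow_v, slow_s = to_volume_basis(slow_times, slow_signal, 1.0)
fast_v, fast_s = to_volume_basis(fast_times, fast_signal, 2.0)
slow_on_grid = resample(slow_v, slow_s, grid)
fast_on_grid = resample(fast_v, fast_s, grid)
```

On a time axis the two peaks sit at different positions, whereas on the volume axis both appear at 2 mL, so the rows passed to a PCA would no longer be dominated by the flow-rate shift. `normalise_by_load` would be applied to each resampled trace for the second representation.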

11.
Gay S  Binz PA  Hochstrasser DF  Appel RD 《Proteomics》2002,2(10):1374-1391
Matrix-assisted laser desorption/ionization-time of flight mass spectrometry has become a valuable tool in proteomics. With the increasing acquisition rate of mass spectrometers, one of the major issues is the development of accurate, efficient and automatic peptide mass fingerprinting (PMF) identification tools. Current tools are mostly based on counting the number of experimental peptide masses matching with theoretical masses. Almost all of them use additional criteria such as isoelectric point, molecular weight, PTMs, taxonomy or enzymatic cleavage rules to enhance prediction performance. However, these identification tools seldom use peak intensities as a parameter, as there is currently no model predicting the intensities based on the physicochemical properties of peptides. In this work, we used standard data mining methods such as classification and regression methods to find correlations between peak intensities and the properties of the peptides composing a PMF spectrum. These methods were applied to a dataset comprising a series of PMF experiments involving 157 proteins. We found that the C4.5 method gave the most informative results for the classification task (prediction of the presence or absence of a peptide in a spectrum) and M5' for the regression task (prediction of the normalized intensity of a peptide peak). The C4.5 result correctly classified 88% of the theoretical peaks, whereas the M5' peak intensities had a correlation coefficient of 0.6743 with the experimental peak intensities. These methods enabled us to obtain decision and model trees that can be directly used for prediction and identification of PMF results. This work lays the foundations of a method to analyze factors influencing the peak intensity of PMF spectra. A simple extension of this analysis using a larger dataset could improve the accuracy of the results. Additional peptide characteristics or even PMF experimental parameters can also be taken into account in the data mining process to analyze their influence on the peak intensity. Furthermore, this data mining approach can certainly be extended to the tandem mass spectrometry domain or other mass spectrometry derived methods.

12.
An optimization framework based on the use of hybrid models is presented for preparative chromatographic processes. The first step in the hybrid model strategy involves the experimental determination of the parameters of the physical model, which consists of the full general rate model coupled with the kinetic form of the steric mass action isotherm. These parameters are then used to carry out a set of simulations with the physical model to obtain data on the functional relationship between various objective functions and decision variables. The resulting data is then used to estimate the parameters for neural-network-based empirical models. These empirical models are developed in order to enable the exploration of a wide variety of different design scenarios without any additional computational requirements. The resulting empirical models are then used with a sequential quadratic programming optimization algorithm to maximize the objective function, production rate times yield (in the presence of solubility and purity constraints), for binary and tertiary model protein systems. The use of hybrid empirical models to represent complex preparative chromatographic systems significantly reduces the computational time required for simulation and optimization. In addition, it allows both multivariable optimization and rapid exploration of different scenarios for optimal design.

13.
Logistic regression is often used to help make medical decisions with binary outcomes. Here we evaluate the use of several methods for selection of variables in logistic regression. We use a large dataset to predict the diagnosis of myocardial infarction in patients reporting to an emergency room with chest pain. Our results indicate that some of the examined methods are well suited for variable selection in logistic regression and that our model, and our myocardial infarction risk calculator, can be an additional tool to aid physicians in myocardial infarction diagnosis.

14.
Yao I  Sugiura Y  Matsumoto M  Setou M 《Proteomics》2008,8(18):3692-3701
Imaging MS is emerging as a useful tool for proteomic analysis. We utilized this technique to analyze gene knockout (KO) mice in addition to traditional 2-DE analysis. The Scrapper-knockout (SCR-KO) mouse brain showed two types of neurodegenerative pathologies: spongiform neurodegeneration and shrinkage of neuronal cells. 2-DE analysis of the whole brain lysates of SCR-KO mice indicated slight changes in annexin A6, Rap1 GTPase, and glyoxalase domain containing four spots, while most of the main components did not show significant changes. By imaging MS analysis based on principal component analysis (PCA), we could find numerous alterations in the KO mouse brain. Furthermore, we could also obtain positional information for all of the altered substances at once. PCA provides information about which molecules in tissue microdomains are altered and is helpful in analyzing large imaging MS datasets, while exact identification of each molecule from peaks in MALDI imaging MS may require additional analyses such as MS/MS. Direct imaging with PCA is a powerful tool to perform in situ proteomics and will lead to novel findings. Our study shows that imaging MS yields information complementary to conventional 2-DE analysis.

15.
A quickly growing number of characteristics reflecting various aspects of gene function and evolution can be either measured experimentally or computed from DNA and protein sequences. The study of pairwise correlations between such quantitative genomic variables, as well as collective analysis of their interrelations by multidimensional methods, has delivered crucial insights into the processes of molecular evolution. Here, we present a principal component analysis (PCA) of 16 genomic variables from Saccharomyces cerevisiae, the largest data set analyzed so far. Because many missing values and potential outliers hinder the direct calculation of principal components, we introduce the application of Bayesian PCA. We confirm some of the previously established correlations, such as evolutionary rate versus protein expression, and reveal new correlations such as those between translational efficiency, phosphorylation density, and protein age. Although the first principal component primarily contrasts genomic change and protein expression, the second component separates variables related to gene existence and expressed protein functions. Enrichment analysis on genes affecting variable correlations unveils classes of influential genes. For example, although ribosomal and nuclear transport genes make important contributions to the correlation between protein isoelectric point and molecular weight, protein synthesis and amino acid metabolism genes help cause the lack of significant correlation between propensity for gene loss and protein age. We present the novel Quagmire database (Quantitative Genomics Resource), which allows exploring relationships between more genomic variables in three model organisms: Escherichia coli, S. cerevisiae, and Homo sapiens (http://webclu.bio.wzw.tum.de:18080/quagmire).

16.
Tidepools experience significant gradients in ecologically relevant physical variables along the transition from ocean to terrestrial habitat (vertical axis) and from open coast to inner bays (horizontal axis). Associations amongst physical and biological variables, divided into algal, invertebrate and vertebrate (fish) groups, were examined in a tidepool survey dataset. Physical variables and the three biological groups were submitted separately to a principal component analysis (PCA). PCA scores were evaluated with Pearson correlation coefficients across the sampling units (tidepools) to identify significant correlations. Initially, little structure in the data and no correlation amongst variables was present. At the onset of summer, correlations were confined to physical variables and algal and invertebrate components, followed in the late summer by correlations between invertebrate and fish components. By the fall, correlations were confined to fish and algal/invertebrate components. Species relationships followed a seasonal cycle with a succession from little to no structure, the forming of low trophic level relationships in the early summer to high trophic level relationships in late summer-fall, and deconstruction of structure with the onset of fall-winter storms and ice scour. The seasonal pattern, and the well-established vertical gradient, has nested within it species composition changes along a horizontal wave energy gradient. The horizontal gradient results in a shift from species which are physiologically adapted to extreme salinities and temperatures to those which are physically adapted to high wave-energy environments.
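
The step in which PCA scores from separate variable groups are compared across tidepools can be sketched as follows. This is only an illustration: the score values are made-up numbers, the variable names are hypothetical, and the significance testing mentioned in the abstract is not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical first-component scores for five tidepools (made-up numbers),
# one score vector per separately analysed variable group.
physical_pc1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
algal_pc1 = [-4.0, -2.0, 0.0, 2.0, 4.0]
fish_pc1 = [2.0, 1.0, 0.0, -1.0, -2.0]

r_physical_algal = pearson(physical_pc1, algal_pc1)
r_physical_fish = pearson(physical_pc1, fish_pc1)
```

Each score vector has one entry per sampling unit, so the coefficient measures how closely the dominant pattern in one group tracks the dominant pattern in another across the same tidepools.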

17.
The development of purification processes for protein biopharmaceuticals is challenging due to compressed development timelines, long experimental times, and the need to survey a large parameter space. Typical methods for development of a chromatography step evaluate several dozen chromatographic column runs to optimize the conditions. An efficient batch-binding method of screening chromatographic purification conditions in a 96-well format with a robotic liquid-handling system is described and evaluated. The system dispenses slurries of chromatographic resins into filter plates, which are then equilibrated, loaded with protein, washed and eluted. This paper evaluates factors influencing the performance of this high-throughput screening technique, including the reproducibility of the aliquotted resin volume, the contact time of the solution and resin during mixing, and the volume of liquid carried over in the resin bed after centrifugal evacuation. These factors led to the optimization of a batch-binding technique utilizing either 50 or 100 µL of resin in each well, the selection of an industrially relevant incubation time of 20 min, and the quantitation of the hold-up volume, which was as much as one quarter of the total volume added to each well. The results from the batch-binding method compared favorably to chromatographic column separation steps for a cGMP protein purification process utilizing both hydrophobic interaction and anion-exchange steps. These high-throughput screening tools can be combined with additional studies on the kinetics and thermodynamics of protein-resin interactions to provide fundamental information which is useful for defining and optimizing chromatographic separation steps.

18.
《Chirality》2017,29(5):202-212
The screening of a number of chiral stationary phases (CSPs) with different modifiers in supercritical fluid chromatography to find a chromatographic method for separation of enantiomers can be time‐consuming. Computational methods for data analysis were utilized to establish a hierarchical screening strategy, using a dataset of 110 drug‐like chiral compounds with diverse structures tested on 15 CSPs with two different modifiers. This dataset was analyzed using a combinatorial algorithm, principal component analysis (PCA), and a correlation matrix. The primary goal was to find a set of eight columns resolving a large number of compounds, but also having complementary enantioselective properties. In addition to the hereby defined hierarchical experimental strategy, quantitative structure enantioselective models (QSERs) were evaluated. The diverse chemical space and relatively limited size of the training set reduced the accuracy of the QSERs. However, including separation factors from other CSPs increased the accuracies of the QSERs substantially. Hence, such combined models can support the experimental strategy in prioritizing the CSPs of the second screening phase, when a compound is not separated by the primary set of columns.

19.
20.
Utilization of novel biologically-derived biomaterials in bioprosthetic heart valves (BHV) requires robust constitutive models to predict the mechanical behavior under generalized loading states. Thus, it is necessary to perform rigorous experimentation involving all functional deformations to obtain both the form and material constants of a strain-energy density function. In this study, we generated a comprehensive experimental biaxial mechanical dataset that included high in-plane shear stresses using glutaraldehyde treated bovine pericardium (GLBP) as the representative BHV biomaterial. Compared to our previous study (Sacks, JBME, v.121, pp. 551-555, 1999), GLBP demonstrated a substantially different response under high shear strains. This finding was underscored by the inability of the standard Fung model, applied successfully in our previous GLBP study, to fit the high-shear data. To develop an appropriate constitutive model, we utilized an interpolation technique for the pseudo-elastic response to guide modification of the final model form. An eight parameter modified Fung model utilizing additional quartic terms was developed, which fitted the complete dataset well. Model parameters were also constrained to satisfy physical plausibility of the strain energy function. The results of this study underscore the limited predictive ability of current soft tissue models, and the need to collect experimental data for soft tissue simulations over the complete functional range.
