Similar articles
Found 20 similar articles; search time: 296 ms
1.
Mass spectrometry (MS) is a technique used in biological studies that associates a spectrum with a biological sample. A spectrum consists of pairs of values (intensity, m/z), where intensity measures the abundance of biomolecules (such as proteins) with a given mass-to-charge ratio (m/z) present in the originating sample. In proteomics experiments, MS spectra are used to identify expression patterns in clinical samples that may be responsible for diseases. Recently, to improve the identification of peptides/proteins related to these patterns, the MS/MS process has been used, which consists of performing a cascade of mass spectrometric analyses on selected peaks. This technique has been demonstrated to improve the identification and quantification of proteins/peptides in samples. Nevertheless, MS analysis deals with a huge amount of data, often affected by noise, and thus requires automatic data management systems. Tools have been developed, most of the time supplied with the instruments, that allow: (i) spectra analysis and visualization, (ii) pattern recognition, (iii) protein database querying, and (iv) peptide/protein quantification and identification. Currently, most of the tools supporting these phases need to be optimized to improve the identification of proteins and their functions. In this article we survey applications that support spectrometrists and biologists in obtaining information from biological samples, analyzing the available software for the different phases. We consider different mass spectrometry techniques, and thus different requirements. We focus on tools for (i) data preprocessing, which prepares results obtained from spectrometers for analysis; (ii) spectra analysis, representation and mining, aimed at identifying common and/or hidden patterns in sets of spectra or at classifying data; (iii) database querying to identify peptides; and (iv) improving and boosting the identification and quantification of selected peaks. We trace some open problems and report on requirements that represent new challenges for bioinformatics.

2.
A high-throughput software pipeline for analyzing high-performance mass spectral data sets has been developed to facilitate rapid and accurate biomarker determination. The software exploits the mass precision and resolution of high-performance instrumentation, bypasses peak-finding steps, and instead uses discrete m/z data points to identify putative biomarkers. The technique is insensitive to peak shape, and works on overlapping and non-Gaussian peaks, which can confound peak-finding algorithms. Methods are presented to assess data set quality and the suitability of groups of m/z values that map to peaks as potential biomarkers. The algorithm is demonstrated with serum mass spectra from patients with and without ovarian cancer. Biomarker candidates are identified and ranked by their ability to discriminate between cancer and noncancer conditions. Their discriminating power is tested by classifying unknowns using a simple distance calculation, and a sensitivity of 95.6% and a specificity of 97.1% are obtained. In contrast, the sensitivity of the ovarian cancer blood marker CA125 is approximately 50% for stage I/II and approximately 80% for stage III/IV cancers. While the generalizability of these markers is currently unknown, we have demonstrated the ability of our analytical package to extract biomarker candidates from high-performance mass spectral data.
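
The classification step described above (unknowns assigned by a simple distance calculation, then scored by sensitivity and specificity) can be illustrated with a minimal Python sketch. It assumes a nearest-centroid rule with Euclidean distance, which the abstract does not specify; the function names and the toy intensities are hypothetical and are not taken from the study.

```python
import math

def centroid(rows):
    """Mean intensity vector of a group of spectra (one row per spectrum)."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def classify(sample, centroids):
    """Assign the label of the closest centroid (Euclidean distance is an assumption)."""
    return min(centroids, key=lambda label: math.dist(sample, centroids[label]))

def sensitivity_specificity(truth, predicted, positive="cancer"):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical intensities at two candidate m/z values for each training spectrum.
training = {
    "cancer": [[5.0, 1.0], [6.0, 1.5]],
    "noncancer": [[1.0, 4.0], [1.5, 5.0]],
}
centroids = {label: centroid(rows) for label, rows in training.items()}

unknowns = [[5.5, 1.2], [1.2, 4.4], [4.8, 2.0]]
truth = ["cancer", "noncancer", "cancer"]
predicted = [classify(sample, centroids) for sample in unknowns]
sensitivity, specificity = sensitivity_specificity(truth, predicted)
```

Both measures are computed per class, so a marker can score well on one and poorly on the other, as the CA125 comparison in the abstract suggests.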

3.
This report describes the development of a method to detect the waterborne pathogen Aeromonas using matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS). The genus Aeromonas is one of several medically significant genera that have gained prominence due to their evolving taxonomy and controversial role in human diseases. In this study, MALDI-MS was applied to the characterization of seventeen species of Aeromonas. These seventeen species were represented by thirty-two strains, which included type, reference and clinical isolates. Intact cells from each strain were used to generate a reproducible library of protein mass spectral fingerprints or m/z signatures. Under the test conditions used, peak lists of the mass ions observed in each species revealed that three mass ions were conserved among all seventeen species tested. These common mass ions, with average m/z values of 6301, 12,160 or 12,254, and 13,450, could potentially be used as genus-specific biomarkers to identify Aeromonas in unknown samples. A dendrogram generated using the m/z signatures of all the strains tested indicated that the mass spectral data contained sufficient information to distinguish between genera, species, and strains. There are several advantages of using MALDI-MS based protein mass spectral fingerprinting of whole cells for the identification of microorganisms as well as for their differentiation at the sub-species level: (1) the capability to detect proteins, (2) high throughput, and (3) relatively simple sample preparation techniques. The accuracy and speed with which data can be obtained make MALDI-MS a powerful tool especially suited for environmental monitoring and detection of biological hazards.

4.
For our analysis of the data from the First Annual Proteomics Data Mining Conference, we attempted to discriminate between 24 disease spectra (group A) and 17 normal spectra (group B). First, we processed the raw spectra by (i) correcting for additive sinusoidal noise (periodic on the time scale) affecting most spectra, (ii) correcting for the overall baseline level, (iii) normalizing, (iv) recombining fractions, and (v) using variable-width windows for data reduction. Also, we identified a set of polymeric peaks (at multiples of 180.6 Da) that is present in several normal spectra (B1-B8). After data processing, we found the intensities at the following mass-to-charge (m/z) values to be useful discriminators: 3077, 12 886 and 74 263. Using these values, we were able to achieve an overall classification accuracy of 38/41 (92.6%). Perfect classification could be achieved by adding two additional peaks, at 2476 and 6955. We identified these values by applying a genetic algorithm to a filtered list of m/z values using Mahalanobis distance between the group means as a fitness function.
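
The fitness function mentioned above, the Mahalanobis distance between the two group means, can be sketched in plain Python. This is an illustration only: it assumes a pooled within-group covariance matrix (the abstract does not say which covariance estimate was used), and all function names and the toy data are hypothetical. The genetic algorithm itself is not shown.

```python
import math

def mean_vector(rows):
    """Column-wise mean of a list of equal-length rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def pooled_covariance(group_a, group_b):
    """Pooled within-group covariance (an assumed choice of estimator)."""
    dim = len(group_a[0])
    cov = [[0.0] * dim for _ in range(dim)]
    for rows in (group_a, group_b):
        mean = mean_vector(rows)
        for row in rows:
            dev = [x - m for x, m in zip(row, mean)]
            for i in range(dim):
                for j in range(dim):
                    cov[i][j] += dev[i] * dev[j]
    dof = len(group_a) + len(group_b) - 2
    return [[value / dof for value in row] for row in cov]

def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def mahalanobis_between_means(group_a, group_b):
    """sqrt(d^T S^-1 d), where d is the difference of the group means."""
    diff = [a - b for a, b in zip(mean_vector(group_a), mean_vector(group_b))]
    weighted = solve(pooled_covariance(group_a, group_b), diff)
    return math.sqrt(sum(d * w for d, w in zip(diff, weighted)))

# Hypothetical intensities at two selected m/z values for each spectrum.
group_a = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
group_b = [[3.0, 4.0], [5.0, 4.0], [3.0, 6.0], [5.0, 6.0]]
distance = mahalanobis_between_means(group_a, group_b)
```

A larger distance means the chosen m/z values separate the groups better relative to their within-group spread, which is why it can serve as a fitness score when searching over subsets of m/z values.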

5.
Peak detection is a key step in the analysis of SELDI-TOF-MS spectra, but the current default method has low specificity and poor peak annotation. To improve data quality, scientists still have to validate the identified peaks visually, a tedious and time-consuming process, especially for large data sets. Hence, there is a genuine need for methods that minimize manual validation. We have previously reported a multi-spectral signal detection method, called RS for 'region of significance', with improved specificity. Here we extend it to include a peak quantification algorithm based on annotated regions of significance (ARS). For each spectral region flagged as significant by RS, we first identify a dominant spectrum for determining the number of peaks and the m/z region of these peaks. From each m/z region of peaks, a peak template is extracted from all spectra via principal component analysis. Finally, with the template, we estimate the amplitude and location of the peak in each spectrum with the least-squares method and refine the estimation of the amplitude via a mixture model. We have evaluated the ARS algorithm on patient samples from a clinical study. Comparison with the standard method shows that ARS (i) inherits the superior specificity of RS, and (ii) gives more accurate peak annotations than the standard method. In conclusion, we find that ARS alleviates the main problems in the preprocessing of SELDI-TOF spectra. The R-package ProSpect that implements ARS is freely available for academic use at http://www.meb.ki.se/ yudpaw.
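
The least-squares step described above (estimating the amplitude and location of a peak from a template) can be illustrated with a simplified Python sketch. It is not the ARS algorithm: it assumes the template is already known, models the region as a single scaled copy of the template with zero signal elsewhere, and omits the principal component and mixture-model stages. The function name and data are hypothetical.

```python
def fit_peak(spectrum, template):
    """Return (offset, amplitude) of the scaled template that best fits the spectrum.

    For a fixed offset, the least-squares amplitude is <window, template> / <template, template>.
    The offset is chosen to minimize the squared error over the whole spectrum.
    """
    template_energy = sum(t * t for t in template)
    total_energy = sum(s * s for s in spectrum)
    best = None
    for offset in range(len(spectrum) - len(template) + 1):
        window = spectrum[offset:offset + len(template)]
        amplitude = sum(s * t for s, t in zip(window, template)) / template_energy
        # Residual of the whole spectrum once amplitude * template is subtracted.
        residual = total_energy - amplitude * amplitude * template_energy
        if best is None or residual < best[0]:
            best = (residual, offset, amplitude)
    return best[1], best[2]

# Hypothetical peak template and one spectrum region containing a scaled copy of it.
template = [1.0, 2.0, 1.0]
spectrum = [0.0, 0.0, 0.0, 3.0, 6.0, 3.0, 0.0, 0.0]
offset, amplitude = fit_peak(spectrum, template)
```

Sharing one template across spectra means each spectrum contributes only two free parameters per peak, which is what makes the per-spectrum estimates comparable.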

6.
Protein profiling in blood serum by fractionation and MS analysis has been applied in mice to assess its applicability as a fast, economical alternative to current DNA and RNA analyses for diagnosis of neuromuscular disorders. Mass spectra of peptides and proteins were generated using serum from dystrophin-deficient mdx and control mice by WCX ClinProt bead fractionation, followed by MALDI-MS. Double cross-validatory linear discriminant and logistic regression data analysis methods were compared with a new Bayesian logistic regression method. These were evaluated on their ability to discriminate between healthy and dystrophic samples, and to identify the discriminatory peaks in the mass spectra. All three approaches classified the spectra with comparable misclassification rates (between 18.4 and 20.6%), with much overlap among the differential peaks identified by the methods. The differential peak pattern from the Bayesian method was sparser and easier to interpret than those from the other two methods, without compromising classifying strength. One of the two main differentiating peaks, at m/z 3908, was identified as an N-terminal peptide of coagulation Factor XIIIa, previously identified in human serum. This work underlines the translational aspect of serum protein profiling in mice and supports a further study with serum from patients with neuromuscular disorders.

7.
One of the important challenges for MALDI imaging mass spectrometry (MALDI-IMS) is the unambiguous identification of measured analytes. One way to do this is to match tryptic peptide MALDI-IMS m/z values with LC-MS/MS identified m/z values. Matching using current MALDI-TOF/TOF MS instruments is difficult due to the variability of in situ time-of-flight (TOF) m/z measurements. This variability is currently addressed using external calibration, which limits achievable mass accuracy for MALDI-IMS and makes it difficult to match these data to downstream LC-MS/MS results. To overcome this challenge, the work presented here details a method for internally calibrating data sets generated from tryptic peptide MALDI-IMS on formalin-fixed paraffin-embedded sections of ovarian cancer. By calibrating all spectra to internal peak features, the m/z error for matches made between MALDI-IMS m/z values and LC-MS/MS identified peptide m/z values was significantly reduced. This improvement was confirmed by follow-up matching of LC-MS/MS spectra to in situ MS/MS spectra from the same m/z peak features. The sum of the data presented here indicates that internal calibrants should be a standard component of tryptic peptide MALDI-IMS experiments.
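
The idea of internal calibration and its effect on m/z matching error can be sketched as follows. The abstract does not state the calibration model, so this example assumes a simple linear correction fitted by least squares to internal calibrant peaks; the function names and all m/z values are hypothetical.

```python
def ppm_error(observed, reference):
    """Mass error in parts per million relative to the reference m/z."""
    return (observed - reference) / reference * 1e6

def fit_linear_calibration(observed, reference):
    """Least-squares slope and intercept mapping observed m/z to reference m/z."""
    n = len(observed)
    mean_x = sum(observed) / n
    mean_y = sum(reference) / n
    sxx = sum((x - mean_x) ** 2 for x in observed)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(observed, reference))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def apply_calibration(mz, slope, intercept):
    """Correct a measured m/z value with the fitted linear model."""
    return slope * mz + intercept

# Hypothetical internal calibrant peaks: known m/z values and what the instrument reported.
reference = [1000.0, 1500.0, 2000.0]
observed = [1000.15, 1500.20, 2000.25]

slope, intercept = fit_linear_calibration(observed, reference)
before = [ppm_error(o, r) for o, r in zip(observed, reference)]
after = [ppm_error(apply_calibration(o, slope, intercept), r)
         for o, r in zip(observed, reference)]
```

In practice the calibrants are peaks already present in every spectrum, so each spectrum can be corrected individually before its m/z values are matched against LC-MS/MS identifications.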

8.
We addressed the problem of discriminating between 24 diseased and 17 healthy specimens on the basis of protein mass spectra. To prepare the data, we performed mass-to-charge ratio (m/z) normalization, baseline elimination, and conversion of absolute peak height measures to height ratios. After preprocessing, the major difficulty encountered was the extremely large number of variables (1676 m/z values) versus the number of examples (41). Dimensionality reduction was treated as an integral part of the classification process; variable selection was coupled with model construction in a single ten-fold cross-validation loop. We explored different experimental setups involving two peak height representations, two variable selection methods, and six induction algorithms, all on both the original 1676-mass data set and on a prescreened 124-mass data set. Highest predictive accuracies (1-2 off-sample misclassifications) were achieved by a multilayer perceptron and Naïve Bayes, with the latter displaying more consistent performance (hence greater reliability) over varying experimental conditions. We attempted to identify the most discriminant peaks (proteins) on the basis of scores assigned by the two variable selection methods and by neural network based sensitivity analysis. These three scoring schemes consistently ranked four peaks as the most relevant discriminators: 11683, 1403, 17350 and 66107.

9.
Wagner M, Naik D, Pothen A. Proteomics, 2003, 3(9): 1692-1698
We report our results in classifying protein matrix-assisted laser desorption/ionization-time of flight mass spectra obtained from serum samples into diseased and healthy groups. We discuss in detail five of the steps in preprocessing the mass spectral data for biomarker discovery, as well as our criterion for choosing a small set of peaks for classifying the samples. Cross-validation studies with four selected proteins yielded misclassification rates in the 10-15% range for all the classification methods. Three of these proteins or protein fragments are down-regulated and one up-regulated in lung cancer, the disease under consideration in this data set. When cross-validation studies are performed, care must be taken to ensure that the test set does not influence the choice of the peaks used in the classification. Misclassification rates are lower when both the training and test sets are used to select the peaks used in classification versus when only the training set is used. This expectation was validated for various statistical discrimination methods when thirteen peaks were used in cross-validation studies. One particular classification method, a linear support vector machine, exhibited especially robust performance when the number of peaks was varied from four to thirteen, and when the peaks were selected from the training set alone. Experiments with the samples randomly assigned to the two classes confirmed that misclassification rates were significantly higher in such cases than those observed with the true data. This indicates that our findings are indeed significant. We found closely matching masses in a database for protein expression in lung cancer for three of the four proteins we used to classify lung cancer. Data from additional samples, increased experience with the performance of various preprocessing techniques, and affirmation of the biological roles of the proteins that help in classification, will strengthen our conclusions in the future.  
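
The caution above, that the test set must not influence the choice of peaks, can be made concrete with a small Python sketch. It assumes leave-one-out cross-validation, a peak ranking based on the absolute difference of class means, and a nearest-centroid classifier; none of these are the methods used in the paper, and all names and data are hypothetical. The key point is that `select_peaks` is called inside the loop, on the training fold only.

```python
import math

def class_means(samples, labels):
    """Mean intensity vector for each class label."""
    means = {}
    for label in set(labels):
        rows = [s for s, l in zip(samples, labels) if l == label]
        means[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return means

def select_peaks(samples, labels, n_peaks):
    """Rank peaks by the absolute difference between the two class means."""
    first, second = class_means(samples, labels).values()
    scores = [abs(a - b) for a, b in zip(first, second)]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:n_peaks]

def loo_error(samples, labels, n_peaks):
    """Leave-one-out misclassification rate with peak selection inside each fold."""
    errors = 0
    for held_out in range(len(samples)):
        train_x = samples[:held_out] + samples[held_out + 1:]
        train_y = labels[:held_out] + labels[held_out + 1:]
        # Peaks are chosen without looking at the held-out sample.
        peaks = select_peaks(train_x, train_y, n_peaks)
        means = class_means([[row[i] for i in peaks] for row in train_x], train_y)
        test = [samples[held_out][i] for i in peaks]
        predicted = min(means, key=lambda label: math.dist(test, means[label]))
        errors += predicted != labels[held_out]
    return errors / len(samples)

# Hypothetical peak intensities: only the first peak separates the two groups.
samples = [[10.0, 1.0, 5.0], [11.0, 2.0, 5.0], [9.0, 1.0, 6.0],
           [1.0, 2.0, 5.0], [2.0, 1.0, 6.0], [0.0, 2.0, 5.0]]
labels = ["diseased", "diseased", "diseased", "healthy", "healthy", "healthy"]
error_rate = loo_error(samples, labels, n_peaks=1)
```

Moving the `select_peaks` call outside the loop, so that it sees every sample, would let the held-out sample influence the peak choice and tends to produce optimistically low error estimates.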

10.
AIMS: Some species of Candida have been shown to differ with respect to their polar lipid fingerprints when analysed by fast atom bombardment mass spectrometry (FABMS). The aims of this study were to contribute to the existing body of information by (i) examining representatives of species not previously examined and (ii) seeking strain differences associated with country of origin (UK or Iran). METHODS AND RESULTS: FABMS analysis was performed on extracted lipids of 22 strains representing eight species of Candida. The most abundant anion in the spectra (19 isolates) had a mass-to-charge ratio (m/z) of 281, corresponding to C18:1 carboxylate. The major phospholipid analogue anions were m/z 515 and 501 (13 strains). These anions were putatively identified as the phosphatidyl molecular species PA(23:2) and PA(22:2), respectively. Data for strain pairs were compared using Pearson's coefficient of linear correlation. The values generated were used to cluster strains by nearest-neighbour linkage, using both carboxylate and phospholipid analogue anion data. Isolates of C. parapsilosis were clearly distinct from other isolates. Iranian isolates tended to cluster together when phospholipid anion data were used. However, if carboxylate anion data were used, four Iranian isolates of C. albicans were tightly clustered with three UK isolates, of which two were C. albicans and one was C. dubliniensis. CONCLUSION: It is concluded that both lower and higher mass peaks in FABMS spectra can be of potential value in comparing Candida isolates from different countries and from different species. SIGNIFICANCE AND IMPACT OF THE STUDY: When polar lipids of different Candida species are compared, it is important to bear in mind that geographical differences affect results, as has been observed with bacteria in similar studies.

11.
Data reduction of isotope-resolved LC-MS spectra   Cited by: 1 (self-citations: 1, citations by others: 0)
MOTIVATION: Data reduction of liquid chromatography-mass spectrometry (LC-MS) spectra can be a challenge due to the inherent complexity of biological samples, noise and non-flat baseline. We present a new algorithm, LCMS-2D, for reliable data reduction of LC-MS proteomics data. RESULTS: LCMS-2D can reliably reduce LC-MS spectra with multiple scans to a list of elution peaks, and subsequently to a list of peptide masses. It is capable of noise removal, and of deconvoluting peaks that overlap in m/z, in retention time, or both, by using a novel iterative peak-picking step, a 'rescue' step, and a modified variable selection method. LCMS-2D performs well with three sets of annotated LC-MS spectra, yielding results that are better than those from PepList, msInspect and the vendor software BioAnalyst. AVAILABILITY: The software LCMS-2D is available under the GNU General Public License from http://www.bioc.aecom.yu.edu/labs/angellab/ as a standalone C program running on Linux.

12.
Current molecular methods to characterize microalgae are time-intensive and expensive. Matrix Assisted Laser Desorption/Ionization Time-of-Flight Mass Spectrometry (MALDI-TOF MS) may represent a rapid and economical alternative approach. The objectives of this study were to determine whether MALDI-TOF MS can be used to: 1) differentiate microalgae at the species and strain levels and 2) characterize simple microalgal mixtures. A common protein extraction sample preparation method was used to facilitate rapid mass spectrometry-based analysis of 31 microalgae. Each yielded spectra containing between 6 and 56 peaks in the m/z 2,000 to 20,000 range. The taxonomic resolution of this approach appeared higher than that of 18S rDNA sequence analysis. For example, two strains of Scenedesmus acutus differed only by two 18S rDNA nucleotides, but yielded distinct MALDI-TOF mass spectra. Mixtures of two and three microalgae yielded relatively complex spectra that contained peaks associated with members of each mixture. Interestingly, though, mixture-specific peaks were observed at m/z 11,048 and 11,230. Our results suggest that MALDI-TOF MS affords rapid characterization of individual microalgae and simple microalgal mixtures.

13.
Biomarkers have the potential to impact a wide range of public health concerns, including early detection of diseases, drug discovery, and improved accuracy of monitoring effects of interventions. Given new technological developments, broad-based screening approaches will likely advance biomarker discovery at an accelerated pace. Matrix-assisted laser desorption/ionization-time of flight mass spectrometry (MALDI-TOF MS) allows for the elucidation of individual protein masses from a complex mixture with high throughput. We have developed a method for identifying serum biomarkers using MALDI-TOF and statistical analysis. However, before applying this approach to screening of complex diseases, we evaluated the approach in a controlled dietary intervention study. In this study, MALDI-TOF spectra were generated using samples from a randomized controlled trial. During separate feeding periods, 38 participants ate a basal diet devoid of fruits and vegetables and a basal diet supplemented with cruciferous (broccoli) family vegetables. Serum samples were obtained at the end of each 7-day feeding period and treated to remove large, abundant proteins. MALDI-TOF spectra were analyzed using peak picking algorithms and logistic regression models. Our bioinformatics methods identified two significant peaks at m/z values of 2740 and 1847 that could classify participants based on diet (basal vs. cruciferous) with 76% accuracy. The 2740 m/z peak was identified as the B-chain of alpha 2-HS glycoprotein, a serum protein previously found to vary with diet and be involved in insulin resistance and immune function.

14.
Hexadecadien-1-ol and its derivatives (acetate and aldehyde) with a conjugated diene system have recently been identified from a pheromone gland extract of the persimmon fruit moth (Stathmopoda masinissa), a pest insect of persimmon fruits distributed in East Asia. The alcohol and acetate showed their base peaks at m/z 79 in a GC-MS analysis by electron impact ionization, but the aldehyde produced a unique base peak at m/z 84, suggesting a 4,6-diene structure. To confirm this inference, four geometrical isomers of each 4,6-hexadecadienyl compound were synthesized by two different routes in which one of two double bonds was furnished in a highly stereoselective manner. Separation of the two isomers synthesized together by each route was readily accomplished by preparative HPLC. Their mass spectra coincided well with those of natural components, indicating that they were available for use as authentic standards for determining the configuration of the natural pheromone. Furthermore, other hexadecadienyl compounds, including the conjugated diene system between the 3- and 10-positions, were synthesized to accumulate the spectral data of pheromone candidates. 5,7-Hexadecadienal interestingly showed the base peak at m/z 80; meanwhile, the base peaks of its alcohol and acetate were detected at m/z 79 like the corresponding 4,6-dienes. The base peaks of all 6,8-, 7,9-, and 8,10-dienes universally appeared at m/z 67 like 9,11-, 10,12-, and 13,15-dienes, the spectra of which have already been published. Although 3,5-hexadecadienal was not prepared, base peaks at m/z 67 and 79 were recorded for the alcohol and acetate, respectively.

15.
In the last two years, because of advances in protein separation and mass spectrometry, top-down mass spectrometry moved from analyzing single proteins to analyzing complex samples and identifying hundreds and even thousands of proteins. However, computational tools for database search of top-down spectra against protein databases are still in their infancy. We describe MS-Align+, a fast algorithm for top-down protein identification based on spectral alignment that enables searches for unexpected post-translational modifications. We also propose a method for evaluating statistical significance of top-down protein identifications and further benchmark various software tools on two top-down data sets from Saccharomyces cerevisiae and Salmonella typhimurium. We demonstrate that MS-Align+ significantly increases the number of identified spectra as compared with MASCOT and OMSSA on both data sets. Although MS-Align+ and ProSightPC have similar performance on the Salmonella typhimurium data set, MS-Align+ outperforms ProSightPC on the (more complex) Saccharomyces cerevisiae data set.

16.
Time-of-flight secondary ion mass spectrometry (TOF-SIMS) is capable of chemically visualizing proteins on insulated samples. Distribution of an immobilized probe protein, fluorescent-labeled protein A immobilized on a glass plate, and that of a sample protein, immunoglobulin G (IgG) in solution, reacting with protein A on the biosensor surface, were evaluated with TOF-SIMS (TFS-2100, Physical Electronics). TOF-SIMS spectra and images of the protein on the glass plates were obtained, and "mutual information", as defined by information theory, was employed to analyze the TOF-SIMS spectra of proteins. Fragment ions from protein A and IgG were distinguished by the mutual, reinforcing information, and fragment ions specific to each protein were selected to obtain the TOF-SIMS image of the protein. It is evident from the TOF-SIMS images of each protein that protein A was immobilized on the substrate homogeneously and that the reaction between the immobilized protein A and IgG is not localized in this condition. Chemical images of the proteins by TOF-SIMS will contribute to a better understanding of the reaction on the biosensor surface, and thus will help the development of more sophisticated biosensors. In addition, the requisite chemical conditions as well as the interaction between the biosensor surface and the immobilized proteins were investigated by TOF-SIMS by means of sets of reinforcing, mutually supportive information.

17.
Whole-cell matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (whole-cell MALDI-TOF MS) has been widely adopted as a useful technology in the identification and typing of microorganisms. This study employed whole-cell MALDI-TOF MS to identify and differentiate wild-type and mutants containing constructed single gene mutations of Burkholderia pseudomallei, a pathogenic bacterium causing melioidosis in both humans and animals. Candidate biomarkers for the B. pseudomallei mutants, including rpoS, ppk, and bpsI isolates, were determined. Taxon-specific and clinical isolate-specific biomarkers of B. pseudomallei were consistently found and conserved across all average mass spectra. Cluster analysis of MALDI spectra of all isolates exhibited separate distribution. A total of twelve potential mass peaks discriminating between wild-type and mutant isolates were identified using ClinProTools analysis. Two peaks (m/z 2721 and 2748 Da) were specific for the rpoS isolate, three (m/z 3150, 3378, and 7994 Da) for ppk, and seven (m/z 3420, 3520, 3587, 3688, 4623, 4708, and 5450 Da) for bpsI. Our findings demonstrated that the rapid, accurate, and reproducible mass profiling technology could have new implications in laboratory-based rapid differentiation of extensive libraries of genetically altered bacteria.

18.

Background

Acute lymphoblastic leukemia (ALL) is a common form of cancer in children. Currently, bone marrow biopsy is used for diagnosis. Noninvasive biomarkers for the early diagnosis of pediatric ALL are urgently needed. The aim of this study was to discover potential protein biomarkers for pediatric ALL.

Methods

Ninety-four pediatric ALL patients and 84 controls were randomly divided into a "training" set (45 ALL patients, 34 healthy controls) and a test set (49 ALL patients, 30 healthy controls and 30 pediatric acute myeloid leukemia (AML) patients). Serum proteomic profiles were measured using surface-enhanced laser desorption/ionization-time-of-flight mass spectrometry (SELDI-TOF-MS). A classification model was established by Biomarker Pattern Software (BPS). Candidate protein biomarkers were purified by HPLC, identified by LC-MS/MS and validated using ProteinChip immunoassays.

Results

A total of 7 protein peaks (9290 m/z, 7769 m/z, 15110 m/z, 7564 m/z, 4469 m/z, 8937 m/z, 8137 m/z) were found with differential expression levels in the sera of pediatric ALL patients and controls using SELDI-TOF-MS and then analyzed by BPS to construct a classification model in the "training" set. The sensitivity and specificity of the model were found to be 91.8% and 90.0%, respectively, in the test set. Two candidate protein peaks (7769 and 9290 m/z) were found to be down-regulated in ALL patients; these were identified as platelet factor 4 (PF4) and pro-platelet basic protein precursor (PBP). Two other candidate protein peaks (8137 and 8937 m/z) were found to be up-regulated in the sera of ALL patients, and these were identified as fragments of the complement component 3a (C3a).

Conclusion

Platelet factor 4 (PF4), connective tissue activating peptide III (CTAP-III) and two fragments of C3a may be potential protein biomarkers of pediatric ALL and may be used to distinguish pediatric ALL patients from healthy controls and pediatric AML patients. Further studies with additional populations or using pre-diagnostic sera are needed to confirm the importance of these findings as diagnostic markers of pediatric ALL.

19.
Nuclear magnetic resonance (NMR) spectroscopy is widely used in metabonomics studies, but optimal recovery of latent biological information requires increasingly sophisticated statistical methods to identify quantitative relationships within these often highly complex data sets. Statistical heterospectroscopy (SHY) extracts latent relationships between NMR and mass spectrometry (MS) data from the same samples. Here we extend this concept to identify novel metabolic correlations between different biofluids and tissues from the same individuals. We acquired NMR data from blood plasma and cerebrospinal fluid (CSF) (N = 19) from HIV-1-infected individuals, who are known to be susceptible to neuropsychological dysfunction. We compared two computational approaches to SHY, namely Pearson's product moment correlation and Spearman's rank correlation. High correlations were observed for glutamine, valine, and polyethylene glycol, a drug delivery vehicle. Orthogonal projections to latent structures (OPLS) identified metabolites in blood plasma spectra that predicted the amounts of key CSF metabolites such as lactate, glutamine, and myo-inositol. Finally, brain metabolic data from magnetic resonance spectroscopy (MRS) measurements in vivo were integrated with CSF data to identify an association between 3-hydroxyvalerate and frontal white matter N-acetyl aspartate levels. The results underscore the utility of tools such as SHY and OPLS for coanalysis of high dimensional data sets to recover biological information unobtainable when such data are analyzed in isolation.
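
The two correlation measures compared above differ in what they detect: Pearson's coefficient measures linear association, while Spearman's coefficient is Pearson's coefficient applied to ranks and so captures any monotonic association. The following plain-Python sketch illustrates the difference; the function names and the two intensity series are hypothetical and unrelated to the study's data.

```python
import math

def pearson(x, y):
    """Pearson's product moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            result[order[k]] = mean_rank
        start = end + 1
    return result

def spearman(x, y):
    """Spearman's rank correlation: Pearson's coefficient computed on ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical signal intensities of one resonance in plasma and in CSF across five subjects.
plasma = [1.0, 2.0, 3.0, 4.0, 5.0]
csf = [1.0, 4.0, 9.0, 16.0, 25.0]
r_pearson = pearson(plasma, csf)
r_spearman = spearman(plasma, csf)
```

Here the relationship is monotonic but not linear, so the rank-based coefficient is exactly 1 while Pearson's coefficient is slightly below 1.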

20.
MOTIVATION: MALDI mass spectrometry is able to elicit macromolecular expression data from cellular material, and when used in conjunction with Ciphergen protein chip technology (also referred to as SELDI, Surface Enhanced Laser Desorption/Ionization), it permits a semi-high throughput approach to be taken with respect to sample processing and data acquisition. Due to the large array of data that is generated from a single analysis (8-10000 variables using a mass range of 2-15 kDa in this paper), it is essential to implement algorithms that can detect expression patterns from such large volumes of data correlating to a given biological/pathological phenotype from multiple samples. If successful, the methodology could be extrapolated to larger data sets to enable the identification of validated biomarkers correlating strongly to disease progression. This would not only serve to enable tumours to be classified according to their molecular expression profile but could also focus attention upon a relatively small number of molecules that might warrant further biochemical/molecular characterization to assess their suitability as potential therapeutic targets. RESULTS: Using a multi-layer perceptron Artificial Neural Network (ANN) (Neuroshell 2) with a back propagation algorithm, we have developed a prototype approach that uses a model system (comprising five low and seven high-grade human astrocytomas) to identify mass spectral peaks whose relative intensity values correlate strongly to tumour grade. Analyzing data derived from MALDI mass spectrometry in conjunction with Ciphergen protein chip technology, we have used relative importance values, determined from the weights of trained ANNs (Balls et al., Water, Air Soil Pollut., 85, 1467-1472, 1996), to identify masses that accurately predict tumour grade. Implementing a three-stage procedure, we have screened a population of approximately 100000-120000 variables and identified two ions (m/z values of 13454 and 13457) whose relative intensity pattern was significantly reduced in high-grade astrocytoma. The data from this initial study suggest that application of ANN-based approaches can identify molecular ion patterns which strongly associate with disease grade and that their application to larger cohorts of patient material could potentially facilitate the rapid identification of validated biomarkers having significant clinical (i.e. diagnostic/prognostic) potential for the field of cancer biology. AVAILABILITY: Neuroshell 2 is commercially available from Ward Systems.
