Similar articles
 20 similar articles found
1.
Automated multidimensional capillary liquid chromatography-tandem mass spectrometry (LC-MS/MS) has been increasingly applied in various large scale proteome profiling efforts. However, comprehensive global proteome analysis remains technically challenging due to issues associated with sample complexity and dynamic range of protein abundances, which is particularly apparent in mammalian biological systems. We report here the application of a high efficiency cysteinyl peptide enrichment (CPE) approach to the global proteome analysis of human mammary epithelial cells (HMECs), which significantly improved both sequence coverage of protein identifications and the overall proteome coverage. The cysteinyl peptides were specifically enriched by using a thiol-specific covalent resin, fractionated by strong cation exchange chromatography, and subsequently analyzed by reversed-phase capillary LC-MS/MS. An HMEC tryptic digest without CPE was also fractionated and analyzed under the same conditions for comparison. The combined analyses of HMEC tryptic digests with and without CPE resulted in a total of 14 416 confidently identified peptides covering 4294 different proteins with an estimated 10% gene coverage of the human genome. By using the high efficiency CPE, an additional 1096 relatively low abundance proteins were identified, resulting in a 34.3% increase in proteome coverage; 1390 proteins were observed with increased sequence coverage. Comparative protein distribution analyses revealed that the CPE method is not biased with regard to protein M_r, pI, cellular location, or biological function. These results demonstrate that the use of the CPE approach provides improved efficiency in comprehensive proteome-wide analyses of highly complex mammalian biological systems.
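The 34.3% figure in the abstract above can be reproduced with a short calculation. The sketch below assumes that the baseline is the combined protein count minus the proteins gained through CPE (that is, the proteins found without CPE); the abstract does not state the baseline explicitly, and the function name is illustrative only.

```python
def percent_increase(total_after: int, gained: int) -> float:
    """Percent increase relative to the baseline (total_after - gained)."""
    baseline = total_after - gained
    if baseline <= 0:
        raise ValueError("gained must be smaller than total_after")
    return 100.0 * gained / baseline


# Figures quoted in the abstract.
total_proteins = 4294   # proteins from the combined analyses
extra_with_cpe = 1096   # additional proteins identified with CPE

# Assumption: proteins identified without CPE = total - additional.
baseline = total_proteins - extra_with_cpe
increase = percent_increase(total_proteins, extra_with_cpe)

print(f"baseline: {baseline} proteins, increase: {increase:.1f}%")
```

Under that assumption the baseline is 3198 proteins, and 1096 / 3198 rounds to the reported 34.3%.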

2.
Recent multidimensional liquid chromatography MS/MS studies have contributed to the identification of large numbers of expressed proteins for numerous species. The present study couples size exclusion chromatography of intact proteins with the separation of tryptically digested peptides using a combination of strong cation exchange and high resolution, reversed phase capillary chromatography to identify proteins extracted from human mammary epithelial cells (HMECs). In addition to conventional conservative criteria for protein identifications, the confidence levels were further increased through the use of peptide normalized elution times (NET) for the liquid chromatographic separation step. The combined approach resulted in a total of 5838 unique peptides identified covering 1574 different proteins with an estimated 4% gene coverage of the human genome, as annotated by the National Center for Biotechnology Information (NCBI). This database provides a baseline for comparison against variations in other genetically and environmentally perturbed systems. Proteins identified were categorized based upon intracellular location and biological process with the identification of numerous receptors, regulatory proteins, and extracellular proteins, demonstrating the usefulness of this application in the global analysis of human cells for future comparative studies.

3.
4.
5.
6.
The proteome of Haemophilus influenzae strain Rd KW20 was analyzed by liquid chromatography (LC) coupled with ion trap tandem mass spectrometry (MS/MS). This approach does not require a gel electrophoresis step and provides a rapidly developed snapshot of the proteome. In order to gain insight into the central metabolism of H. influenzae, cells were grown microaerobically and anaerobically in a rich medium, and soluble and membrane proteins of strain Rd KW20 were proteolyzed with trypsin and directly examined by LC-MS/MS. Several different experimental and computational approaches were utilized to optimize the proteome coverage and to ensure statistically valid protein identification. Approximately 25% of all predicted proteins (open reading frames) of H. influenzae strain Rd KW20 were identified with high confidence, as their component peptides were unambiguously assigned to tandem mass spectra. Approximately 80% of the predicted ribosomal proteins were identified with high confidence, compared to the 33% of the predicted ribosomal proteins detected by previous two-dimensional gel electrophoresis studies. The results obtained in this study are generally consistent with those obtained from computational genome analysis, two-dimensional gel electrophoresis, and whole-genome transposon mutagenesis studies. At least 15 genes originally annotated as conserved hypothetical were found to encode expressed proteins. Two more proteins, previously annotated as predicted coding regions, were detected with high confidence; these proteins also have close homologs in related bacteria. The direct proteomics approach to studying protein expression in vivo reported here is a powerful method that is applicable to proteome analysis of any (micro)organism.

7.
We have developed GFam, a platform for automatic annotation of gene/protein families. GFam provides a framework for genome initiatives and model organism resources to build domain-based families and derive meaningful functional labels, and offers a seamless approach to propagate functional annotation across periodic genome updates. GFam is a hybrid approach that uses a greedy algorithm to chain component domains from InterPro annotation provided by its 12 member resources, followed by a sequence-based connected component analysis of un-annotated sequence regions, to derive a consensus domain architecture for each sequence and subsequently generate families based on common architectures. For the proteome of Arabidopsis, our integrated approach increases sequence coverage by 7.2 percentage points and residue coverage by 14.6 percentage points relative to the best single-constituent database within InterPro. The true power of GFam lies in maximizing annotation provided by the different InterPro data sources that offer resource-specific coverage for different regions of a sequence. GFam’s capability to capture higher sequence and residue coverage can be useful for genome annotation, comparative genomics and functional studies. GFam is general-purpose software and can be used for any collection of protein sequences. The software is open source and can be obtained from http://www.paccanarolab.org/software/gfam/.
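The abstract above mentions a greedy algorithm that chains component domains from several annotation sources. The sketch below is not GFam's implementation; it is a generic greedy selection of non-overlapping domain hits, written under the assumptions that longer hits are preferred and that any overlap disqualifies a later hit. The `Hit` type, the function names, and the example hits are all hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """A hypothetical domain hit on one protein sequence (1-based, inclusive)."""
    name: str
    start: int
    end: int
    source: str


def overlaps(a: Hit, b: Hit) -> bool:
    """True if the two hits share at least one residue."""
    return a.start <= b.end and b.start <= a.end


def greedy_architecture(hits: list[Hit]) -> list[Hit]:
    """Pick hits longest-first, skipping any hit that overlaps a chosen one.

    Assumption: length is the ranking criterion; a real tool may rank by
    source reliability, score, or allow partial overlaps.
    """
    chosen: list[Hit] = []
    by_length = sorted(hits, key=lambda h: (-(h.end - h.start + 1), h.start))
    for hit in by_length:
        if not any(overlaps(hit, c) for c in chosen):
            chosen.append(hit)
    return sorted(chosen, key=lambda h: h.start)


def residue_coverage(chosen: list[Hit], seq_len: int) -> float:
    """Fraction of residues covered by the (non-overlapping) chosen hits."""
    return sum(h.end - h.start + 1 for h in chosen) / seq_len


# Made-up hits from different member databases on a 300-residue sequence.
hits = [
    Hit("dom_A", 10, 120, "source1"),
    Hit("dom_B", 100, 180, "source2"),   # overlaps dom_A, so it is skipped
    Hit("dom_C", 130, 200, "source3"),
    Hit("dom_D", 210, 260, "source1"),
]
architecture = greedy_architecture(hits)
coverage = residue_coverage(architecture, 300)
print([h.name for h in architecture], round(coverage, 3))
```

Combining hits from several sources in this way covers regions that no single source annotates, which is the intuition behind the coverage gain the abstract reports.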

8.
We have merged four different views of the human plasma proteome, based on different methodologies, into a single nonredundant list of 1175 distinct gene products. The methodologies used were 1) literature search for proteins reported to occur in plasma or serum; 2) multidimensional chromatography of proteins followed by two-dimensional electrophoresis and mass spectrometry (MS) identification of resolved proteins; 3) tryptic digestion and multidimensional chromatography of peptides followed by MS identification; and 4) tryptic digestion and multidimensional chromatography of peptides from low-molecular-mass plasma components followed by MS identification. Of the 1175 nonredundant gene products, 195 were included in more than one of the four input datasets. Only 46 appeared in all four. Predictions of signal sequence and transmembrane domain occurrence, as well as Gene Ontology annotation assignments, allowed characterization of the nonredundant list and comparison of the data sources. The "nonproteomic" literature (468 input proteins) is strongly biased toward signal sequence-containing extracellular proteins, while the three proteomics methods showed a much higher representation of cellular proteins, including nuclear, cytoplasmic, and kinesin complex proteins. Cytokines and protein hormones were almost completely absent from the proteomics data (presumably due to low abundance), while categories like DNA-binding proteins were almost entirely absent from the literature data (perhaps unexpected and therefore not sought). Most major categories of proteins in the human proteome are represented in plasma, with the distribution at successively deeper layers shifting from mostly extracellular to a distribution more like the whole (primarily cellular) proteome. The resulting nonredundant list confirms the presence of a number of interesting candidate marker proteins in plasma and serum.
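The bookkeeping described in the abstract above (a nonredundant union, entries seen in more than one dataset, entries seen in all datasets) can be sketched with plain set operations. The identifiers and dataset contents below are toy values invented for illustration; they are not taken from the study, and the function name is an assumption.

```python
from collections import Counter


def merge_sources(sources: dict[str, set[str]]):
    """Return (nonredundant ids, ids in >1 source, ids in every source)."""
    counts: Counter[str] = Counter()
    for ids in sources.values():
        counts.update(ids)  # each source is a set, so it adds 1 per id
    nonredundant = set(counts)
    in_multiple = {i for i, n in counts.items() if n > 1}
    in_all = {i for i, n in counts.items() if n == len(sources)}
    return nonredundant, in_multiple, in_all


# Toy input: four hypothetical datasets of gene-product identifiers.
toy_sources = {
    "literature": {"geneA", "geneB", "geneC", "geneD"},
    "protein_2de_ms": {"geneA", "geneB", "geneD", "geneE"},
    "peptide_lc_ms": {"geneA", "geneD", "geneE", "geneF"},
    "low_mass_lc_ms": {"geneA", "geneB", "geneG"},
}
nonredundant, in_multiple, in_all = merge_sources(toy_sources)
print(len(nonredundant), sorted(in_multiple), sorted(in_all))
```

With real data, the three sizes would correspond to the 1175, 195, and 46 quoted in the abstract; the hard part in practice is mapping accessions from different databases onto one gene-product identifier before the merge, which this sketch does not attempt.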

9.
An analysis of the structurally and catalytically diverse serine hydrolase protein family in the Saccharomyces cerevisiae proteome was undertaken using two independent but complementary, large-scale approaches. The first approach is based on computational analysis of serine hydrolase active site structures; the second utilizes the chemical reactivity of the serine hydrolase active site in complex mixtures. These proteomics approaches share the ability to fractionate the complex proteome into functional subsets. Each method identified a significant number of sequences, but only 15 proteins were identified by both methods. Eight of these were unannotated in the Saccharomyces Genome Database at the time of this study and are thus novel serine hydrolase identifications. Three of the previously uncharacterized proteins are members of a eukaryotic serine hydrolase family, designated as Fsh (family of serine hydrolases), identified here for the first time. OVCA2, a potential human tumor suppressor, and DYR-SCHPO, a dihydrofolate reductase from Schizosaccharomyces pombe, are members of this family. Comparing the combined results to results of other proteomic methods showed that only four of the 15 proteins were identified in a recent large-scale, "shotgun" proteomic analysis and eight were identified using a related, similar approach (neither identifies function). Only 10 of the 15 were annotated using alternate motif-based computational tools. The results demonstrate the precision derived from combining complementary, function-based approaches to extract biological information from complex proteomes. The chemical proteomics technology indicates that a functional protein is being expressed in the cell, while the computational proteomics technology adds details about the specific type of function and residue that is likely being labeled. The combination of synergistic methods facilitates analysis, enriches true positive results, and increases confidence in novel identifications. This work also highlights the risks inherent in annotation transfer and the use of scoring functions for determination of correct annotations.

10.
We describe the initial characterization of the wheat amyloplast proteome, consisting of the identification and classification of 171 proteins. Whole amyloplasts and purified amyloplast membranes were prepared from wheat (Triticum aestivum). Protein extracts were examined by one-dimensional and two-dimensional electrophoresis, followed by high performance liquid chromatography-tandem mass spectrometry of separated proteins. Tandem mass spectrometry data of individual peptides were then searched by SEQUEST, using a database containing known protein sequences from both wheat and other homologous cereal crops. Using this approach we identified 108 proteins from whole amyloplasts and 63 proteins from purified amyloplast membranes. The majority of protein identifications were derived from protein sequences from cereal crops other than wheat, for which relatively little gene sequence data is available. The highest percentage of protein identifications obtained from any individual species was 46% of the total number of proteins identified, using sequence data found in our proprietary rice (Oryza sativa) genome database.

11.
There are several physiological roles postulated for aqueous humor, a liquid located in the anterior and posterior chambers of the eye, such as maintenance of the intraocular pressure, provision of nutrients, removal of metabolic waste from neighboring tissues, and provision of an immune response and protection during inflammation and infection. To link these functions to specific proteins or classes of proteins, identification of the aqueous humor proteome is essential. Aqueous humor obtained from healthy New Zealand white rabbits was analyzed using three synergistic protein separation methods: 1-D gel electrophoresis, 2-DE, and 1-D LC (RPLC) prior to protein identification by MS. As each of these separation methods separates intact proteins based on different physical properties (pI, molecular weight, hydrophobicity, solubility, etc.), the proteome coverage is expanded. This was confirmed, since overlap between all three separation technologies was only about 8.2%, with many proteins found uniquely by a single method. Although the most dominant protein present in normal aqueous humor is albumin, this extensive separation/MS strategy identified a total of 98 nonredundant proteins (plus an additional ten proteins for consideration). This expands the current protein identifications by approximately 65%. The aqueous humor proteome comprises a specific selection of cellular and plasma based proteins and can almost exclusively be divided into four functional groups: cell-cell interactions/wound healing, proteases and protease inhibitors, antioxidant protection, and antibacterial/anti-inflammatory proteins.

12.
We describe and demonstrate a global strategy that extends the sensitivity, dynamic range, comprehensiveness, and throughput of proteomic measurements based upon the use of peptide "accurate mass tags" (AMTs) produced by global protein enzymatic digestion. The two-stage strategy exploits Fourier transform-ion cyclotron resonance (FT-ICR) mass spectrometry to validate peptide AMTs for a specific organism, tissue or cell type from "potential mass tags" identified using conventional tandem mass spectrometry (MS/MS) methods, providing greater confidence in identifications as well as the basis for subsequent measurements without the need for MS/MS, and thus with greater sensitivity and increased throughput. A single high resolution capillary liquid chromatography separation combined with high sensitivity, high resolution and accurate FT-ICR measurements has been shown capable of characterizing peptide mixtures of significantly more than 10^5 components with mass accuracies of < 1 ppm, sufficient for broad protein identification using AMTs. Other attractions of the approach include the broad and relatively unbiased proteome coverage, the capability for exploiting stable isotope labeling methods to realize high precision for relative protein abundance measurements, and the projected potential for study of mammalian proteomes when combined with additional sample fractionation. Using this strategy, in our first application we have been able to identify AMTs for >60% of the potentially expressed proteins in the organism Deinococcus radiodurans.
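To make the "< 1 ppm" mass accuracy in the abstract above concrete, the sketch below matches an observed peptide mass against a sorted list of tag masses within a parts-per-million tolerance. It is an illustration under stated assumptions, not the authors' pipeline: it uses mass alone (no elution time), and the function names and mass values are invented for the example.

```python
from bisect import bisect_left, bisect_right


def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match_tags(observed: float, sorted_tags: list[float],
               tol_ppm: float = 1.0) -> list[float]:
    """Return tag masses within tol_ppm of the observed mass.

    A binary search narrows the candidates to a mass window, then each
    candidate is checked against the ppm tolerance explicitly.
    """
    delta = observed * tol_ppm * 1e-6
    lo = bisect_left(sorted_tags, observed - delta)
    hi = bisect_right(sorted_tags, observed + delta)
    return [m for m in sorted_tags[lo:hi]
            if abs(ppm_error(observed, m)) <= tol_ppm]


# Made-up tag masses (Da), sorted ascending.
tags = [1000.0000, 1000.0020, 1500.7500]
observed_mass = 1000.0005

matches = match_tags(observed_mass, tags, tol_ppm=1.0)
print(matches, round(ppm_error(observed_mass, 1000.0000), 2))
```

At 1000 Da, 1 ppm corresponds to 0.001 Da, so the tag at 1000.0000 (0.5 ppm away) matches while the tag at 1000.0020 (about 1.5 ppm away) does not. A tolerance this tight is what lets a mass measurement alone discriminate among many candidate peptides.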

13.
Mapping the proteome of barrel medic (Medicago truncatula)   Total citations: 9, self-citations: 0, citations by others: 9

14.
We have developed a rice (Oryza sativa) genome annotation database (Osa1) that provides structural and functional annotation for this emerging model species. Using the sequence of O. sativa subsp. japonica cv Nipponbare from the International Rice Genome Sequencing Project, pseudomolecules, or virtual contigs, of the 12 rice chromosomes were constructed. Our most recent release, version 3, represents our third build of the pseudomolecules and is composed of 98% finished sequence. Genes were identified using a series of computational methods developed for Arabidopsis (Arabidopsis thaliana) that were modified for use with the rice genome. In release 3 of our annotation, we identified 57,915 genes, of which 14,196 are related to transposable elements. Of these 43,719 non-transposable element-related genes, 18,545 (42.4%) were annotated with a putative function, 5,777 (13.2%) were annotated as encoding an expressed protein with no known function, and the remaining 19,397 (44.4%) were annotated as encoding a hypothetical protein. Multiple splice forms (5,873) were detected for 2,538 genes, resulting in a total of 61,250 gene models in the rice genome. We incorporated experimental evidence into 18,252 gene models to improve the quality of the structural annotation. A series of functional data types has been annotated for the rice genome that includes alignment with genetic markers, assignment of gene ontologies, identification of flanking sequence tags, alignment with homologs from related species, and syntenic mapping with other cereal species. All structural and functional annotation data are available through interactive search and display windows as well as through download of flat files. To integrate the data with other genome projects, the annotation data are available through a Distributed Annotation System and a Genome Browser. All data can be obtained through the project Web pages at http://rice.tigr.org.
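The gene counts in the abstract above are internally consistent, which a few lines of arithmetic can confirm. The one interpretive assumption, flagged in the code, is that the 5,873 splice forms include one primary model for each of the 2,538 alternatively spliced genes, so only the difference adds new gene models.

```python
# Figures quoted in the abstract (release 3).
total_genes = 57_915
te_related = 14_196

non_te = total_genes - te_related  # genes not related to transposable elements

categories = {
    "putative function": 18_545,
    "expressed, no known function": 5_777,
    "hypothetical": 19_397,
}
# The three categories should partition the non-TE genes.
category_sum = sum(categories.values())
percentages = {k: round(100 * v / non_te, 1) for k, v in categories.items()}

# Assumption: the 5,873 splice forms already count one primary model per
# alternatively spliced gene, so each such gene adds (forms - 1) extra models.
splice_forms = 5_873
spliced_genes = 2_538
gene_models = total_genes + (splice_forms - spliced_genes)

print(non_te, category_sum, percentages, gene_models)
```

Under that reading, the totals come out to 43,719 non-TE genes, percentages of 42.4, 13.2, and 44.4, and 61,250 gene models, matching the abstract.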

15.
16.
Four fractions from rat liver (a crude mitochondrial (CM) fraction and a cytosol (C) fraction obtained with differential centrifugation, a purified mitochondrial (PM) fraction obtained with Nycodenz density gradient centrifugation, and a total liver (TL) fraction) were analyzed with two-dimensional liquid chromatography tandem mass spectrometry. A total of 564 rat proteins were identified and were bioinformatically annotated according to their physicochemical characteristics and functions. While most extremely alkaline ribosomal proteins were identified in the TL fraction, the C fraction mainly included neutral enzymes, and the PM fraction was enriched in alkaline proteins and proteins with electron transfer activity or oxygen binding activity. Such characteristics were more apparent in proteins identified only in the TL, C, or PM fraction. The Swiss-Prot annotation and the bioinformatic prediction results showed that the C and PM fractions were enriched in cytoplasmic and mitochondrial proteins, respectively. The combination of subcellular fractionation with two-dimensional liquid chromatography tandem mass spectrometry proved to be a high-throughput, sensitive, and effective analytical approach for subcellular proteomics research. Using such a strategy, we have constructed the largest proteome database to date for rat liver (564 rat proteins) and its cytosol (222 rat proteins) and mitochondrial fractions (227 rat proteins). Moreover, the 352 proteins with Swiss-Prot subcellular location annotation among the 564 identified proteins were used as an actual subcellular proteome dataset to evaluate widely used bioinformatics tools such as PSORT, TargetP, TMHMM, and GRAVY.

17.
We have identified and characterized the proteome of Sulfolobus solfataricus P2 using multidimensional liquid phase protein separations. Multidimensional liquid phase chromatography was performed using ion exchange chromatography in the first dimension, followed by reverse-phase chromatography using 500 μm i.d. poly(styrene-divinylbenzene) monoliths in the second dimension to separate soluble protein lysates from S. solfataricus. The 2DLC protein separations from S. solfataricus protein lysates enabled the generation of a 2D liquid phase map analogous to the traditional 2DE map. Following separation of the proteins in the second dimension, fractions were collected, digested in solution using trypsin and analyzed using mass spectrometry. These approaches offer significant reductions in labor intensity and the overall time taken to analyze the proteome in comparison to 2DE, taking advantage of automation and fraction collection associated with this approach. Furthermore, following proteomic analysis using 2DLC, the data obtained were compared to previous 2DE and shotgun proteomic studies of a soluble protein lysate from S. solfataricus. In comparison to 2DE, the results show an overall increase in proteome coverage. Moreover, 2DLC showed increased coverage of a number of protein subsets including acidic, basic, low abundance and small molecular weight proteins in comparison to 2DE. In comparison to shotgun studies, an increase in proteome coverage was also observed. Furthermore, 187 unique proteins were identified using 2DLC, demonstrating this methodology as an alternative approach for proteomic studies or in combination with 2DE and shotgun workflows for global proteomics.

18.
Understanding how proteins and their complex interaction networks convert the genomic information into a dynamic living organism is a fundamental challenge in biological sciences. As an important step towards understanding the systems biology of a complex eukaryote, we cataloged 63% of the predicted Drosophila melanogaster proteome by detecting 9,124 proteins from 498,000 redundant and 72,281 distinct peptide identifications. This unprecedentedly high proteome coverage for a complex eukaryote was achieved by combining sample diversity, multidimensional biochemical fractionation and analysis-driven experimentation feedback loops, whereby data collection is guided by statistical analysis of prior data. We show that high-quality proteomics data provide crucial information to amend genome annotation and to confirm many predicted gene models. We also present experimentally identified proteotypic peptides matching approximately 50% of D. melanogaster gene models. This library of proteotypic peptides should enable fast, targeted and quantitative proteomic studies to elucidate the systems biology of this model organism.

19.
We have developed a proteomics technology featuring on-line three-dimensional liquid chromatography coupled to tandem mass spectrometry (3D LC-MS/MS). Using 3D LC-MS/MS, the yeast soluble, urea-solubilized peripheral membrane, and SDS-solubilized membrane protein samples collectively yielded 3019 unique yeast protein identifications, with an average of 5.5 peptides per protein, from the 6300-gene Saccharomyces Genome Database searched with SEQUEST. A single run of the urea-solubilized sample yielded 2255 unique protein identifications, suggesting high peak capacity and resolving power of 3D LC-MS/MS. After precipitation of SDS from the digested membrane protein sample, 3D LC-MS/MS allowed the analysis of membrane proteins. Among 1221 proteins containing two or more predicted transmembrane domains, 495 were identified. The improved yeast proteome data allowed the mapping of many metabolic pathways and functional categories. The 3D LC-MS/MS technology provides a suitable tool for global proteome discovery.

20.