Similar articles
 20 similar articles found
1.
The Yeast Protein Database (YPD) is a curated database for the proteome of Saccharomyces cerevisiae. It consists of approximately 6000 Yeast Protein Reports, one for each of the known or predicted yeast proteins. Each Yeast Protein Report is a one-page presentation of protein properties, annotation lines that summarize findings from the literature, and references. In the past year, the number of annotation lines has grown from 25 000 to approximately 35 000, and the number of articles curated has grown from approximately 3500 to >5000. Recently, new data types have been included in YPD: protein-protein interactions, genetic interactions, and regulators of gene expression. Finally, a new layer of information, the YPD Protein Minireviews, has recently been introduced. The Yeast Protein Database can be found on the Web at http://www.proteome.com/YPDhome.html

2.
YPD-A database for the proteins of Saccharomyces cerevisiae.   Cited by: 2 (self-citations: 1, citations by others: 1)
YPD is a database for the proteins of the budding yeast, Saccharomyces cerevisiae. YPD has two formats: (i) a spreadsheet which tabulates many of the physical and functional properties of yeast proteins, and (ii) the YPD Protein Reports, which are formatted pages containing the protein properties, annotations gathered from the literature, and references with titles. YPD is available through the World-Wide Web, through an Email server, and by anonymous FTP. New releases of the YPD spreadsheet are produced every two to four months, and the on-line information is updated daily.

3.
4.
YPL.db: the Yeast Protein Localization database   Cited by: 3 (self-citations: 1, citations by others: 2)
The Yeast Protein Localization database (YPL.db) contains information about the localization patterns of yeast proteins resulting from microscopic analyses. The data and parameters of the experiments to obtain the localization information, together with images from confocal or video microscopy, are stored in a relational database, building an archive of, and the documentation for, all experiments. The database can be queried based on gene name, protein localization, growth conditions and a number of additional parameters. All experiment parameters are selectable from predefined lists to ensure database integrity and conformity across different investigators. The database provides a structure reference resource to allow for better characterization of unknown or ambiguous localization patterns. Links to MIPS, YPD and SGD databases are provided to allow fast access to further information not contained in the localization database itself. YPL.db is available at http://ypl.tugraz.at.

5.
The Proteome Analysis database (http://www.ebi.ac.uk/proteome/) has been developed by the Sequence Database Group at EBI utilizing existing resources and providing comparative analysis of the predicted protein coding sequences of the complete genomes of bacteria, archaea and eukaryotes. Three main projects are used, InterPro, CluSTr and GO Slim, to give an overview of families, domains, sites, and functions of the proteins from each of the complete genomes. Complete proteome analysis is available for a total of 89 proteome sets. A specifically designed application enables InterPro proteome comparisons for any one proteome against any other one or more of the proteomes in the database.

6.
MITOP (http://www.mips.biochem.mpg.de/proj/medgen/mitop/) is a comprehensive database for genetic and functional information on both nuclear- and mitochondrial-encoded proteins and their genes. The five species files (Saccharomyces cerevisiae, Mus musculus, Caenorhabditis elegans, Neurospora crassa and Homo sapiens) include annotated data derived from a variety of online resources and the literature. A wide spectrum of search facilities is given in the overlapping sections 'Gene catalogues', 'Protein catalogues', 'Homologies', 'Pathways and metabolism' and 'Human disease catalogue', including extensive references and hyperlinks to other databases. Central features are the results of various homology searches, which should facilitate investigations into interspecies relationships. Precomputed FASTA searches using all the MITOP yeast protein entries and a list of the best human EST hits with graphical cluster alignments related to the yeast reference sequence are presented. The orthologue tables with cross-listings to all the protein entries for each species in MITOP have been expanded by adding the genomes of Rickettsia prowazekii and Escherichia coli. To find new mitochondrial proteins, the complete yeast genome has been analyzed using the MITOPROT program, which identifies mitochondrial targeting sequences. The 'Human disease catalogue' contains tables with a total of 110 human diseases related to mitochondrial protein abnormalities, sorted by clinical criteria and age of onset. MITOP should contribute to the systematic genetic characterization of the mitochondrial proteome in relation to human disease.

7.
Identification of proteins from the mass spectra of peptide fragments generated by proteolytic cleavage using database searching has become one of the most powerful techniques in proteome science, capable of rapid and efficient protein identification. Using computer simulation, we have studied how the application of chemical derivatisation techniques may improve the efficiency of protein identification from mass spectrometric data. These approaches enhance ion yield and lead to the promotion of specific ions and fragments, yielding additional database search information. The impact of three alternative techniques has been assessed by searching representative proteome databases for both single proteins and simple protein mixtures. For example, by reliably promoting fragmentation of singly-charged peptide ions at aspartic acid residues after homoarginine derivatisation, 82% of yeast proteins can be unambiguously identified from a single typical peptide-mass datum, with a measured mass accuracy of 50 ppm, by using the associated secondary ion data. The extra search information also provides a means to confidently identify proteins in protein mixtures where only limited data are available. Furthermore, the inclusion of limited sequence information for the peptides can compensate for, and even exceed, the search efficiency available via high accuracy searches of around 5 ppm, suggesting that this is a potentially useful approach for simple protein mixtures routinely obtained from two-dimensional gels.
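
The role of mass accuracy in the search described above can be illustrated with a minimal Python sketch. It is not the authors' simulation: the function names, the toy peptides and the perturbed observed mass are assumptions made for illustration. It computes monoisotopic peptide masses from standard residue masses and keeps candidates whose theoretical mass lies within a ppm tolerance of the observed mass.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565  # one water is added for the free N- and C-termini


def peptide_mass(seq):
    """Monoisotopic mass of an unmodified, neutral peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def candidates(observed, peptides, tol_ppm=50.0):
    """Peptides whose theoretical mass is within tol_ppm of the observed mass."""
    return [p for p in peptides
            if abs(ppm_error(observed, peptide_mass(p))) <= tol_ppm]


# Hypothetical measurement: the true peptide mass shifted by +20 ppm.
observed = peptide_mass("PEPTIDE") * (1 + 20e-6)
toy_db = ["PEPTIDE", "EPPTIDE", "PEPTIDK"]  # toy peptides, not real data
print(candidates(observed, toy_db, tol_ppm=50.0))
print(candidates(observed, toy_db, tol_ppm=5.0))
```

The two permuted peptides have identical masses, so a mass measurement alone cannot separate them at any tolerance; this is the kind of ambiguity that additional fragment or sequence information is meant to resolve.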

8.
9.
SENTRA, available via URL http://wit.mcs.anl.gov/WIT2/Sentra/, is a database of proteins associated with microbial signal transduction. The database currently includes the classical two-component signal transduction pathway proteins and methyl-accepting chemotaxis proteins, but will be expanded to also include other classes of signal transduction systems that are modulated by phosphorylation or methylation reactions. Although the majority of database entries are from prokaryotic systems, eukaryotic proteins with bacterial-like signal transduction domains are also included. Currently SENTRA contains signal transduction proteins in 34 complete and almost completely sequenced prokaryotic genomes, as well as sequences from 243 organisms available in public databases (SWISS-PROT and EMBL). The analysis was carried out within the framework of the WIT2 system, which is designed and implemented to support genetic sequence analysis and comparative analysis of sequenced genomes.

10.
The FSSP database of structurally aligned protein fold families.   Cited by: 17 (self-citations: 0, citations by others: 17)
L Holm, C Sander. Nucleic Acids Research, 1994, 22(17): 3600-3609
FSSP (families of structurally similar proteins) is a database of structural alignments of proteins in the Protein Data Bank (PDB). The database currently contains an extended structural family for each of 330 representative protein chains. Each data set contains structural alignments of one search structure with all other structurally significantly similar proteins in the representative set (remote homologs, < 30% sequence identity), as well as all structures in the Protein Data Bank with 30-70% sequence identity relative to the search structure (medium homologs). Very close homologs (above 70% sequence identity) are excluded as they rarely have marked structural differences. The alignments of remote homologs are the result of pairwise all-against-all structural comparisons in the set of 330 representative protein chains. All such comparisons are based purely on the 3D co-ordinates of the proteins and are derived by automatic (objective) structure comparison programs. The significance of structural similarity is estimated based on statistical criteria. The FSSP database is available electronically from the EMBL file server and by anonymous ftp (file transfer protocol).

11.
The Saccharomyces Genome Database (SGD: http://genome-www.stanford.edu/Saccharomyces/) has recently developed new resources to provide more complete information about proteins from the budding yeast Saccharomyces cerevisiae. The PDB Homologs page provides structural information from the Protein Data Bank (PDB) about yeast proteins and/or their homologs. SGD has also created a resource that utilizes the eMOTIF database for motif information about a given protein. A third new resource is the Protein Information page, which contains protein physical and chemical properties, such as molecular weight and hydropathicity scores, predicted from the translated ORF sequence.
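
As an illustration of how a hydropathicity score can be predicted from a translated ORF sequence, the Python sketch below computes the grand average of hydropathy (GRAVY) with the Kyte-Doolittle scale. The abstract does not say which scale or formula SGD uses, so the choice of GRAVY and the function name `gravy` are assumptions.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(protein):
    """Mean Kyte-Doolittle hydropathy over all residues (GRAVY score).

    Assumed scoring scheme for illustration; positive values indicate a
    more hydrophobic sequence, negative values a more hydrophilic one.
    """
    protein = protein.upper()
    if not protein:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in protein) / len(protein)


# Toy sequences, not real yeast proteins.
print(gravy("IIVVLL"))   # strongly hydrophobic toy peptide
print(gravy("KKDDEE"))   # strongly hydrophilic toy peptide
```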

12.
Reiter LT, Do LH, Fischer MS, Hong NA, Bier E. Fly, 2007, 1(3): 164-171
The availability of complete genome sequence information for diverse organisms, including model genetic organisms, has ushered in a new era of protein sequence comparisons, making it possible to search for commonalities among entire proteomes using the Basic Local Alignment Search Tool (BLAST). Although the identification and analysis of proteins shared by humans and model organisms has proven an invaluable tool for understanding gene function, the sets of proteins unique to a given model organism's proteome have remained largely unexplored. We have constructed a searchable database that allows biologists to identify proteins unique to a given proteome. The Negative Proteome Database (NPD) is populated with pair-wise protein sequence comparisons between each of the following proteomes: Homo sapiens, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae, Dictyostelium discoideum, Chlamydomonas reinhardtii, Escherichia coli K12, Arabidopsis thaliana and Methanosarcina acetivorans. Our analysis of negative proteome datasets using the NPD has thus far revealed 107 proteins in humans that may be involved in motile cilia function, 1628 potential pesticide target proteins in flies, 659 proteins shared by flies and humans that are not represented in the less neurologically complex worm proteome, and 180 nuclear encoded human disease associated proteins that are absent from the fly proteome. The NPD is the only online resource where users can quickly perform complex negative and positive comparisons of model organism proteomes. We anticipate that the NPD and the adaptable algorithm, which can readily be used to duplicate this analysis on custom sets of proteomes, will be an invaluable tool in the investigation of organism specific protein sets.

13.
Park GW, Kwon KH, Kim JY, Lee JH, Yun SH, Kim SI, Park YM, Cho SY, Paik YK, Yoo JS. Proteomics, 2006, 6(4): 1121-1132
In shotgun proteomics, proteins can be fractionated by 1-D gel electrophoresis and digested into peptides, followed by liquid chromatography to separate the peptide mixture. Mass spectrometry generates hundreds of thousands of tandem mass spectra from these fractions, and proteins are identified by database searching. However, the search scores are usually not sufficient to distinguish the correct peptides. In this study, we propose a confident protein identification method for high-throughput analysis of the human proteome. To build a filtering protocol in database search, we chose Pseudomonas putida KT2440 as a reference because this bacterial proteome contains fewer modifications and is simpler than the human proteome. First, the P. putida KT2440 proteome was filtered by reversed sequence database search and correlated by the molecular weight in 1-D-gel band positions. The characterization protocol was then applied to determine the criteria for clustering of the human plasma proteome into three different groups. This protein filtering method, based on bacterial proteome data analysis, represents a rapid way to generate a higher-confidence protein list of the human proteome, which includes some heavily modified and cleaved proteins.
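
The reversed sequence database search mentioned above can be sketched in Python as follows. This is a generic target-decoy illustration, not the authors' protocol: the `REV_` prefix, the function names, the hit list and the decoys/targets estimate of the false discovery rate are all assumptions.

```python
def reversed_decoys(proteins, prefix="REV_"):
    """Build a decoy entry for each target protein by reversing its sequence."""
    return {prefix + name: seq[::-1] for name, seq in proteins.items()}


def fdr_at_threshold(hits, threshold, prefix="REV_"):
    """Estimate the false discovery rate among hits scoring >= threshold.

    hits is a list of (protein_id, score) pairs from a search against the
    combined target and decoy database. The estimate used here is
    (number of decoy hits) / (number of target hits), one common convention.
    """
    accepted = [pid for pid, score in hits if score >= threshold]
    decoys = sum(1 for pid in accepted if pid.startswith(prefix))
    targets = len(accepted) - decoys
    return decoys / targets if targets else 0.0


# Toy target database and hypothetical search results.
targets = {"P1": "MKTAY", "P2": "MSLLK"}
combined = {**targets, **reversed_decoys(targets)}
hits = [("P1", 9.0), ("P2", 8.0), ("REV_P3", 7.5), ("P4", 7.0), ("REV_P5", 3.0)]

print(sorted(combined))
print(fdr_at_threshold(hits, 7.0))
print(fdr_at_threshold(hits, 8.0))
```

Raising the score threshold until the estimated rate falls below a chosen limit is one way such a search can be turned into a filter; the abstract additionally correlates identifications with molecular weight at the 1-D gel band position, which is not modelled here.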

14.
Sapkota A, Liu X, Zhao XM, Cao Y, Liu J, Liu ZP, Chen L. Molecular BioSystems, 2011, 7(9): 2615-2621
Rice is an important crop throughout the world and is the staple food for about half the world's population. For better breeding and improved production, we need to know the function of rice molecules which facilitate their function through interactions with each other. The database of interacting proteins in Oryza sativa (DIPOS) provides comprehensive information on interacting proteins in rice, where the interactions are predicted using two computational methods, i.e., interologs and domain based methods. DIPOS contains 14 614 067 pairwise interactions among 27 746 proteins, covering about 41% of the whole Oryza sativa proteome. Furthermore, each interaction is assigned a confidence score which further enables biologists to sort out the important proteins. Biological explanations of pathways and interactions are also provided based on the database. Public access to the DIPOS is available at and .

15.
Nucleic acid sequences from genome sequencing projects are submitted as raw data, from which biologists attempt to elucidate the function of the predicted gene products. The protein sequences are stored in public databases, such as the UniProt Knowledgebase (UniProtKB), where curators try to add predicted and experimental functional information. Protein function prediction can be done using sequence similarity searches, but an alternative approach is to use protein signatures, which classify proteins into families and domains. The major protein signature databases are available through the integrated InterPro database, which provides a classification of UniProtKB sequences. As well as characterization of proteins through protein families, many researchers are interested in analyzing the complete set of proteins from a genome (i.e. the proteome), and there are databases and resources that provide non-redundant proteome sets and analyses of proteins from organisms with completely sequenced genomes. This article reviews the tools and resources available on the web for single and large-scale protein characterization and whole proteome analysis.

16.
Babnigg G, Giometti CS. Proteomics, 2006, 6(16): 4514-4522
In proteome studies, identification of proteins requires searching protein sequence databases. The public protein sequence databases (e.g., NCBInr, UniProt) each contain millions of entries, and private databases add thousands more. Although much of the sequence information in these databases is redundant, each database uses distinct identifiers for the identical protein sequence and often contains unique annotation information. Users of one database obtain a database-specific sequence identifier that is often difficult to reconcile with the identifiers from a different database. When multiple databases are used for searches or the databases being searched are updated frequently, interpreting the protein identifications and associated annotations can be problematic. We have developed a database of unique protein sequence identifiers called Sequence Globally Unique Identifiers (SEGUID) derived from primary protein sequences. These identifiers serve as a common link between multiple sequence databases and are resilient to annotation changes in either public or private databases throughout the lifetime of a given protein sequence. The SEGUID Database can be downloaded (http://bioinformatics.anl.gov/SEGUID/) or easily generated at any site with access to primary protein sequence databases. Since SEGUIDs are stable, predictions based on the primary sequence information (e.g., pI, Mr) can be calculated just once; we have generated approximately 500 different calculations for more than 2.5 million sequences. SEGUIDs are used to integrate MS and 2-DE data with bioinformatics information and provide the opportunity to search multiple protein sequence databases, thereby providing a higher probability of finding the most valid protein identifications.
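
A sequence-derived identifier of this kind can be sketched in a few lines of Python. The abstract does not spell out the construction; the sketch assumes the commonly described scheme in which the upper-cased sequence is hashed with SHA-1 and the digest is Base64-encoded with the trailing padding removed. The function name `seguid` and the toy sequences are illustrative.

```python
import base64
import hashlib


def seguid(sequence):
    """Checksum-style identifier computed from the primary sequence alone.

    Assumed construction: SHA-1 of the upper-cased sequence, Base64-encoded,
    with the trailing '=' padding stripped (27 characters in total).
    """
    digest = hashlib.sha1(sequence.upper().encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii").rstrip("=")


# The same sequence yields the same identifier regardless of which database
# it came from or how the entry is annotated (toy sequences, not real entries).
from_db_a = seguid("MKTAYIAKQR")
from_db_b = seguid("mktayiakqr")
print(from_db_a == from_db_b)
print(len(from_db_a))
```

Because the identifier depends only on the residues, records from different databases can be joined on it without consulting either database's accession scheme.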

17.
The Mycobacterium tuberculosis Proteome Comparison Database (MTB-PCDB) is an online database providing integrated access to proteome sequence comparison data for the five strains of Mycobacterium tuberculosis (H37Rv, H37Ra, CDC 1551, F11 and KZN 1435) sequenced completely so far. MTB-PCDB currently hosts 40,252 protein sequence comparison records obtained through inter-strain proteome comparison of the five strains of MTB. A total of 2373 proteins were found to be identical in all five strains, using MTB H37Rv as the reference strain. To enable wide use of this data, MTB-PCDB provides a set of tools for searching, browsing, analyzing and downloading the data. By bringing together M. tuberculosis proteome comparisons among virulent and avirulent strains, and among drug-susceptible and drug-resistant strains, MTB-PCDB provides a unique discovery platform for comparative proteomics among these strains, which may give insights into the discovery and development of TB drugs, vaccines and biomarkers. AVAILABILITY: The database is available for free at http://www.bicjbtdrc-mgims.in/MTB-PCDB/

18.
While tandem mass spectrometry (MS/MS) is routinely used to identify proteins from complex mixtures, certain types of proteins present unique challenges for MS/MS analyses. The major wheat gluten proteins, gliadins and glutenins, are particularly difficult to distinguish by MS/MS. Each of these groups contains many individual proteins with similar sequences that include repetitive motifs rich in proline and glutamine. These proteins have few cleavable tryptic sites, often resulting in only one or two tryptic peptides that may not provide sufficient information for identification. Additionally, there are fewer than 14,000 complete protein sequences from wheat in the current NCBInr release. In this paper, MS/MS methods were optimized for the identification of the wheat gluten proteins. Chymotrypsin and thermolysin as well as trypsin were used to digest the proteins, and the collision energy was adjusted to improve fragmentation of chymotryptic and thermolytic peptides. Specialized databases were constructed that included protein sequences derived from contigs from several assemblies of wheat expressed sequence tags (ESTs), including contigs assembled from ESTs of the cultivar under study. Two different search algorithms were used to interrogate the database, and the results were analyzed and displayed using a commercially available software package (Scaffold). We examined the effect of protein database content and size on the false discovery rate. We found that as database size increased above 30,000 sequences there was a decrease in the number of proteins identified. Also, the type of decoy database influenced the number of proteins identified. Using three enzymes, two search algorithms and a specialized database allowed us to greatly increase the number of detected peptides and distinguish proteins within each gluten protein group.
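
Why a second enzyme helps with proline- and glutamine-rich repeats can be seen from a small in-silico digestion in Python. The cleavage rules below are deliberately simplified assumptions (cleave C-terminal to the listed residues unless the next residue is proline), thermolysin is omitted, and the test sequence is an invented gluten-like repeat, not a real gliadin or glutenin.

```python
# Simplified cleavage specificities (assumption): cut after these residues,
# except when the following residue is proline.
CLEAVE_AFTER = {
    "trypsin": "KR",
    "chymotrypsin": "FYWL",
}


def digest(sequence, enzyme):
    """Return the peptides produced by complete digestion with one enzyme."""
    sites = CLEAVE_AFTER[enzyme]
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        following = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in sites and following != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder
    return peptides


# Hypothetical glutamine/proline-rich repeat with a single basic residue.
toy_repeat = "QQPFQQPYQQLQQPQR"
print(digest(toy_repeat, "trypsin"))
print(digest(toy_repeat, "chymotrypsin"))
```

With only one lysine or arginine, the toy sequence stays as a single tryptic peptide, while the chymotrypsin rule splits it into several shorter peptides that could each contribute to an identification.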

19.
20.
In Saccharomyces cerevisiae, the SLN1-YPD1-SSK1 phosphorelay system controls a downstream mitogen-activated protein (MAP) kinase in response to hyperosmotic stress. YPD1 functions as a phospho-histidine protein intermediate which is required for phosphoryl group transfer from the sensor kinase SLN1 to the response regulator SSK1. In addition, YPD1 mediates phosphoryl transfer from SLN1 to SKN7, the only other response regulator protein in yeast, which plays a role in the response to oxidative stress and in cell wall biosynthesis. The X-ray structure of YPD1 was solved at a resolution of 2.7 Å by conventional multiple isomorphous replacement with anomalous scattering. The tertiary structure of YPD1 consists of six alpha-helices and a short 3₁₀-helix. A four-helix bundle comprises the central core of the molecule and contains the histidine residue that is phosphorylated. Structure-based comparisons of YPD1 to other proteins having a similar function, such as the Escherichia coli ArcB histidine-containing phosphotransfer (HPt) domain and the P1 domain of the CheA kinase, revealed that the helical bundle and several structural features around the active-site histidine residue are conserved between the prokaryotic and eukaryotic kingdoms. Despite limited amino acid sequence homology among HPt domains, our analysis of YPD1 as a prototypical family member indicates that these phosphotransfer domains are likely to share a similar fold and common features with regard to response regulator binding and the mechanism for histidine-aspartate phosphoryl transfer.
