Similar literature
 20 similar articles found (search time: 31 ms)
1.
The SWISS-PROT group at EBI has developed the Proteome Analysis Database, which utilises existing resources and provides comparative analysis of the predicted protein coding sequences of the complete genomes of bacteria, archaea and eukaryotes (http://www.ebi.ac.uk/proteome/). The two main projects used, InterPro and CluSTr, give a new perspective on families, domains and sites and cover 31-67% (InterPro statistics) of the proteins from each of the complete genomes. CluSTr covers the three complete eukaryotic genomes and the incomplete human genome data. The Proteome Analysis Database is accompanied by a program designed to carry out InterPro proteome comparisons for any one proteome against one or more of the other proteomes in the database.

2.
InterPro was developed as a new integrated documentation resource for protein families, domains and functional sites to rationalize the complementary efforts of the PROSITE, PRINTS, Pfam and ProDom database projects and has applications in computational functional classification of newly determined sequences lacking biochemical characterization and in comparative genome analysis. InterPro contains over 3500 entries, with more than 1,000,000 hits in SWISS-PROT and TrEMBL. The database is accessible for text- and sequence-based searches at http://www.ebi.ac.uk/interpro/. InterPro was used for whole proteome analysis of the pathogenic microorganism, Mycobacterium tuberculosis, and comparison with the predicted protein coding sequences of the complete genomes of Bacillus subtilis and Escherichia coli. 64.8% of the M. tuberculosis proteins in the proteome matched InterPro entries, and these could be classified according to function. The comparison with B. subtilis and E. coli provided information on the most common protein families and domains, and the most highly represented families in each organism. InterPro thus provides a useful tool for global views of whole proteomes and their compositions.

3.
InterPro, an integrated documentation resource of protein families, domains and functional sites, was created in 1999 as a means of amalgamating the major protein signature databases into one comprehensive resource. PROSITE, Pfam, PRINTS, ProDom, SMART and TIGRFAMs have been manually integrated and curated and are available in InterPro for text- and sequence-based searching. The results are provided in a single format that rationalises the results that would be obtained by searching the member databases individually. The latest release of InterPro contains 5629 entries describing 4280 families, 1239 domains, 95 repeats and 15 post-translational modifications. Currently, the combined signatures in InterPro cover more than 74% of all proteins in SWISS-PROT and TrEMBL, an increase of nearly 15% since the inception of InterPro. New features of the database include improved searching capabilities and enhanced graphical user interfaces for visualisation of the data. The database is available via a webserver (http://www.ebi.ac.uk/interpro) and anonymous FTP (ftp://ftp.ebi.ac.uk/pub/databases/interpro).

4.
The Proteome Analysis database (http://www.ebi.ac.uk/proteome/) has been developed by the Sequence Database Group at EBI, utilizing existing resources and providing comparative analysis of the predicted protein coding sequences of the complete genomes of bacteria, archaea and eukaryotes. Three main projects are used, InterPro, CluSTr and GO Slim, to give an overview of the families, domains, sites, and functions of the proteins from each of the complete genomes. Complete proteome analysis is available for a total of 89 proteome sets. A specifically designed application enables InterPro proteome comparisons for any one proteome against one or more of the other proteomes in the database.

5.
Many protein regions have been shown to be intrinsically disordered, lacking unique structure under physiological conditions. These intrinsically disordered regions are not only very common in proteomes, but also crucial to the function of many proteins, especially those involved in signaling, recognition, and regulation. The goal of this work was to identify the prevalence, characteristics, and functions of conserved disordered regions within protein domains and families. A database was created to store the amino acid sequences of nearly one million proteins and their domain matches from the InterPro database, a resource integrating eight different protein family and domain databases. Disorder prediction was performed on these protein sequences. Regions of sequence corresponding to domains were aligned using a multiple sequence alignment tool. From this initial information, regions of conserved predicted disorder were found within the domains. The methodology for this search consisted of finding regions of consecutive positions in the multiple sequence alignments in which 90% or more of the sequences were predicted to be disordered. This procedure was constrained to find such regions of conserved disorder prediction that were at least 20 amino acids in length. The results of this work included 3,653 regions of conserved disorder prediction, found within 2,898 distinct InterPro entries. Most regions of conserved predicted disorder detected were short, with fewer than 10% of those found exceeding 30 residues in length.
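
The search described in this abstract (runs of consecutive alignment columns in which at least 90% of sequences are predicted disordered, kept only if at least 20 positions long) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name, the input encoding ("D" for predicted disorder, "O" for predicted order, "-" for a gap), the choice to count gaps as not disordered, and the choice to measure length in alignment columns are all assumptions made here.

```python
from typing import List, Tuple


def conserved_disorder_regions(
    aligned_disorder: List[str],
    min_fraction: float = 0.9,   # threshold taken from the abstract
    min_length: int = 20,        # minimum run length taken from the abstract
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column ranges of conserved predicted disorder.

    Each input string is one aligned sequence, encoded per column as
    "D" (predicted disordered), "O" (predicted ordered) or "-" (gap).
    This encoding is hypothetical; gaps are treated as not disordered.
    """
    if not aligned_disorder:
        return []
    n_seqs = len(aligned_disorder)
    n_cols = len(aligned_disorder[0])
    if any(len(row) != n_cols for row in aligned_disorder):
        raise ValueError("all aligned rows must have the same length")

    regions: List[Tuple[int, int]] = []
    run_start = None
    # Iterate one step past the last column so a run reaching the end is flushed.
    for col in range(n_cols + 1):
        conserved = False
        if col < n_cols:
            n_disordered = sum(row[col] == "D" for row in aligned_disorder)
            conserved = n_disordered / n_seqs >= min_fraction
        if conserved and run_start is None:
            run_start = col
        elif not conserved and run_start is not None:
            if col - run_start >= min_length:
                regions.append((run_start, col))
            run_start = None
    return regions


# Toy alignment: 9 of 10 sequences are predicted disordered in columns 3..24.
rows = ["O" * 3 + "D" * 22 + "O" * 5] * 9 + ["O" * 30]
print(conserved_disorder_regions(rows))  # [(3, 25)]
```

How gaps are counted and whether length is measured in columns or in ungapped residues would change the results, and the abstract does not specify either, so both are left as simple, explicit choices in this sketch.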

6.
7.
The CluSTr (Clusters of SWISS-PROT and TrEMBL proteins) database offers an automatic classification of SWISS-PROT and TrEMBL proteins into groups of related proteins. The clustering is based on analysis of all pairwise comparisons between protein sequences. Analysis has been carried out for different levels of protein similarity, yielding a hierarchical organisation of clusters. The database provides links to InterPro, which integrates information on protein families, domains and functional sites from PROSITE, PRINTS, Pfam and ProDom. Links to the InterPro graphical interface allow users to see at a glance whether proteins from the cluster share particular functional sites. CluSTr also provides cross-references to HSSP and PDB. The database is available for querying and browsing at http://www.ebi.ac.uk/clustr.

8.
9.
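Structure and function of the protein disulfide isomerase family   Total citations: 1, self-citations: 0, citations by others: 1
The protein disulfide isomerase (PDI) family is a class of thiol-disulfide oxidoreductases that function in the endoplasmic reticulum. Its members usually contain a CXXC (Cys-Xaa-Xaa-Cys) active site, whose two cysteine residues can catalyze the formation, isomerization and reduction of substrate disulfide bonds. All PDI family members contain at least one thioredoxin-homologous domain of about 100 amino acid residues. The main function of the PDI family is to catalyze the oxidative folding of nascent polypeptide chains in the endoplasmic reticulum; it also plays important roles in endoplasmic reticulum-associated protein degradation (ERAD), protein trafficking, calcium homeostasis, antigen presentation and virus entry.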

10.
InterPro, an integrated documentation resource for protein families, protein domains, and functional sites, was developed to amalgamate the individual efforts of the PROSITE, PRINTS, Pfam, and ProDom databases. InterPro can be used for the computational functional classification of newly determined amino acid sequences that lack biochemical characterization and for comparative genome analysis. InterPro contains over 3500 entries, with more than 1,000,000 hits in SWISS-PROT and TrEMBL. The database is accessible for text- and sequence-based searches at http://www.ebi.ac.uk/interpro/. InterPro was used for the complete analysis of the proteome of the pathogenic microorganism Mycobacterium tuberculosis and for comparison with the predicted protein-coding sequences of the complete genomes of Bacillus subtilis and Escherichia coli. It was found that 64.8% of proteins in the proteome of M. tuberculosis matched InterPro entries and could be classified by their functions. The comparison with B. subtilis and E. coli provided information on the most common protein families and domains and on the most highly represented protein families in each organism. Thus, InterPro is a useful tool for general comparison of complete proteomes and their compositions.

11.
Applications of InterPro in protein annotation and genome analysis   Total citations: 2, self-citations: 0, citations by others: 2
The applications of InterPro span a range of biologically important areas that includes automatic annotation of protein sequences and genome analysis. In automatic annotation of protein sequences InterPro has been utilised to provide reliable characterisation of sequences, identifying them as candidates for functional annotation. Rules based on the InterPro characterisation are stored and operated through a database called RuleBase. RuleBase is used as the main tool in the sequence database group at the EBI to apply automatic annotation to unknown sequences. The annotated sequences are stored and distributed in the TrEMBL protein sequence database. InterPro also provides a means to carry out statistical and comparative analyses of whole genomes. In the Proteome Analysis Database, InterPro analyses have been combined with other analyses based on CluSTr, the Gene Ontology (GO) and structural information on the proteins.

12.
Protein aggregation leads to several burdensome human maladies, but a molecular level understanding of how the human proteome has tackled the threat of aggregation is currently lacking. In this work, we survey the human proteome for incidence of aggregation prone regions (APRs), by using sequences of experimentally validated amyloid-fibril forming peptides and via computational predictions. While approximately 30 human proteins are currently known to be amyloidogenic, we found that 260 proteins (~1% of the human proteome) contain at least one experimentally validated amyloid-fibril forming segment. Computer predictions suggest that more than 80% of human proteins contain at least one potential APR and approximately two-thirds (65%) contain two or more APRs, spanning 3–5% of their sequences. Sequence randomizations show that this apparently high incidence of APRs has actually been significantly reduced by the unique amino acid composition and sequence patterning of human proteins. The human proteome has utilized a wide repertoire of sequence-structural optimization strategies, most of them already known, to minimize deleterious consequences due to the presence of APRs while simultaneously taking advantage of their order promoting properties. This survey also found that APRs tend to be located near the active and ligand binding sites in human proteins, but not near the post translational modification sites. The APRs in human proteins are also preferentially found at heterotypic interfaces rather than homotypic ones. Interestingly, this survey reveals that APRs play multiple, often opposing, roles in the human protein sequence-structure-function relationships. Insights gained from this work have several interesting implications for novel drug discovery and development. Proteins 2017; 85:1099–1118. © 2017 Wiley Periodicals, Inc.

13.
G Sipos, F Reggiori, C Vionnet, A Conzelmann. The EMBO Journal, 1997, 16(12):3494-3505
Glycosylphosphatidylinositol (GPI)-anchored membrane proteins of Saccharomyces cerevisiae exist with two types of lipid moiety, diacylglycerol or ceramide, both of which contain 26:0 fatty acids. To understand at which stage of biosynthesis these long-chain fatty acids become incorporated into diacylglycerol anchors, we compared the phosphatidylinositol moieties isolated from myo-[2-(3)H]inositol-labelled protein anchors and from GPI intermediates. There is no evidence for the presence of long-chain fatty acids in any intermediate of GPI biosynthesis. However, GPI-anchored proteins contain either the phosphatidylinositol moiety characteristic of the precursor lipids or a version with a long-chain fatty acid in the sn-2 position of glycerol. The introduction of long-chain fatty acids into sn-2 occurs in the endoplasmic reticulum (ER) and is independent of the sn-2-specific acyltransferase SLC1. Analysis of ceramide anchors revealed the presence of two types of ceramide, one added in the ER and another more polar molecule which is found only on proteins which have reached the mid Golgi. In summary, the lipid of GPI-anchored proteins can be exchanged by at least three different remodelling pathways: (i) remodelling from diacylglycerol to ceramide in the ER as proposed previously; (ii) remodelling from diacylglycerol to a more hydrophobic diacylglycerol with a long-chain fatty acid in sn-2 in the ER; and (iii) remodelling to a more polar ceramide in the Golgi.

14.
The proteins in blood were all first expressed as mRNAs from genes within cells. There are databases of human proteins that are known to be expressed as mRNA in human cells and tissues. Proteins identified from human blood by the correlation of mass spectra that fail to match human mRNA expression products may not be correct. We compared the proteins identified in human blood by mass spectrometry by 10 different groups by correlation to human and nonhuman nucleic acid sequences. We determined whether the peptides or proteins identified by the different groups mapped to the known human proteins of the Reference Sequence (RefSeq) database. We used Structured Query Language database searches of the peptide sequences correlated to tandem mass spectrometry spectra and basic local alignment search tool analysis of the identified full length proteins to control for correlation to the wrong peptide sequence or the existence of the same or very similar peptide sequence shared by more than one protein. Mass spectra were correlated against large protein databases that contain many sequences that may not be expressed in human beings, yet the search returned a very high percentage of peptides or proteins that are known to be found in humans. Only about 5% of proteins mapped to hypothetical sequences, which is in agreement with the reported false-positive rate of the search algorithms. The results were highly enriched in secreted and soluble proteins and diminished in insoluble or membrane proteins. Most of the proteins identified were relatively short and showed a similar size distribution compared to the RefSeq database. At least three groups agree on a nonredundant set of 1671 types of proteins, and a nonredundant set of 3151 proteins was identified by at least three peptides.

15.
The crystalloid endoplasmic reticulum (ER) of UT-1 cells is a specialized smooth ER that houses 3-hydroxy-3-methylglutaryl-CoA reductase, a membrane protein that regulates endogenous cholesterol synthesis. The biogenesis of this ER is coupled to the overproduction of 3-hydroxy-3-methylglutaryl-CoA reductase. To better understand this membrane system and the relationship between the synthesis of a membrane protein and the formation of membrane, we have purified the crystalloid ER. Purified crystalloid ER did not contain significant amounts of membrane derived from the Golgi apparatus, mitochondria, or plasma membrane. Approximately 24% of the protein in this organelle corresponded to 3-hydroxy-3-methylglutaryl-CoA reductase; however, at least eight other proteins were detected by gel electrophoresis. One of these proteins (Mr 73,000) was as abundant as reductase. These results suggest that the biogenesis of this ER involves the coordinate synthesis of multiple membrane and content proteins.

16.
Proteins that are concentrated in specific compartments of the endomembrane system in order to exert their organelle-specific function must possess specific localization signals that prevent their transport to distal regions of the exocytic pathway. Some resident proteins of the endoplasmic reticulum (ER) that are known to escape with low efficiency from this organelle to a post-ER compartment are recognized by a recycling receptor and brought back to their site of residence. Other ER proteins, however, appear to be retained in the ER by mechanisms that operate in the organelle itself. The mammalian oligosaccharyltransferase (OST) is a protein complex that effects the cotranslational N-glycosylation of newly synthesized polypeptides, and is composed of at least four rough ER-specific membrane proteins: ribophorins I and II (RI and RII), OST48, and Dad1. The mechanisms by which the subunits of this complex are retained in the ER are not well understood. In an effort to identify the domains within RII responsible for its ER localization, we have studied the fate of chimeric proteins in which one or more RII domains were replaced by the corresponding ones of the Tac antigen, the latter being a well characterized plasma membrane protein that lacks intrinsic ER retention signals and serves to provide a neutral framework for the identification of retention signals in other proteins. We found that the luminal domain of RII by itself does not contain retention information, while the cytoplasmic and transmembrane domains contain independent ER localization signals. We also show that the retention function of the transmembrane domain is strengthened by the presence of a flanking luminal region consisting of 15 amino acids.

17.
BACKGROUND: We evaluated both estrogen receptor (ER) and human epidermal growth factor receptor 2 (HER2) status on disseminated tumor cells (DTCs) in the bone marrow of 54 patients with early breast cancer and compared these with the corresponding primary tumor (PT). MATERIALS AND METHODS: Bone marrow aspirates were obtained at the time of first surgery, and ER and HER2 status on DTCs was assessed simultaneously by immunocytochemistry using a triple fluorescence staining method. RESULTS: The median number of DTCs was 13 (range 1-95). The concordance rate between ER status on DTC and PT was 74%. Patients with an ER-positive PT were significantly more likely to have at least one ER-positive DTC (34 out of 42) than patients with an ER-negative PT (6 out of 12; P = 0.031). Thirty-nine (93%) of the 42 patients with an ER-positive PT had at least one ER-negative DTC. The concordance rate between HER2 status on DTC and PT was 52%. The probability of having at least one HER2-positive DTC was not related to the HER2 status of the PT (P = 0.56). Twenty-two (46%) of the 48 patients with a HER2-negative PT had at least one HER2-positive DTC. All six patients with a HER2-positive PT had at least one HER2-negative DTC. CONCLUSION: Taken together, our study confirms that ER and/or HER2 status may differ between DTC and PT. This discordance could be important for patients lacking ER or HER2 expression on the PT but showing ER-positive or HER2-positive DTC because they might benefit from an endocrine and/or HER2-targeted therapy.

18.
We have constructed three gene fusions that encode portions of a membrane protein, arginine permease, fused to a reporter domain, the cytoplasmic enzyme histidinol dehydrogenase (HD), located at the C-terminal end. These fusion proteins contain at least one of the internal signal sequences of arginine permease. When the fusion proteins were expressed in Saccharomyces cerevisiae and inserted into the endoplasmic reticulum (ER), two of the fusion proteins placed HD on the luminal side of the ER membrane, but only when a piece of DNA encoding a spacer protein segment was inserted into the fusion joint. The third fusion protein, with or without the spacer included, placed HD on the cytoplasmic side of the membrane. These results suggest that (i) sequences C-terminal to the internal signal sequence can inhibit membrane insertion and (ii) HD requires a preceding spacer segment to be translocated across the ER membrane.

19.
InterPro (http://www.ebi.ac.uk/interpro/) is an integrated documentation resource for protein families, domains and sites, developed initially as a means of rationalizing the complementary efforts of the PROSITE, PRINTS, Pfam and ProDom database projects. It is a useful resource that aids the functional classification of proteins. Almost 90% of the Actinopterygii protein sequences from SWISS-PROT and TrEMBL can be classified using InterPro. Over 30% of the Actinopterygii protein sequences currently in SWISS-PROT and TrEMBL are of mitochondrial origin, the majority of which belong to the cytochrome b/b6 family. InterPro also gives insights into the domain composition of the classified proteins and has applications in the functional classification of newly determined sequences lacking biochemical characterization, and in comparative genome analysis. A comparison of the Actinopterygii protein sequences against the sequences of other eukaryotes confirms the high representation of eukaryotic protein kinase in the organisms studied. The comparisons also show that, based on InterPro families, the trans-species evolution of MHC class I and II molecules in mammals and teleost fish can be recognized.

20.
The human PDI family: versatility packed into a single fold   Total citations: 2, self-citations: 0, citations by others: 2
The enzymes of the protein disulfide isomerase (PDI) family are thiol-disulfide oxidoreductases of the endoplasmic reticulum (ER). They contain a CXXC active-site sequence where the two cysteines catalyze the exchange of a disulfide bond with or within substrates. The primary function of the PDIs in promoting oxidative protein folding in the ER has been extended in recent years to include roles in other processes such as ER-associated degradation (ERAD), trafficking, calcium homeostasis, antigen presentation and virus entry. Some of these functions are performed by non-catalytic members of the family that lack the active-site cysteines. Regardless of their function, all human PDIs contain at least one domain of approximately 100 amino acid residues with structural homology to thioredoxin. As we learn more about the individual proteins of the family, a complex picture is emerging that emphasizes as much their differences as their similarities, and underlines the versatility of the thioredoxin fold. Here, we primarily explore the diversity of cellular functions described for the human PDIs.
