Similar literature
 20 similar articles found (search time: 46 ms)
1.
M E Reith, R A Cattolico. Biochemistry 1985, 24(10): 2556-2561
Information on the ctDNA protein coding profile of the Chlorophyta, Rhodophyta, and Chromophyta might provide clues to the evolutionary mechanism(s) by which plants diverged into these three phylogenetic groups. The purpose of this study was to examine the ctDNA protein coding profile of the chromophytic plant Olisthodiscus luteus. Whole cells were labeled in the presence of cycloheximide, an inhibitor of cytoplasmic protein synthesis. Control experiments demonstrate that the chloroplast proteins labeled in vivo by this technique form a distinct subset of the total proteins synthesized by the cell. Approximately 50 plastid proteins (35 soluble, 15 membrane) were detected after two-dimensional gel electrophoresis and fluorography. Three ctDNA-coded proteins, the large subunit of ribulosebisphosphate carboxylase, the apoprotein of the P700-chlorophyll a-protein complex, and the "photogene" were identified. These proteins are also coded by chlorophytic ctDNA. Unexpectedly, the ctDNA of Olisthodiscus was shown to code for the small subunit of ribulosebisphosphate carboxylase. The gene for this enzyme subunit is nuclear coded in all chlorophytic plants that have been analyzed.

2.
MOTIVATION: Protein sequence clustering has been widely exploited to facilitate in-depth analysis of protein functions and families. For some applications of protein sequence clustering, it is highly desirable that a hierarchical structure, also referred to as dendrogram, which shows how proteins are clustered at various levels, is generated. However, as the sizes of contemporary protein databases continue to grow at rapid rates, it is of great interest to develop some summarization mechanisms so that the users can browse the dendrogram and/or search for the desired information more effectively. RESULTS: In this paper, the design of a novel incremental clustering algorithm aimed at generating summarized dendrograms for analysis of protein databases is described. The proposed incremental clustering algorithm employs a statistics-based model to summarize the distributions of the similarity scores among the proteins in the database and to control formation of clusters. Experimental results reveal that, due to the summarization mechanism incorporated, the proposed incremental clustering algorithm offers the users highly concise dendrograms for analysis of protein clusters with biological significance. Another distinction of the proposed algorithm is its incremental nature. As the sizes of the contemporary protein databases continue to grow at fast rates, due to the concern of efficiency, it is desirable that cluster analysis of a protein database can be carried out incrementally, when the protein database is updated. Experimental results with the Swiss-Prot protein database reveal that the time complexity for carrying out incremental clustering with k new proteins added into the database containing n proteins is $O(n^{2\beta} \log n)$, where $\beta \cong 0.865$, provided that k < n. AVAILABILITY: The Linux executable is available on the following supplementary page.

3.
Sequence databases are rapidly growing, thereby increasing the coverage of protein sequence space, but this coverage is uneven because most sequencing efforts have concentrated on a small number of organisms. The resulting granularity of sequence space creates many problems for profile-based sequence comparison programs. In this paper, we suggest several strategies that address these problems, and at the same time speed up the searches for homologous proteins and improve the ability of profile methods to recognize distant homologies. One of our strategies combines database clustering, which removes highly redundant sequences, and a two-step PSI-BLAST (PDB-BLAST), which separates the sequence space used for profile composition from the sequence space used for homology searching. The combination of these strategies improves distant homology recognition by more than 100%, while using only 10% of the CPU time of the standard PSI-BLAST search. Another method, intermediate profile searches, allows for the exploration of additional search directions that are normally dominated by large protein sub-families within very diverse families. All methods are evaluated with a large fold-recognition benchmark.

4.
T Itoh, H Matsuda, H Mori. DNA Research 1999, 6(5): 299-305
Novel members of the highly conserved protein family, Hsp70, have been found in the complete sequences of several genomes. To elucidate a phylogenetic relationship among Hsp70 proteins of Escherichia coli, we searched all open reading frames derived from 13 complete genomes for Hsp70/actin-related proteins by the single-linkage clustering method. Phylogenetic analysis of this superfamily revealed that E. coli possesses at least three Hsp70 homologs (DnaK, Hsc66 and Hsc62). We found that Hsc62, which is the product of hscC, is a new member of the Hsc66 subfamily, and is specific to E. coli. The analysis also suggested that YegD of E. coli is closely related to the actin family, which consists of the actin, FtsA and MreB subfamilies. A further database search revealed that two dnaJ homologs, ybeS and ybeV, were located on the opposite strand near hscC. Consequently, E. coli seems to have three gene clusters composed of DnaK and DnaJ homologs.
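
The single-linkage clustering step named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the protein labels, similarity scores and threshold below are hypothetical, and pairwise scores are assumed to be precomputed. Two proteins end up in the same cluster whenever a chain of pairs scoring at or above the threshold connects them.

```python
def single_linkage(items, scored_pairs, threshold):
    """Cluster items so that any chain of pairs with score >= threshold
    joins its members into one cluster (single linkage)."""
    parent = {item: item for item in items}

    def find(x):
        # Walk up to the root of x's tree, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, score in scored_pairs:
        if score >= threshold:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for item in items:
        clusters.setdefault(find(item), set()).add(item)
    # Sort clusters by their sorted member names for a stable result.
    return sorted(clusters.values(), key=lambda c: sorted(c))


# Hypothetical proteins and similarity scores, for illustration only.
proteins = ["P1", "P2", "P3", "P4", "P5"]
pairs = [("P1", "P2", 0.9), ("P2", "P3", 0.8),
         ("P4", "P5", 0.7), ("P1", "P4", 0.2)]
clusters = single_linkage(proteins, pairs, threshold=0.5)
```

P1 and P3 share a cluster even though no direct pair links them, because both are linked to P2; this chaining behaviour is what distinguishes single linkage from stricter criteria.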

5.
Current state-of-the-art experimental and computational proteomic approaches were integrated to obtain a comprehensive protein profile of Populus vascular tissue. This featured: (1) a large sample set consisting of two genotypes grown under normal and tension stress conditions, (2) bioinformatics clustering to effectively handle gene duplication, and (3) an informatics approach to track and identify single amino acid polymorphisms (SAAPs). By applying a clustering algorithm to the Populus database, the number of protein entries decreased from 64,689 proteins to a total of 43,069 protein groups, thereby reducing 7505 identified proteins to a total of 4226 protein groups, in which 2016 were singletons. This reduction implies that ~50% of the measured proteins shared extensive sequence homology. Using conservative search criteria, we were able to identify 1354 peptides containing a SAAP and 201 peptides that become tryptic due to a K or R substitution. These newly identified peptides correspond to 502 proteins, including 97 previously unidentified proteins. In total, the integration of deep proteome measurements on an extensive sample set with protein clustering and peptide sequence variants provided an exceptional level of proteome characterization for Populus, allowing us to spatially resolve the vascular tissue proteome.
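
To make concrete how a K or R substitution can make a peptide "become tryptic", here is a minimal sketch of in silico trypsin digestion. It assumes the common simplified rule (cleave after K or R unless the next residue is P) and uses made-up sequences; the function name is hypothetical and this is not the study's pipeline.

```python
import re


def tryptic_peptides(sequence):
    """Split a protein sequence after every K or R that is not followed by P."""
    # The pattern matches the empty position right after K/R (lookbehind)
    # where the next residue is not P (negative lookahead).
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in pieces if p]  # drop the empty piece after a final K/R


# Made-up sequences: the second differs from the first by an S -> K substitution.
reference = tryptic_peptides("AASGGR")
variant = tryptic_peptides("AAKGGR")
```

In the variant, the substitution creates a new cleavage site, so two shorter tryptic peptides appear where the reference yields one.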

6.
7.
I have looked for proteins that are present in all hyperthermophile genomes, but absent from all mesophile or thermophile genomes by using the phylogenetic pattern search program of the COG database. Surprisingly, this search retrieved only one such hyperthermophile-specific protein: reverse gyrase. This result emphasizes the importance of reverse gyrase in the adaptation of life to very high temperatures, and strengthens the idea that evolution of this enzyme was crucial in the origin of hyperthermophiles.

8.
Low complexity proteins and protein domains have sequences which appear highly non-random. Over the years, these sequences have been routinely filtered out during sequence similarity searches because interest has been focused on globular proteins, and inclusion of these domains can severely skew search results. However, early work on these proteins and more recent studies of the related area of repeated protein sequences suggest that low complexity protein domains have function and therefore are in need of further investigation. 0j.py is a new tool for demarcating low complexity protein domains more accurately than has been possible to date. The paper describes 0j.py and its use in revealing proteins with repeated and poly-amino-acid peptides. Statistical methods are then employed to examine the distribution of these proteins across species, while keyword clustering is used to suggest roles performed by proteins through the use of low complexity domains.
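
The abstract does not describe how 0j.py works, so the following is explicitly not that algorithm. It is a generic sketch of one common way to flag low complexity regions: compute the Shannon entropy of residue frequencies in a sliding window and report windows below a cutoff. The window size, cutoff and test sequence are assumptions chosen for illustration.

```python
import math
from collections import Counter


def shannon_entropy(segment):
    """Shannon entropy (bits) of the residue frequencies in a segment."""
    counts = Counter(segment)
    n = len(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_complexity_windows(sequence, window=12, max_entropy=2.2):
    """Start positions of windows whose entropy is at or below max_entropy."""
    return [
        i
        for i in range(len(sequence) - window + 1)
        if shannon_entropy(sequence[i:i + window]) <= max_entropy
    ]


# A poly-Q stretch has zero entropy, so every window over it is flagged.
flagged = low_complexity_windows("QQQQQQ", window=4, max_entropy=0.5)
```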

9.
Various sequence-motif and sequence-cluster databases have been integrated into a new resource known as InterPro. Because the contributing databases have different clustering principles and scoring sensitivities, the combined assignments complement each other for grouping protein families and delineating domains. InterPro and new developments in the analysis of both the phylogenetic profiles of protein families and domain fusion events improve the prediction of specific functions for numerous proteins.

10.
Park GW, Kwon KH, Kim JY, Lee JH, Yun SH, Kim SI, Park YM, Cho SY, Paik YK, Yoo JS. Proteomics 2006, 6(4): 1121-1132
In shotgun proteomics, proteins can be fractionated by 1-D gel electrophoresis and digested into peptides, followed by liquid chromatography to separate the peptide mixture. Mass spectrometry generates hundreds of thousands of tandem mass spectra from these fractions, and proteins are identified by database searching. However, the search scores are usually not sufficient to distinguish the correct peptides. In this study, we propose a confident protein identification method for high-throughput analysis of the human proteome. To build a filtering protocol for database searching, we chose Pseudomonas putida KT2440 as a reference because this bacterial proteome contains fewer modifications and is simpler than the human proteome. First, the P. putida KT2440 proteome was filtered by reversed sequence database search and correlated by the molecular weight in 1-D-gel band positions. The characterization protocol was then applied to determine the criteria for clustering of the human plasma proteome into three different groups. This protein filtering method, based on bacterial proteome data analysis, represents a rapid way to generate a higher-confidence protein list for the human proteome, which includes some heavily modified and cleaved proteins.
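
The reversed sequence database search mentioned above is usually understood as a target-decoy strategy. The sketch below shows the general idea under stated assumptions, not the authors' protocol: the `REV_` prefix, the score values and the simple decoys-over-targets estimate are illustrative choices.

```python
def reversed_decoys(proteins):
    """Build decoy entries by reversing each target sequence."""
    return {"REV_" + name: seq[::-1] for name, seq in proteins.items()}


def estimated_fdr(hits, threshold):
    """Estimate the false discovery rate among hits scoring >= threshold.

    hits is a list of (score, is_decoy) tuples; the estimate is the number
    of decoy hits divided by the number of target hits above the threshold.
    """
    targets = sum(1 for score, is_decoy in hits if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in hits if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


# Hypothetical database and search scores, for illustration only.
decoy_db = reversed_decoys({"P1": "MKTA"})
hits = [(3.0, False), (2.5, False), (2.0, True), (1.0, False), (0.5, True)]
fdr_loose = estimated_fdr(hits, threshold=2.0)
fdr_strict = estimated_fdr(hits, threshold=2.5)
```

Raising the score threshold until the estimated rate falls below a chosen level is one way such a filter can be tuned on a simpler proteome before being applied to a more complex one.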

11.
Oxaliplatin, a platinum-based chemotherapeutic agent, is widely used clinically in cancer, especially colorectal cancer. However, oxaliplatin-induced peripheral neurotoxicity (OIPN) has a high incidence, and to date, there have been few detailed studies on pathogenesis and treatment mechanisms. The present study used a proteomic approach to explore protein expression profiling of rats treated with oxaliplatin by multiplex isobaric tags for relative and absolute quantification labeling and two-dimensional liquid chromatography-tandem mass spectrometry. There were 74 proteins that showed different expression in sciatic nerve between control rats and OIPN model rats, with 53 upregulated proteins and 21 downregulated proteins detected in OIPN groups compared with control groups. On the basis of Gene Ontology clustering, these proteins were associated with biological processes (eg, muscle contraction, muscle system process, and skeletal muscle contraction), cellular components (eg, myofibril, contractile fiber, and contractile fiber part) and molecular functions (eg, structural constituent of muscle, hydro-lyase activity, and calcium ion binding). On the basis of the Kyoto Encyclopedia of Genes and Genomes pathway database, these proteins were associated with African trypanosomiasis, malaria, nitrogen metabolism, etc. Real-time polymerase chain reaction, Western blot and immunohistochemistry analyses were performed to examine the expression of some of the differentially expressed proteins. In conclusion, our study establishes a protein expression profile of oxaliplatin-induced rats and mechanisms leading to OIPN development, and will be useful for developing novel diagnostic biomarkers and aiding in the prevention and control of OIPN.

12.

Mistletoes are semiparasitic plants containing pharmaceutical proteins with applications in cancer treatment. Previous research has demonstrated that somaclonal variation can lead to the biosynthesis of novel proteins from mistletoe callus cultures. The protein content of Viscum album subsp. abietis tissues and biotechnologically propagated calluses was analyzed to identify proteins with putative anticancer properties. In addition, evolutionary relations among species related to Viscum were studied. Calluses were propagated from stem explants. The mass spectra of the protein extracts were processed with Proteome Discoverer and a search was performed using the reviewed UniProt V. album database as reference. A phylogenetic tree was reconstructed using the LG amino acid substitution model from homologous sequences of Beta-galactoside-specific lectin 2. The homology modeling of the Beta-galactoside-specific lectin 2 was carried out using Modeller software. Considerable differences were observed by comparing the protein content of the calluses and the maternal tissues. Four mistletoe lectins, six viscotoxins and the chitin binding lectin-cbML were identified within the species tissues. An in silico phylogenetic and structural study provides insights into the role of these lectins and the mechanism of semiparasite survival and evolution, towards a novel anticancer and immune system modulation pipeline. Callogenesis exhibited protein biosynthesis alterations and novel protein isoform expression. Phylogenetic analysis revealed evolutionary relations primarily within the Viscum genus and other species containing 2-ribosome inactivating proteins. The homology modeling of the mistletoe lectin 2 revealed possible structure-related anticancer properties. In conclusion, mistletoe calluses were shown to possess a unique protein biosynthetic profile compared to donor plant tissues.


13.
PALI (release 1.2) contains three-dimensional (3-D) structure-dependent sequence alignments as well as structure-based phylogenetic trees of homologous protein domains in various families. The data set of homologous protein structures has been derived by consulting the SCOP database (release 1.50) and the data set comprises 604 families of homologous proteins involving 2739 protein domain structures with each family made up of at least two members. Each member in a family has been structurally aligned with every other member in the same family (pairwise alignment) and all the members in the family are also aligned using simultaneous super-position (multiple alignment). The structural alignments are performed largely automatically, with manual interventions especially in the cases of distantly related proteins, using the program STAMP (version 4.2). Every family is also associated with two dendrograms, calculated using PHYLIP (version 3.5), one based on a structural dissimilarity metric defined for every pairwise alignment and the other based on similarity of topologically equivalent residues. These dendrograms enable easy comparison of sequence and structure-based relationships among the members in a family. Structure-based alignments with the details of structural and sequence similarities, superposed coordinate sets and dendrograms can be accessed conveniently using a web interface. The database can be queried for protein pairs with sequence or structural similarities falling within a specified range. Thus PALI forms a useful resource to help in analysing the relationship between sequence and structure variation at a given level of sequence similarity. PALI also contains over 653 'orphans' (single member families). Using the web interface involving PSI-BLAST and PHYLIP it is possible to associate the sequence of a new protein with one of the families in PALI and generate a phylogenetic tree combining the query sequence and proteins of known 3-D structure. The database with the web interfaced search and dendrogram generation tools can be accessed at http://pauling.mbu.iisc.ernet.in/~pali.

14.
Mass spectrometry-driven BLAST (MS BLAST) is a database search protocol for identifying unknown proteins by sequence similarity to homologous proteins available in a database. MS BLAST utilizes redundant, degenerate, and partially inaccurate peptide sequence data obtained by de novo interpretation of tandem mass spectra and has become a powerful tool in functional proteomic research. Using computational modeling, we evaluated the potential of MS BLAST for proteome-wide identification of unknown proteins. We determined how the success rate of protein identification depends on the full-length sequence identity between the queried protein and its closest homologue in a database. We also estimated phylogenetic distances between organisms under study and related reference organisms with completely sequenced genomes that allow substantial coverage of unknown proteomes.

15.
16.
The minimal set of proteins necessary to maintain a vertebrate cell forms an interesting core of cellular machinery. The known proteome of the human red blood cell consists of about 1400 proteins. We treated this protein complement of one of the simplest human cells as a model and asked questions about its function and origins. The proteome was mapped onto phylogenetic profiles, i.e. vectors of species possessing homologues of human proteins. A novel clustering approach was devised, utilising similarity in the phylogenetic spread of homologues as the distance measure. The clustering based on phylogenetic profiles yielded several distinct protein classes differing in phylogenetic taxonomic spread, presumed evolutionary history and functional properties. Notably, small clusters of proteins common to vertebrates or Metazoa and other multicellular eukaryotes involve biological functions specific to multicellular organisms, such as apoptosis or cell-cell signaling, respectively. Also, a eukaryote-specific cluster is identified, featuring GTPase signalling and ubiquitination. Another cluster, made up of proteins found in most organisms, including bacteria and archaea, involves basic molecular functions such as oxidation-reduction and glycolysis. Approximately one third of erythrocyte proteins do not fall in any of the clusters, reflecting the complexity of protein evolution in comparison to our simple model. Basically, the clustering obtained divides the proteome into old and new parts, the former originating from bacterial ancestors, the latter from inventions within multicellular eukaryotes. Thus, the model human cell proteome appears to be made up of protein sets distinct in their history and biological roles. The current work shows that the phylogenetic profile concept allows protein clustering in a way relevant both to biological function and evolutionary history.
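
A phylogenetic profile, as described above, records which species possess a homologue of a given protein. The abstract does not specify the distance measure used, so the sketch below assumes Jaccard distance between profiles represented as sets of species; the protein names and species lists are hypothetical.

```python
def jaccard_distance(profile_a, profile_b):
    """Distance between two phylogenetic profiles given as sets of species.

    0.0 means the homologues occur in exactly the same species;
    1.0 means the two profiles share no species at all.
    """
    union = profile_a | profile_b
    if not union:
        return 0.0  # two empty profiles are treated as identical
    return 1.0 - len(profile_a & profile_b) / len(union)


# Hypothetical profiles, for illustration only.
profiles = {
    "protA": {"human", "mouse", "yeast", "E. coli"},
    "protB": {"human", "mouse", "yeast", "E. coli"},
    "protC": {"human", "mouse"},
}
d_ab = jaccard_distance(profiles["protA"], profiles["protB"])
d_ac = jaccard_distance(profiles["protA"], profiles["protC"])
```

Here protA and protB would fall into the same "ancient" cluster, while protC, restricted to two vertebrates, sits at distance 0.5 from both. Any standard clustering method can then be run on the resulting distance matrix.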

17.
In the past, a large number of methods have been developed for predicting various characteristics of a protein from its composition. In order to exploit the full potential of protein composition, we developed the web-server COPid to assist researchers in annotating the function of a protein from its composition using the whole or part of the protein. COPid has three modules called search, composition and analysis. The search module allows searching of protein sequences in six different databases. Search results list database proteins in ascending order of Euclidean distance or descending order of compositional similarity with the query sequence. The composition module allows calculation of the composition of a sequence and average composition of a group of sequences. The composition module also allows computing the composition of various types of amino acids (e.g. charged, polar, hydrophobic residues). The analysis module provides the following options: i) comparing composition of two classes of proteins, ii) creating a phylogenetic tree based on the composition and iii) generating input patterns for machine learning techniques. We have evaluated the performance of composition-based (or alignment-free) similarity search in the subcellular localization of proteins. It was found that the alignment-free method performs reasonably well in predicting certain classes of proteins. The COPid web-server is available at http://www.imtech.res.in/raghava/copid/.
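
The composition-based search described above ranks database proteins by the Euclidean distance between amino acid composition vectors. A minimal sketch of that idea follows; it is not COPid's code, and the function names, the percentage scaling and the toy sequences are assumptions.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(sequence):
    """Percentage of each standard amino acid in the sequence (20 values)."""
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [100.0 * counts[aa] / total for aa in AMINO_ACIDS]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def rank_by_composition(query, database):
    """Return database names sorted by ascending distance to the query."""
    q = composition(query)
    return sorted(database, key=lambda name: euclidean(q, composition(database[name])))


# Toy sequences, for illustration only.
db = {"near": "AAAC", "far": "WWWW"}
ranking = rank_by_composition("AAAA", db)
```

Because only residue frequencies are compared, no alignment is needed, which is why the abstract calls the approach alignment-free.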

18.
The HSSP database of protein structure-sequence alignments.   Total citations: 2 (self-citations: 0; citations by others: 2)
HSSP is a derived database merging structural three-dimensional (3-D) and sequence one-dimensional (1-D) information. For each protein of known 3-D structure from the Protein Data Bank (PDB), the database has a multiple sequence alignment of all available homologues and a sequence profile characteristic of the family. The list of homologues is the result of a database search in Swissprot using a position-weighted dynamic programming method for sequence profile alignment (MaxHom). The database is updated frequently. The listed homologues are very likely to have the same 3-D structure as the PDB protein to which they have been aligned. As a result, the database is not only a database of aligned sequence families, but also a database of implied secondary and tertiary structures covering 27% of all Swissprot-stored sequences.

19.
The HSSP database of protein structure-sequence alignments.   Total citations: 4 (self-citations: 0; citations by others: 4)
HSSP is a derived database merging structural (3-D) and sequence (1-D) information. For each protein of known 3-D structure from the Protein Data Bank (PDB), the database has a multiple sequence alignment of all available homologues and a sequence profile characteristic of the family. The list of homologues is the result of a database search in SwissProt using a position-weighted dynamic programming method for sequence profile alignment (MaxHom). The database is updated frequently. The listed homologues are very likely to have the same 3-D structure as the PDB protein to which they have been aligned. As a result, the database is not only a database of aligned sequence families, but also a database of implied secondary and tertiary structures covering 29% of all SwissProt-stored sequences.

20.
A classification scheme for membrane proteins is proposed that clusters families of proteins into structural classes based on hydropathy profile analysis. The averaged hydropathy profiles of protein families are taken as fingerprints of the 3D structure of the proteins and, therefore, are able to detect more distant evolutionary relationships than amino acid sequences. A procedure was developed in which hydropathy profile analysis is used initially as a filter in a BLAST search of the NCBI protein database. The strength of the procedure is demonstrated by the classification of 29 families of secondary transporters into a single structural class, termed ST[3]. An exhaustive search of the database revealed that the 29 families contain 568 unique sequences. The proteins are predominantly of prokaryotic origin; most of the characterized transporters in ST[3] transport organic and inorganic anions, and a smaller number are Na(+)/H(+) antiporters. All modes of energy coupling (symport, antiport, uniport) are found in structural class ST[3]. The relevance of the classification for structure/function prediction of uncharacterised transporters in the class is discussed.
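
A hydropathy profile of a single sequence can be sketched as a sliding-window average of per-residue hydropathy values. The abstract does not state which scale or window the authors used, and their method additionally averages profiles over aligned family members; the example below assumes the Kyte-Doolittle scale and an arbitrary window, purely for illustration.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=3):
    """Mean Kyte-Doolittle hydropathy for each window along the sequence."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    values = [KD[aa] for aa in sequence.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Toy sequence: a hydrophobic stretch followed by charged residues.
profile = hydropathy_profile("IIKK", window=2)
```

Stretches of consistently high values in such a profile are the usual signature of membrane-spanning segments, which is why profiles can act as a coarse structural fingerprint.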
