Similar articles
 20 similar articles found.
1.
Among land plants, mitochondrial and plastid group II introns occasionally encode proteins called maturases that are important for splicing. Angiosperm nuclear genomes also encode maturases that are targeted to the organelles, but it is not known whether nucleus-encoded maturases exist in other land plant lineages. To examine the evolutionary diversity and history of this essential gene family, we searched for maturase homologs in recently sequenced nuclear and mitochondrial genomes from diverse land plants. We found that maturase content in mitochondrial genomes is highly lineage specific, such that orthologous maturases are rarely shared among major land plant groups. The presence of numerous mitochondrial pseudogenes in the mitochondrial genomes of several species implies that the sporadic maturase distribution is due to frequent inactivation and eventual loss over time. We also identified multiple maturase paralogs in the nuclear genomes of the lycophyte Selaginella moellendorffii, the moss Physcomitrella patens, and the representative angiosperm Vitis vinifera. Phylogenetic analyses of organelle- and nucleus-encoded maturases revealed that the nuclear maturase genes in angiosperms, lycophytes, and mosses arose by multiple shared and independent transfers of mitochondrial paralogs to the nuclear genome during land plant evolution. These findings indicate that plant mitochondrial maturases have experienced a surprisingly dynamic history due to a complex interaction of multiple evolutionary forces that affect the rates of maturase gain, retention, and loss.

2.
3.
New directions in biology are being driven by the complete sequencing of genomes, which has given us the protein repertoires of diverse organisms from all kingdoms of life. In tandem with this accumulation of sequence data, worldwide structural genomics initiatives, advanced by the development of improved technologies in X-ray crystallography and NMR, are expanding our knowledge of structural families and increasing our fold libraries. Methods for detecting remote sequence similarities have also been made more sensitive and this means that we can map domains from these structural families onto genome sequences to understand how these families are distributed throughout the genomes and reveal how they might influence the functional repertoires and biological complexities of the organisms. We have used robust protocols to assign sequences from completed genomes to domain structures in the CATH database, allowing up to 60% of domain sequences in these genomes, depending on the organism, to be assigned to a domain family of known structure. Analysis of the distribution of these families throughout bacterial genomes identified more than 300 universal families, some of which had expanded significantly in proportion to genome size. These highly expanded families are primarily involved in metabolism and regulation and appear to make major contributions to the functional repertoire and complexity of bacterial organisms. When comparisons are made across all kingdoms of life, we find a smaller set of universal domain families (approx. 140), of which families involved in protein biosynthesis are the largest conserved component. Analysis of the behaviour of other families reveals that some (e.g. those involved in metabolism, regulation) have remained highly innovative during evolution, making it harder to trace their evolutionary ancestry. 
Structural analyses of metabolic families provide some insights into the mechanisms of functional innovation, which include changes in domain partnerships and significant structural embellishments leading to modulation of active sites and protein interactions.

4.
5.
Liu F, Baggerman G, Schoofs L, Wets G. Peptides, 2006, 27(12): 3137-3153
Bioactive (neuro)peptides play critical roles in regulating most biological processes in animals. Peptides belonging to the same family are characterized by a typical sequence pattern that is conserved among the family's peptide members. Such a conserved pattern or motif usually corresponds to the functionally important part of the biologically active peptide. In this paper, all known bioactive (neuro)peptides annotated in the Swiss-Prot and TrEMBL protein databases are collected, and the pattern searching program Pratt is used to search these unaligned peptide sequences for conserved patterns. The obtained patterns are then refined by combining the information on amino acids at important functional sites collected from the literature. All the identified patterns are further tested by scanning them against the Swiss-Prot and TrEMBL protein databases. The diagnostic power of each pattern is validated by the fact that any annotated protein from Swiss-Prot and TrEMBL that contains one of the established patterns is indeed a known (neuro)peptide precursor. We discovered 155 novel peptide patterns in addition to the 56 established ones in the PROSITE database. All the patterns cover 110 peptide families. Fifty-five of these families are not characterized by the PROSITE signatures, and 12 are also not identified by other existing motif databases, such as Pfam and SMART. Using the newly identified peptide signatures as a search tool, we predicted 95 hypothetical proteins as putative peptide precursors.
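
The scanning step described in this abstract can be illustrated with a minimal sketch. It assumes PROSITE-style pattern syntax (`x` for any residue, `[..]` for allowed residues, `{..}` for excluded residues, `(n)` or `(n,m)` for repeats, `<` and `>` for terminal anchors) and translates a pattern into a regular expression. The pattern and the peptide sequences below are invented for illustration and are not taken from the paper; the function names are likewise hypothetical.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern such as 'C-x(2,4)-[ST]-{P}-G' into a regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):  # anchor at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):  # anchor at the C-terminus
            suffix, element = "$", element[:-1]
        # Separate the residue specification from an optional repeat such as (3) or (2,4).
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, lo, hi = m.group(1), m.group(2), m.group(3)
        if core == "x":  # any residue
            core = "."
        elif core.startswith("{") and core.endswith("}"):  # excluded residues
            core = "[^" + core[1:-1] + "]"
        # Single letters and [..] classes are already valid regex fragments.
        repeat = ""
        if lo is not None:
            repeat = "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def scan(pattern: str, sequences: dict) -> dict:
    """Return {name: [start positions]} for sequences with non-overlapping matches."""
    regex = re.compile(prosite_to_regex(pattern))
    hits = {}
    for name, seq in sequences.items():
        positions = [m.start() for m in regex.finditer(seq)]
        if positions:
            hits[name] = positions
    return hits


# Hypothetical toy data, for illustration only.
toy_pattern = "C-x(2,4)-[ST]-{P}-G"
toy_sequences = {"pep1": "AACAASAGKK", "pep2": "AACAASPGKK"}
print(scan(toy_pattern, toy_sequences))
```

Because `re.finditer` reports non-overlapping matches only, a production scanner that needs overlapping hits would have to use a different strategy, for example a lookahead-based search.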

6.
LAGLIDADG endonucleases bind across adjacent major grooves via a saddle-shaped surface and catalyze DNA cleavage. Some LAGLIDADG proteins, called maturases, facilitate splicing by group I introns, raising the issue of how a DNA-binding protein and an RNA have evolved to function together. In this report, crystallographic analysis shows that the global architecture of the bI3 maturase is unchanged from its DNA-binding homologs; in contrast, the endonuclease active site, dispensable for splicing facilitation, is efficiently compromised by a lysine residue replacing essential catalytic groups. Biochemical experiments show that the maturase binds a peripheral RNA domain 50 Å from the splicing active site, exemplifying long-distance structural communication in a ribonucleoprotein complex. The bI3 maturase nucleic acid recognition saddle interacts at the RNA minor groove; thus, evolution from DNA to RNA function has been mediated by a switch from major to minor groove interaction.

7.
Much attention has been paid to amphibian peptides for their wide-ranging pharmacological properties, clinical potential, and gene-encoded origin. More than 300 antimicrobial peptides (AMPs) from amphibians have been studied. Peptidomics and genomics analyses, combined with functional tests including microorganism killing, histamine release, and mast cell degranulation, were used to investigate antimicrobial peptide diversity. Thirty-four novel AMPs from skin secretions of Rana nigrovittata were identified in the current work, and they belong to 9 families, including 6 novel families. The other three families are classified into the rugosin, gaegurin, and temporin families of amphibian AMPs, respectively. These AMPs share highly conserved preproregions, including signal peptides and acidic spacer peptides, while their mature peptide structures are greatly diversified. In this work, peptidomics combined with genomics analysis was confirmed to be an effective way to identify amphibian AMPs, especially novel families. Some AMPs reported here will provide lead molecules for designing novel antimicrobial agents.

8.
The sterile alpha motif (SAM) domain is a protein module found in many diverse signaling proteins. SAM domains in some systems have been shown to self-associate. Previous crystal structures of an EphA4-SAM domain dimer (Stapleton, D., Balan, I., Pawson, T., and Sicheri, F. (1999) Nat. Struct. Biol. 6, 44-49) and a possible EphB2-SAM oligomer (Thanos, C. D., Goodwill, K. E., and Bowie, J. U. (1999) Science 283, 833-836) both revealed large interfaces comprising an exchange of N-terminal peptide arms. Within the arm, a conserved hydrophobic residue (Tyr-8 in the EphB2-SAM structure or Phe-910 in the EphA4-SAM structure) is anchored into a hydrophobic cleft on a neighboring molecule. Here we have solved a new crystal form of the human EphB2-SAM domain that has the same overall SAM domain fold yet has no substantial intermolecular contacts. In the new structure, the N-terminal peptide arm of the EphB2-SAM domain protrudes out from the core of the molecule, leaving both the arm (including Tyr-8) and the hydrophobic cleft solvent-exposed. To verify that Tyr-8 is solvent-exposed in solution, we made a Tyr-8 to Ala-8 mutation and found that the EphB2-SAM domain structure and stability were only slightly altered. These results suggest that Tyr-8 is not part of the hydrophobic core of the EphB2-SAM domain and is conserved for functional reasons. Crystallographic evidence suggests a possible role for the N-terminal arm in oligomerization. In the absence of a direct demonstration of biological relevance, however, the functional role of the N-terminal arm remains an open question.

9.
A query learning algorithm based on hidden Markov models (HMMs) is developed to design experiments for string analysis and prediction of MHC class I binding peptides. Query learning is introduced to reduce the number of peptide binding data needed for training the HMMs. Multiple HMMs, which collectively serve as a committee, are trained with binding data and used for prediction in real-number values. The universe of peptides is randomly sampled and subjected to judgement by the HMMs. Peptides whose prediction is least consistent among committee HMMs are tested by experiment. By iterating the feedback cycle of computational analysis and experiment, the most wanted information is effectively extracted. After 7 rounds of active learning with 181 peptides in all, predictive performance of the algorithm surpassed the so far best performing matrix-based prediction. Moreover, by combining both methods, binder peptides (log Kd < -6) could be predicted with 84% accuracy. Parameter distribution of the HMMs, which can be inspected visually after training, further offers a glimpse of the dynamic specificity of the MHC molecules.
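
The selection step in this abstract (test the peptides on which the committee is least consistent) can be sketched as follows. This is only an illustration under stated assumptions: the committee members here are trivial stand-in scoring functions rather than trained HMMs, disagreement is measured as the variance of the real-valued predictions (the abstract does not state which consistency measure was used), and all names and peptides are hypothetical.

```python
from statistics import pvariance


def select_queries(candidates, committee, n_queries):
    """Return the n_queries candidates whose committee predictions disagree most."""

    def disagreement(peptide):
        # Variance of the real-valued predictions across committee members.
        return pvariance([model(peptide) for model in committee])

    return sorted(candidates, key=disagreement, reverse=True)[:n_queries]


# Stand-in committee: three hypothetical scoring functions, NOT trained HMMs.
committee = [
    lambda p: p.count("L") * 1.0,
    lambda p: p.count("L") * 0.9 + p.count("K") * 2.0,
    lambda p: p.count("L") * 1.1,
]

# Hypothetical 9-mer candidates sampled from the peptide universe.
candidates = ["LLLLLLLLL", "KKKKAAAAA", "AAAAAAAAA", "LLKAAAAAA"]

# Peptides chosen for the next round of binding experiments.
next_round = select_queries(candidates, committee, n_queries=2)
print(next_round)
```

In a full active-learning loop, the measured binding values for `next_round` would be added to the training data, the committee retrained, and the selection repeated.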

10.
Who's your neighbor? New computational approaches for functional genomics   Total citations: 19, self-citations: 0, citations by others: 19
Several recently developed computational approaches in comparative genomics go beyond sequence comparison. By analyzing phylogenetic profiles of protein families, domain fusions, gene adjacency in genomes, and expression patterns, these methods predict many functional interactions between proteins and help deduce specific functions for numerous proteins. Although some of the resultant predictions may not be highly specific, these developments herald a new era in genomics in which the benefits of comparative analysis of the rapidly growing collection of complete genomes will become increasingly obvious.
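
Of the approaches listed in this abstract, phylogenetic profiling is the simplest to sketch: each protein family is represented by a presence/absence vector across genomes, and families with matching profiles are proposed as functionally linked. The code below is a minimal illustration, assuming a plain agreement fraction as the similarity measure; the profiles, family names, and threshold are hypothetical.

```python
def profile_similarity(a, b):
    """Fraction of genomes in which two presence/absence profiles agree."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same genomes")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def linked_pairs(profiles, threshold=1.0):
    """Return family pairs whose profile similarity reaches the threshold."""
    names = sorted(profiles)
    return [
        (p, q)
        for i, p in enumerate(names)
        for q in names[i + 1:]
        if profile_similarity(profiles[p], profiles[q]) >= threshold
    ]


# Hypothetical profiles over five genomes (1 = family present, 0 = absent).
profiles = {
    "famA": (1, 0, 1, 1, 0),
    "famB": (1, 0, 1, 1, 0),
    "famC": (0, 1, 0, 1, 1),
}
print(linked_pairs(profiles))
```

Families present in every genome (or in none) match each other trivially, so a practical analysis would filter out such uninformative profiles first.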

11.
We have compared a novel sequence-structure matching technique, FORESST, for detecting remote homologs to three existing sequence-based methods: local amino acid sequence similarity by BLASTP, hidden Markov models (HMMs) of sequences of protein families using SAM, and HMMs based on sequence motifs identified using meta-MEME. FORESST compares predicted secondary structures to a library of structural families of proteins, using HMMs. Altogether 45 proteins from nine structural families in the database CATH were used in a cross-validated test of the fold assignment accuracy of each method. Local sequence similarity of a query sequence to a protein family is measured by the highest high-scoring segment pair (HSP) score. Each of the HMM-based approaches (FORESST, MEME, amino acid sequence-based HMM) yielded a log-odds score for the query sequence. In order to make a fair comparison among these methods, the scores for each method were converted to Z-scores in a uniform way by comparing the raw scores of a query protein with the corresponding scores for a set of unrelated proteins. Z-scores were analyzed as a function of the maximum pairwise sequence identity (MPSID) of the query sequence to sequences used in training the model. For MPSID above 20%, the Z-scores increase linearly with MPSID for the sequence-based methods but remain roughly constant for FORESST. Below 15%, average Z-scores are close to zero for the sequence-based methods, whereas the FORESST method yielded average Z-scores of 1.8 and 1.1, using observed and predicted secondary structures, respectively. This demonstrates the advantage of the sequence-structure method for detecting remote homologs.
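
The score normalization described in this abstract can be sketched in a few lines. The abstract does not give the exact formula, so the version below assumes the conventional definition: subtract the mean of the scores obtained for unrelated proteins and divide by their sample standard deviation. The function name and the numbers are hypothetical.

```python
from statistics import mean, stdev


def z_score(raw_score, background_scores):
    """Express raw_score in standard deviations above the background mean."""
    mu = mean(background_scores)
    sigma = stdev(background_scores)  # sample standard deviation
    if sigma == 0:
        raise ValueError("background scores have zero spread")
    return (raw_score - mu) / sigma


# Hypothetical scores of one model against a set of unrelated proteins.
background = [1.0, 2.0, 3.0, 4.0, 5.0]

# Hypothetical raw score of the query protein against the same model.
print(z_score(6.0, background))
```

Applying the same conversion to every method puts HSP scores and log-odds scores on a common scale, which is what makes the cross-method comparison meaningful.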

12.
Mitochondrial genomes (mtDNAs) in angiosperms contain numerous group II-type introns that reside mainly within protein-coding genes that are required for organellar genome expression and respiration. While splicing of group II introns in non-plant systems is facilitated by proteins encoded within the introns themselves (maturases), the mitochondrial introns in plants have diverged and have lost the vast majority of their intron-encoded ORFs. Only a single maturase gene (matR) is retained in plant mtDNAs, but its role(s) in the splicing of mitochondrial introns is currently unknown. In addition to matR, plants also harbor four nuclear maturase genes (nMat 1 to 4) encoding mitochondrial proteins that are expected to act in the splicing of group II introns. Recently, we established the role of one of these proteins, nMAT2, in the splicing of several mitochondrial introns in Arabidopsis. Here, we show that nMAT1 is required for trans-splicing of nad1 intron 1 and also functions in cis-splicing of nad2 intron 1 and nad4 intron 2. Homozygous nMat1 plants show retarded growth and developmental phenotypes, modified respiration activities and altered stress responses that are tightly correlated with mitochondrial complex I defects.

13.
The PANTHER database was designed for high-throughput analysis of protein sequences. One of the key features is a simplified ontology of protein function, which allows browsing of the database by biological functions. Biologist curators have associated the ontology terms with groups of protein sequences rather than individual sequences. Statistical models (Hidden Markov Models, or HMMs) are built from each of these groups. The advantage of this approach is that new sequences can be automatically classified as they become available. To ensure accurate functional classification, HMMs are constructed not only for families, but also for functionally distinct subfamilies. Multiple sequence alignments and phylogenetic trees, including curator-assigned information, are available for each family. The current version of the PANTHER database includes training sequences from all organisms in the GenBank non-redundant protein database, and the HMMs have been used to classify gene products across the entire genomes of human and Drosophila melanogaster. The ontology terms and protein families and subfamilies, as well as Drosophila gene classifications, can be browsed and searched for free. Due to outstanding contractual obligations, access to human gene classifications and to protein family trees and multiple sequence alignments will temporarily require a nominal registration fee. PANTHER is publicly available on the web at http://panther.celera.com.

14.
In higher plants, the superfamily of carboxyl-CoA ligases and related proteins, collectively called acyl activating enzymes (AAEs), has evolved to provide enzymes for many pathways of primary and secondary metabolism and for the conjugation of hormones to amino acids. Across the superfamily there is only limited sequence similarity, but a series of highly conserved motifs, including the AMP-binding domain, make it easy to identify members. These conserved motifs are best understood in terms of the unique domain-rotation architecture that allows AAE enzymes to catalyze the two distinct steps of the CoA ligase reaction. Arabidopsis AAE sequences were used to identify the AAE gene families in the sequenced genomes of green algae, mosses, and trees; the size of the respective families increased with increasing degree of organismal cellular complexity, size, and generation time. Large-scale genome duplications and small-scale tandem gene duplications have contributed to AAE gene family complexity to differing extents in each of the multicellular species analyzed. Gene duplication and evolution of novel functions in Arabidopsis appear to have occurred rapidly, because acquisition of new substrate specificity is relatively easy in this class of proteins. Convergent evolution has also occurred between members of distantly related clades. These features of the AAE superfamily make it difficult to use homology searches and other genomics tools to predict enzyme function.

15.
Protein translations of over 100 complete genomes are now available. About half of these sequences can be provided with structural annotation, thereby enabling some profound insights into protein and pathway evolution. Whereas the major domain structure families are common to all kingdoms of life, these are combined in different ways in multidomain proteins to give various domain architectures that are specific to kingdoms or individual genomes, and contribute to the diverse phenotypes observed. These data argue for more targets in structural genomics initiatives and particularly for the selection of different domain architectures to gain better insights into protein functions.

16.
Subjecting selected peptides to in vitro analyses covering their ability to interfere with the lipid oxidation chain reaction as well as to protect proteins from direct and indirect oxidation has provided the basis for a more detailed understanding of peptide-mediated protection in biological systems. The efficiency of peptides as radical scavengers and chain-breaking antioxidants in oxidizing lipid membranes was found to be low. Previous studies on antioxidative activity of peptides tend not to include comparisons with efficiencies of more well-documented antioxidants and/or use irrelevantly high dosages of peptides. The present study demonstrates that the effect of the investigated peptides towards oxidation in biological membrane systems is mainly a protection of vital proteins from being oxidatively modified. This protection is obtained through a prevention of lipid oxidation derived carbonylation (indirect protein oxidation) and through interference with aqueous radical species (direct protein oxidation), and it is only achieved if the peptides are present in high concentrations as sacrificial antioxidants.

17.
Complex and diverse signal transduction circuits are responsible for the efficient functioning of the cellular network. Protein kinases and O-protein phosphatases are primarily responsible for propagating such stimuli within a eukaryotic cell. However, there is limited understanding of O-protein phosphatases in prokaryotic genomes. The availability of complete genome sequence information for several prokaryotes permits a genome-wide survey of O-protein phosphatases. The distribution of the various protein phosphatase families has been observed to be mosaic, with the exception of the members of the phospho protein family P (PPP), which is consistent with previous studies. The PPP family is ubiquitous in the prokaryotic world and undergoes the highest sequence divergence within a genome amongst phosphatases studied. The co-occurrence of low molecular mass tyrosine phosphatase (LMWPc) and PPP domain in a single polypeptide suggests that the protein present in Archaeoglobus fulgidus might represent the progenitor for all protein phosphatases. The curation of data on prokaryotic protein phosphatases provides a convenient framework for the analysis of domain architectures and for characterising structural and functional properties of this important family of signalling proteins.

18.
New organisms and biological systems designed to satisfy human needs are among the aims of synthetic genomics and synthetic biology. Synthetic biology seeks to model and construct biological components, functions and organisms that do not exist in nature or to redesign existing biological systems to perform new functions. Synthetic genomics, on the other hand, encompasses technologies for the generation of chemically synthesized whole genomes or larger parts of genomes, allowing a myriad of changes to the genetic material of organisms to be engineered simultaneously. Engineering complex functions or new organisms in synthetic biology is thus progressively becoming dependent on and converging with synthetic genomics. While applications from both areas have been predicted to offer great benefits by making possible new drugs, renewable chemicals or clean energy, they have also given rise to concerns about new safety, environmental and socio-economic risks, stirring an increasingly polarizing debate. Here we intend to provide an overview of recent progress in biomedical and biotechnological applications of synthetic genomics and synthetic biology as well as of arguments and evidence related to their possible benefits, risks and governance implications.

19.
Peptide-recognition modules (PRMs) are used throughout biology to mediate protein–protein interactions, and many PRMs are members of large protein domain families. Recent genome-wide measurements describe networks of peptide–PRM interactions. In these networks, very similar PRMs recognize distinct sets of peptides, raising the question of how peptide-recognition specificity is achieved using similar protein domains. The analysis of individual protein complex structures often gives answers that are not easily applicable to other members of the same PRM family. Bioinformatics-based approaches, on the other hand, may be difficult to interpret physically. Here we integrate structural information with a large, quantitative data set of SH2 domain–peptide interactions to study the physical origin of domain–peptide specificity. We develop an energy model, inspired by protein folding, based on interactions between the amino-acid positions in the domain and peptide. We use this model to successfully predict which SH2 domains and peptides interact and uncover the positions in each that are important for specificity. The energy model is general enough that it can be applied to other members of the SH2 family or to new peptides, and the cross-validation results suggest that these energy calculations will be useful for predicting binding interactions. It can also be adapted to study other PRM families, predict optimal peptides for a given SH2 domain, or study other biological interactions, e.g. protein–DNA interactions.

20.