Similar articles
 20 similar articles found (search time: 15 ms)
1.
The proteomes that make up the collection of proteins in contemporary organisms evolved through recombination and duplication of a limited set of domains. These protein domains are essentially the main components of globular proteins and are the most fundamental level at which protein function and protein interactions can be understood. An important aspect of domain evolution is their atomic structure and biochemical function, which are both specified by the information in the amino acid sequence. Changes in this information may bring about new folds, functions and protein architectures. With the present and still increasing wealth of sequences and annotation data brought about by genomics, new evolutionary relationships are constantly being revealed, unknown structures modeled and phylogenies inferred. Such investigations not only help predict the function of newly discovered proteins, but also assist in mapping unforeseen pathways of evolution and reveal crucial, co-evolving inter- and intra-molecular interactions. In turn this will help us describe how protein domains shaped cellular interaction networks and the dynamics with which they are regulated in the cell. Additionally, these studies can be used for the design of new and optimized protein domains for therapy. In this review, we aim to describe the basic concepts of protein domain evolution and illustrate recent developments in molecular evolution that have provided valuable new insights in the field of comparative genomics and protein interaction networks.

2.
The enormous variation in mammalian lifespan is the result of each species' adaptations to its own biological trade-offs and ecological conditions. Comparative genomic studies have demonstrated that the genomic factors underlying both species lifespan and individual longevity are in part shared across the tree of life. Here, we compared protein-coding regions across the mammalian phylogeny to detect individual amino acid (AA) changes shared by the most long-lived mammals and genes whose rates of protein evolution correlate with longevity. We discovered a total of 2,737 AA in 2,004 genes that distinguish long- and short-lived mammals, significantly more than expected by chance (P = 0.003). These genes belong to pathways involved in regulating lifespan, such as inflammatory response and hemostasis. Among them, a total of 1,157 AA showed a significant association with maximum lifespan in a phylogenetic test. Interestingly, most of the detected AA positions do not vary in extant human populations (81.2%) or have allele frequencies below 1% (99.78%). Consequently, almost none of these putatively important variants could have been detected by genome-wide association studies, suggesting that comparative genomics can be used to complement and enhance interpretation of human genome-wide association studies. Additionally, we identified four more genes whose rate of protein evolution correlated with longevity in mammals. Finally, we show that the human longevity-associated proteins are significantly more stable than the orthologous proteins from short-lived mammals, strongly suggesting that general protein stability is linked to increased lifespan.

3.
4.
Overview of structural genomics: from structure to function   Total citations: 7, self-citations: 0, citations by others: 7
The unprecedented increase in the number of new protein sequences arising from genomics and proteomics directly highlights the need for methods to rapidly and reliably determine the molecular and cellular functions of these proteins. One such approach, structural genomics, aims to delineate the total repertoire of protein folds, thereby providing three-dimensional portraits for all proteins in a living organism, and to infer the molecular functions of the proteins. The goal of obtaining protein structures on a genomic scale has motivated the development of high-throughput technologies for macromolecular structure determination, which have begun to produce structures at a greater rate than previously possible. These new structures have revealed many unexpected functional and evolutionary relationships that were hidden at the sequence level.

5.
The genome of monotremes, like the animals themselves, is unique and strange. The importance of monotremes to genomics depends on their position as the earliest offshoot of the mammalian lineage. Although there has been controversy in the literature over the phylogenetic position of monotremes, this traditional interpretation is now confirmed by recent sequence comparisons. Characterizing the monotreme genome will therefore be important for studying the evolution and organization of the mammalian genome, and the proposal to sequence the platypus genome has been received enthusiastically by the genomics community. Recent investigations of X-chromosome inactivation, genomic imprinting and sex chromosome evolution provide good examples of the power of the monotreme genome to inform us about mammalian genome organization and evolution.

6.
Odonata (dragonflies and damselflies) present an unparalleled insect model to integrate evolutionary genomics with ecology for the study of insect evolution. Key features of Odonata include their ancient phylogenetic position, extensive phenotypic and ecological diversity, several unique evolutionary innovations, ease of study in the wild and usefulness as bioindicators for freshwater ecosystems worldwide. In this review, we synthesize studies on the evolution, ecology and physiology of odonates, highlighting those areas where the integration of ecology with genomics would yield significant insights into evolutionary processes that would not be gained easily by working on other animal groups. We argue that the unique features of this group, combined with their complex life cycle, flight behaviour, diversity in ecological niches and sensitivity to anthropogenic change, make odonates a promising and fruitful taxon for genomics-focused research. Future areas of research that deserve increased attention are also briefly outlined.

7.
Dong Yang, Ying Jiang, Fuchu He. 《遗传学报》 (Journal of Genetics and Genomics), 2009, 36(11): 645-651
Genome sequencing opened the floodgates of "-omics" studies, among which research on the correlations between genomic and phenomic variables is an important part. With the development of functional genomics and systems biology, genome-wide investigation of the correlations between many genomic and phenomic variables became possible. This review focuses on five genomic variables, namely evolution rate (or "age" of the gene), the length of introns and of the ORF (protein length) in a gene, and the biases of amino acid composition and codon usage, along with the phenomic variables related to expression patterns (level and breadth). In most cases, genes with higher mRNA/protein expression levels tend to evolve slowly, have less intronic DNA, code for smaller proteins, and have higher biases of amino acid composition and codon usage. In addition, broadly expressed proteins evolve more slowly and are shorter than tissue-specific proteins. Studies in this field are helpful for a deeper understanding of the signatures of selection mediated by the features of gene expression and are of great significance for enriching evolutionary theory.

8.
Functional genomics has revolutionised the way that scientists approach biological questions, allowing for the comprehensive characterisation of the function of related proteins encoded in a genome. The sequencing of the genome of the model system Arabidopsis thaliana has enabled the beginning of functional genomics and the study of protein kinase families in plants. The large family of genes encoding protein kinases is a primary target of functional genomics studies in plants due to their importance in diverse physiological processes. This paper describes the functional genomics tools used to study the families of protein kinases in Arabidopsis, as well as progress in uncovering the functions of these proteins.

9.
The production of complex multidomain (membrane) proteins is a major hurdle in structural genomics and a generic approach for optimizing membrane protein expression is still lacking. We have devised a selection method to isolate mutant strains with improved functional expression of recombinant membrane proteins. By fusing green fluorescent protein and an erythromycin resistance marker (ErmC) to the C-terminus of a target protein, one simultaneously selects for variants with enhanced expression (increased erythromycin resistance) and correct folding (green fluorescent protein fluorescence). Three evolved hosts, displaying 2- to 8-fold increased expression of a plethora of proteins, were fully sequenced and shown to carry single-site mutations in the nisK gene. NisK is the sensor protein of a two-component regulatory system that directs nisin-A-mediated expression. The levels of recombinant membrane proteins were increased in the evolved strains, and in some cases their folding states were improved. The generality and simplicity of our approach allow rapid improvements of protein production yields by directed evolution in a high-throughput way.

10.
We describe GOTax, a comparative genomics platform that integrates protein annotation with protein family classification and taxonomy. User-defined sets of proteins, protein families, annotation terms or taxonomic groups can be selected and compared, allowing for the analysis of distribution of biological processes and molecular activities over different taxonomic groups. In particular, a measure of functional similarity is available for comparing proteins and protein families, establishing functional relationships independent of evolution.

11.
Early studies suggested that proteins with a greater contribution to the fitness of an organism evolve more slowly than less 'important' proteins. Recent articles by two research groups highlight the long-standing controversy about the genome-wide relationships between the measures of evolution rate, protein abundance and the fitness effect of gene disruption. These studies highlight the need for truly multidimensional approaches to the issues of quantitative genomics.

12.
A protein identified as "N-acylamino acid racemase" from Amycolatopsis sp. is an inefficient enzyme (kcat/Km = 3.7 × 10^2 M^-1 s^-1). Its sequence is 43% identical to that of an unidentified protein encoded by the Bacillus subtilis genome. Both proteins efficiently catalyze the o-succinylbenzoate synthase reaction in menaquinone biosynthesis (kcat/Km = 2.5 × 10^5 and 7.5 × 10^5 M^-1 s^-1, respectively), suggesting that this is their "correct" metabolic function. Their membership in the mechanistically diverse enolase superfamily provides an explanation for the catalytic promiscuity of the protein from Amycolatopsis. The adventitious promiscuity may provide an example of a protein poised for evolution of a new enzymatic function in the enolase superfamily. This study demonstrates that the correct assignment of function to new proteins in functional and structural genomics may require an understanding of the metabolism of the organism.
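As a rough illustration of the gap in catalytic efficiency reported in this abstract, the short sketch below compares the quoted kcat/Km values. The variable and function names are hypothetical, and the assignment of the two synthase values to the two proteins assumes the usual reading of "respectively" (first protein first, B. subtilis protein second).

```python
# Catalytic efficiencies (kcat/Km, in M^-1 s^-1) as quoted in the abstract.
# Names are illustrative only; they do not come from the original paper.
racemase_kcat_km = 3.7e2        # racemase activity of the first protein
osbs_first_protein = 2.5e5      # o-succinylbenzoate synthase activity, same protein
osbs_b_subtilis = 7.5e5         # o-succinylbenzoate synthase activity, B. subtilis protein


def fold_difference(larger: float, smaller: float) -> float:
    """Return how many times greater `larger` is than `smaller`."""
    return larger / smaller


# Synthase versus racemase efficiency within the first protein.
fold_within = fold_difference(osbs_first_protein, racemase_kcat_km)
# Synthase efficiency of the B. subtilis protein versus the first protein.
fold_between = fold_difference(osbs_b_subtilis, osbs_first_protein)

print(f"Synthase vs. racemase activity: about {fold_within:.0f}-fold")
print(f"B. subtilis vs. first protein (synthase): {fold_between:.1f}-fold")
```

On these figures the synthase reaction is several hundred times more efficient than the racemase reaction in the same protein, which is the quantitative basis for calling the racemase activity promiscuous.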

13.
Comparative genomics has proven a fruitful approach to acquiring many functional and evolutionary insights into core cellular processes. Here it is argued that in order to perform accurate and interesting comparative genomics, one first and foremost has to be able to recognize, postulate, and revise different evolutionary scenarios. After all, these studies lack a simple protocol, because different proteins have different evolutionary dynamics and demand different approaches. The authors here discuss this challenge from a practical (what are the observations?) and conceptual (how do these indicate a specific evolutionary scenario?) viewpoint, with the aim of guiding investigators who want to analyze the evolution of their protein(s) of interest. By sharing how the authors draft, test, and update such a scenario and how it directs their investigations, the authors hope to illuminate how to execute molecular evolution studies and how to interpret them. Also see the video abstract here https://youtu.be/VCt3l2pbdbQ .

14.
Prediction of protein function from protein sequence and structure   Total citations: 1, self-citations: 0, citations by others: 1
The sequence of a genome contains the plans of the possible life of an organism, but implementation of genetic information depends on the functions of the proteins and nucleic acids that it encodes. Many individual proteins of known sequence and structure present challenges to the understanding of their function. In particular, a number of genes responsible for diseases have been identified but their specific functions are unknown. Whole-genome sequencing projects are a major source of proteins of unknown function. Annotation of a genome involves assignment of functions to gene products, in most cases on the basis of amino-acid sequence alone. 3D structure can aid the assignment of function, motivating the challenge of structural genomics projects to make structural information available for novel uncharacterized proteins. Structure-based identification of homologues often succeeds where sequence-alone-based methods fail, because in many cases evolution retains the folding pattern long after sequence similarity becomes undetectable. Nevertheless, prediction of protein function from sequence and structure is a difficult problem, because homologous proteins often have different functions. Many methods of function prediction rely on identifying similarity in sequence and/or structure between a protein of unknown function and one or more well-understood proteins. Alternative methods include inferring conservation patterns in members of a functionally uncharacterized family for which many sequences and structures are known. However, these inferences are tenuous. Such methods provide reasonable guesses at function, but are far from foolproof. It is therefore fortunate that the development of whole-organism approaches and comparative genomics permits other approaches to function prediction when the data are available. These include the use of protein-protein interaction patterns, and correlations between occurrences of related proteins in different organisms, as indicators of functional properties. Even if it is possible to ascribe a particular function to a gene product, the protein may have multiple functions. A fundamental problem is that function is in many cases an ill-defined concept. In this article we review the state of the art in function prediction and describe some of the underlying difficulties and successes.

15.
16.
Abstract

The eukaryotic endomembrane system (ES) is served by hundreds of dedicated proteins. Experimental characterization of the ES-associated molecular machinery in several model eukaryotes, complemented by recent progress in phylogenomics and comparative genomics, has revealed a conserved complex core of the machinery that appears to have been established before the last eukaryotic common ancestor (LECA). At the same time, modern eukaryotes exhibit a huge variation in the ES resulting from a multitude of evolutionary processes operating along the ever-branching paths from the LECA to its descendants. The most important source of evolutionary novelty in ES functioning has undoubtedly been gene duplication followed by divergence of the gene copies, responsible not only for the pre-LECA establishment of many multi-paralog families of proteins in the very core of the ES-associated machinery, but also for post-LECA lineage-specific elaborations via family expansions and the origin of novel components. Extreme sequence divergence has obscured actual homologous relationships between potentially many components of the machinery, even between orthologous proteins, as illustrated by the yeast Vps51 subunit of the vesicle tethering complex GARP, hypothesized here to be a highly modified ortholog of a conserved eukaryotic family typified by the zebrafish Fat-free (Ffr) protein. A dynamic evolution of many ES-associated proteins, especially those centred around RAB and ARF GTPases, seems to take place at the level of their domain architectures. Finally, reductive evolution and recurrent gene loss are emerging as pervasive factors shaping the ES in all phylogenetic lineages.

18.

Background  

Comparative genomics of the early diverging metazoan lineages and of their unicellular sister-groups opens a new window onto the genetic changes that preceded or accompanied the evolution of multicellular body plans. A recent analysis found that the genome of the nerve-less sponges encodes the homologues of most vertebrate post-synaptic proteins. In vertebrate excitatory synapses, these proteins assemble to form the post-synaptic density, a complex molecular platform linking membrane receptors, components of their signalling pathways, and the cytoskeleton. Newly available genomes from Monosiga brevicollis (a member of Choanoflagellata, the closest unicellular relatives of animals) and Trichoplax adhaerens (a member of Placozoa: besides sponges, the only nerve-less metazoans) offer an opportunity to refine our understanding of post-synaptic protein evolution.

19.
Eukaryotes encode numerous proteins that either have no detectable homologs in prokaryotes or have only distant homologs. These molecular innovations of eukaryotes may be classified into three categories: proteins and domains inherited from prokaryotic precursors without drastic changes in biochemical function, but often recruited for novel roles in eukaryotes; new superfamilies or distinct biochemical functions emerging within pre-existing protein folds; and domains with genuinely new folds, apparently 'invented' at the outset of eukaryotic evolution. Most new folds emerging in eukaryotes are either alpha-helical or stabilized by metal chelation. Comparative genomics analyses point to an early phase of rapid evolution, and dramatic changes between the origin of the eukaryotic cell and the advent of the last common ancestor of extant eukaryotes. Extensive duplication of numerous genes, with subsequent functional diversification, is a distinctive feature of this turbulent era. Evolutionary analysis of ancient eukaryotic proteins is generally compatible with a two-symbiont scenario for eukaryotic origin, involving an alpha-proteobacterium (the ancestor of the mitochondria) and an archaeon, as well as key contributions from their selfish elements.

20.
The dramatically increasing number of new protein sequences arising from genomics and proteomics creates the need for methods to rapidly and reliably infer the molecular and cellular functions of these proteins. One such approach, structural genomics, aims to delineate the total repertoire of protein folds in nature, thereby providing three-dimensional folding patterns for all proteins, and to infer the molecular functions of the proteins based on the combined information of structures and sequences. The goal of obtaining protein structures on a genomic scale has motivated the development of high-throughput technologies and protocols for macromolecular structure determination that have begun to produce structures at a greater rate than previously possible. These new structures have revealed many unexpected functional inferences and evolutionary relationships that were hidden at the sequence level. Here, we present samples of structures determined at the Berkeley Structural Genomics Center and collaborators' laboratories to illustrate how structural information complements sequence information to yield functional inferences for proteins with unknown molecular functions.
Two of the major premises of structural genomics are to discover a complete repertoire of protein folds in nature and to find the molecular functions of proteins whose functions are not predicted from sequence comparison alone. To achieve these objectives on a genomic scale, new methods, protocols, and technologies need to be developed by multi-institutional collaborations worldwide. As part of this effort, the Protein Structure Initiative has been launched in the United States (PSI; www.nigms.nih.gov/funding/psi.html). Although infrastructure building and technology development are still the main focus of structural genomics programs [1–6], a considerable number of protein structures have already been produced, some of them coming directly out of semi-automated structure determination pipelines [6–10]. The Berkeley Structural Genomics Center (BSGC) has focused on the proteins of Mycoplasma or their homologues from other organisms as its structural genomics targets because of the minimal genome size of the Mycoplasmas as well as their relevance to human and animal pathogenicity (http://www.strgen.org). Here we present several protein examples encompassing a spectrum of functional inferences obtainable from their three-dimensional structures in five situations, where the inferences are new and testable, and are not predictable from protein sequence information alone.
