Similar articles
Found 20 similar articles (search time: 15 ms)
1.
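Trace elements are elements that are required only in very small amounts (less than 0.01% of the human body) yet are essential to all living organisms. They take part in a wide range of complex biological processes, so different organisms depend on the corresponding trace elements to survive. Most past work has focused on experimental studies of trace element metabolic pathways and trace element-binding proteins, which have highlighted the importance of trace elements for life. Computational studies of trace elements, however, remain very limited. This review focuses on recent progress in applying the theory and methods of comparative genomics to the utilization, metabolism, function, and evolution of different trace elements. For the elements discussed, most of the proteins that use them have been largely identified, and the dependence of these proteins on specific elements is highly conserved. Comparative genomic analysis helps to further clarify many basic questions in the trace element field (such as metabolism, function, and evolutionary dynamics in archaea, bacteria, and eukaryotes) and their important features.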

2.
Membrane proteins serve as cellular gatekeepers, regulators, and sensors. Prior studies have explored the functional breadth and evolution of proteins and families of particular interest, such as the diversity of transport-associated membrane protein families in prokaryotes and eukaryotes, the composition of integral membrane proteins, and family classification of all human G-protein coupled receptors. However, a comprehensive analysis of the content and evolutionary associations between membrane proteins and families in a diverse set of genomes is lacking. Here, a membrane protein annotation pipeline was developed to define the integral membrane genome and associations between 21,379 proteins from 34 genomes; most, but not all, of these proteins belong to 598 defined families. The pipeline was used to provide target input for a structural genomics project that successfully cloned, expressed, and purified 61 of our first 96 selected targets in yeast. Furthermore, the methodology was applied (1) to explore the evolutionary history of the substrate-binding transmembrane domains of the human ABC transporter superfamily, (2) to identify the multidrug resistance-associated membrane proteins in whole genomes, and (3) to identify putative new membrane protein families.

3.
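An evolutionary-imprint method for genome function prediction   Cited by: 7 (self-citations: 1, citations by others: 6)
Improving schemes for genome function prediction is a pressing problem in functional genomics. The course of evolution leaves corresponding evolutionary imprints on molecular sequences, namely motifs specific to clusters of orthologs. Building on this biological fact, a new method for genome function prediction is proposed: orthologous clusters are first constructed by evolutionary analysis, and the functional motifs of each cluster are then identified, yielding a library of specific functional motifs. The function of an unknown gene can then be predicted efficiently and accurately by searching this library. Tests on five families provide preliminary evidence that the scheme is feasible.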
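The final step of the method in entry 3, searching a library of cluster-specific motifs with an unannotated sequence, can be illustrated with a minimal Python sketch. It assumes that each motif is expressed as a regular expression; the cluster names and patterns below are invented for illustration and are not taken from the paper, which does not state how its motifs are represented.

```python
import re

# Hypothetical motif library: orthologous-cluster name -> motif pattern.
# Both the names and the patterns are made up for this example.
MOTIF_LIBRARY = {
    "cluster_A": re.compile(r"C..C.{2,4}H"),
    "cluster_B": re.compile(r"G.GK[ST]"),
}


def predict_function(sequence, library=MOTIF_LIBRARY):
    """Return the names of all clusters whose motif occurs in the sequence."""
    return sorted(
        name for name, pattern in library.items() if pattern.search(sequence)
    )


if __name__ == "__main__":
    for seq in ("MAGSGKTLL", "MCAACLLLH", "MMMM"):
        print(seq, predict_function(seq))
```

A sequence that matches no motif returns an empty list, which corresponds to "no prediction" rather than a negative result.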

4.
Many proteins consist of several structural domains. These multi-domain proteins have likely been generated by selective genome growth dynamics during evolution to perform new functions as well as to create structures that fold on a biologically feasible time scale. Domain units frequently evolved through a variety of genetic shuffling mechanisms. Here we examine the protein domain statistics of more than 1000 organisms including eukaryotic, archaeal and bacterial species. The analysis extends earlier findings on asymmetric statistical laws for the proteome to a wider variety of species. While proteins are composed of a wide range of domains, displaying a power-law decay, the computation of domain families for each protein reveals an exponential distribution, characterizing a protein universe composed of a thin number of unique families. Structural studies in proteomics have shown that domain repeats, or internal duplicated domains, represent a small but significant fraction of the genome. In spite of its importance, this observation has been largely overlooked until recently. We model the evolutionary dynamics of the proteome and demonstrate that these distinct distributions are in fact rooted in an internal duplication mechanism. This process generates the contemporary protein structural domain universe, determines its reduced thickness, and tames its growth. These findings have important implications, ranging from protein interaction network modeling to evolutionary studies based on fundamental mechanisms governing genome expansion.

5.
An efficient algorithm for large-scale detection of protein families   Cited by: 6 (self-citations: 0, citations by others: 6)
Detection of protein families in large databases is one of the principal research objectives in structural and functional genomics. Protein family classification can significantly contribute to the delineation of functional diversity of homologous proteins, the prediction of function based on domain architecture or the presence of sequence motifs as well as comparative genomics, providing valuable evolutionary insights. We present a novel approach called TRIBE-MCL for rapid and accurate clustering of protein sequences into families. The method relies on the Markov cluster (MCL) algorithm for the assignment of proteins into families based on precomputed sequence similarity information. This novel approach does not suffer from the problems that normally hinder other protein sequence clustering algorithms, such as the presence of multi-domain proteins, promiscuous domains and fragmented proteins. The method has been rigorously tested and validated on a number of very large databases, including SwissProt, InterPro, SCOP and the draft human genome. Our results indicate that the method is ideally suited to the rapid and accurate detection of protein families on a large scale. The method has been used to detect and categorise protein families within the draft human genome and the resulting families have been used to annotate a large proportion of human proteins.
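The Markov cluster idea named in entry 5 can be sketched in a few lines of pure Python. This is a toy illustration of the general MCL scheme (alternating expansion and inflation of a column-stochastic matrix), not the TRIBE-MCL implementation; the similarity matrix, the inflation value of 2.0, the added self-loops, and the function names are all assumptions made for the example.

```python
def normalize_columns(m):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain square-matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(similarity, inflation=2.0, max_iter=50, tol=1e-9):
    """Cluster items from a symmetric, non-negative similarity matrix."""
    n = len(similarity)
    # Add self-loops (an assumed, commonly used convention), then normalise.
    m = normalize_columns(
        [[similarity[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    )
    for _ in range(max_iter):
        expanded = matmul(m, m)                      # expansion: flow spreads
        inflated = normalize_columns(                # inflation: strong flow wins
            [[v ** inflation for v in row] for row in expanded]
        )
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each non-empty row lists the items drawn to that attractor.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


if __name__ == "__main__":
    # Hypothetical scores: proteins 0-2 resemble each other, as do 3 and 4.
    sim = [
        [0, 1, 1, 0, 0],
        [1, 0, 1, 0, 0],
        [1, 1, 0, 0, 0],
        [0, 0, 0, 0, 1],
        [0, 0, 0, 1, 0],
    ]
    print(mcl(sim))
```

Raising the inflation parameter generally produces finer-grained clusters, which is the main tuning choice in this kind of scheme.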

6.
The proteomes that make up the collection of proteins in contemporary organisms evolved through recombination and duplication of a limited set of domains. These protein domains are essentially the main components of globular proteins and are the principal level at which protein function and protein interactions can be understood. An important aspect of domain evolution is their atomic structure and biochemical function, which are both specified by the information in the amino acid sequence. Changes in this information may bring about new folds, functions and protein architectures. With the present and still increasing wealth of sequences and annotation data brought about by genomics, new evolutionary relationships are constantly being revealed, unknown structures modeled and phylogenies inferred. Such investigations not only help predict the function of newly discovered proteins, but also assist in mapping unforeseen pathways of evolution and reveal crucial, co-evolving inter- and intra-molecular interactions. In turn this will help us describe how protein domains shaped cellular interaction networks and the dynamics with which they are regulated in the cell. Additionally, these studies can be used for the design of new and optimized protein domains for therapy. In this review, we aim to describe the basic concepts of protein domain evolution and illustrate recent developments in molecular evolution that have provided valuable new insights in the field of comparative genomics and protein interaction networks.

7.
Trace elements are used by all organisms and provide proteins with unique coordination and catalytic and electron transfer properties. Although many trace element-containing proteins are well characterized, little is known about the general trends in trace element utilization. We carried out comparative genomic analyses of copper, molybdenum, nickel, cobalt (in the form of vitamin B12), and selenium (in the form of selenocysteine) in 747 sequenced organisms at the following levels: (i) transporters and transport-related proteins, (ii) cofactor biosynthesis traits, and (iii) trace element-dependent proteins. Few organisms were found to utilize all five trace elements, whereas many symbionts, parasites, and yeasts used only one or none of these elements. Investigation of metalloproteomes and selenoproteomes revealed examples of increased utilization of proteins that use copper in land plants, cobalt in Dehalococcoides and Dictyostelium, and selenium in fish and algae, whereas nematodes were found to have great diversity of copper transporters. These analyses also characterized trace element metabolism in common model organisms and suggested new model organisms for experimental studies of individual trace elements. Mismatches in the occurrence of user proteins and corresponding transport systems revealed deficiencies in our understanding of trace element biology. Biological interactions among some trace elements were observed; however, such links were limited, and trace elements generally had unique utilization patterns. Finally, environmental factors, such as oxygen requirement and habitat, correlated with the utilization of certain trace elements. These data provide insights into the general features of utilization and evolution of trace elements in the three domains of life.

8.
Recent recognition that ecological and evolutionary processes can operate on similar timescales has led to a rapid increase in theoretical and empirical studies on eco‐evolutionary dynamics. Progress in the fields of evolutionary biology, genomics and ecology is greatly enhancing our understanding of rapid adaptive processes, the predictability of adaptation and the genetics of ecologically important traits. However, progress in these fields has proceeded largely independently of one another. In an attempt to better integrate these fields, the centre for ‘Adaptation to a Changing Environment’ organized a conference entitled ‘The genomic basis of eco‐evolutionary change’ and brought together experts in ecological genomics and eco‐evolutionary dynamics. In this review, we use the work of the invited speakers to summarize eco‐evolutionary dynamics and discuss how they are relevant for understanding and predicting responses to contemporary environmental change. Then, we show how recent advances in genomics are contributing to our understanding of eco‐evolutionary dynamics. Finally, we highlight the gaps in our understanding of eco‐evolutionary dynamics and recommend future avenues of research in eco‐evolutionary dynamics.

9.
Structures for protein domains have increased rapidly in recent years owing to advances in structural biology and structural genomics projects. New structures are often similar to those solved previously, and such similarities can give insights into function by linking poorly understood families to those that are better characterized. They also allow the possibility of combining information to find still more proteins adopting a similar structure and sometimes a similar function, and to reprioritize families in structural genomics pipelines. We explore this possibility here by preparing merged profiles for pairs of structurally similar, but not necessarily sequence-similar, domains within the SMART and Pfam databases by way of the Structural Classification of Proteins (SCOP). We show that such profiles are often able to successfully identify further members of the same superfamily and thus can be used to increase the sensitivity of database searching methods like HMMer and PSI-BLAST. We perform detailed benchmarks using the SMART and Pfam databases with four complete genomes frequently used as annotation benchmarks. We quantify the associated increase in structural information in Swissprot and discuss examples illustrating the applicability of this approach to understand functional and evolutionary relationships between protein families.

10.
Transposon Tn7 is notable for the control it exercises over where transposition events are directed. One Tn7 integration pathway recognizes a highly conserved attachment (att) site in the chromosome, while a second pathway specifically recognizes mobile plasmids that facilitate transfer of the element to new hosts. In this review, I discuss newly discovered families of Tn7‐like elements with different targeting pathways. Perhaps the most exciting examples are multiple instances where Tn7‐like elements have repurposed CRISPR/Cas systems. In these cases, the CRISPR/Cas systems have lost their canonical defensive function to destroy incoming mobile elements; instead, the systems have been naturally adapted to use guide RNAs to specifically direct transposition into these mobile elements. The new families of Tn7‐like elements also include a variety of novel att sites in bacterial chromosomes where genome islands can form. Interesting families have also been revealed where proteins described in the prototypic Tn7 element are fused or otherwise repurposed for the new dual activities. This expanded understanding of Tn7‐like elements broadens our view of how genetic systems are repurposed and provides potentially exciting new tools for genome modification and genomics. Future opportunities and challenges to understanding the impact of the new families of Tn7‐like elements are discussed.

11.
Internal protein dynamics is essential for biological function. During evolution, protein divergence is functionally constrained: properties more relevant for function vary more slowly than less important properties. Thus, if protein dynamics is relevant for function, it should be evolutionarily conserved. In contrast with the well-studied evolution of protein structure, the evolutionary divergence of protein dynamics has not been addressed systematically before, apart from a few case studies. X-ray diffraction analysis gives information not only on protein structure but also on B-factors, which characterize the flexibility that results from protein dynamics. Here we study the evolutionary divergence of protein backbone dynamics by comparing the Cα flexibility (B-factor) profiles for a large dataset of homologous proteins classified into families and superfamilies. We show that Cα flexibility profiles diverge slowly, so that they are conserved at family and superfamily levels, even for pairs of proteins with nonsignificant sequence similarity. We also analyze and discuss the correlations among the divergences of flexibility, sequence, and structure. Electronic supplementary material is available for this article and is accessible to authorised users. [Reviewing Editor: Dr. David Pollock]
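One simple way to compare two Cα B-factor profiles, as entry 11 describes, is to standardise each profile and take their Pearson correlation. The sketch below assumes this measure and assumes the two profiles have already been reduced to structurally aligned positions; the abstract does not state which similarity measure the authors used, and the function names and numbers are illustrative only.

```python
from statistics import mean, pstdev


def zscores(bfactors):
    """Standardise a B-factor profile to zero mean and unit variance."""
    mu, sd = mean(bfactors), pstdev(bfactors)
    if sd == 0:
        raise ValueError("a flat profile cannot be standardised")
    return [(b - mu) / sd for b in bfactors]


def profile_similarity(bfactors_a, bfactors_b):
    """Pearson correlation between two aligned B-factor profiles."""
    if len(bfactors_a) != len(bfactors_b):
        raise ValueError("profiles must cover the same aligned positions")
    za, zb = zscores(bfactors_a), zscores(bfactors_b)
    # The mean product of z-scores equals the Pearson correlation coefficient.
    return sum(x * y for x, y in zip(za, zb)) / len(za)


if __name__ == "__main__":
    # Made-up B-factors for five aligned residues in two homologous proteins.
    protein_a = [10.0, 12.0, 30.0, 25.0, 11.0]
    protein_b = [22.0, 24.0, 61.0, 55.0, 20.0]
    print(round(profile_similarity(protein_a, protein_b), 3))
```

Standardising first removes differences in overall B-factor scale between crystal structures, so the score reflects only the shape of the flexibility profile.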

12.

Background

Nickel (Ni) and cobalt (Co) are trace elements required for a variety of biological processes. Ni is directly coordinated by proteins, whereas Co is mainly used as a component of vitamin B12. Although a number of Ni and Co-dependent enzymes have been characterized, systematic evolutionary analyses of utilization of these metals are limited.

Results

We carried out comparative genomic analyses to examine the occurrence and evolutionary dynamics of the use of Ni and Co at the level of (i) transport systems, and (ii) metalloproteomes. Our data show that both metals are widely used in bacteria and archaea. Cbi/NikMNQO is the most common prokaryotic Ni/Co transporter, while Ni-dependent urease and Ni-Fe hydrogenase, and B12-dependent methionine synthase (MetH), ribonucleotide reductase and methylmalonyl-CoA mutase are the most widespread metalloproteins for Ni and Co, respectively. Occurrence of other metalloenzymes showed a mosaic distribution and a new B12-dependent protein family was predicted. Deltaproteobacteria and Methanosarcina generally have larger Ni- and Co-dependent proteomes. On the other hand, utilization of these two metals is limited in eukaryotes, and very few of these organisms utilize both of them. The Ni-utilizing eukaryotes are mostly fungi (except Saccharomycotina) and plants, whereas most B12-utilizing organisms are animals. The NiCoT transporter family is the most widespread eukaryotic Ni transporter, and eukaryotic urease and MetH are the most common Ni- and B12-dependent enzymes, respectively. Finally, investigation of environmental and other conditions and identity of organisms that show dependence on Ni or Co revealed that host-associated organisms (particularly obligate intracellular parasites and endosymbionts) have a tendency for loss of Ni/Co utilization.

Conclusion

Our data provide information on the evolutionary dynamics of Ni and Co utilization and highlight widespread use of these metals in the three domains of life, yet only a limited number of user proteins.

13.
Comparative genomics has proven a fruitful approach to acquire many functional and evolutionary insights into core cellular processes. Here it is argued that in order to perform accurate and interesting comparative genomics, one first and foremost has to be able to recognize, postulate, and revise different evolutionary scenarios. After all, these studies lack a simple protocol, due to different proteins having different evolutionary dynamics and demanding different approaches. The authors here discuss this challenge from a practical (what are the observations?) and conceptual (how do these indicate a specific evolutionary scenario?) viewpoint, with the aim to guide investigators who want to analyze the evolution of their protein(s) of interest. By sharing how the authors draft, test, and update such a scenario and how it directs their investigations, the authors hope to illuminate how to execute molecular evolution studies and how to interpret them. Also see the video abstract here https://youtu.be/VCt3l2pbdbQ .

14.
J. F. Y. Brookfield. Genetics, 1991, 128(2): 471-486
By analytical theory and computer simulation the expected evolutionary dynamics of P transposable element spread in an infinite population are investigated. The analysis is based on the assumption that, unlike transposable elements which move via RNA intermediates, the harmful effects of P elements arise primarily in the act of transposition, and that this causes their evolutionary dynamics to be unusual. It is suggested that a situation of transposition-selection balance will be superseded by the buildup of a cytoplasmically inherited repression or by the elimination of active transposase-encoding elements from the chromosomes, a process which may be accompanied by the evolution of elements which encode proteins which repress transposition.

15.
MOTIVATION: Protein families can be defined based on structure or sequence similarity. We wanted to compare two protein family databases, one based on structural and one on sequence similarity, to investigate to what extent they overlap, the similarity in definition of corresponding families, and to create a list of large protein families with unknown structure as a resource for structural genomics. We also wanted to increase the sensitivity of fold assignment by exploiting protein family HMMs. RESULTS: We compared Pfam, a protein family database based on sequence similarity, to Scop, which is based on structural similarity. We found that 70% of the Scop families exist in Pfam while 57% of the Pfam families exist in Scop. Most families that occur in both databases correspond well to each other, but in some cases they are different. Such cases highlight situations in which structure and sequence approaches differ significantly. The comparison enabled us to compile a list of the largest families that do not occur in Scop; these are suitable targets for structure prediction and determination, and may be useful to guide projects in structural genomics. It can be noted that 13 out of the 20 largest protein families without a known structure are likely transmembrane proteins. We also exploited Pfam to increase the sensitivity of detecting homologs of proteins with known structure, by comparing query sequences to Pfam HMMs that correspond to Scop families. For SWISSPROT+TREMBL, this yielded an increase in fold assignment from 31% to 42% compared to using FASTA only. This method assigned a structure to 22% of the proteins in Saccharomyces cerevisiae, 24% in Escherichia coli, and 16% in Methanococcus jannaschii.

16.
Selenium is an important trace element that occurs in proteins in the form of selenocysteine (Sec) and in tRNAs in the form of selenouridine. Recent large-scale metagenomics projects provide an opportunity for understanding global trends in trace element utilization. Herein, we characterized the selenoproteome of the microbial marine community derived from the Global Ocean Sampling (GOS) expedition. More than 3,600 selenoprotein gene sequences belonging to 58 protein families were detected, including sequences representing 7 newly identified selenoprotein families, such as homologs of ferredoxin–thioredoxin reductase and serine protease. In addition, a new eukaryotic selenoprotein family, thiol reductase GILT, was identified. Most GOS selenoprotein families originated from Cys-containing thiol oxidoreductases. In both Pacific and Atlantic microbial communities, SelW-like and SelD were the most widespread selenoproteins. Geographic location had little influence on Sec utilization as measured by selenoprotein variety and the number of selenoprotein genes detected; however, both higher temperature and marine (as opposed to freshwater and other aquatic) environment were associated with increased use of this amino acid. Selenoproteins were also detected with preference for either environment. We identified novel fusion forms of several selenoproteins that highlight redox activities of these proteins. Almost half of Cys-containing SelDs were fused with NADH dehydrogenase, whereas such SelD forms were rare in terrestrial organisms. The selenouridine utilization trait was also analyzed and showed an independent evolutionary relationship with Sec utilization. Overall, our study provides insights into global trends in microbial selenium utilization in marine environments.

17.
Wetlands play an important role in determining the water quality of streams and are generally considered to act as a sink for many reactive species. However, retention of chemical constituents varies seasonally and is affected by hydrologic and biogeochemical processes including water source, mineral weathering, DOC and SPM cycling, redox status, precipitation/dissolution/adsorption, and seasonal events. Relatively little is known about the influence of these factors on trace element cycling in wetland-influenced streams. To explore the role of wetlands with respect to the retention/release of trace elements to streams, we examined temporal and spatial patterns of concentrations of a large suite of trace elements (via ICP-MS) and geochemical drivers in five streams and wetland rivulets draining natural wetlands in a northern Wisconsin watershed as well as in their groundwater sources (terrestrial recharge, lake recharge, and older lake recharge). We performed principal components analyses of the concentrations of elements and their geochemical drivers in both the streams and rivulets to assist in the identification of factors regulating trace element concentrations. Variation in trace and major element concentrations among the streams was strongly related to the proportion of terrestrial recharge contributing to the stream. A dominant influence of water source on rivulet chemistry was supported by association of groundwater-sourced elements (Ba, Ca, Cs, Mg, Na, Si, Sr) with the primary statistical factor. DOC appeared in the first principal component factor for the streams and in the second factor for the rivulets. Strong correlations of Al, Cd, Ce, Cu, La, Pb, Ti, and Zn with DOC supported the important influence of DOC on trace metal cycling. A number of elements in the rivulets (Al, La, Pb, Ti) and streams (Al, Ce, Cr, Cu, La, Pb, Ti, Zn) had a significant particulate cycle. 
Redox cycling and precipitation/dissolution reactions involving Fe and Mn likely impacted Cu and Mo as evidenced by the low levels in the rivulets. Variance in Fe, Mn and the metal oxy-anions was associated with factors related to redox cycling and adsorption reactions in the wetland sediments. In streams, DOC and metals with a high affinity for DOC were associated with a factor which also included negative loadings for groundwater-sourced elements, reflecting the importance of seasonal hydrologic events which flush DOC and metals from wetland sediments and dilute groundwater sourced metals. Redox processes were of secondary importance in the streams but of primary significance in the rivulets, documenting the importance of anoxic conditions in wetland sediments on groundwater en route to the stream.

18.
Functional genomics has revolutionised the way that scientists approach biological questions, allowing for the comprehensive characterisation of the function of related proteins encoded in a genome. The sequencing of the genome of the model system Arabidopsis thaliana has enabled the beginning of functional genomics and the study of protein kinase families in plants. The large family of genes encoding protein kinases is a primary target of functional genomics studies in plants due to their importance in diverse physiological processes. This paper describes the functional genomics tools used to study the families of protein kinases in Arabidopsis, as well as progress in uncovering the functions of these proteins.

19.
Protein structure is generally more conserved than sequence, but for regions that can adopt different structures in different environments, does this hold true? How structurally disordered regions evolve altered secondary structure element propensities, as well as conformational flexibility, among paralogs is a fundamental question for our understanding of protein structural evolution. We have investigated the evolutionary dynamics of structural disorder in protein families containing both orthologs and paralogs using phylogenetic tree reconstruction, protein structure disorder prediction, and secondary structure prediction in order to shed light upon these questions. Our results indicate that the extent and location of structurally disordered regions are not universally conserved. As structurally disordered regions often have high conformational flexibility, this is likely to have an effect on how protein structure evolves as spatially altered conformational flexibility can also change the secondary structure propensities for homologous regions in a protein family.

20.
New directions in biology are being driven by the complete sequencing of genomes, which has given us the protein repertoires of diverse organisms from all kingdoms of life. In tandem with this accumulation of sequence data, worldwide structural genomics initiatives, advanced by the development of improved technologies in X-ray crystallography and NMR, are expanding our knowledge of structural families and increasing our fold libraries. Methods for detecting remote sequence similarities have also been made more sensitive and this means that we can map domains from these structural families onto genome sequences to understand how these families are distributed throughout the genomes and reveal how they might influence the functional repertoires and biological complexities of the organisms. We have used robust protocols to assign sequences from completed genomes to domain structures in the CATH database, allowing up to 60% of domain sequences in these genomes, depending on the organism, to be assigned to a domain family of known structure. Analysis of the distribution of these families throughout bacterial genomes identified more than 300 universal families, some of which had expanded significantly in proportion to genome size. These highly expanded families are primarily involved in metabolism and regulation and appear to make major contributions to the functional repertoire and complexity of bacterial organisms. When comparisons are made across all kingdoms of life, we find a smaller set of universal domain families (approx. 140), of which families involved in protein biosynthesis are the largest conserved component. Analysis of the behaviour of other families reveals that some (e.g. those involved in metabolism, regulation) have remained highly innovative during evolution, making it harder to trace their evolutionary ancestry. 
Structural analyses of metabolic families provide some insights into the mechanisms of functional innovation, which include changes in domain partnerships and significant structural embellishments leading to modulation of active sites and protein interactions.
