Similar literature
1.
The Molecular Evolutionary Genetics Analysis (MEGA) software is a desktop application designed for comparative analysis of homologous gene sequences either from multigene families or from different species with a special emphasis on inferring evolutionary relationships and patterns of DNA and protein evolution. In addition to the tools for statistical analysis of data, MEGA provides many convenient facilities for the assembly of sequence data sets from files or web-based repositories, and it includes tools for visual presentation of the results obtained in the form of interactive phylogenetic trees and evolutionary distance matrices. Here we discuss the motivation, design principles and priorities that have shaped the development of MEGA. We also discuss how MEGA might evolve in the future to assist researchers in their growing need to analyze large data sets using new computational methods.

2.
Comparative analysis of molecular sequence data is essential for reconstructing the evolutionary histories of species and inferring the nature and extent of selective forces shaping the evolution of genes and species. Here, we announce the release of Molecular Evolutionary Genetics Analysis version 5 (MEGA5), which is user-friendly software for mining online databases, building sequence alignments and phylogenetic trees, and using methods of evolutionary bioinformatics in basic biology, biomedicine, and evolution. The newest addition in MEGA5 is a collection of maximum likelihood (ML) analyses for inferring evolutionary trees, selecting best-fit substitution models (nucleotide or amino acid), inferring ancestral states and sequences (along with probabilities), and estimating evolutionary rates site-by-site. In computer simulation analyses, ML tree inference algorithms in MEGA5 compared favorably with other software packages in terms of computational efficiency and the accuracy of the estimates of phylogenetic trees, substitution parameters, and rate variation among sites. The MEGA user interface has now been enhanced to be activity driven, making it easier to use for both beginners and experienced scientists. This version of MEGA is intended for the Windows platform, and it has been configured for effective use on Mac OS X and Linux desktops. It is available free of charge from http://www.megasoftware.net.

3.
MEGA2: molecular evolutionary genetics analysis software.   Total citations: 201, self-citations: 0, citations by others: 201
We have developed a new software package, Molecular Evolutionary Genetics Analysis version 2 (MEGA2), for exploring and analyzing aligned DNA or protein sequences from an evolutionary perspective. MEGA2 vastly extends the capabilities of MEGA version 1 by: (1) facilitating analyses of large datasets; (2) enabling creation and analyses of groups of sequences; (3) enabling specification of domains and genes; (4) expanding the repertoire of statistical methods for molecular evolutionary studies; and (5) adding new modules for visual representation of input data and output results on the Microsoft Windows platform. AVAILABILITY: http://www.megasoftware.net. CONTACT: s.kumar@asu.edu

4.
The Molecular Evolutionary Genetics Analysis (MEGA) software has matured to contain a large collection of methods and tools of computational molecular evolution. Here, we describe new additions that make MEGA a more comprehensive tool for building timetrees of species, pathogens, and gene families using rapid relaxed-clock methods. Methods for estimating divergence times and confidence intervals are implemented to use probability densities for calibration constraints for node-dating and sequence sampling dates for tip-dating analyses. They are supported by new options for tagging sequences with spatiotemporal sampling information, an expanded interactive Node Calibrations Editor, and an extended Tree Explorer to display timetrees. Also added is a Bayesian method for estimating neutral evolutionary probabilities of alleles in a species using multispecies sequence alignments and a machine learning method to test for the autocorrelation of evolutionary rates in phylogenies. The computer memory requirements for the maximum likelihood analysis are reduced significantly through reprogramming, and the graphical user interface has been made more responsive and interactive for very big data sets. These enhancements will improve the user experience, quality of results, and the pace of biological discovery. Natively compiled graphical user interface and command-line versions of MEGA11 are available for Microsoft Windows, Linux, and macOS from www.megasoftware.net.

5.
6.
There have been substantial improvements in statistical tools for assessing the evolutionary roles of mutation and natural selection from interspecific sequence data. The importance of having the rate at which a point mutation occurs depend on the DNA sequence at sites surrounding the mutation is now better appreciated and can be accommodated in probabilistic models of protein evolution. To quantify the evolutionary impact of some aspect of phenotype, one promising strategy is to develop a system for predicting phenotype from the DNA sequence and to then infer how the evolutionary rates of sequence change are affected by the predicted phenotypic consequences of the changes. Although statistical tools for characterizing protein evolution are improving, the list of candidate phenomena that can affect rates of protein evolution is long and the relative contributions of these phenomena are only beginning to be disentangled.

7.
Warden CD  Kim SH  Yi SV 《PloS one》2008,3(2):e1559
Functional RNAs (fRNAs) are being recognized as an important regulatory component in biological processes. Interestingly, recent computational studies suggest that the number and biological significance of functional RNAs within coding regions (coding fRNAs) may have been underestimated. We hypothesized that such coding fRNAs will impose additional constraint on sequence evolution because the DNA primary sequence has to simultaneously code for functional RNA secondary structures on the messenger RNA in addition to the amino acid codons for the protein sequence. To test this prediction, we first utilized computational methods to predict conserved fRNA secondary structures within multiple species alignments of Saccharomyces sensu stricto genomes. We predict that as much as 5% of the genes in the yeast genome contain at least one functional RNA secondary structure within their protein-coding region. We then analyzed the impact of coding fRNAs on the evolutionary rate of protein-coding genes because a decrease in evolutionary rate implies constraint due to biological functionality. We found that our predicted coding fRNAs have a significant influence on evolutionary rates (especially at synonymous sites), independent of other functional measures. Thus, coding fRNA may play a role in sequence evolution. Given that coding regions of humans and flies contain many more predicted coding fRNAs than yeast, the impact of coding fRNAs on sequence evolution may be substantial in genomes of higher eukaryotes.

8.
Akashi H 《Gene》1999,238(1):39-51
Extensive DNA data emerging from genome-sequencing projects have revitalized interest in the mechanisms of molecular evolution. Although the contribution of natural selection at the molecular level has been debated for over 30 years, the relevant data and appropriate statistical methods to address this issue have only begun to emerge. This paper will first present the predominant models of neutral, nearly neutral, and adaptive molecular evolution. Then, a method to identify the role of natural selection in molecular evolution by comparing within- and between-species DNA sequence variation will be presented. Computer simulations show that such methods are powerful for detecting even very weak selection. Examination of DNA variation data within and between Drosophila species suggests that 'silent' sites evolve under a balance between weak selection and genetic drift. Simulated data also show that sequence comparisons are a powerful method to detect adaptive protein evolution, even when selection is weak or affects a small fraction of nucleotide sites. In the Drosophila data examined, positive selection appears to be a predominant force in protein evolution.

9.
The Ras superfamily is a fascinating example of functional diversification in the context of a preserved structural framework and a prototypic GTP binding site. Thanks to the availability of complete genome sequences of species representing important evolutionary branch points, we have analyzed the composition and organization of this superfamily at a greater level than was previously possible. Phylogenetic analysis of gene families at the organism and sequence level revealed complex relationships between the evolution of this protein superfamily sequence and the acquisition of distinct cellular functions. Together with advances in computational methods and structural studies, the sequence information has helped to identify features important for the recognition of molecular partners and the functional specialization of different members of the Ras superfamily.

10.
Much molecular-evolution research is concerned with sequence analysis. Yet these sequences represent real, three-dimensional molecules with complex structure and function. Here I highlight a growing trend in the field to incorporate molecular structure and function into computational molecular-evolution work. I consider three focus areas: reconstruction and analysis of past evolutionary events, such as phylogenetic inference or methods to infer selection pressures; development of toy models and simulations to identify fundamental principles of molecular evolution; and atom-level, highly realistic computational modeling of molecular structure and function aimed at making predictions about possible future evolutionary events.

11.
Despite its potential role in the evolution of complex phenotypes, the detection of negative (purifying) and positive selection on noncoding regulatory sequence has been elusive because of the inherent difficulty in predicting the functional consequences of mutations on noncoding sequence. Because the functioning of regulatory sequence depends upon both chromatin configuration and cis-regulatory factor binding, we investigate the idea that the functional conservation of regulatory regions should be associated with the conservation of sequence-dependent bending properties of DNA that determine its affinity for the nucleosome. Recent advances in the computational prediction of sequence-dependent affinity to nucleosomes provide an opportunity to distinguish between neutral and nonneutral evolution of fine-scale chromatin organization. Here, a statistical test is presented for detecting evolutionary conservation and/or adaptive evolution of nucleosome affinity from interspecies comparisons of DNA sequences. Local nucleosome affinities of homologous sequences were calculated using 2 recently published methods. A randomization test was applied to sites of mutation to evaluate the similarity of DNA-nucleosome affinity between several closely related species of Saccharomyces yeast. For most of the genes we analyzed, the conservation of local nucleosome affinity was detected at a few distinct locations in the upstream noncoding region. Our results also demonstrate that different patterns of chromatin evolution have shaped DNA-nucleosome interaction at the core promoters of TATA-containing and TATA-less genes and that elevated purifying selection has maintained low affinity for nucleosome in the core promoters of the latter group. Across the entire yeast genome, DNA-nucleosome interaction was also discovered to be significantly more conserved in TATA-less genes compared with TATA-containing genes.

12.
A central goal of computational biology is the prediction of phenotype from DNA and protein sequence data. Recent models of sequence change use in silico prediction systems to incorporate the effects of phenotype on evolutionary rates. These models have been designed for analyzing sequence data from different species and have been accompanied by statistical techniques for estimating model parameters when the incorporation of phenotype induces dependent change among sequence positions. A difficulty with these efforts to link phenotype and interspecific evolution is that evolution occurs within populations, and parameters of interspecific models should have population genetic interpretations. We show, with two examples, how population genetic interpretations can be assigned to evolutionary models. The first example considers the impact of RNA secondary structure on sequence change, and the second reflects the tendency for protein tertiary structure to influence nonsynonymous substitution rates. We argue that statistical fit to data should not be the sole criterion for assessing models of sequence change. A good interspecific model should also yield a clear and biologically plausible population genetic interpretation.

13.
The study of biogeography has benefited from the exponential increase of DNA sequence data from recent molecular systematic studies, the development of analytical methods in the last decade concerning divergence time estimation and geographic area analyses, and the availability of large-scale distribution data of species in many groups of organisms. The underlying principle of divergence time estimation from DNA and protein data is that sequence divergence depends on the product of evolutionary rate and time. With their molecular clock hypothesis, Zuckerkandl and Pauling (1965) separated rates of molecular evolution from time by incorporating fossil evidence. Originally,

14.
The recent technological advances underlying the screening of large combinatorial libraries in high-throughput mutational scans deepen our understanding of adaptive protein evolution and boost its applications in protein design. Nevertheless, the large number of possible genotypes requires suitable computational methods for data analysis, the prediction of mutational effects, and the generation of optimized sequences. We describe a computational method that, trained on sequencing samples from multiple rounds of a screening experiment, provides a model of the genotype–fitness relationship. We tested the method on five large-scale mutational scans, yielding accurate predictions of the mutational effects on fitness. The inferred fitness landscape is robust to experimental and sampling noise and exhibits high generalization power in terms of broader sequence space exploration and higher fitness variant predictions. We investigate the role of epistasis and show that the inferred model provides structural information about the 3D contacts in the molecular fold.

15.
饶慧芸 《人类学学报》2022,41(6):1083-1096
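The evolution of ancient humans in East Asia is a scientific question of intense academic interest. Researchers in China and abroad have carried out multidisciplinary studies and made much important progress, but many problems remain unresolved. In recent years, ancient protein analysis has become another frontier and hot topic in the study of paleontological evolution and has achieved a series of important breakthroughs. Compared with ancient DNA, ancient proteins preserve better, which allows them to overcome the temporal and geographic limits of ancient DNA and gives them great potential in the study of human evolution. East Asia has abundant hominin fossils spanning a roughly continuous time range, but molecular evidence from the Pleistocene or earlier is very scarce. This paper reviews and reflects on the prospects for applying ancient protein analysis to the study of human evolution in East Asia, covering the history of ancient protein analysis, its research potential, its difficulties and challenges, and considerations and outlook. As more molecular evidence accumulates, ancient protein analysis can be expected to provide more key clues to the evolutionary history of ancient humans in East Asia and to greatly advance research on human evolution.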

16.
DNA replication is one of the most ancient of cellular processes and functional similarities among its molecular machinery are apparent across all cellular life. Cdc45 is one of the essential components of the eukaryotic replication fork and is required for the initiation and elongation of DNA replication, but its molecular function is currently unknown. In order to trace its evolutionary history and to identify functional domains, we embarked on a computational sequence analysis of the Cdc45 protein family. Our findings reveal eukaryotic Cdc45 and prokaryotic RecJ to possess a common ancestry and Cdc45 to contain a catalytic site within a predicted exonuclease domain. The likely orthology between Cdc45 and RecJ reveals new lines of enquiry into DNA replication mechanisms in eukaryotes.

17.
Wu ML  Lin TP  Lin MY  Cheng YP  Hwang SY 《Annals of botany》2007,99(3):461-475
BACKGROUND AND AIMS: Evolutionary and ecological roles of the chloroplast small heat shock protein (CPsHSP) have been emphasized based on variations in protein contents; however, DNA sequence variations related to the evolutionary and ecological roles of this gene have not been investigated. In the present study, a basal angiosperm, Machilus, together with the eudicot Rhododendron were used to illustrate the evolutionary dynamics of gene divergence in CPsHSPs. METHODS: Degenerate primers were used to amplify CPsHSP-related sequences from 16 Rhododendron and eight Machilus species that occur in Taiwan. Manual DNA sequence alignment was carried out according to the deduced amino acid sequence alignment performed by CLUSTAL X. A neighbour-joining tree was generated in MEGA using conceptual translated amino acid sequences from consensus sequences of cloned CPsHSP genes from eight Machilus and 16 Rhododendron species as well as amino acid sequences of CPsHSPs from five monocots and seven other eudicots acquired from GenBank. CPsHSP amino acid sequences of Funaria hygrometrica were used as the outgroups. The aligned DNA and amino acid sequences were used to estimate several parameters of sequence divergence using the MEGA program. Separate Bayesian inference of DNA sequences of Rhododendron and Machilus species was analysed and the resulting gene trees were used for detection of putative positively selected amino acid sites by the Codeml program implemented in the PAML package. Mean hydrophobicity profile analysis was performed with representative amino acid sequences for both Rhododendron and Machilus species by the Bioedit program. The computer program SplitTester was used to examine whether CPsHSPs of Rhododendron lineages and duplicate copies of the Machilus CPsHSPs have evolved functional divergence based on the hydrophobicity distance matrix. KEY RESULTS: Only one copy of the CPsHSP was found in Rhododendron. However, a higher evolutionary rate of amino acid substitutions in the Hymenanthes lineage of Rhododendron was inferred. Two positively selected amino acid sites may have resulted in higher hydrophobicity in the region of the alpha-crystallin domain (ACD) of the CPsHSP. By contrast, the basal angiosperm, Machilus, possessed duplicate copies of the CPsHSP, which also differed in their evolutionary rates of amino acid substitutions. However, no apparent relationship of ecological relevance toward the positively selected amino acid sites was found in Machilus. CONCLUSIONS: Divergent evolution was found for both Rhododendron lineages and the paralogues of CPsHSP in Machilus that were directed to the shift in hydrophobicity in the ACD and/or methionine-rich region, which might have played important roles in molecular chaperone activity.

18.
Gu X 《Genetics》2007,175(4):1813-1822
In this article, we develop an evolutionary model for protein sequence evolution. Gene pleiotropy is characterized by K distinct but correlated components (molecular phenotypes) that affect the organismal fitness. These K molecular phenotypes are under stabilizing selection with microadaptation (SM) due to random optima shifts, the SM model. Random coding mutations generate a correlated distribution of K molecular phenotypes. Under this SM model, we further develop a statistical method to estimate the "effective" number of molecular phenotypes (K(e)) of the gene. Therefore, for the first time we can empirically evaluate gene pleiotropy from the protein sequence analysis. Case studies of vertebrate proteins indicate that K(e) is typically approximately 6-9. We demonstrate that the newly developed SM model of protein evolution may provide a basis for exploring genomic evolution and correlations.

19.
Reconstructing a tree of life by inferring evolutionary history is an important focus of evolutionary biology. Phylogenetic reconstructions also provide useful information for a range of scientific disciplines such as botany, zoology, phylogeography, archaeology and biological anthropology. Until the development of protein and DNA sequencing techniques in the 1960s and 1970s, phylogenetic reconstructions were based on fossil records and comparative morphological/physiological analyses. Since then, progress in molecular phylogenetics has compensated for some of the shortcomings of phenotype-based comparisons. Comparisons at the molecular level increase the accuracy of phylogenetic inference because there is no environmental influence on DNA/peptide sequences and evaluation of sequence similarity is not subjective. While the number of morphological/physiological characters that are sufficiently conserved for phylogenetic inference is limited, molecular data provide a large number of datapoints and enable comparisons from diverse taxa. Over the last 20 years, developments in molecular phylogenetics have greatly contributed to our understanding of plant evolutionary relationships. Regions in the plant nuclear and organellar genomes that are optimal for phylogenetic inference have been determined and recent advances in DNA sequencing techniques have enabled comparisons at the whole genome level. Sequences from the nuclear and organellar genomes of thousands of plant species are readily available in public databases, enabling researchers without access to molecular biology tools to investigate phylogenetic relationships by sequence comparisons using the appropriate nucleotide substitution models and tree building algorithms. In the present review, the statistical models and algorithms used to reconstruct phylogenetic trees are introduced and advances in the exploration and utilization of plant genomes for molecular phylogenetic analyses are discussed.

20.
BEAST: Bayesian evolutionary analysis by sampling trees   Total citations: 2, self-citations: 0, citations by others: 2

Background  

The evolutionary analysis of molecular sequence variation is a statistical enterprise. This is reflected in the increased use of probabilistic models for phylogenetic inference, multiple sequence alignment, and molecular population genetics. Here we present BEAST: a fast, flexible software architecture for Bayesian analysis of molecular sequences related by an evolutionary tree. A large number of popular stochastic models of sequence evolution are provided and tree-based models suitable for both within- and between-species sequence data are implemented.
