Similar articles
20 similar articles found (search time: 390 ms)
1.
Both traditional as well as 10 more recent methods of coding characters from exons of protein‐coding genes are reviewed. The more recent methods collectively blur the distinction between nucleotide and amino‐acid coding and enable investigators to carefully quantify the effects of different sources of phylogenetic signal as well as their potential biases. Codon models, which explicitly model silent and replacement substitutions, are a major advance and are expected to be broadly useful for simultaneously inferring recent and ancient divergences, unlike amino‐acid coding. Degeneracy coding, wherein ambiguity codes are used to eliminate silent substitutions at the individual‐nucleotide level, has clear advantages over scoring amino‐acid characters. Nucleotide, codon, and amino‐acid models are now directly comparable with easy‐to‐use programs, and widely used phylogenetics programs can analyze partitioned supermatrices that incorporate all three types of model. Therefore, it should become standard practice to test among these alternative model types before conducting parametric phylogenetic analyses. An earlier study of 78 protein‐coding genes from 360 green‐plant plastid genomes is used as an empirical example with which to quantify the relative performance of alternative character‐coding methods using five quantification measures. Codon models were selected as having the best fit to the data, yet were outperformed by nucleotide models for all five quantification measures. Third‐codon positions were found to be an important source of phylogenetic signal and even outperformed analyses of first and second positions for some measures. Degeneracy coding generally performed at least as well as amino‐acid coding and is an arguably more effective alternative.
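The idea of degeneracy coding described in this abstract can be illustrated with a minimal Python sketch. It is a simplified assumption, not the procedure of any published tool: each codon is replaced, position by position, with the IUPAC ambiguity code covering every base found at that position among all codons for the same amino acid in the standard genetic code. The names `degenerate_codon` and `degenerate_sequence` are hypothetical, and codons that cannot be translated (gaps, ambiguous bases) are simply returned as NNN.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... (first base varies slowest).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# IUPAC nucleotide ambiguity codes, keyed by the set of bases they stand for.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M", frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def degenerate_codon(codon):
    """Return the ambiguity-coded form of one codon (simplified, illustrative scheme)."""
    aa = CODON_TABLE.get(codon.upper())
    if aa is None:  # gap or ambiguous base: treated as fully unknown (an assumption)
        return "NNN"
    synonyms = [c for c, a in CODON_TABLE.items() if a == aa]
    # For each codon position, pool the bases seen across all synonymous codons.
    return "".join(IUPAC[frozenset(c[i] for c in synonyms)] for i in range(3))


def degenerate_sequence(seq):
    """Apply degenerate_codon to every complete in-frame codon of seq."""
    usable = len(seq) - len(seq) % 3
    return "".join(degenerate_codon(seq[i:i + 3]) for i in range(0, usable, 3))
```

Two sequences that differ only by silent substitutions, such as ATGCTGTTT and ATGCTATTC, become identical after this recoding, which is the property the abstract attributes to degeneracy coding. Pooling bases across all six leucine, serine, or arginine codons is a deliberately crude choice made here for brevity.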

2.
3.
The 32-kDa photosystem II protein of the chloroplast is thought to be a target molecule for the herbicide atrazine. The psbA gene coding for this protein was cloned from Solanum nigrum atrazine-susceptible ('S') and atrazine-resistant ('R') biotypes. The 'S' and 'R' genes are identical in nucleotide sequence except for an A to G transition, predicting a Ser to Gly change at codon 264. The same predicted amino acid change in psbA was previously shown for Amaranthus hybridus 'S' and 'R' biotypes, which had, in addition, two silent nucleotide changes between the genes (Hirschberg, J. and McIntosh, L., Science 222, 1346-1349, 1983). Occurrence of the identical, non-silent change in psbA in different 'S' and 'R' weed biotype pairs suggests a functional, herbicide-related role for this codon position.

4.
In theory, codon models that account for the dependence of nucleotide substitutions between codon positions as well as differences between synonymous and non-synonymous changes best describe the sequence evolution in protein coding genes. However, in practice we know little about the degree to which violations of the assumptions of codon model-based estimates occur, and how significant these artifacts may be. In nucleotide-based phylogenies from first and second codon positions in a concatenated plastid gene data set, two distantly related taxa--dinoflagellate and haptophyte plastids--were robustly grouped together. This artifactual grouping is attributed to the parallel heterogeneity in leucine (Leu) and serine (Ser) codon usages in the data set. Here, by using this data set, we demonstrated that codon-based phylogenetic estimations are seriously biased, robustly uniting the dinoflagellate and haptophyte plastids into a monophyletic clade, when the model assumption of homogeneity of codon composition was violated. Our results suggest that similar phylogenetic artifacts may occur via codon usage heterogeneity in any amino acids in codon model-based estimations. We advise that homogeneity in codon usage across taxa in a data set be confirmed before codon model-based phylogenetic estimation is attempted.
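As a rough illustration of the check recommended in this abstract, the Python sketch below tabulates how one taxon divides its leucine codons among the six synonymous alternatives. It is only a descriptive first step under stated assumptions: the sequence is taken to be in frame, the function name `codon_fractions` is hypothetical, and no formal test of homogeneity across taxa is implemented.

```python
from collections import Counter

# The six leucine codons of the standard genetic code.
LEU_CODONS = ("TTA", "TTG", "CTT", "CTC", "CTA", "CTG")


def codon_fractions(seq, codons=LEU_CODONS):
    """Fraction of each listed synonymous codon among all listed codons found in seq."""
    usable = len(seq) - len(seq) % 3
    counts = Counter(seq[i:i + 3].upper() for i in range(0, usable, 3))
    total = sum(counts[c] for c in codons)
    return {c: (counts[c] / total if total else 0.0) for c in codons}
```

Computing these fractions for every taxon and comparing them side by side would show whether some lineages share a skewed Leu (or, with a different codon list, Ser) usage of the kind the abstract describes.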

5.
Partial DNA and amino acid sequences translated from the mitochondrial cytochrome oxidase subunit I gene (408 bp) of 17 mite species have been used for analyzing the phylogenetic relationships within the terrestrial Parasitengona (Trombidia). Due to mutational saturation of the third codon position, only first and second codon positions and amino acid sequences were analyzed, applying neighbor-joining, maximum-parsimony, and maximum-likelihood tree-building methods. The reconstructed trees revealed similar topologies of taxa; however, the phylogenetic relationships could be convincingly resolved only within several trombidioid taxa. The proposed basic relationships within the Parasitengona, in particular those of Calyptostomatoidea, Smarididae, and Erythraeidae, were poorly supported in bootstrap tests. A comparison of the presented gene tree with a phylogenetic tree based upon traditional characters revealed only a few contradictions, in nodes only weakly supported by morphological data. The most astonishing result is the proposed early derivative position of Microtrombidiidae within the terrestrial Parasitengona.

6.
Actin gene family of Caenorhabditis elegans   (total citations: 28; self-citations: 0; citations by others: 28)
Four actin genes have been isolated from Caenorhabditis elegans that account for all of the major actin hybridization to total genomic DNA. Actin genes I, II and III are clustered within a 12 × 10^3 base region; gene IV is unlinked to the others. All four genes have been sequenced from at least nucleotide -109 to +250. Genes I and III are identical for the first 307 coding nucleotides. Genes I and II differ in 14 positions within the first 250 coding nucleotides; one difference substitutes an aspartic acid for a glutamic acid at codon 5. Genes I and IV differ in 18 positions within the first 259 coding nucleotides without causing any amino acid differences. Genes I, II and III have introns after the first nucleotide of codon 64 and gene IV has an intron between codons 19 and 20. The four nucleotide sequences thus far define two different amino acid sequences. Both of the amino acid sequences resemble vertebrate cytoplasmic actin more than vertebrate muscle actin. A DNA polymorphism between the Bristol and Bergerac strains has been used as a phenotypic marker in genetic crosses to map the cluster of actin genes within a 2% recombination interval on linkage group V between unc-23 and sma-1 in order to begin a molecular genetic analysis of the actin loci.

7.
The utility of a nuclear protein-coding gene for reconstructing phylogenetic relationships within the family Culicidae was explored. Relationships among 13 species representing three subfamilies and nine genera of Culicidae were analyzed using a 762-bp fragment of coding sequence from the eye color gene, white. Outgroups for the study were two species from the sister group Chaoboridae. Sequences were determined from cloned PCR products amplified from genomic DNA, and aligned following conceptual intron splicing and amino acid translation. Third codon positions were characterized by high levels of divergence and biased nucleotide composition, the intensity and direction of which varied among taxa. Equal weighting of all characters resulted in parsimony and neighbor-joining trees at odds with the generally accepted phylogenetic hypothesis based on morphology and rDNA sequences. The application of differential weighting schemes recovered the traditional hypothesis, in which the subfamily Anophelinae formed the basal clade. The subfamily Toxorhynchitinae occupied an intermediate position, and was a sister group to the subfamily Culicinae. Within Culicinae, the genera Sabethes and Tripteroides formed an ancestral clade, while the Culex-Deinocerites and Aedes-Haemagogus clades occupied increasingly derived positions in the molecular phylogeny. An intron present in the Culicinae-Toxorhynchitinae lineage and one outgroup taxon was absent in the basal Anophelinae lineage and the second outgroup taxon, suggesting that intron insertions or deletions may not always be reliable systematic characters.

8.
9.
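This paper describes a computer program system for nucleic acid sequence analysis implemented on a microcomputer (IBM PC). The system consists of three levels and 18 functional modules; menus and interactive dialogue allow users to learn and use it quickly. The implementation uses data-structure techniques such as tree structures, last-in-first-out stacks, and sparse matrices; statistical methods such as the Bayes method; and a series of graph-theoretic methods such as Kruskal's algorithm and Floyd's algorithm. The release of this software system should be of some benefit to molecular biology research.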

10.
11.
We have compared the partial nucleotide and derived amino acid sequences of a phaseolin seed storage protein gene of Phaseolus vulgaris (1) and a conglycinin storage protein gene of Glycine max (2). Although these proteins are not antigenically related to one another, the architecture of the genes is similar throughout the sequences compared here. Intervening sequences interrupt the same amino acid positions in both genes. Within the 28% of the G. max gene and the 38% of the P. vulgaris gene represented in this comparison, 73% of the nucleotides in the coding and intervening sequences are identical, excluding the insertions and deletions. The nucleotide mismatches found in the coding sequences are distributed throughout the three codon positions with little bias towards the third codon position. In addition to the single nucleotide differences, six insertions or deletions, ranging from three to twenty-seven nucleotides in length, occur in this portion of the coding region and these are partially responsible for the molecular weight differences of the conglycinin α′-subunit and the phaseolin subunit.

12.
Miyazawa S 《PloS one》2011,6(12):e28892
BACKGROUND: A mechanistic codon substitution model, in which each codon substitution rate is proportional to the product of a codon mutation rate and the average fixation probability depending on the type of amino acid replacement, has advantages over nucleotide, amino acid, and empirical codon substitution models in evolutionary analysis of protein-coding sequences. It can approximate a wide range of codon substitution processes. If no selection pressure on amino acids is taken into account, it will become equivalent to a nucleotide substitution model. If mutation rates are assumed not to depend on the codon type, then it will become essentially equivalent to an amino acid substitution model. Mutation at the nucleotide level and selection at the amino acid level can be separately evaluated. RESULTS: The present scheme for single nucleotide mutations is equivalent to the general time-reversible model, but multiple nucleotide changes in infinitesimal time are allowed. Selective constraints on the respective types of amino acid replacements are tailored to each gene in a linear function of a given estimate of selective constraints. Their good estimates are those calculated by maximizing the respective likelihoods of empirical amino acid or codon substitution frequency matrices. Akaike and Bayesian information criteria indicate that the present model performs far better than the other substitution models for all five phylogenetic trees of highly-divergent to highly-homologous sequences of chloroplast, mitochondrial, and nuclear genes. It is also shown that multiple nucleotide changes in infinitesimal time are significant in long branches, although they may be caused by compensatory substitutions or other mechanisms. The variation of selective constraint over sites fits the datasets significantly better than variable mutation rates, except for 10 slow-evolving nuclear genes of 10 mammals. 
A critical finding for phylogenetic analysis is that assuming variable mutation rates over sites leads to the overestimation of branch lengths.

13.
Lv HJ  Huang Y 《动物学研究》2012,33(3):319-328
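This study reconstructed the phylogenetic relationships among some groups of Orthoptera based on complete COI gene sequences from 56 orthopteran species, and also assessed the reliability of the amino acid sequences encoded by COI for reconstructing orthopteran phylogeny. The COI sequences were partitioned into first, second, and third codon positions, and PBS (partitioned Bremer support) values were calculated for each partition to evaluate the strength of phylogenetic signal at the different codon positions of a protein-coding gene. The results support the monophyly of Ensifera and of Caelifera. None of the five families Acrididae, Catantopidae, Oedipodidae, Arcypteridae, and Gomphoceridae is monophyletic; genetic distances among these families range from 0.107 to 0.153, which is small compared with distances to other families. This is consistent with a classification that merges the five families into a single family (Acrididae), places Chrotogonidae and Pyrgomorphidae in the superfamily Pyrgomorphoidea, and treats Pamphagidae as a separate family, in agreement with the system of Otte (1997). The PBS values indicate that third and first codon positions contribute more to the branches of the tree than second positions, and that longer sequences contain more informative sites. The study also confirms that genetic distances between the COI genes of species are a feasible tool for delimiting family-level categories in Orthoptera.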

14.
Summary. Patterns of nucleotide substitutions in human major histocompatibility complex (MHC) class I genes were estimated by using phylogenetic trees of DNA sequences. The pattern is defined as a set of 12 parameters, each of which represents the relative frequency of substitutions from a particular nucleotide to another. The pattern at the antigen recognition sites (ARS) in functional MHC genes was remarkably different from that at the remaining coding region (non-ARS). In particular, the proportion of transitions among all the nucleotide substitutions (P_s) was extremely low at the third codon positions of ARS. In the HLA-A genes, P_s at the third codon positions was only 6% in ARS, whereas it was 69% in non-ARS. In HLA-B, the corresponding values were 30% in ARS and 80% in non-ARS, respectively. On the other hand, P_s in a class I pseudogene (HLA-H) was 57%, which was in good agreement with P_s in other pseudogenes. Because pseudogenes are selectively neutral, the pattern in pseudogenes is regarded as the pattern of spontaneous substitution mutations. In general, the pattern in functional genes that are subject to selective forces deviates from the pattern in pseudogenes. At the third codon positions in coding regions, transitions scarcely cause amino acid replacements, whereas about half of transversions do cause replacements. Accordingly, P_s at the third codon positions decreases if amino acid replacements are accelerated by natural selection but increases if amino acids are conserved by functional constraint. Our observations imply that the ARS region is subject to natural selection favoring amino acid replacements, whereas the non-ARS region is subject to functional constraint. Offprint requests to: T. Gojobori
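The quantity P_s defined in this abstract (transitions as a proportion of all substitutions) can be illustrated with a short Python sketch. Note the swapped-in simplification: the study estimated substitutions along phylogenetic trees, whereas this sketch merely counts differences between two aligned, in-frame sequences. The function names are hypothetical, and gaps or ambiguous bases are skipped.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(a, b):
    """True for purine<->purine or pyrimidine<->pyrimidine changes (A<->G, C<->T)."""
    return a != b and ({a, b} <= PURINES or {a, b} <= PYRIMIDINES)


def transition_proportion(seq1, seq2, codon_position=3):
    """Pairwise proxy for P_s at one codon position (1, 2, or 3).

    Returns None when the two sequences show no differences at that position.
    """
    transitions = transversions = 0
    for i in range(codon_position - 1, min(len(seq1), len(seq2)), 3):
        a, b = seq1[i].upper(), seq2[i].upper()
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue  # identical site, gap, or ambiguous base
        if is_transition(a, b):
            transitions += 1
        else:
            transversions += 1
    total = transitions + transversions
    return transitions / total if total else None
```

For example, `transition_proportion("GCAGCTGCC", "GCGGCAGCT")` compares third positions A/G, T/A, and C/T, that is two transitions and one transversion, giving 2/3.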

15.
Results of Fourier analysis of nucleotide sequences are discussed. The most striking feature of the spectra of both phage and viral nucleic acid sequences is the presence of peaks (reflexes) corresponding to a regular arrangement of nucleotides (mainly T and G) with a 3-base period. The amplitude and phase of the corresponding peaks in the dinucleotide spectra, obtained by numerical computation of the Fourier transform, give specific information on amino acid composition, codon bias, and relations between amino acids. The width of the frequency band characterizes the tendency of nucleotides to cluster or to occur separately. Blurring of the peaks indicates a disturbance of long-range order in the regular nucleotide "lattice". Two-dimensional spectral analysis supports the existence of long-range correlations in nucleotide positions.
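A 3-base periodicity of the kind described in this abstract can be probed with a minimal Python sketch. It rests on assumptions not taken from the paper: the sequence is converted to a 0/1 indicator for a single base, and the squared magnitude of one Fourier component is normalized by sequence length. The function name `base_spectrum` is hypothetical.

```python
import cmath


def base_spectrum(seq, base, period):
    """Power of the Fourier component with the given period (in bases)
    for the 0/1 indicator sequence marking occurrences of `base`."""
    n = len(seq)
    if n == 0:
        return 0.0
    total = sum(
        cmath.exp(-2j * cmath.pi * k / period)
        for k, s in enumerate(seq.upper())
        if s == base
    )
    return abs(total) ** 2 / n
```

On a sequence in which G recurs strictly every third base, such as `"GCA" * 30`, the component at period 3 is large while the component at period 4 nearly cancels; in real coding sequences the period-3 peak is weaker but arises from the same kind of positional regularity.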

16.
Procedures for performing cladistic analyses can provide powerful tools for understanding the evolution of neuropeptide and polypeptide hormone coding genes. These analyses can be done on either amino acid data sets or nucleotide data sets and can utilize several different algorithms that are dependent on distinct sets of operating assumptions and constraints. In some cases, the results of these analyses can be used to gauge phylogenetic relationships between taxa. Selecting the proper cladistic analysis strategy is dependent on the taxonomic level of analysis and the rate of evolution within the orthologous genes being evaluated. For example, previous studies have shown that the amino acid sequence of proopiomelanocortin (POMC), the common precursor for the melanocortins and beta-endorphin, can be used to resolve phylogenetic relationships at the class and order level. This study tested the hypothesis that POMC sequences could be used to resolve phylogenetic relationships at the family taxonomic level. Cladistic analyses were performed on amphibian POMC sequences characterized from the marine toad, Bufo marinus (family Bufonidae; this study), the spadefoot toad, Spea multiplicatus (family Pelobatidae), the African clawed frog, Xenopus laevis (family Pipidae) and the laughing frog, Rana ridibunda (family Ranidae). In these analyses the sequence of Australian lungfish POMC was used as the outgroup. The analyses were done at the amino acid level using the maximum parsimony algorithm and at the nucleotide level using the maximum likelihood algorithm. For the anuran POMC genes, analysis at the nucleotide level using the maximum likelihood algorithm generated a cladogram with higher bootstrap values than the maximum parsimony analysis of the POMC amino acid data set. 
For anuran POMC sequences, analysis of nucleotide sequences using the maximum likelihood algorithm would appear to be the preferred strategy for resolving phylogenetic relationships at the family taxonomic level.

17.
The branching topology of the archaeal (archaebacterial) domain was inferred from sequence comparisons of the largest subunit (B) of DNA-dependent RNA polymerases (RNAP). Both the nucleic acid sequences of the genes coding for RNAP subunit B and the amino acid sequences of the derived gene products were used for phylogenetic reconstructions. Individual analysis of the three nucleotide positions of codons revealed significant inequalities with respect to guanosine and cytosine (GC) content and evolutionary rates. Only the nucleotides at the second codon positions were found to be unbiased by varied GC contents and sufficiently conserved for reliable phylogenetic reconstructions. A decision matrix was used for the combination of the results of distance matrix, maximum parsimony, and maximum likelihood methods. For this purpose the original results (sums of squares, steps, and logarithms of likelihoods) were transformed into comparable effective values and analyzed with methods known from the theory of statistical decisions. Phylogenetic invariants and statistical analysis with resampling techniques (bootstrap and jackknife) confirmed the preferred branching topology, which is significantly different from the topology known from phylogenetic trees based on 16S rRNA sequences. The preferred topology reconstructed by this analysis shows a common stem for the Methanococcales and Methanobacteriales and a separation of the thermophilic sulfur archaea from the methanogens and halophiles. The latter coincides with a unique phylogenetic location of a characteristic splitting event replacing the largest RNAP subunit of thermophilic sulfur archaea by two fragments in methanogens and halophiles. This topology is in good agreement with physiological and structural differences between the various archaea and demonstrates RNAP to be a suitable phylogenetic marker molecule. Correspondence to: H.-P. Klenk

18.
The complete 15,831 bp nucleotide sequence of the mitochondrial genome from Elimaea cheni (Phaneropterinae) was determined. The putative initiation codon for cox1 was TTA. The phylogeny of Orthoptera based on different mtDNA datasets was analyzed with maximum likelihood (ML) and Bayesian inference (BI). When all 37 genes (mtDNA) were analyzed simultaneously, the monophyly of Caelifera and Ensifera was recovered in the context of our taxon sampling. The phylogeny of Orthoptera was largely consistent with previous phylogenetic hypotheses. Rhaphidophoridae was recovered as the sister group of Tettigoniidae, and the relationships among four subfamilies of Tettigoniidae were (Phaneropterinae + (Conocephalinae + (Bradyporinae + Tettigoniinae))). Pyrgomorphidae was the most basal group of Caelifera. The relationships among six acridid subfamilies were (Oedipodinae + (Acridinae + (Gomphocerinae + (Oxyinae + (Calliptaminae + Cyrtacanthacridinae))))). However, we did not recover a monophyletic Grylloidea. Myrmecophilidae clustered into one clade with Gryllotalpidae instead of with Gryllidae. ML and BI analyses of all protein coding genes (using all nucleotide sequence data or excluding the third codon position, and amino acid sequences) revealed a topology identical to that of the entire mtDNA genome dataset. However, the 22 tRNA genes excluding the DHU loop and TΨC loop (TRNA), and the two rRNA genes (RRNA), performed poorly when analyzed as single datasets. Our results suggest that the best phylogenetic inferences were obtained with ML and BI methods based on total mtDNA. Excluding tRNA genes, rRNA genes and the third codon position of protein coding genes from the dataset, and converting nucleotide sequences to amino acid sequences, do not positively affect phylogenetic reconstruction.

19.
Ren F  Tanaka H  Yang Z 《Systematic biology》2005,54(5):808-818
Models of codon substitution have been commonly used to compare protein-coding DNA sequences and are particularly effective in detecting signals of natural selection acting on the protein. Their utility in reconstructing molecular phylogenies and in dating species divergences has not been explored. Codon models naturally accommodate synonymous and nonsynonymous substitutions, which occur at very different rates and may be informative for recent and ancient divergences, respectively. Thus codon models may be expected to make an efficient use of phylogenetic information in protein-coding DNA sequences. Here we applied codon models to 106 protein-coding genes from eight yeast species to reconstruct phylogenies using the maximum likelihood method, in comparison with nucleotide- and amino acid-based analyses. The results appeared to confirm that expectation. Nucleotide-based analyses, under simplistic substitution models, were efficient in recovering recent divergences whereas amino acid-based analyses performed better at recovering deep divergences. Codon models appeared to combine the advantages of amino acid and nucleotide data and had good performance at recovering both recent and deep divergences. Estimation of relative species divergence times using amino acid and codon models suggested that translation of gene sequences into proteins led to information loss ranging from 30% for deep nodes to 66% for recent nodes. Although computational burden makes codon models unfeasible for tree search in large data sets, we suggest that they may be useful for comparing candidate trees. Nucleotide models that accommodate the differences in evolutionary dynamics at the three codon positions also performed well, at much less computational cost. We discuss the relationship between a model's fit to data and its utility in phylogeny reconstruction and caution against use of overly complex substitution models.

20.
Gissi C  San Mauro D  Pesole G  Zardoya R 《Gene》2006,366(2):228-237
We explore whether phylogenetic analyses of the same sequence data set at the amino acid and nucleotide level are able to recover congruent topologies, as well as the advantages and limitations of both alternative approaches. As a case study, mitochondrial protein-coding genes were used to discern among competing hypotheses on the phylogenetic relationships of major anuran amphibian lineages. To properly address this phylogenetic question, the complete nucleotide sequences of the mitochondrial genomes of two archaeobatrachian species, Ascaphus truei and Pelobates cultripes, were determined anew. Bayesian and maximum likelihood phylogenetic inferences of the same sequence data set were performed based on both amino acid and nucleotide characters, with the latter analysed either as codons or as a reduced data set of first+second (P12) codon positions. In addition, likelihood-based ratio tests were performed to evaluate the support of alternative topologies. The different data sets arrived at congruent and highly supported topologies, suggesting a similar phylogenetic resolving power of the two character types provided that correctly selected sites and appropriate evolutionary models are used. The reconstructed anuran mitochondrial phylogeny supports the paraphyly of Archaeobatrachia, with Ascaphus as sister group to all the remaining anurans, and Pelobates as sister group of Neobatrachia. However, the employed tree reconstruction methods and likelihood-based ratio tests seemed to be negatively affected by the fast evolving sequences of neobatrachians, suggesting that the phylogeny of Anura here presented is not definitive, and needs further investigation using an extended taxon sampling.
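The reduced first+second (P12) data set mentioned in this abstract can be derived from an aligned coding sequence with a few lines of Python. This is a minimal sketch under the assumption that the sequence starts at a first codon position and contains no frame-shifting gaps; the function name `codon_positions` is hypothetical.

```python
def codon_positions(seq, positions=(1, 2)):
    """Keep only the bases at the given codon positions (1-based) of an in-frame sequence."""
    keep = {p - 1 for p in positions}
    return "".join(base for i, base in enumerate(seq) if i % 3 in keep)
```

With the default arguments, `codon_positions("ATGCTGTTT")` yields `"ATCTTT"`, and `positions=(3,)` returns the third positions alone (`"GGT"`), which is the partition several abstracts in this list exclude because of saturation.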
