Similar literature
1.
In phylogenetics and comparative genomics, it is important to establish orthologous relationships when comparing homologous sequences. Because orthologs and paralogs differ only slightly in sequence, paralogs are easily mistaken for orthologs. For this reason, several methods based on evolutionary distance, phylogeny and BLAST have tried to detect orthologs with more precision. Depending on their algorithmic implementations, each of these methods sometimes has increased false negative or false positive rates. Here, we developed a novel algorithm for orthology detection that uses a distance method based on the phylogenetic criterion of minimum evolution. Our algorithm assumes that sets of sequences exhibiting orthologous relationships are evolutionarily less costly than sets that include one or more paralogous relationships. Calculation of evolutionary cost requires the reconstruction of a neighbor-joining (NJ) tree, but calculations are unaffected by the topology of any given NJ tree. Unlike tree reconciliation, our algorithm appears free from the problem of incorrect topologies of species and gene trees. The reliability of the algorithm was tested in a comparative analysis with two other orthology detection methods using 95 manually curated KOG datasets and 21 experimentally verified EXProt datasets. Sensitivity and specificity estimates indicate that the concept of minimum evolution could be valuable for the detection of orthologs.

2.
Forty-four sequences of ornithine carbamoyltransferases (OTCases) and 33 sequences of aspartate carbamoyltransferases (ATCases) representing the three domains of life were multiply aligned and a phylogenetic tree was inferred from this multiple alignment. The global topology of the composite rooted tree (each enzyme family being used as an outgroup to root the other one) suggests that present-day genes are derived from paralogous ancestral genes which were already of the same size and argues against a mechanism of fusion of independent modules. A closer observation of the detailed topology shows that this tree could not be used to assess the actual order of organismal descent. Indeed, this tree displays a complex topology for many prokaryotic sequences, with polyphyly for Bacteria in both enzyme trees and for the Archaea in the OTCase tree. Moreover, representatives of the two prokaryotic Domains are found to be interspersed in various combinations in both enzyme trees. This complexity may be explained by assuming the occurrence of two subfamilies in the OTCase tree (OTC α and OTC β) and two other ones in the ATCase tree (ATC I and ATC II). These subfamilies could have arisen from duplication and selective losses of some differentiated copies during the successive speciations. We suggest that Archaea and Eukaryotes share a common ancestor in which the ancestral copies giving the present-day ATC II/OTC β combinations were present, whereas Bacteria comprise two classes: one containing the ATC II/OTC α combination and the other harboring the ATC I/OTC β combination. Moreover, multiple horizontal gene transfers could have occurred rather recently amongst prokaryotes. Whatever the actual history of carbamoyltransferases, our data suggest that the last common ancestor to all extant life possessed differentiated copies of genes coding for both carbamoyltransferases, indicating that it was a rather sophisticated organism.

3.
Alcohol dehydrogenase genes were amplified by PCR, cloned, and sequenced from 11 putative nonhybrid species of the angiosperm genus Paeonia. Sequences of five exons and six intron regions of the Adh gene were used to reconstruct the phylogeny of these species. Two paralogous genes, Adh1A, and Adh2, were found; an additional gene, Adh1B, is also present in section Moutan. Phylogenetic analyses of exon sequences of the Adh genes of Paeonia and a variety of other angiosperms imply that duplication of Adh1 and Adh2 occurred prior to the divergence of Paeonia species and was followed by a duplication resulting in Adh1A and Adh1B. Concerted evolution appears to be absent between these paralogous loci. Phylogenetic analysis of only the Paeonia Adh exon sequences, positioning the root of the tree between the paralogous genes Adh1 and Adh2, suggests that the first evolutionary split within the genus occurred between the shrubby section Moutan and the other two herbaceous sections Oneapia and Paeonia. Restriction of Adh1B genes to section Moutan may have resulted from deletion of Adh1B from the common ancestor of sections Oneapia and Paeonia. A relative-rate test was designed to compare rates of molecular change among lineages based on the divergence of paralogous genes, and the results indicate a slower rate of evolution within the shrubby section Moutan than in section Oneapia. This may be responsible for the relatively long branch length of section Oneapia and the short branch length between section Moutan and the other two sections found on the Adh, ITS (nrDNA), and matK (cpDNA) phylogenies of the genus. Adh1 and Adh2 intron sequences cannot be aligned, and we therefore carried out separate analyses of Adh1A and Adh2 genes using exon and intron sequences together. The Templeton test suggested that there is not significant incongruence among Adh1A, ITS, and matK data sets, but that these three data sets conflict significantly with Adh2 sequence data. 
A combined analysis of Adh1A, ITS, and matK sequences produced a tree that is better resolved than that of any individual gene, and congruent with morphology and the results of artificial hybridization. It is therefore considered to be the current best estimate of the species phylogeny. Paraphyly of section Paeonia in the Adh2 gene tree may be caused by longer coalescence times and random sorting of ancestral alleles.

4.
Proton pumping ATPases are found in all groups of present day organisms. The F-ATPases of eubacteria, mitochondria and chloroplasts also function as ATP synthases, i.e., they catalyze the final step that transforms the energy available from reduction/oxidation reactions (e.g., in photosynthesis) into ATP, the usual energy currency of modern cells. The primary structure of these ATPases/ATP synthases was found to be much more conserved between different groups of bacteria than other parts of the photosynthetic machinery, e.g., reaction center proteins and redox carrier complexes. These F-ATPases and the vacuolar type ATPase, which is found on many of the endomembranes of eukaryotic cells, were shown to be homologous to each other; i.e., these two groups of ATPases evolved from the same enzyme present in the common ancestor. (The term eubacteria is used here to denote the phylogenetic group containing all bacteria except the archaebacteria.) Sequences obtained for the plasmamembrane ATPase of various archaebacteria revealed that this ATPase is much more similar to the eukaryotic than to the eubacterial counterpart. The eukaryotic cell of higher organisms evolved from a symbiosis between eubacteria (that evolved into mitochondria and chloroplasts) and a host organism. Using the vacuolar type ATPase as a molecular marker for the cytoplasmic component of the eukaryotic cell reveals that this host organism was a close relative of the archaebacteria. A unique feature of the evolution of the ATPases is the presence of a non-catalytic subunit that is paralogous to the catalytic subunit, i.e., the two types of subunits evolved from a common ancestral gene.
Since the gene duplication that gave rise to these two types of subunits had already occurred in the last common ancestor of all living organisms, this non-catalytic subunit can be used to root the tree of life by means of an outgroup; that is, the last common ancestor of the major domains of living organisms (archaebacteria, eubacteria and eukaryotes) can be located in the tree of life without assuming constant or equal rates of change in the different branches. A correlation between structure and function of ATPases has been established for present day organisms. Implications resulting from this correlation for biochemical pathways, especially photosynthesis, that were operative in the last common ancestor and preceding life forms are discussed.

5.
The appearance of the vertebrates demarcates some of the most far-reaching changes of structure and function seen during the evolution of the metazoans. These drastic changes of body plan and expansion of the central nervous system among other organs coincide with increased gene numbers. The presence of several groups of paralogous chromosomal regions in the human genome is a reflection of this increase. The simplest explanation for the existence of these paralogies would be two genome doublings with subsequent silencing of many genes. It is argued that gene localization data and the delineation of paralogous chromosomal regions give more reliable information about these types of events than dendrograms of gene families as gene relationships are often obscured by uneven replacement rates as well as other factors. Furthermore, the topographical relations of some paralogy groups are discussed.

6.
Determining the most appropriate way to represent the relationships between bacterial isolates is complicated by the differing rates of recombination within species. In many cases, a bifurcating tree can be positively misleading. The recently described program eBURST can be used with multilocus data to define groups or clonal complexes of related isolates derived from a common ancestor, the patterns of descent linking them together, and the ancestral genotype. eBURST has recently been extensively updated to include additional tools for exploring the relationships between isolates. We discuss the advantages of this approach and describe its use to explore patterns of descent within clonal complexes identified using multilocus sequence typing.

7.
The proliferation of gene data from multiple loci of large multigene families has been greatly facilitated by considerable recent advances in sequence generation. The evolution of such gene families, which often undergo complex histories and different rates of change, combined with increases in sequence data, poses complex problems for traditional phylogenetic analyses, and in particular, those that aim to successfully recover species relationships from gene trees. Here, we implement gene tree parsimony analyses on multicopy gene family data sets of snake venom proteins for two separate groups of taxa, incorporating Bayesian posterior distributions as a rigorous strategy to account for the uncertainty present in gene trees. Gene tree parsimony largely failed to infer species trees congruent with each other or with species phylogenies derived from mitochondrial and single-copy nuclear sequences. Analysis of four toxin gene families from a large expressed sequence tag data set from the viper genus Echis failed to produce a consistent topology, and reanalysis of a previously published gene tree parsimony data set, from the family Elapidae, suggested that species tree topologies were predominantly unsupported. We suggest that gene tree parsimony failure in the family Elapidae is likely the result of unequal and/or incomplete sampling of paralogous genes and demonstrate that multiple parallel gene losses are likely responsible for the significant species tree conflict observed in the genus Echis. These results highlight the potential for gene tree parsimony analyses to be undermined by rapidly evolving multilocus gene families under strong natural selection.

8.
Mounting evidence indicates the presence of a near complete biological nitrogen cycle in redox-stratified oceans during the late Archean to early Proterozoic (c. 2.5-2.0 Ga). It has been suggested that the iron (Fe)- or vanadium (V)-dependent nitrogenase rather than molybdenum (Mo)-dependent form was responsible for dinitrogen fixation during this time because oceans were depleted in Mo and rich in Fe. We evaluated this hypothesis by examining the phylogenetic relationships of proteins that are required for the biosynthesis of the active site cofactor of Mo-nitrogenase in relation to structural proteins required for Fe-, V- and Mo-nitrogenase. The results are highly suggestive that among extant nitrogen-fixing organisms for which genomic information exists, Mo-nitrogenase is unlikely to have been associated with the Last Universal Common Ancestor. Rather, the origin of Mo-nitrogenase can be traced to an ancestor of the anaerobic and hydrogenotrophic methanogens with acquisition in the bacterial domain via lateral gene transfer involving an anaerobic member of the Firmicutes. A comparison of substitution rates estimated for proteins required for the biosynthesis of the nitrogenase active site cofactor and for a set of paralogous proteins required for the biosynthesis of bacteriochlorophyll suggests that Nif emerged from a nitrogenase-like ancestor approximately 1.5-2.2 Ga. An origin and ensuing proliferation of Mo-nitrogenase under anoxic conditions would likely have occurred in an environment where anaerobic methanogens and Firmicutes coexisted and where Mo was at least episodically available, such as in a redox-stratified Proterozoic ocean basin.

9.
Phylogenetic trees of the influenza virus hemagglutinin, neuraminidase, and NS genes were constructed. Assuming that synonymous replacements are neutral and that their rates are constant within each tree, the dates of ancestral branch points were calculated, and the rates of fixation of synonymous (Ks) and nonsynonymous (Kns) replacements were estimated. The epidemic branches were mostly shown to be "deadlocks", non-epidemic ones being internal or "roots." The ratios of the numbers of synonymous to nonsynonymous replacements (vs/vns) were correspondingly 1.32+/-0.42 and 4.78+/-1.28 for all trees, the difference being significant. It was shown that the dated branch points for hemagglutinins are non-randomly clustered around the initial points of the main genetic shifts of the A-type virus, corresponding to the influenza pandemics. It seems that these ancestral forms of the virus behave similarly to the "train" of these shifts, reproducing together with the pandemic forms under conditions of decreased immune resistance of the host population. The rates of fixation of nonsynonymous replacements in the epidemic branches of this tree are increased 4-fold compared with the non-epidemic ones.

10.
11.
Paralogous sequences of the RPB2 gene are demonstrated in the angiosperm order Gentianales. Two different copies were found by using different PCR primer pairs targeting a region that corresponds to exons 22-24 in the Arabidopsis RPB2 gene. One of the copies (RPB2-d) lacks introns in this region, whereas the other has introns at locations corresponding to those of green plants previously investigated. When analyzed with other available RPB2 sequences from this region, all 28 RPB2-d sequences obtained from the Gentianales and the four sequences from the Lamiales form a monophyletic group, together with a previously published tomato cDNA sequence. The substitution patterns, relative rates of change, and nucleotide compositions of the two paralogous RPB2 exon regions are similar, and neither of them shows any signs of being a pseudogene. Although multiple copies of similar, paralogous sequences can confound phylogenetic interpretations, the lack of introns in RPB2-d makes a priori homology assessment easy. The phylogenetic utility of RPB2-d within the Gentianales is evaluated in comparison with the chloroplast genes ndhF and rbcL. The hierarchical information in the RPB2-d region sequenced is more incongruent with that of the plastid genes than the plastid genes are with each other as determined by incongruence length difference tests. In contrast to the plastid genes, parsimony-informative third codon positions of RPB2 have a significantly higher rate of change than first and second positions. Topologically, the trees from the three genes are similar, and the differences are usually only weakly supported. In terms of support, RPB2 gives the highest jackknife support per sequenced nucleotide, whereas ndhF gives the highest Bremer support per sequenced nucleotide. The RPB2-d locus has the potential to be a valuable nuclear marker for determination of phylogenetic relationships within the euasterid I group of plants.

12.
In a previous paper (Klotz et al., 1979) we described a method for determining evolutionary trees from sequence data when rates of evolution of the sequences might differ greatly. It was shown theoretically that the method always gave the correct topology and root when the exact number of mutation differences between sequences and from their common ancestor was known. However, the method is impractical to use in most situations because it requires some knowledge of the ancestor. In this present paper we describe another method, related to the previous one, in which a present-day sequence can serve temporarily as an ancestor for purposes of determining the evolutionary tree regardless of the rates of evolution of the sequences involved. This new method can be carried out with high precision without the aid of a computer, and it does not increase in difficulty rapidly as the number of sequences involved in the study increases, unlike other methods.

13.
Cladogenesis, coalescence and the evolution of the three domains of life   Cited by: 3 (self-citations: 0, citations by others: 3)
In this article, we explore the large-scale structure of the tree of life by using a simple model with a constant number of species and rates of speciation that equal the rates of extinction. In addition, we discuss the consequences of horizontal gene transfer for the concept of a most recent common ancestor of all living organisms (cenancestor). A simple null hypothesis based on coalescence theory explains some features of the observed topologies of the tree of life. Simulations of genes and organismal lineages suggest that there was no single common ancestor that contained all the genes ancestral to those shared among the three domains of life. Each contemporary molecule has its own history that traces back to an individual molecular cenancestor. However, these molecular ancestors were likely to be present in different organisms and at different times.
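
The constant-size model in this abstract can be illustrated with a minimal backward-in-time sketch. It is not the authors' simulation: it assumes a Wright-Fisher-style scheme with discrete generations in which every lineage picks its parent uniformly at random, and the function names `generations_to_common_ancestor` and `mean_depth` are hypothetical.

```python
import random


def generations_to_common_ancestor(n_lineages, rng):
    """Count generations back until all lineages share one ancestor."""
    ancestors = set(range(n_lineages))
    generations = 0
    while len(ancestors) > 1:
        # Each surviving lineage picks its parent uniformly at random among
        # the n_lineages slots of the previous generation (constant size).
        ancestors = {rng.randrange(n_lineages) for _ in ancestors}
        generations += 1
    return generations


def mean_depth(n_lineages, replicates, seed=0):
    """Average coalescence depth over independent replicates."""
    rng = random.Random(seed)
    total = sum(
        generations_to_common_ancestor(n_lineages, rng)
        for _ in range(replicates)
    )
    return total / replicates
```

Calling `mean_depth` for a small and a large number of lineages shows the average depth of the common ancestor growing with the size of the system. Running the trace independently for each gene would give each gene its own coalescence time, which is the intuition behind molecular cenancestors living at different times.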

14.
A rooted tree of life provides a framework to answer central questions about the evolution of life. Here we review progress on rooting the tree of life and introduce a new root of life obtained through the analysis of indels, insertions and deletions, found within paralogous gene sets. Through the analysis of indels in eight paralogous gene sets, the root is localized to the branch between the clade consisting of the Actinobacteria and the double-membrane (Gram-negative) prokaryotes and one consisting of the archaebacteria and the firmicutes. This root provides a new perspective on the habitats of early life, including the evolution of methanogenesis, membranes and hyperthermophily, and the speciation of major prokaryotic taxa. Our analyses exclude methanogenesis as a primitive metabolism, in contrast to previous findings. They parsimoniously imply that the ether archaebacterial lipids are not primitive and that the cenancestral prokaryotic population consisted of organisms enclosed by a single, ester-linked lipid membrane, covered by a peptidoglycan layer. These results explain the similarities previously noted by others between the lipid synthesis pathways in eubacteria and archaebacteria. The new root also implies that the last common ancestor was not hyperthermophilic, although moderate thermophily cannot be excluded.

15.
Paralogy is a pervasive problem in trying to use nuclear gene sequences to infer species phylogenies. One strategy for dealing with this problem is to infer species phylogenies from gene trees using reconciled trees, rather than directly from the sequences themselves. In this approach, the optimal species tree is the tree that requires the fewest gene duplications to be invoked. Because reconciled trees can distinguish orthologous from paralogous sequences, there is no need to do this prior to the analysis. Multiple gene trees can be analyzed simultaneously; however, the problem of nonuniform gene sampling raises practical problems which are discussed. In this paper the technique is applied to phylogenies for nine vertebrate genes (aldolase, alpha-fetoprotein, lactate dehydrogenase, prolactin, rhodopsin, trypsinogen, tyrosinase, vasopressin, and Wnt-7). The resulting species tree shows much similarity with currently accepted vertebrate relationships.
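
The duplication-counting criterion in this abstract can be sketched with the standard last-common-ancestor (LCA) mapping between a gene tree and a species tree. This is an illustrative sketch, not the paper's software: trees are assumed to be nested tuples of leaf names, and `clades`, `lca`, `count_duplications`, and the `species_of` mapping are hypothetical names introduced here.

```python
def clades(tree):
    """Return the set of leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(clades(child) for child in tree))


def lca(species_tree, species):
    """Return the smallest subtree of species_tree containing all species."""
    if not isinstance(species_tree, str):
        for child in species_tree:
            if species <= clades(child):
                return lca(child, species)
    return species_tree


def count_duplications(gene_tree, species_tree, species_of):
    """Count gene-tree nodes that map to the same species node as a child."""
    count = 0

    def visit(node):
        nonlocal count
        if isinstance(node, str):
            return frozenset([species_of[node]])
        child_sets = [visit(child) for child in node]
        here = frozenset().union(*child_sets)
        mapped = clades(lca(species_tree, here))
        # A node is a duplication when one of its children maps to the
        # same species-tree node as the node itself.
        if any(clades(lca(species_tree, s)) == mapped for s in child_sets):
            count += 1
        return here

    visit(gene_tree)
    return count


species_of = {"h1": "human", "h2": "human", "m1": "mouse",
              "m2": "mouse", "f1": "fish"}
gene_tree = ((("h1", "m1"), ("h2", "m2")), "f1")
candidate_a = (("human", "mouse"), "fish")
candidate_b = (("human", "fish"), "mouse")
```

Under this sketch, `candidate_a` needs one duplication to explain the gene tree and `candidate_b` needs two, so `candidate_a` would be preferred by the fewest-duplications criterion. With several gene trees, the counts would be summed per candidate species tree.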

16.
M Di Giulio. Gene 2001, 281(1-2): 11-17
By exploiting the correlation between the optimal growth temperature of organisms and a thermophily index based on the propensity of amino acids to enter thermophile/hyperthermophile proteins, an analysis is conducted in order to establish whether the last universal common ancestor (LUCA) was a mesophile or a (hyper)thermophile. This objective is reached by using maximum parsimony and maximum likelihood to reconstruct the ancestral sequences of the LUCA for two pairs of sets of paralogous protein sequences by means of the phylogenetic tree topology derived from the small subunit ribosomal RNA, even if this is rooted in all three possible ways. The thermophily index of all the reconstructed ancestral sequences of the LUCA belongs to the set of the thermophile/hyperthermophile sequences, thus supporting the hypotheses that see the LUCA as a thermophile or a hyperthermophile.

17.
Liò P, Vannucci M. Gene 2003, 317(1-2): 29-37
Chemokine receptors represent a prime target for the development of novel therapeutic strategies in a variety of disease processes, including inflammation, allergy and neoplasia. Here we use maximum likelihood methods and bootstrap methods to investigate both the phylogenetic relationships in a large set of human chemokine receptor sequences and the relationships between chemokine receptors and their nearest neighbors. We found that CCR and CXCR families are not homogeneous. We also provide evidence that angiotensin receptors are the closest neighbors. Other close neighbors include opioid, somatostatin and melanin-concentrating hormone receptors. The phylogenetic analysis suggests ancient paralogous relationships and establishes a link between immune, metabolic and neural systems modulation. We complement our findings with a structural analysis based on wavelet methods of the major branches of chemokine receptor phylogeny. We hypothesize that receptors very close in the tree can form heterodimers. Our analyses reveal different characteristics of amino acid hydrophobicity and volume propensity in the different subfamilies. We also found that the second extra-cytoplasmic loop has higher rates of evolution than the internal loops and transmembrane segments, suggesting that selection, shifting, reassignments and broadening of receptor binding specificities involve mainly this loop.

18.
Gene duplication is a crucial mechanism of evolutionary innovation. A substantial fraction of eukaryotic genomes consists of paralogous gene families. We assess the extent of ancestral paralogy, which dates back to the last common ancestor of all eukaryotes, and examine the origins of the ancestral paralogs and their potential roles in the emergence of the eukaryotic cell complexity. A parsimonious reconstruction of ancestral gene repertoires shows that 4137 orthologous gene sets in the last eukaryotic common ancestor (LECA) map back to 2150 orthologous sets in the hypothetical first eukaryotic common ancestor (FECA) [paralogy quotient (PQ) of 1.92]. Analogous reconstructions show significantly lower levels of paralogy in prokaryotes, 1.19 for archaea and 1.25 for bacteria. The only functional class of eukaryotic proteins with a significant excess of paralogous clusters over the mean includes molecular chaperones and proteins with related functions. Almost all genes in this category underwent multiple duplications during early eukaryotic evolution. In structural terms, the most prominent sets of paralogs are superstructure-forming proteins with repetitive domains, such as WD-40 and TPR. In addition to the true ancestral paralogs which evolved via duplication at the onset of eukaryotic evolution, numerous pseudoparalogs were detected, i.e. homologous genes that apparently were acquired by early eukaryotes via different routes, including horizontal gene transfer (HGT) from diverse bacteria. The results of this study demonstrate a major increase in the level of gene paralogy as a hallmark of the early evolution of eukaryotes.
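
The paralogy quotient quoted in this abstract is consistent with a simple ratio of descendant to ancestral orthologous sets. The snippet below is a worked check under that assumption; the function name `paralogy_quotient` is hypothetical, and the abstract does not spell out the formula.

```python
def paralogy_quotient(descendant_sets, ancestral_sets):
    """Ratio of orthologous sets in the descendant to sets in the ancestor."""
    return descendant_sets / ancestral_sets


# Counts taken from the abstract: 4137 sets in LECA, 2150 sets in FECA.
pq_eukaryotes = round(paralogy_quotient(4137, 2150), 2)
print(pq_eukaryotes)  # 1.92
```

A quotient near 1 would mean almost no duplication between the two ancestors, which is how the lower prokaryotic values (1.19 and 1.25) can be read.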

19.
The pairs of nitrogen fixation genes nifDK and nifEN encode the α and β subunits of nitrogenase and the two subunits of the NifNE protein complex, involved in the biosynthesis of the FeMo cofactor, respectively. Comparative analysis of the amino acid sequences of the four proteins NifD, NifK, NifE, and NifN in several archaeal and bacterial diazotrophs showed extensive sequence similarity between them, suggesting that their encoding genes constitute a novel paralogous gene family. We propose a two-step model to reconstruct the possible evolutionary history of the four genes. Accordingly, an ancestor gene gave rise, by an in-tandem paralogous duplication event followed by divergence, to an ancestral bicistronic operon; the latter, in turn, underwent a paralogous operon duplication event followed by evolutionary divergence leading to the ancestors of the present-day nifDK and nifEN operons. Both these paralogous duplication events very likely predated the appearance of the last universal common ancestor. The possible role of the ancestral gene and operon in nitrogen fixation is also discussed. Received: 21 June 1999 / Accepted: 1 March 2000

20.
The complete sequence of Vitis vinifera revealed that the rosid clade derives from a hexaploid ancestor. At present, no analysis of complete genome sequence is available for an asterid, the other large eudicot clade, which includes the economically important species potato, tomato and coffee. To elucidate the genomic history of asterids, we compared the sequence of an 800 kb region of the diploid Coffea genome to the orthologous regions of V. vinifera, Populus trichocarpa and Arabidopsis thaliana. We found a very high level of collinearity between around 80 genes of the three rosid species and Coffea. Collinearity comparisons between orthologous and paralogous regions indicate that (1) Coffea (and consequently all asterids) and rosids share the same hexaploid ancestor; (2) the diploidization process (loss of duplicated and redundant copies from the whole genome duplication) was very advanced in the most recent common ancestor of rosids and asterids. Finally, no additional polyploidization events were detected in the Coffea lineage. Differences in gene loss rates were detected among the three rosid species and linked to the divergence in protein sequences.
