1.
Background
Gene duplication and gene loss during the evolution of eukaryotes have hindered attempts to estimate phylogenies and divergence times of species. Although current methods that identify clusters of orthologous genes in complete genomes have helped to investigate gene function and gene content, they have not been optimized for evolutionary sequence analyses requiring strict orthology and complete gene matrices. Here we adopt a relatively simple and fast genome comparison approach designed to assemble orthologs for evolutionary analysis. Our approach identifies single-copy genes representing only species divergences (panorthologs) in order to minimize potential errors caused by gene duplication. We apply this approach to complete sets of proteins from published eukaryote genomes specifically for phylogeny and time estimation.
2.
Liu Q 《Bio Systems》2005,81(3):281-289
Using full-length cDNA sequences, a comparative analysis of sequence patterns around the stop codons in six eukaryotes was performed. It was shown that the codons immediately before and after the stop codon (defined as the -1 codon and +1 codon, respectively) were much more biased than other examined positions, especially at the second position of -1 codons and the first position of +1 codons, which were rich in As/Us and purines, respectively, for most species. The author speculated that the strongly biased sequence pattern from position -2 to +4 might act as an extended translation termination signal. Translation termination is catalyzed by release factors that recognize the stop codons. A multiple amino acid sequence alignment of eukaryotic release factor 1 (eRF1) from 20 species showed that 16 residue sites were strictly conserved, notably the invariant amino acids Ile70 and Lys71. Accordingly, it could be inferred that these candidate amino acids might be involved in the recognition process. A possible stop signal recognition hypothesis was also discussed.
3.
Masayuki Horie Yuki Kobayashi Yoshiyuki Suzuki Keizo Tomonaga 《Philosophical transactions of the Royal Society of London. Series B, Biological sciences》2013,368(1626)
Bornaviruses are the only animal RNA viruses that establish a persistent infection in their host cell nucleus. Studies of bornaviruses have provided unique information about viral replication strategies and virus–host interactions. Although bornaviruses do not integrate into the host genome during their replication cycle, we and others have recently reported that there are DNA sequences derived from the mRNAs of ancient bornaviruses in the genomes of vertebrates, including humans, and these have been designated endogenous borna-like (EBL) elements. Therefore, bornaviruses have been interacting with their hosts as driving forces in the evolution of host genomes in a previously unexpected way. Studies of EBL elements have provided new models for virology, evolutionary biology and general cell biology. In this review, we summarize the data on EBL elements, including those we have newly identified in eukaryotic genomes, and discuss the biological significance of EBL elements, with a focus on EBL nucleoprotein elements in mammalian genomes. Surprisingly, EBL elements were detected in the genomes of invertebrates, suggesting that the host range of bornaviruses may be much wider than previously thought. We also review our new data on non-retroviral integration of Borna disease virus.
4.
To study possible variation in codon usage and base composition among bacteriophages, fourteen mycobacteriophages were used as a model system, and both parameters were determined and compared in all of these phages and in their plating bacterium, M. smegmatis. As all the organisms are GC-rich, the GC content at third codon positions was found to be higher than at the second codon positions, as well as at the first + second codon positions, in all the organisms, indicating that directional mutational pressure operates strongly at the synonymous third codon positions. The Nc plot indicates that codon usage variation in all these organisms is governed by forces other than compositional constraints. Correspondence analysis suggests that: (i) codon usage varies among the genes and genomes of the fourteen mycobacteriophages and M. smegmatis, i.e., codon usage patterns in the mycobacteriophages are phage-specific rather than M. smegmatis-specific; (ii) the synonymous codon usage patterns of Barnyard, Che8, Che9d, and Omega are more similar to one another than those of the remaining mycobacteriophages and M. smegmatis; (iii) codon usage bias in the mycobacteriophages is mainly determined by mutational pressure; and (iv) the genes of comparatively GC-rich genomes are more biased than those of GC-poor genomes. Translational selection in determining codon usage variation in highly expressed genes can be inferred from the predominance of C-ending codons in those genes. Cluster analysis based on codon usage data also shows that the fourteen mycobacteriophages form two distinct branches and that codon usage varies even among the phages of each branch.
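The comparison of GC content across codon positions described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `gc_by_codon_position` and the example sequence are hypothetical, and the input is assumed to be an in-frame coding sequence without ambiguity codes.

```python
def gc_by_codon_position(cds: str) -> tuple:
    """Return the GC fraction at codon positions 1, 2 and 3.

    Assumes `cds` is an in-frame coding sequence (hypothetical input);
    trailing bases that do not complete a codon are ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # drop any incomplete final codon
    if usable == 0:
        raise ValueError("sequence must contain at least one full codon")
    fractions = []
    for offset in range(3):
        # Every third base, starting at this codon position.
        bases = seq[offset:usable:3]
        gc = sum(base in "GC" for base in bases)
        fractions.append(gc / len(bases))
    return tuple(fractions)


# Illustrative sequence only (codons ATG GCC GCG), not data from the study.
gc1, gc2, gc3 = gc_by_codon_position("ATGGCCGCG")
```

A third-position value (GC3) higher than the first- and second-position values is the pattern the abstract attributes to directional mutational pressure.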
5.
6.
Complete eukaryote chromosomes were investigated for intrachromosomal duplications of nucleotide sequences. The analysis was performed by looking for nonexact repeats on two complete genomes, Saccharomyces cerevisiae and Caenorhabditis elegans, and four partial ones, Drosophila melanogaster, Plasmodium falciparum, Arabidopsis thaliana, and Homo sapiens. Through this analysis, we show that all eukaryote chromosomes exhibit similar characteristics for their intrachromosomal repeats, suggesting similar dynamics: many direct repeats have their two copies physically close together, and these close direct repeats are more similar and shorter than the other repeats. On the contrary, there are almost no close inverted repeats. These results support a model for the dynamics of duplication. This model is based on a continuous genesis of tandem repeats and implies that most of the distant and inverted repeats originate from these tandem repeats by further chromosomal rearrangements (insertions, inversions, and deletions). Remnants of these predicted rearrangements have been brought out through fine analysis of the chromosome sequence. Despite these dynamics, shared by all eukaryotes, each genome exhibits its own style of intrachromosomal duplication: the density of repeated elements is similar in all chromosomes from the same genome, but differs between species. This density was further related to the relative rates of duplication, deletion, and mutation specific to each species. Notably, the density of repeats in the X chromosome of C. elegans is much lower than in the autosomes of that organism, suggesting that the exchange between homologous chromosomes is important in the duplication process.
7.
E.V. Soldatenko 《Invertebrate reproduction & development.》2013,57(3):224-236
Terminal portions of the male copulatory apparatus of Planorbis planorbis, Segmentina oelandica, and Anisus vortex were studied using whole-mount preparations, serial semi-thin sections, and transmission electron microscopy. In the latter species, stylet formation was investigated at several stages of postembryonic development. The organization of the distal portion of the penis varies greatly among the species studied. In P. planorbis, the distal end of the penis lacks developed papillae and is armed with a stylet built up of the covering epithelial cells of the penis proper. In A. vortex, the stylet is formed by the secretory activity of the middle cells of the distal portion of the penis. By the time of maturation, the cells encompassing the stylet have broken down, exposing its solid chitinous structure and characteristic shape. In S. oelandica, the distal end of the penis bears a long, probably flexible papilla with the characteristics of an internal ‘skeleton,’ organized as a line of connective tissue cells and a system of hydrocoelic cavities.
8.
9.
10.
Understanding regulatory mechanisms of protein synthesis in eukaryotes is essential for the accurate annotation of genome sequences. Kozak reported that the nucleotide sequence GCCGCC(A/G)CCAUGG (AUG is the initiation codon) was frequently observed in vertebrate genes and that this 'consensus' sequence enhanced translation initiation. However, later studies using invertebrate, fungal and plant genes reported different 'consensus' sequences. In this study, we conducted extensive comparative analyses of nucleotide sequences around the initiation codon by using genomic data from 47 eukaryote species including animals, fungi, plants and protists. The analyses revealed that preferred nucleotide sequences are quite diverse among different species, but differences between patterns of nucleotide bias roughly reflect the evolutionary relationships of the species. We also found strong biases of A/G at position -3, A/C at position -2 and C at position +5 that were commonly observed in all species examined. Genes with higher expression levels showed stronger signals, suggesting that these nucleotides are responsible for the regulation of translation initiation. The diversity of preferred nucleotide sequences around the initiation codon might be explained by differences in relative contributions from two distinct patterns, GCCGCCAUG and AAAAAAAUG, which implies the presence of multiple molecular mechanisms for controlling translation initiation.
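Position-specific nucleotide bias around the initiation codon, as discussed in this abstract, can be tabulated roughly as follows. This is a sketch under stated assumptions rather than the study's method: the function name `position_frequencies` and the three example contexts are hypothetical, every input string is assumed to place the A of AUG at the same index, and positions follow the usual convention in which that A is +1 and the preceding base is -1 (there is no position 0).

```python
from collections import Counter


def position_frequencies(contexts, start_index, position):
    """Return base frequencies at `position` relative to the start codon.

    `contexts` is a list of equal-layout sequences whose start-codon A sits
    at `start_index` (an assumption of this sketch). The A of AUG is +1 and
    the base before it is -1; position 0 does not exist.
    """
    if position == 0:
        raise ValueError("there is no position 0 in this numbering")
    # Convert the biological position to an offset from the A of AUG.
    offset = position - 1 if position > 0 else position
    counts = Counter(seq[start_index + offset].upper() for seq in contexts)
    total = sum(counts.values())
    return {base: n / total for base, n in counts.items()}


# Illustrative contexts only; the A of AUG is at index 6 in each string.
examples = ["GCCACCAUGG", "AAAAAAAUGG", "GCCGCCAUGG"]
minus3 = position_frequencies(examples, start_index=6, position=-3)
plus4 = position_frequencies(examples, start_index=6, position=4)
```

With real data, one such table per position would be compared across species or across expression classes.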
11.
We study short-range correlations in DNA sequences with methods from information theory and statistics. We find a persisting degree of identity between the correlation patterns of different chromosomes of a species. Except for the case of human and chimpanzee, inter-species differences in this correlation pattern allow robust species distinction: in a clustering tree based on the correlation curves at the level of individual chromosomes, distinct clusters are found for the individual species. This capacity to distinguish species persists even when the length of the underlying sequences is drastically reduced. Compared with the standard tool for studying symbol correlations in DNA sequences, namely the mutual information function, we find that an autoregressive model for higher-order Markov processes significantly improves species distinction due to an implicit subtraction of random background.
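The mutual information function that this abstract names as the standard tool can be sketched as below. This shows only that baseline measure, not the autoregressive model the authors propose; the function name `mutual_information` is hypothetical, and the estimate is a plain plug-in one with no correction for finite sequence length.

```python
import math
from collections import Counter


def mutual_information(seq: str, k: int) -> float:
    """Plug-in estimate (in bits) of mutual information between symbols
    separated by distance k in `seq`.

    Marginals are taken from the same set of pairs, so the estimate is
    non-negative. No finite-size bias correction is applied.
    """
    seq = seq.upper()
    n = len(seq) - k  # number of symbol pairs at distance k
    if k < 1 or n < 1:
        raise ValueError("k must be >= 1 and smaller than the sequence length")
    pairs = Counter((seq[i], seq[i + k]) for i in range(n))
    left = Counter(seq[i] for i in range(n))
    right = Counter(seq[i + k] for i in range(n))
    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((left[a] / n) * (right[b] / n)))
    return mi


# A strictly alternating toy sequence: each base fully determines its neighbour.
alternating = mutual_information("ACACACACA", 1)
```

Evaluating the function over a range of k gives a correlation curve per chromosome, which is the kind of profile the abstract describes clustering.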
12.
13.
Heizer EM Raiford DW Raymer ML Doom TE Miller RV Krane DE 《Molecular biology and evolution》2006,23(9):1670-1680
For most prokaryotic organisms, amino acid biosynthesis represents a significant portion of their overall energy budget. The difference in the cost of synthesis between amino acids can be striking, differing by as much as 7-fold. Two prokaryotic organisms, Escherichia coli and Bacillus subtilis, have been shown to preferentially utilize less costly amino acids in highly expressed genes, indicating that parsimony in amino acid selection may confer a selective advantage for prokaryotes. This study confirms those findings and extends them to 4 additional prokaryotic organisms: Chlamydia trachomatis, Chlamydophila pneumoniae AR39, Synechocystis sp. PCC 6803, and Thermus thermophilus HB27. Adherence to codon-usage biases for each of these 6 organisms is inversely correlated with a coding region's average amino acid biosynthetic cost in a fashion that is independent of chemoheterotrophic, photoautotrophic, or thermophilic lifestyle. The obligate parasites C. trachomatis and C. pneumoniae AR39 are incapable of synthesizing many of the 20 common amino acids. Removing auxotrophic amino acids from consideration in these organisms does not alter the overall trend of preferential use of energetically inexpensive amino acids in highly expressed genes.
14.
I V Babkin T S Nepomniashchikh R A Maksiutov V V Gutorov I N Babkina S N Shchelkunov 《Molekuliarnaia biologiia》2008,42(4):612-624
Nucleotide sequences of two extended segments of the terminal variable regions in the variola virus genome were determined. The size of the left segment was 13.5 kbp and that of the right, 10.5 kbp. In total, over 540 kbp were sequenced for 22 variola virus strains. The phylogenetic analysis conducted, together with previously published data, allowed us to establish the interrelations between 70 variola virus isolates, the character of their clustering, and the degree of intergroup and intragroup variation among the clusters of variola virus strains. The most polymorphic loci of the genome segments studied were determined. It was demonstrated that these loci are localized either to noncoding genome regions or to regions of disrupted open reading frames characteristic of the ancestor virus. These loci are promising for the development of a strategy for genotyping variola virus strains. Analysis of recombination using various methods demonstrated that, with a single exception, no statistically significant recombination events were detectable in the genomes of the variola virus strains studied.
15.
MOTIVATION: Detecting genes in viral genomes is a complex task. Because their genomes are necessarily constrained in length, RNA viruses in particular tend to code in overlapping reading frames. Since one amino acid is encoded by a triplet of nucleotides, up to three genes may be coded for simultaneously in one direction. Conventional hidden Markov model (HMM)-based gene-finding algorithms typically find it difficult to identify multiple coding regions, since in general their topologies do not allow for the presence of overlapping or nested genes. Comparative methods have therefore been restricted to likelihood ratio tests on whether potential regions are double or single coding, using the fact that the constraints imposed on multiple-coding nucleotides will result in atypical sequence evolution. Exploiting these same constraints, we present an HMM-based gene-finding program, which allows for coding in unidirectional nested and overlapping reading frames, to annotate two homologous aligned viral genomes. Our method does not insist on conserved gene structure between the two sequences, thus making it applicable for the pairwise comparison of more distantly related sequences. RESULTS: We apply our method to 15 pairwise alignments of six different HIV2 genomes. Given sufficient evolutionary distance between the two sequences, we achieve sensitivity of approximately 84-89% and specificity of approximately 97-99.9%. We additionally annotate three pairwise alignments of the more distantly related HIV1 and HIV2, as well as of two different hepatitis viruses, attaining results of approximately 87% sensitivity and approximately 98.5% specificity. We subsequently incorporate prior knowledge by 'knowing' the gene structure of one sequence and annotating the other conditional on it. Boosting accuracy close to perfect, we demonstrate that conservation of gene structure on top of nucleotide sequence is a valuable source of information, especially in distantly related genomes. AVAILABILITY: The Java code is available from the authors.
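The point that one strand can carry up to three overlapping or nested reading frames can be made concrete with a naive scan. This is only an illustration of overlapping frames, not the HMM-based method of the abstract: the function name `forward_orfs`, the `min_codons` threshold, and the toy sequence are all hypothetical, and the scan uses the standard DNA stop codons with ATG as the only start.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def forward_orfs(seq: str, min_codons: int = 2):
    """List (frame, start, end) for ATG..stop ORFs in the three forward frames.

    `start` is the index of the A of ATG; `end` is exclusive and includes the
    stop codon. ORFs with fewer than `min_codons` codons before the stop are
    skipped. A naive scan for illustration, not a gene finder.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens an ORF in this frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None  # look for the next ORF in this frame
    return orfs


# Toy sequence: a frame-1 ORF (4..13) nested inside a frame-0 ORF (0..18).
nested = forward_orfs("ATGAATGCCCTAAGGTGA")
```

A single-track gene model can label positions 4 to 13 of this toy sequence with only one of the two ORFs, which is the limitation the abstract raises for conventional HMM topologies.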
16.
Transposable elements (TEs) are indwelling components of genomes, and their dynamics have been a driving force in genome evolution. Although we now have more information concerning their amounts and characteristics in various organisms, we still have little data from overall comparisons of their sequences in very closely-related species. While the Drosophila melanogaster genome has been extensively studied, we have only limited knowledge regarding the precise TE sequences in the genomes of the related species Drosophila simulans, Drosophila sechellia and Drosophila yakuba. In this study we analyzed the number and structure of TE copies in the sequenced genomes of these four species. Our findings show that, unexpectedly, the number of TE insertions in D. simulans is greater than that in D. melanogaster, but that most of the copies in D. simulans are degraded and in small fragments, as in D. sechellia and D. yakuba. This suggests that all three species were invaded by numerous TEs a long time ago, but have since regulated their activity, as the present TE copies are degraded, with very few full-length elements. In contrast, in D. melanogaster, a recent activation of TEs has resulted in a large number of almost-identical TE copies. We have detected variants of some TEs in D. simulans and D. sechellia, that are almost identical to the reference TE sequences in D. melanogaster, suggesting that D. melanogaster has recently been invaded by active TE variants from the other species. Our results indicate that the three species D. simulans, D. sechellia, and D. yakuba seem to be at a different stage of their TE life cycle when compared to D. melanogaster. Moreover, we show that D. melanogaster has been invaded by active TE variants for several TE families likely to come from D. simulans or the ancestor of D. simulans and D. sechellia. The numerous horizontal transfer events implied to explain these results could indicate introgression events between these species.
17.
Pseudogenes are important resources in evolutionary and comparative genomics because they provide molecular records of the ancient genes that existed in the genome millions of years ago. We have systematically identified approximately 5000 processed pseudogenes in the mouse genome, and estimated that approximately 60% are lineage specific, created after the mouse and human diverged. In both mouse and human genomes, similar types of genes give rise to many processed pseudogenes. These tend to be housekeeping genes, which are highly expressed in the germ line. Ribosomal-protein genes, in particular, form the largest sub-group. The processed pseudogenes in the mouse occur with a distinctly different chromosomal distribution than LINEs or SINEs - preferentially in GC-poor regions. Finally, the age distribution of mouse-processed pseudogenes closely resembles that of LINEs, in contrast to human, where the age distribution closely follows Alus (SINEs).
18.
19.
The genomes of several strains of feline leukemia virus (FeLV) were compared by two-dimensional polyacrylamide gel electrophoresis of the large RNase T1-resistant oligonucleotides of the 70S RNA. Differences between each strain of FeLV tested were detected by this method. We estimate that the degree of sequence identity between the viruses is: FeLV A (Glasgow-1) to FeLV B (Snyder-Theilen), 52%; FeLV A (Glasgow-1) to FeLV C (Sarma), 66%; FeLV B (Snyder-Theilen) to FeLV C (Sarma), 37%. The fingerprints of two independent isolates of FeLV strains of subgroup A (Glasgow-1 and Rickard) were detectably different. We conclude that the RNase T1 oligonucleotide fingerprint pattern provides a useful tool for identification of FeLV strains.