Found 20 similar articles (search time: 0 ms)
1.
2.
Probabilistic models of sequence evolution are in widespread use in phylogenetics and molecular sequence evolution. These models have become increasingly sophisticated and, combined with statistical model comparison techniques, have helped to shed light on how genes and proteins evolve. Models of codon evolution have been particularly useful because, in addition to providing a significant improvement in model realism for protein-coding sequences, codon models can also be designed to test hypotheses about the selective pressures that shape the evolution of the sequences. Such models typically assume a phylogeny and can be used to identify sites or lineages that have evolved adaptively. Recently some of the key assumptions that underlie phylogenetic tests of selection have been questioned, such as the assumption that the rate of synonymous changes is constant across sites or that a single phylogenetic tree can be assumed at all sites for recombining sequences. While some of these issues have been addressed through the development of novel methods, others remain as caveats that need to be considered on a case-by-case basis. Here, we outline the theory of codon models and their application to the detection of positive selection. We review some of the more recent developments that have improved their power and utility, laying a foundation for further advances in the modeling of coding sequence evolution.
3.
A software program combining sequence motif searches with keywords for finding repeats containing DNA sequences (cited by: 3; self-citations: 0; citations by others: 3)
MOTIVATION: One of the most interesting features of genomes (both coding and non-coding regions) is the presence of relatively short tandemly repeated DNA sequences known as tandem repeats (TRs). We developed a new PC-based stand-alone software analysis program that combines sequence motif searches with keywords such as organs, tissues, cell lines or development stages to find exact, inexact and compound TRs. Tandem Repeats Analyzer 1.5 (TRA) offers several advanced repeat search parameters/options over other repeat finder programs: it not only accepts GenBank, FASTA and expressed sequence tag (EST) sequence files but also analyzes multiple files containing multiple sequences. Advanced user-defined parameters/options let researchers apply different search criteria to motifs of different lengths simultaneously. The outputs show statistical results to be evaluated by the user. The discovery of TRs in ESTs could be useful both for gene mapping and association studies and for discovering TRs located in coding regions of important genes that are expressed under various conditions of environment, stress, organ, tissue and development stage. RESULTS: In this paper, we demonstrated applications of TRA using 175,899 EST sequences from three Arabidopsis species downloaded from GenBank. The EST-SSR/EST ratios were found to be 43.1%, 15.3% and 2.34% in A. lyrata, A. thaliana and A. halleri, respectively. Analysis revealed that organs, tissues and development stages possessed different amounts of repeats and repeat compositions. This indicated that the distribution of TRs among tissues or organs may not be random, in contrast to the untranscribed repeats found in genomes. AVAILABILITY: The program can be obtained free by anonymous FTP from ftp.akdeniz.edu.tr/Araclar/TRA.
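To make the notion of an exact tandem repeat concrete, the following is a minimal Python sketch that scans a sequence with a regular-expression backreference. It is not TRA's algorithm, which the abstract does not describe. The function name `find_exact_tandem_repeats` and its parameters (`min_unit`, `max_unit`, `min_copies`) are hypothetical and chosen only for illustration, and inexact and compound repeats are not handled.

```python
import re


def find_exact_tandem_repeats(seq, min_unit=1, max_unit=6, min_copies=3):
    """Return a list of (start, unit, copies) for exact tandem repeats in seq.

    Illustrative only: the names and default thresholds are assumptions,
    not parameters of the TRA program.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # A motif of unit_len bases followed by at least (min_copies - 1) copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip motifs that are themselves repeats of a shorter motif
            # (e.g. "AA" is already reported as copies of "A").
            if any(
                unit == unit[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // unit_len))
    return hits
```

For example, `find_exact_tandem_repeats("GGCACACACATT")` reports one dinucleotide repeat, the motif CA starting at index 2 with four copies.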
4.
Molecular evolution of chloroplast DNA sequences (cited by: 12; self-citations: 1; citations by others: 12)
Comparative data on the evolution of chloroplast genes are reviewed. The
chloroplast genome has maintained a similar structural organization over
most plant taxa so far examined. Comparisons of nucleotide sequence
divergence among chloroplast genes reveal marked similarity across the
plant kingdom and beyond to the cyanobacteria (blue-green algae). Estimates
of rates of nucleotide substitution indicate a synonymous rate of 1.1 ×
10⁻⁹ substitutions per site per year. Noncoding regions also appear to be
constrained in their evolution, although addition/deletion events are
common. There have also been evolutionary changes in the distribution of
introns in chloroplast encoded genes. Relative to mammalian mitochondrial
DNA, the chloroplast genome evolves at a conservative rate.
5.
Thorne JL 《Current opinion in genetics & development》2000,10(6):602-605
Homologous sequences are correlated due to their common ancestry. Probabilistic models of sequence evolution are employed routinely to properly account for these phylogenetic correlations. These increasingly realistic models provide a basis for studying evolution and for exploiting it to better understand protein structure and function. Notable recent advances have been made in the treatment of insertion and deletion events, the estimation of amino-acid replacement rates, and the detection of positive selection.
6.
Statistical studies of gene populations on the purine/pyrimidine alphabet have shown that the mean occurrence probability of the i-motif YRY(N)iYRY (R = purine, Y = pyrimidine, N = R or Y) is not uniform as i varies in the range [1,99], but presents a maximum at i = 6 in the following populations: protein coding genes of eukaryotes, prokaryotes, chloroplasts and mitochondria, and also viral introns, ribosomal RNA genes and transfer RNA genes (Arquès and Michel, 1987b, J. theor. Biol. 128, 457–461). From the “universality” of this observation, we suggested that the oligonucleotide YRY(N)6 is a primitive one and that it has a central function in DNA sequence evolution (Arquès and Michel, 1987b, J. theor. Biol. 128, 457–461). Following this idea, we introduce a concept of a model of DNA sequence evolution which will be validated according to a schema presented in three parts. In the first part, using the latest version of the gene database, the YRY(N)6YRY preferential occurrence (maximum at i = 6) is confirmed for the populations mentioned above and is extended to some newly analysed populations: chloroplast introns, chloroplast 5′ regions, mitochondrial 5′ regions and small nuclear RNA genes. In addition, the YRY(N)6YRY preferential occurrence and periodicities are used to classify 18 gene populations. In the second part, we demonstrate that several statistical features characterizing different gene populations (in particular the YRY(N)6YRY preferential occurrence and the periodicities) can be retrieved from a simple Markov model based on the mixing of the two oligonucleotides YRY(N)6 and YRY(N)3 and on the percentages of RYR and YRY in the unspecified trinucleotides (N)3 of YRY(N)6 and YRY(N)3. Several properties are identified; they show in particular that the oligonucleotide mixing is an independent process and that several different features are functions of a single parameter.
In the third part, returning from the model to reality shows a strong correlation between reality and simulation concerning the presence of large alternating purine/pyrimidine stretches and of periodicities. It also contributes to a greater understanding of biological reality, e.g. the presence or the absence of large alternating purine/pyrimidine stretches can be explained as being a simple consequence of the mixing of two particular oligonucleotides. Finally, we believe that such an approach is the first step toward a unified model of DNA sequence evolution allowing the molecular understanding of both the origin of life and the actual biological reality.
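The occurrence statistic behind the i-motif YRY(N)iYRY can be sketched in a few lines of Python. This is only an illustration of the counting step for a single sequence, under the assumption that the frequency is the fraction of start positions at which the motif occurs. The function names `to_ry` and `yry_spacing_frequency` are hypothetical, and the authors' averaging over gene populations is not reproduced.

```python
def to_ry(seq):
    """Map a DNA sequence onto the purine/pyrimidine alphabet (R = A/G, Y = C/T)."""
    table = {"A": "R", "G": "R", "C": "Y", "T": "Y"}
    return "".join(table[base] for base in seq.upper())


def yry_spacing_frequency(seq, max_i=10):
    """For each spacer length i, return the fraction of start positions
    at which the motif YRY(N)iYRY occurs in seq.

    Spacer lengths for which the motif does not fit in seq are omitted.
    """
    ry = to_ry(seq)
    freqs = {}
    for i in range(1, max_i + 1):
        span = 3 + i + 3  # YRY + i unspecified bases + YRY
        positions = len(ry) - span + 1
        if positions <= 0:
            break
        count = sum(
            1
            for p in range(positions)
            if ry[p:p + 3] == "YRY" and ry[p + 3 + i:p + span] == "YRY"
        )
        freqs[i] = count / positions
    return freqs
```

Plotting `freqs[i]` against i for a set of genes is one way to look for the maximum at i = 6 that the abstract reports.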
7.
Analysis of repetitive sequence elements containing tRNA-like sequences. (cited by: 12; self-citations: 4; citations by others: 12)
Several repetitive sequence elements from diverse species share extensive sequence homology with tRNA molecules. Analysis of the tRNA-like sequences within these elements suggests that they have originated from authentic tRNA sequences. Elements containing tRNA-like sequences can be divided into three distinct groups whose members share extensive sequence homology, have similar sequence organization and have unique species distribution. We suggest that these three groups represent independent examples of retroposon families that have originated from tRNAs.
8.
Heterogeneity in base sequence among different DNA clones containing equivalent sequences of rotavirus double-stranded RNA.
The nucleotide sequences for several complementary DNA clones of the rotavirus genome were determined. When the sequences obtained from different clones for the same regions (16,000 bases) were compared, differences in eight base positions were observed. These discrepancies, approximately 1 in 2,000 bases, may be due to differences in individual RNA genomes resulting from multiple passages; infidelity of DNA synthesis in the cloning procedure; or both factors. Whatever the cause, this frequency of base substitution found in sequences of complementary DNA obtained from the same isolate should be considered when comparing DNA sequences obtained from independent isolates. On the other hand, the frequency of base changes observed suggests that the rotavirus genome is very conserved since the virus used for cDNA synthesis has been continuously passaged for 6 years without plaque purification.
9.
The evolution of DNA sequences in Escherichia coli (cited by: 9; self-citations: 0; citations by others: 9)
D L Hartl, M Medhora, L Green, D E Dykhuizen 《Philosophical transactions of the Royal Society of London. Series B, Biological sciences》1986,312(1154):191-204
It is proposed that certain families of transposable elements originally evolved in plasmids and functioned in forming replicon fusions to aid in the horizontal transmission of non-conjugational plasmids. This hypothesis is supported by the finding that the transposable elements Tn3 and gamma delta are found almost exclusively in plasmids, and also by the distribution of the unrelated insertion sequences IS4 and IS5 among a reference collection of 67 natural isolates of Escherichia coli. Each insertion sequence was found to be present in only about one-third of the strains. Among the ten strains found to contain both insertion sequences, the number of copies of the elements was negatively correlated. With respect to IS5, approximately half of the strains containing a chromosomal copy of the insertion element also contained copies within the plasmid complement of the strain.
10.
Optimal sequence alignment allowing for long gaps (cited by: 7; self-citations: 0; citations by others: 7)
Osamu Gotoh 《Bulletin of mathematical biology》1990,52(3):359-373
A new algorithm for optimal sequence alignment allowing for long insertions and deletions is developed. The algorithm requires
O((L+C)MN) computational steps, O(LN) primary memory and O(MN) secondary memory storage, where M and N (M ≥ N) are sequence lengths, L (typically L ≤ 3) is the number of segments specifying the gap weighting function, and C is a constant. We have also modified our earlier traceback algorithm so that it finds all and only the optimal alignments
in a compact form of a directed graph. The current versions accept a set of aligned sequences as input, which facilitates
multiple sequence alignment by some iterative procedures.
Dedicated to Professor Akiyoshi Wada on the occasion of his 60th birthday.
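As a point of reference for the gap weighting function mentioned above, the sketch below scores a global alignment under a single linear segment (the affine case, L = 1), using the familiar three-state dynamic programme. It does not implement the piecewise-linear algorithm of the paper or its traceback graph. The function name `affine_gap_score` and the scoring values are illustrative assumptions.

```python
def affine_gap_score(a, b, match=1, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Optimal global alignment score when a gap of length k costs
    gap_open + k * gap_extend (a single-segment gap weighting function)."""
    neg = float("-inf")
    n, m = len(a), len(b)
    # M: a[i-1] aligned to b[j-1]; X: a[i-1] aligned to a gap; Y: b[j-1] aligned to a gap.
    M = [[neg] * (m + 1) for _ in range(n + 1)]
    X = [[neg] * (m + 1) for _ in range(n + 1)]
    Y = [[neg] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = gap_open + i * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = gap_open + j * gap_extend
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            # Either open a new gap (from M or the other gap state) or extend the current one.
            X[i][j] = max(
                M[i - 1][j] + gap_open + gap_extend,
                Y[i - 1][j] + gap_open + gap_extend,
                X[i - 1][j] + gap_extend,
            )
            Y[i][j] = max(
                M[i][j - 1] + gap_open + gap_extend,
                X[i][j - 1] + gap_open + gap_extend,
                Y[i][j - 1] + gap_extend,
            )
    return max(M[n][m], X[n][m], Y[n][m])
```

With these example weights a single gap of length two costs -5, less than the -8 of two separate gaps of length one, which is what makes long insertions and deletions affordable.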
11.
Slipped-strand mispairing: a major mechanism for DNA sequence evolution (cited by: 128; self-citations: 13; citations by others: 128)
Simple repetitive DNA sequences are a widespread and abundant feature of
genomic DNA. The following several features characterize such sequences:
(1) they typically consist of a variety of repeated motifs of 1-10
bases--but may include much larger repeats as well; (2) larger repeat units
often include shorter ones within them; (3) long polypyrimidine and poly-CA
tracts are often found; and (4) tandem arrangements of closely related
motifs are often found. We propose that slipped-strand mispairing events,
in concert with unequal crossing-over, can readily account for all of
these features. The frequent occurrence of long tandem repeats of
particular motifs (polypyrimidine and poly-CA tracts) appears to result
from nonrandom patterns of nucleotide substitution. We argue that the
intrahelical process of slipped-strand mispairing is much more likely to be
the major factor in the initial expansion of short repeated motifs and
that, after initial expansion, simple tandem repeats may be predisposed to
further expansion by unequal crossing-over or other interhelical events
because of their propensity to mispair. Evidence is presented that
single-base repeats (the shortest possible motifs) are represented by
longer runs in mammalian introns than would be expected on a random basis,
supporting the idea that SSM may be a ubiquitous force in the evolution of
the eukaryotic genome. Simple repetitive sequences may therefore represent
a natural ground state of DNA unselected for coding functions.
12.
SUMMARY: MAC5 implements MCMC sampling of the posterior distribution of tree topologies from DNA sequences containing gaps by using a five-state model of evolution (the four nucleotides and the gap character).
13.
Molecular sequences, like all experimental data, are subject to error. Many current DNA sequencing protocols have very significant error rates and often generate artefactual insertions and deletions of bases (indels) which corrupt the translation of sequences and compromise the detection of protein homologies. The impact of these errors on the utility of molecular sequence data is dependent on the analytic technique used to interpret the data. In the presence of frameshift errors, standard algorithms using six-frame translation can miss important homologies because only subfragments of the correct translation are available in any given frame. We present a new algorithm which can detect and correct frameshift errors in DNA sequences during comparison of translated sequences with protein sequences in the databases. This algorithm can recognize homologous proteins sharing 30% identity even in the presence of a 7% frameshift error rate. Our algorithm uses dynamic programming, producing a guaranteed optimal alignment in the presence of frameshifts, and has a sensitivity equivalent to Smith-Waterman. The computational complexity of the algorithm is O(nm), where n and m are the sizes of the two sequences being compared. The algorithm does not rely on prior knowledge or heuristic rules and performs significantly better than any previously reported method.
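The problem with six-frame translation described above can be reproduced with a short Python sketch. It uses the standard genetic code and hypothetical helper names (`reverse_complement`, `translate`, `six_frame_translations`). It only illustrates why a single inserted base scatters the correct translation across frames and is not the frameshift-tolerant alignment algorithm of the paper.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an ACGT sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Return translations for frames +1, +2, +3 and -1, -2, -3."""
    frames = {}
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames
```

Translating `ATGGCCAAGGAT` in frame +1 gives MAKD. After inserting one T following the sixth base, frame +1 reads MA*G while the residues KD reappear only at the end of frame +2, so no single frame contains the whole protein.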
14.
15.
On the rate of DNA sequence evolution in Drosophila (cited by: 30; self-citations: 0; citations by others: 30)
Summary: Analysis of the rate of nucleotide substitution at silent sites in Drosophila genes reveals three main points. First, the silent rate varies (by a factor of two) among nuclear genes; it is inversely
related to the degree of codon usage bias, and so selection among synonymous codons appears to constrain the rate of silent
substitution in some genes. Second, mitochondrial genes may have evolved only as fast as nuclear genes with weak codon usage
bias (and two times faster than nuclear genes with high codon usage bias); this is quite different from the situation in mammals
where mitochondrial genes evolve approximately 5–10 times faster than nuclear genes. Third, the absolute rate of substitution
at silent sites in nuclear genes in Drosophila is about three times higher than the average silent rate in mammals.
16.
17.
Michel Solignac, Monique Monerot, Jean-Claude Mounolou 《Journal of molecular evolution》1986,24(1-2):53-60
Summary: In the eight Drosophila species of the melanogaster subgroup, the mitochondrial DNA (mtDNA) contains an A+T-rich region in which replication originates. The length of this region, in contrast with that of the coding part of the genome, varies extensively among these species. The A+T-rich region ranges from about 1 kbp in D. yakuba, D. teissieri, D. erecta, and D. orena to 5 kbp in D. melanogaster, D. simulans, D. mauritiana, and D. sechellia. The difference in size is due in part to the amplification, in the species with long genomes, of a 470-bp sequence that is present only once in each of the four species with short genomes. Usually three to six repeats of this sequence occur in direct tandem repetition in the species with long genomes. The sequence is characterized by the relative positions of the Hpa I and Acc I cleavage sites. Comparative study of the genomes found in the species with long mtDNA molecules reveals relative homogeneity of the repeat units within a given genome, which contrasts with the variability found among the repeats of different genomes. This result is suggestive of a process of concerted evolution. The examination of heteroplasmic flies of three species (D. simulans, D. mauritiana, and D. sechellia) has shed light on this process. In most cases the molecular types of mtDNA present in a heteroplasmic individual differ by one repeat unit. Addition or deletion of this sequence appears to be the original mutational event generating transient heteroplasmy. Cycles of addition or deletion may consequently maintain the intragenomic homogeneity of the repeats. Finally, we have analyzed an exceptional isofemale line in which three molecular lengths of mtDNA are found (molecules with four, five, and six repeats, respectively). Individual offspring of this line carry from one to three of the molecular types, in all combinations.
This indicates that the remodeling of the mitochondrial genome occurs through a mechanism that is at present unknown, but that is site specific and rather frequent. Presented at the FEBS Symposium on Genome Organization and Evolution, held in Crete, Greece, September 1–5, 1986.
18.
Hominoid phylogeny was investigated in terms of unique DNA sequence homologies. In comparisons from the human standpoint the ΔTe50 DNA values were Man 0, chimpanzee 0·7, gorilla 1·4, gibbon 2·7, orangutan 2·9, and African green monkey 5·7. In comparisons from the orangutan standpoint the ΔTe50 DNA values were orangutan 0, chimpanzee 1·8, Man 1·9, gorilla 2·3, gibbon 2·4 and African green monkey 4·3. These results indicate that chimpanzee and gorilla are cladistically closer to Man than to orangutan and other primates, and that gorilla DNA may have diverged slightly more from the ancestral state than chimpanzee or human DNA. Comparisons from chimpanzee and gorilla DNA standpoints are needed to achieve a more definitive picture of hominoid phylogeny.
19.
Sequences 74–91 and 77–91 of E. coli thioredoxin, which according to x-ray structure contain an irregular β-turn, a hairpinlike structural element, have been synthesized and their conformational properties in solution have been investigated by means of CD spectroscopy. In addition, analogs of these sequences, containing the regular β-turn element Gly-Pro-(Gly)2, have also been prepared and investigated. These are BOC-Ile-Gly-Pro-(Gly)2-Val-OMe (III) and BOC-(Ile)3Gly-Pro-(Gly)2-(Val)5-OMe (IV) that on the basis of probability, should form hairpin structures stabilized by intramolecular interactions. While the natural sequences were shown to be unable to adopt structures characterized by an intrinsic conformational stability, the two analogs showed evidence of intramolecular folding in methanol and trifluoroethanol–water solution. In particular, the CD spectra are indicative of β-structure. The most interesting case was observed for compound IV, as the highest degree of conformational order was present in solutions containing a large proportion of water. In addition, the formation of this structure took place in a highly cooperative manner. The results are utilized to discuss whether and to what extent conformationally stable folding peptide units of small size can be formed in aqueous solution.
20.
A method for multiple sequence alignment with gaps (cited by: 13; self-citations: 0; citations by others: 13)
A method that performs multiple sequence alignment by cyclical use of the standard pairwise Needleman-Wunsch algorithm is presented. The required central processor unit time is of the same order of magnitude as the standard Needleman-Wunsch pairwise implementation. Comparison with the one known case where the optimal multiple sequence alignment has been rigorously determined shows that in practice the proposed method finds the mathematically optimal solution. The more interesting question of the biological usefulness of such multiple sequence alignment over pairwise approaches is assessed using protein families whose X-ray structures are known. The two such cases studied, the subdomains of the ricin B-chain and the S-domains of virus coat proteins, have low pairwise similarity and thus fail to align correctly under standard pairwise sequence comparison. In both cases the multiple sequence alignment produced by the proposed technique, apart from minor deviations at loop regions, correctly predicts the true structural alignment. Thus, given many sequences of low pairwise similarity, the proposed multiple sequence method can extract any familial similarity and so produce a sequence alignment consistent with the underlying structural homology.
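The pairwise building block named in this abstract, the Needleman-Wunsch algorithm, can be sketched as follows. This is a textbook version with a linear gap penalty and illustrative scoring values. It shows only the pairwise step, not the cyclical multiple-alignment procedure of the paper, and the function name `needleman_wunsch` and its parameters are assumptions.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global pairwise alignment with a linear gap penalty.

    Returns (score, aligned_a, aligned_b), where '-' marks a gap.
    """
    n, m = len(a), len(b)
    # F[i][j] = best score for aligning a[:i] with b[:j].
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap
    for j in range(1, m + 1):
        F[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s, F[i - 1][j] + gap, F[i][j - 1] + gap)

    # Trace back from the bottom-right corner, preferring diagonal moves on ties.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if a[i - 1] == b[j - 1] else mismatch
            if F[i][j] == F[i - 1][j - 1] + s:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and F[i][j] == F[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return F[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

Calling `needleman_wunsch("ACGT", "AGT")` aligns ACGT against A-GT with a score of 2 (three matches and one gap).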