Similar literature
 Found 20 similar articles; search took 31 ms
1.
Toward a more accurate time scale for the human mitochondrial DNA tree   (Total citations: 11; self-citations: 0; citations by others: 11)
Several estimates of the time of occurrence of the most recent common mitochondrial DNA (mtDNA) ancestor of modern humans have been made. Estimates derived from noncoding regions based on a model that classifies sites into two categories (variable and invariable) have been consistently older than those derived from the third positions of codons. This discrepancy can be attributed to a violation of the assumption of rate homogeneity among variable sites when analyzing the noncoding regions. Additional data from the partial control region sequences allow us to take into account some of this further heterogeneity. By assigning the sites to three classes (highly variable, moderately variable, and invariable) and by assuming that the last common mtDNA ancestor of humans and chimpanzees lived 4 million years ago, the most recent common mtDNA ancestor of humans is estimated to have occurred 211,000 ± 111,000 years ago (± 1 SE), consistent with the estimate, 101,000 ± 52,000 years, made from third positions of codons and also with those proposed previously. We used the same technique to estimate when a putative expansion of modern humans out of Africa took place and estimated a time of 89,000 ± 69,000 years ago. Even though the standard errors of these estimates are large, they allow us to reject the multiregional hypothesis of modern human origin. Deceased July 21, 1991. Correspondence to: M. Hasegawa.

2.
Phylogenetic inference is well known to be problematic if both long and short branches occur together in the underlying tree. With biological data, correcting for this problem may require simultaneous consideration of both substitution biases and rate heterogeneity between lineages and across sequence positions. A particular form of the latter is the presence of invariable sites, which are well known to mislead estimation of genetic divergences. Here we describe a capture-recapture method to estimate the proportion of invariable sites in an alignment of amino acids or nucleotides. We use it to investigate phylogenetic signals in 18S ribosomal DNA sequences from holometabolous insects. Our results suggest that, as taxa diverged, their 18S rDNA sequences have altered both in their distribution of sites that can vary and in their base compositions.

3.
Some of the assumptions underlying estimates of DNA and protein sequence divergence are examined. A solution for the variance of these estimates that allows for different mutation rates and different population sizes in each species and for an arbitrary structure in the initial population is obtained. It is shown that these conditions do not strongly affect estimates of divergence. In general, they cause the variance of divergence to be smaller than a binomial variance. Thus, the binomial variance that is usually assumed for these estimates is safely conservative. It is shown that variability in the mutation rate among sites can have an effect as large as or larger than variability in the mutation rate among bases. Variability in the mutation rate among bases and among sites causes the number of substitutions between two sequences to be underestimated. Protein and DNA sequences from several species are collected to estimate the variability in mutation rates among sites. When many homologous sequences are known, standard methods to estimate this variability can be used. The estimates of this variability show that this factor is important when considering the spectrum of spontaneous mutations and is strongly reflected in the divergence of sequences. Smaller variability is found for the third position of codons than for the first and second codon positions. This may be because of less selective constraints on this position or because the third position has been saturated with mutations for the sequences examined.

4.
If one has the amino acid sequences of a set of homologous proteins as well as their phylogenetic relationships, one can easily determine the minimum number of mutations (nucleotide replacements) which must have been fixed in each codon since their common ancestor. It is found that for 29 species of cytochrome c the data fit the assumption that there is a group of approximately 32 invariant codons and that the remainder compose two Poisson-distributed groups of size 65 and 16 codons, the latter smaller group fixing mutations at about 3.2 times the rate of the larger. It is further found that the size of the invariant group increases as the range of species is narrowed. Extrapolation suggests that less than 10% of the codons in a given mammalian cytochrome c gene are capable of accepting a mutation. This is consistent with the view that at any one point in time only a very restricted number of positions can fix mutations but that as mutations are fixed the positions capable of accepting mutations also change so that examination of a wide range of species reveals a wide range of altered positions. We define this restricted group as the concomitantly variable codons. Given this restriction, the fixation rates for mutations in concomitantly variable codons in cytochrome c and fibrinopeptide A are not very different, a result which should be the case if most of these mutations are in fact selectively neutral as Kimura suggests. Paper number 1382 from the Laboratory of Genetics. Work performed in part at the University of Iowa, Department of Preventive Medicine and Environmental Health and Department of Statistics, Iowa City, Iowa. Computing supported by the Graduate College, University of Iowa.
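The three-class model in this abstract (an invariant group plus two Poisson-distributed groups) can be illustrated with a short calculation of how many codons would be expected to show k fixed mutations. This is only a sketch: the class sizes (32, 65, 16) and the 3.2-fold rate ratio come from the abstract, while the base rate `slow_rate` is a hypothetical placeholder, since the abstract does not report the fitted Poisson means.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k events under a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def expected_codon_counts(max_k, n_invariant=32, n_slow=65, n_fast=16,
                          slow_rate=1.0, rate_ratio=3.2):
    """Expected number of codons showing k = 0..max_k fixed mutations.

    Class sizes and rate_ratio follow the abstract; slow_rate is an
    assumed value chosen only for illustration.
    """
    fast_rate = slow_rate * rate_ratio
    counts = []
    for k in range(max_k + 1):
        expected = (n_slow * poisson_pmf(k, slow_rate)
                    + n_fast * poisson_pmf(k, fast_rate))
        if k == 0:
            # Invariant codons can only ever appear in the zero-mutation class.
            expected += n_invariant
        counts.append(expected)
    return counts


if __name__ == "__main__":
    for k, value in enumerate(expected_codon_counts(8)):
        print(k, round(value, 2))
```

Comparing such expected counts with the observed distribution of minimum mutation counts per codon is the kind of fit the abstract refers to; the fitting procedure itself is not reproduced here.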

5.
A computer program (PINCERS) is described for use in the design of synthetic genes and mixed-probe DNA sequences. A protein sequence is reverse translated with generation of synonymous codons at each position, producing a degenerate sequence. In order to locate potential restriction enzyme sites, the degenerate sequence is searched with a library of restriction enzymes for sites that utilize any combination of synonymous codons. These sites are indicated in a map so that they may be incorporated into the synthetic gene sequence. The program allows the user to select the appropriate codon usage table for the organism of interest and then to set a threshold usage frequency below which codons are not generated. PINCERS may also be used to assist in planning the synthesis of mixed-probe DNA sequences for cross-hybridization experiments. It can identify regions of specified length within the protein sequence that have the least overall degeneracy, thereby minimizing the number of probes to be synthesized and, therefore, maximizing the concentration of a given probe sequence.
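The reverse-translation search described above can be sketched as follows. This is not the PINCERS implementation, whose internals the abstract does not give; it is a minimal illustration that assumes the standard genetic code, an unambiguous recognition site, and no codon-usage threshold. The function name `find_possible_sites` is hypothetical.

```python
from collections import defaultdict

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Map each amino acid to its list of synonymous codons.
SYNONYMS = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    SYNONYMS[amino_acid].append(codon)


def find_possible_sites(protein, site):
    """Return nucleotide offsets at which `site` could be encoded by
    picking a suitable synonymous codon for every residue it overlaps."""
    hits = []
    for start in range(3 * len(protein) - len(site) + 1):
        first_codon = start // 3
        last_codon = (start + len(site) - 1) // 3
        feasible = True
        for ci in range(first_codon, last_codon + 1):
            def compatible(codon):
                # Compare only the codon positions that fall inside the site.
                for p in range(3):
                    pos = 3 * ci + p - start
                    if 0 <= pos < len(site) and codon[p] != site[pos]:
                        return False
                return True

            if not any(compatible(c) for c in SYNONYMS[protein[ci]]):
                feasible = False
                break
        if feasible:
            hits.append(start)
    return hits


if __name__ == "__main__":
    # Glu-Phe can be written GAA TTC, which contains the EcoRI site GAATTC.
    print(find_possible_sites("EF", "GAATTC"))
```

Because codon choices are independent of one another, a site fits at an offset exactly when each overlapped residue has at least one synonymous codon that agrees with the site. A usage-frequency threshold like the one the abstract mentions could be added by filtering `SYNONYMS` before the search.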

6.
7.
We show, using the PDR1 element of pea, that dispersed repeated sequences of moderate copy number can be used simply and efficiently to generate markers linked to a trait of interest. Inspection of hybridization patterns of repeated sequences to DNA mixtures of pooled genotypes is a sensitive way of detecting such markers. The large number of bands in tracks of digests of these mixtures allows the simultaneous sampling of loci at many places in the genome, and the many unlinked loci serve as internal controls. It is also shown that intensity ratios calculated from these band differences can be used to give a rough estimate of linkage distance.

8.
9.
A Charomid ordered-array library containing a 2–16 kb size fraction of MboI-digested canine genomic DNA has been screened with the Jeffreys multilocus probes, 33-6 and 33-15, to identify and isolate canine minisatellite sequences. Of the 48 positive clones identified, 7 were found to contain polymorphic minisatellites with heterozygosities in the range 20–88%. The majority of the remainder were either monomorphic or dimorphic in the animals tested. Analysis of intrabreed variation in Bedlington Terriers using two polymorphic minisatellites has shown that a significant reduction occurs in the number of alleles seen compared to an agglomerated population sample, correlating with the high level of inbreeding within this breed. Flanking DNA sequence and partial repeat sequence are presented for the most polymorphic minisatellite thus far identified, cCfaMP5. The variable region in this minisatellite is similar to human minisatellites which show a distinct purine or pyrimidine strand bias.

10.
This paper shows, within the limitations of the assumption stated below, that approximately 27–29 of the unmutated codons which determine the amino acids of cytochrome c are invariant because of biological requirements. A mutation is defined here as the change of a single base in the sequence of a trinucleotide codon, which change alters the amino acid coded for. Codons, if any, in which mutations would be vigorously selected against are termed invariant codons. We assume that, subject to one adjustment, those mutations in the cytochrome c gene which survived in the descent of today's species are randomly distributed among the variable codons. The one adjustment arises from the possibility that a very few codon positions may exhibit frequencies of mutation sufficiently great to justify the exclusion of these codons from the overall distribution on the grounds that the frequency of mutation occurring in these few positions is clearly inconsistent with the assumption of randomness. There are 5 out of the total 110 codons in the cytochrome c structural gene which have clearly sustained an abnormally large number of mutations. This project received support from grants to W.M.F. from the National Institutes of Health (NB-04565) and the National Science Foundation (GB-4017).

11.
trees sifter 1.0 implements an approximate method to estimate the time to the most recent common ancestor (TMRCA) of a set of DNA sequences, using population evolution modelling. In essence, the program simulates genealogies with a user‐defined model of coalescence of lineages, and then compares each simulated genealogy to the genealogy inferred from the real data, through two summary statistics: (i) the number of mutations on the genealogy (Mn), and (ii) the number of different sequence types (alleles) observed (Kn). The simulated genealogies are then submitted to a rejection algorithm that keeps only those that are the most likely to have generated the observed sequence data. At the end of the process, the accepted genealogies can be used to estimate the posterior probability distribution of the TMRCA.
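The simulate-then-reject idea in this abstract can be sketched in a few lines. The sketch below is not the program's code: it assumes a constant-size neutral coalescent (the program accepts a user-defined model), a known scaled mutation rate `theta`, and it matches only the number of mutations (Mn), leaving out the allele count (Kn) for brevity. All function names and parameters are hypothetical.

```python
import random


def poisson_draw(rng, mean):
    """Draw a Poisson variate by counting unit-rate arrivals in [0, mean]."""
    count = 0
    t = rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_genealogy(n, theta, rng):
    """Simulate one coalescent genealogy for n sequences.

    Returns (TMRCA, number of mutations); times are in coalescent units.
    """
    tmrca = 0.0
    total_length = 0.0
    for k in range(n, 1, -1):
        # With k lineages, the waiting time to the next coalescence is
        # exponential with rate k(k-1)/2.
        t = rng.expovariate(k * (k - 1) / 2)
        tmrca += t
        total_length += k * t
    # Mutations fall on the tree as a Poisson process along its branches.
    mutations = poisson_draw(rng, theta / 2 * total_length)
    return tmrca, mutations


def abc_tmrca(n, theta, observed_mutations, n_sims=20000, tolerance=0, seed=0):
    """Keep the TMRCA of every simulated genealogy whose mutation count
    lies within `tolerance` of the observed count."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_sims):
        tmrca, mutations = simulate_genealogy(n, theta, rng)
        if abs(mutations - observed_mutations) <= tolerance:
            accepted.append(tmrca)
    return accepted


if __name__ == "__main__":
    sample = abc_tmrca(n=10, theta=5.0, observed_mutations=14, tolerance=1)
    print(len(sample), sum(sample) / len(sample))
```

The accepted TMRCA values approximate a posterior sample under these assumptions; a histogram or quantiles of `sample` would summarize it.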

12.
Seven hundred globin sequences, including 146 nonvertebrate sequences, were aligned on the basis of conservation of secondary structure and the avoidance of gap penalties. Of the 182 positions needed to accommodate all the globin sequences, only 84 are common to all, including the absolutely conserved PheCD1 and HisF8. The mean number of amino acid substitutions per position ranges from 8 to 13 for all globins and 5 to 9 for internal positions. Although the total sequence volumes have a variation of approximately 2-3%, the variation in volume per position ranges from approximately 13% for the internal to approximately 21% for the surface positions. Plausible correlations exist between amino acid substitution and the variation in volume per position for the 84 common and the internal but not the surface positions. The amino acid substitution matrix derived from the 84 common positions was used to evaluate sequence similarity within the globins and between the globins and phycocyanins C and colicins A, via calculation of pairwise similarity scores. The scores for globin-globin comparisons over the 84 common positions overlap the globin-phycocyanin and globin-colicin scores, with the former being intermediate. For the subset of internal positions, overlap is minimal between the three groups of scores. These results imply a continuum of amino acid sequences able to assume the common three-on-three alpha-helical structure and suggest that the determinants of the latter include sites other than those inaccessible to solvent.

13.
Sola L, Gornung E, Naoi H, Gunji R, Sato C, Kawamura K, Arai R, Ueda T. Genetica, 2003, 119(1): 99-106
The Japanese rose bitterling, Rhodeus ocellatus kurumeus, and the oily bitterling, Tanakia limbata, were cytogenetically studied by silver (Ag)- and chromomycin A3 (CMA3)-staining, by C-banding and by mapping of the 18S ribosomal genes and of the (TTAGGG)n telomeric sequence. These two representative species of related genera of the subfamily Acheilognathinae show very similar chromosome complements. Nevertheless, significant differences in the chromosomal distribution of nucleolus organizer regions (NORs) and interstitial telomeric sequences were observed. Whereas R. ocellatus kurumeus shows a single NOR-bearing chromosome pair, T. limbata is characterized by a higher number of variable NORs. Multiple telomeric sequence sites were found at the pericentromeric regions of several chromosomes in the rose bitterling. No telomeric sequence sites were detected near centromeres, but they were found to be scattered along the NORs in the oily bitterling. Two karyoevolutive trends might have been identified in the subfamily.

14.
It is believed that pausing during mRNA translation plays some role in ensuring proper folding of newly synthesized sections of a protein chain. Such pausing occurs when rare triplets are encountered in the mRNA, as it takes additional time for the corresponding rare species of tRNA to be delivered. To determine whether pause sites are non-randomly distributed along prokaryotic mRNA (cDNA), we have located clusters of rare triplets in cDNA sequences from 21 different bacteria. From the individual profiles of local codon frequencies calculated with various windows, the positions of the clusters of the rarest codons were taken for generation of the combined histograms of positional preferences of the pause sites. The histograms show that in the prokaryotic sequences, the pause sites are located preferentially at the start positions and at about 155 triplets from the starts. To verify the generality of these observations, the data are grouped in six independent sets of about 500 sequences each, all revealing the same features. A less prominent maximum is also seen at the triplet position 75. Judging by the amplitude of the peak at 155 triplets, an optimal cluster size is estimated to equal 18 triplets. The distance 155 closely corresponds to the sizes of typical protein folds and to earlier estimated prokaryotic protein sequence segments. This supports the suggestion of a role for translation pausing in the cotranslational folding of protein domains. The profiles of rare codons in mRNA can serve in the detection or prediction of boundaries between protein domains.
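A sliding-window profile of local codon frequency, as described above, can be sketched as follows. The sketch assumes in-frame coding sequences and estimates codon frequencies from the supplied sequences themselves; the default window of 18 triplets echoes the cluster size estimated in the abstract, although the authors report using various windows. Function names are hypothetical.

```python
from collections import Counter


def split_codons(seq):
    """Split an in-frame sequence into complete triplets."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def codon_frequencies(sequences):
    """Relative frequency of each codon across a set of coding sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(split_codons(seq))
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


def rarest_window(seq, freqs, window=18):
    """Return (start_triplet, mean_frequency) of the window with the
    lowest mean codon frequency, i.e. the densest rare-codon cluster."""
    scores = [freqs.get(codon, 0.0) for codon in split_codons(seq)]
    if len(scores) < window:
        raise ValueError("sequence is shorter than the window")
    means = [sum(scores[i:i + window]) / window
             for i in range(len(scores) - window + 1)]
    start = min(range(len(means)), key=means.__getitem__)
    return start, means[start]


if __name__ == "__main__":
    gene = "GCT" * 10 + "AGG" * 3 + "GCT" * 10
    freqs = codon_frequencies([gene])
    print(rarest_window(gene, freqs, window=3))
```

Collecting the `start` positions over many genes and plotting their histogram would give a positional-preference profile of the kind the abstract describes.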

15.
Operational taxonomic units (OTUs) are conventionally defined at a phylogenetic distance (0.03 for species, 0.05 for genus, 0.10 for family) based on full-length 16S rRNA gene sequences. However, partial sequences (700 bp or shorter) have been used in most studies. This discord may affect analysis of diversity and species richness because sequence divergence is not distributed evenly along the 16S rRNA gene. In this study, we compared a set each of bacterial and archaeal 16S rRNA gene sequences of nearly full length with multiple sets of different partial 16S rRNA gene sequences derived therefrom (approximately 440-700 bp), at conventional and alternative distance levels. Our objective was to identify partial sequence region(s) and distance level(s) that allow more accurate phylogenetic analysis of partial 16S rRNA genes. Our results showed that no partial sequence region could estimate OTU richness or define OTUs as reliably as nearly full-length genes. However, the V1-V4 regions can provide more accurate estimates than others. For analysis of archaea, we recommend the V1-V3 and the V4-V7 regions and clustering of species-level OTUs at 0.03 and 0.02 distances, respectively. For analysis of bacteria, the V1-V3 and the V1-V4 regions should be targeted, with species-level OTUs being clustered at 0.04 distance in both cases.
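To make the notion of clustering OTUs at a distance cutoff concrete, here is a minimal sketch. It assumes pre-aligned sequences of equal length, an uncorrected p-distance, and furthest-neighbor (complete-linkage) clustering; the abstract does not state which distance correction or clustering software the authors used, so these choices and the function names are illustrative only.

```python
def p_distance(a, b):
    """Proportion of differing positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def cluster_otus(seqs, cutoff=0.03):
    """Furthest-neighbor clustering: repeatedly merge the closest pair of
    clusters while their most distant members are within `cutoff`."""
    clusters = [[i] for i in range(len(seqs))]

    def linkage(c1, c2):
        return max(p_distance(seqs[i], seqs[j]) for i in c1 for j in c2)

    while True:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(clusters[a], clusters[b])
                if d <= cutoff and (best is None or d < best[0]):
                    best = (d, a, b)
        if best is None:
            return clusters
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]


if __name__ == "__main__":
    reads = ["A" * 100, "A" * 98 + "CC", "C" * 100]
    print(len(cluster_otus(reads, cutoff=0.03)))
```

Running the same sequences at several cutoffs (for example 0.02, 0.03 and 0.04) shows how the OTU count depends on the chosen distance level, which is the comparison the abstract is concerned with. The pairwise loop is quadratic or worse and is meant for small examples only.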

16.
Rate of change of concomitantly variable codons   (Total citations: 1; self-citations: 1; citations by others: 0)
Summary. It was previously shown that about 10% of the codons in cytochrome c are variable in any one mammalian species at any one point in time and that the positions of these concomitantly variable codons (covarions) must change as mutations are fixed. Variability implies the existence of an alternative, non-deleterious amino acid that differs by only one nucleotide replacement from the one presently encoded. This work, in addition to obtaining an independent estimate of the number of covarions, investigates the question: What is the likelihood that a cytochrome c covarion will lose its variable status as a result of the fixation of a mutation in another covarion? The results show: 1, the number of covarions is in the range of 4 to 10, in agreement with the earlier result of 10 but suggesting the variability may be even more circumscribed than originally thought; and 2, the likelihood of a covarion losing its variable status as a result of fixations elsewhere in the gene may be greater than 0.75, suggesting a high turnover rate among the covarions.

17.
This study answers the question: Are the variable and invariable codons of cytochrome c largely the same in all species? A method is presented for estimating the number of invariable (as opposed to unvaried) codons common to two taxa. The two taxa in this study were comprised of four fungi and four metazoans. Given the number of mutations fixed in each taxon, one calculates the number of codons that would be expected to have fixed mutations in both taxa, in one taxon only, in the other taxon only, and in neither taxon. This expectation depends upon the number of invariable codons that are assumed to be common to both taxa. In the present example, the assumption of 41 invariable codons in common leads to estimates that deviate by less than 2% from the values actually observed. This leads to the conclusion that there are 46 positions that are variable in one taxon but invariable in the other, thereby demonstrating that the invariable codons are not largely the same between the fungi and the metazoans.

18.
Discriminating phylogenetic signal from noise in DNA sequence data is a difficult problem in phylogenetic inference at higher systematic levels. For protein-coding genes, noise at synonymous (silent) positions can be filtered by deleting entire codon positions or types of change at a codon position. This method is not appropriate for replacement sites, because changes at each site within a codon may not be independent. This research presents a method using information from protein structure to evaluate variation in replacement sites. Analysis of the correlation of amino acid variation with protein structure identified rapidly evolving codons in the COIII gene. In a series of phylogenetic analyses attempting to recover a known set of vertebrate relationships, downweighting these labile codons produced the most accurate results. Structural correlates of variable and invariant residues identified in this study can be used to increase the accuracy of models used for phylogenetic inference. Viewing amino acid variation within a phylogenetic framework provided insight into residue changes important in the secondary and tertiary structures of the molecule, changes that were correlated between pairs of neighboring residues or between residues in neighboring helices.

19.
Recent investigations into the translation termination sites of various organisms have revealed that not only stop codons but also sequences around stop codons have an effect on translation termination. To investigate the relationship between these sequence patterns and translation as well as its termination efficiency, we analysed the correlation between strength of consensus and translation efficiency, as predicted according to Codon Adaptation Index (CAI) value. We used RIKEN full-length mouse cDNA sequences and ten other eukaryotic UniGene datasets from NCBI for the analyses. First, we conducted sequence profile analyses following translation termination sites. We found bases G and A at position +1 as a strong consensus for mouse cDNA. A similar consensus was found for other mammals, such as Homo sapiens, Rattus norvegicus and Bos taurus. However, some plants had different consensus sequences. We then analysed the correlation between the strength of consensus at each position and the codon biases of whole coding regions, using information content and CAI value. The results showed that in mouse cDNA, CAI value had a positive correlation with information content at position +1. We also found that, for positions with strong consensus, the strength of the consensus is likely to have a positive correlation with CAI value in some other eukaryotes. Along with these observations, biological insights into the relationship between gene expression level, codon biases and consensus sequence around stop codons will be discussed.
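The two quantities correlated in this abstract, information content at an alignment position and the Codon Adaptation Index, can each be computed in a few lines. The sketch below assumes a uniform four-base background with no small-sample correction for information content, and takes CAI as the geometric mean of per-codon relative adaptiveness weights supplied by the caller. The weights in the example are made-up placeholders, not values from the study.

```python
import math
from collections import Counter


def information_content(column):
    """Information content (bits) of one alignment column of nucleotides,
    measured against a uniform background: 2 - Shannon entropy."""
    n = len(column)
    counts = Counter(column)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return 2.0 - entropy


def cai(codons, weights):
    """Codon Adaptation Index: geometric mean of the relative
    adaptiveness weights of the codons that have a weight."""
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no codon has a weight")
    return math.exp(sum(logs) / len(logs))


if __name__ == "__main__":
    # Bases observed at position +1 after the stop codon in four
    # hypothetical transcripts.
    print(information_content("GGAA"))
    # Hypothetical weights: the preferred synonymous codon gets 1.0.
    print(cai(["GCT", "GCC"], {"GCT": 1.0, "GCC": 0.5}))
```

Grouping genes by CAI and computing the information content of the +1 column within each group would give the kind of correlation the abstract reports.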

20.
A solution is presented for the problem of how to find ancestral codons which minimize the number of mutations over a given network of species for which character-states of aligned amino acid sequences among the contemporary species are known. Three theorems which allow this "maximum parsimony" problem to be solved are proved; then the use of these theorems in finding maximum parsimony ancestral codons is illustrated on a network of chicken and mammalian alpha globin amino acid sequences at two alignment positions.
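The flavor of the maximum parsimony problem can be shown with a small sketch. Note that this is not the procedure proved in the paper, which works with codons on an unrooted network; it is the simpler and widely taught Fitch bottom-up pass for a single nucleotide site on a rooted binary tree, substituted here for illustration. The tree encoding (nested two-element tuples with single-letter leaf states) is an assumption of the sketch.

```python
def fitch(tree):
    """Bottom-up Fitch pass for one site.

    `tree` is either a leaf state such as "A" or a pair (left, right).
    Returns (candidate ancestral states, minimum number of changes).
    """
    if isinstance(tree, str):
        return {tree}, 0
    left, right = tree
    left_states, left_cost = fitch(left)
    right_states, right_cost = fitch(right)
    shared = left_states & right_states
    if shared:
        # The children can agree on a state, so no extra change is needed.
        return shared, left_cost + right_cost
    # Otherwise one change is required somewhere below this node.
    return left_states | right_states, left_cost + right_cost + 1


if __name__ == "__main__":
    states, changes = fitch((("A", "A"), ("G", "A")))
    print(sorted(states), changes)
```

Applying the pass to each of the three positions of a codon separately gives a lower bound on nucleotide changes per codon; the paper's theorems address the harder case where only the amino acids at the tips are known.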
