Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
2.
3.
The nucleotide sequence of the beta globin gene cluster of the prosimian Galago crassicaudatus has been determined. A total sequence spanning 41,101 bp contains and links together previously published sequences of the five galago beta-like globin genes (5'-epsilon-gamma-psi eta-delta-beta-3'). A computer-aided search for middle interspersed repetitive sequences identified 10 LINE (L1) elements, including a 5' truncated repeat that is orthologous to the full-length L1 element found in the human epsilon-gamma intergenic region. SINE elements that were identified included one Alu type I repeat, four Alu type II repeats, and two methionine tRNA-derived Monomer (type III) elements. Alu type II and Monomer sequences are unique to the galago genome. Structural analysis of the cluster sequence reveals that it is relatively A+T rich (about 62%) and that regions with high G+C content are associated primarily with globin coding regions. Comparative analyses with the beta globin cluster sequences of human, rabbit, and mouse reveal extensive sequence homologies in their genic regions, but only human, galago, and rabbit sequences share extensive intergenic sequence homologies. Divergence analyses of aligned intergenic and flanking sequences from orthologous human, galago, and rabbit sequences show a gradation in the rate of nucleotide sequence evolution along the cluster, where sequences 5' of the epsilon globin gene region show the least sequence divergence and sequences just 5' of the beta globin gene region show the greatest sequence divergence.
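The A+T figure quoted in the abstract above (about 62%) is a base-composition statistic. As a rough illustration, and not the authors' actual procedure, the following Python sketch computes the A+T fraction of a whole sequence and of fixed-size windows along it. The function names `at_fraction` and `windowed_at`, the window size, and the toy sequence are all assumptions made for this example.

```python
def at_fraction(seq: str) -> float:
    """Fraction of A and T among the unambiguous bases (A, C, G, T) in seq."""
    counted = [b for b in seq.upper() if b in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    at_count = sum(1 for b in counted if b in "AT")
    return at_count / len(counted)


def windowed_at(seq: str, window: int, step: int) -> list:
    """(start, A+T fraction) for each full window of the given size."""
    return [
        (start, at_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence (hypothetical, not galago data): an A+T-rich half
# followed by a G+C-rich half.
toy = "ATATAGCGCG"
overall = at_fraction(toy)
profile = windowed_at(toy, window=5, step=5)
print(overall, profile)
```

A windowed profile of this kind is one simple way to see that G+C-rich stretches are localized, which is the sort of pattern the abstract reports for the globin coding regions.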

4.
To study the genomic divergences among hominoids and to estimate the effective population size of the common ancestor of humans and chimpanzees, we selected 53 autosomal intergenic nonrepetitive DNA segments from the human genome and sequenced them in a human, a chimpanzee, a gorilla, and an orangutan. The average sequence divergence was only 1.24% ± 0.07% for the human-chimpanzee pair, 1.62% ± 0.08% for the human-gorilla pair, and 1.63% ± 0.08% for the chimpanzee-gorilla pair. These estimates, which were confirmed by additional data from GenBank, are substantially lower than previous ones, which included repetitive sequences and might have been based on less-accurate sequence data. The average sequence divergences between orangutans and humans, chimpanzees, and gorillas were 3.08% ± 0.11%, 3.12% ± 0.11%, and 3.09% ± 0.11%, respectively, which also are substantially lower than previous estimates. The sequence divergences in other regions between hominoids were estimated from extensive data in GenBank and the literature, and Alus showed the highest divergence, followed in order by Y-linked noncoding regions, pseudogenes, autosomal intergenic regions, X-linked noncoding regions, synonymous sites, introns, and nonsynonymous sites. The neighbor-joining tree derived from the concatenated sequence of the 53 segments (24,234 bp in length) supports the Homo-Pan clade with a 100% bootstrap value. However, when each segment is analyzed separately, 22 of the 53 segments (approximately 42%) give a tree that is incongruent with the species tree, suggesting a large effective population size (N(e)) of the common ancestor of Homo and Pan. Indeed, a parsimony analysis of the 53 segments and 37 protein-coding genes leads to an estimate of N(e) = 52,000 to 96,000.
As this estimate is 5 to 9 times larger than the long-term effective population size of humans (approximately 10,000) estimated from various genetic polymorphism data, the human lineage apparently had experienced a large reduction in effective population size after its separation from the chimpanzee lineage. Our analysis assumes a molecular clock, which is in fact supported by the sequence data used. Taking the orangutan speciation date as 12 to 16 million years ago, we obtain an estimate of 4.6 to 6.2 million years for the Homo-Pan divergence and an estimate of 6.2 to 8.4 million years for the gorilla speciation date, suggesting that the gorilla lineage branched off 1.6 to 2.2 million years earlier than did the human-chimpanzee divergence.
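The dating step in the abstract above rests on a molecular clock: if substitutions accumulate at a constant rate, divergence time is proportional to sequence divergence. The sketch below applies a naive pairwise version of that idea to the percentages quoted in the abstract. It is an assumption-laden approximation, not the authors' method, which the abstract does not spell out. The helper `scale_divergence_time` is a hypothetical name. This simple ratio gives roughly 4.8 to 6.4 million years for human-chimpanzee and roughly 6.3 to 8.4 million years for the gorilla split, close to but not identical with the published 4.6 to 6.2 and 6.2 to 8.4.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def scale_divergence_time(d_pair, d_calibration, t_calibration_mya):
    """Strict-clock scaling: time is proportional to divergence.

    d_pair and d_calibration must be in the same units (here, percent).
    """
    if d_calibration <= 0:
        raise ValueError("calibration divergence must be positive")
    return t_calibration_mya * d_pair / d_calibration


# Average pairwise divergences (%) as quoted in the abstract.
D_HUMAN_CHIMP = 1.24
D_GORILLA = mean([1.62, 1.63])          # human-gorilla, chimpanzee-gorilla
D_ORANGUTAN = mean([3.08, 3.12, 3.09])  # orangutan vs. the other three

# Calibration: orangutan speciation 12 to 16 million years ago (from the abstract).
for t_orangutan in (12.0, 16.0):
    t_hc = scale_divergence_time(D_HUMAN_CHIMP, D_ORANGUTAN, t_orangutan)
    t_g = scale_divergence_time(D_GORILLA, D_ORANGUTAN, t_orangutan)
    print(f"calibration {t_orangutan:.0f} Mya: "
          f"human-chimpanzee {t_hc:.2f} Mya, gorilla {t_g:.2f} Mya")
```

The gap between these values and the published ones is a reminder that averaging pairwise distances is only a first approximation to an estimate made on a tree.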

5.
We have isolated a new family of moderately repetitive nucleotide sequences (about 2500 copies per haploid genome) specific to the genus Zea and absent in other graminaceous species. These sequences are interspersed in the genome and they show the same genomic organization pattern and similar copy number in all the Zea species examined. These two facts, consistency in the copy number and the same organization pattern, would indicate on the one hand that these sequences were amplified before the divergence of Zea species, and on the other hand that maize and all the teosintes could be considered as the same evolutionary population. Independent clones corresponding to the repetitive sequences have been isolated and sequenced from a genomic library of the teosinte, Zea diploperennis. The repeats, flanked by HaeIII sites, are more than 70% G + C-rich, on average 253 bp long and show 78% similarity to each other. These repetitive sequences are in a highly methylated-C context and they present some features resembling those of coding sequences, such as high CpG and low TpA content, and similar codon usage to maize genes in one of the reading frames. Moreover, the repetitive probe hybridizes with RNA extracted from different tissues of maize and from teosinte, indicating that these repeats or similar ones are present in transcribed sequences.

6.
Measurements are reported which lead to the conclusion that repetitive and nonrepetitive sequences are intimately interspersed in the majority of the DNA of the sea urchin, Strongylocentrotus purpuratus. Labeled DNA was sheared to various lengths, reassociated with a great excess of 450 nucleotide-long fragments to Cot 20, and the binding of the labeled DNA to hydroxyapatite was measured. Repetitive sequences measured in this way are present on about 42% of the 450 nucleotide-long fragments. As the DNA fragment length is increased, larger and larger fractions of the fragments contain repetitive sequences. Analysis of the measurements leads to the following estimate of the quantitative features of the pattern of interspersion of repetitive and nonrepetitive sequences. About 50% of the genome consists of a short-period pattern with 300–400 nucleotide average length repetitive segments interspersed with about 1000 nucleotide average length nonrepetitive segments. Another 20% or more consists of a longer period interspersed pattern. About 6% of the genome is made up of relatively long regions of repetitive sequences. The remaining 22% of the genome may be uninterrupted single copy DNA, or may have more widely spaced repeats interspersed. The similarity of these results to previous measurements with the DNA of an amphibian suggests that this interspersion pattern is of general occurrence and selective importance.

7.
Nucleotide sequences of nine 5' upstream gene regions for human, chimpanzee, gorilla, and orangutan were determined. We estimated nucleotide differences (d) for each region between human and great apes. The overall d was 0.027 (range 0.004 to 0.052). Rates of nucleotide substitution were estimated by using d and the divergence times of human, chimpanzee, gorilla, and orangutan. The overall rate of nucleotide substitution between human and other hominoids was estimated to be 0.52–0.85 × 10⁻⁹. This rate in 5' upstream regions was lower than that of synonymous sites, suggesting that 5' upstream regions have evolved under some functional constraints. Because lower rates have been reported for coding sequences in primates compared to rodents, we also estimated the rate (1.17–1.76 × 10⁻⁹) of nucleotide substitutions for the corresponding 5' upstream regions in rodents (mouse/rat comparison). Thus the primate rate was lower than the rodent rate for the 5' upstream regions as well.

8.
9.
Two distinct processed calmodulin genes of rat (lambda SC8 and lambda SC9) were identified, cloned and their DNA sequences determined. The existence of direct repeats of 19 base-pairs for lambda SC8 or 9 base-pairs for lambda SC9 at both ends of the coding plus non-coding regions suggested a possible involvement of a mRNA-mediated process of insertion. Total genomic Southern hybridization suggested the existence of at least three different calmodulin-related genes in the rat genome. The other gene was the bona fide calmodulin gene (lambda SC4) which was split into at least five exons. lambda SC9 contained insertions of one nucleotide and two 17 base-pair direct repeats in the coding region. These insertions cause frameshift mutations probably preventing it from encoding a functional calmodulin. It also carried an insertion of a rat middle repetitive sequence, identifier sequence (IDS: Sutcliffe et al., 1982) in the 3'-non-coding region. Otherwise, it consisted of an almost identical DNA sequence to that of the bona fide calmodulin gene (lambda SC4), including the 3'-non-coding region down to the poly(A) recognition signal, A-A-T-A-A-A. On the other hand, lambda SC8 did not possess frameshift mutations in the coding region, and hence was capable of encoding a functional protein. In fact, a probe specific to the lambda SC8 sequence identified a band in Northern blotting whose size was 300 nucleotides smaller than that of authentic calmodulin mRNA. Comparison of the nucleotide sequences showed that only the coding regions of these two processed genes were homologous, indicating that the divergence of these two processed genes from the common ancestor calmodulin was an ancient event.

10.

Background

Approximately 11 Mb of finished high quality genomic sequences were sampled from cattle, dog and human to estimate genomic divergences and their regional variation among these lineages.

Results

Optimal three-way multi-species global sequence alignments for 84 cattle clones or loci (each >50 kb of genomic sequence) were constructed using the human and dog genome assemblies as references. Genomic divergences and substitution rates were examined for each clone and for various sequence classes under different functional constraints. Analysis of these alignments revealed that the overall genomic divergences are relatively constant (0.32–0.37 change/site) for pairwise comparisons among cattle, dog and human; however, substitution rates vary across genomic regions and among different sequence classes. A neutral mutation rate (2.0–2.2 × 10⁻⁹ change/site/year) was derived from ancestral repetitive sequences, whereas the substitution rate in coding sequences (1.1 × 10⁻⁹ change/site/year) was approximately half of the overall rate (1.9–2.0 × 10⁻⁹ change/site/year). Relative rate tests also indicated that cattle have a significantly faster rate of substitution as compared to dog and that this difference is about 6%.

Conclusion

This analysis provides a large-scale and unbiased assessment of genomic divergences and regional variation of substitution rates among cattle, dog and human. It is expected that these data will serve as a baseline for future mammalian molecular evolution studies.

11.
Evolution of the genome size in eukaryotes is often affected by changes in the noncoding sequences, for which insertions and deletions (indels) of small nucleotide sequences and amplification of repetitive elements are considered responsible. In this study, we compared the genomic DNA sequences of two kinds of fish, medaka (Oryzias latipes) and fugu (Takifugu rubripes), which show a two-fold difference in genome size (800 Mb vs. 400 Mb). We selected a contiguous DNA sequence of 790 kb from the medaka chromosome LG22 (linkage group 22), and made a precise comparison with the sequence (387 kb) of the corresponding region of Takifugu. A total of 178 kb of sequence was aligned in common between the two fishes, and the remaining sequences (612 kb for medaka and 209 kb for fugu) were found to be rich in various repetitive elements, including many types of unclassified low copy repeats, all of which accounted for more than half (54%) of the genome size difference. Furthermore, we identified a significant difference in the length ratio of the unaligned sequences located between the aligned sequences (USBAS), particularly after eliminating known repetitive elements. These USBAS with no repetitive elements (USBAS-nr) were located within introns and intergenic regions. These results strongly indicate that amplification of repetitive elements and accumulation of indels are major driving forces behind changes in genome size.
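The region sizes quoted in the abstract above can be cross-checked with simple arithmetic: the unaligned portion of each region is its total length minus the 178 kb aligned in common. The short Python sketch below does only that bookkeeping. The constant and function names are hypothetical, and the 54% repeat contribution reported in the abstract cannot be recomputed from the abstract alone, so it is not attempted here.

```python
# Region sizes in kb, as quoted in the abstract.
MEDAKA_REGION_KB = 790
FUGU_REGION_KB = 387
ALIGNED_KB = 178


def unaligned_kb(region_kb: int, aligned_kb: int) -> int:
    """Length of a region left over after removing the aligned portion."""
    if aligned_kb > region_kb:
        raise ValueError("aligned length cannot exceed region length")
    return region_kb - aligned_kb


medaka_unaligned = unaligned_kb(MEDAKA_REGION_KB, ALIGNED_KB)
fugu_unaligned = unaligned_kb(FUGU_REGION_KB, ALIGNED_KB)
region_difference = MEDAKA_REGION_KB - FUGU_REGION_KB
size_ratio = MEDAKA_REGION_KB / FUGU_REGION_KB

print(medaka_unaligned, fugu_unaligned, region_difference, round(size_ratio, 2))
```

The ratio of the two regions (about 2.04) mirrors the two-fold difference in whole-genome size, which suggests the selected region is representative in that respect.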

12.
Slow molecular clocks in Old World monkeys, apes, and humans   Total citations: 17 (self-citations: 0; citations by others: 17)
Two longstanding issues on the molecular clock hypothesis are studied in this article. First, is there a global molecular clock in mammals? Although many authors have observed unequal rates of nucleotide substitution among mammalian lineages, some authors have proposed a global clock for all eutherians, i.e., a single global rate of 2.2 × 10⁻⁹ substitutions per nucleotide site per year. We reexamine this issue using noncoding, nonrepetitive DNA from Old World monkeys (OWMs), chimpanzee, and human. First, using the minimal date of 6 MYA for the human-chimpanzee divergence and more than 2.5 million base pairs of genomic sequences from human and chimpanzee, we estimate a maximal rate of 0.99 × 10⁻⁹ for noncoding, nonrepetitive genomic regions for these two species. This estimate is less than half of the proposed global rate and much smaller than the commonly used rate (3.5 × 10⁻⁹) for eutherians. Further, using a minimal date of 23 MYA for the human-OWM divergence, we estimate a maximal rate of 1.5 × 10⁻⁹ for both introns and fourfold degenerate sites in humans and OWMs. In addition, with the New World monkey (NWM) lineage as an outgroup, we estimate that the rate of substitution in introns is 30% higher in the OWM lineage than in the human lineage. Clearly, there is no global molecular clock in eutherians. Second, although many studies have indicated considerable variation in the mutation rate among regions of the mammalian genome, a recent study proposed a uniform rate. Using new and existing intron sequence data from higher primates, we find significant rate variation among genomic regions and a positive correlation between the rate of substitution and the GC content, refuting the claim of a uniform rate.
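The abstract above pairs a minimal divergence date with a maximal rate. That pairing follows from the textbook pairwise relationship r = d / (2T): for a fixed divergence d, a smaller T gives a larger r. The Python sketch below illustrates the relationship. It assumes the authors' estimates follow this standard form, which the abstract does not state explicitly. The function names are hypothetical, and the "implied divergence" values are back-calculated here for illustration rather than reported in the abstract.

```python
def substitution_rate(divergence: float, time_years: float) -> float:
    """Pairwise rate r = d / (2T), in substitutions per site per year.

    The factor 2 accounts for change accumulating on both lineages.
    """
    if time_years <= 0:
        raise ValueError("divergence time must be positive")
    return divergence / (2.0 * time_years)


def implied_divergence(rate: float, time_years: float) -> float:
    """Inverse of substitution_rate: d = 2 * r * T."""
    return 2.0 * rate * time_years


# Figures quoted in the abstract.
PROPOSED_GLOBAL_RATE = 2.2e-9
HUMAN_CHIMP_MAX_RATE = 0.99e-9
HUMAN_CHIMP_MIN_DATE = 6e6       # years
HUMAN_OWM_MAX_RATE = 1.5e-9
HUMAN_OWM_MIN_DATE = 23e6        # years

# Back-calculated (not reported) divergences per site.
d_hc = implied_divergence(HUMAN_CHIMP_MAX_RATE, HUMAN_CHIMP_MIN_DATE)
d_owm = implied_divergence(HUMAN_OWM_MAX_RATE, HUMAN_OWM_MIN_DATE)

# An older divergence date lowers the rate, so a minimal date bounds it from above.
older_date_rate = substitution_rate(d_hc, 7e6)

print(d_hc, d_owm, older_date_rate, HUMAN_CHIMP_MAX_RATE / PROPOSED_GLOBAL_RATE)
```

The final ratio (0.45) matches the abstract's statement that the human-chimpanzee estimate is less than half of the proposed global rate.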

13.
We develop techniques to estimate the statistical significance of gap-free alignments between two genomic DNA sequences, using human-mouse alignments as an example. The sequences are assumed to be sufficiently similar that some but not all of the neutrally evolving regions (i.e., those under no evolutionary constraint) can be reliably aligned. Our goal is to model the situation in which the neutral rate of evolution, and hence the extent of the aligning intervals, varies across the genome. In some cases, this permits the weaker of two matches to be judged as less likely to have arisen by chance, provided it lies in a genomic interval with a high level of background divergence. We employ a hidden Markov model to capture variations in divergence rates and assign probability values to gap-free alignments using techniques of Dembo and Karlin, which are related to those used for the same purpose by BLAST. Our methods are illustrated in detail using a 1.49 Mb genomic region. Results obtained from the analysis of human chromosome 22 using these techniques are also provided.

14.
15.
B Brenig. Animal Genetics, 1999, 30(2): 120-125
Interspersed elements are ubiquitous in the genomes of higher eukaryotes and account for over a third of the genomic DNA (Smit 1996). In swine the short interspersed elements, SINEs or PREs (porcine repetitive elements), have been found in a number of introns and 3' untranslated regions of different genes. However, compared to human Alu repeats the number of available PRE DNA sequences is still limited. In this study we have compared 85 PREs selected from DNA sequence database entries. The PREs were aligned and for each nucleotide position the relative frequencies of the four bases were calculated. A consensus sequence was derived from the most frequently used (first) base at each position. Similar to studies of SINEs in other species, the analysis showed that most mutations in PREs occur at CpG dinucleotide hot spots. The position variability for the two most frequent bases shows a bimodal distribution. The analysis suggests that the porcine SINEs can be divided into three major subfamilies sharing conserved nucleotide similarities.
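The consensus step described in the abstract above (per-position base frequencies, then one base chosen per position) can be sketched in a few lines of Python. This is a minimal illustration that assumes the consensus base is the most frequent base in each alignment column. The function names, the tie-breaking rule (first of A, C, G, T), the gap handling, and the toy alignment are all assumptions for this example, not details taken from the study.

```python
from collections import Counter

BASES = "ACGT"


def base_frequencies(aligned):
    """Relative frequency of each base per alignment column (gaps ignored)."""
    if not aligned:
        raise ValueError("need at least one aligned sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    table = []
    for column in zip(*(seq.upper() for seq in aligned)):
        counts = Counter(base for base in column if base in BASES)
        total = sum(counts.values())
        table.append({b: (counts[b] / total if total else 0.0) for b in BASES})
    return table


def consensus(aligned):
    """Most frequent base per column; '-' where a column holds no bases.

    Ties are broken by the order of BASES (an arbitrary choice).
    """
    result = []
    for freqs in base_frequencies(aligned):
        best = max(BASES, key=lambda b: freqs[b])
        result.append(best if freqs[best] > 0 else "-")
    return "".join(result)


# Toy alignment (hypothetical, not PRE data); '-' marks a gap.
toy_alignment = ["ACGT-A", "ACGTTA", "ATGTTA", "ACCTTG"]
print(consensus(toy_alignment))
print(base_frequencies(toy_alignment)[1])
```

The frequency table also exposes the second most frequent base per column, which is the kind of information the abstract's variability analysis relies on.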

16.
Arabidopsis thaliana has a relatively small genome of approximately 130 Mb containing about 10% repetitive DNA. Genome sequencing studies reveal a gene-rich genome, predicted to contain approximately 25000 genes spaced on average every 4.5 kb. Between 10 and 20% of the predicted genes occur as clusters of related genes, indicating that local sequence duplication and subsequent divergence generates a significant proportion of gene families. In addition to gene families, repetitive sequences comprise individual and small clusters of two to three retroelements and other classes of smaller repeats. The clustering of highly repetitive elements is a striking feature of the A. thaliana genome emerging from sequence and other analyses.

17.
18.
Behura SK, Severson DW. Gene, 2012, 504(2): 226-232
We present a detailed genome-scale comparative analysis of simple sequence repeats within protein coding regions among 25 insect genomes. The repetitive sequences in the coding regions primarily represented single codon repeats and codon pair repeats. The CAG triplet is highly repetitive in the coding regions of insect genomes. It is frequently paired with the synonymous codon CAA to code for polyglutamine repeats. The codon pairs that are least repetitive code for polyalanine repeats. The frequency of hexanucleotide and dinucleotide motifs of codon pair repeats is significantly (p < 0.001) different in the Drosophila species compared to the non-Drosophila species. However, the frequency of synonymous and non-synonymous codon pair repeats varies in a correlated manner (r² = 0.79) among all the species. Results further show that perfect and imperfect repeats have significant association with the trinucleotide and hexanucleotide coding repeats in most of these insects. However, only select species show significant association between the numbers of perfect/imperfect hexamers and repeat coding for single amino acid/amino acid pair runs. Our data further suggest that genes containing simple sequence coding repeats may be under negative selection as they tend to be poorly conserved across species. The sequences of coding repeats of orthologous genes vary according to the known phylogeny among the species. In conclusion, the study shows that simple sequence coding repeats are important features of genome diversity among insects.
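To make the notion of single codon repeats in the abstract above concrete, the following Python sketch scans a coding sequence in frame and reports runs of identical codons, plus runs of the synonymous glutamine codons CAG and CAA taken together. It is a simplified illustration and not the pipeline used in the study. The function names, the minimum run length of three, and the toy sequence are assumptions for this example.

```python
def split_codons(cds: str) -> list:
    """Split a coding sequence into in-frame codons, dropping a trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def find_runs(cds: str, label=lambda codon: codon, min_repeats: int = 3) -> list:
    """Return (label, first codon index, run length) for each qualifying run.

    Codons whose label is None never form a reported run.
    """
    labels = [label(codon) for codon in split_codons(cds)]
    runs = []
    start = 0
    while start < len(labels):
        end = start
        while end + 1 < len(labels) and labels[end + 1] == labels[start]:
            end += 1
        length = end - start + 1
        if labels[start] is not None and length >= min_repeats:
            runs.append((labels[start], start, length))
        start = end + 1
    return runs


def glutamine_label(codon: str):
    """Label both glutamine codons as 'Q'; everything else is ignored."""
    return "Q" if codon in {"CAG", "CAA"} else None


# Toy coding sequence (hypothetical): ATG, four CAG, one CAA, three GCT, stop.
toy_cds = "ATG" + "CAG" * 4 + "CAA" + "GCT" * 3 + "TAA"
print(find_runs(toy_cds))                         # identical-codon runs
print(find_runs(toy_cds, label=glutamine_label))  # CAG/CAA runs combined
```

Labeling codons before grouping keeps the run-finding loop identical for both cases, so other amino acids could be handled by swapping in a different label function.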

19.
20.
The extant mammalian groups Monotremata, Marsupialia and Placentalia are, according to the 'Theria' hypothesis, traditionally classified into two subclasses. The subclass Prototheria includes the monotremes, and the subclass Theria includes the marsupials and placental mammals. Based on some morphological and molecular data, an alternative proposition, the Marsupionta hypothesis, favours a sister group relationship between monotremes and marsupials to the exclusion of placental mammals. Phylogenetic analyses of single genes and even multiple gene alignments have not yet been able to conclusively resolve this basal mammalian divergence. We have examined this problem using one data set composed of expressed sequence tags (EST) and another containing 1 510 509 nucleotide (nt) sites from 1358 inferred cDNA genomic sequences. All analyses of the concatenated sequences unambiguously supported the Theria hypothesis. The Marsupionta hypothesis was rejected with high statistical confidence from both data sets. In spite of the strong support for Theria, a non-negligible number of single genes supported either of the two alternative hypotheses. The divergence between monotremes and therian mammals was estimated to have taken place 168–178 Mya, a dating compatible with the fossil record. Considering the long common evolutionary branch of therians, it is surprising that sequence data from many thousand amino acid sites were needed to conclusively resolve their relationship to monotremes. This finding draws attention to other mammalian divergences that have been taken as unequivocally settled based on much smaller alignments. EST data provide a comprehensive random sample of protein coding sequences and an economic way to produce large amounts of data for phylogenetic analysis of species for which genomic sequences are not yet available.
