Similar literature
 20 similar documents found (search time: 15 ms)
1.
Recent segmental and gene duplications in the mouse genome   Cited by: 2 (self-citations: 0, citations by others: 2)

Background

The high quality of the mouse genome draft sequence and its associated annotations are an invaluable biological resource. Identifying recent duplications in the mouse genome, especially in regions containing genes, may highlight important events in recent murine evolution. In addition, detecting recent sequence duplications can reveal potentially problematic regions of the genome assembly. We use BLAST-based computational heuristics to identify large (≥ 5 kb) and recent (≥ 90% sequence identity) segmental duplications in the mouse genome sequence. Here we present a database of recently duplicated regions of the mouse genome found in the mouse genome sequencing consortium (MGSC) February 2002 and February 2003 assemblies.
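
The two thresholds named above (length ≥ 5 kb, identity ≥ 90%) can be illustrated with a minimal sketch. It assumes alignment hits arrive in the standard BLAST tabular layout, where the first four tab-separated columns are query ID, subject ID, percent identity, and alignment length. The `Hit` type, the function names, and the sample IDs are hypothetical, and the abstract does not describe the authors' full heuristics, so this shows only the threshold filter, not their pipeline.

```python
from typing import Iterable, List, NamedTuple

# Thresholds taken from the abstract: large (>= 5 kb) and recent (>= 90% identity).
MIN_LENGTH_BP = 5_000
MIN_IDENTITY_PCT = 90.0


class Hit(NamedTuple):
    """One pairwise alignment (hypothetical container for this sketch)."""
    query: str
    subject: str
    identity_pct: float
    length_bp: int


def parse_tabular(lines: Iterable[str]) -> List[Hit]:
    """Parse tab-separated hits: query, subject, % identity, alignment length."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        hits.append(Hit(fields[0], fields[1], float(fields[2]), int(fields[3])))
    return hits


def recent_large_duplications(hits: Iterable[Hit]) -> List[Hit]:
    """Keep only hits that satisfy both the length and identity thresholds."""
    return [
        h for h in hits
        if h.length_bp >= MIN_LENGTH_BP and h.identity_pct >= MIN_IDENTITY_PCT
    ]


# Made-up sample hits, for illustration only.
sample = [
    "chr1_a\tchr7_b\t95.2\t6200",   # long and similar: kept
    "chr2_a\tchr2_b\t88.0\t9000",   # long but below 90% identity: dropped
    "chr3_a\tchrX_b\t99.1\t1200",   # similar but shorter than 5 kb: dropped
]
kept = recent_large_duplications(parse_tabular(sample))
print([h.query for h in kept])
```

A real analysis would also need to merge overlapping hits and mask common repeats; those steps are outside what the abstract specifies.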

Results

We determined that 33.6 Mb of 2,695 Mb (1.2%) of sequence from the February 2003 mouse genome sequence assembly is involved in recent segmental duplications, which is less than that observed in the human genome (around 3.5-5%). From this dataset, 8.9 Mb (26%) of the duplication content consisted of 'unmapped' chromosome sequence. Moreover, we suspect that an additional 18.5 Mb of sequence is involved in duplication artifacts arising from sequence misassignment errors in this genome assembly. By searching for genes that are located within these regions, we identified 675 genes that mapped to duplicated regions of the mouse genome. Sixteen of these genes appear to have been duplicated independently in the human genome. From our dataset we further characterized a 42 kb recent segmental duplication of Mater, a maternal-effect gene essential for embryogenesis in mice.
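
The percentages quoted above follow directly from the reported sizes. A small sketch that checks them, using only the figures given in the abstract (the helper name `percent` is an illustrative choice):

```python
def percent(part_mb: float, whole_mb: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part_mb / whole_mb


duplicated_mb = 33.6    # sequence involved in recent segmental duplications
assembly_mb = 2695.0    # February 2003 mouse assembly size
unmapped_mb = 8.9       # duplicated sequence on 'unmapped' chromosomes

# Share of the assembly that is recently duplicated.
print(f"{percent(duplicated_mb, assembly_mb):.1f}%")   # 1.2%

# Share of the duplication content that is unmapped sequence.
print(f"{percent(unmapped_mb, duplicated_mb):.0f}%")   # 26%
```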

Conclusion

Our results provide an initial analysis of the recently duplicated sequence and gene content of the mouse genome. Many of these duplicated loci, as well as regions identified to be involved in potential sequence misassignment errors, will require further mapping and sequencing to achieve accuracy. A Genome Browser database was set up to display the identified duplication content presented in this work. This data will also be relevant to the growing number of investigators who use the draft genome sequence for experimental design and analysis.

2.
Genome duplications may have played a role in the early stages of vertebrate evolution, near the time of divergence of the lamprey lineage. Additional genome duplication, specifically in ray-finned fish, may have occurred before the divergence of the teleosts. The common carp (Cyprinus carpio) has been considered tetraploid because of its chromosome number (2n = 100) and its high DNA content. We studied variation using 59 microsatellite primer pairs to better understand the ploidy level of the common carp. Based on the number of PCR amplicons per individual, about 60% of these primer pairs are estimated to amplify duplicates. Segregation patterns in families suggested a partially duplicated genome structure and disomic inheritance. This could suggest that the common carp is tetraploid and that polyploidy occurred by hybridization (allotetraploidy). From sequences of microsatellite flanking regions, we estimated the difference per base between pairs of alleles and between pairs of paralogs. The distribution of differences between paralogs had two distinct modes suggesting one whole-genome duplication and a more recent wave of segmental duplications. The genome duplication was estimated to have occurred about 12 MYA, with the segmental duplications occurring between 2.3 and 6.8 MYA. At 12 MYA, this would be one of the most recent genome duplications among vertebrates. Phylogenetic analysis of several cyprinid species suggests an evolutionary model for this tetraploidization, with a role for polyploidization in speciation and diversification.

3.
4.
Complete genome doubling has long-term consequences for the genome structure and the subsequent evolution of an organism. It has been suggested that two genome duplications occurred at the origin of vertebrates (known as the 2R hypothesis). However, there has been considerable debate as to whether these were two successive duplications, or whether a single duplication occurred, followed by large-scale segmental duplications. In this article, we review and compare the evidence for the 2R duplications from vertebrate genomes with similar data from other more recent polyploids.

5.
Using the extensive segmental duplications of the Arabidopsis thaliana genome, a comparative study of homoeologous segments occurring in chromosomes 1, 2, 4 and 5 was performed. The gene-by-gene BLASTP approach was applied to identify duplicated genes in homoeologues. The levels of synonymous substitutions between duplicated coding sequences suggest that these regions were formed by at least two rounds of duplications. Moreover, remnants of even more ancient duplication events were recognised by a whole-genome study. We describe a subchromosomal organisation of genes, including the tandemly repeated genes, and the distribution of transposable elements (TEs). In certain cases, evidence of the possible mechanisms of structural rearrangements within the segments could be found. We provide a probable scenario of the rearrangements that took place during the evolution of the homoeologous regions. Furthermore, on the basis of the comparative analysis of the chromosomal segments in the Columbia and Landsberg erecta accessions, an additional structural variation in the A. thaliana genome is described. Analysis of the segments, spanning 7 Mb or 5.6% of the genome, permitted us to propose a model of evolution at the subchromosomal level.

6.
About 5% of the human genome consists of large-scale duplicated segments of almost identical sequences. Segmental duplications (SDs) have been proposed to be involved in non-allelic homologous recombination leading to recurrent genomic variation and disease. It has also been suggested that these SDs are associated with syntenic rearrangements that have shaped the human genome. We have analyzed 14 members of a single family of closely related SDs in the human genome, some of which are associated with common inversion polymorphisms at chromosomes 8p23 and 4p16. Comparative analysis with the mouse genome revealed syntenic inversions for these two human polymorphic loci. In addition, 12 of the 14 SDs, while absent in the mouse genome, occur at the breaks of synteny, suggesting a non-random involvement of these sequences in genome evolution. Furthermore, we observed a syntenic familial relationship between 8 and 12 breakpoint-loci, where broken synteny that ends at one family member resumes at another, even across different chromosomes. Subsequent genome-wide assessment revealed that this relationship, which we named continuation-of-synteny, is not limited to the 8p23 family and occurs 46 times in the human genome with high frequency at specific chromosomes. Our analysis supports a non-random breakage model of genomic evolution with an active involvement of segmental duplications for specific regions of the human genome.

7.
The genome sequence of the plant model organism Arabidopsis thaliana was presented in December of the year 2000. Since then, the 125 Mb sequence has revealed many of its evolutionary secrets. Through comparative analyses with other plant genomes, we know that the genome of A. thaliana, or better that of its ancestors, has undergone at least three whole genome duplications during the last 120 or so million years. The first duplication seems to have occurred at the dawn of dicot evolution, while the later duplications probably occurred <70 million years ago (Ma). One of those younger genome-wide duplications might be linked to the K-T extinction. Following these duplication events, the ancestral A. thaliana genome was hugely rearranged and gene copies have been massively lost. During the last 10 million years of its evolution, almost half of its genome was lost due to hundreds of thousands of small deletions. Here, we reconstruct plant genome evolution from the early angiosperm ancestor to the current A. thaliana genome, covering about 150 million years of evolution characterized by gene and genome duplications, genome rearrangements and genome reduction.

8.
Genome Duplication in Soybean (Glycine Subgenus Soja)   Cited by: 8 (self-citations: 1, citations by others: 8)
Restriction fragment length polymorphism mapping data from nine populations (Glycine max × G. soja and G. max × G. max) of the Glycine subgenus soja genome led to the identification of many duplicated segments of the genome. Linkage groups contained up to 33 markers that were duplicated on other linkage groups. The size of homoeologous regions ranged from 1.5 to 106.4 cM, with an average size of 45.3 cM. We observed segments in the soybean genome that were present in as many as six copies with an average of 2.55 duplications per segment. The presence of nested duplications suggests that at least one of the original genomes may have undergone an additional round of tetraploidization. Tetraploidization, along with large internal duplications, accounts for the highly duplicated nature of the genome of the subgenus. Quantitative trait loci for seed protein and oil showed correspondence across homoeologous regions, suggesting that the genes or gene families contributing to seed composition have retained similar functions throughout the evolution of the chromosomes.

9.
The nuclear ribosomal locus coding for the large subunit is represented in tandem arrays in the plant genome. These consecutive gene blocks, consisting of several regions, are widely applied in plant phylogenetics. The regions coding for the subunits of the rRNA have the lowest rate of evolution. The spacer regions, such as the internal transcribed spacers (ITS) and external transcribed spacers (ETS), are also widely utilized in phylogenetics. The fact that these regions are present in many copies in the plant genome is an advantage for laboratory practice but might be a problem for phylogenetic analysis. Besides routine usage, the rDNA regions provide great potential for studying complex evolutionary mechanisms, such as reticulate events or array duplications. The understanding of these processes is based on the observation that the multiple copies of rDNA regions are homogenized through concerted evolution. This phenomenon results in paralogous copies, which can be misleading when incorporated in phylogenetic analyses. The fact that non-functional copies or pseudogenes can coexist with orthologues in a single individual also makes the analysis difficult. This article summarizes the information about the structure and utility of the phylogenetically informative spacer regions of the rDNA, namely the internal and external transcribed spacer regions as well as the intergenic spacer (IGS).

10.
The review considers the structure, evolution, and possible mechanisms of spreading of intrachromosomal and interchromosomal segment duplications (SD), which account for more than 5% of the human genome. Most SD are mosaic and consist of multiple modules, which occur in several copies in different genome regions. SD are preferentially located in pericentric and subtelomeric regions, which are the least studied regions of the human chromosomes. Homologous recombination between SD results in various chromosome rearrangements, contributing to genome instability and the origin of several human hereditary disorders.

11.
Patterns of segmental duplication in the human genome   Cited by: 12 (self-citations: 0, citations by others: 12)
We analyzed the completed human genome for recent segmental duplications (size ≥ 1 kb and sequence similarity ≥ 90%). We found that approximately 4% of the genome is covered by duplications and that the extent of segmental duplication varies from 1% to 14% among the 24 chromosomes. Intrachromosomal duplication is more frequent than interchromosomal duplication in 15 chromosomes. The duplication frequencies in pericentromeric and subtelomeric regions are greater than the genome average by approximately threefold and fourfold. We examined factors that may affect the frequency of duplication in a region. Within individual chromosomes, the duplication frequency shows little correlation with local gene density, repeat density, recombination rate, and GC content, except chromosomes 7 and Y. For the entire genome, the duplication frequency is correlated with each of the above factors. Based on known genes and Ensembl genes, the proportion of duplications containing complete genes is 3.4% and 10.7%, respectively. The proportion of duplications containing genes is higher in intrachromosomal than in interchromosomal duplications, and duplications containing genes have a higher sequence similarity and tend to be longer than duplications containing no genes. Our simulation suggests that many duplications containing genes have been selectively maintained in the genome.

12.
Simple Sequence Repeats (SSRs) represent short tandem duplications found within all eukaryotic organisms. To examine the distribution of SSRs in the genome of Brassica rapa ssp. pekinensis, SSRs from different genomic regions representing 17.7 Mb of genomic sequence were surveyed. SSRs appear more abundant in non-coding regions (86.6%) than in coding regions (13.4%). Comparison of SSR densities in different genomic regions demonstrated that SSR density was greatest within the 5'-flanking regions of the predicted genes. The proportion of different repeat motifs varied between genomic regions, with trinucleotide SSRs more prevalent in predicted coding regions, reflecting the codon structure in these regions. SSRs were also preferentially associated with gene-rich regions, with peri-centromeric heterochromatin SSRs mostly associated with retrotransposons. These results indicate that the distribution of SSRs in the genome is non-random. Comparison of SSR abundance between B. rapa and the closely related species Arabidopsis thaliana suggests a greater abundance of SSRs in B. rapa, which may be due to the proposed genome triplication. Our results provide a comprehensive view of SSR genomic distribution and evolution in Brassica for comparison with the sequenced genomes of A. thaliana and Oryza sativa.

13.
Lacroix M.-H., Oparina N. Yu., Mashkova T. D. Molecular Biology, 2003, 37(2): 186-193
The review considers the structure, evolution, and possible mechanisms of formation and spreading of intrachromosomal and interchromosomal segmental duplications (SD), which account for more than 5% of the human genome. Most SD consist of multiple modules, which occur in several copies in different genome regions. SD are preferentially located in pericentric and subtelomeric regions, which are the least studied regions of the human chromosomes. Homologous recombination between SD results in various chromosome rearrangements, contributing to genome instability and the origin of several human hereditary disorders.

14.
Subtelomeric duplications of an obscure tubulin "genic" segment located near the telomere of human chromosome 4q35 have occurred at different evolutionary time points within the last 25 million years of catarrhine (i.e., hominoid and Old World monkey) evolution. The analyses of these segments reported here indicate an exceptional level of evolutionary instability. Substantial intra- and interspecific differences in copy number and distribution are observed among cercopithecoid (Old World monkey) and hominoid genomes. Characterization of the hominoid duplicated segments reveals a strong positional bias within pericentromeric and subtelomeric regions of the genome. On the basis of phylogenetic analysis from predicted proteins and comparisons of nucleotide-substitution rates, we present evidence of a conserved β-tubulin gene among the duplications. Remarkably, the evolutionary conservation has occurred in a nonorthologous fashion, such that the functional copy has shifted its positional context between hominoids and cercopithecoids. We propose that, in a chimpanzee-human common ancestor, one of the paralogous copies assumed the original function, whereas the ancestral copy acquired mutations and eventually became silenced. Our analysis emphasizes the dynamic nature of duplication-mediated genome evolution and the delicate balance between gene acquisition and silencing.

15.
We present a detailed molecular evolutionary analysis of 1.2 Mb from the pericentromeric region of human 15q11. Sequence analysis indicates the region has been subject to extensive interchromosomal and intrachromosomal duplications during primate evolution. Comparative FISH analyses among non-human primates show remarkable quantitative and qualitative differences in the organization and duplication history of this region, including lineage-specific deletions and duplication expansions. Phylogenetic and comparative analyses reveal that the region is composed of at least 24 distinct segmental duplications or duplicons that have populated the pericentromeric regions of the human genome over the last 40 million years of human evolution. The value of combining both cytogenetic and experimental data in understanding the complex forces which have shaped these regions is discussed.

16.
Neural crest cells are an important cell type present in all vertebrates, and elaboration of the neural crest is thought to have been a key factor in their evolutionary success. Genomic comparisons suggest there were two major genome duplications in early vertebrate evolution, raising the possibility that evolution of neural crest was facilitated by gene duplications. Here, we review the process of early neural crest formation and its underlying gene regulatory network (GRN) as well as the evolution of important neural crest derivatives. In this context, we assess the likelihood that gene and genome duplications capacitated neural crest evolution, particularly in light of novel data arising from invertebrate chordates.

17.
The grass family comprises the most important cereal crops and is a good system for studying, with comparative genomics, mechanisms of evolution, speciation, and domestication. Here, we identified and characterized the evolution of shared duplications in the rice (Oryza sativa) and wheat (Triticum aestivum) genomes by comparing 42,654 rice gene sequences with 6426 mapped wheat ESTs using improved sequence alignment criteria and statistical analysis. Intraspecific comparisons identified 29 interchromosomal duplications covering 72% of the rice genome and 10 duplication blocks covering 67.5% of the wheat genome. Using the same methodology, we assessed orthologous relationships between the two genomes and detected 13 blocks of colinearity that represent 83.1 and 90.4% of the rice and wheat genomes, respectively. Integration of the intraspecific duplications data with colinearity relationships revealed seven duplicated segments conserved at orthologous positions. A detailed analysis of the length, composition, and divergence time of these duplications and comparisons with sorghum (Sorghum bicolor) and maize (Zea mays) indicated common and lineage-specific patterns of conservation between the different genomes. This allowed us to propose a model in which the grass genomes have evolved from a common ancestor with a basic number of five chromosomes through a series of whole genome and segmental duplications, chromosome fusions, and translocations.

18.
The aims of the study were to outline the sequence of events that gave rise to the vertebrate insulin-relaxin gene family and the chromosomal regions in which they reside. We analyzed the gene content surrounding the human insulin/relaxin genes with respect to what family they belonged to and if the duplication history of investigated families parallels the evolution of the insulin-relaxin family members. Markov Clustering and phylogenetic analysis were used to determine family identity. More than 15% of the genes belonged to families that have paralogs in the regions, defining two sets of quadruplicate paralogy regions. Thereby, the localization of insulin/relaxin genes in humans is in accordance with those regions on human chromosomes 1, 11, 12, 19q (insulin/insulin-like growth factors) and 1, 6p/15q, 9/5, 19p (insulin-like factors/relaxins) were formed during two genome duplications. We compared the human genome with that of Ciona intestinalis, a species that split from the vertebrate lineage before the two suggested genome duplications. Two insulin-like orthologs were discovered in addition to the already described Ci-insulin gene. Conserved synteny between the Ciona regions hosting the insulin-like genes and the two sets of human paralogons implies their common origin. Linkage of the two human paralogons, as seen in human chromosome 1, as well as the two regions hosting the Ciona insulin-like genes suggests that a segmental duplication gave rise to the region prior to the genome doublings. Thus, preserved gene content provides support that genome duplication(s) in addition to segmental and single-gene duplications shaped the genomes of extant vertebrates.

19.
Owing to great progress in studying the human genome, its euchromatic portion is almost completely sequenced; the complete sequence is still unknown only for pericentric and telomeric regions and short arms of acrocentric chromosomes. Extended satellite blocks and segmental duplications located in these regions substantially hinder the joining of the sequenced fragments and construction of the full-length genome map. The sequence was established for a 1.5-kb human chromosome 13 subtelomeric region, which is about 10 kb away from the rDNA cluster, and deposited in GenBank under accession no. AF478540. The region showed 83–84% homology to the pericentric region of human chromosome 19, and contained short fragments homologous to the pericentric region of human chromosome 13. The results may contribute to the current revision of genome evolution concepts in view of the numerous segmental duplications revealed.

20.
Owing to great progress in studying the human genome, its euchromatic portion is almost completely sequenced; the complete sequence is still unknown only for pericentric and telomeric regions and short arms of acrocentric chromosomes. Extended satellite blocks and segment duplications located in these regions substantially hinder the joining of the sequenced fragments and construction of the full-length genome map. The sequence was established for a 1.5-kb human chromosome 13 subtelomeric region, which is about 10 kb away from the rDNA cluster, and deposited in GenBank under accession no. AF478540. The region showed 83-84% homology to the pericentric region of human chromosome 19, and contained short fragments homologous to the pericentric region of human chromosome 13. The results may contribute to the current revision of genome evolution concepts in view of the numerous segment duplications revealed.
