Similar articles
 20 similar articles found (search time: 31 ms)
1.
Recent segmental and gene duplications in the mouse genome   Cited by: 2 (self-citations: 0; citations by others: 2)

Background

The high quality of the mouse genome draft sequence and its associated annotations are an invaluable biological resource. Identifying recent duplications in the mouse genome, especially in regions containing genes, may highlight important events in recent murine evolution. In addition, detecting recent sequence duplications can reveal potentially problematic regions of the genome assembly. We use BLAST-based computational heuristics to identify large (≥ 5 kb) and recent (≥ 90% sequence identity) segmental duplications in the mouse genome sequence. Here we present a database of recently duplicated regions of the mouse genome found in the mouse genome sequencing consortium (MGSC) February 2002 and February 2003 assemblies.
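The size and identity thresholds in the paragraph above can be illustrated with a minimal sketch. It assumes that pairwise alignment hits have already been produced (for example by BLAST) and parsed into simple records; the `Hit` record, its field names, and `filter_recent_duplications` are hypothetical names chosen for illustration, and the study's actual heuristics are not described in this abstract.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Hit:
    """Hypothetical pairwise alignment record (not the authors' format)."""
    query: str
    subject: str
    length_bp: int           # aligned length in base pairs
    percent_identity: float  # 0-100


def filter_recent_duplications(
    hits: Iterable[Hit],
    min_length_bp: int = 5_000,   # ">= 5 kb" threshold quoted in the abstract
    min_identity: float = 90.0,   # ">= 90% sequence identity" threshold
) -> List[Hit]:
    """Keep only hits that are both long enough and similar enough."""
    return [
        h for h in hits
        if h.length_bp >= min_length_bp and h.percent_identity >= min_identity
    ]


# Toy input: made-up coordinates, for illustration only.
hits = [
    Hit("regionA", "regionB", 42_000, 97.5),  # passes both thresholds
    Hit("regionC", "regionD", 3_200, 99.0),   # too short
    Hit("regionE", "regionF", 12_000, 84.0),  # too divergent
]
candidates = filter_recent_duplications(hits)
print([(h.query, h.subject) for h in candidates])
```

A real pipeline would also need to discard trivial self-alignments and merge fragmented hits; those steps are omitted here.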

Results

We determined that 33.6 Mb of 2,695 Mb (1.2%) of sequence from the February 2003 mouse genome sequence assembly is involved in recent segmental duplications, which is less than that observed in the human genome (around 3.5-5%). From this dataset, 8.9 Mb (26%) of the duplication content consisted of 'unmapped' chromosome sequence. Moreover, we suspect that an additional 18.5 Mb of sequence is involved in duplication artifacts arising from sequence misassignment errors in this genome assembly. By searching for genes that are located within these regions, we identified 675 genes that mapped to duplicated regions of the mouse genome. Sixteen of these genes appear to have been duplicated independently in the human genome. From our dataset we further characterized a 42 kb recent segmental duplication of Mater, a maternal-effect gene essential for embryogenesis in mice.

Conclusion

Our results provide an initial analysis of the recently duplicated sequence and gene content of the mouse genome. Many of these duplicated loci, as well as regions identified to be involved in potential sequence misassignment errors, will require further mapping and sequencing to be resolved accurately. A Genome Browser database was set up to display the identified duplication content presented in this work. These data will also be relevant to the growing number of investigators who use the draft genome sequence for experimental design and analysis.

2.
An unexpected finding of the human genome was the large fraction of the genome organized as blocks of interspersed duplicated sequence. We provide a comparative and phylogenetic analysis of a highly duplicated region of 16p12.2, which is composed of at least four different segmental duplications spanning in excess of 160 kb. We contrast the dispersal of two different segmental duplications (LCR16a and LCR16u). LCR16a, a 20 kb low-copy repeat sequence A from chromosome 16, was shown previously to contain a rapidly evolving novel hominoid gene family (morpheus) that had expanded within the last 10 million years of great ape/human evolution. We compare the dispersal of this genomic segment with a second adjacent duplication called LCR16u. The duplication contains a second putative gene family (KIAA0220/SMG1) that is represented approximately eight times within the human genome. A high degree of sequence identity (approximately 98%) was observed among the various copies of LCR16u. Comparative analyses with Old World monkey species show that LCR16a and LCR16u originated from two distinct ancestral loci. Within the human genome, at least 70% of the LCR16u copies were duplicated in concert with the LCR16a duplication. In contrast, only 30% of the chimpanzee loci show an association between LCR16a and LCR16u duplications. The data suggest that the two copies of genomic sequence were brought together during the chimpanzee/human divergence and were subsequently duplicated as a larger cassette specifically within the human lineage. The evolutionary history of these two chromosome-specific duplications supports a model of rapid expansion and evolutionary turnover among the genomes of man and the great apes.

3.
Despite considerable advances in sequencing of the human genome over the past few years, the organization and evolution of human pericentromeric regions have been difficult to resolve. This is due, in part, to the presence of large, complex blocks of duplicated genomic sequence at the boundary between centromeric satellite and unique euchromatic DNA. Here, we report the identification and characterization of an approximately 49-kb repeat sequence that exists in more than 40 copies within the human genome. This repeat is specific to highly duplicated pericentromeric regions with multiple copies distributed in an interspersed fashion among a subset of human chromosomes. Using this interspersed repeat (termed PIR4) as a marker of pericentromeric DNA, we recovered and sequence-tagged 3 Mb of pericentromeric DNA from a variety of human chromosomes as well as nonhuman primate genomes. A global evolutionary reconstruction of the dispersal of PIR4 sequence and analysis of flanking sequence supports a model in which pericentromeric duplications initiated before the separation of the great ape species (>12 MYA). Further, analyses of this duplication and associated flanking duplications narrow the major burst of pericentromeric duplication activity to a time just before the divergence of the African great ape and human species (5 to 7 MYA). These recent duplication exchange events substantially restructured the pericentromeric regions of hominoid chromosomes and created an architecture where large blocks of sequence are shared among nonhomologous chromosomes. This report provides the first global view of the series of historical events that have reshaped human pericentromeric regions over recent evolutionary time.

4.
An estimated 5% of the human genome consists of interspersed duplications that have arisen over the past 35 million years of evolution. Two categories of such recently duplicated segments can be distinguished: segmental duplications between nonhomologous chromosomes (transchromosomal duplications) and duplications mainly restricted to a particular chromosome (chromosome-specific duplications). Many of these duplications exhibit an extraordinarily high degree of sequence identity at the nucleotide level (>95%) and span large genomic distances (1-100 kb). Preliminary analyses indicate that these same regions are targets for rapid evolutionary turnover among the genomes of closely related primates. The dynamic nature of these regions because of recurrent chromosomal rearrangement, and their ability to create fusion genes from juxtaposed cassettes suggest that duplicative transposition was an important force in the evolution of our genome.

5.
Patterns of segmental duplication in the human genome   Cited by: 12 (self-citations: 0; citations by others: 12)
We analyzed the completed human genome for recent segmental duplications (size ≥ 1 kb and sequence similarity ≥ 90%). We found that approximately 4% of the genome is covered by duplications and that the extent of segmental duplication varies from 1% to 14% among the 24 chromosomes. Intrachromosomal duplication is more frequent than interchromosomal duplication in 15 chromosomes. The duplication frequencies in pericentromeric and subtelomeric regions are greater than the genome average by approximately threefold and fourfold, respectively. We examined factors that may affect the frequency of duplication in a region. Within individual chromosomes, the duplication frequency shows little correlation with local gene density, repeat density, recombination rate, and GC content, except on chromosomes 7 and Y. For the entire genome, the duplication frequency is correlated with each of the above factors. Based on known genes and Ensembl genes, the proportion of duplications containing complete genes is 3.4% and 10.7%, respectively. The proportion of duplications containing genes is higher in intrachromosomal than in interchromosomal duplications, and duplications containing genes have a higher sequence similarity and tend to be longer than duplications containing no genes. Our simulation suggests that many duplications containing genes have been selectively maintained in the genome.
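A coverage figure such as "approximately 4% of the genome is covered by duplications" requires counting each duplicated base once, even where duplicated intervals overlap. The sketch below shows that bookkeeping under stated assumptions: duplications are given as half-open coordinate intervals on a single sequence, and the function names and toy coordinates are illustrative, not taken from the study.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in base pairs


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or abuts) the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_fraction(intervals: Iterable[Interval], sequence_length: int) -> float:
    """Fraction of the sequence covered by at least one interval."""
    covered = sum(end - start for start, end in merge_intervals(intervals))
    return covered / sequence_length


# Toy example: two overlapping duplications and one separate one.
toy = [(100, 400), (300, 600), (900, 1000)]
fraction = covered_fraction(toy, sequence_length=10_000)
print(merge_intervals(toy), fraction)
```

Summing raw interval lengths instead would count the overlap between the first two intervals twice and overstate coverage.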

6.
Genome duplications may have played a role in the early stages of vertebrate evolution, near the time of divergence of the lamprey lineage. Additional genome duplication, specifically in ray-finned fish, may have occurred before the divergence of the teleosts. The common carp (Cyprinus carpio) has been considered tetraploid because of its chromosome number (2n = 100) and its high DNA content. We studied variation using 59 microsatellite primer pairs to better understand the ploidy level of the common carp. Based on the number of PCR amplicons per individual, about 60% of these primer pairs are estimated to amplify duplicates. Segregation patterns in families suggested a partially duplicated genome structure and disomic inheritance. This could suggest that the common carp is tetraploid and that polyploidy occurred by hybridization (allotetraploidy). From sequences of microsatellite flanking regions, we estimated the difference per base between pairs of alleles and between pairs of paralogs. The distribution of differences between paralogs had two distinct modes suggesting one whole-genome duplication and a more recent wave of segmental duplications. The genome duplication was estimated to have occurred about 12 MYA, with the segmental duplications occurring between 2.3 and 6.8 MYA. At 12 MYA, this would be one of the most recent genome duplications among vertebrates. Phylogenetic analysis of several cyprinid species suggests an evolutionary model for this tetraploidization, with a role for polyploidization in speciation and diversification.

7.
8.
Duplicated genes produce genetic variation that can influence the evolution of genomes and phenotypes. In most cases, for a duplicated gene to contribute to evolutionary novelty it must survive the early stages of divergence from its paralog without becoming a pseudogene. I examined the evolutionary dynamics of recently duplicated genes in the Drosophila pseudoobscura genome to understand the factors affecting these early stages of evolution. Paralogs located in closer proximity have higher sequence identity. This suggests that gene conversion occurs more often between duplications in close proximity or that there is more genetic independence between distant paralogs. Partially duplicated genes have a higher likelihood of pseudogenization than completely duplicated genes, but no single factor significantly contributes to the selective constraints on a completely duplicated gene. However, DNA-based duplications and duplications within chromosome arms tend to produce longer duplication tracts than retroposed and inter-arm duplications, and longer duplication tracts are more likely to contain a completely duplicated gene. Therefore, the relative position of paralogs and the mechanism of duplication indirectly affect whether a duplicated gene is retained or pseudogenized.

9.
Large chromosomal events such as translocations and segmental duplications enable rapid adaptation to new environments. Here we marshal genomic, genetic, meiotic mapping, and physical evidence to demonstrate that a chromosomal translocation and segmental duplication occurred during construction of a congenic strain pair in the fungal human pathogen Cryptococcus neoformans. Two chromosomes underwent telomere-telomere fusion, generating a dicentric chromosome that broke to produce a chromosomal translocation, forming two novel chromosomes sharing a large segmental duplication. The duplication spans 62,872 identical nucleotides and generated a second copy of 22 predicted genes, and we hypothesize that this event may have occurred during meiosis. Gene disruption studies of one embedded gene (SMG1) corroborate that this region is duplicated in an otherwise haploid genome. These findings resolve a genome project assembly anomaly and illustrate an example of rapid genome evolution in a fungal genome rich in repetitive elements.

10.
Oparina N. Yu., Lacroix M.-H., Rychkov A. A., Mashkova T. D. Molecular Biology, 2003, 37(2): 200-204
Intrachromosomal and interchromosomal segmental duplications account for more than 5% of the human genome. To analyze the processes resulting in the complex mosaic structure of duplicons, a draft human genome sequence was searched for duplicated segments of a genomic fragment of the pericentric region of the chromosome 21 short arm. The duplicons found consist of modules having paralogs in various genome regions. Module ends are flanked with various tandem or interspersed repeats, which are more unstable as compared with unique sequences. In most cases, the boundaries of duplicated segments exactly coincide with or are in close proximity to hot spots of various rearrangements within repeats or boundaries between repeats and unique sequences or between two different repeats. Homologous recombination between repetitive elements was assumed to be the major mechanism contributing to the mosaic structure of duplicons.

12.
We present a detailed molecular evolutionary analysis of 1.2 Mb from the pericentromeric region of human 15q11. Sequence analysis indicates the region has been subject to extensive interchromosomal and intrachromosomal duplications during primate evolution. Comparative FISH analyses among non-human primates show remarkable quantitative and qualitative differences in the organization and duplication history of this region - including lineage-specific deletions and duplication expansions. Phylogenetic and comparative analyses reveal that the region is composed of at least 24 distinct segmental duplications or duplicons that have populated the pericentromeric regions of the human genome over the last 40 million years of human evolution. The value of combining both cytogenetic and experimental data in understanding the complex forces which have shaped these regions is discussed.

13.
Segmental duplications and copy-number variation in the human genome   Cited by: 33 (self-citations: 0; citations by others: 33)
The human genome contains numerous blocks of highly homologous duplicated sequence. This higher-order architecture provides a substrate for recombination and recurrent chromosomal rearrangement associated with genomic disease. However, an assessment of the role of segmental duplications in normal variation has not yet been made. On the basis of the duplication architecture of the human genome, we defined a set of 130 potential rearrangement hotspots and constructed a targeted bacterial artificial chromosome (BAC) microarray (with 2,194 BACs) to assess copy-number variation in these regions by array comparative genomic hybridization. Using our segmental duplication BAC microarray, we screened a panel of 47 normal individuals, who represented populations from four continents, and we identified 119 regions of copy-number polymorphism (CNP), 73 of which were previously unreported. We observed an equal frequency of duplications and deletions, as well as a 4-fold enrichment of CNPs within hotspot regions, compared with control BACs (P < .000001), which suggests that segmental duplications are a major catalyst of large-scale variation in the human genome. Importantly, segmental duplications themselves were also significantly enriched >4-fold within regions of CNP. Almost without exception, CNPs were not confined to a single population, suggesting that these either are recurrent events, having occurred independently in multiple founders, or were present in early human populations. Our study demonstrates that segmental duplications define hotspots of chromosomal rearrangement, likely acting as mediators of normal variation as well as genomic disease, and it suggests that the consideration of genomic architecture can significantly improve the ascertainment of large-scale rearrangements. 
Our specialized segmental duplication BAC microarray and associated database of structural polymorphisms will provide an important resource for the future characterization of human genomic disorders.

14.
The grass family comprises the most important cereal crops and is a good system for studying, with comparative genomics, mechanisms of evolution, speciation, and domestication. Here, we identified and characterized the evolution of shared duplications in the rice (Oryza sativa) and wheat (Triticum aestivum) genomes by comparing 42,654 rice gene sequences with 6426 mapped wheat ESTs using improved sequence alignment criteria and statistical analysis. Intraspecific comparisons identified 29 interchromosomal duplications covering 72% of the rice genome and 10 duplication blocks covering 67.5% of the wheat genome. Using the same methodology, we assessed orthologous relationships between the two genomes and detected 13 blocks of colinearity that represent 83.1 and 90.4% of the rice and wheat genomes, respectively. Integration of the intraspecific duplications data with colinearity relationships revealed seven duplicated segments conserved at orthologous positions. A detailed analysis of the length, composition, and divergence time of these duplications and comparisons with sorghum (Sorghum bicolor) and maize (Zea mays) indicated common and lineage-specific patterns of conservation between the different genomes. This allowed us to propose a model in which the grass genomes have evolved from a common ancestor with a basic number of five chromosomes through a series of whole genome and segmental duplications, chromosome fusions, and translocations.

15.
16.
Standard methods of DNA sequence analysis assume that sequences evolve independently, yet this assumption may not be appropriate for segmental duplications that exchange variants via interlocus gene conversion (IGC). Here, we use high quality multiple sequence alignments from well-annotated segmental duplications to systematically identify IGC signals in the human reference genome. Our analysis combines two complementary methods: (i) a paralog quartet method that uses DNA sequence simulations to identify a statistical excess of sites consistent with inter-paralog exchange, and (ii) the alignment-based method implemented in the GENECONV program. One-quarter (25.4%) of the paralog families in our analysis harbor clear IGC signals by the quartet approach. Using GENECONV, we identify 1477 gene conversion tracks that cumulatively span 1.54 Mb of the genome. Our analyses confirm the previously reported high rates of IGC in subtelomeric regions and Y-chromosome palindromes, and identify multiple novel IGC hotspots, including the pregnancy specific glycoproteins and the neuroblastoma breakpoint gene families. Although the duplication history of a paralog family is described by a single tree, we show that IGC has introduced incredible site-to-site variation in the evolutionary relationships among paralogs in the human genome. Our findings indicate that IGC has left significant footprints in patterns of sequence diversity across segmental duplications in the human genome, out-pacing the contributions of single base mutation by orders of magnitude. Collectively, the IGC signals we report comprise a catalog that will provide a critical reference for interpreting observed patterns of DNA sequence variation across duplicated genomic regions, including targets of recent adaptive evolution in humans.

17.
Approximately 5% of the human genome consists of segmental duplications that can cause genomic mutations and may play a role in gene innovation. Reticulate evolutionary processes, such as unequal crossing-over and gene conversion, are known to occur within specific duplicon families, but the broader contribution of these processes to the evolution of human duplications remains poorly characterized. Here, we use phylogenetic profiling to analyze multiple alignments of 24 human duplicon families that span >8 Mb of DNA. Our results indicate that none of them are evolving independently, with all alignments showing sharp discontinuities in phylogenetic signal consistent with reticulation. To analyze these results in more detail, we have developed a quartet method that estimates the relative contribution of nucleotide substitution and reticulate processes to sequence evolution. Our data indicate that most of the duplications show a highly significant excess of sites consistent with reticulate evolution, compared with the number expected by nucleotide substitution alone, with 15 of 30 alignments showing a >20-fold excess over that expected. Using permutation tests, we also show that at least 5% of the total sequence shares 100% sequence identity because of reticulation, a figure that includes 74 independent tracts of perfect identity >2 kb in length. Furthermore, analysis of a subset of alignments indicates that the density of reticulation events is as high as 1 every 4 kb. These results indicate that phylogenetic relationships within recently duplicated human DNA can be rapidly disrupted by reticulate evolution. This finding has important implications for efforts to finish the human genome sequence, complicates comparative sequence analysis of duplicon families, and could profoundly influence the tempo of gene-family evolution.

18.
Duplicated pseudogenes in the human genome are disabled copies of functioning parent genes. They result from block duplication events occurring throughout evolutionary history. Relatively recent duplications (with sequence similarity ≥90% and length ≥1 kb) are termed segmental duplications (SDs); here, we analyze the interrelationship of SDs and pseudogenes. We present a decision-tree approach to classify pseudogenes based on their (and their parents’) characteristics in relation to SDs. The classification identifies 140 novel pseudogenes and makes possible improved annotation for the 3172 pseudogenes located in SDs. In particular, it reveals that many pseudogenes in SDs likely did not arise directly from parent genes, but are the result of a multi-step process. In these cases, the initial duplication or retrotransposition of a parent gene gives rise to a ‘parent pseudogene’, followed by further duplication creating duplicated–duplicated or duplicated–processed pseudogenes, respectively. Moreover, we can precisely identify these parent pseudogenes by overlap with ancestral SD loci. Finally, a comparison of nucleotide substitutions per site in a pseudogene with its surrounding SD region allows us to estimate the time difference between duplication and disablement events, and this suggests that most duplicated pseudogenes in SDs were likely disabled around the time of the original duplication.

19.
20.
We have identified a chromosome duplication in the pericentromeric region of human chromosome 11 located in 11p11 and 11q14. A detailed physical map of each duplicated region was generated to describe the nature of the duplication, the involvement at the centromere and to resolve the correct maps. All clones were evaluated to ensure they were representative of their genetic origin. The order of clones, based on their marker content, as well as the distance covered was determined by SEGMAP. Each duplication encompasses more than 1 Mb of DNA and appears to be chromosome 11 specific. Ten STS markers were mapped within each duplication. Comparative sequence analysis along the duplication identified 35 nucleotide changes in 2,036 bp between the two copies, suggesting the duplication occurred over 14 million years ago. A suggested organization of the pericentromeric region, including the duplications and alpha-related repetitive sequences, is presented.
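The step from "35 nucleotide changes in 2,036 bp" to an age estimate can be sketched as a simple molecular-clock calculation. The abstract does not state which substitution rate or distance correction the authors used, so the code below is only an illustration: the Jukes-Cantor correction is a standard choice swapped in here, all function names are hypothetical, and the rate is back-calculated from the quoted 14 million years rather than taken from the study.

```python
import math


def p_distance(differences: int, sites: int) -> float:
    """Observed proportion of differing sites between the two copies."""
    return differences / sites


def jukes_cantor_distance(p: float) -> float:
    """Jukes-Cantor correction for multiple substitutions at the same site."""
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


def divergence_time_years(distance: float, rate_per_site_per_year: float) -> float:
    """Both copies accumulate changes independently, hence the factor of 2."""
    return distance / (2.0 * rate_per_site_per_year)


p = p_distance(35, 2036)      # about 1.7% observed divergence
d = jukes_cantor_distance(p)  # slightly larger than p after correction

# Back-calculation (an assumption, not a value reported in the abstract):
# the per-lineage rate at which the uncorrected distance equals 14 million years.
implied_rate = p / (2.0 * 14e6)

print(f"p = {p:.5f}, JC distance = {d:.5f}, implied rate = {implied_rate:.2e}")
```

Any faster rate than the implied one would give an age below 14 million years, so the estimate is sensitive to the clock calibration chosen.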
