Similar articles
20 similar articles found.
1.
Olfactory receptor (OR) genes of the 7E subfamily have been duplicated to multiple regions throughout the human genome. Segmental duplications containing 7E OR genes have been associated with both pathological and evolutionary chromosome rearrangements. Many of these breakpoint regions coincide with breaks of chromosomal synteny in the mouse, rat and/or chicken genomes. Collectively, these data suggest that 7E OR-containing regions represent hot spots of genomic instability.

2.
We study the probability distribution of the distance $d = n + \chi - \kappa - \psi$ between two genomes with $n$ markers distributed on $\chi$ chromosomes and with breakpoint graphs containing $\kappa$ cycles and $\psi$ "good" paths, under the hypothesis of random gene order. We interpret the random order assumption in terms of a stochastic method for constructing the bicolored breakpoint graph. We show that the limiting expectation of $d$ is $E[d] = n - \frac{1}{2}\chi - \frac{1}{2}\log\frac{n+\chi}{2\chi}$. We also calculate the variance, the effect of different numbers of chromosomes in the two genomes, and the number of plasmids, or circular chromosomes, generated by the random breakpoint graph construction. A more realistic model allows intra- and interchromosomal operations to have different probabilities, and simulations show that for a fixed number of rearrangements, $\kappa$ and $d$ depend on the relative proportions of the two kinds of operation.
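
As a rough numerical illustration of the quantities in this abstract, the sketch below evaluates the distance and its limiting expectation. It assumes the expectation reads n - χ/2 - (1/2) log((n + χ)/(2χ)) and that the logarithm is natural; both are readings of the abstract rather than facts confirmed from the full paper, and the function names are hypothetical.

```python
import math


def distance(n, chi, kappa, psi):
    """Distance d = n + chi - kappa - psi, as defined in the abstract."""
    return n + chi - kappa - psi


def limiting_expected_distance(n, chi):
    """Limiting expectation of d under random gene order.

    Assumed form: n - chi/2 - (1/2) * log((n + chi) / (2 * chi)),
    with a natural logarithm (an assumption, not stated in the abstract).
    """
    if n <= 0 or chi <= 0:
        raise ValueError("n and chi must be positive")
    return n - 0.5 * chi - 0.5 * math.log((n + chi) / (2 * chi))


if __name__ == "__main__":
    # 1000 markers on 20 chromosomes: the expectation stays close to n,
    # so two randomly ordered genomes look almost maximally rearranged.
    print(limiting_expected_distance(1000, 20))
```

Under this reading, the correction terms are small relative to n, which is why a distance near n carries little evidence of shared gene order.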

3.
Paired-end sequencing is emerging as a key technique for assessing genome rearrangements and structural variation on a genome-wide scale. This technique is particularly useful for detecting copy-neutral rearrangements, such as inversions and translocations, which are common in cancer and can produce novel fusion genes. We address the question of how much sequencing is required to detect rearrangement breakpoints and to localize them precisely using both theoretical models and simulation. We derive a formula for the probability that a fusion gene exists in a cancer genome given a collection of paired-end sequences from this genome. We use this formula to compute fusion gene probabilities in several breast cancer samples, and we find that we are able to accurately predict fusion genes in these samples with a relatively small number of fragments of large size. We further demonstrate how the ability to detect fusion genes depends on the distribution of gene lengths, and we evaluate how different parameters of a sequencing strategy impact breakpoint detection, breakpoint localization, and fusion gene detection, even in the presence of errors that suggest false rearrangements. These results will be useful in calibrating future cancer sequencing efforts, particularly large-scale studies of many cancer genomes that are enabled by next-generation sequencing technologies.

4.
The ancient duplication of the Saccharomyces cerevisiae genome and subsequent massive loss of duplicated genes is apparent when it is compared to the genomes of related species that diverged before the duplication event. To learn more about the evolutionary effects of the duplication event, we compared the S. cerevisiae genome to other Saccharomyces genomes. We demonstrate that the whole genome duplication occurred before S. castellii diverged from S. cerevisiae. In addition to more accurately dating the duplication event, this finding allowed us to study the effects of the duplication on two separate lineages. Analyses of the duplication regions of the genomes indicate that most of the duplicated genes (approximately 85%) were lost before the speciation. Only a small amount of paralogous gene loss (4-6%) occurred after speciation. On the other hand, S. castellii appears to have lost several hundred genes that were not retained as duplicated paralogs. These losses could be related to genomic rearrangements that reduced the number of chromosomes from 16 to 9. In addition to S. castellii, other Saccharomyces sensu lato species likely diverged from S. cerevisiae after the duplication. A thorough analysis of these species will likely reveal other important outcomes of the whole genome duplication.

5.
Intrachromosomal duplications play a significant role in human genome pathology and evolution. To better understand the molecular basis of evolutionary chromosome rearrangements, we performed molecular cytogenetic and sequence analyses of the breakpoint region that distinguishes human chromosome 3p12.3 and orangutan chromosome 2. FISH with region-specific BAC clones demonstrated that the breakpoint-flanking sequences are duplicated intrachromosomally on orangutan 2 and human 3q21 as well as at many pericentromeric and subtelomeric sites throughout the genomes. Breakage and rearrangement of the human 3p12.3-homologous region in the orangutan lineage were associated with a partial loss of duplicated sequences in the breakpoint region. Consistent with our FISH mapping results, computational analysis of the human chromosome 3 genomic sequence revealed three 3p12.3-paralogous sequence blocks on human chromosome 3q21 and smaller blocks on the short arm end 3p26→p25. This is consistent with the view that sequences from an ancestral site at 3q21 were duplicated at 3p12.3 in a common ancestor of orangutan and humans. Our results show that evolutionary chromosome rearrangements are associated with microduplications and microdeletions, contributing to the DNA differences between closely related species.

6.
We determined the complete nucleotide sequences of mitochondrial (mt) genomes from two dicroglossid frogs, Hoplobatrachus tigerinus (Indian Bullfrog) and Euphlyctis hexadactylus (Indian Green frog). The genome sizes are 20462 bp in H. tigerinus and 20280 bp in E. hexadactylus. Although both genomes encode the typical 37 mt genes, the following unique features are observed: 1) the duplicated ND5 genes in H. tigerinus have completely identical sequences, whereas the duplicated ND5 genes in E. hexadactylus possess dissimilar substitutions; 2) the duplicated control regions (CRs) in H. tigerinus have almost identical sequences, whereas a single CR was found in E. hexadactylus; 3) the tRNA-Leu (CUN) gene is translocated from the LTPF tRNA cluster to downstream of ND5-1 in H. tigerinus, and the tRNA-Pro gene is translocated from the LTPF tRNA cluster to downstream of CR in E. hexadactylus; 4) pseudo tRNA-Leu (CUN) and tRNA-Pro genes are observed in E. hexadactylus; and 5) two tRNA-Met genes are encoded in both species, as observed in the previously reported dicroglossid mt genomes. Almost all observed gene rearrangements in H. tigerinus and E. hexadactylus can be explained by the tandem duplication and random loss model, except the translocation of tRNA-Pro in E. hexadactylus. The novel mt genomic features found in this study may be useful for future phylogenetic studies of dicroglossid taxa. These features also reveal a high level of variation in gene order and gene content, which should encourage further research into the mechanisms behind gene and genome evolution in dicroglossids and in amphibians more broadly.

7.
8.

Background

By reshuffling genomes, structural genomic reorganizations provide genetic variation on which natural selection can work. Understanding the mechanisms underlying this process has been a long-standing question in evolutionary biology. In this context, our purpose in this study is to characterize the genomic regions involved in structural rearrangements between human and macaque genomes and determine their influence on meiotic recombination as a way to explore the adaptive role of genome shuffling in mammalian evolution.

Results

We first constructed a highly refined map of the structural rearrangements and evolutionary breakpoint regions in the human and rhesus macaque genomes based on orthologous genes and whole-genome sequence alignments. Using two different algorithms, we refined the genomic position of known rearrangements previously reported by cytogenetic approaches and described new putative micro-rearrangements (inversions and indels) in both genomes. A detailed analysis of the rhesus macaque genome showed that evolutionary breakpoints lie in gene-rich regions that are enriched in GO terms related to the immune system. We also identified defense-response genes within a chromosome inversion fixed in the macaque lineage, underscoring the relevance of structural genomic changes in evolutionary and/or adaptation processes. Moreover, by combining in silico and experimental approaches, we studied the recombination pattern of specific chromosomes that have undergone rearrangements between the human and macaque lineages.

Conclusions

Our data suggest that adaptive alleles – in this case, genes involved in the immune response – might have been favored by genome rearrangements in the macaque lineage.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-530) contains supplementary material, which is available to authorized users.

9.
Chloroplast genome organization, gene order, and content are highly conserved among land plants. We sequenced the chloroplast genome of Trachelium caeruleum L. (Campanulaceae), a member of an angiosperm family known for highly rearranged genomes. The total genome size is 162,321 bp, with an inverted repeat (IR) of 27,273 bp, large single-copy (LSC) region of 100,114 bp, and small single-copy (SSC) region of 7,661 bp. The genome encodes 112 different genes, with 17 duplicated in the IR, a tRNA gene (trnI-cau) duplicated once in the LSC region, and a protein-coding gene (psbJ) with two duplicate copies, for a total of 132 putatively intact genes. ndhK may be a pseudogene with internal stop codons, and clpP, ycf1, and ycf2 are so highly diverged that they also may be pseudogenes. ycf15, rpl23, infA, and accD are truncated and likely nonfunctional. The most conspicuous feature of the Trachelium genome is the presence of 18 internally unrearranged blocks of genes inverted or relocated within the genome relative to the ancestral gene order of angiosperm chloroplast genomes. Recombination between repeats or tRNA genes has been suggested as a mechanism of chloroplast genome rearrangements. The Trachelium chloroplast genome shares with Pelargonium and Jasminum both a higher number of repeats and larger repeated sequences in comparison to eight other angiosperm chloroplast genomes, and these are concentrated near rearrangement endpoints. Genes for tRNAs occur at many but not all inversion endpoints, so some combination of repeats and tRNA genes may have mediated these rearrangements.

10.
For the last 15 years molecular cytogenetic techniques have been extensively used to study primate evolution. Molecular probes were helpful to distinguish mammalian chromosomes and chromosome segments on the basis of their DNA content rather than solely on morphological features such as banding patterns. Various landmark rearrangements have been identified for most of the nodes in primate phylogeny while chromosome banding still provides helpful reference maps. Fluorescence in situ hybridization (FISH) techniques were used with probes of different complexity, including chromosome painting probes, probes derived from chromosome sub-regions, and probes the size of a single gene. More recently, in silico techniques have been applied to track down evolutionarily derived chromosome rearrangements by searching the human and mouse genome sequence databases. More detailed breakpoint analyses of chromosome rearrangements that occurred during higher primate evolution have also given some insight into the molecular changes underlying these rearrangements. Hardly any "fusion genes" as known from chromosome rearrangements in cancer cells or dramatic "position effects" of genes transferred to new sites in primate genomes have been reported yet. Most breakpoint regions have been identified within gene-poor areas rich in repetitive elements and/or low copy repeats (segmental duplications). The progress in various molecular and molecular-cytogenetic approaches, including the recently launched chimpanzee genome project, suggests that these new tools will have a significant impact on the further understanding of human genome evolution.

11.
Tandemly arrayed genes (TAGs) play an important functional and physiological role in the genome. Most previous studies have focused on individual TAG families in a few species, yet a broad characterization of TAGs is not available. Here we identified all TAGs in the genomes of human, mouse, and rat and performed a comprehensive analysis of TAG distribution, TAG sizes, TAG orientations and intergenic distances, and TAG functions. TAGs account for about 14-17% of all genes in the genome and nearly one-third of all duplicated genes, highlighting the predominant role that tandem duplication plays in gene duplication. For all species, TAG distribution is highly heterogeneous along chromosomes and some chromosomes are enriched with TAG forests, whereas others are enriched with TAG deserts. The majority of TAGs are of size 2 for all genomes, similar to the previous findings in Caenorhabditis elegans, Arabidopsis thaliana, and Oryza sativa, suggesting that it is a rather general phenomenon in eukaryotes. The comparison with the genome patterns shows that TAG members have a significantly higher proportion of parallel gene orientation in all species, corroborating Graham's claim that parallel orientation is the preferred form of orientation in TAGs. Moreover, TAG members with parallel orientation tend to be closer to each other than all neighboring genes in the genome with parallel orientation. The analyses of Gene Ontology function indicate that genes with receptor or binding activities are significantly overrepresented among TAGs. Computer simulation reveals that random gene rearrangements have little effect on the statistics of TAGs for all genomes. Finally, the average proportion of TAGs tends to increase with family size, although the correlation between TAG proportions in individual families and family sizes is not significant.

12.
We have previously found with the microcell hybrid-based "elimination test" that human chromosome 3 transferred into murine or human tumor cells regularly lost certain 3p regions during tumor growth in SCID mice. The most common eliminated region, CER1, is approximately 2.4 Mb at 3p21.3. CER1 breakpoints were clustered in approximately 200-kb regions at both telomeric and centromeric borders. We have also shown, earlier, that tumor-related deletions often coincide with human/mouse synteny breakpoints on 3p12-p22. Here we describe the results of a comparative genomic analysis on the CER1 region in Caenorhabditis elegans, Drosophila melanogaster, Fugu rubripes, Gallus gallus, Mus musculus, Rattus norvegicus, and Canis familiaris. First, four independent synteny breaks were found within the CER1 telomeric breakpoint cluster region, comparing human, dog, and chicken genomes, and two independent synteny breaks within the CER1 centromeric breakpoint cluster region, comparing human, mouse, and chicken genomes, suggesting a nonrandom involvement of tumor breakpoint regions in chromosome evolution. Second, both CER1 breakpoint cluster regions show recent tandem duplications (seven Zn finger protein family genes at the telomeric and eight chemokine receptor genes at the centromeric side). Finally, all genes from these regions underwent horizontal evolution in mammals, with formation of new genes and expansion of gene families, which were displayed in the human genome as tandem gene duplications and pseudogene insertions. In contrast the CER1 middle region contained evolutionarily well-conserved solitary genes and a minimal amount of retroposed genes. The coincidence of evolutionary plasticity with CER1 breakpoints may suggest that regional structural instability is expressed in both evolutionary and cancer-associated chromosome rearrangements.

13.
Use of whole genome sequence data to infer baculovirus phylogeny
Several phylogenetic methods based on whole genome sequence data were evaluated using data from nine complete baculovirus genomes. The utility of three independent character sets was assessed. The first data set comprised the sequences of the 63 genes common to these viruses. The second set of characters was based on gene order, and phylogenies were inferred using both breakpoint distance analysis and a novel method developed here, termed neighbor pair analysis. The third set recorded gene content by scoring gene presence or absence in each genome. All three data sets yielded phylogenies supporting the separation of the Nucleopolyhedrovirus (NPV) and Granulovirus (GV) genera, the division of the NPVs into groups I and II, and species relationships within group I NPVs. Generation of phylogenies based on the combined sequences of all 63 shared genes proved to be the most effective approach to resolving the relationships among the group II NPVs and the GVs. The history of gene acquisitions and losses that have accompanied baculovirus diversification was visualized by mapping the gene content data onto the phylogenetic tree. This analysis highlighted the fluid nature of baculovirus genomes, with evidence of frequent genome rearrangements and multiple gene content changes during their evolution. Of more than 416 genes identified in the genomes analyzed, only 63 are present in all nine genomes, and 200 genes are found only in a single genome. Despite this fluidity, the whole genome-based methods we describe are sufficiently powerful to recover the underlying phylogeny of the viruses.
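
To make the gene-order character set more concrete, here is a minimal sketch of one common form of breakpoint distance. It assumes unsigned genes, linear gene orders, and identical gene content in both genomes; the abstract does not say which variant the authors used, and baculovirus genomes are circular, so this is an illustration rather than their method. The function names are hypothetical.

```python
def adjacencies(order):
    """Return the set of unordered neighbouring gene pairs in a linear order."""
    return {frozenset(pair) for pair in zip(order, order[1:])}


def breakpoint_distance(order_a, order_b):
    """Count adjacencies of order_a that are absent from order_b.

    Assumes both orders contain the same genes exactly once
    (unsigned, linear genomes; an assumption made for this sketch).
    """
    if len(set(order_a)) != len(order_a) or len(set(order_b)) != len(order_b):
        raise ValueError("each gene must appear exactly once per genome")
    if set(order_a) != set(order_b):
        raise ValueError("both genomes must contain the same genes")
    return len(adjacencies(order_a) - adjacencies(order_b))


if __name__ == "__main__":
    reference = ["g1", "g2", "g3", "g4", "g5"]
    rearranged = ["g1", "g2", "g4", "g3", "g5"]
    # The g2-g3 and g4-g5 adjacencies are broken by swapping g3 and g4.
    print(breakpoint_distance(reference, rearranged))
```

A matrix of such pairwise distances could then be passed to a distance-based tree-building method; handling strand orientation or circular genomes would require extending the adjacency definition.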

14.
Our objective was to test whether or not cyclization recombination (CRE), the P1 phage site-specific recombinase, induces genome rearrangements in plastids. Testing was carried out in tobacco plants in which a DNA sequence, located between two inversely oriented locus of X-over of P1 (loxP) sites, underwent repeated cycles of inversions as a means of monitoring CRE activity. We report here that CRE mediates deletions between loxP sites and plastid DNA sequences in the 3'rps12 gene leader (lox-rps12) or in the psbA promoter core (lox-psbA). We also observed deletions between two directly oriented lox-psbA sites, but not between lox-rps12 sites. Deletion via duplicated rRNA operon promoter (Prrn) sequences was also frequent in CRE-active plants. However, CRE-mediated recombination is probably not directly involved, as no recombination junction between loxP and Prrn could be observed. Tobacco plants carrying deleted genomes as a minor fraction of the plastid genome population were fertile and phenotypically normal, suggesting that the absence of deleted genome segments was compensated by gene expression from wild-type copies. The deleted plastid genomes disappeared in the seed progeny lacking CRE. Observed plastid genome rearrangements are specific to engineered plastid genomes, which contain at least one loxP site or duplicated psbA promoter sequences. The wild-type plastid genome is expected to be stable, even if CRE is present in the plastid.

15.
DAGchainer: a tool for mining segmental genome duplications and synteny
SUMMARY: Given the positions of protein-coding genes along genomic sequence and probability values for protein alignments between genes, DAGchainer identifies chains of gene pairs sharing conserved order between genomic regions, by identifying paths through a directed acyclic graph (DAG). These chains of collinear gene pairs can represent segmentally duplicated regions and genes within a single genome or syntenic regions between related genomes. Automated mining of the Arabidopsis genome for segmental duplications illustrates the use of DAGchainer.
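
The idea of finding a chain of collinear gene pairs as a path through a DAG can be sketched with a small dynamic program. This is a simplified illustration and not DAGchainer's actual algorithm or interface: it assumes each matching gene pair is given as a hypothetical `(position_a, position_b, score)` tuple, considers forward-oriented chains only, and omits the gap penalties and distance limits that a real tool would need.

```python
def best_collinear_chain(pairs):
    """Return (total_score, chain) for the highest-scoring collinear chain.

    pairs: list of (pos_a, pos_b, score) tuples (hypothetical input format).
    A pair may follow another only if both positions strictly increase,
    which makes the implied graph of allowed successions acyclic.
    """
    pairs = sorted(pairs)
    if not pairs:
        return 0.0, []
    best = [p[2] for p in pairs]   # best chain score ending at each pair
    prev = [None] * len(pairs)     # back-pointers for chain recovery
    for j in range(len(pairs)):
        for i in range(j):
            collinear = pairs[i][0] < pairs[j][0] and pairs[i][1] < pairs[j][1]
            if collinear and best[i] + pairs[j][2] > best[j]:
                best[j] = best[i] + pairs[j][2]
                prev[j] = i
    end = max(range(len(pairs)), key=best.__getitem__)
    chain = []
    k = end
    while k is not None:           # walk back-pointers from the best end
        chain.append(pairs[k])
        k = prev[k]
    chain.reverse()
    return best[end], chain


if __name__ == "__main__":
    gene_pairs = [(1, 1, 1.0), (2, 5, 1.0), (3, 2, 1.0), (4, 3, 1.0), (5, 4, 1.0)]
    score, chain = best_collinear_chain(gene_pairs)
    print(score, chain)
```

The quadratic loop is fine for a sketch; the pair `(2, 5)` is left out of the best chain because including it would break collinearity with the pairs that follow.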

16.
We present a generalization of the positional Burrows–Wheeler transform, or PBWT, to genome graphs, which we call the gPBWT. A genome graph is a collapsed representation of a set of genomes described as a graph. In a genome graph, a haplotype corresponds to a restricted form of walk. The gPBWT is a compressible representation of a set of these graph-encoded haplotypes that allows for efficient subhaplotype match queries. We give efficient algorithms for gPBWT construction and query operations. As a demonstration, we use the gPBWT to quickly count the number of haplotypes consistent with random walks in a genome graph, and with the paths taken by mapped reads; results suggest that haplotype consistency information can be practically incorporated into graph-based read mappers. We estimate that with the gPBWT, of the order of 100,000 diploid genomes, including all forms of structural variation, could be stored and made searchable for haplotype queries using a single large compute node.

17.
Skinner BM, Griffin DK. Heredity 2012, 108(1):37-41
It is generally believed that the organization of avian genomes remains highly conserved in evolution as chromosome number is constant and comparative chromosome painting demonstrated there to be very few interchromosomal rearrangements. The recent sequencing of the zebra finch (Taeniopygia guttata) genome allowed an assessment of the number of intrachromosomal rearrangements between it and the chicken (Gallus gallus) genome, revealing a surprisingly high number of intrachromosomal rearrangements. With the publication of the turkey (Meleagris gallopavo) genome it has become possible to describe intrachromosomal rearrangements between these three important avian species, gain insight into the direction of evolutionary change and assess whether breakpoint regions are reused in birds. To this end, we aligned entire chromosomes between chicken, turkey and zebra finch, identifying syntenic blocks of at least 250 kb. Potential optimal pathways of rearrangements between each of the three genomes were determined, as was a potential Galliform ancestral organization. From this, our data suggest that around one-third of chromosomal breakpoint regions may recur during avian evolution, with 10% of breakpoints apparently recurring in different lineages. This agrees with our previous hypothesis that mechanisms of genome evolution are driven by hotspots of non-allelic homologous recombination.

18.
Genomic sequence duplication is an important mechanism for genome evolution, often resulting in large sequence variations with implications for disease progression. Although paired-end sequencing technologies are commonly used for structural variation discovery, the discovery of novel duplicated sequences remains an unmet challenge. We analyze duplicons starting from identified high-copy number variants. Given paired-end mapped reads, and a candidate high-copy region, our tool, Reprever, identifies (a) the insertion breakpoints where the extra duplicons are inserted into the donor genome and (b) the actual sequence of the duplicon. Reprever resolves ambiguous mapping signatures from existing homologs, repetitive elements and sequencing errors to identify breakpoints. At each breakpoint, Reprever reconstructs the inserted sequence using profile hidden Markov model (PHMM)-based guided assembly. In a test on 1000 artificial genomes with simulated duplication, Reprever could identify novel duplicates in up to 97% of genomes within 3 bp positional and 1% sequence errors. Validation on 680 fosmid sequences identified and reconstructed eight duplicated sequences with high accuracy. We applied Reprever to reanalyze a re-sequenced data set from the African individual NA18507 and identified >800 novel duplicates, including insertions in genes and insertions with additional variation. Polymerase chain reaction followed by capillary sequencing validated both the insertion locations of the strongest predictions and their predicted sequence.

19.
We previously reported two graph algorithms for analysis of genomic information: a graph comparison algorithm to detect locally similar regions called correlated clusters and an algorithm to find a graph feature called P-quasi complete linkage. Based on these algorithms we have developed an automatic procedure to detect conserved gene clusters and align orthologous gene orders in multiple genomes. In the first step, the graph comparison is applied to pairwise genome comparisons, where the genome is considered as a one-dimensionally connected graph with genes as its nodes, and correlated clusters of genes that share sequence similarities are identified. In the next step, the P-quasi complete linkage analysis is applied to grouping of related clusters and conserved gene clusters in multiple genomes are identified. In the last step, orthologous relations of genes are established among each conserved cluster. We analyzed 17 completely sequenced microbial genomes and obtained 2313 clusters when the completeness parameter P was 40%. About one quarter contained at least two genes that appeared in the metabolic and regulatory pathways in the KEGG database. This collection of conserved gene clusters is used to refine and augment ortholog group tables in KEGG and also to define ortholog identifiers as an extension of EC numbers.

20.

Background  

Gene and genome duplication is the principal creative force in evolution. Recently, protein subcellular relocalization, or neolocalization, was proposed as one of the mechanisms responsible for the retention of duplicated genes. This hypothesis received support from the analysis of yeast genomes, but has not been tested thoroughly on animal genomes. In order to evaluate the importance of subcellular relocalizations for retention of duplicated genes in animal genomes, we systematically analyzed nuclear encoded mitochondrial proteins in the human genome by reconstructing phylogenies of mitochondrial multigene families.
