Similar articles
 Found 20 similar articles; search took 765 ms
1.
Brandström M  Ellegren H 《Genetics》2007,176(3):1691-1701
It is increasingly recognized that insertions and deletions (indels) are an important source of genetic as well as phenotypic divergence and diversity. We analyzed length polymorphisms identified through partial (0.25x) shotgun sequencing of three breeds of domestic chicken made by the International Chicken Polymorphism Map Consortium. A data set of 140,484 short indel polymorphisms in unique DNA was identified after filtering for microsatellite structures. There was a significant excess of tandem duplicates at indel sites, with deletions of a duplicate motif outnumbering the generation of duplicates through insertion. Indel density was lower in microchromosomes than in macrochromosomes, in the Z chromosome than in autosomes, and in 100 bp of upstream sequence, 5'-UTR, and first introns than in intergenic DNA and in other introns. Indel density was highly correlated with single nucleotide polymorphism (SNP) density. The mean density of indels in pairwise sequence comparisons was 1.9 × 10^-4 indel events/bp, approximately 5% of the density of SNPs segregating in the chicken genome. The great majority of indels involved a limited number of nucleotides (median 1 bp), with A-rich motifs being overrepresented at indel sites. The overrepresentation of deletions at tandem duplicates indicates that replication slippage in duplicate sequences is a common mechanism behind indel mutation. The correlation between indel and SNP density indicates common effects of mutation and/or selection on the occurrence of indels and point mutations.

2.
The occurrence of a fish-specific genome duplication (FSGD) in the lineage leading to teleost fishes is widely accepted, but the consequences of this event remain elusive. Teleosts, and the cichlid fishes from the species flocks in the East African Great Lakes in particular, evolved a unique complexity and diversity of body coloration and color patterning. Several genes involved in pigment cell development have been retained in duplicate copies in the teleost genome after the FSGD. Here we investigate the evolutionary fate of one of these genes, the type III receptor tyrosine kinase (RTK) colony-stimulating factor 1 receptor (csf1r). We isolated and shotgun sequenced two paralogous csf1r genes from a bacterial artificial chromosome library of the cichlid fish Astatotilapia burtoni that are both linked to paralogs of the pdgfr beta gene, another type III RTK. Two pdgfr beta-csf1r paralogons were also identified in the genomes of pufferfishes and medaka, and our phylogenetic analyses suggest that the pdgfr beta-csf1r locus was duplicated during the course of the FSGD. Comparisons of teleosts and tetrapods suggest asymmetrical divergence at different levels of genomic organization between the teleost-specific pdgfr beta-csf1r paralogons, which seem to have evolved as coevolutionary units. The high evolutionary rate in the teleost B-paralogon, consisting of csf1rb and pdgfr betab, further suggests neofunctionalization by functional divergence of the extracellular, ligand-binding region of these cell-surface receptors. Finally, we hypothesize that genome duplications and the associated expansion of the RTK family might be causally linked to the evolution of coloration in vertebrates and teleost fishes in particular.

3.
Background

The number of species with completed genomes, including those with evidence for recent whole genome duplication events, has exploded. The recently sequenced Atlantic salmon genome has been through two rounds of whole genome duplication since the divergence of teleost fish from the lineage that led to amniotes. This quadrupling of the number of potential genes has led to complex patterns of retention and loss among gene families.

Results

Methods have been developed to characterize the interplay of duplicate gene retention processes across both whole genome duplication events and additional smaller scale duplication events. Further, gene expression divergence data have become available for Atlantic salmon and the closely related, pre-whole-genome-duplication pike, and methods to describe expression divergence are also presented. These methods for the characterization of duplicate gene retention and gene expression divergence, as applied to salmon, are described.

Conclusions

With the growth in available genomic and functional data, the opportunities to extract functional inference from large scale duplicates using comparative methods have expanded dramatically. Recently developed methods that further this inference for duplicated genes have been described.


4.
5.
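Fish-specific genome duplication   Cited by: 2 (self-citations: 0; citations by others: 2)
Zhou L  Wang Y  Gui JF 《Zoological Research》2006,27(5):525-532
Ray-finned fishes are the most species-rich and widely distributed group of vertebrates, and their genome sizes vary. The earlier view held that two rounds of genome duplication occurred during vertebrate evolution. Recent phylogenomic data further suggest that, about 350 million years ago, ray-finned fishes underwent a third genome duplication, the fish-specific genome duplication (FSGD). This event coincides with the time at which the "species-rich" teleost lineage (Teleostei) diverged from the "species-poor" lineages (the basal groups of Actinopterygii), suggesting that the FSGD is related to the increase in teleost species and biological diversity. Further comparative and functional genomic studies of fishes will help to test the FSGD hypothesis.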

6.
Little is known about variation of nucleotide insertion/deletions (indels) within species. In Arabidopsis thaliana, we investigated indel polymorphism patterns between two genome sequences and among 96 accessions at 1215 loci. Our study identified patterns in the variation of indel density, size, GC content and distribution, and a correlation between indels and substitutions. We found that the GC content in indel sequences was lower than that in non-indel sequences and that indels typically occur in regions with lower GC content. Patterns of indel frequency distribution among populations were more consistent with neutral expectation than substitution patterns. We also found that the local level of substitutions is positively correlated with indel density and negatively correlated with their distance to the closest indel, suggesting that indels play an important role in nucleotide variation.

7.
MOTIVATION: The two mutation processes that have the largest impact on genome evolution at small scales are substitutions, and sequence insertions and deletions (indels). While the former have been studied extensively, indels have received less attention, and in particular, the problem of inferring indel rates between pairs of divergent sequences remains unsolved. Here, I describe a novel and accurate method for estimating neutral indel rates between divergent pairs of genomes. RESULTS: Simulations suggest that the new method for estimating indel rates is accurate to within 2%, at divergences corresponding to that of human and mouse. Applying the method to these species, I show that indel rates are up to twice as high as is apparent from alignments, and depend strongly on the local G + C content. These results indicate that at these evolutionary distances, the contribution of indels to sequence divergence is much larger than hitherto appreciated. In particular, the ratio of substitution to indel rates between human and mouse appears to be around gamma = 8, rather than the currently accepted value of about gamma = 14.

8.
9.
Insertions and deletions (indels) cause numerous genetic diseases and lead to pronounced evolutionary differences among genomes. The macaque sequences provide an opportunity to gain insights into the mechanisms generating these mutations on a genome-wide scale by establishing the polarity of indels occurring in the human lineage since its divergence from the chimpanzee. Here we apply novel regression techniques and multiscale analyses to demonstrate an extensive regional indel rate variation stemming from local fluctuations in divergence, GC content, male and female recombination rates, proximity to telomeres, and other genomic factors. We find that both replication and, surprisingly, recombination are significantly associated with the occurrence of small indels. Intriguingly, the relative inputs of replication versus recombination differ between insertions and deletions, thus the two types of mutations are likely guided in part by distinct mechanisms. Namely, insertions are more strongly associated with factors linked to recombination, while deletions are mostly associated with replication-related features. Indel as a term misleadingly groups the two types of mutations together by their effect on a sequence alignment. However, here we establish that the correct identification of a small gap as an insertion or a deletion (by use of an outgroup) is crucial to determining its mechanism of origin. In addition to providing novel insights into insertion and deletion mutagenesis, these results will assist in gap penalty modeling and eventually lead to more reliable genomic alignments.

10.
Yang J  Xie Z  Glover BJ 《The New phytologist》2005,165(2):623-632
NF-Y is a ubiquitous CCAAT-binding factor composed of NF-YA, NF-YB and NF-YC. Multiple genes encoding NF-Y subunits have been identified in plant genomes. It remains unclear whether the duplicate genes underwent different evolutionary patterns. Likelihood-ratio tests were used to examine whether the amino acid substitution rates are the same between duplicate genes. The influences of selection on evolution were evaluated by comparing the conservative and radical amino acid substitution rates, as well as maximum-likelihood analysis. Some NF-YB and NF-YC duplicates showed significant evidence of asymmetric evolution but not the NF-YA duplicates. Most amino acid replacements in the NF-YB and NF-YC duplicates result in changes in hydropathy, polar requirement and polarity. The physicochemical changes in the sequences of NF-YB seem to be coupled to asymmetric divergence in gene function. Plant NF-Y genes have evolved in different patterns. Relaxed selective constraints following gene duplication are most likely responsible for the unequal evolutionary rates and distinct divergence patterns of duplicate NF-Y genes. Positive selection may have promoted amino acid hydropathy changes in the NF-YC duplicates.

11.
He X  Zhang J 《Genetics》2005,169(2):1157-1164
Gene duplication is the primary source of new genes. Duplicate genes that are stably preserved in genomes usually have divergent functions. The general rules governing the functional divergence, however, are not well understood and are controversial. The neofunctionalization (NF) hypothesis asserts that after duplication one daughter gene retains the ancestral function while the other acquires new functions. In contrast, the subfunctionalization (SF) hypothesis argues that duplicate genes experience degenerate mutations that reduce their joint levels and patterns of activity to that of the single ancestral gene. We here show that neither NF nor SF alone adequately explains the genome-wide patterns of yeast protein interaction and human gene expression for duplicate genes. Instead, our analysis reveals rapid SF, accompanied by prolonged and substantial NF in a large proportion of duplicate genes, suggesting a new model termed subneofunctionalization (SNF). Our results demonstrate that enormous numbers of new functions have originated via gene duplication.

12.
Insertions and deletions (indels) in chloroplast noncoding regions are common genetic markers to estimate population structure and gene flow, although relatively little is known about indel evolution among recently diverged lineages such as within plant families. Because indel events tend to occur nonrandomly along DNA sequences, recurrent mutations may generate homoplasy for indel haplotypes. This is a potential problem for population studies, because indel haplotypes may be shared among populations after recurrent mutation as well as gene flow. Furthermore, indel haplotypes may differ in fitness and therefore be subject to natural selection detectable as rate heterogeneity among lineages. Such selection could contribute to the spatial patterning of cpDNA haplotypes, greatly complicating the interpretation of cpDNA population structure. This study examined both nucleotide and indel cpDNA variation and divergence at six noncoding regions (psbB-psbH, atpB-rbcL, trnL-trnH, rpl20-5'rps12, trnS-trnG, and trnH-psbA) in 16 individuals from eight species in the Lecythidaceae and a Sapotaceae outgroup. We described patterns of cpDNA changes, assessed the level of indel homoplasy, and tested for rate heterogeneity among lineages and regions. Although regression analysis of branch lengths suggested some degree of indel homoplasy among the most divergent lineages, there was little evidence for indel homoplasy within the Lecythidaceae. Likelihood ratio tests applied to the entire phylogenetic tree revealed a consistent pattern rejecting a molecular clock. Tajima's 1D and 2D tests revealed two taxa with consistent rate heterogeneity, one showing relatively more and one relatively fewer changes than other taxa. In general, nucleotide changes showed more evidence of rate heterogeneity than did indel changes. 
The rate of evolution was highly variable among the six cpDNA regions examined, with the trnS-trnG and trnH-psbA regions showing as much as 10% and 15% divergence within the Lecythidaceae. Deviations from rate homogeneity in the two taxa were constant across cpDNA regions, consistent with lineage-specific rates of evolution rather than cpDNA region-specific natural selection. There is no evidence that indels are more likely than nucleotide changes to experience homoplasy within the Lecythidaceae. These results support a neutral interpretation of cpDNA indel and nucleotide variation in population studies within species such as Corythophora alta.

13.
An insertion/deletion polymorphism (Ind2) in the Brassica nigra CONSTANS LIKE 1 (Bni COL1) gene was previously found to be associated with variation in flowering time. In the present study we examine the inter-specific divergence of COL1 in the family Brassicaceae. Analysis of codon substitution models did not reveal evidence of positive Darwinian selection, but comparisons of the COL1 gene in different species revealed a surprising number of indels. A total of 24 indels were found in the 650 bp of the middle variable region of the gene. This high number of indels could reflect a lack of constraint on length of this region of the protein, or the effect of positive selection. The number of indels was close to that expected in non-coding DNA, but the indels were longer in COL1 than those observed in non-coding regions. Reconstruction of indel evolution indicated that most indels resulted from deletions rather than insertions. The Ind2 indel that has shown association with flowering time in Brassica nigra exhibited a remarkable distribution in the Brassicaceae family, indicating that the polymorphism may have persisted more than ten million years. Considering presumed historic population sizes of Brassicaceae species, such a long persistence time seems unlikely for a neutral polymorphism.

14.
Expression divergence between duplicate genes   Cited by: 13 (self-citations: 0; citations by others: 13)
Li WH  Yang J  Gu X 《Trends in genetics : TIG》2005,21(11):602-607

15.
Nucleotide insertions and deletions (indels) are responsible for gaps in sequence alignments. Indels are one of the major sources of evolutionary change at the molecular level. We have examined the patterns of insertions and deletions in 19 mammalian genomes, and found that deletion events are more common than insertions in the mammalian genomes. The numbers of both insertions and deletions decrease rapidly as the gap length increases, and single-nucleotide indels are the most frequent of all indel events. The frequencies of both insertions and deletions can be described well by a power law. Key Words: Insertion, deletion, gap, indel, mammalian genome.
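
The power-law description of indel length frequencies mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the form count(L) = C * L^(-alpha) and estimates C and alpha by ordinary least squares on log-transformed values. The function name `fit_power_law` and the data in the usage note are hypothetical and synthetic; they are not taken from the paper, and the authors' actual fitting procedure is not stated in the abstract.

```python
import math


def fit_power_law(lengths, counts):
    """Fit counts ~ C * length**(-alpha) by least squares in log-log space.

    Hypothetical helper for illustration only. Returns (C, alpha).
    """
    if len(lengths) != len(counts) or len(lengths) < 2:
        raise ValueError("need at least two (length, count) pairs")
    # Taking logs turns the power law into a straight line:
    # log(count) = log(C) - alpha * log(length)
    xs = [math.log(length) for length in lengths]
    ys = [math.log(count) for count in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope
```

As a usage example, synthetic counts generated as `1000 * L**-2` for gap lengths 1 to 10 should be recovered by the fit as C close to 1000 and alpha close to 2. Real indel counts are noisy, and a log-log regression is only one of several ways to estimate a power-law exponent.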

16.
Conant GC  Wolfe KH 《Genetics》2008,179(3):1681-1692
Identification of orthologous genes across species becomes challenging in the presence of a whole-genome duplication (WGD). We present a probabilistic method for identifying orthologs that considers all possible orthology/paralogy assignments for a set of genomes with a shared WGD (here five yeast species). This approach allows us to estimate how confident we can be in the orthology assignments in each genomic region. Two inferences produced by this model are indicative of purifying selection acting to prevent duplicate gene loss. First, our model suggests that there are significant differences (up to a factor of seven) in duplicate gene half-life. Second, we observe differences between the genes that the model infers to have been lost soon after WGD and those lost more recently. Gene losses soon after WGD appear uncorrelated with gene expression level and knockout fitness defect. However, later losses are biased toward genes whose paralogs have high expression and large knockout fitness defects, as well as showing biases toward certain functional groups such as ribosomal proteins. We suggest that while duplicate copies of some genes may be lost neutrally after WGD, another set of genes may be initially preserved in duplicate by natural selection for reasons including dosage.

17.
The TATA box, a core promoter element, exists in a broad range of eukaryotes, and the expression of TATA-containing genes usually responds to various environmental stresses. Hence, the evolution of the TATA box in duplicate genes may provide some clues for the interrelationship among environmental stress, expression differentiation, and duplicate gene preservation. In the present study, we observed that the TATA box is significantly overrepresented in duplicate genes compared with singletons in human, worm, Arabidopsis, and yeast genomes. We then conducted an extensive functional genomic analysis to investigate the evolution of the TATA box along over 700 yeast gene family phylogenies. After reconstructing the ancestral TATA-box states (presence or absence), we found that significantly higher numbers of TATA box gain events than loss events had occurred after yeast gene duplications: the overall gain-loss ratio is about 3-4 to 1. Interestingly, these TATA-gain duplicate genes on average have experienced greater expression divergence from the ancestral expression states than their most closely related TATA-less duplicate partners, but only under environmental stress conditions (asymmetric evolution); indeed, under normal physiological conditions, they have similar expression divergence (symmetric evolution). Moreover, we showed that TATA-gain duplicates are enriched in stress-associated functional categories but that is not the case for TATA-ancestral duplicates (those inherited from their ancestors prior to duplication). Together, we conclude that after the gene duplication, gain of the TATA box in duplicate promoters may have played an important role in yeast duplicate preservation by accelerating expression divergence that may facilitate the adaptive evolution of the organism in response to environmental changes.

18.
It has been shown that gene body DNA methylation is associated with gene expression. However, whether and how deviation of gene body DNA methylation between duplicate genes can influence their divergence remains largely unexplored. Here, we aim to elucidate the potential role of gene body DNA methylation in the fate of duplicate genes. We identified paralogous gene pairs from Arabidopsis and rice (Oryza sativa ssp. japonica) genomes and reprocessed their single-base resolution methylome data. We show that methylation in paralogous genes nonlinearly correlates with several gene properties including exon number/gene length, expression level and mutation rate. Further, we demonstrated that divergence of methylation level and pattern in paralogs indeed positively correlate with their sequence and expression divergences. This result held even after controlling for other confounding factors known to influence the divergence of paralogs. We observed that methylation level divergence might be more relevant to the expression divergence of paralogs than methylation pattern divergence. Finally, we explored the mechanisms that might give rise to the divergence of gene body methylation in paralogs. We found that exonic methylation divergence more closely correlates with expression divergence than intronic methylation divergence. We show that genomic environments (e.g., flanked by transposable elements and repetitive sequences) of paralogs generated by various duplication mechanisms are associated with the methylation divergence of paralogs. Overall, our results suggest that the changes in gene body DNA methylation could provide another avenue for duplicate genes to develop differential expression patterns and undergo different evolutionary fates in plant genomes.

19.
Gene and genome duplications are commonly regarded as being of major evolutionary significance. But how often does gene duplication occur? And, once duplicated, what are the fates of duplicated genes? How do they contribute to evolution? In a recent article, Lynch and Conery analyze divergence between duplicate genes from six eukaryotic genomes. They estimate the rate of gene duplication, the rate of gene loss after duplication and the strength of selection experienced by duplicate genes. They conclude that although the rate of gene duplications is high, so is the rate of gene loss, and they argue that gene duplications could be a major factor in speciation.

20.
Indels in the coding regions of a gene can either cause frameshifts or amino acid insertions/deletions. Frameshifting indels are indels that have a length that is not divisible by 3 and subsequently cause frameshifts. Indels that have a length divisible by 3 cause amino acid insertions/deletions or block substitutions; we call these 3n indels. The new amino acid changes resulting from 3n indels could potentially affect protein function. Therefore, we construct a SIFT Indel prediction algorithm for 3n indels which achieves 82% accuracy, 81% sensitivity, 82% specificity, 82% precision, 0.63 MCC, and 0.87 AUC by 10-fold cross-validation. We have previously published a prediction algorithm for frameshifting indels. The rules for the prediction of 3n indels are different from the rules for the prediction of frameshifting indels and reflect the biological differences of these two different types of variations. SIFT Indel was applied to human 3n indels from the 1000 Genomes Project and the Exome Sequencing Project. We found that common variants are less likely to be deleterious than rare variants. The SIFT Indel prediction algorithm for 3n indels is available at http://sift-dna.org/
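
The distinction drawn in the abstract above between frameshifting and 3n indels depends only on whether the indel length is divisible by 3. A minimal sketch of that rule follows. The function name `classify_coding_indel` and its return labels are hypothetical; this is not the SIFT Indel algorithm, which predicts functional effect and is not described in enough detail here to reproduce.

```python
def classify_coding_indel(length: int) -> str:
    """Classify a coding-region indel by its length in base pairs.

    Hypothetical helper: returns "3n" when the length is divisible by 3
    (the reading frame is preserved) and "frameshift" otherwise.
    """
    if length <= 0:
        raise ValueError("indel length must be a positive number of base pairs")
    # One codon is three bases, so only multiples of 3 keep the frame.
    return "3n" if length % 3 == 0 else "frameshift"
```

For example, a 6 bp deletion would be labelled "3n" (two codons removed, frame preserved), while a 1 bp insertion would be labelled "frameshift".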
