Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
A quantitative model was developed that reveals a new function of noncoding sequences in the eukaryotic genome, namely, the protection of coding sequences from chemical (mainly endogenous) mutagens. It was shown that, under common ecological conditions, the number of nucleotides damaged by mutagens in coding sequences of the genome is inversely proportional to the size of their noncoding counterparts. Noncoding sequences can differentially protect single genetic loci from chemical mutagens by the formation of specific spatial structures of the protected loci in the interphase nuclei. The significant differences in genome sizes between species (the C-value paradox) can be explained by different contributions of noncoding sequences to the total effect of genome protection from endogenous chemical mutagens.

2.
The current state of knowledge concerning the unsolved problem of the huge interspecific variation in eukaryotic genome size, which does not correlate with the phenotypic complexity of species (the C-value enigma, also known as the C-value paradox), is reviewed. Characteristic features of eukaryotic genome structure and the molecular mechanisms that underlie genome size changes are examined in connection with the C-value enigma. It is emphasized that endogenous mutagens, including reactive oxygen species, create a constant nuclear environment in which any genome evolves. An original quantitative model and general conception are proposed to explain the C-value enigma. In accordance with the theory, the noncoding sequences of the eukaryotic genome provide genes with global and differential protection against chemical mutagens and (in addition to the anti-mutagenesis and DNA repair systems) form a new, third system that protects eukaryotic genetic information. The joint action of these systems controls the spontaneous mutation rate in coding sequences of the eukaryotic genome. It is hypothesized that genome size is inversely proportional to the functional efficiency of the anti-mutagenesis and/or DNA repair systems in a particular biological species. In this connection, a model of eukaryotic genome evolution is proposed.

3.
An improved quantitative model describing a protective function of eukaryotic genomic noncoding sequences was developed. In this new model, two factors affecting gene protection from chemical mutagens are considered: (1) the ratio of the total lengths of coding and noncoding genomic sequences and (2) the volume of the cell nucleus. An increase in the noncoding DNA in the genome reduces the number of mutagen-damaged nucleotides in the coding region, whereas an increase in the volume of the nucleus decreases the flow of mutagens per unit of nuclear volume that attacks its surface.

5.
The contribution of slippage-like processes to genome evolution (cited 19 times: 0 self-citations, 19 by others)
Simple sequences present in long (>30 kb) sequences representative of the single-copy genome of five species (Homo sapiens, Caenorhabditis elegans, Saccharomyces cerevisiae, Escherichia coli, and Mycobacterium leprae) have been analyzed. A close relationship was observed between genome size and the overall level of sequence repetition. This suggested that the incorporation of simple sequences had accompanied increases of genome size during evolution. Densities of simple sequence motifs were higher in noncoding regions than in coding regions in eukaryotes but not in eubacteria. All five genomes showed strongly biased frequency distributions of simple sequence motifs, particularly the eukaryotic genomes, in which AAA and TTT predominated. Interspecific comparisons showed that noncoding sequences in eukaryotes had highly significantly similar frequency distributions of simple sequence motifs, but this was not true of coding sequences. ANOVA of the frequency distributions of simple sequence motifs indicated strong contributions from motif base composition and repeat unit length, but much of the variation remained unexplained by these parameters. The sequence composition of simple sequences therefore appears to reflect both underlying sequence biases in slippage-like processes and the action of selection. Frequency distributions of simple sequence motifs in coding sequences correlated weakly or not at all with those in noncoding sequences. Selection on coding sequences to eliminate undesirable sequences may therefore have been strong, particularly in the human lineage.

6.
The genus Populus L. has been divided into five sections based on morphological characters, but the phylogenetic relationships among sections remain uncertain. Topological discrepancies have been reported between trees obtained using nuclear and plastid sequences. We selected nine chloroplast genomes from all five sections, including two species newly sequenced in this study, to analyze maternal phylogenetic relationships in the genus Populus at the sectional level. Phylogenetic analyses were performed using various subsets of data (coding sequences, noncoding sequences, and different regions of the genome), and the subsets yielded contradictory outcomes. According to our phylogenetic analyses, (1) a robust maternal phylogenetic relationship among sections based on complete chloroplast genomes was obtained; (2) Sect. Tacamahaca can be divided into two clades based on maternally inherited loci, i.e. clade I, distributed in North America and northeast China, and clade II, distributed in southwest China; (3) SSC-noncoding regions revealed a topology inconsistent with all other subsets; (4) this discrepancy may result from incomplete lineage sorting between species of Populus. We tested multiple partitioning schemes to resolve deep-level phylogenetic relationships in Populus, and the complete noncoding subset is the most recommended.

7.
8.
We investigated the occurrence of gene conversions between paralogous sequences of Salmoninae derived from ancestral tetraploidization and their effect on the evolutionary history of DNA sequences. A microsatellite with long flanking regions (750 bp) including both coding and noncoding sequences was analyzed. Microsatellite size polymorphism was used to detect the alleles of both paralogous counterparts and infer linkage arrangement between loci. DNA sequencing of seven Salmoninae species revealed that paralogous sequences were highly differentiated within species, especially for noncoding regions. Ten gene conversion events between paralogous sequences were inferred. While these events appear to have homogenized regions of otherwise highly differentiated paralogous sequences, they amplified the differentiation among orthologous sequences. Their effects were larger on coding than on noncoding regions. As a consequence, noncoding sequences grouped by orthologous lineages in phylogenetic trees, whereas coding regions grouped by taxa. Based upon these results, we present a model showing how gene conversion events may also result in the PCR amplification of nonorthologous sequences in different taxa, with obvious complications for phylogenetic inferences, comparative mapping, and population genetic studies. Received: 11 October 2000 / Accepted: 18 September 2001

9.
Ten years on from the finishing of the human reference genome sequence, it remains unclear what fraction of the human genome confers function, where this sequence resides, and how much is shared with other mammalian species. When addressing these questions, functional sequence has often been equated with pan-mammalian conserved sequence. However, functional elements that are short-lived, including those contributing to species-specific biology, will not leave a footprint of long-lasting negative selection. Here, we address these issues by identifying and characterising sequence that has been constrained with respect to insertions and deletions for pairs of eutherian genomes over a range of divergences. Within noncoding sequence, we find increasing amounts of mutually constrained sequence as species pairs become more closely related, indicating that noncoding constrained sequence turns over rapidly. We estimate that half of present-day noncoding constrained sequence has been gained or lost in approximately the last 130 million years (half-life in units of divergence time, d1/2 = 0.25–0.31). While enriched with ENCODE biochemical annotations, much of the short-lived constrained sequences we identify are not detected by models optimized for wider pan-mammalian conservation. Constrained DNase 1 hypersensitivity sites, promoters and untranslated regions have been more evolutionarily stable than long noncoding RNA loci which have turned over especially rapidly. By contrast, protein coding sequence has been highly stable, with an estimated half-life of over a billion years (d1/2 = 2.1–5.0). From extrapolations we estimate that 8.2% (7.1–9.2%) of the human genome is presently subject to negative selection and thus is likely to be functional, while only 2.2% has maintained constraint in both human and mouse since these species diverged. 
These results reveal that the evolutionary history of the human genome has been highly dynamic, particularly for its noncoding yet biologically functional fraction.
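To make the half-life figures above more concrete, here is a minimal Python sketch. It assumes that constrained sequence is lost by simple exponential decay with divergence, which the abstract does not state; the function name `fraction_retained` and the sample divergence values are illustrative only. The half-life ranges are the ones quoted in the abstract.

```python
# Minimal sketch: turnover of constrained sequence under ASSUMED exponential decay.
# The decay model and the sample divergences are illustrative, not from the abstract.

def fraction_retained(divergence, half_life):
    """Fraction of constrained sequence still shared after `divergence`,
    assuming simple exponential decay with the given half-life."""
    return 0.5 ** (divergence / half_life)

# Half-life ranges quoted in the abstract, in units of divergence time.
NONCODING_HALF_LIFE = (0.25, 0.31)
CODING_HALF_LIFE = (2.1, 5.0)

for d in (0.1, 0.25, 0.5):  # hypothetical divergences
    nc = [fraction_retained(d, h) for h in NONCODING_HALF_LIFE]
    cd = [fraction_retained(d, h) for h in CODING_HALF_LIFE]
    print(f"d={d}: noncoding {nc[0]:.2f}-{nc[1]:.2f}, coding {cd[0]:.2f}-{cd[1]:.2f}")
```

Under this assumption, noncoding constraint falls to one half at a divergence equal to its half-life, while coding constraint is barely reduced over the same range.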

10.
In the last few years, dozens of studies have documented the detection of loci influenced by selection from genome scans in a wide range of non-model species. Many of those studies used amplified fragment length polymorphism (AFLP) markers, which became popular for being easily applicable to any organism. However, because they are anonymous markers, AFLPs impose many challenges for their isolation and identification. Most recent AFLP genome scans used capillary electrophoresis (CE), which adds even more obstacles to the isolation of bands with a specific size for sequencing. These caveats might explain the extremely low number of studies that moved from the detection of outlier AFLP markers to their actual isolation and characterization. We document our efforts to characterize a set of outlier AFLP markers from a previous genome scan with CE in ocellated lizards (Lacerta lepida). Seven outliers were successfully isolated, cloned and sequenced. Their sequences are noncoding and show internal indels or polymorphic repetitive elements (microsatellites). Three outliers were converted into codominant markers by using specific internal primers to sequence and screen population variability from undigested DNA. Amplification in closely related lizard species was also achieved, revealing remarkable interspecific conservation in outlier loci sequences. We stress the importance of following up AFLP genome scans to validate selection signatures of outlier loci, but also report the main challenges and pitfalls that may be faced during the process.

11.
Rice has many characteristics of a model plant. The recent completion of the draft rice genome represents an important advance in our knowledge of plant biology and also contributes to the understanding of genome evolution in general. Besides finishing the rice genome map, the next urgent step for rice researchers is to annotate the genes and noncoding functional sequences. Recent work shows that noncoding RNAs (ncRNAs) play significant roles in biological systems. We have searched for all known small RNAs (a kind of ncRNA) in the rice genome and in sequences from six other species: Arabidopsis, maize, yeast, worm, mouse and pig. We find that 160 of the 552 small RNAs (sRNAs) in the database have homologs in 108 rice scaffolds, and that almost all of them (99.41%) are located in intron regions of rice according to gene prediction. Nineteen sRNAs appear only in rice. More importantly, we find two special LJ14 sRNAs: one belongs to a set of ZMU14SNR9(s) sRNAs that appears only in the three plants, with about 86% of their sequence aligning as the same sequence in rice, Arabidopsis and maize; the other, the conserved sRNA XLHS7CU14, has a segment that appears in almost all of these species, from plants to animals. Together, these results indicate that sRNAs show no evident boundary between plants and animals.

12.
Amplified fragment length polymorphism (AFLP) markers are often used for genetic mapping and diversity analysis, but very little information is currently available on their sequence characteristics. Species-specific sequences were analyzed from a single Coffea genome (Coffea pseudozanguebariae) associated with clustered or nonclustered AFLP loci of known genetic position. Compared with the expressed sequence tag (EST) sequence composition, their AT content exhibited a bimodal distribution with AT-poor sequences corresponding mainly to putative coding sequences. AT-rich sequences, apart from the EST distribution, were usually clustered on the genetic map and might correspond to noncoding sequences. Conversion of these AFLP markers into sequence-characterized amplified region (SCAR) anchor markers allowed us to assess sequence conservation within Coffea species with respect to species relatedness.

13.
The genome of avian erythroblastosis virus contains two independently expressed genetic loci (v-erbA and v-erbB) whose activities are probably responsible for oncogenesis by the virus. Both loci are closely related to nucleotide sequences found in the DNA and RNA of chickens and other vertebrates. We have isolated and characterized chicken DNA homologous to v-erbA and v-erbB. The two viral genes are represented by separate domains within chicken DNA (c-erbA and c-erbB), which are separated by a minimum of 12 kilobases (kb) of DNA and may not be linked at all. The nucleotide sequences shared by the viral and cellular erb loci are colinear, but the cellular loci are interrupted by multiple intervening sequences of various lengths. Polyribosomes prepared from normal chicken embryos contain two polyadenylated RNAs transcribed from c-erbA and two transcribed from c-erbB. The evident coding regions of these RNAs represent an unusually small fraction of the lengths of the RNAs, as if the 3′ untranslated domains of the RNAs might be exceptionally large (3–11 kb). These findings indicate that the c-erb loci are normal vertebrate genes rather than genes of cryptic endogenous retroviruses, and that they may have a role in the metabolism of normal cells. It appears that the viral erb genes, like most other retrovirus oncogenes, have been copied from cellular genes. In the viral genome, the two genes are devoid of introns, but they remain independently expressed loci, and they remain colinear with the coding domains of their cellular progenitors.

14.
The abundance and identity of functional variation segregating in natural populations is paramount to dissecting the molecular basis of quantitative traits as well as human genetic diseases. Genome sequencing of multiple organisms of the same species provides an efficient means of cataloging rearrangements, insertion, or deletion polymorphisms (InDels) and single-nucleotide polymorphisms (SNPs). While inbreeding depression and heterosis imply that a substantial amount of polymorphism is deleterious, distinguishing deleterious from neutral polymorphism remains a significant challenge. To identify deleterious and neutral DNA sequence variation within Saccharomyces cerevisiae, we sequenced the genome of a vineyard and oak tree strain and compared them to a reference genome. Among these three strains, 6% of the genome is variable, mostly attributable to variation in genome content that results from large InDels. Out of the 88,000 polymorphisms identified, 93% are SNPs and a small but significant fraction can be attributed to recent interspecific introgression and ectopic gene conversion. In comparison to the reference genome, there is substantial evidence for functional variation in gene content and structure that results from large InDels, frame-shifts, and polymorphic start and stop codons. Comparison of polymorphism to divergence reveals scant evidence for positive selection but an abundance of evidence for deleterious SNPs. We estimate that 12% of coding and 7% of noncoding SNPs are deleterious. Based on divergence among 11 yeast species, we identified 1,666 nonsynonymous SNPs that disrupt conserved amino acids and 1,863 noncoding SNPs that disrupt conserved noncoding motifs. The deleterious coding SNPs include those known to affect quantitative traits, and a subset of the deleterious noncoding SNPs occurs in the promoters of genes that show allele-specific expression, implying that some cis-regulatory SNPs are deleterious. 
Our results show that the genome sequences of both closely and distantly related species provide a means of identifying deleterious polymorphisms that disrupt functionally conserved coding and noncoding sequences.

15.
Bachtrog D, Andolfatto P. Genetics, 2006, 174(4): 2045-2059
Selection, recombination, and the demographic history of a species can all have profound effects on genomewide patterns of variability. To assess the impact of these forces in the genome of Drosophila miranda, we examine polymorphism and divergence patterns at 62 loci scattered across the genome. In accordance with recent findings in D. melanogaster, we find that noncoding DNA generally evolves more slowly than synonymous sites, that the distribution of polymorphism frequencies in noncoding DNA is significantly skewed toward rare variants relative to synonymous sites, and that long introns evolve significantly slower than short introns or synonymous sites. These observations suggest that most noncoding DNA is functionally constrained and evolving under purifying selection. However, in contrast to findings in the D. melanogaster species group, we find little evidence of adaptive evolution acting on either coding or noncoding sequences in D. miranda. Levels of linkage disequilibrium (LD) in D. miranda are comparable to those observed in D. melanogaster, but vary considerably among chromosomes. These patterns suggest a significantly lower rate of recombination on autosomes, possibly due to the presence of polymorphic autosomal inversions and/or differences in chromosome sizes. All chromosomes show significant departures from the standard neutral model, including too much heterogeneity in synonymous site polymorphism relative to divergence among loci and a general excess of rare synonymous polymorphisms. These departures from neutral equilibrium expectations are discussed in the context of nonequilibrium models of demography and selection.

16.
Species of the mussel genus Mytilus possess maternally and paternally transmitted mitochondrial genomes. In the interbreeding taxa Mytilus edulis and M. galloprovincialis, several genomes of both types have been fully sequenced. The genome consists of the coding part (which, in addition to protein and RNA genes, contains several small noncoding sequences) and the main control region (CR), which in turn consists of three distinct parts: the first variable (VD1), the conserved (CD), and the second variable (VD2) domain. The maternal and paternal genomes are very similar in gene content and organization, even though they differ by >20% in primary sequence. They differ even more at VD1 and VD2, yet they are remarkably similar at CD. The complete sequence of a genome from the closely related species M. trossulus was previously reported and found to consist of a maternal-like coding part and a paternal-like and a maternal-like CR. From this and from the fact that it was extracted from a male individual, it was inferred that this is a genome that switched from maternal to paternal transmission. Here we provide clear evidence that this genome is the maternal genome of M. trossulus. We have found that in this genome the tRNAGln in the coding region is apparently defective and that an intact copy of this tRNA occurs in the CR, that one of the two conserved domains is missing essential motifs, and that one of the two first variable domains has a high rate of divergence. These features may explain the large size and mosaic structure of the CR of the maternal genome of M. trossulus. We have also obtained CR sequences of the maternal and paternal genomes of M. californianus, a more distantly related species. We compare the control regions from all three species, focusing on the divergence among genomes of different species origin and among genomes of different transmission routes.

17.
Gene unscrambling in spirotrichous ciliates involves massive genome-wide DNA deletion and rearrangement events during development. During each sexual cycle, the somatic nucleus (macronucleus) regenerates from the germ line nucleus (micronucleus). Development of the polyploid somatic genome requires programmed DNA deletion of micronuclear-limited intragenic noncoding sequences and permutation and amplification of the protein-coding regions. Recent studies suggest that, despite novel insertions of endogenous transposon or foreign DNA into the germ line genome, ciliates possess a whole-genome surveillance system that guides the recapitulation of a functional somatic genome. This renders the germ line genome an extremely dynamic structure over evolutionary time. Here we describe the germ line and somatic architectures of the gene encoding alpha-telomere-binding protein in three early-diverging species (Holosticha sp., Uroleptus sp., and Paraurostyla weissei) and trace the natural history of DNA rearrangements in this gene in six species, including three previously studied oxytrichids. Comparisons of homologous coding regions between earlier and later diverging species provide evidence for fusion of scrambled germ line fragments as small as 24 bp during evolution, as well as simultaneous fragmentation and scrambling of the germ line locus and shifting of the boundaries between coding and noncoding DNA, leading to distinct gene architectures in each species. We infer an evolutionary recombination pathway that passes through identified intermediate species and gives rise to the observed patterns in all known species, capitalizing on their unique DNA rearrangement machinery and germ line flexibility.

18.
Using the 3′ noncoding and coding sequences of chick heart myosin light-chain mRNA cloned into Escherichia coli as probes, it was observed that, while the coding sequence shared homology with myosin light-chain mRNAs from other sources, the 3′ noncoding sequence was specific for chick heart muscle. This property was used to detect chick heart-specific myosin light-chain gene activity in chick blastoderms at very early developmental stages, when cells of different muscle origins cannot be distinguished morphologically. However, in spite of the tissue-specific divergence of the 3′ noncoding sequence of the myosin light-chain gene, which is present in a single copy in the chick genome, a surprising homology with DNA from a source as distant as Dictyostelium discoideum was noted. The sequence homologous to chick myosin light-chain DNA was apparently present at a high repetition frequency in the Dictyostelium genome.

19.
A fractal method to distinguish coding and non-coding sequences in a complete genome is proposed, based on the different statistical behaviors of these two kinds of sequences. We first propose a number sequence representation of DNA sequences. Multifractal analysis is then performed on the measure representation of the obtained number sequence. The three exponents C(-1), C1 and C2 are selected from the result of the multifractal analysis. Each DNA sequence may then be represented by a point (C(-1), C1, C2) in the three-dimensional space spanned by these exponents. It is shown that points corresponding to coding and non-coding sequences in the complete genomes of many prokaryotes are roughly distributed in different regions. Fisher's discriminant algorithm can be used to separate these two regions in the spanned space. If the point (C(-1), C1, C2) for a DNA sequence is situated in the region corresponding to coding sequences, the sequence is classified as a coding sequence; otherwise, it is classified as a non-coding one. For all 51 prokaryotes we considered, the average discriminant accuracies pc, pnc, qc and qnc reach 72.28%, 84.65%, 72.53% and 84.18%, respectively.
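The classification step described above can be illustrated with a small Python sketch of a two-class Fisher linear discriminant in three dimensions. This is not the authors' implementation: the helper names (`fit_fisher`, `is_coding`), the midpoint threshold, and all sample points standing in for (C(-1), C1, C2) are assumptions made up for illustration.

```python
# Sketch of Fisher's linear discriminant for two classes of 3-D points.
# Points stand in for (C(-1), C1, C2); every value below is invented.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mean(points):
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def scatter(points, mu):
    """Within-class scatter matrix: sum of outer products of deviations."""
    d = len(mu)
    s = [[0.0] * d for _ in range(d)]
    for p in points:
        diff = [p[i] - mu[i] for i in range(d)]
        for i in range(d):
            for j in range(d):
                s[i][j] += diff[i] * diff[j]
    return s

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_fisher(coding, noncoding):
    """Return (w, threshold) with w = S_w^{-1} (mu_coding - mu_noncoding)."""
    mu_c, mu_n = mean(coding), mean(noncoding)
    s_c, s_n = scatter(coding, mu_c), scatter(noncoding, mu_n)
    d = len(mu_c)
    s_w = [[s_c[i][j] + s_n[i][j] for j in range(d)] for i in range(d)]
    w = solve(s_w, [mu_c[i] - mu_n[i] for i in range(d)])
    # Assumed decision rule: midpoint between the projected class means.
    threshold = 0.5 * (dot(w, mu_c) + dot(w, mu_n))
    return w, threshold

def is_coding(point, w, threshold):
    return dot(w, point) > threshold

# Invented training points for the two classes.
coding = [(1.0, 0.9, 0.8), (1.2, 1.0, 0.7), (0.9, 1.1, 0.9),
          (1.1, 0.8, 1.0), (1.0, 1.0, 1.1)]
noncoding = [(0.2, 0.3, 0.1), (0.3, 0.1, 0.2), (0.1, 0.2, 0.3),
             (0.25, 0.35, 0.3), (0.15, 0.15, 0.05)]

w, threshold = fit_fisher(coding, noncoding)
correct = sum(is_coding(p, w, threshold) for p in coding) \
    + sum(not is_coding(p, w, threshold) for p in noncoding)
print("training accuracy:", correct / (len(coding) + len(noncoding)))
```

The projection direction is the standard Fisher choice; the threshold is one simple option among several, and real accuracies such as those quoted above would be measured on sequences not used for fitting.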

20.
Exploitation of custom-designed nucleases to induce DNA double-strand breaks (DSBs) at genomic locations of choice has transformed our ability to edit genomes, regardless of their complexity. DSBs can trigger either error-prone repair pathways that induce random mutations at the break sites or precise homology-directed repair pathways that generate specific insertions or deletions guided by exogenously supplied DNA. Prior editing strategies using site-specific nucleases to modify the Caenorhabditis elegans genome achieved only the heritable disruption of endogenous loci through random mutagenesis by error-prone repair. Here we report highly effective strategies using TALE nucleases and RNA-guided CRISPR/Cas9 nucleases to induce error-prone repair and homology-directed repair to create heritable, precise insertion, deletion, or substitution of specific DNA sequences at targeted endogenous loci. Our robust strategies are effective across nematode species diverged by 300 million years, including necromenic nematodes (Pristionchus pacificus), male/female species (Caenorhabditis species 9), and hermaphroditic species (C. elegans). Thus, genome-editing tools now exist to transform nonmodel nematode species into genetically tractable model organisms. We demonstrate the utility of our broadly applicable genome-editing strategies by creating reagents generally useful to the nematode community and reagents specifically designed to explore the mechanism and evolution of X chromosome dosage compensation. By developing an efficient pipeline involving germline injection of nuclease mRNAs and single-stranded DNA templates, we engineered precise, heritable nucleotide changes both close to and far from DSBs to gain or lose genetic function, to tag proteins made from endogenous genes, and to excise entire loci through targeted FLP-FRT recombination.
