Similar literature
 20 similar documents found (search time: 265 ms)
1.
Study of statistical correlations in DNA sequences   (Total citations: 3; self-citations: 0; citations by others: 3)
Here we present a study of statistical correlations among different positions in DNA sequences and their implications by directly using the autocorrelation function. Such an analysis is possible now because of the availability of large sequences or even complete genomes of many organisms. After describing the way in which the autocorrelation function can be applied to DNA-sequence analysis, we show that long-range correlations, implying scale independence, appear in several bacterial genomes as well as in long human chromosome contigs. The source for such correlations in bacteria, which may extend up to 60 kb in Bacillus subtilis, may be related to massive lateral transfer of compositionally biased genes from other genomes. In the human genome, correlations extend for more than five decades and may be related to the evolution of the 'neogenome', a modern evolutionary acquisition composed of GC-rich isochores displaying long-range correlations and scale invariance.
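
As a rough illustration of the autocorrelation approach named in this abstract, the sketch below computes a normalised autocorrelation for a G/C indicator sequence. The +1/-1 encoding, the function names and the toy sequence are assumptions made here for illustration; the abstract does not specify the authors' encoding or estimator.

```python
def gc_indicator(seq):
    """Map a DNA string to +1.0 for G/C and -1.0 for A/T (other symbols are skipped)."""
    mapping = {"G": 1.0, "C": 1.0, "A": -1.0, "T": -1.0}
    return [mapping[base] for base in seq.upper() if base in mapping]


def autocorrelation(values, max_lag):
    """Return the normalised autocorrelation C(k) for k = 0..max_lag."""
    n = len(values)
    if not 0 <= max_lag < n:
        raise ValueError("max_lag must satisfy 0 <= max_lag < len(values)")
    mean = sum(values) / n
    centred = [v - mean for v in values]
    variance = sum(c * c for c in centred) / n
    if variance == 0:
        raise ValueError("sequence has zero variance; autocorrelation is undefined")
    result = []
    for lag in range(max_lag + 1):
        pairs = n - lag  # number of position pairs separated by this lag
        cov = sum(centred[i] * centred[i + lag] for i in range(pairs)) / pairs
        result.append(cov / variance)
    return result


if __name__ == "__main__":
    # Hypothetical periodic toy sequence, not real genomic data.
    toy = "GGGGAAAA" * 50
    corr = autocorrelation(gc_indicator(toy), max_lag=8)
    print([round(c, 3) for c in corr])
```

On a real genome, a slow power-law decay of C(k) over many lags would be the kind of long-range correlation the abstract describes; the toy sequence only shows the mechanics.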

2.
We show that repeated sequences, like palindromes (local repetitions) and homologies between two different nucleotide sequences (motifs along the genome), compose a self-similar (fractal) pattern in mitochondrial DNA. This self-similarity comes from the looplike structures distributed along the genome. The looplike structures generate scaling laws in a pseudorandom DNA walk constructed from the sequence, called a Lévy flight. We measure the scaling laws from the generalized fractal dimension and singularity spectrum for mitochondrial DNA walks for 35 different species. In particular, we report characteristic loop distributions for mammal mitochondrial genomes.

3.
High‐throughput DNA analyses are increasingly being used to detect rare mutations in moderately sized genomes. These methods have yielded genome mutation rates that are markedly higher than those obtained using pre‐genomic strategies. Recent work in a variety of organisms has shown that mutation rate is strongly affected by sequence context and genome position. These observations suggest that high‐throughput DNA analyses will ultimately allow researchers to identify trans‐acting factors and cis sequences that underlie mutation rate variation. Such work should provide insights on how mutation rate variability can impact genome organization and disease progression.

4.
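The distribution of repetitive DNA along chromosomes is one of the key elements for understanding the organization and evolution of plant genomes. In this study, a modified genomic in situ hybridization procedure was used to perform self-genomic in situ hybridization (self-GISH) on six plant species that differ in genome size and in the amount of repetitive DNA. An uneven distribution of fluorescently labeled probe DNA was observed on the chromosomes of all species tested. The hybridization signal patterns differed markedly among species and were correlated with genome size. In Arabidopsis, which has a small genome, almost only the pericentromeric regions and the nucleolus organizer regions were labeled. In rice, sorghum and cabbage, which have relatively small genomes, hybridization signals were dispersed along the entire length of the chromosomes but were clearly predominant in the pericentromeric or proximal regions and in some heterochromatic arms. In maize and barley, which have large genomes, all chromosomes were densely labeled and showed strongly labeled regions alternating with weakly labeled or unlabeled regions along their whole length. In addition, enhanced signal bands were seen in all pericentromeric regions and nucleolus organizer regions of the cabbage chromosomes, and in all pericentromeric regions and some interstitial arm regions of the barley chromosomes. The pattern of enhanced signal bands in barley was consistent with its N-banding pattern. The self-GISH pattern of rice was essentially the same as the fluorescence in situ hybridization pattern of rice Cot-1 DNA on rice chromosomes. The results indicate that self-GISH signals actually reflect hybridization of the repetitive DNA sequences of the genome to the chromosomes, so self-GISH is an effective method for revealing the chromosomal distribution of regions where repetitive DNA clusters in plant genomes, as well as the chromatin differentiation associated with repetitive DNA.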

5.
Recent studies investigating the evolution of genome size diversity in ferns have shown that they have a distinctive genome profile compared with other land plants. Ferns are typically characterized by possessing medium‐sized genomes, although a few lineages have evolved very large genomes. Ferns are different from other vascular plant lineages as they are the only group to show evidence for a correlation between genome size and chromosome number. In this study, we aim to explore whether the evolution of fern genome sizes is not only shaped by chromosome number changes arising from polyploidy but also by constraints on the average amount of DNA per chromosome. We selected the genus Asplenium L. as a model genus to study the question because of the unique combination of a highly conserved base chromosome number and a high frequency of polyploidy. New genome size data for Asplenium taxa were combined with existing data and analyzed within a phylogenetic framework. Genome size varied substantially between diploid species, resulting in overlapping genome sizes among diploid and tetraploid spleenworts. The observed additive pattern indicates the absence of genome downsizing following polyploidy. The genome size of diploids varied non‐randomly and we found evidence for clade‐specific trends towards larger or smaller genomes. The 578‐fold range of fern genome sizes has arisen not only from repeated cycles of polyploidy but also through clade‐specific constraints governing accumulation and/or elimination of DNA.

6.
Analyses of genomic DNA sequences have shown in previous works that base pairs are correlated at large distances with scale-invariant statistical properties. We show in the present study that these correlations between nucleotides (letters) result in fact from long-range correlations (LRC) between sequence-dependent DNA structural elements (words) involved in the packaging of DNA in chromatin. Using the wavelet transform technique, we perform a comparative analysis of the DNA text and of the corresponding bending profiles generated with curvature tables based on nucleosome positioning data. This exploration through the optics of the so-called 'wavelet transform microscope' reveals a characteristic scale of 100-200 bp that separates two regimes of different LRC. We focus here on the existence of LRC in the small-scale regime (< 200 bp). Analysis of genomes in the three kingdoms reveals that this regime is specifically associated with the presence of nucleosomes. Indeed, small-scale LRC are observed in eukaryotic genomes and to a lesser extent in archaeal genomes, in contrast with their absence in eubacterial genomes. Similarly, this regime is observed in eukaryotic but not in bacterial viral DNA genomes. There is one exception for genomes of Poxviruses, the only animal DNA viruses that do not replicate in the cell nucleus and do not present small-scale LRC. Furthermore, no small-scale LRC are detected in the genomes of all examined RNA viruses, with one exception in the case of retroviruses. Altogether, these results strongly suggest that small-scale LRC are a signature of the nucleosomal structure. Finally, we discuss possible interpretations of these small-scale LRC in terms of the mechanisms that govern the positioning, the stability and the dynamics of the nucleosomes along the DNA chain.
This paper is mainly devoted to a pedagogical presentation of the theoretical concepts and physical methods which are well suited to perform a statistical analysis of genomic sequences. We review the results obtained with the so-called wavelet-based multifractal analysis when investigating the DNA sequences of various organisms in the three kingdoms. Some of these results have been announced in B. Audit et al. [1, 2].

7.
Microbial genome sequences provide us with the fossil records for inferring their origin and evolution. Assuming that current microbial genomes are the evolutionary results of ancient genomes or fragments and that neighboring genes in ancient genomes are more likely to be neighbors in current genomes, in this paper we propose a paleontological algorithm and assemble the orthologous gene groups from 66 complete and current microbial genome sequences into a pseudo-ancient genome, which consists of continuous fragments of various sizes. We performed bootstrap resampling and correlation analyses, and the results showed that the assembled ancient genome and fragments are statistically significant and that the genes of the same fragment are inherently related and likely derived from common ancestors. This method provides a new computational tool for studying microbial genome structure and evolution.

8.
In plant species with large genomes such as wheat or barley, genome organization at the level of DNA sequence is largely unknown. The largest sequences that are publicly accessible so far from Triticeae genomes are two 60 kb and 66 kb intervals from barley. Here, we report on the analysis of a 211 kb contiguous DNA sequence from diploid wheat (Triticum monococcum L.). Five putative genes were identified, two of which show similarity to disease resistance genes. Three of the five genes are clustered in a 31 kb gene-enriched island while the two others are separated from the cluster and from each other by large stretches of repetitive DNA. About 70% of the contig is comprised of several classes of transposable elements. Ten different types of retrotransposons were identified, most of them forming a pattern of nested insertions similar to those found in maize and barley. Evidence was found for major deletion, insertion and duplication events within the analysed region, suggesting multiple mechanisms of genome evolution in addition to retrotransposon amplification. Seven types of foldback transposons, an element class previously not described for wheat genomes, were characterized. One such element was found to be closely associated with genes in several Triticeae species and may therefore be of use for the identification of gene-rich regions in these species.

9.
S. Ali, G. Bala, S. Bala. Animal Genetics, 1993, 24(3): 199-202
A synthetic oligodeoxyribonucleotide probe (OAT36) comprising nine repeats of 5′ GACA 3′ and several enzymes were used to analyse cow (Bos taurus) and buffalo (Bubalus bubalis) genomes, and a number of monomorphic loci were detected in both species. Different animals from the same species showed an almost 'similar' monomorphic hybridization pattern, but animals from the two separate species showed a different 'genome specific' pattern. The overall hybridization with any enzyme and probe combination was found to be unique to one species. This forms the basis of genome specific hybridization, which is substantiated by our zoo-blot hybridization studies. The evolutionary aspect of these loci in the context of sequence polymorphisms is discussed.

10.
Genome evolution in the genus Sorghum (Poaceae)   (Total citations: 3; self-citations: 0; citations by others: 3)
BACKGROUND AND AIMS: The roles of variation in DNA content in plant evolution and adaptation remain a major biological enigma. Chromosome number and 2C DNA content were determined for 21 of the 25 species of the genus Sorghum and analysed from a phylogenetic perspective. METHODS: DNA content was determined by flow cytometry. A Sorghum phylogeny was constructed based on combined nuclear ITS and chloroplast ndhF DNA sequences. KEY RESULTS: Chromosome counts (2n = 10, 20, 30, 40) were, with few exceptions, concordant with published numbers. New chromosome numbers were obtained for S. amplum (2n = 30) and S. leiocladum (2n = 10). 2C DNA content varies 8.1-fold (1.27-10.30 pg) among the 21 Sorghum species. 2C DNA content varies 3.6-fold from 1.27 pg to 4.60 pg among the 2n = 10 species and 5.8-fold (1.52-8.79 pg) among the 2n = 20 species. The x = 5 genome size varies over an 8.8-fold range from 0.26 pg to 2.30 pg. The mean 2C DNA content of perennial species (6.20 pg) is significantly greater than the mean (2.92 pg) of the annuals. Among the 21 species studied, the mean x = 5 genome size of annuals (1.15 pg) and of perennials (1.29 pg) is not significantly different. Statistical analysis of Australian species showed: (a) mean 2C DNA content of annual (2.89 pg) and perennial (7.73 pg) species is significantly different; (b) mean x = 5 genome size of perennials (1.66 pg) is significantly greater than that of the annuals (1.09 pg); (c) the mean maximum latitude at which perennial species grow (-25.4 degrees) is significantly greater than the mean maximum latitude (-17.6 degrees) at which annual species grow. CONCLUSIONS: The DNA sequence phylogeny splits Sorghum into two lineages, one comprising the 2n = 10 species with large genomes and their polyploid relatives, and the other with the 2n = 20, 40 species with relatively small genomes. An apparent phylogenetic reduction in genome size has occurred in the 2n = 10 lineage. Genome size evolution in the genus Sorghum apparently did not involve a 'one way ticket to genomic obesity' as has been proposed for the grasses.
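
The fold ranges quoted in this abstract are simple ratios of the largest to the smallest reported value. The short sketch below recomputes them from the minima and maxima given in the abstract; the function name and dictionary labels are introduced here for illustration only.

```python
def fold_range(smallest, largest):
    """Return how many times larger the largest value is than the smallest."""
    if smallest <= 0:
        raise ValueError("smallest value must be positive")
    return largest / smallest


# (minimum, maximum) DNA content in pg, copied from the abstract above.
REPORTED_RANGES = {
    "2C, all 21 species": (1.27, 10.30),
    "2C, 2n = 10 species": (1.27, 4.60),
    "2C, 2n = 20 species": (1.52, 8.79),
    "x = 5 genome size": (0.26, 2.30),
}

if __name__ == "__main__":
    for label, (low, high) in REPORTED_RANGES.items():
        print(f"{label}: {fold_range(low, high):.1f}-fold")
```

Rounded to one decimal place, the ratios agree with the 8.1-, 3.6-, 5.8- and 8.8-fold figures stated in the abstract.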

11.
The genomes of barley and wheat, two of the world's most important crops, are very large and complex due to their high content of repetitive DNA. In order to obtain a whole-genome sequence sample, we performed two runs of 454 (GS20) sequencing on genomic DNA of barley cv. Morex, which yielded approximately 1% of a haploid genome equivalent. Almost 60% of the sequences comprised known transposable element (TE) families, and another 9% represented novel repetitive sequences. We also discovered high amounts of low-complexity DNA and non-genic low-copy DNA. We identified almost 2300 protein coding gene sequences and more than 660 putative conserved non-coding sequences. Comparison of the 454 reads with previously published genomic sequences suggested that TE families are distributed unequally along chromosomes. This was confirmed by in situ hybridizations of selected TEs. A comparison of these data for the barley genome with a large sample of publicly available wheat sequences showed that several TE families that are highly abundant in wheat are absent from the barley genome. This finding implies that the TE composition of their genomes differs dramatically, despite their very similar genome size and their close phylogenetic relationship.

12.
Whole genome analysis provides new perspectives to determine phylogenetic relationships among microorganisms. The availability of whole nucleotide sequences allows different levels of comparison among genomes by several approaches. In this work, self-attraction rates were considered for each cluster of orthologous groups of proteins (COGs) class in order to analyse gene aggregation levels in physical maps. Phylogenetic relationships among microorganisms were obtained by comparing self-attraction coefficients. Eighteen-dimensional vectors were computed for a set of 168 completely sequenced microbial genomes (19 archaea, 149 bacteria). The components of the vector represent the aggregation rate of the genes belonging to each of 18 COGs classes. Genes involved in nonessential functions or related to environmental conditions showed the highest aggregation rates. By contrast, genes involved in basic cellular tasks showed a more uniform distribution along the genome, except for translation genes. The self-attraction clustering approach allowed classification of Proteobacteria, Bacilli and other species belonging to Firmicutes. Rearrangement and lateral gene transfer events may influence divergences from classical taxonomy. Each set of COG classes’ aggregation values represents an intrinsic property of the microbial genome. This novel approach provides a new point of view for whole genome analysis and bacterial characterization.

13.
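Synthetic genomics: the art of design and synthesis   (Total citations: 1; self-citations: 0; citations by others: 1)
With the growing maturity of genome-related technologies (sequencing, editing, synthesis, etc.) and knowledge (functional genomics), synthetic genomics has found its opportunity for development in this century. Whole genomes of viruses and prokaryotes have been chemically synthesized one after another and shown to support life, the first synthetic eukaryotic genome project is more than half complete, and a project to write the human genome is on the agenda. In the practice of genome synthesis, researchers continue to explore the rules that should be followed when rewriting and designing genomes, and to improve the techniques for de novo synthesis, assembly and replacement of genomes. Synthetic genomes have broad application prospects in industry, the environment, health and basic research, and they also raise corresponding ethical issues. Drawing on genome synthesis research in the Sc2.0 project and on recent major advances in synthetic genomics, this article reviews the scientific, technical and ethical aspects of genome design and synthesis and discusses the challenges facing future development. As one of the most important fields of synthetic biology, synthetic genomics is still on the rise.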

14.
Recent advances in high‐throughput DNA sequencing have made genome‐scale analyses of genomes of extinct organisms possible. With these new opportunities come new difficulties in assessing the authenticity of the DNA sequences retrieved. We discuss how these difficulties can be addressed, particularly with regard to analyses of the Neandertal genome. We argue that only direct assays of DNA sequence positions in which Neandertals differ from all contemporary humans can serve as a reliable means to estimate human contamination. Indirect measures, such as the extent of DNA fragmentation, nucleotide misincorporations, or comparison of derived allele frequencies in different fragment size classes, are unreliable. Fortunately, interim approaches based on mtDNA differences between Neandertals and current humans, detection of male contamination through Y chromosomal sequences, and repeated sequencing from the same fossil to detect autosomal contamination allow initial large‐scale sequencing of Neandertal genomes. This will result in the discovery of fixed differences in the nuclear genome between Neandertals and current humans that can serve as future direct assays for contamination. For analyses of other fossil hominins, which may become possible in the future, we suggest a similar ‘boot‐strap’ approach in which interim approaches are applied until sufficient data for more definitive direct assays are acquired.

15.
Common wheat (Triticum aestivum L.) is an allohexaploid consisting of three different genomes (Au, B and D) which are genetically closely related. Genomic DNA of the three possible genome donors, T. urartu Thum., Aegilops speltoides Tausch and Ae. tauschii Coss., was employed as probes to hybridize with the diploid genomic DNA digested by EcoRI and HindIII, respectively. Both the hybridization strength and the band patterns among the genomes would be good indicators of genome relationships. Combining distribution data of some repetitive DNA sequences cloned from T. urartu in the three genomes, the authors draw the conclusion that Au and D are more closely related to each other than either one is to the B genome. Genomic in situ hybridization (GISH) of T. aestivum cv. Chinese Spring with genomic DNA probes of the three diploid progenitors, respectively, indicated that the three genomes could be discriminated clearly via GISH. The signals on the chromosomes of the Au and D genomes were even. However, when Ae. speltoides DNA was used as probe, there was very strong cross hybridization and the signals condensed on some areas of the metaphase chromosomes. In the interphase nucleus, the chromatin of the B genome dispersed over the same region and the signals on the homologous chromosomes were distributed symmetrically. Rich repetitive DNA sequences in the B genome, especially the tandem repeats, perhaps play an important role in the formation of the special hybridization pattern. The main difference between B and the other two genomes probably lies in the repetitive DNA sequences.

16.
Genome size variation in plants is thought to be correlated with cytological, physiological, or ecological characters. However, conclusions drawn in several studies were often contradictory. To analyze nuclear genome size evolution in a phylogenetic framework, DNA contents of 134 accessions, representing all but one species of the barley genus Hordeum L., were measured by flow cytometry. The 2C DNA contents were in a range from 6.85 to 10.67 pg in diploids (2n = 14) and reached up to 29.85 pg in hexaploid species (2n = 42). The smallest genomes were found in taxa from the New World, which became secondarily annual, whereas the largest diploid genomes occur in Eurasian annuals. Genome sizes of polyploid taxa equaled mostly the added sizes of their proposed progenitors or were slightly (1% to 5%) smaller. The analysis of ancestral genome sizes on the base of the phylogeny of the genus revealed lineages with decreasing and with increasing genome sizes. Correlations of intraspecific genome size variation with the length of vegetation period were found in H. marinum populations from Western Europe but were not significant within two species from South America. On a higher taxonomical level (i.e., for species groups or the entire genus), environmental correlations were absent. This could mostly be attributed to the superimposition of life-form changes and phylogenetic constraints, which conceal ecogeographical correlations.

17.
Genome size variation is of fundamental biological importance and has been a longstanding puzzle in evolutionary biology. In the present study, the genome size of 61 accessions corresponding to 11 genera and 50 species of Vitaceae and Leeaceae was determined using flow cytometry. Phylogenetically based statistical analyses were used to infer ancestral character reconstructions of nuclear DNA contents. The DNA 1C‐values of 38 species are reported for the first time, with the largest genome (Cyphostemma humile (N. E. Br.) Desc. ex Wild & R. B. Drumm, 1C = 3.25 pg) roughly 10.48‐fold larger than the smallest (Vitis vulpina L., 1C = 0.31 pg). The large genomes are restricted to the tribe Cayratieae, and most other extant species in the family possess relatively small genomes. Ancestral genome size reconstruction revealed that the most recent common ancestor for the family had a relatively small genome (1C = 0.85 pg). Genome evolution in Vitaceae has been characterized by a trend towards genome size reduction, with just one episode of apparent DNA accumulation in the Cayratieae lineage. Such contrasting patterns of genome size evolution probably resulted from transposable elements and chromosome rearrangements, while neopolyploidization seems to contribute to recent genome increase in some species at the tips in the family tree.

18.
The ever increasing rate at which whole genome sequences are becoming accessible to the scientific community has created an urgent need for tools enabling comparison of chromosomes of different species. We have applied biometric methods to available chromosome sequences and posted the results on our Comparative Genometrics (CG) web site. By genometrics, a term coined by Elston and Wilson [Genet. Epidemiol. (1990), 7, 17–19], we understand a biometric analysis of chromosomes. During the initial phase, our web site displays, for all completely sequenced prokaryotic genomes, three genometric analyses: the DNA walk [Lobry (1999) Microbiology Today, 26, 164–165] and two complementary representations, i.e. the cumulative GC- and TA-skew analyses, capable of identifying, at the level of whole genomes, features inherent to chromosome organization and functioning. It appears that the latter features are taxon-specific. Although primarily focused on prokaryotic chromosomes, the CG web site contains genometric information on paradigm plasmids, phages, viruses and eukaryotic organelles. Relevant data and methods can be readily used by the scientific community for further analyses as well as for tutorial purposes. Our data posted at the CG web site are freely available on the World Wide Web at http://www.unil.ch/comparativegenometrics.
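
To make the cumulative GC-skew analysis mentioned in this abstract concrete, here is a minimal sketch that sums the windowed skew (G - C) / (G + C) along a sequence. The non-overlapping windows, the window size, the function name and the toy sequence are assumptions made here; the abstract does not describe the exact procedure used on the CG web site.

```python
def cumulative_gc_skew(seq, window):
    """Return the running sum of (G - C) / (G + C) over non-overlapping windows."""
    if window <= 0:
        raise ValueError("window must be a positive integer")
    seq = seq.upper()
    running_total = 0.0
    curve = []
    # Trailing bases that do not fill a whole window are ignored.
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g_count = chunk.count("G")
        c_count = chunk.count("C")
        if g_count + c_count:
            running_total += (g_count - c_count) / (g_count + c_count)
        curve.append(running_total)
    return curve


if __name__ == "__main__":
    # Hypothetical toy sequence: a G-rich half followed by a C-rich half.
    toy = "GGGA" * 5 + "CCCA" * 5
    print(cumulative_gc_skew(toy, window=4))
```

In the toy example the curve rises while G dominates and falls once C dominates, so its turning point marks where the strand bias switches. Reading such turning points on whole chromosomes is the kind of genome-level feature the abstract refers to.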

19.
We develop and evaluate methods for inferring relatedness among individuals from low‐coverage DNA sequences of their genomes, with particular emphasis on sequences obtained from fossil remains. We suggest the major factors complicating the determination of relatedness among ancient individuals are sequencing depth, the number of overlapping sites, the sequencing error rate and the presence of contamination from present‐day genetic sources. We develop a theoretical model that facilitates the exploration of these factors and their relative effects, via measurement of pairwise genetic distances, without calling genotypes, and determine the power to infer relatedness under various scenarios of varying sequencing depth, present‐day contamination and sequencing error. The model is validated by a simulation study as well as the analysis of aligned sequences from present‐day human genomes. We then apply the method to the recently published genome sequences of ancient Europeans, developing a statistical treatment to determine confidence in assigned relatedness that is, in some cases, more precise than previously reported. As the majority of ancient specimens are from animals, this method would be applicable to investigate kinship in nonhuman remains. The developed software grups (Genetic Relatedness Using Pedigree Simulations) is implemented in Python and freely available.

20.
Banana streak virus (BSV), a member of the genus Badnavirus, is a causal agent of banana streak disease throughout the world. The genetic diversity of BSVs from different regions of banana plantations has previously been investigated, but there are relatively few reports of the genetic characteristics of episomal (non-integrated) BSV genomes isolated from China. Here, the complete genome, a total of 7722 bp (GenBank accession number DQ092436), of an isolate of BSV on cultivar Cavendish (BSAcYNV) in Yunnan, China was determined. The genome is organized in the typical manner of badnaviruses. The intergenic region of the genomic DNA contains a large stem-loop, which may contribute to the ribosome shift into the following open reading frames (ORFs). The coding region of BSAcYNV consists of three overlapping ORFs: ORF1, with a non-AUG start codon, and ORF2 encode two small proteins that are individually involved in viral movement, and ORF3 encodes a polyprotein. Besides the complete genome, a defective genome of 6525 bp, lacking the whole RNA leader region and the majority of ORF1, was also isolated and sequenced from this BSV DNA reservoir in infected banana plants. Sequence analyses showed that BSAcYNV has the closest similarity in terms of genome organization and coding assignments with a BSV isolate from Vietnam (BSAcVNV). The corresponding coding regions shared identities of 88% and ∼95% at the nucleotide and amino acid levels, respectively. Phylogenetic analysis also indicated that BSAcYNV shared the closest geographical evolutionary relationship to BSAcVNV among sequenced banana streak badnaviruses.
