Similar literature
20 similar articles found
1.
2.
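3-tuple distribution features of coding and non-coding sequences   Cited by: 2 (self-citations: 0, citations by others: 2)
Fu Qiang, Qian Minping, Chen Liangbiao, Zhu Yuxian 《Acta Genetica Sinica》2005,32(10):1018-1026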
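The origin of non-coding sequences, especially introns, is an important unresolved question. We first computed the frequency distributions of 3-tuples in the different reading frames of coding and non-coding sequences from model organisms. In coding regions the reading frames have markedly different 3-tuple distributions, whereas in non-coding regions the distributions are almost identical across reading frames, and this property is not species-dependent. To quantify the degree of difference between distributions, we introduced a new measure, the symmetric relative entropy (SRE). Comparing prokaryotes and eukaryotes showed that prokaryotes have higher SRE values than eukaryotes in both coding and non-coding regions. Further analysis showed that an organism's SRE value is correlated with the percentage of its whole genome occupied by coding regions (correlation coefficient 0.86). Computer-simulated evolution experiments showed that 2% mutation is sufficient to turn the high SRE value typical of prokaryotic coding regions into the low SRE value characteristic of eukaryotic intron regions. Aligning the annotated intron and coding-region sequences in the databases confirmed that some intron sequences are indeed highly homologous to coding regions. These results indicate that at least some eukaryotic introns may have originated from coding sequences, and that SRE may be useful for studying the evolution of genomic sequences.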
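The frame comparison described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the exact SRE formula, so the sketch assumes the symmetrized Kullback-Leibler divergence, D(p||q) + D(q||p), averaged over the three pairs of reading frames. The function names and the pseudocount are hypothetical choices made for the example.

```python
from collections import Counter
from itertools import product
from math import log2

# All 64 possible 3-tuples over the DNA alphabet.
TRIPLETS = ["".join(p) for p in product("ACGT", repeat=3)]

def triplet_distribution(seq, frame, pseudocount=1.0):
    """Frequencies of non-overlapping 3-tuples read in frame 0, 1 or 2.

    The pseudocount (an assumption of this sketch) keeps every frequency
    positive so that the logarithms below are always defined.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    total = sum(counts[t] for t in TRIPLETS) + pseudocount * len(TRIPLETS)
    return {t: (counts[t] + pseudocount) / total for t in TRIPLETS}

def symmetric_relative_entropy(p, q):
    """Assumed SRE: D(p||q) + D(q||p), in bits."""
    return sum(p[t] * log2(p[t] / q[t]) + q[t] * log2(q[t] / p[t])
               for t in TRIPLETS)

def mean_sre(seq):
    """Average SRE over the three pairs of reading frames."""
    dists = [triplet_distribution(seq, frame) for frame in range(3)]
    pairs = [(0, 1), (0, 2), (1, 2)]
    return sum(symmetric_relative_entropy(dists[a], dists[b])
               for a, b in pairs) / len(pairs)
```

Under these assumptions a strongly frame-dependent sequence such as `"ATG" * 50` gives a much larger `mean_sre` than a frame-independent one such as `"A" * 150`, which mirrors the coding versus non-coding contrast the abstract reports.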

3.
B Brenig 《Animal genetics》1999,30(2):120-125
Interspersed elements are ubiquitous in the genomes of higher eukaryotes and account for over a third of the genomic DNA (Smit 1996). In swine the short interspersed elements, SINEs or PREs (porcine repetitive elements), have been found in a number of introns and 3' untranslated regions of different genes. However, compared to human Alu repeats the number of available PRE DNA sequences is still limited. In this study we have compared 85 PREs selected from DNA sequence database entries. The PREs were aligned and for each nucleotide position the relative frequencies of the four bases were calculated. A consensus sequence was derived from the first base usage. Similar to studies of SINEs in other species, the analysis showed that most mutations in PREs occur at CpG dinucleotide hot spots. The position variability for the two most frequent bases shows a bimodal distribution. The analysis suggests that the porcine SINEs can be divided into three major subfamilies sharing conserved nucleotide similarities.  相似文献   
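The per-position frequency calculation and the consensus derivation mentioned above can be illustrated with a short sketch. It assumes the sequences are already aligned to equal length, ignores gap characters, and takes the most frequent base in each column as the consensus; the function names are hypothetical and ties are broken in the order A, C, G, T.

```python
from collections import Counter

def position_frequencies(aligned):
    """Relative frequency of A, C, G, T in each alignment column.

    Gaps and other symbols are ignored when computing the frequencies.
    """
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("aligned sequences must have equal length")
    freqs = []
    for col in range(length):
        counts = Counter(s[col].upper() for s in aligned)
        total = sum(counts[b] for b in "ACGT")
        freqs.append({b: (counts[b] / total if total else 0.0) for b in "ACGT"})
    return freqs

def consensus(aligned):
    """Most frequent base per column; '-' where a column holds no bases."""
    return "".join(
        max("ACGT", key=col.get) if any(col.values()) else "-"
        for col in position_frequencies(aligned)
    )
```

For example, `consensus(["ACGT", "ACGA", "TCGT"])` yields `"ACGT"`. The frequency of the second most common base in each column, which the abstract uses to describe position variability, can be read from the same dictionaries.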

4.
Previous studies have indicated that DNA bending is a general structural feature of sequences (ARSs) from cellular DNAs of yeasts and nuclear and mitochondrial genomic DNAs of other eukaryotes that are capable of autonomous replication in Saccharomyces cerevisiae. Here we showed that bending activity is also tightly associated with S. cerevisiae ARS function of segments cloned from mitochondrial linear DNA plasmids of the basidiomycetes Pleurotus ostreatus and Lentinus edodes. Two plasmids, designated pLPO2-like (9.4 kb) and pLPO3 (6.6 kb), were isolated from a strain of P. ostreatus. A 1029 bp fragment with high-level ARS activity was cloned from pLPO3; it contained one ARS consensus sequence (A/T)TTTAT(A/G)TTT(A/T) indispensable for activity and seven dispersed ARS consensus-like (10/11 match) sequences. A discrete bent DNA region was found to lie around 500 bp upstream from the ARS consensus sequence (T-rich strand). Removal of the bent DNA region impaired ARS function. DNA bending was also implicated in the ARS function associated with a 1430 bp fragment containing three consecutive ARS consensus sequences, which had been cloned from the L. edodes plasmid pLLE1 (11.0 kb): the three consecutive ARSs responsible for high-level ARS function occurred in, and immediately adjacent to, a bent DNA region. A clear difference exists between the two plasmid-derived ARS fragments with respect to the distance between the bent DNA region and the ARS consensus sequence(s).

5.
Non-coding genomic regions in complex eukaryotes, including intergenic areas, introns, and untranslated segments of exons, are profoundly non-random in their nucleotide composition and consist of a complex mosaic of sequence patterns. These patterns include so-called Mid-Range Inhomogeneity (MRI) regions -- sequences 30-10000 nucleotides in length that are enriched in a particular base or combination of bases (e.g. (G+T)-rich, purine-rich, etc.). MRI regions are associated with unusual (non-B-form) DNA structures that are often involved in regulation of gene expression, recombination, and other genetic processes (Fedorova & Fedorov 2010). The existence of a strong fixation bias within MRI regions against mutations that tend to reduce their sequence inhomogeneity additionally supports the functionality and importance of these genomic sequences (Prakash et al. 2009). Here we demonstrate a freely available Internet resource -- the Genomic MRI program package -- designed for computational analysis of genomic sequences in order to find and characterize various MRI patterns within them (Bechtel et al. 2008). This package also allows generation of randomized sequences with various properties and levels of correspondence to the natural input DNA sequences. The main goal of this resource is to facilitate examination of the vast regions of non-coding DNA that are still scarcely investigated and await thorough exploration and recognition.

6.
A database of sequences of 139 introns from the nematode Caenorhabditis elegans was analyzed using the information measure of Schneider et al. (1986) J. Mol. Biol. 128: 415-431. Statistically significant information is encoded by at least the first 30 nt and last 20 nt of C. elegans introns. Both the quantity and the distribution of information in the 5' splice site sequences differ between the typical short (length less than 75 nt) and rarer long (length greater than 75 nt) introns, with the 5' sites of long introns containing approximately one bit more information. 3' splice site sequences of long and short C. elegans introns differ significantly in the region between -20 and -10 nt.
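As a rough illustration of how information per position can be measured in aligned splice-site sequences, the sketch below computes 2 minus the Shannon entropy of the base frequencies in each column, in bits. This is a simplified version of the general idea: it assumes equiprobable background bases and omits any small-sample correction, so it is not claimed to reproduce the measure used in the study. The function name is hypothetical.

```python
from collections import Counter
from math import log2

def information_per_position(sites):
    """Bits of information in each column of equal-length aligned sites.

    Computed as 2 - H, where H is the Shannon entropy of the observed
    A/C/G/T frequencies in the column. Columns with no bases score 0.
    """
    info = []
    for column in zip(*sites):
        counts = Counter(b for b in (c.upper() for c in column) if b in "ACGT")
        total = sum(counts.values())
        if total == 0:
            info.append(0.0)
            continue
        entropy = -sum((n / total) * log2(n / total) for n in counts.values())
        info.append(2.0 - entropy)
    return info
```

A fully conserved column scores 2 bits and a column with all four bases equally frequent scores 0 bits; summing the per-column values over a splice-site window gives a total that could be compared between short and long introns.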

7.
8.
Revealing how recombination affects genomic sequence is of great significance to our understanding of genome evolution. The present paper focuses on the correlation between recombination rate and dinucleotide bias in the Drosophila melanogaster genome. Our results show that the overall dinucleotide bias is positively correlated with recombination rate for genomic sequences including untranslated regions, introns, intergenic regions, and coding sequences. The correlation patterns of individual dinucleotide biases with recombination rate are presented. Possible mechanisms of interaction between recombination and dinucleotide bias are discussed. Our data indicate that there may be a genome-wide universal mechanism acting between recombination rate and dinucleotide bias, which is likely to be neighbor-dependent biased gene conversion.
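One common way to quantify dinucleotide bias is the odds ratio of the observed dinucleotide frequency to the product of its two mononucleotide frequencies. The sketch below uses that ratio and, as an assumption of this example only, summarizes overall bias as the mean absolute deviation of the 16 ratios from 1. The abstract does not state which measure the authors used, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product

DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]

def dinucleotide_odds_ratios(seq):
    """Observed/expected ratio f(xy) / (f(x) * f(y)) for all 16 dinucleotides."""
    seq = seq.upper()
    mono = Counter(b for b in seq if b in "ACGT")
    di = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n_mono = sum(mono.values())
    n_di = sum(di[d] for d in DINUCLEOTIDES)
    if n_mono == 0 or n_di == 0:
        raise ValueError("sequence contains no usable dinucleotides")
    ratios = {}
    for d in DINUCLEOTIDES:
        expected = (mono[d[0]] / n_mono) * (mono[d[1]] / n_mono)
        # A ratio is undefined when one of the two bases never occurs.
        ratios[d] = (di[d] / n_di) / expected if expected else float("nan")
    return ratios

def overall_bias(seq):
    """Mean absolute deviation of the defined odds ratios from 1 (assumed summary)."""
    values = [r for r in dinucleotide_odds_ratios(seq).values() if r == r]
    return sum(abs(r - 1.0) for r in values) / len(values)
```

With such a summary computed per genomic window, the correlation with a per-window recombination rate could then be estimated with any standard correlation coefficient.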

9.
Detection, sequence patterns and function of unusual DNA structures.   Cited by: 25 (self-citations: 14, citations by others: 11)
Unusual DNA structures were detected by an electrophoretic procedure in which DNA fragments were separated according to size on agarose gels and then by shape on polyacrylamide gels. Fragments from yeast centromeres migrated faster in polyacrylamide than predicted from their base composition and size and this property was attributed to a nonrandom distribution of oligomeric A tracts that exhibited minima at 10-11 base intervals. Fragments from seven loci in 107 kb of DNA migrated anomalously slow and these fragments contained blocks of A2-6 in a 10-11 base periodicity which is indicative of bent DNA. The most pronounced bent sequences were found within yeast ARS1 and centered at 245 and 240 bp from the left and right ends of the adenovirus genome. Each sequence is approximately 150 bp away from a replication origin and the adenovirus sequences are within 50 bp of enhancers. Nuclear matrix attachment sites, which are also adjacent to enhancers, contain sequences characteristic of bent DNA. These results suggest that bent structures reside at the base of DNA loops in chromosomes.  相似文献   

10.
11.
The C. elegans genome contains a 1.7 kb repeated DNA sequence (Tc1) that is present in different numbers in various strains. In strain Bristol and 10 other strains analyzed, there are 20 ± 5 copies of Tc1, and these are located at a nearly constant set of sites in the DNA. In Bergerac, however, there are 200 ± 50 interspersed copies of Tc1 that have arisen by insertion of Tc1 elements into new genomic sites. The interspersed copies of Tc1 have a conserved, nonpermuted structure. The structure of genomic Tc1 elements was analyzed by the cloning of a single Tc1 element from Bergerac and the comparison of its structure with homologous genomic sequences in Bristol and Bergerac. Tc1 elements at three sites analyzed in Bergerac undergo apparently precise excision from their points of insertion at high frequency.  相似文献   

12.
Identification of recently gained spliceosomal introns would provide crucial evidence in the continuing debate concerning the age and evolutionary significance of introns. A previously published genomic analysis reported to have identified 122 introns that had been gained since the divergence of the nematodes Caenorhabditis elegans and Caenorhabditis briggsae approximately 100 MYA. However, using newly available genomic sequence from additional Caenorhabditis species, we show that 74% (60/81) of the reported gains in C. elegans are present in a C. briggsae relative. This pattern indicates that these introns represent losses in C. briggsae, not gains in C. elegans. In addition, 61% (25/41) of the reported gains in C. briggsae are present in the more distant C. briggsae relative, in a pattern suggesting that additional reported gains in C. elegans and/or C. briggsae may in fact represent unrecognized losses. These results underscore the dominance of intron loss over intron gain in recent eukaryotic evolution, the pitfalls associated with parsimony in inferring intron gains, and the importance of genomic sequencing of clusters of closely related species for drawing accurate inferences about genome evolution.

13.
Genes coding for 5S ribosomal RNA of the nematode Caenorhabditis elegans   Cited by: 6 (self-citations: 0, citations by others: 6)
D W Nelson  B M Honda 《Gene》1985,38(1-3):245-251
We have identified a 1-kb genomic sequence that represents the major class of 5S rRNA genes in the nematode Caenorhabditis elegans. This 1-kb sequence is tandemly repeated 110 times in the haploid genome, forming a single homogeneous gene family. Other nematode genomic sequences, distinct from the major 1-kb repeat class but homologous to it, may represent dispersed 5S rRNA genes or the ends of a gene cluster. One such fragment shows a restriction fragment length difference between two C. elegans strains. This should allow the genetic analysis of 5S rRNA-coding DNA (5S rDNA) and its flanking regions in C. elegans.

14.
Ciliates are microbial eukaryotes that separate their nuclear functions into a germline micronucleus and a somatic macronucleus. During development of the macronucleus the genome undergoes a series of reorganization events that includes the precise excision of intervening DNA. Here, we determine the architecture of four loci in the micronuclear and macronuclear genomes of the ciliate Chilodonella uncinata and compare the levels of variation in micronuclear-limited sequences to macronuclear destined sequences at two of these loci. We find that within a population, germline-limited sequences are evolving at the same rate as other putatively neutral sites, but between populations germline-limited sequences are accumulating mutations at a much faster rate than other sites. We also find evidence of macronuclear recombination and incomplete elimination of intervening DNA, which result in increased diversity in the macronuclear genome. Our results support the assertion that the unusual genomic features of ciliates can result in rapid and unpredicted patterns of diversification.  相似文献   

15.
16.
Expression patterns of gene products provide important insights into gene function. Reporter constructs are frequently used to analyze gene expression in Caenorhabditis elegans, but the sequence context of a given gene is inevitably altered in such constructs. As a result, these transgenes may lack regulatory elements required for proper gene expression. We developed Gene Catchr, a novel method of generating reporter constructs that exploits yeast homologous recombination (YHR) to subclone and tag worm genes while preserving their local sequence context. YHR facilitates the cloning of large genomic regions, allowing the isolation of regulatory sequences in promoters, introns, untranslated regions and flanking DNA. The endogenous regulatory context of a given gene is thus preserved, producing expression patterns that are as accurate as possible. Gene Catchr is flexible: any tag can be inserted at any position without introducing extra sequence. Each step is simple and can be adapted to process multiple genes in parallel. We show that expression patterns derived from Gene Catchr transgenes are consistent with previous reports and also describe novel expression data. Mutant rescue assays demonstrate that Gene Catchr-generated transgenes are functional. Our results validate the use of Gene Catchr as a valuable tool to study spatiotemporal gene expression.  相似文献   

17.
We constructed an integrated DNA marker linkage map of eggplant (Solanum melongena L.) using DNA marker segregation data sets obtained from two independent intraspecific F(2) populations. The linkage map consisted of 12 linkage groups and encompassed 1,285.5 cM in total. We mapped 952 DNA markers, including 313 genomic SSR markers developed by random sequencing of simple sequence repeat (SSR)-enriched genomic libraries, and 623 single-nucleotide polymorphisms (SNP) and insertion/deletion polymorphisms (InDels) found in eggplant-expressed sequence tags (ESTs) and related genomic sequences [introns and untranslated regions (UTRs)]. Because of their co-dominant inheritance and their highly polymorphic and multi-allelic nature, the SSR markers may be more versatile than the SNP and InDel markers for map-based genetic analysis of any traits of interest using segregating populations derived from any intraspecific crosses of practical breeding materials. However, we found that the distribution of microsatellites in the genome was biased to some extent, and therefore a considerable part of the eggplant genome was first detected when gene-derived SNP and InDel markers were mapped. Of the 623 SNP and InDel markers mapped onto the eggplant integrated map, 469 were derived from eggplant unigenes contained within Solanum orthologous (SOL) gene sets (i.e., sets of orthologous unigenes from eggplant, tomato, and potato). Out of the 469 markers, 326 could also be mapped onto the tomato map. These common markers will be informative landmarks for the transfer of tomato's more saturated genomic information to eggplant and will also provide comparative information on the genome organization of the two solanaceous species. The data are available from the DNA marker database of vegetables, VegMarks (http://vegmarks.nivot.affrc.go.jp).  相似文献   

18.
19.
In animal and viral pre-mRNAs, the process of polyadenylation is mediated through several cis-acting poly(A) signals present upstream and downstream from poly(A) sites. The situation regarding polyadenylation of higher plant pre-mRNAs, however, has remained obscure so far. In this paper, a search for putative poly(A) signals is made by considering the published data from 46 plant genomic DNA sequences. Certain domains in the 3' untranslated regions from nuclear genes of higher plants were compiled, and the occurrence of sequence motifs such as AATAAA, CAYTG, YGTGTTYY and YAYTG was scored in relation to poly(A) sites. Moreover, consensus sequences for important regions in the 3' untranslated sequences and poly(A) signals were also deduced from the data. It was inferred that sequence motifs similar to poly(A) signals exist around poly(A) sites, but some of them are in an entirely different spatial relationship than that observed in other eukaryotes. This indicates their probable non-involvement in the process of polyadenylation in higher plants, necessitating a functional analysis approach to define the plant-specific poly(A) signals.
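Scoring degenerate motifs such as CAYTG relative to a poly(A) site can be sketched with regular expressions, where the IUPAC code Y stands for C or T. The sketch below is only an illustration of the idea: the function names are hypothetical, only a few IUPAC codes are mapped, and the poly(A) site coordinate is assumed to be a 0-based index supplied by the caller.

```python
import re

# Partial IUPAC nucleotide code table; extend as needed.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}

def motif_positions(seq, motif):
    """0-based start positions of all (possibly overlapping) motif matches."""
    body = "".join(IUPAC[c] for c in motif.upper())
    # A lookahead lets finditer report overlapping occurrences.
    pattern = re.compile("(?=(" + body + "))")
    return [m.start() for m in pattern.finditer(seq.upper())]

def positions_relative_to_site(seq, motif, polya_site):
    """Motif start positions expressed relative to the poly(A) site index.

    Negative values are upstream of the site, positive values downstream.
    """
    return [p - polya_site for p in motif_positions(seq, motif)]
```

For the toy sequence `"GGAATAAACATTGCC"`, `motif_positions` finds AATAAA at position 2 and CAYTG at position 8. Collecting the relative positions over many 3' untranslated regions would give the kind of motif-to-site distribution the abstract describes.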

20.
Irimia M  Roy SW 《PLoS genetics》2008,4(8):e1000148
The presence of spliceosomal introns in eukaryotes raises a range of questions about genomic evolution. Along with the fundamental mysteries of introns' initial proliferation and persistence, the evolutionary forces acting on intron sequences remain largely mysterious. Intron number varies across species from a few introns per genome to several introns per gene, and the elements of intron sequences directly implicated in splicing vary from degenerate to strict consensus motifs. We report a 50-species comparative genomic study of intron sequences across most eukaryotic groups. We find two broad and striking patterns. First, we find that some highly intron-poor lineages have undergone evolutionary convergence to strong 3' consensus intron structures. This finding holds for both branch point sequence and distance between the branch point and the 3' splice site. Interestingly, this difference appears to exist within the genomes of green alga of the genus Ostreococcus, which exhibit highly constrained intron sequences through most of the intron-poor genome, but not in one much more intron-dense genomic region. Second, we find evidence that ancestral genomes contained highly variable branch point sequences, similar to more complex modern intron-rich eukaryotic lineages. In addition, ancestral structures are likely to have included polyT tails similar to those in metazoans and plants, which we found in a variety of protist lineages. Intriguingly, intron structure evolution appears to be quite different across lineages experiencing different types of genome reduction: whereas lineages with very few introns tend towards highly regular intronic sequences, lineages with very short introns tend towards highly degenerate sequences. 
Together, these results attest to the complex nature of ancestral eukaryotic splicing, the qualitatively different evolutionary forces acting on intron structures across modern lineages, and the impressive evolutionary malleability of eukaryotic gene structures.
