Similar literature
20 similar documents found.
1.
Here we advocate the use of two-dimensional data representation in the context of the informational approach to sequence analysis (Claverie & Bougueleret (1986) Nucleic Acids Research 14, 179-196) by applying these methods to the problem of intron/exon discrimination. Two main findings are reported: (i) oligonucleotide patterns complementary to the U1 small nuclear RNA are specifically avoided in exon sequences; (ii) vertebrate intron sequences, to the exclusion of other eukaryotic phyla, are characterized by a peculiar distribution of CpG-containing patterns.

2.
With the exponential growth of genomic sequences, there is an increasing demand to accurately identify protein-coding regions (exons) in genomic sequences. Although considerable progress has been made over the last two decades in identifying protein-coding regions by computational methods, the performance and efficiency of the prediction methods still need to be improved. In addition, it is indispensable to develop different prediction methods, since combining different methods may greatly improve prediction accuracy. A new method to predict protein-coding regions is developed in this paper, based on the fact that most exon sequences have a 3-base periodicity, while intron sequences do not have this feature. The method computes the 3-base periodicity and the background noise of stepwise DNA segments of the target DNA sequence using the nucleotide distributions in the three codon positions. Exon and intron sequences can be identified from trends in the ratio of the 3-base periodicity to the background noise along the DNA sequence. Case studies on genes from different organisms show that this method is an effective approach for exon prediction.
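The idea in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `period3_ratio` and `scan`, the window and step sizes, and the choice of "background noise" (the mean non-DC spectral power, obtained from Parseval's identity) are all assumptions made for illustration. The period-3 signal is computed directly from the counts of each nucleotide at the three codon positions.

```python
import cmath
from collections import Counter


def period3_ratio(segment):
    """Ratio of period-3 power to mean non-DC spectral power for one segment.

    Hypothetical helper: the noise definition is an assumption, not the
    formula used in the paper summarized above.
    """
    seq = [b for b in segment.upper() if b in "ACGT"]
    seq = seq[: len(seq) - len(seq) % 3]  # trim to a multiple of 3
    n = len(seq)
    if n < 3:
        raise ValueError("segment must contain at least 3 nucleotides")

    # counts[(base, k)] = occurrences of base at codon position k (0, 1, 2)
    counts = Counter((b, i % 3) for i, b in enumerate(seq))
    w = cmath.exp(-2j * cmath.pi / 3)

    # Fourier power at frequency 1/3, summed over the four indicator sequences
    signal = 0.0
    for base in "ACGT":
        x = sum(counts[(base, k)] * w ** k for k in range(3))
        signal += abs(x) ** 2

    # Parseval: total power over all N frequencies is N * N; remove the DC term
    dc = sum(c * c for c in Counter(seq).values())
    noise = (n * n - dc) / (n - 1)
    return signal / noise if noise > 0 else 0.0


def scan(sequence, window=300, step=30):
    """Slide a window along the sequence and return (start, ratio) pairs."""
    return [
        (start, period3_ratio(sequence[start:start + window]))
        for start in range(0, len(sequence) - window + 1, step)
    ]
```

Under these assumptions, a rising trend in the ratio returned by `scan` would mark candidate coding regions, while a ratio near the noise floor would suggest non-coding sequence.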

3.
In this study, the evolutionary history of the variable second exon of RT1.Ba and its adjoining intron b are compared across a number of species and subspecies of the Australian Rattus. Three lineages are identified in the second intron across a range of Rattus species. Two of these lineages, separated by the insertion of a probable rodent short interspersed nucleotide element and by point mutations outside the indel region, are both found in each of the major clades of the endemic Australian Rattus. This pattern of ancestral polymorphism is reflected in the adjoining exon 2 sequences, although phylogenetic constraints confirm that the clustering is not identical to that of the associated intron sequences. In addition, the coding sequences show evidence of the retention of ancestral polymorphism, with identical exon sequences found in two divergent species, and some indication of gene conversion detected for the exon sequences.

4.
Twenty-three sequence haplotypes spanning the boundary of the second exon and intron of a red-winged blackbird Mhc class II B gene, Agph-DAB1, are presented. The polymorphism of the exon segment is distributed in two divergent allelic lineages which appear to be maintained by balancing selection. The silent nucleotide diversity of the exon (pi = 0.101) is more than five times that of the intron (pi = 0.018) and decays rapidly across the exon-intron boundary. Additionally, genealogical reconstruction indicates that divergence from a common ancestor in the exon sample is over four times that of the intron. The intron sequences reveal a pattern of polymorphism which is characteristic of directional selection, rather than a pattern expected from linkage to a balanced polymorphism. These results suggest that the evolutionary histories of these two adjacent regions have been disassociated by recombination or gene conversion. The estimated population recombination parameter between the exon and the intron is sufficiently high (4NeC = 8.545) to explain the homogenization of intron sequences. Compatibility analyses estimate that these events primarily occur from the exon-intron boundary to about 20-30 bases into the intron. Additionally, the observation that divergent exon alleles share identical intron sequence supports the conclusion of disassociation of exon and intron evolutionary histories by recombination.
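For readers unfamiliar with the diversity statistic quoted above, the following Python sketch shows one common way to compute nucleotide diversity (pi) as the average proportion of differing sites over all pairs of aligned sequences. It is a simplified illustration only: the function name `nucleotide_diversity` is hypothetical, every haplotype is weighted equally, and all sites are counted, whereas the abstract reports diversity at silent sites.

```python
from itertools import combinations


def nucleotide_diversity(haplotypes):
    """Average pairwise proportion of differing sites among aligned sequences.

    Assumes equal-length, gap-free alignments and equal weight per haplotype.
    """
    if len(haplotypes) < 2:
        raise ValueError("need at least two sequences")
    length = len(haplotypes[0])
    if length == 0 or any(len(h) != length for h in haplotypes):
        raise ValueError("sequences must be aligned, non-empty, and equal in length")

    pairs = list(combinations(haplotypes, 2))
    # Count mismatching sites for every pair of sequences
    differences = sum(
        sum(a != b for a, b in zip(x, y)) for x, y in pairs
    )
    return differences / (len(pairs) * length)
```

Computing this separately for the exon and intron segments of the same haplotypes would give the kind of exon-versus-intron comparison the abstract describes.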

5.
A bovine genomic clone that hybridized to HLA-DQ beta cDNA was isolated, and fragments containing the beta 1, beta 2 and transmembrane (TM) exons were subcloned. The nucleotide sequences of the exons and flanking intron regions were determined. Comparisons of these exon nucleotide sequences and derived amino acid sequences to human class II beta-chain sequences showed that this gene is only 77% identical to HLA-DQ beta and about 75% identical to bovine DQ beta-like genes. The exon sequences were more divergent from other class II beta-chain genes. However, structural features such as conserved cysteines and regions of amino acids strongly suggest this to be a class II beta-chain gene. When exon-containing fragments were used as hybridization probes on Southern blots of bovine genomic DNA digested with EcoRI or PvuII, each exon hybridized to a single band. Based on these results we have referred to this gene as a novel bovine class II beta-chain gene, BoLA-DIB.

6.
New contributions toward generalizing evolutionary models greatly expand our ability to analyze complex evolutionary characters and advance phylogeny reconstruction. In this article, we extend the binary stochastic Dollo model to allow for multi-state characters. In doing so, we align previously incompatible Wagner and Dollo parsimony principles under a common probabilistic framework by embedding arbitrary continuous-time Markov chains into the binary stochastic Dollo model. This approach enables us to analyze character traits that exhibit both Dollo and Wagner characteristics throughout their evolutionary histories. Utilizing Bayesian inference, we apply our novel model to analyze intron conservation patterns and the evolution of alternatively spliced exons. The generalized framework we develop demonstrates potential in distinguishing between phylogenetic hypotheses and providing robust estimates of evolutionary rates. Moreover, for the two applications analyzed here, our framework is the first to provide an adequate stochastic process for the data. We discuss possible extensions to the framework from both theoretical and applied perspectives.

7.
8.

Background  

In some genomic applications it is necessary to design large numbers of PCR primers in exons flanking one or several introns on the basis of orthologous gene sequences in related species. The primer pairs designed by this target gene approach are called "intron-flanking primers" or, because they are located in exonic sequences which are usually conserved between related species, "conserved primers". They are useful for large-scale single nucleotide polymorphism (SNP) discovery and marker development, especially in species such as wheat, for which a large number of ESTs are available but genome sequences and intron/exon boundaries are not. To date, no suitable high-throughput tool is available for this purpose.

9.
In this paper, a revision of the existing method of locating exons by a genomic signal processing technique employing four binary indicator sequences is presented. The existing method relies on the pronounced period-three peaks observed in the Fourier power spectrum of exon regions, which are absent in non-coding regions. The authors have abandoned the four sequences altogether and adopted a single "EIIP indicator sequence", which is formed by substituting the electron-ion interaction pseudopotentials (EIIP) of the nucleotides A, G, C and T in the DNA sequence, reducing the computational overhead by 75%. The power spectrum of this sequence reveals period-three peaks for exon regions. In addition, a number of exons have been identified which exhibit period-three peaks when mapped to the EIIP indicator sequence but do not show them when the binary indicator sequences are employed. We obtained better discrimination between exon areas and non-coding areas of a number of genomes when the sequences are mapped to EIIP indicator sequences and their power spectra are taken in a sliding Kaiser window, compared to the existing method, which uses a rectangular window and binary indicator sequences.
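The EIIP mapping and sliding-window spectrum described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the EIIP values are those commonly tabulated in the literature (the abstract does not list them), and the function names, window length (351), step (3), and Kaiser shape parameter (beta = 4.0) are hypothetical choices. Only the single frequency 1/3 is evaluated per window.

```python
import cmath
import math

# EIIP values as commonly tabulated; an assumption, not given in the abstract
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def bessel_i0(x, terms=30):
    """Zeroth-order modified Bessel function of the first kind (power series)."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / (2.0 * k)) ** 2
        total += term
    return total


def kaiser(m, beta):
    """Kaiser window of length m (m >= 2) with shape parameter beta."""
    denom = bessel_i0(beta)
    window = []
    for n in range(m):
        r = 2.0 * n / (m - 1) - 1.0
        window.append(bessel_i0(beta * math.sqrt(max(0.0, 1.0 - r * r))) / denom)
    return window


def period3_power(seq, window=351, step=3, beta=4.0):
    """Return (start, power at frequency 1/3) for each sliding Kaiser window."""
    signal = [EIIP[b] for b in seq.upper()]  # raises KeyError on non-ACGT symbols
    taper = kaiser(window, beta)
    twiddle = [cmath.exp(-2j * cmath.pi * n / 3) for n in range(window)]
    result = []
    for start in range(0, len(signal) - window + 1, step):
        x = sum(signal[start + n] * taper[n] * twiddle[n] for n in range(window))
        result.append((start, abs(x) ** 2))
    return result
```

Because a single numeric sequence replaces the four binary indicator sequences, only one transform per window is needed, which is the source of the 75% saving mentioned in the abstract. Peaks in the returned power profile would be read as candidate exon locations.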

10.
Use of runs statistics for pattern recognition in genomic DNA sequences. Total citations: 2 (self-citations: 0, citations by others: 2)
In this article, the use of the finite Markov chain imbedding (FMCI) technique to study patterns in DNA under a hidden Markov model (HMM) is introduced. With a vision of studying multiple runs-related statistics simultaneously under an HMM through the FMCI technique, this work establishes an investigation of a bivariate runs statistic under a binary HMM for DNA pattern recognition. An FMCI-based recursive algorithm is derived and implemented for the determination of the exact distribution of this bivariate runs statistic under an independent identically distributed (IID) framework, a Markov chain (MC) framework, and a binary HMM framework. With this algorithm, we have studied the distributions of the bivariate runs statistic under different binary HMM parameter sets; probabilistic profiles of runs are created and shown to be useful for trapping HMM maximum likelihood estimates (MLEs). This MLE-trapping scheme offers good initial estimates to jump-start the expectation-maximization (EM) algorithm in HMM parameter estimation and helps prevent the EM estimates from landing on a local maximum or a saddle point. Applications of the bivariate runs statistic and the probabilistic profiles in conjunction with binary HMMs for pattern recognition in genomic DNA sequences are illustrated via case studies on DNA bendability signals using human DNA data.

11.
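Introducing correlations between bases, we studied the fractal dimension of exon and intron sequences taking dinucleotides as the unit. We found that, in this setting, exon and intron sequences are self-similar at short and medium range, and we defined a fractal dimension for each of these two ranges. The results show that the short-range fractal dimension Dg is generally larger than the medium-range fractal dimension Dm, and that both fractal dimensions are larger for exons than for introns. Changing the phase of the doublets leaves the fractal dimensions unchanged. This indicates that, at the doublet level, exons are more irregular than introns, short-range structure is more irregular than medium-range structure, and neither exon nor intron sequences show phase specificity with respect to period-2 structure.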

12.
H H Lin, D K Ann. Genomics, 1991, 10(1):102-113

13.
14.
15.
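Genome-wide detection and analysis of alternative splicing of NBS-LRR genes in rice. Total citations: 1 (self-citations: 0, citations by others: 1)
Gu Lianfeng (顾连峰), Guo Rongfa (郭荣发). Acta Genetica Sinica (《遗传学报》), 2007, 34(3):247-257
Alternative splicing is a major mechanism contributing to genome complexity and proteome diversity, yet no genome-wide analysis of alternative splicing of rice NBS-LRR sequences has been reported. Using hidden Markov model searches, 855 sequences encoding NBS-LRR motifs were obtained from the TIGR database. These sequences were used for homology searches against three databases (KOME, the TIGR Gene Index, and UniProt) to obtain homologous full-length cDNA sequences, tentative consensus sequences, and protein sequences. The full-length cDNA sequences and tentative consensus sequences were then aligned to the corresponding BAC sequences with the Spidey and SIM4 programs to predict alternative splicing; protein sequences were aligned to genomic sequences with tBLASTn. Of these 875 NBS-LRR genes, 119 showed alternative splicing, comprising 71 intron retentions, 20 exon skippings, 25 alternative initiations, 16 alternative terminations, 12 alternative 5′ splice sites, and 16 alternative 3′ splice sites. Most alternative splicing events were supported by two or more transcripts. The data can be queried at http://www.bioinfor.org. Bioinformatic analysis of the splice boundaries showed that the 'GT…AG' rule is less conserved for exon skipping and intron retention than for constitutive splicing, which suggests that different regulatory mechanisms guide the formation of splice variants. Analysis of the effect of intron retention on proteins showed that alternatively spliced proteins tend to have altered C-terminal amino acid sequences. Finally, the tissue distribution and protein localization of alternative splicing were analyzed. The tissues with the most alternative splicing were root and callus, and more than one third of the splice-variant proteins were localized to the plasma membrane and cytoplasm. These alternatively spliced proteins may play an important role in disease-resistance signal transduction.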

16.
In common with other multigene families, sequence diversity in the hemoglobin genes of cladoceran crustaceans has been heavily impacted by gene conversion events. Because of their structural complexity (six exons, five introns), these genes provide a good opportunity to study the influence of intron length and position on the conversion process. This study surveys the patterns of divergence in variants of one hemoglobin gene (H1) from two closely related species of Daphnia using a PCR-based approach. Although its effects were most pronounced at their 5' ends, intron and exon regions of these genes showed similar exposure to gene conversion, excepting intron 2. This intron, which was the only one with a marked length difference among variants, showed substantial sequence divergence, suggesting that gene conversion was disrupted. These results, together with those on hemoglobin gene families in other organisms, indicate that sequence tracts showing gene conversion are often distributed in a mosaic fashion. The reactivation of gene conversion downstream of a block protected from its effects suggests that there are multiple initiation points, and the distribution of conversion tracts suggests that exon/intron splice sites are important in this regard.

17.
18.
We have isolated and characterized the gene encoding the human androgen receptor. The coding sequence is divided into eight coding exons and spans a minimum of 54 kilobases. The positions of the exon boundaries are highly conserved when compared to the location of the exon boundaries of the chicken progesterone and human estrogen receptor genes. Definition of the intron/exon boundaries has permitted the synthesis of specific oligonucleotides for use in the amplification of segments of the androgen receptor gene from samples of total genomic DNA. This technique allows the analysis of all segments of the androgen receptor gene except a small region of exon 1 that encodes the glycine homopolymeric segment. Using these methods we have analyzed samples of DNA prepared from a patient with complete androgen resistance and have detected a single nucleotide substitution at nucleotide 1924 in exon 3 of the androgen receptor gene that results in the conversion of a lysine codon into a premature termination codon at amino acid position 588. The introduction of a termination codon into the sequence of the normal androgen receptor cDNA at this position leads to a decrease in the amount of mRNA encoding the human androgen receptor and the synthesis of a truncated receptor protein that is unable to bind ligand and is unable to activate the long terminal repeat of the mouse mammary tumor virus in cotransfection assays.

19.
Prediction of splice junctions in mRNA sequences. Total citations: 8 (self-citations: 6, citations by others: 2)
K Nakata, M Kanehisa, C DeLisi. Nucleic Acids Research, 1985, 13(14):5327-5340
A general method based on the statistical technique of discriminant analysis is developed to distinguish boundaries of coding and non-coding regions in nucleic acid sequences. In particular, the method is applied to the prediction of splicing sites in messenger RNA precursors. Information used for discrimination includes consensus sequence patterns around splice junctions, free energy of snRNA and mRNA base pairing, and statistical differences between coding and non-coding regions, such as the periodic appearance of specific bases in coding regions reflecting the non-random usage of degenerate codons. Given the reading frame of an exon (but not the exon/intron boundaries), the method will predict the following exon, that is, the intron to be excised. When applied to human sequences in the GenBank database, the method correctly identified 80% of true splice junctions.

20.
The AE1 (anion exchanger, band 3) protein is expressed in erythrocytes and in the A-type intercalated cells of the kidney distal collecting tubule. In both cell types it mediates the electroneutral transport of chloride and bicarbonate ions across the lipid bilayer, and, in erythrocytes, it also serves as the critical attachment site of the peripheral membrane skeleton. We have characterized the human AE1 gene using overlapping clones isolated from a phage library of human genomic DNA. The gene spans 20 kb and consists of 20 exons separated by 19 introns. The structure of the human AE1 gene corresponds closely with that of the previously characterized mouse AE1 gene, with a high degree of conservation of exon/intron junctions, as well as exon and intron nucleotide sequences. The putative upstream and internal promoter sequences of the human AE1 gene used in erythroid and kidney cells, respectively, are described. We also report the nucleotide sequence of the entire 3′ noncoding region of exon 20, which was lacking in the published cDNA sequences. In addition, we have characterized 9 Alu repeat elements found within the body of the human AE1 gene that are members of 4 related subfamilies that appear to have entered the genome at different times during primate evolution.
