Similar literature
1.
2.
3.
4.
Interpolated Markov chains for eukaryotic promoter recognition.   Total citations: 9, self-citations: 0, citations by others: 9
MOTIVATION: We describe a new content-based approach for the detection of promoter regions of eukaryotic protein-encoding genes. Our system is based on three interpolated Markov chains (IMCs) of different order which are trained on coding, non-coding and promoter sequences. It was recently shown that the interpolation of Markov chains leads to stable parameters and improves on the results in microbial gene finding (Salzberg et al., Nucleic Acids Res., 26, 544-548, 1998). Here, we present new methods for an automated estimation of optimal interpolation parameters and show how the IMCs can be applied to detect promoters in contiguous DNA sequences. Our interpolation approach can also be employed to obtain a reliable scoring function for human coding DNA regions, and the trained models can easily be incorporated in the general framework for gene recognition systems. RESULTS: A 5-fold cross-validation evaluation of our IMC approach on a representative sequence set yielded a mean correlation coefficient of 0.84 (promoter versus coding sequences) and 0.53 (promoter versus non-coding sequences). Applied to the task of eukaryotic promoter region identification in genomic DNA sequences, our classifier identifies 50% of the promoter regions in the sequences used in the most recent review and comparison by Fickett and Hatzigeorgiou (Genome Res., 7, 861-878, 1997), while having a false-positive rate of 1/849 bp.
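To make the idea of interpolating Markov chains of different order concrete, the following is a minimal sketch rather than the authors' method. The abstract does not give the interpolation formula or how the parameters are estimated, so the count-based weight `n / (n + confidence)`, the class name `InterpolatedMarkovChain`, the helper `log_odds`, and the toy training sequences are all assumptions made for illustration.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


class InterpolatedMarkovChain:
    """Toy interpolated Markov chain (hypothetical weighting scheme)."""

    def __init__(self, max_order=3, pseudo=1.0, confidence=10.0):
        self.max_order = max_order
        self.pseudo = pseudo          # pseudocount for the order-0 model
        self.confidence = confidence  # assumed constant controlling the weights
        # counts[k][context][base]: how often `base` follows a k-mer context
        self.counts = [defaultdict(lambda: defaultdict(int))
                       for _ in range(max_order + 1)]

    def train(self, sequences):
        for seq in sequences:
            for i, base in enumerate(seq):
                for k in range(min(i, self.max_order) + 1):
                    self.counts[k][seq[i - k:i]][base] += 1

    def prob(self, context, base):
        # Start from the smoothed order-0 estimate, then blend in higher orders.
        zero = self.counts[0][""]
        total = sum(zero.values())
        p = (zero[base] + self.pseudo) / (total + self.pseudo * len(ALPHABET))
        for k in range(1, min(self.max_order, len(context)) + 1):
            table = self.counts[k].get(context[len(context) - k:])
            if not table:
                break  # context never seen: keep the lower-order estimate
            n = sum(table.values())
            lam = n / (n + self.confidence)  # more data, more trust (assumption)
            p = lam * (table[base] / n) + (1.0 - lam) * p
        return p

    def log_likelihood(self, seq):
        return sum(
            math.log(self.prob(seq[max(0, i - self.max_order):i], seq[i]))
            for i in range(len(seq))
        )


def log_odds(model_a, model_b, seq):
    """Positive values favour model_a, negative values favour model_b."""
    return model_a.log_likelihood(seq) - model_b.log_likelihood(seq)


# Toy data only; real models would be trained on annotated sequence sets.
promoter_model = InterpolatedMarkovChain()
promoter_model.train(["TATAAATATATAAATATA", "ATATATAAATTATAAA"])
background_model = InterpolatedMarkovChain()
background_model.train(["GCGCGGCCGCGCGGCC", "CCGGCGCGCCGGCG"])
```

A sequence window would then be assigned to whichever trained model gives the higher likelihood, for example via `log_odds(promoter_model, background_model, window)`. The blending keeps each conditional distribution normalized, because every step is a convex combination of two distributions.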

5.
6.
7.
8.
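Computational localization of transcription start sites (TSSs) is an important part of research on gene transcriptional regulation, but existing methods have low recognition performance. Building on an existing prokaryotic promoter recognition algorithm, the authors propose a sliding-window method for the computational localization of prokaryotic TSSs, which predicts the TSS position by sliding a scan over the sequence within a reasonably restricted localization range. First, quadratic discriminant classifiers are built separately from the overlapping composition features of the window sequence and from other promoter features, and are used to compute a likelihood score for each position. The likelihood scores are then corrected using the empirical distribution of the distance between the TSS and the translation start site. Finally, a threshold localization algorithm determines the predicted position from the distribution of the likelihood scores. Tests on real Escherichia coli sequence data show that the algorithm can effectively predict the positions of true TSSs. Compared with existing algorithms, at a sensitivity of about 0.85, specificity rises from 0.20 to 0.65, improving localization accuracy by about 20 percentage points.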
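The scan-score-correct-threshold pipeline described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract gives no window length, search range, or threshold rule, so the function `locate_tss`, its default parameters, and the rule "take the best corrected score if it reaches the threshold" are hypothetical. The toy scorer `toy_window_score` merely stands in for the quadratic discriminant classifiers, and `toy_distance_log_prior` stands in for the empirical TSS-to-translation-start distance distribution.

```python
import math


def locate_tss(sequence, tls_index, window_score, distance_log_prior,
               window=80, max_distance=250, threshold=0.0):
    """Scan candidate TSS positions upstream of the translation start site.

    Returns the best-scoring position, or None if no corrected score
    reaches `threshold`. All parameter defaults are assumptions.
    """
    best_pos, best_score = None, -math.inf
    for distance in range(1, max_distance + 1):
        pos = tls_index - distance   # candidate TSS position
        start = pos - window         # window ends just before the candidate
        if start < 0:
            break                    # ran off the start of the sequence
        # Classifier score corrected by the distance prior (log scale).
        score = window_score(sequence[start:pos]) + distance_log_prior(distance)
        if score > best_score:
            best_pos, best_score = pos, score
    return best_pos if best_score >= threshold else None


def toy_window_score(window_seq):
    # Hypothetical stand-in for the classifier: reward a TATAAT hexamer
    # ending 7 bp upstream of the candidate position.
    return 5.0 if window_seq[-13:-7] == "TATAAT" else 0.0


def toy_distance_log_prior(distance):
    # Hypothetical prior peaking 40 bp upstream of the translation start.
    return -abs(distance - 40) / 20.0


# Toy sequence: hexamer at 100..105, candidate TSS at 113, ATG at 153.
toy_sequence = "G" * 100 + "TATAAT" + "G" * 47 + "ATG" + "G" * 20
predicted = locate_tss(toy_sequence, 153, toy_window_score,
                       toy_distance_log_prior, threshold=1.0)
```

Passing the scorer and the prior as functions keeps the scan independent of how the scores are produced, so a trained classifier and a measured distance distribution could be substituted without changing `locate_tss`.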

9.
A synthetic DNA probe designed to detect coding sequences for platelet factor 4 and connective tissue-activating peptide III (two human platelet alpha-granule proteins) was used to identify several similar sequences in total human DNA. Sequence analysis of a corresponding 3,201-base-pair EcoRI fragment isolated from a human genomic library demonstrated the existence of a variant of platelet factor 4, designated PF4var1. The gene for PF4var1 consisted of three exons and two introns. Exon 1 coded for a 34-amino-acid hydrophobic leader sequence that had 70% sequence homology with the leader sequence for PF4 but, in contrast, contained a hydrophilic amino-terminal region with four arginine residues. Exon 2 coded for a 42-amino-acid segment that was 100% identical with the corresponding segment of the mature PF4 sequence containing the amino-terminal and disulfide-bonded core regions. Exon 3 coded for the 28-residue carboxy-terminal region corresponding to a domain specifying heparin-binding and cellular chemotaxis. However, PF4var1 had amino acid differences at three positions in the lysine-rich carboxy-terminal end that were all conserved among human, bovine, and rat PF4s. These differences should significantly affect the secondary structure and heparin-binding properties of the protein based on considerations of the bovine PF4 crystal structure. By comparing the PF4var1 genomic sequence with the known human cDNA and the rat genomic PF4-coding sequences, we identified potential genetic regulatory regions for PF4var1. Rat PF4 and human PF4var1 genes had identical 18-base sequences 5' to the promoter region. The intron positions appeared to correspond approximately to the boundaries of the protein functional domains.

10.
11.
A group-specific primer, F243 (positions 226 to 243, Escherichia coli numbering), was developed by comparison of sequences of genes encoding 16S rRNA (16S rDNA) for the detection of actinomycetes in the environment with PCR and temperature or denaturing gradient gel electrophoresis (TGGE or DGGE, respectively). The specificity of the forward primer in combination with different reverse primers was tested with genomic DNA from a variety of bacterial strains. Most actinomycetes investigated could be separated by TGGE and DGGE, with both techniques giving similar results. Two strategies were employed to study natural microbial communities. First, we used the selective amplification of actinomycete sequences (E. coli positions 226 to 528) for direct analysis of the products in denaturing gradients. Second, we used a nested PCR in which actinomycete-specific fragments (E. coli positions 226 to 1401) served as the template for a PCR with conserved primers. The products (E. coli positions 968 to 1401) of this indirect approach were then separated by use of gradient gels. Both approaches allowed detection of actinomycete communities in soil. The second strategy allowed the estimation of the relative abundance of actinomycetes within the bacterial community. Mixtures of PCR-derived 16S rDNA fragments were used as model communities consisting of five actinomycetes and five other bacterial species. Actinomycete products were obtained over a 100-fold dilution range of the actinomycete DNA in the model community by specific PCR; detection of the diluted actinomycete DNA was not possible when conserved primers were used. The methods tested for detection were applied to monitor actinomycete community changes in potato rhizosphere and to investigate actinomycete diversity in different soils.

12.
13.
14.
15.
Computational analysis of core promoters in the Drosophila genome   Total citations: 1, self-citations: 0, citations by others: 1
Ohler U, Liao GC, Niemann H, Rubin GM. Genome Biology, 2002, 3(12): research0087.1-8712

16.
17.
We have characterized sequences of genomic DNA 5' to the coding region of the rat malic enzyme gene. This sequence possesses neither TATA nor CCAAT sequences in their usual positions but is rich in GC residues. Sequences similar to those found in the regulatory regions of other genes are discussed. Deletion analyses have revealed that sequences +1 to -41 are sufficient to initiate expression, although inclusion of information up to -177 is necessary for maximal promoter activity.

18.
19.
20.