Similar literature
20 similar articles found.
1.
Pattern recognition in several sequences: Consensus and alignment   Total citations: 12, self-citations: 0, citations by others: 12
The comparison of several sequences is central to many problems of molecular biology. Examples include finding consensus patterns that define genetic control regions or that determine structural or functional themes. Previously proposed methods, such as dynamic programming, are not adequate for solving problems of realistic size. This paper gives a new and practical solution for finding unknown patterns that occur imperfectly above a preset frequency. Algorithms for finding the patterns are given as well as estimates of statistical significance. This author supported by a grant from the System Development Foundation. This author supported by NSF grant MCS-8301960 and by a grant from the System Development Foundation. This author supported by NIH grant GM19036.

2.
Identification of functional motifs in a DNA sequence is fundamentally a statistical pattern recognition problem. Discriminant analysis is widely used for solving such problems. This paper will review two basic parametric methods: LDA (linear discriminant analysis) and QDA (quadratic discriminant analysis). Their usage in recognition of splice sites and exons in the human genome will be demonstrated.

3.
A new development is introduced here in the use of dynamic programming in finding pattern similarities in genetic sequences, as was first done by Needleman and Wunsch (1969). A condition of pattern similarity is defined and an algorithm is given which scans any set of similarities and screens out those which fail to meet the condition. When the set to be scanned contains every pair of segments, one from each of two given sequences of lengths m and n (i.e. every possible location for a pattern similarity), then it completes the scan in a number of computational steps proportional to m·n, leaving those pairs of segments which satisfy the similarity condition. The algorithm is based on the concept of match density, as suggested by Goad and Kanehisa (1982).

4.
Selective amplification in PCR is principally determined by the sequence of the primers and the temperature of the annealing step. We have developed a new PCR technique for distinguishing related sequences in which additional selectivity is dependent on sequences within the amplicon. A 5′ extension is included in one (or both) primer(s) that corresponds to sequences within one of the related amplicons. After copying and incorporation into the PCR product this sequence is then able to loop back, anneal to the internal sequences and prime to form a hairpin structure; this structure is then refractory to further amplification. Thus, amplification of sequences containing a perfect match to the 5′ extension is suppressed while amplification of sequences containing mismatches or lacking the sequence is unaffected. We have applied Headloop PCR to DNA that had been bisulphite-treated for the selective amplification of methylated sequences of the human GSTP1 gene in the presence of up to a 10^5-fold excess of unmethylated sequences. Headloop PCR has a potential for clinical application in the detection of differently methylated DNAs following bisulphite treatment as well as for selective amplification of sequence variants or mutants in the presence of an excess of closely related DNA sequences.

5.
The A-factor receptor protein (ArpA) containing an α-helix-turn-α-helix DNA-binding consensus sequence at its N-terminal portion plays a key role in the regulation of secondary metabolism and cell differentiation in Streptomyces griseus. A binding site forming a palindrome 24 bp in length was initially recovered from a pool of random-sequence oligonucleotides by rounds of a binding/immunoprecipitation/amplification procedure with histidine-tagged ArpA and anti-ArpA antibody. By means of further binding/gel retardation/amplification experiments on the basis of the recovered sequence, a 22 bp palindromic binding site with the sequence 5'-GG(T/C)CGGT(A/T)(T/C)-G(T/G)-3' as one half of the palindrome was deduced as a consensus sequence recognized and bound by ArpA. ArpA did not bind to the binding site in the presence of its ligand, A-factor. In addition, exogenous addition of A-factor to the ArpA–DNA complex induced immediate release of ArpA from the DNA. All of these data are consistent with the idea, obtained from previous genetic studies, that ArpA acts as a repressor-type regulator for secondary metabolism and cellular differentiation by preventing the expression of a certain key gene(s) during the early growth phase. A-factor, produced in a growth-dependent manner, releases ArpA from the DNA, thus switching on the expression of the key gene(s), leading to the onset of secondary metabolism and aerial mycelium formation.

6.
We used the "generalized portrait" algorithm from pattern recognition theory to find a distinguishing vector for Escherichia coli promoters. We attempted to solve three closely linked problems: choosing the significant features of the signal, multiple alignment, and calculation of the recognition vector (matrix). Promoters of known strength were ranked with this vector, and the occurrence of predicted promoters was analysed. The promoter search program for IBM-compatible computers is available from the authors.

7.
We have developed a method for identifying consensus patterns in a set of unaligned DNA sequences known to bind a common protein or to have some other common biochemical function. The method is based on a matrix representation of binding site patterns. Each row of the matrix represents one of the four possible bases, each column represents one of the positions of the binding site and each element is determined by the frequency the indicated base occurs at the indicated position. The goal of the method is to find the most significant matrix (i.e. the one with the lowest probability of occurring by chance) out of all the matrices that can be formed from the set of related sequences. The reliability of the method improves with the number of sequences, while the time required increases only linearly with the number of sequences. To test this method, we analysed 11 DNA sequences containing promoters regulated by the Escherichia coli LexA protein. The matrices we found were consistent with the known consensus sequence, and could distinguish the generally accepted LexA binding sites from other DNA sequences. Received on November 6, 1989; accepted on December 20, 1989
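The matrix representation described in this abstract (one row per base, one column per binding-site position, each element a frequency) can be sketched as follows. This is only an illustration under stated assumptions: the sites are taken to be already aligned and of equal length, whereas the method in the abstract searches unaligned sequences for the most significant matrix. The function names and the toy sites are hypothetical and are not the LexA data.

```python
from collections import Counter

BASES = "ACGT"  # row order of the matrix


def frequency_matrix(sites):
    """Return {base: [frequency at each position]} for aligned, equal-length sites.

    Assumption: the sites are already aligned; the search over unaligned
    sequences described in the abstract is not reproduced here.
    """
    if not sites:
        raise ValueError("need at least one site")
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all sites must have the same length")
    matrix = {b: [0.0] * width for b in BASES}
    for pos in range(width):
        counts = Counter(s[pos] for s in sites)
        for b in BASES:
            matrix[b][pos] = counts[b] / len(sites)
    return matrix


def consensus(matrix):
    """Most frequent base per column (ties resolved in BASES order)."""
    width = len(matrix["A"])
    return "".join(max(BASES, key=lambda b: matrix[b][pos]) for pos in range(width))


# Hypothetical toy sites, for illustration only.
toy_sites = ["CTGTAT", "CTGGAT", "CTGTAC", "ATGTAT"]
toy_matrix = frequency_matrix(toy_sites)
print(consensus(toy_matrix))
```

Each column of the resulting matrix sums to 1, so a column dominated by one base marks a strongly conserved position.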

8.
A computer search of the pBR322 DNA sequence identified five sites matching reported glucocorticoid regulatory element (GRE) DNA consensus sequences and three related sites. A pBR322 DNA fragment containing one GRE site was shown to bind immobilized HeLa S3 cell glucocorticoid receptor and to compete for receptor binding in a competitive binding assay. Conversely, a pBR322 DNA fragment devoid of GRE sites showed barely detectable interaction with glucocorticoid receptor in either of these assays. These results demonstrate the importance of GRE consensus sequences in glucocorticoid receptor interactions with DNA, and further identify a cause for high background binding observed when pBR322 DNA is used as a negative control in studies of glucocorticoid receptor-DNA interactions.

9.
DNA sequences can be treated as finite-length symbol strings over a four-letter alphabet (A, C, T, G). As a universal and computable complexity measure, LZ complexity is valid to describe the complexity of DNA sequences. In this study, a concept of conditional LZ complexity between two sequences is proposed according to the principle of LZ complexity measure. An LZ complexity distance metric between two nonnull sequences is defined by utilizing conditional LZ complexity. Based on LZ complexity distance, a phylogenetic tree of 26 species of placental mammals (Eutheria) with three outgroup species was reconstructed from their complete mitochondrial genomes. On the debate over which two of the three main groups of placental mammals, namely Primates, Ferungulates, and Rodents, are more closely related, the phylogenetic tree reconstructed based on LZ complexity distance supports the suggestion that Primates and Ferungulates are more closely related.
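For readers unfamiliar with LZ complexity, the sketch below counts the components of the Lempel-Ziv (1976) exhaustive-history parsing of a string. It covers only the basic complexity measure; the conditional LZ complexity and the distance metric proposed in the abstract are not defined there and are not reproduced. The function name is hypothetical, and the quadratic-or-worse running time of this naive version would be impractical for complete mitochondrial genomes.

```python
def lz_complexity(s):
    """Number of components in the Lempel-Ziv (1976) parsing of s.

    Each component is the shortest prefix of the remaining text that cannot
    be copied from the text seen so far (copies may overlap themselves).
    The final component is counted even if the string ends before it
    becomes non-reproducible.
    """
    i, count, n = 0, 0, len(s)
    while i < n:
        length = 1
        # Extend while the candidate s[i:i+length] already occurs in the
        # text that precedes its last symbol.
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        count += 1
        i += length
    return count


# Classic example parsed as 0 | 001 | 10 | 100 | 1000 | 101
print(lz_complexity("0001101001000101"))  # -> 6
print(lz_complexity("ACGT"))              # -> 4
```

A highly repetitive sequence such as "AAAA" parses into only two components, which is why the measure can serve as a rough indicator of sequence regularity.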

10.

Background  

The ancestry of mitochondria and chloroplasts traces back to separate endosymbioses of once free-living bacteria. The highly reduced genomes of these two organelles therefore contain very distant homologs that only recently have been shown to recombine inside the mitochondrial genome. Detection of gene conversion between mitochondrial and chloroplast homologs was previously impossible due to the lack of suitable computer programs. Recently, I developed a novel method and have, for the first time, discovered recurrent gene conversion between chloroplast and mitochondrial genes. The method will further our understanding of plant organellar genome evolution and help identify and remove gene regions with incongruent phylogenetic signals for several genes widely used in plant systematics. Here, I implement this method and make it available through a user-friendly web interface.

11.
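Liu Junhong, Li Chun. 《生物信息学》 (Chinese Journal of Bioinformatics), 2013, 11(2): 142-145
Using the frequencies of k-words in a DNA sequence, each sequence is converted into a 340-dimensional vector, from which the evolutionary distances between species are computed. As applications, phylogenetic trees are constructed for the β-globin genes of 15 species, the S segments of 13 hantaviruses, and 26 box turtle mitochondrial genes. The results agree with previously published conclusions, which demonstrates the effectiveness of the method.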
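One way to arrive at a 340-dimensional k-word vector is to concatenate the frequencies of all words of length 1 to 4, since 4 + 16 + 64 + 256 = 340. The abstract does not say how the 340 dimensions are composed, so this decomposition is an assumption, as are the normalization to relative frequencies, the Euclidean distance, and every function name below.

```python
import math
from itertools import product

BASES = "ACGT"


def kword_vector(seq, max_k=4):
    """Concatenate relative k-word frequencies for k = 1..max_k.

    Assumption: with max_k=4 this yields 4 + 16 + 64 + 256 = 340 entries;
    the source abstract does not state how its 340 dimensions are built.
    """
    seq = seq.upper()
    vector = []
    for k in range(1, max_k + 1):
        words = ["".join(p) for p in product(BASES, repeat=k)]
        counts = dict.fromkeys(words, 0)
        for i in range(len(seq) - k + 1):
            word = seq[i:i + k]
            if word in counts:  # skip windows with ambiguous symbols such as N
                counts[word] += 1
        total = sum(counts.values())
        vector.extend(counts[w] / total if total else 0.0 for w in words)
    return vector


def euclidean(u, v):
    """Euclidean distance, used here as a stand-in for the paper's distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical toy sequences, for illustration only.
v1 = kword_vector("ACGTACGTACGTTTGA")
v2 = kword_vector("ACGTTCGTACGATTGA")
print(len(v1), euclidean(v1, v2))
```

A matrix of pairwise distances computed this way could then be passed to a standard tree-building method; the abstract does not say which one the authors used.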

12.
DNA sequencing has resulted in an abundance of data on DNA sequences for various species. Hence, the characterization and comparison of sequences become more important but still difficult tasks. In this paper, we first give a 2-D ladderlike graphical representation for the characteristic sequences of a DNA sequence, and then construct a 3-component vector, in which the normalized ALE-indices extracted from such three 2-D graphs via D/D matrices are individual components, to characterize the DNA sequence. The examination of similarities/dissimilarities among sequences of the beta-globin genes of different species illustrates the utility of the approach.

13.
A personal computer program has been developed to visualize and compare the surface properties of double-stranded DNA. Comparison of the surface properties of Watson-Crick base pairs in B-form DNA showed that replacing one base pair by another within a set of "degenerated base-pairs" conserves the pattern of potential hydrogen-bonding sites in both the major and minor grooves. The idea of "degenerated base-pairs" was applied to the problem of base-sequence variation from the consensus sequence in the -35 region of E. coli promoters. The sequence variation tends to occur among the degenerated base-pairs.

14.
15.
16.
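Amplified consensus genetic markers (ACGM) and their application value in plants   Total citations: 1, self-citations: 0, citations by others: 1
He Junping, Ruan Songlin, Zhu Shuijin, Ma Huasheng. 《遗传》, 2009, 31(9): 913-920
Amplified consensus genetic markers (ACGM) are a novel PCR-based DNA molecular marker technique. It rests on the observation that, among homologous genes of closely related species, coding sequences (exons) are highly conserved while non-coding regions (introns) are potentially polymorphic. With the rapid development of comparative genomics and bioinformatics, ACGM has become an effective tool for comparing homologous genes between species, analysing phylogenetic relationships, and mapping genes of interest. This article describes the ACGM technique in detail, especially its application in Brassica and Poaceae species, and discusses its possible future applications.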

17.
Pattern of recognition of DNA by mammalian DNA topoisomerase II   Total citations: 1, self-citations: 0, citations by others: 1
The antitumor drug VP-16 stabilizes the topoisomerase II-DNA covalent complexes formed in an intermediate step of the isomerization reaction. The location of the sites of formation of these complexes and their relative strength were studied in vitro using pBR322. Sequence alignment of the regions containing the 24 detectable sites identifies GCGCGC-(N)α-TGAC, with 9 ≤ α ≤ 25, as the DNA sequence recognized by topoisomerase II to form a cleavable complex. Changes in the last two nucleotides of the sequence give weaker complexes.
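A consensus of this shape (a fixed GCGCGC block, a spacer of 9 to 25 arbitrary nucleotides, then TGAC) maps directly onto a regular expression. The sketch below shows one way to scan a sequence for exact matches. The function name and the test sequence are hypothetical, only one strand is searched, and the weaker variants mentioned in the abstract are not modelled.

```python
import re

# GCGCGC, then 9 to 25 arbitrary nucleotides, then TGAC. The lookahead lets
# overlapping candidate sites be reported, one per start position.
TOPO_II_SITE = re.compile(r"(?=(GCGCGC[ACGT]{9,25}TGAC))")


def find_sites(seq):
    """Return (start, matched_text) for every exact match on the given strand."""
    return [(m.start(), m.group(1)) for m in TOPO_II_SITE.finditer(seq.upper())]


# Hypothetical test sequence with a 10-nucleotide spacer.
example = "AA" + "GCGCGC" + "A" * 10 + "TGAC" + "TT"
print(find_sites(example))
```

Because the spacer quantifier is greedy, each start position reports the longest admissible match; a site with an 8- or 26-nucleotide spacer is not reported.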

18.
19.
We introduce a novel 2D graphical representation of DNA sequences based on the pairs of the neighboring nucleotides (PNNs). Then we get the PNNs' distributions and obtain a y-M. The construction of the PNN-curve has some important advantages: (1) it avoids loss of information, and the PNN-curve representing a DNA sequence does not overlap or intersect itself; (2) the novel 2D representation is more sensitive. The utility of this method is illustrated by examining the similarities/dissimilarities among the coding sequences of the first exon of the beta-globin gene of eleven different species in Table 2.

20.
Interaction of regulator proteins with recognition sequences of DNA   Total citations: 1, self-citations: 0, citations by others: 1
B Lewin. Cell, 1974, 2(1): 1-7
