Similar literature
 Found 20 similar articles; search took 15 ms
1.
2.
3.
4.
The statistics of base-pair usage within known recognition sites for a particular DNA-binding protein can be used to estimate the relative protein binding affinities to these sites, as well as to sites containing any other combinations of base-pairs. As has been described elsewhere, the connection between base-pair statistics and binding free energy is made by an equal probability selection assumption; i.e. that all base-pair sequences that provide appropriate binding strength are equally likely to have been chosen as recognition sites in the course of evolution. This is analogous to a statistical-mechanical system where all configurations with the same energy are equally likely to occur. In this communication, we apply the statistical-mechanical selection theory to analyze the base-pair statistics of the known recognition sequences for the cyclic AMP receptor protein (CRP). The theoretical predictions are found to be in reasonable agreement with binding data for those sequences for which experimental binding information is available, thus lending support to the basic assumptions of the selection theory. On the basis of this agreement, we can predict the affinity for CRP binding to any base-pair sequence, albeit with a large statistical uncertainty. When the known recognition sites for CRP are ranked according to predicted binding affinities, we find that the ranking is consistent with the hypothesis that the level of function of these sites parallels their fractional saturation with CRP-cAMP under in-vivo conditions. When applied to the entire genome, the theory predicts the existence of a large number of randomly occurring "pseudosites" with strong binding affinity for CRP. It appears that most CRP molecules are engaged in non-productive binding at non-specific or pseudospecific sites under in-vivo conditions. In this sense, the specificity of the CRP binding site is very low. 
Relative specificity requirements for polymerases, repressors and activators are compared in light of the results of this and the first paper in this series.
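
The link between base-pair statistics and binding strength described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the commonly used form in which the per-position penalty for base b is ln((n_best + 1) / (n_b + 1)), where n_b is how often b occurs at that position among the known sites. The function names, the pseudocount of 1, and the toy sites are assumptions made for the example. Energies come out in dimensionless units; converting them to physical units needs a scale factor the abstract does not give.

```python
import math
from collections import Counter

BASES = "ACGT"

def energy_matrix(sites, pseudocount=1.0):
    """Per-position penalties ln((n_best + p) / (n_b + p)) from aligned sites.

    Assumed formula (see lead-in); `sites` must all have the same length.
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    energies = []
    for i in range(length):
        column = Counter(s[i] for s in sites)  # missing bases count as 0
        best = max(column[b] for b in BASES)
        energies.append({
            b: math.log((best + pseudocount) / (column[b] + pseudocount))
            for b in BASES
        })
    return energies

def site_energy(energies, seq):
    """Sum the per-position penalties for one candidate sequence."""
    if len(seq) != len(energies):
        raise ValueError("sequence length must match the matrix")
    return sum(col[base] for col, base in zip(energies, seq))

# Hypothetical toy sites, not real CRP recognition sequences.
toy_sites = ["TGTGA", "TGTGA", "TGAGA", "TGTCA"]
E = energy_matrix(toy_sites)
consensus_energy = site_energy(E, "TGTGA")  # most common base everywhere
poor_energy = site_energy(E, "AAAAA")       # mostly rare bases
```

Under this assumption the consensus sequence gets zero penalty and every substitution adds a non-negative term, so candidate sequences can be ranked by their summed penalty.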

5.
We have developed a method for identifying consensus patterns in a set of unaligned DNA sequences known to bind a common protein or to have some other common biochemical function. The method is based on a matrix representation of binding site patterns. Each row of the matrix represents one of the four possible bases, each column represents one of the positions of the binding site and each element is determined by the frequency with which the indicated base occurs at the indicated position. The goal of the method is to find the most significant matrix (i.e. the one with the lowest probability of occurring by chance) out of all the matrices that can be formed from the set of related sequences. The reliability of the method improves with the number of sequences, while the time required increases only linearly with the number of sequences. To test this method, we analysed 11 DNA sequences containing promoters regulated by the Escherichia coli LexA protein. The matrices we found were consistent with the known consensus sequence, and could distinguish the generally accepted LexA binding sites from other DNA sequences. Received on November 6, 1989; accepted on December 20, 1989
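
The matrix representation described in this abstract (rows are bases, columns are site positions, elements are frequencies) can be illustrated with a short sketch. The abstract does not give the significance measure, so the example scores a matrix by its information content against a uniform background; that is a stand-in chosen for illustration, not the authors' statistic. All function names and the 0.25 background are assumptions.

```python
import math

BASES = "ACGT"

def frequency_matrix(sites):
    """Return {base: [frequency at each position]} for aligned, equal-length sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    n = len(sites)
    return {
        b: [sum(s[i] == b for s in sites) / n for i in range(length)]
        for b in BASES
    }

def information_content(matrix, background=0.25):
    """Sum over positions of sum_b f_b * log2(f_b / background), in bits.

    Stand-in significance score (assumption): higher values mean the
    alignment is less like a random one with uniform base usage.
    """
    length = len(matrix[BASES[0]])
    total = 0.0
    for i in range(length):
        for b in BASES:
            f = matrix[b][i]
            if f > 0:  # terms with f == 0 contribute nothing
                total += f * math.log2(f / background)
    return total

conserved = information_content(frequency_matrix(["ACGT", "ACGT"]))
random_like = information_content(frequency_matrix(["A", "C", "G", "T"]))
```

A fully conserved column contributes 2 bits and a column with uniform base usage contributes 0, so the score separates conserved alignments from random-looking ones. Choosing which subsequence to take from each unaligned input is the search problem the paper addresses and is not attempted here.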

6.
Theoretical analysis of "addressed" chemical modification of DNA.   Cited by: 3 (self-citations: 2; citations by others: 1)
Chemical "addressed" modification of DNA involves treatment of single-stranded DNA with oligonucleotides that are complementary to certain target sequences in this DNA and bear a group reactive towards DNA bases. The binding of oligonucleotides can occur both at completely (specific) and incompletely (nonspecific) complementary sites. We analyse the modification of a fragment that is flanked by two target sequences complementary to a given oligonucleotide address, contains no more such targets and has some randomly distributed sites for nonspecific binding. Conditions for the maximum ratio between specific and nonspecific modification are determined. We find the probability of both target termini being specifically modified without any nonspecific modification occurring within the fragment up to a given moment in time. Quantitative analysis is based on the use of known features of the specific and nonspecific binding of an oligonucleotide to DNA sites. This analysis shows the possibility of specific cutting of DNA based on addressed modification.

7.
Biological weighted sequences are used extensively in molecular biology as profiles for protein families, in the representation of binding sites and often for the representation of sequences produced by a shotgun sequencing strategy. In this paper, we address three fundamental problems in the area of biological weighted sequences: (i) computation of repetitions, (ii) pattern matching, and (iii) computation of regularities. Our algorithms can be used as basic building blocks for more sophisticated algorithms applied to weighted sequences.
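
To make problem (ii) concrete, here is a naive sketch of pattern matching on a weighted sequence. It assumes the usual definition in which each position holds a probability for each symbol and a pattern occurs at a position when the product of the matching probabilities reaches a chosen threshold. The representation (a list of dicts), the function name, and the threshold are assumptions for illustration; this brute-force scan is not the algorithm from the paper.

```python
def weighted_occurrences(weighted_seq, pattern, threshold):
    """Return start positions where `pattern` occurs with probability >= threshold.

    `weighted_seq` is a list of {symbol: probability} dicts, one per position
    (assumed representation). Runs in O(n * m) time for length n and pattern
    length m.
    """
    hits = []
    n, m = len(weighted_seq), len(pattern)
    for start in range(n - m + 1):
        prob = 1.0
        for offset, symbol in enumerate(pattern):
            prob *= weighted_seq[start + offset].get(symbol, 0.0)
            if prob < threshold:  # probabilities only shrink, so stop early
                break
        else:
            hits.append(start)
    return hits

# Hypothetical weighted sequence of length 4.
ws = [
    {"A": 1.0},
    {"A": 0.5, "C": 0.5},
    {"G": 1.0},
    {"A": 0.25, "T": 0.75},
]
positions_ag = weighted_occurrences(ws, "AG", 0.5)
positions_a = weighted_occurrences(ws, "A", 0.5)
```

The early exit is safe because multiplying by further probabilities can never raise the product back above the threshold.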

8.
9.
10.
11.
12.
13.
The preferred dye binding sites and the microenvironment of known nucleotide sequences within mitochondrial and plasmid pBR322 DNA were probed in a gross fashion with restriction endonucleases. The intercalating dyes, ethidium bromide and propidium iodide, do not inhibit a given restriction endonuclease equally at all of the restriction sites within a DNA molecule. The selective inhibition may be explained, in part, by the potential B to Z conformation transition of DNA flanking the restriction site and by preferred dye binding sites. Propidium iodide was found to be a more potent inhibitor than ethidium bromide, and the inhibition is independent of the type of cut made by the enzyme.

14.
15.
16.
Kinjo AR, Nakamura H. PLoS ONE 2012, 7(2): e31437
Most biological processes are described as a series of interactions between proteins and other molecules, and interactions are in turn described in terms of atomic structures. To annotate protein functions as sets of interaction states at atomic resolution, and thereby to better understand the relation between protein interactions and biological functions, we conducted exhaustive all-against-all atomic structure comparisons of all known binding sites for ligands including small molecules, proteins and nucleic acids, and identified recurring elementary motifs. By integrating the elementary motifs associated with each subunit, we defined composite motifs that represent context-dependent combinations of elementary motifs. It is demonstrated that function similarity can be better inferred from composite motif similarity compared to the similarity of protein sequences or of individual binding sites. By integrating the composite motifs associated with each protein function, we define meta-composite motifs each of which is regarded as a time-independent diagrammatic representation of a biological process. It is shown that meta-composite motifs provide richer annotations of biological processes than sequence clusters. The present results serve as a basis for bridging atomic structures to higher-order biological phenomena by classification and integration of binding site structures.

17.
18.
An analysis of the structure of DNA sites responsible for binding to the glucocorticoid-receptor complex (GlRC) was carried out. The use of frequency matrices and of a variant of the perceptron method made it possible to establish that, in the GlRC binding site, there are additional conserved elements on both sides of the known conserved nucleotide sequence (nucleus), and these seem able to modulate the efficiency of GlRC binding. A criterion is worked out for detecting potential GlRC binding sites in given sequences. It is based on the simultaneous use of several perceptron matrices. The efficiency of detection of GlRC binding sites by means of the proposed criterion is an order of magnitude higher than that achieved with the GlRC binding site consensus (Beato et al. [2]).

19.
20.
The typical output of many computational methods to identify binding sites is a long list of motifs containing some real motifs (those most likely to correspond to the actual binding sites) along with a large number of random variations of these. We present a statistical method to separate real motifs from their artifacts. This produces a short list of high quality motifs that is sufficient to explain the over-representation of all motifs in the given sequences. Using synthetic data sets, we show that the output of our method is very accurate. On various sets of upstream sequences in S. cerevisiae, our program identifies several known binding sites, as well as a number of significant novel motifs.
