Similar articles
20 similar articles found; search time 15 ms
1.
Selection of DNA binding sites by regulatory proteins   Cited by: 15 (self-citations: 0; citations by others: 15)

2.
The frequency of base-pair occurrence in a set of recognition sequences for a particular DNA-binding protein is strongly related to the contributions to the binding free energy from the individual base pairs. Thus from the statistics of base-pair choice, it is possible to estimate the relative binding strengths of any base-pair sequences and to predict the effect of point mutations in specific sites. On the same basis, one can describe the binding properties of random DNA sequences and thereby the expected competitive effects from all the nonspecific DNA sites in the genome of a living cell. The statistical selection theory [Berg & von Hippel, J. Mol. Biol. 193 (1987) 723-750] describing these relations is extended and tested with computer simulations. The theory is shown to hold up well also in the case when base pairs contribute cooperatively to the binding interaction. The simulations also demonstrate the effects of the statistical small-sample uncertainty that appears due to the limited size of all sets of recognition sites identified.
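The link between base-pair statistics and binding strength described above can be sketched in Python. This is a minimal illustration, not the authors' code: the aligned sites are made up, the function names `energy_matrix` and `site_energy` are hypothetical, and the estimator ln[(n_max + 1)/(n_b + 1)] is assumed here as a small-sample-corrected form of the frequency-to-energy mapping. Energies are in arbitrary selection units, not calibrated free energies, and positions are treated as independent.

```python
import math
from collections import Counter

BASES = "ACGT"

def energy_matrix(sites):
    """Estimate per-position discrimination energies from aligned sites.

    Assumed estimator: ln((n_max + 1) / (n_b + 1)), where n_b is the count
    of base b at a position and n_max the count of the most common base.
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("sites must be aligned and of equal length")
    matrix = []
    for pos in range(length):
        counts = Counter(s[pos] for s in sites)  # missing bases count as 0
        n_max = max(counts[b] for b in BASES)
        matrix.append({b: math.log((n_max + 1) / (counts[b] + 1)) for b in BASES})
    return matrix

def site_energy(matrix, seq):
    """Sum independent per-position contributions for one sequence."""
    return sum(col[base] for col, base in zip(matrix, seq))

# Hypothetical aligned recognition sites, for illustration only.
sites = ["TGTGA", "TGTGA", "TGCGA", "AGTGA"]
matrix = energy_matrix(sites)
consensus_energy = site_energy(matrix, "TGTGA")  # consensus sequence
mutant_energy = site_energy(matrix, "TGTCA")     # point mutation at position 4
```

With only four sites, the pseudocount dominates the estimate, which is one way to see the small-sample uncertainty the abstract mentions.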

3.
The statistics of base-pair usage within known recognition sites for a particular DNA-binding protein can be used to estimate the relative protein binding affinities to these sites, as well as to sites containing any other combinations of base-pairs. As has been described elsewhere, the connection between base-pair statistics and binding free energy is made by an equal probability selection assumption; i.e. that all base-pair sequences that provide appropriate binding strength are equally likely to have been chosen as recognition sites in the course of evolution. This is analogous to a statistical-mechanical system where all configurations with the same energy are equally likely to occur. In this communication, we apply the statistical-mechanical selection theory to analyze the base-pair statistics of the known recognition sequences for the cyclic AMP receptor protein (CRP). The theoretical predictions are found to be in reasonable agreement with binding data for those sequences for which experimental binding information is available, thus lending support to the basic assumptions of the selection theory. On the basis of this agreement, we can predict the affinity for CRP binding to any base-pair sequence, albeit with a large statistical uncertainty. When the known recognition sites for CRP are ranked according to predicted binding affinities, we find that the ranking is consistent with the hypothesis that the level of function of these sites parallels their fractional saturation with CRP-cAMP under in-vivo conditions. When applied to the entire genome, the theory predicts the existence of a large number of randomly occurring "pseudosites" with strong binding affinity for CRP. It appears that most CRP molecules are engaged in non-productive binding at non-specific or pseudospecific sites under in-vivo conditions. In this sense, the specificity of the CRP binding site is very low. 
Relative specificity requirements for polymerases, repressors and activators are compared in light of the results of this and the first paper in this series.

4.
A model is proposed for the structure of stereospecific sites in regulatory proteins. On its basis a possible code is suggested that governs the binding of regulatory proteins at specific control sites on DNA. Stereospecific sites of regulatory proteins are assumed to contain pairs of antiparallel polypeptide chain segments which form a right-hand twisted antiparallel beta-sheet, with single-stranded regions at the ends of the beta-structure. The model predicts that the binding reaction between a regulatory protein and double-helical DNA is a cooperative phenomenon and is accompanied by significant structural alteration at the stereospecific site of the protein. Half of the hydrogen bonds normally existing in the beta-structure are broken upon complex formation with DNA and a new set of hydrogen bonds is formed between polypeptide amide groups and DNA base pairs. In a stereospecific site, one chain (t-chain) is attached through hydrogen bonds to the carbonyl oxygens of pyrimidines and N3 atoms of adenines lying in one DNA strand, while the second polypeptide chain (g-chain) is hydrogen bonded to the 2-amino groups of guanine residues lying in the opposite DNA strand. The amide groups serve as specific reaction sites, being hydrogen bond acceptors in the g-chain and hydrogen bond donors in the t-chain. The single-stranded portions of t- and g-chains lying in neighbouring subunits of the regulatory protein interact with each other, forming deformed beta-sheets. The recognition of regulatory sequences by proteins is based on the structural complementarity between stereospecific sites of regulatory proteins and base-pair sequences at the control sites. An essential feature of these sequences is the asymmetrical distribution of guanine residues between the two DNA strands.
The code predicts that there are six fundamental amino acid residues (serine, threonine, asparagine, histidine, glutamine and cysteine) whose sequence in the stereospecific site determines the base pair sequence to which a given regulatory protein would bind preferentially. The code states a correspondence between four amino acid residues at the stereospecific site of the regulatory protein, with the two residues being in t- and g-segments, respectively, and an AT (GC) base pair at the control site. It is thus possible to determine which amino acid residues in the repressor and which base pairs in the operator DNA are involved in specific interactions with each other, as exemplified by lac repressor binding to lac operator.

5.
The lack of a rigorous analytical theory for DNA looping has caused many DNA-loop-mediated phenomena to be interpreted using theories describing the related process of DNA cyclization. However, distinctions in the mechanics of DNA looping versus cyclization can have profound quantitative effects on the thermodynamics of loop closure. We have extended a statistical mechanical theory recently developed for DNA cyclization to model DNA looping, taking into account protein flexibility. Notwithstanding the underlying theoretical similarity, we find that the topological constraint of loop closure leads to the coexistence of multiple classes of loops mediated by the same protein structure. These loop topologies are characterized by dramatic differences in twist and writhe; because of the strong coupling of twist and writhe within a loop, DNA looping can exhibit a complex overall helical dependence in terms of amplitude, phase, and deviations from uniform helical periodicity. Moreover, the DNA-length dependence of optimal looping efficiency depends on protein elasticity, protein geometry, and the presence of intrinsic DNA bends. We derive a rigorous theory of loop formation that connects global mechanical and geometric properties of both DNA and protein and demonstrates the importance of protein flexibility in loop-mediated protein-DNA interactions.

6.
O G Berg. Nucleic Acids Research, 1988, 16(11):5089-5105
The DNA sequences in the operator sites of the arginine regulon and of the SOS regulon have been subject to a statistical analysis. A quantitative correlation is found between the statistics of sequence choice and the activity at individual operator sites in both systems, as expected from theoretical considerations [Berg & von Hippel, J. Mol. Biol. (1987) 193, 723-750]. Based on these correlations it is possible to predict the effect of various sequence mutations. There is a significant difference in the slopes of the correlation lines between sequence and activity for the two systems. From this difference it can be expected that individual point mutations in the ARG boxes will have a much smaller effect on activity than similar changes in the SOS boxes. This difference may be related to a strong cooperative activity at tandem ARG boxes while the binding at SOS boxes appears to be mostly noncooperative.

7.
Structure-based prediction of DNA target sites by regulatory proteins   Cited by: 15 (self-citations: 0; citations by others: 15)
Kono H, Sarai A. Proteins, 1999, 35(1):114-131
Regulatory proteins play a critical role in controlling complex spatial and temporal patterns of gene expression in higher organisms, by recognizing multiple DNA sequences and regulating multiple target genes. Increasing amounts of structural data on protein-DNA complexes provide clues to the mechanism of target recognition by regulatory proteins. The analyses of the propensities of base-amino acid interactions observed in those structural data show that there is no one-to-one correspondence in the interaction, but clear preferences exist. On the other hand, the analysis of the spatial distribution of amino acids around bases shows that even those amino acids with a strong base preference, such as Arg with G, are distributed in a wide space around bases. Thus, amino acids with many different geometries can form a similar type of interaction with bases. The redundancy and structural flexibility in the interaction suggest that there are no simple rules in sequence recognition, and its prediction is not straightforward. However, the spatial distributions of amino acids around bases indicate a possibility that the structural data can be used to derive empirical interaction potentials between amino acids and bases. Such information extracted from structural databases has been successfully used to predict amino acid sequences that fold into particular protein structures. We surmised that the structures of protein-DNA complexes could be used to predict DNA target sites for regulatory proteins, because determining DNA sequences that bind to a particular protein structure should be similar to finding amino acid sequences that fold into a particular structure. Here we demonstrate that the structural data can be used to predict DNA target sequences for regulatory proteins. Pairwise potentials that determine the interaction between bases and amino acids were empirically derived from the structural data.
These potentials were then used to examine the compatibility between DNA sequences and the protein-DNA complex structure in a combinatorial "threading" procedure. We applied this strategy to the structures of protein-DNA complexes to predict DNA binding sites recognized by regulatory proteins. To test the applicability of this method in target-site prediction, we examined the effects of cognate and noncognate binding, cooperative binding, and DNA deformation on the binding specificity, and predicted binding sites in real promoters and compared with experimental data. These results show that target binding sites for several regulatory proteins are successfully predicted, and our data suggest that this method can serve as a powerful tool for predicting multiple target sites and target genes for regulatory proteins.

8.
Discrimination of DNA binding sites by mutant p53 proteins.   Cited by: 2 (self-citations: 1; citations by others: 2)
Critical determinants of DNA recognition by p53 have been identified by a molecular genetic approach. The wild-type human p53 fragment containing amino acids 71 to 330 (p53(71-330)) was used for in vitro DNA binding assays, and full-length human p53 was used for transactivation assays with Saccharomyces cerevisiae. First, we defined the DNA binding specificity of the wild-type p53 fragment by using systematically altered forms of a known consensus DNA site. This refinement indicates that p53 binds with high affinity to two repeats of PuGPuCA.TGPyCPy, a further refinement of an earlier defined consensus half site PuPuPuC(A/T).(T/A)GPyPyPy. These results were further confirmed by transactivation assays of yeast by using full-length human p53 and systematically altered DNA sites. Dimers of the pentamer AGGCA oriented either head-to-head or tail-to-tail bound efficiently, but transactivation was facilitated only through head-to-head dimers. To determine the origins of specificity in DNA binding by p53, we identified mutations that lead to altered specificities of DNA binding. Single-amino-acid substitutions were made at several positions within the DNA binding domain of p53, and this set of p53 point mutants was tested with DNA site variants for DNA binding. DNA binding analyses showed that the mutants Lys-120 to Asn, Cys-277 to Gln or Arg, and Arg-283 to Gln bind to sites with noncanonical base pair changes at positions 2, 3, and 1 in the pentamer (PuGPuCA), respectively. Thus, we implicate these residues in amino acid-base pair contacts. Interestingly, mutant Cys-277 to Gln bound a consensus site as two and four monomers, as opposed to the wild-type p53 fragment, which invariably binds this site as four monomers.

9.
We describe a new procedure to identify RNA or DNA binding sites in proteins, based on a combination of UV cross-linking and single-hit chemical peptide cleavage. Site-directed mutagenesis is used to create a series of mutants with single Asn-Gly sequences in the protein to be analysed. Recombinant mutant proteins are incubated with their radiolabelled target sequence and UV irradiated. Covalently linked RNA- or DNA-protein complexes are digested with hydroxylamine and labelled peptides identified by SDS-PAGE and autoradiography. The analysis requires only small amounts of protein and is achieved within a relatively short time. Using this method we mapped the site at which human iron regulatory protein (IRP) is UV cross-linked to iron responsive element RNA to amino acid residues 116-151.

10.
There have been many different approaches employed to define the "consensus" sequence of various DNA binding sites and to use the definition obtained to locate and rank members of a given sequence family. The analysis presented here enlists two of these approaches, each in modified form, to develop a highly efficient search protocol for Escherichia coli promoters and to provide a relative ranking of these sites showing good agreement with in vitro measurements of promoter strength. Schneider et al. have applied Shannon's index of information content to evaluate the significance of each position within the consensus of a family of aligned sequences. In a formal sense, this index is only applicable to a group of sequences, providing at each position a negative entropy value between zero (random) and two bits (total conservation of a single base) for sequences in which all bases are equally represented. A method for evaluating how well an individual sequence conforms to the information content pattern of the consensus is described. A function is derived, by analogy to the information content of the sequence family, for application to individual sequences. Since this function is a measure of conformity, it can be used in a search protocol to identify new members of the family represented by the consensus. A protocol for locating E. coli promoters is presented. The Berg-von Hippel statistical-mechanical function is also tested in a similar application. While the information content function provides a superior search protocol, the Berg-von Hippel function, when scaled at each position by the information content, does well at ranking promoters according to their strength as measured in vitro.
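The per-position information content mentioned above (zero bits for a random position, two bits for total conservation) can be sketched as follows. This is an illustrative calculation only: the aligned sequences are invented, the function name `information_content` is hypothetical, equal background base frequencies are assumed, and no small-sample correction is applied. The individual-sequence conformity function described in the abstract is not reproduced here, since the abstract does not give its form.

```python
import math
from collections import Counter

def information_content(sites):
    """Return information content (bits) at each position of aligned sites.

    Assumes equiprobable background bases, so the maximum entropy is 2 bits
    and the information at a position is 2 - H(position).
    """
    n = len(sites)
    length = len(sites[0])
    info = []
    for pos in range(length):
        counts = Counter(s[pos] for s in sites)
        # Shannon entropy of the observed base frequencies at this position.
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        info.append(2.0 - entropy)
    return info

# Hypothetical aligned sequences, for illustration only.
sites = ["TATAAT", "TATGAT", "TACAAT", "GATACT"]
info = information_content(sites)
```

Positions where every sequence carries the same base reach the 2-bit ceiling, and a position with all four bases equally represented drops to zero.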

11.
A possible code is suggested that describes a correspondence between amino acid sequences in stereospecific sites of regulatory proteins and nucleotide sequences at the control sites on DNA. Stereospecific sites of regulatory proteins are assumed to contain pairs of antiparallel polypeptide chain segments which form a right-hand twisted antiparallel β-sheet with single-stranded regions at the ends of the β-structure. The binding reaction between regulatory protein and double-helical DNA is accompanied by significant structural alterations at stereospecific sites of the protein and DNA. Half of the hydrogen bonds normally existing in the β-structure are broken upon complex formation with DNA and a new set of hydrogen bonds is formed between polypeptide amide groups and DNA base pairs. The code states a correspondence between four amino acid residues at a stereospecific site of the regulatory protein and an AT (GC) base pair at the control site. It predicts that there are six fundamental amino acid residues (serine, threonine, histidine, asparagine, glutamine and cysteine) whose arrangement in the stereospecific site determines the base pair sequence to which a given regulatory protein would bind preferentially.

12.

Background  

Detection of DNA-binding sites in proteins is of enormous interest for technologies targeting gene regulation and manipulation. We have previously shown that a residue and its sequence neighbor information can be used to predict DNA-binding candidates in a protein sequence. This sequence-based prediction method is applicable even if no sequence homology with a previously known DNA-binding protein is observed. Here we implement a neural network based algorithm to utilize evolutionary information of amino acid sequences in terms of their position specific scoring matrices (PSSMs) for a better prediction of DNA-binding sites.
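One plausible way to combine a residue with its sequence neighbors, as described above, is to concatenate PSSM rows over a sliding window before passing them to a classifier. The sketch below shows only that feature-building step under stated assumptions: the function name `window_features`, the window size, the zero padding at the sequence ends, and the toy matrix are all hypothetical, and the abstract does not specify the authors' actual encoding or network.

```python
def window_features(pssm, half_window=1):
    """Build one feature vector per residue from neighboring PSSM rows.

    pssm: list of rows, one per residue, each a list of 20 scores.
    Positions beyond either end of the sequence are zero-padded
    (an assumption made here for illustration).
    """
    width = len(pssm[0])
    pad = [0.0] * width
    features = []
    for i in range(len(pssm)):
        row = []
        for j in range(i - half_window, i + half_window + 1):
            row.extend(pssm[j] if 0 <= j < len(pssm) else pad)
        features.append(row)
    return features

# Toy 4-residue PSSM with 20 columns; values are placeholders, not real scores.
toy_pssm = [[float(i + k) for k in range(20)] for i in range(4)]
features = window_features(toy_pssm, half_window=1)
```

Each residue then has a fixed-length vector (here 3 × 20 = 60 values) that a neural network or any other classifier could take as input.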

13.
14.
The DNA binding characteristics of the rat nuclear matrix were investigated. A saturable and temperature-dependent, salt-resistant DNA binding to the nuclear matrix was discovered, with 70-80% of total bound DNA resistant to extraction with high concentrations of salt at 37 degrees C, compared to less than 5% at 0 degrees C. The initial binding of DNA to nuclear matrix is sensitive to salt concentration, indicating a transition to a salt-resistant binding state. The nuclear matrix shows a preference for single-stranded DNA, both in saturation and competition assays, with little binding of RNA or double-stranded DNA. Further competition studies show a preference for matrix-attached DNA probably involving predominantly AT-rich sequences, while a specific sequence defined previously as a matrix-attached region (MAR; Cockerill, P. N., and Garrard, W. T. (1986) Cell 46, 273-282) only showed preference for a limited number of the total matrix binding sites. These results and estimates from saturation data of approximately 150,000 single-stranded DNA binding sites per matrix lead us to propose that the nuclear matrix contains different classes of DNA binding sites, each with a separate sequence specificity. Binding of DNA to individual matrix polypeptides separated on sodium dodecyl sulfate-polyacrylamide gels and transferred to nitrocellulose blots was also temperature-dependent, salt-resistant, and showed a preference for binding DNA over RNA and nuclear matrix DNA over total genomic DNA. Subnuclear fractionation experiments further demonstrated that the nuclear matrix is enriched in the subset of higher molecular weight (greater than 50,000) DNA binding proteins of isolated nuclei and correspondingly depleted of the lower molecular weight ones. 
Of the approximately 12 major proteins separated on nonequilibrium two-dimensional gels, 7 were identified as specific DNA binding proteins including lamins A and C (but not B), and the internal nuclear matrix proteins, matrins D, E, F, G, and 4.

15.
16.
17.
We have previously shown that a cell cycle-dependent nucleoprotein complex assembles in vivo on a 74 bp sequence within the human DNA replication origin associated with the Lamin B2 gene. Here, we report the identification, using a one-hybrid screen in yeast, of three proteins interacting with the 74 bp sequence. All of them, namely HOXA13, HOXC10 and HOXC13, are orthologues of the Abdominal-B gene of Drosophila melanogaster and are members of the homeogene family of developmental regulators. We describe the complete open reading frame sequence of HOXC10 and HOXC13 along with the structure of the HoxC13 gene. The specificity of binding of these two proteins to the Lamin B2 origin is confirmed by both band-shift and in vitro footprinting assays. In addition, the ability of HOXC10 and HOXC13 to increase the activity of a promoter containing the 74 bp sequence, as assayed by CAT-assay experiments, demonstrates a direct interaction of these homeoproteins with the origin sequence in mammalian cells. We also show that HOXC10 expression is cell-type-dependent and positively correlates with cell proliferation.

18.
The first general multicomponent equations for transport through semipermeable membranes are derived from basic statistical-mechanical principles. The procedure follows that used earlier for open membranes, but semipermeability is modelled mathematically by the introduction of external forces on the impermeant species. Gases are treated first in order to clarify the problems involved, but the final results apply to general nonideal solutions of any concentration. The mixed-solvent effect is treated rigorously, and a mixed-solvent osmotic pressure is defined. A useful specific identification of so-called osmotic flow is given, along with a demonstration that such an identification cannot be unique. Results are obtained both for discontinuous membrane models, and for a continuous model.

19.
Gene, 1999, 226(2):263-271
We report an efficient and flexible in vitro method for the isolation of genomic DNA sequences that are the binding targets of a given DNA binding protein. This method takes advantage of the fact that binding of a protein to a DNA molecule generally increases the rate of migration of the protein in nondenaturing gel electrophoresis. By the use of a radioactively labeled DNA-binding protein and nonradioactive DNA coupled with PCR amplification from gel slices, we show that specific binding sites can be isolated from Escherichia coli genomic DNA. We have applied this method to isolate a binding site for FadR, a global regulator of fatty acid metabolism in E. coli. We have also isolated a second binding site for BirA, the biotin operon repressor/biotin ligase, from the E. coli genome that has a very low binding efficiency compared with the bio operator region.

20.
Mice were subjected to different dietary manipulations to selectively alter expression of hepatic sterol regulatory element-binding protein 1 (SREBP-1) or SREBP-2. mRNA levels for key target genes were measured and compared with the direct binding of SREBP-1 and -2 to the associated promoters using isoform specific antibodies in chromatin immunoprecipitation studies. A diet supplemented with Zetia (ezetimibe) and lovastatin increased and decreased nuclear SREBP-2 and SREBP-1, respectively, whereas a fasting/refeeding protocol dramatically altered SREBP-1 but had modest effects on SREBP-2 levels. Binding of both SREBP-1 and -2 increased on promoters for 3-hydroxy-3-methylglutaryl-CoA reductase, fatty-acid synthase, and squalene synthase in livers of Zetia/lovastatin-treated mice despite the decline in total SREBP-1 protein. In contrast, only SREBP-2 binding was increased for the low density lipoprotein receptor promoter. Decreased SREBP-1 binding during fasting and a dramatic increase upon refeeding indicates that the lipogenic "overshoot" for fatty-acid synthase gene expression known to occur during high carbohydrate refeeding can be attributed to a similar overshoot in SREBP-1 binding. SREBP co-regulatory protein recruitment was also increased/decreased in parallel with associated changes in SREBP binding, and there were clear distinctions for different promoters in response to the dietary manipulations. Taken together, these studies reveal that there are alternative molecular mechanisms for activating SREBP target genes in response to the different dietary challenges of Zetia/lovastatin versus fasting/refeeding. This underscores the mechanistic flexibility that has evolved at the individual gene/promoter level to maintain metabolic homeostasis in response to shifting nutritional states and environmental fluctuations.
