Similar literature
 20 similar articles found (search time: 15 ms)
1.
Structure-based prediction of DNA target sites by regulatory proteins   Total citations: 15, self-citations: 0, citations by others: 15
Kono H  Sarai A 《Proteins》1999,35(1):114-131
Regulatory proteins play a critical role in controlling complex spatial and temporal patterns of gene expression in higher organisms, by recognizing multiple DNA sequences and regulating multiple target genes. Increasing amounts of structural data on protein-DNA complexes provide clues to the mechanism of target recognition by regulatory proteins. The analyses of the propensities of base-amino acid interactions observed in those structural data show that there is no one-to-one correspondence in the interaction, but clear preferences exist. On the other hand, the analysis of the spatial distribution of amino acids around bases shows that even those amino acids with a strong base preference, such as Arg with G, are distributed in a wide space around bases. Thus, amino acids with many different geometries can form a similar type of interaction with bases. The redundancy and structural flexibility in the interaction suggest that there are no simple rules in sequence recognition, and its prediction is not straightforward. However, the spatial distributions of amino acids around bases indicate a possibility that the structural data can be used to derive empirical interaction potentials between amino acids and bases. Such information extracted from structural databases has been successfully used to predict amino acid sequences that fold into particular protein structures. We surmised that the structures of protein-DNA complexes could be used to predict DNA target sites for regulatory proteins, because determining DNA sequences that bind to a particular protein structure should be similar to finding amino acid sequences that fold into a particular structure. Here we demonstrate that the structural data can be used to predict DNA target sequences for regulatory proteins. Pairwise potentials that determine the interaction between bases and amino acids were empirically derived from the structural data.
These potentials were then used to examine the compatibility between DNA sequences and the protein-DNA complex structure in a combinatorial "threading" procedure. We applied this strategy to the structures of protein-DNA complexes to predict DNA binding sites recognized by regulatory proteins. To test the applicability of this method in target-site prediction, we examined the effects of cognate and noncognate binding, cooperative binding, and DNA deformation on the binding specificity, and we predicted binding sites in real promoters and compared them with experimental data. These results show that target binding sites for several regulatory proteins are successfully predicted, and our data suggest that this method can serve as a powerful tool for predicting multiple target sites and target genes for regulatory proteins.
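The combinatorial "threading" idea in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the potential is reduced to a lookup table keyed by (amino acid, base), the contact list and energy values are toy numbers, and the Z-score is one common way to express how far a target sequence stands out from all other sequences; none of this is taken from the paper itself.

```python
import itertools
import statistics

BASES = "ACGT"

# Hypothetical toy potential: energy for an (amino acid, base) contact.
# Lower values mean a more favorable contact. Values are illustrative only.
potential = {
    ("Arg", "G"): -1.0, ("Arg", "A"): 0.2, ("Arg", "C"): 0.3, ("Arg", "T"): 0.1,
    ("Asn", "A"): -0.8, ("Asn", "C"): 0.2, ("Asn", "G"): 0.1, ("Asn", "T"): 0.0,
}

# Hypothetical contacts read from a fixed complex structure:
# (DNA position, contacting amino acid).
contacts = [(0, "Arg"), (2, "Asn")]


def thread_energy(sequence, contacts, potential):
    """Sum pairwise energies for one DNA sequence placed on the fixed structure."""
    return sum(potential[(aa, sequence[pos])] for pos, aa in contacts)


def rank_sequences(length, contacts, potential):
    """Thread every possible DNA sequence of the given length; lowest energy first."""
    scored = []
    for letters in itertools.product(BASES, repeat=length):
        seq = "".join(letters)
        scored.append((thread_energy(seq, contacts, potential), seq))
    scored.sort()
    return scored


def z_score(target, length, contacts, potential):
    """Express the target's energy relative to the distribution over all sequences."""
    energies = [e for e, _ in rank_sequences(length, contacts, potential)]
    mean = statistics.fmean(energies)
    spread = statistics.pstdev(energies)
    return (thread_energy(target, contacts, potential) - mean) / spread
```

Calling `rank_sequences(3, contacts, potential)` enumerates all 64 trinucleotides; exhaustive enumeration is only practical for short sites, so longer sites would need sampling or a position-by-position decomposition.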

2.
Oobatake M  Kono H  Wang Y  Sarai A 《Proteins》2003,53(1):33-43
Recognition of specific DNA sequences by proteins is essential for regulation of gene expression. To fully understand the recognition mechanism, it is necessary to understand not only the structure of the specific protein-DNA interactions but also the energetics. We therefore performed a computer analysis in which a phage DNA-binding protein, lambda repressor, was used to examine the changes in binding free energy (ΔΔG) and its energy components caused by single base mutations. We then determined which of the calculated energy components best correlated with the experimental data. The experimental ΔΔG values were well reproduced by the calculations. Component analysis revealed that the electrostatic and hydrogen bond energies were most strongly correlated with the experimental data. Among the 51 single base-substitution mutants examined, positive ΔΔG values, corresponding to weakened binding, were caused by the loss of favorable electrostatic interactions and hydrogen bonds, the introduction of steric collisions and electrostatic repulsion, the loss of favorable interactions with a thymine methyl group, and the increase of unfavorable hydration energy from isolated DNA. This analysis also showed distinct patterns of recognition at A-T and G-C positions, as different combinations of energy components were involved in ΔΔG caused by the two substitution types. We have thus been able to identify the energy components that most strongly correlate with sequence-dependent ΔΔG and determine their contribution to the specificity of DNA sequence recognition by the lambda repressor. Application of this method to other systems should provide additional insight into the molecular mechanism of protein-DNA recognition.

3.
Sequence-specific interactions between proteins and DNA are essential for a variety of biological functions. The (cytosine-C5)-methyltransferase from HhaI (M.HhaI) specifically modifies the second base in GCGC sequences, employing a base flipping mechanism to access the target base being chemically modified. The mechanism of sequence-specific recognition of M.HhaI is not evident based on crystallographic structures, leading to the suggestion that recognition is linked to the flipping event itself, a process that may be referred to as energetic recognition. Using computational methods, it is shown that the free energy barriers to flipping are significantly higher for non-cognate sequences than for the cognate sequence, supporting the energetic recognition mechanism. Energetic recognition is imparted by two protein "selectivity filters" that function via a "web" of protein-DNA interactions in short-lived, high energy states present along the base flipping pathway. Other sequence-specific DNA binding proteins whose function involves significant distortion of DNA's conformation may use a similar recognition mechanism.

4.
Protein-DNA recognition plays an essential role in the regulation of gene expression. Regulatory proteins are known to recognize specific DNA sequences directly through atomic contacts (intermolecular readout) and/or indirectly through the conformational properties of the DNA (intramolecular readout). However, little is known about the respective contributions made by these so-called direct and indirect readout mechanisms. We addressed this question by making use of information extracted from a structural database containing many protein-DNA complexes. We quantified the specificity of intermolecular (direct) readout by statistical analysis of base-amino acid interactions within protein-DNA complexes. The specificity of the intramolecular (indirect) readout due to DNA was quantified by statistical analysis of the sequence-dependent DNA conformation. Systematic comparison of these specificities in a large number of protein-DNA complexes revealed that both intermolecular and intramolecular readouts contribute to the specificity of protein-DNA recognition, and that their relative contributions vary depending upon the protein-DNA complex. We demonstrated that combining the intermolecular and intramolecular energies derived from the statistical analyses leads to enhanced specificity, and that the combined energy could explain experimental data on binding affinity changes caused by base mutations. These results provide new insight into the relationship between specificity and structure in the process of protein-DNA recognition, which should lead to prediction of specific protein-DNA binding sites.

5.
6.
7.
Recognizing DNA     
It has become clear that there is no simple 'code' for protein-DNA recognition and that selecting an optimal binding sequence along the DNA double helix corresponds to more than simply forming a set of specific hydrogen bonds or steric interactions. However, it has been difficult to characterize the so-called indirect components of recognition. While DNA deformation certainly underlies indirect recognition, it is not easy to determine how local fine structure and deformability depend on base sequence or exactly what percentage of recognition should be attributed to such factors. Molecular modelling can help to develop these ideas into a quantitative model, provided the calculations can be carried out fast enough to enable a comprehensive survey of base-sequence effects. I present here some recent results from our group and their consequences for improving our understanding of protein-DNA binding, and their potential for predicting, and eventually modulating, protein-DNA binding.

8.
We recently showed that a nonspecific complex of the restriction nuclease EcoRI with poly (dI-dC) sequesters significantly more water at the protein-DNA interface than the complex with the specific recognition sequence. The nonspecific complex seems to retain almost a full hydration layer at the interface. We now find that at low osmotic pressures a complex of the restriction nuclease EcoRI with a DNA sequence that differs by only one base pair from the recognition site (a 'star' sequence) sequesters about 70 waters more than the specific one, a value virtually indistinguishable from nonspecific DNA. Unlike complexes with oligo (dI-dC) or with a sequence that differs by two base pairs from the recognition sequence, however, much of the water in the 'star' sequence complex is removed at high osmotic pressures. The energy of removing this water can be calculated simply from the osmotic pressure work done on the complex. The ability to measure not only the changes in water sequestered by DNA-protein complexes for different sequences, but also the work necessary to remove this water is a potentially powerful new tool for coupling inferred structural changes and thermodynamics.

9.
10.
11.
Eleven protein-DNA crystal structures were analyzed to test the hypothesis that hydration sites predicted in the first hydration shell of DNA mark the positions where protein residues hydrogen-bond to DNA. For nine of those structures, protein atoms that form hydrogen bonds to DNA bases were found within 1.5 Å of the predicted hydration positions in 86% of the interactions. The correspondence of the predicted hydration sites with the hydrogen-bonded protein side chains was significantly higher for bases inside the conserved DNA recognition sequences than outside those regions. In two CAP-DNA complexes, predicted base hydration sites correctly marked 71% (within 1.5 Å) of the protein atoms that form hydrogen bonds to DNA bases. Phosphate hydration was compared to actual protein binding sites in one CAP-DNA complex, with 78% of contacts marked within 2.0 Å. These data suggest that hydration sites mark the binding sites at protein-DNA interfaces.

12.
We describe a rapid analytical assay for identification of proteins binding to specific DNA sequences. The DAPSTER assay (DNA affinity preincubation specificity test of recognition assay) is a DNA affinity chromatography-based microassay that can discriminate between specific and nonspecific protein-DNA interactions. The assay is sensitive and can detect protein-DNA interactions and larger multicomponent complexes that can be missed by other analytical methods. Here we describe in detail the optimization and utilization of the DAPSTER assay to isolate AP-1 complexes and associated proteins in multimeric complexes bound to the AP-1 DNA element.

13.
14.
15.
We investigate the conservation of amino acid residue sequences in 21 DNA-binding protein families and study the effects that mutations have on DNA-sequence recognition. The observations are best understood by assigning each protein family to one of three classes: (i) non-specific, where binding is independent of DNA sequence; (ii) highly specific, where binding is specific and all members of the family target the same DNA sequence; and (iii) multi-specific, where binding is also specific, but individual family members target different DNA sequences. Overall, protein residues in contact with the DNA are better conserved than the rest of the protein surface, but there is a complex underlying trend of conservation for individual residue positions. Amino acid residues that interact with the DNA backbone are well conserved across all protein families and provide a core of stabilising contacts for homologous protein-DNA complexes. In contrast, amino acid residues that interact with DNA bases have variable levels of conservation depending on the family classification. In non-specific families, base-contacting residues are well conserved and interactions are always found in the minor groove where there is little discrimination between base types. In highly specific families, base-contacting residues are highly conserved and allow member proteins to recognise the same target sequence. In multi-specific families, base-contacting residues undergo frequent mutations and enable different proteins to recognise distinct target sequences. Finally, we report that interactions with bases in the target sequence often, though not always, follow a universal code of amino acid-base recognition, and the effects of amino acid mutations can be most easily understood for these interactions.

16.
17.
Inspection of the amino acid-base interactions in protein-DNA complexes is essential to the understanding of specific recognition of DNA target sites by regulatory proteins. The accumulation of information on protein-DNA co-crystals challenges the derivation of quantitative parameters for amino acid-base interaction based on these data. Here we use the coordinates of 53 solved protein-DNA complexes to extract all non-homologous amino acid-base pairs that are in close contact, including hydrogen bonds and hydrophobic interactions. By comparing the frequency distribution of the different pairs to a theoretical distribution and calculating the log odds, a quantitative measure that expresses the likelihood of interaction for each amino acid-base pair could be extracted. A score that reflects the compatibility between a protein and its DNA target can be calculated by summing up the individual measures of the amino acid-base pairs involved in the complex, assuming additivity in their contributions to binding. This score enables ranking of different DNA binding sites given a protein binding site and vice versa, and can be used in molecular design protocols. We demonstrate its validity by comparing the predictions using this score with experimental binding results of sequence variants of zif268 zinc fingers and their DNA binding sites.
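The log-odds scoring described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the "theoretical distribution" is taken here to be the product of the marginal amino acid and base frequencies (independence), a pseudocount is added so unseen pairs do not give log(0), and the function names and toy contact list are hypothetical.

```python
import math
from collections import Counter


def log_odds_table(contacts, pseudocount=1.0):
    """Build log-odds values from observed (amino_acid, base) contact pairs.

    Assumption: the expected frequency of a pair is the product of the
    marginal frequencies of its amino acid and its base.
    """
    pair_counts = Counter(contacts)
    aa_counts = Counter(aa for aa, _ in contacts)
    base_counts = Counter(base for _, base in contacts)
    total = len(contacts)
    n_cells = len(aa_counts) * len(base_counts)

    table = {}
    for aa in aa_counts:
        for base in base_counts:
            # Smoothed observed frequency of this pair.
            observed = (pair_counts[(aa, base)] + pseudocount) / (
                total + pseudocount * n_cells
            )
            # Expected frequency if amino acid and base were independent.
            expected = (aa_counts[aa] / total) * (base_counts[base] / total)
            table[(aa, base)] = math.log(observed / expected)
    return table


def complex_score(table, pairs):
    """Additive compatibility score: sum of log-odds over the contacting pairs."""
    return sum(table[pair] for pair in pairs)


# Toy data (illustrative only): Arg mostly contacts G, Asn mostly contacts A.
toy_contacts = (
    [("Arg", "G")] * 8 + [("Arg", "A")] * 2
    + [("Asn", "A")] * 8 + [("Asn", "G")] * 2
)
toy_table = log_odds_table(toy_contacts)
```

With this toy table, candidate DNA sites for a fixed set of protein contacts can be ranked by `complex_score`; the additivity assumption is the one stated in the abstract.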

18.
ABSTRACT: BACKGROUND: Protein-DNA interactions are important for many cellular processes; however, structural knowledge for a large fraction of known and putative complexes is still lacking. Computational docking methods aim at the prediction of complex architecture given detailed structures of its constituents. They are becoming an increasingly important tool in the field of macromolecular assemblies, complementing particularly demanding protein-nucleic acid X-ray crystallography and providing means for the refinement and integration of low resolution data coming from rapidly advancing methods such as cryoelectron microscopy. RESULTS: We present a new coarse-grained force field suitable for protein-DNA docking. The force field is an extension of previously developed parameter sets for protein-RNA and protein-protein interactions. The docking is based on potential energy minimization in translational and orientational degrees of freedom of the binding partners. It allows for fast and efficient systematic search for native-like complex geometry without any prior knowledge regarding binding site location. CONCLUSIONS: We find that the force field gives very good results for bound docking. The quality of predictions in the case of unbound docking varies, depending on the level of structural deviation from bound geometries. We analyze the role of specific protein-DNA interactions on force field performance, both with respect to complex structure prediction and the reproduction of experimental binding affinities. We find that such direct, specific interactions only partially contribute to protein-DNA recognition, indicating an important role of shape complementarity and sequence-dependent DNA internal energy, in line with the concept of an indirect protein-DNA readout mechanism.

19.
20.
Wong DL  Reich NO 《Biochemistry》2000,39(50):15410-15417
We describe a highly sensitive strategy combining laser-induced photo-cross-linking and HPLC-based electrospray ionization mass spectrometry to identify amino acid residues involved in protein-DNA recognition. The photoactivatable cross-linking thymine isostere, 5-iodouracil, was incorporated at a single site within the sequence recognized by EcoRI DNA methyltransferase (GAATTC). UV irradiation of the DNA-protein complex at 313 nm results in a >60% cross-linking yield. SDS-polyacrylamide gel electrophoresis and mass spectrometry were used to analyze the covalent cross-linked complex. The total mass is consistent with covalent bond formation between one strand of DNA and the protein with 1:1 stoichiometry. Protease digestion of the cross-linked complex yields several peptide-DNA adducts that were purified by anion-exchange column chromatography. A combination of mass spectrometric analysis and amino acid sequencing revealed that tyrosine 204 was cross-linked to the DNA. Electrospray mass spectrometric analysis of the peptide-nucleoside adduct confirmed this assignment. Tyrosine 204 resides in a peptide motif previously thought to be involved in AdoMet binding and methyl transfer. Thus, amino acids within loop segments but outside of "DNA binding" motifs can be critical to DNA recognition. Our method provides an accurate characterization of picomole quantities of DNA-protein complexes.
