Similar literature
 Found 20 similar articles; search took 15 ms
1.
Protein-DNA interactions are crucial for many cellular processes. Now, with the increased availability of structures of protein-DNA complexes, gaining deeper insights into the nature of protein-DNA interactions has become possible. Earlier investigations have characterized the interface properties by considering pairwise interactions. However, the information communicated along the interfaces is rarely a pairwise phenomenon, and we feel that a global picture can be obtained by considering a protein-DNA complex as a network of noncovalently interacting systems. Furthermore, most of the earlier investigations have been carried out from the protein point of view (protein-centric), and the present network approach aims to combine both the protein-centric and the DNA-centric points of view. Part of the study involves the development of methodology to investigate protein-DNA graphs/networks with the development of key parameters. A network representation provides a holistic view of the interacting surface and has been reported here for the first time. The second part of the study involves the analyses of these graphs in terms of clusters of interacting residues and the identification of highly connected residues (hubs) along the protein-DNA interface. Findings such as the predominance of deoxyribose-amino acid clusters in beta-sheet proteins and the distinction of the interface clusters in helix-turn-helix and zipper-type proteins would not have been possible with conventional pairwise interaction analysis. Additionally, we propose a potential classification scheme for a set of protein-DNA complexes on the basis of the protein-DNA interface clusters. This provides a general idea of how the proteins interact with the different components of DNA in different complexes. Thus, we believe that the present graph-based method provides a deeper insight into the analysis of the protein-DNA recognition mechanisms by throwing more light on the nature and the specificity of these interactions.
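
The network view described in this abstract can be illustrated with a minimal sketch. It assumes the interface is given as a list of noncovalent contacts between amino acid residues and DNA components; the node names, the contact list, and the degree cutoff for hubs are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def build_interface_graph(contacts):
    """Build an undirected graph from (node_a, node_b) noncovalent contacts."""
    graph = defaultdict(set)
    for a, b in contacts:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def find_hubs(graph, min_degree=3):
    """Return (node, degree) pairs with degree >= min_degree, highest degree first."""
    hubs = [(node, len(nbrs)) for node, nbrs in graph.items() if len(nbrs) >= min_degree]
    return sorted(hubs, key=lambda item: (-item[1], item[0]))


def find_clusters(graph):
    """Return connected components (interface clusters) as sorted lists of node names."""
    seen, clusters = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for nbr in graph[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        clusters.append(sorted(component))
    return clusters


# Hypothetical contacts between residues and DNA components (base, sugar, phosphate).
contacts = [
    ("ARG10", "G5:base"), ("ARG10", "G5:phosphate"), ("ARG10", "C6:base"),
    ("LYS12", "G5:phosphate"), ("PHE30", "T9:sugar"), ("TYR31", "T9:sugar"),
]
graph = build_interface_graph(contacts)
hubs = find_hubs(graph, min_degree=3)
clusters = find_clusters(graph)
```

Here a hub is simply a node with many neighbours and a cluster is a connected component; the study itself may define both with different parameters.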

2.
Protein-DNA interactions facilitate the fundamental functions of living cells and are universal in all living organisms. Several investigations have been carried out, essentially identifying pairs of interactions between the amino acid residues in proteins and the bases in DNA. In the present study, we have detected the recognition motifs that may constitute a cluster of spatially interacting residues in proteins, which interact with the bases of DNA. A graph spectral algorithm has been used to detect side chain clusters comprising Arg, Lys, Asn, Gln and aromatic residues from proteins interacting with DNA. We find that the interaction of proteins with DNA is through clusters in about half of the proteins in the dataset and through individual residues in the rest. Furthermore, inspection of the clusters has revealed additional interactions in a few cases, which have not been reported earlier. The geometry of the interaction between the DNA base and the protein residue is quantified by the distance d and the angle theta. These parameters have been identified for the cation-pi/H-bond stair motif that was reported earlier. Among the Arg, Lys, Asn and Gln residues, the range of (d, theta) values of the interacting Arg clearly falls into the cation-pi and the hydrogen bond interactions of the 'cation-pi/H-bond' stair motif. Analysis of the cluster composition reveals that the Arg residue is more predominant than the Lys, Asn and Gln residues. The clusters are classified into Type I and Type II based on the presence or absence of aromatic residues (Phe, Tyr) in them. Residue conservation in these clusters has been examined. Apart from the conserved residues identified previously, a few more residues, mainly Phe, Tyr and Arg, have also been identified as conserved and interactive with the DNA. Interestingly, a few residues that are parts of interacting clusters and do not interact directly with the DNA have also been conserved. This emphasizes the importance of recognizing the protein side chain cluster motifs interacting with the DNA, which could serve as signatures of protein-DNA recognition in the families of DNA binding proteins.

3.
An overview of the structures of protein-DNA complexes   Total citations: 1, self-citations: 0, citations by others: 1
Luscombe NM, Austin SE, Berman HM, Thornton JM. Genome Biology, 2000, 1(1): reviews001.1-reviews001.37
On the basis of a structural analysis of 240 protein-DNA complexes contained in the Protein Data Bank (PDB), we have classified the DNA-binding proteins involved into eight different structural/functional groups, which are further classified into 54 structural families. Here we present this classification and review the functions, structures and binding interactions of these protein-DNA complexes.

4.
5.
Protein-DNA recognition plays an essential role in the regulation of gene expression. Regulatory proteins are known to recognize specific DNA sequences directly through atomic contacts (intermolecular readout) and/or indirectly through the conformational properties of the DNA (intramolecular readout). However, little is known about the respective contributions made by these so-called direct and indirect readout mechanisms. We addressed this question by making use of information extracted from a structural database containing many protein-DNA complexes. We quantified the specificity of intermolecular (direct) readout by statistical analysis of base-amino acid interactions within protein-DNA complexes. The specificity of the intramolecular (indirect) readout due to DNA was quantified by statistical analysis of the sequence-dependent DNA conformation. Systematic comparison of these specificities in a large number of protein-DNA complexes revealed that both intermolecular and intramolecular readouts contribute to the specificity of protein-DNA recognition, and that their relative contributions vary depending upon the protein-DNA complexes. We demonstrated that the combination of the intermolecular and intramolecular energies derived from the statistical analyses leads to enhanced specificity, and that the combined energy could explain experimental data on binding affinity changes caused by base mutations. These results provided new insight into the relationship between specificity and structure in the process of protein-DNA recognition, which would lead to prediction of specific protein-DNA binding sites.

6.
We describe the purification to near homogeneity of proteins binding to site C2 (muE3) in the immunoglobulin heavy-chain enhancer. Proteins binding to this site produce four protein-DNA complexes which are distinguished by their mobility in gel retardation assays and their elution properties in an anion exchange column. DNA affinity-purified preparations of three chromatographically separated pools, containing different subsets of the four complexes, each contained three polypeptides of 42.5, 44, and 45 kilodaltons (kDa). UV crosslinking of protein to enhancer DNA demonstrated that site C2-binding activities in the three different pools bound DNA through proteins of similar sizes (about 45 kDa), even though the protein-DNA complexes formed by these binding activities were quite distinct. Gel exclusion chromatography and equilibrium binding analyses indicated that the distinct protein-DNA complexes were due to different oligomeric forms of the individual subunits and that a larger multimeric form bound with high affinity to the heavy-chain enhancer site C2, while a smaller species had a much lower affinity for heavy-chain enhancer sequences. Purified protein has been used to map high-affinity binding sites for site C2-binding proteins within an immunoglobulin heavy-chain promoter and at site KE3 in the kappa light-chain enhancer.

7.
8.
9.
10.
A detailed analysis of the DNA-binding sites of 26 proteins is presented using data from the Nucleic Acid Database (NDB) and the Protein Data Bank (PDB). Chemical and physical properties of the protein-DNA interface, such as polarity, size, shape, and packing, were analysed. The DNA-binding sites shared common features, comprising many discontinuous sequence segments forming hydrophilic surfaces capable of direct and water-mediated hydrogen bonds. These interface sites were compared to those of protein-protein binding sites, revealing them to be more polar, with many more intermolecular hydrogen bonds and buried water molecules than the protein-protein interface sites. By looking at the number and positioning of protein residue-DNA base interactions in a series of interaction footprints, three modes of DNA binding were identified (single-headed, double-headed and enveloping). Six of the eight enzymes in the data set bound in the enveloping mode, with the protein presenting a large interface area effectively wrapped around the DNA. A comparison of structural parameters of the DNA revealed that some values for the bound DNA (including twist, slide and roll) were intermediate between those observed for the unbound B-DNA and A-DNA. The distortion of bound DNA was evaluated by calculating a root-mean-square deviation on fitting to a canonical B-DNA structure. Major distortions were commonly caused by specific kinks in the DNA sequence, some resulting in the overall bending of the helix. The helix bending affected the dimensions of the grooves in the DNA, allowing the binding of protein elements that would otherwise be unable to make contact. From this structural analysis, a preliminary set of rules that govern the bending of the DNA in protein-DNA complexes is proposed.

11.
12.
Recent advances in high-throughput methods and the application of computational tools for automatic classification of proteins have made it possible to carry out large-scale proteomic analyses. Biological analysis and interpretation of sets of proteins is a time-consuming undertaking carried out manually by experts. We have developed PANDORA (Protein ANnotation Diagram ORiented Analysis), a web-based tool that provides an automatic representation of the biological knowledge associated with any set of proteins. PANDORA uses a unique approach of keyword-based graphical analysis that focuses on detecting subsets of proteins that share unique biological properties and the intersections of such sets. PANDORA currently supports SwissProt keywords, NCBI Taxonomy, InterPro entries and the hierarchical classification terms from ENZYME, SCOP and GO databases. The integrated study of several annotation sources simultaneously allows a representation of biological relations of structure, function, cellular location, taxonomy, domains and motifs. PANDORA is also integrated into the ProtoNet system, thus allowing the testing of thousands of automatically generated clusters. We illustrate how PANDORA enhances the biological understanding of large, non-uniform sets of proteins originating from experimental and computational sources, without the need for prior biological knowledge on individual proteins.

13.
MotifCluster finds related motifs in a set of sequences, and clusters the sequences into families using the motifs they contain. MotifCluster, at , lets users test whether proteins are related, cluster sequences by shared conserved motifs, and visualize motifs mapped onto trees, sequences and three-dimensional structures. We demonstrate MotifCluster's accuracy using gold-standard protein superfamilies; using recommended settings, families were assigned to the correct superfamilies with 0.17% false positive and no false negative assignments.

14.
15.
Protein-DNA recognition plays an essential role in the regulation of gene expression. The protein-DNA binding specificity is based on direct atomic contacts between protein and DNA and/or the conformational properties of DNA. In this work, we have analyzed the influence of DNA stiffness (E) on the specificity of protein-DNA complexes. The average DNA stiffness parameters for several protein-DNA complexes have been computed using the structure-based, sequence-dependent stiffness scale. The relationship between DNA stiffness and experimental protein-DNA binding specificity has been brought out. We have investigated the importance of DNA stiffness with the aid of experimental free energy changes (DeltaDeltaG) due to binding in several protein-DNA complexes, such as ETS proteins, 434, lambda, Mnt and trp repressors, 434 cro protein, EcoRV endonuclease V and zinc fingers. We found a correlation in the range 0.65-0.97 between DeltaDeltaG and E in these examples. Further, we have qualitatively analyzed the effect of mutations in the target sequence of lambda repressor and we observed that the DNA stiffness could correctly identify 70% of the correct bases among the considered nine positions.
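
The correlation between DeltaDeltaG and E reported in this abstract is illustrated below with a minimal sketch. It assumes the figure is a plain Pearson correlation coefficient, which the abstract does not state; the stiffness and free energy values are invented for illustration and are not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values: average stiffness E per mutant and its measured DeltaDeltaG.
stiffness = [1.0, 2.0, 3.0, 4.0]
ddg = [0.5, 1.1, 1.4, 2.0]
r = pearson_r(stiffness, ddg)
```

A value of r near 1 would indicate that stiffer target sequences track larger free energy changes in the toy data; it says nothing about the real complexes.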

16.
Palindromic Units (PU or REP) were defined as DNA sequences of 40 nucleotides highly repeated on the genome of Escherichia coli and other Enterobacteriaceae. PU are found in clusters of up to six occurrences always localized in extragenic regions. By sorting the DNA sequences of the known PU-containing regions into different classes, we show here for the first time that, besides the PU themselves, each PU cluster contains a number of other conserved sequence motifs. Seven such motifs were identified with the present list of PU regions. Remarkably, each PU cluster is exclusively composed of a mosaic combination of PU and of these other sequence motifs. We demonstrate directly by hybridization experiments that one of these motifs (called L) is indeed present in a large number of copies on the Escherichia coli chromosome and that its distribution follows the same species specificity as PU sequences themselves. We propose that the mosaic pattern of motif combination in PU clusters reveals a new type of bacterial genetic element which we propose to call BIME for Bacterial Interspersed Mosaic Element. The Escherichia coli genome contains about 500 BIME.

17.
Structure-based prediction of DNA target sites by regulatory proteins   Total citations: 15, self-citations: 0, citations by others: 15
Kono H, Sarai A. Proteins, 1999, 35(1): 114-131
Regulatory proteins play a critical role in controlling complex spatial and temporal patterns of gene expression in higher organisms, by recognizing multiple DNA sequences and regulating multiple target genes. Increasing amounts of structural data on the protein-DNA complex provide clues for the mechanism of target recognition by regulatory proteins. The analyses of the propensities of base-amino acid interactions observed in those structural data show that there is no one-to-one correspondence in the interaction, but clear preferences exist. On the other hand, the analysis of spatial distribution of amino acids around bases shows that even those amino acids with strong base preference such as Arg with G are distributed in a wide space around bases. Thus, amino acids with many different geometries can form a similar type of interaction with bases. The redundancy and structural flexibility in the interaction suggest that there are no simple rules in the sequence recognition, and its prediction is not straightforward. However, the spatial distributions of amino acids around bases indicate a possibility that the structural data can be used to derive empirical interaction potentials between amino acids and bases. Such information extracted from structural databases has been successfully used to predict amino acid sequences that fold into particular protein structures. We surmised that the structures of protein-DNA complexes could be used to predict DNA target sites for regulatory proteins, because determining DNA sequences that bind to a particular protein structure should be similar to finding amino acid sequences that fold into a particular structure. Here we demonstrate that the structural data can be used to predict DNA target sequences for regulatory proteins. Pairwise potentials that determine the interaction between bases and amino acids were empirically derived from the structural data. These potentials were then used to examine the compatibility between DNA sequences and the protein-DNA complex structure in a combinatorial "threading" procedure. We applied this strategy to the structures of protein-DNA complexes to predict DNA binding sites recognized by regulatory proteins. To test the applicability of this method in target-site prediction, we examined the effects of cognate and noncognate binding, cooperative binding, and DNA deformation on the binding specificity, and predicted binding sites in real promoters and compared them with experimental data. These results show that target binding sites for several regulatory proteins are successfully predicted, and our data suggest that this method can serve as a powerful tool for predicting multiple target sites and target genes for regulatory proteins.
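
The combinatorial "threading" idea in this abstract can be sketched as follows. This is only an illustration under strong simplifying assumptions: the potentials here depend on amino acid and base identity alone, whereas the abstract derives them from spatial distributions in structural data; the potential table, the contact list, and all function names are hypothetical.

```python
from itertools import product

# Hypothetical amino acid-base potentials (lower means more favourable).
# These numbers are invented for illustration, not derived from structures.
POTENTIAL = {
    ("ARG", "G"): -1.5, ("ARG", "A"): -0.2,
    ("ASN", "A"): -1.0, ("ASN", "C"): -0.1,
    ("LYS", "G"): -0.8, ("LYS", "T"): -0.3,
}


def score_sequence(contacts, sequence):
    """Sum the potentials over (amino acid, DNA position) contacts for one sequence."""
    return sum(POTENTIAL.get((aa, sequence[pos]), 0.0) for aa, pos in contacts)


def rank_sequences(contacts, length):
    """Thread every DNA sequence of the given length and sort by total score."""
    scored = []
    for bases in product("ACGT", repeat=length):
        sequence = "".join(bases)
        scored.append((sequence, score_sequence(contacts, sequence)))
    return sorted(scored, key=lambda item: (item[1], item[0]))


# Hypothetical contacts taken from a fixed complex structure:
# which amino acid touches which base position.
contacts = [("ARG", 0), ("ASN", 1), ("LYS", 2)]
ranking = rank_sequences(contacts, length=3)
best_sequence, best_score = ranking[0]
```

Exhaustive enumeration grows as 4^length, so it is only practical for short sites; the sketch keeps it for clarity rather than efficiency.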

18.
19.
20.
We have been studying the molecular mechanism of neuronal differentiation through which the multipotent precursor becomes limited to the final transmitter phenotype. Here we focused on the role of the 5′ proximal regulatory cassette (−190; +53 bp) of the rat enkephalin (rENK) gene in the developmental regulation of the enkephalin phenotype. Several well-characterized cis-elements, including AP2, CREB, NF1, and NFkB, reside on this region of the rENK gene. These motifs were sufficient to confer activity-dependent expression of the gene during neurodifferentiation when it was tested using transient transfection assays of primary developing spinal cord neurons treated with tetrodotoxin (TTX). This region was then used as a DNA probe in mobility shift assays, with nuclear proteins derived from phenotypically and ontogenetically distinct brain regions. Only a few low-abundance protein-DNA complexes were detected and only with nuclear proteins derived from developing but not from adult brain. The spatiotemporal pattern of these complexes did not show correlation with enkephalin expression, which was assessed by RT-PCR. We employed synthetic probes corresponding to consensus as well as ENK-specific sequences of the individual motifs to identify the nature of the observed bands. Although both consensus NF1 and enk CRE1(NF1) formed complexes with nuclear proteins derived from the striatum and cortex at various ages, the appearance of the bands was not correlated with ENK expression. Surprisingly, no complexes were detected if other ENK-specific motifs were used as probes. We also tested nuclear extracts derived from forskolin-induced and control C6 glioma cells, again using the whole proximal regulatory cassette as well as individual motifs. These experiments showed the formation of elaborate protein-DNA bands. There was no direct correlation between the appearance of bands and forskolin-induced ENK expression. Unexpectedly, all ENK-specific motifs formed specific and highly abundant protein-DNA complexes when nuclear extracts from the human tumor cell line (HeLa), which does not express ENK, were used. Based on these observations, we concluded that:
  1. Interactions between the proximal regulatory cassette and additional, probably far-distant, regions of the rENK gene and their binding proteins may be necessary to confer developmentally regulated, cell-specific expression of the ENK gene; and
  2. Inducibility of the gene by common cis-elements can be governed by this region; however, the cell-specificity of the induction remains elusive.
