Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
Robust methods to detect DNA-binding proteins from structures of unknown function are important for structural biology. This paper describes a method for identifying such proteins that have (i) a solvent-accessible structural motif necessary for DNA binding and (ii) a positive electrostatic potential in the vicinity of the binding region. We focus on three structural motifs: helix–turn–helix (HTH), helix–hairpin–helix (HhH) and helix–loop–helix (HLH). We find that the combination of these variables detects 78% of proteins with an HTH motif, which is a substantial improvement over previous work based purely on structural templates and is comparable to more complex methods of identifying DNA-binding proteins. Similar true positive fractions are achieved for the HhH and HLH motifs. We see evidence of wide evolutionary diversity for DNA-binding proteins with an HTH motif, and much smaller diversity for those with an HhH or HLH motif.

2.
This work describes a method for predicting DNA binding function from structure using 3-dimensional templates. Proteins that bind DNA using small contiguous helix–turn–helix (HTH) motifs comprise a significant number of all DNA-binding proteins. A structural template library of seven HTH motifs has been created from non-homologous DNA-binding proteins in the Protein Data Bank. The templates were used to scan complete protein structures using an algorithm that calculated the root mean squared deviation (rmsd) for the optimal superposition of each template on each structure, based on Cα backbone coordinates. Distributions of rmsd values for known HTH-containing proteins (true hits) and non-HTH proteins (false hits) were calculated. A threshold value of 1.6 Å rmsd was selected that gave a true hit rate of 88.4% and a false positive rate of 0.7%. The false positive rate was further reduced to 0.5% by introducing an accessible surface area threshold value of 990 Å² per HTH motif. The template library and the validated thresholds were used to make predictions for target proteins from a structural genomics project.
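The effect of combining the two cut-offs can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function names `is_predicted_hth` and `hit_rates` and all of the toy (rmsd, ASA) values are hypothetical, and treating both thresholds as inclusive is an assumption. Only the 1.6 Å and 990 Å² values come from the abstract.

```python
from typing import Sequence, Tuple

RMSD_THRESHOLD = 1.6   # Å, value quoted in the abstract
ASA_THRESHOLD = 990.0  # Å^2 per HTH motif, value quoted in the abstract


def is_predicted_hth(rmsd: float, asa: float) -> bool:
    """Predict an HTH motif when the template fit is tight and the motif is exposed.

    Assumption: both comparisons are inclusive at the boundary.
    """
    return rmsd <= RMSD_THRESHOLD and asa >= ASA_THRESHOLD


def hit_rates(
    positives: Sequence[Tuple[float, float]],
    negatives: Sequence[Tuple[float, float]],
) -> Tuple[float, float]:
    """Return (true hit rate, false positive rate) for lists of (rmsd, asa) pairs."""
    true_hits = sum(is_predicted_hth(r, a) for r, a in positives)
    false_hits = sum(is_predicted_hth(r, a) for r, a in negatives)
    return true_hits / len(positives), false_hits / len(negatives)


# Toy data, made up for illustration: (best-template rmsd in Å, motif ASA in Å^2)
known_hth = [(0.9, 1200.0), (1.4, 1050.0), (1.5, 995.0), (2.1, 1300.0)]
non_hth = [(1.5, 700.0), (2.8, 1100.0), (3.2, 400.0), (1.2, 1000.0)]

true_rate, false_rate = hit_rates(known_hth, non_hth)
print(f"true hit rate = {true_rate:.2f}, false positive rate = {false_rate:.2f}")
```

In this toy set the ASA filter rejects one non-HTH structure that passes the rmsd cut-off, which is the kind of false-positive reduction the abstract reports.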

3.
Diverse mechanisms for DNA-protein recognition have been elucidated in numerous atomic complex structures from various protein families. These structural data provide an invaluable knowledge base not only for understanding DNA-protein interactions, but also for developing specialized methods that predict the DNA-binding function from protein structure. While such methods are useful, a major limitation is that they require an experimental structure of the target as input. To overcome this obstacle, we develop a threading-based method, DNA-Binding-Domain-Threader (DBD-Threader), for the prediction of DNA-binding domains and associated DNA-binding protein residues. Our method, which uses a template library composed of DNA-protein complex structures, requires only the target protein's sequence. In our approach, fold similarity and DNA-binding propensity are employed as two functional discriminating properties. In benchmark tests on 179 DNA-binding and 3,797 non-DNA-binding proteins, using templates whose sequence identity is less than 30% to the target, DBD-Threader achieves a sensitivity/precision of 56%/86%. This performance is considerably better than the standard sequence comparison method PSI-BLAST and is comparable to DBD-Hunter, which requires an experimental structure as input. Moreover, for over 70% of predicted DNA-binding domains, the backbone Root Mean Square Deviations (RMSDs) of the top-ranked structural models are within 6.5 Å of their experimental structures, with their associated DNA-binding sites identified at satisfactory accuracy. Additionally, DBD-Threader correctly assigned the SCOP superfamily for most predicted domains. To demonstrate that DBD-Threader is useful for automatic function annotation on a large scale, DBD-Threader was applied to 18,631 protein sequences from the human genome; 1,654 proteins are predicted to have DNA-binding function. Comparison with existing Gene Ontology (GO) annotations suggests that ∼30% of our predictions are new. Finally, we present some interesting predictions in detail. In particular, it is estimated that ∼20% of classic zinc finger domains play a functional role not related to direct DNA-binding.

4.
Roy S, Sahu A, Adhya S. Gene, 2002, 285(1-2): 169-173
GalS, a gene regulatory protein with a helix-turn-helix (HTH) DNA-binding motif, contains a functional operator within the DNA sequences encoding the HTH region (Nature 369 (1994) 314). We searched for operator-like sequences within the DNA sequences encoding the DNA-binding motifs of other regulatory proteins. Five such proteins, DeoR, CytR, LRP, LuxR and PurR, were found to have actual operator or operator-like sequences in the DNA sequences encoding the DNA-binding motif. All of them except DeoR are, like GalS, known to be auto-regulated. Auto-regulation in the case of DeoR has not been investigated. Seven other proteins containing an HTH motif do not have operator-like sequences in the DNA sequences encoding the HTH motif; none of them, except MerR, are known to be auto-regulated. The DNA-binding proteins may have evolved from a common ancestor containing a DNA-binding site within the gene segment that encodes the DNA-binding motif, to facilitate auto-regulation. We discuss current evidence for a monophyletic or polyphyletic origin of such sequences.

5.
ASTRAL compendium enhancements   Cited by: 7 (self-citations: 1, citations by others: 6)
The ASTRAL compendium provides several databases and tools to aid in the analysis of protein structures, particularly through the use of their sequences. It is partially derived from the SCOP database of protein domains, and it includes sequences for each domain as well as other resources useful for studying these sequences and domain structures. Several major improvements have been made to the ASTRAL compendium since its initial release 2 years ago. The number of protein domain sequences included has doubled from 15 190 to 30 867, and additional databases have been added. The Rapid Access Format (RAF) database contains manually curated mappings linking the biological amino acid sequences described in the SEQRES records of PDB entries to the amino acid sequences structurally observed (provided in the ATOM records) in a format designed for rapid access by automated tools. This information is used to derive sequences for protein domains in the SCOP database. In cases where a SCOP domain spans several protein chains, all of which can be traced back to a single genetic source, a ‘genetic domain’ sequence is created by concatenating the sequences of each chain in the order found in the original gene sequence. Both the original-style library of SCOP sequences and a new library including genetic domain sequences are available. Selected representative subsets of each of these libraries, based on multiple criteria and degrees of similarity, are also included. ASTRAL may be accessed at http://astral.stanford.edu/.
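The construction of a 'genetic domain' sequence described above amounts to ordering the chains by their position in the gene and joining their sequences. The sketch below is a simplified illustration only: the `ChainSegment` type, its `gene_order` field and the toy sequences are hypothetical and do not reflect the actual ASTRAL or RAF data formats.

```python
from typing import List, NamedTuple


class ChainSegment(NamedTuple):
    chain_id: str    # PDB chain identifier
    gene_order: int  # hypothetical field: position of this chain in the original gene
    sequence: str    # one-letter amino acid sequence of the chain's part of the domain


def genetic_domain_sequence(segments: List[ChainSegment]) -> str:
    """Concatenate chain sequences in the order they occur in the original gene."""
    ordered = sorted(segments, key=lambda segment: segment.gene_order)
    return "".join(segment.sequence for segment in ordered)


# Toy example with made-up sequences: chain B precedes chain A in the gene.
segments = [
    ChainSegment("A", 2, "GIVEQ"),
    ChainSegment("B", 1, "FVNQH"),
]
print(genetic_domain_sequence(segments))  # chain B's residues come first
```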

6.
DNA–protein interactions are involved in many essential biological activities. Because there is no simple mapping code between DNA base pairs and protein amino acids, the prediction of DNA–protein interactions is a challenging problem. Here, we present a novel computational approach for predicting DNA-binding protein residues and DNA–protein interaction modes without knowing the protein's specific DNA target sequence. Given the structure of a DNA-binding protein, the method first generates an ensemble of complex structures obtained by rigid-body docking with a nonspecific canonical B-DNA. Representative models are subsequently selected through clustering and ranking by their DNA–protein interfacial energy. Analysis of these encounter complex models suggests that the recognition sites for specific DNA binding are usually favorable interaction sites for the nonspecific DNA probe and that nonspecific DNA–protein interaction modes exhibit some similarity to specific DNA–protein binding modes. Although the method requires as input the knowledge that the protein binds DNA, in benchmark tests, it achieves better performance in identifying DNA-binding sites than three previously established methods, which are based on sophisticated machine-learning techniques. We further apply our method to protein structures predicted through modeling and demonstrate that our method performs satisfactorily on protein models whose root-mean-square Cα deviation from the native structure is up to 5 Å. This study provides valuable structural insights into how a specific DNA-binding protein interacts with a nonspecific DNA sequence. The similarity between the specific DNA–protein interaction mode and nonspecific interaction modes may reflect an important sampling step in the search by a DNA-binding protein for its specific DNA targets.

7.
This review describes methods for the prediction of DNA binding function, and specifically summarizes a new method using 3D structural templates. The new method features the HTH motif that is found in approximately one-third of DNA-binding protein families. A library of 3D structural templates of HTH motifs was derived from proteins in the PDB. Templates were scanned against complete protein structures and the optimal superposition of a template on a structure calculated. Significance thresholds in terms of a minimum root mean squared deviation (rmsd) of an optimal superposition, and a minimum motif accessible surface area (ASA), have been calculated. In this way, it is possible to scan the template library against proteins of unknown function to make predictions about DNA-binding functionality.

8.
Helix–hairpin–helix (HhH) is a widespread motif involved in non-sequence-specific DNA binding. The majority of HhH motifs function as DNA-binding modules; however, some of them are used to mediate protein–protein interactions or have acquired enzymatic activity by incorporating catalytic residues (DNA glycosylases). From sequence and structural analysis of HhH-containing proteins we conclude that most HhH motifs are integrated as a part of a five-helical domain, termed (HhH)2 domain here. It typically consists of two consecutive HhH motifs that are linked by a connector helix and displays pseudo-2-fold symmetry. (HhH)2 domains show clear structural integrity and a conserved hydrophobic core composed of seven residues, one residue from each α-helix and each hairpin, and deserve recognition as a distinct protein fold. In addition to known HhH in the structures of RuvA, RadA, MutY and DNA-polymerases, we have detected new HhH motifs in sterile alpha motif and barrier-to-autointegration factor domains, the α-subunit of Escherichia coli RNA-polymerase, DNA-helicase PcrA and DNA glycosylases. Statistically significant sequence similarity of HhH motifs and pronounced structural conservation argue for homology between (HhH)2 domains in different protein families. Our analysis helps to clarify how non-symmetric protein motifs bind to the double helix of DNA through the formation of a pseudo-2-fold symmetric (HhH)2 functional unit.

9.
Many prokaryotic and eukaryotic DNA-binding proteins use a helix-turn-helix (HTH) structure for DNA recognition. Here we describe a new family of eukaryotic HTH proteins, the Pipsqueak (Psq) family, which includes proteins from fungi, sea urchins, nematodes, insects, and vertebrates. Three subgroups of the Psq family can be distinguished. Like the HTH proteins of the prokaryotic resolvase family, members of the CENP-B/transposase subgroup catalyze site-specific recombination reactions. This functional conservation, together with a primary sequence similarity between the resolvase and Psq DNA-binding domains, suggests that the resolvase and Psq families are evolutionarily linked. More than half of the newly identified Drosophila Psq proteins contain a BTB protein-protein interaction domain. All proteins of this BTB subgroup belong to the conserved Tramtrack group of BTB-domain proteins. About half of the members of the Tramtrack group contain a Psq domain, while the other half is made up of proteins that contain a zinc finger domain. Thus, nearly all members of this group appear to be DNA-binding proteins. Among other developmental regulators, the Drosophila cell death protein E93 was found to contain a Psq motif and to define a third subgroup of Psq domain proteins. The high sequence conservation of the E93 Psq motif allowed the identification of E93 orthologs in humans and lower metazoans.

10.
Sequence-based motif prediction is of great interest and remains a challenge. In this work, we develop a local combinational variable approach for sequence-based helix-turn-helix (HTH) motif prediction. First, we choose a sequence data set for 88 proteins of 22 amino acids in length and launch an optimized traversal to extract local combinational segments (LCS) from the data set. Then, after LCS refinement, local combinational variables (LCV) are generated to construct prediction models for HTH motifs. The prediction ability of LCV sets at different thresholds is calculated in order to settle on a moderate threshold. The large data set we used comprises 13 HTH families, with 17 455 sequences in total. Our approach predicts HTH motifs more precisely using only primary protein sequence information, with 93.29% accuracy, 93.93% sensitivity and 92.66% specificity. Prediction results for newly reported HTH-containing proteins, compared with those of other prediction web services, indicate that the LCV approach yields a good prediction model. Comparisons with profile-HMM models from the Pfam protein families database show that the LCV approach maintains a good balance while dealing with HTH-containing proteins and non-HTH proteins at the same time. The LCV approach is to some extent complementary to the profile-HMM models because of its better identification of false-positive data. Furthermore, genome-wide predictions detect new HTH proteins in both Homo sapiens and Escherichia coli, which broadens the applications of the LCV approach. Software for mining LCVs from sequence data sets can be obtained from the anonymous FTP site ftp://cheminfo.tongji.edu.cn/LCV/freely.
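For readers comparing the three reported figures, the standard definitions of accuracy, sensitivity and specificity can be written out as a short Python sketch. The function name `classification_metrics` and the confusion-matrix counts below are hypothetical and chosen only for illustration; they are not the counts behind the percentages in the abstract.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,   # fraction of all predictions that are correct
        "sensitivity": tp / (tp + fn),   # fraction of true HTH motifs that are detected
        "specificity": tn / (tn + fp),   # fraction of non-HTH sequences that are rejected
    }


# Hypothetical counts for a balanced test set of 100 HTH and 100 non-HTH sequences.
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

When the positive and negative sets are the same size, as in this toy example, accuracy equals the mean of sensitivity and specificity.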

11.
Gene, 1998, 222(1): 133-144
The A-factor receptor protein (ArpA) plays a key role in the regulation of secondary metabolism and cellular differentiation in Streptomyces griseus. ArpA binds the target DNA site forming a 22 bp palindrome in the absence of A-factor, and exogenous addition of A-factor to the ArpA–DNA complex immediately releases ArpA from the DNA. An amino acid (aa) replacement of Val-41 by Ala in an α-helix–turn–α-helix (HTH) motif at the N-terminal portion of ArpA abolished DNA-binding activity but not A-factor-binding activity, suggesting the involvement of this HTH in DNA-binding. On the other hand, an aa replacement of Trp-119 by Ala generated a mutant ArpA that was unable to bind A-factor, thus resulting in an A-factor-insensitive mutant that bound normally to its target DNA in both the presence and absence of A-factor. These data suggest that ArpA consisting of two functional domains, one for HTH-type DNA-binding at the N-terminal portion and one for A-factor-binding at the C-terminal portion, is a member of the LacI family. Consistent with this, two ArpA homologues, CprA and CprB, from Streptomyces coelicolor A3(2), each of which contains a very similar aa sequence of the HTH to that of ArpA, also recognized and bound the same DNA target. However, neither CprA nor CprB recognized A-factor, probably due to much less similarity in the C-terminal domains.

12.
13.
A method for discerning protein structures containing the DNA-binding helix-turn-helix (HTH) motif has been developed. The method uses statistical models based on geometrical measurements of the motif. With a decision tree model, key structural features required for DNA binding were identified. These include a high average solvent-accessibility of residues within the recognition helix and a conserved hydrophobic interaction between the recognition helix and the second alpha helix preceding it. The Protein Data Bank was searched using a more accurate model of the motif created using the Adaboost algorithm to identify structures that have a high probability of containing the motif, including those that had not been reported previously.

14.
15.
Redesign of the bacteriophage 434 Cro repressor was accomplished by using an in vivo genetic screening system to identify new variants that specifically bound previously unrecognized DNA sequences. Site-directed, combinatorial mutagenesis of the 434 Cro helix-turn-helix (HTH) motif generated libraries of new variants which were screened for binding to new target sequences. Multiple mutations of 434 Cro that functionally converted wild-type (wt) 434 Cro DNA binding-sequence specificity to that of a lambda bacteriophage-specific repressor were identified. The libraries contained variations within the HTH sequence at only three positions. In vivo and in vitro analysis of several of the identified 434 Cro variants showed that the relatively few changes in the recognition helix of the HTH motif of 434 Cro resulted in specific and tight binding of the target DNA sequences. For the best 434 Cro variant identified, an apparent K(d) for lambda O(R)3 of 1 nM was observed. In competition experiments, this Cro variant was observed to be highly selective. We conclude that functional 434 Cro repressor variants with new DNA binding specificities can be generated from wt 434 Cro by mutating just the recognition helix. Important characteristics of the screening system responsible for the successful identifications are discussed. Application of the techniques presented here may allow the identification of DNA binding protein variants that functionally affect DNA regulatory sequences important in disease and industrial and biotechnological processes.

16.
17.
18.
We had previously exploited a method for targeted DNA methylation in budding yeast to succeed in one-hybrid detection of methylation-dependent DNA–protein interactions. Based on this finding, we developed a yeast one-hybrid system to screen cDNA libraries for clones encoding methylated DNA-binding proteins. Concurrent use of two independent bait sequences in the same cell, or dual-bait system, effectively reduced false positive clones, which were derived from methylation-insensitive sequence-specific DNA-binding proteins. We applied the dual-bait system to screen cDNA libraries and demonstrated efficient isolation of clones for methylated DNA-binding proteins. This system would serve as a unique research tool for epigenetics.

19.
The homeobox, a 183 bp DNA sequence element, was originally identified as a region of sequence similarity between many Drosophila homeotic genes. The homeobox codes for a DNA-binding motif known as the homeodomain. Homeobox genes have been found in many animal species, including sea urchins, nematodes, frogs, mice and humans. To isolate homeobox-containing sequences from the plant Arabidopsis thaliana, a cDNA library was screened with a highly degenerate oligonucleotide corresponding to a conserved eight amino acid sequence from the helix-3 region of the homeodomain. Using this strategy two cDNA clones sharing homeobox-related sequences were identified. Interestingly, both of the cDNAs also contain a second element that potentially codes for a leucine zipper motif which is located immediately 3' to the homeobox. The close proximity of these two domains suggests that the homeodomain-leucine zipper motif could, via dimerization of the leucine zippers, recognize dyad-symmetrical DNA sequences.

20.
The helix-turn-helix (HTH) motif features frequently in protein DNA-binding assemblies. Viral pac site-targeting small terminase proteins possess an unusual architecture in which the HTH motifs are displayed in a ring, distinct from the classical HTH dimer. Here we investigate how such a circular array of HTH motifs enables specific recognition of the viral genome for initiation of DNA packaging during virus assembly. We found, by surface plasmon resonance and analytical ultracentrifugation, that individual HTH motifs of the Bacillus phage SF6 small terminase bind the packaging regions of SF6 and related SPP1 genome weakly, with little local sequence specificity. Nuclear magnetic resonance chemical shift perturbation studies with an arbitrary single-site substrate suggest that the HTH motif contacts DNA similarly to how certain HTH proteins contact DNA non-specifically. Our observations support a model where specificity is generated through conformational selection of an intrinsically bent DNA segment by a ring of HTHs which bind weakly but cooperatively. Such a system would enable viral gene regulation and control of the viral life cycle, with a minimal genome, conferring a major evolutionary advantage for SPP1-like viruses.
