Similar articles
Found 20 similar articles; search took 15 ms.
1.
A classification model of a DNA-binding protein chain was created based on identification of alpha helices within the chain likely to bind to DNA. Using the model, all chains in the Protein Data Bank were classified. For many of the chains classified with high confidence, previous documentation for DNA-binding was found, yet no sequence homology to the structures used to train the model was detected. The result indicates that the chain model can be used to supplement sequence based methods for annotating the function of DNA-binding. Four new candidates for DNA-binding were found, including two structures solved through structural genomics efforts. For each of the candidate structures, possible sites of DNA-binding are indicated by listing the residue ranges of alpha helices likely to interact with DNA.

2.
We developed a general method for the enrichment and identification of sequence-specific DNA-binding proteins. A well-characterized protein-DNA interaction is used to isolate from crude cellular extracts or fractions thereof proteins which bind to specific DNA sequences; the method is based solely on this binding property of the proteins. The DNA sequence of interest, cloned adjacent to the lac operator DNA segment is incubated with a lac repressor-beta-galactosidase fusion protein which retains full operator and inducer binding properties. The DNA fragment bound to the lac repressor-beta-galactosidase fusion protein is precipitated by the addition of affinity-purified anti-beta-galactosidase immobilized on beads. This forms an affinity matrix for any proteins which might interact specifically with the DNA sequence cloned adjacent to the lac operator. When incubated with cellular extracts in the presence of excess competitor DNA, any protein(s) which specifically binds to the cloned DNA sequence of interest can be cleanly precipitated. When isopropyl-beta-D-thiogalactopyranoside is added, the lac repressor releases the bound DNA, and thus the protein-DNA complex consisting of the specific restriction fragment and any specific binding protein(s) is released, permitting the identification of the protein by standard biochemical techniques. We demonstrate the utility of this method with the lambda repressor, another well-characterized DNA-binding protein, as a model. In addition, with crude preparations of the yeast mitochondrial RNA polymerase, we identified a 70,000-molecular-weight peptide which binds specifically to the promoter region of the yeast mitochondrial 14S rRNA gene.

3.
Dear Editor, In recent years, post-translational modifications (PTMs) by small ubiquitin-related modifiers (SUMOs) have emerged as an important regulatory mechanism for both cellular and viral processes (Ribet and Cossart, 2010). Identifying the SUMOylation sites of the target protein is

4.
Ho SY, Yu FC, Chang CY, Huang HL. Bio Systems, 2007, 90(1): 234-241
In this paper, we investigate the design of accurate predictors for DNA-binding sites in proteins from amino acid sequences. We propose a hybrid method that uses a support vector machine (SVM) in conjunction with evolutionary information of amino acid sequences, in the form of position-specific scoring matrices (PSSMs), to predict DNA-binding sites. Because the numbers of binding and non-binding residues in proteins are significantly unequal, two additional weights as well as the SVM parameters are analyzed and adopted to maximize net prediction (NP, the average of sensitivity and specificity) accuracy. To evaluate the generalization ability of the proposed method, SVM-PSSM, a DNA-binding dataset, PDC-59, consisting of 59 protein chains with low sequence identity to each other, is additionally established. Using the same six-fold cross-validation procedure and PSSM features, the SVM-based method achieves NP=80.15% for the training dataset PDNA-62 and NP=69.54% for the test dataset PDC-59, which is much better than the existing neural network-based method: the NP values for training and test accuracies increase by up to 13.45% and 16.53%, respectively. Simulation results reveal that SVM-PSSM performs well in predicting DNA-binding sites of novel proteins from amino acid sequences.
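
The NP measure defined above (the average of sensitivity and specificity) can be sketched as follows. This is a minimal illustration, not the authors' code; the function name `net_prediction` and the confusion-matrix counts are hypothetical and chosen only to show how NP behaves when binding and non-binding residues are unequal in number.

```python
def net_prediction(tp, fn, tn, fp):
    """Return (sensitivity, specificity, NP) from confusion-matrix counts.

    NP is the plain average of sensitivity and specificity, so it is not
    dominated by the much larger non-binding class.
    """
    sensitivity = tp / (tp + fn)  # fraction of binding residues recovered
    specificity = tn / (tn + fp)  # fraction of non-binding residues recovered
    return sensitivity, specificity, (sensitivity + specificity) / 2


# Hypothetical counts for an imbalanced residue set (illustration only).
sens, spec, np_score = net_prediction(tp=80, fn=20, tn=600, fp=400)
accuracy = (80 + 600) / (80 + 20 + 600 + 400)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} NP={np_score:.2f} accuracy={accuracy:.3f}")
```

With these toy counts, plain accuracy is pulled toward the specificity of the majority class, whereas NP weights both classes equally.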

5.
6.
A computer program system was developed to predict carbohydrate-binding sites on three-dimensional (3D) protein structures. The programs search for binding sites by referring to the empirical rules derived from the known 3D structures of carbohydrate-protein complexes. A total of 80 non-redundant carbohydrate-protein complex structures were selected from the Protein Data Bank for the empirical rule construction. The performance of the prediction system was tested on 50 known complex structures to determine whether the system could detect the known binding sites. The known monosaccharide-binding sites were detected among the best three predictions in 59% of the cases, which covered 69% of the polysaccharide-binding sites in the target proteins, when the performance was evaluated by the overlap between residue patches of predicted and known binding sites.

7.
Comparison of the amino acid sequences of 13 procaryotic regulatory proteins, including the products of genes crp (catabolite activator protein; CAP), lacI, galR, lexA, lysR, araC, trpR, and tnpR of Escherichia coli, of genes cI, cII and cro of phage lambda, cro of phage 434, and c2 of phage P22, has revealed two regions of homology. The sites of action of these proteins also share common features in their DNA sequence. Taking into account the models proposed for the lambda repressors, cro and cI, and for CAP, a general type of DNA-protein interaction is suggested.

8.
A method to detect DNA-binding sites on the surface of a protein structure is important for functional annotation. This work describes the analysis of residue patches on the surface of DNA-binding proteins and the development of a method of predicting DNA-binding sites using a single feature of these surface patches. Surface patches and the DNA-binding sites were initially analysed for accessibility, electrostatic potential, residue propensity, hydrophobicity and residue conservation. From this, it was observed that the DNA-binding sites were, in general, amongst the top 10% of patches with the largest positive electrostatic scores. This knowledge led to the development of a prediction method in which patches of surface residues were selected such that they excluded residues with negative electrostatic scores. This method was used to make predictions for a data set of 56 non-homologous DNA-binding proteins. Correct predictions were made for 68% of the data set.
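
The selection idea described above (drop residues with negative electrostatic scores, then keep the patches with the largest positive scores) might be sketched like this. It is only an assumption-laden illustration: the abstract does not say how patch scores are computed or combined, so the summed per-residue score, the function name `rank_patches`, and the toy data are all hypothetical.

```python
import math


def rank_patches(patches, top_fraction=0.10):
    """Rank surface patches by summed non-negative electrostatic score.

    `patches` maps a patch id to {residue id: electrostatic score}.
    Residues with negative scores are excluded from each patch before
    scoring (assumed scoring rule; the source does not specify it).
    """
    scored = []
    for patch_id, residue_scores in patches.items():
        kept = {res: s for res, s in residue_scores.items() if s >= 0}
        scored.append((sum(kept.values()), patch_id, sorted(kept)))
    scored.sort(reverse=True)  # largest positive score first
    n_keep = max(1, math.ceil(len(scored) * top_fraction))
    return scored[:n_keep]


# Hypothetical patches and scores, for illustration only.
toy_patches = {
    "patch_a": {"ASP12": -2.0, "GLY13": 0.5},
    "patch_b": {"ARG45": 3.1, "LYS46": 2.4, "GLU47": -1.0},
    "patch_c": {"SER80": 0.2, "THR81": 0.1},
}
top_patches = rank_patches(toy_patches)
print(top_patches)
```

In this toy case the arginine- and lysine-rich patch ranks first, which matches the intuition that DNA-binding surfaces tend to be positively charged.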

9.
We describe a new method for identifying the sequences that signal the start of translation, and the boundaries between exons and introns (donor and acceptor sites) in human mRNA. According to the mandatory keyword, ORGANISM, and feature key, CDS, a large set of standard data for each signal site was extracted from the ASCII flat file, gbpri.seq, in the GenBank release 108.0. This was used to generate the scoring matrices, which summarize the sequence information for each signal site. The scoring matrices take into account the independent nucleotide frequencies between adjacent bases in each position within the signal site regions, and the relative weight on each nucleotide in proportion to their probabilities in the known signal sites. Using a scoring scheme that is based on the nucleotide scoring matrices, the method has great sensitivity and specificity when used to locate signals in uncharacterized human genomic DNA. These matrices are especially effective at distinguishing true and false sites.
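
As a rough illustration of how a nucleotide scoring matrix can be built from known signal sites and used to scan a sequence, here is a generic log-odds position weight matrix. It is a standard textbook construction, not the authors' exact scoring scheme; the uniform background, the pseudocount, the function names, and the toy sites are assumptions made for the sketch.

```python
import math


def build_log_odds_matrix(sites, background=0.25, pseudocount=1.0):
    """Build a per-position log2-odds matrix from equal-length aligned sites."""
    matrix = []
    for pos in range(len(sites[0])):
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append({b: math.log2((c / total) / background) for b, c in counts.items()})
    return matrix


def score(matrix, window):
    """Sum the per-position scores of a window as wide as the matrix."""
    return sum(column[base] for column, base in zip(matrix, window))


def best_window(matrix, sequence):
    """Return the start index of the highest-scoring window in `sequence`."""
    width = len(matrix)
    starts = range(len(sequence) - width + 1)
    return max(starts, key=lambda i: score(matrix, sequence[i:i + width]))


# Hypothetical aligned sites around a start codon (illustration only).
toy_sites = ["ACCATGG", "GCCATGG", "ACCATGA", "ACAATGG"]
pwm = build_log_odds_matrix(toy_sites)
best = best_window(pwm, "TTTTACCATGGTTTT")
print(best, round(score(pwm, "ACCATGG"), 2))
```

A real application would estimate the background frequencies from genomic data and choose a score threshold to separate true from false sites.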

10.
11.

Background  

The triosephosphate isomerase (TIM)-barrel fold occurs frequently in the proteomes of different organisms, and the known TIM-barrel proteins have been found to play diverse functional roles. To accelerate the exploration of the sequence-structure protein landscape in the TIM-barrel fold, a computational tool that allows sensitive detection of TIM-barrel proteins is required.

12.
This paper describes an approach for preparing a unimolecular double-stranded DNA (uni-dsDNA) microarray chip. In this method, the various target oligonucleotides, each containing a reverse complementary sequence at the 5' end, were first annealed to the same universal oligonucleotide, which carries an amino group at its 5' end and is immobilized on an aldehyde-derivatized glass slide. An on-chip DNA polymerization reaction was then performed to elongate the universal oligonucleotides. After denaturation and subsequent intra-strand annealing, a hairpin structure was formed at the free 3' end of the immobilized oligonucleotides. Finally, another on-chip DNA polymerization was performed to synthesize the uni-dsDNA microarray. Combined with PCR amplification of chemically synthesized target oligonucleotides, this method was highly cost-effective for production of the uni-dsDNA microarray. The uni-dsDNA microarray was verified to be applicable for detecting the presence and monitoring the DNA-binding activity of sequence-specific DNA-binding proteins.

13.
14.
15.
16.
Xiong Y, Liu J, Wei DQ. Proteins, 2011, 79(2): 509-517
Proteins that interact with DNA play vital roles in all mechanisms of gene expression and regulation. In order to understand these activities, it is crucial to analyze and identify DNA-binding residues on DNA-binding protein surfaces. Here, we propose two novel features, B-factor and packing density, in combination with several conventional features to characterize the DNA-binding residues in a well-constructed representative dataset of 119 protein-DNA complexes from the Protein Data Bank (PDB). Based on the selected features, a prediction model for DNA-binding residues was constructed using a support vector machine (SVM). The predictor was evaluated using 5-fold cross-validation on the above dataset of 123 DNA-binding proteins. Moreover, two independent datasets of 83 DNA-bound protein structures and their corresponding DNA-free forms were compiled. The B-factor and packing density features were statistically analyzed on these 83 pairs of holo-apo protein structures. Finally, we developed the SVM model to accurately predict DNA-binding residues on the protein surface, given the DNA-free structure of a protein. The results shown here indicate that our method represents a significant improvement over previously existing approaches such as DISPLAR. This observation suggests that our method will be useful in studying protein-DNA interactions to guide subsequent work such as site-directed mutagenesis and protein-DNA docking.

17.
An examination of the binding sites of four carbohydrate binding proteins (Escherichia coli lactose repressor, E. coli arabinose-binding protein, yeast hexokinase A and Concanavalin A) revealed certain similarities of amino acid sequences and residues forming hydrogen bonds and hydrophobic interactions with the bound carbohydrate. These were: (i) Asx-Asx, hydrogen bonding to the pyranose ring oxygen and anomeric-OH group; (ii) Arg-X-X-X-(Ser/Thr), or the reverse sequence, with the Arg hydrogen bonding to the pyranose ring oxygen; (iii) Lys-(Ser/Thr)-X-X-Asp, or the reverse sequence and with interchange of the Lys-(Ser/Thr) positions, with hydrogen bonding of either or both the Lys and Asp residues to the -OH groups at carbons 2, 3, 4 or 6; (iv) a diaromatic sequence with possible hydrophobic interactions to the faces of the pyranose ring structure. An algorithm was devised to search the amino acid sequences of a large number of proteins, those known to bind carbohydrates as well as those without known carbohydrate-binding activities, for the four amino acid sequence criteria. The algorithm incorporated a weighted distance value (WDV) to assess the approximate distance between any two criteria, with the WDV being based on the predicted secondary structure of the protein amino acid sequence. When the algorithm using criteria 1 and 2 plus the WDV was applied to the sequences of 125 proteins, the method indicated the presence of the potential carbohydrate-binding site motif for 42% of proteins with known carbohydrate binding, only 8% of proteins were predicted as false positives, and the accuracy of the method was calculated to be 61.6%.(ABSTRACT TRUNCATED AT 250 WORDS)
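
Criteria (i) and (ii) above translate naturally into pattern searches over one-letter amino acid codes. The sketch below is an illustration only: it reads Asx as Asp or Asn (D/N), uses lookahead so that overlapping matches are reported, and does not implement the weighted distance value (WDV), whose details are not given in the abstract. The motif names, the function `find_motifs`, and the test sequence are hypothetical.

```python
import re

# Criterion (i): Asx-Asx; criterion (ii): Arg-X-X-X-(Ser/Thr) or its reverse.
MOTIFS = {
    "asx_asx": re.compile(r"(?=([DN][DN]))"),
    "arg_x3_ser_thr": re.compile(r"(?=(R.{3}[ST]|[ST].{3}R))"),
}


def find_motifs(sequence):
    """Return (motif name, 0-based start, matched residues), sorted by start."""
    hits = []
    for name, pattern in MOTIFS.items():
        for match in pattern.finditer(sequence):
            hits.append((name, match.start(), match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical sequence, for illustration only.
example_hits = find_motifs("MKRAGLSDNPTAAAR")
for hit in example_hits:
    print(hit)
```

A fuller version would follow the abstract in combining the hits for two criteria with a distance estimate based on predicted secondary structure.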

18.

Background  

Understanding the molecular details of protein-DNA interactions is critical for deciphering the mechanisms of gene regulation. We present a machine learning approach for the identification of amino acid residues involved in protein-DNA interactions.

19.
We describe a technique for the rapid and efficient isolation and purification of proteins binding to defined DNA sequences. Cloned double-stranded DNA was covalently coupled to m-aminobenzyloxymethylcellulose in order to purify proteins which recognize and bind to specific sequences on the DNA. The purification of two DNA-binding proteins from Drosophila melanogaster is demonstrated using the respective cloned DNA sequences.

20.
A structure-based method for protein sequence alignment (cited 1 time in total; self-citations: 0; cited by others: 1)
MOTIVATION: With the continuing rapid growth of protein sequence data, protein sequence comparison methods have become the most widely used tools of bioinformatics. Among these methods are those that use position-specific scoring matrices (PSSMs) to describe protein families. PSSMs can capture information about conserved patterns within families, which can be used to increase the sensitivity of searches for related sequences. Certain types of structural information, however, are not generally captured by PSSM search methods. Here we introduce a program, Structure-based ALignment TOol (SALTO), that aligns protein query sequences to PSSMs using rules for placing and scoring gaps that are consistent with the conserved regions of domain alignments from NCBI's Conserved Domain Database. RESULTS: In most cases, the alignment scores obtained using the local alignment version follow an extreme value distribution. SALTO's performance in finding related sequences and producing accurate alignments is similar to or better than that of IMPALA; one advantage of SALTO is that it imposes an explicit gapping model on each protein family. AVAILABILITY: A stand-alone version of the program that can generate global or local alignments is available by ftp distribution (ftp://ftp.ncbi.nih.gov/pub/SALTO/), and has been incorporated into the Cn3D structure/alignment viewer. CONTACT: bryant@ncbi.nlm.nih.gov.
