Similar articles
 20 similar articles found (search time: 15 ms)
1.
In this work, we analyse the potential for using structural knowledge to improve the detection of the DNA-binding helix-turn-helix (HTH) motif from sequence. Starting from a set of DNA-binding protein structures that include a functional HTH motif and have no apparent sequence similarity to each other, two different libraries of hidden Markov models (HMMs) were built. One library included sequence models of whole DNA-binding domains, which incorporate the HTH motif; the second library included shorter models of 'partial' domains, representing only the fraction of the domain that corresponds to the functionally relevant HTH motif itself. The libraries were scanned against a dataset of protein sequences, some containing HTH motifs, others not. HMM predictions were compared with the results obtained from a previously published structure-based method and subsequently combined with it. The combined method proved more effective than either of the single-featured approaches, showing that the information carried by motif sequences and by motif structures is to some extent complementary and can successfully be used together for the detection of DNA-binding HTHs in proteins of unknown function.

2.
We use methods from Data Mining and Knowledge Discovery to design an algorithm for detecting motifs in protein sequences. The algorithm assumes that a motif is constituted by the presence of a "good" combination of residues in appropriate locations of the motif. The algorithm attempts to compile such good combinations into a "pattern dictionary" by processing an aligned training set of protein sequences. The dictionary is subsequently used to detect motifs in new protein sequences. Statistical significance of the detection results is ensured by statistically determining the various parameters of the algorithm. Based on this approach, we have implemented a program called GYM. The Helix-Turn-Helix motif was used as a model system on which to test our program. The program was also extended to detect Homeodomain motifs. The detection results for the two motifs compare favorably with existing programs. In addition, the GYM program provides a lot of useful information about a given protein sequence.

3.
A method for discerning protein structures containing the DNA-binding helix-turn-helix (HTH) motif has been developed. The method uses statistical models based on geometrical measurements of the motif. With a decision tree model, key structural features required for DNA binding were identified. These include a high average solvent-accessibility of residues within the recognition helix and a conserved hydrophobic interaction between the recognition helix and the second alpha helix preceding it. The Protein Data Bank was searched using a more accurate model of the motif created using the Adaboost algorithm to identify structures that have a high probability of containing the motif, including those that had not been reported previously.

4.
The prediction of helix-turn-helix DNA-binding regions in proteins   Total citations: 5, self-citations: 0, citations by others: 5

5.
6.
7.
This paper presents a simple program for interactive searching for nucleotide sequences that may code for the helix-turn-helix, zinc finger or leucine zipper motifs in proteins. The helix-turn-helix motifs are predicted using the recently published method of Dodd and Egan, while zinc fingers and leucine zippers are searched for by our original methods. DNABIND is shown to detect all four known helix-turn-helix motifs in bacteriophage lambda genes and both zinc fingers of the adr1 gene of yeast.
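The abstract does not describe how DNABIND searches for leucine zippers, so the following is only a minimal sketch of the general idea of scanning a nucleotide sequence for a region that may code for a motif. It assumes the classic heptad-repeat pattern L-x(6)-L-x(6)-L-x(6)-L, the standard genetic code, and the three forward reading frames only; the function names are hypothetical and this is not the authors' method.

```python
import re

# Standard genetic code, built in the conventional TCAG codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Assumed heptad-repeat pattern; the lookahead lets overlapping hits be reported.
LEUCINE_ZIPPER = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")


def translate(dna, frame=0):
    """Translate one forward reading frame; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(frame, len(dna) - 2, 3)
    )


def find_leucine_zippers(dna):
    """Return (frame, protein_position, matched_peptide) for each candidate."""
    hits = []
    for frame in range(3):
        protein = translate(dna, frame)
        for match in LEUCINE_ZIPPER.finditer(protein):
            hits.append((frame, match.start(), match.group(1)))
    return hits
```

A fuller scan would also translate the reverse complement; that is left out here to keep the sketch short.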

8.
The detection of DNA-binding proteins by protein blotting.   Total citations: 105, self-citations: 19, citations by others: 105
A method, called "protein blotting," for the detection of DNA-binding proteins is described. Proteins are separated on an SDS-polyacrylamide gel. The gel is sandwiched between 2 nitrocellulose filters and the proteins allowed to diffuse out of the gel and onto the filters. The proteins are tightly bound to each filter, producing a replica of the original gel pattern. The replica is used to detect DNA-binding proteins, RNA-binding proteins or histone-binding proteins by incubation of the filter with [32P]DNA, [125I]RNA, or [125I] histone. Evidence is also presented that specific protein-DNA interactions may be detected by this technique; under appropriate conditions, the lac repressor binds only to DNA containing the lac operator. Strategies for the detection of specific protein-DNA interactions are discussed.

9.

Background  

Automatic extraction of motifs from biological sequences is an important research problem in the study of molecular biology. For proteins, it is desirable to discover sequence motifs containing a large number of wildcard symbols, as the residues associated with functional sites are usually widely separated in sequences. Discovering such patterns is time-consuming because abundant combinations exist when long gaps (a gap consists of one or more successive wildcards) are considered. Mining algorithms often employ constraints to narrow down the search space in order to increase efficiency. However, improper constraint models might degrade the sensitivity and specificity of the motifs discovered by computational methods. We previously proposed a new constraint model to handle large wildcard regions for discovering functional motifs of proteins. The patterns that satisfy the proposed constraint model are called W-patterns. A W-pattern is a structured motif that groups motif symbols into pattern blocks interleaved with large irregular gaps. Considering large gaps reflects the fact that functional residues are not always from a single region of protein sequences, and restricting motif symbols into clusters corresponds to the observation that short motifs are frequently present within protein families. To efficiently discover W-patterns for large-scale sequence annotation and function prediction, this paper first formally introduces the problem to solve and proposes an algorithm named WildSpan (sequential pattern mining across large wildcard regions) that incorporates several pruning strategies to substantially reduce the mining cost.
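To make the notion of a structured motif concrete, the sketch below matches an already known pattern made of short blocks separated by bounded wildcard gaps. It does not implement WildSpan or its pruning strategies, which the abstract does not detail; the representation (blocks as regular-expression fragments, gaps as inclusive length ranges), the function name and the example pattern are assumptions made for illustration.

```python
import re


def compile_structured_motif(blocks, gaps):
    """Build a regex from pattern blocks interleaved with wildcard gaps.

    blocks: regex fragments such as "C..C", where "." is a one-residue wildcard.
    gaps:   one (min_len, max_len) pair for each space between adjacent blocks.
    """
    if len(gaps) != len(blocks) - 1:
        raise ValueError("need exactly one gap between each pair of blocks")
    parts = [blocks[0]]
    for (lo, hi), block in zip(gaps, blocks[1:]):
        # Lazy quantifier: prefer the shortest gap that still lets the block match.
        parts.append(".{%d,%d}?" % (lo, hi))
        parts.append(block)
    return re.compile("".join(parts))


# Hypothetical two-block pattern with a gap of 5 to 30 residues.
pattern = compile_structured_motif(["C..C", "H...H"], [(5, 30)])
sequence = "MK" + "CAAC" + "G" * 12 + "HAAAH" + "K"
match = pattern.search(sequence)
if match:
    print(match.start(), match.end(), match.group(0))
```

Matching a given pattern is the easy half of the problem; the cost the abstract refers to comes from enumerating candidate block and gap combinations during mining.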

10.
In modern biology, there is a critical need to develop a high-throughput and inexpensive platform for DNA sequencing. Pyrosequencing is a nonelectrophoretic single-tube DNA sequencing method that takes advantage of cooperativity between four enzymes to monitor DNA synthesis. In these studies, single-stranded DNA-binding protein (SSB) was added to the primed DNA template prior to the Pyrosequencing reaction. The addition of SSB to a Pyrosequencing reaction system resulted in a read length of more than 30 nucleotides. Improvements were observed as: (i) increased efficiency of the enzymes, (ii) reduced mispriming, as measured by nonspecific signals, (iii) an increase in signal intensity during the reaction, (iv) higher accuracy in reading the number of identical adjacent nucleotides in difficult templates, and (v) longer reads. The usefulness of these results for future Pyrosequencing applications is discussed.

11.
Inspection of the structure of the C-terminal domain of ribosomal protein L7/L12 (1) reveals a helix-turn-helix motif similar to the one found in many DNA-binding regulatory proteins (2-5). The 19 alpha-carbon atoms of the L7/L12 alpha-helices superimpose on the DNA binding helices of CAP and cro with root-mean-square distances between corresponding alpha carbons of 1.45 and 1.55 Å, respectively. These helices in L7/L12 are within a patch of highly conserved residues on the surface of L7/L12 whose role is as yet uncertain. We raise the possibility that they may constitute a binding site for nucleic acids, most probably RNA. Consistent with this hypothesis are calculations of the electrostatic charge potential surrounding the protein, which show a region of positive potential centered on the first of these helices.

12.
13.
SUMMARY: HELM is a web tool designed to automate the search for alpha-helix motifs in protein sequences. This analysis can be useful in protein engineering studies aimed at identifying regions to be modified in order to obtain more suitable features of local and/or global stability. AVAILABILITY: The tool is available to academic and commercial institutions at the URL http://crisceb.area.na.cnr.it/angelo/PROTEIN_TOOLS/HELM/ CONTACT: angelo@crisceb.area.na.cnr.it

14.

Background

False occurrences of functional motifs in protein sequences can be considered as random events due solely to the sequence composition of a proteome. Here we use a numerical approach to investigate the random appearance of functional motifs with the aim of addressing biological questions such as: How are organisms protected from undesirable occurrences of motifs otherwise selected for their functionality? Has the random appearance of functional motifs in protein sequences been affected during evolution?

Results

Here we analyse the occurrence of functional motifs in random sequences and compare it to that observed in biological proteomes; the behaviour of random motifs is also studied. Most motifs exhibit a number of false positives significantly similar to the number of times they appear in randomized proteomes (=expected number of false positives). Interestingly, about 3% of the analysed motifs show a different kind of behaviour and appear in biological proteomes less than they do in random sequences. In some of these cases, a mechanism of evolutionary negative selection is apparent; this helps to prevent unwanted functionalities which could interfere with cellular mechanisms.

Conclusion

Our thorough statistical and biological analysis showed that several mechanisms and evolutionary constraints affect the appearance of functional motifs in protein sequences.
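The comparison between observed and randomized proteomes described in this abstract can be sketched as follows. The abstract does not state how the randomized proteomes were generated, so shuffling each sequence independently (which preserves its amino acid composition) is an assumption, as are the function names. The example motif is the PROSITE-style N-glycosylation pattern N-{P}-[ST]-{P}, chosen only for illustration.

```python
import random
import re


def count_matches(pattern, sequences):
    """Total number of non-overlapping motif matches across all sequences."""
    return sum(len(pattern.findall(seq)) for seq in sequences)


def shuffled(seq, rng):
    """Return a random permutation of seq, keeping its residue composition."""
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


def expected_random_count(pattern, sequences, n_shuffles=100, seed=0):
    """Mean match count over n_shuffles independently shuffled 'proteomes'."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_shuffles):
        total += count_matches(pattern, [shuffled(s, rng) for s in sequences])
    return total / n_shuffles


# Assumed example motif: N, then any residue except P, then S or T, then not P.
motif = re.compile(r"N[^P][ST][^P]")
proteome = ["MKNASAQLNGTVPLLNPSA", "GGNQTRANLSPNKTW"]  # toy sequences, not real data
observed = count_matches(motif, proteome)
expected = expected_random_count(motif, proteome)
print(observed, expected)
```

A motif whose observed count falls well below the shuffled expectation would be a candidate for the negative selection the abstract mentions, although a real analysis would need a significance test and a proteome-scale dataset.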

15.
A sensitive technique for protein sequence motif recognition based on neural networks has been developed. It involves three major steps. (1) At each appropriate alignment position of a set of N matched sequences, a set of N aligned oligopeptides is specified with preselected window length. N neural nets are subsequently and successively trained on N-1 amino acid spans after eliminating each ith oligopeptide. A test for recognition of each of the ith spans is performed. The average neural net recognition over N such trials is used as a measure of conservation for the particular windowed region of the multiple alignment. This process is repeated for all possible spans of given length in the multiple alignment. (2) The M most conserved regions are regarded as motifs and the oligopeptides within each are used to intensively train M individual neural networks. (3) The M networks are then applied in a search for related primary structures in a databank of known protein sequences. The oligopeptide spans in the database sequence with strongest neural net output for each of the M networks are saved and then scored according to the output signals and the proper combination that follows the expected N- to C-terminal sequence order. The motifs from the database with highest similarity scores can then be used to retrain the M neural nets, which can be subsequently utilized for further searches in the databank, thus providing even greater sensitivity to recognize distant familial proteins. This technique was successfully applied to the integrase, DNA-polymerase and immunoglobulin families.

16.
17.
18.
19.
The distribution of RNA motifs in natural sequences.   Total citations: 2, self-citations: 3, citations by others: 2
Functional analysis of genome sequences has largely ignored RNA genes and their structures. We introduce here the notion of 'ribonomics' to describe the search for the distribution of and eventually the determination of the physiological roles of these RNA structures found in the sequence databases. The utility of this approach is illustrated here by the identification in the GenBank database of RNA motifs having known binding or chemical activity. The frequency of these motifs indicates that most have originated from evolutionary drift and are selectively neutral. On the other hand, their distribution among species and their location within genes suggest that the destiny of these motifs may be more elaborate. For example, the hammerhead motif has a skewed organismal presence, is phylogenetically stable and recent work on a schistosome version confirms its in vivo biological activity. The under-representation of the valine-binding motif and the Rev-binding element in GenBank hints at a detrimental effect on cell growth or viability. Data on the presence and the location of these motifs may provide critical guidance in the design of experiments directed towards the understanding and the manipulation of RNA complexes and activities in vivo.

20.
Mondol T  Batabyal S  Mazumder A  Roy S  Pal SK 《FEBS letters》2012,586(3):258-262
λ-Repressor-operator sites interaction, particularly O(R)1 and O(R)2, is a key component of the λ-genetic switch. FRET from the dansyl bound to the C-terminal domain of the protein, to the intercalated EtBr in the operator DNA indicates that the structure of the protein is more compact in the O(R)2 complex than in the O(R)1 complex. Fluorescence anisotropy reveals enhanced flexibility of the C-terminal domain of the repressor at fast timescales after complex formation with O(R)1. In contrast, O(R)2 bound repressor shows no significant enhancement of protein dynamics at these timescales. These differences are shown to be important for correct protein-protein interactions. Altered protein dynamics upon specific DNA sequence recognition may play important roles in assembly of regulatory proteins at the correct positions.
