Similar articles
 20 similar articles found (search time: 15 ms)
1.
Identification of protein sequence homology by consensus template alignment   Cited by: 26 (self-citations: 0, citations by others: 26)
A pattern-matching procedure is described, based on fitting templates to the sequence, which allows general structural constraints to be imposed on the patterns identified. The templates correspond to structurally conserved regions of the sequence and were initially derived from a small number of related sequences whose tertiary structures are known. The templates were then made more representative by aligning other sequences of unknown structure. Two alignments were built up containing 100 immunoglobulin variable domain sequences and 85 constant domain sequences, respectively. From each of these extended alignments, templates were generated to represent features conserved in all the sequences. These consisted mainly of patterns of hydrophobicity associated with beta-structure. For structurally conserved beta-strands with no conserved features, templates based on general secondary structure prediction principles were used to identify their possible locations. The specificity of the templates was demonstrated by their ability to identify the conserved features in known immunoglobulin and immunoglobulin-related sequences but not in other non-immunoglobulin sequences.
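The idea of fitting a hydrophobicity template to a sequence can be illustrated with a small Python sketch. It is not the authors' procedure: the residue set in `HYDROPHOBIC`, the template alphabet (`h` = hydrophobic, `p` = non-hydrophobic, `x` = unconstrained), the function name `template_hits`, and the example sequence are all assumptions made for illustration.

```python
# Residues treated as hydrophobic. This set is an assumption for the sketch,
# not the scale used in the abstract above.
HYDROPHOBIC = set("AVLIMFWYC")


def template_hits(sequence, template, min_score):
    """Slide `template` along `sequence` and return (start, score) pairs.

    Template symbols (hypothetical alphabet):
      'h' = position must be hydrophobic
      'p' = position must be non-hydrophobic
      'x' = no constraint (never scores)
    Only windows whose score is at least `min_score` are reported.
    """
    hits = []
    width = len(template)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = 0
        for residue, symbol in zip(window, template):
            is_hydrophobic = residue in HYDROPHOBIC
            if symbol == "h" and is_hydrophobic:
                score += 1
            elif symbol == "p" and not is_hydrophobic:
                score += 1
        if score >= min_score:
            hits.append((start, score))
    return hits


# Alternating pattern of the kind often associated with beta-strands.
print(template_hits("GSVKVSCKAS", "hphph", 5))
```

Lowering `min_score` admits partial matches, which is one simple way to trade specificity for sensitivity when a strand carries few conserved features.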

2.
A new method is presented for identifying distantly related homologous proteins that are unrecognizable by conventional sequence comparison methods. The method combines information about functionally conserved sequence patterns with information about structure context. This information is encoded in stochastic discrete state-space models (DSMs) that comprise a new family of hidden Markov models. The new models are called sequence-pattern-embedded DSMs (pDSMs). This method can identify distantly related protein family members with a high sensitivity and specificity. The method is illustrated with trypsin-like serine proteases and globins. The strategy for building pDSMs is presented. The method has been validated using carefully constructed positive and negative control sets. In addition to the ability to recognize remote homologs, pDSM sequence analysis predicts secondary structures with higher sensitivity, specificity, and Q3 accuracy than DSM analysis, which omits information about conserved sequence patterns. The identification of trypsin-like serine proteases in new genomes is discussed.

3.
Mammalian and fungal Diaphanous-related formin homology (DRF) proteins contain several regions of conserved sequence homology. These include an amino-terminal GTPase binding domain (GBD) that interacts with activated Rho family members and formin homology domains that mediate targeting or interactions with signaling kinases and actin-binding proteins. DRFs also contain a conserved Dia-autoregulatory domain (DAD) in their carboxyl termini that binds the GBD. The GBD is a bifunctional autoinhibitory domain that is regulated by activated Rho. Expression of the isolated DAD in cells causes actin fiber formation and stimulates serum response factor-regulated gene expression. Inhibitor experiments show that the effects of exogenous DAD expression are dependent upon cellular Dia proteins. Alanine substitutions of DAD consensus residues that disrupt GBD binding also eliminate DAD biological activity. Thus, DAD expression activates nuclear signaling and actin remodeling by mimicking activated Rho and unlatching the autoinhibited state of the cellular complement of Dia proteins.

4.
5.
In a similar manner to sequence database searching, it is also possible to compare three-dimensional protein structures. Such methods can be extremely useful because a structural similarity may represent a distant evolutionary relationship that is undetectable by sequence analysis. In this review, we summarise the most popular structure comparison methods, show how they can be used for database searching, and then describe some of the most advanced attempts to develop comprehensive protein structure classifications. With such data, it is possible to identify distant evolutionary relationships, provide libraries of unique folds for structure prediction, estimate the total number of folds that exist, and investigate the preference for certain types of structures over others. BioEssays 20:884–891, 1998. © 1998 John Wiley & Sons, Inc.

6.
In this paper, we present a new scheme named ProtClass for automatic classification of three-dimensional (3D) protein structures. It is a dedicated and unified multiclass classification scheme. Neither detailed structural alignment nor multiple binary classifications are required in this scheme. We adopt a nearest neighbor-based classification strategy. We use a filter-and-refine scheme. In the first step, we filter out the improbable answers using the precalculated parameters from the training data. In the second, we perform a relatively more detailed nearest neighbor search on the remaining answers. We use very concise and effective encoding schemes of the 3D protein structures in both steps. We compare our proposed method against two other dedicated protein structure classification schemes, namely SGM and CPMine. The experimental results show that ProtClass is slightly better in accuracy than SGM and much faster. In comparison with CPMine, ProtClass is much more accurate, while their running times are about the same. We also compare ProtClass against a structural alignment-based classification scheme named DALI, which is found to be more accurate, but extremely slow. The software is available upon request from the authors. The supplementary information on ProtClass method can be found at: http://xena1.ddns.comp.nus.edu.sg/~genesis/PClass.htm.
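The general shape of a filter-and-refine nearest-neighbor classifier can be sketched in Python as follows. This is not ProtClass itself: the abstract does not specify its encodings or filter parameters, so the `Example` record, the `coarse` and `fine` feature vectors, the Euclidean distance, and the `keep` cutoff are all hypothetical stand-ins.

```python
import math
from collections import namedtuple

# Hypothetical training record: a class label plus two encodings of one
# structure, a cheap coarse vector and a more detailed fine vector.
Example = namedtuple("Example", ["label", "coarse", "fine"])


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(query_coarse, query_fine, training, keep=3):
    """Return the label of the nearest neighbor after a coarse filter.

    Filter step: rank all training examples by the cheap coarse distance
    and keep only the `keep` closest candidates.
    Refine step: run a nearest-neighbor search on the detailed encoding,
    restricted to the surviving candidates.
    """
    candidates = sorted(
        training, key=lambda ex: euclidean(query_coarse, ex.coarse)
    )[:keep]
    best = min(candidates, key=lambda ex: euclidean(query_fine, ex.fine))
    return best.label


# Toy data, invented for illustration only.
training = [
    Example("alpha", (1.0, 0.0), (1.0, 0.0, 0.5, 0.2)),
    Example("beta", (0.0, 1.0), (0.0, 1.0, 0.4, 0.9)),
    Example("alpha/beta", (0.5, 0.5), (0.5, 0.5, 0.1, 0.1)),
    Example("beta", (0.1, 0.9), (0.1, 0.8, 0.5, 0.8)),
]
print(classify((0.05, 0.95), (0.1, 0.9, 0.45, 0.85), training, keep=2))
```

The filter keeps the expensive comparison off most of the training set, which is the usual reason such schemes run faster than alignment-based classification.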

7.
A physical map of 330 × 10^3 base pairs near the replication origin of the Myxococcus xanthus chromosome has already been established. Using DNA fragments from this region, Northern blot hybridization analysis was carried out in order to identify the genes expressed during vegetative growth. One of the genes, tentatively designated as vegA, was cloned and its entire DNA sequence was determined. The amino acid sequence of the gene product deduced from the DNA sequence reveals that the VegA protein is a very basic protein with a molecular weight of 18,700. The gene was expressed in Escherichia coli using an expression vector, and its gene product was identified using SDS/polyacrylamide gel electrophoresis. From the results of S1 nuclease mapping, the vegA promoter was found to contain the sequence TAGACA at the -35 region and the sequence AAGGGT at the -10 region. These two regions are separated by 18 nucleotides. Genetic analysis suggests that the vegA gene may be essential for the growth of M. xanthus. From a computer-aided search for homologies to known protein structures, it was found that the VegA protein has homologies to histone H4 of Tetrahymena thermophila and histone H2B of sea urchin.
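The promoter layout reported in this abstract (TAGACA at the -35 region, AAGGGT at the -10 region, 18 nucleotides between them) can be written as a simple pattern search. The Python sketch below is illustrative only: the function name `find_promoter_like_sites` is hypothetical, and the example sequence is synthetic rather than taken from M. xanthus.

```python
import re

# -35 hexamer, an 18-nucleotide spacer of any bases, then the -10 hexamer,
# following the layout described in the abstract.
PROMOTER_PATTERN = re.compile(r"TAGACA[ACGT]{18}AAGGGT")


def find_promoter_like_sites(dna):
    """Return the 0-based start positions of all non-overlapping matches."""
    return [match.start() for match in PROMOTER_PATTERN.finditer(dna.upper())]


# Synthetic test sequence, invented for illustration.
spacer = "ACGTACGTACGTACGTAC"  # 18 nucleotides
synthetic = "CC" + "TAGACA" + spacer + "AAGGGT" + "GG"
print(find_promoter_like_sites(synthetic))
```

A real promoter scan would normally tolerate mismatches in the hexamers and some variation in spacer length; the exact-match pattern here only mirrors the single promoter described above.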

8.
For over 2 decades, continuous efforts to organize the jungle of available protein structures have been underway. Although a number of discrepancies between different classification approaches for soluble proteins have been reported, the classification of membrane proteins has so far not been comparatively studied because of the limited amount of available structural data. Here, we present an analysis of α‐helical membrane protein classification in the SCOP and CATH databases. In the current set of 63 α‐helical membrane protein chains having between 1 and 13 transmembrane helices, we observed a number of differently classified proteins both regarding their domain and fold assignment. The majority of all discrepancies affect single transmembrane helix, two helix hairpin, and four helix bundle domains, while domains with more than five helices are mostly classified consistently between SCOP and CATH. It thus appears that the structural constraints imposed by the lipid bilayer complicate the classification of membrane proteins with only few membrane‐spanning regions. This problem seems to be specific for membrane proteins as soluble four helix bundles, not restrained by the membrane, are more consistently classified by SCOP and CATH. Our findings indicate that the structural space of small membrane helix bundles is highly continuous such that even minor differences in individual classification procedures may lead to a significantly different classification. Membrane proteins with few helices and limited structural diversity only seem to be reasonably classifiable if the definition of a fold is adapted to include more fine‐grained structural features such as helix–helix interactions and reentrant regions. Proteins 2010. © 2010 Wiley‐Liss, Inc.

9.
10.
A systematic classification of beta-hairpin structures which takes into account the polypeptide chain length and hydrogen bonding between the two antiparallel beta-strands is described. We have used this classification of beta-hairpin structures and their specific sequence patterns to derive rules which demonstrate its usefulness in assisting the modelling of beta-hairpins. These rules can be applied to comparative model building, modelling into electron density and the prediction of the conformation of beta-hairpins to aid protein engineering.

11.
12.
Approaching a complete classification of protein secondary structure   Cited by: 2 (self-citations: 0, citations by others: 2)
A complete classification of protein secondary structure types is developed on the basis of computer analysis of the crystallographic structural data deposited in the Protein Data Bank. The majority of amino acid residues fall into five conformation types. It is concluded that the number of sequence variants of the torsion angles phi and psi in globular proteins is limited and is substantially smaller than the number of possible amino acid sequences for the same chain length. The distribution analysis, which assigns every maximum of the distribution of amino acid conformations on the Ramachandran map to a type of secondary structure, revealed, alongside alpha-helix and beta-structure, a third type of secondary structure that was previously neglected. This type is an extended left-handed helical conformation, designated the mobile (M-) conformation. A full set of M-conformation fragments, which seem to play a major role in protein globule dynamics, has been obtained, and a small radius of correlation for the polypeptide chain in the M-conformation is demonstrated. This explains the prevalence of short segments of mobile conformation in globular proteins. The frequency of occurrence of amino acid residues has been computed for each secondary structure type.

13.

Background  

The classification of protein domains in the CATH resource is primarily based on structural comparisons, sequence similarity and manual analysis. One of the main bottlenecks in the processing of new entries is the evaluation of 'borderline' cases by human curators with reference to the literature, and better tools for helping both expert and non-expert users quickly identify relevant functional information from text are urgently needed. A text based method for protein classification is presented, which complements the existing sequence and structure-based approaches, especially in cases exhibiting low similarity to existing members and requiring manual intervention. The method is based on the assumption that textual similarity between sets of documents relating to proteins reflects biological function similarities and can be exploited to make classification decisions.
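The assumption that textual similarity can drive a classification decision can be illustrated with a bag-of-words cosine similarity. The Python sketch below is not the method from this abstract, whose details are not given here: the tokenizer, the use of raw term counts, the function names, and the representation of each family by a single text (rather than a set of documents) are all simplifying assumptions.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lower-case the text and count alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the term-count vectors of two texts."""
    a, b = term_counts(text_a), term_counts(text_b)
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def assign_family(query_text, family_texts):
    """Return the family whose text is most similar to the query text.

    `family_texts` maps a (hypothetical) family name to one text that
    stands in for the documents describing that family.
    """
    return max(
        family_texts,
        key=lambda name: cosine_similarity(query_text, family_texts[name]),
    )


# Toy texts, invented for illustration only.
families = {
    "globin": "heme oxygen transport protein",
    "protease": "serine protease cleaves peptide bonds",
}
print(assign_family("oxygen binding heme protein", families))
```

In practice, term weighting such as TF-IDF and stop-word removal usually matter a great deal for this kind of comparison; raw counts are used here only to keep the sketch short.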

14.
15.
A M Phillips, A Bull, L E Kelly. Neuron, 1992, 8(4): 631-642
We have isolated a number of Drosophila cDNAs on the basis of their encoding calmodulin-binding proteins. A full-length cDNA clone corresponding to one of these genes has been cloned and sequenced. Conservation of amino acid sequence and tissue-specific expression are observed between this gene and the transient receptor potential (trp) gene. We propose the name transient receptor potential-like (trpl) to describe this newly isolated gene. The trpl protein contains two possible calmodulin-binding sites, six transmembrane regions, and a sequence homologous to an ankyrin-like repeat. Structurally, the trpl and trp proteins resemble cation channel proteins, particularly the brain isoform of the voltage-sensitive Ca2+ channel. The identification of a protein similar to the trp gene product, yet also able to bind Ca2+/calmodulin, allows for a reinterpretation of the phenotype of the trp mutations and suggests that both genes may encode light-sensitive ion channels.

16.
17.

Background  

Classification of newly resolved protein structures is important in understanding their architectural, evolutionary and functional relatedness to known protein structures. Among various efforts to improve the database of Structural Classification of Proteins (SCOP), automation has received particular attention. Herein, we predict the deepest SCOP structural level that an unclassified protein shares with classified proteins with an equal number of secondary structure elements (SSEs).

18.
19.
20.

Background  

Protein structure classification plays a central role in understanding the function of a protein molecule with respect to all known proteins in a structure database. With the rapid increase in the number of new protein structures, the need for automated and accurate methods for protein classification is increasingly important.
