Similar articles
Found 20 similar articles (search time: 46 ms)
1.
Standley DM, Toh H, Nakamura H. Proteins. 2004, 57(2): 381-391
A new algorithm for superimposing protein structures based on maximizing the number of spatially equivalent residues is introduced. The algorithm works in three distinct steps. First, the optimal residue map is calculated by structural alignment. By default, the double dynamic programming algorithm, as implemented in the program ASH, was used for the structure alignment step, but we also present results based on alignments imported from three other programs (Dali, CE, and VAST). Second, the structures are spatially superimposed such that the effective number of equivalent residues (NER), i.e., aligned residue pairs that can be spatially overlapped, is maximized. The NER score is an analytic, differentiable similarity function that rewards spatially equivalent residues but ignores non-equivalent ones. Maximization of the NER score results in accurate superpositions in cases where root mean square deviation (RMSD) minimization fails. Third, the NER function is used in conjunction with traditional dynamic programming to realign the structures based on the proximity of residues in the superposition. Results are presented for a wide range of superposition problems and compared to results from Dali, CE, and VAST. In addition, several structure-structure pairs that show only partial similarity are discussed, and results are compared to those from the LGA, SARF2, and ThreeCa programs.
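The contrast drawn above between RMSD and a score that rewards only spatially equivalent residues can be illustrated with a small sketch. The abstract does not give the functional form of the NER score, so the Gaussian weighting, the `d0` scale, and the distance values below are all assumptions chosen for illustration, not the published function.

```python
import math

def rmsd(distances):
    """Root mean square deviation over all aligned residue pairs."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))

def smooth_equivalent_count(distances, d0=3.0):
    """Differentiable count of overlapped pairs (assumed Gaussian form).

    Each pair contributes exp(-(d/d0)^2): close to 1 when the residues
    overlap, close to 0 when they are far apart.
    """
    return sum(math.exp(-(d / d0) ** 2) for d in distances)

# Hypothetical pair distances after a superposition: eight well-overlapped
# pairs plus two outliers from a displaced loop.
distances = [0.5, 0.6, 0.4, 0.7, 0.5, 0.6, 0.8, 0.5, 15.0, 18.0]
core = distances[:8]

print(rmsd(core), rmsd(distances))
print(smooth_equivalent_count(core), smooth_equivalent_count(distances))
```

In this toy case the two outliers inflate the RMSD many times over, while the smooth count barely changes, which is the behaviour the abstract attributes to a score that ignores non-equivalent residues.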

2.
The three-dimensional structures of 41 homologous proteins (belonging to eight families) were compared by pairwise superposition. A subset of 'core' residues was defined as those whose side chains have less than 7% of their surface exposed to solvent. This subset has significantly higher sequence identity and lower root mean square (RMS) alpha carbon separation than for all topologically equivalent residues in the structure, when members of a protein family are superposed. For such superpositions the relationship between RMS distance and percentage sequence identity of this subset of residues is similar to that for all equivalent residues, although some variation is observed between families of proteins which are predominantly beta sheet and those which are mainly alpha helix. The definition of a structurally more conserved core may be useful in building models of proteins from a homologous family. The RMS differences of coordinates of structures of proteins with identical sequences are found to be related to the resolutions of the structures.

3.
Analogs of nantenine were docked into a modeled structure of the human 5-HT2A receptor using ICM Pro, GLIDE, and GOLD docking methods. The resultant docking scores were used to correlate with observed in vitro apparent affinity (Ke) data. The GOLD docking algorithm, when used with a homology model of 5-HT2A based on a bovine rhodopsin template and built by the program MODELLER, gives results which are most in agreement with the in vitro results. Further analysis of the docking poses among members of a C1 alkyl series of nantenine analogs indicates that they bind to the receptor in a similar orientation, but differently from nantenine. Besides an important interaction between the protonated nitrogen of the C1 alkyl analogs and residue Asp155, we identified Ser242, Phe234, and Gly238 as key residues responsible for the affinity of these compounds for the 5-HT2A receptor. Specifically, the ability of some of these analogs to establish an H-bond with Ser242 and hydrophobic interactions with Phe234 and Gly238 appears to explain their enhanced affinity as compared to nantenine.

4.
We investigate the conservation of amino acid residue sequences in 21 DNA-binding protein families and study the effects that mutations have on DNA-sequence recognition. The observations are best understood by assigning each protein family to one of three classes: (i) non-specific, where binding is independent of DNA sequence; (ii) highly specific, where binding is specific and all members of the family target the same DNA sequence; and (iii) multi-specific, where binding is also specific, but individual family members target different DNA sequences. Overall, protein residues in contact with the DNA are better conserved than the rest of the protein surface, but there is a complex underlying trend of conservation for individual residue positions. Amino acid residues that interact with the DNA backbone are well conserved across all protein families and provide a core of stabilising contacts for homologous protein-DNA complexes. In contrast, amino acid residues that interact with DNA bases have variable levels of conservation depending on the family classification. In non-specific families, base-contacting residues are well conserved and interactions are always found in the minor groove where there is little discrimination between base types. In highly specific families, base-contacting residues are highly conserved and allow member proteins to recognise the same target sequence. In multi-specific families, base-contacting residues undergo frequent mutations and enable different proteins to recognise distinct target sequences. Finally, we report that interactions with bases in the target sequence often, though not always, follow a universal code of amino acid-base recognition and the effects of amino acid mutations can be most easily understood for these interactions.

5.
Protein secondary structure predictions and amino acid long-range contact map predictions from the primary sequence of proteins have been explored to aid in modelling protein tertiary structures. In order to evaluate the usefulness of secondary structure and 3D residue contact prediction methods for modelling protein structures, we have used the known Q3 (alpha-helix, beta-strands and irregular turns/loops) secondary structure information, along with residue-residue contact information, as restraints for MODELLER. We present here the results of our modelling studies on the 30 best resolved single-domain protein structures of varied lengths. The results show that it is very difficult to obtain useful models even with 100% accurate secondary structure predictions and accurate residue contact predictions for up to 30% of residues in a sequence. The best models that we obtained for proteins of lengths 37, 70, 118, 136 and 193 amino acid residues have RMSDs of 4.17, 5.27, 9.12, 7.89 and 9.69, respectively. The results show that one can obtain better models for proteins which have a high percentage of alpha-helix content. This analysis further shows that the MODELLER restraint optimization program can be useful only if we have truly homologous structure(s) as a template, from which it derives numerous restraints almost identical to those of the templates used. This analysis also clearly indicates that even if we satisfy several true residue-residue contact distances, for up to 30% of the sequence length, with fully known secondary structural information, we end up predicting model structures far from their corresponding native structures.

6.
Short-chain alcohol dehydrogenases (SCAD) constitute a large and diverse family of ancient origin. Several of its members play an important role in human physiology and disease, especially in the metabolism of steroid substrates (e.g., prostaglandins, estrogens, androgens, and corticosteroids). Their involvement in common human disorders such as endocrine-related cancer, osteoporosis, and Alzheimer disease makes them important candidates for drug targets. Recent phylogenetic analysis of SCAD is incomplete and does not allow any conclusions on very ancient divergences or on a functional characterization of novel proteins within this complex family. We have developed a 3D structure-based approach to establish the deep-branching pattern within the SCAD family. In this approach, pairwise superpositions of X-ray structures were used to calculate similarity scores as an input for a tree-building algorithm. The resulting phylogeny was validated by comparison with the results of sequence-based algorithms and biochemical data. It was possible to use the 3D data as a template for the reliable determination of the phylogenetic position of novel proteins as a first step toward functional predictions. We were able to discern new patterns in the phylogenetic relationships of the SCAD family, including a basal dichotomy of the 17beta-hydroxysteroid dehydrogenases (17beta-HSDs). These data provide an important contribution toward the development of type-specific inhibitors for 17beta-HSDs for the treatment and prevention of disease. Our structure-based phylogenetic approach can also be applied to increase the reliability of evolutionary reconstructions in other large protein families.
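The step from pairwise structural scores to a tree can be sketched as follows. The abstract does not name the tree-building algorithm, so UPGMA (average linkage) is used here purely as an assumed stand-in; the protein labels and the dissimilarity values are hypothetical.

```python
def upgma(names, matrix):
    """Build a nested-tuple tree by average-linkage (UPGMA) clustering.

    names:  list of labels
    matrix: symmetric list of lists of pairwise dissimilarities
    """
    trees = {i: name for i, name in enumerate(names)}  # cluster id -> subtree
    sizes = {i: 1 for i in trees}                      # cluster id -> leaf count
    dist = {(i, j): matrix[i][j] for i in trees for j in trees if i < j}
    next_id = len(names)
    while len(trees) > 1:
        i, j = min(dist, key=dist.get)                 # closest pair of clusters
        merged_size = sizes[i] + sizes[j]
        for k in trees:
            if k in (i, j):
                continue
            d_ik = dist[(min(i, k), max(i, k))]
            d_jk = dist[(min(j, k), max(j, k))]
            # Size-weighted average distance to the merged cluster.
            dist[(k, next_id)] = (sizes[i] * d_ik + sizes[j] * d_jk) / merged_size
        dist = {p: v for p, v in dist.items() if i not in p and j not in p}
        trees[next_id] = (trees.pop(i), trees.pop(j))
        sizes[next_id] = merged_size
        del sizes[i], sizes[j]
        next_id += 1
    return next(iter(trees.values()))

# Hypothetical structural dissimilarities (e.g. derived from superpositions).
names = ["A", "B", "C", "D"]
matrix = [
    [0.0, 1.0, 4.0, 5.0],
    [1.0, 0.0, 4.0, 5.0],
    [4.0, 4.0, 0.0, 2.0],
    [5.0, 5.0, 2.0, 0.0],
]
tree = upgma(names, matrix)
print(tree)
```

UPGMA assumes a roughly constant rate of divergence; a study of deep branching would likely prefer a method without that assumption, so this is only meant to show how a score matrix feeds a clustering step.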

7.
Protein secondary structure predictions and amino acid long-range contact map predictions from the primary sequence of proteins have been explored to aid in modelling protein tertiary structures. In order to evaluate the usefulness of secondary structure and 3D residue contact prediction methods for modelling protein structures, we have used the known Q3 (alpha-helix, beta-strands and irregular turns/loops) secondary structure information, along with residue-residue contact information, as restraints for MODELLER. We present here the results of our modelling studies on the 30 best resolved single-domain protein structures of varied lengths. The results show that it is very difficult to obtain useful models even with 100% accurate secondary structure predictions and accurate residue contact predictions for up to 30% of residues in a sequence. The best models that we obtained for proteins of lengths 37, 70, 118, 136 and 193 amino acid residues have RMSDs of 4.17, 5.27, 9.12, 7.89 and 9.69, respectively. The results show that one can obtain better models for proteins which have a high percentage of alpha-helix content. This analysis further shows that the MODELLER restraint optimization program can be useful only if we have truly homologous structure(s) as a template, from which it derives numerous restraints almost identical to those of the templates used. This analysis also clearly indicates that even if we satisfy several true residue-residue contact distances, for up to 30% of the sequence length, with fully known secondary structural information, we end up predicting model structures far from their corresponding native structures.

8.
Olfaction of insects is currently recognized as the major area of research for developing novel control strategies to prevent mosquito-borne infections. A three-dimensional (3D) model was developed for the salivary gland odorant-binding protein-2 of the mosquito Culex quinquefasciatus, a major vector of human lymphatic filariasis. A homology modeling method was used for the prediction of the structure. For the modeling, two template proteins were obtained by mGenTHREADER, namely the high-resolution X-ray crystallography structure of a pheromone-binding protein (ASP1) of Apis mellifera L. [1R5R:A] and the aristolochene synthase from Penicillium roqueforti [1DI1:B]. By comparison with the template proteins, a rough model was constructed for the target protein using MODELLER, a program for comparative modelling. The structure of the OBP of the mosquito Culex quinquefasciatus resembles the structure of the pheromone-binding protein ASP1 of Apis mellifera L. [1R5R:A]. From Ramachandran plot analysis it was found that the proportion of residues falling into the most favoured regions was 86.0%. The predicted 3D model may be further used in characterizing the protein in the wet laboratory.

9.
Human mitochondrial aldehyde dehydrogenase is a member of a superfamily of multisubunit enzymes that catalyze the conversion of a broad range of aldehydes to the corresponding acids via an NAD(P)(+)-dependent irreversible reaction. These enzymes play an important role in the detoxification of acetaldehyde and in the development of alcohol sensitivity and human alcohol-related disorders. The study aimed to understand the role of conserved residues by comparing similarities and differences between the two isoenzymes. A 3D model of human ALDHX was constructed by molecular modeling based on the crystal structure of human ALDH2, using the MODELLER (8V1) program. The reliability of the 3D model was assessed with the programs PROCHECK and PROSAII. The ALDHX fold is similar to the previously described ALDH structures. Sequence and structural analyses have highlighted a close structural and functional relationship between the two isoenzymes of human origin. The interfacial residues that are involved in crucial interactions across the interface stabilize the dimer-tetramer interface in the enzyme. Stability factors such as salt bridges and hydrogen bonds aid and maintain the tetrameric assembly of the enzyme.

10.
Toxin-antitoxin (TA) systems contribute to plasmid stability by a mechanism that relies on the differential stabilities of the toxin and antitoxin proteins and leads to the killing of daughter bacteria that did not receive a plasmid copy at cell division. ParE is the toxic component of a TA system and, along with RelE, constitutes an important class of bacterial toxins called the RelE/ParE superfamily. For the ParE toxin, no crystallographic structure is available so far, and the few in vitro studies have demonstrated that the target of toxin activity is E. coli DNA gyrase. Here, a 3D model of the E. coli ParE toxin was built by homology modeling using MODELLER, a program for comparative modeling. The model was energy minimized with CHARMM and validated using the PROCHECK and VERIFY3D programs. From Ramachandran plot analysis it was found that the proportion of residues falling into the most favored and allowed regions was 96.8%. A structural similarity search employing the DALI server showed the RelE and YoeB families as the best matches. The model also showed similarities with other microbial ribonucleases, but with low scores. A possible homologous deep-cleft active site was identified in the model using the CASTp program. Additional studies are needed to investigate the nuclease activity in members of the ParE family as well as to confirm the replication-inhibiting activity. The predicted model allows initial inferences about the unexplored 3D structure of the ParE toxin and may be further used in the rational design of molecules for structure-function studies.

11.
PALI (release 1.2) contains three-dimensional (3-D) structure-dependent sequence alignments as well as structure-based phylogenetic trees of homologous protein domains in various families. The data set of homologous protein structures has been derived by consulting the SCOP database (release 1.50) and the data set comprises 604 families of homologous proteins involving 2739 protein domain structures with each family made up of at least two members. Each member in a family has been structurally aligned with every other member in the same family (pairwise alignment) and all the members in the family are also aligned using simultaneous super-position (multiple alignment). The structural alignments are performed largely automatically, with manual interventions especially in the cases of distantly related proteins, using the program STAMP (version 4.2). Every family is also associated with two dendrograms, calculated using PHYLIP (version 3.5), one based on a structural dissimilarity metric defined for every pairwise alignment and the other based on similarity of topologically equivalent residues. These dendrograms enable easy comparison of sequence and structure-based relationships among the members in a family. Structure-based alignments with the details of structural and sequence similarities, superposed coordinate sets and dendrograms can be accessed conveniently using a web interface. The database can be queried for protein pairs with sequence or structural similarities falling within a specified range. Thus PALI forms a useful resource to help in analysing the relationship between sequence and structure variation at a given level of sequence similarity. PALI also contains over 653 'orphans' (single member families). Using the web interface involving PSI_BLAST and PHYLIP it is possible to associate the sequence of a new protein with one of the families in PALI and generate a phylogenetic tree combining the query sequence and proteins of known 3-D structure. 
The database with the web-interfaced search and dendrogram generation tools can be accessed at http://pauling.mbu.iisc.ernet.in/~pali.

12.
A nearly complete sequential resonance assignment is a key factor leading to successful protein structure determination via NMR spectroscopy. Assuming the availability of a set of NMR spectral peak lists, most of the existing assignment algorithms first use the differences between chemical shift values for common nuclei across multiple spectra to provide evidence that some pairs of peaks should be assigned to sequentially adjacent amino acid residues in the target protein. They then use these connectivities as constraints to produce a sequential assignment. With varying levels of success, these algorithms typically generate a large number of potential connectivity constraints, and this number grows exponentially as the quality of the spectral data decreases. A key observation used in our sequential assignment program, CISA, is that chemical shift residual signature information can be used to improve the connectivity determination, and thus to dramatically decrease the number of predicted connectivity constraints. Fewer connectivity constraints lead to fewer ambiguities in the sequential assignment. Extensive simulation studies on several large test datasets demonstrated that CISA is efficient and effective, compared to the three most recently proposed sequential resonance assignment programs, RANDOM, PACES, and MARS.
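The generic connectivity step described above, matching chemical shifts of a common nucleus within a tolerance, can be sketched as follows. This illustrates the baseline approach only, not CISA's signature-based refinement; the field names `ca` and `ca_prev`, the shift values, and the tolerances are hypothetical.

```python
def candidate_connectivities(spin_systems, tolerance=0.2):
    """Return (i, j) pairs where spin system i may directly precede j.

    Each spin system carries its own C-alpha shift ("ca") and the shift
    observed for the preceding residue ("ca_prev"), both in ppm. System i
    is a candidate predecessor of j when ca(i) matches ca_prev(j).
    """
    pairs = []
    for i, a in enumerate(spin_systems):
        for j, b in enumerate(spin_systems):
            if i != j and abs(a["ca"] - b["ca_prev"]) <= tolerance:
                pairs.append((i, j))
    return pairs

# Hypothetical peak data for three spin systems.
spin_systems = [
    {"ca": 58.1, "ca_prev": 50.0},
    {"ca": 62.3, "ca_prev": 58.0},
    {"ca": 54.9, "ca_prev": 62.4},
]

tight = candidate_connectivities(spin_systems, tolerance=0.2)  # good data
loose = candidate_connectivities(spin_systems, tolerance=5.0)  # noisy data
print(tight)
print(len(loose))
```

Widening the tolerance to cope with noisier spectra admits many more candidate pairs, which is the growth in ambiguity that the abstract says additional signature information is meant to curb.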

13.
14.
A comprehensive comparison of multiple sequence alignment programs.   Total citations: 35 (self-citations: 4; citations by others: 31)
In recent years, improvements to existing programs and the introduction of new iterative algorithms have changed the state of the art in protein sequence alignment. This paper presents the first systematic study of the most commonly used alignment programs using BAliBASE benchmark alignments as test cases. Even below the 'twilight zone' at 10-20% residue identity, the best programs were capable of correctly aligning on average 47% of the residues. We show that iterative algorithms often offer improved alignment accuracy, though at the expense of computation time. A notable exception was the effect of introducing a single divergent sequence into a set of closely related sequences, causing the iteration to diverge away from the best alignment. Global alignment programs generally performed better than local methods, except in the presence of large N/C-terminal extensions and internal insertions. In these cases, a local algorithm was more successful in identifying the most conserved motifs. This study enables us to propose appropriate alignment strategies, depending on the nature of a particular set of sequences. The employment of more than one program based on different alignment techniques should significantly improve the quality of automatic protein sequence alignment methods. The results also indicate guidelines for improvement of alignment algorithms.
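One way to quantify "correctly aligned residues" against a benchmark is a sum-of-pairs style score: the fraction of residue pairs aligned in the reference that the test alignment also places in the same column. The abstract does not state which score the study used, so the sketch below is an assumption, and the toy sequences are hypothetical.

```python
from itertools import combinations

def aligned_pairs(alignment):
    """Set of residue pairs that share a column.

    alignment: list of equal-length gapped strings ('-' marks a gap).
    Residues are identified as (sequence index, ungapped position).
    """
    positions = [0] * len(alignment)
    pairs = set()
    for column in zip(*alignment):
        residues = []
        for seq_index, char in enumerate(column):
            if char != "-":
                residues.append((seq_index, positions[seq_index]))
                positions[seq_index] += 1
        pairs.update(combinations(residues, 2))
    return pairs

def sum_of_pairs_score(test, reference):
    """Fraction of reference residue pairs reproduced by the test alignment."""
    reference_pairs = aligned_pairs(reference)
    return len(aligned_pairs(test) & reference_pairs) / len(reference_pairs)

reference = ["ACGT-", "A-GTC"]
test = ["ACGT-", "AG-TC"]
score = sum_of_pairs_score(test, reference)
print(round(score, 3))
```

Here the test alignment recovers two of the three reference pairs; the misplaced gap costs the G-G pair.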

15.
At present, 69 families of carbohydrate‐binding modules (CBMs) have been isolated by statistically significant differences in the amino acid sequences (primary structures) of their members, with most members of different families showing little if any homology. On the other hand, members of the same family have primary and tertiary (three‐dimensional) structures that can be computationally aligned, suggesting that they are descended from common protein ancestors. Members of the large majority of CBM families are β‐sandwiches. This raises the question of whether members of different families are descended from distant common ancestors, and therefore are members of the same tribe. We have attacked this problem by attempting to computationally superimpose tertiary structure representatives of each of the 53 CBM families that have members with known tertiary structures. When successful, we have aligned locations of secondary structure elements and determined root mean square deviations and percentages of similarity between adjacent amino acid residues in structures from similar families. Further criteria leading to tribal membership are amino acid chain lengths and bound ligands. These considerations have led us to assign 27 families to nine tribes. Eight of the tribes have members with β‐sandwich structures, while the ninth is composed of structures with β‐trefoils. © 2014 Wiley Periodicals, Inc. Biopolymers 103: 203–214, 2015.

16.
To classify proteins into functional families based on their primary sequences, popular algorithms such as the k-NN-, HMM-, and SVM-based algorithms are often used. For many of these algorithms to perform their tasks, protein sequences need to be properly aligned first. Since the alignment process can be error-prone, protein classification may not be performed very accurately. To improve classification accuracy, we propose an algorithm, called the Unaligned Protein SEquence Classifier (UPSEC), which can perform its tasks without sequence alignment. UPSEC makes use of a probabilistic measure to identify residues that are useful for classification in both positive and negative training samples, and can handle multi-class classification with a single classifier and a single pass through the training data. UPSEC has been tested with real protein data sets. Experimental results show that UPSEC can effectively classify unaligned protein sequences into their corresponding functional families, and the patterns it discovers during the training process can be biologically meaningful.

17.
MOTIVATION: Although the cores of homologous proteins are relatively well conserved, amino acid substitutions lead to significant differences in the structures of divergent superfamilies. Thus, the classification of amino acid sequence patterns and the selection of appropriate fragments of the protein cores of homologues of known structure are important for accurate comparative modelling. RESULTS: CHORAL utilizes a knowledge-based method comprising an amalgam of differential geometry and pattern recognition algorithms to identify conserved structural patterns in homologous protein families. Propensity tables are used to classify and to select patterns that most likely represent the structure of the core for a target protein. In our benchmark, CHORAL demonstrates a performance equivalent to that of MODELLER.

18.
We have identified a novel gene, six transmembrane protein of prostate 1 (STAMP1), which is largely specific to prostate for expression and is predicted to code for a 490-amino acid six transmembrane protein. Using a form of STAMP1 labeled with green fluorescent protein in quantitative time-lapse and immunofluorescence confocal microscopy, we show that STAMP1 is localized to the Golgi complex, predominantly to the trans-Golgi network, and to the plasma membrane. STAMP1 also localizes to vesicular tubular structures in the cytosol and colocalizes with the early endosome antigen 1 (EEA1), suggesting that it may be involved in the secretory/endocytic pathways. STAMP1 is highly expressed in the androgen-sensitive, androgen receptor-positive prostate cancer cell line LNCaP, but not in androgen receptor-negative prostate cancer cell lines PC-3 and DU145. Furthermore, STAMP1 expression is significantly lower in the androgen-dependent human prostate xenograft CWR22 compared with the relapsed derivative CWR22R, suggesting that its expression may be deregulated during prostate cancer progression. Consistent with this notion, in situ analysis of human prostate cancer specimens indicated that STAMP1 is expressed exclusively in the epithelial cells of the prostate and its expression is significantly increased in prostate tumors compared with normal glands. Taken together, these data suggest that STAMP1 may have an important role in the normal prostate cell as well as in prostate cancer progression.

19.
Evolution of protein sequences and structures.   Total citations: 9 (self-citations: 0; citations by others: 9)
The relationship between sequence similarity and structural similarity has been examined in 36 protein families with five or more diverse members whose structures are known. The structural similarity within a family (as determined with the DALI structure comparison program) is linearly related to sequence similarity (as determined by a Smith-Waterman search of the protein sequences in the structure database). The correlation between structural similarity and sequence similarity is very high; 18 of the 36 families had linear correlation coefficients r ≥ 0.878, and only nine had correlation coefficients r [...]
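The linear correlation coefficient r mentioned above can be computed per family from paired sequence and structure similarity scores. The sketch below uses the standard Pearson formula; the score values are hypothetical and are not taken from the study.

```python
import math

def pearson_r(xs, ys):
    """Pearson linear correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# Hypothetical scores for pairs of proteins within one family.
sequence_similarity = [20, 35, 50, 65, 80]
structural_similarity = [10, 18, 24, 33, 40]

r = pearson_r(sequence_similarity, structural_similarity)
print(round(r, 3))
```

A value of r near 1 for a family indicates that structural similarity rises almost linearly with sequence similarity across its members.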

20.
D Xu, K Baburaj, C B Peterson, Y Xu. Proteins. 2001, 44(3): 312-320
The structure of vitronectin, an adhesive protein that circulates in high concentrations in human plasma, was predicted through a combination of computational methods and experimental approaches. Fold recognition and sequence-structure alignment were performed using the threading program PROSPECT for each of three structural domains, i.e., the N-terminal somatomedin B domain (residues 1-53), the central region that folds into a four-bladed beta-propeller domain (residues 131-342), and the C-terminal heparin-binding domain (residues 347-459). The atomic structure of each domain was generated using MODELLER, based on the alignment obtained from threading. Docking experiments between the central and C-terminal domains were conducted using the program GRAMM, with limits on the degrees of freedom from a known inter-domain disulfide bridge. The docked structure has a large inter-domain contact surface and defines a putative heparin-binding groove at the inter-domain interface. We also docked heparin together with the combined structure of the central and C-terminal domains, using GRAMM. The predictions from the threading and docking experiments are consistent with experimental data on purified plasma vitronectin pertaining to protease sensitivity, ligand-binding sites, and buried cysteines.
