Similar articles
1.
The amino acid sequences of 18 alpha-amylases have been compared by hydrophobic cluster analysis. The method was first calibrated with two alpha-amylases (Aspergillus oryzae and pig pancreas) whose three-dimensional structures are known. It was then applied to the other alpha-amylases, resulting in straightforward sequence alignments which could be used for structure prediction. It was found that all alpha-amylases which were investigated display the same basic super-secondary structure with a (beta alpha)8 barrel. Most of the secondary structure elements of the protein cores could be assigned to segments of the amino acid sequences. In addition, six sub-families could be identified, based upon specific similarities occurring in the variable regions of alpha-amylases.

2.
A new method for comparing and aligning protein sequences is described. This method, hydrophobic cluster analysis (HCA), relies upon a two-dimensional (2D) representation of the sequences. Hydrophobic clusters are determined in this 2D pattern and then used for the sequence comparisons. The method does not require powerful computer resources and can deal with distantly related proteins, even if no 3D data are available. This is illustrated in the present report by a comparison of human haemoglobin with leghaemoglobin, a comparison of the two domains of liver rhodanese (thiosulphate sulphurtransferase) and a comparison of plastocyanin and azurin.

3.
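Tao Hua, Tang Xuqing. 《生物信息学》 (Chinese Journal of Bioinformatics), 2012, 10(4): 269-273, 279
The clustering structure of protein sequences is analysed using the granular space of a fuzzy proximity relation. Alignment distances between the selected xylanase sequences are computed with the MEGA software and converted, by introducing an inner product, into a fuzzy proximity relation (or matrix). An algorithm is then applied to solve for the corresponding granular space, which is used to analyse the clustering structure of the sequences and to determine the optimal clustering. These studies provide a quantitative analysis tool for protein sequences.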

4.
Normal mode analysis (NMA) can facilitate quick and systematic investigation of protein dynamics using data from the Protein Data Bank (PDB). We developed an elastic network model-based NMA program using dihedral angles as independent variables. Compared to the NMA programs that use Cartesian coordinates as independent variables, key attributes of the proposed program are as follows: (1) chain connectivity related to the folding pattern of a polypeptide chain is naturally embedded in the model; (2) the full-atom system is acceptable, and owing to a considerably smaller number of independent variables, the PDB data can be used without further manipulation; (3) the number of variables can be easily reduced by some of the rotatable dihedral angles; (4) the PDB data for any molecule besides proteins can be considered without coarse-graining; and (5) individual motions of constituent subunits and ligand molecules can be easily decomposed into external and internal motions to examine their mutual and intrinsic motions. Its performance is illustrated with an example of a DNA-binding allosteric protein, a catabolite activator protein. In particular, the focus is on the conformational change upon cAMP and DNA binding, and on the communication between their binding sites remotely located from each other. In this illustration, NMA creates a vivid picture of the protein dynamics at various levels of the structures, i.e., atoms, residues, secondary structures, domains, subunits, and the complete system, including DNA and cAMP. Comparative studies of the specific protein in different states, e.g., apo- and holo-conformations, and free and complexed configurations, provide useful information for studying structurally and functionally important aspects of the protein.

5.
MOTIVATION: Information about a particular protein or protein family is usually distributed among multiple databases and often in more than one entry in each database. Retrieval and organization of this information can be a laborious task. This task is complicated even further by the existence of alternative terms for the same concept. RESULTS: The PDB, SWISS-PROT, ENZYME, and CATH databases have been imported into a combined relational database, BIOMOLQUEST. A powerful search engine has been built using this database as a back end. The search engine achieves significant improvements in query performance by automatically utilizing cross-references between the legacy databases. The results of the queries are presented in an organized, hierarchical way.

6.
7.
MOTIVATION: Given a large family of homologous protein sequences, many methods can divide the family into smaller groups that correspond to the different functions carried out by proteins within the family. One important problem, however, has been the absence of a general method for selecting an appropriate level of granularity, or size of the groups. RESULTS: We propose a consistent way of choosing the granularity that is independent of the sequence similarity and sequence clustering method used. We study three large, well-investigated protein families: basic leucine zippers, nuclear receptors and proteins with three consecutive C2H2 zinc fingers. Our method is tested against known functional information, the experimentally determined binding specificities, using a simple scoring method. The significance of the groups is also measured by randomizing the data. Finally, we compare our algorithm against a popular method of grouping proteins, the TRIBE-MCL method. In the end, we determine that dividing the families at the proposed level of granularity creates very significant and useful groups of proteins that correspond to the different DNA-binding motifs. We expect that such groupings will be useful in studying not only DNA binding but also other protein interactions.

8.
9.
Alignment of protein sequences is a key step in most computational methods for prediction of protein function and homology-based modeling of three-dimensional (3D) structure. We investigated the correspondence between "gold standard" alignments of 3D protein structures and the sequence alignments produced by the Smith-Waterman algorithm, currently the most sensitive method for pair-wise alignment of sequences. The results of this analysis enabled development of a novel method to align a pair of protein sequences. The comparison of the Smith-Waterman and structure alignments focused on their inner structure and especially on the continuous ungapped alignment segments, "islands" between gaps. Approximately one third of the islands in the gold standard alignments have a negative or low positive score, and their recognition is below the sensitivity limit of the Smith-Waterman algorithm. From the alignment accuracy perspective, the time spent by the algorithm while working in these unalignable regions is unnecessary. We considered features of the standard similarity scoring function responsible for this phenomenon and suggested an alternative hierarchical algorithm, which explicitly addresses high scoring regions. This algorithm is considerably faster than the Smith-Waterman algorithm, whereas the resulting alignments are on average of the same quality with respect to the gold standard. This finding shows that a decrease in alignment accuracy is not necessarily the price of computational efficiency.
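
For readers unfamiliar with the baseline named in this abstract, the following is a minimal sketch of the standard Smith-Waterman local-alignment recurrence (score only, no traceback). It is not the hierarchical algorithm the authors propose. The function name and the match, mismatch and linear gap values are illustrative assumptions; protein alignments in practice typically use a substitution matrix and affine gap penalties.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local-alignment score between sequences a and b.

    Scoring values are illustrative assumptions, not taken from the abstract.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Extend the alignment diagonally with a match or a mismatch.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

The clamp at zero is what makes the alignment local: a low-scoring stretch resets the score instead of dragging down a later high-scoring segment.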

10.
11.
Eighty-two amino acid sequences of the catalytic domains of mature endoxylanases belonging to family 11 have been aligned using the programs MATCHBOX and CLUSTAL. The sequences range in length from 175 to 233 residues. The two glutamates acting as catalytic residues are conserved in all sequences. A very good correlation is found between the presence (at position 100) of an asparagine in the so-called 'alkaline' xylanases, or an aspartic acid in those with a more acidic pH optimum. Four boxes defining segments of highest similarity were detected; they correspond to regions of defined secondary structure: B5, B6, B8 and the carboxyl end of the alpha helix, respectively. Cysteine residues are not common in these sequences (0.7% of all residues), and disulfide bridges are not important in explaining the stability of several thermophilic xylanases. The alignment allows the classification of the enzymes in groups according to sequence similarity. Fungal and bacterial enzymes were found to form mostly separate clusters of higher similarity.

12.
13.

Background  

With the current technological advances in high-throughput biology, the need for tools that help to analyse the massive amount of data being generated is evident. A powerful method of inspecting large-scale data sets is gene set enrichment analysis (GSEA), and investigating protein structural features can help to determine the function of individual genes. However, a convenient tool that combines these two features to aid in high-throughput data analysis has not yet been developed. To fill this niche, we developed the user-friendly, web-based application PhenoFam.

14.
Protein structural class prediction is one of the challenging problems in bioinformatics. Previous methods directly based on the similarity of amino acid (AA) sequences have been shown to be insufficient for low-similarity protein data-sets. To improve the prediction accuracy for such low-similarity proteins, different methods have recently been proposed that explore novel feature sets based on predicted secondary structure propensities. In this paper, we focus on protein structural class prediction using combinations of the novel features, including secondary structure propensities as well as functional domain (FD) features extracted from the InterPro signature database. Our comprehensive experimental results based on several benchmark data-sets have shown that the integration of new FD features substantially improves the accuracy of structural class prediction for low-similarity proteins, as they capture meaningful relationships among AA residues that are far apart in the protein sequence. The proposed prediction method has also been tested on partially disordered proteins, for which it predicts structural classes with reasonable accuracy; this is a more difficult problem than structural class prediction for commonly used benchmark data-sets and, to the best of our knowledge, has never been done before. In addition, to avoid overfitting with a large number of features, feature selection is applied to select discriminating features that contribute to high prediction accuracy. The selected features have been shown to achieve stable prediction performance across different benchmark data-sets.

16.
Sixteen alginate lyases whose primary sequences have been reported were compared, and classified into the following three groups on the basis of the identity of their primary sequences. Strong homology (>50%): A-AlgL, A-AlgL*, P-AlgL, P-AlgL*, and AlgA; weak homology (>20%): ALY, AlxM, P-Aly, K-Aly, AlyPG, AlgVGI, AlgVGII, and AlgVGIII; little homology (<20%): ALYII, Al-III, and AlgVMI. Using hydrophobic cluster analysis (HCA), a secondary structure prediction method, the sixteen alginate lyases were placed into the following classes. Class 1: AlgA, A-AlgL, A-AlgL*, P-AlgL, and P-AlgL*; Class 2: AlgVMI and Al-III; Class 3: ALY and AlxM; Class 4A: ALYII, K-Aly, P-Aly, and AlyPG; Class 4B: AlgVGI and AlgVGII; Class 5: AlgVGIII, which is put in a class of its own, because it is unlike any of the other alginate lyases.
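
The identity-based grouping in this abstract can be illustrated with a small sketch. The abstract does not say how identity was computed, so the choices here are assumptions: identity is taken over aligned columns where neither sequence has a gap, and a value of exactly 20% (which the stated thresholds leave undefined) is placed in the lowest group. The function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap.

    The choice of denominator is an assumption; the abstract does not define it.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if gap not in (x, y)]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def homology_group(identity):
    """Map a percent identity onto the three groups named in the abstract."""
    if identity > 50:
        return "strong"
    if identity > 20:
        return "weak"
    return "little"  # assumption: exactly 20% is grouped with <20%
```

For example, `homology_group(percent_identity("ACDE", "ACDF"))` yields `"strong"`, since three of the four aligned positions are identical.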

17.

Background  

A large number of PROSITE patterns select false positives and/or miss known true positives. It is possible that, at least in some cases, the weak specificity and/or sensitivity of a pattern is due to the fact that one or more functional and/or structural key residues are not represented in the pattern. Multiple sequence alignments are commonly used to build functional sequence patterns. If residues structurally conserved in proteins sharing a function cannot be aligned in a multiple sequence alignment, they are likely to be missed in a standard pattern construction procedure.
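
To make concrete what a PROSITE pattern selects, the sketch below converts the basic PROSITE pattern syntax into a Python regular expression. The function name is hypothetical, and only the common elements are handled (`x`, `[...]`, `{...}`, repeat counts, and `<` / `>` anchors at the ends of the pattern); anchors inside brackets and other rarer constructs are not covered.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Convert a basic PROSITE pattern string into a Python regex string."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        # '<' and '>' anchor the pattern to the N- and C-terminus.
        prefix = "^" if element.startswith("<") else ""
        suffix = "$" if element.endswith(">") else ""
        element = element.lstrip("<").rstrip(">")
        # Split off an optional repeat count such as (3) or (2,4).
        core, low, high = re.fullmatch(
            r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element
        ).groups()
        if core == "x":
            core = "."  # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"  # any residue except those listed
        repeat = ""
        if low:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)
```

With this sketch, the N-glycosylation site pattern `N-{P}-[ST]-{P}.` becomes `N[^P][ST][^P]`, which can be passed to `re.search`. A short, permissive pattern like this matches many unrelated sequences, which illustrates the false-positive problem the abstract describes.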

18.
19.
ZRANB2 was identified originally in a differential display experiment on 2-day and 10-day primary cultures of rat juxtaglomerular cells. During prolonged culture it was found to undergo down-regulation in concert with renin, the archetypical constituent of these cells. ZRANB2 has two zinc fingers that form a novel fold and show striking homology to Ran-binding protein domains. Human ZRANB2 mRNA is alternatively spliced to give two variants with different 3' ends. ZRANB2 has homologues across a range of species, the N-terminal end being particularly conserved. ZRANB2 is present in the nucleus of human cells. It binds to mRNA, as well as the essential splicing factors U170K and U2AF(35) and the novel splicing component SFRS17A (formerly known as XE7). ZRANB2 is one of 20 genes up-regulated in grade III ovarian serous papillary carcinoma. Here, we review current knowledge surrounding ZRANB2.

20.