Similar literature
 Found 20 similar articles (search time: 546 ms)
1.
2.
Protein structure alignment algorithms play an important role in the study of protein structure and function. In this paper, a novel approach to structure alignment is presented. Specifically, core regions in two protein structures are first aligned by identifying connected components in a network of neighboring, geometrically compatible aligned fragment pairs. The initial alignments are then refined through a multi-objective optimization method. The algorithm can produce both sequential and non-sequential alignments. We demonstrate the superior performance of the proposed algorithm through computational experiments on several benchmark datasets and comparisons with well-known structure alignment algorithms such as DALI, CE and MATT. The proposed method can obtain accurate and biologically significant alignments in cases with internal repeats or indels, identify circular permutations, and reveal conserved functional sites. A ranking criterion for fold similarity is also presented and, in the numerical experiments, is found to be comparable or superior to the Z-score of CE in most cases. The software and supplementary data of computational results are available at .
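
The first step described above groups aligned fragment pairs (AFPs) by finding connected components in a compatibility network. The sketch below shows only that generic graph step, under stated assumptions: the node labels and the edge list are made up for illustration, and how the paper decides that two AFPs are "neighboring and geometrically compatible" is not given in the abstract, so edges are simply taken as input.

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Group nodes into connected components of an undirected graph.

    `nodes` is an iterable of hashable labels (here: hypothetical AFP ids);
    `edges` is an iterable of (u, v) pairs meaning "u and v are compatible".
    """
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        # Depth-first traversal collecting everything reachable from `start`.
        seen.add(start)
        stack = [start]
        component = []
        while stack:
            u = stack.pop()
            component.append(u)
            for v in adjacency[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        components.append(sorted(component))
    return components


# Hypothetical AFP ids and compatibility edges, for illustration only.
afps = ["afp1", "afp2", "afp3", "afp4"]
compatible = [("afp1", "afp2"), ("afp2", "afp3")]
groups = connected_components(afps, compatible)
```

Each resulting group would correspond to one candidate core region; an isolated AFP forms a component of its own.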

3.
We propose a detailed protein structure alignment method named "MatAlign". It is a two-step algorithm. First, we represent 3D protein structures as 2D distance matrices and align these matrices by dynamic programming to find the initially aligned residue pairs. Second, we iteratively refine the initial alignment into the optimal one according to an objective scoring function. We compare our method against DALI and CE, which are among the most accurate and most widely used of the existing structural comparison tools. On the benchmark set of 68 protein structure pairs by Fischer et al., MatAlign provides better alignment results, according to four different criteria, than both DALI and CE in a majority of cases. MatAlign also performs as well as DALI in structural database search, and much better than CE. MatAlign is about two to three times faster than DALI and about as fast as CE. The software and the supplementary information for this paper are available at http://xena1.ddns.comp.nus.edu.sg/~genesis/MatAlign/.
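
The representation step mentioned above, turning a 3D structure into a 2D distance matrix, can be sketched as follows. This is a minimal illustration and not MatAlign's code: it assumes one coordinate per residue (for example the C-alpha atom), and the three coordinates are invented for the example.

```python
import math


def distance_matrix(coords):
    """Return the symmetric matrix of pairwise Euclidean distances."""
    n = len(coords)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Fill both triangles at once; the diagonal stays 0.0.
            d[i][j] = d[j][i] = math.dist(coords[i], coords[j])
    return d


# Three hypothetical residue positions (angstroms), made up for illustration.
ca = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
dm = distance_matrix(ca)
```

A distance matrix does not change when the structure is rotated or translated, which is why two such matrices can be compared without first superimposing the structures.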

4.
5.
MOTIVATION: Glycans are the third major class of biomolecules, following DNA and proteins. They are vital for the functioning of multicellular organisms. However, compared with the rapid development of sequence analysis techniques, informatics work on glycans has a long way to go. Alignment algorithms for glycan tree structures are one of the foremost concerns. In addition, the statistical analysis of these algorithms in terms of biological significance needs to be addressed. RESULTS: We developed a tree-structure alignment algorithm for glycans and performed a statistical analysis of the alignment scores so that biologically interesting features could be captured in a score matrix for glycans. We generated our score matrix in a manner similar to BLOSUM, but with slight variations to accommodate our glycan data, including the incorporation of linkage information. We verified the effectiveness of the new glycan score matrix by illustrating how well its entries correspond with biological knowledge. Future work on further improvements, using a variety of score matrices for different subclasses of glycans to account for their complexity, is also discussed. CONTACT: mami@kuicr.kyoto-u.ac.jp SUPPLEMENTARY INFORMATION: The glycan score matrix can be downloaded from http://kanehisa.kuicr.kyoto-u.ac.jp/Paper/kcam/glycanMatrix0.1.txt.

6.
7.
8.
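Objective: Accurately determining the three-dimensional structure of proteins from nuclear magnetic resonance (NMR) spectroscopy experiments is currently a hot topic in biophysics. Proteins are essential components of living organisms, and knowing their spatial structure is crucial for studying their function; however, the severe scarcity of experimental data makes this a major challenge. Methods: In this paper, the protein structure determination problem is addressed with matrix completion (MC) algorithms that recover the distance matrix. First, an initial distance matrix model is built; because experimental data are lacking, this initial distance matrix is incomplete. MC algorithms are then used to recover its missing entries, yielding the full three-dimensional structure of the protein. To further test performance, four proteins with different topologies and six existing MC algorithms were selected, and the recovery quality was examined under different sampling rates and noise levels. Results: Algorithm performance was evaluated by analyzing the mean and standard deviation of two key metrics, root-mean-square deviation (RMSD) and computation time. The results show that when the sampling rate and noise factor are kept within a certain range, both the RMSD and its standard deviation reach very small values. In addition, the paper compares the characteristics and strengths of the different algorithms in more detail; in the case of exact sampling...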
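
RMSD, one of the two evaluation metrics named above, can be computed as in the sketch below. It assumes the two coordinate sets list the same atoms in the same order and are already superimposed; the optimal superposition step that usually precedes an RMSD calculation is not shown, and the abstract does not say how the paper performs it.

```python
import math


def rmsd(a, b):
    """Root-mean-square deviation between two equally long coordinate lists.

    Assumes the structures are already superimposed and that a[i] and b[i]
    refer to the same atom.
    """
    if len(a) != len(b) or not a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared = sum((x - y) ** 2 for p, q in zip(a, b) for x, y in zip(p, q))
    return math.sqrt(squared / len(a))


# Hypothetical reference and recovered coordinates, made up for illustration:
# every recovered atom is displaced by 2 along the z axis.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
recovered = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
deviation = rmsd(reference, recovered)
```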

9.
We develop a new approach to estimate a matrix of pairwise evolutionary distances from a codon-based alignment based on a codon evolutionary model. The method first computes a standard distance matrix for each of the three codon positions. These three distance matrices are then weighted according to an estimate of the global evolutionary rate of each codon position and averaged into a single distance matrix. Using a large set of both real and simulated codon-based alignments of nucleotide sequences, we show that this approach leads to distance matrices with significantly better treelikeness than those obtained from standard nucleotide evolutionary distances. We also propose an alternative weighting to eliminate part of the noise often associated with some codon positions, particularly the third position, which is known to evolve rapidly. Simulation results show that fast distance-based tree reconstruction algorithms applied to distance matrices based on this codon position weighting can lead to phylogenetic trees that are at least as accurate as, if not more accurate than, those inferred by maximum likelihood. Finally, a well-known multigene dataset composed of eight yeast species and 106 codon-based alignments is reanalyzed; the results show that our codon evolutionary distances allow building a phylogenetic tree that is similar to those obtained by non-distance-based methods (e.g., maximum parsimony and maximum likelihood) and significantly improved compared to standard nucleotide evolutionary distance estimates.
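
The combination step described above, weighting three per-position distance matrices and averaging them into one, might look like the sketch below. The weights are taken as plain inputs: how the paper derives them from the estimated evolutionary rate of each codon position is not stated in the abstract, so the numbers used here are purely illustrative.

```python
def combine_distance_matrices(matrices, weights):
    """Weighted average of equally sized square matrices.

    `weights` are hypothetical per-codon-position weights; they are
    normalized here so that they need not sum to 1.
    """
    if len(matrices) != len(weights) or not matrices:
        raise ValueError("need one weight per matrix")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    n = len(matrices[0])
    return [
        [sum(w * m[i][j] for m, w in zip(matrices, weights)) / total
         for j in range(n)]
        for i in range(n)
    ]


# Toy 2-taxon distance matrices for codon positions 1, 2 and 3 (made-up values).
pos1 = [[0.0, 0.1], [0.1, 0.0]]
pos2 = [[0.0, 0.2], [0.2, 0.0]]
pos3 = [[0.0, 0.9], [0.9, 0.0]]
combined = combine_distance_matrices([pos1, pos2, pos3], [1.0, 1.0, 2.0])
```

Setting a position's weight to zero drops that matrix entirely, which is one simple way to picture the alternative weighting that suppresses a noisy position.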

10.
We present a comprehensive evaluation of a new structure mining method called PB-ALIGN. It is based on encoding protein structure as a 1D sequence over an alphabet of 16 short structural motifs, or protein blocks (PBs). PBs are short motifs capable of representing most of the local structural features of a protein backbone. Using a derived PB substitution matrix and a simple dynamic programming algorithm, PB sequences are aligned in the same way as amino acid sequences to yield a structure alignment. Aligning these local features as sequences of symbols enables fast detection of structural similarities between two proteins. The ability of the method to characterize and align regions beyond regular secondary structures, for example, the N and C caps of helices and the loops connecting regular structures, puts it a step ahead of existing methods, which rely strongly on secondary structure elements. PB-ALIGN achieved an efficiency of 85% in extracting the true fold from a large database of 7259 SCOP domains and identified true superfamily members in 82% of cases. In comparison with 13 existing structure comparison/mining methods, PB-ALIGN emerged as the best on the general ability test dataset and was on par with methods like YAKUSA and CE on the nontrivial test dataset. Furthermore, the proposed method performed well when compared to flexible structure alignment methods like FATCAT and outperforms them in processing speed (less than 45 s per database scan). This work also establishes a reliable cut-off value for the demarcation of similar folds. Finally, it shows that global alignment scores of unrelated structures using PBs follow an extreme value distribution. PB-ALIGN is freely available on a web server called Protein Block Expert (PBE) at http://bioinformatics.univ-reunion.fr/PBE/.
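
Aligning two symbol sequences by dynamic programming, as described above, can be illustrated with a standard global alignment (Needleman-Wunsch) score computation. This is a sketch under stated assumptions: `toy_score` is a placeholder match/mismatch rule and not the derived PB substitution matrix, the linear gap penalty of -2 is arbitrary, and the example strings are invented.

```python
def global_align_score(s, t, score, gap=-2):
    """Best global alignment score of sequences s and t.

    `score(a, b)` returns the substitution score for symbols a and b;
    `gap` is a linear penalty added for each unaligned symbol.
    """
    # prev[j] holds the best score for aligning s[:i-1] with t[:j].
    prev = [j * gap for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        cur = [i * gap]
        for j in range(1, len(t) + 1):
            cur.append(max(
                prev[j - 1] + score(s[i - 1], t[j - 1]),  # align the two symbols
                prev[j] + gap,                            # gap in t
                cur[j - 1] + gap,                         # gap in s
            ))
        prev = cur
    return prev[-1]


def toy_score(a, b):
    """Placeholder scoring: +2 for identical symbols, -1 otherwise."""
    return 2 if a == b else -1


# Hypothetical short symbol strings, for illustration only.
example = global_align_score("abc", "ac", toy_score)
```

Only two rows of the table are kept in memory, which is enough when the score alone is needed; recovering the alignment itself would require the full table or a traceback structure.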

11.
12.
It is generally accepted that protein structures are more conserved than protein sequences, and 3D structure determination by computer simulation has become an important necessity in the postgenomic era. Despite major successes, no robust, fast, and automated ab initio prediction algorithms for deriving accurate folds of single polypeptide chains or structures of intermolecular complexes exist at present. Here we present a methodology that selects and filters structural models, generated by docking known substructures such as individual proteins or domains, using easily obtainable experimental NMR constraints. In particular, residual dipolar couplings and chemical shift mapping are used. In our selection strategy, heuristic inclusion of chemical or biochemical knowledge about point-to-point interactions is combined with the NMR data and commonly used contact potentials. We demonstrate the approach for the determination of protein-protein complexes using the EIN/HPr complex as an example, and for establishing the domain-domain orientation in a chimeric protein, the recently determined hybrid human-Escherichia coli thioredoxin.

13.
14.
15.
16.
17.
The question of how best to compare and classify the (three‐dimensional) structures of proteins is one of the most important unsolved problems in computational biology. To help tackle this problem, we have developed a novel shape‐density superposition algorithm called 3D‐Blast, which represents and superposes the shapes of protein backbone folds using the spherical polar Fourier correlation technique originally developed by us for protein docking. The utility of this approach is compared with several well‐known protein structure alignment algorithms using receiver‐operator‐characteristic plots of queries against the “gold standard” CATH database. Despite being completely independent of protein sequences and using no information about the internal geometry of proteins, our results from searching the CATH database show that 3D‐Blast is highly competitive with current state‐of‐the‐art protein structure alignment algorithms. A novel and potentially very useful feature of our approach is that it allows an average or “consensus” fold to be calculated easily for a given group of protein structures. We find that using consensus shapes to represent entire fold families also gives very good database query performance. We propose that the notion of consensus fold shapes could provide a powerful new way to index existing protein structure databases, and that it offers an objective way to cluster and classify all of the currently known folds in the protein universe. Proteins 2012. © 2011 Wiley Periodicals, Inc.

18.
19.
Matching two geometric objects in two-dimensional (2D) and three-dimensional (3D) spaces is a central problem in computer vision, pattern recognition, and protein structure prediction. In particular, the problem of aligning two polygonal chains under translation and rotation to minimize their distance has been studied using various distance measures. It is well known that the Hausdorff distance is useful for matching two point sets, and that the Fréchet distance is a superior measure for matching two polygonal chains. The discrete Fréchet distance closely approximates the (continuous) Fréchet distance, and is a natural measure for the geometric similarity of the folded 3D structures of biomolecules such as proteins. In this paper, we present new algorithms for matching two polygonal chains in two dimensions to minimize their discrete Fréchet distance under translation and rotation, and an effective heuristic for matching two polygonal chains in three dimensions. We also describe our empirical results on the application of the discrete Fréchet distance to protein structure-structure alignment.
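
For readers unfamiliar with the measure, the discrete Fréchet distance between two fixed chains can be computed with the classic dynamic programming recurrence, sketched below. This only evaluates the distance for chains in given positions; it does not perform the minimization over translation and rotation that the paper's algorithms address. The example chains are invented.

```python
import math


def discrete_frechet(p, q):
    """Discrete Frechet distance between vertex sequences p and q.

    table[i][j] is the smallest achievable "leash length" for walking
    both chains in order from their starts to p[i] and q[j].
    """
    n, m = len(p), len(q)
    if n == 0 or m == 0:
        raise ValueError("both chains must contain at least one vertex")
    table = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                table[i][j] = d
            elif i == 0:
                table[i][j] = max(table[0][j - 1], d)
            elif j == 0:
                table[i][j] = max(table[i - 1][0], d)
            else:
                # Advance on p, on q, or on both; keep the cheapest history.
                best_previous = min(table[i - 1][j],
                                    table[i - 1][j - 1],
                                    table[i][j - 1])
                table[i][j] = max(best_previous, d)
    return table[n - 1][m - 1]


# Two parallel hypothetical 2D chains, one unit apart.
chain_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
chain_b = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
parallel = discrete_frechet(chain_a, chain_b)
```

The recurrence runs in time proportional to the product of the two chain lengths and works unchanged for 3D points, since `math.dist` accepts coordinates of any dimension.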

20.
A new intrinsic geometry based on a spectral analysis is used to motivate methods for aligning protein folds. The geometry is induced by the fact that a distance matrix can be scaled so that its eigenvalues are positive. We provide a mathematically rigorous development of the intrinsic geometry underlying our spectral approach and use it to motivate two alignment algorithms. The first uses eigenvalues alone and dynamic programming to quickly compute a fold alignment. Family identification results are reported for the Skolnick40 and Proteus300 data sets. The second algorithm extends our spectral method by iterating between our intrinsic geometry and the 3D geometry of a fold to make high-quality alignments. Results and comparisons are reported for several difficult fold alignments. The second algorithm's ability to correctly identify fold families in the Skolnick40 and Proteus300 data sets is also established.
