Similar articles
 20 similar articles found
1.
A new topological method to measure protein structure similarity   Total citations: 5, self-citations: 0, citations by others: 5
A method for the quantitative evaluation of structural similarity between protein pairs is developed that makes use of a Delaunay-based topological mapping. The result of the mapping is a three-dimensional array which is representative of the global structural topology and whose elements can be used to construe an integral scoring scheme. This scoring scheme was tested for its dependence on the protein length difference in a pairwise comparison, its ability to provide a reasonable means for structural similarity comparison within a family of structural neighbors of similar length, and its sensitivity to the differences in protein conformation. It is shown that such a topological evaluation of similarity is capable of providing insight into these points of interest. Protein structure comparison using the method is computationally efficient and the topological scores, although providing different information about protein similarity, correlate well with the distance root-mean-square deviation values calculated by rigid-body structural alignment.

2.
A measure of similarity between amino acid residues based on the analysis of the surroundings of each residue in primary structures of native proteins is proposed. The statistical data used for this purpose were obtained from the analysis of 168,808 protein sequences, which comprise the Protein Identification Research database (release 63). Using various threshold values of the proposed measure, amino acid residues were classified into several groups. The resulting classification differs essentially from groupings previously used. The numerical measure of amino acid residue similarity can be used in site-directed mutagenesis studies to predict the probability of local spatial rearrangements in proteins.

3.
MOTIVATION: Microarray technology enables the study of gene expression on a large scale. The application of methods for data analysis then allows for grouping genes that show a similar expression profile and that are thus likely to be co-regulated. A relationship among genes at the biological level often presents itself by locally similar and potentially time-shifted patterns in their expression profiles. RESULTS: Here, we propose a new method (CLARITY; Clustering with Local shApe-based similaRITY) for the analysis of microarray time course experiments that uses a local shape-based similarity measure based on Spearman rank correlation. This measure does not require a normalization of the expression data and is comparably robust towards noise. It is also able to detect similar and even time-shifted sub-profiles. To this end, we implemented an approach motivated by the BLAST algorithm for sequence alignment. We used CLARITY to cluster the time series of gene expression data during the mitotic cell cycle of the yeast Saccharomyces cerevisiae. The obtained clusters were related to the MIPS functional classification to assess their biological significance. We found that several clusters were significantly enriched with genes that share similar or related functions.
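The idea of a local, time-shift-tolerant similarity based on Spearman rank correlation can be illustrated with a minimal sketch. This is not the CLARITY implementation: the BLAST-motivated search is replaced by a brute-force scan over window pairs, and the function names and the `window` and `max_shift` parameters are assumptions made for illustration.

```python
def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation is undefined here; 0.0 is an assumed convention
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def best_local_match(p, q, window=4, max_shift=2):
    """Scan all window pairs whose start offsets differ by at most max_shift.

    Returns (rho, start_in_p, start_in_q) for the best-correlated pair.
    """
    best = (-2.0, None, None)
    for i in range(len(p) - window + 1):
        for j in range(len(q) - window + 1):
            if abs(i - j) > max_shift:
                continue
            rho = spearman(p[i:i + window], q[j:j + window])
            if rho > best[0]:
                best = (rho, i, j)
    return best
```

Because only ranks enter the score, rescaling either profile by a monotone transformation leaves the result unchanged, which is consistent with the abstract's remark that no normalization of the expression data is required.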

4.

Background  

Gene Ontology (GO) is a standard vocabulary of functional terms and allows for coherent annotation of gene products. These annotations provide a basis for new methods that compare gene products regarding their molecular function and biological role.

5.

Background  

The Gene Ontology (GO) is a well known controlled vocabulary describing the biological process, molecular function and cellular component aspects of gene annotation. It has become a widely used knowledge source in bioinformatics for annotating genes and measuring their semantic similarity. These measures generally involve the GO graph structure, the information content of GO aspects, or a combination of both. However, only a few of the semantic similarity measures described so far can handle GO annotations differently according to their origin (i.e. their evidence codes).

6.

Background  

Predicting which molecules can bind to a given binding site of a protein with known 3D structure is important to decipher the protein function, and useful in drug design. A classical assumption in structural biology is that proteins with similar 3D structures have related molecular functions, and therefore may bind similar ligands. However, proteins that do not display any overall sequence or structure similarity may also bind similar ligands if they contain similar binding sites. Quantitatively assessing the similarity between binding sites may therefore be useful to propose new ligands for a given pocket, based on those known for similar pockets.

7.

Background  

The sequencing of the human genome has enabled us to access a comprehensive list of genes (both experimental and predicted) for further analysis. While a majority of the approximately 30000 known and predicted human coding genes are characterized and have been assigned at least one function, there remains a fair number of genes (about 12000) for which no annotation has been made. The recent sequencing of other genomes has provided us with a huge amount of auxiliary sequence data which could help in the characterization of the human genes. Clustering these sequences into families is one of the first steps to perform comparative studies across several genomes.

8.
9.
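Tao Hua (陶华), Tang Xuqing (唐旭清). 《生物信息学》, 2012, 10(4): 269-273, 279
Based on the granular space of fuzzy proximity relations, a cluster-structure analysis of protein sequences is carried out. The MEGA software is used to compute the alignment distances among selected xylanase sequences; an inner product is introduced to convert these distances into a fuzzy proximity relation (or matrix); an algorithm is then applied to obtain its granular space, which is used to analyze the cluster structure of the sequences and to determine the optimal clustering. These studies provide a tool for the quantitative analysis of protein sequences.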

10.
Here we propose a weighted measure for the similarity analysis of DNA sequences. It is based on LZ complexity and (0,1) characteristic sequences of DNA sequences. This weighted measure enables biologists to extract similarity information from biological sequences according to their requirements. For example, by this weighted measure, one can obtain either the full similarity information or a similarity analysis from a given biological aspect. Moreover, the length of DNA sequence is not problematic. The application of the weighted measure to the similarity analysis of β-globin genes from nine species shows its flexibility.
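The two ingredients named in this abstract, (0,1) characteristic sequences and LZ complexity, can be sketched as follows. The abstract does not give the weighting formula, so everything beyond the complexity count is an assumption: the three nucleotide groupings in `CLASSES`, the normalized distance in `lz_distance`, and the weighted sum in `weighted_distance` are illustrative choices, not the authors' definitions.

```python
# Assumed groupings for building (0,1) characteristic sequences:
# purines (A, G), amino bases (A, C), weak hydrogen bonding (A, T).
CLASSES = {"purine": "AG", "amino": "AC", "weak": "AT"}


def characteristic(seq, ones):
    """Map a DNA string to a (0,1) string: '1' where the base is in `ones`."""
    return "".join("1" if base in ones else "0" for base in seq)


def lz_complexity(s):
    """Count the phrases in the Lempel-Ziv (1976) exhaustive parsing of s."""
    i, count, n = 0, 0, len(s)
    while i < n:
        length = 1
        # Extend the phrase while it can be copied from the text seen so far
        # (everything up to, but not including, the phrase's last symbol).
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        count += 1
        i += length
    return count


def lz_distance(s, q):
    """Assumed normalized distance built from LZ complexity (non-empty inputs)."""
    cs, cq = lz_complexity(s), lz_complexity(q)
    return max(lz_complexity(s + q) - cs, lz_complexity(q + s) - cq) / max(cs, cq)


def weighted_distance(s, q, weights):
    """Hypothetical weighted sum of per-classification distances."""
    return sum(
        w * lz_distance(characteristic(s, CLASSES[name]),
                        characteristic(q, CLASSES[name]))
        for name, w in weights.items()
    )
```

Under these assumptions, putting all the weight on one classification, for example `{"purine": 1.0}`, compares the sequences from that single aspect, while spreading the weights over all three combines them.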

11.
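Similarity (or dissimilarity) analysis of biological sequences is an important method in bioinformatics research. Among the alignment-based methods for biological sequence similarity analysis, this paper focuses on comparison methods based on hidden Markov models, and compares the advantages and disadvantages of the various alignment-based sequence analysis methods.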

12.
We introduce a new variant of the root mean square distance (RMSD) for comparing protein structures whose range of values is independent of protein size. This new dimensionless measure (relative RMSD, or RRMSD) is zero between identical structures and one between structures that are as globally dissimilar as an average pair of random polypeptides of respective sizes. The RRMSD probability distribution between random polypeptides converges to a universal curve as the chain length increases. The correlation coefficients between aligned random structures are computed as a function of polypeptide size showing two characteristic lengths of 4.7 and 37 residues. These lengths mark the separation between phases of different structural order between native protein fragments. The implications for threading are discussed.

13.
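A six-dimensional representation of protein sequences is given. This representation has three different forms, which are used to construct the information entropy of distance matrices; the similarity between sequences is then compared using the Euclidean distance and the angle between the information-entropy vectors.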
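The final comparison step, Euclidean distance and angle between two feature vectors, can be sketched as below. The abstract does not say how the entropy vectors are built from the six-dimensional representation, so the function names and inputs here are assumptions; any two equal-length numeric vectors can be passed in.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def angle(u, v):
    """Angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = dot / (norm_u * norm_v)
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    return math.acos(max(-1.0, min(1.0, cosine)))
```

A smaller distance or a smaller angle indicates more similar vectors; the angle ignores overall vector length, whereas the Euclidean distance does not.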

14.
SUMMARY: The main source of hypotheses on the structure and function of new proteins is their homology to proteins with known properties. Homologous relationships are typically established through sequence similarity searches, multiple alignments and phylogenetic reconstruction. In cases where the number of potential relationships is large, for example in P-loop NTPases with many thousands of members, alignments and phylogenies become computationally demanding, accumulate errors and lose resolution. In search of a better way to analyze relationships in large sequence datasets we have developed a Java application, CLANS (CLuster ANalysis of Sequences), which uses a version of the Fruchterman-Reingold graph layout algorithm to visualize pairwise sequence similarities in either two-dimensional or three-dimensional space. AVAILABILITY: CLANS can be downloaded at http://protevo.eb.tuebingen.mpg.de/download.

15.
We introduce a new approach to compare DNA primary sequences. The core of our method is a new measure of pairwise distances among sequences. Using the primitive discrimination substrings of sequences S and Q, a discrimination measure DM(S, Q) is defined for their similarity analysis. The proposed method does not require multiple alignments and is fully automatic. To illustrate its utility, we construct phylogenetic trees on two independent data sets. The results indicate that the method is efficient and powerful.

16.
This work addresses the similarity of biological sequences. Based on the Burrows-Wheeler transform, a Burrows-Wheeler similarity distribution of two sequences is defined and used to compare them. Several distance measures follow naturally from the distribution. The expectation and entropy of the similarity distribution are used to construct phylogenetic trees on two independent data sets. The results demonstrate that the method is efficient and powerful.
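For readers unfamiliar with the underlying transform, here is a minimal sketch of the Burrows-Wheeler transform and its inverse. It covers only the transform itself; the similarity distribution derived from it is not defined in the abstract and is not attempted here. The `$` end marker is an assumption and must not occur in the input.

```python
def bwt(text, end="$"):
    """Burrows-Wheeler transform via sorted rotations (quadratic, for clarity)."""
    text += end  # unique end marker that sorts before the letters A-Z and a-z
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(transformed, end="$"):
    """Invert the transform by repeatedly prepending and re-sorting columns."""
    table = [""] * len(transformed)
    for _ in range(len(transformed)):
        table = sorted(ch + row for ch, row in zip(transformed, table))
    original = next(row for row in table if row.endswith(end))
    return original[:-1]  # drop the end marker
```

The transform groups symbols that share a right context, so repeated substrings show up as runs in the output. Practical implementations use suffix arrays rather than explicit rotations.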

17.
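A new method for computing the semantic similarity of gene annotations   Total citations: 1, self-citations: 1, citations by others: 0
The Gene Ontology (GO) database provides unified annotations for genes and effectively resolves inconsistencies in how different databases describe the same gene. However, the question of how to compare the functional similarity of genes on the basis of their annotations has not yet been solved effectively. This paper proposes a new method for computing the semantic similarity of gene annotations. The method is essentially based on the biological properties of genes. Its distinguishing feature is that the semantic similarity of nodes is independent of the sets containing them and depends only on their positions in the GO graph, so computed similarities can be reused. It takes into account both the depth of the GO nodes to which genes are mapped and the influence of all paths between two GO nodes on their semantic similarity. Experiments on the isoleucine degradation pathway and the glutamate biosynthesis pathway of yeast show that the algorithm computes the semantic similarity of gene annotations accurately.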

18.
A new method to measure the semantic similarity of GO terms   Total citations: 4, self-citations: 0, citations by others: 4

19.
20.
WSE, a new sequence distance measure based on word frequencies   Total citations: 1, self-citations: 0, citations by others: 1
In this article, we present a new distance metric, the Weighted Sequence Entropy (WSE), based on the short word composition of biological sequences. As a revision of the classical relative entropy (RE), our metric (1) works equivalently with RE in the case of small k, and (2) avoids the degeneracy that arises when some word types are absent in one sequence but not in the other. Experiments on 25 viruses including SARS-CoVs show that our method and RE give exactly the same phylogenetic tree when word length k ≤ 3. When k > 3, our method still works and yields a convergent phylogenetic topology, but RE gives degenerate results.
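The degeneracy of relative entropy mentioned in this abstract can be seen in a minimal sketch of k-word frequencies and classical RE. The WSE formula itself is not given in the abstract and is not reproduced here; the function names are assumptions for illustration.

```python
import math
from collections import Counter


def word_frequencies(seq, k):
    """Relative frequencies of the overlapping k-words in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def relative_entropy(p, q):
    """Classical relative entropy (Kullback-Leibler divergence) in bits.

    Returns math.inf when q lacks a word that p contains; this is the
    degenerate case the abstract refers to.
    """
    total = 0.0
    for word, pw in p.items():
        qw = q.get(word, 0.0)
        if qw == 0.0:
            return math.inf
        total += pw * math.log2(pw / qw)
    return total
```

As k grows, the number of possible words (4^k for DNA) quickly exceeds the number of words a sequence can contain, so absent words and infinite RE values become common. That matches the abstract's observation that RE degenerates for k > 3.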
