Similar articles
1.

Background  

For many metalloproteins, sequence motifs characteristic of metal-binding sites have not been found or are so short that they would not be expected to be metal-specific. Striking examples of such metalloproteins are those containing Mg2+, one of the most versatile metal cofactors in cellular biochemistry. Even when Mg2+-proteins share insufficient sequence homology to identify Mg2+-specific sequence motifs, they may still share similarity in the Mg2+-binding site structure. However, no structural motifs characteristic of Mg2+-binding sites have been reported. Thus, our aims are (i) to develop a general method for discovering structural patterns/motifs characteristic of ligand-binding sites, given the 3D protein structures, and (ii) to apply it to Mg2+-proteins sharing <30% sequence identity. Our motif discovery method employs structural alphabet encoding to convert 3D structures to the corresponding 1D structural letter sequences, where the Mg2+-structural motifs are identified as recurring structural patterns.
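
The last step described above, finding recurring patterns in 1D structural-letter sequences, can be illustrated with a minimal Python sketch. It assumes that each binding-site region has already been encoded as a string over some structural alphabet; the letters, the example strings, and the function name `recurring_patterns` are invented for illustration. It counts fixed-length windows that recur across sites and is not the authors' method.

```python
from collections import Counter

def recurring_patterns(encoded_sites, k=4, min_support=2):
    """Return k-letter patterns found in at least `min_support` sequences."""
    support = Counter()
    for seq in encoded_sites:
        # Use a set so each pattern counts once per sequence:
        # support = number of sites containing the pattern.
        windows = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        support.update(windows)
    return {p: n for p, n in support.items() if n >= min_support}

# Hypothetical structural-letter strings, one per binding-site region.
sites = ["ABCDDEF", "XXBCDDY", "QBCDDZZ", "MNOPQRS"]
patterns = recurring_patterns(sites, k=4, min_support=3)
print(patterns)
```

Counting support per site rather than per occurrence keeps one repetitive structure from dominating the result; a real application would also need to tolerate near-matches between structural letters.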

2.
3.
4.
5.
6.
7.
Discovering peptide ligands using epitope libraries.   Cited by: 9 (self-citations: 0; citations by others: 9)
Epitope libraries are large collections of peptides. Each peptide is displayed on the surface of a bacteriophage particle and is encoded by a randomly mutated region of the phage genome, thus associating each unique peptide with the DNA molecule encoding it. Antibodies and other binding proteins are used to select specifically for rare, phage-bearing peptide ligands; sequencing of the corresponding viral DNA will reveal their amino acid sequences. Relatively high-affinity peptides for a variety of peptide- and non-peptide-binding ligates have been affinity-isolated from epitope libraries. This technology has been used to map epitopes on proteins and to find peptide mimics for non-peptide-binding ligates. The current challenge lies in developing epitope library technology so that tight-binding peptide ligands can be detected for a wider variety of ligates, including those that recognize folded proteins. Should this be accomplished, many powerful applications can be envisioned in the areas of drug design and the development of diagnostic markers and vaccines.

8.
Finding motifs using random projections.   Cited by: 19 (self-citations: 0; citations by others: 19)

9.
MOTIVATION: Mining hereditary disease-genes from the human genome is one of the most important tasks in bioinformatics research. A variety of sequence features and functional similarities between known human hereditary disease-genes and those not known to be involved in disease have been systematically examined, and efficient classifiers have been constructed based on the identified common patterns. The availability of human genome-wide protein-protein interactions (PPIs) provides a new opportunity for discovering hereditary disease-genes from topological features of the PPI network. RESULTS: This analysis reveals that the hereditary disease-genes ascertained from OMIM in the literature-curated (LC) PPI network are characterized by a larger degree, a tendency to interact with other disease-genes, more common neighbors, and quick communication with each other, whereas those properties could not be detected in the network identified by the high-throughput yeast two-hybrid mapping approach (EXP) or in the predicted-interaction (PDT) PPI network. A KNN classifier based on those features was created and achieved, on average, an overall prediction accuracy of 0.76 in cross-validation tests. The classifier was then applied to 5262 genes in the human genome and predicted 178 novel disease-genes. Some of the predictions have been validated by biological experiments.
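
As a rough illustration of classifying genes by network topology, the Python sketch below computes two of the features named above (degree and number of known disease-gene neighbors) on a toy graph and classifies a gene by k-nearest-neighbor majority vote. The graph, gene names, labels, and the choice of only two features are assumptions made for the example; the abstract does not specify the distance measure or the value of k used in the study.

```python
from collections import Counter

def topological_features(graph, gene, disease_genes):
    """Degree and number of known disease-gene neighbours for one gene."""
    neighbours = graph.get(gene, set())
    return (len(neighbours), len(neighbours & disease_genes))

def knn_predict(train, query, k=3):
    """Majority vote among the k training points closest to `query`."""
    by_distance = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Toy undirected interaction network (hypothetical gene names).
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"),
         ("E", "F"), ("G", "A"), ("G", "B"), ("H", "E")]
graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

known_disease = {"A", "B", "C"}      # assumed labels for the example
known_other = {"E", "F", "H"}
train = [(topological_features(graph, g, known_disease), "disease")
         for g in sorted(known_disease)]
train += [(topological_features(graph, g, known_disease), "non-disease")
          for g in sorted(known_other)]

query = topological_features(graph, "G", known_disease)
prediction = knn_predict(train, query, k=3)
print(query, prediction)
```

In practice the features would need scaling before computing distances, since degree can vary over orders of magnitude in a real interaction network.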

10.
11.
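Identifying motifs in biological sequences is a core problem of the post-genomic era. This paper first reviews several major motif-finding algorithms and then, in light of the importance and randomness of motifs, introduces two representative network-based methods for identifying them. The first builds a random-network mixture model and uses the EM algorithm to identify the random network motifs within it; the second uses a modified parametric flow algorithm to filter out the maximum-density subgraph, which corresponds to the biological sequence motif. The strengths and weaknesses of the two methods are pointed out, and directions for future research are discussed.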

12.
Carbohydrates, or glycans, are among the most abundant and structurally diverse biopolymers and constitute the third major class of biomolecules, following DNA and proteins. The study of carbohydrate sugar chains has nevertheless lagged behind that of DNA and proteins, mainly because of their inherent structural complexity. Their analysis is important because they serve various roles in biological processes, including signal transduction and cellular recognition. To shed light on glycan function based on carbohydrate structure, kernel methods have been developed, in particular to extract potential glycan biomarkers by classifying glycan structures found in different tissue samples. The recently developed weighted q-gram method (LK-method) performs well on glycan structure classification but has limitations in feature selection: it was unable to extract biologically meaningful features from the data. Therefore, we propose a biochemically weighted tree kernel (BioLK-method), which is based on a glycan similarity matrix and also incorporates biochemical information about individual q-grams when constructing the kernel matrix. We applied the new method to the classification and recognition of motifs in publicly available glycan data. Used with a Support Vector Machine (SVM), the BioLK-method detects biologically important motifs accurately, whereas the LK-method failed to do so. It was tested on three glycan data sets from the Consortium for Functional Glycomics (CFG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) GLYCAN, and the results are consistent with the literature. The BioLK-method also maintains classification performance comparable to that of the LK-method.
These results indicate that incorporating biochemical information about q-grams makes the kernel more flexible and capable in feature extraction, which may aid in the prediction of glycan biomarkers.
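
To make the notion of a weighted q-gram kernel on glycan trees more concrete, here is a minimal Python sketch. It treats a glycan as a rooted tree of monosaccharide labels, takes a q-gram to be a downward path of q nodes, and computes a kernel value as a weighted sum over shared q-grams. The tree encoding, the `weights` dictionary (a stand-in for biochemical weights), and the example glycans are assumptions for illustration; the abstract does not give the exact LK or BioLK definitions, so this should not be read as either method.

```python
from collections import Counter

# A tree node is (label, [children]); labels are monosaccharide names.
def qgrams(tree, q):
    """Count all downward label paths containing exactly q nodes."""
    counts = Counter()

    def paths_from(node, length):
        label, children = node
        if length == 1:
            return [(label,)]
        return [(label,) + rest
                for child in children
                for rest in paths_from(child, length - 1)]

    def visit(node):
        counts.update(paths_from(node, q))
        for child in node[1]:
            visit(child)

    visit(tree)
    return counts

def weighted_qgram_kernel(t1, t2, q=2, weights=None):
    """Sum of weight * count1 * count2 over q-grams shared by both trees."""
    c1, c2 = qgrams(t1, q), qgrams(t2, q)
    weights = weights or {}  # unlisted q-grams get weight 1.0
    return sum(weights.get(g, 1.0) * c1[g] * c2[g]
               for g in c1.keys() & c2.keys())

# Hypothetical toy glycans.
g1 = ("GlcNAc", [("Man", [("Man", []), ("Man", [])])])
g2 = ("GlcNAc", [("Man", [("Gal", [])])])

print(weighted_qgram_kernel(g1, g2))
print(weighted_qgram_kernel(g1, g2, weights={("GlcNAc", "Man"): 2.0}))
```

Because the kernel is an inner product of weighted q-gram counts, each q-gram's contribution stays interpretable, which is what makes feature extraction from such a kernel possible in principle.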

13.
Through extensive experiment, simulation, and analysis of protein S6 (1RIS), we find that variations in nucleation and folding pathway between circular permutations are determined principally by the restraints of topology and specific nucleation, and affected by changes in chain entropy. Simulations also relate topological features to experimentally measured stabilities. Despite many sizable changes in phi values and the structure of the transition state ensemble that result from permutation, we observe a common theme: the critical nucleus in each of the mutants shares a subset of residues that can be mapped to the critical nucleus residues of the wild-type. Circular permutations create new N and C termini, which are the location of the largest disruption of the folding nucleus, leading to a decrease in both phi values and the role in nucleation. Mutant nuclei are built around the wild-type nucleus but are biased towards different parts of the S6 structure depending on the topological and entropic changes induced by the location of the new N and C termini.

14.
15.
16.
17.
In pathogenic bacteria, point and other simple mutations can provide a strong selective advantage during the course of a single infection. Our understanding of the importance of these randomly occurring mutations has been hampered by a lack of technologies allowing mutation scanning on a genomic scale. Here, a novel technology is described that makes it possible to scan, in a single Southern blot experiment, the sequence identity of genomic regions with a combined length of hundreds of kilobases.

18.
19.
This paper takes a new view of motif discovery, addressing a common problem in existing motif finders. A motif is treated as a feature of the input promoter regions that leads to a good classifier between these promoters and a set of background promoters. This perspective allows us to adapt existing methods of feature selection, a well-studied topic in machine learning, to motif discovery. We develop a general algorithmic framework that can be specialized to work with a wide variety of motif models, including consensus models with degenerate symbols or mismatches, and composite motifs. A key feature of our algorithm is that it measures overrepresentation while maintaining information about the distribution of motif instances in individual promoters. The assessment of a motif's discriminative power is normalized against chance behaviour by a probabilistic analysis. We apply our framework to two popular motif models and are able to detect several known binding sites in sets of co-regulated genes in yeast.
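
The view of a motif as a feature that separates input promoters from background promoters can be sketched in Python as follows. The example uses a consensus motif with a mismatch budget and scores it by a hypergeometric tail probability over promoters that contain at least one instance. The hypergeometric test is an assumed stand-in for the paper's probabilistic analysis, which the abstract does not specify, and the sequences and function names are invented.

```python
from math import comb

def contains(promoter, motif, max_mismatches=0):
    """True if some window of the promoter matches the motif
    within the allowed number of mismatches."""
    m = len(motif)
    return any(
        sum(a != b for a, b in zip(promoter[i:i + m], motif)) <= max_mismatches
        for i in range(len(promoter) - m + 1)
    )

def enrichment_p_value(motif, foreground, background, max_mismatches=0):
    """Hypergeometric tail: probability of at least the observed number
    of foreground promoters containing the motif, by chance."""
    k = sum(contains(p, motif, max_mismatches) for p in foreground)
    hits = k + sum(contains(p, motif, max_mismatches) for p in background)
    n = len(foreground)
    total = n + len(background)
    tail = sum(comb(hits, i) * comb(total - hits, n - i)
               for i in range(k, min(n, hits) + 1))
    return tail / comb(total, n)

# Hypothetical toy promoters.
foreground = ["AACGTGA", "TTCGTGC", "CGTGAAA"]
background = ["AAAAAAA", "TTTTTTT", "ACGTGTT", "GGGGGGG", "CCCCCCC"]

p = enrichment_p_value("CGTG", foreground, background)
print(p)
```

Scoring per promoter rather than per occurrence is one simple way to keep a single promoter with many motif copies from inflating the overrepresentation estimate.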

20.
