Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Correctly predicting off-targets for a given molecular structure, which would have the ability to bind a large range of ligands, is both particularly difficult and important if they share no significant sequence or fold similarity with the respective molecular target (“distant off-targets”). A novel approach for identifying off-targets by direct superposition of protein binding pocket surfaces is presented and applied to a set of well-studied and highly relevant drug targets, including representative kinases and nuclear hormone receptors. The entire Protein Data Bank was searched for similar binding pockets, and convincing distant off-target candidates were identified that share no significant sequence or fold similarity with the respective target structure. These putative target/off-target pairs are further supported by the existence of compounds that bind strongly to both with high topological similarity and, in some cases, by literature examples of individual compounds that bind to both. Our results also clearly show that binding pockets can exhibit a striking surface similarity even when the off-target shares neither significant sequence nor significant fold similarity with the respective molecular target.

2.
We present a new microRNA target prediction algorithm called TargetBoost, and show that the algorithm is stable and identifies more true targets than do existing algorithms. TargetBoost uses machine learning on a set of validated microRNA targets in lower organisms to create weighted sequence motifs that capture the binding characteristics between microRNAs and their targets. Existing algorithms require candidates to have (1) near-perfect complementarity between microRNAs' 5' end and their targets; (2) relatively high thermodynamic duplex stability; (3) multiple target sites in the target's 3' UTR; and (4) evolutionary conservation of the target between species. Most algorithms use one of the first two requirements in a seeding step, and use the three others as filters to improve the method's specificity. The initial seeding step determines an algorithm's sensitivity and also influences its specificity. As all algorithms may add filters to increase the specificity, we propose that methods should be compared before such filtering. We show that TargetBoost's weighted sequence motif approach is favorable to using both the duplex stability and the sequence complementarity steps. (TargetBoost is available as a Web tool from http://www.interagon.com/demo/.)
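
The seeding step described in requirement (1) can be illustrated with a minimal sketch. It is not TargetBoost's weighted-motif method; it assumes the common convention that the seed is miRNA nucleotides 2-8 and that only perfect Watson-Crick matches count. The function names and the example sequences are hypothetical.

```python
from typing import List

# Watson-Crick pairs for RNA (G:U wobble pairs are deliberately ignored here)
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna: str, start: int = 1, end: int = 8) -> str:
    """Return the mRNA motif (5'->3') that pairs with the miRNA seed.

    With the default bounds the seed is nucleotides 2-8 of the miRNA,
    which is an assumption of this sketch, not a rule from the abstract.
    """
    seed = mirna[start:end]
    # Base pairing is antiparallel, so take the reverse complement
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def find_seed_matches(mirna: str, utr: str) -> List[int]:
    """Return 0-based start positions of perfect seed matches in a 3' UTR."""
    site = seed_site(mirna)
    positions = []
    pos = utr.find(site)
    while pos != -1:
        positions.append(pos)
        pos = utr.find(site, pos + 1)  # allow overlapping sites
    return positions


# Illustrative sequences only
mirna = "UGAGGUAGUAGGUUGUAUAGUU"
utr = "AAACUACCUCAAGGCUACCUCA"
print(seed_site(mirna), find_seed_matches(mirna, utr))
```

A seeding step of this kind fixes the candidate set, which is why the abstract argues that methods should be compared before conservation or site-count filters are applied.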

3.
Gene identification in genomic DNA from eukaryotes is complicated by the vast combinatorial possibilities of potential exon assemblies. If the gene encodes a protein that is closely related to known proteins, gene identification is aided by matching similarity of potential translation products to those target proteins. The genomic DNA and protein sequences can be aligned directly by scoring the implied residues of in-frame nucleotide triplets against the protein residues in conventional ways, while allowing for long gaps in the alignment corresponding to introns in the genomic DNA. We describe a novel method for such spliced alignment. The method derives an optimal alignment based on scoring for both sequence similarity of the predicted gene product to the protein sequence and intrinsic splice site strength of the predicted introns. Application of the method to a representative set of 50 known genes from Arabidopsis thaliana showed significant improvement in prediction accuracy compared to previous spliced alignment methods. The method is also more accurate than ab initio gene prediction methods, provided sufficiently close target proteins are available. In view of the fast growth of public sequence repositories, we argue that close targets will be available for the majority of novel genes, making spliced alignment an excellent practical tool for high-throughput automated genome annotation.

4.
COVID-19 is a rapidly emerging infectious disease caused by the SARS-CoV-2 virus currently spreading throughout the world. To date, there are no specific drugs formulated for it, and researchers around the globe are racing against the clock to investigate potential drug candidates. The repurposing of existing drugs in the market represents an effective and economical strategy commonly utilized in such investigations. In this study, we used a multiple-sequence alignment approach for preliminary screening of commercially available drugs on SARS-CoV sequences from the Kingdom of Saudi Arabia (KSA) isolates. The viral genomic sequences from KSA isolates were obtained from GISAID, an open access repository housing a wide variety of epidemic and pandemic virus data. A phylogenetic analysis of the present 164 sequences from the KSA provinces was carried out using the MEGA X software, which displayed high similarity (around 98%). The sequence was then analyzed using the VIGOR4 genome annotator to construct its genomic structure. Screening of existing drugs was carried out by mining data based on viral gene expressions from the ZINC database. A total of 73 hits were generated. The viral target orthologs were mapped to the SARS-CoV-2 KSA isolate sequence by multiple sequence alignment using CLUSTAL OMEGA, and a list of 29 orthologs with purchasable drug information was generated. The results showed that the SARS-CoV replicase polyprotein 1a had the highest sequence similarity at 79.91%. Through ZINC data mining, tanshinones were found to have high binding affinities to this target. These compounds could be ideal candidates for SARS-CoV-2. Other matches ranged between 27 and 52%. The results of this study would serve as a significant endeavor towards drug discovery that would increase our chances of finding an effective treatment or prevention against COVID-19.

5.
Plant microRNAs (miRNAs) affect only a small number of targets with high sequence complementarity, while animal miRNAs usually have hundreds of targets with limited complementarity. We used artificial miRNAs (amiRNAs) to determine whether the narrow action spectrum of natural plant miRNAs reflects only intrinsic properties of the plant miRNA machinery or whether it is also due to past selection against natural miRNAs with broader specificity. amiRNAs were designed to target individual genes or groups of endogenous genes. Like natural miRNAs, they had varying numbers of target mismatches. Previously determined parameters of target selection for natural miRNAs could accurately predict direct targets of amiRNAs. The specificity of amiRNAs, as deduced from genome-wide expression profiling, was as high as that of natural plant miRNAs, supporting the notion that extensive base pairing with targets is required for plant miRNA function. amiRNAs make an effective tool for specific gene silencing in plants, especially when several related, but not identical, target genes need to be downregulated. We demonstrate that amiRNAs are also active when expressed under tissue-specific or inducible promoters, with limited nonautonomous effects. The design principles for amiRNAs have been generalized and integrated into a Web-based tool (http://wmd.weigelworld.org).

6.
MicroRNAs: something new under the sun   (cited 3 times: 0 self-citations, 3 by others)
MicroRNAs are plentiful in plants, as in animals. The effects of mutations which disrupt their processing imply that miRNAs have important roles in plant development. Although the targets of these miRNAs are still not known, excellent candidates have been identified based on sequence similarity to the miRNAs.

7.
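An SVM-based method for drug target prediction and its application   (cited 1 time: 0 self-citations, 1 by others)
Objective: To develop a new and effective drug target prediction method that combines the primary-structure similarity between known drug targets and potential drug target proteins with support vector machine (SVM) techniques. Methods: A training sample set was constructed, primary-structure features were extracted from the protein sequences, and the data were preprocessed; the optimal kernel function was selected, parameters were optimized, feature selection was performed, and the optimal prediction model was trained and its predictive performance tested. Using proteins of the G protein-coupled receptor family as the prediction set, the resulting optimal classification model was applied to mine potential drug targets. Results: The optimal SVM-based classification model achieved an average prediction accuracy of 81.03%. When the optimal classifier was applied to the constructed G protein prediction set, several of the 20 top-ranked proteins were found to be disease-related. Notably, two of these G proteins are listed in the Therapeutic Target Database (TTD) as drug targets in clinical trials. Conclusion: The drug target prediction method based on SVM and protein sequence features is effective, and the potential drug targets it predicts can serve as a reference for discovering new drug targets.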
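
The abstract does not say which primary-structure features were extracted. As one plausible illustration only, the sketch below computes amino acid composition, a simple sequence-derived feature vector of the kind often fed to an SVM. The function name and the choice of feature are assumptions, not the authors' method.

```python
from collections import Counter
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def composition(sequence: str) -> List[float]:
    """Return the fraction of each standard residue in a protein sequence.

    Non-standard symbols (for example X or gaps) are skipped. This feature
    choice is an assumption made for illustration.
    """
    seq = sequence.upper()
    counts = Counter(res for res in seq if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    # Fixed residue order so every protein maps to a comparable 20-dim vector
    return [counts[aa] / total for aa in AMINO_ACIDS]


features = composition("MKTAYIAKQR")  # toy sequence, not a real target
print(len(features))
```

Each protein becomes a fixed-length vector regardless of its sequence length, which is what makes kernel selection and parameter optimization possible in the later steps.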

8.
Structure‐based drug design tries to mutually map pharmacological space populated by putative target proteins onto chemical space comprising possible small molecule drug candidates. Both spaces are connected where proteins and ligands recognize each other: in the binding pockets. Therefore, it is highly relevant to study the properties of the space composed by all possible binding cavities. In the present contribution, a global mapping of protein cavity space is presented by extracting consensus cavities from individual members of protein families and clustering them in terms of their shape and exposed physicochemical properties. Discovered similarities indicate common binding epitopes in binding pockets independent of any possibly given similarity in sequence and fold space. Unexpected links between remote targets indicate possible cross‐reactivity of ligands and suggest putative side effects. The global clustering of cavity space is compared to a similar clustering of sequence and fold space and compared to chemical ligand space spanned by the chemical properties of small molecules found in binding pockets of crystalline complexes. The overall similarity architecture of sequence, fold, and cavity space differs significantly. Similarities in cavity space can be mapped best to similarities in ligand binding space indicating possible cross‐reactivities. Most cross‐reactivities affect co‐factor and other endogenous ligand binding sites. Proteins 2009. © 2008 Wiley‐Liss, Inc.

9.
Fast and effective prediction of microRNA/target duplexes   (cited 32 times: 1 self-citation, 31 by others)

10.
Katara P, Grover A, Kuntal H, Sharma V. Protoplasma 2011, 248(4): 799-804
Identification of potential drug targets is the first step in the process of modern drug discovery, subject to their validation and drug development. Whole genome sequences of a number of organisms allow prediction of potential drug targets using sequence comparison approaches. Here, we present a subtractive approach exploiting the knowledge of global gene expression along with sequence comparisons to predict the potential drug targets more efficiently. Based on the knowledge of 155 known virulence genes and their coexpressed genes mined from a microarray database in the public domain, 357 coexpressed probable virulence genes for Vibrio cholerae were predicted. Based on screening of the Database of Essential Genes using blastn, a total of 102 genes out of these 357 were listed as vitally essential genes, and hence good putative drug targets. As an effective drug target is a protein that is present only in the pathogen, a similarity search of these 102 essential genes against the human genome sequence led to the subtraction of 66 genes, leaving a subset of 36 genes whose products are called potential drug targets. Gene ontology analysis of these 36 genes using Blast2GO revealed their roles in important metabolic pathways of V. cholerae or on the surface of the pathogen. Thus, we propose that the products of these genes be evaluated as target sites of drugs against V. cholerae in future investigations.
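
The subtractive filtering in this abstract reduces to two set operations once the similarity searches have been run. The sketch below assumes the blastn hits are already available as sets of gene identifiers; the function name and the placeholder IDs are hypothetical, and only the counts (357, 102, 66, 36) come from the abstract.

```python
from typing import Set


def subtract_targets(
    candidates: Set[str], essential: Set[str], human_homologs: Set[str]
) -> Set[str]:
    """Keep candidate genes that are essential and lack a human homolog."""
    essential_candidates = candidates & essential  # step 1: essentiality screen
    return essential_candidates - human_homologs   # step 2: subtract host-like genes


# Placeholder identifiers chosen only to reproduce the counts in the abstract
candidates = {f"gene{i}" for i in range(357)}     # coexpressed probable virulence genes
essential = {f"gene{i}" for i in range(102)}      # hits in the essential-gene screen
human_homologs = {f"gene{i}" for i in range(66)}  # genes similar to human sequences

targets = subtract_targets(candidates, essential, human_homologs)
print(len(targets))  # 36, i.e. 102 - 66
```

Keeping the two steps separate makes it easy to report the intermediate essential set as well as the final target list.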

11.
12.
MOTIVATION: Most computational methodologies for miRNA:mRNA target gene prediction use the seed segment of the miRNA and require cross-species sequence conservation in this region of the mRNA target. Methods that do not rely on conservation generate too many predictions to validate. We describe a target prediction method (NBmiRTar) that does not require sequence conservation and instead uses machine learning by a naïve Bayes classifier. It generates a model from sequence and miRNA:mRNA duplex information from validated targets and artificially generated negative examples. Both the 'seed' and 'out-seed' segments of the miRNA:mRNA duplex are used for target identification. RESULTS: The application of machine-learning techniques to the features we have used is a useful and general approach for microRNA target gene prediction. Our technique produces fewer false positive predictions and fewer target candidates to be tested. It exhibits higher sensitivity and specificity than algorithms that rely on conserved genomic regions to decrease false positive predictions.

13.
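Phage display of a random peptide library was used to screen for short peptides that bind specifically to monocytes/macrophages in sepsis, with the aim of exploring new approaches to sepsis treatment. Cells of the human peripheral blood monocyte line THP-1 treated with lipopolysaccharide (LPS) served as the target cells for screening, and THP-1 cells not treated with LPS served as the adsorption cells for non-specific phage. A phage-displayed random cyclic heptapeptide library was subjected to four rounds of subtractive screening. Positive phage clones were verified by cell ELISA, the positive clones were analyzed by DNA sequencing and bioinformatics, and immunofluorescence was then used to confirm the binding specificity of the phage clones for LPS-treated THP-1 cells. After four rounds of screening, randomly picked phage clones were sequenced, yielding peptides that bind specifically to LPS-treated THP-1 cells. The non-redundant heptapeptides were analyzed by Clustal W multiple sequence alignment and BlastP protein homology analysis, and cell immunofluorescence confirmed that the phage-displayed heptapeptides bind specifically to LPS-treated THP-1 cells. Phage-displayed random peptide library technology provides an efficient and rapid system for screening surface targets on monocytes/macrophages in sepsis. The peptide motifs obtained are highly conserved and cell-specific, and the biological activity of these peptides will be the subject of further study.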

14.
The majority of existing computational tools rely on sequence homology and/or structural similarity to identify novel microRNA (miRNA) genes. Recently, supervised algorithms have been used to address this problem, taking into account sequence, structure and comparative genomics information. In most of these studies miRNA gene predictions are rarely supported by experimental evidence and prediction accuracy remains uncertain. In this work we present a new computational tool (SSCprofiler) utilizing a probabilistic method based on Profile Hidden Markov Models to predict novel miRNA precursors. Via the simultaneous integration of biological features such as sequence, structure and conservation, SSCprofiler achieves a performance accuracy of 88.95% sensitivity and 84.16% specificity on a large set of human miRNA genes. The trained classifier is used to identify novel miRNA gene candidates located within cancer-associated genomic regions and rank the resulting predictions using expression information from a full genome tiling array. Finally, four of the top scoring predictions are verified experimentally using northern blot analysis. Our work combines both analytical and experimental techniques to show that SSCprofiler is a highly accurate tool which can be used to identify novel miRNA gene candidates in the human genome. SSCprofiler is freely available as a web service at http://www.imbb.forth.gr/SSCprofiler.html.
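
Sensitivity and specificity, the two figures reported for SSCprofiler, follow directly from a classifier's confusion counts. The sketch below shows the standard definitions; the counts in the usage lines are made up for illustration and are not the study's data.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true miRNA precursors that the classifier recovers."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of non-miRNA sequences that the classifier correctly rejects."""
    return tn / (tn + fp)


# Hypothetical confusion counts, for illustration only
print(f"sensitivity = {sensitivity(tp=90, fn=10):.2%}")
print(f"specificity = {specificity(tn=80, fp=20):.2%}")
```

Reporting both values matters because a predictor can trivially maximize one at the expense of the other.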

15.
Tuncbag N, Keskin O, Nussinov R, Gursoy A. Proteins 2012, 80(4): 1239-1249
The similarity between folding and binding led us to posit the concept that the number of protein-protein interface motifs in nature is limited, and interacting protein pairs can use similar interface architectures repeatedly, even if their global folds completely vary. Thus, known protein-protein interface architectures can be used to model the complexes between two target proteins on the proteome scale, even if their global structures differ. This powerful concept is combined with a flexible refinement and global energy assessment tool. The accuracy of the method is highly dependent on the structural diversity of the interface architectures in the template dataset. Here, we validate this knowledge-based combinatorial method on the Docking Benchmark and show that it efficiently finds high-quality models for benchmark complexes and their binding regions even in the absence of template interfaces having sequence similarity to the targets. Compared to "classical" docking, it is computationally faster; as the number of target proteins increases, the difference becomes more dramatic. Further, it is able to distinguish binders from nonbinders. These features allow performing large-scale network modeling. The results on an independent target set (proteins in the p53 molecular interaction map) show that the current method can be used to predict whether a given protein pair interacts. Overall, while constrained by the diversity of the template set, this approach efficiently produces high-quality models of protein-protein complexes. We expect that with the growing number of known interface architectures, this type of knowledge-based method will be increasingly used by the broad proteomics community.

16.
The New York Consortium on Membrane Protein Structure (NYCOMPS), a part of the Protein Structure Initiative (PSI) in the USA, has as its mission to establish a high-throughput pipeline for determination of novel integral membrane protein structures. Here we describe our current target selection protocol, which applies structural genomics approaches informed by the collective experience of our team of investigators. We first extract all annotated proteins from our reagent genomes, i.e. the 96 fully sequenced prokaryotic genomes from which we clone DNA. We filter this initial pool of sequences and obtain a list of valid targets. NYCOMPS defines valid targets as those that, among other features, have at least two predicted transmembrane helices, no predicted long disordered regions and, except for community-nominated targets, no significant sequence similarity in the predicted transmembrane region to any known protein structure. Proteins that feed our experimental pipeline are selected by defining a protein seed and searching the set of all valid targets for proteins that are likely to have a transmembrane region structurally similar to that of the seed. We require sequence similarity aligning at least half of the predicted transmembrane region of seed and target. Seeds are selected according to their feasibility and/or biological interest, and they include both centrally selected targets and community-nominated targets. As of December 2008, over 6,000 targets have been selected and are currently being processed by the experimental pipeline. We discuss how our target list may impact structural coverage of the membrane protein space.

17.
With the completion of the Human Genome Project in 2003, many new projects to sequence bacterial genomes were started and soon many complete bacterial genome sequences were available. The sequenced genomes of pathogenic bacteria provide useful information for understanding host-pathogen interactions. These data prove to be a new weapon in fighting against pathogenic bacteria by providing information about potential drug targets. But the limitation of computational tools for finding potential drug targets has hindered the process and further experimental analysis. Many in silico approaches have been proposed for finding drug targets, but only a few have been automated. One such approach finds essential genes in bacterial genomes with no human homologue and predicts these as potential drug targets. The same approach is used in our tool. T-iDT, a tool for the identification of drug targets, finds essential genes by comparing a bacterial gene set against DEG (Database of Essential Genes) and excludes homologous genes by comparing against a human protein database. The tool predicts both the set of essential genes and the potential target genes for the given genome. The tool was tested with Mycobacterium tuberculosis and results were validated. With default parameters, the tool predicted 236 essential genes and 52 genes to encode potential drug targets. A pathway-based approach was used to validate these potential drug target genes. The pathways in which the products of these genes are involved were determined. Our analysis shows that almost all these pathways are essential for bacterial survival and hence these genes encode possible drug targets. Our tool provides a fast method for finding possible drug targets in bacterial genomes with varying stringency levels. The tool will be helpful in finding possible drug targets in various pathogenic organisms and can be used for further analysis in novel therapeutic drug development. The tool can be downloaded from http://www.milser.co.in/research.htm and http://www.srmbioinformatics.edu.in/forum.htm.

18.
19.
Sequence logos are frequently used to illustrate substrate preferences and specificity of proteases. Here, we employed the compiled substrates of the MEROPS database to introduce a novel metric for comparison of protease substrate preferences. The constructed similarity matrix of 62 proteases can be used to intuitively visualize similarities in protease substrate readout via principal component analysis and construction of protease specificity trees. Since our new metric is solely based on substrate data, we can engraft the protease tree including proteolytic enzymes of different evolutionary origin. Thereby, our analyses confirm pronounced overlaps in substrate recognition not only between proteases closely related on sequence basis but also between proteolytic enzymes of different evolutionary origin and catalytic type. To illustrate the applicability of our approach we analyze the distribution of targets of small molecules from the ChEMBL database in our substrate-based protease specificity trees. We observe a striking clustering of annotated targets in tree branches even though these grouped targets do not necessarily share similarity on protein sequence level. This highlights the value and applicability of knowledge acquired from peptide substrates in drug design of small molecules, e.g., for the prediction of off-target effects or drug repurposing. Consequently, our similarity metric allows mapping of the degradome and its associated drug target network via comparison of known substrate peptides. The substrate-driven view of protein-protein interfaces is not limited to the field of proteases but can be applied to any target class where a sufficient amount of known substrate data is available.

20.