Similar literature
 20 similar articles found.
1.
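Taking a drug from development to clinical use takes a long time, and investment during development can exceed one billion yuan. As pharmaceutical R&D merges with artificial intelligence and bioinformatics advances rapidly, data on drug activity have grown sharply, and traditional experimental approaches to activity prediction can no longer meet the needs of drug development. Using algorithms to assist drug development and to address its various problems can greatly accelerate the process. Traditional machine learning methods, especially random forests, support vector machines, and artificial neural networks, achieve high accuracy in predicting drug activity. Deep learning models, with their multi-layer neural networks, accept high-dimensional input variables without manually specified input features and can fit fairly complex functions, so applying them to drug development can further improve efficiency at each stage. The deep learning models most widely used for drug activity prediction are deep neural networks (DNN), recurrent neural networks (RNN), and autoencoders (AE), while generative adversarial networks (GAN), because they can generate data, are often combined with other models for data augmentation. This review of recent research and applications of deep learning in predicting the activity of drug molecules indicates that deep learning models are more accurate and efficient than both traditional experimental methods and traditional machine learning methods. Deep learning models are therefore expected to become the most important computational aid in drug development over the next decade.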

2.
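Protein structure prediction is important for understanding the structural composition and biological function of proteins, and secondary structure prediction is a key step within it. Now that the position-specific scoring matrix (PSSM) is widely used to encode primary sequences as input samples, each residue can be represented as a two-dimensional data plane, so the authors train a convolutional neural network (CNN) on this representation. They also design a second CNN in which a long short-term memory (LSTM) network reads the horizontal and vertical features of the final convolutional feature maps and, together with the CNN's fully connected layer, performs the classification. Finally, an ensemble method combines the two types of CNN model; the final ensemble contains six models of the two types and achieves a Q3 of 77.2 on the CB513 protein dataset.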
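For readers unfamiliar with the Q3 metric quoted above, the following is a minimal sketch of how a three-state accuracy is usually computed: the percentage of residues whose helix (H), strand (E), or coil (C) label is predicted correctly. The function name `q3_score` and the string encoding of the labels are assumptions made for illustration; the abstract does not describe the authors' evaluation code.

```python
def q3_score(true_states, predicted_states):
    """Return the percentage of residues whose three-state label matches.

    Both arguments are strings over the alphabet H (helix), E (strand),
    C (coil); this encoding is an illustrative assumption.
    """
    if len(true_states) != len(predicted_states):
        raise ValueError("label strings must have the same length")
    if not true_states:
        raise ValueError("label strings must not be empty")
    # Count positions where the predicted state equals the true state.
    correct = sum(t == p for t, p in zip(true_states, predicted_states))
    return 100.0 * correct / len(true_states)


if __name__ == "__main__":
    # Three of the four residues are labelled correctly.
    print(q3_score("HHEC", "HHCC"))
```

On a benchmark such as CB513 the score is typically aggregated over all residues of all proteins, which is why per-protein averages can differ slightly from the pooled value.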

3.
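Nucleic acid aptamers are single-stranded oligonucleotides obtained by systematic evolution of ligands by exponential enrichment (SELEX) that bind protein targets with high specificity and affinity. Aptamers share the recognition properties of antibodies while offering distinctive advantages of their own, and they are now used in fields such as analytical testing, food safety, and biomedicine. Proteins have diverse biological functions and clinical diagnostic value, so aptamers directed at protein targets have attracted wide attention in protein-related basic research. How well an aptamer performs in application depends on its affinity and specificity for the target protein. This paper reviews methods for characterizing the affinity of aptamers for protein targets, and their applications in drug development, tumor detection, bioimaging, and biosensors.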

4.
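Predicting the subcellular localization of proteins is important for studying protein function, interactions, and regulatory mechanisms. This paper reduces the amino acid alphabet according to physicochemical and structural properties, describes local and global sequence information with "composition", "transition", and "distribution" features, and adds numerical statistics of amino acid hydrophobicity and hydrophilicity to propose a new protein feature representation (NSBH). Three classifiers (KNN, SVM, and a BP neural network) are used to predict subcellular localization. A comparison of the individual methods with feature fusion shows that the fused representation combined with the SVM classifier achieves the best prediction accuracy. The influence of different parameters on the results is also discussed in detail, and the experiments and comparisons demonstrate the effectiveness of the method.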
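The "composition", "transition", and "distribution" features mentioned above can be illustrated with a short sketch. It assumes a widely used three-class reduction of the amino acid alphabet by hydrophobicity (polar, neutral, hydrophobic); the abstract does not say which reductions NSBH uses, so the grouping, the function names, and the quantile convention below are illustrative assumptions, not the authors' method.

```python
import math

# Assumed three-class reduction by hydrophobicity (hypothetical for NSBH).
GROUPS = {"1": "RKEDQN", "2": "GASTPHY", "3": "CLVIMFW"}
AA_TO_CLASS = {aa: c for c, aas in GROUPS.items() for aa in aas}


def reduce_sequence(seq):
    """Map each residue to its class label; unknown residues are skipped."""
    return "".join(AA_TO_CLASS[aa] for aa in seq.upper() if aa in AA_TO_CLASS)


def composition(red):
    """Fraction of residues in each class."""
    return [red.count(c) / len(red) for c in "123"]


def transition(red):
    """Fraction of adjacent pairs that switch between two given classes."""
    pairs = list(zip(red, red[1:]))
    if not pairs:
        return [0.0, 0.0, 0.0]
    return [
        sum(1 for x, y in pairs if {x, y} == {a, b}) / len(pairs)
        for a, b in (("1", "2"), ("1", "3"), ("2", "3"))
    ]


def distribution(red):
    """Relative positions of the first, 25%, 50%, 75%, and last occurrence
    of each class (ceil-based quantile convention, an assumption)."""
    out = []
    for c in "123":
        positions = [i + 1 for i, x in enumerate(red) if x == c]
        if not positions:
            out.extend([0.0] * 5)
            continue
        for frac in (0.0, 0.25, 0.5, 0.75, 1.0):
            k = max(1, math.ceil(frac * len(positions)))
            out.append(positions[k - 1] / len(red))
    return out


def ctd_features(seq):
    """Concatenate the 3 + 3 + 15 = 21 descriptors for one sequence."""
    red = reduce_sequence(seq)
    if not red:
        raise ValueError("sequence contains no recognised residues")
    return composition(red) + transition(red) + distribution(red)
```

A vector like this could then be passed to any of the classifiers named in the abstract (KNN, SVM, or a BP neural network); fusing it with other descriptors would simply mean concatenating the vectors.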

5.
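Discovering potential drug targets using bioinformatics methods   Total citations: 2 (self-citations: 0, citations by others: 2)
A drug target is usually a key molecule in a metabolic or signaling pathway that is associated with a specific disease or pathological state. Drugs are designed to inhibit this key molecule by binding to a specific active region. Identifying the target molecules associated with a given disease is the foundation of modern drug development. Bioinformatics methods play an irreplaceable role in drug target discovery and are especially suited to analyzing large-scale multi-omics data. Many disease-related database resources have now emerged, and a variety of bioinformatics methods based on biological network features, multiple gene chips (microarrays), proteomics, and metabolomics data have been established to discover potential drug targets and to predict target druggability and drug side effects.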

6.
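The use of drugs has greatly improved the quality of human life. Drug efficacy is a key issue in drug discovery research and is judged by identifying the target proteins on which a drug acts. However, determining compound-target protein interactions experimentally by high-throughput screening is expensive, time-consuming, and challenging. Computational prediction of compound-target protein interactions is efficient and inexpensive and is attracting growing attention. Compared with experimental validation methods, computational methods can supply more accurate candidate compound-target pairs for the subsequent biopharmaceutical experiments in drug discovery, reducing the time and cost of biological experiments. This paper reviews the biomedical feature data, prediction methods, and techniques used over the past 20 years in computational prediction of compound-target protein interactions, and analyzes the problems encountered, such as high-dimensional sparse biomedical feature data and insufficient fusion of multi-source biomedical data, to provide a useful reference for further research.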

7.
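In recent years, continued advances in computer hardware, software tools, and data abundance have expanded the application of artificial intelligence, represented by machine learning, across biology, basic medicine, and pharmacy, greatly advancing these fields and in particular transforming drug development. Identifying drug-target interactions (DTI) is a major challenge in drug development and a popular area for applying artificial intelligence. Researchers have done extensive work on DTI prediction, building many important databases and developing or extending machine learning algorithms and software tools. This article introduces the basic workflow of machine-learning-based DTI prediction, reviews studies that use machine learning to predict DTI, and briefly summarizes the strengths and weaknesses of different machine learning methods for this task, with the aim of supporting the development of more effective prediction algorithms and of DTI prediction in general.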

8.
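Studying the structure and affinity of protein-ligand interactions not only helps clarify protein function but is also very important for drug development and for research on mechanisms of drug action. Through manual and semi-automatic retrieval, a large amount of protein-ligand affinity information and biologically relevant ligand information has been collected from the literature and the Protein Data Bank (PDB), and many protein-ligand interaction databases have been built. This article introduces three protein-ligand affinity databases and six databases of the biological relevance of ligands in protein crystal structures, and briefly describes their main applications, in the hope of helping to screen and design drugs efficiently and accurately.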

9.
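Developing drugs with entirely new structures is slow, costly, and risky. Drug repositioning, which uses various techniques to predict new indications for existing drugs, can shorten development time and reduce cost and risk. Because the numbers of diseases and known drugs are both large, screening known drugs for new uses purely by experiment remains very costly. With the accumulation of omics and pharmacoinformatics data, drug repositioning has entered a stage that combines rational design with experimental screening, and computational prediction for drug repositioning has become an important research direction in computational and systems biology. This paper groups current computational strategies for drug repositioning into drug-target, drug-drug, and drug-disease relationship analysis, and reviews the reported methods and examples of their successful application.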

10.
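Many secondary metabolites of microorganisms are bioactive small molecules that play important roles in medicine and agriculture. Driven by genomics, proteomics, and bioinformatics, new methods for identifying the targets of small-molecule drugs have emerged. These methods search for targets mainly on the basis of properties such as gene or protein expression levels in cells and protein affinity and stability. Advances in target identification have accelerated the elucidation of the mechanisms of action of small-molecule drugs and provide technical support for discovering new target resources and for further screening of more active drugs.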

11.
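Xia Binbin, Wang Jun. 《生物工程学报》 (Chinese Journal of Biotechnology), 2021, 37(11): 3863-3879
With the massive accumulation of protein sequence and structure data, and a large body of descriptive information already in hand, researchers urgently need ways to use these data effectively, extracting information efficiently from existing data and applying it to downstream tasks. Protein design frees the development of new proteins from the limits of experimental conditions, which is important for drug target prediction, new drug development, materials design, and related fields. Deep learning is an efficient method for extracting features from data; it can be used to model protein data and then, with prior information added, to design proteins. Deep-learning-based protein design has therefore become a research area with broad prospects. This article describes deep-learning-based methods for modeling and designing with protein sequence and structure data, detailing their strategies, principles, scope of application, and application examples. It also discusses the prospects and limitations of deep learning in this field, to provide a reference for related research.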

12.
Background: Similarity-based computational methods are a useful tool for predicting protein functions from protein–protein interaction (PPI) datasets. Although various similarity-based prediction algorithms have been proposed, they have often produced unsatisfactory results. The purpose of this type of algorithm is to predict the functions of an unannotated protein from the functions of proteins that are similar to it. The prediction quality therefore depends largely on how a proper set of proteins (i.e., a prediction domain) is selected from which the functions of an unannotated protein are predicted, and on how the similarity between proteins is measured. Another issue with existing algorithms is that they treat function prediction as a one-off procedure, ignoring the fact that interactions among proteins are mutual and dynamic in terms of similarity when predicting functions. How to resolve these major issues to increase prediction quality remains a challenge in computational biology. Results: In this paper, we propose an innovative approach to predict the functions of unannotated proteins iteratively from a PPI dataset. The iterative approach takes into account the mutual and dynamic features of protein interactions when predicting functions, and addresses the issues of protein similarity measurement and prediction domain selection by introducing into the prediction algorithm a new semantic protein similarity and a method of selecting a multi-layer prediction domain. The new protein similarity is based on the multi-layered information carried by protein functions. Evaluations conducted on real protein interaction datasets demonstrated that the proposed iterative function prediction method outperformed other similar or non-iterative methods and provided better prediction results. Conclusions: The new protein similarity derived from multi-layered information of protein functions more reasonably reflects the intrinsic relationships among proteins, and prediction quality can be improved significantly by incorporating the mutual and dynamic features of protein interactions into the prediction algorithm.

13.
Accurate identification of compound–protein interactions (CPIs) in silico may deepen our understanding of the underlying mechanisms of drug action and thus remarkably facilitate drug discovery and development. Conventional similarity- or docking-based computational methods for predicting CPIs rarely exploit latent features from currently available large-scale unlabeled compound and protein data and often limit their usage to relatively small-scale datasets. In the present study, we propose DeepCPI, a novel general and scalable computational framework that combines effective feature embedding (a technique of representation learning) with powerful deep learning methods to accurately predict CPIs at a large scale. DeepCPI automatically learns the implicit yet expressive low-dimensional features of compounds and proteins from a massive amount of unlabeled data. Evaluations of the measured CPIs in large-scale databases, such as ChEMBL and BindingDB, as well as of the known drug–target interactions from DrugBank, demonstrated the superior predictive performance of DeepCPI. Furthermore, several interactions among small-molecule compounds and three G protein-coupled receptor targets (glucagon-like peptide-1 receptor, glucagon receptor, and vasoactive intestinal peptide receptor) predicted using DeepCPI were experimentally validated. The present study suggests that DeepCPI is a useful and powerful tool for drug discovery and repositioning. The source code of DeepCPI can be downloaded from https://github.com/FangpingWan/DeepCPI.

14.
Models capable of predicting the possible involvement of cytochromes P450 in the metabolism of drugs or drug candidates are important tools in drug discovery and development. Ideally, functional information would be obtained from crystal structures of all the cytochromes P450 of interest. Initially, only crystal structures of distantly related bacterial cytochromes P450 were available; comparative modeling techniques were used to bridge the gap and produce structural models of human cytochromes P450, and thereby obtain some useful functional information. A significant step forward in the reliability of these models came four years ago with the first crystal structure of a mammalian cytochrome P450, rabbit CYP2C5, followed by the structures of two human enzymes, CYP2C8 and CYP2C9, and a second rabbit enzyme, CYP2B4. The evolution of a CYP2D6 model, leading to the validation of the model as an in silico tool for predicting binding and metabolism, is presented as a case study.

15.
Fragment-based drug design (FBDD) is currently being implemented in drug discovery, creating a demand for efficient techniques for fragment screening. Because fragments bind their targets weakly or transiently (dissociation constants (KD) in the mM–μM range), methods must be sensitive enough to accurately detect and quantify an interaction. This study presents weak affinity chromatography (WAC) as an alternative tool for screening small fragments. The technology was demonstrated by screening a selected 23-compound fragment collection of documented binders, mostly amidines, using trypsin and thrombin as model target proteases. WAC proved to be a sensitive, robust, and reproducible technique that also provides information about the affinity of a fragment in the range of 1 mM–10 μM. Furthermore, it has potential for high throughput, as evidenced by the analysis of mixtures of about 10 substances by WAC–MS. The accessibility and flexibility of the technology were shown by the fact that fragment screening can be performed on standard HPLC equipment. The technology can further be miniaturized and adapted to the affinity ranges required by the fragment library. All these features make WAC a potential method for fragment screening in drug discovery.

16.
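SVM-based drug target prediction method and its application   Total citations: 1 (self-citations: 0, citations by others: 1)
Objective: To develop a new, effective drug target prediction method that combines SVM with the primary-structure similarity between known drug targets and potential drug target proteins. Methods: A training sample set was constructed, primary-structure features were extracted from the protein sequences, the data were preprocessed, the optimal kernel function was selected, parameters were optimized and features selected, the optimal prediction model was trained, and its predictive performance was tested. Using proteins of the G protein-coupled receptor family as the prediction set, the optimal classification model was applied to mine potential drug targets. Results: The optimal SVM-based classification model achieved an average prediction accuracy of 81.03%. When the optimal classifier was applied to the constructed G protein prediction set, several of the 20 top-ranked proteins were found to be disease-related. In particular, two of these G proteins are listed in the Therapeutic Target Database (TTD) as drug targets in clinical trials. Conclusion: Drug target prediction based on SVM and protein sequence features is effective, and the potential targets it predicts can serve as a reference for discovering new drug targets.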

17.
mirSVR is a new machine learning method for ranking microRNA target sites by a down-regulation score. The algorithm trains a regression model on sequence and contextual features extracted from miRanda-predicted target sites. In a large-scale evaluation, miRanda-mirSVR is competitive with other target prediction methods in identifying target genes and predicting the extent of their down-regulation at the mRNA or protein levels. Importantly, the method identifies a significant number of experimentally determined non-canonical and non-conserved sites.

18.
Teng S, Luo H, Wang L. Amino Acids, 2012, 43(1): 447-455
Protein sumoylation is a post-translational modification that plays an important role in a wide range of cellular processes. Small ubiquitin-related modifier (SUMO) can be covalently and reversibly conjugated to the sumoylation sites of target proteins, many of which are implicated in various human genetic disorders. The accurate prediction of protein sumoylation sites may help biomedical researchers to design their experiments and understand the molecular mechanism of protein sumoylation. In this study, a new machine learning approach has been developed for predicting sumoylation sites from protein sequence information. Random forests (RFs) and support vector machines (SVMs) were trained with the data collected from the literature. Domain-specific knowledge in terms of relevant biological features was used for input vector encoding. It was shown that RF classifier performance was affected by the sequence context of sumoylation sites, and 20 residues with the core motif ΨKXE in the middle appeared to provide enough context information for sumoylation site prediction. The RF classifiers were also found to outperform SVM models for predicting protein sumoylation sites from sequence features. The results suggest that the machine learning approach gives rise to more accurate prediction of protein sumoylation sites than the other existing methods. The accurate classifiers have been used to develop a new web server, called seeSUMO (http://bioinfo.ggc.org/seesumo/), for sequence-based prediction of protein sumoylation sites.
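The idea of a 20-residue context window with the ΨKXE core motif in the middle can be sketched as follows. This is only an illustration of the windowing step and is not the seeSUMO implementation: the set of residues accepted as Ψ (`PSI`), the padding character, and the function name `motif_windows` are all assumptions.

```python
import re

# Assumption: residues accepted as the hydrophobic position Ψ.
PSI = "IVLMF"
# Lookahead so that overlapping motif occurrences are all reported.
MOTIF = re.compile(rf"(?=([{PSI}]K[A-Z]E))")


def motif_windows(seq, flank=8, pad="-"):
    """Return (lysine_position, window) for every Ψ-K-X-E match.

    lysine_position is 1-based. Each window holds `flank` residues on
    either side of the 4-residue motif (20 residues for flank=8) and is
    padded with `pad` near the sequence termini.
    """
    seq = seq.upper()
    padded = pad * flank + seq + pad * flank
    windows = []
    for match in MOTIF.finditer(seq):
        start = match.start()          # 0-based index of Ψ in seq
        lysine_position = start + 2    # 1-based index of the K residue
        # Index `start` in `padded` lies `flank` residues before Ψ.
        window = padded[start:start + 4 + 2 * flank]
        windows.append((lysine_position, window))
    return windows


if __name__ == "__main__":
    for position, window in motif_windows("GGGGGGGGGGLKTEGGGGGGGGGG"):
        print(position, window)
```

Each window would still need to be encoded numerically (for example with residue-level features) before being passed to a random forest or SVM classifier.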

19.
Jie Hou, Tianqi Wu, Renzhi Cao, Jianlin Cheng. Proteins, 2019, 87(12): 1165-1178
Predicting residue-residue distance relationships (e.g., contacts) has become the key direction for advancing protein structure prediction since the 2014 CASP11 experiment, while deep learning has revolutionized the technology for contact and distance distribution prediction since its debut in the 2012 CASP10 experiment. During the 2018 CASP13 experiment, we enhanced our MULTICOM protein structure prediction system with three major components: contact distance prediction based on deep convolutional neural networks, distance-driven template-free (ab initio) modeling, and protein model ranking empowered by deep learning and contact prediction. Our experiment demonstrates that contact distance prediction and deep learning methods are the key reasons that MULTICOM was ranked 3rd out of all 98 predictors in both template-free and template-based structure modeling in CASP13. Deep convolutional neural networks can utilize global information in pairwise residue-residue features such as coevolution scores to substantially improve contact distance prediction, which played a decisive role in correctly folding some free modeling and hard template-based modeling targets. Deep learning also successfully integrated one-dimensional structural features, two-dimensional contact information, and three-dimensional structural quality scores to improve protein model quality assessment, where contact prediction was demonstrated for the first time to consistently enhance the ranking of protein models. The success of the MULTICOM system clearly shows that protein contact distance prediction and model selection driven by deep learning hold the key to solving the protein structure prediction problem. However, there are still challenges in accurately predicting protein contact distance when there are few homologous sequences, folding proteins from noisy contact distances, and ranking models of hard targets.

20.
Techniques for predicting interactions between a drug and a target (protein) are useful for strategic drug repositioning. Neighborhood regularized logistic matrix factorization (NRLMF) is one of the state-of-the-art drug–target interaction prediction methods; it is based on a statistical model using the Bernoulli distribution. However, the prediction is not accurate when drug–target interaction pairs have little interaction information (e.g., the sum of the number of ligands for a target and the number of target proteins for a drug). This study aimed to address this issue by proposing NRLMF with beta distribution rescoring (NRLMFβ), an algorithm that improves the NRLMF score. The NRLMFβ score is equivalent to the original NRLMF score when the concentration of the beta distribution tends to infinity. The beta distribution is the conjugate prior of the Bernoulli distribution, and under Bayesian inference its shape can reflect the amount of interaction information. Therefore, in NRLMFβ, the beta distribution was used for rescoring the NRLMF score. In the evaluation experiment, we measured the average values of the area under the receiver operating characteristic curve and the area under the precision-recall curve, together with their 95% confidence intervals. NRLMFβ performed better than NRLMF on the four types of benchmark datasets. Thus, we concluded that NRLMFβ improved the prediction accuracy of NRLMF. The source code is available at https://github.com/akiyamalab/NRLMFb.
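The role of the beta distribution as a conjugate prior for the Bernoulli distribution can be illustrated with a generic Beta-Bernoulli update. The sketch below is not the NRLMFβ formula, which the abstract does not give; it only shows, under stated assumptions, how a base score can be treated as the mean of a beta prior whose concentration controls how strongly observed counts shift the score. The function name `beta_rescore` and its parameters are hypothetical.

```python
def beta_rescore(score, concentration, positives, trials):
    """Posterior mean of a Bernoulli rate under a beta prior.

    Assumptions (illustrative, not taken from the NRLMFβ paper):
    the prior is Beta(alpha, beta) with mean `score` and
    alpha + beta = `concentration`; `positives` successes were
    observed in `trials` Bernoulli trials.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    if concentration <= 0:
        raise ValueError("concentration must be positive")
    if not 0 <= positives <= trials:
        raise ValueError("need 0 <= positives <= trials")
    alpha = score * concentration
    beta = (1.0 - score) * concentration
    # Conjugacy: the posterior is Beta(alpha + positives,
    # beta + trials - positives), whose mean is returned here.
    return (alpha + positives) / (alpha + beta + trials)


if __name__ == "__main__":
    # A weak prior is pulled towards the observed rate of 3/4.
    print(beta_rescore(0.5, 2.0, 3, 4))
    # A very concentrated prior stays close to the base score.
    print(beta_rescore(0.5, 1e12, 3, 4))
```

In this generic setting the posterior mean approaches the base score as the concentration grows, which mirrors the limiting behaviour the abstract describes for NRLMFβ.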
