Similar articles
 Found 20 similar articles
1.
    
Computational approaches for predicting protein-protein interfaces are extremely useful for understanding and modelling the quaternary structure of protein assemblies. In particular, partner-specific binding site prediction methods allow delineating the specific residues that compose the interface of protein complexes. In recent years, new machine learning and other algorithmic approaches have been proposed to solve this problem. However, little effort has been made to find better training datasets to improve the performance of these methods. To demonstrate the importance of the training set compilation procedure, in this work we present BIPSPI+, a new version of our original server trained on carefully curated datasets that outperforms our original predictor. We show how prediction performance can be improved by selecting specific datasets that better describe particular types of protein interactions and interfaces (e.g. homo/hetero). In addition, our upgraded web server offers a new set of functionalities such as the sequence-structure prediction mode, hetero- or homo-complex specialization, and a guided docking tool that computes 3D quaternary structure poses using the predicted interfaces. BIPSPI+ is freely available at https://bipspi.cnb.csic.es.

2.
    
Polypharmacology, the ability of drugs to interact with multiple targets, is a fundamental concept of interest to the pharmaceutical industry in its efforts to address the rising cost of drug development and the decline in productivity. Polypharmacology has the potential to greatly benefit drug repurposing, bringing existing pharmaceuticals to market to treat different ailments more quickly and more affordably than developing new drugs, and may also facilitate the development of new, potent pharmaceuticals with reduced negative off-target effects and adverse side effects. Present-day computational power, when combined with applications such as supercomputer-based virtual high-throughput screening (docking), will enable these advances on a massive chemogenomic level, potentially transforming the pharmaceutical industry. However, while the potential of supercomputing-based drug discovery is unequivocal, the technical and fundamental challenges are considerable.

3.
    
Nuclear magnetic resonance (NMR) crystallography is one of the main methods in structural biology for analyzing protein stereochemistry and structure. The chemical shift of the resonance frequency reflects the effect of the protons in a molecule producing distinct NMR signals in different chemical environments. Apprehending chemical shifts from NMR signals can be challenging since having an NMR structure does not necessarily provide all the required chemical shift information, making predictive models essential for accurately deducing chemical shifts, either from protein structures or, more ideally, directly from amino acid sequences. Here, we present EFG-CS, a web server that specializes in chemical shift prediction. EFG-CS employs a machine learning-based transfer prediction model for backbone atom chemical shift prediction, using ESMFold-predicted protein structures. Additionally, EFG-CS incorporates a graph neural network-based model to provide comprehensive side-chain atom chemical shift predictions. Our method demonstrated reliable performance in backbone atom prediction, achieving comparable accuracy levels with root mean square errors (RMSE) of 0.30 ppm for H, 0.22 ppm for Hα, 0.89 ppm for C, 0.89 ppm for Cα, 0.84 ppm for Cβ, and 1.69 ppm for N. Moreover, our approach also showed predictive capabilities in side-chain atom chemical shift prediction, achieving RMSE values of 0.71 ppm for Hβ, 0.74–1.15 ppm for Hδ, and 0.58–0.94 ppm for Hγ, solely utilizing amino acid sequences without homology or feature curation. This work shows for the first time that generative AI protein models can predict NMR shifts nearly comparable to experimental models. This web server is freely available at https://biosig.lab.uq.edu.au/efg_cs, and the chemical shift prediction results can be downloaded in tabular format and visualized in 3D format.
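The RMSE figures quoted in this abstract follow the usual definition of root mean square error. A minimal sketch of that calculation is given below; the function name `rmse` and the shift values are hypothetical and chosen only for illustration, and nothing here is taken from the EFG-CS server itself.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between two equal-length sequences of shifts (ppm)."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical H-alpha shifts in ppm (illustration only, not data from the paper).
predicted_ha = [4.30, 4.10, 4.55, 3.95]
observed_ha = [4.50, 4.00, 4.25, 4.05]
print(f"H-alpha RMSE: {rmse(predicted_ha, observed_ha):.3f} ppm")
```

An RMSE is computed separately for each atom type, which is why the abstract reports one value per nucleus.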

4.
    
Molecular docking is a computational method for predicting the placement of ligands in the binding sites of their receptor(s). In this review, we discuss the methodological developments that occurred in the docking field in 2012 and 2013, with a particular focus on the more difficult aspects of this computational discipline. The main challenges and therefore focal points for developments in docking, covered in this review, are receptor flexibility, solvation, scoring, and virtual screening. We specifically deal with such aspects of molecular docking and its applications as selection criteria for constructing receptor ensembles, target dependence of scoring functions, integration of higher‐level theory into scoring, implicit and explicit handling of solvation in the binding process, and comparison and evaluation of docking and scoring methods. Copyright © 2015 John Wiley & Sons, Ltd.

5.
6.
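MicroRNAs (miRNAs) are a class of small regulatory RNAs that do not encode proteins and that perform broad and important regulatory functions in eukaryotes. Because miRNA expression is spatially and temporally specific, computational prediction followed by targeted experimental validation is an important route to miRNA discovery. Reducing the false positive rate is a major challenge for miRNA prediction methods. In this study, we used ensemble learning to build SVMbagging, a classifier for predicting miRNA precursors. Results on the training set, test set, and independent test set show that the method is robust, has a low false positive rate, and generalizes well. In particular, at a threshold of 0.9 the specificity reaches 99.90% with sensitivity above 26%, which makes the method suitable for genome-wide prediction. Applying SVMbagging to the whole human genome at a threshold of 0.9 yielded 14,933 candidate miRNA precursors. Comparison with high-throughput small RNA sequencing data showed that 4,481 of these precursors have perfectly matching small RNA sequences, a number very close to the theoretically estimated count of true positives. Finally, 32 candidate miRNAs were tested experimentally, and 2 of them were confirmed as genuine miRNAs.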
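The trade-off between sensitivity and specificity at a decision threshold can be sketched as follows. This is an illustration under stated assumptions, not the SVMbagging implementation: it assumes each candidate's score is the fraction of base classifiers voting "precursor", and the names `vote_fraction` and `sensitivity_specificity` as well as the vote data are hypothetical.

```python
def vote_fraction(votes):
    """Return the fraction of base classifiers that voted positive (1)."""
    return sum(votes) / len(votes)


def sensitivity_specificity(scores, labels, threshold):
    """Call a candidate positive when score >= threshold; return (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        called_positive = score >= threshold
        if label == 1 and called_positive:
            tp += 1
        elif label == 1:
            fn += 1
        elif called_positive:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Hypothetical votes from 10 base classifiers for 5 candidates (illustration only).
votes_per_candidate = [
    [1] * 10,            # true precursor, unanimous support
    [1] * 7 + [0] * 3,   # true precursor, weaker support
    [1] * 4 + [0] * 6,   # true precursor, minority support
    [1] * 6 + [0] * 4,   # background hairpin with moderate support
    [0] * 10,            # background hairpin, no support
]
labels = [1, 1, 1, 0, 0]
scores = [vote_fraction(v) for v in votes_per_candidate]

for threshold in (0.5, 0.9):
    sens, spec = sensitivity_specificity(scores, labels, threshold)
    print(f"threshold={threshold}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Raising the threshold removes false positives at the cost of missed true precursors. The abstract reports the same pattern, with very high specificity and modest sensitivity at a threshold of 0.9.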

7.
    

8.
    
It is difficult to properly validate algorithms that dock a small molecule ligand into its protein receptor using data from the public domain: the predictions are not blind because the correct binding mode is already known, and public test cases may not be representative of compounds of interest such as drug leads. Here, we use private data from a real drug discovery program to carry out a blind evaluation of the RosettaLigand docking methodology and find that its performance is on average comparable with that of the best commercially available current small molecule docking programs. The strength of RosettaLigand is the use of the Rosetta sampling methodology to simultaneously optimize protein sidechain, protein backbone and ligand degrees of freedom; the extensive benchmark test described here identifies shortcomings in other aspects of the protocol and suggests clear routes to improving the method.

9.
    
The present study illustrates the design and synthesis of a new series of 3-trifluoromethylpyrazole tethered chalcone-pyrrole and pyrazoline-pyrrole derivatives. All compounds were further screened for in vitro cytostatic activity on the full NCI 60 cancer cell line panel at the National Cancer Institute, USA. Compounds (2E)-3-(1H-pyrrol-2-yl)-1-{4-[3-(trifluoromethyl)-1H-pyrazol-1-yl]phenyl}prop-2-en-1-one ( 5a ) and (2E)-1-{3-methyl-4-[3-(trifluoromethyl)-1H-pyrazol-1-yl]phenyl}-3-(1H-pyrrol-2-yl)prop-2-en-1-one ( 5c ) displayed significant antiproliferative activity (Growth Percentage: −77.10 and −92.13, respectively, at 10 μM concentration) against the UO-31 renal cancer cell line and were further selected for assay at 10-fold dilutions of five different concentrations (10⁻⁴ to 10⁻⁸ M). Both compounds 5a and 5c exhibited promising antiproliferative activity (GI50: 1.36 to 0.27 μM) against the leukemia cell lines HL-60 and RPMI-8226, the colon cancer cell line KM-12, and the breast cancer cell line BT-549. Moreover, both compounds 5a and 5c were found to be non-cytotoxic (LC50>100) against the HL-60, RPMI-8226, and KM-12 cell lines. Remarkably, GI50 values of compounds 5a and 5c were identified as more promising than those of sunitinib against most cancer cell lines. In silico study of compounds 5a and 5c exemplified the desired ADME properties for drug-likeness as well as tighter interactions with VEGFR-2. Hence, compounds 5a and 5c would be good cytotoxic agents after further clinical study.

10.
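Protein S-sulfenylation is a reversible post-translational modification (PTM) that plays a vital role in biological growth and is also associated with several diseases. For both basic research and drug development, this raises a challenging question: which sites are S-sulfenylation sites? To address it, this paper develops a machine learning-based prediction method. The main steps are: (1) encoding the proteins as equal-length pseudo amino acid representations; (2) using downsampling to balance the training dataset; (3) building a combined prediction system with an ensemble method. The accuracy on the independent test set reached 90.77%, and the other metrics improved markedly over existing methods, which contributes to the development of bioinformatics. A user-friendly web server has been established at http://www.jci-bioinfo.cn/iSulf_Wide-PseAAC, where predictions can be made online without working through complex formulas; it offers users convenience and guidance for further research. The mathematical methods used in this paper may also help solve many other problems in related fields.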
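Steps (2) and (3) above, downsampling followed by an ensemble, can be sketched as building several balanced training subsets, one per ensemble member. This is a minimal sketch and not the iSulf_Wide-PseAAC code: the function name `balanced_subsets`, its parameters, and the sample identifiers are hypothetical, and the actual method may draw its subsets differently.

```python
import random


def balanced_subsets(positives, negatives, n_subsets, seed=0):
    """Build n_subsets balanced training sets by downsampling the larger class.

    Each subset contains every sample of the smaller class and an equally
    sized random draw (without replacement) from the larger class.
    """
    rng = random.Random(seed)
    n = min(len(positives), len(negatives))
    subsets = []
    for _ in range(n_subsets):
        pos = rng.sample(positives, n)  # all of the smaller class when n == len(positives)
        neg = rng.sample(negatives, n)  # a random n-sized draw from the larger class
        subsets.append((pos, neg))
    return subsets


# Hypothetical site identifiers: 3 modified sites versus 12 unmodified sites.
positives = ["P1", "P2", "P3"]
negatives = [f"N{i}" for i in range(1, 13)]

for pos, neg in balanced_subsets(positives, negatives, n_subsets=3, seed=42):
    print(sorted(pos), sorted(neg))
```

One classifier would then be trained on each balanced subset and their outputs combined, for example by majority vote, so that no single random draw of negatives dominates the final prediction.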

11.
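Lysine succinylation is a recently discovered post-translational modification that plays an important role in protein regulation and the control of cellular function, so accurately identifying succinylation sites in proteins is necessary. Traditional experiments are costly in materials and money, and computational prediction has recently been proposed as an efficient alternative. In this study, we developed a new prediction method, iSucc-PseAAC, by combining multiple classification algorithms with different feature extraction methods. We found that a support vector machine gave the best classification performance with features extracted from coupled sequences (PseAAC), and that combining it with ensemble learning addressed the data imbalance problem. Compared with existing methods, iSucc-PseAAC is more meaningful and practical for distinguishing lysine succinylation sites.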

12.

Background

State-of-the-art protein-ligand docking methods are generally limited by the traditionally low accuracy of their scoring functions, which are used to predict binding affinity and thus vital for discriminating between active and inactive compounds. Despite intensive research over the years, classical scoring functions have reached a plateau in their predictive performance. These assume a predetermined additive functional form for some sophisticated numerical features, and use standard multivariate linear regression (MLR) on experimental data to derive the coefficients.

Results

In this study we show that such a simple functional form is detrimental for the prediction performance of a scoring function, and replacing linear regression by machine learning techniques like random forest (RF) can improve prediction performance. We investigate the conditions of applying RF under various contexts and find that given sufficient training samples RF manages to comprehensively capture the non-linearity between structural features and measured binding affinities. Incorporating more structural features and training with more samples can both boost RF performance. In addition, we analyze the importance of structural features to binding affinity prediction using the RF variable importance tool. Lastly, we use Cyscore, a top performing empirical scoring function, as a baseline for comparison study.

Conclusions

Machine-learning scoring functions are fundamentally different from classical scoring functions because the former circumvent the fixed functional form relating structural features to binding affinities. RF, but not MLR, can effectively exploit more structural features and more training samples, leading to higher prediction performance. The future availability of more X-ray crystal structures will further widen the performance gap between RF-based and MLR-based scoring functions. This further stresses the importance of substituting RF for MLR in scoring function development.
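The contrast between the two kinds of scoring function can be written out in generic notation. The symbols below are assumptions introduced for illustration and are not taken from the article: $x_i$ are structural features, $w_i$ are regression coefficients, $\hat{y}$ is the predicted binding affinity, and $t_k$ is the $k$-th regression tree of a forest with $T$ trees.

```latex
% Classical scoring function: fixed additive form, coefficients fitted by MLR
\hat{y}_{\mathrm{MLR}} = w_0 + \sum_{i=1}^{n} w_i \, x_i

% Machine-learning scoring function: no functional form imposed in advance;
% for a random forest, the prediction is the average over its trees
\hat{y}_{\mathrm{RF}} = f(x_1, \dots, x_n) = \frac{1}{T} \sum_{k=1}^{T} t_k(x_1, \dots, x_n)
```

In the first form each feature can only shift the prediction linearly and independently of the others, while in the second the trees can represent non-linear dependence and interactions between features.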

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-291) contains supplementary material, which is available to authorized users.

13.
    
The global connectivities in very large protein similarity networks contain traces of evolution among the proteins that can be used to detect remote evolutionary relations or structural similarities. In investigating how well a protein network captures this evolutionary information, a key limitation is the intensive computation of pairwise sequence similarities needed to construct very large protein networks. In this article, we introduce label propagation on low-rank kernel approximation (LP-LOKA) for searching massively large protein networks. LP-LOKA propagates initial protein similarities in a low-rank graph by Nyström approximation without computing all pairwise similarities. With scalable parallel implementations based on distributed memory using the message-passing interface and on Apache Hadoop/Spark in the cloud, LP-LOKA can search protein networks with one million proteins or more. In experiments on Swiss-Prot/ADDA/CASP data, LP-LOKA significantly improved protein ranking over the widely used HMM-HMM or profile-sequence alignment methods by utilizing large protein networks. It was observed that the larger the protein similarity network, the better the performance, especially on relatively small protein superfamilies and folds. The results suggest that computing massively large protein networks is necessary to meet the growing need to annotate proteins from newly sequenced species, and that LP-LOKA is both scalable and accurate for searching such networks.
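The idea of propagating labels through a Nyström approximation, rather than through the full similarity matrix, can be sketched with the standard textbook forms below. The notation is assumed for illustration and may differ from the LP-LOKA formulation: $K$ is the $n \times n$ similarity matrix, $C$ its $n \times m$ block of similarities to $m \ll n$ landmark proteins, $W$ the $m \times m$ landmark block, $y$ the initial scores, and $\alpha$ a damping parameter.

```latex
% Nystrom approximation: only similarities to the m landmarks are computed
K \approx \tilde{K} = C \, W^{+} C^{\top}

% A common label propagation update with 0 < \alpha < 1
% (graph normalization omitted for brevity)
f^{(t+1)} = \alpha \, \tilde{K} f^{(t)} + (1 - \alpha) \, y

% Evaluated right to left, the product never forms an n x n matrix
\tilde{K} f = C \left( W^{+} \left( C^{\top} f \right) \right)
```

Under these assumptions each iteration costs $O(nm + m^2)$ operations instead of $O(n^2)$, which is the kind of saving that makes networks of a million proteins tractable.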

14.
15.
    
Antimicrobial resistance is a growing health concern. Antimicrobial peptides (AMPs) disrupt harmful microorganisms by nonspecific mechanisms, making it difficult for microbes to develop resistance. Accordingly, they are promising alternatives to traditional antimicrobial drugs. In this study, we developed an improved AMP classification model, called AMP-BERT. We propose a deep learning model with a fine-tuned bidirectional encoder representations from transformers (BERT) architecture designed to extract structural/functional information from input peptides and identify each input as AMP or non-AMP. We compared the performance of our proposed model and other machine/deep learning-based methods. Our model, AMP-BERT, yielded the best prediction results among all models evaluated with our curated external dataset. In addition, we utilized the attention mechanism in BERT to implement an interpretable feature analysis and determine the specific residues in known AMPs that contribute to peptide structure and antimicrobial function. The results show that AMP-BERT can capture the structural properties of peptides for model learning, enabling the prediction of AMPs or non-AMPs from input sequences. AMP-BERT is expected to contribute to the identification of candidate AMPs for functional validation and drug development. The code and dataset for the fine-tuning of AMP-BERT are publicly available at https://github.com/GIST-CSBL/AMP-BERT .

16.
Frizzled is the earliest discovered glycosylated Wnt protein receptor and is critical for the initiation of Wnt signaling. Antagonizing Frizzled is effective in inhibiting the growth of multiple tumor types. The extracellular N terminus of Frizzled contains a conserved cysteine-rich domain that directly interacts with Wnt ligands. Structure-based virtual screening and cell-based assays were used to identify five small molecules that can inhibit canonical Wnt signaling and have low IC50 values in the micromolar range. NMR experiments confirmed that these compounds specifically bind to the Wnt binding site on the Frizzled8 cysteine-rich domain with submicromolar dissociation constants. Our study confirms the feasibility of targeting the Frizzled cysteine-rich domain as an effective way of regulating canonical Wnt signaling. These small molecules can be further optimized into more potent therapeutic agents for regulating abnormal Wnt signaling by targeting Frizzled.

17.
18.
19.
Abstract

Lysine-specific demethylase 1 (LSD1) has been reported to be associated with a range of solid tumors. Thus, the exploration of LSD1 inhibitors has emerged as an effective strategy for cancer treatment. In this study, we constructed a pharmacophore model based on a series of flavin adenine dinucleotide (FAD)-competing inhibitors bearing a triazole–dithiocarbamate scaffold, combining docking, structure–activity relationship (SAR) study, and molecular dynamics (MD) simulation. Meanwhile, another pharmacophore model was constructed manually, relying on several speculated substrate-competing inhibitors and reported putative vital interactions with LSD1. On the basis of the two pharmacophore models, multi-step virtual screenings (VSs) were performed against the substrate-binding pocket and the FAD-binding pocket, respectively, combining pharmacophore-based and structure-based strategies to find novel LSD1 inhibitors. After bioassay evaluation, four compounds among 21 hits with diverse and novel scaffolds exhibited inhibitory activity in the range of 3.63–101.43 μM. Furthermore, substructure-based enrichment was performed, and four compounds with more potent activity were identified. After that, the time-dependent assay showed that the most potent compound (IC50 2.21 μM) inhibits LSD1 activity in a time-independent manner. In addition, the compound exhibited a cellular inhibitory effect against LSD1 in MGC-803 cells and may inhibit cell migration and invasion by reversing EMT in cultured gastric cancer cells. Considering the binding mode and SAR of this series of compounds, we tentatively conclude that the compounds containing the 3-methylxanthine scaffold act by competitively occupying the substrate-binding pocket. This study presents a new starting point for developing novel LSD1 inhibitors.

20.
    
Park H, Lee J, Lee S. Proteins, 2006, 65(3): 549-554
A major problem in virtual screening concerns the accuracy of the binding free energy between a target protein and a putative ligand. Here we report an example showing that the AutoDock scoring function outperforms those of other popular docking programs in virtual screening. The original AutoDock program is in itself too inefficient for virtual screening because the grids of interaction energy have to be calculated for each putative ligand in a chemical database. However, automating the AutoDock program with potential grids defined in common for all putative ligands leads to a more than twofold increase in the speed of virtual database screening. The utility of the automated AutoDock in virtual screening is further demonstrated by identifying the actual inhibitors of various target enzymes in chemical databases with higher accuracy than other docking tools, including DOCK and FlexX. These results exemplify the usefulness of the automated AutoDock as a promising new tool in structure-based virtual screening.
