1.
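Based on the view that the subcellular location of an apoptosis protein is determined mainly by its amino acid sequence, and using the n-peptide composition of local amino acid sequences together with the hydropathy distribution of the sequence, an algorithm combining the increment of diversity with a support vector machine (ID_SVM) was used to predict the subcellular locations of six classes of apoptosis proteins. In the re-substitution and jackknife tests, the overall prediction success rates of ID_SVM reached 94.6% and 84.2%, respectively; in the 5-fold and 10-fold tests, the overall success rates also exceeded 83%. A comparison of the predictive ability of ID and ID_SVM shows that combining the increment of diversity with a support vector machine improves the success rate, indicating that ID_SVM is an effective method for predicting the subcellular location of apoptosis proteins.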
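The abstract above does not spell out the increment of diversity (ID). As a minimal sketch, assuming the commonly used definition D(X) = N ln N - Σ n_i ln n_i for a count vector X and ID(X, Y) = D(X + Y) - D(X) - D(Y), the quantity can be computed as follows. The count vectors are hypothetical, and the feature construction used in the paper (n-peptide and hydropathy counts) is not reproduced here.

```python
import math


def diversity(counts):
    """Diversity measure D(X) = N*ln(N) - sum(n_i*ln(n_i)); 0*ln(0) is taken as 0."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(n * math.log(n) for n in counts if n > 0)


def increment_of_diversity(x, y):
    """ID(X, Y) = D(X + Y) - D(X) - D(Y) for two count vectors over the same categories."""
    if len(x) != len(y):
        raise ValueError("count vectors must have the same length")
    merged = [a + b for a, b in zip(x, y)]
    return diversity(merged) - diversity(x) - diversity(y)


# Hypothetical counts, for illustration only.
class_profile = [12, 3, 7, 0, 5]   # counts accumulated over one training class
query = [2, 1, 1, 0, 1]            # counts from a single query sequence
print(increment_of_diversity(class_profile, query))
```

Under this definition a smaller ID means the two count profiles are more alike. In a hybrid scheme such as ID_SVM, the ID values of a query against each class profile would serve as input features to the SVM rather than as the final decision.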
2.
Ion channels are integral membrane proteins that control movement of ions into or out of cells. They are key components in a wide range of biological processes. Different types of ion channels have different biological functions. With the appearance of vast proteomic data, it is highly desirable for both basic research and drug-target discovery to develop a computational method for the reliable prediction of ion channels and their types. In this study, we developed a support vector machine-based method to predict ion channels and their types using primary sequence information. A feature selection technique, analysis of variance (ANOVA), was introduced to remove feature redundancy and find an optimized feature set for improving predictive performance. Jackknife cross-validated results show that the proposed method can discriminate ion channels from non-ion channels with an overall accuracy of 86.6%, classify voltage-gated ion channels and ligand-gated ion channels with an overall accuracy of 92.6%, and predict four types (potassium, sodium, calcium and anion) of voltage-gated ion channels with an overall accuracy of 87.8%. These results indicate that the proposed method can correctly identify ion channels and provide useful guidance for drug-target discovery. The predictor can be freely downloaded from http://cobi.uestc.edu.cn/people/hlin/tools/IonchanPred/.
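ANOVA-based feature selection, as mentioned above, is usually done by ranking each feature by its one-way F statistic (between-class variance over within-class variance) and keeping the top-ranked ones. The sketch below shows that ranking step only. The function names and toy data are hypothetical, and the cut-off used in the paper is not stated in the abstract.

```python
def anova_f_score(groups):
    """One-way ANOVA F statistic for one feature; `groups` holds one list of values per class."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    if ss_within == 0:
        # Degenerate case: no within-class spread at all.
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_features(samples, labels):
    """Return feature indices sorted by decreasing F score."""
    classes = sorted(set(labels))
    n_features = len(samples[0])
    scores = []
    for j in range(n_features):
        groups = [[s[j] for s, y in zip(samples, labels) if y == c] for c in classes]
        scores.append(anova_f_score(groups))
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


# Toy data: feature 0 separates the classes, feature 1 does not.
samples = [[1, 5], [2, 4], [3, 6], [4, 5], [5, 4], [6, 6]]
labels = [0, 0, 0, 1, 1, 1]
print(rank_features(samples, labels))
```

A classifier would then be trained on the first few indices returned by `rank_features`, with the number of retained features chosen by cross-validation.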
3.
Bioorganic & Medicinal Chemistry, 2020, 28(10): 115422
Cytotoxicity is a critical property in determining the fate of a small molecule in the drug discovery pipeline. Cytotoxic compounds are identified and triaged in both target-based and cell-based phenotypic approaches due to their off-target toxicity or on-target and on-mechanism toxicity for oncology and neurodegenerative targets. It is critical that chemical-induced cytotoxicity be reliably predicted before drug candidates advance to the late stage of development, or more ideally, before compounds are synthesized. In this study, we assessed the cell-based cytotoxicity of nearly 10,000 compounds in NCATS annotated libraries against four ‘normal’ cell lines (HEK 293, NIH 3T3, CRL-7250 and HaCat) using CellTiter-Glo (CTG) technology and constructed highly predictive models to estimate cytotoxicity from chemical structures. There are 5,241 non-redundant compounds with unambiguous activities in the four cell lines, of which 11.8% exhibited cytotoxicity in two or more cell lines and are thus labelled cytotoxic. The support vector classification (SVC) models trained with 80% randomly selected molecules achieved an area under the receiver operating characteristic curve (AUC-ROC) of 0.88 on average for the remaining 20% of compounds in the test sets over 10 repeated experiments. Applying an under-sampling rebalancing method further improved the average AUC-ROC to 0.90. Analysis of structural features shared by cytotoxic compounds may offer medicinal chemists heuristic design ideas to eliminate undesirable cytotoxicity. The profiling of cytotoxicity of drug-like molecules with annotated primary mechanism of action (MOA) will inform on the roles played by different targets or pathways in cellular viability. The predictive models for cytotoxicity (accessible at https://tripod.nih.gov/web_adme/cytotox.html) provide the scientific community with a fast yet reliable way to prioritize molecules with little or no cytotoxicity for downstream development.
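The labelling rule and the under-sampling step described above can be illustrated with a short sketch. It assumes plain random under-sampling of the majority class down to the size of the minority class; the abstract does not say which under-sampling variant was used, and the names `is_cytotoxic` and `undersample` are hypothetical.

```python
import random


def is_cytotoxic(flags, min_lines=2):
    """Label a compound cytotoxic if it is toxic in at least `min_lines` cell lines."""
    return sum(1 for f in flags if f) >= min_lines


def undersample(samples, labels, seed=0):
    """Randomly drop samples so that every class is as small as the rarest class."""
    rng = random.Random(seed)
    by_class = {}
    for s, y in zip(samples, labels):
        by_class.setdefault(y, []).append(s)
    n_min = min(len(items) for items in by_class.values())
    balanced = []
    for y, items in by_class.items():
        for s in rng.sample(items, n_min):
            balanced.append((s, y))
    rng.shuffle(balanced)
    return [s for s, _ in balanced], [y for _, y in balanced]


# Toy set with roughly the class ratio reported above (about 12% positives).
labels = [1] * 12 + [0] * 88
samples = list(range(100))          # stand-ins for compound descriptors
bal_samples, bal_labels = undersample(samples, labels, seed=1)
print(len(bal_samples), bal_labels.count(1), bal_labels.count(0))
```

Under-sampling should be applied to the training split only, so that the test set keeps the original class ratio when AUC-ROC is reported.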
4.
Prediction of apoptosis protein subcellular location using improved hybrid approach and pseudo-amino acid composition
Apoptosis proteins are very important for understanding the mechanism of programmed cell death. The localization of an apoptosis protein can provide valuable information about its molecular function, but predicting that localization is a challenging task. In our previous work we proposed an increment of diversity (ID) method using protein sequence information for this prediction task. In this work, based on the concept of Chou's pseudo-amino acid composition [Chou, K.C., 2001. Prediction of protein cellular attributes using pseudo-amino acid composition. Proteins: Struct. Funct. Genet. (Erratum: Chou, K.C., 2001, vol. 44, 60) 43, 246-255; Chou, K.C., 2005. Using amphiphilic pseudo-amino acid composition to predict enzyme subfamily classes. Bioinformatics 21, 10-19], a different pseudo-amino acid composition that uses hydropathy distribution information is introduced. A novel ID_SVM algorithm that combines ID with a support vector machine (SVM) is proposed. The method is applied to three data sets (317 apoptosis proteins, 225 apoptosis proteins and 98 apoptosis proteins). In jackknife tests it achieves higher predictive success rates than previous algorithms.
5.
6.
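Based on protein sequences, a new method is proposed for predicting the β-hairpin, a supersecondary structure motif. Sequence information is represented by a vector of increments of diversity, and six increments of diversity are fed into a support vector machine, which searches for the optimal hyperplane in the six-dimensional vector space to separate β-hairpins from non-β-hairpins. The results show that the algorithm has good predictive ability for β-hairpins. On the training set, 5-fold cross-validation gives an overall accuracy of 81.24%, a correlation coefficient of 0.57, and a β-hairpin sensitivity of 83.06%; on the independent test set, the overall accuracy is 78.34%, the correlation coefficient is 0.56, and the β-hairpin sensitivity is 77.24%. Applied to 63 proteins from CASP6, the prediction model also gave good results.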
7.
In order to make renewable fuels and chemicals from microbes, new methods are required to engineer microbes more intelligently. Computational approaches to engineering strains for enhanced chemical production typically rely on detailed mechanistic models (e.g., kinetic/stoichiometric models of metabolism), which require many experimental datasets for their parameterization, while experimental methods may require screening large mutant libraries to explore the design space for the few mutants with desired behaviors. To address these limitations, we developed an active and machine learning approach (ActiveOpt) to intelligently guide experiments to arrive at an optimal phenotype with minimal measured datasets. ActiveOpt was applied to two separate case studies to evaluate its potential to increase valine yields and neurosporene productivity in Escherichia coli. In both cases, ActiveOpt identified the best performing strain in fewer experiments than the case studies used. This work demonstrates that machine and active learning approaches have the potential to greatly help metabolic engineering efforts achieve their objectives rapidly.
8.
An SVM-based drug target prediction method and its application
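Shang Zhenwei, Li Jin, Jiang Yongshuai, Zhang Mingming, Lü Hongchao, Zhang Ruijie. 《现代生物医学进展》 (Progress in Modern Biomedicine), 2012, 12(20): 3943-3946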
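Objective: To develop a new and effective drug target prediction method based on the primary-structure similarity between known drug targets and potential drug target proteins, combined with SVM techniques. Methods: A training set was constructed, primary-structure features were extracted from the protein sequences, the data were preprocessed, the optimal kernel function was selected, parameters were optimized and features selected, the optimal prediction model was trained, and its predictive performance was tested. Proteins of the G protein-coupled receptor family were used as the prediction set, and the optimal classification model was applied to mine them for potential drug targets. Results: The optimal SVM-based classification model achieved an average prediction accuracy of 81.03%. When the optimal classifier was applied to the constructed G protein prediction set, several of the 20 top-ranked proteins were found to be disease-related. In particular, two of these G proteins are listed in the Therapeutic Target Database (TTD) as drug targets in clinical trials. Conclusion: The drug target prediction method based on SVM and protein sequence features is effective, and the potential drug targets it predicts can serve as a reference for discovering new drug targets.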
9.
Based on 639 non-homologous proteins with 2910 cysteine-containing segments from well-resolved three-dimensional structures, a novel approach is proposed to predict the disulfide-bonding state of cysteines in proteins. It constructs a two-stage classifier that combines a first, global linear discriminator based on amino acid composition with a second, local support vector machine classifier. Using the rigorous jackknife procedure, the overall prediction accuracy of this hybrid classifier for the disulfide-bonding state of cysteines is 84.1% on a per-cysteine basis and 80.1% on a per-protein basis. This shows that whether cysteines form disulfide bonds depends not only on the global structural features of proteins but also on the local sequence environment. The result demonstrates the applicability of this novel method, whose prediction performance is comparable with that of existing methods for predicting the oxidation states of cysteines in proteins.
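The jackknife procedure used here and in several other entries in this list holds out each sample in turn, trains on the rest, and predicts the held-out sample. The sketch below shows that loop. The nearest-centroid classifier is only a stand-in chosen to keep the example self-contained; it is not the two-stage classifier from the abstract, and all names are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, query):
    """Stand-in classifier: assign `query` to the class with the closest mean vector."""
    sums, counts = {}, {}
    for x, y in zip(train_x, train_y):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1

    def dist2(label):
        return sum((s / counts[label] - q) ** 2 for s, q in zip(sums[label], query))

    return min(sums, key=dist2)


def jackknife_accuracy(xs, ys, predict=nearest_centroid_predict):
    """Leave-one-out accuracy: each sample is predicted by a model trained on all the others."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Toy one-dimensional data with two well-separated classes.
xs = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
ys = [0, 0, 0, 1, 1, 1]
print(jackknife_accuracy(xs, ys))
```

Any function with the same `(train_x, train_y, query)` signature can be passed as `predict`, so the loop is independent of the classifier being evaluated.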
10.
This article offers a novel sequence-based approach to discriminate outer membrane proteins (OMPs). The first step is to characterize the sequences of OMPs and non-OMPs with a new representation, factor analysis scales of generalized amino acid information (FASGAI), which covers hydrophobicity, alpha and turn propensities, bulky properties, compositional characteristics, local flexibility and electronic properties. The resulting data are then transformed into a uniform matrix by auto cross covariance (ACC). The second step is to build predictors that discriminate OMPs from non-OMPs using a support vector machine (SVM). In a fivefold cross-validation test, the SVM predictors achieve a high Matthews correlation coefficient (MCC) of 0.916 in discriminating 208 OMPs from non-OMPs comprising 206 α-helical membrane proteins and 673 globular proteins. Overall MCC values of 0.923 and 0.930 are obtained for discriminating OMPs from the α-helical membrane proteins and from the globular proteins, respectively. The results demonstrate that the FASGAI-ACC-SVM combination is promising for applications in bioinformatics and proteomics.
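The ACC step turns sequences of different lengths into vectors of the same length, which is what an SVM needs. The sketch below assumes one common form of the transform: for each lag and each pair of descriptors, the mean-centred products are averaged over all residue pairs separated by that lag. The normalization used in the paper may differ, and the descriptor values below are made up rather than real FASGAI scores.

```python
def auto_cross_covariance(seq_props, max_lag):
    """Fixed-length ACC vector from a per-residue descriptor matrix.

    seq_props: list of per-residue descriptor lists (n residues x d descriptors).
    Returns max_lag * d * d values; j1 == j2 gives the auto terms, j1 != j2 the cross terms.
    """
    n = len(seq_props)
    d = len(seq_props[0])
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the sequence length")
    means = [sum(row[j] for row in seq_props) / n for j in range(d)]
    features = []
    for lag in range(1, max_lag + 1):
        for j1 in range(d):
            for j2 in range(d):
                total = sum(
                    (seq_props[i][j1] - means[j1]) * (seq_props[i + lag][j2] - means[j2])
                    for i in range(n - lag)
                )
                features.append(total / (n - lag))
    return features


# Two hypothetical sequences of different length, two descriptors per residue.
short_seq = [[0.1, 1.0], [0.4, 0.2], [0.3, 0.8], [0.9, 0.5], [0.2, 0.1]]
long_seq = short_seq + [[0.7, 0.3], [0.5, 0.9], [0.6, 0.4]]
print(len(auto_cross_covariance(short_seq, 2)), len(auto_cross_covariance(long_seq, 2)))
```

Both calls return vectors of the same length (`max_lag * d * d`), which is the property that makes the output usable as a uniform input matrix.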
11.
Using a composite vector of increment of diversity and scoring function to express sequence information, a support vector machine (SVM) algorithm for predicting β-hairpin motifs is proposed. The prediction is done on a dataset of 3,088 non-homologous proteins containing 6,027 β-hairpins. The overall prediction accuracy and Matthews correlation coefficient are 79.9% and 0.59 for the independent testing dataset. In addition, a higher accuracy of 83.3% and a Matthews correlation coefficient of 0.67 are obtained in the independent testing dataset on a dataset previously used by Kumar et al. (Nucleic Acids Res 33:154–159). The performance of the method is also evaluated by predicting the β-hairpins in the CASP6 proteins, and better results are obtained. Moreover, the method is used to predict four kinds of supersecondary structures; the overall prediction accuracy is 64.5% for the independent testing dataset.
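Accuracy, sensitivity and the Matthews correlation coefficient reported above are all derived from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the counts in the example are invented and do not come from the paper.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def sensitivity(tp, fn):
    """Fraction of true positives (e.g. real β-hairpins) that are recovered."""
    return tp / (tp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when a marginal total is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Invented confusion-matrix counts, for illustration only.
tp, tn, fp, fn = 80, 70, 30, 20
print(accuracy(tp, tn, fp, fn), sensitivity(tp, fn), mcc(tp, tn, fp, fn))
```

Unlike accuracy, MCC stays near zero for a classifier that simply predicts the majority class, which is why it is often reported alongside accuracy on unbalanced datasets.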
12.
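A protein fold database covering 27 fold types was built from 1,895 protein sequences with sequence similarity below 40%. Starting from the protein sequence, a composite vector of motif frequencies, low-frequency power spectral density values, amino acid composition, predicted secondary structure information, and autocorrelation function values was used to represent sequence information. Using a support vector machine with an overall classification strategy, the fold types of the 27 classes of protein folds were predicted, with an accuracy of 66.67% in the independent test. The same features and algorithm were also used to predict the four structural classes of the 27 folds, reaching an accuracy of 89.24% in the independent test. Applied to a 27-fold database used by previous authors, the same method gave better results than those previously reported.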
13.
Nagaraja M. Phani, Shreeshakala Acharya, Seethu Xavy, Nalini Bhaskaranand, Manoj K. Bhat, Aditya Jain, Padmalatha S. Rai, Satyamoorthy Kapaettu. Gene, 2014
Establishing the genetic basis of idiopathic generalized epilepsies (IGE) is challenging because of their complex inheritance pattern and genetic heterogeneity. The Kir4.1 inwardly rectifying channel (KCNJ10) is one of the independent genes reported to be associated with seizure susceptibility. In the current study we performed a comprehensive in silico analysis of genetic variants in the KCNJ10 gene at the functional and structural level, along with a case–control analysis of the association of the rs1130183 (R271C) polymorphism in Indian patients with IGE. 108 age- and sex-matched epileptic patients and normal healthy controls were examined. Genotyping of the KCNJ10 rs1130183 variation was performed using the PCR-RFLP method. The risk association was determined using the odds ratio and 95% confidence interval. Functional effects of non-synonymous SNPs (nsSNPs) in the KCNJ10 gene were analyzed using SIFT, PolyPhen-2, I-Mutant 2.0, PANTHER and FASTSNP. Subsequently, homology modeling of the protein three-dimensional (3D) structures was performed using the Modeller tool (9.10v), and the native protein was compared with the mutant to assess structure and stability. SIFT, PolyPhen-2, I-Mutant 2.0 and PANTHER collectively showed that the rs1130183, rs1130182 and rs137853073 SNPs in the KCNJ10 gene affect protein structure and function. There was a considerable variation in the root mean square deviation (RMSD) value between the native and mutant structures (1.17 Å). Association analysis indicates that KCNJ10 rs1130183 did not contribute to the risk of seizure susceptibility in Indian patients with IGE (OR, 0.38; 95% CI, 0.07–2.05), and the T allele frequency (0.02%) was in concordance with dbSNP reports. This study identifies potential SNPs that may contribute to seizure susceptibility; further studies of the selected SNPs in a larger number of samples, together with their functional analysis, are required to understand the association of KCNJ10 variants with seizure susceptibility.
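An odds ratio with a 95% confidence interval, as reported above, is typically computed from a 2×2 case-control table. The sketch below assumes the standard log-based (Woolf) interval; the abstract does not say which method the authors used, and the counts in the example are invented because the study's genotype table is not given here.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI from a 2x2 table.

    a: cases with the variant      b: cases without it
    c: controls with the variant   d: controls without it
    All four counts must be positive for the log-based interval to be defined.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Invented counts, for illustration only.
print(odds_ratio_ci(4, 96, 10, 90))
```

When the interval contains 1, as it does for the interval quoted in the abstract (0.07–2.05), the data do not show a statistically significant association at the 5% level.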
14.
Background: Intravoxel incoherent motion (IVIM) plays an important role in predicting treatment responses in patients with nasopharyngeal carcinoma (NPC). The goal of this study was to develop and validate a radiomics nomogram based on IVIM parametric maps and clinical data for the prediction of treatment responses in NPC patients. Methods: Eighty patients with biopsy-proven NPC were enrolled in this study. Sixty-two patients had complete responses and 18 patients had incomplete responses to treatment. Each patient received a multiple b-value diffusion-weighted imaging (DWI) examination before treatment. Radiomics features were extracted from IVIM parametric maps derived from the DWI images. Feature selection was performed by the least absolute shrinkage and selection operator method. The radiomics signature was generated by a support vector machine based on the selected features. Receiver operating characteristic (ROC) curves and area under the ROC curve (AUC) values were used to evaluate the diagnostic performance of the radiomics signature. A radiomics nomogram was established by integrating the radiomics signature and clinical data. Results: The radiomics signature showed good prognostic performance in predicting treatment response in both the training (AUC = 0.906, P<0.001) and testing (AUC = 0.850, P<0.001) cohorts. The radiomics nomogram established by integrating the radiomics signature with clinical data significantly outperformed clinical data alone (C-index, 0.929 vs 0.724; P<0.0001). Conclusions: The IVIM-based radiomics nomogram showed high prognostic ability for treatment responses in patients with NPC. The IVIM-based radiomics signature has the potential to be a new biomarker for predicting treatment responses and may affect treatment strategies in patients with NPC.
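The AUC values quoted above have a simple interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes AUC directly from that definition. The scores and labels are invented, and the quadratic pairwise loop is chosen for clarity rather than speed.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented classifier scores; label 1 marks the positive class.
scores = [0.9, 0.2, 0.8, 0.1]
labels = [1, 1, 0, 0]
print(roc_auc(scores, labels))
```

With a cohort as unbalanced as the one described (62 versus 18 patients), AUC is less sensitive to the class ratio than plain accuracy, which is one reason it is the usual headline metric in such studies.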
15.
Molecular & Cellular Proteomics: MCP, 2022, 21(10): 100277
The recent surge of coronavirus disease 2019 (COVID-19) hospitalizations severely challenges healthcare systems around the globe and has increased the demand for reliable tests predictive of disease severity and mortality. Using multiplexed targeted mass spectrometry assays on a robust triple quadrupole MS setup which is available in many clinical laboratories, we determined the precise concentrations of hundreds of proteins and metabolites in plasma from hospitalized COVID-19 patients. We observed a clear distinction between COVID-19 patients and controls and, strikingly, a significant difference between survivors and nonsurvivors. With increasing length of hospitalization, the survivors’ samples showed a trend toward normal concentrations, indicating a potential sensitive readout of treatment success. Building a machine learning multi-omic model that considers the concentrations of 10 proteins and five metabolites, we could predict patient survival with 92% accuracy (area under the receiver operating characteristic curve: 0.97) on the day of hospitalization. Hence, our standardized assays represent a unique opportunity for the early stratification of hospitalized COVID-19 patients.