Similar literature
 19 similar documents found (search time: 272 ms)
1.
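Application of support vector machines to forecasting pest occurrence   Total citations: 6, self-citations: 0, citations by others: 6
The relationship between pest occurrence and its influencing factors is complex, nonlinear, and time-lagged. Traditional methods cannot adequately analyze or fit the highly nonlinear dynamics of pest occurrence, so their prediction accuracy is unsatisfactory. To model this complex nonlinear relationship effectively and improve prediction accuracy, a pest occurrence forecasting method based on the support vector machine (SVM) is proposed. The method first determines the optimal time-lag order of pest occurrence with an F-test and reconstructs the samples using that order; it then screens the influencing factors with a forward floating factor selection method, retaining those that contribute most to the prediction; finally, 10-fold cross-validation is used to obtain the optimal prediction model. The method was tested and analyzed on the MATLAB 7.0 platform using larval density data of the armyworm. The results show that, compared with other forecasting methods, the SVM improves the prediction accuracy of pest occurrence, overcomes the shortcomings of traditional methods, and is better suited to nonlinear, small-sample pest occurrence forecasting.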
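
The sample reconstruction step described above (rebuilding the series with a chosen time-lag order) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `lag_embed`, the lag order of 2, and the density values are all assumptions made for the example, and the F-test that selects the order is not shown.

```python
def lag_embed(series, order):
    """Turn a 1-D series into (inputs, target) pairs.

    Each target series[t] is paired with the `order` values that precede it.
    The name and signature are hypothetical, chosen for this sketch only.
    """
    if order < 1 or order >= len(series):
        raise ValueError("order must be between 1 and len(series) - 1")
    samples = []
    for t in range(order, len(series)):
        samples.append((list(series[t - order:t]), series[t]))
    return samples


# Made-up larval densities, used only to show the shape of the output.
density = [3.0, 5.0, 4.0, 8.0, 6.0]
pairs = lag_embed(density, 2)
for inputs, target in pairs:
    print(inputs, "->", target)
```

Each pair could then be fed to any regressor; a larger lag order gives the model more history per sample but leaves fewer samples to train on.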

2.
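The support vector machine is a new machine learning method developed on the basis of statistical learning theory and has been applied in pattern recognition, nonlinear modeling, and other fields. This paper applies the least squares support vector machine (LS-SVM) to modeling water vapor flux over farmland and compares its modeling performance with that of a feed-forward back-propagation neural network. The results show that the LS-SVM has few tunable parameters, learns relatively quickly, generalizes better, and builds the farmland water vapor flux model with higher accuracy. A sensitivity analysis of the model further indicates that the farmland water vapor flux model built with the LS-SVM is reasonable and feasible.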

3.
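Application of a combined ARIMA and SVM model to pest forecasting   Total citations: 2, self-citations: 0, citations by others: 2
Xiang Changsheng  Zhou Ziying 《昆虫学报》(Acta Entomologica Sinica) 2010,53(9):1055-1060
Pest occurrence is a complex, dynamic time series. A single forecasting model assumes either linear or nonlinear data, cannot capture the linear and nonlinear patterns of pest occurrence at the same time, and therefore rarely reaches the desired prediction accuracy. This study first models the linear component of the insect occurrence time series with an autoregressive integrated moving average (ARIMA) model, then models the nonlinear component with a support vector machine (SVM), and finally combines the two into a single forecast. The combined model was applied to forecasting the occurrence area of the pine caterpillar Dendrolimus punctatus. The results show that the combined model is clearly more accurate than either single model and exploits the strengths of both. The combined model is a practical and feasible method for pest forecasting.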

4.
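Multidimensional time series analysis based on SVR and CAR and its application in ecology   Total citations: 1, self-citations: 0, citations by others: 1
Based on support vector regression (SVR) combined with the controlled autoregressive (CAR) model, a nonlinear multidimensional time series analysis and forecasting method (SVR-CAR) was established that reflects both the dynamic characteristics of the sample set and the influence of environmental factors. One-step-ahead forecasts on two ecological sample sets show that SVR-CAR achieved the highest prediction accuracy among all reference models and offers structural risk minimization, nonlinearity, avoidance of overfitting, and excellent generalization ability. SVR-CAR has broad application prospects in multidimensional time series forecasting in ecology, agricultural science, economics, and other fields.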

5.
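Diagnosis of hepatocellular carcinoma from ³¹P magnetic resonance spectroscopy based on support vector machines   Total citations: 1, self-citations: 1, citations by others: 0
The support vector machine is a new machine learning method developed on the basis of statistical learning theory and is widely used in pattern recognition. A support vector machine model was applied to ³¹P magnetic resonance spectroscopy data to classify liver tissue, distinguishing hepatocellular carcinoma, cirrhosis, and normal liver tissue. Support vector machine classifiers based on polynomial and radial basis kernel functions were compared, and recognition rates for the three liver classes were obtained. The experiments show that a support vector machine classification model based on ³¹P magnetic resonance spectroscopy data can provide diagnostic predictions for the liver in vivo.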

6.
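A study of the relationship between support vector machines and neural networks   Total citations: 2, self-citations: 0, citations by others: 2
The support vector machine is a novel machine learning method based on statistical learning theory. Owing to its excellent learning performance, it has become a research focus in the international machine learning community and has been widely used to solve classification and regression problems. This paper applies the structural risk function to the training of radial basis function networks and discusses the relationship between the support vector regression model and radial basis function networks. Simulation examples show that the proposed algorithm improves the generalization performance of radial basis function networks.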

7.
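Classification of protein quaternary structure based on support vector machine and Bayesian methods   Total citations: 6, self-citations: 2, citations by others: 4
Protein quaternary structure was classified using two methods, the support vector machine and a Bayesian approach. The support vector machine gave the best classification results: in the 10CV test, its overall classification accuracy, correct prediction rate for positive samples, Matthews correlation coefficient, and false positive rate were 74.2%, 84.6%, 0.474, and 38.9%, respectively. The Bayesian classifier performed less well overall, but its false positive rate in the 10CV test was the lowest (15.9%). These results indicate that the primary sequences of homo-oligomeric proteins contain quaternary structure information and that the feature vectors do represent basic information buried in the contact surfaces of the interaction sites of associated subunits.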
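
The four figures reported above can all be derived from a binary confusion matrix. The sketch below uses the standard textbook definitions, which may differ in detail from those used in the paper (in particular for the false positive rate); the function name `binary_metrics` and the counts are made up for illustration.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute common binary classification metrics from confusion counts.

    Assumes both classes are present, so the per-class denominators are
    non-zero. Standard definitions; not taken from the paper itself.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)            # correct rate on positive samples
    false_positive_rate = fp / (fp + tn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the MCC is taken as 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "false_positive_rate": false_positive_rate,
        "mcc": mcc,
    }


# Hypothetical counts, not data from the study.
m = binary_metrics(tp=40, fp=10, tn=30, fn=20)
print(m)
```

Unlike accuracy, the Matthews correlation coefficient uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy when the classes are unbalanced.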

8.
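Yan Huajun  Zhang Yi 《生物信息学》2004,2(4):19-24,41
Using a BP network with an added competitive layer, the prediction of domain structural class from protein secondary structure content was studied. Embedding a competitive layer in the BP network significantly improved its predictive performance. Using only a small training set and a simple network structure, high prediction accuracy was obtained: a self-consistency accuracy of 97.62%, a jackknife test accuracy of 97.62%, and an average extrapolation accuracy of 90.74%. Given more complete feature vectors for domain structural classes and a more representative training set, the method could provide a new classification benchmark for protein domain structure classification.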

9.
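Objective: To build a quantitative structure-activity relationship (QSAR) model with high prediction accuracy, providing a theoretical basis for designing and synthesizing more active cephalosporin antibiotics. Methods: A nonlinear combined prediction method (SVR-KNN) based on support vector regression (SVR) and k-nearest neighbors (KNN) was developed and used to study systematically the QSAR of 48 cephalosporin derivatives active against Haemophilus influenzae. Results: Leave-one-out predictions show that nonlinear screening of descriptors and sub-models clearly improves prediction accuracy, and that the combined prediction after sub-model selection is more accurate than any single sub-model; the MSE and MAPE of SVR-KNN were 0.019 and 1.81%, respectively. Predictions on independent samples show that SVR-KNN had the best prediction accuracy and stability among all reference models, with MSE and MAPE of 0.010 and 1.33%, respectively. Conclusion: The SVR-KNN model has strong predictive ability and excellent generalization ability and has broad application prospects in QSAR studies of antibiotics and other drugs.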
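
The evaluation protocol above (leave-one-out prediction scored by MSE and MAPE) can be illustrated with a small sketch. It covers only a plain k-nearest-neighbor regressor, not the combined SVR-KNN method, whose details the abstract does not give; all function names and the toy data are assumptions.

```python
def knn_predict(train_x, train_y, query, k):
    """Average the targets of the k training points closest to `query`."""
    order = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def leave_one_out(xs, ys, k):
    """Predict each sample from a model built on all the other samples."""
    preds = []
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        preds.append(knn_predict(rest_x, rest_y, xs[i], k))
    return preds


def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Toy one-descriptor data set, made up for the example.
xs = [[1.0], [2.0], [3.0], [4.0]]
ys = [2.0, 4.0, 6.0, 8.0]
preds = leave_one_out(xs, ys, k=2)
print(preds, mse(ys, preds), mape(ys, preds))
```

Leave-one-out is a natural choice for a data set of only 48 compounds, since every sample is used for testing exactly once while the model is still trained on almost all of the data.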

10.
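To address the low fitting accuracy of existing crop water production function models, this paper proposes fitting the crop water production function with support vector regression and compares the approach with existing models. The fitting results show that the support vector machine model fits markedly better than the existing models.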

11.
A support vector machine (SVM) modeling approach for short-term load forecasting is proposed. The SVM learning scheme is applied to the power load data, forcing the network to learn the inherent internal temporal property of power load sequence. We also study the performance when other related input variables such as temperature and humidity are considered. The performance of our proposed SVM modeling approach has been tested and compared with feed-forward neural network and cosine radial basis function neural network approaches. Numerical results show that the SVM approach yields better generalization capability and lower prediction error compared to those neural network approaches.

12.
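An SVM-based drug target prediction method and its application   Total citations: 1, self-citations: 0, citations by others: 1
Objective: To develop a new and effective drug target prediction method by combining SVM techniques with the primary structure similarity between known drug targets and potential drug target proteins. Methods: A training sample set was constructed, primary structure features of the protein sequences were extracted, the data were preprocessed, the optimal kernel function was selected, parameters were optimized and features selected, the optimal prediction model was trained, and its predictive performance was tested. With proteins of the G protein-coupled receptor family as the prediction set, the optimal classification model was applied to mine potential drug targets. Results: The optimal SVM-based classification model achieved an average prediction accuracy of 81.03%. When the optimal classifier was applied to the constructed G protein prediction set, several of the 20 top-ranked proteins were found to be disease-related. In particular, two of these G proteins are listed in the Therapeutic Target Database (TTD) as drug targets in clinical trials. Conclusion: The drug target prediction method based on SVM and protein sequence features is effective, and the potential drug targets it predicts can serve as a reference for discovering new drug targets.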

13.
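Using the uniform design (UD) method, a support vector machine (SVM) model was built of the relationship between the amino acid composition features of Bacillus thuringiensis (Bt) insecticidal crystal proteins and their insecticidal activity. With a penalty coefficient of 0.01, an epsilon value of 0.2, a gamma value of 0.05, and a threshold of 0.5, the model predicted the insecticidal activity of Bt insecticidal crystal proteins with an average accuracy of 73%.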

14.
15.
Sethi D  Garg A  Raghava GP 《Amino acids》2008,35(3):599-605
The association of structurally disordered proteins with a number of diseases has engendered enormous interest and therefore demands a prediction method that would facilitate their expeditious study at molecular level. The present study describes the development of a computational method for predicting disordered proteins using sequence and profile compositions as input features for the training of SVM models. First, we developed the amino acid and dipeptide compositions based SVM modules which yielded sensitivities of 75.6 and 73.2% along with Matthews Correlation Coefficient (MCC) values of 0.75 and 0.60, respectively. In addition, the use of predicted secondary structure content (coil, sheet and helices) in the form of composition values attained a sensitivity of 76.8% and MCC value of 0.77. Finally, the training of SVM models using evolutionary information hidden in the multiple sequence alignment profile improved the prediction performance by achieving a sensitivity value of 78% and MCC of 0.78. Furthermore, when evaluated on an independent dataset of partially disordered proteins, the same SVM module provided a correct prediction rate of 86.6%. Based on the above study, a web server (“DPROT”) was developed for the prediction of disordered proteins, which is available at .
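
The amino acid and dipeptide composition features mentioned above are fixed-length vectors (20 and 400 values) computed from a sequence of any length. The following is a minimal sketch under the usual definitions (relative frequency of each residue or adjacent residue pair); the function names and the short example sequence are hypothetical and not taken from the DPROT server.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Return 20 values: the fraction of the sequence made up by each residue."""
    if not seq:
        raise ValueError("sequence must not be empty")
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return 400 values: the fraction of adjacent residue pairs of each type."""
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two residues")
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    counts = {}
    for pair in pairs:
        counts[pair] = counts.get(pair, 0) + 1
    return [counts.get(a + b, 0) / len(pairs)
            for a, b in product(AMINO_ACIDS, repeat=2)]


seq = "MKKLA"  # made-up sequence, for illustration only
aac = amino_acid_composition(seq)
dpc = dipeptide_composition(seq)
print(len(aac), len(dpc))
```

Because both vectors have a fixed length regardless of sequence length, they can be passed directly to a classifier such as an SVM.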

16.
Prediction of RNA binding sites in a protein using SVM and PSSM profile   Total citations: 1, self-citations: 0, citations by others: 1
Kumar M  Gromiha MM  Raghava GP 《Proteins》2008,71(1):189-194

17.
The prediction of global solar radiation in a region is of great importance as it provides investors and politicians with more detailed knowledge about the solar resource of that region, which can be very beneficial for large-scale solar energy development. In this sense, the main objective of this study is to predict the daily global solar radiation data of 27 cities (Brussels, Paris, Lisbon, Madrid…), located in 27 countries, which have mostly different solar radiation distributions in Europe. In this research, six different machine-learning algorithms (Linear Model (LM), Decision Tree (DT), Support Vector Machine (SVM), Deep Learning (DL), Random Forest (RF) and Gradient Boosted Trees (GBT)) are used. In the training of these algorithms, daily air temperature (Ta), wind speed (Va), relative humidity (RH) and solar radiation of these cities are used. The data are supplied from the Meteonorm tool and cover the last years grouped in two periods (1960–1990; 2000–2019). To decide on the success of these algorithms, four different statistical metrics (Average Relative Error (ARE), Average Absolute Error (AAE), Root Mean Squared Error (RMSE), and R² (R-squared)) are discussed in the study. In addition, forecasts of the air temperature and global solar radiation of these cities in 2050 and 2100 were made using three of the most recent Intergovernmental Panel on Climate Change (IPCC) scenarios (RCP 2.6, RCP 4.5, and RCP 8.5). The results show that the ARE, R² and RMSE values of all algorithms range from 0.114 to 6.321, from 0.382 to 0.985, and from 0.145 to 2.126 MJ/m², respectively. By analysing all the algorithms, it is noticed that the Decision Tree exhibited the worst result in terms of the R² and RMSE metrics. Among the six prediction algorithms, the DL was recognized as the only algorithm that exceeded the t-critical value (the t-critical value is the cutoff between retaining or rejecting the null hypothesis).
Overall, all six machine learning algorithms used in this research can be applied to predict the daily global solar radiation data with good accuracy. Despite this, the SVM model is the best model among the six models used. It is followed by the DL, LM, GBT, RF and DT, respectively.
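
Two of the metrics named above, RMSE and R², can be computed as in the sketch below, using their standard definitions. The abstract does not define ARE and AAE, so they are left out rather than guessed; the function names and the radiation values are made up for the example.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the data."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical daily radiation values in MJ/m2, not data from the study.
observed = [10.0, 12.0, 14.0, 16.0]
forecast = [11.0, 12.0, 13.0, 16.0]
print(rmse(observed, forecast), r_squared(observed, forecast))
```

RMSE is scale-dependent while R² is not, so reporting both lets models be compared across cities whose radiation levels differ.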

18.
Bikadi Z  Hazai I  Malik D  Jemnitz K  Veres Z  Hari P  Ni Z  Loo TW  Clarke DM  Hazai E  Mao Q 《PloS one》2011,6(10):e25815
Human P-glycoprotein (P-gp) is an ATP-binding cassette multidrug transporter that confers resistance to a wide range of chemotherapeutic agents in cancer cells by active efflux of the drugs from cells. P-gp also plays a key role in limiting oral absorption and brain penetration and in facilitating biliary and renal elimination of structurally diverse drugs. Thus, identification of drugs or new molecular entities to be P-gp substrates is of vital importance for predicting the pharmacokinetics, efficacy, safety, or tissue levels of drugs or drug candidates. At present, publicly available, reliable in silico models predicting P-gp substrates are scarce. In this study, a support vector machine (SVM) method was developed to predict P-gp substrates and P-gp-substrate interactions, based on a training data set of 197 known P-gp substrates and non-substrates collected from the literature. We showed that the SVM method had a prediction accuracy of approximately 80% on an independent external validation data set of 32 compounds. A homology model of human P-gp based on the X-ray structure of mouse P-gp as a template has been constructed. We showed that molecular docking to the P-gp structures successfully predicted the geometry of P-gp-ligand complexes. Our SVM prediction and the molecular docking methods have been integrated into a free web server (http://pgp.althotas.com), which allows the users to predict whether a given compound is a P-gp substrate and how it binds to and interacts with P-gp. Utilization of such a web server may prove valuable for both rational drug design and screening.

19.
Lo SL  Cai CZ  Chen YZ  Chung MC 《Proteomics》2005,5(4):876-884
Knowledge of protein-protein interaction is useful for elucidating protein function via the concept of 'guilt-by-association'. A statistical learning method, Support Vector Machine (SVM), has recently been explored for the prediction of protein-protein interactions using artificial shuffled sequences as hypothetical noninteracting proteins and it has shown promising results (Bock, J. R., Gough, D. A., Bioinformatics 2001, 17, 455-460). It remains unclear, however, how the prediction accuracy is affected if real protein sequences are used to represent noninteracting proteins. In this work, this effect is assessed by comparison of the results derived from the use of real protein sequences with that derived from the use of shuffled sequences. The real protein sequences of hypothetical noninteracting proteins are generated from an exclusion analysis in combination with subcellular localization information of interacting proteins found in the Database of Interacting Proteins. Prediction accuracy using real protein sequences is 76.9% compared to 94.1% using artificial shuffled sequences. The discrepancy likely arises from the expected higher level of difficulty for separating two sets of real protein sequences than that for separating a set of real protein sequences from a set of artificial sequences. The use of real protein sequences for training an SVM classification system is expected to give better prediction results in practical cases. This is tested by using both SVM systems for predicting putative protein partners of a set of thioredoxin related proteins. The prediction results are consistent with observations, suggesting that real sequence is more practically useful in development of SVM classification system for facilitating protein-protein interaction prediction.
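
The "artificial shuffled sequences" used as negative examples can be pictured with a short sketch: permuting the residues of a real sequence keeps its composition but destroys its order. This is only an illustration of the idea, assuming a simple uniform permutation; the cited work may have used a different shuffling scheme, and the function name, seed, and example sequence are arbitrary.

```python
import random


def shuffled_negative(seq, rng):
    """Return a random permutation of the residues in `seq`.

    The amino acid composition is preserved exactly; only the order changes.
    """
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


rng = random.Random(0)  # fixed seed so the example is repeatable
real = "MKTAYIAKQRQISFVKSHFSRQ"  # arbitrary example string
decoy = shuffled_negative(real, rng)
print(real)
print(decoy)
```

Since a shuffled decoy shares its composition with the original but lacks any native ordering, a classifier may learn to separate "real" from "artificial" rather than interacting from noninteracting proteins, which is consistent with the lower accuracy reported above when real sequences are used as negatives.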
