Similar literature
 20 similar documents found
1.
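To explore high-resolution digital soil mapping methods for hilly areas at the small-watershed scale, this study investigated landscape facies classification and combined it with Geomorphons (GM) micro-topographic feature data at different scales to form a group of categorical variables for high-resolution predictive mapping of soil pH, clay content, and cation exchange capacity. These variables were combined and compared with traditional digital elevation model (DEM) derivatives and remote sensing variables. In addition, three machine learning models (support vector machine, partial least squares regression, and random forest) were compared, and the best one was combined with residual regression kriging to build and evaluate the prediction models. The results show that applying the landscape and multi-scale micro-topographic categorical variables improved the prediction accuracy for pH, clay content, and cation exchange capacity in the small-watershed hilly area by 18.8%, 8.2%, and 8.7%, respectively. The landscape facies classification map, which includes vegetation information, contributed more to the models than land-use data; the 5 m resolution GM micro-topographic classification map was better suited to high-accuracy predictive mapping than the lower-resolution classification maps. For clay content, the random forest composite model gave the highest prediction accuracy, whereas for pH and cation exchange capacity it was not appropriate to add residual regression kriging to the random forest model. The model combining landscape and multi-scale micro-topographic categorical variables, DEM derivatives, and remote sensing variables performed best, indicating that multi-source variables capture more useful soil information in areas of undulating terrain than any single data source. Used as the main variables, the landscape categorical variable group composed of GM data and surface landscape data explained about 40% of the spatial variation of some soil properties in the small-watershed hilly area. In similar soil predictive mapping studies, multi-resolution GM and landscape classification data have potential as environmental covariates for building prediction models.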

2.
In a networked environment, the trading volume and asset price of a financial commodity or instrument are affected by various complicated factors. Machine learning and sentiment analysis provide powerful tools to collect large amounts of data from websites and retrieve useful information for effectively forecasting the financial risk of associated companies. This article studies trading volume and asset price risk when sentiment-bearing financial information is available, using both sentiment analysis and two popular machine learning approaches: artificial neural network (ANN) and support vector machine (SVM). Nonlinear GARCH-based mining models are developed by integrating GARCH (generalized autoregressive conditional heteroskedasticity) theory with ANN and SVM. Empirical studies in the U.S. stock market show that the proposed approach achieves favorable forecast performance. GARCH-based SVM outperforms GARCH-based ANN for volatility forecasting, whereas GARCH-based ANN achieves a better forecast of the volatility trend. Results also indicate a strong correlation between information sentiment and both trading volume and asset price volatility.
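The abstract above does not say how the GARCH component is wired into the ANN or SVM, so the following is only a minimal sketch of the standard GARCH(1,1) conditional-variance recursion that such hybrid models typically start from. The function name `garch11_variance` and the parameter names `omega`, `alpha`, `beta` are illustrative assumptions, not taken from the article.

```python
def garch11_variance(returns, omega, alpha, beta):
    """Conditional variances of a standard GARCH(1,1) model (illustrative sketch).

    sigma2[t] = omega + alpha * returns[t-1]**2 + beta * sigma2[t-1]
    """
    if not returns:
        return []
    if alpha + beta >= 1:
        raise ValueError("alpha + beta must be < 1 for a finite unconditional variance")
    # Assumption: initialise with the unconditional variance omega / (1 - alpha - beta).
    sigma2 = [omega / (1.0 - alpha - beta)]
    for t in range(1, len(returns)):
        sigma2.append(omega + alpha * returns[t - 1] ** 2 + beta * sigma2[t - 1])
    return sigma2


# Hypothetical return series and parameters, chosen only for illustration.
variances = garch11_variance([0.0, 0.02, -0.01], omega=0.1, alpha=0.1, beta=0.8)
```

In a hybrid setup, a series like `variances` could plausibly serve as one input feature for the learning model, but that wiring is an assumption here.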

3.
The “Height Variation Hypothesis” is an indirect approach used to estimate forest biodiversity through remote sensing data, stating that greater tree height heterogeneity (HH) measured by CHM LiDAR data indicates higher forest structure complexity and tree species diversity. This approach has traditionally been analyzed using only airborne LiDAR data, which limits its application to the availability of dedicated flight campaigns. In this study we analyzed the relationship between tree species diversity and HH, calculated with four different heterogeneity indices using two freely available CHMs derived from the new space-borne GEDI LiDAR data. The first, with a spatial resolution of 30 m, was produced through a regression tree machine learning algorithm integrating GEDI LiDAR data and Landsat optical information. The second, with a spatial resolution of 10 m, was created using Sentinel-2 images and a deep learning convolutional neural network. We tested this approach separately in 30 forest plots situated in the northern Italian Alps, in 100 plots in the forested area of Traunstein (Germany), and then in all 130 plots through a cross-validation analysis. Forest density information was also included as an influencing factor in a multiple regression analysis. Our results show that the GEDI CHMs can be used to assess biodiversity patterns in forest ecosystems through the estimation of HH, which is correlated with tree species diversity. However, the results also indicate that this method is influenced by several factors, including the choice of GEDI CHM dataset and its spatial resolution, the heterogeneity indices used to calculate HH, and the forest density. Our findings suggest that GEDI LiDAR data can be a valuable tool in the estimation of forest tree heterogeneity and related tree species diversity in forest ecosystems, which can aid global biodiversity estimation.

4.
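彭哲也, 唐紫珺, 谢民主. 《遗传》 (Hereditas (Beijing)), 2018, 40(3): 218-226
Complex diseases result from gene-gene and gene-environment interactions, and detecting high-dimensional gene interactions poses a great computational challenge. Over the past 20 years, machine learning methods have been used to detect gene-gene interactions, with some success. This article reviews progress in applying machine learning methods to gene interaction detection. It systematically introduces the principles and limitations of neural networks (NN), random forest (RF), support vector machines (SVM), and multifactor dimensionality reduction (MDR) for detecting gene interactions in genome-wide association studies (GWAS), and discusses prospects for future research.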

5.
Right ventricular apical pacing (RVA) appears to have potential deleterious effects on myocardial systolic and diastolic left ventricular function, especially in patients with intact AV conduction. Therefore, new pacing sites in the right ventricle are being explored to overcome these detrimental effects. Alternative pacing sites in the right ventricle are the right ventricular outflow tract (RVOT) and the right ventricular septum (RVS). In this case report, we demonstrate an exceptional form of ventricular fusion, namely normalisation of the QRS complex in a patient with pre-existing right bundle branch block by RVS pacing. To our knowledge, this is the first report in the literature where right ventricular pacing could restore a complete RBBB to a normal QRS complex by stimulating distally from the anatomical position of the RBBB, due to fusion between artificial right ventricular stimulation and intrinsic conduction over the left bundle of the specific His-Purkinje system.

6.
This paper investigated the application of a machine learning approach (support vector machine, SVM) to the automatic recognition of gait changes due to ageing using three types of gait measures: basic temporal/spatial, kinetic and kinematic. The gaits of 12 young and 12 elderly participants were recorded and analysed using a synchronized PEAK motion analysis system and a force platform during normal walking. Altogether, 24 gait features describing the three types of gait characteristics were extracted for developing gait recognition models and later testing of generalization performance. Test results indicated an overall accuracy of 91.7% by the SVM in its capacity to distinguish the two gait patterns. The classification ability of the SVM was found to be unaffected across six kernel functions (linear, polynomial, radial basis, exponential radial basis, multi-layer perceptron and spline). The gait recognition rate improved when features were selected from different gait data types. A feature selection algorithm demonstrated that as few as three gait features, one selected from each data type, could effectively distinguish the age groups with 100% accuracy. These results demonstrate considerable potential in applying SVMs in gait classification for many applications.

7.
Elucidation of signaling events in a pathogen is potentially important for tackling the infection it causes. Such events, mediated by protein phosphorylation, play important roles in infection. To predict the phosphosites and substrates of the serine/threonine protein kinases, we have therefore developed a machine learning-based approach for Mycobacterium tuberculosis serine/threonine protein kinases using kinase-peptide structure–sequence data. This approach utilizes features derived from the kinase three-dimensional-structure environment and known phosphosite sequences to generate support vector machine (SVM)-based kinase-specific predictions of phosphosites of serine/threonine protein kinases (STPKs) with no or scarce data on their substrates. Of the four machine learning algorithms we tried (random forest, logistic regression, SVM, and k-nearest neighbors), SVM performed best, with an area under the receiver-operating characteristic curve of 0.88 on the independent testing dataset and a 10-fold cross-validation accuracy of ~81.6% for the final model. Our predicted phosphosites of M. tuberculosis STPKs form a useful resource for experimental biologists, enabling elucidation of STPK-mediated posttranslational regulation of important cellular processes.
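The area under the ROC curve reported above can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition; the function name `auroc` and the toy scores are illustrative assumptions and are unrelated to the study's data or code.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5).

    scores: classifier outputs, higher means "more likely positive"
    labels: 1 for positive (e.g. phosphosite), 0 for negative
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ranked correctly.
example_auc = auroc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0])
```

The pairwise loop is quadratic in the number of examples, which is fine for a sketch; a rank-based formulation would be the usual choice for large test sets.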

8.
Erroneous behavior usually elicits a distinct pattern in neural waveforms. In particular, inspection of the concurrently recorded electroencephalograms (EEG) typically reveals a negative potential at fronto-central electrodes shortly following a response error (Ne or ERN) as well as an error-awareness-related positivity (Pe). Seemingly, the brain signal contains information about the occurrence of an error. Assuming a general error evaluation system, the question arises whether this information can be utilized in order to classify behavioral performance within or even across different cognitive tasks. In the present study, a machine learning approach was employed to investigate this issue. Ne as well as Pe were extracted from the single-trial EEG signals of participants conducting a flanker and a mental rotation task and subjected to a machine learning classification scheme (via a support vector machine, SVM). Overall, individual performance in the flanker task was classified more accurately, with accuracy rates above 85%. Most importantly, it was even feasible to classify responses across both tasks. In particular, an SVM trained on the flanker task could identify erroneous behavior with almost 70% accuracy in the EEG data recorded during the rotation task, and vice versa. In sum, we replicate that the response-related EEG signal can be used to identify erroneous behavior within a particular task. Going beyond this, it was possible to classify response types across functionally different tasks. Therefore, the outlined methodological approach appears promising with respect to future applications.

9.
10.
A support vector machine (SVM) modeling approach for short-term load forecasting is proposed. The SVM learning scheme is applied to the power load data, forcing the network to learn the inherent internal temporal property of power load sequence. We also study the performance when other related input variables such as temperature and humidity are considered. The performance of our proposed SVM modeling approach has been tested and compared with feed-forward neural network and cosine radial basis function neural network approaches. Numerical results show that the SVM approach yields better generalization capability and lower prediction error compared to those neural network approaches.

11.
We have introduced a new method of protein secondary structure prediction which is based on the theory of support vector machine (SVM). SVM represents a new approach to supervised pattern classification which has been successfully applied to a wide range of pattern recognition problems, including object recognition, speaker identification, gene function prediction with microarray expression profile, etc. In these cases, the performance of SVM either matches or is significantly better than that of traditional machine learning approaches, including neural networks. The first use of the SVM approach to predict protein secondary structure is described here. Unlike the previous studies, we first constructed several binary classifiers, then assembled a tertiary classifier for three secondary structure states (helix, sheet and coil) based on these binary classifiers. The SVM method achieved a good performance of segment overlap accuracy SOV = 76.2% through sevenfold cross validation on a database of 513 non-homologous protein chains with multiple sequence alignments, which outperforms existing methods. Meanwhile, the three-state overall per-residue accuracy Q3 reached 73.5%, which is at least comparable to existing single prediction methods. Furthermore, a useful "reliability index" for the predictions was developed. In addition, SVM has many attractive features, including effective avoidance of overfitting, the ability to handle large feature spaces, information condensing of the given data set, etc. The SVM method can be conveniently applied to many other pattern classification tasks in biology.
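The abstract does not specify how the binary classifiers were assembled into the three-state classifier, so the sketch below shows just one common scheme as an assumption: take, for each residue, the state whose one-versus-rest decision value is largest, then score the result with the per-residue accuracy Q3. The state codes `H`, `E`, `C` and the decision values are hypothetical.

```python
STATES = ("H", "E", "C")  # helix, sheet, coil (conventional one-letter codes, assumed here)


def assemble_three_state(decision_values):
    """Combine one-vs-rest binary outputs into a three-state prediction string.

    decision_values: one dict per residue mapping each state to the decision
    value of its binary classifier (larger means more confident).
    """
    return "".join(max(STATES, key=lambda s: dv[s]) for dv in decision_values)


def q3(predicted, observed):
    """Three-state per-residue accuracy, in percent."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Hypothetical decision values for a four-residue fragment.
fragment = [
    {"H": 1.2, "E": -0.5, "C": 0.1},
    {"H": -0.3, "E": 0.8, "C": 0.2},
    {"H": -1.0, "E": -0.2, "C": 0.4},
    {"H": 0.5, "E": -0.1, "C": 0.3},
]
prediction = assemble_three_state(fragment)
accuracy = q3(prediction, "HECC")
```

Other assembly schemes, such as voting among pairwise classifiers, are equally plausible readings of the abstract.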

12.

Background  

Protein secondary structure prediction methods based on probabilistic models such as the hidden Markov model (HMM) appeal to many because they provide meaningful information relevant to the sequence-structure relationship. However, at present, the prediction accuracy of pure HMM-type methods is much lower than that of machine learning-based methods such as neural networks (NN) or support vector machines (SVM).

13.
The discovery of protein variation is an important strategy in disease diagnosis within the biological sciences. The current benchmark for elucidating information from multiple biological variables is the so-called “omics” disciplines of the biological sciences. Such variability is uncovered by implementation of multivariable data mining techniques, which fall into two primary categories: machine learning strategies and statistical approaches. Typically, proteomic studies can produce hundreds or thousands of variables, p, per observation, n, depending on the analytical platform or method employed to generate the data. Many classification methods are limited by an n ≪ p constraint and, as such, require pre-treatment to reduce the dimensionality prior to classification. Recently, machine learning techniques have gained popularity in the field for their ability to successfully classify unknown samples. One limitation of such methods is the lack of a functional model allowing meaningful interpretation of results in terms of the features used for classification. This is a problem that might be solved using a statistical model-based approach, in which the importance of each individual protein is explicit and the proteins are combined into a readily interpretable classification rule without relying on a black box approach. Here we incorporate the statistical dimension reduction techniques Partial Least Squares (PLS) and Principal Components Analysis (PCA), followed by both statistical and machine learning classification methods, and compare them to a popular machine learning technique, Support Vector Machines (SVM). Both PLS and SVM demonstrate strong utility for proteomic classification problems.

14.
15.
Cluster Computing - This paper introduces and tests a novel machine learning approach to detect Android malware. The proposed approach is composed of Support Vector Machine (SVM) classifier and...

16.
Promoters are DNA sequences located upstream of the gene region and play a central role in gene expression. Computational techniques show good accuracy in gene prediction but are less successful in predicting promoters, primarily because of the high number of false positives that reflect characteristics of the promoter sequences. Many machine learning methods have been used to address this issue. Neural Networks (NN) have been successfully used in this field because of their ability to recognize imprecise and incomplete patterns characteristic of promoter sequences. In this paper, NN was used to predict and recognize promoter sequences in two data sets: (i) one based on nucleotide sequence information and (ii) another based on stability sequence information. The accuracy was approximately 80% for simulation (i) and 68% for simulation (ii). In the rules extracted, biological consensus motifs were important parts of the NN learning process in both simulations.

17.
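Objective: To investigate the occurrence of new-onset atrial arrhythmias after pacemaker implantation and the related influencing factors. Methods: A total of 107 patients (50 male; mean age 65.0±11.9 years) who received a first permanent pacemaker at the General Hospital of Shenyang Military Region between January 2006 and December 2007 were enrolled. Atrial arrhythmias (atrial fibrillation, atrial flutter, atrial tachycardia) were excluded before implantation by medical history and relevant examinations. Patients were followed for a mean of 3.9 years after implantation to observe new-onset atrial arrhythmias. According to whether atrial arrhythmia occurred after implantation, patients were divided into a new-onset atrial arrhythmia group and a no atrial arrhythmia group. Changes in pre- and postoperative echocardiographic findings, ventricular pacing percentage, pacing site, and pacing mode were compared between the two groups, and logistic regression was used to analyze the factors influencing atrial arrhythmia after pacemaker implantation. Results: The new-onset atrial arrhythmia group comprised 26 patients (24.3%), including 17 with atrial fibrillation (15.9%), 2 with atrial flutter (1.9%), and 7 with atrial tachycardia (6.5%); the no atrial arrhythmia group comprised 81 patients. Compared with the no atrial arrhythmia group, the new-onset group showed a significantly increased left atrial diameter (P=0.040), more severe mitral regurgitation (P=0.032), a significantly decreased left ventricular ejection fraction (P=0.001), and a significantly higher ventricular pacing percentage (VP%) (P=0.017). The incidence of atrial arrhythmia was significantly higher with apical pacing than with septal pacing (33.3% vs 16.9%, P<0.05), and significantly lower in the dual-chamber pacing group than in the single-chamber pacemaker group (18.7% vs 37.5%, P<0.05). Logistic regression showed that new-onset atrial arrhythmia after implantation was significantly associated with a high proportion of ventricular pacing (P=0.006), VVI(R) pacing mode (P=0.014), and placement of the right ventricular pacing lead at the apex (P=0.024). Conclusion: Pacing mode, ventricular pacing percentage, and pacing site are factors influencing the occurrence of atrial arrhythmia after pacemaker implantation.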

18.
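Research on protein function is fundamental to unraveling the mysteries of life, and machine learning techniques are already widely used in this field. Using the support vector machine (SVM) method, a general platform for predicting protein functional sites was built. The platform first extracts non-homologous protein sequences and then encodes features for these sequences (including basic sequence information, physicochemical properties, structural information, and sequence conservation). The encoded samples are used as training data for the SVM, yielding evaluation metrics such as sensitivity, specificity, Matthews correlation coefficient, accuracy, and the ROC curve. After repeated testing, the SVM model with the best evaluation metrics can be used to predict functional sites on protein sequences. Beyond predicting protein functional sites, the platform can also be applied to the prediction and analysis of disease-related single nucleotide polymorphisms (SNPs), protein domain prediction, and interactions between biomolecules.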
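The evaluation metrics named above all follow from the four cells of a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name `binary_metrics` and the counts are illustrative assumptions and do not come from the platform described.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, accuracy and Matthews correlation coefficient
    from confusion-matrix counts (standard definitions)."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention used here: MCC is reported as 0 when any marginal is empty.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts for 100 candidate sites.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
```

MCC is often preferred over accuracy for functional-site prediction because true sites are usually far rarer than non-sites.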

19.
Background: Previous epidemiological studies have examined the prevalence and risk factors for a variety of parasitic illnesses, including protozoan and soil-transmitted helminth (STH, e.g., hookworms and roundworms) infections. Despite advancements in machine learning for data analysis, the majority of these studies use traditional logistic regression to identify significant risk factors. Methods: In this study, we used data from a survey of 54 risk factors for intestinal parasitosis in 954 Ethiopian school children. We investigated whether machine learning approaches can supplement traditional logistic regression in identifying intestinal parasite infection risk factors. We used feature selection methods such as InfoGain (IG), ReliefF (ReF), Joint Mutual Information (JMI), and Minimum Redundancy Maximum Relevance (MRMR). Additionally, we predicted children’s parasitic infection status using classifiers such as Logistic Regression (LR), Support Vector Machines (SVM), Random Forests (RF) and XGBoost (XGB), and compared their accuracy and area under the receiver operating characteristic curve (AUROC) scores. For optimal model training, we performed tenfold cross-validation and tuned the classifier hyperparameters. We balanced our dataset using the Synthetic Minority Oversampling (SMOTE) method. Additionally, we used association rule learning to establish a link between risk factors and parasitic infections. Key findings: Our study demonstrated that machine learning could be used in conjunction with logistic regression. Using machine learning, we developed models that accurately predicted four parasitic infections: any parasitic infection at 79.9% accuracy, helminth infection at 84.9%, any STH infection at 95.9%, and protozoan infection at 94.2%. The Random Forests (RF) and Support Vector Machines (SVM) classifiers achieved the highest accuracy when the top 20 risk factors were considered using Joint Mutual Information (JMI) or when all features were used. The best predictors of infection were socioeconomic, demographic, and hematological characteristics. Conclusions: We demonstrated that feature selection and association rule learning are useful strategies for detecting risk factors for parasite infection. Additionally, we showed that advanced classifiers might be utilized to predict children’s parasitic infection status. When combined with standard logistic regression models, machine learning techniques can identify novel risk factors and predict infection risk.
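The balancing step mentioned in the Methods can be illustrated with a simplified sketch of the SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. This is not the study's implementation; the function name `smote_sketch`, the parameters `k` and `seed`, and the toy points are assumptions for illustration.

```python
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic samples by interpolating between minority
    samples and their k nearest minority neighbours (simplified SMOTE)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Toy minority class: the four corners of the unit square.
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sketch(minority_points, n_new=6)
```

Oversampling of this kind is normally applied to the training folds only, so that synthetic points never leak into the data used to measure accuracy.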

20.
MOTIVATION: Small non-coding RNA (ncRNA) genes play important regulatory roles in a variety of cellular processes. However, detection of ncRNA genes is a great challenge to both experimental and computational approaches. In this study, we describe a new approach called positive sample only learning (PSoL) to predict ncRNA genes in the Escherichia coli genome. Although PSoL is a machine learning method for classification, it requires no negative training data, which, in general, is hard to define properly and affects the performance of machine learning dramatically. In addition, using the support vector machine (SVM) as the core learning algorithm, PSoL can integrate many different kinds of information to improve the accuracy of prediction. Besides the application of PSoL for predicting ncRNAs, PSoL is applicable to many other bioinformatics problems as well. RESULTS: The PSoL method is assessed by 5-fold cross-validation experiments which show that PSoL can achieve about 80% accuracy in recovery of known ncRNAs. We compared PSoL predictions with five previously published results. The PSoL method has the highest percentage of predictions overlapping with those from other methods.
