Similar articles
 20 similar articles found (search time: 31 ms)
1.
Cardiorespiratory events (CREs), including bradycardia and apnea, in infants are a major concern for physicians and families. Our hypothesis was that there is a difference in the heart rate variability (HRV) of infants who have CREs when compared to normal control infants. The purpose of this study was to develop CRE prediction models based on HRV measured during a polysomnographic (PSG) recording. ANCOVA accounting for factors such as age and sleep state shows a relationship between HRV variables and CREs. Prediction models, including neural networks and support vector machines, were developed to predict CREs within either (a) 1 week or (b) 1 month after the PSG. On an independent testing dataset, the support vector machine predicted CRE susceptibility one month after the PSG with 50.0% sensitivity and 82.6% specificity. Although the developed prediction models were not sufficiently accurate for clinical decision making, these results support the potential role of abnormalities in autonomic control of heart rate among infants at risk for CREs.

2.
A model selection method based on tabu search is proposed to build support vector machines (binary decision functions) of reduced complexity and efficient generalization. The aim is to build a fast and efficient support vector machine classifier. A criterion is defined to evaluate the quality of a decision function, blending its recognition rate with its complexity. Tabu search selects the simplification level (by vector quantization), the feature subset, and the support vector machine hyperparameters so as to optimize this criterion and find a good sub-optimal model in tractable time.

3.
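Research on the relationship between support vector machines and neural networks   Total citations: 2, self-citations: 0, citations by others: 2
The support vector machine is a novel machine learning method based on statistical learning theory. Owing to its excellent learning performance, it has become a research hotspot in the international machine learning community and has been widely applied to classification and regression problems. This paper applies the structural risk function to the training of radial basis function networks and discusses the relationship between the support vector regression model and radial basis function networks. Simulation examples show that the proposed algorithm improves the generalization performance of radial basis function networks.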

4.
Classification of patients based on molecular markers, for example into different risk groups, is a modern field in medical research. The aim of this classification is often a better diagnosis or individualized therapy. The search for molecular markers often utilizes extremely high-dimensional data sets (e.g. gene-expression microarrays). However, in situations where the number of measured markers (genes) is intrinsically higher than the number of available patients, standard methods from statistical learning fail to deal correctly with this so-called "curse of dimensionality". Feature or dimension reduction techniques based on statistical models also promise only limited success. Several recent methods explore ideas of how to quantify and incorporate biological prior knowledge of molecular interactions and known cellular processes into the feature selection process. This article aims to give an overview of such current methods as well as the databases from which this external knowledge can be obtained. For illustration, two recent methods are compared in detail: a feature selection approach for support vector machines and a boosting approach for regression models. As a practical example, data on patients with acute lymphoblastic leukemia are considered, where the binary endpoint "relapse within first year" should be predicted.

5.
6.
We provide a novel interpretation of the dual of support vector machines (SVMs) in terms of scatter with respect to class prototypes and their mean. As a key contribution, we extend this framework to multiple classes, providing a new joint Scatter SVM algorithm, at the level of its binary counterpart in the number of optimization variables. This enables us to implement computationally efficient solvers based on sequential minimal and chunking optimization. As a further contribution, the primal problem formulation is developed in terms of regularized risk minimization and the hinge loss, revealing the score function to be used in the actual classification of test patterns. We investigate Scatter SVM properties related to generalization ability, computational efficiency, sparsity and sensitivity maps, and report promising results.
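The abstract above mentions a primal formulation based on regularized risk minimization and the hinge loss. As a rough illustration only, the sketch below evaluates the standard linear SVM objective (an L2 penalty plus the mean hinge loss). It is not the Scatter SVM formulation, which the abstract does not spell out; the function names and the `lam` parameter are assumptions made for this example.

```python
def hinge_loss(score, label):
    """Hinge loss for a label in {-1, +1} and a real-valued score."""
    return max(0.0, 1.0 - label * score)


def regularized_risk(w, b, samples, lam):
    """Standard linear SVM primal objective (assumed form, not Scatter SVM):
    lam/2 * ||w||^2 + mean hinge loss over (x, y) pairs."""
    penalty = 0.5 * lam * sum(wi * wi for wi in w)
    losses = []
    for x, y in samples:
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        losses.append(hinge_loss(score, y))
    return penalty + sum(losses) / len(losses)


# Hypothetical toy data: two features, labels in {-1, +1}.
toy_samples = [([2.0, 0.0], 1), ([0.5, 0.0], 1), ([-1.0, 0.0], -1)]
risk = regularized_risk([1.0, 0.0], 0.0, toy_samples, lam=1.0)
```

Only the middle sample falls inside the margin here, so it is the only one that contributes a nonzero hinge loss.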

7.
Prognostic and diagnostic biomarker discovery is one of the key issues for a successful stratification of patients according to clinical risk factors. For this purpose, statistical classification methods, such as support vector machines (SVM), are frequently used tools. Different groups have recently shown that the usage of prior biological knowledge significantly improves the classification results in terms of accuracy as well as reproducibility and interpretability of gene lists. Here, we introduce pathClass, a collection of different SVM-based classification methods for improved gene selection and classification performance. The methods contained in pathClass do not merely rely on gene expression data but also exploit the information that is carried in gene network data. AVAILABILITY: pathClass is open source and freely available as an R package on the CRAN repository at http://cran.r-project.org.

8.
Statistical, spectral, multi-resolution and non-linear methods were applied to heart rate variability (HRV) series linked with classification schemes for the prognosis of cardiovascular risk. A total of 90 HRV records were analyzed: 45 from healthy subjects and 45 from cardiovascular risk patients. A total of 52 features from all the analysis methods were evaluated using the standard two-sample Kolmogorov-Smirnov test (KS-test). The results of the statistical procedure provided input to multi-layer perceptron (MLP) neural networks, radial basis function (RBF) neural networks and support vector machines (SVM) for data classification. These schemes showed high performance on both training and test sets and with many combinations of features (with a maximum accuracy of 96.67%). Additionally, breathing frequency emerged as a relevant feature in the HRV analysis.
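The feature screening step described above relies on the two-sample Kolmogorov-Smirnov test. A minimal sketch of the underlying statistic (the largest gap between the two empirical distribution functions) is given below. The sample values are invented for illustration, and the sketch computes only the statistic, not a p-value or the authors' selection threshold.

```python
import bisect


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: max |F_a(x) - F_b(x)| over all observed x."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    largest_gap = 0.0
    for x in sorted(set(a) | set(b)):
        # Empirical CDF value = fraction of observations <= x.
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Hypothetical values of one HRV feature in two groups (not from the study).
healthy = [0.82, 0.91, 0.77, 0.88]
at_risk = [0.61, 0.70, 0.79, 0.66]
d_value = ks_statistic(healthy, at_risk)
```

A feature whose statistic is close to 1 separates the two groups almost completely, while a value near 0 suggests the groups share the same distribution for that feature.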

9.
Selecting a small number of informative genes for microarray-based tumor classification is central to cancer prediction and treatment. Based on model population analysis, here we present a new approach, called Margin Influence Analysis (MIA), designed to work with support vector machines (SVM) for selecting informative genes. The rationale for performing margin influence analysis lies in the fact that the margin of support vector machines is an important factor which underlies the generalization performance of SVM models. Briefly, MIA reveals genes which have a statistically significant influence on the margin by using the Mann-Whitney U test. The reason for using the Mann-Whitney U test rather than the two-sample t test is that the Mann-Whitney U test is a robust nonparametric method without any distribution-related assumptions. Using two publicly available cancer microarray data sets, it is demonstrated that MIA typically selects a small number of margin-influencing genes and achieves classification accuracy comparable to that reported in the literature. These distinguishing features and this performance may make MIA a good alternative for gene selection in high-dimensional microarray data. (The source code in MATLAB with GNU General Public License Version 2.0 is freely available at http://code.google.com/p/mia2009/).
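To make the role of the Mann-Whitney U test more concrete, the sketch below computes the U statistic for two groups of values by direct pair counting. The two groups of margin values are hypothetical; how MIA actually forms the groups it compares is not detailed in the abstract, and no p-value is computed here.

```python
def mann_whitney_u(x, y):
    """U statistic for sample x: number of pairs (xi, yj) with xi > yj,
    counting ties as one half. No normality assumption is involved."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


# Hypothetical SVM margins for sub-models that include or exclude one gene.
margins_with_gene = [1.9, 2.1, 2.4, 2.0]
margins_without_gene = [1.2, 1.5, 1.1, 1.8]
u_with = mann_whitney_u(margins_with_gene, margins_without_gene)
u_without = mann_whitney_u(margins_without_gene, margins_with_gene)
```

The two statistics always sum to the product of the sample sizes, so a value of `u_with` near that product indicates that margins in the first group are consistently larger.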

10.
We compared the ability of three machine learning algorithms (linear discriminant analysis, decision tree, and support vector machines) to automate the classification of calls of nine frog and three bird species. In addition, we tested two ways of characterizing each call to train/test the system. Calls were characterized with four standard call variables (minimum and maximum frequencies, call duration and maximum power) or eleven variables that included three standard call variables (minimum and maximum frequencies, call duration) and a coarse representation of call structure (frequency of maximum power in eight segments of the call). A total of 10,061 isolated calls were used to train/test the system. The average true positive rates for the three methods were: 94.95% for support vector machine (0.94% average false positive rate), 89.20% for decision tree (1.25% average false positive rate) and 71.45% for linear discriminant analysis (1.98% average false positive rate). There was no statistical difference in classification accuracy based on 4 or 11 call variables, but this efficient data reduction technique in conjunction with the high classification accuracy of the SVM is a promising combination for automated species identification by sound. By combining automated digital recording systems with our automated classification technique, we can greatly increase the temporal and spatial coverage of biodiversity data collection.

11.
12.
Wearable sensors have potential for quantitative, gait-based, point-of-care fall risk assessment that can be easily and quickly implemented in clinical-care and older-adult living environments. This investigation generated models for wearable-sensor based fall-risk classification in older adults and identified the optimal sensor type, location, combination, and modelling method for walking with and without a cognitive load task. A convenience sample of 100 older individuals (75.5 ± 6.7 years; 76 non-fallers, 24 fallers based on 6-month retrospective fall occurrence) walked 7.62 m under single-task and dual-task conditions while wearing pressure-sensing insoles and tri-axial accelerometers at the head, pelvis, and left and right shanks. Participants also completed the Activities-specific Balance Confidence scale, Community Health Activities Model Program for Seniors questionnaire, six-minute walk test, and ranked their fear of falling. Fall risk classification models were assessed for all sensor combinations and three model types: multi-layer perceptron neural network, naïve Bayesian, and support vector machine. The best performing model was a multi-layer perceptron neural network with input parameters from pressure-sensing insoles and head, pelvis, and left shank accelerometers (accuracy = 84%, F1 score = 0.600, MCC score = 0.521). Head sensor-based models had the best performance of the single-sensor models for single-task gait assessment. Single-task gait assessment models outperformed models based on dual-task walking or clinical assessment data. Support vector machines and neural networks were the best modelling techniques for fall risk classification. Fall risk classification models developed for point-of-care environments should be developed using support vector machines and neural networks, with a multi-sensor single-task gait assessment.
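The three reported scores (accuracy, F1 and MCC) can all be derived from a single confusion matrix. The sketch below shows the formulas. The counts used in the example are an assumption: they are one confusion matrix that reproduces the reported figures if the model was scored on all 100 participants (24 fallers, 76 non-fallers), which the abstract does not state.

```python
import math


def classification_metrics(tp, fn, fp, tn):
    """Return (accuracy, F1, MCC) from binary confusion-matrix counts."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    f1 = 2 * tp / (2 * tp + fp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when a row or column of the matrix is empty;
    # returning 0.0 in that case is a common convention, assumed here.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, f1, mcc


# Assumed counts (not given in the abstract): fallers are the positive class.
accuracy, f1, mcc = classification_metrics(tp=12, fn=12, fp=4, tn=72)
```

With these assumed counts the accuracy is 0.84 and the F1 score is 0.600, while the MCC rounds to 0.521. The gap between the high accuracy and the moderate F1 reflects the class imbalance: half of the fallers are missed even though most participants are classified correctly.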

13.
Marginalized kernels for biological sequences   Total citations: 1, self-citations: 0, citations by others: 1
MOTIVATION: Kernel methods such as support vector machines require a kernel function between objects to be defined a priori. Several studies have derived kernels from probability distributions, e.g., the Fisher kernel. However, a general methodology to design a kernel is not fully developed. RESULTS: We propose a reasonable way of designing a kernel when objects are generated from latent variable models (e.g., HMM). First of all, a joint kernel is designed for complete data which include both visible and hidden variables. Then a marginalized kernel for visible data is obtained by taking the expectation with respect to hidden variables. We will show that the Fisher kernel is a special case of marginalized kernels, which gives another viewpoint to the Fisher kernel theory. Although our approach can be applied to any object, we particularly derive several marginalized kernels useful for biological sequences (e.g., DNA and proteins). The effectiveness of marginalized kernels is illustrated in the task of classifying bacterial gyrase subunit B (gyrB) amino acid sequences.
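The construction described above (a joint kernel on visible and hidden variables, averaged over the hidden variables) can be sketched in a few lines. The toy posterior table and the joint kernel below are invented for illustration. In the paper's setting the hidden variable would be, for example, an HMM state path rather than a single label.

```python
def marginalized_kernel(x, y, posterior, joint_kernel):
    """K(x, y) = sum over h, h' of p(h|x) * p(h'|y) * K_z((x, h), (y, h'))."""
    total = 0.0
    for hx, px in posterior(x).items():
        for hy, py in posterior(y).items():
            total += px * py * joint_kernel((x, hx), (y, hy))
    return total


# Hypothetical posteriors p(h | x) over two hidden labels "A" and "B".
TOY_POSTERIOR = {
    "seq1": {"A": 0.5, "B": 0.5},
    "seq2": {"A": 1.0, "B": 0.0},
}


def toy_posterior(x):
    return TOY_POSTERIOR[x]


def same_state_kernel(zx, zy):
    """Toy joint kernel: 1 if the hidden labels agree, else 0."""
    return 1.0 if zx[1] == zy[1] else 0.0


k12 = marginalized_kernel("seq1", "seq2", toy_posterior, same_state_kernel)
```

With this toy joint kernel the result is simply the probability that the two objects share the same hidden label, which gives 0.5 for `seq1` and `seq2`.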

14.
McCALL, CHEKE, WILSON, POST, FLOOK, MANK, SIMA & MAS. Medical and Veterinary Entomology 1998, 12(3): 267-275
Onchocerciasis is endemic on the island of Bioko, Equatorial Guinea, where it is transmitted by the 'Bioko form' of the Simulium damnosum complex, a cytospecies unique to the island. To determine the distribution of vector breeding, three dry season and two wet season expeditions were made in 1989, 1996 and 1997, and 226 of the island's 247 rivers (91.5%) were visited. Of these 226 rivers, 130 (58%) were flowing during the dry season, forty-five (20%) supported aquatic stages of Simuliidae of any species and twenty-five (11%) contained larvae or pupae of the S. damnosum complex. The twenty-one rivers not prospected were in the mountainous south of the island, where an additional seventeen rivers were reached but not satisfactorily prospected. Of these thirty-eight rivers, twenty-nine were considered highly likely to support vector breeding, bringing the total number of rivers which could harbour the vector during the dry season to fifty-four (21.9% of the island's total). Breeding was believed to be limited to river stretches below 1000 m altitude, and during the dry season the total length of those stretches which could support breeding on Bioko was estimated to be 1020 km. A combination of factors, including low river discharges during the dry season, the relatively low water temperature on Bioko, the suitability of limited stretches of most rivers as vector breeding sites and the close proximity of many rivers within a small geographical area, render the vector vulnerable to eradication by aerial treatment of rivers with insecticide. The isolation of the Bioko form of the S. damnosum complex suggests that reinvasion following treatment would be unlikely, and eradication of the vector might be achieved by a dry season larviciding programme in one or two years.

15.
A possibility for enrichment of the methodology of expert system development based on a simulation of the probabilistic component of information vagueness is proposed. Using this concept, Gaussian noise with different sizes of dispersion was applied to the experimental values of the main physiological state variables in phenylalanine fermentation by genetically manipulated Escherichia coli. These variables were introduced at the input of the recognition block of the expert system, imitating noisy experimental data. It is shown that implementation of this approach can reveal some important characteristics of expert systems and can be useful for their improvement. This study was carried out within the framework of Bulgaria-UNDP Joint Project DP/BUL/007/86. The authors express their gratitude to Mr. Ricaredo Matanguihan for his assistance during the numerical examination of the fuzzy expert system for the control and management of phenylalanine fermentation.

16.
A pseudo-random generator is an algorithm that generates a sequence of objects which is determined by a truly random seed but is not itself truly random. It has been widely used in many applications, such as cryptography and simulations. In this article, we examine current popular machine learning algorithms with various on-line algorithms for pseudo-random generated data in order to find out which machine learning approach is more suitable for this kind of data for prediction based on on-line algorithms. To further improve the prediction performance, we propose a novel sample weighted algorithm that takes generalization errors in each iteration into account. We perform intensive evaluation on real Baccarat data generated by casino machines and random numbers generated by a popular Java program, which are two typical examples of pseudo-random generated data. The experimental results show that support vector machine and k-nearest neighbors have better performance than others with and without the sample weighted algorithm on the evaluation data set.

17.
A new software tool making use of a genetic algorithm for multi-objective experimental optimization (GAME.opt) was developed based on a strength Pareto evolutionary algorithm. The software deals with high dimensional variable spaces and unknown interactions of design variables. This approach was evaluated by means of multi-objective test problems replacing the experimental results. A default parameter setting is proposed enabling users without expert knowledge to minimize the experimental effort (small population sizes and few generations).

18.
Sparse kernel methods like support vector machines (SVM) have been applied with great success to classification and (standard) regression settings. Existing support vector classification and regression techniques however are not suitable for partly censored survival data, which are typically analysed using Cox's proportional hazards model. As the partial likelihood of the proportional hazards model only depends on the covariates through inner products, it can be 'kernelized'. The kernelized proportional hazards model however yields a solution that is dense, i.e. the solution depends on all observations. One of the key features of an SVM is that it yields a sparse solution, depending only on a small fraction of the training data. We propose two methods. One is based on a geometric idea, where, akin to support vector classification, the margin between the failed observation and the observations currently at risk is maximised. The other approach is based on obtaining a sparse model by adding observations one after another, akin to the Import Vector Machine (IVM). Data examples studied suggest that both methods can outperform competing approaches. AVAILABILITY: Software is available under the GNU Public License as an R package and can be obtained from the first author's website http://www.maths.bris.ac.uk/~maxle/software.html.

19.
Lu CH, Chen YC, Yu CS, Hwang JK. Proteins 2007, 67(2): 262-270
Disulfide bonds play an important role in stabilizing protein structure and regulating protein function. Therefore, the ability to infer disulfide connectivity from protein sequences will be valuable in structural modeling and functional analysis. However, predicting disulfide connectivity directly from sequences presents a challenge to computational biologists due to the nonlocal nature of disulfide bonds, i.e., the close spatial proximity of the cysteine pair that forms the disulfide bond does not necessarily imply a short sequence separation of the cysteine residues. Recently, Chen and Hwang (Proteins 2005;61:507-512) treated this problem as a multiple class classification by defining each distinct disulfide pattern as a class. They used multiple support vector machines based on a variety of sequence features to predict the disulfide patterns. Their results compare favorably with those in the literature for a benchmark dataset sharing less than 30% sequence identity. However, since the number of disulfide patterns grows rapidly when the number of disulfide bonds increases, their method performs unsatisfactorily for proteins with a large number of disulfide bonds. In this work, we propose a novel method to represent disulfide connectivity in terms of cysteine pairs, instead of disulfide patterns. Since the number of bonding states of the cysteine pairs is independent of that of disulfide bonds, the problem of class explosion is avoided. The bonding states of the cysteine pairs are predicted using support vector machines together with genetic algorithm optimization for feature selection. The complete disulfide patterns are then determined from the connectivity matrices that are constructed from the predicted bonding states of the cysteine pairs. Our approach outperforms the current approaches in the literature.
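The "class explosion" mentioned above can be quantified with a short calculation. The sketch below assumes that all 2n cysteines in a protein take part in n disulfide bonds, so the number of distinct patterns is the number of perfect matchings, (2n)! / (2^n n!). The function names are chosen for this example only.

```python
import math


def disulfide_patterns(n_bonds):
    """Ways to pair 2n fully bonded cysteines into n bonds: (2n)! / (2^n * n!)."""
    numerator = math.factorial(2 * n_bonds)
    return numerator // (2 ** n_bonds * math.factorial(n_bonds))


def cysteine_pairs(n_bonds):
    """Candidate cysteine pairs among 2n cysteines; each gets a binary label."""
    return math.comb(2 * n_bonds, 2)


# Pattern classes versus pairwise examples for 2 to 5 bonds.
growth = [(n, disulfide_patterns(n), cysteine_pairs(n)) for n in range(2, 6)]
```

Under this assumption, five bonds already give 945 pattern classes but only 45 cysteine pairs, and each pair is a two-class problem (bonded or not bonded) regardless of how many bonds the protein has.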

20.
Question: Which is the best model to predict the habitat distribution of Buxus balearica Lam. in southern Spain? Location: Málaga and Granada, Spain, across an area of 38 180 km². Methods: Prediction models based on 17 environmental variables were tested. Six methods were compared: multivariate adaptive regression spline (MARS), maximum entropy approach to modelling species' distributions (Maxent), two generic algorithms based on environmental metrics dissimilarity (BIOCLIM and DOMAIN), Genetic Algorithm for Rule-set Prediction (GARP), and supervised learning methods based on generalized linear classifiers (support vector machines, SVMs). To test the predictive power of the models we used the Kappa index. Results: Maxent most accurately predicted the habitat distribution of B. balearica, followed by MARS models. The other models tested yielded lower accuracy values. A comparison of the predictive power of the models revealed that climate variables made the highest contributions among the environmental variables studied. The variables that made the lowest contributions were the insolation models. A sensitivity test with a reduced number of variables showed that an accuracy of over 0.90 was maintained by applying just three climatic variables (spring rainfall, mean temperature of the warmest month, and mean temperature of the coldest month). Maps derived from the algorithms of all models tested coincided well with the known distribution of the species. Conclusions: Model habitat prediction is a preliminary step towards highlighting areas of high habitat suitability of B. balearica. These data support the results of previous research, which show that Maxent is the best technique for modelling species distributions with small sample sizes.
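The Kappa index used above to compare models is assumed here to be Cohen's kappa, which corrects the observed agreement between predicted and observed presence/absence for the agreement expected by chance. The sketch below computes it from a 2x2 table. The counts in the example are invented and are not taken from the study.

```python
def cohen_kappa(tp, fn, fp, tn):
    """Cohen's kappa for a 2x2 presence/absence table:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = tp + fn + fp + tn
    observed = (tp + tn) / n
    # Chance agreement from the marginal totals of predictions and observations.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    if expected == 1.0:
        return 0.0  # degenerate table; returning 0.0 is an assumed convention
    return (observed - expected) / (1.0 - expected)


# Hypothetical evaluation of one habitat model on 50 sites.
kappa = cohen_kappa(tp=20, fn=5, fp=10, tn=15)
```

In this invented example 70% of the sites are classified correctly, but half of that agreement would be expected by chance, so kappa is 0.4.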
