Similar articles
20 similar articles found
1.
Lameness is one of the costliest health problems, as well as a welfare concern, in dairy cows. However, it is difficult to detect cows that may be lame or that are at risk of becoming lame, for example within the following week. In this study, we investigated the ability of three machine learning algorithms, Naïve Bayes (NB), Random Forest (RF) and Multilayer Perceptron (MLP), to predict cases of lameness using milk production and conformation traits. The performance of these algorithms was compared with logistic regression (LR) as the gold standard approach for binary classification. We had a total of 2 535 lameness scores (2 248 sound and 287 unsound) and 29 predictor features from nine dairy herds in Australia to predict lameness incidence. Training was done on 80% of the data within each herd, with the remainder used as the validation set. Our results indicated that, in terms of area under the receiver operating characteristic curve, there were negligible differences between LR (0.67) and NB (0.66), while MLP (0.62) and RF (0.61) underperformed compared to the other two methods. However, the F1-score of NB (27%) exceeded that of LR (1%), suggesting that NB could potentially be a more reliable method for the prediction of lameness in practice, provided enough relevant data are available for proper training, which was a limitation in this study. Despite the small size of our dataset, the short time gap between production records and lameness scoring, and the lack of information about environmental conditions prior to the incidence of lameness, management practices and farm characteristics, this study proved the concept of using machine learning models to predict lameness before it occurs; such models may thus become a valuable decision support system for better lameness management in precision dairy farming.
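The gap between a similar AUC and a very different F1-score is easier to see on a small worked example. The sketch below is illustrative only: the labels, scores and thresholds are hypothetical toy values, not data from the study, and the functions are plain textbook definitions rather than the authors' code. It shows how a model can rank positives well (high AUC) and still reach an F1-score of zero at a fixed decision threshold when the positive class is rare.

```python
def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def predict(scores, threshold):
    """Turn scores into hard 0/1 predictions at a given threshold."""
    return [1 if s >= threshold else 0 for s in scores]


# Hypothetical imbalanced toy data: 8 sound cows (0) and 2 lame cows (1).
Y_TRUE = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
SCORES = [0.05, 0.10, 0.10, 0.15, 0.20, 0.20, 0.25, 0.40, 0.30, 0.45]

if __name__ == "__main__":
    print("AUC:", auc(Y_TRUE, SCORES))
    print("F1 at threshold 0.5:", f1_score(Y_TRUE, predict(SCORES, 0.5)))
    print("F1 at threshold 0.3:", f1_score(Y_TRUE, predict(SCORES, 0.3)))
```

On this toy data the ranking is good, yet no score reaches 0.5, so the default threshold yields no positive predictions at all. Lowering the threshold recovers the positives. This is one plausible reading of why two models with near-identical AUC can differ sharply in F1.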

2.
Identifying ear, nose and throat (ENT) bacteria in a hospital environment with an electronic nose is a classical and challenging classification problem. In this paper an electronic nose (e-nose) comprising a hybrid array of 12 tin oxide (SnO2) sensors and 6 conducting polymer sensors was used to identify three species of bacteria responsible for ENT infections, Escherichia coli (E. coli), Staphylococcus aureus (S. aureus) and Pseudomonas aeruginosa (P. aeruginosa), collected as swab samples from infected patients and kept in ISO agar solution in the hospital environment. In the next stage a sub-classification technique was developed to distinguish two types of S. aureus, namely Methicillin-Resistant S. aureus (MRSA) and Methicillin-Susceptible S. aureus (MSSA). An innovative Intelligent Bayes Classifier (IBC) based on Bayes' theorem and the maximum probability rule was developed and investigated for these three main groups of ENT bacteria. Along with the IBC, three other supervised classifiers, namely Multilayer Perceptron (MLP), Probabilistic Neural Network (PNN) and Radial Basis Function Network (RBFN), were used to classify the three main bacteria classes. A comparative evaluation of the classifiers was conducted for this application. IBC outperformed MLP, PNN and RBFN. The best results suggest that the three main bacteria classes can be identified and classified with up to 100% accuracy using IBC. We also achieved 100% accuracy in classifying MRSA and MSSA samples with IBC. We conclude that an IBC-based e-nose can provide a robust and rapid solution for the identification of ENT infections in a hospital environment.

3.
Real-time plant species recognition in an unconstrained environment is a challenging and time-consuming process. The recognition model should cope with computer vision challenges such as scale variations, illumination changes, changes in camera viewpoint or object orientation, cluttered backgrounds and leaf structure (simple or compound). In this paper, a bilateral convolutional neural network (CNN) combined with machine learning classifiers is investigated for the real-time implementation of plant species recognition. The CNN models considered are MobileNet, Xception and DenseNet-121. In the bilateral CNNs (homogeneous or heterogeneous type), the models are connected using the cascade early fusion strategy. The bilateral CNN is used for feature extraction. The extracted features are then classified using different machine learning classifiers: Linear Discriminant Analysis (LDA), multinomial Logistic Regression (MLR), Naïve Bayes (NB), k-Nearest Neighbor (k-NN), Classification and Regression Tree (CART), Random Forest Classifier (RF), Bagging Classifier (BC), Multi-Layer Perceptron (MLP) and Support Vector Machine (SVM). The experimental investigation shows that the multinomial Logistic Regression classifier performed better than the other classifiers, irrespective of the bilateral CNN model (Homogeneous: MoMoNet, XXNet, DeDeNet; Heterogeneous: MoXNet, XDeNet, MoDeNet). It is also observed that the MoDeNet + MLR model attained state-of-the-art results on every dataset (Flavia: 98.71%, Folio: 96.38%, Swedish Leaf: 99.41%, custom-created Leaf-12: 99.39%). The number of mispredictions per class is greatly reduced by using the MoDeNet + MLR model for real-time plant species recognition.

4.
Banerjee AK  M S  M N  Murty US 《Bioinformation》2010,4(10):456-462
Biological systems are highly organized, closely coordinated and highly complex. The growth in secondary data generation and the progress of modern mining techniques provide an opportunity to discover hidden intra- and inter-relations within these non-linear datasets, which will help in understanding complex biological phenomena more efficiently. In this paper we report a comparative classification of Pyruvate Dehydrogenase protein sequences from bacterial sources based on 28 different physicochemical parameters (such as bulkiness, hydrophobicity, total positively and negatively charged residues, α helices, β strands etc.) and the compositions of the 20 amino acid types. Logistic, MLP (Multi Layer Perceptron), SMO (Sequential Minimal Optimization), RBFN (Radial Basis Function Network) and SL (Simple Logistic) methods were compared in this study. MLP was found to be the best method, with a maximum average accuracy of 88.20%. The same dataset was clustered using a 2×2 grid of a two-dimensional SOM (Self-Organizing Map). Clustering analysis revealed the proximity of the unannotated sequences to the genera Mycobacterium and Synechococcus.

5.
Since aggregate stability is the main physical property regulating erodibility, its observation can serve as a useful indicator for monitoring and managing soil degradation. In this context, this study, carried out in the alluvial plain of Cheliff, a semi-arid area, aimed to predict aggregate stability through Mean Weight Diameter (MWD) using pedotransfer functions (PTFs) with different stratifications (textural, salinity and organic-textural) and artificial neural networks (ANNs). Results showed that the best MWD predictions were those of the organic-textural PTFs. Within this stratification, the silty-clay class moderately rich in organic matter (OM) showed the highest significant coefficient of determination R² (0.65) and the lowest mean square error (0.03), whereas the textural and salinity PTFs were very weak predictors with a very low R². The performance of ANNs in predicting MWD was better than that of PTFs. Regarding ANN input variables, the best predictions were obtained with a large number of input variables. Furthermore, with a large number of hidden neurons, the Radial Basis Function (RBF) network performed better than the Multilayer Perceptron (MLP). The best RBF results were always obtained with the Gaussian hidden activation, whereas the best MLP results were not tied to a specific hidden activation.

6.
This paper presents DMP3 (Dynamic Multilayer Perceptron 3), a multilayer perceptron (MLP) constructive training method that constructs MLPs by incrementally adding network elements of varying complexity to the network. DMP3 differs from other MLP construction techniques in several important ways, and the motivation for these differences is given. Information gain rather than error minimization is used to guide the growth of the network, which increases the utility of newly added network elements and decreases the likelihood that a premature dead end in the growth of the network will occur. The generalization performance of DMP3 is compared with that of several other well-known machine learning and neural network learning algorithms on nine real-world data sets. Simulation results show that DMP3 performs better (on average) than any of the other algorithms on the data sets tested. The main reasons for this result are discussed in detail.

7.
This paper shows a quantitative relation between regularization techniques, generalization ability, and the sensitivity of the Multilayer Perceptron (MLP) to input noise. Although many studies on these topics have been presented, in most cases only one of the problems is addressed, and only experimental evidence is provided to illustrate some kind of correlation between generalization, noise immunity and the use of regularization techniques to obtain a set of weights after training that gives the corresponding MLP generalization ability and noise immunity. Here, a new measurement of noise immunity for an MLP is presented. This measurement, termed Mean Squared Sensitivity (MSS), explicitly evaluates the Mean Squared Error (MSE) degradation of an MLP when it is perturbed by input noise, and can be computed from the (previously proposed) statistical sensitivities of the output neurons. The MSS provides an accurate evaluation of the MLP performance loss when its inputs are perturbed by noise and can also be considered a measurement of the smoothness of the error surface with respect to the inputs. Thus, as the MSS can be used to evaluate noise immunity or generalization ability, it gives a criterion for selecting among different weight configurations that present a similar MSE after training.
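The quantity that the MSS is said to evaluate, the MSE degradation under input noise, can also be estimated by brute force. The sketch below is a Monte Carlo stand-in under stated assumptions, not the MSS formula from the paper (which is computed analytically from statistical sensitivities). The tiny network, its weights, the Gaussian noise model and all function names are hypothetical.

```python
import math
import random


def mlp(x, w_hidden, w_out):
    """One-hidden-layer MLP with tanh hidden units and a linear output (no biases)."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    return sum(w * h for w, h in zip(w_out, hidden))


def mse(inputs, targets, w_hidden, w_out, sigma=0.0, trials=1, rng=None):
    """MSE over the data set, with zero-mean Gaussian noise of std sigma added to inputs."""
    rng = rng or random.Random(0)
    total, count = 0.0, 0
    for _ in range(trials):
        for x, t in zip(inputs, targets):
            noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
            total += (mlp(noisy, w_hidden, w_out) - t) ** 2
            count += 1
    return total / count


def mse_degradation(inputs, targets, w_hidden, w_out, sigma, trials=200, seed=0):
    """Empirical MSE increase caused by input noise (noisy MSE minus clean MSE)."""
    clean = mse(inputs, targets, w_hidden, w_out)
    noisy = mse(inputs, targets, w_hidden, w_out, sigma, trials, random.Random(seed))
    return noisy - clean


# Hypothetical weights and inputs, chosen only for illustration.
W_HIDDEN = [[0.8, -0.5], [0.3, 0.9]]
W_OUT = [1.2, -0.7]
INPUTS = [[0.0, 0.0], [0.5, -0.5], [1.0, 0.2], [-0.3, 0.8]]

if __name__ == "__main__":
    targets = [mlp(x, W_HIDDEN, W_OUT) for x in INPUTS]  # clean MSE is zero by construction
    for sigma in (0.0, 0.05, 0.2):
        print(sigma, mse_degradation(INPUTS, targets, W_HIDDEN, W_OUT, sigma))
```

Two weight configurations with similar clean MSE could be compared by running `mse_degradation` on each at the same noise level. The selection criterion described in the abstract serves the same purpose without sampling.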

8.
In this paper, a new approach based on eigen-system pseudo-spectral estimation methods, namely Eigenvector (EV) and MUSIC, and a Multilayer Perceptron (MLP) neural network is introduced. In this approach, the calculated EEG (electroencephalogram) spectrum is divided into smaller frequency sub-bands. Then, a set of features, {maximum, entropy, average, standard deviation, mobility}, is extracted from these sub-bands. Next, a feature vector is formed by combining a set of EEG time-domain features, {standard deviation, complexity measure}, with the spectral feature set. The feature vector is then fed into an MLP neural network to classify the signal into one of three states: normal (healthy), epileptic patient signal in a seizure-free interval (inter-ictal), and epileptic patient signal in a full seizure interval (ictal). The experimental results show that the EEG signals may be classified with approximately 97.5% accuracy and a variance of 0.095% using a publicly available EEG signal database. These results are among the best reported for classifying the three states. The method combines high speed and high accuracy with a low misclassification rate, and so can make practical, real-time detection of this chronic disease feasible.
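The sub-band feature step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not define its features, so entropy is taken here as the Shannon entropy of the normalized band power and mobility as a Hjorth-style ratio computed along the frequency axis. The equal-width band split and the function names are also hypothetical.

```python
import math


def split_bands(spectrum, n_bands):
    """Split a spectrum into n_bands equal-width sub-bands (any remainder is dropped)."""
    size = len(spectrum) // n_bands
    return [spectrum[i * size:(i + 1) * size] for i in range(n_bands)]


def band_features(power):
    """Return {maximum, entropy, average, std, mobility} for one sub-band.

    Assumes non-negative power values with a positive sum.
    """
    n = len(power)
    mean = sum(power) / n
    var = sum((p - mean) ** 2 for p in power) / n

    # Shannon entropy (bits) of the power distribution within the band.
    total = sum(power)
    probs = [p / total for p in power if p > 0]
    entropy = -sum(q * math.log2(q) for q in probs)

    # Hjorth-style mobility: sqrt(variance of first difference / variance of signal).
    diffs = [b - a for a, b in zip(power, power[1:])]
    if diffs and var > 0:
        d_mean = sum(diffs) / len(diffs)
        d_var = sum((d - d_mean) ** 2 for d in diffs) / len(diffs)
        mobility = math.sqrt(d_var / var)
    else:
        mobility = 0.0  # flat or single-bin band

    return {
        "maximum": max(power),
        "entropy": entropy,
        "average": mean,
        "std": math.sqrt(var),
        "mobility": mobility,
    }


def spectral_feature_vector(spectrum, n_bands):
    """Concatenate the five features of every sub-band into one flat list."""
    order = ("maximum", "entropy", "average", "std", "mobility")
    vector = []
    for band in split_bands(spectrum, n_bands):
        feats = band_features(band)
        vector.extend(feats[name] for name in order)
    return vector
```

A spectrum split into `n_bands` sub-bands yields `5 * n_bands` spectral features. The time-domain features mentioned in the abstract would be appended to this list before classification.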

9.
Rangelands, with more than 8000 plant species, occupy nearly 54.6% of the land area of Iran and thus represent a rich store of plant genetic resources. Mazandaran province has 378,000 ha of rangelands with high plant species richness and diversity owing to its climate, but plant distribution is at risk because of unprincipled management, land use change and the resulting changes in environmental factors. Vegetation management strategies can be guided by species distribution models (SDMs) that predict plant species distribution from the governing environmental variables. This is especially useful for the dominant species that determine ecosystem processes. The modelling algorithm in each SDM determines its suitability for different ecosystems. Our aim was to compare the predictive power of a number of SDMs and to evaluate the importance of a range of environmental variables as predictors in the context of semi-arid rangeland vegetation. The selected study area, the Sarkhas rangelands (northern Iran, 36°10′42″ N, 51°19′11″ E), covers approximately 4358.9 ha of Mazandaran province. The efficacy of four modelling techniques, as well as an Ensemble model, was evaluated in predicting the distribution of five dominant forage plant species (Vicia villosa, Stachys lavandulifolia, Coronilla balansae, Sanguisorba minor and Alopecurus textilis). The models used were artificial neural network (ANN), boosted regression trees (BRT), classification and regression trees (CART), and random forest (RF). Ensemble, RF and CART had the highest area under the curve (AUC). The AUC values obtained for Vicia villosa, Stachys lavandulifolia, Coronilla balansae, Sanguisorba minor and Alopecurus textilis were 0.90, 0.72, 0.76, 0.69 and 0.75, respectively. The Ensemble model most consistently demonstrated high predictive power across species in the rangeland context investigated here. BRT exhibited the least predictive power. An importance analysis of variables showed that soil organic C according to the CART model (0.396) and K according to the RF model (0.396) were the most important environmental variables.

10.
Prostate cancer is the most common cancer in men over 50 years of age, and it has been shown that nuclear magnetic resonance spectra are sensitive enough to distinguish normal from cancerous tissue. In this paper, we propose a technique for classifying spectra from magnetic resonance spectroscopy. We studied automatic classification with and without quantification of metabolite signals. The dataset comprises 22 patient datasets with biopsy-proven cancer, from which we extracted 2464 spectra from the whole prostate, of which 1062 were localised in the peripheral zone. The spectra were manually classified into 3 categories (undetermined, healthy and pathologic) by a spectroscopist with 4 years' experience in clinical spectroscopy of prostate cancer. We used different preprocessing methods (module, phase correction only, phase correction and baseline correction) as input for a Support Vector Machine and a Multilayer Perceptron, and we compared the results with those from the expert. If we classify only healthy and pathologic spectra, we reach a total error rate of 4.51%. However, if we classify all spectra (undetermined, healthy and pathologic), the total error rate rises to 11.49%. We have shown in this paper that the best results are obtained using the pre-processed spectra without quantification as input for the classifiers, and we confirm that Support Vector Machines are more efficient than Multilayer Perceptrons in processing high-dimensional data.

11.
MOTIVATION: Multilayer perceptrons (MLP) represent one of the widely used and effective machine learning methods currently applied to diagnostic classification based on high-dimensional genomic data. Since the dimensionalities of the existing genomic data often exceed the available sample sizes by orders of magnitude, the MLP performance may degrade owing to the curse of dimensionality and over-fitting, and may not provide acceptable prediction accuracy. RESULTS: Based on Fisher linear discriminant analysis, we designed and implemented an MLP optimization scheme for a two-layer MLP that effectively optimizes the initialization of MLP parameters and MLP architecture. The optimized MLP consistently demonstrated its ability in easing the curse of dimensionality in large microarray datasets. In comparison with a conventional MLP using random initialization, we obtained significant improvements in major performance measures including Bayes classification accuracy, convergence properties and area under the receiver operating characteristic curve (A(z)). SUPPLEMENTARY INFORMATION: The Supplementary information is available on http://www.cbil.ece.vt.edu/publications.htm

12.
The goal of this study is to investigate the influence of mental fatigue on features of the event-related potential P300 (maximum peak, minimum amplitude, latency and period) during virtual wheelchair navigation. For this purpose, an experimental environment was set up with customizable environmental parameters (luminosity, number of obstacles and obstacle velocities). A correlation study between P300 features and fatigue ratings was conducted. Finally, the best-correlated features were supplied to three classification algorithms: MLP (Multi Layer Perceptron), Linear Discriminant Analysis and Support Vector Machine. The results showed that the maximum feature over visual and temporal regions, as well as the period feature over frontal, fronto-central and visual regions, were correlated with mental fatigue levels. On the other hand, the minimum amplitude and latency features did not show any correlation. Among the classification techniques, MLP showed the best performance, although the differences between techniques were minimal. These findings can help in designing suitable wheelchair control that takes mental fatigue into account.

13.

Background

Extracting relevant information from microarray data is a very complex task because the data sets comprise a large number of features while few samples are generally available. Feature selection is therefore a very important aspect of the analysis, helping to identify relevant genes and to maximize predictive information.

Methods

Due to its simplicity and speed, Stepwise Forward Selection (SFS) is a widely used feature selection technique. In this work, we carry out a comparative study of SFS and Genetic Algorithms (GA) as general frameworks for the analysis of microarray data, with the aim of identifying groups of genes with high predictive capability and biological relevance. Six standard and machine learning-based techniques (Linear Discriminant Analysis (LDA), Support Vector Machines (SVM), Naive Bayes (NB), C-MANTEC Constructive Neural Network, K-Nearest Neighbors (kNN) and Multilayer Perceptron (MLP)) are used within both frameworks on six freely available public datasets for the task of predicting cancer outcome.
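For readers unfamiliar with it, stepwise forward selection can be sketched as a greedy loop. The version below is a generic textbook form, assumed here for illustration; the study's exact stopping rule and scoring are not given in the abstract. The `score` callback (for example, the cross-validated accuracy of a classifier trained on the chosen genes) and the toy gain table are hypothetical.

```python
def forward_selection(features, score):
    """Greedy stepwise forward selection.

    features: iterable of candidate feature names.
    score:    callable taking a list of selected features and returning a number
              (higher is better), e.g. a cross-validated accuracy.
    Adds, at each step, the single feature that improves the score most,
    and stops when no remaining feature improves it.
    """
    selected = []
    best = score(selected)
    remaining = list(features)
    while remaining:
        scored = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(scored, key=lambda pair: pair[0])
        if top_score <= best:
            break  # no candidate improves the current subset
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected


# Hypothetical additive scoring function standing in for a real classifier evaluation.
TOY_GAIN = {"g1": 0.30, "g2": 0.20, "g3": -0.05}


def toy_score(subset):
    """Baseline 0.5 plus a fixed (made-up) contribution per selected gene."""
    return 0.5 + sum(TOY_GAIN[g] for g in subset)


if __name__ == "__main__":
    print(forward_selection(TOY_GAIN, toy_score))
```

Each step costs one score evaluation per remaining feature, which is why the method is fast. A genetic algorithm instead evaluates whole populations of candidate subsets over many generations, which may explain the higher computational cost reported for it.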

Results

Better cancer outcome prediction results were obtained using the GA framework, although this approach, in comparison to SFS, leads to a larger selection set and uses a large number of comparisons between genetic profiles, and is thus computationally more intensive. The GA framework also yielded a set of genes that can be considered more biologically relevant. Regarding the different classifiers used, standard feedforward neural networks (MLP), LDA and SVM gave similar results, which were also the best, while C-MANTEC and kNN followed closely but with lower accuracy. Further, C-MANTEC, MLP and LDA yielded a more limited set of genes than SVM, NB and kNN, and C-MANTEC in particular proved to be the most robust classifier with respect to changes in the parameter settings.

Conclusions

This study shows that if prediction accuracy is the objective, the GA-based approach leads to better results than the SFS approach, independently of the classifier used. Regarding classifiers, even though C-MANTEC did not achieve the best overall results, its performance was competitive and very robust with respect to the parameters of the algorithm, and thus it can be considered a candidate technique for future studies.

14.
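Land cover is an important parameter in the study of plant communities, reflecting the growth status of a community and the quality of its habitat. Conventional small-scale measurement methods are laborious, time-consuming and destructive, and cannot monitor change dynamically. For large-scale measurement, conventional methods are not feasible and remote sensing must be used. Artificial neural networks and multispectral remote sensing data were applied to classify the land cover of Lantau Island, Hong Kong. A suitable multilayer perceptron feedforward neural network was designed for land cover classification, and its results were compared with those of the traditional maximum likelihood classification method. The results show that the neural network method substantially improved classification accuracy.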

15.
16.
The identification of the vocal repertoire of a species represents a crucial prerequisite for a correct interpretation of animal behavior. Artificial Neural Networks (ANNs) have been widely used in behavioral sciences, and today are considered a valuable classification tool for reducing the level of subjectivity and allowing replicable results across different studies. However, to date, no studies have applied this tool to nonhuman primate vocalizations. Here, we apply ANNs for the first time to discriminate the vocal repertoire of a primate species, Eulemur macaco macaco. We designed an automatic procedure to extract both spectral and temporal features from signals, and performed a comparative analysis between a supervised Multilayer Perceptron and two statistical approaches commonly used in primatology (Discriminant Function Analysis and Cluster Analysis), in order to explore the pros and cons of these methods in bioacoustic classification. Our results show that ANNs were able to recognize all seven vocal categories previously described (92.5–95.6%) and performed better than either statistical analysis (76.1–88.4%). The results show that ANNs can provide an effective and robust method for automatic classification in primates as well, suggesting that neural models can be a valuable tool for a better understanding of primate vocal communication. The use of neural networks to identify primate vocalizations and the further development of this approach in studying primate communication are discussed. Am. J. Primatol. 72:337–348, 2010. © 2009 Wiley‐Liss, Inc.

17.
Thermophilic streptococci play an important role in the manufacture of many European cheeses, and a rapid and reliable method for their identification is needed. Randomly amplified polymorphic DNA (RAPD) PCR (RAPD-PCR) with two different primers coupled to hierarchical cluster analysis has proven to be a powerful tool for the classification and typing of Streptococcus thermophilus, Enterococcus faecium, and Enterococcus faecalis (G. Moschetti, G. Blaiotta, M. Aponte, P. Catzeddu, F. Villani, P. Deiana, and S. Coppola, J. Appl. Microbiol. 85:25-36, 1998). In order to develop a fast and inexpensive method for the identification of thermophilic streptococci, RAPD-PCR patterns were generated with a single primer (XD9), and the results were analyzed using artificial neural networks (Multilayer Perceptron, Radial Basis Function network, and Bayesian network) and multivariate statistical techniques (cluster analysis, linear discriminant analysis, and classification trees). Cluster analysis allowed the identification of S. thermophilus but not of enterococci. A Bayesian network proved to be more effective than a Multilayer Perceptron or a Radial Basis Function network for the identification of S. thermophilus, E. faecium, and E. faecalis using simplified RAPD-PCR patterns (obtained by summing the bands in selected areas of the patterns). The Bayesian network also significantly outperformed two multivariate statistical techniques (linear discriminant analysis and classification trees) and proved to be less sensitive to the size of the training set and more robust in the response to patterns belonging to unknown species.

18.
The study reports on the possibility of classifying sleep stages in infants using an artificial neural network. The polygraphic data from 4 babies aged 6 weeks, 6 months and 1 year recorded over 8 hours were available for classification. From each baby 22 signals were recorded, digitized and stored on an optical disc. Subsets of these signals and additional calculated parameters were used to obtain data vectors, each of which represents an interval of 30 sec. For classification, two types of neural networks were used, a Multilayer Perceptron and a Learning Vector Quantizer. The teaching input for both networks was provided by a human expert. For the 6 sleep classes in babies aged 6 months, a 65% to 80% rate of correct classification (4 babies) was obtained for the testing data not previously seen.

19.
Soil cadmium (Cd) contamination has attracted a great deal of attention because of its detrimental effects on animals and humans. This study aimed to develop and compare the performances of stepwise linear regression (SLR), classification and regression tree (CART) and random forest (RF) models in the prediction and mapping of the spatial distribution of soil Cd and to identify likely sources of Cd accumulation in Fuyang County, eastern China. Soil Cd data from 276 topsoil (0–20 cm) samples were collected and randomly divided into calibration (222 samples) and validation datasets (54 samples). Auxiliary data, including detailed land use information, soil organic matter, soil pH, and topographic data, were incorporated into the models to simulate the soil Cd concentrations and further identify the main factors influencing soil Cd variation. The predictive models for soil Cd concentration exhibited acceptable overall accuracies (72.22% for SLR, 70.37% for CART, and 75.93% for RF). The SLR model exhibited the largest predicted deviation, with a mean error (ME) of 0.074 mg/kg, a mean absolute error (MAE) of 0.160 mg/kg, and a root mean squared error (RMSE) of 0.274 mg/kg, and the RF model produced the results closest to the observed values, with an ME of 0.002 mg/kg, an MAE of 0.132 mg/kg, and an RMSE of 0.198 mg/kg. The RF model also exhibited the greatest R² value (0.772). The CART model predictions closely followed, with ME, MAE, RMSE, and R² values of 0.013 mg/kg, 0.154 mg/kg, 0.230 mg/kg and 0.644, respectively. The three prediction maps generally exhibited similar and realistic spatial patterns of soil Cd contamination. The heavily Cd-affected areas were primarily located in the alluvial valley plain of the Fuchun River and its tributaries because of the dramatic industrialization and urbanization processes that have occurred there. The most important variable for explaining high levels of soil Cd accumulation was the presence of metal smelting industries. The good performance of the RF model was attributable to its ability to handle the non-linear and hierarchical relationships between soil Cd and environmental variables. These results confirm that the RF approach is promising for the prediction and spatial distribution mapping of soil Cd at the regional scale.
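The four validation statistics quoted above (ME, MAE, RMSE and R²) follow standard definitions, sketched below. Two conventions are assumptions here, since the abstract does not state them: error is taken as predicted minus observed, and R² is computed as one minus the ratio of residual to total sum of squares. The sample values in the usage lines are made up for illustration and are not the study's data.

```python
import math


def validation_metrics(observed, predicted):
    """Return ME, MAE, RMSE and R2 for paired observed and predicted values."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]  # assumed sign: predicted - observed

    me = sum(errors) / n                              # mean error (bias)
    mae = sum(abs(e) for e in errors) / n             # mean absolute error
    rmse = math.sqrt(sum(e * e for e in errors) / n)  # root mean squared error

    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                           # requires non-constant observations

    return {"ME": me, "MAE": mae, "RMSE": rmse, "R2": r2}


if __name__ == "__main__":
    # Hypothetical soil Cd values in mg/kg, for illustration only.
    obs = [0.20, 0.35, 0.50, 0.90]
    pred = [0.25, 0.30, 0.55, 0.70]
    for name, value in validation_metrics(obs, pred).items():
        print(name, round(value, 3))
```

An ME near zero with a larger MAE, as reported for the RF model, indicates that over- and under-predictions largely cancel on average while individual errors remain.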

20.
This paper reviews recent research into predicting the eating qualities of beef. A range of instrumental and grading approaches are discussed, highlighting implications for the European beef industry. Studies incorporating a number of instrumental and spectroscopic techniques illustrate the potential for online systems to non-destructively measure muscle pH, colour, fat and moisture content of beef with R² (coefficient of determination) values >0.90. Direct predictions of eating quality (tenderness, flavour, juiciness) and fatty acid content using these methods are also discussed, though success varies greatly. R² values for instrumental measures of tenderness have been quoted as high as 0.85, though R² values for sensory tenderness can be as low as 0.01. Discriminant analysis models can improve prediction of variables such as pH and shear force, correctly classifying beef samples into categorical groups with >90% accuracy. Prediction of beef flavour continues to challenge researchers and the industry alike, with R² values rarely quoted above 0.50, regardless of the instrumental or statistical analysis used. Beef grading systems such as EUROP and the United States Department of Agriculture systems provide carcase classification and some indication of yield. Other systems attempt to classify the whole carcase according to expected eating quality. These are being supplemented by schemes such as Meat Standards Australia (MSA), based on consumer satisfaction for individual cuts. In Australia, MSA has grown steadily since its inception, generating a 10% premium for the beef industry in 2015-16 of $187 million. There is evidence that European consumers would respond to an eating quality guarantee provided it is simple and independently controlled. A European beef quality assurance system might encompass environmental and nutritional measures as well as eating quality and would need to be profitable, simple, effective and sufficiently flexible to allow companies to develop their own brands.
