Similar literature
 20 similar articles found (search time: 78 ms)
1.
Freshwater inhabitants in Piedmont (Italy) have been deeply disadvantaged by environmental changes caused by human disturbance. As a result, there are endangered species that need human intervention of an entirely different kind: better management through the development of innovative practical tools. The most ecologically important of the river-dwelling invertebrates is a threatened species, the native white-clawed crayfish Austropotamobius pallipes. This is the species that we focused on in our effort to contribute to species conservation. Specifically, we contrasted three different techniques for modelling data on the presence/absence of this species: logistic regression, decision-tree models and artificial neural networks (ANNs). Logistic regression and decision-tree models (unpruned and pruned) performed worse than ANNs. In this case, tree-pruning techniques did not make these models significantly more reliable, but they did make the trees less complex and therefore the models clearer. ANNs performed best, and we therefore judged them to be the most effective technique.

2.
Recent advances in computing technology have increased interest in applying data mining to ecology. Machine learning is one of the methods used in most of these data mining applications. As is well known, approximately 80% of the resources in most data mining applications are devoted to cleaning and preprocessing the data. However, there are few studies on preprocessing the ecological data used as the input in these data mining systems. In this study, we use four different feature selection methods (χ2, Information Gain, Gain Ratio, and Symmetrical Uncertainty) and evaluate their effectiveness in preprocessing the input data to be used for inducing artificial neural networks (ANNs) and decision trees (DTs). The presence/absence of fish is the data item used to illustrate our models. Feature selection is fundamental to improving the performance of the resulting models. Classification accuracy improves when a small set of optimally selected features is used. DTs and ANNs are very useful tools when applied to modeling the presence/absence of Alburnus alburnus alborella. ANNs generally performed better than DT models.
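As a rough illustration of one of the four filters named in this abstract, Information Gain can be computed for a discretized feature against a presence/absence label. This is only a minimal sketch: the function names `entropy` and `information_gain` and the toy data are assumptions made for illustration and are not taken from the study.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(feature, labels):
    """Entropy of the labels minus the weighted entropy within each feature value."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical toy data: 1 = species present, 0 = species absent
presence = [1, 1, 1, 0, 0, 0, 1, 0]
oxygen = ["high", "high", "high", "low", "low", "low", "high", "low"]
substrate = ["sand", "rock", "sand", "rock", "sand", "rock", "sand", "rock"]

ig_oxygen = information_gain(oxygen, presence)
ig_substrate = information_gain(substrate, presence)
```

In this toy case `oxygen` separates the classes perfectly and would be ranked above `substrate`; a filter of this kind keeps only the top-ranked features before a DT or ANN is trained.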

3.
Alburnus alburnus alborella is a fish species native to northern Italy. It has suffered a very sharp decrease in population over the last 20 years due to human impact. Therefore, it was selected for reintroduction projects. In this research project, support vector machines (SVMs) were tested as possible tools for building reliable models of presence/absence of the species. A system of 198 sites located along the rivers of Piedmont in North-Western Italy was investigated. At each site, 19 physical-chemical and environmental variables were measured. We verified that performance did not improve after feature selection but instead decreased slightly (from Correctly Classified Instances [CCI] = 84.34 and Cohen's k [k] = 0.69 to CCI = 82.81 and k = 0.66). However, feature selection is crucial in identifying the features relevant to the presence/absence of the species. We then compared SVM performance with that of decision trees (DTs) and artificial neural networks (ANNs) built using the same dataset. SVMs outperformed DTs (CCI = 81.39 and k = 0.63) but not ANNs (CCI = 83.03 and k = 0.66), showing that SVMs and ANNs are the best-performing models and suggesting that their application in freshwater management is more promising than traditional and other machine-learning techniques.
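The two scores reported in this abstract, CCI and Cohen's k, can both be derived from a 2×2 confusion matrix. The sketch below shows the arithmetic under stated assumptions: the function name `cci_and_kappa` and the counts are hypothetical (chosen only so that they sum to 198 sites) and are not the study's actual confusion matrix.

```python
def cci_and_kappa(tp, fp, fn, tn):
    """Return (CCI in percent, Cohen's kappa) for a 2x2 confusion matrix."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column marginals
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return 100 * observed, kappa

# Hypothetical counts for 198 sites (illustration only)
cci, kappa = cci_and_kappa(tp=85, fp=15, fn=16, tn=82)
```

Kappa discounts the agreement a classifier would reach by chance, which is why two models with similar CCI can still differ noticeably in k.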

4.
Accurate prediction of species distributions based on sampling and environmental data is essential for further scientific analysis, such as stock assessment, detection of abundance fluctuation due to climate change or overexploitation, and to underpin management and legislation processes. The evolution of computer science and statistics has allowed the development of sophisticated and well-established modelling techniques as well as a variety of promising innovative approaches for modelling species distribution. The appropriate selection of modelling approach is crucial to the quality of predictions about species distribution. In this study, modelling techniques based on different approaches are compared and evaluated in relation to their predictive performance, utilizing fish density acoustic data. Generalized additive models and mixed models amongst the regression models, associative neural networks (ANNs) and artificial neural networks ensemble amongst the artificial neural networks and ordinary kriging amongst the geostatistical techniques are applied and evaluated. A verification dataset is used for estimating the predictive performance of these models. A combination of outputs from the different models is applied for prediction optimization to exploit the ability of each model to explain certain aspects of variation in species acoustic density. Neural networks and especially ANNs appear to provide more accurate results in fitting the training dataset while generalized additive models appear more flexible in predicting the verification dataset. The efficiency of each technique in relation to certain sampling and output strategies is also discussed.

5.
The aim of this work was to predict local fish species richness in the Garonne river basin using three environmental variables (distance from the source, elevation and catchment area). Commonly, patterns of fish species richness have been investigated using simple or multi-linear statistical models. Here, we used backpropagation of artificial neural networks (ANNs) to develop stochastic models of local fish diversity. Two independent data collections were used, the first one to build and test the model; the second one to validate the model. Correlation coefficients between observed values and predicted values both in the testing and the validation procedures were highly significant (r = 0.904, P < 0.001 and r = 0.822, P < 0.001, respectively). The ANN model obtained using only three environmental variables succeeded in explaining ca. 70% of the total variation in local fish species richness. Through these findings, ANNs can be seen as a powerful predictive tool compared to traditional modelling approaches.

6.
We evaluated 1) the performance of an artificial neural network (ANN)-based technology in assessing the respiratory system resistance (Rrs) and compliance (Crs) in a porcine model of acute lung injury and 2) the possibility of using, for ANN training, signals coming from an electrical analog (EA) of the lung. Two differently experienced ANNs were compared. One ANN (ANN(BIO)) was trained on tracings recorded at different time points after the administration of oleic acid in 10 anesthetized and paralyzed pigs during constant-flow mechanical ventilation. A second ANN (ANN(MOD)) was trained on EA simulations. Both ANNs were evaluated prospectively on data coming from four different pigs. Linear regression between ANN output and manually computed mechanics showed a regression coefficient (R) of 0.98 for both ANNs in assessing Crs. On Rrs, ANN(BIO) showed a performance expressed by R = 0.40 and ANN(MOD) by R = 0.61. These results suggest that ANNs can learn to assess the respiratory system mechanics during mechanical ventilation but that the assessment of resistance and compliance by ANNs may require different approaches.

7.
Ho WH  Lee KT  Chen HY  Ho TW  Chiu HC 《PloS one》2012,7(1):e29179

Background

A database for hepatocellular carcinoma (HCC) patients who had received hepatic resection was used to develop prediction models for 1-, 3- and 5-year disease-free survival based on a set of clinical parameters for this patient group.

Methods

The three prediction models included an artificial neural network (ANN) model, a logistic regression (LR) model, and a decision tree (DT) model. Data for 427, 354 and 297 HCC patients with histories of 1-, 3- and 5-year disease-free survival after hepatic resection, respectively, were extracted from the HCC patient database. From each of the three groups, 80% of the cases (342, 283 and 238 cases of 1-, 3- and 5-year disease-free survival, respectively) were selected to provide training data for the prediction models. The remaining 20% of cases in each group (85, 71 and 59 cases in the three respective groups) were assigned to validation groups for performance comparisons of the three models. Area under receiver operating characteristics curve (AUROC) was used as the performance index for evaluating the three models.

Conclusions

The ANN model outperformed the LR and DT models in terms of prediction accuracy. This study demonstrated the feasibility of using ANNs in medical decision support systems for predicting disease-free survival based on clinical databases in HCC patients who have received hepatic resection.
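The 80%/20% partition described in the Methods, and the AUROC used as the performance index, can be sketched as follows. This is an illustrative sketch only: the helper names `split_sizes` and `auroc`, the rounding rule, and the toy scores are assumptions, not the authors' procedure. The cohort sizes (427, 354 and 297) are the ones stated in the abstract.

```python
def split_sizes(n, train_fraction=0.8):
    """Return (training, validation) case counts for a cohort of n cases."""
    n_train = round(n * train_fraction)
    return n_train, n - n_train

def auroc(scores, labels):
    """AUROC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Cohort sizes for 1-, 3- and 5-year disease-free survival, as given in the abstract
cohorts = {1: 427, 3: 354, 5: 297}
sizes = {years: split_sizes(n) for years, n in cohorts.items()}

# Hypothetical model scores and outcomes, for illustration only
toy_scores = [0.9, 0.8, 0.4, 0.35, 0.1]
toy_labels = [1, 1, 0, 1, 0]
toy_auc = auroc(toy_scores, toy_labels)
```

Rounding 80% of each cohort to the nearest case reproduces the training and validation counts quoted in the Methods (342/85, 283/71 and 238/59).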

8.
M. Nie    W. Q. Zhang    M. Xiao    J. L. Luo    K. Bao    J. K. Chen    B. Li 《Journal of Phytopathology》2007,155(6):364-367
A rapid spectroscopic approach for whole‐organism fingerprinting of Fourier transform infrared (FT‐IR) spectroscopy was used to analyse 16 isolates from five closely related species of Fusarium: F. graminearum, F. moniliforme, F. nivale, F. semitectum and F. oxysporum. Principal components analysis and hierarchical cluster analysis were used to study the clusters in the data. On visual inspection of the clusters from both methods, the spectra were not differentiated into five separate clusters corresponding to species and these unsupervised methods failed to identify these fungal strains. When the data were trained by back propagation algorithm of artificial neural networks (ANNs) with principal components scores of spectra used as input modes, the strains were accurately predicted and recognized. The results in this study show that FT‐IR spectroscopy in combination with principal component artificial neural networks (PC‐ANNs) is well suited for identifying Fusarium spp. It would be advantageous to establish a comprehensive database of taxonomically well‐defined Fusarium species to aid the identification of unknown strains.

9.
To facilitate decision support in freshwater ecosystem protection and restoration management, habitat suitability models can be very valuable. Data-driven methods such as artificial neural networks (ANNs) are particularly useful in this context, given their time-efficient development and relatively high reliability. However, specialized and technical literature on neural network modelling offers a variety of model development criteria to select model architecture, training procedure, etc. This may lead to confusion among ecosystem modellers and managers regarding the optimal training and validation methodology. This paper focuses on the analysis of ANN development and application for predicting macroinvertebrate communities, a species group commonly used in freshwater assessment worldwide. This review reflects on the different aspects regarding model development and application based on a selection of 26 papers reporting the use of ANN models for the prediction of macroinvertebrates. This analysis revealed that the applied model training and validation methodologies can often be improved and, moreover, that crucial steps in the modelling process are often poorly documented. Therefore, suggestions to improve model development, assessment and application in ecological river management are presented. In particular, data pre-processing determines to a large extent the reliability of the induced models and their predictive relevance. This also holds for the validation criteria, which need to be better tuned to the practical simulation requirements. Moreover, the use of sensitivity methods can help to extract knowledge on the habitat preference of species and allow peer review by ecological experts. The selection of relevant input variables remains a critical challenge as well. Model coupling is a missing crucial step to link human activities, hydrology, physical habitat conditions, water quality and ecosystem status. This last aspect is probably the most valuable one for enabling decision support in water management based on ANN models.

10.
Bythotrephes longimanus is an invasive pelagic crustacean, which first arrived in North America from Europe in the early 1980s and can now be found throughout the Great Lakes and in many inland lakes and waterways. Determining the suitability of lakes to Bythotrephes establishment is an important step in quantifying its potential habitat range and environmental risk. Lake environmental conditions, planktivorous fishes, sport fishes and Bythotrephes occurrence data from 179 south-central Ontario lakes were used in this study to model lake characteristics suitable for its establishment. The performance of principal component analysis and different predictive models was used to determine the habitats that are suitable for the survival of Bythotrephes and the factors that may regulate its spread. Four modeling approaches were employed: linear discriminant analysis, multiple logistic regression, random forests, and artificial neural networks. Ensemble prediction based on the four modeling approaches was also used as an indicator for predicting Bythotrephes occurrence. Bythotrephes appears to establish more readily in larger, deeper lakes at lower elevation that have more sport fishes. Bythotrephes occurrence can be best predicted by artificial neural networks when including the measures of fish data, in addition to lake environmental data. Lake elevation, surface area and sport fish occurrence were ranked as the most important predictors of Bythotrephes invasion. The inclusion of biotic variables (occurrence or diversity of sport or planktivorous fishes) enhanced cross-validated models relative to analyses based on environmental data alone.

11.
In this paper, we propose to use probabilistic neural networks (PNNs) for classification of bacterial growth/no-growth data and modeling the probability of growth. The PNN approach combines both Bayes theorem of conditional probability and Parzen's method for estimating the probability density functions of the random variables. Unlike other neural network training paradigms, PNNs are characterized by high training speed and their ability to produce confidence levels for their classification decision. As a practical application of the proposed approach, PNNs were investigated for their ability in classification of growth/no-growth state of a pathogenic Escherichia coli R31 in response to temperature and water activity. A comparison with the most frequently used traditional statistical method based on logistic regression and multilayer feedforward artificial neural network (MFANN) trained by error backpropagation was also carried out. The PNN-based models were found to outperform linear and nonlinear logistic regression and MFANN in both the classification accuracy and ease by which PNN-based models are developed.
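The two ingredients named in this abstract, Parzen-window density estimates and Bayes' rule, can be combined in a few lines. The sketch below assumes a Gaussian kernel, equal class priors and a smoothing width `sigma` of 0.5; the function name `pnn_posteriors` and the rescaled (temperature, water activity) points are hypothetical and are not the data or settings of the paper.

```python
from math import exp

def pnn_posteriors(x, training, sigma=0.5):
    """Class posteriors from Gaussian Parzen windows, assuming equal class priors.

    training maps each class label to a list of feature tuples.
    """
    densities = {}
    for label, patterns in training.items():
        total = 0.0
        for p in patterns:
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, p))
            total += exp(-sq_dist / (2 * sigma ** 2))
        # Parzen estimate of the class density, up to a constant shared by all classes
        densities[label] = total / len(patterns)
    norm = sum(densities.values())
    # Bayes' rule with equal priors: normalise the class densities
    return {label: d / norm for label, d in densities.items()}

# Hypothetical points, rescaled to the range 0..1 (temperature, water activity)
training = {
    "growth": [(0.8, 0.9), (0.9, 0.8), (0.7, 0.95)],
    "no-growth": [(0.1, 0.2), (0.2, 0.1), (0.15, 0.3)],
}
post = pnn_posteriors((0.75, 0.85), training)
```

There is no iterative weight fitting here, which is consistent with the high training speed the abstract mentions, and the normalised outputs can be read as confidence levels for the classification decision.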

12.
13.
Various studies have been reported on the bioeffects of magnetic field exposure; however, no consensus or guideline is available for experimental designs relating to exposure conditions as yet. In this study, logistic regression (LR) and artificial neural networks (ANNs) were used in order to analyze and predict the melatonin excretion patterns in the rat exposed to extremely low frequency magnetic fields (ELF‐MF). Subsequently, on a database containing 33 experiments, performances of LR and ANNs were compared through resubstitution and jackknife tests. Predictor variables were the more effective parameters and included frequency, polarization, exposure duration, and strength of magnetic fields. Also, five performance measures, namely accuracy, sensitivity, specificity, Matthews correlation coefficient (MCC) and normalized percentage better than random (S), were used to evaluate the performance of models. The LR as a conventional model obtained poor prediction performance. Nonetheless, LR distinguished the duration of magnetic fields as a statistically significant parameter. Also, horizontal polarization of magnetic fields, with the highest logit coefficient (or parameter estimate) and a negative sign, was found to be the strongest indicator for experimental designs relating to exposure conditions. This means that each experiment with horizontal polarization of magnetic fields has a higher probability of resulting in the “not changed melatonin level” pattern. On the other hand, ANNs, a more powerful model which had not previously been applied to predicting melatonin excretion patterns in the rat exposed to ELF‐MF, showed high performance measure values and higher reliability, in particular an MCC of 0.55 in jackknife tests. The results obtained show that such predictor models are promising and may play a useful role in defining guidelines for experimental designs relating to exposure conditions. In conclusion, analysis of the bioelectromagnetic data could result in finding a relationship between electromagnetic fields and different biological processes. Bioelectromagnetics 31:164–171, 2010. © 2009 Wiley‐Liss, Inc.
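Four of the five performance measures listed in this abstract follow directly from a 2×2 confusion matrix. The sketch below uses the standard definitions; the function name `performance_measures` and the counts are hypothetical (chosen only to sum to 33 experiments) and are not the study's results. The fifth measure, S, is omitted because the abstract does not give its definition.

```python
from math import sqrt

def performance_measures(tp, fp, fn, tn):
    """Accuracy, sensitivity, specificity and MCC from a 2x2 confusion matrix."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, sensitivity, specificity, mcc

# Hypothetical counts summing to 33 experiments (illustration only)
acc, sens, spec, mcc = performance_measures(tp=13, fp=4, fn=3, tn=13)
```

MCC ranges from -1 to 1 and stays informative when the two classes are unbalanced, which makes it a useful complement to plain accuracy on a small database.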

14.
Background: The aim of the present study was to confirm the role of Brachyury in breast cancer and to verify whether four types of machine learning models can use Brachyury expression to predict the survival of patients.

Methods: We conducted a retrospective review of the medical records to obtain patient information, and made the patients’ paraffin tissue into tissue chips for staining analysis. We selected 303 patients for research and implemented four machine learning algorithms, including multivariate logistic regression model, decision tree, artificial neural network and random forest, and compared the results of these models with each other. Area under the receiver operating characteristic (ROC) curve (AUC) was used to compare the results.

Results: The chi-square test results of relevant data suggested that the expression of Brachyury protein in cancer tissues was significantly higher than that in paracancerous tissues (P=0.0335); patients with breast cancer with high Brachyury expression had a worse overall survival (OS) compared with patients with low Brachyury expression. We also found that Brachyury expression was associated with ER expression (P=0.0489). Subsequently, we used four machine learning models to verify the relationship between Brachyury expression and the survival of patients with breast cancer. The results showed that the decision tree model had the best performance (AUC = 0.781).

Conclusions: Brachyury is highly expressed in breast cancer and indicates a poor prognosis. Compared with conventional statistical methods, the decision tree model shows superior performance in predicting the survival status of patients with breast cancer.

15.
16.
The identification of the vocal repertoire of a species represents a crucial prerequisite for a correct interpretation of animal behavior. Artificial Neural Networks (ANNs) have been widely used in behavioral sciences, and today are considered a valuable classification tool for reducing the level of subjectivity and allowing replicable results across different studies. However, to date, no studies have applied this tool to nonhuman primate vocalizations. Here, we apply ANNs for the first time to discriminate the vocal repertoire in a primate species, Eulemur macaco macaco. We designed an automatic procedure to extract both spectral and temporal features from signals, and performed a comparative analysis between a supervised Multilayer Perceptron and two statistical approaches commonly used in primatology (Discriminant Function Analysis and Cluster Analysis), in order to explore the pros and cons of these methods in bioacoustic classification. Our results show that ANNs were able to recognize all seven vocal categories previously described (92.5–95.6%) and performed better than either statistical analysis (76.1–88.4%). The results show that ANNs can provide an effective and robust method for automatic classification also in primates, suggesting that neural models can represent a valuable tool to contribute to a better understanding of primate vocal communication. The use of neural networks to identify primate vocalizations and the further development of this approach in studying primate communication are discussed. Am. J. Primatol. 72:337–348, 2010. © 2009 Wiley‐Liss, Inc.

17.
Halophile proteins can tolerate high salt concentrations. Understanding halophilicity features is the first step toward engineering halostable crops. To this end, we examined protein features contributing to the halo-toleration of halophilic organisms. We compared more than 850 features for halophilic and non-halophilic proteins with various screening, clustering, decision tree, and generalized rule induction models to search for patterns that code for halo-toleration. Up to 251 protein attributes selected by various attribute weighting algorithms as important features contribute to halo-stability; from them 14 attributes selected by 90% of models and the count of hydrogen gained the highest value (1.0) in 70% of attribute weighting models, showing the importance of this attribute in feature selection modeling. The other attributes mostly were the frequencies of di-peptides. No changes were found in the numbers of groups when K-Means and TwoStep clustering modeling were performed on datasets with or without feature selection filtering. Although the depths of induced trees were not high, the accuracies of trees were higher than 94% and the frequency of hydrophobic residues pointed as the most important feature to build trees. The performance evaluation of decision tree models had the same values and the best correctness percentage recorded with the Exhaustive CHAID and CHAID models. We did not find any significant difference in the percent of correctness, performance evaluation, and mean correctness of various decision tree models with or without feature selection. For the first time, we analyzed the performance of different screening, clustering, and decision tree algorithms for discriminating halophilic and non-halophilic proteins and the results showed that amino acid composition can be used to discriminate between halo-tolerant and halo-sensitive proteins.

18.
Artificial neural networks (ANNs) are being applied to recovery of products from fermentation broths. Recovery methods for which mathematical models are complex or non-existent are particularly suitable for control and analysis by ANNs. Use and potential of artificial neural networks for product recovery applications are reviewed.

19.
Aim

The purpose of this study was to improve understanding of the relationship between the spatial patterns of an important insect pest, the aphid Myzus persicae, and aspects of its environment. The main objectives were to determine the predominant geographical, climatic and land use factors that are linked with the aphid's distribution, to quantify their role in determining that distribution, including their interacting effects and to explore the ability of artificial neural networks (ANNs) to provide predictive models.

Location

The study focused on four spatial scales to account for the aphid data base characteristics and available land use data sets: Europe; a broad zone over Europe covering Belgium, Denmark, France, Ireland, Italy, The Netherlands, Scotland, Sweden and Wales (Regio data base coverage); North‐West Europe (i.e. Belgium, France and the United Kingdom); and England with Wales.

Methods

Multiple linear regression (MLR) was used to identify the variables in the Geographic location, Climate and Land use groups, that explained significant proportions of the variance in M. persicae total annual numbers and Julian date of first capture. A variance partitioning procedure was used to measure the fraction of the variation that can be explained by each environmental factor and of shared variation between the different factors. Finally, ANNs were employed as an alternative modelling approach for the two largest study areas, i.e. Europe and the Regio data base coverage, to determine whether the relationship between aphid and environmental variables was better described by more complex functions as well as their ability to generalize to new data.

Results

Land use variables are shown to play a significant role in explaining aphid numbers. The area of agricultural crops, in particular oilseed rape, is positively correlated with M. persicae annual numbers. Among the climatic variables, rainfall is negatively correlated with aphid numbers and temperature is positively correlated. The geographical components also explain a significant part of aphid annual numbers. However, the variance partitioning procedure indicates that while each group has an effect, none is dominant. Aphid first capture is mainly explained by climate where rainfall tends to delay migration and warmer conditions tend to advance it. Climate accounts for the greatest part of the variance when considered separately from the other factors. The geographical and land use components also have a significant effect on first capture at each scale, but their direct contribution is negligible. The ability of the ANN models to generalize to new total numbers and phenological data compared with MLR models was less for Europe (9 and 6% increase in the variance accounted for, respectively) than for the Regio data coverage where an increase of 44% in the variance accounted for was observed.

Main conclusions

This research supports the hypothesis that climate, land use and geographical location play a role in determining patterns of aphid annual numbers and phenology. The ability of ANN models to predict aphid distribution is improved by the inclusion of temporal land use data. However, identification of the processes involved in such relationships is difficult due to numerous interactions between the environmental factors.

20.
The effect of environmental conditions on river macrobenthic communities was studied using a dataset consisting of 343 sediment samples from unnavigable watercourses in Flanders, Belgium. Artificial neural network models were used to analyse the relation among river characteristics and macrobenthic communities. The dataset included presence or absence of macroinvertebrate taxa and 12 physicochemical and hydromorphological variables for each sampling site. The abiotic variables served as input for the artificial neural networks to predict the macrobenthic community. The effects of the input variables on model performance were assessed in order to identify the most diagnostic river characteristics for macrobenthic community composition. This was done by consecutively eliminating the least important variables and, when beneficial for model performance, adding previously removed ones again. This stepwise input variable selection procedure was tested not only on a model predicting the entire macrobenthic community, but also on three models, each predicting an individual taxon. Additionally, during each step of the stepwise leave-one-out procedure, a sensitivity analysis was performed to determine the response of the predicted macroinvertebrate taxa to the input variables applied. This research illustrated that a combination of input variable selection with sensitivity analyses can contribute to the development of reliable and ecologically relevant ANN models. The river characteristics predicting presence or absence of the benthic macroinvertebrates best were the Julian day, conductivity, and dissolved oxygen content. These conditions reflect the importance of discharges of untreated wastewater that occurred during the period of investigation in nearly all Flemish rivers.
