Similar literature
20 similar articles found (search time: 31 ms)
1.
Aim: To test statistical models used to predict species distributions under different shapes of occurrence–environment relationship. We addressed three questions: (1) Is there a statistical technique that has a consistently higher predictive ability than others for all kinds of relationships? (2) How does species prevalence influence the relative performance of models? (3) When an automated stepwise selection procedure is used, does it improve predictive modelling, and are the relevant variables being selected? Location: We used environmental data from a real landscape, the state of California, and simulated species distributions within this landscape. Methods: Eighteen artificial species were generated, which varied in their occurrence response to the environmental gradients considered (random, linear, Gaussian, threshold or mixed), in the interaction of those factors (no interaction vs. multiplicative), and in their prevalence (50% vs. 5%). The landscape was then randomly sampled with a large (n = 2000) or small (n = 150) sample size, and the predictive ability of each statistical approach was assessed by comparing the true and predicted distributions using five different indexes of performance (area under the receiver‐operator characteristic curve, Kappa, correlation between true and predicted probability of occurrence, sensitivity and specificity). We compared generalized additive models (GAM) with and without flexible degrees of freedom, logistic regressions (generalized linear models, GLM) with and without variable selection, classification trees, and the genetic algorithm for rule‐set production (GARP). Results: Species with threshold and mixed responses, additive environmental effects, and high prevalence generated better predictions than did other species for all statistical models. In general, GAM outperforms all other strategies, although differences with GLM are usually not significant.
The two variable‐selection strategies presented here did not discriminate successfully between truly causal factors and correlated environmental variables. Main conclusions: Based on our analyses, we recommend the use of GAM or GLM over classification trees or GARP, and the specification of any suspected interaction terms between predictors. An expert‐based variable selection procedure was preferable to the automated procedures used here. Finally, for low‐prevalence species, variability in model performance is both very high and sample‐dependent. This suggests that distribution models for species with low prevalence can be improved through targeted sampling.
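
Three of the five performance indexes named in item 1 (sensitivity, specificity and Kappa) can be computed directly from a presence/absence confusion table. The following is a minimal sketch under stated assumptions, not the authors' code: the function names, the 0/1 coding, and the use of already-thresholded predictions are all illustrative choices.

```python
def confusion_counts(truth, predicted):
    """Count outcomes for binary data coded 1 (presence) and 0 (absence)."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true presences
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true absences
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false presences
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false absences
    return tp, tn, fp, fn


def performance_indexes(truth, predicted):
    """Return sensitivity, specificity and Cohen's kappa.

    Assumes both classes occur in `truth` and that agreement expected
    by chance is below 1; otherwise a division by zero is raised.
    """
    tp, tn, fp, fn = confusion_counts(truth, predicted)
    n = tp + tn + fp + fn
    sensitivity = tp / (tp + fn)   # share of true presences predicted present
    specificity = tn / (tn + fp)   # share of true absences predicted absent
    observed = (tp + tn) / n       # observed agreement
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return {"sensitivity": sensitivity,
            "specificity": specificity,
            "kappa": kappa}
```

Because the chance-agreement term depends on the row and column totals, Kappa computed this way changes with prevalence even when sensitivity and specificity stay fixed, which is one reason to report several indexes side by side.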

2.
Systematists may rely on morphometric differences among samples of specimens for the recognition of living and fossil species, even though morphometric differentiation may be caused by non-genetic factors, such as ecophenotypy, differential growth rates and taphonomic mixing. When genetic differences between sexes or among closely related species are expressed as differences in the morphology of the individual or population, potentially valuable information becomes available to the systematist for a variety of genetic and ecological investigations. We have studied the morphology of the freshwater snail Melanoides tuberculata (Müller, 1774) in Israel, where males occur in what would otherwise be normally parthenogenetic (all-female) populations. In modern M. tuberculata, sex may be determined by observation of gonadal tissue; in fossil specimens, any classification according to sex must be accomplished using only preservable features of the mineralized shell. Previous research confirmed that in large samples, mean shell shape of male and female snails differed significantly, but the degree of difference was too small to identify the sex of any individual specimen. We apply a three-stage process that discriminates individual M. tuberculata specimens by sex with a high degree of accuracy on the basis of continuous morphological characters: (1) measurement of many aspects of shell morphology of individuals of known sex, and stepwise discrimination to discover which of the variables, if any, contribute to the morphometric differentiation of males from females (one time only, for the species); (2) use of these selected variables in a clustering procedure to make a preliminary assignment of each specimen to sex; (3) use of cluster assignments in a discrimination procedure to optimally predict sex.
For species that exhibit morphometric differences between two groups, and for which continuous morphometric variation precludes the a priori recognition of discrete clusters, this sequential procedure may be of broad applicability. These objective methods may be applied to the discrimination within any set of specimens for which the hypothesis of two, and only two, constituent groups may be entertained.

3.
Synthetic collagen peptides containing larger numbers of Gly‐Pro‐Hyp repeats are difficult to purify by standard chromatographic procedures. Therefore, efficient strategies are required for the synthesis of higher molecular weight collagen‐type peptides. Applying the Fmoc/tBu chemistry, a comparative analysis of the standard stepwise chain elongation procedure on solid support with the procedure based on the use of the synthons Fmoc‐Gly‐Pro‐Hyp(tBu)‐OH and Fmoc‐Pro‐Hyp‐Gly‐OH was performed. The crude products resulting from the stepwise elongation procedure and from the use of Fmoc‐Gly‐Pro‐Hyp(tBu)‐OH clearly revealed large amounts of microheterogeneities that result from incomplete imino acid acylation as well as from diketopiperazine formation with cleavage of Gly‐Pro units from the growing peptide chain. Conversely, by the use of the Fmoc‐Pro‐Hyp‐Gly‐OH synthon, the quality of the crude products was significantly improved; moreover, protection of the Hyp side chain hydroxyl function is not required using the Fmoc/tBu strategy. With this optimized synthetic procedure, relatively large collagen‐type peptides were obtained in satisfactory yields as highly homogeneous compounds. Copyright © 1999 European Peptide Society and John Wiley & Sons, Ltd.

4.
Evolutionary studies of communication can benefit from classification procedures that allow individual animals to be assigned to groups (e.g. species) on the basis of high-dimensional data representing their signals. Prior to classification, signals are usually transformed by a signal processing procedure into structural features. Applications of these signal processing procedures to animal communication have been largely restricted to the manual or semi-automated identification of landmark features from graphical representations of signals. Nonetheless, theory predicts that automated time-frequency-based digital signal processing (DSP) procedures can represent signals more efficiently (using fewer features) than can landmark procedures or frequency-based DSP – allowing more accurate classification. Moreover, DSP procedures are objective in that they require little previous knowledge of signal diversity, and are relatively free from potentially ungrounded assumptions of cross-taxon homology. Using a model data set of electric organ discharge waveforms from five sympatric species of the electric fish Gymnotus, we adopted an exhaustive simulation approach to investigate the classificatory performance of different signal processing procedures. We considered a landmark procedure, a frequency-based DSP procedure (the fast Fourier transform), and two kinds of time-frequency-based DSP procedures (a short-time Fourier transform, and several implementations of the discrete wavelet transform, DWT). The features derived from each of these signal processing procedures were then subjected to dimension reduction procedures to separate those features which permit the most effective discrimination among groups of signalers. We considered four alternative dimension reduction methods. Finally, each combination of reduced data was submitted to classification by linear discriminant analysis.
Our results support theoretical predictions that time-frequency DSP procedures (especially DWT) permit more efficient discrimination of groups. The performance of signal processing was found to depend largely upon the dimension reduction procedure employed, and upon the number of resulting features. Because the best combinations of procedures are dataset-dependent and difficult to predict, we conclude that simulations of the kind described here, or at least simplified versions of them, should be routinely executed before classification of animal signals, especially unfamiliar ones.

5.
ABSTRACT: The jaguar (Panthera onca) and puma (Puma concolor) are the largest felids of the American continent and live in sympatry along most of their distribution. Their tracks are frequently used for research and management purposes, but tracks are difficult to distinguish from each other and can be confused with those of big canids. We used tracks from pumas, jaguars, large dogs, and maned wolves (Chrysocyon brachyurus) to evaluate traditional qualitative and quantitative identification methods and to elaborate multivariate methods to differentiate big canids versus big felids and puma versus jaguar tracks (n = 167 tracks from 18 zoos). We tested accuracy of qualitative classification through an identification exercise with field-experienced volunteers. Qualitative methods were useful but there was high variability in accuracy of track identification. Most of the traditional quantitative methods showed an elevated percentage of misclassified tracks (≥20%). We used stepwise discriminant function analysis to develop 3 discriminant models: 1 for big canid versus big felid track identification and 2 alternative models for jaguar versus puma track differentiation using 1) best discriminant variables, and 2) size-independent variables. These models had high classification performance, with <10% error in the validation procedures. We used simpler discriminant models in the elaboration of identification keys to facilitate the track classification process. We developed an accurate method for track identification, capable of distinguishing between big felid (puma and jaguar) and large canid (dog and maned wolf) tracks and between jaguar and puma tracks. Application of our method will allow a more reliable use of tracks in puma and jaguar research and it will help managers using tracks as indicators of these felids' presence for conservation or management purposes.

6.
Morpho-colorimetric quantitative variables describing seed size, shape, colour and texture were analysed using image analysis techniques, in order to evaluate the variability among Medicago taxa sect. Dendrotelis and verify the current taxonomic treatment which divides this section into three species: M. arborea L., M. citrina (Font Quer) Greuter and M. strasseri Greuter, Matthäs & H. Risse. Further comparisons were conducted to discriminate among populations and regions of provenance. The data were analysed using stepwise linear discriminant analysis (LDA), recording an overall cross-validated classification performance of 100% at species level. With regard to inter-population comparisons, percentages of correct discrimination above 98% were achieved and high performance was recorded in the discrimination among M. arborea taxa distinguished by region of provenance. For each of these statistical comparisons, the best discriminant variables chosen by the stepwise LDA were related to colour and textural information. Finally, the results confirmed that the proposed method is highly diagnostic in the statistical assessment of morpho-colorimetric trait variability in seeds of Medicago taxa, both for taxonomic differentiation at species level and for regional and population groups.

7.
We develop a robust classification procedure, based on Tiku's (1967, 1980, 1982) MML (modified maximum likelihood) estimators, for classifying an observation into one of two populations π1 and π2, and show that this procedure is superior to the classical and nonparametric procedures.

8.
Summary The solution structure of a specific DNA complex of the minimum DNA-binding domain of the mouse c-Myb protein was determined by distance geometry calculations using a set of 1732 nuclear Overhauser enhancement (NOE) distance restraints. In order to determine the complex structure independent of the initial guess, we have developed two different procedures for the docking calculation using simulated annealing in four-dimensional space (4D-SA). One is a multiple-step procedure, where the protein and the DNA were first constructed independently by 4D-SA using only the individual intramolecular NOE distance restraints. Here, the initial structure of the protein was a random coil and that of the DNA was a typical B-form duplex. Then, as the starting structure for the next docking procedure, the converged protein and DNA structures were placed in random molecular orientations, separated by 50 Å. The two molecules were docked by 4D-SA utilizing all the restraints, including the additional 66 intermolecular distance restraints. The second procedure comprised a single step, in which a random-coil protein and a typical B-form DNA duplex were first placed 70 Å from each other. Then, using all the intramolecular and intermolecular NOE distance restraints, the complex structure was constructed by 4D-SA. Both procedures yielded the converged complex structures with similar quality and structural divergence, but the multiple-step procedure has much better convergence power than the single-step procedure. 
A model study of the two procedures was performed to confirm the structural quality, depending upon the number of intermolecular distance restraints, using the X-ray structure of the engrailed homeodomain-DNA complex. Abbreviations: rmsd, root-mean-square deviation; NOE, nuclear Overhauser enhancement; 4D-SA, simulated annealing in four-dimensional space; Myb-R2R3, repeats 2 and 3 of the DNA-binding domain of the c-Myb protein; DNA16, Myb-specific binding DNA duplex with 16 base pairs; IHDD-C, residues 3 to 59 of the C-chain of the engrailed homeodomain-DNA complex; DNA11, DNA duplex with base pairs 9 to 19 of the engrailed homeodomain-DNA complex.

9.
Capsule: Discriminant functions based on morphometric variables provide a reliable method for sex identification of free‐living and hacked young Ospreys.

Aims: To describe an easy, accurate and low‐cost method for sex determination of fully grown nestling and fledgling Ospreys Pandion haliaetus based on morphometric measurements.

Methods: Four different measurements were taken in 114 birds (40–73 days old) and a DNA analysis, using PCR amplification, was carried out for sex identification. A forward stepwise discriminant analysis was performed to build the best explanatory discriminant models, which were subsequently validated using statistics and external samples.

Results: Our best discriminant function retained forearm and tarsus as the best predictor variables and classified 95.1% of the sample correctly, supported also by external cross‐validations with both hacked and free‐living birds. Moreover, a discriminant function with only forearm as predictor showed a similar high correct classification power (93.4%).

Conclusions: These discriminant functions can be used as a reliable and immediate method for sex determination of young Ospreys since they showed high discriminant accuracy, close to that of molecular procedures, and were supported by external cross‐validations, both for free‐living and hacked birds. Thus, these morphometric measurements should be considered as standard tools for future scientific studies and management of Osprey populations.

10.
A comparison of the performance of five modelling methods using presence/absence (generalized additive models, discriminant analysis) or presence-only (genetic algorithm for rule-set prediction, ecological niche factor analysis, Gower distance) data for modelling the distribution of the tick species Boophilus decoloratus (Koch, 1844) (Acarina: Ixodidae) at a continental scale (Africa) using climate data was conducted. This work explicitly addressed the usefulness of clustering using the normalized difference vegetation index (NDVI) to split original records and build partial models for each region (cluster) as a method of improving model performance. Models without clustering have a consistently lower performance (as measured by sensitivity and area under the curve [AUC]), although presence/absence models perform better than presence-only models. Two cluster-related variables, namely, prevalence (commonness of tick records in the cluster) and marginality (the relative position of the climate niche occupied by the tick in relation to that available in the cluster) greatly affect the performance of each model (P < 0.05). Both sensitivity and AUC are better for NDVI-derived clusters where the tick is more prevalent or its marginality is low. However, the total size of the cluster or its fragmentation (measured by Shannon's evenness index) did not affect the performance of models. Models derived separately for each cluster produced the best output but resulted in a patchy distribution of predicted occurrence. The use of such a method together with weighting procedures based on prevalence and marginality as derived from populations at each cluster produced a slightly lower predictive performance but a better estimation of the continental distribution of the tick. Therefore, cluster-derived models are able to effectively capture restricting conditions for different tick populations at a regional level. 
It is concluded that data partitioning is a powerful method with which to describe the climate niche of populations of a tick species, as adapted to local conditions. The use of this methodology greatly improves the performance of climate suitability models.

11.
BACKGROUND: We present a novel strategy for classification of DNA molecules using measurements from an alpha-hemolysin channel detector. The proposed approach provides excellent classification performance for five different DNA hairpins that differ in only one base-pair. For multi-class DNA classification problems, practitioners usually adopt approaches that use decision trees consisting of binary classifiers. Finding the best tree topology requires exploring all possible tree topologies and is computationally prohibitive. We propose a computational framework based on feature primitives that eliminates the need for a decision tree of binary classifiers. In the first phase, we generate a pool of weak features from nanopore blockade current measurements by using HMM analysis, principal component analysis and various wavelet filters. In the next phase, feature selection is performed using AdaBoost. AdaBoost provides an ensemble of weak learners of various types learned from feature primitives. RESULTS AND CONCLUSION: We show that our technique, despite its inherent simplicity, provides a performance comparable to recent multi-class DNA molecule classification results. Unlike the approach presented by Winters-Hilt et al., where weaker data is dropped to obtain better classification, the proposed approach provides comparable classification accuracy without any need for rejection of weak data. A weakness of this approach, on the other hand, is the very "hands-on" tuning and feature selection that is required to obtain good generalization. Simply put, this method obtains a more informed set of features and provides better results for that reason. The strength of this approach appears to be in its ability to identify strong features, an area where further results are actively being sought.

12.
Techniques that describe the use of covariance when heterogeneity of slopes exists are severely limited. Although a few procedures for model selection have been recommended, none, except the hierarchical approach, is straightforward and usable with present computer programs. The hierarchical subset selection procedure presented in this paper is based on the proposition that heterogeneity may be present only for certain terms in the model. After hierarchical selection, those terms which do not involve heterogeneity are interpreted as in the usual analysis for covariance. The interpretations of those terms which do involve heterogeneity are modified with respect to significance tests performed at various values of the covariate. The hierarchical subset selection method allows one to investigate heterogeneity of slopes in covariance models as functions of the classification variables present in the design.

13.
The effect of environmental conditions on river macrobenthic communities was studied using a dataset consisting of 343 sediment samples from unnavigable watercourses in Flanders, Belgium. Artificial neural network models were used to analyse the relationship between river characteristics and macrobenthic communities. The dataset included presence or absence of macroinvertebrate taxa and 12 physicochemical and hydromorphological variables for each sampling site. The abiotic variables served as input for the artificial neural networks to predict the macrobenthic community. The effects of the input variables on model performance were assessed in order to identify the most diagnostic river characteristics for macrobenthic community composition. This was done by consecutively eliminating the least important variables and, when beneficial for model performance, adding previously removed ones again. This stepwise input variable selection procedure was tested not only on a model predicting the entire macrobenthic community, but also on three models, each predicting an individual taxon. Additionally, during each step of the stepwise leave-one-out procedure, a sensitivity analysis was performed to determine the response of the predicted macroinvertebrate taxa to the input variables applied. This research illustrated that a combination of input variable selection with sensitivity analyses can contribute to the development of reliable and ecologically relevant ANN models. The river characteristics predicting presence or absence of the benthic macroinvertebrates best were the Julian day, conductivity, and dissolved oxygen content. These conditions reflect the importance of discharges of untreated wastewater that occurred during the period of investigation in nearly all Flemish rivers.

14.
15.
The Fourier transform (FT) method was applied to specify the distribution of 14 predefined groups of amino acids (64 residues) at both termini of annotated type III and type I secreted proteins from proteobacteria. Type I proteins displayed a higher occurrence of significant periodicities at both C- and N-termini, indicating potent features to discriminate between secretion types, particularly by the use of variables selected from the full periodicity profiles at 19 orders of FT. Fisher's linear discriminant analysis, together with the stepwise selection of variables throughout equal pairs of combinations for all predefined groups of residues, revealed the C-terminal harmonics of aromatic (HFWY) and aliphatic (VLIA) residues as a set of strong predictor variables to classify both types of secreted proteins with an accuracy of 100% for original grouped cases and 96.4% for cross-validated grouped cases. The prediction accuracy of the proposed discriminant function was estimated by repeated k-fold cross-validation procedures where the original data set was randomly divided into k subsets, with one of the k subsets serving as the test set and the remaining data forming the training set. The average error rate computed across all k trials and repeats did not exceed that of the leave-one-out procedure. The proposed set of predictor variables could be used to assess the compatibility between secretion pathways and secretion substrates of proteobacteria by means of discriminant analysis.
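
The repeated k-fold cross-validation described in item 15 can be sketched as follows. This is an illustrative sketch only: the function names, the `fit`/`predict` callables and the default values of `k` and `repeats` are assumptions, since the abstract does not state them, and the classifier itself (a discriminant function in the original work) is left to the caller.

```python
import random


def repeated_kfold(n_samples, k, repeats, seed=0):
    """Yield (train, test) index lists for repeated k-fold cross-validation.

    In each repeat the indices are reshuffled and cut into k subsets;
    every subset serves once as the test set while the rest form the
    training set. Assumes n_samples >= k so that no fold is empty.
    """
    rng = random.Random(seed)
    for _ in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        folds = [indices[i::k] for i in range(k)]
        for i, test in enumerate(folds):
            train = [idx for j, fold in enumerate(folds) if j != i
                     for idx in fold]
            yield train, test


def mean_error_rate(features, labels, fit, predict, k=5, repeats=10, seed=0):
    """Average misclassification rate over all k trials and repeats.

    `fit(train_features, train_labels)` returns a model and
    `predict(model, feature_row)` returns a label; both are supplied
    by the caller (hypothetical interface).
    """
    rates = []
    for train, test in repeated_kfold(len(labels), k, repeats, seed):
        model = fit([features[i] for i in train], [labels[i] for i in train])
        wrong = sum(1 for i in test if predict(model, features[i]) != labels[i])
        rates.append(wrong / len(test))
    return sum(rates) / len(rates)
```

Setting `k` equal to the number of samples with a single repeat reduces this scheme to the leave-one-out procedure that the abstract uses as its point of comparison.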

16.
Distribution models should take into account the different limiting factors that simultaneously influence species ranges. Species distribution models built with different explanatory variables can be combined into more comprehensive ones, but the resulting models should maximize complementarity and avoid redundancy. Our aim was to compare the different methods available for combining species distribution models. We modelled 19 threatened vertebrate species in mainland Spain, producing models according to three individual explanatory factors: spatial constraints, topography and climate, and human influence. We used five approaches for model combination: Bayesian inference, Akaike weight averaging, stepwise variable selection, updating, and fuzzy logic. We compared the performance of these approaches by assessing different aspects of their classification and discrimination capacity. We demonstrated that different approaches to model combination give rise to disparities in the model outputs. Bayesian integration was systematically affected by an error in the equations that are habitually used in distribution modelling. Akaike weights produced models that were driven by the best single factor and therefore failed at combining the models effectively. The updating and the stepwise approaches shared recalibration as the basic concept for model combination, were very similar in their performance, and showed the highest sensitivity and discrimination capacity. The fuzzy‐logic approach yielded models with the highest classification capacity according to Cohen's kappa. In conclusion: 1) Bayesian integration, employing the currently used equation, and the Akaike weight procedure should be avoided; 2) the updating and stepwise approaches can be considered minor variants of the same recalibrating approach; and 3) there is a trade‐off between this recalibrating approach, which has the highest sensitivity, and fuzzy logic, which has the highest overall classification capacity. 
Recalibration is better if unfavourable conditions in one environmental factor may be counterbalanced with favourable conditions in a different factor; otherwise fuzzy logic is better.
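
The finding in item 16 that Akaike weight averaging is driven by the best single factor follows from how the weights are usually defined. The sketch below uses the standard formula, weight_i = exp(-Δ_i / 2) divided by the sum of the same term over all models, where Δ_i is a model's AIC minus the lowest AIC. That the authors used exactly this formulation is an assumption, and the function names are illustrative.

```python
import math


def akaike_weights(aic_values):
    """Convert a list of AIC values into Akaike weights that sum to 1."""
    best = min(aic_values)
    # Relative likelihood of each model given its AIC difference from the best.
    relative = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(relative)
    return [r / total for r in relative]


def weighted_average(predictions, weights):
    """Combine per-model predictions for one site using the given weights."""
    return sum(p * w for p, w in zip(predictions, weights))
```

Because the weights fall off exponentially with the AIC difference, a gap of about 10 AIC units already leaves the weaker model with under 1% of the weight, so the averaged prediction is nearly identical to that of the best single model.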

17.
The success of invasive plant species is driven, in part, by feedback with soil ecosystems. Yet, how variation in belowground communities across latitudinal gradients affects invader distributions remains poorly understood. To determine the effect of soil communities on the performance of the noxious weed Cirsium arvense across its invaded range, we grew seedlings for 40 days in soils collected across a 699 km linear distance from both inside and outside established populations. We also described the mesofaunal and bacterial communities across all soil samples. We found that C. arvense typically performed better when grown in soils sourced from northern populations than from southern locations where it has a longer invasion history. We also found evidence that C. arvense performed best in soils sourced from outside invaded patches, although this was not consistent across all sites. The bacterial community showed a significant increase in the magnitude of compositional change in invaded sites at higher latitudes, while the mesofaunal community showed the opposite pattern. Bacterial community composition was significantly correlated with C. arvense performance, although mesofaunal community composition was not. Our results demonstrate that the interactions between an invasive plant and associated soil communities change across the invaded range, and the bacterial community in particular may affect variation in plant performance. Observed patterns may be caused by C. arvense presence and time since invasion allowing for an accumulation of species‐specific pathogens in southern soils, while the naïveté of northern soils to invasion results in a more responsive bacterial community. Although these interactions are difficult to predict, such effects could possibly facilitate the establishment of this exotic species in novel locations.

18.
The roles of ultimate and proximate factors in regulating basal and summit metabolic rates of passerine birds during winter have received little study, and the extent to which winter temperatures affect these variables is unknown. To address this question, we measured basal and summit (maximum cold-induced) metabolic rates in black-capped chickadees (Poecile atricapillus), dark-eyed juncos (Junco hyemalis), and American tree sparrows (Spizella arborea) during winters from 1991/1992 to 1997 in southeastern South Dakota. Both temperature and these metabolic rates varied within and among winters. Least-squares regression revealed significant negative relationships for normalized basal and summit metabolism against mean winter temperature for all species pooled (R² = 0.62 to 0.69, P

19.
The influence of environmental variables on the selection of a water body as breeding habitat by Salamandra salamandra was studied in an arid zone located in the southwestern part of its distribution range. From November 2002 to October 2003, 50 water bodies were monitored in the southeast of the Iberian Peninsula. Environmental data were submitted to a stepwise logistic regression analysis at macrohabitat, water body typology and microhabitat scales in order to establish the main factors influencing the use of a given water body as breeding habitat by this species. A significant degree of dependence between the reproduction of Salamandra salamandra and environmental variables was observed at all of these levels. These results should be taken into account when populations of this species are subjected to management and/or recovery programmes in arid areas.

20.
Aim: To establish possible interpopulation relationships among Colombian Artemia franciscana (Crustacea, Anostraca) populations. Location: Colombian Caribbean coast (Manaure, Galerazamba, Salina Cero and Tayrona) and a similar thalassohaline reference population from San Francisco Bay (SFB‐USA). Methods: Morphometric characters of male and female cultured individuals of A. franciscana were measured. The populations were grouped according to: (1) population type (populations grouped according to two broad regions of origin: North America and the Caribbean coast), and (2) specific geographical origin (populations selected according to five specific local origins: Manaure, Galerazamba, Salina Cero, Tayrona and SFB) and evaluated using forward stepwise discriminant analysis (SPSS, Ver. 10). Results: Optimal discriminant variables for males grouped by the type of population were left setae and antenna length, and for females they were abdominal length and antenna length. However, for males grouped by their specific geographical origin, the optimal variables were furca length, left setae, antenna length, eye separation, abdominal width and abdominal length, and for the females, they were furca length, abdominal length, left setae and eye separation. Male and female Colombian Caribbean populations were separated from the North American populations. However, our results show that the classification based on male characters provides better group membership than that based on female characters. Main conclusions: Male morphometric characters separated the type of population groups more clearly than the female characters, because all Colombian populations were correctly positioned in the Caribbean coast region and the SFB population in the North American region, with no overlapping between the two types, as was the case for the female individuals. Likewise, male individuals correctly placed the Salina Cero population with its neighbouring Galerazamba population and with the other Colombian populations.
In contrast, female individuals from Salina Cero did not cluster with the other Colombian coast populations (Galerazamba, Tayrona and Manaure) or with the SFB population.
