Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Abstract. This article investigates whether the Braun‐Blanquet abundance/dominance (AD) scores that commonly appear in phytosociological tables can properly be analysed by conventional multivariate analysis methods such as Principal Components Analysis and Correspondence Analysis. The answer is a definite NO. The source of problems is that the AD values express species performance on a scale, namely the ordinal scale, on which differences are not interpretable. There are several arguments suggesting that no matter which methods have been preferred in contemporary numerical syntaxonomy and why, ordinal data should be treated in an ordinal way. In addition to the inadmissibility of arithmetic operations with the AD scores, these arguments include interpretability of dissimilarities derived from ordinal data, consistency of all steps throughout the analysis and universality of the method which enables simultaneous treatment of various measurement scales. All the ordination methods that are commonly used, for example, Principal Components Analysis and all variants of Correspondence Analysis as well as standard cluster analyses such as Ward's method and group average clustering, are inappropriate when using AD data. Therefore, the application of ordinal clustering and scaling methods to traditional phytosociological data is advocated. Dissimilarities between relevés should be calculated using ordinal measures of resemblance, and ordination and clustering algorithms should also be ordinal in nature. A good example for ordination is Non‐metric Multidimensional Scaling (NMDS), as long as it is calculated from an ordinal dissimilarity measure such as the Goodman & Kruskal γ coefficient; for clustering, the new OrdClAn‐H and OrdClAn‐N methods are suggested.
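
The Goodman & Kruskal γ coefficient named in this abstract uses only the order of the scores, never their differences. The sketch below is a minimal illustration, not the authors' implementation: the integer coding of the AD scores, the function names, and the conversion of γ to a dissimilarity via (1 − γ)/2 are all assumptions made for the example.

```python
from itertools import combinations


def sign(a, b):
    """Return 1, 0 or -1 depending on the order of a and b (no subtraction of scores)."""
    return (a > b) - (a < b)


def goodman_kruskal_gamma(x, y):
    """Gamma = (C - D) / (C + D), where C and D count concordant and
    discordant pairs of species; pairs tied in either relevé are ignored."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        s = sign(x1, x2) * sign(y1, y2)
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    if concordant + discordant == 0:
        raise ValueError("gamma is undefined when every pair is tied")
    return (concordant - discordant) / (concordant + discordant)


def gamma_dissimilarity(x, y):
    """Hypothetical conversion of gamma (range -1..1) to a 0..1 dissimilarity."""
    return (1 - goodman_kruskal_gamma(x, y)) / 2


# Two hypothetical relevés: AD scores of five species coded as ordered integers.
releve_a = [0, 1, 2, 3, 5]
releve_b = [0, 2, 1, 4, 5]
print(goodman_kruskal_gamma(releve_a, releve_b))
```

A matrix of such dissimilarities between all pairs of relevés could then be passed to an NMDS routine, which itself relies only on the rank order of the dissimilarities.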

2.
Using visual estimation of species cover in ordinal interval classes may reduce costs in vegetation studies. In phytosociology, species cover within plots is usually estimated according to the well-known Braun-Blanquet scale and ordinal data from this scale are usually treated using common exploratory analysis tools that are adequate for ratio-scale variables only. This paper addresses whether the visual estimation of ordinal cover data and the treatment of these data with multivariate procedures tailored for ratio-scale data would lead to a significant loss of information with respect to the use of more accurate methods of data collection and analysis. To answer these questions we used three data sets sampled by different authors in different sites of Tuscany (central Italy) in which the species cover is measured with the point quadrat method. For each data set we used a Mantel test to compare the dissimilarity matrices obtained from the original point-quadrat cover data with those obtained from the corresponding ordinal interval classes. The results suggest that the ordinal data are suitable to represent the plot-to-plot dissimilarity structure of all data sets in a reasonable way and that in using such data there is no need to apply dissimilarity coefficients specifically tailored for ordinal scales.
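
The comparison described in this abstract can be sketched as a permutation Mantel test: correlate the off-diagonal entries of two dissimilarity matrices, then permute the plot labels of one matrix to build a null distribution. The code below is an illustrative sketch only. The cover values, the simplified five-class cover scheme, and the use of Manhattan distance are assumptions and are not taken from the paper.

```python
import random
from math import sqrt


def manhattan_matrix(rows):
    """Plot-by-plot Manhattan distances between rows of a plots-by-species table."""
    return [[sum(abs(a - b) for a, b in zip(r, s)) for s in rows] for r in rows]


def upper_triangle(m, order):
    """Entries above the diagonal after relabelling the plots according to `order`."""
    n = len(m)
    return [m[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


def mantel_test(d1, d2, permutations=999, seed=0):
    """Return (Mantel r, one-sided permutation p-value)."""
    rng = random.Random(seed)
    order = list(range(len(d1)))
    v2 = upper_triangle(d2, order)
    r_obs = pearson(upper_triangle(d1, order), v2)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        rng.shuffle(order)
        if pearson(upper_triangle(d1, order), v2) >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)


def to_class(percent):
    """Simplified cover classes (assumed): 0, <5, 5-25, 25-50, 50-75, >=75 percent."""
    if percent == 0:
        return 0
    return 1 + sum(percent >= limit for limit in (5, 25, 50, 75))


# Hypothetical percent cover of three species in five plots.
cover = [[5, 60, 0], [10, 40, 5], [30, 20, 10], [60, 5, 30], [80, 0, 55]]
classes = [[to_class(p) for p in plot] for plot in cover]

r, p_value = mantel_test(manhattan_matrix(cover), manhattan_matrix(classes))
print(round(r, 3), p_value)
```

A high Mantel correlation here would mean the class-based matrix preserves most of the plot-to-plot structure of the measured cover data, which is the kind of agreement the abstract reports.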

3.
Summary. As most georeferenced data sets are multivariate and concern variables of different types, spatial mapping methods must be able to deal with such data. The main difficulties are the prediction of non‐Gaussian variables and the modeling of the dependence between processes. The aim of this article is to present a new hierarchical Bayesian approach that permits simultaneous modeling of dependent Gaussian, count, and ordinal spatial fields. This approach is based on spatial generalized linear mixed models. We use a moving average approach to model the spatial dependence between the processes. The method is first validated through a simulation study. We show that the multivariate model has better predictive abilities than the univariate one. Then the multivariate spatial hierarchical model is applied to a real data set collected in French Guiana to predict topsoil patterns.

4.

Aim

Species distribution models are important tools used to study the distribution and abundance of organisms relative to abiotic variables. Dynamic local interactions among species in a community can affect abundance. The abundance of a single species may not be at equilibrium with the environment for spreading invasive species and species that are range shifting because of climate change.

Innovation

We develop methods for incorporating temporal processes into a spatial joint species distribution model for presence/absence and ordinal abundance data. We model non‐equilibrium conditions via a temporal random effect and temporal dynamics with a vector‐autoregressive process allowing for intra‐ and interspecific dependence between co‐occurring species. The autoregressive term captures how the abundance of each species can enhance or inhibit its own subsequent abundance or the subsequent abundance of other species in the community and is well suited for a ‘community modules’ approach of strongly interacting species within a food web. R code is provided for fitting multispecies models within a Bayesian framework for ordinal data with any number of locations, time points, covariates and ordinal categories.

Main conclusions

We model ordinal abundance data of two invasive insects (hemlock woolly adelgid and elongate hemlock scale) that share a host tree and were undergoing northwards range expansion in the eastern U.S.A. during the period 1997–2011. Accounting for range expansion and high inter‐annual variability in abundance led to improved estimation of the species–environment relationships. We would have erroneously concluded that winter temperatures did not affect scale abundance had we not accounted for the range expansion of scale. The autoregressive component revealed weak evidence for commensalism, in which adelgid may have predisposed hemlock stands for subsequent infestation by scale. Residual spatial dependence indicated that an unmeasured variable additionally affected scale abundance. Our robust modelling approach could provide similar insights for other community modules of co‐occurring species.

5.
Methods to examine whether genetic and/or environmental sources can account for the residual variation in ordinal family data usually assume proportional odds. However, standard software to fit the non‐proportional odds model to ordinal family data is limited because the correlation structure of family data is more complex than for other types of clustered data. To perform these analyses we propose the non‐proportional odds multivariate logistic regression model and take a simulation‐based approach to model fitting using Markov chain Monte Carlo methods, such as partially collapsed Gibbs sampling and the Metropolis algorithm. We applied the proposed methodology to male pattern baldness data from the Victorian Family Heart Study.

6.
A disease severity index (DSI) is a single number for summarising a large amount of information on disease severity. The DSI has most often been used with data based on a special type of ordinal scale comprising a series of consecutive ranges of defined numeric intervals, generally based on the percent area of symptoms presenting on the specimen(s). Plant pathologists and other professionals use such ordinal scale data in conjunction with a DSI (%) for treatment comparisons. The objective of this work is to explore the effects of both different scales (i.e. those having equal or unequal classes, or different widths of intervals) and the selection of values for scale intervals (i.e. the ordinal grade for the category or the midpoint value of the interval) on the null hypothesis test for the treatment comparison. A two‐stage simulation approach was employed to approximate the real mechanisms governing the disease‐severity sampling design. Subsequently, a meta‐analysis was performed to compare the effects of two treatments, which demonstrated that using quantitative ordinal rating grades or the midpoint conversion for the ranges of disease severity yielded very comparable results with respect to the power of hypothesis testing. However, the principal factor determining the power of the hypothesis test is the nature of the intervals, not the selection of values for ordinal scale intervals (i.e. not the midpoint or ordinal grade). Although using the percent scale is always preferable, the results of this study provide a framework for developing improved research methods where the use of ordinal scales in conjunction with a DSI is either preferred or a necessity for comparing disease severities.
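
The two choices of interval value compared in this abstract can be illustrated with a short sketch. The abstract does not state a formula, so the code assumes the commonly used definition DSI (%) = 100 × Σ(grade × count) / (total count × maximum grade) and, for the midpoint conversion, the mean of the interval midpoints. The four-class scale, its midpoints, and the counts are hypothetical.

```python
def dsi_from_grades(counts):
    """DSI (%) using ordinal grades 0..K as the values of the classes.

    counts[g] is the number of specimens rated with grade g.
    """
    total = sum(counts)
    max_grade = len(counts) - 1
    weighted = sum(grade * n for grade, n in enumerate(counts))
    return 100 * weighted / (total * max_grade)


def mean_severity_from_midpoints(counts, midpoints):
    """Mean percent severity when each class is replaced by its interval midpoint."""
    total = sum(counts)
    return sum(mid * n for mid, n in zip(midpoints, counts)) / total


# Hypothetical scale with unequal classes: 0 %, 0-25 %, 25-50 %, 50-100 %.
counts = [4, 3, 2, 1]               # specimens per class
midpoints = [0.0, 12.5, 37.5, 75.0]  # assumed midpoints of the classes

print(round(dsi_from_grades(counts), 2))
print(mean_severity_from_midpoints(counts, midpoints))
```

The two summaries are on different numeric scales, so they are not directly comparable as numbers; the abstract's comparison concerns the power of the treatment test built on each of them.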

7.
This paper is concerned with the analysis of item response data, which are usually measured on a rating scale and are therefore ordinal. The study items tend to be highly inter‐correlated. Rasch models, which convert ordinal categorical scales into linear measurements, are widely used in ordinal data analysis. In this paper, we improve the current methodology in order to incorporate inter‐item correlations. We advocate the latent variable approach for this purpose, in combination with generalized estimating equations to estimate the Rasch model parameters. Data from a study of families of lung cancer patients demonstrate the utility of our methods.

8.
Cover-abundance estimates are commonly employed in phytosociological investigations to record the performance of species. Because the coded values are on an ordinal scale of measure, various authors have suggested that some transformation is necessary before such values can be used for classification and ordination. However, it is not clear that transformation is a sufficient treatment, and it would seem preferable to use ordinal data directly. In this paper we examine such direct use of partial rankings and show that several dissimilarity measures can be defined for this case without invoking any transformations. They include dissimilarity measures associated with various rank correlation measures and with distances between strings; all the measures are variant forms of Hausdorff's interset distance. Certain other kinds of data, such as those employing dominant and subdominant species and the dry-weight-rank estimation of biomass, are also on an ordinal scale and could be analysed using similar techniques. To illustrate the approach, a string dissimilarity measure is used to analyse a set of data from Slovakian grasslands which appear to reflect a simple gradient. The original data were recorded with 10 classes of performance and are analysed using hierarchical and nondeterministic, overlapping classifications.

9.
Incorrect statistical methods are often used for the analysis of ordinal response data. Such data are frequently summarized into mean scores for comparisons, a fallacious practice because ordinal data are inherently not equidistant. The ubiquitous Pearson chi-square test is invalid because it ignores the ranking of ordinal data. Although some of the non-parametric statistical methods take into account the ordering of ordinal data, these methods do not accommodate statistical adjustment of confounding or assessment of effect modification, two overriding analytic goals in virtually all etiologic inference in biology and medicine. The cumulative logit model is eminently suitable for the analysis of ordinal response data. This multivariate method not only considers the ranked order inherent in ordinal response data, but it also allows adjustment of confounding and assessment of effect modification based on modest sample size. A non-technical account of the cumulative logit model is given and its applications are illustrated by two research examples. The SAS programs for the data analysis of the research examples are available from the author.
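
To make the cumulative logit model concrete, the sketch below turns a set of cutpoints and a linear predictor into category probabilities. It assumes the parameterisation logit P(Y ≤ j) = α_j − η with η = β′x; sign conventions differ between software packages, and the numbers are hypothetical. It is not a substitute for the SAS programs mentioned in the abstract.

```python
from math import exp


def cumulative_logit_probs(cutpoints, eta):
    """Category probabilities under logit P(Y <= j) = cutpoints[j] - eta.

    cutpoints: strictly increasing alpha_1 < ... < alpha_{K-1}
    eta: linear predictor beta'x for one subject
    Returns a list of K probabilities, lowest category first.
    """
    if any(b <= a for a, b in zip(cutpoints, cutpoints[1:])):
        raise ValueError("cutpoints must be strictly increasing")
    # Cumulative probabilities P(Y <= j); the last category closes at 1.
    cumulative = [1 / (1 + exp(-(c - eta))) for c in cutpoints] + [1.0]
    probs = [cumulative[0]]
    probs += [hi - lo for lo, hi in zip(cumulative, cumulative[1:])]
    return probs


# Hypothetical three-category response (e.g. mild / moderate / severe).
alphas = [-0.5, 1.0]
unexposed = cumulative_logit_probs(alphas, eta=0.0)
exposed = cumulative_logit_probs(alphas, eta=0.8)  # beta = 0.8 for exposure
print([round(p, 3) for p in unexposed])
print([round(p, 3) for p in exposed])
```

With this sign convention a positive coefficient shifts probability towards the higher categories, and the same coefficient applies at every cutpoint, which is the proportional odds assumption.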

10.
The association between a binary variable Y and a variable X having an at least ordinal measurement scale might be examined by selecting a cutpoint in the range of X and then performing an association test for the obtained 2 x 2 contingency table using the chi-square statistic. The distribution of the maximally selected chi-square statistic (i.e. the maximal chi-square statistic over all possible cutpoints) under the null-hypothesis of no association between X and Y is different from the known chi-square distribution. In the last decades, this topic has been extensively studied for continuous X variables, but not for non-continuous variables of at least ordinal measurement scale (which include e.g. classical ordinal or discretized continuous variables). In this paper, we suggest an exact method to determine the finite-sample distribution of maximally selected chi-square statistics in this context. This novel approach can be seen as a method to measure the association between a binary variable and variables having an at least ordinal scale of different types (ordinal, discretized continuous, etc). As an illustration, this method is applied to a new data set describing pregnancy and birth for 811 babies.
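
The statistic discussed in this abstract can be computed with a few lines of code: try every cutpoint of X, form the 2 x 2 table against Y, and keep the largest chi-square value. The sketch below computes only the statistic; it does not implement the exact finite-sample distribution proposed in the paper, and the data are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]] (no continuity correction)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def max_selected_chi_square(x, y):
    """Return (largest chi-square, cutpoint) over all splits X <= cut vs X > cut.

    x: values on an at least ordinal scale; y: binary outcome coded 0/1.
    """
    best_stat, best_cut = 0.0, None
    for cut in sorted(set(x))[:-1]:  # the largest value cannot split the sample
        a = sum(1 for xi, yi in zip(x, y) if xi <= cut and yi == 0)
        b = sum(1 for xi, yi in zip(x, y) if xi <= cut and yi == 1)
        c = sum(1 for xi, yi in zip(x, y) if xi > cut and yi == 0)
        d = sum(1 for xi, yi in zip(x, y) if xi > cut and yi == 1)
        stat = chi_square_2x2(a, b, c, d)
        if stat > best_stat:
            best_stat, best_cut = stat, cut
    return best_stat, best_cut


# Hypothetical ordinal predictor with four levels and a binary outcome.
x = [1, 1, 2, 2, 3, 3, 4, 4]
y = [0, 0, 0, 0, 1, 1, 1, 1]
print(max_selected_chi_square(x, y))
```

Because the cutpoint is chosen to maximise the statistic, referring the result to a chi-square distribution with one degree of freedom overstates significance, which is the problem the abstract addresses.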

11.
Characterizing genetic structure across geographic space is a fundamental challenge in population genetics. Multivariate statistical analyses are powerful tools for summarizing genetic variability, but geographic information and accompanying metadata are not always easily integrated into these methods in a user‐friendly fashion. Here, we present a deployable Python‐based web‐tool, mvmapper, for visualizing and exploring results of multivariate analyses in geographic space. This tool can be used to map results of virtually any multivariate analysis of georeferenced data, and routines for exporting results from a number of standard methods have been integrated in the R package adegenet, including principal components analysis (PCA), spatial PCA, discriminant analysis of principal components, principal coordinates analysis, nonmetric dimensional scaling and correspondence analysis. mvmapper's greatest strength is facilitating dynamic and interactive exploration of the statistical and geographic frameworks side by side, a task that is difficult and time‐consuming with currently available tools. Source code and deployment instructions, as well as a link to a hosted instance of mvmapper, can be found at https://popphylotools.github.io/mvMapper/.

12.
Discrete state‐space models are used in ecology to describe the dynamics of wild animal populations, with parameters, such as the probability of survival, being of ecological interest. For a particular parametrization of a model it is not always clear which parameters can be estimated. This inability to estimate all parameters is known as parameter redundancy; such a model is also described as nonidentifiable. In this paper we develop methods that can be used to detect parameter redundancy in discrete state‐space models. An exhaustive summary is a combination of parameters that fully specify a model. To use general methods for detecting parameter redundancy a suitable exhaustive summary is required. This paper proposes two methods for the derivation of an exhaustive summary for discrete state‐space models using discrete analogues of methods for continuous state‐space models. We also demonstrate that combining multiple data sets, through the use of an integrated population model, may result in a model in which all parameters are estimable, even though models fitted to the separate data sets may be parameter redundant.

13.
Marginal methods have been widely used for the analysis of longitudinal ordinal and categorical data. These models do not require full parametric assumptions on the joint distribution of repeated response measurements but only specify the marginal or even association structures. However, inference results obtained from these methods often incur serious bias when variables are subject to error. In this paper, we tackle the problem that misclassification exists in both response and categorical covariate variables. We develop a marginal method for misclassification adjustment, which utilizes second‐order estimating functions and a functional modeling approach, and can yield consistent estimates and valid inference for mean and association parameters. We propose a two‐stage estimation approach for cases in which validation data are available. Our simulation studies show good performance of the proposed method under a variety of settings. Although the proposed method is formulated for data with a longitudinal design, it also applies to correlated data arising from clustered and family studies, in which association parameters may be of scientific interest. The proposed method is applied to analyze a dataset from the Framingham Heart Study as an illustration.

14.
Xu S, Xu C. Heredity, 2006, 97(6): 409-417
Many economically important characteristics of agricultural crops are measured as ordinal traits. Statistical analysis of the genetic basis of ordinal traits appears to be quite different from that of regular quantitative traits. The generalized linear model methodology implemented via the Newton-Raphson algorithm offers improved efficiency in the analysis of such data, but does not take full advantage of the extensive theory developed in the linear model arena. Instead, we develop a multivariate model for ordinal trait analysis and implement an EM algorithm for parameter estimation. We also propose a method for calculating the variance-covariance matrix of the estimated parameters. The EM equations turn out to be extremely similar to formulae seen in standard linear model analysis. Computer simulations are performed to validate the EM algorithm. A real data set is analyzed to demonstrate the application of the method. The advantages of the EM algorithm over other methods are addressed. Application of the method to QTL mapping for ordinal traits is demonstrated using a simulated backcross (BC) population.

15.
Abstract. In exploring the relationship between multivariate abundance data and environmental variables, a rarely used approach is to graph raw data separately for each different taxon. It is proposed that such raw data graphs become part of the standard toolset for graphing and analysing multivariate abundances. The key advantage of this approach is that axis scales have quantitative interpretations, enabling quantitative interpretation of patterns in abundance. In contrast, ordinations only present qualitative information. Ordinations are useful for inferring overall, qualitative patterns and raw data graphing is a complementary tool of greater use for answering more specific questions, aimed at a deeper understanding of the ecology of a community. It is demonstrated using some well‐known examples that our understanding of the nature of associations can be considerably improved by using raw data graphs, even when only plotting a subset of variables. One example describes how an often‐cited dataset has been misinterpreted in key methodological papers, because data were interpreted from ordinations alone, with no consideration of plots of the raw data.

16.
In this paper, we provide an overview of recently developed methods for the analysis of multivariate data that do not necessarily emanate from a normal universe. Multivariate data occur naturally in the life sciences and in other research fields. When drawing inference, it is generally recommended to take the multivariate nature of the data into account, and not merely analyze each variable separately. Furthermore, it is often of major interest to select an appropriate set of important variables. We present contributions in three different, but closely related, research areas: first, a general approach to the comparison of mean vectors, which allows for profile analysis and tests of dimensionality; second, non‐parametric and parametric methods for the comparison of independent samples of multivariate observations; and third, methods for the situation where the experimental units are observed repeatedly, for example, over time, and the main focus is on analyzing different time profiles when the number p of repeated observations per subject is larger than the number n of subjects.

17.
Ecological data sets often record the abundance of species, together with a set of explanatory variables. Multivariate statistical methods are optimal to analyze such data and are thus frequently used in ecology for exploration, visualization, and inference. Most approaches are based on pairwise distance matrices instead of the sites‐by‐species matrix, which stands in stark contrast to univariate statistics, where data models, assuming specific distributions, are the norm. However, through advances in statistical theory and computational power, models for multivariate data have gained traction. Systematic simulation‐based performance evaluations of these methods are important as guides for practitioners but still lacking. Here, we compare two model‐based methods, multivariate generalized linear models (MvGLMs) and constrained quadratic ordination (CQO), with two distance‐based methods, distance‐based redundancy analysis (dbRDA) and canonical correspondence analysis (CCA). We studied the performance of the methods to discriminate between causal variables and noise variables for 190 simulated data sets covering different sample sizes and data distributions. MvGLM and dbRDA differentiated accurately between causal and noise variables. The former had the lowest false‐positive rate (0.008), while the latter had the lowest false‐negative rate (0.027). CQO and CCA had the highest false‐negative rate (0.291) and false‐positive rate (0.256), respectively, where these error rates were typically high for data sets with linear responses. Our study shows that both model‐ and distance‐based methods have their place in the ecologist's statistical toolbox. MvGLM and dbRDA are reliable for analyzing species–environment relations, whereas both CQO and CCA exhibited considerable flaws, especially with linear environmental gradients.

18.
This comparison of methods for assessing the development of muscle insertion sites, or entheses, suggests that three‐dimensional (3D) quantification of enthesis morphology can produce a picture of habitual muscle use patterns in a past population that is similar to one produced by ordinal scores for describing enthesis morphology. Upper limb skeletal elements (humeri, radii, and ulnae) from a sample of 24 middle‐aged adult males from the Pottery Mound site in New Mexico were analyzed for both fibrous and fibrocartilaginous enthesis development with three different methods: ordinal scores, two‐dimensional (2D) area measurements, and 3D surface areas. The methods were compared using tests for asymmetry and correlations among variables in each quantitative data set. 2D representations of enthesis area did not agree as closely as ordinal scores and 3D surface areas did regarding which entheses were significantly asymmetrical. There was significant correlation between 3D and 2D data, but correlation coefficients were not consistently high. Intraobserver error was also assessed for the 3D method. Cronbach's alpha values fell between 0.68 and 0.73, and error rates for all entheses fell between 10% and 15%. Marginally acceptable intraobserver error and the analytic versatility of 3D images encourage further investigation of using 3D scanning technology for quantifying enthesis development. Am J Phys Anthropol 152:417–424, 2013. © 2013 Wiley Periodicals, Inc.

19.
For pathway analysis of genomic data, the most common methods involve combining p-values from individual statistical tests. However, there are several multivariate statistical methods that can be used to test whether a pathway has changed. Because of the large number of variables and pathway sizes in genomics data, some of these statistics cannot be computed. However, in metabolomics data, the number of variables and pathway sizes are typically much smaller, making such computations feasible. Of particular interest is being able to detect changes in pathways that may not be detected for the individual variables. We compare the performance of both the p-value methods and multivariate statistics for self-contained tests with an extensive simulation study and a human metabolomics study. Permutation tests, rather than asymptotic results, are used to assess the statistical significance of the pathways. Furthermore, both one- and two-sided alternative hypotheses are examined. From the human metabolomic study, many pathways were statistically significant, although the majority of the individual variables in the pathway were not. Overall, the p-value methods perform at least as well as the multivariate statistics for these scenarios.
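
One widely used way of combining p-values is Fisher's method, sketched below as an illustration; the abstract does not say which combination rules the authors compared, so this choice is an assumption. The statistic X = −2 Σ ln p_i is referred to a chi-square distribution with 2k degrees of freedom, whose survival function has a closed form for even degrees of freedom. The reference distribution assumes independent tests, which correlated metabolites in a pathway may violate; that is one reason to prefer permutation tests, as the abstract does.

```python
from math import exp, log


def fisher_combined_pvalue(pvalues):
    """Combine k p-values with Fisher's method, assuming independent tests.

    Uses the closed-form chi-square survival function for 2k degrees of freedom:
    P(X >= x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if not pvalues or any(not 0 < p <= 1 for p in pvalues):
        raise ValueError("p-values must lie in the interval (0, 1]")
    k = len(pvalues)
    half_stat = -sum(log(p) for p in pvalues)  # this is X / 2
    term = total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return exp(-half_stat) * total


# Hypothetical pathway: no single metabolite reaches p < 0.05.
pathway_pvalues = [0.08, 0.09, 0.07]
combined = fisher_combined_pvalue(pathway_pvalues)
print(round(combined, 4))
```

In this made-up example the combined p-value falls below 0.05 although each individual p-value does not, which mirrors the situation described in the abstract where pathways were significant while most of their individual variables were not.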

20.
Recent advances in high‐throughput methods of molecular analyses have led to an explosion of studies generating large‐scale ecological data sets. The effect has been particularly noticeable in the field of microbial ecology, where new experimental approaches have provided in‐depth assessments of the composition, functions and dynamic changes of complex microbial communities. Because even a single high‐throughput experiment produces a large amount of data, powerful statistical techniques of multivariate analysis are well suited to analyse and interpret these data sets. Many different multivariate techniques are available, and often it is not clear which method should be applied to a particular data set. In this review, we describe and compare the most widely used multivariate statistical techniques including exploratory, interpretive and discriminatory procedures. We consider several important limitations and assumptions of these methods, and we present examples of how these approaches have been utilized in recent studies to provide insight into the ecology of the microbial world. Finally, we offer suggestions for the selection of appropriate methods based on the research question and data set structure.
