Similar articles
Found 20 similar articles (search time: 31 ms)
1.
To understand the role of human microbiota in health and disease, we need to study the effects of environmental and other epidemiological variables on the composition of microbial communities. The composition of a microbial community may depend on multiple factors simultaneously. Therefore, we need multivariate methods for detecting, analyzing and visualizing the interactions between environmental variables and microbial communities. We provide two different approaches for multivariate analysis of these complex combined datasets: (i) we select variables that correlate with overall microbiota composition and microbiota members that correlate with the metadata using canonical correlation analysis, determine the independence of the observed correlations in a multivariate regression analysis, and visualize the effect size and direction of the observed correlations using heatmaps; (ii) we select variables and microbiota members using univariate or bivariate regression analysis, followed by multivariate regression analysis, and visualize the effect size and direction of the observed correlations using heatmaps. We illustrate the results of both approaches using a dataset containing respiratory microbiota composition and accompanying metadata. The two approaches provide slightly different results: approach (i), which uses canonical correlation analysis to select determinants and microbiota members, detects only fewer and stronger correlations, whereas approach (ii), which uses univariate or bivariate analyses to select determinants and microbiota members, detects a similar but broader pattern of correlations. Both proposed approaches detect and visualize independent correlations between multiple environmental variables and members of the microbial community. Depending on the size of the datasets and the hypothesis tested, one can select the preferred method.

2.
Missing outcomes or irregularly timed multivariate longitudinal data frequently occur in clinical trials or biomedical studies. The multivariate t linear mixed model (MtLMM) has been shown to be a robust approach to modeling multioutcome continuous repeated measures in the presence of outliers or heavy‐tailed noise. This paper presents a framework for fitting the MtLMM with an arbitrary missing data pattern embodied within multiple outcome variables recorded at irregular occasions. To address the serial correlation among the within‐subject errors, a damped exponential correlation structure is considered in the model. Under the missing at random mechanism, an efficient alternating expectation‐conditional maximization (AECM) algorithm is used to carry out estimation of parameters and imputation of missing values. The techniques for the estimation of random effects and the prediction of future responses are also investigated. Applications to an HIV‐AIDS study and a pregnancy study involving analysis of multivariate longitudinal data with missing outcomes, as well as a simulation study, have highlighted the superiority of MtLMMs in providing more adequate estimation, imputation and prediction performance.

3.
Thall PF, Simon RM, Shen Y. Biometrics, 2000, 56(1): 213-219
We propose an approximate Bayesian method for comparing an experimental treatment to a control based on a randomized clinical trial with multivariate patient outcomes. Overall treatment effect is characterized by a vector of parameters corresponding to effects on the individual patient outcomes. We partition the parameter space into four sets where, respectively, the experimental treatment is superior to the control, the control is superior to the experimental, the two treatments are equivalent, and the treatment effects are discordant. We compute posterior probabilities of the parameter sets by treating an estimator of the parameter vector like a random variable in the Bayesian paradigm. The approximation may be used in any setting where a consistent, asymptotically normal estimator of the parameter vector is available. The method is illustrated by application to a breast cancer data set consisting of multiple time-to-event outcomes with covariates and to count data arising from a cross-classification of response, infection, and treatment in an acute leukemia trial.
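
The partition idea above can be illustrated with a small Monte Carlo sketch. It is not the authors' procedure: it assumes two outcomes, a normal approximation centered at a hypothetical estimate with a hypothetical covariance matrix, and region definitions (a common margin `delta`) chosen here only for illustration.

```python
import math
import random

def region_probabilities(estimate, cov, delta, draws=20000, seed=0):
    """Approximate probabilities of four regions for a two-outcome effect vector.

    Assumed (hypothetical) region definitions:
      superior   - both effects exceed +delta
      inferior   - both effects fall below -delta
      equivalent - both effects lie within [-delta, +delta]
      discordant - everything else
    """
    rng = random.Random(seed)
    # Cholesky factor of the 2x2 covariance, used to draw correlated normals.
    l11 = math.sqrt(cov[0][0])
    l21 = cov[1][0] / l11
    l22 = math.sqrt(cov[1][1] - l21 * l21)
    counts = {"superior": 0, "inferior": 0, "equivalent": 0, "discordant": 0}
    for _ in range(draws):
        u, v = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        t1 = estimate[0] + l11 * u
        t2 = estimate[1] + l21 * u + l22 * v
        if t1 > delta and t2 > delta:
            counts["superior"] += 1
        elif t1 < -delta and t2 < -delta:
            counts["inferior"] += 1
        elif abs(t1) <= delta and abs(t2) <= delta:
            counts["equivalent"] += 1
        else:
            counts["discordant"] += 1
    return {name: c / draws for name, c in counts.items()}

# Hypothetical estimate and covariance, for illustration only.
probs = region_probabilities([0.4, 0.1], [[0.04, 0.01], [0.01, 0.04]], delta=0.2)
print(probs)
```

The four probabilities sum to one by construction, so they can be read as a complete summary of where the approximate posterior mass lies.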

4.
Dunson DB, Perreault SD. Biometrics, 2001, 57(1): 302-308
This article describes a general class of factor analytic models for the analysis of clustered multivariate data in the presence of informative missingness. We assume that there are distinct sets of cluster-level latent variables related to the primary outcomes and to the censoring process, and we account for dependency between these latent variables through a hierarchical model. A linear model is used to relate covariates and latent variables to the primary outcomes for each subunit. A generalized linear model accounts for covariate and latent variable effects on the probability of censoring for subunits within each cluster. The model accounts for correlation within clusters and within subunits through a flexible factor analytic framework that allows multiple latent variables and covariate effects on the latent variables. The structure of the model facilitates implementation of Markov chain Monte Carlo methods for posterior estimation. Data from a spermatotoxicity study are analyzed to illustrate the proposed approach.

5.
Small study effects occur when smaller studies show different, often larger, treatment effects than large ones, which may threaten the validity of systematic reviews and meta-analyses. The most well-known reasons for small study effects include publication bias, outcome reporting bias, and clinical heterogeneity. Methods to account for small study effects in univariate meta-analysis have been extensively studied. However, detecting small study effects in a multivariate meta-analysis setting remains an untouched research area. One of the complications is that different types of selection processes can be involved in the reporting of multivariate outcomes. For example, some studies may be completely unpublished while others may selectively report multiple outcomes. In this paper, we propose a score test as an overall test of small study effects in multivariate meta-analysis. Two detailed case studies are given to demonstrate the advantage of the proposed test over various naive applications of univariate tests in practice. Through simulation studies, the proposed test is found to retain nominal Type I error rates with considerable power in moderate sample size settings. Finally, we also evaluate the concordance between the proposed test and the naive application of univariate tests by evaluating 44 systematic reviews with multiple outcomes from the Cochrane Database.

6.
Ecological Monographs, 2011, 81(4): 635-663
Ecology is inherently multivariate, but high-dimensional data are difficult to understand. Dimension reduction with ordination analysis helps with both data exploration and clarification of the meaning of inferences (e.g., randomization tests, variation partitioning) about a statistical population. Most such inferences are asymmetric, in that variables are classified as either response or explanatory (e.g., factors, predictors). But this asymmetric approach has limitations (e.g., abiotic variables may not entirely explain correlations between interacting species). We study symmetric population-level inferences by modeling correlations and co-occurrences, using these models for out-of-sample prediction. Such modeling requires a novel treatment of ordination axes as random effects, because fixed effects only allow within-sample predictions. We advocate an iterative methodology for random-effects ordination: (1) fit a set of candidate models differing in complexity (e.g., number of axes); (2) use information criteria to choose among models; (3) compare model predictions with data; (4) explore dimension-reduced graphs (e.g., biplots); (5) repeat 1–4 if model performance is poor. We describe and illustrate random-effects ordination models (with software) for two types of data: multivariate-normal (e.g., log morphometric data) and presence–absence community data. A large simulation experiment with multivariate-normal data demonstrates good performance of (1) a small-sample-corrected information criterion and (2) factor analysis relative to principal component analysis. Predictive comparison of multiple alternative models is a powerful form of scientific reasoning: we have shown that unconstrained ordination can be based on such reasoning.
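
Steps (1) and (2) of the iterative methodology can be sketched as follows. The abstract does not name its small-sample-corrected criterion; this sketch assumes the familiar AICc form, and the candidate log-likelihoods and parameter counts are hypothetical placeholders rather than results from the paper.

```python
def aicc(log_lik, k, n):
    """Small-sample-corrected AIC for a model with k parameters fitted to n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    return -2.0 * log_lik + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)

# Hypothetical candidates: number of ordination axes -> (log-likelihood, parameter count).
candidates = {1: (-120.0, 6), 2: (-104.0, 11), 3: (-101.5, 15)}
n_obs = 40  # hypothetical sample size

scores = {axes: aicc(ll, k, n_obs) for axes, (ll, k) in candidates.items()}
best_axes = min(scores, key=scores.get)  # lowest criterion value is preferred
print(scores, best_axes)
```

With small samples the correction term grows quickly as parameters are added, which is why the three-axis model is penalized here despite its higher log-likelihood.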

7.
The meta-analytic approach to evaluating surrogate end points assesses the predictiveness of treatment effect on the surrogate toward treatment effect on the clinical end point based on multiple clinical trials. Definition and estimation of the correlation of treatment effects were developed in linear mixed models and later extended to binary or failure time outcomes on a case-by-case basis. In a general regression setting that covers nonnormal outcomes, we discuss in this paper several metrics that are useful in the meta-analytic evaluation of surrogacy. We propose a unified 3-step procedure to assess these metrics in settings with binary end points, time-to-event outcomes, or repeated measures. First, the joint distribution of estimated treatment effects is ascertained by an estimating equation approach; second, the restricted maximum likelihood method is used to estimate the means and the variance components of the random treatment effects; finally, confidence intervals are constructed by a parametric bootstrap procedure. The proposed method is evaluated by simulations and applications to 2 clinical trials.

8.
9.
Systems involving many variables are important in population and quantitative genetics, for example, in multi-trait prediction of breeding values and in exploration of multi-locus associations. We studied departures of the joint distribution of sets of genetic variables from independence. New measures of association based on notions of statistical distance between distributions are presented. These are more general than correlations, which are pairwise measures, and lack a clear interpretation beyond the bivariate normal distribution. Our measures are based on logarithmic (Kullback-Leibler) and on relative 'distances' between distributions. Indexes of association are developed and illustrated for quantitative genetics settings in which the joint distribution of the variables is either multivariate normal or multivariate-t, and we show how the indexes can be used to study linkage disequilibrium in a two-locus system with multiple alleles and present applications to systems of correlated beta distributions. Two multivariate beta and multivariate beta-binomial processes are examined, and new distributions are introduced: the GMS-Sarmanov multivariate beta and its beta-binomial counterpart.
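
As a concrete instance of a Kullback-Leibler departure from independence, consider the multivariate normal case: the divergence of a joint normal distribution from the product of its marginals depends only on the correlation matrix R and equals -0.5 ln det(R). The sketch below computes that standard quantity; it is offered as an illustration and is not claimed to be the exact index defined in the paper.

```python
import math

def log_det(matrix):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky factorization."""
    n = len(matrix)
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                chol[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                chol[i][j] = (matrix[i][j] - s) / chol[j][j]
    return 2.0 * sum(math.log(chol[i][i]) for i in range(n))

def kl_from_independence(corr):
    """KL divergence of a normal distribution with correlation matrix corr
    from the product of its normal marginals."""
    return -0.5 * log_det(corr)

# Bivariate check: the result should equal -0.5 * ln(1 - rho^2).
rho = 0.5
print(kl_from_independence([[1.0, rho], [rho, 1.0]]))
```

Unlike a single pairwise correlation, this quantity summarizes the joint dependence among all variables at once and is zero only when R is the identity.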

10.
Significant correlations between allelic frequencies and environmental variables in a number of insect species have been demonstrated by multivariate techniques. Since many environmental variables show a strong relationship to geographic location and since gene flow between populations can also produce patterns of gene frequencies which are related to the geographic location, both selection and gene-flow hypotheses are consistent with the observed correlations. The genetic variables can be corrected for geographic location and so for linear gene-flow patterns. If, after correction, the genetic variables still show significant correlations with similarly corrected environmental variables, then these correlations are consistent with hypotheses of selection but not of gene flow. The data of Johnson and Schaffer (1973) have been reanalyzed using the method of canonical correlation after correction for geographical location by means of multiple regression. Five of the nine loci studied exhibit significant canonical correlations. These results, under the assumption of linear gene flow, support hypotheses of selective action of environmental variables in the genotype-environment relationships observed.

11.
The scatter plot is a well known and easily applicable graphical tool to explore relationships between two quantitative variables. For the exploration of relations between multiple variables, generalisations of the scatter plot are useful. We present an overview of multivariate scatter plots focussing on the following situations. Firstly, we look at a scatter plot for portraying relations between quantitative variables within one data matrix. Secondly, we discuss a similar plot for the case of qualitative variables. Thirdly, we describe scatter plots for the relationships between two sets of variables where we focus on correlations. Finally, we treat plots of the relationships between multiple response and predictor variables, focussing on the matrix of regression coefficients. We will present both known and new results, where an important original contribution concerns a procedure for the inclusion of scales for the variables in multivariate scatter plots. We provide software for drawing such scales. We illustrate the construction and interpretation of the plots by means of examples on data collected in a genomic research program on taste in tomato.

12.
Biomedical studies often collect multivariate event time data from multiple clusters (either subjects or groups) within each of which event times for individuals are correlated and the correlation may vary in different classes. In such survival analyses, heterogeneity among clusters for shared and specific classes can be accommodated by incorporating parametric frailty terms into the model. In this article, we propose a Bayesian approach to relax the parametric distribution assumption for shared and specific‐class frailties by using a Dirichlet process prior while also allowing for the uncertainty of heterogeneity for different classes. Multiple cluster‐specific frailty selections rely on variable selection‐type mixture priors by applying mixtures of point masses at zero and inverse gamma distributions to the variance of log frailties. This selection allows frailties with zero variance to effectively drop out of the model. A reparameterization of log‐frailty terms is performed to reduce the potential bias of fixed effects due to variation of the random distribution and dependence among the parameters resulting in easy interpretation and faster Markov chain Monte Carlo convergence. Simulated data examples and an application to a lung cancer clinical trial are used for illustration.

13.
This article considers global tests of differences between paired vectors of binomial probabilities, based on data from two dependent multivariate binary samples. Difference is defined as either an inhomogeneity in the marginal distributions or asymmetry in the joint distribution. For detecting the first type of difference, we propose a multivariate extension of McNemar's test and show that it is a generalized score test under a generalized estimating equations (GEE) approach. Univariate features such as the relationship between the Wald and score tests and the dropout of pairs with the same response carry over to the multivariate case and the test does not depend on the working correlation assumption among the components of the multivariate response. For sparse or imbalanced data, such as occurs when the number of variables is large or the proportions are close to zero, the test is best implemented using a bootstrap, and if this is computationally too complex, a permutation distribution. We apply the test to safety data for a drug, in which two doses are evaluated by comparing multiple responses by the same subjects to each one of them.
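
For orientation, the classical univariate McNemar test that the article extends can be written in a few lines. This sketch covers only the single-response case (not the proposed multivariate score test), and the counts in the usage line are hypothetical. Note that pairs with the same response under both conditions never enter the statistic, which is the "dropout" property mentioned above.

```python
import math

def mcnemar(b, c):
    """Classical McNemar test from discordant pair counts.

    b: pairs with a response under condition 1 only
    c: pairs with a response under condition 2 only
    Concordant pairs are not needed and do not affect the result.
    """
    if b + c == 0:
        raise ValueError("no discordant pairs")
    stat = (b - c) ** 2 / (b + c)
    # Upper tail of a chi-square distribution with 1 df equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical counts of discordant pairs.
stat, p_value = mcnemar(b=15, c=5)
print(stat, p_value)
```

The chi-square reference distribution is an asymptotic approximation; for sparse data the abstract recommends resampling instead.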

14.
Many studies aim to assess whether a therapy has a beneficial effect on multiple outcomes simultaneously relative to a control. Often the joint null hypothesis of no difference for the set of outcomes is tested using separate tests with a correction for multiple tests, or using a multivariate T²-like MANOVA or global test. However, a more powerful test in this case is a multivariate one-sided or one-directional test directed at detecting a simultaneous beneficial treatment effect on each outcome, though not necessarily of the same magnitude. The Wei-Lachin test is a simple 1 df test obtained from a simple sum of the component statistics that was originally described in the context of a multivariate rank analysis. Under mild conditions this test provides a maximin efficient test of the null hypothesis of no difference between treatment groups for all outcomes versus the alternative hypothesis that the experimental treatment is better than control for some or all of the component outcomes, and not worse for any. Herein applications are described to a simultaneous test for multiple differences in means, proportions or life-times, and combinations thereof, all on potentially different scales. The evaluation of sample size and power for such analyses is also described. For a test of means of two outcomes with a common unit variance and correlation 0.5, the sample size needed to provide 90% power for two separate one-sided tests at the 0.025 level is 64% greater than that needed for the single Wei-Lachin multivariate one-directional test at the 0.05 level. Thus, a Wei-Lachin test with these operating characteristics is 39% more efficient than two separate tests. Likewise, compared to a T²-like omnibus test on 2 df, the Wei-Lachin test is 32% more efficient. An example is provided in which the Wei-Lachin test of multiple components has superior power to a test of a composite outcome.
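
The "simple sum of the component statistics" can be sketched as below. This is one common way to write such a 1 df statistic, assuming the components have already been standardized to unit variance, oriented so that positive values favor the experimental treatment, and that their correlation matrix is known; the input values are hypothetical.

```python
import math

def wei_lachin_z(z_stats, corr):
    """Combine standardized component statistics into a single 1-df statistic.

    z_stats: component z-statistics (positive = benefit), each with unit variance
    corr: correlation matrix of the components (list of lists), assumed known
    """
    k = len(z_stats)
    total = sum(z_stats)
    # The variance of a sum of unit-variance statistics is the sum of all
    # entries of their correlation matrix.
    var_sum = sum(corr[i][j] for i in range(k) for j in range(k))
    return total / math.sqrt(var_sum)

def one_sided_p(z):
    """Upper-tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Two hypothetical outcomes with correlation 0.5, as in the abstract's example setting.
z = wei_lachin_z([1.5, 1.5], [[1.0, 0.5], [0.5, 1.0]])
print(z, one_sided_p(z))
```

Because the components are summed before testing, modest effects in the same direction reinforce each other, whereas effects in opposite directions cancel; this is what makes the test one-directional.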

15.
In the context of analyzing multiple functional limitation responses collected longitudinally from the Longitudinal Study of Aging (LSOA), we investigate the heterogeneity of these outcomes with respect to their associations with previous functional status and other risk factors in the presence of informative drop-out and confounding by baseline outcomes. We accommodate the longitudinal nature of the multiple outcomes with a unique extension of the nested random effects logistic model with an autoregressive structure to include drop-out and baseline outcome components with shared random effects. Estimation of fixed effects and variance components is by maximum likelihood with numerical integration. This shared parameter selection model assumes that drop-out is conditionally independent of the multiple functional limitation outcomes given the underlying random effect representing an individual's trajectory of functional status across time. Whereas it is not possible to fully assess the adequacy of this assumption, we assess the robustness of this approach by varying the assumptions underlying the proposed model such as the random effects structure, the drop-out component, and omission of baseline functional outcomes as dependent variables in the model. Heterogeneity among the associations between each functional limitation outcome and a set of risk factors for functional limitation, such as previous functional limitation and physical activity, exists for the LSOA data of interest. Less heterogeneity is observed among the estimates of time-level random effects variance components that are allowed to vary across functional outcomes and time. We also note that, under an autoregressive structure, bias results from omitting the baseline outcome component linked to the follow-up outcome component by subject-level random effects.

16.
17.
Miglioretti DL. Biometrics, 2003, 59(3): 710-720
Health status is a complex outcome, often characterized by multiple measures. When assessing changes in health status over time, multiple measures are typically collected longitudinally. Analytic challenges posed by these multivariate longitudinal data are further complicated when the outcomes are combinations of continuous, categorical, and count data. To address these challenges, we propose a fully Bayesian latent transition regression approach for jointly analyzing a mixture of longitudinal outcomes from any distribution. Health status is assumed to be a categorical latent variable, and the multiple outcomes are treated as surrogate measures of the latent health state, observed with error. Using this approach, both baseline latent health state prevalences and the probabilities of transitioning between the health states over time are modeled as functions of covariates. The observed outcomes are related to the latent health states through regression models that include subject-specific effects to account for residual correlation among repeated measures over time, and covariate effects to account for differential measurement of the latent health states. We illustrate our approach with data from a longitudinal study of back pain.

18.
The majority of species interact with at least several others. We develop simple genetic models of coevolution between three species where interactions are mediated by quantitative traits. We assume that one of the species has two quantitative traits, each of which governs its interaction with one of the other two species. We use this model to explore how genetic correlations between the two traits in the multivariate species shape the evolutionary dynamics and outcomes of three species interactions. Our results suggest that genetic correlations are most important when at least one of the interactions is between a predator and prey or parasite and host. In these cases, genetic correlations between traits lead to a wide variety of novel coevolutionary outcomes and dynamics. In particular, genetic correlations can affect the existence and stability of coevolutionary equilibrium points, and they can lead to recurrent or permanent maladaptation. When the three species interact only as competitors or mutualists, however, genetic correlations have no effect on the outcome of coevolution. In all cases, our results reveal the surprising conclusion that both positive and negative genetic correlations between traits have qualitatively identical effects on coevolutionary dynamics.

19.
Continuing the work of Abt (1977, 1982) and Ackermann (1980), a new technique is proposed for constructing multivariate non-parametric tolerance regions based on the higher-dimensional correlations among the N variables under consideration. The scale independence of the method is shown, and a decision function is described for deciding whether a given point lies within such a tolerance region. The method is discussed using the data of a clinical example.

20.
The North American obligate cave fauna: regional patterns   Cited by: 7 (self-citations: 0, citations by others: 7)
The obligate cave faunas of nine regions of the United States – Florida Lime Sinks, Appalachians, Interior Low Plateaus, Ozarks, Driftless Area, Edwards Aquifer/Balcones Escarpment, Guadalupe Mountains, Black Hills, and Mother Lode – are described and compared. The number of aquatic (stygobitic) species ranged from zero (Black Hills) to 82 (Appalachians), and the number of terrestrial (troglobitic) species ranged from zero (Florida Lime Sinks) to 256 (Interior Low Plateau). Even at the level of genus, overlap between regions is low. Several predictor variables (karst area, number of caves, number of long caves, number of deep caves, distance from the Pleistocene ice margin, distance from the late Cretaceous Sea, and vegetation type – a surrogate for productivity) were assessed using rank order statistics, especially rank order multiple regression with a backward elimination procedure. For both stygobites and troglobites, only number of caves was a significant predictor. The absence of a karst area effect suggested that the degree of karst development is better described by the number of caves rather than area of karst. There was no evidence that distance to Pleistocene glacial boundaries was important, but there was some support for the importance of distance from late Cretaceous sea margins, a potential source of aquatic subterranean colonists. Finally, there was no indication that surface productivity had an effect on species richness. Analysis was complicated by correlations among predictor variables.
