Similar literature
20 similar articles found (search time: 10 ms)
1.
Diggle and Kenward (1994, Applied Statistics 43, 49-93) proposed a selection model for continuous longitudinal data subject to nonrandom dropout. It has provoked a large debate about the role of such models. The original enthusiasm was followed by skepticism about the strong but untestable assumptions on which this type of model invariably rests. Since then, the view has emerged that these models should ideally be made part of a sensitivity analysis. This paper presents a formal and flexible approach to such a sensitivity assessment based on local influence (Cook, 1986, Journal of the Royal Statistical Society, Series B 48, 133-169). The influence of perturbing a missing-at-random dropout model in the direction of nonrandom dropout is explored. The method is applied to data from a randomized experiment on the inhibition of testosterone production in rats.

2.
Recently, a lot of concern has been raised about assumptions needed in order to fit statistical models to incomplete multivariate and longitudinal data. In response, research efforts are being devoted to the development of tools that assess the sensitivity of such models to often strong but always, at least in part, unverifiable assumptions. Many efforts have been devoted to longitudinal data, primarily in the selection model context, although some researchers have expressed interest in the pattern-mixture setting as well. A promising tool, proposed by Verbeke et al. (2001, Biometrics 57, 43-50), is based on local influence (Cook, 1986, Journal of the Royal Statistical Society, Series B 48, 133-169). These authors considered the Diggle and Kenward (1994, Applied Statistics 43, 49-93) model, which is based on a selection model, integrating a linear mixed model for continuous outcomes with logistic regression for dropout. In this article, we show that a similar idea can be developed for multivariate and longitudinal binary data, subject to nonmonotone missingness. We focus on the model proposed by Baker, Rosenberger, and DerSimonian (1992, Statistics in Medicine 11, 643-657). The original model is first extended to allow for (possibly continuous) covariates, whereafter a local influence strategy is developed to support the model-building process. The model is able to deal with nonmonotone missingness but has some limitations as well, stemming from the conditional nature of the model parameters. Some analytical insight is provided into the behavior of the local influence graphs.

3.
Most statistical solutions to the problem of statistical inference with missing data involve integration or expectation. This can be done in many ways: directly or indirectly, analytically or numerically, deterministically or stochastically. Missing-data problems can be formulated in terms of latent random variables, so that hierarchical likelihood methods of Lee & Nelder (1996) can be applied to missing-value problems to provide one solution to the problem of integration of the likelihood. The resulting methods effectively use a Laplace approximation to the marginal likelihood with an additional adjustment to the measures of precision to accommodate the estimation of the fixed effects parameters. We first consider missing at random cases where problems are simpler to handle because the integration does not need to involve the missing-value mechanism and then consider missing not at random cases. We also study tobit regression and refit the missing not at random selection model to the antidepressant trial data analyzed in Diggle & Kenward (1994).

4.
Dobson A, Henderson R. Biometrics, 2003, 59(4): 741-751
We present a variety of informal graphical procedures for diagnostic assessment of joint models for longitudinal and dropout time data. A random effects approach for Gaussian responses and proportional hazards dropout time is assumed. We consider preliminary assessment of dropout classification categories based on residuals following a standard longitudinal data analysis with no allowance for informative dropout. Residual properties conditional upon dropout information are discussed and case influence is considered. The proposed methods do not require computationally intensive methods over and above those used to fit the proposed model. A longitudinal trial into the treatment of schizophrenia is used to illustrate the suggestions.

5.
We describe an extension to matched case-control studies of the parametric modelling framework developed by Diggle (1990) and Diggle and Rowlingson (1994) to investigate raised risk around putative sources of environmental pollution. We use a conditional likelihood approach for the family of risk functions considered in Diggle and Rowlingson (1994). We show that the likelihood surface that results from these models may be highly irregular, and provide a Bayesian analysis in which we investigate the posterior distribution using Markov chain Monte Carlo. An analysis of one-one matched data that were collected to investigate the relationship between respiratory disease and distance to roads in East London is presented.

6.
This article derives generalized prediction intervals for random effects in linear random‐effects models. For balanced and unbalanced data in two‐way layouts, models are considered with and without interaction. Coverage of the proposed generalized prediction intervals was estimated in a simulation study based on an agricultural field experiment. Generalized prediction intervals were compared with prediction intervals based on the restricted maximum likelihood (REML) procedure and the approximate methods of Satterthwaite and Kenward and Roger. The simulation study showed that coverage of generalized prediction intervals was closer to the nominal level 0.95 than coverage of prediction intervals based on the REML procedure.

7.
Summary. In the analysis of missing data, sensitivity analyses are commonly used to check the sensitivity of the parameters of interest with respect to the missing data mechanism and other distributional and modeling assumptions. In this article, we formally develop a general local influence method to carry out sensitivity analyses of minor perturbations to generalized linear models in the presence of missing covariate data. We examine two types of perturbation schemes (the single‐case and global perturbation schemes) for perturbing various assumptions in this setting. We show that the metric tensor of a perturbation manifold provides useful information for selecting an appropriate perturbation. We also develop several local influence measures to identify influential points and test model misspecification. Simulation studies are conducted to evaluate our methods, and real datasets are analyzed to illustrate the use of our local influence measures.

8.
Dropouts are common in longitudinal studies. If the dropout probability depends on the missing observations at or after dropout, this type of dropout is called informative (or nonignorable) dropout (ID). Failure to accommodate such a dropout mechanism in the model will bias the parameter estimates. We propose a conditional autoregressive model for longitudinal binary data with an ID model such that the probabilities of positive outcomes as well as the drop‐out indicator at each occasion are logit linear in some covariates and outcomes. This model, which adopts a marginal model for outcomes and a conditional model for dropouts, is called a selection model. To allow for heterogeneity and clustering effects, the outcome model is extended to incorporate mixture and random effects. Lastly, the model is further extended to a novel model that models the outcome and dropout jointly such that their dependency is formulated through an odds ratio function. Parameters are estimated by a Bayesian approach implemented using the user‐friendly Bayesian software WinBUGS. A methadone clinic dataset is analyzed to illustrate the proposed models. Results show that the treatment time effect is still significant but weaker after allowing for an ID process in the data. Finally, the effect of drop‐out on parameter estimates is evaluated through simulation studies.

9.
Summary. A class of nonignorable models is presented for handling nonmonotone missingness in categorical longitudinal responses. This class of models includes the traditional selection models and shared parameter models. This allows us to perform a broader than usual sensitivity analysis. In particular, instead of considering variations to a chosen nonignorable model, we study sensitivity between different missing data frameworks. An appealing feature of the developed class is that parameters with a marginal interpretation are obtained, while algebraically simple models are considered. Specifically, marginalized mixed‐effects models (Heagerty, 1999, Biometrics 55, 688–698) are used for the longitudinal process that model separately the marginal mean and the correlation structure. For the correlation structure, random effects are introduced and their distribution is modeled either parametrically or non‐parametrically to avoid potential misspecifications.

10.
Yuan Y, Little RJ. Biometrics, 2009, 65(2): 478-486
Summary. Selection models and pattern-mixture models are often used to deal with nonignorable dropout in longitudinal studies. These two classes of models are based on different factorizations of the joint distribution of the outcome process and the dropout process. We consider a new class of models, called mixed-effect hybrid models (MEHMs), where the joint distribution of the outcome process and dropout process is factorized into the marginal distribution of random effects, the dropout process conditional on random effects, and the outcome process conditional on dropout patterns and random effects. MEHMs combine features of selection models and pattern-mixture models: they directly model the missingness process as in selection models, and enjoy the computational simplicity of pattern-mixture models. The MEHM provides a generalization of shared-parameter models (SPMs) by relaxing the conditional independence assumption between the measurement process and the dropout process given random effects. Because SPMs are nested within MEHMs, likelihood ratio tests can be constructed to evaluate the conditional independence assumption of SPMs. We use data from a pediatric AIDS clinical trial to illustrate the models.
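The factorizations named in this abstract can be written out side by side. This is only a notational sketch: the symbols $y$ (outcome process), $d$ (dropout process or pattern) and $b$ (random effects) are assumed here for illustration and are not taken from the abstract.

```latex
\begin{align*}
\text{Selection model:}        \quad & f(y, d)    = f(y)\, f(d \mid y) \\
\text{Pattern-mixture model:}  \quad & f(y, d)    = f(d)\, f(y \mid d) \\
\text{Shared-parameter model:} \quad & f(y, d, b) = f(b)\, f(d \mid b)\, f(y \mid b) \\
\text{MEHM:}                   \quad & f(y, d, b) = f(b)\, f(d \mid b)\, f(y \mid d, b)
\end{align*}
```

In this notation the shared-parameter model is the special case of the MEHM in which $f(y \mid d, b) = f(y \mid b)$, which is the conditional independence assumption that the likelihood ratio test mentioned above would evaluate.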

11.
Aim: Our aim was to investigate how the environment, species characteristics and historical factors at the subcontinental scale affect patterns of diversity. We used the assembly of the Yellowstone biota over the past 10,000 years as a natural experiment for investigating the processes that generate a modern non‐volant mammal species pool. Location: The data represent species from throughout North America with special attention to the non‐volant mammals of Yellowstone National Park, USA. Methods: We used digitized range maps to determine biogeographical affinity for all non‐volant mammals in the Rocky Mountains, Deserts and Great Plains biogeographical regions of North America. This biogeographical affinity, along with taxonomic order and body size class, was used to test whether non‐random patterns exist in the assemblage of Yellowstone non‐volant mammals. These characteristics were also used to investigate the strength of non‐random processes, such as habitat or taxon filtering, on particular groups of species or individual species. Results: Our results indicated that the Yellowstone fauna is composed of a non‐random subset of mammals from specific body size classes and with particular biogeographical affinities. Analyses by taxonomic order found significantly more Carnivora from the Rocky Mountains region and significantly fewer Rodentia from the Deserts region than expected from random assembly. Analyses using body size classes revealed deviations from expectations, including several significant differences between the frequency distribution of regional body sizes and the distribution of those species found within Yellowstone. Main conclusions: Our novel approach explores processes affecting species pool assembly in the Yellowstone region and elsewhere, and particularly identifies unique properties of species that may contribute to non‐random assembly. Focusing on the mechanisms generating diversity, not just current diversity patterns, will assist the design of conservation strategies given future environmental change scenarios.

12.
Within the pattern-mixture modeling framework for informative dropout, conditional linear models (CLMs) are a useful approach to deal with dropout that can occur at any point in continuous time (not just at observation times). However, in contrast with selection models, inferences about marginal covariate effects in CLMs are not readily available if nonidentity links are used in the mean structures. In this article, we propose a CLM for long series of longitudinal binary data with marginal covariate effects directly specified. The association between the binary responses and the dropout time is taken into account by modeling the conditional mean of the binary response as well as the dependence between the binary responses given the dropout time. Specifically, parameters in both the conditional mean and dependence models are assumed to be linear or quadratic functions of the dropout time; and the continuous dropout time distribution is left completely unspecified. Inference is fully Bayesian. We illustrate the proposed model using data from a longitudinal study of depression in HIV-infected women, where the strategy of sensitivity analysis based on the extrapolation method is also demonstrated.

13.
Wang C, Daniels MJ. Biometrics, 2011, 67(3): 810-818
Summary. Pattern mixture modeling is a popular approach for handling incomplete longitudinal data. Such models are not identifiable by construction. Identifying restrictions is one approach to mixture model identification (Little, 1995, Journal of the American Statistical Association 90, 1112–1121; Little and Wang, 1996, Biometrics 52, 98–111; Thijs et al., 2002, Biostatistics 3, 245–265; Kenward, Molenberghs, and Thijs, 2003, Biometrika 90, 53–71; Daniels and Hogan, 2008, in Missing Data in Longitudinal Studies: Strategies for Bayesian Modeling and Sensitivity Analysis) and is a natural starting point for missing not at random sensitivity analysis (Thijs et al., 2002, Biostatistics 3, 245–265; Daniels and Hogan, 2008, in Missing Data in Longitudinal Studies: Strategies for Bayesian Modeling and Sensitivity Analysis). However, when the pattern specific models are multivariate normal, identifying restrictions corresponding to missing at random (MAR) may not exist. Furthermore, identification strategies can be problematic in models with covariates (e.g., baseline covariates with time‐invariant coefficients). In this article, we explore conditions necessary for identifying restrictions that result in MAR to exist under a multivariate normality assumption and strategies for identifying sensitivity parameters for sensitivity analysis or for a fully Bayesian analysis with informative priors. In addition, we propose alternative modeling and sensitivity analysis strategies under a less restrictive assumption for the distribution of the observed response data. We adopt the deviance information criterion for model comparison and perform a simulation study to evaluate the performances of the different modeling approaches. We also apply the methods to a longitudinal clinical trial. Problems caused by baseline covariates with time‐invariant coefficients are investigated and an alternative identifying restriction based on residuals is proposed as a solution.

14.
15.
Aim: Intraspecific variation in patch occupancy often is related to physical features of a landscape, such as the amount and distribution of habitat. However, communities occupying patchy environments typically exhibit non‐random distributions in which local assemblages of species‐poor patches are nested subsets of assemblages occupying more species‐rich patches. Nestedness of local communities implies interspecific differences in sensitivity to patchiness. Several hypotheses have been proposed to explain interspecific variation in responses to patchiness within a community, including differences in (1) colonization ability, (2) extinction proneness, (3) tolerance to disturbance, (4) sociality and (5) level of adaptation to prevailing environmental conditions. We used data on North American mammals to compare the performance of these ‘ecological’ hypotheses and the ‘physical landscape’ hypothesis. We then compared the best of these models against models that scaled landscape structure to ecologically relevant attributes of individual species. Location: North America. Methods: We analysed data on prevalence (i.e. proportion of patches occupied in a network of patches) and occupancy for 137 species of non‐volant mammals and twenty networks consisting of four to seventy‐five patches. Insular and terrestrial networks exhibited significantly different mean levels of prevalence and occupancy and thus were analysed separately. Indicator variables at ordinal and family levels were included in models to correct for effects caused by phylogeny. Akaike's information criterion was used in conjunction with ordinary least squares and logistic regression to compare hypotheses. Results: A patch network's physical structure, indexed using patch area and isolation, received the greatest support among models predicting the prevalence of species on insular networks. Niche breadth (diet and habitat) received the greatest support for predicting prevalence of species occupying terrestrial networks. For both insular and terrestrial systems, physical features (patch area and isolation) received greater support than any of the ecological hypotheses for predicting species occupancy of individual patches. For terrestrial systems, scaling patch area by its suitability to a focal species and by individual area requirements of the species, and scaling patch isolation by species‐specific dispersal ability and niche breadth, resulted in models of patch occupancy that were superior to models relying solely on physical landscape features. For all selected models, unexplained levels of variation were high. Main conclusions: Stochasticity dominated the systems we studied, indicating that random events are probably quite important in shaping local communities. With respect to deterministic factors, our results suggest that forces affecting species prevalence and occupancy may differ between insular and terrestrial systems. Physical features of insular systems appeared to swamp ecological differences among species in determining prevalence and occupancy, whereas species with broad niches were disproportionately represented in terrestrial networks. We hypothesize that differential extinction over long time periods in highly variable networks has driven nestedness of mammalian communities on islands, whereas differential colonization over shorter time‐scales in more homogeneous networks probably governed the local structure of terrestrial communities. Our results also demonstrate that integration of a species' ecological traits with physical features of a patch network is superior to reliance on either factor separately when attempting to predict the species' probability of patch occupancy in terrestrial systems.

16.
Null community is a spatio‐temporal abstraction of an initial regional species pool from which local species pools and actual community assemblages are organized. Any process that causes joint responses of species with similar susceptibilities affects community assembly. Through time, sequential assembly processes change the composition of a species pool in a way analogous to the one in which evolutionary processes promote character changes from an ancestor to current species. The segregation of species occurrences in an actual community suggests that assembly processes non‐randomly structured the observed community assemblages. However, going backwards to imply the causes of a particular arrangement of species is a non‐trivial challenge. I merge these premises with the philosophical and methodological foundations of cladistics. I propound parsimony analysis of species co‐occurrences as an outstanding means of devising operational hypotheses about the assembly of any non‐randomly structured set of actual community assemblages related to a common species pool. To explore this approach, I used field data gathered in a suite of 10 wetland assemblages. First, I tested independence of 101 plant species occurrences by a null model. As significant non‐random species co‐occurrence was detected, I applied a parsimony analysis taking the species occurrences as attributes, the assemblages as terminal units, and a putative null community constituted by all the present local species as the root of the assembly suite. The analysis produced four most parsimonious trees of assembly relationships. These trees maximize the number of similarities among community assemblages that can be explained by the sole fact of sharing a common regional species pool. One most parsimonious spatio‐temporal arrangement of species occurrence changes was reconstructed on one of the trees. I interpret this reconstruction in terms of assembly events, species exclusions and recruitments, showing the potentialities of this analysis to formulate operational hypotheses about community organization.

17.
In longitudinal studies, investigators frequently have to assess and address potential biases introduced by missing data. New methods are proposed for modeling longitudinal categorical data with nonignorable dropout using marginalized transition models and shared random effects models. Random effects are introduced for both serial dependence of outcomes and nonignorable missingness. Fisher‐scoring and quasi-Newton algorithms are developed for parameter estimation. Methods are illustrated with a real dataset.

18.
Non‐random patterns of species segregation and aggregation within ecological communities are often interpreted as evidence for interspecific interactions. However, it is unclear whether theoretical models can predict such patterns and how environmental factors may modify the effects of species interactions on species co‐occurrence. Here we extend a spatially explicit neutral model by including competitive effects on birth and death probabilities to assess whether competition alone is able to produce non‐random patterns of species co‐occurrence. We show that transitive and intransitive competitive hierarchies alone (in the absence of environmental heterogeneity) are indeed able to generate non‐random patterns with commonly used metrics and null models. Moreover, even weak levels of intransitive competition can increase local species richness. However, there is no simple rule or consistent directional change towards aggregation or segregation caused by competitive interactions. Instead, the spatial pattern depends on both the type of species interaction and the strength of dispersal. We conclude that co‐occurrence analysis alone may not be able to identify the underlying processes that generate the patterns.

19.
The dispersal ability of plants is a major factor driving ecological responses to global change. In wind‐dispersed plant species, non‐random seed release in relation to wind speeds has been identified as a major determinant of dispersal distances. However, little information is available about the costs and benefits of non‐random abscission and the consequences of timing for dispersal distances. We asked: 1) to what extent is non‐random abscission able to promote long‐distance dispersal and what is the effect of potentially increased pre‐dispersal risk costs? 2) Which meteorological factors and respective timescales are important for maximizing dispersal? These questions were addressed by combining a mechanistic modelling approach and field data collection for herbaceous wind‐dispersed species. Model optimization with a dynamic dispersal approach using measured hourly wind speed showed that plants can increase long‐distance dispersal by developing a hard wind speed threshold below which no seeds are released. At the same time, increased risk costs limit the possibilities for dispersal distance gain and reduce the optimum level of the wind speed threshold, in our case (under representative Dutch meteorological conditions) to a threshold of 5–6 m s⁻¹. The frequency and predictability (auto‐correlation in time) of pre‐dispersal seed‐loss had a major impact on optimal non‐random abscission functions and resulting dispersal distances. We observed a similar, but more gradual, bias towards higher wind speeds in six out of seven wind‐dispersed species under natural conditions. This confirmed that non‐random abscission exists in many species and that, under local Dutch meteorological conditions, abscission was biased towards winds exceeding 5–6 m s⁻¹. We conclude that timing of seed release can vastly enhance dispersal distances in wind‐dispersed species, but increased risk costs may greatly limit the benefits of selecting wind conditions for long‐distance dispersal, leading to moderate seed abscission thresholds, depending on local meteorological conditions and disturbances.

20.
Ecological data often show temporal, spatial, hierarchical (random effects), or phylogenetic structure. Modern statistical approaches are increasingly accounting for such dependencies. However, when performing cross‐validation, these structures are regularly ignored, resulting in serious underestimation of predictive error. One cause of the poor performance of uncorrected (random) cross‐validation, noted often by modellers, is dependence structures in the data that persist as dependence structures in model residuals, violating the assumption of independence. Even more concerning, because often overlooked, is that structured data also provide ample opportunity for overfitting with non‐causal predictors. This problem can persist even if remedies such as autoregressive models, generalized least squares, or mixed models are used. Block cross‐validation, where data are split strategically rather than randomly, can address these issues. However, the blocking strategy must be carefully considered. Blocking in space, time, random effects or phylogenetic distance, while accounting for dependencies in the data, may also unwittingly induce extrapolations by restricting the ranges or combinations of predictor variables available for model training, thus overestimating interpolation errors. On the other hand, deliberate blocking in predictor space may also improve error estimates when extrapolation is the modelling goal. Here, we review the ecological literature on non‐random and blocked cross‐validation approaches. We also provide a series of simulations and case studies, in which we show that, for all instances tested, block cross‐validation is nearly universally more appropriate than random cross‐validation if the goal is predicting to new data or predictor space, or for selecting causal predictors. We recommend that block cross‐validation be used wherever dependence structures exist in a dataset, even if no correlation structure is visible in the fitted model residuals, or if the fitted models account for such correlations.
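The contrast this abstract draws between random and block cross‐validation can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy data (a random walk along a one-dimensional transect), the 1-nearest-neighbour predictor, and the helper names `random_folds`, `block_folds`, `cv_rmse` and `demo`. None of it comes from the reviewed paper or reproduces its simulations.

```python
import numpy as np


def random_folds(n, k, rng):
    """Assign n observations to k folds at random (ordinary cross-validation)."""
    return rng.permutation(n) % k


def block_folds(block_ids, k):
    """Assign whole blocks to folds, so no block is split between train and test."""
    blocks = np.unique(block_ids)
    fold_of_block = {b: i % k for i, b in enumerate(blocks)}
    return np.array([fold_of_block[b] for b in block_ids])


def cv_rmse(x, y, folds):
    """Cross-validated RMSE of a 1-nearest-neighbour predictor on position x."""
    sq_err = []
    for f in np.unique(folds):
        test = folds == f
        train = ~test
        # distance from every test position to every training position
        dist = np.abs(x[test][:, None] - x[train][None, :])
        pred = y[train][dist.argmin(axis=1)]
        sq_err.append((pred - y[test]) ** 2)
    return float(np.sqrt(np.concatenate(sq_err).mean()))


def demo(seed=0, n=200, k=5):
    """Compare random and block CV error on a spatially autocorrelated transect."""
    rng = np.random.default_rng(seed)
    x = np.arange(n, dtype=float)      # position along a hypothetical transect
    y = np.cumsum(rng.normal(size=n))  # random walk: strong spatial dependence
    rmse_random = cv_rmse(x, y, random_folds(n, k, rng))
    # contiguous blocks of n // k positions, one block per fold
    rmse_block = cv_rmse(x, y, block_folds(np.arange(n) // (n // k), k))
    return rmse_random, rmse_block


if __name__ == "__main__":
    print(demo())
```

With random folds, each held-out point usually has an immediate neighbour in the training set, so the estimated error mostly reflects the spatial dependence in the data. With contiguous blocks, each held-out point must be predicted from outside its block, which is closer to predicting at a new location. Whether contiguous spatial blocks are the right blocking strategy depends on the modelling goal, as the abstract cautions.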


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  京ICP备09084417号