Similar literature (20 results)
1.
Zhu H, Ibrahim JG, Chi YY, Tang N. Biometrics 2012, 68(3): 954-964
Summary: This article develops a variety of influence measures for carrying out perturbation (or sensitivity) analysis to joint models of longitudinal and survival data (JMLS) in Bayesian analysis. A perturbation model is introduced to characterize individual and global perturbations to the three components of a Bayesian model, including the data points, the prior distribution, and the sampling distribution. Local influence measures are proposed to quantify the degree of these perturbations to the JMLS. The proposed methods allow the detection of outliers or influential observations and the assessment of the sensitivity of inferences to various unverifiable assumptions on the Bayesian analysis of JMLS. Simulation studies and a real data set are used to highlight the broad spectrum of applications for our Bayesian influence methods.

2.
Land use change is one of the main drivers of species extinction. In Europe, grasslands are under active conflict between conservation efforts and increasing agricultural pressures. Here, we examine the demographic effects of differential land use on the herbaceous perennial Trollius europaeus L. (Ranunculaceae), a bioindicator of species richness and ecosystem services in wet grasslands of Central Europe. Demographic data were collected in 2006–2009 from nine populations in seven protected sites of northeastern Germany representing four land use types. We constructed stage-based matrix population models to explore the effects of various land management practices on the demographic viability of focal populations. We show that most studied populations are declining (λ < 1), although the estimates of local extinction vary between ≤15 years for grazed and woodland populations, and 20–99 years for mown and abandoned populations. The joint information from our elasticity analyses and life table response experiments revealed that reproduction, growth of small vegetative individuals and survival of reproductive stages are most important for population viability. Our study shows that the current land uses in protected areas where T. europaeus is found are incompatible with its long-term viability. We suggest that, when compatible with in situ practices, grasslands containing this species be mown after maturity in order to enhance seedling recruitment and to reduce competition for juveniles. Prolonged extinction times in abandoned populations offer a buffer to develop conservation schemes there. An improvement of conservation measures is urgently needed to maintain the populations of this important bioindicator and its associated community of moist species-rich fen grasslands.
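
To make the criterion λ < 1 concrete, the following is a minimal Python sketch of how the asymptotic growth rate of a stage-based matrix model can be estimated by power iteration. The three stages, the transition values and the function name `dominant_eigenvalue` are hypothetical illustrations and are not taken from the study.

```python
def dominant_eigenvalue(matrix, iterations=1000, tol=1e-12):
    """Estimate the dominant eigenvalue (lambda) of a non-negative
    projection matrix by power iteration.

    matrix[i][j] is the per-time-step contribution of stage j to stage i.
    Convergence assumes a primitive matrix (an assumption of this sketch).
    """
    n = len(matrix)
    vec = [1.0 / n] * n          # stage distribution, kept normalized to sum 1
    lam = 0.0
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        total = sum(nxt)         # growth factor, because vec sums to 1
        if total == 0:
            return 0.0
        vec = [x / total for x in nxt]
        if abs(total - lam) < tol:
            return total
        lam = total
    return lam

# Hypothetical stages: seedling, vegetative, reproductive.
A = [
    [0.0, 0.0, 1.5],   # seedlings produced per reproductive individual
    [0.3, 0.4, 0.0],   # seedling -> vegetative, vegetative stasis
    [0.0, 0.2, 0.8],   # vegetative -> reproductive, reproductive survival
]

lam = dominant_eigenvalue(A)
print("declining" if lam < 1 else "stable or growing", round(lam, 3))
```

With these made-up rates the estimated λ falls just below 1, which is the situation the abstract describes as a declining population.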

3.
MOTIVATION: Significance analysis of differential expression in DNA microarray data is an important task. Much of the current research is focused on developing improved tests and software tools. The task is difficult not only owing to the high dimensionality of the data (number of genes), but also because of the often non-negligible presence of missing values. There is thus a great need to reliably impute these missing values prior to the statistical analyses. Many imputation methods have been developed for DNA microarray data, but their impact on statistical analyses has not been well studied. In this work we examine how missing values and their imputation affect significance analysis of differential expression. RESULTS: We develop a new imputation method (LinCmb) that is superior to the widely used methods in terms of normalized root mean squared error. Its estimates are the convex combinations of the estimates of existing methods. We find that LinCmb adapts to the structure of the data: If the data are heterogeneous or if there are few missing values, LinCmb puts more weight on local imputation methods; if the data are homogeneous or if there are many missing values, LinCmb puts more weight on global imputation methods. Thus, LinCmb is a useful tool to understand the merits of different imputation methods. We also demonstrate that missing values affect significance analysis. Two datasets, different amounts of missing values, different imputation methods, the standard t-test and the regularized t-test and ANOVA are employed in the simulations. We conclude that good imputation alleviates the impact of missing values and should be an integral part of microarray data analysis. The most competitive methods are LinCmb, GMC and BPCA. Popular imputation schemes such as SVD, row mean, and KNN all exhibit high variance and poor performance. The regularized t-test is less affected by missing values than the standard t-test. 
AVAILABILITY: Matlab code is available on request from the authors.

4.
This article is concerned with drawing inference about aspects of the population distribution of ordinal outcome data measured on a cohort of individuals on two occasions, where some subjects are missing their second measurement. We present two complementary approaches for constructing bounds under assumptions on the missing data mechanism considered plausible by scientific experts. We develop our methodology within the context of a randomized trial of the "Good Behavior Game," an intervention designed to reduce aggressive misbehavior among children.

5.
Multispecies occupancy models can estimate species richness from spatially replicated multispecies detection/non‐detection survey data, while accounting for imperfect detection. A model extension using data augmentation allows inferring the total number of species in the community, including those completely missed by sampling (i.e., not detected in any survey, at any site). Here we investigate the robustness of these estimates. We review key model assumptions and test performance via simulations, under a range of scenarios of species characteristics and sampling regimes, exploring sensitivity to the Bayesian priors used for model fitting. We run tests when assumptions are perfectly met and when violated. We apply the model to a real dataset and contrast estimates obtained with and without predictors, and for different subsets of data. We find that, even with model assumptions perfectly met, estimation of the total number of species can be poor in scenarios where many species are missed (>15%–20%) and that commonly used priors can accentuate overestimation. Our tests show that estimation can often be robust to violations of assumptions about the statistical distributions describing variation of occupancy and detectability among species, but lower‐tail deviations can result in large biases. We obtain substantially different estimates from alternative analyses of our real dataset, with results suggesting that missing relevant predictors in the model can result in richness underestimation. In summary, estimates of total richness are sensitive to model structure and often uncertain. Appropriate selection of priors, testing of assumptions, and model refinement are all important to enhance estimator performance. Yet, these do not guarantee accurate estimation, particularly when many species remain undetected. While statistical models can provide useful insights, expectations about accuracy in this challenging prediction task should be realistic. 
Where knowledge about species numbers is considered truly critical for management or policy, survey effort should ideally be such that the chances of missing species altogether are low.

6.
Longitudinal studies frequently incur outcome-related nonresponse. In this article, we discuss a likelihood-based method for analyzing repeated binary responses when the mechanism leading to missing response data depends on unobserved responses. We describe a pattern-mixture model for the joint distribution of the vector of binary responses and the indicators of nonresponse patterns. Specifically, we propose an extension of the multivariate logistic model to handle nonignorable nonresponse. This method yields estimates of the mean parameters under a variety of assumptions regarding the distribution of the unobserved responses. Because these models make unverifiable identifying assumptions, we recommend conducting sensitivity analyses that provide a range of inferences, each of which is valid under different assumptions for nonresponse. The methodology is illustrated using data from a longitudinal study of obesity in children.

7.
Albert PS. Biometrics 2000, 56(2): 602-608
Binary longitudinal data are often collected in clinical trials when interest is on assessing the effect of a treatment over time. Our application is a recent study of opiate addiction that examined the effect of a new treatment on repeated urine tests to assess opiate use over an extended follow-up. Drug addiction is episodic, and a new treatment may affect various features of the opiate-use process such as the proportion of positive urine tests over follow-up and the time to the first occurrence of a positive test. Complications in this trial were the large amounts of dropout and intermittent missing data and the large number of observations on each subject. We develop a transitional model for longitudinal binary data subject to nonignorable missing data and propose an EM algorithm for parameter estimation. We use the transitional model to derive summary measures of the opiate-use process that can be compared across treatment groups to assess treatment effect. Through analyses and simulations, we show the importance of properly accounting for the missing data mechanism when assessing the treatment effect in our example.

8.
In randomized studies with missing outcomes, non-identifiable assumptions are required to hold for valid data analysis. As a result, statisticians have been advocating the use of sensitivity analysis to evaluate the effect of varying assumptions on study conclusions. While this approach may be useful in assessing the sensitivity of treatment comparisons to missing data assumptions, it may be dissatisfying to some researchers/decision makers because a single summary is not provided. In this paper, we present a fully Bayesian methodology that allows the investigator to draw a 'single' conclusion by formally incorporating prior beliefs about non-identifiable, yet interpretable, selection bias parameters. Our Bayesian model provides robustness to prior specification of the distributional form of the continuous outcomes.

9.
Methods in the literature for missing covariate data in survival models have relied on the missing at random (MAR) assumption to render regression parameters identifiable. MAR means that missingness can depend on the observed exit time, and whether or not that exit is a failure or a censoring event. By considering ways in which missingness of covariate X could depend on the true but possibly censored failure time T and the true censoring time C, we attempt to identify missingness mechanisms which would yield MAR data. We find that, under various reasonable assumptions about how missingness might depend on T and/or C, additional strong assumptions are needed to obtain MAR. We conclude that MAR is difficult to justify in practical applications. One exception arises when missingness is independent of T, and C is independent of the value of the missing X. As alternatives to MAR, we propose two new missingness assumptions. In one, the missingness depends on T but not on C; in the other, the situation is reversed. For each, we show that the failure time model is identifiable. When missingness is independent of T, we show that the naive complete record analysis will yield a consistent estimator of the failure time distribution. When missingness is independent of C, we develop a complete record likelihood function and a corresponding estimator for parametric failure time models. We propose analyses to evaluate the plausibility of either assumption in a particular data set, and illustrate the ideas using data from the literature on this problem.

10.
In this paper we consider applications of local influence (Cook, 1986) to evaluate small perturbations in the model or data set in the context of structural comparative calibration (Bolfarine and Galea, 1995) assuming that the measurements obtained follow a multivariate elliptical distribution. Different perturbation schemes are investigated and an application is considered to a real data set, using the elliptical t-distribution.

11.
Restriction site-associated DNA sequencing (RAD-seq) and related methods have become relatively common approaches to resolve species-level phylogeny. It is not clear, however, whether RAD-seq data matrices are well suited to relaxed clock inference of divergence times, given the size of the matrices and the abundance of missing data. We investigated the sensitivity of Bayesian relaxed clock estimates of divergence times to alternative analytical decisions on an empirical RAD-seq phylogenetic matrix. We explored the relative contribution of secondary calibration strategies, amount of missing data, and the data partition analyzed to overall variance in divergence times inferred using BEAST MCMC analyses of Carex section Schoenoxiphium (Cyperaceae), a recent radiation for which we have nearly complete species sampling of RAD-seq data. The crown node for Schoenoxiphium was estimated to be 15.22 (9.56–21.18) Ma using a single calibration point and low missing data, 11.93 (8.07–16.03) Ma using multiple calibration points and low missing data, and 8.34 (5.41–11.22) Ma using multiple calibrations but high missing data. We found that using matrices in which more than half of the individuals had missing data yielded younger mean ages for all nodes. Moreover, we found that our molecular clock estimates are sensitive to the positions of the calibration(s) in our phylogenetic tree (using matrices with low missing data), especially when only a single calibration was applied to estimate divergence times. These results argue for sensitivity analyses and caution in interpreting divergence time estimates from RAD-seq data.

12.
Missing data is a common issue in research using observational studies to investigate the effect of treatments on health outcomes. When missingness occurs only in the covariates, a simple approach is to use missing indicators to handle the partially observed covariates. The missing indicator approach has been criticized for giving biased results in outcome regression. However, recent papers have suggested that the missing indicator approach can provide unbiased results in propensity score analysis under certain assumptions. We consider assumptions under which the missing indicator approach can provide valid inferences, namely, (1) no unmeasured confounding within missingness patterns; either (2a) covariate values of patients with missing data were conditionally independent of treatment or (2b) these values were conditionally independent of outcome; and (3) the outcome model is correctly specified: specifically, the true outcome model does not include interactions between missing indicators and fully observed covariates. We prove that, under the assumptions above, the missing indicator approach with outcome regression can provide unbiased estimates of the average treatment effect. We use a simulation study to investigate the extent of bias in estimates of the treatment effect when the assumptions are violated and we illustrate our findings using data from electronic health records. In conclusion, the missing indicator approach can provide valid inferences for outcome regression, but the plausibility of its assumptions must first be considered carefully.
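
As an illustration of the data preparation behind the missing indicator approach, here is a minimal Python sketch. The function name `add_missing_indicator`, the fill value of 0 and the example values are assumptions made for illustration; the abstract does not specify an implementation.

```python
def add_missing_indicator(values, fill=0.0):
    """Return (filled_values, indicators) for a partially observed covariate.

    `None` marks a missing value. Each missing entry is replaced by `fill`
    and flagged with 1 in the indicator list; observed entries get 0.
    """
    filled = [fill if v is None else v for v in values]
    indicators = [1 if v is None else 0 for v in values]
    return filled, indicators

# Hypothetical covariate with two missing entries.
bmi = [24.1, None, 30.5, None, 27.3]
bmi_filled, bmi_missing = add_missing_indicator(bmi)
print(bmi_filled)
print(bmi_missing)
```

Both columns would then enter the outcome regression or propensity score model together, so that the coefficient on the indicator absorbs the arbitrary fill value.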

13.
In this paper, the local influence approach for detecting the effect of small perturbations of the model or data is applied in the context of comparative calibration models. Such models are typically used for comparing several measuring instruments and can be considered in a functional version as well as in a structural version as is the case with ordinary measurement error models. Different perturbation schemes are considered and some real data applications illustrate the usefulness of the approach.

14.
Surveillance is critical to mounting an appropriate and effective response to pandemics. However, aggregated case report data suffer from reporting delays and can lead to misleading inferences. Unlike aggregated case report data, line list data are a table that contains individual features, such as dates of symptom onset and reporting, for each reported case, and are a good source for modeling delays. Current methods for modeling reporting delays are not particularly appropriate for line list data, which typically have missing symptom onset dates that are non-ignorable for modeling reporting delays. In this paper, we develop a Bayesian approach that dynamically integrates imputation and estimation for line list data. Specifically, this Bayesian approach can accurately estimate the epidemic curve and instantaneous reproduction numbers, even with most symptom onset dates missing. The Bayesian approach is also robust to deviations from model assumptions, such as changes in the reporting delay distribution or incorrect specification of the maximum reporting delay. We apply the Bayesian approach to COVID-19 line list data in Massachusetts and find that the reproduction number estimates correspond more closely to the control measures than the estimates based on the reported curve.

15.
High frequency physiologic data are routinely generated for intensive care patients. While massive amounts of data make it difficult for clinicians to extract meaningful signals, these data could provide insight into the state of critically ill patients and guide interventions. We develop uniquely customized computational methods to uncover the causal structure within systemic and brain physiologic measures recorded in a neurological intensive care unit after subarachnoid hemorrhage. While the data have many missing values, poor signal-to-noise ratio, and are composed from a heterogeneous patient population, our advanced imputation and causal inference techniques enable physiologic models to be learned for individuals. Our analyses confirm that complex physiologic relationships including demand and supply of oxygen underlie brain oxygen measurements and that mechanisms for brain swelling early after injury may differ from those that develop in a delayed fashion. These inference methods will enable wider use of ICU data to understand patient physiology.

16.
We study the sensitivity of fishery management per-recruit harvest rates, which may be part of a quantitative harvest strategy designed to achieve some objective for catch or population size. We use a local influence sensitivity analysis to derive equations that describe how these reference harvest rates are affected by perturbations to productivity processes. These equations give a basic theoretical understanding of sensitivity that can be used to predict what the likely impacts of future changes in productivity will be. Our results indicate that per-recruit reference harvest rates are more sensitive to perturbations when the equilibrium catch or population size per recruit, as functions of the harvest rate, have less curvature near the reference point. Overall, our results suggest that per-recruit reference points will, with some exceptions, usually increase if (1) growth rates increase, (2) natural mortality rates increase, or (3) fishery selectivity increases to an older age.

17.
Lachos VH, Bandyopadhyay D, Dey DK. Biometrics 2011, 67(4): 1594-1604
HIV RNA viral load measures are often subjected to some upper and lower detection limits depending on the quantification assays. Hence, the responses are either left or right censored. Linear (and nonlinear) mixed-effects models (with modifications to accommodate censoring) are routinely used to analyze this type of data and are based on normality assumptions for the random terms. However, those analyses might not provide robust inference when the normality assumptions are questionable. In this article, we develop a Bayesian framework for censored linear (and nonlinear) models replacing the Gaussian assumptions for the random terms with normal/independent (NI) distributions. The NI is an attractive class of symmetric heavy-tailed densities that includes the normal, Student's-t, slash, and the contaminated normal distributions as special cases. The marginal likelihood is tractable (using approximations for nonlinear models) and can be used to develop Bayesian case-deletion influence diagnostics based on the Kullback-Leibler divergence. The newly developed procedures are illustrated with two HIV AIDS studies on viral loads that were initially analyzed using normal (censored) mixed-effects models, as well as simulations.

18.
Noncompliance is a common problem in experiments involving randomized assignment of treatments, and standard analyses based on intention-to-treat or treatment received have limitations. An attractive alternative is to estimate the Complier-Average Causal Effect (CACE), which is the average treatment effect for the subpopulation of subjects who would comply under either treatment (Angrist, Imbens, and Rubin, 1996, Journal of American Statistical Association 91, 444-472). We propose an extended general location model to estimate the CACE from data with noncompliance and missing data in the outcome and in baseline covariates. Models for both continuous and categorical outcomes and ignorable and latent ignorable (Frangakis and Rubin, 1999, Biometrika 86, 365-379) missing-data mechanisms are developed. Inferences for the models are based on the EM algorithm and Bayesian MCMC methods. We present results from simulations that investigate sensitivity to model assumptions and the influence of missing-data mechanism. We also apply the method to the data from a job search intervention for unemployed workers.

19.
A common problem in clinical trials is the missing data that occurs when patients do not complete the study and drop out without further measurements. Missing data cause the usual statistical analysis of complete or all available data to be subject to bias. There are no universally applicable methods for handling missing data. We recommend the following: (1) Report reasons for dropouts and proportions for each treatment group; (2) Conduct sensitivity analyses to encompass different scenarios of assumptions and discuss consistency or discrepancy among them; (3) Pay attention to minimize the chance of dropouts at the design stage and during trial monitoring; (4) Collect post-dropout data on the primary endpoints, if at all possible; and (5) Consider the dropout event itself an important endpoint in studies with many.

20.

Introduction

Systematic review authors intending to include all randomized participants in their meta-analyses need to make assumptions about the outcomes of participants with missing data.

Objective

The objective of this paper is to provide systematic review authors with relatively simple guidance for addressing dichotomous data for participants excluded from analyses of randomized trials.

Methods

This guide is based on a review of the Cochrane handbook and published methodological research. The guide deals with participants excluded from the analysis who were considered ‘non-adherent to the protocol’ but for whom data are available, and participants with missing data.

Results

Systematic review authors should include data from ‘non-adherent’ participants excluded from the primary study authors' analysis but for whom data are available. For missing, unavailable participant data, authors may conduct a complete case analysis (excluding those with missing data) as the primary analysis. Alternatively, they may conduct a primary analysis that makes plausible assumptions about the outcomes of participants with missing data. When the primary analysis suggests important benefit, sensitivity meta-analyses using relatively extreme assumptions that may vary in plausibility can inform the extent to which risk of bias impacts the confidence in the results of the primary analysis. The more plausible assumptions draw on the outcome event rates within the trial or in all trials included in the meta-analysis. The proposed guide does not take into account the uncertainty associated with assumed events.

Conclusions

This guide proposes methods for handling participants excluded from analyses of randomized trials. These methods can help in establishing the extent to which risk of bias impacts meta-analysis results.
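
The contrast between a complete case analysis and assumption-based sensitivity analyses for a dichotomous outcome can be sketched for a single trial as follows. This is a minimal Python illustration, not the guide's own procedure: the counts, the function names and the three scenarios are hypothetical, and, like the guide, the sketch ignores the uncertainty associated with assumed events.

```python
def risk_ratio(events_t, total_t, events_c, total_c):
    """Risk ratio of treatment versus control."""
    return (events_t / total_t) / (events_c / total_c)

def with_assumed_events(events, followed, missing, assumed_rate):
    """Add participants with missing data back into the denominator,
    assuming a fraction `assumed_rate` of them had the event."""
    return events + assumed_rate * missing, followed + missing

# Hypothetical trial: the event is unfavourable, 10 participants missing per arm.
t_events, t_followed, t_missing = 10, 90, 10
c_events, c_followed, c_missing = 20, 90, 10

# 1) Complete case analysis: participants with missing data are excluded.
rr_complete = risk_ratio(t_events, t_followed, c_events, c_followed)

# 2) Plausible assumption: missing participants in both arms have the
#    event rate observed in the control arm of the same trial.
control_rate = c_events / c_followed
te, tn = with_assumed_events(t_events, t_followed, t_missing, control_rate)
ce, cn = with_assumed_events(c_events, c_followed, c_missing, control_rate)
rr_plausible = risk_ratio(te, tn, ce, cn)

# 3) Extreme assumption against the treatment: every missing treatment
#    participant had the event, no missing control participant did.
te, tn = with_assumed_events(t_events, t_followed, t_missing, 1.0)
ce, cn = with_assumed_events(c_events, c_followed, c_missing, 0.0)
rr_extreme = risk_ratio(te, tn, ce, cn)

print(rr_complete, rr_plausible, rr_extreme)
```

If the apparent benefit persists under the plausible assumption but vanishes only under the extreme one, as with these made-up counts, that pattern is the kind of information the sensitivity analyses are meant to provide.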
