Similar Literature
20 similar records found (search time: 15 ms)
1.
Cai Z, Kuroki M, Pearl J, Tian J. Biometrics 2008, 64(3): 695-701
Summary. This article considers the problem of estimating the average controlled direct effect (ACDE) of a treatment on an outcome, in the presence of unmeasured confounders between an intermediate variable and the outcome. Such confounders render the direct effect unidentifiable even in cases where the total effect is unconfounded (hence identifiable). Kaufman et al. (2005, Statistics in Medicine 24, 1683–1702) applied linear programming software to find the minimum and maximum possible values of the ACDE for specific numerical data. In this article, we apply the symbolic Balke–Pearl (1997, Journal of the American Statistical Association 92, 1171–1176) linear programming method to derive closed-form formulas for the upper and lower bounds on the ACDE under various assumptions of monotonicity. These universal bounds enable clinical experimenters to assess the direct effect of treatment from observed data with minimal computational effort, and they further shed light on the sign of the direct effect and the accuracy of the assessments.

2.
In a clinical trial, statistical reports are typically concerned with the mean difference between two groups. There is now increasing interest in the heterogeneity of the treatment effect, which has important implications for treatment evaluation and selection. The treatment harm rate (THR), defined as the proportion of people who have a worse outcome on the treatment than on the control, was used to characterize this heterogeneity. Since the THR involves the joint distribution of the two potential outcomes, it cannot be identified without further assumptions, even in randomized trials. We can only derive simple bounds from the observed data, but these simple bounds are usually too wide. In this paper, we use a secondary outcome that satisfies a monotonicity assumption to tighten the bounds. It is shown that the bounds we derive cannot be wider than the simple bounds. We also conduct simulation studies to assess the performance of our bounds in finite samples. The results show that a secondary outcome more closely related to the primary outcome leads to narrower bounds. Finally, we illustrate the application of the proposed bounds in a randomized clinical trial assessing whether intensive glycemic control reduces the risk of development or progression of diabetic retinopathy.
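For binary outcomes, the simple bounds mentioned above depend only on the two marginal event probabilities (they are the Fréchet–Hoeffding bounds on the unidentified joint cell). A minimal sketch, coding Y = 1 as the bad outcome; the function name and inputs are illustrative:

```python
def thr_bounds(p_treat, p_ctrl):
    """Simple bounds on the treatment harm rate P(Y(1)=1, Y(0)=0),
    where Y = 1 is the bad outcome and only the marginal risks
    p_treat = P(Y(1)=1) and p_ctrl = P(Y(0)=1) are identified."""
    lower = max(0.0, p_treat - p_ctrl)   # harm forced by the margins
    upper = min(p_treat, 1.0 - p_ctrl)   # harm allowed by the margins
    return lower, upper
```

For example, `thr_bounds(0.30, 0.20)` returns `(0.10, 0.30)`; the paper's secondary-outcome bounds can only shrink such an interval, never widen it.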

3.
Health researchers are often interested in assessing the direct effect of a treatment or exposure on an outcome variable, as well as its indirect (or mediation) effect through an intermediate variable (or mediator). For an outcome following a nonlinear model, the mediation formula may be used to estimate causally interpretable mediation effects. This method, like others, assumes that the mediator is observed. However, as is common in structural equations modeling, we may wish to consider a latent (unobserved) mediator. We follow a potential outcomes framework and assume a generalized structural equations model (GSEM). We provide maximum-likelihood estimation of GSEM parameters using an approximate Monte Carlo EM algorithm, coupled with a mediation formula approach to estimate natural direct and indirect effects. The method relies on an untestable sequential ignorability assumption; we assess robustness to this assumption by adapting a recently proposed method for sensitivity analysis. Simulation studies show good properties of the proposed estimators in plausible scenarios. Our method is applied to a study of the effect of mother's education on the occurrence of adolescent dental caries, in which we examine possible mediation through latent oral health behavior.

4.
In epidemiological and clinical research, investigators are frequently interested in estimating the direct effect of a treatment on an outcome that is not relayed through intermediate variables. In 2009, VanderWeele presented marginal structural models (MSMs) for estimating direct effects based on interventions on the mediator. This paper focuses on direct effects based on principal stratification, that is, principal stratum direct effects (PSDEs), which are causal effects within latent subgroups of subjects where the mediator is constant regardless of the exposure status. We propose MSMs for estimating PSDEs. We demonstrate that the PSDE can be estimated readily using MSMs under the monotonicity assumption.

5.
The focus of many medical applications is to model the impact of several factors on time to an event. A standard approach for such analyses is the Cox proportional hazards model. It assumes that the factors act linearly on the log hazard function (linearity assumption) and that their effects are constant over time (proportional hazards (PH) assumption). Variable selection is often required to specify a more parsimonious model, aiming to include only variables with an influence on the outcome. As follow-up increases, the effect of a variable often gets weaker, which means that it varies in time. However, spurious time-varying effects may also be introduced by mismodelling other parts of the multivariable model, such as omission of an important covariate or an incorrect functional form of a continuous covariate. These issues interact. To check whether the effect of a variable varies in time, several tests for non-PH have been proposed. However, they are not sufficient to derive a model, as appropriate modelling of the shape of time-varying effects is required. In three examples we compare five recently published strategies to assess whether and how the effects of covariates from a multivariable model vary in time. For practical use we give some recommendations.

6.
Generalized causal mediation analysis
Albert JM, Nelson S. Biometrics 2011, 67(3): 1028-1038
The goal of mediation analysis is to assess direct and indirect effects of a treatment or exposure on an outcome. More generally, we may be interested in the context of a causal model as characterized by a directed acyclic graph (DAG), where mediation via a specific path from exposure to outcome may involve an arbitrary number of links (or "stages"). Methods for estimating mediation (or pathway) effects are available for a continuous outcome and a continuous mediator related via a linear model, while for a categorical outcome or categorical mediator, methods are usually limited to two-stage mediation. We present a method applicable to multiple stages of mediation and mixed variable types using generalized linear models. We define pathway effects using a potential outcomes framework and present a general formula that provides the effect of exposure through any specified pathway. Some pathway effects are nonidentifiable and their estimation requires an assumption regarding the correlation between counterfactuals. We provide a sensitivity analysis to assess the impact of this assumption. Confidence intervals for pathway effect estimates are obtained via a bootstrap method. The method is applied to a cohort study of dental caries in very low birth weight adolescents. A simulation study demonstrates low bias of pathway effect estimators and close-to-nominal coverage rates of confidence intervals. We also find low sensitivity to the counterfactual correlation in most scenarios.
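The percentile-bootstrap interval used here for pathway effects is a generic device; a minimal stdlib sketch, where the estimator `stat` stands in for a fitted pathway-effect estimate (names and defaults are illustrative):

```python
import random
import statistics

def bootstrap_ci(data, stat, n_boot=2000, alpha=0.05, seed=1):
    """Percentile-bootstrap confidence interval for stat(data)."""
    rng = random.Random(seed)
    # Resample the data with replacement and recompute the statistic
    reps = sorted(
        stat([rng.choice(data) for _ in data]) for _ in range(n_boot)
    )
    lo = reps[int(n_boot * (alpha / 2))]
    hi = reps[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi
```

With `data = list(range(100))` and `statistics.mean`, the returned interval brackets the sample mean of 49.5. In the paper's setting, `stat` would refit the mediation model on each resample.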

7.
Shanshan Luo, Wei Li, Yangbo He. Biometrics 2023, 79(1): 502-513
It is challenging to evaluate causal effects when the outcomes of interest suffer from truncation-by-death in many clinical studies; that is, outcomes cannot be observed if patients die before the time of measurement. To address this problem, it is common to consider average treatment effects by principal stratification, for which the identifiability results and estimation methods with a binary treatment have been established in the previous literature. However, in multiarm studies with more than two treatment options, estimation of causal effects becomes more complicated and requires additional techniques. In this article, we consider identification, estimation, and bounds of causal effects with multivalued ordinal treatments and outcomes subject to truncation-by-death. We define causal parameters of interest in this setting and show that they are identifiable either using some auxiliary variable or based on a linear model assumption. We then propose a semiparametric method for estimating the causal parameters and derive their asymptotic results. When the identification conditions are invalid, we derive sharp bounds of the causal effects by use of covariate adjustment. Simulation studies show good performance of the proposed estimator. We use the estimator to analyze the effects of a four-level chronic toxin on fetal developmental outcomes such as birth weight in rats and mice, with data from a developmental toxicity trial conducted by the National Toxicology Program. Data analyses demonstrate that a high dose of the toxin significantly reduces the weights of pups.

8.
Suppose that having established a marginal total effect of a point exposure on a time-to-event outcome, an investigator wishes to decompose this effect into its direct and indirect pathways, also known as natural direct and indirect effects, mediated by a variable known to occur after the exposure and prior to the outcome. This paper proposes a theory of estimation of natural direct and indirect effects in two important semiparametric models for a failure time outcome. The underlying survival model for the marginal total effect, and thus for the direct and indirect effects, can either be a marginal structural Cox proportional hazards model or a marginal structural additive hazards model. The proposed theory delivers new estimators for mediation analysis in each of these models, with appealing robustness properties. Specifically, in order to guarantee ignorability with respect to the exposure and mediator variables, the approach, which is multiply robust, allows the investigator to use several flexible working models to adjust for confounding by a large number of pre-exposure variables. Multiple robustness is appealing because it only requires a subset of working models to be correct for consistency; furthermore, the analyst need not know which subset of working models is in fact correct to report valid inferences. Finally, a novel semiparametric sensitivity analysis technique is developed for each of these models, to assess the impact on inference of a violation of the assumption of ignorability of the mediator.

9.
Suppose we are interested in the effect of a treatment in a clinical trial. The efficiency of inference may be limited due to small sample size. However, external control data are often available from historical studies. Motivated by an application to Helicobacter pylori infection, we show how to borrow strength from such data to improve efficiency of inference in the clinical trial. Under an exchangeability assumption about the potential outcome mean, we show that the semiparametric efficiency bound for estimating the average treatment effect can be reduced by incorporating both the clinical trial data and external controls. We then derive a doubly robust and locally efficient estimator. The improvement in efficiency is prominent especially when the external control data set has a large sample size and small variability. Our method allows for a relaxed overlap assumption, and we illustrate with the case where the clinical trial only contains a treated group. We also develop doubly robust and locally efficient approaches that extrapolate the causal effect in the clinical trial to the external population and the overall population. Our results also offer a meaningful implication for trial design and data collection. We evaluate the finite-sample performance of the proposed estimators via simulation. In the Helicobacter pylori infection application, our approach shows that the combination treatment has potential efficacy advantages over the triple therapy.

10.
The purpose of this study was to investigate motor unit conduction velocity (CV) as a function of frequency. A wavelet-based correlation and coherence analysis was introduced to measure CV as a function of frequency. Based on the simplest assumption, that the power spectrum of the motor unit action potential shifts to higher frequencies with increasing CV, we hypothesized that there would be a monotonic or linear trend of increasing CV with frequency. This trend was only confirmed at higher frequencies. At lower frequencies the trend was often reversed, leading to a decrease in CV with increasing frequency. Thus the CV was high at low frequencies, passed through a minimum at about 170 Hz, and increased at higher frequencies, as expected. The observed CV at low frequencies could not be fully explained by assuming non-propagating signals or variable groups of motor units. We concluded that spectra and CV contain partly independent information about the muscles and that the wavelet-based method provides the tools to measure them both simultaneously.
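A broadband (non-wavelet) version of the CV estimate is the classic cross-correlation delay between two electrode channels, CV = electrode spacing / propagation delay; the wavelet method refines this per frequency band. A minimal sketch on a synthetic pulse, with hypothetical spacing and sampling values:

```python
import math

def xcorr_delay(x, y):
    """Lag (in samples) of y relative to x maximizing the cross-correlation."""
    n = len(x)
    best_lag, best_val = 0, float("-inf")
    for lag in range(-n // 4, n // 4):
        val = sum(x[i] * y[i + lag] for i in range(n) if 0 <= i + lag < n)
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag

# Synthetic action potential travelling between electrodes 10 mm apart,
# sampled at 2 kHz; channel y is channel x delayed by 5 samples.
fs, spacing_m = 2000.0, 0.010
x = [math.exp(-((i - 100) / 5.0) ** 2) for i in range(400)]
y = [x[i - 5] if i >= 5 else 0.0 for i in range(400)]
delay_s = xcorr_delay(x, y) / fs
cv = spacing_m / delay_s   # conduction velocity in metres per second
```

Here the recovered delay is 5 samples (2.5 ms), giving CV = 4 m/s; the paper's frequency-resolved estimate replaces this single delay with one per wavelet band.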

11.
OBJECTIVES: The question of interest is estimating the relationship between haplotypes and an outcome measure, based upon unphased genotypes. The outcome of interest might be predicting the presence of disease in a logistic model, predicting a numeric drug response in a linear model, or predicting survival time in a parametric survival model with censoring. Explanatory variables may include phased haplotype design variables, environmental variables, or interactions between them. METHODS: We extend existing generalized linear haplotype models to parametric survival outcomes. To improve the stability of model variance estimates, a profile likelihood solution is proposed. An adjustment for population stratification is also considered. Here we investigate data sampled from known 'strata' (e.g., gender or ethnicity) that influence haplotype prior probabilities and thus the regression model weights. Differing linear model variance estimates, and the effect of stratification and departures from Hardy-Weinberg Equilibrium (HWE) on parameter estimates, are compared and contrasted via simulation. RESULTS: From simulations, we observed an improvement in statistical power when using a solution to profile likelihood equations. We also saw that stratification had little impact on estimates. Haplotypes that are not in HWE had a negative impact on power to test hypotheses. Finally, profile likelihood solutions for haplotypes deviating from HWE had improved power and confidence interval coverage of regression model coefficients.

12.
A key challenge in the estimation of tropical arthropod species richness is the appropriate management of the large uncertainties associated with any model. Such uncertainties had largely been ignored until recently, when we attempted to account for uncertainty associated with model variables using Monte Carlo analysis. This model is restricted by various assumptions. Here, we use a technique known as probability bounds analysis to assess the influence of assumptions about (1) distributional form and (2) dependencies between variables, and to construct probability bounds around the original model prediction distribution. The original Monte Carlo model yielded a median estimate of 6.1 million species, with a 90% confidence interval of [3.6, 11.4]. Here we found that the probability bounds (p-bounds) surrounding this cumulative distribution were very broad, owing to uncertainties in distributional form and dependencies between variables. Replacing the implicit assumption of pure statistical independence between variables in the model with no dependency assumptions resulted in lower and upper p-bounds at 0.5 cumulative probability (i.e., at the median estimate) of 2.9–12.7 million. From here, replacing probability distributions with probability boxes, which represent classes of distributions, led to even wider bounds (2.4–20.0 million at 0.5 cumulative probability). Even the 100th percentile of the uppermost bound produced (i.e., the absolutely most conservative scenario) did not encompass the well-known hyper-estimate of 30 million species of tropical arthropods. This supports the lower estimates made by several authors over the last two decades.
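The p-box idea can be shown in miniature: when one input has a known distribution but another is known only as an interval, each Monte Carlo draw yields an interval of outputs, and sorting the two sets of endpoints gives the bounding CDFs. All numbers below are illustrative, not the paper's model:

```python
import random

rng = random.Random(42)
# Hypothetical model: richness = hosts * multiplier, where `hosts` has a
# known distribution but `multiplier` is known only to lie in [0.5, 2.0].
hosts = [rng.uniform(2.0, 8.0) for _ in range(10000)]
low_samples = sorted(h * 0.5 for h in hosts)   # smallest outputs per draw
high_samples = sorted(h * 2.0 for h in hosts)  # largest outputs per draw

def quantile(sorted_xs, q):
    """Empirical quantile of an already-sorted sample."""
    return sorted_xs[int(q * (len(sorted_xs) - 1))]

# p-bounds at cumulative probability 0.5, i.e. bounds on the median estimate
median_bounds = (quantile(low_samples, 0.5), quantile(high_samples, 0.5))
```

With these toy inputs the median is only bounded to roughly [2.5, 10.0], illustrating how interval-valued (p-box) inputs widen the bounds relative to a single Monte Carlo distribution.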

13.
M R Crager. Biometrics 1987, 43(4): 895-901
Analysis of covariance (ANCOVA) techniques are often employed in the analysis of clinical trials to try to account for the effects of varying pretreatment baseline values of an outcome variable on posttreatment measurements of the same variable. Baseline measurements of outcome variables are typically random variables, which violates the usual ANCOVA assumption that covariate values are fixed. Therefore, the usual ANCOVA hypothesis tests of treatment effects may be invalid, and the ANCOVA slope parameter estimator biased, for this application. We show, however, that if the pretreatment–posttreatment measurements have a bivariate normal distribution, then (i) the ANCOVA model with residual error independent of the covariate is a valid expression of the relationship between pretreatment and posttreatment measurements; (ii) the usual (fixed-covariate analysis) ANCOVA estimates of the slope parameter and treatment effect contrasts are unbiased; and (iii) the usual ANCOVA treatment effect contrast t-tests are valid significance tests for treatment effects. Moreover, as long as the magnitudes of the treatment effects do not depend on the "true" pretreatment value of the outcome variable, the true slope parameter must lie in the interval (0, 1) and the ANCOVA model has a clear interpretation as an adjustment (based on between- and within-subject variability) to an analysis of variance model applied to the posttreatment-pretreatment differences.
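The claims that the slope lies in (0, 1) and that the adjusted treatment contrast is unbiased can be checked by simulation; a minimal sketch of a randomized two-arm ANCOVA with an illustrative true slope of 0.6 and true effect of 1.5 (not from the paper):

```python
import random

rng = random.Random(7)
n, true_slope, true_effect = 5000, 0.6, 1.5
arms = {0: [], 1: []}                      # (pre, post) pairs per arm
for i in range(n):
    treat = i % 2                          # alternating assignment: randomized
    pre = rng.gauss(0.0, 1.0)
    post = true_slope * pre + true_effect * treat + rng.gauss(0.0, 0.8)
    arms[treat].append((pre, post))

def slope(pairs):
    """Least-squares slope of post on pre within one arm."""
    mx = sum(p for p, _ in pairs) / len(pairs)
    my = sum(q for _, q in pairs) / len(pairs)
    sxy = sum((p - mx) * (q - my) for p, q in pairs)
    sxx = sum((p - mx) ** 2 for p, _ in pairs)
    return sxy / sxx

b = (slope(arms[0]) + slope(arms[1])) / 2                       # pooled slope
adj = [sum(q - b * p for p, q in arms[t]) / len(arms[t]) for t in (0, 1)]
effect_hat = adj[1] - adj[0]               # ANCOVA-adjusted treatment effect
```

With these settings the estimated slope falls inside (0, 1) and the adjusted contrast recovers the true effect, consistent with results (ii) and the slope interval claim.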

14.
In the development of structural equation models (SEMs), observed variables are usually assumed to be normally distributed. However, this assumption is likely to be violated in many practical research settings. As the non-normality of observed variables in an SEM can arise from non-normal latent variables, non-normal residuals, or both, semiparametric modeling with an unknown distribution of latent variables or an unknown distribution of residuals is needed. In this article, we find that an SEM becomes nonidentifiable when both the latent variable distribution and the residual distribution are unknown. Hence, it is impossible to estimate reliably both the latent variable distribution and the residual distribution without parametric assumptions on one or the other. We also find that the residuals in the measurement equation are more sensitive to the normality assumption than the latent variables, and the negative impact on the estimation of parameters and distributions due to the non-normality of residuals is more serious. Therefore, when there is no prior knowledge about parametric distributions for either the latent variables or the residuals, we recommend making a parametric assumption on the latent variables and modeling the residuals nonparametrically. We propose a semiparametric Bayesian approach using the truncated Dirichlet process with a stick-breaking prior to tackle the non-normality of residuals in the measurement equation. Simulation studies and a real data analysis demonstrate our findings, and reveal the empirical performance of the proposed methodology. A free WinBUGS code to perform the analysis is available in Supporting Information.
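The truncated Dirichlet process prior on the residual distribution is built from Sethuraman's stick-breaking weights; a minimal sketch of generating such weights (the concentration parameter and truncation level below are illustrative):

```python
import random

def stick_breaking_weights(alpha, k, seed=0):
    """Weights of a Dirichlet process truncated at k atoms (stick breaking):
    repeatedly break off a Beta(1, alpha) fraction of the remaining stick."""
    rng = random.Random(seed)
    weights, remaining = [], 1.0
    for _ in range(k - 1):
        v = rng.betavariate(1.0, alpha)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    weights.append(remaining)   # truncation: the last atom absorbs the rest
    return weights
```

Smaller `alpha` concentrates mass on few atoms (near-parametric residuals); larger `alpha` spreads it over many, allowing a more flexible residual distribution.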

15.
Summary. In many studies, the aim is to learn about the direct exposure effect, that is, the effect not mediated through an intermediate variable. For example, in circulation disease studies it may be of interest to assess whether a suitable level of physical activity can prevent disease, even if it fails to prevent obesity. It is well known that stratification on the intermediate may introduce a so-called posttreatment selection bias. To handle this problem, we use the framework of principal stratification (Frangakis and Rubin, 2002, Biometrics 58, 21–29) to define a causally relevant estimand: the principal stratum direct effect (PSDE). The PSDE is not identified in our setting. We propose a method of sensitivity analysis that yields a range of plausible values for the causal estimand. We compare our work to similar methods proposed in the literature for handling the related problem of "truncation by death."

16.
Question: Predictive vegetation modelling relies on the use of environmental variables, which are usually derived from a base data set with some level of error, and this error is propagated to any subsequently derived environmental variables. The question for this study is: What is the level of error and uncertainty in environmental variables based on the error propagated from a Digital Elevation Model (DEM), and how does it vary for both direct and indirect variables? Location: Kioloa region, New South Wales, Australia. Methods: The level of error in a DEM is assessed and used to develop an error model for analysing error propagation to derived environmental variables. We tested both indirect (elevation, slope, aspect, topographic position) and direct (average air temperature, net solar radiation, and topographic wetness index) variables for their robustness to propagated error from the DEM. Results: It is shown that the direct environmental variable net solar radiation is less affected by error in the DEM than the indirect variables aspect and slope, but that regional conditions such as slope steepness and cloudiness can influence this outcome. However, the indirect environmental variable topographic position was less affected by error in the DEM than topographic wetness index. Interestingly, the results disagreed with the current assumption that indirect variables are necessarily less sensitive to propagated error because they are less derived. Conclusions: The results indicate that variables exhibit both systematic bias and instability under uncertainty. There is a clear need to consider the sensitivity of variables to error in their base data sets, in addition to the question of whether to use direct or indirect variables.

17.
In this paper we derive entropy bounds for hierarchical networks. More precisely, starting from a recently introduced measure to determine the topological entropy of non-hierarchical networks, we provide bounds for estimating the entropy of hierarchical graphs. Apart from bounds to estimate the entropy of a single hierarchical graph, we see that the derived bounds can also be used for characterizing graph classes. Our contribution is an important extension to previous results about the entropy of non-hierarchical networks because, for practical applications, hierarchical networks play an important role in chemistry and biology. In addition to the derivation of the entropy bounds, we provide a numerical analysis for two special graph classes, rooted trees and generalized trees, and thereby demonstrate not only the computational feasibility of our method but also its characteristics and interpretability with respect to data analysis.
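As a toy instance of a topological entropy measure, one can take the Shannon entropy of a graph's normalized degree sequence; this is one of the simpler information-theoretic graph measures and is not necessarily the one bounded in the paper:

```python
import math
from collections import Counter

def degree_entropy(edges):
    """Shannon entropy (bits) of the degree distribution of an undirected graph."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    total = sum(deg.values())
    return -sum((d / total) * math.log2(d / total) for d in deg.values())

# A star on 4 vertices has a more concentrated degree sequence than a path,
# so its degree entropy is lower.
star = [(0, 1), (0, 2), (0, 3)]
path = [(0, 1), (1, 2), (2, 3)]
```

Measures of this kind underlie entropy bounds for graph classes: for trees on a fixed vertex count, the star and the path are natural extremal shapes to compare.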

18.
Multiple diagnostic tests and risk factors are commonly available for many diseases. This information can be either redundant or complementary. Combining them may improve the diagnostic/predictive accuracy, but also unnecessarily increase complexity, risks, and/or costs. The improved accuracy gained by including additional variables can be evaluated by the increment of the area under the receiver-operating characteristic curve (AUC) with and without the new variable(s). In this study, we derive a new test statistic to accurately and efficiently determine the statistical significance of this incremental AUC under a multivariate normality assumption. Our test links the AUC difference to a quadratic form of a standardized mean shift in units of the inverse covariance matrix through an appropriate linear transformation of all diagnostic variables. The distribution of the quadratic estimator is related to the multivariate Behrens–Fisher problem. We provide explicit mathematical solutions of the estimator and its approximate non-central F-distribution, type I error rate, and sample size formula. We use simulation studies to show that our new test maintains prespecified type I error rates as well as reasonable statistical power under practical sample sizes. We use data from the Study of Osteoporotic Fractures as an application example to illustrate our method.
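Under the multivariate normality assumption with a common covariance, the AUC of the best linear combination has the closed binormal form AUC = Phi(delta / sqrt(2)), where delta is the Mahalanobis distance between the two group means, so the incremental AUC is a difference of two such terms. A sketch; the delta values are illustrative, not from the Study of Osteoporotic Fractures:

```python
from statistics import NormalDist

def binormal_auc(mahalanobis_delta):
    """AUC of the optimal linear score under equal-covariance normality."""
    return NormalDist().cdf(mahalanobis_delta / 2 ** 0.5)

# Adding a marker that raises the Mahalanobis distance from 1.0 to 1.2
incremental_auc = binormal_auc(1.2) - binormal_auc(1.0)
```

The paper's test statistic asks whether such an increment (estimated from data, with the Behrens–Fisher complication) is significantly greater than zero; here the population increment is about 0.04.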

19.
Follmann D, Nason M. Biometrics 2011, 67(3): 1127-1134
Summary. Quantal bioassay experiments relate the amount or potency of some compound (for example, poison, antibody, or drug) to a binary outcome such as death or infection in animals. For infectious diseases, probit regression is commonly used for inference, and a key measure of potency is given by the IDP, the amount that results in P% of the animals being infected. In some experiments, a validation set may be used where both direct and proxy measures of the dose are available on a subset of animals, with the proxy being available on all. The proxy variable can be viewed as a messy reflection of the direct variable, leading to an errors-in-variables problem. We develop a model for the validation set and use a constrained seemingly unrelated regression (SUR) model to obtain the distribution of the direct measure conditional on the proxy. We use the conditional distribution to derive a pseudo-likelihood based on probit regression and use the parametric bootstrap for statistical inference. We re-evaluate an old experiment in 21 monkeys where neutralizing antibodies (nAbs) to HIV were measured using an old (proxy) assay in all monkeys and with a new (direct) assay in a validation set of 11 who had sufficient stored plasma. Using our methods, we obtain an estimate of the ID1 for the new assay, an important target for HIV vaccine candidates. In simulations, we compare the pseudo-likelihood estimates with regression calibration and a full joint likelihood approach.
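Given fitted probit coefficients, the IDP has a closed form: if P(infected) = Phi(alpha + beta * dose), then IDP = (Phi^-1(P/100) - alpha) / beta. A sketch with illustrative coefficients (not the monkey-data fit; the dose may be on a log scale, so negative values are possible):

```python
from statistics import NormalDist

def id_p(alpha, beta, pct):
    """Dose at which pct% of animals are infected under the probit model
    P(infected) = Phi(alpha + beta * dose); alpha, beta are fitted values."""
    return (NormalDist().inv_cdf(pct / 100.0) - alpha) / beta

# Illustrative probit fit: alpha = -2.0, beta = 1.0
id50 = id_p(-2.0, 1.0, 50.0)   # dose giving 50% infected
id1 = id_p(-2.0, 1.0, 1.0)     # the ID1 targeted for vaccine candidates
```

In the paper the uncertainty in these quantities is propagated through the errors-in-variables model via a parametric bootstrap rather than read off a single fit.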

20.
Statistical models support medical research by facilitating individualized outcome prognostication conditional on independent variables or by estimating effects of risk factors adjusted for covariates. Theory of statistical models is well-established if the set of independent variables to consider is fixed and small. Hence, we can assume that effect estimates are unbiased and the usual methods for confidence interval estimation are valid. In routine work, however, it is not known a priori which covariates should be included in a model, and often we are confronted with the number of candidate variables in the range 10–30. This number is often too large to be considered in a statistical model. We provide an overview of various available variable selection methods that are based on significance or information criteria, penalized likelihood, the change-in-estimate criterion, background knowledge, or combinations thereof. These methods were usually developed in the context of a linear regression model and then transferred to more generalized linear models or models for censored survival data. Variable selection, in particular if used in explanatory modeling where effect estimates are of central interest, can compromise stability of a final model, unbiasedness of regression coefficients, and validity of p-values or confidence intervals. Therefore, we give pragmatic recommendations for the practicing statistician on application of variable selection methods in general (low-dimensional) modeling problems and on performing stability investigations and inference. We also propose some quantities based on resampling the entire variable selection process to be routinely reported by software packages offering automated variable selection algorithms.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号