Similar articles
20 similar articles found (search time: 46 ms)
1.
Dynamic treatment regimes (DTRs) consist of a sequence of decision rules, one per stage of intervention, that aim to recommend effective treatments for individual patients according to patient information history. DTRs can be estimated from models which include interactions between treatment and a (typically small) number of covariates which are often chosen a priori. However, with increasingly large and complex data being collected, it can be difficult to know which prognostic factors might be relevant in the treatment rule. Therefore, a more data-driven approach to selecting these covariates might improve the estimated decision rules and simplify models to make them easier to interpret. We propose a variable selection method for DTR estimation using penalized dynamic weighted least squares. Our method has the strong heredity property, that is, an interaction term can be included in the model only if the corresponding main terms have also been selected. We show theoretically that our method possesses both the double robustness property and the oracle property, and it compares favorably with other variable selection approaches in numerical studies. We further illustrate the proposed method on data from the Sequenced Treatment Alternatives to Relieve Depression study.

2.
Stepped wedge cluster randomized trials (SWCRTs) are increasingly used for the evaluation of complex interventions in health services research. They randomly allocate treatments to clusters that switch to the intervention under investigation at variable time points without returning to the control condition. The resulting unbalanced allocation over time periods and the uncertainty about the underlying correlation structures at cluster level render designing and analyzing SWCRTs a challenge. Adjusting for time trends is recommended, but appropriate parameterizations depend on the particular context. For sample size calculation, the covariance structure and covariance parameters are usually assumed to be known. These assumptions greatly affect the influence that single cluster-period cells have on the effect estimate. Thus, it is important to understand how cluster-period cells contribute to the treatment effect estimate. We therefore discuss two measures of cell influence. These are functions of the design characteristics and covariance structure only and can thus be calculated at the planning stage: the coefficient matrix as discussed by Matthews and Forbes, and the information content (IC) as introduced by Kasza and Forbes. The main result is a new formula for IC that is more general and computationally more efficient. The formula applies to any generalized least squares estimator, in particular to any type of time trend adjustment or nonblock diagonal covariance matrices. We further show a functional relationship between IC and the coefficient matrix. We give two examples that tie in with the current literature. All discussed tools and methods are implemented in the R package SteppedPower.

3.
There is growing interest in conducting cluster randomized trials (CRTs). For simplicity in sample size calculation, the cluster sizes are often assumed to be identical across all clusters. However, equal cluster sizes are not guaranteed in practice. Therefore, the relative efficiency (RE) of unequal versus equal cluster sizes has been investigated when testing the treatment effect. One of the most important approaches to analyzing a set of correlated data is the generalized estimating equation (GEE) approach proposed by Liang and Zeger, in which a "working correlation structure" is introduced and the association pattern depends on a vector of association parameters denoted by ρ. In this paper, we utilize GEE models to test the treatment effect in a two-group comparison for continuous, binary, or count data in CRTs. The variances of the estimator of the treatment effect are derived for the different types of outcome. RE is defined as the ratio of the variance of the estimator of the treatment effect for equal versus unequal cluster sizes. We consider the exchangeable structure, which is commonly used in CRTs, and derive simplified formulas for the RE with continuous, binary, and count outcomes. Finally, REs are investigated for several scenarios of cluster size distributions through simulation studies. We propose an adjusted sample size to compensate for the efficiency loss, and we also propose optimal sample size estimation based on GEE models under a fixed budget for known and unknown association parameter ρ in the working correlation structure within the cluster.
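A minimal sketch of the kind of relative-efficiency calculation this abstract describes, for the simplest case: a GLS/GEE estimator of a single group mean under an exchangeable correlation with unit variance. The function names and example sizes are ours, not the paper's; the paper's formulas for binary and count outcomes are more involved.

```python
import numpy as np

def cluster_information(sizes, rho):
    """Fisher information contributed by clusters of the given sizes for a
    group mean under exchangeable correlation rho (unit residual variance):
    each cluster of size m contributes m / (1 + (m - 1) * rho)."""
    m = np.asarray(sizes, dtype=float)
    return np.sum(m / (1.0 + (m - 1.0) * rho))

def relative_efficiency(sizes, rho):
    """RE of unequal vs equal cluster sizes with the same total n:
    ratio of the information under the actual (unequal) sizes to the
    information under equal sizes.  RE <= 1, with equality when all
    cluster sizes are equal."""
    m = np.asarray(sizes, dtype=float)
    m_bar = m.mean()
    info_unequal = cluster_information(m, rho)
    info_equal = len(m) * m_bar / (1.0 + (m_bar - 1.0) * rho)
    return info_unequal / info_equal

print(relative_efficiency([4, 8, 12, 16, 20], rho=0.05))  # < 1: efficiency loss
print(relative_efficiency([12] * 5, rho=0.05))            # equal sizes -> 1.0
```

Because m / (1 + (m - 1)ρ) is concave in m, unequal sizes always lose information relative to equal sizes at the same total n, which is the efficiency loss the adjusted sample size compensates for.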

4.
Longitudinal studies are often used in biomedical research and clinical trials to evaluate the treatment effect. The association pattern within subjects must be considered in both sample size calculation and analysis. One of the most important approaches to analyzing such studies is the generalized estimating equation (GEE) approach proposed by Liang and Zeger, in which a "working correlation structure" is introduced and the association pattern within subjects depends on a vector of association parameters denoted by ρ. Explicit sample size formulas for two-group comparisons in linear and logistic regression models were obtained based on the GEE method by Liu and Liang. For cluster randomized trials (CRTs), researchers have proposed optimal sample sizes at both the cluster and individual level as a function of sampling costs and the intracluster correlation coefficient (ICC). In these approaches, the optimal sample sizes depend strongly on the ICC. However, the ICC is usually unknown for CRTs and multicenter trials. To overcome this shortcoming, Van Breukelen et al. consider a range of possible ICC values identified from literature reviews and present maximin designs (MMDs) based on relative efficiency (RE) and efficiency under budget and cost constraints. In this paper, the optimal sample size and number of repeated measurements using GEE models with an exchangeable working correlation matrix are proposed under a fixed budget, where "optimal" refers to maximum power for a given sampling budget. Equations for the sample size and number of repeated measurements for a known parameter value ρ are derived, and a straightforward algorithm for unknown ρ is developed. Applications in practice are discussed. We also discuss the existence of the optimal design when an AR(1) working correlation matrix is assumed. Our proposed method can be extended to scenarios in which the true and working correlation matrices differ.
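To illustrate the flavor of budget-constrained optimization discussed here, the following sketch uses the classic textbook result for a single mean under exchangeable correlation (not the paper's own derivation): with cost c1 per cluster/subject and c2 per measurement, the variance-minimizing number of measurements per unit is sqrt(c1(1-ρ)/(c2·ρ)). All cost figures below are made-up examples.

```python
import math

def optimal_cluster_size(c_cluster, c_subject, rho):
    """Classic optimal number of units per cluster (or repeated measures per
    subject) under exchangeable correlation rho, minimizing the variance of a
    mean for a fixed budget B = K * c_cluster + K * m * c_subject."""
    return math.sqrt(c_cluster * (1.0 - rho) / (c_subject * rho))

def optimal_design(budget, c_cluster, c_subject, rho):
    """Round the optimal size to an integer, then spend the budget on as many
    clusters of that size as it allows."""
    m = max(1, round(optimal_cluster_size(c_cluster, c_subject, rho)))
    k = int(budget // (c_cluster + m * c_subject))
    return k, m

k, m = optimal_design(budget=10_000, c_cluster=200, c_subject=10, rho=0.05)
print(k, m)  # 25 clusters of 19 measurements each
```

Note how the optimum depends on ρ, which is exactly why the paper needs an algorithm for the case of unknown ρ.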

5.
The stepped wedge cluster randomized trial (SW-CRT) is an increasingly popular design for evaluating health service delivery or policy interventions. An essential consideration of this design is the need to account for both within-period and between-period correlations in sample size calculations. Especially when embedded in health care delivery systems, many SW-CRTs may have subclusters nested in clusters, within which outcomes are collected longitudinally. However, existing sample size methods that account for between-period correlations have not allowed for multiple levels of clustering. We present computationally efficient sample size procedures that properly differentiate within-period and between-period intracluster correlation coefficients in SW-CRTs in the presence of subclusters. We introduce an extended block exchangeable correlation matrix to characterize the complex dependencies of outcomes within clusters. For Gaussian outcomes, we derive a closed-form sample size expression that depends on the correlation structure only through two eigenvalues of the extended block exchangeable correlation structure. For non-Gaussian outcomes, we present a generic sample size algorithm based on linearization and elucidate simplifications under canonical link functions. For example, we show that the approximate sample size formula under a logistic linear mixed model depends on three eigenvalues of the extended block exchangeable correlation matrix. We provide an extension to accommodate unequal cluster sizes and validate the proposed methods via simulations. Finally, we illustrate our methods in two real SW-CRTs with subclusters.

6.
Assessing the significance of the correlation between two spatial processes
Modified tests of association based on the correlation coefficient or the covariance between two spatially autocorrelated processes are presented. These tests can be used both for lattice and nonlattice data. They are based on the evaluation of an effective sample size that takes into account the spatial structure. For positively autocorrelated processes, the effective sample size is reduced. A method for evaluating this reduction via an approximation of the variance of the correlation coefficient is developed. The performance of the tests is assessed by Monte Carlo simulations. The method is illustrated by examples from geographical epidemiology.

7.
In clinical trials, sample size reestimation is a useful strategy for mitigating the risk of uncertainty in design assumptions and ensuring sufficient power for the final analysis. In particular, sample size reestimation based on the unblinded interim effect size can often lead to a sample size increase, and statistical adjustment is usually needed for the final analysis to ensure that the type I error rate is appropriately controlled. In the current literature, sample size reestimation and the corresponding type I error control are discussed in the context of maintaining the original randomization ratio across treatment groups, which we refer to as "proportional increase." In practice, not all studies are designed with an optimal randomization ratio, for practical reasons. In such cases, when the sample size is to be increased, it is more efficient to allocate the additional subjects such that the randomization ratio is brought closer to an optimal ratio. In this research, we propose an adaptive randomization ratio change when a sample size increase is warranted. We refer to this strategy as "nonproportional increase," as the number of subjects added to each treatment group is no longer proportional to the original randomization ratio. The proposed method boosts power not only through the increase in sample size but also via efficient allocation of the additional subjects. The control of the type I error rate is shown analytically. Simulations are performed to illustrate the theoretical results.
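A toy sketch of why a nonproportional increase can beat a proportional one, using only the textbook variance of a difference in means and the Neyman-optimal allocation ratio (n1:n2 = σ1:σ2). This is our simplification, not the paper's adaptive procedure, and it ignores the type I error adjustment the paper addresses.

```python
def var_diff(n1, n2, s1=1.0, s2=1.0):
    """Variance of the difference in group means: s1^2/n1 + s2^2/n2."""
    return s1**2 / n1 + s2**2 / n2

def allocate_extra(n1, n2, extra, s1=1.0, s2=1.0):
    """Allocate `extra` new subjects so the final split n1:n2 is as close as
    possible to the Neyman-optimal ratio s1:s2 (a 'nonproportional' increase),
    without removing already-enrolled subjects."""
    total = n1 + n2 + extra
    target1 = total * s1 / (s1 + s2)          # optimal group-1 size
    a1 = min(extra, max(0, round(target1 - n1)))
    return n1 + a1, n2 + (extra - a1)

# Trial started at an unbalanced 2:1 ratio with equal variances; 60 extra
# subjects are pushed toward 1:1 rather than split 40/20 proportionally.
m1, m2 = allocate_extra(100, 50, 60)
print(m1, m2)                                  # 105 105
print(var_diff(105, 105) < var_diff(140, 70))  # True: smaller variance
```

The nonproportional split yields a smaller variance for the treatment effect estimate than the proportional 140/70 split at the same total sample size, which is the power gain the abstract refers to.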

8.
In crop protection and ecology, accurate and precise estimates of insect populations are required for many purposes. The spatial pattern of the organism sampled, in relation to the sampling scheme adopted, affects both the difference between the actual and estimated population density (the bias) and the variability of that estimate (the precision). Field monitoring schemes usually adopt time-efficient sampling regimes involving contiguous units rather than the scheme most efficient for estimation, the completely random sample. This paper uses spatially explicit ecological field data on aphids and beetles to compare common sampling regimes. The random sample was the most accurate method and often the most precise; of the contiguous schemes, the line transect was superior to more compact arrangements such as a square block. Bias depended on the relationship between the size and shape of the group of units comprising the sample and the dominant cluster size underlying the spatial pattern. Existing knowledge of spatial pattern used to inform the choice of sampling scheme may provide considerable improvements in accuracy. It is recommended to use line transects longer than the grain of the spatial pattern, where grain is defined as the average dimension of clusters over both patches and gaps, with a length at least twice the dominant cluster size.

9.
Covariance matrix estimation is a fundamental statistical task in many applications, but the sample covariance matrix is suboptimal when the sample size is comparable to or less than the number of features. Such high-dimensional settings are common in modern genomics, where covariance matrix estimation is frequently employed as a method for inferring gene networks. To achieve estimation accuracy in these settings, existing methods typically either assume that the population covariance matrix has some particular structure, for example, sparsity, or apply shrinkage to better estimate the population eigenvalues. In this paper, we study a new approach to estimating high-dimensional covariance matrices. We first frame covariance matrix estimation as a compound decision problem. This motivates defining a class of decision rules and using a nonparametric empirical Bayes g-modeling approach to estimate the optimal rule in the class. Simulation results and gene network inference in an RNA-seq experiment in mice show that our approach is comparable to or can outperform a number of state-of-the-art proposals.

10.
The primate masticatory apparatus (MA) is a functionally integrated set of features, each of which performs important functions in biting, ingestive, and chewing behaviors. A comparison of morphological covariance structure among species for these MA features will help us to further understand the evolutionary history of this region. In this exploratory analysis, the covariance structure of the MA is compared across seven galago species to investigate 1) whether there are differences in covariance structure in this region, and 2) if so, how this covariation has changed with respect to size, MA form, diet, and/or phylogeny. Ten measurements of the MA functionally related to bite force production and load resistance were obtained from 218 adults of seven galago species. Correlation matrices were generated for these 10 dimensions and compared among species via matrix correlations and Mantel tests. Subsequently, pairwise covariance disparity in the MA was estimated as a measure of difference in covariance structure between species. Covariance disparity estimates were correlated with pairwise distances related to differences in body size, MA size and shape, genetic distance (based on cytochrome-b sequences), and percentage of dietary foods to determine whether one or more of these factors is linked to differences in covariance structure. Galagos differ in MA covariance structure. Body size appears to be a major factor correlated with differences in covariance structure among galagos. The largest galago species, Otolemur crassicaudatus, exhibits large differences in body mass and covariance structure relative to other galagos, and thus plays a primary role in creating this association. MA size and shape do not correlate with covariance structure when body mass is held constant. Diet also shows no association. Genetic distance is significantly negatively correlated with covariance disparity when body mass is held constant, but this correlation appears to be a function of the small body size and large genetic distance for Galagoides demidoff. These exploratory results indicate that changing body size may have been a key factor in the evolution of the galago MA. Am. J. Primatol. 69:46–58, 2007. © 2006 Wiley-Liss, Inc.

11.
A generalized variance component model is proposed for the analysis of a categorical response variable with extra-multinomial variation. Categorical data obtained from research designs such as randomized multicenter clinical trials or complex sample surveys with clustering frequently exhibit extra-variation resulting from intracluster correlation. General correlation patterns are accounted for by utilizing a mixed-effects modelling approach, estimating the cluster variance components through the method of moments and modelling functions of the observed proportions through the use of estimating equations. A flexible set of assumptions characterizing the underlying covariance structure for the proportions can be accommodated. The importance of accounting for extra-variation when performing hypothesis tests is highlighted with an application to data from a multi-investigator clinical trial.

12.
David I. Warton, Biometrics, 2011, 67(1):116–123
Summary: A modification of generalized estimating equations (GEE) methodology is proposed for hypothesis testing of high-dimensional data, with particular interest in multivariate abundance data in ecology, an important application arising in thousands of environmental science studies. Such data are typically counts characterized by high dimensionality (in the sense that cluster size exceeds the number of clusters, n>K) and overdispersion relative to the Poisson distribution. Usual GEE methods cannot be applied in this setting, primarily because sandwich estimators become numerically unstable as n increases. We propose instead using a regularized sandwich estimator that assumes a common correlation matrix R, and shrinks the sample estimate of R toward the working correlation matrix to improve its numerical stability. It is shown via theory and simulation that this substantially improves the power of Wald statistics when cluster size is not small. We apply the proposed approach to study the effects of nutrient addition on nematode communities, and in doing so discuss important issues in implementation, such as using statistics that have good properties when parameter estimates approach the boundary, and using resampling to enable valid inference that is robust to high dimensionality and to possible model misspecification.
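A minimal NumPy sketch of the shrinkage idea behind the regularized sandwich estimator: a convex combination of the (rank-deficient, hence unstable) sample correlation matrix and a structured working correlation. The shrinkage weight `gamma` and the use of working independence are illustrative choices, not the paper's data-driven tuning.

```python
import numpy as np

def shrunk_correlation(R_sample, R_working, gamma):
    """Shrink an unstable sample correlation matrix toward a structured
    working correlation: R(gamma) = (1 - gamma) * R_sample + gamma * R_working.
    gamma = 0 returns the raw sample estimate; gamma = 1 the working structure."""
    return (1.0 - gamma) * np.asarray(R_sample) + gamma * np.asarray(R_working)

rng = np.random.default_rng(0)
n, K = 10, 6                               # cluster size n exceeds K clusters
X = rng.standard_normal((K, n))            # K observations of n variables
R_sample = np.corrcoef(X, rowvar=False)    # n x n, rank at most K - 1 < n
R_working = np.eye(n)                      # e.g., working independence
R_reg = shrunk_correlation(R_sample, R_working, gamma=0.3)

print(np.linalg.matrix_rank(R_sample) < n)    # True: singular, not invertible
print(np.all(np.linalg.eigvalsh(R_reg) > 0))  # True: shrinkage restores PD
```

Because the sample correlation is positive semidefinite, mixing in any positive-definite working matrix with gamma > 0 bounds the smallest eigenvalue away from zero, which is what stabilizes the Wald statistics when n > K.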

13.
Dynamic treatment regimes (DTRs) aim to formalize personalized medicine by tailoring treatment decisions to individual patient characteristics. G-estimation for DTR identification targets the parameters of a structural nested mean model, known as the blip function, from which the optimal DTR is derived. Despite its potential, G-estimation has not seen widespread use in the literature, owing in part to its often complex presentation and implementation, but also due to the necessity for correct specification of the blip. Using a quadratic approximation approach inspired by iteratively reweighted least squares, we derive a quasi-likelihood function for G-estimation within the DTR framework, and show how it can be used to form an information criterion for blip model selection. We outline the theoretical properties of this model selection criterion and demonstrate its application in a variety of simulation studies as well as in data from the Sequenced Treatment Alternatives to Relieve Depression study.

14.
Cluster randomized studies are common in community trials. The standard method for estimating sample size in cluster randomized studies assumes a common cluster size. In practice, however, cluster sizes often vary. In this paper, we derive sample size estimates for continuous outcomes in cluster randomized studies while accounting for the variability in cluster size. It is shown that the proposed formula for the total sample size can be obtained by adding a correction term to the traditional formula, which uses the average cluster size. Application of these results to the design of a health promotion educational intervention study is discussed.
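To make the idea of a variability correction concrete, here is a sketch using a widely cited approximation (not necessarily this paper's exact correction term): the design effect for varying cluster sizes, DE = 1 + (((1 + CV²)·m̄) − 1)·ICC, where CV is the coefficient of variation of cluster size. All numerical inputs are invented for illustration.

```python
import math

def design_effect(mean_size, cv, icc):
    """Approximate design effect for a CRT with varying cluster sizes:
    DE = 1 + (((1 + CV^2) * m_bar) - 1) * ICC.  With CV = 0 this reduces to
    the familiar equal-cluster-size design effect 1 + (m_bar - 1) * ICC."""
    return 1.0 + (((1.0 + cv**2) * mean_size) - 1.0) * icc

def clusters_needed(n_individual, mean_size, cv, icc):
    """Clusters per arm: inflate the individually randomized sample size by
    the design effect, then divide by the mean cluster size."""
    de = design_effect(mean_size, cv, icc)
    return math.ceil(n_individual * de / mean_size)

print(clusters_needed(128, mean_size=20, cv=0.0, icc=0.05))  # 13 (equal sizes)
print(clusters_needed(128, mean_size=20, cv=0.6, icc=0.05))  # 15 (varying sizes)
```

The extra clusters in the second call are exactly the cost of ignoring cluster size variability at the design stage.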

15.
Aiming at the evaluation of environmental damage, we propose a dynamic evaluation approach based on an optimal clustering criterion. First, a method for standardizing sample data is introduced. After determining the measurement of damage grades, we apply systematic cluster analysis to classify the grades of the corresponding environmental pollution events, and then evaluate the optimal cluster level using the optimal clustering criterion. Using the marine environmental damage caused by 17 marine oil spill events as a sample, we test the practicability of the proposed dynamic evaluation method. Comparison with several traditional damage evaluation methods shows that their results are consistent, indicating that the dynamic evaluation approach proposed in this paper meets the requirements of environmental damage evaluation. Finally, directions for future research are pointed out.

16.
Wang CY, Biometrics, 2000, 56(1):106–112
Consider the problem of estimating the correlation between two nutrient measurements, such as the percent energy from fat obtained from a food frequency questionnaire (FFQ) and that from repeated food records or 24-hour recalls. Under a classical additive model for repeated food records, it is known that there is an attenuation effect on the correlation estimation if the sample average of repeated food records for each subject is used to estimate the underlying long-term average. This paper considers the case in which the selection probability of a subject for participation in the calibration study, in which repeated food records are measured, depends on the corresponding FFQ value, and the repeated longitudinal measurement errors have an autoregressive structure. This paper investigates a normality-based estimator and compares it with a simple method of moments. Both methods are consistent if the first two moments of nutrient measurements exist. Furthermore, joint estimating equations are applied to estimate the correlation coefficient and related nuisance parameters simultaneously. This approach provides a simple sandwich formula for the covariance estimation of the estimator. Finite sample performance is examined via a simulation study, and the proposed weighted normality-based estimator performs well under various distributional assumptions. The methods are applied to real data from a dietary assessment study.

17.
The log response ratio, lnRR, is the most frequently used effect size statistic for meta-analysis in ecology. However, missing standard deviations (SDs) often prevent estimation of the sampling variance of lnRR. We propose new methods to deal with missing SDs via a weighted average coefficient of variation (CV) estimated from the studies in the dataset that do report SDs. Across a suite of simulated conditions, we find that using the average CV to estimate sampling variances for all observations, regardless of missingness, performs with minimal bias. Surprisingly, even with missing SDs, this simple method outperforms the conventional approach (basing each effect size on its individual study-specific CV) with complete data. This is because the conventional method ultimately yields less precise estimates of the sampling variances than using the pooled CV from multiple studies. Our approach is broadly applicable and can be implemented in all meta-analyses of lnRR, regardless of 'missingness'.
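A small sketch of the underlying arithmetic: by the delta method, the sampling variance of lnRR is approximately CV_t²/n_t + CV_c²/n_c, so a pooled CV can stand in when a study's SDs are missing. The sample-size weighting and the made-up study values below are our illustrative choices; the paper's weighting scheme may differ.

```python
import math

def lnrr(mean_t, mean_c):
    """Log response ratio between treatment and control means."""
    return math.log(mean_t / mean_c)

def var_lnrr(cv_t, n_t, cv_c, n_c):
    """Delta-method sampling variance: var(lnRR) ~= CV_t^2/n_t + CV_c^2/n_c."""
    return cv_t**2 / n_t + cv_c**2 / n_c

# Studies that DO report SDs contribute to a pooled (weighted) average CV.
studies = [  # (sd, mean, n) triples, invented for illustration
    (2.0, 10.0, 12),
    (3.0, 12.0, 20),
    (1.5, 8.0, 8),
]
cvs = [sd / mean for sd, mean, _ in studies]
weights = [n for _, _, n in studies]
avg_cv = sum(w * c for w, c in zip(weights, cvs)) / sum(weights)

# Use the pooled CV for every effect size, including those with missing SDs.
v = var_lnrr(avg_cv, 15, avg_cv, 15)
print(avg_cv)  # 0.2225
```

The same `var_lnrr` call then supplies the inverse-variance weights for the meta-analytic model, with no observation dropped for missing SDs.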

18.
Evidence supporting the current World Health Organization recommendations of early antiretroviral therapy (ART) initiation for adolescents is inconclusive. We leverage a large observational data set to compare, in terms of mortality and CD4 cell count, dynamic treatment initiation rules for human immunodeficiency virus-infected adolescents. Our approaches extend the marginal structural model for estimating outcome distributions under dynamic treatment regimes, developed in Robins et al. (2008), to allow causal comparisons both of specific regimes and of regimes along a continuum. Furthermore, we propose strategies to address three challenges posed by the complex data set: continuous-time measurement of the treatment initiation process; sparse measurement of the longitudinal outcomes of interest, leading to incomplete data; and censoring due to dropout and death. We derive a weighting strategy for continuous-time treatment initiation, use imputation to deal with missingness caused by sparse measurements and dropout, and define a composite outcome that incorporates both death and CD4 count as a basis for comparing treatment regimes. Our analysis suggests that immediate ART initiation leads to lower mortality and higher median values of the composite outcome relative to other initiation rules.

19.
Marginal structural models (MSMs) have been proposed for estimating a treatment's effect in the presence of time-dependent confounding. We aimed to evaluate the performance of the Cox MSM in the presence of missing data and to explore methods to adjust for missingness. We simulated data with a continuous time-dependent confounder and a binary treatment. We explored two classes of missing data: (i) missed visits, which resemble clinical cohort studies; and (ii) missing confounder values, which correspond to interval cohort studies. Missing data were generated under various mechanisms. In the first class, the source of the bias was the extreme treatment weights; truncation or normalization improved estimation. Therefore, particular attention must be paid to the distribution of the weights, and truncation or normalization should be applied if extreme weights are noticed. In the second class, bias was due to misspecification of the treatment model. Last observation carried forward (LOCF), multiple imputation (MI), and inverse probability of missingness weighting (IPMW) were used to correct for the missingness. We found that the alternatives, especially the IPMW method, perform better than the classic LOCF method. Nevertheless, in situations with high marker variance and rarely recorded measurements, none of the examined methods adequately corrected the bias.
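The truncation and normalization of extreme weights mentioned above can be sketched in a few lines. The percentile cutoffs and the simulated heavy-tailed weights below are illustrative assumptions, not values taken from the study.

```python
import numpy as np

def truncate_weights(w, lower=0.01, upper=0.99):
    """Truncate inverse-probability weights at the given percentiles of their
    own distribution, taming extreme values that dominate the estimator."""
    lo, hi = np.quantile(w, [lower, upper])
    return np.clip(w, lo, hi)

def normalize_weights(w):
    """Rescale weights to mean 1, so they sum to the sample size."""
    w = np.asarray(w, dtype=float)
    return w / w.mean()

rng = np.random.default_rng(1)
w = np.exp(rng.normal(0.0, 2.0, size=1000))  # heavy-tailed raw IPT weights
w_adj = normalize_weights(truncate_weights(w))

print(w_adj.max() < w.max())  # True: extreme weights have been tamed
```

Truncation trades a small amount of bias for a large reduction in variance; normalization (stabilization) keeps the effective sample size interpretable. Both should be inspected via the weight distribution before fitting the MSM.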

20.
Pragmatic trials evaluating health care interventions often adopt cluster randomization due to scientific or logistical considerations. Systematic reviews have shown that coprimary endpoints are not uncommon in pragmatic trials but are seldom recognized in sample size or power calculations. While methods for power analysis based on K (K ≥ 2) binary coprimary endpoints are available for cluster randomized trials (CRTs), to our knowledge, methods for continuous coprimary endpoints are not yet available. Assuming a multivariate linear mixed model (MLMM) that accounts for multiple types of intraclass correlation coefficients among the observations in each cluster, we derive the closed-form joint distribution of the K treatment effect estimators to facilitate sample size and power determination with different types of null hypotheses under equal cluster sizes. We characterize the relationship between the power of each test and the different types of correlation parameters. We further relax the equal cluster size assumption and approximate the joint distribution of the K treatment effect estimators through the mean and coefficient of variation of the cluster sizes. Our simulation studies with a finite number of clusters indicate that the power predicted by our method agrees well with the empirical power when the parameters in the MLMM are estimated via the expectation-maximization algorithm. An application to a real CRT is presented to illustrate the proposed method.


Copyright © Beijing Qinyun Technology Development Co., Ltd. 京ICP备09084417号