Similar articles
Found 20 similar articles
1.
Designs for synthetic case-control studies in open cohorts (total citations: 3; self-citations: 0; citations by others: 3)
Several designs are proposed for case-control studies within cohorts when the cohort is open to late entry. These and previously proposed designs are examined with respect to consistency and efficiency of relative risk parameter estimation, and a small simulation study is reported. If study costs increase in proportion to the total number of "at-risk" controls, the most efficient design, Design C, is as follows. For a case failing at time t, controls are selected at random (and without regard to "at-risk" status) from among cohort members who (i) are known not to have failed prior to t and (ii) have not been previously selected as controls. At each t, control sampling proceeds until a prespecified number of controls who are "at risk" at t have been obtained. The efficiency advantage of Design C over that of the standard case-control design proposed by Thomas (in Appendix to Liddell, McDonald, and Thomas, 1977, Journal of the Royal Statistical Society, Series B 140, 469-490) will often be small. If, on the other hand, the costs increase in proportion to the number of distinct "at-risk" controls, Design C is no longer the most efficient design. In this case, several alternative designs are proposed.
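
The Design C sampling rule described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the `Member` record, the `at_risk` helper and the `design_c_sample` function are hypothetical names, cases are processed in order of failure time, the case itself is never drawn, only observed failures before t are treated as "known to have failed", and every drawn member (at risk or not) is treated as "previously selected" from then on.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    ident: int
    entry: float   # time the member enters follow-up (late entry allowed)
    stop: float    # time of failure or censoring
    failed: bool   # True if `stop` is an observed failure time


def at_risk(member, t):
    """Under observation at time t: entered before t and not yet exited."""
    return member.entry < t <= member.stop


def design_c_sample(cohort, m, seed=0):
    """Return {case id: [ids drawn as controls]} for every observed failure."""
    rng = random.Random(seed)
    already_selected = set()
    sampled = {}
    cases = sorted((p for p in cohort if p.failed), key=lambda p: p.stop)
    for case in cases:
        t = case.stop
        # Eligible pool: (i) not known to have failed before t,
        # (ii) not previously selected as a control.
        pool = [
            p for p in cohort
            if p.ident != case.ident
            and p.ident not in already_selected
            and not (p.failed and p.stop < t)
        ]
        rng.shuffle(pool)  # random order, blind to at-risk status
        drawn, n_at_risk = [], 0
        for p in pool:
            if n_at_risk == m:  # stop once m at-risk controls are in hand
                break
            drawn.append(p.ident)
            already_selected.add(p.ident)
            n_at_risk += at_risk(p, t)
        sampled[case.ident] = drawn
    return sampled
```

If the eligible pool runs out before m at-risk controls are found, this sketch simply returns the controls drawn so far; how the original design handles that situation is not stated in the abstract.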

2.
Lu SE, Wang MC. Biometrics 2002, 58(4): 764-772
Cohort case-control design is an efficient and economical design to study risk factors for disease incidence or mortality in a large cohort. In the last few decades, a variety of cohort case-control designs have been developed and theoretically justified. These designs have been exclusively applied to the analysis of univariate failure-time data. In this work, a cohort case-control design adapted to multivariate failure-time data is developed. A risk set sampling method is proposed to sample controls from nonfailures in a large cohort for each case matched by failure time. This method leads to a pseudolikelihood approach for the estimation of regression parameters in the marginal proportional hazards model (Cox, 1972, Journal of the Royal Statistical Society, Series B 34, 187-220), where the correlation structure between individuals within a cluster is left unspecified. The performance of the proposed estimator is demonstrated by simulation studies. A bootstrap method is proposed for inferential purposes. This methodology is illustrated by a data example from a child vitamin A supplementation trial in Nepal (Nepal Nutrition Intervention Project-Sarlahi, or NNIPS).

3.
The Cochran-Armitage trend test (CATT) is well suited for testing association between a marker and a disease in case-control studies. When the underlying genetic model for the disease is known, the CATT optimal for the genetic model is used. For complex diseases, however, the genetic models of the true disease loci are unknown. In this situation, robust tests are preferable. We propose a two-phase analysis with model selection for the case-control design. In the first phase, we use the difference of Hardy-Weinberg disequilibrium coefficients between the cases and the controls for model selection. Then, an optimal CATT corresponding to the selected model is used for testing association. The correlation of the statistics used for selection and the test for association is derived to adjust the two-phase analysis with control of the Type-I error rate. The simulation studies show that this new approach has greater efficiency robustness than the existing methods.
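
As a rough illustration of the second-phase statistic mentioned in this abstract, the standard Cochran-Armitage trend statistic for a 2 x 3 genotype table can be computed as below. The function name `catt`, the `SCORES` table and the genotype counts are assumptions made for this sketch; the first-phase model selection based on Hardy-Weinberg disequilibrium coefficients and the correlation adjustment described in the abstract are not reproduced here.

```python
import math

# Conventional genotype scores (genotypes ordered aa, Aa, AA; A = risk allele).
SCORES = {
    "recessive": (0.0, 0.0, 1.0),
    "additive": (0.0, 0.5, 1.0),
    "dominant": (0.0, 1.0, 1.0),
}


def catt(cases, controls, scores):
    """Cochran-Armitage trend test: returns (z, two-sided p-value)."""
    R, S = sum(cases), sum(controls)
    N = R + S
    n = [r + s for r, s in zip(cases, controls)]  # genotype column totals
    # T = sum_i x_i (S r_i - R s_i)
    t = sum(x * (S * r - R * s) for x, r, s in zip(scores, cases, controls))
    # Var(T) = (R S / N) [ N sum_i x_i^2 n_i - (sum_i x_i n_i)^2 ]
    sum_x2n = sum(x * x * ni for x, ni in zip(scores, n))
    sum_xn = sum(x * ni for x, ni in zip(scores, n))
    var = (R * S / N) * (N * sum_x2n - sum_xn ** 2)
    z = t / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return z, p


# Made-up genotype counts, for illustration only.
z, p = catt(cases=(10, 20, 30), controls=(30, 20, 10), scores=SCORES["additive"])
print(f"z = {z:.3f}, p = {p:.2e}")
```

Swapping in `SCORES["recessive"]` or `SCORES["dominant"]` gives the trend test tuned to those genetic models, which is the choice the two-phase procedure is meant to automate.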

4.

Background

Control selection is a major challenge in epidemiologic case-control studies. The aim of our study was to evaluate using hospital versus neighborhood control groups in studying risk factors of esophageal squamous cell carcinoma (ESCC).

Methodology/Principal Findings

We compared the results of two different case-control studies of ESCC conducted in the same region by a single research group. Case definition and enrollment were the same in the two studies, but control selection differed. In the first study, we selected two age- and sex-matched controls from inpatient subjects in hospitals, while for the second we selected two age- and sex-matched controls from each subject's neighborhood of residence. We used the test of heterogeneity to compare the results of the two studies. We found no significant differences in exposure data for tobacco-related variables such as cigarette smoking, chewing Nass (a tobacco product) and hookah (water pipe) usage, but the frequency of opium usage was significantly different between hospital and neighborhood controls. Consequently, the inference drawn for the association between ESCC and tobacco use did not differ between the studies, but it did for opium use. In the study using neighborhood controls, opium use was associated with a significantly increased risk of ESCC (adjusted OR 1.77, 95% CI 1.17–2.68), while in the study using hospital controls, this was not the case (OR 1.09, 95% CI 0.63–1.87). Comparing the prevalence of opium consumption in the two control groups and a cohort enrolled from the same geographic area suggested that the neighborhood controls were more representative of the study base population for this exposure.

Conclusions/Significance

Hospital and neighborhood controls did not lead us to the same conclusion for a major hypothesized risk factor for ESCC in this population. Our results show that control group selection is critical in drawing appropriate conclusions in observational studies.

5.
Case-cohort designs and analysis for clustered failure time data (total citations: 1; self-citations: 0; citations by others: 1)
Lu SE, Shih JH. Biometrics 2006, 62(4): 1138-1148
Case-cohort design is an efficient and economical design to study risk factors for infrequent disease in a large cohort. It involves the collection of covariate data from all failures ascertained throughout the entire cohort, and from the members of a random subcohort selected at the onset of follow-up. In the literature, the case-cohort design has been extensively studied, but was exclusively considered for univariate failure time data. In this article, we propose case-cohort designs adapted to multivariate failure time data. An estimation procedure with the independence working model approach is used to estimate the regression parameters in the marginal proportional hazards model, where the correlation structure between individuals within a cluster is left unspecified. Statistical properties of the proposed estimators are developed. The performance of the proposed estimators and comparisons of statistical efficiencies are investigated with simulation studies. A data example from the Translating Research into Action for Diabetes (TRIAD) study is used to illustrate the proposed methodology.
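
The basic case-cohort selection step described at the start of this abstract (covariates collected for a random subcohort plus all failures) can be sketched as below. The function `case_cohort_sample`, its parameters and the toy cohort are hypothetical; the clustered-data designs and the estimation procedure proposed in the article are not shown.

```python
import random


def case_cohort_sample(ids, failed, subcohort_fraction, seed=0):
    """Pick who gets covariates measured under a simple case-cohort design.

    ids: list of cohort member ids
    failed: dict mapping id -> True if the member fails during follow-up
    Returns (subcohort, measured), where measured = subcohort plus all cases.
    """
    rng = random.Random(seed)
    k = round(subcohort_fraction * len(ids))
    # The subcohort is drawn at the onset of follow-up, blind to outcomes.
    subcohort = set(rng.sample(ids, k))
    # Every failure is added, whether or not it fell in the subcohort.
    cases = {i for i in ids if failed[i]}
    return subcohort, subcohort | cases


# Toy cohort: 1000 members, 2% of whom fail (made-up numbers).
ids = list(range(1000))
failed = {i: i % 50 == 0 for i in ids}
subcohort, measured = case_cohort_sample(ids, failed, subcohort_fraction=0.1)
print(len(subcohort), len(measured))
```

With a rare outcome, the measured set is only slightly larger than the subcohort, which is where the design's economy comes from.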

6.
Robins JM, Gail MH, Lubin JH. Biometrics 1986, 42(2): 293-299
The authors consider several aspects of the design and analysis of synthetic case-control studies of cohort data under a proportional hazards model. First, in highly stratified data, consistent estimates of the relative risk are shown to result only if controls are sampled randomly with replacement from the entire risk set or without replacement from the noncases. Second, if previous controls are excluded from consideration as future controls but are included as cases if they fail, then inconsistent estimates of the relative risk can occur if "time" in the proportional hazards model represents an individual's chronological age and age at entry into follow-up is variable. On the other hand, if "time" represents time since the beginning of follow-up, estimates of the relative risk will be consistent, but the usual variance estimator will be inconsistent.

7.
In biomedical cohort studies for assessing the association between an outcome variable and a set of covariates, some covariates can usually be measured only on a subgroup of study subjects. An important design question is which subjects to select into the subgroup to increase statistical efficiency. When the outcome is binary, one may adopt a case-control sampling design or a balanced case-control design where cases and controls are further matched on a small number of complete discrete covariates. While the latter achieves success in estimating odds ratio (OR) parameters for the matching covariates, similar two-phase design options have not been explored for the remaining covariates, especially the incompletely collected ones. This is of great importance in studies where the covariates of interest cannot be completely collected. To this end, assuming that an external model is available to relate the outcome and complete covariates, we propose a novel sampling scheme that oversamples cases and controls with worse goodness-of-fit based on the external model and further matches them on complete covariates similarly to the balanced design. We develop a pseudolikelihood method for estimating OR parameters. Through simulation studies and explorations in a real-cohort study, we find that our design generally leads to reduced asymptotic variances of the OR estimates and the reduction for the matching covariates is comparable to that of the balanced design.

8.
Four large (n > 1000) populations of Drosophila melanogaster, derived from control populations maintained on a 3 week discrete generation cycle, were subjected to selection for fast development and early reproduction. Egg to eclosion survivorship and development time and dry weight at eclosion were monitored every 10 generations. Over 70 generations of selection, development time in the selected populations decreased by approximately 36 h relative to controls, a 20% decline. The difference in male and female development time was also reduced in the selected populations. Flies from the selected populations were increasingly lighter at eclosion than controls, with the reduction in dry weight at eclosion over 70 generations of selection being approximately 45% in males and 39% in females. Larval growth rate (dry weight at eclosion/development time) was also reduced in the selected lines over 70 generations, relative to controls, by approximately 32% in males and 24% in females. However, part of this relative reduction was due to an increase in growth rate of the control populations, presumably an expression of adaptation to conditions in our laboratory. After 50 generations of selection had elapsed, a considerable and increasing pre-adult viability cost to faster development became apparent, with viability in the selected populations being about 22% less than that of controls at generation 70 of selection.

9.
Case-control studies offer a rapid and efficient way to evaluate hypotheses. On the other hand, proper selection of the controls is challenging, and the potential for selection bias is a major weakness. Valid inferences about parameters of interest cannot be drawn if selection bias exists. Furthermore, the selection bias is difficult to evaluate. Even in situations where selection bias can be estimated, few methods are available. In the matched case-control Northern Manhattan Stroke Study (NOMASS), stroke-free controls are sampled in two stages. First, a telephone survey ascertains demographic and exposure status from a large random sample. Then, in an in-person interview, detailed information is collected for the selected controls to be used in a matched case-control study. The telephone survey data provide information about the selection probability and the potential selection bias. In this article, we propose bias-corrected estimators in a case-control study using a joint estimating equation approach. The proposed bias-corrected estimate and its standard error can be easily obtained by standard statistical software.

10.
Case-cohort analysis with accelerated failure time model (total citations: 1; self-citations: 0; citations by others: 1)
Kong L, Cai J. Biometrics 2009, 65(1): 135-142
Summary. In a case–cohort design, covariates are assembled only for a subcohort that is randomly selected from the entire cohort and any additional cases outside the subcohort. This design is appealing for large cohort studies of rare disease, especially when the exposures of interest are expensive to ascertain for all the subjects. We propose statistical methods for analyzing case–cohort data with a semiparametric accelerated failure time model that interprets the covariate effects as accelerating or decelerating the time to failure. Asymptotic properties of the proposed estimators are developed. The finite sample properties of the case–cohort estimator and its efficiency relative to the full cohort estimator are assessed via simulation studies. A real example from a study of cardiovascular disease is provided to illustrate the estimating procedure.

11.
In a typical case-control study, exposure information is collected at a single time point for the cases and controls. However, case-control studies are often embedded in existing cohort studies containing a wealth of longitudinal exposure history about the participants. Recent medical studies have indicated that incorporating past exposure history, or a constructed summary measure of cumulative exposure derived from the past exposure history, when available, may lead to more precise and clinically meaningful estimates of the disease risk. In this article, we propose a flexible Bayesian semiparametric approach to model the longitudinal exposure profiles of the cases and controls and then use measures of cumulative exposure based on a weighted integral of this trajectory in the final disease risk model. The estimation is done via a joint likelihood. In the construction of the cumulative exposure summary, we introduce an influence function, a smooth function of time to characterize the association pattern of the exposure profile on the disease status with different time windows potentially having differential influence/weights. This enables us to analyze how the present disease status of a subject is influenced by his/her past exposure history conditional on the current ones. The joint likelihood formulation allows us to properly account for uncertainties associated with both stages of the estimation process in an integrated manner. Analysis is carried out in a hierarchical Bayesian framework using reversible jump Markov chain Monte Carlo algorithms. The proposed methodology is motivated by, and applied to, a case-control study of prostate cancer where longitudinal biomarker information is available for the cases and controls.

12.
Thomas DC, Blettner M, Day NE. Biometrics 1992, 48(3): 781-794
A method is proposed for analysis of nested case-control studies that combines the matched comparison of covariate values between cases and controls and a comparison of the observed numbers of cases in the nesting cohort with expected numbers based on external rates and average relative risks estimated from the controls. The former comparison is based on the conditional likelihood for matched case-control studies and the latter on the unconditional likelihood for Poisson regression. It is shown that the two likelihoods are orthogonal and that their product is an estimator of the full survival likelihood that would have been obtained on the total cohort, had complete covariate data been available. Parameter estimation and significance tests follow in the usual way by maximizing this product likelihood. The method is illustrated using data on leukemia following irradiation for cervical cancer. In this study, the original cohort study showed a clear excess of leukemia in the first 15 years after exposure, but it was not feasible to obtain dose estimates on the entire cohort. However, the subsequent nested case-control study failed to demonstrate significant differences between alternative dose-response relations and effects of time-related modifiers. The combined analysis allows much clearer discrimination between alternative dose-time-response models.

13.
This protocol details the steps for data quality assessment and control that are typically carried out during case-control association studies. The steps described involve the identification and removal of DNA samples and markers that introduce bias. These critical steps are paramount to the success of a case-control study and are necessary before statistically testing for association. We describe how to use PLINK, a tool for handling SNP data, to perform assessments of failure rate per individual and per SNP and to assess the degree of relatedness between individuals. We also detail other quality-control procedures, including the use of SMARTPCA software for the identification of ancestral outliers. These platforms were selected because they are user-friendly, widely used and computationally efficient. Steps needed to detect and establish a disease association using case-control data are not discussed here. Issues concerning study design and marker selection in case-control studies have been discussed in our earlier protocols. This protocol, which is routinely used in our labs, should take approximately 8 h to complete.

14.
Advances in human genetics have led to epidemiological investigations not only of the effects of genes alone but also of gene-environment (G-E) interaction. A widely accepted design strategy in the study of how G-E interaction relates to disease risk is the population-based case-control study (PBCCS). For simple random samples, semiparametric methods for testing G-E interaction were developed by Chatterjee and Carroll in 2005. The use of complex sampling in PBCCS that involves differential probabilities of sample selection of cases and controls, and possibly cluster sampling, is becoming more common. Two complexities, weighting for selection probabilities and intracluster correlation of observations, are induced by the complex sampling. We develop pseudo-semiparametric maximum likelihood estimators (pseudo-SPMLE) that apply to PBCCS with complex sampling. We study the finite sample performance of the pseudo-SPMLE using simulations and illustrate the pseudo-SPMLE with a US case-control study of kidney cancer.

15.

Background

Colorectal cancer (CRC) is considered a complex disease, and thus the majority of the genetic susceptibility is thought to lie in the form of low-penetrance variants following a polygenic model of inheritance. Candidate-gene studies have so far been one of the basic approaches taken to identify these susceptibility variants. The consistent involvement of some signaling routes in carcinogenesis provided support for pathway-based studies as a natural strategy to select genes that could potentially harbour new susceptibility loci.

Methodology/Principal Findings

We selected two main carcinogenesis-related pathways, Wnt and BMP, in order to screen the implicated genes for new risk variants. We then conducted a case-control association study in 933 CRC cases and 969 controls based on coding and regulatory SNPs. We also included rs4444235 and rs9929218, which did not fulfill our selection criteria but belonged to two genes in the BMP pathway and had consistently been linked to CRC in previous studies. Neither allelic, genotypic, nor haplotypic analyses showed any signs of association between the 37 screened variants and CRC risk. Adjustments for sex and age, and stratified analysis between sporadic and control groups did not yield any positive results either.

Conclusions/Significance

Despite the relevance of both pathways in the pathogenesis of the disease, and the fact that this is indeed the first study that considers these pathways as a candidate-gene selection approach, our study does not present any evidence of the presence of low-penetrance variants for the selected markers in any of the considered genes in our cohort.

16.
17.
Chan KC, Wang MC. Biometrics 2012, 68(2): 521-531
A prevalent sample consists of individuals who have experienced disease incidence but not the failure event at the sampling time. We discuss methods for estimating the distribution function of a random vector defined at baseline for an incident disease population when data are collected by prevalent sampling. Prevalent sampling design is often more focused and economical than incident study design for studying the survival distribution of a diseased population, but prevalent samples are biased by design. Subjects with longer survival time are more likely to be included in a prevalent cohort, and other baseline variables of interest that are correlated with survival time are also subject to sampling bias induced by the prevalent sampling scheme. Without recognition of the bias, applying the empirical distribution function to estimate the population distribution of baseline variables can lead to serious bias. In this article, nonparametric and semiparametric methods are developed for distribution estimation of baseline variables using prevalent data.

18.

Introduction

Russia has experienced massive fluctuations in mortality at working ages over the past three decades. Routine data analyses suggest that these are largely driven by fluctuations in heavy alcohol drinking. However, individual-level evidence supporting alcohol having a major role in Russian mortality comes from only two case-control studies, which could be subject to serious biases due to their design.

Methods and Findings

A prospective study of mortality (2003–9) of 2000 men aged 25–54 years at recruitment was conducted in the city of Izhevsk, Russia. This cohort was free from key limitations inherent in the design of the two earlier case-control studies. Cox proportional hazards regression was used to estimate hazard ratios of all-cause mortality by alcohol drinking type as reported by a proxy informant. Hazardous drinkers were defined as those who either drank non-beverage alcohols or were reported to regularly have hangovers or other behaviours related to heavy drinking episodes. Over the follow-up period 113 men died. Compared to non-hazardous drinkers and abstainers, men who drank hazardously had appreciably higher mortality (HR = 3.4, 95% CI 2.2, 5.1) adjusted for age, smoking and education. The population attributable risk percent (PAR%) for hazardous drinking was 26% (95% CI 14, 37). However, larger effects were seen in the first two years of follow-up, with a HR of 4.6 (2.5, 8.2) and a corresponding PAR% of 37% (17, 51).

Interpretation

This prospective cohort study strengthens the evidence that hazardous alcohol consumption has been a major determinant of mortality among working age men in a typical Russian city. As such the similar findings of the previous case-control studies cannot be explained as artefacts of limitations of their design. As Russia struggles to raise life expectancy, which even in 2009 was only 62 years among men, control of hazardous drinking must remain a top public health priority.

19.
Wacholder S, Gail M, Pee D. Biometrics 1991, 47(1): 63-76
We develop approximate methods to compare the efficiencies and to compute the power of alternative potential designs for sampling from a cohort before beginning to collect exposure data. Our methods require only that the cohort be assembled, meaning that the numbers of individuals N_kj at risk at pairs of event times t_k and t_j (with t_j greater than or equal to t_k) are available. To compute N_kj, one needs to know the entry, follow-up, censoring, and event history, but not the exposure, for each individual. Our methods apply to any "unbiased control sampling design," in which cases are compared to a random sample of noncases at risk at the time of an event. We apply our methods to approximate the efficiencies of the nested case-control design, the case-cohort design, and an augmented case-cohort design, compared to the full cohort design, in an assembled cohort of 17,633 members of an insurance cooperative who were followed for mortality from prostatic cancer. The assumptions underlying the approximation are that exposure is unrelated both to the hazard of an event and to the hazard for censoring. The approximations performed well in simulations when both assumptions held and when the exposure was moderately related to censoring.

20.

Background

Determining genetic risk is a fundamental prerequisite for the implementation of primary prevention trials for type 1 diabetes (T1D). The aim of this study was to assess the risk conferred by HLA-DRB1, INS-VNTR and PTPN22 single genes on the onset of T1D and the joint risk conferred by all these three susceptibility loci using the Bayesian Network (BN) approach in both population-based case-control and family clustering data sets.

Methodology/Principal Findings

A French case-control cohort, consisting of 868 T1D patients and 73 French control subjects, and a French family data set, consisting of 1694 T1D patients and 2340 controls, were analysed. We studied both samples separately, applying the BN probabilistic approach, which is a graphical model that encodes probabilistic relationships among variables of interest. As expected, HLA-DRB1 is the most relevant susceptibility gene. We proved that INS and PTPN22 genes marginally influence T1D risk in all risk HLA-DRB1 genotype categories. The absolute risk conferred by simultaneously carrying high, moderate or low risk HLA-DRB1 genotypes together with at-risk INS and PTPN22 genotypes was 11.5%, 1.7% and 0.1% in the case-control sample and 19.8%, 6.6% and 2.2% in the family cohort, respectively.

Conclusions/Significance

This work represents, to the best of our knowledge, the first study based on both case-control and family data sets, showing the joint effect of HLA, INS and PTPN22 in a T1D Caucasian population with a wide range of age at T1D onset, adding new insights to previous findings regarding data sets consisting of patients and controls <15 years at onset.
