Found 20 similar articles (search time: 0 ms)
1.
Methods for the analysis of unmatched case-control data based on a finite population sampling model are developed. Under this model, and the prospective logistic model for disease probabilities, a likelihood for case-control data that accommodates very general sampling of controls is derived. This likelihood has the form of a weighted conditional logistic likelihood. The flexibility of the methods is illustrated by providing a number of control sampling designs and a general scheme for their analyses. These include frequency matching, counter-matching, case-base, randomized recruitment, and quota sampling. A study of risk factors for childhood asthma illustrates an application of the counter-matching design. Some asymptotic efficiency results are presented and computational methods discussed. Further, it is shown that a 'marginal' likelihood provides a link to unconditional logistic methods. The methods are examined in a simulation study that compares frequency and counter-matching using conditional and unconditional logistic analyses and indicates that the conditional logistic likelihood has superior efficiency. Extensions that accommodate sampling of cases and multistage designs are presented. Finally, we compare the analysis methods presented here to other approaches, compare counter-matching and two-stage designs, and suggest areas for further research.
2.
3.
We consider population-based case-control designs in which controls are selected by one of three cluster sampling plans from the entire population at risk. The effects of cluster sampling on classical epidemiologic procedures are investigated, and appropriately modified procedures are developed. In particular, modified procedures for testing the homogeneity of odds ratios across strata, and for estimating and testing a common odds ratio are presented. Simulations that use the data from the 1970 Health Interview Survey as a population suggest that classical procedures may be fairly robust in the presence of cluster sampling. A more extreme example based on a mixed multinomial model clearly demonstrates that the classical Mantel-Haenszel (1959, Journal of the National Cancer Institute 22, 719-748) and Woolf-Haldane tests of no exposure effect may have sizes exceeding nominal levels and confidence intervals with less than nominal coverage under an alternative hypothesis. Classical estimates of odds ratios may also be biased with non-self-weighting cluster samples. The modified procedures we propose remedy these defects.
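For orientation, the classical Mantel-Haenszel estimate of a common odds ratio that the abstract above takes as its starting point can be sketched as follows. This is only the textbook estimator under simple random sampling within strata, not the cluster-adjusted procedure the authors propose; the function name and the (a, b, c, d) cell layout are assumptions made for this illustration.

```python
from typing import Iterable, Tuple

# Assumed cell layout for one stratum's 2x2 table (hypothetical convention):
#   a = exposed cases,    b = unexposed cases,
#   c = exposed controls, d = unexposed controls
Stratum = Tuple[float, float, float, float]


def mantel_haenszel_or(strata: Iterable[Stratum]) -> float:
    """Classical Mantel-Haenszel common odds ratio across strata.

    Computes sum(a*d/n) / sum(b*c/n), where n is the stratum total.
    No adjustment for cluster sampling is made.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum contributes nothing
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("odds ratio undefined: sum of b*c/n is zero")
    return numerator / denominator


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    example = [(10, 20, 5, 40), (8, 12, 6, 30)]
    print(mantel_haenszel_or(example))
```

With a single stratum the estimate reduces to the ordinary cross-product ratio a*d/(b*c).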
4.
5.
On the design of synthetic case-control studies (total citations: 6; self-citations: 0; citations by others: 6)
R L Prentice 《Biometrics》1986,42(2):301-310
A design is proposed for "case-control within cohort" studies. In this design, controls are sampled without replacement from failure-free members of the cohort at each distinct failure time. Upon selection, a subject ceases to be eligible for control selection at later failure times. Also, if a subject failing at time t had been selected as a control at t' less than t, then the matched controls at t are selected to have also been at risk at t'. In these circumstances correlation exists between score statistic contributions at t and t'. An estimator is developed for this correlation. A small simulation study compares the design just described to other possible synthetic case-control designs.
6.
McNamee R 《Biostatistics (Oxford, England)》2005,6(4):590-603
This paper addresses optimal design and efficiency of two-phase (2P) case-control studies in which the first phase uses an error-prone exposure measure, Z, while the second phase measures true, dichotomous exposure, X, in a subset of subjects. Optimal design of a separate second phase, to be added to a preexisting study, is also investigated. Differential misclassification is assumed throughout. Results are also applicable to 2P cohort studies with error-prone and error-free measures of disease status but error-free exposure measures. While software based on the mean score method of Reilly and Pepe (1995, Biometrika 82, 299-314) can find optimal designs given pilot data, the lack of simple formulae makes it difficult to generalize about efficiency compared to one-phase (1P) studies based on X alone. Here, formulae for the optimal ratios of cases to controls and first- to second-phase sizes, and the optimal second-phase stratified sampling fractions, given a fixed budget, are given. The maximum efficiency of 2P designs compared to a 1P design is deduced and is shown to be bounded from above by a function of the sensitivities and specificities of Z. The efficiency of 'balanced' separate second-phase designs (Breslow and Cain, 1988, Biometrika 75, 11-20), in which equal numbers of subjects are chosen from each first-phase stratum, compared to optimal design is deduced, enabling situations where balanced designs are nearly optimal to be identified.
7.
Logistic regression methods for retrospective case-control studies using complex sampling procedures
There are a number of possible designs for case-control studies. The simplest uses two separate simple random samples, but an actual study may use more complex sampling procedures. Typically, stratification is used to control for the effects of one or more risk factors in which we are interested. It has been shown (Anderson, 1972, Biometrika 59, 19-35; Prentice and Pyke, 1979, Biometrika 66, 403-411) that the unconditional logistic regression estimators apply under stratified sampling, so long as the logistic model includes a term for each stratum. We consider the case-control problem with stratified samples and assume a logistic model that does not include terms for strata, i.e., for fixed covariates the (prospective) probability of disease does not depend on stratum. We assume knowledge of the proportion sampled in each stratum as well as the total number in the stratum. We use this knowledge to obtain the maximum likelihood estimators for all parameters in the logistic model including those for variables completely associated with strata. The approach may also be applied to obtain estimators under probability sampling.
8.
Sensitivity analysis for matched case-control studies (total citations: 1; self-citations: 0; citations by others: 1)
P R Rosenbaum 《Biometrics》1991,47(1):87-100
A sensitivity analysis in an observational study indicates the degree to which conclusions would be altered by hidden biases of various magnitudes. A method of sensitivity analysis previously proposed for cohort studies is extended for use in matched case-control studies with multiple controls, where slightly different derivations and calculations are required. Also discussed is a sensitivity analysis for case-control studies that have two distinct types of controls, say hospital and neighborhood controls, where the two types may be affected by different biases. For illustration, the method is applied to five case-control studies, including a study of herniated lumbar disc in which there are three types of cases, and a study of breast cancer with two types of controls.
9.
We consider matched case-control familial studies which match a group of patients, called "case probands," with a group of disease-free subjects, called "control probands," using a set of family-level matching variables. Family members of each proband are then recruited into the study. Of interest here is the familial aggregation of the response variable and the effects of subject-specific covariates on the response. We propose an estimating equation approach to jointly estimate the main effects and intrafamilial correlations for matched family studies with a continuous outcome. Only knowledge of the first two joint moments of the response variable is required. The induced estimators for the main effects and intrafamilial correlations are consistent and asymptotically normally distributed. We apply the proposed method to sleep apnea data. A simulation study demonstrates the usefulness of our approach.
10.
Chorionic villus sampling (CVS) is a valued method of prenatal diagnosis that is often preferred over amniocentesis because it can be performed earlier, but which has also raised concern over a possible association with increased risk of terminal transverse limb deficiency (TTLD). We present and apply a meta-analytic method for estimating a combined dose-response effect from a series of case-control and cohort studies in which the exposure variable is interval-censored. Assuming coarsening at random for the interval-censoring, and calling upon the familiar result of Cornfield to pool case-control and cohort information on the association between a rare binary outcome and a multilevel exposure variable, we form a likelihood-based model to assess the effect of gestational age at the time of CVS on the presence or absence of a rare birth defect. Effect estimates are computed with a variant of the EM algorithm termed the method of weights, which enables the use of standard weighted regression software. Our findings suggest that CVS exposure at early gestational age leads to an increased risk of TTLD.
11.
Clarke GM Anderson CA Pettersson FH Cardon LR Morris AP Zondervan KT 《Nature protocols》2011,6(2):121-133
This protocol describes how to perform basic statistical analysis in a population-based genetic association case-control study. The steps described involve the (i) appropriate selection of measures of association and relevance of disease models; (ii) appropriate selection of tests of association; (iii) visualization and interpretation of results; (iv) consideration of appropriate methods to control for multiple testing; and (v) replication strategies. Assuming no previous experience with software such as PLINK, R or Haploview, we describe how to use these popular tools for handling single-nucleotide polymorphism data in order to carry out tests of association and visualize and interpret results. This protocol assumes that data quality assessment and control have been performed, as described in a previous protocol, so that samples and markers deemed to have the potential to introduce bias to the study have been identified and removed. Study design, marker selection and quality control of case-control studies have also been discussed in earlier protocols. The protocol should take ~1 h to complete.
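As a rough illustration of the kind of single-marker test of association that such a protocol covers, the sketch below computes an allelic odds ratio and a Pearson chi-square test (1 degree of freedom, no continuity correction) from a 2x2 table of allele counts. It is plain Python and does not reproduce PLINK, R or Haploview; the function name, argument names and example counts are all hypothetical.

```python
import math
from typing import Tuple


def allelic_association(
    case_minor: int, case_major: int, control_minor: int, control_major: int
) -> Tuple[float, float, float]:
    """Return (odds_ratio, chi_square, p_value) for a 2x2 allele-count table.

    The chi-square statistic is the uncorrected Pearson statistic with 1 df:
        N * (a*d - b*c)^2 / (row1 * row2 * col1 * col2)
    The p-value uses the identity P(chi2_1 > x) = erfc(sqrt(x / 2)).
    """
    a, b, c, d = case_minor, case_major, control_minor, control_major
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("a row or column total is zero; test undefined")
    chi_square = n * (a * d - b * c) ** 2 / margins
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    # The odds ratio is infinite when b*c is zero but a*d is not.
    odds_ratio = (a * d) / (b * c) if b * c != 0 else math.inf
    return odds_ratio, chi_square, p_value


if __name__ == "__main__":
    # Made-up allele counts, purely for illustration.
    print(allelic_association(60, 40, 40, 60))
```

When many markers are tested, the resulting p-values still need a multiple-testing adjustment, which is step (iv) in the abstract and is not handled here.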
12.
Using unphased genotype data, we studied statistical inference for association between a disease and a haplotype in matched case-control studies. Statistical inference for haplotype data is complicated due to ambiguity of genotype phases. An estimating equation-based method is developed for estimating odds ratios and testing disease-haplotype association. The method potentially can also be applied to testing haplotype-environment interaction. Simulation studies show that the proposed method has good performance. The performance of the method in the presence of departures from Hardy-Weinberg equilibrium is also studied.
13.
The cohort case-control design is an efficient and economical design to study risk factors for disease incidence or mortality in a large cohort. In the last few decades, a variety of cohort case-control designs have been developed and theoretically justified. These designs have been exclusively applied to the analysis of univariate failure-time data. In this work, a cohort case-control design adapted to multivariate failure-time data is developed. A risk set sampling method is proposed to sample controls from nonfailures in a large cohort for each case matched by failure time. This method leads to a pseudolikelihood approach for the estimation of regression parameters in the marginal proportional hazards model (Cox, 1972, Journal of the Royal Statistical Society, Series B 34, 187-220), where the correlation structure between individuals within a cluster is left unspecified. The performance of the proposed estimator is demonstrated by simulation studies. A bootstrap method is proposed for inferential purposes. This methodology is illustrated by a data example from a child vitamin A supplementation trial in Nepal (Nepal Nutrition Intervention Project-Sarlahi, or NNIPS).
14.
Genetic epidemiologic studies often collect genotype data at multiple loci within a genomic region of interest from a sample of unrelated individuals. One popular method for analyzing such data is to assess whether haplotypes, i.e., the arrangements of alleles along individual chromosomes, are associated with the disease phenotype or not. For many study subjects, however, the exact haplotype configuration on the pair of homologous chromosomes cannot be derived with certainty from the available locus-specific genotype data (phase ambiguity). In this article, we consider estimating haplotype-specific association parameters in the Cox proportional hazards model, using genotype, environmental exposure, and the disease endpoint data collected from cohort or nested case-control studies. We study alternative Expectation-Maximization algorithms for estimating haplotype frequencies from cohort and nested case-control studies. Based on a hazard function of the disease derived from the observed genotype data, we then propose a semiparametric method for joint estimation of relative-risk parameters and the cumulative baseline hazard function. The method is greatly simplified under a rare disease assumption, for which an asymptotic variance estimator is also proposed. The performance of the proposed estimators is assessed via simulation studies. An application of the proposed method is presented, using data from the Alpha-Tocopherol, Beta-Carotene Cancer Prevention Study.
15.
《Current biology : CB》2021,31(16):3656-3662.e3
16.
17.
18.
We consider a semiparametric inference procedure for data from epidemiologic studies conducted with a two-component sampling scheme where both a simple random sample and multiple outcome- or outcome-/auxiliary-dependent samples are observed. This sampling scheme allows the investigators to oversample certain subpopulations believed to have more information about the regression model while still gaining insights about the underlying population through the simple random sample. We focus on settings where there is no additional information about the parent cohort and the sampling probability is nonidentifiable. We motivate our problem with an ongoing study to assess the association between the mutation level of epidermal growth factor receptor (EGFR) and the antitumor response to EGFR-targeted therapy among non-small cell lung cancer patients. The proposed method applies to both binary and multicategorical outcome data and allows an arbitrary link function in the framework of generalized linear models. Simulation studies show that the proposed estimator has good small-sample properties. The proposed method is illustrated with a data example.
19.
20.