Similar articles
1.
The ecological study design suffers from a broad range of biases that result from the loss of information regarding the joint distribution of individual-level outcomes, exposures, and confounders. The consequent nonidentifiability of individual-level models cannot be overcome without additional information; we combine ecological data with a sample of individual-level case-control data. The focus of this article is hierarchical models to account for between-group heterogeneity. Estimation and inference pose serious computational challenges. We present a Bayesian implementation based on a data augmentation scheme where the unobserved data are treated as auxiliary variables. The methods are illustrated with a dataset of county-specific infant mortality data from the state of North Carolina.

2.
Greenland S 《Biometrics》2001,57(1):182-188
Standard presentations of epidemiological results focus on incidence-ratio estimates derived from regression models fit to specialized study data. These data are often highly nonrepresentative of populations for which public-health impacts must be evaluated. Basic methods are provided for interval estimation of attributable fractions from model-based incidence-ratio estimates combined with independent survey estimates of the exposure distribution in the target population of interest. These methods are illustrated in estimation of the potential impact of magnetic-field exposures on childhood leukemia in the United States, based on pooled data from 11 case-control studies and a U.S. sample survey of magnetic-field exposures.
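
To make the idea of combining an incidence-ratio estimate with a survey estimate of exposure prevalence concrete, here is a minimal sketch using Levin's classical formula for a single binary exposure, AF = p(RR - 1) / (1 + p(RR - 1)). It is not the interval-estimation method of the article: it gives a point estimate only and assumes no confounding. The function name and the input values are hypothetical.

```python
def attributable_fraction(prevalence: float, rate_ratio: float) -> float:
    """Levin's population attributable fraction for one binary exposure.

    prevalence: proportion of the target population that is exposed
    rate_ratio: incidence ratio comparing exposed with unexposed
    Assumes the rate ratio is unconfounded (an illustrative simplification).
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must lie in [0, 1]")
    if rate_ratio <= 0.0:
        raise ValueError("rate_ratio must be positive")
    excess = prevalence * (rate_ratio - 1.0)
    return excess / (1.0 + excess)


# Hypothetical inputs, not taken from the abstract.
print(round(attributable_fraction(prevalence=0.05, rate_ratio=2.0), 4))
```

With a rare exposure, even a doubled incidence rate accounts for only a small share of cases, which is why the exposure distribution in the target population matters as much as the ratio estimate.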

3.
A mediation model explores the direct and indirect effects between an independent variable and a dependent variable by including other variables (or mediators). Mediation analysis has recently been used to dissect the direct and indirect effects of genetic variants on complex diseases using case-control studies. However, bias could arise in the estimations of the genetic variant-mediator association because the presence or absence of the mediator in the study samples is not sampled following the principles of case-control study design. In this case, the mediation analysis using data from case-control studies might lead to biased estimates of coefficients and indirect effects. In this article, we investigated a multiple-mediation model involving a three-path mediating effect through two mediators using case-control study data. We propose an approach to correct bias in coefficients and provide accurate estimates of the specific indirect effects. Our approach can also be used when the original case-control study is frequency matched on one of the mediators. We employed bootstrapping to assess the significance of indirect effects. We conducted simulation studies to investigate the performance of the proposed approach, and showed that it provides more accurate estimates of the indirect effects as well as the percent mediated than standard regressions. We then applied this approach to study the mediating effects of both smoking and chronic obstructive pulmonary disease (COPD) on the association between the CHRNA5-A3 gene locus and lung cancer risk using data from a lung cancer case-control study. The results showed that the genetic variant influences lung cancer risk indirectly through all three different pathways. The percent of genetic association mediated was 18.3% through smoking alone, 30.2% through COPD alone, and 20.6% through the path including both smoking and COPD, and the total genetic variant-lung cancer association explained by the two mediators was 69.1%.  

4.
Weibin Zhong  Guoqing Diao 《Biometrics》2023,79(3):1959-1971
Two-phase studies such as case-cohort and nested case-control studies are widely used cost-effective sampling strategies. In the first phase, the observed failure/censoring time and inexpensive exposures are collected. In the second phase, a subgroup of subjects is selected for measurements of expensive exposures based on the information from the first phase. One challenging issue is how to utilize all the available information to conduct efficient regression analyses of the two-phase study data. This paper proposes a joint semiparametric modeling of the survival outcome and the expensive exposures. Specifically, we assume a class of semiparametric transformation models and a semiparametric density ratio model for the survival outcome and the expensive exposures, respectively. The class of semiparametric transformation models includes the proportional hazards model and the proportional odds model as special cases. The density ratio model is flexible in modeling multivariate mixed-type data. We develop efficient likelihood-based estimation and inference procedures and establish the large sample properties of the nonparametric maximum likelihood estimators. Extensive numerical studies reveal that the proposed methods perform well under practical settings. The proposed methods also appear to be reasonably robust under various model mis-specifications. An application to the National Wilms Tumor Study is provided.

5.
Weinberg CR 《Genomics》2009,93(1):10-12
Most diseases are complex in that they are caused by the joint action of multiple factors, both genetic and environmental. Over the past few decades, the mathematical convenience of logistic regression has served to enshrine the multiplicative model, to the point where many epidemiologists believe that departure from additivity on a log scale implies that two factors interact in causing disease. Other terminology in epidemiology, where students are told that inequality of relative risks across levels of a second factor should be seen as "effect modification," reinforces an uncritical acceptance of multiplicative joint effect as the biologically meaningful no-interaction null. Our first task, when studying joint effects, is to understand the limitations of our definitions for "interaction," and recognize that what statisticians mean and what biologists might want to mean by interaction may not coincide. Joint effects are notoriously hard to identify and characterize, even when asking a simple and unsatisfying question, like whether two effects are log-additive. The rule of thumb for such efforts is that a factor-of-four sample size is needed, compared with that needed to demonstrate main effects of either genes or exposures. So strategies have been devised that focus on the most informative individuals, either through risk-based sampling for a cohort, or case-control sampling, extreme phenotype sampling, pooling, two-stage sampling, exposed-only, or case-only designs. These designs gain efficiency, but at a cost of flexibility in models for joint effects. A relatively new approach avoids population controls by genotyping case-parent triads. Because it requires parents, the method works best for diseases with onset early in life. With this design, the role of autosomal genetic variants is assessed by in effect treating the nontransmitted parental alleles as controls for affected offspring. 
Despite advantages for looking at genetic effects, the triad design faces limitations when examining joint effects of genetic and environmental factors. Because population-based controls are not included, main effects for exposures cannot be estimated, and consequently one only has access to inference related to a multiplicative null. We have proposed a hybrid approach that offers the best features of both case-parent and case-control designs. Through genotyping of parents of population-based controls and assuming Mendelian transmission, power is markedly enhanced. One can also estimate main effects for exposures and now flexibly assess models for joint effects.

6.
In biomedical cohort studies for assessing the association between an outcome variable and a set of covariates, some covariates can usually be measured only on a subgroup of study subjects. An important design question is which subjects to select into the subgroup to increase statistical efficiency. When the outcome is binary, one may adopt a case-control sampling design or a balanced case-control design where cases and controls are further matched on a small number of complete discrete covariates. While the latter achieves success in estimating odds ratio (OR) parameters for the matching covariates, similar two-phase design options have not been explored for the remaining covariates, especially the incompletely collected ones. This is of great importance in studies where the covariates of interest cannot be completely collected. To this end, assuming that an external model is available to relate the outcome and complete covariates, we propose a novel sampling scheme that oversamples cases and controls with worse goodness-of-fit based on the external model and further matches them on complete covariates similarly to the balanced design. We develop a pseudolikelihood method for estimating OR parameters. Through simulation studies and explorations in a real cohort study, we find that our design generally leads to reduced asymptotic variances of the OR estimates and that the reduction for the matching covariates is comparable to that of the balanced design.

7.
The aim of the present analysis is to combine evidence for association from the two most commonly used designs in genetic association analysis, the case-control design and the transmission disequilibrium test (TDT) design. The cases here are affected offspring from nuclear families and are used in both the case-control and TDT designs. As a result, inference from these designs is not independent. We applied a simple logistic regression method for combining evidence for association from case-control and TDT designs to single-nucleotide polymorphism data purchased on a region on chromosome 3, replicate 1 of the Aipotu population. Combining the evidence from the case-control and TDT designs yielded a 5-10% reduction in the standard errors of the relative risk estimates. The authors did not know the results before the analyses were conducted.
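
For readers unfamiliar with the TDT half of the comparison, the following is a minimal sketch of the classical TDT statistic, which is a McNemar-type chi-square on transmissions from heterozygous parents. It is not the logistic regression combination method described in the abstract, and the counts are hypothetical.

```python
import math


def tdt(transmitted: int, untransmitted: int) -> tuple[float, float]:
    """Classical transmission disequilibrium test.

    transmitted:   times heterozygous parents passed the candidate allele
                   to an affected offspring
    untransmitted: times they passed the other allele instead
    Returns the chi-square statistic (1 degree of freedom) and its p-value.
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative transmissions")
    statistic = (transmitted - untransmitted) ** 2 / total
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts, not taken from the abstract.
stat, p = tdt(transmitted=60, untransmitted=40)
print(round(stat, 2), round(p, 4))
```

Only heterozygous parents contribute, so the nontransmitted alleles act as internal controls for the affected offspring.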

8.

Background

A diverse range of study designs (e.g. case-control or cohort) is used in the evaluation of adverse effects. We aimed to ascertain whether the risk estimates from meta-analyses of case-control studies differ from those of other study designs.

Methods

Searches were carried out in 10 databases in addition to reference checking, contacting experts, and handsearching key journals and conference proceedings. Studies were included where a pooled relative measure of an adverse effect (odds ratio or risk ratio) from case-control studies could be directly compared with the pooled estimate for the same adverse effect arising from other types of observational studies.

Results

We included 82 meta-analyses. Pooled estimates of harm from the different study designs had 95% confidence intervals that overlapped in 78/82 instances (95%). Of the 23 cases of discrepant findings (significant harm identified in meta-analysis of one type of study design, but not with the other study design), 16 (70%) stemmed from significantly elevated pooled estimates from case-control studies. There was associated evidence of funnel plot asymmetry consistent with higher risk estimates from case-control studies. On average, cohort or cross-sectional studies yielded pooled odds ratios 0.94 (95% CI 0.88–1.00) times lower than that from case-control studies.

Interpretation

Empirical evidence from this overview indicates that meta-analyses of case-control studies tend to give slightly higher estimates of harm than meta-analyses of other observational studies. However, it is impossible to rule out potential confounding from differences in drug dose, duration, and populations when comparing study designs.

9.
10.
Both the absolute risk and the relative risk (RR) have a crucial role to play in epidemiology. RR is often approximated by the odds ratio (OR) under the rare-disease assumption in a conventional case-control study; however, such a study design does not provide an estimate of absolute risk. The case-base study is an alternative approach which readily produces RR estimates without resorting to the rare-disease assumption. However, previous researchers considered only a single dichotomous exposure and did not elaborate on how absolute risks can be estimated in a case-base study. In this paper, the authors propose a logistic model for the case-base study. The model is flexible enough to admit multiple exposures on any measurement scale (binary, categorical, or continuous). It can be easily fitted using common statistical packages. With one additional step of simple calculations on the model parameters, one readily obtains relative and absolute risk estimates as well as their confidence intervals. Monte Carlo simulations show that the proposed method can produce unbiased estimates and confidence intervals with adequate coverage for ORs, RRs, and absolute risks. The case-base study, with all its desirable properties and its methods of analysis fully developed in this paper, may become a mainstay in epidemiology.
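
The rare-disease assumption mentioned above can be illustrated with a short sketch that computes the RR and the OR from the same cohort-style 2x2 table. This is not the case-base logistic model proposed in the paper. The function names and counts are hypothetical, chosen so that the RR equals 2 in both tables.

```python
def risk_ratio(a: int, b: int, c: int, d: int) -> float:
    """Risk ratio from a cohort-style 2x2 table.

    a, b: exposed subjects with / without disease
    c, d: unexposed subjects with / without disease
    """
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio (cross-product ratio) from the same 2x2 table."""
    return (a * d) / (b * c)


# Hypothetical counts, not taken from the abstract.
rare = (20, 9980, 10, 9990)    # risks of 0.2% and 0.1%
common = (400, 600, 200, 800)  # risks of 40% and 20%

for label, table in (("rare", rare), ("common", common)):
    print(label, round(risk_ratio(*table), 3), round(odds_ratio(*table), 3))
```

When the disease is rare the two measures nearly coincide. When it is common, the OR overstates the RR, which is the gap the case-base design is meant to avoid.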

11.
Prospective studies of diagnostic test accuracy have important advantages over retrospective designs. Yet, when the disease being detected by the diagnostic test(s) has a low prevalence rate, a prospective design can require an enormous sample of patients. We consider two strategies to reduce the costs of prospective studies of binary diagnostic tests: stratification and two-phase sampling. Utilizing neither, one, or both of these strategies provides us with four study design options: (1) the conventional design involving a simple random sample (SRS) of patients from the clinical population; (2) a stratified design where patients from higher-prevalence subpopulations are more heavily sampled; (3) a simple two-phase design using an SRS in the first phase and selection for the second phase based on the test results from the first; and (4) a two-phase design with stratification in the first phase. We describe estimators for sensitivity and specificity and their variances for each design, along with sample size estimation. We offer some recommendations for choosing among the various designs. We illustrate the study designs with two examples.

12.
Respiratory syncytial virus (RSV) is the most common viral pathogen that causes lower respiratory tract infections in infants. Studies have implicated severe RSV infections early in life as a risk factor for subsequent development of reactive airway disease. We are conducting a study to validate RSV-associated diagnoses in the Danish National Patient Registry, to assess whether the incidence of severe RSV infection is increasing in Denmark, to identify predisposing and protective factors for RSV-associated hospitalization in Denmark, and to examine the association of severe RSV infection with reactive airway disease. The influence of various biological, social and environmental factors on hospitalization for RSV infection will be studied through several population-based registers, including the Danish National Birth Cohort: 'Better health for mothers and children'. The RSV hospitalization cases will be compared with control individuals selected within the same population groups on a case-control or a cohort basis in order to produce estimates of age-adjusted and sex-adjusted relative risks (odds ratio and relative risk) for hospitalization associated with various risk factors. Using register linkage and unique registration of exposures collected through interviews and blood samples from the Danish National Birth Cohort, we will be able to resolve the issues referred to above in a very large sample of Danish children.

13.
Data from epidemiological studies might be seen as superior to data from animal bioassays for risk assessment purposes. Because humans are the population of interest, use of epidemiological data avoids interspecies extrapolation. However, one must not assume that an epidemiological study is necessarily valid at face value. We describe issues of validity that arise in the conduct and interpretation of epidemiological research and that affect the utility of epidemiological data in risk assessment. These issues include choice of study design, size and representativeness of the study sample, measurement of exposures and outcomes, control of confounding and specification of statistical model for analysis of data, all of which affect the accuracy and validity of study results.

14.
Chen J  Lin D  Hochner H 《Biometrics》2012,68(3):869-877
The case-control mother-child pair design offers a unique advantage for dissecting the genetic susceptibility of complex traits because it allows the assessment of both maternal and offspring genetic compositions. This design has been widely adopted in studies of obstetric complications and neonatal outcomes. In this work, we developed an efficient statistical method for evaluating joint genetic and environmental effects on a binary phenotype. Using a logistic regression model to describe the relationship between the phenotype and maternal and offspring genetic and environmental risk factors, we developed a semiparametric maximum likelihood method for the estimation of odds ratio association parameters. Our method is novel because it exploits two unique features of the study data for the parameter estimation. First, the correlation between maternal and offspring SNP genotypes can be specified under the assumptions of random mating, Hardy-Weinberg equilibrium, and Mendelian inheritance. Second, environmental exposures are often not affected by offspring genes conditional on maternal genes. Our method yields more efficient estimates compared with the standard prospective method for fitting logistic regression models to case-control data. We demonstrated the performance of our method through extensive simulation studies and the analysis of data from the Jerusalem Perinatal Study.

15.

Background

Typically, a two-phase (double) sampling strategy is employed when classifications are subject to error and there is a gold standard (perfect) classifier available. Two-phase sampling involves classifying the entire sample with an imperfect classifier, and a subset of the sample with the gold-standard.

Methodology/Principal Findings

In this paper we consider an alternative strategy termed reclassification sampling, which involves classifying individuals using the imperfect classifier more than one time. Estimates of sensitivity, specificity and prevalence are provided for reclassification sampling, when either one or two binary classifications of each individual using the imperfect classifier are available. Robustness of estimates and design decisions to model assumptions are considered. Software is provided to compute estimates and provide advice on the optimal sampling strategy.

Conclusions/Significance

Reclassification sampling is shown to be cost-effective (lower standard error of estimates for the same cost) for estimating prevalence as compared to two-phase sampling in many practical situations.

16.
Genome-wide association studies (GWAS) are routinely conducted for both quantitative and binary (disease) traits. We present two analytical tools for use in the experimental design of GWAS. Firstly, we present power calculations quantifying power in a unified framework for a range of scenarios. In this context we consider the utility of quantitative scores (e.g. endophenotypes) that may be available on cases only or on both cases and controls. Secondly, we consider the accuracy of prediction of genetic risk from genome-wide SNPs and derive an expression for genomic prediction accuracy using a liability threshold model for disease traits in a case-control design. The expected values based on our derived equations for both power and prediction accuracy agree well with observed estimates from simulations.

17.
To assess whether screening people at high risk of malignant melanoma would be effective in reducing mortality from the disease, data from 400 case-control pairs in a study of cutaneous malignant melanoma conducted in Western Australia during 1980-1 were used to predict the risk of melanoma in the remaining 111 pairs. All variables previously shown to be associated with a decrease or increase in the incidence of melanoma were considered for inclusion in a single conditional logistic regression model of the incidence of melanoma in the randomly chosen subset of 400 case-control pairs. Five of these variables (number of raised naevi on the arms, arrival in Australia before 10 years of age, history of non-melanocytic skin cancer, time spent outdoors in summer from the age of 10 to 24, and family history of melanoma) provided good discrimination between patients and controls in this sample and in the 111 other case-control pairs. Among the 222 subjects in these other case-control pairs, a group defined as being at high risk of melanoma by a risk score derived from these five variables contained 60 (54%) of the patients with melanoma but only 18 (16%) of the controls. These data suggest that in Western Australia more than half of all new patients with melanoma arise in an identifiable subpopulation constituting less than one fifth of the whole population. Identifying this subpopulation and screening it regularly for cutaneous malignant melanoma could be cost effective in reducing mortality from this disease.
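
The percentages reported for the 111 validation pairs can be checked with a short calculation. This sketch uses only the counts given in the abstract. The variable names, and the reading of the two proportions as the sensitivity and false-positive fraction of the risk score, are assumptions made for illustration.

```python
# Counts from the abstract: 222 validation subjects in 111 case-control pairs.
cases_total = 111      # patients with melanoma
controls_total = 111   # matched controls
cases_flagged = 60     # patients classified as high risk
controls_flagged = 18  # controls classified as high risk

sensitivity = cases_flagged / cases_total
false_positive_fraction = controls_flagged / controls_total
specificity = 1.0 - false_positive_fraction
positive_likelihood_ratio = sensitivity / false_positive_fraction

print(f"sensitivity: {sensitivity:.1%}")
print(f"false-positive fraction: {false_positive_fraction:.1%}")
print(f"specificity: {specificity:.1%}")
print(f"positive likelihood ratio: {positive_likelihood_ratio:.2f}")
```

If the controls are representative of the general population, the false-positive fraction also approximates the share of the population that would be placed in the high-risk group, which is the basis for the "less than one fifth" statement.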

18.
Standard errors for attributable risk for simple and complex sample designs
Graubard BI  Fears TR 《Biometrics》2005,61(3):847-855
Adjusted attributable risk (AR) is the proportion of diseased individuals in a population that is due to an exposure. We consider estimates of adjusted AR based on odds ratios from logistic regression to adjust for confounding. Influence function methods used in survey sampling are applied to obtain simple and easily programmable expressions for estimating the variance of AR. These variance estimators can be applied to data from case-control, cross-sectional, and cohort studies with or without frequency or individual matching and for sample designs with subject samples that range from simple random samples to (sample) weighted multistage stratified cluster samples like those used in national household surveys. The variance estimation of AR is illustrated with: (i) a weighted stratified multistage clustered cross-sectional study of childhood asthma from the Third National Health and Nutrition Examination Survey (NHANES III), and (ii) a frequency-matched case-control study of melanoma skin cancer.

19.
We propose a conditional scores procedure for obtaining bias-corrected estimates of log odds ratios from matched case-control data in which one or more covariates are subject to measurement error. The approach involves conditioning on sufficient statistics for the unobservable true covariates that are treated as fixed unknown parameters. For the case of Gaussian nondifferential measurement error, we derive a set of unbiased score equations that can then be solved to estimate the log odds ratio parameters of interest. The procedure successfully removes the bias in naive estimates, and standard error estimates are obtained by resampling methods. We present an example of the procedure applied to data from a matched case-control study of prostate cancer and serum hormone levels, and we compare its performance to that of regression calibration procedures.

20.
Mukherjee B  Zhang L  Ghosh M  Sinha S 《Biometrics》2007,63(3):834-844
In case-control studies of gene-environment association with disease, when genetic and environmental exposures can be assumed to be independent in the underlying population, one may exploit the independence in order to derive more efficient estimation techniques than the traditional logistic regression analysis (Chatterjee and Carroll, 2005, Biometrika 92, 399-418). However, covariates that stratify the population, such as age, ethnicity, and the like, could potentially lead to nonindependence. In this article, we provide a novel semiparametric Bayesian approach to model stratification effects under the assumption of gene-environment independence in the control population. We illustrate the methods by applying them to data from a population-based case-control study on ovarian cancer conducted in Israel. A simulation study is conducted to compare our method with other popular choices. The results indicate that the semiparametric Bayesian model allows incorporation of key scientific evidence in the form of a prior and offers a flexible, robust alternative when standard parametric model assumptions do not hold.
