Similar articles
 Found 20 similar articles (search time: 109 ms)
1.
Shih JH, Chatterjee N. Biometrics, 2002, 58(3):502-509
In case-control family studies with a survival endpoint, the age of onset of disease can be used to assess the familial aggregation of the disease and the relationship between the disease and genetic or environmental risk factors. Because of the retrospective nature of the case-control study, methods for analyzing prospectively collected correlated failure time data do not apply directly. In this article, we propose a semiparametric quasi-partial-likelihood approach to simultaneously estimate the effect of covariates on the age of onset and the association of ages of onset among family members that does not require specification of the baseline marginal distribution. We conducted a simulation study to evaluate the performance of the proposed approach and compare it with the existing semiparametric ones. Simulation results demonstrate that the proposed approach has better performance in terms of consistency and efficiency. We illustrate the methodology using a subset of data from the Washington Ashkenazi Study.

2.
Pennell ML, Dunson DB. Biometrics, 2006, 62(4):1044-1052
Many biomedical studies collect data on times of occurrence for a health event that can occur repeatedly, such as infection, hospitalization, recurrence of disease, or tumor onset. To analyze such data, it is necessary to account for within-subject dependency in the multiple event times. Motivated by data from studies of palpable tumors, this article proposes a dynamic frailty model and Bayesian semiparametric approach to inference. The widely used shared frailty proportional hazards model is generalized to allow subject-specific frailties to change dynamically with age while also accommodating nonproportional hazards. Parametric assumptions on the frailty distribution are avoided by using Dirichlet process priors for a shared frailty and for multiplicative innovations on this frailty. By centering the semiparametric model on a conditionally conjugate dynamic gamma model, we facilitate posterior computation and lack-of-fit assessments of the parametric model. Our proposed method is demonstrated using data from a cancer chemoprevention study.
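As a point of reference, the widely used shared frailty proportional hazards model that this abstract generalizes is usually written in a form like the one below. The symbols are assumed here for illustration and are not taken from the article: $\xi_i$ is the frailty of subject $i$, $\lambda_0(t)$ the baseline hazard, $x_{ij}$ the covariates for event $j$ of subject $i$, and $\beta$ the regression coefficients. A gamma frailty with mean 1 is shown as one common parametric choice.

```latex
\lambda_{ij}(t \mid \xi_i) = \xi_i \, \lambda_0(t) \, \exp\!\left(x_{ij}^{\top} \beta\right),
\qquad \xi_i \sim \mathrm{Gamma}(\nu, \nu)
```

The dynamic model described in the abstract lets the frailty change with age and replaces the parametric frailty distribution with Dirichlet process priors; its exact form is given in the article, not in this sketch.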

3.
Estimating the effects of haplotypes on the age of onset of a disease is an important step toward the discovery of genes that influence complex human diseases. A haplotype is a specific sequence of nucleotides on the same chromosome of an individual and can only be measured indirectly through the genotype. We consider cohort studies which collect genotype data on a subset of cohort members through case-cohort or nested case-control sampling. We formulate the effects of haplotypes and possibly time-varying environmental variables on the age of onset through a broad class of semiparametric regression models. We construct appropriate nonparametric likelihoods, which involve both finite- and infinite-dimensional parameters. The corresponding nonparametric maximum likelihood estimators are shown to be consistent, asymptotically normal, and asymptotically efficient. Consistent variance-covariance estimators are provided, and efficient and reliable numerical algorithms are developed. Simulation studies demonstrate that the asymptotic approximations are accurate in practical settings and that case-cohort and nested case-control designs are highly cost-effective. An application to a major cardiovascular study is provided.

4.
Shen Y, Huang X. Biometrics, 2005, 61(4):992-999
We propose nonparametric estimation of the preclinical duration distribution in cancer based on data from a randomized early detection trial. In cancer screening studies, the preclinical duration of a disease is of great interest for better understanding the natural history of the disease, and for developing optimal screening strategies. To estimate the sojourn time distribution nonparametrically, we first estimate the distribution of the age at onset of preclinical disease nonparametrically using data from the screening arm in a randomized screening trial, and the distribution of the age at onset of clinical disease from the control arm of the same trial. Finally, by deconvolution, the two estimated distributions yield a nonparametric estimate of the distribution of the gap time between the onset of preclinical disease and the onset of clinical disease. We illustrate the methodology using data from a randomized breast cancer screening trial.
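To make the deconvolution step concrete, here is a minimal sketch on a discrete grid. It assumes that the age at clinical onset equals the age at preclinical onset plus an independent sojourn time, so that the clinical-onset distribution is the convolution of the other two. The function names and toy probabilities are hypothetical; this is not the authors' estimator or data.

```python
def convolve(p, s):
    """Discrete convolution: pmf of the sum of two independent variables."""
    out = [0.0] * (len(p) + len(s) - 1)
    for i, p_i in enumerate(p):
        for j, s_j in enumerate(s):
            out[i + j] += p_i * s_j
    return out


def deconvolve(c, p, n_out):
    """Recover s from c = p * s by forward substitution (needs p[0] > 0)."""
    if p[0] <= 0:
        raise ValueError("p[0] must be positive for this recursion")
    s = []
    for k in range(n_out):
        acc = c[k]
        # Subtract the contributions of the already recovered s values.
        for j in range(1, min(k, len(p) - 1) + 1):
            acc -= p[j] * s[k - j]
        s.append(acc / p[0])
    return s


# Hypothetical toy distributions on a yearly grid (not data from the trial).
preclinical_onset = [0.2, 0.5, 0.3]  # pmf of age at preclinical onset
sojourn_true = [0.1, 0.6, 0.3]       # pmf of sojourn time
clinical_onset = convolve(preclinical_onset, sojourn_true)

sojourn_est = deconvolve(clinical_onset, preclinical_onset, len(sojourn_true))
print([round(x, 6) for x in sojourn_est])
```

With exact inputs the recursion recovers the sojourn-time pmf. With estimated, noisy distributions, plain deconvolution is typically unstable, which is one reason a dedicated nonparametric procedure is needed in practice.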

5.
A multiplicative model is described relating HLA typing information to disease incidence. A likelihood-based method for estimating parameters in this model is proposed for use with data sets in which HLA haplotype information is available on a series of cases and their parents. This approach is extended to incorporate information from a matched control series for the purpose of estimating HLA and environmental risk factor effects simultaneously. The method is applied to data from aplastic anemia patients treated by bone marrow transplantation and the results are compared to unmatched case-control analyses using the same case series and several different control series.

6.
S. Mandal, J. Qin, R.M. Pfeiffer. Biometrics, 2023, 79(3):1701-1712
We propose and study a simple and innovative non-parametric approach to estimate the age-of-onset distribution for a disease from a cross-sectional sample of the population that includes individuals with prevalent disease. First, we estimate the joint distribution of two event times, the age of disease onset and the survival time after disease onset. We accommodate that individuals had to be alive at the time of the study by conditioning on their survival until the age at sampling. We propose a computationally efficient expectation–maximization (EM) algorithm and derive the asymptotic properties of the resulting estimates. From these joint probabilities we then obtain non-parametric estimates of the age-at-onset distribution by marginalizing over the survival time after disease onset to death. The method accommodates categorical covariates and can be used to obtain unbiased estimates of the covariate distribution in the source population. We show in simulations that our method performs well in finite samples even under large amounts of truncation for prevalent cases. We apply the proposed method to data from female participants in the Washington Ashkenazi Study to estimate the age-at-onset distribution of breast cancer associated with carrying BRCA1 or BRCA2 mutations.
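The marginalization step can be written generically as follows. The notation is assumed for illustration and is not taken from the article: $T$ is the age at disease onset, $X$ the survival time after onset, and $\hat{p}(t, x)$ the estimated joint probability on a discrete grid.

```latex
\hat{P}(T = t) = \sum_{x} \hat{p}(t, x)
```

This only restates the final step; the estimation of $\hat{p}(t, x)$ under truncation is the substance of the method and is not reproduced here.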

7.
Weinberg CR. Genomics, 2009, 93(1):10-12
Most diseases are complex in that they are caused by the joint action of multiple factors, both genetic and environmental. Over the past few decades, the mathematical convenience of logistic regression has served to enshrine the multiplicative model, to the point where many epidemiologists believe that departure from additivity on a log scale implies that two factors interact in causing disease. Other terminology in epidemiology, where students are told that inequality of relative risks across levels of a second factor should be seen as "effect modification," reinforces an uncritical acceptance of multiplicative joint effect as the biologically meaningful no-interaction null. Our first task, when studying joint effects, is to understand the limitations of our definitions for "interaction," and recognize that what statisticians mean and what biologists might want to mean by interaction may not coincide. Joint effects are notoriously hard to identify and characterize, even when asking a simple and unsatisfying question, like whether two effects are log-additive. The rule of thumb for such efforts is that a factor-of-four sample size is needed, compared with that needed to demonstrate main effects of either genes or exposures. So strategies have been devised that focus on the most informative individuals, either through risk-based sampling for a cohort, or case-control sampling, extreme phenotype sampling, pooling, two-stage sampling, exposed-only, or case-only designs. These designs gain efficiency, but at a cost of flexibility in models for joint effects. A relatively new approach avoids population controls by genotyping case-parent triads. Because it requires parents, the method works best for diseases with onset early in life. With this design, the role of autosomal genetic variants is assessed by in effect treating the nontransmitted parental alleles as controls for affected offspring. 
Despite advantages for looking at genetic effects, the triad design faces limitations when examining joint effects of genetic and environmental factors. Because population-based controls are not included, main effects for exposures cannot be estimated, and consequently one only has access to inference related to a multiplicative null. We have proposed a hybrid approach that offers the best features of both case-parent and case-control designs. Through genotyping of parents of population-based controls and assuming Mendelian transmission, power is markedly enhanced. One can also estimate main effects for exposures and now flexibly assess models for joint effects.

8.
Multivariate survival data arise from case-control family studies in which the ages at disease onset for family members may be correlated. In this paper, we consider a multivariate survival model with the marginal hazard function following the proportional hazards model. We use a frailty-based approach in the spirit of Glidden and Self (1999) to account for the correlation of ages at onset among family members. Specifically, we first estimate the baseline hazard function nonparametrically by the innovation theorem, and then obtain maximum pseudolikelihood estimators for the regression and correlation parameters plugging in the baseline hazard function estimator. We establish a connection with a previously proposed generalized estimating equation-based approach. Simulation studies and an analysis of case-control family data of breast cancer illustrate the methodology's practical utility.

9.
Chen J, Rodriguez C. Biometrics, 2007, 63(4):1099-1107
Genetic epidemiologists routinely assess disease susceptibility in relation to haplotypes, that is, combinations of alleles on a single chromosome. We study statistical methods for inferring haplotype-related disease risk using single nucleotide polymorphism (SNP) genotype data from matched case-control studies, where controls are individually matched to cases on some selected factors. Assuming a logistic regression model for haplotype-disease association, we propose two conditional likelihood approaches that address the issue that haplotypes cannot be inferred with certainty from SNP genotype data (phase ambiguity). One approach is based on the likelihood of disease status conditioned on the total number of cases, genotypes, and other covariates within each matching stratum, and the other is based on the joint likelihood of disease status and genotypes conditioned only on the total number of cases and other covariates. The joint-likelihood approach is generally more efficient, particularly for assessing haplotype-environment interactions. Simulation studies demonstrated that the first approach was more robust to model assumptions on the diplotype distribution conditioned on environmental risk variables and matching factors in the control population. We applied the two methods to analyze a matched case-control study of prostate cancer.

10.
In some cross-sectional studies of chronic disease, data consist of the age at examination, whether the disease was present at the exam, and recall of the age at first diagnosis. This article describes a flexible parametric approach for combining current status and age at first diagnosis data. We assume that the log odds of onset by a given age and of detection by a given age conditional on onset by that age are nondecreasing functions of time plus linear combinations of covariates. Piecewise linear models are used to characterize changes across time in the baseline odds. Methods are described for accommodating informatively missing current status data and inferences based on the age-specific incidence of disease prior to a landmark event (e.g., puberty, menopause). Our formulation enables straightforward maximum likelihood estimation without requiring restrictive parametric or Markov assumptions. The methods are applied to data from a study of uterine fibroids.

11.
It is well recognized that age at onset of Huntington disease (HD) is strongly influenced by the sex of the affected parent, and this has led to suggestions that genetic imprinting or maternal specific factors may play a role in the expression of the disease. This study evaluated maternal and paternal ages, birth order, parental age at onset, and sex of the affected parent and grandparent in 1,764 patients in the National HD Roster by using linear-regression techniques which incorporated a weighted least-squares approach to accommodate the correlation among siblings. It was found that paternal age is negatively associated with age at onset of HD, particularly among subjects who inherit the mutant gene from grandfathers. Apparent associations between age at onset and birth order and between age at onset and maternal age were not significant after adjustment for paternal age. The paternal age effect is strongest among juvenile-onset cases and individuals with anticipation of greater than or equal to 10 years, although it is detectable across the entire age-at-onset distribution. The tendency for older fathers, including those not transmitting the HD gene, to have affected offspring with early-onset disease may be consistent with a gene imprinting mechanism involving DNA methylation. Because paternal age in unaffected fathers is also a significant determinant of age at onset, methylation in this context might involve HD modifier genes or the normal HD allele.

12.
A variety of statistical methods exist for detecting haplotype-disease association through use of genetic data from a case-control study. Since such data often consist of unphased genotypes (resulting in haplotype ambiguity), such statistical methods typically apply the expectation-maximization (EM) algorithm for inference. However, the majority of these methods fail to perform inference on the effect of particular haplotypes or haplotype features on disease risk. Since such inference is valuable, we develop a retrospective likelihood for estimating and testing the effects of specific features of single-nucleotide polymorphism (SNP)-based haplotypes on disease risk using unphased genotype data from a case-control study. Our proposed method has a flexible structure that allows, among other choices, modeling of multiplicative, dominant, and recessive effects of specific haplotype features on disease risk. In addition, our method relaxes the requirement of Hardy-Weinberg equilibrium of haplotype frequencies in case subjects, which is typically required of EM-based haplotype methods. Also, our method easily accommodates missing SNP information. Finally, our method allows for asymptotic, permutation-based, or bootstrap inference. We apply our method to case-control SNP genotype data from the Finland-United States Investigation of Non-Insulin-Dependent Diabetes Mellitus (FUSION) Genetics study and identify two haplotypes that appear to be significantly associated with type 2 diabetes. Using the FUSION data, we assess the accuracy of asymptotic P values by comparing them with P values obtained from a permutation procedure. We also assess the accuracy of asymptotic confidence intervals for relative-risk parameters for haplotype effects, by a simulation study based on the FUSION data.
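For readers unfamiliar with how EM handles haplotype ambiguity, the following is a minimal sketch of the classic EM estimate of haplotype frequencies for two biallelic SNPs under Hardy-Weinberg equilibrium. It is not the retrospective likelihood proposed in this abstract (which relaxes that assumption in cases); the function names, genotype coding (minor-allele counts 0/1/2 per SNP), and toy data are all hypothetical.

```python
from collections import Counter

# Haplotypes over two SNPs; 0 = major allele, 1 = minor allele.
HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def compatible_pairs(genotype):
    """Ordered haplotype pairs whose allele counts add up to the genotype."""
    return [(h1, h2) for h1 in HAPLOTYPES for h2 in HAPLOTYPES
            if (h1[0] + h2[0], h1[1] + h2[1]) == genotype]


def em_haplotype_freqs(genotypes, n_iter=200, tol=1e-10):
    """EM estimate of haplotype frequencies from unphased genotypes."""
    counts = Counter(genotypes)
    freqs = {h: 0.25 for h in HAPLOTYPES}  # uniform starting values
    n_chrom = 2 * len(genotypes)
    for _ in range(n_iter):
        # E-step: expected haplotype counts given current frequencies.
        expected = {h: 0.0 for h in HAPLOTYPES}
        for g, n in counts.items():
            pairs = compatible_pairs(g)
            weights = [freqs[h1] * freqs[h2] for h1, h2 in pairs]
            total = sum(weights)
            if total == 0.0:
                continue
            for (h1, h2), w in zip(pairs, weights):
                expected[h1] += n * w / total
                expected[h2] += n * w / total
        # M-step: renormalize expected counts to frequencies.
        new = {h: expected[h] / n_chrom for h in HAPLOTYPES}
        delta = max(abs(new[h] - freqs[h]) for h in HAPLOTYPES)
        freqs = new
        if delta < tol:
            break
    return freqs


# Hypothetical toy sample; only the (1, 1) genotype is phase-ambiguous.
toy_genotypes = [(0, 0)] * 4 + [(2, 2)] * 4 + [(1, 1)] * 2
estimated = em_haplotype_freqs(toy_genotypes)
for hap in HAPLOTYPES:
    print(hap, round(estimated[hap], 4))
```

In this toy sample the double heterozygotes are resolved toward the two haplotypes that are also supported by unambiguous homozygotes, which is the basic behaviour EM-based haplotype methods rely on.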

13.

Background

Determining genetic risk is a fundamental prerequisite for the implementation of primary prevention trials for type 1 diabetes (T1D). The aim of this study was to assess the risk conferred by HLA-DRB1, INS-VNTR and PTPN22 single genes on the onset of T1D and the joint risk conferred by all these three susceptibility loci using the Bayesian Network (BN) approach in both population-based case-control and family clustering data sets.

Methodology/Principal Findings

A French case-control cohort, consisting of 868 T1D patients and 73 French control subjects, and a French family data set, consisting of 1694 T1D patients and 2340 controls, were analysed. We studied both samples separately, applying the BN probabilistic approach, a graphical model that encodes probabilistic relationships among variables of interest. As expected, HLA-DRB1 is the most relevant susceptibility gene. We proved that INS and PTPN22 genes marginally influence T1D risk in all risk HLA-DRB1 genotype categories. The absolute risk conferred by simultaneously carrying high, moderate or low risk HLA-DRB1 genotypes together with at-risk INS and PTPN22 genotypes was 11.5%, 1.7% and 0.1% in the case-control sample and 19.8%, 6.6% and 2.2% in the family cohort, respectively.

Conclusions/Significance

This work represents, to the best of our knowledge, the first study based on both case-control and family data sets, showing the joint effect of HLA, INS and PTPN22 in a T1D Caucasian population with a wide range of age at T1D onset, adding new insights to previous findings regarding data sets consisting of patients and controls <15 years at onset.

14.
Motivated by a Finnish case-control study of early onset diabetes in which diabetic children are matched to sibling controls, we investigate ascertainment bias of the usual rate ratio estimator from case-control data under simplex complete ascertainment of families during a fixed interval of time. Analytic results indicate that the assumptions necessary for valid estimation are that the disease is rare and the factors under study are exchangeable; essentially, the covariate distribution must not depend on calendar time or birth order. Further, we found that the rare disease assumption could be dropped by restricting to cases that were diagnosed during the enrollment period of the study or including all cases but eliminating the proband as a control for non-enrollment-period cases. An important consequence of this work is that standard family-based case-control studies are subject to ascertainment bias if exchangeability of the covariates under investigation does not hold.

15.
A mediation model explores the direct and indirect effects between an independent variable and a dependent variable by including other variables (or mediators). Mediation analysis has recently been used to dissect the direct and indirect effects of genetic variants on complex diseases using case-control studies. However, bias could arise in the estimations of the genetic variant-mediator association because the presence or absence of the mediator in the study samples is not sampled following the principles of case-control study design. In this case, the mediation analysis using data from case-control studies might lead to biased estimates of coefficients and indirect effects. In this article, we investigated a multiple-mediation model involving a three-path mediating effect through two mediators using case-control study data. We propose an approach to correct bias in coefficients and provide accurate estimates of the specific indirect effects. Our approach can also be used when the original case-control study is frequency matched on one of the mediators. We employed bootstrapping to assess the significance of indirect effects. We conducted simulation studies to investigate the performance of the proposed approach, and showed that it provides more accurate estimates of the indirect effects as well as the percent mediated than standard regressions. We then applied this approach to study the mediating effects of both smoking and chronic obstructive pulmonary disease (COPD) on the association between the CHRNA5-A3 gene locus and lung cancer risk using data from a lung cancer case-control study. The results showed that the genetic variant influences lung cancer risk indirectly through all three different pathways. The percent of genetic association mediated was 18.3% through smoking alone, 30.2% through COPD alone, and 20.6% through the path including both smoking and COPD, and the total genetic variant-lung cancer association explained by the two mediators was 69.1%.  
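The reported percentages in abstract 15 can be checked with a few lines of arithmetic: the three pathway-specific shares should add up to the stated total explained by the two mediators. The dictionary keys are labels chosen here for illustration; the numbers are those quoted in the abstract.

```python
# Percent of the genetic association mediated by each pathway,
# as reported in the abstract.
percent_mediated = {
    "smoking only": 18.3,
    "COPD only": 30.2,
    "smoking and COPD": 20.6,
}

total = round(sum(percent_mediated.values()), 1)
print(total)  # 69.1
```

The sum matches the 69.1% total given in the abstract, so the three paths are reported as a partition of the mediated effect.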

16.
Case-control designs are widely used in rare disease studies. In a typical case-control study, data are collected from a sample of all available subjects who have experienced a disease (cases) and a sub-sample of subjects who have not experienced the disease (controls) in a study cohort. Cases are oversampled in case-control studies. Logistic regression is a common tool to estimate the relative risks of the disease with respect to a set of covariates. Very often in such a study, the ages at onset of the disease for all cases and the ages at survey of controls are known. Standard logistic regression analysis using age as a covariate is based on a dichotomous outcome and does not efficiently use such age-at-onset (time-to-event) information. We propose to analyze age-at-onset data using a modified case-cohort method by treating the control group as an approximation of a subcohort assuming rare events. We investigate the asymptotic bias of this approximation and show that the asymptotic bias of the proposed estimator is small when the disease rate is low. We evaluate the finite sample performance of the proposed method through a simulation study and illustrate the method using a breast cancer case-control data set.
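The role of the rare-event assumption can be illustrated with a small numerical sketch: the odds ratio that logistic regression estimates is close to the relative risk only when the disease is rare. The risks below are hypothetical values chosen for illustration, not figures from the article.

```python
def relative_risk(p_exposed, p_unexposed):
    """Ratio of disease probabilities in exposed vs. unexposed subjects."""
    return p_exposed / p_unexposed


def odds_ratio(p_exposed, p_unexposed):
    """Ratio of disease odds in exposed vs. unexposed subjects."""
    return (p_exposed / (1 - p_exposed)) / (p_unexposed / (1 - p_unexposed))


# Hypothetical risks: a rare disease and a common one, both with RR = 2.
rare = (0.002, 0.001)
common = (0.4, 0.2)

for label, (p1, p0) in [("rare", rare), ("common", common)]:
    print(label, round(relative_risk(p1, p0), 3), round(odds_ratio(p1, p0), 3))
```

For the rare disease the two measures nearly coincide, while for the common disease the odds ratio overstates the relative risk, which is why approximations of this kind are only claimed to have small bias when the disease rate is low.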

17.
We present a Bayesian approach to analyze matched "case-control" data with multiple disease states. The probability of disease development is described by a multinomial logistic regression model. The exposure distribution depends on the disease state and could vary across strata. In such a model, the number of stratum effect parameters grows in direct proportion to the sample size leading to inconsistent MLEs for the parameters of interest even when one uses a retrospective conditional likelihood. We adopt a semiparametric Bayesian framework instead, assuming a Dirichlet process prior with a mixing normal distribution on the distribution of the stratum effects. We also account for possible missingness in the exposure variable in our model. The actual estimation is carried out through a Markov chain Monte Carlo numerical integration scheme. The proposed methodology is illustrated through simulation and an example of a matched study on low birth weight of newborns (Hosmer, D. A. and Lemeshow, S., 2000, Applied Logistic Regression) with two possible disease groups matched with a control group.

18.
We derive a multivariate survival model for age of onset data of a sibship from an additive genetic gamma frailty model constructed based on the inheritance vectors, and investigate the properties of this model. Based on this model, we propose a retrospective likelihood approach for genetic linkage analysis using sibship data. This test is an allele-sharing-based test, and does not require specification of genetic models or the penetrance functions. This new approach can incorporate both affected and unaffected sibs, environmental covariates and age of onset or age at censoring information and, therefore, provides a practical solution for mapping genes for complex diseases with variable age of onset. A small simulation study indicates that the proposed method performs better than the commonly used allele-sharing-based methods for linkage analysis, especially when the population disease rate is high. We applied this method to a type 1 diabetes sib pair data set and a small breast cancer data set. Both simulated and real data sets also indicate that the method is relatively robust to misspecification of the baseline hazard function.

19.
Correlations in age at onset between relatives affect risk to relatives of a given age. Either an increase or a decrease in risk may be observed for a relative of a proband, according to whether there is a causal relationship between liability to disease and age at onset. Likelihood formulas are given for pairs of relatives under a number of different sampling schemes, and it is shown how data collected from relatives enable maximum-likelihood estimation of parameters of a linear model relating disease liability and age at onset. A genotype-environment extension of this model was fitted to data on age at onset for schizophrenia that were obtained from the National Academy of Sciences-National Research Council Twin Registry. Age at onset is correlated between twins, but this correlation appears to be associated with factors that are separate from those which affect liability to disease. However, even this relatively large sample of twins is too small to draw firm conclusions about any causal relationship between disease liability and onset.

20.
In genetic research of chronic diseases, age-at-onset outcomes within families are often correlated. The nature of correlation of age-at-onset outcomes is indicative of common genetic and/or shared environmental risk factors among family members. Understanding patterns of such correlation may shed light on the disease etiology and, hence, is an important step to take prior to further searching for the responsible genes via segregation and linkage studies. Age-at-onset outcomes are different from those familiar quantitative or qualitative traits for which many statistical methods have been developed. In comparison with the quantitative traits, age-at-onset outcomes are often censored, i.e., instead of actual age-at-onset outcomes, only the current ages or ages at death are observed. They are also different from qualitative traits because of their continuity. Because of the complexity of correlated censored outcomes, few methods have yet been developed. A traditional approach is to impose a parametric joint distribution for the correlated age-at-onset outcomes, which has been criticized for requiring a stringent assumption about the entire distribution of age at onset. The purpose of this paper is to describe a method for assessing familial aggregation of correlated age-at-onset outcomes semiparametrically, by use of estimating equations. This method does not require any parametric assumption for modeling the age at onset. The estimates of parameters, including those quantifying the correlation within families, are consistent and have an asymptotic normal distribution that can be used to make inferences. To illustrate this new method, we analyzed two age-at-onset data sets that were obtained from studies conducted in the States of Washington and Hawaii, with the objective of quantifying the familial aggregations of age at onset of breast cancer.
