Similar articles (20 results)
1.
The aim of the present analysis is to combine evidence for association from the two most commonly used designs in genetic association analysis, the case-control design and the transmission disequilibrium test (TDT) design. The cases here are affected offspring from nuclear families and are used in both the case-control and TDT designs. As a result, inference from these designs is not independent. We applied a simple logistic regression method for combining evidence for association from case-control and TDT designs to single-nucleotide polymorphism data purchased on a region on chromosome 3, replicate 1 of the Aipotu population. Combining the evidence from the case-control and TDT designs yielded a 5-10% reduction in the standard errors of the relative risk estimates. The authors did not know the results before the analyses were conducted.

2.
Widespread multifactor interactions present a significant challenge in determining risk factors of complex diseases. Several combinatorial approaches, such as the multifactor dimensionality reduction (MDR) method, have emerged as a promising tool for better detecting gene-gene (G x G) and gene-environment (G x E) interactions. We recently developed a general combinatorial approach, namely the generalized multifactor dimensionality reduction (GMDR) method, which can entertain both qualitative and quantitative phenotypes and allows for both discrete and continuous covariates to detect G x G and G x E interactions in a sample of unrelated individuals. In this article, we report the development of an algorithm that can be used to study G x G and G x E interactions for family-based designs, called pedigree-based GMDR (PGMDR). Compared to the available method, our proposed method has several major improvements, including allowing for covariate adjustments and being applicable to arbitrary phenotypes, arbitrary pedigree structures, and arbitrary patterns of missing marker genotypes. Our Monte Carlo simulations provide evidence that the PGMDR method is superior in performance to identify epistatic loci compared to the MDR-pedigree disequilibrium test (PDT). Finally, we applied our proposed approach to a genetic data set on tobacco dependence and found a significant interaction between two taste receptor genes (i.e., TAS2R16 and TAS2R38) in affecting nicotine dependence.
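
The core pooling step of plain MDR, on which GMDR and PGMDR build, can be sketched as follows. This is a minimal illustration for unrelated cases and controls only; it does not implement the score-based GMDR or the pedigree-based PGMDR described above. The function name, the default threshold (the overall case:control ratio), and the toy data are assumptions made for the example.

```python
from collections import Counter

def mdr_high_risk_cells(genotypes, is_case, threshold=None):
    """Label multilocus genotype combinations as high risk (plain MDR step).

    genotypes: list of tuples, one multilocus genotype per subject.
    is_case:   list of bools, True for cases and False for controls.
    threshold: case:control ratio above which a cell is high risk;
               defaults to the overall case:control ratio (an assumption).
    Assumes the sample contains at least one control.
    """
    cases = Counter(g for g, y in zip(genotypes, is_case) if y)
    controls = Counter(g for g, y in zip(genotypes, is_case) if not y)
    if threshold is None:
        threshold = sum(cases.values()) / sum(controls.values())
    high_risk = set()
    for cell in set(cases) | set(controls):
        # Compare counts multiplicatively so empty control cells do not divide by zero.
        if cases[cell] >= threshold * controls[cell]:
            high_risk.add(cell)
    return high_risk

# Hypothetical two-locus data: six subjects, three cases and three controls.
toy_genotypes = [(0, 0), (0, 0), (0, 0), (1, 1), (1, 1), (1, 1)]
toy_is_case = [True, False, False, True, True, False]
toy_high_risk = mdr_high_risk_cells(toy_genotypes, toy_is_case)
```

Collapsing the genotype combinations into a single high-risk versus low-risk factor is what reduces the multilocus problem to one dimension; in practice this step is wrapped in cross-validation over candidate locus sets.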

3.
In deriving the efficiency of the stratified design relative to the simple random sample design in survey research, the critical link between the designs is the population analysis of variance. Analogously, the efficiency of the case-control design relative to the cohort design in epidemiologic research can be derived using Bayes' theorem as the essential connection between the designs. Prior information on the odds of the disease is also required. A numerical example using data from Fleiss' text is used to illustrate the result.
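
The role of Bayes' theorem as the link between the two designs can be sketched in odds form: a case-control study estimates exposure probabilities given disease status, and multiplying the prior odds of disease by the resulting likelihood ratio gives the odds of disease given exposure, which is the quantity a cohort study estimates directly. The function name and the numbers below are hypothetical and are not taken from Fleiss' text.

```python
def disease_risk_given_exposure(p_exp_cases, p_exp_controls, prior_odds):
    """Return P(disease | exposed) using Bayes' theorem in odds form.

    p_exp_cases:    P(exposed | diseased), estimable from the cases.
    p_exp_controls: P(exposed | not diseased), estimable from the controls.
    prior_odds:     odds of disease in the population (external information).
    """
    likelihood_ratio = p_exp_cases / p_exp_controls
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)

# Hypothetical inputs: 60% of cases and 30% of controls exposed, prior odds 1:99.
risk = disease_risk_given_exposure(0.6, 0.3, prior_odds=1 / 99)
```

The example makes the dependence on the prior odds explicit: without that external input, case-control data identify only the likelihood ratio, not the absolute risk.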

4.
Lu SE  Wang MC 《Biometrics》2002,58(4):764-772
Cohort case-control design is an efficient and economical design to study risk factors for disease incidence or mortality in a large cohort. In the last few decades, a variety of cohort case-control designs have been developed and theoretically justified. These designs have been exclusively applied to the analysis of univariate failure-time data. In this work, a cohort case-control design adapted to multivariate failure-time data is developed. A risk set sampling method is proposed to sample controls from nonfailures in a large cohort for each case matched by failure time. This method leads to a pseudolikelihood approach for the estimation of regression parameters in the marginal proportional hazards model (Cox, 1972, Journal of the Royal Statistical Society, Series B 34, 187-220), where the correlation structure between individuals within a cluster is left unspecified. The performance of the proposed estimator is demonstrated by simulation studies. A bootstrap method is proposed for inferential purposes. This methodology is illustrated by a data example from a child vitamin A supplementation trial in Nepal (Nepal Nutrition Intervention Project-Sarlahi, or NNIPS).

5.
Tung L  Gordon D  Finch SJ 《Human heredity》2007,63(2):101-110
This paper extends gene-environment (G x E) interaction study designs in which the gene (G) is known and the environmental variable (E) is specified to the analysis of 'time-to-event' data, using Cox proportional hazards (PH) modeling. The objectives are to assess whether a random sample of subjects can be used to detect a specific G x E interaction and to study the sensitivity of the power of PH modeling to genotype misclassification. We find that a random sample of 2,100 is sufficient to detect a moderate G x E interaction. The increase in sample size necessary (SSN) to maintain Type I and Type II error rates is calculated for each of the 6 genotyping errors for both dominant and recessive modes of inheritance (MOI). The increase in SSN required is relatively small when each genotyping error rate is less than 1% and the disease allele frequency is between 0.2 and 0.5. The genotyping errors that require the greatest increase in SSN are any misclassification of a subject without the at-risk genotype as having the at-risk genotype. Such errors require an indefinitely large increase in SSN as the disease allele frequency approaches 0, suggesting that it is especially important that subjects recorded as having the at-risk genotype be correctly genotyped. Additionally, for a dominant MOI, large increases in SSN can occur with large disease allele frequency.

6.
The simplest and most commonly used approach for genetic associations is the case-control study design of unrelated people. This design is susceptible to population stratification. This problem is obviated in family-based studies, but it is usually difficult to accumulate large enough samples of well-characterized families. We addressed empirically whether the two designs give similar estimates of association in 93 investigations where both unrelated case-control and family-based designs had been employed. Estimated odds ratios differed beyond chance between the two designs in only four instances (4%). The summary relative odds ratio (ROR) (the ratio of odds ratios obtained from unrelated case-control and family-based studies) was close to unity (0.96 [95% confidence interval, 0.91-1.01]). There was no heterogeneity in the ROR across studies (amount of heterogeneity beyond chance I² = 0%). Differences on whether results were nominally statistically significant (p < 0.05) or not with the two designs were common (opposite classification rates 14% and 17%); this reflected largely differences in power. Conclusions were largely similar in diverse subgroup analyses. Unrelated case-control and family-based designs give overall similar estimates of association. We cannot rule out rare large biases or common small biases.
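
For a single investigation, the relative odds ratio defined above can be sketched as follows. This is an illustration under stated assumptions, not the authors' meta-analytic procedure: it treats the two odds ratio estimates as independent and uses a Wald interval on the log scale. The function name and inputs are hypothetical.

```python
import math

def relative_odds_ratio(or_case_control, se_log_cc, or_family, se_log_fam, z=1.96):
    """Return (ROR, lower, upper) for ROR = OR_case-control / OR_family.

    se_log_cc and se_log_fam are standard errors of the log odds ratios.
    Assumes the two estimates are independent, so their variances add.
    """
    log_ror = math.log(or_case_control) - math.log(or_family)
    se = math.sqrt(se_log_cc ** 2 + se_log_fam ** 2)
    return (
        math.exp(log_ror),
        math.exp(log_ror - z * se),
        math.exp(log_ror + z * se),
    )

# Hypothetical inputs: identical odds ratios from both designs give ROR = 1.
ror, lower, upper = relative_odds_ratio(1.2, 0.1, 1.2, 0.1)
```

An interval that contains 1 corresponds to the two designs not differing beyond chance for that investigation.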

7.
Genome-wide association studies of gene-environment interaction (G x E GWAS) are becoming popular. As with main-effects GWAS, quantile-quantile plots (QQ-plots) and Genomic Control are being used to assess and correct for population substructure. However, in G x E work these approaches can be seriously misleading, as we illustrate; QQ-plots may give strong indications of substructure when absolutely none is present. Using simulation and theory, we show how and why spurious QQ-plot inflation occurs in G x E GWAS, and how this differs from main-effects analyses. We also explain how simple adjustments to standard regression-based methods used in G x E GWAS can alleviate this problem.

8.
The power of genomic control   Cited by: 16 (self-citations: 0, citations by others: 16)
Although association analysis is a useful tool for uncovering the genetic underpinnings of complex traits, its utility is diminished by population substructure, which can produce spurious association between phenotype and genotype within population-based samples. Because family-based designs are robust against substructure, they have risen to the fore of association analysis. Yet, if population substructure could be ignored, this robustness can come at the price of power. Unfortunately it is rarely evident when population substructure can be ignored. Devlin and Roeder recently have proposed a method, termed "genomic control" (GC), which has the robustness of family-based designs even though it uses population-based data. GC uses the genome itself to determine appropriate corrections for population-based association tests. Using the GC method, we contrast the power of two study designs, family trios (i.e., father, mother, and affected progeny) versus case-control. For analysis of trios, we use the TDT test. When population substructure is absent, we find GC is always more powerful than TDT; furthermore, contrary to previous results, we show that as a disease becomes more prevalent the discrepancy in power becomes more extreme. When population substructure is present, however, the results are more complex: TDT is more powerful when population substructure is substantial, and GC is more powerful otherwise. We also explore general issues of power and implementation of GC within the case-control setting and find that, economically, GC is at least comparable to and often less expensive than family-based methods. Therefore, GC methods should prove a useful complement to family-based methods for the genetic analysis of complex traits.
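
The two test statistics contrasted above can be sketched as follows. This is a minimal illustration, assuming 1-df chi-square association statistics; the function names are hypothetical, and the median-based estimate of the inflation factor with a floor at 1 is one common way of applying GC rather than necessarily the exact procedure used in the paper.

```python
import statistics

# Approximate median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549

def inflation_factor(null_marker_stats):
    """Estimate the GC inflation factor from statistics at presumed-null markers."""
    lam = statistics.median(null_marker_stats) / CHI2_1DF_MEDIAN
    # Never deflate: a factor below 1 is treated as no inflation.
    return max(lam, 1.0)

def gc_corrected(candidate_stat, null_marker_stats):
    """Divide the candidate marker's statistic by the estimated inflation factor."""
    return candidate_stat / inflation_factor(null_marker_stats)

def tdt_statistic(transmitted, untransmitted):
    """McNemar-type TDT statistic from heterozygous-parent transmission counts.

    transmitted:   times the allele of interest was passed to affected offspring.
    untransmitted: times it was not passed on.
    """
    return (transmitted - untransmitted) ** 2 / (transmitted + untransmitted)

# Hypothetical counts: 60 transmissions versus 40 non-transmissions.
tdt = tdt_statistic(60, 40)
```

Both statistics are referred to a chi-square distribution with 1 degree of freedom, which is what makes a power comparison between the trio and case-control designs meaningful.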

9.
Variations in the incidence and prevalence of tobacco-related cancers reflect differences in tobacco consumption in addition to genetic factors. Besides, genes related to lung cancer risk could be related to smoking behavior. Polymorphisms altering DNA repair capacity may lead to synergistic effects with tobacco carcinogen-induced lung cancer risk. Common problems in genetic association studies, such as presence of gene-by-environment (G x E) correlation in the population, may reduce the validity of these designs. The main purpose of this study was to evaluate the independence assumption for selected SNPs and smoking behaviour in a cohort of 320 healthy Spanish smokers. We found an association between the wild type alleles of XRCC3 Thr241Met or KLC3 Lys751Gln and greater smoking intensity (OR = 12.98, 95% CI = 2.86-58.82 and OR = 16.90, 95% CI = 2.09-142.8, respectively). Although preliminary, the results of our study provide evidence that genetic variations in DNA-repair genes may influence both smoking habits and the development of lung cancer. Population-specific G x E studies should be carried out when genetic and environmental factors interact to cause the disease.

10.
A mapping population of 104 F₃ lines of pearl millet, derived from a cross between two inbred lines H 77/833-2 x PRLT 2/89-33, was evaluated, as testcrosses on a common tester, for traits determining grain and stover yield in seven different field trials, distributed over 3 years and two seasons. The total genetic variation was partitioned into effects due to season (S), genotype (G), genotype x season interaction (G x S), and genotype x environment-within-season interaction [G x E(S)]. QTLs were determined for traits for their G, G x S, and G x E(S) effects, to assess the magnitude and the nature (crossover/non-crossover) of environmental interaction effects on individual QTLs. QTLs for some traits were associated with G effects only, while others were associated with the effects of both G and G x S and/or G, G x S and G x E(S) effects. The major G x S QTLs detected were for flowering time (on LG 4 and LG 6), and mapped to the same intervals as G x S QTLs for several other traits (including stover yield, harvest index, biomass yield and panicle number m⁻²). However, all three QTLs detected for grain yield were unaffected by G x S interaction. All three QTLs for stover yield (mapping on LG 2, LG 4 and LG 6) and one of the three QTLs for grain yield (mapping on LG 4) were also free of QTL x E(S) interactions. The grain yield QTLs that were affected by QTL x E(S) interactions (mapping on LG 2 and LG 6) appeared to be linked to parallel QTL x E(S) interactions of the QTLs for panicle number m⁻² (LG 2) and of QTLs for both panicle number m⁻² and harvest index (LG 6). In general, QTL x E(S) interactions were more frequently observed for component traits of grain and stover yield than for grain or stover yield per se.

11.
We present a class of likelihood-based score statistics that accommodate genotypes of both unrelated individuals and families, thereby combining the advantages of case-control and family-based designs. The likelihood extends the one proposed by Schaid and colleagues (Schaid and Sommer 1993, 1994; Schaid 1996; Schaid and Li 1997) to arbitrary family structures with arbitrary patterns of missing data and to dense sets of multiple markers. The score statistic comprises two component test statistics. The first component statistic, the nonfounder statistic, evaluates disequilibrium in the transmission of marker alleles from parents to offspring. This statistic, when applied to nuclear families, generalizes the transmission/disequilibrium test to arbitrary numbers of affected and unaffected siblings, with or without typed parents. The second component statistic, the founder statistic, compares observed or inferred marker genotypes in the family founders with those of controls or those of some reference population. The founder statistic generalizes the statistics commonly used for case-control data. The strengths of the approach include both the ability to assess, by comparison of nonfounder and founder statistics, the potential bias resulting from population stratification and the ability to accommodate arbitrary family structures, thus eliminating the need for many different ad hoc tests. A limitation of the approach is the potential power loss and/or bias resulting from inappropriate assumptions on the distribution of founder genotypes. The systematic likelihood-based framework provided here should be useful in the evaluation of both the relative merits of case-control and various family-based designs and the relative merits of different tests applied to the same design. It should also be useful for genotype-disease association studies done with the use of a dense set of multiple markers.

12.
Methods for the analysis of unmatched case-control data based on a finite population sampling model are developed. Under this model, and the prospective logistic model for disease probabilities, a likelihood for case-control data that accommodates very general sampling of controls is derived. This likelihood has the form of a weighted conditional logistic likelihood. The flexibility of the methods is illustrated by providing a number of control sampling designs and a general scheme for their analyses. These include frequency matching, counter-matching, case-base, randomized recruitment, and quota sampling. A study of risk factors for childhood asthma illustrates an application of the counter-matching design. Some asymptotic efficiency results are presented and computational methods discussed. Further, it is shown that a 'marginal' likelihood provides a link to unconditional logistic methods. The methods are examined in a simulation study that compares frequency and counter-matching using conditional and unconditional logistic analyses; the results indicate that the conditional logistic likelihood has superior efficiency. Extensions that accommodate sampling of cases and multistage designs are presented. Finally, we compare the analysis methods presented here to other approaches, compare counter-matching and two-stage designs, and suggest areas for further research.

13.
Although genetic association studies using unrelated individuals may be subject to bias caused by population stratification, alternative methods that are robust to population stratification, such as family-based association designs, may be less powerful. Furthermore, it is often more feasible and less expensive to collect unrelated individuals. Recently, several statistical methods have been proposed for case-control association tests in a structured population; these methods may be robust to population stratification. In the present study, we propose a quantitative similarity-based association test (QSAT) to identify association between a candidate marker and a quantitative trait of interest, through use of unrelated individuals. For the QSAT, we first determine whether two individuals are from the same subpopulation or from different subpopulations, using genotype data at a set of independent markers. We then perform an association test between the candidate marker and the quantitative trait, through incorporation of such information. Simulation results based on either coalescent models or empirical population genetics data show that the QSAT has a correct type I error rate in the presence of population stratification and that the power of the QSAT is higher than that of family-based association designs.

14.
Huang Y  Pepe MS 《Biometrika》2009,96(4):991-997
The performance of a well-calibrated risk model for a binary disease outcome can be characterized by the population distribution of risk and displayed with the predictiveness curve. Better performance is characterized by a wider distribution of risk, since this corresponds to better risk stratification in the sense that more subjects are identified at low and high risk for the disease outcome. Although methods have been developed to estimate predictiveness curves from cohort studies, most studies to evaluate novel risk prediction markers employ case-control designs. Here we develop semiparametric methods that accommodate case-control data. The semiparametric methods are flexible, and naturally generalize methods previously developed for cohort data. Applications to prostate cancer risk prediction markers illustrate the methods.

15.
Many existing cohort studies initially designed to investigate disease risk as a function of environmental exposures have collected genomic data in recent years with the objective of testing for gene-environment interaction (G × E) effects. In environmental epidemiology, interest in G × E arises primarily after a significant effect of the environmental exposure has been documented. Cohort studies often collect rich exposure data; as a result, assessing G × E effects in the presence of multiple exposure markers further increases the burden of multiple testing, an issue already present in both genetic and environmental health studies. Latent variable (LV) models have been used in environmental epidemiology to reduce dimensionality of the exposure data, gain power by reducing multiplicity issues via condensing exposure data, and avoid collinearity problems due to the presence of multiple correlated exposures. We extend the LV framework to characterize gene-environment interaction in the presence of multiple correlated exposures and genotype categories. Further, similar to what has been done in case-control G × E studies, we use the assumption of gene-environment (G-E) independence to boost the power of tests for interaction. Neither the consequences of making this assumption nor the issue of how to explicitly model G-E association has previously been investigated in LV models. We postulate a hierarchy of assumptions about the LV model regarding the different forms of G-E dependence and show that making such assumptions may influence inferential results on the G, E, and G × E parameters. We implement a class of shrinkage estimators that trade off, in a data-adaptive way, between the most restrictive and the most flexible forms of the G-E dependence assumption, and note that such a class of compromise estimators can serve as a benchmark of model adequacy in LV models. We demonstrate the methods with an example from the Early Life Exposures in Mexico City to Neuro-Toxicants Study of lead exposure, iron metabolism genes, and birth weight.

16.
Efficiency of cohort sampling designs: some surprising results.   Cited by: 3 (self-citations: 0, citations by others: 3)
B Langholz  D C Thomas 《Biometrics》1991,47(4):1563-1571
Cohort sampling designs are proposed which one would intuitively expect to be more efficient than nested case-control sampling. Two of these designs start with a nested case-control sample and distribute controls to sampled risk sets other than those for which they were picked. The third design has the goal of maximizing the number of distinct persons in a nested case-control sample. Simulation results show surprisingly little gain, and more often a loss in efficiency of these new designs relative to nested case-control sampling. This is due to the sampling-induced covariance between score terms. We conclude that the often stated intuition that nested case-control sampling does not make good use of sampled individuals' covariate histories is false.

17.
Designs for synthetic case-control studies in open cohorts   Cited by: 3 (self-citations: 0, citations by others: 3)
Several designs are proposed for case-control studies within cohorts when the cohort is open to late entry. These and previously proposed designs are examined with respect to consistency and efficiency of relative risk parameter estimation, and a small simulation study is reported. If study costs increase in proportion to the total number of "at-risk" controls, the most efficient design, Design C, is as follows. For a case failing at time t, controls are selected at random (and without regard to "at-risk" status) from among cohort members who are (i) known not to have failed prior to t and (ii) have not been previously selected as controls. At each t, control sampling proceeds until a prespecified number of controls who are "at risk" at t have been obtained. The efficiency advantage of Design C over that of the standard case-control design proposed by Thomas (in Appendix to Liddell, McDonald, and Thomas, 1977, Journal of the Royal Statistical Society, Series A 140, 469-490) will often be small. If, on the other hand, the costs increase in proportion to the number of distinct "at-risk" controls, Design C is no longer the most efficient design. In this case, several alternative designs are proposed.

18.
Wang Yan  Zhang Jun  Huang Qingyang 《遗传》 (Hereditas (Beijing)) 2008,30(6):711-715
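Using two designs, case-family control and random case-control, we analyzed the association between the single-nucleotide polymorphism (SNP) rs13061862 (T45G) of the adiponectin gene (APM1) and type 2 diabetes in the Han population of Hubei in 603 samples. Across all samples, the frequencies of the G allele and the GG genotype were significantly higher in patients with type 2 diabetes than in normal subjects (G: 42.0% vs. 21.7%, P<0.001; GG: 13.6% vs. 4.5%, P=0.032). In the 180 case-family control samples, the GG genotype frequency was significantly higher in patients with type 2 diabetes than in controls (GG: 17.8% vs. 5.6%, P=0.011). In the 423 random case-control samples, the GG genotype frequency was also significantly higher in patients than in controls (GG: 12.2% vs. 3.9%, P=0.025). Univariate logistic regression analysis showed that the GG genotype is a risk factor for type 2 diabetes (OR=3.58, 95% CI=1.70-7.54). These results indicate that the T45G SNP of the adiponectin gene is associated with the occurrence and development of type 2 diabetes in the Han population of Hubei, and that the GG genotype is a genetic risk factor for type 2 diabetes in the Han population of Hubei, China.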
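
An odds ratio with a 95% confidence interval of the kind reported above can be computed from a 2 x 2 table as sketched below. This is the unadjusted counterpart of a univariate logistic regression with one binary predictor, using Woolf's logit interval. The function name is hypothetical and the counts are invented for illustration; they are not the study's data.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Return (OR, lower, upper) from a 2 x 2 table using Woolf's logit method.

    a: cases with the risk genotype      b: cases without it
    c: controls with the risk genotype   d: controls without it
    Assumes all four counts are positive.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts: 20 of 100 cases and 10 of 100 controls carry the genotype.
or_hat, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
```

A lower confidence limit above 1, as in the interval reported in the abstract, indicates an association that is significant at the 5% level.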

19.
There are two common designs for association mapping of complex diseases: case-control and family-based designs. A case-control sample is more powerful to detect genetic effects than a family-based sample that contains the same numbers of affected and unaffected persons, although additional markers may be required to control for spurious association. When family and unrelated samples are available, statistical analyses are often performed in the family and unrelated samples separately, conditioning on parental information for the former, thus resulting in reduced power. In this report, we propose a unified approach that can incorporate both family and case-control samples and, provided the additional markers are available, at the same time corrects for population stratification. We apply the principal components of a marker matrix to adjust for the effect of population stratification. This unified approach makes it unnecessary to perform a conditional analysis of the family data and is more powerful than the separate analyses of unrelated and family samples, or a meta-analysis performed by combining the results of the usual separate analyses. This property is demonstrated in both a variety of simulation models and empirical data. The proposed approach can be equally applied to the analysis of both qualitative and quantitative traits.

20.
On the design of synthetic case-control studies   Cited by: 6 (self-citations: 0, citations by others: 6)
R L Prentice 《Biometrics》1986,42(2):301-310
A design is proposed for "case-control within cohort" studies. In this design, controls are sampled without replacement from failure-free members of the cohort at each distinct failure time. Upon selection, a subject ceases to be eligible for control selection at later failure times. Also, if a subject failing at time t had been selected as a control at t' less than t, then the matched controls at t are selected to have also been at risk at t'. In these circumstances correlation exists between score statistic contributions at t and t'. An estimator is developed for this correlation. A small simulation study compares the design just described to other possible synthetic case-control designs.
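
The first two sampling rules of this design can be sketched as follows. This is a simplified illustration under stated assumptions: every subject is under observation from time zero, there is no censoring or late entry, and the additional requirement on controls matched to a case that was itself an earlier control is not implemented. The function name, the `seed` parameter, and the toy cohort are hypothetical.

```python
import random

def sample_controls(failure_times, n_controls, seed=0):
    """Sample controls without replacement at each distinct failure time.

    failure_times: dict mapping subject id to failure time (None if no failure).
    n_controls:    number of controls to draw at each failure time.
    Returns a dict mapping each failure time to the list of sampled control ids.
    """
    rng = random.Random(seed)
    already_used = set()
    selected = {}
    distinct_times = sorted({t for t in failure_times.values() if t is not None})
    for t in distinct_times:
        # Eligible: still failure-free at t and never selected as a control before.
        eligible = sorted(
            subject
            for subject, fail_time in failure_times.items()
            if (fail_time is None or fail_time > t) and subject not in already_used
        )
        picks = rng.sample(eligible, min(n_controls, len(eligible)))
        already_used.update(picks)
        selected[t] = picks
    return selected

# Hypothetical cohort of six subjects, three of whom fail.
toy_cohort = {"a": 1.0, "b": 2.0, "c": None, "d": None, "e": None, "f": 3.0}
toy_sample = sample_controls(toy_cohort, n_controls=2)
```

Because a selected control can still fail later and enter the sample as a case, the score contributions at different failure times are not independent, which is the correlation the abstract refers to.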
