1.

Robust genomic control for association studies

Zheng G Freidlin B Gastwirth JL 《American journal of human genetics》2006,78(2):350-356

Population-based case-control studies are a useful method to test for a genetic association between a trait and a marker. However, the analysis of the resulting data can be affected by population stratification or cryptic relatedness, which may inflate the variance of the usual statistics, resulting in a higher-than-nominal rate of false-positive results. One approach to preserving the nominal type I error is to apply genomic control, which adjusts the variance of the Cochran-Armitage trend test by calculating the statistic on data from null loci. This enables one to estimate any additional variance in the null distribution of statistics. When the underlying genetic model (e.g., recessive, additive, or dominant) is known, genomic control can be applied to the corresponding optimal trend tests. In practice, however, the mode of inheritance is unknown. The genotype-based χ² test for a general association between the trait and the marker does not depend on the underlying genetic model. Since this general association test has 2 degrees of freedom (df), the existing formulas for estimating the variance factor by use of genomic control are not directly applicable. By expressing the general association test in terms of two Cochran-Armitage trend tests, one can apply genomic control to each of the two trend tests separately, thereby adjusting the χ² statistic. The properties of this robust genomic control test with 2 df are examined by simulation. This genomic control-adjusted 2-df test has control of type I error and achieves reasonable power, relative to the optimal tests for each model.

2.

Robust tests for matched case-control genetic association studies

Yong Zang Wing Kam Fung 《BMC genetics》2010,11(1):1-14

Background

Infectious disease of livestock continues to be a cause of substantial economic loss and has adverse welfare consequences in both the developing and developed world. New solutions to control disease are needed and research focused on the genetic loci determining variation in immune-related traits has the potential to deliver solutions. However, identifying selectable markers and the causal genes involved in disease resistance and vaccine response is not straightforward. The aims of this study were to locate regions of the bovine genome that control the immune response post immunisation. 195 F2 and backcross Holstein Charolais cattle were immunised with a 40-mer peptide derived from foot-and-mouth disease virus (FMDV). T cell and antibody (IgG1 and IgG2) responses were measured at several time points post immunisation. All experimental animals (F0, F1 and F2, n = 982) were genotyped with 165 microsatellite markers for the genome scan.

Results

Considerable variability in the immune responses across time was observed, and sire, dam and age had significant effects on responses at specific time points. There were significant correlations within traits across time and between IgG1 and IgG2 traits; some weak correlations were also detected between T cell and IgG2 responses. The whole genome scan detected 77 quantitative trait loci (QTL), on 22 chromosomes, including clusters of QTL on BTA 4, 5, 6, 20, 23 and 25. Two QTL reached 5% genome wide significance (on BTA 6 and 24) and one on BTA 20 reached 1% genome wide significance.

Conclusions

A proportion of the variance in the T cell and antibody response post immunisation with an FMDV peptide has a genetic component. Even though the antigen was relatively simple, the humoral and cell mediated responses were clearly under complex genetic control, with the majority of QTL located outside the MHC locus. The results suggest that there may be specific genes or loci that impact on variation in both the primary and secondary immune responses, whereas other loci may be specifically important for early or later phases of the immune response. Future fine mapping of the QTL clusters identified has the potential to reveal the causal variations underlying the variation in immune response observed.

3.

Robust trend tests for genetic association in case-control studies using family data

Tian X Joo J Zheng G Lin JP 《BMC genetics》2005,6(Z1):S107

We studied a trend test for genetic association between disease and the number of risk alleles using case-control data. When the data are sampled from families, this trend test can be adjusted to take into account the correlations among family members in complex pedigrees. However, the test depends on the scores based on the underlying genetic model and thus it may have substantial loss of power when the model is misspecified. Since the mode of inheritance will be unknown for complex diseases, we have developed two robust trend tests for case-control studies using family data. These robust tests have relatively good power for a class of possible genetic models. The trend tests and robust trend tests were applied to a dataset of Genetic Analysis Workshop 14 from the Collaborative Study on the Genetics of Alcoholism.

4.

MAX-rank: a simple and robust genome-wide scan for case-control association studies

Li Q Yu K Li Z Zheng G 《Human genetics》2008,123(6):617-623

In genome-wide association studies (GWAS), single-marker analysis is usually employed to identify the most significant single nucleotide polymorphisms (SNPs). The trend test has been proposed for analysis of case-control association. Three trend tests, optimal for the recessive, additive and dominant models respectively, are available. When the underlying genetic model is unknown, the maximum of the three trend test results (MAX) has been shown to be robust against genetic model misspecification. Since the asymptotic distribution of MAX depends on the allele frequency of the SNP, ranking by the P-value of MAX may differ from ranking by the MAX statistic. Calculating the P-value of MAX for 300,000 (300 K) or more SNPs is computationally intensive, and the software needed to obtain the P-value of MAX is not widely available. On the other hand, the MAX statistic is very easy to calculate without complex computer programs. Thus, we study whether or not one could use the MAX statistic instead of its P-value to rank SNPs in GWAS. The approaches using the MAX and its P-value to rank SNPs are referred to as MAX-rank and P-rank. By applying MAX-rank and P-rank to simulated datasets and four real datasets from GWAS, we found the ranks of SNPs with true association are very similar using both approaches. Thus, we recommend using MAX-rank for genome-wide scans. After the top-ranked SNPs are identified, their P-values based on MAX can be calculated and compared with the significance level. The work of Q. Li was partially supported by the Knowledge Innovation Program of the Chinese Academy of Sciences, No. 30465W0 and 30475V0. The research of Z Li was partially sponsored by NIH grant EY014478.
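The MAX statistic described in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' software: it assumes the standard Cochran-Armitage trend statistic with scores (0, 0, 1), (0, 0.5, 1) and (0, 1, 1) for the recessive, additive and dominant models, genotype counts ordered by number of risk alleles (0, 1, 2), and tables in which every score vector has nonzero variance. The names `trend_z`, `max_stat` and `max_rank` are hypothetical.

```python
from math import sqrt

# Scores for genotypes carrying 0, 1, 2 risk alleles (assumed ordering).
SCORES = {
    "recessive": (0.0, 0.0, 1.0),
    "additive": (0.0, 0.5, 1.0),
    "dominant": (0.0, 1.0, 1.0),
}


def trend_z(cases, controls, scores):
    """Cochran-Armitage trend statistic Z for a 2x3 genotype table."""
    r_tot, s_tot = sum(cases), sum(controls)
    n_tot = r_tot + s_tot
    totals = [r + s for r, s in zip(cases, controls)]
    # Score-weighted difference between case and control genotype counts.
    num = sum(x * (s_tot * r - r_tot * s)
              for x, r, s in zip(scores, cases, controls))
    sx = sum(x * n for x, n in zip(scores, totals))
    sxx = sum(x * x * n for x, n in zip(scores, totals))
    var = r_tot * s_tot * (n_tot * sxx - sx * sx)  # assumed to be > 0
    return sqrt(n_tot) * num / sqrt(var)


def max_stat(cases, controls):
    """MAX: largest absolute trend statistic over the three models."""
    return max(abs(trend_z(cases, controls, x)) for x in SCORES.values())


def max_rank(tables):
    """Rank SNP ids by MAX, largest first. tables: id -> (cases, controls)."""
    return sorted(tables, key=lambda k: max_stat(*tables[k]), reverse=True)
```

Ranking by `max_stat` needs no P-value computation, which is the practical point the abstract makes; P-values would then be computed only for the top-ranked SNPs.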

5.

Combining association tests across multiple genetic markers in case-control studies

Zhou H Wei LJ Xu X Xu X 《Human heredity》2008,65(3):166-174

In the search to detect genetic associations between complex traits and DNA variants, a common practice is to select a subset of Single Nucleotide Polymorphisms (tag SNPs) in a gene or chromosomal region of interest. This allows study of untyped polymorphisms in this region through the phenomenon of linkage disequilibrium (LD). However, it is crucial in the analysis to utilize such multiple SNP markers efficiently. In this study, we present a robust testing approach (T(C)) that combines single marker association test statistics or p values. This combination is based on the summation of single test statistics or p values, giving greater weight to those with lower p values. We compared the power of T(C) in identifying common trait loci, using tag SNPs within the same haplotype block in which the trait loci reside, with that of competing published tests, in case-control settings. These competing tests included the Bonferroni procedure (T(B)), the simple permutation procedure (T(P)), the permutation procedure proposed by Hoh et al. (T(P-H)) and its revised version using 'deflated' statistics (T(P-H_def)), the traditional χ² procedure (T(CHI)), the regression procedure (Hotelling T² test) (T(R)) and the haplotype-based test (T(H)). Results of these comparisons show that our proposed combining procedure (T(C)) is preferred in all scenarios examined. We also apply this new test to a data set from a previously reported association study on airway responsiveness to methacholine.

6.

Data quality control in genetic case-control association studies

Anderson CA Pettersson FH Clarke GM Cardon LR Morris AP Zondervan KT 《Nature protocols》2010,5(9):1564-1573

This protocol details the steps for data quality assessment and control that are typically carried out during case-control association studies. The steps described involve the identification and removal of DNA samples and markers that introduce bias. These critical steps are paramount to the success of a case-control study and are necessary before statistically testing for association. We describe how to use PLINK, a tool for handling SNP data, to perform assessments of failure rate per individual and per SNP and to assess the degree of relatedness between individuals. We also detail other quality-control procedures, including the use of SMARTPCA software for the identification of ancestral outliers. These platforms were selected because they are user-friendly, widely used and computationally efficient. Steps needed to detect and establish a disease association using case-control data are not discussed here. Issues concerning study design and marker selection in case-control studies have been discussed in our earlier protocols. This protocol, which is routinely used in our labs, should take approximately 8 h to complete.

7.

Issues in association analysis: error control in case-control association studies for disease gene discovery

Ott J 《Human heredity》2004,58(3-4):171-174

Several sources of errors are discussed. While genotyping errors have little effect on power in case-control association studies, they tend to strongly increase false positive results in TDT type tests unless occurrence of errors is allowed for in the analysis (e.g., TDTae test). Disregarding non-genetic risk factors is shown to lead to a form of hidden heterogeneity, which can strongly reduce power. Stratification of data into more homogeneous subgroups is advocated as a simple solution to allowing for non-genetic risk factors such as socio-economic status and food preferences.

8.

Statistical tests of genetic association for case-control study designs

Wang K 《Biostatistics (Oxford, England)》2012,13(4):724-733

The central theme in case-control genetic association studies is to efficiently identify genetic markers associated with trait status. Powerful statistical methods are critical to accomplishing this goal. A popular method is the omnibus Pearson's chi-square test applied to genotype counts. To achieve increased power, tests based on an assumed trait model have been proposed. However, they are not robust to model misspecification. Much research has been carried out on enhancing robustness of such model-based tests. An analysis framework that tests the equality of allele frequency while allowing for different deviation from Hardy-Weinberg equilibrium (HWE) between cases and controls is proposed. The proposed method requires neither specification of trait models nor HWE. It involves only 1 degree of freedom. The likelihood ratio statistic, score statistic, and Wald statistic associated with this framework are introduced. Their performance is evaluated by extensive computer simulation in comparison with existing methods.

9.

Asymptotic distribution for epistatic tests in case-control studies (total citations: 1; self-citations: 0; citations by others: 1)

Liu T Thalamuthu A Liu JJ Chen C Wang Z Wu R 《Genomics》2011,98(2):145-151

We propose a statistical model for dissecting a multilocus genotypic value into its main (additive and dominant) effects and epistatic effects between different loci in a case-control association study. The model can discern four different kinds of epistasis, additive × additive, additive × dominant, dominant × additive, and dominant × dominant interactions. To test each kind of epistasis, a χ² test statistic was computed for a two by two contingency table derived from combined genotypes in both case and control groups. We derived an analytical approach for estimating the asymptotic distribution of the χ² test statistic for epistatic tests under the null hypothesis, with the result being consistent with that from Monte Carlo simulations. The new model was used to analyze a case-control data set for candidate gene studies of stroke, leading to the identification of several significant interactions between causal SNPs on this disease.

10.

Nonlinear tests for genomewide association studies

Zhao J Jin L Xiong M 《Genetics》2006,174(3):1529-1538

As millions of single-nucleotide polymorphisms (SNPs) have been identified and high-throughput genotyping technologies have been rapidly developed, large-scale genomewide association studies are soon within reach. However, since a genomewide association study involves a large number of SNPs, it is nearly impossible to ensure a genomewide significance level of 0.05 using the available statistics; the use of tagging SNPs alleviates the multiple-test problem, but not sufficiently. One strategy to circumvent the multiple-test problem associated with genome-wide association tests is to develop novel test statistics with high power. In this report, we introduce several nonlinear tests, which are based on nonlinear transformation of allele or haplotype frequencies. We investigate the power of the nonlinear test statistics and demonstrate that under certain conditions, some nonlinear test statistics have much higher power than the standard χ² test statistic. Type I error rates of the nonlinear tests are validated using simulation studies. We also show that a class of similarity measure-based test statistics is based on the quadratic function of allele or haplotype frequencies, and thus they belong to nonlinear tests. To evaluate their performance, the nonlinear test statistics are also applied to three real data sets. Our study shows that nonlinear test statistics have great potential in association studies of complex diseases.

11.

On robust estimation in logistic case-control studies (total citations: 1; self-citations: 0; citations by others: 1)

WANG C. Y.; CARROLL R. J. 《Biometrika》1993,80(1):237-241

12.

A robust method for testing association in genome-wide association studies

Chen Z Ng HK 《Human heredity》2012,73(1):26-34

In genetic association studies, due to the varying underlying genetic models, no single statistical test can be the most powerful test under all situations. Current studies show that if the underlying genetic models are known, trend-based tests, which outperform the classical Pearson χ² test, can be constructed. However, when the underlying genetic models are unknown, the χ² test is usually more robust than trend-based tests. In this paper, we propose a new association test based on a generalized genetic model, namely the generalized order-restricted relative risks model. Through a Monte Carlo simulation study, we show that the proposed association test is generally more powerful than the χ² test, and more robust than those trend-based tests. The proposed methodologies are also illustrated by some real SNP datasets.

13.

Complexity and power in case-control association studies (total citations: 12; self-citations: 0; citations by others: 12)

Longmate JA 《American journal of human genetics》2001,68(5):1229-1237

A general method is described for estimation of the power and sample size of studies relating a dichotomous phenotype to multiple interacting loci and environmental covariates. Either a simple case-control design or more complex stratified sampling may be used. The method can be used to design individual studies, to evaluate the power of alternative test statistics for complex traits, and to examine general questions of study design through explicit scenarios. The method is used here to study how the power of association tests is affected by problems of allelic heterogeneity and to investigate the potential role for collective testing of sets of related candidate genes in the presence of locus heterogeneity. The results indicate that allele-discovery efforts are crucial and that omnibus tests or collective testing of alleles can be substantially more powerful than separate testing of individual allelic variants. Joint testing of multiple candidate loci can also dramatically improve power, despite model misspecification and inclusion of irrelevant loci, but requires an a priori hypothesis defining the set of loci to investigate.

14.

Case-control association studies in mixed populations: correcting using genomic control

Shmulewitz D Zhang J Greenberg DA 《Human heredity》2004,58(3-4):145-153

OBJECTIVE: Case-control association studies in mixed populations can result in spurious disease-marker associations if subpopulation disease prevalence and marker frequencies both differ. Genomic control (GC) uses neutral loci to correct for spurious association (due to population stratification), but how well this works remains undetermined. METHODS: We simulated and mixed populations with different disease and marker frequencies but without marker-disease association. We generated case-control datasets, calculated the χ² for disease association with each marker, and applied two GC procedures: dividing by the mean χ², or dividing by the median χ² over 0.456. RESULTS: Corrections became conservative (false positive rate [FPR] <5%) with increasing subpopulation prevalence and marker differences. The mean correction resulted in FPRs close to 5% at average subpopulation allele frequency differences <0.26, but inclusion of just a few markers with large frequency differences resulted in conservative FPRs. FPRs from the median correction were mostly conservative but became anticonservative when a few markers with large frequency differences were included. CONCLUSION: GC can either lead to a notable loss of power to detect a true association (conservative) in many circumstances or fail to eliminate the spurious associations (anticonservative). The mean correction factor is useful in certain situations to correct population stratification, but it is difficult to know when those situations exist.
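The two GC corrections named in the METHODS can be written down directly. This is a minimal sketch under stated assumptions, not the authors' code: `null_chi2` is taken to be a list of 1-df χ² statistics computed at neutral (null) markers, the constant 0.456 is the one quoted in the abstract, and the inflation factor is deliberately not floored at 1 because the abstract does not say whether it should be. The function names are hypothetical.

```python
from statistics import mean, median


def gc_inflation(null_chi2, method="median"):
    """Estimate the GC inflation factor from chi-square values at null markers."""
    if method == "mean":
        return mean(null_chi2)
    if method == "median":
        # 0.456 is the constant quoted in the abstract for the median correction.
        return median(null_chi2) / 0.456
    raise ValueError("method must be 'mean' or 'median'")


def gc_adjust(chi2, null_chi2, method="median"):
    """Divide a candidate marker's chi-square by the estimated inflation factor."""
    return chi2 / gc_inflation(null_chi2, method)
```

Because a median is insensitive to a few extreme values while a mean is not, the two factors can diverge when a handful of null markers have large frequency differences, which is the situation the RESULTS single out.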

15.

Testing for population subdivision and association in four case-control studies (total citations: 13; self-citations: 0; citations by others: 13)

Ardlie KG Lunetta KL Seielstad M 《American journal of human genetics》2002,71(2):304-311

Population structure has been presumed to cause many of the unreplicated disease-marker associations reported in the literature, yet few actual case-control studies have been evaluated for the presence of structure. Here, we examine four moderate case-control samples, comprising 3,472 individuals, to determine if detectable population subdivision is present. The four population samples include: 500 U.S. whites and 236 African Americans with hypertension; and 500 U.S. whites and 500 Polish whites with type 2 diabetes, all with matched control subjects. Both diabetes populations were typed for the PPARγ Pro12Ala polymorphism, to replicate this well-supported association (Altshuler et al. 2000). In each of the four samples, we tested for structure, using the sum of the case-control allele frequency χ² statistics for 9 STR and 35 SNP markers (Pritchard and Rosenberg 1999). We found weak evidence for population structure in the African American sample only, but further refinement of the sample, to include only individuals with U.S.-born parents and grandparents, eliminated the stratification. Our examples provide insight into the factors affecting the replication of association studies and suggest that carefully matched, moderate-sized case-control samples in cosmopolitan U.S. and European populations are unlikely to contain levels of structure that would result in significantly inflated numbers of false-positive associations. We explore the role that extreme differences in power among studies, due to sample size and risk-allele frequency differences, may play in the replication problem.

16.

Experimental designs for robust detection of effects in genome-wide case-control studies

Ball RD 《Genetics》2011,189(4):1497-1514

In genome-wide association studies hundreds of thousands of loci are scanned in thousands of cases and controls, with the goal of identifying genomic loci underpinning disease. This is a challenging statistical problem requiring strong evidence. Only a small proportion of the heritability of common diseases has so far been explained. This "dark matter of the genome" is a subject of much discussion. It is critical to have experimental design criteria that ensure that associations between genomic loci and phenotypes are robustly detected. To ensure associations are robustly detected we require good power (e.g., 0.8) and sufficiently strong evidence [i.e., a high Bayes factor (e.g., 10⁶, meaning the data are 1 million times more likely if the association is real than if there is no association)] to overcome the low prior odds for any given marker in a genome scan to be associated with a causal locus. Power calculations are given for determining the sample sizes necessary to detect effects with the required power and Bayes factor for biallelic markers in linkage disequilibrium with causal loci in additive, dominant, and recessive genetic models. Significantly stronger evidence and larger sample sizes are required than indicated by traditional hypothesis tests and power calculations. Many reported putative effects are not robustly detected and many effects including some large moderately low-frequency effects may remain undetected. These results may explain the dark matter in the genome. The power calculations have been implemented in R and will be available in the R package ldDesign.
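The role of the Bayes factor in overcoming low prior odds follows from the identity posterior odds = Bayes factor × prior odds. The sketch below only illustrates that arithmetic; it is not taken from ldDesign, the function name is hypothetical, and the prior odds used in the usage note are an assumed value, not a figure from the abstract.

```python
def posterior_probability(bayes_factor, prior_odds):
    """Posterior probability of association from a Bayes factor and prior odds."""
    posterior_odds = bayes_factor * prior_odds
    return posterior_odds / (1.0 + posterior_odds)
```

With the Bayes factor of 10⁶ quoted in the abstract and hypothetical prior odds of 1 in 100,000, `posterior_probability(1e6, 1e-5)` gives posterior odds of 10, a posterior probability of about 0.91. A Bayes factor of 10³ at the same prior odds would leave the posterior probability near 0.01, which illustrates why such strong evidence is asked for.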

17.

Genomic control for association studies (total citations: 96; self-citations: 0; citations by others: 96)

Devlin B Roeder K 《Biometrics》1999,55(4):997-1004

A dense set of single nucleotide polymorphisms (SNP) covering the genome and an efficient method to assess SNP genotypes are expected to be available in the near future. An outstanding question is how to use these technologies efficiently to identify genes affecting liability to complex disorders. To achieve this goal, we propose a statistical method that has several optimal properties: It can be used with case control data and yet, like family-based designs, controls for population heterogeneity; it is insensitive to the usual violations of model assumptions, such as cases failing to be strictly independent; and, by using Bayesian outlier methods, it circumvents the need for Bonferroni correction for multiple tests, leading to better performance in many settings while still constraining risk for false positives. The performance of our genomic control method is quite good for plausible effects of liability genes, which bodes well for future genetic analyses of complex disorders.

18.

Designing candidate gene and genome-wide case-control association studies

Zondervan KT Cardon LR 《Nature protocols》2007,2(10):2492-2501

This protocol describes how to appropriately design a genetic association case-control study, either focusing on a candidate gene (CG) or region or implementing a genome-wide approach. The steps described involve: (i) defining the case phenotype in adequate detail; (ii) checking the heritability of the disease in question; (iii) considering whether a population-based study is the appropriate design for the research question; (iv) the appropriate selection of controls; (v) sample size calculations and (vi) giving due consideration to whether it is a de novo or replication study. General guidelines are given, as well as specific examples of a CG and a genome-wide association study into type 2 diabetes. Software and websites used in this protocol include the International HapMap Consortium website, Genetic Power Calculator, CaT, and SNPSpD. Running each of the programs takes only a few seconds; the rate-limiting steps involve thinking through the designs and parameters in the disease models.

19.

Genetic model selection in two-phase analysis for case-control association studies (total citations: 1; self-citations: 0; citations by others: 1)

Zheng G Ng HK 《Biostatistics (Oxford, England)》2008,9(3):391-399

The Cochran-Armitage trend test (CATT) is well suited for testing association between a marker and a disease in case-control studies. When the underlying genetic model for the disease is known, the CATT optimal for the genetic model is used. For complex diseases, however, the genetic models of the true disease loci are unknown. In this situation, robust tests are preferable. We propose a two-phase analysis with model selection for the case-control design. In the first phase, we use the difference of Hardy-Weinberg disequilibrium coefficients between the cases and the controls for model selection. Then, an optimal CATT corresponding to the selected model is used for testing association. The correlation of the statistics used for selection and the test for association is derived to adjust the two-phase analysis with control of the Type-I error rate. The simulation studies show that this new approach has greater efficiency robustness than the existing methods.
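The first-phase quantity, the difference of Hardy-Weinberg disequilibrium (HWD) coefficients between cases and controls, can be computed from genotype counts alone. The sketch below uses the usual definition of the HWD coefficient, the frequency of the risk-allele homozygote minus the squared risk-allele frequency, and assumes counts ordered by number of risk alleles (0, 1, 2). The standardisation of this difference and the thresholds used to select a model are not given in the abstract and are not reproduced here; the function names are hypothetical.

```python
def hwd_coefficient(counts):
    """HWD coefficient for genotype counts ordered by risk-allele count (0, 1, 2)."""
    n0, n1, n2 = counts
    n = n0 + n1 + n2
    homozygote_freq = n2 / n
    allele_freq = (2 * n2 + n1) / (2 * n)
    # Zero when genotype frequencies match Hardy-Weinberg proportions.
    return homozygote_freq - allele_freq ** 2


def hwd_difference(cases, controls):
    """Case-minus-control difference of HWD coefficients."""
    return hwd_coefficient(cases) - hwd_coefficient(controls)
```

For example, controls with counts (25, 50, 25) are exactly in Hardy-Weinberg proportions and have a coefficient of 0, so any nonzero difference then comes from the cases.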

20.

Accounting for control mislabeling in case-control biomarker studies

Rantalainen M Holmes CC 《Journal of proteome research》2011,10(12):5562-5567

In biomarker discovery studies, uncertainty associated with case and control labels is often overlooked. By omitting to take into account label uncertainty, model parameters and the predictive risk can become biased, sometimes severely. The most common situation is when the control set contains an unknown number of undiagnosed, or future, cases. This has a marked impact in situations where the model needs to be well-calibrated, e.g., when the prediction performance of a biomarker panel is evaluated. Failing to account for class label uncertainty may lead to underestimation of classification performance and bias in parameter estimates. This can further impact on meta-analysis for combining evidence from multiple studies. Using a simulation study, we outline how conventional statistical models can be modified to address class label uncertainty leading to well-calibrated prediction performance estimates and reduced bias in meta-analysis. We focus on the problem of mislabeled control subjects in case-control studies, i.e., when some of the control subjects are undiagnosed cases, although the procedures we report are generic. The uncertainty in control status is a particular situation common in biomarker discovery studies in the context of genomic and molecular epidemiology, where control subjects are commonly sampled from the general population with an established expected disease incidence rate.