Similar articles
1.
In population-based case-control association studies, the regular chi-square (χ²) test is often used to investigate association between a candidate locus and disease. However, it is well known that this test may be biased in the presence of population stratification and/or genotyping error. Unlike some other biases, this bias will not go away with increasing sample size. On the contrary, the false-positive rate will be much larger when the sample size is increased. The usual family-based designs are robust against population stratification, but they are sensitive to genotyping error. In this article, we propose a novel method of simultaneously correcting for the bias arising from population stratification and/or genotyping error in case-control studies. The appropriate corrections depend on sample odds ratios of the standard 2×3 tables of genotype by case and control from null loci. Therefore, the test is simple to apply. The corrected test is robust against misspecification of the genetic model. If the null hypothesis of no association is rejected, the corrections can be further used to estimate the effect of the genetic factor. We conducted a simulation study to investigate the performance of the new method, using parameter values similar to those found in real-data examples. The results show that the corrected test approximately maintains the expected type I error rate under various simulation conditions. It also improves the power of the association test in the presence of population stratification and/or genotyping error. The discrepancy in power between the tests with correction and those without correction tends to be more extreme as the magnitude of the bias becomes larger. Therefore, the bias-correction method proposed in this article should be useful for the genetic analysis of complex traits.
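
For orientation, the uncorrected genotype-based test that this abstract starts from can be sketched as below. This is only the ordinary Pearson chi-square test on a 2×3 genotype-by-status table, not the bias correction the authors propose; the function name `genotype_chi2` and the example counts are hypothetical.

```python
import math


def genotype_chi2(cases, controls):
    """Pearson chi-square test on a 2x3 genotype-by-status table (2 df).

    cases and controls are counts for the three genotypes, e.g. (AA, Aa, aa).
    Hypothetical helper for illustration; not the corrected test from the abstract.
    """
    table = [list(cases), list(controls)]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(3):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # skip genotype columns that are empty in both groups
                stat += (table[i][j] - expected) ** 2 / expected
    # For 2 degrees of freedom the chi-square survival function is exp(-x / 2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# Made-up counts, purely to show the call.
stat, p_value = genotype_chi2((10, 20, 30), (30, 20, 10))
```

Under population stratification this statistic is inflated, which is the bias the abstract sets out to correct.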

2.
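Detection and correction of population stratification in association studies of human complex diseases   Cited by: 2 (self-citations: 1, citations by others: 2)
Case-control studies are an important genetic-epidemiological method for identifying susceptibility loci of polygenic diseases, and population stratification is one of the main reasons that case-control association studies yield biased results or even spurious associations. This article describes the methods and principles for detecting and correcting population stratification, including the transmission/disequilibrium test (TDT), which is based on nuclear families, and genomic control (GC) and structured association (SA), which are based on unlinked genomic markers, and compares these methods.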

3.
MOTIVATION: Although population-based association mapping may be subject to the bias caused by population stratification, alternative methods that are robust to population stratification such as family-based linkage analysis have lower mapping resolution. Recently, various statistical methods robust to population stratification were proposed for association studies, using unrelated individuals to identify associations between candidate genes and traits of interest. The association between a candidate gene and a quantitative trait is often evaluated via a regression model with inferred population structure variables as covariates, where the residual distribution is customarily assumed to be from a symmetric and unimodal parametric family, such as a Gaussian, although this may be inappropriate for the analysis of many real-life datasets. RESULTS: In this article, we propose a new structured association (SA) test. Our method corrects for continuous population stratification by first deriving population structure and kinship matrices through a set of random genetic markers and then modeling the relationship between trait values, genotypic scores at a candidate marker and genetic background variables through a semiparametric model, where the error distribution is modeled as a mixture of Polya trees centered around a normal family of distributions. We compared our model to the existing SA tests in terms of model fit, type I error rate, power, precision and accuracy by application to a real dataset as well as simulated datasets.

4.
Sha Q, Zhang Z, Zhang S. PLoS ONE, 2011, 6(7): e21957
In family-based data, association information can be partitioned into the between-family information and the within-family information. Based on this observation, Steen et al. (Nature Genetics. 2005, 683-691) proposed an interesting two-stage test for genome-wide association (GWA) studies under family-based designs which performs genomic screening and replication using the same data set. In the first stage, a screening test based on the between-family information is used to select markers. In the second stage, an association test based on the within-family information is used to test association at the selected markers. However, we learn from the results of case-control studies (Skol et al. Nature Genetics. 2006, 209-213) that this two-stage approach may not be optimal. In this article, we propose a novel two-stage joint analysis for GWA studies under family-based designs. For this joint analysis, we first propose a new screening test that is based on the between-family information and is robust to population stratification. This new screening test is used in the first stage to select markers. Then, a joint test that combines the between-family information and within-family information is used in the second stage to test association at the selected markers. By extensive simulation studies, we demonstrate that the joint analysis always results in increased power to detect genetic association and is robust to population stratification.

5.
Wang T, Elston RC. Human Heredity, 2005, 60(3): 134-142
The lack of replication of model-free linkage analyses performed on complex diseases raises questions about the robustness of these methods to various biases. The confounding effect of population stratification on a genetic association study has long been recognized in the genetic epidemiology community. Because the estimation of the number of alleles shared identical by descent (IBD) does not depend on the marker allele frequency when founders of families are observed, model-free linkage analysis is usually thought to be robust to population stratification. However, for common complex diseases, the genotypes of founders are often unobserved and therefore population stratification has the potential to impair model-free linkage analysis. Here, we demonstrate that, when some or all of the founder genotypes are missing, population stratification can introduce deleterious effects on various model-free linkage methods or designs. For an affected sib pair design, it can cause excess false-positive discoveries even when the trait distribution is homogeneous among subpopulations. After incorporating a control group of discordant sib pairs or for a quantitative trait, two circumstances must be met for population stratification to be a confounder: the distributions for both the marker and the trait must be heterogeneous among subpopulations. When this occurs, the bias can result in either a liberal, and hence invalid, test or a conservative test. Bias can be eliminated or alleviated by inclusion of founders' or other family members' genotype data. When this is not possible, new methods need to be developed to be robust to population stratification.

6.
Genes can be associated with disease through an individual's inherited genotype, the maternal genotype or the interaction between these two. When the gene is highly polymorphic, it is more difficult to identify the gene's functional role than for less polymorphic loci, because different alleles at the locus may be associated with the disease through separate and joint effects from maternal and offspring genotypes. Family-based studies are used to test genetic associations because of their robustness to population stratification. However, parental genotype data are often missing, and omitting incompletely genotyped families is inefficient. Methods have been proposed to accommodate incomplete families in family-based association studies. They are not easily generalized to allow simultaneous examination of offspring allelic, maternal allelic and maternal-fetal genotype (MFG) incompatibility effects. Since many MFG incompatibility effects occur through matching between maternal and offspring's genotypes, we present an identity-by-state (IBS) framework to incorporate incomplete families in the MFG test when modeling genetic effects produced by a polymorphic gene. Using simulations, we examine the MFG test's performance with incomplete parental genotype data and an IBS framework. The MFG test using the IBS framework is immune to population stratification and efficiently uses information from incomplete families.

7.
We introduce a new powerful nonparametric testing strategy for family-based association studies in which multiple quantitative traits are recorded and the phenotype with the strongest genetic component is not known prior to the analysis. In the first stage, using a population-based test based on the generalized estimating equation approach, we test all recorded phenotypes for association with the marker locus without biasing the nominal significance level of the later family-based analysis. In the second stage the phenotype with the smallest p value is selected and tested by a family-based association test for association with the marker locus. This strategy is robust against population admixture and stratification and does not require any adjustment for multiple testing. We demonstrate the advantages of this testing strategy over standard methodology in a simulation study. The practical importance of our testing strategy is illustrated by applications to the Childhood Asthma Management Program asthma data sets.

8.
Cheng KF, Chen JH. Human Heredity, 2007, 64(2): 114-122
The transmission/disequilibrium test (TDT), a family-based test of linkage and association, is a popular test for studies of complex inheritance, as it is nonparametric and robust against spurious conclusions induced by hidden genetic structure, such as stratification or admixture. However, the TDT may be biased by genotyping errors. Undetected genotyping errors may be contributing to an inflated type I error rate among reported TDT-derived associations. To adjust for bias, a popular approach is to assume a genotype error model for describing the pattern of errors and propose association tests using likelihood methods. However, all model-based approaches tend to perform unsatisfactorily if the related genotyping error rates are not identical across all families. In this paper, we propose a TDT-type association test which is not only simple and robust against population stratification (and hence the assumption of Hardy-Weinberg equilibrium is not required), but also robust against genotyping error with error rates varying across families. Simulation studies confirm that the new test has very reasonable performance.
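
As background for this entry, the classical TDT statistic for a biallelic marker can be sketched as follows. This is the standard McNemar-type form, not the error-robust TDT-type test the authors propose; the function name `tdt` and the counts are hypothetical.

```python
import math


def tdt(transmitted, untransmitted):
    """Classical TDT for one biallelic marker (illustrative sketch).

    transmitted:   times heterozygous parents passed allele A to an affected child
    untransmitted: times heterozygous parents passed the other allele instead
    Returns the 1-df chi-square statistic and its p-value.
    """
    informative = transmitted + untransmitted
    if informative == 0:
        raise ValueError("no informative transmissions from heterozygous parents")
    stat = (transmitted - untransmitted) ** 2 / informative
    # For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Made-up counts, purely to show the call.
stat, p_value = tdt(60, 40)
```

Only transmissions from heterozygous parents enter the statistic, which is why the test is insensitive to allele-frequency differences between subpopulations but, as the abstract notes, sensitive to genotyping error.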

9.
Population stratification is a form of confounding by ethnicity that may bias effect estimates and inflate test statistics in genetic association studies. Unlinked genetic markers have been used to adjust test statistics, but their use in correcting biased effect estimates has not been addressed. We evaluated the potential of bias correction that could be achieved by a single null marker (M) in studies involving one candidate gene (G). When the distribution of M varied greatly across ethnicities, controlling for M in a logistic regression model substantially reduced biases on odds ratio estimates. When M had the same distribution as G across ethnicities, biases were further reduced or eliminated by subtracting the regression coefficient of M from the coefficient of G in the model, which was fitted either with or without a multiplicative interaction term between M and G. Correction of bias due to population stratification depended specifically on the distributions of G and M, the difference between baseline disease risks across ethnicities, and whether G had an effect on disease risk or not. Our results suggested that marker choice and the specific treatment of that marker in analysis greatly influenced bias correction.

10.
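Advances in genome-wide association studies of complex diseases: genetic statistical analysis   Cited by: 7 (self-citations: 0, citations by others: 7)
严卫丽. 遗传 (Hereditas (Beijing)), 2008, 30(5): 543-549
In 2005, Science first reported a genome-wide association study of human age-related macular degeneration. Since then, genome-wide association studies of a series of complex diseases, including obesity, type 2 diabetes, coronary heart disease and Alzheimer's disease, have been reported in succession; this period has been called the first wave of human genome-wide association studies. This article introduces the methods, software and application examples of statistical analysis for genome-wide association studies. It compares methods for adjusting P values for multiple testing in association analysis, including the Bonferroni correction, the step-down Bonferroni correction, simulation-based methods and methods that control the false discovery rate. It also discusses how population admixture can affect association results and why, together with research progress on, and application examples of, methods for controlling population admixture in genome-wide association studies. In the first wave of genome-wide association studies, classical genetic statistical methods uncovered many gene-phenotype associations and were able to explain them, including many previously unknown genes and chromosomal regions. However, further progress in genome-wide association studies requires elucidating the roles that gene-gene interactions within the genome, and the interplay between complex gene-gene interaction networks and environmental factors, play in the development of complex diseases. Existing statistical methods certainly cannot meet this need, and more advanced statistical methods must be developed. Finally, the article lists websites for software used in the statistical analysis of genome-wide association studies.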
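
Three of the multiple-testing adjustments named in this abstract can be sketched in a few lines. The step-down Bonferroni correction is taken here to be Holm's procedure, and false discovery rate control is illustrated with the Benjamini-Hochberg procedure; both choices are assumptions, since the abstract does not name specific algorithms. The function names and P values are hypothetical.

```python
def bonferroni(pvals):
    """Multiply each P value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def holm(pvals):
    """Holm's step-down Bonferroni adjustment."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # ascending P values
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        value = min(1.0, (m - rank) * pvals[i])
        running_max = max(running_max, value)  # keep adjusted values monotone
        adjusted[i] = running_max
    return adjusted


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted P values (false discovery rate control)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)  # descending
    adjusted = [0.0] * m
    running_min = 1.0
    for k, i in enumerate(order):
        rank = m - k  # 1-based rank of this P value in ascending order
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up P values, purely to show the calls.
pvals = [0.01, 0.04, 0.03, 0.005]
adjusted = {f.__name__: f(pvals) for f in (bonferroni, holm, benjamini_hochberg)}
```

Bonferroni is the most conservative of the three, Holm is never less powerful than Bonferroni, and Benjamini-Hochberg controls a different error criterion (the expected proportion of false discoveries), which is why it typically rejects more hypotheses.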

11.
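曹宗富, 马传香, 王雷, 蔡斌. 遗传 (Hereditas (Beijing)), 2010, 32(9): 921-928
In genome-wide association studies of complex diseases, population stratification increases the false-positive rate, so it is necessary to account for population genetic structure and control for stratification. How well randomly selected SNPs perform in studies of population stratification, however, remains to be explored. Using Affymetrix SNP 6.0 array genotype data from unrelated individuals in the HapMap Phase 2 populations, this study selected different numbers of SNPs randomly and evenly across the genome, and also screened ancestry informative markers (AIMs) using the f value and Fisher's exact test. Then, using data from unrelated individuals in HapMap Phase 3, it evaluated how well the different SNP sets distinguish populations by two methods, F-statistics and STRUCTURE analysis. The study found that SNPs distributed randomly and evenly across the genome can be used to identify genetic structure within populations. It further suggests that, in genome-wide association studies, when no AIMs are available for the specific population, more than 3000 evenly distributed SNPs can be selected at random across the genome to control for population stratification.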

12.
Case-control genetic association studies in admixed populations are known to be susceptible to genetic confounding due to population stratification. The transmission/disequilibrium test (TDT) approach can avoid this problem. However, the TDT is expensive and impractical for late-onset diseases. Case-control study designs in which cases and controls are matched by admixture can be an appealing and suitable alternative for genetic association studies in admixed populations. In this study, we applied this matching strategy when recruiting our African American participants in the Study of African American, Asthma, Genes and Environments. Group admixture in this cohort consists of 83% African ancestry and 17% European ancestry, which was consistent with reports from other studies. By carrying out several complementary analyses, our results show that there is a substructure in the cohort, but that the admixture distributions are almost identical in cases and controls, and also in cases only. We performed association tests for asthma-related traits with ancestry, and only found that FEV1, a measure for baseline pulmonary function, was associated with ancestry after adjusting for socio-economic and environmental risk factors (P=0.01). We did not observe an excess of type I error rate in our association tests for ancestry informative markers and asthma-related phenotypes when ancestry was not adjusted in the analyses. Furthermore, using the association tests between genetic variants in a known asthma candidate gene, the β2-adrenergic receptor (β2AR), and ΔFEF25-75, an asthma-related phenotype, as an example, we demonstrated that population stratification was not a confounder in our genetic association. Our present work demonstrates that admixture-matched case-control strategies can efficiently control population stratification confounding in admixed populations.

13.
Although genetic association studies using unrelated individuals may be subject to bias caused by population stratification, alternative methods that are robust to population stratification, such as family-based association designs, may be less powerful. Furthermore, it is often more feasible and less expensive to collect unrelated individuals. Recently, several statistical methods have been proposed for case-control association tests in a structured population; these methods may be robust to population stratification. In the present study, we propose a quantitative similarity-based association test (QSAT) to identify association between a candidate marker and a quantitative trait of interest, through use of unrelated individuals. For the QSAT, we first determine whether two individuals are from the same subpopulation or from different subpopulations, using genotype data at a set of independent markers. We then perform an association test between the candidate marker and the quantitative trait, through incorporation of such information. Simulation results based on either coalescent models or empirical population genetics data show that the QSAT has a correct type I error rate in the presence of population stratification and that the power of the QSAT is higher than that of family-based association designs.

14.
CYP3A4-V, an A to G promoter variant associated with prostate cancer in African Americans, exhibits large differences in allele frequency between populations. Given that the African American population is genetically heterogeneous because of its African ancestry and subsequent admixture with European Americans, case-control studies with African Americans are highly susceptible to spurious associations. To test for association with prostate cancer, we genotyped CYP3A4-V in 1376 (2N) chromosomes from prostate cancer patients and age- and ethnicity-matched controls representing African Americans, Nigerians, and European Americans. To detect population stratification among the African American samples, 10 unlinked genetic markers were genotyped. To correct for the stratification, the uncorrected association statistic was divided by the average of association statistics across the 10 unlinked markers. Sharp differences in CYP3A4-V frequencies were observed between Nigerian and European American controls (0.87 and 0.10, respectively; P<0.0001). African Americans were intermediate (0.66). An association uncorrected for stratification was observed between CYP3A4-V and prostate cancer in African Americans (P=0.007). A nominal association was also observed among European Americans (P=0.02) but not Nigerians. In addition, the unlinked genetic marker test provided strong evidence of population stratification among African Americans. Because of the high level of stratification, the corrected P-value was not significant (P=0.25). Follow-up studies on a larger dataset will be needed to confirm whether the association is indeed spurious; however, these results reveal the potential for confounding of association studies by using African Americans and the need for study designs that take into account substructure caused by differences in ancestral proportions between cases and controls.
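
The correction step described in this abstract (dividing the candidate statistic by the average statistic over unlinked markers) can be sketched as below. The sketch assumes every statistic is a chi-square with 1 degree of freedom, which the abstract does not state; the function name `corrected_by_null_markers` and all numbers are hypothetical and are not the study's data.

```python
import math


def corrected_by_null_markers(candidate_stat, null_stats):
    """Divide a candidate-marker statistic by the mean statistic of unlinked markers.

    Assumption: all statistics are 1-df chi-square values.
    Returns the corrected statistic and its p-value.
    """
    if not null_stats:
        raise ValueError("at least one unlinked-marker statistic is required")
    inflation = sum(null_stats) / len(null_stats)
    corrected = candidate_stat / inflation
    # For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(corrected / 2.0))
    return corrected, p_value


# Made-up statistics for one candidate marker and ten unlinked markers.
null_stats = [4.0, 6.0, 5.0, 7.0, 3.0, 5.0, 6.0, 4.0, 5.0, 5.0]
corrected, p_value = corrected_by_null_markers(7.2, null_stats)
```

With heavy stratification the unlinked markers themselves show large statistics, so the divisor is large and a nominally significant candidate association can lose significance, which is the pattern the abstract reports.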

15.
Yu Z. Human Heredity, 2011, 71(3): 171-179
The case-parents design has been widely used to detect genetic associations as it can prevent spurious association that could occur in population-based designs. When examining the effect of an individual genetic locus on a disease, logistic regressions developed by conditioning on parental genotypes provide complete protection from spurious association caused by population stratification. However, when testing gene-gene interactions, it is unknown whether conditional logistic regressions are still robust. Here we evaluate the robustness and efficiency of several gene-gene interaction tests that are derived from conditional logistic regressions. We found that in the presence of SNP genotype correlation due to population stratification or linkage disequilibrium, tests with incorrectly specified main-genetic-effect models can lead to inflated type I error rates. We also found that a test with fully flexible main genetic effects always maintains correct test size and its robustness can be achieved with negligible sacrifice of its power. When testing gene-gene interactions is the focus, we recommend the test that allows fully flexible main effects.

16.
Complex human diseases commonly differ in their phenotypic characteristics, e.g., Crohn’s disease (CD) patients are heterogeneous with regard to disease location and disease extent. The genetic susceptibility to Crohn’s disease is widely acknowledged and has been demonstrated by identification of over 100 CD associated genetic loci. However, relating CD subphenotypes to disease susceptible loci has proven to be a difficult task. In this paper we discuss the use of cluster analysis on genetic markers to identify genetic-based subgroups while taking into account possible confounding by population stratification. We show that it is highly relevant to consider the confounding nature of population stratification, so that detected clusters are not strongly related to population groups instead of disease-specific groups. Therefore, we explain the use of principal components to correct for population stratification while clustering affected individuals into genetic-based subgroups. The principal components are obtained using 30 ancestry informative markers (AIM), and the first two PCs are determined to discriminate between continental origins of the affected individuals. Genotypes on 51 CD associated single nucleotide polymorphisms (SNPs) are used to perform latent class analysis, hierarchical and Partitioning Around Medoids (PAM) cluster analysis within a sample of affected individuals with and without the use of principal components to adjust for population stratification. It is seen that without correction for population stratification clusters seem to be influenced by population stratification while with correction clusters are unrelated to continental origin of individuals.

17.
Population-based case-control studies are a useful method to test for a genetic association between a trait and a marker. However, the analysis of the resulting data can be affected by population stratification or cryptic relatedness, which may inflate the variance of the usual statistics, resulting in a higher-than-nominal rate of false-positive results. One approach to preserving the nominal type I error is to apply genomic control, which adjusts the variance of the Cochran-Armitage trend test by calculating the statistic on data from null loci. This enables one to estimate any additional variance in the null distribution of statistics. When the underlying genetic model (e.g., recessive, additive, or dominant) is known, genomic control can be applied to the corresponding optimal trend tests. In practice, however, the mode of inheritance is unknown. The genotype-based chi-square (χ²) test for a general association between the trait and the marker does not depend on the underlying genetic model. Since this general association test has 2 degrees of freedom (df), the existing formulas for estimating the variance factor by use of genomic control are not directly applicable. By expressing the general association test in terms of two Cochran-Armitage trend tests, one can apply genomic control to each of the two trend tests separately, thereby adjusting the χ² statistic. The properties of this robust genomic control test with 2 df are examined by simulation. This genomic control-adjusted 2-df test has control of type I error and achieves reasonable power, relative to the optimal tests for each model.
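
The building blocks mentioned in this abstract, the Cochran-Armitage trend test and a genomic control adjustment estimated from null loci, can be sketched as follows. This shows only the familiar 1-df case with additive scores (0, 1, 2) and a median-based inflation factor floored at 1; it is not the 2-df procedure the authors develop. Function names and counts are hypothetical.

```python
import math
import statistics

CHI2_1DF_MEDIAN = 0.4549  # approximate median of a chi-square distribution with 1 df


def trend_test(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend statistic (1 df) for genotype counts (AA, Aa, aa)."""
    r_total, s_total = sum(cases), sum(controls)
    n_total = r_total + s_total
    totals = [r + s for r, s in zip(cases, controls)]
    sum_xr = sum(x * r for x, r in zip(scores, cases))
    sum_xn = sum(x * n for x, n in zip(scores, totals))
    sum_x2n = sum(x * x * n for x, n in zip(scores, totals))
    numerator = n_total * (n_total * sum_xr - r_total * sum_xn) ** 2
    denominator = r_total * s_total * (n_total * sum_x2n - sum_xn ** 2)
    return numerator / denominator  # undefined (division by zero) for a monomorphic marker


def inflation_factor(null_stats):
    """Median-based variance inflation factor from null-locus statistics, floored at 1."""
    return max(1.0, statistics.median(null_stats) / CHI2_1DF_MEDIAN)


def gc_adjust(stat, null_stats):
    """Genomic-control-adjusted statistic and its 1-df p-value."""
    adjusted = stat / inflation_factor(null_stats)
    return adjusted, math.erfc(math.sqrt(adjusted / 2.0))


# Made-up counts and null-locus statistics, purely to show the calls.
stat = trend_test((10, 20, 30), (30, 20, 10))
adjusted, p_value = gc_adjust(stat, [0.2, 0.5, 0.91, 1.3, 2.0])
```

The median is used rather than the mean because it is less sensitive to a few null loci that happen to be truly associated; that choice is a common convention, not something the abstract specifies.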

18.
Population stratification results from unequal, nonrandom genetic contribution of ancestors and should be reflected in the underlying genealogies. In Quebec, the distribution of Mendelian diseases points to local founder effects suggesting stratification of the contemporary French Canadian gene pool. Here we characterize the population structure through the analysis of the genetic contribution of 7,798 immigrant founders identified in the genealogies of 2,221 subjects partitioned in eight regions. In all but one region, about 90% of gene pools were contributed by early French founders. In the eastern region where this contribution was 76%, we observed higher contributions of Acadians, British and American Loyalists. To detect population stratification from genealogical data, we propose an approach based on principal component analysis (PCA) of immigrant founders' genetic contributions. This analysis was compared with a multidimensional scaling of pairwise kinship coefficients. Both methods showed evidence of a distinct identity of the northeastern and eastern regions and stratification of the regional populations correlated with geographical location along the St-Lawrence River. In addition, we observed a West-East decreasing gradient of diversity. Analysis of PC-correlated founders illustrates the differential impact of early versus later founders consistent with specific regional genetic patterns. These results highlight the importance of considering the geographic origin of samples in the design of genetic epidemiology studies conducted in Quebec. Moreover, our results demonstrate that the study of deep ascending genealogies can accurately reveal population structure.

19.
Zang Y, Zhang H, Yang Y, Zheng G. Human Heredity, 2007, 63(3-4): 187-195
The population-based case-control design is a powerful approach for detecting susceptibility markers of a complex disease. However, this approach may lead to spurious association when there is population substructure: population stratification (PS) or cryptic relatedness (CR). Two simple approaches to correct for the population substructure are genomic control (GC) and delta centralization (DC). GC uses the variance inflation factor to correct for the variance distortion of a test statistic, and the DC centralizes the non-central chi-square distribution of the test statistic. Both GC and DC have been studied for case-control association studies mainly under a specific genetic model (e.g. recessive, additive or dominant), under which an optimal trend test is available. The genetic model is usually unknown for many complex diseases. In this situation, we study the performance of three robust tests based on the GC and DC corrections in the presence of the population substructure. Our results show that, when the genetic model is unknown, the DC- (or GC-) corrected maximum and Pearson's association test are robust and have good control of Type I error and high power relative to the optimal trend tests in the presence of PS (or CR).

20.
The estimation of genetic ancestry in human populations has important applications in medical genetic studies. Genetic ancestry is used to control for population stratification in genetic association studies, and is used to understand the genetic basis for ethnic differences in disease susceptibility. In this review, we present an overview of genetic ancestry estimation in human disease studies, followed by a review of popular software and methods used for this estimation.
