Similar literature (20 results)
1.
Many popular methods for exploring gene-gene interactions, including the case-only approach, rely on the key assumption that physically distant loci are in linkage equilibrium in the underlying population. These methods utilize the presence of correlation between unlinked loci in a disease-enriched sample as evidence of interactions among the loci in the etiology of the disease. We use data from the CGEMS case-control genome-wide association study of breast cancer to demonstrate empirically that the case-only and related methods have the potential to create large-scale false positives because of the presence of population stratification (PS) that creates long-range linkage disequilibrium in the genome. We show that the bias can be removed by considering parametric and nonparametric methods that assume gene-gene independence between unlinked loci, not in the entire population, but only conditional on population substructure that can be uncovered based on the principal components of a suitably large panel of PS markers. Applications in the CGEMS study as well as simulated data show that the proposed methods are robust to the presence of population stratification and are yet much more powerful, relative to standard logistic regression methods that are also commonly used as robust alternatives to the case-only type methods.
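
The case-only idea described above can be sketched as follows. This is an illustrative calculation, not the authors' method: it assumes a binary carrier/non-carrier coding at two unlinked loci and uses the standard 2x2 odds ratio with a Wald test. The function name and the counts are hypothetical.

```python
import math

def case_only_interaction(n11, n10, n01, n00):
    """Case-only association between two loci, computed in cases only.

    n11: cases carrying the variant at both loci
    n10: carriers at locus A only
    n01: carriers at locus B only
    n00: carriers at neither locus
    Returns (odds ratio, Wald z-statistic for the log odds ratio).
    """
    odds_ratio = (n11 * n00) / (n10 * n01)
    se_log_or = math.sqrt(1 / n11 + 1 / n10 + 1 / n01 + 1 / n00)
    z = math.log(odds_ratio) / se_log_or
    return odds_ratio, z

# Hypothetical counts: the two loci are correlated among cases.
or_cases, z_cases = case_only_interaction(60, 140, 140, 660)
print(f"case-only OR = {or_cases:.2f}, z = {z_cases:.2f}")
```

An odds ratio above 1 is read as interaction only if the two loci are independent in the source population. Population stratification breaks that assumption, which is the source of the false positives the abstract reports.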

2.
Mapping disease genes: family-based association studies.   Citations: 19 total (9 self, 10 by others)
With recent rapid advances in mapping of the human genome, including highly polymorphic and closely linked markers, studies of marker associations with disease are increasingly relevant for mapping disease genes. The use of nuclear-family data in association studies was initially developed to avoid possible ethnic mismatching between patients and randomly ascertained controls. The parental marker alleles not transmitted to an affected child or never transmitted to an affected sib pair form the so-called AFBAC (affected family-based controls) population. In this paper, the theoretical foundation of the AFBAC method is proved for any single-locus model of disease and for any nuclear family-based ascertainment scheme. In a random-mating population, when there is a marker association with disease, the AFBAC population provides an unbiased estimate of the overall population (control) marker alleles when the recombination fraction (theta) between the marker and disease genes is sufficiently small that it can be taken as zero (theta = 0). With population stratification, only marker associations present in the subpopulations will be detected with family-based analyses. Of more importance, however, is the fact that, when theta is not equal to 0, differences between transmitted parental (patient) marker allele frequencies and non- or never-transmitted parental marker allele frequencies (implying a marker association with disease) can only be observed for marker genes linked to a disease gene (theta < 1/2). Thus, associations of unlinked marker loci with disease at the population level, caused by population stratification, migration, or admixture, are eliminated. This validates the use of family-based association tests as an appropriate strategy for mapping disease genes.
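
A minimal sketch of how the transmitted and non-transmitted (AFBAC) allele counts could be tallied from parent-child trios with a single affected child. The data layout (tuples of allele labels) and the function name are assumptions made for illustration; the abstract does not specify an implementation.

```python
from collections import Counter

def afbac_counts(trios):
    """Tally transmitted and non-transmitted parental alleles.

    trios: iterable of (father, mother, child) genotypes, each a pair of
    allele labels at one marker, e.g. ("A", "a").
    Returns (transmitted, untransmitted) as Counters of allele labels.
    """
    transmitted, untransmitted = Counter(), Counter()
    for father, mother, child in trios:
        parental = Counter(father) + Counter(mother)
        child_alleles = Counter(child)
        # Partial sanity check: the child cannot carry an allele
        # that neither parent has.
        if child_alleles - parental:
            raise ValueError(f"Mendelian inconsistency: {father}, {mother}, {child}")
        transmitted += child_alleles
        # The two parental alleles the child did not receive form the
        # affected family-based control (AFBAC) sample.
        untransmitted += parental - child_alleles
    return transmitted, untransmitted

# Hypothetical trios with one affected child each.
trios = [
    (("A", "a"), ("A", "a"), ("A", "A")),
    (("A", "a"), ("a", "a"), ("A", "a")),
    (("A", "A"), ("a", "a"), ("A", "a")),
]
transmitted, untransmitted = afbac_counts(trios)
print(dict(transmitted), dict(untransmitted))
```

Comparing the two frequency tables then plays the role of a case-control comparison in which the controls are drawn from the same families as the patients.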

3.
Case-control studies of association in structured or admixed populations   Citations: 7 total (0 self, 7 by others)
Case-control tests for association are an important tool for mapping complex-trait genes. But population structure can invalidate this approach, leading to apparent associations at markers that are unlinked to disease loci. Family-based tests of association can avoid this problem, but such studies are often more expensive and in some cases (particularly for late-onset diseases) are impractical. In this review article we describe a series of approaches published over the past 2 years which use multilocus genotype data to enable valid case-control tests of association, even in the presence of population structure. These tests can be classified into two categories. "Genomic control" methods use the independent marker loci to adjust the distribution of a standard test statistic, while "structured association" methods infer the details of population structure en route to testing for association. We discuss the statistical issues involved in the different approaches and present results from simulations comparing the relative performance of the methods under a range of models.
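
A small worked calculation may help show why population structure produces apparent associations. The sketch below assumes two hypothetical subpopulations in which the marker allele is independent of disease, and computes the allelic odds ratio expected after pooling. All numbers are invented for illustration.

```python
def pooled_odds_ratio(strata):
    """Expected allelic odds ratio after pooling subpopulations.

    strata: list of (population share, allele frequency, disease prevalence).
    Within each stratum the allele is independent of disease, so the
    stratum-specific odds ratio is exactly 1.
    """
    case_allele = sum(w * d * p for w, p, d in strata)
    case_other = sum(w * d * (1 - p) for w, p, d in strata)
    ctrl_allele = sum(w * (1 - d) * p for w, p, d in strata)
    ctrl_other = sum(w * (1 - d) * (1 - p) for w, p, d in strata)
    return (case_allele * ctrl_other) / (case_other * ctrl_allele)

# Allele frequency and prevalence both differ between the subpopulations.
print(pooled_odds_ratio([(0.5, 0.1, 0.01), (0.5, 0.5, 0.05)]))
# Same allele frequencies, but equal prevalence: no spurious signal.
print(pooled_odds_ratio([(0.5, 0.1, 0.03), (0.5, 0.5, 0.03)]))
```

In this setup the pooled odds ratio departs from 1 only when allele frequency and disease prevalence both differ across subpopulations.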

4.
Although genetic association studies using unrelated individuals may be subject to bias caused by population stratification, alternative methods that are robust to population stratification, such as family-based association designs, may be less powerful. Furthermore, it is often more feasible and less expensive to collect unrelated individuals. Recently, several statistical methods have been proposed for case-control association tests in a structured population; these methods may be robust to population stratification. In the present study, we propose a quantitative similarity-based association test (QSAT) to identify association between a candidate marker and a quantitative trait of interest, through use of unrelated individuals. For the QSAT, we first determine whether two individuals are from the same subpopulation or from different subpopulations, using genotype data at a set of independent markers. We then perform an association test between the candidate marker and the quantitative trait, through incorporation of such information. Simulation results based on either coalescent models or empirical population genetics data show that the QSAT has a correct type I error rate in the presence of population stratification and that the power of the QSAT is higher than that of family-based association designs.

5.
MOTIVATION: Admixed populations offer a unique opportunity for mapping diseases that have large disease allele frequency differences between ancestral populations. However, association analysis in such populations is challenging because population stratification may lead to association with loci unlinked to the disease locus. METHODS AND RESULTS: We show that local ancestry at a test single nucleotide polymorphism (SNP) may confound the association signal and that ignoring it can lead to spurious association. We demonstrate theoretically that adjustment for local ancestry at the test SNP is sufficient to remove the spurious association regardless of the mechanism of population stratification, whether due to local or global ancestry differences among study subjects; however, global ancestry adjustment procedures may not be effective. We further develop two novel association tests that adjust for local ancestry. Our first test is based on a conditional likelihood framework which models the distribution of the test SNP given disease status and flanking marker genotypes. A key advantage of this test lies in its ability to incorporate different directions of association in the ancestral populations. Our second test, which is computationally simpler, is based on logistic regression, with adjustment for local ancestry proportion. We conducted extensive simulations and found that the Type I error rates of our tests are under control; however, the global adjustment procedures yielded inflated Type I error rates when stratification is due to local ancestry difference.

6.
When two or more populations have been separated by geographic or cultural boundaries for many generations, drift, spontaneous mutations, differential selection pressures and other factors may lead to allele frequency differences among populations. If these 'parental' populations subsequently come together and begin inter-mating, disequilibrium among linked markers may span a greater genetic distance than it typically does among populations under panmixia [see glossary]. This extended disequilibrium can make association studies highly effective and more economical than disequilibrium mapping in panmictic populations since fewer marker loci are needed to detect regions of the genome that harbor phenotype-influencing loci. However, under some circumstances, this process of intermating (as well as other processes) can produce disequilibrium between pairs of unlinked loci and thus create the possibility of confounding or spurious associations due to this population stratification. Accordingly, researchers are advised to employ valid statistical tests for linkage disequilibrium mapping that allow genetic association studies to control for such confounding. Many recent papers have addressed this need. We provide a comprehensive review of advances made in recent years in correcting for population stratification and then evaluate and synthesize these methods based on statistical principles such as (1) randomization, (2) conditioning on sufficient statistics, and (3) identifying whether the method is based on testing the genotype-phenotype covariance (conditional upon familial information) and/or testing departures of the marginal distribution from the expected genotypic frequencies.

7.
To control for hidden population stratification in genetic-association studies, statistical methods that use marker genotype data to infer population structure have been proposed as a possible alternative to family-based designs. In principle, it is possible to infer population structure from associations between marker loci and from associations of markers with the trait, even when no information about the demographic background of the population is available. In a model in which the total population is formed by admixture between two or more subpopulations, confounding can be estimated and controlled. Current implementations of this approach have limitations, the most serious of which is that they do not allow for uncertainty in estimations of individual admixture proportions or for lack of identifiability of subpopulations in the model. We describe methods that overcome these limitations by a combination of Bayesian and classical approaches, and we demonstrate the methods by using data from three admixed populations (African American, African Caribbean, and Hispanic American) in which there is extreme confounding of trait-genotype associations because the trait under study (skin pigmentation) varies with admixture proportions. In these data sets, as many as one-third of marker loci show crude associations with the trait. Control for confounding by population stratification eliminates these associations, except at loci that are linked to candidate genes for the trait. With only 32 markers informative for ancestry, the efficiency of the analysis is 70%. These methods can deal with both confounding and selection bias in genetic-association studies, making family-based designs unnecessary.

8.
CYP3A4-V, an A to G promoter variant associated with prostate cancer in African Americans, exhibits large differences in allele frequency between populations. Given that the African American population is genetically heterogeneous because of its African ancestry and subsequent admixture with European Americans, case-control studies with African Americans are highly susceptible to spurious associations. To test for association with prostate cancer, we genotyped CYP3A4-V in 1376 (2N) chromosomes from prostate cancer patients and age- and ethnicity-matched controls representing African Americans, Nigerians, and European Americans. To detect population stratification among the African American samples, 10 unlinked genetic markers were genotyped. To correct for the stratification, the uncorrected association statistic was divided by the average of association statistics across the 10 unlinked markers. Sharp differences in CYP3A4-V frequencies were observed between Nigerian and European American controls (0.87 and 0.10, respectively; P<0.0001). African Americans were intermediate (0.66). An association uncorrected for stratification was observed between CYP3A4-V and prostate cancer in African Americans (P=0.007). A nominal association was also observed among European Americans (P=0.02) but not Nigerians. In addition, the unlinked genetic marker test provided strong evidence of population stratification among African Americans. Because of the high level of stratification, the corrected P-value was not significant (P=0.25). Follow-up studies on a larger dataset will be needed to confirm whether the association is indeed spurious; however, these results reveal the potential for confounding of association studies by using African Americans and the need for study designs that take into account substructure caused by differences in ancestral proportions between cases and controls.

9.
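Detection and correction of population stratification in association studies of complex human diseases   Citations: 2 total (1 self, 2 by others)
Case-control studies are an important genetic-epidemiological method for identifying susceptibility loci for polygenic diseases, and population stratification is one of the major causes of biased or even spurious results in case-control association studies. This article describes the methods and principles for detecting and correcting population stratification, including the transmission/disequilibrium test (TDT), which is based on nuclear families, and genomic control (GC) and structured association (SA), which are based on unlinked genomic markers, and compares these methods.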

10.
MOTIVATION: Although population-based association mapping may be subject to the bias caused by population stratification, alternative methods that are robust to population stratification such as family-based linkage analysis have lower mapping resolution. Recently, various statistical methods robust to population stratification were proposed for association studies, using unrelated individuals to identify associations between candidate genes and traits of interest. The association between a candidate gene and a quantitative trait is often evaluated via a regression model with inferred population structure variables as covariates, where the residual distribution is customarily assumed to be from a symmetric and unimodal parametric family, such as a Gaussian, although this may be inappropriate for the analysis of many real-life datasets. RESULTS: In this article, we propose a new structured association (SA) test. Our method corrects for continuous population stratification by first deriving population structure and kinship matrices through a set of random genetic markers and then modeling the relationship between trait values, genotypic scores at a candidate marker and genetic background variables through a semiparametric model, where the error distribution is modeled as a mixture of Polya trees centered around a normal family of distributions. We compared our model to the existing SA tests in terms of model fit, type I error rate, power, precision and accuracy by application to a real dataset as well as simulated datasets.

11.
Association mapping in structured populations   Citations: 43 total (0 self, 43 by others)
The use, in association studies, of the forthcoming dense genomewide collection of single-nucleotide polymorphisms (SNPs) has been heralded as a potential breakthrough in the study of the genetic basis of common complex disorders. A serious problem with association mapping is that population structure can lead to spurious associations between a candidate marker and a phenotype. One common solution has been to abandon case-control studies in favor of family-based tests of association, such as the transmission/disequilibrium test (TDT), but this comes at a considerable cost in the need to collect DNA from close relatives of affected individuals. In this article we describe a novel, statistically valid, method for case-control association studies in structured populations. Our method uses a set of unlinked genetic markers to infer details of population structure, and to estimate the ancestry of sampled individuals, before using this information to test for associations within subpopulations. It provides power comparable with the TDT in many settings and may substantially outperform it if there are conflicting associations in different subpopulations.

12.
OBJECTIVE: Case-control association studies in mixed populations can result in spurious disease-marker associations if subpopulation disease prevalence and marker frequencies both differ. Genomic control (GC) uses neutral loci to correct for spurious association (due to population stratification), but how well this works remains undetermined. METHODS: We simulated and mixed populations with different disease and marker frequencies but without marker-disease association. We generated case-control datasets, calculated the chi-square statistic for disease association with each marker, and applied two GC procedures: dividing by the mean chi-square, or dividing by the median chi-square/0.456. RESULTS: Corrections became conservative (false positive rate [FPR] <5%) with increasing subpopulation prevalence and marker differences. The mean correction resulted in FPRs close to 5% at average subpopulation allele frequency differences <0.26, but inclusion of just a few markers with large frequency differences resulted in conservative FPRs. FPRs from the median correction were mostly conservative but became anticonservative when a few markers with large frequency differences were included. CONCLUSION: GC can either lead to a notable loss of power to detect a true association (conservative) in many circumstances or fail to eliminate the spurious associations (anticonservative). The mean correction factor is useful in certain situations to correct population stratification, but it is difficult to know when those situations exist.
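
The two genomic control corrections named in the METHODS can be sketched as below. This is a minimal illustration, not the study's code: the function name and the example statistics are hypothetical, and the constant 0.456 is taken directly from the abstract (it approximates the median of a chi-square distribution with 1 degree of freedom).

```python
from statistics import fmean, median

MEDIAN_CHI2_1DF = 0.456  # value quoted in the abstract

def genomic_control(test_chi2, null_chi2, method="median"):
    """Deflate a test statistic using statistics from neutral (null) markers.

    test_chi2: chi-square statistic at the candidate marker
    null_chi2: chi-square statistics at the neutral markers
    method:    "mean"   -> divide by the mean null chi-square
               "median" -> divide by (median null chi-square / 0.456)
    Returns (corrected statistic, inflation factor).
    """
    if method == "mean":
        inflation = fmean(null_chi2)
    elif method == "median":
        inflation = median(null_chi2) / MEDIAN_CHI2_1DF
    else:
        raise ValueError("method must be 'mean' or 'median'")
    return test_chi2 / inflation, inflation

# Hypothetical null-marker statistics, inflated by stratification.
null_stats = [0.3, 0.8, 1.1, 2.4, 5.9]
for how in ("mean", "median"):
    corrected, lam = genomic_control(7.5, null_stats, method=how)
    print(f"{how}: inflation = {lam:.2f}, corrected chi-square = {corrected:.2f}")
```

Because the mean is pulled upward by a few extreme null markers while the median is not, the two corrections can disagree noticeably, which is consistent with the contrasting FPR behaviour reported in the RESULTS.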

13.
We propose a novel latent-class approach to detect and account for population stratification in a case-control study of association between a candidate gene and a disease. In our approach, population substructure is detected and accounted for using data on additional loci that are in linkage equilibrium within subpopulations but have alleles that vary in frequency between subpopulations. We have tested our approach using simulated data based on allele frequencies in 12 short tandem repeat (STR) loci in four populations in Argentina.

14.
In population-based case-control association studies, the regular chi-square test is often used to investigate association between a candidate locus and disease. However, it is well known that this test may be biased in the presence of population stratification and/or genotyping error. Unlike some other biases, this bias will not go away with increasing sample size. On the contrary, the false-positive rate will be much larger when the sample size is increased. The usual family-based designs are robust against population stratification, but they are sensitive to genotyping error. In this article, we propose a novel method of simultaneously correcting for the bias arising from population stratification and/or for the genotyping error in case-control studies. The appropriate corrections depend on sample odds ratios of the standard 2x3 tables of genotype by case and control from null loci. Therefore, the test is simple to apply. The corrected test is robust against misspecification of the genetic model. If the null hypothesis of no association is rejected, the corrections can be further used to estimate the effect of the genetic factor. We considered a simulation study to investigate the performance of the new method, using parameter values similar to those found in real-data examples. The results show that the corrected test approximately maintains the expected type I error rate under various simulation conditions. It also improves the power of the association test in the presence of population stratification and/or genotyping error. The discrepancy in power between the tests with correction and those without correction tends to be more extreme as the magnitude of the bias becomes larger. Therefore, the bias-correction method proposed in this article should be useful for the genetic analysis of complex traits.

15.
Linkage disequilibrium (LD), a measure of nonrandom association of alleles at different loci, is of great interest to evolutionary geneticists as it can be used to help identify loci that explain phenotypic variation. Surveys of the extent of LD across genomes have been carried out in a number of systems, most notably humans and model organisms. However, studies of natural populations of vertebrates have rarely been performed. Here, we describe an investigation of LD in a free-living island population of red deer (Cervus elaphus). Relatively high levels of LD extended several tens of centimorgans, and significant LD was frequently detected between unlinked markers. The magnitude of LD varied depending on how the population was sampled. It also varied across different chromosomes, and was shown to be a function of sample size, intermarker distance and marker heterozygosity. A recent admixture event in the population led to an ephemeral increase in LD. Association mapping may be possible in this population, although a high 'baseline' level of LD could lead to false positive associations between marker loci and a trait of interest.
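
For readers unfamiliar with how LD is quantified, the sketch below computes two common pairwise measures, the coefficient D and the squared correlation r², from haplotype counts at two biallelic loci. It assumes phased haplotypes are available. The abstract does not say which LD statistic the study used, so this is illustrative only, and the counts are hypothetical.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Pairwise LD between two biallelic loci from phased haplotype counts.

    Returns (D, r2), where D = p_AB - p_A * p_B and
    r2 = D^2 / (p_A * (1 - p_A) * p_B * (1 - p_B)).
    """
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total
    p_B = (n_AB + n_aB) / total
    d = p_AB - p_A * p_B
    r2 = d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d, r2

# Hypothetical counts: alleles A and B co-occur more often than expected.
d, r2 = ld_stats(40, 10, 10, 40)
print(f"D = {d:.3f}, r2 = {r2:.3f}")
```

Under random association (for example, 25 haplotypes of each type) both measures are zero. Nonzero values between unlinked markers are the 'baseline' LD the abstract warns about.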

16.
Population stratification is a form of confounding by ethnicity that may bias effect estimates and inflate test statistics in genetic association studies. Unlinked genetic markers have been used to adjust test statistics, but their use in correcting biased effect estimates has not been addressed. We evaluated the potential of bias correction that could be achieved by a single null marker (M) in studies involving one candidate gene (G). When the distribution of M varied greatly across ethnicities, controlling for M in a logistic regression model substantially reduced biases on odds ratio estimates. When M had the same distribution as G across ethnicities, biases were further reduced or eliminated by subtracting the regression coefficient of M from the coefficient of G in the model, which was fitted either with or without a multiplicative interaction term between M and G. Correction of bias due to population stratification depended specifically on the distributions of G and M, the difference between baseline disease risks across ethnicities, and whether G had an effect on disease risk or not. Our results suggested that marker choice and the specific treatment of that marker in analysis greatly influenced bias correction.

17.
The genealogical relationships of individuals in a finite population can create statistical non-independence of alleles at unlinked loci. In this paper, we introduce a flexible graphical method for computing the probabilities that two individuals in a finite, randomly mating population have the same haplotype or genotype at several loci. This method allows us to generalize the analysis of Laurie and Weir [2003. Dependency effects in multi-locus match probabilities. Theor. Popul. Biol. 63, 207-219] to cases with more loci and other models of mating. We show that monogamy increases the probabilities of genotypic matches at unlinked loci and that the effect of monogamy increases with the number L of loci. We conjecture a sharp upper bound on the effect of monogamy for a given L.

18.
We have previously shown that linkage disequilibrium (LD) in the elite cultivated barley (Hordeum vulgare) gene pool extends, on average, for <1-5 cM. Based on this information, we have developed a platform for whole genome association studies that comprises a collection of elite lines that we have characterized at 3060 genome-wide single nucleotide polymorphism (SNP) marker loci. Interrogating this data set shows that significant population substructure is present within the elite gene pool and that diversity and LD vary considerably across each of the seven barley chromosomes. However, we also show that a subpopulation comprised of only the two-rowed spring germplasm is less structured and well suited to whole genome association studies without the need for extensive statistical intervention to account for structure. At the current marker density, the two-rowed spring population is suited for fine mapping simple traits that are located outside of the genetic centromeres with a resolution that is sufficient for candidate gene identification by exploiting conservation of synteny with fully sequenced model genomes and the emerging barley physical map.

19.
The use of inbred strains of mice to dissect the genetic complexity of common diseases offers a viable alternative to human studies, given the control over experimental parameters that can be exercised. Central to efforts to map susceptibility loci for common diseases in mice is a comprehensive map of DNA variation among the common inbred strains of mice. Here we present one of the most comprehensive high-density, single nucleotide polymorphism (SNP) maps of mice constructed to date. This map consists of 10,350 SNPs genotyped in 62 strains of inbred mice. We demonstrate the utility of these data via a novel integrative genomics approach to mapping susceptibility loci for complex traits. By integrating in silico quantitative trait locus (QTL) mapping with progressive QTL mapping strategies in segregating mouse populations that leverage large-scale mapping of the genetic determinants of gene expression traits, we not only facilitate identification of candidate quantitative trait genes, but also protect against spurious associations that can arise in genetic association studies due to allelic association among unlinked markers. Application of this approach to our high-density SNP map and two previously described F2 crosses between strains C57BL/6J (B6) and DBA/2J and between B6 ApoE(-/-) and C3H/HeJ ApoE(-/-) results in the identification of Insig2 as a strong candidate susceptibility gene for total plasma cholesterol levels.

20.
Jiang N, Wang M, Jia T, Wang L, Leach L, Hackett C, Marshall D, Luo Z. PLoS ONE, 2011, 6(8): e23192

Background

It is well established that the theoretical kernel of the recently surging genome-wide association study (GWAS) approach is statistical inference of linkage disequilibrium (LD) between a tested genetic marker and a putative locus affecting a disease trait. However, LD analysis is vulnerable to several confounding factors, of which population stratification is the most prominent. Whilst many methods have been proposed to correct for its influence, either by predicting the structure parameters or by correcting the inflation in the test statistic caused by stratification, these may not be feasible or may introduce further statistical problems in practical implementation.

Methodology

We propose here a novel statistical method to control spurious LD arising from population structure in GWAS by incorporating a control marker into the test for significance of the genetic association between a polymorphic marker and phenotypic variation in a complex trait. The method avoids the need for structure prediction, which may be infeasible or inadequate in practice, and accounts properly for the varying effect of population stratification on different regions of the genome under study. The utility and statistical properties of the new method were tested through an intensive computer simulation study and an association-based genome-wide mapping of expression quantitative trait loci in genetically divergent human populations.

Results/Conclusions

The analyses show that the new method confers improved statistical power for detecting genuine genetic association in subpopulations and effective control of spurious associations stemming from population structure when compared with two other methods widely used in the GWAS literature.
