Similar literature
1.
Müller BU  Stich B  Piepho HP 《Heredity》2011,106(5):825-831
Control of the genome-wide type I error rate (GWER) is an important issue in association mapping and linkage mapping experiments. For the latter, different approaches, such as permutation procedures or Bonferroni correction, were proposed. The permutation test, however, cannot account for population structure present in most association mapping populations. This can lead to false positive associations. The Bonferroni correction is applicable, but usually on the conservative side, because correlation of tests cannot be exploited. Therefore, a new approach is proposed, which controls the genome-wide error rate, while accounting for population structure. This approach is based on a simulation procedure that is equally applicable in a linkage and an association-mapping context. Using the parameter settings of three real data sets, it is shown that the procedure provides control of the GWER and the generalized genome-wide type I error rate (GWER(k)).

2.
Jiang N  Wang M  Jia T  Wang L  Leach L  Hackett C  Marshall D  Luo Z 《PloS one》2011,6(8):e23192

Background

It is well established that the theoretical kernel of the recently surging genome-wide association study (GWAS) approach is statistical inference of linkage disequilibrium (LD) between a tested genetic marker and a putative locus affecting a disease trait. However, LD analysis is vulnerable to several confounding factors, of which population stratification is the most prominent. Whilst many methods have been proposed to correct for its influence, either by predicting the structure parameters or by correcting the inflation in the test statistic caused by stratification, these may not be feasible or may impose further statistical problems in practical implementation.

Methodology

We propose here a novel statistical method to control spurious LD caused by population structure in GWAS by incorporating a control marker into the test for significance of the genetic association of a polymorphic marker with phenotypic variation of a complex trait. The method avoids the need for structure prediction, which may be infeasible or inadequate in practice, and accounts properly for a varying effect of population stratification on different regions of the genome under study. The utility and statistical properties of the new method were tested through an intensive computer simulation study and an association-based genome-wide mapping of expression quantitative trait loci in genetically divergent human populations.

Results/Conclusions

The analyses show that the new method confers improved statistical power for detecting genuine genetic association in subpopulations and effective control of spurious associations stemming from population structure when compared with two other widely used methods in the GWAS literature.

3.
Crosses between laboratory strains of mice provide a powerful way of detecting quantitative trait loci for complex traits related to human disease. Hundreds of these loci have been detected, but only a small number of the underlying causative genes have been identified. The main difficulty is the extensive linkage disequilibrium (LD) in intercross progeny and the slow process of fine-scale mapping by traditional methods. Recently, new approaches have been introduced, such as association studies with inbred lines and multigenerational crosses. These approaches are very useful for interval reduction, but generally do not provide single-gene resolution because of strong LD extending over one to several megabases. Here, we investigate the genetic structure of a natural population of mice in Arizona to determine its suitability for fine-scale LD mapping and association studies. There are three main findings: (1) Arizona mice have a high level of genetic variation, which includes a large fraction of the sequence variation present in classical strains of laboratory mice; (2) they show clear evidence of local inbreeding but appear to lack stable population structure across the study area; and (3) LD decays with distance at a rate similar to human populations, which is considerably more rapid than in laboratory populations of mice. Strong associations in Arizona mice are limited primarily to markers less than 100 kb apart, which provides the possibility of fine-scale association mapping at the level of one or a few genes. Although other considerations, such as sample size requirements and marker discovery, are serious issues in the implementation of association studies, the genetic variation and LD results indicate that wild mice could provide a useful tool for identifying genes that cause variation in complex traits.

4.
There is currently tremendous interest in the possibility of using genome-wide association mapping to identify genes responsible for natural variation, particularly for human disease susceptibility. The model plant Arabidopsis thaliana is in many ways an ideal candidate for such studies, because it is a highly selfing hermaphrodite. As a result, the species largely exists as a collection of naturally occurring inbred lines, or accessions, which can be genotyped once and phenotyped repeatedly. Furthermore, linkage disequilibrium in such a species will be much more extensive than in a comparable outcrossing species. We tested the feasibility of genome-wide association mapping in A. thaliana by searching for associations with flowering time and pathogen resistance in a sample of 95 accessions for which genome-wide polymorphism data were available. In spite of an extremely high rate of false positives due to population structure, we were able to identify known major genes for all phenotypes tested, thus demonstrating the potential of genome-wide association mapping in A. thaliana and other species with similar patterns of variation. The rate of false positives differed strongly between traits, with more clinal traits showing the highest rate. However, the false positive rates were always substantial regardless of the trait, highlighting the necessity of an appropriate genomic control in association studies.

5.
Understanding the population structure and linkage disequilibrium in an association panel can effectively avoid spurious associations and improve the accuracy of association mapping. In this study, 158 elite cotton (Gossypium hirsutum L.) germplasm accessions from all over the world, which were genotyped with 212 genome-wide marker loci and phenotyped with a disease nursery and greenhouse screening method, were assayed for population structure, linkage disequilibrium, and association mapping of Verticillium wilt resistance. A total of 480 alleles, ranging from 2 to 4 per locus, were identified across all collections. Model-based analysis identified two groups (G1 and G2) and seven subgroups (G1a–c, G2a–d), and differentiation analysis showed that subgroups with a single origin or pedigree tended to differentiate from those with a mixed origin. Only 8.12% of linked marker pairs showed significant LD (P<0.001) in this association panel. The LD level for linked markers was significantly higher than that for unlinked markers, suggesting that physical linkage strongly influences LD in this panel, and the LD level was elevated when the panel was classified into groups and subgroups. LD decay analysis of several chromosomes showed that LD decay distances differed notably between chromosomes within the same gene pool. Based on the disease nursery and greenhouse environments, 42 marker loci associated with Verticillium wilt resistance were identified through association mapping, and these were widely distributed among 15 chromosomes. Of these, 10 marker loci were consistent with previously identified QTLs and 32 were new, previously unreported marker loci. QTL clusters for Verticillium wilt resistance on Chr.16 were also confirmed in our study, which is consistent with the strong linkage on this chromosome.
Our results would contribute to association mapping and supply marker candidates for marker-assisted selection of Verticillium wilt resistance in cotton.

6.
Historically, linkage mapping populations have consisted of large, randomly selected samples of progeny from a given pedigree or cell lines from a panel of radiation hybrids. We demonstrate that, to construct a map with high genome-wide marker density, it is neither necessary nor desirable to genotype all markers in every individual of a large mapping population. Instead, a reduced sample of individuals bearing complementary recombinational or radiation-induced breakpoints may be selected for genotyping subsequent markers from a large, but sparsely genotyped, mapping population. Choosing such a sample can be reduced to a discrete stochastic optimization problem for which the goal is a sample with breakpoints spaced evenly throughout the genome. We have developed several different methods for selecting such samples and have evaluated their performance on simulated and actual mapping populations, including the Lister and Dean Arabidopsis thaliana recombinant inbred population and the GeneBridge 4 human radiation hybrid panel. Our methods quickly and consistently find much-reduced samples with map resolution approaching that of the larger populations from which they are derived. This approach, which we have termed selective mapping, can facilitate the production of high-quality, high-density genome-wide linkage maps.

7.
Marker–trait associations based on populations from controlled crosses have been established in peach using markers mapped on the peach consensus map. In this study, we explored the utility of unstructured populations for association mapping to determine useful marker–trait associations in peach/nectarine cultivars. We used 94 peach cultivars representing local Spanish and modern cultivars from international breeding programs that are maintained at the Experimental Station of Aula Dei, Spain. This collection was characterized for pomological traits and was screened with 40 SSR markers that span the peach genome. Population structure analysis using STRUCTURE software identified two subpopulations, the local and modern cultivars, with admixture within both groups. The local Spanish cultivars were somewhat less diverse than modern cultivars. Marker–trait associations were determined in TASSEL with and without modelling coefficient of membership (Q) values as covariates. The results showed significant associations with pomological traits. We chose three markers on LG4 because of their proximity to the endoPG locus (freestone–melting flesh) that strongly affects pomological traits. Two genotypes of BPPCT015 marker showed significant associations with harvest date, flavonoids and sorbitol. Also, two genotypes of CPPCT028 showed associations with harvest date, total phenolics, RAC, and total sugars. Finally, two genotypes of endoPG1 showed associations with flesh firmness and total sugars. The analysis of linkage disequilibrium (LD) revealed a high level of LD up to 20 cM, and decay at farther distances. Therefore, association mapping could be a powerful tool for identifying marker–trait associations and would be useful for marker-assisted selection in peach breeding.

8.
Genome-wide association study (GWAS) has become an obvious general approach for studying traits of agricultural importance in higher plants, especially crops. Here, we present a GWAS of 32 morphologic and 10 agronomic traits in a collection of 615 barley cultivars genotyped by genome-wide polymorphisms from a recently developed barley oligonucleotide pool assay. Strong population structure effect related to mixed sampling based on seasonal growth habit and ear row number is present in this barley collection. Comparison of seven statistical approaches in a genome-wide scan for significant associations with or without correction for confounding by population structure, revealed that in reducing false positive rates while maintaining statistical power, a mixed linear model solution outperforms genomic control, structured association, stepwise regression control and principal components adjustment. The present study reports significant associations for sixteen morphologic and nine agronomic traits and demonstrates the power and feasibility of applying GWAS to explore complex traits in highly structured plant samples.

9.
Detection of quantitative trait loci (QTL) controlling complex traits followed by selection has become a common approach for selection in crop plants. The QTL are most often identified by linkage mapping using experimental F2, backcross, advanced inbred, or doubled haploid families. An alternative approach to QTL detection is the genome-wide association study (GWAS), which uses pre-existing lines such as those found in breeding programs. We explored the implementation of GWAS in oat (Avena sativa L.) to identify QTL affecting β-glucan concentration, a soluble dietary fiber with several human health benefits when consumed as a whole grain. A total of 431 lines of worldwide origin were tested over 2 years and genotyped using Diversity Array Technology (DArT) markers. A mixed model approach was used where both population structure fixed effects and pair-wise kinship random effects were included. Various mixed models that differed with respect to population structure and kinship were tested for their ability to control for false positives. As expected, given the level of population structure previously described in oat, population structure did not play a large role in controlling for false positives. Three independent markers were significantly associated with β-glucan concentration. Significant marker sequences were compared with rice and one of the three showed sequence homology to genes localized on rice chromosome seven adjacent to the CslF gene family, known to have β-glucan synthase function. Results indicate that GWAS in oat can be a successful option for QTL detection, more so with future development of higher-density markers.

10.
Diversity arrays technology (DArT) and simple sequence repeat (SSR) markers were applied to investigate population structure, extent of linkage disequilibrium and genetic diversity (kinship) on a genome-wide level in European barley (Hordeum vulgare L.) cultivars. A set of 183 varieties could be clearly distinguished into spring and winter types and was classified into five subgroups based on 253 DArT or 22 SSR markers. Despite the fact that the same number of groups was revealed by both marker types, it could be shown that this grouping was more distinct for the SSRs than the DArTs, when assigned to a Q-matrix by STRUCTURE. This was supported by the findings from principal coordinate analysis, where the SSRs showed a better resolution according to seasonal habit and row number than the DArTs. A considerable influence on the rate of significant associations with malting and kernel quality parameters was revealed by different marker types in this genome-wide association study using general and mixed linear models considering population structure. Fewer spurious associations were observed when population structure was based on SSR rather than on DArT markers. We therefore conclude that it is advisable to use independent marker datasets for calculating population structure and for performing the association analysis.

11.
One way to use a crop germplasm collection directly to map QTLs without using line-crossing experiments is the whole genome association mapping. A major problem with association mapping is the presence of population structure, which can lead to both false positives and failure to detect genuine associations (i.e., false negatives). Particularly in highly selfing species such as Asian cultivated rice, high levels of population structure are expected and therefore the efficiency of association mapping remains almost unknown. Here, we propose an approach that combines a Bayesian method for mapping multiple QTLs with a regression method that directly incorporates estimates of population structure. That is, the effects due to both multiple QTLs and population structure were included in our statistical model. We evaluated the efficiency of our approach in simulated- and real-trait analyses of a rice germplasm collection. Simulation analyses based on real marker data showed that our model could suppress both false-positive and false-negative rates and the error of estimation of genetic effects over single QTL models, indicating that our model has statistically desirable attributes over single QTL models. As real traits, we analyzed the size and shape of milled rice grains and found significant markers that may be linked to QTLs reported previously. Association mapping should have good prospects in highly selfing species such as rice if proper methods are adopted. Our approach will be useful for the whole genome association mapping of various selfing crop species.

12.
Population structure and genome-wide linkage disequilibrium (LD) were investigated in 192 Hordeum vulgare accessions providing a comprehensive coverage of past and present barley breeding in the Mediterranean basin, using 50 nuclear microsatellite and 1,130 DArT® markers. Both clustering and principal coordinate analyses clearly sub-divided the sample into five distinct groups centred on key ancestors and regions of origin of the germplasm. For given genetic distances, large variation in LD values was observed, ranging from closely linked markers completely at equilibrium to marker pairs at 50 cM separation still showing significant LD. Mean LD values across the whole population sample decayed below r² of 0.15 after 3.2 cM. By assaying 1,130 genome-wide DArT® markers, we demonstrated that, after accounting for population substructure, current genome coverage of 1 marker per 1.5 cM except for chromosome 4H with 1 marker per 3.62 cM is sufficient for whole genome association scans. We show, by identifying associations with powdery mildew that map in genomic regions known to have resistance loci, that associations can be detected in strongly stratified samples provided population structure is effectively controlled in the analysis. The population we describe is, therefore, shown to be a valuable resource, which can be used in basic and applied research in barley.
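
The r² statistic quoted in this abstract is the usual squared-correlation measure of LD between two loci. A minimal sketch of how it can be computed for one pair of biallelic markers, assuming phased haplotypes coded as 0/1 pairs (the function name `ld_r2` and the input layout are illustrative assumptions, not taken from the study):

```python
def ld_r2(haplotypes):
    """Squared-correlation LD (r^2) for two biallelic loci.

    `haplotypes` is a list of (allele_at_locus_A, allele_at_locus_B)
    pairs, each allele coded 0 or 1. This coding is an assumption made
    for the sketch.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n    # frequency of allele 1 at A
    p_b = sum(b for _, b in haplotypes) / n    # frequency of allele 1 at B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                       # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d * d / denom
```

Averaging such pairwise values within bins of genetic distance gives the kind of decay curve that the abstract summarizes.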

13.
Case-control studies of association in structured or admixed populations   (total citations: 7; self-citations: 0; citations by others: 7)
Case-control tests for association are an important tool for mapping complex-trait genes. But population structure can invalidate this approach, leading to apparent associations at markers that are unlinked to disease loci. Family-based tests of association can avoid this problem, but such studies are often more expensive and in some cases--particularly for late-onset diseases--are impractical. In this review article we describe a series of approaches published over the past 2 years which use multilocus genotype data to enable valid case-control tests of association, even in the presence of population structure. These tests can be classified into two categories. "Genomic control" methods use the independent marker loci to adjust the distribution of a standard test statistic, while "structured association" methods infer the details of population structure en route to testing for association. We discuss the statistical issues involved in the different approaches and present results from simulations comparing the relative performance of the methods under a range of models.
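
The "genomic control" idea described in this abstract can be sketched in a few lines. The version below is a minimal illustration, assuming 1-df chi-square statistics from presumed-null markers and the commonly used median-based inflation factor; the function names are hypothetical and the abstract does not specify this exact estimator.

```python
import math
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control(chi2_stats):
    """Return (lambda, adjusted statistics).

    lambda is the ratio of the observed median statistic to its
    expectation under the null; it is floored at 1 so that statistics
    are never inflated by the correction.
    """
    lam = statistics.median(chi2_stats) / CHI2_1DF_MEDIAN
    lam = max(lam, 1.0)
    return lam, [x / lam for x in chi2_stats]


def chi2_1df_pvalue(x):
    """Upper-tail p-value of a 1-df chi-square statistic."""
    return math.erfc(math.sqrt(x / 2.0))
```

A lambda well above 1 suggests that the test statistics are inflated, for example by population structure, and dividing by lambda rescales them before p-values are computed.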

14.
15.
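Progress in genome-wide association studies of complex diseases: statistical genetic analysis   (total citations: 7; self-citations: 0; citations by others: 7)
严卫丽 (Yan Weili) 《遗传》2008,30(5):543-549
In 2005, Science published the first genome-wide association study, on human age-related macular degeneration. Since then, genome-wide association studies of a series of complex diseases, including obesity, type 2 diabetes, coronary heart disease and Alzheimer's disease, have been reported in succession; this period has been called the first wave of human genome-wide association studies. This article introduces the methods, software and application examples of statistical analysis for genome-wide association studies; compares methods for adjusting P values for multiple testing in association analysis, including Bonferroni correction, step-down Bonferroni correction, simulation-based methods and methods that control the false discovery rate; and discusses the possible effects of population stratification on association results and the underlying principles, together with research progress on, and application examples of, methods for controlling population stratification in genome-wide association studies. In the first wave of genome-wide association studies, classical statistical genetic methods were used to discover many gene-phenotype associations and to explain them, including many previously unknown genes and chromosomal regions. However, further progress requires elucidating the roles that gene-gene interactions within the genome, and interactions between complex gene-gene networks and environmental factors, play in the development of complex diseases. Existing statistical methods certainly cannot meet this need, and the development of more advanced statistical analysis methods is imperative. Finally, the article lists websites for statistical analysis software for genome-wide association studies.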
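
Three of the P-value adjustments compared in this review can be illustrated with a short sketch. It assumes that the step-down Bonferroni correction corresponds to Holm's procedure and that false discovery rate control corresponds to the Benjamini-Hochberg procedure; the review may have used other variants, and the function names are illustrative only.

```python
def bonferroni(pvals):
    """Bonferroni: multiply each p-value by the number of tests."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def holm(pvals):
    """Holm step-down: the k-th smallest p-value is scaled by (m - k + 1)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        value = min(1.0, (m - rank) * pvals[i])
        running_max = max(running_max, value)  # keep adjusted values monotone
        adjusted[i] = running_max
    return adjusted


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg step-up adjustment controlling the FDR."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m - 1, -1, -1):          # walk from largest to smallest
        i = order[rank]
        value = min(1.0, pvals[i] * m / (rank + 1))
        running_min = min(running_min, value)
        adjusted[i] = running_min
    return adjusted
```

For the same input, the Bonferroni values are never smaller than the Holm values, which in turn are never smaller than the Benjamini-Hochberg values, reflecting the decreasing conservativeness of the three procedures.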

16.
An eggplant (Solanum melongena) association panel of 191 accessions, comprising a mixture of breeding lines, old varieties and landrace selections was SNP genotyped and phenotyped for key breeding fruit and plant traits at two locations over two seasons. A genome-wide association (GWA) analysis was performed using the mixed linear model, which takes into account both a kinship matrix and the sub-population membership of the accessions. Overall, 194 phenotype/genotype associations were uncovered, relating to 30 of the 33 measured traits. These associations involved 79 SNP loci mapping to 39 distinct chromosomal regions distributed over all 12 eggplant chromosomes. A comparison of the map positions of these SNPs with those of loci derived from conventional linkage mapping showed that GWA analysis both validated many of the known controlling loci and detected a large number of new marker/trait associations. Exploiting established syntenic relationships between eggplant chromosomes and those of tomato and pepper recognized orthologous regions in ten eggplant chromosomes harbouring genes influencing breeders’ traits.

17.
Although high-density SNP genotyping platforms generate a momentum for detailed genome-wide association (GWA) studies, an offshoot is a new insight into population genetics. Here, we present an example in one of the best-known founder populations by scrutinizing ten distinct Finnish early- and late-settlement subpopulations. By determining genetic distances, homozygosity, and patterns of linkage disequilibrium, we demonstrate that population substructure, and even individual ancestry, is detectable at a very high resolution and supports the concept of multiple historical bottlenecks resulting from consecutive founder effects. Given that genetic studies are currently aiming at identifying smaller and smaller genetic effects, recognizing and controlling for population substructure even at this fine level becomes imperative to avoid confounding and spurious associations. This study provides an example of the power of GWA data sets to demonstrate stratification caused by population history even within a seemingly homogeneous population, like the Finns. Further, the results provide interesting lessons concerning the impact of population history on the genome landscape of humans, as well as approaches to identify rare variants enriched in these subpopulations.

18.
High-density genotyping is extensively exploited in genome-wide association mapping studies and genomic selection in maize. By contrast, linkage mapping studies were until now mostly based on low-density genetic maps and theoretical results suggested this to be sufficient. This raises the question of whether an increase in marker density would be overkill for linkage mapping in biparental populations, or whether important QTL mapping parameters would benefit from it. In this study, we addressed this question using experimental data and a simulation based on linkage maps with marker densities of 1, 2, and 5 cM. QTL mapping was performed for six diverse traits in a biparental population with 204 doubled haploid maize lines and in a simulation study with varying QTL effects and closely linked QTL for different population sizes. Our results showed that high-density maps improved neither the QTL detection power nor the predictive power for the proportion of explained genotypic variance. By contrast, the precision of QTL localization, the precision of effect estimates of detected QTL, especially for small and medium sized QTL, as well as the power to resolve closely linked QTL profited from an increase in marker density from 5 to 1 cM. In conclusion, the higher costs for high-density genotyping are compensated for by more precise estimates of parameters relevant for knowledge-based breeding, thus making an increase in marker density for linkage mapping attractive.

19.
The genetic diversity, population structure, and linkage disequilibrium (LD) of peaches are greatly important in genome-wide association mapping. In the current study, 104 peach landrace accessions from six Chinese geographical regions were evaluated for fruit and phenological period. The accessions were genotyped with 53 genome-wide simple sequence repeat (SSR) markers. All SSR markers were highly polymorphic across the accessions, and a total of 340 alleles were detected, including 59 private alleles. Of the six regions studied, the northern part of China as well as the middle and lower reaches of the Changjiang River were found to be the most highly diverse genetically. Based on population structure analysis, the peaches were divided into five groups, which well agreed with the geographical distribution. Of the SSR pairs in these accessions, 18.07% (P < 0.05) were in LD. The mean r² value for all intrachromosomal loci pairs was 0.0149, and LD decayed at 6.01 cM. The general linear model was used to calculate the genome-wide marker-trait associations of 10 complex traits. The traits include flesh color around the stone, red pigment in the flesh, flesh texture, flesh adhesion, flesh firmness, fruit weight, chilling requirement, flowering time, ripening time, and fruit development period. These traits were estimated by analyzing the 104 landraces. Many of the associated markers were located in regions where quantitative trait loci (QTLs) were previously identified. Peach association mapping is an effective approach for identifying QTLs and may be an alternative to QTL mapping based on crosses between different lines.

20.
Association mapping promises to overcome the limitations of linkage mapping methods. The main objective of this study was to examine the applicability of multivariate association mapping with an empirical data set of sugar beet. A total of 111 diploid sugar beet inbreds was selected from the seed parent heterotic pool to represent a broad diversity with respect to sugar content (SC). The inbreds were genotyped with 26 simple sequence repeat markers chosen according to their map positions in proximity to previously identified quantitative trait loci for SC. For SC and beet yield (BY), the genotypic variances were highly significant (P < 0.01). Based on the global test of the bivariate mixed-model approach, four markers were significantly associated with SC, BY, or both at a false discovery rate of 0.025. All four markers were significantly (P < 0.05) associated with BY but only two with SC. The identification of markers associated with SC, BY, or both indicated that association mapping can be successfully applied in a sugar beet breeding context for detection of marker-phenotype associations. Furthermore, based on our results multivariate association mapping can be recommended as a promising tool to discriminate with a high mapping resolution between pleiotropy and linkage as reasons for co-localization of marker-phenotype associations for different traits.
