Similar articles
 Found 20 similar articles (search time: 109 ms)
1.
Li Q  Yu K  Li Z  Zheng G 《Human genetics》2008,123(6):617-623
In genome-wide association studies (GWAS), single-marker analysis is usually employed to identify the most significant single nucleotide polymorphisms (SNPs). The trend test has been proposed for analysis of case-control association. Three trend tests, optimal for the recessive, additive and dominant models respectively, are available. When the underlying genetic model is unknown, the maximum of the three trend test results (MAX) has been shown to be robust against genetic model misspecification. Since the asymptotic distribution of MAX depends on the allele frequency of the SNP, ranking SNPs by the P-value of MAX may give a different order from ranking them by the MAX statistic itself. Calculating the P-value of MAX for 300,000 (300 K) or more SNPs is computationally intensive, and software for obtaining the P-value of MAX is not widely available. On the other hand, the MAX statistic is very easy to calculate without complex computer programs. Thus, we study whether one could use the MAX statistic instead of its P-value to rank SNPs in GWAS. The approaches using MAX and its P-value to rank SNPs are referred to as MAX-rank and P-rank, respectively. By applying MAX-rank and P-rank to simulated datasets and four real datasets from GWAS, we found that the ranks of SNPs with true association are very similar under both approaches. Thus, we recommend using MAX-rank for genome-wide scans. After the top-ranked SNPs are identified, their P-values based on MAX can be calculated and compared with the significance level. The work of Q. Li was partially supported by the Knowledge Innovation Program of the Chinese Academy of Sciences, No. 30465W0 and 30475V0. The research of Z Li was partially sponsored by NIH grant EY014478.
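The MAX statistic in the abstract above can be illustrated with a minimal sketch. It assumes the usual Cochran-Armitage trend statistic with genotype scores (0, x, 1), where x = 0, 0.5 and 1 correspond to the recessive, additive and dominant models; the function names and the example counts are hypothetical and are not taken from the paper.

```python
from math import sqrt

# Genotype scores for 0, 1 and 2 copies of the risk allele (assumed coding).
SCORES = {
    "recessive": (0.0, 0.0, 1.0),
    "additive": (0.0, 0.5, 1.0),
    "dominant": (0.0, 1.0, 1.0),
}

def trend_z(cases, controls, scores):
    """Cochran-Armitage trend statistic for one SNP.

    cases / controls: genotype counts ordered by number of risk alleles.
    Raises ZeroDivisionError if the scored genotype does not vary.
    """
    R, S = sum(cases), sum(controls)
    N = R + S
    totals = [r + s for r, s in zip(cases, controls)]
    numerator = sum(x * (S * r - R * s)
                    for x, r, s in zip(scores, cases, controls))
    spread = (N * sum(x * x * n for x, n in zip(scores, totals))
              - sum(x * n for x, n in zip(scores, totals)) ** 2)
    return sqrt(N) * numerator / sqrt(R * S * spread)

def max_stat(cases, controls):
    """MAX: largest absolute trend statistic over the three genetic models."""
    return max(abs(trend_z(cases, controls, x)) for x in SCORES.values())

if __name__ == "__main__":
    # Hypothetical counts for two SNPs; rank by MAX, largest first.
    snps = {"snpA": ((20, 30, 50), (40, 30, 30)),
            "snpB": ((30, 40, 30), (32, 38, 30))}
    ranked = sorted(snps, key=lambda name: max_stat(*snps[name]), reverse=True)
    print(ranked)
```

Ranking by `max_stat` corresponds to what the abstract calls MAX-rank; obtaining the P-value of MAX requires its null distribution, which this sketch does not attempt.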

2.
Ahn MJ  Won HH  Lee J  Lee ST  Sun JM  Park YH  Ahn JS  Kwon OJ  Kim H  Shim YM  Kim J  Kim K  Kim YH  Park JY  Kim JW  Park K 《Human genetics》2012,131(3):365-372
The proportion of never-smokers among non-small cell lung cancer (NSCLC) patients in Asia is about 30-40%. Despite the striking demographics and high prevalence of never-smoker NSCLC, the exact causes remain undetermined. Although several genome-wide association (GWA) studies have been conducted to find susceptibility loci for lung cancer in never-smokers, no regions were replicated except for 5p15.33, suggesting locus heterogeneity and different environmental toxic effects. To identify genetic loci associated with susceptibility to lung cancer in never-smokers, we performed a GWA analysis using the Affymetrix 6.0 SNP array. For the discovery GWA set, we recruited 446 never-smoking Korean patients with NSCLC and 497 normal subjects. We tested association of SNPs with lung cancer susceptibility using the Cochran-Armitage trend test. For validation, 39 SNPs were selected from the top 50 SNPs, and five additional SNPs were selected in the DAB1 gene region, which showed significant associations in the GWA analysis. The validation SNPs were genotyped in an independent sample including 434 patients and 1,000 controls. Among the 44 validation SNPs, two SNPs (rs11080466 and rs11663246) near the APCDD1, NAPG and FAM38B genes in the 18p11.22 region were replicated. The P value of rs11080466 was 1.08 × 10⁻⁶ in the combined sets (2.68 × 10⁻⁵ in the discovery set and 2.60 × 10⁻³ in the validation set) and the odds ratio was 0.68 (0.58-0.79). We observed a similar association for rs11663246. Our result suggests the 18p11.22 region as a novel lung cancer susceptibility locus in never-smokers.

3.
Genome-wide association (GWA) studies usually detect common genetic variants with low-to-medium effect sizes. Many contributing variants are not revealed, since they fail to reach significance after strong correction for multiple comparisons. The WTCCC study for hypertension, for example, failed to identify genome-wide significant associations. We hypothesized that genetic variation in genes expressed specifically in the endothelium may be important for hypertension development. Results from the WTCCC study were combined with previously published gene expression data from mice to specifically investigate SNPs located within endothelial-specific genes, bypassing the requirement for genome-wide significance. Six SNPs from the WTCCC study were selected for independent replication in 5205 hypertensive patients and 5320 population-based controls, and successively in a cohort of 16537 individuals. A common variant (rs10860812) in the DRAM (damage-regulated autophagy modulator) locus showed association with hypertension (P = 0.008) in the replication study. The minor allele (A) had a protective effect (OR = 0.93; 95% CI 0.88–0.98 per A-allele), which replicates the association in the WTCCC GWA study. However, a second follow-up, in the larger cohort, failed to reveal an association with blood pressure. We further tested the endothelial-specific genes for co-localization with a panel of newly discovered SNPs from large meta-GWAS on hypertension or blood pressure. There was no significant overlap between those genes and hypertension or blood pressure loci. The result does not support the hypothesis that genetic variation in genes expressed in endothelium plays an important role for hypertension development. Moreover, the discordant association of rs10860812 with blood pressure in the case control study versus the larger Malmö Preventive Project–study highlights the importance of rigorous replication in multiple large independent studies.

4.
Summary: A two‐stage design is cost‐effective for genome‐wide association studies (GWAS) testing hundreds of thousands of single nucleotide polymorphisms (SNPs). In this design, each SNP is genotyped in stage 1 using a fraction of case–control samples. Top‐ranked SNPs are selected and genotyped in stage 2 using additional samples. A joint analysis, combining statistics from both stages, is applied in the second stage. Follow‐up studies can be regarded as a two‐stage design. Once some potential SNPs are identified, independent samples are further genotyped and analyzed separately or jointly with previous data to confirm the findings. When the underlying genetic model is known, an asymptotically optimal trend test (TT) can be used at each analysis. In practice, however, genetic models for SNPs with true associations are usually unknown. In this case, the existing methods for analysis of the two‐stage design and follow‐up studies are not robust across different genetic models. We propose a simple robust procedure with genetic model selection for the two‐stage GWAS. Our results show that, if the optimal TT has about 80% power when the genetic model is known, then the existing methods for analysis of the two‐stage design have a minimum power of about 20% across the four common genetic models (when the true model is unknown), while our robust procedure has a minimum power of about 70% across the same genetic models. The results can also be applied to follow‐up and replication studies with a joint analysis.

5.

Background

The genome-wide association (GWA) study has recently become a powerful approach for detecting genetic variants for common diseases without prior knowledge of the variant's location or function. Generally, in GWA studies, the most significant single-nucleotide polymorphisms (SNPs), those with top-ranked p values, are selected in stage one, with follow-up in stage two. The value of selecting SNPs based on statistically significant p values is obvious. However, when minor allele frequencies (MAFs) are relatively low, less-significant p values can still correspond to higher odds ratios (ORs), which might be more useful for prediction of disease status. Therefore, if SNPs are selected using an approach based only on significant p values, some important genetic variants might be missed. We proposed a hybrid approach for selecting candidate SNPs from the discovery stage of a GWA study, based on both p values and ORs, and conducted a simulation study to demonstrate the performance of our approach.

Results

The simulation results showed that our hybrid ranking approach was more powerful than the existing ranked p value approach for identifying relatively less-common SNPs. Meanwhile, the type I error probabilities of the hybrid approach are well controlled at the end of the second stage of the two-stage GWA study.

Conclusions

In GWA studies, SNPs should be considered for inclusion based not only on ranked p values but also on ranked ORs.

6.
Although they have demonstrated success in searching for common variants for complex diseases, genome-wide association (GWA) studies are less successful in detecting rare genetic variants because of the poor statistical power of most current methods. We developed a two-stage method that can be applied to GWA studies for detecting rare variants. Here we report the results of applying this two-stage method to the Wellcome Trust Case Control Consortium (WTCCC) dataset, which includes seven complex diseases: bipolar disorder, cardiovascular disease, hypertension (HT), rheumatoid arthritis, Crohn’s disease, type 1 diabetes and type 2 diabetes (T2D). We identified 24 genes or regions that reach genome-wide significance. Eight of them are novel and were not reported in the WTCCC study. The cumulative risk (or protective) haplotype frequency for each of the 8 genes or regions is small, being at most 11%. For each of the novel genes, the risk (or protective) haplotype set cannot be tagged by the common SNPs available in chips (r² < 0.32). The gene identified in HT was further replicated in the Framingham Heart Study, and is also significantly associated with T2D. Our analysis suggests that searching for rare genetic variants is feasible in current GWA studies and candidate gene studies, and the results can serve as guides to future resequencing studies to identify the underlying rare functional variants.

7.
Genome‐wide association (GWA) studies based on GBLUP models are a common practice in animal breeding. However, effect sizes of GWA tests are small, requiring larger sample sizes to enhance power of detection of rare variants. Because of difficulties in increasing sample size in animal populations, one alternative is to implement a meta‐analysis (MA), combining information and results from independent GWA studies. Although this methodology has been used widely in human genetics, implementation in animal breeding has been limited. Thus, we present methods to implement a MA of GWA, describing the proper approach to compute weights derived from multiple genomic evaluations based on animal‐centric GBLUP models. Application to real datasets shows that MA increases power of detection of associations in comparison with population‐level GWA, allowing for population structure and heterogeneity of variance components across populations to be accounted for. Another advantage of MA is that it does not require access to genotype data that is required for a joint analysis. Scripts related to the implementation of this approach, which consider the strength of association as well as the sign, are distributed and thus account for heterogeneity in association phase between QTL and SNPs. Thus, MA of GWA is an attractive alternative to summarizing results from multiple genomic studies, avoiding restrictions with genotype data sharing, definition of fixed effects and different scales of measurement of evaluated traits.

8.
9.
Recent genome‐wide association (GWA) studies have identified a number of novel genes/variants predisposing to obesity. However, most GWA studies have focused on individual single‐nucleotide polymorphisms (SNPs)/genes with a strong statistical association with a phenotypic trait without considering potential biological interplay of the tested genes. In this study, we performed biological pathway‐based GWA analysis for BMI and body fat mass. We used individual-level genotype data generated from 1,000 unrelated US whites who were genotyped for ~500,000 SNPs. Statistical analysis of pathways was performed using a modification of the Gene Set Enrichment Algorithm. A total of 963 pathways extracted from the BioCarta, Kyoto Encyclopedia of Genes and Genomes (KEGG), Ambion GeneAssist, and Gene Ontology (GO) databases were analyzed. Among all of the pathways analyzed, the vasoactive intestinal peptide (VIP) pathway was most strongly associated with fat mass (nominal P = 0.0009) and was the third most strongly associated pathway with BMI (nominal P = 0.0006). After multiple testing correction, the VIP pathway achieved false‐discovery rate (FDR) q values of 0.042 and 0.120 for fat mass and BMI, respectively. Our study is the first to demonstrate that the VIP pathway may play an important role in development of obesity. The study also highlights the importance of pathway‐based GWA analysis in identification of additional genes/variants for complex human diseases.

10.
Fuchs endothelial corneal dystrophy (FECD) is a common, late-onset disorder of the corneal endothelium. Although progress has been made in understanding the genetic basis of FECD by studying large families in which the phenotype is transmitted in an autosomal dominant fashion, a recently reported genome-wide association study identified common alleles at a locus on chromosome 18 near TCF4 which confer susceptibility to FECD. Here, we report the findings of our independent validation study for TCF4 using the largest FECD dataset to date (450 FECD cases and 340 normal controls). Logistic regression with sex as a covariate was performed for three genetic models: dominant (DOM), additive (ADD), and recessive (REC). We found significant association with rs613872, the target marker reported by Baratz et al. (2010), for all three genetic models (DOM: P = 9.33×10⁻³⁵; ADD: P = 7.48×10⁻³⁰; REC: P = 5.27×10⁻⁶). To strengthen the association study, we also conducted a genome-wide linkage scan on 64 multiplex families, composed primarily of affected sibling pairs (ASPs), using both parametric and non-parametric two-point and multipoint analyses. The most significant linkage region localizes to chromosome 18 from 69.94 cM to 85.29 cM, with a peak multipoint HLOD = 2.5 at rs1145315 (75.58 cM) under the DOM model, mapping 1.5 Mb proximal to rs613872. In summary, our study presents evidence to support the role of the intronic TCF4 single nucleotide polymorphism rs613872 in late-onset FECD through both association and linkage studies.

11.
Genome-wide association studies (GWAS) are widely used to search for genetic loci that underlie human disease. Another goal is to predict disease risk for different individuals given their genetic sequence. Such predictions could either be used as a “black box” in order to promote changes in life-style and screening for early diagnosis, or as a model that can be studied to better understand the mechanism of the disease. Current methods for risk prediction typically rank single nucleotide polymorphisms (SNPs) by the p-value of their association with the disease, and use the top-associated SNPs as input to a classification algorithm. However, the predictive power of such methods is relatively poor. To improve the predictive power, we devised BootRank, which uses bootstrapping in order to obtain a robust prioritization of SNPs for use in predictive models. We show that BootRank improves the ability to predict disease risk of unseen individuals in the Wellcome Trust Case Control Consortium (WTCCC) data and results in a more robust set of SNPs and a larger number of enriched pathways being associated with the different diseases. Finally, we show that combining BootRank with seven different classification algorithms improves performance compared to previous studies that used the WTCCC data. Notably, diseases for which BootRank results in the largest improvements were recently shown to have more heritability than previously thought, likely due to contributions from variants with low minor allele frequency (MAF), suggesting that BootRank can be beneficial in cases where SNPs affecting the disease are poorly tagged or have low MAF. Overall, our results show that improving disease risk prediction from genotypic information may be a tangible goal, with potential implications for personalized disease screening and treatment.

12.
The selection consequences of competition in plants have been traditionally interpreted based on a “size‐advantage” hypothesis – that is, under intense crowding/competition from neighbors, natural selection generally favors capacity for a relatively large plant body size. However, this conflicts with abundant data, showing that resident species body size distributions are usually strongly right‐skewed at virtually all scales within vegetation. Using surveys within sample plots and a neighbor‐removal experiment, we tested: (1) whether resident species that have a larger maximum potential body size (MAX) generally have more successful local individual recruitment, and thus greater local abundance/density (as predicted by the traditional size‐advantage hypothesis); and (2) whether there is a general between‐species trade‐off relationship between MAX and capacity to produce offspring when body size is severely suppressed by crowding/competition – that is, whether resident species with a larger MAX generally also need to reach a larger minimum reproductive threshold size (MIN) before they can reproduce at all. The results showed that MIN had a positive relationship with MAX across resident species, and local density – as well as local density of just reproductive individuals – was generally greater for species with smaller MIN (and hence smaller MAX). In addition, the cleared neighborhoods of larger target species (which had relatively large MIN) generally had – in the following growing season – a lower ratio of conspecific recruitment within these neighborhoods relative to recruitment of other (i.e., smaller) species (which had generally smaller MIN). 
These data are consistent with an alternative hypothesis based on a ‘reproductive‐economy‐advantage’ – that is, superior fitness under competition in plants generally requires not larger potential body size, but rather superior capacity to recruit offspring that are in turn capable of producing grand‐offspring – and hence transmitting genes to future generations – despite intense and persistent (cross‐generational) crowding/competition from near neighbors. Selection for the latter is expected to favor relatively small minimum reproductive threshold size and hence – as a tradeoff – relatively small (not large) potential body size.

13.

Background

High-throughput genotype (HTG) data has been used primarily in genome-wide association (GWA) studies; however, GWA results explain only a limited part of the complete genetic variation of traits. In systems genetics, network approaches have been shown to be able to identify pathways and their underlying causal genes to unravel the biological and genetic background of complex diseases and traits, e.g., the Weighted Gene Co-expression Network Analysis (WGCNA) method based on microarray gene expression data. The main objective of this study was to develop a scale-free weighted genetic interaction network method using whole genome HTG data in order to detect biologically relevant pathways and potential genetic biomarkers for complex diseases and traits.

Results

We developed the Weighted Interaction SNP Hub (WISH) network method that uses HTG data to detect genome-wide interactions between single nucleotide polymorphisms (SNPs) and their relationship with complex traits. Data dimensionality reduction was achieved by selecting SNPs based on their: 1) degree of genome-wide significance and 2) degree of genetic variation in a population. Network construction was based on pairwise Pearson's correlation between SNP genotypes or the epistatic interaction effect between SNP pairs. To identify modules, the Topological Overlap Measure (TOM) was calculated, reflecting the degree of overlap in shared neighbours between SNP pairs. Modules, clusters of highly interconnected SNPs, were defined using a tree-cutting algorithm on the SNP dendrogram created from the dissimilarity TOM (1-TOM). Modules were selected for functional annotation based on their association with the trait of interest, defined by the Genome-wide Module Association Test (GMAT). We successfully tested the established WISH network method using simulated and real SNP interaction data and GWA study results for carcass weight in a pig resource population; this resulted in detecting modules and key functional and biological pathways related to carcass weight.

Conclusions

We developed the WISH network method which is a novel 'systems genetics' approach to study genetic networks underlying complex trait variation. The WISH network method reduces data dimensionality and statistical complexity in associating genotypes with phenotypes in GWA studies and enables researchers to identify biologically relevant pathways and potential genetic biomarkers for any complex trait of interest.
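The topological overlap step in the abstract above can be sketched as follows. This is a minimal illustration that assumes the TOM definition commonly used in weighted network analysis (shared-neighbour weight plus direct connection, divided by the smaller connectivity plus one minus the direct connection). The abstract does not give the exact formula used by WISH, and the function names and toy adjacency matrix are hypothetical.

```python
def topological_overlap(adj):
    """Topological Overlap Measure for a symmetric adjacency matrix.

    adj[i][j] is a connection strength in [0, 1], for example an absolute
    Pearson correlation between two SNP genotype vectors.
    """
    n = len(adj)
    # Connectivity of each node, excluding the self-connection.
    k = [sum(adj[i][u] for u in range(n) if u != i) for i in range(n)]
    tom = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(n) if u not in (i, j))
            value = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
            tom[i][j] = tom[j][i] = value
    return tom

def tom_dissimilarity(adj):
    """1 - TOM, the dissimilarity on which a dendrogram would be built."""
    return [[1.0 - v for v in row] for row in topological_overlap(adj)]

if __name__ == "__main__":
    # Toy network: SNP 0 and SNP 2 are linked only through SNP 1.
    path = [[1, 1, 0],
            [1, 1, 1],
            [0, 1, 1]]
    for row in topological_overlap(path):
        print(row)
```

In the toy network the two unconnected SNPs still receive a non-zero overlap because they share a neighbour, which is the property that lets modules form around hub SNPs.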

14.
Top signals from genome-wide association studies (GWASs) of type 2 diabetes (T2D) are enriched with expression quantitative trait loci (eQTLs) identified in skeletal muscle and adipose tissue. We therefore hypothesized that such eQTLs might account for a disproportionate share of the heritability estimated from all SNPs interrogated through GWASs. To test this hypothesis, we applied linear mixed models to the Wellcome Trust Case Control Consortium (WTCCC) T2D data set and to data sets representing Mexican Americans from Starr County, TX, and Mexicans from Mexico City. We estimated the proportion of phenotypic variance attributable to the additive effect of all variants interrogated in these GWASs, as well as a much smaller set of variants identified as eQTLs in human adipose tissue, skeletal muscle, and lymphoblastoid cell lines. The narrow-sense heritability explained by all interrogated SNPs in each of these data sets was substantially greater than the heritability accounted for by genome-wide-significant SNPs (∼10%); GWAS SNPs explained over 50% of phenotypic variance in the WTCCC, Starr County, and Mexico City data sets. The estimate of heritability attributable to cross-tissue eQTLs was greater in the WTCCC data set and among lean Hispanics, whereas adipose eQTLs significantly explained heritability among Hispanics with a body mass index ≥ 30. These results support an important role for regulatory variants in the genetic component of T2D susceptibility, particularly for eQTLs that elicit effects across insulin-responsive peripheral tissues.

15.
Alzheimer's disease (AD) is a serious neurodegenerative disorder and its cause remains largely elusive. In past years, genome-wide association (GWA) studies have provided an effective means for AD research. However, the univariate method that is commonly used in GWA studies cannot effectively detect the biological mechanisms associated with this disease. In this study, we propose a new strategy for the GWA analysis of AD that combines random forests with enrichment analysis. First, backward feature selection using random forests was performed on a GWA dataset of AD patients carrying the apolipoprotein gene (APOE ε4) and 1058 susceptible single nucleotide polymorphisms (SNPs) were detected, including several known AD-associated SNPs. Next, the susceptible SNPs were investigated by enrichment analysis and significantly-associated gene functional annotations, such as “alternative splicing”, “glycoprotein”, and “neuron development”, were successfully discovered, indicating that these biological mechanisms play important roles in the development of AD in APOE ε4 carriers. These findings may provide insights into the pathogenesis of AD and helpful guidance for further studies. Furthermore, this strategy can easily be modified and applied to GWA studies of other complex diseases.

16.
For many genome-wide association (GWA) studies individually genotyping one million or more SNPs provides a marginal increase in coverage at a substantial cost. Much of the information gained is redundant due to the correlation structure inherent in the human genome. Pooling-based GWA studies could benefit significantly by utilizing this redundancy to reduce noise, improve the accuracy of the observations and increase genomic coverage. We introduce a measure of correlation between individual genotyping and pooling, under the same framework that r² provides a measure of linkage disequilibrium (LD) between pairs of SNPs. We then report a new non-haplotype multimarker multi-loci method that leverages the correlation structure between SNPs in the human genome to increase the efficacy of pooling-based GWA studies. We first give a theoretical framework and derivation of our multimarker method. Next, we evaluate simulations using this multimarker approach in comparison to single marker analysis. Finally, we experimentally evaluate our method using different pools of HapMap individuals on the Illumina 450S Duo, Illumina 550K and Affymetrix 5.0 platforms for a combined total of 1,333,631 SNPs. Our results show that use of multimarker analysis reduces noise specific to pooling-based studies, allows for efficient integration of multiple microarray platforms and provides more accurate measures of significance than single marker analysis. Additionally, this approach can be extended to allow for imputing the association significance for SNPs not directly observed using neighboring SNPs in LD. This multimarker method can now be used to cost-effectively complete pooling-based GWA studies with multiple platforms across over one million SNPs and to impute neighboring SNPs weighted for the loss of information due to pooling.
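For reference, the r² measure of linkage disequilibrium that the abstract above uses as its framework can be computed from haplotype and allele frequencies as in the short sketch below. It covers only the standard pairwise r², not the pooling-specific correlation measure that the paper introduces; the function name and frequencies are hypothetical.

```python
def ld_r2(p_ab, p_a, p_b):
    """Pairwise r^2 between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A and allele B.
    p_a, p_b: frequencies of allele A and allele B (each strictly
    between 0 and 1, otherwise the denominator is zero).
    """
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))

if __name__ == "__main__":
    print(ld_r2(0.30, 0.30, 0.30))  # alleles always travel together
    print(ld_r2(0.09, 0.30, 0.30))  # alleles are independent
```

A value near 1 means one SNP is almost fully redundant given the other, which is the redundancy the multimarker method is described as exploiting.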

17.
A previous genome-wide association (GWA) meta-analysis of 12,386 PD cases and 21,026 controls conducted by the International Parkinson's Disease Genomics Consortium (IPDGC) discovered or confirmed 11 Parkinson's disease (PD) loci. This first analysis of the two-stage IPDGC study focused on the set of loci that passed genome-wide significance in the first stage GWA scan. However, the second stage genotyping array, the ImmunoChip, included a larger set of 1,920 SNPs selected on the basis of the GWA analysis. Here, we analyzed this set of 1,920 SNPs, and we identified five additional PD risk loci (combined p < 5×10⁻¹⁰, PARK16/1q32, STX1B/16p11, FGF20/8p22, STBD1/4q21, and GPNMB/7p15). Two of these five loci have been suggested by previous association studies (PARK16/1q32, FGF20/8p22), and this study provides further support for these findings. Using a dataset of post-mortem brain samples assayed for gene expression (n = 399) and methylation (n = 292), we identified methylation and expression changes associated with PD risk variants in PARK16/1q32, GPNMB/7p15, and STX1B/16p11 loci, hence suggesting potential molecular mechanisms and candidate genes at these risk loci.

18.
The improved characterisation of risk factors for rheumatoid arthritis (RA) suggests they could be combined to identify individuals at increased disease risks in whom preventive strategies may be evaluated. We aimed to develop an RA prediction model capable of generating clinically relevant predictive data and to determine if it better predicted younger onset RA (YORA). Our novel modelling approach combined odds ratios for 15 four-digit/10 two-digit HLA-DRB1 alleles, 31 single nucleotide polymorphisms (SNPs) and ever-smoking status in males to determine risk using computer simulation and confidence interval based risk categorisation. Only males were evaluated in our models incorporating smoking as ever-smoking is a significant risk factor for RA in men but not women. We developed multiple models to evaluate each risk factor's impact on prediction. Each model's ability to discriminate anti-citrullinated protein antibody (ACPA)-positive RA from controls was evaluated in two cohorts: Wellcome Trust Case Control Consortium (WTCCC: 1,516 cases; 1,647 controls); UK RA Genetics Group Consortium (UKRAGG: 2,623 cases; 1,500 controls). HLA and smoking provided strongest prediction with good discrimination evidenced by an HLA-smoking model area under the curve (AUC) value of 0.813 in both WTCCC and UKRAGG. SNPs provided minimal prediction (AUC 0.660 WTCCC/0.617 UKRAGG). Whilst high individual risks were identified, with some cases having estimated lifetime risks of 86%, only a minority overall had substantially increased odds for RA. High risks from the HLA model were associated with YORA (P<0.0001); ever-smoking associated with older onset disease. This latter finding suggests smoking's impact on RA risk manifests later in life. Our modelling demonstrates that combining risk factors provides clinically informative RA prediction; additionally HLA and smoking status can be used to predict the risk of younger and older onset RA, respectively.

19.
Although several genome‐wide association (GWA) studies of human personality have been recently published, genetic variants that are highly associated with certain personality traits remain unknown, due to difficulty reproducing results. To further investigate these genetic variants, we assessed biological pathways using GWA datasets. Pathway analysis using GWA data was performed on 1089 Korean women whose personality traits were measured with the Revised NEO Personality Inventory for the 5‐factor model of personality. A total of 1042 pathways containing 8297 genes were included in our study. Of these, 14 pathways were highly enriched with association signals that were validated in 1490 independent samples. These pathways include association of: Neuroticism with axon guidance [L1 cell adhesion molecule (L1CAM) interactions]; Extraversion with neuronal system and voltage‐gated potassium channels; Agreeableness with L1CAM interaction, neurotransmitter receptor binding and downstream transmission in postsynaptic cells; and Conscientiousness with the interferon‐gamma and platelet‐derived growth factor receptor beta polypeptide pathways. Several genes that contribute to top‐ranked pathways in this study were previously identified in GWA studies or by pathway analysis in schizophrenia or other neuropsychiatric disorders. Here we report the first pathway analysis of all five personality traits. Importantly, our analysis identified novel pathways that contribute to understanding the etiology of personality traits.

20.
Stringer S  Wray NR  Kahn RS  Derks EM 《PloS one》2011,6(11):e27964
Complex diseases are often highly heritable. However, for many complex traits only a small proportion of the heritability can be explained by observed genetic variants in traditional genome-wide association (GWA) studies. Moreover, for some of those traits few significant SNPs have been identified. Single SNP association methods test for association at a single SNP, ignoring the effect of other SNPs. We show using a simple multi-locus odds model of complex disease that moderate to large effect sizes of causal variants may be estimated as relatively small effect sizes in single SNP association testing. This underestimation effect is most severe for diseases influenced by numerous risk variants. We relate the underestimation effect to the concept of non-collapsibility found in the statistics literature. As described, continuous phenotypes generated with linear genetic models are not affected by this underestimation effect. Since many GWA studies apply single SNP analysis to dichotomous phenotypes, previously reported results potentially underestimate true effect sizes, thereby impeding identification of true effect SNPs. Therefore, when a multi-locus model of disease risk is assumed, a multi SNP analysis may be more appropriate.
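The underestimation effect described in the abstract above can be reproduced with a small numerical sketch. It assumes a toy multi-locus odds model in which every locus is an independent binary carrier status that multiplies the disease odds by the same odds ratio. The model, the function name and all parameter values are illustrative assumptions, not the authors' simulation design.

```python
from itertools import product

def marginal_or(base_odds, odds_ratio, carrier_freq, n_other):
    """Odds ratio seen for one locus when the other loci are ignored.

    Every locus multiplies the disease odds by `odds_ratio` in carriers.
    The other `n_other` loci are independent of the target locus and are
    averaged out, as a single-SNP analysis implicitly does.
    """
    def mean_risk(target):
        risk = 0.0
        for others in product((0, 1), repeat=n_other):
            weight = 1.0
            for g in others:
                weight *= carrier_freq if g else 1.0 - carrier_freq
            odds = base_odds * odds_ratio ** (target + sum(others))
            risk += weight * odds / (1.0 + odds)
        return risk

    p1, p0 = mean_risk(1), mean_risk(0)
    return (p1 / (1.0 - p1)) / (p0 / (1.0 - p0))

if __name__ == "__main__":
    for n_other in (0, 1, 5):
        print(n_other, round(marginal_or(0.25, 3.0, 0.5, n_other), 3))
```

With no other loci the marginal odds ratio equals the true value of 3; with one additional locus it falls to about 2.78, and it falls further as more loci are added, even though no locus confounds another. This is the non-collapsibility of the odds ratio that the abstract refers to.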
