Similar literature
1.
Several lines of evidence suggest that genome-wide association studies (GWASs) have the potential to explain more of the “missing heritability” of common complex phenotypes. However, reliable methods for identifying a larger proportion of SNPs are currently lacking. Here, we present a genetic-pleiotropy-informed method for improving gene discovery with the use of GWAS summary-statistics data. We applied this methodology to identify additional loci associated with schizophrenia (SCZ), a highly heritable disorder with significant missing heritability. Epidemiological and clinical studies suggest comorbidity between SCZ and cardiovascular-disease (CVD) risk factors, including systolic blood pressure, triglycerides, low- and high-density lipoprotein, body mass index, waist-to-hip ratio, and type 2 diabetes. Using stratified quantile-quantile plots, we show enrichment of SNPs associated with SCZ as a function of the association with several CVD risk factors and a corresponding reduction in false discovery rate (FDR). We validate this “pleiotropic enrichment” by demonstrating increased replication rate across independent SCZ substudies. Applying the stratified FDR method, we identified 25 loci associated with SCZ at a conditional FDR level of 0.01. Of these, ten loci are associated with both SCZ and CVD risk factors, mainly triglycerides and low- and high-density lipoproteins but also waist-to-hip ratio, systolic blood pressure, and body mass index. Together, these findings suggest the feasibility of using genetic-pleiotropy-informed methods for improving gene discovery in SCZ and identifying potential mechanistic relationships with various CVD risk factors.
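The conditional FDR mentioned in this abstract can be illustrated with a short Python sketch. It assumes a commonly used empirical estimator, in which the primary-trait p-value of a SNP is divided by the empirical fraction of SNPs with an equal or smaller primary p-value among those whose secondary-trait p-value is at least as small. The abstract does not spell out the estimator, so the function name `conditional_fdr` and the toy inputs are hypothetical and this is not the authors' implementation.

```python
from typing import List


def conditional_fdr(p_primary: List[float], p_secondary: List[float]) -> List[float]:
    """Empirical conditional FDR per SNP (illustrative, O(n^2)).

    Assumed estimator: cFDR_i = p_i / F(p_i | q <= q_i), where F is the
    empirical CDF of primary p-values among SNPs whose secondary
    p-value q is no larger than q_i. Values are capped at 1.
    """
    n = len(p_primary)
    result = []
    for i in range(n):
        p_i, q_i = p_primary[i], p_secondary[i]
        # SNPs at least as strongly associated with the secondary trait.
        stratum = [j for j in range(n) if p_secondary[j] <= q_i]
        # Within that stratum, count SNPs at least as significant as SNP i.
        hits = sum(1 for j in stratum if p_primary[j] <= p_i)
        cdf = hits / len(stratum)  # never zero: SNP i counts itself
        result.append(min(1.0, p_i / cdf))
    return result


if __name__ == "__main__":
    # Hypothetical p-values for a primary trait and one secondary trait.
    print(conditional_fdr([0.01, 0.5], [0.9, 0.1]))
```

The quadratic loop keeps the sketch readable; a practical version would sort the p-values once and use cumulative counts.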

2.
3.
PLoS ONE, 2012, 7(12)
Genome-wide association studies (GWAS) have successfully identified a number of single-nucleotide polymorphisms (SNPs) associated with colorectal cancer (CRC) risk. However, the susceptibility loci known today explain only a small fraction of the genetic risk. Gene-gene interaction (GxG) is considered to be one source of the missing heritability. To address this, we performed a genome-wide search for pair-wise GxG associated with CRC risk using 8,380 cases and 10,558 controls in the discovery phase and 2,527 cases and 2,658 controls in the replication phase. We developed a simple but powerful method for testing interaction, which we term the Average Risk Due to Interaction (ARDI). With this method, we conducted a genome-wide search to identify SNPs showing evidence for GxG with previously identified CRC susceptibility loci from 14 independent regions. We also conducted a genome-wide search for GxG by using marginal association screening and then examining interactions among SNPs that passed the screening threshold (p < 10^-4). For the known locus rs10795668 (10p14), we found an interacting SNP rs367615 (5q21) with replication p = 0.01 and combined p = 4.19×10^-8. Among the top marginal SNPs after LD pruning (n = 163), we identified an interaction between rs1571218 (20p12.3) and rs10879357 (12q21.1) (nominal combined p = 2.51×10^-6; Bonferroni adjusted p = 0.03). Our study represents the first comprehensive search for GxG in CRC, and our results may provide new insight into the genetic etiology of CRC.

4.
Genome-wide association studies (GWASs) have identified a number of susceptibility genes for schizophrenia (SCZ) and bipolar disorder (BD). However, the identification of risk genes for major depressive disorder (MDD) has been unsuccessful because the etiology of MDD is more influenced by environmental factors; thus, gene–environment (G×E) interactions are important, such as interplay with stressful life events (SLEs). We assessed the G×E interactions and main effects of genes targeting depressive symptoms. Using a case–control design, 922 hospital staff members were evaluated for depressive symptoms according to the Beck Depression Inventory (BDI; “depression” and “control” groups were classified using a BDI score of 10 as the cutoff), SLEs, and personality. A total of sixty-three genetic variants were selected on the basis of previous GWASs of MDD, SCZ, and BD as well as candidate-gene (SLC6A4, BDNF, DBH, and FKBP5) studies. Logistic regression analysis revealed a marginally significant interaction (genetic variant × SLE) at rs4523957 (P_uncorrected = 0.0034) with depression and a significant association of a single-nucleotide polymorphism identified from BD GWAS evidence (rs7296288, downstream of DHH at 12q13.1) with depression as the main effect (P_uncorrected = 9.4×10^-4, P_corrected = 0.0424). We also found that SLEs had a larger impact on depression (odds ratio ≈ 3), as reported previously. These results suggest that DHH plays a possible role in depression etiology; however, variants from MDD or SCZ GWAS evidence or candidate genes showed no significant associations or minimal effects of interactions with SLEs on depression.

5.
Schizophrenia (SCZ) is a severe psychiatric disorder associated with many different risk factors, both genetic and environmental. A recent genome-wide association study (GWAS) of Han Chinese identified three single-nucleotide polymorphisms (SNPs rs11038167, rs11038172, and rs835784) in the tetraspanin gene TSPAN18 as possible susceptibility loci for schizophrenia. Hoping to validate these findings, we conducted a case-control study of Han Chinese with 1093 schizophrenia cases and 1022 healthy controls. Using the LDR-PCR method to genotype polymorphisms in TSPAN18, we found no significant differences (P > 0.05) between patients and controls in either the allele or genotype frequency of the SNPs rs11038167 and rs11038172. We did find, however, that the frequency of the ‘A’ allele of SNP rs835784 is significantly higher in patients than in controls. We further observed a significant association (OR = 1.197, 95% CI = 1.047–1.369) between risk for SCZ and this ‘A’ allele. These results confirm the significant association, in Han Chinese populations, of increased SCZ risk and the variant of the TSPAN18 gene containing the ‘A’ allele of SNP rs835784.

6.
Genome-wide association studies (GWAS) have identified thousands of genetic variants that are associated with complex traits. However, a stringent significance threshold is required to identify robust genetic associations. Leveraging relevant auxiliary covariates has the potential to boost statistical power to exceed the significance threshold. Particularly, abundant pleiotropy and the non-random distribution of SNPs across various functional categories suggest that leveraging GWAS test statistics from related traits and/or functional genomic data may boost GWAS discovery. While type 1 error rate control has become standard in GWAS, control of the false discovery rate can be a more powerful approach. The conditional false discovery rate (cFDR) extends the standard FDR framework by conditioning on auxiliary data to call significant associations, but current implementations are restricted to auxiliary data satisfying specific parametric distributions, typically GWAS p-values for related traits. We relax these distributional assumptions, enabling an extension of the cFDR framework that supports auxiliary covariates from arbitrary continuous distributions (“Flexible cFDR”). Our method can be applied iteratively, thereby supporting multi-dimensional covariate data. Through simulations we show that Flexible cFDR increases sensitivity whilst controlling FDR after one or several iterations. We further demonstrate its practical potential through application to an asthma GWAS, leveraging various functional genomic data to find additional genetic associations for asthma, which we validate in the larger, independent, UK Biobank data resource.

7.
The direct estimation of heritability from genome-wide common variant data as implemented in the program Genome-wide Complex Trait Analysis (GCTA) has provided a means to quantify heritability attributable to all interrogated variants. We have quantified the variance in liability to disease explained by all SNPs for two phenotypically-related neurobehavioral disorders, obsessive-compulsive disorder (OCD) and Tourette Syndrome (TS), using GCTA. Our analysis yielded a heritability point estimate of 0.58 (se = 0.09, p = 5.64e-12) for TS, and 0.37 (se = 0.07, p = 1.5e-07) for OCD. In addition, we conducted multiple genomic partitioning analyses to identify genomic elements that concentrate this heritability. We examined genomic architectures of TS and OCD by chromosome, MAF bin, and functional annotations. In addition, we assessed heritability for early onset and adult onset OCD. Among other notable results, we found that SNPs with a minor allele frequency of less than 5% accounted for 21% of the TS heritability and 0% of the OCD heritability. Additionally, we identified a significant contribution to TS and OCD heritability by variants significantly associated with gene expression in two regions of the brain (parietal cortex and cerebellum) for which we had available expression quantitative trait loci (eQTLs). Finally we analyzed the genetic correlation between TS and OCD, revealing a genetic correlation of 0.41 (se = 0.15, p = 0.002). These results are very close to previous heritability estimates for TS and OCD based on twin and family studies, suggesting that very little, if any, heritability is truly missing (i.e., unassayed) from TS and OCD GWAS studies of common variation. The results also indicate that there is some genetic overlap between these two phenotypically-related neuropsychiatric disorders, but suggest that the two disorders have distinct genetic architectures.

8.
Recent results indicate that genome-wide association studies (GWAS) have the potential to explain much of the heritability of common complex phenotypes, but methods are lacking to reliably identify the remaining associated single nucleotide polymorphisms (SNPs). We applied stratified False Discovery Rate (sFDR) methods to leverage genic enrichment in GWAS summary statistics data to uncover new loci likely to replicate in independent samples. Specifically, we use linkage disequilibrium-weighted annotations for each SNP in combination with nominal p-values to estimate the True Discovery Rate (TDR = 1−FDR) for strata determined by different genic categories. We show a consistent pattern of enrichment of polygenic effects in specific annotation categories across diverse phenotypes, with the greatest enrichment for SNPs tagging regulatory and coding genic elements, little enrichment in introns, and negative enrichment for intergenic SNPs. Stratified enrichment directly leads to increased TDR for a given p-value, mirrored by increased replication rates in independent samples. We show this in independent Crohn's disease GWAS, where we find a hundredfold variation in replication rate across genic categories. Applying a well-established sFDR methodology, we demonstrate the utility of stratification for improving power of GWAS in complex phenotypes, with increased rejection rates from 20% in height to 300% in schizophrenia with traditional FDR and sFDR both fixed at 0.05. Our analyses demonstrate an inherent stratification among GWAS SNPs with important conceptual implications that can be leveraged by statistical methods to improve the discovery of loci.
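The effect of stratification described in this abstract can be sketched in Python. The example below swaps in the plain Benjamini-Hochberg step-up procedure, applied separately within each annotation stratum, as a simple stand-in for the sFDR methodology; it does not use the linkage-disequilibrium-weighted annotations or the TDR estimates the abstract refers to. The function names, stratum labels, and p-values are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def bh_reject(pvals: Sequence[float], alpha: float) -> List[bool]:
    """Benjamini-Hochberg step-up: True where the hypothesis is rejected."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    # Largest rank k such that p_(k) <= alpha * k / m.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            k_max = rank
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            reject[i] = True
    return reject


def stratified_bh(pvals: Sequence[float], strata: Sequence[str],
                  alpha: float = 0.05) -> List[bool]:
    """Run Benjamini-Hochberg separately inside each stratum."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for i, label in enumerate(strata):
        groups[label].append(i)
    reject = [False] * len(pvals)
    for indices in groups.values():
        sub = bh_reject([pvals[i] for i in indices], alpha)
        for i, r in zip(indices, sub):
            reject[i] = r
    return reject


if __name__ == "__main__":
    # Hypothetical p-values: an enriched "coding" stratum and a null-like one.
    p = [0.01, 0.04, 0.03, 0.9, 0.8, 0.7]
    s = ["coding", "coding", "intergenic", "intergenic", "intergenic", "intergenic"]
    print("pooled:    ", bh_reject(p, 0.05))
    print("stratified:", stratified_bh(p, s, 0.05))
```

In this toy input the pooled procedure rejects nothing, while the stratified one rejects both SNPs in the enriched stratum, which is the kind of power gain the abstract attributes to stratification.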

9.
Most of the previously reported loci for total immunoglobulin E (IgE) levels are related to Th2 cell-dependent pathways. We undertook a genome-wide association study (GWAS) to identify genetic loci responsible for IgE regulation. A total of 479,940 single nucleotide polymorphisms (SNPs) were tested for association with total serum IgE levels in 1180 Japanese adults. Fine-mapping with SNP imputation demonstrated 6 candidate regions: the PYHIN1/IFI16, MHC classes I and II, LEMD2, GRAMD1B, and chr13:60576338 regions. Replication of these candidate loci in each region was assessed in 2 independent Japanese cohorts (n = 1110 and 1364, respectively). SNP rs3130941 in the HLA-C region was consistently associated with total IgE levels in 3 independent populations, and the meta-analysis yielded genome-wide significance (P = 1.07×10^-10). Using our GWAS results, we also assessed the reproducibility of previously reported gene associations with total IgE levels. Nine of 32 candidate genes identified by a literature search were associated with total IgE levels after correction for multiple testing. Our findings demonstrate that SNPs in the HLA-C region are strongly associated with total serum IgE levels in the Japanese population and that some of the previously reported genetic associations are replicated across ethnic groups.

10.
The evidence for the existence of genetic susceptibility variants for the common form of hypertension (“essential hypertension”) remains weak and inconsistent. We sought genetic variants underlying blood pressure (BP) by conducting a genome-wide association study (GWAS) among African Americans, a population group in the United States that is disproportionately affected by hypertension and associated complications, including stroke and kidney diseases. Using a dense panel of over 800,000 SNPs in a discovery sample of 1,017 African Americans from the Washington, D.C., metropolitan region, we identified multiple SNPs reaching genome-wide significance for systolic BP in or near the genes: PMS1, SLC24A4, YWHA7, IPO7, and CACNA1H. Two of these genes, SLC24A4 (a sodium/potassium/calcium exchanger) and CACNA1H (a voltage-dependent calcium channel), are potential candidate genes for BP regulation and the latter is a drug target for a class of calcium channel blockers. No variant reached genome-wide significance for association with diastolic BP (top scoring SNP rs1867226, p = 5.8×10^-7) or with hypertension as a binary trait (top scoring SNP rs9791170, p = 5.1×10^-7). We replicated some of the significant SNPs in a sample of West Africans. Pathway analysis revealed that genes harboring top-scoring variants cluster in pathways and networks of biologic relevance to hypertension and BP regulation. This is the first GWAS for hypertension and BP in an African American population. The findings suggest that, in addition to or in lieu of relying solely on replicated variants of moderate-to-large effect reaching genome-wide significance, pathway and network approaches may be useful in identifying and prioritizing candidate genes/loci for further experiments.

11.
Five novel loci recently found to be associated with body mass in two GWAS of East Asian populations were evaluated in two cohorts of Swedish and Greek children and adolescents. These loci are located within, or in the proximity of: CDKAL1, PCSK1, GP2, PAX6 and KLF9. No association with body mass has previously been reported for these loci in GWAS performed on European populations. The single nucleotide polymorphisms (SNPs) with the strongest association at each locus in the East Asian GWAS were genotyped in two cohorts, one obesity case-control cohort of Swedish children and adolescents consisting of 496 cases and 520 controls and one cross-sectional cohort of 2293 nine-to-thirteen-year-old Greek children and adolescents. SNPs were surveyed for association with body mass and other phenotypic traits commonly associated with obesity, including adipose tissue distribution, insulin resistance and daily caloric intake. No association with body mass was found in either cohort. However, among the Greek children, association with insulin resistance could be observed for the two CDKAL1-related SNPs: rs9356744 (β = 0.018, p = 0.014) and rs2206734 (β = 0.024, p = 0.001). CDKAL1-related variants have previously been associated with type 2 diabetes and insulin response. This study reports association of CDKAL1-related SNPs with insulin resistance, a clinical marker related to type 2 diabetes in a cross-sectional cohort of Greek children and adolescents of European descent.

12.
Coronary artery disease (CAD) is the leading cause of death worldwide. Recent genome-wide association studies (GWAS) identified >50 common variants associated with CAD or its complication myocardial infarction (MI), but collectively they account for <20% of heritability, generating a phenomenon of “missing heritability”. Rare variants with large effects may account for a large portion of missing heritability. Genome-wide linkage studies of large families and follow-up fine mapping and deep sequencing are particularly effective in identifying rare variants with large effects. Here we show results from a genome-wide linkage scan for CAD in multiplex GeneQuest families with early onset CAD and MI. Whole genome genotyping was carried out with 408 markers that span the human genome at 10 cM intervals, and linkage analyses were performed using the affected relative pair analysis implemented in GENEHUNTER. Affected-only nonparametric linkage (NPL) analysis identified two novel CAD loci with highly significant evidence of linkage on chromosome 3p25.1 (peak NPL = 5.49) and 3q29 (NPL = 6.84). We also identified four loci with suggestive linkage on 9q22.33, 9q34.11, 17p12, and 21q22.3 (NPL = 3.18–4.07). These results identify novel loci for CAD and provide a framework for fine mapping and deep sequencing to identify new susceptibility genes and novel variants associated with risk of CAD.

13.
Association mapping is a powerful approach for dissecting the genetic architecture of complex quantitative traits using high-density SNP markers in maize. Here, we expanded our association panel size from 368 to 513 inbred lines with 0.5 million high quality SNPs using a two-step data-imputation method which combines identity by descent (IBD) based projection and k-nearest neighbor (KNN) algorithm. Genome-wide association studies (GWAS) were carried out for 17 agronomic traits with a panel of 513 inbred lines applying both mixed linear model (MLM) and a new method, the Anderson-Darling (A-D) test. Ten loci for five traits were identified using the MLM method at the Bonferroni-corrected threshold -log10(P) > 5.74 (α = 1). Between one and 34 loci per trait (107 loci for plant height) were identified for the 17 traits using the A-D test at the Bonferroni-corrected threshold -log10(P) > 7.05 (α = 0.05) using 556,809 SNPs. Many known loci and new candidate loci were only observed by the A-D test, a few of which were also detected in independent linkage analysis. This study indicates that combining IBD based projection and KNN algorithm is an efficient imputation method for inferring large missing genotype segments. In addition, we showed that the A-D test is a useful complement for GWAS analysis of complex quantitative traits. Especially for traits with abnormal phenotype distribution, controlled by moderate effect loci or rare variations, the A-D test balances false positives and statistical power. The candidate SNPs and associated genes also provide a rich resource for maize genetics and breeding.
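The two Bonferroni thresholds quoted in this abstract can be checked with a few lines of Python, assuming the usual definition -log10(α / n) with n = 556,809 tests. The function name is hypothetical; the SNP count and α values are taken from the abstract.

```python
import math


def bonferroni_neglog10_threshold(alpha: float, n_tests: int) -> float:
    """Bonferroni-corrected significance threshold on the -log10(P) scale."""
    return -math.log10(alpha / n_tests)


if __name__ == "__main__":
    n_snps = 556809  # number of SNPs reported in the abstract
    for alpha in (1.0, 0.05):
        threshold = bonferroni_neglog10_threshold(alpha, n_snps)
        print(f"alpha = {alpha}: -log10(P) > {threshold:.4f}")
```

Under this definition the thresholds come out at roughly 5.746 and 7.047, which agrees with the quoted 5.74 and 7.05 up to rounding or truncation.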

14.
Genome-wide association studies (GWAS) have identified ~100 loci associated with blood lipid levels, but much of the trait heritability remains unexplained, and at most loci the identities of the trait-influencing variants remain unknown. We conducted a trans-ethnic fine-mapping study at 18, 22, and 18 GWAS loci on the Metabochip for their association with triglycerides (TG), high-density lipoprotein cholesterol (HDL-C), and low-density lipoprotein cholesterol (LDL-C), respectively, in individuals of African American (n = 6,832), East Asian (n = 9,449), and European (n = 10,829) ancestry. We aimed to identify the variants with strongest association at each locus, identify additional and population-specific signals, refine association signals, and assess the relative significance of previously described functional variants. Among the 58 loci, 33 exhibited evidence of association at P < 1×10^-4 in at least one ancestry group. Sequential conditional analyses revealed that ten, nine, and four loci in African Americans, Europeans, and East Asians, respectively, exhibited two or more signals. At these loci, accounting for all signals led to a 1.3- to 1.8-fold increase in the explained phenotypic variance compared to the strongest signals. Distinct signals across ancestry groups were identified at PCSK9 and APOA5. Trans-ethnic analyses narrowed the signals to smaller sets of variants at GCKR, PPP1R3B, ABO, LCAT, and ABCA1. Of 27 variants reported previously to have functional effects, 74% exhibited the strongest association at the respective signal. In conclusion, trans-ethnic high-density genotyping and analysis confirm the presence of allelic heterogeneity, allow the identification of population-specific variants, and limit the number of candidate SNPs for functional studies.

15.

Purpose

A recent large genome-wide association study (GWAS) identified multiple variants associated with primary angle-closure glaucoma (PACG). The present study investigated the role of these variants in two cohorts with PACG recruited from Australia and Nepal.

Method

Patients with PACG and appropriate controls were recruited from eye clinics in Australia (n = 232 cases and n = 288 controls) and Nepal (n = 106 cases and 204 controls). Single nucleotide polymorphisms (SNPs) rs3753841 (COL11A1), rs1015213 (located between PCMTD1 and ST18), rs11024102 (PLEKHA7), and rs3788317 (TXNRD2) were selected and genotyped on the Sequenom. Analyses were conducted using PLINK and METAL.

Results

After adjustment for age and sex, SNP rs3753841 was found to be significantly associated with PACG in the Australian cohort (p = 0.017; OR = 1.34). SNPs rs1015213 (p = 0.014; OR 2.35) and rs11024102 (p = 0.039; OR 1.43) were significantly associated with the disease development in the Nepalese cohort. None of these SNPs survived Bonferroni correction (p = 0.05/4 = 0.013). However, in the combined analysis of both cohorts, rs3753841 and rs1015213 showed significant association with p-values of 0.009 and 0.004, respectively, both surviving Bonferroni correction. SNP rs11024102 showed suggestive association with PACG (p-value 0.035) and no association was found with rs3788317.

Conclusion

The present results support the initial GWAS findings, and confirm the SNPs' contribution to PACG. This is the first study to investigate these loci in both Australian Caucasian and Nepalese populations.
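The Bonferroni reasoning in the Results above can be reproduced with a small Python sketch. It assumes four tests (one per genotyped SNP) and a family-wise α of 0.05, giving a per-test threshold of 0.0125 (reported as 0.013). The p-values are the ones quoted in the abstract; the function name and the `cohort:SNP` dictionary keys are hypothetical.

```python
from typing import Dict


def survives_bonferroni(pvals: Dict[str, float], n_tests: int,
                        alpha: float = 0.05) -> Dict[str, bool]:
    """Flag each p-value that is at or below alpha / n_tests."""
    threshold = alpha / n_tests
    return {name: p <= threshold for name, p in pvals.items()}


if __name__ == "__main__":
    # p-values as quoted in the abstract; four SNPs were tested.
    single_cohort = {
        "Australia:rs3753841": 0.017,
        "Nepal:rs1015213": 0.014,
        "Nepal:rs11024102": 0.039,
    }
    combined = {"rs3753841": 0.009, "rs1015213": 0.004, "rs11024102": 0.035}
    print(survives_bonferroni(single_cohort, n_tests=4))
    print(survives_bonferroni(combined, n_tests=4))
```

The number of tests is passed explicitly because no combined p-value is quoted for rs3788317, so the dictionary length would undercount the tests.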

16.
Most of the genetic architecture of schizophrenia (SCZ) has not yet been identified. Here, we apply a novel statistical algorithm called Covariate-Modulated Mixture Modeling (CM3), which incorporates auxiliary information (heterozygosity, total linkage disequilibrium, genomic annotations, pleiotropy) for each single nucleotide polymorphism (SNP) to enable more accurate estimation of replication probabilities, conditional on the observed test statistic (“z-score”) of the SNP. We use a multiple logistic regression on z-scores to combine the auxiliary information into a “relative enrichment score” for each SNP. For each stratum of these relative enrichment scores, we obtain nonparametric estimates of posterior expected test statistics and replication probabilities as a function of discovery z-scores, using a resampling-based approach that repeatedly and randomly partitions meta-analysis sub-studies into training and replication samples. We fit a scale mixture of two Gaussians model to each stratum, obtaining parameter estimates that minimize the sum of squared differences of the scale-mixture model with the stratified nonparametric estimates. We apply this approach to the recent genome-wide association study (GWAS) of SCZ (n = 82,315), obtaining a good fit between the model-based and observed effect sizes and replication probabilities. We observed that SNPs with low enrichment scores replicate with a lower probability than SNPs with high enrichment scores even when both are genome-wide significant (p < 5×10^-8). There were 693 and 219 independent loci with model-based replication rates ≥80% and ≥90%, respectively. Compared to analyses not incorporating relative enrichment scores, CM3 increased out-of-sample yield for SNPs that replicate at a given rate. This demonstrates that replication probabilities can be more accurately estimated using prior enrichment information with CM3.

17.
The first genome-wide association study (GWAS) of otitis media (OM) found evidence of association in the Western Australian Pregnancy Cohort (Raine) study, but lacked replication in an independent OM population. The aim of this study was to investigate association at these loci in our family-based sample of chronic otitis media with effusion and recurrent otitis media (COME/ROM). Autosomal SNPs were selected from the Raine OM GWAS results. SNPs from the Raine cohort GWAS genotyped in our GWAS of COME/ROM had P-values ranging from P = 0.06–0.80. After removal of SNPs previously genotyped in our GWAS of COME/ROM (N = 21) and those that failed Fluidigm assay design (N = 1), 26 SNPs were successfully genotyped in 716 individuals from our COME/ROM family population. None of the SNP associations replicated in our family-based population (unadjusted P = 0.03–0.93). Replication in an independent sample would confirm that these represent novel OM loci, and that further investigation is warranted.

18.
The X chromosome (chrX) represents one potential source for the “missing heritability” for complex phenotypes, which thus far has remained underanalyzed in genome-wide association studies (GWAS). Here we demonstrate the benefits of including chrX in GWAS by assessing the contribution of 404,862 chrX SNPs to levels of twelve commonly studied cardiometabolic and anthropometric traits in 19,697 Finnish and Swedish individuals with replication data on 5,032 additional Finns. By using a linear mixed model, we estimate that on average 2.6% of the additive genetic variance in these twelve traits is attributable to chrX, this being in proportion to the number of SNPs in the chromosome. In a chrX-wide association analysis, we identify three novel loci: two for height (rs182838724 near FGF16/ATRX/MAGT1, joint P-value = 2.71×10^-9, and rs1751138 near ITM2A, P-value = 3.03×10^-10) and one for fasting insulin (rs139163435 in Xq23, P-value = 5.18×10^-9). Further, we find that effect sizes for variants near ITM2A, a gene implicated in cartilage development, show evidence for a lack of dosage compensation. This observation is further supported by a sex-difference in ITM2A expression in whole blood (P-value = 0.00251), and is also in agreement with a previous report showing ITM2A escapes from X chromosome inactivation (XCI) in the majority of women. Hence, our results show one of the first links between phenotypic variation in a population sample and an XCI-escaping locus and pinpoint ITM2A as a potential contributor to the sexual dimorphism in height. In conclusion, our study provides a clear motivation for including chrX in large-scale genetic studies of complex diseases and traits.

19.
The presence of oligoclonal bands (OCB) in cerebrospinal fluid (CSF) is a typical finding in multiple sclerosis (MS). We applied data from Norwegian, Swedish and Danish (i.e. Scandinavian) MS patients from a genome-wide association study (GWAS) to search for genetic differences in MS relating to OCB status. GWAS data were compared in 1367 OCB positive and 161 OCB negative Scandinavian MS patients, and nine of the most associated SNPs were genotyped for replication in 3403 Scandinavian MS patients. HLA-DRB1 genotypes were analyzed in a subset of the OCB positive (n = 2781) and OCB negative (n = 292) MS patients and compared to 890 healthy controls. Results from the genome-wide analyses showed that single nucleotide polymorphisms (SNPs) from the HLA complex and six other loci were associated with OCB status. In SNPs selected for replication, combined analyses showed genome-wide significant association for two SNPs in the HLA complex: rs3129871 (p = 5.7×10^-15) and rs3817963 (p = 5.7×10^-10), correlating with the HLA-DRB1*15 and the HLA-DRB1*04 alleles, respectively. We also found suggestive association with one SNP in the Calsyntenin-2 gene (p = 8.83×10^-7). In HLA-DRB1 analyses HLA-DRB1*15:01 was a stronger risk factor for OCB positive than OCB negative MS, whereas HLA-DRB1*04:04 was associated with increased risk of OCB negative MS and reduced risk of OCB positive MS. Protective effects of HLA-DRB1*01:01 and HLA-DRB1*07:01 were detected in both groups. The groups were different with regard to age at onset (AAO), MS outcome measures and gender. This study confirms both shared and distinct genetic risk for MS subtypes in the Scandinavian population defined by OCB status and indicates different clinical characteristics between the groups. This suggests differences in disease mechanisms between OCB negative and OCB positive MS with implications for patient management, which need to be further studied.

20.
A single mutation can alter cellular and global homeostatic mechanisms and give rise to multiple clinical diseases. We hypothesized that these disease mechanisms could be identified using low minor allele frequency (MAF < 0.1) non-synonymous SNPs (nsSNPs) associated with “mechanistic phenotypes”, comprised of collections of related diagnoses. We studied two mechanistic phenotypes: (1) thrombosis, evaluated in a population of 1,655 African Americans; and (2) four groupings of cancer diagnoses, evaluated in 3,009 white European Americans. We tested associations between nsSNPs represented on GWAS platforms and mechanistic phenotypes ascertained from electronic medical records (EMRs), and sought enrichment in functional ontologies across the top-ranked associations. We used a two-step analytic approach whereby nsSNPs were first sorted by the strength of their association with a phenotype. We tested associations using two reverse genetic models and standard additive and recessive models. In the second step, we employed a hypothesis-free ontological enrichment analysis using the sorted nsSNPs to identify functional mechanisms underlying the diagnoses comprising the mechanistic phenotypes. The thrombosis phenotype was solely associated with ontologies related to blood coagulation (Fisher's p = 0.0001, FDR p = 0.03), driven by the F5, P2RY12 and F2RL2 genes. For the cancer phenotypes, the reverse genetics models were enriched in DNA repair functions (p = 2×10^-5, FDR p = 0.03) (POLG/FANCI, SLX4/FANCP, XRCC1, BRCA1, FANCA, CHD1L) while the additive model showed enrichment related to chromatid segregation (p = 4×10^-6, FDR p = 0.005) (KIF25, PINX1). We were able to replicate nsSNP associations for POLG/FANCI, BRCA1, FANCA and CHD1L in independent data sets. Mechanism-oriented phenotyping using collections of EMR-derived diagnoses can elucidate fundamental disease mechanisms.
