Similar articles
1.

Objective

We examined whether a panel of SNPs, systematically selected from genome-wide association studies (GWAS), could improve risk prediction of coronary heart disease (CHD) over and above conventional risk factors. These SNPs have already demonstrated reproducible associations with CHD; here we examined their use in long-term risk prediction.

Study Design and Setting

SNPs identified from meta-analyses of GWAS of CHD were tested in 840 men and women aged 55–75 from the Edinburgh Artery Study, a prospective, population-based study with 15 years of follow-up. Cox proportional hazards models were used to evaluate the addition of SNPs to conventional risk factors in prediction of CHD risk. CHD was classified as myocardial infarction (MI), coronary intervention (angioplasty, or coronary artery bypass surgery), angina and/or unspecified ischaemic heart disease as a cause of death; additional analyses were limited to MI or coronary intervention. Model performance was assessed by changes in discrimination and net reclassification improvement (NRI).

Results

There were significant improvements with the addition of 27 SNPs to conventional risk factors for prediction of CHD (NRI of 54%, P<0.001; C-index 0.671 to 0.740, P = 0.001), as well as of MI or coronary intervention (NRI of 44%, P<0.001; C-index 0.717 to 0.750, P = 0.256). ROC curves showed that the addition of SNPs improved discrimination more when the sensitivity of conventional risk factors was low for prediction of MI or coronary intervention.

Conclusion

There was significant improvement in risk prediction of CHD over 15 years when SNPs identified from GWAS were added to conventional risk factors. This effect may be particularly useful for identifying individuals with a low prognostic index who are in fact at higher risk of disease than indicated by conventional risk factors alone.
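
The abstract reports NRI without stating which variant was computed. As a rough illustration only, the Python sketch below computes a category-free (continuous) NRI from two sets of predicted risks. The function name `continuous_nri` and the toy data are hypothetical, and the sketch ignores censoring, which an analysis with 15 years of follow-up would need to handle.

```python
def continuous_nri(old_risk, new_risk, event):
    """Category-free NRI.

    Net proportion of events whose predicted risk moved up, plus the
    net proportion of non-events whose predicted risk moved down.
    """
    up_e = down_e = n_e = 0
    up_n = down_n = n_n = 0
    for old, new, ev in zip(old_risk, new_risk, event):
        if ev:
            n_e += 1
            up_e += new > old
            down_e += new < old
        else:
            n_n += 1
            up_n += new > old
            down_n += new < old
    if n_e == 0 or n_n == 0:
        raise ValueError("need at least one event and one non-event")
    return (up_e - down_e) / n_e + (down_n - up_n) / n_n


# Toy data (hypothetical): risks before and after adding SNPs to the model.
old = [0.10, 0.20, 0.30, 0.15, 0.25, 0.05]
new = [0.20, 0.25, 0.20, 0.10, 0.30, 0.02]
event = [1, 1, 1, 0, 0, 0]
nri = continuous_nri(old, new, event)
print(f"NRI = {nri:.3f}")
```

In the toy data, two of three events move up and one moves down, and two of three non-events move down and one moves up, so each component contributes 1/3.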

2.
Genome-wide association studies (GWAS) have been fruitful in identifying disease susceptibility loci for common and complex diseases. A remaining question is whether we can quantify individual disease risk based on genotype data, in order to facilitate personalized prevention and treatment for complex diseases. Previous studies have typically failed to achieve satisfactory performance, primarily due to the use of only a limited number of confirmed susceptibility loci. Here we propose that sophisticated machine-learning approaches with a large ensemble of markers may improve the performance of disease risk assessment. We applied a Support Vector Machine (SVM) algorithm on a GWAS dataset generated on the Affymetrix genotyping platform for type 1 diabetes (T1D) and optimized a risk assessment model with hundreds of markers. We subsequently tested this model on an independent Illumina-genotyped dataset with imputed genotypes (1,008 cases and 1,000 controls), as well as a separate Affymetrix-genotyped dataset (1,529 cases and 1,458 controls), resulting in area under ROC curve (AUC) of ∼0.84 in both datasets. In contrast, poor performance was achieved when limited to dozens of known susceptibility loci in the SVM model or logistic regression model. Our study suggests that improved disease risk assessment can be achieved by using algorithms that take into account interactions between a large ensemble of markers. We are optimistic that genotype-based disease risk assessment may be feasible for diseases where a notable proportion of the risk has already been captured by SNP arrays.

3.
It is widely acknowledged that genome-wide association studies (GWAS) of complex human disease fail to explain a large portion of heritability, primarily due to lack of statistical power, a problem that is exacerbated when seeking detection of interactions of multiple genomic loci. An untapped source of information that is already widely available, and that is expected to grow in coming years, is population samples. Such samples contain genetic marker data for additional individuals, but not their relevant phenotypes. In this article we develop a highly efficient testing framework based on a constrained maximum-likelihood estimate in a case–control–population setting. We leverage the available population data and optional modeling assumptions, such as Hardy–Weinberg equilibrium (HWE) in the population and linkage equilibrium (LE) between distal loci, to substantially improve power of association and interaction tests. We demonstrate, via simulation and application to actual GWAS data sets, that our approach is substantially more powerful and robust than standard testing approaches that ignore or make naive use of the population sample. We report several novel and credible pairwise interactions in bipolar disorder, coronary artery disease, Crohn’s disease, and rheumatoid arthritis.
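
To make the HWE assumption concrete: for a biallelic marker with allele frequencies p and q = 1 - p, HWE fixes the genotype frequencies at p², 2pq and q². The Python sketch below computes the expected genotype counts and a textbook chi-square statistic for departure from HWE. It is not the constrained maximum-likelihood framework of the abstract; the function name `hwe_chi_square` and the counts are hypothetical.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Expected genotype counts under HWE and the chi-square statistic.

    Arguments are observed counts of the AA, AB and BB genotypes.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip cells with zero expectation (monomorphic marker).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return expected, chi2


# Hypothetical counts that sit exactly at HWE for p = 0.5.
expected, chi2 = hwe_chi_square(25, 50, 25)
print(expected, chi2)
```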

4.
Cardiovascular disease (CVD) is the leading cause of death worldwide. Recent genome-wide association (GWA) studies have pinpointed many loci associated with CVD risk factors in adults. It is unclear, however, if these loci predict trait levels at all ages, if they are associated with how a trait develops over time, or if they could be used to screen individuals who are pre-symptomatic to provide the opportunity for preventive measures before disease onset. We completed a genome-wide association study on participants in the longitudinal Bogalusa Heart Study (BHS) and have characterized the association between genetic factors and the development of CVD risk factors from childhood to adulthood. We report 7 genome-wide significant associations involving CVD risk factors, two of which have been previously reported. Top regions were tested for replication in the Young Finns Study (YF) and two associations strongly replicated: rs247616 in CETP with HDL levels (combined P = 9.7×10^-24), and rs445925 at APOE with LDL levels (combined P = 8.7×10^-19). We show that SNPs previously identified in adult cross-sectional studies tend to show age-independent effects in the BHS with effect sizes consistent with previous reports. Previously identified variants were associated with adult trait levels above and beyond those seen in childhood; however, variants with time-dependent effects were also promising predictors. This is the first GWA study to evaluate the role of common genetic variants in the development of CVD risk factors in children as they advance through adulthood and highlights the utility of using longitudinal studies to identify genetic predictors of adult traits in children.

5.

Background

Previous studies of network properties of human disease genes have mainly focused on monogenic diseases or cancers and have suffered from discovery bias. Here we investigated the network properties of complex disease genes identified by genome-wide association studies (GWAs), thereby eliminating discovery bias.

Principal findings

We derived a network of complex diseases (n = 54) and complex disease genes (n = 349) to explore the shared genetic architecture of complex diseases. We evaluated the centrality measures of complex disease genes in comparison with essential and monogenic disease genes in the human interactome. The complex disease network showed that diseases belonging to the same disease class do not always share common disease genes. A possible explanation could be that the variants with higher minor allele frequency and larger effect size identified using GWAs constitute disjoint parts of the allelic spectra of similar complex diseases. The complex disease gene network showed high modularity with the size of the largest component being smaller than expected from a randomized null-model. This is consistent with limited sharing of genes between diseases. Complex disease genes are less central than the essential and monogenic disease genes in the human interactome. Genes associated with the same disease, compared to genes associated with different diseases, more often tend to share a protein-protein interaction and a Gene Ontology Biological Process.

Conclusions

This indicates that network neighbors of known disease genes form an important class of candidates for identifying novel genes for the same disease.

6.
7.
The success of genome-wide association studies relies on much of the risk of common diseases being due to common genetic variants; but evidence for this is inconclusive. The results of published genome-wide association studies are examined to see what can be learnt about the distribution of disease-associated variants and how this might influence future study design. Although replicated disease-associated variants tend to be very common and frequency is inversely correlated with estimated effect size, our simulations suggest that such observations are the result of power. We find that for studies conducted to date, the frequency and effect size of significantly associated alleles are likely to be similar to those of the underlying disease alleles that they represent. Little of the genetic variation of disease has been explained so far, but current studies are only adequately powered to detect very common alleles unless they greatly increase disease risk. Thus, although the truth of the common disease / common variant hypothesis remains undecided, recent successes suggest that there are many more common genetic disease-associated variants, requiring larger studies to be identified.

8.
Genome-wide association studies (GWAS) are routinely conducted for both quantitative and binary (disease) traits. We present two analytical tools for use in the experimental design of GWAS. Firstly, we present power calculations quantifying power in a unified framework for a range of scenarios. In this context we consider the utility of quantitative scores (e.g. endophenotypes) that may be available on cases only or on both cases and controls. Secondly, we consider the accuracy of prediction of genetic risk from genome-wide SNPs and derive an expression for genomic prediction accuracy using a liability threshold model for disease traits in a case-control design. The expected values based on our derived equations for both power and prediction accuracy agree well with observed estimates from simulations.
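
The abstract does not give its equations. As a generic illustration of what an analytical power calculation looks like, the Python sketch below uses a standard textbook approximation for a single-SNP test on a quantitative trait: the 1-df test statistic has non-centrality parameter N·q²/(1 − q²), where q² is the proportion of trait variance explained by the SNP. The function name `power_quantitative_snp` and the default threshold of 5×10^-8 are assumptions, and this is not the unified framework the authors describe.

```python
from math import sqrt
from statistics import NormalDist


def power_quantitative_snp(n, var_explained, alpha=5e-8):
    """Approximate power of a two-sided 1-df association test.

    n             -- sample size
    var_explained -- proportion of trait variance explained by the SNP
    alpha         -- significance threshold (assumed genome-wide default)
    """
    if not 0.0 <= var_explained < 1.0:
        raise ValueError("var_explained must be in [0, 1)")
    ncp = n * var_explained / (1.0 - var_explained)  # non-centrality
    z = NormalDist()
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    shift = sqrt(ncp)
    # P(|Z + shift| > z_crit) for standard normal Z
    return (1.0 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


# Hypothetical scenario: a SNP explaining 0.5% of trait variance.
for n in (2000, 5000, 10000, 20000):
    print(n, round(power_quantitative_snp(n, 0.005), 3))
```

With `var_explained` set to zero the function returns `alpha`, which is a quick sanity check that the test has the nominal type I error rate under the null.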

9.
The bacterial composition of the human fecal microbiome is influenced by many lifestyle factors, notably diet. It is less clear, however, what role host genetics plays in dictating the composition of bacteria living in the gut. In this study, we examined the association of ~200K host genotypes with the relative abundance of fecal bacterial taxa in a founder population, the Hutterites, during two seasons (n = 91 summer, n = 93 winter, n = 57 individuals collected in both). These individuals live and eat communally, minimizing variation due to environmental exposures, including diet, which could potentially mask small genetic effects. Using a GWAS approach that takes into account the relatedness between subjects, we identified at least 8 bacterial taxa whose abundances were associated with single nucleotide polymorphisms in the host genome in each season (at genome-wide FDR of 20%). For example, we identified an association between a taxon known to affect obesity (genus Akkermansia) and a variant near PLD1, a gene previously associated with body mass index. Moreover, we replicate a previously reported association from a quantitative trait locus (QTL) mapping study of fecal microbiome abundance in mice (genus Lactococcus, rs3747113, P = 3.13 × 10^-7). Finally, based on the significance distribution of the associated microbiome QTLs in our study with respect to chromatin accessibility profiles, we identified tissues in which host genetic variation may be acting to influence bacterial abundance in the gut.

10.
Genome-wide association studies are revolutionizing the search for the genes underlying human complex diseases. The main decisions to be made at the design stage of these studies are the choice of the commercial genotyping chip to be used and the numbers of case and control samples to be genotyped. The most common method of comparing different chips is using a measure of coverage, but this fails to properly account for the effects of sample size, the genetic model of the disease, and linkage disequilibrium between SNPs. In this paper, we argue that the statistical power to detect a causative variant should be the major criterion in study design. Because of the complicated pattern of linkage disequilibrium (LD) in the human genome, power cannot be calculated analytically and must instead be assessed by simulation. We describe in detail a method of simulating case-control samples at a set of linked SNPs that replicates the patterns of LD in human populations, and we used it to assess power for a comprehensive set of available genotyping chips. Our results allow us to compare the performance of the chips to detect variants with different effect sizes and allele frequencies, look at how power changes with sample size in different populations or when using multi-marker tags and genotype imputation approaches, and how performance compares to a hypothetical chip that contains every SNP in HapMap. A main conclusion of this study is that marked differences in genome coverage may not translate into appreciable differences in power and that, when taking budgetary considerations into account, the most powerful design may not always correspond to the chip with the highest coverage. We also show that genotype imputation can be used to boost the power of many chips up to the level obtained from a hypothetical “complete” chip containing all the SNPs in HapMap. 
Our results have been encapsulated into an R software package that allows users to design future association studies and our methods provide a framework with which new chip sets can be evaluated.

11.
We present GStream, a method that combines genome-wide SNP and CNV genotyping in the Illumina microarray platform with unprecedented accuracy. This new method outperforms previous well-established SNP genotyping software. More importantly, the CNV calling algorithm of GStream dramatically improves the results obtained by previous state-of-the-art methods and yields an accuracy that is close to that obtained by purely CNV-oriented technologies like Comparative Genomic Hybridization (CGH). We demonstrate the superior performance of GStream using microarray data generated from HapMap samples. Using the reference CNV calls generated by the 1000 Genomes Project (1KGP) and well-known studies on whole genome CNV characterization based either on CGH or genotyping microarray technologies, we show that GStream can increase the number of reliably detected variants up to 25% compared to previously developed methods. Furthermore, the increased genome coverage provided by GStream allows the discovery of CNVs in close linkage disequilibrium with SNPs, previously associated with disease risk in published Genome-Wide Association Studies (GWAS). These results could provide important insights into the biological mechanism underlying the detected disease risk association. With GStream, large-scale GWAS will not only benefit from the combined genotyping of SNPs and CNVs at an unprecedented accuracy, but will also take advantage of the computational efficiency of the method.

12.
Genome-wide association studies (GWAS) have successfully identified several risk loci for Alzheimer's disease (AD). Nonetheless, these loci do not explain the entire susceptibility to the disease, suggesting that other genetic contributions remain to be identified. Here, we performed a meta-analysis combining data of 4,569 individuals (2,540 cases and 2,029 healthy controls) derived from three publicly available GWAS in AD and replicated a broad genomic region (>248,000 bp) associated with the disease near the APOE/TOMM40 locus on chromosome 19. To detect minor effect size contributions that could help to explain the remaining genetic risk, we conducted network-based pathway analyses either by extracting gene-wise p-values (GW), defined as the single strongest association signal within a gene, or by calculating a more stringent gene-based association p-value using the extended Simes (GATES) procedure. Comparison of these strategies revealed that ontological sub-networks (SNs) involved in glutamate signaling were significantly overrepresented in AD (p<2.7×10^-11, p<1.9×10^-11; GW and GATES, respectively). Notably, glutamate signaling SNs were also found to be significantly overrepresented (p<5.1×10^-8) in the Alzheimer's Disease Neuroimaging Initiative (ADNI) study, which was used as a targeted replication sample. Interestingly, components of the glutamate signaling SNs are coordinately expressed in disease-related tissues, which are tightly related to known pathological hallmarks of AD. Our findings suggest that genetic variation within glutamate signaling contributes to the remaining genetic risk of AD and support the notion that functional biological networks should be targeted in future therapies aimed to prevent or treat this devastating neurological disorder.
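
The two gene-level summaries contrasted above can be sketched in Python as follows. The gene-wise (GW) value is simply the smallest SNP p-value in the gene. The second function is the plain Simes combination, min over i of m·p(i)/i for the sorted p-values; GATES extends this by replacing m and i with effective numbers of independent tests estimated from linkage disequilibrium, which this sketch does not attempt. Function names and SNP p-values are hypothetical.

```python
def gene_wise_p(snp_pvalues):
    """GW summary: the single strongest association signal in the gene."""
    if not snp_pvalues:
        raise ValueError("no p-values supplied")
    return min(snp_pvalues)


def simes_p(snp_pvalues):
    """Plain Simes combined p-value (no LD adjustment, unlike GATES)."""
    ps = sorted(snp_pvalues)
    m = len(ps)
    if m == 0:
        raise ValueError("no p-values supplied")
    combined = min(m * p / rank for rank, p in enumerate(ps, start=1))
    return min(1.0, combined)


# Hypothetical SNP-level p-values for one gene.
pvals = [0.01, 0.04, 0.03]
gw = gene_wise_p(pvals)
simes = simes_p(pvals)
print(gw, simes)
```

The Simes value is never smaller than the minimum p-value, which is consistent with the abstract calling the gene-based procedure the more stringent of the two.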

13.
14.
For genome-wide association studies in family-based designs, we propose a new, universally applicable approach. The new test statistic exploits all available information about the association, while, by virtue of its design, it maintains the same robustness against population admixture as traditional family-based approaches that are based exclusively on the within-family information. The approach is suitable for the analysis of almost any trait type, e.g. binary, continuous, time-to-onset, multivariate, etc., and combinations of those. We use simulation studies to verify all theoretically derived properties of the approach, estimate its power, and compare it with other standard approaches. We illustrate the practical implications of the new analysis method by an application to a lung-function phenotype, forced expiratory volume in one second (FEV1) in 4 genome-wide association studies.

15.
Peripheral arterial disease (PAD) is characterized by occlusion or stenosis of multiple vessel sites, caused mainly by atherosclerosis, and by chronic lower limb ischemia. To identify PAD susceptibility loci, we conducted a genome-wide association study (GWAS) with 785 cases and 3,383 controls in a Japanese population using 431,666 single nucleotide polymorphisms (SNPs). After staged analyses including a total of 3,164 cases and 20,134 controls, we identified 3 novel PAD susceptibility loci at IPO5/RAP2A, EDNRA and HDAC9 with genome-wide significance (combined P = 6.8 × 10^-14, 5.3 × 10^-9 and 8.8 × 10^-8, respectively). Fine-mapping at the IPO5/RAP2A locus revealed that rs9584669 conferred risk of PAD. Luciferase assay showed that the risk allele at this locus reduced expression levels of IPO5. To our knowledge, these are the first genetic risk factors for PAD.

16.

Background

Growth and meat production traits are economically important traits in sheep. The aim of this study was to identify candidate genes affecting growth and meat production traits at the genome level using high-throughput single nucleotide polymorphism (SNP) genotyping technologies.

Methodology and Results

Using the Illumina OvineSNP50 BeadChip, we performed a GWA study in 329 purebred sheep for 11 growth and meat production traits (birth weight, weaning weight, 6-month weight, eye muscle area, fat thickness, pre-weaning gain, post-weaning gain, daily weight gain, height at withers, chest girth, and shin circumference). After quality control, 319 sheep and 48,198 SNPs were analyzed with the TASSEL program using a mixed linear model (MLM). 36 significant SNPs were identified for 7 traits, and 10 of them reached the genome-wise significance level for post-weaning gain. Gene annotation was carried out with the latest sheep genome, Ovis_aries_v3.1 (released October 2012). More than one-third of the SNPs (14 out of 36) were located within ovine genes; the others were located close to ovine genes (878 bp to 398,165 bp away). The strongest new finding is that 5 genes were considered the most crucial candidate genes associated with post-weaning gain: s58995.1 was located within the ovine genes MEF2B and RFXANK, and OAR3_84073899.1, OAR3_115712045.1 and OAR9_91721507.1 were located within CAMKMT, TRHDE, and RIPK2, respectively. GRM1, POL, MBD5, UBR2, RPL7 and SMC2 were also considered important candidate genes affecting post-weaning gain. Additionally, 25 genes at the chromosome-wise significance level were predicted to be promising genes influencing sheep growth and meat production traits.

Conclusions

The results will contribute to similar studies and facilitate the potential future utilization of genes involved in growth and meat production traits in sheep.

17.
Recent advances in high-throughput genotyping technologies have provided the opportunity to map genes using associations between complex traits and markers. Genome-wide association studies (GWAS) based on either a single marker or haplotype have identified genetic variants and underlying genetic mechanisms of quantitative traits. Prompted by the achievements of studies examining economic traits in cattle and to verify the consistency of these two methods using real data, the current study was conducted to construct the haplotype structure in the bovine genome and to detect relevant genes genuinely affecting a carcass trait and a meat quality trait. Using the Illumina BovineHD BeadChip, 942 young bulls with genotyping data were introduced as a reference population to identify the genes in the beef cattle genome significantly associated with foreshank weight and triglyceride levels. In total, 92,553 haplotype blocks were detected in the genome. The regions of high linkage disequilibrium extended up to approximately 200 kb, and the size of haplotype blocks ranged from 22 bp to 199,266 bp. Additionally, the individual SNP analysis and the haplotype-based analysis detected similar regions and common SNPs for these two representative traits. A total of 12 and 7 SNPs in the bovine genome were significantly associated with foreshank weight and triglyceride levels, respectively. By comparison, 4 and 5 haplotype blocks containing the majority of significant SNPs were strongly associated with foreshank weight and triglyceride levels, respectively. In addition, 36 SNPs with high linkage disequilibrium were detected in the GNAQ gene, a potential hotspot that may play a crucial role for regulating carcass trait components.

18.
Cleft lip with or without cleft palate (CL/P) is the most commonly occurring craniofacial birth defect. We provide insight into the genetic etiology of this birth defect by performing genome-wide association studies in two species: dogs and humans. In the dog, a genome-wide association study of 7 CL/P cases and 112 controls from the Nova Scotia Duck Tolling Retriever (NSDTR) breed identified a significantly associated region on canine chromosome 27 (unadjusted p = 1.1 × 10^-13; adjusted p = 2.2 × 10^-3). Further analysis in NSDTR families and additional full sibling cases identified a 1.44 Mb homozygous haplotype (chromosome 27: 9.29 – 10.73 Mb) segregating with a more complex phenotype of cleft lip, cleft palate, and syndactyly (CLPS) in 13 cases. Whole-genome sequencing of 3 CLPS cases and 4 controls at 15X coverage led to the discovery of a frameshift mutation within ADAMTS20 (c.1360_1361delAA (p.Lys453Ilefs*3)), which segregated concordantly with the phenotype. In a parallel study in humans, a family-based association analysis (DFAM) of 125 CL/P cases, 420 unaffected relatives, and 392 controls from a Guatemalan cohort identified a suggestive association (rs10785430; p = 2.67 × 10^-6) with the same gene, ADAMTS20. Sequencing of cases from the Guatemalan cohort was unable to identify a causative mutation within the coding region of ADAMTS20, but four coding variants were found in additional cases of CL/P. In summary, this study provides genetic evidence for a role of ADAMTS20 in CL/P development in dogs and as a candidate gene for CL/P development in humans.

19.
Insights into the genetic origin of diseases and related traits could substantially impact strategies for improving human health. The results of genome-wide association studies (GWAS) are often positioned as discoveries of unconditional risk alleles of complex health traits. We re-analyzed the associations of single nucleotide polymorphisms (SNPs) associated with total cholesterol (TC) in a large-scale GWAS meta-analysis. We focused on three generations of genotyped participants of the Framingham Heart Study (FHS). We show that the effects of all ten directly-genotyped SNPs were clustered in different FHS generations and/or birth cohorts in a sex-specific or sex-unspecific manner. The sample size and procedure-therapeutic issues play, at most, a minor role in this clustering. An important result was clustering of significant associations with the strongest effects in the youngest, or 3rd Generation, cohort. These results imply that an assumption of unconditional connections of these SNPs with TC is generally implausible and that a demographic perspective can substantially improve GWAS efficiency. The analyses of genetic effects in age-matched samples suggest a role of environmental and age-related mechanisms in the associations of different SNPs with TC. Analysis of the literature supports systemic roles for the genes harboring these SNPs beyond those related to lipid metabolism. Our analyses reveal strong antagonistic effects of rs2479409 (the PCSK9 gene), which calls for caution in strategies aimed at targeting this gene in the next generation of lipid drugs. Our results suggest that standard GWAS strategies need to be advanced in order to appropriately address the problem of genetic susceptibility to complex traits, which is imperative for translation to health care.

20.
PLoS ONE, 2016, 11(3)

Background

Data are limited on genome-wide association studies (GWAS) for incident coronary heart disease (CHD). Moreover, it is not known whether genetic variants identified to date also associate with risk of CHD in a prospective setting.

Methods

We performed a two-stage GWAS analysis of incident myocardial infarction (MI) and CHD in a total of 64,297 individuals (including 3898 MI cases, 5465 CHD cases). SNPs that passed an arbitrary threshold of 5×10^-6 in Stage I were taken to Stage II for further discovery. Furthermore, in an analysis of prognosis, we studied whether known SNPs from former GWAS were associated with total mortality in individuals who experienced MI during follow-up.

Results

In Stage I, 15 loci passed the threshold of 5×10^-6; 8 loci for MI and 8 loci for CHD, for which one locus overlapped and none were reported in previous GWAS meta-analyses. We took 60 SNPs representing these 15 loci to Stage II of discovery. Four SNPs near QKI showed nominally significant association with MI (p-value<8.8×10^-3) and three exceeded the genome-wide significance threshold when Stage I and Stage II results were combined (top SNP rs6941513: p = 6.2×10^-9). Despite excellent power, the 9p21 locus SNP (rs1333049) was only modestly associated with MI (HR = 1.09, p-value = 0.02) and marginally with CHD (HR = 1.06, p-value = 0.08). Among an inception cohort of those who experienced MI during follow-up, the risk allele of rs1333049 was associated with a decreased risk of subsequent mortality (HR = 0.90, p-value = 3.2×10^-3).

Conclusions

QKI represents a novel locus that may serve as a predictor of incident CHD in prospective studies. The association of the 9p21 locus both with increased risk of first myocardial infarction and longer survival after MI highlights the importance of study design in investigating genetic determinants of complex disorders.
