20 similar articles found.
1.
M. B. Freidin E. Yu. Bragina O. S. Fedorova I. A. Deev E. S. Kulikov L. M. Ogorodova V. P. Puzyrev 《Molecular Biology》2011,45(3):421-429
Genome-wide association studies are currently considered one of the most powerful tools for establishing the genetic basis of complex diseases. A number of such studies have been carried out for allergic diseases; however, in the Russian population, this analysis has not been performed so far. For the first time, we performed a genome-wide association study of allergic diseases in Russian residents of West Siberia. Two new loci associated with childhood bronchial asthma (20q13.12, rs2425656, P = 1.99 × 10⁻⁷; 1q32.1, rs3817222, rs12734001, P = 2.18 × 10⁻⁷ and 2.79 × 10⁻⁷, respectively) as well as one locus associated with allergic rhinitis (2q36.1, rs1597167, P = 3.69 × 10⁻⁷) were identified. Genes located in these loci, YWHAB and PPP1R12B for asthma and KCNE4 for allergic rhinitis, are suggested as new candidate genes for these diseases. It was also found that BAT1 (6p21.33), MAGI2 (7q21.11), and ACPL2 (3q23) are probably common (syntropic) genes of allergic disease and atopic sensitization. It was shown that RIT2 (18q12.3) and FSTL4 (5q31.1) genes can be involved in the control of lung function. The results of the study contribute to the body of data on genetic factors of allergy and expand the list of genes underlying these diseases.
2.
3.
Emily R. Davenport Darren A. Cusanovich Katelyn Michelini Luis B. Barreiro Carole Ober Yoav Gilad 《PloS one》2015,10(11)
The bacterial composition of the human fecal microbiome is influenced by many lifestyle factors, notably diet. It is less clear, however, what role host genetics plays in dictating the composition of bacteria living in the gut. In this study, we examined the association of ~200K host genotypes with the relative abundance of fecal bacterial taxa in a founder population, the Hutterites, during two seasons (n = 91 summer, n = 93 winter, n = 57 individuals collected in both). These individuals live and eat communally, minimizing variation due to environmental exposures, including diet, which could potentially mask small genetic effects. Using a GWAS approach that takes into account the relatedness between subjects, we identified at least 8 bacterial taxa whose abundances were associated with single nucleotide polymorphisms in the host genome in each season (at genome-wide FDR of 20%). For example, we identified an association between a taxon known to affect obesity (genus Akkermansia) and a variant near PLD1, a gene previously associated with body mass index. Moreover, we replicate a previously reported association from a quantitative trait locus (QTL) mapping study of fecal microbiome abundance in mice (genus Lactococcus, rs3747113, P = 3.13 × 10⁻⁷). Finally, based on the significance distribution of the associated microbiome QTLs in our study with respect to chromatin accessibility profiles, we identified tissues in which host genetic variation may be acting to influence bacterial abundance in the gut.
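The abstract above reports associations "at genome-wide FDR of 20%" without naming the procedure used. As an illustration only, the sketch below applies the Benjamini-Hochberg step-up rule, one common way to control FDR; the function name and the toy p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values, fdr):
    """Return the indices of tests declared significant at the given FDR.

    Assumption: Benjamini-Hochberg is used here purely as an example of an
    FDR-controlling procedure; the abstract does not say which one was applied.
    """
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: remember the largest rank k with p_(k) <= fdr * k / m.
        if p_values[idx] <= fdr * rank / m:
            cutoff_rank = rank
    # Every test ranked at or below the cutoff is declared significant.
    return sorted(order[:cutoff_rank])


# Hypothetical p-values for four taxon-SNP tests.
toy_p_values = [0.01, 0.5, 0.03, 0.9]
significant = benjamini_hochberg(toy_p_values, fdr=0.2)
```

With these toy inputs the first and third tests pass, because 0.01 ≤ 0.2 × 1/4 and 0.03 ≤ 0.2 × 2/4, while the remaining two exceed their thresholds.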
4.
Genome-wide association studies (GWAS) have evolved over the last ten years into a powerful tool for investigating the genetic architecture of human disease. In this work, we review the key concepts underlying GWAS, including the architecture of common diseases, the structure of common human genetic variation, technologies for capturing genetic information, study designs, and the statistical methods used for data analysis. We also look forward to the future beyond GWAS.
What to Learn in This Chapter
- Basic genetic concepts that drive genome-wide association studies
- Genotyping technologies and common study designs
- Statistical concepts for GWAS analysis
- Replication, interpretation, and follow-up of association results
This article is part of the “Translational Bioinformatics” collection for PLOS Computational Biology.
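To make the "statistical concepts for GWAS analysis" item concrete, here is a minimal sketch of a single-SNP allelic association test followed by a Bonferroni-style genome-wide threshold. The function names and allele counts are hypothetical; real analyses typically also adjust for covariates and population structure, which this sketch ignores.

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """1-df Pearson chi-square test on a 2x2 table of allele counts.

    Assumes every row and column total is non-zero.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_totals = [sum(r) for r in table]
    col_totals = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error at alpha."""
    return alpha / n_tests


# Hypothetical allele counts: 60 alt / 40 ref in cases, 40 alt / 60 ref in controls.
stat, p_value = allelic_chi_square(60, 40, 40, 60)
threshold = bonferroni_threshold(0.05, 1_000_000)
```

For these toy counts the statistic is 8 and the p-value is about 0.0047, which is nominally significant but far above a threshold of 0.05 divided by one million tests. This gap is why GWAS findings are usually reported at very small p-values.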
5.
The mitochondrial DNA (mtDNA) of 60 Russians from West Siberia was analyzed with the following restriction enzymes, each of which recognizes a 6-bp site: BamHI, HindIII, PstI, PvuII and SacI. The observed restriction fragment length polymorphisms (morphs) were classified into 13 types of distinct cleavage patterns (mitotypes). The distributions of the mtDNA morphs were compared with those characteristic of some other human populations.
6.
Binnaz Yalcin Jérôme Nicod Amarjit Bhomra Stuart Davidson James Cleak Laurent Farinelli Magne Østerås Adam Whitley Wei Yuan Xiangchao Gan Martin Goodson Paul Klenerman Ansu Satpathy Diane Mathis Christophe Benoist David J. Adams Richard Mott Jonathan Flint 《PLoS genetics》2010,6(9)
Genome-wide association studies using commercially available outbred mice can detect genes involved in phenotypes of biomedical interest. Useful populations need high-frequency alleles to ensure high power to detect quantitative trait loci (QTLs), low linkage disequilibrium between markers to obtain accurate mapping resolution, and an absence of population structure to prevent false positive associations. We surveyed 66 colonies for inbreeding, genetic diversity, and linkage disequilibrium, and we demonstrate that some have haplotype blocks of less than 100 Kb, enabling gene-level mapping resolution. The same alleles contribute to variation in different colonies, so that when mapping progress stalls in one, another can be used in its stead. Colonies are genetically diverse: 45% of the total genetic variation is attributable to differences between colonies. However, quantitative differences in allele frequencies, rather than the existence of private alleles, are responsible for these population differences. The colonies derive from a limited pool of ancestral haplotypes resembling those found in inbred strains: over 95% of sequence variants segregating in outbred populations are found in inbred strains. Consequently it is possible to impute the sequence of any mouse from a dense SNP map combined with inbred strain sequence data, which opens up the possibility of cataloguing and testing all variants for association, a situation that has so far eluded studies in completely outbred populations. We demonstrate the colonies' potential by identifying a deletion in the promoter of H2-Ea as the molecular change that strongly contributes to setting the ratio of CD4+ and CD8+ lymphocytes.
7.
On the Analysis of Genome-Wide Association Studies in Family-Based Designs: A Universal, Robust Analysis Approach and an Application to Four Genome-Wide Association Studies
Sungho Won Jemma B. Wilk Rasika A. Mathias Christopher J. O'Donnell Edwin K. Silverman Kathleen Barnes George T. O'Connor Scott T. Weiss Christoph Lange 《PLoS genetics》2009,5(11)
For genome-wide association studies in family-based designs, we propose a new, universally applicable approach. The new test statistic exploits all available information about the association, while, by virtue of its design, it maintains the same robustness against population admixture as traditional family-based approaches that are based exclusively on the within-family information. The approach is suitable for the analysis of almost any trait type, e.g. binary, continuous, time-to-onset, multivariate, etc., and combinations of those. We use simulation studies to verify all theoretically derived properties of the approach, estimate its power, and compare it with other standard approaches. We illustrate the practical implications of the new analysis method by an application to a lung-function phenotype, forced expiratory volume in one second (FEV1) in 4 genome-wide association studies.
8.
9.
Freĭdin MB Bragina EIu Fedorova OS Deev IA Kulikov ES Ogorodova LM Puzyrev VP 《Molekuliarnaia biologiia》2011,45(3):464-472
Genome-wide association studies are currently considered one of the most powerful tools for establishing the genetic basis of complex diseases. A number of such studies have been carried out for allergic diseases; however, in the Russian population this analysis has not been performed so far. For the first time, we performed a genome-wide association study of allergic diseases in Russian inhabitants of Western Siberia. Two new loci associated with childhood bronchial asthma were identified (20q13.12, rs2425656, P = 1.99 × 10⁻⁷; 1q32.1, rs3817222, rs12734001, P = 2.18 × 10⁻⁷ and 2.79 × 10⁻⁷, respectively) as well as one locus associated with allergic rhinitis (2q36.1, rs1597167, P = 3.69 × 10⁻⁷). Genes located in the loci, YWHAB and PPP1R12B for asthma and KCNE4 for allergic rhinitis, are new genes for these diseases. It was found that the BAT1 (6p21.33), MAGI2 (7q21.11) and ACPL2 (3q23) genes are likely common (syntropic) genes of allergic disease and atopic sensitisation. It was shown that the RIT2 (18q12.3) and FSTL4 (5q31.1) genes can be involved in the control of lung function. The results of the study enlarge the body of data on genetic factors of allergy and expand the list of genes underlying these diseases.
10.
Genome-wide association studies (GWAS) are designed to identify the portion of single-nucleotide polymorphisms (SNPs) in genome sequences associated with a complex trait. Strategies based on the gene list enrichment concept are currently applied for the functional analysis of GWAS, according to which a significant overrepresentation of candidate genes associated with a biological pathway is used as a proxy to infer overrepresentation of candidate SNPs in the pathway. Here we show that such inference is not always valid and introduce the program SNP2GO, which implements a new method to properly test for the overrepresentation of candidate SNPs in biological pathways.
11.
It is widely acknowledged that genome-wide association studies (GWAS) of complex human disease fail to explain a large portion of heritability, primarily due to lack of statistical power, a problem that is exacerbated when seeking detection of interactions of multiple genomic loci. An untapped source of information that is already widely available, and that is expected to grow in coming years, is population samples. Such samples contain genetic marker data for additional individuals, but not their relevant phenotypes. In this article we develop a highly efficient testing framework based on a constrained maximum-likelihood estimate in a case–control–population setting. We leverage the available population data and optional modeling assumptions, such as Hardy–Weinberg equilibrium (HWE) in the population and linkage equilibrium (LE) between distal loci, to substantially improve power of association and interaction tests. We demonstrate, via simulation and application to actual GWAS data sets, that our approach is substantially more powerful and robust than standard testing approaches that ignore or make naive use of the population sample. We report several novel and credible pairwise interactions in bipolar disorder, coronary artery disease, Crohn’s disease, and rheumatoid arthritis.
12.
Genome-wide association studies have been extensively conducted, searching for markers for biologically meaningful outcomes and phenotypes. Penalization methods have been adopted in the analysis of the joint effects of a large number of SNPs (single nucleotide polymorphisms) and marker identification. This study is partly motivated by the analysis of a heterogeneous stock mice dataset, in which multiple correlated phenotypes and a large number of SNPs are available. Existing penalization methods designed to analyze a single response variable cannot accommodate the correlation among multiple response variables. With multiple response variables sharing the same set of markers, joint modeling is first employed to accommodate the correlation. The group Lasso approach is adopted to select markers associated with all the outcome variables. An efficient computational algorithm is developed. A simulation study and analysis of the heterogeneous stock mice dataset show that the proposed method can outperform existing penalization methods.
13.
Genome-wide association studies (GWAS) aim to identify genetic variants related to diseases by examining the associations between phenotypes and hundreds of thousands of genotyped markers. Because many genes are potentially involved in common diseases and a large number of markers are analyzed, it is crucial to devise an effective strategy to identify truly associated variants that have individual and/or interactive effects, while controlling false positives at the desired level. Although a number of model selection methods have been proposed in the literature, including marginal search, exhaustive search, and forward search, their relative performance has only been evaluated through limited simulations due to the lack of an analytical approach to calculating the power of these methods. This article develops a novel statistical approach for power calculation, derives accurate formulas for the power of different model selection strategies, and then uses the formulas to evaluate and compare these strategies in genetic model spaces. In contrast to previous studies, our theoretical framework allows for random genotypes, correlations among test statistics, and a false-positive control based on GWAS practice. After the accuracy of our analytical results is validated through simulations, they are utilized to systematically evaluate and compare the performance of these strategies in a wide class of genetic models. For a specific genetic model, our results clearly reveal how different factors, such as effect size, allele frequency, and interaction, jointly affect the statistical power of each strategy. An example is provided for the application of our approach to empirical research. The statistical approach used in our derivations is general and can be employed to address the model selection problems in other random predictor settings. 
We have developed an R package markerSearchPower to implement our formulas, which can be downloaded from the Comprehensive R Archive Network (CRAN) or http://bioinformatics.med.yale.edu/group/.
14.
Xing Zhang Jinming Zhao Yuanpeng Bu Dong Xue Zhangxiong Liu Xiangnan Li Jing Huang Na Guo Haitang Wang Han Xing Lijuan Qiu 《Plant Molecular Biology Reporter》2018,36(4):605-617
Soybean seed hardness is an important quality character in soybean food processing. Both vegetable soybean and natto require soft seeds to achieve a desirable sensory experience and for effective processing. In this study, we used a texture analyzer to measure the seed hardness of the Chinese mini core collection via two indexes over 4 years and found significant correlations among the seed hardness, seed oil content, and germplasm eco-region. Based on 1514 SNPs, genome-wide association studies (GWAS) were conducted using a mixed linear model (MLM). Seventeen SNPs were identified to be associated with seed hardness in at least two environments. Among them, one locus, designated Q-15-0087770, was associated with two indexes, and 13 putative genes were confirmed based on their annotations in SoyBase. This research provides new insights into advanced marker-assisted selections for breeding soybeans for seed hardness and oil content.
15.
Li Zhang Jiasen Liu Fuping Zhao Hangxing Ren Lingyang Xu Jian Lu Shifang Zhang Xiaoning Zhang Caihong Wei Guobin Lu Youmin Zheng Lixin Du 《PloS one》2013,8(6)
Background
Growth and meat production traits are significant economic traits in sheep. The aim of the study is to identify candidate genes affecting growth and meat production traits at the genome level with high-throughput single nucleotide polymorphism (SNP) genotyping technologies.
Methodology and Results
Using the Illumina OvineSNP50 BeadChip, we performed a GWA study in 329 purebred sheep for 11 growth and meat production traits (birth weight, weaning weight, 6-month weight, eye muscle area, fat thickness, pre-weaning gain, post-weaning gain, daily weight gain, height at withers, chest girth, and shin circumference). After quality control, 319 sheep and 48,198 SNPs were analyzed with the TASSEL program in a mixed linear model (MLM). 36 significant SNPs were identified for 7 traits, and 10 of them reached the genome-wise significance level for post-weaning gain. Gene annotation was implemented with the latest sheep genome, Ovis_aries_v3.1 (released October 2012). More than one-third of the SNPs (14 out of 36) were located within ovine genes; the others were located close to ovine genes (878 bp to 398,165 bp apart). The strongest new finding is that 5 genes were thought to be the most crucial candidate genes associated with post-weaning gain: s58995.1 was located within the ovine genes MEF2B and RFXANK, and OAR3_84073899.1, OAR3_115712045.1 and OAR9_91721507.1 were located within CAMKMT, TRHDE, and RIPK2, respectively. GRM1, POL, MBD5, UBR2, RPL7 and SMC2 were also thought to be important candidate genes affecting post-weaning gain. Additionally, 25 genes at the chromosome-wise significance level were forecast to be promising genes that influence sheep growth and meat production traits.
Conclusions
The results will contribute to similar studies and facilitate the potential future utilization of genes involved in growth and meat production traits in sheep.
16.
Yang Wu Huizhong Fan Yanhui Wang Lupei Zhang Xue Gao Yan Chen Junya Li HongYan Ren Huijiang Gao 《PloS one》2014,9(10)
Recent advances in high-throughput genotyping technologies have provided the opportunity to map genes using associations between complex traits and markers. Genome-wide association studies (GWAS) based on either a single marker or haplotype have identified genetic variants and underlying genetic mechanisms of quantitative traits. Prompted by the achievements of studies examining economic traits in cattle and to verify the consistency of these two methods using real data, the current study was conducted to construct the haplotype structure in the bovine genome and to detect relevant genes genuinely affecting a carcass trait and a meat quality trait. Using the Illumina BovineHD BeadChip, 942 young bulls with genotyping data were introduced as a reference population to identify the genes in the beef cattle genome significantly associated with foreshank weight and triglyceride levels. In total, 92,553 haplotype blocks were detected in the genome. The regions of high linkage disequilibrium extended up to approximately 200 kb, and the size of haplotype blocks ranged from 22 bp to 199,266 bp. Additionally, the individual SNP analysis and the haplotype-based analysis detected similar regions and common SNPs for these two representative traits. A total of 12 and 7 SNPs in the bovine genome were significantly associated with foreshank weight and triglyceride levels, respectively. By comparison, 4 and 5 haplotype blocks containing the majority of significant SNPs were strongly associated with foreshank weight and triglyceride levels, respectively. In addition, 36 SNPs with high linkage disequilibrium were detected in the GNAQ gene, a potential hotspot that may play a crucial role for regulating carcass trait components.
17.
Principal Component Analysis Characterizes Shared Pathogenetics from Genome-Wide Association Studies
Genome-wide association studies (GWASs) have recently revealed many genetic associations that are shared between different diseases. We propose a method, disPCA, for genome-wide characterization of shared and distinct risk factors between and within disease classes. It flips the conventional GWAS paradigm by analyzing the diseases themselves, across GWAS datasets, to explore their “shared pathogenetics”. The method applies principal component analysis (PCA) to gene-level significance scores across all genes and across GWASs, thereby revealing shared pathogenetics between diseases in an unsupervised fashion. Importantly, it adjusts for potential sources of heterogeneity present between GWASs, which can confound investigation of shared disease etiology. We applied disPCA to 31 GWASs, including autoimmune diseases, cancers, psychiatric disorders, and neurological disorders. The leading principal components separate these disease classes, as well as inflammatory bowel diseases from other autoimmune diseases. Generally, distinct diseases from the same class tend to be less separated, which is in line with their increased shared etiology. Enrichment analysis of genes contributing to leading principal components revealed pathways that are implicated in the immune system, while also pointing to pathways that have yet to be explored in this context. Our results point to the potential of disPCA in going beyond epidemiological findings of the co-occurrence of distinct diseases, to highlighting novel genes and pathways that unsupervised learning suggests to be key players in the variability across diseases.
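The core idea of applying PCA to a GWAS-by-gene matrix of significance scores can be sketched as follows. This is not the disPCA implementation: it omits the heterogeneity adjustment the abstract mentions, it extracts only the first principal component (by power iteration on the Gram matrix between GWASs), and the function name and toy scores are hypothetical.

```python
import math


def leading_pc_scores(scores, n_iter=200):
    """Return unit-norm first-principal-component coordinates, one per GWAS (row).

    The coordinates are defined only up to scale and sign.
    """
    n_rows, n_cols = len(scores), len(scores[0])
    # Center every gene (column) across GWASs.
    means = [sum(row[j] for row in scores) / n_rows for j in range(n_cols)]
    centered = [[row[j] - means[j] for j in range(n_cols)] for row in scores]
    # Gram matrix between GWASs; its top eigenvector gives the PC1 coordinates.
    gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in centered]
            for r1 in centered]
    # Non-constant start vector: a constant vector is annihilated by the
    # centered Gram matrix. Power iteration also assumes the start vector is
    # not orthogonal to the top eigenvector.
    vec = [float(i + 1) for i in range(n_rows)]
    for _ in range(n_iter):
        nxt = [sum(g * v for g, v in zip(row, vec)) for row in gram]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:  # all rows identical: no variance to explain
            break
        vec = [x / norm for x in nxt]
    return vec


# Hypothetical gene-level scores: rows are four GWASs, columns are three genes.
# Rows 0-1 mimic one disease class, rows 2-3 another.
toy_scores = [[5, 4, 0], [4, 5, 0], [0, 1, 5], [1, 0, 4]]
pc1 = leading_pc_scores(toy_scores)
```

In this toy matrix the first two GWASs receive PC1 coordinates of one sign and the last two of the opposite sign, which is the kind of class separation the abstract describes for the leading components.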
18.
Understanding the role of genetic variation in human diseases remains an important problem to be solved in genomics. An important component of such variation consists of variations at single sites in DNA, or single nucleotide polymorphisms (SNPs). Typically, the problem of associating particular SNPs to phenotypes has been confounded by hidden factors such as the presence of population structure, family structure or cryptic relatedness in the sample of individuals being analyzed. Such confounding factors lead to a large number of spurious associations and missed associations. Various statistical methods have been proposed to account for such confounding factors, such as linear mixed-effect models (LMMs) or methods that adjust data based on a principal components analysis (PCA), but these methods either suffer from low power or cease to be tractable for larger numbers of individuals in the sample. Here we present a statistical model for conducting genome-wide association studies (GWAS) that accounts for such confounding factors. Our method scales in runtime quadratic in the number of individuals being studied with only a modest loss in statistical power as compared to LMM-based and PCA-based methods when testing on synthetic data that was generated from a generalized LMM. Applying our method to both real and synthetic human genotype/phenotype data, we demonstrate the ability of our model to correct for confounding factors while requiring significantly less runtime relative to LMMs. We have implemented methods for fitting these models, which are available at http://www.microsoft.com/science.
19.
Yi-Cheng Chang Pi-Hua Liu Yu-Hsiang Yu Shan-Shan Kuo Tien-Jyun Chang Yi-Der Jiang Jiun-Yi Nong Juey-Jen Hwang Lee-Ming Chuang 《PloS one》2014,9(4)
Background
Several genome-wide association studies (GWAS) involving European populations have successfully identified risk genetic variants associated with type 2 diabetes mellitus (T2DM). However, the effects conferred by these variants in the Han Chinese population have not yet been fully elucidated.
Methods
We analyzed the effects of 24 risk genetic variants with reported associations from European GWAS in 3,040 Han Chinese subjects in Taiwan (including 1,520 T2DM cases and 1,520 controls). The discriminative power of the prediction models with and without genotype scores was compared. We further meta-analyzed the association of these variants with T2DM by pooling all candidate-gene association studies conducted in Han Chinese.
Results
Five risk variants in IGF2BP2 (rs4402960, rs1470579), CDKAL1 (rs10946398), SLC30A8 (rs13266634), and HHEX (rs1111875) genes were nominally associated with T2DM in our samples. The odds ratio was 2.22 (95% confidence interval, 1.81-2.73, P<0.0001) for subjects with the highest genetic score quartile (score>34) as compared with subjects with the lowest quartile (score<29). The incorporation of genotype score into the predictive model increased the C-statistics from 0.627 to 0.657 (P<0.0001). These estimates are very close to those observed in European populations. Gene-environment interaction analysis showed a significant interaction between rs13266634 in the SLC30A8 gene and age on T2DM risk (P<0.0001). Further meta-analysis pooling 20 studies in Han Chinese confirmed the association of 10 genetic variants in IGF2BP2, CDKAL1, JAZF1, SLC30A8, HHEX, TCF7L2, EXT2, and FTO genes with T2DM. The effect sizes conferred by these risk variants in Han Chinese were similar to those observed in Europeans, but the allele frequencies differ substantially between the two populations.
Conclusion
We confirmed the association of 10 variants identified by European GWAS with T2DM in the Han Chinese population. The incorporation of genotype scores into the prediction model led to a small but significant improvement in T2DM prediction.
20.
To date, the genome-wide association study (GWAS) is the primary tool to identify genetic variants that cause phenotypic variation. As GWAS analyses are generally univariate in nature, multivariate phenotypic information is usually reduced to a single composite score. This practice often results in loss of statistical power to detect causal variants. Multivariate genotype–phenotype methods do exist but attain maximal power only in special circumstances. Here, we present a new multivariate method that we refer to as TATES (Trait-based Association Test that uses Extended Simes procedure), inspired by the GATES procedure proposed by Li et al (2011). For each component of a multivariate trait, TATES combines p-values obtained in standard univariate GWAS to acquire one trait-based p-value, while correcting for correlations between components. Extensive simulations, probing a wide variety of genotype–phenotype models, show that TATES's false positive rate is correct, and that TATES's statistical power to detect causal variants explaining 0.5% of the variance can be 2.5–9 times higher than the power of univariate tests based on composite scores and 1.5–2 times higher than the power of the standard MANOVA. Unlike other multivariate methods, TATES detects both genetic variants that are common to multiple phenotypes and genetic variants that are specific to a single phenotype, i.e. TATES provides a more complete view of the genetic architecture of complex traits. As the actual causal genotype–phenotype model is usually unknown and probably phenotypically and genetically complex, TATES, available as an open source program, constitutes a powerful new multivariate strategy that allows researchers to identify novel causal variants, while the complexity of traits is no longer a limiting factor.
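The idea of combining per-phenotype p-values into one trait-based p-value can be illustrated with the ordinary (unextended) Simes procedure. This is a simplified sketch, not TATES itself: the abstract says TATES corrects for correlations between trait components, and that correction is not reproduced here. The function name and the p-values are hypothetical.

```python
def simes_combined_p(p_values):
    """Combine m p-values with the plain Simes rule: min over j of m * p_(j) / j.

    Assumption: no correlation correction is applied, so this is only the
    starting point that an extended Simes procedure builds on.
    """
    m = len(p_values)
    ordered = sorted(p_values)
    return min(m * p / rank for rank, p in enumerate(ordered, start=1))


# Hypothetical univariate GWAS p-values for one SNP across three phenotypes.
toy_p_values = [0.01, 0.04, 0.03]
trait_based_p = simes_combined_p(toy_p_values)
```

For the toy values the candidates are 3 × 0.01 / 1, 3 × 0.03 / 2 and 3 × 0.04 / 3, so the combined p-value is driven by the smallest single-phenotype p-value. A SNP that affects only one phenotype can therefore still be detected, which matches the property the abstract highlights.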