Similar literature
 20 similar articles found
1.
Identifying causal genetic variants underlying heritable phenotypic variation is a long‐standing goal in evolutionary genetics. We previously identified several quantitative trait loci (QTL) for five morphological traits in a captive population of zebra finches (Taeniopygia guttata) by whole‐genome linkage mapping. We here follow up on these studies with the aim of narrowing down the quantitative trait variants (QTN) in one wild and three captive populations. First, we performed an association study using 672 single nucleotide polymorphisms (SNPs) within candidate genes located in the previously identified QTL regions in a sample of 939 wild‐caught zebra finches. Then, we validated the most promising SNP–phenotype associations (n = 25 SNPs) in 5228 birds from four populations. Genotype–phenotype associations were generally weak in the wild population, where linkage disequilibrium (LD) spans only short genomic distances. In contrast, in captive populations, where LD blocks are large, apparent SNP effects on morphological traits (i.e. associations) were highly repeatable with independent data from the same population. Most of those SNPs also showed significant associations with the same trait in other captive populations, but the direction and magnitude of these effects varied among populations. This suggests that the tested SNPs are not the causal QTN but rather physically linked to them, and that LD between SNPs and causal variants differs between populations due to founder effects. While the identification of QTN remains challenging in nonmodel organisms, we illustrate that it is indeed possible to confirm the location and magnitude of QTL in a population with stable linkage between markers and causal variants.

2.
Quantitative traits often underlie risk for complex diseases. Many studies collect multiple correlated quantitative phenotypes and perform a separate univariate analysis on each. However, this strategy may lack power and is limited in its ability to detect pleiotropic genes that may underlie correlated quantitative traits. In addition, testing multiple traits individually exacerbates the problem of multiple testing. In this study, generalized estimating equation 2 (GEE2) is applied to association mapping of two correlated quantitative traits. We suppose that a quantitative trait locus is located in a chromosome region that exerts pleiotropic effects on multiple quantitative traits. In that region, multiple SNPs are genotyped. Genotypes of these SNPs and the two quantitative traits affected by a causal SNP were simulated under various parameter values: residual correlation coefficient between two traits, causal SNP heritability, minor allele frequency of the causal SNP, extent of linkage disequilibrium with the causal SNP, and the test sample size. Power analyses show that the bivariate method is generally more powerful than the univariate method. This method is robust and yields false-positive rates close to the pre-set nominal significance level. Our real data analyses attested to the usefulness of the method.

3.
The genome-wide association study (GWAS) approach has discovered hundreds of genetic variants associated with diseases and quantitative traits. However, despite clinical overlap and statistical correlation between many phenotypes, GWAS are generally performed one-phenotype-at-a-time. Here we compare the performance of modelling multiple phenotypes jointly with that of the standard univariate approach. We introduce a new method and software, MultiPhen, that models multiple phenotypes simultaneously in a fast and interpretable way. By performing ordinal regression, MultiPhen tests the linear combination of phenotypes most associated with the genotypes at each SNP, and thus potentially captures effects hidden to single phenotype GWAS. We demonstrate via simulation that this approach provides a dramatic increase in power in many scenarios. There is a boost in power for variants that affect multiple phenotypes and for those that affect only one phenotype. While other multivariate methods have similar power gains, we describe several benefits of MultiPhen over these. In particular, we demonstrate that other multivariate methods that assume the genotypes are normally distributed, such as canonical correlation analysis (CCA) and MANOVA, can have highly inflated type-1 error rates when testing case-control or non-normal continuous phenotypes, while MultiPhen produces no such inflation. To test the performance of MultiPhen on real data we applied it to lipid traits in the Northern Finland Birth Cohort 1966 (NFBC1966). In these data MultiPhen discovers 21% more independent SNPs with known associations than the standard univariate GWAS approach, while applying MultiPhen in addition to the standard approach provides 37% increased discovery. 
The most associated linear combinations of the lipids estimated by MultiPhen at the leading SNPs accurately reflect the Friedewald Formula, suggesting that MultiPhen could be used to refine the definition of existing phenotypes or uncover novel heritable phenotypes.
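
The Friedewald formula mentioned in this abstract is itself a fixed linear combination of lipid measurements, which is why a method that estimates linear combinations of phenotypes could plausibly recover it. A minimal sketch, assuming concentrations in mg/dL; the function name and the example values are illustrative and are not taken from the study:

```python
def friedewald_ldl(total_chol, hdl, triglycerides):
    """Estimate LDL cholesterol (mg/dL) with the Friedewald formula.

    LDL = TC - HDL - TG / 5, i.e. lipid weights of (1, -1, -0.2).
    The formula is conventionally not applied when TG >= 400 mg/dL.
    """
    if triglycerides >= 400:
        raise ValueError("Friedewald estimate is unreliable for TG >= 400 mg/dL")
    return total_chol - hdl - triglycerides / 5.0


# Illustrative (made-up) lipid panel in mg/dL.
ldl = friedewald_ldl(total_chol=200.0, hdl=50.0, triglycerides=150.0)
print(ldl)
```

The weights (1, -1, -0.2) on total cholesterol, HDL and triglycerides are the kind of coefficient pattern the abstract says the estimated combinations reflect.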

4.
In many case-control genetic association studies, a set of correlated secondary phenotypes that may share common genetic factors with disease status are collected. Examination of these secondary phenotypes can yield valuable insights about the disease etiology and supplement the main studies. However, due to unequal sampling probabilities between cases and controls, standard regression analysis that assesses the effect of SNPs (single nucleotide polymorphisms) on secondary phenotypes using cases only, controls only, or combined samples of cases and controls can yield inflated type I error rates when the test SNP is associated with the disease. To solve this issue, we propose a Gaussian copula-based approach that efficiently models the dependence between disease status and secondary phenotypes. Through simulations, we show that our method yields correct type I error rates for the analysis of secondary phenotypes under a wide range of situations. To illustrate the effectiveness of our method in the analysis of real data, we applied our method to a genome-wide association study on high-density lipoprotein cholesterol (HDL-C), where "cases" are defined as individuals with extremely high HDL-C level and "controls" are defined as those with low HDL-C level. We treated 4 quantitative traits with varying degrees of correlation with HDL-C as secondary phenotypes and tested for association with SNPs in LIPG, a gene that is well known to be associated with HDL-C. We show that when the correlation between the primary and secondary phenotypes is >0.2, the P values from case-control combined unadjusted analysis are much more significant than methods that aim to correct for ascertainment bias. Our results suggest that to avoid false-positive associations, it is important to appropriately model secondary phenotypes in case-control genetic association studies.

5.
Robust assessment of genetic effects on quantitative traits or complex-disease risk requires synthesis of evidence from multiple studies. Frequently, studies have genotyped partially overlapping sets of SNPs within a gene or region of interest, hampering attempts to combine all the available data. By using the example of C-reactive protein (CRP) as a quantitative trait, we show how linkage disequilibrium in and around its gene facilitates use of Bayesian hierarchical models to integrate informative data from all available genetic association studies of this trait, irrespective of the SNP typed. A variable selection scheme, followed by contextualization of SNPs exhibiting independent associations within the haplotype structure of the gene, enhanced our ability to infer likely causal variants in this region with population-scale data. This strategy, based on data from a literature-based systematic review and substantial new genotyping, facilitated the most comprehensive evaluation to date of the role of variants governing CRP levels, providing important information on the minimal subset of SNPs necessary for comprehensive evaluation of the likely causal relevance of elevated CRP levels for coronary-heart-disease risk by Mendelian randomization. The same method could be applied to evidence synthesis of other quantitative traits, whenever the typed SNPs vary among studies, and to assist fine mapping of causal variants.

6.
Xiao J, Wang X, Hu Z, Tang Z, Xu C. Heredity 2007, 98(6): 427-435
Segregation analysis is a method of detecting major genes for quantitative traits without using marker information. It serves as an important tool in helping investigators to plan further studies such as quantitative trait loci mapping or more sophisticated genomic analyses. However, current methods of segregation analysis for a single trait typically have low statistical power. We propose a multivariate segregation analysis (MSA) that takes advantage of the correlation structure of multiple quantitative traits to detect major genes. This method not only increases the statistical power, but also allows dissection of the genetic architecture underlying the trait complex. In MSA the observed phenotypes of multiple correlated traits are fitted to a multivariate Gaussian mixture model. Model parameters are estimated under the maximum likelihood framework via the expectation-maximization algorithm. The presence of major genes is tested using likelihood ratio test statistics. Pleiotropy is distinguished from close linkage by comparing three possible models using the Bayesian information criterion. Two simulation experiments were performed based on the F2 mating design. In the first, the statistical properties of MSA under varying heritabilities and sample sizes were investigated and the results compared with those obtained from single-trait analysis. In the second simulation the efficacy of MSA in separating pleiotropy from close linkage was demonstrated. Finally, the new method was applied to real data and detected a major gene responsible for both plant height and tiller number in rice.
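
The model-comparison step described in this abstract can be sketched as follows. This is only an illustration of choosing among candidate models by the Bayesian information criterion, not the authors' implementation; the model names, log-likelihoods and parameter counts below are hypothetical.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def select_model(fits, n_obs):
    """Return the name of the model with the smallest BIC.

    `fits` maps a model name to (maximized log-likelihood, number of parameters).
    """
    return min(fits, key=lambda name: bic(fits[name][0], fits[name][1], n_obs))


# Hypothetical fits for 200 individuals; every number here is made up.
fits = {
    "pleiotropy": (-512.3, 9),
    "close_linkage": (-511.9, 12),
    "single_trait_gene": (-530.4, 8),
}
best = select_model(fits, n_obs=200)
print(best)
```

In this made-up case the close-linkage model has a slightly higher likelihood, but its extra parameters are penalized, so the BIC favors the more parsimonious pleiotropy model.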

7.
Yi Jia, Jean-Luc Jannink. Genetics 2012, 192(4): 1513-1522
Genetic correlations between quantitative traits measured in many breeding programs are pervasive. These correlations indicate that measurements of one trait carry information on other traits. Current single-trait (univariate) genomic selection does not take advantage of this information. Multivariate genomic selection on multiple traits could accomplish this but has been little explored and tested in practical breeding programs. In this study, three multivariate linear models (i.e., GBLUP, BayesA, and BayesCπ) were presented and compared to univariate models using simulated and real quantitative traits controlled by different genetic architectures. We also extended BayesA with fixed hyperparameters to a full hierarchical model that estimated hyperparameters and BayesCπ to impute missing phenotypes. We found that optimal marker-effect variance priors depended on the genetic architecture of the trait so that estimating them was beneficial. We showed that the prediction accuracy for a low-heritability trait could be significantly increased by multivariate genomic selection when a correlated high-heritability trait was available. Further, multiple-trait genomic selection had higher prediction accuracy than single-trait genomic selection when phenotypes were not available on all individuals and traits. Additional factors affecting the performance of multiple-trait genomic selection were explored.

8.
Inferring causal phenotype networks from segregating populations   Cited by: 2 (self-citations: 1, citations by others: 1)
A major goal in the study of complex traits is to decipher the causal interrelationships among correlated phenotypes. Current methods mostly yield undirected networks that connect phenotypes without causal orientation. Some of these connections may be spurious due to partial correlation that is not causal. We show how to build causal direction into an undirected network of phenotypes by including causal QTL for each phenotype. We evaluate causal direction for each edge connecting two phenotypes, using a LOD score. This new approach can be applied to many different population structures, including inbred and outbred crosses as well as natural populations, and can accommodate feedback loops. We assess its performance in simulation studies and show that our method recovers network edges and infers causal direction correctly at a high rate. Finally, we illustrate our method with an example involving gene expression and metabolite traits from experimental crosses.

9.
Jiang L, Liu J, Sun D, Ma P, Ding X, Yu Y, Zhang Q. PLoS ONE 2010, 5(10): e13661
Genome-wide association studies (GWAS) based on high-throughput SNP genotyping technologies open a broad avenue for exploring genes associated with milk production traits in dairy cattle. Motivated by the aim of pinpointing novel quantitative trait nucleotides (QTN) across the Bos taurus genome, the present study performs GWAS to identify genes affecting milk production traits using current state-of-the-art SNP genotyping technology, i.e., the Illumina BovineSNP50 BeadChip. The analyses cover the five most commonly evaluated milk production traits: milk yield (MY), milk fat yield (FY), milk protein yield (PY), milk fat percentage (FP) and milk protein percentage (PP). Estimated breeding values (EBVs) of 2,093 daughters from 14 paternal half-sib families are considered as phenotypes within the framework of a daughter design. Association tests between each trait and the 54K SNPs are carried out via two different analysis approaches, a paternal transmission disequilibrium test (TDT)-based approach (L1-TDT) and a mixed model based regression analysis (MMRA). In total, 105 SNPs were detected to be significantly associated genome-wise with one or multiple milk production traits. Of the 105 SNPs, 38 were detected by both methods, while four and 63 were detected solely by L1-TDT and MMRA, respectively. The majority (86 out of 105) of the significant SNPs are located within the reported QTL regions and some are within or close to the reported candidate genes. In particular, two SNPs, ARS-BFGL-NGS-4939 and BFGL-NGS-118998, are located close to the DGAT1 gene (160bp apart) and within the GHR gene, respectively. Our findings not only provide confirmatory evidence for previous findings, but also reveal a suite of novel SNPs associated with milk production traits, and thus form a solid basis for eventually unraveling the causal mutations for milk production traits in dairy cattle.

10.
Xu C, Li Z, Xu S. Genetics 2005, 169(2): 1045-1059
Joint mapping for multiple quantitative traits has shed new light on genetic mapping by pinpointing pleiotropic effects and close linkage. Joint mapping also can improve statistical power of QTL detection. However, such a joint mapping procedure has not been available for discrete traits. Most disease resistance traits are measured as one or more discrete characters. These discrete characters are often correlated. Joint mapping for multiple binary disease traits may provide an opportunity to explore pleiotropic effects and increase the statistical power of detecting disease loci. We develop a maximum-likelihood method for mapping multiple binary traits. We postulate a set of multivariate normal disease liabilities, each contributing to the phenotypic variance of one disease trait. The underlying liabilities are linked to the binary phenotypes through some underlying thresholds. The new method actually maps loci for the variation of multivariate normal liabilities. As a result, we are able to take advantage of existing methods of joint mapping for quantitative traits. We treat the multivariate liabilities as missing values so that an expectation-maximization (EM) algorithm can be applied here. We also extend the method to joint mapping for both discrete and continuous traits. Efficiency of the method is demonstrated using simulated data. We also apply the new method to a set of real data and detect several loci responsible for blast resistance in rice.

11.
J Jiang, Q Zhang, L Ma, J Li, Z Wang, J-F Liu. Heredity 2015, 115(1): 29-36
Predicting organismal phenotypes from genotype data is important for preventive and personalized medicine as well as plant and animal breeding. Although genome-wide association studies (GWAS) for complex traits have discovered a large number of trait- and disease-associated variants, phenotype prediction based on associated variants usually has low accuracy even for a high-heritability trait because these variants typically account for only a limited fraction of total genetic variance. In comparison with GWAS, whole-genome prediction (WGP) methods can increase prediction accuracy by making use of a huge number of variants simultaneously. Among the various statistical methods for WGP, multiple-trait models and antedependence models each have their own advantages. To take advantage of both strategies within a unified framework, we proposed a novel multivariate antedependence-based method for joint prediction of multiple quantitative traits, using a Bayesian algorithm that models a linear relationship between the effect vectors of each pair of adjacent markers. Through both simulation and real-data analyses, our studies demonstrated that the proposed antedependence-based multiple-trait WGP method is more accurate and robust than corresponding traditional counterparts (Bayes A and multi-trait Bayes A) under various scenarios. Our method can be readily extended to deal with missing phenotypes and resequence data with rare variants, offering a feasible way to jointly predict phenotypes for multiple complex traits in human genetic epidemiology as well as plant and livestock breeding.

12.
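A new method for deriving a composite trait in quantitative genetics   Cited by: 3 (self-citations: 1, citations by others: 3)
Zhou Shi'e (周士谔), Li Zixian (李子先). Acta Genetica Sinica (遗传学报) 1989, 16(4): 269-275
Using the maximum entropy principle introduced by Shannon, this paper proposes a new method for constructing a single composite trait and compares it with the multivariate statistical methods used in quantitative genetics. In multivariate genetic analysis, multivariate statistics are commonly used to derive composite traits from several quantitative traits, after which these correlated basic traits are analyzed genetically by principal component analysis or canonical correlation. This paper presents an alternative to multivariate statistics: a maximum entropy method that yields a single composite trait value from several quantitative traits. Its advantages include a simple mathematical structure, a clear procedure, and concise results.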

13.
Despite evidence of the clustering of metabolic syndrome components, current approaches for identifying unifying genetic mechanisms typically evaluate clinical categories that do not provide adequate etiological information. Here, we used data from 19,486 European American and 6,287 African American Candidate Gene Association Resource Consortium participants to identify loci associated with the clustering of metabolic phenotypes. Six phenotype domains (atherogenic dyslipidemia, vascular dysfunction, vascular inflammation, pro-thrombotic state, central obesity, and elevated plasma glucose) encompassing 19 quantitative traits were examined. Principal components analysis was used to reduce the dimension of each domain such that >55% of the trait variance was represented within each domain. We then applied a statistically efficient and computationally feasible multivariate approach that related eight principal components from the six domains to 250,000 imputed SNPs using an additive genetic model and including demographic covariates. In European Americans, we identified 606 genome-wide significant SNPs representing 19 loci. Many of these loci were associated with only one trait domain, were consistent with results in African Americans, and overlapped with published findings, for instance central obesity and FTO. However, our approach, which is applicable to any set of interval scale traits that is heritable and exhibits evidence of phenotypic clustering, identified three new loci in or near APOC1, BRAP, and PLCG1, which were associated with multiple phenotype domains. These pleiotropic loci may help characterize metabolic dysregulation and identify targets for intervention.
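
The dimension-reduction rule in this abstract (retain enough principal components to represent more than 55% of the trait variance in a domain) can be sketched as a simple selection over eigenvalues. This is an assumed reading of the rule, not the study's code; the function name and the eigenvalues are hypothetical.

```python
def components_needed(eigenvalues, threshold=0.55):
    """Smallest number of leading principal components whose cumulative
    share of the total variance exceeds `threshold`."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total > threshold:
            return k
    return len(ordered)


# Hypothetical eigenvalues of the correlation matrix for a four-trait domain.
print(components_needed([2.0, 1.0, 0.6, 0.4]))
```

With these made-up eigenvalues the first component covers exactly half of the variance, so a second component is needed to pass the 55% threshold.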

14.
15.
We have recently developed analysis methods (GREML) to estimate the genetic variance of a complex trait/disease and the genetic correlation between two complex traits/diseases using genome-wide single nucleotide polymorphism (SNP) data in unrelated individuals. Here we use analytical derivations and simulations to quantify the sampling variance of the estimate of the proportion of phenotypic variance captured by all SNPs for quantitative traits and case-control studies. We also derive the approximate sampling variance of the estimate of a genetic correlation in a bivariate analysis, when two complex traits are either measured on the same or different individuals. We show that the sampling variance is inversely proportional to the number of pairwise contrasts in the analysis and to the variance in SNP-derived genetic relationships. For bivariate analysis, the sampling variance of the genetic correlation additionally depends on the harmonic mean of the proportion of variance explained by the SNPs for the two traits and the genetic correlation between the traits, and depends on the phenotypic correlation when the traits are measured on the same individuals. We provide an online tool for calculating the power of detecting genetic (co)variation using genome-wide SNP data. The new theory and online tool will be helpful to plan experimental designs to estimate the missing heritability that has not yet been fully revealed through genome-wide association studies, and to estimate the genetic overlap between complex traits (diseases) in particular when the traits (diseases) are not measured on the same samples.
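
The proportionality stated in this abstract can be turned into a rough planning calculation. The sketch below encodes only that relationship (sampling variance inversely proportional to the number of pairwise contrasts and to the variance of SNP-derived relationships); the proportionality constant of one and the relationship variance used in the example are assumptions made for illustration, not values given in the abstract.

```python
import math


def approx_sampling_variance(n_individuals, var_relationship, constant=1.0):
    """Approximate sampling variance of a SNP-heritability estimate.

    Encodes only the proportionality stated in the abstract:
    variance ~ constant / (number of pairwise contrasts * variance of relationships).
    `constant` is an assumed placeholder, not a value from the paper.
    """
    n_pairs = n_individuals * (n_individuals - 1) / 2.0
    return constant / (n_pairs * var_relationship)


# Illustrative inputs: the relationship variance below is an assumed value.
for n in (2000, 5000, 10000):
    se = math.sqrt(approx_sampling_variance(n, var_relationship=2e-5))
    print(n, round(se, 4))
```

Because the number of pairwise contrasts grows roughly with the square of the sample size, doubling the sample size roughly halves the standard error under this approximation.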

16.
Simultaneous analysis of correlated traits that change with time is an important issue in genetic analyses. Several methodologies have already been proposed for the genetic analysis of longitudinal data on single traits, in particular random regression and character process models. Although the latter proved, in most cases, to compare favourably to alternative approaches for analysis of single function-valued traits, they do not allow a straightforward extension to the multivariate case. In this paper, another methodology (structured antedependence models) is proposed, and methods are derived for the genetic analysis of two or more correlated function-valued traits. Multivariate analyses are presented of fertility and mortality in Drosophila and of milk, fat and protein yields in dairy cattle. These models offer substantial flexibility for the correlation structure, even in the case of complex non-stationary patterns, and perform better than multivariate random regression models, with fewer parameters.

17.

Background

GWAS owe their popularity to the expectation that they will make a major impact on diagnosis, prognosis and management of disease by uncovering genetics underlying clinical phenotypes. The dominant paradigm in GWAS data analysis so far consists of extensive reliance on methods that emphasize contribution of individual SNPs to statistical association with phenotypes. Multivariate methods, however, can extract more information by considering associations of multiple SNPs simultaneously. Recent advances in other genomics domains pinpoint multivariate causal graph-based inference as a promising principled analysis framework for high-throughput data. Designed to discover biomarkers in the local causal pathway of the phenotype, these methods lead to accurate and highly parsimonious multivariate predictive models. In this paper, we investigate the applicability of causal graph-based method TIE* to analysis of GWAS data. To test the utility of TIE*, we focus on anti-CCP positive rheumatoid arthritis (RA) GWAS datasets, where there is a general consensus in the community about the major genetic determinants of the disease.

Results

Application of TIE* to the North American Rheumatoid Arthritis Cohort (NARAC) GWAS data results in six SNPs, mostly from the MHC locus. Using these SNPs we develop two predictive models that can classify cases and disease-free controls with an accuracy of 0.81 area under the ROC curve, as verified in independent testing data from the same cohort. The predictive performance of these models generalizes reasonably well to Swedish subjects from the closely related but not identical Epidemiological Investigation of Rheumatoid Arthritis (EIRA) cohort with 0.71-0.78 area under the ROC curve. Moreover, the SNPs identified by the TIE* method render many other previously known SNP associations conditionally independent of the phenotype.

Conclusions

Our experiments demonstrate that application of TIE* captures the maximum amount of genetic information about RA in the data and recapitulates the major consensus findings about the genetic factors of this disease. In addition, TIE* yields reproducible markers and signatures of RA. This suggests that a principled multivariate causal and predictive framework for GWAS analysis gives the community a new tool for high-quality and more efficient discovery.

Reviewers

This article was reviewed by Prof. Anthony Almudevar, Dr. Eugene V. Koonin, and Prof. Marianthi Markatou.

18.
The recent development of sequencing technology allows identification of association between the whole spectrum of genetic variants and complex diseases. Over the past few years, a number of association tests for rare variants have been developed. Jointly testing for association between genetic variants and multiple correlated phenotypes may increase the power to detect causal genes in family-based studies, but familial correlation needs to be appropriately handled to avoid an inflated type I error rate. Here we propose a novel approach for multivariate family data using kernel machine regression (denoted as MF-KM) that is based on a linear mixed-model framework and can be applied to a large range of studies with different types of traits. In our simulation studies, the usual kernel machine test has inflated type I error rates when applied directly to familial data, while our proposed MF-KM method preserves the expected type I error rates. Moreover, the MF-KM method has increased power compared to methods that either analyze each phenotype separately while considering family structure or use only unrelated founders from the families. Finally, we illustrate our proposed methodology by analyzing whole-genome genotyping data from a lung function study.

19.
An important task of human genetics studies is to accurately predict disease risks in individuals based on genetic markers, which allows for identifying individuals at high disease risk and facilitates their disease treatment and prevention. Although hundreds of genome-wide association studies (GWAS) have been conducted on many complex human traits in recent years, there has been only limited success in translating these GWAS data into clinically useful risk prediction models. The predictive capability of GWAS data is largely bottlenecked by the available training sample size due to the presence of numerous variants carrying only small to modest effects. Recent studies have shown that different human traits may share common genetic bases. Therefore, an attractive strategy to increase the training sample size and hence improve the prediction accuracy is to integrate data from genetically correlated phenotypes. Yet, the utility of genetic correlation in risk prediction has not been explored in the literature. In this paper, we analyzed GWAS data for bipolar and related disorders and schizophrenia with a bivariate ridge regression method, and found that jointly predicting the two phenotypes could substantially increase prediction accuracy as measured by the area under the receiver operating characteristic curve. We also found similar prediction accuracy improvements when we jointly analyzed GWAS data for Crohn’s disease and ulcerative colitis. The empirical observations were substantiated through our comprehensive simulation studies, suggesting that a gain in prediction accuracy can be obtained by combining phenotypes with relatively high genetic correlations. Through both real data and simulation studies, we demonstrated that pleiotropy can be leveraged as a valuable asset that opens up a new opportunity to improve genetic risk prediction in the future.

20.
Asthma is the most common chronic childhood disease in the developed nations, and is a complex disease that has high social and economic costs. Studies of the genetic etiology of asthma offer a way of improving our understanding of its pathogenesis, with the goal of improving preventive strategies, diagnostic tools, and therapies. Considerable effort and expense have been expended in attempts to detect specific polymorphisms in genetic loci contributing to asthma susceptibility. Concomitantly, the technology for detecting single nucleotide polymorphisms (SNPs) has undergone rapid development, extensive catalogues of SNPs across the genome have been constructed, and SNPs have been increasingly used as a method of investigating the genetic etiology of complex human diseases. This paper reviews both current and potential future contributions of SNPs to our understanding of asthma pathophysiology.
