Similar literature
Found 20 similar articles; search took 31 ms
1.
Z Li  J Möttönen  M J Sillanpää 《Heredity》2015,115(6):556-564
Linear regression-based quantitative trait loci/association mapping methods such as least squares commonly assume normality of residuals. In genetics studies of plants or animals, some quantitative traits may not follow a normal distribution because the data include outlying observations or are collected from multiple sources, and in such cases the normal regression methods may lose some statistical power to detect quantitative trait loci. In this work, we propose a robust multiple-locus regression approach for analyzing multiple quantitative traits without the normality assumption. In our method, the objective function is least absolute deviation (LAD), which corresponds to the assumption of multivariate Laplace distributed residual errors. This distribution has heavier tails than the normal distribution. In addition, we adopt a group LASSO penalty to produce shrinkage estimation of the marker effects and to describe the genetic correlation among phenotypes. Our LAD-LASSO approach is less sensitive to outliers and is more appropriate for the analysis of data with skewed phenotype distributions. Another application of our robust approach is to the missing-phenotype problem in multiple-trait analysis, where the missing phenotype items can simply be filled with some extreme values and treated as outliers. The efficiency of the LAD-LASSO approach is illustrated on both simulated and real data sets.
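
As a rough illustration of the kind of objective this abstract describes, the sketch below evaluates an LAD loss plus a group LASSO penalty for a given effect matrix. It is not the authors' implementation: the function name `lad_group_lasso_objective`, the element-wise reading of the absolute deviations, and the choice to group each marker's effects across all traits are assumptions made for illustration only.

```python
import math


def lad_group_lasso_objective(Y, X, B, lam):
    """Evaluate sum |y_ij - x_i . b_j| + lam * sum_k ||B_k||_2.

    Y: n x t list of phenotype values (n individuals, t traits).
    X: n x m list of marker codes (m markers).
    B: m x t list of marker effects; row k holds the effects of marker k
       on every trait (assumed grouping, one group per marker).
    lam: non-negative penalty weight (hypothetical tuning parameter).
    """
    n, t, m = len(Y), len(Y[0]), len(X[0])
    # LAD part: absolute residuals summed over individuals and traits.
    lad = 0.0
    for i in range(n):
        for j in range(t):
            fitted = sum(X[i][k] * B[k][j] for k in range(m))
            lad += abs(Y[i][j] - fitted)
    # Group LASSO part: Euclidean norm of each marker's effect vector.
    penalty = sum(math.sqrt(sum(b * b for b in B[k])) for k in range(m))
    return lad + lam * penalty
```

Because each residual enters linearly rather than squared, a single extreme phenotype value contributes to the loss in proportion to its size, which is the intuition behind the reduced sensitivity to outliers.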

2.
An ultimate goal of genetic research is to understand the connection between genotype and phenotype in order to improve the diagnosis and treatment of diseases. The quantitative genetics field has developed a suite of statistical methods to associate genetic loci with diseases and phenotypes, including quantitative trait loci (QTL) linkage mapping and genome-wide association studies (GWAS). However, each of these approaches has technical and biological shortcomings. For example, the amount of heritable variation explained by GWAS is often surprisingly small and the resolution of many QTL linkage mapping studies is poor. The predictive power and interpretation of QTL and GWAS results are consequently limited. In this study, we propose a complementary approach to quantitative genetics by interrogating the vast amount of high-throughput genomic data in model organisms to functionally associate genes with phenotypes and diseases. Our algorithm combines the genome-wide functional relationship network for the laboratory mouse and a state-of-the-art machine learning method. We demonstrate the superior accuracy of this algorithm through predicting genes associated with each of 1157 diverse phenotype ontology terms. Comparison between our prediction results and a meta-analysis of quantitative genetic studies reveals both overlapping candidates and distinct, accurate predictions uniquely identified by our approach. Focusing on bone mineral density (BMD), a phenotype related to osteoporotic fracture, we experimentally validated two of our novel predictions (not observed in any previous GWAS/QTL studies) and found significant bone density defects for both Timp2 and Abcg8 deficient mice. Our results suggest that the integration of functional genomics data into networks, which itself is informative of protein function and interactions, can successfully be utilized as a complementary approach to quantitative genetics to predict disease risks.
All supplementary material is available at http://cbfg.jax.org/phenotype.

3.
The study of change in intermediate phenotypes over time is important in genetics. In this paper we explore a new approach to phenotype definition in the genetic analysis of longitudinal phenotypes. We utilized data from the longitudinal Framingham Heart Study Family Cohort to investigate the familial aggregation and evidence for linkage to change in systolic blood pressure (SBP) over time. We used Gibbs sampling to derive sigma-squared-A-random-effects (SSARs) for the longitudinal phenotype, and then used these as a new phenotype in subsequent genome-wide linkage analyses. Additive genetic effects (sigma2A.time) were estimated to account for approximately 9.2% of the variance in the rate of change of SBP with age, while additive genetic effects (sigma2A) were estimated to account for approximately 43.9% of the variance in SBP at the mean age. The linkage results suggested that one or more major loci regulating change in SBP over time may localize to chromosomes 2, 3, 4, 6, 10, 11, 17, and 19. The results also suggested that one or more major loci regulating level of SBP may localize to chromosomes 3, 8, and 14. Our results support a genetic component to both SBP and change in SBP with age, and are consistent with a complex, multifactorial susceptibility to the development of hypertension. The use of SSARs derived from quantitative traits as input to a conventional linkage analysis appears to be valuable in the linkage analysis of genetically complex traits. We have now demonstrated in this paper the use of SSARs in the context of longitudinal family data.

4.
OBJECTIVES: Severe alpha 1-antitrypsin (A1AT) deficiency is the one proven genetic risk factor for chronic obstructive pulmonary disease (COPD). Familial aggregation has been demonstrated for COPD among individuals who do not have A1AT deficiency, but linkage analysis of COPD has not been reported. To investigate the optimal phenotype definitions and analytical methods for the linkage analysis of COPD, we examined a set of 28 A1AT-deficient families containing 155 individuals. We have used the protease inhibitor (PI) type as a genetic marker rather than a disease gene, and we have performed linkage analysis between PI type and serum A1AT level and spirometry-related phenotypes. METHODS: Linkage analysis was performed on the quantitative phenotypes forced expiratory volume at 1 s (FEV(1) as % predicted), the ratio of FEV(1) to forced vital capacity (FEV(1)/FVC as % predicted), and serum A1AT level using the variance component approach in SOLAR, the generalized estimating equation approach in RELPAL, and the model-based classical lod score method in LINKAGE. Linkage analysis with qualitative A1AT and spirometry phenotypes was performed using a model-based method (LINKAGE) and a model-free method (GENEHUNTER). Adjustments for smoking effects were investigated under each method. RESULTS: All of the methods demonstrated linkage of PI type to serum A1AT level. Interestingly, however, the other quantitative phenotypes provided only weak evidence for linkage of PI type to lung disease. Better evidence for linkage of lung disease to PI type was found using a moderate or a mild threshold for the definition of airflow obstruction. CONCLUSIONS: For linkage analysis of spirometry phenotypes in A1AT deficiency, qualitative phenotypes provided stronger evidence for linkage than quantitative phenotypes.
Possible contributors to the stronger evidence for linkage to qualitative spirometry phenotypes include the ascertainment scheme and the nonnormality of the pulmonary function data in PI Z subjects. This study provides guidelines for studies of the genetics of COPD unrelated to A1AT deficiency.

5.
The genome-wide association study (GWAS) approach has discovered hundreds of genetic variants associated with diseases and quantitative traits. However, despite clinical overlap and statistical correlation between many phenotypes, GWAS are generally performed one-phenotype-at-a-time. Here we compare the performance of modelling multiple phenotypes jointly with that of the standard univariate approach. We introduce a new method and software, MultiPhen, that models multiple phenotypes simultaneously in a fast and interpretable way. By performing ordinal regression, MultiPhen tests the linear combination of phenotypes most associated with the genotypes at each SNP, and thus potentially captures effects hidden to single phenotype GWAS. We demonstrate via simulation that this approach provides a dramatic increase in power in many scenarios. There is a boost in power for variants that affect multiple phenotypes and for those that affect only one phenotype. While other multivariate methods have similar power gains, we describe several benefits of MultiPhen over these. In particular, we demonstrate that other multivariate methods that assume the genotypes are normally distributed, such as canonical correlation analysis (CCA) and MANOVA, can have highly inflated type-1 error rates when testing case-control or non-normal continuous phenotypes, while MultiPhen produces no such inflation. To test the performance of MultiPhen on real data we applied it to lipid traits in the Northern Finland Birth Cohort 1966 (NFBC1966). In these data MultiPhen discovers 21% more independent SNPs with known associations than the standard univariate GWAS approach, while applying MultiPhen in addition to the standard approach provides 37% increased discovery. 
The most associated linear combinations of the lipids estimated by MultiPhen at the leading SNPs accurately reflect the Friedewald Formula, suggesting that MultiPhen could be used to refine the definition of existing phenotypes or uncover novel heritable phenotypes.

6.
The discovery of genetic variants that underlie a complex phenotype is challenging. One possible approach to facilitate this endeavor is to identify quantitative trait loci (QTL) that contribute to the phenotype and consequently unravel the candidate genes within these loci. Each proposed candidate locus contains multiple genes and, therefore, further analysis is required to choose plausible candidate genes. One such method is to use comparative genomics in order to narrow down the QTL to a region containing only a few genes. We illustrate this strategy by applying it to genetic findings regarding physical activity (PA) in mice and humans. Here, we show that PA is a complex phenotype with a strong biological basis and complex genetic architecture. Furthermore, we provide considerations for the translatability of this phenotype between species. Finally, we review studies which point to candidate genetic regions for PA in humans (genetic association and linkage studies) or use mouse models of PA (QTL studies) and we identify candidate genetic regions that overlap between species. On the basis of a large variety of studies in mice and humans, statistical analysis reveals that the number of overlapping regions is not higher than expected by chance. We conclude that the discovery of new candidate genes for complex phenotypes, such as PA levels, is hampered by various factors, including genetic background differences, phenotype definition and a wide variety of methodological differences between studies.

7.
High correlations between two quantitative traits may be due to common genetic factors, common environmental factors, or a combination of both. In this study, we develop statistical methods to extract the genetic contribution to the total correlation between the components of a bivariate phenotype. Using data on bivariate phenotypes and marker genotypes for sib-pairs, we propose a test for linkage between a common QTL and a marker locus based on the conditional cross-sib trait correlations (trait 1 of sib 1 with trait 2 of sib 2, and conversely) given the identity-by-descent (i.b.d.) sharing at the marker locus. We use Monte-Carlo simulations to evaluate the performance of the proposed test under different trait parameters and quantitative trait distributions. An application of the method is illustrated using data on two alcohol-related phenotypes from a project on the collaborative study on the genetics of alcoholism.

8.
Genetic architecture fundamentally affects the way that traits evolve. However, the mapping of genotype to phenotype includes complex interactions with the environment or even the sex of an organism that can modulate the expressed phenotype. Line‐cross analysis is a powerful quantitative genetics method to infer genetic architecture by analysing the mean phenotype value of two diverged strains and a series of subsequent crosses and backcrosses. However, it has been difficult to account for complex interactions with the environment or sex within this framework. We have developed extensions to line‐cross analysis that allow for gene by environment and gene by sex interactions. Using extensive simulation studies and reanalysis of empirical data, we show that our approach can account for both unintended environmental variation when crosses cannot be reared in a common garden and can be used to test for the presence of gene by environment or gene by sex interactions. In analyses that fail to account for environmental variation between crosses, we find that line‐cross analysis has low power and high false‐positive rates. However, we illustrate that accounting for environmental variation allows for the inference of adaptive divergence, and that accounting for sex differences in phenotypes allows practitioners to infer the genetic architecture of sexual dimorphism.

9.
Increasingly, behavioral ecologists have applied quantitative genetic methods to investigate the evolution of behaviors in wild animal populations. The promise of quantitative genetics in unmanaged populations opens the door for simultaneous analysis of inheritance, phenotypic plasticity, and patterns of selection on behavioral phenotypes all within the same study. In this article, we describe how quantitative genetic techniques provide studies of the evolution of behavior with information that is unique and valuable. We outline technical obstacles to applying quantitative genetic techniques that are of particular relevance to studies of behavior in primates, especially those living in noncaptive populations (e.g., the need for pedigree information and non-Gaussian phenotypes), and demonstrate how many of these barriers are now surmountable. We illustrate this by applying recent quantitative genetic methods to spatial proximity data, a simple and widely collected primate social behavior, from adult rhesus macaques on Cayo Santiago. Our analysis shows that proximity measures are consistent across repeated measurements on individuals (repeatable) and that kin have similar mean measurements (heritable). Quantitative genetics may hold lessons of considerable importance for studies of primate behavior, even those without a specific genetic focus.

10.
Understanding the genetic architecture of complex traits is a major objective in biology. The standard approach for doing so is genome-wide association studies (GWAS), which aim to identify genetic polymorphisms responsible for variation in traits of interest. In human genetics, consistency across studies is commonly used as an indicator of reliability. However, if traits are involved in adaptation to the local environment, we do not necessarily expect reproducibility. On the contrary, results may depend on where you sample, and sampling across a wide range of environments may decrease the power of GWAS because of increased genetic heterogeneity. In this study, we examine how sampling affects GWAS in the model plant species Arabidopsis thaliana. We show that traits like flowering time are indeed influenced by distinct genetic effects in local populations. Furthermore, using gene expression as a molecular phenotype, we show that some genes are globally affected by shared variants, whereas others are affected by variants specific to subpopulations. Remarkably, the former are essentially all cis-regulated, whereas the latter are predominantly affected by trans-acting variants. Our results illustrate that conclusions about genetic architecture can be extremely sensitive to sampling and population structure.

11.
The National Institute on Drug Abuse Genetics and Epigenetics Cross‐Cutting Research Team convened a diverse group of researchers, clinicians, and healthcare providers on the campus of the University of California, San Diego, in June 2018. The goal was to develop strategies to integrate genetics and phenotypes across species to achieve a better understanding of substance use disorders through associations between genotypes and addictive behaviors. This conference (a) discussed progress in harmonizing large opioid genetics cohorts, (b) discussed phenotypes that are used for genetics studies in humans, (c) examined phenotypes that are used for genetics studies in animal models, (d) identified synergies and gaps in phenotypic analyses of human and animal models and (e) identified strategies to integrate genetics and genomics data with phenotypes across species. The meeting consisted of panels that focused on phenotype harmonization (Dr. Laura Bierut, Dr. Olivier George, Dr. Dan Larach and Dr. Sesh Mudumbai), translating genetic findings between species (Dr. Elissa Chesler, Dr. Gary Peltz and Dr. Abraham Palmer), interpreting and understanding allelic variations (Dr. Vanessa Troiani and Dr. Tamara Richards) and pathway conservation in animal models and human studies (Dr. Robert Hitzemann, Dr. Huda Akil and Dr. Laura Saba). There were also updates that were provided by large consortia (Dr. Susan Tapert, Dr. Danielle Dick, Dr. Howard Edenberg and Dr. Eric Johnson). Collectively, the conference was convened to discuss progress and changes in genome‐wide association studies.

12.
An important task of human genetics studies is to accurately predict disease risks in individuals based on genetic markers, which allows for identifying individuals at high disease risk and facilitating their disease treatment and prevention. Although hundreds of genome-wide association studies (GWAS) have been conducted on many complex human traits in recent years, there has been only limited success in translating these GWAS data into clinically useful risk prediction models. The predictive capability of GWAS data is largely bottlenecked by the available training sample size due to the presence of numerous variants carrying only small to modest effects. Recent studies have shown that different human traits may share common genetic bases. Therefore, an attractive strategy to increase the training sample size and hence improve the prediction accuracy is to integrate data from genetically correlated phenotypes. Yet, the utility of genetic correlation in risk prediction has not been explored in the literature. In this paper, we analyzed GWAS data for bipolar and related disorders and schizophrenia with a bivariate ridge regression method, and found that jointly predicting the two phenotypes could substantially increase prediction accuracy as measured by the area under the receiver operating characteristic curve. We also found similar prediction accuracy improvements when we jointly analyzed GWAS data for Crohn’s disease and ulcerative colitis. The empirical observations were substantiated through our comprehensive simulation studies, suggesting that a gain in prediction accuracy can be obtained by combining phenotypes with relatively high genetic correlations. Through both real data and simulation studies, we demonstrated that pleiotropy can be leveraged as a valuable asset that opens up a new opportunity to improve genetic risk prediction in the future.

13.
Despite the multitude of examples of evolution in action, relatively few studies have taken a replicated approach to understand the repeatability of evolution. Here, we examine the convergent evolution of adaptive coloration in experimental introductions of guppies from a high‐predation (HP) environment into four low‐predation (LP) environments. LP introductions were replicated across 2 years and in two different forest canopy cover types. We take a complementary approach by examining both phenotypes and genetics. For phenotypes, we categorize the whole color pattern on the tail fin of male guppies and analyze evolution using a correspondence analysis. We find that coloration in the introduction sites diverged from the founding Guanapo HP site. Sites group together based on canopy cover, indicating convergence in response to light environment. However, the axis that explains the most variation indicates a lack of convergence. Therefore, evolution may proceed along similar phenotypic trajectories, but still maintain unique variation within sites. For the genetics underlying the divergent phenotypes, we examine expression levels of color genes. We find no evidence for differential expression, indicating that the genetic basis for the color changes remains undetermined.

14.
Cui Y  Kim DY  Zhu J 《Genetics》2006,174(4):2159-2172
Statistical methods for mapping quantitative trait loci (QTL) have been extensively studied. While most existing methods assume normal distribution of the phenotype, the normality assumption could be easily violated when phenotypes are measured in counts. One natural choice to deal with count traits is to apply the classical Poisson regression model. However, conditional on covariates, the Poisson assumption of mean-variance equality may not be valid when data are potentially under- or overdispersed. In this article, we propose an interval-mapping approach for phenotypes measured in counts. We model the effects of QTL through a generalized Poisson regression model and develop efficient likelihood-based inference procedures. This approach, implemented with the EM algorithm, allows for a genomewide scan for the existence of QTL throughout the entire genome. The performance of the proposed method is evaluated through extensive simulation studies along with comparisons with existing approaches such as the Poisson regression and the generalized estimating equation approach. An application to a rice tiller number data set is given. Our approach provides a standard procedure for mapping QTL involved in the genetic control of complex traits measured in counts.
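
To make the mean-variance point concrete, the sketch below evaluates a generalized Poisson probability mass function and its first two moments numerically. It assumes Consul's parameterization with rate `theta` and dispersion `lam`, which may differ from the parameterization used in the paper, and it covers only the overdispersed range 0 <= lam < 1 (the underdispersed case needs a truncated support and is left out). The function names are hypothetical.

```python
import math


def generalized_poisson_logpmf(y, theta, lam):
    """Log-probability of count y under Consul's generalized Poisson.

    P(Y = y) = theta * (theta + lam*y)**(y - 1) * exp(-theta - lam*y) / y!
    Computed in log space so that large counts do not overflow.
    """
    if y < 0 or theta <= 0 or not (0 <= lam < 1):
        raise ValueError("need y >= 0, theta > 0 and 0 <= lam < 1")
    return (math.log(theta) + (y - 1) * math.log(theta + lam * y)
            - theta - lam * y - math.lgamma(y + 1))


def gp_moments(theta, lam, y_max=500):
    """Return (total probability, mean, variance) summed over 0..y_max."""
    probs = [math.exp(generalized_poisson_logpmf(y, theta, lam))
             for y in range(y_max + 1)]
    total = sum(probs)
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return total, mean, var
```

With `lam = 0` the distribution reduces to the ordinary Poisson and the mean equals the variance. With `lam > 0` the mean is `theta / (1 - lam)` and the variance is `theta / (1 - lam)**3`, so the variance exceeds the mean.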

15.
Multivariate phenotypes may be characterized collectively by a variety of low-level traits, such as in the diagnosis of a disease that relies on multiple disease indicators. Such multivariate phenotypes are often used in genetic association studies. If highly heritable components of a multivariate phenotype can be identified, it can maximize the likelihood of finding genetic associations. Existing methods for phenotype refinement perform unsupervised cluster analysis on low-level traits and hence do not assess heritability. Existing heritable component analytics either cannot utilize general pedigrees or have to estimate the entire covariance matrix of low-level traits from limited samples, which leads to inaccurate estimates and is often computationally prohibitive. It is also difficult for these methods to exclude fixed effects from other covariates such as age, sex and race, in order to identify truly heritable components. We propose to search for a combination of low-level traits and directly maximize the heritability of this combined trait. A quadratic optimization problem is thus derived where the objective function is formulated by decomposing the traditional maximum likelihood method for estimating the heritability of a quantitative trait. The proposed approach can generate linearly-combined traits of high heritability that have been corrected for the fixed effects of covariates. The effectiveness of the proposed approach is demonstrated in simulations and by a case study of cocaine dependence. Our approach was computationally efficient and derived traits of higher heritability than those by other methods. Additional association analysis with the derived cocaine-use trait identified genetic markers that were replicated in an independent sample, further confirming the utility and advantage of the proposed approach.

16.
Evolvability is a function of the way genetic variation interacts with the mechanisms that produce the phenotype. We explore an explicitly mechanistic way of studying the evolvability of phenotypes that are produced by a relatively simple genetic mechanism, the mitogen-activated protein kinase (MAPK) cascade. We developed a quantitative model of MAPK activation that can be used to study the effects of genetic variation on the various components of this signaling cascade. We show how some standard tools of applied mathematics, such as steady-state formulations and nondimensionalization, can be used to elucidate the relative importance of variation in each gene of this mechanism. We also give insights into non-intuitive patterns of dependence and trade-off among the genes. The mechanism produces several different phenotypes (ultrasensitivity to stimulation, switch-like behavior, amount of MAPK-PP delivered, persistence of MAPK-PP activity), each of which is sensitive to different (but partially overlapping) combinations of genes. We show that the mechanism imposes clear limitations on the evolvability of each of the different phenotypes of the pathway, even in the presence of genetic variation in the components of the mechanism. This approach to the study of evolvability is generally applicable and complements the traditional approach through statistical genetics by providing a mechanistic understanding of the genetic interactions that produce the phenotype.

17.
In many cases, the unprecedented availability of data provided by high-throughput sequencing has shifted the bottleneck from data availability to data interpretation, thus delaying the promised breakthroughs in genetics and precision medicine in human genetics, and in phenotype prediction to improve plant adaptation to climate change and resistance to bioaggressors in plant sciences. In this paper, we propose a novel Genome Interpretation paradigm, which aims at directly modeling the genotype-to-phenotype relationship, and we focus on A. thaliana since it is the best studied model organism in plant genetics. Our model, called Galiana, is the first end-to-end Neural Network (NN) approach following the genomes in/phenotypes out paradigm and it is trained to predict 288 real-valued Arabidopsis thaliana phenotypes from Whole Genome sequencing data. We show that 75 of these phenotypes are predicted with a Pearson correlation ≥0.4, and are mostly related to flowering traits. We show that our end-to-end NN approach achieves better performance and larger phenotype coverage than models predicting single phenotypes from the GWAS-derived known associated genes. Galiana is also fully interpretable, thanks to gradient-based Saliency Map approaches. We followed this interpretation approach to identify 36 novel genes that are likely to be associated with flowering traits, finding evidence for 6 of them in the existing literature.

18.
19.
Strug L  Sun L  Corey M 《BMC genetics》2003,4(Z1):S14
There has been a lack of consistency in detecting chromosomal loci that are linked to obesity-related traits. This may be due, in part, to the phenotype definition. Many studies use a one-time, single measurement as a phenotype while one's weight often fluctuates considerably throughout adulthood. Longitudinal data from the Framingham Heart Study were used to derive alternative phenotypes that may lead to more consistent findings. Body mass index (BMI), a measurement for obesity, is known to increase with age and then plateau or decline slightly; the decline phase may represent a threshold or survivor effect. We propose to use the weight gain phase of BMI to derive phenotypes useful for linkage analysis of obesity. Two phenotypes considered in the present study are the average of and the slope of the BMI measurements in the gain phase (gain mean and gain slope). For comparison, we also considered the average of all BMI measurements available (overall mean). Linkage analysis using the gain mean phenotype exhibited two markers with LOD scores greater than 3, with the largest score of 3.52 on chromosome 4 at ATA2A03. In contrast, no LOD scores greater than 3 were observed when overall mean was used. The gain slope produced weak evidence for linkage on chromosome 4 with a multipoint LOD score of 1.77 at GATA8A05. Our analysis shows how omitting the decline phase of BMI in the definition of obesity phenotypes can result in evidence for linkage which might have been otherwise overlooked.
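
The three phenotypes named in this abstract (gain mean, gain slope, overall mean) can be sketched for one individual's longitudinal record as follows. The abstract does not say how the gain phase is delimited, so the sketch assumes it runs from the first measurement up to and including the first maximum BMI, and it assumes the slope is an ordinary least-squares slope of BMI on age. Both choices, and all function names, are illustrative assumptions.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def gain_phase(ages, bmis):
    """Measurements up to and including the first peak BMI (assumed rule)."""
    peak = max(range(len(bmis)), key=lambda i: bmis[i])
    return ages[:peak + 1], bmis[:peak + 1]


def ols_slope(x, y):
    """Least-squares slope of y on x; needs at least two distinct x values."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("slope is undefined for fewer than two distinct ages")
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx


def bmi_phenotypes(ages, bmis):
    """Return (gain mean, gain slope, overall mean) for one individual."""
    gain_ages, gain_bmis = gain_phase(ages, bmis)
    return mean(gain_bmis), ols_slope(gain_ages, gain_bmis), mean(bmis)
```

For a record that rises and then declines, the overall mean mixes both phases, while the gain mean and gain slope use only the rising part.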

20.
Wang J  Shete S 《PloS one》2011,6(11):e27642
In case-control genetic association studies, cases are subjects with the disease and controls are subjects without the disease. At the time of case-control data collection, information about secondary phenotypes is also collected. In addition to studies of primary diseases, there has been some interest in studying genetic variants associated with secondary phenotypes. In genetic association studies, the deviation from Hardy-Weinberg proportion (HWP) of each genetic marker is assessed as an initial quality check to identify questionable genotypes. Generally, HWP tests are performed based on the controls for the primary disease or secondary phenotype. However, when the disease or phenotype of interest is common, the controls do not represent the general population. Therefore, using only controls for testing HWP can result in a highly inflated type I error rate for the disease- and/or phenotype-associated variants. Recently, two approaches, the likelihood ratio test (LRT) approach and the mixture HWP (mHWP) exact test were proposed for testing HWP in samples from case-control studies. Here, we show that these two approaches result in inflated type I error rates and could lead to the removal from further analysis of potential causal genetic variants associated with the primary disease and/or secondary phenotype when the study of primary disease is frequency-matched on the secondary phenotype. Therefore, we proposed alternative approaches, which extend the LRT and mHWP approaches, for assessing HWP that account for frequency matching. The goal was to maintain more (possible causative) single-nucleotide polymorphisms in the sample for further analysis. Our simulation results showed that both extended approaches could control type I error probabilities. 
We also applied the proposed approaches to test HWP for SNPs from a genome-wide association study of lung cancer that was frequency-matched on smoking status and found that the proposed approaches can keep more genetic variants for association studies.
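
For orientation, the quality check this abstract starts from is often run in its simplest form as a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts. The sketch below shows that textbook test only. It is not the LRT or mHWP exact test, and it does not include the frequency-matching extensions the authors propose. The function name and the biallelic genotype labels are assumptions.

```python
def hwp_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for deviation from Hardy-Weinberg proportions.

    n_aa, n_ab, n_bb: observed counts of the three genotypes at a
    biallelic marker. Compare the result with a chi-square distribution
    on 1 degree of freedom (3.841 is the 5% critical value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes observed")
    # Allele frequency of A estimated from the genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("marker is monomorphic; the test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))
```

Counts of 25, 50 and 25 match the expected proportions exactly and give a statistic of 0. Counts of 30, 40 and 30 show a heterozygote deficit and give a statistic of 4.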
