J Jiang  Q Zhang  L Ma  J Li  Z Wang  J-F Liu 《Heredity》2015,115(1):29-36
Predicting organismal phenotypes from genotype data is important for preventive and personalized medicine as well as plant and animal breeding. Although genome-wide association studies (GWAS) for complex traits have discovered a large number of trait- and disease-associated variants, phenotype prediction based on associated variants usually has low accuracy even for a high-heritability trait, because these variants typically account for only a limited fraction of the total genetic variance. In comparison with GWAS, whole-genome prediction (WGP) methods can increase prediction accuracy by making use of a huge number of variants simultaneously. Among the various statistical methods for WGP, the multiple-trait model and the antedependence model each have their own advantages. To take advantage of both strategies within a unified framework, we proposed a novel multivariate antedependence-based method for joint prediction of multiple quantitative traits using a Bayesian algorithm that models a linear relationship between the effect vectors of each pair of adjacent markers. Through both simulation and real-data analyses, our studies demonstrated that the proposed antedependence-based multiple-trait WGP method is more accurate and robust than its traditional counterparts (Bayes A and multi-trait Bayes A) under various scenarios. Our method can be readily extended to deal with missing phenotypes and resequencing data with rare variants, offering a feasible way to jointly predict phenotypes for multiple complex traits in human genetic epidemiology as well as plant and livestock breeding.

Quantitative traits measured in human families can be analyzed to partition the total population variance into genetic and environmental components, or to elucidate the genetic mechanism involved. We review the estimation of variance components directly from human pedigree data, or in the form of path coefficients from correlations between pairs of relatives. To elucidate genetic mechanisms, a mixed model that allows for segregation at a major locus, a polygenic effect and a sibling environmental correlation is described for nuclear families. In each case appropriate likelihoods are derived as a basis, using numerical maximum likelihood methods, for parameter estimation and hypothesis testing. A general model is then described that allows for several familial sources of environmental variation, assortative mating, and both major gene and polygenic effects; and an algorithm for calculating the likelihood of a pedigree under this model is indicated. Finally, some of the remaining problems in this area of biometric analysis are pointed out.

In the analysis of longitudinal data, before assuming a parametric model, an idea of the shape of the variance and correlation functions for both the genetic and environmental parts should be known. When a small number of observations is available for each subject at a fixed set of times, it is possible to estimate unstructured covariance matrices, but not when the number of observations over time is large and when individuals are not measured at all times. The non-parametric approach, based on the variogram, presented by Diggle & Verbyla (1998), is specially adapted for exploratory analysis of such data. This paper presents a generalization of their approach to genetic analyses. The methodology is applied to daily records for milk production in dairy cattle and data on age-specific fertility in Drosophila.

MOTIVATION: In most quantitative trait locus (QTL) mapping studies, phenotypes are assumed to follow normal distributions. Deviations from this assumption may affect the accuracy of QTL detection and lead to detection of spurious QTLs. To improve the robustness of QTL mapping methods, we replaced the normal distribution for residuals in multiple interacting QTL models with the normal/independent distributions, a class of symmetric and long-tailed distributions that can accommodate residual outliers. Subsequently, we developed a Bayesian robust analysis strategy for dissecting the genetic architecture of quantitative traits and for mapping genome-wide interacting QTLs in line crosses. RESULTS: Through computer simulations, we showed that our strategy had a similar power for QTL detection compared with traditional methods assuming normally distributed traits, but had a substantially increased power for non-normal phenotypes. When this strategy was applied to a group of traits associated with physical/chemical characteristics and quality in rice, more main and epistatic QTLs were detected than with traditional Bayesian model analyses under the normal assumption.

We introduce a method for the analysis of multilocus, multitrait genetic data that provides an intuitive and precise characterization of genetic architecture. We show that it is possible to infer the magnitude and direction of causal relationships among multiple correlated phenotypes and illustrate the technique using body composition and bone density data from mouse intercross populations. Using these techniques we are able to distinguish genetic loci that affect adiposity from those that affect overall body size and thus reveal a shortcoming of standardized measures such as body mass index that are widely used in obesity research. The identification of causal networks sheds light on the nature of genetic heterogeneity and pleiotropy in complex genetic systems.

Tao Wang 《BMC genetics》2011,12(1):1-21


In genetic association studies of quantitative traits using F∞ models, how to code the marker genotypes and interpret the model parameters appropriately is important for constructing hypothesis tests and making statistical inferences. Currently, the coding of marker genotypes in building F∞ models has mainly focused on the biallelic case. A thorough treatment of the coding of marker genotypes and the interpretation of model parameters for F∞ models is needed, especially for genetic markers with multiple alleles.

In this study, we formulate F∞ genetic models under various regression model frameworks and introduce three genotype coding schemes for genetic markers with multiple alleles. Starting from an allele-based modeling strategy, we first describe a regression framework to model the expected genotypic values at given markers. Then, as an extension of the biallelic case, we introduce three coding schemes for constructing fully parameterized one-locus F∞ models and discuss the relationships between the model parameters and the expected genotypic values. Next, under a simplified modeling framework for the expected genotypic values, we consider several reduced one-locus F∞ models from the three coding schemes and examine the estimability and interpretation of their model parameters. Finally, we explore some extensions of the one-locus F∞ models to two loci. Several fully parameterized as well as reduced two-locus F∞ models are addressed.

The genotype coding schemes provide different ways to construct F∞ models for association testing of multi-allele genetic markers with quantitative traits. Which coding scheme to apply depends on how conveniently it supports statistical inference on the parameters of research interest. Based on these F∞ models, standard regression model fitting tools can be used to estimate and test for various genetic effects through statistical contrasts, with adjustment for environmental factors.

Missing outcomes or irregularly timed multivariate longitudinal data frequently occur in clinical trials or biomedical studies. The multivariate t linear mixed model (MtLMM) has been shown to be a robust approach to modeling multioutcome continuous repeated measures in the presence of outliers or heavy‐tailed noise. This paper presents a framework for fitting the MtLMM with an arbitrary missing data pattern embodied within multiple outcome variables recorded at irregular occasions. To address the serial correlation among the within‐subject errors, a damped exponential correlation structure is considered in the model. Under the missing at random mechanism, an efficient alternating expectation‐conditional maximization (AECM) algorithm is used to carry out estimation of parameters and imputation of missing values. The techniques for the estimation of random effects and the prediction of future responses are also investigated. Applications to an HIV‐AIDS study and a pregnancy study involving analysis of multivariate longitudinal data with missing outcomes, as well as a simulation study, highlight the superiority of MtLMMs in providing more adequate estimation, imputation and prediction performance.

Nonlinear mixed effects models for repeated measures data   (total citations: 51; self-citations: 1; citations by others: 50)
We propose a general, nonlinear mixed effects model for repeated measures data and define estimators for its parameters. The proposed estimators are a natural combination of least squares estimators for nonlinear fixed effects models and maximum likelihood (or restricted maximum likelihood) estimators for linear mixed effects models. We implement Newton-Raphson estimation using previously developed computational methods for nonlinear fixed effects models and for linear mixed effects models. Two examples are presented and the connections between this work and recent work on generalized linear mixed effects models are discussed.

Genetic models for quantitative seed traits with effects of several major genes and polygenes, as well as their GE interaction, were proposed. Mixed linear model approaches were suggested for analyzing the genetic models. Monte Carlo simulations were conducted to evaluate unbiasedness and efficiency for estimating fixed effects and variance components of the embryo and the endosperm models, including effects of a major gene from an unbalanced modified diallel mating design with nine parents, respectively. Simulation results showed that estimates of generalized least squares (GLS) were unbiased and efficient, while those of ordinary least squares (OLS) were almost as good as GLS. Minimum norm quadratic unbiased estimation (MINQUE) could obtain unbiased estimates of the variance components. It was also suggested that precision of MINQUE estimation would be improved with augmentation of experimental size. Data from a modified diallel design in upland cotton (Gossypium hirsutum L.) were used as a worked example to illustrate the parameter estimation.

Bayesian quantitative trait loci mapping for multiple traits   (total citations: 1; self-citations: 0; citations by others: 1)
Banerjee S  Yandell BS  Yi N 《Genetics》2008,179(4):2275-2289
Most quantitative trait loci (QTL) mapping experiments typically collect phenotypic data on multiple correlated complex traits. However, there is a lack of a comprehensive genomewide mapping strategy for correlated traits in the literature. We develop Bayesian multiple-QTL mapping methods for correlated continuous traits using two multivariate models: one that assumes the same genetic model for all traits, the traditional multivariate model, and the other known as the seemingly unrelated regression (SUR) model that allows different genetic models for different traits. We develop computationally efficient Markov chain Monte Carlo (MCMC) algorithms for performing joint analysis. We conduct extensive simulation studies to assess the performance of the proposed methods and to compare with the conventional single-trait model. Our methods have been implemented in the freely available package R/qtlbim (http://www.qtlbim.org), which greatly facilitates the general usage of the Bayesian methodology for unraveling the genetic architecture of complex traits.



In silico models have recently been created in order to predict which genetic variants are more likely to contribute to the risk of a complex trait given their functional characteristics. However, there has been no comprehensive review as to which type of predictive accuracy measures and data visualization techniques are most useful for assessing these models.


We assessed the performance of the models for predicting risk using various methodologies, some of which include: receiver operating characteristic (ROC) curves, histograms of classification probability, and the novel use of the quantile-quantile plot. These measures have variable interpretability depending on factors such as whether the dataset is balanced in terms of numbers of genetic variants classified as risk variants versus those that are not.


We conclude that the area under the curve (AUC) is a suitable starting place, and for models with similar AUCs, violin plots are particularly useful for examining the distribution of the risk scores.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1616-z) contains supplementary material, which is available to authorized users.
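
As an illustration of the AUC measure discussed in the abstract above, the following minimal Python sketch computes the area under the ROC curve through its rank interpretation: the probability that a randomly chosen risk variant scores higher than a randomly chosen non-risk variant, with ties counted as one half. The function name `auc` and the toy scores are assumptions made for illustration and are not taken from the article.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted risk scores (higher means more likely a risk variant).
    labels: 1 for risk variants, 0 for non-risk variants.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs ranked correctly; a tie counts as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: four variants, two of them labelled as risk variants.
example_scores = [0.9, 0.8, 0.3, 0.1]
example_labels = [1, 0, 1, 0]
example_auc = auc(example_scores, example_labels)
```

Because this value depends only on the ranking of scores, two models with the same AUC can still produce very different score distributions, which is consistent with the abstract's suggestion to inspect those distributions separately.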

The testing of Bayesian point null hypotheses on variance component models has proved a difficult task for which no clear and generally accepted method exists. In this work we present what we believe is a successful approach to such a task. It is based on a simple reparameterization of the model in terms of the total variance and the proportion of the additive genetic variance with respect to it, as well as on the explicit inclusion in the prior probability of a discrete component at the origin. The reparameterization was used to bypass an arbitrariness related to the impropriety of uninformative priors on unbounded variables, while the discrete component was necessary to overcome the zero probability assigned to sets of null measure by the usual continuous variable models. The method was tested against computer simulations with appealing results.
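
One way to write down the reparameterization and the mixed prior described in this abstract is sketched below. The symbols are assumptions chosen for illustration, since the abstract does not give notation: $\sigma_a^2$ is the additive genetic variance, $\sigma_e^2$ a residual variance, $p_0$ the prior mass placed at the origin, $\delta_0$ a point mass at zero, and $f$ a continuous density on $(0, 1]$.

```latex
\sigma_T^2 = \sigma_a^2 + \sigma_e^2,
\qquad
h^2 = \frac{\sigma_a^2}{\sigma_T^2} \in [0, 1],
\qquad
\pi(h^2) = p_0\,\delta_0(h^2) + (1 - p_0)\,f(h^2).
```

Under this sketch the point null hypothesis is $H_0\colon h^2 = 0$, which receives prior probability $p_0 > 0$. Because $h^2$ is bounded, a flat choice of $f$ remains a proper density, which is the kind of arbitrariness the abstract says the reparameterization avoids.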

Rank-based regression for analysis of repeated measures   (total citations: 1; self-citations: 0; citations by others: 1)
Wang  You-Gan; Zhu  Min 《Biometrika》2006,93(2):459-464

Estimating the genetic architecture of quantitative traits   (total citations: 20; self-citations: 0; citations by others: 20)
Understanding and estimating the structure and parameters associated with the genetic architecture of quantitative traits is a major research focus in quantitative genetics. With the availability of a well-saturated genetic map of molecular markers, it is possible to identify a major part of the structure of the genetic architecture of quantitative traits and to estimate the associated parameters. Multiple interval mapping, which was recently proposed for simultaneously mapping multiple quantitative trait loci (QTL), is well suited to the identification and estimation of the genetic architecture parameters, including the number, genomic positions, effects and interactions of significant QTL and their contribution to the genetic variance. With multiple traits and multiple environments involved in a QTL mapping experiment, pleiotropic effects and QTL by environment interactions can also be estimated. We review the method and discuss issues associated with multiple interval mapping, such as likelihood analysis, model selection, stopping rules and parameter estimation. The potential power and advantages of the method for mapping multiple QTL and estimating the genetic architecture are discussed. We also point out potential problems and difficulties in resolving the details of the genetic architecture as well as other areas that require further investigation. One application of the analysis is to improve genome-wide marker-assisted selection, particularly when the information about epistasis is used for selection with mating.

Z Li  J Möttönen  M J Sillanpää 《Heredity》2015,115(6):556-564
Linear regression-based quantitative trait loci/association mapping methods such as least squares commonly assume normality of residuals. In genetic studies of plants or animals, some quantitative traits may not follow a normal distribution because the data include outlying observations or are collected from multiple sources, and in such cases normal regression methods may lose some statistical power to detect quantitative trait loci. In this work, we propose a robust multiple-locus regression approach for analyzing multiple quantitative traits without a normality assumption. In our method, the objective function is least absolute deviation (LAD), which corresponds to the assumption of multivariate Laplace distributed residual errors. This distribution has heavier tails than the normal distribution. In addition, we adopt a group LASSO penalty to produce shrinkage estimation of the marker effects and to describe the genetic correlation among phenotypes. Our LAD-LASSO approach is less sensitive to outliers and is more appropriate for the analysis of data with skewed phenotype distributions. Another application of our robust approach is to the missing phenotype problem in multiple-trait analysis, where the missing phenotype items can simply be filled with some extreme values and be treated as outliers. The efficiency of the LAD-LASSO approach is illustrated on both simulated and real data sets.
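
The robustness argument in this abstract can be seen in the simplest possible case, a model with only a location parameter. There the least-squares estimate is the sample mean and the LAD estimate is the sample median. The sketch below is an illustration under that simplifying assumption and is not the authors' multivariate LAD-LASSO method; the function names and toy data are hypothetical.

```python
from statistics import mean, median


def ls_location(y):
    """Least-squares location estimate: the m that minimises sum((y_i - m)**2)."""
    return mean(y)


def lad_location(y):
    """LAD location estimate: an m that minimises sum(|y_i - m|)."""
    return median(y)


def abs_loss(y, m):
    """Sum of absolute deviations of the observations from m."""
    return sum(abs(v - m) for v in y)


# Hypothetical phenotype values, without and with one outlying observation.
clean = [1.0, 2.0, 3.0, 4.0, 5.0]
contaminated = [1.0, 2.0, 3.0, 4.0, 100.0]

ls_clean, lad_clean = ls_location(clean), lad_location(clean)
ls_cont, lad_cont = ls_location(contaminated), lad_location(contaminated)
```

On the clean data both estimates agree, while a single outlier pulls the least-squares estimate far from the bulk of the data and leaves the LAD estimate unchanged.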

Empirical studies of quantitative genetic variation have revealed robust patterns that are observed both across traits and across species. However, these patterns have no compelling explanation, and some of the observations even appear to be mutually incompatible. We review and extend a major class of theoretical models, 'mutation-selection models', that have been proposed to explain quantitative genetic variation. We also briefly review an alternative class of 'balancing selection models'. We consider to what extent the models are compatible with the general observations, and argue that a key issue is understanding and modelling pleiotropy. We discuss some of the thorny issues that arise when formulating models that describe many traits simultaneously.

Key life history traits such as breeding time and clutch size are frequently both heritable and under directional selection, yet many studies fail to document microevolutionary responses. One general explanation is that selection estimates are biased by the omission of correlated traits that have causal effects on fitness, but few valid tests of this exist. Here, we show, using a quantitative genetic framework and six decades of life‐history data on two free‐living populations of great tits Parus major, that selection estimates for egg‐laying date and clutch size are relatively unbiased. Predicted responses to selection based on the Robertson–Price Identity were similar to those based on the multivariate breeder's equation (MVBE), indicating that unmeasured covarying traits were not missing from the analysis. Changing patterns of phenotypic selection on these traits (for laying date, linked to climate change) therefore reflect changing selection on breeding values, and genetic constraints appear not to limit their independent evolution. Quantitative genetic analysis of correlational data from pedigreed populations can be a valuable complement to experimental approaches to help identify whether apparent associations between traits and fitness are biased by missing traits, and to parse the roles of direct versus indirect selection across a range of environments.
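
The two predictions compared in this abstract can be sketched numerically. In their standard textbook forms, the multivariate breeder's equation predicts the response as the product of the additive genetic covariance matrix G and the vector of selection gradients β, while the Robertson–Price identity predicts it as the covariance between breeding values and relative fitness. The Python sketch below uses those standard forms with hypothetical numbers; the function names, the matrix G, the gradients and the toy breeding values are all assumptions and are not taken from the study.

```python
def mvbe_response(G, beta):
    """Multivariate breeder's equation: predicted response = G @ beta."""
    return [sum(g * b for g, b in zip(row, beta)) for row in G]


def robertson_price_response(breeding_values, relative_fitness):
    """Robertson-Price identity: covariance of breeding value and relative fitness.

    Uses the population covariance (divides by n); relative fitness is
    expected to have mean 1.
    """
    n = len(breeding_values)
    mean_a = sum(breeding_values) / n
    mean_w = sum(relative_fitness) / n
    return sum(
        (a - mean_a) * (w - mean_w)
        for a, w in zip(breeding_values, relative_fitness)
    ) / n


# Hypothetical two-trait example (for instance laying date and clutch size).
G = [[1.0, 0.5],
     [0.5, 2.0]]
beta = [0.2, -0.1]
predicted = mvbe_response(G, beta)

# Hypothetical breeding values for one trait and relative fitness (mean 1).
a = [-1.0, 0.0, 1.0, 2.0]
w = [0.5, 1.0, 1.0, 1.5]
rp = robertson_price_response(a, w)
```

When the two predictions agree for a trait, as the abstract reports, this suggests that the selection gradients are not distorted by unmeasured traits that are correlated with the measured ones.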

Major locus analysis for quantitative traits.   (total citations: 6; self-citations: 6; citations by others: 0)

In the classic view introduced by R. A. Fisher, a quantitative trait is encoded by many loci with small, additive effects. Recent advances in quantitative trait loci mapping have begun to elucidate the genetic architectures underlying vast numbers of phenotypes across diverse taxa, producing observations that sometimes contrast with Fisher's blueprint. Despite these considerable empirical efforts to map the genetic determinants of traits, it remains poorly understood how the genetic architecture of a trait should evolve, or how it depends on the selection pressures on the trait. Here, we develop a simple, population-genetic model for the evolution of genetic architectures. Our model predicts that traits under moderate selection should be encoded by many loci with highly variable effects, whereas traits under either weak or strong selection should be encoded by relatively few loci. We compare these theoretical predictions with qualitative trends in the genetics of human traits, and with systematic data on the genetics of gene expression levels in yeast. Our analysis provides an evolutionary explanation for broad empirical patterns in the genetic basis for traits, and it introduces a single framework that unifies the diversity of observed genetic architectures, ranging from Mendelian to Fisherian.
