Similar Documents
20 similar documents found.
1.
We present a new method of quantitative-trait linkage analysis that combines the simplicity and robustness of regression-based methods with the generality and greater power of variance-components models. The new method is based on a regression of estimated identity-by-descent (IBD) sharing between relative pairs on the squared sums and squared differences of the relative pairs' trait values. The method is applicable to pedigrees of arbitrary structure and to pedigrees selected on the basis of trait value, provided that the population parameters of the trait distribution can be correctly specified. Ambiguous IBD sharing (due to incomplete marker information) can be accommodated by appropriate specification of the variance-covariance matrix of IBD sharing between relative pairs. We have implemented this regression-based method and performed simulation studies to assess, under a range of conditions, estimation accuracy, type I error rate, and power. For normally distributed traits and in large samples, the method gives the correct type I error rate and an unbiased estimate of the proportion of trait variance accounted for by the additive effects of the locus, although in cases where asymptotic theory is doubtful, significance levels should be checked by simulation. In large sibships, the new method is slightly more powerful than variance-components models. The proposed method provides a practical and powerful tool for the linkage analysis of quantitative traits.
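A minimal numerical sketch of the core idea, in Python for illustration (the simulated IBD states, sample size, and sib correlation are arbitrary assumptions of this example, not the authors' setup): estimated IBD sharing for sib pairs is regressed on the squared sums and squared differences of their trait values.

```python
# Sketch of the "reversed" regression: IBD sharing regressed on trait statistics.
import numpy as np

rng = np.random.default_rng(1)
n = 2000                                                     # number of sib pairs
pi_hat = rng.choice([0.0, 0.5, 1.0], size=n, p=[0.25, 0.5, 0.25])  # IBD at a null locus

r = 0.3                                                      # overall sib correlation
cov = np.array([[1.0, r], [r, 1.0]])
traits = rng.multivariate_normal([0.0, 0.0], cov, size=n)    # no linked QTL here

s2 = (traits[:, 0] + traits[:, 1]) ** 2                      # squared sums
d2 = (traits[:, 0] - traits[:, 1]) ** 2                      # squared differences

# Regress pi_hat on both squared statistics (with intercept); under the
# null, the S^2 and D^2 coefficients should be near zero.
X = np.column_stack([np.ones(n), s2, d2])
beta, *_ = np.linalg.lstsq(X, pi_hat, rcond=None)
print("coefficients (intercept, S^2, D^2):", beta.round(4))
```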

2.
Li M, Boehnke M, Abecasis GR, Song PX. Genetics 2006;173(4):2317-2327.
Mapping and identifying variants that influence quantitative traits is an important problem for genetic studies. Traditional QTL mapping relies on a variance-components (VC) approach with the key assumption that the trait values in a family follow a multivariate normal distribution. Violation of this assumption can lead to inflated type I error, reduced power, and biased parameter estimates. To accommodate nonnormally distributed data, we developed and implemented a modified VC method, which we call the "copula VC method," that directly models the nonnormal distribution using Gaussian copulas. The copula VC method allows the analysis of continuous, discrete, and censored trait data, and the standard VC method is a special case when the data are distributed as multivariate normal. Through the use of link functions, the copula VC method can easily incorporate covariates. We use computer simulations to show that the proposed method yields unbiased parameter estimates, correct type I error rates, and improved power for testing linkage with a variety of nonnormal traits as compared with the standard VC and the regression-based methods.
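The central device of the copula approach, mapping each non-normal margin to normal scores before modeling dependence, can be sketched as follows; this toy uses an empirical marginal CDF and is not the authors' implementation.

```python
# Gaussian-copula idea: probit-transform each margin, model dependence on
# the transformed (normal-score) scale.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
y = rng.exponential(scale=2.0, size=500)     # a skewed trait

ranks = stats.rankdata(y)
u = ranks / (len(y) + 1)                     # empirical CDF in (0, 1)
z = stats.norm.ppf(u)                        # normal scores

print("skewness before:", round(stats.skew(y), 2),
      " after:", round(stats.skew(z), 2))
```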

3.
The Haseman-Elston regression method offers a simpler alternative to variance-components (VC) models for the linkage analysis of quantitative traits. However, even the "revisited" method, which uses the cross-product of sib trait values rather than their squared difference, is in general less powerful than VC models. In this report, we clarify the relative efficiencies of existing Haseman-Elston methods and show how a new Haseman-Elston method can be constructed with power equivalent to that of VC models. This method uses as the dependent variable a linear combination of squared sums and squared differences, in which the weights are determined by the overall trait correlation between sibs in the population. We show how this method can be used both for the selection of maximally informative sib pairs for genotyping and for the subsequent analysis of such selected samples.
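As a hedged illustration of the weighted dependent variable: the sketch below weights the squared sum by 1/(1+r)^2 and the squared difference by 1/(1-r)^2, the form usually cited for this family of statistics; the paper's exact scaling and centering may differ.

```python
# Weighted Haseman-Elston dependent variable for sib pairs.
import numpy as np

def he_dependent_variable(t1, t2, r):
    """Linear combination of squared sums and squared differences,
    weighted by the overall sib correlation r (assumed form)."""
    s2 = (t1 + t2) ** 2
    d2 = (t1 - t2) ** 2
    return s2 / (1.0 + r) ** 2 - d2 / (1.0 - r) ** 2

rng = np.random.default_rng(3)
r = 0.25
traits = rng.multivariate_normal([0, 0], [[1, r], [r, 1]], size=1000)
y = he_dependent_variable(traits[:, 0], traits[:, 1], r)
print("mean of weighted HE statistic (null sample):", round(y.mean(), 3))
```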

4.
Mao Y, Xu S. Genetical Research 2004;83(3):159-168.
Many quantitative traits are measured as percentages. As a result, the assumption of a normal distribution for the residual errors of such percentage data is often violated. However, most quantitative trait locus (QTL) mapping procedures assume normality of the residuals. Therefore, proper data transformation is often recommended before statistical analysis is conducted. We propose the probit transformation to convert percentage data into variables with a normal distribution. The advantage of the probit transformation is that it can handle measurement errors with heterogeneous variance and correlation structure in a statistically sound manner. We compared the results of this data transformation with other transformations and found that this method can substantially increase the statistical power of QTL detection. We develop the QTL mapping procedure based on the maximum likelihood methodology implemented via the expectation-maximization algorithm. The efficacy of the new method is demonstrated using Monte Carlo simulation.
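The transformation itself is one line; the sketch below (with an ad hoc clipping constant for 0% and 100% observations, an assumption of this example rather than the paper's likelihood-based treatment) shows it applied to percentage data.

```python
# Probit transformation of proportions via the inverse standard-normal CDF.
import numpy as np
from scipy.stats import norm

def probit_transform(p, eps=1e-3):
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)  # keep 0/1 finite
    return norm.ppf(p)

percent = np.array([0.0, 2.5, 10.0, 50.0, 87.5, 100.0]) / 100.0
print(probit_transform(percent).round(3))
```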

5.
The Haseman-Elston (HE) regression method offers a mathematically and computationally simpler alternative to variance-components (VC) models for the linkage analysis of quantitative traits. However, current versions of HE regression and VC models are not optimised for binary traits. Here, we present a modified HE regression and a liability-threshold VC model for binary traits. The new HE method is based on the regression, for sibling pairs, of a linear combination of the trait squares and the trait cross-product on the proportion of alleles identical by descent (IBD) at the putative locus. We have implemented the new HE regression-based method and performed analytic and simulation studies to assess its type I error rate and power under a range of conditions. These studies showed that the new HE method is well behaved under the null hypothesis in large samples, is more powerful than both the original and the revisited HE methods, and is approximately equivalent in power to the liability-threshold VC model.

6.
Z Li, J Möttönen, M J Sillanpää. Heredity 2015;115(6):556-564.
Linear regression-based quantitative trait loci/association mapping methods such as least squares commonly assume normality of residuals. In genetic studies of plants or animals, some quantitative traits may not follow a normal distribution, because the data include outlying observations or are collected from multiple sources, and in such cases normal regression methods may lose statistical power to detect quantitative trait loci. In this work, we propose a robust multiple-locus regression approach for analyzing multiple quantitative traits without the normality assumption. In our method, the objective function is least absolute deviation (LAD), which corresponds to assuming multivariate Laplace-distributed residual errors; this distribution has heavier tails than the normal distribution. In addition, we adopt a group LASSO penalty to produce shrinkage estimation of the marker effects and to describe the genetic correlation among phenotypes. Our LAD-LASSO approach is less sensitive to outliers and is more appropriate for the analysis of data with skewed phenotype distributions. Another application of our robust approach is the missing-phenotype problem in multiple-trait analysis, where missing phenotype entries can simply be filled with extreme values and treated as outliers. The efficiency of the LAD-LASSO approach is illustrated on both simulated and real data sets.
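A toy version of the LAD objective with a group-lasso penalty can be written down directly; the crude numerical minimization below is viable only for tiny problems and stands in for the authors' dedicated fitting algorithm.

```python
# LAD loss plus a group penalty tying each marker's effects across traits.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
n, p, k = 100, 4, 2                                   # individuals, markers, traits
X = rng.integers(0, 3, size=(n, p)).astype(float)     # genotype codes 0/1/2
B_true = np.zeros((p, k))
B_true[0] = [1.0, 0.8]                                # one marker affects both traits
Y = X @ B_true + rng.laplace(scale=0.5, size=(n, k))  # heavy-tailed residuals

lam = 2.0                                             # penalty weight (arbitrary)

def objective(b_flat):
    B = b_flat.reshape(p, k)
    lad = np.abs(Y - X @ B).sum()                     # L1 (LAD) loss
    group = np.sqrt((B ** 2).sum(axis=1)).sum()       # group-lasso penalty
    return lad + lam * group

res = minimize(objective, np.zeros(p * k), method="Powell")
print(res.x.reshape(p, k).round(2))
```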

7.
Functional diversity (FD) is an important component of biodiversity that quantifies the differences in functional traits between organisms. However, FD studies are often limited by the availability of trait data, and FD indices are sensitive to data gaps. The distribution of species abundance and trait data, and its transformation, may further affect the accuracy of indices when data are incomplete. Using an existing approach, we simulated the effects of missing trait data by gradually removing data from a plant, an ant, and a bird community dataset (12, 59, and 8 plots containing 62, 297, and 238 species, respectively). We ranked plots by FD values calculated first from the full datasets and then from increasingly incomplete datasets, and compared the rankings to assess the accuracy of FD indices as data become increasingly incomplete. Finally, we tested the accuracy of FD indices with and without data transformation, and the effect of removing trait data per plot versus from the whole species pool. FD indices became less accurate as the amount of missing data increased, with the loss of accuracy depending on the index. However, where transformation improved the normality of the trait data, FD values from incomplete datasets were more accurate than before transformation. The distribution of the data and its transformation are therefore as important as data completeness and can even mitigate the effect of missing data. Since the effect of removing trait values pool-wise or plot-wise depends on the data distribution, the method should be decided case by case. Data distribution and data transformation should be given more careful consideration when designing, analysing, and interpreting FD studies, especially where trait data are missing. To this end, we provide the R package "traitor" to facilitate assessments of missing trait data.
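The simulation logic (compute an FD index from complete and artificially thinned trait data, then compare plot rankings) can be sketched as follows; Rao's quadratic entropy and the 30% missingness level are illustrative choices, and the snippet does not reproduce the "traitor" package.

```python
# Rank agreement of a functional-diversity index under missing trait data.
import numpy as np
from scipy.stats import spearmanr
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(7)
n_species, n_plots = 40, 12
traits = rng.normal(size=(n_species, 3))                  # 3 traits per species
abund = rng.dirichlet(np.ones(n_species), size=n_plots)   # relative abundances

def raos_q(abund_row, trait_matrix):
    keep = ~np.isnan(trait_matrix).any(axis=1)            # drop trait-less species
    p = abund_row[keep] / abund_row[keep].sum()
    d = squareform(pdist(trait_matrix[keep]))             # pairwise trait distances
    return p @ d @ p                                      # Rao's quadratic entropy

full = np.array([raos_q(a, traits) for a in abund])

traits_miss = traits.copy()
traits_miss[rng.random(traits.shape) < 0.3] = np.nan      # 30% missing values
reduced = np.array([raos_q(a, traits_miss) for a in abund])

rho, _ = spearmanr(full, reduced)
print("Spearman rank agreement with 30% missing traits:", round(rho, 2))
```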

8.
Cui Y, Kim DY, Zhu J. Genetics 2006;174(4):2159-2172.
Statistical methods for mapping quantitative trait loci (QTL) have been extensively studied. While most existing methods assume normal distribution of the phenotype, the normality assumption could be easily violated when phenotypes are measured in counts. One natural choice to deal with count traits is to apply the classical Poisson regression model. However, conditional on covariates, the Poisson assumption of mean-variance equality may not be valid when data are potentially under- or overdispersed. In this article, we propose an interval-mapping approach for phenotypes measured in counts. We model the effects of QTL through a generalized Poisson regression model and develop efficient likelihood-based inference procedures. This approach, implemented with the EM algorithm, allows for a genomewide scan for the existence of QTL throughout the entire genome. The performance of the proposed method is evaluated through extensive simulation studies along with comparisons with existing approaches such as the Poisson regression and the generalized estimating equation approach. An application to a rice tiller number data set is given. Our approach provides a standard procedure for mapping QTL involved in the genetic control of complex traits measured in counts.
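The generalized Poisson likelihood at the heart of the method can be fit marginally in a few lines. The parameterization below (rate-like theta and dispersion lam, with lam = 0 recovering the ordinary Poisson) follows the standard generalized Poisson density; the paper's interval-mapping EM scan is not shown.

```python
# Marginal ML fit of the generalized Poisson distribution.
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

def gp_negloglik(params, y):
    theta, lam = params
    if theta <= 0 or not (-1.0 < lam < 1.0):
        return np.inf
    mu = theta + lam * y
    if np.any(mu <= 0):
        return np.inf
    # log P(Y=y) = log(theta) + (y-1) log(theta + lam*y) - (theta + lam*y) - log(y!)
    ll = (np.log(theta) + (y - 1) * np.log(mu) - mu - gammaln(y + 1)).sum()
    return -ll

rng = np.random.default_rng(8)
y = rng.poisson(3.0, size=500)          # equidispersed data: expect lam-hat near 0

res = minimize(gp_negloglik, x0=[1.0, 0.1], args=(y,), method="Nelder-Mead")
print("theta-hat, lambda-hat:", np.round(res.x, 3))
```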

9.
Variance-component analysis provides an efficient method for performing linkage analysis of quantitative traits. However, the type I error of variance-components-based likelihood-ratio testing may be affected when phenotypic data are non-normally distributed (especially with high kurtosis), resulting in inflated LOD scores when the normality assumption does not hold. Although different solutions have been proposed to deal with this problem for univariate phenotypes, little work has been done in the multivariate case. We present an empirical approach to adjust the inflated LOD scores obtained from a bivariate phenotype that violates the assumption of normality. Using the Collaborative Study on the Genetics of Alcoholism data available for Genetic Analysis Workshop 14, we show how bivariate linkage analysis with leptokurtic traits gives inflated type I error. We perform a novel correction that achieves acceptable levels of type I error.

10.
M. Turelli, N. H. Barton. Genetics 1994;138(3):913-941.
We develop a general population genetic framework for analyzing selection on many loci, and apply it to strong truncation and disruptive selection on an additive polygenic trait. We first present statistical methods for analyzing the infinitesimal model, in which offspring breeding values are normally distributed around the mean of the parents, with fixed variance. These show that the usual assumption of a Gaussian distribution of breeding values in the population gives remarkably accurate predictions for the mean and the variance, even when disruptive selection generates substantial deviations from normality. We then set out a general genetic analysis of selection and recombination. The population is represented by multilocus cumulants describing the distribution of haploid genotypes, and selection is described by the relation between mean fitness and these cumulants. We provide exact recursions in terms of generating functions for the effects of selection on non-central moments. The effects of recombination are simply calculated as a weighted sum over all the permutations produced by meiosis. Finally, the new cumulants that describe the next generation are computed from the non-central moments. Although this scheme is applied here in detail only to selection on an additive trait, it is quite general. For arbitrary epistasis and linkage, we describe a consistent infinitesimal limit in which the short-term selection response is dominated by infinitesimal allele frequency changes and linkage disequilibria. Numerical multilocus results show that the standard Gaussian approximation gives accurate predictions for the dynamics of the mean and genetic variance in this limit. Even with intense truncation selection, linkage disequilibria of order three and higher never cause much deviation from normality. Thus, the empirical deviations frequently found between predicted and observed responses to artificial selection are not caused by linkage-disequilibrium-induced departures from normality. Disruptive selection can generate substantial four-way disequilibria, and hence kurtosis; but even then, the Gaussian assumption predicts the variance accurately. In contrast to the apparent simplicity of the infinitesimal limit, data suggest that changes in genetic variance after 10 or more generations of selection are likely to be dominated by allele frequency dynamics that depend on genetic details.
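The infinitesimal model the paper starts from is easy to simulate directly: offspring breeding values are drawn around the parental midpoint with a fixed segregation variance. The truncation proportion and variances below are arbitrary choices, and none of the cumulant machinery is reproduced.

```python
# Truncation selection under the infinitesimal model.
import numpy as np

rng = np.random.default_rng(10)
N, generations = 5000, 10
vg = 1.0                                        # genetic variance
pop = rng.normal(0.0, np.sqrt(vg), size=N)      # initial breeding values

for g in range(generations):
    cutoff = np.quantile(pop, 0.8)              # truncation: keep the top 20%
    parents = pop[pop >= cutoff]
    dams = rng.choice(parents, size=N)
    sires = rng.choice(parents, size=N)
    midparent = (dams + sires) / 2.0
    # Offspring ~ Normal(midparent, segregation variance vg/2), held fixed.
    pop = midparent + rng.normal(0.0, np.sqrt(vg / 2.0), size=N)
    print(f"gen {g + 1}: mean={pop.mean():.2f}, var={pop.var():.2f}")
```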

11.
Wu C, Li G, Zhu J, Cui Y. PLoS ONE 2011;6(9):e24902.
Functional mapping has been a powerful tool for mapping quantitative trait loci (QTL) underlying dynamic traits of agricultural or biomedical interest. In functional mapping, multivariate normality is often assumed for the underlying data distribution, partly because of the ease of parameter estimation. The normality assumption, however, is easily violated in real applications, for reasons such as heavy tails or extreme observations. Departure from normality has a negative effect on testing power and on inference for QTL identification. In this work, we relax the normality assumption and propose a robust multivariate t-distribution mapping framework for QTL identification in functional mapping. Simulation studies show increased mapping power and precision with the t distribution compared with the normal distribution. The utility of the method is demonstrated through a real data analysis.
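The robustness argument can be made concrete by comparing log-likelihoods of heavy-tailed data under normal and t models; the sketch below uses SciPy's multivariate distributions on simulated data and does not implement the functional-mapping scan itself.

```python
# Normal vs multivariate-t fit on heavy-tailed bivariate data.
import numpy as np
from scipy.stats import multivariate_normal, multivariate_t

rng = np.random.default_rng(11)
shape = np.array([[1.0, 0.5], [0.5, 1.0]])
data = multivariate_t.rvs(loc=[0, 0], shape=shape, df=3, size=300,
                          random_state=rng)      # heavy-tailed sample

ll_norm = multivariate_normal.logpdf(data, mean=[0, 0], cov=shape).sum()
ll_t = multivariate_t.logpdf(data, loc=[0, 0], shape=shape, df=3).sum()
print(f"normal log-lik: {ll_norm:.1f}   t(df=3) log-lik: {ll_t:.1f}")
```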

12.
Variance-component (VC) methods are flexible and powerful procedures for the mapping of genes that influence quantitative traits. However, traditional VC methods make the critical assumption that the quantitative-trait data within a family either follow or can be transformed to follow a multivariate normal distribution. Violation of the multivariate normality assumption can occur if trait data are censored at some threshold value. Trait censoring can arise in a variety of ways, including assay limitation or confounding due to medication. Valid linkage analyses of censored data require the development of a modified VC method that directly models the censoring event. Here, we present such a model, which we call the "tobit VC method." Using simulation studies, we compare and contrast the performance of the traditional and tobit VC methods for linkage analysis of censored trait data. For the simulation settings that we considered, our results suggest that (1) analyses of censored data by using the traditional VC method lead to severe bias in parameter estimates and a modest increase in false-positive linkage findings, (2) analyses with the tobit VC method lead to unbiased parameter estimates and type I error rates that reflect nominal levels, and (3) the tobit VC method has a modest increase in linkage power as compared with the traditional VC method. We also apply the tobit VC method to censored data from the Finland-United States Investigation of Non-Insulin-Dependent Diabetes Mellitus Genetics study and provide two examples in which the tobit VC method yields noticeably different results as compared with the traditional method.
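A univariate tobit likelihood illustrates the direct modeling of the censoring event: uncensored observations contribute the normal density and censored ones the normal CDF at the threshold. The sketch below fits only a mean and variance, not the full VC model.

```python
# Tobit-style likelihood for a left-censored normal trait.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def tobit_negloglik(params, y, c):
    mu, log_sigma = params
    sigma = np.exp(log_sigma)                           # keep sigma positive
    cens = y <= c
    ll = norm.logpdf(y[~cens], loc=mu, scale=sigma).sum()   # observed values
    ll += cens.sum() * norm.logcdf((c - mu) / sigma)        # censored at c
    return -ll

rng = np.random.default_rng(12)
latent = rng.normal(2.0, 1.5, size=1000)
c = 1.0
y = np.maximum(latent, c)        # assay floor: values below c recorded as c

res = minimize(tobit_negloglik, x0=[0.0, 0.0], args=(y, c), method="Nelder-Mead")
print("mu-hat:", round(res.x[0], 2), " sigma-hat:", round(float(np.exp(res.x[1])), 2))
```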

13.
Existing methods for joint modeling of longitudinal measurements and survival data can be highly influenced by outliers in the longitudinal outcome. We propose a joint model for the analysis of longitudinal measurements and competing-risks failure time data that is robust in the presence of outlying longitudinal observations during follow-up. Our model consists of a linear mixed-effects sub-model for the longitudinal outcome and a proportional cause-specific hazards frailty sub-model for the competing-risks data, linked together by latent random effects. Instead of the usual normality assumption for measurement errors in the linear mixed-effects sub-model, we adopt a t-distribution, which has heavier tails and is thus more robust to outliers. We derive an EM algorithm for the maximum likelihood estimates of the parameters and estimate their standard errors using a profile likelihood method. The proposed method is evaluated by simulation studies and is applied to a scleroderma lung study.

14.
Extreme discordant sibling-pair (EDSP) designs have been shown in theory to be very powerful for mapping quantitative-trait loci (QTLs) in humans. However, their practical applicability has been somewhat limited by the need to phenotype very large populations to find enough pairs that are extremely discordant. In this paper, we demonstrate that there is also substantial power in pairs that are only moderately discordant, and that designs using moderately discordant pairs can yield a more practical balance between phenotyping and genotyping efforts. The power we demonstrate for moderately discordant pairs stems from a new statistical result. Statistical analysis in discordant-pair studies is generally done by testing for reduced identity-by-descent (IBD) sharing in the pairs. By contrast, the most commonly used statistical methods for more standard QTL mapping are Haseman-Elston regression and variance-components analysis, both of which use statistics that are functions of the trait values given IBD information for the pedigree. We show that IBD-sharing statistics and "trait value given IBD" statistics contribute complementary rather than redundant information, and thus that statistics of the two types can be combined to form more powerful tests of linkage. We propose a simple composite statistic and test it with simulation studies. The simulation results show that our composite statistic increases power only minimally for extremely discordant pairs; however, it boosts the power of moderately discordant pairs substantially and makes them a very practical alternative. Our composite statistic is straightforward to calculate with existing software; we give a practical example of its use by applying it to a Genetic Analysis Workshop (GAW) data set.
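The paper's exact composite statistic is not reproduced here; as a generic illustration of combining an IBD-sharing statistic with a "trait value given IBD" statistic, a Stouffer-style weighted sum of two approximately independent z-scores looks like this.

```python
# Generic combination of two complementary linkage statistics.
import numpy as np
from scipy.stats import norm

def composite_z(z_ibd, z_trait, w1=1.0, w2=1.0):
    """Weighted sum of two (assumed independent) z statistics,
    rescaled to have unit variance; returns (z, one-sided p)."""
    z = (w1 * z_ibd + w2 * z_trait) / np.sqrt(w1 ** 2 + w2 ** 2)
    return z, norm.sf(z)

z, p = composite_z(1.8, 1.5)
print(f"composite z = {z:.2f}, one-sided p = {p:.3f}")
```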

15.
Yang R, Yi N, Xu S. Genetica 2006;128(1-3):133-143.
The maximum likelihood method of QTL mapping assumes that the phenotypic values of a quantitative trait follow a normal distribution. If the assumption is violated, some form of transformation should be applied to make the assumption approximately true. The Box–Cox transformation is a general transformation method that can be applied to many different types of data. Its flexibility comes from a variable, called the transformation factor, that appears in the Box–Cox formula. We developed a maximum likelihood method that treats the transformation factor as an unknown parameter, estimated from the data simultaneously along with the QTL parameters. The method makes an objective choice of data transformation and thus can be applied to QTL analysis for many different types of data. Simulation studies show that (1) the Box–Cox transformation can substantially increase the power of QTL detection; (2) the Box–Cox transformation can replace some specialized transformation methods that are commonly used in QTL mapping; and (3) applying the Box–Cox transformation to data that are already normally distributed does not harm the result.
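In SciPy, scipy.stats.boxcox already estimates the transformation factor by maximum likelihood, which mirrors the idea of treating it as a free parameter (though without the joint estimation with QTL parameters described above).

```python
# ML estimation of the Box-Cox transformation factor.
import numpy as np
from scipy import stats

rng = np.random.default_rng(15)
y = rng.lognormal(mean=0.0, sigma=0.6, size=500)   # right-skewed, positive trait

y_transformed, lam_hat = stats.boxcox(y)           # lambda estimated by ML
print("estimated transformation factor:", round(lam_hat, 2))
print("skewness before/after:",
      round(stats.skew(y), 2), "/", round(stats.skew(y_transformed), 2))
```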

16.
In applied entomological experiments, when the response is a count-type variable, transformation remedies such as the square root, logarithm (log), or rank transformation are often used to normalize the data before analysis of variance. In this study, we examine the usefulness of these transformations by reanalyzing field-collected data from a split-plot experiment and by performing a more comprehensive simulation study of factorial and split-plot experiments. For the field-collected data, whether interactions were significant depended on the type of transformation. For the simulation study, Poisson-distributed errors were used in a 2 × 2 factorial arrangement, in both randomized complete block and split-plot settings. Various sizes of main effects were induced, and type I error rates and powers of the tests for interaction were examined for the raw response values and for the log-, square root-, and rank-transformed responses. The aligned rank transformation was also investigated because it has been shown to perform well in testing interactions in factorial arrangements. We found that for testing interactions, the untransformed response and the aligned rank response performed best (preserving nominal type I error rates), whereas the other transformations had inflated error rates when main effects were present. We did not evaluate the tests for main effects or simple effects; these transformations may still be necessary when performing those tests.
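The simulation design can be sketched as follows: balanced 2 × 2 factorial data with Poisson errors and additive main effects but no interaction, with the interaction F-test applied to raw and transformed responses. The effect sizes, cell size, and use of a plain (not aligned) rank transform are assumptions of this toy, not the paper's settings.

```python
# Type I error of the interaction F-test under various transformations.
import numpy as np
from scipy import stats

rng = np.random.default_rng(16)

def interaction_pvalue(y, a, b):
    """Interaction F-test for a balanced 2x2 design, from cell means."""
    cells = [y[(a == i) & (b == j)] for i in (0, 1) for j in (0, 1)]
    n = len(cells[0])
    cell_means = np.array([c.mean() for c in cells]).reshape(2, 2)
    grand = cell_means.mean()
    ai = cell_means.mean(axis=1) - grand
    bj = cell_means.mean(axis=0) - grand
    inter = cell_means - grand - ai[:, None] - bj[None, :]
    ss_int = n * (inter ** 2).sum()                          # 1 df
    ss_err = sum(((c - c.mean()) ** 2).sum() for c in cells)
    df_err = 4 * (n - 1)
    return stats.f.sf(ss_int / (ss_err / df_err), 1, df_err)

n, reps = 10, 2000
a = np.repeat([0, 0, 1, 1], n)
b = np.repeat([0, 1, 0, 1], n)
mu = 2.0 + 4.0 * a + 4.0 * b          # additive main effects, no interaction

transforms = {"raw": lambda y: y,
              "log": lambda y: np.log(y + 1.0),
              "sqrt": np.sqrt,
              "rank": stats.rankdata}
hits = {name: 0 for name in transforms}
for _ in range(reps):
    y = rng.poisson(mu).astype(float)
    for name, f in transforms.items():
        hits[name] += interaction_pvalue(f(y), a, b) < 0.05

for name, h in hits.items():
    print(f"{name:>5}: type I error = {h / reps:.3f}")
```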

17.
Variance-component methods are popular and flexible analytic tools for elucidating the genetic mechanisms of complex quantitative traits from pedigree data. However, variance-component methods typically assume that the trait of interest follows a multivariate normal distribution within a pedigree. Studies have shown that violation of this normality assumption can lead to biased parameter estimates and inflated type I error. This limits the application of variance-component methods to more general trait outcomes, whether continuous or categorical in nature. In this paper, we develop and apply a general variance-component framework for pedigree analysis of continuous and categorical outcomes. We develop appropriate models using generalized linear mixed model theory and fit them using approximate maximum-likelihood procedures. Using our proposed method, we demonstrate that one can perform variance-component pedigree analysis on outcomes that follow any exponential-family distribution. Additionally, we show how the method can be modified to handle ordinal outcomes. We also discuss extensions of our variance-component framework to accommodate pedigrees ascertained on the basis of trait outcome. We demonstrate the feasibility of our method using both simulated data and data from a genetic study of ovarian insufficiency.

18.
Exposure measurement error can result in a biased estimate of the association between an exposure and an outcome. When the exposure-outcome relationship is linear on the appropriate scale (e.g. linear, logistic) and the measurement error is classical, that is, the result of random noise, the effect estimate is attenuated. When the relationship is non-linear, measurement error distorts the true shape of the association. Regression calibration is a commonly used correction method in which each individual's unknown true exposure in the outcome regression model is replaced by its expectation conditional on the error-prone measure and any fully measured covariates. Regression calibration is simple to execute when the exposure enters the linear predictor of the outcome regression model untransformed, but less straightforward when non-linear transformations of the exposure are used. We describe a method for applying regression calibration in models in which a non-linear association is modelled by transforming the exposure using a fractional polynomial model. It is shown that taking a Bayesian estimation approach is advantageous. By use of Markov chain Monte Carlo algorithms, one can sample from the distribution of the true exposure for each individual. Transformations of the sampled values can then be performed directly and used to find the expectation of the transformed exposure required for regression calibration. A simulation study shows that the proposed approach performs well. We apply the method to investigate the relationship between usual alcohol intake and subsequent all-cause mortality, using an error model that adjusts for the episodic nature of alcohol consumption.
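Under a simple classical error model with known variances (an assumption of this sketch; the paper estimates everything within a Bayesian MCMC fit), sampling from the conditional distribution of true exposure given the error-prone measure and averaging transformed draws gives the calibrated terms.

```python
# Regression calibration for transformed exposures by conditional sampling.
import numpy as np

rng = np.random.default_rng(18)
mu_x, var_x, var_u = 2.0, 1.0, 0.5            # X ~ N(mu_x, var_x), W = X + U
w = np.array([1.2, 2.0, 3.1])                 # error-prone measurements

lam = var_x / (var_x + var_u)                 # reliability ratio
cond_mean = mu_x + lam * (w - mu_x)           # E[X | W]
cond_sd = np.sqrt(lam * var_u)                # SD of X | W

draws = rng.normal(cond_mean, cond_sd, size=(20000, len(w)))
draws = np.clip(draws, 1e-6, None)            # keep the transforms defined

# Fractional-polynomial-style terms, e.g. powers (-1, 0.5): 1/x and sqrt(x).
print("E[1/X | W]:     ", (1.0 / draws).mean(axis=0).round(3))
print("E[sqrt(X) | W]: ", np.sqrt(draws).mean(axis=0).round(3))
```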

19.
We consider a model of sympatric speciation due to frequency-dependent competition, in which it was previously assumed that the evolving traits have a very simple genetic architecture. In the present study, we numerically analyze the consequences of relaxing this assumption. First, previous models assumed that assortative mating evolves in infinitesimal steps. Here, we show that the range of parameters for which speciation is possible increases when mutational steps are large. Second, it was assumed that the trait under frequency-dependent selection is determined by a single locus with two alleles and additive effects. As a consequence, the resultant intermediate phenotype is always heterozygous and can never breed true. To relax this assumption, here we add a second locus influencing the trait. We find three new possible evolutionary outcomes: evolution of three reproductively isolated species, a monomorphic equilibrium with only the intermediate phenotype, and a randomly mating population with a steep unimodal distribution of phenotypes. Both extensions of the original model thus increase the likelihood of competitive speciation.

20.
In genomic research, phenotype transformations are commonly used as a straightforward way to achieve normality of the model outcome, and many researchers still believe them to be necessary for proper inference. Using regression simulations, we show that phenotype transformations are typically not needed and that, when applied to phenotypes with heteroscedasticity, they result in inflated type I error rates. We further explain that what matters is addressing the combination of rare-variant genotypes and heteroscedasticity: incorrectly estimated parameter variability or an incorrect choice of the distribution of the underlying test statistic produces spurious detection of associations. We conclude that it is the combination of heteroscedasticity, minor allele frequency, sample size, and, to a much lesser extent, the error distribution that matters for proper statistical inference.
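A small simulation makes the point concrete: with a rare variant and genotype-dependent error variance but no mean effect, a naive OLS t-test rejects too often. The MAF, variance ratio, and sample size below are illustrative assumptions, not the paper's settings.

```python
# Inflated type I error from heteroscedasticity plus a rare variant.
import numpy as np
from scipy import stats

rng = np.random.default_rng(20)
n, reps, maf = 500, 3000, 0.02

def ols_t_pvalue(g, y):
    """Two-sided t-test for the OLS slope of y on g."""
    g_c = g - g.mean()
    beta = (g_c * y).sum() / (g_c ** 2).sum()
    resid = y - y.mean() - beta * g_c
    se = np.sqrt((resid ** 2).sum() / (n - 2) / (g_c ** 2).sum())
    return 2 * stats.t.sf(abs(beta / se), n - 2)

tests = hits_het = hits_hom = 0
for _ in range(reps):
    g = rng.binomial(2, maf, size=n).astype(float)        # rare variant (MAF 2%)
    if g.std() == 0:
        continue                                          # no carriers drawn; skip
    tests += 1
    y_het = rng.normal(0.0, np.where(g > 0, 3.0, 1.0))    # variance differs, mean equal
    y_hom = rng.normal(0.0, 1.0, size=n)                  # homoscedastic baseline
    hits_het += ols_t_pvalue(g, y_het) < 0.05
    hits_hom += ols_t_pvalue(g, y_hom) < 0.05

print(f"type I error, heteroscedastic: {hits_het / tests:.3f}")
print(f"type I error, homoscedastic:   {hits_hom / tests:.3f}")
```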
