Similar literature
 20 similar articles found (search time: 546 ms)
1.
Z Li  J Möttönen  M J Sillanpää 《Heredity》2015,115(6):556-564
Linear regression-based quantitative trait loci/association mapping methods such as least squares commonly assume normality of residuals. In genetics studies of plants or animals, some quantitative traits may not follow a normal distribution because the data include outlying observations or data that are collected from multiple sources, and in such cases the normal regression methods may lose some statistical power to detect quantitative trait loci. In this work, we propose a robust multiple-locus regression approach for analyzing multiple quantitative traits without normality assumption. In our method, the objective function is least absolute deviation (LAD), which corresponds to the assumption of multivariate Laplace distributed residual errors. This distribution has heavier tails than the normal distribution. In addition, we adopt a group LASSO penalty to produce shrinkage estimation of the marker effects and to describe the genetic correlation among phenotypes. Our LAD-LASSO approach is less sensitive to outliers and is more appropriate for the analysis of data with skewed phenotype distributions. Another application of our robust approach is to the missing-phenotype problem in multiple-trait analysis, where the missing phenotype items can simply be filled with some extreme values, and be treated as outliers. The efficiency of the LAD-LASSO approach is illustrated on both simulated and real data sets.
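To make the robustness claim concrete, here is a minimal sketch of the simplest possible case: an intercept-only model with no penalty, where the least-squares estimate is the sample mean and the LAD estimate is the sample median. This is not the authors' LAD-LASSO implementation; the phenotype values and function names are hypothetical and chosen only for illustration.

```python
import statistics


def squared_loss(values, estimate):
    """Sum of squared residuals (the least-squares objective)."""
    return sum((v - estimate) ** 2 for v in values)


def absolute_loss(values, estimate):
    """Sum of absolute residuals (the LAD objective)."""
    return sum(abs(v - estimate) for v in values)


# Hypothetical phenotype values; the last one plays the role of an outlier.
phenotypes = [9.8, 10.1, 10.0, 9.9, 10.2, 50.0]

ls_estimate = statistics.mean(phenotypes)     # minimizes squared_loss
lad_estimate = statistics.median(phenotypes)  # minimizes absolute_loss

print(f"least squares: {ls_estimate:.2f}, LAD: {lad_estimate:.2f}")
```

The single extreme value pulls the least-squares estimate far from the bulk of the data, while the LAD estimate stays with the majority of observations, which is the behaviour the abstract relies on when it treats missing phenotypes as outliers.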

2.
Xu C  Zhang YM  Xu S 《Heredity》2005,94(1):119-128
Many disease resistance traits in plants have a polygenic background and the disease phenotypes are modified by environmental factors. As a consequence, the phenotypic values usually show a quantitative variation. The phenotypes of such disease traits, however, are often measured in discrete but ordered categories. These traits are called ordinal traits. In terms of disease resistance, they are called quantitative resistance traits, as opposed to qualitative resistance traits, and are controlled by the quantitative resistance loci (QRL). Classical quantitative trait locus mapping methods are not optimal for ordinal trait analysis because the assumption of normal distribution is violated. Methods for mapping binary trait loci are not suitable either because there are more than two categories in ordinal traits. We developed a maximum likelihood method to map these QRL. The method is implemented via a multicycle expectation-conditional-maximization (ECM) algorithm under the threshold model, where we can estimate both the QRL effects and the thresholds that link the disease liability and the categorical phenotype. The method is verified in simulated data under various combinations of the parameters. An SAS program is available to implement the multicycle ECM algorithm. The program can be downloaded from our website at www.statgen.ucr.edu.
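The threshold model mentioned above links a continuous liability to ordered categories through cut points. The sketch below shows only that link, assuming a normally distributed liability with unit variance; it does not implement the ECM algorithm, and the function names, liability mean and threshold values are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def category_probabilities(liability_mean, thresholds):
    """Probability of each ordered category under a threshold model.

    The liability is assumed normal with the given mean and unit variance.
    `thresholds` must be sorted in increasing order; k thresholds define
    k + 1 ordered categories.
    """
    cuts = [-math.inf] + list(thresholds) + [math.inf]
    cdf = [normal_cdf(c - liability_mean) for c in cuts]
    # Each category's probability is the liability mass between two cuts.
    return [upper - lower for lower, upper in zip(cdf, cdf[1:])]


# Hypothetical example: four disease-severity classes, liability mean 0.5.
probs = category_probabilities(liability_mean=0.5, thresholds=[-1.0, 0.0, 1.0])
print([round(p, 3) for p in probs])
```

In a mapping context the liability mean would depend on the genotype at the putative locus, so a locus effect shifts probability mass between categories without changing the thresholds.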

3.
Increasing empirical evidence suggests that many genetic variants influence multiple distinct phenotypes. When cross-phenotype effects exist, multivariate association methods that consider pleiotropy are often more powerful than univariate methods that model each phenotype separately. Although several statistical approaches exist for testing cross-phenotype effects for common variants, there is a lack of similar tests for gene-based analysis of rare variants. In order to fill this important gap, we introduce a statistical method for cross-phenotype analysis of rare variants using a nonparametric distance-covariance approach that compares similarity in multivariate phenotypes to similarity in rare-variant genotypes across a gene. The approach can accommodate both binary and continuous phenotypes and further can adjust for covariates. Our approach yields a closed-form test whose significance can be evaluated analytically, thereby improving computational efficiency and permitting application on a genome-wide scale. We use simulated data to demonstrate that our method, which we refer to as the Gene Association with Multiple Traits (GAMuT) test, provides increased power over competing approaches. We also illustrate our approach using exome-chip data from the Genetic Epidemiology Network of Arteriopathy.
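The idea of comparing pairwise similarity in one set of measurements with pairwise similarity in another can be illustrated with the textbook sample distance covariance for two scalar variables. This is a simplified sketch, not the GAMuT test itself, which works with multivariate phenotypes and genotypes and an analytic p-value; the data and function names are hypothetical.

```python
def pairwise_distances(values):
    """Matrix of absolute differences between scalar observations."""
    return [[abs(a - b) for b in values] for a in values]


def double_center(matrix):
    """Subtract row and column means and add back the grand mean."""
    n = len(matrix)
    row_means = [sum(row) / n for row in matrix]
    col_means = [sum(matrix[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_means) / n
    return [[matrix[i][j] - row_means[i] - col_means[j] + grand_mean
             for j in range(n)] for i in range(n)]


def distance_covariance_sq(x, y):
    """Squared sample distance covariance of two equal-length samples."""
    n = len(x)
    a = double_center(pairwise_distances(x))
    b = double_center(pairwise_distances(y))
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2


# Hypothetical data: a genotype burden score and two candidate phenotypes.
genotype_score = [0, 1, 2, 0, 1, 2]
phenotype = [1.0, 2.1, 2.9, 0.8, 2.0, 3.2]
constant = [5.0] * 6

dcov_related = distance_covariance_sq(genotype_score, phenotype)
dcov_constant = distance_covariance_sq(genotype_score, constant)
print(dcov_related, dcov_constant)
```

A phenotype that does not vary across subjects has zero distance covariance with any genotype score, whereas a phenotype that tracks the score gives a positive value.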

4.
Recently, different dehydration-based technologies have been evaluated for the purpose of cell and tissue preservation. Although some early results have been promising, they have not satisfied the requirements for large-scale applications. Long experience with quantitative trait locus (QTL) analysis has shown the yeast Saccharomyces cerevisiae to be a good model organism for studying the link between complex phenotypes and DNA variations. Here, we use QTL analysis as a tool for identifying the specific yeast traits involved in dehydration stress tolerance. Three hybrids obtained from stable haploids and sequenced in the Saccharomyces Genome Resequencing Project showed intermediate dehydration tolerance in most cases. The dehydration resistance trait of 96 segregants from each hybrid was quantified. A smooth, continuous distribution of the anhydrobiosis tolerance trait was found, suggesting that this trait is determined by multiple QTLs. Therefore, we carried out a QTL analysis to identify the determinants of this dehydration tolerance trait at the genomic level. Among the genes identified after reciprocal hemizygosity assays, RSM22, ATG18 and DBR1 had not been referenced in previous studies. We report new phenotypes for these genes using a previously validated test. Finally, our data illustrate the power of this approach in the investigation of the complex cell dehydration phenotype.

5.
Zhang H  Wang X  Ye Y 《Genetics》2006,172(1):693-699
There is growing interest in genomewide association analysis using single-nucleotide polymorphisms (SNPs), because traditional linkage studies are not as powerful in identifying genes for common, complex diseases. Tests for linkage disequilibrium have been developed for binary and quantitative traits. However, since many human conditions and diseases are measured on an ordinal scale, methods need to be developed to investigate the association of genes and ordinal traits. Thus, in the current report we propose and derive a score test statistic that identifies genes that are associated with ordinal traits when gametic disequilibrium between a marker and trait loci exists. Through simulation, the performance of this new test is examined for both ordinal traits and quantitative traits. The proposed statistic not only accommodates and is more powerful for ordinal traits, but also has similar power to that of existing tests when the trait is quantitative. Therefore, our proposed statistic has the potential to serve as a unified approach to identifying genes that are associated with any trait, regardless of how the trait is measured. We further demonstrated the advantage of our test by revealing a significant association (P = 0.00067) between alcohol dependence and a SNP in the growth-associated protein 43.

6.
An advanced intercross line (AIL) is an easier and more cost-effective approach compared to recombinant inbred lines for fine mapping of quantitative trait loci (QTL) identified by F2 designs. In an AIL, a complex binary trait can be mapped through analysis of either continuously distributed proxy traits for the liability of the binary trait or the liability itself, the latter presenting the greater statistical challenge. In another work, we successfully applied both approaches in an AIL to fine map previously identified QTL underlying anatomical parameters of the cardiac inter-atrial septum including patent foramen ovale. Here, we describe the statistical methods that we used to analyse complex binary traits in our AIL design. This is achieved using a likelihood-based method, with the expectation-maximisation algorithm allowing use of standard logistic regression methods for model fitting.

7.
Traits such as disease resistance are costly to evaluate and slow to improve using current methods. Analysis of gene expression profiles (e.g. DNA microarrays) has potential for predicting such phenotypes and has been used in an analogous way to classify cancer types in human patients. However, doubts have been raised regarding the use of classification methods with microarray data for this purpose. Here we propose a method using random regression with cross validation, which accounts for the distribution of variation in the trait and utilises different subsets of patients or animals to perform a complete validation of predictive ability. Published breast tumour data were used to test the method. Despite the small dataset (n < 100), the new approach resulted in a moderate but significant correlation between the predicted and actual phenotypes (0.32). Binary classification of the predicted phenotypes yielded similar classification error rates to those found by other authors (35%). Unlike other methods, the new method gave a quantitative estimate of phenotype that could be used to rank animals and select those with extreme phenotypic performance. Use of the method in an optimal way using larger sample sizes, and combining DNA microarrays and other testing platforms, is recommended.

8.
MOTIVATION: Phylogenomic profiling is a large-scale comparative genomic method used to infer protein function from evolutionary information, first described in a binary form by Pellegrini et al. (1999). Here, we propose improvements of this approach including the use of normalized Blastp bit scores, a normalization of the matrix of profiles to take into account the evolutionary distances between bacteria, the definition of a phylogenomic neighborhood based on continuous pairwise distances between genes and an original annotation procedure including the computation of a p-value for each functional assignment. RESULTS: The method presented here increases the number of Ecocyc enzymes identified as being evolutionarily related by about 25% with respect to the original binary form (absent/present) method. The fraction of 'false' positives is shown to be smaller than 20%. Based on their phylogenomic relationships, genes of unknown function can then be automatically related to annotated genes. Each gene annotation predicted is associated with a p-value, i.e. the probability of obtaining it by chance. The validity of this method was extensively tested on a large set of genes of known function using the MultiFun database. We find that 50% of 3122 function attributions that can be made at a p-value level of 10^-11 correspond to the actual gene annotation. The method can be readily applied to any newly sequenced microbial genome. In contrast to earlier work on the same topic, our approach avoids the use of arbitrary cut-off values, and provides a reliability estimate of the functional predictions in the form of p-values.

9.
Jia Z  Xu S 《Genetical research》2005,86(3):193-207
Cluster analyses of gene expression data are usually conducted based on their associations with the phenotype of a particular disease. Many disease traits have a clearly defined binary phenotype (presence or absence), so that genes can be clustered based on the differences of expression levels between the two contrasting phenotypic groups. For example, cluster analysis based on binary phenotype has been successfully used in tumour research. Some complex diseases have phenotypes that vary in a continuous manner and the method developed for a binary trait is not immediately applicable to a continuous trait. However, understanding the role of gene expression in these complex traits is of fundamental importance. Therefore, it is necessary to develop a new statistical method to cluster expressed genes based on their association with a quantitative trait phenotype. We developed a model-based clustering method to classify genes based on their association with a continuous phenotype. We used a linear model to describe the relationship between gene expression and the phenotypic value. The model effects of the linear model (linear regression coefficients) represent the strength of the association. We assumed that the model effects of each gene follow a mixture of several multivariate Gaussian distributions. Parameter estimation and cluster assignment were accomplished via an Expectation-Maximization (EM) algorithm. The method was verified by analysing two simulated datasets, and further demonstrated using real data generated in a microarray experiment for the study of gene expression associated with Alzheimer's disease.
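The clustering step described above can be illustrated with a deliberately reduced version: EM for a mixture of two univariate Gaussians fitted to a list of per-gene regression coefficients. The abstract describes a mixture of several multivariate Gaussians, so this is only a sketch of the general technique; the coefficient values, the initialisation and the function names are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_two_gaussians(values, n_iter=50):
    """Fit a two-component univariate Gaussian mixture with EM."""
    n = len(values)
    overall_mean = sum(values) / n
    overall_var = sum((v - overall_mean) ** 2 for v in values) / n
    # Simple initialisation: one component at each extreme of the data.
    means = [min(values), max(values)]
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each value.
        resp = []
        for v in values:
            dens = [weights[k] * normal_pdf(v, means[k], variances[k])
                    for k in range(2)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: update weights, means and variances from the posteriors.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var_k = sum(r[k] * (v - means[k]) ** 2
                        for r, v in zip(resp, values)) / nk
            variances[k] = max(var_k, 1e-6)  # guard against collapse
    labels = [0 if r[0] >= r[1] else 1 for r in resp]
    return means, variances, weights, labels


# Hypothetical regression coefficients: five genes with no association
# and four genes with a positive association to the phenotype.
effects = [0.1, -0.1, 0.05, 0.0, -0.05, 2.0, 2.1, 1.9, 2.05]
means, variances, weights, labels = em_two_gaussians(effects)
print(means, labels)
```

Each gene is assigned to the component with the larger posterior probability, so genes with similar strength of association end up in the same cluster.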

10.
A key step toward the discovery of a gene related to a trait is the finding of an association between the trait and one or more haplotypes. Haplotype analyses can also provide critical information regarding the function of a gene; however, when unrelated subjects are sampled, haplotypes are often ambiguous because of unknown linkage phase of the measured sites along a chromosome. A popular method of accounting for this ambiguity in case-control studies uses a likelihood that depends on haplotype frequencies, so that the haplotype frequencies can be compared between the cases and controls; however, this traditional method is limited to a binary trait (case vs. control), and it does not provide a method of testing the statistical significance of specific haplotypes. To address these limitations, we developed new methods of testing the statistical association between haplotypes and a wide variety of traits, including binary, ordinal, and quantitative traits. Our methods allow adjustment for nongenetic covariates, which may be critical when analyzing genetically complex traits. Furthermore, our methods provide several different global tests for association, as well as haplotype-specific tests, which give a meaningful advantage in attempts to understand the roles of many different haplotypes. The statistics can be computed rapidly, making it feasible to evaluate the associations between many haplotypes and a trait. To illustrate the use of our new methods, they are applied to a study of the association of haplotypes (composed of genes from the human-leukocyte-antigen complex) with humoral immune response to measles vaccination. Limited simulations are also presented to demonstrate the validity of our methods, as well as to provide guidelines on how our methods could be used.

11.
Recursive likelihood calculations for genetic analysis with ungenotyped pedigree data employ variations of the Elston-Stewart (ES) or the Lander-Green (LG) algorithms. With the ES algorithm, the number of loci may be limited but not the pedigree size. With the LG algorithm, the reverse is the case. We introduce two new algorithms for the computation of regressive likelihoods for pedigrees with multivariate traits. The first is an alternative formulation of our existing model, which leads to a simpler form in the binary trait, polygenic and mixed model cases. The second is an approximation model, which is computationally efficient. These methods apply to both continuous and binary traits, in the oligogenic and polygenic cases. Both methods coincide in the binary case. We considered these methods for cases in which all the traits are controlled by a single locus, with each trait controlled by one locus independent of the others. Simulation studies and an analysis of a real data set are presented for segregation analysis as illustrations. These methods can also be used in other model-based analyses. These methods are implemented in G.E.M.S., the genetic epidemiology models software.

12.
Yi N  Banerjee S  Pomp D  Yandell BS 《Genetics》2007,176(3):1855-1864
Development of statistical methods and software for mapping interacting QTL has been the focus of much recent research. We previously developed a Bayesian model selection framework, based on the composite model space approach, for mapping multiple epistatic QTL affecting continuous traits. In this study we extend the composite model space approach to complex ordinal traits in experimental crosses. We jointly model main and epistatic effects of QTL and environmental factors on the basis of the ordinal probit model (also called threshold model) that assumes a latent continuous trait underlies the generation of the ordinal phenotypes through a set of unknown thresholds. A data augmentation approach is developed to jointly generate the latent data and the thresholds. The proposed ordinal probit model, combined with the composite model space framework for continuous traits, offers a convenient way for genomewide interacting QTL analysis of ordinal traits. We illustrate the proposed method by detecting new QTL and epistatic effects for an ordinal trait, dead fetuses, in an F2 intercross of mice. Utility and flexibility of the method are also demonstrated using a simulated data set. Our method has been implemented in the freely available package R/qtlbim, which greatly facilitates the general usage of the Bayesian methodology for genomewide interacting QTL analysis for continuous, binary, and ordinal traits in experimental crosses.

13.
Sexual selection drives fundamental evolutionary processes such as trait elaboration and speciation. Despite this importance, there are surprisingly few examples of genes unequivocally responsible for variation in sexually selected phenotypes. This lack of information inhibits our ability to predict phenotypic change due to universal behaviours, such as fighting over mates and mate choice. Here, we discuss reasons for this apparent gap and provide recommendations for how it can be overcome by adopting contemporary genomic methods, exploiting underutilized taxa that may be ideal for detecting the effects of sexual selection and adopting appropriate experimental paradigms. Identifying genes that determine variation in sexually selected traits has the potential to improve theoretical models and reveal whether the genetic changes underlying phenotypic novelty utilize common or unique molecular mechanisms. Such a genomic approach to sexual selection will help answer questions in the evolution of sexually selected phenotypes that were first asked by Darwin and can furthermore serve as a model for the application of genomics in all areas of evolutionary biology.

14.
Mapping quantitative trait loci (QTL) underlying complex discrete traits, which usually show a discontinuous distribution and carry less information, is challenging with conventional statistical methods. The Bayesian Markov chain Monte Carlo (Bayesian-MCMC) approach is the key procedure in mapping QTL for complex binary traits; it provides a complete posterior distribution for the QTL parameters using all prior information. As a consequence, Bayesian estimates of all variables of interest can be obtained straightforwardly from their posterior samples simulated by the MCMC algorithm. In our study, the utility of Bayesian-MCMC is demonstrated using several simulated outbred animal full-sib families with different family structures for a complex binary trait underlain by both a QTL and a polygene. Under the identity-by-descent-based variance component random model, three MCMC-based samplers, namely Gibbs sampling, the Metropolis algorithm and reversible jump MCMC, were implemented to generate the joint posterior distribution of all unknowns, so that the QTL parameters were obtained by Bayesian statistical inference. The results showed that the Bayesian-MCMC approach works well and is robust under different family structures and QTL effects. As family size increases and the number of families decreases, the accuracy of the parameter estimates improves. When the true QTL has a small effect, an outbred population experimental design with large family size is the optimal mapping strategy.
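Of the three samplers named above, the Metropolis algorithm is the simplest to show in isolation. The sketch below is a generic random-walk Metropolis sampler applied to a stand-in target (a standard normal density), not the variance component QTL model of the study; the function names, step size, seed and sample count are assumptions made for illustration.

```python
import math
import random


def metropolis(log_density, start, n_samples, step_sd, rng):
    """Random-walk Metropolis sampler for a one-dimensional target."""
    samples = []
    current = start
    current_logp = log_density(current)
    for _ in range(n_samples):
        # Symmetric Gaussian proposal centred on the current state.
        proposal = current + rng.gauss(0.0, step_sd)
        proposal_logp = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        accept_prob = math.exp(min(0.0, proposal_logp - current_logp))
        if rng.random() < accept_prob:
            current, current_logp = proposal, proposal_logp
        samples.append(current)
    return samples


def log_standard_normal(x):
    """Log density of a standard normal, up to an additive constant."""
    return -0.5 * x * x


rng = random.Random(0)  # fixed seed so the run is reproducible
samples = metropolis(log_standard_normal, start=0.0, n_samples=20000,
                     step_sd=2.0, rng=rng)

posterior_mean = sum(samples) / len(samples)
posterior_var = sum((s - posterior_mean) ** 2 for s in samples) / len(samples)
print(f"mean ~ {posterior_mean:.2f}, variance ~ {posterior_var:.2f}")
```

Posterior summaries such as the mean and variance are read directly off the simulated samples, which is the sense in which the abstract says Bayesian estimates follow straightforwardly from the MCMC output.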

15.
An ultimate goal of genetic research is to understand the connection between genotype and phenotype in order to improve the diagnosis and treatment of diseases. The quantitative genetics field has developed a suite of statistical methods to associate genetic loci with diseases and phenotypes, including quantitative trait loci (QTL) linkage mapping and genome-wide association studies (GWAS). However, each of these approaches has technical and biological shortcomings. For example, the amount of heritable variation explained by GWAS is often surprisingly small and the resolution of many QTL linkage mapping studies is poor. The predictive power and interpretation of QTL and GWAS results are consequently limited. In this study, we propose a complementary approach to quantitative genetics by interrogating the vast amount of high-throughput genomic data in model organisms to functionally associate genes with phenotypes and diseases. Our algorithm combines the genome-wide functional relationship network for the laboratory mouse and a state-of-the-art machine learning method. We demonstrate the superior accuracy of this algorithm through predicting genes associated with each of 1157 diverse phenotype ontology terms. Comparison between our prediction results and a meta-analysis of quantitative genetic studies reveals both overlapping candidates and distinct, accurate predictions uniquely identified by our approach. Focusing on bone mineral density (BMD), a phenotype related to osteoporotic fracture, we experimentally validated two of our novel predictions (not observed in any previous GWAS/QTL studies) and found significant bone density defects for both Timp2 and Abcg8 deficient mice. Our results suggest that the integration of functional genomics data into networks, which itself is informative of protein function and interactions, can successfully be utilized as a complementary approach to quantitative genetics to predict disease risks.
All supplementary material is available at http://cbfg.jax.org/phenotype.

16.
Phenotypic variation within populations has two sources: genetic variation and environmental variation. Here, we investigate the coevolution of these two components under fluctuating selection. Our analysis is based on the lottery model in which genetic polymorphism can be maintained by negative frequency-dependent selection, whereas environmental variation can be favored due to bet-hedging. In our model, phenotypes are characterized by a quantitative trait under stabilizing selection with the optimal phenotype fluctuating in time. Genotypes are characterized by their phenotypic offspring distribution, which is assumed to be Gaussian with heritable variation for its mean and variance. Polymorphism in the mean corresponds to genetic variance while the width of the offspring distribution corresponds to environmental variance. We show that increased environmental variance is favored whenever fluctuations in the selective optima are sufficiently strong. Given the environmental variance has evolved to its optimum, genetic polymorphism can still emerge if the distribution of selective optima is sufficiently asymmetric or leptokurtic. Polymorphism evolves in a diagonal direction in trait space: one type becomes a canalized specialist for the more common ecological conditions and the other type a de-canalized bet-hedger thriving on the less-common conditions. All results are based on analytical approximations, complemented by individual-based simulations.

17.
We have developed an integrated approach, using genetic and genomic methods, in conjunction with resources from the Southwest National Primate Research Center (SNPRC) baboon colony, for the identification of genes and their functional variants that encode quantitative trait loci (QTL). In addition, we use comparative genomic methods to overcome the paucity of baboon specific reagents and to augment translation of our findings in a nonhuman primate (NHP) to the human population. We are using the baboon as a model to study the genetics of cardiovascular disease (CVD). A key step for understanding gene–environment interactions in cardiovascular disease is the identification of genes and gene variants that influence CVD phenotypes. We have developed a sequential methodology that takes advantage of the SNPRC pedigreed baboon colony, the annotated human genome, and current genomic and bioinformatic tools. The process of functional polymorphism identification for genes encoding QTLs involves comparison of expression profiles for genes and predicted genes in the genomic region of the QTL for individuals discordant for the phenotypic trait mapping to the QTL. After comparison, genes of interest are prioritized, and functional polymorphisms are identified in candidate genes by genotyping and quantitative trait nucleotide analysis. This approach reduces the time and labor necessary to prioritize and identify genes and their polymorphisms influencing variation in a quantitative trait compared with traditional positional cloning methods.

18.
Mapping quantitative trait loci (QTL) underlying complex discrete traits, which usually show a discontinuous distribution and carry less information, is challenging with conventional statistical methods. The Bayesian Markov chain Monte Carlo (Bayesian-MCMC) approach is the key procedure in mapping QTL for complex binary traits; it provides a complete posterior distribution for the QTL parameters using all prior information. As a consequence, Bayesian estimates of all variables of interest can be obtained straightforwardly from their posterior samples simulated by the MCMC algorithm. In our study, the utility of Bayesian-MCMC is demonstrated using several simulated outbred animal full-sib families with different family structures for a complex binary trait underlain by both a QTL and a polygene. Under the identity-by-descent-based variance component random model, three MCMC-based samplers, namely Gibbs sampling, the Metropolis algorithm and reversible jump MCMC, were implemented to generate the joint posterior distribution of all unknowns, so that the QTL parameters were obtained by Bayesian statistical inference. The results showed that the Bayesian-MCMC approach works well and is robust under different family structures and QTL effects. As family size increases and the number of families decreases, the accuracy of the parameter estimates improves. When the true QTL has a small effect, an outbred population experimental design with large family size is the optimal mapping strategy.

19.
20.
Experimental evolution via continuous culture is a powerful approach to the alteration of complex phenotypes, such as optimal/maximal growth temperatures. The benefit of this approach is that phenotypic selection is tied to growth rate, allowing the production of optimized strains. Herein, we demonstrate the use of a recently described long-term culture apparatus called the Evolugator for the generation of a thermophilic descendant from a mesophilic ancestor (Escherichia coli MG1655). In addition, we used whole-genome sequencing of sequentially isolated strains throughout the thermal adaptation process to characterize the evolutionary history of the resultant genotype, identifying 31 genetic alterations that may contribute to thermotolerance, although some of these mutations may be adaptive for off-target environmental parameters, such as rich medium. We undertook preliminary phenotypic analysis of mutations identified in the glpF and fabA genes. Deletion of glpF in a mesophilic wild-type background conferred significantly improved growth rates in the 43-to-48°C temperature range and altered optimal growth temperature from 37°C to 43°C. In addition, transforming our evolved thermotolerant strain (EVG1064) with a wild-type allele of glpF reduced fitness at high temperatures. On the other hand, the mutation in fabA predictably increased the degree of saturation in membrane lipids, which is a known adaptation to elevated temperature. However, transforming EVG1064 with a wild-type fabA allele had only modest effects on fitness at intermediate temperatures. The Evolugator is fully automated and demonstrates the potential to accelerate the selection for complex traits by experimental evolution and significantly decrease development time for new industrial strains.
