1.

Model selection in binary trait locus mapping

Coffman CJ Doerge RW Simonsen KL Nichols KM Duarte CK Wolfinger RD McIntyre LM 《Genetics》2005,170(3):1281-1297

Quantitative trait locus (QTL) mapping methodology for continuous normally distributed traits is the subject of much attention in the literature. Binary trait locus (BTL) mapping in experimental populations has received much less attention. A binary trait by definition has only two possible values, and the penetrance parameter is restricted to values between zero and one. Due to this restriction, the infinitesimal model appears to come into play even when only a few loci are involved, making selection of an appropriate genetic model in BTL mapping challenging. We present a probability model for an arbitrary number of BTL and demonstrate that, given adequate sample sizes, the power for detecting loci is high under a wide range of genetic models, including most epistatic models. A novel model selection strategy based upon the underlying genetic map is employed for choosing the genetic model. We propose selecting the "best" marker from each linkage group, regardless of significance. This reduces the model space so that an efficient search for epistatic loci can be conducted without invoking stepwise model selection. This procedure can identify unlinked epistatic BTL, demonstrated by our simulations and the reanalysis of Oncorhynchus mykiss experimental data.
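
The marker-selection step described in this abstract can be illustrated with a minimal sketch. It assumes that a single-marker test statistic has already been computed for every marker; the function name, the tuple layout, and the example values are hypothetical and are not taken from the paper.

```python
def best_marker_per_group(markers):
    """Return, for each linkage group, the marker with the largest statistic.

    `markers` is an iterable of (linkage_group, marker_name, statistic)
    tuples. No significance threshold is applied, mirroring the
    "regardless of significance" rule in the abstract.
    """
    best = {}
    for group, name, stat in markers:
        # Keep the first marker seen for a group, then replace it only
        # when a strictly larger statistic turns up.
        if group not in best or stat > best[group][1]:
            best[group] = (name, stat)
    return {group: name for group, (name, _stat) in best.items()}


# Hypothetical single-marker statistics for three linkage groups.
example = [
    ("LG1", "m1", 2.1), ("LG1", "m2", 5.4),
    ("LG2", "m3", 0.3), ("LG2", "m4", 0.2),
    ("LG3", "m5", 7.9),
]
selected = best_marker_per_group(example)
print(selected)
```

With one marker per linkage group, the candidate set is small enough that main effects and pairwise interactions could be examined directly, which is presumably what makes a non-stepwise search feasible.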

2.

Asymptotic distribution of the lod score for familial data (total citations: 1; self-citations: 0; citations by others: 1)

J J Tai C L Chen 《Proceedings of the National Science Council, Republic of China. Part B, Life sciences》1989,13(1):38-41

Using a linkage model with mixed parental mating types between a trait locus and a marker locus, the asymptotic null distribution of the statistic U = 2ln(10)Z(theta) was simulated and compared to the chi-square type distribution function 1/2 + 1/2 Pr[chi(2)(1) < mu]. The simulation results show that the chi-square approximation fits the asymptotic null distribution well when each of the two loci is dominated by one of its two alleles, or when the linkage phase tends to disequilibrium.
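
As a worked illustration of the comparison distribution, the sketch below converts a lod score Z into the statistic U = 2ln(10)Z and evaluates the upper tail of the half-and-half mixture of a point mass at zero and a chi-square with one degree of freedom. The function name is hypothetical; only the Python standard library is used, relying on the identity Pr[chi-square(1) > u] = erfc(sqrt(u/2)).

```python
import math


def lod_to_asymptotic_p(lod):
    """Pr[U >= u] under the 1/2 + 1/2 chi-square(1) null, with u = 2 ln(10) lod."""
    if lod <= 0:
        # Half of the null mass sits at zero, so every outcome is >= 0.
        return 1.0
    u = 2.0 * math.log(10.0) * lod
    # Only the chi-square(1) half of the mixture can exceed a positive u.
    return 0.5 * math.erfc(math.sqrt(u / 2.0))


p_at_lod_3 = lod_to_asymptotic_p(3.0)
print(p_at_lod_3)
```

Under this approximation a lod score of 3 corresponds to a p value of roughly 0.0001.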

3.

Appropriate likelihood ratio tests and marginal distributions for evolutionary tree models with constraints on parameters

Ota R Waddell PJ Hasegawa M Shimodaira H Kishino H 《Molecular biology and evolution》2000,17(5):798-803

4.

Prediction of empirical p values from asymptotic p values for conditional logistic affected relative pair linkage analysis

Sinha M Song Y Elston RC Olson JM Goddard KA 《Human heredity》2006,61(1):45-54

OBJECTIVE: P values are inaccurate for model-free linkage analysis using the conditional logistic model if we assume that the LOD score is asymptotically distributed as a simple mixture of chi-square distributions. When analyzing affected relative pairs alone, permuting the allele sharing of relative pairs does not lead to a useful permutation distribution. As an alternative, we have developed regression prediction models that provide more accurate p values. METHODS: Let E(alpha) be the empirical p value, which is the proportion of statistical tests whose LOD score under the null hypothesis exceeds a threshold determined by alpha, the nominal single test significance value. We used simulated data to obtain values of E(alpha) and compared them with alpha. We also developed a regression model, based on sample size, number of covariates in the model, alpha and marker density, to derive predicted p values for both single-point and multipoint analyses. To evaluate our predictions we used another set of simulated data, comparing the E(alpha) for these data with those obtained by using the prediction model, referred to as predicted p values (P(alpha)). RESULTS: Under almost all circumstances the values of P(alpha) were closer to the E(alpha) than were the values of alpha. CONCLUSION: The regression models suggested by our analysis provide more accurate alternative p values for model-free linkage analysis when using the conditional logistic model.
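
The definition of E(alpha) given in the METHODS can be written out as a short sketch. The simulated LOD scores and the threshold below are made-up illustrative numbers, and the function name is hypothetical; in the study the threshold would be the LOD value that the assumed asymptotic distribution assigns to the nominal alpha.

```python
def empirical_p(null_lods, threshold):
    """Proportion of null-hypothesis LOD scores that exceed `threshold`."""
    if not null_lods:
        raise ValueError("need at least one simulated LOD score")
    exceed = sum(1 for lod in null_lods if lod > threshold)
    return exceed / len(null_lods)


# Hypothetical LOD scores from ten replicates simulated under no linkage.
null_lods = [0.0, 0.0, 0.1, 0.4, 0.0, 1.2, 0.7, 0.0, 2.3, 0.05]
# Hypothetical threshold corresponding to some nominal alpha.
e_alpha = empirical_p(null_lods, threshold=0.588)
print(e_alpha)
```

Comparing `e_alpha` with the nominal alpha over many thresholds is the kind of calibration check that the regression prediction model is meant to replace.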

5.

MOD-score analysis with simple pedigrees: an overview of likelihood-based linkage methods

Strauch K 《Human heredity》2007,64(3):192-202

A MOD-score analysis, in which the parametric LOD score is maximized with respect to the trait-model parameters, can be a powerful method for the mapping of complex traits. With affected sib pairs, it has been shown before that MOD scores asymptotically follow a mixture of chi(2) distributions with 2, 1 and 0 degrees of freedom under the null hypothesis of no linkage. In that context, a MOD-score analysis yields some (albeit limited) information regarding the trait-model parameters, and there is a chance for an increased power compared to a simple LOD-score analysis. Here, it is shown that with unilineal affected relative pairs, MOD scores asymptotically follow a mixture of chi(2) distributions with 1 and 0 degrees of freedom under the null hypothesis, that is, the same distribution as followed by simple LOD scores. No information regarding the trait model can be obtained in this setting, and no power is gained when compared to a LOD-score analysis. An outlook to larger pedigrees is given. The number of degrees of freedom underlying the null distribution of MOD scores, which depends on the type of pedigrees studied, corresponds to the number of explored dimensions related to power and to the number of parameters that can jointly be estimated.

6.

Regression-based sib pair linkage analysis for binary traits

Zeegers MP Rice JP Rijsdijk FV Abecasis GR Sham PC 《Human heredity》2003,55(2-3):125-131

The Haseman-Elston (HE) regression method offers a mathematically and computationally simpler alternative to variance-components (VC) models for the linkage analysis of quantitative traits. However, current versions of HE regression and VC models are not optimised for binary traits. Here, we present a modified HE regression and a liability-threshold VC model for binary traits. The new HE method is based on the regression of a linear combination of the trait squares and the trait cross-product on the proportion of alleles identical by descent (IBD) at the putative locus, for sibling pairs. We have implemented the new HE regression-based method and have performed analytic and simulation studies to assess its type 1 error rate and power under a range of conditions. These studies showed that the new HE method is well-behaved under the null hypothesis in large samples, is more powerful than both the original and the revisited HE methods, and is approximately equivalent in power to the liability-threshold VC model.

7.

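Association analysis of quantitative traits based on selective genotyping

向阳 李玉梅 孙振球 《生物数学学报》 (Journal of Biomathematics) 2009,(4):599-608

Genetic analysis of quantitative traits can be carried out by "selective genotyping". This paper proposes a statistic T for association analysis of quantitative trait loci (QTL) using extreme samples. The statistic T compares the differences in trait values among individuals with homozygous marker genotypes in the upper extreme sample. Computer simulations were used to examine the distribution of T and its type I error rate when there is no association. The results show that, under a variety of sample selection strategies, the distribution of T is approximately chi-square and the type I error rate is close to the nominal significance level. The effects of heritability, sample size, and sample selection threshold on the statistical power of T were also examined under various genetic models. The results show that the power of T increases with the strength of linkage disequilibrium between the marker and the QTL and with heritability and sample size, and that power is also higher when the sample selection threshold is more stringent.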

8.

Mapping Quantitative Trait Loci for Complex Binary Diseases Using Line Crosses (total citations: 15; self-citations: 7; citations by others: 8)

S. Xu W. R. Atchley 《Genetics》1996,143(3):1417-1424

A composite interval gene mapping procedure for complex binary disease traits is proposed in this paper. The binary trait of interest is assumed to be controlled by an underlying liability that is normally distributed. The liability is treated as a typical quantitative character and thus described by the usual quantitative genetics model. Translation from the liability into a binary (disease) phenotype is through the physiological threshold model. Logistic regression analysis is employed to estimate the effects and locations of putative quantitative trait loci (our terminology for a single quantitative trait locus is QTL while multiple loci are referred to as QTLs). Simulation studies show that properties of this mapping procedure mimic those of the composite interval mapping for normally distributed data. Potential utilization of the QTL mapping procedure for resolving alternative genetic models (e.g., single- or two-trait-locus model) is discussed.

9.

Mapping binary trait loci in the F(2:3) design (total citations: 1; self-citations: 0; citations by others: 1)

Zhu C Huang J Zhang YM 《The Journal of heredity》2007,98(4):337-344

In the inheritance analysis of quantitative traits with low heritability, the precision is relatively low. In this situation, an F(2:3) design, which is genotyped in F(2) plants and phenotyped in the F(2:3) progeny, is applied to increase the precision in the detection of quantitative trait loci (QTL). This is because the residual variance of family-mean-based observations is significantly decreased by increasing the number of F(2:3) progeny. Our previous results showed that the mixture distribution for the F(2:3) family of a heterozygous F(2) plant can significantly increase the power of QTL detection relative to the classical F(2) design. In this article, we extended our previous method from continuous traits to binary traits in the F(2:3) design. The method here also takes full advantage of the mixture distribution. However, the method presented here differs from our previous method in 2 aspects. One is that the penetrance model is integrated with the liability model for mapping binary trait loci (BTL), and another is that the phenotypic data used in the analysis are the sum of phenotypic values of F(2:3) progeny derived from each F(2) plant rather than the average of F(2:3) progeny, because the sum follows a binomial distribution. In addition, the threshold in the liability model could also be estimated. Therefore, a new framework of mapping BTL on the basis of a single BTL model was set up and implemented via the Expectation-Maximization algorithm. Results of simulated studies showed that the proposed method provides accurate estimates for both the effects and the locations of BTL, with high statistical power even under low heritability. With the new method, we are ready to map BTL, as we can do for quantitative traits under the F(2:3) design. The computer program performing the analysis of the simulated data is available to users for real data analysis.

10.

Power and sample size calculations for genetic case/control studies using gene-centric SNP maps: application to human chromosomes 6, 21, and 22 in three populations

De La Vega FM Gordon D Su X Scafe C Isaac H Gilbert DA Spier EG 《Human heredity》2005,60(1):43-60

Power and sample size calculations are critical parts of any research design for genetic association. We present a method that utilizes haplotype frequency information and average marker-marker linkage disequilibrium on SNPs typed in and around all genes on a chromosome. The test statistic used is the classic likelihood ratio test applied to haplotypes in case/control populations. Haplotype frequencies are computed through specification of genetic model parameters. Power is determined by computation of the test's non-centrality parameter. Power per gene is computed as a weighted average of the power assuming each haplotype is associated with the trait. We apply our method to genotype data from dense SNP maps across three entire chromosomes (6, 21, and 22) for three different human populations (African-American, Caucasian, Chinese), three different models of disease (additive, dominant, and multiplicative) and two trait allele frequencies (rare, common). We perform a regression analysis using these factors, average marker-marker disequilibrium, and the haplotype diversity across the gene region to determine which factors most significantly affect average power for a gene in our data. Also, as a 'proof of principle' calculation, we perform power and sample size calculations for all genes within 100 kb of the PSORS1 locus (chromosome 6) for a previously published association study of psoriasis. Results of our regression analysis indicate that four highly significant factors that determine average power to detect association are: disease model, average marker-marker disequilibrium, haplotype diversity, and the trait allele frequency. These findings may have important implications for the design of well-powered candidate gene association studies. 
Our power and sample size calculations for the PSORS1 gene appear consistent with published findings, namely that there is substantial power (>0.99) for most genes within 100 kb of the PSORS1 locus at the 0.01 significance level.

11.

The quest for trait convergence and divergence in community assembly: are null‐models the magic wand?

Francesco de Bello 《Global Ecology and Biogeography》2012,21(3):312-317

The relevance of neutral versus niche‐based community assembly rules (i.e. the processes sorting species present in a larger geographical region into local communities) remains to be demonstrated in ecology and biogeography. To attempt to do this, a number of complex null models are increasingly being used that compare observed community functional diversity (FD, i.e. the extent of trait dissimilarity between coexisting species) with randomly simulated FD. However, little is known about the performance of these null models in detecting non‐neutral community assembly rules such as trait convergence and divergence of communities (supposedly revealing habitat selection and limiting similarity, respectively). Here, using both simulated and field communities, I show that assembly rule detection varies systematically with the magnitude of the observed FD, so that these null models do not really succeed in breaking down the observed functional relationships between species. This is a particular concern, making detection of community assembly dependent on: (1) the pool of samples considered, and (2) the capacity of observed FD to correctly discriminate these rules. Null models should be more thoroughly described and validated before being considered as a magic wand to reveal assembly patterns.

12.

The Influence of Matrix Size on Statistical Properties of Co-Occurrence and Limiting Similarity Null Models

Thomas Michael Lavender Brandon S. Schamp Eric G. Lamb 《PloS one》2016,11(3)

Null models exploring species co-occurrence and trait-based limiting similarity are increasingly used to explore the influence of competition on community assembly; however, assessments of common models have not thoroughly explored the influence of variation in matrix size on error rates, in spite of the fact that studies have explored community matrices that vary considerably in size. To determine how smaller matrices, which are of greatest concern, perform statistically, we generated biologically realistic presence-absence matrices ranging in size from 3–50 species and sites, as well as associated trait matrices. We examined co-occurrence tests using the C-Score statistic and independent swap algorithm. For trait-based limiting similarity null models, we used the mean nearest neighbour trait distance (NN) and the standard deviation of nearest neighbour distances (SDNN) as test statistics, and considered two common randomization algorithms: abundance independent trait shuffling (AITS), and abundance weighted trait shuffling (AWTS). Matrices as small as three × three resulted in acceptable type I error rates (p < 0.05) for both the co-occurrence and trait-based limiting similarity null models when exclusive p-values were used. The commonly used inclusive p-value (≤ or ≥, as opposed to exclusive p-values; < or >) was associated with increased type I error rates, particularly for matrices with fewer than eight species. Type I error rates increased for limiting similarity tests using the AWTS randomization scheme when community matrices contained more than 35 sites; a similar randomization used in null models of phylogenetic dispersion has previously been viewed as robust. Notwithstanding other potential deficiencies related to the use of small matrices to represent communities, the application of both classes of null model should be restricted to matrices with 10 or more species to avoid the possibility of type II errors. 
Additionally, researchers should restrict the use of the AWTS randomization to matrices with fewer than 35 sites to avoid type I errors when testing for trait-based limiting similarity. The AITS randomization scheme performed better in terms of type I error rates, and therefore may be more appropriate when considering systems for which traits are not clustered by abundance.
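
The distinction between exclusive (< or >) and inclusive (≤ or ≥) p-values drawn in this abstract can be made concrete with a small sketch. It shows only the upper tail, and the function name and the null values are hypothetical; how the study's software computes its p-values is not specified in the abstract.

```python
def randomization_p_values(observed, null_values):
    """Upper-tail p-values of `observed` against a randomization null.

    Returns (exclusive, inclusive): the fraction of null values strictly
    greater than the observed statistic, and the fraction greater than
    or equal to it.
    """
    n = len(null_values)
    if n == 0:
        raise ValueError("need at least one null value")
    exclusive = sum(v > observed for v in null_values) / n
    inclusive = sum(v >= observed for v in null_values) / n
    return exclusive, inclusive


# Hypothetical null statistics with many ties, as can happen when a
# small matrix admits only a few distinct randomizations.
null_values = [1.0, 1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 4.0]
exclusive, inclusive = randomization_p_values(3.0, null_values)
print(exclusive, inclusive)
```

The two values differ only by the share of null statistics tied with the observed one, so the choice matters most when few distinct randomized values are possible.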

13.

Theoretical and empirical power of regression and maximum-likelihood methods to map quantitative trait loci in general pedigrees

Yu X Knott SA Visscher PM 《American journal of human genetics》2004,75(1):17-26

Both theoretical calculations and simulation studies have been used to compare and contrast the statistical power of methods for mapping quantitative trait loci (QTLs) in simple and complex pedigrees. A widely used approach in such studies is to derive or simulate the expected mean test statistic under the alternative hypothesis of a segregating QTL and to equate a larger mean test statistic with larger power. In the present study, we show that, even when the test statistic under the null hypothesis of no linkage follows a known asymptotic distribution (the standard being chi(2)), it cannot be assumed that the distribution under the alternative hypothesis is noncentral chi(2). Hence, mean test statistics cannot be used to indicate power differences, and a comparison between methods that are based on simulated average test statistics may lead to the wrong conclusion. We illustrate this important finding, through simulations and analytical derivations, for a recently proposed new regression method for the analysis of general pedigrees to map quantitative trait loci. We show that this regression method is not necessarily more powerful nor computationally more efficient than a maximum-likelihood variance-component approach. We advocate the use of empirical power to compare trait-mapping methods.

14.

Trait‐based approach confirms the importance of propagule limitation and assembly rules in old‐field restoration

Melinda Halassy Zoltán Botta‐Dukát Anikó Csecserits Katalin Szitár Katalin Török 《Restoration Ecology》2019,27(4):840-849

Community assembly theory is suggested as a guiding principle for ecological restoration to help understand the mechanisms that structure biological communities and identify where restoration interventions are needed. We studied three hypotheses related to propagule limitation, stress‐dominance, and limiting similarity concepts in community assembly in a restoration field experiment with a trait‐based null model approach. The experiment aimed to assist the recovery of sand grassland on former arable land in the Kiskunság, Pannonian biogeographic region, Europe. Treatments included initial seeding of five grassland species, carbon amendment, low‐intensity mowing, and combinations in 1 m by 1 m plots in three old fields from 2003 to 2008. The distribution of 10 individual plant traits was compared to the null model and the effects of time and treatments were tested with linear mixed effect models. Initial seeding had the most visible impact on species and trait composition, confirming propagule limitation in grassland recovery. Reducing nutrient availability through carbon amendment strengthened trait convergence for length of flowering as expected based on the stress‐dominance hypothesis. Mowing changed trait divergence to convergence for plant height with a strengthening impact with time, supporting our hypothesis of increasing dominance of limiting similarity with time. Our results support the idea that community assembly is simultaneously influenced by propagule limitation and multiple trait‐based processes that act through different traits. The limited impact of manipulating environmental filtering and limiting similarity compared to seeding, however, supports the view that only targeting the dispersal and environmental filters in parallel would improve restoration outcome.

15.

The null distribution of the heterogeneity lod score does depend on the assumed genetic model for the trait.

J Huang V J Vieland 《Human heredity》2001,52(4):217-222

It is well known that the asymptotic null distribution of the homogeneity lod score (LOD) does not depend on the genetic model specified in the analysis. When appropriately rescaled, the LOD is asymptotically distributed as 0.5 chi(2)(0) + 0.5 chi(2)(1), regardless of the assumed trait model. However, because locus heterogeneity is a common phenomenon, the heterogeneity lod score (HLOD), rather than the LOD itself, is often used in gene mapping studies. We show here that, in contrast with the LOD, the asymptotic null distribution of the HLOD does depend upon the genetic model assumed in the analysis. In affected sib pair (ASP) data, this distribution can be worked out explicitly as (0.5 - c) chi(2)(0) + 0.5 chi(2)(1) + c chi(2)(2), where c depends on the assumed trait model. E.g., for a simple dominant model (HLOD/D), c is a function of the disease allele frequency p: for p = 0.01, c = 0.0006; while for p = 0.1, c = 0.059. For a simple recessive model (HLOD/R), c = 0.098 independently of p. This latter (recessive) distribution turns out to be the same as the asymptotic distribution of the MLS statistic under the possible triangle constraint, which is asymptotically equivalent to the HLOD/R. The null distribution of the HLOD/D is close to that of the LOD, because the weight c on the chi(2)(2) component is small. These results mean that the cutoff value for a test of size alpha will tend to be smaller for the HLOD/D than the HLOD/R. For example, the alpha = 0.0001 cutoff (on the lod scale) for the HLOD/D with p = 0.05 is 3.01, while for the LOD it is 3.00, and for the HLOD/R it is 3.27. For general pedigrees, explicit analytical expression of the null HLOD distribution does not appear possible, but it will still depend on the assumed genetic model.
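
The dependence of the cutoff on the weight c can be explored with a short numerical sketch. It assumes that "appropriately rescaled" means multiplying the lod-scale statistic by 2ln(10), and it solves for the cutoff by bisection; the function names are hypothetical. The tail probabilities use the closed forms Pr[chi-square(1) > x] = erfc(sqrt(x/2)) and Pr[chi-square(2) > x] = exp(-x/2).

```python
import math


def mixture_tail(x, c):
    """Pr[X > x], x > 0, for X ~ (0.5 - c) chi2(0) + 0.5 chi2(1) + c chi2(2)."""
    tail_df1 = math.erfc(math.sqrt(x / 2.0))  # chi-square with 1 df
    tail_df2 = math.exp(-x / 2.0)             # chi-square with 2 df
    # The chi2(0) component is a point mass at zero and never exceeds x > 0.
    return 0.5 * tail_df1 + c * tail_df2


def lod_cutoff(alpha, c, lo=1e-9, hi=200.0):
    """Lod-scale cutoff of a size-alpha test, found by bisection."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mixture_tail(mid, c) > alpha:
            lo = mid  # tail still too heavy: the cutoff lies above mid
        else:
            hi = mid
    # Convert from the rescaled (chi-square) scale back to the lod scale.
    return 0.5 * (lo + hi) / (2.0 * math.log(10.0))


lod_cut = lod_cutoff(1e-4, c=0.0)        # plain LOD, no chi2(2) component
hlod_r_cut = lod_cutoff(1e-4, c=0.098)   # recessive model weight from the abstract
print(round(lod_cut, 2), round(hlod_r_cut, 2))
```

Under these assumptions the computed cutoffs come out close to the values quoted in the abstract, with the recessive-model cutoff clearly above the plain LOD cutoff; small discrepancies in the second decimal place may reflect rounding or details of the rescaling that the abstract does not state.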

16.

Linkage disequilibrium mapping of quantitative trait loci under truncation selection

Xiong M Fan R Jin L 《Human heredity》2002,53(3):158-172

As dense maps of single nucleotide polymorphism (SNP) markers become available, population-based linkage disequilibrium (LD) mapping, or association study, is becoming one of the major tools for identifying quantitative trait loci (QTL) and for fine gene mapping. However, in many cases, LD between the marker and trait locus is not very strong. Approaches that maximize the potential of detecting LD will be essential for the success of LD mapping of QTL. In this paper, we propose two strategies for increasing the probability of detecting LD: (1) phenotypic selection and (2) haplotype LD mapping. To provide the foundations for LD mapping of QTL under selection, we develop analytic tools for assessing the impact of phenotypic selection on allele and haplotype frequencies, and LD under three trait models: single trait locus, two unlinked trait loci, and two linked trait loci with or without epistasis. In addition to a traditional chi(2) test, which compares the difference in allele or haplotype frequencies in the selected sample and population sample, we present multiple regression methods for LD mapping of QTL, and investigate which methods are effective in employing phenotypic selection for QTL mapping. We also develop a statistical framework for investigating and comparing the power of the single marker and multilocus haplotype test for LD mapping of QTL. Finally, the proposed methods are applied to mapping QTL influencing variation in systolic blood pressure in an isolated Chinese population.

17.

Detection and localization of a single binary trait locus in experimental populations

McIntyre LM Coffman CJ Doerge RW 《Genetical research》2001,78(1):79-92

The advancements made in molecular technology coupled with statistical methodology have led to the successful detection and location of genomic regions (quantitative trait loci; QTL) associated with quantitative traits. Binary traits (e.g. susceptibility/resistance), while not quantitative in nature, are equally important for the purpose of detecting and locating significant associations with genomic regions. Existing interval regression methods used in binary trait analysis are adapted from quantitative trait analysis and the tests for regression coefficients are tests of effect, not detection. Additionally, estimates of recombination that fail to take into account varying penetrance perform poorly when penetrance is incomplete. In this work a complete probability model for binary trait data is developed allowing for unbiased estimation of both penetrance and recombination between a genetic marker locus and a binary trait locus for backcross and F2 experimental designs. The regression model is reparameterized allowing for tests of detection. Extensive simulations were conducted to assess the performance of estimation and testing in the proposed parameterization. The proposed parameterization was compared with interval regression via simulation. The results indicate that our parameterization shows equivalent estimation capabilities, requires less computational effort and works well with only a single marker.

18.

Percentiles of the null distribution of 2 maximum lod score tests

Ulgen A Yoo YJ Gordon D Finch SJ Mendell NR 《Human heredity》2004,57(1):39-48

We here consider the null distribution of the maximum lod score (LOD-M) obtained upon maximizing over transmission model parameters (penetrance values, dominance, and allele frequency) as well as the recombination fraction. Also considered is the lod score maximized over a fixed choice of genetic model parameters and recombination-fraction values set prior to the analysis (MMLS) as proposed by Hodge et al. The objective is to fit parametric distributions to MMLS and LOD-M. Our results are based on 3,600 simulations of samples of n = 100 nuclear families ascertained for having one affected member and at least one other sibling available for linkage analysis. Each null distribution is approximately a mixture p chi(2)(0) + (1 - p) chi(2)(v). The values of MMLS appear to fit the mixture 0.20 chi(2)(0) + 0.80 chi(2)(1.6). The mixture distribution 0.13 chi(2)(0) + 0.87 chi(2)(2.8) appears to describe the null distribution of LOD-M. From these results we derive a simple method for obtaining critical values of LOD-M and MMLS.

19.

Mapping quantitative trait loci for binary trait in the F2:3 design

Chengsong Zhu Yuan-Ming Zhang Zhigang Guo 《Journal of genetics》2008,87(3):201-207

In the analysis of inheritance of quantitative traits with low heritability, an F2:3 design that genotypes plants in F2 and phenotypes plants in F2:3 progeny is often used in plant genetics. Although statistical approaches for mapping quantitative trait loci (QTL) in the F2:3 design have been well developed, those for binary traits of biological interest and economic importance are seldom addressed. In this study, an attempt was made to map binary trait loci (BTL) in the F2:3 design. The fundamental idea was: the F2 plants were genotyped, all phenotypic values of each F2:3 progeny were measured for the binary trait, and these binary trait values and the marker genotype information were used to detect BTL under the penetrance and liability models. The proposed method was verified by a series of Monte-Carlo simulation experiments. These results showed that maximum likelihood approaches under the penetrance and liability models provide accurate estimates for the effects and the locations of BTL with high statistical power, even under low heritability. Moreover, the penetrance model is as efficient as the liability model, and the F2:3 design is more efficient than the classical F2 design, even though only a single progeny is collected from each F2:3 family. With the maximum likelihood approaches under the penetrance and the liability models developed in this study, we can map binary traits as we can do for quantitative traits in the F2:3 design.

20.

Estimating linkage disequilibrium between a polymorphic marker locus and a trait locus in natural populations.

Z W Luo S Suhai 《Genetics》1999,151(1):359-371

Positional cloning of gene(s) underlying a complex trait requires a high-resolution linkage map between the trait locus and genetic marker loci. Recent research has shown that this may be achieved through appropriately modeling and screening linkage disequilibrium between the candidate marker locus and the major trait locus. A quantitative genetics model was developed in the present study to estimate the coefficient of linkage disequilibrium between a polymorphic genetic marker locus and a locus underlying a quantitative trait as well as the relevant genetic parameters using the sample from randomly mating populations. Asymptotic covariances of the maximum-likelihood estimates of the parameters were formulated. Convergence of the EM-based statistical algorithm for calculating the maximum-likelihood estimates was confirmed and its utility to analyze practical data was exploited by use of extensive Monte-Carlo simulations. Appropriateness of calculating the asymptotic covariance matrix in the present model was investigated for three different approaches. Numerical analyses based on simulation data indicated that accurate estimation of the genetic parameters may be achieved if a sample size of 500 is used and if segregation at the trait locus explains not less than a quarter of phenotypic variation of the trait, but the study reveals difficulties in predicting the asymptotic variances of these maximum-likelihood estimates. A comparison was made between the statistical powers of the maximum-likelihood analysis and the previously proposed regression analysis for detecting the disequilibrium.