Similar articles
 20 similar articles found (search time: 31 ms)
1.
Genome-wide association studies (GWAS) are a powerful tool for mapping genes of complex traits. However, the test statistic can be inflated by population substructure or cryptic relatedness, which can cause spurious associations. If information on a large number of genetic markers is available, the analysis results can be adjusted using the method of genomic control (GC). GC was originally proposed to correct the Cochran-Armitage additive trend test. For non-additive models, the correction has been shown to depend on allele frequencies. Therefore, use of GC is limited to situations where allele frequencies of null markers and candidate markers are matched. In this work, we extended the capabilities of the GC method for non-additive models, which allows us to use null markers with arbitrary allele frequencies for GC. Analytical expressions for the inflation of a test statistic describing its dependency on allele frequency and several population parameters were obtained for recessive, dominant, and over-dominant models of inheritance. We proposed a method to estimate these required population parameters. Furthermore, we suggested a GC method based on approximation of the correction coefficient by a polynomial of allele frequency and described procedures to correct the genotypic (two degrees of freedom) test for cases when the model of inheritance is unknown. Statistical properties of the described methods were investigated using simulated and real data. We demonstrated that all considered methods were effective in controlling type 1 error in the presence of genetic substructure. The proposed GC methods can be applied to statistical tests for GWAS with various models of inheritance. All methods developed and tested in this work were implemented in the R language as part of the GenABEL package.
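
As a point of reference for the genomic control (GC) idea in this abstract, here is a minimal Python sketch of the classical GC correction for 1-df tests. It is not the allele-frequency-dependent extension the authors propose. It assumes the inflation factor is estimated as the median of the null-marker statistics divided by the median of a 1-df chi-square distribution (about 0.4549). The function names `gc_lambda`, `gc_correct`, and `chi2_1df_pvalue` are hypothetical and are not GenABEL functions; the input statistics are made up.

```python
import math
import statistics

# Median of a chi-square distribution with 1 degree of freedom (approx.).
CHI2_1DF_MEDIAN = 0.4549364


def gc_lambda(null_stats):
    """Estimate the inflation factor from 1-df statistics at null markers.

    Assumption: lambda = median(null statistics) / median(chi2, 1 df),
    floored at 1 so the correction never makes a statistic larger.
    """
    return max(statistics.median(null_stats) / CHI2_1DF_MEDIAN, 1.0)


def gc_correct(stat, lam):
    """Divide a candidate-marker 1-df statistic by the inflation factor."""
    return stat / lam


def chi2_1df_pvalue(stat):
    """Upper-tail p-value of a 1-df chi-square statistic (stdlib only)."""
    return math.erfc(math.sqrt(stat / 2.0))


if __name__ == "__main__":
    null_stats = [0.2, 0.5, 0.9, 1.4, 3.1]  # made-up null-marker statistics
    lam = gc_lambda(null_stats)
    print(lam, chi2_1df_pvalue(gc_correct(10.0, lam)))
```

Dividing by a single constant is only appropriate when the inflation does not depend on allele frequency, which is the limitation for non-additive models that the abstract addresses.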

2.
When applying the Cochran‐Armitage (CA) trend test for an association between a candidate allele and a disease in a case‐control study, a set of scores must be assigned to the genotypes. Sasieni (1997, Biometrics 53, 1253–1261) suggested scores for the recessive, additive, and dominant models but did not examine their statistical properties. Using the criterion of minimizing the sample size that the CA trend test requires to achieve prespecified type I and type II errors, we show that the scores given by Sasieni (1997) are optimal for the recessive and dominant models and locally optimal for the additive one. Moreover, the additive scores are shown to be locally optimal for the multiplicative model. The tests are applied to a real dataset.
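
To make the role of the scores concrete, the sketch below computes the CA trend statistic for a 2 × 3 case-control genotype table under three score sets. It assumes the commonly used scores (0, 0, 1), (0, 1/2, 1), and (0, 1, 1) for the recessive, additive, and dominant models, with genotypes ordered (aa, aA, AA) and A taken as the risk allele. The function name `ca_trend_test`, the `SCORES` table, and the counts in the usage example are illustrative assumptions, not taken from the paper.

```python
import math

# Genotype order is (aa, aA, AA), with A taken as the risk allele (assumption).
SCORES = {
    "recessive": (0.0, 0.0, 1.0),
    "additive": (0.0, 0.5, 1.0),
    "dominant": (0.0, 1.0, 1.0),
}


def ca_trend_test(cases, controls, scores):
    """Return (chi-square statistic, 1-df p-value) for the CA trend test."""
    n_case, n_ctrl = sum(cases), sum(controls)
    n_total = n_case + n_ctrl
    col_totals = [r + s for r, s in zip(cases, controls)]
    # Score-weighted difference between case and control genotype counts.
    t = sum(x * (r * n_ctrl - s * n_case)
            for x, r, s in zip(scores, cases, controls))
    sum_xn = sum(x * n for x, n in zip(scores, col_totals))
    sum_x2n = sum(x * x * n for x, n in zip(scores, col_totals))
    # Variance of t under the null hypothesis of no association.
    var_t = n_case * n_ctrl / n_total * (n_total * sum_x2n - sum_xn ** 2)
    chi2 = t * t / var_t
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


if __name__ == "__main__":
    cases, controls = (10, 20, 30), (30, 20, 10)  # made-up counts
    for model, scores in SCORES.items():
        print(model, ca_trend_test(cases, controls, scores))
```

The statistic is unchanged if the scores are shifted or rescaled, so (0, 1, 2) and (0, 1/2, 1) give the same additive test.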

3.
Population-based case-control studies are a useful method to test for a genetic association between a trait and a marker. However, the analysis of the resulting data can be affected by population stratification or cryptic relatedness, which may inflate the variance of the usual statistics, resulting in a higher-than-nominal rate of false-positive results. One approach to preserving the nominal type I error is to apply genomic control, which adjusts the variance of the Cochran-Armitage trend test by calculating the statistic on data from null loci. This enables one to estimate any additional variance in the null distribution of statistics. When the underlying genetic model (e.g., recessive, additive, or dominant) is known, genomic control can be applied to the corresponding optimal trend tests. In practice, however, the mode of inheritance is unknown. The genotype-based χ² test for a general association between the trait and the marker does not depend on the underlying genetic model. Since this general association test has 2 degrees of freedom (df), the existing formulas for estimating the variance factor by use of genomic control are not directly applicable. By expressing the general association test in terms of two Cochran-Armitage trend tests, one can apply genomic control to each of the two trend tests separately, thereby adjusting the χ² statistic. The properties of this robust genomic control test with 2 df are examined by simulation. This genomic control-adjusted 2-df test has control of type I error and achieves reasonable power, relative to the optimal tests for each model.
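
For orientation, the unadjusted 2-df genotype test mentioned in this abstract can be sketched as a Pearson chi-square test on the 2 × 3 table of genotype counts. This sketch does not implement the genomic-control adjustment or the decomposition into two trend tests described by the authors. The function name `genotype_chi2_test` and the counts in the usage example are hypothetical.

```python
import math


def genotype_chi2_test(cases, controls):
    """Pearson chi-square test on a 2 x 3 case-control genotype table.

    Returns (statistic, p-value). With 2 df the upper-tail probability
    of a chi-square variable is exactly exp(-statistic / 2).
    """
    n_case, n_ctrl = sum(cases), sum(controls)
    n_total = n_case + n_ctrl
    chi2 = 0.0
    for r, s in zip(cases, controls):
        col = r + s
        if col == 0:
            # With an empty genotype column the test no longer has 2 df.
            raise ValueError("every genotype column must be non-empty")
        exp_case = n_case * col / n_total
        exp_ctrl = n_ctrl * col / n_total
        chi2 += (r - exp_case) ** 2 / exp_case + (s - exp_ctrl) ** 2 / exp_ctrl
    return chi2, math.exp(-chi2 / 2.0)


if __name__ == "__main__":
    print(genotype_chi2_test((10, 20, 30), (30, 20, 10)))  # made-up counts
```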

4.
The effect of population bottlenecks on the components of the genetic variance generated by two neutral independent epistatic loci has been studied theoretically (VA, additive; VD, dominant; VAA, additive × additive; VAD, additive × dominant; VDD, dominant × dominant components of variance). Nonoverdominance and overdominance models were considered, covering all possible types of marginal gene action at the single locus level. The variance components in an infinitely large panmictic population (ancestral components) were compared with their expected values at equilibrium, after t consecutive bottlenecks of equal size N (derived components). Formulae were obtained in terms of allele frequencies and effects at each locus and the corresponding epistatic value. An excess of VA after bottlenecks can be assigned to two sources: (1) the spatiotemporal changes in the marginal average effects of gene substitution α_i, which are equal to zero only for additive gene action within and between loci; and (2) the covariance between α_i² and the heterozygosity at the loci involved, which is generated by dominance, with or without epistasis. Numerical examples were analyzed, indicating that an increase in VA after bottlenecks will only occur if its ancestral value is minimal or very small. For the nonoverdominance model with weak reinforcing epistasis, that increase has been detected only for extreme frequencies of the negative allele at one or both loci. With strong epistasis, however, this result can be extended to a broad range of intermediate frequencies. With no epistasis, the same qualitative results were found, indicating that dominance can be considered as the primary cause of an increase in VA following bottlenecks. 
In parallel, the derived total nonadditive variance exceeded its ancestral value (VNA = VD + VAA + VAD + VDD) for a range of combinations of allele frequencies covering those for an excess of VA and for very large frequencies of the negative allele at both loci. For the overdominance model, an increase in VA and VNA was respectively observed for equilibrium (intermediate) frequencies at one or both loci or for extreme frequencies at both loci. For all models, the magnitude of the change of VA and VNA was inversely related to N and t. At low levels of inbreeding, the between-line variance was not affected by the type of gene action. For the models considered, the results indicate that it is unlikely that the rate of evolution may be accelerated after population bottlenecks, in spite of occasional increments of the derived VA over its ancestral value.

5.
Zang Y, Zhang H, Yang Y, Zheng G. Human Heredity, 2007, 63(3-4): 187-195
The population-based case-control design is a powerful approach for detecting susceptibility markers of a complex disease. However, this approach may lead to spurious association when there is population substructure: population stratification (PS) or cryptic relatedness (CR). Two simple approaches to correct for the population substructure are genomic control (GC) and delta centralization (DC). GC uses the variance inflation factor to correct for the variance distortion of a test statistic, and DC centralizes the non-central chi-square distribution of the test statistic. Both GC and DC have been studied for case-control association studies mainly under a specific genetic model (e.g. recessive, additive or dominant), under which an optimal trend test is available. The genetic model is usually unknown for many complex diseases. In this situation, we study the performance of three robust tests based on the GC and DC corrections in the presence of the population substructure. Our results show that, when the genetic model is unknown, the DC- (or GC-) corrected maximum and Pearson's association test are robust and have good control of Type I error and high power relative to the optimal trend tests in the presence of PS (or CR).

6.
A retrospective likelihood-based approach was proposed to test and estimate the effect of haplotype on disease risk using unphased genotype data with adjustment for environmental covariates. The proposed method was also extended to handle data in which the haplotype and environmental covariates are not independent. Likelihood ratio tests were constructed to test the effects of haplotype and gene-environment interaction. The model parameters, such as haplotype effect size, were estimated using an Expectation Conditional-Maximization (ECM) algorithm developed by Meng and Rubin (1993). Model-based variance estimates were derived using the observed information matrix. Simulation studies were conducted for three different genetic effect models, including dominant effect, recessive effect, and additive effect. The results showed that the proposed method generated unbiased parameter estimates, proper type I error, and true beta coverage probabilities. The model performed well with small or large sample sizes, as well as short or long haplotypes.

7.
Asymptotic distribution for epistatic tests in case-control studies   (cited 1 time: 0 self-citations, 1 by others)
Liu T, Thalamuthu A, Liu JJ, Chen C, Wang Z, Wu R. Genomics, 2011, 98(2): 145-151
We propose a statistical model for dissecting a multilocus genotypic value into its main (additive and dominant) effects and epistatic effects between different loci in a case-control association study. The model can discern four different kinds of epistasis: additive × additive, additive × dominant, dominant × additive, and dominant × dominant interactions. To test each kind of epistasis, a χ² test statistic was computed for a two-by-two contingency table derived from combined genotypes in both case and control groups. We derived an analytical approach for estimating the asymptotic distribution of the χ² test statistic for epistatic tests under the null hypothesis, with the result being consistent with that from Monte Carlo simulations. The new model was used to analyze a case-control data set for candidate gene studies of stroke, leading to the identification of several significant interactions between causal SNPs on this disease.

8.
The Cochran–Armitage (CA) linear trend test for proportions is often used for genotype‐based analysis of candidate gene association. Depending on the underlying genetic mode of inheritance, the use of model‐specific scores maximises the power. Commonly, the underlying genetic model, i.e. additive, dominant or recessive mode of inheritance, is a priori unknown. Association studies are commonly analysed using permutation tests, where both inference and identification of the underlying mode of inheritance are important. Especially interesting are tests for case–control studies, defined by a maximum over a series of standardised CA tests, because such a procedure has power under all three genetic models. We reformulate the test problem and propose a conditional maximum test of scores‐specific linear‐by‐linear association tests. For maximum‐type, sum and quadratic test statistics the asymptotic expectation and covariance can be derived in a closed form and the limiting distribution is known. Both the limiting distribution and approximations of the exact conditional distribution can easily be computed using standard software packages. In addition to these technical advances, we extend the area of application to stratified designs, studies involving more than two groups and the simultaneous analysis of multiple loci by means of multiplicity‐adjusted p‐values for the underlying multiple CA trend tests. The new test is applied to reanalyse a study investigating genetic components of different subtypes of psoriasis. A new and flexible inference tool for association studies is available both theoretically and practically, since existing software packages can easily be used to implement the suggested test procedures.

9.
The effect of population bottlenecks on the components of the genetic variance/covariance generated by n neutral independent additive x additive loci has been studied theoretically. In its simplest version, this situation can be modelled by specifying the allele frequencies and homozygous effects at each locus, and an additional factor measuring the strength of the n-th order epistatic interaction. The variance/covariance components in an infinitely large panmictic population (ancestral components) were compared with their expected values at equilibrium over replicates randomly derived from the base population, after t bottlenecks of size N (derived components). Formulae were obtained giving the derived components (and the between-line variance) as functions of the ancestral ones (alternatively, in terms of allele frequencies and effects) and the corresponding inbreeding coefficient F(t). The n-th order derived component of the genetic variance/covariance is continuously eroded by inbreeding, but the remaining components may increase initially until a critical F(t) value is attained, which is inversely related to the order of the pertinent component, and subsequently decline to zero. These changes can be assigned to the between-line variances/covariances of gene substitution and epistatic effects induced by drift. Numerical examples indicate that: (1) the derived additive variance/covariance component will generally exceed its ancestral value unless epistasis is weak; (2) the derived epistatic variance/covariance components will generally exceed their ancestral values unless allele frequencies are extreme; (3) for systems showing equal ancestral additive and total non-additive variance/covariance components, those including a smaller number of epistatic loci may generate a larger excess in additive variance/covariance after bottlenecks than others involving a larger number of loci, provided that F(t) is low. 
Our results indicate that it is unlikely that the rate of evolution may be significantly accelerated after population bottlenecks, in spite of occasional increments of the derived additive variance over its ancestral value.

10.
Statistical association between a single nucleotide polymorphism (SNP) genotype and a quantitative trait in genome-wide association studies is usually assessed using a linear regression model, or, in the case of non-normally distributed trait values, using the Kruskal-Wallis test. While linear regression models assume an additive mode of inheritance via equidistant genotype scores, the Kruskal-Wallis test merely tests global differences in trait values associated with the three genotype groups. Both approaches thus exhibit suboptimal power when the underlying inheritance mode is dominant or recessive. Furthermore, these tests do not perform well in the common situations when only a few trait values are available in a rare genotype category (imbalance), or when the values associated with the three genotype categories exhibit unequal variance (variance heterogeneity). We propose a maximum test based on a Marcus-type multiple contrast test for relative effect sizes. This test allows model-specific testing of either dominant, additive or recessive mode of inheritance, and it is robust against variance heterogeneity. We show how to obtain mode-specific simultaneous confidence intervals for the relative effect sizes to aid in interpreting the biological relevance of the results. Further, we discuss the use of a related all-pairwise comparisons contrast test with range-preserving confidence intervals as an alternative to the Kruskal-Wallis heterogeneity test. We applied the proposed maximum test to the Bogalusa Heart Study dataset, and gained a remarkable increase in the power to detect association, particularly for rare genotypes. Our simulation study also demonstrated that the proposed non-parametric tests control the family-wise error rate in the presence of non-normality and variance heterogeneity, contrary to the standard parametric approaches. 
We provide a publicly available R library nparcomp that can be used to estimate simultaneous confidence intervals or compatible multiplicity-adjusted p-values associated with the proposed maximum test.

11.
The effect of population bottlenecks on the mean and the additive variance generated by two neutral independent epistatic loci has been studied theoretically. Six epistatic models, used in the analysis of binary disease traits, were considered. Ancestral values in an infinitely large panmictic population were compared with their expectations at equilibrium, after t consecutive bottlenecks of equal size N (derived values). An increase in the additive variance after bottlenecks (inversely related to N and t) will occur only if the frequencies of the negative allele at each locus are: (1) low, invariably associated with strong inbreeding depression; (2) high, always accompanied by an enhancement of the mean with inbreeding. The latter is an undesirable property, making the pertinent models unsuitable for the genetic analysis of disease. For the epistatic models considered, it is unlikely that the rate of evolution may be accelerated after population bottlenecks, in spite of occasional increments of the derived additive variance over its ancestral value.

12.
Genomic evaluation models can fit additive and dominant SNP effects. Under quantitative genetics theory, additive or “breeding” values of individuals are generated by substitution effects, which involve both “biological” additive and dominant effects of the markers. Dominance deviations include only a portion of the biological dominant effects of the markers. Additive variance includes variation due to the additive and dominant effects of the markers. We describe a matrix of dominant genomic relationships across individuals, D, which is similar to the G matrix used in genomic best linear unbiased prediction. This matrix can be used in a mixed-model context for genomic evaluations or to estimate dominant and additive variances in the population. From the “genotypic” value of individuals, an alternative parameterization defines additive and dominance as the parts attributable to the additive and dominant effect of the markers. This approach underestimates the additive genetic variance and overestimates the dominance variance. Transforming the variances from one model into the other is trivial if the distribution of allelic frequencies is known. We illustrate these results with mouse data (four traits, 1884 mice, and 10,946 markers) and simulated data (2100 individuals and 10,000 markers). Variance components were estimated correctly in the model, considering breeding values and dominance deviations. For the model considering genotypic values, the inclusion of dominant effects biased the estimate of additive variance. Genomic models were more accurate for the estimation of variance components than their pedigree-based counterparts.

13.
Summary: An Expectation-Maximization (EM) algorithm procedure is presented that extends the method of Cheliak et al. (1983) for maximum-likelihood estimation of mating system parameters of mixed mating system models. The extension permits the estimation of the rate of self-fertilization (s) and allele frequencies (Pi) at loci in outcrossing pollen, at marker loci having recessive null alleles. The algorithm makes use of maternal and filial genotypic arrays obtained by the electrophoretic analysis of cohorts of progeny. The genotypes of maternal plants must be known. Explicit equations are given for cases when the genotype of the maternal gamete inherited by a seed can (gymnosperms) or cannot (angiosperms) be determined. The procedure can accommodate any number of codominant alleles, but only one recessive null allele at each locus. An example, using actual data from Pinus banksiana, is presented to illustrate the application of this EM algorithm to the estimation of mating system parameters using marker loci having both codominant and recessive alleles. Issued as AECL-8745.

14.
The effect of population bottlenecks on the components of the genetic covariance generated by two neutral independent epistatic loci has been studied theoretically (additive, covA; dominance, covD; additive-by-additive, covAA; additive-by-dominance, covAD; and dominance-by-dominance, covDD). The additive-by-additive model and a more general model covering all possible types of marginal gene action at the single-locus level (additive/dominance epistatic model) were considered. The covariance components in an infinitely large panmictic population (ancestral components) were compared with their expected values at equilibrium over replicates randomly derived from the base population, after t consecutive bottlenecks of equal size N (derived components). Formulae were obtained in terms of the allele frequencies and effects at each locus, the corresponding epistatic effects and the inbreeding coefficient Ft. These expressions show that the contribution of nonadditive loci to the derived additive covariance (covAt) does not linearly decrease with inbreeding, as in the pure additive case, and may initially increase or even change sign in specific situations. Numerical examples were also analyzed, restricted for simplicity to the case of all covariance components being positive. For additive-by-additive epistasis, the condition covAt > covA only holds for high frequencies of the allele decreasing the metric traits at each locus (negative allele) if epistasis is weak, or for intermediate allele frequencies if it is strong. For the additive/dominance epistatic model, however, covAt > covA applies for low frequencies of the negative alleles at one or both loci and mild epistasis, but this result can be progressively extended to intermediate frequencies as epistasis becomes stronger. 
Without epistasis the same qualitative results were found, indicating that marginal dominance induced by epistasis can be considered as the primary cause of an increase of the additive covariance after bottlenecks. For all models, the magnitude of the ratio covAt/covA was inversely related to N and t.
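
The "pure additive case" that this abstract uses as a baseline can be illustrated with a short sketch. It assumes the textbook drift result that the inbreeding coefficient after t bottlenecks of size N is F_t = 1 − (1 − 1/(2N))^t, and that for purely additive loci the expected within-line additive (co)variance is (1 − F_t) times its ancestral value while the between-line component is 2F_t times that value. The function names are hypothetical, and the epistatic formulae from the paper are not reproduced here.

```python
def inbreeding_coefficient(n, t):
    """Expected inbreeding coefficient after t bottlenecks of size n."""
    return 1.0 - (1.0 - 1.0 / (2.0 * n)) ** t


def additive_case(ancestral, n, t):
    """Derived within-line and between-line additive (co)variance.

    Assumption: purely additive gene action, so the within-line
    component decreases linearly with the inbreeding coefficient.
    """
    f_t = inbreeding_coefficient(n, t)
    return (1.0 - f_t) * ancestral, 2.0 * f_t * ancestral


if __name__ == "__main__":
    # Made-up example: ancestral component 1.0, bottlenecks of size 2.
    for t in (1, 2, 3):
        print(t, inbreeding_coefficient(2, t), additive_case(1.0, 2, t))
```

Any derived additive component that exceeds `(1 - f_t) * ancestral` therefore reflects non-additive gene action, which is the effect the abstract quantifies.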

15.
Selection for production tends to decrease fitness, in particular, major components such as reproductive performance. Under an infinitesimal genetic model, restricted index selection can maintain reproductive performance while improving production. However, reproductive traits are thought to be controlled by a finite number of recessive alleles at low frequency. Culling for low reproduction may weed out the negative homozygous genotypes for reproduction in any generation, thus controlling the frequencies of alleles negative for reproduction. Restricted index selection, culling for low reproduction and a new method called empirical restricted index selection were compared for their efficiency in improving production while maintaining reproduction. Empirical restricted index selection selects animals that have on average the highest estimated breeding values for production and on average the same estimated breeding values for reproduction as the base population. An infinitesimal genetic model and models with a finite number of loci for reproduction with rare deleterious recessive alleles, which have additive, dominant or no pleiotropic effects on production, were considered. When reproduction was controlled by a finite number of loci with rare recessive alleles, restricted index selection could not maintain reproduction. The culling of 20% of the animals on reproduction maintained reproduction with all genetic models, except for the model where loci for reproduction had additive effects on production. Empirical restricted selection maintained reproduction with all models and yielded higher production responses than culling on reproduction, except when there were dominant pleiotropic effects on production.

16.
Abstract: We investigated the role of the number of loci coding for a neutral trait on the release of additive variance for this trait after population bottlenecks. Different bottleneck sizes and durations were tested for various matrices of genotypic values, with initial conditions covering the allele frequency space. We used three different types of matrices. First, we extended Cheverud and Routman's model by defining matrices of "pure" epistasis for three and four independent loci; second, we used genotypic values drawn randomly from uniform, normal, and exponential distributions; and third, we used two models of simple metabolic pathways leading to physiological epistasis. For all these matrices of genotypic values except the dominant metabolic pathway, we find that the release of additive variance increases as the number of loci increases from two to three and four. The amount of additive variance released for a given set of genotypic values is a function of the inbreeding coefficient, independently of the size and duration of the bottleneck. The level of inbreeding necessary to achieve maximum release in additive variance increases with the number of loci. We find that additive-by-additive epistasis is the type of epistasis most easily converted into additive variance. For a wide range of models, our results show that epistasis, rather than dominance, plays a significant role in the increase of additive variance following bottlenecks.

17.
Adaptation in response to selection on polygenic phenotypes may occur via subtle allele frequency shifts at many loci. Current population genomic techniques are not well posed to identify such signals. In the past decade, detailed knowledge about the specific loci underlying polygenic traits has begun to emerge from genome-wide association studies (GWAS). Here we combine this knowledge from GWAS with robust population genetic modeling to identify traits that may have been influenced by local adaptation. We exploit the fact that GWAS provide an estimate of the additive effect size of many loci to estimate the mean additive genetic value for a given phenotype across many populations as simple weighted sums of allele frequencies. We use a general model of neutral genetic value drift for an arbitrary number of populations with an arbitrary relatedness structure. Based on this model, we develop methods for detecting unusually strong correlations between genetic values and specific environmental variables, as well as a generalization of comparisons to test for overdispersion of genetic values among populations. Finally, we lay out a framework to identify the individual populations or groups of populations that contribute to the signal of overdispersion. These tests have considerably greater power than their single-locus equivalents because they look for positive covariance between like-effect alleles, and they also significantly outperform methods that do not account for population structure. We apply our tests to the Human Genome Diversity Panel (HGDP) dataset using GWAS data for height, skin pigmentation, type 2 diabetes, body mass index, and two inflammatory bowel disease datasets. This analysis uncovers a number of putative signals of local adaptation, and we discuss the biological interpretation and caveats of these results.
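
The "simple weighted sums of allele frequencies" mentioned in this abstract can be sketched as follows. The sketch assumes a diploid organism, so each population's mean additive genetic value is taken as twice the sum over loci of effect size times effect-allele frequency; the factor of two, the function name `mean_genetic_value`, and the numbers in the usage example are illustrative assumptions rather than details taken from the abstract.

```python
def mean_genetic_value(effect_sizes, allele_freqs):
    """Mean additive genetic value of one population.

    effect_sizes: per-locus additive effect of the effect allele (from GWAS).
    allele_freqs: frequency of that same allele in the population.
    Assumption: diploidy, hence the factor of 2.
    """
    if len(effect_sizes) != len(allele_freqs):
        raise ValueError("one allele frequency is needed per effect size")
    return 2.0 * sum(a * p for a, p in zip(effect_sizes, allele_freqs))


if __name__ == "__main__":
    effects = [0.5, -0.2, 0.1]  # made-up GWAS effect sizes
    populations = {
        "pop_a": [0.10, 0.50, 0.30],  # made-up allele frequencies
        "pop_b": [0.40, 0.20, 0.30],
    }
    for name, freqs in populations.items():
        print(name, mean_genetic_value(effects, freqs))
```

Testing whether such values differ among populations more than neutral drift predicts requires the relatedness model described in the abstract, which this sketch does not cover.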

18.
We provide experimental evidence showing that, during the restriction-enzyme digestion of DNA samples, some of the HaeIII-digested DNA fragments are small enough to prevent their reliable sizing on a Southern gel. As a result of such nondetectability of DNA fragments, individuals who show a single-band DNA profile at a VNTR locus may not necessarily be true homozygotes. In a population database, when the presence of such nondetectable alleles is ignored, we show that a pseudodependence of alleles within as well as across loci may occur. Using a known statistical method, under the hypothesis of independence of alleles within loci, we derive an efficient estimate of null allele frequency, which may be subsequently used for testing allelic independence within and across loci. The estimates of null allele frequencies, thus derived, are shown to agree with direct experimental data on the frequencies of HaeIII-null alleles. Incorporation of null alleles into the analysis of the forensic VNTR database suggests that the assumptions of allelic independence within and between loci are appropriate. In contrast, a failure to incorporate the occurrence of null alleles would provide a wrong inference regarding the independence of alleles within and between loci.

19.
It is important to detect population bottlenecks in threatened and managed species because bottlenecks can increase the risk of population extinction. Early detection is critical and can be facilitated by statistically powerful monitoring programs for detecting bottleneck-induced genetic change. We used Monte Carlo computer simulations to evaluate the power of the following tests for detecting genetic changes caused by a severe reduction in a population's effective size (Ne): a test for loss of heterozygosity, two tests for loss of alleles, two tests for change in the distribution of allele frequencies, and a test for small Ne based on variance in allele frequencies (the 'variance test'). The variance test was most powerful; it provided an 85% probability of detecting a bottleneck of size Ne = 10 when monitoring five microsatellite loci and sampling 30 individuals both before and one generation after the bottleneck. The variance test was almost 10 times more powerful than a commonly used test for loss of heterozygosity, and it allowed for detection of bottlenecks before 5% of a population's heterozygosity had been lost. The second most powerful tests were generally the tests for loss of alleles. However, these tests had reduced power for detecting genetic bottlenecks caused by skewed sex ratios. We provide guidelines for the number of loci and individuals needed to achieve high-power tests when monitoring via the variance test. We also illustrate how the variance test performs when monitoring loci that have widely different allele frequency distributions as observed in five wild populations of mountain sheep (Ovis canadensis).

20.
Nuclear SSRs are notorious for having relatively high frequencies of null alleles, i.e. alleles that fail to amplify and are thus recessive and undetected in heterozygotes. In this paper, we compare two kinds of approaches for estimating null allele frequencies at seven nuclear microsatellite markers in three French Fagus sylvatica populations: (1) maximum likelihood methods that compare observed and expected homozygote frequencies in the population under the assumption of Hardy-Weinberg equilibrium and (2) direct null allele frequency estimates from progeny where parent genotypes are known. We show that null allele frequencies are high in F. sylvatica (7.0% on average with the population method, 5.1% with the progeny method), and that estimates are consistent between the two approaches, especially when the number of sampled maternal half-sib progeny arrays is large. With null allele frequencies ranging between 5% and 8% on average across loci, population genetic parameters such as genetic differentiation (FST) may be mostly unbiased. However, using markers with such average prevalence of null alleles (up to 15% for some loci) can be seriously misleading in fine scale population studies and parentage analysis.
