The central theme in case-control genetic association studies is to efficiently identify genetic markers associated with trait status. Powerful statistical methods are critical to accomplishing this goal. A popular method is the omnibus Pearson's chi-square test applied to genotype counts. To achieve increased power, tests based on an assumed trait model have been proposed. However, they are not robust to model misspecification. Much research has been carried out on enhancing the robustness of such model-based tests. An analysis framework is proposed that tests the equality of allele frequencies while allowing for different deviations from Hardy-Weinberg equilibrium (HWE) in cases and controls. The proposed method requires specification of neither a trait model nor HWE. It involves only 1 degree of freedom. The likelihood ratio statistic, score statistic, and Wald statistic associated with this framework are introduced. Their performance is evaluated by extensive computer simulation in comparison with existing methods.
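
The idea of comparing allele frequencies without assuming HWE can be illustrated with a small sketch. The abstract does not give the authors' formulas, so the code below is only one plausible 1-degree-of-freedom Wald-type statistic: it uses the standard result that the variance of an estimated allele frequency is [p(1 - p) + D] / (2n), where D = P(AA) - p² measures departure from HWE, and estimates D separately in cases and controls. The function names and the genotype ordering (AA, Aa, aa) are assumptions made for illustration.

```python
import math


def allele_freq_and_variance(n_AA, n_Aa, n_aa):
    """Estimate the A-allele frequency and its variance without assuming HWE."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)
    p_AA = n_AA / n
    # Var(p_hat) = [p(1 - p) + D] / (2n) with D = P_AA - p^2,
    # which simplifies to (p + P_AA - 2 p^2) / (2n).
    var = (p + p_AA - 2 * p * p) / (2 * n)
    return p, var


def wald_allele_test(cases, controls):
    """Wald-type 1-df test of equal allele frequency in cases and controls.

    `cases` and `controls` are (n_AA, n_Aa, n_aa) genotype counts.
    Returns the statistic and its chi-square (1 df) p-value.
    Illustrative only; not necessarily the statistic used in the paper.
    """
    p1, v1 = allele_freq_and_variance(*cases)
    p0, v0 = allele_freq_and_variance(*controls)
    stat = (p1 - p0) ** 2 / (v1 + v0)
    # For 1 df, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


if __name__ == "__main__":
    print(wald_allele_test((30, 50, 20), (20, 50, 30)))
```

Because D is estimated within each group, neither group needs to be in HWE. The sketch does not guard against a monomorphic marker, for which both variances are zero and the statistic is undefined.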

Association study designs for complex diseases
Assessing the association between DNA variants and disease has been used widely to identify regions of the genome and candidate genes that contribute to disease. However, there are numerous examples of associations that cannot be replicated, which has led to skepticism about the utility of the approach for common conditions. With the discovery of massive numbers of genetic markers and the development of better tools for genotyping, association studies will inevitably proliferate. Now is the time to consider critically the design of such studies, to avoid the mistakes of the past and to maximize their potential to identify new components of disease.

Stella A, Boettcher PJ. Genetics, 2004, 166(1): 341-350
Simulation was used to evaluate the performance of different selective genotyping strategies when using linkage disequilibrium across large half-sib families to position a QTL within a previously defined genomic region. Strategies examined included standard selective genotyping and different approaches of discordant and concordant sib selection applied to arbitrary or selected families. Strategies were compared as a function of effect and frequency of QTL alleles, heritability, and phenotypic expression of the trait. Large half-sib families were simulated for 100 generations and 2% of the population was genotyped in the final generation. Simple ANOVA was applied and the marker with the greatest F-value was considered the most likely QTL position. For traits with continuous phenotypes, genotyping the most divergent pairs of half-sibs from all families was the best strategy in general, but standard selective genotyping was somewhat more precise when heritability was low. When the phenotype was distributed in ordered categories, discordant sib selection was the optimal approach for positioning QTL for traits with high heritability and concordant sib selection was the best approach when genetic effects were small. Genotyping of a few selected sibs from many families was generally more efficient than genotyping many individuals from a few highly selected sires.

Yan Weili (严卫丽). Hereditas (Beijing) (《遗传》), 2008, 30(4): 400-406
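Carrying out a genome-wide association study (GWA) was only a dream for geneticists a few years ago; today it is a reality. Since Science reported the first genome-wide association study, on age-related macular degeneration, in 2005, genome-wide association studies of complex diseases have appeared in rapid succession. This article reviews the main findings of genome-wide association studies in complex disease research over the past two years, the principles of genome-wide association study design, and the selection and comparison of genetic markers, together with related commercial product information. It then describes research progress on copy number variation in the human genome, summarizes the achievements and remaining problems of human genome-wide association studies, and looks ahead to future research priorities and the problems still to be solved.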

One way to perform linkage-disequilibrium (LD) mapping of genetic traits is to use single markers. Since dense marker maps (such as single-nucleotide polymorphism and high-resolution microsatellite maps) are available, it is natural and practical to generalize single-marker LD mapping to high-resolution haplotype or multiple-marker LD mapping. This article investigates high-resolution LD-mapping methods, for complex diseases, based on haplotype maps or microsatellite marker maps. The objective is to explore test statistics that combine information from haplotype blocks or multiple markers. Based on two coding methods, genotype coding and haplotype coding, Hotelling's T² statistics TG and TH are proposed to test the association between a disease locus and two haplotype blocks or two markers. The validity of the two T² statistics is proved by theoretical calculations. A statistic TC, an extension of the traditional χ² method of comparing haplotype frequencies, is introduced by simply adding the χ² test statistics of the two haplotype blocks together. The merit of the three methods is explored by calculation and comparison of power and of type I errors. In the presence of LD between the two blocks, the type I error of TC is higher than that of TH and TG, since TC ignores the correlation between the two blocks. For each of the three statistics, the power of using two haplotype blocks is higher than that of using only one haplotype block. By power comparison, we notice that TC has higher power than that of TH, and TH has higher power than that of TG. In the absence of LD between the two blocks, the power of TC is similar to that of TH and higher than that of TG. Hence, we advocate use of TH in the data analysis. In the presence of LD between the two blocks, TH takes into account the correlation between the two haplotype blocks and has a lower type I error and higher power than TG. Besides, the feasibility of the methods is shown by sample-size calculation.

Progress in genome-wide association studies of complex diseases: genetic statistical analysis
Yan Weili (严卫丽). Hereditas (Beijing) (《遗传》), 2008, 30(5): 543-549
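In 2005, Science first reported a genome-wide association study of human age-related macular degeneration. Genome-wide association studies of a series of complex diseases, including obesity, type 2 diabetes, coronary heart disease, and Alzheimer's disease, were subsequently reported; this period has been called the first wave of human genome-wide association studies. This article introduces the methods, software, and application examples of statistical analysis for genome-wide association studies; compares methods for adjusting P values for multiple testing in association analysis, including the Bonferroni correction, the step-down Bonferroni correction, simulation-based methods, and methods that control the false discovery rate; and discusses how population stratification can affect association results and why, along with research progress on, and application examples of, methods for controlling population stratification in genome-wide association studies. In the first wave, classical genetic statistical methods identified many gene-phenotype associations and were able to explain them, including associations involving many previously unknown genes and chromosomal regions. However, the further development of genome-wide association studies requires elucidating the roles in complex disease of gene-gene interactions within the genome and of the interplay between complex gene-gene interaction networks and environmental factors. Existing statistical methods will certainly not meet this need, and more advanced statistical methods must be developed. Finally, the article provides website information for statistical analysis software for genome-wide association studies.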
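
Three of the P value adjustments named in this review can be sketched in a few lines. The code below is a minimal illustration in plain Python, not the software discussed in the article: it implements the Bonferroni correction, a step-down Bonferroni correction in the form usually attributed to Holm (an assumption about which step-down variant is meant), and the Benjamini-Hochberg procedure as one common way to control the false discovery rate. The simulation-based approach is omitted because it needs the genotype data themselves.

```python
def bonferroni(pvals):
    """Multiply every P value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, m * p) for p in pvals]


def holm(pvals):
    """Step-down Bonferroni (Holm) adjusted P values, in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # smallest P first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest P value is multiplied by (m - k + 1);
        # the running maximum keeps the adjusted values monotone.
        running_max = max(running_max, min(1.0, (m - rank) * pvals[i]))
        adjusted[i] = running_max
    return adjusted


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg FDR-adjusted P values, in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, taking a running minimum.
    for rank in range(m - 1, -1, -1):
        i = order[rank]
        running_min = min(running_min, m * pvals[i] / (rank + 1))
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    p = [0.01, 0.04, 0.03, 0.005]
    print(bonferroni(p), holm(p), benjamini_hochberg(p))
```

The step-down and FDR adjustments are never larger than the plain Bonferroni values, which is why they retain more power when many markers are tested.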

We present optimized group sequential designs where testing of a single parameter theta is of interest. We require specification of a loss function and of a prior distribution for theta. For the examples presented, we pre-specify Type I and II error rates and minimize the expected sample size over the prior distribution for theta. Minimizing the square of sample size rather than the sample size is found to produce designs with slightly less aggressive interim stopping rules and smaller maximum sample sizes with essentially identical expected sample size. We compare optimal designs using Hwang-Shih-DeCani and Kim-DeMets spending functions to fully optimized designs not restricted by a spending function family. In the examples selected, we also examine when there might be substantial benefit gained by adding an interim analysis. Finally, we provide specific optimal asymmetric spending function designs that should be generally useful and simply applied when a design with minimal expected sample size is desired.

Amos C, de Andrade M, Zhu D. Human Heredity, 2001, 51(3): 133-144
OBJECTIVES: Multivariate tests for linkage can provide improved power over univariate tests but the type I error rates and comparative power of commonly used methods have not previously been compared. Here we studied the behavior of bivariate formulations of the variance component (VC) and Haseman-Elston (H-E) approaches. METHODS: We compared through simulation studies the bivariate H-E test with the unconstrained bivariate VC approach and with a VC approach in which the major-gene correlation is constrained to ±1. We also compared these methods to univariate methods. RESULTS: Bivariate approaches are more powerful than univariate analyses unless the traits are very highly positively correlated. The power of the bivariate H-E test was less than the VC procedures. The constrained test was often less powerful than the unconstrained test. The empirical distributions of the bivariate H-E test and the unconstrained bivariate VC test conformed with asymptotic distributions for samples of 100 or more sibships of size 4. CONCLUSIONS: The unconstrained VC test is valuable for testing for preliminary linkages using multivariate phenotypes. The bivariate H-E test was less powerful than the bivariate VC tests.

To analyze incomplete families, the following statistical tests can be used: LRAT (a simple likelihood-based association test), TRANSMIT, SIBASSOC/STDT, and RCTDT. We compared these four tests, for the diallelic case, on simulated data sets. The comparisons focused on the power to detect linkage and association when different familial structures, resistance to population stratification, resistance to misclassification of the disease status of the healthy sib, and the effect of nonpaternity were considered. The simulations lead to the following conclusions. The type I errors of TRANSMIT, SIBASSOC/STDT, and RCTDT were not affected by population stratification. LRAT showed bias under strong population stratification. High nonpaternity rates can lead to inflated type I errors, highlighting the importance of identification of half sibs. Under different homogeneous models, the power of TRANSMIT was very similar to that of LRAT, and, similarly, no difference in power was observed between SIBASSOC/STDT and RCTDT. Under various recessive and additive models, TRANSMIT was slightly more powerful than SIBASSOC/STDT when monoparental families with one affected and one unaffected sib were analyzed. Under various dominant models, SIBASSOC/STDT was slightly more powerful than TRANSMIT. Misclassification of the disease status of healthy sibs, as well as the discarding of incomplete families, resulted in a consistent loss of power.

In this paper we describe various study designs and analytic techniques for testing the joint hypothesis that a genetic marker is both linked to and associated with a quantitative phenotype. Issues of power and sampling are addressed. The distinction between methods that explicitly examine association and those that infer association by examining the distribution of allelic transmissions from a heterozygous parent is examined. Extensions to multivariate, multiallelic, and multilocus situations are addressed. Recent approaches that combine variance-components-based linkage analyses with joint tests of linkage in the presence of association for disentanglement of the linkage and association and the application of such methods to fine mapping are discussed. Finally, new classes of joint tests of linkage and association that do not require samples of related individuals are described.

The current development of densely spaced collections of single nucleotide polymorphisms (SNPs) will lead to genomewide association studies for a wide range of diseases in many different populations. Determinations of the appropriate number of SNPs to genotype involve a balancing of power and cost. Several variables are important in these determinations. We show that there are different combinations of sample size and marker density that can be expected to achieve the same power. Within certain bounds, investigators can choose between designs with more subjects and fewer markers or those with more markers and fewer subjects. Which designs are more cost-effective depends on the cost of phenotyping versus the cost of genotyping. We show that, under the assumption of a set cost for genotyping, one can calculate a "threshold cost" for phenotyping; when phenotyping costs per subject are less than this threshold, designs with more subjects will be more cost-effective than designs with more markers. This framework for determining a cost-effective study will aid in the planning of studies, especially if there are choices to be made with respect to phenotyping methods or study populations.
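
The "threshold cost" argument can be made concrete with a little arithmetic. The abstract does not state the cost model, so the sketch below assumes the simplest one: total cost = subjects × (phenotyping cost per subject + markers × genotyping cost per marker). It also assumes that two designs with equal power are already known. The function names and the example numbers are hypothetical.

```python
def total_cost(n_subjects, n_markers, cost_pheno, cost_geno):
    """Total study cost under the assumed linear cost model."""
    return n_subjects * (cost_pheno + n_markers * cost_geno)


def threshold_phenotyping_cost(design_more_subjects, design_more_markers, cost_geno):
    """Phenotyping cost per subject at which two equal-power designs cost the same.

    Each design is a (n_subjects, n_markers) pair. Setting the two total costs
    equal and solving for the phenotyping cost gives
        c* = cost_geno * (n2 * m2 - n1 * m1) / (n1 - n2).
    Below c*, the design with more subjects is the cheaper one.
    """
    n1, m1 = design_more_subjects
    n2, m2 = design_more_markers
    if n1 <= n2:
        raise ValueError("the first design must have more subjects than the second")
    return cost_geno * (n2 * m2 - n1 * m1) / (n1 - n2)


if __name__ == "__main__":
    # Hypothetical equal-power designs: 1200 subjects x 100,000 SNPs
    # versus 1000 subjects x 150,000 SNPs, at 0.01 cost units per genotype.
    design_a, design_b = (1200, 100_000), (1000, 150_000)
    print(threshold_phenotyping_cost(design_a, design_b, cost_geno=0.01))
```

A negative threshold means the design with more subjects also requires more genotypes in total, so under this cost model it is never the cheaper option.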

The transmission/disequilibrium test (TDT) developed by Spielman et al. can be a powerful family-based test of linkage and, in some cases, a test of association as well as linkage. It has recently been extended in several ways; these include allowance for implementation with quantitative traits, allowance for multiple alleles, and, in the case of dichotomous traits, allowance for testing in the absence of parental data. In this article, these three extensions are combined, and two procedures are developed that offer valid joint tests of linkage and (in the case of certain sibling configurations) association with quantitative traits, with use of data from siblings only, and that can accommodate biallelic or multiallelic loci. The first procedure uses a mixed-effects (i.e., random and fixed effects) analysis of variance in which sibship is the random factor, marker genotype is the fixed factor, and the continuous phenotype is the dependent variable. Covariates can easily be accommodated, and the procedure can be implemented in commonly available statistical software. The second procedure is a permutation-based procedure. Selected power studies are conducted to illustrate the relative power of each test under a variety of circumstances.

Currently, the design of group sequential clinical trials requires choosing among several distinct design categories, design scales, and strategies for determining stopping rules. This approach can limit the design selection process so that clinical issues are not fully addressed. This paper describes a family of designs that unifies previous approaches and allows continuous movement among the previous categories. This unified approach facilitates the process of tailoring the design to address important clinical issues. The unified family of designs is constructed from a generalization of a four-boundary group sequential design in which the shape and location of each boundary can be independently specified. Methods for implementing the design using error-spending functions are described. Examples illustrating the use of the design family are also presented.

Müller HH, Schäfer H. Biometrics, 2001, 57(3): 886-891
A general method is presented integrating the concept of adaptive interim analyses into classical group sequential testing. This allows the researcher to represent every group sequential plan as an adaptive trial design and to make design changes during the course of the trial after every interim analysis in the same way as with adaptive designs. The concept of adaptive trial designing is thereby generalized to a large variety of possible sequential plans.

Nuclear families with multiple affected sibs are often collected for genetic linkage analysis of complex diseases. Once linkage evidence is established, dense markers are often typed in the linked region for genetic association analysis based on linkage disequilibrium (LD). Detection of association in the presence of linkage localizes disease genes more accurately than methods that rely on linkage alone. However, a test of association due to LD in the linked region needs to account for the dependency of the allele transmissions to different sibs within a family. In this paper, we define a joint model for genetic linkage and association and derive the corresponding joint survival function of age of onset for the sibs within a sibship. The joint survival function is a function of both the inheritance vector and the genotypes at the candidate marker locus. Based on this joint survival function, we derive score tests for genetic association. The proposed methods utilize the phenotype data of all the sibs and have the advantages of family-based designs, which can avoid the potential spurious association caused by population admixture. In addition, the methods can account for variable age of onset or age at censoring and possible covariate effects, and therefore provide important tools for modelling disease heterogeneity. Simulation studies and application to the data sets from the 12th Genetic Analysis Workshop indicate that the proposed methods have correct type 1 error rates and increased power over other existing methods for testing allelic association.

A power calculation is crucial in planning genetic studies. In genetic association studies, the power is often calculated using the expected number of individuals with each genotype calculated from an assumed allele frequency under Hardy-Weinberg equilibrium. Since the allele frequency is often unknown, the number of individuals with each genotype is random and so a power calculation assuming a known allele frequency may be incorrect. Ambrosius et al. recently showed that the power ignoring this randomness may lead to studies with insufficient power and proposed averaging the power due to the randomness. We extend the method of averaging power in two directions. First, for testing association in case-control studies, we use the Cochran-Armitage trend test and find that the time needed for calculating the averaged power is much reduced compared to the chi-square test with two degrees of freedom studied by Ambrosius et al. A real study is used for illustration of the method. Second, we extend the method to linkage analysis, where the number of identical-by-descent alleles shared by siblings is random. The distribution of identical-by-descent numbers depends on the underlying genetic model rather than the allele frequency. The robust test for linkage analysis is also examined using the averaged powers. We also recommend a sensitivity analysis when the true allele frequency or the number of identical-by-descent alleles is unknown.
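
For readers unfamiliar with the Cochran-Armitage trend test mentioned here, the following minimal sketch computes its 1-degree-of-freedom statistic from a 2 × 3 table of genotype counts. It uses the textbook form of the test with additive scores (0, 1, 2) as a default assumption; it does not implement the averaged-power calculation itself, whose details are not given in the abstract. The function name and the genotype ordering are illustrative.

```python
import math


def cochran_armitage_trend(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend test for a 2 x k table of genotype counts.

    `cases` and `controls` hold counts per genotype, ordered to match
    `scores` (by default 0, 1, or 2 copies of the risk allele).
    Returns the chi-square statistic (1 df) and its p-value.
    """
    R, S = sum(cases), sum(controls)
    N = R + S
    totals = [r + s for r, s in zip(cases, controls)]
    # Score-weighted difference between case and control counts.
    t = sum(x * (S * r - R * s) for x, r, s in zip(scores, cases, controls))
    sum_x2n = sum(x * x * n for x, n in zip(scores, totals))
    sum_xn = sum(x * n for x, n in zip(scores, totals))
    # Variance of t under the null hypothesis of no association.
    var = (R * S / N) * (N * sum_x2n - sum_xn ** 2)
    stat = t * t / var
    # For 1 df, P(chi2 > x) = erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))


if __name__ == "__main__":
    # Counts ordered as (aa, Aa, AA).
    print(cochran_armitage_trend((20, 50, 30), (30, 50, 20)))
```

The statistic has a closed form in the genotype counts and a single degree of freedom, in contrast to the 2-degree-of-freedom genotype chi-square test.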

We investigated protocol designs for gene mapping in livestock. The optimization of the population structure was based on the empirical variance of the recombination rate estimator. We concluded that a mixture of half-sib and full-sib families is preferred to half-sib families; a knowledge of parental phases does not improve the quality of the estimation for typical livestock families with five offspring or more; and measurements of the genotype of the mates in half-sib families are not useful. Graphs and algebraic approximations for the practical choice of family size and structure are given.

A comparative study of sibship tests of linkage and/or association.
Population-based tests of association have used data from either case-control studies or studies based on trios (affected child and parents). Case-control studies are more prone to false-positive results caused by inappropriate controls, which can occur if, for example, there is population admixture or stratification. An advantage of family-based tests is that cases and controls are well matched, but parental data may not always be available, especially for late-onset diseases. Three recent family-based tests of association and linkage utilize unaffected siblings as surrogates for untyped parents. In this paper, we propose an extension of one of these tests. We describe and compare the four tests in the context of a complex disease for both biallelic and multiallelic markers, as well as for sibships of different sizes. We also examine the consequences of having some parental data in the sample.

Sun X, Zhang Z, Zhang Y, Zhang X, Li Y. Human Heredity, 2005, 60(3): 143-149
Common heritable diseases often result from the action of several different genes, each of which contributes to the total observed variability in the disease trait. Traditional single-locus association approaches rely heavily on the marginal effects of single loci and tend to ignore the multigenic nature of complex diseases. The increasing demand for localizing the genes underlying traits in multi-gene diseases has led to the development of several statistical methods. In this study, we develop a multi-locus analysis method, multi-locus penetrance variance analysis (MPVA), and conduct systematic simulation studies to evaluate its performance. Our results show that, compared with other multi-locus methods, MPVA has some advantage in detecting complicated interactions under different epistatic models, and its performance is stable and robust.

