20 similar articles found.
1.
Wang K 《Biostatistics (Oxford, England)》2012,13(4):724-733
The central theme in case-control genetic association studies is to efficiently identify genetic markers associated with trait status. Powerful statistical methods are critical to accomplishing this goal. A popular method is the omnibus Pearson's chi-square test applied to genotype counts. To achieve increased power, tests based on an assumed trait model have been proposed. However, they are not robust to model misspecification. Much research has been carried out on enhancing the robustness of such model-based tests. An analysis framework that tests the equality of allele frequencies while allowing for different deviations from Hardy-Weinberg equilibrium (HWE) between cases and controls is proposed. The proposed method requires specification of neither a trait model nor HWE. It involves only 1 degree of freedom. The likelihood ratio statistic, score statistic, and Wald statistic associated with this framework are introduced. Their performance is evaluated by extensive computer simulation in comparison with existing methods.
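As a point of reference for the baseline mentioned in this abstract, the following is a minimal sketch of the omnibus Pearson chi-square test on genotype counts (2 degrees of freedom) next to the classic allele-count test (1 degree of freedom). It is not the framework proposed by the author: the allele-count test shown here implicitly relies on HWE, which is exactly the assumption the proposed method avoids. The function names and the example counts are hypothetical and chosen only for illustration.

```python
def pearson_chi2(table):
    """Return (statistic, degrees of freedom) for an r x c table of counts.

    Assumes every row and column total is positive, so no expected
    count is zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def allele_counts(genotype_counts):
    """Collapse (AA, AB, BB) genotype counts into (A, B) allele counts."""
    n_aa, n_ab, n_bb = genotype_counts
    return [2 * n_aa + n_ab, n_ab + 2 * n_bb]


# Hypothetical genotype counts in the order (AA, AB, BB).
cases = [30, 50, 20]
controls = [20, 50, 30]

# Omnibus genotype test on the 2 x 3 table.
geno_stat, geno_df = pearson_chi2([cases, controls])

# Allele-count test on the 2 x 2 table (relies on HWE).
allele_stat, allele_df = pearson_chi2(
    [allele_counts(cases), allele_counts(controls)]
)

print(geno_stat, geno_df)
print(allele_stat, allele_df)
```

With these counts both statistics are equal, but the genotype test spends 2 degrees of freedom where the allele test spends 1, which illustrates why a valid 1-degree-of-freedom test can be more powerful.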
2.
Association study designs for complex diseases (total citations: 1; self-citations: 0; citations by others: 1)
Assessing the association between DNA variants and disease has been used widely to identify regions of the genome and candidate genes that contribute to disease. However, there are numerous examples of associations that cannot be replicated, which has led to skepticism about the utility of the approach for common conditions. With the discovery of massive numbers of genetic markers and the development of better tools for genotyping, association studies will inevitably proliferate. Now is the time to consider critically the design of such studies, to avoid the mistakes of the past and to maximize their potential to identify new components of disease.
3.
Optimal designs for linkage disequilibrium mapping and candidate gene association tests in livestock populations (total citations: 1; self-citations: 0; citations by others: 1)
Simulation was used to evaluate the performance of different selective genotyping strategies when using linkage disequilibrium across large half-sib families to position a QTL within a previously defined genomic region. Strategies examined included standard selective genotyping and different approaches of discordant and concordant sib selection applied to arbitrary or selected families. Strategies were compared as a function of effect and frequency of QTL alleles, heritability, and phenotypic expression of the trait. Large half-sib families were simulated for 100 generations and 2% of the population was genotyped in the final generation. Simple ANOVA was applied and the marker with the greatest F-value was considered the most likely QTL position. For traits with continuous phenotypes, genotyping the most divergent pairs of half-sibs from all families was the best strategy in general, but standard selective genotyping was somewhat more precise when heritability was low. When the phenotype was distributed in ordered categories, discordant sib selection was the optimal approach for positioning QTL for traits with high heritability and concordant sib selection was the best approach when genetic effects were small. Genotyping of a few selected sibs from many families was generally more efficient than genotyping many individuals from a few highly selected sires.
4.
One way to perform linkage-disequilibrium (LD) mapping of genetic traits is to use single markers. Since dense marker maps (such as single-nucleotide polymorphism and high-resolution microsatellite maps) are available, it is natural and practical to generalize single-marker LD mapping to high-resolution haplotype or multiple-marker LD mapping. This article investigates high-resolution LD-mapping methods, for complex diseases, based on haplotype maps or microsatellite marker maps. The objective is to explore test statistics that combine information from haplotype blocks or multiple markers. Based on two coding methods, genotype coding and haplotype coding, Hotelling's T^2 statistics T_G and T_H are proposed to test the association between a disease locus and two haplotype blocks or two markers. The validity of the two T^2 statistics is proved by theoretical calculations. A statistic T_C, an extension of the traditional chi-square method of comparing haplotype frequencies, is introduced by simply adding the chi-square test statistics of the two haplotype blocks together. The merit of the three methods is explored by calculation and comparison of power and of type I errors. In the presence of LD between the two blocks, the type I error of T_C is higher than that of T_H and T_G, since T_C ignores the correlation between the two blocks. For each of the three statistics, the power of using two haplotype blocks is higher than that of using only one haplotype block. By power comparison, we notice that T_C has higher power than T_H, and T_H has higher power than T_G. In the absence of LD between the two blocks, the power of T_C is similar to that of T_H and higher than that of T_G. Hence, we advocate use of T_H in the data analysis. In the presence of LD between the two blocks, T_H takes into account the correlation between the two haplotype blocks and has a lower type I error and higher power than T_G. Besides, the feasibility of the methods is shown by sample-size calculation.
5.
OBJECTIVES: Multivariate tests for linkage can provide improved power over univariate tests, but the type I error rates and comparative power of commonly used methods have not previously been compared. Here we studied the behavior of bivariate formulations of the variance component (VC) and Haseman-Elston (H-E) approaches. METHODS: We compared through simulation studies the bivariate H-E test with the unconstrained bivariate VC approach and with a VC approach in which the major-gene correlation is constrained to ±1. We also compared these methods to univariate methods. RESULTS: Bivariate approaches are more powerful than univariate analyses unless the traits are very highly positively correlated. The power of the bivariate H-E test was lower than that of the VC procedures. The constrained test was often less powerful than the unconstrained test. The empirical distributions of the bivariate H-E test and the unconstrained bivariate VC test conformed with asymptotic distributions for samples of 100 or more sibships of size 4. CONCLUSIONS: The unconstrained VC test is valuable for testing for preliminary linkages using multivariate phenotypes. The bivariate H-E test was less powerful than the bivariate VC tests.
6.
To analyze incomplete families, the following statistical tests can be used: LRAT (a simple likelihood-based association test), TRANSMIT, SIBASSOC/STDT, and RCTDT. We compared these four tests, for the diallelic case, on simulated data sets. The comparisons focused on the power to detect linkage and association when different familial structures, resistance to population stratification, resistance to misclassification of the disease status of the healthy sib, and the effect of nonpaternity were considered. The simulations lead to the following conclusions. The type I errors of TRANSMIT, SIBASSOC/STDT, and RCTDT were not affected by population stratification. LRAT showed bias under strong population stratification. High nonpaternity rates can lead to inflated type I errors, highlighting the importance of identification of half sibs. Under different homogeneous models, the power of TRANSMIT was very similar to that of LRAT, and, similarly, no difference in power was observed between SIBASSOC/STDT and RCTDT. Under various recessive and additive models, TRANSMIT was slightly more powerful than SIBASSOC/STDT when monoparental families with one affected and one unaffected sib were analyzed. Under various dominant models, SIBASSOC/STDT was slightly more powerful than TRANSMIT. Misclassification of the disease status of healthy sibs, as well as the discarding of incomplete families, resulted in a consistent loss of power.
7.
In this paper we describe various study designs and analytic techniques for testing the joint hypothesis that a genetic marker is both linked to and associated with a quantitative phenotype. Issues of power and sampling are addressed. The distinction between methods that explicitly examine association and those that infer association by examining the distribution of allelic transmissions from a heterozygous parent is examined. Extensions to multivariate, multiallelic, and multilocus situations are addressed. Recent approaches that combine variance-components-based linkage analyses with joint tests of linkage in the presence of association for disentanglement of the linkage and association and the application of such methods to fine mapping are discussed. Finally, new classes of joint tests of linkage and association that do not require samples of related individuals are described.
8.
The current development of densely spaced collections of single nucleotide polymorphisms (SNPs) will lead to genomewide association studies for a wide range of diseases in many different populations. Determinations of the appropriate number of SNPs to genotype involve a balancing of power and cost. Several variables are important in these determinations. We show that there are different combinations of sample size and marker density that can be expected to achieve the same power. Within certain bounds, investigators can choose between designs with more subjects and fewer markers or those with more markers and fewer subjects. Which designs are more cost-effective depends on the cost of phenotyping versus the cost of genotyping. We show that, under the assumption of a set cost for genotyping, one can calculate a "threshold cost" for phenotyping; when phenotyping costs per subject are less than this threshold, designs with more subjects will be more cost-effective than designs with more markers. This framework for determining a cost-effective study will aid in the planning of studies, especially if there are choices to be made with respect to phenotyping methods or study populations.
9.
10.
The transmission/disequilibrium test (TDT) developed by Spielman et al. can be a powerful family-based test of linkage and, in some cases, a test of association as well as linkage. It has recently been extended in several ways; these include allowance for implementation with quantitative traits, allowance for multiple alleles, and, in the case of dichotomous traits, allowance for testing in the absence of parental data. In this article, these three extensions are combined, and two procedures are developed that offer valid joint tests of linkage and (in the case of certain sibling configurations) association with quantitative traits, with use of data from siblings only, and that can accommodate biallelic or multiallelic loci. The first procedure uses a mixed-effects (i.e., random and fixed effects) analysis of variance in which sibship is the random factor, marker genotype is the fixed factor, and the continuous phenotype is the dependent variable. Covariates can easily be accommodated, and the procedure can be implemented in commonly available statistical software. The second procedure is a permutation-based procedure. Selected power studies are conducted to illustrate the relative power of each test under a variety of circumstances.
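For readers unfamiliar with the test being extended here, the following is a minimal sketch of the original biallelic TDT for a dichotomous trait with genotyped parents, which is a McNemar-type statistic on transmissions from heterozygous parents. It does not implement the sibling-only, quantitative-trait procedures developed in this article. The function names and the transmission counts are hypothetical.

```python
import math


def tdt_statistic(transmitted, not_transmitted):
    """Original biallelic TDT statistic.

    transmitted:     times heterozygous parents passed the allele of
                     interest to an affected child
    not_transmitted: times they passed the other allele instead
    Under the null hypothesis the statistic is approximately
    chi-square distributed with 1 degree of freedom.
    """
    total = transmitted + not_transmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) transmissions")
    return (transmitted - not_transmitted) ** 2 / total


def chi2_1df_pvalue(stat):
    """Upper-tail p-value of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical counts: 60 transmissions versus 40 non-transmissions.
tdt = tdt_statistic(60, 40)
p_value = chi2_1df_pvalue(tdt)
print(tdt, p_value)
```

Only heterozygous parents contribute, because a homozygous parent transmits the same allele regardless of linkage or association.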
11.
A power calculation is crucial in planning genetic studies. In genetic association studies, the power is often calculated using the expected number of individuals with each genotype calculated from an assumed allele frequency under Hardy-Weinberg equilibrium. Since the allele frequency is often unknown, the number of individuals with each genotype is random and so a power calculation assuming a known allele frequency may be incorrect. Ambrosius et al. recently showed that the power ignoring this randomness may lead to studies with insufficient power and proposed averaging the power due to the randomness. We extend the method of averaging power in two directions. First, for testing association in case-control studies, we use the Cochran-Armitage trend test and find that the time needed for calculating the averaged power is much reduced compared to the chi-square test with two degrees of freedom studied by Ambrosius et al. A real study is used for illustration of the method. Second, we extend the method to linkage analysis, where the number of identical-by-descent alleles shared by siblings is random. The distribution of identical-by-descent numbers depends on the underlying genetic model rather than the allele frequency. The robust test for linkage analysis is also examined using the averaged powers. We also recommend a sensitivity analysis when the true allele frequency or the number of identical-by-descent alleles is unknown.
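The two ingredients named in this abstract can be sketched briefly: expected genotype counts under Hardy-Weinberg equilibrium for an assumed allele frequency, and the Cochran-Armitage trend test with additive scores (0, 1, 2). This is only an illustration of those building blocks, not the averaged-power procedure of the article. The function names, the allele frequency, and the counts are hypothetical.

```python
def hwe_genotype_counts(n, p):
    """Expected (AA, AB, BB) counts among n individuals under HWE,
    where p is the assumed frequency of allele A."""
    q = 1.0 - p
    return [n * p * p, 2.0 * n * p * q, n * q * q]


def trend_test(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend statistic (chi-square, 1 degree of freedom).

    cases and controls are genotype counts in the same order as scores.
    Assumes both groups are non-empty and at least two genotype
    categories are observed, so the denominator is positive.
    """
    totals = [r + s for r, s in zip(cases, controls)]
    n = sum(totals)
    n_cases = sum(cases)
    n_controls = sum(controls)
    sum_xr = sum(x * r for x, r in zip(scores, cases))
    sum_xn = sum(x * t for x, t in zip(scores, totals))
    sum_x2n = sum(x * x * t for x, t in zip(scores, totals))
    numerator = n * (n * sum_xr - n_cases * sum_xn) ** 2
    denominator = n_cases * n_controls * (n * sum_x2n - sum_xn ** 2)
    return numerator / denominator


# Expected counts for 100 individuals with an assumed allele frequency of 0.3.
expected = hwe_genotype_counts(100, 0.3)

# Hypothetical observed genotype counts in the order (AA, AB, BB).
trend_stat = trend_test([30, 50, 20], [20, 50, 30])
print(expected, trend_stat)
```

In a power calculation that treats the allele frequency as known, counts like `expected` would be plugged in as if they were fixed; the abstract's point is that the realized counts are random, so the resulting power should be averaged over that randomness.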
12.
Nuclear families with multiple affected sibs are often collected for genetic linkage analysis of complex diseases. Once linkage evidence is established, dense markers are often typed in the linked region for genetic association analysis based on linkage disequilibrium (LD). Detection of association in the presence of linkage localizes disease genes more accurately than methods that rely on linkage alone. However, tests of association due to LD in the linked region need to account for the dependency of the allele transmissions to different sibs within a family. In this paper, we define a joint model for genetic linkage and association and derive the corresponding joint survival function of age of onset for the sibs within a sibship. The joint survival function is a function of both the inheritance vector and the genotypes at the candidate marker locus. Based on this joint survival function, we derive score tests for genetic association. The proposed methods utilize the phenotype data of all the sibs and have the advantages of family-based designs, which can avoid the potential spurious association caused by population admixture. In addition, the methods can account for variable age of onset or age at censoring and possible covariate effects, and therefore provide important tools for modelling disease heterogeneity. Simulation studies and application to the data sets from the 12th Genetic Analysis Workshop indicate that the proposed methods have correct type I error rates and increased power over other existing methods for testing allelic association.
13.
J. M. Elsen B. Mangin B. Goffinet C. Chevalet 《TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik》1994,88(1):129-134
We investigated protocol designs for gene mapping in livestock. The optimization of the population structure was based on the empirical variance of the recombination rate estimator. We concluded that a mixture of half-sib and full-sib families is preferred to half-sib families; a knowledge of parental phases does not improve the quality of the estimation for typical livestock families with five offspring or more; and measurements of the genotype of the mates in half-sib families are not useful. Graphs and algebraic approximations for the practical choice of family size and structure are given.
14.
Common heritable diseases often result from the action of several different genes, each of which contributes to the total observed variability in the disease trait. Traditional single-locus association approaches rely heavily on the marginal effects of single loci and tend to ignore the multigenic nature of complex diseases. The increasing demand for localizing genes underlying traits in multi-gene diseases has led to the development of some statistical methods. In this study, we develop a multi-locus analysis method, multi-locus penetrance variance analysis (MPVA), and conduct systematic simulation studies to evaluate its performance. Our results show that, compared with other multi-locus methods, MPVA has some advantage in detecting complicated interactions under different epistatic models, and its performance is stable and robust.
15.
Population-based tests of association have used data from either case-control studies or studies based on trios (affected child and parents). Case-control studies are more prone to false-positive results caused by inappropriate controls, which can occur if, for example, there is population admixture or stratification. An advantage of family-based tests is that cases and controls are well matched, but parental data may not always be available, especially for late-onset diseases. Three recent family-based tests of association and linkage utilize unaffected siblings as surrogates for untyped parents. In this paper, we propose an extension of one of these tests. We describe and compare the four tests in the context of a complex disease for both biallelic and multiallelic markers, as well as for sibships of different sizes. We also examine the consequences of having some parental data in the sample.
16.
In this paper, the currently available software known to us for group sequential and adaptive designs is briefly reviewed. Stand-alone packages as well as modules within software packages or programming languages exist. New software developments for adaptive designs enable the user to perform data-dependent design adaptations while controlling the type I error rate.
17.
The transmission/disequilibrium test was introduced to test for linkage disequilibrium between a marker and a putative disease locus using case-parent trios. However, parental genotypes may be incomplete in such a study. When parental information is non-randomly missing, due, for example, to death from the disease under study, the impact on type I error and power under dominant and recessive disease models has been reported. In this paper, we examine non-ignorable missingness by assigning missing values to the genotypes of affected parents. We used unrelated case-parent trios in the Genetic Analysis Workshop 14 simulated data for the Danacaa population. Our computer simulations revealed that the type I error of these tests using incomplete trios was not inflated over the nominal level under either recessive or dominant disease models. However, the power of these tests appears to be inflated over the complete information case due to an excess of heterozygous parents in dyads.
18.
Linkage analysis may not provide the necessary resolution for identification of the genes underlying phenotypic variation. This is especially true for gene-mapping studies that focus on complex diseases that do not exhibit Mendelian inheritance patterns. One positional genomic strategy involves application of association methodology to areas of identified linkage. Detection of association in the presence of linkage localizes the gene(s) of interest to more-refined regions in the genome than is possible through linkage analysis alone. This strategy introduces a statistical complexity when family-based association tests are used: the marker genotypes among siblings are correlated in linked regions. Ignoring this correlation will compromise the size of the statistical hypothesis test, thus clouding the interpretation of test results. We present a method for computing the expectation of a wide range of association test statistics under the null hypothesis that there is linkage but no association. To standardize the test statistic, an empirical variance-covariance estimator that is robust to the sibling marker-genotype correlation is used. This method is widely applicable: any type of phenotypic measure or family configuration can be used. For example, we analyze a deletion in the A2M gene at the 5' splice site of exon II of the bait region in Alzheimer disease (AD) discordant sibships. Since the A2M gene lies in a chromosomal region (chromosome 12p) that consistently has been linked to AD, association tests should be conducted under the null hypothesis that there is linkage but no association.
19.
Efficient study designs for test of genetic association using sibship data and unrelated cases and controls
Linkage mapping of complex diseases is often followed by association studies between phenotypes and marker genotypes through use of case-control or family-based designs. Given fixed genotyping resources, it is important to know which study designs are the most efficient. To address this problem, we extended the likelihood-based method of Li et al., which assesses whether there is linkage disequilibrium between a disease locus and a SNP, to accommodate sibships of arbitrary size and disease-phenotype configuration. A key advantage of our method is the ability to combine data from different family structures. We consider scenarios for which genotypes are available for unrelated cases, affected sib pairs (ASPs), or only one sibling per ASP. We construct designs that use cases only and others that use unaffected siblings or unrelated unaffected individuals as controls. Different combinations of cases and controls result in seven study designs. We compare the efficiency of these designs when the number of individuals to be genotyped is fixed. Our results suggest that (1) when the disease is influenced by a single gene, the one sibling per ASP-control design is the most efficient, followed by the ASP-control design, and familial cases contribute more association information than singleton cases; (2) when the disease is influenced by multiple genes, familial cases provide more association information than singleton cases, unless the effect of the locus being tested is much smaller than at least one other untested disease locus; and (3) the case-control design can be useful for detecting genes with small effect in the presence of genes with much larger effect. Our findings will be helpful for researchers designing and analyzing complex disease-association studies and will facilitate genotyping resource allocation.
20.
The performance of some weakly parametric linkage tests in common use was compared on 200 replicates of oligogenic inheritance from Genetic Analysis Workshop 10. Each random sample for the quantitative trait was dichotomized at different thresholds and also selected through 2 affected sibs, generating 8 combinations of sample and variable. The variance component program SOLAR performed best with a continuous trait, even in selected samples, when the population mean was used. The sib-pair program SIBPAL2 was best in most other cases when the phenotype product, population mean, and empirical estimates of pair correlations were used. The BETA program that introduced phenotype products was slightly more powerful than maximum likelihood scores under the null hypothesis and approached but did not exceed SIBPAL2 under its optimal conditions. Type I errors generally exceeded expectations from a chi-square test, but were conservative with respect to bounds on lods. All methods can be improved by use of the population mean, empirical correlations, logistic representation for affection status, and correct lods for samples that favour the null hypothesis. It remains uncertain whether all information can be extracted by weakly parametric methods and whether correction for ascertainment bias demands a strongly parametric model. Performance on a standard set of simulated data is indispensable for recognising optimal methods.