Similar articles
Found 20 similar articles (search time: 86 ms)
1.
Population-based case-control studies are a useful method to test for a genetic association between a trait and a marker. However, the analysis of the resulting data can be affected by population stratification or cryptic relatedness, which may inflate the variance of the usual statistics, resulting in a higher-than-nominal rate of false-positive results. One approach to preserving the nominal type I error is to apply genomic control, which adjusts the variance of the Cochran-Armitage trend test by calculating the statistic on data from null loci. This enables one to estimate any additional variance in the null distribution of statistics. When the underlying genetic model (e.g., recessive, additive, or dominant) is known, genomic control can be applied to the corresponding optimal trend tests. In practice, however, the mode of inheritance is unknown. The genotype-based χ² test for a general association between the trait and the marker does not depend on the underlying genetic model. Since this general association test has 2 degrees of freedom (df), the existing formulas for estimating the variance factor by use of genomic control are not directly applicable. By expressing the general association test in terms of two Cochran-Armitage trend tests, one can apply genomic control to each of the two trend tests separately, thereby adjusting the χ² statistic. The properties of this robust genomic control test with 2 df are examined by simulation. This genomic control-adjusted 2-df test has control of type I error and achieves reasonable power, relative to the optimal tests for each model.
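
The variance adjustment described in this abstract can be illustrated with the basic single-df form of genomic control. The sketch below is an assumption-laden illustration, not the 2-df procedure the authors propose: it estimates the inflation factor as the median of trend statistics at null loci divided by the median of a χ² distribution with 1 df, and divides a candidate statistic by that factor. The function names and the choice to floor the factor at 1 are hypothetical.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def inflation_factor(null_stats):
    """Estimate the variance inflation factor from 1-df statistics at null loci.

    Assumption: the factor is floored at 1.0 so that the adjustment
    never makes a test less conservative.
    """
    lam = median(null_stats) / CHI2_1DF_MEDIAN
    return max(lam, 1.0)

def adjust_statistic(stat, lam):
    """Rescale a 1-df trend statistic by the estimated inflation factor."""
    return stat / lam

# Hypothetical trend statistics computed at three null loci.
null_stats = [0.2, 0.9098728, 1.5]
lam = inflation_factor(null_stats)
print(lam, adjust_statistic(8.0, lam))
```

In the 2-df setting of the abstract, the same rescaling would be applied to each of the two trend tests separately before they are combined.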

2.
An entropy-based statistic for genomewide association studies (cited by: 8; self-citations: 0; citations by others: 8)
Efficient genotyping methods and the availability of a large collection of single-nucleotide polymorphisms provide valuable tools for genetic studies of human disease. The standard χ² statistic for case-control studies, which uses a linear function of allele frequencies, has limited power when the number of marker loci is large. We introduce a novel test statistic for genetic association studies that uses Shannon entropy and a nonlinear function of allele frequencies to amplify the differences in allele and haplotype frequencies to maintain statistical power with large numbers of marker loci. We investigate the relationship between the entropy-based test statistic and the standard χ² statistic and show that, in most cases, the power of the entropy-based statistic is greater than that of the standard χ² statistic. The distribution of the entropy-based statistic and the type I error rates are validated using simulation studies. Finally, we apply the new entropy-based test statistic to two real data sets, one for the COMT gene and schizophrenia and one for the MMP-2 gene and esophageal carcinoma, to evaluate the performance of the new method for genetic association studies. The results show that the entropy-based statistic obtained smaller P values than did the standard χ² statistic.

3.
Zaykin DV, Pudovkin A, Weir BS. Genetics 2008, 180(1): 533-545
The correlation between alleles at a pair of genetic loci is a measure of linkage disequilibrium. The square of the sample correlation multiplied by sample size provides the usual test statistic for the hypothesis of no disequilibrium for loci with two alleles and this relation has proved useful for study design and marker selection. Nevertheless, this relation holds only in a diallelic case, and an extension to multiple alleles has not been made. Here we introduce a similar statistic, R², which leads to a correlation-based test for loci with multiple alleles: for a pair of loci with k and m alleles, and a sample of n individuals, the approximate distribution of n(k - 1)(m - 1)R²/(km) under independence between loci is χ² with (k - 1)(m - 1) degrees of freedom. One advantage of this statistic is that it can be interpreted as the total correlation between a pair of loci. When the phase of two-locus genotypes is known, the approach is equivalent to a test for the overall correlation between rows and columns in a contingency table. In the phase-known case, R² is the sum of the squared sample correlations for all km 2 x 2 subtables formed by collapsing to one allele vs. the rest at each locus. We examine the approximate distribution under the null of independence for R² and report its close agreement with the exact distribution obtained by permutation. The test for independence using R² is a strong competitor to approaches such as Pearson's chi-square, Fisher's exact test, and a test based on Cressie and Read's power divergence statistic. We combine this approach with our previous composite-disequilibrium measures to address the case when the genotypic phase is unknown. Calculation of the new multiallele test statistic and its P-value is very simple and utilizes the approximate distribution of R². We provide a computer program that evaluates approximate as well as "exact" permutational P-values.
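
The phase-known construction in this abstract (sum the squared correlations over all km collapsed 2 x 2 subtables, then scale by n(k - 1)(m - 1)/(km)) can be sketched in a few lines. This is a minimal illustration and not the authors' program: the function name is hypothetical, the table total is taken as the sample size n, and subtables with a zero-frequency allele are simply skipped.

```python
def multiallelic_r2_test(table):
    """Return (R2, scaled statistic, df) for a k x m table of two-locus counts.

    table[i][j] is the count of phase-known observations carrying allele i
    at the first locus and allele j at the second. Assumption: the table
    total plays the role of the sample size n in the scaling factor.
    """
    k, m = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    row_freq = [sum(row) / n for row in table]
    col_freq = [sum(table[i][j] for i in range(k)) / n for j in range(m)]

    r2_total = 0.0
    for i in range(k):
        for j in range(m):
            p, q = row_freq[i], col_freq[j]
            denom = p * (1 - p) * q * (1 - q)
            if denom == 0:
                continue  # allele absent or fixed: correlation undefined
            d = table[i][j] / n - p * q  # disequilibrium for "i vs. rest", "j vs. rest"
            r2_total += d * d / denom

    stat = n * (k - 1) * (m - 1) / (k * m) * r2_total
    df = (k - 1) * (m - 1)
    return r2_total, stat, df

# With two alleles per locus all four subtables share the same r^2,
# so the scaled statistic reduces to n * r^2.
print(multiallelic_r2_test([[30, 10], [10, 30]]))
```

For the diallelic table shown, the scaled statistic coincides with the usual sample-size-times-squared-correlation test, which is the special case the abstract starts from.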

4.
Fan R, Floros J, Xiong M. Human heredity 2002, 53(3): 130-145
In this paper, we explore models and tests for association and linkage studies of a quantitative trait locus (QTL) linked to a multi-allele marker locus. Based on the difference between an offspring's conditional trait means of receiving and not receiving an allele from a parent at the marker locus, we propose three statistics T(m), T(m,row) and T(m,col) to test association or linkage disequilibrium between the marker locus and the QTL. These tests are composite tests, and use the offspring marginal sample means including offspring data of both homozygous and heterozygous parents. For the linkage study, we calculate the offspring's conditional trait mean given the allele transmission status of a heterozygous parent at the marker locus. Based on the difference between the conditional means of a transmitted and a nontransmitted allele from a heterozygous parent, we propose statistics T(parsi), T(satur), T(gen) and T(m,het) to perform composite tests of linkage between the marker locus and the quantitative trait locus in the presence of association. These tests only use the offspring data that are related to the heterozygous parents at the marker locus. T(parsi) is a parsimonious or allele-wise statistic, T(satur) and T(gen) are saturated or genotype-wise statistics, and T(m,het) compares the row and column sample means for offspring data of heterozygous parents. After comparing the powers and the sample sizes, we conclude that T(parsi) has higher power than those of the bi-allele tests, T(satur), T(gen), and T(m,het). If there is tight linkage between the marker and the trait locus, T(parsi) is powerful in detecting linkage between the marker and the trait locus in the presence of association. By investigating the goodness-of-fit of T(parsi), we find that T(satur) does not gain much power compared to that of T(parsi). Moreover, T(parsi) takes into account the pattern of the data that is consistent with linkage and linkage disequilibrium. As the number of alleles at the marker locus increases, T(parsi) is very conservative, and can be useful even for sparse data. To illustrate the usefulness and the power of the methods proposed in this paper, we analyze the chromosome 6 data of the Oxford asthma data, Genetic Analysis Workshop 12.

5.
OBJECTIVE: The potential value of haplotypes has attracted widespread interest in the mapping of complex traits. Haplotype sharing methods take the linkage disequilibrium information between multiple markers into account, and may have good power to detect predisposing genes. We present a new approach based on Mantel statistics for space-time clustering, which is developed in order to improve the power of haplotype sharing analysis for gene mapping in complex disease. METHODS: The new statistic correlates genetic similarity and phenotypic similarity across pairs of haplotypes for case-only and case-control studies. The genetic similarity is measured as the shared length between haplotypes around a putative disease locus. The phenotypic similarity is measured as the mean-corrected cross-product based on the respective phenotypes. We analyzed two tests for statistical significance with respect to type I error: (1) assuming asymptotic normality, and (2) using a Monte Carlo permutation procedure. The results were compared to the χ² test for association based on 3-marker haplotypes. RESULTS: The results of the type I error rates for the Mantel statistics using the permutational procedure yielded pointwise valid tests. The approach based on the assumption of asymptotic normality was seriously liberal. CONCLUSION: Power comparisons showed that the Mantel statistics were better than or equal to the χ² test for all simulated disease models.

6.
Ghosh S, Reich T. Human heredity 2002, 53(4): 181-186
The traditional transmission disequilibrium test (TDT) (Spielman et al., 1993) is a powerful test for association only in the presence of linkage. Since allele transmissions from homozygous parents do not carry any information on linkage, the TDT statistic uses data only on heterozygous parents. However, homozygous parents carry information on association between alleles at a marker locus and a disease locus. In this article, we explore whether inclusion of homozygous parents increases the power to detect association. The resultant test statistic follows a χ² distribution with 2 degrees of freedom. Monte-Carlo simulations are included to compare the performance of this test with the traditional TDT under different disease models.

7.
To test for linkage between a trait and a marker, one can consider identical marker alleles in related individuals, for instance, sibs. For recessive diseases, it has been shown that some information may be gained from the identity by descent (IBD) of the two alleles of an affected inbred individual at the marker locus. The aim of this paper is to extend the sib-pair method of linkage analysis to the situation of sib pairs sampled from consanguineous populations. This extension takes maximum advantage of the information provided by both the IBD pattern between sibs and allelic identity within each sib of the pair. This is possible through the use of the condensed identity coefficients. Here, we propose a new test of linkage based on a χ² statistic. We compare the performance of this test with that of the classical χ² test based on the distribution of sib pairs sharing 0, 1, or 2 alleles IBD. For sib pairs from first-cousin matings, the proposed test can better detect the role of a disease-susceptibility (DS) locus. Its power is shown to be greater than that of the classical test, especially for models where the DS allele may be common and incompletely penetrant; that is to say for situations that may be encountered in multifactorial diseases. A study of the impact of inbreeding on the expected proportions of sib pairs sharing 0, 1, or 2 alleles IBD is also performed here. Ignoring inbreeding, when in fact inbreeding exists, increases the rate of type I errors in tests of linkage.
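
The classical test that this abstract uses as its baseline can be sketched as a goodness-of-fit comparison. The example assumes non-inbred sib pairs, for which the expected proportions sharing 0, 1, or 2 alleles IBD under no linkage are 1/4, 1/2, and 1/4; it does not implement the condensed-identity-coefficient extension proposed in the paper, and the function name is hypothetical.

```python
import math

def ibd_sharing_test(n0, n1, n2):
    """Chi-square goodness-of-fit test on sib pairs sharing 0, 1, or 2 alleles IBD.

    Assumption: no inbreeding, so the null proportions are 1/4, 1/2, 1/4.
    Returns the statistic (2 df) and its upper-tail P-value.
    """
    n = n0 + n1 + n2
    observed = (n0, n1, n2)
    expected = (n / 4, n / 2, n / 4)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 2 degrees of freedom the chi-square survival function is exp(-x/2).
    p_value = math.exp(-stat / 2)
    return stat, p_value

# Hypothetical counts for 100 affected sib pairs.
print(ibd_sharing_test(20, 50, 30))
```

As the abstract notes, applying these null proportions to pairs from consanguineous matings would inflate the type I error rate, which is why the expected proportions have to be recomputed when inbreeding is present.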

8.
Nixon J. Heredity 2006, 96(4): 290-297
It is important that breeders have the means to assess genetic scoring data for segregation distortion because of its probable effect on the design of efficient breeding strategies. Scoring data is usually assessed for segregation distortion by separate nonindependent χ² tests at each locus in a set of marker loci. This analysis gives the loci most affected by selection if it exists, but it cannot give a statistically correct test for the presence or absence of selection in a linkage group as a whole. I have used a combined test, called the single locus test, whose statistic is the most significant P-value from the above tests. I have also derived mathematically a new combined statistical test, the overall test, for segregation distortion that requires genetic scoring data for a single linkage group. This test also takes genetic linkage into account. Using a range of marker densities and population sizes, simulations were carried out to compare the power of these two statistical tests to detect the effect of selection at one or two loci. The single locus test was always found to be more powerful than the overall test, but the single locus test required a more complicated P-value correction. For the single locus test, approximate correction factors for the P-values are given for a range of marker densities and genetic lengths.

9.
Disease association with a genetic marker is often taken as a preliminary indication of linkage with disease susceptibility. However, population subdivision and admixture may lead to disease association even in the absence of linkage. In a previous paper, we described a test for linkage (and linkage disequilibrium) between a genetic marker and disease susceptibility; linkage is detected by this test only if association is also present. This transmission/disequilibrium test (TDT) is carried out with data on transmission of marker alleles from parents heterozygous for the marker to affected offspring. The TDT is a valid test for linkage and association, even when the association is caused by population subdivision and admixture. In the previous paper, we did not explicitly consider the effect of recent history on population structure. Here we extend the previous results by examining in detail the effects of subdivision and admixture, viewed as processes in population history. We describe two models for these processes. For both models, we analyze the properties of (a) the TDT as a test for linkage (and association) between marker and disease and (b) the conventional contingency statistic used with family data to test for population association. We show that the contingency test statistic does not have a χ² distribution if subdivision or admixture is present. In contrast, the TDT remains a valid χ² statistic for the linkage hypothesis, regardless of population history.
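
For a marker with two alleles, the TDT described here is commonly computed as a McNemar-type statistic on transmissions from heterozygous parents. The sketch below assumes that standard form, (b - c)²/(b + c) with 1 df; the function and argument names are hypothetical and the counts are made up for illustration.

```python
import math

def tdt(transmitted, not_transmitted):
    """Transmission/disequilibrium test for a diallelic marker.

    transmitted / not_transmitted: number of heterozygous parents who did /
    did not transmit the allele of interest to an affected offspring.
    Returns the 1-df chi-square statistic and its upper-tail P-value.
    """
    b, c = transmitted, not_transmitted
    if b + c == 0:
        raise ValueError("no informative (heterozygous) parents")
    stat = (b - c) ** 2 / (b + c)
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical data: 60 transmissions versus 40 non-transmissions.
print(tdt(60, 40))
```

Because only within-family transmissions enter the statistic, the comparison is unaffected by allele-frequency differences between subpopulations, which is the property the abstract emphasizes.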

10.
Svishcheva GR. Genetika 2007, 43(2): 265-275
A method is proposed for analysis of quantitative traits in animal hybrid pedigrees formed by crosses between outbred lines differing in allele frequencies of the genes controlling the trait studied. The method is based on the decomposition of trait variances into components and uses maximization of the likelihood function for estimating model parameters, which allows the estimation of additive and dominance effects of the gene involved in trait determination and its allele frequencies, as well as determination of the chromosomal position of this gene relative to genotyped markers. To test the linkage of this gene with markers, a statistic with the noncentral χ² distribution has been chosen. Analytical expressions for the power of this method have been derived. The method has been tested on small model hybrid pedigrees. Phenotypic values of the trait and information on marker genotypes for each individual in hybrid pedigrees are original data for the analysis of a quantitative trait.

11.
The present study detected two single nucleotide polymorphisms (SNPs) at the PLA2G4D locus, rs2459692 and rs4924618, to investigate a genetic association between the PLA2G4D gene and schizophrenia. A total of 236 Chinese parent-offspring trios of Han descent were recruited for the genetic analysis. The transmission disequilibrium test (TDT) did not show allelic association either for rs2459692 (χ² = 0.217, P = 0.641) or for rs4924618 (χ² = 0.663, P = 0.416). To see the combined effect of the PLA2G4D locus with the other three PLA2G4 genes, we applied the above two SNPs as a conditional marker to test the pair-wise combination for a disease association. The conditioning on allele (COA) test revealed a weak association for the rs2459692-PLA2G4A combination (χ² = 6.03, df = 2, P = 0.049), the rs2459692-PLA2G4B combination (χ² = 7.16, df = 3, P = 0.028) and the rs4924618-PLA2G4C combination (χ² = 7.01, df = 2, P = 0.03), whereas the conditioning on genotype (COG) test showed a weak association only for the rs4924618-PLA2G4C combination (χ² = 8.52, df = 3, P = 0.036). Because we performed a multi-locus analysis in this study, the weak association shown by the conditional tests could make little biological sense. In conclusion, the PLA2G4D gene may not be involved in a susceptibility to schizophrenia.
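
The P-values quoted in this abstract follow from the χ² statistic and its degrees of freedom. As a small worked illustration, the sketch below recomputes the reported values for the 1-df TDT results and the 2-df COA results, using the closed-form upper-tail probabilities that exist for those two cases. The helper name is hypothetical, and the TDT results are assumed to have 1 df because the abstract does not state it.

```python
import math

def chi2_pvalue(stat, df):
    """Upper-tail probability of a chi-square statistic, for df = 1 or 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("closed form implemented only for df = 1 or 2")

# (statistic, df, P-value as reported in the abstract)
reported = [
    (0.217, 1, 0.641),  # TDT, rs2459692 (1 df assumed)
    (0.663, 1, 0.416),  # TDT, rs4924618 (1 df assumed)
    (6.03, 2, 0.049),   # COA, rs2459692-PLA2G4A
    (7.01, 2, 0.03),    # COA, rs4924618-PLA2G4C
]
for stat, df, p_reported in reported:
    print(stat, df, p_reported, chi2_pvalue(stat, df))
```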

12.
OBJECTIVES: Modelling of variation in identical-by-descent (IBD) allele sharing using covariates can increase power to detect linkage, identify covariate-defined subgroups linked to particular marker regions, and improve the design of subsequent studies to localize genes and characterize their effects. In this report, we highlight issues that arise in studies of families with affected relatives. METHODS: Mirea et al. [Genet Epidemiol 2003, in press] extended linear and exponential linkage likelihood models [Kong and Cox, Am J Hum Genet 1997;61: 1179-1188] to model variation in NPL scores among covariate-defined groups of families, and proposed likelihood ratio (LR) and t statistics to detect differences in allele sharing between groups defined by a binary covariate. Here we evaluate factors affecting the power of these tests analytically and by example, as well as effects of constraints, nuisance parameters, and incomplete data on test validity by simulation of locus heterogeneity in families with affected siblings or affected cousins. RESULTS: Provided constraints on the parameters are avoided, these tests are particularly useful when one subgroup has less than expected IBD sharing. The distribution of the LR statistic depends on the extent of linkage, particularly in the presence of constraints. The t statistic may be biased by group differences in information content. CONCLUSIONS: We recommend that constraints be applied cautiously, and covariate effects in IBD allele sharing models interpreted with care.

13.
The central theme in case-control genetic association studies is to efficiently identify genetic markers associated with trait status. Powerful statistical methods are critical to accomplishing this goal. A popular method is the omnibus Pearson's chi-square test applied to genotype counts. To achieve increased power, tests based on an assumed trait model have been proposed. However, they are not robust to model misspecification. Much research has been carried out on enhancing robustness of such model-based tests. An analysis framework that tests the equality of allele frequency while allowing for different deviations from Hardy-Weinberg equilibrium (HWE) between cases and controls is proposed. The proposed method requires neither specification of a trait model nor HWE. It involves only 1 degree of freedom. The likelihood ratio statistic, score statistic, and Wald statistic associated with this framework are introduced. Their performance is evaluated by extensive computer simulation in comparison with existing methods.

14.
Recently, it has been suggested that traditional nonparametric multipoint-linkage procedures can show a "bias" toward the null hypothesis of no effect when there is incomplete information about allele sharing at genotyped marker loci (or at positions in between marker loci). Here, I investigate the extent of this bias for a variety of test statistics commonly used in qualitative- ("affecteds only") and quantitative-trait linkage analysis. Through simulation and analytical derivation, I show that many of the test statistics available in standard linkage analysis packages (such as Genehunter, Merlin, and Allegro) are, in fact, not affected by this bias problem. A few test statistics--most notably the nonparametric linkage statistic and, to a lesser extent, the Aspex-MLS and Haseman-Elston statistics--are affected by the bias. Variance-components procedures, although unbiased, can show inflation or deflation of the test statistic attributable to the inclusion of pairs with incomplete identity-by-descent information. Results obtained--for instance, in genome scans--using these methods might therefore be worth revisiting to see if greater power can be obtained by use of an alternative statistic or by eliminating or downweighting uninformative relative pairs.

15.
Klei L, Roeder K. Human genetics 2007, 121(5): 549-557
Samples consisting of a mix of unrelated cases and controls, small pedigrees, and much larger pedigrees present a unique challenge for association studies. Few methods are available for efficient analysis of such a broad spectrum of data structures. In this paper we introduce a new matching statistic that is well suited to complex data structures and compare it with frequency-based methods available in the literature. To investigate and compare the power of these methods we simulate datasets based on complex pedigrees. We examine the influence of various levels of linkage disequilibrium (LD) of the disease allele with a marker allele (or equivalently a haplotype). For low frequency marker alleles/haplotypes, frequency-based statistics are more powerful in detecting association. In contrast, for high frequency marker alleles, the matching statistic has greater power. The highest power for frequency-based statistics occurs when the disease allele frequency closely matches the frequency of the linked marker allele. In contrast maximum power of the matching statistic always occurs for intermediate marker allele frequency regardless of the disease allele frequency. Moreover, the matching and frequency-based statistics exhibit little correlation. We conclude that these two approaches can be viewed as complementary in finding possible association between a disease and a marker for many different situations.

16.
In studies of complex diseases, a common paradigm is to conduct association analysis at markers in regions identified by linkage analysis, to attempt to narrow the region of interest. Family-based tests for association based on parental transmissions to affected offspring are often used in fine-mapping studies. However, for diseases with late onset, parental genotypes are often missing. Without parental genotypes, family-based tests either compare allele frequencies in affected individuals with those in their unaffected siblings or use siblings to infer missing parental genotypes. An example of the latter approach is the score test implemented in the computer program TRANSMIT. The inference of missing parental genotypes in TRANSMIT assumes that transmissions from parents to affected siblings are independent, which is appropriate when there is no linkage. However, using computer simulations, we show that, when the marker and disease locus are linked and the data set consists of families with multiple affected siblings, this assumption leads to a bias in the score statistic under the null hypothesis of no association between the marker and disease alleles. This bias leads to an inflated type I error rate for the score test in regions of linkage. We present a novel test for association in the presence of linkage (APL) that correctly infers missing parental genotypes in regions of linkage by estimating identity-by-descent parameters, to adjust for correlation between parental transmissions to affected siblings. In simulated data, we demonstrate the validity of the APL test under the null hypothesis of no association and show that the test can be more powerful than the pedigree disequilibrium test and family-based association test. As an example, we compare the performance of the tests in a candidate-gene study in families with Parkinson disease.

17.
The purpose of this work is to quantify the effects that errors in genotyping have on power and the sample size necessary to maintain constant asymptotic Type I and Type II error rates (SSN) for case-control genetic association studies between a disease phenotype and a di-allelic marker locus, for example a single nucleotide polymorphism (SNP) locus. We consider the effects of three published models of genotyping errors on the chi-square test for independence in the 2 x 3 table. After specifying genotype frequencies for the marker locus conditional on disease status and error model in both a genetic model-based and a genetic model-free framework, we compute the asymptotic power to detect association through specification of the test's non-centrality parameter. This parameter determines the functional dependence of SSN on the genotyping error rates. Additionally, we study the dependence of SSN on linkage disequilibrium (LD), marker allele frequencies, and genotyping error rates for a dominant disease model. Increased genotyping error rate requires a larger SSN. Every 1% increase in sum of genotyping error rates requires that both case and control SSN be increased by 2-8%, with the extent of increase dependent upon the error model. For the dominant disease model, SSN is a nonlinear function of LD and genotyping error rate, with greater SSN for lower LD and higher genotyping error rate. The combination of lower LD and higher genotyping error rates requires a larger SSN than the sum of the SSN for the lower LD and for the higher genotyping error rate.

18.
The affected-pedigree-member method of linkage analysis. (cited by: 67; self-citations: 45; citations by others: 22)
This paper describes a generalization of the affected-sib-pair method of linkage analysis to pedigrees. By substituting identity-by-state relations for identity-by-descent relations, we develop a test statistic for detecting departures from independent segregation of disease and marker phenotypes. The statistic is based on the marker phenotypes of affected pedigree members only. Since it is more striking for distantly affected relatives to share a rare marker allele than a common marker allele, the statistic also includes a weighting factor based on allele frequency. The distributional properties of the statistic are investigated theoretically and by simulation. Part of the theoretical treatment entails generalizing Karigl's multiple-person kinship coefficients. When the test statistic is applied to pedigree data on Huntington disease, the null hypothesis of independent segregation between the marker locus and the disease locus is firmly rejected. In this case, as expected, there is a loss of power when compared with standard lod-score analysis. However, our statistic possesses the advantage of requiring no explicit assumptions about the mode of inheritance of the disease. This point is illustrated by application of the test statistic to data on rheumatoid arthritis.

19.
A new statistical test for linkage heterogeneity. (cited by: 6; self-citations: 5; citations by others: 1)
A new statistical test for linkage heterogeneity is described. It is a likelihood-ratio test based on a beta distribution for the prior distribution of the recombination fraction among families (or individuals). The null distribution for this statistic (called the B-test) is derived under a broad range of circumstances. Two other heterogeneity test statistics--the admixture test or A-test, first described by Smith, and Morton's test (here referred to as the K-test)--are also examined. The probability distribution for the K-test statistic is very sensitive to family size, whereas the other two statistics are not. All three statistics are somewhat sensitive to the magnitude of the recombination fraction theta. Critical values for each of the test statistics are given. A conservative approximation for both the A-test and B-test is given by a χ² distribution when P/2 instead of P is used for the observed significance level. In terms of power, the B-test performs best among the three tests over a broad range of alternate heterogeneity hypotheses--except for the specific case of admixture with loose linkage, in which the A-test performs best. Overall, the difference in power among the three tests is not large. An application to some recently published data on the fragile-X syndrome and X-chromosome markers is given.

20.
The transmission/disequilibrium test (TDT) is a popular, simple, and powerful test of linkage, which can be used to analyze data consisting of transmissions to the affected members of families with any kind of pedigree structure, including affected sib pairs (ASPs). Although it is based on the preferential transmission of a particular marker allele across families, it is not a valid test of association for ASPs. Martin et al. devised a similar statistic for ASPs, Tsp, which is also based on preferential transmission of a marker allele but which is a valid test of both linkage and association for ASPs. It is, however, less powerful than the TDT as a test of linkage for ASPs. What I show is that the differences between the TDT and Tsp are due to the fact that, although both statistics are based on preferential transmission of a marker allele, the TDT also exploits excess sharing in identity-by-descent transmissions to ASPs. Furthermore, I show that both of these statistics are members of a family of "TDT-like" statistics for ASPs. The statistics in this family are based on preferential transmission but also, to varying extents, exploit excess sharing. From this family of statistics, we see that, although the TDT exploits excess sharing to some extent, it is possible to do so to a greater extent, and thus produce a more powerful test of linkage for ASPs than is provided by the TDT. Power simulations conducted under a number of disease models are used to verify that the most powerful member of this family of TDT-like statistics is more powerful than the TDT for ASPs.
