Similar literature
1.
Chen Z  Ng HK 《Human heredity》2012,73(1):26-34
In genetic association studies, due to the varying underlying genetic models, no single statistical test can be the most powerful test under all situations. Current studies show that if the underlying genetic models are known, trend-based tests, which outperform the classical Pearson χ² test, can be constructed. However, when the underlying genetic models are unknown, the χ² test is usually more robust than trend-based tests. In this paper, we propose a new association test based on a generalized genetic model, namely the generalized order-restricted relative risks model. Through a Monte Carlo simulation study, we show that the proposed association test is generally more powerful than the χ² test, and more robust than those trend-based tests. The proposed methodologies are also illustrated by some real SNP datasets.
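As a rough illustration of the two baseline tests contrasted in this abstract, the sketch below computes the Pearson χ² statistic (2 degrees of freedom) and the Cochran-Armitage trend statistic (1 degree of freedom) for a 2×3 case-control genotype table. The function names, the additive scores (0, 1, 2), and the genotype counts are assumptions made for illustration; this is not the order-restricted test proposed by the authors.

```python
import math


def pearson_chi2(cases, controls):
    """Pearson chi-square statistic for a 2x3 table of genotype counts."""
    n_cases, n_controls = sum(cases), sum(controls)
    n = n_cases + n_controls
    stat = 0.0
    for r, s in zip(cases, controls):
        col = r + s
        if col == 0:
            continue  # an empty genotype column contributes nothing
        for observed, row_total in ((r, n_cases), (s, n_controls)):
            expected = row_total * col / n
            stat += (observed - expected) ** 2 / expected
    return stat


def trend_chi2(cases, controls, scores=(0.0, 1.0, 2.0)):
    """Cochran-Armitage trend statistic; scores (0, 1, 2) assume an additive model."""
    n_cases, n_controls = sum(cases), sum(controls)
    n = n_cases + n_controls
    cols = [r + s for r, s in zip(cases, controls)]
    u = sum(x * (n_controls * r - n_cases * s)
            for x, r, s in zip(scores, cases, controls))
    sum_xc = sum(x * c for x, c in zip(scores, cols))
    sum_x2c = sum(x * x * c for x, c in zip(scores, cols))
    var = (n_cases * n_controls / n) * (n * sum_x2c - sum_xc ** 2)
    return u * u / var if var > 0 else 0.0


def p_value_df1(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def p_value_df2(stat):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-stat / 2.0)


if __name__ == "__main__":
    # Hypothetical genotype counts (AA, Aa, aa), for illustration only.
    cases, controls = (30, 50, 20), (45, 45, 10)
    print("Pearson:", pearson_chi2(cases, controls),
          p_value_df2(pearson_chi2(cases, controls)))
    print("Trend:  ", trend_chi2(cases, controls),
          p_value_df1(trend_chi2(cases, controls)))
```

The trend test spends a single degree of freedom on one assumed dose-response pattern, which is why it can gain power when the scores match the true genetic model and lose power when they do not.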

2.

Background

Current robust association tests for case–control genome-wide association study (GWAS) data are mainly based on the assumption of some specific genetic models. Due to the richness of the genetic models, this assumption may not be appropriate. Therefore, robust but powerful association approaches are desirable.

Results

In this paper, we propose a new approach to testing for the association between the genotype and phenotype for case–control GWAS. This method assumes a generalized genetic model and is based on the selected disease allele to obtain a p-value from the more powerful one-sided test. Through a comprehensive simulation study we assess the performance of the new test by comparing it with existing methods. Some real data applications are also used to illustrate the use of the proposed test.

Conclusions

Based on the simulation results and real data application, the proposed test is powerful and robust.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-358) contains supplementary material, which is available to authorized users.

3.
Genomic imprinting is a genetic phenomenon in which certain alleles are differentially expressed in a parent-of-origin-specific manner, and plays an important role in the study of complex traits. For a diallelic marker locus in humans, the parental-asymmetry tests Q-PAT(c) with any constant c were developed to detect parent-of-origin effects for quantitative traits. However, these methods can only be applied to nuclear families and thus are not suitable for extended pedigrees. In this study, by making no assumption about the distribution of the quantitative trait, we first propose the pedigree parental-asymmetry tests Q-PPAT(c) with any constant c for quantitative traits to test for parent-of-origin effects based on nuclear families with complete information from general pedigree data, in the presence of association between marker alleles under study and quantitative traits. When any genotypes are missing in the pedigrees, we utilize Monte Carlo (MC) sampling and estimation and develop the Q-MCPPAT(c) statistics to test for parent-of-origin effects. Various simulation studies are conducted to assess the performance of the proposed methods, for different sample sizes, genotype missing rates, degrees of imprinting effects and population models. Simulation results show that the proposed methods control the size well under the null hypothesis of no parent-of-origin effects and that Q-PPAT(c) are robust to population stratification. In addition, the power comparison demonstrates that Q-PPAT(c) and Q-MCPPAT(c) for pedigree data are much more powerful than Q-PAT(c), which uses only two-generation nuclear families selected from extended pedigrees.

4.
A robust statistical method to detect linkage or association between a genetic marker and a set of distinct phenotypic traits is to combine univariate trait-specific test statistics for a more powerful overall test. This procedure does not need complex modeling assumptions, can easily handle the problem with partially missing trait values, and is applicable to the case with a mixture of qualitative and quantitative traits. In this note, we propose a simple test procedure along this line, and show its advantages over the standard combination tests for linkage or association in the literature through a data set from Genetic Analysis Workshop 12 (GAW12) and an extensive simulation study.

5.
Recent developments such as the more powerful quasi-likelihood score test (MQLS) statistic have enabled efficient association analysis with related samples. Although those approaches are robust against a mis-specified phenotypic distribution and covariance structure, it has been shown that the MQLS statistic is no longer valid in the presence of population substructure if the level of population substructure depends on the genomic location. In this report, we propose a new statistical method that combines the EIGENSTRAT approach and the MQLS statistic. The proposed method was evaluated with simulated data under various scenarios, and we found that it performs better than traditional methods such as the transmission disequilibrium test. The proposed method was applied to a genetic association analysis of body mass index in the Framingham Heart Study, and we found that rs1121980 and rs9940128, in a linkage block in the FTO gene, are associated with body mass index.

6.
In case-control studies, genetic associations for complex diseases may be probed either with single-locus tests or with haplotype-based tests. Although there are different views on the relative merits and preferences of the two test strategies, haplotype-based analyses are generally believed to be more powerful to detect genes with modest effects. However, a main drawback of haplotype-based association tests is the large number of distinct haplotypes, which increases the degrees of freedom for corresponding test statistics and thus reduces the statistical power. To decrease the degrees of freedom and enhance the efficiency and power of haplotype analysis, we propose an improved haplotype clustering method that is based on the haplotype cladistic analysis developed by Durrant et al. In our method, we attempt to combine the strengths of single-locus analysis and haplotype-based analysis into one single test framework. Novel in our method is that we develop a more informative haplotype similarity measurement by using p-values obtained from single-locus association tests to construct a measure of weight, which to some extent incorporates the information of disease outcomes. The weights are then used in computation of similarity measures to construct distance metrics between haplotype pairs in haplotype cladistic analysis. To assess our proposed new method, we performed simulation analyses to compare the relative performances of (1) conventional haplotype-based analysis using original haplotype, (2) single-locus allele-based analysis, (3) original haplotype cladistic analysis (CLADHC) by Durrant et al., and (4) our weighted haplotype cladistic analysis method, under different scenarios. Our weighted cladistic analysis method shows an increased statistical power and robustness, compared with the methods of haplotype cladistic analysis, single-locus test, and the traditional haplotype-based analyses. The real data analyses also show that our proposed method has practical significance in the human genetics field.

7.
Family-based study design will play a key role in identifying rare causal variants, because rare causal variants can be enriched in families with multiple affected subjects. Furthermore, different from population-based studies, family studies are robust to bias induced by population substructure. It is well known that rare causal variants are difficult to detect from single-locus tests. Therefore, burden tests and non-burden tests have been developed, by combining signals of multiple variants in a chromosomal region or a functional unit. This inevitably incorporates some neutral variants into the test statistics, which can dilute the power of statistical methods. To guard against the noise caused by neutral variants, we here propose an ‘adaptive combination of P-values method’ (abbreviated as ‘ADA’). This method combines per-site P-values of variants that are more likely to be causal. Variants with large P-values (which are more likely to be neutral variants) are discarded from the combined statistic. In addition to performing extensive simulation studies, we applied these tests to the Genetic Analysis Workshop 17 data sets, where real sequence data were generated according to the 1000 Genomes Project. Compared with some existing methods, ADA is more robust to the inclusion of neutral variants. This is a merit especially when dichotomous traits are analyzed. However, there are some limitations for ADA. First, it is more computationally intensive. Second, pedigree structures and founders' sequence data are required for the permutation procedure. Third, unrelated controls cannot be included. We here show that, for family-based studies, the application of ADA is limited to dichotomous trait analyses with full pedigree information.

8.
Association tests that pool minor alleles into a measure of burden at a locus have been proposed for case-control studies using sequence data containing rare variants. However, such pooling tests are not robust to the inclusion of neutral and protective variants, which can mask the association signal from risk variants. Early studies proposing pooling tests dismissed methods for locus-wide inference using nonnegative single-variant test statistics based on unrealistic comparisons. However, such methods are robust to the inclusion of neutral and protective variants and therefore may be more useful than previously appreciated. In fact, some recently proposed methods derived within different frameworks are equivalent to performing inference on weighted sums of squared single-variant score statistics. In this study, we compared two existing methods for locus-wide inference using nonnegative single-variant test statistics to two widely cited pooling tests under more realistic conditions. We established analytic results for a simple model with one rare risk and one rare neutral variant, which demonstrated that pooling tests were less powerful than even Bonferroni-corrected single-variant tests in most realistic situations. We also performed simulations using variants with realistic minor allele frequency and linkage disequilibrium spectra, disease models with multiple rare risk variants and extensive neutral variation, and varying rates of missing genotypes. In all scenarios considered, existing methods using nonnegative single-variant test statistics had power comparable to or greater than two widely cited pooling tests. Moreover, in disease models with only rare risk variants, an existing method based on the maximum single-variant Cochran-Armitage trend chi-square statistic in the locus had power comparable to or greater than another existing method closely related to some recently proposed methods. We conclude that efficient locus-wide inference using single-variant test statistics should be reconsidered as a useful framework for devising powerful association tests in sequence data with rare variants.

9.
10.
The gene has been proposed as an attractive unit of analysis for association studies, but a simple yet valid, powerful, and sufficiently fast method of evaluating the statistical significance of all genes in large, genome-wide datasets has been lacking. Here we propose the use of an extended Simes test that integrates functional information and association evidence to combine the p values of the single nucleotide polymorphisms within a gene to obtain an overall p value for the association of the entire gene. Our computer simulations demonstrate that this test is more powerful than the SNP-based test, offers effective control of the type 1 error rate regardless of gene size and linkage-disequilibrium pattern among markers, and does not need permutation or simulation to evaluate empirical significance. Its statistical power in simulated data is at least comparable, and often superior, to that of several alternative gene-based tests. When applied to real genome-wide association study (GWAS) datasets on Crohn disease, the test detected more significant genes than SNP-based tests and alternative gene-based tests. The proposed test, implemented in an open-source package, has the potential to identify additional novel disease-susceptibility genes for complex diseases from large GWAS datasets.
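For orientation, the classical Simes combination that the extended test builds on can be sketched in a few lines. This is the standard, unweighted Simes procedure only: the extension described in the abstract (integrating functional information and accounting for linkage disequilibrium) is not reproduced here, and the function name and the SNP p-values are hypothetical.

```python
def simes_p(pvalues):
    """Classical Simes combined p-value: min over ranks i of n * p_(i) / i."""
    if not pvalues:
        raise ValueError("at least one p-value is required")
    ordered = sorted(pvalues)
    n = len(ordered)
    combined = min(n * p / rank for rank, p in enumerate(ordered, start=1))
    return min(1.0, combined)


if __name__ == "__main__":
    # Hypothetical per-SNP p-values within one gene.
    snp_pvalues = [0.04, 0.01, 0.03]
    print(simes_p(snp_pvalues))
```

Because the combined value comes from a closed-form expression over the sorted p-values, no permutation or simulation is needed, which matches the speed argument made in the abstract.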

11.
In genome-wide association studies (GWAS), multiple diseases with shared controls is one of the case–control study designs. If data obtained from these studies are appropriately analyzed, this design can have several advantages such as improving statistical power in detecting associations and reducing the time and cost in the data collection process. In this paper, we propose a study design for GWAS which involves multiple diseases but no controls. We also propose a corresponding statistical data analysis strategy for GWAS with multiple diseases but no controls. Through a simulation study, we show that the statistical association test with the proposed study design is more powerful than the test with a single disease sharing common controls, and it has comparable power to the overall test based on the whole dataset including the controls. We also apply the proposed method to a real GWAS dataset to illustrate the methodologies and the advantages of the proposed design. Some possible limitations of this study design and testing method and their solutions are also discussed. Our findings indicate that the proposed study design and statistical analysis strategy could be more efficient than the usual case–control GWAS as well as those with shared controls.

12.
With the development of massively parallel sequencing technologies, there is a substantial need for developing powerful rare variant association tests. Common approaches include burden and non-burden tests. Burden tests assume all rare variants in the target region have effects on the phenotype in the same direction and of similar magnitude. The recently proposed sequence kernel association test (SKAT) (Wu, M. C., and others, 2011. Rare-variant association testing for sequencing data with the SKAT. The American Journal of Human Genetics 89, 82-93), an extension of the C-alpha test (Neale, B. M., and others, 2011. Testing for an unusual distribution of rare variants. PLoS Genetics 7, 161-165), provides a robust test that is particularly powerful in the presence of protective and deleterious variants and null variants, but is less powerful than burden tests when a large number of variants in a region are causal and in the same direction. As the underlying biological mechanisms are unknown in practice and vary from one gene to another across the genome, it is of substantial practical interest to develop a test that is optimal for both scenarios. In this paper, we propose a class of tests that include burden tests and SKAT as special cases, and derive an optimal test within this class that maximizes power. We show that this optimal test outperforms burden tests and SKAT in a wide range of scenarios. The results are illustrated using simulation studies and triglyceride data from the Dallas Heart Study. In addition, we have derived a sample size/power calculation formula for SKAT with a new family of kernels to facilitate designing new sequence association studies.

13.
cDNA microarray data are subject to many sources of variation that have to be removed before statistical tests can be applied for identifying genes that are expressed differentially. Background correction, log-ratio transformation, and normalization, referred to as the log-ratio approach, have been widely used for this purpose. However, there are some problems associated with this procedure. In this study, we proposed an alternative approach that obviates the log-ratio transformation step and goes directly to normalization after background correction. The method can estimate the “noise” effect by utilizing the information more effectively. Simulation studies were carried out to compare this approach with the log-ratio approach in terms of feasibility and efficiency for detecting specifically and differentially expressed genes under various conditions. The results showed that our approach worked well and was more robust and powerful than the log-ratio approach.

14.
Compound tests for the detection of hitchhiking under positive selection
Many statistical tests have been developed for detecting positive selection. Most of these tests draw conclusions based on significant deviations from the patterns of polymorphism predicted by the neutral model. However, many non-equilibrium forces may cause similar deviations, and thus the tests usually have low statistical specificity to positive selection. The main challenge is hence to construct test statistics that are reasonably powerful in detecting positive selection, but are relatively insensitive to other forces. Recently, Zeng et al. (2006) proposed a new test, DH, which is a compound of Tajima's D and Fay and Wu's H, and showed that DH has reasonably high statistical specificity to positive selection. In this report, we expand the idea of a compound test by combining Fay and Wu's H or DH with the Ewens-Watterson (EW) test. We refer to these 2 new tests as HEW and DHEW, respectively. Compared to the DH test, HEW and DHEW are more robust against the presence of recombination, and are also more powerful in detecting positive selection. Furthermore, the DHEW test, similar to DH, is also relatively insensitive to background selection and demography. The HEW test, on the other hand, tends to be somewhat less conservative than DH and DHEW in some cases.

15.
The statistical analysis of cancer bioassay data has historically depended on the pathological determination of the experimental animal's cause of death. The poly-k statistical test has provided a method of statistical analysis of animal bioassay data without the need for cause of death information. The test has been shown to have good statistical properties in the typical 2-year cancer bioassay. However, while the poly-k test has been applied to chronic lifetime animal studies, it has not been formally evaluated with respect to the operating characteristics of this statistical test when applied to such studies. Thus, our objective is to assess the performance of the poly-k test for lifetime studies and to make comparisons with other tests. We observed in one recent lifetime study of the gasoline additive methyl tertiary butyl ether (MTBE) that the application of the poly-k test was not statistically robust. Simulation studies were subsequently conducted for a limited number of scenarios of lifetime cancer bioassays. These simulations showed that the poly-k test is not statistically robust for testing effect of increasing dose in some lifetime cancer studies.

16.
Although genetic association studies using unrelated individuals may be subject to bias caused by population stratification, alternative methods that are robust to population stratification, such as family-based association designs, may be less powerful. Furthermore, it is often more feasible and less expensive to collect unrelated individuals. Recently, several statistical methods have been proposed for case-control association tests in a structured population; these methods may be robust to population stratification. In the present study, we propose a quantitative similarity-based association test (QSAT) to identify association between a candidate marker and a quantitative trait of interest, through use of unrelated individuals. For the QSAT, we first determine whether two individuals are from the same subpopulation or from different subpopulations, using genotype data at a set of independent markers. We then perform an association test between the candidate marker and the quantitative trait, through incorporation of such information. Simulation results based on either coalescent models or empirical population genetics data show that the QSAT has a correct type I error rate in the presence of population stratification and that the power of the QSAT is higher than that of family-based association designs.

17.
Li J  Ban J  Santiago LS 《Biometrics》2011,67(4):1481-1488
Testing homogeneity of species assemblages has important applications in ecology. Due to the unique structure of abundance data often collected in ecological studies, most classical statistical tests cannot be applied directly. In this article, we propose two novel nonparametric tests for comparing species assemblages based on the concept of data depth. They can be considered as a natural generalization of the Kolmogorov-Smirnov and the Cramér-von Mises tests (KS and CM) in this species assemblage comparison context. Our simulation studies show that the proposed test is more powerful than other existing methods under various settings. A real example is used to demonstrate how the proposed method is applied to compare species assemblages using plant community data from a highly diverse tropical forest at Barro Colorado Island, Panama.

18.
We propose in this paper a unified approach for testing the association between rare variants and phenotypes in sequencing association studies. This approach maximizes power by adaptively using the data to optimally combine the burden test and the nonburden sequence kernel association test (SKAT). Burden tests are more powerful when most variants in a region are causal and the effects are in the same direction, whereas SKAT is more powerful when a large fraction of the variants in a region are noncausal or the effects of causal variants are in different directions. The proposed unified test maintains the power in both scenarios. We show that the unified test corresponds to the optimal test in an extended family of SKAT tests, which we refer to as SKAT-O. The second goal of this paper is to develop a small-sample adjustment procedure for the proposed methods for the correction of conservative type I error rates of SKAT family tests when the trait of interest is dichotomous and the sample size is small. Both small-sample-adjusted SKAT and the optimal unified test (SKAT-O) are computationally efficient and can easily be applied to genome-wide sequencing association studies. We evaluate the finite sample performance of the proposed methods using extensive simulation studies and illustrate their application using the acute-lung-injury exome-sequencing data of the National Heart, Lung, and Blood Institute Exome Sequencing Project.
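The combination of the burden test and SKAT described above is usually written as a one-parameter family of statistics. The display below follows that common formulation and is offered as a sketch, since the abstract itself gives no formulas. The symbols are assumptions: $g_{ij}$ is the genotype of subject $i$ at variant $j$, $\hat{\mu}_i$ is the fitted trait mean under the null model, $w_j$ is a variant weight, and $\rho$ is the mixing parameter.

```latex
\[
\begin{aligned}
S_j &= \sum_{i=1}^{n} g_{ij}\,\bigl(y_i - \hat{\mu}_i\bigr), \\
Q_{\mathrm{burden}} &= \Bigl(\sum_{j=1}^{m} w_j S_j\Bigr)^{2}, \qquad
Q_{\mathrm{SKAT}} = \sum_{j=1}^{m} w_j^{2} S_j^{2}, \\
Q_{\rho} &= \rho\, Q_{\mathrm{burden}} + (1-\rho)\, Q_{\mathrm{SKAT}}, \qquad 0 \le \rho \le 1 .
\end{aligned}
\]
```

Setting $\rho = 1$ recovers the burden statistic and $\rho = 0$ recovers SKAT. In this formulation, the optimal test corresponds to choosing the value of $\rho$ that gives the smallest p-value and then correcting for that selection.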

19.
This article focuses on conducting global testing for association between a binary trait and a set of rare variants (RVs), although its application can be much broader to other types of traits, common variants (CVs), and gene set or pathway analysis. We show that many of the existing tests have deteriorating performance in the presence of many nonassociated RVs: their power can dramatically drop as the proportion of nonassociated RVs in the group to be tested increases. We propose a class of so-called sum of powered score (SPU) tests, each of which is based on the score vector from a general regression model and hence can deal with different types of traits and adjust for covariates, e.g., principal components accounting for population stratification. The SPU tests generalize the sum test, a representative burden test based on pooling or collapsing genotypes of RVs, and a sum of squared score (SSU) test that is closely related to several other powerful variance component tests; a previous study (Basu and Pan 2011) has demonstrated good performance of one, but not both, of the Sum and SSU tests in many situations. The SPU tests are versatile in the sense that one of them is often powerful, although its identity varies with the unknown true association parameters. We propose an adaptive SPU (aSPU) test to approximate the most powerful SPU test for a given scenario, consequently maintaining high power and being highly adaptive across various scenarios. We conducted extensive simulations to show superior performance of the aSPU test over several state-of-the-art association tests in the presence of many nonassociated RVs. Finally, we applied the SPU and aSPU tests to the GAW17 mini-exome sequence data to compare their practical performance with some existing tests, demonstrating their potential usefulness.
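The SPU family described above can be sketched as follows. The sketch assumes the commonly cited form SPU(γ) = Σ_j U_j^γ, with score U_j = Σ_i (y_i − ȳ) x_ij for a binary trait and no covariates. The abstract does not state these formulas, so the form, the function names, the permutation scheme, and the toy data are all assumptions for illustration.

```python
import random


def score_vector(y, genotypes):
    """Per-variant scores U_j = sum_i (y_i - mean(y)) * x_ij (no covariates)."""
    y_bar = sum(y) / len(y)
    n_variants = len(genotypes[0])
    return [sum((yi - y_bar) * row[j] for yi, row in zip(y, genotypes))
            for j in range(n_variants)]


def spu(scores, gamma):
    """Sum of powered scores; gamma=1 is a Sum-type test, gamma=2 an SSU-type test."""
    return sum(u ** gamma for u in scores)


def permutation_p(y, genotypes, gamma, n_perm=999, seed=0):
    """Permutation p-value for |SPU(gamma)|, shuffling trait labels."""
    rng = random.Random(seed)
    observed = abs(spu(score_vector(y, genotypes), gamma))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(spu(score_vector(y_perm, genotypes), gamma)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy data: variant 1 is carried only by cases, variant 2 only by controls.
    y = [1, 1, 0, 0]
    genotypes = [[1, 0], [1, 0], [0, 1], [0, 1]]
    u = score_vector(y, genotypes)
    print(u, spu(u, 1), spu(u, 2))
```

In the toy data the two scores have opposite signs, so they cancel for γ = 1 but add up for γ = 2. This is the situation in which a burden-type test loses power and a squared-score test does not.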

20.
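cDNA microarray data contain many sources of variation, and this “noise” must be removed before the data are used to detect differentially expressed genes or for other statistical analyses. The log-ratio approach (background correction, log-ratio transformation, and data normalization) has been widely used in cDNA microarray data analysis, but it has some shortcomings that still need to be resolved. To address them, this paper proposes a non-transformation method that skips the log-ratio transformation and normalizes the data directly after background correction, which effectively removes experimental “noise”. The results show that, in detecting differentially expressed genes, the non-transformation method is more robust and more powerful than the conventional log-ratio approach, with a greatly improved gene detection rate and accuracy.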
