Similar literature
 20 similar articles found (search time: 31 ms)
1.
As genome-wide association studies (GWAS) are becoming more popular, two approaches, among others, could be considered in order to improve statistical power for identifying genes contributing subtle to moderate effects to human diseases. The first approach is to increase sample size, which could be achieved by combining both unrelated and familial subjects together. The second approach is to jointly analyze multiple correlated traits. In this study, by extending generalized estimating equations (GEEs), we propose a simple approach for performing univariate or multivariate association tests for the combined data of unrelated subjects and nuclear families. In particular, we correct for population stratification by integrating principal component analysis and transmission disequilibrium test strategies. The proposed method allows for multiple siblings as well as missing parental information. Simulation studies show that the proposed test has improved power compared to two popular methods, EIGENSTRAT and FBAT, by analyzing the combined data, while correcting for population stratification. In addition, joint analysis of bivariate traits has improved power over univariate analysis when pleiotropic effects are present. Application to the Genetic Analysis Workshop 16 (GAW16) data sets attests to the feasibility and applicability of the proposed method.

2.
Increasing empirical evidence suggests that many genetic variants influence multiple distinct phenotypes. When cross-phenotype effects exist, multivariate association methods that consider pleiotropy are often more powerful than univariate methods that model each phenotype separately. Although several statistical approaches exist for testing cross-phenotype effects for common variants, there is a lack of similar tests for gene-based analysis of rare variants. In order to fill this important gap, we introduce a statistical method for cross-phenotype analysis of rare variants using a nonparametric distance-covariance approach that compares similarity in multivariate phenotypes to similarity in rare-variant genotypes across a gene. The approach can accommodate both binary and continuous phenotypes and further can adjust for covariates. Our approach yields a closed-form test whose significance can be evaluated analytically, thereby improving computational efficiency and permitting application on a genome-wide scale. We use simulated data to demonstrate that our method, which we refer to as the Gene Association with Multiple Traits (GAMuT) test, provides increased power over competing approaches. We also illustrate our approach using exome-chip data from the Genetic Epidemiology Network of Arteriopathy.

3.
Brown RP. Genetica 1997, 101(1): 67-74
Heterogeneous phenotypic correlations may be suggestive of underlying changes in genetic covariance among life-history, morphology, and behavioural traits, and their detection is therefore relevant to many biological studies. Two new statistical tests are proposed and their performances compared with existing methods. Of all tests considered, the existing approximate test of homogeneity of product-moment correlations provides the greatest power to detect heterogeneous correlations, when based on Hotelling's z*-transformation. The use of this transformation and test is recommended under conditions of bivariate normality. A new distribution-free randomisation test of homogeneity of Spearman's rank correlations is described and recommended for use when the bivariate samples are taken from populations with non-normal or unknown distributions. An alternative randomisation test of homogeneity of product-moment correlations is shown to be a useful compromise between the approximate tests and the randomisation tests on Spearman's rank correlations: it is not as sensitive to departures from normality as the approximate tests, but has greater power than the rank correlation test. An example is provided that shows how choice of test will have a considerable influence on the conclusions of a particular study. This revised version was published online in July 2006 with corrections to the Cover Date.
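
As an illustration of the approximate homogeneity test mentioned in the abstract above, here is a minimal Python sketch. It assumes the plain Fisher z-transformation rather than Hotelling's z* refinement that the abstract recommends, and the function names are hypothetical, so it should be read as a simplified stand-in and not as the author's procedure.

```python
import math


def fisher_z(r):
    """Fisher's z-transformation of a product-moment correlation."""
    return math.atanh(r)


def homogeneity_statistic(correlations, sample_sizes):
    """Approximate chi-squared statistic for H0: all correlations are equal.

    Each transformed correlation is weighted by n - 3, the inverse of its
    approximate variance. Returns (statistic, degrees_of_freedom).
    """
    weights = [n - 3 for n in sample_sizes]
    z_values = [fisher_z(r) for r in correlations]
    z_bar = sum(w * z for w, z in zip(weights, z_values)) / sum(weights)
    statistic = sum(w * (z - z_bar) ** 2 for w, z in zip(weights, z_values))
    return statistic, len(correlations) - 1
```

Under bivariate normality the statistic would be compared with a chi-squared distribution on the returned degrees of freedom; large values suggest heterogeneous correlations.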

4.
MOTIVATION: An important application of microarray experiments is to identify differentially expressed genes. Because microarray data are often not normally distributed, nonparametric methods have been suggested for their statistical analysis. Here, the Baumgartner-Weiss-Schindler test, a novel and powerful test based on ranks, is investigated and compared with the parametric t-test as well as with two other nonparametric tests (Wilcoxon rank sum test, Fisher-Pitman permutation test) recently recommended for the analysis of gene expression data. RESULTS: Simulation studies show that an exact permutation test based on the Baumgartner-Weiss-Schindler statistic B is preferable to the other three tests. It is less conservative than the Wilcoxon test and more powerful, in particular in the case of asymmetric or heavy-tailed distributions. When the underlying distribution is symmetric, the differences in power between the tests are relatively small. Thus, the Baumgartner-Weiss-Schindler test is recommended for the usual situation in which the underlying distribution is a priori unknown. AVAILABILITY: SAS code available on request from the authors.
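
To make the idea of an exact permutation test concrete, the following Python sketch implements a two-sided Fisher-Pitman permutation test on the difference in means, one of the comparison tests named in the abstract above. It does not implement the Baumgartner-Weiss-Schindler statistic B, and the function name is hypothetical. It enumerates every split of the pooled data, so it is only practical for small samples.

```python
from itertools import combinations


def fisher_pitman_p_value(x, y):
    """Exact two-sided permutation p-value for a difference in means."""
    pooled = list(x) + list(y)
    n_x = len(x)
    n_y = len(pooled) - n_x
    total = sum(pooled)

    def abs_mean_difference(sum_x):
        # Absolute difference between the two group means for a given split.
        return abs(sum_x / n_x - (total - sum_x) / n_y)

    observed = abs_mean_difference(sum(x))
    extreme = 0
    count = 0
    # Enumerate every way of assigning n_x of the pooled values to group x.
    for indices in combinations(range(len(pooled)), n_x):
        count += 1
        split_sum = sum(pooled[i] for i in indices)
        # Small tolerance guards against floating-point ties.
        if abs_mean_difference(split_sum) >= observed - 1e-12:
            extreme += 1
    return extreme / count
```

For two completely separated samples of size three, only 2 of the 20 possible splits are as extreme as the observed one, so the p-value is 0.1. This shows why exact tests on very small samples cannot reach small significance levels.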

5.
Large exploratory studies, including candidate-gene-association testing, genomewide linkage-disequilibrium scans, and array-expression experiments, are becoming increasingly common. A serious problem for such studies is that statistical power is compromised by the need to control the false-positive rate for a large family of tests. Because multiple true associations are anticipated, methods have been proposed that combine evidence from the most significant tests, as a more powerful alternative to individually adjusted tests. The practical application of these methods is currently limited by a reliance on permutation testing to account for the correlated nature of single-nucleotide polymorphism (SNP)-association data. On a genomewide scale, this is both very time-consuming and impractical for repeated explorations with standard marker panels. Here, we alleviate these problems by fitting analytic distributions to the empirical distribution of combined evidence. We fit extreme-value distributions for fixed lengths of combined evidence and a beta distribution for the most significant length. An initial phase of permutation sampling is required to fit these distributions, but it can be completed more quickly than a simple permutation test and need be done only once for each panel of tests, after which the fitted parameters give a reusable calibration of the panel. Our approach is also a more efficient alternative to a standard permutation test. We demonstrate the accuracy of our approach and compare its efficiency with that of permutation tests on genomewide SNP data released by the International HapMap Consortium. The estimation of analytic distributions for combined evidence will allow these powerful methods to be applied more widely in large exploratory studies.

6.
The observation that haplotypes from a particular region of the genome differ between affected and unaffected individuals or between chromosomes transmitted to affected individuals versus those not transmitted is sound evidence for a disease-liability mutation in the region. Tests for differentiation of haplotype distributions often take the form of either Pearson's χ² statistic or tests based on the similarity among haplotypes in the different populations. In this article, we show that many measures of haplotype similarity can be expressed in the same quadratic form, and we give the general form of the variance. As we describe, these methods can be applied to either phase-known or phase-unknown data. We investigate the performance of Pearson's χ² statistic and haplotype similarity tests through use of evolutionary simulations. We show that both approaches can be powerful, but under quite different conditions. Moreover, we show that the power of both approaches can be enhanced by clustering rare haplotypes from the distributions before performing a test.

7.
Taxon sampling, correlated evolution, and independent contrasts (total citations: 14; self-citations: 0; citations by others: 14)
Independent contrasts are widely used to incorporate phylogenetic information into studies of continuous traits, particularly analyses of evolutionary trait correlations, but the effects of taxon sampling on these analyses have received little attention. In this paper, simulations were used to investigate the effects of taxon sampling patterns and alternative branch length assignments on the statistical performance of correlation coefficients and sign tests; "full-tree" analyses based on contrasts at all nodes and "paired-comparisons" based only on contrasts of terminal taxon pairs were also compared. The simulations showed that random samples, with respect to the traits under consideration, provide statistically robust estimates of trait correlations. However, exact significance tests are highly dependent on appropriate branch length information; equal branch lengths maintain lower Type I error than alternative topological approaches, and adjusted critical values of the independent contrast correlation coefficient are provided for use with equal branch lengths. Nonrandom samples, with respect to univariate or bivariate trait distributions, introduce discrepancies between interspecific and phylogenetically structured analyses and bias estimates of underlying evolutionary correlations. Examples of nonrandom sampling processes may include community assembly processes, convergent evolution under local adaptive pressures, selection of a nonrandom sample of species from a habitat or life-history group, or investigator bias. Correlation analyses based on species pairs comparisons, while ignoring deeper relationships, entail significant loss of statistical power and as a result provide a conservative test of trait associations. Paired comparisons in which species differ by a large amount in one trait, a method introduced in comparative plant ecology, have appropriate Type I error rates and high statistical power, but do not correctly estimate the magnitude of trait correlations. Sign tests, based on full-tree or paired-comparison approaches, are highly reliable across a wide range of sampling scenarios, in terms of Type I error rates, but have very low power. These results provide guidance for selecting species and applying comparative methods to optimize the performance of statistical tests of trait associations.

8.
We studied several methods for selecting single-nucleotide polymorphisms (SNPs) in a disease association study. The two major categories of analytical strategy are the univariate and the set selection approaches. The univariate approach evaluates each SNP marker one at a time, while the set selection approach tests disease association of a set of SNP markers simultaneously. We examined various test statistics that can be utilized in testing disease association and also reviewed several multiple testing procedures that can properly control the family-wise error rates when the univariate approach is applied to multiple markers. The set association methods were then briefly reviewed. Finally, we applied these methods to the data from the Collaborative Study on the Genetics of Alcoholism (COGA).

9.
Complex disorders are typically characterized by multiple phenotypes. Analyzing these phenotypes jointly is expected to be more powerful than dealing with one of them at a time. A recent approach (O'Reilly et al. 2012) is to regress the genotype at a SNP marker on multiple phenotypes and apply the proportional odds model. In the current research, we introduce an explicit expression for the score test statistic and its non-centrality parameter that determines its power. The same simulation studies as those reported in Galesloot et al. (2014) were conducted to assess its performance. We demonstrate by theoretical arguments and simulation studies that, despite its potential usefulness for multiple phenotypes, the proportional odds model method can be less powerful than regular methods for univariate traits. We also introduce an implementation of the proposed score statistic in an R package named iGasso.

10.
OBJECTIVES: The association of a candidate gene with disease can be evaluated by a case-control study in which the genotype distribution is compared for diseased cases and unaffected controls. Usually, the data are analyzed with Armitage's test using the asymptotic null distribution of the test statistic. Since this test does not generally guarantee a type I error rate less than or equal to the significance level alpha, tests based on exact null distributions have been investigated. METHODS: An algorithm to generate the exact null distribution for both Armitage's test statistic and a recently proposed modification of the Baumgartner-Weiss-Schindler statistic is presented. I have compared the tests in a simulation study. RESULTS: The asymptotic Armitage test is slightly anticonservative whereas the exact tests control the type I error rate. The exact Armitage test is very conservative, but the exact test based on the modification of the Baumgartner-Weiss-Schindler statistic has a type I error rate close to alpha. The exact Armitage test is the least powerful test; the difference in power between the other two tests is often small and the comparison does not show a clear winner. CONCLUSION: Simulation results indicate that an exact test based on the modification of the Baumgartner-Weiss-Schindler statistic is preferable for the analysis of case-control studies of genetic markers.
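
For readers unfamiliar with Armitage's test, the sketch below computes the asymptotic (chi-squared, 1 degree of freedom) version from genotype counts in Python. The function name, the default additive scores (0, 1, 2) and the use of N rather than N - 1 in the variance are assumptions made for illustration. The exact null distribution described in the abstract above is not reproduced here.

```python
import math


def armitage_trend_test(cases, controls, scores=(0, 1, 2)):
    """Asymptotic Cochran-Armitage trend test.

    cases, controls: genotype counts per category (e.g. aa, Aa, AA).
    Returns (chi_squared, p_value) with 1 degree of freedom.
    """
    totals = [r + s for r, s in zip(cases, controls)]
    n = sum(totals)
    n_cases = sum(cases)
    n_controls = n - n_cases

    sum_t_r = sum(t * r for t, r in zip(scores, cases))
    sum_t_n = sum(t * m for t, m in zip(scores, totals))
    sum_t2_n = sum(t * t * m for t, m in zip(scores, totals))

    # Numerator: squared trend statistic; denominator: its null variance.
    numerator = n * (n * sum_t_r - n_cases * sum_t_n) ** 2
    denominator = n_cases * n_controls * (n * sum_t2_n - sum_t_n ** 2)
    chi_squared = numerator / denominator

    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_squared / 2))
    return chi_squared, p_value
```

With only two genotype categories the statistic reduces to Pearson's chi-squared for a 2x2 table, which gives a convenient sanity check.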

11.
It is important to detect population bottlenecks in threatened and managed species because bottlenecks can increase the risk of population extinction. Early detection is critical and can be facilitated by statistically powerful monitoring programs for detecting bottleneck-induced genetic change. We used Monte Carlo computer simulations to evaluate the power of the following tests for detecting genetic changes caused by a severe reduction in a population's effective size (N_e): a test for loss of heterozygosity, two tests for loss of alleles, two tests for change in the distribution of allele frequencies, and a test for small N_e based on variance in allele frequencies (the 'variance test'). The variance test was most powerful; it provided an 85% probability of detecting a bottleneck of size N_e = 10 when monitoring five microsatellite loci and sampling 30 individuals both before and one generation after the bottleneck. The variance test was almost 10 times more powerful than a commonly used test for loss of heterozygosity, and it allowed for detection of bottlenecks before 5% of a population's heterozygosity had been lost. The second most powerful tests were generally the tests for loss of alleles. However, these tests had reduced power for detecting genetic bottlenecks caused by skewed sex ratios. We provide guidelines for the number of loci and individuals needed to achieve high-power tests when monitoring via the variance test. We also illustrate how the variance test performs when monitoring loci that have widely different allele frequency distributions as observed in five wild populations of mountain sheep (Ovis canadensis).

12.
Association tests that pool minor alleles into a measure of burden at a locus have been proposed for case-control studies using sequence data containing rare variants. However, such pooling tests are not robust to the inclusion of neutral and protective variants, which can mask the association signal from risk variants. Early studies proposing pooling tests dismissed methods for locus-wide inference using nonnegative single-variant test statistics based on unrealistic comparisons. However, such methods are robust to the inclusion of neutral and protective variants and therefore may be more useful than previously appreciated. In fact, some recently proposed methods derived within different frameworks are equivalent to performing inference on weighted sums of squared single-variant score statistics. In this study, we compared two existing methods for locus-wide inference using nonnegative single-variant test statistics to two widely cited pooling tests under more realistic conditions. We established analytic results for a simple model with one rare risk and one rare neutral variant, which demonstrated that pooling tests were less powerful than even Bonferroni-corrected single-variant tests in most realistic situations. We also performed simulations using variants with realistic minor allele frequency and linkage disequilibrium spectra, disease models with multiple rare risk variants and extensive neutral variation, and varying rates of missing genotypes. In all scenarios considered, existing methods using nonnegative single-variant test statistics had power comparable to or greater than two widely cited pooling tests. Moreover, in disease models with only rare risk variants, an existing method based on the maximum single-variant Cochran-Armitage trend chi-square statistic in the locus had power comparable to or greater than another existing method closely related to some recently proposed methods. We conclude that efficient locus-wide inference using single-variant test statistics should be reconsidered as a useful framework for devising powerful association tests in sequence data with rare variants.

13.
Aulchenko YS, de Koning DJ, Haley C. Genetics 2007, 177(1): 577-585
For pedigree-based quantitative trait loci (QTL) association analysis, a range of methods utilizing within-family variation such as transmission-disequilibrium test (TDT)-based methods have been developed. In scenarios where stratification is not a concern, methods exploiting between-family variation in addition to within-family variation, such as the measured genotype (MG) approach, have greater power. Application of MG methods can be computationally demanding (especially for large pedigrees), making genomewide scans practically infeasible. Here we suggest a novel approach for genomewide pedigree-based quantitative trait loci (QTL) association analysis: genomewide rapid association using mixed model and regression (GRAMMAR). The method first obtains residuals adjusted for family effects and subsequently analyzes the association between these residuals and genetic polymorphisms using rapid least-squares methods. At the final step, the selected polymorphisms may be followed up with the full measured genotype (MG) analysis. In a simulation study, we compared type I error, power, and operational characteristics of the proposed method with those of MG and TDT-based approaches. For moderately heritable (30%) traits in human pedigrees the power of the GRAMMAR and the MG approaches is similar and is much higher than that of TDT-based approaches. When using tabulated thresholds, the proposed method is less powerful than MG for very high heritabilities and pedigrees including large sibships like those observed in livestock pedigrees. However, there is little or no difference in empirical power of MG and the proposed method. In any scenario, GRAMMAR is much faster than MG and enables rapid analysis of hundreds of thousands of markers.

14.
Open-pollinated progeny of Corymbia citriodora established in replicated field trials were assessed for stem diameter, wood density, and pulp yield prior to genotyping single nucleotide polymorphisms (SNP) and testing the significance of associations between markers and assessment traits. Multiple individuals within each family were genotyped and phenotyped, which facilitated a comparison of standard association testing methods and an alternative method developed to relate markers to additive genetic effects. Narrow-sense heritability estimates indicated there was significant additive genetic variance within this population for assessment traits ($\hat{h}^2 = 0.28$ to $0.44$) and genetic correlations between the three traits were negligible to moderate ($r_G = 0.08$ to $0.50$). The significance of association tests (p values) was compared for four different analyses based on two different approaches: (1) two software packages were used to fit standard univariate mixed models that include SNP-fixed effects, (2) bivariate and multivariate mixed models including each SNP as an additional selection trait were used. Within either the univariate or multivariate approach, correlations between the tests of significance approached +1; however, correspondence between the two approaches was less strong, although between-approach correlations remained significantly positive. Similar SNP markers would be selected using multivariate analyses and standard marker-trait association methods, where the former facilitates integration into the existing genetic analysis systems of applied breeding programs and may be used with either single markers or indices of markers created with genomic selection processes.

15.
To understand the role of human microbiota in health and disease, we need to study effects of environmental and other epidemiological variables on the composition of microbial communities. The composition of a microbial community may depend on multiple factors simultaneously. Therefore we need multivariate methods for detecting, analyzing and visualizing the interactions between environmental variables and microbial communities. We provide two different approaches for multivariate analysis of these complex combined datasets: (i) We select variables that correlate with overall microbiota composition and microbiota members that correlate with the metadata using canonical correlation analysis, determine independence of the observed correlations in a multivariate regression analysis, and visualize the effect size and direction of the observed correlations using heatmaps; (ii) We select variables and microbiota members using univariate or bivariate regression analysis, followed by multivariate regression analysis, and visualize the effect size and direction of the observed correlations using heatmaps. We illustrate the results of both approaches using a dataset containing respiratory microbiota composition and accompanying metadata. The two approaches provide slightly different results: approach (i), which uses canonical correlation analysis to select determinants and microbiota members, detects only fewer and stronger correlations, whereas approach (ii), which uses univariate or bivariate analyses for the selection, detects a similar but broader pattern of correlations. The proposed approaches both detect and visualize independent correlations between multiple environmental variables and members of the microbial community. Depending on the size of the datasets and the hypothesis tested one can select the method of preference.

16.
BACKGROUND: While several algorithms for the comparison of univariate distributions arising from flow cytometric analyses have been developed and studied for many years, algorithms for comparing multivariate distributions remain elusive. Such algorithms could be useful for comparing differences between samples based on several independent measurements, rather than differences based on any single measurement. It is conceivable that distributions could be completely distinct in multivariate space, but unresolvable in any combination of univariate histograms. Multivariate comparisons could also be useful for providing feedback about instrument stability, when only subtle changes in measurements are occurring. METHODS: We apply a variant of Probability Binning, described in the accompanying article, to multidimensional data. In this approach, hyper-rectangles of n dimensions (where n is the number of measurements being compared) comprise the bins used for the chi-squared statistic. These hyper-dimensional bins are constructed such that the control sample has the same number of events in each bin; the bins are then applied to the test samples for chi-squared calculations. RESULTS: Using a Monte-Carlo simulation, we determined the distribution of chi-squared values obtained by comparing sets of events from the same distribution; this distribution of chi-squared values was identical to that for the univariate algorithm. Hence, the same formulae can be used to construct a metric, analogous to a t-score, that estimates the probability with which distributions are distinct. As for univariate comparisons, this metric scales with the difference between two distributions, and can be used to rank samples according to similarity to a control. We apply the algorithm to multivariate immunophenotyping data, and demonstrate that it can be used to discriminate distinct samples and to rank samples according to a biologically-meaningful difference. CONCLUSION: Probability binning, as shown here, provides a useful metric for determining the probability with which two or more multivariate distributions represent distinct sets of data. The metric can be used to identify the similarity or dissimilarity of samples. Finally, as demonstrated in the accompanying paper, the algorithm can be used to gate on events in one sample that are different from a control sample, even if those events cannot be distinguished on the basis of any combination of univariate or bivariate displays. Published 2001 Wiley-Liss, Inc.
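
The binning idea in the abstract above can be sketched in Python as follows. This is an illustration under stated assumptions, not the published algorithm: the bins are built by recursive median splits of the control sample along the dimension with the largest spread, and the chi-squared-like statistic compares normalised bin frequencies using a form that the abstract does not spell out. All function names are hypothetical.

```python
from collections import Counter


def build_tree(points, depth):
    """Recursively split control events at the median so that the
    resulting hyper-rectangular bins hold roughly equal counts."""
    if depth == 0 or len(points) < 2:
        return None
    n_dims = len(points[0])

    def spread(d):
        values = [p[d] for p in points]
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    # Split along the dimension with the largest spread (an assumption).
    dim = max(range(n_dims), key=spread)
    ordered = sorted(points, key=lambda p: p[dim])
    mid = len(ordered) // 2
    cut = (ordered[mid - 1][dim] + ordered[mid][dim]) / 2
    return (dim, cut,
            build_tree(ordered[:mid], depth - 1),
            build_tree(ordered[mid:], depth - 1))


def leaf_of(tree, point):
    """Return a string identifying the bin that contains the point."""
    path = ""
    while tree is not None:
        dim, cut, low, high = tree
        if point[dim] < cut:
            path += "0"
            tree = low
        else:
            path += "1"
            tree = high
    return path


def binned_statistic(control, test, depth=3):
    """Chi-squared-like distance between control and test samples,
    computed on bins defined by the control sample only."""
    tree = build_tree(control, depth)
    control_counts = Counter(leaf_of(tree, p) for p in control)
    test_counts = Counter(leaf_of(tree, p) for p in test)
    statistic = 0.0
    for leaf in set(control_counts) | set(test_counts):
        pc = control_counts[leaf] / len(control)
        ps = test_counts[leaf] / len(test)
        statistic += (pc - ps) ** 2 / (pc + ps)
    return statistic
```

A sample compared with itself gives a statistic of zero, and the value grows as the test sample moves away from the control, which is the ranking behaviour the abstract describes.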

17.
The Haseman and Elston (H-E) method uses a simple linear regression to model the squared trait difference of sib pairs with the shared allele identical by descent (IBD) at marker locus for linkage testing. Under this setting, the squared mean-corrected trait sum is also linearly related to the IBD sharing. However, the resulting slope estimate for either model is not efficient. In this report, we propose a simple linkage test that optimally uses information from the estimates of both models. We also demonstrate that the new test is more powerful than both the traditional one and the recently revisited H-E methods.
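
The traditional H-E regression described in the abstract above can be sketched in a few lines of Python: the squared trait difference of each sib pair is regressed on the proportion of alleles shared IBD by ordinary least squares. The function name and input layout are assumptions for illustration, and the combined test proposed in the abstract is not implemented.

```python
def haseman_elston_slope(sib_pairs, ibd_sharing):
    """Least-squares fit of squared sib-pair trait differences on IBD sharing.

    sib_pairs: list of (trait_sib1, trait_sib2) tuples.
    ibd_sharing: proportion of alleles shared IBD per pair (0, 0.5 or 1).
    Returns (slope, intercept).
    """
    y = [(a - b) ** 2 for a, b in sib_pairs]
    x = list(ibd_sharing)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

A clearly negative slope, meaning that pairs sharing more alleles IBD have more similar trait values, is the pattern that suggests linkage. A formal test would also need the standard error of the slope.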

18.
Wu CC, Amos CI. Human Heredity 2003, 55(4): 153-162
Genetic linkage analysis is a powerful tool for the identification of disease susceptibility loci. Among the most commonly applied genetic linkage strategies are affected sib-pair tests, but the statistical properties of these tests have not been well characterized. Here, we present a study of the distribution of affected sib-pair tests comparing the type I error rate and the power of the mean test and the proportion test, which are the most commonly used, along with a novel exact test. In contrast to existing literature, our findings showed that the mean and proportion tests have inflated type I error rates, especially when used with small samples. We developed and applied corrections to the tests which provide an excellent adjustment to the type I error rate for both small and large samples. We also developed a novel approach to identify the areas of higher power for the mean test versus the proportion test, providing a wider and simpler comparison with fewer assumptions about parameter values than existing approaches require.

19.
Methods for multivariate meta-analysis of genetic association studies are reviewed, summarized and presented in a unified framework. Modifications of standard models are described in detail in order to be applied in genetic association studies. The model based on summary data is uniformly defined for both discrete and continuous outcomes and analytical expressions for the covariance of the two jointly modeled outcomes are derived for both cases. The models based on the binary nature of the data are fitted using both prospective and retrospective likelihood. Furthermore, formal tests for assessing the genetic model of inheritance are developed based on standard normal theory. The general model is compared to the recently proposed genetic model-free bivariate approach (either using summary or binary data), and it is clearly shown that the estimates provided by this approach are nearly identical to the estimates derived by the general bivariate model using the aforementioned tests for the genetic model. The methods developed here, as well as the tests, are easily implemented in all major statistical packages, avoiding the need for self-written software. The methods are applied in several already published meta-analyses of genetic association studies (with both discrete and continuous outcomes) and the results are compared against the widely used univariate approach as well as against the genetic model-free approaches. Illustrative examples of code in Stata are given in the appendix. It is anticipated that the methods developed in this work will be widely applied in the meta-analysis of genetic association studies.

20.
The paired-t, sign, and signed rank tests were compared for samples from a bivariate exponential distribution. Each is a valid α-level test. No single test was uniformly more powerful than the others for all sample sizes, α levels, correlations, and alternative hypotheses considered, but the signed rank test did well consistently. It was always preferable to the sign test and never was appreciably worse than the paired-t test. The relative performance of the tests depends on α as well as the sample size.
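
Of the three tests compared in the abstract above, the sign test is the simplest to write down. The Python sketch below computes its exact two-sided p-value from the binomial distribution. The function name is hypothetical, and dropping zero differences is one common convention that is assumed here.

```python
from math import comb


def sign_test_p_value(differences):
    """Exact two-sided sign test for paired differences.

    Zero differences are dropped; under H0 each remaining difference
    is positive with probability 1/2.
    """
    nonzero = [d for d in differences if d != 0]
    n = len(nonzero)
    positives = sum(1 for d in nonzero if d > 0)
    k = max(positives, n - positives)
    # Probability of k or more signs in the majority direction.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

With five positive differences out of six, the p-value is 14/64 = 0.21875. Because the sign test uses only the direction of each difference and ignores its size, a loss of power relative to the signed rank test, as reported in the abstract, is to be expected.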
