Similar articles
20 similar articles found (search time: 31 ms)
1.
Complex human diseases do not have a clear inheritance pattern, and it is expected that risk involves multiple genes with modest effects acting independently or interacting. Major challenges for the identification of genetic effects are genetic heterogeneity and difficulty in analyzing high-order interactions. To address these challenges, we present MDR-Phenomics, a novel approach based on the multifactor dimensionality reduction (MDR) method, to detect genetic effects in pedigree data by integration of phenotypic covariates (PCs) that may reflect genetic heterogeneity. The P value of the test is calculated using a permutation test adjusted for multiple tests. To validate MDR-Phenomics, we compared it with two MDR-based methods: (1) traditional MDR pedigree disequilibrium test (PDT) without consideration of PCs (MDR-PDT) and (2) stratified phenotype (SP) analysis based on PCs, with use of MDR-PDT with a Bonferroni adjustment (SP-MDR). Using computer simulations, we examined the statistical power and type I error of the different approaches under several genetic models and sampling scenarios. We conclude that MDR-Phenomics is more powerful than MDR-PDT and SP-MDR when there is genetic heterogeneity, and the statistical power is affected by sample size and the number of PC levels. We further compared MDR-Phenomics with conditional logistic regression (CLR) for testing interactions across single or multiple loci with consideration of PC. The results show that CLR with PC has only slightly smaller power than does MDR-Phenomics for single-locus analysis but has considerably smaller power for multiple loci. Finally, by applying MDR-Phenomics to autism, a complex disease in which multiple genes are believed to confer risk, we attempted to identify multiple gene effects in two candidate genes of interest—the serotonin transporter gene (SLC6A4) and the integrin beta 3 gene (ITGB3) on chromosome 17. 
Analyzing four markers in SLC6A4 and four markers in ITGB3 in 117 white family triads with autism and using sex of the proband as a PC, we found significant interaction between two markers: rs1042173 in SLC6A4 and rs3809865 in ITGB3.

2.
Multifactor dimensionality reduction (MDR) is a model-free approach that can identify gene x gene or gene x environment effects in a case-control study. Here we explore several modifications of the MDR method. We extended MDR to provide model selection without cross-validation, and use a chi-square statistic as an alternative to prediction error (PE). We also modified the permutation test to provide different levels of stringency. The extended MDR (EMDR) includes three permutation tests (fixed, non-fixed, and omnibus) to obtain p-values of multilocus models. The goal of this study was to compare the different approaches implemented in the EMDR method and evaluate the ability to identify genetic effects in the Genetic Analysis Workshop 14 simulated data. We used three replicates from the simulated family data, generating matched pairs from family triads. The results showed: 1) chi-square and PE statistics give nearly consistent results; 2) results of EMDR without cross-validation matched those of EMDR with 10-fold cross-validation; 3) the fixed permutation test reports false-positive results in data from loci unrelated to the disease, but the non-fixed and omnibus permutation tests perform well in preventing false positives, with the omnibus test being the most conservative. We conclude that the non-cross-validation test can provide accurate results with the advantage of high efficiency compared to 10-fold cross-validation, and the non-fixed permutation test provides a good compromise between power and false-positive rate.

3.
Genetic isolation among populations can be effectively investigated by multilocus DNA fingerprinting. If populations have diverged, it is expected that the mean proportion of bands shared by individuals from the same population, Bw, exceeds the corresponding mean, Bb, calculated from pairs of individuals from distinct populations. A problem arises in deciding whether any difference between Bw and Bb is statistically significant. In fact, any two band-sharing data (bij), contributing to Bw or Bb, are not independent if they share a common individual (like bij and bjl). This prevents a correct application of parametric tests, such as the Student's t-test. Recently, a modification of this test has been proposed that should avoid the independence problem. Using a large number of samples of fingerprints, simulated from an appropriate 'genetic' model, under a wide range of conditions, we compared the performances of the Student's t-test, the modified t-test and five new permutation tests, where individuals, rather than bij values, are permuted. We found that: (i) the Student's t-test can be very permissive, rejecting the null hypothesis too often when it is true, but is correct or conservative in certain cases; (ii) the modified t-test is extremely conservative when the null hypothesis is true and very inefficient otherwise; (iii) all five permutation tests are strictly correct, provided that individuals are ordered randomly on gels; and (iv) in this case, the permutation tests are equally efficient, and are not inferior to the Student's t-test when the latter is approximately correct and provides a fair benchmark.
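The idea of permuting individuals rather than bij values can be sketched as below. This is a minimal illustration, not a reconstruction of any of the five tests in the paper: the statistic Bw - Bb, the one-sided comparison, the function name, and the (count + 1)/(m + 1) p-value are all assumptions. It needs at least two populations and at least one within-population pair.

```python
import random
from itertools import combinations

def band_sharing_permutation_test(b, labels, m=999, seed=0):
    """b[i][j]: band-sharing proportion between individuals i and j (symmetric).
    labels[i]: population of individual i.
    Shuffles the labels of individuals, so every b value keeps its pair of individuals."""
    rng = random.Random(seed)
    pairs = list(combinations(range(len(labels)), 2))

    def bw_minus_bb(lab):
        within = [b[i][j] for i, j in pairs if lab[i] == lab[j]]
        between = [b[i][j] for i, j in pairs if lab[i] != lab[j]]
        return sum(within) / len(within) - sum(between) / len(between)

    observed = bw_minus_bb(labels)
    shuffled = list(labels)
    extreme = 0
    for _ in range(m):
        rng.shuffle(shuffled)  # reassign individuals to populations
        if bw_minus_bb(shuffled) >= observed:
            extreme += 1
    # Assumed p-value convention: count the observed arrangement as one draw.
    return observed, (extreme + 1) / (m + 1)
```

Because whole individuals are relabelled, the dependence between bij values that share an individual is preserved in every permuted data set.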

4.
The power of the Mantel-Haenszel test for no treatment effect in the case of binary exposure and response variates was examined through simulation studies when subclasses were formed on the basis of the true and estimated propensity scores and by direct stratification on two continuous covariates. The power of these tests was also compared to the score test in a misspecified logistic regression model. In general, adjustment by the true propensity score was most likely to reject a false null hypothesis; the score test was more likely to reject a false null hypothesis than the Mantel-Haenszel test when adjustment is by the estimated propensity score or subclassification on the covariates. There was little difference in the observed powers of the Mantel-Haenszel tests between adjustment by the estimated propensity score and subclassification on the covariates.

5.
Lee OE, Braun TM. Biometrics 2012, 68(2): 486-493
Inference regarding the inclusion or exclusion of random effects in linear mixed models is challenging because the variance components are located on the boundary of their parameter space under the usual null hypothesis. As a result, the asymptotic null distribution of the Wald, score, and likelihood ratio tests will not have the typical χ² distribution. Although it has been proved that the correct asymptotic distribution is a mixture of χ² distributions, the appropriate mixture distribution is rather cumbersome and nonintuitive when the null and alternative hypotheses differ by more than one random effect. As alternatives, we present two permutation tests, one that is based on the best linear unbiased predictors and one that is based on the restricted likelihood ratio test statistic. Both methods involve weighted residuals, with the weights determined by the among- and within-subject variance components. The null permutation distributions of our statistics are computed by permuting the residuals both within and among subjects and are valid both asymptotically and in small samples. We examine the size and power of our tests via simulation under a variety of settings and apply our test to a published data set of chronic myelogenous leukemia patients.

6.
One of the greatest challenges facing human geneticists is the identification and characterization of susceptibility genes for common complex multifactorial human diseases. This challenge is partly due to the limitations of parametric-statistical methods for detection of gene effects that are dependent solely or partially on interactions with other genes and with environmental exposures. We introduce multifactor-dimensionality reduction (MDR) as a method for reducing the dimensionality of multilocus information, to improve the identification of polymorphism combinations associated with disease risk. The MDR method is nonparametric (i.e., no hypothesis about the value of a statistical parameter is made), is model-free (i.e., it assumes no particular inheritance model), and is directly applicable to case-control and discordant-sib-pair studies. Using simulated case-control data, we demonstrate that MDR has reasonable power to identify interactions among two or more loci in relatively small samples. When it was applied to a sporadic breast cancer case-control data set, in the absence of any statistically significant independent main effects, MDR identified a statistically significant high-order interaction among four polymorphisms from three different estrogen-metabolism genes. To our knowledge, this is the first report of a four-locus interaction associated with a common complex multifactorial disease.

7.
Multilocus analysis of hypertension: a hierarchical approach   (Total citations: 11; self-citations: 0; citations by others: 11)
While hypertension is a complex disease with a well-documented genetic component, genetic studies often fail to replicate findings. One possibility for such inconsistency is that the underlying genetics of hypertension is not based on single genes of major effect, but on interactions among genes. To test this hypothesis, we studied both single locus and multilocus effects, using a case-control design of subjects from Ghana. Thirteen polymorphisms in eight candidate genes were studied. Each candidate gene has been shown to play a physiological role in blood pressure regulation and affects one of four pathways that modulate blood pressure: vasoconstriction (angiotensinogen, angiotensin converting enzyme - ACE, angiotensin II receptor), nitric oxide (NO) dependent and NO independent vasodilation pathways and sodium balance (G protein-coupled receptor kinase, GRK4). We evaluated single site allelic and genotypic associations, multilocus genotype equilibrium and multilocus genotype associations, using multifactor dimensionality reduction (MDR). For MDR, we performed systematic reanalysis of the data to address the role of various physiological pathways. We found no significant single site associations, but the hypertensive class deviated significantly from genotype equilibrium in more than 25% of all multilocus comparisons (2,162 of 8,178), whereas the normotensive class rarely did (11 of 8,178). The MDR analysis identified a two-locus model including ACE and GRK4 that successfully predicted blood pressure phenotype 70.5% of the time. Thus, our data indicate epistatic interactions play a major role in hypertension susceptibility. Our data also support a model where multiple pathways need to be affected in order to predispose to hypertension.

8.
MOTIVATION: The identification and characterization of susceptibility genes that influence the risk of common and complex diseases remains a statistical and computational challenge in genetic association studies. This is partly because the effect of any single genetic variant for a common and complex disease may be dependent on other genetic variants (gene-gene interaction) and environmental factors (gene-environment interaction). To address this problem, the multifactor dimensionality reduction (MDR) method has been proposed by Ritchie et al. to detect gene-gene interactions or gene-environment interactions. The MDR method identifies polymorphism combinations associated with the common and complex multifactorial diseases by collapsing high-dimensional genetic factors into a single dimension. That is, the MDR method classifies the combination of multilocus genotypes into high-risk and low-risk groups based on a comparison of the ratios of the numbers of cases and controls. When a high-order interaction model is considered with multi-dimensional factors, however, there may be many sparse or empty cells in the contingency tables. The MDR method cannot classify an empty cell as high risk or low risk and leaves it as undetermined. RESULTS: In this article, we propose the log-linear model-based multifactor dimensionality reduction (LM MDR) method to improve the MDR in classifying sparse or empty cells. The LM MDR method estimates frequencies for empty cells from a parsimonious log-linear model so that they can be assigned to high- and low-risk groups. In addition, LM MDR includes MDR as a special case when the saturated log-linear model is fitted. Simulation studies show that the LM MDR method has greater power and smaller error rates than the MDR method. The LM MDR method is also compared with the MDR method using as an example sporadic Alzheimer's disease.
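The MDR classification step described in this abstract, including the empty-cell problem that motivates LM MDR, can be sketched as follows. This is a minimal illustration under stated assumptions: genotypes are coded 0/1/2, the cut-off defaults to the overall case:control ratio, ties count as high risk, and the function name is hypothetical. It does not implement the log-linear estimation proposed in the paper.

```python
from collections import Counter
from itertools import product

def mdr_classify(genotypes, status, threshold=None):
    """Label each multilocus genotype cell as 'high', 'low', or 'undetermined'.

    genotypes: list of tuples, one genotype code (0, 1, 2) per locus per subject
    status:    list of 1 (case) / 0 (control), same length as genotypes
    threshold: case:control ratio cut-off; defaults to the overall ratio (an assumption)
    """
    cases = Counter(g for g, s in zip(genotypes, status) if s == 1)
    controls = Counter(g for g, s in zip(genotypes, status) if s == 0)
    if threshold is None:
        threshold = sum(cases.values()) / sum(controls.values())
    n_loci = len(genotypes[0])
    labels = {}
    for cell in product((0, 1, 2), repeat=n_loci):
        n_case, n_ctrl = cases[cell], controls[cell]
        if n_case + n_ctrl == 0:
            labels[cell] = "undetermined"  # empty cell: nothing to compare
        elif n_case >= threshold * n_ctrl:  # ratio test written without division
            labels[cell] = "high"
        else:
            labels[cell] = "low"
    return labels
```

With two loci there are already 9 cells and with four loci 81, which is why sparse and empty cells become common as the interaction order grows.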

9.
MOTIVATION: The identification and characterization of genes that increase the susceptibility to common complex multifactorial diseases is a challenging task in genetic association studies. The multifactor dimensionality reduction (MDR) method has been proposed and implemented by Ritchie et al. (2001) to identify the combinations of multilocus genotypes and discrete environmental factors that are associated with a particular disease. However, the original MDR method classifies the combination of multilocus genotypes into high-risk and low-risk groups in an ad hoc manner based on a simple comparison of the ratios of the number of cases and controls. Hence, the MDR approach is prone to false positive and negative errors when the ratio of the number of cases and controls in a combination of genotypes is similar to that in the entire data, or when the numbers of cases and controls are both small. We therefore propose the odds ratio based multifactor dimensionality reduction (OR MDR) method that uses the odds ratio as a new quantitative measure of disease risk. RESULTS: While the original MDR method provides a simple binary measure of risk, the OR MDR method provides not only the odds ratio as a quantitative measure of risk but also the ordering of the multilocus combinations from the highest risk to lowest risk groups. Furthermore, the OR MDR method provides a confidence interval for the odds ratio for each multilocus combination, which is extremely informative in judging its importance as a risk factor. The proposed OR MDR method is illustrated using the dataset obtained from the CDC Chronic Fatigue Syndrome Research Group. AVAILABILITY: The program written in R is available.
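A per-cell odds ratio with a confidence interval, as this abstract describes, might be computed along these lines. The abstract does not state the estimator, so this sketch assumes a standard Wald interval on the log scale, a comparison of one genotype cell against all other cells, and a 0.5 correction added to every count; the function name is hypothetical.

```python
import math

def cell_odds_ratio(case_in, ctrl_in, case_out, ctrl_out, z=1.96):
    """Odds ratio and approximate 95% Wald interval for one genotype cell.

    case_in / ctrl_in:   cases and controls carrying the genotype combination
    case_out / ctrl_out: cases and controls in all other cells
    Adds 0.5 to every count (an assumed correction) so zero counts stay finite.
    """
    a, b, c, d = (n + 0.5 for n in (case_in, ctrl_in, case_out, ctrl_out))
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper
```

Sorting cells by the returned odds ratio gives a risk ordering, and an interval that spans 1 flags a cell whose high-risk or low-risk label is weakly supported.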

10.
The computer program identix estimates relatedness in natural populations using multilocus genotypic data. Queller & Goodnight's (1989) and Lynch & Ritland's (1999) estimators of pairwise relatedness are implemented, as well as the identity index of Mathieu et al. (1990). Estimates of the confidence intervals around these pairwise values are also provided. The null hypothesis of no relatedness (multilocus genotypes are independent draws from a panmictic population) is tested using a permutation method that compares the observed distribution of the moments of pairwise relatedness coefficients to that expected in an unstructured population.

11.
Permutation tests are amongst the most commonly used statistical tools in modern genomic research, a process by which p-values are attached to a test statistic by randomly permuting the sample or gene labels. Yet permutation p-values published in the genomic literature are often computed incorrectly, understated by about 1/m, where m is the number of permutations. The same is often true in the more general situation when Monte Carlo simulation is used to assign p-values. Although the p-value understatement is usually small in absolute terms, the implications can be serious in a multiple testing context. The understatement arises from the intuitive but mistaken idea of using permutation to estimate the tail probability of the test statistic. We argue instead that permutation should be viewed as generating an exact discrete null distribution. The relevant literature, some of which is likely to have been relatively inaccessible to the genomic community, is reviewed and summarized. A computation strategy is developed for exact p-values when permutations are randomly drawn. The strategy is valid for any number of permutations and samples. Some simple recommendations are made for the implementation of permutation tests in practice.
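The understatement described in this abstract can be illustrated by comparing the naive estimate b/m with the widely used (b + 1)/(m + 1), where b is the number of randomly drawn permutations whose statistic is at least as extreme as the observed one. The sketch below is an illustration only: the difference-in-means statistic, the function name, and the toy inputs are assumptions, and it does not implement the exact computation strategy developed in the paper.

```python
import random

def permutation_p_values(x, y, m=999, seed=0):
    """Two-sample permutation test on the absolute difference in means.

    Returns (naive, corrected), where naive = b / m and
    corrected = (b + 1) / (m + 1); b counts permuted statistics
    that are >= the observed statistic.
    """
    rng = random.Random(seed)

    def stat(g1, g2):
        return abs(sum(g1) / len(g1) - sum(g2) / len(g2))

    observed = stat(x, y)
    pooled = list(x) + list(y)
    extreme = 0
    for _ in range(m):
        rng.shuffle(pooled)  # randomly reassign group labels
        if stat(pooled[:len(x)], pooled[len(x):]) >= observed:
            extreme += 1
    return extreme / m, (extreme + 1) / (m + 1)
```

The naive form can return exactly 0, which is never a valid p-value; the corrected form is bounded below by 1/(m + 1), and that floor matters once thousands of tests are adjusted for multiplicity.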

12.
A key hypothesis in population ecology is that synchronous and intermittent seed production, known as mast seeding, is driven by the alternating allocation of carbohydrates and mineral nutrients between growth and reproduction in different years, i.e. 'resource switching'. Such behaviour may ultimately generate bimodal distributions of long-term flower and seed production, and evidence of these patterns has been taken to support the resource switching hypothesis. Here, we show how a widely-used statistical test of bimodality applied by many studies in different ecological contexts may fail to reject the null hypothesis that focal probability distributions are unimodal. Using data from five tussock grass species in South Island, New Zealand, we find clear evidence of bimodality only when flowering patterns are analyzed with probabilistic mixture models. Mixture models provide a theory-oriented framework for testing hypotheses of mast seeding patterns, enabling the different responses underlying medium- and high- versus non- and low-flowering years to be modelled more realistically by associating these with distinct probability distributions. Coupling theoretical expectations with more rigorous statistical approaches will empower ecologists to reject null hypotheses more often.

13.
Epigenetic research leads to complex data structures. Since parametric model assumptions for the distribution of epigenetic data are hard to verify, we introduce in the present work a nonparametric statistical framework for two-group comparisons. Furthermore, epigenetic analyses are often performed at various genetic loci simultaneously. Hence, in order to be able to draw valid conclusions for specific loci, an appropriate multiple testing correction is necessary. Finally, with technologies available for the simultaneous assessment of many interrelated biological parameters (such as gene arrays), statistical approaches also need to deal with a possibly unknown dependency structure in the data. Our statistical approach to the nonparametric comparison of two samples with independent multivariate observables is based on recently developed multivariate multiple permutation tests. We adapt their theory in order to cope with families of hypotheses regarding relative effects. Our results indicate that the multivariate multiple permutation test keeps the pre-assigned type I error level for the global null hypothesis. In combination with the closure principle, the family-wise error rate for the simultaneous test of the corresponding locus/parameter-specific null hypotheses can be controlled. In applications we demonstrate that group differences in epigenetic data can be detected reliably with our methodology.

14.
Epistasis or gene-gene interaction is a fundamental component of the genetic architecture of complex traits such as disease susceptibility. Multifactor dimensionality reduction (MDR) was developed as a nonparametric and model-free method to detect epistasis when there are no significant marginal genetic effects. However, in many studies of complex disease, other covariates like age of onset and smoking status could have a strong main effect and may potentially interfere with MDR's ability to achieve its goal. In this paper, we present a simple and computationally efficient sampling method to adjust for covariate effects in MDR. We use simulation to show that after adjustment, MDR has sufficient power to detect true gene-gene interactions. We also compare our method with the state-of-the-art technique in covariate adjustment. The results suggest that our proposed method performs similarly, but is more computationally efficient. We then apply this new method to an analysis of a population-based bladder cancer study in New Hampshire.

15.
SUMMARY: LIAN is a program to test the null hypothesis of linkage equilibrium for multilocus data. LIAN incorporates both a Monte Carlo method as well as a novel algebraic method to carry out the hypothesis test. The program further returns the genetic diversity of the sample and the pairwise distances between its members.

16.
The evidence for dispersal activity among soil-living invertebrates comes mainly from observations of their movement on artificial substrates or of colonisation of defaunated soils in the field. In an attempt to elucidate the dispersal pattern of soil collembolans in the presence of conspecifics, statistical analyses were undertaken to describe and simulate the movement of groups of Onychiurus armatus released in trays of homogeneous soil. A χ² test was used to reject the null hypothesis that individuals moved independently of each other and uniformly in all directions. The mean radial distance moved (1-2 cm day⁻¹) and the radial standard deviation varied temporally and with the density of conspecifics. To capture the interaction between the moving individuals, four dispersal models (pure diffusion, diffusion with drift interaction, drift interaction and synchronised diffusion, and drift interaction and behavioural mood) were formulated as stochastic differential equations. The parameters of the models were estimated by minimising the deviance between the observed replicates and replicates that were simulated using the models. The dynamics of movement were best described by modelling the drift interaction as dependent on whether individuals were in a social or an asocial mood.

17.
Observed variations in rates of taxonomic diversification have been attributed to a range of factors including biological innovations, ecosystem restructuring, and environmental changes. Before inferring causality of any particular factor, however, it is critical to demonstrate that the observed variation in diversity is significantly greater than that expected from natural stochastic processes. Relative tests that assess whether observed asymmetry in species richness between sister taxa in monophyletic pairs is greater than would be expected under a symmetric model have been used widely in studies of rate heterogeneity and are particularly useful for groups in which paleontological data are problematic. Although one such test introduced by Slowinski and Guyer a decade ago has been applied to a wide range of clades and evolutionary questions, the statistical behavior of the test has not been examined extensively, particularly when used with Fisher's procedure for combining probabilities to analyze data from multiple independent taxon pairs. Here, certain pragmatic difficulties with the Slowinski-Guyer test are described, further details of the development of a recently introduced likelihood-based relative rates test are presented, and standard simulation procedures are used to assess the behavior of the two tests in a range of situations to determine: (1) the accuracy of the tests' nominal Type I error rate; (2) the statistical power of the tests; (3) the sensitivity of the tests to inclusion of taxon pairs with few species; (4) the behavior of the tests with datasets comprised of few taxon pairs; and (5) the sensitivity of the tests to certain violations of the null model assumptions. 
Our results indicate that in most biologically plausible scenarios, the likelihood-based test has superior statistical properties in terms of both Type I error rate and power, and we found no scenario in which the Slowinski-Guyer test was distinctly superior, although the degree of the discrepancy varies among the different scenarios. The Slowinski-Guyer test tends to be much more conservative (i.e., very disinclined to reject the null hypothesis) in datasets with many small pairs. In most situations, the performance of both the likelihood-based test and particularly the Slowinski-Guyer test improves when pairs with few species are excluded from the computation, although this is balanced against a decline in the tests' power and accuracy as fewer pairs are included in the dataset. The performance of both tests is quite poor when they are applied to datasets in which the taxon sizes do not conform to the distribution implied by the usual null model. Thus, results of analyses of taxonomic rate heterogeneity using the Slowinski-Guyer test can be misleading because the test's ability to reject the null hypothesis (equal rates) when true is often inaccurate and its ability to reject the null hypothesis when the alternative (unequal rates) is true is poor, particularly when small taxon pairs are included. Although not always perfect, the likelihood-based test provides a more accurate and powerful alternative as a relative rates test.
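For readers unfamiliar with the procedure being evaluated, the sketch below shows the Slowinski-Guyer calculation in its commonly cited one-tailed form, combined across sister pairs with Fisher's method. It assumes that under equal rates every split of N species into two sister clades is equally likely, so the probability that the focal clade holds at least l species is (N - l)/(N - 1). The function names are hypothetical, and the likelihood-based test from the abstract is not shown.

```python
import math

def slowinski_guyer_p(n_focal, n_sister):
    """One-tailed P that the focal clade has at least n_focal of the N species,
    assuming splits (1, N-1), (2, N-2), ..., (N-1, 1) are equally likely."""
    total = n_focal + n_sister
    return (total - n_focal) / (total - 1)

def fisher_combined_p(p_values):
    """Fisher's method: -2 * sum(ln p) is chi-square with 2k df under the null.
    For even df the upper tail is exp(-x/2) * sum((x/2)**i / i!, i < k)."""
    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)
    return math.exp(-half_x) * sum(half_x ** i / math.factorial(i) for i in range(k))
```

A pair with a 2:1 split can never give P below 0.5 under this formula, which is consistent with the abstract's remark that the test becomes very conservative when many small pairs are included.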

18.
Cook AJ, Li Y. Biometrics 2008, 64(4): 1289-1292
Summary. This short note evaluates the assumptions required for a permutation test to approximate the null distribution of the spatial scan statistic for censored outcomes proposed in Cook et al. (2007). In particular, we study the exchangeability conditions required for such a test under survival models. A simulation study is further performed to assess the impact on the type I error when the global exchangeability assumption is violated and to determine whether the permutation test still well approximates the null distribution.

19.
MOTIVATION: Several authors have studied expression in gene sets with specific goals: overrepresentation of interesting genes in functional groups, predictive power for class membership and searches for groups where the constituent genes show coordinated changes in expression under the experimental conditions. The purpose of this article is to follow the third direction. One important aspect is that the gene sets under analysis are known a priori and are not determined from the experimental data at hand. Our goal is to provide a methodology that helps to identify the relevant structural constituents (phenotypical, experimental design, biological component) that determine gene expression in a group. RESULTS: Gene-wise linear models are used to formalize the structural aspects of a study. The full model is contrasted with a reduced model that lacks the relevant design component. A comparison with respect to goodness of fit is made and quantified. An asymptotic test and a permutation test are derived to test the null hypothesis that the reduced model sufficiently explains the observed expression within the gene group of interest. Graphical tools are available to illustrate and interpret the results of the analysis. Examples demonstrate the wide range of application. AVAILABILITY: The R-package GlobalAncova (http://www.bioconductor.org) offers data and functions as well as a vignette to guide the user through specific analysis steps.

20.
Oh C, Wang S, Liu N, Chen L, Zhao H. BMC Genetics 2005, 6(Z1): S116
Common human disorders, such as alcoholism, may be the result of interactions of many genes as well as environmental risk factors. Therefore, it is important to incorporate gene x gene and gene x environment interactions in complex disease gene mapping. In this study, we applied a robust Bayesian genome screening method that can incorporate interaction effects to map genes underlying alcoholism through its application to the data of the Collaborative Studies on Genetics of Alcoholism provided by Genetic Analysis Workshop 14. Our Bayesian genome screening method uses the regression-based stochastic variable selection, coupled with the new Haseman-Elston method to identify markers linked to phenotypes of interest. Compared to traditional linkage methods based on single-gene disease models, our method allows for multilocus disease models for simultaneous screening including both main and interaction (epistatic) effects. It is conceptually simple and computationally efficient through the use of the Gibbs sampler. We conducted genome-wide analysis and comparison between scans based on microsatellites and single-nucleotide polymorphisms. A total of 328 microsatellites and 11,560 single-nucleotide polymorphisms (by Affymetrix) on 22 autosomal chromosomes and sex chromosome were used.
