Similar Literature
Found 20 similar documents (search time: 15 ms)
1.
The one-degree-of-freedom Cochran-Armitage (CA) test statistic for linear trend has been widely applied in various dose-response studies (e.g., anti-ulcer medications and short-term antibiotics, animal carcinogenicity bioassays, and occupational toxicant studies). This approximate statistic relies, however, on asymptotic theory that is reliable only when the sample sizes are reasonably large and well balanced across dose levels. For small, sparse, or skewed data, the asymptotic theory is suspect, and the exact conditional method (based on the CA statistic) seems to provide a dependable alternative. Unfortunately, the exact conditional method is only practical for the linear logistic model, from which the sufficient statistics for the regression coefficients can be obtained explicitly. In this article, a simple and efficient recursive polynomial multiplication algorithm is derived for an exact unconditional test (based on the CA statistic) for detecting a linear trend in proportions. The method is applicable to all choices of model with a monotone trend, including the logistic, probit, arcsine, extreme-value, and one-hit models. We also show that this algorithm can easily be extended to exact unconditional power calculations for studies with up to a moderately large sample size. A real example illustrates the applicability of the proposed method.
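For orientation, here is a minimal sketch of the classical asymptotic CA trend statistic that the article's exact unconditional algorithm builds on; the recursive polynomial multiplication itself is not reproduced, and the dose scores and counts below are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def cochran_armitage(events, totals, scores):
    """Asymptotic Cochran-Armitage test for linear trend in proportions."""
    events, totals, scores = map(np.asarray, (events, totals, scores))
    N, R = totals.sum(), events.sum()
    p_bar = R / N
    # T = sum of score-weighted deviations from the pooled proportion
    t = np.sum(scores * (events - totals * p_bar))
    # Null variance of T, conditional on the total number of events
    var_t = p_bar * (1 - p_bar) * (
        np.sum(totals * scores**2) - np.sum(totals * scores)**2 / N
    )
    z = t / np.sqrt(var_t)
    return z, 2 * norm.sf(abs(z))  # two-sided p-value

# Hypothetical dose-response data: event counts out of 50 at doses 0, 1, 2, 4
z, p = cochran_armitage([2, 4, 7, 12], [50, 50, 50, 50], [0, 1, 2, 4])
print(f"Z = {z:.3f}, p = {p:.4f}")
```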

2.
The problem of variable selection in generalized linear mixed models (GLMMs) is pervasive in statistical practice. Many methodologies for determining the best subset of explanatory variables exist, varying with model complexity and the demands of particular applications. In this paper, we develop a "higher posterior probability model with bootstrap" (HPMB) approach to select explanatory variables without fitting all possible GLMMs involving a small or moderate number of explanatory variables. Furthermore, to reduce the computational load, we propose an efficient approximation based on Laplace's method and Taylor expansion for the intractable integrals in GLMMs. Simulation studies and an application to HapMap data provide evidence that this selection approach is computationally feasible and reliable for exploring true candidate genes and gene–gene associations, after adjusting for complex structures among clusters.
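As a toy illustration of Laplace's method (the authors combine it with Taylor expansion for the far harder GLMM integrals), the sketch below approximates a one-dimensional integral of exp(h(theta)) by exp(h(mode)) * sqrt(2*pi / -h''(mode)) and checks it against a case with a known answer; the integrand is illustrative only.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.integrate import quad

def laplace_approx(h, bounds=(-10, 10)):
    """Approximate the integral of exp(h(theta)) via Laplace's method."""
    # Locate the mode of h
    mode = minimize_scalar(lambda t: -h(t), bounds=bounds, method="bounded").x
    # Second derivative at the mode by central differences
    eps = 1e-4
    h2 = (h(mode + eps) - 2 * h(mode) + h(mode - eps)) / eps**2
    return np.exp(h(mode)) * np.sqrt(2 * np.pi / -h2)

# Toy integrand exp(-t^2/2 + t): exact integral is sqrt(2*pi) * exp(1/2)
h = lambda t: -0.5 * t**2 + t
exact, _ = quad(lambda t: np.exp(h(t)), -np.inf, np.inf)
print(laplace_approx(h), exact)  # both ~= 4.1327 (exact here: h is quadratic)
```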

3.
Gilbert, Rossini, and Shankarappa (2005, Biometrics 61, 106–117) present four U-statistic based tests to compare genetic diversity between different samples. The proposed tests improved upon previously used methods by accounting for the correlations in the data. We find, however, that the same correlations introduce an unacceptable bias, at modest sample sizes, in the sample estimators used for the variance and covariance of the inter-sequence genetic distances. Here, we compute unbiased estimators for these quantities and demonstrate the resulting improvement using simulated data. We also show that, contrary to the claims in Gilbert et al., it is not always possible to apply the Welch–Satterthwaite approximate t-test, and we provide explicit formulas for the degrees of freedom to be used when such an approximation is possible.
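As background, the classical Welch–Satterthwaite degrees of freedom for two independent samples are computed below; the article's point is precisely that correlated inter-sequence distances require corrected variance estimators and modified degrees of freedom, so this sketch shows only the baseline formula being adapted.

```python
def welch_satterthwaite_df(s1_sq, n1, s2_sq, n2):
    """Classical Welch-Satterthwaite df for two independent samples."""
    v1, v2 = s1_sq / n1, s2_sq / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Illustrative values: unequal variances and sample sizes
print(welch_satterthwaite_df(4.0, 15, 9.0, 20))  # ~32.6, between min(n)-1 and n1+n2-2
```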

4.
We consider the bivariate situation of a quantitative, ordinal, binary, or censored response variable and a quantitative or ordinal exposure variable (dose) with a hypothetical effect on the response. The data can be the outcome either of a planned dose-response experiment with only a few dose levels or of an observational study in which, for example, both the exposure and the response variable are observed within each individual. We are interested in testing the null hypothesis of no effect of the dose variable against a dose-response function depending on an unknown 'threshold' parameter. The dose-response functions considered range from no-observed-effect-level (NOEL) models to umbrella alternatives. Here we discuss generalizations of the method of Lausen & Schumacher (Biometrics, 1992, 48, 73–85), which are based on combinations of two-sample rank statistics and rank statistics for trend. Our approach may be seen as a generalization of a proposal for change-point problems. Using the approach of Davies (Biometrika, 1987, 74, 33–43), we derive and approximate the asymptotic null distribution when a large number of thresholds is considered, and we use an improved Bonferroni inequality as an approximation when the number of thresholds is small. Moreover, we analyse the small-sample behaviour by means of a Monte Carlo study. The paper is illustrated with examples from clinical research and epidemiology.
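A hedged sketch of the general idea follows: standardize a two-sample rank statistic at every candidate threshold and calibrate the maximum by permutation. The authors instead derive Davies' approximation and an improved Bonferroni inequality; all data here are simulated for illustration.

```python
import numpy as np
from scipy.stats import rankdata

rng = np.random.default_rng(0)

def max_rank_stat(dose, response, thresholds):
    """Max over thresholds of standardized Wilcoxon rank-sum statistics."""
    ranks = rankdata(response)
    n = len(response)
    stats = []
    for c in thresholds:
        grp = dose > c                       # split at candidate threshold c
        m = grp.sum()
        if m == 0 or m == n:
            continue
        w = ranks[grp].sum()                 # rank sum in the high-dose group
        mu = m * (n + 1) / 2
        sd = np.sqrt(m * (n - m) * (n + 1) / 12)
        stats.append(abs(w - mu) / sd)
    return max(stats)

# Simulated NOEL-type effect: response shifts only beyond dose 6
dose = rng.uniform(0, 10, 60)
response = (dose > 6) * 1.0 + rng.normal(size=60)
thr = np.quantile(dose, np.linspace(0.1, 0.9, 9))
obs = max_rank_stat(dose, response, thr)
perm = [max_rank_stat(dose, rng.permutation(response), thr) for _ in range(999)]
p = (1 + sum(s >= obs for s in perm)) / 1000
print(f"max statistic = {obs:.2f}, permutation p = {p:.3f}")
```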

5.
In order to study family-based association in the presence of linkage, we extend a generalized linear mixed model proposed for genetic linkage analysis (Lebrec and van Houwelingen (2007), Human Heredity 64, 5–15) by adding a genotypic effect to the mean. The corresponding score test is a weighted family-based association test statistic, where the weight depends on the linkage effect and on other genetic and shared environmental effects. For testing genetic association in the presence of gene–covariate interaction, we propose a linear regression method in which the family-specific score statistic is regressed on family-specific covariates. Both statistics are straightforward to compute. Simulation results show that adjusting the weight for the within-family variance structure can be a powerful approach in the presence of environmental effects, and that the test statistic for genetic association in the presence of gene–covariate interaction improves the power for detecting association. For illustration, we analyze the rheumatoid arthritis data from GAW15; adjusting for smoking and anti-cyclic citrullinated peptide increases the significance of the association with the DR locus.

6.
Group randomized trials (GRTs) randomize groups, or clusters, of people to intervention or control arms. To test the effectiveness of the intervention when subject-level outcomes are binary, while fitting a marginal model that adjusts for cluster-level covariates and uses a logistic link, we develop a pseudo-Wald statistic to improve inference. Alternative Wald statistics could employ bias-corrected empirical sandwich standard error estimates, which have received limited attention in the GRT literature despite their broad utility and applicability in our settings of interest. The test could also be carried out using popular approaches based upon cluster-level summary outcomes. A simulation study covering a variety of realistic GRT settings is used to compare the accuracy of these methods in terms of producing nominal test sizes. Tests based upon the pseudo-Wald statistic and a cluster-level summary approach using the natural log of the observed cluster-level odds worked best. Due to weighting, some popular cluster-level summary approaches were found to give invalid inference in many settings. Finally, although bias-corrected empirical sandwich standard error estimates did not consistently yield nominal sizes, they did work well, supporting the applicability of marginal models in GRT settings.
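Below is a minimal sketch of the cluster-level summary approach that performed well in the comparison, a two-sample t-test on the natural log of the observed cluster-level odds, applied to hypothetical GRT data; the pseudo-Wald statistic itself is not reproduced, and the 0.5 continuity correction is an assumption of this sketch.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def cluster_log_odds_test(events, sizes, arm):
    """Unweighted t-test on log observed odds, one summary per cluster."""
    # 0.5 continuity correction guards against clusters with 0 or all events
    log_odds = np.log((events + 0.5) / (sizes - events + 0.5))
    return stats.ttest_ind(log_odds[arm == 1], log_odds[arm == 0])

# Hypothetical GRT: 10 clusters per arm, 40-60 subjects per cluster
sizes = rng.integers(40, 60, size=20)
arm = np.repeat([0, 1], 10)
p_true = np.where(arm == 1, 0.35, 0.25)   # intervention raises event risk
events = rng.binomial(sizes, p_true)
print(cluster_log_odds_test(events, sizes, arm))
```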

7.
How best to summarize large and complex datasets is a problem that arises in many areas of science. We approach it from the point of view of seeking data summaries that minimize the average squared error of the posterior distribution for a parameter of interest under approximate Bayesian computation (ABC). In ABC, simulation under the model replaces computation of the likelihood, which is convenient for many complex models. Simulated and observed datasets are usually compared using summary statistics, typically chosen in practice on the basis of the investigator's intuition and established practice in the field. We propose two algorithms for the automated choice of efficient data summaries. First, we motivate minimisation of the estimated entropy of the posterior approximation as a heuristic for the selection of summary statistics. Second, we propose a two-stage procedure: the minimum-entropy algorithm is used to identify simulated datasets close to the observed one, and each of these is successively regarded as an observed dataset for which the mean root integrated squared error of the ABC posterior approximation is minimized over sets of summary statistics. In a simulation study, we inferred, both singly and jointly, the scaled mutation and recombination parameters from a population sample of DNA sequences. The computationally fast minimum-entropy algorithm showed a modest improvement over existing methods, while our two-stage procedure showed substantial and highly significant further improvement for both univariate and bivariate inferences. We found that the optimal set of summary statistics was highly dataset-specific, suggesting that there may be no globally optimal choice, which argues for a new selection for each dataset even if the model and target of inference are unchanged.
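For readers unfamiliar with ABC, a bare-bones rejection sampler with a single fixed summary statistic is sketched below; the article's contribution, choosing the summaries automatically, sits on top of this loop, and the normal-mean model, prior, and tolerance are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

def abc_rejection(observed, n_draws=100_000, tol=0.05):
    """Rejection ABC for the mean of a N(mu, 1) sample; summary = sample mean."""
    s_obs = observed.mean()                   # chosen summary statistic
    mu = rng.uniform(-5, 5, n_draws)          # draws from an illustrative prior
    # The sample mean of n N(mu, 1) draws is N(mu, 1/n), so the summary
    # of each simulated dataset can be drawn directly
    s_sim = rng.normal(mu, 1 / np.sqrt(len(observed)))
    return mu[np.abs(s_sim - s_obs) < tol]    # accepted posterior sample

observed = rng.normal(1.7, 1, size=50)
post = abc_rejection(observed)
print(f"posterior mean ~= {post.mean():.2f}, n accepted = {len(post)}")
```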

8.
The problem of combining information from separate trials is a key consideration when performing a meta-analysis or planning a multicentre trial. Although there is a considerable journal literature comparing meta-analysis based on individual patient data (IPD), i.e. a one-step IPD meta-analysis, with analysis based on summary data, i.e. a two-step IPD meta-analysis, recent articles in the medical literature indicate that there is still confusion and uncertainty as to the validity of an analysis based on aggregate data. In this study, we address one of the central statistical issues by considering the estimation of a linear function of the mean, based on linear models for summary data and for IPD. The summary data from a trial are assumed to comprise the best linear unbiased estimator, or maximum likelihood estimator, of the parameter, along with its covariance matrix. The setup, which allows for random effects and covariates in the model, is quite general and includes many commonly employed models, for example, linear models with fixed treatment effects and fixed or random trial effects. For this general model, we derive a condition under which the one-step and two-step IPD meta-analysis estimators coincide, extending earlier work considerably. The implications of this result for the specific models mentioned above are illustrated in detail, both theoretically and in terms of two real data sets, and the roles of balance and heterogeneity are highlighted. Our analysis also shows that when covariates are present, which is typically the case, the two estimators coincide only under extra simplifying assumptions that are somewhat unrealistic in practice.
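A minimal sketch of the two-step (aggregate-data) side of the comparison follows: the familiar fixed-effect inverse-variance estimator built from per-trial estimates and variances. The article's result concerns when this coincides with the one-step IPD fit; the numbers are hypothetical.

```python
import numpy as np

def fixed_effect_meta(estimates, variances):
    """Inverse-variance weighted pooled estimate and its standard error."""
    w = 1.0 / np.asarray(variances)
    pooled = np.sum(w * np.asarray(estimates)) / np.sum(w)
    se = np.sqrt(1.0 / np.sum(w))
    return pooled, se

# Hypothetical per-trial treatment effect estimates and their variances
pooled, se = fixed_effect_meta([0.42, 0.31, 0.55], [0.02, 0.05, 0.03])
print(f"pooled = {pooled:.3f} (SE {se:.3f})")
```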

9.
Glaucoma is a progressive disease caused by damage to the optic nerve, with associated functional losses. Although the relationship between structural and functional progression in glaucoma is well established, there is disagreement on how this association evolves over time. In addressing this issue, we propose a new class of non-Gaussian linear mixed models to estimate the correlations among subject-specific effects in multivariate longitudinal studies with a skewed distribution of random effects, to be used in a study of glaucoma. This class provides efficient estimation of subject-specific effects by modeling the skewed random effects through the log-gamma distribution, and it yields more reliable estimates of the correlations between the random effects. To validate the log-gamma assumption against the usual normality assumption for the random effects, we propose a lack-of-fit test based on the profile likelihood function of the shape parameter. We apply this method to data from a prospective observational study, the Diagnostic Innovations in Glaucoma Study, and find a statistically significant association between structural and functional change rates, leading to a better understanding of the progression of glaucoma over time.

10.
We consider the problem of testing mixture proportions using two-sample data, one sample from group one and the other from a mixture of groups one and two with unknown proportion, λ, of membership in group two. Various statistical applications, including microarray studies, infectious disease epidemiology, case–control studies with contaminated controls, clinical trials allowing "nonresponders," genetic studies of gene mutation, and fishery applications, can be formulated in this setup. Under the assumption that the log ratio of the probability (density) functions of the two groups is linear in the observations, we propose a generalized score test statistic for the mixture proportion. Under some regularity conditions, this statistic is shown to converge to a weighted chi-squared random variable under the null hypothesis λ = 0, where the weight depends only on the sampling fractions of the two groups. The permutation method is used to provide a more reliable finite-sample approximation. Simulation results and two real data applications are presented.

11.
In linear mixed-effects models, random effects are used to capture the heterogeneity and variability between individuals due to unmeasured covariates or unknown biological differences. Testing whether random effects are needed is a nonstandard problem because it requires testing on the boundary of the parameter space, where the asymptotic chi-squared distribution of classical tests such as the likelihood ratio and score tests is incorrect. Several tests have been proposed in the literature to overcome this difficulty; however, all of them rely on the restrictive assumption of i.i.d. measurement errors. The presence of correlated errors, which often occurs in practice, makes testing random effects much more difficult. In this paper, we propose a permutation test for random effects in the presence of serially correlated errors. The proposed test not only avoids issues with the boundary of the parameter space, but can also be used for testing multiple random effects and any subset of them. Our permutation procedure includes the procedure of Drikvandi, Verbeke, Khodadadi, and Partovi Nia (2013) as a special case when errors are i.i.d., though the test statistics differ. We use simulations and a real data analysis to evaluate the performance of the proposed permutation test, finding that random slopes for linear and quadratic time effects may not be significant when measurement errors are serially correlated.

12.
Assessing the exceptionality of network motifs.
Obtaining and analyzing biological interaction networks is at the core of systems biology. To help make sense of these complex networks, many recent works have suggested focusing on motifs that occur more frequently than expected at random. To identify such exceptional motifs in a given network, we propose a statistical and analytical method that does not require any simulation. We first provide analytical expressions for the mean and variance of the motif count under any exchangeable random graph model, and then approximate the motif count distribution by a compound Poisson distribution whose parameters are derived from this mean and variance. Using simulations, we show that the compound Poisson approximation outperforms the Gaussian approximation. The compound Poisson distribution can then be used to obtain an approximate p-value and to decide whether an observed count is significantly high. Our methodology is applied to protein-protein interaction (PPI) networks, and statistical issues related to exceptional motif detection are discussed.
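The moment-matching step can be sketched as follows, assuming the geometric-compound (Polya–Aeppli) form of the compound Poisson: given mean m and variance v > m, set a = (v - m)/(v + m) and lambda = m*(1 - a). The tail probability is obtained here by simulation purely for illustration, whereas the paper works with the distribution analytically.

```python
import numpy as np

rng = np.random.default_rng(3)

def polya_aeppli_pvalue(mean, var, observed, n_sim=50_000):
    """Upper-tail p-value for a count under a moment-matched compound Poisson.

    Assumes geometric clump sizes and var > mean (overdispersion), as is
    typical for motif counts in random graphs.
    """
    a = (var - mean) / (var + mean)      # geometric 'clumping' parameter
    lam = mean * (1 - a)                 # Poisson rate of clumps
    n_clumps = rng.poisson(lam, size=n_sim)
    # Total count = sum of N geometric(1 - a) clump sizes (support 1, 2, ...)
    totals = np.array([rng.geometric(1 - a, size=n).sum() for n in n_clumps])
    return np.mean(totals >= observed)

# Hypothetical motif count: mean 12, variance 40, observed count 35
print(polya_aeppli_pvalue(12.0, 40.0, 35))
```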

13.
Statistical models are the traditional choice for testing scientific theories when observations, processes, or boundary conditions are subject to stochasticity. Many important systems in ecology and biology, however, are difficult to capture with statistical models. Stochastic simulation models offer an alternative, but they have hitherto carried a major disadvantage: their likelihood functions usually cannot be calculated explicitly, and so it is difficult to couple them to well-established statistical theory such as maximum likelihood and Bayesian statistics. A number of new methods, among them Approximate Bayesian Computing and Pattern-Oriented Modelling, bypass this limitation. These methods share three main principles: aggregation of simulated and observed data via summary statistics, likelihood approximation based on the summary statistics, and efficient sampling. We discuss the principles as well as the advantages and caveats of these methods, and demonstrate their potential for integrating stochastic simulation models into a unified framework for statistical modelling.

14.
Zhang K, Traskin M, Small DS (2012). Biometrics 68(1), 75–84.
For group-randomized trials, randomization inference based on rank statistics provides robust, exact inference that remains valid under nonnormal distributions. In a matched-pair design, however, the currently available rank-based statistics lose significant power compared to normal linear mixed model (LMM) test statistics when the LMM is true. In this article, we develop a test statistic that, under certain assumptions, is optimal among all statistics of the form of a weighted sum of signed Mann-Whitney-Wilcoxon statistics. This test is almost as powerful as the LMM even when the LMM is true, but it is much more powerful for heavy-tailed distributions. A simulation study is conducted to examine the power.

15.
The meta-analysis of diagnostic accuracy studies is often of interest in screening programs for many diseases. The typical summary statistics for studies chosen for a diagnostic accuracy meta-analysis are two-dimensional: sensitivities and specificities. The common statistical approach to the meta-analysis of diagnostic studies is based on the bivariate generalized linear mixed model (BGLMM), which has study-specific interpretations. In this article, we present a population-averaged (PA) model using generalized estimating equations (GEE) for making inference on the mean specificity and sensitivity of a diagnostic test in the population represented by the meta-analytic studies. We also derive the marginalized counterparts of the regression parameters from the BGLMM. We illustrate the proposed PA approach through two dataset examples and compare the performance of estimators of the marginal regression parameters from the PA model with that of the marginalized regression parameters from the BGLMM through Monte Carlo simulation studies. Overall, both the marginalized BGLMM and GEE with sandwich standard errors maintained nominal 95% confidence interval coverage for mean specificity and mean sensitivity in meta-analyses of 25 or more studies, even under misspecification of the covariance structure of the bivariate positive test counts for diseased and nondiseased subjects.

16.
Kneib T, Fahrmeir L (2006). Biometrics 62(1), 109–118.
Motivated by a space-time study on forest health with the damage state of trees as the response, we propose a general class of structured additive regression models for categorical responses, allowing for a flexible semiparametric predictor. Nonlinear effects of continuous covariates, time trends, and interactions between continuous covariates are modeled by penalized splines. Spatial effects can be estimated based on Markov random fields, Gaussian random fields, or two-dimensional penalized splines. We present our approach from a Bayesian perspective, with inference based on a categorical linear mixed model representation. The resulting empirical Bayes method is closely related to penalized likelihood estimation in a frequentist setting. Variance components, corresponding to inverse smoothing parameters, are estimated using (approximate) restricted maximum likelihood. In simulation studies we investigate the performance of different choices for the spatial effect, compare the empirical Bayes approach to competing methodology, and study the bias of mixed model estimates. As an application we analyze data from the forest health survey.

17.
Saville BR, Herring AH (2009). Biometrics 65(2), 369–376.
Deciding which predictor effects may vary across subjects is a difficult issue. Standard model selection criteria and test procedures are often inappropriate for comparing models with different numbers of random effects, due to constraints on the parameter space of the variance components. Testing on the boundary of the parameter space changes the asymptotic distribution of some classical test statistics and causes problems in approximating Bayes factors. We propose a simple approach for testing random effects in the linear mixed model using Bayes factors. We scale each random effect to the residual variance and introduce a parameter that controls the relative contribution of each random effect, free of the scale of the data. We integrate out the random effects and the variance components using closed-form solutions. The resulting integrals needed to calculate the Bayes factor are low-dimensional integrals lacking variance components and can be efficiently approximated with Laplace's method. We propose a default prior distribution on the parameter controlling the contribution of each random effect and conduct simulations to show that our method has good properties for model selection problems. Finally, we illustrate our methods on data from a clinical trial of patients with bipolar disorder and on data from an environmental study of water disinfection by-products and male reproductive outcomes.

18.
We propose a Bayesian chi-squared model diagnostic for the analysis of data subject to censoring. The test statistic has the form of Pearson's chi-squared test statistic and is easy to calculate from the standard output of Markov chain Monte Carlo algorithms. The key innovation of this diagnostic is that it is based only on observed failure times. Because it does not rely on the imputation of failure times for observations that have been censored, we show that under heavy censoring it can have higher power for detecting model departures than a comparable test based on the complete data. In a simulation study, we show that tests based on this diagnostic exhibit comparable power and better nominal Type I error rates than a commonly used alternative test proposed by Akritas (1988, Journal of the American Statistical Association 83, 222–230). An important advantage of the proposed diagnostic is that it can be applied to a broad class of censored data models, including generalized linear models and other models with nonidentically distributed and nonadditive error structures. We illustrate the proposed model diagnostic by testing the adequacy of two parametric survival models for Space Shuttle main engine failures.

19.
Hiriote S, Chinchilli VM (2011). Biometrics 67(3), 1007–1016.
In many clinical studies, Lin's concordance correlation coefficient (CCC) is a common tool for assessing the agreement of a continuous response measured by two raters or methods. However, the need for measures of agreement may arise in more complex situations, such as when the responses are measured on more than one occasion by each rater or method. In this work, we propose a new CCC for the setting of repeated measurements, called the matrix-based concordance correlation coefficient (MCCC), based on a matrix norm that possesses the properties needed to characterize the level of agreement between two p × 1 vectors of random variables. It can be shown that the MCCC reduces to Lin's CCC when p = 1. For inference, we propose an estimator of the MCCC based on U-statistics and derive its asymptotic distribution, which is proven to be normal. Simulation studies confirm that, in terms of accuracy, precision, and coverage probability, the estimator works very well in general, especially when n is greater than 40. Finally, we use real data from an Asthma Clinical Research Network (ACRN) study and the Penn State Young Women's Health Study for demonstration.
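For the p = 1 case to which the MCCC reduces, Lin's CCC computed from sample moments is sketched below; the U-statistic estimator and the asymptotics for the matrix-based version are the paper's contribution and are not reproduced, and the rater data are simulated.

```python
import numpy as np

def lins_ccc(x, y):
    """Lin's concordance correlation coefficient from sample moments."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    sxy = np.cov(x, y, bias=True)[0, 1]   # biased (1/n) covariance, matching var()
    return 2 * sxy / (x.var() + y.var() + (x.mean() - y.mean()) ** 2)

# Simulated raters measuring the same continuous response
rng = np.random.default_rng(4)
truth = rng.normal(10, 2, size=60)
rater1 = truth + rng.normal(0, 0.5, 60)
rater2 = truth + 0.3 + rng.normal(0, 0.5, 60)   # rater 2 has a small bias
print(f"CCC = {lins_ccc(rater1, rater2):.3f}")
```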

20.
Phylogenetic comparative methods (PCMs) have been used to test evolutionary hypotheses at phenotypic levels. The evolutionary modes commonly included in PCMs are Brownian motion (genetic drift) and the Ornstein–Uhlenbeck process (stabilizing selection), whose likelihood functions are mathematically tractable. More complicated models of evolutionary modes, such as branch-specific directional selection, have not been used because calculations of likelihoods and parameter estimates in the maximum-likelihood framework are not straightforward. To solve this problem, we introduce a population genetics framework into a PCM and present a flexible and comprehensive framework for estimating evolutionary parameters through simulation-based likelihood computations. The method does not require analytic likelihood computations, and any evolutionary model can be used as long as simulation from it is possible. Our approach has many advantages: it incorporates different evolutionary modes for phenotypes into the phylogeny, it takes intraspecific variation into account, it evaluates the full likelihood instead of relying on summary statistics, and it can be used to estimate ancestral traits. We present a successful application of the method to the evolution of brain size in primates. Our method can easily be implemented in more computationally effective frameworks such as approximate Bayesian computation (ABC), which will enhance the use of computationally intensive methods in the study of phenotypic evolution.
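As a toy illustration of simulation-based likelihood computation, stripped of the phylogeny, intraspecific variation, and selection modes the authors actually handle, the sketch below approximates the likelihood at each parameter value by a kernel density estimate over model simulations; the toy model is an assumption of this sketch.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(5)

def simulated_log_lik(theta, observed, simulate, n_sim=5_000):
    """Monte Carlo log-likelihood: KDE over outcomes simulated at theta."""
    sims = simulate(theta, n_sim)
    return gaussian_kde(sims).logpdf(observed).sum()

# Toy 'evolutionary model': trait ~ Normal(theta, 1), theta unknown
simulate = lambda theta, n: rng.normal(theta, 1.0, size=n)
observed = np.array([2.1, 1.8, 2.4])
grid = np.linspace(0, 4, 41)
ll = [simulated_log_lik(t, observed, simulate) for t in grid]
print(f"approximate MLE: theta ~= {grid[np.argmax(ll)]:.2f}")
```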
