Similar articles
 Found 20 similar articles (search time: 0 ms)
1.
    
This article applies a simple method to settings where one has clustered data but statistical methods are available only for independent data. We assume the statistical method provides a normally distributed estimate, theta, and an estimate of its variance, sigma^2. We randomly select one data point from each cluster and apply the statistical method to the resulting independent data. We repeat this multiple times and use the average of the associated thetas as our estimate. An estimate of the variance is given by the average of the sigma^2 values minus the sample variance of the thetas. We call this procedure multiple outputation, as all "excess" data within each cluster are thrown out multiple times. Hoffman, Sen, and Weinberg (2001, Biometrika 88, 1121-1134) introduced this approach for generalized linear models when the cluster size is related to outcome. In this article, we demonstrate the broad applicability of the approach. Applications to angular data, p-values, vector parameters, Bayesian inference, genetics data, and random cluster sizes are discussed. In addition, asymptotic normality of estimates based on all possible outputations, as well as on a finite number of outputations, is proven under weak conditions. Multiple outputation provides a simple and broadly applicable method for analyzing clustered data. It is especially suited to settings where methods for clustered data are impractical, but it can also be applied generally as a quick and simple tool.
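
The resampling procedure in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names `multiple_outputation` and `mean_estimator`, the use of NumPy, and the choice of the sample mean as the independent-data method are all hypothetical.

```python
import numpy as np

def multiple_outputation(clusters, estimator, n_outputations=500, seed=0):
    """Average an independent-data estimator over repeated one-per-cluster draws.

    clusters:  list of 1-D NumPy arrays, one array per cluster (assumed layout)
    estimator: function mapping a 1-D sample to (estimate, variance estimate)
    """
    rng = np.random.default_rng(seed)
    thetas = np.empty(n_outputations)
    variances = np.empty(n_outputations)
    for m in range(n_outputations):
        # Keep one randomly chosen observation per cluster; discard the rest.
        sample = np.array([rng.choice(c) for c in clusters])
        thetas[m], variances[m] = estimator(sample)
    # Point estimate: average of the per-outputation estimates.
    theta_mo = thetas.mean()
    # Variance estimate: average variance minus sample variance of the estimates.
    var_mo = variances.mean() - thetas.var(ddof=1)
    return theta_mo, var_mo

def mean_estimator(sample):
    """Hypothetical independent-data method: sample mean and its variance."""
    return sample.mean(), sample.var(ddof=1) / len(sample)
```

Because the variance estimate is a difference of two quantities, it can come out negative when the number of outputations is small, so a larger `n_outputations` is the safer default in this sketch.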

2.
    
The augmentation of categorical outcomes with underlying Gaussian variables in bivariate generalized mixed effects models has facilitated the joint modeling of continuous and binary response variables. These models typically assume that random effects and residual effects (co)variances are homogeneous across all clusters and subjects, respectively. Motivated by conflicting evidence about the association between performance outcomes in dairy production systems, we consider the situation where these (co)variance parameters may themselves be functions of systematic and/or random effects. We present a hierarchical Bayesian extension of bivariate generalized linear models whereby functions of the (co)variance matrices are specified as linear combinations of fixed and random effects following a square‐root‐free Cholesky reparameterization that ensures necessary positive semidefinite constraints. We test the proposed model by simulation and apply it to the analysis of a dairy cattle data set in which the random herd‐level and residual cow‐level effects (co)variances between a continuous production trait and binary reproduction trait are modeled as functions of fixed management effects and random cluster effects.

3.
    
Likelihood analysis for regression models with measurement errors in explanatory variables typically involves integrals that do not have a closed-form solution. In this case, numerical methods such as Gaussian quadrature are generally employed. However, when the dimension of the integral is large, these methods become computationally demanding or even infeasible. This paper proposes the use of the Laplace approximation to deal with measurement error problems when the likelihood function involves high-dimensional integrals. The cases considered are generalized linear models with multiple covariates measured with error and generalized linear mixed models with measurement error in the covariates. The asymptotic order of the approximation and the asymptotic properties of the Laplace-based estimator for these models are derived. The method is illustrated using simulations and real-data analysis.
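
For orientation, the standard first-order Laplace approximation to a d-dimensional integral can be written as follows. The symbols h, u, d, and H are notation assumed here for illustration and do not come from the abstract.

```latex
\int_{\mathbb{R}^d} e^{h(u)}\,du \;\approx\; (2\pi)^{d/2}\,\bigl|-H(\hat{u})\bigr|^{-1/2}\,e^{h(\hat{u})},
\qquad \hat{u} = \arg\max_{u} h(u),
\qquad H(\hat{u}) = \left.\frac{\partial^2 h}{\partial u\,\partial u^{\top}}\right|_{u=\hat{u}}
```

The approximation replaces the integral by a single optimization and a determinant, which is why it remains usable when quadrature over many dimensions is not.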

4.
    
Wei Pan 《Biometrics》2001,57(2):529-534
Model selection is a necessary step in many practical regression analyses. But for methods based on estimating equations, such as the quasi-likelihood and generalized estimating equation (GEE) approaches, there seem to be few well-studied model selection techniques. In this article, we propose a new model selection criterion that minimizes the expected predictive bias (EPB) of estimating equations. A bootstrap smoothed cross-validation (BCV) estimate of EPB is presented and its performance is assessed via simulation for overdispersed generalized linear models. For illustration, the method is applied to a real data set taken from a study of the development of ewe embryos.

5.
6.
Highest posterior density intervals are common in Bayesian inference, but as noted by Agresti and Min (2005, Biometrics 61, 515–523) they are not invariant under transformations. Agresti and Min suggested central or “tail” intervals as preferable in the context of the relative risk and odds ratio. A modification to this is proposed for extreme outcomes, as invariance is maintained when replacing central intervals by one‐sided intervals. Bayes–Laplace priors for the binomial parameters appear preferable here, compared to Jeffreys priors, contrary to Agresti and Min's suggestion based on frequentist coverage.

7.
    
Often, the functional form of covariate effects in an additive model varies across groups defined by levels of a categorical variable. This structure represents a factor-by-curve interaction. This article presents penalized spline models that incorporate factor-by-curve interactions into additive models. A mixed model formulation for penalized splines allows for straightforward model fitting and smoothing parameter selection. We illustrate the proposed model by applying it to ragweed pollen data in which seasonal trends vary by year.

8.
    
Paulino CD  Soares P  Neuhaus J 《Biometrics》2003,59(3):670-675
Motivated by a study of human papillomavirus infection in women, we present a Bayesian binomial regression analysis in which the response is subject to an unconstrained misclassification process. Our iterative approach provides inferences for the parameters that describe the relationships of the covariates with the response and for the misclassification probabilities. Furthermore, our approach applies to any meaningful generalized linear model, making model selection possible. Finally, it is straightforward to extend it to multinomial settings.

9.
    
Haitao Chu 《Biometrics》2015,71(2):538-547

10.
    
Li Liu  Liming Xiang 《Biometrics》2014,70(4):910-919

11.
Concerns have been raised that posterior probabilities on phylogenetic trees can be unreliable when the true tree is unresolved or has very short internal branches, because existing methods for Bayesian phylogenetic analysis do not explicitly evaluate unresolved trees. Two recent papers have proposed that evaluating only resolved trees results in a "star tree paradox": when the true tree is unresolved or close to it, posterior probabilities were predicted to become increasingly unpredictable as sequence length grows, resulting in inflated confidence in one resolved tree or another and an increasing risk of false-positive inferences. Here we show that this is not the case; existing Bayesian methods do not lead to an inflation of statistical confidence, provided the evolutionary model is correct and uninformative priors are assumed. Posterior probabilities do not become increasingly unpredictable with increasing sequence length, and they exhibit conservative type I error rates, leading to a low rate of false-positive inferences. With infinite data, posterior probabilities give equal support for all resolved trees, and the rate of false inferences falls to zero. We conclude that there is no star tree paradox caused by not sampling unresolved trees.

12.
    
Haolun Shi  Guosheng Yin 《Biometrics》2018,74(3):1055-1064

13.
    
To study family‐based association in the presence of linkage, we extend a generalized linear mixed model proposed for genetic linkage analysis (Lebrec and van Houwelingen, 2007, Human Heredity 64, 5–15) by adding a genotypic effect to the mean. The corresponding score test is a weighted family‐based association test statistic, where the weight depends on the linkage effect and on other genetic and shared environmental effects. For testing genetic association in the presence of gene–covariate interaction, we propose a linear regression method in which the family‐specific score statistic is regressed on family‐specific covariates. Both statistics are straightforward to compute. Simulation results show that adjusting the weight for the within‐family variance structure may be a powerful approach in the presence of environmental effects. The test statistic for genetic association in the presence of gene–covariate interaction improved the power for detecting association. For illustration, we analyze the rheumatoid arthritis data from GAW15. Adjusting for smoking and anti‐cyclic citrullinated peptide increased the significance of the association with the DR locus.

14.
Species’ presence/absence at two time points is a very common form of ecological data. It is the simplest type of longitudinal study and has fundamental applications in ecological succession, environmental monitoring, and climate change scenarios. Although such data are widespread, statistical regression has rarely been used to analyse them. We propose the use of the bivariate odds-ratio model to analyse these data. Although seldom used in ecology, the model is argued to be suitable, especially within a constrained ordination framework. In particular, this paper presents the constrained ordination-odds ratio framework as a potentially important key to understanding the underlying processes of niche theory dynamics; for example, local extinction and colonization probabilities can be described in terms of it. Some of the mathematical and statistical challenges associated with more ambitious extensions are highlighted. As an example, with an underlying Poisson abundance model, a complementary log-log link for the marginal probabilities is shown to be more appropriate. We then develop this model based on the zero-inflated Poisson distribution, since excess absences relative to a Poisson distribution are frequent in practice. Two vegetation data sets are used for illustrative purposes.
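
The connection between a Poisson abundance model and the complementary log-log link can be checked numerically. The sketch below is an illustration only: the function names and the zero-inflation parameterization (`phi` as the probability of a structural absence) are assumptions made here, not details taken from the paper.

```python
import math

def presence_prob_poisson(lam):
    """P(count > 0) when abundance is Poisson with mean lam."""
    return 1.0 - math.exp(-lam)

def cloglog(p):
    """Complementary log-log transform of a probability p in (0, 1)."""
    return math.log(-math.log(1.0 - p))

def presence_prob_zip(lam, phi):
    """P(count > 0) under a zero-inflated Poisson.

    phi is the assumed probability of a structural zero, so presence
    requires both no structural zero and a positive Poisson count.
    """
    return (1.0 - phi) * (1.0 - math.exp(-lam))

# Under the plain Poisson model, cloglog of the presence probability
# recovers log(lam), which is why the cloglog link is the natural one.
lam = 0.5
linear_predictor = cloglog(presence_prob_poisson(lam))
```

With zero inflation the presence probability is scaled down by `1 - phi`, so the same abundance mean yields more absences than the plain Poisson model predicts.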

15.
    
An important problem in agronomy is the study of longitudinal data on the growth curve of the weight of cattle through time, possibly taking into account the effect of other explanatory variables such as treatments and time. In this paper, a Bayesian approach for analysing longitudinal data is proposed. It takes into account regression structures on the mean and the variance‐covariance matrix of normal observations. The approach is based on the modeling strategy suggested by Pourahmadi (1999, Biometrika 86, 667–690). After reviewing this methodology, we present the Bayesian approach used to fit the models, based on a generalization of the Metropolis‐Hastings algorithm of Cepeda and Gamerman (2000, Brazilian Journal of Probability and Statistics 14, 207–221). The approach is applied to the study of the growth and development of a group of deaf children. The paper concludes with a few proposed extensions. (© 2004 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)

16.
17.
    
We have estimated the number of sika deer, Cervus nippon, in Hokkaido, Japan, with the aim of developing a management program that will reduce the level of agricultural damage caused by these deer. A population index, defined as the population divided by the population in 1993, is first estimated from data obtained during a spotlight survey. A generalized linear mixed model (GLMM) with corner point constraints is used in this estimation. We then estimate the population from the index by evaluating the response of the index to the known amount of harvest, including hunting. A stage-structured model is used in this harvest-based estimation. It is well known that estimates of indices suffer from large observation errors when the probability of observation fluctuates widely; therefore, we apply state-space modeling to the harvest-based estimation to remove the observation errors. We propose the use of Bayesian estimation with uniform prior distributions as an approximation of maximum likelihood estimation, without making the arbitrary assumption that the parameters fluctuate according to prior distributions. We demonstrate that the harvest-based Bayesian estimation is effective in reducing the observation errors in sika deer populations, but the stage-structured model requires many demographic parameters to be known before the analyses are run. These parameters cannot be estimated from the observed time series of the index if the data are insufficient. We then construct a univariate model by simplifying the stage-structured model and show that the simplified model yields estimates that are nearly identical to those obtained from the stage-structured model. This simplification also clarifies which parameter is important in estimating the population.

18.
A groundwater field is a complex and open system. Groundwater simulations and predictions often deviate from true values, which is attributed to the uncertainty of groundwater modeling. The conceptual model (model structure) is one of the main sources of groundwater modeling uncertainty. In this study, the mean Euclidean distance (MED) between model simulations and observations is proposed to assess the integrated likelihood value of a conceptual model in Bayesian model averaging (BMA). Moreover, the proposed BMA method is compared with the traditional generalized likelihood uncertainty estimation (GLUE) BMA method using a synthetic groundwater model, and the characteristics of the two BMA methods are summarized.

19.
    
The intraclass correlation is commonly used with clustered data. It is often estimated based on fitting a model to hierarchical data and it leads, in turn, to several concepts such as reliability, heritability, inter‐rater agreement, etc. For data where linear models can be used, such measures can be defined as ratios of variance components. Matters are more difficult for non‐Gaussian outcomes. The focus here is on count and time‐to‐event outcomes where so‐called combined models are used, extending generalized linear mixed models, to describe the data. These models combine normal and gamma random effects to allow for both correlation due to data hierarchies as well as for overdispersion. Furthermore, because the models admit closed‐form expressions for the means, variances, higher moments, and even the joint marginal distribution, it is demonstrated that closed forms of intraclass correlations exist. The proposed methodology is illustrated using data from agricultural and livestock studies.
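
As a reminder of the linear-model case mentioned in this abstract, the intraclass correlation for a random-intercept model is a ratio of variance components. The model and symbols below are standard textbook notation assumed here for illustration, not the notation of the paper.

```latex
y_{ij} = \mu + b_i + \varepsilon_{ij}, \qquad
b_i \sim N(0, \sigma_b^2), \qquad
\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2), \qquad
\rho = \operatorname{Corr}(y_{ij}, y_{ik}) = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_\varepsilon^2} \quad (j \neq k)
```

For count and time-to-event outcomes no such simple ratio is available in general, which is the gap the closed-form results in the abstract address.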

20.
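Genome-wide association analysis of kernel ratio in maize   Total citations: 1; self-citations: 0; citations by others: 1
Kernel ratio is closely related to single-ear yield in maize, and dissecting its genetic mechanism is of great importance for high-yield maize breeding. Using 309 maize inbred lines as an association panel, this study applied fixed and random model circulating probability unification (FarmCPU), the compressed mixed linear model (CMLM), and the multi-locus mixed linear model (MLMM) to the 2017 and 2019 data from Yuanyang (Xinxiang, Henan), Dancheng (Zhoukou, Henan), and Sanya (Hainan), as well as the best linear unbiased estimate (BLUE) values, for kernel ...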
