Found 20 similar articles (search time: 0 ms)
1.
Mahmoud Torabi 《Biometrical journal. Biometrische Zeitschrift》2016,58(5):1138-1150
Disease mapping of a single disease has been widely studied in the public health setup. Simultaneous modeling of related diseases can also be a valuable tool both from the epidemiological and from the statistical point of view. In particular, when we have several measurements recorded at each spatial location, we need to consider multivariate models in order to handle the dependence among the multivariate components as well as the spatial dependence between locations. It is then customary to use multivariate spatial models assuming the same distribution through the entire population density. However, in many circumstances, it is a very strong assumption to have the same distribution for all the areas of population density. To overcome this issue, we propose a hierarchical multivariate mixture generalized linear model to simultaneously analyze spatial Normal and non‐Normal outcomes. As an application of our proposed approach, esophageal and lung cancer deaths in Minnesota are used to show the outperformance of assuming different distributions for different counties of Minnesota rather than assuming a single distribution for the population density. Performance of the proposed approach is also evaluated through a simulation study.
2.
Getachew A. Dagne 《Biometrical journal. Biometrische Zeitschrift》2004,46(6):653-663
This article presents two‐component hierarchical Bayesian models which incorporate both overdispersion and excess zeros. The components may be resultants of some intervention (treatment) that changes the rare event generating process. The models are also expanded to take into account any heterogeneity that may exist in the data. Details of the model fitting, checking and selecting alternative models from a Bayesian perspective are also presented. The proposed methods are applied to count data on the assessment of an efficacy of pesticides in controlling the reproduction of whitefly. (© 2004 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)
3.
Maruotti A 《Biometrical journal. Biometrische Zeitschrift》2011,53(5):716-734
Two-part regression models are frequently used to analyze longitudinal count data with excess zeros, where the same set of subjects is repeatedly observed over time. In this context, several sources of heterogeneity may arise at individual level that affect the observed process. Further, longitudinal studies often suffer from missing values: individuals drop out of the study before its completion, and thus present incomplete data records. In this paper, we propose a finite mixture of hurdle models to face the heterogeneity problem, which is handled by introducing random effects with a discrete distribution; a pattern-mixture approach is specified to deal with non-ignorable missing values. This approach helps us to consider overdispersed counts, while allowing for association between the two parts of the model, and for non-ignorable dropouts. The effectiveness of the proposal is tested through a simulation study. Finally, an application to real data on skin cancer is provided.
4.
Empirical Bayes Gibbs sampling (total citations: 3; self-citations: 0; citations by others: 3)
Casella G 《Biostatistics (Oxford, England)》2001,2(4):485-500
The wide applicability of Gibbs sampling has increased the use of more complex and multi-level hierarchical models. To use these models entails dealing with hyperparameters in the deeper levels of a hierarchy. There are three typical methods for dealing with these hyperparameters: specify them, estimate them, or use a 'flat' prior. Each of these strategies has its own associated problems. In this paper, using an empirical Bayes approach, we show how the hyperparameters can be estimated in a way that is both computationally feasible and statistically valid.
5.
6.
7.
Osvaldo Loquiha Niel Hens Leonardo Chavane Marleen Temmerman Marc Aerts 《Biometrical journal. Biometrische Zeitschrift》2013,55(5):647-660
Count data are very common in health services research, and very commonly the basic Poisson regression model has to be extended in several ways to accommodate several sources of heterogeneity: (i) an excess number of zeros relative to a Poisson distribution, (ii) hierarchical structures, and correlated data, (iii) remaining “unexplained” sources of overdispersion. In this paper, we propose hierarchical zero‐inflated and overdispersed models with independent, correlated, and shared random effects for both components of the mixture model. We show that all different extensions of the Poisson model can be based on the concept of mixture models, and that they can be combined to account for all different sources of heterogeneity. Expressions for the first two moments are derived and discussed. The models are applied to data on maternal deaths and related risk factors within health facilities in Mozambique. The final model shows that the maternal mortality rate mainly depends on the geographical location of the health facility, the percentage of women admitted with HIV and the percentage of referrals from the health facility.
8.
We consider the estimation of success rate and harvest under post survey stratification at the sub‐domain (county) level. Often in this situation, the population size for the sub‐domain is unknown and the random mechanism that dictates the sample size for sub‐domains is ignored. Finding good estimators of success rate and harvest is very important for wildlife abundance. A Bayesian hierarchical model is developed to estimate both success rate and harvest simultaneously. The model includes a random sub‐domain sample size correlated with the number of successes in the sub‐domain, fixed week effects, random geographic effects, and spatial correlations between neighboring sub‐domains. The computation is done by Gibbs sampling and adaptive rejection sampling techniques. The method developed is illustrated using data from the Missouri Turkey Hunting Survey. The estimation of success rate is improved by treating the sub‐domain sample size as a random variable instead of a fixed constant. The Bayesian model yields a reasonable harvest estimation. The spatial pattern of the estimated harvest matches the pattern of the check station data.
9.
We extend an approach for estimating random effects parameters under a random intercept and slope logistic regression model to include standard errors, thereby including confidence intervals. The procedure entails numerical integration to yield posterior empirical Bayes (EB) estimates of random effects parameters and their corresponding posterior standard errors. We incorporate an adjustment of the standard error due to Kass and Steffey (KS; 1989, Journal of the American Statistical Association 84, 717-726) to account for the variability in estimating the variance component of the random effects distribution. In assessing health care providers with respect to adult pneumonia mortality, comparisons are made with the penalized quasi-likelihood (PQL) approximation approach of Breslow and Clayton (1993, Journal of the American Statistical Association 88, 9-25) and a Bayesian approach. To make comparisons with an EB method previously reported in the literature, we apply these approaches to crossover trials data previously analyzed with the estimating equations EB approach of Waclawiw and Liang (1994, Statistics in Medicine 13, 541-551). We also perform simulations to compare the proposed KS and PQL approaches. These two approaches lead to EB estimates of random effects parameters with similar asymptotic bias. However, for many clusters with small cluster size, the proposed KS approach does better than the PQL procedures in terms of coverage of nominal 95% confidence intervals for random effects estimates. For large cluster sizes and a few clusters, the PQL approach performs better than the KS adjustment. These simulation results agree somewhat with those of the data analyses.
10.
11.
Robust estimation of multivariate covariance components (total citations: 1; self-citations: 0; citations by others: 1)
In many settings, such as interlaboratory testing, small area estimation in sample surveys, and heritability studies, investigators are interested in estimating covariance components for multivariate measurements. However, the presence of outliers can seriously distort estimates obtained using standard procedures such as maximum likelihood. We propose a procedure based on M-estimation for robustly estimating multivariate covariance components in the presence of outliers; the procedure applies to balanced and unbalanced data. We present an algorithm for computing the robust estimates and examine the performance of the estimator through a simulation study. The estimator is used to find covariance components and identify outliers in a study of variability of egg length and breadth measurements of American coots.
12.
A Bayesian hierarchical generalized linear model is used to estimate hunting success rates at the subarea level for postseason harvest surveys. The model includes fixed week effects, random geographic effects, and spatial correlations between neighboring subareas. The computation is done by Gibbs sampling and adaptive rejection sampling techniques. The method is illustrated using data from the Missouri Turkey Hunting Survey in the spring of 1996. Bayesian model selection methods are used to demonstrate that there are significant week differences and spatial correlations of hunting success rates among counties. The Bayesian estimates are also shown to be quite robust in terms of changes of hyperparameters.
13.
Emanuela Dreassi Alessandra Petrucci Emilia Rocco 《Biometrical journal. Biometrische Zeitschrift》2014,56(1):141-156
Linear‐mixed models are frequently used to obtain model‐based estimators in small area estimation (SAE) problems. Such models, however, are not suitable when the target variable exhibits a point mass at zero, a highly skewed distribution of the nonzero values and a strong spatial structure. In this paper, a SAE approach for dealing with such variables is suggested. We propose a two‐part random effects SAE model that includes a correlation structure on the area random effects that appears in the two parts and incorporates a bivariate smooth function of the geographical coordinates of units. To account for the skewness of the distribution of the positive values of the response variable, a Gamma model is adopted. To fit the model, to get small area estimates and to evaluate their precision, a hierarchical Bayesian approach is used. The study is motivated by a real SAE problem. We focus on estimation of the per‐farm average grape wine production in Tuscany, at subregional level, using the Farm Structure Survey data. Results from this real data application and those obtained by a model‐based simulation experiment show a satisfactory performance of the suggested SAE approach.
14.
Bowman FD 《Biostatistics (Oxford, England)》2005,6(4):558-575
Functional neuroimaging, including positron emission tomography (PET) and functional magnetic resonance imaging (fMRI), plays an important role in identifying specific brain regions associated with experimental stimuli or psychiatric disorders such as schizophrenia. PET and fMRI produce massive data sets that contain both temporal correlations from repeated scans and complex spatial correlations. Several methods exist for handling temporal correlations, some of which rely on transforming the response data to induce either a known or an independence covariance structure. Despite the presence of spatial correlations between the volume elements (voxels) comprising a brain scan, conventional methods perform voxel-by-voxel analyses of measured brain activity. We propose a two-stage spatio-temporal model for the estimation and testing of localized activity. Our second-stage model specifies a spatial auto-regression, capturing correlations within neural processing clusters defined by a data-driven cluster analysis. We use maximum likelihood methods to estimate parameters from our spatial autoregressive model. Our model protects against type-I errors, enables the detection of both localized and regional activations (including volume of interest effects), provides information on functional connectivity in the brain, and establishes a framework to produce spatially smoothed maps of distributed brain activity for each individual. We illustrate the application of our model using PET data from a study of working memory in individuals with schizophrenia.
15.
Greenland (2000, Biometrics 56, 915-921) describes the use of random coefficient regression to adjust for residual confounding in a particular setting. We examine this setting further, giving theoretical and empirical results concerning the frequentist and Bayesian performance of random coefficient regression. Particularly, we compare estimators based on this adjustment for residual confounding to estimators based on the assumption of no residual confounding. This devolves to comparing an estimator from a nonidentified but more realistic model to an estimator from a less realistic but identified model. The approach described by Gustafson (2005, Statistical Science 20, 111-140) is used to quantify the performance of a Bayesian estimator arising from a nonidentified model. From both theoretical calculations and simulations we find support for the idea that superior performance can be obtained by replacing unrealistic identifying constraints with priors that allow modest departures from those constraints. In terms of point-estimator bias this superiority arises when the extent of residual confounding is substantial, but the advantage is much broader in terms of interval estimation. The benefit from modeling residual confounding is maintained when the prior distributions employed only roughly correspond to reality, for the standard identifying constraints are equivalent to priors that typically correspond much worse.
16.
Kelvin K. W. Yau Kui Wang Andy H. Lee 《Biometrical journal. Biometrische Zeitschrift》2003,45(4):437-452
In many biometrical applications, the count data encountered often contain extra zeros relative to the Poisson distribution. Zero‐inflated Poisson regression models are useful for analyzing such data, but parameter estimates may be seriously biased if the nonzero observations are over‐dispersed and simultaneously correlated due to the sampling design or the data collection procedure. In this paper, a zero‐inflated negative binomial mixed regression model is presented to analyze a set of pancreas disorder length of stay (LOS) data that comprised mainly same‐day separations. Random effects are introduced to account for inter‐hospital variations and the dependency of clustered LOS observations. Parameter estimation is achieved by maximizing an appropriate log‐likelihood function using an EM algorithm. Alternative modeling strategies, namely the finite mixture of Poisson distributions and the non‐parametric maximum likelihood approach, are also considered. The determination of pertinent covariates would assist hospital administrators and clinicians to manage LOS and expenditures efficiently.
17.
Recent advances in statistical software have led to the rapid diffusion of new methods for modelling longitudinal data. Multilevel (also known as hierarchical or random effects) models for binary outcomes have generally been based on a logistic-normal specification, by analogy with earlier work for normally distributed data. The appropriate application and interpretation of these models remains somewhat unclear, especially when compared with the computationally more straightforward semiparametric or 'marginal' modelling (GEE) approaches. In this paper we pose two interrelated questions. First, what limits should be placed on the interpretation of the coefficients and inferences derived from random-effect models involving binary outcomes? Second, what diagnostic checks are appropriate for evaluating whether such random-effect models provide adequate fits to the data? We address these questions by means of an extended case study using data on adolescent smoking from a large cohort study. Bayesian estimation methods are used to fit a discrete-mixture alternative to the standard logistic-normal model, and posterior predictive checking is used to assess model fit. Surprising parallels in the parameter estimates from the logistic-normal and mixture models are described and used to question the interpretability of the so-called 'subject-specific' regression coefficients from the standard multilevel approach. Posterior predictive checks suggest a serious lack of fit of both multilevel models. The results do not provide final answers to the two questions posed, but we expect that lessons learned from the case study will provide general guidance for further investigation of these important issues.
18.
Disease mapping and spatial regression with count data (total citations: 3; self-citations: 0; citations by others: 3)
Wakefield J 《Biostatistics (Oxford, England)》2007,8(2):158-183
In this paper, we provide critical reviews of methods suggested for the analysis of aggregate count data in the context of disease mapping and spatial regression. We introduce a new method for picking prior distributions, and propose a number of refinements of previously used models. We also consider ecological bias, mutual standardization, and choice of both spatial model and prior specification. We analyze male lip cancer incidence data collected in Scotland over the period 1975-1980, and outline a number of problems with previous analyses of these data. In disease mapping studies, hierarchical models can provide robust estimation of area-level risk parameters, though care is required in the choice of covariate model, and it is important to assess the sensitivity of estimates to the spatial model chosen, and to the prior specifications on the variance parameters. Spatial ecological regression is a far more hazardous enterprise for two reasons. First, there is always the possibility of ecological bias, and this can only be alleviated by the inclusion of individual-level data. For the Scottish data, we show that the previously used mean model has limited interpretation from an individual perspective. Second, when residual spatial dependence is modeled, and if the exposure has spatial structure, then estimates of exposure association parameters will change when compared with those obtained from the independence across space model, and the data alone cannot choose the form and extent of spatial correlation that is appropriate.
19.
We propose a mixed-effect linear model, as a particular case of the two-level regression model, for analyzing repeated measures made at completely irregular time points. The model allows for subject-level covariates, so as to study the trend and the variability of the individual growth curves. Application of this model is illustrated on a published data set.
20.
Summary: Many major genes have been identified that strongly influence the risk of cancer. However, there are typically many different mutations that can occur in the gene, each of which may or may not confer increased risk. It is critical to identify which specific mutations are harmful, and which ones are harmless, so that individuals who learn from genetic testing that they have a mutation can be appropriately counseled. This is a challenging task, since new mutations are continually being identified, and there is typically relatively little evidence available about each individual mutation. In an earlier article, we employed hierarchical modeling (Capanu et al., 2008, Statistics in Medicine 27, 1973–1992) using the pseudo‐likelihood and Gibbs sampling methods to estimate the relative risks of individual rare variants using data from a case–control study and showed that one can draw strength from the aggregating power of hierarchical models to distinguish the variants that contribute to cancer risk. However, further research is needed to validate the application of asymptotic methods to such sparse data. In this article, we use simulations to study in detail the properties of the pseudo‐likelihood method for this purpose. We also explore two alternative approaches: pseudo‐likelihood with correction for the variance component estimate as proposed by Lin and Breslow (1996, Journal of the American Statistical Association 91, 1007–1016) and a hybrid pseudo‐likelihood approach with Bayesian estimation of the variance component. We investigate the validity of these hierarchical modeling techniques by looking at the bias and coverage properties of the estimators as well as at the efficiency of the hierarchical modeling estimates relative to that of the maximum likelihood estimates.
The results indicate that the estimates of the relative risks of very sparse variants have small bias, and that the estimated 95% confidence intervals are typically anti‐conservative, though the actual coverage rates are generally above 90%. The widths of the confidence intervals narrow as the residual variance in the second‐stage model is reduced. The results also show that the hierarchical modeling estimates have shorter confidence intervals relative to estimates obtained from conventional logistic regression, and that these relative improvements increase as the variants become more rare.