Similar literature
 20 similar articles found (search time: 0 ms)
1.
We propose a likelihood-based model for correlated count data that display under- or overdispersion within units (e.g. subjects). The model is capable of handling correlation due to clustering and/or serial correlation, in the presence of unbalanced, missing or unequally spaced data. A family of distributions based on birth-event processes is used to model within-subject underdispersion. A computational approach is given to overcome a parameterization difficulty with this family, and this allows use of common Markov Chain Monte Carlo software (e.g. WinBUGS) for estimation. Application of the model to daily counts of asthma inhaler use by children shows substantial within-subject underdispersion, between-subject heterogeneity and correlation due to both clustering of measurements within subjects and serial correlation of longitudinal measurements. The model provides a major improvement over Poisson longitudinal models, and diagnostics show that the model fits well.
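
As a rough illustration of what under- and overdispersion mean for a single subject's counts, the sketch below computes the variance-to-mean ratio, which is about 1 for Poisson data. This is not the birth-event-process model of the abstract; the function name and the count values are hypothetical and chosen only for illustration.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean (about 1 for Poisson counts)."""
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    return variance(counts) / m


# Hypothetical daily counts for two subjects (illustrative values only).
steady_user = [4, 4, 5, 4, 4, 5, 4]
erratic_user = [0, 9, 1, 12, 0, 7, 2]

# A ratio below 1 suggests underdispersion; above 1 suggests overdispersion.
print(dispersion_index(steady_user))
print(dispersion_index(erratic_user))
```

A ratio computed this way is only a descriptive check within one subject; it does not account for the clustering or serial correlation that the abstract's model addresses.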

2.
Several models are developed for the estimation of the rate of exponential die-off from decontamination data. Calculations with illustrative data are reported which indicate that the estimation of this rate and its variance are sensitive to changes in modelling assumptions. Since extrapolation using this estimated rate is used in the specification of planetary quarantine standards, special care should be taken in the selection of an appropriate model and corresponding estimation procedure for the analysis of each set of decontamination data to be used for this purpose.
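
One simple way to estimate an exponential die-off rate, sketched here under stated assumptions, is ordinary least squares on the log of the survivor counts, assuming N(t) = n0 * exp(-k * t) with multiplicative error. This is only one of many possible modelling choices and is not claimed to be any of the models in the abstract; the function name and the data are hypothetical.

```python
import math


def fit_die_off_rate(times, counts):
    """Least-squares fit of log(count) = log(n0) - k * time; returns (k, n0)."""
    logs = [math.log(c) for c in counts]
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(logs) / n
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = y_bar - slope * t_bar
    return -slope, math.exp(intercept)


# Hypothetical noise-free survivor counts generated with k = 0.5, n0 = 1000.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
counts = [1000.0 * math.exp(-0.5 * t) for t in times]
k_hat, n0_hat = fit_die_off_rate(times, counts)
print(k_hat, n0_hat)
```

With real data, a different error assumption (for example Poisson counts rather than log-normal error) would generally give a different estimate of k, which is the sensitivity the abstract warns about.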

3.
Some covariance models for longitudinal count data with overdispersion   Total citations: 9, self-citations: 0, citations by others: 9
P F Thall, S C Vail. Biometrics 1990, 46(3): 657-671
A family of covariance models for longitudinal counts with predictive covariates is presented. These models account for overdispersion, heteroscedasticity, and dependence among repeated observations. The approach is a quasi-likelihood regression similar to the formulation given by Liang and Zeger (1986, Biometrika 73, 13-22). Generalized estimating equations for both the covariate parameters and the variance-covariance parameters are presented. Large-sample properties of the parameter estimates are derived. The proposed methods are illustrated by an analysis of epileptic seizure count data arising from a study of progabide as an adjuvant therapy for partial seizures.

4.
We consider models for hierarchical count data, subject to overdispersion and/or excess zeros. Molenberghs et al. (2007) and Molenberghs et al. (2010) extend the Poisson‐normal generalized linear‐mixed model by including gamma random effects to accommodate overdispersion. Excess zeros are handled using either a zero‐inflation or a hurdle component. These models were studied by Kassahun et al. (2014). While flexible, they are quite elaborate in parametric specification and therefore model assessment is imperative. We derive local influence measures to detect and examine influential subjects, that is, subjects who have undue influence on either the fit of the model as a whole, or on specific important sub‐vectors of the parameter vector. The latter include the fixed effects for the Poisson and for the excess‐zeros components, the variance components for the normal random effects, and the parameters describing gamma random effects, included to accommodate overdispersion. Interpretable influence components are derived. The method is applied to data from a longitudinal clinical trial involving patients with epileptic seizures. Even though the data were extensively analyzed in earlier work, the insight gained from the proposed diagnostics, statistically and clinically, is considerable. Possibly, a small but important subgroup of patients has been identified.

5.
A critical issue in modelling binary response data is the choice of the links. We introduce a new link based on the generalized t-distribution. There are two parameters in the generalized t-link: one parameter purely controls the heaviness of the tails of the link and the second parameter controls the scale of the link. Two major advantages are offered by the generalized t-links. First, a symmetric generalized t-link with an unknown shape parameter is much more identifiable than a Student t-link with unknown degrees of freedom and a known scale parameter. Secondly, skewed generalized t-links with both unknown shape and scale parameters provide much more flexible and improved skewed link regression models than the existing skewed links. Various theoretical properties and attractive features of the proposed links are examined and explored in detail. An efficient Markov chain Monte Carlo algorithm is developed for sampling from the posterior distribution. The deviance information criterion measure is used for guiding the choice of links. The proposed methodology is motivated and illustrated by prostate cancer data.

6.
Titman AC. Biometrics 2011, 67(3): 780-787
Methods for fitting nonhomogeneous Markov models to panel-observed data using direct numerical solution to the Kolmogorov Forward equations are developed. Nonhomogeneous Markov models occur most commonly when baseline transition intensities depend on calendar time, but may also occur with deterministic time-dependent covariates such as age. We propose transition intensities based on B-splines as a smooth alternative to piecewise constant intensities and also as a generalization of time transformation models. An expansion of the system of differential equations allows first derivatives of the likelihood to be obtained, which can be used in a Fisher scoring algorithm for maximum likelihood estimation. The method is evaluated through a small simulation study and demonstrated on data relating to the development of cardiac allograft vasculopathy in posttransplantation patients.

7.
P F Thall. Biometrics 1988, 44(1): 197-209
In many longitudinal studies it is desired to estimate and test the rate over time of a particular recurrent event. Often only the event counts corresponding to the elapsed time intervals between each subject's successive observation times, and baseline covariate data, are available. The intervals may vary substantially in length and number between subjects, so that the corresponding vectors of counts are not directly comparable. A family of Poisson likelihood regression models incorporating a mixed random multiplicative component in the rate function of each subject is proposed for this longitudinal data structure. A related empirical Bayes estimate of random-effect parameters is also described. These methods are illustrated by an analysis of dyspepsia data from the National Cooperative Gallstone Study.

8.
9.
Huang YH, Hwang WH, Chen FY. Biometrics 2011, 67(4): 1471-1480
Measurement errors in covariates may result in biased estimates in regression analysis. Most methods to correct this bias assume nondifferential measurement errors, i.e., that measurement errors are independent of the response variable. However, in regression models for zero-truncated count data, the number of error-prone covariate measurements for a given observational unit can equal its response count, implying a situation of differential measurement errors. To address this challenge, we develop a modified conditional score approach to achieve consistent estimation. The proposed method represents a novel technique, with efficiency gains achieved by augmenting random errors, and performs well in a simulation study. The method is demonstrated in an ecology application.

10.
11.
12.
MOTIVATION: Linking experimental data to mathematical models in biology is impeded by the lack of suitable software to manage and transform data. Model calibration would be facilitated and models would increase in value were it possible to preserve links to training data along with a record of all normalization, scaling, and fusion routines used to assemble the training data from primary results. RESULTS: We describe the implementation of DataRail, an open source MATLAB-based toolbox that stores experimental data in flexible multi-dimensional arrays, transforms arrays so as to maximize information content, and then constructs models using internal or external tools. Data integrity is maintained via a containment hierarchy for arrays, imposition of a metadata standard based on a newly proposed MIDAS format, assignment of semantically typed universal identifiers, and implementation of a procedure for storing the history of all transformations with the array. We illustrate the utility of DataRail by processing a newly collected set of approximately 22 000 measurements of protein activities obtained from cytokine-stimulated primary and transformed human liver cells. AVAILABILITY: DataRail is distributed under the GNU General Public License and available at http://code.google.com/p/sbpipeline/

13.
Outlier detection and cleaning procedures were evaluated to estimate mathematical restricted variogram models with discrete insect population count data. Because variogram modeling is significantly affected by outliers, methods to detect and clean outliers from data sets are critical for proper variogram modeling. In this study, we examined spatial data in the form of discrete measurements of insect counts on a rectangular grid. Two well-known insect pest population data sets were analyzed; one data set was the western flower thrips, Frankliniella occidentalis (Pergande) on greenhouse cucumbers and the other was the greenhouse whitefly, Trialeurodes vaporariorum (Westwood) on greenhouse cherry tomatoes. A spatial additive outlier model was constructed to detect outliers in both the isolated and patchy spatial distributions of outliers, and the outliers were cleaned with the neighboring median cleaner. To analyze the effect of outliers, we compared the relative nugget effects of data cleaned of outliers and data still containing outliers after transformation. In addition, the correlation coefficients between the actual and predicted values were compared using the leave-one-out cross-validation method with data cleaned of outliers and non-cleaned data after unbiased back transformation. The outlier detection and cleaning procedure improved geostatistical analysis, particularly by reducing the nugget effect, which greatly impacts the prediction variance of kriging. Consequently, the outlier detection and cleaning procedures used here improved the results of geostatistical analysis with highly skewed and extremely fluctuating data, such as insect counts.

14.
Astuti ET, Yanagawa T. Biometrics 2002, 58(2): 398-402
Trend tests for monotone trend or umbrella trend (monotone upward changing to monotone downward or vice versa) in count data are proposed when the data exhibit extra-Poisson variability. The proposed tests, which are called the GS1 test and the GS2 test, are constructed by applying an orthonormal score vector to a generalized score test under an rth-order log-linear model. These tests are compared by simulation with the Cochran-Armitage test and the quasi-likelihood test of Piegorsch and Bailer (1997, Statistics for Environmental Biology and Toxicology). It is shown that the Cochran-Armitage test should not be used under the existence of extra-Poisson variability; that, for detecting monotone trend, the GS1 test is superior to the others; and that the GS2 test has high power to detect an umbrella response.

15.
16.
Semiparametric regression for count data   Total citations: 3, self-citations: 0, citations by others: 3

17.
Spatial weed count data are modeled and predicted using a generalized linear mixed model combined with a Bayesian approach and Markov chain Monte Carlo. Informative priors for a data set with sparse sampling are elicited using a previously collected data set with extensive sampling. Furthermore, we demonstrate that so-called Langevin-Hastings updates are useful for efficient simulation of the posterior distributions, and we discuss computational issues concerning prediction.

18.
The complementary log-log link was originally introduced in 1922 by R. A. Fisher, long before the logit and probit links. While the last two links are symmetric, the complementary log-log link is an asymmetrical link without a parameter associated with it. Several asymmetrical links with an extra parameter were proposed in the literature over the last few years to deal with imbalanced data in binomial regression (when one of the classes is much smaller than the other); however, these do not necessarily have the cloglog link as a special case, with the exception of the link based on the generalized extreme value distribution. In this paper, we introduce flexible cloglog links for modeling binomial regression models that include an extra parameter associated with the link that explains some unbalancing for binomial outcomes. For all cases, the cloglog is a special case or the reciprocal version loglog link is obtained. A Bayesian Markov chain Monte Carlo inference approach is developed. A simulation study to evaluate the performance of the proposed algorithm is conducted, and prior sensitivity analysis for the extra parameter shows that a uniform prior is the most convenient for all models. Additionally, two applications in medical data (age at menarche and pulmonary infection) illustrate the advantages of the proposed models.
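
The asymmetry mentioned in the abstract above can be seen directly from the standard definitions of the two links. The sketch below uses only the textbook cloglog and logit formulas; it does not implement the flexible cloglog links with an extra parameter that the paper proposes, and the function names are illustrative.

```python
import math


def cloglog(p):
    """Complementary log-log link: eta = log(-log(1 - p)), for 0 < p < 1."""
    return math.log(-math.log(1.0 - p))


def inv_cloglog(eta):
    """Inverse cloglog: p = 1 - exp(-exp(eta))."""
    return 1.0 - math.exp(-math.exp(eta))


def inv_logit(eta):
    """Inverse logit: p = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))


# A symmetric link satisfies p(eta) + p(-eta) = 1 for every eta.
eta = 1.0
logit_sum = inv_logit(eta) + inv_logit(-eta)        # equals 1 up to rounding
cloglog_sum = inv_cloglog(eta) + inv_cloglog(-eta)  # differs from 1
print(logit_sum, cloglog_sum)
```

At eta = 0 the inverse logit gives 0.5, whereas the inverse cloglog gives 1 - exp(-1), which is another way to see that the cloglog response curve is not symmetric about one half.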

19.
20.
Resource selection functions (RSFs) are typically estimated by comparing covariates at a discrete set of “used” locations to those from an “available” set of locations. This RSF approach treats the response as binary and does not account for intensity of use among habitat units where locations were recorded. Advances in global positioning system (GPS) technology allow animal location data to be collected at fine spatiotemporal scales and have increased the size and correlation of data used in RSF analyses. We suggest that a more contemporary approach to analyzing such data is to model intensity of use, which can be estimated for one or more animals by relating the relative frequency of locations in a set of sampling units to the habitat characteristics of those units with count‐based regression and, in particular, negative binomial (NB) regression. We demonstrate this NB RSF approach with location data collected from 10 GPS‐collared Rocky Mountain elk (Cervus elaphus) in the Starkey Experimental Forest and Range enclosure. We discuss modeling assumptions and show how RSF estimation with NB regression can easily accommodate contemporary research needs, including: analysis of large GPS data sets, computational ease, accounting for among‐animal variation, and interpretation of model covariates. We recommend the NB approach because of its conceptual and computational simplicity, and the fact that estimates of intensity of use are unbiased in the face of temporally correlated animal location data.
