Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
Sampling from a finite population on multiple occasions introduces dependencies between the successive samples when overlap is designed. Such sampling designs lead to efficient statistical estimates and also allow changes over time in the targeted outcomes to be estimated. This makes them very popular in real‐world statistical practice. Sampling with partial replacement can also be very efficient in biological and environmental studies where the estimation of toxicants and their trends over time is the main interest. Sampling with partial replacement is designed here on two occasions in order to estimate the median concentration of chemical constituents quantified by means of liquid chromatography coupled with tandem mass spectrometry. Such data represent relative peak areas resulting from the chromatographic analysis. They are therefore positive‐valued and skewed, and are commonly fitted very well by the log‐normal model. A log‐normal model is assumed here for chemical constituents quantified in mainstream cigarette smoke in a real case study. Combining design‐based and model‐based approaches for statistical inference, we estimate the median of chemical constituents by sampling with partial replacement on two occasions. We also discuss the limitations of extending the proposed approach to other skewed population models; these limitations are investigated by means of a Monte Carlo simulation study.
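
As a rough illustration of why the log‐normal assumption is convenient for median estimation: if log(X) is normal with mean mu, the median of X is exp(mu), so a simple plug‐in estimate is the exponential of the mean of the logged observations. The sketch below assumes a single simple random sample and ignores the two‐occasion partial‐replacement design described in the abstract; the function name and the data are hypothetical.

```python
import math

def lognormal_median(values):
    """Plug-in median estimate under a log-normal model: exp(mean of logs)."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and strictly positive")
    mean_log = sum(math.log(v) for v in values) / len(values)
    return math.exp(mean_log)

# Hypothetical relative peak areas (not data from the study).
peak_areas = [1.0, 10.0, 100.0]
estimate = lognormal_median(peak_areas)
```

This estimate is the geometric mean of the sample, which is far less sensitive to the long right tail than the arithmetic mean.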

2.
Glaucoma is a progressive disease due to damage in the optic nerve with associated functional losses. Although the relationship between structural and functional progression in glaucoma is well established, there is disagreement on how this association evolves over time. In addressing this issue, we propose a new class of non‐Gaussian linear‐mixed models to estimate the correlations among subject‐specific effects in multivariate longitudinal studies with a skewed distribution of random effects, to be used in a study of glaucoma. This class provides an efficient estimation of subject‐specific effects by modeling the skewed random effects through the log‐gamma distribution. It also provides more reliable estimates of the correlations between the random effects. To validate the log‐gamma assumption against the usual normality assumption of the random effects, we propose a lack‐of‐fit test using the profile likelihood function of the shape parameter. We apply this method to data from a prospective observation study, the Diagnostic Innovations in Glaucoma Study, to present a statistically significant association between structural and functional change rates that leads to a better understanding of the progression of glaucoma over time.  相似文献   

3.
A mixture of multivariate contaminated normal distributions is developed for model‐based clustering. In addition to the parameters of the classical normal mixture, our contaminated mixture has, for each cluster, a parameter controlling the proportion of mild outliers and one specifying the degree of contamination. Crucially, these parameters do not have to be specified a priori, adding a flexibility to our approach. Parsimony is introduced via eigen‐decomposition of the component covariance matrices, and sufficient conditions for the identifiability of all the members of the resulting family are provided. An expectation‐conditional maximization algorithm is outlined for parameter estimation and various implementation issues are discussed. Using a large‐scale simulation study, the behavior of the proposed approach is investigated and comparison with well‐established finite mixtures is provided. The performance of this novel family of models is also illustrated on artificial and real data.  相似文献   
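
A univariate sketch may help make the two extra parameters concrete. It assumes the common parameterization of a contaminated normal, in which a proportion alpha of "good" points follows N(mu, sigma^2) and the remaining 1 - alpha follows the same normal with variance inflated by a factor eta > 1. The abstract's model is multivariate and mixes several such components; the function names here are hypothetical.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def contaminated_normal_pdf(x, mu, sigma, alpha, eta):
    """Assumed parameterization: alpha = proportion of good points,
    eta > 1 = variance inflation (degree of contamination)."""
    good = alpha * normal_pdf(x, mu, sigma)
    bad = (1.0 - alpha) * normal_pdf(x, mu, math.sqrt(eta) * sigma)
    return good + bad
```

With alpha = 1 the density reduces to the ordinary normal; lowering alpha or raising eta puts more mass in the tails, which is how mild outliers are accommodated.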

4.
Linear‐mixed models are frequently used to obtain model‐based estimators in small area estimation (SAE) problems. Such models, however, are not suitable when the target variable exhibits a point mass at zero, a highly skewed distribution of the nonzero values and a strong spatial structure. In this paper, a SAE approach for dealing with such variables is suggested. We propose a two‐part random effects SAE model that includes a correlation structure on the area random effects that appears in the two parts and incorporates a bivariate smooth function of the geographical coordinates of units. To account for the skewness of the distribution of the positive values of the response variable, a Gamma model is adopted. To fit the model, to get small area estimates and to evaluate their precision, a hierarchical Bayesian approach is used. The study is motivated by a real SAE problem. We focus on estimation of the per‐farm average grape wine production in Tuscany, at subregional level, using the Farm Structure Survey data. Results from this real data application and those obtained by a model‐based simulation experiment show a satisfactory performance of the suggested SAE approach.  相似文献   

5.
This paper extends the multilevel survival model by allowing for a cured fraction in the model. Random effects induced by the multilevel clustering structure are specified in the linear predictors of both the hazard function and the cured probability parts. Adopting the generalized linear mixed model (GLMM) approach to formulate the problem, parameter estimation is achieved by maximizing a best linear unbiased prediction (BLUP) type log‐likelihood at the initial step of estimation, and is then extended to obtain residual maximum likelihood (REML) estimators of the variance component. As illustrations, the proposed multilevel mixture cure model is applied to analyze (i) child survival study data with multilevel clustering and (ii) chronic granulomatous disease (CGD) data on recurrent infections. A simulation study is carried out to evaluate the performance of the REML estimators and assess the accuracy of the standard error estimates.

6.
A score‐type test is proposed for testing the hypothesis of independent binary random variables against positive correlation in linear logistic models with sparse data and cluster‐specific covariates. The test is developed for univariate and multivariate one‐sided alternatives. The main advantage of using the score test is that it requires estimation of the model only under the null hypothesis, which in this case corresponds to the binomial maximum likelihood fit. The score‐type test is developed from a class of estimating equations with block‐diagonal structure in which the coefficients of the linear logistic model are estimated simultaneously with the correlation. The simplicity of the score test is illustrated in two particular examples.

7.
Modeling plant growth using functional traits is important for understanding the mechanisms that underpin growth and for predicting new situations. We use three data sets on plant height over time and two validation methods—in‐sample model fit and leave‐one‐species‐out cross‐validation—to evaluate non‐linear growth model predictive performance based on functional traits. In‐sample measures of model fit differed substantially from out‐of‐sample model predictive performance; the best fitting models were rarely the best predictive models. Careful selection of predictor variables reduced the bias in parameter estimates, and there was no single best model across our three data sets. Testing and comparing multiple model forms is important. We developed an R package with a formula interface for straightforward fitting and validation of hierarchical, non‐linear growth models. Our intent is to encourage thorough testing of multiple growth model forms and an increased emphasis on assessing model fit relative to a model's purpose.  相似文献   
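
The leave‐one‐species‐out validation mentioned above can be sketched as a grouped split: each fold holds out every observation of one species and trains on the rest. This is a generic illustration in Python, not the interface of the R package the abstract describes; the function name and species labels are hypothetical.

```python
def leave_one_group_out(groups):
    """Yield (held_out_label, train_indices, test_indices) per distinct group."""
    for held_out in sorted(set(groups)):
        train_idx = [i for i, g in enumerate(groups) if g != held_out]
        test_idx = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train_idx, test_idx

# Hypothetical species label for each height observation.
species = ["oak", "oak", "pine", "birch"]
folds = list(leave_one_group_out(species))
```

Because the held‐out species never contributes to fitting, the resulting error reflects prediction for a new species, which is why it can diverge from in‐sample fit.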

8.
A method is proposed that aims at identifying clusters of individuals that show similar patterns when observed repeatedly. We consider linear‐mixed models, which are widely used for the modeling of longitudinal data. In contrast to the classical assumption of a normal distribution for the random effects, a finite mixture of normal distributions is assumed. Typically, the number of mixture components is unknown and has to be chosen, ideally by data‐driven tools. For this purpose, an EM algorithm‐based approach is considered that uses a penalized normal mixture as the random effects distribution. The penalty term shrinks the pairwise distances of cluster centers based on the group lasso and the fused lasso method. The effect is that individuals with similar time trends are merged into the same cluster. The strength of regularization is determined by one penalization parameter. For finding the optimal penalization parameter, a new model choice criterion is proposed.

9.
We examine memory models for multisite capture–recapture data. This is an important topic, as animals may exhibit behavior that is more complex than simple first‐order Markov movement between sites, when it is necessary to devise and fit appropriate models to data. We consider the Arnason–Schwarz model for multisite capture–recapture data, which incorporates just first‐order Markov movement, and also two alternative models that allow for memory, the Brownie model and the Pradel model. We use simulation to compare two alternative tests which may be undertaken to determine whether models for multisite capture–recapture data need to incorporate memory. Increasing the complexity of models runs the risk of introducing parameters that cannot be estimated, irrespective of how much data are collected, a feature which is known as parameter redundancy. Rouan et al. (JABES, 2009, pp 338–355) suggest a constraint that may be applied to overcome parameter redundancy when it is present in multisite memory models. For this case, we apply symbolic methods to derive a simpler constraint, which allows more parameters to be estimated, and give general results not limited to a particular configuration. We also consider the effect sparse data can have on parameter redundancy and recommend minimum sample sizes. Memory models for multisite capture–recapture data can be highly complex and difficult to fit to data. We emphasize the importance of a structured approach to modeling such data, by considering a priori which parameters can be estimated, which constraints are needed in order for estimation to take place, and how much data need to be collected. We also give guidance on the amount of data needed to use two alternative families of tests for whether models for multisite capture–recapture data need to incorporate memory.  相似文献   

10.
In many biometrical applications, the count data encountered often contain extra zeros relative to the Poisson distribution. Zero‐inflated Poisson regression models are useful for analyzing such data, but parameter estimates may be seriously biased if the nonzero observations are over‐dispersed and simultaneously correlated due to the sampling design or the data collection procedure. In this paper, a zero‐inflated negative binomial mixed regression model is presented to analyze a set of pancreas disorder length of stay (LOS) data that comprised mainly same‐day separations. Random effects are introduced to account for inter‐hospital variations and the dependency of clustered LOS observations. Parameter estimation is achieved by maximizing an appropriate log‐likelihood function using an EM algorithm. Alternative modeling strategies, namely the finite mixture of Poisson distributions and the non‐parametric maximum likelihood approach, are also considered. The determination of pertinent covariates would assist hospital administrators and clinicians to manage LOS and expenditures efficiently.  相似文献   
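
To make the zero‐inflation idea concrete, the sketch below writes out the probability mass function of a zero‐inflated negative binomial without random effects or covariates: with probability pi_zero an observation is a structural zero, otherwise it is drawn from a negative binomial with mean mu and dispersion parameter size. The parameterization and function names are assumptions for illustration; the mixed regression model in the abstract adds hospital‐level random effects on top of this.

```python
import math

def nb_pmf(k, mu, size):
    """Negative binomial pmf with mean mu and dispersion (size) parameter."""
    log_p = (math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
             + size * math.log(size / (size + mu))
             + k * math.log(mu / (size + mu)))
    return math.exp(log_p)

def zinb_pmf(k, pi_zero, mu, size):
    """Zero-inflated NB: extra point mass pi_zero at zero."""
    base = (1.0 - pi_zero) * nb_pmf(k, mu, size)
    return pi_zero + base if k == 0 else base

# Sanity check with hypothetical parameter values: probabilities sum to one.
total = sum(zinb_pmf(k, 0.3, 2.0, 1.5) for k in range(500))
```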

11.
The use of control charts for monitoring schemes in a medical context should include adjustments that incorporate the specific risk of each individual. Some authors propose the use of a risk‐adjusted survival time cumulative sum (RAST CUSUM) control chart to monitor a time‐to‐event outcome, possibly right censored, using conventional survival models, which do not contemplate the possibility of cure of a patient. We propose to extend this approach by considering a risk‐adjusted CUSUM chart based on a cure rate model. We consider a regression model in which the covariates affect the cure fraction. The CUSUM scores are obtained for the Weibull and log‐logistic promotion time models to monitor a scale parameter for nonimmune individuals. A simulation study was conducted to evaluate and compare the performance of the proposed chart (RACUF CUSUM) with RAST CUSUM, based on optimal control limits and average run length in different situations. As a result, we note that the RAST CUSUM chart is inappropriate when applied to data with a cure rate, while the proposed RACUF CUSUM chart seems to have similar performance if applied to data without a cure rate. The proposed chart is illustrated with simulated data and with a real data set of patients with heart failure treated at the Heart Institute (InCor), at the University of São Paulo, Brazil.
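
The generic one‐sided CUSUM recursion behind such charts can be sketched as follows: the statistic accumulates per‐patient scores, is reset to zero whenever it would go negative, and signals when it exceeds a control limit. In risk‐adjusted charts the scores are typically log‐likelihood ratio contributions computed from the fitted survival or cure rate model; here they are simply passed in as numbers, and the function name and values are hypothetical.

```python
def cusum_first_signal(scores, limit):
    """Return the index of the first observation at which the one-sided
    CUSUM statistic exceeds `limit`, or None if it never does."""
    stat = 0.0
    for t, score in enumerate(scores):
        # Accumulate the score but never let the statistic drop below zero.
        stat = max(0.0, stat + score)
        if stat > limit:
            return t
    return None

# Hypothetical scores and control limit.
signal_at = cusum_first_signal([0.5, -1.0, 0.7, 0.9], limit=1.5)
```

In this toy sequence the statistic is reset by the second score and crosses the limit at the fourth observation (index 3).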

12.
Marginal structural models for time‐fixed treatments fit using inverse‐probability weighted estimating equations are increasingly popular. Nonetheless, the resulting effect estimates are subject to finite‐sample bias when data are sparse, as is typical for large‐sample procedures. Here we propose a semi‐Bayes estimation approach which penalizes or shrinks the estimated model parameters to improve finite‐sample performance. This approach uses simple symmetric data‐augmentation priors. Limited simulation experiments indicate that the proposed approach reduces finite‐sample bias and improves confidence‐interval coverage when the true values lie within the central “hill” of the prior distribution. We illustrate the approach with data from a nonexperimental study of HIV treatments.  相似文献   

13.
Huiping Xu, Bruce A. Craig. Biometrics, 2009, 65(4): 1145-1155
Traditional latent class modeling has been widely applied to assess the accuracy of dichotomous diagnostic tests. These models, however, assume that the tests are independent conditional on the true disease status, which is rarely valid in practice. Alternative models using probit analysis have been proposed to incorporate dependence among tests, but these models consider restricted correlation structures. In this article, we propose a probit latent class (PLC) model that allows a general correlation structure. When combined with some helpful diagnostics, this model provides a more flexible framework from which to evaluate the correlation structure and model fit. Our model encompasses several other PLC models but uses a parameter‐expanded Monte Carlo EM algorithm to obtain the maximum‐likelihood estimates. The parameter‐expanded EM algorithm was designed to accelerate the convergence rate of the EM algorithm by expanding the complete‐data model to include a larger set of parameters, and it ensures a simple solution in fitting the PLC model. We demonstrate our estimation and model selection methods using a simulation study and two published medical studies.

14.
Although advances have recently been made in modeling multivariate count data, existing models still have several limitations: (i) the multivariate Poisson log‐normal model (Aitchison and Ho, 1989) cannot be used to fit multivariate count data with excess zero‐vectors; (ii) the multivariate zero‐inflated Poisson (ZIP) distribution (Li et al., 1999) cannot be used to model zero‐truncated/deflated count data and is difficult to apply to high‐dimensional cases; (iii) the Type I multivariate zero‐adjusted Poisson (ZAP) distribution (Tian et al., 2017) can only model multivariate count data with a special correlation structure in which the correlations between random components are all positive or all negative. In this paper, we first introduce a new multivariate ZAP distribution, based on a multivariate Poisson distribution, which allows a more flexible dependency structure between components; that is, some of the correlation coefficients can be positive while others are negative. We then develop its important distributional properties and provide efficient statistical inference methods for the multivariate ZAP model with or without covariates. Two real data examples in biomedicine are used to illustrate the proposed methods.

15.
Stuart G. Baker. Biometrics, 2011, 67(1): 319-323
Recently, Cheng (2009, Biometrics 65, 96–103) proposed a model for the causal effect of receiving treatment when there is all‐or‐none compliance in one randomization group, with maximum likelihood estimation based on convex programming. We discuss an alternative approach that involves a model for all‐or‐none compliance in two randomization groups and estimation via a perfect fit or an expectation–maximization algorithm for count data. We believe this approach is easier to implement, which would facilitate the reproduction of calculations.

16.
In risk assessment, it is often desired to make inferences on the low dose levels at which a specific benchmark risk is attained. Applications of simultaneous hyperbolic confidence bands for low‐dose risk estimation with quantal data under different dose‐response models (multistage, Abbott‐adjusted Weibull, and Abbott‐adjusted log‐logistic models) have appeared in the literature. The use of simultaneous three‐segment bands under the multistage model has also been proposed recently. In this article, we present explicit formulas for constructing asymptotic one‐sided simultaneous hyperbolic and three‐segment bands for the simple log‐logistic regression model. We use the simultaneous construction to estimate upper hyperbolic and three‐segment confidence bands on extra risk and to obtain lower limits on the benchmark dose by inverting the upper bands on risk under the Abbott‐adjusted log‐logistic model. Monte Carlo simulations evaluate the characteristics of the simultaneous limits. An example is given to illustrate the use of the proposed methods and to compare the two types of simultaneous limits at very low dose levels.  相似文献   

17.
18.
In this work, a methodology for model‐based identifiable parameter determination (MBIPD) is presented. This systematic approach is proposed for structure and parameter identification of nonlinear models of biological reaction networks. Problems of this kind are usually over‐parameterized, with large correlations between parameters. Hence, the related inverse problems for parameter determination and analysis are mathematically ill‐posed and numerically difficult to solve. The proposed MBIPD methodology comprises several tasks: (i) model selection, (ii) tracking of an adequate initial guess, and (iii) an iterative parameter estimation step which includes an identifiable parameter subset selection (SsS) algorithm and accuracy analysis of the estimated parameters. The SsS algorithm is based on the analysis of the sensitivity matrix by rank‐revealing factorization methods. In this way, the parameter search space is reduced to a reasonable subset that can be reliably and efficiently estimated from available measurements. The simultaneous saccharification and fermentation (SSF) process for bio‐ethanol production from cellulosic material is used as a case study for testing the methodology. The successful application of MBIPD to the SSF process demonstrates a relatively large reduction in the identified parameter space. Cross‐validation shows that, despite the reduction of the search space, the model with the identified parameters is still able to predict the experimental data properly. Moreover, it is shown that the model is easily and efficiently adapted to new process conditions by solving reduced and well‐conditioned problems. © 2013 American Institute of Chemical Engineers Biotechnol. Prog., 29:1064–1082, 2013

19.
A topic of particular current interest is community‐level approaches to species distribution modelling (SDM), i.e. approaches that simultaneously analyse distributional data for multiple species. Previous studies have looked at the advantages of community‐level approaches for parameter estimation, but not for model selection – the process of choosing which model (and in particular, which subset of environmental variables) to fit to data. We compared the predictive performance of models using the same modelling method (generalised linear models) but choosing the subset of variables to include in the model either simultaneously across all species (community‐level model selection) or separately for each species (species‐specific model selection). Our results across two large presence/absence tree community datasets were inconclusive as to whether there was an overall difference in predictive performance between models fitted via species‐specific vs community‐level model selection. However, we found some evidence that a community approach was best suited to modelling rare species, and its performance decayed with increasing prevalence. That is, when data were sparse there was more opportunity for gains from “borrowing strength” across species via a community‐level approach. Interestingly, we also found that the community‐level approach tended to work better when the model selection problem was more difficult, and more reliably detected “noise” variables that should be excluded from the model.  相似文献   

20.
In this paper, a new class of models for autoradiographic hot‐line data is proposed. The models, for which there is theoretical justification, are a linear combination of generalized Student's t‐distributions and have as special cases all currently accepted line‐spread models. The new models are used to analyse experimental hot‐line data, and their fit is compared with that of current models. The data are from a line source labelled with iodine‐125 in a resin section of 0.6 μm in thickness. It will be shown that a significant improvement in goodness of fit, over that of previous models, can be achieved by choosing from this new class of models. A single model from this class will be proposed that has a simple form made up of only two components, but which fits experimental data significantly better than previous models. A short sensitivity analysis indicates that estimation is reliable. The modelling approach, although motivated by and applied to autoradiography, is appropriate for any mixture modelling situation.
