Similar Articles
1.
We consider the statistical modeling and analysis of replicated multi-type point process data with covariates. Such data arise when heterogeneous subjects experience repeated events or failures of several distinct types. The underlying processes are modeled as nonhomogeneous mixed Poisson processes with random (subject) and fixed (covariate) effects. Maximum likelihood is used to obtain estimates and standard errors of the failure-rate parameters and regression coefficients. Score tests and likelihood ratio statistics are used for covariate selection, and a graphical test of goodness of fit of the selected model is based on generalized residuals. Measures are developed for determining the influence of an individual observation on the estimated regression coefficients and on the score test statistic. The methods are applied to a large ongoing randomized controlled clinical trial of nutritional selenium supplementation for the prevention of two types of skin cancer.
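The building block of this abstract is the nonhomogeneous Poisson process. As a minimal illustration (not the authors' method; the intensity function and names below are invented for the example), one path of such a process can be simulated by Lewis–Shedler thinning:

```python
import numpy as np

def simulate_nhpp(rate, rate_max, t_end, rng):
    """Simulate one path of a nonhomogeneous Poisson process on [0, t_end]
    by thinning: draw candidates from a homogeneous process with rate
    rate_max, keep a candidate at time t with probability rate(t)/rate_max.
    Requires rate(t) <= rate_max on [0, t_end]."""
    times, t = [], 0.0
    while True:
        t += rng.exponential(1.0 / rate_max)
        if t > t_end:
            return np.array(times)
        if rng.random() < rate(t) / rate_max:
            times.append(t)

rng = np.random.default_rng(0)
# Illustrative intensity: failures become more frequent over time.
events = simulate_nhpp(lambda t: 0.5 + 0.3 * t, rate_max=3.5, t_end=10.0, rng=rng)
```

Subject-level random effects, as in the abstract, would enter by scaling the intensity with a subject-specific frailty draw before simulating.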

2.
Wang CY  Wang N  Wang S 《Biometrics》2000,56(2):487-495
We consider regression analysis in which the covariates are the underlying regression coefficients of another linear mixed model. A naive approach is to use each subject's repeated measurements, which are assumed to follow a linear mixed model, and obtain subject-specific estimated coefficients to replace the covariates. However, directly replacing the unobserved covariates in the primary regression by these estimated coefficients may result in a significantly biased estimator. The problem can be viewed as a generalization of the classical additive error model in which repeated measures serve as replicates. To correct for these biases, we investigate a pseudo-expected estimating equation (EEE) estimator, a regression calibration (RC) estimator, and a refined version of the RC estimator. For linear regression, the first two estimators are identical under certain conditions, but when the primary regression model is nonlinear, the RC estimator is usually biased. We therefore consider a refined regression calibration estimator whose performance is close to that of the pseudo-EEE estimator but does not require numerical integration. The RC estimator is also extended to the proportional hazards regression model. In addition to the distribution theory, we evaluate the methods through simulation studies and apply them to a real dataset from a child growth study.

3.
Random effects selection in linear mixed models
Chen Z  Dunson DB 《Biometrics》2003,59(4):762-769
We address the important practical problem of how to select the random effects component of a linear mixed model. A hierarchical Bayesian model is used to identify any random effect with zero variance. The proposed approach reparameterizes the mixed model so that functions of the covariance parameters of the random effects distribution are incorporated as regression coefficients on standard normal latent variables. We allow random effects to drop out of the model by choosing mixture priors with point mass at zero for the random effects variances. Owing to the reparameterization, the model enjoys a conditionally linear structure that facilitates the use of normal conjugate priors. We demonstrate that posterior computation can proceed via a simple and efficient Markov chain Monte Carlo algorithm. The methods are illustrated using simulated data and real data from a study relating prenatal exposure to polychlorinated biphenyls to the psychomotor development of children.

4.
The method of mixed regression estimation is considered for estimating the coefficients of a linear regression model when incomplete prior information is available, and two families of improved Stein-rule estimators are proposed. Their properties are studied under normally distributed disturbances using small-disturbance asymptotics.

5.
Habitats in the Wadden Sea, a World Heritage area, are affected by land subsidence resulting from natural gas extraction and by sea-level rise. Here we describe a method for monitoring changes in habitat types by producing sequential maps from point observations, using a multinomial logit regression model with abiotic variables for which maps are available as predictors. In a 70 ha study area, a total of 904 vegetation samples were collected in seven sampling rounds at intervals of 2–3 years. Half of the vegetation plots were permanent, violating the assumption of independent data in multinomial logistic regression. This paper shows how this dependency can be accounted for by adding a random effect to the multinomial logit (MNL) model, turning it into a mixed multinomial logit (MMNL) model. In principle all regression coefficients can be taken as random, but in this study only the intercepts are treated as location-specific random variables (random intercepts model). With six habitat types we have five intercepts, so the number of extra model parameters becomes 15: 5 variances and 10 covariances. The likelihood ratio test showed that the MMNL model fitted significantly better than the MNL model with the same fixed effects. McFadden's R2 was 0.467 for the MMNL model versus 0.395 for the MNL model. The estimated coefficients of the two models were comparable; those of altitude, the most important predictor, differed most. The MMNL model accounts for pseudo-replication at the permanent plots, which explains the larger standard errors of the MMNL coefficients. The habitat type at a given location–year combination was predicted as the type with the largest predicted probability.
The series of maps shows local trends in habitat types, most likely driven by sea-level rise, soil subsidence, and a restoration project. We conclude that in environmental modeling of categorical variables using panel data, the dependency of repeated observations at permanent plots should be accounted for. Accounting for it affects the estimated category probabilities and, even more strongly, the standard errors of the regression coefficients.
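The model comparison above rests on two standard quantities, McFadden's pseudo-R2 and the likelihood ratio statistic for nested models. A minimal sketch (the log-likelihood values below are invented for illustration, not the study's; note also that testing variance components lies on the boundary of the parameter space, so the naive chi-square reference is conservative):

```python
from scipy.stats import chi2

def mcfadden_r2(ll_model, ll_null):
    """McFadden's pseudo-R2: 1 - logLik(model) / logLik(null)."""
    return 1.0 - ll_model / ll_null

def lr_test(ll_full, ll_reduced, df):
    """Likelihood ratio test for nested models (e.g. MMNL vs. MNL);
    df = number of extra parameters in the full model."""
    stat = 2.0 * (ll_full - ll_reduced)
    return stat, chi2.sf(stat, df)

# Illustrative values: MMNL vs. MNL with 15 extra covariance parameters.
stat, p = lr_test(ll_full=-520.0, ll_reduced=-560.0, df=15)
```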

6.
This article considers the problem of simultaneous prediction of actual and average values of the study variable in a linear regression model when a set of linear restrictions binding the regression coefficients is available, and analyzes the performance of predictors arising from the restricted regression and mixed regression methods in addition to least squares.

7.
Liu D  Lin X  Ghosh D 《Biometrics》2007,63(4):1079-1088
We consider a semiparametric regression model that relates a normal outcome to covariates and a genetic pathway, where the covariate effects are modeled parametrically and the pathway effect of multiple gene expressions is modeled parametrically or nonparametrically using least-squares kernel machines (LSKMs). This unified framework allows a flexible function for the joint effect of multiple genes within a pathway by specifying a kernel function, and allows for the possibility that each gene expression effect may be nonlinear and that genes within the same pathway may interact in complicated ways. The semiparametric model also makes it possible to test for an overall genetic pathway effect. We show that LSKM semiparametric regression can be formulated as a linear mixed model, so estimation and inference can proceed within the linear mixed model framework using standard mixed model software. Both the regression coefficients of the covariate effects and the LSKM estimator of the genetic pathway effect can be obtained as best linear unbiased predictors in the corresponding linear mixed model formulation. The smoothing parameter and the kernel parameter can be estimated as variance components using restricted maximum likelihood. A score test is developed for the genetic pathway effect, and model/variable selection within the LSKM framework is discussed. The methods are illustrated using a prostate cancer data set and evaluated using simulations.
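At the observed points, the least-squares kernel machine estimate of the pathway effect takes the familiar kernel-ridge form h = K (K + λI)⁻¹ y. A sketch under simplifying assumptions (Gaussian kernel, covariate effects omitted, smoothing parameter fixed rather than REML-estimated, and all data simulated):

```python
import numpy as np

def gaussian_kernel(Z, rho):
    """Gram matrix with K_ij = exp(-||z_i - z_j||^2 / rho)."""
    sq = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / rho)

def lskm_fit(y, Z, lam, rho):
    """Kernel machine estimate of the pathway effect at the observed
    points: h_hat = K (K + lam I)^{-1} y."""
    K = gaussian_kernel(Z, rho)
    alpha = np.linalg.solve(K + lam * np.eye(len(y)), y)
    return K @ alpha

rng = np.random.default_rng(1)
Z = rng.normal(size=(50, 3))            # 3 simulated "gene expressions", 50 subjects
y = np.sin(Z[:, 0]) + 0.1 * rng.normal(size=50)
h_hat = lskm_fit(y, Z, lam=0.1, rho=2.0)
```

In the paper's mixed-model formulation, λ and ρ would be estimated as variance components by REML rather than fixed by hand.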

8.
Joint regression analysis of correlated data using Gaussian copulas
Song PX  Li M  Yuan Y 《Biometrics》2009,65(1):60-68
This article concerns a new joint modeling approach for correlated data analysis. Using Gaussian copulas, we present a unified and flexible machinery to integrate separate one-dimensional generalized linear models (GLMs) into a joint regression analysis of continuous, discrete, and mixed correlated outcomes. This leads to a multivariate analogue of univariate GLM theory and hence an efficiency gain in the estimation of regression coefficients. The availability of joint probability models enables full maximum likelihood inference. Numerical illustrations focus on regression models for discrete correlated data, including multidimensional logistic regression models and a joint model for mixed normal and binary outcomes. In simulation studies, the proposed copula-based joint model is compared to the popular generalized estimating equations approach, a moment-based method for joining univariate GLMs. Two real-world data examples are used for illustration.

9.
In this article, we consider two families of predictors for the simultaneous prediction of actual and average values of the study variable in a linear regression model when a set of stochastic linear constraints binding the regression coefficients is available. These families arise from the method of mixed regression estimation. The performance of these families is analyzed for prediction both outside and within the sample.

10.
Ye, Lin, and Taylor (2008, Biometrics 64, 1238–1246) proposed a joint model for longitudinal measurements and time-to-event data in which the longitudinal measurements are modeled with a semiparametric mixed model to allow for complex patterns in longitudinal biomarker data. They proposed a two-stage regression calibration approach that is simpler to implement than joint modeling: in the first stage, the mixed model is fit without regard to the time-to-event data; in the second stage, the posterior expectation of an individual's random effects from the mixed model is included as a covariate in a Cox model. Although Ye et al. (2008) acknowledged that their regression calibration approach may be biased because of informative dropout and measurement error, they argued that the bias is small relative to alternative methods. In this article, we show that this bias may be substantial, and we show how to alleviate much of it with an alternative regression calibration approach that can be applied to both discrete and continuous time-to-event data. Through simulations, the proposed approach is shown to have substantially less bias than the regression calibration approach of Ye et al. (2008). In agreement with the methodology of Ye et al. (2008), an advantage of our proposed approach over joint modeling is that it can be implemented with standard statistical software and does not require complex estimation techniques.

11.
The ability to properly assess and accurately phenotype true differences in feed efficiency among dairy cows is key to developing breeding programs that improve feed efficiency. Variability among individuals in feed efficiency is commonly characterised by the residual intake approach: residual feed intake is the residual of a linear regression of intake on the corresponding quantities of the biological functions that consume (or release) energy. However, the residuals include model-fitting and measurement errors as well as any variability in cow efficiency. The objective of this study was to isolate the individual-animal variability in feed efficiency from the residual component. Two separate models were fitted. In the first, the standard residual energy intake (REI) was calculated as the residual of a multiple linear regression of lactation-average net energy intake (NEI) on lactation-average milk energy output, average metabolic BW, and lactation loss and gain of body condition score. In the second, a linear mixed model simultaneously fitted fixed linear regressions and random cow-specific levels on the biological traits and intercept, using fortnightly repeated measures for the variables. This method splits the predicted NEI into two parts: one quantifying the population-mean intercept and coefficients, and one quantifying cow-specific deviations in the intercept and coefficients. The cow-specific part of predicted NEI was assumed to isolate true differences in feed efficiency among cows. NEI and associated energy expenditure phenotypes were available for the first 17 fortnights of lactation from 119 Holstein cows, all fed a constant energy-rich diet. Mixed models fitting cow-specific intercepts and coefficients to different combinations of the aforementioned energy expenditure traits, calculated on a fortnightly basis, were compared.
The variance of REI estimated with the lactation-average model represented only 8% of the variance of measured NEI. Among all compared mixed models, the variance of the cow-specific part of predicted NEI represented between 53% and 59% of the variance of REI estimated from the lactation-average model, or between 4% and 5% of the variance of measured NEI. The remaining 41% to 47% of the variance of REI estimated with the lactation-average model may therefore reflect model-fitting or measurement errors. In conclusion, a mixed-model framework with cow-specific random regressions appears to be a promising way to isolate the cow-specific component of REI in dairy cows.
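The lactation-average REI step described above is an ordinary least-squares regression of NEI on the energy sinks, with REI taken as the residual. A minimal sketch with simulated data (the coefficients, noise level, and variable names are illustrative, not the study's):

```python
import numpy as np

def residual_energy_intake(nei, sinks):
    """REI = residuals from regressing net energy intake on the energy
    sinks (milk energy, metabolic BW, BCS change), plus an intercept."""
    X = np.column_stack([np.ones(len(nei)), sinks])
    beta, *_ = np.linalg.lstsq(X, nei, rcond=None)
    return nei - X @ beta

rng = np.random.default_rng(7)
n = 119                                  # cows, matching the study's sample size
sinks = rng.normal(size=(n, 3))          # simulated milk energy, metabolic BW, BCS change
nei = 1.0 + sinks @ np.array([0.9, 0.5, 0.2]) + 0.3 * rng.normal(size=n)
rei = residual_energy_intake(nei, sinks)
```

The study's point is that `rei` computed this way mixes true efficiency differences with fitting and measurement error; the mixed-model alternative instead extracts the cow-specific part of the prediction.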

12.
Hans C  Dunson DB 《Biometrics》2005,61(4):1018-1026
In regression applications with categorical predictors, interest often focuses on comparing the null hypothesis of homogeneity to an ordered alternative. This article proposes a Bayesian approach for addressing this problem in the setting of normal linear and probit regression models. The regression coefficients are assigned a conditionally conjugate prior density consisting of mixtures of point masses at 0 and truncated normal densities, with a (possibly unknown) changepoint parameter included to accommodate umbrella ordering. Two strategies of prior elicitation are considered: (1) a Bayesian Bonferroni approach in which the probability of the global null hypothesis is specified and the local hypotheses are treated as independent; and (2) an approach that treats these probabilities as random. A single Gibbs sampling chain can be used to obtain posterior probabilities for the different hypotheses and to estimate regression coefficients and predictive quantities, either by model averaging or under the preferred hypothesis. The methods are applied to data from a carcinogenesis study.

13.
Random regression models are widely used in animal breeding for the genetic evaluation of daily milk yields from different test days. These models can handle different environmental effects on the respective test day, and they describe the course of the lactation period using suitable covariates with fixed and random regression coefficients. Because the numerically expensive estimation of parameters is already part of advanced computer software, modifications of random regression models will grow considerably in importance for statistical evaluations of nutrition and behaviour experiments with animals. Random regression models belong to the large class of linear mixed models; thus, when choosing a model, or more precisely, when selecting a suitable covariance structure for the random effects, the information criteria of Akaike and Schwarz can be used. In this study, the fitting of random regression models for the statistical analysis of a feeding experiment with dairy cows is illustrated using the SAS software package. For each feeding group, lactation curves modelled by covariates with fixed regression coefficients are estimated simultaneously. With the help of the fixed regression coefficients, differences between the groups are estimated and then tested for significance. The covariance structure of the random, subject-specific effects and the serial correlation matrix are selected using information criteria and estimated correlations between repeated measurements. For verification of the selected model against the alternative models, means and standard deviations estimated from ordinary least squares residuals are used.

14.
The starting point of the investigation is the time-invariant Wolgograd model applied to a sample of sugar beets. To overcome the weak multicollinearity of the model in its logarithmic form, a ridge-type estimator is applied that uses prior information on the unknown regression coefficients, introduced here as the biased minimax linear estimator (MILE). To judge the quality of the estimates, the minimax risks of the MILE and the OLSE are calculated, as well as the estimated maximal crop yields.

15.
Background-stratified Poisson regression is an approach that has been used in the analysis of data from a variety of epidemiologically important studies of radiation-exposed populations, including uranium miners, nuclear industry workers, and atomic bomb survivors. We describe a novel approach for fitting Poisson regression models that adjust for a set of covariates through background stratification while directly estimating the radiation-disease association of primary interest. The approach uses an expression for the Poisson likelihood that treats the coefficients of the stratum-specific indicator variables as nuisance parameters and avoids explicitly estimating them. Log-linear models, as well as other general relative rate models, are accommodated. The approach is illustrated using data from the Life Span Study of Japanese atomic bomb survivors and data from a study of underground uranium miners. The point estimate and confidence interval obtained from this 'conditional' regression approach are identical to the values obtained using unconditional Poisson regression with model terms for each background stratum. Moreover, the proposed approach allows estimation of background-stratified Poisson regression models of non-standard form, such as models that parameterize latency effects, as well as models in which the number of strata is large, thereby overcoming the limitations of previously available statistical software.

16.
Statistical models are simple mathematical rules derived from empirical data that describe the association between an outcome and several explanatory variables. In a typical modeling situation, statistical analysis involves a large number of potential explanatory variables, and frequently only partial subject-matter knowledge is available. Selecting the most suitable variables for a model in an objective and practical manner is therefore usually a non-trivial task. We briefly revisit the purposeful variable selection procedure suggested by Hosmer and Lemeshow, which combines significance and change-in-estimate criteria for variable selection, and critically discuss the change-in-estimate criterion. We show that using a significance-based threshold for the change-in-estimate criterion reduces to simple significance-based selection of variables, as if the change-in-estimate criterion were not considered at all. Various extensions to the purposeful variable selection procedure are suggested. We propose backward elimination augmented with a standardized change-in-estimate criterion on the quantity of interest usually reported and interpreted in a model. Augmented backward elimination has been implemented in a SAS macro for linear, logistic, and Cox proportional hazards regression. The algorithm and its implementation were evaluated by means of a simulation study. Augmented backward elimination tends to select larger models than backward elimination and approximates the unselected model up to negligible differences in point estimates of the regression coefficients. On average, regression coefficients obtained after augmented backward elimination were less biased, relative to the coefficients of correctly specified models, than those obtained after backward elimination.
In summary, we propose augmented backward elimination as a reproducible variable selection algorithm that gives the analyst more flexibility in adapting model selection to a specific statistical modeling situation.
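The change-in-estimate idea can be sketched in a few lines for OLS: a passive variable may be dropped only if its removal barely moves the coefficient of interest. This is a simplified illustration, not the SAS macro's algorithm — the 5% threshold, the data, and the variable names are invented, and the full procedure's significance-based screening and standardization are omitted.

```python
import numpy as np

def fit_ols(y, X):
    """OLS coefficients for design matrix X (first column is the intercept)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def augmented_backward_elimination(y, X, names, keep, tau=0.05):
    """Greedy backward elimination with a change-in-estimate criterion:
    drop a passive variable only if its removal changes the coefficient
    of interest (column `keep`) by at most 100*tau percent."""
    cols = list(range(X.shape[1]))
    while True:
        b_ref = fit_ols(y, X[:, cols])[cols.index(keep)]
        dropped = False
        for c in [c for c in cols if c not in (0, keep)]:
            reduced = [k for k in cols if k != c]
            b_new = fit_ols(y, X[:, reduced])[reduced.index(keep)]
            if abs(b_new - b_ref) <= tau * abs(b_ref):
                cols, dropped = reduced, True
                break
        if not dropped:
            return [names[c] for c in cols]

rng = np.random.default_rng(3)
n = 200
x1 = rng.normal(size=n)               # exposure of interest
x2 = 0.6 * x1 + rng.normal(size=n)    # confounder correlated with x1
x3 = rng.normal(size=n)               # pure noise variable
y = 1.0 + 1.0 * x1 + 1.5 * x2 + 0.5 * rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2, x3])
selected = augmented_backward_elimination(y, X, ["const", "x1", "x2", "x3"], keep=1)
```

Dropping the noise variable `x3` leaves the `x1` coefficient essentially unchanged, so it is eliminated; dropping the confounder `x2` would shift it substantially, so `x2` is retained.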

18.
Random-effects models for serial observations with binary response
R Stiratelli  N Laird  J H Ware 《Biometrics》1984,40(4):961-971
This paper presents a general mixed model for the analysis of serial dichotomous responses provided by a panel of study participants. Each subject's serial responses are assumed to arise from a logistic model, but with regression coefficients that vary between subjects; the logistic regression parameters are assumed to be normally distributed in the population. Inference is based on maximum likelihood estimation of the fixed effects and variance components and empirical Bayes estimation of the random effects. Exact solutions are analytically and computationally infeasible, so an approximation based on the mode of the posterior distribution of the random parameters is proposed and implemented by means of the EM algorithm. This approximate method is compared with a simpler two-step method proposed by Korn and Whittemore (1979, Biometrics 35, 795-804), using data from a panel study of asthmatics originally described in that paper. One advantage of the estimation strategy described here is the ability to use all of the data, including observations from subjects with too few measurements to permit fitting a separate logistic regression model, as the Korn and Whittemore method requires. However, the new method is computationally intensive.

19.

Background

Adaptive Gauss–Hermite quadrature (QUAD) has become the preferred method for estimating generalized linear mixed models with binary outcomes, but penalized quasi-likelihood (PQL) is still used frequently. In this work, we systematically evaluated via simulation whether agreement between PQL and QUAD results indicates less bias in the estimated regression coefficients and variance parameters.

Methods

We performed a simulation study in which we varied the size of the data set, the probability of the outcome, the variance of the random effect, and the numbers of clusters and of subjects per cluster, among other characteristics. We estimated the bias in the regression coefficients, odds ratios, and variance parameters as estimated via PQL and QUAD, and ascertained whether similarity of the PQL and QUAD estimates predicted less bias.

Results

Overall, we found that the absolute percent bias of the odds ratio estimated via PQL or QUAD increased as the PQL- and QUAD-estimated odds ratios became more discrepant, though results varied markedly depending on the characteristics of the dataset.

Conclusions

Given how markedly results varied with data set characteristics, it proved impossible to specify a threshold above which results should be considered biased. This work suggests that comparing results from generalized linear mixed models estimated via PQL and QUAD is a worthwhile check on the regression coefficients and variance components obtained via QUAD, in situations where PQL is known to give reasonable results.

20.
A convenient method for evaluation of biochemical reaction rate coefficients and their uncertainties is described. The motivation for developing this method was the complexity of existing statistical methods for analysis of biochemical rate equations, as well as the shortcomings of linear approaches, such as Lineweaver-Burk plots. The nonlinear least-squares method provides accurate estimates of the rate coefficients and their uncertainties from experimental data. Linearized methods that involve inversion of data are unreliable since several important assumptions of linear regression are violated. Furthermore, when linearized methods are used, there is no basis for calculation of the uncertainties in the rate coefficients. Uncertainty estimates are crucial to studies involving comparisons of rates for different organisms or environmental conditions. The spreadsheet method uses weighted least-squares analysis to determine the best-fit values of the rate coefficients for the integrated Monod equation. Although the integrated Monod equation is an implicit expression of substrate concentration, weighted least-squares analysis can be employed to calculate approximate differences in substrate concentration between model predictions and data. An iterative search routine in a spreadsheet program is utilized to search for the best-fit values of the coefficients by minimizing the sum of squared weighted errors. The uncertainties in the best-fit values of the rate coefficients are calculated by an approximate method that can also be implemented in a spreadsheet. The uncertainty method can be used to calculate single-parameter (coefficient) confidence intervals, degrees of correlation between parameters, and joint confidence regions for two or more parameters. Example sets of calculations are presented for acetate utilization by a methanogenic mixed culture and trichloroethylene cometabolism by a methane-oxidizing mixed culture. 
A further advantage of applying this method to the integrated Monod equation, compared with linearized methods, is the economy of obtaining rate coefficients from a single batch experiment, or a few, rather than from large numbers of initial-rate measurements. When initial-rate measurements are used, the method can still be applied with greater reliability than linearized approaches.
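A Python analogue of the spreadsheet procedure can illustrate the core fitting step. This sketch differs from the paper in two labeled ways: it minimizes the (unweighted) equation error of the implicit integrated Monod equation rather than the paper's weighted approximate substrate differences, and it uses noise-free synthetic data with invented coefficients. Constant biomass is assumed, so Ks·ln(S0/S) + (S0 − S) = Vmax·t.

```python
import numpy as np
from scipy.optimize import brentq, least_squares

def monod_substrate(t, s0, vmax, ks):
    """Numerically invert the integrated Monod equation (constant biomass),
    Ks*ln(S0/S) + (S0 - S) = Vmax*t, for the substrate concentration S."""
    return brentq(lambda s: ks * np.log(s0 / s) + (s0 - s) - vmax * t,
                  1e-12, s0)

def fit_monod(t_obs, s_obs, s0, guess):
    """Least squares on the equation error of the implicit integrated
    Monod equation; returns (Vmax, Ks) estimates."""
    def resid(p):
        vmax, ks = p
        return ks * np.log(s0 / s_obs) + (s0 - s_obs) - vmax * t_obs
    return least_squares(resid, guess).x

# Synthetic, noise-free batch data generated from known coefficients.
s0, vmax_true, ks_true = 10.0, 1.2, 2.5
t_obs = np.linspace(0.5, 12.0, 12)
s_obs = np.array([monod_substrate(t, s0, vmax_true, ks_true) for t in t_obs])
vmax_hat, ks_hat = fit_monod(t_obs, s_obs, s0, guess=(1.0, 1.0))
```

With real, noisy data the residuals would be weighted as in the paper, and the uncertainty in (Vmax, Ks) could be approximated from the Jacobian returned by the optimizer.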
