Found 20 similar articles (search time: 15 ms)
1.
We consider semiparametric regression for periodic longitudinal data. Parametric fixed effects are used to model the covariate effects and a periodic nonparametric smooth function is used to model the time effect. The within-subject correlation is modeled using subject-specific random effects and a random stochastic process with a periodic variance function. We use maximum penalized likelihood to estimate the regression coefficients and the periodic nonparametric time function, whose estimator is shown to be a periodic cubic smoothing spline. We use restricted maximum likelihood to simultaneously estimate the smoothing parameter and the variance components. We show that all model parameters can be easily obtained by fitting a linear mixed model. A common problem in the analysis of longitudinal data is to compare the time profiles of two groups, e.g., between treatment and placebo. We develop a scaled chi-squared test for the equality of two nonparametric time functions. The proposed model and the test are illustrated by analyzing hormone data collected during two consecutive menstrual cycles and their performance is evaluated through simulations.
2.
Tag loss and the Petersen mark-recapture experiment. Cited by: 2 (self-citations: 0, citations by others: 2)
3.
4.
5.
Semiparametric regression of multidimensional genetic pathway data: least-squares kernel machines and linear mixed models. Cited by: 1 (self-citations: 0, citations by others: 1)
We consider a semiparametric regression model that relates a normal outcome to covariates and a genetic pathway, where the covariate effects are modeled parametrically and the pathway effect of multiple gene expressions is modeled parametrically or nonparametrically using least-squares kernel machines (LSKMs). This unified framework allows a flexible function for the joint effect of multiple genes within a pathway by specifying a kernel function and allows for the possibility that each gene expression effect might be nonlinear and the genes within the same pathway are likely to interact with each other in a complicated way. This semiparametric model also makes it possible to test for the overall genetic pathway effect. We show that the LSKM semiparametric regression can be formulated using a linear mixed model. Estimation and inference hence can proceed within the linear mixed model framework using standard mixed model software. Both the regression coefficients of the covariate effects and the LSKM estimator of the genetic pathway effect can be obtained using the best linear unbiased predictor in the corresponding linear mixed model formulation. The smoothing parameter and the kernel parameter can be estimated as variance components using restricted maximum likelihood. A score test is developed to test for the genetic pathway effect. Model/variable selection within the LSKM framework is discussed. The methods are illustrated using a prostate cancer data set and evaluated using simulations.
6.
Estimates of waterfowl demographic parameters often come from resighting studies where birds fitted with individually identifiable neck collars are resighted at a distance. Concerns have been raised about the effects of collar loss on parameter estimates, and the reliability of extrapolating from collared individuals to the population. Models previously proposed to account for collar loss do not allow survival or harvest parameters to depend on neck collar presence or absence. Also, few models have incorporated recent advances in mark-recapture theory that allow for multiple states or auxiliary encounters such as band recoveries. We propose a multistate model for tag loss in which the presence or absence of a collar is considered as a state variable. In this framework, demographic parameters are corrected for tag loss and questions related to collar effects on survival and recovery rates can be addressed. Encounters of individuals between closed sampling periods also can be incorporated in the analysis. We discuss data requirements for answering questions related to tag loss and sampling designs that lend themselves to this purpose. We illustrate the application of our model using a study of lesser snow geese (Chen caerulescens caerulescens).
7.
Semiparametric analysis of zero-inflated count data. Cited by: 1 (self-citations: 0, citations by others: 1)
Medical and public health research often involves the analysis of count data that exhibit a substantially large proportion of zeros, such as the number of heart attacks and the number of days of missed primary activities in a given period. A zero-inflated Poisson regression model, which hypothesizes a two-point heterogeneity in the population characterized by a binary random effect, is generally used to model such data. Subjects are broadly categorized into the low-risk group leading to structural zero counts and high-risk (or normal) group so that the counts can be modeled by a Poisson regression model. The main aim is to identify the explanatory variables that have significant effects on (i) the probability that the subject is from the low-risk group by means of a logistic regression formulation; and (ii) the magnitude of the counts, given that the subject is from the high-risk group by means of a Poisson regression where the effects of the covariates are assumed to be linearly related to the natural logarithm of the mean of the counts. In this article we consider a semiparametric zero-inflated Poisson regression model that postulates a possibly nonlinear relationship between the natural logarithm of the mean of the counts and a particular covariate. A sieve maximum likelihood estimation method is proposed. Asymptotic properties of the proposed sieve maximum likelihood estimators are discussed. Under some mild conditions, the estimators are shown to be asymptotically efficient and normally distributed. Simulation studies were carried out to investigate the performance of the proposed method. For illustration, the method is applied to a data set from a public health survey conducted in Indonesia where the variable of interest is the number of days of missed primary activities due to illness in a 4-week period.
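The two-part structure described in this abstract can be made concrete with a minimal sketch of the standard parametric zero-inflated Poisson log-likelihood. This is not the authors' sieve estimator: the single covariate `x`, the parameter pairs `gamma` and `beta`, and the function names are assumptions introduced here for illustration, and the linear term in the log-mean is exactly what the semiparametric model would replace with an unknown smooth function.

```python
import math

def zip_log_pmf(y, x, gamma, beta):
    """Log-probability of count y under a zero-inflated Poisson model.

    gamma = (intercept, slope): logistic model for P(low-risk group),
            i.e. the probability of a structural zero (hypothetical names).
    beta  = (intercept, slope): log-linear model for the Poisson mean
            in the high-risk group (hypothetical names).
    """
    # (i) logistic regression for membership in the low-risk group
    pi = 1.0 / (1.0 + math.exp(-(gamma[0] + gamma[1] * x)))
    # (ii) Poisson regression with a log link for the high-risk group
    lam = math.exp(beta[0] + beta[1] * x)
    if y == 0:
        # a zero is either structural or a Poisson zero
        return math.log(pi + (1.0 - pi) * math.exp(-lam))
    log_poisson = -lam + y * math.log(lam) - math.lgamma(y + 1)
    return math.log(1.0 - pi) + log_poisson

def zip_log_likelihood(ys, xs, gamma, beta):
    """Sum of log-probabilities over independent subjects."""
    return sum(zip_log_pmf(y, x, gamma, beta) for y, x in zip(ys, xs))
```

Maximizing `zip_log_likelihood` over `gamma` and `beta` with any general-purpose optimizer would give the usual parametric fit; the sieve approach in the abstract would instead expand the log-mean in a growing set of basis functions.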
8.
Zhang D. Biometrics, 2004, 60(1): 8-15
The routinely assumed parametric functional form in the linear predictor of a generalized linear mixed model for longitudinal data may be too restrictive to represent true underlying covariate effects. We relax this assumption by representing these covariate effects by smooth but otherwise arbitrary functions of time, with random effects used to model the correlation induced by among-subject and within-subject variation. Due to the usually intractable integration involved in evaluating the quasi-likelihood function, the double penalized quasi-likelihood (DPQL) approach of Lin and Zhang (1999, Journal of the Royal Statistical Society, Series B 61, 381-400) is used to estimate the varying coefficients and the variance components simultaneously by representing a nonparametric function by a linear combination of fixed effects and random effects. A scaled chi-squared test based on the mixed model representation of the proposed model is developed to test whether an underlying varying coefficient is a polynomial of certain degree. We evaluate the performance of the procedures through simulation studies and illustrate their application with data on infectious diseases in Indonesian children.
9.
Joint frailty models for recurring events and death using maximum penalized likelihood estimation: application on cancer events. Cited by: 1 (self-citations: 0, citations by others: 1)
Rondeau V, Mathoulin-Pelissier S, Jacqmin-Gadda H, Brouste V, Soubeyran P. Biostatistics (Oxford, England), 2007, 8(4): 708-721
The observation of repeated events for subjects in cohort studies could be terminated by loss to follow-up, end of study, or a major failure event such as death. In this context, the major failure event could be correlated with recurrent events, and the usual assumption of noninformative censoring of the recurrent event process by death, required by most statistical analyses, can be violated. Recently, joint modeling for 2 survival processes has received considerable attention because it makes it possible to study the joint evolution over time of 2 processes and yields unbiased and efficient parameter estimates. The most commonly used estimation procedure in the joint models for survival events is the expectation maximization algorithm. We show how maximum penalized likelihood estimation can be applied to nonparametric estimation of the continuous hazard functions in a general joint frailty model with right censoring and delayed entry. The simulation study demonstrates that this semiparametric approach yields satisfactory results in this complex setting. As an illustration, such an approach is applied to a prospective cohort with recurrent events of follicular lymphomas, jointly modeled with death.
10.
Genomic data are often characterized by a moderate to large number of categorical variables observed for relatively few subjects. Some of the variables may be missing or noninformative. An example of such data is loss of heterozygosity (LOH), a dichotomous variable, observed on a moderate number of genetic markers. We first consider a latent class model where, conditional on unobserved membership in one of k classes, the variables are independent with probabilities determined by a regression model of low dimension q. Using a family of penalties including the ridge and LASSO, we extend this model to address higher-dimensional problems. Finally, we present an orthogonal map that transforms marker space to a space of "features" for which the constrained model has better predictive power. We demonstrate these methods on LOH data collected at 19 markers from 93 brain tumor patients. For this data set, the existing unpenalized latent class methodology does not produce estimates. Additionally, we show that posterior classes obtained from this method are associated with survival for these patients.
11.
The Cox proportional hazards model usually assumes an exponential form for the dependence of the hazard function on covariates. However, in practice this assumption may be violated and other relative risk forms may be more appropriate. In this article, we consider the proportional hazards model with an unknown relative risk form. Issues in model interpretation are addressed. We propose a method to estimate the relative risk form and the regression parameters simultaneously by first approximating the logarithm of the relative risk form by a spline, and then employing the maximum partial likelihood estimation. An iterative alternating optimization procedure is developed for efficient implementation. Statistical inference of the regression coefficients and of the relative risk form based on parametric asymptotic theory is discussed. The proposed methods are illustrated using simulation and an application to the Veteran's Administration lung cancer data.
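To illustrate what a partial likelihood with a non-exponential relative risk form looks like, here is a minimal sketch that evaluates the log partial likelihood for a user-supplied relative risk function. It is not the authors' spline-based alternating procedure: the function name, the argument names, and the assumption of no tied event times are all introduced here for illustration, and the relative risk function is taken as known rather than estimated.

```python
import math

def log_partial_likelihood(times, events, covariates, beta, relative_risk=math.exp):
    """Log partial likelihood of a proportional hazards model.

    times:         observed follow-up times (assumed to have no tied events)
    events:        1 if the time is an observed event, 0 if censored
    covariates:    one covariate vector per subject
    beta:          regression coefficients
    relative_risk: positive function r applied to the linear predictor;
                   math.exp gives the usual Cox model
    """
    n = len(times)
    # risk score r(beta'x) for every subject
    scores = [
        relative_risk(sum(b * v for b, v in zip(beta, x)))
        for x in covariates
    ]
    total = 0.0
    for i in range(n):
        if not events[i]:
            continue  # censored subjects contribute only through risk sets
        # risk set: subjects still under observation at the event time
        risk_sum = sum(scores[j] for j in range(n) if times[j] >= times[i])
        total += math.log(scores[i]) - math.log(risk_sum)
    return total
```

Passing, for example, `relative_risk=lambda u: 1.0 + u` evaluates a linear relative risk form, provided the resulting scores stay positive for all subjects.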
12.
Exchangeable binary data are often collected in developmental toxicity and other studies, and a whole host of parametric distributions for fitting this kind of data have been proposed in the literature. While these distributions can be matched to have the same marginal probability and intra-cluster correlation, they can be quite different in terms of shape and higher-order quantities of interest such as the litter-level risk of having at least one malformed fetus. A sensible alternative is to fit a saturated model (Bowman and George, 1995, Journal of the American Statistical Association 90, 871-879) using the expectation-maximization (EM) algorithm proposed by Stefanescu and Turnbull (2003, Biometrics 59, 18-24). The assumption of compatibility of marginal distributions is often made to link up the distributions for different cluster sizes so that estimation can be based on the combined data. Stefanescu and Turnbull proposed a modified trend test to test this assumption. Their test, however, fails to take into account the variability of an estimated null expectation and as a result leads to inaccurate p-values. This drawback is rectified in this article. When the data are sparse, the probability function estimated using a saturated model can be very jagged and some kind of smoothing is needed. We extend the penalized likelihood method (Simonoff, 1983, Annals of Statistics 11, 208-218) to the present case of unequal cluster sizes and implement the method using an EM-type algorithm. In the presence of a covariate, we propose a penalized kernel method that performs smoothing in both the covariate and response space. The proposed methods are illustrated using several data sets and the sampling and robustness properties of the resulting estimators are evaluated by simulations.
13.
14.
Huggins R. Biometrics, 2006, 62(3): 684-690
A semiparametric partially linear model for the size of an open population is proposed and inference is conducted using weighted martingale estimating equations. This extends a previous nonparametric approach to modeling capture-recapture data for open populations with frequent capture occasions. Analytic expressions for the large sample variances are derived and these are confirmed in a simulation study. The method is illustrated on monthly penguin banding data collected over 6 years.
15.
We study nonparametric likelihood-based estimators of the mean function of counting processes with panel count data using monotone polynomial splines. The generalized Rosen algorithm, proposed by Zhang & Jamshidian (2004), is used to compute the estimators. We show that the proposed spline likelihood-based estimators are consistent and that their rate of convergence can be faster than n^{1/3}. Simulation studies with moderate samples show that the estimators have smaller variances and mean squared errors than their alternatives proposed by Wellner & Zhang (2000). A real example from a bladder tumour clinical trial is used to illustrate this method.
16.
Use of regression functions for improved estimation of means. Cited by: 2 (self-citations: 0, citations by others: 2)
17.
Closed population capture-recapture analysis of camera-trap data has become the conventional method for estimating the abundance of individually recognisable cryptic species living at low densities, such as large felids. Often these estimates are the only information available to guide wildlife managers and conservation policy. Capture probability of the target species using camera traps is commonly heterogeneous and low. Published studies often report overall capture probabilities as low as 0.03 and fail to report on the level of heterogeneity in capture probability. We used simulations to study the effects of low and heterogeneous capture probability on the reliability of abundance estimates using the Mh jackknife estimator within a closed-population capture-recapture framework. High heterogeneity in capture probability was associated with under- and over-estimates of true abundance. The use of biased abundance estimates could have serious conservation management consequences. We recommend that studies present capture frequencies of all sampled individuals so that policy makers can assess the reliability of the abundance estimates.
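As a rough illustration of how capture frequencies feed a jackknife abundance estimate, the sketch below computes the first-order jackknife estimator, N = S + ((k - 1) / k) * f1, where S is the number of distinct individuals detected, k the number of sampling occasions, and f1 the number of individuals captured exactly once. This is only the first-order case: the abstract does not say which jackknife order or software was used, full Mh procedures typically choose among several orders, and the function and argument names are assumptions introduced here.

```python
def first_order_jackknife(capture_counts, n_occasions):
    """First-order jackknife abundance estimate for a closed population.

    capture_counts: number of captures of each detected individual
                    (one entry per individual, each between 1 and n_occasions)
    n_occasions:    number of sampling occasions k
    """
    if n_occasions < 1:
        raise ValueError("n_occasions must be at least 1")
    if any(c < 1 or c > n_occasions for c in capture_counts):
        raise ValueError("each capture count must lie in 1..n_occasions")
    observed = len(capture_counts)                           # S
    singletons = sum(1 for c in capture_counts if c == 1)    # f1
    return observed + (n_occasions - 1) / n_occasions * singletons
```

Because the correction depends entirely on the capture frequencies, reporting them for all sampled individuals, as the abstract recommends, is what lets a reader recompute or sanity-check an estimate of this kind.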
18.
Summary. We study joint modeling of survival and longitudinal data. There are two regression models of interest. The primary model is for survival outcomes, which are assumed to follow a time-varying coefficient proportional hazards model. The second model is for longitudinal data, which are assumed to follow a random effects model. Based on the trajectory of a subject's longitudinal data, some covariates in the survival model are functions of the unobserved random effects. Estimated random effects are generally different from the unobserved random effects and hence this leads to covariate measurement error. To deal with covariate measurement error, we propose a local corrected score estimator and a local conditional score estimator. Both approaches are semiparametric methods in the sense that there is no distributional assumption needed for the underlying true covariates. The estimators are shown to be consistent and asymptotically normal. However, simulation studies indicate that the conditional score estimator outperforms the corrected score estimator for finite samples, especially in the case of relatively large measurement error. The approaches are demonstrated by an application to data from an HIV clinical trial.
19.
20.
Summary. The case-control mother-child pair design offers a unique advantage for dissecting the genetic susceptibility of complex traits because it allows the assessment of both maternal and offspring genetic compositions. This design has been widely adopted in studies of obstetric complications and neonatal outcomes. In this work, we developed an efficient statistical method for evaluating joint genetic and environmental effects on a binary phenotype. Using a logistic regression model to describe the relationship between the phenotype and maternal and offspring genetic and environmental risk factors, we developed a semiparametric maximum likelihood method for the estimation of odds ratio association parameters. Our method is novel because it exploits two unique features of the study data for the parameter estimation. First, the correlation between maternal and offspring SNP genotypes can be specified under the assumptions of random mating, Hardy-Weinberg equilibrium, and Mendelian inheritance. Second, environmental exposures are often not affected by offspring genes conditional on maternal genes. Our method yields more efficient estimates compared with the standard prospective method for fitting logistic regression models to case-control data. We demonstrated the performance of our method through extensive simulation studies and the analysis of data from the Jerusalem Perinatal Study.