Found 20 similar articles (search time: 0 ms)
1.
We investigate the variable selection problem for Cox's proportional hazards model, and propose a unified model selection and estimation procedure with desired theoretical properties and computational convenience. The new method is based on a penalized log partial likelihood with the adaptively weighted L1 penalty on regression coefficients, providing what we call the adaptive Lasso estimator. The method incorporates different penalties for different coefficients: unimportant variables receive larger penalties than important ones, so that important variables tend to be retained in the selection process, whereas unimportant variables are more likely to be dropped. Theoretical properties, such as consistency and rate of convergence of the estimator, are studied. We also show that, with proper choice of regularization parameters, the proposed estimator has the oracle properties. The convex optimization nature of the method leads to an efficient algorithm. Both simulated and real examples show that the method performs competitively.
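As a rough illustration of the estimator described in this abstract, the objective can be sketched as below. The abstract only says the L1 penalty is "adaptively weighted"; the specific weights $1/|\tilde{\beta}_j|$, the $1/n$ scaling, and the symbols $\ell_n$, $\tilde{\beta}$, $\lambda$ and $d$ are assumptions made for this sketch, chosen to match the usual adaptive Lasso formulation.

```latex
% \ell_n(\beta): log partial likelihood for n subjects
% \tilde{\beta}: unpenalized maximum partial likelihood estimate (assumed weight source)
% \lambda \ge 0: regularization parameter; d: number of covariates
\hat{\beta}
  = \arg\min_{\beta}
    \left\{
      -\frac{1}{n}\,\ell_n(\beta)
      + \lambda \sum_{j=1}^{d} \frac{|\beta_j|}{|\tilde{\beta}_j|}
    \right\}
```

Under this form, a coefficient with a small initial estimate $|\tilde{\beta}_j|$ receives a large penalty and is more likely to be set to zero, which is the behaviour the abstract describes for unimportant variables.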
2.
3.
Summary. We consider variable selection in the Cox regression model (Cox, 1975, Biometrika 62, 269–276) with covariates missing at random. We investigate the smoothly clipped absolute deviation penalty and adaptive least absolute shrinkage and selection operator (LASSO) penalty, and propose a unified model selection and estimation procedure. A computationally attractive algorithm is developed, which simultaneously optimizes the penalized likelihood function and penalty parameters. We also optimize a model selection criterion, called the ICQ statistic (Ibrahim, Zhu, and Tang, 2008, Journal of the American Statistical Association 103, 1648–1658), to estimate the penalty parameters and show that it consistently selects all important covariates. Simulations are performed to evaluate the finite sample performance of the penalty estimates. Also, two lung cancer data sets are analyzed to demonstrate the proposed methodology.
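The two penalties named in this abstract can be written down compactly. The following is a minimal sketch of the standard textbook forms, not the article's implementation: the function names, the default `a = 3.7`, and the exponent `gamma` are assumptions introduced here for illustration.

```python
from typing import Sequence


def scad_penalty(theta: float, lam: float, a: float = 3.7) -> float:
    """Smoothly clipped absolute deviation (SCAD) penalty for one coefficient.

    Standard three-piece form: linear near zero, quadratic in the middle,
    constant for large |theta|. Requires lam >= 0 and a > 2.
    """
    if lam < 0 or a <= 2:
        raise ValueError("need lam >= 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2


def adaptive_lasso_penalty(
    beta: Sequence[float],
    beta_init: Sequence[float],
    lam: float,
    gamma: float = 1.0,
) -> float:
    """Adaptive LASSO penalty: lam * sum_j |beta_j| / |beta_init_j|**gamma.

    beta_init is an initial (unpenalized) estimate; every entry must be
    nonzero, otherwise the corresponding weight is undefined.
    """
    return lam * sum(abs(b) / abs(b0) ** gamma for b, b0 in zip(beta, beta_init))
```

Both penalties shrink small coefficients like the ordinary LASSO, but SCAD stops increasing once `|theta|` exceeds `a * lam`, and the adaptive LASSO down-weights coefficients whose initial estimates are large, so large effects are penalized less in either case.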
4.
We propose an efficient and adaptive shrinkage method for variable selection in the Cox model. The method constructs a piecewise-linear regularization path connecting the maximum partial likelihood estimator and the origin. Then a model is selected along the path. We show that the constructed path is adaptive in the sense that, with a proper choice of regularization parameter, the fitted model works as well as if the true underlying submodel were given in advance. A modified algorithm of the least-angle-regression type efficiently computes the entire regularization path of the new estimator. Furthermore, we show that, with a proper choice of shrinkage parameter, the method is consistent in variable selection and efficient in estimation. Simulation shows that the new method tends to outperform the lasso and the smoothly-clipped-absolute-deviation estimators with moderate samples. We apply the methodology to data concerning nursing homes.
5.
Summary. It is of great practical interest to simultaneously identify the important predictors that correspond to both the fixed and random effects components in a linear mixed‐effects (LME) model. Typical approaches perform selection separately on each of the fixed and random effect components. However, changing the structure of one set of effects can lead to different choices of variables for the other set of effects. We propose simultaneous selection of the fixed and random factors in an LME model using a modified Cholesky decomposition. Our method is based on a penalized joint log likelihood with an adaptive penalty for the selection and estimation of both the fixed and random effects. It performs model selection by allowing fixed effects or standard deviations of random effects to be exactly zero. A constrained expectation–maximization algorithm is then used to obtain the final estimates. It is further shown that the proposed penalized estimator enjoys the Oracle property, in that, asymptotically, it performs as well as if the true model were known beforehand. We demonstrate the performance of our method based on a simulation study and a real data example.
6.
Summary. Gaussian graphical models have been widely used as an effective method for studying the conditional independence structure among genes and for constructing genetic networks. However, gene expression data typically have heavier tails or more outlying observations than the standard Gaussian distribution. Such outliers in gene expression data can lead to wrong inference on the dependency structure among the genes. We propose an l1-penalized estimation procedure for sparse Gaussian graphical models that is robustified against possible outliers. The likelihood function is weighted according to how far each observation deviates, where the deviation of an observation is measured by its own likelihood. An efficient computational algorithm based on the coordinate gradient descent method is developed to obtain the minimizer of the negative penalized robustified likelihood, where nonzero elements of the concentration matrix represent the graphical links among the genes. After the graphical structure is obtained, we re‐estimate the positive definite concentration matrix using an iterative proportional fitting algorithm. Through simulations, we demonstrate that the proposed robust method performs much better than the graphical Lasso for Gaussian graphical models in terms of both graph structure selection and estimation when outliers are present. We apply the robust estimation procedure to an analysis of yeast gene expression data and show that the resulting graph has a better biological interpretation than that obtained from the graphical Lasso.
7.
8.
9.
10.
11.
The use of generalized additive models in statistical data analysis suffers from the restriction to few explanatory variables and the problems of selection of smoothing parameters. Generalized additive model boosting circumvents these problems by means of stagewise fitting of weak learners. A fitting procedure is derived which works for all simple exponential family distributions, including binomial, Poisson, and normal response variables. The procedure combines the selection of variables and the determination of the appropriate amount of smoothing. Penalized regression splines and the newly introduced penalized stumps are considered as weak learners. Estimates of standard deviations and stopping criteria, which are notorious problems in iterative procedures, are based on an approximate hat matrix. The method is shown to be a strong competitor to common procedures for the fitting of generalized additive models. In particular, in high-dimensional settings with many nuisance predictor variables it performs very well.
12.
Gray RJ 《Biometrics》2000,56(2):571-576
An estimator of the regression parameters in a semiparametric transformed linear survival model is examined. This estimator consists of a single Newton-like update of the solution to a rank-based estimating equation from an initial consistent estimator. An automated penalized likelihood algorithm is proposed for estimating the optimal weight function for the estimating equations and the error hazard function that is needed in the variance estimator. In simulations, the estimated optimal weights are found to give reasonably efficient estimators of the regression parameters, and the variance estimators are found to perform well. The methodology is applied to an analysis of prognostic factors in non-Hodgkin's lymphoma.
13.
14.
Nykamp DQ 《Journal of mathematical biology》2009,59(2):147-173
We present an analysis of interactions among neurons in stimulus-driven networks that is designed to control for effects from unmeasured neurons. This work builds on previous connectivity analyses that assumed connectivity strength to be constant with respect to the stimulus. Since unmeasured neuron activity can modulate with the stimulus, the effective strength of common input connections from such hidden neurons can also modulate with the stimulus. By explicitly accounting for the resulting stimulus-dependence of effective interactions among measured neurons, we are able to remove ambiguity in the classification of causal interactions that resulted from classification errors in the previous analyses. In this way, we can more reliably distinguish causal connections among measured neurons from common input connections that arise from hidden network nodes. The approach is derived in a general mathematical framework that can be applied to other types of networks. We illustrate the effects of stimulus-dependent connectivity estimates with simulations of neurons responding to a visual stimulus. This research was supported by the National Science Foundation grants DMS-0415409 and DMS-0748417.
15.
Robert G. Downer David C. Hamilton 《Biometrical journal. Biometrische Zeitschrift》2000,42(4):395-415
Raw estimates of disease rates over a geographical region are frequently quite variable, even though one may reasonably expect adjacent communities to have similar true rates. Smoother estimates are obtained by incorporating a penalty into a multinomial likelihood estimation procedure. For each pair of locations, this penalty increases with the difference between the rates and decreases with the distance between the two sites. The resulting estimates have smaller mean squared error than the raw estimates. Expansions are developed which demonstrate the contributions of the smoothing constant, spatial configuration, risk population and raw estimates to the amount of smoothing. Simulations and an example involving gastric cancer data illustrate the proposed method.
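To make the pairwise penalty in this abstract concrete, here is a small sketch of one penalty with the stated behaviour: it grows with the difference between two rates and shrinks as the distance between the sites grows. The abstract does not give the functional form, so the squared difference divided by distance, the function name, and the `smoothing` argument are all assumptions for illustration only.

```python
from itertools import combinations
from typing import Dict, Sequence, Tuple


def pairwise_smoothness_penalty(
    rates: Sequence[float],
    distances: Dict[Tuple[int, int], float],
    smoothing: float,
) -> float:
    """Hypothetical spatial penalty: smoothing * sum_{i<j} (r_i - r_j)^2 / d_ij.

    rates     -- candidate disease rates, one per location
    distances -- positive distance for each pair (i, j) with i < j
    smoothing -- nonnegative smoothing constant (0 means no smoothing)
    """
    total = 0.0
    for i, j in combinations(range(len(rates)), 2):
        # Large rate differences cost more; distant pairs cost less.
        total += (rates[i] - rates[j]) ** 2 / distances[(i, j)]
    return smoothing * total


# Three locations: sites 0 and 1 are close, site 2 is farther away.
example_rates = [0.1, 0.3, 0.2]
example_distances = {(0, 1): 1.0, (0, 2): 2.0, (1, 2): 4.0}
example_penalty = pairwise_smoothness_penalty(example_rates, example_distances, 1.0)
```

In a penalized fit, a term like this would be subtracted from the multinomial log likelihood, so that nearby communities with very different raw rates are pulled toward each other.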
16.
17.
Summary. We consider selecting both fixed and random effects in a general class of mixed effects models using maximum penalized likelihood (MPL) estimation along with the smoothly clipped absolute deviation (SCAD) and adaptive least absolute shrinkage and selection operator (ALASSO) penalty functions. The MPL estimates are shown to possess consistency and sparsity properties and asymptotic normality. A model selection criterion, called the ICQ statistic, is proposed for selecting the penalty parameters (Ibrahim, Zhu, and Tang, 2008, Journal of the American Statistical Association 103, 1648–1658). The variable selection procedure based on ICQ is shown to consistently select important fixed and random effects. The methodology is very general and can be applied to numerous situations involving random effects, including generalized linear mixed models. Simulation studies and a real data set from a Yale infant growth study are used to illustrate the proposed methodology.
18.
A model based on a Gaussian process (GP) prior and a kernel covariance function can be used to fit nonlinear data with multidimensional covariates. It has been used as a flexible nonparametric approach for curve fitting, classification, clustering, and other statistical problems, and has been widely applied to complex nonlinear systems in many different areas, particularly in machine learning. However, applying the model to large-scale and high-dimensional data sets is challenging, for example, for the meat data discussed in this article, which have 100 highly correlated covariates. For such data, the model suffers from large variance in parameter estimation and high predictive errors, and, numerically, from unstable computation. In this article, a penalized likelihood framework is applied to the model based on GPs. Different penalties are investigated, and how well they suit the characteristics of GP models is discussed. The asymptotic properties are also discussed with the relevant proofs. Several applications to real biomechanical and bioinformatics data sets are reported.
19.
Yassin Mazroui Simone Mathoulin‐Pélissier Gaetan MacGrogan Véronique Brouste Virginie Rondeau 《Biometrical journal. Biometrische Zeitschrift》2013,55(6):866-884
Individuals may experience more than one type of recurrent event and a terminal event during the life course of a disease. Follow‐up may be interrupted for several reasons, including the end of a study or patients lost to follow‐up, which are noninformative censoring events. Death could also stop the follow‐up; hence, it is considered a dependent terminal event. We propose a multivariate frailty model that jointly analyzes two types of recurrent events with a dependent terminal event. Two estimation methods are proposed: a semiparametric approach using penalized likelihood estimation where baseline hazard functions are approximated by M‐splines, and another one with piecewise constant baseline hazard functions. Finally, we derived martingale residuals to check the goodness‐of‐fit. We illustrate our proposals with a real dataset on breast cancer. The main objective was to model the dependency between the two types of recurrent events (locoregional and metastatic) and the terminal event (death) after breast cancer.
20.
Joint frailty models for recurring events and death using maximum penalized likelihood estimation: application on cancer events (total citations: 1, self-citations: 0, citations by others: 1)
Rondeau V Mathoulin-Pelissier S Jacqmin-Gadda H Brouste V Soubeyran P 《Biostatistics (Oxford, England)》2007,8(4):708-721
The observation of repeated events for subjects in cohort studies could be terminated by loss to follow-up, end of study, or a major failure event such as death. In this context, the major failure event could be correlated with recurrent events, and the usual assumption of noninformative censoring of the recurrent event process by death, required by most statistical analyses, can be violated. Recently, joint modeling for 2 survival processes has received considerable attention because it makes it possible to study the joint evolution over time of 2 processes and gives unbiased and efficient parameters. The most commonly used estimation procedure in the joint models for survival events is the expectation maximization algorithm. We show how maximum penalized likelihood estimation can be applied to nonparametric estimation of the continuous hazard functions in a general joint frailty model with right censoring and delayed entry. The simulation study demonstrates that this semiparametric approach yields satisfactory results in this complex setting. As an illustration, such an approach is applied to a prospective cohort with recurrent events of follicular lymphomas, jointly modeled with death.