Similar Documents
20 similar documents found (search time: 15 ms)
1.
Zhang, Hao Helen; Lu, Wenbin. Biometrika (2007), 94(3), 691–703.
We investigate the variable selection problem for Cox's proportional hazards model, and propose a unified model selection and estimation procedure with desired theoretical properties and computational convenience. The new method is based on a penalized log partial likelihood with an adaptively weighted L1 penalty on the regression coefficients, providing what we call the adaptive Lasso estimator. The method incorporates different penalties for different coefficients: unimportant variables receive larger penalties than important ones, so that important variables tend to be retained in the selection process, whereas unimportant variables are more likely to be dropped. Theoretical properties, such as consistency and rate of convergence of the estimator, are studied. We also show that, with a proper choice of regularization parameters, the proposed estimator has the oracle properties. The convex optimization nature of the method leads to an efficient algorithm. Both simulated and real examples show that the method performs competitively.
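As a rough illustration of the two-stage idea (not the authors' implementation), the sketch below fits an approximate unpenalized Cox model by gradient descent on the Breslow partial likelihood, builds adaptive weights 1/|β̂| from it, and then runs proximal gradient with a weighted soft-threshold step. Step sizes, iteration counts, and the simulated data are illustrative, and tied event times are ignored.

```python
import numpy as np

def neg_log_partial_lik_grad(beta, X, time, event):
    """Breslow negative log partial likelihood and its gradient (no ties)."""
    order = np.argsort(-time)              # descending time: cumsums = risk sets
    Xs, es = X[order], event[order]
    eta = Xs @ beta
    w = np.exp(eta)
    cum_w = np.cumsum(w)
    cum_wx = np.cumsum(w[:, None] * Xs, axis=0)
    loglik = np.sum(es * (eta - np.log(cum_w)))
    grad = -np.sum(es[:, None] * (Xs - cum_wx / cum_w[:, None]), axis=0)
    return -loglik, grad

def adaptive_lasso_cox(X, time, event, lam=0.1, step=1e-3, iters=5000):
    p = X.shape[1]
    beta0 = np.zeros(p)                    # stage 1: rough unpenalized fit
    for _ in range(iters):
        _, g = neg_log_partial_lik_grad(beta0, X, time, event)
        beta0 -= step * g
    w = 1.0 / (np.abs(beta0) + 1e-8)       # adaptive weights: big |beta| => small penalty
    beta = beta0.copy()                    # stage 2: proximal gradient
    for _ in range(iters):
        _, g = neg_log_partial_lik_grad(beta, X, time, event)
        z = beta - step * g
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam * w, 0.0)
    return beta

rng = np.random.default_rng(0)
n, p = 200, 6
X = rng.standard_normal((n, p))
true_beta = np.array([1.0, -1.0, 0.5, 0.0, 0.0, 0.0])
t = rng.exponential(1.0 / np.exp(X @ true_beta))   # hazard proportional to exp(x'beta)
c = rng.exponential(2.0, n)                        # censoring times
time, event = np.minimum(t, c), (t <= c).astype(float)
print(adaptive_lasso_cox(X, time, event).round(2))  # noise coefficients shrink toward 0
```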

2.
Cai, T.; Huang, J.; Tian, L. Biometrics (2009), 65(2), 394–404.
Summary. In the presence of high-dimensional predictors, it is challenging to develop reliable regression models that can be used to accurately predict future outcomes. Further complications arise when the outcome of interest is an event time, which is often not fully observed due to censoring. In this article, we develop robust prediction models for event time outcomes by regularizing Gehan's estimator for the accelerated failure time (AFT) model (Tsiatis, 1990, Annals of Statistics 18, 305–328) with the least absolute shrinkage and selection operator (LASSO) penalty. Unlike existing methods based on inverse probability weighting and the Buckley and James estimator (Buckley and James, 1979, Biometrika 66, 429–436), the proposed approach does not require additional assumptions about the censoring and always yields a solution that is convergent. Furthermore, the proposed estimator leads to a stable regression model for prediction even if the AFT model fails to hold. To facilitate the adaptive selection of the tuning parameter, we detail an efficient numerical algorithm for obtaining the entire regularization path. The proposed procedures are applied to a breast cancer dataset to derive a reliable regression model for predicting patient survival based on a set of clinical prognostic factors and gene signatures. Finite sample performances of the procedures are evaluated through a simulation study.
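A toy numpy sketch of the regularized objective as I read it, with the Gehan loss written in its pairwise form (1/n²)ΣᵢΣⱼ δᵢ·max(0, eⱼ−eᵢ) and minimized by plain subgradient descent rather than the authors' path algorithm; constants and data are illustrative.

```python
import numpy as np

def gehan_l1_fit(X, logY, delta, lam=0.05, step=5e-3, iters=3000):
    """Subgradient descent on the L1-penalized Gehan loss for the AFT model.

    Loss: (1/n^2) * sum_{i,j} delta_i * max(0, e_j - e_i),
    with residuals e_i = logY_i - x_i' beta.
    """
    n = len(logY)
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        e = logY - X @ beta
        active = (delta[:, None] > 0) & ((e[None, :] - e[:, None]) > 0)
        # For an active pair (i, j), d/d(beta) of (e_j - e_i) is (x_i - x_j).
        g = (active.sum(axis=1) @ X - active.sum(axis=0) @ X) / n ** 2
        beta -= step * (g + lam * np.sign(beta))   # L1 subgradient
    return beta

rng = np.random.default_rng(1)
n, p = 150, 5
X = rng.standard_normal((n, p))
beta_true = np.array([1.0, -0.8, 0.0, 0.0, 0.5])
logT = X @ beta_true + 0.3 * rng.standard_normal(n)
logC = X @ beta_true + 0.5 + 0.3 * rng.standard_normal(n)   # censoring times
logY, delta = np.minimum(logT, logC), (logT <= logC).astype(float)
print(gehan_l1_fit(X, logY, delta).round(2))
```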

3.
The penalized least squares approach with the smoothly clipped absolute deviation penalty has been consistently demonstrated to be an attractive regression shrinkage and selection method. It not only automatically and consistently selects the important variables, but also produces estimators which are as efficient as the oracle estimator. However, these attractive features depend on an appropriate choice of the tuning parameter. We show that the commonly used generalized cross-validation cannot select the tuning parameter satisfactorily, with a nonignorable overfitting effect in the resulting model. In addition, we propose a BIC tuning parameter selector, which is shown to be able to identify the true model consistently. Simulation studies are presented to support the theoretical findings, and an empirical example is given to illustrate its use in the Female Labor Supply data.
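In the simplest orthonormal (normal-means) setting, SCAD has a closed-form thresholding rule, which makes the BIC tuning selector easy to demonstrate; the sketch below is illustrative and not the paper's general algorithm.

```python
import numpy as np

def scad_threshold(z, lam, a=3.7):
    """Closed-form SCAD thresholding rule (Fan & Li, 2001), orthonormal case."""
    az = np.abs(z)
    soft = np.sign(z) * np.maximum(az - lam, 0.0)            # lasso-like zone
    mid = ((a - 1) * z - np.sign(z) * a * lam) / (a - 2)     # tapering zone
    return np.where(az <= 2 * lam, soft, np.where(az <= a * lam, mid, z))

def bic_tuned_scad(y, lam_grid):
    """Pick lambda by BIC = n*log(RSS/n) + log(n)*df, with df = #nonzeros."""
    n = len(y)
    best_bic, best_est = np.inf, None
    for lam in lam_grid:
        theta = scad_threshold(y, lam)
        rss = np.sum((y - theta) ** 2)
        df = np.count_nonzero(theta)
        bic = n * np.log(rss / n + 1e-12) + np.log(n) * df
        if bic < best_bic:
            best_bic, best_est = bic, theta
    return best_est

rng = np.random.default_rng(2)
theta_true = np.concatenate([[3.0, -2.5, 2.0], np.zeros(47)])
y = theta_true + 0.5 * rng.standard_normal(50)
est = bic_tuned_scad(y, np.linspace(0.05, 2.0, 40))
print("selected support:", np.flatnonzero(est))    # ideally indices 0, 1, 2
```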

4.
One of the fundamental problems in theoretical electrocardiography can be characterized by an inverse problem. We present new methods for achieving better estimates of heart surface potential distributions in terms of torso potentials through an inverse procedure. First, we outline an automatic adaptive refinement algorithm that minimizes the spatial discretization error in the transfer matrix, increasing the accuracy of the inverse solution. Second, we introduce a new local regularization procedure, which works by partitioning the global transfer matrix into submatrices, allowing for varying amounts of smoothing. Each submatrix represents a region within the underlying geometric model in which regularization can be specifically ‘tuned’ using an a priori scheme based on the L-curve method. This local regularization method can provide a substantial increase in accuracy compared to global regularization schemes. Within this context of local regularization, we show that a generalized version of the singular value decomposition (GSVD) can further improve the accuracy of ECG inverse solutions compared to standard SVD and Tikhonov approaches. We conclude with specific examples of these techniques using geometric models of the human thorax derived from MRI data.
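A compact numpy stand-in for one ingredient, global Tikhonov regularization with an L-curve corner pick via the SVD (the paper's local regularization and GSVD machinery are not reproduced; the matrix and grid are illustrative):

```python
import numpy as np

def tikhonov_lcurve(A, b, lams):
    """Tikhonov solutions via SVD filtering; pick lambda at the L-curve corner."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    ub = U.T @ b
    xs, res_n, sol_n = [], [], []
    for lam in lams:
        x = Vt.T @ (s / (s ** 2 + lam ** 2) * ub)   # filtered inverse
        xs.append(x)
        res_n.append(np.linalg.norm(A @ x - b))     # data misfit
        sol_n.append(np.linalg.norm(x))             # solution size
    r, e = np.log(res_n), np.log(sol_n)
    dr, de = np.gradient(r), np.gradient(e)
    ddr, dde = np.gradient(dr), np.gradient(de)
    kappa = np.abs(dr * dde - ddr * de) / (dr ** 2 + de ** 2) ** 1.5
    k = int(np.argmax(kappa[1:-1])) + 1             # corner, ignoring endpoints
    return xs[k], lams[k]

rng = np.random.default_rng(3)
A = rng.standard_normal((80, 40)) @ np.diag(0.9 ** np.arange(40))  # ill-conditioned
x_true = np.sin(np.linspace(0, 3 * np.pi, 40))
b = A @ x_true + 0.01 * rng.standard_normal(80)
x_hat, lam = tikhonov_lcurve(A, b, np.logspace(-4, 1, 60))
print(f"lambda = {lam:.4f}, relative error = "
      f"{np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true):.3f}")
```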

5.
Model selection and estimation in the Gaussian graphical model
Yuan, Ming; Lin, Yi. Biometrika (2007), 94(1), 19–35.
We propose penalized likelihood methods for estimating the concentration matrix in the Gaussian graphical model. The methods lead to a sparse and shrinkage estimator of the concentration matrix that is positive definite, and thus conduct model selection and estimation simultaneously. The implementation of the methods is nontrivial because of the positive definite constraint on the concentration matrix, but we show that the computation can be done effectively by taking advantage of the efficient maxdet algorithm developed in convex optimization. We propose a BIC-type criterion for the selection of the tuning parameter in the penalized likelihood methods. The connection between our methods and existing methods is illustrated. Simulations and real examples demonstrate the competitive performance of the new methods.
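scikit-learn ships a cross-validated graphical lasso that implements this penalized-likelihood idea (with cross-validation in place of the paper's BIC-type criterion and a different solver than maxdet); a minimal usage sketch on data simulated from a sparse precision matrix:

```python
import numpy as np
from sklearn.covariance import GraphicalLassoCV

rng = np.random.default_rng(4)
p = 8
# Sparse tridiagonal concentration (precision) matrix and its implied covariance.
prec = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))
X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(prec), size=500)

model = GraphicalLassoCV().fit(X)     # penalty level chosen by cross-validation
print("estimated nonzero pattern:")
print((np.abs(model.precision_) > 0.05).astype(int))
```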

6.
After introducing the fundamentals of the BYY system and harmony learning, which has been developed in the past several years as a unified statistical framework for parameter learning, regularization and model selection, we systematically discuss BYY harmony learning on systems with discrete inner representations. First, we show that one special case leads to unsupervised learning on Gaussian mixtures. We show how harmony learning not only leads us to the EM algorithm for maximum likelihood (ML) learning and the corresponding extended KMEAN algorithms for Mahalanobis clustering, with criteria for selecting the number of Gaussians or clusters, but also provides two new regularization techniques and a unified scheme that includes the previous rival penalized competitive learning (RPCL) as well as its various variants and extensions that perform model selection automatically during parameter learning. Moreover, as a by-product, we also obtain a new approach for determining a set of 'support vectors' for Parzen window density estimation. Second, we show that other special cases lead to three typical supervised learning models with several new results. On the three-layer net, we get (i) a new regularized ML learning, (ii) a new criterion for selecting the number of hidden units, and (iii) a family of EM-like algorithms that combine harmony learning with new regularization techniques. On the original and alternative models of mixture-of-experts (ME) as well as radial basis function (RBF) nets, we get not only a new type of criterion for selecting the number of experts or basis functions but also a new type of EM-like algorithm that combines regularization techniques and RPCL learning for parameter learning, with either a least-complexity nature on the original ME model or automated model selection on the alternative ME model and RBF nets. Moreover, all the results for the alternative ME model also apply to two other popular nonparametric statistical approaches, namely kernel regression and the support vector machine. In particular, not only do we get an easily implemented approach for determining the smoothing parameter in kernel regression, but we also get an alternative approach for deciding the set of support vectors in the support vector machine.
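Of the many ingredients above, rival penalized competitive learning is the easiest to sketch: the winning center is attracted to each sample while the runner-up is repelled, so surplus centers are driven away from the data and the number of clusters is effectively selected during learning. A minimal numpy version with illustrative learning rates:

```python
import numpy as np

def rpcl(X, k=6, alpha_w=0.05, alpha_r=0.005, epochs=20, seed=5):
    """Rival Penalized Competitive Learning: extra centers are pushed away."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)].copy()
    for _ in range(epochs):
        for x in X[rng.permutation(len(X))]:
            d = np.linalg.norm(centers - x, axis=1)
            win, rival = np.argsort(d)[:2]
            centers[win] += alpha_w * (x - centers[win])      # attract the winner
            centers[rival] -= alpha_r * (x - centers[rival])  # repel the rival
    return centers

rng = np.random.default_rng(5)
# Three true clusters; k = 6 deliberately overestimates their number.
X = np.vstack([rng.normal(m, 0.3, size=(100, 2))
               for m in ([0, 0], [3, 0], [0, 3])])
print(np.round(rpcl(X), 2))   # surplus centers drift away from all of the data
```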

7.
The fence method (Jiang et al., 2008, Fence methods for mixed model selection, Annals of Statistics 36, 1669–1692) is a recently proposed strategy for model selection. It was motivated by the limitation of the traditional information criteria in selecting parsimonious models in some nonconventional situations, such as mixed model selection. Jiang et al. (2009, A simplified adaptive fence procedure, Statistics & Probability Letters 79, 625–629) simplified the adaptive fence method of Jiang et al. (2008) to make it more suitable and convenient to use in a wide variety of problems. Still, the current modification encounters computational difficulties when applied to high-dimensional and complex problems. To address this concern, we propose a restricted fence procedure that combines the idea of the fence with that of restricted maximum likelihood. Furthermore, we propose to use the wild bootstrap to choose the tuning parameter of the restricted fence adaptively. We focus on problems in longitudinal studies and, in simulation studies, demonstrate the performance of the new procedure and compare it with other variable selection procedures, including the information criteria and shrinkage methods. The method is further illustrated by a real-data example.

8.
A method is proposed that aims at identifying clusters of individuals that show similar patterns when observed repeatedly. We consider linear mixed models, which are widely used for modeling longitudinal data. In contrast to the classical assumption of a normal distribution for the random effects, a finite mixture of normal distributions is assumed. Typically, the number of mixture components is unknown and has to be chosen, ideally by data-driven tools. For this purpose, an EM-algorithm-based approach is considered that uses a penalized normal mixture as the random effects distribution. The penalty term shrinks the pairwise distances of cluster centers based on the group lasso and the fused lasso methods. The effect is that individuals with similar time trends are merged into the same cluster. The strength of regularization is determined by one penalization parameter. For finding the optimal penalization parameter, a new model choice criterion is proposed.

9.
Summary. High-dimensional data such as microarrays have brought us new statistical challenges. For example, using a large number of genes to classify samples based on a small number of microarrays remains a difficult problem. Diagonal discriminant analysis, support vector machines, and k-nearest neighbors have been suggested as among the best methods for small sample size situations, but none was found to be superior to the others. In this article, we propose an improved diagonal discriminant approach through shrinkage and regularization of the variances. The performance of our new approach, along with that of existing methods, is studied through simulations and applications to real data. These studies show that the proposed shrinkage-based and regularized diagonal discriminant methods have lower misclassification rates than existing methods in many cases.
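A generic variance-shrinkage variant of diagonal discriminant analysis is easy to sketch in numpy (this illustrates the idea, not the authors' exact estimator): per-feature pooled variances are shrunk toward their median before entering the discriminant score.

```python
import numpy as np

def fit_shrunken_dlda(X, y, alpha=0.5):
    """Diagonal LDA with per-feature variances shrunk toward their median."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    resid = X - means[np.searchsorted(classes, y)]
    var = resid.var(axis=0, ddof=len(classes))             # pooled within-class
    var_shrunk = (1 - alpha) * var + alpha * np.median(var)  # regularization
    priors = np.array([(y == c).mean() for c in classes])
    return classes, means, var_shrunk, priors

def predict_dlda(model, X):
    classes, means, var, priors = model
    # Score: -0.5 * sum_j (x_j - mu_kj)^2 / var_j + log prior_k
    d = -0.5 * (((X[:, None, :] - means[None]) ** 2) / var).sum(-1) + np.log(priors)
    return classes[d.argmax(axis=1)]

rng = np.random.default_rng(6)
n, p = 40, 500                       # "large p, small n" microarray-like setting
mu = np.zeros(p); mu[:10] = 1.0      # only 10 informative genes
X = np.vstack([rng.standard_normal((n, p)), mu + rng.standard_normal((n, p))])
y = np.repeat([0, 1], n)
model = fit_shrunken_dlda(X, y, alpha=0.5)
print("training accuracy:", (predict_dlda(model, X) == y).mean())
```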

10.
Johnson, B. A.; Long, Q.; Chung, M. Biometrics (2011), 67(4), 1379–1388.
Summary. Dimension reduction and model and variable selection are ubiquitous concepts in modern statistical science, and deriving new methods beyond the scope of current methodology is noteworthy. This article briefly reviews existing regularization methods for penalized least squares and likelihood for survival data, and their extension to a certain class of penalized estimating functions. We show that if one's goal is to estimate the entire regularized coefficient path using the observed survival data, then all current strategies fail for the Buckley–James estimating function. We propose a novel two-stage method to estimate and restore the entire Dantzig-regularized coefficient path for censored outcomes in a least-squares framework. We apply our methods to a microarray study of lung adenocarcinoma with sample size n = 200 and p = 1036 gene predictors, and find 10 genes that are consistently selected across different criteria and an additional 14 genes that merit further investigation. In simulation studies, we found that the proposed path restoration and variable selection technique has the potential to perform as well as existing methods that begin with a proper convex loss function at the outset.

11.
A penalized maximum likelihood method for estimating epistatic effects of QTL
Zhang, Y. M.; Xu, S. Heredity (2005), 95(1), 96–104.
Although epistasis is an important phenomenon in the genetics and evolution of complex traits, epistatic effects are hard to estimate. The main problem is the overparameterization of epistatic genetic models. An epistatic genetic model should include potential pairwise interaction effects of all loci. However, the model becomes saturated quickly as the number of loci increases. Therefore, a variable selection technique is usually considered to exclude those interactions with negligible effects. With such techniques, we may run a high risk of missing important interaction effects by not fully exploring the extremely large parameter space of models. We develop a penalized maximum likelihood method, which adopts a penalty that depends on the values of the parameters. The penalized likelihood method allows spurious QTL effects to be shrunk towards zero, while QTL with large effects are estimated with virtually no shrinkage. A simulation study shows that the new method can handle a model with a number of effects 15 times larger than the sample size. Simulation studies also show that the results of the penalized likelihood method are comparable to those of Bayesian shrinkage analysis, but the computational speed of the penalized method is orders of magnitude faster.

12.

Background  

DNA microarrays open up a new horizon for studying the genetic determinants of disease. The high throughput nature of these arrays creates an enormous wealth of information, but also poses a challenge to data analysis. Inferential problems become even more pronounced as experimental designs used to collect data become more complex. An important example is multigroup data collected over different experimental groups, such as data collected from distinct stages of a disease process. We have developed a method specifically addressing these issues termed Bayesian ANOVA for microarrays (BAM). The BAM approach uses a special inferential regularization known as spike-and-slab shrinkage that provides an optimal balance between total false detections and total false non-detections. This translates into more reproducible differential calls. Spike-and-slab shrinkage is a form of regularization achieved by using information across all genes and groups simultaneously.
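The flavor of spike-and-slab shrinkage can be shown on per-gene z-scores with a two-component prior (a drastic simplification of the BAM hierarchy; the hyperparameters w and tau2 are illustrative): borderline effects are shrunk hard toward zero while large effects pass through nearly untouched.

```python
import numpy as np
from scipy.stats import norm

def spike_slab_shrink(z, w=0.1, tau2=4.0):
    """Posterior mean of per-gene effects under a spike-and-slab prior.

    Model: z_g | beta_g ~ N(beta_g, 1);  beta_g ~ w*N(0, tau2) + (1-w)*delta_0.
    """
    slab = w * norm.pdf(z, scale=np.sqrt(1.0 + tau2))
    spike = (1.0 - w) * norm.pdf(z, scale=1.0)
    post_incl = slab / (slab + spike)            # P(beta_g != 0 | z_g)
    return post_incl * (tau2 / (1.0 + tau2)) * z, post_incl

rng = np.random.default_rng(7)
beta = np.concatenate([rng.normal(0, 2, 50), np.zeros(950)])   # 5% true signals
z = beta + rng.standard_normal(1000)
est, prob = spike_slab_shrink(z)
called = prob > 0.9
print(f"called {called.sum()} genes, {int((called & (beta != 0)).sum())} true")
```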

13.
Variable Selection for Semiparametric Mixed Models in Longitudinal Studies
Summary. We propose a double-penalized likelihood approach for simultaneous model selection and estimation in semiparametric mixed models for longitudinal data. Two types of penalties are jointly imposed on the ordinary log-likelihood: a roughness penalty on the nonparametric baseline function and a nonconcave shrinkage penalty on the linear coefficients to achieve model sparsity. Compared to existing estimating-equation-based approaches, our procedure provides valid inference for data that are missing at random, and is more efficient if the specified model is correct. Another advantage of the new procedure is its easy computation for both regression components and variance parameters. We show that the double-penalized problem can be conveniently reformulated into a linear mixed model framework, so that existing software can be directly used to implement our method. For the purpose of model inference, we derive both frequentist and Bayesian variance estimates for the estimated parametric and nonparametric components. Simulation is used to evaluate and compare the performance of our method to existing ones. We then apply the new method to a real data set from a lactation study.

14.
Compressed sensing has been shown to be promising for accelerating magnetic resonance imaging. In this technology, magnetic resonance images are usually reconstructed by enforcing their sparsity in sparse image reconstruction models, including both synthesis and analysis models. The synthesis model assumes that an image is a sparse combination of atom signals, while the analysis model assumes that an image is sparse after the application of an analysis operator. The balanced model is a sparse model that bridges the analysis and synthesis models by introducing a penalty term on the distance of the frame coefficients to the range of the analysis operator. In this paper, we study the performance of the balanced model in tight-frame-based compressed sensing magnetic resonance imaging and propose a new efficient numerical algorithm to solve the optimization problem. By tuning the balancing parameter, the new model recovers the solutions of all three models. We find that the balanced model performs comparably to the analysis model; moreover, both achieve better results than the synthesis model regardless of the value of the balancing parameter. Experiments show that our proposed numerical algorithm, the constrained split augmented Lagrangian shrinkage algorithm for the balanced model (C-SALSA-B), converges faster than the previously proposed accelerated proximal gradient algorithm (APG) and the alternating direction method of multipliers for the balanced model (ADMM-B).
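For orientation, a bare-bones synthesis-model reconstruction by ISTA, with an orthonormal DCT standing in for the tight frame (the balanced model's extra penalty on the distance of the coefficients to the range of the analysis operator is omitted, and this is not C-SALSA-B):

```python
import numpy as np
from scipy.fft import dct, idct

def ista_cs(Phi, y, lam=0.02, iters=300):
    """ISTA for min_theta 0.5*||Phi*idct(theta) - y||^2 + lam*||theta||_1."""
    theta = np.zeros(Phi.shape[1])
    A = lambda t: Phi @ idct(t, norm='ortho')        # synthesize, then sample
    At = lambda r: dct(Phi.T @ r, norm='ortho')      # adjoint operator
    L = np.linalg.norm(Phi, 2) ** 2                  # Lipschitz constant of grad
    for _ in range(iters):
        z = theta - At(A(theta) - y) / L             # gradient step
        theta = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # shrinkage
    return idct(theta, norm='ortho')

rng = np.random.default_rng(8)
n, m = 256, 100
theta_true = np.zeros(n); theta_true[[3, 17, 40, 90]] = [5, -4, 3, 2]
x_true = idct(theta_true, norm='ortho')              # sparse in the DCT domain
Phi = rng.standard_normal((m, n)) / np.sqrt(m)       # random measurements
y = Phi @ x_true + 0.01 * rng.standard_normal(m)
x_hat = ista_cs(Phi, y)
print("relative error:",
      np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```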

15.
Huang, J.; Ma, S.; Xie, H. Biometrics (2006), 62(3), 813–820.
We consider two regularization approaches, the LASSO and the threshold-gradient-directed regularization, for estimation and variable selection in the accelerated failure time model with multiple covariates based on Stute's weighted least squares method. The Stute estimator uses Kaplan-Meier weights to account for censoring in the least squares criterion. The weighted least squares objective function makes the adaptation of this approach to multiple covariate settings computationally feasible. We use V-fold cross-validation and a modified Akaike's Information Criterion for tuning parameter selection, and a bootstrap approach for variance estimation. The proposed method is evaluated using simulations and demonstrated on a real data example.
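The Stute weights are the Kaplan-Meier jumps attached to the ordered observations; once computed, the weighted LASSO fit is a one-liner. A sketch under the assumption of no tied times, with a fixed penalty level in place of the paper's cross-validation:

```python
import numpy as np
from sklearn.linear_model import Lasso

def stute_km_weights(time, event):
    """Kaplan-Meier jump weights for the ordered observations (no tied times)."""
    n = len(time)
    order = np.argsort(time)
    d = event[order].astype(float)
    i = np.arange(1, n + 1)
    # w_(i) = d_(i)/(n-i+1) * prod_{j<i} ((n-j)/(n-j+1))^{d_(j)}
    log_prod = np.concatenate(([0.0], np.cumsum(
        d[:-1] * (np.log(n - i[:-1]) - np.log(n - i[:-1] + 1)))))
    w = np.zeros(n)
    w[order] = d / (n - i + 1) * np.exp(log_prod)
    return w

rng = np.random.default_rng(9)
n, p = 200, 10
X = rng.standard_normal((n, p))
beta = np.array([1.0, -0.8, 0.6] + [0.0] * 7)
logT = X @ beta + 0.3 * rng.standard_normal(n)
logC = rng.normal(1.0, 1.0, n)
logY, delta = np.minimum(logT, logC), logT <= logC
w = stute_km_weights(logY, delta)          # censored rows get weight zero
fit = Lasso(alpha=0.05).fit(X, logY, sample_weight=w)
print(np.round(fit.coef_, 2))
```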

16.
We describe here a mathematical model of the adaptive dynamics of a transport network of the true slime mold Physarum polycephalum, an amoeboid organism that exhibits path-finding behavior in a maze. This organism possesses a network of tubular elements, by means of which nutrients and signals circulate through the plasmodium. When the organism is put in a maze, the network changes its shape to connect two exits by the shortest path. This process of path-finding is attributed to an underlying physiological mechanism: a tube thickens as the flux through it increases. The experimental evidence for this is, however, only qualitative. We constructed a mathematical model of the general form of the tube dynamics. Our model contains a key parameter corresponding to the extent of the feedback regulation between the thickness of a tube and the flux through it. We demonstrate the dependence of the behavior of the model on this parameter.
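A numpy sketch of the flux-feedback dynamics, in the commonly used form dD/dt = |Q|^γ − D (the graph, parameters, and time stepping are illustrative): at each step, node pressures solve a Kirchhoff system for a unit source-sink flow, and each tube's conductivity relaxes toward a power of its flux, so the shorter route is progressively reinforced.

```python
import numpy as np

def physarum_step(edges, D, n_nodes, src, snk, dt=0.1, gamma=1.8):
    """One Euler step of the tube dynamics dD/dt = |Q|^gamma - D."""
    L = np.zeros((n_nodes, n_nodes))        # weighted graph Laplacian
    for (i, j), d in zip(edges, D):
        L[i, i] += d; L[j, j] += d
        L[i, j] -= d; L[j, i] -= d
    q = np.zeros(n_nodes)
    q[src], q[snk] = 1.0, -1.0              # unit flow in at src, out at snk
    p = np.zeros(n_nodes)
    p[1:] = np.linalg.solve(L[1:, 1:], q[1:])   # Kirchhoff, node 0 grounded
    Q = np.array([d * (p[i] - p[j]) for (i, j), d in zip(edges, D)])
    return D + dt * (np.abs(Q) ** gamma - D)

# Short route 0-1-4 (two tubes) competes with the detour 0-2-3-4 (three tubes).
edges = [(0, 1), (1, 4), (0, 2), (2, 3), (3, 4)]
D = np.ones(len(edges))
for _ in range(200):
    D = physarum_step(edges, D, n_nodes=5, src=0, snk=4)
for (i, j), d in zip(edges, D):
    print(f"tube {i}-{j}: conductivity {d:.3f}")   # only the short route survives
```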

17.
The cognitive map has been taken as the standard model for how agents infer the most efficient route to a goal location. Alternatively, path integration – maintaining a homing vector during navigation – constitutes a primitive and presumably less-flexible strategy than cognitive mapping because path integration relies primarily on vestibular stimuli and pace counting. The historical debate as to whether complex spatial navigation is ruled by associative learning or cognitive map mechanisms has been challenged by experimental difficulties in successfully neutralizing path integration. To our knowledge, there are only three studies that have succeeded in resolving this issue, all showing clear evidence of novel route taking, a behaviour outside the scope of traditional associative learning accounts. Nevertheless, there is no mechanistic explanation as to how animals perform novel route taking. We propose here a new model of spatial learning that combines path integration with higher-order associative learning, and demonstrate how it can account for novel route taking without a cognitive map, thus resolving this long-standing debate. We show how our higher-order path integration (HOPI) model can explain spatial inferences, such as novel detours and shortcuts. Our analysis suggests that a phylogenetically ancient, vector-based navigational strategy utilizing associative processes is powerful enough to support complex spatial inferences.
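The path-integration component itself is trivial to state in code: the agent accumulates its own displacement vectors, and the homing vector is the negation of the sum. A minimal sketch (the HOPI model's higher-order associative layer is not shown):

```python
import numpy as np

rng = np.random.default_rng(10)
position = np.zeros(2)
accumulator = np.zeros(2)                  # the path integrator's state
for _ in range(200):                       # a random foraging walk
    heading = rng.uniform(0.0, 2.0 * np.pi)
    step = np.array([np.cos(heading), np.sin(heading)])
    position += step
    accumulator += step                    # integrate self-motion
home_vector = -accumulator                 # points from the agent back to the nest
print(np.allclose(position + home_vector, 0.0))   # True: the vector leads home
```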

18.
MOTIVATION: Diffusible and non-diffusible gene products play a major role in body plan formation. A quantitative understanding of the spatio-temporal patterns formed in body plan formation, obtained using simulation models, is an important addition to experimental observation. The inverse modelling approach consists of describing body plan formation by a rule-based model and fitting the model parameters to real observed data. In body plan formation, the data are usually obtained from fluorescent immunohistochemistry or in situ hybridizations. Inferring model parameters by comparing such data to those from simulation is a major computational bottleneck. An important aspect of this process is the choice of method used for parameter estimation. When no information on the parameters is available, parameter estimation is mostly done by means of heuristic algorithms. RESULTS: We show that parameter estimation for pattern formation models can be performed efficiently using an evolution strategy (ES). As a case study we use a quantitative spatio-temporal model of the regulatory network for early development in Drosophila melanogaster. To estimate the parameters, the simulated results are compared to a time series of gene products involved in the network, obtained with immunohistochemistry. We demonstrate that a (mu,lambda)-ES can be used to find good-quality solutions in the parameter estimation. We also show that an ES with multiple populations is 5-140 times as fast as parallel simulated annealing for this case study, and that combining the ES with a local search results in an efficient parameter estimation method.
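A minimal (mu, lambda)-ES in numpy, with a toy damped-oscillation model standing in for the Drosophila gene-network simulator; 'comma' selection means parents are discarded and only the best mu of the lam offspring survive each generation:

```python
import numpy as np

def mu_lambda_es(loss, dim, mu=5, lam=30, sigma=0.3, gens=200, seed=11):
    """(mu, lambda)-ES: keep the mu best of lam offspring; parents die out."""
    rng = np.random.default_rng(seed)
    parents = rng.standard_normal((mu, dim))
    for _ in range(gens):
        idx = rng.integers(0, mu, size=lam)            # one parent per offspring
        offspring = parents[idx] + sigma * rng.standard_normal((lam, dim))
        fitness = np.array([loss(x) for x in offspring])
        parents = offspring[np.argsort(fitness)[:mu]]  # comma selection
        sigma *= 0.99                                  # simple step-size annealing
    return parents[0]

# Toy inverse problem: recover (amplitude, frequency, decay) of a damped
# oscillation from noisy samples -- a stand-in for the gene-expression series.
rng = np.random.default_rng(11)
t = np.linspace(0, 5, 50)
true = np.array([1.5, 2.0, 0.5])
model = lambda p: p[0] * np.sin(p[1] * t) * np.exp(-p[2] * t)
data = model(true) + 0.05 * rng.standard_normal(t.size)
sse = lambda p: np.sum((model(p) - data) ** 2)
# Note (a, f) and (-a, -f) give the same curve, so signs may come out flipped.
print("estimated parameters:", np.round(mu_lambda_es(sse, 3), 2))
```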

19.
Bayesian shrinkage estimation of quantitative trait loci parameters
Wang, H.; Zhang, Y. M.; Li, X.; Masinde, G. L.; Mohan, S.; Baylink, D. J.; Xu, S. Genetics (2005), 170(1), 465–480.
Mapping multiple QTL is a typical problem of variable selection in an oversaturated model because the potential number of QTL can be substantially larger than the sample size. Currently, model selection is still the most effective approach to mapping multiple QTL, although further research is needed. An alternative approach to analyzing an oversaturated model is shrinkage estimation, in which all candidate variables are included in the model but their estimated effects are forced to shrink toward zero. In contrast to the usual shrinkage estimation, where all model effects are shrunk by the same factor, we develop a Bayesian method that allows the shrinkage factor to vary across different effects. The new shrinkage method forces marker intervals that contain no QTL to have estimated effects close to zero, whereas intervals containing notable QTL have estimated effects subject to virtually no shrinkage. We demonstrate the method using both simulated and real data for QTL mapping. A simulation experiment with 500 backcross (BC) individuals showed that the method can localize closely linked QTL and QTL with effects as small as 1% of the phenotypic variance of the trait. The method was also used to map QTL responsible for wound healing in a family of 633 F2 mice derived from an (MRL/MPJ x SJL/J) cross of two inbred lines.
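The core idea, effect-specific shrinkage, can be sketched with a generic Gibbs sampler in which every coefficient carries its own prior variance (a normal/scaled-inverse-chi-square hierarchy; this is not the paper's interval-mapping sampler, and the hyperparameters nu0 and s0 are illustrative):

```python
import numpy as np

def shrinkage_gibbs(X, y, n_iter=2000, burn=500, nu0=1.0, s0=0.01, seed=12):
    """Gibbs sampler with an effect-specific prior variance per coefficient."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta, sig2_j, sig2_e = np.zeros(p), np.ones(p), 1.0
    xtx = np.sum(X ** 2, axis=0)
    resid = y - X @ beta
    draws = []
    for it in range(n_iter):
        for j in range(p):
            resid += X[:, j] * beta[j]                 # remove effect j
            v = 1.0 / (xtx[j] / sig2_e + 1.0 / sig2_j[j])
            beta[j] = rng.normal(v * (X[:, j] @ resid) / sig2_e, np.sqrt(v))
            resid -= X[:, j] * beta[j]                 # restore effect j
            # Its own variance: scaled-inverse-chi-square full conditional,
            # so large effects keep a large variance and escape shrinkage.
            sig2_j[j] = (nu0 * s0 + beta[j] ** 2) / rng.chisquare(nu0 + 1)
        sig2_e = (resid @ resid) / rng.chisquare(n)    # crude residual update
        if it >= burn:
            draws.append(beta.copy())
    return np.mean(draws, axis=0)

rng = np.random.default_rng(12)
n, p = 300, 60
X = 2.0 * rng.binomial(1, 0.5, size=(n, p)) - 1.0      # +/-1 marker codes
beta_true = np.zeros(p)
beta_true[[5, 20, 40]] = [1.0, -0.7, 0.5]
y = X @ beta_true + rng.standard_normal(n)
est = shrinkage_gibbs(X, y)
print("largest |effects| at markers:", np.argsort(-np.abs(est))[:5])
```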

20.
Patient-specific cardiac modelling can help in understanding pathophysiology and in therapy planning. However, it requires personalizing the model geometry, kinematics, electrophysiology and mechanics. Calibration aims at providing proper initial parameter values before the personalization stage, which involves solving an inverse problem. We propose a fast automatic method for calibrating the mechanical parameters of a complete electromechanical model of the heart, based on a sensitivity analysis and the Unscented Transform algorithm. A new implementation of the complete Bestel–Clement–Sorine (BCS) cardiac model is also proposed, in a modular and efficient framework. A complete sensitivity analysis is performed, revealing which observations of the volume evolution are significant for characterizing the global behaviour of the myocardium. We show that the calibration method gives satisfying results, optimizing up to 5 parameters of the BCS model in only one iteration. The method was evaluated synthetically as well as on 7 volunteers, with a mean relative error from the real data of 10%. This calibration is designed to replace manual parameter estimation as well as the initialization steps that precede automatic personalization algorithms based on images.
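The Unscented Transform ingredient is generic enough to sketch: deterministic sigma points propagate a parameter mean and covariance through a nonlinear forward model, yielding predicted output statistics that a calibration loop can match against measured volume curves. A minimal numpy version with standard weight conventions and a toy forward model (not the authors' cardiac pipeline):

```python
import numpy as np

def unscented_transform(f, mean, cov, alpha=1.0, beta=2.0, kappa=0.0):
    """Propagate (mean, cov) through a nonlinear f with 2n+1 sigma points."""
    n = len(mean)
    lam = alpha ** 2 * (n + kappa) - n
    S = np.linalg.cholesky((n + lam) * cov)
    sigma = np.vstack([mean, mean + S.T, mean - S.T])    # 2n+1 sigma points
    wm = np.full(2 * n + 1, 0.5 / (n + lam))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = lam / (n + lam) + (1.0 - alpha ** 2 + beta)
    Y = np.array([f(s) for s in sigma])                  # forward-model runs
    y_mean = wm @ Y
    dY = Y - y_mean
    return y_mean, (wc[:, None] * dY).T @ dY

# Toy forward model: two mechanical parameters -> two simulated volume features.
f = lambda p: np.array([p[0] * np.exp(-p[1]), p[0] + p[1] ** 2])
m, P = np.array([1.0, 0.5]), np.diag([0.04, 0.01])
y_mean, y_cov = unscented_transform(f, m, P)
print("predicted feature mean:", np.round(y_mean, 3))
print("predicted feature covariance:\n", np.round(y_cov, 5))
```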
