Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
Hazard regression for interval-censored data with penalized spline   Total citations: 1, self-citations: 0, citations by others: 1
Cai T  Betensky RA 《Biometrics》2003,59(3):570-579
This article introduces a new approach for estimating the hazard function for possibly interval- and right-censored survival data. We weakly parameterize the log-hazard function with a piecewise-linear spline and provide a smoothed estimate of the hazard function by maximizing the penalized likelihood through a mixed model-based approach. We also provide a method to estimate the amount of smoothing from the data. We illustrate our approach with two well-known interval-censored data sets. Extensive numerical studies are conducted to evaluate the efficacy of the new procedure.
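As an illustration of the piecewise-linear log-hazard parameterization mentioned in this abstract, here is a minimal sketch. It assumes a truncated-line basis and a ridge-type penalty on the knot coefficients; the abstract does not state the basis or the penalty, so the function names, arguments, and the penalty form are all hypothetical choices, not the authors' implementation.

```python
import math

def log_hazard(t, beta0, beta1, knots, u):
    """Piecewise-linear log-hazard: beta0 + beta1*t + sum_k u[k] * max(t - knots[k], 0).

    The truncated-line basis is an assumption made for this sketch.
    """
    value = beta0 + beta1 * t
    for knot, coef in zip(knots, u):
        value += coef * max(t - knot, 0.0)
    return value

def hazard(t, beta0, beta1, knots, u):
    """Hazard implied by the log-hazard spline (always positive)."""
    return math.exp(log_hazard(t, beta0, beta1, knots, u))

def roughness_penalty(u, smoothing):
    """Quadratic penalty on the knot coefficients (ridge form, assumed here).

    In a mixed-model reading, the u[k] would play the role of random effects.
    """
    return smoothing * sum(c * c for c in u)
```

A penalized fit would subtract `roughness_penalty` from the log-likelihood before maximizing; the likelihood itself (with interval-censored contributions) is not sketched here.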

2.
We propose an estimating function for parameters in a model for Poisson process intensity when time- or space-varying covariates are observed for both the events of the process and at sample times or locations selected from a probability-based sampling design. We investigate the large-sample properties of the proposed estimator under increasing domain asymptotics, demonstrating that it is consistent and asymptotically normally distributed. We illustrate our approach using data from an ecological momentary assessment of smoking.

3.
Numerous statistical methods have been developed for analyzing high‐dimensional data. These methods often focus on variable selection but are of limited use for testing with high‐dimensional data, and they often require explicit likelihood functions. In this article, we propose a “hybrid omnibus test” for testing with high‐dimensional data under much weaker requirements. Our hybrid omnibus test is developed under a semiparametric framework where a likelihood function is no longer necessary. Our test is a version of a frequentist‐Bayesian hybrid score‐type test for a generalized partially linear single‐index model, whose link function depends on a set of variables through a generalized partially linear single index. We propose an efficient score based on estimating equations, define local tests, and then construct our hybrid omnibus test using local tests. We compare our approach with an empirical‐likelihood ratio test and Bayesian inference based on Bayes factors, using simulation studies. Our simulation results suggest that our approach outperforms the others in terms of type I error, power, and computational cost in both the low‐ and high‐dimensional cases. The advantage of our approach is demonstrated by applying it to genetic pathway data for type II diabetes mellitus.

4.
Many small cetacean, sirenian, and pinniped species aggregate in groups of large or variable size. Accurate estimation of group sizes is essential for estimating the abundance and distribution of these species, but is challenging as individuals are highly mobile and only partially visible. We developed a Bayesian approach for estimating group sizes using wide‐angle aerial photographic or video imagery. Our approach accounts for both availability and perception bias, including a new method (analogous to distance sampling) for estimating perception bias due to small image size in wide‐angle images. We demonstrate our approach through an application to aerial survey data for an endangered population of beluga whales (Delphinapterus leucas) in Cook Inlet, Alaska. Our results strengthen understanding of variation in group size estimates and allow for probabilistic statements about the size of detected groups. Aerial surveys are a standard tool for estimating the abundance and distribution of various marine mammal species. The role of aerial photographic and video data in wildlife assessment is expected to increase substantially with the widespread uptake of unmanned aerial vehicle technology. Key aspects of our approach are relevant to group size estimation for a broad range of marine mammal, seabird, other waterfowl, and terrestrial ungulate species.

5.
In follow‐up studies, the disease event time can be subject to left truncation and right censoring. Furthermore, medical advancements have made it possible for patients to be cured of certain types of diseases. In this article, we consider a semiparametric mixture cure model for the regression analysis of left‐truncated and right‐censored data. The model combines a logistic regression for the probability of event occurrence with the class of transformation models for the time of occurrence. We investigate two techniques for estimating model parameters. The first approach is based on martingale estimating equations (EEs). The second approach is based on the conditional likelihood function given truncation variables. The asymptotic properties of both proposed estimators are established. Simulation studies indicate that the conditional maximum‐likelihood estimator (cMLE) performs well while the estimator based on EEs is very unstable even though it is shown to be consistent. This is a special and intriguing phenomenon for the EE approach under the cure model. We provide insights into this issue and find that the EE approach can be improved significantly by assigning appropriate weights to the censored observations in the EEs. This finding is useful in overcoming the instability of the EE approach in some more complicated situations, where the likelihood approach is not feasible. We illustrate the proposed estimation procedures by analyzing the age at onset of the occiput‐wall distance event for patients with ankylosing spondylitis.

6.
Zero‐truncated data arise in various disciplines where counts are observed but the zero count category cannot be observed during sampling. Maximum likelihood estimation can be used to model these data; however, due to its nonstandard form it cannot be easily implemented using well‐known software packages, and additional programming is often required. Motivated by the Rao–Blackwell theorem, we develop a weighted partial likelihood approach to estimate model parameters for zero‐truncated binomial and Poisson data. The resulting estimating function is equivalent to a weighted score function for standard count data models, and allows for applying readily available software. We evaluate the efficiency of this new approach and show that it performs almost as well as maximum likelihood estimation. The weighted partial likelihood approach is then extended to regression modelling and variable selection. We examine the performance of the proposed methods through simulation and present two case studies using real data.
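To make the "nonstandard form" of the likelihood concrete, the sketch below shows the zero-truncated Poisson probability mass function and the plain maximum likelihood estimate that the abstract uses as its benchmark. This is the standard MLE, not the weighted partial likelihood method the authors propose; the function names and the bisection solver are choices made for this illustration.

```python
import math

def ztp_pmf(y, lam):
    """P(Y = y | Y > 0) for a Poisson(lam) count, defined for y = 1, 2, ..."""
    if y < 1:
        return 0.0
    log_p = y * math.log(lam) - lam - math.lgamma(y + 1)
    # -expm1(-lam) equals 1 - exp(-lam), the probability of a non-zero count
    return math.exp(log_p) / (-math.expm1(-lam))

def ztp_mean(lam):
    """Mean of the zero-truncated Poisson: lam / (1 - exp(-lam))."""
    return lam / (-math.expm1(-lam))

def ztp_mle(counts, tol=1e-10):
    """MLE of lam: solve ztp_mean(lam) = sample mean by bisection."""
    ybar = sum(counts) / len(counts)
    if ybar <= 1.0:
        raise ValueError("sample mean must exceed 1 for a positive MLE")
    # ztp_mean is increasing, tends to 1 as lam -> 0, and exceeds lam,
    # so the root lies in (0, ybar).
    lo, hi = 1e-12, ybar
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ztp_mean(mid) < ybar:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

The score equation has no closed-form solution, which is why even this one-parameter case needs a numerical solver.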

7.
This article develops an approach to estimating population abundance from line transect surveys that uses a calibration survey to estimate the detection function, which is then employed as a weight function in constructing the abundance estimate. Nonparametric methods of estimating the detection function via local regression and via a kernel density estimator are considered. The proposed methods are evaluated using a set of Western Australian plant data and weed enumeration data.

8.
9.
Approximate likelihood ratios for general estimating functions   Total citations: 1, self-citations: 0, citations by others: 1
The method of estimating functions (Godambe, 1991) is commonly used when one desires to conduct inference about some parameters of interest but the full distribution of the observations is unknown. However, this approach may have limited utility, due to multiple roots for the estimating function, a poorly behaved Wald test, or lack of a goodness-of-fit test. This paper presents approximate likelihood ratios that can be used along with estimating functions when any of these three problems occurs. We show that the approximate likelihood ratio provides correct large sample inference under very general circumstances, including clustered data and misspecified weights in the estimating function. Two methods of constructing the approximate likelihood ratio, one based on the quasi-likelihood approach and the other based on the linear projection approach, are compared and shown to be closely related. In particular we show that quasi-likelihood is the limit of the projection approach. We illustrate the technique with two applications.

10.
We introduce a novel approach for describing patterns of HIV genetic variation using regression modeling techniques. Parameters are defined for describing genetic variation within and between viral populations by generalizing Simpson's index of diversity. Regression models are specified for these variation parameters and the generalized estimating equation framework is used for estimating both the regression parameters and their corresponding variances. Conditions are described under which the usual asymptotic approximations to the distribution of the estimators are met. This approach provides a formal statistical framework for testing hypotheses regarding the changing patterns of HIV genetic variation over time within an infected patient. The application of these methods for testing biologically relevant hypotheses concerning HIV genetic variation is demonstrated in an example using sequence data from a subset of patients from the Multicenter AIDS Cohort Study.
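For readers unfamiliar with the starting point of this abstract, the sketch below computes the classical Simpson's index of diversity: the probability that two items drawn without replacement from a sample belong to different types. It shows only the classical index, not the generalization the authors define; the function name and the use of sequence labels as input are assumptions of this example.

```python
from collections import Counter

def simpson_diversity(labels):
    """Probability that two distinct sampled items carry different labels.

    Uses the without-replacement form: 1 - sum_i n_i (n_i - 1) / (N (N - 1)).
    """
    n = len(labels)
    if n < 2:
        raise ValueError("need at least two items")
    counts = Counter(labels)
    same = sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))
    return 1.0 - same
```

For example, `simpson_diversity(["A", "A", "B", "B"])` gives 2/3: of the 12 ordered pairs of distinct items, 4 share a label.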

11.
Aylor DL  Zeng ZB 《PLoS genetics》2008,4(3):e1000029
Gene expression data has been used in lieu of phenotype in both classical and quantitative genetic settings. These two disciplines have separate approaches to measuring and interpreting epistasis, which is the interaction between alleles at different loci. We propose a framework for estimating and interpreting epistasis from a classical experiment that combines the strengths of each approach. A regression analysis step accommodates the quantitative nature of expression measurements by estimating the effect of gene deletions plus any interaction. Effects are selected by significance such that a reduced model describes each expression trait. We show how the resulting models correspond to specific hierarchical relationships between two regulator genes and a target gene. These relationships are the basic units of genetic pathways and genomic system diagrams. Our approach can be extended to analyze data from a variety of experiments, multiple loci, and multiple environments.

12.
Hong F  Li H 《Biometrics》2006,62(2):534-544
Time-course studies of gene expression are essential in biomedical research to understand biological phenomena that evolve in a temporal fashion. We introduce a functional hierarchical model for detecting temporally differentially expressed (TDE) genes between two experimental conditions for cross-sectional designs, where the gene expression profiles are treated as functional data and modeled by basis function expansions. A Monte Carlo EM algorithm was developed for estimating both the gene-specific parameters and the hyperparameters in the second level of modeling. We use a direct posterior probability approach to bound the rate of false discovery at a pre-specified level and evaluate the methods by simulations and application to microarray time-course gene expression data on Caenorhabditis elegans developmental processes. Simulation results suggested that the procedure performs better than the two-way ANOVA in identifying TDE genes, resulting in both higher sensitivity and specificity. Genes identified from the C. elegans developmental data set show clear patterns of changes between the two experimental conditions.

13.
Wang YG 《Biometrics》2004,60(3):670-675
This article develops a method for analysis of growth data with multiple recaptures when the initial ages for all individuals are unknown. The existing approaches either impute the initial ages or model them as random effects. Assumptions about the initial age are not verifiable because all the initial ages are unknown. We present an alternative approach that treats all the lengths including the length at first capture as correlated repeated measures for each individual. Optimal estimating equations are developed using the generalized estimating equations approach that only requires the first two moment assumptions. Explicit expressions for estimation of both mean growth parameters and variance components are given to minimize the computational complexity. Simulation studies indicate that the proposed method works well. Two real data sets are analyzed for illustration, one from whelks (Dicathais aegrota) and the other from southern rock lobster (Jasus edwardsii) in South Australia.

14.
M. Kirkpatrick  D. Lofsvold    M. Bulmer 《Genetics》1990,124(4):979-993
We present methods for estimating the parameters of inheritance and selection that appear in a quantitative genetic model for the evolution of growth trajectories and other "infinite-dimensional" traits that we recently introduced. Two methods for estimating the additive genetic covariance function are developed, a "full" model that fully fits the data and a "reduced" model that generates a smoothed estimate consistent with the sampling errors in the data. By decomposing the covariance function into its eigenvalues and eigenfunctions, it is possible to identify potential evolutionary changes in the population's mean growth trajectory for which there is (and those for which there is not) genetic variation. Algorithms for estimating these quantities, their confidence intervals, and for testing hypotheses about them are developed. These techniques are illustrated by an analysis of early growth in mice. Compatible methods for estimating the selection gradient function acting on growth trajectories in natural or domesticated populations are presented. We show how the estimates for the additive genetic covariance function and the selection gradient function can be used to predict the evolutionary change in a population's mean growth trajectory.

15.
Statistical inference for simultaneous clustering of gene expression data   Total citations: 1, self-citations: 0, citations by others: 1
Current methods for analysis of gene expression data are mostly based on clustering and classification of either genes or samples. We offer support for the idea that more complex patterns can be identified in the data if genes and samples are considered simultaneously. We formalize the approach and propose a statistical framework for two-way clustering. A simultaneous clustering parameter is defined as a function theta=Phi(P) of the true data generating distribution P, and an estimate is obtained by applying this function to the empirical distribution P(n). We illustrate that a wide range of clustering procedures, including generalized hierarchical methods, can be defined as parameters which are compositions of individual mappings for clustering patients and genes. This framework allows one to assess classical properties of clustering methods, such as consistency, and to formally study statistical inference regarding the clustering parameter. We present results of simulations designed to assess the asymptotic validity of different bootstrap methods for estimating the distribution of Phi(P(n)). The method is illustrated on a publicly available data set.

16.
This paper addresses the problem of estimating an age-at-death distribution or paleodemographic profile from osteological data. It is demonstrated that the classical two-stage procedure whereby one first constructs estimates of age-at-death of individual skeletons and then uses these age estimates to obtain a paleodemographic profile is not a correct approach. This is a consequence of Bayes' theorem. Instead, we demonstrate a valid approach that proceeds from the opposite starting point: given skeletal age-at-death, one first estimates the probability of assigning the skeleton into a specific osteological age-indicator stage. We show that this leads to a statistically valid method for obtaining a paleodemographic profile, and moreover, that valid individual age estimation itself requires a demographic profile and therefore is done subsequent to its construction. Individual age estimation thus becomes the last rather than the first step in the estimation procedure. A central concept of our statistical approach is that of a weight function. A weight function is associated with each osteological age-indicator stage or category, and provides the probability that a specific age indicator stage is observed, given age-at-death of the individual. We recommend that weight functions be estimated nonparametrically from a reference data set. In their entirety, the weight functions characterize the relevant stochastic properties of a chosen age indicator. For actual estimation of the paleodemographic profile, a parametric age distribution in the target sample is assumed. The maximum likelihood method is used to identify the unknown parameters of this distribution. As some components are estimated nonparametrically, one then has a semiparametric model. We show how to obtain valid estimates of individual age-at-death, confidence regions, and goodness-of-fit tests. The methods are illustrated with both real and simulated data.
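The Bayes' theorem argument in this abstract can be illustrated with a small sketch. With discrete age classes, the probability of an age given an observed indicator stage is proportional to the weight function (probability of the stage given age) times the age distribution of the target sample. The discretization, the function name, and the numbers in the usage note are hypothetical; the paper itself works with nonparametric weight functions and a parametric age distribution.

```python
def posterior_age(stage_weights, age_profile):
    """Bayes' theorem over discrete age classes.

    stage_weights[i] = P(observed stage | age class i)  (the weight function)
    age_profile[i]   = P(age class i) in the target sample
    Returns P(age class i | observed stage) for each i.
    """
    joint = [w * p for w, p in zip(stage_weights, age_profile)]
    total = sum(joint)
    if total == 0.0:
        raise ValueError("observed stage has zero probability under this profile")
    return [j / total for j in joint]
```

With weights `[0.8, 0.2]`, a uniform profile `[0.5, 0.5]` gives a posterior of `[0.8, 0.2]`, while a profile of `[0.1, 0.9]` gives roughly `[0.31, 0.69]`. The same skeleton receives a different age estimate under a different demographic profile, which is why individual age estimation has to come after the profile is constructed.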

17.
Multivariate survival data arise from case-control family studies in which the ages at disease onset for family members may be correlated. In this paper, we consider a multivariate survival model with the marginal hazard function following the proportional hazards model. We use a frailty-based approach in the spirit of Glidden and Self (1999) to account for the correlation of ages at onset among family members. Specifically, we first estimate the baseline hazard function nonparametrically by the innovation theorem, and then obtain maximum pseudolikelihood estimators for the regression and correlation parameters plugging in the baseline hazard function estimator. We establish a connection with a previously proposed generalized estimating equation-based approach. Simulation studies and an analysis of case-control family data of breast cancer illustrate the methodology's practical utility.

18.
On parameter estimation in population models   Total citations: 2, self-citations: 0, citations by others: 2
We describe methods for estimating the parameters of Markovian population processes in continuous time, thus increasing their utility in modelling real biological systems. A general approach, applicable to any finite-state continuous-time Markovian model, is presented, and this is specialised to a computationally more efficient method applicable to a class of models called density-dependent Markov population processes. We illustrate the versatility of both approaches by estimating the parameters of the stochastic SIS logistic model from simulated data. This model is also fitted to data from a population of Bay checkerspot butterfly (Euphydryas editha bayensis), allowing us to assess the viability of this population.
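The "simulated data" mentioned in this abstract could be produced along the following lines. This is a minimal sketch of a Gillespie-style forward simulation of a stochastic SIS logistic model, assuming a birth-death chain on 0..N with up-rate lam*n*(1 - n/N) and down-rate mu*n. That parameterization, the function name, and the arguments are assumptions of this example, and the sketch covers simulation only, not the estimation methods the paper describes.

```python
import random

def simulate_sis(n0, N, lam, mu, t_max, seed=0):
    """Simulate one path of a continuous-time SIS logistic chain.

    Returns a list of (time, state) pairs, starting at (0.0, n0).
    The rates used here are an assumed parameterization.
    """
    rng = random.Random(seed)
    t, n = 0.0, n0
    path = [(t, n)]
    while n > 0:  # state 0 is absorbing (extinction)
        up = lam * n * (1.0 - n / N)   # vanishes at n = N, so n never exceeds N
        down = mu * n
        total = up + down
        t += rng.expovariate(total)    # exponential waiting time to the next event
        if t > t_max:
            break
        n += 1 if rng.random() * total < up else -1
        path.append((t, n))
    return path
```

A call such as `simulate_sis(10, 50, 1.6, 1.0, 5.0)` returns a jump path whose states move by one unit at a time and stay within 0..50.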

19.
Multivariate recurrent event data are usually encountered in many clinical and longitudinal studies in which each study subject may experience multiple recurrent events. For the analysis of such data, most existing approaches have been proposed under the assumption that the censoring times are noninformative, which may not be true especially when the observation of recurrent events is terminated by a failure event. In this article, we consider regression analysis of multivariate recurrent event data with both time‐dependent and time‐independent covariates where the censoring times and the recurrent event process are allowed to be correlated via a frailty. The proposed joint model is flexible where both the distributions of censoring and frailty variables are left unspecified. We propose a pairwise pseudolikelihood approach and an estimating equation‐based approach for estimating coefficients of time‐dependent and time‐independent covariates, respectively. The large sample properties of the proposed estimates are established, while the finite‐sample properties are demonstrated by simulation studies. The proposed methods are applied to the analysis of a set of bivariate recurrent event data from a study of platelet transfusion reactions.

20.
Zhu B  Song PX  Taylor JM 《Biometrics》2011,67(4):1295-1304
This article presents a new modeling strategy in functional data analysis. We consider the problem of estimating an unknown smooth function given functional data with noise. The unknown function is treated as the realization of a stochastic process, which is incorporated into a diffusion model. The method of smoothing spline estimation is connected to a special case of this approach. The resulting models offer great flexibility to capture the dynamic features of functional data, and allow straightforward and meaningful interpretation. The likelihood of the models is derived with Euler approximation and data augmentation. A unified Bayesian inference method is carried out via a Markov chain Monte Carlo algorithm including a simulation smoother. The proposed models and methods are illustrated on some prostate-specific antigen data, where we also show how the models can be used for forecasting.


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  ICP registration: 京ICP备09084417号