1.

Semiparametric regression for periodic longitudinal hormone data from multiple menstrual cycles

Zhang D, Lin X, Sowers M 《Biometrics》2000,56(1):31-39

We consider semiparametric regression for periodic longitudinal data. Parametric fixed effects are used to model the covariate effects and a periodic nonparametric smooth function is used to model the time effect. The within-subject correlation is modeled using subject-specific random effects and a random stochastic process with a periodic variance function. We use maximum penalized likelihood to estimate the regression coefficients and the periodic nonparametric time function, whose estimator is shown to be a periodic cubic smoothing spline. We use restricted maximum likelihood to simultaneously estimate the smoothing parameter and the variance components. We show that all model parameters can be easily obtained by fitting a linear mixed model. A common problem in the analysis of longitudinal data is to compare the time profiles of two groups, e.g., between treatment and placebo. We develop a scaled chi-squared test for the equality of two nonparametric time functions. The proposed model and the test are illustrated by analyzing hormone data collected during two consecutive menstrual cycles and their performance is evaluated through simulations.

2.

Parametric regression on cumulative incidence function

Jeong JH, Fine JP 《Biostatistics (Oxford, England)》2007,8(2):184-196

We propose parametric regression analysis of cumulative incidence function with competing risks data. A simple form of Gompertz distribution is used for the improper baseline subdistribution of the event of interest. Maximum likelihood inferences on regression parameters and associated cumulative incidence function are developed for parametric models, including a flexible generalized odds rate model. Estimation of the long-term proportion of patients with cause-specific events is straightforward in the parametric setting. Simple goodness-of-fit tests are discussed for evaluating a fixed odds rate assumption. The parametric regression methods are compared with an existing semiparametric regression analysis on a breast cancer data set where the cumulative incidence of recurrence is of interest. The results demonstrate that the likelihood-based parametric analyses for the cumulative incidence function are a practically useful alternative to the semiparametric analyses.
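
As a rough illustration of why the long-term proportion is easy to read off in a parametric setting, the sketch below evaluates an improper Gompertz-type cumulative incidence curve. The parameterization F(t) = 1 - exp{tau * (1 - exp(rho * t)) / rho} with rho < 0, the function names, and the parameter values are assumptions made for illustration; the abstract does not state them.

```python
import math


def gompertz_cif(t, rho, tau):
    """Improper Gompertz-type cumulative incidence at time t (assumed form)."""
    if rho >= 0 or tau <= 0:
        raise ValueError("this sketch assumes rho < 0 and tau > 0")
    # With rho < 0 the exponent stays bounded, so the curve levels off below 1.
    return 1.0 - math.exp(tau * (1.0 - math.exp(rho * t)) / rho)


def long_term_proportion(rho, tau):
    """Plateau of the curve as t grows without bound: 1 - exp(tau / rho)."""
    if rho >= 0 or tau <= 0:
        raise ValueError("this sketch assumes rho < 0 and tau > 0")
    return 1.0 - math.exp(tau / rho)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to show the plateau.
    for t in (0.0, 1.0, 5.0, 20.0):
        print(t, round(gompertz_cif(t, rho=-0.5, tau=0.2), 4))
    print("plateau", round(long_term_proportion(rho=-0.5, tau=0.2), 4))
```

Under this assumed form, the plateau depends only on the ratio tau / rho, which is what makes the long-term proportion a closed-form function of the fitted parameters.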

3.

Bayesian data integration and variable selection for pan-cancer survival prediction using protein expression data

Arnab Kumar Maity, Anirban Bhattacharya, Bani K. Mallick, Veerabhadran Baladandayuthapani 《Biometrics》2020,76(1):316-325

Accurate prognostic prediction using molecular information is a challenging area of research, which is essential to develop precision medicine. In this paper, we develop translational models to identify major actionable proteins that are associated with clinical outcomes, like the survival time of patients. There are considerable statistical and computational challenges due to the large dimension of the problems. Furthermore, data are available for different tumor types; hence data integration for various tumors is desirable. Having censored survival outcomes adds another level of complexity to the inferential procedure. We develop Bayesian hierarchical survival models, which accommodate all the challenges mentioned here. We use the hierarchical Bayesian accelerated failure time model for survival regression. Furthermore, we assume a sparse horseshoe prior distribution for the regression coefficients to identify the major proteomic drivers. We borrow strength across tumor groups by introducing a correlation structure among the prior distributions. The proposed methods have been used to analyze data from the recently curated “The Cancer Proteome Atlas” (TCPA), which contains reverse-phase protein arrays–based high-quality protein expression data as well as detailed clinical annotation, including survival times. Our simulation and the TCPA data analysis illustrate the efficacy of the proposed integrative model, which links different tumors with the correlated prior structures.

4.

FLCRM: Functional linear cox regression model

Dehan Kong, Joseph G. Ibrahim, Eunjee Lee, Hongtu Zhu 《Biometrics》2018,74(1):109-117

Summary

We consider a functional linear Cox regression model for characterizing the association between time‐to‐event data and a set of functional and scalar predictors. The functional linear Cox regression model incorporates a functional principal component analysis for modeling the functional predictors and a high‐dimensional Cox regression model to characterize the joint effects of both functional and scalar predictors on the time‐to‐event data. We develop an algorithm to calculate the maximum approximate partial likelihood estimates of unknown finite and infinite dimensional parameters. We also systematically investigate the rate of convergence of the maximum approximate partial likelihood estimates and a score test statistic for testing the nullity of the slope function associated with the functional predictors. We demonstrate our estimation and testing procedures by using simulations and the analysis of the Alzheimer's Disease Neuroimaging Initiative (ADNI) data. Our real data analyses show that high‐dimensional hippocampus surface data may be an important marker for predicting time to conversion to Alzheimer's disease. Data used in the preparation of this article were obtained from the ADNI database (adni.loni.usc.edu).

5.

A Bayesian Semiparametric Survival Model with Longitudinal Markers

Song Zhang, Peter Müller, Kim‐Anh Do 《Biometrics》2010,66(2):435-443

Summary: We consider inference for data from a clinical trial of treatments for metastatic prostate cancer. Patients joined the trial with diverse prior treatment histories. The resulting heterogeneous patient population gives rise to challenging statistical inference problems when trying to predict time to progression on different treatment arms. Inference is further complicated by the need to include a longitudinal marker as a covariate. To address these challenges, we develop a semiparametric model for joint inference of longitudinal data and an event time. The proposed approach includes the possibility of cure for some patients. The event time distribution is based on a nonparametric Pólya tree prior. For the longitudinal data we assume a mixed effects model. Incorporating a regression on covariates in a nonparametric event time model in general, and for a Pólya tree model in particular, is a challenging problem. We exploit the fact that the covariate itself is a random variable. We achieve an implementation of the desired regression by factoring the joint model for the event time and the longitudinal outcome into a marginal model for the event time and a regression of the longitudinal outcomes on the event time, i.e., we implicitly model the desired regression by modeling the reverse conditional distribution.

6.

Inference methods for the conditional logistic regression model with longitudinal data

Craiu RV, Duchesne T, Fortin D 《Biometrical journal. Biometrische Zeitschrift》2008,50(1):97-109

This paper considers inference methods for case-control logistic regression in longitudinal setups. The motivation is provided by an analysis of plains bison spatial location as a function of habitat heterogeneity. The sampling is done according to a longitudinal matched case-control design in which, at certain time points, exactly one case, the actual location of an animal, is matched to a number of controls, the alternative locations that could have been reached. We develop inference methods for the conditional logistic regression model in this setup, which can be formulated within a generalized estimating equation (GEE) framework. This permits the use of statistical techniques developed for GEE-based inference, such as robust variance estimators and model selection criteria adapted for non-independent data. The performance of the methods is investigated in a simulation study and illustrated with the bison data analysis.

7.

A weighted partial likelihood approach for zero‐truncated models

Wen‐Han Hwang, Dean Heinze, Jakub Stoklosa 《Biometrical journal. Biometrische Zeitschrift》2019,61(4):1073-1087

Zero‐truncated data arise in various disciplines where counts are observed but the zero count category cannot be observed during sampling. Maximum likelihood estimation can be used to model these data; however, due to its nonstandard form it cannot be easily implemented using well‐known software packages, and additional programming is often required. Motivated by the Rao–Blackwell theorem, we develop a weighted partial likelihood approach to estimate model parameters for zero‐truncated binomial and Poisson data. The resulting estimating function is equivalent to a weighted score function for standard count data models, and allows for applying readily available software. We evaluate the efficiency of this new approach and show that it performs almost as well as maximum likelihood estimation. The weighted partial likelihood approach is then extended to regression modelling and variable selection. We examine the performance of the proposed methods through simulation and present two case studies using real data.
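
To make the "additional programming" concrete, here is a minimal sketch of plain maximum likelihood for an intercept-only zero-truncated Poisson model. It is the baseline that the abstract compares against, not the authors' weighted partial likelihood method. For this model the likelihood equation reduces to sample mean = lambda / (1 - exp(-lambda)), which the sketch solves by bisection; the function names and the example counts are hypothetical.

```python
import math


def ztp_mean(lam):
    """Mean of a zero-truncated Poisson with rate lam: lam / (1 - exp(-lam))."""
    # -expm1(-lam) equals 1 - exp(-lam) but stays accurate for small lam.
    return lam / -math.expm1(-lam)


def ztp_mle(counts, tol=1e-10):
    """Maximum likelihood estimate of lam from strictly positive counts."""
    if any(c < 1 for c in counts):
        raise ValueError("zero-truncated data must contain only counts >= 1")
    sample_mean = sum(counts) / len(counts)
    if sample_mean <= 1.0:
        raise ValueError("the MLE does not exist when every count equals 1")
    # ztp_mean is increasing and ztp_mean(lam) > lam, so the root is in (0, mean).
    lo, hi = 0.0, sample_mean
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if ztp_mean(mid) < sample_mean:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    counts = [1, 2, 3, 2, 1, 4]  # hypothetical observed counts, zeros unobservable
    print(ztp_mle(counts))
```

The naive estimate (the raw sample mean) is always larger than this estimate, because the unobserved zeros inflate the observed mean.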

8.

Parametric mode regression for bounded responses

Haiming Zhou, Xianzheng Huang, for the Alzheimer's Disease Neuroimaging Initiative 《Biometrical journal. Biometrische Zeitschrift》2020,62(7):1791-1809

We propose new parametric frameworks of regression analysis with the conditional mode of a bounded response as the focal point of interest. Covariate effects estimation and prediction based on the maximum likelihood method under two new classes of regression models are demonstrated. We also develop graphical and numerical diagnostic tools to detect various sources of model misspecification. Predictions based on different central tendency measures inferred using various regression models are compared using synthetic data in simulations. Finally, we conduct regression analysis for data from the Alzheimer's Disease Neuroimaging Initiative to demonstrate practical implementation of the proposed methods. Supporting Information containing technical details and additional simulation and data analysis results is available online.

9.

Semiparametric transformation models for interval‐censored data in the presence of a cure fraction

Chyong‐Mei Chen, Pao‐sheng Shen, Wei‐Lun Huang 《Biometrical journal. Biometrische Zeitschrift》2019,61(1):203-215

Mixed case interval‐censored data arise when the event of interest is known only to occur within an interval induced by a sequence of random examination times. Such data are commonly encountered in disease research with longitudinal follow‐up. Furthermore, medical treatment has progressed over the last decade, with an increasing proportion of patients being cured for many types of diseases. Thus, interest has grown in cure models for survival data, which hypothesize that a certain proportion of subjects in the population are not expected to experience the events of interest. In this article, we consider a two‐component mixture cure model for regression analysis of mixed case interval‐censored data. The first component is a logistic regression model that describes the cure rate, and the second component is a semiparametric transformation model that describes the distribution of event time for the uncured subjects. We propose semiparametric maximum likelihood estimation for the considered model. We develop an EM type algorithm for obtaining the semiparametric maximum likelihood estimators (SPMLE) of regression parameters and establish their consistency, efficiency, and asymptotic normality. Extensive simulation studies indicate that the SPMLE performs satisfactorily in a wide variety of settings. The proposed method is illustrated by the analysis of the hypobaric decompression sickness data from the National Aeronautics and Space Administration.

10.

Inference for constrained estimation of tumor size distributions

Ghosh D, Banerjee M, Biswas P 《Biometrics》2008,64(4):1009-1017

Summary: In order to develop better treatment and screening programs for cancer prevention, it is important to be able to understand the natural history of the disease and what factors affect its progression. We focus on a particular framework first outlined by Kimmel and Flehinger (1991, Biometrics, 47, 987-1004) and in particular one of their limiting scenarios for analysis. Using an equivalence with a binary regression model, we characterize the nonparametric maximum likelihood estimation procedure for estimation of the tumor size distribution function and give associated asymptotic results. Extensions to semiparametric models and missing data are also described. Application to data from two cancer studies is used to illustrate the finite-sample behavior of the procedure.

11.

Penalized Bregman divergence for large-dimensional regression and classification

Zhang C, Jiang Y, Chai Y 《Biometrika》2010,97(3):551-566

Regularization methods are characterized by loss functions measuring data fits and penalty terms constraining model parameters. The commonly used quadratic loss is not suitable for classification with binary responses, whereas the loglikelihood function is not readily applicable to models where the exact distribution of observations is unknown or not fully specified. We introduce the penalized Bregman divergence by replacing the negative loglikelihood in the conventional penalized likelihood with Bregman divergence, which encompasses many commonly used loss functions in the regression analysis, classification procedures and machine learning literature. We investigate new statistical properties of the resulting class of estimators with the number p(n) of parameters either diverging with the sample size n or even nearly comparable with n, and develop statistical inference tools. It is shown that the resulting penalized estimator, combined with appropriate penalties, achieves the same oracle property as the penalized likelihood estimator, but asymptotically does not rely on the complete specification of the underlying distribution. Furthermore, the choice of loss function in the penalized classifiers has an asymptotically relatively negligible impact on classification performance. We illustrate the proposed method for quasilikelihood regression and binary classification with simulation evaluation and real-data application.
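
The claim that Bregman divergence encompasses many common loss functions can be checked numerically. The sketch below uses the textbook convex-generator form D(a, b) = phi(a) - phi(b) - phi'(b)(a - b), which may differ in sign convention from the paper's own notation; the function names and the two generators are chosen here for illustration only.

```python
import math


def bregman(phi, dphi, a, b):
    """Bregman divergence D(a, b) for a convex generator phi with derivative dphi."""
    return phi(a) - phi(b) - dphi(b) * (a - b)


def square(x):
    """Generator x**2, which yields the quadratic loss (a - b)**2."""
    return x * x


def square_deriv(x):
    return 2.0 * x


def neg_entropy(p):
    """Negative Bernoulli entropy on (0, 1); yields the Bernoulli KL divergence."""
    return p * math.log(p) + (1.0 - p) * math.log(1.0 - p)


def neg_entropy_deriv(p):
    return math.log(p / (1.0 - p))


if __name__ == "__main__":
    print(bregman(square, square_deriv, 3.0, 1.0))            # quadratic loss
    print(bregman(neg_entropy, neg_entropy_deriv, 0.5, 0.25))  # KL divergence
```

Swapping the generator changes the loss while the rest of a penalized fitting routine could stay the same, which is the sense in which one framework covers both regression and classification losses.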

12.

Regression modeling of ordinal data with nonzero baselines

Xie M, Simpson DG 《Biometrics》1999,55(1):308-316

This paper develops regression models for ordinal data with nonzero control response probabilities. The models are especially useful in dose-response studies where the spontaneous or natural response rate is nonnegligible and the dosage is logarithmic. These models generalize Abbott's formula, which has been commonly used to model binary data with nonzero background observations. We describe a biologically plausible latent structure and develop an EM algorithm for fitting the models. The EM algorithm can be implemented using standard software for ordinal regression. A toxicology data set where the proposed model fits the data but a more conventional model fails is used to illustrate the methodology.
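
For readers unfamiliar with Abbott's formula, the sketch below shows the binary case that the paper generalizes: the observed response probability is P = c + (1 - c) * F(dose), where c is the background response rate. Taking F to be a logistic curve in log-dose is an assumption made here for illustration, as are the function name and the parameter values; the ordinal extension in the paper is not reproduced.

```python
import math


def abbott_response(dose, background, intercept, slope):
    """Binary response probability under Abbott's formula.

    background is the spontaneous response rate c; the dose effect F is an
    assumed logistic curve in log(dose).
    """
    if dose <= 0:
        raise ValueError("log-dose is undefined for dose <= 0")
    if not 0.0 <= background < 1.0:
        raise ValueError("background must lie in [0, 1)")
    dose_effect = 1.0 / (1.0 + math.exp(-(intercept + slope * math.log(dose))))
    # Subjects respond either spontaneously or, failing that, through the dose.
    return background + (1.0 - background) * dose_effect


if __name__ == "__main__":
    # Hypothetical parameters: 10% background rate, unit slope on log-dose.
    for dose in (0.01, 0.1, 1.0, 10.0):
        print(dose, round(abbott_response(dose, 0.1, 0.0, 1.0), 4))
```

As the dose shrinks toward zero the probability approaches the background rate rather than zero, which is the "nonzero baseline" of the title.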

13.

Grouped generalized estimating equations for longitudinal data analysis

Tsubasa Ito, Shonosuke Sugasawa 《Biometrics》2023,79(3):1868-1879

The generalized estimating equation (GEE) is widely adopted for regression modeling of longitudinal data, taking account of potential correlations within the same subjects. Although the standard GEE assumes common regression coefficients among all the subjects, such an assumption may not be realistic when there is potential heterogeneity in regression coefficients among subjects. In this paper, we develop a flexible and interpretable approach, called grouped GEE analysis, for modeling longitudinal data while allowing heterogeneity in regression coefficients. The proposed method assumes that the subjects are divided into a finite number of groups and that subjects within the same group share the same regression coefficient. We provide a simple algorithm for grouping subjects and estimating the regression coefficients simultaneously, and show the asymptotic properties of the proposed estimator. The number of groups can be determined by the cross validation with averaging method. We demonstrate the proposed method through simulation studies and an application to a real data set.

14.

It's all relative: Regression analysis with compositional predictors

Gen Li, Yan Li, Kun Chen 《Biometrics》2023,79(2):1318-1329

Compositional data reside in a simplex and measure fractions or proportions of parts to a whole. Most existing regression methods for such data rely on log-ratio transformations that are inadequate or inappropriate in modeling high-dimensional data with excessive zeros and hierarchical structures. Moreover, such models usually lack a straightforward interpretation due to the interrelation between parts of a composition. We develop a novel relative-shift regression framework that directly uses proportions as predictors. The new framework provides a paradigm shift for regression analysis with compositional predictors and offers a superior interpretation of how shifting concentration between parts affects the response. New equi-sparsity and tree-guided regularization methods and an efficient smoothing proximal gradient algorithm are developed to facilitate feature aggregation and dimension reduction in regression. A unified finite-sample prediction error bound is derived for the proposed regularized estimators. We demonstrate the efficacy of the proposed methods in extensive simulation studies and a real gut microbiome study. Guided by the taxonomy of the microbiome data, the framework identifies important taxa at different taxonomic levels associated with the neurodevelopment of preterm infants.

15.

High circulating vascular endothelial growth factor (VEGF) is related to a better systolic function in diabetic hypertensive patients

Iacobellis G, Cipriani R, Gabriele A, Di Mario U, Morano S 《Cytokine》2004,27(1):25-30

BACKGROUND: VEGF seems to have a protective role on cardiac microcirculation, but no data are available on its action on cardiac function and morphology in diabetic patients. We sought to test the hypothesis that circulating VEGF levels could influence the cardiac performance in type 2 diabetic hypertensive patients. METHODS: We studied 30 patients with type 2 diabetes and hypertension, without severe cardiac, retinal, renal and peripheral vascular damage. Ten non-diabetic hypertensive patients represented the control group. VEGF plasma levels (ELISA) and echocardiographic parameters were evaluated. RESULTS: Diabetic patients had VEGF plasma levels higher than hypertensive non-diabetic subjects [median 82 (IQR 12-190) vs 50.5 (IQR 28-77) pg/mL, p=0.05]. Simple linear regression analysis showed that VEGF levels are related to relative wall thickness (RWT) and both endocardial and midwall systolic parameters in the diabetic patients. Multiple linear regression analysis showed that RWT and ejection fraction (EF) were the only independent correlates of VEGF (r²=0.274, p=0.03, p=0.05; respectively). CONCLUSIONS: Our data showed that high VEGF plasma levels are associated with a better systolic function in diabetic hypertensive patients with cardiac remodeling. VEGF may play a role in the improvement of cardiac performance in diabetes.

16.

Mixed effect regression analysis for a cluster-based two-stage outcome-auxiliary-dependent sampling design with a continuous outcome

Xu W, Zhou H 《Biostatistics (Oxford, England)》2012,13(4):650-664

Two-stage design is a well-known cost-effective way for conducting biomedical studies when the exposure variable is expensive or difficult to measure. Recent research development further allowed one or both stages of the two-stage design to be outcome dependent on a continuous outcome variable. This outcome-dependent sampling feature enables further efficiency gain in parameter estimation and overall cost reduction of the study (e.g. Wang, X. and Zhou, H., 2010. Design and inference for cancer biomarker study with an outcome and auxiliary-dependent subsampling. Biometrics 66, 502-511; Zhou, H., Song, R., Wu, Y. and Qin, J., 2011. Statistical inference for a two-stage outcome-dependent sampling design with a continuous outcome. Biometrics 67, 194-202). In this paper, we develop a semiparametric mixed effect regression model for data from a two-stage design where the second-stage data are sampled with an outcome-auxiliary-dependent sample (OADS) scheme. Our method allows the cluster- or center-effects of the study subjects to be accounted for. We propose an estimated likelihood function to estimate the regression parameters. A simulation study indicates that greater study efficiency gains can be achieved under the proposed two-stage OADS design with center-effects when compared with alternative sampling schemes. We illustrate the proposed method by analyzing a dataset from the Collaborative Perinatal Project.

17.

Comparison of Regression Methods for Modeling Intensive Care Length of Stay

Ilona W. M. Verburg, Nicolette F. de Keizer, Evert de Jonge, Niels Peek 《PloS one》2014,9(10)

Intensive care units (ICUs) are increasingly interested in assessing and improving their performance. ICU Length of Stay (LoS) could be seen as an indicator for efficiency of care. However, little consensus exists on which prognostic method should be used to adjust ICU LoS for case-mix factors. This study compared the performance of different regression models when predicting ICU LoS. We included data from 32,667 unplanned ICU admissions to ICUs participating in the Dutch National Intensive Care Evaluation (NICE) in the year 2011. We predicted ICU LoS using eight regression models: ordinary least squares regression on untransformed ICU LoS, LoS truncated at 30 days and log-transformed LoS; a generalized linear model with a Gaussian distribution and a logarithmic link function; Poisson regression; negative binomial regression; Gamma regression with a logarithmic link function; and the original and recalibrated APACHE IV model, for all patients together and for survivors and non-survivors separately. We assessed the predictive performance of the models using bootstrapping and the squared Pearson correlation coefficient (R²), root mean squared prediction error (RMSPE), mean absolute prediction error (MAPE) and bias. The distribution of ICU LoS was skewed to the right with a median of 1.7 days (interquartile range 0.8 to 4.0) and a mean of 4.2 days (standard deviation 7.9). The predictive performance of the models was between 0.09 and 0.20 for R², between 7.28 and 8.74 days for RMSPE, between 3.00 and 4.42 days for MAPE and between −2.99 and 1.64 days for bias. The predictive performance was slightly better for survivors than for non-survivors. We were disappointed in the predictive performance of the regression models and conclude that it is difficult to predict LoS of unplanned ICU admissions using patient characteristics at admission time only.
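
The four performance measures named in the abstract can be computed as in the sketch below. The function name and the toy data are hypothetical, and defining bias as the mean of (predicted - observed) is an assumption: the abstract does not state the sign convention. The bootstrap step used in the study is not reproduced.

```python
import math


def prediction_metrics(observed, predicted):
    """Return squared Pearson correlation, RMSPE, MAPE and bias as a dict."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    if var_obs == 0 or var_pred == 0:
        raise ValueError("correlation is undefined for constant input")
    # Assumed sign convention: positive bias means over-prediction.
    errors = [p - o for o, p in zip(observed, predicted)]
    return {
        "r2": cov ** 2 / (var_obs * var_pred),
        "rmspe": math.sqrt(sum(e * e for e in errors) / n),
        "mape": sum(abs(e) for e in errors) / n,
        "bias": sum(errors) / n,
    }


if __name__ == "__main__":
    # Toy lengths of stay in days; not data from the study.
    print(prediction_metrics([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0]))
```

The toy example also shows why several measures are reported together: predictions that are uniformly one day too long have a perfect R² of 1 while RMSPE, MAPE and bias all equal 1 day.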

18.

Asynchronous functional linear regression models for longitudinal data in reproducing kernel Hilbert space

Ting Li, Huichen Zhu, Tengfei Li, Hongtu Zhu 《Biometrics》2023,79(3):1880-1895

Motivated by the analysis of longitudinal neuroimaging studies, we study the longitudinal functional linear regression model under the asynchronous data setting for modeling the association between clinical outcomes and functional (or imaging) covariates. In the asynchronous data setting, both covariates and responses may be measured at irregular and mismatched time points, posing methodological challenges to existing statistical methods. We develop a kernel weighted loss function with roughness penalty to obtain the functional estimator and derive its representer theorem. The rate of convergence, a Bahadur representation, and the asymptotic pointwise distribution of the functional estimator are obtained under the reproducing kernel Hilbert space framework. We propose a penalized likelihood ratio test to test the nullity of the functional coefficient, derive its asymptotic distribution under the null hypothesis, and investigate the separation rate under the alternative hypotheses. Simulation studies are conducted to examine the finite-sample performance of the proposed procedure. We apply the proposed methods to the analysis of multitype data obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) study, which reveals a significant association between 21 regional brain volume density curves and cognitive function. Data used in preparation of this paper were obtained from the ADNI database (adni.loni.usc.edu).

19.

Transformations of covariates for longitudinal data

Thompson WK, Xie M, White HR 《Biostatistics (Oxford, England)》2003,4(3):353-364

This paper develops a general approach for dealing with parametric transformations of covariates for longitudinal data, where the responses are modeled marginally and generalized estimating equations (GEEs) are used for estimation of regression parameters. We propose an iterative algorithm for obtaining regression and transformation parameters from estimating equations, utilizing existing software for GEE problems. The algorithmic technique is closely related to that used in the Box-Tidwell transformation in classical linear regression, but we develop it under the GEE setting and for more general transformation functions. We provide supporting theorems for consistency and asymptotic Normality of the estimates. Inference between two nested models is also considered. This methodology is applied to two data sets. One consists of pill dissolution data, the other is taken from the Pittsburgh Youth Study (PYS). The PYS is a prospective longitudinal study of the development of delinquency, substance use, and mental health in male youth. We use the model-based parametric approach to examine the association between alcohol use at an early stage of adolescent development and delinquency over the course of adolescence.

20.

Blocked arteries and multivariate regression.

D F Percy 《Biometrics》1992,48(3):683-693

Ultrasound blood flow waveforms may be used in the diagnosis of arterial occlusive disease in human legs. We develop a statistical model to predict disease severity, conditional on the ultrasound data and some training data. It belongs to the class of models known as seemingly unrelated regressions, for which the Bayesian predictive density function cannot be evaluated analytically. Allowing for missing components of response vectors in the training data, we describe a first-order approximation to the predictive density, based on a Bayes estimate of the precision matrix. This approximation is then used to generate cross-validated predictions of disease severity in a set of 31 patients. We conclude with a discussion of the results.