Similar articles
 Found 20 similar articles; search took 15 ms
1.
Wang YG  Zhao Y 《Biometrics》2008,64(1):39-45
Summary. We consider rank-based regression models for clustered data analysis. A weighted Wilcoxon rank method is proposed to take account of within-cluster correlations and varying cluster sizes. The asymptotic normality of the resulting estimators is established. A method to estimate covariance of the estimators is also given, which can bypass estimation of the density function. Simulation studies are carried out to compare different estimators for a number of scenarios on the correlation structure, presence/absence of outliers and different correlation values. The proposed methods appear to perform well; in particular, the one incorporating the correlation in the weighting achieves the highest efficiency and robustness against misspecification of correlation structure and outliers. A real example is provided for illustration.

2.
Yi GY  He W 《Biometrics》2009,65(2):618-625
Summary. Recently, median regression models have received increasing attention. When continuous responses follow a distribution that is quite different from a normal distribution, usual mean regression models may fail to produce efficient estimators whereas median regression models may perform satisfactorily. In this article, we discuss using median regression models to deal with longitudinal data with dropouts. Weighted estimating equations are proposed to estimate the median regression parameters for incomplete longitudinal data, where the weights are determined by modeling the dropout process. Consistency and the asymptotic distribution of the resultant estimators are established. The proposed method is used to analyze a longitudinal data set arising from a controlled trial of HIV disease (Volberding et al., 1990, The New England Journal of Medicine 322, 941–949). Simulation studies are conducted to assess the performance of the proposed method under various situations. An extension to estimation of the association parameters is outlined.

3.
Wang P  Puterman ML  Cockburn I  Le N 《Biometrics》1996,52(2):381-400
This paper studies a class of Poisson mixture models that includes covariates in rates. This model contains Poisson regression and independent Poisson mixtures as special cases. Estimation methods based on the EM and quasi-Newton algorithms, properties of these estimates, a model selection procedure, residual analysis, and goodness-of-fit test are discussed. A Monte Carlo study investigates implementation and model choice issues. This methodology is used to analyze seizure frequency and Ames salmonella assay data.
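As a rough illustration of the EM idea mentioned in this abstract, the sketch below fits the simplest special case it names, an independent two-component Poisson mixture with no covariates. It is not the authors' implementation: the function names, the starting values, the fixed iteration count, and the small lower bound on the rates are all assumptions made for the example.

```python
import math

def poisson_pmf(k, lam):
    # Poisson probability computed in log space for numerical stability.
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def em_poisson_mixture(counts, lam1=1.0, lam2=5.0, weight=0.5, iters=200):
    """EM for a two-component Poisson mixture (hypothetical helper).

    weight is the mixing proportion of component 1; lam1 and lam2 are
    the component rates. Starting values are illustrative assumptions.
    """
    n = len(counts)
    for _ in range(iters):
        # E-step: posterior probability that each count came from component 1.
        resp = []
        for k in counts:
            a = weight * poisson_pmf(k, lam1)
            b = (1.0 - weight) * poisson_pmf(k, lam2)
            resp.append(a / (a + b))
        # M-step: update the mixing proportion and the weighted mean rates.
        s = sum(resp)
        weight = s / n
        lam1 = max(sum(r * k for r, k in zip(resp, counts)) / s, 1e-12)
        lam2 = max(sum((1.0 - r) * k for r, k in zip(resp, counts)) / (n - s), 1e-12)
    return weight, lam1, lam2

# Made-up counts with an apparent low-rate and high-rate group.
counts = [0, 1, 0, 2, 1, 0, 1, 9, 11, 10, 8, 12, 10, 9]
weight, lam1, lam2 = em_poisson_mixture(counts)
print(weight, lam1, lam2)
```

Adding covariates to the rates, as the paper does, would replace the closed-form M-step for each rate with a weighted Poisson regression fit; that step is omitted here.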

4.
The choice of an appropriate family of linear models for the analysis of longitudinal data is often a matter of concern for practitioners. To attenuate such difficulties, we discuss some issues that emerge when analyzing this type of data via a practical example involving pretest–posttest longitudinal data. In particular, we consider log‐normal linear mixed models (LNLMM), generalized linear mixed models (GLMM), and models based on generalized estimating equations (GEE). We show how some special features of the data, like a nonconstant coefficient of variation, may be handled in the three approaches and evaluate their performance with respect to the magnitude of standard errors of interpretable and comparable parameters. We also show how different diagnostic tools may be employed to identify outliers and comment on available software. We conclude by noting that the results are similar, but that GEE‐based models may be preferable when the goal is to compare the marginal expected responses.

5.
Qin LX  Self SG 《Biometrics》2006,62(2):526-533
Identification of differentially expressed genes and clustering of genes are two important and complementary objectives addressed with gene expression data. For the differential expression question, many "per-gene" analytic methods have been proposed. These methods can generally be characterized as using a regression function to independently model the observations for each gene; various adjustments for multiplicity are then used to interpret the statistical significance of these per-gene regression models over the collection of genes analyzed. Motivated by this common structure of per-gene models, we proposed a new model-based clustering method--the clustering of regression models method, which groups genes that share a similar relationship to the covariate(s). This method provides a unified approach for a family of clustering procedures and can be applied for data collected with various experimental designs. In addition, when combined with per-gene methods for assessing differential expression that employ the same regression modeling structure, an integrated framework for the analysis of microarray data is obtained. The proposed methodology was applied to two microarray data sets, one from a breast cancer study and the other from a yeast cell cycle study.

6.
7.
Mathematical models are an essential tool in systems biology, linking the behaviour of a system to the interactions between its components. Parameters in empirical mathematical models must be determined using experimental data, a process called regression. Because experimental data are noisy and incomplete, diagnostics that test the structural identifiability and validity of models and the significance and determinability of their parameters are needed to ensure that the proposed models are supported by the available data.

8.
Yin G  Cai J 《Biometrics》2005,61(1):151-161
As an alternative to the mean regression model, the quantile regression model has been studied extensively with independent failure time data. However, due to natural or artificial clustering, it is common to encounter multivariate failure time data in biomedical research where the intracluster correlation needs to be accounted for appropriately. For right-censored correlated survival data, we investigate the quantile regression model and adapt an estimating equation approach for parameter estimation under the working independence assumption, as well as a weighted version for enhancing the efficiency. We show that the parameter estimates are consistent and asymptotically follow normal distributions. The variance estimation using asymptotic approximation involves nonparametric functional density estimation. We employ the bootstrap and perturbation resampling methods for the estimation of the variance-covariance matrix. We examine the proposed method for finite sample sizes through simulation studies, and illustrate it with data from a clinical trial on otitis media.

9.
A cross-validatory method for dependent data   Total citations: 1 (self-citations: 0, citations by others: 1)

10.
Conditional logistic regression models for correlated binary data   Total citations: 1 (self-citations: 0, citations by others: 1)

11.
In many observational studies, individuals are measured repeatedly over time, although not necessarily at a set of pre-specified occasions. Instead, individuals may be measured at irregular intervals, with those having a history of poorer health outcomes being measured with somewhat greater frequency and regularity. In this paper, we consider likelihood-based estimation of the regression parameters in marginal models for longitudinal binary data when the follow-up times are not fixed by design, but can depend on previous outcomes. In particular, we consider assumptions regarding the follow-up time process that result in the likelihood function separating into two components: one for the follow-up time process, the other for the outcome measurement process. The practical implication of this separation is that the follow-up time process can be ignored when making likelihood-based inferences about the marginal regression model parameters. That is, maximum likelihood (ML) estimation of the regression parameters relating the probability of success at a given time to covariates does not require that a model for the distribution of follow-up times be specified. However, to obtain consistent parameter estimates, the multinomial distribution for the vector of repeated binary outcomes must be correctly specified. In general, ML estimation requires specification of all higher-order moments and the likelihood for a marginal model can be intractable except in cases where the number of repeated measurements is relatively small. To circumvent these difficulties, we propose a pseudolikelihood for estimation of the marginal model parameters. The pseudolikelihood uses a linear approximation for the conditional distribution of the response at any occasion, given the history of previous responses. The appeal of this approximation is that the conditional distributions are functions of the first two moments of the binary responses only. When the follow-up times depend only on the previous outcome, the pseudolikelihood requires correct specification of the conditional distribution of the current outcome given the outcome at the previous occasion only. Results from a simulation study and a study of asymptotic bias are presented. Finally, we illustrate the main results using data from a longitudinal observational study that explored the cardiotoxic effects of doxorubicin chemotherapy for the treatment of acute lymphoblastic leukemia in children.

12.
One important problem in genomic research is to identify genomic features such as gene expression data or DNA single nucleotide polymorphisms (SNPs) that are related to clinical phenotypes. Often these genomic data can be naturally divided into biologically meaningful groups such as genes belonging to the same pathways or SNPs within genes. In this paper, we propose group additive regression models and a group gradient descent boosting procedure for identifying groups of genomic features that are related to clinical phenotypes. Our simulation results show that by dividing the variables into appropriate groups, we can obtain better identification of the group features that are related to the phenotypes. In addition, the prediction mean square errors are also smaller than those of the component-wise boosting procedure. We demonstrate the application of the methods to pathway-based analysis of microarray gene expression data of breast cancer. Results from analysis of a breast cancer microarray gene expression data set indicate that the pathways of metalloendopeptidases (MMPs) and MMP inhibitors, as well as cell proliferation, cell growth, and maintenance are important to breast cancer-specific survival.

13.
Chen Q  Ibrahim JG 《Biometrics》2006,62(1):177-184
We consider a class of semiparametric models for the covariate distribution and missing data mechanism for missing covariate and/or response data for general classes of regression models including generalized linear models and generalized linear mixed models. Ignorable and nonignorable missing covariate and/or response data are considered. The proposed semiparametric model can be viewed as a sensitivity analysis for model misspecification of the missing covariate distribution and/or missing data mechanism. The semiparametric model consists of a generalized additive model (GAM) for the covariate distribution and/or missing data mechanism. Penalized regression splines are used to express the GAMs as a generalized linear mixed effects model, in which the variance of the corresponding random effects provides an intuitive index for choosing between the semiparametric and parametric model. Maximum likelihood estimates are then obtained via the EM algorithm. Simulations are given to demonstrate the methodology, and a real data set from a melanoma cancer clinical trial is analyzed using the proposed methods.

14.
Carlin BP  Hodges JS 《Biometrics》1999,55(4):1162-1170
In clinical trials conducted over several data collection centers, the most common statistically defensible analytic method, a stratified Cox model analysis, suffers from two important defects. First, identification of units that are outlying with respect to the baseline hazard is awkward since this hazard is implicit (rather than explicit) in the Cox partial likelihood. Second (and more seriously), identification of modest treatment effects is often difficult since the model fails to acknowledge any similarity across the strata. We consider a number of hierarchical modeling approaches that preserve the integrity of the stratified design while offering a middle ground between traditional stratified and unstratified analyses. We investigate both fully parametric (Weibull) and semiparametric models, the latter based not on the Cox model but on an extension of an idea by Gelfand and Mallick (1995, Biometrics 51, 843-852), which models the integrated baseline hazard as a mixture of monotone functions. We illustrate the methods using data from a recent multicenter AIDS clinical trial, comparing their ease of use, interpretation, and degree of robustness with respect to estimates of both the unit-specific baseline hazards and the treatment effect.

15.
High-throughput genomic data provide an opportunity for identifying pathways and genes that are related to various clinical phenotypes. Besides these genomic data, another valuable source of data is the biological knowledge about genes and pathways that might be related to the phenotypes of many complex diseases. Databases of such knowledge are often called the metadata. In microarray data analysis, such metadata are currently explored in post hoc ways by gene set enrichment analysis but have hardly been utilized in the modeling step. We propose to develop and evaluate a pathway-based gradient descent boosting procedure for nonparametric pathways-based regression (NPR) analysis to efficiently integrate genomic data and metadata. Such NPR models consider multiple pathways simultaneously and allow complex interactions among genes within the pathways and can be applied to identify pathways and genes that are related to variations of the phenotypes. These methods also provide an alternative to mediating the problem of a large number of potential interactions by limiting analysis to biologically plausible interactions between genes in related pathways. Our simulation studies indicate that the proposed boosting procedure can indeed identify relevant pathways. Application to a gene expression data set on breast cancer distant metastasis identified that Wnt, apoptosis, and cell cycle-regulated pathways are more likely related to the risk of distant metastasis among lymph-node-negative breast cancer patients. Results from analysis of two other breast cancer gene expression data sets indicate that the pathways of metalloendopeptidases (MMPs) and MMP inhibitors, as well as cell proliferation, cell growth, and maintenance are important to breast cancer relapse and survival. We also observed that by incorporating the pathway information, we achieved better prediction for cancer recurrence.

16.
17.
Fitting prospective regression models to case-control data   Total citations: 1 (self-citations: 0, citations by others: 1)
WILD  C. J. 《Biometrika》1991,78(4):705-717

18.
A linear regression method that allows survival rates to vary from stage to stage is described for the analysis of stage-frequency data. It has advantages over previously suggested methods since the calculations are not iterative, and it is not necessary to have independent estimates of stage durations, numbers entering stages, or the rate of entry to stage 1. Simulation is proposed to determine standard errors for estimates of population parameters, and to assess the goodness of fit of models.

19.
Yue Wei  Yi Liu  Tao Sun  Wei Chen  Ying Ding 《Biometrics》2020,76(2):619-629
Several gene-based association tests for time-to-event traits have been proposed recently to detect whether a gene region (containing multiple variants), as a set, is associated with the survival outcome. However, for bivariate survival outcomes, to the best of our knowledge, there is no statistical method that can be directly applied for gene-based association analysis. Motivated by a genetic study to discover the gene regions associated with the progression of a bilateral eye disease, age-related macular degeneration (AMD), we implement a novel functional regression (FR) method under the copula framework. Specifically, the effects of variants within a gene region are modeled through a functional linear model, which then contributes to the marginal survival functions within the copula. Generalized score test statistics are derived to test for the association between bivariate survival traits and the genetic region. Extensive simulation studies are conducted to evaluate the type I error control and power performance of the proposed approach, with comparisons to several existing methods for a single survival trait, as well as the marginal Cox FR model using the robust sandwich estimator for bivariate survival traits. Finally, we apply our method to a large AMD study, the Age-related Eye Disease Study, to identify the gene regions that are associated with AMD progression.

20.
Regression models in survival analysis are most commonly applied for right‐censored survival data. In some situations, the time to the event is not exactly observed, although it is known that the event occurred between two observed times. In practice, the moment of observation is frequently taken as the event occurrence time, and the interval‐censored mechanism is ignored. We present a cure rate defective model for interval‐censored event‐time data. The defective distribution is characterized by a density function whose integration assumes a value less than one when the parameter domain differs from the usual domain. We use the Gompertz and inverse Gaussian defective distributions to model data containing cured elements and estimate parameters using the maximum likelihood estimation procedure. We evaluate the performance of the proposed models using Monte Carlo simulation studies. Practical relevance of the models is illustrated by applying them to datasets on ovarian cancer recurrence and oral lesions in children after liver transplantation, both of which were derived from studies performed at A.C. Camargo Cancer Center in São Paulo, Brazil.
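To make the notion of a defective distribution concrete, the sketch below uses one common parameterization of the Gompertz survival function, S(t) = exp(-(b/a)(exp(a t) - 1)) with b > 0. This parameterization and the function names are assumptions for illustration and may differ from the paper's. When the shape parameter a is negative, S(t) levels off at exp(b/a) instead of falling to zero, so the density integrates to 1 - exp(b/a) and the plateau can be read as the cured fraction.

```python
import math

def gompertz_survival(t, a, b):
    """Gompertz survival function S(t) under the assumed parameterization."""
    if a == 0.0:
        # Limiting case: exponential distribution with rate b.
        return math.exp(-b * t)
    return math.exp(-(b / a) * (math.exp(a * t) - 1.0))

def cure_fraction(a, b):
    """Limit of S(t) as t grows; positive only in the defective case a < 0."""
    if a >= 0.0:
        return 0.0  # proper distribution: survival eventually reaches zero
    return math.exp(b / a)

# Illustrative values only, not estimates from the paper.
a, b = -0.5, 0.25
print(cure_fraction(a, b))             # plateau of the survival curve
print(gompertz_survival(200.0, a, b))  # survival at a very late time
```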
