Similar Documents (20 results)
1.
LeBlanc M, Crowley J. Biometrics 1999, 55(1), 204-213.
We develop a method for constructing adaptive regression spline models for the exploration of survival data. The method combines Cox's (1972, Journal of the Royal Statistical Society, Series B 34, 187-220) regression model with a weighted least-squares version of the multivariate adaptive regression spline (MARS) technique of Friedman (1991, Annals of Statistics 19, 1-141) to adaptively select the knots and covariates. The new technique can automatically fit models with terms that represent nonlinear effects and interactions among covariates. Applications based on simulated data and data from a clinical trial for myeloma are presented. Results from the myeloma application identified several important prognostic variables, including a possible nonmonotone relationship with survival in one laboratory variable. Results are compared to those from the adaptive hazard regression (HARE) method of Kooperberg, Stone, and Truong (1995, Journal of the American Statistical Association 90, 78-94).

2.
A new methodology is proposed for estimating the proportion of true null hypotheses in a large collection of tests. Each test concerns a single parameter δ whose value is specified by the null hypothesis. We combine a parametric model for the conditional cumulative distribution function (CDF) of the p-value given δ with a nonparametric spline model for the density g(δ) of δ under the alternative hypothesis. The proportion of true null hypotheses and the coefficients in the spline model are estimated by penalized least squares subject to constraints that guarantee that the spline is a density. The estimator is computed efficiently using quadratic programming. Our methodology produces an estimate of the density of δ when the null is false and can address such questions as "when the null is false, is the parameter usually close to the null or far away?" This leads us to define a falsely interesting discovery rate (FIDR), a generalization of the false discovery rate. We contrast the FIDR approach with Efron's (2004, Journal of the American Statistical Association 99, 96-104) empirical null hypothesis technique. We discuss the use of these estimates in sample size calculations based on the expected discovery rate (EDR). Our recommended estimator of the proportion of true nulls is less biased than estimators based on the marginal density of the p-values at 1. In a simulation study, we compare our estimators to the convex, decreasing estimator of Langaas, Lindqvist, and Ferkingstad (2005, Journal of the Royal Statistical Society, Series B 67, 555-572). The most biased of our estimators is very similar in performance to the convex, decreasing estimator. As an illustration, we analyze differences in gene expression between resistant and susceptible strains of barley.
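For orientation, a minimal Python sketch of the kind of estimator "based on the marginal density of the p-values at 1" that the abstract compares against: a Storey-type plug-in with an illustrative threshold lam, not the paper's spline estimator.

```python
import numpy as np

def pi0_storey(pvalues, lam=0.5):
    """Storey-type plug-in estimate of the proportion of true nulls.

    Null p-values are Uniform(0,1), so of m p-values roughly
    pi0 * m * (1 - lam) should exceed the threshold `lam` when most
    large p-values come from true nulls."""
    p = np.asarray(pvalues, dtype=float)
    return min(1.0, np.mean(p > lam) / (1.0 - lam))
```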

3.
The modeling of lifetime (i.e., cumulative) medical cost data in the presence of censored follow-up is complicated by induced informative censoring, rendering standard survival analysis tools invalid. With few exceptions, recently proposed nonparametric estimators for such data do not extend easily to handle covariate information. We propose to model the hazard function for lifetime cost endpoints using an adaptation of the HARE methodology (Kooperberg, Stone, and Truong, 1995, Journal of the American Statistical Association 90, 78-94). Linear splines and their tensor products are used to adaptively build a model that incorporates covariates and covariate-by-cost interactions without restrictive parametric assumptions. The informative censoring problem is handled using inverse probability of censoring weighted estimating equations. The proposed method is illustrated using simulation and also with data on the cost of dialysis for patients with end-stage renal disease.
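A small numpy sketch of the inverse probability of censoring weighting (IPCW) device mentioned above, assuming right-censored follow-up times; the reverse Kaplan-Meier estimate of the censoring distribution is standard, but the helper name `ipcw_weights` and the quadratic-time implementation are ours, and the HARE spline model itself is not reproduced.

```python
import numpy as np

def ipcw_weights(followup, event):
    """IPCW weights w_i = delta_i / G_hat(T_i-), where G_hat is the
    Kaplan-Meier estimate of the *censoring* survival function
    (reverse Kaplan-Meier: the roles of events and censorings swap).
    Censored observations receive weight zero."""
    t = np.asarray(followup, dtype=float)
    delta = np.asarray(event, dtype=int)
    w = np.zeros_like(t)
    for i in np.flatnonzero(delta == 1):
        g = 1.0  # G_hat evaluated just before T_i
        for u in np.unique(t[(delta == 0) & (t < t[i])]):
            n_risk = np.sum(t >= u)
            n_cens = np.sum((t == u) & (delta == 0))
            g *= 1.0 - n_cens / n_risk
        w[i] = 1.0 / g
    return w
```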

4.
Pepe MS, Fleming TR. Biometrics 1989, 45(2), 497-507.
A class of statistics based on the integrated weighted difference in Kaplan-Meier estimators is introduced for the two-sample censored data problem. With positive weight functions, these statistics are intuitive for, and sensitive against, the alternative of stochastic ordering. The standard weighted log-rank statistics are not always sensitive against this alternative, particularly if the hazard functions cross. Qualitative comparisons are made between the weighted log-rank statistics and these weighted Kaplan-Meier (WKM) statistics. A statement of null asymptotic distribution theory is given, and the choice of weight function is discussed in some detail. Results from small-sample simulation studies indicate that these statistics compare favorably with the log-rank procedure even under the proportional hazards alternative, and may perform better under the crossing hazards alternative.
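A minimal numpy sketch of the WKM construction, assuming right-censored samples (time, event) per group; the constant weight and the sqrt(n1*n2/(n1+n2)) scaling are illustrative choices, whereas the paper treats the choice of weight function in detail.

```python
import numpy as np

def kaplan_meier(time, event, grid):
    """Kaplan-Meier survival curve evaluated as a step function on `grid`."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    step_t, step_s, surv = [0.0], [1.0], 1.0
    for u in np.unique(time[event == 1]):
        n_risk = np.sum(time >= u)
        n_event = np.sum((time == u) & (event == 1))
        surv *= 1.0 - n_event / n_risk
        step_t.append(u)
        step_s.append(surv)
    idx = np.searchsorted(step_t, grid, side="right") - 1
    return np.asarray(step_s)[idx]

def wkm_statistic(t1, e1, t2, e2):
    """Integrated weighted difference of two Kaplan-Meier curves, with a
    constant weight and sqrt(n1*n2/(n1+n2)) scaling (illustrative choices)."""
    tau = min(np.max(t1), np.max(t2))      # common follow-up window
    grid = np.linspace(0.0, tau, 2001)
    diff = kaplan_meier(t1, e1, grid) - kaplan_meier(t2, e2, grid)
    # trapezoidal integration of the weighted difference over [0, tau]
    integral = np.sum((diff[1:] + diff[:-1]) / 2.0 * np.diff(grid))
    n1, n2 = len(t1), len(t2)
    return np.sqrt(n1 * n2 / (n1 + n2)) * integral
```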

5.
Estimating the number of species in a stochastic abundance model
Chao A, Bunge J. Biometrics 2002, 58(3), 531-539.
Consider a stochastic abundance model in which the species arrive in the sample according to independent Poisson processes, where the abundance parameters of the processes follow a gamma distribution. We propose a new estimator of the number of species for this model. The estimator takes the form of the number of duplicated species (i.e., species represented by two or more individuals) divided by an estimated duplication fraction. The duplication fraction is estimated from all frequencies, including singleton information. The new estimator is closely related to the sample coverage estimator presented by Chao and Lee (1992, Journal of the American Statistical Association 87, 210-217). We illustrate the procedure using the Malayan butterfly data discussed by Fisher, Corbet, and Williams (1943, Journal of Animal Ecology 12, 42-58) and a 1989 Christmas Bird Count dataset collected in Florida, U.S.A. Simulation studies show that this estimator compares well with maximum likelihood estimators (i.e., empirical Bayes estimators from the Bayesian viewpoint), which require an iterative numerical procedure that may be infeasible.
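To make the form of the estimator concrete, a hedged Python sketch; the duplication-fraction estimate used here is a Good-Turing-style stand-in suggested by the Chao-Lee connection, not the paper's gamma-mixed-Poisson formula.

```python
import numpy as np

def species_richness(counts):
    """Richness estimate of the form
        (# species seen at least twice) / (estimated duplication fraction).
    `counts` holds one abundance per observed species. The duplication
    fraction below is a Good-Turing-style stand-in (one minus the singleton
    share of individuals), in the spirit of the Chao-Lee sample-coverage
    estimator -- NOT the paper's gamma-mixed-Poisson formula."""
    counts = np.asarray(counts, dtype=int)
    n = counts.sum()                  # total individuals sampled
    f1 = np.sum(counts == 1)          # species seen exactly once
    duplicated = np.sum(counts >= 2)  # species seen two or more times
    return duplicated / (1.0 - f1 / n)
```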

6.
Bickel DR. Biometrics 2011, 67(2), 363-370.
In a novel approach to the multiple testing problem, Efron (2004, Journal of the American Statistical Association 99, 96-104; 2007a, Journal of the American Statistical Association 102, 93-103; 2007b, Annals of Statistics 35, 1351-1377) formulated estimators of the distribution of test statistics or nominal p-values under a null distribution suitable for modeling the data of thousands of unaffected genes, nonassociated single-nucleotide polymorphisms, or other biological features. Estimators of the null distribution can improve not only the empirical Bayes procedure for which they were originally intended, but also many other multiple-comparison procedures. In some cases, such estimators improve the proposed multiple-comparison procedure (MCP), which is based on a recent non-Bayesian framework of minimizing expected loss with respect to a confidence posterior, a probability distribution of confidence levels. The flexibility of that MCP is illustrated with a nonadditive loss function designed for genomic screening rather than for validation. The merit of estimating the null distribution is examined from the vantage point of the confidence-posterior MCP (CPMCP). In a generic simulation study of genome-scale multiple testing, conditioning the observed confidence level on the estimated null distribution as an approximate ancillary statistic markedly improved conditional inference. Simulations specifically of gene expression data, however, indicate that estimation of the null distribution tends to exacerbate the conservative bias that results from modeling heavy-tailed data distributions with the normal family. To enable researchers to determine whether to rely on a particular estimated null distribution for inference or decision making, an information-theoretic score is provided. As the sum of the degree of ancillarity and the degree of inferential relevance, the score reflects the balance conditioning would strike between the two conflicting terms. The CPMCP and other methods introduced are applied to gene expression microarray data.
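A crude Python sketch of the empirical-null idea these methods rely on, fitting the null location and scale from the central bulk of z-values; the quantile trimming and the helper name are our simplifications of Efron's central-matching and maximum-likelihood fits.

```python
import numpy as np
from scipy.stats import norm

def empirical_null(z, central=0.5):
    """Fit a normal empirical null N(mu, sigma^2) from the central bulk of
    z-values, assuming that bulk is dominated by true nulls. The truncated
    normal variance factor is exact; the quantile-based trimming is our
    simplification, not Efron's algorithm."""
    z = np.asarray(z, dtype=float)
    mu = np.median(z)
    q_lo, q_hi = np.quantile(z, [(1 - central) / 2, (1 + central) / 2])
    core = z[(z >= q_lo) & (z <= q_hi)]
    # variance of N(0,1) truncated to (-c, c): 1 - 2c*phi(c)/(2*Phi(c)-1)
    c = norm.ppf(0.5 + central / 2)
    trunc_var = 1.0 - 2.0 * c * norm.pdf(c) / (2.0 * norm.cdf(c) - 1.0)
    sigma = core.std() / np.sqrt(trunc_var)  # un-shrink the trimmed SD
    return mu, sigma
```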

7.
Fleming TR, Lin DY. Biometrics 2000, 56(4), 971-983.
The field of survival analysis emerged in the 20th century and experienced tremendous growth during the latter half of the century. The developments in this field that have had the most profound impact on clinical trials are the Kaplan-Meier (1958, Journal of the American Statistical Association 53, 457-481) method for estimating the survival function, the log-rank statistic (Mantel, 1966, Cancer Chemotherapy Reports 50, 163-170) for comparing two survival distributions, and the Cox (1972, Journal of the Royal Statistical Society, Series B 34, 187-220) proportional hazards model for quantifying the effects of covariates on the survival time. The counting-process martingale theory pioneered by Aalen (1975, Statistical Inference for a Family of Counting Processes, Ph.D. dissertation, University of California, Berkeley) provides a unified framework for studying the small- and large-sample properties of survival analysis statistics. Significant progress has been achieved and further developments are expected in many other areas, including the accelerated failure time model, multivariate failure time data, interval-censored data, dependent censoring, dynamic treatment regimes and causal inference, joint modeling of failure time and longitudinal data, and Bayesian methods.

8.
Heller G, Simonoff JS. Biometrics 1992, 48(1), 101-115.
Although the analysis of censored survival data using the proportional hazards and linear regression models is common, there has been little work examining the ability of these estimators to predict time to failure. This is unfortunate, since a predictive plot illustrating the relationship between time to failure and a continuous covariate can be far more informative regarding the risk associated with the covariate than a Kaplan-Meier plot obtained by discretizing the variable. In this paper, the predictive powers of the Cox (1972, Journal of the Royal Statistical Society, Series B 34, 187-220) proportional hazards estimator and the Buckley-James (1979, Biometrika 66, 429-436) censored regression estimator are compared. Using computer simulations and heuristic arguments, it is shown that the choice of method depends on the censoring proportion, the strength of the regression, the form of the censoring distribution, and the form of the failure distribution. Several examples are provided to illustrate the usefulness of the methods.

9.
Qin GY, Zhu ZY. Biometrics 2009, 65(1), 52-59.
In this article, we study the robust estimation of both mean and variance components in generalized partial linear mixed models based on the construction of a robustified likelihood function. Under some regularity conditions, the asymptotic properties of the proposed robust estimators are established. Simulations are carried out to investigate the performance of the proposed robust estimators. As expected, the proposed robust estimators perform better than those resulting from robust estimating equations involving conditional expectations, such as those of Sinha (2004, Journal of the American Statistical Association 99, 451-460) and Qin and Zhu (2007, Journal of Multivariate Analysis 98, 1658-1683). Finally, the proposed robust method is illustrated by the analysis of a real data set.

10.
Wang X, Wang K, Lim J. Biometrics 2012, 68(1), 194-202.
In applications that require cost efficiency, sample sizes are typically small, so the problem of empty strata may often occur in judgment poststratification (JPS), an important variant of balanced ranked set sampling. In this article, we consider estimation of the population cumulative distribution function (CDF) from JPS samples with empty strata. In the literature, the standard and restricted CDF estimators (Stokes and Sager, 1988, Journal of the American Statistical Association 83, 374-381; Frey and Ozturk, 2011, Annals of the Institute of Statistical Mathematics, to appear) do not perform well when empty strata are simply ignored. We show that the original isotonized estimator (Ozturk, 2007, Journal of Nonparametric Statistics 19, 131-144) can handle empty strata automatically through two methods, MinMax and MaxMin. However, using them blindly can produce undesirable results in either tail of the CDF. We thoroughly examine MinMax and MaxMin and find interesting results about their behavior and performance in the presence of empty strata. Motivated by these results, we propose modified isotonized estimators to improve estimation efficiency. Through simulation and empirical studies, we show that our estimators work well in different regions of the CDF and also improve the overall performance of estimating the whole function.

11.
Lu M, Zhang Y, Huang J. Biometrika 2007, 94(3), 705-718.
We study nonparametric likelihood-based estimators of the mean function of counting processes with panel count data using monotone polynomial splines. The generalized Rosen algorithm, proposed by Zhang & Jamshidian (2004), is used to compute the estimators. We show that the proposed spline likelihood-based estimators are consistent and that their rate of convergence can be faster than n^{1/3}. Simulation studies with moderate samples show that the estimators have smaller variances and mean squared errors than their alternatives proposed by Wellner & Zhang (2000). A real example from a bladder tumour clinical trial is used to illustrate this method.

12.
This article investigates an augmented inverse selection probability weighted estimator for Cox regression parameter estimation when covariate data are incomplete. The estimator extends the weighted estimator of Horvitz and Thompson (1952, Journal of the American Statistical Association 47, 663-685). It is doubly robust in that it is consistent as long as either the selection probability model or the joint distribution of covariates is correctly specified. The augmentation term of the estimating equation depends on the baseline cumulative hazard and on a conditional distribution that can be implemented by using an EM-type algorithm. The method is compared with several previously proposed estimators via simulation studies and is applied to a real example.
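A minimal sketch of the plain (non-augmented) inverse-selection-probability-weighted Cox fit that the doubly robust estimator builds on, using lifelines and scikit-learn; the helper `ipw_cox`, the logistic selection model, and the column layout are our assumptions, and the augmentation term is omitted.

```python
from lifelines import CoxPHFitter
from sklearn.linear_model import LogisticRegression

def ipw_cox(df, duration_col, event_col, covariates, always_observed):
    """Inverse-selection-probability-weighted Cox fit:
    1) model P(complete case | always-observed variables) by logistic
       regression, 2) fit Cox on complete cases with weights 1/p_hat."""
    obs = df[covariates].notna().all(axis=1).to_numpy()
    sel = LogisticRegression().fit(df[always_observed], obs.astype(int))
    p_hat = sel.predict_proba(df[always_observed])[:, 1]
    cc = df.loc[obs, [duration_col, event_col] + covariates].copy()
    cc["ipw"] = 1.0 / p_hat[obs]
    cph = CoxPHFitter()
    # robust (sandwich) standard errors are advisable with estimated weights
    cph.fit(cc, duration_col=duration_col, event_col=event_col,
            weights_col="ipw", robust=True)
    return cph
```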

13.
Du P, Jiang Y, Wang Y. Biometrics 2011, 67(4), 1330-1339.
Gap time hazard estimation is of particular interest in recurrent event data. This article proposes a fully nonparametric approach for estimating the gap time hazard. Smoothing spline analysis of variance (ANOVA) decompositions are used to model the log gap time hazard as a joint function of gap time and covariates, and general frailty is introduced to account for between-subject heterogeneity and within-subject correlation. We estimate the nonparametric gap time hazard function and the parameters of the frailty distribution using a combination of the Newton-Raphson procedure, the stochastic approximation algorithm (SAA), and the Markov chain Monte Carlo (MCMC) method. The convergence of the algorithm is guaranteed by decreasing the step size of the parameter updates and/or increasing the MCMC sample size along iterations. A model selection procedure is also developed to identify negligible components in a functional ANOVA decomposition of the log gap time hazard. We evaluate the proposed methods with simulation studies and illustrate their use through the analysis of bladder tumor data.

14.
Hazard regression for interval-censored data with penalized spline
Cai T, Betensky RA. Biometrics 2003, 59(3), 570-579.
This article introduces a new approach for estimating the hazard function for possibly interval- and right-censored survival data. We weakly parameterize the log-hazard function with a piecewise-linear spline and provide a smoothed estimate of the hazard function by maximizing the penalized likelihood through a mixed model-based approach. We also provide a method to estimate the amount of smoothing from the data. We illustrate our approach with two well-known interval-censored data sets. Extensive numerical studies are conducted to evaluate the efficacy of the new procedure.
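A hedged sketch of a penalized-likelihood fit of a piecewise-linear log-hazard, restricted to right-censored data for brevity (the paper also covers interval censoring and estimates the amount of smoothing from the data); the knot placement, the ridge penalty on hinge coefficients, and the optimizer choice are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

def linear_spline_basis(t, knots):
    """Basis [1, t, (t-k1)_+, ..., (t-kK)_+] for a piecewise-linear function."""
    cols = [np.ones_like(t), t] + [np.maximum(t - k, 0.0) for k in knots]
    return np.column_stack(cols)

def fit_penalized_loghazard(time, event, knots, lam=1.0, grid_size=400):
    """Penalized likelihood for right-censored data:
        loglik = sum_i [ d_i * log h(t_i) - H(t_i) ],  h = exp(spline),
    with H(t) computed by trapezoidal integration on a fine grid and a
    ridge penalty on the hinge coefficients (the mixed-model smoothing
    penalty for a piecewise-linear spline)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    grid = np.linspace(0.0, time.max(), grid_size)
    B_obs = linear_spline_basis(time, knots)
    B_grid = linear_spline_basis(grid, knots)
    dt = grid[1] - grid[0]

    def neg_pen_loglik(beta):
        log_h_obs = B_obs @ beta
        h_grid = np.exp(B_grid @ beta)
        # cumulative hazard on the grid, looked up at each observed time
        H_grid = np.concatenate(
            [[0.0], np.cumsum((h_grid[1:] + h_grid[:-1]) / 2.0 * dt)])
        H_obs = np.interp(time, grid, H_grid)
        loglik = np.sum(event * log_h_obs - H_obs)
        penalty = lam * np.sum(beta[2:] ** 2)  # penalize only the hinge terms
        return -(loglik - penalty)

    beta0 = np.zeros(2 + len(knots))
    return minimize(neg_pen_loglik, beta0, method="BFGS").x
```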

15.
Horton NJ, Laird NM. Biometrics 2001, 57(1), 34-42.
This article presents a new method for maximum likelihood estimation of logistic regression models with incomplete covariate data where auxiliary information is available. This auxiliary information is extraneous to the regression model of interest but predictive of the covariate with missing data. Ibrahim (1990, Journal of the American Statistical Association 85, 765-769) provides a general method for estimating generalized linear regression models with missing covariates using the EM algorithm that is easily implemented when there are no auxiliary data. Vach (1997, Statistics in Medicine 16, 57-72) describes how the method can be extended when the outcome and auxiliary data are conditionally independent given the covariates in the model. Our method allows the incorporation of auxiliary data without making the conditional independence assumption. We suggest tests of conditional independence and compare the performance of several estimators in an example concerning mental health service utilization in children; using an artificial dataset, we also compare the estimators' performance when auxiliary data are available.

16.
Pang Z, Kuk AY. Biometrics 2007, 63(1), 218-227.
Exchangeable binary data are often collected in developmental toxicity and other studies, and a whole host of parametric distributions for fitting this kind of data have been proposed in the literature. While these distributions can be matched to have the same marginal probability and intra-cluster correlation, they can be quite different in terms of shape and higher-order quantities of interest, such as the litter-level risk of having at least one malformed fetus. A sensible alternative is to fit a saturated model (Bowman and George, 1995, Journal of the American Statistical Association 90, 871-879) using the expectation-maximization (EM) algorithm proposed by Stefanescu and Turnbull (2003, Biometrics 59, 18-24). The assumption of compatibility of marginal distributions is often made to link up the distributions for different cluster sizes so that estimation can be based on the combined data. Stefanescu and Turnbull proposed a modified trend test to test this assumption. Their test, however, fails to take into account the variability of an estimated null expectation and as a result leads to inaccurate p-values. This drawback is rectified in this article. When the data are sparse, the probability function estimated using a saturated model can be very jagged, and some kind of smoothing is needed. We extend the penalized likelihood method (Simonoff, 1983, Annals of Statistics 11, 208-218) to the present case of unequal cluster sizes and implement the method using an EM-type algorithm. In the presence of covariates, we propose a penalized kernel method that performs smoothing in both the covariate and response space. The proposed methods are illustrated using several data sets, and the sampling and robustness properties of the resulting estimators are evaluated by simulations.

17.
Huang JZ, Liu L. Biometrics 2006, 62(3), 793-802.
The Cox proportional hazards model usually assumes an exponential form for the dependence of the hazard function on the covariates. In practice, however, this assumption may be violated, and other relative risk forms may be more appropriate. In this article, we consider the proportional hazards model with an unknown relative risk form. Issues in model interpretation are addressed. We propose a method to estimate the relative risk form and the regression parameters simultaneously by first approximating the logarithm of the relative risk form by a spline and then employing maximum partial likelihood estimation. An iterative alternating optimization procedure is developed for efficient implementation. Statistical inference for the regression coefficients and the relative risk form based on parametric asymptotic theory is discussed. The proposed methods are illustrated using simulation and an application to the Veterans Administration lung cancer data.

18.
Motivated by recent work involving the analysis of biomedical imaging data, we present a novel procedure for constructing simultaneous confidence corridors for the mean of imaging data. We propose to use flexible bivariate splines over triangulations to handle an irregular domain of the images that is common in brain imaging studies and in other biomedical imaging applications. The proposed spline estimators of the mean functions are shown to be consistent and asymptotically normal under some regularity conditions. We also provide a computationally efficient estimator of the covariance function and derive its uniform consistency. The procedure is also extended to the two-sample case in which we focus on comparing the mean functions from two populations of imaging data. Through Monte Carlo simulation studies, we examine the finite sample performance of the proposed method. Finally, the proposed method is applied to analyze brain positron emission tomography data in two different studies. One data set used in preparation of this article was obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database.

19.
Brunner E. Biometrics 1991, 47(3), 1149-1153.
In a nonparametric two-sample model for independent observations with repeated measurements, a new point estimator and a new distribution-free confidence interval for the difference in means are introduced. The method is based on ideas in Hodges and Lehmann (1963, Annals of Mathematical Statistics 34, 598-611). The asymptotic theory in Brunner and Neumann (1983, Biometrical Journal 24, 373-389; 1986, Biometrical Journal 28, 394-402) and the results for small sample sizes in Brunner and Compagnone (1988, Statistical Software Newsletter 14, 36-42) are used. The new estimators are applied to a problem in morphometry.
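For reference, a sketch of the classical i.i.d. two-sample Hodges-Lehmann device the paper builds on: the median of all pairwise differences, with a distribution-free confidence interval from normal-approximation ranks of the ordered differences; the paper's repeated-measurements extension is not reproduced.

```python
import numpy as np
from scipy.stats import norm

def hodges_lehmann(x, y, alpha=0.05):
    """Two-sample Hodges-Lehmann shift estimate with a distribution-free CI:
    the estimate is the median of all m*n pairwise differences x_i - y_j,
    and the CI endpoints are order statistics of those differences at ranks
    from the normal approximation to the Wilcoxon-Mann-Whitney statistic."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    m, n = len(x), len(y)
    diffs = np.sort((x[:, None] - y[None, :]).ravel())
    estimate = np.median(diffs)
    z = norm.ppf(1.0 - alpha / 2.0)
    k = int(np.floor(m * n / 2.0 - z * np.sqrt(m * n * (m + n + 1) / 12.0)))
    k = max(k, 0)
    return estimate, (diffs[k], diffs[m * n - 1 - k])
```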

20.
Royle JA. Biometrics 2004, 60(1), 108-115.
Spatial replication is a common theme in count surveys of animals. Such surveys often generate sparse count data from which it is difficult to estimate population size while formally accounting for detection probability. In this article, I describe a class of models (N-mixture models) that allow for estimation of population size from such data. The key idea is to view site-specific population sizes, N, as independent random variables distributed according to some mixing distribution (e.g., Poisson). Prior parameters are estimated from the marginal likelihood of the data, having integrated over the prior distribution for N. Carroll and Lombard (1985, Journal of the American Statistical Association 80, 423-426) proposed a class of estimators based on mixing over a prior distribution for detection probability. Their estimator can be applied in limited settings but is sensitive to prior parameter values that are fixed a priori. Spatial replication provides additional information regarding the parameters of the prior distribution on N that is exploited by the N-mixture models and that leads to reasonable estimates of abundance from sparse data. A simulation study demonstrates superior operating characteristics (bias, confidence interval coverage) of the N-mixture estimator compared to the Carroll and Lombard estimator. Both estimators are applied to point count data on six species of birds, illustrating the sensitivity to the choice of prior on p and the substantially different abundance estimates that result.
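A minimal Python sketch of the N-mixture marginal likelihood described above, with constant lambda and p and no covariates; the truncation bound K and the optimizer call are illustrative choices.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import binom, poisson

def nmixture_negloglik(params, y, K=200):
    """Negative marginal log-likelihood of a Poisson N-mixture model:
    site abundances N_i ~ Poisson(lambda), repeated counts
    y_it | N_i ~ Binomial(N_i, p); the sum over N is truncated at K
    (K must exceed the largest observed count)."""
    lam = np.exp(params[0])               # keep lambda positive
    p = 1.0 / (1.0 + np.exp(-params[1]))  # keep p in (0, 1)
    Ns = np.arange(K + 1)
    log_prior = poisson.logpmf(Ns, lam)
    total = 0.0
    for site_counts in y:                 # y: (n_sites, n_visits) int array
        # log P(counts | N) for every candidate N (impossible N give -inf)
        log_lik = binom.logpmf(site_counts[None, :], Ns[:, None], p).sum(axis=1)
        total += np.logaddexp.reduce(log_prior + log_lik)
    return -total

# Fit by maximum likelihood, e.g.:
# y = np.array([[2, 1, 3], [0, 0, 1], [4, 2, 2]])
# fit = minimize(nmixture_negloglik, x0=[np.log(5.0), 0.0], args=(y,),
#                method="Nelder-Mead")
```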
