1.

Semiparametric estimation in copula models for bivariate sequential survival times

Lawless JF Yilmaz YE 《Biometrical journal. Biometrische Zeitschrift》2011,53(5):779-796

Sequentially observed survival times are of interest in many studies but there are difficulties in analyzing such data using nonparametric or semiparametric methods. First, when the duration of follow-up is limited and the times for a given individual are not independent, induced dependent censoring arises for the second and subsequent survival times. Non-identifiability of the marginal survival distributions for second and later times is another issue, since they are observable only if preceding survival times for an individual are uncensored. In addition, in some studies a significant proportion of individuals may never have the first event. Fully parametric models can deal with these features, but robustness is a concern. We introduce a new approach to address these issues. We model the joint distribution of the successive survival times by using copula functions, and provide semiparametric estimation procedures in which copula parameters are estimated without parametric assumptions on the marginal distributions. This provides more robust estimates and checks on the fit of parametric models. The methodology is applied to a motivating example involving relapse and survival following colon cancer treatment.
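As a rough illustration of the copula formulation this abstract describes (the notation is assumed here, not taken from the paper): write $T_1$ and $T_2$ for the two sequential survival times, with marginal survivor functions $S_1$ and $S_2$. A copula $C_\alpha$ ties the margins into a joint survivor function. The Clayton family shown second is only one familiar choice, picked for illustration.

```latex
\Pr(T_1 > t_1,\; T_2 > t_2) = C_\alpha\bigl(S_1(t_1),\, S_2(t_2)\bigr),
\qquad
C_\alpha(u, v) = \bigl(u^{-\alpha} + v^{-\alpha} - 1\bigr)^{-1/\alpha},
\quad \alpha > 0 .
```

In a semiparametric fit of this kind, the dependence parameter $\alpha$ is estimated while $S_1$ and $S_2$ are left without a parametric form.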

2.

Improved Estimation of the Noncentrality Parameter Distribution from a Large Number of t‐Statistics, with Applications to False Discovery Rate Estimation in Microarray Data Analysis

Long Qu Dan Nettleton Jack C. M. Dekkers 《Biometrics》2012,68(4):1178-1187

Summary: Given a large number of t‐statistics, we consider the problem of approximating the distribution of noncentrality parameters (NCPs) by a continuous density. This problem is closely related to the control of false discovery rates (FDR) in massive hypothesis testing applications, e.g., microarray gene expression analysis. Our methodology is similar to, but improves upon, the existing approach by Ruppert, Nettleton, and Hwang (2007, Biometrics, 63, 483–495). We provide parametric, nonparametric, and semiparametric estimators for the distribution of NCPs, as well as estimates of the FDR and local FDR. In the parametric situation, we assume that the NCPs follow a distribution that leads to an analytically available marginal distribution for the test statistics. In the nonparametric situation, we use convex combinations of basis density functions to estimate the density of the NCPs. A sequential quadratic programming procedure is developed to maximize the penalized likelihood. The smoothing parameter is selected with the approximate network information criterion. A semiparametric estimator is also developed to combine both parametric and nonparametric fits. Simulations show that, under a variety of situations, our density estimates are closer to the underlying truth and our FDR estimates are improved compared with alternative methods. Data‐based simulations and the analyses of two microarray datasets are used to evaluate the performance in realistic situations.

3.

An Empirical Comparison of Parametric and Semiparametric Cure Models

Yingwei Peng K.C. Carriere 《Biometrical journal. Biometrische Zeitschrift》2002,44(8):1002-1014

Parametric and semiparametric cure models have been proposed for cure proportion estimation in cancer clinical research. In this paper, several parametric and semiparametric models are compared, and their estimation methods are discussed within the framework of the EM algorithm. We show that the semiparametric PH cure model can achieve efficiency levels similar to those of parametric cure models, provided that the failure time distribution is well specified and uncured patients have an increasing hazard rate. Therefore the semiparametric model is a viable alternative to parametric cure models. When the hazard rate of uncured patients is rapidly decreasing, the estimates from the semiparametric cure model tend to have large variations and biases. However, all other models also tend to have large variations and biases in this case.
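For readers unfamiliar with the model class compared in this abstract, a common way to write a proportional hazards (PH) mixture cure model is sketched below. The symbols and the logistic link for the uncured probability are assumptions made for illustration, since the abstract does not give the exact specification.

```latex
S_{\mathrm{pop}}(t \mid x, z) = 1 - \pi(z) + \pi(z)\, S_u(t \mid x),
\qquad
\pi(z) = \frac{\exp(\gamma^{\top} z)}{1 + \exp(\gamma^{\top} z)},
\qquad
S_u(t \mid x) = S_0(t)^{\exp(\beta^{\top} x)} .
```

Here $\pi(z)$ is the probability of being uncured and $S_u$ is the survivor function of uncured patients. In the semiparametric version the baseline $S_0$ is left unspecified, whereas a parametric version assumes a form such as Weibull for it.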

4.

Accelerated failure time modeling via nonparametric mixtures

Byungtae Seo Sangwook Kang 《Biometrics》2023,79(1):165-177

An accelerated failure time (AFT) model assuming a log-linear relationship between failure time and a set of covariates can be either parametric or semiparametric, depending on the distributional assumption for the error term. Both classes of AFT models have been popular in the analysis of censored failure time data. The semiparametric AFT model is more flexible and robust to departures from the distributional assumption than its parametric counterpart. However, the semiparametric AFT model is subject to producing biased results for estimating any quantities involving an intercept. Estimating an intercept requires a separate procedure. Moreover, a consistent estimation of the intercept requires stringent conditions. Thus, essential quantities such as mean failure times might not be reliably estimated using semiparametric AFT models, which can be naturally done in the framework of parametric AFT models. Meanwhile, parametric AFT models can be severely impaired by misspecifications. To overcome this, we propose a new type of AFT model using a nonparametric Gaussian-scale mixture distribution. We also provide feasible algorithms to estimate the parameters and mixing distribution. The finite sample properties of the proposed estimators are investigated via an extensive simulation study. The proposed estimators are illustrated using a real dataset.

5.

N-estimation from retrospectively ascertained events with applications to AIDS.

M C Wang L C See 《Biometrics》1992,48(1):129-141

It is a common sampling scheme in retrospective studies that the data set includes only individuals who satisfy a certain sampling criterion. In this paper we consider the situation when the sampling criterion is a specified event, and assume that an earlier event can be retrospectively identified given the occurrence of the specified event. A semiparametric method, which is a compromise between nonparametric and parametric methods, is employed for the estimation of the expected number of the specified events (namely, the N-estimation) occurring in arbitrarily given intervals. A number of statistical properties of the estimates are developed. Due to the limitation of semiparametric models, our estimates should be regarded as conservative estimates since in general they underestimate the actual number of the specified events. This type of limitation, however, cannot be avoided with nonparametric or semiparametric models. Applications to acquired immunodeficiency syndrome (AIDS) cases are considered. The blood transfusion AIDS cases reported to the Centers for Disease Control are analyzed in detail.

6.

Efficient Semiparametric Marginal Estimation for the Partially Linear Additive Model for Longitudinal/Clustered Data

Raymond Carroll Arnab Maity Enno Mammen Kyusang Yu 《Statistics in biosciences》2009,1(1):10-31

We consider the efficient estimation of a regression parameter in a partially linear additive nonparametric regression model from repeated measures data when the covariates are multivariate. To date, while there is some literature in the scalar covariate case, the problem has not been addressed in the multivariate additive model case. Ours represents a first contribution in this direction. As part of this work, we first describe the behavior of nonparametric estimators for additive models with repeated measures when the underlying model is not additive. These results are critical when one considers variants of the basic additive model. We apply them to the partially linear additive repeated-measures model, deriving an explicit consistent estimator of the parametric component; if the errors are in addition Gaussian, the estimator is semiparametric efficient. We also apply our basic methods to a unique testing problem that arises in genetic epidemiology; in combination with a projection argument we develop an efficient and easily computed testing scheme. Simulations and an empirical example from nutritional epidemiology illustrate our methods.

7.

Semiparametric estimation of tag loss and reporting rates for tag-recovery experiments using exact time-at-liberty data

Cadigan NG Brattey J 《Biometrics》2003,59(4):869-876

We present a semiparametric likelihood approach to estimating reporting rates and tag-loss rates from the tags returned from capture-recapture studies. Such studies are commonly used to estimate critical population parameters. Tag loss rates are estimated using double-tagged animals, while reporting rates are estimated using information from high-reward tags. A likelihood function is constructed based on the conditional distribution of the type of tag returned (low or high reward, single or double tag), given that a tag has been returned. This involves many sparse 5 x 1 tag-return contingency tables, and choosing a good functional form for the tag loss rate is difficult with such data. We model tag-loss rates using monotone-smoothing splines, and use these nonparametric estimates to diagnose the parametric form of the tag-loss rate. The nonparametric methods can also be used directly to model tag-loss rates.

8.

Semiparametric variance components models for genetic studies with longitudinal phenotypes

Wang Y Huang C 《Biostatistics (Oxford, England)》2012,13(3):482-496

In a family-based genetic study such as the Framingham Heart Study (FHS), longitudinal trait measurements are recorded on subjects collected from families. Observations on subjects from the same family are correlated due to shared genetic composition or environmental factors such as diet. The data have a 3-level structure with measurements nested in subjects and subjects nested in families. We propose a semiparametric variance components model to describe the phenotype observed at a time point as the sum of a nonparametric population mean function, a nonparametric random quantitative trait locus (QTL) effect, a shared environmental effect, a residual random polygenic effect and measurement error. One feature of the model is that we do not assume a parametric functional form of the age-dependent QTL effect, and we use a penalized spline-based method to fit the model. We obtain nonparametric estimation of the QTL heritability defined as the ratio of the QTL variance to the total phenotypic variance. We use simulation studies to investigate the performance of the proposed methods and apply these methods to the FHS systolic blood pressure data to estimate the age-specific QTL effect at 62 cM on chromosome 17.

9.

Estimation of rates-across-sites distributions in phylogenetic substitution models

Susko E Field C Blouin C Roger AJ 《Systematic biology》2003,52(5):594-603

Previous work has shown that it is often essential to account for the variation in rates at different sites in phylogenetic models in order to avoid phylogenetic artifacts such as long branch attraction. In most current models, the gamma distribution is used for the rates-across-sites distributions and is implemented as an equal-probability discrete gamma. In this article, we introduce discrete distribution estimates with large numbers of equally spaced rate categories allowing us to investigate the appropriateness of the gamma model. With large numbers of rate categories, these discrete estimates are flexible enough to approximate the shape of almost any distribution. Likelihood ratio statistical tests and a nonparametric bootstrap confidence-bound estimation procedure based on the discrete estimates are presented that can be used to test the fit of a parametric family. We applied the methodology to several different protein data sets, and found that although the gamma model often provides a good parametric model for this type of data, rate estimates from an equal-probability discrete gamma model with a small number of categories will tend to underestimate the largest rates. In cases when the gamma model assumption is in doubt, rate estimates coming from the discrete rate distribution estimate with a large number of rate categories provide a robust alternative to gamma estimates. An alternative implementation of the gamma distribution is proposed that, for equal numbers of rate categories, is computationally more efficient during optimization than the standard gamma implementation and can provide more accurate estimates of site rates.
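The "equal-probability discrete gamma" mentioned in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes a mean-one gamma (shape and rate both equal to `alpha`), splits it into `k` categories of equal probability, and represents each category by its mean rate. All function names are hypothetical, and the incomplete gamma function is computed with a plain power series so that only the standard library is needed.

```python
import math


def reg_lower_gamma(a, x):
    """Regularized lower incomplete gamma function P(a, x), by power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-16 * abs(total) and n < 10000:
        term *= x / (a + n)
        total += term
        n += 1
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))


def gamma_cdf(rate, alpha):
    """CDF of a Gamma(shape=alpha, rate=alpha) variable, which has mean 1."""
    return reg_lower_gamma(alpha, alpha * rate)


def gamma_quantile(p, alpha):
    """Quantile of the mean-one gamma, found by bisection."""
    lo, hi = 0.0, 1.0
    while gamma_cdf(hi, alpha) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, alpha) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def discrete_gamma_rates(alpha, k):
    """Mean rate within each of k equal-probability categories."""
    cuts = [0.0] + [gamma_quantile(i / k, alpha) for i in range(1, k)]
    # The partial mean of a gamma over [0, c] is P(alpha + 1, alpha * c).
    partial_means = [reg_lower_gamma(alpha + 1.0, alpha * c) for c in cuts]
    partial_means.append(1.0)
    return [k * (partial_means[i + 1] - partial_means[i]) for i in range(k)]
```

With `alpha = 0.5` and `k = 4`, the largest category rate returned by `discrete_gamma_rates` is well below the 99th percentile given by `gamma_quantile(0.99, 0.5)`. This is one way to see why a small number of categories tends to understate the largest site rates.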

10.

Nonparametrically Weighted Least Squares Estimation in Heteroscedastic Linear Regression

Anthony Y.C. Kuk 《Biometrical journal. Biometrische Zeitschrift》1999,41(4):401-410

The weights used in iterative weighted least squares (IWLS) regression are usually estimated parametrically using a working model for the error variance. When the variance function is misspecified, the IWLS estimates of the regression coefficients β are still asymptotically consistent but there is some loss in efficiency. Since second moments can be quite hard to model, it makes sense to estimate the error variances nonparametrically and to employ weights inversely proportional to the estimated variances in computing the WLS estimate for β. Surprisingly, this approach had not received much attention in the literature. The aim of this note is to demonstrate that such a procedure can be implemented easily in S-plus using standard functions with default options making it suitable for routine applications. The particular smoothing method that we use is local polynomial regression applied to the logarithm of the squared residuals but other smoothers can be tried as well. The proposed procedure is applied to data on the use of two different assay methods for a hormone. Efficiency calculations based on the estimated model show that the nonparametric IWLS estimates are more efficient than the parametric IWLS estimates based on three different plausible working models for the variance function. The proposed estimators also perform well in a simulation study using both parametric and nonparametric variance functions as well as normal and gamma errors.
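The procedure this abstract outlines (smooth the log squared residuals, then reweight by the inverse of the estimated variances) can be sketched as follows. This is a hedged illustration rather than the paper's S-plus implementation. It assumes a single covariate, smooths against that covariate with a Gaussian-kernel local linear fit, and uses a fixed bandwidth. The function names, the bandwidth, and the number of iterations are all hypothetical choices.

```python
import math


def wls_line(x, y, w):
    """Weighted least squares fit of y = a + b * x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b


def local_linear(x, z, x0, bandwidth):
    """Local linear (degree-1 local polynomial) smooth of z on x, at x0."""
    kernel = [math.exp(-0.5 * ((xi - x0) / bandwidth) ** 2) for xi in x]
    a, b = wls_line(x, z, kernel)
    return a + b * x0


def nonparametric_wls(x, y, bandwidth=1.0, n_iter=3):
    """Iterate: residuals -> smoothed log variance -> weights -> WLS refit."""
    a, b = wls_line(x, y, [1.0] * len(x))  # start from ordinary least squares
    w = [1.0] * len(x)
    for _ in range(n_iter):
        # Small constant guards against log(0) for an exactly zero residual.
        log_r2 = [math.log((yi - a - b * xi) ** 2 + 1e-12)
                  for xi, yi in zip(x, y)]
        log_var = [local_linear(x, log_r2, xi, bandwidth) for xi in x]
        w = [math.exp(-v) for v in log_var]  # weight = 1 / estimated variance
        a, b = wls_line(x, y, w)
    return a, b, w
```

Working on the log scale keeps the estimated variances positive. The smoothed log squared residuals estimate the log variance only up to an additive constant, which rescales all weights equally and so leaves the WLS fit unchanged.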

11.

An asymptotic theory for model selection inference in general semiparametric problems (total citations: 2; self-citations: 0; citations by others: 2)

Claeskens Gerda; Carroll Raymond J. 《Biometrika》2007,94(2):249-265

Hjort & Claeskens (2003) developed an asymptotic theory for model selection, model averaging and subsequent inference using likelihood methods in parametric models, along with associated confidence statements. In this article, we consider a semiparametric version of this problem, wherein the likelihood depends on parameters and an unknown function, and model selection/averaging is to be applied to the parametric parts of the model. We show that all the results of Hjort & Claeskens hold in the semiparametric context, if the Fisher information matrix for parametric models is replaced by the semiparametric information bound for semiparametric models, and if maximum likelihood estimators for parametric models are replaced by semiparametric efficient profile estimators. Our methods of proof employ Le Cam's contiguity lemmas, leading to transparent results. The results also describe the behaviour of semiparametric model estimators when the parametric component is misspecified, and also have implications for pointwise-consistent model selectors.

12.

Semiparametric frailty models for clustered failure time data

Yu Z Lin X Tu W 《Biometrics》2012,68(2):429-436

We consider frailty models with additive semiparametric covariate effects for clustered failure time data. We propose a doubly penalized partial likelihood (DPPL) procedure to estimate the nonparametric functions using smoothing splines. We show that the DPPL estimators can be obtained by fitting an augmented working frailty model with parametric covariate effects, with the nonparametric functions estimated as linear combinations of fixed and random effects and the smoothing parameters estimated as extra variance components. This approach allows us to conveniently estimate all model components within a unified frailty model framework. We evaluate the finite sample performance of the proposed method via a simulation study, and apply the method to analyze data from a study of sexually transmitted infections (STI).

13.

Two-component mixture cure rate model with spline estimated nonparametric components

Wang L Du P Liang H 《Biometrics》2012,68(3):726-735

Summary: In survival analyses of medical studies, there are often long-term survivors who can be considered permanently cured. The goals in these studies are to estimate the noncured probability of the whole population and the hazard rate of the susceptible subpopulation. When covariates are present, as often happens in practice, understanding covariate effects on the noncured probability and hazard rate is of equal importance. The existing methods are limited to parametric and semiparametric models. We propose a two-component mixture cure rate model with nonparametric forms for both the cure probability and the hazard rate function. Identifiability of the model is guaranteed by an additive assumption that allows no time-covariate interactions in the logarithm of hazard rate. Estimation is carried out by an expectation-maximization algorithm on maximizing a penalized likelihood. For inferential purposes, we apply the Louis formula to obtain point-wise confidence intervals for noncured probability and hazard rate. Asymptotic convergence rates of our function estimates are established. We then evaluate the proposed method by extensive simulations. We analyze the survival data from a melanoma study and find interesting patterns for this study.

14.

Hypothesis testing in semiparametric additive mixed models

Zhang D Lin X 《Biostatistics (Oxford, England)》2003,4(1):57-74

We consider testing whether the nonparametric function in a semiparametric additive mixed model is a simple fixed degree polynomial, for example, a simple linear function. This test provides a goodness-of-fit test for checking parametric models against nonparametric models. It is based on the mixed-model representation of the smoothing spline estimator of the nonparametric function and the variance component score test by treating the inverse of the smoothing parameter as an extra variance component. We also consider testing the equivalence of two nonparametric functions in semiparametric additive mixed models for two groups, such as treatment and placebo groups. The proposed tests are applied to data from an epidemiological study and a clinical trial and their performance is evaluated through simulations.

15.

Modeling Age and Nest‐Specific Survival Using a Hierarchical Bayesian Approach

Jing Cao Chong Z. He Kimberly M. Suedkamp Wells Joshua J. Millspaugh Mark R. Ryan 《Biometrics》2009,65(4):1052-1062

Summary: Recent studies have shown that grassland birds are declining more rapidly than any other group of terrestrial birds. Current methods of estimating avian age‐specific nest survival rates require knowing the ages of nests, assuming homogeneous nests in terms of nest survival rates, or treating the hazard function as a piecewise step function. In this article, we propose a Bayesian hierarchical model with nest‐specific covariates to estimate age‐specific daily survival probabilities without the above requirements. The model provides a smooth estimate of the nest survival curve and identifies the factors that are related to the nest survival. The model can handle irregular visiting schedules and it has the least restrictive assumptions compared to existing methods. Without assuming proportional hazards, we use a multinomial semiparametric logit model to specify a direct relation between age‐specific nest failure probability and nest‐specific covariates. An intrinsic autoregressive prior is employed for the nest age effect. This nonparametric prior provides a more flexible alternative to the parametric assumptions. The Bayesian computation is efficient because the full conditional posterior distributions either have closed forms or are log concave. We use the method to analyze a Missouri dickcissel dataset and find that (1) nest survival is not homogeneous during the nesting period, and it reaches its lowest at the transition from incubation to nestling; and (2) nest survival is related to grass cover and vegetation height in the study area.

16.

Semiparametric transformation models with random effects for joint analysis of recurrent and terminal events

Zeng D Lin DY 《Biometrics》2009,65(3):746-752

Summary: We propose a broad class of semiparametric transformation models with random effects for the joint analysis of recurrent events and a terminal event. The transformation models include proportional hazards/intensity and proportional odds models. We estimate the model parameters by the nonparametric maximum likelihood approach. The estimators are shown to be consistent, asymptotically normal, and asymptotically efficient. Simple and stable numerical algorithms are provided to calculate the parameter estimators and to estimate their variances. Extensive simulation studies demonstrate that the proposed inference procedures perform well in realistic settings. Applications to two HIV/AIDS studies are presented.

17.

Synthetic Statistical Approach Reveals a High Degree of Richness of Microbial Eukaryotes in an Anoxic Water Column (total citations: 5; self-citations: 0; citations by others: 5)

S.-O. Jeon J. Bunge T. Stoeck K. J.-A. Barger S.-H. Hong S. S. Epstein 《Applied and environmental microbiology》2006,72(10):6578-6583

Molecular surveys suggest that communities of microbial eukaryotes are remarkably rich, because even large clone libraries seem to capture only a minority of species. This provides a qualitative picture of protistan richness but does not measure its real extent either locally or globally. Statistical analysis can estimate a community's richness, but the specific methods used to date are not always well grounded in statistical theory. Here we study a large protistan molecular survey from an anoxic water column in the Cariaco Basin (Caribbean Sea). We group individual 18S rRNA gene sequences into operational taxonomic units (OTUs) using different cutoff values for sequence similarity (99 to 50%) and systematically apply parametric models and nonparametric estimators to the OTU frequency data to estimate the total protistan diversity. The parametric models provided statistically sound estimates of protistan richness, with biologically meaningful standard errors, maximal data usage, and extensive model diagnostics and were preferable to the available nonparametric tools. Our clone library exceeded 700 clones but still covered only a minority of species and less than half of the larger protistan clades. Our estimates of total protistan richness portray the target community as very rich at all OTU levels, with hundreds of different populations apparently co-occurring in the small (3-liter) volume of our sample, as well as dozens of clades of the highest taxonomic order. These estimates are among the first for microbial eukaryotes that are obtained using state-of-the-art statistical methods and can serve as benchmark numbers for the local diversity of protists.
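To make "nonparametric estimators applied to the OTU frequency data" concrete, here is a minimal sketch of one widely used estimator of this kind, Chao1. The abstract does not say which nonparametric estimators were used, so the choice of Chao1 is an assumption made purely for illustration, and the function name is hypothetical. The input is a list of clone counts per OTU.

```python
from collections import Counter


def chao1(otu_counts):
    """Chao1 lower-bound estimate of total richness from per-OTU counts."""
    observed = [c for c in otu_counts if c > 0]
    freq = Counter(observed)
    s_obs = len(observed)          # number of OTUs actually seen
    f1, f2 = freq[1], freq[2]      # singletons and doubletons
    if f2 > 0:
        return s_obs + f1 * f1 / (2.0 * f2)
    # Bias-corrected form, used here when there are no doubletons.
    return s_obs + f1 * (f1 - 1) / 2.0
```

The estimator adds to the observed OTU count a correction driven by how many OTUs were seen only once or twice. A library dominated by singletons therefore yields an estimate far above the observed count, in line with the abstract's point that the clone library covered only a minority of species.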

18.

A Novel Targeted Learning Method for Quantitative Trait Loci Mapping

Hui Wang Zhongyang Zhang Sherri Rose Mark van der Laan 《Genetics》2014,198(4):1369-1376

We present a novel semiparametric method for quantitative trait loci (QTL) mapping in experimental crosses. Conventional genetic mapping methods typically assume parametric models with Gaussian errors and obtain parameter estimates through maximum-likelihood estimation. In contrast with univariate regression and interval-mapping methods, our model requires fewer assumptions and also accommodates various machine-learning algorithms. Estimation is performed with targeted maximum-likelihood learning methods. We demonstrate our semiparametric targeted learning approach in a simulation study and a well-studied barley data set.

19.

Synthetic statistical approach reveals a high degree of richness of microbial eukaryotes in an anoxic water column

Jeon SO Bunge J Stoeck T Barger KJ Hong SH Epstein SS 《Applied and environmental microbiology》2006,72(10):6578-6583

Molecular surveys suggest that communities of microbial eukaryotes are remarkably rich, because even large clone libraries seem to capture only a minority of species. This provides a qualitative picture of protistan richness but does not measure its real extent either locally or globally. Statistical analysis can estimate a community's richness, but the specific methods used to date are not always well grounded in statistical theory. Here we study a large protistan molecular survey from an anoxic water column in the Cariaco Basin (Caribbean Sea). We group individual 18S rRNA gene sequences into operational taxonomic units (OTUs) using different cutoff values for sequence similarity (99 to 50%) and systematically apply parametric models and nonparametric estimators to the OTU frequency data to estimate the total protistan diversity. The parametric models provided statistically sound estimates of protistan richness, with biologically meaningful standard errors, maximal data usage, and extensive model diagnostics and were preferable to the available nonparametric tools. Our clone library exceeded 700 clones but still covered only a minority of species and less than half of the larger protistan clades. Our estimates of total protistan richness portray the target community as very rich at all OTU levels, with hundreds of different populations apparently co-occurring in the small (3-liter) volume of our sample, as well as dozens of clades of the highest taxonomic order. These estimates are among the first for microbial eukaryotes that are obtained using state-of-the-art statistical methods and can serve as benchmark numbers for the local diversity of protists.

20.

Nonparametric and semiparametric group sequential methods for comparing accuracy of diagnostic tests

Tang L Emerson SS Zhou XH 《Biometrics》2008,64(4):1137-1145

SUMMARY: Comparison of the accuracy of two diagnostic tests using the receiver operating characteristic (ROC) curves from two diagnostic tests has been typically conducted using fixed sample designs. On the other hand, the human experimentation inherent in a comparison of diagnostic modalities argues for periodic monitoring of the accruing data to address many issues related to the ethics and efficiency of the medical study. To date, very little research has been done on the use of sequential sampling plans for comparative ROC studies, even when these studies may use expensive and unsafe diagnostic procedures. In this article we propose a nonparametric group sequential design plan. The nonparametric sequential method adapts a nonparametric family of weighted area under the ROC curve statistics (Wieand et al., 1989, Biometrika 76, 585-592) and a group sequential sampling plan. We illustrate the implementation of this nonparametric approach for sequentially comparing ROC curves in the context of diagnostic screening for nonsmall-cell lung cancer. We also describe a semiparametric sequential method based on proportional hazard models. We compare the statistical properties of the nonparametric approach with alternative semiparametric and parametric analyses in simulation studies. The results show the nonparametric approach is robust to model misspecification and has excellent finite-sample performance.
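The building block behind a nonparametric comparison of two ROC curves is the empirical area under the curve (AUC). The sketch below shows the plain, unweighted version, which is a simplifying assumption: the abstract refers to a family of weighted AUC statistics and a group sequential monitoring plan, neither of which is reproduced here. Function names are hypothetical, and higher scores are assumed to indicate disease.

```python
def empirical_auc(diseased, healthy):
    """Empirical AUC: share of (diseased, healthy) pairs ranked correctly.

    Ties between a diseased and a healthy score count as half a win.
    """
    wins = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(diseased) * len(healthy))


def auc_difference(diseased_a, healthy_a, diseased_b, healthy_b):
    """Difference in empirical AUC between diagnostic tests A and B."""
    return (empirical_auc(diseased_a, healthy_a)
            - empirical_auc(diseased_b, healthy_b))
```

In a group sequential design, a statistic of this kind would be recomputed at each interim look as data accrue and compared against a stopping boundary. The boundary construction is beyond this sketch.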