Similar literature
20 similar documents found (search time: 46 ms)
1.
Evaluation of the likelihood in mixed models for non-normal data, e.g. dependent binary data, involves high-dimensional integration, which poses severe numerical problems. Penalized quasi-likelihood, iterative re-weighted restricted maximum likelihood and adjusted profile h-likelihood estimation are methods which avoid numerical integration. They will be derived by approximation of the maximum likelihood equations. For binary data, these estimation procedures may yield seriously biased estimates for components of variance, intra-class correlation or heritability. An analytical evaluation of a simple example illustrates how critical the approximations can be for the performance of the variance component estimators.

2.
Balshaw RF  Dean CB 《Biometrics》2002,58(2):324-331
In many longitudinal studies, interest focuses on the occurrence rate of some phenomenon for the subjects in the study. When the phenomenon is nonterminating and possibly recurring, the result is a recurrent-event data set. Examples include epileptic seizures and recurrent cancers. When the recurring event is detectable only by an expensive or invasive examination, only the number of events occurring between follow-up times may be available. This article presents a semiparametric model for such data, based on a multiplicative intensity model paired with a fully flexible nonparametric baseline intensity function. A random subject-specific effect is included in the intensity model to account for the overdispersion frequently displayed in count data. Estimators are determined from quasi-likelihood estimating functions. Because only first- and second-moment assumptions are required for quasi-likelihood, the method is more robust than those based on the specification of a full parametric likelihood. Consistency of the estimators depends only on the assumption of the proportional intensity model. The semiparametric estimators are shown to be highly efficient compared with the usual parametric estimators. As with semiparametric methods in survival analysis, the method provides useful diagnostics for specific parametric models, including a quasi-score statistic for testing specific baseline intensity functions. The techniques are used to analyze cancer recurrences and a pheromone-based mating disruption experiment in moths. A simulation study confirms that, for many practical situations, the estimators possess appropriate small-sample characteristics.

3.
When overdispersed logistic-linear models are fitted by maximum quasi-likelihood, hypotheses can be tested by comparing the Wald statistic, the quasi-likelihood score statistic, or the quasi-likelihood-ratio statistic with the approximating null χ² distribution. This paper reports a simulation study of the reliability of these tests. Some factors affecting their relative reliabilities are identified. An extended quasi-likelihood ratio test is also considered.

4.
Saha K  Paul S 《Biometrics》2005,61(1):179-185
We derive a first-order bias-corrected maximum likelihood estimator for the negative binomial dispersion parameter. This estimator is compared, in terms of bias and efficiency, with the maximum likelihood estimator investigated by Piegorsch (1990, Biometrics 46, 863-867), the moment and the maximum extended quasi-likelihood estimators investigated by Clark and Perry (1989, Biometrics 45, 309-316), and a double-extended quasi-likelihood estimator. The bias-corrected maximum likelihood estimator has superior bias and efficiency properties in most instances. For ease of comparison we give results for the two-parameter negative binomial model. However, an example involving negative binomial regression is given.

5.
To assess evidence for genetic linkage from pedigrees, I developed a limited variance-components approach. In this method, variability among trait observations from individuals within pedigrees is expressed in terms of fixed effects from covariates and effects due to an unobservable trait-affecting major locus, random polygenic effects, and residual nongenetic variance. The effect attributable to a locus linked to a marker is a function of the additive and dominance components of variance of the locus, the recombination fraction, and the proportion of genes identical by descent at the marker locus for each pair of sibs. For unlinked loci, the polygenic variance component depends only on the relationship between the relative pair. Parameters can be estimated by either maximum-likelihood methods or quasi-likelihood methods. The forms of quasi-likelihood estimators are provided. Hypothesis tests derived from the maximum-likelihood approach are constructed by appeal to asymptotic theory. A simulation study showed that the size of likelihood-ratio tests was appropriate but that the monogenic component of variance was generally underestimated by the likelihood approach.

6.
The comparison of the efficiency of two binary diagnostic tests requires one to know the disease status for all patients in the sample, by applying a gold standard. In two-phase studies the gold standard is not applied to all patients in a sample, and the problem of partial verification of the disease arises. At present, likelihood ratios are among the most widely used approaches for comparing two binary diagnostic tests. In this study, the maximum likelihood estimators of likelihood ratios are obtained. Hypothesis tests to compare the likelihood ratios of two binary diagnostic tests when both are applied to the same random sample in the presence of verification bias are deduced, and simulation experiments are performed in order to investigate the asymptotic behaviour of these tests. The results obtained have been applied to the study of Alzheimer's disease.
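
For readers unfamiliar with the quantities being compared, the following is a minimal sketch of the positive and negative likelihood ratios of a single binary test. It assumes complete verification, i.e. every patient has a gold-standard result, so it does not implement the verification-bias-corrected estimators of the paper. The function name and the 2x2 count arguments are illustrative.

```python
def likelihood_ratios(tp, fp, fn, tn):
    """Plug-in likelihood ratios from a fully verified 2x2 table.

    tp, fn: diseased patients with a positive / negative test result
    fp, tn: non-diseased patients with a positive / negative test result
    Assumes every patient was verified by the gold standard.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_positive = sensitivity / (1 - specificity)   # P(T+ | D) / P(T+ | not D)
    lr_negative = (1 - sensitivity) / specificity   # P(T- | D) / P(T- | not D)
    return lr_positive, lr_negative


lr_pos, lr_neg = likelihood_ratios(tp=80, fp=10, fn=20, tn=90)
print(round(lr_pos, 3), round(lr_neg, 3))
```

When only a subset of patients is verified, these naive ratios are generally biased, which is the situation the abstract addresses.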

7.
A covariance estimator for GEE with improved small-sample properties (cited by: 2 in total; self-citations: 0; citations by others: 2)
Mancl LA  DeRouen TA 《Biometrics》2001,57(1):126-134
In this paper, we propose an alternative covariance estimator to the robust covariance estimator of generalized estimating equations (GEE). Hypothesis tests using the robust covariance estimator can have inflated size when the number of independent clusters is small. Resampling methods, such as the jackknife and bootstrap, have been suggested for covariance estimation when the number of clusters is small. A drawback of the resampling methods when the response is binary is that the methods can break down when the number of subjects is small due to zero or near-zero cell counts caused by resampling. We propose a bias-corrected covariance estimator that avoids this problem. In a small simulation study, we compare the bias-corrected covariance estimator to the robust and jackknife covariance estimators for binary responses for situations involving 10-40 subjects with equal and unequal cluster sizes of 16-64 observations. The bias-corrected covariance estimator gave tests with sizes close to the nominal level even when the number of subjects was 10 and cluster sizes were unequal, whereas the robust and jackknife covariance estimators gave tests with sizes that could be 2-3 times the nominal level. The methods are illustrated using data from a randomized clinical trial on treatment for bone loss in subjects with periodontal disease.

8.
Huang X  Tebbs JM 《Biometrics》2009,65(3):710-718
Summary. We consider structural measurement error models for a binary response. We show that likelihood-based estimators obtained from fitting structural measurement error models with pooled binary responses can be far more robust to covariate measurement error in the presence of latent-variable model misspecification than the corresponding estimators from individual responses. Furthermore, despite the loss in information, pooling can provide improved parameter estimators in terms of mean-squared error. Based on these and other findings, we create a new diagnostic method to detect latent-variable model misspecification in structural measurement error models with individual binary response. We use simulation and data from the Framingham Heart Study to illustrate our methods.

9.
Cluster randomized trials (CRTs) frequently recruit a small number of clusters, therefore necessitating the application of small-sample corrections for valid inference. A recent systematic review indicated that CRTs reporting right-censored, time-to-event outcomes are not uncommon and that the marginal Cox proportional hazards model is one of the common approaches used for primary analysis. While small-sample corrections have been studied under marginal models with continuous, binary, and count outcomes, no prior research has been devoted to the development and evaluation of bias-corrected sandwich variance estimators when clustered time-to-event outcomes are analyzed by the marginal Cox model. To improve current practice, we propose nine bias-corrected sandwich variance estimators for the analysis of CRTs using the marginal Cox model and report on a simulation study to evaluate their small-sample properties. Our results indicate that the optimal choice of bias-corrected sandwich variance estimator for CRTs with survival outcomes can depend on the variability of cluster sizes and can also differ slightly depending on whether it is evaluated according to relative bias or type I error rate. Finally, we illustrate the new variance estimators in a real-world CRT where the conclusion about intervention effectiveness differs depending on the use of small-sample bias corrections. The proposed sandwich variance estimators are implemented in an R package, CoxBcv.

10.
Clustered interval-censored failure time data occur when the failure times of interest are clustered into small groups and known only to lie in certain intervals. A number of methods have been proposed for regression analysis of clustered failure time data, but most of them apply only to clustered right-censored data. In this paper, a sieve estimation procedure is proposed for fitting a Cox frailty model to clustered interval-censored failure time data. In particular, a two-step algorithm for parameter estimation is developed and the asymptotic properties of the resulting sieve maximum likelihood estimators are established. The finite sample properties of the proposed estimators are investigated through a simulation study and the method is illustrated by the data arising from a lymphatic filariasis study.

11.
In this paper, we develop a Gaussian estimation (GE) procedure to estimate the parameters of a regression model for correlated (longitudinal) binary response data using a working correlation matrix. A two‐step iterative procedure is proposed for estimating the regression parameters by the GE method and the correlation parameters by the method of moments. Consistency properties of the estimators are discussed. A simulation study was conducted to compare 11 estimators of the regression parameters, namely, four versions of the GE, five versions of the generalized estimating equations (GEEs), and two versions of the weighted GEE. Simulations show that (i) the Gaussian estimates have the smallest mean square error and best coverage probability if the working correlation structure is correctly specified and (ii) when the working correlation structure is correctly specified, the GE and the GEE with exchangeable correlation structure perform best as opposed to when the correlation structure is misspecified.

12.
The measurement of biallelic pairwise association, called linkage disequilibrium (LD), is important for understanding the genomic architecture. A plethora of measures of association in two-by-two tables have been proposed in the literature. Besides the problem of choosing an appropriate measure, the problem of their estimation has been neglected in the literature. It needs to be emphasized that the definition of a measure and the choice of an estimator function for it are conceptually unrelated tasks. In this paper, we compare the performance of various estimators for the three popular LD measures D', r and Y in a simulation study for small to moderate sample sizes (N ≤ 500). The usual frequency plug-in estimators can lead to unreliable or undefined estimates. Estimators based on the computationally expensive volume measures have been proposed recently as a remedy to this well-known problem. We confirm that volume estimators have better expected mean square error than the naive plug-in estimators. But they are outperformed by estimators that plug easy-to-calculate non-informative Bayesian probability estimates into the theoretical formulae for the measures. Fully Bayesian estimators with non-informative Dirichlet priors have comparable accuracy but are computationally more expensive. We recommend the use of non-informative Bayesian plug-in estimators based on Jeffreys' prior, in particular when dealing with SNP array data where the occurrence of small table entries and table margins is likely.
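
The difference between the naive plug-in estimator and a Bayesian plug-in estimator can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name is hypothetical, the input is taken to be four haplotype counts of a 2x2 table, D' is returned with its sign, and the Jeffreys-prior plug-in is taken to mean the posterior-mean cell frequencies under a Dirichlet(1/2, 1/2, 1/2, 1/2) prior, which amounts to adding 0.5 to each cell.

```python
from math import sqrt


def ld_measures(n_AB, n_Ab, n_aB, n_ab, pseudo=0.0):
    """Return (D, D_prime, r) from the haplotype counts of a 2x2 table.

    pseudo is added to every cell before frequencies are computed.
    pseudo=0.0 gives the naive frequency plug-in estimator;
    pseudo=0.5 gives posterior-mean frequencies under a Jeffreys prior
    (an assumption about what the Bayesian plug-in means here).
    Measures that are undefined for the table are returned as None.
    """
    cells = [n + pseudo for n in (n_AB, n_Ab, n_aB, n_ab)]
    total = sum(cells)
    p_AB, p_Ab, p_aB, p_ab = (c / total for c in cells)
    p_A = p_AB + p_Ab          # frequency of allele A at the first locus
    p_B = p_AB + p_aB          # frequency of allele B at the second locus
    D = p_AB - p_A * p_B
    if D >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    denom = sqrt(p_A * (1 - p_A) * p_B * (1 - p_B))
    d_prime = D / d_max if d_max > 0 else None
    r = D / denom if denom > 0 else None
    return D, d_prime, r


# A table with an empty margin: the naive estimator is undefined,
# while the pseudo-count version still returns finite values.
print(ld_measures(5, 0, 0, 0))
print(ld_measures(5, 0, 0, 0, pseudo=0.5))
```

The sketch only shows why small table entries and margins break the naive estimator. It says nothing about the relative accuracy of the estimators, which is what the simulation study in the abstract evaluates.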

13.
Xi L  Yip PS  Watson R 《Biometrics》2007,63(1):228-236
A unified likelihood-based approach is proposed to estimate population size for a continuous-time closed capture-recapture experiment with frailty. The frailty model allows the capture intensity to vary with individual heterogeneity, time, and behavioral response. The individual heterogeneity effect is modeled as being gamma distributed. The first-capture and recapture intensities are assumed to be in constant proportion but may otherwise vary arbitrarily through time. The approach is also extended to capture-recapture experiments with possible random removals. Simulation studies are conducted to examine the performance of the proposed estimators. By asymptotic efficiency comparison and simulation studies, the proposed estimators have been shown to be superior to their discrete-time model counterparts in genuine continuous-time capture-recapture experiments.

14.
Maternity length of stay (LOS) is an important measure of hospital activity, but its empirical distribution is often positively skewed. A two-component gamma mixture regression model has been proposed to analyze the heterogeneous maternity LOS. The problem is that observations collected from the same hospital are often correlated, which can lead to spurious associations and misleading inferences. To account for the inherent correlation, random effects are incorporated within the linear predictors of the two-component gamma mixture regression model. An EM algorithm is developed for the residual maximum quasi-likelihood estimation of the regression coefficients and variance component parameters. The approach enables the correct identification and assessment of risk factors affecting the short-stay and long-stay patient subgroups. In addition, the predicted random effects can provide information on the inter-hospital variations after adjustment for patient characteristics and health provision factors. A simulation study shows that the estimators obtained via the EM algorithm perform well in all the settings considered. Application to a set of maternity LOS data for women having obstetrical delivery with multiple complicating diagnoses is illustrated.

15.
Robert M. Dorazio 《Biometrics》2012,68(4):1303-1312
Summary. Several models have been developed to predict the geographic distribution of a species by combining measurements of covariates of occurrence at locations where the species is known to be present with measurements of the same covariates at other locations where species occurrence status (presence or absence) is unknown. In the absence of species detection errors, spatial point‐process models and binary‐regression models for case‐augmented surveys provide consistent estimators of a species’ geographic distribution without prior knowledge of species prevalence. In addition, these regression models can be modified to produce estimators of species abundance that are asymptotically equivalent to those of the spatial point‐process models. However, if species presence locations are subject to detection errors, neither class of models provides a consistent estimator of covariate effects unless the covariates of species abundance are distinct and independently distributed from the covariates of species detection probability. These analytical results are illustrated using simulation studies of data sets that contain a wide range of presence‐only sample sizes. Analyses of presence‐only data of three avian species observed in a survey of landbirds in western Montana and northern Idaho are compared with site‐occupancy analyses of detections and nondetections of these species.

16.
Madsen L  Fang Y 《Biometrics》2011,67(3):1171-5; discussion 1175-6
Summary. We introduce an approximation to the Gaussian copula likelihood of Song, Li, and Yuan (2009, Biometrics 65, 60–68) used to estimate regression parameters from correlated discrete or mixed bivariate or trivariate outcomes. Our approximation allows estimation of parameters from response vectors of length much larger than three, and is asymptotically equivalent to the Gaussian copula likelihood. We estimate regression parameters from the toenail infection data of De Backer et al. (1996, British Journal of Dermatology 134, 16–17), which consist of binary response vectors of length seven or less from 294 subjects. Although maximizing the Gaussian copula likelihood yields estimators that are asymptotically more efficient than generalized estimating equation (GEE) estimators, our simulation study illustrates that for finite samples, GEE estimators can actually be as much as 20% more efficient.

17.
Multinomial data arise in many areas of the life sciences, such as mark-recapture studies and phylogenetics, and will often be overdispersed, with the variance being higher than predicted by a multinomial model. The quasi-likelihood approach to modeling this overdispersion involves the assumption that the variance is proportional to that specified by the multinomial model. As this approach does not require specification of the full distribution of the response variable, it can be more robust than fitting a Dirichlet-multinomial model or adding a random effect to the linear predictor. Estimation of the amount of overdispersion is often based on Pearson's statistic X² or the deviance D. For many types of study, such as mark-recapture, the data will be sparse. The estimator based on X² can then be highly variable, and that based on D can have a large negative bias. We derive a new estimator, which has a smaller asymptotic variance than that based on X², the difference being most marked for sparse data. We illustrate the numerical difference between the three estimators using a mark-recapture study of swifts and compare their performance via a simulation study. The new estimator has the lowest root mean squared error across a range of scenarios, especially when the data are very sparse.
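
The two conventional overdispersion estimators mentioned in this abstract can be sketched as follows. The sketch assumes the usual forms, X² divided by the residual degrees of freedom and the deviance divided by the residual degrees of freedom. The function names, and the convention of passing fitted expected counts and the degrees of freedom as arguments, are illustrative. The new estimator derived in the paper is not reproduced here because the abstract does not give its form.

```python
from math import log


def pearson_dispersion(observed, expected, df):
    """Overdispersion estimate X^2 / df from observed and fitted counts."""
    x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return x2 / df


def deviance_dispersion(observed, expected, df):
    """Overdispersion estimate D / df, with D the multinomial deviance.

    Cells with an observed count of zero contribute nothing to D.
    """
    dev = 2 * sum(o * log(o / e) for o, e in zip(observed, expected) if o > 0)
    return dev / df


observed = [10, 20, 30]
expected = [20.0, 20.0, 20.0]   # fitted counts from some multinomial model
print(pearson_dispersion(observed, expected, df=2))
print(deviance_dispersion(observed, expected, df=2))
```

With sparse data many expected counts are small, so individual terms of X² can become very large and zero cells drop out of D entirely. This is consistent with the high variability and negative bias described in the abstract.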

18.
We address estimation of the marginal effect of a time‐varying binary treatment on a continuous longitudinal outcome in the context of observational studies using electronic health records, when the relationship of interest is confounded, mediated, and further distorted by an informative visit process. We allow the longitudinal outcome to be recorded only sporadically and assume that its monitoring timing is informed by patients' characteristics. We propose two novel estimators based on linear models for the mean outcome that incorporate an adjustment for confounding and informative monitoring process through generalized inverse probability of treatment weights and a proportional intensity model, respectively. We allow for a flexible modeling of the intercept function as a function of time. Our estimators have closed‐form solutions, and their asymptotic distributions can be derived. Extensive simulation studies show that both estimators outperform standard methods such as the ordinary least squares estimator or estimators that only account for informative monitoring or confounders. We illustrate our methods using data from the Add Health study, assessing the effect of depressive mood on weight in adolescents.

19.
In observational studies of survival time featuring a binary time-dependent treatment, the hazard ratio (an instantaneous measure) is often used to represent the treatment effect. However, investigators are often more interested in the difference in survival functions. We propose semiparametric methods to estimate the causal effect of treatment among the treated with respect to survival probability. The objective is to compare post-treatment survival with the survival function that would have been observed in the absence of treatment. For each patient, we compute a prognostic score (based on the pre-treatment death hazard) and a propensity score (based on the treatment hazard). Each treated patient is then matched with an alive, uncensored and not-yet-treated patient with similar prognostic and/or propensity scores. The experience of each treated and matched patient is weighted using a variant of Inverse Probability of Censoring Weighting to account for the impact of censoring. We propose estimators of the treatment-specific survival functions (and their difference), computed through weighted Nelson–Aalen estimators. Closed-form variance estimators are proposed which take into consideration the potential replication of subjects across matched sets. The proposed methods are evaluated through simulation, then applied to estimate the effect of kidney transplantation on survival among end-stage renal disease patients using data from a national organ failure registry.
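
A generic weighted Nelson–Aalen estimator can be sketched as below. This is a simplified illustration, not the paper's estimator: each subject carries one fixed, user-supplied weight, whereas the abstract describes a variant of inverse probability of censoring weighting together with matching, neither of which is implemented here. The function names are hypothetical, and the conversion of the cumulative hazard to a survival probability via exp(-H) is an assumption of the sketch.

```python
from math import exp


def weighted_nelson_aalen(times, events, weights):
    """Weighted Nelson-Aalen cumulative hazard at each distinct event time.

    times:   observed follow-up time per subject
    events:  1 if the subject had the event at that time, 0 if censored
    weights: one fixed non-negative weight per subject (assumed given)
    Returns a list of (event_time, cumulative_hazard) pairs.
    """
    event_times = sorted({t for t, d in zip(times, events) if d})
    cumulative = 0.0
    curve = []
    for t in event_times:
        # weighted number of events occurring exactly at t
        d_w = sum(w for ti, di, w in zip(times, events, weights) if di and ti == t)
        # weighted number of subjects still at risk just before t
        r_w = sum(w for ti, w in zip(times, weights) if ti >= t)
        cumulative += d_w / r_w
        curve.append((t, cumulative))
    return curve


def survival_from_hazard(curve):
    """Convert cumulative hazards H(t) to survival estimates exp(-H(t))."""
    return [(t, exp(-h)) for t, h in curve]


curve = weighted_nelson_aalen([1, 2, 3], [1, 0, 1], [1.0, 1.0, 1.0])
print(curve)
print(survival_from_hazard(curve))
```

With all weights equal to one this reduces to the ordinary Nelson–Aalen estimator. Treatment-specific curves would be obtained by applying the function separately to the treated and the matched groups.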

20.
AFLP is a DNA fingerprinting technique, resulting in binary band presence–absence patterns, called profiles, with known or unknown band positions. We model AFLP as a sampling procedure of fragments, with lengths sampled from a distribution. Bands represent fragments of specific lengths. We focus on estimation of pairwise genetic similarity, defined as average fraction of common fragments, by AFLP. Usual estimators are Dice (D) or Jaccard coefficients. D overestimates genetic similarity, since identical bands in profile pairs may correspond to different fragments (homoplasy). Another complicating factor is the occurrence of different fragments of equal length within a profile, appearing as a single band, which we call collision. The bias of D increases with larger numbers of bands, and lower genetic similarity. We propose two homoplasy- and collision-corrected estimators of genetic similarity. The first is a modification of D, replacing band counts by estimated fragment counts. The second is a maximum likelihood estimator, only applicable if band positions are available. Properties of the estimators are studied by simulation. Standard errors and confidence intervals for the first are obtained by bootstrapping, and for the second by likelihood theory. The estimators are nearly unbiased, and have for most practical cases smaller standard error than D. The likelihood-based estimator generally gives the highest precision. The relationship between fragment counts and precision is studied using simulation. The usual range of band counts (50–100) appears nearly optimal. The methodology is illustrated using data from a phylogenetic study on lettuce.
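
The two usual similarity estimators named in this abstract can be sketched for a pair of binary band profiles as follows. The function names and the representation of a profile as a list of 0/1 values over common band positions are illustrative assumptions. The homoplasy- and collision-corrected estimators proposed in the paper are not implemented.

```python
def dice(profile_a, profile_b):
    """Dice coefficient: 2 * shared bands / (bands in a + bands in b)."""
    shared = sum(1 for x, y in zip(profile_a, profile_b) if x and y)
    return 2 * shared / (sum(profile_a) + sum(profile_b))


def jaccard(profile_a, profile_b):
    """Jaccard coefficient: shared bands / bands present in either profile."""
    shared = sum(1 for x, y in zip(profile_a, profile_b) if x and y)
    either = sum(1 for x, y in zip(profile_a, profile_b) if x or y)
    return shared / either


# Two profiles scored at the same five band positions (1 = band present).
a = [1, 1, 0, 1, 0]
b = [1, 0, 0, 1, 1]
print(dice(a, b), jaccard(a, b))
```

Both functions count a shared band as a shared fragment, which is the step that homoplasy and collision make optimistic. Neither function guards against profiles with no bands at all, for which the coefficients are undefined.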
