Similar Documents
20 similar documents found (search time: 328 ms)
1.
J A Hanley, M N Parnes. Biometrics, 1983, 39(1): 129-139
This paper presents examples of situations in which one wishes to estimate a multivariate distribution from data that may be right-censored. A distinction is made between what we term 'homogeneous' and 'heterogeneous' censoring. It is shown how a multivariate empirical survivor function must be constructed in order to be considered a (nonparametric) maximum likelihood estimate of the underlying survivor function. A closed-form solution, similar to the product-limit estimate of Kaplan and Meier, is possible with homogeneous censoring, but an iterative method, such as the EM algorithm, is required with heterogeneous censoring. An example is given in which an anomaly is produced if censored multivariate data are analyzed as a series of univariate variables; this anomaly is shown to disappear if the methods of this paper are used.
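In the univariate case, the closed-form product-limit estimate referred to above can be sketched in a few lines (an illustrative Python version, not the paper's multivariate construction; the function name is ours):

```python
import numpy as np

def kaplan_meier(times, events):
    """Product-limit estimate of the survivor function S(t).

    times  : observed times (event or censoring)
    events : 1 if the event was observed, 0 if right-censored
    Returns (distinct event times, S(t) just after each event time).
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)

    surv = 1.0
    out_t, out_s = [], []
    for t in np.unique(times[events == 1]):        # distinct event times, sorted
        at_risk = np.sum(times >= t)               # still under observation at t
        d = np.sum((times == t) & (events == 1))   # events occurring at t
        surv *= 1.0 - d / at_risk                  # product-limit update
        out_t.append(t)
        out_s.append(surv)
    return out_t, out_s

t, s = kaplan_meier([1, 2, 2, 3, 4], [1, 0, 1, 1, 0])
```

At each observed event time the survival probability is multiplied by the fraction of the risk set that survives; censored observations leave the product unchanged but shrink later risk sets.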

2.
The standard test for anti-haemagglutinin antibody titration is the haemagglutination inhibition (HI) test. The HI titre is defined as the dilution factor of the highest dilution that still completely inhibits haemagglutination. If the highest dilution tested (1:2560) still completely inhibits haemagglutination, an HI titre value of 2560 is assigned. Logarithmically transformed HI titres tend to be normally distributed. But because dilutions beyond 1:2560 are not tested, the distribution may be truncated and the assumption of normality may not hold. As a consequence, the geometric mean titre (GMT) will be underestimated. Using data from 10 clinical studies, it is shown here that the GMT may be underestimated by 5-13%. An unbiased estimate of the GMT can be obtained by a statistical method that originates from the analysis of survival data: maximum likelihood estimation for censored observations. The maximum likelihood estimate of the GMT of truncated HI titres can be readily obtained using the statistical software package SAS.
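The censored-observation maximum likelihood approach described above can be sketched in Python rather than SAS (an illustrative sketch; the function name and the simulated data are ours). Titres recorded at the assay's upper limit contribute a survival-function term P(X >= limit) to the likelihood instead of a density term:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def censored_normal_mle(x, censored):
    """ML estimate of (mu, sigma) for normal data, where x[i] is only a lower
    bound (right-censored at the assay limit) whenever censored[i] is True."""
    x = np.asarray(x, float)
    cens = np.asarray(censored, bool)

    def negloglik(theta):
        mu, log_sigma = theta
        sigma = np.exp(log_sigma)                    # keeps sigma positive
        ll = norm.logpdf(x[~cens], mu, sigma).sum()  # fully observed titres
        ll += norm.logsf(x[cens], mu, sigma).sum()   # P(X >= limit) terms
        return -ll

    start = [x.mean(), np.log(x.std())]              # naive moments as start
    res = minimize(negloglik, start, method="Nelder-Mead")
    return res.x[0], np.exp(res.x[1])

# simulated example: true log-titres ~ N(10, 1), assay limit at 11
rng = np.random.default_rng(0)
true_vals = rng.normal(10.0, 1.0, 2000)
limit = 11.0
cens = true_vals > limit
obs = np.where(cens, limit, true_vals)
mu_hat, sigma_hat = censored_normal_mle(obs, cens)
```

For log2-transformed titres, an unbiased GMT estimate would then be 2**mu_hat, whereas averaging the limit-substituted values underestimates the mean.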

3.
A common testing problem for life table or survival data is to test the equality of two survival distributions when the data are both grouped and censored. Several tests have been proposed in the literature which require various assumptions about the censoring distributions. It is shown that if these conditions are relaxed then the tests may no longer have the stated properties. The maximum likelihood test of equality when no assumptions are made about the censoring marginal distributions is derived. The properties of the test are found and it is compared to the existing tests. The fact that no assumptions are required about the censoring distributions makes the test a useful initial testing procedure.

4.
Regression models in survival analysis are most commonly applied to right-censored survival data. In some situations, the time to the event is not exactly observed, although it is known that the event occurred between two observed times. In practice, the moment of observation is frequently taken as the event occurrence time, and the interval-censoring mechanism is ignored. We present a cure rate defective model for interval-censored event-time data. A defective distribution is characterized by a density function that integrates to a value less than one when the parameter domain differs from the usual domain. We use the Gompertz and inverse Gaussian defective distributions to model data containing cured elements and estimate parameters using the maximum likelihood estimation procedure. We evaluate the performance of the proposed models using Monte Carlo simulation studies. The practical relevance of the models is illustrated with datasets on ovarian cancer recurrence and oral lesions in children after liver transplantation, both derived from studies performed at the A.C. Camargo Cancer Center in São Paulo, Brazil.

5.
The validity of limiting dilution assays can be compromised or negated by the use of statistical methodology which does not consider all issues surrounding the biological process. This study critically evaluates statistical methods for estimating the mean frequency of responding cells in multiple sample limiting dilution assays. We show that methods that pool limiting dilution assay data, or samples, are unable to estimate the variance appropriately. In addition, we use Monte Carlo simulations to evaluate an unweighted mean of the maximum likelihood estimator, an unweighted mean based on the jackknife estimator, and a log transform of the maximum likelihood estimator. For small culture replicate size, the log transform outperforms both unweighted mean procedures. For moderate culture replicate size, the unweighted mean based on the jackknife produces the most acceptable results. This study also addresses the important issue of experimental design in multiple sample limiting dilution assays. In particular, we demonstrate that optimization of multiple sample limiting dilution assays is achieved by increasing the number of biological samples at the expense of repeat cultures.

6.
Maximum likelihood for interval censored data: Consistency and computation
Standard convex optimization techniques are applied to the analysis of interval censored data. These methods provide easily verifiable conditions for the self-consistent estimator proposed by Turnbull (1976) to be a maximum likelihood estimator and for checking whether the maximum likelihood estimate is unique. A sufficient condition is given for the almost sure convergence of the maximum likelihood estimator to the true underlying distribution function.
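Turnbull's self-consistent estimator mentioned above can be sketched as a fixed-point (EM-style) iteration over a set of candidate support points (a minimal illustration under the assumption that the support is known and finite; production implementations first restrict the support to the innermost intervals):

```python
import numpy as np

def turnbull(intervals, support, n_iter=500):
    """Self-consistency (EM) iteration for interval-censored data.

    intervals : list of (left, right) pairs; the event time of observation i
                is known only to lie in [left, right].
    support   : candidate support points for the probability masses.
    Returns the estimated probability mass at each support point.
    """
    A = np.array([[1.0 if l <= s <= r else 0.0 for s in support]
                  for l, r in intervals])      # membership matrix alpha_ij
    n, m = A.shape
    p = np.full(m, 1.0 / m)                    # uniform starting masses
    for _ in range(n_iter):
        denom = A @ p                          # P(interval i) under current p
        p = (A * p).T @ (1.0 / denom) / n      # E-step and M-step in one line
    return p

p = turnbull([(1, 1), (1, 2), (2, 2)], [1, 2])
```

Each pass redistributes every observation's unit mass over the support points its interval covers, in proportion to the current estimate, then averages; fixed points of this map satisfy the self-consistency equation.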

7.
Multiple lower limits of quantification (MLOQs) result if various laboratories are involved in the analysis of concentration data and some observations are too low to be quantified. For normally distributed data with MLOQs, only the multiple regression method of Helsel is available to estimate the mean and variance. We propose a simple imputation method and two new maximum likelihood estimation methods: the multiple truncated sample method and the multiple censored sample method. A simulation study is conducted to compare the performances of the newly introduced methods to Helsel's via the criteria of root mean squared error (RMSE) and bias of the parameter estimates. Two and four lower limits of quantification (LLOQs), various amounts of unquantifiable observations, and two sample sizes are studied. Furthermore, the robustness is investigated under model misspecification. The methods perform with decreasing accuracy for increasing rates of unquantified observations. Increasing sample sizes lead to smaller bias. There is almost no change in performance between two and four LLOQs. The magnitude of the variance impairs the performance of all methods. For a smaller variance, the multiple censored sample method leads to superior estimates regarding the RMSE and bias, whereas Helsel's method is superior regarding the bias for a larger variance. Under model misspecification, Helsel's method was inferior to the other methods. In estimating the mean, the multiple censored sample method performs better, whereas the multiple truncated sample method performs best in estimating the variance. In summary, for a large sample size and normally distributed data we recommend using Helsel's method. Otherwise, the multiple censored sample method should be used to obtain estimates of the mean and variance of data including MLOQs.

8.
A commonly used tool in disease association studies is the search for discrepancies between the haplotype distribution in the case and control populations. To find this discrepancy, the haplotype frequencies in each population are estimated from the genotypes. We present a new method, HAPLOFREQ, to estimate haplotype frequencies over a short genomic region given the genotypes or haplotypes with missing data or sequencing errors. Our approach incorporates a maximum likelihood model based on a simple random generative model which assumes that the genotypes are independently sampled from the population. We first show that if the phased haplotypes are given, possibly with missing data, we can estimate the frequency of the haplotypes in the population by finding the global optimum of the likelihood function in polynomial time. If the haplotypes are not phased, finding the maximum value of the likelihood function is NP-hard. In this case, we define an alternative likelihood function which can be thought of as a relaxed likelihood function. We show that the maximum relaxed likelihood can be found in polynomial time and that the optimal solution of the relaxed likelihood converges asymptotically to the haplotype frequencies in the population. In contrast to previous approaches, our algorithms are guaranteed to converge in polynomial time to a global maximum of the different likelihood functions. We compared the performance of our algorithm to the widely used program PHASE and found that our estimates are at least 10% more accurate than those of PHASE and about ten times faster to compute. Our techniques involve new algorithms in convex optimization, which may be of independent interest; in particular, they may be helpful in other maximum likelihood problems arising from survey sampling.

9.
Survival estimation using splines
A nonparametric maximum likelihood procedure is given for estimating the survivor function from right-censored data. It approximates the hazard rate by a simple function such as a spline, with different approximations yielding different estimators. A special case is that proposed by Nelson (1969, Journal of Quality Technology 1, 27-52) and Altshuler (1970, Mathematical Biosciences 6, 1-11). The estimators are uniformly consistent and have the same asymptotic weak convergence properties as the Kaplan-Meier (1958, Journal of the American Statistical Association 53, 457-481) estimator. However, in small and in heavily censored samples, the simplest spline estimators have uniformly smaller mean squared error than do the Kaplan-Meier and Nelson-Altshuler estimators. The procedure is extended to estimate the baseline hazard rate and regression coefficients in the Cox (1972, Journal of the Royal Statistical Society, Series B 34, 187-220) proportional hazards model and is illustrated using experimental carcinogenesis data.
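The Nelson-Altshuler special case mentioned above amounts to the cumulative-hazard estimator now commonly called Nelson-Aalen; a minimal sketch from right-censored data (function name ours):

```python
import numpy as np

def nelson_aalen(times, events):
    """Nelson-Aalen estimate of the cumulative hazard H(t).

    times  : observed times (event or censoring)
    events : 1 if the event was observed, 0 if right-censored
    Returns (distinct event times, H(t) at each event time).
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)

    H, out_t, out_h = 0.0, [], []
    for t in np.unique(times[events == 1]):        # distinct event times, sorted
        at_risk = np.sum(times >= t)               # risk set size at t
        d = np.sum((times == t) & (events == 1))   # events at t
        H += d / at_risk                           # hazard increment
        out_t.append(t)
        out_h.append(H)
    return out_t, out_h

t, h = nelson_aalen([1, 2, 2, 3, 4], [1, 0, 1, 1, 0])
```

exp(-H(t)) then yields the Nelson-Altshuler estimate of the survivor function.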

10.
C C Wen, C T Lin. Biometrics, 2011, 67(3): 760-769
Statistical inference based on right-censored data for the proportional hazards (PH) model with missing covariates has received considerable attention, but interval-censored or current status data with missing covariates have not yet been investigated. Our study is partly motivated by the analysis of fracture data from the 2005 National Health Interview Survey Original Database in Taiwan, where the occurrence of fractures was interval censored and the covariate osteoporosis was not reported for all residents. We assume that the data are realized from a PH model. A semiparametric maximum likelihood estimate implemented by a hybrid algorithm is proposed to analyze current status data with missing covariates. A comparison of the performance of our method with full-cohort analysis, complete-case analysis, and surrogate analysis is made via simulation with moderate sample sizes. The fracture data are then analyzed.

11.
Air toxic emission factor datasets often contain one or more points below a single or multiple detection limits; such datasets are referred to as "censored." Conventional methods used to deal with censored datasets include removing non-detects or replacing the censored points with zero, half of the detection limit, or the detection limit. However, the means of censored datasets estimated by these conventional methods are usually biased. Maximum likelihood estimation (MLE) combined with bootstrap simulation has been demonstrated to be a statistically robust method for quantifying variability and uncertainty of censored datasets and can provide asymptotically unbiased mean estimates. The MLE/bootstrap method is applied to 16 cases of censored air toxic emission factors, including benzene, formaldehyde, benzo(a)pyrene, mercury, arsenic, cadmium, total chromium, chromium VI, and lead from coal, fuel oil, and/or wood waste external combustion sources. The proportion of censored values in the emission factor data ranges from 4% to 80%. Key factors that influence the estimated uncertainty in the mean of censored data are sample size and inter-unit variability. The largest range of uncertainty in the mean was obtained for the external coal combustion benzene emission factor, with the 95% confidence interval of the mean ranging from -93% to +411%.
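The bootstrap half of the MLE/bootstrap method can be sketched generically: resample emission-factor records with replacement, re-estimate the mean with whatever censored-data estimator is in use, and read off percentile confidence limits (an illustrative sketch, not the authors' code; the estimator is passed in as a function and all names are ours):

```python
import numpy as np

def bootstrap_mean_ci(x, detected, estimator, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of a censored dataset.

    x         : observed values (the detection limit where not detected)
    detected  : boolean flags, False marks a non-detect
    estimator : function (x, detected) -> estimate of the mean,
                e.g. a censored-data MLE
    """
    x = np.asarray(x, float)
    detected = np.asarray(detected, bool)
    rng = np.random.default_rng(seed)
    n = len(x)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, n)                # resample with replacement
        stats[b] = estimator(x[idx], detected[idx])
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return estimator(x, detected), lo, hi

# toy usage with a plain mean as the plug-in estimator
x = np.arange(20, dtype=float)
detected = np.ones(20, dtype=bool)
point, lo, hi = bootstrap_mean_ci(x, detected, lambda v, d: float(v.mean()),
                                  n_boot=500, seed=1)
```

The width of (lo, hi) reflects both sample size and inter-unit variability, the two drivers of uncertainty noted above.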

12.
J Cui. Biometrics, 1999, 55(2): 345-349
This paper proposes a nonparametric method for estimating a delay distribution based on left-censored and right-truncated data. A variance-covariance estimator is provided. The method is applied to the Australian AIDS data in which some data are left censored and some data are not left censored. This situation arises with AIDS case-reporting data in Australia because reporting delays were recorded only from November 1990 rather than from the beginning of the epidemic there. It is shown that inclusion of the left-censored data, as opposed to analyzing only the uncensored data, improves the precision of the estimate.

13.
P D Keightley. Genetics, 1998, 150(3): 1283-1293
The properties and limitations of maximum likelihood (ML) inference of genome-wide mutation rates (U) and parameters of distributions of mutation effects are investigated. Mutation parameters are estimated from simulated experiments in which mutations randomly accumulate in inbred lines. ML produces more accurate estimates than the procedure of Bateman and Mukai and is more robust if the data do not conform to the model assumed. Unbiased ML estimates of the mutation effects distribution parameters can be obtained if a value for U can be assumed, but if U is estimated simultaneously with the distribution parameters, likelihood may increase monotonically as a function of U. If the distribution of mutation effects is leptokurtic, the number of mutation events per line is large, or if genotypic values are poorly estimated, only a lower limit for U, an upper limit for the mean mutation effect, and a lower limit for the kurtosis of the distribution can be given. It is argued that such lower (upper) limits are appropriate minima (maxima). Estimates of the mean mutational effect are unbiased but may convey little about the properties of the distribution if it is leptokurtic.

14.
D M Cheng, S W Lagakos. Biometrics, 2000, 56(2): 626-633
In studies of chronic viral infections, the objective is to estimate the probabilities of developing viral eradication and resistance. Complications arise because the laboratory methods used to assess eradication status result in unusual types of censored observations. This paper proposes nonparametric methods for the one-sample analysis of viral eradication/resistance data. We show that the unconstrained nonparametric maximum likelihood estimators of the subdistributions of eradication and resistance are obtainable in closed form. In small samples, these estimators may be inadmissible; thus, we also present an algorithm for obtaining the constrained MLEs based on an isotonic regression of the unconstrained MLEs. Estimators of several functionals of the eradication and resistance subdistributions are also developed and discussed. The methods are illustrated with results from recent hepatitis C clinical trials.

15.
We would like to use maximum likelihood to estimate parameters such as the effective population size N_e or, if we do not know mutation rates, the product 4N_e mu of mutation rate per site and effective population size. To compute the likelihood for a sample of unrecombined nucleotide sequences taken from a random-mating population it is necessary to sum over all genealogies that could have led to the sequences, computing for each one the probability that it would have yielded the sequences, and weighting each one by its prior probability. The genealogies vary in tree topology and in branch lengths. Although the likelihood and the prior are straightforward to compute, the summation over all genealogies seems at first sight hopelessly difficult. This paper reports that it is possible to carry out a Monte Carlo integration to evaluate the likelihoods approximately. The method uses bootstrap sampling of sites to create data sets for each of which a maximum likelihood tree is estimated. The resulting trees are assumed to be sampled from a distribution whose height is proportional to the likelihood surface for the full data. That it will be so is dependent on a theorem which is not proven, but seems likely to be true if the sequences are not short. One can use the resulting estimated likelihood curve to make a maximum likelihood estimate of the parameter of interest, N_e or 4N_e mu. The method requires at least 100 times the computational effort required for estimation of a phylogeny by maximum likelihood, but is practical on today's workstations. The method does not at present have any way of dealing with recombination.

16.
G C Wei, M A Tanner. Biometrics, 1991, 47(4): 1297-1309
The first part of the article reviews the Data Augmentation algorithm and presents two approximations to the Data Augmentation algorithm for the analysis of missing-data problems: the Poor Man's Data Augmentation algorithm and the Asymptotic Data Augmentation algorithm. These two algorithms are then implemented in the context of censored regression data to obtain semiparametric methodology. The performances of the censored regression algorithms are examined in a simulation study. It is found, up to the precision of the study, that the bias of both the Poor Man's and Asymptotic Data Augmentation estimators, as well as the Buckley-James estimator, does not appear to differ from zero. However, with regard to mean squared error, over a wide range of settings examined in this simulation study, the two Data Augmentation estimators have a smaller mean squared error than does the Buckley-James estimator. In addition, associated with the two Data Augmentation estimators is a natural device for estimating the standard error of the estimated regression parameters. It is shown how this device can be used to estimate the standard error of either Data Augmentation estimate of any parameter (e.g., the correlation coefficient) associated with the model. In the simulation study, the estimated standard error of the Asymptotic Data Augmentation estimate of the regression parameter is found to be congruent with the Monte Carlo standard deviation of the corresponding parameter estimate. The algorithms are illustrated using the updated Stanford heart transplant data set.

17.
Different types of random binary topological trees (like neuronal processes and rivers) occur with relative frequencies that can be explained in terms of growth models. It will be shown how the model parameter determining the mode of growth can be estimated with the maximum likelihood procedure from observed data. Monte Carlo simulations were used to study the distributional properties of this estimator which appeared to have a negligible bias. It is shown that the minimum chi-square procedure yields an estimate that is very close to the maximum likelihood estimate. Moreover, the goodness-of-fit of the growth model can be inferred directly from the chi-square statistic. To illustrate the procedures we examined axonal trees from the goldfish tectum. A notion of complete partition randomness is presented as an alternative to our growth hypotheses.

18.
In survival models, some covariates affecting the lifetime cannot be observed or measured. These covariates may correspond to environmental or genetic factors and can be considered as a random effect related to a frailty of the individuals that explains their survival times. We propose a methodology based on a Birnbaum–Saunders frailty regression model, which can be applied to censored or uncensored data. Maximum-likelihood methods are used to estimate the model parameters and to derive local influence techniques. Diagnostic tools are important in regression to detect anomalies, such as departures from the error assumptions and the presence of outliers and influential cases. Normal curvatures for local influence under different perturbations are computed and two types of residuals are introduced. Two examples with uncensored and censored real-world data illustrate the proposed methodology. Comparison with classical frailty models is carried out in these examples, which shows the superiority of the proposed model.

19.
Distribution-free regression analysis of grouped survival data
Methods based on regression models for logarithmic hazard functions, Cox models, are given for analysis of grouped and censored survival data. By making an approximation it is possible to obtain explicitly a maximum likelihood function involving only the regression parameters. This likelihood function is a convenient analog to Cox's partial likelihood for ungrouped data. The method is applied to data from a toxicological experiment.

20.
Estimates of quantitative trait loci (QTL) effects derived from complete genome scans are biased, if no assumptions are made about the distribution of QTL effects. Bias should be reduced if estimates are derived by maximum likelihood, with the QTL effects sampled from a known distribution. The parameters of the distributions of QTL effects for nine economic traits in dairy cattle were estimated from a daughter design analysis of the Israeli Holstein population including 490 marker-by-sire contrasts. A separate gamma distribution was derived for each trait. Estimates for both the α and β parameters and their SE decreased as a function of heritability. The maximum likelihood estimates derived for the individual QTL effects using the gamma distributions for each trait were regressed relative to the least squares estimates, but the regression factor decreased as a function of the least squares estimate. On simulated data, the mean of least squares estimates for effects with nominal 1% significance was more than twice the simulated values, while the mean of the maximum likelihood estimates was slightly lower than the mean of the simulated values. The coefficient of determination for the maximum likelihood estimates was five-fold the corresponding value for the least squares estimates.


Copyright © Beijing Qinyun Technology Development Co., Ltd. (京ICP备09084417号)