Similar articles
 20 similar articles found (search time: 15 ms)
1.
Bagiella E 《Biometrics》2006,62(1):54-60
Age at ascertainment from prevalence case-control data identifies the age-specific odds of disease. When age at onset is available from the cases, the conditional distribution of age at onset, given that disease occurs, is identifiable. Combining both kinds of information by introducing a multiplicative intercept allows identification of the marginal distribution of age at onset. Here, the approach is extended to the two-sample setting through a generalization of the multiplicative intercept model. The efficiency of the approach is explored and a test statistic based on the integrated difference between distribution function estimates is proposed. An approach to regularization of the likelihood is discussed. The methods are illustrated through an application to data on colorectal polyps obtained from a case-control study of individuals undergoing colonoscopy.

2.
Schafer DW 《Biometrics》2001,57(1):53-61
This paper presents an EM algorithm for semiparametric likelihood analysis of linear, generalized linear, and nonlinear regression models with measurement errors in explanatory variables. A structural model is used in which probability distributions are specified for (a) the response and (b) the measurement error. A distribution is also assumed for the true explanatory variable but is left unspecified and is estimated by nonparametric maximum likelihood. For various types of extra information about the measurement error distribution, the proposed algorithm makes use of available routines that would be appropriate for likelihood analysis of (a) and (b) if the true x were available. Simulations suggest that the semiparametric maximum likelihood estimator retains a high degree of efficiency relative to the structural maximum likelihood estimator based on correct distributional assumptions and can outperform maximum likelihood based on an incorrect distributional assumption. The approach is illustrated on three examples with a variety of structures and types of extra information about the measurement error distribution.

3.
Bartolucci F, Pennoni F 《Biometrics》2007,63(2):568-578
We propose an extension of the latent class model for the analysis of capture-recapture data which allows us to take into account the effect of a capture on the behavior of a subject with respect to future captures. The approach is based on the assumption that the variable indexing the latent class of a subject follows a Markov chain with transition probabilities depending on the previous capture history. Several constraints are allowed on these transition probabilities and on the parameters of the conditional distribution of the capture configuration given the latent process. We also allow for the presence of discrete explanatory variables, which may affect the parameters of the latent process. To estimate the resulting models, we rely on the conditional maximum likelihood approach and for this aim we outline an EM algorithm. We also give some simple rules for point and interval estimation of the population size. The approach is illustrated by applying it to two data sets concerning small mammal populations.

4.
A novel type of approximation scheme to the maximum likelihood (ML) approach is presented and discussed in the context of phylogenetic tree reconstruction from aligned DNA sequences. It is based on a parameterized approximation to the conditional distribution of hidden variables (related, e.g., to the sequences of unobserved branch point ancestors) given the observed data. A modified likelihood, based on the extended data, is then maximized with respect to the parameters of the model as well as to those involved in the approximation. With a suitable form of the approximation, the proposed method allows for simpler updating of the parameters, at the cost of an increased parameter count and a slight decrease in performance. The method is tested on phylogenetic tree reconstruction from artificially generated sequences, and its performance is compared to that of ML, showing that the approach is competitive for reasonably similar sequences. The method is also applied to real DNA sequences from primates, yielding a result consistent with those obtained by other standard algorithms.

5.
MOTIVATION: Maximum likelihood-based methods to estimate site by site substitution rate variability in aligned homologous protein sequences rely on the formulation of a phylogenetic tree and generally assume that the patterns of relative variability follow a pre-determined distribution. We present a phylogenetic tree-independent method to estimate the relative variability of individual sites within large datasets of homologous protein sequences. It is based upon two simple assumptions: first, that substitutions observed between two closely related sequences are likely, in general, to occur at the most variable sites; second, that non-conservative amino acid substitutions tend to occur at more variable sites. Our methodology makes no assumptions regarding the underlying pattern of relative variability between sites. RESULTS: We have compared, using data simulated under a non-gamma distributed model, the performance of this approach to that of a maximum likelihood method that assumes gamma distributed rates. At low mean rates of evolution our method inferred site by site relative substitution rates more accurately than the maximum likelihood approach in the absence of prior assumptions about the relationships between sequences. Our method does not directly account for the effects of mutational saturation. However, we have incorporated an 'ad-hoc' modification that allows the accurate estimation of relative site variability in fast evolving and saturated datasets.

6.
1. Observations of different organisms can often be used to infer environmental conditions at a site. These inferences may be useful for diagnosing the causes of degradation in streams and rivers. 2. When used for diagnosis, biological inferences must not only provide accurate, unbiased predictions of environmental conditions, but also pairs of inferred environmental variables must covary no more strongly than actual measurements of those same environmental variables. 3. Mathematical analysis of the relationship between the measured and inferred values of different environmental variables provides an approach for comparing the covariance between measurements with the covariance between inferences. Then, simulated and field‐collected data are used to assess the performance of weighted average and maximum likelihood inference methods. 4. Weighted average inferences became less accurate as covariance in the calibration data increased, whereas maximum likelihood inferences were unaffected by covariance in the calibration data. In contrast, the accuracy of weighted average inferences was unaffected by changes in measurement error, whilst the accuracy of maximum likelihood inferences decreased as measurement error increased. Weighted average inferences artificially increased the covariance of environmental variables beyond what was expected from measurements, whereas maximum likelihood inference methods more accurately reproduced the expected covariances. 5. Multivariate maximum likelihood inference methods can potentially provide more useful diagnostic information than single variable inference models.
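As a rough illustration of the weighted average approach this abstract refers to, the sketch below follows the textbook two-step scheme: each taxon's optimum is the abundance-weighted mean of the environmental variable across calibration sites, and the value inferred at a new site is the abundance-weighted mean of the optima of the taxa found there. The function names and toy abundances are hypothetical, the deshrinking step often applied in practice is omitted, and nothing here is taken from the paper itself.

```python
def taxon_optima(abundance, env):
    """Abundance-weighted mean of env for each taxon.

    abundance[i][k] is the abundance of taxon k at calibration site i;
    env[i] is the measured environmental value at site i.
    """
    n_taxa = len(abundance[0])
    optima = []
    for k in range(n_taxa):
        total = sum(row[k] for row in abundance)
        weighted = sum(row[k] * x for row, x in zip(abundance, env))
        optima.append(weighted / total)
    return optima


def infer_environment(site_abundance, optima):
    """Abundance-weighted mean of taxon optima at one new site."""
    total = sum(site_abundance)
    return sum(a * u for a, u in zip(site_abundance, optima)) / total


# Hypothetical calibration data: 3 sites, 2 taxa.
calibration = [[10, 0], [5, 5], [0, 10]]
temperature = [10.0, 15.0, 20.0]
optima = taxon_optima(calibration, temperature)
estimate = infer_environment([3, 1], optima)
```

Because every inferred variable is driven by the same abundance weights, inferences for two different variables can end up more strongly correlated than the measurements, which is the covariance inflation the abstract reports.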

7.
Mutation spectra recovered from lacI transgenic animals exposed in separate experiments to tris-(2,3-dibromopropyl)phosphate (TDBP) or aflatoxin B1 (AFB1) were examined using log-linear analysis. Log-linear analysis is a categorical procedure that analyses contingency table data. Expected contingency table cell counts are estimated by maximum likelihood as effects of main variables and variable interactions. Evaluation of hierarchical models of decreasing complexity indicates when significant explanatory power is lost by the sequential omission of interactions between variables. Use of this technique allows construction of the most parsimonious models to account for mutation spectra obtained in the two experiments. The resulting statistical models are consistent with previous analyses of these data and with biological explanations for causes of the observed spectra.
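To make the model comparison concrete, here is a minimal sketch of the simplest case: a two-way table in which the interaction term is dropped, leaving the independence model. For that model the maximum likelihood expected counts have the closed form row total × column total / N, and the loss of fit relative to the saturated model is measured by the likelihood-ratio statistic G². The table values and function names are hypothetical and are not data or code from the study.

```python
import math


def independence_fit(table):
    """Expected counts under independence, E_ij = row_i * col_j / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def deviance(table, expected):
    """Likelihood-ratio statistic G^2 = 2 * sum O * ln(O / E); empty cells add 0."""
    g2 = 0.0
    for obs_row, exp_row in zip(table, expected):
        for o, e in zip(obs_row, exp_row):
            if o > 0:
                g2 += 2.0 * o * math.log(o / e)
    return g2


# Hypothetical 2 x 2 table (treatment by mutation class), not data from the study.
counts = [[10, 20], [30, 40]]
expected = independence_fit(counts)
g2 = deviance(counts, expected)
```

A large G² relative to a chi-square reference with (rows − 1)(columns − 1) degrees of freedom would suggest that omitting the interaction loses significant explanatory power. Higher-way tables generally need iterative fitting, which this sketch does not cover.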

8.
A condition for practical independence of contact distribution functions in Boolean models is obtained. This result allows the authors to use maximum likelihood methods, via sparse sampling, for estimating unknown parameters of an isotropic Boolean model. The second part of this paper is devoted to a simulation study of the proposed method. AMS classification: 60D05

9.
Dai JY, LeBlanc M, Kooperberg C 《Biometrics》2009,65(1):178-187
Summary. Recent results for case–control sampling suggest that when the covariate distribution is constrained by gene-environment independence, semiparametric estimation exploiting such independence yields a substantial efficiency gain. We consider the efficient estimation of the treatment–biomarker interaction in two-phase sampling nested within randomized clinical trials, incorporating the independence between a randomized treatment and the baseline markers. We develop a Newton–Raphson algorithm based on the profile likelihood to compute the semiparametric maximum likelihood estimate (SPMLE). Our algorithm accommodates both continuous phase-one outcomes and continuous phase-two biomarkers. The profile information matrix is computed explicitly via numerical differentiation. In certain situations where computing the SPMLE is slow, we propose a maximum estimated likelihood estimator (MELE), which is also capable of incorporating the covariate independence. This estimated likelihood approach uses a one-step empirical covariate distribution and is thus straightforward to maximize. It offers a closed-form variance estimate with limited increase in variance relative to the fully efficient SPMLE. Our results suggest that exploiting the covariate independence in two-phase sampling increases the efficiency substantially, particularly for estimating treatment–biomarker interactions.

10.
Stubbendick AL, Ibrahim JG 《Biometrics》2003,59(4):1140-1150
This article analyzes quality of life (QOL) data from an Eastern Cooperative Oncology Group (ECOG) melanoma trial that compared treatment with ganglioside vaccination to treatment with high-dose interferon. The analysis of this data set is challenging due to several difficulties, namely, nonignorable missing longitudinal responses and baseline covariates. Hence, we propose a selection model for estimating parameters in the normal random effects model with nonignorable missing responses and covariates. Parameters are estimated via maximum likelihood using the Gibbs sampler and a Monte Carlo expectation maximization (EM) algorithm. Standard errors are calculated using the bootstrap. The method allows for nonmonotone patterns of missing data in both the response variable and the covariates. We model the missing data mechanism and the missing covariate distribution via a sequence of one-dimensional conditional distributions, allowing the missing covariates to be either categorical or continuous, as well as time-varying. We apply the proposed approach to the ECOG quality-of-life data and conduct a small simulation study evaluating the performance of the maximum likelihood estimates. Our results indicate that a patient treated with the vaccine has a higher QOL score on average at a given time point than a patient treated with high-dose interferon.

11.
Lee SY, Shi JQ 《Biometrics》2001,57(3):787-794
Two-level data with hierarchical structure and mixed continuous and polytomous data are very common in biomedical research. In this article, we propose a maximum likelihood approach for analyzing a latent variable model with these data. The maximum likelihood estimates are obtained by a Monte Carlo EM algorithm that involves the Gibbs sampler for approximating the E-step and the M-step and bridge sampling for monitoring the convergence. The approach is illustrated by a two-level data set concerning the development and preliminary findings from an AIDS preventative intervention for Filipina commercial sex workers where the relationship between some latent quantities is investigated.

12.
Longitudinal data usually consist of a number of short time series. A group of subjects or groups of subjects are followed over time and observations are often taken at unequally spaced time points, and may be at different times for different subjects. When the errors and random effects are Gaussian, the likelihood of these unbalanced linear mixed models can be directly calculated, and nonlinear optimization used to obtain maximum likelihood estimates of the fixed regression coefficients and parameters in the variance components. For binary longitudinal data, a two state, non-homogeneous continuous time Markov process approach is used to model serial correlation within subjects. Formulating the model as a continuous time Markov process allows the observations to be equally or unequally spaced. Fixed and time varying covariates can be included in the model, and the continuous time model allows the estimation of the odds ratio for an exposure variable based on the steady state distribution. Exact likelihoods can be calculated. The initial probability distribution on the first observation on each subject is estimated using logistic regression that can involve covariates, and this estimation is embedded in the overall estimation. These models are applied to an intervention study designed to reduce children's sun exposure.
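The sketch below illustrates why a continuous time formulation handles unequal spacing: for a two-state process the transition probabilities over any gap have a closed form, so the exact likelihood is a product of such terms. It is deliberately simplified relative to the abstract. It assumes constant (time-homogeneous) rates with no covariates and takes the initial probability as a fixed number rather than a logistic regression; all names and values are hypothetical.

```python
import math


def transition_matrix(rate_01, rate_10, dt):
    """Transition probabilities over a gap dt for constant rates."""
    total = rate_01 + rate_10
    decay = math.exp(-total * dt)
    p01 = rate_01 / total * (1.0 - decay)
    p10 = rate_10 / total * (1.0 - decay)
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]


def log_likelihood(states, times, rate_01, rate_10, p_initial_1):
    """Exact log-likelihood of one subject's binary series at arbitrary times."""
    first = p_initial_1 if states[0] == 1 else 1.0 - p_initial_1
    loglik = math.log(first)
    for k in range(1, len(states)):
        p = transition_matrix(rate_01, rate_10, times[k] - times[k - 1])
        loglik += math.log(p[states[k - 1]][states[k]])
    return loglik


# Hypothetical subject observed at unequally spaced times.
ll = log_likelihood([0, 0, 1, 1], [0.0, 0.5, 2.0, 2.25], 0.4, 0.6, 0.3)
steady_state_odds = 0.4 / 0.6  # odds of state 1 in the long run
```

Under these assumptions the long-run probability of state 1 is rate_01 / (rate_01 + rate_10), so the steady state odds reduce to the ratio of the two rates, and comparing that ratio between exposure groups gives an odds ratio of the kind the abstract mentions.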

13.
Several approaches have been suggested for estimating a respiratory response slope when both x and y variables are observed with error. Recently, a maximum likelihood estimate under the assumption of a bivariate normal distribution has been proposed. A method of moments solution yields a slope estimate of y/x as long as the underlying process mean is nonzero. This paper extends the maximum likelihood approach to the case where the process mean is zero. In this case, certain additional error assumptions must be made to yield a unique estimate. These concepts are applied to the problem of estimating an effective lung volume for steady-state breath-to-breath gas exchange data during exercise.

14.
Summary. Cook, Gold, and Li (2007, Biometrics 63, 540–549) extended the Kulldorff (1997, Communications in Statistics 26, 1481–1496) scan statistic for spatial cluster detection to survival‐type observations. Their approach was based on the score statistic and they proposed a permutation distribution for the maximum of score tests. The score statistic makes it possible to apply the scan statistic idea to models including explanatory variables. However, we show that the permutation distribution requires strong assumptions of independence between potential cluster and both censoring and explanatory variables. In contrast, we present an approach using the asymptotic distribution of the maximum of score statistics in a manner not requiring these assumptions.

15.
Larsen K 《Biometrics》2004,60(1):85-92
Multiple categorical variables are commonly used in medical and epidemiological research to measure specific aspects of human health and functioning. To analyze such data, models have been developed considering these categorical variables as imperfect indicators of an individual's "true" status of health or functioning. In this article, the latent class regression model is used to model the relationship between covariates, a latent class variable (the unobserved status of health or functioning), and the observed indicators (e.g., variables from a questionnaire). The Cox model is extended to encompass a latent class variable as predictor of time-to-event, while using information about latent class membership available from multiple categorical indicators. The expectation-maximization (EM) algorithm is employed to obtain maximum likelihood estimates, and standard errors are calculated based on the profile likelihood, treating the nonparametric baseline hazard as a nuisance parameter. A sampling-based method for model checking is proposed. It allows for graphical investigation of the assumption of proportional hazards across latent classes. It may also be used for checking other model assumptions, such as no additional effect of the observed indicators given latent class. The usefulness of the model framework and the proposed techniques is illustrated in an analysis of data from the Women's Health and Aging Study concerning the effect of severe mobility disability on time-to-death for elderly women.

16.
He W, Lawless JF 《Biometrics》2003,59(4):837-848
This article presents methodology for multivariate proportional hazards (PH) regression models. The methods employ flexible piecewise constant or spline specifications for baseline hazard functions in either marginal or conditional PH models, along with assumptions about the association among lifetimes. Because the models are parametric, ordinary maximum likelihood can be applied; it is able to deal easily with such data features as interval censoring or sequentially observed lifetimes, unlike existing semiparametric methods. A bivariate Clayton model (1978, Biometrika 65, 141-151) is used to illustrate the approach taken. Because a parametric assumption about association is made, efficiency and robustness comparisons are made between estimation based on the bivariate Clayton model and "working independence" methods that specify only marginal distributions for each lifetime variable.

17.
Evolutionary biologists have adopted simple likelihood models for purposes of estimating ancestral states and evaluating character independence on specified phylogenies; however, for purposes of estimating phylogenies by using discrete morphological data, maximum parsimony remains the only option. This paper explores the possibility of using standard, well-behaved Markov models for estimating morphological phylogenies (including branch lengths) under the likelihood criterion. An important modification of standard Markov models involves making the likelihood conditional on characters being variable, because constant characters are absent in morphological data sets. Without this modification, branch lengths are often overestimated, resulting in potentially serious biases in tree topology selection. Several new avenues of research are opened by an explicitly model-based approach to phylogenetic analysis of discrete morphological data, including combined-data likelihood analyses (morphology + sequence data), likelihood ratio tests, and Bayesian analyses.
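The conditioning on variable characters can be written compactly. As a sketch in notation that is assumed here rather than taken from the paper, let $x_i$ be the observed state pattern of character $i$ across the taxa, $\tau$ the tree with its branch lengths, and $\mathcal{C}$ the set of constant patterns (all taxa sharing one state):

```latex
L_{\text{cond}}(\tau) \;=\; \prod_{i=1}^{n}
  \frac{P(x_i \mid \tau)}{1 - \sum_{c \in \mathcal{C}} P(c \mid \tau)}
```

The denominator is the probability that a character is variable and therefore eligible to be scored. Without it, the complete absence of constant characters looks like evidence that change is very frequent, which is consistent with the overestimated branch lengths the abstract describes.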

18.
Wolfinger RD, Kass RE 《Biometrics》2000,56(3):768-774
We consider the usual normal linear mixed model for variance components from a Bayesian viewpoint. With conjugate priors and balanced data, Gibbs sampling is easy to implement; however, simulating from full conditionals can become difficult for the analysis of unbalanced data with possibly nonconjugate priors, thus leading one to consider alternative Markov chain Monte Carlo schemes. We propose and investigate a method for posterior simulation based on an independence chain. The method is customized to exploit the structure of the variance component model, and it works with arbitrary prior distributions. As a default reference prior, we use a version of Jeffreys' prior based on the integrated (restricted) likelihood. We demonstrate the ease of application and flexibility of this approach in familiar settings involving both balanced and unbalanced data.
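For readers unfamiliar with the term, an independence chain is a Metropolis-Hastings sampler whose proposals are drawn from a fixed distribution that does not depend on the current state, so the acceptance probability reduces to a ratio of importance weights. The sketch below shows only that generic mechanism on a toy one-dimensional target. It is not the customized variance component proposal of the paper, and the function names, target, and proposal are all illustrative assumptions.

```python
import math
import random


def independence_chain(log_target, draw, log_proposal, x0, n_steps, rng):
    """Metropolis-Hastings with proposals drawn independently of the current state.

    Returns the list of states and the fraction of proposals accepted.
    """
    x = x0
    log_w_x = log_target(x) - log_proposal(x)  # log importance weight
    samples, accepted = [], 0
    for _ in range(n_steps):
        y = draw(rng)
        log_w_y = log_target(y) - log_proposal(y)
        # Accept with probability min(1, w(y) / w(x)).
        if rng.random() < math.exp(min(0.0, log_w_y - log_w_x)):
            x, log_w_x = y, log_w_y
            accepted += 1
        samples.append(x)
    return samples, accepted / n_steps


def log_normal_kernel(x, scale):
    """Log density of N(0, scale^2) up to an additive constant shared by all x."""
    return -0.5 * (x / scale) ** 2 - math.log(scale)


# Toy target N(0, 1) with a wider N(0, 2^2) proposal; both are illustrative.
rng = random.Random(0)
samples, rate = independence_chain(
    lambda x: log_normal_kernel(x, 1.0),
    lambda r: r.gauss(0.0, 2.0),
    lambda x: log_normal_kernel(x, 2.0),
    0.0, 20000, rng,
)
```

The proposal is chosen with heavier tails than the target so that the weights stay bounded. An independence chain tends to mix well only when the proposal approximates the posterior closely, which is presumably why the abstract stresses tailoring it to the model structure.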

19.
We propose a joint analysis of recurrent and nonrecurrent event data subject to general types of interval censoring. The proposed analysis allows for general semiparametric models, including the Box–Cox transformation and inverse Box–Cox transformation models for the recurrent and nonrecurrent events, respectively. A frailty variable is used to account for the potential dependence between the recurrent and nonrecurrent event processes, while leaving the distribution of the frailty unspecified. We apply the pseudolikelihood for interval-censored recurrent event data, usually termed panel count data, and the sufficient likelihood for interval-censored nonrecurrent event data by conditioning on the sufficient statistic for the frailty and using the working assumption of independence over examination times. Large sample theory and a computation procedure for the proposed analysis are established. We illustrate the proposed methodology by a joint analysis of the numbers of occurrences of basal cell carcinoma over time and time to the first recurrence of squamous cell carcinoma based on a skin cancer dataset, as well as a joint analysis of the numbers of adverse events and time to premature withdrawal from study medication based on a scleroderma lung disease dataset.

20.
A mixed-model procedure for analysis of censored data assuming a multivariate normal distribution is described. A Bayesian framework is adopted which allows for estimation of fixed effects and variance components and prediction of random effects when records are left-censored. The procedure can be extended to right- and two-tailed censoring. The model employed is a generalized linear model, and the estimation equations resemble those arising in analysis of multivariate normal or categorical data with threshold models. Estimates of variance components are obtained using expressions similar to those employed in the EM algorithm for restricted maximum likelihood (REML) estimation under normality.
