Similar Literature
20 similar articles found
1.
Longitudinal data usually consist of a number of short time series. One or more groups of subjects are followed over time, observations are often taken at unequally spaced time points, and the observation times may differ between subjects. When the errors and random effects are Gaussian, the likelihood of these unbalanced linear mixed models can be calculated directly, and nonlinear optimization used to obtain maximum likelihood estimates of the fixed regression coefficients and of the parameters in the variance components. For binary longitudinal data, a two-state, non-homogeneous, continuous-time Markov process is used to model serial correlation within subjects. Formulating the model as a continuous-time Markov process allows the observations to be equally or unequally spaced. Fixed and time-varying covariates can be included in the model, and the continuous-time formulation allows estimation of the odds ratio for an exposure variable based on the steady-state distribution. Exact likelihoods can be calculated. The initial probability distribution for the first observation on each subject is estimated using a logistic regression that can involve covariates, and this estimation is embedded in the overall estimation. These models are applied to an intervention study designed to reduce children's sun exposure.
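For reference, here is a minimal sketch of the two-state continuous-time Markov machinery the abstract relies on: with transition intensities lam01 (state 0 to 1) and lam10 (state 1 to 0), transition probabilities over an arbitrary gap come from the matrix exponential of the generator, and the steady-state odds of state 1 equal lam01/lam10. The rate values and function names are illustrative assumptions, not quantities from the study; the paper's model additionally lets the intensities depend on covariates and time.

```python
import numpy as np
from scipy.linalg import expm

def two_state_generator(lam01, lam10):
    """Generator of a two-state continuous-time Markov chain:
    state 0 -> 1 at rate lam01, state 1 -> 0 at rate lam10."""
    return np.array([[-lam01, lam01],
                     [lam10, -lam10]])

lam01, lam10 = 0.4, 1.2                  # illustrative intensities
Q = two_state_generator(lam01, lam10)

# Transition probabilities over an (unequally spaced) gap of t time units:
# P_t[i, j] = Pr(state j at time t | state i at time 0)
t = 2.5
P_t = expm(Q * t)

# Steady-state distribution and the corresponding odds of being in state 1
pi1 = lam01 / (lam01 + lam10)
steady_state_odds = pi1 / (1.0 - pi1)    # equals lam01 / lam10
```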

2.
Stratified data arise in several settings, such as longitudinal studies or multicenter clinical trials. Between-strata heterogeneity is usually addressed by random effects models, but an alternative approach is given by fixed effects models, which treat the incidental nuisance parameters as fixed unknown quantities. This approach has several advantages, such as computational simplicity and robustness to confounding by strata. However, maximum likelihood estimates of the parameter of interest are typically affected by incidental-parameter bias. A remedy is to eliminate the stratum-specific parameters by exact or approximate conditioning. The latter solution is afforded by the modified profile likelihood, which is the method applied in this paper. The aim is to demonstrate how the theory of modified profile likelihoods provides convenient solutions to various inferential problems in this setting. Specific procedures are available for different kinds of response variables, and they are useful both for inferential purposes and as a diagnostic method for validating random effects models. Some examples with real data illustrate these points.

3.
The additive hazards model specifies the effect of covariates on the hazard in an additive way, in contrast to the popular Cox model, in which it is multiplicative. As a non-parametric model, the additive hazards model offers a very flexible way of modeling time-varying covariate effects. It is most commonly estimated by ordinary least squares. In this paper, we consider the case where covariates are bounded, and derive the maximum likelihood estimator under the constraint that the hazard is non-negative for all covariate values in their domain. We show that the maximum likelihood estimator may be obtained by separately maximizing the log-likelihood contribution of each event time point, and that this maximization problem is equivalent to fitting a series of Poisson regression models with an identity link under non-negativity constraints. We derive an analytic solution for the maximum likelihood estimator. We contrast the maximum likelihood estimator with the ordinary least-squares estimator in a simulation study and show that the maximum likelihood estimator has smaller mean squared error than the ordinary least-squares estimator. An illustration with data on patients with carcinoma of the oropharynx is provided.
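The Poisson reduction described above can be illustrated with a generic constrained fit. The sketch below (scipy-based, with hypothetical inputs X and y) maximizes an identity-link Poisson log-likelihood subject to non-negative fitted means; it illustrates the constrained sub-problem only, not the paper's analytic solution.

```python
import numpy as np
from scipy.optimize import minimize

def identity_link_poisson(X, y, eps=1e-9):
    """Poisson regression with an identity link, maximized subject to the
    constraint that every fitted mean X @ beta is non-negative
    (a generic sketch of the constrained sub-problem described above)."""
    def negloglik(beta):
        mu = np.clip(X @ beta, eps, None)          # guard against log(0)
        return -np.sum(y * np.log(mu) - mu)

    beta0 = np.linalg.lstsq(X, y, rcond=None)[0]   # least-squares start
    res = minimize(negloglik, beta0, method="SLSQP",
                   constraints=[{"type": "ineq", "fun": lambda b: X @ b}])
    return res.x
```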

4.
Statistical models are the traditional choice for testing scientific theories when observations, processes or boundary conditions are subject to stochasticity. Many important systems in ecology and biology, however, are difficult to capture with statistical models. Stochastic simulation models offer an alternative, but they have hitherto been associated with a major disadvantage: their likelihood functions usually cannot be calculated explicitly, so it is difficult to couple them to well-established statistical theory such as maximum likelihood and Bayesian statistics. A number of new methods, among them Approximate Bayesian Computing and Pattern-Oriented Modelling, bypass this limitation. These methods share three main principles: aggregation of simulated and observed data via summary statistics, likelihood approximation based on the summary statistics, and efficient sampling. We discuss the principles as well as the advantages and caveats of these methods, and demonstrate their potential for integrating stochastic simulation models into a unified framework for statistical modelling.
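A minimal rejection-sampling sketch of the first two principles (data aggregation via summary statistics and likelihood-free acceptance). The toy Poisson simulator, the uniform prior, and the tolerance are placeholder assumptions, not models from the article.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(theta, n=100):
    """Toy stochastic simulator: Poisson counts with rate theta."""
    return rng.poisson(theta, size=n)

def summary(x):
    """Aggregate simulated or observed data into summary statistics."""
    return np.array([x.mean(), x.var()])

def abc_rejection(observed, prior_draws, tol):
    """Keep prior draws whose simulated summaries fall within `tol`
    (Euclidean distance) of the observed summaries."""
    s_obs = summary(observed)
    kept = []
    for theta in prior_draws:
        s_sim = summary(simulate(theta))
        if np.linalg.norm(s_sim - s_obs) < tol:
            kept.append(theta)
    return np.array(kept)

observed = simulate(4.0)
prior = rng.uniform(0.1, 10.0, size=5000)          # flat prior over the rate
posterior_sample = abc_rejection(observed, prior, tol=0.5)
```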

5.
Biological invasions reshape environments and affect the ecological and economic welfare of states and communities. Such invasions advance on multiple spatial scales, complicating their control. When modeling stochastic dispersal processes, intractable likelihoods and autocorrelated data complicate parameter estimation. As with other approaches, the recent synthetic likelihood framework for stochastic models uses summary statistics to reduce this complexity; however, it additionally provides usable likelihoods, facilitating the use of existing likelihood-based machinery. Here, we extend this framework to parameterize multi-scale spatio-temporal dispersal models and compare existing and newly developed spatial summary statistics for characterizing dispersal patterns. We provide general methods to evaluate potential summary statistics and present a fitting procedure that accurately estimates dispersal parameters on simulated data. Finally, we apply our methods to quantify the short- and long-range dispersal of Chagas disease vectors in urban Arequipa, Peru, and assess the feasibility of a purely reactive strategy to contain the invasion.
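The synthetic-likelihood step mentioned above can be sketched generically: simulate the model repeatedly at a candidate parameter, fit a multivariate normal to the resulting summary statistics, and evaluate the observed summaries under that normal to obtain a usable log-likelihood. The simulate and summary functions below are placeholders, not the dispersal model from the study.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(1)

def simulate(theta, n=200):
    """Placeholder stochastic simulator (overdispersed toy counts)."""
    return rng.poisson(rng.gamma(shape=theta, scale=1.0, size=n))

def summary(x):
    """Placeholder summary statistics of a simulated or observed data set."""
    return np.array([x.mean(), x.std(), (x == 0).mean()])

def synthetic_loglik(theta, s_obs, n_sim=200):
    """Synthetic log-likelihood: fit a multivariate normal to summaries of
    n_sim simulations at theta and evaluate the observed summaries under it."""
    sims = np.array([summary(simulate(theta)) for _ in range(n_sim)])
    mu = sims.mean(axis=0)
    cov = np.cov(sims, rowvar=False)
    return multivariate_normal.logpdf(s_obs, mean=mu, cov=cov, allow_singular=True)

s_obs = summary(simulate(3.0))                      # pretend these are the data
grid = np.linspace(1.0, 6.0, 11)
loglik_profile = [synthetic_loglik(th, s_obs) for th in grid]
```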

6.
Phylogenetic comparative methods (PCMs) have been used to test evolutionary hypotheses at the phenotypic level. The evolutionary modes commonly included in PCMs are Brownian motion (genetic drift) and the Ornstein–Uhlenbeck process (stabilizing selection), whose likelihood functions are mathematically tractable. More complicated models of evolutionary modes, such as branch-specific directional selection, have not been used because calculating likelihoods and parameter estimates in the maximum-likelihood framework is not straightforward. To solve this problem, we introduced a population genetics framework into a PCM, and here we present a flexible and comprehensive framework for estimating evolutionary parameters through simulation-based likelihood computations. The method does not require analytic likelihood computations, and evolutionary models can be used as long as simulation is possible. Our approach has many advantages: it incorporates different evolutionary modes for phenotypes into the phylogeny, it takes intraspecific variation into account, it evaluates the full likelihood instead of using summary statistics, and it can be used to estimate ancestral traits. We present a successful application of the method to the evolution of brain size in primates. Our method can be easily implemented in more computationally effective frameworks such as approximate Bayesian computation (ABC), which will enhance the use of computationally intensive methods in the study of phenotypic evolution.

7.
For the purpose of making inferences for a one-dimensional interest parameter, or constructing approximate complementary ancillaries or residuals, the directed likelihood or signed square root of the likelihood ratio statistic can be adjusted so that the resulting modified directed likelihood is, under ordinary repeated sampling, approximately standard normal with error of O(n^{-3/2}), conditional on a suitable ancillary statistic and hence unconditionally. In general, suitable specification of the ancillary statistic may be difficult. We introduce two adjusted directed likelihoods which are similar to the modified directed likelihood but do not require the specification of the ancillary statistic. The error of the standard normal approximation to the distribution of these new adjusted directed likelihoods is O(n^{-1}), conditional on any reasonable ancillary statistic, which is still an improvement over the unadjusted directed likelihoods.
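For reference, the directed likelihood of a scalar interest parameter and Barndorff-Nielsen's modified version r*, which the adjustments above approximate without requiring an explicit ancillary, are conventionally written as follows (standard notation, not taken from the article):

\[
r(\psi) = \operatorname{sgn}(\hat\psi - \psi)\,\sqrt{2\{\ell_p(\hat\psi) - \ell_p(\psi)\}},
\qquad
r^{*}(\psi) = r(\psi) + \frac{1}{r(\psi)}\,\log\frac{u(\psi)}{r(\psi)},
\]

where \(\ell_p\) is the profile log-likelihood and \(u(\psi)\) is an adjustment term depending on the model and on the conditioning ancillary; r is standard normal to error O(n^{-1/2}) and r* to O(n^{-3/2}).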

8.
Maximum likelihood estimation of the model parameters for a spatial population based on data collected from a survey sample is usually straightforward when sampling and non-response are both non-informative, since the model can then usually be fitted using the available sample data, and no allowance is necessary for the fact that only a part of the population has been observed. Although for many regression models this naive strategy yields consistent estimates, this is not the case for some models, such as spatial auto-regressive models. In this paper, we show that for a broad class of such models, a maximum marginal likelihood approach that uses both sample and population data leads to more efficient estimates, since it uses spatial information from sampled as well as non-sampled units. Extensive simulation experiments based on two well-known data sets are used to assess the impact of the spatial sampling design, the auto-correlation parameter and the sample size on the performance of this approach. The results from these experiments show that, compared with some widely used methods that use only sample data, the maximum marginal likelihood approach is much more precise.
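For orientation, a sketch of the full-data log-likelihood of a Gaussian simultaneous autoregressive model \(y = \rho W y + X\beta + \varepsilon\), \(\varepsilon \sim N(0, \sigma^2 I)\), in standard notation (the paper's marginal likelihood, which integrates over non-sampled units, is not reproduced here):

\[
\ell(\rho, \beta, \sigma^2)
= -\frac{n}{2}\log(2\pi\sigma^2)
+ \log\lvert I_n - \rho W \rvert
- \frac{1}{2\sigma^2}\,\lVert (I_n - \rho W)\,y - X\beta \rVert^2 .
\]

The Jacobian term \(\log\lvert I_n - \rho W\rvert\) ties every unit to its neighbours through the spatial weight matrix W, which is one way to see why marginalizing over non-sampled units, rather than simply dropping them, retains spatial information.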

9.
Logistic regression is an important statistical procedure used in many disciplines. The standard software packages for data analysis are generally equipped with this procedure, in which the maximum likelihood estimates of the regression coefficients are obtained iteratively. It is well known that the estimates from analyses of small- or medium-sized samples are biased. Also, in finding such estimates, separation is often encountered, in which the likelihood converges but at least one of the parameter estimates diverges to infinity. Standard approaches to finding such estimates do not address these problems. Moreover, missingness in the covariates adds an extra layer of complexity to the whole process. In this article, we address these three practical issues (bias, separation, and missing covariates) by means of simple adjustments. We have applied the proposed technique to real and simulated data. The proposed method always finds a solution, and the estimates are less biased. A SAS macro that implements the proposed method can be obtained from the authors.
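For context, one widely used remedy for small-sample bias and separation is Firth's penalized likelihood, sketched below via Fisher scoring on the modified score. This is offered only as background; it is not necessarily the adjustment proposed in the article, which also handles missing covariates.

```python
import numpy as np

def firth_logistic(X, y, n_iter=50, tol=1e-8):
    """Firth-penalized logistic regression via Fisher scoring on the
    modified score U*(b) = X'(y - p + h*(0.5 - p)), where h holds the
    diagonal of the hat matrix.  One standard remedy for bias and
    separation; not the authors' specific adjustment."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        eta = X @ beta
        prob = 1.0 / (1.0 + np.exp(-eta))
        W = prob * (1.0 - prob)
        XtWX = X.T @ (W[:, None] * X)
        XtWX_inv = np.linalg.inv(XtWX)
        # Diagonal of H = W^{1/2} X (X'WX)^{-1} X' W^{1/2}
        h = np.einsum("ij,jk,ik->i", X, XtWX_inv, X) * W
        score = X.T @ (y - prob + h * (0.5 - prob))
        step = XtWX_inv @ score
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta
```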

10.
Using sib-pair and parent-pair data, a quantitative trait can be tested for the existence of a major locus under the mixed model of Morton and MacLean (1974). The basic idea is to obtain the two conditional likelihoods for sib-pair differences and parent-pair differences, provided that it is known which sib-pairs or parent-pairs have the same effect at the major locus. Two conditions are introduced to obtain two recursive computer algorithms that distinguish the sib-pairs or parent-pairs having the same effect at the major locus. This method has the advantage of reducing the complicated computations involved in obtaining maximum likelihood estimates from nuclear families. A simulation experiment is performed to illustrate the method, and its results are discussed.

11.
Maximum likelihood statistics were applied to the analysis of serological data to confirm the originally proposed genetic models of the chimpanzee R-C-E-F and V-A-B-D systems. Five hundred ninety-nine chimpanzees, including 81 parents of 114 offspring, were tested for R-C-E-F, and 60 parents of 80 offspring were tested for V-A-B-D blood groups. An estimation-maximization (EM) procedure was used to obtain maximum likelihood estimates and support intervals for the haplotype frequencies. For each haplotype, the null hypothesis of nonexistence was evaluated. The frequencies obtained by this method do not differ significantly from those calculated by the square root formula, but they put these estimates on a statistically more rigorous footing.
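A minimal sketch of the gene-counting EM idea, using a generic ABO-style system (two alleles codominant with each other, both dominant over a silent allele) purely for illustration. It is not the chimpanzee R-C-E-F or V-A-B-D model, and support intervals are not computed here.

```python
def abo_style_em(n_A, n_B, n_AB, n_O, n_iter=200):
    """Gene-counting EM for allele frequencies (p, q, r) when phenotypes A
    and B mask heterozygosity with the silent allele O."""
    n = n_A + n_B + n_AB + n_O
    p, q, r = 1 / 3, 1 / 3, 1 / 3
    for _ in range(n_iter):
        # E-step: split the dominant phenotypes into homo-/heterozygote counts
        n_AA = n_A * p**2 / (p**2 + 2 * p * r)
        n_AO = n_A - n_AA
        n_BB = n_B * q**2 / (q**2 + 2 * q * r)
        n_BO = n_B - n_BB
        # M-step: count alleles among the 2n gametes
        p = (2 * n_AA + n_AO + n_AB) / (2 * n)
        q = (2 * n_BB + n_BO + n_AB) / (2 * n)
        r = (2 * n_O + n_AO + n_BO) / (2 * n)
    return p, q, r

print(abo_style_em(n_A=186, n_B=38, n_AB=13, n_O=284))   # illustrative counts
```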

12.
The usefulness of fluorescence techniques for the study of macromolecular structure and dynamics depends on the accuracy and sensitivity of the methods used for data analysis. Many methods for data analysis have been proposed and used, but little attention has been paid to the maximum likelihood method, generally known as the most powerful statistical method for parameter estimation. In this paper we study the properties and behavior of maximum likelihood estimates using simulated fluorescence intensity decay data. We show that the maximum likelihood method generally provides more accurate estimates of lifetimes and fractions than the standard least-squares approach, especially when the lifetime ratios between individual components are small. Three novelties in the field of fluorescence decay analysis are also introduced and studied in this paper: (a) discretization of the convolution integral based on the generalized integral mean value theorem; (b) the likelihood ratio test as a tool to determine the number of exponential decay components in a given decay profile; and (c) separability and detectability indices, which provide measures of how accurately a particular decay component can be detected. Based on the experience gained from this and from our previous study of the Padé-Laplace method, we make some recommendations on how the complex problem of deconvolution and parameter estimation of multiexponential functions might be approached in an experimental setting.

13.
The problem of assessing the relative calibrations and relative accuracies of a set of p instruments, each designed to measure the same characteristic on a common group of individuals, is considered using the EM algorithm. As shown, the EM algorithm provides a general solution for this problem. Its implementation is simple and, in its most general form, requires no extra iterative procedures within the M step. One important feature of the algorithm in this setup is that the error variance estimates are always positive; thus, it can be seen as a kind of restricted maximization procedure. The expected information matrix for the maximum likelihood estimators is derived, from which the large-sample estimated covariance matrix for the maximum likelihood estimators can be computed. The problem of testing hypotheses about the calibration lines can be approached using Wald statistics. The approach is illustrated by re-analysing two data sets from the literature.

14.
Deterministic sampling was used to numerically evaluate the expected log-likelihood surfaces of QTL-marker linkage models in large pedigrees with simple structures. By calculating the expected values of likelihoods, questions of the power of experimental designs, bias in parameter estimates, approximate lower-bound standard errors of estimates and correlations among estimates, and the suitability of statistical models were addressed. Examples illustrated that bracketing markers around the QTL approximately halved the standard error of the recombination fraction between the QTL and the marker, although they did not affect the standard error of the QTL's effect; that overestimation of the distance between the markers caused overestimation of the distance between the QTL and the marker; that additional parameters in the model did not affect the accuracy of parameter estimates; that there was a moderate positive correlation between the estimates of the QTL effect and its recombination distance from the marker; and that selective genotyping did not introduce bias into the parameter estimates. The method is suggested as a useful tool for exploring the power and accuracy of QTL linkage experiments, and the value of alternative statistical models, whenever the likelihood of the model can be written explicitly.

15.
Pennello GA, Devesa SS, Gail MH. Biometrics 1999, 55(3): 774-781.
Commonly used methods for depicting geographic variation in cancer rates are based on rankings. They identify where the rates are high and low but do not indicate the magnitude of the rates or their variability. Yet such measures of variability may be useful in suggesting which types of cancer warrant further analytic studies of localized risk factors. We consider a mixed effects model in which the logarithm of the mean Poisson rate is additive in fixed stratum effects (e.g., age effects) and in logarithms of random relative risk effects associated with geographic areas. These random effects are assumed to follow a gamma distribution with unit mean and variance 1/alpha, similar to Clayton and Kaldor (1987, Biometrics 43, 671-681). We present maximum likelihood and method-of-moments estimates with standard errors for inference on alpha^{-1/2}, the relative risk standard deviation (RRSD). The moment estimates rely on only the first two moments of the Poisson and gamma distributions but have larger standard errors than the maximum likelihood estimates. We compare these estimates with other measures of variability. Several examples suggest that the RRSD estimates have advantages over other measures of variability.
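Under the stated Poisson-gamma model, Var(Y_i) = E_i + E_i^2/alpha when Y_i has expected count E_i from the fixed stratum effects, which suggests a simple moment-style estimate of the RRSD alpha^{-1/2}. The sketch below follows that logic and is not necessarily the exact moment estimator, or its standard error, used in the paper.

```python
import numpy as np

def rrsd_moments(y, expected):
    """Moment-style estimate of the relative risk standard deviation
    alpha^{-1/2} in a Poisson-gamma model: Y_i ~ Poisson(E_i * theta_i),
    theta_i ~ Gamma(alpha, alpha), so Var(Y_i) = E_i + E_i**2 / alpha."""
    num = np.sum((y - expected) ** 2 - expected)   # excess over Poisson variance
    den = np.sum(expected ** 2)
    inv_alpha = max(num / den, 0.0)                # truncate at 0 if under-dispersed
    return np.sqrt(inv_alpha)
```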

16.
Assessing influence in regression analysis with censored data.
Escobar LA, Meeker WQ. Biometrics 1992, 48(2): 507-528.
In this paper we show how to evaluate the effect that perturbations to the model, data, or case weights have on maximum likelihood estimates from censored survival data. The ideas and methods also apply to other nonlinear estimation problems. We review the ideas behind using log-likelihood displacement and local influence methods. We describe new interpretations for some local influence statistics and show how these statistics extend and complement traditional case-deletion influence statistics for linear least squares. These statistics identify individual cases, and combinations of cases, that have important influence on estimates of parameters and functions of these parameters. We illustrate the methods by reanalyzing the Stanford Heart Transplant data with a parametric regression model.

17.
We revisit the usual conditional likelihood for stratum-matched case-control studies and consider three alternatives that may be more appropriate for family-based gene-characterization studies: first, the prospective likelihood, Pr(D/G,A); second, the retrospective likelihood, Pr(G/D); and third, the ascertainment-corrected joint likelihood, Pr(D,G/A). These likelihoods provide unbiased estimators of genetic relative risk parameters, as well as of population allele frequencies and baseline risks. The parameter estimates based on the retrospective likelihood remain unbiased even when the ascertainment scheme cannot be modeled, as long as ascertainment depends only on the families' phenotypes. Despite the need to estimate additional parameters, the prospective, retrospective, and joint likelihoods can lead to considerable gains in efficiency, relative to the conditional likelihood, when estimating the genetic relative risk. This is true if baseline risks and allele frequencies can be assumed to be homogeneous. In the presence of heterogeneity, however, the parameter estimates obtained assuming homogeneity can be seriously biased. We discuss the extent of this problem and present a mixed-models approach for providing consistent parameter estimates when baseline risks and allele frequencies are heterogeneous. The efficiency gains of the mixed-model prospective, retrospective, and joint likelihoods relative to the conditional likelihood are small in the situations presented here.

18.
The function of individual sites within a protein influences their rate of accepted point mutation. During the computation of phylogenetic likelihoods, rate heterogeneity can be modeled on a site-by-site basis with relative rates drawn from a discretized Gamma distribution. Site-rate estimates (e.g., the rate of highest posterior probability given the data at a site) can then be used as a measure of the evolutionary constraints imposed by function. However, if sequence availability is limited, the estimation of rates is subject to sampling error. This article presents a simulation study that evaluates the robustness of evolutionary site-rate estimates for both small and phylogenetically unbalanced samples. The sampling error of rate estimates was first evaluated for alignments that included 5-45 sequences, sampled by jackknifing from a master alignment containing 968 sequences. We observed that the potentially enhanced resolution among site rates due to the inclusion of a larger number of rate categories is negated by the difficulty of correctly estimating intermediate rates. This effect is marked for data sets with fewer than 30 sequences. Although the computation of the likelihood theoretically accounts for phylogenetic distances through branch lengths, the introduction of a single long-branch outlier sequence had a significant negative effect on site-rate estimates. Finally, the presence of a shift in rates of evolution between related lineages can be diagnostic of a gain or loss of function within a protein family. Our analyses indicate that detecting these rate shifts is a harder problem than estimating rates, partly because the difference in rates depends on two rate estimates, each with its own intrinsic uncertainty. The performance of four methods for detecting these site-rate shifts is evaluated and compared. Guidelines are suggested for preparing data sets minimally influenced by error introduced by sequence sampling.
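For reference, the discretized Gamma referred to above is commonly built from equal-probability categories whose representative rate is the within-category mean (the discrete-Gamma approach of Yang, 1994). A sketch, assuming shape = rate = alpha so that the mean relative rate is 1:

```python
import numpy as np
from scipy.stats import gamma

def discrete_gamma_rates(alpha, k=4):
    """Mean relative rates of k equal-probability categories of a
    Gamma(alpha, alpha) distribution (unit mean)."""
    # Category boundaries at quantiles 0, 1/k, ..., 1
    bounds = gamma.ppf(np.linspace(0.0, 1.0, k + 1), a=alpha, scale=1.0 / alpha)
    # The partial mean of Gamma(alpha, scale=1/alpha) over [a, b] equals the
    # CDF increment of Gamma(alpha + 1, scale=1/alpha); multiply by k to
    # renormalise each 1/k-probability slice to a conditional mean.
    cdf_up = gamma.cdf(bounds, a=alpha + 1.0, scale=1.0 / alpha)
    return k * np.diff(cdf_up)

print(discrete_gamma_rates(alpha=0.5, k=4))   # rates average to 1 by construction
```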

19.
This paper considers the distribution of previously proposed goodness-of-fit tests when some or all of the covariates are dichotomous variables. The simulations show that, of the statistics suggested for testing fit, only one appears suitable for use with discrete covariates. This statistic uses conditional maximum likelihood estimates and groups the estimated probabilities either into groups of equal size or into groups based on the patterns of the covariates when these are few in number.
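A grouping-based fit statistic of the kind described, with estimated probabilities split into groups of roughly equal size, can be sketched in the familiar Hosmer-Lemeshow form. This is only an illustration: the statistic favoured in the paper uses conditional maximum likelihood estimates and may instead group by covariate patterns.

```python
import numpy as np
from scipy.stats import chi2

def grouped_fit_statistic(y, p_hat, n_groups=10):
    """Group observations by sorted fitted probabilities and compare observed
    and expected event counts (Hosmer-Lemeshow-type statistic).  Assumes the
    fitted probabilities lie strictly between 0 and 1."""
    order = np.argsort(p_hat)
    groups = np.array_split(order, n_groups)       # groups of (nearly) equal size
    stat = 0.0
    for g in groups:
        n_g = len(g)
        obs = y[g].sum()
        pbar = p_hat[g].mean()
        expected = n_g * pbar
        stat += (obs - expected) ** 2 / (expected * (1.0 - pbar))
    p_value = chi2.sf(stat, df=n_groups - 2)       # conventional reference distribution
    return stat, p_value
```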

20.
Many major genes have been identified that strongly influence the risk of cancer. However, there are typically many different mutations that can occur in the gene, each of which may or may not confer increased risk. It is critical to identify which specific mutations are harmful and which are harmless, so that individuals who learn from genetic testing that they have a mutation can be appropriately counseled. This is a challenging task, since new mutations are continually being identified, and there is typically relatively little evidence available about each individual mutation. In an earlier article, we employed hierarchical modeling (Capanu et al., 2008, Statistics in Medicine 27, 1973-1992) using the pseudo-likelihood and Gibbs sampling methods to estimate the relative risks of individual rare variants using data from a case-control study, and we showed that one can draw on the aggregating power of hierarchical models to distinguish the variants that contribute to cancer risk. However, further research is needed to validate the application of asymptotic methods to such sparse data. In this article, we use simulations to study in detail the properties of the pseudo-likelihood method for this purpose. We also explore two alternative approaches: pseudo-likelihood with correction for the variance component estimate, as proposed by Lin and Breslow (1996, Journal of the American Statistical Association 91, 1007-1016), and a hybrid pseudo-likelihood approach with Bayesian estimation of the variance component. We investigate the validity of these hierarchical modeling techniques by examining the bias and coverage properties of the estimators as well as the efficiency of the hierarchical modeling estimates relative to that of the maximum likelihood estimates. The results indicate that the estimates of the relative risks of very sparse variants have small bias, and that the estimated 95% confidence intervals are typically anti-conservative, though the actual coverage rates are generally above 90%. The widths of the confidence intervals narrow as the residual variance in the second-stage model is reduced. The results also show that the hierarchical modeling estimates have shorter confidence intervals than estimates obtained from conventional logistic regression, and that these relative improvements increase as the variants become more rare.
