Similar Articles
20 similar articles found.
1.
This article investigates maximum likelihood estimation with saturated and unsaturated models for correlated exchangeable binary data, when a sample of independent clusters of varying sizes is available. We discuss various parameterizations of these models, and propose using the EM algorithm to obtain maximum likelihood estimates. The methodology is illustrated by applications to a study of familial disease aggregation and to the design of a proposed group randomized cancer prevention trial.
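As a rough illustration of one common unsaturated model for exchangeable binary clusters, the sketch below fits a beta-binomial by direct numerical maximum likelihood; the data are simulated, and direct optimization stands in for the EM algorithm proposed in the article.

```python
# Minimal sketch (not the article's EM algorithm): beta-binomial ML fit
# for correlated binary counts in clusters of varying sizes.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import betabinom

rng = np.random.default_rng(0)
n = rng.integers(2, 8, size=200)                   # varying cluster sizes
y = betabinom.rvs(n, 3.0, 5.0, random_state=rng)   # simulated response counts

def neg_loglik(theta):
    a, b = np.exp(theta)                           # log scale keeps a, b > 0
    return -betabinom.logpmf(y, n, a, b).sum()

fit = minimize(neg_loglik, x0=[0.0, 0.0], method="Nelder-Mead")
a_hat, b_hat = np.exp(fit.x)
print(f"mu = {a_hat/(a_hat+b_hat):.3f}",           # marginal response probability
      f"rho = {1/(a_hat+b_hat+1):.3f}")            # within-cluster correlation
```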

2.
Maximum likelihood estimation of the model parameters for a spatial population based on data collected from a survey sample is usually straightforward when sampling and non-response are both non-informative, since the model can then be fitted using the available sample data, with no allowance needed for the fact that only a part of the population has been observed. Although for many regression models this naive strategy yields consistent estimates, this is not the case for some models, such as spatial auto-regressive models. In this paper, we show that for a broad class of such models, a maximum marginal likelihood approach that uses both sample and population data leads to more efficient estimates, since it uses spatial information from sampled as well as non-sampled units. Extensive simulation experiments based on two well-known data sets are used to assess the impact of the spatial sampling design, the auto-correlation parameter and the sample size on the performance of this approach. When compared with some widely used methods that use only sample data, the results from these experiments show that the maximum marginal likelihood approach is much more precise.
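A minimal sketch of why SAR models are not fitted by naive regression: the profile log-likelihood carries a log-determinant Jacobian term. Everything here is simulated, and the sketch maximizes the ordinary full-data likelihood, not the authors' sample-plus-population marginal likelihood.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(1)
N = 100
W = (rng.random((N, N)) < 0.05).astype(float)      # toy spatial weights
np.fill_diagonal(W, 0.0)
W /= np.maximum(W.sum(1, keepdims=True), 1)        # row-standardize
X = np.column_stack([np.ones(N), rng.normal(size=N)])
rho_true, beta_true = 0.4, np.array([1.0, 2.0])
y = np.linalg.solve(np.eye(N) - rho_true * W, X @ beta_true + rng.normal(size=N))

def profile_neg_loglik(rho):
    A = np.eye(N) - rho * W
    Ay = A @ y
    beta = np.linalg.lstsq(X, Ay, rcond=None)[0]   # beta profiled out
    e = Ay - X @ beta
    sig2 = e @ e / N                               # sigma^2 profiled out
    sign, logdet = np.linalg.slogdet(A)            # the Jacobian term
    return 0.5 * N * np.log(sig2) - logdet         # up to an additive constant

res = minimize_scalar(profile_neg_loglik, bounds=(-0.99, 0.99), method="bounded")
print("rho_hat =", round(res.x, 3))
```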

3.
Kneib T, Fahrmeir L. Biometrics 2006, 62(1):109-118
Motivated by a space-time study on forest health with damage state of trees as the response, we propose a general class of structured additive regression models for categorical responses, allowing for a flexible semiparametric predictor. Nonlinear effects of continuous covariates, time trends, and interactions between continuous covariates are modeled by penalized splines. Spatial effects can be estimated based on Markov random fields, Gaussian random fields, or two-dimensional penalized splines. We present our approach from a Bayesian perspective, with inference based on a categorical linear mixed model representation. The resulting empirical Bayes method is closely related to penalized likelihood estimation in a frequentist setting. Variance components, corresponding to inverse smoothing parameters, are estimated using (approximate) restricted maximum likelihood. In simulation studies we investigate the performance of different choices for the spatial effect, compare the empirical Bayes approach to competing methodology, and study the bias of mixed model estimates. As an application we analyze data from the forest health survey.
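A toy penalized-spline fit in the spirit of the nonlinear effects described above, using a truncated-line basis and a ridge penalty on the knot coefficients; the smoothing parameter lam is fixed by hand here, whereas the paper estimates the corresponding variance component by (approximate) REML.

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0, 1, 150))
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, x.size)

knots = np.linspace(0.05, 0.95, 20)
# design: intercept, linear term, one truncated-line basis function per knot
B = np.column_stack([np.ones_like(x), x, np.maximum(x[:, None] - knots, 0)])
# penalize only the knot coefficients (ridge penalty <-> random-effect variance)
P = np.diag([0.0, 0.0] + [1.0] * knots.size)

def fit(lam):
    return np.linalg.solve(B.T @ B + lam * P, B.T @ y)

yhat = B @ fit(lam=1.0)   # lam plays the role of an inverse variance component
```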

4.
Estimating the size-selection curves of towed gears, traps, nets and hooks
A general statistical methodology for analysis of selectivity data from any type of fishing gear is presented. This formal statistical modelling of selectivity is built on explicit definitions of the selection process and specification of underlying assumptions and limitations, and this gives the resulting estimates of gear selectivity (and possibly fishing power) a clear interpretation. Application of the methodology to studies using subsampled catch data and to towed gears having windows or grids is outlined, and examples applied to passive nets and towed gears are presented. The analysis of data from replicate deployments is covered in detail, with particular regard to modelling the fixed and random effects of between-haul variation. Recent developments on the design of selectivity experiments are introduced and demonstrated.
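A hedged sketch of the most common case: a logistic size-selection curve fitted by binomial maximum likelihood to hypothetical covered-codend style data, with L50 and the selection range read off the fitted intercept and slope.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit
from scipy.stats import binom

# toy data: at each length class, n fish encountered, y retained by the gear
length = np.arange(20, 41, dtype=float)
n = np.full(length.size, 50)
rng = np.random.default_rng(3)
y = rng.binomial(n, expit(-15.0 + 0.5 * length))

def neg_loglik(theta):
    a, b = theta
    return -binom.logpmf(y, n, expit(a + b * length)).sum()

fit = minimize(neg_loglik, x0=[-10.0, 0.3], method="Nelder-Mead")
a, b = fit.x
print(f"L50 = {-a/b:.1f}, selection range = {2*np.log(3)/b:.1f}")
```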

5.
Latent class regression (LCR) is a popular method for analyzing multiple categorical outcomes. Nonresponse to the manifest items is a common complication, and inference for LCR can then be carried out using maximum likelihood, multiple imputation, or two-stage multiple imputation. Under similar missing data assumptions, the estimates and variances from all three procedures are quite close. However, multiple imputation and two-stage multiple imputation provide additional information: estimates of the rates of missing information. The methodology is illustrated using an example from a study on racial and ethnic disparities in breast cancer severity.
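The rate of missing information mentioned above can be illustrated with Rubin's combining rules; the sketch below uses hypothetical per-imputation estimates and the simple large-m formula.

```python
import numpy as np

# hypothetical point estimates and variances from m completed data sets
est = np.array([0.52, 0.48, 0.55, 0.50, 0.47])
var = np.array([0.010, 0.011, 0.009, 0.010, 0.012])
m = est.size

qbar = est.mean()                 # combined estimate
W = var.mean()                    # within-imputation variance
B = est.var(ddof=1)               # between-imputation variance
T = W + (1 + 1/m) * B             # total variance (Rubin's rules)
lam = (1 + 1/m) * B / T           # large-m rate of missing information
print(f"estimate {qbar:.3f}, SE {T**0.5:.3f}, missing info {lam:.2f}")
```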

6.
Nedelman J. Biometrics 1983, 39(4):1009-1020
Sampling models are investigated for counts of mosquitoes from a malaria field survey conducted by the World Health Organization in Nigeria. The data can be described by a negative binomial model for two-way classified count data, where the cell means are constrained to satisfy row-by-column independence and the parameter k is constant across rows. An algorithm, based on iterative proportional fitting, is devised for finding maximum likelihood estimates. Sampling properties of the estimates and likelihood-ratio statistics for the small sample sizes of the data are investigated by Monte Carlo experiments. The WHO reported that the relative efficiencies of the four trapping methods vary over time; of the eight villages in the survey area, this holds only in the one village that is near a swamp.
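A toy version of the model on hypothetical counts: independence cell means obtained by iterative proportional fitting, followed by a profile maximum likelihood estimate of a common k (in the paper the means and k are estimated jointly).

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import nbinom

counts = np.array([[12, 5, 30, 8],
                   [20, 9, 55, 14],
                   [ 7, 2, 16, 4]])   # hypothetical villages x trapping methods

# IPF for the row-by-column independence means
m = np.ones_like(counts, dtype=float)
for _ in range(10):
    m *= counts.sum(1, keepdims=True) / m.sum(1, keepdims=True)
    m *= counts.sum(0, keepdims=True) / m.sum(0, keepdims=True)

def neg_profile_loglik(k):            # common k, means held at the IPF fit
    p = k / (k + m)                   # scipy's (n, p) parameterization
    return -nbinom.logpmf(counts, k, p).sum()

k_hat = minimize_scalar(neg_profile_loglik, bounds=(0.01, 50), method="bounded").x
print("k_hat =", round(k_hat, 2))
```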

7.
Growing interest in adaptive evolution in natural populations has spurred efforts to infer genetic components of variance and covariance of quantitative characters. Here, I review difficulties inherent in the usual least-squares methods of estimation. A useful alternative approach is that of maximum likelihood (ML). Its particular advantage over least squares is that estimation and testing procedures are well defined, regardless of the design of the data. A modified version of ML, REML, eliminates the bias of ML estimates of variance components. Expressions for the expected bias and variance of estimates obtained from balanced, fully hierarchical designs are presented for ML and REML. Analyses of data simulated from balanced, hierarchical designs reveal differences in the properties of ML, REML, and F-ratio tests of significance. A second simulation study compares properties of REML estimates obtained from a balanced, fully hierarchical design (within-generation analysis) with those from a sampling design including phenotypic data on parents and multiple progeny. It also illustrates the effects of imposing nonnegativity constraints on the estimates. Finally, it reveals that predictions of the behavior of significance tests based on asymptotic theory are not accurate when sample size is small and that constraining the estimates seriously affects properties of the tests. Because of their great flexibility, likelihood methods can serve as a useful tool for estimation of quantitative-genetic parameters in natural populations. Difficulties involved in hypothesis testing remain to be solved.
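In the simplest possible case, the ML-versus-REML bias contrast reduces to dividing a sum of squares by n rather than n - 1; the simulation below illustrates that, though the review concerns the analogous downward bias in hierarchical variance-component designs.

```python
import numpy as np

rng = np.random.default_rng(4)
reps, n, sigma2 = 5000, 10, 4.0
ml, reml = [], []
for _ in range(reps):
    x = rng.normal(0, sigma2 ** 0.5, n)
    ss = ((x - x.mean()) ** 2).sum()
    ml.append(ss / n)          # ML ignores the cost of estimating the mean
    reml.append(ss / (n - 1))  # REML: likelihood of mean-free contrasts
print(f"true {sigma2}, ML mean {np.mean(ml):.2f} (biased), "
      f"REML mean {np.mean(reml):.2f}")
```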

8.
This paper describes mathematical and computational methodology for estimating the parameters of the Burr Type XII distribution by the method of maximum likelihood. Expressions for the asymptotic variances and covariances of the parameter estimates are given, and the modality of the log-likelihood and conditional log-likelihood functions is analyzed. As a result of this analysis for various a priori known and unknown parameter combinations, conditions are given which guarantee that the parameter estimates obtained will, indeed, be maximum likelihood estimates. An efficient numerical method for maximizing the conditional log-likelihood function is described, and mathematical expressions are given for the various numerical approximations needed to evaluate the expressions given for the asymptotic variances and covariances of the parameter estimates. The methodology discussed is applied in a numerical example to life test data arising in a clinical setting.
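A minimal sketch of ML estimation of the two Burr Type XII shape parameters from simulated data, using scipy's generic optimizer rather than the conditional log-likelihood scheme described above.

```python
from scipy.stats import burr12

data = burr12.rvs(c=2.0, d=3.0, size=500, random_state=5)

# fit the two shape parameters by ML, fixing location 0 and scale 1
c_hat, d_hat, loc, scale = burr12.fit(data, floc=0, fscale=1)
print(f"c = {c_hat:.2f}, d = {d_hat:.2f}")
```

Asymptotic variances and covariances would then come from the inverse of the observed information matrix at (c_hat, d_hat), as the paper derives analytically.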

9.
Weighted distributions can be used to fit various forms of resource selection probability functions (RSPF) under the use-versus-available study design (Lele and Keim 2006). Although valid, the numerical maximization procedure used by Lele and Keim (2006) is unstable because of the inherent roughness of the Monte Carlo likelihood function. We used a combination of the methods of partial likelihood and data cloning to obtain maximum likelihood estimators of the RSPF in a numerically stable fashion. We demonstrated the methodology using simulated data sets generated under the log-log RSPF model and a reanalysis of telemetry data presented in Lele and Keim (2006) using the logistic RSPF model. The new method for estimation of RSPF can be used to understand differential selection of resources by animals, an essential component of studies in conservation biology, wildlife management, and applied ecology.
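For orientation only: a naive use-availability logistic regression fitted by maximum likelihood on simulated data. This is not the partial-likelihood/data-cloning estimator of the article, and under an RSPF only the slope coefficients are straightforwardly interpretable this way.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(6)
x_all = rng.normal(size=(2000, 2))            # resource units on the landscape
p_use = expit(-1.0 + 1.5 * x_all[:, 0])       # logistic RSPF, slope 1.5
used = x_all[rng.random(2000) < p_use]        # units observed to be used
avail = x_all[rng.choice(2000, 500)]          # an availability sample

X = np.vstack([used, avail])
z = np.r_[np.ones(len(used)), np.zeros(len(avail))]
D = np.column_stack([np.ones(len(X)), X])

def neg_loglik(beta):
    eta = D @ beta
    return -(z * eta - np.logaddexp(0, eta)).sum()   # Bernoulli log-likelihood

beta_hat = minimize(neg_loglik, np.zeros(3), method="BFGS").x
print(beta_hat)   # slopes are informative; the intercept is not identified here
```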

10.
The supplemented case-control design consists of a case-control sample and of an additional sample of disease-free subjects who arise from a given stratum of one of the measured exposures in the case-control study. The supplemental data might, for example, arise from a population survey conducted independently of the case-control study. This design improves precision of estimates of main effects and especially of joint exposures, particularly when joint exposures are uncommon and the prevalence of one of the exposures is low. We first present a pseudo-likelihood estimator (PLE) that is easy to compute. We further adapt two-phase design methods to find maximum likelihood estimates (MLEs) for the log odds ratios for this design and derive asymptotic variance estimators that appropriately account for the differences in sampling schemes of this design from that of the traditional two-phase design. As an illustration of our design, we present a study that was conducted to assess the influence of joint exposure to hepatitis B virus (HBV) and hepatitis C virus (HCV) infection on the risk of hepatocellular carcinoma, using data from Qidong County, Jiangsu Province, China.

11.
Meyer K. Heredity 2008, 101(3):212-221
Mixed model analyses via restricted maximum likelihood, fitting the so-called animal model, have become standard methodology for the estimation of genetic variances. Models involving multiple genetic variance components, due to different modes of gene action, are readily fitted. It is shown that likelihood-based calculations may provide insight into the quality of the resulting parameter estimates, and are directly applicable to the validation of experimental designs. This is illustrated for the example of a design suggested recently to estimate X-linked genetic variances. In particular, large sample variances and sampling correlations are demonstrated to provide an indication of 'problem' scenarios. Using simulation, it is shown that the profile likelihood function provides more appropriate estimates of confidence intervals than large sample variances. Examination of the likelihood function and its derivatives are recommended as part of the design stage of quantitative genetic experiments.
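A scalar toy example of the profile-likelihood confidence intervals recommended above: the mean is profiled out and the interval is read off where twice the log-likelihood drop stays below the chi-squared cutoff.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(7)
x = rng.normal(0, 2.0, 25)
ss = ((x - x.mean()) ** 2).sum()
n = x.size

def prof_ll(s2):                      # mean profiled out of the likelihood
    return -0.5 * n * np.log(s2) - ss / (2 * s2)

s2_hat = ss / n                       # ML estimate of the variance
grid = np.linspace(0.2 * s2_hat, 5 * s2_hat, 2000)
keep = 2 * (prof_ll(s2_hat) - prof_ll(grid)) <= chi2.ppf(0.95, df=1)
print(f"95% profile CI: ({grid[keep].min():.2f}, {grid[keep].max():.2f})")
```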

12.
Study designs where data have been aggregated by geographical areas are popular in environmental epidemiology. These studies are commonly based on administrative databases and, because they provide complete spatial coverage, are particularly appealing for making inference on the entire population. However, the resulting estimates are often biased and difficult to interpret due to unmeasured confounders, which typically are not available from routinely collected data. We propose a framework to improve inference drawn from such studies exploiting information derived from individual-level survey data. The latter are summarized in an area-level scalar score by mimicking at ecological level the well-known propensity score methodology. The literature on propensity score for confounding adjustment is mainly based on individual-level studies and assumes a binary exposure variable. Here, we generalize its use to cope with area-referenced studies characterized by a continuous exposure. Our approach is based upon Bayesian hierarchical structures specified into a two-stage design: (i) geolocated individual-level data from survey samples are up-scaled at ecological level, then the latter are used to estimate a generalized ecological propensity score (EPS) in the in-sample areas; (ii) the generalized EPS is imputed in the out-of-sample areas under different assumptions about the missingness mechanisms, then it is included into the ecological regression, linking the exposure of interest to the health outcome. This delivers area-level risk estimates, which allow a fuller adjustment for confounding than traditional areal studies. The methodology is illustrated by using simulations and a case study investigating the risk of lung cancer mortality associated with nitrogen dioxide in England (UK).
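A deliberately crude single-stage caricature of the idea, on simulated area-level data: a generalized propensity score is formed as the estimated conditional density of the continuous exposure given confounders, and then enters a Poisson ecological regression. The paper's two-stage Bayesian hierarchical treatment, including imputation for out-of-sample areas, is not reproduced.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
n_area = 300
conf = rng.normal(size=(n_area, 2))                     # area-level confounders
expo = 1.0 + conf @ np.array([0.8, -0.5]) + rng.normal(0, 0.5, n_area)
expected = rng.uniform(50, 200, n_area)                 # expected counts
deaths = rng.poisson(expected * np.exp(0.02 * expo + 0.10 * conf[:, 0]))

# stage 1: score = estimated density of exposure given confounders
s1 = sm.OLS(expo, sm.add_constant(conf)).fit()
sd = np.sqrt(s1.scale)
eps = np.exp(-0.5 * (s1.resid / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

# stage 2: ecological Poisson regression adjusting for the score
X = sm.add_constant(np.column_stack([expo, eps]))
s2 = sm.GLM(deaths, X, family=sm.families.Poisson(),
            offset=np.log(expected)).fit()
print(s2.params[1])   # log relative risk per unit of exposure
```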

13.
Little attention has been paid to the use of multi-sample batch-marking studies, as it is generally assumed that an individual's capture history is necessary for fully efficient estimates. However, recently, Huggins et al. (2010) present a pseudo-likelihood for a multi-sample batch-marking study where they used estimating equations to solve for survival and capture probabilities and then derived abundance estimates using a Horvitz-Thompson-type estimator. We have developed and maximized the likelihood for batch-marking studies. We use data simulated from a Jolly-Seber-type study and convert this to what would have been obtained from an extended batch-marking study. We compare our abundance estimates obtained from the Crosbie-Manly-Arnason-Schwarz (CMAS) model with those of the extended batch-marking model to determine the efficiency of collecting and analyzing batch-marking data. We found that estimates of abundance were similar for all three estimators: CMAS, Huggins, and our likelihood. Gains are made when using unique identifiers and employing the CMAS model in terms of precision; however, the likelihood typically had lower mean square error than the pseudo-likelihood method of Huggins et al. (2010). When faced with designing a batch-marking study, researchers can be confident in obtaining unbiased abundance estimators. Furthermore, they can design studies in order to reduce mean square error by manipulating capture probabilities and sample size.

14.
The heterogeneous Poisson process with discretized exponential quadratic rate function is considered. Maximum likelihood estimates of the parameters of the rate function are derived for the case when the data consist of numbers of occurrences in consecutive equal time periods. A likelihood ratio test of the null hypothesis of exponential quadratic rate is presented. Its power against exponential linear rate functions is estimated using Monte Carlo simulation. The maximum likelihood method is compared with a log-linear least squares technique. An application of the technique to the analysis of mortality rates due to congenital malformations is presented.
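Because the rate is exponential quadratic, the ML fit amounts to a Poisson regression on t and t^2; the sketch below (simulated counts) also forms the likelihood-ratio statistic comparing the linear and quadratic rate models.

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

t = np.arange(30, dtype=float)                 # consecutive equal time periods
rng = np.random.default_rng(9)
y = rng.poisson(np.exp(1.5 + 0.08 * t - 0.003 * t**2))

X2 = sm.add_constant(np.column_stack([t, t**2]))   # exponential quadratic rate
X1 = sm.add_constant(t)                            # exponential linear rate
fit2 = sm.GLM(y, X2, family=sm.families.Poisson()).fit()
fit1 = sm.GLM(y, X1, family=sm.families.Poisson()).fit()

lr = 2 * (fit2.llf - fit1.llf)                 # likelihood-ratio statistic, 1 df
print("LR =", round(lr, 2), " p =", round(1 - chi2.cdf(lr, df=1), 3))
```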

15.
Estimates of quantitative trait loci (QTL) effects derived from complete genome scans are biased, if no assumptions are made about the distribution of QTL effects. Bias should be reduced if estimates are derived by maximum likelihood, with the QTL effects sampled from a known distribution. The parameters of the distributions of QTL effects for nine economic traits in dairy cattle were estimated from a daughter design analysis of the Israeli Holstein population including 490 marker-by-sire contrasts. A separate gamma distribution was derived for each trait. Estimates of both the α and β parameters and their SEs decreased as a function of heritability. The maximum likelihood estimates derived for the individual QTL effects using the gamma distributions for each trait were regressed relative to the least squares estimates, but the regression factor decreased as a function of the least squares estimate. On simulated data, the mean of least squares estimates for effects with nominal 1% significance was more than twice the simulated values, while the mean of the maximum likelihood estimates was slightly lower than the mean of the simulated values. The coefficient of determination for the maximum likelihood estimates was five-fold the corresponding value for the least squares estimates.

16.
Diagnostic studies in ophthalmology frequently involve binocular data where pairs of eyes are evaluated, through some diagnostic procedure, for the presence of certain diseases or pathologies. The simplest approach of estimating measures of diagnostic accuracy, such as sensitivity and specificity, treats eyes as independent, consequently yielding incorrect estimates, especially of the standard errors. Approaches that account for the inter-eye correlation include regression methods using generalized estimating equations and likelihood techniques based on various correlated binomial models. The paper proposes a simple alternative statistical methodology of jointly estimating measures of diagnostic accuracy for binocular tests based on a flexible model for correlated binary data. Moment estimation of model parameters is outlined and asymptotic inference is discussed. The resulting estimates are straightforward and easy to obtain, requiring no special statistical software but only elementary calculations. Results of simulations indicate that large-sample and bootstrap confidence intervals based on the estimates have relatively good coverage properties when the model is correctly specified. The computation of the estimates and their standard errors are illustrated with data from a study on diabetic retinopathy.
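A moment-based sketch with simulated fellow-eye data: sensitivity is estimated by the overall proportion, and its naive standard error is inflated by the usual paired-cluster factor sqrt(1 + rho) using the estimated inter-eye correlation. This illustrates the general idea, not the paper's specific correlated-binary model.

```python
import numpy as np

rng = np.random.default_rng(10)
K = 150                                    # diseased subjects, two eyes each
shared = rng.random(K) < 0.8               # shared latent test result
fresh = rng.random((K, 2)) < 0.8
copy = rng.random((K, 2)) < 0.6            # copying induces inter-eye correlation
eyes = np.where(copy, shared[:, None], fresh).astype(int)

p_hat = eyes.mean()                                  # estimated sensitivity
rho_hat = np.corrcoef(eyes[:, 0], eyes[:, 1])[0, 1]  # inter-eye correlation
# naive SE treats the 2K eyes as independent; the adjustment inflates it
se_naive = np.sqrt(p_hat * (1 - p_hat) / (2 * K))
se_adj = se_naive * np.sqrt(1 + rho_hat)
print(f"sens {p_hat:.3f}, naive SE {se_naive:.4f}, adjusted SE {se_adj:.4f}")
```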

17.
A powerful methodology for analyzing post-synaptic currents recorded from central neurons is presented. An unknown quantity of transmitter molecules released from presynaptic terminals by electrical stimulation of nerve fibers generates a post-synaptic response at the synaptic site. The current induced at the synaptic junction is assumed to rise rapidly and decay slowly with its peak amplitude being proportional to the number of released transmitter molecules. The signal so generated is then distorted by the cable properties of the dendrite, modeled as a time-invariant, linear filter with unknown parameters. The response recorded from the cell body of the neuron following the electrical stimulation is contaminated by zero-mean, white, Gaussian noise. The parameters of the signal are then evaluated from the observation sequence using a quasi-profile likelihood estimation procedure. These parameter values are then employed to deconvolve each measured post-synaptic response to produce an optimal estimate of the transmembrane current flux. From these estimates we derive the amplitude of the synaptic current and the relative amount of transmitter molecules that elicited each response. The underlying amplitude fluctuations in the entire data sequence are investigated using a non-parametric technique based on kernel smoothing procedures. The effectiveness of the new methodology is illustrated in various simulation examples.

18.
19.
In many biometrical applications, the count data encountered often contain extra zeros relative to the Poisson distribution. Zero‐inflated Poisson regression models are useful for analyzing such data, but parameter estimates may be seriously biased if the nonzero observations are over‐dispersed and simultaneously correlated due to the sampling design or the data collection procedure. In this paper, a zero‐inflated negative binomial mixed regression model is presented to analyze a set of pancreas disorder length of stay (LOS) data that comprised mainly same‐day separations. Random effects are introduced to account for inter‐hospital variations and the dependency of clustered LOS observations. Parameter estimation is achieved by maximizing an appropriate log‐likelihood function using an EM algorithm. Alternative modeling strategies, namely the finite mixture of Poisson distributions and the non‐parametric maximum likelihood approach, are also considered. The determination of pertinent covariates would assist hospital administrators and clinicians to manage LOS and expenditures efficiently.
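An intercept-only sketch of the zero-inflated negative binomial likelihood on simulated data, maximized numerically; covariates, random effects, and the EM algorithm of the paper are omitted.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import nbinom

rng = np.random.default_rng(11)
n_obs = 1000
pi0, mu0, k0 = 0.3, 4.0, 1.5                   # zero-inflation, mean, dispersion
y = np.where(rng.random(n_obs) < pi0, 0,
             nbinom.rvs(k0, k0 / (k0 + mu0), size=n_obs, random_state=rng))

def neg_loglik(theta):
    pi = 1 / (1 + np.exp(-theta[0]))           # keep parameters in range
    mu, k = np.exp(theta[1:])
    p = k / (k + mu)
    ll_zero = np.log(pi + (1 - pi) * nbinom.pmf(0, k, p))   # structural + NB zeros
    ll_pos = np.log(1 - pi) + nbinom.logpmf(y, k, p)
    return -np.where(y == 0, ll_zero, ll_pos).sum()

fit = minimize(neg_loglik, x0=[0.0, 1.0, 0.0], method="Nelder-Mead")
pi_hat = 1 / (1 + np.exp(-fit.x[0]))
mu_hat, k_hat = np.exp(fit.x[1:])
print(f"pi {pi_hat:.2f}, mu {mu_hat:.2f}, k {k_hat:.2f}")
```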

20.
Time-series data resulting from surveying wild animals are often described using state-space population dynamics models, in particular with Gompertz, Beverton-Holt, or Moran-Ricker latent processes. We show how hidden Markov model methodology provides a flexible framework for fitting a wide range of models to such data. This general approach makes it possible to model abundance on the natural or log scale, include multiple observations at each sampling occasion and compare alternative models using information criteria. It also easily accommodates unequal sampling time intervals, should they occur, and allows testing for density dependence using the bootstrap. The paper is illustrated by replicated time series of red kangaroo abundances, and a univariate time series of ibex counts which are an order of magnitude larger. In the analyses carried out, we fit different latent process and observation models using the hidden Markov framework. Results are robust with regard to the necessary discretization of the state variable. We find no effective difference between the three latent models of the paper in terms of maximized likelihood value for the two applications presented, as well as for others analyzed. Simulations suggest that ecological time series are not sufficiently informative to distinguish between alternative latent processes for modeling population survey data when data do not indicate strong density dependence.
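A minimal forward-algorithm sketch of the HMM approach for a Gompertz state-space model with Poisson observations: log-abundance is discretized onto a grid and the likelihood is evaluated by the scaled forward recursion. All values are simulated; in practice one would maximize loglik numerically and check robustness to the grid resolution, as the paper does.

```python
import numpy as np
from scipy.stats import norm, poisson

rng = np.random.default_rng(12)
# simulate a Gompertz series: x_t = a + b*x_{t-1} + eps, y_t ~ Poisson(exp(x_t))
a0, b0, sd0, T = 0.6, 0.8, 0.2, 60
x = np.empty(T)
x[0] = a0 / (1 - b0)
for t in range(1, T):
    x[t] = a0 + b0 * x[t - 1] + rng.normal(0, sd0)
y = rng.poisson(np.exp(x))

grid = np.linspace(x.min() - 1, x.max() + 1, 200)   # discretized log-abundance

def loglik(a, b, sd):
    trans = norm.pdf(grid[None, :], a + b * grid[:, None], sd)
    trans /= trans.sum(1, keepdims=True)            # row-normalized transitions
    emis = poisson.pmf(y[:, None], np.exp(grid)[None, :])
    alpha = np.full(grid.size, 1 / grid.size) * emis[0]   # uniform initial state
    ll = 0.0
    for t in range(1, T):                           # scaled forward recursion
        c = alpha.sum(); ll += np.log(c); alpha /= c
        alpha = (alpha @ trans) * emis[t]
    return ll + np.log(alpha.sum())

print(loglik(0.6, 0.8, 0.2))   # evaluated at the simulated values; maximize in practice
```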

