Similar Documents
20 similar documents retrieved.
1.
Matrix models are widely used in biology to predict the temporal evolution of stage-structured populations. One issue related to matrix models that is often disregarded is sampling variability. Because the samples used to estimate the vital rates of the models are of finite size, a sampling error is attached to the parameter estimates, which in turn affects all the predictions of the model. In this study, we address the question of building confidence bounds, induced by sampling variability, around the predictions of matrix models. We focus on a density-dependent Usher model, the maximum likelihood estimator of its parameters, and the predicted stationary stage vector. The asymptotic distribution of the stationary stage vector is derived, assuming that the parameters of the model remain in a region of the parameter space where the model admits a unique equilibrium point. Tests for density dependence are also provided. The model is applied to a tropical rain forest in French Guiana.
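To make the setting concrete, here is a minimal sketch (with entirely hypothetical stage counts and vital rates) of iterating a density-dependent Usher model to a stationary stage vector; the exponential recruitment decay `f0 * exp(-beta * N)` is an illustrative choice, not the paper's fitted form.

```python
# A minimal sketch of a density-dependent Usher model, iterated to equilibrium.
import numpy as np

def usher_matrix(n_total, p_stay, p_up, f0, beta):
    """Usher transition matrix; recruitment f0*exp(-beta*n_total) is a
    hypothetical density-dependent fecundity, declining with total abundance."""
    k = len(p_stay)
    A = np.zeros((k, k))
    for i in range(k):
        A[i, i] = p_stay[i]            # probability of staying in stage i
        if i + 1 < k:
            A[i + 1, i] = p_up[i]      # probability of moving up one stage
    A[0, :] += f0 * np.exp(-beta * n_total)   # density-dependent recruitment
    return A

# hypothetical parameter values for a 3-stage population
p_stay = np.array([0.5, 0.6, 0.8])
p_up = np.array([0.3, 0.2, 0.0])
n = np.array([100.0, 50.0, 20.0])
for _ in range(2000):                  # fixed-point iteration to the equilibrium
    n = usher_matrix(n.sum(), p_stay, p_up, f0=0.9, beta=0.005) @ n
print("stationary stage vector:", n)
```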

2.
Owing to its robustness properties, marginal interpretations, and ease of implementation, the pseudo-partial likelihood method proposed in the seminal papers of Pepe and Cai and of Lin et al. has become the default approach for analyzing recurrent event data with Cox-type proportional rate models. However, the construction of the pseudo-partial score function ignores the dependency among recurrent events and can thus be inefficient. An attempt to investigate the asymptotic efficiency of weighted pseudo-partial likelihood estimation found that the optimal weight function involves the unknown variance–covariance process of the recurrent event process and may not have a closed-form expression. Thus, instead of deriving the optimal weights, we propose to combine a system of pre-specified weighted pseudo-partial score equations via the generalized method of moments and empirical likelihood estimation. We show that a substantial efficiency gain can be achieved without imposing additional model assumptions. More importantly, the proposed estimation procedures can be implemented with existing software. Theoretical and numerical analyses show that the empirical likelihood estimator is more appealing than the generalized method of moments estimator when the sample size is sufficiently large. An analysis of readmission risk in colorectal cancer patients is presented to illustrate the proposed methodology.

3.
We have estimated the number of sika deer, Cervus nippon, in Hokkaido, Japan, with the aim of developing a management program that will reduce the level of agricultural damage caused by these deer. A population index, defined as the population size divided by that of 1993, is first estimated from data obtained during a spotlight survey. A generalized linear mixed model (GLMM) with corner-point constraints is used in this estimation. We then estimate the population from the index by evaluating the response of the index to the known amount of harvest, including hunting. A stage-structured model is used in this harvest-based estimation. It is well known that estimates of indices suffer from large observation errors when the probability of observation fluctuates widely; we therefore apply state-space modeling to the harvest-based estimation to remove the observation errors. We propose the use of Bayesian estimation with uniform prior distributions as an approximation of maximum likelihood estimation, without the arbitrary assumption that the parameters fluctuate according to prior distributions. We demonstrate that harvest-based Bayesian estimation is effective in reducing the observation errors in sika deer populations, but the stage-structured model requires many demographic parameters to be known before running the analyses. These parameters cannot be estimated from the observed time series of the index if the data are insufficient. We therefore construct a univariate model by simplifying the stage-structured model and show that the simplified model yields estimates nearly identical to those obtained from the stage-structured model. This simplification also clarifies which parameters are important in estimating the population. Electronic supplementary material: The online version of this article (doi:) contains supplementary material, which is available to authorized users.
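As a rough illustration of the simplified univariate model described above, the sketch below fits N[t+1] = r·(N[t] − H[t]) to a spotlight index scaled to 1 in the first year. The harvest and index values are invented, the observation error is fixed, and a crude grid search stands in for the paper's state-space machinery.

```python
# A minimal sketch of harvest-based estimation with a univariate model.
import numpy as np

H = np.array([2000., 2500., 3000., 4000., 5000.])        # known annual harvest
I_obs = np.array([1.00, 1.08, 1.10, 1.05, 0.95, 0.80])   # observed index, year 1 == 1

def neg_log_lik(n0, r, sigma=0.1):
    N = [n0]
    for h in H:                        # simplified dynamics: N[t+1] = r*(N[t]-H[t])
        N.append(r * (N[-1] - h))
    N = np.array(N)
    if np.any(N <= 0):                 # trajectory extinct: impossible fit
        return np.inf
    resid = np.log(I_obs) - np.log(N / N[0])   # lognormal observation error
    return 0.5 * np.sum(resid ** 2) / sigma ** 2

# crude grid search over initial population size and growth rate
grid = [(n0, r) for n0 in np.linspace(2e4, 1e5, 81) for r in np.linspace(0.9, 1.3, 41)]
n0_hat, r_hat = min(grid, key=lambda p: neg_log_lik(*p))
print("crude MLE (N0, r):", n0_hat, r_hat)
```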

4.
Maximum likelihood estimation of the model parameters for a spatial population, based on data collected from a survey sample, is usually straightforward when sampling and non-response are both non-informative, since the model can then be fitted using the available sample data and no allowance is necessary for the fact that only part of the population has been observed. Although this naive strategy yields consistent estimates for many regression models, it does not for some, such as spatial auto-regressive models. In this paper, we show that for a broad class of such models, a maximum marginal likelihood approach that uses both sample and population data leads to more efficient estimates, since it exploits spatial information from sampled as well as non-sampled units. Extensive simulation experiments based on two well-known data sets are used to assess the impact of the spatial sampling design, the auto-correlation parameter, and the sample size on the performance of this approach. When compared with some widely used methods that use only sample data, the results from these experiments show that the maximum marginal likelihood approach is much more precise.
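The sketch below illustrates the full-data likelihood idea on a toy spatial auto-regressive model y = ρWy + Xβ + ε: with every unit's spatial position known, the profile log-likelihood for ρ uses the log-determinant of (I − ρW) over the whole population. The ring-lattice weight matrix and all parameter values are invented for illustration.

```python
# A minimal sketch of the profile log-likelihood for rho in a SAR model.
import numpy as np

rng = np.random.default_rng(8)
n = 64
W = np.zeros((n, n))
for i in range(n):                     # ring-neighbour weights, row-standardized
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5
X = np.column_stack([np.ones(n), rng.normal(0, 1, n)])
A_true = np.eye(n) - 0.6 * W
y = np.linalg.solve(A_true, X @ [1.0, 2.0] + rng.normal(0, 1, n))

def profile_ll(rho):
    A = np.eye(n) - rho * W
    Ay = A @ y
    beta = np.linalg.lstsq(X, Ay, rcond=None)[0]   # profiled-out beta
    s2 = np.mean((Ay - X @ beta) ** 2)             # profiled-out sigma^2
    return np.linalg.slogdet(A)[1] - 0.5 * n * np.log(s2)

rhos = np.linspace(-0.9, 0.9, 181)
print("rho MLE ~", rhos[np.argmax([profile_ll(r) for r in rhos])])
```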

5.

Background: The estimation of demographic parameters from genetic data often requires the computation of likelihoods. However, the likelihood function is computationally intractable for many realistic evolutionary models, and the use of Bayesian inference has therefore been limited to very simple models. The situation changed recently with the advent of Approximate Bayesian Computation (ABC) algorithms, which allow one to obtain parameter posterior distributions from simulations without computing likelihoods.
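A minimal ABC rejection sketch of the idea, using a toy exponential model, a single summary statistic, and an arbitrary tolerance; real ABC applications use richer summaries and adaptive tolerances.

```python
# A minimal ABC rejection sampler: posterior draws without a likelihood.
import numpy as np

rng = np.random.default_rng(1)
data = rng.exponential(scale=2.0, size=200)          # pretend observed data
s_obs = data.mean()                                  # summary statistic

accepted = []
for _ in range(100_000):
    theta = rng.uniform(0.1, 10.0)                   # draw from the prior
    sim = rng.exponential(scale=theta, size=len(data))
    if abs(sim.mean() - s_obs) < 0.05:               # keep if summaries match
        accepted.append(theta)
print("approx. posterior mean:", np.mean(accepted))
```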

6.
Aims: Fits of species-abundance distributions to empirical data are increasingly used to evaluate models of diversity maintenance and community structure and to infer properties of communities, such as species richness. Two distributions predicted by several models are the Poisson lognormal (PLN) and the negative binomial (NB); however, at least three different ways to parameterize the PLN have been proposed, which differ in whether unobserved species contribute to the likelihood and in whether the likelihood is conditional on the total number of individuals in the sample. Each of these has an analogue for the NB. Here, we propose a new formulation of the PLN and NB that includes the number of unobserved species as one of the estimated parameters. We investigate the performance of parameter estimates obtained from this reformulation, as well as from the existing alternatives, for drawing inferences about the shape of species-abundance distributions and for estimating species richness.

Methods: We simulate the random sampling of a fixed number of individuals from lognormal and gamma community relative-abundance distributions, using a previously developed 'individual-based' bootstrap algorithm. We use a range of sample sizes, community species richness levels, and shape parameters for the species-abundance distributions that span much of the realistic range for empirical data, generating 1,000 simulated data sets for each parameter combination. We then fit each of the alternative likelihoods to each of the simulated data sets and assess the bias, sampling variance, and estimation error of each method.

Important findings: Parameter estimates behave reasonably well for most parameter values, exhibiting modest levels of median error. However, for the NB, median error becomes extremely large as the NB approaches either of two limiting cases. For both the NB and PLN, >90% of the variation in the error in model parameters across parameter sets is explained by three quantities: the proportion of species not observed in the sample, the expected number of species observed in the sample, and the discrepancy between the true NB or PLN distribution and a Poisson distribution with the same mean. There are relatively few systematic differences between the four alternative likelihoods. In particular, failing to condition the likelihood on the total sample size does not appear to systematically increase the bias in parameter estimates. Indeed, overall, the classical likelihood performs slightly better than the alternatives. However, our reparameterized likelihood, for which species richness is a fitted parameter, has important advantages over existing approaches for estimating species richness from fitted species-abundance models.
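As a hedged illustration of how unobserved species enter the likelihood, the sketch below fits a zero-truncated negative binomial to simulated abundance counts: truncation at zero reflects that species absent from the sample contribute no observed count. This is a simplification of the alternatives compared in the abstract, not their exact likelihoods.

```python
# A minimal sketch: ML fit of a zero-truncated NB to observed species counts.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import nbinom

rng = np.random.default_rng(7)
counts = rng.negative_binomial(0.5, 0.1, size=400)   # simulated community sample
counts = counts[counts > 0]                          # only observed species remain

def nll(params):
    r, p = params
    if r <= 0 or not 0 < p < 1:                      # keep search in valid region
        return np.inf
    log_f = nbinom.logpmf(counts, r, p)
    log_trunc = np.log1p(-nbinom.pmf(0, r, p))       # renormalize over k >= 1
    return -np.sum(log_f - log_trunc)

fit = minimize(nll, x0=[1.0, 0.3], method="Nelder-Mead")
print("NB shape r, prob p:", fit.x)
```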

7.
The initial exponential growth rate of an epidemic is an important measure of disease spread and is commonly used to infer the basic reproduction number $\mathcal{R}_0$. While modern techniques (e.g., MCMC and particle filtering) for parameter estimation of mechanistic models have gained popularity, maximum likelihood fitting of phenomenological models remains important due to its simplicity, to the difficulty of using modern methods in the context of limited data, and to the fact that there is not always enough information available to choose an appropriate mechanistic model. However, it is often not clear which phenomenological model is appropriate for a given dataset. We compare the performance of four commonly used phenomenological models (exponential, Richards, logistic, and delayed logistic) in estimating initial epidemic growth rates by maximum likelihood, fitting them to simulated epidemics with known parameters. For incidence data, both the logistic model and the Richards model yield accurate point estimates for fitting windows up to the epidemic peak. When observation errors are small, the Richards model yields confidence intervals with better coverage. For mortality data, the Richards model and the delayed logistic model yield the best growth rate estimates. We also investigate the width and coverage of the confidence intervals corresponding to these fits.
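A minimal sketch of the fitting exercise for one of the four models: least-squares fitting of a logistic cumulative-incidence curve to simulated data (standing in for maximum likelihood under Gaussian observation error; all parameter values are invented), with a Wald interval for the growth rate r.

```python
# A minimal sketch: estimating the initial growth rate via a logistic fit.
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, K, r, t0):
    return K / (1.0 + np.exp(-r * (t - t0)))         # cumulative incidence

rng = np.random.default_rng(0)
t = np.arange(0, 40)
true = logistic(t, K=5000, r=0.3, t0=20)
y = true + rng.normal(0, 50, size=t.size)            # noisy cumulative cases

popt, pcov = curve_fit(logistic, t, y, p0=[4000, 0.2, 15])
r_hat, r_se = popt[1], np.sqrt(pcov[1, 1])
print(f"growth rate: {r_hat:.3f} +/- {1.96 * r_se:.3f}")  # Wald 95% CI
```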

8.
Publication bias is a major concern in conducting systematic reviews and meta-analyses. Various sensitivity analysis and bias-correction methods have been developed based on selection models, and they have some advantages over the widely used trim-and-fill bias-correction method. However, likelihood methods based on selection models may have difficulty obtaining precise estimates and reasonable confidence intervals, or may require a rather complicated sensitivity analysis process. Herein, we develop a simple publication bias adjustment method that utilizes information on conducted but still unpublished trials from clinical trial registries. We introduce an estimating equation for parameter estimation in the selection function by regarding the publication bias issue as a missing data problem under the missing-not-at-random assumption. With the estimated selection function, we introduce the inverse probability weighting (IPW) method to estimate the overall mean across studies. Furthermore, IPW versions of heterogeneity measures such as the between-study variance and the $I^2$ measure are proposed. We propose methods to construct confidence intervals based on asymptotic normal approximation as well as on the parametric bootstrap. Through numerical experiments, we observed that the estimators successfully eliminated bias and that the confidence intervals had empirical coverage probabilities close to the nominal level. On the other hand, the confidence interval based on the asymptotic normal approximation is much wider in some scenarios than the bootstrap confidence interval; the latter is therefore recommended for practical use.
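The core IPW idea can be sketched as follows, with a logistic selection function that is assumed known here purely for illustration; in the paper it is estimated from registry information on unpublished trials.

```python
# A minimal IPW sketch: reweight published effects by 1 / publication prob.
import numpy as np

rng = np.random.default_rng(2)
effects = rng.normal(0.2, 0.3, size=500)             # all conducted trials
p_pub = 1 / (1 + np.exp(-4 * effects))               # bias: larger effects publish
published = effects[rng.random(500) < p_pub]

w = 1 + np.exp(-4 * published)                       # 1 / p_pub for published trials
naive = published.mean()
ipw = np.sum(w * published) / np.sum(w)              # Hajek-type weighted mean
print(f"naive mean {naive:.3f}  IPW mean {ipw:.3f}  truth 0.200")
```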

9.
We propose a method to construct simultaneous confidence intervals for a parameter vector by inverting a series of randomization tests (RT). The randomization tests are facilitated by an efficient multivariate Robbins–Monro procedure that takes the correlation information of all components into account. The estimation method does not require any distributional assumption about the population other than the existence of second moments. The resulting simultaneous confidence intervals are not necessarily symmetric about the point estimate of the parameter vector but possess the property of equal tails in all dimensions. In particular, we present the construction for the mean vector of a single population and for the difference between the mean vectors of two populations. Extensive simulations are conducted for numerical comparison with four other methods. We illustrate the application of the proposed method in testing bioequivalence with multiple endpoints on real data.
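A univariate Robbins–Monro sketch of the underlying stochastic-approximation engine; the paper's procedure is multivariate and embedded in randomization tests, and the linear response and gain sequence here are illustrative.

```python
# A minimal Robbins-Monro sketch: find x where a noisy response hits a target.
import numpy as np

rng = np.random.default_rng(3)
def noisy_response(x):                 # unknown mean function + noise
    return 2.0 * x - 1.0 + rng.normal(0, 0.5)

target, x = 0.0, 5.0                   # solve E[response(x)] = target
for n in range(1, 5001):
    x -= (1.0 / n) * (noisy_response(x) - target)    # RM update, gain a_n = 1/n
print("root estimate:", x, "(true root 0.5)")
```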

10.
Lu Xia, Bin Nan, Yi Li. Biometrics, 2023, 79(1): 344–357.
Modeling and drawing inference on the joint associations between single-nucleotide polymorphisms and a disease has sparked interest in genome-wide association studies. In the motivating Boston Lung Cancer Survival Cohort (BLCSC) data, the presence of a large number of single-nucleotide polymorphisms of interest, though smaller than the sample size, challenges inference on their joint associations with the disease outcome. In similar settings, we find that neither the debiased lasso approach (van de Geer et al., 2014), which assumes sparsity of the inverse information matrix, nor the standard maximum likelihood method can yield confidence intervals with satisfactory coverage probabilities for generalized linear models. Under this “large n, diverging p” scenario, we propose an alternative debiased lasso approach that directly inverts the Hessian matrix without imposing the matrix sparsity assumption; this further reduces bias compared with the original debiased lasso and ensures valid confidence intervals with nominal coverage probabilities. We establish the asymptotic distributions of arbitrary linear combinations of the parameter estimates, which lays the theoretical ground for drawing inference. Simulations show that the proposed refined debiasing method performs well in removing bias and yields honest confidence interval coverage. We use the proposed method to analyze the aforementioned BLCSC data, a large-scale hospital-based epidemiological cohort study investigating the joint effects of genetic variants on lung cancer risk.

11.
12.
Single nucleotide polymorphism (SNP) data can be used for parameter estimation via maximum likelihood methods as long as the way in which the SNPs were determined is known, so that an appropriate likelihood formula can be constructed. We present such likelihoods for several sampling methods. As a test of these approaches, we consider the use of SNPs to estimate the parameter $\Theta = 4N_e\mu$ (the scaled product of effective population size and per-site mutation rate), which is related to the branch lengths of the reconstructed genealogy. With infinite amounts of data, ML models using SNP data are expected to produce consistent estimates of $\Theta$. With finite amounts of data the estimates are accurate when $\Theta$ is high, but tend to be biased upward when $\Theta$ is low. If recombination is present and not allowed for in the analysis, the results are additionally biased upward, but this effect can be removed by incorporating recombination into the analysis. SNPs defined as sites that are polymorphic in the actual sample under consideration (sample SNPs) are somewhat more accurate for estimation of $\Theta$ than SNPs defined by their polymorphism in a panel chosen from the same population (panel SNPs). Misrepresenting panel SNPs as sample SNPs leads to large errors in the maximum likelihood estimate of $\Theta$. Researchers collecting SNPs should collect and preserve information about the method of ascertainment so that the data can be accurately analyzed.
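For orientation, a back-of-the-envelope sketch of estimating $\Theta$ per site with the classic Watterson estimator — a simple moment-based baseline that, unlike the likelihoods above, does not model ascertainment; the inputs are invented.

```python
# A minimal sketch of the Watterson estimator of Theta = 4*Ne*mu per site.
import numpy as np

n_seq = 20                  # sequences sampled
n_sites = 10_000            # sites surveyed
s_poly = 45                 # segregating (polymorphic) sites found

a_n = np.sum(1.0 / np.arange(1, n_seq))   # harmonic sum over 1..(n-1)
theta_hat = s_poly / (a_n * n_sites)      # per-site estimate
print(f"theta per site: {theta_hat:.5f}")
```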

13.
To plan for any future rescue of personnel in a disabled and pressurized submarine, the US Navy needs a method for predicting the risk of decompression sickness under possible scenarios for crew recovery. Such scenarios include direct ascent from compressed-air exposures with risks too high for ethical human experiments. Animal data, however, with their extensive range of exposure pressures and incidence of decompression sickness, could improve prediction of high-risk human exposures. Hill equation dose-response models were fitted by maximum likelihood to 898 air-saturation, direct-ascent dives from humans, pigs, and rats, both individually and combined. Combining the species allowed estimation of a single, more precise Hill equation exponent (steepness parameter), thereby increasing the precision of human risk predictions. These predictions agreed more closely with the observed data at 2 ATA than a current, more general US Navy model, although the confidence limits of both models overlapped those of the data. However, the greatest benefit of adding animal data was observed after removal of the highest-risk human exposures, requiring the models to extrapolate.
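A minimal sketch of the model class: a two-parameter Hill dose-response curve fitted to binomial outcome data by maximum likelihood. The dive counts below are fabricated and stand in for the 898 dives analyzed in the paper.

```python
# A minimal sketch: Hill dose-response model fitted by binomial ML.
import numpy as np
from scipy.optimize import minimize

dose = np.array([1.5, 2.0, 2.5, 3.0, 3.5])           # exposure pressure (ATA)
n_dives = np.array([50, 60, 55, 40, 30])
n_dcs = np.array([1, 5, 12, 18, 21])                 # decompression sickness cases

def nll(params):
    ed50, hill = params
    if ed50 <= 0 or hill <= 0:                       # keep search in valid region
        return np.inf
    p = dose ** hill / (dose ** hill + ed50 ** hill) # Hill equation risk
    p = np.clip(p, 1e-9, 1 - 1e-9)
    return -np.sum(n_dcs * np.log(p) + (n_dives - n_dcs) * np.log(1 - p))

fit = minimize(nll, x0=[3.0, 4.0], method="Nelder-Mead")
print("ED50, Hill exponent:", fit.x)
```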

14.
We describe a new approximate likelihood for population genetic data under a model in which a single ancestral population has split into two daughter populations. The approximate likelihood is based on the ‘Product of Approximate Conditionals’ likelihood and ‘copying model’ of Li and Stephens [Li, N., Stephens, M., 2003. Modeling linkage disequilibrium and identifying recombination hotspots using single-nucleotide polymorphism data. Genetics 165 (4), 2213–2233]. The approach developed here may be used for efficient approximate likelihood-based analyses of unlinked data. However, our copying model also accounts for the effects of recombination. Hence, a more important application is to loosely linked haplotype data, for which efficient statistical models explicitly featuring non-equilibrium population structure have so far been unavailable. Thus, in addition to the information in allele frequency differences about the timing of the population split, the method can also extract information from the lengths of haplotypes shared between the populations. Extracting such information poses a number of challenges that make parameter estimation difficult. We discuss how the approach could be extended to identify haplotypes introduced by migrants.

15.
16.
Ecologists are increasingly using statistical models to predict animal abundance and occurrence in unsampled locations. The reliability of such predictions depends on a number of factors, including sample size, how far prediction locations are from the observed data, and similarity of predictive covariates in locations where data are gathered to locations where predictions are desired. In this paper, we propose extending Cook’s notion of an independent variable hull (IVH), developed originally for application with linear regression models, to generalized regression models as a way to help assess the potential reliability of predictions in unsampled areas. Predictions occurring inside the generalized independent variable hull (gIVH) can be regarded as interpolations, while predictions occurring outside the gIVH can be regarded as extrapolations worthy of additional investigation or skepticism. We conduct a simulation study to demonstrate the usefulness of this metric for limiting the scope of spatial inference when conducting model-based abundance estimation from survey counts. In this case, limiting inference to the gIVH substantially reduces bias, especially when survey designs are spatially imbalanced. We also demonstrate the utility of the gIVH in diagnosing problematic extrapolations when estimating the relative abundance of ribbon seals in the Bering Sea as a function of predictive covariates. We suggest that ecologists routinely use diagnostics such as the gIVH to help gauge the reliability of predictions from statistical models (such as generalized linear, generalized additive, and spatio-temporal regression models).
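For the linear-model special case, Cook's IVH check is easy to sketch: flag prediction points whose generalized leverage $x_0^\top(X^\top X)^{-1}x_0$ exceeds the maximum leverage in the observed design matrix. The covariates below are simulated; the gIVH extends this idea to generalized models via predictive variance.

```python
# A minimal sketch of an IVH extrapolation check via leverages.
import numpy as np

rng = np.random.default_rng(4)
X = np.column_stack([np.ones(100), rng.normal(0, 1, (100, 2))])  # design matrix
XtX_inv = np.linalg.inv(X.T @ X)
h_max = np.max(np.sum((X @ XtX_inv) * X, axis=1))    # max leverage of data

X_new = np.column_stack([np.ones(5), rng.normal(0, 3, (5, 2))])  # prediction pts
lev_new = np.sum((X_new @ XtX_inv) * X_new, axis=1)
print("extrapolation flags:", lev_new > h_max)       # True = outside the IVH
```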

17.
We consider parametric distributions intended to model heterogeneity in population size estimation, especially parametric stochastic abundance models for species richness estimation. We briefly review (conditional) maximum likelihood estimation of the number of species, and summarize the results of fitting 7 candidate models to frequency-count data from a database of >40,000 such instances, mostly arising from microbial ecology. We consider error estimation, goodness-of-fit assessment, data subsetting, and other practical matters. We find that, although the array of candidate models can be improved, finite mixtures of a small number of components (point masses or simple diffuse distributions) represent a promising direction. Finally, we consider the connections between parametric models for abundance and incidence data, again noting the usefulness of finite mixture models.

18.
M. K. Kuhner, J. Yamato, J. Felsenstein. Genetics, 1995, 140(4): 1421–1430.
We present a new way to make a maximum likelihood estimate of the parameter $4N_e\mu$ (effective population size times mutation rate per site, or θ) based on a population sample of molecular sequences. We use a Metropolis–Hastings Markov chain Monte Carlo method to sample genealogies in proportion to the product of their likelihood with respect to the data and their prior probability with respect to a coalescent distribution. A specific value of θ must be chosen to generate the coalescent distribution, but the resulting trees can be used to evaluate the likelihood at other values of θ, generating a likelihood curve. This procedure concentrates sampling on those genealogies that contribute most of the likelihood, allowing estimation of meaningful likelihood curves from relatively small samples. The method can potentially be extended to cases involving varying population size, recombination, and migration.
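A toy random-walk Metropolis–Hastings sketch of the sampling engine; the actual method proposes genealogies rather than scalars and evaluates coalescent priors at a driving value of θ.

```python
# A minimal Metropolis-Hastings sketch on a scalar toy target.
import numpy as np

rng = np.random.default_rng(5)
data = rng.normal(1.0, 1.0, size=50)

def log_post(theta):                   # N(theta, 1) likelihood, flat prior
    return -0.5 * np.sum((data - theta) ** 2)

theta, chain = 0.0, []
for _ in range(20_000):
    prop = theta + rng.normal(0, 0.5)                # symmetric random-walk proposal
    if np.log(rng.random()) < log_post(prop) - log_post(theta):
        theta = prop                                 # accept the move
    chain.append(theta)
print("posterior mean ~", np.mean(chain[5000:]))     # discard burn-in
```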

19.
A class of generalized linear mixed models can be obtained by introducing random effects into the linear predictor of a generalized linear model, e.g., a split-plot model for binary or count data. Maximum likelihood estimation, for normally distributed random effects, involves high-dimensional numerical integration, with severe limitations on the number and structure of the additional random effects. An alternative estimation procedure, based on an extension of the iterative re-weighted least squares procedure for generalized linear models, is illustrated on a practical data set involving carcass classification of cattle. The data are analysed as overdispersed binomial proportions with fixed and random effects and associated components of variance on the logit scale. Estimates are obtained with standard software for normal-data mixed models. Numerical restrictions pertain to the size of the matrices to be inverted. This can be dealt with by absorption techniques familiar from, e.g., mixed models in animal breeding. The final model fitted to the classification data includes four components of variance and a multiplicative overdispersion factor. The estimation procedure is essentially a combination of iterated least squares steps, and no full distributional assumptions are needed. A simulation study based on the classification data is presented, including a study of procedures for constructing confidence intervals and significance tests for fixed effects and components of variance. The simulation results increase confidence in the usefulness of the estimation procedure.
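A minimal sketch of the iterative re-weighted least squares core for a logistic GLM with fixed effects only (the paper's extension adds random effects and variance components on the logit scale); the data are simulated.

```python
# A minimal IRLS sketch for a logistic regression (fixed effects only).
import numpy as np

rng = np.random.default_rng(6)
X = np.column_stack([np.ones(200), rng.normal(0, 1, 200)])
beta_true = np.array([-0.5, 1.2])
y = (rng.random(200) < 1 / (1 + np.exp(-X @ beta_true))).astype(float)

beta = np.zeros(2)
for _ in range(25):                                   # IRLS iterations
    eta = X @ beta
    mu = 1 / (1 + np.exp(-eta))
    W = mu * (1 - mu)                                 # GLM working weights
    z = eta + (y - mu) / W                            # working response
    beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))
print("IRLS estimates:", beta)
```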

20.
Motivated by the spatial modeling of aberrant crypt foci (ACF) in colon carcinogenesis, we consider binary data with probabilities modeled as the sum of a nonparametric mean plus a latent Gaussian spatial process that accounts for short-range dependencies. The mean is modeled in a general way using regression splines. The mean function can be viewed as a fixed effect and is estimated with a penalty for regularization. With the latent process viewed as another random effect, the model becomes a generalized linear mixed model. In our motivating data set and other applications, the sample size is too large to easily accommodate maximum likelihood or restricted maximum likelihood (REML) estimation, so pairwise likelihood, a special case of composite likelihood, is used instead. We develop an asymptotic theory for models that are sufficiently general to be used in a wide variety of applications, including, but not limited to, the problem that motivated this work. The splines have penalty parameters that must converge to zero asymptotically; we derive theory for this, along with a data-driven method for selecting the penalty parameter, a method that is shown in simulations to improve greatly upon standard devices such as likelihood cross-validation. Finally, we apply the methods to the data from our ACF experiment. We discover an unexpected location for peak formation of ACF.
