Similar articles
Found 20 similar articles (search time: 475 ms)
1.
In this article, we propose a two-stage approach to modeling multilevel clustered non-Gaussian data with sufficiently large numbers of continuous measures per cluster. Such data are common in biological and medical studies utilizing monitoring or image-processing equipment. We consider a general class of hierarchical models that generalizes the model in the global two-stage (GTS) method for nonlinear mixed effects models by using any square-root-n-consistent and asymptotically normal estimators from stage 1 as pseudodata in the stage 2 model, and by extending the stage 2 model to accommodate random effects from multiple levels of clustering. The second-stage model is a standard linear mixed effects model with normal random effects, but the cluster-specific distributions, conditional on random effects, can be non-Gaussian. This methodology provides a flexible framework for modeling not only a location parameter but also other characteristics of conditional distributions that may be of specific interest. For estimation of the population parameters, we propose a conditional restricted maximum likelihood (CREML) approach and establish the asymptotic properties of the CREML estimators. The proposed general approach is illustrated using quartiles as cluster-specific parameters estimated in the first stage, and applied to the data example from a collagen fibril development study. We demonstrate using simulations that in samples with small numbers of independent clusters, the CREML estimators may perform better than conditional maximum likelihood estimators, which are a direct extension of the estimators from the GTS method.

2.
We introduce a new statistical computing method, called data cloning, to calculate maximum likelihood estimates and their standard errors for complex ecological models. Although the method uses the Bayesian framework and exploits the computational simplicity of the Markov chain Monte Carlo (MCMC) algorithms, it provides valid frequentist inferences such as the maximum likelihood estimates and their standard errors. The inferences are completely invariant to the choice of the prior distributions and therefore avoid the inherent subjectivity of the Bayesian approach. The data cloning method is easily implemented using standard MCMC software. Data cloning is particularly useful for analysing ecological situations in which hierarchical statistical models, such as state-space models and mixed effects models, are appropriate. We illustrate the method by fitting two nonlinear population dynamics models to data in the presence of process and observation noise.
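The idea behind data cloning can be illustrated on a toy case. The sketch below is not the authors' implementation: it assumes a binomial observation with a conjugate Beta prior, so the posterior for the cloned data is available in closed form and no MCMC is needed. The function names and the numbers in the usage note are hypothetical. With many clones, the posterior mean approaches the maximum likelihood estimate, and the number of clones times the posterior variance approaches the asymptotic variance of that estimate, whatever the prior.

```python
def cloned_beta_posterior(successes, trials, clones, prior_a=1.0, prior_b=1.0):
    """Beta posterior parameters when the binomial data are repeated `clones` times."""
    alpha = prior_a + clones * successes
    beta = prior_b + clones * (trials - successes)
    return alpha, beta


def beta_mean_var(alpha, beta):
    """Mean and variance of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    mean = alpha / total
    var = alpha * beta / (total ** 2 * (total + 1.0))
    return mean, var


def data_cloning_estimate(successes, trials, clones, prior_a=1.0, prior_b=1.0):
    """Approximate MLE and its standard error from the cloned-data posterior."""
    alpha, beta = cloned_beta_posterior(successes, trials, clones, prior_a, prior_b)
    mean, var = beta_mean_var(alpha, beta)
    # Scaling the posterior variance by the number of clones recovers the
    # large-sample variance of the maximum likelihood estimate.
    return mean, (clones * var) ** 0.5
```

For example, `data_cloning_estimate(7, 20, clones=1000, prior_a=2.0, prior_b=5.0)` gives an estimate close to the sample proportion 7/20, and changing the prior parameters moves the result only slightly.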

3.
J. B. Wilson 《Oecologia》1987,73(4):579-582
Summary. Comparison of co-occurrences between species on a group of islands with those expected from a random-based null model could provide evidence on community structure. However, it is difficult to decide on the appropriate null model. Gilpin and Diamond proposed a model and a test for departure from it, but this test is shown to indicate significant structure even when applied to a matrix of random numbers. An alternative method is suggested, using the distribution of Gilpin and Diamond's deviation as test statistic, but determining the expected distribution by Monte Carlo simulation, and using many such simulations as a randomisation test of significance. The null model used accepts the observed totals of occurrences for islands and species; it therefore offers a somewhat conservative test. Applied to the Vanuatu bird data that Gilpin and Diamond used, significant departure from a null model is seen, but with an excess of extreme negative associations, the opposite result from that given by Gilpin and Diamond's method. It is not possible to tell whether the negative associations are due to autecology, biogeography, or to interactions between species.
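A randomisation test of this general kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's procedure: the presence/absence matrix has species as rows and islands as columns, the null matrices are generated by checkerboard swaps (one common way to keep every row and column total fixed), and the test statistic is a plain co-occurrence count for one species pair, used here as a simple stand-in for Gilpin and Diamond's deviation. All function names are hypothetical.

```python
import random


def swap_randomize(matrix, n_swaps, rng):
    """Shuffle a 0/1 matrix with checkerboard swaps, keeping all row and column totals."""
    m = [row[:] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    for _ in range(n_swaps):
        r1, r2 = rng.sample(range(n_rows), 2)
        c1, c2 = rng.sample(range(n_cols), 2)
        # Only a 2x2 submatrix with equal diagonals and unequal neighbours
        # ([[1, 0], [0, 1]] or [[0, 1], [1, 0]]) can be flipped without
        # changing any marginal total.
        if m[r1][c1] == m[r2][c2] and m[r1][c2] == m[r2][c1] and m[r1][c1] != m[r1][c2]:
            m[r1][c1], m[r1][c2] = m[r1][c2], m[r1][c1]
            m[r2][c1], m[r2][c2] = m[r2][c2], m[r2][c1]
    return m


def co_occurrences(matrix, i, j):
    """Number of islands (columns) on which species rows i and j are both present."""
    return sum(a and b for a, b in zip(matrix[i], matrix[j]))


def randomization_p_value(matrix, i, j, n_sim=999, n_swaps=1000, seed=0):
    """One-sided p-value for species i and j co-occurring as rarely as observed."""
    rng = random.Random(seed)
    observed = co_occurrences(matrix, i, j)
    as_low = sum(
        co_occurrences(swap_randomize(matrix, n_swaps, rng), i, j) <= observed
        for _ in range(n_sim)
    )
    # Counting the observed matrix itself keeps the p-value strictly positive.
    return (as_low + 1) / (n_sim + 1)
```

Because the null matrices keep the observed totals for both islands and species, only the arrangement of presences varies between simulations, which is what makes a test of this form conservative.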

4.
In this work, we propose a novel method for individualized treatment selection when the treatment response is multivariate. Our method covers any number of treatments and can be applied to a broad set of models. The proposed method uses a Mahalanobis-type distance measure to establish an ordering of treatments based on treatment performance measures. Our investigation in this work deals with means of responses conditional on lower dimensional composite scores based on covariates, where these scores are built using single index models to approximate mean responses against patient covariates. Smoothed estimates of such conditional means are combined to construct an estimate of the aforementioned distance measure, which is then used to estimate the optimal treatment. An empirical study demonstrates the performance of the proposed method in finite samples. We also present an analysis of data from an HIV clinical trial to show the applicability of the proposed procedure to real data.

5.
We present a method for estimating growth and mortality rates in size-structured population models. The method is based on least-squares fits to data using approximate models (using spline approximations) for the underlying partial differential equation population model. In a series of numerical tests, we compare our approach to an existing method (due to Hackney and Webb). As an example, we apply our techniques to experimental data from larval striped bass field studies. Research supported in part under grants at Brown University from the National Science Foundation: UINT-8521208, NSFDMS-8818530 (H.T.B., F.K. and C.W.); from the Air Force Office of Scientific Research: AFOSR F49620-86-C-0111 (H.T.B., C.W.); and at University of California, Davis from the Alfred P. Sloan Foundation (L.W.B.)

6.
An Improved Parameter Estimation Method for Hodgkin-Huxley Models   Cited by: 2 (self-citations: 0, citations by others: 2)
We consider whole-cell voltage-clamp data of isolated currents characterized by the Hodgkin-Huxley paradigm. We examine the errors associated with the typical parameter estimation method for these data and show them to be unsatisfactorily large, especially if the time constants of activation and inactivation are not sufficiently separated. The size of these errors is due to the fact that the steady-state and kinetic properties of the current are estimated disjointly. We present an improved parameter estimation method that utilizes all of the information in the voltage-clamp conductance data to estimate steady-state and kinetic properties simultaneously and illustrate its success compared to the standard method using simulated data and data from P. interruptus shal channels expressed in oocytes.

7.
Lu SE  Lin Y  Shih WC 《Biometrics》2004,60(1):257-267
This article considers clinical trials in which the efficacy measure is taken from several sites within each patient, such as the alveolar bone height of the tooth sites, or bone mineral densities of the lumbar spine sites. Since usually only a small portion of these sites will exhibit changes, the conventional method using the per-patient average gives a diluted result because of the excess of unchanged sites in the data. Different methods have been proposed for this type of data in the case where the observations are mutually independent. These include the popular "two-part model" (Lachenbruch, 2001, Statistics in Medicine 20, 1215-1234; 2002, Statistical Methods in Medical Research 11, 297-302), which is related to the "composite approach" for discrete and continuous data in Shih and Quan (1997, Statistics in Medicine 16, 1225-1239; 2001, Statistica Sinica 11, 53-62). In this article, we model the data with excessive zeros (no changes) in clustered data using a mixture of distributions, taking into account possible measurement errors. This mixture model includes the two-part model as a special case when one component of the mixture degenerates.

8.
We demonstrate that an allometric model for eelgrass leaf-growth rates can be derived from data on leaf architecture and growth form. Using this construct, we produced indirect assessments of growth rates of leaves that we call projections, which can be easily obtained in terms of allometric parameters and proxy values for leaf area, expressed as the product of leaf length and width. These projections of leaf-growth rates displayed a high level of correspondence with values observed in our data, as well as with other sets of reference data. A comparison with growth rates obtained by using the plastochrone index method showed that our model provides more accurate estimations while using a simpler methodology. Our results also show that whenever allometric parameters for the scaling of eelgrass leaf dry weight in terms of leaf area are available, the proposed model provides an accurate, cost-effective and non-destructive alternative to assessments based on traditional or plastochrone methods.

9.
Qin LX  Self SG 《Biometrics》2006,62(2):526-533
Identification of differentially expressed genes and clustering of genes are two important and complementary objectives addressed with gene expression data. For the differential expression question, many "per-gene" analytic methods have been proposed. These methods can generally be characterized as using a regression function to independently model the observations for each gene; various adjustments for multiplicity are then used to interpret the statistical significance of these per-gene regression models over the collection of genes analyzed. Motivated by this common structure of per-gene models, we proposed a new model-based clustering method--the clustering of regression models method, which groups genes that share a similar relationship to the covariate(s). This method provides a unified approach for a family of clustering procedures and can be applied for data collected with various experimental designs. In addition, when combined with per-gene methods for assessing differential expression that employ the same regression modeling structure, an integrated framework for the analysis of microarray data is obtained. The proposed methodology was applied to two microarray data sets, one from a breast cancer study and the other from a yeast cell cycle study.

10.
Chatterjee N  Shih J 《Biometrics》2001,57(3):779-786
For modeling correlation in familial diseases with variable ages at onset, we propose a bivariate model that incorporates two types of pairwise association, one between the lifetime risk or the overall susceptibility of two individuals and one between the ages at onset between two susceptible individuals. For estimation, we consider a two-stage estimation procedure similar to that of Shih (1998, Biometrics 54, 1115-1128). We evaluate the properties of the estimators through simulations and compare the performance with that from a bivariate survival model that allows correlation between ages at onset only. We apply the methodology to breast cancer using the kinship data from the Washington Ashkenazi Study. We also discuss potential applications of the proposed method in the area of cure modeling.

11.
Time-series data resulting from surveying wild animals are often described using state-space population dynamics models, in particular with Gompertz, Beverton-Holt, or Moran-Ricker latent processes. We show how hidden Markov model methodology provides a flexible framework for fitting a wide range of models to such data. This general approach makes it possible to model abundance on the natural or log scale, include multiple observations at each sampling occasion and compare alternative models using information criteria. It also easily accommodates unequal sampling time intervals, should that possibility occur, and allows testing for density dependence using the bootstrap. The paper is illustrated by replicated time series of red kangaroo abundances, and a univariate time series of ibex counts which are an order of magnitude larger. In the analyses carried out, we fit different latent process and observation models using the hidden Markov framework. Results are robust with regard to the necessary discretization of the state variable. We find no effective difference between the three latent models of the paper in terms of maximized likelihood value for the two applications presented, and also others analyzed. Simulations suggest that ecological time series are not sufficiently informative to distinguish between alternative latent processes for modeling population survey data when data do not indicate strong density dependence.

12.
Array-based comparative genomic hybridization (array-CGH) is a high throughput, high resolution technique for studying the genetics of cancer. Analysis of array-CGH data typically involves estimation of the underlying chromosome copy numbers from the log fluorescence ratios and segmenting the chromosome into regions with the same copy number at each location. For the analysis of array-CGH data, we propose a new stochastic segmentation model and an associated estimation procedure that has attractive statistical and computational properties. An important benefit of this Bayesian segmentation model is that it yields explicit formulas for posterior means, which can be used to estimate the signal directly without performing segmentation. Other quantities relating to the posterior distribution that are useful for providing confidence assessments of any given segmentation can also be estimated by using our method. We propose an approximation method whose computation time is linear in sequence length, which makes our method practically applicable to the new higher density arrays. Simulation studies and applications to real array-CGH data illustrate the advantages of the proposed approach.

13.
Marginal models for longitudinal continuous proportional data   Cited by: 5 (self-citations: 0, citations by others: 5)
Song PX  Tan M 《Biometrics》2000,56(2):496-502
Summary. Continuous proportional data arise when the response of interest is a percentage between zero and one, e.g., the percentage of decrease in renal function at different follow‐up times from the baseline. In this paper, we propose methods to directly model the marginal means of the longitudinal proportional responses using the simplex distribution of Barndorff‐Nielsen and Jørgensen that takes into account the fact that such responses are percentages restricted between zero and one and may as well have large dispersion. Parameters in such a marginal model are estimated using an extended version of the generalized estimating equations where the score vector is a nonlinear function of the observed response. The method is illustrated with an ophthalmology study on the use of intraocular gas in retinal repair surgeries.

14.
Ye W  Lin X  Taylor JM 《Biometrics》2008,64(4):1238-1246
SUMMARY: In this article we investigate regression calibration methods to jointly model longitudinal and survival data using a semiparametric longitudinal model and a proportional hazards model. In the longitudinal model, a biomarker is assumed to follow a semiparametric mixed model where covariate effects are modeled parametrically and subject-specific time profiles are modeled nonparametrically using a population smoothing spline and subject-specific random stochastic processes. The Cox model is assumed for survival data by including both the current measure and the rate of change of the underlying longitudinal trajectories as covariates, as motivated by a prostate cancer study application. We develop a two-stage semiparametric regression calibration (RC) method. Two variations of the RC method are considered, risk set regression calibration and a computationally simpler ordinary regression calibration. Simulation results show that the two-stage RC approach performs well in practice and effectively corrects the bias from the naive method. We apply the proposed methods to the analysis of a dataset for evaluating the effects of the longitudinal biomarker PSA on the recurrence of prostate cancer.

15.
Several analyses of the geographic variation of mortality rates in space have been proposed in the literature. Poisson models allowing the incorporation of random effects to model extra‐variability are widely used. The typical modelling approach uses normal random effects to accommodate local spatial autocorrelation. When spatial autocorrelation is absent but overdispersion persists, a discrete mixture model is an alternative approach. However, a technique for identifying regions which have significantly high or low risk in any given area has not yet been developed for the discrete mixture model. Given the importance of this information to epidemiologists in formulating hypotheses about the potential risk factors affecting the population, different procedures for obtaining confidence intervals for relative risks are derived in this paper. These methods are the standard information‐based method and four others, all based on bootstrap techniques, namely the asymptotic‐bootstrap, the percentile‐bootstrap, the BC‐bootstrap and the modified information‐based method. All of them are compared empirically by their application to mortality data due to cardiovascular diseases in women from Navarra, Spain, during the period 1988–1994. In the small area example considered here, we find that the information‐based method is sensible for estimating standard errors of the component means in the discrete mixture model but is not appropriate for providing standard errors of the estimated relative risks and hence for constructing confidence intervals for the relative risk associated with each region. Therefore, the bootstrap‐based methods are recommended for this purpose. More specifically, the BC method seems to provide better coverage probabilities in the case studied, according to a small scale simulation study that has been carried out using a scenario as encountered in the analysis of the real data.
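Of the interval methods named above, the percentile bootstrap is the simplest to show in code. The sketch below is only a generic illustration: it resamples a plain list of values with replacement, whereas the paper's bootstrap is built around the fitted discrete mixture model. The function name, defaults and the data in the usage note are hypothetical.

```python
import random
import statistics


def percentile_bootstrap_ci(data, stat, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for `stat` computed on resamples of `data`."""
    rng = random.Random(seed)
    n = len(data)
    # Recompute the statistic on n_boot resamples drawn with replacement.
    reps = sorted(stat([rng.choice(data) for _ in range(n)]) for _ in range(n_boot))
    tail = (1.0 - level) / 2.0
    lower = reps[int(tail * (n_boot - 1))]
    upper = reps[int(round((1.0 - tail) * (n_boot - 1)))]
    return lower, upper
```

A call such as `percentile_bootstrap_ci([0.8, 1.1, 0.9, 1.4, 1.0, 0.7, 1.3, 1.2], statistics.mean)` returns the empirical 2.5% and 97.5% quantiles of the resampled means. The BC variant favoured in the abstract additionally corrects these quantile levels for bias, which is not attempted here.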

16.
In this paper, our aim is to analyze geographical and temporal variability of disease incidence when spatio‐temporal count data have excess zeros. To that end, we consider random effects in zero‐inflated Poisson models to investigate geographical and temporal patterns of disease incidence. Spatio‐temporal models that employ conditionally autoregressive smoothing across the spatial dimension and B‐spline smoothing over the temporal dimension are proposed. The analysis of these complex models is computationally difficult from the frequentist perspective. On the other hand, the advent of the Markov chain Monte Carlo algorithm has made the Bayesian analysis of complex models computationally convenient. The recently developed data cloning method provides a frequentist approach to mixed models that is also computationally convenient. We propose to use data cloning, which yields maximum likelihood estimates, to conduct frequentist analysis of zero‐inflated spatio‐temporal modeling of disease incidence. One of the advantages of the data cloning approach is that the predictions and corresponding standard errors (or prediction intervals) of smoothed disease incidence over space and time are easily obtained. We illustrate our approach using a real dataset of monthly children's asthma visits to hospital in the province of Manitoba, Canada, during the period April 2006 to March 2010. Performance of our approach is also evaluated through a simulation study.
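The zero‐inflated Poisson distribution at the core of such models mixes a point mass at zero with an ordinary Poisson count. The sketch below shows only that basic probability function and a log-likelihood for independent counts; it leaves out the random effects, spatial smoothing and B‐splines described in the abstract. The names `zip_pmf`, `zip_log_likelihood`, `lam` and `pi_zero` are hypothetical.

```python
import math


def zip_pmf(k, lam, pi_zero):
    """P(Y = k) for a zero-inflated Poisson with rate lam and extra-zero probability pi_zero."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero can come from the structural-zero component or from the Poisson part.
        return pi_zero + (1.0 - pi_zero) * poisson
    return (1.0 - pi_zero) * poisson


def zip_log_likelihood(counts, lam, pi_zero):
    """Log-likelihood of independent counts under the zero-inflated Poisson model."""
    return sum(math.log(zip_pmf(k, lam, pi_zero)) for k in counts)
```

Setting `pi_zero` to 0 recovers the ordinary Poisson model, so the two can be compared on the same counts.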

17.
In clinical research and practice, landmark models are commonly used to predict the risk of an adverse future event, using patients' longitudinal biomarker data as predictors. However, these data are often observable only at intermittent visits, making their measurement times irregularly spaced and unsynchronized across different subjects. This poses challenges to conducting dynamic prediction at any post-baseline time. A simple solution is the last-value-carry-forward method, but this may result in bias for the risk model estimation and prediction. Another option is to jointly model the longitudinal and survival processes with a shared random effects model. However, when dealing with multiple biomarkers, this approach often results in high-dimensional integrals without a closed-form solution, and thus the computational burden limits its software development and practical use. In this article, we propose to process the longitudinal data by functional principal component analysis techniques, and then use the processed information as predictors in a class of flexible linear transformation models to predict the distribution of residual time-to-event occurrence. The measurement schemes for multiple biomarkers are allowed to be different within subject and across subjects. Dynamic prediction can be performed in a real-time fashion. The advantages of our proposed method are demonstrated by simulation studies. We apply our approach to the African American Study of Kidney Disease and Hypertension, predicting patients' risk of kidney failure or death by using four important longitudinal biomarkers for renal functions.

18.
He Z  Sun D 《Biometrics》2000,56(2):360-367
A Bayesian hierarchical generalized linear model is used to estimate hunting success rates at the subarea level for postseason harvest surveys. The model includes fixed week effects, random geographic effects, and spatial correlations between neighboring subareas. The computation is done by Gibbs sampling and adaptive rejection sampling techniques. The method is illustrated using data from the Missouri Turkey Hunting Survey in the spring of 1996. Bayesian model selection methods are used to demonstrate that there are significant week differences and spatial correlations of hunting success rates among counties. The Bayesian estimates are also shown to be quite robust in terms of changes of hyperparameters.

19.
As the practice of using population models for wildlife risk assessment has become more common, so has the practice of using surrogate data, typically taken from the published scientific literature, as inputs for demographic models. This practice clearly exposes the user to inferential errors. However, it is likely to continue because demographic data are expensive to gather. We review potential errors associated with the use of previously published demographic data and how those errors propagate into the endpoints of demographic projection models. We suggest methods for inferring bias in model endpoints when multiple and opposing biases are present in the demographic input data. We provide an example using Eastern Meadowlarks (Sturnella magna), a common songbird in Midwestern grasslands and agro-ecosystems. We conclude with a brief review of methods that could improve inference made using published demographic data, including methods from life-history theory, meta-analysis, and Bayesian statistics.

20.
In recent years, the study of species' occurrence has benefited from the increased availability of large-scale citizen-science data. While abundance data from standardized monitoring schemes are biased toward well-studied taxa and locations, opportunistic data are available for many taxonomic groups, from a large number of locations and across long timescales. Hence, these data provide opportunities to measure species' changes in occurrence, particularly through the use of occupancy models, which account for imperfect detection. These opportunistic datasets can be substantially large, numbering hundreds of thousands of sites, and hence present a challenge from a computational perspective, especially within a Bayesian framework. In this paper, we develop a unifying framework for Bayesian inference in occupancy models that account for both spatial and temporal autocorrelation. We make use of the Pólya-Gamma scheme, which allows for fast inference, and incorporate spatio-temporal random effects using Gaussian processes (GPs), for which we consider two efficient approximations: subset of regressors and nearest neighbor GPs. We apply our model to data on two UK butterfly species, one common and widespread and one rare, using records from the Butterflies for the New Millennium database, producing occupancy indices spanning 45 years. Our framework can be applied to a wide range of taxa, providing measures of variation in species' occurrence, which are used to assess biodiversity change.
