共查询到20条相似文献,搜索用时 0 毫秒
1.
Monica Pirani Alexina J. Mason Anna L. Hansell Sylvia Richardson Marta Blangiardo 《Biometrical journal. Biometrische Zeitschrift》2020,62(7):1650-1669
Study designs where data have been aggregated by geographical areas are popular in environmental epidemiology. These studies are commonly based on administrative databases and, providing a complete spatial coverage, are particularly appealing to make inference on the entire population. However, the resulting estimates are often biased and difficult to interpret due to unmeasured confounders, which typically are not available from routinely collected data. We propose a framework to improve inference drawn from such studies exploiting information derived from individual-level survey data. The latter are summarized in an area-level scalar score by mimicking at ecological level the well-known propensity score methodology. The literature on propensity score for confounding adjustment is mainly based on individual-level studies and assumes a binary exposure variable. Here, we generalize its use to cope with area-referenced studies characterized by a continuous exposure. Our approach is based upon Bayesian hierarchical structures specified into a two-stage design: (i) geolocated individual-level data from survey samples are up-scaled at ecological level, then the latter are used to estimate a generalized ecological propensity score (EPS) in the in-sample areas; (ii) the generalized EPS is imputed in the out-of-sample areas under different assumptions about the missingness mechanisms, then it is included into the ecological regression, linking the exposure of interest to the health outcome. This delivers area-level risk estimates, which allow a fuller adjustment for confounding than traditional areal studies. The methodology is illustrated by using simulations and a case study investigating the risk of lung cancer mortality associated with nitrogen dioxide in England (UK). 相似文献
2.
Owing to its robustness properties, marginal interpretations, and ease of implementation, the pseudo-partial likelihood method proposed in the seminal papers of Pepe and Cai and Lin et al. has become the default approach for analyzing recurrent event data with Cox-type proportional rate models. However, the construction of the pseudo-partial score function ignores the dependency among recurrent events and thus can be inefficient. An attempt to investigate the asymptotic efficiency of weighted pseudo-partial likelihood estimation found that the optimal weight function involves the unknown variance–covariance process of the recurrent event process and may not have closed-form expression. Thus, instead of deriving the optimal weights, we propose to combine a system of pre-specified weighted pseudo-partial score equations via the generalized method of moments and empirical likelihood estimation. We show that a substantial efficiency gain can be easily achieved without imposing additional model assumptions. More importantly, the proposed estimation procedures can be implemented with existing software. Theoretical and numerical analyses show that the empirical likelihood estimator is more appealing than the generalized method of moments estimator when the sample size is sufficiently large. An analysis of readmission risk in colorectal cancer patients is presented to illustrate the proposed methodology. 相似文献
3.
There has been a rising interest in better exploiting auxiliary summary information from large databases in the analysis of smaller-scale studies that collect more comprehensive patient-level information. The purpose of this paper is twofold: first, we propose a novel approach to synthesize information from both the aggregate summary statistics and the individual-level data in censored linear regression. We show that the auxiliary information amounts to a system of nonsmooth estimating equations and thus can be combined with the conventional weighted log-rank estimating equations by using the generalized method of moments (GMM) approach. The proposed methodology can be further extended to account for the potential inconsistency in information from different sources. Second, in the absence of auxiliary information, we propose to improve estimation efficiency by combining the overidentified weighted log-rank estimating equations with different weight functions via the GMM framework. To deal with the nonsmooth GMM-type objective functions, we develop an asymptotics-guided algorithm for parameter and variance estimation. We establish the asymptotic normality of the proposed GMM-type estimators. Simulation studies show that the proposed estimators can yield substantial efficiency gain over the conventional weighted log-rank estimators. The proposed methods are applied to a pancreatic cancer study for illustration. 相似文献
4.
5.
Stationary points embedded in the derivatives are often critical for a model to be interpretable and may be considered as key features of interest in many applications. We propose a semiparametric Bayesian model to efficiently infer the locations of stationary points of a nonparametric function, which also produces an estimate of the function. We use Gaussian processes as a flexible prior for the underlying function and impose derivative constraints to control the function's shape via conditioning. We develop an inferential strategy that intentionally restricts estimation to the case of at least one stationary point, bypassing possible mis-specifications in the number of stationary points and avoiding the varying dimension problem that often brings in computational complexity. We illustrate the proposed methods using simulations and then apply the method to the estimation of event-related potentials derived from electroencephalography (EEG) signals. We show how the proposed method automatically identifies characteristic components and their latencies at the individual level, which avoids the excessive averaging across subjects that is routinely done in the field to obtain smooth curves. By applying this approach to EEG data collected from younger and older adults during a speech perception task, we are able to demonstrate how the time course of speech perception processes changes with age. 相似文献
6.
R. P. FRECKLETON 《Journal of evolutionary biology》2009,22(7):1367-1375
Phylogenetic comparative methods are extremely commonly used in evolutionary biology. In this paper, I highlight some of the problems that are frequently encountered in comparative analyses and review how they can be fixed. In broad terms, the problems boil down to a lack of appreciation of the underlying assumptions of comparative methods, as well as problems with implementing methods in a manner akin to more familiar statistical approaches. I highlight that the advent of more flexible computing environments should improve matters and allow researchers greater scope to explore methods and data. 相似文献
7.
Roberto Benedetti Thomas Suesse Federica Piersimoni 《Biometrical journal. Biometrische Zeitschrift》2020,62(6):1494-1507
Maximum likelihood estimation of the model parameters for a spatial population based on data collected from a survey sample is usually straightforward when sampling and non-response are both non-informative, since the model can then usually be fitted using the available sample data, and no allowance is necessary for the fact that only a part of the population has been observed. Although for many regression models this naive strategy yields consistent estimates, this is not the case for some models, such as spatial auto-regressive models. In this paper, we show that for a broad class of such models, a maximum marginal likelihood approach that uses both sample and population data leads to more efficient estimates since it uses spatial information from sampled as well as non-sampled units. Extensive simulation experiments based on two well-known data sets are used to assess the impact of the spatial sampling design, the auto-correlation parameter and the sample size on the performance of this approach. When compared to some widely used methods that use only sample data, the results from these experiments show that the maximum marginal likelihood approach is much more precise. 相似文献
8.
9.
10.
A. R. Sen G. E. J. Smith G. Butler 《Biometrical journal. Biometrische Zeitschrift》1978,20(4):363-369
This paper deals with one of the basic assumptions made in estimating population density by the line transect sampling method. Theory has been developed in this paper for estimating, density using a gamma distribution. The results have been applied to data on ruffed grouse. 相似文献
11.
Butler RJ Brusatte SL Andres B Benson RB 《Evolution; international journal of organic evolution》2012,66(1):147-162
A fundamental contribution of paleobiology to macroevolutionary theory has been the illumination of deep time patterns of diversification. However, recent work has suggested that taxonomic diversity counts taken from the fossil record may be strongly biased by uneven spatiotemporal sampling. Although morphological diversity (disparity) is also frequently used to examine evolutionary radiations, no empirical work has yet addressed how disparity might be affected by uneven fossil record sampling. Here, we use pterosaurs (Mesozoic flying reptiles) as an exemplar group to address this problem. We calculate multiple disparity metrics based upon a comprehensive anatomical dataset including a novel phylogenetic correction for missing data, statistically compare these metrics to four geological sampling proxies, and use multiple regression modeling to assess the importance of uneven sampling and exceptional fossil deposits (Lagerstätten). We find that range‐based disparity metrics are strongly affected by uneven fossil record sampling, and should therefore be interpreted cautiously. The robustness of variance‐based metrics to sample size and geological sampling suggests that they can be more confidently interpreted as reflecting true biological signals. In addition, our results highlight the problem of high levels of missing data for disparity analyses, indicating a pressing need for more theoretical and empirical work. 相似文献
12.
13.
Multistate Markov models are frequently used to characterize disease processes, but their estimation from longitudinal data is often hampered by complex patterns of incompleteness. Two algorithms for estimating Markov chain models in the case of intermittent missing data in longitudinal studies, a stochastic EM algorithm and the Gibbs sampler, are described. The first can be viewed as a random perturbation of the EM algorithm and is appropriate when the M step is straightforward but the E step is computationally burdensome. It leads to a good approximation of the maximum likelihood estimates. The Gibbs sampler is used for a full Bayesian inference. The performances of the two algorithms are illustrated on two simulated data sets. A motivating example concerned with the modelling of the evolution of parasitemia by Plasmodium falciparum (malaria) in a cohort of 105 young children in Cameroon is described and briefly analyzed. 相似文献
14.
Simon N. Vandekar Theodore D. Satterthwaite Cedric H. Xia Azeez Adebimpe Kosha Ruparel Ruben C. Gur Raquel E. Gur Russell T. Shinohara 《Biometrics》2019,75(4):1145-1155
Spatial extent inference (SEI) is widely used across neuroimaging modalities to adjust for multiple comparisons when studying brain‐phenotype associations that inform our understanding of disease. Recent studies have shown that Gaussian random field (GRF)‐based tools can have inflated family‐wise error rates (FWERs). This has led to substantial controversy as to which processing choices are necessary to control the FWER using GRF‐based SEI. The failure of GRF‐based methods is due to unrealistic assumptions about the spatial covariance function of the imaging data. A permutation procedure is the most robust SEI tool because it estimates the spatial covariance function from the imaging data. However, the permutation procedure can fail because its assumption of exchangeability is violated in many imaging modalities. Here, we propose the (semi‐) parametric bootstrap joint (PBJ; sPBJ) testing procedures that are designed for SEI of multilevel imaging data. The sPBJ procedure uses a robust estimate of the spatial covariance function, which yields consistent estimates of standard errors, even if the covariance model is misspecified. We use the methods to study the association between performance and executive functioning in a working memory functional magnetic resonance imaging study. The sPBJ has similar or greater power to the PBJ and permutation procedures while maintaining the nominal type 1 error rate in reasonable sample sizes. We provide an R package to perform inference using the PBJ and sPBJ procedures. 相似文献
15.
Bryan E. Shepherd Kyunghee Han Tong Chen Aihua Bian Shannon Pugh Stephany N. Duda Thomas Lumley William J. Heerman Pamela A. Shaw 《Biometrics》2023,79(3):2649-2663
Electronic health record (EHR) data are increasingly used for biomedical research, but these data have recognized data quality challenges. Data validation is necessary to use EHR data with confidence, but limited resources typically make complete data validation impossible. Using EHR data, we illustrate prospective, multiwave, two-phase validation sampling to estimate the association between maternal weight gain during pregnancy and the risks of her child developing obesity or asthma. The optimal validation sampling design depends on the unknown efficient influence functions of regression coefficients of interest. In the first wave of our multiwave validation design, we estimate the influence function using the unvalidated (phase 1) data to determine our validation sample; then in subsequent waves, we re-estimate the influence function using validated (phase 2) data and update our sampling. For efficiency, estimation combines obesity and asthma sampling frames while calibrating sampling weights using generalized raking. We validated 996 of 10,335 mother-child EHR dyads in six sampling waves. Estimated associations between childhood obesity/asthma and maternal weight gain, as well as other covariates, are compared to naïve estimates that only use unvalidated data. In some cases, estimates markedly differ, underscoring the importance of efficient validation sampling to obtain accurate estimates incorporating validated data. 相似文献
16.
In some large clinical studies, it may be impractical to perform the physical examination to every subject at his/her last monitoring time in order to diagnose the occurrence of the event of interest. This gives rise to survival data with missing censoring indicators where the probability of missing may depend on time of last monitoring and some covariates. We present a fully Bayesian semi‐parametric method for such survival data to estimate regression parameters of the proportional hazards model of Cox. Theoretical investigation and simulation studies show that our method performs better than competing methods. We apply the proposed method to analyze the survival data with missing censoring indicators from the Orofacial Pain: Prospective Evaluation and Risk Assessment study. 相似文献
17.
18.
19.
Large sample theory of semiparametric models based on maximum likelihood estimation (MLE) with shape constraint on the nonparametric component is well studied. Relatively less attention has been paid to the computational aspect of semiparametric MLE. The computation of semiparametric MLE based on existing approaches such as the expectation‐maximization (EM) algorithm can be computationally prohibitive when the missing rate is high. In this paper, we propose a computational framework for semiparametric MLE based on an inexact block coordinate ascent (BCA) algorithm. We show theoretically that the proposed algorithm converges. This computational framework can be applied to a wide range of data with different structures, such as panel count data, interval‐censored data, and degradation data, among others. Simulation studies demonstrate favorable performance compared with existing algorithms in terms of accuracy and speed. Two data sets are used to illustrate the proposed computational method. We further implement the proposed computational method in R package BCA1SG , available at CRAN. 相似文献
20.