Similar articles
1.
Cook AJ  Gold DR  Li Y 《Biometrics》2007,63(2):540-549
While numerous methods have been proposed for spatial cluster detection, in particular for discrete outcome data (e.g., disease incidence), few have been available for continuous data that are subject to censoring. This article provides an extension of the spatial scan statistic (Kulldorff, 1997, Communications in Statistics 26, 1481-1496) for censored outcome data and further proposes a simple spatial cluster detection method that uses cumulative martingale residuals within the framework of the Cox proportional hazards model. Simulations have indicated good performance of the proposed methods, with the practical applicability illustrated by an ongoing epidemiology study which investigates the relationship of environmental exposures to asthma, allergic rhinitis/hayfever, and eczema.

2.
MOTIVATION: An important goal of microarray studies is to discover genes that are associated with clinical outcomes, such as disease status and patient survival. While a typical experiment surveys gene expressions on a global scale, there may be only a small number of genes that have significant influence on a clinical outcome. Moreover, expression data have cluster structures and the genes within a cluster have correlated expressions and coordinated functions, but the effects of individual genes in the same cluster may be different. Accordingly, we seek to build statistical models with the following properties. First, the model is sparse in the sense that only a subset of the parameter vector is non-zero. Second, the cluster structures of gene expressions are properly accounted for. RESULTS: For gene expression data without pathway information, we divide genes into clusters using commonly used methods, such as K-means or hierarchical approaches. The optimal number of clusters is determined using the Gap statistic. We propose a clustering threshold gradient descent regularization (CTGDR) method, for simultaneous cluster selection and within cluster gene selection. We apply this method to binary classification and censored survival analysis. Compared to the standard TGDR and other regularization methods, the CTGDR takes into account the cluster structure and carries out feature selection at both the cluster level and within-cluster gene level. We demonstrate the CTGDR on two studies of cancer classification and two studies correlating survival of lymphoma patients with microarray expressions. AVAILABILITY: R code is available upon request. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

3.
Controlling for imperfect detection is important for developing species distribution models (SDMs). Occupancy‐detection models based on the time needed to detect a species can be used to address this problem, but this is hindered when times to detection are not known precisely. Here, we extend the time‐to‐detection model to deal with detections recorded in time intervals and illustrate the method using a case study on stream fish distribution modeling. We collected electrofishing samples of six fish species across a Mediterranean watershed in Northeast Portugal. Based on a Bayesian hierarchical framework, we modeled the probability of water presence in stream channels, and the probability of species occupancy conditional on water presence, in relation to environmental and spatial variables. We also modeled time‐to‐first detection conditional on occupancy in relation to local factors, using modified interval‐censored exponential survival models. Posterior distributions of occupancy probabilities derived from the models were used to produce species distribution maps. Simulations indicated that the modified time‐to‐detection model provided unbiased parameter estimates despite interval‐censoring. There was a tendency for spatial variation in detection rates to be primarily influenced by depth and, to a lesser extent, stream width. Species occupancies were consistently affected by stream order, elevation, and annual precipitation. Bayesian P‐values and AUCs indicated that all models had adequate fit and high discrimination ability, respectively. Mapping of predicted occupancy probabilities showed widespread distribution by most species, but uncertainty was generally higher in tributaries and upper reaches. The interval‐censored time‐to‐detection model provides a practical solution to model occupancy‐detection when detections are recorded in time intervals. 
This modeling framework is useful for developing SDMs while controlling for variation in detection rates, as it uses simple data that can be readily collected by field ecologists.

4.
A method for fitting piecewise exponential regression models to censored survival data is described. Stratification is performed recursively, using a combination of statistical tests and residual analysis. The splitting criterion employed in cross-validation is the average squared error of the residuals. The bootstrap is employed to keep the probability of a type I error (the error of discovering two or more strata when there is only one) of the method close to a predetermined value. The proposed method can thus also serve as a formal goodness-of-fit test for the exponential regression model. Real and simulated data are used for illustration.

5.
Summary We address the problem of establishing a survival schedule for wild populations. A demographic key identity is established, leading to a method whereby age-specific survival and mortality can be deduced from a marked cohort life table established for individuals that are randomly sampled at unknown age and marked, with subsequent recording of time-to-death. This identity permits the construction of life tables from data where the birth date of subjects is unknown. An analogous key identity is established for the continuous case in which the survival schedule of the wild population is related to the density of the survival distribution in the marked cohort. These identities are explored for both life tables and continuous lifetime data. For the continuous case, they are implemented with statistical methods using non-parametric density estimation methods to obtain flexible estimates for the unknown survival distribution of the wild population. The analytical model provided here serves as a starting point to develop more complex models for residual demography, i.e. models for estimating survival of wild populations in which age-at-entry is unknown and using remaining information in randomly encountered individuals. This is a first step towards a broad new concept of 'expressed demographic information content of marked or captured individuals'.

6.
Accurate detection and classification of predation events is important to determine predation and consumption rates by predators. However, obtaining this information for large predators is constrained by the speed at which carcasses disappear and the cost of field data collection. To accurately detect predation events, researchers have used GPS collar technology combined with targeted site visits. However, kill sites are often investigated well after the predation event due to limited data retrieval options on GPS collars (VHF or UHF downloading) and to ensure crew safety when working with large predators. This can lead to missing information from small‐prey (including young ungulates) kill sites due to scavenging and general site deterioration (e.g., vegetation growth). We used a space–time permutation scan statistic (STPSS) clustering method (SaTScan) to detect predation events of grizzly bears (Ursus arctos) fitted with satellite transmitting GPS collars. We used generalized linear mixed models to verify predation events and the size of carcasses using spatiotemporal characteristics as predictors. STPSS uses a probability model to compare expected cluster size (space and time) with the observed size. We applied this method retrospectively to data from 2006 to 2007 to compare our method to random GPS site selection. In 2013–2014, we applied our detection method to visit sites one week after their occupation. Both datasets were collected in the same study area. Our approach detected 23 of 27 predation sites verified by visiting 464 random grizzly bear locations in 2006–2007, 187 of which were within space–time clusters and 277 outside. Predation site detection increased by 2.75 times (54 predation events of 335 visited clusters) using 2013–2014 data. Our GLMMs showed that cluster size and duration predicted predation events and carcass size with high sensitivity (0.72 and 0.94, respectively). 
Coupling GPS satellite technology with clusters using a program based on space–time probability models allows for prompt visits to predation sites. This enables accurate identification of the carcass size and increases fieldwork efficiency in predation studies.

7.
Abstract: The use of bird counts as indices has come under increasing scrutiny because assumptions concerning detection probabilities may not be met, but there also seems to be some resistance to use of model-based approaches to estimating abundance. We used data from the United States Forest Service, Southern Region bird monitoring program to compare several common approaches for estimating annual abundance or indices and population trends from point-count data. We compared indices of abundance estimated as annual means of counts and from a mixed-Poisson model to abundance estimates from a count-removal model with 3 time intervals and a distance model with 3 distance bands. We compared trend estimates calculated from an autoregressive, exponential model fit to annual abundance estimates from the above methods and also by estimating trend directly by treating year as a continuous covariate in the mixed-Poisson model. We produced estimates for 6 forest songbirds based on an average of 621 and 459 points in 2 physiographic areas from 1997 to 2004. There was strong evidence that detection probabilities varied among species and years. Nevertheless, there was good overall agreement across trend estimates from the 5 methods for 9 of 12 comparisons. In 3 of 12 comparisons, however, patterns in detection probabilities potentially confounded interpretation of uncorrected counts. Estimates of detection probabilities differed greatly between removal and distance models, likely because the methods estimated different components of detection probability and the data collection was not optimally designed for either method. 
Given that detection probabilities often vary among species, years, and observers, investigators should address detection probability in their surveys, whether it be by estimation of probability of detection and abundance, estimation of effects of key covariates when modeling count as an index of abundance, or through design-based methods to standardize these effects.

8.
A spatial scan statistic for multiple clusters
Spatial scan statistics are commonly used for geographical disease surveillance and cluster detection. When multiple clusters coexist in the study area, they become difficult to detect because of the clusters’ shadowing effect on each other. The recently proposed sequential method showed better power for detecting the second, weaker cluster, but did not improve the ability to detect the first, stronger cluster, which is more important than the second one. We propose a new extension of the spatial scan statistic which can be used to detect multiple clusters. By constructing two or more clusters in the alternative hypothesis, our proposed method accounts for other coexisting clusters in the detection and evaluation process. The performance of the proposed method is compared to the sequential method through an intensive simulation study, in which our proposed method shows better power in terms of both rejecting the null hypothesis and accurately detecting the coexisting clusters. In the real study of hand-foot-mouth disease data in Pingdu city, a true cluster town is successfully detected by our proposed method; the standard method cannot find this town statistically significant because of another cluster’s shadowing effect.

9.
D P Byar  N Mantel 《Biometrics》1975,31(4):943-947
Interrelationships among three response-time models which incorporate covariate information are explored. The most general of these models is the logistic-exponential in which the log odds of the probability of responding in a fixed interval is assumed to be a linear function of the covariates; this model includes a parameter W for the width of discrete time intervals in which responses occur. As W tends to 0 this model is equivalent to a continuous time exponential model in which the log hazard is linear in the covariates. As W tends to infinity it is equivalent to a continuous time exponential model in which the hazard itself is a linear function of the covariates. This second model was fitted to the data used in an earlier publication describing the logistic exponential model, and very close agreement of the estimates of the regression coefficients is demonstrated.
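The two limits can be checked with a short calculation. This is only a sketch, under the assumption that responses within an interval of width $W$ follow an exponential distribution with hazard $\lambda(x)$; the symbols $\lambda$, $p_W$, and $x$ are introduced here for illustration and are not taken from the paper.

```latex
p_W(x) = 1 - e^{-\lambda(x) W}, \qquad
\log\frac{p_W(x)}{1 - p_W(x)} = \log\!\left(e^{\lambda(x) W} - 1\right)

W \to 0: \quad \log\!\left(e^{\lambda(x) W} - 1\right) \approx \log \lambda(x) + \log W

W \to \infty: \quad \log\!\left(e^{\lambda(x) W} - 1\right) \approx \lambda(x)\, W
```

Under this assumption, a log odds that is linear in the covariates corresponds to a linear log hazard for small $W$ and to a linear hazard for large $W$.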

10.
Summary As a major analytical method for outbreak detection, Kulldorff's space–time scan statistic (2001, Journal of the Royal Statistical Society, Series A 164, 61–72) has been implemented in many syndromic surveillance systems. Since, however, it is based on circular windows in space, it has difficulty correctly detecting actual noncircular clusters. Takahashi et al. (2008, International Journal of Health Geographics 7, 14) proposed a flexible space–time scan statistic with the capability of detecting noncircular areas. It seems to us, however, that the detection of the most likely cluster defined in these space–time scan statistics is not the same as the detection of localized emerging disease outbreaks because the former compares the observed number of cases with the conditional expected number of cases. In this article, we propose a new space–time scan statistic which compares the observed number of cases with the unconditional expected number of cases, takes a time‐to‐time variation of Poisson mean into account, and implements an outbreak model to capture localized emerging disease outbreaks in a more timely and correct manner. The proposed models are illustrated with data from weekly surveillance of the number of absentees in primary schools in Kitakyushu‐shi, Japan, 2006.

11.
  1. Obtaining robust survival estimates is critical, but sample size limitations often result in imprecise estimates or the failure to obtain estimates for population subgroups. Concurrently, data are often recorded on incidental reencounters of marked individuals, but these incidental data are often unused in survival analyses.
  2. We evaluated the utility of supplementing a traditional survival dataset with incidental data on marked individuals that were collected ad hoc. We used a continuous time‐to‐event exponential survival model to leverage the matching information contained in both datasets and assessed differences in survival among adult and juvenile and resident and translocated Mojave desert tortoises (Gopherus agassizii).
  3. Incorporation of the incidental mark‐encounter data improved precision of all annual survival point estimates, with a 3.4%–37.5% reduction in the spread of the 95% Bayesian credible intervals. We were able to estimate annual survival for three subgroup combinations that were previously inestimable. Point estimates between the radiotelemetry and combined datasets were within |0.029| percentage points of each other, suggesting minimal to no bias induced by the incidental data.
  4. Annual survival rates were high (>0.89) for resident adult and juvenile tortoises in both study sites and for translocated adults in the southern site. Annual survival rates for translocated juveniles at both sites and translocated adults in the northern site were between 0.73 and 0.76. At both sites, translocated adults and juveniles had significantly lower survival than resident adults. High mortality in the northern site was driven primarily by a single pulse in mortalities.
  5. Using exponential survival models to leverage matching information across traditional survival studies and incidental data on marked individuals may serve as a useful tool to improve the precision and estimability of survival rates. This can improve the efficacy of understanding basic population ecology and population monitoring for imperiled species.
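As a rough illustration of the exponential time-to-event model referred to above, the following sketch computes the maximum-likelihood hazard rate from right-censored follow-up times and converts it to an annual survival probability. The study itself reports a Bayesian analysis; the function names and the toy data below are hypothetical and serve only to show the arithmetic.

```python
import math

def exponential_hazard_mle(times, events):
    """MLE of a constant hazard: observed deaths / total time at risk.

    times  -- follow-up time for each individual (in years)
    events -- 1 if death was observed, 0 if the record is right-censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    total_time = sum(times)
    if total_time <= 0:
        raise ValueError("total time at risk must be positive")
    return sum(events) / total_time

def annual_survival(hazard):
    """Probability of surviving one year under a constant hazard."""
    return math.exp(-hazard)

# Hypothetical toy data, not taken from the study.
times = [2.0, 3.0, 5.0]
events = [1, 0, 1]
hazard = exponential_hazard_mle(times, events)
print(round(hazard, 3), round(annual_survival(hazard), 3))
```

Under this model, censored individuals still contribute their time at risk to the denominator, which is one way incidental encounter records could add precision.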

12.
Sangbum Choi  Xuelin Huang 《Biometrics》2012,68(4):1126-1135
Summary We propose a semiparametrically efficient estimation of a broad class of transformation regression models for nonproportional hazards data. Classical transformation models are to be viewed from a frailty model paradigm, and the proposed method provides a unified approach that is valid for both continuous and discrete frailty models. The proposed models are shown to be flexible enough to model long‐term follow‐up survival data when the treatment effect diminishes over time, a case for which the PH or proportional odds assumption is violated, or a situation in which a substantial proportion of patients remains cured after treatment. Estimation of the link parameter in the frailty distribution, considered to be unknown and possibly dependent on time‐independent covariates, is automatically included in the proposed methods. The observed information matrix is computed to evaluate the variances of all the parameter estimates. Our likelihood‐based approach provides a natural way to construct simple statistics for testing the PH and proportional odds assumptions for usual survival data or testing the short‐ and long‐term effects for survival data with a cure fraction. Simulation studies demonstrate that the proposed inference procedures perform well in realistic settings. Applications to two medical studies are provided.

13.
Dunson DB  Chen Z  Harry J 《Biometrics》2003,59(3):521-530
In applications that involve clustered data, such as longitudinal studies and developmental toxicity experiments, the number of subunits within a cluster is often correlated with outcomes measured on the individual subunits. Analyses that ignore this dependency can produce biased inferences. This article proposes a Bayesian framework for jointly modeling cluster size and multiple categorical and continuous outcomes measured on each subunit. We use a continuation ratio probit model for the cluster size and underlying normal regression models for each of the subunit-specific outcomes. Dependency between cluster size and the different outcomes is accommodated through a latent variable structure. The form of the model facilitates posterior computation via a simple and computationally efficient Gibbs sampler. The approach is illustrated with an application to developmental toxicity data, and other applications, to joint modeling of longitudinal and event time data, are discussed.

14.
Summary Spatial cluster detection is an important methodology for identifying regions with excessive numbers of adverse health events without making strong model assumptions on the underlying spatial dependence structure. Previous work has focused on point or individual‐level outcome data and few advances have been made when the outcome data are reported at an aggregated level, for example, at the county‐ or census‐tract level. This article proposes a new class of spatial cluster detection methods for point or aggregate data, comprising continuous, binary, and count data. Compared with the existing spatial cluster detection methods, it has the following advantages. First, it readily incorporates region‐specific weights, for example, based on a region's population or a region's outcome variance, which is the key for aggregate data. Second, the established general framework allows for area‐level and individual‐level covariate adjustment. A simulation study is conducted to evaluate the performance of the method. The proposed method is then applied to assess spatial clustering of high Body Mass Index in a health maintenance organization population in the Seattle, Washington, USA area.

15.
Summary Cook, Gold, and Li (2007, Biometrics 63, 540–549) extended the Kulldorff (1997, Communications in Statistics 26, 1481–1496) scan statistic for spatial cluster detection to survival‐type observations. Their approach was based on the score statistic and they proposed a permutation distribution for the maximum of score tests. The score statistic makes it possible to apply the scan statistic idea to models including explanatory variables. However, we show that the permutation distribution requires strong assumptions of independence between the potential cluster and both censoring and explanatory variables. In contrast, we present an approach using the asymptotic distribution of the maximum of score statistics in a manner not requiring these assumptions.

16.
Multi-state stochastic models are useful tools for studying complex dynamics such as chronic diseases. Semi-Markov models explicitly define distributions of waiting times, giving an extension of continuous time and homogeneous Markov models based implicitly on exponential distributions. This paper develops a parametric model adapted to complex medical processes. (i) We introduced a hazard function of waiting times with a U or inverse U shape. (ii) These distributions were specifically selected for each transition. (iii) The vector of covariates was also selected for each transition. We applied this method to the evolution of HIV infected patients. We used a sample of 1244 patients followed up at the hospital in Nice, France.

17.
Royle JA 《Biometrics》2009,65(1):267-274
Summary. I consider the analysis of capture–recapture models with individual covariates that influence detection probability. Bayesian analysis of the joint likelihood is carried out using a flexible data augmentation scheme that facilitates analysis by Markov chain Monte Carlo methods, and a simple and straightforward implementation in freely available software. This approach is applied to a study of meadow voles (Microtus pennsylvanicus) in which auxiliary data on a continuous covariate (body mass) are recorded, and it is thought that detection probability is related to body mass. In a second example, the model is applied to an aerial waterfowl survey in which a double-observer protocol is used. The fundamental unit of observation is the cluster of individual birds, and the size of the cluster (a discrete covariate) is used as a covariate on detection probability.

18.
Ibrahim JG  Chen MH  Lipsitz SR 《Biometrics》1999,55(2):591-596
We propose a method for estimating parameters for general parametric regression models with an arbitrary number of missing covariates. We allow any pattern of missing data and assume that the missing data mechanism is ignorable throughout. When the missing covariates are categorical, a useful technique for obtaining parameter estimates is the EM algorithm by the method of weights proposed in Ibrahim (1990, Journal of the American Statistical Association 85, 765-769). We extend this method to continuous or mixed categorical and continuous covariates, and for arbitrary parametric regression models, by adapting a Monte Carlo version of the EM algorithm as discussed by Wei and Tanner (1990, Journal of the American Statistical Association 85, 699-704). In addition, we discuss the Gibbs sampler for sampling from the conditional distribution of the missing covariates given the observed data and show that the appropriate complete conditionals are log-concave. The log-concavity property of the conditional distributions will facilitate a straightforward implementation of the Gibbs sampler via the adaptive rejection algorithm of Gilks and Wild (1992, Applied Statistics 41, 337-348). We assume the model for the response given the covariates is an arbitrary parametric regression model, such as a generalized linear model, a parametric survival model, or a nonlinear model. We model the marginal distribution of the covariates as a product of one-dimensional conditional distributions. This allows us a great deal of flexibility in modeling the distribution of the covariates and reduces the number of nuisance parameters that are introduced in the E-step. We present examples involving both simulated and real data.

19.
Survival values have been computed for heat plus radiation data using the three most common radiation survival models: multitarget, multitarget with initial slope, and linear quadratic. By χ² analysis, all three models provide an equally good fit to the experimental data. When the survival values are compared, the linear quadratic model provides a slightly better fit in the shoulder region, while the multitarget models provide a slightly better fit in the exponential region.
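For orientation, the three models are commonly written in the textbook forms below, with surviving fraction $S$ as a function of dose $D$. The abstract does not give its parametrization, so the symbols $D_0$, $D_1$, $n$, $\alpha$, and $\beta$ are assumptions and may differ from those used in the paper.

```latex
\text{Multitarget:} \quad S(D) = 1 - \left(1 - e^{-D/D_0}\right)^{n}

\text{Multitarget with initial slope:} \quad
S(D) = e^{-D/D_1}\left[1 - \left(1 - e^{-D/D_0}\right)^{n}\right]

\text{Linear quadratic:} \quad S(D) = e^{-\alpha D - \beta D^{2}}
```

In these forms the multitarget curves become exponential at high dose, whereas the linear quadratic curve bends continuously. This is consistent with the different regions of best fit noted in the abstract.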

20.
As a useful tool for geographical cluster detection of events, the spatial scan statistic is widely applied in many fields and plays an increasingly important role. The classic version of the spatial scan statistic for the binary outcome was developed by Kulldorff, based on the Bernoulli or the Poisson probability model. In this paper, we apply the Hypergeometric probability model to construct the likelihood function under the null hypothesis. Compared with existing methods, the likelihood function under the null hypothesis is an alternative and indirect method to identify the potential cluster, and the test statistic is the extreme value of the likelihood function. Similar to Kulldorff’s methods, we adopt a Monte Carlo test for the test of significance. Both methods are applied for detecting spatial clusters of Japanese encephalitis in Sichuan province, China, in 2009, and the detected clusters are identical. A simulation on independent benchmark data indicates that the test statistic based on the Hypergeometric model outperforms Kulldorff’s statistics for clusters of high population density or large size; otherwise Kulldorff’s statistics are superior.
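For comparison, the Poisson-model statistic attributed to Kulldorff can be sketched as follows. This is a minimal illustration of the usual log likelihood ratio for a single candidate zone, not the Hypergeometric method proposed in the paper; the function names and the tuple layout of `zones` are hypothetical, and the Monte Carlo significance test is omitted.

```python
import math

def poisson_scan_llr(cases_in, expected_in, total_cases):
    """Log likelihood ratio for one candidate zone under the Poisson model.

    cases_in    -- observed cases inside the zone
    expected_in -- expected cases inside the zone under the null hypothesis,
                   scaled so that expectations over the whole area sum to
                   total_cases
    total_cases -- observed cases in the whole study area
    """
    if not 0 < expected_in < total_cases:
        raise ValueError("expected_in must lie strictly between 0 and total_cases")
    if cases_in <= expected_in:
        return 0.0  # only zones with an excess of cases are of interest
    cases_out = total_cases - cases_in
    expected_out = total_cases - expected_in
    llr = cases_in * math.log(cases_in / expected_in)
    if cases_out > 0:  # the term vanishes when no cases fall outside the zone
        llr += cases_out * math.log(cases_out / expected_out)
    return llr

def most_likely_cluster(zones, total_cases):
    """Return the (label, cases_in, expected_in) tuple with the largest LLR."""
    return max(zones, key=lambda z: poisson_scan_llr(z[1], z[2], total_cases))
```

In practice the maximum of this ratio over all candidate zones is the test statistic, and its null distribution is approximated by recomputing it on randomly regenerated data sets.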
