Similar articles
15 similar articles found.
1.
2.
The use of survival models involving a random effect or 'frailty' term is becoming more common. Usually the random effects are assumed to represent different clusters, and clusters are assumed to be independent. In this paper, we consider random effects corresponding to clusters that are spatially arranged, such as clinical sites or geographical regions. That is, we might suspect that random effects corresponding to strata in closer proximity to each other might also be similar in magnitude. Such spatial arrangement of the strata can be modeled in several ways, but we group these ways into two general settings: geostatistical approaches, where we use the exact geographic locations (e.g. latitude and longitude) of the strata, and lattice approaches, where we use only the positions of the strata relative to each other (e.g. which counties neighbor which others). We compare our approaches in the context of a dataset on infant mortality in Minnesota counties between 1992 and 1996. Our main substantive goal here is to explain the pattern of infant mortality using important covariates (sex, race, birth weight, age of mother, etc.) while accounting for possible (spatially correlated) differences in hazard among the counties. We use the GIS ArcView to map resulting fitted hazard rates, to help search for possible lingering spatial correlation. The DIC criterion (Spiegelhalter et al., Journal of the Royal Statistical Society, Series B 2002, to appear) is used to choose among various competing models. We investigate the quality of fit of our chosen model, and compare its results when used to investigate neonatal versus post-neonatal mortality. We also compare use of our time-to-event outcome survival model with the simpler dichotomous outcome logistic model. Finally, we summarize our findings and suggest directions for future research.

3.
4.
Most modern population genetics inference methods are based on the coalescence framework. Methods that allow estimating parameters of structured populations commonly insert migration events into the genealogies. For these methods the calculation of the coalescence probability density of a genealogy requires a product over all time periods between events. Data sets that contain populations with high rates of gene flow among them require an enormous number of calculations. A new method, transition probability-structured coalescence (TPSC), replaces the discrete migration events with probability statements. Because the speed of calculation is independent of the amount of gene flow, this method allows calculating the coalescence densities efficiently. The current implementation of TPSC uses an approximation simplifying the interaction among lineages. Simulations and coverage comparisons of TPSC vs. MIGRATE show that TPSC allows estimation of high migration rates more precisely, but because of the approximation the estimation of low migration rates is biased. The implementation of TPSC into programs that calculate quantities on phylogenetic tree structures is straightforward, so the TPSC approach will facilitate more general inferences in many computer programs.

5.
Multi-state stochastic models are useful tools for studying complex dynamics such as chronic diseases. Semi-Markov models explicitly define distributions of waiting times, giving an extension of continuous-time homogeneous Markov models, which are based implicitly on exponential distributions. This paper develops a parametric model adapted to complex medical processes. (i) We introduced a hazard function of waiting times with a U or inverse U shape. (ii) These distributions were specifically selected for each transition. (iii) The vector of covariates was also selected for each transition. We applied this method to the evolution of HIV-infected patients. We used a sample of 1244 patients followed up at the hospital in Nice, France.

6.
7.
Repeatability (more precisely the common measure of repeatability, the intra‐class correlation coefficient, ICC) is an important index for quantifying the accuracy of measurements and the constancy of phenotypes. It is the proportion of phenotypic variation that can be attributed to between‐subject (or between‐group) variation. As a consequence, the non‐repeatable fraction of phenotypic variation is the sum of measurement error and phenotypic flexibility. There are several ways to estimate repeatability for Gaussian data, but there are no formal agreements on how repeatability should be calculated for non‐Gaussian data (e.g. binary, proportion and count data). In addition to point estimates, appropriate uncertainty estimates (standard errors and confidence intervals) and statistical significance for repeatability estimates are required regardless of the types of data. We review the methods for calculating repeatability and the associated statistics for Gaussian and non‐Gaussian data. For Gaussian data, we present three common approaches for estimating repeatability: correlation‐based, analysis of variance (ANOVA)‐based and linear mixed‐effects model (LMM)‐based methods, while for non‐Gaussian data, we focus on generalised linear mixed‐effects models (GLMM) that allow the estimation of repeatability on the original and on the underlying latent scale. We also address a number of methods for calculating standard errors, confidence intervals and statistical significance; the most accurate and recommended methods are parametric bootstrapping, randomisation tests and Bayesian approaches. We advocate the use of LMM‐ and GLMM‐based approaches mainly because of the ease with which confounding variables can be controlled for. Furthermore, we compare two types of repeatability (ordinary repeatability and extrapolated repeatability) in relation to narrow‐sense heritability. This review serves as a collection of guidelines and recommendations for biologists to calculate repeatability and heritability from both Gaussian and non‐Gaussian data.
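As a rough illustration of the ANOVA-based approach mentioned in this abstract, the sketch below computes the intra-class correlation from a balanced one-way ANOVA, using the textbook estimator (MS_between - MS_within) / (MS_between + (k - 1) MS_within), where k is the number of measurements per subject. The function name `anova_repeatability` and the list-of-lists input format are assumptions made for this example and do not come from the reviewed paper; unbalanced designs and the LMM/GLMM approaches are not covered.

```python
def anova_repeatability(groups):
    """Estimate repeatability (ICC) from a balanced one-way ANOVA.

    `groups` is a list of equal-length lists, one list of repeated
    measurements per subject (hypothetical input format).
    """
    n_groups = len(groups)
    k = len(groups[0])  # measurements per subject
    if n_groups < 2 or k < 2:
        raise ValueError("need at least 2 subjects and 2 measurements each")
    if any(len(g) != k for g in groups):
        raise ValueError("this sketch assumes a balanced design")

    grand_mean = sum(sum(g) for g in groups) / (n_groups * k)
    group_means = [sum(g) / k for g in groups]

    # Between-subject and within-subject sums of squares.
    ss_between = k * sum((m - grand_mean) ** 2 for m in group_means)
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_groups * (k - 1))

    # Share of total variance attributable to between-subject variation.
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
```

With perfectly repeatable measurements the within-subject mean square is zero and the estimate equals 1; the estimator can also be negative when within-subject variation exceeds between-subject variation.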

8.
Generalized hierarchical multivariate CAR models for areal data   Total citations: 5 (self-citations: 0; citations by others: 5)
Jin X, Carlin BP, Banerjee S. Biometrics, 2005, 61(4): 950-961
In the fields of medicine and public health, a common application of areal data models is the study of geographical patterns of disease. When we have several measurements recorded at each spatial location (for example, information on p ≥ 2 diseases from the same population groups or regions), we need to consider multivariate areal data models in order to handle the dependence among the multivariate components as well as the spatial dependence between sites. In this article, we propose a flexible new class of generalized multivariate conditionally autoregressive (GMCAR) models for areal data, and show how it enriches the MCAR class. Our approach differs from earlier ones in that it directly specifies the joint distribution for a multivariate Markov random field (MRF) through the specification of simpler conditional and marginal models. This in turn leads to a significant reduction in the computational burden in hierarchical spatial random effect modeling, where posterior summaries are computed using Markov chain Monte Carlo (MCMC). We compare our approach with existing MCAR models in the literature via simulation, using average mean square error (AMSE) and a convenient hierarchical model selection criterion, the deviance information criterion (DIC; Spiegelhalter et al., 2002, Journal of the Royal Statistical Society, Series B 64, 583-639). Finally, we offer a real-data application of our proposed GMCAR approach that models lung and esophagus cancer death rates during 1991-1998 in Minnesota counties.
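For readers unfamiliar with the deviance information criterion named in this abstract, it is usually stated as follows. The notation (y for the data, θ for the parameters, θ̄ for the posterior mean) is chosen here for illustration and is not taken from the abstract.

```latex
\[
D(\theta) = -2 \log p(y \mid \theta), \qquad
\bar{D} = \mathrm{E}_{\theta \mid y}\left[ D(\theta) \right], \qquad
p_D = \bar{D} - D(\bar{\theta}), \qquad
\mathrm{DIC} = \bar{D} + p_D .
\]
```

Here the posterior mean deviance measures fit, p_D acts as an effective number of parameters, and smaller DIC values indicate the preferred model.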

9.
Leeyoung Park, Ju H. Kim. Genetics, 2015, 199(4): 1007-1016
Causal models including genetic factors are important for understanding the presentation mechanisms of complex diseases. Familial aggregation and segregation analyses based on polygenic threshold models have been the primary approach to fitting genetic models to the family data of complex diseases. In the current study, an advanced approach to obtaining appropriate causal models for complex diseases based on the sufficient component cause (SCC) model involving combinations of traditional genetics principles was proposed. The probabilities for the entire population, i.e., normal–normal, normal–disease, and disease–disease, were considered for each model for the appropriate handling of common complex diseases. The causal model in the current study included the genetic effects from single genes involving epistasis, complementary gene interactions, gene–environment interactions, and environmental effects. Bayesian inference using a Markov chain Monte Carlo (MCMC) algorithm was used to assess the proportions of each component for a given population lifetime incidence. This approach is flexible, allowing both common and rare variants within a gene and across multiple genes. An application to schizophrenia data confirmed the complexity of the causal factors. An analysis of diabetes data demonstrated that environmental factors and gene–environment interactions are the main causal factors for type II diabetes. The proposed method is effective and useful for identifying causal models, which can accelerate the development of efficient strategies for identifying causal factors of complex diseases.

10.
A simple population genetic model is presented for a hermaphrodite annual species, allowing both selfing and outcrossing. Those male gametes (pollen) responsible for outcrossing are assumed to disperse much further than seeds. Under this model, the pedigree of a sample from a single locality is loop-free. A novel Markov chain Monte Carlo strategy is presented for sampling from the joint posterior distribution of the pedigree of such a sample and the parameters of the population genetic model (including the selfing rate) given the genotypes of the sampled individuals at unlinked marker loci. The computational costs of this Markov chain Monte Carlo strategy scale well with the number of individuals in the sample, and the number of marker loci, but increase exponentially with the age (time since colonisation from the source population) of the local population. Consequently, this strategy is particularly suited to situations where the sample has been collected from a population which is the result of a recent colonisation process.

11.
One barrier to interpreting the observational evidence concerning the adverse health effects of air pollution for public policy purposes is the measurement error inherent in estimates of exposure based on ambient pollutant monitors. Exposure assessment studies have shown that data from monitors at central sites may not adequately represent personal exposure. Thus, the exposure error resulting from using centrally measured data as a surrogate for personal exposure can potentially lead to a bias in estimates of the health effects of air pollution. This paper develops a multi-stage Poisson regression model for evaluating the effects of exposure measurement error on estimates of effects of particulate air pollution on mortality in time-series studies. To implement the model, we have used five validation data sets on personal exposure to PM10. Our goal is to combine data on the associations between ambient concentrations of particulate matter and mortality for a specific location, with the validation data on the association between ambient and personal concentrations of particulate matter at the locations where data have been collected. We use these data in a model to estimate the relative risk of mortality associated with estimated personal-exposure concentrations and make a comparison with the risk of mortality estimated with measurements of ambient concentration alone. We apply this method to data comprising daily mortality counts, ambient concentrations of PM10 measured at a central site, and temperature for Baltimore, Maryland from 1987 to 1994. We have selected our home city of Baltimore to illustrate the method; the measurement error correction model is general and can be applied to other appropriate locations. Our approach uses a combination of: (1) a generalized additive model with log link and Poisson error for the mortality-personal-exposure association; (2) a multi-stage linear model to estimate the variability across the five validation data sets in the personal-ambient-exposure association; (3) data augmentation methods to address the uncertainty resulting from the missing personal exposure time series in Baltimore. In the Poisson regression model, we account for smooth seasonal and annual trends in mortality using smoothing splines. Taking into account the heterogeneity across locations in the personal-ambient-exposure relationship, we quantify the degree to which the exposure measurement error biases the results toward the null hypothesis of no effect, and estimate the loss of precision in the estimated health effects due to indirectly estimating personal exposures from ambient measurements.

12.
Bayesian shrinkage analysis is arguably the state-of-the-art technique for large-scale multiple quantitative trait locus (QTL) mapping. However, when the shrinkage model does not involve indicator variables for marker inclusion, QTL detection remains heavily dependent on significance thresholds derived from phenotype permutation under the null hypothesis of no phenotype-to-genotype association. This approach is computationally intensive and, more importantly, the hypothetical data generation at the heart of the permutation-based method violates the Bayesian philosophy. Here we propose a fully Bayesian decision rule for QTL detection under the recently introduced extended Bayesian LASSO for QTL mapping. Our new decision rule is free of any hypothetical data generation and relies on the well-established Bayes factors for evaluating the evidence for QTL presence at any locus. Simulation results demonstrate the remarkable performance of our decision rule. An application to real-world data is considered as well.

13.
Polymerase chain reaction (PCR) is a major DNA amplification technology from molecular biology. The quantitative analysis of PCR aims at determining the initial amount of the DNA molecules from the observation of, typically, several PCR amplification curves. The mainstream observation scheme of the DNA amplification during PCR involves fluorescence intensity measurements. Under the classical assumption that the measured fluorescence intensity is proportional to the amount of present DNA molecules, and under the assumption that these measurements are corrupted by an additive Gaussian noise, we analyze a single amplification curve using a hidden Markov model (HMM). The unknown parameters of the HMM may be separated into two parts. On the one hand, the parameters from the amplification process are the initial number of the DNA molecules and the replication efficiency, which is the probability that a molecule is duplicated. On the other hand, the parameters from the observational scheme are the scale parameter that converts the fluorescence intensity into the number of DNA molecules and the mean and variance characterizing the Gaussian noise. We use the maximum likelihood estimation procedure to infer the unknown parameters of the model from the exponential phase of a single amplification curve, the main parameter of interest for quantitative PCR being the initial amount of the DNA molecules. An illustrative example is provided. This research was financed by the Swedish foundation for Strategic Research through the Gothenburg Mathematical Modelling Centre.
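The generative side of the model described in this abstract can be sketched as a simple simulation: a hidden molecule count in which each molecule is duplicated with a fixed probability per cycle, observed through a fluorescence reading proportional to the count plus Gaussian noise. This is only a forward simulation under stated assumptions, not the paper's maximum likelihood estimation. The function name `simulate_pcr`, its parameter names, and the exact observation form `scale * molecules + noise` are hypothetical choices made for this example.

```python
import random


def simulate_pcr(n0, efficiency, cycles, scale, noise_mean, noise_sd, seed=0):
    """Simulate hidden molecule counts and noisy fluorescence readings.

    Assumed model (illustrative only): each molecule present at the start
    of a cycle is duplicated independently with probability `efficiency`,
    and fluorescence = scale * molecules + Gaussian noise.
    """
    rng = random.Random(seed)
    molecules = n0
    counts = []
    fluorescence = []
    for _ in range(cycles):
        # Binomial number of successful duplications in this cycle.
        duplicated = sum(rng.random() < efficiency for _ in range(molecules))
        molecules += duplicated
        counts.append(molecules)
        fluorescence.append(scale * molecules + rng.gauss(noise_mean, noise_sd))
    return counts, fluorescence
```

With `efficiency=1.0` the count doubles every cycle, which corresponds to the idealised exponential phase; lower efficiencies give a noisy, slower exponential growth. The per-molecule loop keeps the sketch dependency-free but is only practical for small counts.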

14.
This article proposes resampling-based empirical Bayes multiple testing procedures for controlling a broad class of Type I error rates, defined as generalized tail probability (gTP) error rates, gTP(q, g) = Pr(g(V(n), S(n)) > q), and generalized expected value (gEV) error rates, gEV(g) = E[g(V(n), S(n))], for arbitrary functions g(V(n), S(n)) of the numbers of false positives V(n) and true positives S(n). Of particular interest are error rates based on the proportion g(V(n), S(n)) = V(n) / (V(n) + S(n)) of Type I errors among the rejected hypotheses, such as the false discovery rate (FDR), FDR = E[V(n) / (V(n) + S(n))]. The proposed procedures offer several advantages over existing methods. They provide Type I error control for general data generating distributions, with arbitrary dependence structures among variables. Gains in power are achieved by deriving rejection regions based on guessed sets of true null hypotheses and null test statistics randomly sampled from joint distributions that account for the dependence structure of the data. The Type I error and power properties of an FDR-controlling version of the resampling-based empirical Bayes approach are investigated and compared to those of widely-used FDR-controlling linear step-up procedures in a simulation study. The Type I error and power trade-off achieved by the empirical Bayes procedures under a variety of testing scenarios allows this approach to be competitive with or outperform the Storey and Tibshirani (2003) linear step-up procedure, as an alternative to the classical Benjamini and Hochberg (1995) procedure.
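For orientation, the classical Benjamini and Hochberg (1995) linear step-up procedure that this abstract uses as a point of comparison can be sketched as follows. This is the standard comparator, not the resampling-based empirical Bayes procedure the article proposes; the function name `benjamini_hochberg` and its return format are assumptions made for this example.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Linear step-up rule: find the largest rank k such that the k-th
    smallest p-value is <= alpha * k / m, then reject the k hypotheses
    with the smallest p-values.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])

    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k_max = rank  # keep the largest rank that passes

    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected
```

The step-up character shows in inputs such as `[0.02, 0.02, 0.03, 0.04]` at `alpha=0.05`: the smallest p-value fails its own threshold (0.0125), yet all four hypotheses are rejected because the largest one passes its threshold (0.05).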

15.
The increasing number of taxa and loci in molecular phylogenetic studies of basal euteleosts has brought stability in a controversial area. A key emerging aspect to these studies is a sister Esociformes (pike) and Salmoniformes (salmon) relationship. We evaluate mitochondrial genome support for a sister Esociformes and Salmoniformes hypothesis by surveying many potential outgroups for these taxa, employing multiple phylogenetic approaches, and utilizing a thorough sampling scheme. Secondly, we conduct a simultaneous divergence time estimation and phylogenetic inference in a Bayesian framework with fossil calibrations focusing on relationships within Esociformes + Salmoniformes. Our dataset supports a sister relationship between Esociformes and Salmoniformes; however, the nearest relatives of Esociformes + Salmoniformes are inconsistent among analyses. Within the order Esociformes, we advocate for a single family, Esocidae. Subfamily relationships within Salmonidae are poorly supported as Salmoninae sister to Thymallinae + Coregoninae.
