Similar literature
Found 20 similar documents (search time: 234 ms)
1.
Background: Cure models can provide improved possibilities for inference if used appropriately, but there is potential for misleading results if care is not taken. In this study, we compared five commonly used approaches for modelling cure in a relative survival framework and provide some practical advice on the use of these approaches. Patients and methods: Data for colon, female breast, and ovarian cancers were used to illustrate these approaches. The proportion cured was estimated for each of these three cancers within each of three age groups. We then graphically assessed the assumption of cure and the model fit, by comparing the predicted relative survival from the cure models to empirical life table estimates. Results: Where both cure and distributional assumptions are appropriate (e.g., for colon or ovarian cancer patients aged <75 years), all five approaches led to similar estimates of the proportion cured. The estimates varied slightly when cure was a reasonable assumption but the distributional assumption was not (e.g., for colon cancer patients ≥75 years). Greater variability in the estimates was observed when the cure assumption was not supported by the data (breast cancer). Conclusions: If the data suggest cure is not a reasonable assumption then we advise against fitting cure models. In the scenarios where cure was reasonable, we found that flexible parametric cure models performed at least as well as, or better than, the other modelling approaches. We recommend that, regardless of the model used, the underlying assumptions for cure and model fit should always be graphically assessed.
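
As a rough illustration of what "proportion cured" means in a relative survival framework, the sketch below uses a mixture cure model, R(t) = π + (1 − π)·S_u(t), in which relative survival levels off at the cure fraction π. The abstract does not say which five approaches were compared, so the mixture form, the Weibull choice for the uncured group, the function name, and all parameter values here are assumptions for illustration only.

```python
import math


def mixture_cure_relative_survival(t, cure_fraction, weibull_scale, weibull_shape):
    """Relative survival R(t) = pi + (1 - pi) * S_u(t).

    S_u(t) is the survival of the "uncured" group, assumed here to be
    Weibull (an illustrative choice, not taken from the abstract).
    """
    if t < 0:
        raise ValueError("time must be non-negative")
    if not 0.0 <= cure_fraction <= 1.0:
        raise ValueError("cure_fraction must lie in [0, 1]")
    uncured_survival = math.exp(-((t / weibull_scale) ** weibull_shape))
    return cure_fraction + (1.0 - cure_fraction) * uncured_survival


# Hypothetical parameter values: the curve starts at 1 and plateaus near 0.55.
for years in (0, 1, 5, 10, 20):
    value = mixture_cure_relative_survival(years, 0.55, 3.0, 1.2)
    print(years, round(value, 3))
```

Comparing such a fitted curve with empirical life table estimates is the kind of graphical check the abstract recommends; if the empirical curve never flattens, the cure assumption is doubtful.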

2.
Chen MH, Ibrahim JG, Lam P, Yu A, Zhang Y. Biometrics, 2011, 67(3): 1163-1170
Summary: We develop a new Bayesian approach of sample size determination (SSD) for the design of noninferiority clinical trials. We extend the fitting and sampling priors of Wang and Gelfand (2002, Statistical Science 17, 193–208) to Bayesian SSD with a focus on controlling the type I error and power. Historical data are incorporated via a hierarchical modeling approach as well as the power prior approach of Ibrahim and Chen (2000, Statistical Science 15, 46–60). Various properties of the proposed Bayesian SSD methodology are examined and a simulation‐based computational algorithm is developed. The proposed methodology is applied to the design of a noninferiority medical device clinical trial with historical data from previous trials.
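
The power prior raises the historical-data likelihood to a discounting power a0 in [0, 1] before combining it with an initial prior. A minimal sketch for the simplest conjugate case follows; the binomial outcome, the Beta(a, b) initial prior, the function name, and the numbers are assumptions chosen for illustration and are not the trial design described in the abstract.

```python
def power_prior_beta_posterior(y, n, y0, n0, a0, a=1.0, b=1.0):
    """Posterior Beta parameters for a binomial success probability.

    Historical data (y0 successes in n0 trials) are discounted by a0,
    then combined with current data (y successes in n trials) and an
    initial Beta(a, b) prior:
        alpha = a + a0 * y0 + y
        beta  = b + a0 * (n0 - y0) + (n - y)
    """
    if not 0.0 <= a0 <= 1.0:
        raise ValueError("a0 must lie in [0, 1]")
    alpha = a + a0 * y0 + y
    beta = b + a0 * (n0 - y0) + (n - y)
    return alpha, beta


# Hypothetical numbers: a0 = 0 ignores the historical trial, a0 = 1 pools it fully.
for discount in (0.0, 0.5, 1.0):
    alpha, beta = power_prior_beta_posterior(18, 20, 45, 50, discount)
    print(discount, alpha, beta, round(alpha / (alpha + beta), 3))
```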

3.
We develop a Bayes factor-based approach for the design of non-inferiority clinical trials with a focus on controlling type I error and power. Historical data are incorporated in the Bayesian design via the power prior discussed in Ibrahim and Chen (Stat Sci 15:46–60, 2000). The properties of the proposed method are examined in detail. An efficient simulation-based computational algorithm is developed to calculate the Bayes factor, type I error, and power. The proposed methodology is applied to the design of a non-inferiority medical device clinical trial.

4.
A growing body of evidence shows that human mating preferences, like those of other animal species, can vary geographically. For example, women living in areas with a high cost of living have been shown to seek potential mates that can provide resources (e.g., large salaries). In this study, we present data from a large (N = 2944) nationally representative (United States) sample of Internet dating profiles. The profiles allowed daters to report their own income and the minimum income they desired in a dating partner, and we analyzed these data at the level of zip code. Our analysis shows that women engage in more resource seeking than men. We also find a positive relationship between cost of living in the dater's zip code and resource seeking among both men and women. Importantly, however, this relationship disappears if one's own income is accounted for in the analysis; that is, individuals of both sexes seek mates with an income similar to their own, regardless of local resource pressures. Our data highlight the importance of considering individual characteristics when measuring the effects of environmental factors on behavior.

5.
Introgressive hybridization between domestic dogs and wolves (Canis lupus) represents an emblematic case of anthropogenic hybridization and is increasingly threatening the genomic integrity of wolf populations expanding into human-modified landscapes. But studies formally estimating prevalence and accounting for imperfect detectability and uncertainty in hybrid classification are lacking. Our goal was to present an approach to formally estimate the proportion of admixture by using a capture-recapture (CR) framework applied to individual multilocus genotypes detected from non-invasive samples collected from a protected wolf population in Italy. We scored individual multilocus genotypes using a panel of 12 microsatellites and assigned genotypes to reference wolf and dog populations through Bayesian clustering procedures. Based on 152 samples, our dataset comprised the capture histories of 39 individuals sampled in 7 wolf packs and was organized in bi-monthly sampling occasions (Aug 2015−May 2016). We fitted CR models using a multievent formulation to explicitly handle uncertainty in individual classification, and accordingly examined 2 model scenarios: one reflecting a traditional approach to classifying individuals (i.e., minimizing the misclassification of wolves as hybrids; Type 1 error), and the other using a more stringent criterion aimed to balance Type 1 and Type 2 error rates (i.e., the misclassification of hybrids as wolves). Compared to the sample proportion of admixed individuals in the dataset (43.6%), formally estimated prevalence was 50% under the first and 70% under the second scenario, with 71.4% and 85.7% of admixed packs, respectively. At the individual level, the proportion of dog ancestry in the wolf population averaged 7.8% (95% CI = 4.4−11%). 
Balancing between Type 1 and 2 error rates in assignment tests, our second scenario produced an estimate of prevalence 40% higher compared to the alternative scenario, corresponding to a 65% decrease in Type 2 and no increase in Type 1 error rates. Providing a formal and innovative estimation approach to assess prevalence in admixed wild populations, our study confirms previous population modeling indicating that reproductive barriers between wolves and dogs, or dilution of dog genes through backcrossing, should not be expected per se to prevent the spread of introgression. As anthropogenic hybridization is increasingly affecting animal species globally, our approach is of interest to a broader audience of wildlife conservationists and practitioners. © 2021 The Authors. The Journal of Wildlife Management published by Wiley Periodicals LLC on behalf of The Wildlife Society.

6.
7.
The molecular clock, i.e., constancy of the rate of evolution over time, is commonly assumed in estimating divergence dates. However, this assumption is often violated, which has drastic effects on date estimation. Recently, a number of attempts have been made to relax the clock assumption. One approach is to use maximum likelihood, which assigns rates to branches and allows the estimation of both rates and times. An alternative is the Bayes approach, which models the change of the rate over time. A number of models of rate change have been proposed. We have extended and evaluated models of rate evolution, i.e., the lognormal and its recent variant, along with the gamma, the exponential, and the Ornstein-Uhlenbeck processes. These models were first applied to a small hominoid data set, where an empirical Bayes approach was used to estimate the hyperparameters that measure the amount of rate variation. Estimation of divergence times was sensitive to these hyperparameters, especially when the assumed model is close to the clock assumption. The rate and date estimates varied little from model to model, although the posterior Bayes factor indicated the Ornstein-Uhlenbeck process outperformed the other models. To demonstrate the importance of allowing for rate change across lineages, this general approach was used to analyze a larger data set consisting of the 18S ribosomal RNA gene of 39 metazoan species. We obtained date estimates consistent with paleontological records, the deepest split within the group being about 560 million years ago. Estimates of the rates were in accordance with the Cambrian explosion hypothesis and suggested some more recent lineage-specific bursts of evolution.

8.
Generalized hierarchical multivariate CAR models for areal data   Cited by: 5 (self-citations: 0; citations by others: 5)
Jin X, Carlin BP, Banerjee S. Biometrics, 2005, 61(4): 950-961
In the fields of medicine and public health, a common application of areal data models is the study of geographical patterns of disease. When we have several measurements recorded at each spatial location (for example, information on p ≥ 2 diseases from the same population groups or regions), we need to consider multivariate areal data models in order to handle the dependence among the multivariate components as well as the spatial dependence between sites. In this article, we propose a flexible new class of generalized multivariate conditionally autoregressive (GMCAR) models for areal data, and show how it enriches the MCAR class. Our approach differs from earlier ones in that it directly specifies the joint distribution for a multivariate Markov random field (MRF) through the specification of simpler conditional and marginal models. This in turn leads to a significant reduction in the computational burden in hierarchical spatial random effect modeling, where posterior summaries are computed using Markov chain Monte Carlo (MCMC). We compare our approach with existing MCAR models in the literature via simulation, using average mean square error (AMSE) and a convenient hierarchical model selection criterion, the deviance information criterion (DIC; Spiegelhalter et al., 2002, Journal of the Royal Statistical Society, Series B 64, 583-639). Finally, we offer a real-data application of our proposed GMCAR approach that models lung and esophagus cancer death rates during 1991-1998 in Minnesota counties.

9.
A Bayesian model for sparse functional data   Cited by: 1 (self-citations: 0; citations by others: 1)
Thompson WK, Rosen O. Biometrics, 2008, 64(1): 54-63
Summary. We propose a method for analyzing data which consist of curves on multiple individuals, i.e., longitudinal or functional data. We use a Bayesian model where curves are expressed as linear combinations of B-splines with random coefficients. The curves are estimated as posterior means obtained via Markov chain Monte Carlo (MCMC) methods, which automatically select the local level of smoothing. The method is applicable to situations where curves are sampled sparsely and/or at irregular time points. We construct posterior credible intervals for the mean curve and for the individual curves. This methodology provides unified, efficient, and flexible means for smoothing functional data.

10.
We develop here a new class of gene evolution models in which the nucleotide mutations are time dependent. These models allow the study of nonlinear gene evolution by accelerating or decelerating the mutation rates at different evolutionary times. They generalize the previous ones, which are based on constant mutation rates. The stochastic model developed in this class determines at some time t the occurrence probabilities of trinucleotides mutating according to 3 time dependent substitution parameters associated with the 3 trinucleotide sites. Therefore, it allows simulation of the evolution of the circular code recently observed in genes. By varying the class of function for the substitution parameters, 1 among 12 models retrieves after mutation the statistical properties of the observed circular code in the 3 frames of actual genes. In this model, the mutation rate in the 3rd trinucleotide site increases during gene evolution while the mutation rates in the 1st and 2nd sites decrease. This property agrees with the actual degeneracy of the genetic code. This approach can easily be generalized to study evolution of motifs of various lengths, e.g., dicodons, etc., with time dependent mutations.

11.
Characterizing organism growth within populations requires the application of well-studied individual size-at-age models, such as the deterministic Gompertz model, to populations of individuals whose characteristics, corresponding to model parameters, may be highly variable. A natural approach is to assign probability distributions to one or more model parameters. In some contexts, size-at-age data may be absent due to difficulties in ageing individuals, but size-increment data may instead be available (e.g., from tag-recapture experiments). A preliminary transformation to a size-increment model is then required. Gompertz models developed along the above lines have recently been applied to strongly heterogeneous abalone tag-recapture data. Although useful in modelling the early growth stages, these models yield size-increment distributions that allow negative growth, which is inappropriate in the case of mollusc shells and other accumulated biological structures (e.g., vertebrae) where growth is irreversible. Here we develop probabilistic Gompertz models where this difficulty is resolved by conditioning parameter distributions on size, allowing application to irreversible growth data. In the case of abalone growth, introduction of a growth-limiting biological length scale is then shown to yield realistic length-increment distributions.
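
The transformation from a size-at-age to a size-increment model can be sketched for one common Gompertz parameterization, L(t) = L∞·exp(−b·e^(−kt)), which gives L(t + Δ) = L∞·(L(t)/L∞)^exp(−kΔ) with age eliminated. The abstract does not state the parameterization the authors used, so this form, the function name, and the numbers below are assumptions. The example also shows how negative increments arise whenever an individual's asymptotic length falls below its current length, which is the difficulty the abstract describes.

```python
import math


def gompertz_increment(length, dt, l_inf, k):
    """Expected growth increment over dt for an animal of a given length.

    Uses L(t + dt) = l_inf * (L(t) / l_inf) ** exp(-k * dt), obtained from
    the Gompertz curve L(t) = l_inf * exp(-b * exp(-k * t)) by eliminating
    age (an assumed parameterization, for illustration).
    """
    if length <= 0 or l_inf <= 0:
        raise ValueError("lengths must be positive")
    return l_inf * (length / l_inf) ** math.exp(-k * dt) - length


# Hypothetical values: the same 110 mm animal under two asymptotic lengths.
print(gompertz_increment(110.0, 1.0, 140.0, 0.5))  # l_inf above current length
print(gompertz_increment(110.0, 1.0, 100.0, 0.5))  # l_inf below current length
```

If L∞ is drawn from a distribution that is not conditioned on the observed size, some draws fall below the current length and the second case occurs, giving a negative increment.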

12.
We present a novel and straightforward method for estimating recent migration rates between discrete populations using multilocus genotype data. The approach builds upon a two-step sampling design, where individual genotypes are sampled before and after dispersal. We develop a model that estimates all pairwise backwards migration rates (m_ij, the probability that an individual sampled in population i is a migrant from population j) between a set of populations. The method is validated with simulated data and compared with the methods of BayesAss and Structure. First, we use data for an island model and then we consider more realistic data simulations for a metapopulation of the greater white-toothed shrew (Crocidura russula). We show that the precision and bias of estimates primarily depend upon the proportion of individuals sampled in each population. Weak sampling designs may particularly affect the quality of the coverage provided by 95% highest posterior density intervals. We further show that the method is relatively insensitive to the number of loci sampled and the overall strength of genetic structure. The method can easily be extended and makes fewer assumptions about the underlying demographic and genetic processes than currently available methods. It allows backwards migration rates to be estimated across a wide range of realistic conditions.

13.
King R, Brooks SP, Coulson T. Biometrics, 2008, 64(4): 1187-1195
SUMMARY: We consider the issue of analyzing complex ecological data in the presence of covariate information and model uncertainty. Several issues can arise when analyzing such data, not least the need to account for missing covariate values. This is most acutely observed in the presence of time-varying covariates. We consider mark-recapture-recovery data, where the corresponding recapture probabilities are less than unity, so that individuals are not always observed at each capture event. This often leads to a large amount of missing time-varying individual covariate information, because the covariate cannot usually be recorded if an individual is not observed. In addition, we address the problem of model selection over these covariates with missing data. We consider a Bayesian approach, where we are able to deal with large amounts of missing data, by essentially treating the missing values as auxiliary variables. This approach also allows a quantitative comparison of different models via posterior model probabilities, obtained via the reversible jump Markov chain Monte Carlo algorithm. To demonstrate this approach we analyze data relating to Soay sheep, which pose several statistical challenges in fully describing the intricacies of the system.

14.
Functioning as an "address tag" or "zip code" that guides nascent proteins (newly synthesized proteins in the cytosol) to wherever they are needed, signal peptides (also called targeting signals or signal sequences) have become a crucial tool in finding new drugs or reprogramming cells for gene therapy. To use such a tool effectively and in a timely manner, however, the first important thing is to develop an automated method for quickly and accurately identifying the signal peptide for a given nascent protein. With the avalanche of new protein sequences generated in the post-genomic era, the challenge has become even more urgent and critical. In this paper, five statistical rulers were derived by performing a mutual information analysis. By combining these statistical rulers, a new prediction algorithm was established and high success prediction rates were observed. The new algorithm may play a complementary role to the existing algorithms in this area. It is anticipated that the mutual information approach introduced here may be very useful for studying many other sequence-coupling problems in molecular biology as well.
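
The abstract does not define its five "statistical rulers", so the sketch below shows only the generic quantity they are said to derive from: the mutual information, in bits, between the residues observed at two positions across a set of aligned sequences. The function name and the toy columns are hypothetical.

```python
import math
from collections import Counter


def mutual_information(column_a, column_b):
    """Mutual information (bits) between two equally long residue columns.

    Each column holds the residue seen at one alignment position, one
    entry per sequence. Probabilities are plain empirical frequencies.
    """
    n = len(column_a)
    if n == 0 or n != len(column_b):
        raise ValueError("columns must be non-empty and of equal length")
    count_a = Counter(column_a)
    count_b = Counter(column_b)
    count_ab = Counter(zip(column_a, column_b))
    total = 0.0
    for (a, b), joint in count_ab.items():
        # p(a, b) / (p(a) * p(b)) simplifies to joint * n / (count_a * count_b)
        total += (joint / n) * math.log2(joint * n / (count_a[a] * count_b[b]))
    return total


# Toy columns: perfectly coupled positions versus independent positions.
print(mutual_information("AALL", "AALL"))
print(mutual_information("AALL", "ALAL"))
```

High values flag position pairs whose residues co-vary, which is the sense in which mutual information is used for sequence-coupling problems.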

15.
16.
Large-scale annotation efforts typically involve several experts who may disagree with each other. We propose an approach for modeling disagreements among experts that allows providing each annotation with a confidence value (i.e., the posterior probability that it is correct). Our approach allows computing a certainty level for individual annotations, given annotator-specific parameters estimated from data. We developed two probabilistic models for performing this analysis, compared these models using computer simulation, and tested each model's actual performance, based on a large data set generated by human annotators specifically for this study. We show that even in the worst-case scenario, when all annotators disagree, our approach allows us to significantly increase the probability of choosing the correct annotation. Along with this publication we make publicly available a corpus of 10,000 sentences annotated according to several cardinal dimensions that we have introduced in earlier work. The 10,000 sentences were all 3-fold annotated by a group of eight experts, while a 1,000-sentence subset was further 5-fold annotated by five new experts. While the presented data represent a specialized curation task, our modeling approach is general; most data annotation studies could benefit from our methodology.
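
The idea of attaching a posterior probability to an annotation can be illustrated with a deliberately simplified model: a binary label, annotators who err independently, and one known accuracy per annotator. The abstract's two probabilistic models are not described in detail, so this is not a reconstruction of either; the function name, the accuracies, and the votes are hypothetical.

```python
def posterior_label_is_one(votes, accuracies, prior=0.5):
    """Posterior probability that the true binary label is 1.

    votes: list of 0/1 labels, one per annotator.
    accuracies: probability that each annotator reports the true label
    (assumed known and independent across annotators).
    """
    if len(votes) != len(accuracies):
        raise ValueError("need one accuracy per vote")
    weight_one = prior
    weight_zero = 1.0 - prior
    for vote, accuracy in zip(votes, accuracies):
        weight_one *= accuracy if vote == 1 else 1.0 - accuracy
        weight_zero *= accuracy if vote == 0 else 1.0 - accuracy
    return weight_one / (weight_one + weight_zero)


# Two annotators say 1, one says 0; the most reliable annotator voted 1.
print(posterior_label_is_one([1, 1, 0], [0.9, 0.6, 0.7]))
```

Weighting votes by annotator reliability, rather than counting them equally, is what lets such a model do better than a simple majority when annotators disagree.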

17.
Zinc is an essential metal for all eukaryotes, and cells have evolved a complex system of proteins to maintain the precise balance of zinc uptake, intracellular storage, and efflux. In mammals, zinc uptake appears to be mediated by members of the Zrt/Irt-like protein (ZIP) superfamily of metal ion transporters. Herein, we have studied a subfamily of zip genes (zip1, zip2, and zip3) that is conserved in mice and humans. These eight-transmembrane domain proteins contain a conserved 12-amino acid signature sequence within the fourth transmembrane domain. All three of these mouse ZIP proteins function to specifically increase the uptake of zinc in transfected cultured cells, similar to the previously demonstrated functions of human ZIP1 and ZIP2 (Gaither, L. A., and Eide, D. J. (2000) J. Biol. Chem. 275, 5560-5564; Gaither, L. A., and Eide, D. J. (2001) J. Biol. Chem. 276, 22258-22264). No ZIP3 orthologs have been previously studied. Furthermore, this first systematic comparative study of the in vivo expression and dietary zinc regulation of this subfamily of zip genes revealed that 1) zip1 mRNA is abundant in many mouse tissues, whereas zip2 and zip3 mRNAs are very rare or moderately rare, respectively, and tissue-restricted in their accumulation; and 2) unlike mouse metallothionein I and zip4 mRNAs (Dufner-Beattie, J., Wang, F., Kuo, Y.-M., Gitschier, J., Eide, D., and Andrews, G. K. (2003) J. Biol. Chem. 278, 33474-33481), the abundance of zip1, zip2, and zip3 mRNAs is not regulated by dietary zinc in the intestine and visceral endoderm, tissues involved in nutrient absorption. These studies suggest that all three of these ZIP proteins may play cell-specific roles in zinc homeostasis rather than primary roles in the acquisition of dietary zinc.

18.
The investigation of individual heterogeneity in vital rates has recently received growing attention among population ecologists. Individual heterogeneity in wild animal populations has been accounted for and quantified by including individually varying effects in models for mark–recapture data, but the real need for underlying individual effects to account for observed levels of individual variation has recently been questioned by the work of Tuljapurkar et al. (Ecology Letters, 12, 93, 2009) on dynamic heterogeneity. Model‐selection approaches based on information criteria or Bayes factors have been used to address this question. Here, we suggest that, in addition to model‐selection, model‐checking methods can provide additional important insights to tackle this issue, as they allow one to evaluate a model's misfit in terms of ecologically meaningful measures. Specifically, we propose the use of posterior predictive checks to explicitly assess discrepancies between a model and the data, and we explain how to incorporate model checking into the inferential process used to assess the practical implications of ignoring individual heterogeneity. Posterior predictive checking is a straightforward and flexible approach for performing model checks in a Bayesian framework that is based on comparisons of observed data to model‐generated replications of the data, where parameter uncertainty is incorporated through use of the posterior distribution. If discrepancy measures are chosen carefully and are relevant to the scientific context, posterior predictive checks can provide important information allowing for more efficient model refinement. We illustrate this approach using analyses of vital rates with long‐term mark–recapture data for Weddell seals and emphasize its utility for identifying shortfalls or successes of a model at representing a biological process or pattern of interest.
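
A posterior predictive check can be sketched on a toy problem that is far simpler than the mark–recapture models in the abstract. Assume each individual has a count of successes out of a fixed number of trials, fit a model with one shared success probability (Beta(1, 1) prior), and ask whether replicated data show as much among-individual variance as the observed data. The data, the discrepancy measure, and the function name are hypothetical choices made for illustration.

```python
import random
import statistics


def posterior_predictive_p_value(successes, trials, draws=2000, seed=1):
    """Share of replicated data sets whose among-individual variance is at
    least as large as the observed variance.

    Model being checked: every individual shares one success probability p
    with a Beta(1, 1) prior, so the posterior of p is a Beta distribution.
    """
    rng = random.Random(seed)
    total_successes = sum(successes)
    total_trials = trials * len(successes)
    observed = statistics.pvariance(successes)
    exceed = 0
    for _ in range(draws):
        # Draw p from its posterior, then simulate one replicate data set.
        p = rng.betavariate(1 + total_successes, 1 + total_trials - total_successes)
        replicate = [sum(rng.random() < p for _ in range(trials)) for _ in successes]
        if statistics.pvariance(replicate) >= observed:
            exceed += 1
    return exceed / draws


# Strongly heterogeneous toy data: the homogeneous model should fit poorly.
print(posterior_predictive_p_value([0, 0, 0, 10, 10, 10], trials=10))
```

A value near 0 says the homogeneous model almost never reproduces the observed spread, which is the kind of ecologically interpretable misfit the abstract argues model checking can reveal.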

19.
Population dynamics are functions of several demographic processes including survival, reproduction, somatic growth, and maturation. The rates or probabilities for these processes can vary by time, by location, and by individual. These processes can co‐vary and interact to varying degrees, e.g., an animal can only reproduce when it is in a particular maturation state. Population dynamics models that treat the processes as independent may yield somewhat biased or imprecise parameter estimates, as well as predictions of population abundances or densities. However, commonly used integral projection models (IPMs) typically assume independence across these demographic processes. We examine several approaches for modelling between‐process dependence in IPMs and include cases where the processes co‐vary as a function of time (temporal variation), co‐vary within each individual (individual heterogeneity), and combinations of these (temporal variation and individual heterogeneity). We compare our methods to conventional IPMs, which treat vital rates as independent, using simulations and a case study of Soay sheep (Ovis aries). In particular, our results indicate that correlation between vital rates can moderately affect variability of some population‐level statistics. Therefore, including such dependent structures is generally advisable when fitting IPMs to ascertain whether or not such between‐vital‐rate dependencies exist, which in turn can have subsequent impact on population management or life‐history evolution.

20.
Increased environmental stochasticity due to climate change will intensify temporal variance in the life‐history traits, and especially breeding probabilities, of long‐lived iteroparous species. These changes may decrease individual fitness and population viability and are therefore important to monitor. In wild animal populations with imperfect individual detection, breeding probabilities are best estimated using capture–recapture methods. However, in many vertebrate species (e.g., amphibians, turtles, seabirds), nonbreeders are unobservable because they are not tied to a territory or breeding location. Although unobservable states can be used to model temporary emigration of nonbreeders, there are disadvantages to having unobservable states in capture–recapture models. The best solution to deal with unobservable life‐history states is therefore to eliminate them altogether. Here, we achieve this objective by fitting novel multievent‐robust design models which utilize information obtained from multiple surveys conducted throughout the year. We use this approach to estimate annual breeding probabilities of capital breeding female elephant seals (Mirounga leonina). Conceptually, our approach parallels a multistate version of the Barker/robust design in that it combines robust design capture data collected during discrete breeding seasons with observations made at other times of the year. A substantial advantage of our approach is that the nonbreeder state became “observable” when multiple data sources were analyzed together. This allowed us to test for the existence of state‐dependent survival (with some support found for lower survival in breeders compared to nonbreeders), and to estimate annual breeding transitions to and from the nonbreeder state with greater precision (where current breeders tended to have higher future breeding probabilities than nonbreeders). 
We used program E‐SURGE (2.1.2) to fit the multievent‐robust design models, with uncertainty in breeding state assignment (breeder, nonbreeder) being incorporated via a hidden Markov process. This flexible modeling approach can easily be adapted to suit sampling designs from numerous species which may be encountered during and outside of discrete breeding seasons.
