Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
Joly P  Commenges D 《Biometrics》1999,55(3):887-890
We consider the estimation of the intensity and survival functions for a continuous time progressive three-state semi-Markov model with intermittently observed data. The estimator of the intensity function is defined nonparametrically as the maximum of a penalized likelihood. We thus obtain smooth estimates of the intensity and survival functions. This approach can accommodate complex observation schemes such as truncation and interval censoring. The method is illustrated with a study of hemophiliacs infected by HIV. The intensity functions and the cumulative distribution functions for the time to infection and for the time to AIDS are estimated. Covariates can easily be incorporated into the model.

2.
Sequentially observed survival times are of interest in many studies, but there are difficulties in analyzing such data using nonparametric or semiparametric methods. First, when the duration of follow-up is limited and the times for a given individual are not independent, induced dependent censoring arises for the second and subsequent survival times. Non-identifiability of the marginal survival distributions for second and later times is another issue, since they are observable only if preceding survival times for an individual are uncensored. In addition, in some studies a significant proportion of individuals may never have the first event. Fully parametric models can deal with these features, but robustness is a concern. We introduce a new approach to address these issues. We model the joint distribution of the successive survival times by using copula functions, and provide semiparametric estimation procedures in which copula parameters are estimated without parametric assumptions on the marginal distributions. This provides more robust estimates and checks on the fit of parametric models. The methodology is applied to a motivating example involving relapse and survival following colon cancer treatment.

3.
FRYDMAN  HALINA 《Biometrika》1995,82(4):773-789
The nonparametric estimation of the cumulative transition intensity functions in a three-state time-nonhomogeneous Markov process with irreversible transitions, an ‘illness-death’ model, is considered when times of the intermediate transition, e.g. onset of a disease, are interval-censored. The times of ‘death’ are assumed to be known exactly or to be right-censored. In addition the observed process may be left-truncated. Data of this type arise when the process is sampled periodically. For example, when the patients are monitored through periodic examinations the observations on times of change in their disease status will be interval-censored. Under the sampling scheme considered here the Nelson–Aalen estimator (Aalen, 1978) for a cumulative transition intensity is not applicable. In the proposed method the maximum likelihood estimators of some of the transition intensities are derived from the estimators of the corresponding subdistribution functions. The maximum likelihood estimators are shown to have a self-consistency property. The self-consistency algorithm is developed for the computation of the estimators. This approach generalises the results from Turnbull (1976) and Frydman (1992). The methods are illustrated with diabetes survival data.
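
The self-consistency idea mentioned in this abstract can be illustrated on the simpler univariate interval-censored problem treated by Turnbull (1976). The sketch below is not the illness-death estimator of the paper: the function name `self_consistent_masses`, the caller-supplied support points, and the uniform starting value are all assumptions made for illustration.

```python
def self_consistent_masses(intervals, support, tol=1e-10, max_iter=10_000):
    """Self-consistency (EM) iteration for interval-censored event times.

    intervals: list of (left, right] pairs, each known to contain one event time.
    support:   candidate support points for the estimated distribution.
    Returns the probability mass placed on each support point.
    Hypothetical helper for illustration; not taken from the paper.
    """
    # alpha[i][j] is 1 when support point j is compatible with observation i.
    alpha = [[1.0 if left < s <= right else 0.0 for s in support]
             for left, right in intervals]
    if any(sum(row) == 0 for row in alpha):
        raise ValueError("every interval must contain at least one support point")

    n, m = len(intervals), len(support)
    p = [1.0 / m] * m  # assumed starting value: uniform masses
    for _ in range(max_iter):
        new_p = [0.0] * m
        for row in alpha:
            # Probability currently assigned to this observation's interval.
            denom = sum(a * pj for a, pj in zip(row, p))
            for j in range(m):
                # Share of observation i attributed to support point j.
                new_p[j] += row[j] * p[j] / denom / n
        if max(abs(a - b) for a, b in zip(new_p, p)) < tol:
            return new_p
        p = new_p
    return p
```

A fixed point of this update is "self-consistent" in the sense that each mass equals the average posterior share of the observations assigned to that point.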

4.
Capture-recapture models were developed to estimate survival using data arising from marking and monitoring wild animals over time. Variation in survival may be explained by incorporating relevant covariates. We propose nonparametric and semiparametric regression methods for estimating survival in capture-recapture models. A fully Bayesian approach using Markov chain Monte Carlo simulations was employed to estimate the model parameters. The work is illustrated by a study of Snow petrels, in which survival probabilities are expressed as nonlinear functions of a climate covariate, using data from a 40-year study on marked individuals, nesting at Petrels Island, Terre Adélie.

5.
Several studies have reported that interactions of mothers with preterm infants show different characteristics compared to those of mothers with full-term infants. Interaction of preterm dyads is often reported as less harmonious. However, observations and explanations concerning the underlying mechanisms are inconsistent. In this work 30 preterm and 42 full-term mother-infant dyads were observed at one year of age. Free play interactions were videotaped and coded using a micro-analytic coding system. The video records were coded at one-second resolution and studied by a novel approach using network analysis tools. The advantage of our approach is that it reveals the patterns of behavioral transitions in the interactions. We found that the most frequent behavioral transitions are the same in the two groups. However, we have identified several high- and lower-frequency transitions which occur significantly more often in the preterm or full-term group. Our analysis also suggests that the variability of behavioral transitions is significantly higher in the preterm group. This higher variability results mostly from the diversity of transitions involving non-harmonious behaviors. We have identified a maladaptive pattern in the maternal behavior in the preterm group, involving intrusiveness and disengagement. Application of the approach reported in this paper to longitudinal data could elucidate whether these maladaptive maternal behavioral changes place the infant at risk for later emotional, cognitive and behavioral disturbance.

6.
Tandem mass spectrometry (MS/MS) has emerged as a cornerstone of proteomics owing in part to robust spectral interpretation algorithms. Widely used algorithms do not fully exploit the intensity patterns present in mass spectra. Here, we demonstrate that intensity pattern modeling improves peptide and protein identification from MS/MS spectra. We modeled fragment ion intensities using a machine-learning approach that estimates the likelihood of observed intensities given peptide and fragment attributes. From 1,000,000 spectra, we chose 27,000 with high-quality, nonredundant matches as training data. Using the same 27,000 spectra, intensity was similarly modeled with mismatched peptides. We used these two probabilistic models to compute the relative likelihood of an observed spectrum given that a candidate peptide is matched or mismatched. We used a 'decoy' proteome approach to estimate incorrect match frequency, and demonstrated that an intensity-based method reduces peptide identification error by 50-96% without any loss in sensitivity.

7.
8.
We propose an estimating function for parameters in a model for Poisson process intensity when time- or space-varying covariates are observed for both the events of the process and at sample times or locations selected from a probability-based sampling design. We investigate the large-sample properties of the proposed estimator under increasing domain asymptotics, demonstrating that it is consistent and asymptotically normally distributed. We illustrate our approach using data from an ecological momentary assessment of smoking.

9.
The kinetics and thermodynamics of protein folding are investigated using low-friction Langevin simulation of a minimal continuum model of proteins. We show that the model protein has two characteristic temperatures: (a) Tθ, at which the chain undergoes a collapse transition from an extended conformation; (b) Tf(< Tθ), at which a finite size first-order transition to the folded state takes place. The kinetics of approach to the native state from initially denatured conformations is probed by several novel correlation functions. We find that the overall kinetics of approach to the native conformation occurs via a three-stage multiple pathway mechanism. The initial stage, characterized by a series of local dihedral angle transitions, eventually results in the compaction of the protein. Subsequently, the molecule acquires native-like structures during the second stage of folding. The final stage of folding involves activated transitions from one of the native-like structures to the native conformation. The first two stages are characterized by a multiplicity of pathways while relatively few paths are involved in the final stage. A detailed analysis of the dynamics of individual trajectories reveals a novel picture of protein folding. We find that a fraction of the initial population reaches the native conformation without the formation of any detectable intermediates. This pathway is associated with a nucleation mechanism, i.e., once a critical number of tertiary contacts is established, the native state is reached rapidly. The remaining fraction of molecules becomes trapped in misfolded structures (stabilized by incorrect tertiary contacts). The slow folding involves transitions over barriers from these structures to the native conformation. The theoretical predictions are compared with recent experiments that probe protein folding kinetics by the hydrogen exchange labeling technique. © 1995 John Wiley & Sons, Inc.

10.
Coexistence in ecological communities is governed largely by the nature and intensity of species interactions. Countless studies have proposed methods to infer these interactions from empirical data, yet models parameterised using such data often fail to recover observed coexistence patterns. Here, we propose a method to reconcile empirical parameterisations of community dynamics with species‐abundance data, ensuring that the predicted equilibrium is consistent with the observed abundance distribution. To illustrate the approach, we explore two case studies: an experimental freshwater algal community and a long‐term time series of displacement in an intertidal community. We demonstrate how our method helps recover observed coexistence patterns, capture the core dynamics of the system, and, in the latter case, predict the impacts of experimental extinctions. Collectively, these results demonstrate an intuitive approach for reconciling observed and empirical data, improving our ability to explore the links between species interactions and coexistence in natural systems.

11.

Background

A tremendous amount of effort has been devoted to identifying genes for diagnosis and prognosis of diseases using microarray gene expression data. It has been demonstrated that gene expression data have cluster structure, where the clusters consist of co-regulated genes which tend to have coordinated functions. However, most available statistical methods for gene selection do not take into consideration the cluster structure.

Results

We propose a supervised group Lasso approach that takes into account the cluster structure in gene expression data for gene selection and predictive model building. For gene expression data without biological cluster information, we first divide genes into clusters using the K-means approach and determine the optimal number of clusters using the Gap method. The supervised group Lasso consists of two steps. In the first step, we identify important genes within each cluster using the Lasso method. In the second step, we select important clusters using the group Lasso. Tuning parameters are determined using V-fold cross validation at both steps to allow for further flexibility. Prediction performance is evaluated using leave-one-out cross validation. We apply the proposed method to disease classification and survival analysis with microarray data.

Conclusion

We analyze four microarray data sets using the proposed approach: two cancer data sets with binary cancer occurrence as outcomes and two lymphoma data sets with survival outcomes. The results show that the proposed approach is capable of identifying a small number of influential gene clusters and important genes within those clusters, and has better prediction performance than existing methods.
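
The cluster-selection step described in this abstract relies on the group Lasso penalty, which zeroes out whole groups of coefficients at once. A minimal sketch of the block soft-thresholding operator behind that behaviour follows. It is not the authors' full two-step procedure; the function name `group_soft_threshold` and the square-root-of-group-size weighting are assumptions (the weighting is a common convention, not something the abstract states).

```python
import math


def group_soft_threshold(coefs, groups, lam):
    """Block soft-thresholding: proximal operator of a group Lasso penalty.

    coefs:  list of coefficients (e.g. one per gene).
    groups: list of cluster labels, same length as coefs.
    lam:    penalty level; each group's threshold is lam * sqrt(group size)
            (assumed weighting, chosen for illustration).
    Hypothetical helper; not taken from the paper.
    """
    members = {}
    for idx, label in enumerate(groups):
        members.setdefault(label, []).append(idx)

    shrunk = [0.0] * len(coefs)
    for idxs in members.values():
        norm = math.sqrt(sum(coefs[i] ** 2 for i in idxs))
        threshold = lam * math.sqrt(len(idxs))
        # Groups whose norm falls below the threshold are removed entirely.
        scale = max(0.0, 1.0 - threshold / norm) if norm > 0 else 0.0
        for i in idxs:
            shrunk[i] = scale * coefs[i]
    return shrunk
```

Because the shrinkage factor is shared by every coefficient in a cluster, a weak cluster is dropped as a unit while a strong cluster keeps all its members, only shrunk toward zero.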

12.
In observational studies of survival time featuring a binary time-dependent treatment, the hazard ratio (an instantaneous measure) is often used to represent the treatment effect. However, investigators are often more interested in the difference in survival functions. We propose semiparametric methods to estimate the causal effect of treatment among the treated with respect to survival probability. The objective is to compare post-treatment survival with the survival function that would have been observed in the absence of treatment. For each patient, we compute a prognostic score (based on the pre-treatment death hazard) and a propensity score (based on the treatment hazard). Each treated patient is then matched with an alive, uncensored and not-yet-treated patient with similar prognostic and/or propensity scores. The experience of each treated and matched patient is weighted using a variant of Inverse Probability of Censoring Weighting to account for the impact of censoring. We propose estimators of the treatment-specific survival functions (and their difference), computed through weighted Nelson–Aalen estimators. Closed-form variance estimators are proposed which take into consideration the potential replication of subjects across matched sets. The proposed methods are evaluated through simulation, then applied to estimate the effect of kidney transplantation on survival among end-stage renal disease patients using data from a national organ failure registry.
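
A weighted Nelson–Aalen estimator of the kind mentioned in this abstract can be sketched as follows. The weights are supplied by the caller and stand in for censoring weights; the paper's specific weighting variant, matching scheme, and variance estimators are not reproduced, and the function name `weighted_nelson_aalen` is an assumption.

```python
import math


def weighted_nelson_aalen(times, events, weights):
    """Weighted Nelson-Aalen cumulative hazard and implied survival curve.

    times:   follow-up time for each subject.
    events:  1 if the subject's time is an observed event, 0 if censored.
    weights: positive weight per subject (placeholder for censoring weights).
    Returns a list of (t, cumulative_hazard, survival) at each event time.
    Hypothetical helper; not taken from the paper.
    """
    event_times = sorted({t for t, d in zip(times, events) if d})
    cum_hazard = 0.0
    curve = []
    for u in event_times:
        # Weighted number of events occurring exactly at time u.
        d_w = sum(w for t, d, w in zip(times, events, weights) if d and t == u)
        # Weighted number of subjects still at risk just before time u.
        r_w = sum(w for t, w in zip(times, weights) if t >= u)
        cum_hazard += d_w / r_w
        curve.append((u, cum_hazard, math.exp(-cum_hazard)))
    return curve
```

With all weights equal to one this reduces to the ordinary Nelson–Aalen estimator, and a difference in survival functions is obtained by subtracting the third component of two such curves at a common time point.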

13.
Predicting population dynamics for rare species is of paramount importance for evaluating the likelihood of extinction and planning conservation strategies. However, evaluating and predicting population viability can be hindered by a lack of data. Rare species frequently have small populations, so estimates of vital rates are often very uncertain. We evaluated the vital rates of seven small populations of a common epiphytic orchid, from two watersheds with varying light environments, using Bayesian methods of parameter estimation. From the Lefkovitch matrices we predicted the deterministic population growth rates, elasticities, stable stage distributions and the credible intervals of the statistics. Populations were surveyed monthly for 18–34 months. In some of the populations few or no transitions in some of the vital rates were observed throughout the sampling period; however, we were able to predict the most likely vital rates using a Bayesian model that incorporated the transition rates from the other populations. Asymptotic population growth rate varied among the seven orchid populations. There was little difference in population growth rate among watersheds, even though a difference was expected because of physical differences resulting from differing canopy cover and watershed width. Elasticity analyses of Lepanthes rupestris suggest that growth rate is most sensitive to survival, followed by growth, shrinking and reproductive rates. The Bayesian approach helped to estimate transition probabilities that were uncommon or variable in some populations. Moreover, it increased the precision of the parameter estimates as compared to traditional approaches.
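
The deterministic quantities named in this abstract (asymptotic growth rate, stable stage distribution, elasticities) follow from the dominant eigenstructure of a stage-based projection matrix. The sketch below uses plain power iteration and covers only that deterministic step, not the Bayesian estimation of the vital rates. The function names are assumptions, and the matrix is taken to be nonnegative and primitive so that the iteration converges.

```python
def dominant_eigen(matrix, iters=2000):
    """Power iteration: dominant eigenvalue and eigenvector scaled to sum to 1.

    Assumes a nonnegative, primitive square matrix (list of row lists).
    """
    n = len(matrix)
    vec = [1.0 / n] * n
    lam = 0.0
    for _ in range(iters):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        lam = sum(nxt)  # vec sums to 1, so the total growth equals lambda
        vec = [x / lam for x in nxt]
    return lam, vec


def stage_analysis(matrix):
    """Growth rate, stable stage distribution and elasticity matrix.

    Hypothetical helper; elasticity e_ij = a_ij * v_i * w_j / (lambda * <v, w>),
    with w the stable stage distribution and v the reproductive values.
    """
    n = len(matrix)
    lam, stable = dominant_eigen(matrix)
    transpose = [[matrix[j][i] for j in range(n)] for i in range(n)]
    _, repro = dominant_eigen(transpose)  # left eigenvector of the matrix
    denom = lam * sum(v * w for v, w in zip(repro, stable))
    elasticity = [[matrix[i][j] * repro[i] * stable[j] / denom
                   for j in range(n)] for i in range(n)]
    return lam, stable, elasticity
```

Elasticities computed this way sum to one across the whole matrix, which makes them directly comparable between transitions such as survival, growth, shrinking and reproduction.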

14.
Resource selection functions (RSFs) are typically estimated by comparing covariates at a discrete set of “used” locations to those from an “available” set of locations. This RSF approach treats the response as binary and does not account for intensity of use among habitat units where locations were recorded. Advances in global positioning system (GPS) technology allow animal location data to be collected at fine spatiotemporal scales and have increased the size and correlation of data used in RSF analyses. We suggest that a more contemporary approach to analyzing such data is to model intensity of use, which can be estimated for one or more animals by relating the relative frequency of locations in a set of sampling units to the habitat characteristics of those units with count‐based regression and, in particular, negative binomial (NB) regression. We demonstrate this NB RSF approach with location data collected from 10 GPS‐collared Rocky Mountain elk (Cervus elaphus) in the Starkey Experimental Forest and Range enclosure. We discuss modeling assumptions and show how RSF estimation with NB regression can easily accommodate contemporary research needs, including: analysis of large GPS data sets, computational ease, accounting for among‐animal variation, and interpretation of model covariates. We recommend the NB approach because of its conceptual and computational simplicity, and the fact that estimates of intensity of use are unbiased in the face of temporally correlated animal location data.

15.
16.
In many cell types, the inositol trisphosphate receptor is one of the important components controlling intracellular calcium dynamics, and an understanding of this receptor is necessary for an understanding of calcium oscillations and waves. Based on single-channel data from the type-I inositol trisphosphate receptor, and using a Markov chain Monte Carlo approach, we show that the most complex time-dependent model that can be unambiguously determined from steady-state data is one with three closed states and one open state, and we determine how the rate constants depend on calcium. Because the transitions between these states are complex functions of calcium concentration, each model state must correspond to a group of physical states. We fit two different topologies and find that both models predict that the main effect of [Ca2+] is to modulate the probability that the receptor is in a state that is able to open, rather than to modulate the transition rate to the open state.

17.
We develop an approach, based on multiple imputation, to using auxiliary variables to recover information from censored observations in survival analysis. We apply the approach to data from an AIDS clinical trial comparing ZDV and placebo, in which CD4 count is the time-dependent auxiliary variable. To facilitate imputation, a joint model is developed for the data, which includes a hierarchical change-point model for CD4 counts and a time-dependent proportional hazards model for the time to AIDS. Markov chain Monte Carlo methods are used to multiply impute event times for censored cases. The augmented data are then analyzed and the results combined using standard multiple-imputation techniques. A comparison of our multiple-imputation approach to simply analyzing the observed data indicates that multiple imputation leads to a small change in the estimated effect of ZDV and smaller estimated standard errors. A sensitivity analysis suggests that the qualitative findings are reproducible under a variety of imputation models. A simulation study indicates that improved efficiency over standard analyses and partial corrections for dependent censoring can result. An issue that arises with our approach, however, is whether the analysis of primary interest and the imputation model are compatible.
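
The abstract says that results from the imputed data sets are combined "using standard multiple-imputation techniques" without naming them. Assuming this refers to Rubin's combining rules, the pooling step can be sketched as below; the function name `pool_rubin` is hypothetical, and the imputation model itself (the joint CD4 and time-to-AIDS model) is not reproduced.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation results with Rubin's rules.

    estimates: point estimate from each imputed data set (at least two).
    variances: squared standard error from each imputed data set.
    Returns (pooled_estimate, total_variance).
    Hypothetical helper; assumes Rubin's rules are the intended technique.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance, divisor m - 1
    total = within + (1 + 1 / m) * between
    return pooled, total
```

The between-imputation term is what carries the extra uncertainty due to the censored event times being imputed rather than observed.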

18.
We present a novel decomposition of nonnegative functional count data that draws on concepts from nonnegative matrix factorization. Our decomposition, which we refer to as NARFD (nonnegative and regularized function decomposition), enables the study of patterns in variation across subjects in a highly interpretable manner. Prototypic modes of variation are estimated directly on the observed scale of the data, are local, and are transparently added together to reconstruct observed functions. This contrasts with generalized functional principal component analysis, an alternative approach that estimates functional principal components on a transformed scale, produces components that typically vary across the entire functional domain, and reconstructs observations using complex patterns of cancellation and multiplication of functional principal components. NARFD is implemented using an alternating minimization algorithm, and we evaluate our approach in simulations. We apply NARFD to an accelerometer dataset comprising observations of physical activity for healthy older Americans.

19.
Negatively skewed data arise occasionally in statistical practice; perhaps the most familiar example is the distribution of human longevity. Although other generalizations of the normal distribution exist, we demonstrate a new alternative that apparently fits human longevity data better. We propose a normal distribution whose scale parameter is conditioned on attained age. This approach is consistent with previous findings that longevity conditioned on survival to the modal age behaves like a normal distribution. We derive such a distribution and demonstrate its accuracy in modeling human longevity data from life tables. The new distribution is characterized by (1) an intuitively straightforward genesis; (2) closed forms for the pdf, cdf, mode, quantile, and hazard functions; and (3) accessibility to non-statisticians, based on its close relationship to the normal distribution.

20.
Studies of wild vertebrates have provided evidence of substantial differences in lifetime reproduction among individuals and the sequences of life history ‘states’ during life (breeding, nonbreeding, etc.). Such differences may reflect ‘fixed’ differences in fitness components among individuals determined before, or at the onset of reproductive life. Many retrospective life history studies have translated this idea by assuming a ‘latent’ unobserved heterogeneity resulting in a fixed hierarchy among individuals in fitness components. Alternatively, fixed differences among individuals are not necessarily needed to account for observed levels of individual heterogeneity in life histories. Individuals with identical fitness traits may stochastically experience different outcomes for breeding and survival through life that lead to a diversity of ‘state’ sequences with some individuals living longer and being more productive than others, by chance alone. The question is whether individuals differ in their underlying fitness components in ways that cannot be explained by observable ‘states’ such as age, previous breeding success, etc. Here, we compare statistical models that represent these opposing hypotheses, and mixtures of them, using data from kittiwakes. We constructed models that accounted for observed covariates, individual random effects (unobserved heterogeneity), first‐order Markovian transitions between observed states, or combinations of these features. We show that individual sequences of states are better accounted for by models incorporating unobserved heterogeneity than by models including first‐order Markov processes alone, or a combination of both. If we had not considered individual heterogeneity, models including Markovian transitions would have been the best performing ones. We also show that inference about age‐related changes in fitness components is sensitive to incorporation of underlying individual heterogeneity in models. 
Our approach provides insight into the sources of individual heterogeneity in life histories, and can be applied to other data sets to examine the ubiquity of our results across the tree of life.
