Similar articles
20 similar articles found
1.
Capturing complex dependence structures between outcome variables (e.g., study endpoints) is of high relevance in contemporary biomedical data problems and medical research. Distributional copula regression provides a flexible tool to model the joint distribution of multiple outcome variables by disentangling the marginal response distributions and their dependence structure. In a regression setup, each parameter of the copula model, that is, the marginal distribution parameters and the copula dependence parameters, can be related to covariates via structured additive predictors. We propose a framework to fit distributional copula regression via model-based boosting, which is a modern estimation technique that incorporates useful features like an intrinsic variable selection mechanism, parameter shrinkage and the capability to fit regression models in high-dimensional data settings, that is, situations with more covariates than observations. Thus, model-based boosting not only complements existing Bayesian and maximum-likelihood based estimation frameworks for this model class but also enables unique intrinsic mechanisms that can be helpful in many applied problems. The performance of our boosting algorithm for copula regression models with continuous margins is evaluated in simulation studies that cover low- and high-dimensional data settings and situations with and without dependence between the responses. Moreover, distributional copula boosting is used to jointly analyze and predict the length and the weight of newborns conditional on sonographic measurements of the fetus before delivery together with other clinical variables.

2.
Neurons encounter unavoidable evolutionary trade-offs between multiple tasks. They must consume as little energy as possible while effectively fulfilling their functions. Cells displaying the best performance for such multi-task trade-offs are said to be Pareto optimal, with their ion channel configurations underpinning their functionality. Ion channel degeneracy, however, implies that multiple ion channel configurations can lead to functionally similar behaviour. Therefore, instead of a single model, neuroscientists often use populations of models with distinct combinations of ionic conductances. This approach is called population (database or ensemble) modelling. It remains unclear which ion channel parameters in the vast population of functional models are more likely to be found in the brain. Here we argue that Pareto optimality can serve as a guiding principle for addressing this issue by helping to identify the subpopulations of conductance-based models that perform best for the trade-off between economy and functionality. In this way, the high-dimensional parameter space of neuronal models might be reduced to geometrically simple low-dimensional manifolds, potentially explaining experimentally observed ion channel correlations. Conversely, Pareto inference might also help deduce neuronal functions from high-dimensional Patch-seq data. In summary, Pareto optimality is a promising framework for improving population modelling of neurons and their circuits.
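As a rough illustration of what selecting a Pareto-optimal subpopulation means, the sketch below filters a small population of candidate models down to the non-dominated ones. It is not taken from the abstract: the two objectives (an energy cost and a functional error, both to be minimised), the function names `dominates` and `pareto_front`, and the example scores are all assumptions made for illustration.

```python
from typing import List, Tuple

# Each candidate model is scored on two costs to minimise:
# (energy_cost, functional_error). Both objectives are illustrative assumptions.
Score = Tuple[float, float]


def dominates(a: Score, b: Score) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(scores: List[Score]) -> List[int]:
    """Return the indices of the non-dominated candidates (simple O(n^2) scan)."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


if __name__ == "__main__":
    # Hypothetical scores for five candidate conductance configurations.
    population = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.5, 3.5)]
    print(pareto_front(population))  # prints [0, 1, 3]
```

Candidates 2 and 4 are dropped because candidate 1 is at least as good on both objectives. The quadratic scan is fine for small populations; large model databases would likely need a sort-based method.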

3.
Kong M, Lee JJ. Biometrics 2006, 62(4): 986-995
When multiple drugs are administered simultaneously, investigators are often interested in assessing whether the drug combinations are synergistic, additive, or antagonistic. Based on the Loewe additivity reference model, many existing response surface models require constant relative potency and some of them use a single parameter to capture synergy, additivity, or antagonism. However, the assumption of constant relative potency is too restrictive, and these models using a single parameter to capture drug interaction are inadequate to describe the phenomenon when synergy, additivity, and antagonism are interspersed in different regions of drug combinations. We propose a generalized response surface model with a function of doses instead of one single parameter to identify and quantify departure from additivity. The proposed model can incorporate varying relative potencies among multiple drugs as well. Examples and simulations are given to demonstrate that the proposed model is effective in capturing different patterns of drug interaction.
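For readers unfamiliar with the Loewe additivity reference model, its standard two-drug form can be written as below. The notation is an assumption chosen for illustration and is not taken from the abstract: $d_1, d_2$ are the doses in a combination that produces effect $y$, and $D_{y,1}, D_{y,2}$ are the doses of each drug alone that produce the same effect.

```latex
% Loewe additivity: the combination (d_1, d_2) is additive at effect level y when
\[
  \frac{d_1}{D_{y,1}} + \frac{d_2}{D_{y,2}} = 1 .
\]
% The left-hand side is commonly called the interaction index:
%   < 1 indicates synergy, = 1 additivity, > 1 antagonism.
```

A model with a single interaction parameter fixes the direction of departure from 1 across the whole dose plane. As the abstract describes, the proposed model instead lets the departure depend on the doses; its exact functional form is not given here.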

4.
The immune response to viral infection is regulated by an intricate network of many genes and their products. The reverse engineering of gene regulatory networks (GRNs) using mathematical models from time course gene expression data collected after influenza infection is key to our understanding of the mechanisms involved in controlling influenza infection within a host. A five-step pipeline (detection of temporally differentially expressed genes, clustering genes into co-expressed modules, identification of network structure, parameter estimate refinement, and functional enrichment analysis) is developed for reconstructing high-dimensional dynamic GRNs from genome-wide time course gene expression data. Applying the pipeline to the time course gene expression data from influenza-infected mouse lungs, we have identified 20 distinct temporal expression patterns in the differentially expressed genes and constructed a module-based dynamic network using a linear ODE model. Both intra-module and inter-module annotations and regulatory relationships of our inferred network show some interesting findings and are highly consistent with existing knowledge about the immune response in mice after influenza infection. The proposed method is a computationally efficient, data-driven pipeline bridging experimental data, mathematical modeling, and statistical analysis. The application to the influenza infection data elucidates the potential of our pipeline in providing valuable insights into systematic modeling of complicated biological processes.

5.
Houseman EA, Marsit C, Karagas M, Ryan LM. Biometrics 2007, 63(4): 1269-1277
Increasingly used in health-related applications, latent variable models provide an appealing framework for handling high-dimensional exposure and response data. Item response theory (IRT) models, which have gained widespread popularity, were originally developed for use in the context of educational testing, where extremely large sample sizes permitted the estimation of a moderate-to-large number of parameters. In the context of public health applications, smaller sample sizes preclude large parameter spaces. Therefore, we propose a penalized likelihood approach to reduce mean square error and improve numerical stability. We present a continuous family of models, indexed by a tuning parameter, that ranges between the Rasch model and the IRT model. The tuning parameter is selected by cross validation or approximations such as the Akaike Information Criterion. While our approach can be placed easily in a Bayesian context, we find that our frequentist approach is more computationally efficient. We demonstrate our methodology on a study of methylation silencing of gene expression in bladder tumors. We obtain similar results using both frequentist and Bayesian approaches, although the frequentist approach is less computationally demanding. In particular, we find high correlation of methylation silencing among 16 loci in bladder tumors, and that methylation is associated with smoking and also with patient survival.

6.
Classification tree models are flexible analysis tools which have the ability to evaluate interactions among predictors as well as generate predictions for responses of interest. We describe Bayesian analysis of a specific class of tree models in which binary response data arise from a retrospective case-control design. We are also particularly interested in problems with potentially very many candidate predictors. This scenario is common in studies concerning gene expression data, which is a key motivating example context. Innovations here include the introduction of tree models that explicitly address and incorporate the retrospective design, and the use of nonparametric Bayesian models involving Dirichlet process priors on the distributions of predictor variables. The model specification influences the generation of trees through Bayes' factor based tests of association that determine significant binary partitions of nodes during a process of forward generation of trees. We describe this constructive process and discuss questions of generating and combining multiple trees via Bayesian model averaging for prediction. Additional discussion of parameter selection and sensitivity is given in the context of an example which concerns prediction of breast tumour status utilizing high-dimensional gene expression data; the example demonstrates the exploratory/explanatory uses of such models as well as their primary utility in prediction. Shortcomings of the approach and comparison with alternative tree modelling algorithms are also discussed, as are issues of modelling and computational extensions.

7.

Background  

When predictive survival models are built from high-dimensional data, there are often additional covariates, such as clinical scores, that must be included in the final model. While there are several techniques for the fitting of sparse high-dimensional survival models by penalized parameter estimation, none allows for explicit consideration of such mandatory covariates.

8.
Analysis of molecular data promises identification of biomarkers for improving prognostic models, thus potentially enabling better patient management. For identifying such biomarkers, risk prediction models can be employed that link high-dimensional molecular covariate data to a clinical endpoint. In low-dimensional settings, a multitude of statistical techniques already exists for building such models, e.g. allowing for variable selection or for quantifying the added value of a new biomarker. We provide an overview of techniques for regularized estimation that transfer this toward high-dimensional settings, with a focus on models for time-to-event endpoints. Techniques for incorporating specific covariate structure are discussed, as well as techniques for dealing with more complex endpoints. Employing gene expression data from patients with diffuse large B-cell lymphoma, some typical modeling issues from low-dimensional settings are illustrated in a high-dimensional application. First, the performance of classical stepwise regression is compared to stage-wise regression, as implemented by a component-wise likelihood-based boosting approach. A second issue arises when artificially transforming the response into a binary variable. The effects of the resulting loss of efficiency and potential bias in a high-dimensional setting are illustrated, and a link to competing risks models is provided. Finally, we discuss conditions for adequately quantifying the added value of high-dimensional gene expression measurements, both at the stage of model fitting and when performing evaluation.

9.
Generalized estimating equations (Liang and Zeger, 1986) is a widely used, moment-based procedure to estimate marginal regression parameters. However, a subtle and often overlooked point is that valid inference requires the mean for the response at time t to be expressed properly as a function of the complete past, present, and future values of any time-varying covariate. For example, with environmental exposures it may be necessary to express the response as a function of multiple lagged values of the covariate series. Despite the fact that multiple lagged covariates may be predictive of outcomes, researchers often focus interest on parameters in a 'cross-sectional' model, where the response is expressed as a function of a single lag in the covariate series. Cross-sectional models yield parameters with simple interpretations and avoid issues of collinearity associated with multiple lagged values of a covariate. Pepe and Anderson (1994) showed that parameter estimates for time-varying covariates may be biased unless the mean, given all past, present, and future covariate values, is equal to the cross-sectional mean or unless independence estimating equations are used. Although working independence avoids potential bias, many authors have shown that a poor choice for the response correlation model can lead to highly inefficient parameter estimates. The purpose of this paper is to study the bias-efficiency trade-off associated with working correlation choices for application with binary response data. We investigate data characteristics or design features (e.g. cluster size, overall response association, functional form of the response association, covariate distribution, and others) that influence the small and large sample characteristics of parameter estimates obtained from several different weighting schemes or equivalently 'working' covariance models.
We find that the impact of covariance model choice depends highly on the specific structure of the data features, and that key aspects should be examined before choosing a weighting scheme.

10.
We present and analyze a model for the dynamics of the interactions between a pathogen and its host's immune response. The model consists of two differential equations, one for pathogen load, the other one for an index of specific immunity. Unlike other simple models in the literature, this model exhibits, according to the host's or pathogen's parameter values, or to the initial infection size, a rich repertoire of behaviours: immediate clearing of the pathogen through nonspecific immune response; or acute infection followed by clearing of the pathogen through specific immune response; or uncontrolled infections; or acute infection followed by convergence to a stable state of chronic infection; or periodic solutions with intermittent acute infections. The model can also mimic some features of immune response after vaccination. This model could be a basis on which to build epidemic models including immunological features.

11.
Traditional approaches to the problem of parameter estimation in biophysical models of neurons and neural networks usually adopt a global search algorithm (for example, an evolutionary algorithm), often in combination with a local search method (such as gradient descent) in order to minimize the value of a cost function, which measures the discrepancy between various features of the available experimental data and model output. In this study, we approach the problem of parameter estimation in conductance-based models of single neurons from a different perspective. By adopting a hidden-dynamical-systems formalism, we expressed parameter estimation as an inference problem in these systems, which can then be tackled using a range of well-established statistical inference methods. The particular method we used was Kitagawa's self-organizing state-space model, which was applied on a number of Hodgkin-Huxley-type models using simulated or actual electrophysiological data. We showed that the algorithm can be used to estimate a large number of parameters, including maximal conductances, reversal potentials, kinetics of ionic currents, measurement and intrinsic noise, based on low-dimensional experimental data and sufficiently informative priors in the form of pre-defined constraints imposed on model parameters. The algorithm remained operational even when very noisy experimental data were used. Importantly, by combining the self-organizing state-space model with an adaptive sampling algorithm akin to the Covariance Matrix Adaptation Evolution Strategy, we achieved a significant reduction in the variance of parameter estimates. The algorithm did not require the explicit formulation of a cost function and it was straightforward to apply on compartmental models and multiple data sets. 
Overall, the proposed methodology is particularly suitable for resolving high-dimensional inference problems based on noisy electrophysiological data and, therefore, a potentially useful tool in the construction of biophysical neuron models.

12.
It is widely accepted that the primary immune system contains a subpopulation of cells, known as regulatory T cells, whose function is to regulate the immune response. There is conflicting biological evidence regarding the ability of regulatory cells to lose their regulatory capabilities and turn into immune promoting cells. In this paper, we develop mathematical models to investigate the effects of regulatory T cell switching on the immune response. Depending on environmental conditions, regulatory T cells may transition, becoming effector T cells that are immunostimulatory rather than immunoregulatory. We consider this mechanism both in the context of a simple, ordinary differential equation (ODE) model and in the context of a more biologically detailed, delay differential equation (DDE) model of the primary immune response. It is shown that models that incorporate such a mechanism express the usual characteristics of an immune response (expansion, contraction, and memory phases), while being more robust with respect to T cell precursor frequencies. We characterize the effects of regulatory T cell switching on the peak magnitude of the immune response and identify a biologically testable range for the switching parameter. We conclude that regulatory T cell switching may play a key role in controlling immune contraction.

13.
Mechanistic biochemical network models describe the dynamics of intracellular metabolite pools in terms of substance concentrations, stoichiometry and reaction kinetics. Data from stimulus response experiments are currently the most informative source for in-vivo parameter estimation in such models. However, only a part of the parameters of classical enzyme kinetic models can usually be estimated from typical stimulus response data. For this reason, several alternative kinetic formats using different “languages” (e.g. linear, power laws, linlog, generic and convenience) have been proposed to reduce the model complexity. The present contribution takes a rigorous “multi-lingual” approach to data evaluation by translating biochemical network models from one kinetic format into another. For this purpose, a new high-performance algorithm has been developed and tested. Starting with a given model, it replaces as many kinetic terms as possible by alternative expressions while still reproducing the experimental data. Application of the algorithm to a published model for Escherichia coli's sugar metabolism demonstrates the power of the new method. It is shown that model translation is a powerful tool to investigate the information content of stimulus response data and the predictive power of models. Moreover, the local and global approximation capabilities of the models are elucidated and some pitfalls of traditional single model approaches to data evaluation are revealed.

14.
Adaptive evolution is, to a large extent, a complex combinatorial optimization process. Such processes can be characterized as "uphill walks on rugged fitness landscapes". Concrete examples of fitness landscapes include the distribution of any specific functional property such as the capacity to catalyze a specific reaction, or bind a specific ligand, in "protein space". In particular, the property might be the affinity of all possible antibody molecules for a specific antigenic determinant. That affinity landscape presumably plays a critical role in maturation of the immune response. In this process, hypermutation and clonal selection act to select antibody V region mutant variants with successively higher affinity for the immunizing antigen. The actual statistical structure of affinity landscapes, although knowable, is currently unknown. Here, we analyze a class of mathematical models we call NK models. We show that these models capture significant features of the maturation of the immune response, which is currently thought to share features with general protein evolution. The NK models have the important property that, as the parameter K increases, the "ruggedness" of the NK landscape varies from a single peaked "Fujiyama" landscape to a multi-peaked "badlands" landscape. Walks to local optima on such landscapes become shorter as K increases. This fact allows us to choose a value of K that corresponds to the experimentally observed number of mutational "steps", 6-8, taken as an antibody sequence matures. If the mature antibody is taken to correspond to a local optimum in the model, tuning the model requires that K be about 40, implying that the functional contribution of each amino acid in the V region is affected by about 40 others. 
Given this value of K, the model then predicts several features of "antibody space" that are in qualitative agreement with experiment: (1) The fraction of fitter variants of an initial "roughed in" germ line antibody amplified by clonal selection is about 1-2%. (2) Mutations at some sites of the mature antibody hardly affect antibody function at all, but mutations at other sites dramatically decrease function. (3) The same "roughed in" antibody sequence can "walk" to many mature antibody sequences. (4) Many adaptive walks can end on the same local optimum. (5) Comparison of different mature sequences derived from the same initial V region shows evolutionary hot spots and parallel mutations. All these predictions are open to detailed testing by obtaining monoclonal antibodies early in the immune response and carrying out in vitro mutagenesis and adaptive hill climbing with respect to affinity for the immunizing antigen.
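The NK construction and the adaptive walks described in this abstract can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not the authors' implementation: it uses a binary alphabet rather than amino acids, picks each site's K interacting sites at random, draws fitness contributions uniformly from [0, 1), and uses the hypothetical helper names `make_nk_landscape` and `adaptive_walk`.

```python
import random
from typing import Callable, Dict, List, Tuple

Genotype = Tuple[int, ...]


def make_nk_landscape(n: int, k: int, rng: random.Random) -> Callable[[Genotype], float]:
    """Build a fitness function over binary genotypes of length n.

    Site i's contribution depends on its own state and on k other sites,
    chosen at random here (an assumption; adjacent neighbours are also used).
    """
    neighbours = [rng.sample([j for j in range(n) if j != i], k) for i in range(n)]
    tables: List[Dict[Tuple[int, ...], float]] = [{} for _ in range(n)]

    def fitness(genotype: Genotype) -> float:
        total = 0.0
        for i in range(n):
            key = (genotype[i],) + tuple(genotype[j] for j in neighbours[i])
            if key not in tables[i]:
                # Contributions are drawn lazily, once per distinct local state.
                tables[i][key] = rng.random()
            total += tables[i][key]
        return total / n

    return fitness


def adaptive_walk(fitness: Callable[[Genotype], float], start: Genotype,
                  rng: random.Random) -> Tuple[Genotype, int]:
    """Move to a randomly chosen fitter one-mutant neighbour until none exists.

    Returns the local optimum reached and the number of mutational steps taken.
    """
    current = tuple(start)
    steps = 0
    while True:
        f_cur = fitness(current)
        fitter = []
        for i in range(len(current)):
            mutant = current[:i] + (1 - current[i],) + current[i + 1:]
            if fitness(mutant) > f_cur:
                fitter.append(mutant)
        if not fitter:
            return current, steps
        current = rng.choice(fitter)
        steps += 1


if __name__ == "__main__":
    rng = random.Random(0)
    for k in (0, 2, 8):
        landscape = make_nk_landscape(12, k, rng)
        start = tuple(rng.randint(0, 1) for _ in range(12))
        optimum, steps = adaptive_walk(landscape, start, rng)
        print(k, steps, round(landscape(optimum), 3))
```

With K = 0 every site contributes independently, so all walks climb to the same single peak. Larger K couples the sites and produces many local optima, which is the regime in which the abstract matches walk length to the observed 6-8 mutational steps.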

15.
There is ample theoretical and experimental evidence that virulence evolution depends on the immune response of the host. In this article, we review a number of recent studies that attempt to explicitly incorporate the dynamics of the immune system (instead of merely representing it by a single black box parameter) in models for the evolution of parasite virulence. A striking observation is that the type of infection (acute or chronic) is invariably considered to be a constraint that model assumptions have to satisfy rather than as a potential outcome of the interaction of the parasite with its host's immune system. We argue that avoiding making assumptions about the type of infection will lead to a better understanding of infectious diseases, even though a number of fundamental and technical problems remain. Dynamical modeling of the immune system opens a wide range of perspectives: for understanding how the immune system eradicates a parasite (which it does for most pathogens but not for all, HIV being a notorious example of a virus that is not completely eliminated), for studying multiple infections through concomitant immunity, for understanding the emergence and evolution of the immune system in animals, and for evolutionary epidemiology in general (e.g., predicting evolutionary consequences of new therapies and public health policies). We conclude by discussing new approaches based on embedded (or nested) models and identify future perspectives for the modeling of infectious diseases.

16.
We present and analyze a model for the dynamics of the interactions between a pathogen and its host’s immune response. The model consists of two differential equations, one for pathogen load, the other one for an index of specific immunity. Unlike other simple models in the literature, this model exhibits, according to the host’s or pathogen’s parameter values, or to the initial infection size, a rich repertoire of behaviours: immediate clearing of the pathogen through nonspecific immune response; or acute infection followed by clearing of the pathogen through specific immune response; or uncontrolled infections; or acute infection followed by convergence to a stable state of chronic infection; or periodic solutions with intermittent acute infections. The model can also mimic some features of immune response after vaccination. This model could be a basis on which to build epidemic models including immunological features.

17.
The enormous upsurge of interest in immune-based treatments for cancer such as vaccines and immune checkpoint inhibitors, and increased understanding of the role of the tumor microenvironment in treatment response, collectively point to the need for immune-competent orthotopic models for pre-clinical testing of these new therapies. This paper demonstrates how to establish an orthotopic immune-competent rat model of pleural malignant mesothelioma. Monitoring disease progression in orthotopic models is confounded by the internal location of the tumors. To longitudinally monitor disease progression and its effect on circulating immune cells in this and other rat models of cancer, a single tube flow cytometry assay requiring only 25 µl whole blood is described. This provides accurate quantification of seven immune parameters: total lymphocytes, monocytes and neutrophils, as well as the T-cell subsets CD4 and CD8, B-cells and Natural Killer cells. Different subsets of these parameters are useful in different circumstances and models, with the neutrophil to lymphocyte ratio having the greatest utility for monitoring disease progression in the mesothelioma model. Analyzing circulating immune cell levels using this single tube method may also assist in monitoring the response to immune-based treatments and understanding the underlying mechanisms leading to success or failure of treatment.

18.
Co-infections alter the host immune response but how the systemic and local processes at the site of infection interact is still unclear. The majority of studies on co-infections concentrate on one of the infecting species, an immune function or group of cells and often focus on the initial phase of the infection. Here, we used a combination of experiments and mathematical modelling to investigate the network of immune responses against single and co-infections with the respiratory bacterium Bordetella bronchiseptica and the gastrointestinal helminth Trichostrongylus retortaeformis. Our goal was to identify representative mediators and functions that could capture the essence of the host immune response as a whole, and to assess how their relative contribution dynamically changed over time and between single and co-infected individuals. Network-based discrete dynamic models of single infections were built using current knowledge of bacterial and helminth immunology; the two single infection models were combined into a co-infection model that was then verified by our empirical findings. Simulations showed that a T helper cell mediated antibody and neutrophil response led to phagocytosis and clearance of B. bronchiseptica from the lungs. This was consistent in single and co-infection with no significant delay induced by the helminth. In contrast, T. retortaeformis intensity decreased faster when co-infected with the bacterium. Simulations suggested that the robust recruitment of neutrophils in the co-infection, added to the activation of IgG and eosinophil driven reduction of larvae, which also played an important role in single infection, contributed to this fast clearance. Perturbation analysis of the models, through the knockout of individual nodes (immune cells), identified the cells critical to parasite persistence and clearance both in single and co-infections. 
Our integrated approach captured the within-host immuno-dynamics of bacteria-helminth infection and identified key components that can be crucial for explaining individual variability between single and co-infections in natural populations.

19.
After variable selection, standard inferential procedures for regression parameters may not be uniformly valid; there is no finite sample size at which a standard test is guaranteed to approximately attain its nominal size. This problem is exacerbated in high-dimensional settings, where variable selection becomes unavoidable. This has prompted a flurry of activity in developing uniformly valid hypothesis tests for a low-dimensional regression parameter (e.g., the causal effect of an exposure A on an outcome Y) in high-dimensional models. So far there has been limited focus on model misspecification, although this is inevitable in high-dimensional settings. We propose tests of the null that are uniformly valid under sparsity conditions weaker than those typically invoked in the literature, assuming working models for the exposure and outcome are both correctly specified. When one of the models is misspecified, by amending the procedure for estimating the nuisance parameters, our tests continue to be valid; hence, they are doubly robust. Our proposals are straightforward to implement using existing software for penalized maximum likelihood estimation and do not require sample splitting. We illustrate them in simulations and an analysis of data obtained from the Ghent University intensive care unit.

20.
Variable selection is critical in competing risks regression with high-dimensional data. Although penalized variable selection methods and other machine learning-based approaches have been developed, many of these methods often suffer from instability in practice. This paper proposes a novel method named Random Approximate Elastic Net (RAEN). Under the proportional subdistribution hazards model, RAEN provides a stable and generalizable solution to the large-p-small-n variable selection problem for competing risks data. Our general framework allows the proposed algorithm to be applicable to other time-to-event regression models, including competing risks quantile regression and accelerated failure time models. We show that variable selection and parameter estimation improved markedly using the new computationally intensive algorithm through extensive simulations. A user-friendly R package RAEN is developed for public use. We also apply our method to a cancer study to identify influential genes associated with the death or progression from bladder cancer.
