首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 0 毫秒
1.
Estimation of epidemiological and population parameters from molecular sequence data has become central to the understanding of infectious disease dynamics. Various models have been proposed to infer details of the dynamics that describe epidemic progression. These include inference approaches derived from Kingman’s coalescent theory. Here, we use recently described coalescent theory for epidemic dynamics to develop stochastic and deterministic coalescent susceptible–infected–removed (SIR) tree priors. We implement these in a Bayesian phylogenetic inference framework to permit joint estimation of SIR epidemic parameters and the sample genealogy. We assess the performance of the two coalescent models and also juxtapose results obtained with a recently published birth–death-sampling model for epidemic inference. Comparisons are made by analyzing sets of genealogies simulated under precisely known epidemiological parameters. Additionally, we analyze influenza A (H1N1) sequence data sampled in the Canterbury region of New Zealand and HIV-1 sequence data obtained from known United Kingdom infection clusters. We show that both coalescent SIR models are effective at estimating epidemiological parameters from data with large fundamental reproductive number R0 and large population size S0. Furthermore, we find that the stochastic variant generally outperforms its deterministic counterpart in terms of error, bias, and highest posterior density coverage, particularly for smaller R0 and S0. However, each of these inference models is shown to have undesirable properties in certain circumstances, especially for epidemic outbreaks with R0 close to one or with small effective susceptible populations.  相似文献   

2.
We consider a set of sample counts obtained by sampling arbitrary fractions of a finite volume containing an homogeneously dispersed population of identical objects. We report a Bayesian derivation of the posterior probability distribution of the population size using a binomial likelihood and non-conjugate, discrete uniform priors under sampling with or without replacement. Our derivation yields a computationally feasible formula that can prove useful in a variety of statistical problems involving absolute quantification under uncertainty. We implemented our algorithm in the R package dupiR and compared it with a previously proposed Bayesian method based on a Gamma prior. As a showcase, we demonstrate that our inference framework can be used to estimate bacterial survival curves from measurements characterized by extremely low or zero counts and rather high sampling fractions. All in all, we provide a versatile, general purpose algorithm to infer population sizes from count data, which can find application in a broad spectrum of biological and physical problems.  相似文献   

3.
As recent advances in calcium sensing technologies facilitate simultaneously imaging action potentials in neuronal populations, complementary analytical tools must also be developed to maximize the utility of this experimental paradigm. Although the observations here are fluorescence movies, the signals of interest—spike trains and/or time varying intracellular calcium concentrations—are hidden. Inferring these hidden signals is often problematic due to noise, nonlinearities, slow imaging rate, and unknown biophysical parameters. We overcome these difficulties by developing sequential Monte Carlo methods (particle filters) based on biophysical models of spiking, calcium dynamics, and fluorescence. We show that even in simple cases, the particle filters outperform the optimal linear (i.e., Wiener) filter, both by obtaining better estimates and by providing error bars. We then relax a number of our model assumptions to incorporate nonlinear saturation of the fluorescence signal, as well external stimulus and spike history dependence (e.g., refractoriness) of the spike trains. Using both simulations and in vitro fluorescence observations, we demonstrate temporal superresolution by inferring when within a frame each spike occurs. Furthermore, the model parameters may be estimated using expectation maximization with only a very limited amount of data (e.g., ∼5-10 s or 5-40 spikes), without the requirement of any simultaneous electrophysiology or imaging experiments.  相似文献   

4.
Maximum entropy-based inference methods have been successfully used to infer direct interactions from biological datasets such as gene expression data or sequence ensembles. Here, we review undirected pairwise maximum-entropy probability models in two categories of data types, those with continuous and categorical random variables. As a concrete example, we present recently developed inference methods from the field of protein contact prediction and show that a basic set of assumptions leads to similar solution strategies for inferring the model parameters in both variable types. These parameters reflect interactive couplings between observables, which can be used to predict global properties of the biological system. Such methods are applicable to the important problems of protein 3-D structure prediction and association of gene–gene networks, and they enable potential applications to the analysis of gene alteration patterns and to protein design.  相似文献   

5.
Combining Voltage and Calcium Imaging from Neuronal Dendrites   总被引:2,自引:0,他引:2  
The ability to monitor membrane potential (V m) and calcium (Ca2+) transients at multiple locations on the same neuron can facilitate further progress in our understanding of neuronal function. Here we describe a method to combine V m and Ca2+ imaging using styryl voltage sensitive dyes and Fura type UV-excitable Ca2+ indicators. In all cases V m optical signals are linear with membrane potential changes, but the calibration of optical signals on an absolute scale is presently possible only in some neurons. The interpretation of Ca2+ optical signals depends on the indicator Ca2+ buffering capacity relative to the cell endogenous buffering capacity. In hippocampal CA1 pyramidal neurons, loaded with JPW-3028 and 300 μM Bis-Fura-2, V m optical signals cannot be calibrated and the physiological Ca2+ dynamics are compromised by the presence of the indicator. Nevertheless, at each individual site, relative changes in V m and Ca2+ fluorescence signals under different conditions can provide meaningful new information on local dendritic integration. In cerebellar Purkinje neurons, loaded with JPW-1114 and 1 mM Fura-FF, V m optical signals can be calibrated in terms of mV and Ca2+ optical signals quantitatively reveal the physiological changes in free Ca2+. Using these two examples, the method is explained in detail.  相似文献   

6.
Bayesian inference is a powerful statistical paradigm that has gained popularity in many fields of science, but adoption has been somewhat slower in biophysics. Here, I provide an accessible tutorial on the use of Bayesian methods by focusing on example applications that will be familiar to biophysicists. I first discuss the goals of Bayesian inference and show simple examples of posterior inference using conjugate priors. I then describe Markov chain Monte Carlo sampling and, in particular, discuss Gibbs sampling and Metropolis random walk algorithms with reference to detailed examples. These Bayesian methods (with the aid of Markov chain Monte Carlo sampling) provide a generalizable way of rigorously addressing parameter inference and identifiability for arbitrarily complicated models.  相似文献   

7.
Single-particle cryo-electron microscopy is widely used to study the structure of macromolecular assemblies. Tens of thousands of noisy two-dimensional images of the macromolecular assembly viewed from different directions are used to infer its three-dimensional structure. The first step is to estimate a low-resolution initial model and initial image orientations. This is a challenging global optimization problem with many unknowns, including an unknown orientation for each two-dimensional image. Obtaining a good initial model is crucial for the success of the subsequent refinement step. We introduce a probabilistic algorithm for estimating an initial model. The algorithm is fast, has very few algorithmic parameters, and yields information about the precision of estimated model parameters in addition to the parameters themselves. Our algorithm uses a pseudo-atomic model to represent the low-resolution three-dimensional structure, with isotropic Gaussian components as moveable pseudo-atoms. This leads to a significant reduction in the number of parameters needed to represent the three-dimensional structure, and a simplified way of computing two-dimensional projections. It also contributes to the speed of the algorithm. We combine the estimation of the unknown three-dimensional structure and image orientations in a Bayesian framework. This ensures that there are very few parameters to set, and specifies how to combine different types of prior information about the structure with the given data in a systematic way. To estimate the model parameters we use Markov chain Monte Carlo sampling. The advantage is that instead of just obtaining point estimates of model parameters, we obtain an ensemble of models revealing the precision of the estimated parameters. We demonstrate the algorithm on both simulated and real data.  相似文献   

8.
The accurate identification of the route of transmission taken by an infectious agent through a host population is critical to understanding its epidemiology and informing measures for its control. However, reconstruction of transmission routes during an epidemic is often an underdetermined problem: data about the location and timings of infections can be incomplete, inaccurate, and compatible with a large number of different transmission scenarios. For fast-evolving pathogens like RNA viruses, inference can be strengthened by using genetic data, nowadays easily and affordably generated. However, significant statistical challenges remain to be overcome in the full integration of these different data types if transmission trees are to be reliably estimated. We present here a framework leading to a bayesian inference scheme that combines genetic and epidemiological data, able to reconstruct most likely transmission patterns and infection dates. After testing our approach with simulated data, we apply the method to two UK epidemics of Foot-and-Mouth Disease Virus (FMDV): the 2007 outbreak, and a subset of the large 2001 epidemic. In the first case, we are able to confirm the role of a specific premise as the link between the two phases of the epidemics, while transmissions more densely clustered in space and time remain harder to resolve. When we consider data collected from the 2001 epidemic during a time of national emergency, our inference scheme robustly infers transmission chains, and uncovers the presence of undetected premises, thus providing a useful tool for epidemiological studies in real time. The generation of genetic data is becoming routine in epidemiological investigations, but the development of analytical tools maximizing the value of these data remains a priority. Our method, while applied here in the context of FMDV, is general and with slight modification can be used in any situation where both spatiotemporal and genetic data are available.  相似文献   

9.
One of the pressing open problems of computational systems biology is the elucidation of the topology of genetic regulatory networks (GRNs) using high throughput genomic data, in particular microarray gene expression data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) challenge aims to evaluate the success of GRN inference algorithms on benchmarks of simulated data. In this article, we present GENIE3, a new algorithm for the inference of GRNs that was best performer in the DREAM4 In Silico Multifactorial challenge. GENIE3 decomposes the prediction of a regulatory network between p genes into p different regression problems. In each of the regression problems, the expression pattern of one of the genes (target gene) is predicted from the expression patterns of all the other genes (input genes), using tree-based ensemble methods Random Forests or Extra-Trees. The importance of an input gene in the prediction of the target gene expression pattern is taken as an indication of a putative regulatory link. Putative regulatory links are then aggregated over all genes to provide a ranking of interactions from which the whole network is reconstructed. In addition to performing well on the DREAM4 In Silico Multifactorial challenge simulated data, we show that GENIE3 compares favorably with existing algorithms to decipher the genetic regulatory network of Escherichia coli. It doesn''t make any assumption about the nature of gene regulation, can deal with combinatorial and non-linear interactions, produces directed GRNs, and is fast and scalable. In conclusion, we propose a new algorithm for GRN inference that performs well on both synthetic and real gene expression data. The algorithm, based on feature selection with tree-based ensemble methods, is simple and generic, making it adaptable to other types of genomic data and interactions.  相似文献   

10.
The nonlinearity of dynamics in systems biology makes it hard to infer them from experimental data. Simple linear models are computationally efficient, but cannot incorporate these important nonlinearities. An adaptive method based on the S-system formalism, which is a sensible representation of nonlinear mass-action kinetics typically found in cellular dynamics, maintains the efficiency of linear regression. We combine this approach with adaptive model selection to obtain efficient and parsimonious representations of cellular dynamics. The approach is tested by inferring the dynamics of yeast glycolysis from simulated data. With little computing time, it produces dynamical models with high predictive power and with structural complexity adapted to the difficulty of the inference problem.  相似文献   

11.
12.
Metagenomics provides a powerful new tool set for investigating evolutionary interactions with the environment. However, an absence of model-based statistical methods means that researchers are often not able to make full use of this complex information. We present a Bayesian method for inferring the phylogenetic relationship among related organisms found within metagenomic samples. Our approach exploits variation in the frequency of taxa among samples to simultaneously infer each lineage haplotype, the phylogenetic tree connecting them, and their frequency within each sample. Applications of the algorithm to simulated data show that our method can recover a substantial fraction of the phylogenetic structure even in the presence of high rates of migration among sample sites. We provide examples of the method applied to data from green sulfur bacteria recovered from an Antarctic lake, plastids from mixed Plasmodium falciparum infections, and virulent Neisseria meningitidis samples.  相似文献   

13.
14.
The neural patterns recorded during a neuroscientific experiment reflect complex interactions between many brain regions, each comprising millions of neurons. However, the measurements themselves are typically abstracted from that underlying structure. For example, functional magnetic resonance imaging (fMRI) datasets comprise a time series of three-dimensional images, where each voxel in an image (roughly) reflects the activity of the brain structure(s)–located at the corresponding point in space–at the time the image was collected. FMRI data often exhibit strong spatial correlations, whereby nearby voxels behave similarly over time as the underlying brain structure modulates its activity. Here we develop topographic factor analysis (TFA), a technique that exploits spatial correlations in fMRI data to recover the underlying structure that the images reflect. Specifically, TFA casts each brain image as a weighted sum of spatial functions. The parameters of those spatial functions, which may be learned by applying TFA to an fMRI dataset, reveal the locations and sizes of the brain structures activated while the data were collected, as well as the interactions between those structures.  相似文献   

15.
In this paper we consider the competing risks model where the risks may not be independent. We assume both fixed and random censoring. The random censoring mechanism could have either a parametric or a non-parametric form. The life distributions and the parametric censoring distribution considered are exponential or Weibull. The expressions for the asymptotic confidence intervals for various parameters of interest under different models, using the estimated Fisher information matrix and parametric bootstrap techniques have been derived. Monte Carlo simulation studies for some of these cases have been carried out.  相似文献   

16.
There is growing interest in understanding how the brain utilizes synchronized oscillatory activity to integrate information across functionally connected regions. Computing phase-locking values (PLV) between EEG signals is a popular method for quantifying such synchronizations and elucidating their role in cognitive tasks. However, high-dimensionality in PLV data incurs a serious multiple testing problem. Standard multiple testing methods in neuroimaging research (e.g., false discovery rate, FDR) suffer severe loss of power, because they fail to exploit complex dependence structure between hypotheses that vary in spectral, temporal and spatial dimension. Previously, we showed that a hierarchical FDR and optimal discovery procedures could be effectively applied for PLV analysis to provide better power than FDR. In this article, we revisit the multiple comparison problem from a new Empirical Bayes perspective and propose the application of the local FDR method (locFDR; Efron, 2001) for PLV synchrony analysis to compute FDR as a posterior probability that an observed statistic belongs to a null hypothesis. We demonstrate the application of Efron''s Empirical Bayes approach for PLV synchrony analysis for the first time. We use simulations to validate the specificity and sensitivity of locFDR and a real EEG dataset from a visual search study for experimental validation. We also compare locFDR with hierarchical FDR and optimal discovery procedures in both simulation and experimental analyses. Our simulation results showed that the locFDR can effectively control false positives without compromising on the power of PLV synchrony inference. Our results from the application locFDR on experiment data detected more significant discoveries than our previously proposed methods whereas the standard FDR method failed to detect any significant discoveries.  相似文献   

17.
18.
Neural networks are considered by many to be very promising tools for classification and prediction. The flexibility of the neural network models often result in over-fit. Shrinking the parameters using a penalized likelihood is often used in order to overcome such over-fit. In this paper we extend the approach proposed by FARAGGI and SIMON (1995a) to modeling censored survival data using the input-output relationship associated with a single hidden layer feed-forward neural network. Instead of estimating the neural network parameters using the method of maximum likelihood, we place normal prior distributions on the parameters and make inferences based on derived posterior distributions of the parameters. This Bayesian formulation will result in shrinking the parameters of the neural network model and will reduce the over-fit compared with the maximum likelihood estimators. We illustrate our proposed method on a simulated and a real example.  相似文献   

19.
Epidemics of chronic wasting disease (CWD) of North American Cervidae have potential to harm ecosystems and economies. We studied a migratory population of mule deer (Odocoileus hemionus) affected by CWD for at least three decades using a Bayesian framework to integrate matrix population and disease models with long-term monitoring data and detailed process-level studies. We hypothesized CWD prevalence would be stable or increase between two observation periods during the late 1990s and after 2010, with higher CWD prevalence making deer population decline more likely. The weight of evidence suggested a reduction in the CWD outbreak over time, perhaps in response to intervening harvest-mediated population reductions. Disease effects on deer population growth under current conditions were subtle with a 72% chance that CWD depressed population growth. With CWD, we forecasted a growth rate near one and largely stable deer population. Disease effects appear to be moderated by timing of infection, prolonged disease course, and locally variable infection. Long-term outcomes will depend heavily on whether current conditions hold and high prevalence remains a localized phenomenon.  相似文献   

20.
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号