Similar literature
1.
MSMBuilder is a software package for building statistical models of high-dimensional time-series data. It is designed with a particular focus on the analysis of atomistic simulations of biomolecular dynamics such as protein folding and conformational change. MSMBuilder is named for its ability to construct Markov state models (MSMs), a class of models that has gained favor among computational biophysicists. In addition to both well-established and newer MSM methods, the package includes complementary algorithms for understanding time-series data such as hidden Markov models and time-structure based independent component analysis. MSMBuilder offers an easy-to-use command-line interface, as well as clear and consistent abstractions through its Python application programming interface. MSMBuilder was developed with careful consideration for compatibility with the broader machine learning community by following the design of scikit-learn. The package is used primarily by practitioners of molecular dynamics, but is just as applicable to other computational or experimental time-series measurements.
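
To make the idea of a Markov state model concrete, here is a minimal sketch that estimates a transition matrix from an already discretized trajectory by counting transitions at a fixed lag time. It is not MSMBuilder's API; the function name `estimate_transition_matrix`, the toy trajectory, and the plain row-normalization estimator are all assumptions made for illustration.

```python
from collections import Counter

def estimate_transition_matrix(trajectory, n_states, lag=1):
    """Row-normalized transition counts at the given lag time (lag >= 1)."""
    # Count how often state i at time t is followed by state j at time t + lag.
    counts = Counter(zip(trajectory[:-lag], trajectory[lag:]))
    matrix = []
    for i in range(n_states):
        row = [counts[(i, j)] for j in range(n_states)]
        total = sum(row)
        # A state that is never a starting point gets an all-zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

# Toy two-state trajectory (made-up data, e.g. folded = 0, unfolded = 1).
traj = [0, 0, 1, 1, 1, 0, 0, 1]
T = estimate_transition_matrix(traj, n_states=2)
```

Each row of `T` is the estimated probability distribution over the next state given the current one. A real analysis would first cluster high-dimensional simulation frames into states and would usually prefer an estimator that enforces reversibility.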

2.
3.
Model methods for the analysis of mesocosm experimental studies (total citations: 1; self-citations: 0; citations by others: 1)
The response of experimental ecosystem dynamics to varying nutrient loads was studied by analysing oxygen time-series. Time-series had been continuously recorded, and the data were analysed on a daily basis using a computer model which describes basic oxygen processes. The resulting sets of production and consumption parameters described the dynamic characteristics of each basin during the experimental period of 180 days. Dynamical analysis appeared to be possible, although the results did not indicate a clear relationship between oxygen dynamics and nutrient supply in these systems.

4.
We develop a new regression algorithm, cMIKANA, for inference of gene regulatory networks from combinations of steady-state and time-series gene expression data. Using simulated gene expression datasets to assess the accuracy of reconstructing gene regulatory networks, we show that steady-state and time-series datasets can successfully be combined to identify gene regulatory interactions using the new algorithm. Inferring gene networks from combined datasets was found to be advantageous when using noisy measurements collected with either lower sampling rates or a limited number of experimental replicates. We illustrate our method by applying it to a microarray gene expression dataset from human umbilical vein endothelial cells (HUVECs) which combines time-series data from treatment with growth factor TNF and steady-state data from siRNA knockdown treatments. Our results suggest that the combination of steady-state and time-series datasets may provide better prediction of RNA-to-RNA interactions, and may also reveal biological features that cannot be identified from dynamic or steady-state information alone. Finally, we consider the experimental design of genomics experiments for gene regulatory network inference and show that network inference can be improved by incorporating steady-state measurements with time-series data.

5.
We present a novel approach for analyzing biological time-series data using a context-free language (CFL) representation that allows the extraction and quantification of important features from the time-series. This representation results in Hierarchically AdaPtive (HAP) analysis, a suite of complementary techniques that enables rapid analysis of data and does not require the user to set parameters. HAP analysis generates hierarchically organized parameter distributions that allow multi-scale components of the time-series to be quantified. It includes a data analysis pipeline that applies recursive analyses to generate hierarchically organized results that extend traditional outcome measures such as pharmacokinetics and inter-pulse interval. Pulsicons, a novel text-based time-series representation also derived from the CFL approach, are introduced as an objective qualitative comparison nomenclature. We apply HAP to the analysis of 24 hours of frequently sampled pulsatile cortisol hormone data, which has known analysis challenges, from 14 healthy women. HAP analysis generated results in seconds and produced dozens of figures for each participant. The results quantify the observed qualitative features of cortisol data as a series of pulse clusters, each consisting of one or more embedded pulses, and identify two ultradian phenotypes in this dataset. HAP analysis is designed to be robust to individual differences and to missing data and may be applied to other pulsatile hormones. Future work can extend HAP analysis to other time-series data types, including oscillatory and other periodic physiological signals.

6.
Summary: Delayed density dependence, and the cycles in insect populations that it can generate, are often investigated using time-series analysis. Recently, several authors have raised concerns about the validity of using time-series analysis to detect density dependence. One particular concern is the suggestion that exogenous driving variables, such as cyclic weather patterns, can lead to the spurious detection of density dependence in natural populations.
Using non-biological data (the electricity bills of one of the authors), we show how easy it is to be misled by the results of time-series analysis. We then present 16 years of data on the gall-forming sawfly, Euura lasiolepis (Hymenoptera: Tenthredinidae), and show that cycles in weather, specifically winter precipitation, lead to the spurious detection of density dependence in time-series analysis. We conclude that time-series analysis cannot stand alone as a method for inferring the action of density dependence, and urge further investigation of the effects of apparent cycles in abiotic forces on insect populations.

7.
Cho KH  Shin SY  Choo SM 《The FEBS journal》2005,272(15):3950-3959
Due to the unavoidable nonbiological variations accompanying many experiments, it is imperative to consider a way of unravelling the functional interaction structure of a cellular network (e.g. signalling cascades or gene networks) by using the qualitative information of time-series experimental data instead of computation through the measured absolute values. In this spirit, we propose a very simple but effective method of identifying the functional interaction structure of a cellular network based on temporal ascending or descending slope information from given time-series measurements. From this method, we can gain insight into the acceptable measurement error ranges in order to estimate the correct functional interaction structure and we can also find guidance for a new experimental design to complement the insufficient information of a given experimental dataset. We developed experimental sign equations, making use of the temporal slope sign information from time-series experimental data, without a specific assumption on parameter perturbations for each network node. Based on these equations, we further describe the available specific information from each part of experimental data in detail and show the functional interaction structure obtained by integrating such information. In this procedure, we use only simple algebra on sign changes without complicated computations on the measured absolute values of the experimental data. The result is, however, verified through rigorous mathematical definitions and proofs. The present method provides us with information about the acceptable measurement error ranges for correct estimation of the functional interaction structure and it further leads to a new experimental design to complement the given experimental data by informing us about additional specific sampling points to be chosen for further required information.
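
As a rough illustration of the qualitative information this abstract relies on, the sketch below reduces a measured time-series to the signs of its successive slopes. It does not reproduce the authors' experimental sign equations; the function name `slope_signs`, the tolerance parameter `tol`, and the sample values are assumptions introduced here.

```python
def slope_signs(series, tol=0.0):
    """Sign of the change between consecutive samples: +1, 0, or -1.

    Changes whose magnitude does not exceed `tol` are treated as flat,
    which is one simple way to model a tolerated measurement error.
    """
    signs = []
    for previous, current in zip(series, series[1:]):
        delta = current - previous
        if abs(delta) <= tol:
            signs.append(0)
        else:
            signs.append(1 if delta > 0 else -1)
    return signs

# Made-up measurements of one network node over five time points.
node = [1.0, 2.0, 3.5, 3.4, 2.0]
exact = slope_signs(node)
tolerant = slope_signs(node, tol=0.2)
```

With `tol=0.2` the small dip from 3.5 to 3.4 is no longer counted as a descending slope, which shows how the choice of error range can change the sign pattern that any later inference step would see.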

8.
《Genomics》2019,111(4):636-641
High-throughput time-series data have a special value for studying the dynamism of biological systems. However, the interpretation of such complex data can be challenging. The aim of this study was to compare common algorithms recently developed for the detection of differentially expressed genes in time-course microarray data. Using different measures such as sensitivity, specificity, predictive values, and related signaling pathways, we found that limma, timecourse, and gprege have reasonably good performance for the analysis of datasets in which only the test group is followed over time. However, limma has the additional advantage of being able to report a significance cutoff, making it a more practical tool. In addition, limma and TTCA can be satisfactorily used for datasets with time-series data for all experimental groups. These findings may help investigators select appropriate tools for the detection of differentially expressed genes as an initial step in the interpretation of time-course big data.

9.
We describe and illustrate methods for obtaining a parsimonious sinusoidal series representation or model of biological time-series data. The methods are also used to identify nonlinear systems with unknown structure. A key aspect is a rapid search for significant terms to include in the model for the system or the time-series. For example, the methods use fast and robust orthogonal searches for significant frequencies in the time-series, and differ from conventional Fourier series analysis in several important respects. In particular, the frequencies in our resulting sinusoidal series need not be commensurate, nor integral multiples of the fundamental frequency corresponding to the record length. Freed of these restrictions, the methods produce a more economical sinusoidal series representation (than a Fourier series), finding the most significant frequencies first, and automatically determine model order. The methods are also capable of higher resolution than a conventional Fourier series analysis. In addition, the methods can cope with unequally-spaced or missing data, and are applicable to time-series corrupted by noise. Finally, we compare one of our methods with a well-known technique for resolving sinusoidal signals in noise using published data for the test time-series.

10.
A two-phase design approach is introduced to determine the optimal feed rate, fed glucose concentration and fermentation time to maximize protein productivity using recombinant Escherichia coli BL21 (pBAW2) strain. The first phase is applied to determine a primary S-system kinetic model using batch time-series data. Two runs were carried out in the second phase to achieve the maximum protein productivity for the fed-batch fermentation process. The computational results using the S-system kinetic model obtained from the second run are in better agreement with the experiments than those using the kinetic model obtained from batch time-series data. For cross-validation, two extra fed-batch experiments with different feed strategies were carried out for comparison with the optimal fed-batch result. From the experimental results, this approach could improve productivity by at least 3%.

11.

Background  

Time-course microarray experiments can produce useful data which can help in understanding the underlying dynamics of the system. Clustering is an important stage in microarray data analysis where the data are grouped together according to certain characteristics. The majority of clustering techniques are based on distance or visual similarity measures which may not be suitable for clustering of temporal microarray data where the sequential nature of time is important. We present a Granger causality based technique to cluster temporal microarray gene expression data, which measures the interdependence between two time-series by statistically testing whether one time-series can be used to forecast the other.
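
The pairwise test behind this idea can be sketched as a comparison of two nested autoregressions: one that forecasts a series from its own past, and one that also uses the past of the candidate cause. The code below is a minimal sketch of that F statistic only, not the authors' clustering procedure; the helper names `ols_rss` and `granger_f`, the lag of 1, and the synthetic series are assumptions.

```python
import math

def ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit via the normal equations."""
    k = len(rows[0])
    # Augmented normal-equation system [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * y for r, y in zip(rows, targets))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(a[idx][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(k):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [u - factor * v for u, v in zip(a[row], a[col])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum((y - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, y in zip(rows, targets))

def granger_f(cause, effect, lag=1):
    """F statistic for whether past values of `cause` improve forecasts of `effect`."""
    n = len(effect)
    targets = effect[lag:]
    # Restricted model: intercept plus the effect's own past values.
    restricted = [[1.0] + effect[t - lag:t] for t in range(lag, n)]
    # Full model: the same regressors plus the cause's past values.
    full = [r + cause[t - lag:t] for r, t in zip(restricted, range(lag, n))]
    rss_restricted = ols_rss(restricted, targets)
    rss_full = ols_rss(full, targets)
    dof = len(targets) - (2 * lag + 1)
    return ((rss_restricted - rss_full) / lag) / (rss_full / dof)

# Synthetic example: y follows x with a one-step delay plus a small disturbance.
x = [math.sin(1.7 * t) + 0.5 * math.cos(0.4 * t * t) for t in range(60)]
y = [0.0] + [0.8 * x[t - 1] + 0.05 * math.sin(11.0 * t) for t in range(1, 60)]
f_xy = granger_f(x, y)
```

A large `f_xy` indicates that the past of `x` helps forecast `y`. In practice the statistic is compared against an F distribution with `lag` and `dof` degrees of freedom, and a pairwise matrix of such results could then feed a clustering step.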

12.
13.
14.
In order to investigate what community selection forces are operating on Grime's CSR strategies in different semi-natural grassland habitats that were not subjected to successional processes, a time-series analysis of 2946 Danish grassland plots from an eight-year period with pin-point plant cover data was conducted. Grime's CSR strategies were used for grouping plant species into functional types, which were then treated as dependent variables in a time-series analysis. Across four grassland habitat types, there was significant selection against ruderal plant species during the eight-year period. Furthermore, the monitoring data indicated that the decreasing cover of ruderals may, in part, be due to reduced grazing pressure of large herbivores. As a consequence of the findings in this study and others, it may be suggested that increased disturbance, e.g. by large herbivores, is a suitable management strategy for grassland habitats.

15.
The Ecuadorian Amazon, lying in the headwaters of the Napo and Aguarico River valleys, is experiencing rapid change in Land Use and Land Cover (LULC) conditions and regional landscape diversity uniquely tied to the spontaneous agricultural colonization of the Oriente region of northeastern Ecuador beginning in the mid to late 1970s. Spontaneous colonization occurred on squatted lands located adjacent to oil company roads and in government development sectors composed of multiple 50 ha land parcels organized into 'piano key' shaped family farms or fincas. Portions of these fincas were deforested for agricultural extensification depending upon the age of the finca and several site and situation factors. Because fincas are managed at the household level as spatially discrete, temporally independent units, land conversion at the finca-level is recognized as the chief proximate cause of deforestation within the region. Focusing on the spatial and temporal dynamics of deforestation, agricultural extensification, and plant succession at the finca-level, and urbanization at the community-level, a cell-based morphogenetic model of Land Use and Land Cover Change (LULCC) was developed as the foundation for a predictive model of regional LULCC dynamics and landscape diversity. Here, LULC characteristics are determined using a time-series of remotely sensed data (i.e., Landsat Thematic Mapper (TM) and Multispectral Scanner (MSS)) using an experimental [semi-traditional] (hybrid unsupervised-supervised) classification scheme resulting in a time-series data set including LULC images for 1973, 1986, 1989, 1996, and 1999. Pixel histories of LULC type across the time-series were integrated into LULC trajectories and converted into seed or input data sets for LULC modeling to alternate time periods and for model validation.
LULC simulations, achieved through cellular automata (CA) methodologies, were run on an annual basis to the year 2010 using 1973 as the initial conditions and the satellite time-series as the 'check points' in the simulations. The model was developed using the Imagine Spatial Modeler of the ERDAS image processing software, and enhanced using the Spatial Modeler Language (SML). The model works by (a) simulating the present by extrapolating from the past using the image time-series, (b) validating the simulations via the remotely sensed time-series of past conditions and through field observations of current conditions, (c) allowing the model to iterate to the year 2010, and (d) comparing model outputs to an autoregressive time-series approach for annual conditions that are compared via paired t-tests of pattern metrics run at the landscape-level to define compositional and structural differences between successive model outputs.

16.
This paper investigates the utility of the Lomb-Scargle periodogram for the analysis of biological rhythms. This method is particularly suited to detect periodic components in unequally sampled time-series and data sets with missing values, but restricts all calculations to actually measured values. The Lomb-Scargle method was tested on both real and simulated time-series with even and uneven sampling, and compared to a standard method in biomedical rhythm research, the Chi-square periodogram. Results indicate that the Lomb-Scargle algorithm shows a clearly better detection efficiency and accuracy in the presence of noise, and avoids possible bias or erroneous results that may arise from replacement of missing data by interpolation techniques. Hence, the Lomb-Scargle periodogram may serve as a useful method for the study of biological rhythms, especially when applied to telemetrical or observational time-series obtained from free-living animals, i.e., data sets that notoriously lack points.
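
To show why the method needs no interpolation, the following sketch evaluates the classic (unnormalized) Lomb-Scargle power directly on the measured sample times. It is a plain textbook formulation rather than the implementation tested in the paper; the function name `lomb_scargle_power`, the synthetic 24 h rhythm, and the grid of trial periods are assumptions.

```python
import math

def lomb_scargle_power(times, values, period):
    """Classic unnormalized Lomb-Scargle power at one trial period."""
    w = 2.0 * math.pi / period
    mean = sum(values) / len(values)
    y = [v - mean for v in values]
    # The time offset tau makes the sine and cosine terms orthogonal
    # on the actual (possibly irregular) sample times.
    tau = math.atan2(sum(math.sin(2.0 * w * t) for t in times),
                     sum(math.cos(2.0 * w * t) for t in times)) / (2.0 * w)
    c = [math.cos(w * (t - tau)) for t in times]
    s = [math.sin(w * (t - tau)) for t in times]
    yc = sum(a * b for a, b in zip(y, c))
    ys = sum(a * b for a, b in zip(y, s))
    return 0.5 * (yc * yc / sum(a * a for a in c)
                  + ys * ys / sum(a * a for a in s))

# Made-up rhythm with a 24 h period, sampled at irregular times,
# with every fifth reading missing.
times = [2.0 * i + 0.2 * (i * i % 7) for i in range(60) if i % 5 != 3]
values = [math.sin(2.0 * math.pi * t / 24.0) for t in times]
periods = list(range(12, 37))  # trial periods in hours
best_period = max(periods, key=lambda p: lomb_scargle_power(times, values, p))
```

Only the measured `(time, value)` pairs enter the sums, so gaps in the record simply contribute nothing. Real data would also call for a significance estimate for the peak, which this sketch leaves out.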

17.
Principal-oscillation-pattern (POP) analysis is a multivariate and systematic technique for identifying the dynamic characteristics of a system from time-series data. In this study, we demonstrate the first application of POP analysis to genome-wide time-series gene-expression data. We use POP analysis to infer oscillation patterns in gene expression. Typically, a genomic system matrix cannot be directly estimated because the number of genes is usually much larger than the number of time points in a genomic study. Thus, we first identify the POPs of the eigen-genomic system that consists of the first few significant eigengenes obtained by singular value decomposition. By using the linear relationship between eigengenes and genes, we then infer the POPs of the genes. Both simulation data and real-world data are used in this study to demonstrate the applicability of POP analysis to genomic data. We show that POP analysis not only compares favorably with experiments and existing computational methods, but that it also provides complementary information relative to other approaches.

18.
Experimental and clinical chronocardiology (total citations: 2; self-citations: 0; citations by others: 2)
Recent advances in the chronobiologic approach to experimental and clinical cardiology are reviewed. First, the maximum entropy method (MEM) is introduced as one of the statistical methods for analyzing circadian periodicity. The MEM power spectrum shows a remarkable resolution property, and the method will play an important role in chronocardiology alongside the cosinor method. Second, recent investigations of the relationship between sleep states and cardiac arrhythmias are discussed from the viewpoints of both experimental and clinical chronocardiology. Next, recent advances concerning myocardial ischemia are reviewed. A marked circadian rhythm in the frequency of myocardial infarction onset and sudden cardiac death has been observed. Lastly, reference is made to the recent development of ambulatory BP monitoring. A chronobiologic approach to the analysis of time-series data in cardiology will lead to many advantages in clinical practice.

19.
This paper investigates the utility of the Lomb-Scargle periodogram for the analysis of biological rhythms. This method is particularly suited to detect periodic components in unequally sampled time-series and data sets with missing values, but restricts all calculations to actually measured values. The Lomb-Scargle method was tested on both real and simulated time-series with even and uneven sampling, and compared to a standard method in biomedical rhythm research, the Chi-square periodogram. Results indicate that the Lomb-Scargle algorithm shows a clearly better detection efficiency and accuracy in the presence of noise, and avoids possible bias or erroneous results that may arise from replacement of missing data by interpolation techniques. Hence, the Lomb-Scargle periodogram may serve as a useful method for the study of biological rhythms, especially when applied to telemetrical or observational time-series obtained from free-living animals, i.e., data sets that notoriously lack points.

20.
We propose a general theoretical framework for analyzing differentially expressed genes and behavior patterns from two homogeneous short time-course datasets. The framework generalizes the recently proposed Hilbert-Schmidt Independence Criterion (HSIC)-based framework, adapting it to the time-series scenario by utilizing tensor analysis for data transformation. The proposed framework is effective in yielding criteria that can identify both the differentially expressed genes and time-course patterns of interest between two time-series experiments without requiring explicit clustering of the data. The results, obtained by applying the proposed framework with a linear kernel formulation to various data sets, are found to be both biologically meaningful and consistent with published studies.
