Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
Periodogram analysis of unequally spaced time-series, as part of many biological rhythm investigations, is complicated. The mathematical framework is scattered over the literature, and the interpretation of results is often debatable. In this paper, we show that the Lomb-Scargle method is the appropriate tool for periodogram analysis of unequally spaced data. A unique procedure of multiple period searching is derived, facilitating the assessment of the various rhythms that may be present in a time-series. All relevant mathematical and statistical aspects are considered in detail, and much attention is given to the correct interpretation of results. The use of the procedure is illustrated by examples, and problems that may be encountered are discussed. It is argued that, when following the procedure of multiple period searching, we can even benefit from the unequal spacing of a time-series in biological rhythm research.

2.
The classical power spectrum, computed in the frequency domain, outranks traditionally used periodograms derived in the time domain (such as the chi-square periodogram) regarding the search for biological rhythms. Unfortunately, classical power spectral analysis is not possible with unequally spaced data (e.g., time series with missing data). The Lomb-Scargle periodogram fixes this shortcoming. However, peak detection in the Lomb-Scargle periodogram of unequally spaced data requires some careful consideration. To guide researchers in the proper evaluation of detected peaks, therefore, a novel procedure and a computer program have recently become available. It is recommended that the Lomb-Scargle periodogram be the default method of periodogram analysis in future biomedical applications of rhythm investigation.
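As a rough illustration of the idea in the abstracts above, the following is a minimal sketch of the classical normalized Lomb-Scargle power, evaluated only at the actually measured time points. It is not the procedure or program the cited papers describe; the function name `lomb_scargle`, the synthetic 24 h signal, and the grid of trial periods are assumptions made for this example.

```python
import math

def lomb_scargle(times, values, frequencies):
    """Classical normalized Lomb-Scargle power at each trial frequency.

    Frequencies are in cycles per time unit and must be positive.
    """
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    variance = sum(c * c for c in centered) / (n - 1)
    powers = []
    for f in frequencies:
        w = 2.0 * math.pi * f
        # The time offset tau makes the sine and cosine terms orthogonal
        # on the (possibly uneven) sampling grid.
        s2 = sum(math.sin(2.0 * w * t) for t in times)
        c2 = sum(math.cos(2.0 * w * t) for t in times)
        tau = math.atan2(s2, c2) / (2.0 * w)
        cos_terms = [math.cos(w * (t - tau)) for t in times]
        sin_terms = [math.sin(w * (t - tau)) for t in times]
        yc = sum(y * c for y, c in zip(centered, cos_terms))
        ys = sum(y * s for y, s in zip(centered, sin_terms))
        cc = sum(c * c for c in cos_terms)
        ss = sum(s * s for s in sin_terms)
        powers.append((yc * yc / cc + ys * ys / ss) / (2.0 * variance))
    return powers

# Hypothetical example data: unevenly spaced sampling times (hours)
# of a noise-free rhythm with a 24 h period.
times = [1.3 * i + 0.4 * math.sin(i) for i in range(60)]
values = [math.sin(2.0 * math.pi * t / 24.0) for t in times]

periods = list(range(10, 41))  # trial periods in hours
powers = lomb_scargle(times, values, [1.0 / p for p in periods])
best_period = periods[powers.index(max(powers))]
print("trial period with the highest power:", best_period)
```

No interpolation or resampling is involved: each sum runs over the measured samples only, which is why missing values need no special treatment. Judging whether a peak is significant is a separate step that this sketch does not attempt.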

3.
This paper investigates the utility of the Lomb-Scargle periodogram for the analysis of biological rhythms. This method is particularly suited to detect periodic components in unequally sampled time-series and data sets with missing values, but restricts all calculations to actually measured values. The Lomb-Scargle method was tested on both real and simulated time-series with even and uneven sampling, and compared to a standard method in biomedical rhythm research, the Chi-square periodogram. Results indicate that the Lomb-Scargle algorithm shows a clearly better detection efficiency and accuracy in the presence of noise, and avoids possible bias or erroneous results that may arise from replacement of missing data by interpolation techniques. Hence, the Lomb-Scargle periodogram may serve as a useful method for the study of biological rhythms, especially when applied to telemetrical or observational time-series obtained from free-living animals, i.e., data sets that notoriously lack points.

5.
MOTIVATION: Periodic patterns in time series resulting from biological experiments are of great interest. The commonly used Fast Fourier Transform (FFT) algorithm is applicable only when data are evenly spaced and when no values are missing, which is not always the case in high-throughput measurements. The choice of statistic to evaluate the significance of the periodic patterns for unevenly spaced gene expression time series has not been well substantiated. METHODS: The Lomb-Scargle periodogram approach is used to search time series of gene expression to quantify the periodic behavior of every gene represented on the DNA array. The Lomb-Scargle periodogram analysis provides a direct method to treat missing values and unevenly spaced time points. We propose the combination of a Lomb-Scargle test statistic for periodicity and a multiple hypothesis testing procedure with controlled false discovery rate to detect significant periodic gene expression patterns. RESULTS: We analyzed the Plasmodium falciparum gene expression dataset. In the Quality Control Dataset of 5080 expression patterns, we found 4112 periodic probes. In addition, we identified 243 probes with periodic expression in the Complete Dataset, which could not be examined in the original study by the FFT analysis due to an excessive number of missing values. While most periodic genes had a period of 48 h, some had a period close to 24 h. Our approach should be applicable for detection and quantification of periodic patterns in any unevenly spaced gene expression time-series data.
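The abstract does not say which false discovery rate procedure was used. As one common possibility, the sketch below applies the Benjamini-Hochberg step-up rule to a list of per-gene p-values; the choice of Benjamini-Hochberg, the function name `benjamini_hochberg`, and the example p-values are all assumptions for illustration, not details taken from the paper.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans, True where the hypothesis is rejected.

    Benjamini-Hochberg step-up rule: find the largest rank k such that
    the k-th smallest p-value is <= (k / m) * fdr, then reject the k
    hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected

# Hypothetical periodicity p-values for five probes.
example_p = [0.041, 0.001, 0.216, 0.008, 0.06]
periodic = benjamini_hochberg(example_p, fdr=0.05)
print(periodic)
```

In a workflow like the one described, each p-value would come from the periodicity test statistic of one expression profile, and the probes flagged `True` would be reported as periodic.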

6.
An analogue of the periodogram method for unequally spaced data is presented with a view to resolving the frequency structure of the observations. The algorithm is explicitly based on the sequential least squares procedure. In particular, the key concept is that the within-plot spectral analysis can be augmented by the between-plot information to make inferences about common characteristics. It is also shown how the between-plot random variations can be incorporated into the multiple harmonic regression model. A detailed spectral analysis investigates the periodic fluctuations in four cardio-circulatory variables, measured by autorhythmometric observation by eight men at rest and extending over a time span of 2 years. The spectral curves show the existence of circadian and circaseptan rhythmicities. The amplitude modulation of the dian rhythm by circaseptan variation is assimilated with the rhythmicity of work during the week. The blood-pressure variables situate their maximum annual peak in the winter period. These quasi-periodic fluctuations appear to be related to the amount of physical activity performed in time by the subjects.

7.
In this paper, we introduce a Bayesian statistical model for the analysis of functional data observed at several time points. Examples of such data include the Michigan growth study where we wish to characterize the shape changes of human mandible profiles. The form of the mandible is often used by clinicians as an aid in predicting the mandibular growth. However, whereas many studies have demonstrated the changes in size that may occur during the period of pubertal growth spurt, shape changes have been less well investigated. Considering a group of subjects presenting normal occlusion, in this paper we thus describe a Bayesian functional ANOVA model that provides information about where and when the shape changes of the mandible occur during different stages of development. The model is developed by defining the notion of predictive process models for Gaussian process (GP) distributions used as priors over the random functional effects. We show that the predictive approach is computationally appealing and that it is useful to analyze multivariate functional data with unequally spaced observations that differ among subjects and times. Graphical posterior summaries show that our model is able to provide a biological interpretation of the morphometric findings and that they comprehensively describe the shape changes of the human mandible profiles. Compared with classical cephalometric analysis, this paper represents a significant methodological advance for the study of mandibular shape changes in two dimensions.

8.
The workhorse of modern genetic analysis is the parametric linear model. The advantages of the linear modeling framework are many and include a mathematical understanding of the model fitting process and ease of interpretation. However, an important limitation is that linear models make assumptions about the nature of the data being modeled. These assumptions may not be realistic for complex biological systems such as disease susceptibility where nonlinearities in the genotype to phenotype mapping relationship that result from epistasis, plastic reaction norms, locus heterogeneity, and phenocopy, for example, are the norm rather than the exception. We have previously developed a flexible modeling approach called symbolic discriminant analysis (SDA) that makes no assumptions about the patterns in the data. Rather, SDA lets the data dictate the size, shape, and complexity of a symbolic discriminant function that could include any set of mathematical functions from a list of candidates supplied by the user. Here, we outline a new five-step process for symbolic model discovery that uses genetic programming (GP) for coarse-grained stochastic searching, experimental design for parameter optimization, graphical modeling for generating expert knowledge, and estimation of distribution algorithms for fine-grained stochastic searching. Finally, we introduce function mapping as a new method for interpreting symbolic discriminant functions. We show that function mapping when combined with measures of interaction information facilitates statistical interpretation by providing a graphical approach to decomposing complex models to highlight synergistic, redundant, and independent effects of polymorphisms and their composite functions. We illustrate this five-step SDA modeling process with a real case-control dataset.

9.
Modeling biological processes from time-series data is a resourceful procedure which has received much attention in the literature. For models established in the context of non-linear differential equations, parameter-dependent phenomenological tentative response functions are tested by comparing would-be solutions of those models to the experimental time-series. Those values of the parameters for which a tested solution is a best fit are then retained. It is done with the help of some appropriate optimization algorithm which simplifies the searching procedure within the range of variability of the parameters that are to be estimated. The procedure works well in problems with a small number of adjustable parameters and/or with narrow searching ranges. However, it may start to be problematic for models with a large number of problem parameters inasmuch as convergence to the best fit is not necessarily ensured. In this case, a reduction in size of the parameter estimation problem must be undertaken. We presently address this issue by proposing a systematic procedure that does so in problems in which the system's response to a sufficiently small pulse perturbation of steady-state can be obtained. The response is then assumed to be a solution of the linearized equations, the Jacobian of which can be retrieved by a simple multilinear regression. The calculated n² Jacobian entries provide as many relationships among problem parameters, thus cutting substantially the size of the starting problem. After this preliminary treatment is applied, only (κ − n²) of the initial κ adjustable parameters are left for evaluation by means of a non-linear optimization procedure. The benefits of the present variant are both in economy of computation and in accuracy in determining the parameter values. The performance of the method is established under different circumstances. It is illustrated in the context of power-law rates, although this does not preclude its applicability to more general functional responses.
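The regression step mentioned in this abstract can be pictured with a small sketch: if the deviation x(t) from steady state obeys the linearized system dx/dt = J x, each row of J can be fitted by least squares from sampled deviations and their estimated time derivatives. The code below is only an illustration under stated assumptions (evenly spaced samples, central-difference derivatives, a made-up two-variable system). The names `solve_linear` and `estimate_jacobian` are hypothetical and do not come from the paper.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def estimate_jacobian(deviations, dt):
    """Least-squares fit of dx/dt = J x from evenly sampled deviations."""
    n = len(deviations[0])
    # Central differences approximate the derivative at interior samples.
    xs = deviations[1:-1]
    dxs = [[(nxt[k] - prev[k]) / (2.0 * dt) for k in range(n)]
           for prev, nxt in zip(deviations[:-2], deviations[2:])]
    # Normal equations for row k of J: (X^T X) J_k = X^T dx_k
    gram = [[sum(x[i] * x[j] for x in xs) for j in range(n)] for i in range(n)]
    jacobian = []
    for k in range(n):
        rhs = [sum(x[i] * dx[k] for x, dx in zip(xs, dxs)) for i in range(n)]
        jacobian.append(solve_linear(gram, rhs))
    return jacobian

# Hypothetical two-variable system with a known Jacobian; the sampled
# response is its exact solution for the initial deviation (1, 1).
true_jacobian = [[-1.0, 0.5], [0.0, -2.0]]
dt = 0.01
deviations = [[1.5 * math.exp(-k * dt) - 0.5 * math.exp(-2.0 * k * dt),
               math.exp(-2.0 * k * dt)] for k in range(301)]
estimated = estimate_jacobian(deviations, dt)
print(estimated)
```

With n state variables the fit yields n² numbers, which matches the abstract's count of relationships that can then be used to eliminate parameters before the non-linear optimization.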

10.
The identification of effective connectivity from time-series data such as electroencephalogram (EEG) or time-resolved functional magnetic resonance imaging (fMRI) recordings is an important problem in brain imaging. One commonly used approach to infer effective connectivity is based on vector autoregressive models and the concept of Granger causality. However, this probabilistic concept of causality can lead to spurious causalities in the presence of latent variables. Recently, graphical models have been used to discuss problems of causal inference for multivariate data. In this paper, we extend these concepts to the case of time-series and present a graphical approach for discussing Granger-causal relationships among multiple time-series. In particular, we propose a new graphical representation that allows the characterization of spurious causality and, thus, can be used to investigate spurious causality. The method is demonstrated with concurrent EEG and fMRI recordings which are used to investigate the interrelations between the alpha rhythm in the EEG and blood oxygenation level dependent (BOLD) responses in the fMRI. The results confirm previous findings on the location of the source of the EEG alpha rhythm.

11.
We present a novel approach for analyzing biological time-series data using a context-free language (CFL) representation that allows the extraction and quantification of important features from the time-series. This representation results in Hierarchically AdaPtive (HAP) analysis, a suite of multiple complementary techniques that enable rapid analysis of data and does not require the user to set parameters. HAP analysis generates hierarchically organized parameter distributions that allow multi-scale components of the time-series to be quantified and includes a data analysis pipeline that applies recursive analyses to generate hierarchically organized results that extend traditional outcome measures such as pharmacokinetics and inter-pulse interval. Pulsicons, a novel text-based time-series representation also derived from the CFL approach, are introduced as an objective qualitative comparison nomenclature. We apply HAP to the analysis of 24 hours of frequently sampled pulsatile cortisol hormone data, which has known analysis challenges, from 14 healthy women. HAP analysis generated results in seconds and produced dozens of figures for each participant. The results quantify the observed qualitative features of cortisol data as a series of pulse clusters, each consisting of one or more embedded pulses, and identify two ultradian phenotypes in this dataset. HAP analysis is designed to be robust to individual differences and to missing data and may be applied to other pulsatile hormones. Future work can extend HAP analysis to other time-series data types, including oscillatory and other periodic physiological signals.

12.
Genomics, 2019, 111(4): 636-641
High-throughput time-series data have a special value for studying the dynamism of biological systems. However, the interpretation of such complex data can be challenging. The aim of this study was to compare common algorithms recently developed for the detection of differentially expressed genes in time-course microarray data. Using different measures such as sensitivity, specificity, predictive values, and related signaling pathways, we found that limma, timecourse, and gprege have reasonably good performance for the analysis of datasets in which only the test group is followed over time. However, limma has the additional advantage of being able to report a significance cutoff, making it a more practical tool. In addition, limma and TTCA can be satisfactorily used for datasets with time-series data for all experimental groups. These findings may assist investigators to select appropriate tools for the detection of differentially expressed genes as an initial step in the interpretation of time-course big data.

13.
Kim HY, Kim MJ, Han JI, Kim BK, Lee YS, Lee YS, Kim JH. Bio Systems, 2009, 95(1): 17-25
A time-series microarray experiment is useful to study the changes in the expression of a large number of genes over time. Many methods for clustering genes using gene expression profiles have been suggested, but it is not easy to interpret the biological significance of the results or utilize these methods for understanding the dynamics of gene regulatory systems. In this study, we introduce an algorithm for readjusting the boundaries of clusters by adopting the advantages of both k-means and singular value decomposition (SVD). In addition, we suggest a methodology for searching the principal genes that can be the most crucial genes in regulation of clusters. We found 34 principal genes from 171 clusters having strong concentratedness in their expression patterns and distinct ranges of oscillatory phases, by using a time-series microarray dataset of mouse embryonic stem (ES) cells after induction of dopaminergic neural differentiation. The biological significance of the principal genes examined in the literature supports the feasibility of our algorithms in that the hierarchy of clusters may lead the manifestation of the phenotypes, e.g., the development of the nervous system.

14.
Increases in throughput and decreases in costs have facilitated large scale metabolomics studies, the simultaneous measurement of large numbers of biochemical components in biological samples. Initial large scale studies focused on biomarker discovery for disease or disease progression and helped to understand biochemical pathways underlying disease. The first population-based studies that combined metabolomics and genome wide association studies (mGWAS) have increased our understanding of the (genetic) regulation of biochemical conversions. Measurements of metabolites as intermediate phenotypes are a potentially very powerful approach to uncover how genetic variation affects disease susceptibility and progression. However, we still face many hurdles in the interpretation of mGWAS data. Due to the composite nature of many metabolites, single enzymes may affect the levels of multiple metabolites and, conversely, levels of single metabolites may be affected by multiple enzymes. Here, we will provide a global review of the current status of mGWAS. We will specifically discuss the application of prior biological knowledge present in databases to the interpretation of mGWAS results and discuss the potential of mathematical models. As the technology continuously improves to detect metabolites and to measure genetic variation, it is clear that comprehensive systems biology based approaches are required to further our insight in the association between genes, metabolites and disease. This article is part of a Special Issue entitled: From Genome to Function.

15.
Biological processes are often dynamic, thus researchers must monitor their activity at multiple time points. The most abundant source of information regarding such dynamic activity is time-series gene expression data. These data are used to identify the complete set of activated genes in a biological process, to infer their rates of change, their order and their causal effects and to model dynamic systems in the cell. In this Review we discuss the basic patterns that have been observed in time-series experiments, how these patterns are combined to form expression programs, and the computational analysis, visualization and integration of these data to infer models of dynamic biological systems.

16.
Chis OT, Banga JR, Balsa-Canto E. PloS one, 2011, 6(11): e27755
Analysing the properties of a biological system through in silico experimentation requires a satisfactory mathematical representation of the system including accurate values of the model parameters. Fortunately, modern experimental techniques allow obtaining time-series data of appropriate quality which may then be used to estimate unknown parameters. However, in many cases, a subset of those parameters may not be uniquely estimated, independently of the experimental data available or the numerical techniques used for estimation. This lack of identifiability is related to the structure of the model, i.e. the system dynamics plus the observation function. Despite the interest in knowing a priori whether there is any chance of uniquely estimating all model unknown parameters, the structural identifiability analysis for general non-linear dynamic models is still an open question. There is no method amenable to every model, thus at some point we have to face the selection of one of the possibilities. This work presents a critical comparison of the currently available techniques. To this end, we perform the structural identifiability analysis of a collection of biological models. The results reveal that the generating series approach, in combination with identifiability tableaus, offers the most advantageous compromise among range of applicability, computational complexity and information provided.

17.
18.
Longitudinal data usually consist of a number of short time series. A group of subjects or groups of subjects are followed over time and observations are often taken at unequally spaced time points, and may be at different times for different subjects. When the errors and random effects are Gaussian, the likelihood of these unbalanced linear mixed models can be directly calculated, and nonlinear optimization used to obtain maximum likelihood estimates of the fixed regression coefficients and parameters in the variance components. For binary longitudinal data, a two state, non-homogeneous continuous time Markov process approach is used to model serial correlation within subjects. Formulating the model as a continuous time Markov process allows the observations to be equally or unequally spaced. Fixed and time varying covariates can be included in the model, and the continuous time model allows the estimation of the odds ratio for an exposure variable based on the steady state distribution. Exact likelihoods can be calculated. The initial probability distribution on the first observation on each subject is estimated using logistic regression that can involve covariates, and this estimation is embedded in the overall estimation. These models are applied to an intervention study designed to reduce children's sun exposure.
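To see why a continuous-time formulation accommodates unequal spacing, consider a simplified version of the model in this abstract: a two-state Markov process with constant transition rates, for which the transition probabilities over any time gap have a closed form. The sketch below makes that simplification explicit (the paper's model is non-homogeneous and includes covariates, which this example does not). The function names, the rates, and the observation sequence are hypothetical.

```python
import math

def transition_matrix(rate_01, rate_10, gap):
    """Transition probabilities of a two-state chain over a time gap.

    rate_01 is the constant rate of moving from state 0 to state 1,
    rate_10 the rate of moving from state 1 to state 0.
    """
    total = rate_01 + rate_10
    decay = math.exp(-total * gap)
    p01 = rate_01 / total * (1.0 - decay)
    p10 = rate_10 / total * (1.0 - decay)
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]

def log_likelihood(times, states, rate_01, rate_10, initial_p1):
    """Exact log-likelihood of one subject's binary sequence."""
    ll = math.log(initial_p1 if states[0] == 1 else 1.0 - initial_p1)
    for k in range(1, len(times)):
        # Each gap gets its own transition matrix, so the observation
        # times do not need to be equally spaced.
        p = transition_matrix(rate_01, rate_10, times[k] - times[k - 1])
        ll += math.log(p[states[k - 1]][states[k]])
    return ll

def steady_state_odds(rate_01, rate_10):
    """Odds of state 1 versus state 0 under the steady-state distribution."""
    return rate_01 / rate_10

# Hypothetical subject observed at unequally spaced times.
obs_times = [0.0, 0.5, 2.0, 2.2, 5.0]
obs_states = [0, 0, 1, 1, 0]
ll = log_likelihood(obs_times, obs_states, 0.3, 0.1, 0.5)
print(ll, steady_state_odds(0.3, 0.1))
```

In this simplified setting an odds ratio for an exposure would be the ratio of `steady_state_odds` evaluated at the rates of the exposed and unexposed groups. Maximizing `log_likelihood` over the rates would need a numerical optimizer, which is left out here.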

19.
Most multi-alignment methods are fully automated, i.e. they are based on a fixed set of mathematical rules. For various reasons, such methods may fail to produce biologically meaningful alignments. Herein, we describe a semi-automatic approach to multiple sequence alignment where biological expert knowledge can be used to influence the alignment procedure. The user can specify parts of the sequences that are biologically related to each other; our software program uses these sites as anchor points and creates a multiple alignment respecting these user-defined constraints. By using known functionally, structurally or evolutionarily related positions of the input sequences as anchor points, our method can produce alignments that reflect the true biological relationships among the input sequences more accurately than fully automated procedures can do.

20.
Summary: We have developed two distinct methods of biological rhythm analysis. The procedures are based on existing techniques for analysis of time series, Enright's periodogram and autocorrelation, and both of the new methods use the parameter, period length (τ), for defining oscillatory phenomena. We empirically evaluated the two types of analyses using real biological data from circadian rhythm studies in salamanders and sparrows. The first method permits us to make a statistical comparison of period lengths between groups of animals in given treatments. This method is useful for data where the signal-to-noise ratio of the suspected rhythm is very low; and the method is not adequate for making a definitive judgment from single animals. It can best be applied to the question of whether a signal is entraining a rhythm or not and to questions of group differences in period length. With the second method, we determined period length versus time. Using this procedure, we took into consideration the observation that the period length of many biological oscillations changes with time. The method is applicable to records from individual animals, and it can be used to compare treatment effects in individual animals. The technique can also be used to answer the common question of whether periodicity per se exists within a defined range in a time series. We thank Dr. T. J. Crovello, Mr. Kilian, Mr. B. Bailey, Mr. E. Kluth, Mr. G. Wyche, and the Notre Dame Computer Center. Support was provided by postdoctoral fellowship (l F02-HD-52858) from NIH to S. Binkley; grants to K. Adler (NSF GB-30547 and NIH FR-07 033-05); and NSF postdoctoral fellowship (GU-2058) and an Indiana Academy of Science grant to D. Taylor. Sparrow data used in this report were gathered while S. Binkley was a graduate student at the U. of Texas in Austin (NIH traineeship 5T01 GM-00836-08) and with funds from an NIH program project grant (HD-03803-02) to M. Menaker.
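For readers unfamiliar with the folding idea behind a time-domain periodogram of the kind this abstract builds on, here is a minimal sketch: the series is cut into rows of one trial period, the rows are averaged column by column, and the spread of the averaged profile is taken as the amplitude at that period. This is a generic illustration, not the two methods the authors developed. It assumes evenly spaced samples and integer trial periods, and the name `folded_amplitude` and the test signal are hypothetical.

```python
import math

def folded_amplitude(series, period):
    """Standard deviation of the column means after folding at `period` samples.

    Samples beyond the last complete row are ignored.
    """
    rows = len(series) // period
    column_means = [sum(series[r * period + c] for r in range(rows)) / rows
                    for c in range(period)]
    grand_mean = sum(column_means) / period
    spread = sum((m - grand_mean) ** 2 for m in column_means) / period
    return math.sqrt(spread)

# Hypothetical evenly sampled signal with a period of 6 samples.
series = [math.sin(2.0 * math.pi * i / 6.0) for i in range(60)]
trial_periods = range(4, 10)
amplitudes = {p: folded_amplitude(series, p) for p in trial_periods}
best_period = max(amplitudes, key=amplitudes.get)
print("trial period with the largest folded amplitude:", best_period)
```

Folding at the true period lines the cycles up so the column means keep their shape, while other trial periods average the cycles against each other. Integer multiples of the true period would also score highly, which is one reason the trial range here stops below 12.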
