Found 20 similar articles; search took 15 ms
1.
2.
TimeClust is a user-friendly software package to cluster genes according to their temporal expression profiles. It can be conveniently used to analyze data obtained from DNA microarray time-course experiments. It implements two original algorithms specifically designed for clustering short time series, together with hierarchical clustering and self-organizing maps. AVAILABILITY: TimeClust executable files for Windows and Linux platforms can be downloaded free of charge by non-profit institutions from the following web site: http://aimed11.unipv.it/TimeClust.
3.
Background
Time series gene expression data analysis is widely used to study the dynamics of various cell processes. Most of the time series data available today consist of only a few time points, which makes standard clustering techniques difficult to apply.
4.
5.
Shinkawa T Taoka M Yamauchi Y Ichimura T Kaji H Takahashi N Isobe T 《Journal of proteome research》2005,4(5):1826-1831
We describe the software STEM (STrategic Extractor for Mascot's results), which efficiently processes large-scale mass spectrometry-based proteomics data. V (View)-mode evaluates the Mascot peptide identification dataset, removes unreliable candidates and redundant assignments, and integrates the results with key information in the experiment. C (Comparison)-mode compares peptide coverage among multiple datasets, displays proteins commonly or specifically found therein, and processes data for quantitative studies that use conventional isotope tags or tags having a smaller mass difference. STEM significantly improves the throughput of proteomics studies.
6.
7.
Large-scale expression data are today measured for thousands of genes simultaneously. This development has been followed by an exploration of theoretical tools to get as much information out of these data as possible. Several groups have used principal component analysis (PCA) for this task. However, since this approach is data-driven, care must be taken not to analyze the noise instead of the data. As a strong warning against uncritical use of the output from a PCA, we employ a newly developed procedure to judge the effective dimensionality of a specific data set. Although this data set was obtained during the development of the rat central nervous system, our finding is a general property of noisy time series data. Based on knowledge of the noise level of the data, we find that the effective number of dimensions that are meaningful to use in a PCA is much lower than what could be expected from the number of measurements. We attribute this fact both to the effects of noise and to the lack of independence of the expression levels. Finally, we explore the possibility of increasing the dimensionality by performing more measurements within one time series, and conclude that this is not a fruitful approach.
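The abstract above does not spell out its procedure for judging effective dimensionality. As a rough illustration of the underlying idea only, the sketch below counts the principal components whose variance exceeds a known noise variance. This crude noise-floor threshold is a swapped-in stand-in, not the authors' method, and the function name `effective_dimension` and its `noise_variance` parameter are hypothetical.

```python
import numpy as np


def effective_dimension(data, noise_variance):
    """Count principal components whose variance exceeds a noise floor.

    data: array-like of shape (n_samples, n_features).
    noise_variance: assumed, externally known noise variance (hypothetical input).
    """
    data = np.asarray(data, dtype=float)
    # The eigenvalues of the sample covariance matrix are the PCA variances.
    cov = np.cov(data, rowvar=False)
    eigenvalues = np.linalg.eigvalsh(cov)
    # Components at or below the noise floor are treated as uninformative.
    return int(np.sum(eigenvalues > noise_variance))


# Rank-one example: three features that all follow the same time course.
t = np.linspace(0.0, 1.0, 9)
profiles = np.outer(t, [1.0, 2.0, -1.0])
print(effective_dimension(profiles, noise_variance=0.01))
```

In real noisy data the sample eigenvalues of pure noise scatter around the noise variance, so a plain threshold like this tends to overcount; it is meant only to make the notion of a noise-limited dimension concrete.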
8.
9.
Recent developments in DNA microarray technologies have made the reconstruction of gene regulatory networks (GRNs) feasible. To infer the overall structure of a GRN, there is a need to find out how the expression of each gene can be affected by the others. Many existing approaches to reconstructing GRNs are developed to generate hypotheses about the presence or absence of interactions between genes so that laboratory experiments can be performed afterwards for verification. Since they are not intended to be used to predict whether a gene in an unseen sample has any interactions with other genes, statistical verification of the reliability of the discovered interactions can be difficult. Furthermore, since the temporal ordering of the data is not taken into consideration, the directionality of regulation cannot be established using these existing techniques. To tackle these problems, we propose a data mining technique here. This technique makes use of a probabilistic inference approach to uncover interesting dependency relationships in noisy, high-dimensional time series expression data. It is able to determine not only whether a gene is dependent on another but also whether it is activated or inhibited. In addition, it can predict how a gene would be affected by other genes even in unseen samples. For performance evaluation, the proposed technique has been tested with real expression data. Experimental results show that it can be very effective. The discovered dependency relationships can reveal gene regulatory relationships that could be used to infer the structures of GRNs.
10.
Pereira GS Brandão RM Giuliatti S Zago MA Silva WA 《Genetics and molecular research : GMR》2006,5(1):108-114
Serial analysis of gene expression (SAGE) technology produces large sets of interesting genes that are difficult to analyze directly. Bioinformatics tools are needed to interpret the functional information in these gene sets. We present an interactive web-based tool, called Gene Class, which allows functional annotation of SAGE data using the Gene Ontology (GO) database. This tool performs searches in the GO database for each SAGE tag, making associations in the selected GO category for a level selected in the hierarchy. This system provides user-friendly data navigation and visualization for mapping SAGE data onto the gene ontology structure. This tool also provides graphical visualization of the percentage of SAGE tags in each GO category, along with confidence intervals and hypothesis testing.
11.
12.
Mônica G Campiteli Frederico M Soriani Iran Malavazi Osame Kinouchi Carlos AB Pereira Gustavo H Goldman 《BMC bioinformatics》2009,10(1):270
Background
Microarray techniques have become an important tool for the investigation of genetic relationships and the assignment of different phenotypes. Since microarrays are still very expensive, most experiments are performed with small samples. This paper introduces a method to quantify dependency between data series composed of few sample points. The method is used to construct gene co-expression subnetworks of highly significant edges.
13.
Nonlinearity is important and ubiquitous in ecology. Though detectable in principle, nonlinear behavior is often difficult to characterize, analyze, and incorporate mechanistically into models of ecosystem function. One obvious reason is that quantitative nonlinear analysis tools are data intensive (require long time series), and time series in ecology are generally short. Here we demonstrate a useful method that circumvents data limitation and reduces sampling error by combining ecologically similar multispecies time series into one long time series. With this technique, individual ecological time series containing as few as 20 data points can be mined for such important information as (1) significantly improved forecast ability, (2) the presence and location of nonlinearity, and (3) the effective dimensionality (the number of relevant variables) of an ecological system.
14.
van Wieringen WN Belien JA Vosse SJ Achame EM Ylstra B 《Bioinformatics (Oxford, England)》2006,22(15):1919-1920
SUMMARY: We describe a tool, called ACE-it (Array CGH Expression integration tool). ACE-it links the chromosomal position of the gene dosage measured by array CGH to the genes measured by the expression array. ACE-it uses this link to statistically test whether gene dosage affects RNA expression. AVAILABILITY: ACE-it is freely available at http://ibivu.cs.vu.nl/programs/acewww/.
15.
Amato R Ciaramella A Deniskina N Del Mondo C di Bernardo D Donalek C Longo G Mangano G Miele G Raiconi G Staiano A Tagliaferri R 《Bioinformatics (Oxford, England)》2006,22(5):589-596
MOTIVATION: The huge growth in gene expression data calls for the implementation of automatic tools for data processing and interpretation. RESULTS: We present a new and comprehensive machine learning data mining framework consisting of a non-linear PCA neural network for feature extraction, and probabilistic principal surfaces combined with an agglomerative approach based on Negentropy, aimed at clustering gene microarray data. The method, which provides a user-friendly visualization interface, can work on noisy data with missing points and represents an automatic procedure to obtain, with no a priori assumptions, the number of clusters present in the data. A cell-cycle dataset and a detailed analysis confirm the biological nature of the most significant clusters. AVAILABILITY: The software described here is a subpackage of the ASTRONEURAL package and is available upon request from the corresponding author. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
16.
17.
18.
Dynamic Bayesian network and nonparametric regression for nonlinear modeling of gene networks from time series gene expression data (Cited by: 8 in total; self-citations: 0; citations by others: 8)
We propose a dynamic Bayesian network and nonparametric regression model for constructing a gene network from time series microarray gene expression data. The proposed method can overcome a shortcoming of the Bayesian network model with respect to the construction of cyclic regulations. The proposed method can analyze the microarray data as continuous data and can capture even nonlinear relations among genes. It can be expected that this model will give a deeper insight into complicated biological systems. We also derive a new criterion for evaluating an estimated network from a Bayesian approach. We conduct Monte Carlo experiments to examine the effectiveness of the proposed method. We also demonstrate the proposed method through the analysis of the Saccharomyces cerevisiae gene expression data.
19.
MOTIVATION: Increasingly, biological processes are being studied through time series of RNA expression data collected for large numbers of genes. Because common processes may unfold at varying rates in different experiments or individuals, methods are needed that will allow corresponding expression states in different time series to be mapped to one another. RESULTS: We present implementations of time warping algorithms applicable to RNA and protein expression data and demonstrate their application to published yeast RNA expression time series. Programs executing two warping algorithms are described, a simple warping algorithm and an interpolative algorithm, along with programs that generate graphics that visually present alignment information. We show time warping to be superior to simple clustering at mapping corresponding time states. We document the impact of statistical measurement noise and sample size on the quality of time alignments, and present issues related to statistical assessment of alignment quality through alignment scores. We also discuss directions for algorithm improvement, including development of multiple time series alignments and possible applications to causality searches and non-temporal processes ('concentration warping').
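The time warping idea in the abstract above can be illustrated with the textbook dynamic-programming formulation. The sketch below is a minimal, generic version that assumes one-dimensional series and an absolute-difference cost; it is not the authors' simple or interpolative algorithm, and the function name `dtw_distance` is hypothetical.

```python
def dtw_distance(a, b):
    """Return the minimal cumulative alignment cost between two 1-D series."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the best cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments:
            # advance only in a, advance only in b, or advance in both.
            cost[i][j] = step + min(cost[i - 1][j],
                                    cost[i][j - 1],
                                    cost[i - 1][j - 1])
    return cost[n][m]


# The second series repeats its first time point, i.e. it unfolds more slowly.
print(dtw_distance([0, 1, 2, 3], [0, 0, 1, 2, 3]))
```

Because one time point may be matched to several points in the other series, two profiles that follow the same course at different rates can align at zero cost, which a point-by-point distance would not allow.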
20.
The Graphical Query Language (GQL) is a set of tools for the analysis of gene expression time-courses. They allow a user to pre-process the data, to query it for interesting patterns, to perform model-based clustering or mixture estimation, to include subsequent refinements of clusters and, finally, to use other biological resources to evaluate the results. Analyses are carried out in a graphical and interactive environment, allowing expert intervention in all stages of the data analysis. AVAILABILITY: The GQL package is freely available under the GNU General Public License (GPL) at http://www.ghmm.org/gql