1.

Poisson-based self-organizing feature maps and hierarchical clustering for serial analysis of gene expression data

Wang H Zheng H Azuaje F 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2007,4(2):163-175

2.

Q-omics: Smart Software for Assisting Oncology and Cancer Research

Jieun Lee Youngju Kim Seonghee Jin Heeseung Yoo Sumin Jeong Euna Jeong Sukjoon Yoon 《Molecules and cells》2021,44(11):843

The rapid increase in collateral omics and phenotypic data has enabled data-driven studies for the fast discovery of cancer targets and biomarkers. Thus, it is necessary to develop convenient tools for general oncologists and cancer scientists to carry out customized data mining without computational expertise. For this purpose, we developed innovative software that enables user-driven analyses assisted by knowledge-based smart systems. Publicly available data on mutations, gene expression, patient survival, immune score, drug screening and RNAi screening were integrated from the TCGA, GDSC, CCLE, NCI, and DepMap databases. The optimal selection of samples and other filtering options were guided by the smart function of the software for data mining and visualization on Kaplan-Meier plots, box plots and scatter plots of publication quality. We implemented unique algorithms for both data mining and visualization, thus simplifying and accelerating user-driven discovery activities on large multiomics datasets. The present Q-omics software program (v0.95) is available at http://qomics.sookmyung.ac.kr.

3.

Microarray data analysis and mining tools

Selvaraj S Natarajan J 《Bioinformation》2011,6(3):95-99

4.

Genesis: cluster analysis of microarray data (total citations: 26; self-citations: 0; citations by others: 26)

Sturn A Quackenbush J Trajanoski Z 《Bioinformatics (Oxford, England)》2002,18(1):207-208

5.

Gene-Ontology-based clustering of gene expression data (total citations: 2; self-citations: 0; citations by others: 2)

Adryan B Schuh R 《Bioinformatics (Oxford, England)》2004,20(16):2851-2852

The expected correlation between genetic co-regulation and affiliation to a common biological process does not necessarily hold when numerical cluster algorithms are applied to gene expression data. GO-Cluster uses the tree structure of the Gene Ontology database as a framework for numerical clustering, thus allowing a simple visualization of gene expression data at various levels of the ontology tree. AVAILABILITY: The 32-bit Windows application is freely available at http://www.mpibpc.mpg.de/go-cluster/

6.

Client-server environment for high-performance gene expression data analysis

Sturn A Mlecnik B Pieler R Rainer J Truskaller T Trajanoski Z 《Bioinformatics (Oxford, England)》2003,19(6):772-773

SUMMARY: We have developed a platform-independent, flexible and scalable Java environment for high-performance large-scale gene expression data analysis, which integrates various computationally intensive hierarchical and non-hierarchical clustering algorithms. The environment includes a powerful client for data preparation and results visualization, an application server for computation and an additional administration tool. The package is available free of charge for academic and non-profit institutions.

7.

AVA: visual analysis of gene expression microarray data

Zhou Y Liu J 《Bioinformatics (Oxford, England)》2003,19(2):293-294

SUMMARY: AVA (Array Visual Analyzer) is a Java program that provides a graphical environment for visualization and analysis of gene expression microarray data. Together with its interactive visualization tools and a variety of built-in data analysis and filtration methods, AVA effectively integrates microarray data normalization, quality assessment, and data mining into one application. AVAILABILITY: The software is freely available for academic users on request from the authors.

8.

Attribute clustering for grouping, selection, and classification of gene expression data (total citations: 1; self-citations: 0; citations by others: 1)

Au WH Chan KC Wong AK Wang Y 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2005,2(2):83-101

This paper presents an attribute clustering method which is able to group genes based on their interdependence so as to mine meaningful patterns from the gene expression data. It can be used for gene grouping, selection, and classification. The partitioning of a relational table into attribute subgroups allows a small number of attributes within or across the groups to be selected for analysis. By clustering attributes, the search dimension of a data mining algorithm is reduced. The reduction of search dimension is especially important to data mining in gene expression data because such data typically consist of a huge number of genes (attributes) and a small number of gene expression profiles (tuples). Most data mining algorithms are typically developed and optimized to scale to the number of tuples instead of the number of attributes. The situation becomes even worse when the number of attributes overwhelms the number of tuples, in which case the likelihood of reporting patterns that are actually irrelevant, arising only by chance, becomes rather high. It is for the aforementioned reasons that gene grouping and selection are important preprocessing steps for many data mining algorithms to be effective when applied to gene expression data. This paper defines the problem of attribute clustering and introduces a methodology for solving it. Our proposed method groups interdependent attributes into clusters by optimizing a criterion function derived from an information measure that reflects the interdependence between attributes. By applying our algorithm to gene expression data, meaningful clusters of genes are discovered. The grouping of genes based on attribute interdependence within each group helps to capture different aspects of gene association patterns in each group. Significant genes selected from each group then contain useful information for gene expression classification and identification.
To evaluate the performance of the proposed approach, we applied it to two well-known gene expression data sets and compared our results with those obtained by other methods. Our experiments show that the proposed method is able to find meaningful clusters of genes. By selecting a subset of genes which have high multiple-interdependence with others within clusters, significant classification information can be obtained. Thus, a small pool of selected genes can be used to build classifiers with a very high classification rate. From the pool, gene expressions of different categories can be identified.
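As an illustration of what an information measure of interdependence between attributes can look like, the sketch below scores two discretized gene profiles by their mutual information normalized by their joint entropy, and then picks the gene in a group with the highest summed score against the others. The normalization, the function names, and the toy profiles are assumptions made for this example; the abstract does not state the exact criterion function used in the paper.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def interdependence(a, b):
    """Mutual information I(a;b) divided by joint entropy H(a,b).

    Assumed measure for illustration: 0 means independent, 1 means each
    attribute fully determines the other.
    """
    joint = entropy(list(zip(a, b)))
    if joint == 0.0:  # both attributes are constant
        return 0.0
    mutual_information = entropy(a) + entropy(b) - joint
    return mutual_information / joint


def most_representative(cluster):
    """Return the attribute with the highest summed interdependence
    with the other attributes in the same group.

    cluster: dict mapping a gene name to its discretized profile.
    """
    def total(name):
        return sum(
            interdependence(cluster[name], cluster[other])
            for other in cluster
            if other != name
        )
    return max(cluster, key=total)


# Hypothetical discretized expression profiles (0 = low, 1 = high).
profiles = {
    "g1": [0, 0, 1, 1],
    "g2": [1, 1, 0, 0],  # perfectly anti-correlated with g1
    "g3": [0, 1, 0, 1],  # independent of g1 and g2
}
```

Here `g1` and `g2` determine each other, so either would be a reasonable representative of the group, while `g3` carries no information about them.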

9.

Exploratory and inferential analysis of gene cluster neighborhood graphs

Theresa Scharl Ingo Voglhuber Friedrich Leisch 《BMC bioinformatics》2009,10(1):288

Background

Many different cluster methods are frequently used in gene expression data analysis to find groups of co-expressed genes. However, cluster algorithms with the ability to visualize the resulting clusters are usually preferred. The visualization of gene clusters gives practitioners an understanding of the cluster structure of their data and makes it easier to interpret the cluster results.

10.

Predicting combinatorial binding of transcription factors to regulatory elements in the human genome by association rule mining

Xochitl C Morgan Shulin Ni Daniel P Miranker Vishwanath R Iyer 《BMC bioinformatics》2007,8(1):445

11.

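Clustering analysis techniques for gene expression data and their software tools

Ouyang Yumei (欧阳玉梅) 《生物信息学》2010,8(2):104-109

With the widespread application of DNA chip technology, gene expression data analysis has become one of the research hotspots in the life sciences. This paper outlines the types of gene expression clustering techniques, the classification and characteristics of the algorithms, and the visualization and annotation of results; describes some popular and novel algorithms; introduces 17 recent related software packages and online web service tools; and explains the research trends in software tools.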

12.

The non-negative matrix factorization toolbox for biological data mining

Yifeng Li Alioune Ngom 《Source code for biology and medicine》2013,8(1):10

Background

Non-negative matrix factorization (NMF) has been introduced as an important method for mining biological data. Though packages implemented in R and other programming languages currently exist, they either provide only a few optimization algorithms or focus on a specific application field. No complete NMF package exists for the bioinformatics community to perform various data mining tasks on biological data.

Results

We provide a convenient MATLAB toolbox containing both the implementations of various NMF techniques and a variety of NMF-based data mining approaches for analyzing biological data. Data mining approaches implemented within the toolbox include data clustering and bi-clustering, feature extraction and selection, sample classification, missing values imputation, data visualization, and statistical comparison.

Conclusions

A series of analyses such as molecular pattern discovery, biological process identification, dimension reduction, disease prediction, visualization, and statistical comparison can be performed using this toolbox.
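To make the idea of NMF-based clustering concrete, the sketch below factorizes a small non-negative genes-by-samples matrix V into W·H with the classic multiplicative update rules for the squared-error objective, then assigns each sample to the factor with the largest coefficient in H. This is a plain-Python illustration only: it is not the toolbox's MATLAB API, and the function names, iteration count, and toy matrix are assumptions.

```python
import random


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Approximate v (n x m, non-negative) as w (n x rank) times h (rank x m)
    using multiplicative updates, which keep every entry non-negative."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


def cluster_columns(h):
    """Label each column (sample) by the factor with the largest coefficient."""
    return [max(range(len(h)), key=lambda k: h[k][j]) for j in range(len(h[0]))]


# Toy matrix: rows are genes, columns are samples, with two obvious blocks.
v = [
    [5, 5, 0, 0],
    [4, 4, 0, 0],
    [0, 0, 3, 3],
    [0, 0, 6, 6],
]
w, h = nmf(v, rank=2)
labels = cluster_columns(h)
```

The same factorization supports feature extraction: the columns of `w` act as metagenes and the columns of `h` as low-dimensional sample representations.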

13.

Modeling and Visualizing Uncertainty in Gene Expression Clusters Using Dirichlet Process Mixtures

《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2009,6(4):615-628

Although the use of clustering methods has rapidly become one of the standard computational approaches in the literature of microarray gene expression data, little attention has been paid to uncertainty in the results obtained. Dirichlet process mixture (DPM) models provide a nonparametric Bayesian alternative to the bootstrap approach to modeling uncertainty in gene expression clustering. Most previously published applications of Bayesian model-based clustering methods have been to short time series data. In this paper, we present a case study of the application of nonparametric Bayesian clustering methods to the clustering of high-dimensional non-time-series gene expression data using full Gaussian covariances. We use the probability that two genes belong to the same cluster in a DPM model as a measure of the similarity of these gene expression profiles. Conversely, this probability can be used to define a dissimilarity measure, which, for the purposes of visualization, can be input to one of the standard linkage algorithms used for hierarchical clustering. Biologically plausible results are obtained from the Rosetta compendium of expression profiles which extend previously published cluster analyses of this data.
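The step from co-clustering probabilities to a hierarchical clustering can be sketched as follows. The example assumes that posterior draws of cluster labels are already available (here a hypothetical hand-written list, not output from an actual DPM sampler), estimates the probability that two genes share a cluster as the fraction of draws in which they do, takes one minus that probability as the dissimilarity, and feeds it to a naive average-linkage procedure. Average linkage and all names are choices made for this illustration.

```python
from itertools import combinations


def coclustering_probability(samples):
    """samples: list of label lists, one per posterior draw.
    Returns p where p[i][j] is the fraction of draws in which
    items i and j carry the same cluster label."""
    n = len(samples[0])
    counts = [[0] * n for _ in range(n)]
    for labels in samples:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(samples) for c in row] for row in counts]


def average_distance(dist, a, b):
    """Mean pairwise dissimilarity between two groups of item indices."""
    return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))


def average_linkage(dist):
    """Naive agglomerative clustering on a dissimilarity matrix.
    Returns the merges as (members_a, members_b, height) tuples."""
    clusters = [(i,) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_distance(dist, *pair),
        )
        merges.append((a, b, average_distance(dist, a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical posterior draws of cluster labels for four genes.
draws = [
    [0, 0, 1, 1],
    [0, 0, 1, 2],
    [0, 0, 0, 1],
    [0, 1, 2, 2],
]
p = coclustering_probability(draws)
dissimilarity = [[1.0 - x for x in row] for row in p]
merges = average_linkage(dissimilarity)
```

Genes 0 and 1 share a cluster in three of the four draws, so they merge first at height 0.25; the merge heights can be drawn as a dendrogram whose branch lengths reflect clustering uncertainty.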

14.

Arabidopsis Gene Family Profiler (aGFP) – user-oriented transcriptomic database with easy-to-use graphic interface

Nikoleta Dupľáková David Reňák Patrik Hovanec Barbora Honysová David Twell David Honys 《BMC plant biology》2007,7(1):39

15.

BioWeka--extending the Weka framework for bioinformatics

Gewehr JE Szugat M Zimmer R 《Bioinformatics (Oxford, England)》2007,23(5):651-653

Given the growing amount of biological data, data mining methods have become an integral part of bioinformatics research. Unfortunately, standard data mining tools are often not sufficiently equipped for handling raw data such as amino acid sequences. One popular and freely available framework that contains many well-known data mining algorithms is the Waikato Environment for Knowledge Analysis (Weka). In the BioWeka project, we introduce various input formats for bioinformatics data and bioinformatics methods like alignments to Weka. This allows users to easily combine them with Weka's classification, clustering, validation and visualization facilities on a single platform and therefore reduces the overhead of converting data between different data formats as well as the need to write custom evaluation procedures that can deal with many different programs. We encourage users to participate in this project by adding their own components and data formats to BioWeka. Availability: The software, documentation and tutorial are available at http://www.bioweka.org.

16.

NMPP: a user-customized NimbleGen microarray data processing pipeline (total citations: 1; self-citations: 0; citations by others: 1)

Wang X He H Li L Chen R Deng XW Li S 《Bioinformatics (Oxford, England)》2006,22(23):2955-2957

The NMPP package is a bundle of user-customized tools based on established algorithms and methods to process self-designed NimbleGen microarray data. It features a command-line-based integrative processing procedure that comprises five major functional components, namely the raw microarray data parsing and integrating module, the array spatial effect smoothing and visualization module, the probe-level multi-array normalization module, the gene expression intensity summarization module and the gene expression status inference module. AVAILABILITY: http://plantgenomics.biology.yale.edu/nmpp

17.

Gene expression data analysis using multiobjective clustering improved with SVM based ensemble

Mukhopadhyay A Maulik U Bandyopadhyay S 《In silico biology》2011,11(1-2):19-27

Microarray technology facilitates the monitoring of the expression levels of thousands of genes over different experimental conditions simultaneously. Clustering is a popular data mining tool which can be applied to microarray gene expression data to identify co-expressed genes. Most traditional clustering methods optimize a single clustering goodness criterion and thus may not be capable of performing well on all kinds of datasets. Motivated by this, in this article, a multiobjective clustering technique that optimizes cluster compactness and separation simultaneously has been improved through a novel support vector machine classification based cluster ensemble method. The superiority of MOCSVMEN (MultiObjective Clustering with Support Vector Machine based ENsemble) has been established by comparing its performance with that of several well-known existing microarray data clustering algorithms. Two real-life benchmark gene expression datasets have been used for testing the comparative performances of different algorithms. A recently developed metric, called Biological Homogeneity Index (BHI), which computes the clustering goodness with respect to functional annotation, has been used for the comparison.
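The two competing objectives can be sketched as below. The abstract does not give the exact formulas used by MOCSVMEN, so this example assumes two common definitions: compactness as the summed Euclidean distance of each point to its cluster centroid (lower is better) and separation as the smallest distance between any two centroids (higher is better). A Pareto-dominance check then compares two candidate partitions. All names and the toy data are hypothetical.

```python
from math import dist


def centroids(points, labels):
    """Mean point of each cluster, keyed by cluster label."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    return {
        label: tuple(sum(coord) / len(members) for coord in zip(*members))
        for label, members in groups.items()
    }


def objectives(points, labels):
    """Return (compactness, separation) for a partition with >= 2 clusters."""
    cents = centroids(points, labels)
    compactness = sum(dist(p, cents[l]) for p, l in zip(points, labels))
    keys = list(cents)
    separation = min(
        dist(cents[a], cents[b])
        for i, a in enumerate(keys)
        for b in keys[i + 1:]
    )
    return compactness, separation


def dominates(f, g):
    """True if partition f is no worse than g on both objectives
    (lower compactness, higher separation) and strictly better on one."""
    return f[0] <= g[0] and f[1] >= g[1] and (f[0] < g[0] or f[1] > g[1])


# Toy expression profiles in two conditions, forming two tight groups.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
good = objectives(points, [0, 0, 1, 1])  # groups nearby points
bad = objectives(points, [0, 1, 0, 1])   # mixes the two groups
```

A multiobjective optimizer keeps the set of partitions that no other candidate dominates; in this toy case the first partition dominates the second on both objectives.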

18.

CLICK and EXPANDER: a system for clustering and visualizing gene expression data (total citations: 9; self-citations: 0; citations by others: 9)

Sharan R Maron-Katz A Shamir R 《Bioinformatics (Oxford, England)》2003,19(14):1787-1799

MOTIVATION: Microarrays have become a central tool in biological research. Their applications range from functional annotation to tissue classification and genetic network inference. A key step in the analysis of gene expression data is the identification of groups of genes that manifest similar expression patterns. This translates to the algorithmic problem of clustering genes based on their expression patterns. RESULTS: We present a novel clustering algorithm, called CLICK, and its applications to gene expression analysis. The algorithm utilizes graph-theoretic and statistical techniques to identify tight groups (kernels) of highly similar elements, which are likely to belong to the same true cluster. Several heuristic procedures are then used to expand the kernels into the full clusters. We report on the application of CLICK to a variety of gene expression data sets. In all those applications it outperformed extant algorithms according to several common figures of merit. We also point out that CLICK can be successfully used for the identification of common regulatory motifs in the upstream regions of co-regulated genes. Furthermore, we demonstrate how CLICK can be used to accurately classify tissue samples into disease types, based on their expression profiles. Finally, we present a new Java-based graphical tool, called EXPANDER, for gene expression analysis and visualization, which incorporates CLICK and several other popular clustering algorithms. AVAILABILITY: http://www.cs.tau.ac.il/~rshamir/expander/expander.html

19.

Reliability-oriented bioinformatic networks visualization

Aladağ AE Erten C Sözdinler M 《Bioinformatics (Oxford, England)》2011,27(11):1583-1584

SUMMARY: We present our protein-protein interaction (PPI) network visualization system RobinViz (reliability-oriented bioinformatic networks visualization). Notable features of the system are clustering the PPI network based on gene ontology (GO) annotations or biclustered gene expression data, providing a clustered visualization model based on a central/peripheral duality, computing layouts with algorithms specialized for interaction reliabilities represented as weights, and completely automated data acquisition and processing. AVAILABILITY: RobinViz is free, open-source software protected under GPL. It is written in C++ and Python, and consists of almost 30 000 lines of code, excluding the employed libraries. Source code, user manual and other Supplementary Material are available for download at http://code.google.com/p/robinviz/.

20.

TomExpress, a unified tomato RNA‐Seq platform for visualization of expression data, clustering and correlation networks

Mohamed Zouine Elie Maza Anis Djari Mattieu Lauvernier Pierre Frasse Abdelaziz Smouni Julien Pirrello Mondher Bouzayen 《The Plant journal : for cell and molecular biology》2017,92(4):727-735
