1.

Approximate geodesic distances reveal biologically relevant structures in microarray data (total citations: 1; self-citations: 0; citations by others: 1)

Nilsson J Fioretos T Höglund M Fontes M 《Bioinformatics (Oxford, England)》2004,20(6):874-880

MOTIVATION: Genome-wide gene expression measurements, as currently determined by microarray technology, can be represented mathematically as points in a high-dimensional gene expression space. Genes interact with each other in regulatory networks, restricting the cellular gene expression profiles to a certain manifold, or surface, in gene expression space. To obtain knowledge about this manifold, various dimensionality reduction methods and distance metrics are used. For data points distributed on curved manifolds, a sensible distance measure would be the geodesic distance along the manifold. In this work, we examine whether an approximate geodesic distance measure captures biological similarities better than the traditionally used Euclidean distance. RESULTS: We computed approximate geodesic distances, determined by the Isomap algorithm, for one set of lymphoma and one set of lung cancer microarray samples. Compared with the ordinary Euclidean distance metric, this distance measure produced more instructive, biologically relevant visualizations when applying multidimensional scaling. This suggests the Isomap algorithm as a promising tool for the interpretation of microarray data. Furthermore, the results demonstrate the benefit and importance of taking nonlinearities in gene expression data into account.
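The contrast between Euclidean and approximate geodesic distance can be illustrated with a small sketch. It is not the authors' code: it assumes the usual Isomap recipe (connect each point to its k nearest neighbours, then take shortest paths through that graph), and the function names, the toy half-circle data and the choice k=2 are illustrative assumptions. The multidimensional scaling step that would follow is omitted.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def approximate_geodesic(points, k):
    """Isomap-style distances: shortest paths through a k-nearest-neighbour graph."""
    n = len(points)
    inf = float("inf")
    dist = [[inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
        # Connect each point to its k nearest neighbours (edges are symmetric).
        neighbours = sorted((j for j in range(n) if j != i),
                            key=lambda j: euclidean(points[i], points[j]))[:k]
        for j in neighbours:
            d = euclidean(points[i], points[j])
            dist[i][j] = min(dist[i][j], d)
            dist[j][i] = min(dist[j][i], d)
    # Floyd-Warshall all-pairs shortest paths over the neighbourhood graph.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][m] + dist[m][j] < dist[i][j]:
                    dist[i][j] = dist[i][m] + dist[m][j]
    return dist

# Toy "manifold": 21 points on a unit half-circle (hypothetical data).
points = [(math.cos(math.pi * t / 20), math.sin(math.pi * t / 20)) for t in range(21)]
geo = approximate_geodesic(points, k=2)
straight = euclidean(points[0], points[-1])   # cuts across the half-circle
along = geo[0][-1]                            # follows the curve
print(f"Euclidean: {straight:.3f}, approximate geodesic: {along:.3f}")
```

For the two end points the straight-line distance is the diameter of the circle, while the graph distance stays close to the arc length, which is the property the abstract relies on for curved data.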

2.

Quantitative quality control in microarray image processing and data acquisition (total citations: 8; self-citations: 3; citations by others: 5)

Wang X Ghosh S Guo SW 《Nucleic acids research》2001,29(15):e75-E75

A new integrated image analysis package with quantitative quality control schemes is described for cDNA microarray technology. The package employs an iterative algorithm that utilizes both intensity characteristics and spatial information of the spots on a microarray image for signal–background segmentation and defines five quality scores for each spot to record irregularities in spot intensity, size and background noise levels. A composite score q_com is defined based on these individual scores to give an overall assessment of spot quality. Using q_com we demonstrate that the inherent variability in intensity ratio measurements is closely correlated with spot quality, namely spots with higher quality give less variable measurements and vice versa. In addition, gauging data by q_com can improve data reliability dramatically and efficiently. We further show that the variability in ratio measurements drops exponentially with increasing q_com and, for the majority of spots at the high quality end, this improvement is mainly due to an improvement in correlation between the two dyes. Based on these studies, we discuss the potential of quantitative quality control for microarray data and the possibility of filtering and normalizing microarray data using a quality metrics-dependent scheme.

3.

Reproducibility of microarray data: a further analysis of microarray quality control (MAQC) data

James J Chen Huey-Miin Hsueh Robert R Delongchamp Chien-Ju Lin Chen-An Tsai 《BMC bioinformatics》2007,8(1):412

Background

Many researchers are concerned with the comparability and reliability of microarray gene expression data. Recent completion of the MicroArray Quality Control (MAQC) project provides a unique opportunity to assess reproducibility across multiple sites and the comparability across multiple platforms. The MAQC analysis presented for the conclusion of inter- and intra-platform comparability/reproducibility of microarray gene expression measurements is inadequate. We evaluate the reproducibility/comparability of the MAQC data for 12901 common genes in four titration samples generated from five high-density one-color microarray platforms and the TaqMan technology. We discuss some of the problems with the use of the correlation coefficient as a metric to evaluate the inter- and intra-platform reproducibility and the percent of overlapping genes (POG) as a measure for evaluation of a gene selection procedure by MAQC.
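As a rough illustration of the POG measure named in this abstract, the sketch below assumes its simplest form: the share of genes that two equally long selected-gene lists have in common. The function name and the gene lists are made up for the example, and refinements such as requiring the same direction of change are left out.

```python
def percent_overlapping_genes(list_a, list_b):
    """Percentage of genes shared by two selected-gene lists of equal length."""
    if len(list_a) != len(list_b):
        raise ValueError("POG is computed here for lists of the same length")
    shared = set(list_a) & set(list_b)
    return 100.0 * len(shared) / len(list_a)

# Hypothetical top-4 gene lists from two sites or platforms.
site_1 = ["TP53", "MYC", "EGFR", "KRAS"]
site_2 = ["MYC", "BRCA1", "TP53", "PTEN"]
pog = percent_overlapping_genes(site_1, site_2)
print(pog)
```

Because the value depends on the chosen list length, it is usually reported together with that length.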

4.

Software package for automatic microarray image analysis (MAIA)

Novikov E Barillot E 《Bioinformatics (Oxford, England)》2007,23(5):639-640

Although various software solutions are currently available for microarray image analysis, one would still expect to develop algorithms ensuring a higher level of intelligence and robustness. We present a fully functional software package for automatic processing of two-color microarray images including spot localization, quantification and quality control. The developed algorithms aim at making ratio estimates more resistant to array contamination and offer automatic tools to evaluate spot quality. Availability: A demo version of the software can be downloaded from http://bioinfo.curie.fr/projects/maia. A full version is freely available to non-commercial users upon request from the authors.

5.

springScape: visualisation of microarray and contextual bioinformatic data using spring embedding and an 'information landscape'

Ebbels TM Buxton BF Jones DT 《Bioinformatics (Oxford, England)》2006,22(14):e99-107

The interpretation of microarray and other high-throughput data is highly dependent on the biological context of experiments. However, standard analysis packages are poor at simultaneously presenting both the array and related bioinformatic data. We have addressed this challenge by developing a system, springScape, based on 'spring embedding' and an 'information landscape', allowing several related data sources to be dynamically combined while highlighting one particular feature. Each data source is represented as a network of nodes connected by weighted edges. The networks are combined and embedded in the 2-D plane by spring embedding such that nodes with a high similarity are drawn close together. Complex relationships can be discovered by varying the weight of each data source and observing the dynamic response of the spring network. By modifying Procrustes analysis, we find that the visualizations have an acceptable degree of reproducibility. The 'information landscape' highlights one particular data source, displaying it as a smooth surface whose height is proportional to both the information being viewed and the density of nodes. The algorithm is demonstrated using several microarray data sets in combination with protein-protein interaction data and GO annotations. Among the features revealed are the spatio-temporal profile of gene expression and the identification of GO terms correlated with gene expression and protein interactions. The power of this combined display lies in its interactive feedback and exploitation of human visual pattern recognition. Overall, springScape shows promise as a tool for the interpretation of microarray data in the context of relevant bioinformatic information.
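The idea of spring embedding, in which nodes joined by high-similarity edges are drawn close together in the plane, can be sketched in a few lines. This is not the springScape implementation: the rest length of 1 minus the weight, the step size, the circular starting layout and the four-node example are all assumptions made for illustration.

```python
import math

def spring_embed(n_nodes, edges, iterations=500, step=0.05):
    """Place nodes in 2-D so that strongly connected nodes end up close together.

    edges maps (i, j) pairs to a similarity weight in (0, 1]. Each edge acts as a
    spring whose rest length is 1 - weight (an assumption of this sketch).
    """
    # Deterministic start: nodes evenly spaced on the unit circle.
    pos = [[math.cos(2 * math.pi * i / n_nodes), math.sin(2 * math.pi * i / n_nodes)]
           for i in range(n_nodes)]
    for _ in range(iterations):
        force = [[0.0, 0.0] for _ in range(n_nodes)]
        for (i, j), w in edges.items():
            dx = pos[j][0] - pos[i][0]
            dy = pos[j][1] - pos[i][1]
            d = math.hypot(dx, dy) or 1e-9
            # Hooke's law: pull together when stretched, push apart when compressed.
            f = (d - (1.0 - w)) / d
            force[i][0] += f * dx
            force[i][1] += f * dy
            force[j][0] -= f * dx
            force[j][1] -= f * dy
        for i in range(n_nodes):
            pos[i][0] += step * force[i][0]
            pos[i][1] += step * force[i][1]
    return pos

def gap(pos, i, j):
    """Distance between two embedded nodes."""
    return math.hypot(pos[i][0] - pos[j][0], pos[i][1] - pos[j][1])

# Hypothetical network: nodes 0-1 and 2-3 are similar, all other pairs are not.
edges = {(0, 1): 0.9, (2, 3): 0.9,
         (0, 2): 0.1, (0, 3): 0.1, (1, 2): 0.1, (1, 3): 0.1}
pos = spring_embed(4, edges)
print(f"similar pair: {gap(pos, 0, 1):.2f}, dissimilar pair: {gap(pos, 0, 2):.2f}")
```

Changing an edge weight and re-running the relaxation gives a crude picture of the "dynamic response" the abstract describes when the weight of a data source is varied.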

6.

Statistical methods of translating microarray data into clinically relevant diagnostic information in colorectal cancer

Kim BS Kim I Lee S Kim S Rha SY Chung HC 《Bioinformatics (Oxford, England)》2005,21(4):517-528

MOTIVATION: It is a common practice in cancer microarray experiments that a normal tissue is collected from the same individual from whom the tumor tissue was taken. The indirect design is usually adopted for the experiment that uses a common reference RNA hybridized both to normal and tumor tissues. However, it is often the case that the test material is not large enough for the experimenter to extract enough RNA to conduct the microarray experiment. Hence, collecting n cases does not necessarily end up with a matched pair sample of size n. Instead we usually have a matched pair sample of size n1, and two independent samples of sizes n2 and n3, respectively, for 'reference versus normal tissue only' and 'reference versus tumor tissue only' hybridizations (n = n1 + n2 + n3). Standard statistical methods need to be modified and new statistical procedures need to be developed for analyzing this mixed dataset. RESULTS: We propose a new test statistic, t3, as a means of combining all the information in the mixed dataset for detecting differentially expressed (DE) genes between normal and tumor tissues. We employed the extended receiver operating characteristic approach to the mixed dataset. We devised a measure of disagreement between an RT-PCR experiment and a microarray experiment. Hotelling's T² statistic is employed to detect a set of DE genes and its prediction rate is compared with the prediction rate of a univariate procedure. We observe that Hotelling's T² statistic detects DE genes more efficiently than a univariate procedure and that further research is warranted on the formal test procedure using Hotelling's T² statistic. CONTACT: bskim@yonsei.ac.kr.

7.

Spectral estimation in unevenly sampled space of periodically expressed microarray time series data

Alan Wee-Chung Liew Jun Xian Shuanhu Wu David Smith Hong Yan 《BMC bioinformatics》2007,8(1):137

Background

Periodogram analysis of time series is widespread in biology. A new challenge in analyzing microarray time series data is to identify genes that are periodically expressed. This challenge arises because the observed time series usually exhibit non-idealities, such as noise, short length, and unevenly sampled time points. Most methods used in the literature operate on evenly sampled time series and are not suitable for unevenly sampled time series.
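To make the problem of uneven sampling concrete, the sketch below uses the classical Lomb-Scargle periodogram, a standard estimator that works directly on unevenly spaced time points. It is a stand-in chosen for illustration, not the method proposed in this paper, and the sampling times, the 60-unit period and the candidate period grid are invented for the example.

```python
import math

def lomb_scargle(times, values, angular_freqs):
    """Classical Lomb-Scargle periodogram for unevenly sampled data."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    power = []
    for w in angular_freqs:
        # The time offset tau makes the sine and cosine terms orthogonal.
        tau = math.atan2(sum(math.sin(2 * w * t) for t in times),
                         sum(math.cos(2 * w * t) for t in times)) / (2 * w)
        c = [math.cos(w * (t - tau)) for t in times]
        s = [math.sin(w * (t - tau)) for t in times]
        yc = sum((v - mean) * ci for v, ci in zip(values, c))
        ys = sum((v - mean) * si for v, si in zip(values, s))
        cc = sum(ci * ci for ci in c)
        ss = sum(si * si for si in s)
        power.append((yc * yc / cc + ys * ys / ss) / (2 * var))
    return power

# Hypothetical expression profile with period 60, sampled at uneven time points.
times = [0, 7, 15, 30, 38, 52, 61, 75, 90, 98, 113, 120, 134, 150, 157]
values = [math.sin(2 * math.pi * t / 60) for t in times]
periods = [30, 40, 50, 60, 70, 80, 90, 120]
power = lomb_scargle(times, values, [2 * math.pi / p for p in periods])
best_period = max(zip(power, periods))[1]
print(best_period)
```

Real expression profiles are short and noisy, so in practice the peak would also need a significance assessment, which this sketch does not attempt.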

8.

Better genechip microarray layouts by combining probe placement and embedding

de Carvalho SA Rahmann S 《Journal of bioinformatics and computational biology》2008,6(3):623-641

The microarray layout problem is a generalization of the border length minimization problem, and asks to distribute oligonucleotide probes on a microarray and to determine their embeddings in the deposition sequence in such a way that the overall quality of the resulting synthesized probes is maximized. Because of its inherent computational complexity, it is traditionally attacked in several phases: partitioning, placement, and re-embedding. We present the first algorithm, Greedy+, that combines placement and embedding and that results in improved layouts in terms of border length and conflict index (a more realistic measure of probe quality), both on arrays of random probes and on existing Affymetrix GeneChip arrays. We also present a detailed study on the layouts of the latest GeneChip arrays, and show how Greedy+ can further improve layout quality by as much as 12% in terms of border length and 35% in terms of conflict index.
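The border length objective can be made concrete with a short sketch. It assumes the common definition in which the border length of a layout is the sum, over neighbouring spots, of the number of synthesis steps where exactly one of the two probes is extended (the Hamming distance between their embeddings). Greedy+ and the conflict index are not implemented here; the left-most embedding rule, the cyclic deposition sequence and the tiny 2x2 layouts are illustrative assumptions.

```python
def leftmost_embedding(probe, deposition):
    """Mark the synthesis steps that extend the probe, taking each base as early as possible."""
    embedding, pos = [], 0
    for base in deposition:
        productive = pos < len(probe) and probe[pos] == base
        embedding.append(productive)
        pos += productive
    if pos != len(probe):
        raise ValueError("probe does not fit the deposition sequence")
    return embedding

def border_length(layout, deposition):
    """Sum of Hamming distances between embeddings of horizontally and vertically adjacent spots."""
    emb = [[leftmost_embedding(p, deposition) for p in row] for row in layout]
    total = 0
    for r, row in enumerate(emb):
        for c, e in enumerate(row):
            if c + 1 < len(row):
                total += sum(a != b for a, b in zip(e, row[c + 1]))
            if r + 1 < len(emb):
                total += sum(a != b for a, b in zip(e, emb[r + 1][c]))
    return total

deposition = "ACGT" * 3
# Same four probes, two placements: similar probes adjacent versus interleaved.
grouped = [["ACG", "ACT"], ["ACT", "TTT"]]
interleaved = [["ACG", "TTT"], ["TTT", "ACT"]]
print(border_length(grouped, deposition), border_length(interleaved, deposition))
```

Placing probes with similar embeddings next to each other lowers the total, which is why placement and embedding interact and why the abstract treats them jointly.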

9.

Selecting a minimal number of relevant genes from microarray data to design accurate tissue classifiers

Huang HL Lee CC Ho SY 《Bio Systems》2007,90(1):78-86

It is essential to select a minimal number of relevant genes from microarray data while maximizing classification accuracy for the development of inexpensive diagnostic tests. However, it is intractable to simultaneously optimize gene selection and classification accuracy, which is a large parameter optimization problem. We propose an efficient evolutionary approach to gene selection from microarray data which can be combined with the optimal design of various multiclass classifiers. The proposed method (named GeneSelect) consists of three closely cooperating parts: an efficient encoding scheme of candidate solutions, a generalized fitness function, and an intelligent genetic algorithm (IGA). An existing hybrid approach based on a genetic algorithm and maximum likelihood classification (GA/MLHD) was proposed to select a small number of relevant genes for accurate classification of samples. To evaluate the performance of GeneSelect, the gene selection is combined with the same maximum likelihood classification (named IGA/MLHD) for convenient comparisons. IGA/MLHD is applied to 11 cancer-related human gene expression datasets. The simulation results show that IGA/MLHD is superior to GA/MLHD in terms of the number of selected genes, classification accuracy, and robustness of selected genes and accuracy.

10.

Maintaining data integrity in microarray data management

Grant GR Manduchi E Pizarro A Stoeckert CJ 《Biotechnology and bioengineering》2003,84(7):795-800

Gene expression microarrays are a relatively new technology, dating back just a few years, yet they have already become a very widely used tool in biology, and have evolved to a wide range of applications well beyond their original design intent. However, while the use of microarrays has expanded, and the issues of performance optimization have been intensively studied, the fundamental issue of data integrity management has largely been ignored. Now that performance has improved so greatly, the shortcomings of data integrity control methods constitute a greater percent of the stumbling blocks for investigators. Microarray data are cumbersome, and the rule up to this point has mostly been one of hands-on transformations, leading to human errors which often have dramatic consequences. We show in this review that the time lost on such mistakes is enormous and dramatically affects results; therefore, mistakes should be mitigated in any way possible. We outline the scope of the data integrity issue and survey some of the most common and dangerous data transformations and their shortcomings. To illustrate, we review some case studies. We then look at the work done by the research community on this issue (which admittedly is meager up to this point). Some data integrity issues are always going to be difficult, while others will become easier; one of our goals is to expedite the use of integrity control methods. Finally, we present some preliminary guidelines and some specific approaches that we believe should be the focus of future research.

11.

Global rank-invariant set normalization (GRSN) to reduce systematic distortions in microarray data

Carl R Pelz Molly Kulesz-Martin Grover Bagby Rosalie C Sears 《BMC bioinformatics》2008,9(1):520

12.

The effects of normalization on the correlation structure of microarray data

Xing Qiu Andrew I Brooks Lev Klebanov Andrei Yakovlev 《BMC bioinformatics》2005,6(1):120

Background

Stochastic dependence between gene expression levels in microarray data is of critical importance for the methods of statistical inference that resort to pooling test statistics across genes. It is frequently assumed that dependence between genes (or tests) is sufficiently weak to justify the proposed methods of testing for differentially expressed genes. A potential impact of between-gene correlations on the performance of such methods has yet to be explored.

13.

Statistical methods and microarray data (total citations: 1; self-citations: 0; citations by others: 1)

Klebanov L Qiu X Welle S Yakovlev A 《Nature biotechnology》2007,25(1):25-6; author reply 26-7

14.

Freeze-drying and embedding in glycol methacrylate (GMA)

Dieter Mitrenga Arnold Wolfgang Heinz v. Mayersbach 《Histochemistry and cell biology》1974,39(4):313-326

15.

PHOENIX, a web interface for (re)analysis of microarray data

Fabrice Berger Benoît De Hertogh Michaël Pierre Eric Bareke Anthoula Gaigneaux Eric Depiereux 《Central European Journal of Biology》2009,4(4):603-618

Microarrays are tools to study the expression profile of an entire genome. Technology, statistical tools and biological knowledge in general have evolved over the past ten years and it is now possible to improve analysis of previous datasets. We have developed a web interface called PHOENIX that automates the analysis of microarray data from preprocessing to the evaluation of significance through manual or automated parameterization. At each analytical step, several methods are possible for (re)analysis of data. PHOENIX evaluates a consensus score from several methods and thus determines the performance level of the best methods (even if the best performing method is not known). With an estimate of the true gene list, PHOENIX can evaluate the performance of methods or compare the results with other experiments. Each method used for differential expression analysis and performance evaluation has been implemented in the PEGASE back-end package, along with additional tools to further improve PHOENIX. Future developments will involve the addition of steps (CDF selection, geneset analysis, meta-analysis), methods (PLIER, ANOVA, Limma), benchmarks (spike-in and simulated datasets), and illustration of the results (automatically generated report).

16.

Parallelization of multicategory support vector machines (PMC-SVM) for classifying microarray data

Zhang C Li P Rajendran A Deng Y Chen D 《BMC bioinformatics》2006,7(Z4):S15

17.

Using Generalized Procrustes Analysis (GPA) for normalization of cDNA microarray data

Huiling Xiong Dapeng Zhang Christopher J Martyniuk Vance L Trudeau Xuhua Xia 《BMC bioinformatics》2008,9(1):25

Background

Normalization is essential in dual-labelled microarray data analysis to remove non-biological variations and systematic biases. Many normalization methods have been used to remove such biases within slides (Global, Lowess) and across slides (Scale, Quantile and VSN). However, all these popular approaches make critical assumptions about data distribution, which are often not valid in practice.
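Of the across-slide methods listed above, quantile normalization is simple enough to sketch, and the sketch also shows the kind of distributional assumption the abstract refers to: every array is forced to share one reference distribution. The code below is a minimal version with made-up numbers; it does not average tied values, and it is not the GPA method of this paper.

```python
def quantile_normalize(columns):
    """Give every array (column) the same distribution: the mean of the sorted columns."""
    n = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: average of the k-th smallest value across arrays.
    reference = [sum(col[k] for col in sorted_cols) / len(columns) for k in range(n)]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])  # indices from smallest to largest
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]
        normalized.append(out)
    return normalized

# Two hypothetical arrays measuring the same three genes.
arrays = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
normalized = quantile_normalize(arrays)
print(normalized)
```

After normalization each array contains the same set of values, and only the ranking of genes within an array is preserved.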

18.

Importance of data structure in comparing two dimension reduction methods for classification of microarray gene expression data

Caroline Truntzer Catherine Mercier Jacques Estève Christian Gautier Pascal Roy 《BMC bioinformatics》2007,8(1):90

Background

With the advance of microarray technology, several methods for gene classification and prognosis have already been designed. However, under various denominations, some of these methods have similar approaches. This study evaluates the influence of gene expression variance structure on the performance of methods that describe the relationship between gene expression levels and a given phenotype through projection of data onto discriminant axes.

19.

Basics and principles of particle image velocimetry (PIV) for mapping biogenic and biologically relevant flows

Eize J. Stamhuis 《Aquatic Ecology》2006,40(4):463-479

Particle image velocimetry (PIV) has proven to be a very useful technique in mapping animal-generated flows or flow patterns relevant to biota. Here, theoretical background is provided and experimental details of 2-dimensional digital PIV are explained for mapping flow produced by or relevant to aquatic biota. The main principles are clarified in sections on flow types, seeding, illumination, imaging, repetitive correlation analysis, post-processing and result interpretation, with reference to experimental situations. Examples from the benthic environment, namely, on filter feeding in barnacles and in bivalves, illustrate what the experiments comprise and what the results look like. Finally, alternative particle imaging flow analysis techniques are discussed briefly in the context of mapping biogenic and biologically relevant flows.

20.

Query-driven module discovery in microarray data (total citations: 1; self-citations: 0; citations by others: 1)

Dhollander T Sheng Q Lemmens K De Moor B Marchal K Moreau Y 《Bioinformatics (Oxford, England)》2007,23(19):2573-2580

MOTIVATION: Existing (bi)clustering methods for microarray data analysis often do not answer the specific questions of interest to a biologist. Such specific questions could be derived from other information sources, including expert prior knowledge. More specifically, given a set of seed genes which are believed to have a common function, we would like to recruit genes with similar expression profiles as the seed genes in a significant subset of experimental conditions. RESULTS: We introduce QDB, a novel Bayesian query-driven biclustering framework in which the prior distributions allow introducing knowledge from a set of seed genes (query) to guide the pattern search. In two well-known yeast compendia, we grow highly functionally enriched biclusters from small sets of seed genes using a resolution sweep approach. In addition, relevant conditions are identified and modularity of the biclusters is demonstrated, including the discovery of overlapping modules. Finally, our method deals with missing values naturally, performs well on artificial data from a recent biclustering benchmark study and has a number of conceptual advantages when compared to existing approaches for focused module search.