
1.

A summarization approach for Affymetrix GeneChip data using a reference training set from a large, biologically diverse database

Simon Katz Rafael A Irizarry Xue Lin Mark Tripputi Mark W Porter 《BMC bioinformatics》2006,7(1):464

Background

Many of the most popular pre-processing methods for Affymetrix expression arrays, such as RMA, gcRMA, and PLIER, simultaneously analyze data across a set of predetermined arrays to improve the precision of the final measures of expression. One problem associated with these algorithms is that expression measurements for a particular sample are highly dependent on the set of samples used for normalization, and results obtained by normalization with a different set may not be comparable. A related problem is that an organization producing and/or storing large amounts of data in a sequential fashion will need to either re-run the pre-processing algorithm every time an array is added or store the arrays in batches that are pre-processed together. Furthermore, pre-processing of large numbers of arrays requires loading all the feature-level data into memory, which is a difficult task even with modern computers. We utilize a scheme that produces all the information necessary for pre-processing using a very large training set that can be used for summarization of samples outside of the training set. All subsequent pre-processing tasks can be done on an individual array basis. We demonstrate the utility of this approach by defining a new version of the Robust Multi-chip Averaging (RMA) algorithm, which we refer to as refRMA.
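
The idea of deriving pre-processing information once from a training set and then applying it to one array at a time can be illustrated with quantile normalization against a frozen reference distribution. This is only a minimal sketch under stated assumptions: the abstract does not say which quantities refRMA stores, and the function names `reference_quantiles` and `normalize_to_reference` are hypothetical, not part of any published package.

```python
from statistics import mean


def reference_quantiles(training_arrays):
    """Build a reference distribution from equally sized training arrays.

    Each array is sorted, then intensities are averaged rank by rank.
    (Assumption: this stands in for the stored training-set information.)
    """
    sorted_arrays = [sorted(a) for a in training_arrays]
    return [mean(values) for values in zip(*sorted_arrays)]


def normalize_to_reference(array, reference):
    """Normalize a single array using only the frozen reference.

    Each intensity is replaced by the reference value of the same rank.
    Ties are broken by position, a simplification for illustration.
    """
    if len(array) != len(reference):
        raise ValueError("array and reference must have the same length")
    order = sorted(range(len(array)), key=lambda i: array[i])
    normalized = [0.0] * len(array)
    for rank, index in enumerate(order):
        normalized[index] = reference[rank]
    return normalized


# The reference is computed once from the training set ...
reference = reference_quantiles([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
# ... and a new array is then processed without reloading the training data.
new_array = normalize_to_reference([10.0, 7.0, 8.0], reference)
```

Because the new array depends only on the stored reference, adding it does not change the values already computed for other arrays, which is the property the abstract emphasizes.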

2.

A probe-treatment-reference (PTR) model for the analysis of oligonucleotide expression microarrays

Huanying Ge Chao Cheng Lei M Li 《BMC bioinformatics》2008,9(1):194


3.

SED, a normalization free method for DNA microarray data analysis

Huajun Wang Hui Huang 《BMC bioinformatics》2004,5(1):121

Background

Analysis of DNA microarray data usually begins with a normalization step where intensities of different arrays are adjusted to the same scale so that the intensity levels from different arrays can be compared with one another. Both simple total array intensity-based as well as more complex "local intensity level" dependent normalization methods have been developed, some of which are widely used. Much less developed methods for microarray data analysis include those that bypass the normalization step and therefore yield results that are not confounded by potential normalization errors.

4.

Normalization of Illumina Infinium whole-genome SNP data improves copy number estimates and allelic intensity ratios

Johan Staaf Johan Vallon-Christersson David Lindgren Gunnar Juliusson Richard Rosenquist Mattias Höglund Åke Borg Markus Ringnér 《BMC bioinformatics》2008,9(1):409

Background

Illumina Infinium whole genome genotyping (WGG) arrays are increasingly being applied in cancer genomics to study gene copy number alterations and allele-specific aberrations such as loss-of-heterozygosity (LOH). Methods developed for normalization of WGG arrays have mostly focused on diploid, normal samples. However, for cancer samples, genomic aberrations may confound normalization and data interpretation. Therefore, we examined the effects of the conventionally used normalization method for Illumina Infinium arrays when applied to cancer samples.

5.

Delineation of amplification, hybridization and location effects in microarray data yields better-quality normalization

Marc Hulsman Anouk Mentink Eugene P van Someren Koen J Dechering Jan de Boer Marcel JT Reinders 《BMC bioinformatics》2010,11(1):156

Background

Oligonucleotide arrays have become one of the most widely used high-throughput tools in biology. Due to their sensitivity to experimental conditions, normalization is a crucial step when comparing measurements from these arrays. Normalization is, however, far from a solved problem. Frequently, we encounter datasets with significant technical effects that currently available methods are not able to correct.

6.

Spatial normalization improves the quality of genotype calling for Affymetrix SNP 6.0 arrays

High Seng Chai Terry M Therneau Kent R Bailey Jean-Pierre A Kocher 《BMC bioinformatics》2010,11(1):356

Background

Microarray measurements are susceptible to a variety of experimental artifacts, some of which give rise to systematic biases that are spatially dependent in a unique way on each chip. It is likely that such artifacts affect many SNP arrays, but the normalization methods used in currently available genotyping algorithms make no attempt at spatial bias correction. Here, we propose an effective single-chip spatial bias removal procedure for Affymetrix 6.0 SNP arrays or platforms with similar design features. This procedure deals with both extreme and subtle biases and is intended to be applied before standard genotype calling algorithms.

7.

The effect of oligonucleotide microarray data pre-processing on the analysis of patient-cohort studies

Roel GW Verhaak Frank JT Staal Peter JM Valk Bob Lowenberg Marcel JT Reinders Dick de Ridder 《BMC bioinformatics》2006,7(1):105-15

Background

Intensity values measured by Affymetrix microarrays have to be both normalized, to be able to compare different microarrays by removing non-biological variation, and summarized, generating the final probe set expression values. Various pre-processing techniques, such as dChip, GCRMA, RMA and MAS, have been developed for this purpose. This study assesses the effect of applying different pre-processing methods on the results of analyses of large Affymetrix datasets. By focusing on practical applications of microarray-based research, this study provides insight into the relevance of pre-processing procedures to biology-oriented researchers.

8.

Challenges in microarray class discovery: a comprehensive examination of normalization, gene selection and clustering

Eva Freyhult Mattias Landfors Jenny Önskog Torgeir R Hvidsten Patrik Rydén 《BMC bioinformatics》2010,11(1):503

Background

Cluster analysis, and in particular hierarchical clustering, is widely used to extract information from gene expression data. The aim is to discover new classes, or sub-classes, of either individuals or genes. Performing a cluster analysis commonly involves decisions on how to handle missing values, standardize the data and select genes. In addition, pre-processing, involving various types of filtration and normalization procedures, can have an effect on the ability to discover biologically relevant classes. Here we consider cluster analysis in a broad sense and perform a comprehensive evaluation that covers several aspects of cluster analyses, including normalization.

9.

Normalization and experimental design for ChIP-chip data

Shouyong Peng Artyom A Alekseyenko Erica Larschan Mitzi I Kuroda Peter J Park 《BMC bioinformatics》2007,8(1):219

Background

Chromatin immunoprecipitation on tiling arrays (ChIP-chip) has been widely used to investigate the DNA binding sites for a variety of proteins on a genome-wide scale. However, several issues in the processing and analysis of ChIP-chip data have not been resolved fully, including the effect of background (mock control) subtraction and normalization within and across arrays.

10.

An adaptive method for cDNA microarray normalization

Yingdong Zhao Ming-Chung Li Richard Simon 《BMC bioinformatics》2005,6(1):28

Background

Normalization is a critical step in analysis of gene expression profiles. For dual-labeled arrays, global normalization assumes that the majority of the genes on the array are non-differentially expressed between the two channels and that the number of over-expressed genes approximately equals the number of under-expressed genes. These assumptions can be inappropriate for custom arrays or arrays in which the reference RNA is very different from the experimental samples.
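
The global normalization assumptions described above can be made concrete with a small sketch. It centers the log ratios of a dual-labeled array on their median, which is a reasonable correction only if most genes are unchanged and up- and down-regulation are roughly balanced. The function names `log_ratios` and `global_median_normalize` are hypothetical, and this is a generic illustration of global normalization, not the adaptive method proposed in the paper.

```python
import math
from statistics import median


def log_ratios(red, green):
    """Compute log2(red / green) for each spot on a dual-labeled array."""
    return [math.log2(r / g) for r, g in zip(red, green)]


def global_median_normalize(m_values):
    """Subtract the array-wide median log ratio from every spot.

    Implicit assumption: the median spot is not differentially expressed,
    so any nonzero median reflects a channel bias rather than biology.
    """
    center = median(m_values)
    return [m - center for m in m_values]


# Toy intensities (illustrative values only).
raw = log_ratios([2.0, 4.0, 8.0], [1.0, 2.0, 2.0])
normalized = global_median_normalize(raw)
```

If the reference RNA differs strongly from the experimental samples, the median log ratio itself carries biological signal, and subtracting it would remove real differences, which is the failure case the abstract points to.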

11.

Two-stage normalization using background intensities in cDNA microarray data

Dankyu Yoon Sung-Gon Yi Ju-Han Kim Taesung Park 《BMC bioinformatics》2004,5(1):97

Background

In microarray experiments, many undesirable systematic variations are commonly observed. Normalization is the process of removing such variation that affects the measured gene expression levels. Normalization plays an important role in the early stages of microarray data analysis, and the results of subsequent analyses are highly dependent on it. One major source of variation is the background intensities. Recently, some methods have been employed for correcting the background intensities. However, all these methods focus on defining signal intensities appropriately from foreground and background intensities in the image analysis. Although a number of normalization methods have been proposed, no systematic methods have been proposed that use the background intensities in the normalization process.

12.

Methodological study of affine transformations of gene expression data with proposed robust non-parametric multi-dimensional normalization method

Henrik Bengtsson Ola Hössjer 《BMC bioinformatics》2006,7(1):100-18

Background

Low-level processing and normalization of microarray data are the most important steps in microarray analysis, and they have a profound impact on downstream analysis. Multiple methods have been suggested to date, but it is not clear which is the best. It is therefore important to further study the different normalization methods in detail and the nature of microarray data in general.

13.

Classification-based comparison of pre-processing methods for interpretation of mass spectrometry generated clinical datasets

Wouter Wegdam Perry D Moerland Marrije R Buist Emiel Ver Loren van Themaat Boris Bleijlevens Huub CJ Hoefsloot Chris G de Koster Johannes MFG Aerts 《Proteome science》2009,7(1):19-17

Background

Mass spectrometry is increasingly being used to discover proteins or protein profiles associated with disease. Experimental design of mass-spectrometry studies has come under close scrutiny and the importance of strict protocols for sample collection is now understood. However, the question of how best to process the large quantities of data generated is still unanswered. The main challenges for the analysis are the choice of proper pre-processing and classification methods. While these two issues have been investigated in isolation, we propose to use the classification of patient samples as a clinically relevant benchmark for the evaluation of pre-processing methods.

14.

Missing value imputation for microarray gene expression data using histone acetylation information

Qian Xiang Xianhua Dai Yangyang Deng Caisheng He Jiang Wang Jihua Feng Zhiming Dai 《BMC bioinformatics》2008,9(1):252

Background

Accurately estimating missing values in microarray data is an important pre-processing step, because complete datasets are required in numerous expression profile analyses in bioinformatics. Although several methods have been suggested, their performance is not satisfactory for datasets with high missing percentages.

15.

How to choose a normalization strategy for miRNA quantitative real-time (qPCR) arrays (total citations: 1; self-citations: 0; citations by others: 1)

Deo A Carlsson J Lindlöf A 《Journal of bioinformatics and computational biology》2011,9(6):795-812


16.

Comparison of solution-based exome capture methods for next generation sequencing

Sulonen AM Ellonen P Almusa H Lepistö M Eldfors S Hannula S Miettinen T Tyynismaa H Salo P Heckman C Joensuu H Raivio T Suomalainen A Saarela J 《Genome biology》2011,12(9):R94-18

Background

Techniques enabling targeted re-sequencing of the protein coding sequences of the human genome on next generation sequencing instruments are of great interest. We conducted a systematic comparison of the solution-based exome capture kits provided by Agilent and Roche NimbleGen. A control DNA sample was captured with all four capture methods and prepared for Illumina GAII sequencing. Sequence data from additional samples prepared with the same protocols were also used in the comparison.

Results

We developed a bioinformatics pipeline for quality control, short read alignment, variant identification and annotation of the sequence data. In our analysis, a larger percentage of the high quality reads from the NimbleGen captures than from the Agilent captures aligned to the capture target regions. High GC content of the target sequence was associated with poor capture success in all exome enrichment methods. Comparison of mean allele balances for heterozygous variants indicated a tendency to have more reference bases than variant bases in the heterozygous variant positions within the target regions in all methods. There was virtually no difference in the genotype concordance compared to genotypes derived from SNP arrays. A minimum of 11× coverage was required to make a heterozygote genotype call with 99% accuracy when compared to common SNPs on genome-wide association arrays.

Conclusions

Libraries captured with NimbleGen kits aligned more accurately to the target regions. The updated NimbleGen kit most efficiently covered the exome with a minimum coverage of 20×, yet none of the kits captured all the Consensus Coding Sequence annotated exons.

17.

Strategies for analyzing highly enriched IP-chip datasets

Simon RV Knott Christopher J Viggiani Oscar M Aparicio Simon Tavaré 《BMC bioinformatics》2009,10(1):305

Background

Chromatin immunoprecipitation on tiling arrays (ChIP-chip) has been employed to examine features such as protein binding and histone modifications on a genome-wide scale in a variety of cell types. Array data from the latter studies typically have a high proportion of enriched probes whose signals vary considerably (due to heterogeneity in the cell population), and this makes their normalization and downstream analysis difficult.

18.

Comparison of normalization methods for CodeLink Bioarray data

Wei Wu Nilesh Dave George C Tseng Thomas Richards Eric P Xing Naftali Kaminski 《BMC bioinformatics》2005,6(1):309

Background

The quality of microarray data can seriously affect the accuracy of downstream analyses. In order to reduce variability and enhance signal reproducibility in these data, many normalization methods have been proposed and evaluated, most of which are for data obtained from cDNA microarrays and Affymetrix GeneChips. CodeLink Bioarrays are a newly emerged, single-color oligonucleotide microarray platform. To date, there are no reported studies that evaluate normalization methods for CodeLink Bioarrays.

19.

Evaluation of normalization methods for microarray data

Taesung Park Sung-Gon Yi Sung-Hyun Kang SeungYeoun Lee Yong-Sung Lee Richard Simon 《BMC bioinformatics》2003,4(1):33

Background

Microarray technology allows the monitoring of expression levels for thousands of genes simultaneously. This novel technique helps us to understand gene regulation as well as gene-by-gene interactions more systematically. In microarray experiments, however, many undesirable systematic variations are observed. Even in replicated experiments, some variations are commonly observed. Normalization is the process of removing some sources of variation which affect the measured gene expression levels. Although a number of normalization methods have been proposed, it has been difficult to decide which methods perform best. Normalization plays an important role in the early stages of microarray data analysis, and the results of subsequent analyses are highly dependent on it.

Results

In this paper, we use the variability among the replicated slides to compare performance of normalization methods. We also compare normalization methods with regard to bias and mean square error using simulated data.

Conclusions

Our results show that intensity-dependent normalization often performs better than global normalization methods, and that linear and nonlinear normalization methods perform similarly. These conclusions are based on analysis of 36 cDNA microarrays of 3,840 genes obtained in an experiment to search for changes in gene expression profiles during neuronal differentiation of cortical stem cells. Simulation studies confirm our findings.


20.

How to decide? Different methods of calculating gene expression from short oligonucleotide array data will give different results

Frank F Millenaar John Okyere Sean T May Martijn van Zanten Laurentius ACJ Voesenek Anton JM Peeters 《BMC bioinformatics》2006,7(1):137-16
