We present the results of a simple, statistical assay that measures the G+C content sensitivity bias of gene expression experiments without the requirement of a duplicate experiment. We analyse five gene expression profiling methods: Affymetrix GeneChip, Long Serial Analysis of Gene Expression (LongSAGE), LongSAGELite, ‘Classic’ Massively Parallel Signature Sequencing (MPSS) and ‘Signature’ MPSS. We demonstrate that the methods have systematic and random errors that lead to different G+C content sensitivities. The relationship between this experimental error and the G+C content of the probe set or tag that identifies each gene influences whether the gene is detected and, if detected, the level of gene expression measured. LongSAGE has the least bias, while Signature MPSS shows a strong bias towards G+C rich tags, and Affymetrix data show different biases depending on the data processing method (MAS 5.0, RMA or GC-RMA). The bias in the Affymetrix data primarily affects genes expressed at lower levels. Despite the larger sampling of the MPSS library, SAGE identifies significantly more genes (60% more RefSeq genes in a single comparison).

9.

Accurate and unambiguous tag-to-gene mapping in serial analysis of gene expression (total citations: 1; self-citations: 0; citations by others: 1)

Rodrigo Malig, Cristian Varela, Eduardo Agosin, Francisco Melo 《BMC Bioinformatics》2006,7(1):487

Background

In this study, we present a robust and reliable computational method for tag-to-gene assignment in serial analysis of gene expression (SAGE). The method relies on current genome information and annotation, incorporates several new features, and makes key improvements over alternative methods, all of which are important for determining gene expression levels more accurately. The method provides a complete annotation of potential virtual SAGE tags within a genome, along with an estimate of the confidence of observing each tag experimentally, which is used to rank tags that have multiple matches in the genome.
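
The idea of a virtual SAGE tag can be illustrated with a minimal Python sketch. It is not the method described in the abstract above: it assumes the standard SAGE convention that the tag is the 10 bp immediately downstream of the 3'-most NlaIII site (CATG) in a transcript, and it only flags tags shared by several genes rather than ranking them by confidence. The function names and the toy sequences are hypothetical.

```python
from collections import defaultdict

ANCHOR = "CATG"  # NlaIII recognition site (assumed anchoring enzyme)
TAG_LEN = 10     # classic SAGE tag length (assumption; LongSAGE tags are longer)


def virtual_tag(transcript, anchor=ANCHOR, tag_len=TAG_LEN):
    """Return the tag following the 3'-most anchor site, or None if absent."""
    seq = transcript.upper()
    pos = seq.rfind(anchor)          # 3'-most occurrence of the anchor
    if pos == -1:
        return None                  # no anchoring site, so no tag
    start = pos + len(anchor)
    tag = seq[start:start + tag_len]
    # A tag truncated by the end of the transcript is discarded here.
    return tag if len(tag) == tag_len else None


def tag_to_genes(transcripts):
    """Map each virtual tag to the set of gene names that produce it."""
    mapping = defaultdict(set)
    for gene, seq in transcripts.items():
        tag = virtual_tag(seq)
        if tag is not None:
            mapping[tag].add(gene)
    return dict(mapping)


def ambiguous_tags(mapping):
    """Keep only the tags that match more than one gene."""
    return {tag: genes for tag, genes in mapping.items() if len(genes) > 1}


if __name__ == "__main__":
    toy = {  # hypothetical sequences, for illustration only
        "geneA": "AAACATGACGTACGTACTTT",
        "geneB": "GGCATGACGTACGTACGG",
        "geneC": "TTTTCATGGGGGGGGGGGAA",
    }
    mapping = tag_to_genes(toy)
    print(ambiguous_tags(mapping))
```

In this toy input, geneA and geneB yield the same tag, which is the kind of multiple match that a real tag-to-gene method has to resolve.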

10.

MPSS: an integrated database system for surveying a set of proteins (total citations: 3; self-citations: 0; citations by others: 3)

Hao P, He WZ, Huang Y, Ma LX, Xu Y, Xi H, Wang C, Liu BS, Wang JM, Li YX, Zhong Y 《Bioinformatics (Oxford, England)》2005,21(9):2142-2143

SUMMARY: We design and implement an integrated database system called 'multi-protein survey system' (MPSS), which provides a platform to retrieve information about many proteins at a time. This system integrates several important and widely used databases including SwissProt, TrEMBL, PDB and InterPro, plus useful references such as GO and KEGG to other databases. Users may submit a group of protein IDs, entry names, SwissProt/TrEMBL accession numbers or GenBank GIs through MPSS' web interface, and quickly obtain protein annotation information from public databases together with pre-computed molecular properties. MPSS can also supply comprehensive information about query proteins, including 3D structures, domains, pathways, gene ontology and a visual presentation of the mapping to the GO tree and KEGG pathways, to provide an up-to-date view of available knowledge with regard to the structures and molecular functions of the proteins under study. AVAILABILITY: MPSS is freely accessible at http://www.scbit.org/mpss/

11.

SAGETTARIUS: a program to reduce the number of tags mapped to multiple transcripts and to plan SAGE sequencing stages

Bianchetti L, Wu Y, Guerin E, Plewniak F, Poch O 《Nucleic Acids Research》2007,35(18):e122

12.

A comparison of global gene expression measurement technologies in Arabidopsis thaliana (total citations: 2; self-citations: 0; citations by others: 2)

Coughlan SJ, Agrawal V, Meyers B 《Comparative and Functional Genomics》2004,5(3):245-252

13.

SAP - A Sequence Mapping and Analyzing Program for Long Sequence Reads Alignment and Accurate Variants Discovery

Z Sun, W Tian 《PLoS ONE》2012,7(8):e42887

The third generation of sequencing technologies produces sequence reads of 1000 bp or more that may contain high polymorphism information. However, most currently available sequence analysis tools are developed specifically for analyzing short sequence reads. While the traditional Smith-Waterman (SW) algorithm can be used to map long sequence reads, its naive implementation is computationally infeasible. We have developed a new Sequence mapping and Analyzing Program (SAP) that implements a modified version of SW to speed up the alignment process. In benchmarks with simulated and real exon sequencing data and real E. coli genome sequence data generated by third-generation sequencing technologies, SAP outperforms currently available tools for mapping short and long sequence reads in both speed and proportion of captured reads. In addition, it achieves high accuracy in detecting SNPs and InDels in the simulated data. SAP is available at https://github.com/davidsun/SAP.
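
To make the cost of the naive approach concrete, the following Python sketch computes the best local alignment score with the textbook Smith-Waterman recurrence. It is the plain algorithm, not SAP's modified version, and the scoring values (match 2, mismatch -1, linear gap -2) are illustrative assumptions. Every cell of a table of size len(a) x len(b) is visited, so the running time grows with the product of the two sequence lengths, which is what makes a direct application to long reads against a genome impractical.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b (linear gap penalty).

    Only two rows of the dynamic-programming table are kept, so memory is
    O(len(b)), but time is still O(len(a) * len(b)).
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```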

14.

Transcriptomes for serial analysis of gene expression (total citations: 1; self-citations: 0; citations by others: 1)

Marti J, Piquemal D, Manchon L, Commes T 《Journal de la Société de Biologie》2002,196(4):303-307

15.

Identification and prevention of a GC content bias in SAGE libraries (total citations: 6; self-citations: 1; citations by others: 5)

Elliott H. Margulies, Sharon L. R. Kardia, Jeffrey W. Innis 《Nucleic Acids Research》2001,29(12):e60

Serial Analysis of Gene Expression (SAGE) is becoming a widely used gene expression profiling method for the study of development, cancer and other human diseases. Investigators using SAGE rely heavily on the quantitative aspect of this method for cataloging gene expression and comparing multiple SAGE libraries. We have developed additional computational and statistical tools to assess the quality and reproducibility of a SAGE library. Using these methods, a critical variable in the SAGE protocol was identified that has the potential to bias the Tag distribution relative to the GC content of the 10 bp SAGE Tag DNA sequence. We also detected this bias in a number of publicly available SAGE libraries. It is important to note that the GC content bias went undetected by quality control procedures in the current SAGE protocol and was only identified with the use of these statistical analyses on as few as 750 SAGE Tags. In addition to keeping any solution of free DiTags on ice, an analysis of the GC content should be performed before sequencing large numbers of SAGE Tags to be confident that SAGE libraries are free from experimental bias.
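
A GC content summary of a small tag sample could be computed along the following lines. This Python sketch is only an illustration of the kind of descriptive check the abstract recommends, not the authors' statistical tools: it tabulates the share of sequenced tags at each G+C count and the count-weighted mean G+C fraction, and it leaves the choice of reference distribution and significance test open. The function names and the example library are hypothetical.

```python
from collections import Counter


def gc_count(tag):
    """Number of G and C bases in a tag."""
    tag = tag.upper()
    return tag.count("G") + tag.count("C")


def gc_distribution(tag_counts):
    """Fraction of sequenced tags at each G+C count.

    tag_counts maps a tag sequence to the number of times it was observed.
    """
    hist = Counter()
    for tag, n in tag_counts.items():
        hist[gc_count(tag)] += n
    total = sum(hist.values())
    if total == 0:
        raise ValueError("library contains no tags")
    return {k: hist[k] / total for k in sorted(hist)}


def mean_gc_fraction(tag_counts):
    """Count-weighted mean G+C fraction across the library."""
    total = sum(tag_counts.values())
    if total == 0:
        raise ValueError("library contains no tags")
    weighted = sum(gc_count(t) / len(t) * n for t, n in tag_counts.items())
    return weighted / total


if __name__ == "__main__":
    library = {  # hypothetical 10 bp tags and counts
        "GGGGCCCCGG": 3,
        "AAAATTTTAA": 1,
        "ACGTACGTAC": 4,
    }
    print(gc_distribution(library))
    print(mean_gc_fraction(library))
```

Comparing such a distribution between an early sample and a trusted reference library is one simple way to spot a shift towards G+C rich or A+T rich tags before committing to large-scale sequencing.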

16.

USAGE: a web-based approach towards the analysis of SAGE data. Serial Analysis of Gene Expression

van Kampen AH, van Schaik BD, Pauws E, Michiels EM, Ruijter JM, Caron HN, Versteeg R, Heisterkamp SH, Leunissen JA, Baas F, van der Mee M 《Bioinformatics (Oxford, England)》2000,16(10):899-905

MOTIVATION: SAGE enables the determination of genome-wide mRNA expression profiles. A comprehensive analysis of SAGE data requires software that integrates (statistical) data analysis methods with a database system. Furthermore, to facilitate data sharing between users, the application should reside on a central server and be accessed via the internet. Since such an application was not available, we developed the USAGE package. RESULTS: USAGE is a web-based application that comprises an integrated set of tools offering many functions for analysing and comparing SAGE data. Additionally, USAGE includes a statistical method for the planning of new SAGE experiments. USAGE is available in a multi-user environment, giving users the option of sharing data. USAGE is interfaced to a relational database to store data and analysis results. The USAGE query editor allows the composition of queries for searching this database. Several database functions have been included which enable the selection and combination of data. USAGE provides the biologist with increased functionality and flexibility for analysing SAGE data. AVAILABILITY: USAGE is freely accessible for academic institutions at http://www.cmbi.kun.nl/usage/. The source code of USAGE is freely available for academic institutions on request from the first author.

17.

Maximizing the efficacy of SAGE analysis identifies novel transcripts in Arabidopsis

Robinson SJ, Cram DJ, Lewis CT, Parkin IA 《Plant Physiology》2004,136(2):3223-3233

18.

Transcript identification by analysis of short sequence tags: influence of tag length, restriction site and transcript database

Unneberg P, Wennborg A, Larsson M 《Nucleic Acids Research》2003,31(8):2217-2226

19.

hSAGEing: an improved SAGE-based software for identification of human tissue-specific or common tumor markers and suppressors

Yang CH, Chuang LY, Shih TM, Chang HW 《PLoS ONE》2010,5(12):e14369

20.

Gene expression profiling: methodological challenges, results, and prospects for addiction research

Pollock JD 《Chemistry and physics of lipids》2002,121(1-2):241-256

This review describes the current methods used to profile gene expression. These methods include microarrays, spotted arrays, serial analysis of gene expression (SAGE), and massively parallel signature sequencing (MPSS). Methodological and statistical problems in interpreting microarray and spotted array experiments are also discussed. Methods and formats needed to share gene expression data, such as minimum information about microarray experiments (MIAME), are described. The last part of the review provides an overview of the application of gene-expression profiling technology to substance abuse research and discusses future directions.