Similar literature
 20 similar articles found
1.
2.
Lyu Yafei, Li Qunhua. BMC Bioinformatics, 2016, 17(1): 51-60
Determining differentially expressed genes (DEGs) between biological samples is the key to understanding how genotype gives rise to phenotype. RNA-seq and microarray are the two main technologies for profiling gene expression levels. However, considerable discrepancy has been found between DEGs detected using the two technologies. Integrating data across these two platforms has the potential to improve the power and reliability of DEG detection. We propose a rank-based semi-parametric model to determine DEGs using information across different sources and apply it to the integration of RNA-seq and microarray data. By incorporating both the significance of differential expression and the consistency across platforms, our method effectively detects DEGs with moderate but consistent signals. We demonstrate the effectiveness of our method using simulation studies, MAQC/SEQC data and a synthetic microRNA dataset. Our integration method is not only robust to noise and heterogeneity in the data, but also adaptive to the structure of the data. In our simulations and real data studies, our approach shows higher discriminative power and identifies more biologically relevant DEGs than eBayes, DESeq and some commonly used meta-analysis methods.

3.

Background  

The quality of microarray data can seriously affect the accuracy of downstream analyses. In order to reduce variability and enhance signal reproducibility in these data, many normalization methods have been proposed and evaluated, most of which are for data obtained from cDNA microarrays and Affymetrix GeneChips. CodeLink Bioarrays are a newly emerged, single-color oligonucleotide microarray platform. To date, there are no reported studies that evaluate normalization methods for CodeLink Bioarrays.

4.
MOTIVATION: Affymetrix GeneChips are common 3' profiling platforms for quantifying gene expression. Using publicly available datasets of expression profiles from human and mouse experiments, we sought to characterize features of GeneChip data to better compare and evaluate analyses for differential expression, regulation and clustering. We uncovered an unexpected order dependence in expression data that holds across a variety of chips in both human and mouse data. RESULTS: Order dependence among GeneChips affected relative expression measures pre-processed and normalized with the Affymetrix MAS5.0 algorithm and the robust multi-array average summarization method. The effect strongly influenced detection calls and tests for differential expression and could significantly bias experimental results based on GeneChip profiling.

5.
The most widely used statistical methods for finding differentially expressed genes (DEGs) are essentially univariate. In this study, we present a new T² statistic for analyzing microarray data. We implemented our method using a multiple forward search (MFS) algorithm that is designed for selecting a subset of feature vectors in high-dimensional microarray datasets. The proposed T² statistic is a corollary to that originally developed for multivariate analyses and possesses two prominent statistical properties. First, our method takes into account the multidimensional structure of microarray data. Using the information hidden in gene interactions allows for finding genes whose differential expression is not marginally detectable by univariate testing methods. Second, the statistic has a close relationship to discriminant analyses for classification of gene expression patterns. Our search algorithm sequentially maximizes gene expression difference/distance between two groups of genes. Including such a set of DEGs in the initial feature variables may increase the power of classification rules. We validated our method by using a spike-in HGU95 dataset from Affymetrix. The utility of the new method was demonstrated by application to the analyses of gene expression patterns in human liver cancers and breast cancers. Extensive bioinformatics analyses and cross-validation of DEGs identified in the application datasets showed the significant advantages of our new algorithm.
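
As a rough illustration of the multivariate idea in the abstract above, the sketch below computes the classical two-sample Hotelling T² statistic for a small set of genes. It is not the authors' statistic or their MFS search; the function names (`pooled_covariance`, `solve`, `hotelling_t2`) and the toy inputs are assumptions made for this example only.

```python
from statistics import fmean


def pooled_covariance(group_a, group_b):
    """Return (pooled covariance matrix, mean of A, mean of B).

    Each group is a list of samples; each sample is a list of expression
    values, one per gene in the candidate subset.
    """
    p = len(group_a[0])
    mean_a = [fmean(col) for col in zip(*group_a)]
    mean_b = [fmean(col) for col in zip(*group_b)]
    scatter = [[0.0] * p for _ in range(p)]
    for rows, mean in ((group_a, mean_a), (group_b, mean_b)):
        for row in rows:
            dev = [x - m for x, m in zip(row, mean)]
            for i in range(p):
                for j in range(p):
                    scatter[i][j] += dev[i] * dev[j]
    dof = len(group_a) + len(group_b) - 2
    return [[v / dof for v in r] for r in scatter], mean_a, mean_b


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def hotelling_t2(group_a, group_b):
    """Classical two-sample Hotelling T^2 for the given gene subset."""
    cov, mean_a, mean_b = pooled_covariance(group_a, group_b)
    diff = [a - b for a, b in zip(mean_a, mean_b)]
    weights = solve(cov, diff)  # S^{-1} (mean_a - mean_b)
    n1, n2 = len(group_a), len(group_b)
    return (n1 * n2 / (n1 + n2)) * sum(d * w for d, w in zip(diff, weights))
```

With a single gene the statistic reduces to the square of the pooled two-sample t statistic, which is a convenient sanity check. Note that the pooled covariance becomes singular when the subset contains more genes than there are degrees of freedom, which is one reason a subset search is needed for high-dimensional data.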

6.
Non-alcoholic steatohepatitis (NASH) is a severe form of non-alcoholic fatty liver disease (NAFLD). The molecular pathological mechanism of NASH is poorly understood. Recently, high-throughput data such as microarray data, together with bioinformatics methods, have become a powerful way to identify biomarkers and to investigate the pathogenesis of diseases. Taking advantage of well-characterized microarray datasets of NASH livers, we performed a systematic analysis of potential biomarkers and the possible pathological mechanism of NASH from a bioinformatics perspective. CodeLink Human Whole Genome Bioarrays were analyzed to find differentially expressed genes (DEGs) between controls and NASH patients. Four methods were used to identify DEGs, and the intersection of DEGs identified by these methods was subsequently used for both biomarker prediction and molecular pathological mechanism analysis. For biomarker prediction, rank aggregation was used to rank DEGs identified by all these methods according to their significance of differential expression. Alcohol dehydrogenase 4 (ADH4) exhibited the highest rank, suggesting the most significant differential expression between the normal and disease conditions. Together with a previous report demonstrating the association between ADH4 and the pathogenesis of NASH, our data suggest that ADH4 could be a potential biomarker for NASH. For molecular pathological mechanism analysis, two clusters of highly correlated annotation terms and the genes in these terms were identified based on the intersection of DEGs. Then, pathways enriched with these genes were identified to construct the network. Using this network, amino acid catabolism is implicated as playing a pivotal role, and the urea cycle is implicated as being involved, in the development of NASH, both for the first time. The results of our study identified potential biomarkers and suggested a possible molecular pathological mechanism of NASH. These findings provide a comprehensive and systematic understanding of the pathogenesis of NASH and may facilitate the diagnosis, prevention and treatment of NASH.
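
The two-step procedure described in this abstract (intersect the DEG lists from several methods, then aggregate their ranks) might be sketched as follows. The abstract does not say which rank aggregation algorithm was used, so mean rank is used here purely as a stand-in; the function name `aggregate_ranks` and the gene labels `g1` to `g5` are placeholders, not data from the study.

```python
def aggregate_ranks(ranked_lists):
    """Rank the genes shared by every list by their mean position.

    Each input list holds gene identifiers ordered from most to least
    significant according to one DEG detection method.
    """
    # Step 1: keep only the genes reported by all methods.
    shared = set(ranked_lists[0]).intersection(*ranked_lists[1:])
    # Step 2: average each shared gene's 1-based rank across methods
    # (an assumed, simple aggregation rule).
    mean_rank = {
        gene: sum(lst.index(gene) + 1 for lst in ranked_lists) / len(ranked_lists)
        for gene in shared
    }
    # Ties are broken alphabetically so the output is deterministic.
    return sorted(shared, key=lambda g: (mean_rank[g], g))


# Toy input: three hypothetical methods, each returning an ordered DEG list.
example_lists = [
    ["g1", "g2", "g3", "g4"],
    ["g2", "g1", "g4", "g5"],
    ["g1", "g3", "g2", "g4"],
]
consensus = aggregate_ranks(example_lists)
```

Here `g3` and `g5` drop out because at least one method did not report them, and the remaining genes are ordered by their average position.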

7.
Rosetta error model for gene expression analysis   Total citations: 4, self-citations: 0, citations by others: 4
MOTIVATION: In microarray gene expression studies, the number of replicated microarrays is usually small because of cost and sample availability, resulting in unreliable variance estimation and thus unreliable statistical hypothesis tests. The unreliable variance estimation is further complicated by the fact that the technology-specific variance is intrinsically intensity-dependent. RESULTS: The Rosetta error model captures the variance-intensity relationship for various types of microarray technologies, such as single-color arrays and two-color arrays. This error model conservatively estimates intensity error and uses this value to stabilize the variance estimation. We present two commonly used error models: the intensity error model for single-color microarrays and the ratio error model for two-color microarrays or ratios built from two single-color arrays. We present examples to demonstrate the strength of our error models in improving the statistical power of microarray data analysis, particularly in increasing expression detection sensitivity and specificity when the number of replicates is limited.

8.
We have evaluated the performance characteristics of three quantitative gene expression technologies and correlated their expression measurements to those of five commercial microarray platforms, based on the MicroArray Quality Control (MAQC) data set. The limit of detection, assay range, precision, accuracy and fold-change correlations were assessed for 997 TaqMan Gene Expression Assays, 205 Standardized RT (Sta)RT-PCR assays and 244 QuantiGene assays. TaqMan is a registered trademark of Roche Molecular Systems, Inc. We observed high correlation between quantitative gene expression values and microarray platform results and found few discordant measurements among all platforms. The main cause of variability was differences in probe sequence and thus target location. A second source of variability was the limited and variable sensitivity of the different microarray platforms for detecting weakly expressed genes, which affected interplatform and intersite reproducibility of differentially expressed genes. From this analysis, we conclude that the MAQC microarray data set has been validated by alternative quantitative gene expression platforms, thus supporting the use of microarray platforms for the quantitative characterization of gene expression.

9.
We have conducted a study to compare the variability in measured gene expression levels associated with three types of microarray platforms. Total RNA samples were obtained from liver tissue of four male mice, two each from inbred strains A/J and C57BL/6J. The same four samples were assayed on Affymetrix Mouse Genome Expression Set 430 GeneChips (MOE430A and MOE430B), spotted cDNA microarrays, and spotted oligonucleotide microarrays using eight arrays of each type. Variances associated with measurement error were observed to be comparable across all microarray platforms. The MOE430A GeneChips and cDNA arrays had higher precision across technical replicates than the MOE430B GeneChips and oligonucleotide arrays. The Affymetrix platform showed the greatest range in the magnitude of expression levels followed by the oligonucleotide arrays. We observed good concordance in both estimated expression level and statistical significance of common genes between the Affymetrix MOE430A GeneChip and the oligonucleotide arrays. Despite their apparently high precision, cDNA arrays showed poor concordance with other platforms.

10.
11.

Background  

To identify differentially expressed genes (DEGs) from microarray data, users of the Affymetrix GeneChip system need to select both a preprocessing algorithm to obtain expression-level measurements and a way of ranking genes to obtain the most plausible candidates. We recently recommended suitable combinations of a preprocessing algorithm and gene ranking method that can be used to identify DEGs with a higher level of sensitivity and specificity. However, in addition to these recommendations, researchers also want to know which combinations enhance reproducibility.

12.
13.
14.
Recent developments in gene array technologies, specifically cDNA microarray platforms, have made it easier to try to understand the constellation of gene alterations that occur within the CNS. Unlike an organ that is composed of one principal cell type, the brain contains a multiplicity of both neuronal (e.g., pyramidal neurons, interneurons, and others) and non-neuronal (e.g., astrocytes, microglia, oligodendrocytes, and others) populations of cells. An emerging goal of modern molecular neuroscience is to sample gene expression from similar cell types within a defined region without potential contamination by expression profiles of adjacent neuronal subtypes and non-neuronal cells. At present, an optimal methodology to assess gene expression is to evaluate single cells, either identified physiologically in living preparations, or by immunocytochemical or histochemical procedures in fixed cells in vitro or in vivo. Unfortunately, the quantity of RNA harvested from a single cell is not sufficient for standard RNA extraction methods. Therefore, exponential polymerase chain reaction (PCR)-based analyses and linear RNA amplifications, including a newly developed terminal continuation (TC) RNA amplification methodology, have been used in combination with single-cell microdissection procedures to enable the use of cDNA microarray analysis within individual populations of cells obtained from postmortem brain samples as well as the brains of animal models of neurodegeneration.

15.
Information regarding gene coexpression is useful to predict gene function. Several databases have been constructed for gene coexpression in model organisms based on a large amount of publicly available gene expression data measured by GeneChip platforms. In these databases, Pearson's correlation coefficients (PCCs) of gene expression patterns are widely used as a measure of gene coexpression. Although the coexpression measure or GeneChip summarization method affects the performance of the gene coexpression database, previous studies of these calculation procedures were tested with only a small number of samples and a particular species. To evaluate the effectiveness of coexpression measures, assessments with large-scale microarray data are required. We first examined characteristics of PCC and found that the optimal PCC threshold to retrieve functionally related genes was affected by the method of gene expression database construction and the target gene function. In addition, we found that this problem could be overcome when we used correlation ranks instead of correlation values. This observation was evaluated using large-scale gene expression data for four species: Arabidopsis, human, mouse and rat.
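
To make the contrast between correlation values and correlation ranks concrete, the sketch below ranks every other gene by its PCC with a query gene, so that a cutoff can be stated as "top k partners" rather than as a PCC threshold. The exact rank definition used in the study is not given in the abstract; the function names and the toy expression profiles here are assumptions for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_ranks(profiles, query):
    """Map each non-query gene to its rank (1 = most correlated with query)."""
    scores = {
        gene: pearson(profiles[query], values)
        for gene, values in profiles.items()
        if gene != query
    }
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {gene: rank for rank, gene in enumerate(ordered, start=1)}


# Toy expression profiles (four samples per gene); values are made up.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 9.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 4.0],
}
ranks_for_a = correlation_ranks(profiles, "geneA")
```

Because a rank depends only on the ordering of PCCs for the query gene, the same "top k" cutoff can be applied to databases whose raw PCC distributions differ.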

16.
Androgens play a major role in the growth and survival of primary prostate tumors. The molecular mechanisms involved in prostate cancer progression are not fully understood but genes that are regulated by androgens clearly influence this process. We searched for new androgen-regulated genes using the Affymetrix GeneChip Human Genome U95 Set in the androgen-sensitive LNCaP prostate cancer cell line. Analysis of gene expression profiles revealed that myosin light chain kinase (MLCK) mRNA levels were markedly down-regulated by the synthetic androgen R1881. The microarray data were confirmed by ribonuclease protection assays. RNA and protein analyses revealed that LNCaP cells express both long (non-muscle) and short (smooth muscle) isoforms, and that both isoforms are down-regulated by androgens. Taken together, these data identify MLCK as a novel downstream target of the androgen signalling pathway in prostate cells.

17.
Pulmonary arterial hypertension (PAH) is a debilitating progressive disorder. Here, we aimed to identify diagnostically valuable biomarkers for PAH and to decode the fundamental mechanisms underlying the biological function of these markers. Two mRNA microarray profiles (GSE70456 and GSE117261) and two microRNA microarray profiles (GSE55427 and GSE67597) were mined from the Gene Expression Omnibus platform. We then identified the differentially expressed genes (DEGs) and differentially expressed miRNAs (DEMs), respectively. In addition, we used online miRNA prediction tools to screen the target genes of the DEMs. In this study, 185 DEGs and three common DEMs were screened, and 1266 target genes of the three DEMs were identified. Next, 16 overlapping dysregulated genes were obtained from the 185 DEGs and 1266 target genes. We also constructed the miRNA-gene regulatory network and selected the miR-508-3p/NR4A3 pair for further exploration. Experiments verified the functional expression of miR-508-3p in PAH and its signalling cascade. We observed that ectopic miR-508-3p expression promotes proliferation and migration of pulmonary artery smooth muscle cells (PASMCs). Bioinformatic analysis and a dual-luciferase assay showed that NR4A3 is a direct target gene of miR-508-3p. Mechanistically, we demonstrated that down-regulation of miR-508-3p advances PASMC proliferation and migration by inducing NR4A3 to activate the MAPK/ERK kinase signalling pathway. Altogether, our research provides a promising diagnostic predictor and therapeutic avenues for patients with PAH.

18.
Microarray analysis has become a key experimental tool in the study of genome‐wide patterns of gene expression. The labeling step of target molecules such as cDNA or cRNA plays a key role in a microarray experiment because the amount of mRNA is measured indirectly by the labeled molecules. In this paper, the most widely used cDNA labeling strategies in microarray experiments are reviewed in detail, including direct labeling and indirect labeling methods along with a discussion of the merits and disadvantages of these methods. Furthermore, various RNA amplification approaches were surveyed to obtain a target nucleic acid sufficient for microarray experiments from minute amounts of mRNA. Finally, the labeling strategies of commonly used microarray platforms (e.g., Affymetrix GeneChip®, CodeLink™ Bioarray, Agilent and spotted microarrays) were compared.

19.
The advent of next generation sequencing (NGS) technologies has expanded the area of genomic research, offering high coverage and increased sensitivity over older microarray platforms. Although the current cost of next generation sequencing still exceeds that of microarray approaches, the rapid advances in NGS will likely make it the platform of choice for future research in differential gene expression. Connectivity mapping is a procedure for examining the connections among diseases, genes and drugs by differential gene expression. It was initially based on microarray technology, with which a large collection of compound-induced reference gene expression profiles has been accumulated. In this work, we aim to test the feasibility of incorporating NGS RNA-Seq data into the current connectivity mapping framework by utilizing the microarray-based reference profiles and constructing a differentially expressed gene signature from an NGS dataset. This would allow connections to be established between the NGS gene signature and those microarray reference profiles, avoiding the cost of re-creating drug profiles with NGS technology. We examined the connectivity mapping approach on a publicly available NGS dataset with androgen stimulation of LNCaP cells in order to extract candidate compounds that could inhibit the proliferative phenotype of LNCaP cells and to elucidate their potential in a laboratory setting. In addition, we also analyzed an independent microarray dataset with similar experimental settings. We found a high level of concordance between the top compounds identified using the gene signatures from the two datasets. The nicotine derivative cotinine was returned as the top candidate among the overlapping compounds with potential to suppress this proliferative phenotype. Subsequent lab experiments validated this connectivity mapping hit, showing that cotinine inhibits cell proliferation in an androgen-dependent manner. Thus the results in this study suggest a promising prospect of integrating NGS data with connectivity mapping.

20.
Microarrays have been widely used for the analysis of gene expression, but the issue of reproducibility across platforms has yet to be fully resolved. To address this apparent problem, we compared gene expression between two microarray platforms: the short oligonucleotide Affymetrix Mouse Genome 430 2.0 GeneChip and a spotted cDNA array using a mouse model of angiotensin II-induced hypertension. RNA extracted from treated mice was analyzed using Affymetrix and cDNA platforms and then by quantitative RT-PCR (qRT-PCR) for validation of specific genes. For the 11,710 genes present on both arrays, we assessed the relative impact of experimental treatment and platform on measured expression and found that biological treatment had a far greater impact on measured expression than did platform for more than 90% of genes, a result validated by qRT-PCR. In the small number of cases in which platforms yielded discrepant results, qRT-PCR generally did not confirm either set of data, suggesting that sequence-specific effects may make expression predictions difficult to make using any technique.
