Similar literature
 Found 20 similar articles; search took 15 ms
1.

Background

The massive scale of microarray derived gene expression data allows for a global view of cellular function. Thus far, comparative studies of gene expression between species have been based on the level of expression of the gene across corresponding tissues, or on the co-expression of the gene with another gene.

Results

To compare gene expression between distant species on a global scale, we introduce the "expression context". The expression context of a gene is based on the co-expression with all other genes that have unambiguous counterparts in both genomes. Employing this new measure, we show 1) that the expression context is largely conserved between orthologs, and 2) that sequence identity shows little correlation with expression context conservation after gene duplication and speciation.

Conclusion

This means that the degree of sequence identity has limited predictive value for differential expression context conservation between orthologs, and thus presumably also for other facets of gene function.
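
The abstract above does not give a formula for the expression context, so the following is only a minimal sketch of one plausible reading: a gene's context is its vector of Pearson correlations with a set of reference genes that have one-to-one counterparts in both species, and conservation is the correlation between the two context vectors. The function names, the choice of Pearson correlation, and the data layout (dicts of expression profiles) are assumptions made for illustration, not the authors' method.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # constant profile: correlation undefined, 0 used here by assumption
    return cov / sqrt(vx * vy)

def context_conservation(gene_a, gene_b, profiles_a, profiles_b, ortholog_pairs):
    """Correlate the expression contexts of gene_a (species A) and gene_b (species B).

    profiles_a / profiles_b: dict gene -> list of expression values (hypothetical layout).
    ortholog_pairs: list of (gene_in_A, gene_in_B) reference pairs with
    unambiguous counterparts in both genomes.
    """
    ctx_a, ctx_b = [], []
    for ref_a, ref_b in ortholog_pairs:
        if ref_a == gene_a or ref_b == gene_b:
            continue  # skip the query genes themselves
        ctx_a.append(pearson(profiles_a[gene_a], profiles_a[ref_a]))
        ctx_b.append(pearson(profiles_b[gene_b], profiles_b[ref_b]))
    return pearson(ctx_a, ctx_b)
```

Because each context is computed within one species, the two expression datasets do not need matching tissues or conditions; only the list of reference ortholog pairs has to be shared.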

2.
Identifying latent structure in high-dimensional genomic data is essential for exploring biological processes. Here, we consider recovering gene co-expression networks from gene expression data, where each network encodes relationships between genes that are co-regulated by shared biological mechanisms. To do this, we develop a Bayesian statistical model for biclustering to infer subsets of co-regulated genes that covary in all of the samples or in only a subset of the samples. Our biclustering method, BicMix, allows overcomplete representations of the data, computational tractability, and joint modeling of unknown confounders and biological signals. BicMix recovers latent structure with higher precision than state-of-the-art biclustering methods across diverse simulation scenarios. Further, we develop a principled method to recover context-specific gene co-expression networks from the estimated sparse biclustering matrices. We apply BicMix to breast cancer gene expression data and to gene expression data from a cardiovascular study cohort, and we recover gene co-expression networks that are differential across ER+ and ER- samples and across male and female samples. We apply BicMix to the Genotype-Tissue Expression (GTEx) pilot data, and we find tissue-specific gene networks. We validate these findings by using our tissue-specific networks to identify trans-eQTLs specific to one of four primary tissues.

3.
MOTIVATION: Identifying groups of co-regulated genes by monitoring their expression over various experimental conditions is complicated by the fact that such co-regulation is condition-specific. Ignoring the context-specific nature of co-regulation significantly reduces the ability of clustering procedures to detect co-expressed genes due to additional 'noise' introduced by non-informative measurements. RESULTS: We have developed a novel Bayesian hierarchical model and corresponding computational algorithms for clustering gene expression profiles across diverse experimental conditions and studies that accounts for context-specificity of gene expression patterns. The model is based on the Bayesian infinite mixtures framework and does not require a priori specification of the number of clusters. We demonstrate that explicit modeling of context-specificity results in increased accuracy of the cluster analysis by examining the specificity and sensitivity of clusters in microarray data. We also demonstrate that probabilities of co-expression derived from the posterior distribution of clusterings are valid estimates of statistical significance of created clusters. AVAILABILITY: The open-source package gimm is available at http://eh3.uc.edu/gimm.

4.
Dong B, Zhang P, Chen X, Liu L, Wang Y, He S, Chen R. PLoS ONE 2011, 6(6): e21012
Housekeeping genes (HKGs) generally have fundamental functions in basic biochemical processes in organisms, and usually have relatively steady expression levels across various tissues. They play an important role in the normalization of microarray data. Using Fourier analysis, we transformed gene expression time series from a HeLa cell cycle gene expression dataset into Fourier spectra, and designed an effective computational method for discriminating between HKGs and non-HKGs using the support vector machine (SVM) supervised learning algorithm, which can extract significant features of the spectra, providing a basis for identifying specific gene expression patterns. Using our method, we identified 510 human HKGs, and then validated them by comparison with two independent sets of tissue expression profiles. Results showed that our predicted HKG set is more reliable than three previously identified sets of HKGs.
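
The first step described above, turning an expression time series into a Fourier spectrum, can be sketched as follows. This is a plain discrete Fourier transform written with the standard library; the normalisation by series length and the function name are assumptions for illustration, and the SVM classification step (which would need a machine-learning library) is not shown.

```python
import cmath
import math

def amplitude_spectrum(series):
    """Amplitudes of the discrete Fourier transform for frequencies 0..n//2.

    Each amplitude is divided by the series length, so index 0 holds the
    mean expression level (an illustrative normalisation, not from the paper).
    """
    n = len(series)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(series))
        spectrum.append(abs(coeff) / n)
    return spectrum

# A steady, housekeeping-like profile puts all its energy at frequency 0,
# while a periodic, cell-cycle-like profile also shows a peak at k = 1.
steady = [5.0] * 8
periodic = [5.0 + 2.0 * math.cos(2 * math.pi * t / 8) for t in range(8)]
steady_spectrum = amplitude_spectrum(steady)
periodic_spectrum = amplitude_spectrum(periodic)
```

Spectra like these could then serve as feature vectors for a supervised classifier.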

5.
6.

Background

The chicken is an important agricultural and avian-model species. A survey of gene expression in a range of different tissues will provide a benchmark for understanding expression levels under normal physiological conditions in birds. With expression data for birds being very scant, this benchmark is of particular interest for comparative expression analysis among various terrestrial vertebrates.

Methodology/Principal Findings

We carried out a gene expression survey in eight major chicken tissues using whole genome microarrays. A global picture of gene expression is presented for the eight tissues, and tissue-specific as well as common gene expression was identified. A Gene Ontology (GO) term enrichment analysis showed that tissue-specific genes are enriched with GO terms reflecting the physiological functions of the specific tissue, and housekeeping genes are enriched with GO terms related to essential biological functions. Comparisons of structural genomic features between tissue-specific genes and housekeeping genes show that housekeeping genes are more compact. Specifically, their coding sequences and particularly their introns are shorter than those of genes that display more variation in expression between tissues, and their intergenic space is also shorter. Meanwhile, housekeeping genes are more likely to co-localize with other abundantly or highly expressed genes in the same chromosomal regions. Furthermore, comparisons of gene expression in a panel of five common tissues between birds, mammals and amphibians showed that the expression patterns across tissues are highly similar for orthologous genes compared to random gene pairs within each pair-wise comparison, indicating a high degree of functional conservation in gene expression among terrestrial vertebrates.

Conclusions

The housekeeping genes identified in this study have shorter gene length, shorter coding sequence length, shorter introns, and shorter intergenic regions; there seems to be selection pressure on economy in genes with a wide tissue distribution, i.e. these genes are more compact. A comparative analysis showed that the expression patterns of orthologous genes are conserved in the terrestrial vertebrates during evolution.

7.
Gene co-expression networks provide an important tool for systems biology studies. Using microarray data from the ArrayExpress database, we constructed an Arabidopsis gene co-expression network, termed AtGGM2014, based on the graphical Gaussian model, which contains 102,644 co-expression gene pairs among 18,068 genes. The network was grouped into 622 gene co-expression modules. These modules function in diverse house-keeping, cell cycle, development, hormone response, metabolism, and stress response pathways. We developed a tool to facilitate easy visualization of the expression patterns of these modules either in a tissue context or their regulation under different treatment conditions. The results indicate that at least six modules with tissue-specific expression patterns failed to record modular regulation under various stress conditions. This discrepancy could be best explained by the fact that experiments to study plant stress responses focused mainly on leaves and less on roots, and thus failed to recover specific regulation patterns in other tissues. Overall, the modular structures revealed by our network provide extensive information to generate testable hypotheses about diverse plant signaling pathways. AtGGM2014 offers a constructive tool for plant systems biology studies.

8.
BIOSILICO 2003, 1(3): 89-96
The function(s) of a novel gene or gene product can be inferred by associating the gene or gene product with those whose functions are known. It is now common practice to associate two genes if they have similar sequences. In recent years, computational methods have been developed that associate genes on the basis of features beyond similarity, using a variety of biological data beyond single-gene sequences. This review describes several promising methods that associate genes or gene products. These associative methods employ similarity of sequences and structures, features from whole-genome analysis, co-expression patterns from microarray and EST data, interacting properties from proteomic data, and links from literature mining. Finally, we outline issues surrounding the validation and integration of these methods.

9.
10.
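Liu Jie, Li Bo, Chen Xiaojie, Chen Bin. Acta Entomologica Sinica (昆虫学报) 1950, 63(10): 1171-1182
OBJECTIVE: To explore gene co-expression patterns across different tissues of Aedes aegypti using weighted gene co-expression network analysis (WGCNA). METHODS: Paired-end sequencing data for nine representative tissues (antennae and brain of adult females and males; proboscis, maxillary palps, and ovaries of females; and forelegs, midlegs, hindlegs, and abdominal tip of adult males) were selected from the transcriptome data for different Ae. aegypti tissues in the NCBI SRA database. After removing missing values and computing variances, the 5,000 genes with the highest variance were selected, and the WGCNA package in R was used to build a gene co-expression network for the adult tissues and to partition it into modules. The clusterProfiler package was then used for GO (Gene Ontology) and KEGG (Kyoto Encyclopedia of Genes and Genomes) enrichment analysis of the genes in tissue-specific modules, and the CytoHubba plugin for Cytoscape was used to screen for hub genes within the co-expression modules. RESULTS: A total of 11 gene co-expression modules were identified across the adult tissues. One specifically expressed module was identified in each of the female antenna, proboscis, ovary, and maxillary palp and the male brain and abdominal tip, whereas no specifically expressed module was found in the male foreleg, midleg, or hindleg. Genes in the six tissue-specific modules were annotated with functions matching the biology of the tissue: genes in the female antenna-specific green module have functions such as odorant binding and olfactory receptor activity; genes in the female proboscis-specific purple module have functions such as serine-type endopeptidase activity and serine hydrolase activity; and genes in the male brain-specific blue module act mainly in biological processes such as regulation of biological process, signal transduction, and nervous system process. CytoHubba further identified highly connected hub genes in the selected tissue-specific co-expression modules, including AAEL010426, AAEL002896, AAEL002600, AAEL000961, AAEL007784, and AAEL006429. CONCLUSION: Based on transcriptome data from different Ae. aegypti tissues, this study used WGCNA to identify many important gene co-expression modules. The results provide new ideas and a methodological basis for analyzing gene co-expression patterns in mosquitoes, and are a useful reference for exploring tissue-specific gene resources and for bioinformatic studies of functional genes in mosquitoes.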
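
The gene-filtering step in the abstract above (dropping genes with missing values, then keeping the most variable genes) can be sketched as follows. The study itself used R and the WGCNA package; this Python version, its function name, and the use of `None` to mark a missing value are assumptions for illustration only.

```python
from statistics import pvariance

def top_variance_genes(expression, k=5000):
    """Return the k genes with the highest expression variance.

    expression: dict gene -> list of values across samples, where None marks
    a missing value (hypothetical layout). Genes with any missing value are dropped.
    """
    complete = {gene: values for gene, values in expression.items()
                if None not in values}
    ranked = sorted(complete, key=lambda gene: pvariance(complete[gene]),
                    reverse=True)
    return ranked[:k]
```

Restricting the network to the most variable genes keeps the correlation matrix small and removes genes whose nearly flat profiles carry little co-expression signal.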

11.
12.
13.
14.
MOTIVATION: The rapid accumulation of microarray datasets provides unique opportunities to perform systematic functional characterization of the human genome. We designed a graph-based approach to integrate cross-platform microarray data, and extract recurrent expression patterns. A series of microarray datasets can be modeled as a series of co-expression networks, in which we search for frequently occurring network patterns. The integrative approach provides three major advantages over the commonly used microarray analysis methods: (1) enhance signal-to-noise separation, (2) identify functionally related genes without co-expression, and (3) provide a way to predict gene functions in a context-specific way. RESULTS: We integrate 65 human microarray datasets, comprising 1105 experiments and over 11 million expression measurements. We develop a data mining procedure based on frequent itemset mining and biclustering to systematically discover network patterns that recur in at least five datasets. This resulted in 143,401 potential functional modules. Subsequently, we design a network topology statistic based on graph random walk that effectively captures characteristics of a gene's local functional environment. Function annotations based on this statistic are then assessed using the random forest method, combining six other attributes of the network modules. We assign 1126 functions to 895 genes, 779 known and 116 unknown, with a validation accuracy of 70%. Among our assignments, 20% of genes are assigned multiple functions based on different network environments. AVAILABILITY: http://zhoulab.usc.edu/ContextAnnotation.
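
The idea of a pattern that recurs in at least five datasets can be illustrated in its simplest form, with single co-expression edges instead of multi-edge modules. The sketch below only counts how many networks contain each edge; it is not the frequent itemset mining and biclustering procedure from the abstract, and the function name and edge-list input format are assumptions.

```python
from collections import Counter

def recurrent_edges(networks, min_support=5):
    """Return the edges present in at least min_support co-expression networks.

    networks: list of edge lists, one per dataset; each edge is a pair of
    gene names (hypothetical input format). Edges are treated as undirected.
    """
    counts = Counter()
    for edges in networks:
        # Normalise each undirected edge and count it once per dataset.
        unique = {tuple(sorted(edge)) for edge in edges}
        counts.update(unique)
    return {edge: n for edge, n in counts.items() if n >= min_support}
```

An edge seen in many independent datasets is less likely to be an artifact of one platform or experiment, which is the signal-to-noise argument made in the abstract.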

15.
16.
MOTIVATION: A major issue in computational biology is the reconstruction of pathways from several genomic datasets, such as expression data, protein interaction data and phylogenetic profiles. As a first step toward this goal, it is important to investigate the amount of correlation which exists between these data. RESULTS: These methods are successfully tested on their ability to recognize operons in the Escherichia coli genome, from the comparison of three datasets corresponding to functional relationships between genes in metabolic pathways, geometrical relationships along the chromosome, and co-expression relationships as observed by gene expression data.

17.
18.
Microarrays have been useful in understanding various biological processes by allowing the simultaneous study of the expression of thousands of genes. However, the analysis of microarray data is a challenging task. One of the key problems in microarray analysis is the classification of unknown expression profiles. Specifically, the often large number of non-informative genes on the microarray adversely affects the performance and efficiency of classification algorithms. Furthermore, the skewed ratio of samples to variables poses a risk of overfitting. Thus, in this context, feature selection methods become crucial to select relevant genes and, hence, improve classification accuracy. In this study, we investigated feature selection methods based on gene expression profiles and protein interactions. We found that in our setup, the addition of protein interaction information did not contribute to any significant improvement of the classification results. Furthermore, we developed a novel feature selection method that relies exclusively on observed gene expression changes in microarray experiments, which we call “relative Signal-to-Noise ratio” (rSNR). More precisely, the rSNR ranks genes based on their specificity to an experimental condition, by comparing intrinsic variation, i.e. variation in gene expression within an experimental condition, with extrinsic variation, i.e. variation in gene expression across experimental conditions. Genes with low variation within an experimental condition of interest and high variation across experimental conditions are ranked higher, and help in improving classification accuracy. We compared different feature selection methods on two time-series microarray datasets and one static microarray dataset. We found that the rSNR performed generally better than the other methods.
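
The abstract above does not state the rSNR formula, so the following is only a sketch of the ranking idea it describes: score each gene by its variation across conditions relative to its variation within the condition of interest. The specific ratio (standard deviation of the condition means divided by the standard deviation of the replicates), the small `eps` guard, and the function names are all assumptions, not the published definition.

```python
from statistics import mean, pstdev

def specificity_score(gene_values, condition, eps=1e-9):
    """Illustrative score: extrinsic variation divided by intrinsic variation.

    gene_values: dict condition -> list of replicate expression values
    (hypothetical layout). eps avoids division by zero and is an assumption.
    """
    intrinsic = pstdev(gene_values[condition])                   # within the condition
    extrinsic = pstdev([mean(v) for v in gene_values.values()])  # across conditions
    return extrinsic / (intrinsic + eps)

def rank_genes(data, condition):
    """Order genes from most to least specific for the given condition."""
    return sorted(data, key=lambda g: specificity_score(data[g], condition),
                  reverse=True)
```

Under this score, a gene that is stable among replicates of one condition but shifts strongly between conditions ranks first, matching the ordering the abstract describes.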

19.
20.
Global, comparative analysis of tissue-specific promoter CpG methylation   Cited by: 3 (self-citations: 0, citations by others: 3)
Schilling E, Rehli M. Genomics 2007, 90(3): 314-323
Understanding cell-type-specific epigenetic codes on a global level is a major challenge after the sequencing of the human genome has been completed. Here we applied methyl-CpG immunoprecipitation (MCIp) to obtain comparative methylation profiles of coding and noncoding genes in three human tissues: testis, brain, and monocytes. Forty-four mainly testis-specific promoters were independently validated using bisulfite sequencing or single-gene MCIp, confirming the results obtained by the MCIp microarray approach. We demonstrate the previously unknown somatic hypermethylation at many CpG-rich, testis-specific gene promoters, in particular in ampliconic areas of the Y chromosome. We also identify a number of miRNA genes showing tissue-specific methylation patterns. The comparison of the obtained tissue methylation profiles with corresponding gene expression data indicates a significant association between tissue-specific promoter methylation and gene expression, not only in CpG-rich promoters. In summary, our study highlights the exceptional epigenetic status of germ-line cells in testis and provides a global insight into tissue-specific DNA methylation patterns.
