1.
Background
Differential gene expression is important for understanding the biological differences between healthy and diseased states. Two common sources of differential gene expression data are microarray studies and the biomedical literature.
Methods
With the aid of text mining and gene expression analysis, we have examined the comparative properties of these two sources of differential gene expression data.
Results
The literature shows a preference for reporting genes associated with higher fold changes in microarray data, rather than genes that are simply significantly differentially expressed. Thus, the resemblance between the literature and microarray data increases when the fold-change threshold for microarray data is increased. Moreover, the literature has a reporting preference for differentially expressed genes that (1) are overexpressed rather than underexpressed; (2) are overexpressed in multiple diseases; and (3) are popular in the biomedical literature at large. Additionally, the degree to which diseases are similar depends on whether microarray data or the literature is used to compare them. Finally, vaguely qualified reports of differential expression magnitudes in the literature correlate only weakly with microarray fold-change data.
Conclusions
Reporting biases of differential gene expression in the literature may be affecting our appreciation of disease biology and of the degree of similarity that actually exists between different diseases.
2.
Motivation
Identification of differentially expressed genes from microarray datasets is one of the most important analyses in microarray data mining. Popular algorithms such as the t-test rank genes based on a single statistic. The false positive rate of these methods can be reduced by considering other features of differentially expressed genes.
Results
We propose a pattern recognition strategy for identifying differentially expressed genes. Genes are mapped to a two-dimensional feature space composed of the average difference in gene expression and the average expression level. A density-based pruning algorithm (DB pruning) is developed to single out potential differentially expressed genes, which are usually located in the sparse boundary region. Biases of popular algorithms for identifying differentially expressed genes are visually characterized. Experiments on 17 datasets from the Gene Expression Omnibus (GEO) database with experimentally verified differentially expressed genes showed that DB pruning can significantly improve the prediction accuracy of popular identification algorithms such as the t-test, rank product, and fold change.
Conclusions
Density-based pruning of non-differentially expressed genes is an effective method for enhancing statistical-testing-based algorithms for identifying differentially expressed genes. It improves the t-test, rank product, and fold change by 11% to 50% in the number of true differentially expressed genes identified. The source code of DB pruning is freely available on our website: http://mleg.cse.sc.edu/degprune
3.
Background
Microarray gene expression data are accumulating in public databases. The expression profiles contain valuable information for understanding human gene expression patterns. However, the effective use of public microarray data requires integrating the expression profiles from heterogeneous sources.
Results
In this study, we have compiled a compendium of microarray expression profiles of various human tissue samples. The microarray raw data generated in different research laboratories have been obtained and combined into a single dataset after data normalization and transformation. To demonstrate the usefulness of the integrated microarray data for studying human gene expression patterns, we have analyzed the dataset to identify potential tissue-selective genes. A new method has been proposed for genome-wide identification of tissue-selective gene targets using both microarray intensity values and detection calls. The candidate genes for brain, liver and testis-selective expression have been examined, and the results suggest that our approach can select some interesting gene targets for further experimental studies.
Conclusion
A computational approach has been developed in this study for combining microarray expression profiles from heterogeneous sources. The integrated microarray data can be used to investigate tissue-selective expression patterns of human genes.
4.
Background
Existing clustering approaches for microarray data do not adequately differentiate between subsets of co-expressed genes. We devised a novel approach that integrates expression and sequence data in order to generate functionally coherent and biologically meaningful subclusters of genes. Specifically, the approach clusters co-expressed genes on the basis of similar content and distributions of predicted statistically significant sequence motifs in their upstream regions.
Results
We applied our method to several sets of co-expressed genes and were able to define subsets with enrichment in particular biological processes and specific upstream regulatory motifs.
Conclusions
These results show the potential of our technique for functional prediction and regulatory motif identification from microarray data.
5.
6.
Background
To identify differentially expressed genes, it is standard practice to test a two-sample hypothesis for each gene with a proper adjustment for multiple testing. Such tests are essentially univariate and disregard the multidimensional structure of microarray data. A more general two-sample hypothesis is formulated in terms of the joint distribution of any sub-vector of expression signals.
Results
By building on an earlier proposed multivariate test statistic, we propose a new algorithm for identifying differentially expressed gene combinations. The algorithm includes an improved random search procedure designed to generate candidate gene combinations of a given size. Cross-validation is used to provide replication stability of the search procedure. A permutation two-sample test is used for significance testing. We design a multiple testing procedure to control the family-wise error rate (FWER) when selecting significant combinations of genes that result from a successive selection procedure. A target set of genes is composed of all significant combinations selected via random search.
Conclusions
A new algorithm has been developed to identify differentially expressed gene combinations. The performance of the proposed search-and-testing procedure has been evaluated by computer simulations and analysis of replicated Affymetrix gene array data on age-related changes in gene expression in the inner ear of CBA mice.
7.
8.
9.
Background
Sets of genes that are known to be associated with each other can be used to interpret microarray data. This gene set approach to microarray data analysis can illustrate patterns of gene expression which may be more informative than analyzing the expression of individual genes. Various statistical approaches exist for the analysis of gene sets. There are three main classes of these methods: over-representation analysis, functional class scoring, and pathway topology based methods.
Methods
We propose weighted hypergeometric and weighted chi-squared methods in order to assign a rank to the degree to which each gene participates in the enrichment. Each gene is assigned a weight determined by the absolute value of its log fold change, which is then raised to a certain power. The power value can be adjusted as needed. Datasets from the Gene Expression Omnibus are used to test the method. The significantly enriched pathways are validated through searching the literature in order to determine their relevance to the dataset.
Results
Although these methods detect fewer significantly enriched pathways, they can potentially produce more relevant results. Furthermore, we compare the results of different enrichment methods on a set of microarray studies all containing data from various rodent neuropathic pain models.
Discussion
Our method is able to produce more consistent results than other methods when evaluated on similar datasets. It can also potentially detect relevant pathways that are not identified by the standard methods. However, the lack of biological ground truth makes validating the method difficult.
10.
Jörg Menche, Amitabh Sharma, Michael H Cho, Ruth J Mayer, Stephen I Rennard, Bartolome Celli, Bruce E Miller, Nick Locantore, Ruth Tal-Singer, Soumitra Ghosh, Chris Larminie, Glyn Bradley, John H Riley, Alvar Agusti, Edwin K Silverman, Albert-László Barabási. BMC Systems Biology, 2014, 8(Z2): S8
Background
An important step toward understanding the biological mechanisms underlying a complex disease is a refined understanding of its clinical heterogeneity. Relating clinical and molecular differences may allow us to define more specific subtypes of patients that respond differently to therapeutic interventions.
Results
We developed a novel unbiased method called diVIsive Shuffling Approach (VIStA) that identifies subgroups of patients by maximizing the difference in their gene expression patterns. We tested our algorithm on 140 subjects with Chronic Obstructive Pulmonary Disease (COPD) and found four distinct, biologically and clinically meaningful combinations of clinical characteristics that are associated with large gene expression differences. The dominant characteristic in these combinations was the severity of airflow limitation. Other frequently identified measures included emphysema, fibrinogen levels, phlegm, BMI and age. A pathway analysis of the differentially expressed genes in the identified subtypes suggests that VIStA is capable of capturing specific molecular signatures within each group.
Conclusions
The introduced methodology allowed us to identify combinations of clinical characteristics that correspond to clear gene expression differences. The resulting subtypes for COPD contribute to a better understanding of its heterogeneity.
11.
Background
Clinical statement alone is not enough to predict the progression of disease. Instead, gene expression profiles have been widely used to forecast clinical outcomes. Many genes related to survival have been identified, and recently miRNA expression signatures predicting patient survival have also been investigated for several cancers. However, miRNAs and their target genes associated with clinical outcomes have remained largely unexplored.
Methods
Here, we demonstrate a survival analysis based on the regulatory relationships of miRNAs and their target genes. Patient survival in two major cancers, ovarian cancer and glioblastoma multiforme (GBM), is investigated through the integrated analysis of miRNA-mRNA interaction pairs.
Results
We found that there is a larger survival difference between two patient groups with an inversely correlated expression profile of miRNA and mRNA. This supports the idea that signatures of miRNAs and their targets related to cancer progression can be detected via this approach.
Conclusions
This integrated analysis can help to discover coordinated expression signatures of miRNAs and their target mRNAs that can be employed for therapeutics in human cancers.
12.
Background
The kidney functions in key physiological processes to filter blood and regulate blood pressure via key molecular transporters and ion channels. Sex-specific differences have been observed in renal disease incidence and progression, as well as acute kidney injury in response to certain drugs. Although advances have been made in characterizing the molecular components involved in various kidney functions, the molecular mechanisms responsible for sex differences are not well understood. We hypothesized that the basal expression levels of genes involved in various kidney functions throughout the life cycle will influence sex-specific susceptibilities to adverse renal events.
Methods
Whole genome microarray gene expression analysis was performed on kidney samples collected from untreated male and female Fischer 344 (F344) rats at eight age groups between 2 and 104 weeks of age.
Results
A combined filtering approach using statistical (ANOVA or pairwise t test, FDR 0.05) and fold-change criteria (>1.5 relative fold change) was used to identify 7,447 unique differentially expressed genes (DEGs). Principal component analysis (PCA) of the 7,447 DEGs revealed sex-related differences in mRNA expression at early (2 weeks), middle (8, 15, and 21 weeks), and late (104 weeks) ages in the rat life cycle. Functional analysis (Ingenuity Pathway Analysis) of these sex-different genes indicated over-representation of specific pathways and networks including renal tubule injury, drug metabolism, and immune cell and inflammatory responses. The mRNAs that code for the qualified urinary protein kidney biomarkers KIM-1, Clu, Tff3, and Lcn2 were also observed to show sex differences.
Conclusions
These data represent one of the most comprehensive in-life time course studies to be published, assessing sex differences in global gene expression in the F344 rat kidney. PCA and Venn analyses reveal specific periods of sexually dimorphic gene expression which are associated with functional categories (xenobiotic metabolism and immune cell and inflammatory responses) of key relevance to acute kidney injury and chronic kidney disease, which may underlie sex-specific susceptibility. Analysis of the basal gene expression patterns of renal genes throughout the life cycle of the rat will improve the use of current and future renal biomarkers and inform our assessments of kidney injury and disease.
13.
14.
15.
Background
Metabolic disorders such as obesity and diabetes are diseases that develop gradually over time in an individual through the perturbation of genes. Systematic experiments tracking disease progression at the gene level are usually conducted, yielding temporal microarray data. There is a need to develop methods to analyze such complex data and extract important proteins that could be involved in the temporal progression of the data and hence the progression of the disease.
Results
In the present study, we have considered temporal microarray data from an experiment conducted to study the development of obesity and diabetes in mice. We have used these data along with an available protein-protein interaction network to find a network of interactions between proteins that reproduces the data at the next time point from the data at the previous time point. We show that the resulting network can be mined to identify critical nodes involved in the temporal progression of perturbations. We further show that published algorithms can be applied to such a connected network to mine important proteins, and we show an overlap between the outputs of the published algorithms and our own. The importance of the identified set of proteins is supported by the literature and was further validated by comparison with the positive genes dataset from the OMIM database, which shows significant overlap.
Conclusions
The critical proteins identified by the algorithms can be hypothesized to play an important role in the temporal progression of the data.
16.
Alex G. Lee, Megan Hagenauer, Devin Absher, Kathleen E. Morrison, Tracy L. Bale, Richard M. Myers, Stanley J. Watson, Huda Akil, Alan F. Schatzberg, David M. Lyons. Biology of Sex Differences, 2017, 8(1): 36
Background
Stress is a recognized risk factor for mood and anxiety disorders that occur more often in women than men. Prefrontal brain regions mediate stress coping, cognitive control, and emotion. Here, we investigate sex differences and stress effects on prefrontal cortical profiles of gene expression in squirrel monkey adults.
Methods
Dorsolateral, ventrolateral, and ventromedial prefrontal cortical regions from 18 females and 12 males were collected after stress or no-stress treatment conditions. Gene expression profiles were acquired using HumanHT-12v4.0 Expression BeadChip arrays adapted for squirrel monkeys.
Results
Extensive variation between prefrontal cortical regions was discerned in the expression of numerous autosomal and sex chromosome genes. Robust sex differences were also identified across prefrontal cortical regions in the expression of mostly autosomal genes. Genes with increased expression in females compared to males were overrepresented in mitogen-activated protein kinase and neurotrophin signaling pathways. Many fewer genes with increased expression in males compared to females were discerned, and no molecular pathways were identified. Effect sizes for sex differences were greater in stress compared to no-stress conditions for ventromedial and ventrolateral prefrontal cortical regions but not dorsolateral prefrontal cortex.
Conclusions
Stress amplifies sex differences in gene expression profiles for prefrontal cortical regions involved in stress coping and emotion regulation. Results suggest molecular targets for new treatments of stress disorders in human mental health.
17.
Background
Microarray technology allows the monitoring of expression levels for thousands of genes simultaneously. This novel technique helps us to understand gene regulation as well as gene-by-gene interactions more systematically. In microarray experiments, however, many undesirable systematic variations are observed. Even in replicated experiments, some variations are commonly observed. Normalization is the process of removing some sources of variation which affect the measured gene expression levels. Although a number of normalization methods have been proposed, it has been difficult to decide which methods perform best. Normalization plays an important role in the early stages of microarray data analysis. The subsequent analysis results are highly dependent on normalization.
Results
In this paper, we use the variability among the replicated slides to compare the performance of normalization methods. We also compare normalization methods with regard to bias and mean square error using simulated data.
Conclusions
Our results show that intensity-dependent normalization often performs better than global normalization methods, and that linear and nonlinear normalization methods perform similarly. These conclusions are based on analysis of 36 cDNA microarrays of 3,840 genes obtained in an experiment to search for changes in gene expression profiles during neuronal differentiation of cortical stem cells. Simulation studies confirm our findings.
18.
19.
Background
Gene expression data extracted from microarray experiments have been used to study the difference between mRNA abundance of genes under different conditions. In one such experiment, thousands of genes are measured simultaneously, which provides a high-dimensional feature space for discriminating between different sample classes. However, most of these dimensions are not informative about the between-class difference, and add noise to the discriminant analysis.
Results
In this paper we propose and study feature selection methods that evaluate the "informativeness" of a set of genes. Two measures of information based on multigene expression profiles are considered for a backward information-driven screening approach for selecting important gene features. By considering multigene expression profiles, we are able to utilize interaction information among these genes. Using a breast cancer dataset, we illustrate our methods and compare them to the performance of existing methods.
Conclusion
We illustrate in this paper that methods considering gene-gene interactions have better classification power in gene expression analysis. In our results, we identify important genes with relatively large p-values from single-gene tests. This indicates that these are genes with weak marginal information but strong interaction information, which will be overlooked by strategies that only examine individual genes.
20.