Found 20 similar articles (search time: 46 ms)
1.
EASE is a customizable software application for rapid biological interpretation of gene lists that result from the analysis
of microarray, proteomics, SAGE, and other high-throughput genomic data. The biological themes returned by EASE recapitulate
manually determined themes in previously published gene lists and are robust to varying methods of normalization, intensity
calculation and statistical selection of genes. EASE is a powerful tool for rapidly converting the results of functional genomics
studies from "genes to themes."
2.
The model plant Arabidopsis has been well-studied using high-throughput genomics technologies, which usually generate lists of differentially expressed genes under various conditions. Our group recently collected 1065 gene lists from 397 gene expression studies as a knowledgebase for pathway analysis. Here we systematically analyzed these gene lists by computing overlaps in all-vs.-all comparisons. We identified 16,261 statistically significant overlaps, represented by an undirected network in which nodes correspond to gene lists and edges indicate significant overlaps. The network highlights the correlation across the gene expression signatures of the diverse biological processes. We also partitioned the main network into 20 sub-networks, representing groups of highly similar expression signatures. These are common sets of genes that were co-regulated under different treatments or conditions and are often related to specific biological themes. Overall, our result suggests that diverse gene expression signatures are highly interconnected in a modular fashion.
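The all-vs.-all overlap comparison described in this abstract can be sketched as follows. This is only an illustration: the abstract does not say which significance test or multiple-testing correction was used, so the one-sided hypergeometric test, the Bonferroni threshold, and the names `overlap_pvalue`, `overlap_network`, and `universe_size` are all assumptions made here.

```python
from itertools import combinations
from math import comb


def overlap_pvalue(list_a, list_b, universe_size):
    """P(overlap >= observed) under a hypergeometric null.

    Assumption: both lists are drawn from the same background of
    `universe_size` genes; the abstract does not name the test used.
    """
    a, b = set(list_a), set(list_b)
    n_a, n_b = len(a), len(b)
    if max(n_a, n_b) > universe_size:
        raise ValueError("gene list larger than the background universe")
    observed = len(a & b)
    total = comb(universe_size, n_b)
    # Sum the upper tail: every overlap size from observed up to the maximum.
    tail = sum(
        comb(n_a, i) * comb(universe_size - n_a, n_b - i)
        for i in range(observed, min(n_a, n_b) + 1)
    )
    return tail / total


def overlap_network(gene_lists, universe_size, alpha=0.05):
    """Return edges (name_x, name_y, p) for significantly overlapping lists.

    Assumption: Bonferroni correction over all pairs (hypothetical choice).
    """
    pairs = list(combinations(sorted(gene_lists), 2))
    if not pairs:
        return []
    threshold = alpha / len(pairs)
    edges = []
    for x, y in pairs:
        p = overlap_pvalue(gene_lists[x], gene_lists[y], universe_size)
        if p <= threshold:
            edges.append((x, y, p))
    return edges


if __name__ == "__main__":
    # Toy gene lists with made-up identifiers, for illustration only.
    lists = {
        "A": [f"g{i}" for i in range(50)],
        "B": [f"g{i}" for i in range(25, 75)],
        "C": [f"g{i}" for i in range(500, 550)],
    }
    for x, y, p in overlap_network(lists, universe_size=1000):
        print(x, y, p)
```

Nodes are the gene-list names and each returned tuple is an undirected edge; partitioning the resulting network into sub-networks would be a separate clustering step that is not shown here.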
3.
Background
One of the challenges in the analysis of microarray data is to integrate and compare the selected (e.g., differential) gene lists from multiple experiments for common or unique underlying biological themes. A common way to approach this problem is to extract common genes from these gene lists and then subject these genes to enrichment analysis to reveal the underlying biology. However, the capacity of this approach is largely restricted by the limited number of common genes shared by datasets from multiple experiments, which could be caused by the complexity of the biological system itself.
4.
Transcript profiling during preimplantation mouse development (total citations: 2; self-citations: 0; citations by others: 2)
5.
Even-Faitelson L Samach A Melamed-Bessudo C Avivi-Ragolsky N Levy AA 《The Plant journal : for cell and molecular biology》2011,68(5):929-937
Targeted modification of the genome is an important genetic tool, which can be achieved via homologous, non-homologous or site-specific recombination. Although numerous efforts have been made, such a tool does not exist for routine applications in plants. This work describes a simple and useful method for targeted mutagenesis or gene targeting, tailored to floral-dip transformation in Arabidopsis, by means of specific protein expression in the egg cell. Proteins stably or transiently expressed under the egg apparatus-specific enhancer (EASE) were successfully localized to the area of the egg cell. Moreover, a zinc-finger nuclease expressed under EASE induced targeted mutagenesis. Mutations obtained under EASE control corresponded to genetically independent events that took place specifically in the germline. In addition, RAD54 expression under EASE led to an approximately 10-fold increase in gene targeting efficiency, when compared with wild-type plants. EASE-controlled gene expression provides a method for the precise engineering of the Arabidopsis genome through temporally and spatially controlled protein expression. This system can be implemented as a useful method for basic research in Arabidopsis, as well as in the optimization of tools for targeted genetic modifications in crop plants.
6.
Yang W Jefferson RA Huttner E Moore JM Gagliano WB Grossniklaus U 《Plant physiology》2005,139(3):1421-1432
Despite a central role in angiosperm reproduction, few gametophyte-specific genes and promoters have been isolated, particularly for the inaccessible female gametophyte (embryo sac). Using the Ds-based enhancer-detector line ET253, we have cloned an egg apparatus-specific enhancer (EASE) from Arabidopsis (Arabidopsis thaliana). The genomic region flanking the Ds insertion site was further analyzed by examining its capability to control gusA and GFP reporter gene expression in the embryo sac in a transgenic context. Through analysis of a 5' and 3' deletion series in transgenic Arabidopsis, the sequence responsible for egg apparatus-specific expression was delineated to 77 bp. Our data showed that this enhancer is unique in the Arabidopsis genome, is conserved among different accessions, and shows an unusual pattern of sequence variation. This EASE works independently of position and orientation in Arabidopsis but is probably not associated with any nearby gene, suggesting either that it acts over a large distance or that a cryptic element was detected. Embryo-specific ablation in Arabidopsis was achieved by transactivation of a diphtheria toxin gene under the control of the EASE. The potential application of the EASE element and similar control elements as part of an open-source biotechnology toolkit for apomixis is discussed.
7.
Background
Over-representation analysis (ORA) detects enrichment of genes within biological categories. Gene Ontology (GO) domains are commonly used for gene/gene-product annotation. When ORA is employed, often times there are hundreds of statistically significant GO terms per gene set. Comparing enriched categories between a large number of analyses and identifying the term within the GO hierarchy with the most connections is challenging. Furthermore, ascertaining biological themes representative of the samples can be highly subjective from the interpretation of the enriched categories.
Results
We developed goSTAG for utilizing GO Subtrees to Tag and Annotate Genes that are part of a set. Given gene lists from microarray, RNA sequencing (RNA-Seq) or other genomic high-throughput technologies, goSTAG performs GO enrichment analysis and clusters the GO terms based on the p-values from the significance tests. GO subtrees are constructed for each cluster, and the term that has the most paths to the root within the subtree is used to tag and annotate the cluster as the biological theme. We tested goSTAG on a microarray gene expression data set of samples acquired from the bone marrow of rats exposed to cancer therapeutic drugs to determine whether the combination or the order of administration influenced bone marrow toxicity at the level of gene expression. Several clusters were labeled with GO biological processes (BPs) from the subtrees that are indicative of some of the prominent pathways modulated in bone marrow from animals treated with an oxaliplatin/topotecan combination. In particular, negative regulation of MAP kinase activity was the biological theme exclusively in the cluster associated with enrichment at 6 h after treatment with oxaliplatin followed by control. However, nucleoside triphosphate catabolic process was the GO BP labeled exclusively at 6 h after treatment with topotecan followed by control.
Conclusions
goSTAG converts gene lists from genomic analyses into biological themes by enriching biological categories and constructing GO subtrees from over-represented terms in the clusters. The terms with the most paths to the root in the subtree are used to represent the biological themes. goSTAG is developed in R as a Bioconductor package and is available at https://bioconductor.org/packages/goSTAG
8.
The fact that there is very little if any overlap between the genes of different prognostic signatures for early-discovery breast cancer is well documented. The reasons for this apparent discrepancy have been explained by the limits of simple machine-learning identification and ranking techniques, and the biological relevance and meaning of the prognostic gene lists was questioned. Subsequently, proponents of the prognostic gene lists claimed that different lists do capture similar underlying biological processes and pathways. The present study places under scrutiny the validity of this claim, for two important gene lists that are at the focus of current large-scale validation efforts. We performed careful enrichment analysis, controlling the effects of multiple testing in a manner which takes into account the nested dependent structure of gene ontologies. In contradiction to several previous publications, we find that the only biological process or pathway for which statistically significant concordance can be claimed is cell proliferation, a process whose relevance and prognostic value was well known long before gene expression profiling. We found that the claims reported by others, of wider concordance between the biological processes captured by the two prognostic signatures studied, were found either to be lacking statistical rigor or were in fact based on addressing some other question.
9.
Analysis of multivariate data sets from, for example, microarray studies frequently results in lists of genes which are associated with some response of interest. The biological interpretation is often complicated by the statistical instability of the obtained gene lists, which may partly be due to the functional redundancy among genes, implying that multiple genes can play exchangeable roles in the cell. In this paper, we use the concept of exchangeability of random variables to model this functional redundancy and thereby account for the instability. We present a flexible framework to incorporate the exchangeability into the representation of lists. The proposed framework supports straightforward comparison between any 2 lists. It can also be used to generate new more stable gene rankings incorporating more information from the experimental data. Using 2 microarray data sets, we show that the proposed method provides more robust gene rankings than existing methods with respect to sampling variations, without compromising the biological significance of the rankings.
10.
Background
High throughput methods of the genome era produce vast amounts of data in the form of gene lists. These lists are large and difficult to interpret without advanced computational or bioinformatic tools. Most existing methods analyse a gene list as a single entity although it is comprised of multiple gene groups associated with separate biological functions. Therefore it is imperative to define and visualize gene groups with unique functionality within gene lists.
11.
GeneInfoMiner is a web-based system for searching Medline abstracts using sequence ID lists such as GenBank accession numbers derived from high-throughput experiments. It will map query results to MeSH topics to facilitate the exploration of the biological significance of the sequence ID lists. GeneInfoMiner is based on a custom gene and protein name identification engine that can map gene and protein names to important molecular biology databases.
12.
13.
Xin He Moushumi Sen Sarma Xu Ling Brant Chee Chengxiang Zhai Bruce Schatz 《BMC bioinformatics》2010,11(1):272
Background
Large-scale genomic studies often identify large gene lists, for example, the genes sharing the same expression patterns. The interpretation of these gene lists is generally achieved by extracting concepts overrepresented in the gene lists. This analysis often depends on manual annotation of genes based on controlled vocabularies, in particular, Gene Ontology (GO). However, the annotation of genes is a labor-intensive process; and the vocabularies are generally incomplete, leaving some important biological domains inadequately covered.
14.
15.
Werner T 《Current opinion in biotechnology》2008,19(1):50-54
16.
Huang da W Sherman BT Tan Q Collins JR Alvord WG Roayaei J Stephens R Baseler MW Lane HC Lempicki RA 《Genome biology》2007,8(9):R183
The DAVID Gene Functional Classification Tool uses a novel agglomeration algorithm to condense a list of genes or associated biological terms into organized classes of
related genes or biology, called biological modules. This organization is accomplished by mining the complex biological co-occurrences
found in multiple sources of functional annotation. It is a powerful method to group functionally related genes and terms
into a manageable number of biological modules for efficient interpretation of gene lists in a network context.
17.
We have recently reported on the isolation of a 5.7 kb segment of Chinese hamster ovary cell genomic DNA, Expression Augmenting
Sequence Element (EASE), which when used in bicistronic expression vectors allows the development of stable Chinese hamster
ovary cell pools in a five to seven week time period that express high levels of recombinant protein (6–25 μg per 10^6 cells per day
depending on the protein). In the present study, we have mapped the activity of the EASE to a 2.1 kb region using colony forming
assays and developed bicistronic expression vectors with the smaller EASE or control lambda DNA. The recovery of pools expressing
the hematopoietic growth factor, FLT3 Ligand, in methotrexate-containing media took 1 to 4 weeks less when using EASE expression
vectors compared with control vectors. The cell pools developed with the EASE and control vectors had similar final protein
expression levels. Southern blot analysis suggested the expression cassette from the EASE containing vectors integrated in
tandem arrays arranged in either head to head or head to tail fashion. By contrast, control vectors appeared to integrate
with multiple interruptions to the expression vector. Thus, the EASE, within a bicistronic expression vector, appeared to
facilitate tandem vector integration and reduce the time required to develop cell pools for protein expression.
This revised version was published online in August 2006 with corrections to the Cover Date.
18.
The outcome of a functional genomics pipeline is usually a partial list of genomic features, ranked by their relevance in modelling biological phenotype in terms of a classification or regression model. Due to resampling protocols or to a meta-analysis comparison, it is often the case that sets of alternative feature lists (possibly of different lengths) are obtained, instead of just one list. Here we introduce a method, based on permutations, for studying the variability between lists ("list stability") in the case of lists of unequal length. We provide algorithms evaluating stability for lists embedded in the full feature set or just limited to the features occurring in the partial lists. The method is demonstrated by finding and comparing gene profiles on a large prostate cancer dataset, consisting of two cohorts of patients from different countries, for a total of 455 samples.
19.
Levine DM Haynor DR Castle JC Stepaniants SB Pellegrini M Mao M Johnson JM 《Genome biology》2006,7(10):R93-17
Background
Interpretation of lists of genes or proteins with altered expression is a critical and time-consuming part of microarray and proteomics research, but relatively little attention has been paid to methods for extracting biological meaning from these output lists. One powerful approach is to examine the expression of predefined biological pathways and gene sets, such as metabolic and signaling pathways and macromolecular complexes. Although many methods for measuring pathway expression have been proposed, a systematic analysis of the performance of multiple methods over multiple independent data sets has not previously been reported.
20.