Similar literature
 20 similar articles found (search time: 15 ms)
1.

Background

Although expression microarrays have become a standard tool used by biologists, analysis of data produced by microarray experiments may still present challenges. Comparison of data from different platforms, organisms, and labs may involve complicated data processing, and inferring relationships between genes remains difficult.

Results

STARNET 2 is a new web-based tool that allows post hoc visual analysis of correlations that are derived from expression microarray data. STARNET 2 facilitates user discovery of putative gene regulatory networks in a variety of species (human, rat, mouse, chicken, zebrafish, Drosophila, C. elegans, S. cerevisiae, Arabidopsis and rice) by graphing networks of genes that are closely co-expressed across a large heterogeneous set of preselected microarray experiments. For each of the represented organisms, raw microarray data were retrieved from NCBI's Gene Expression Omnibus for a selected Affymetrix platform. All pairwise Pearson correlation coefficients were computed separately for the expression profiles measured on each platform. These precompiled results were stored in a MySQL database, and supplemented by additional data retrieved from NCBI. A web-based tool allows user-specified queries of the database, centered at a gene of interest. The result of a query includes graphs of correlation networks, graphs of known interactions involving genes and gene products that are present in the correlation networks, and initial statistical analyses. Two analyses may be performed in parallel to compare networks, which is facilitated by the new HEATSEEKER module.
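The correlation step described above can be illustrated with a minimal Python sketch. It is not the tool's own code: the gene names, the toy expression values, the `neighbours` helper and the 0.9 threshold are all hypothetical, and the real system precomputes its results into a MySQL database rather than an in-memory dictionary.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences.

    Assumes neither sequence is constant (a zero variance would divide by zero).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(profiles):
    """Map each unordered gene pair to the correlation of its expression profiles."""
    return {
        frozenset((g1, g2)): pearson(profiles[g1], profiles[g2])
        for g1, g2 in combinations(sorted(profiles), 2)
    }


def neighbours(correlations, gene, threshold):
    """Genes whose absolute correlation with `gene` meets the threshold, strongest first."""
    hits = []
    for pair, r in correlations.items():
        if gene in pair and abs(r) >= threshold:
            (other,) = pair - {gene}
            hits.append((other, r))
    return sorted(hits, key=lambda item: -abs(item[1]))


# Hypothetical toy data: one expression value per microarray experiment.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 4.0],
}
correlations = pairwise_correlations(profiles)
top = neighbours(correlations, "geneA", threshold=0.9)
```

With these toy values, a query centered at `geneA` keeps the perfectly correlated `geneB` and the perfectly anti-correlated `geneC`, and drops `geneD`, whose correlation falls below the threshold.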

Conclusion

STARNET 2 is a useful tool for developing new hypotheses about regulatory relationships between genes and gene products, and has coverage for 10 species. Interpretation of the correlation networks is supported with a database of previously documented interactions, a test for enrichment of Gene Ontology terms, and heat maps of correlation distances that may be used to compare two networks. The list of genes in a STARNET network may be useful in developing a list of candidate genes to use for the inference of causal networks. The tool is freely available at http://vanburenlab.medicine.tamhsc.edu/starnet2.html, and does not require user registration.

2.

Background

Motif analysis methods have long been central for studying the biological function of nucleotide sequences. Functional genomics experiments extend their potential. They typically generate sequence lists ranked by an experimentally acquired functional property such as gene expression or protein binding affinity. Current motif discovery tools suffer from limitations in searching large motif spaces, and thus more complex motifs may not be included. There is thus a need for motif analysis methods that are tailored for analyzing specific complex motifs motivated by biological questions and hypotheses rather than acting as a screen-based motif finding tool.

Methods

We present Regmex (REGular expression Motif EXplorer), which offers several methods to identify overrepresented motifs in ranked lists of sequences. Regmex uses regular expressions to define motifs or families of motifs and embedded Markov models to calculate exact p-values for motif observations in sequences. Biases in motif distributions across ranked sequence lists are evaluated using random walks, Brownian bridges, or modified rank-based statistics. A modular setup and fast analytic p-value evaluations make Regmex applicable to diverse and potentially large-scale motif analysis problems.
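The idea of scoring a regular-expression motif along a ranked sequence list can be sketched as follows. This is an illustration in Python and not the Regmex R package or its API: the sequences, the pattern and the function names are hypothetical, and the sketch only computes a running-sum (random-walk) path, leaving out the embedded Markov models and the p-value calculations.

```python
import re


def motif_hits(ranked_sequences, pattern):
    """Return one boolean per sequence: does the regular expression occur in it?"""
    motif = re.compile(pattern)
    return [motif.search(seq) is not None for seq in ranked_sequences]


def running_sum(hits):
    """Random-walk path that steps up on a hit and down on a miss.

    Step sizes are scaled so that the path returns to zero at the end of the list.
    """
    n_hits = sum(hits)
    n_miss = len(hits) - n_hits
    if n_hits == 0 or n_miss == 0:
        return [0.0] * len(hits)
    up, down = 1.0 / n_hits, 1.0 / n_miss
    path, position = [], 0.0
    for hit in hits:
        position += up if hit else -down
        path.append(position)
    return path


def max_deviation(path):
    """Signed value of the path point that lies furthest from zero."""
    return max(path, key=abs)


# Hypothetical sequences, already ordered by an experimental score (best first).
ranked = [
    "AAGCACTTTA",
    "GGGCACTTCC",
    "TTTTGCACTT",
    "ACGTACGTAC",
    "CCCCGGGGAA",
    "TATATATATA",
]
# A character class lets one expression describe a small family of motifs.
hits = motif_hits(ranked, r"GCAC[CT]T")
path = running_sum(hits)
score = max_deviation(path)
```

A large positive deviation indicates that motif-containing sequences cluster near the top of the ranking; judging whether such a deviation is significant is the part that the statistics named above address.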

Results

We demonstrate use cases of combined motifs on simulated data and on expression data from microRNA transfection experiments. We confirm previously obtained results and demonstrate the usability of Regmex to test a specific hypothesis about the relative location of microRNA seed sites and U-rich motifs. We further compare the tool with an existing motif discovery tool and show increased sensitivity.

Conclusions

Regmex is a useful and flexible tool to analyze motif hypotheses that relate to large data sets in functional genomics. The method is available as an R package (https://github.com/muhligs/regmex).

3.
We consider the problem of finding the set of rankings that best represents a given group of orderings on the same collection of elements (preference lists). This problem arises from social choice and voting theory, in which each voter gives a preference on a set of alternatives, and a system outputs a single preference order based on the observed voters' preferences. In this paper, we observe that, if the given set of preference lists is not homogeneous, a unique true underlying ranking might not exist. Moreover, only the lists that share the highest amount of information should be aggregated, and thus multiple rankings might provide a more feasible solution to the problem. In this light, we propose Network Selection, an algorithm that, given a heterogeneous group of rankings, first discovers the different communities of homogeneous rankings and then combines only the rank orderings belonging to the same community into a single final ordering. Our novel approach is inspired by graph theory; indeed, our set of lists can be loosely read as the nodes of a network. As a consequence, only the lists populating the same community in the network would then be aggregated. To highlight the strength of our proposal, we show an application on simulated data and on two real datasets, namely a financial and a biological dataset. Experimental results on simulated data show that Network Selection can significantly outperform existing related methods. In addition, the empirical evidence obtained on real financial data reveals that Network Selection is also able to select the most relevant variables in data mining predictive models, yielding models with clearly superior predictive power. Furthermore, we show the potential of our proposal in the bioinformatics field, providing an application to a biological microarray dataset.
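The two-stage idea (group similar rankings first, then aggregate within each group) can be sketched in Python. The abstract does not say how communities are detected or how lists are combined, so this sketch substitutes assumptions of its own: Kendall's tau as the similarity between lists, connected components of a thresholded similarity graph as the communities, and a Borda-style position sum as the aggregation. The voter names and the 0.5 threshold are hypothetical.

```python
from itertools import combinations


def kendall_tau(rank_a, rank_b):
    """Kendall rank correlation between two orderings of the same items (no ties)."""
    pos_a = {item: i for i, item in enumerate(rank_a)}
    pos_b = {item: i for i, item in enumerate(rank_b)}
    concordant = discordant = 0
    for x, y in combinations(rank_a, 2):
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) > 0:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


def communities(rankings, threshold):
    """Connected components of the graph linking rankings whose tau exceeds the threshold."""
    names = list(rankings)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links up to the representative of the component.
        while parent[name] != name:
            name = parent[name]
        return name

    for a, b in combinations(names, 2):
        if kendall_tau(rankings[a], rankings[b]) > threshold:
            parent[find(a)] = find(b)
    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return list(groups.values())


def borda(rank_lists):
    """Aggregate orderings by summed positions (lower total is better, ties broken by name)."""
    totals = {}
    for ranking in rank_lists:
        for position, item in enumerate(ranking):
            totals[item] = totals.get(item, 0) + position
    return sorted(totals, key=lambda item: (totals[item], item))


# Hypothetical preference lists from four voters over the same four items.
rankings = {
    "v1": list("ABCD"),
    "v2": list("ABDC"),
    "v3": list("DCBA"),
    "v4": list("DCAB"),
}
groups = communities(rankings, threshold=0.5)
consensus = [borda([rankings[name] for name in group]) for group in groups]
```

On this toy input the four lists split into two communities with opposite preferences, and each community gets its own consensus ordering instead of one compromise ranking that represents neither group.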

4.

Background  

The ever-expanding population of gene expression profiles (EPs) from specified cells and tissues under a variety of experimental conditions is an important but difficult resource for investigators to utilize effectively. Software tools have been recently developed to use the distribution of gene ontology (GO) terms associated with the genes in an EP to identify specific biological functions or processes that are over- or under-represented in that EP relative to other EPs. Additionally, it is possible to use the distribution of GO terms inherent to each EP to relate that EP as a whole to other EPs. Because GO term annotation is organized in a tree-like cascade of variable granularity, this approach allows the user to relate (e.g., by hierarchical clustering) EPs of varying length and from different platforms (e.g., GeneChip, SAGE, EST library).

5.
SUMMARY: We present here Blast2GO (B2G), a research tool designed with the main purpose of enabling Gene Ontology (GO) based data mining on sequence data for which no GO annotation is yet available. B2G joins in one application GO annotation based on similarity searches with statistical analysis and highlighted visualization on directed acyclic graphs. This tool offers a suitable platform for functional genomics research in non-model species. B2G is an intuitive and interactive desktop application that allows monitoring and comprehension of the whole annotation and analysis process. AVAILABILITY: Blast2GO is freely available via Java Web Start at http://www.blast2go.de. SUPPLEMENTARY MATERIAL: http://www.blast2go.de -> Evaluation.

6.
Activation tagging in plants: a tool for gene discovery   (total citations: 8; self-citations: 0; citations by others: 8)
A significant limitation of classical loss-of-function screens designed to dissect genetic pathways is that they rarely uncover genes that function redundantly, are compensated by alternative metabolic or regulatory circuits, or have an additional role in early embryo or gametophyte development. Activation T-DNA tagging is one approach that has emerged in plants to help circumvent these potential problems. This technique utilises a T-DNA sequence that contains four tandem copies of the cauliflower mosaic virus (CaMV) 35S enhancer sequence. This element enhances the expression of neighbouring genes on either side of the randomly integrated T-DNA tag, resulting in gain-of-function phenotypes. Activation tagging has identified a number of genes fundamental to plant development, metabolism and disease resistance in Arabidopsis. This review provides selected examples of these discoveries to highlight the utility of this technology. The recent development of activation tagging strategies for other model plant systems and the construction of new, more sophisticated vectors for the generation of conditional alleles are also discussed. These recent advances have significantly expanded the horizons for gain-of-function genetics in plants.

7.

Background

Families of related proteins and their different functions may be described systematically using common classifications and ontologies such as Pfam and the Gene Ontology (GO). However, many proteins consist of multiple domains, and each domain, or some combination of domains, can be responsible for a particular molecular function. Therefore, identifying which domains should be associated with a specific function is a non-trivial task.

Results

We describe a general approach for the computational discovery of associations between different sets of annotations by formalising the problem as a bipartite graph enrichment problem in the setting of a tripartite graph. We call this approach “CODAC” (for COmputational Discovery of Direct Associations using Common Neighbours). As one application of this approach, we describe “GODomainMiner” for associating GO terms with protein domains. We used GODomainMiner to predict GO-domain associations between each of the 3 GO ontology namespaces (MF, BP, and CC) and the Pfam, CATH, and SCOP domain classifications. Overall, GODomainMiner yields average enrichments of 15-, 41- and 25-fold GO-domain associations compared to the existing GO annotations in these 3 domain classifications, respectively.
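The common-neighbour idea can be sketched in Python by treating proteins as the middle layer of a tripartite graph that links GO terms on one side to domains on the other. This is not the authors' implementation: the abstract does not give the scoring function, so the hypergeometric tail probability used here is an assumption, and all protein, term and domain labels are hypothetical.

```python
from collections import defaultdict
from math import comb


def hypergeom_tail(k, n_total, n_a, n_b):
    """P(overlap >= k) for two random subsets of sizes n_a and n_b of n_total items."""
    upper = min(n_a, n_b)
    favourable = sum(
        comb(n_a, i) * comb(n_total - n_a, n_b - i) for i in range(k, upper + 1)
    )
    return favourable / comb(n_total, n_b)


def invert(annotations):
    """Turn {protein: labels} into {label: set of proteins}."""
    inverted = defaultdict(set)
    for protein, labels in annotations.items():
        for label in labels:
            inverted[label].add(protein)
    return inverted


def associate(protein_go, protein_domain):
    """Score each (GO term, domain) pair by the proteins the two have in common."""
    universe = set(protein_go) | set(protein_domain)
    go_to_proteins = invert(protein_go)
    domain_to_proteins = invert(protein_domain)
    results = []
    for term, term_proteins in go_to_proteins.items():
        for domain, domain_proteins in domain_to_proteins.items():
            shared = len(term_proteins & domain_proteins)
            if shared == 0:
                continue
            p = hypergeom_tail(
                shared, len(universe), len(term_proteins), len(domain_proteins)
            )
            results.append((term, domain, shared, p))
    return sorted(results, key=lambda row: row[3])


# Hypothetical annotations: proteins are the common neighbours of terms and domains.
protein_go = {
    "P1": {"GO:A"}, "P2": {"GO:A"}, "P3": {"GO:A"},
    "P4": {"GO:B"}, "P5": {"GO:B"}, "P6": set(),
}
protein_domain = {
    "P1": {"D1"}, "P2": {"D1"}, "P3": {"D1", "D2"},
    "P4": {"D2"}, "P5": {"D2"}, "P6": {"D2"},
}
ranked_pairs = associate(protein_go, protein_domain)
```

In the toy data, term `GO:A` and domain `D1` are carried by exactly the same three of six proteins, so that pair is ranked first; a real analysis would also need a multiple-testing correction, which the sketch omits.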

Conclusions

These associations could potentially be used to annotate many of the protein chains in the Protein Data Bank and protein sequences in UniProt whose domain composition is known but which currently lack GO annotation.

8.

Background  

Composition Profiler is a web-based tool for semi-automatic discovery of enrichment or depletion of amino acids, either individually or grouped by their physico-chemical or structural properties.

9.
We present GENECODIS, a web-based tool that integrates different sources of information to search for annotations that frequently co-occur in a set of genes and rank them by statistical significance. The analysis of concurrent annotations provides significant information for the biological interpretation of high-throughput experiments and may outperform the results of standard methods for the functional analysis of gene lists. GENECODIS is publicly available at .

10.
11.
Comparative genomics as a tool for gene discovery   (total citations: 1; self-citations: 0; citations by others: 1)
With the increasing availability of data from multiple eukaryotic genome sequencing projects, attention has focused on interspecific comparisons to discover novel genes and transcribed genomic sequences. Generally, these extrinsic strategies combine ab initio gene prediction with expression and/or homology data to identify conserved gene candidates between two or more genomes. Interspecific sequence analyses have proven invaluable for the improvement of existing annotations, automation of annotation, and identification of novel coding regions and splice variants. Further, comparative genomic approaches hold the promise of improved prediction of terminal or small exons, microRNA precursors, and small peptide-encoding open reading frames--sequence elements that are difficult to identify through purely intrinsic methodologies in the absence of experimental data.

12.
Among new insights coming from the completion of sequencing of the human genome, reported in Nature and Science, are clues of how evolution has increased the complexity of species, and in particular how the genetic code has enabled this process. It is clear that life has not only evolved by increasing the number of genes, but also by ingeniously evolving an efficient code for expressing diversity in the building blocks (i.e. the amino acids). The rules of nucleic acid base pairing and the classification of amino acids according to hydrophobicity/hydrophilicity relationships define a binary DNA code, which determines the general biophysical characteristics of proteins. Sense and antisense strands can encode protein segments having inverted and complementary hydropathy. The underlying binary code controls association and dissociation of proteins and presumably represents a primordial code that might have emerged in the early stages of self-organizing biochemical cycles. It is the purpose of this communication to provide a perspective of the code in the context of a binary language from its primordial origin to its present-day format and to propose the use of this code as a genomic mining tool.

13.
REVIGO summarizes and visualizes long lists of gene ontology terms   (total citations: 1; self-citations: 0; citations by others: 1)
Outcomes of high-throughput biological experiments are typically interpreted by statistical testing for enriched gene functional categories defined by the Gene Ontology (GO). The resulting lists of GO terms may be large and highly redundant, and thus difficult to interpret. REVIGO is a Web server that summarizes long, unintelligible lists of GO terms by finding a representative subset of the terms using a simple clustering algorithm that relies on semantic similarity measures. Furthermore, REVIGO visualizes this non-redundant GO term set in multiple ways to assist in interpretation: multidimensional scaling and graph-based visualizations accurately render the subdivisions and the semantic relationships in the data, while treemaps and tag clouds are also offered as alternative views. REVIGO is freely available at http://revigo.irb.hr/.
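Selecting a representative, non-redundant subset of terms can be illustrated with a greedy Python sketch. It is a simplification and not REVIGO's actual algorithm: the term identifiers, p-values, precomputed similarity scores and the 0.7 cutoff are hypothetical, and a real run would derive the similarities from the GO graph with a semantic similarity measure.

```python
def reduce_terms(p_values, similarity, cutoff=0.7):
    """Keep a term only if it is not too similar to any term kept before it.

    Terms are visited from most to least significant, so each group of
    mutually similar terms is represented by its most significant member.
    """
    representatives = []
    for term in sorted(p_values, key=p_values.get):
        redundant = any(
            similarity.get(frozenset((term, kept)), 0.0) >= cutoff
            for kept in representatives
        )
        if not redundant:
            representatives.append(term)
    return representatives


# Hypothetical enrichment results and pairwise semantic similarities in [0, 1].
p_values = {"T1": 0.001, "T2": 0.002, "T3": 0.01}
similarity = {
    frozenset(("T1", "T2")): 0.9,
    frozenset(("T1", "T3")): 0.2,
    frozenset(("T2", "T3")): 0.3,
}
summary = reduce_terms(p_values, similarity)
```

Here `T2` is dropped because it closely resembles the more significant `T1`, while the dissimilar `T3` is retained. Pairs missing from the similarity table are treated as unrelated, which is a design choice of this sketch.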

14.
15.
BisoGenet: a new tool for gene network building, visualization and analysis   (total citations: 1; self-citations: 0; citations by others: 1)

Background  

The increasing availability and diversity of omics data in the post-genomic era offers new perspectives in most areas of biomedical research. Graph-based biological network models capture the topology of the functional relationships between molecular entities such as genes, proteins and small compounds and provide a suitable framework for integrating and analyzing omics data. The development of software tools capable of integrating data from different sources and providing flexible methods to reconstruct, represent and analyze topological networks is an active field of research in bioinformatics.

16.
Admixture mapping is a rapidly developing method to map susceptibility alleles in complex genetic disease associated with continental ancestry. Theoretically, when admixture between continental populations has occurred relatively recently, the chromosomal segments derived from the parental populations can be deduced from the differences in genotype allele frequencies. Progress in computational algorithms, in identification of ancestry informative single nucleotide polymorphisms, and in recent studies applying these tools suggests that this approach will complement other strategies for identifying the variation that underlies many complex diseases.

17.
18.
Summary: BicOverlapper is a tool to visualize biclusters from gene-expression matrices in a way that helps to compare biclustering methods, to unravel trends and to highlight relevant genes and conditions. A visual approach can complement biological and statistical analysis and reduce the time spent by specialists interpreting the results of biclustering algorithms. The technique is based on a force-directed graph where biclusters are represented as flexible overlapped groups of genes and conditions. Availability: The BicOverlapper software and supplementary material are available at http://vis.usal.es/bicoverlapper Contact: rodri{at}usal.es Associate Editor: John Quackenbush The first two authors should be reported as joint first authors.

19.
SUMMARY: SeqExpress is a stand-alone desktop application for the identification of relevant genes within collections of microarray or SAGE experiments. A number of analysis, filtering and visualization tools are provided to aid in the selection of groups of genes. If R is installed then the application can use this to provide further analysis. AVAILABILITY: SeqExpress is available at: http://www.seqexpress.com

20.
Exome sequencing - the targeted sequencing of the subset of the human genome that is protein coding - is a powerful and cost-effective new tool for dissecting the genetic basis of diseases and traits that have proved to be intractable to conventional gene-discovery strategies. Over the past 2 years, experimental and analytical approaches relating to exome sequencing have established a rich framework for discovering the genes underlying unsolved Mendelian disorders. Additionally, exome sequencing is being adapted to explore the extent to which rare alleles explain the heritability of complex diseases and health-related traits. These advances also set the stage for applying exome and whole-genome sequencing to facilitate clinical diagnosis and personalized disease-risk profiling.
