Similar articles
20 similar articles found.
1.

Background  

Several supervised and unsupervised learning tools are available to classify functional genomics data. However, relatively little attention has been given to exploratory, visualisation-driven approaches. Such approaches should satisfy the following requirements: support for intuitive cluster visualisation, user-friendly and robust application, computational efficiency, and generation of biologically meaningful outcomes. This research assesses a relaxation method for non-linear mapping that addresses these concerns. Its applications to gene expression and protein-protein interaction data analyses are investigated.

2.
3.
Biological databases face challenges in four main areas: (1) integration, interoperation and federation; (2) ontologies and definitions of semantics; (3) community annotation; and (4) integration of data analysis tools with databases. Each of these areas provides interesting targets for research and development.

4.
SUMMARY: AnnBuilder is an R package for assembling genomic annotation data. The system currently provides parsers to process annotation data from LocusLink, the Gene Ontology Consortium, and the Human Genome Project, and can be extended to new data sources via user-defined parsers. AnnBuilder differs from other existing systems in that it lets users assemble data from any sources they select. The products of AnnBuilder are files in XML format that can be easily used by different systems. AVAILABILITY: http://www.bioconductor.org (open source).

5.
Conventional statistical methods for interpreting microarray data require large numbers of replicates in order to provide sufficient levels of sensitivity. We recently described a method for identifying differentially expressed genes in one-channel microarray data [1]. Based on the idea that the variance structure of microarray data can itself be a reliable measure of noise, this method allows statistically sound interpretation of as few as two replicates per treatment condition. Unlike the one-channel array, the two-channel platform simultaneously compares gene expression in two RNA samples. This leads to covariation of the measured signals. Hence, by accounting for covariation in the variance model, we can significantly increase the power of the statistical test. We believe that this approach has the potential to overcome limitations of existing methods. We present here a novel approach for the analysis of microarray data that involves modeling the variance structure of paired expression data in the context of a Bayesian framework. We also describe a novel statistical test that can be used to identify differentially expressed genes. This method, bivariate microarray analysis (BMA), demonstrates dramatically improved sensitivity over existing approaches. We show that with only two array replicates, it is possible to detect gene expression changes that are at best detected with six array replicates by other methods. Further, we show that combining results from BMA with Gene Ontology annotation yields biologically significant results in a ligand-treated macrophage cell system.

6.

Background  

Expression array data are used to predict biological functions of uncharacterized genes by comparing their expression profiles to those of characterized genes. While biologically plausible, this is both statistically and computationally challenging. Typical approaches are computationally expensive and ignore correlations among expression profiles and functional categories.
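
The profile-comparison idea described above can be illustrated with a minimal sketch. This is only a naive baseline, not the method of the article: it assumes Pearson correlation as the similarity measure and assigns the function of the single best-correlated characterized gene. The gene names, function labels, and expression values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Assumes neither sequence is constant (a zero variance would divide by zero).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def predict_function(profile, annotated):
    """Return (function, correlation) for the best-correlated characterized gene.

    `annotated` maps gene name -> (function label, expression profile).
    """
    _, (function, reference) = max(
        annotated.items(), key=lambda item: pearson(profile, item[1][1])
    )
    return function, pearson(profile, reference)


# Hypothetical toy data: two characterized genes, one uncharacterized profile.
annotated = {
    "geneA": ("ribosome biogenesis", [1.0, 2.0, 3.0, 4.0]),
    "geneB": ("stress response", [4.0, 3.0, 2.0, 1.0]),
}
label, r = predict_function([2.0, 4.1, 5.9, 8.0], annotated)
print(label, round(r, 4))
```

Because this baseline scores each gene pair in isolation, it is an example of the kind of approach that ignores correlations among profiles and among functional categories.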

7.

Background  

In contemporary biology, complex biological processes are increasingly studied by collecting and analyzing measurements of the same entities with different analytical platforms. Such data comprise a number of data blocks that are coupled via a common mode. The goal of collecting this type of data is to discover biological mechanisms that underlie the behavior of the variables in the different data blocks. The simultaneous component analysis (SCA) family of data analysis methods is suited for this task. However, an SCA may be hampered by the data blocks being subjected to different amounts of measurement error, or noise. To unveil the true mechanisms underlying the data, it could be fruitful to take noise heterogeneity into consideration in the data analysis. Maximum likelihood based SCA (MxLSCA-P) was developed for this purpose. In a previous simulation study it outperformed normal SCA-P. This previous study, however, did not mimic typical functional genomics data sets in many respects, such as data blocks coupled via the experimental mode, more variables than experimental units, and medium to high correlations between variables. Here, we present a new simulation study in which the usefulness of MxLSCA-P compared to ordinary SCA-P is evaluated within a typical functional genomics setting. Subsequently, the performance of the two methods is evaluated by analysis of a real-life Escherichia coli metabolomics data set.

8.
An object model and database for functional genomics (total citations: 2; self-citations: 0; citations by others: 2)
MOTIVATION: Large-scale functional genomics analysis is now feasible and presents significant challenges in data analysis, storage and querying. Data standards are required to enable the development of public data repositories and to improve data sharing. There is an established data format for microarrays (microarray gene expression markup language, MAGE-ML) and a draft standard for proteomics (PEDRo). We believe that all types of functional genomics experiments should be annotated in a consistent manner, and we hope to open up new ways of comparing multiple datasets used in functional genomics. RESULTS: We have created a functional genomics experiment object model (FGE-OM), developed from the microarray model, MAGE-OM, and two models for proteomics, PEDRo and our own model (Gla-PSI, the Glasgow Proposal for the Proteomics Standards Initiative). FGE-OM comprises three namespaces representing (i) the parts of the model common to all functional genomics experiments; (ii) microarray-specific components; and (iii) proteomics-specific components. We believe that FGE-OM should initiate discussion about the contents and structure of the next version of MAGE and the future of proteomics standards. A prototype database called RNA And Protein Abundance Database (RAPAD), based on FGE-OM, has been implemented and populated with data from microbial pathogenesis. AVAILABILITY: FGE-OM and the RAPAD schema are available from http://www.gusdb.org/fge.html, along with a set of more detailed diagrams. RAPAD can be accessed by registration at the site.

9.
10.
The Functional Genomics Experiment data model (FuGE) has been developed to facilitate convergence of data standards for high-throughput, comprehensive analyses in biology. FuGE models the components of an experimental activity that are common across different technologies, including protocols, samples and data. FuGE provides a foundation for describing entire laboratory workflows and for the development of new data formats. The Microarray Gene Expression Data society and the Proteomics Standards Initiative have committed to using FuGE as the basis for defining their respective standards, and other standards groups, including the Metabolomics Standards Initiative, are evaluating FuGE in their development efforts. Adoption of FuGE by multiple standards bodies will enable uniform reporting of common parts of functional genomics workflows, simplify data-integration efforts and ease the burden on researchers seeking to fulfill multiple minimum reporting requirements. Such advances are important for transparent data management and mining in functional genomics and systems biology.

11.
High-throughput genomic measurements, interpreted as co-occurring data samples from multiple sources, open up a fresh problem for machine learning: what do the different data sets have in common, that is, what kind of statistical dependencies are there between the paired samples from the different sets? We introduce a clustering algorithm for exploring the dependencies. Samples within each data set are grouped such that the dependencies between groups of different sets capture as much of the pairwise dependencies between the samples as possible. We formalize this problem in a novel probabilistic way, as optimization of a Bayes factor. The method is applied to reveal commonalities and exceptions in gene expression between organisms and to suggest regulatory interactions in the form of dependencies between gene expression profiles and regulator binding patterns.

12.
Gene expression microarrays and oligonucleotide GeneChips have provided biologists with a means of measuring, in a single experiment, the expression levels of entire genomes under a variety of conditions. As with any nascent field, there is no single accepted method for analyzing the new data types, with new methods appearing monthly. Investigators using the new technology must constantly seek access to the latest tools and explore their data in multiple ways. The functional genomics data pipeline provides an integrated, extendable analysis environment permitting multiple, simultaneous analyses to be automatically performed and provides a web server and interface for presenting results. AVAILABILITY: Source code and executables are available under the GNU public license at http://bioinformatics.fccc.edu/

13.
Next-generation sequencing (NGS) technology is revolutionizing the fields of population genetics, molecular ecology and conservation biology. But it can be challenging for researchers to learn the new and rapidly evolving techniques required to use NGS data. A recent workshop entitled ‘Population Genomic Data Analysis’ was held to provide training in conceptual and practical aspects of data production and analysis for population genomics, with an emphasis on NGS data analysis. This workshop brought together 16 instructors who were experts in the field of population genomics and 31 student participants. Instructors provided helpful and often entertaining advice regarding how to choose and use an NGS method for a given research question, and regarding critical aspects of NGS data production and analysis such as library preparation, filtering to remove sequencing errors and outlier loci, and genotype calling. In addition, instructors provided general advice about how to approach population genomics data analysis and how to build a career in science. The overarching messages of the workshop were that NGS data analysis should be approached with a keen understanding of the theoretical models underlying the analyses, and with analyses tailored to each research question and project. When analysed carefully, NGS data provide extremely powerful tools for answering crucial questions in disciplines ranging from evolution and ecology to conservation and agriculture, including questions that could not be answered prior to the development of NGS technology.

14.

Background  

The transfer of functional annotations from model organism proteins to human proteins is one of the main applications of comparative genomics. Various methods are used to analyze cross-species orthologous relationships according to an operational definition of orthology. Often the definition of orthology is incorrectly interpreted as a prediction of proteins that are functionally equivalent across species, while in fact it only defines the existence of a common ancestor for a gene in different species. However, it has been demonstrated that orthologs often reveal significant functional similarity. Therefore, the quality of the orthology prediction is an important factor in the transfer of functional annotations (and other related information). To identify protein pairs with the highest possible functional similarity, it is important to qualify ortholog identification methods.
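
One widely used operational definition of orthology is the reciprocal best hit: two proteins are paired when each is the other's top-scoring match across species. The abstract does not say which definitions were evaluated, so the following is only an illustrative sketch under that assumption. The protein identifiers and similarity scores are hypothetical.

```python
def best_hits(scores):
    """Map each query protein to its highest-scoring target.

    `scores` has the form {query: {target: similarity_score}}.
    """
    return {
        query: max(targets, key=targets.get)
        for query, targets in scores.items()
        if targets
    }


def reciprocal_best_hits(a_to_b, b_to_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_to_b)
    best_ba = best_hits(b_to_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical similarity scores between model-organism (m*) and human (h*) proteins.
model_to_human = {
    "m1": {"h1": 250.0, "h2": 90.0},
    "m2": {"h2": 180.0, "h1": 60.0},
    "m3": {"h1": 120.0},
}
human_to_model = {
    "h1": {"m1": 245.0, "m3": 118.0},
    "h2": {"m2": 175.0, "m1": 88.0},
}
pairs = reciprocal_best_hits(model_to_human, human_to_model)
print(pairs)
```

Here m3 is left unpaired because its best hit, h1, matches m1 more strongly. As the abstract stresses, a pair found this way indicates likely common ancestry, not guaranteed functional equivalence.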

15.

Background  

Efficient analysis of results from mass spectrometry-based proteomics experiments requires access to disparate data types, including native mass spectrometry files, output from algorithms that assign peptide sequence to MS/MS spectra, and annotation for proteins and pathways from various database sources. Moreover, proteomics technologies and experimental methods are not yet standardized; hence a high degree of flexibility is necessary for efficient support of high- and low-throughput data analytic tasks. Development of a desktop environment that is sufficiently robust for deployment in data analytic pipelines, and simultaneously supports customization for programmers and non-programmers alike, has proven to be a significant challenge.

16.
Schizophrenia (SZ) is a complex disorder resulting from both genetic and environmental causes, with a worldwide lifetime prevalence of 1%; however, there are no specific, sensitive and validated biomarkers for SZ. A general unifying hypothesis has been put forward that disease-associated single nucleotide polymorphisms (SNPs) from genome-wide association studies (GWAS) are more likely to be associated with gene expression quantitative trait loci (eQTL). We will describe this hypothesis and review primary methodology with refinements for testing this paradigmatic approach in SZ. We will describe biomarker studies of SZ and testing enrichment of SNPs that are associated both with eQTLs and existing GWAS of SZ. SZ-associated SNPs that overlap with eQTLs can be placed into gene-gene expression, protein-protein and protein-DNA interaction networks. Further, those networks can be tested by reducing or silencing the gene expression levels of critical nodes. We present pilot data to support these methods of investigation, such as the use of eQTLs to annotate GWAS of SZ, which could be applied to the field of biomarker discovery. Those networks that have association with SNP markers, especially cis-regulated expression, might lead to a clearer understanding of important candidate genes that predispose to disease and alter expression. This method has general application to many complex disorders.

17.
A report of the recent EMBO Conference 'From Functional Genomics to Systems Biology' held at the EMBL Advanced Training Centre, Heidelberg, Germany, 13-16 November 2010.

18.
SUMMARY: DroPhEA is a core module of a web application that facilitates research in insect functional genomics through enrichment analysis on mutant phenotypes of fruit fly (Drosophila melanogaster). The phenotypes investigated in the analyses can be predefined by FlyBase or customized by users. DroPhEA allows users to specify mutation or ortholog types, displays enriched term results in a hierarchical structure and supports analyses on gene sets of all insect species with a fully sequenced genome.
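
Enrichment analysis of this kind is commonly scored with a one-sided hypergeometric test, which asks how surprising the overlap between a user's gene set and the genes annotated with a phenotype term is. The summary does not state which statistic DroPhEA uses, so the sketch below is an assumption for illustration only. The function name and the gene counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    population: number of genes in the background
    annotated:  background genes carrying the phenotype term
    sample:     size of the user's gene set
    overlap:    genes in the set that carry the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 background genes, 5 with the term,
# and a 4-gene set of which 3 carry the term.
p = hypergeom_enrichment_p(population=20, annotated=5, sample=4, overlap=3)
print(p)
```

In practice the p-values for many terms would also need a multiple-testing correction, which this sketch omits.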

19.
20.
RNAi for plant functional genomics (total citations: 9; self-citations: 0; citations by others: 9)
A major challenge in the post-genome era of plant biology is to determine the functions of all the genes in the plant genome. A straightforward approach to this problem is to reduce or knock out expression of a gene with the hope of seeing a phenotype that is suggestive of its function. Insertional mutagenesis is a useful tool for this type of study, but it is limited by gene redundancy, lethal knock-outs, nontagged mutants and the inability to target the inserted element to a specific gene. RNA interference (RNAi) of plant genes, using constructs encoding self-complementary 'hairpin' RNA, largely overcomes these problems. RNAi has been used very effectively in Caenorhabditis elegans functional genomics, and resources are currently being developed for the application of RNAi to high-throughput plant functional genomics.
