Similar articles
1.
2.
As the volume of genomic data grows, computational methods become essential for providing a first glimpse into gene annotations. Automated Gene Ontology (GO) annotation methods based on hierarchical ensemble classification techniques are particularly interesting when interpretability of annotation results is a main concern. In these methods, raw GO-term predictions computed by base binary classifiers are leveraged by checking the consistency of predefined GO relationships. Both formal leveraging strategies, focused mainly on annotation precision, and heuristic alternatives, focused mainly on scalability, have been described in the literature. In this contribution, a factor graph approach to the hierarchical ensemble formulation of the automated GO annotation problem is presented. In this formal framework, a core factor graph is first built based on the GO structure and then enriched to take into account the noisy nature of GO-term predictions. Then, starting from raw GO-term predictions, an iterative message passing algorithm between nodes of the factor graph is used to compute marginal probabilities of target GO-terms. Evaluations on Saccharomyces cerevisiae, Arabidopsis thaliana and Drosophila melanogaster protein sequences from the GO Molecular Function domain showed significant improvements over competing approaches, even when protein sequences were naively characterized by their physicochemical and secondary structure properties or when loose noisy annotation datasets were considered. Based on these promising results and using Arabidopsis thaliana annotation data, we extend our approach to the identification of the most promising molecular function annotations for a set of proteins of unknown function in Solanum lycopersicum.
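
The message passing idea in this abstract can be illustrated on the smallest possible case. The following is a minimal sketch, not the authors' model: it assumes a two-term hierarchy (one parent GO term, one child), treats each raw classifier score as a unary factor, and adds one pairwise factor that forbids "child annotated, parent not annotated". The function name `chain_marginals` and the example scores are hypothetical.

```python
def chain_marginals(score_parent, score_child):
    """Sum-product on a parent-child pair of binary GO-term variables.

    Assumption: raw classifier scores in [0, 1] are used directly as unary
    factors; the paper's treatment of noisy predictions is richer than this.
    """
    # Unary factors, indexed by state 0 (not annotated) / 1 (annotated).
    phi_p = [1.0 - score_parent, score_parent]
    phi_c = [1.0 - score_child, score_child]
    # Pairwise consistency factor psi[p][c]: child=1 with parent=0 is forbidden.
    psi = [[1.0, 0.0],
           [1.0, 1.0]]

    # Messages sent through the pairwise factor in both directions.
    msg_c_to_p = [sum(phi_c[c] * psi[p][c] for c in (0, 1)) for p in (0, 1)]
    msg_p_to_c = [sum(phi_p[p] * psi[p][c] for p in (0, 1)) for c in (0, 1)]

    # Unnormalised beliefs: local factor times incoming message.
    belief_p = [phi_p[p] * msg_c_to_p[p] for p in (0, 1)]
    belief_c = [phi_c[c] * msg_p_to_c[c] for c in (0, 1)]
    if sum(belief_p) == 0.0:
        raise ValueError("scores are incompatible with the hierarchy")

    # Normalise to obtain marginal probabilities of the 'annotated' state.
    return belief_p[1] / sum(belief_p), belief_c[1] / sum(belief_c)


parent_marginal, child_marginal = chain_marginals(0.3, 0.8)
print(round(parent_marginal, 4), round(child_marginal, 4))
```

On a tree-shaped graph like this one, a single pass of messages gives exact marginals. The GO is a directed acyclic graph, which is presumably why the abstract describes the algorithm as iterative.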

3.
Next‐generation technologies generate an overwhelming amount of gene sequence data. Efficient annotation tools are required to make these data amenable to functional genomics analyses. The Mercator pipeline automatically assigns functional terms to protein or nucleotide sequences. It uses the MapMan ‘BIN’ ontology, which is tailored for functional annotation of plant ‘omics’ data. The classification procedure performs parallel sequence searches against reference databases, compiles the results and computes the most likely MapMan BINs for each query. In the current version, the pipeline relies on manually curated reference classifications originating from the three reference organisms (Arabidopsis, Chlamydomonas, rice), various other plant species that have a reviewed SwissProt annotation, and more than 2000 protein domain and family profiles at InterPro, CDD and KOG. Functional annotations predicted by Mercator achieve accuracies above 90% when benchmarked against manual annotation. In addition to mapping files for direct use in the visualization software MapMan, Mercator provides graphical overview charts, detailed annotation information in a convenient web browser interface and a MapMan‐to‐GO translation table to export results as GO terms. Mercator is available free of charge via http://mapman.gabipd.org/web/guest/app/Mercator .

4.
Gene Ontology annotation quality analysis in model eukaryotes
Functional analysis using the Gene Ontology (GO) is crucial for array analysis, but it is often difficult for researchers to assess the amount and quality of GO annotations associated with different sets of gene products. In many cases the source of the GO annotations and the date they were last updated are not apparent, further complicating a researcher's ability to assess the quality of the GO data provided. Moreover, GO biocurators need to ensure that GO quality is maintained and optimal for the functional processes that are most relevant for their research community. We report the GO Annotation Quality (GAQ) score, a quantitative measure of GO quality that includes the breadth of GO annotation, the level of detail of annotation and the type of evidence used to make the annotation. As a case study, we apply the GAQ scoring method to a set of diverse eukaryotes and demonstrate how the GAQ score can be used to track changes in GO annotations over time and to assess the quality of GO annotations available for specific biological processes. The GAQ score also allows researchers to quantitatively assess the functional data available for their experimental systems (arrays or databases).

5.
6.
7.
The evolutionary origin of eukaryotes spurred the transition from prokaryotic-like translation to a more sophisticated, eukaryotic translation. During this process, successive duplication of a single, primordial gene encoding the mRNA cap-binding protein eukaryotic translation initiation factor 4E (eIF4E) gave rise to a plethora of paralogous genes across eukaryotes that underwent further functional diversification in RNA metabolism. The ability to take on different roles is due to the promiscuity of eIF4E in binding many partner proteins, rendering eIF4E a highly versatile and multifunctional player that functions as a molecular wildcard. Thus, in metazoans, eIF4E paralogs are involved in various processes, including messenger RNA (mRNA) processing, export, translation, storage, and decay. Moreover, some paralogs display differential expression in tissues and developmental stages and show variable biochemical properties. In this review, we discuss recent advances shedding light on the functional diversification of eIF4E in metazoans. We emphasise humans and two phylogenetically distant species which have become paradigms for studies on development, namely the fruit fly Drosophila melanogaster and the roundworm Caenorhabditis elegans.

8.
《Fly》2013,7(6):291-299
The Drosophila 12 genome data set was used to construct whole genome, gene family presence/absence matrices using a broad range of E-value cutoffs as criteria for gene family inclusion. The various matrices generated behave differently in phylogenetic analyses as a function of the E-value employed. Based on an optimality criterion that maximizes internal corroboration of information, we show that values of e-105 to e-125 extract the most internally consistent phylogenetic signal. The functional class of most genes and gene families can be accurately determined based on the D. melanogaster genome annotation. We used the gene ontology (GO) system to create partitions based on gene function. Several measures of phylogenetic congruence (diagnosis, consistency, partitioned support, hidden support) for different higher and lower level GO categories were used to mine the data set for genes and gene families that show strong agreement or disagreement with the overall combined phylogenetic hypothesis. We propose that measures of phylogenetic congruence can be used as criteria to identify loci with related GO terms that have a significant impact on cladogenesis.
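
How an E-value cutoff shapes a presence/absence matrix can be sketched in a few lines. This is only an illustration under stated assumptions: the input is taken to be the best-hit E-value of each gene family in each species, a family counts as present when that E-value is at or below the cutoff, and the family names, species labels and E-values are made up. The abstract does not describe the authors' actual pipeline at this level.

```python
def presence_absence(best_evalues, species, cutoff):
    """Build one 0/1 row per gene family.

    best_evalues: {family: {species: best-hit E-value}} (hypothetical layout).
    A missing species entry is treated as 'no hit' (infinite E-value).
    """
    return {
        family: [1 if hits.get(sp, float("inf")) <= cutoff else 0 for sp in species]
        for family, hits in best_evalues.items()
    }


# Toy data: two families scored against three species.
species = ["D.mel", "D.sim", "D.vir"]
best_evalues = {
    "famA": {"D.mel": 1e-130, "D.sim": 1e-128, "D.vir": 1e-60},
    "famB": {"D.mel": 1e-110, "D.sim": 1e-40},
}

# A stricter cutoff drops weaker hits, so the matrix changes with it.
print(presence_absence(best_evalues, species, 1e-105))
print(presence_absence(best_evalues, species, 1e-125))
```

The two printed matrices differ only because of the cutoff, which is the effect the abstract reports on downstream phylogenetic analyses.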

9.
10.
Mazur A. M., Kholod N. S., Seit-Nebi A., Kisselev L. L. 《Molecular Biology》2002,36(1):104-109
Termination of protein synthesis (hydrolysis of the last peptidyl-tRNA on the ribosome) takes place when the ribosomal A site is occupied simultaneously by one of the three stop codons and by a class-1 translation termination factor. The existing procedures to measure the functional activity of this factor both in vitro and in vivo have serious drawbacks, chiefly the artificial conditions of in vitro assays, which are far from those in the cell, and the indirect evaluation of activity in in vivo systems. A simple, reliable, and sensitive system to measure the functional activity of class-1 translation termination factors could considerably expedite the study of the terminal steps of protein synthesis, which remain poorly understood, especially in eukaryotes. We suggest a novel system to test the functional activity in vitro using native functionally active mRNA, rather than tri-, tetra-, or oligonucleotides as before. This mRNA is specially designed to contain one of the three terminating (stop) codons within the coding nucleotide sequence. Plasmids have been generated that carry the genes of suppressor tRNAs, each of which is specific toward one of the three stop codons. They were shown to support normal synthesis of a reporter protein, luciferase, by reading through the stop codon within the coding mRNA sequence. We have demonstrated that human class-1 translation termination factor eRF1 is able to compete with suppressor tRNA for a stop codon and to completely prevent its suppressive effect at a sufficient concentration. Forms of eRF1 with point mutations in functionally essential regions have lower competitive ability, demonstrating the sensitivity of the method to the eRF1 structure. The enzymatic reaction catalyzed by the full-size reporter protein is accompanied by emission of light quanta. Therefore, competition between suppressor tRNA and eRF1 can be measured using a luminometer, and this allows precise kinetic measurements in a continuous automatic mode.

11.
12.
《Biomarkers》2013,18(7):580-586
Abstract

Objective: To analyze differentially expressed genes and identify featured biomarkers in prostatic carcinoma.

Methods: The software “Significance Analysis of Microarray” (SAM) was used to identify the differentially coexpressed genes (DCGs). The DCGs found in two datasets were analyzed by GO (Gene Ontology) functional annotation.

Results: A total of 389 DCGs were obtained. GO analysis showed that these DCGs were closely related to acinus development, the TGF-β receptor and signal transduction pathways. Furthermore, five featured biomarkers were discovered by interaction analysis.

Conclusion: These important signal pathways and oncogenes may provide potential therapeutic targets for prostatic carcinoma.

13.
14.

Background

With the increased availability of high-throughput data, such as DNA microarray data, researchers are capable of producing large amounts of biological data. During the analysis of such data there is often a need to further explore the similarity of genes not only with respect to their expression, but also with respect to their functional annotation, which can be obtained from the Gene Ontology (GO).

Results

We present the freely available software package GOSim, which calculates the functional similarity of genes based on various information theoretic similarity concepts for GO terms. GOSim extends existing tools by providing additional, recently developed functional similarity measures for genes. These can be used, for example, to cluster genes according to their biological function. Conversely, they can also be used to evaluate the homogeneity of a given grouping of genes with respect to their GO annotation. GOSim hence provides the researcher with a flexible and powerful tool to combine knowledge stored in GO with experimental data. It can be seen as complementary to other tools that, for instance, search for significantly overrepresented GO terms within a given group of genes.

Conclusion

GOSim is implemented as a package for the statistical computing environment R and is distributed under the GPL through the CRAN project.
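
To make "information theoretic similarity" of GO terms concrete, here is a minimal sketch of one widely used measure of that kind, Resnik similarity: the information content, -log p(t), of the most informative ancestor shared by two terms. It is written in Python for illustration and does not use or reproduce the GOSim API. The abstract does not say which measures GOSim implements, and the tiny ontology, term names and annotation frequencies below are made up.

```python
import math

# Hypothetical toy ontology: term -> list of direct parents.
PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "dna_binding": ["binding"],
    "rna_binding": ["binding"],
}

# Hypothetical annotation frequencies p(t): the fraction of genes annotated
# with the term or any of its descendants, so the root has p = 1.
FREQ = {"root": 1.0, "binding": 0.5, "catalysis": 0.5,
        "dna_binding": 0.2, "rna_binding": 0.1}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def resnik(term_a, term_b):
    """Information content of the most informative common ancestor."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(-math.log(FREQ[t]) for t in common)


print(resnik("dna_binding", "rna_binding"))  # shared ancestor 'binding'
print(resnik("dna_binding", "catalysis"))    # only the root is shared
```

Gene-level similarity, as used for clustering, is typically obtained by combining such term-level scores over the two genes' annotation sets. That step is omitted here.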

15.
16.

Background  

The Gene Ontology (GO) is a well-known controlled vocabulary describing the biological process, molecular function and cellular component aspects of gene annotation. It has become a widely used knowledge source in bioinformatics for annotating genes and measuring their semantic similarity. These measures generally involve the GO graph structure, the information content of GO aspects, or a combination of both. However, only a few of the semantic similarity measures described so far can handle GO annotations differently according to their origin (i.e. their evidence codes).

17.
Terminal oligopyrimidine (TOP) mRNAs (encoded by the TOP genes) are identified by a sequence of 6–12 pyrimidines at the 5′ end and by a growth-associated translational regulation. All vertebrate genes for the 80 ribosomal proteins and some other genes involved, directly or indirectly, in translation, are TOP genes. Among the numerous translation factors, only eEF1A and eEF2 are known to be encoded by TOP genes, most of the others not having been analyzed. Here, we report a systematic analysis of the human genes for translation factors. Our results show that: (1) all five elongation factors are encoded by TOP genes; and (2) among the initiation and termination factors analyzed, only eIF3e, eIF3f, and eIF3h exhibit the characteristics of TOP genes. Interestingly, these three polypeptides have recently been shown to constitute a specific subgroup among eIF3 subunits. In fact, eIF3e, eIF3f, and eIF3h are part of the functional core of eIF3 that is not conserved in Saccharomyces cerevisiae. It has been hypothesized that they are regulatory subunits, and the fact that they are encoded by TOP genes may be relevant for their function.
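
The sequence half of the TOP definition above lends itself to a simple screen. The sketch below is an assumption-laden illustration, not the authors' method: it measures the uninterrupted pyrimidine run at the 5′ end of a transcript and flags runs of at least 6 bases, the lower bound given in the abstract. How longer runs should be treated is not specified there, so no upper bound is enforced. The growth-associated translational regulation cannot be judged from sequence alone, and the example sequences are invented.

```python
PYRIMIDINES = set("CTU")  # accept both DNA (T) and RNA (U) notation


def leading_pyrimidine_run(sequence):
    """Length of the uninterrupted pyrimidine stretch at the 5' end."""
    run = 0
    for base in sequence.upper():
        if base not in PYRIMIDINES:
            break
        run += 1
    return run


def has_top_like_start(sequence, min_len=6):
    """True if the 5' end carries at least min_len pyrimidines.

    min_len=6 follows the abstract's '6-12 pyrimidines'; this is a sequence
    screen only and does not establish that a gene is a TOP gene.
    """
    return leading_pyrimidine_run(sequence) >= min_len


# Invented 5' ends, written from the first transcribed base.
print(has_top_like_start("CTTTTCCTGTGGCAGC"))  # run of 8 pyrimidines
print(has_top_like_start("CTTAGCCTGTGG"))      # run of 3 pyrimidines
```

A screen like this depends on an accurately mapped transcription start site, since a shift of a few bases changes the measured run.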

18.
The chicken genome has been sequenced and this, together with microarray and other functional genomics technologies, makes post-genomic research possible in the chicken. At this time, however, such research is hindered by a lack of genomic structural and functional annotations. Bio-ontologies have been developed for different annotation requirements, as well as to facilitate data sharing and computational analysis, but these are not yet optimally utilized in the chicken. Here we discuss genomic annotation and bio-ontologies. We focus specifically on the Gene Ontology (GO), chicken GO annotations and how these can facilitate functional genomics in the chicken. The GO is the most developed and widely used bio-ontology. It is the de facto standard for functional annotation. Despite its critical importance in analyzing microarray and other functional genomics data, relatively few chicken gene products have any GO annotation. When these are available, the average quality of chicken gene product annotations (defined using evidence code weight and annotation depth) is much lower than in mouse. Moreover, tools allowing chicken researchers to easily and rapidly use the GO are either lacking or hard to use. To address all of these problems we developed ChickGO and AgBase. Chicken GO annotations are provided by complementary work at MSU-AgBase and EBI-GOA. The GO tools pipeline at AgBase uses GO to derive functional and biological significance from microarray and other functional genomics data. Not only will improved genomic annotation and tools to use these annotations benefit the chicken research community but they will also facilitate research in other avian species and comparative genomics.

19.
20.