Similar literature
1.
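Liu WY. 《遗传》 (Yi Chuan), 2012, 34(1): 59-71
Xenopus is an important model animal in biomedical research. Based on the Xenopus tropicalis genome data released by NCBI, this article used bioinformatic methods to extract and identify basic helix-loop-helix (bHLH) genes across the whole genome, classified them by phylogenetic analysis, and performed a Gene Ontology (GO) functional enrichment analysis, with the aim of examining the classification and function of the X. tropicalis bHLH transcription factor gene family as a whole. Seventy bHLH transcription factors were found in the X. tropicalis genome database; 69 of them could be assigned to 34 subfamilies in six major groups (A-F), and the remaining one was an "orphan" gene. GO enrichment statistics identified 51 significantly enriched GO annotation terms, among which transcription regulator activity, regulation of transcription, DNA binding, regulation of RNA metabolic process, regulation of transcription (DNA-dependent), transcription, and transcription factor activity occurred at high frequency, indicating that these GO terms are the most common functions of Xenopus bHLH genes. Many bHLH transcription factors play regulatory roles in important developmental or physiological processes, such as the differentiation and development of muscle tissues and organs (striated muscle, skeletal muscle, eye and pharyngeal muscles), digestive system development, development of the pharynx and sensory organs, regulation of nucleobase, nucleoside and nucleic acid metabolism, regulation of biosynthetic processes, DNA binding, and protein heterodimerization activity. In addition, GO terms for several important signaling pathways were significantly enriched. The article also presents an evolutionary analysis of the Hes transcription factor family. These results provide a solid foundation for further study of X. tropicalis bHLH genes.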

2.
MOTIVATION: Numerous annotations are available that functionally characterize genes and proteins with regard to molecular process, cellular localization, tissue expression, protein domain composition, protein interaction, disease association and other properties. Searching this steadily growing amount of information can lead to the discovery of new biological relationships between genes and proteins. To facilitate the searches, methods are required that measure the annotation similarity of genes and proteins. However, most current similarity methods are focused only on annotations from the Gene Ontology (GO) and do not take other annotation sources into account. RESULTS: We introduce the new method BioSim that incorporates multiple sources of annotations to quantify the functional similarity of genes and proteins. We compared the performance of our method with four other well-known methods adapted to use multiple annotation sources. We evaluated the methods by searching for known functional relationships using annotations based only on GO or on our large data warehouse BioMyn. This warehouse integrates many diverse annotation sources of human genes and proteins. We observed that the search performance improved substantially for almost all methods when multiple annotation sources were included. In particular, our method outperformed the other methods in terms of recall and average precision.

4.
MOTIVATION: In general, the most accurate gene/protein annotations are provided by curators. Although computational methods carry weaker evidence, they are unavoidable for fast, a priori discovery of protein function annotations. This paper considers the problem of assigning Gene Ontology (GO) annotations to partially annotated or newly discovered proteins. RESULTS: We present a data mining technique that computes the probabilistic relationships between GO annotations of proteins on protein-protein interaction data, and assigns highly correlated GO terms of annotated proteins to non-annotated proteins in the target set. In comparison with other techniques, probabilistic suffix tree and correlation mining techniques produce the highest prediction accuracy: 81% precision at 45% recall. AVAILABILITY: Code is available upon request. Results and the materials used are available online at http://kirac.case.edu/PROTAN.

7.
Glycoproteins and lipids in the Golgi complex are modified by the addition of sugars. In the yeast Saccharomyces cerevisiae, these terminal Golgi carbohydrate modifications primarily involve mannose additions that utilize GDP-mannose as the substrate. The transport of GDP-mannose from its site of synthesis in the cytosol into the lumen of the Golgi is mediated by the VRG4 gene product, a nucleotide sugar transporter that is a member of a large family of related membrane proteins. Loss of VRG4 function leads to lethality, but several viable vrg4 mutants were isolated whose GDP-mannose transport activity was reduced but not obliterated. Mutations in these alleles mapped to a region of the Vrg4 protein that is highly conserved among other GDP-mannose transporters but not other types of nucleotide sugar transporters. Here, we present evidence that suggests an involvement of this region of the protein in binding GDP-mannose. Most of the mutations that were introduced within this conserved domain, spanning amino acids 280-291 of Vrg4p, lead to lethality, and none interfere with Vrg4 protein stability, localization, or dimer formation. The null phenotype of these mutant vrg4 alleles can be complemented by their overexpression. Vesicles prepared from vrg4 mutant strains were reduced in luminal GDP-mannose transport activity, but this effect could be suppressed by increasing the concentration of GDP-mannose in vitro. Thus, either an increased substrate concentration, in vitro, or an increased Vrg4 protein concentration, in vivo, can suppress these vrg4 mutant phenotypes. Vrg4 proteins with alterations in this region were reduced in binding to guanosine 5'-[gamma-(32)P]triphosphate gamma-azidoanilide, a photoaffinity substrate analogue whose binding to Vrg4-HAp was specifically inhibited by GDP-mannose. Taken together, these data are consistent with the model that amino acids in this region of the yeast GDP-mannose transporter mediate the recognition of or binding to nucleotide sugar prior to its transport into the Golgi.

8.
Despite the structure and objectivity provided by the Gene Ontology (GO), the annotation of proteins is a complex task that is subject to errors and inconsistencies. Electronically inferred annotations in particular are widely considered unreliable. However, given that manual curation of all GO annotations is unfeasible, it is imperative to improve the quality of electronically inferred annotations. In this work, we analyze the full GO molecular function annotation of UniProtKB proteins, and discuss some of the issues that affect their quality, focusing particularly on the lack of annotation consistency. Based on our analysis, we estimate that 64% of the UniProtKB proteins are incompletely annotated, and that inconsistent annotations affect 83% of the protein functions and at least 23% of the proteins. Additionally, we present and evaluate a data mining algorithm, based on the association rule learning methodology, for identifying implicit relationships between molecular function terms. The goal of this algorithm is to assist GO curators in updating GO and correcting and preventing inconsistent annotations. Our algorithm predicted 501 relationships with an estimated precision of 94%, whereas the basic association rule learning methodology predicted 12,352 relationships with a precision below 9%.
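The "basic association rule learning methodology" that this abstract uses as its baseline can be illustrated roughly as follows. This is only a sketch of plain pairwise rule mining with support and confidence thresholds, not the authors' refined algorithm, which the abstract does not describe. The function name, the thresholds, and the toy annotation data are all assumptions made for illustration.

```python
from itertools import permutations


def association_rules(annotations, min_support=2, min_confidence=0.8):
    """Mine pairwise rules "term A -> term B" from protein annotations.

    annotations: dict mapping a protein ID to the set of terms it carries.
    Returns a sorted list of (antecedent, consequent, support, confidence).
    Thresholds are illustrative defaults, not values from the paper.
    """
    term_counts = {}
    pair_counts = {}
    for terms in annotations.values():
        for term in terms:
            term_counts[term] = term_counts.get(term, 0) + 1
        # Count every ordered pair of distinct terms on the same protein.
        for a, b in permutations(sorted(terms), 2):
            pair_counts[(a, b)] = pair_counts.get((a, b), 0) + 1

    rules = []
    for (a, b), support in pair_counts.items():
        # Confidence: fraction of proteins with A that also carry B.
        confidence = support / term_counts[a]
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return sorted(rules)


# Toy data (hypothetical proteins, made-up annotation sets).
toy = {
    "P1": {"kinase activity", "ATP binding"},
    "P2": {"kinase activity", "ATP binding"},
    "P3": {"ATP binding"},
}
rules = association_rules(toy)
```

On the toy data, only the rule from "kinase activity" to "ATP binding" passes both thresholds; the reverse rule has a confidence of 2/3 and is dropped. Loosening the thresholds yields many more candidate rules, which is consistent with the low precision the abstract reports for the unrefined approach.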

9.
Gene Ontology annotation quality analysis in model eukaryotes (total citations: 1; self-citations: 0; citations by others: 1)
Functional analysis using the Gene Ontology (GO) is crucial for array analysis, but it is often difficult for researchers to assess the amount and quality of GO annotations associated with different sets of gene products. In many cases the source of the GO annotations and the date the GO annotations were last updated are not apparent, further complicating a researcher's ability to assess the quality of the GO data provided. Moreover, GO biocurators need to ensure that the GO quality is maintained and optimal for the functional processes that are most relevant for their research community. We report the GO Annotation Quality (GAQ) score, a quantitative measure of GO quality that includes breadth of GO annotation, the level of detail of annotation and the type of evidence used to make the annotation. As a case study, we apply the GAQ scoring method to a set of diverse eukaryotes and demonstrate how the GAQ score can be used to track changes in GO annotations over time and to assess the quality of GO annotations available for specific biological processes. The GAQ score also allows researchers to quantitatively assess the functional data available for their experimental systems (arrays or databases).

10.
GoSurfer (total citations: 2; self-citations: 0; citations by others: 2)
The analysis of complex patterns of gene regulation is central to understanding the biology of cells, tissues and organisms. Patterns of gene regulation pertaining to specific biological processes can be revealed by a variety of experimental strategies, particularly microarrays and other highly parallel methods, which generate large datasets linking many genes. Although methods for detecting gene expression have improved substantially in recent years, understanding the physiological implications of complex patterns in gene expression data is a major challenge. This article presents GoSurfer, an easy-to-use graphical exploration tool with built-in statistical features that allow a rapid assessment of the biological functions represented in large gene sets. GoSurfer takes one or two list(s) of gene identifiers (Affymetrix probe set ID) as input and retrieves all the Gene Ontology (GO) terms associated with the input genes. GoSurfer visualises these GO terms in a hierarchical tree format. With GoSurfer, users can perform statistical tests to search for the GO terms that are enriched in the annotations of the input genes. These GO terms can be highlighted on the GO tree. Users can manipulate the GO tree in various ways and interactively query the genes associated with any GO term. The user-generated graphics can be saved as graphics files, and all the GO information related to the input genes can be exported as text files. AVAILABILITY: GoSurfer is a Windows-based program freely available for noncommercial use and can be downloaded at http://www.gosurfer.org. Datasets used to construct the trees shown in the figures in this article are available at http://www.gosurfer.org/download/GoSurfer.zip.
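The abstract does not say which statistical test GoSurfer applies when searching for enriched GO terms. As an illustration of the general idea only, the sketch below uses a one-sided hypergeometric test, a common choice for GO enrichment; this choice, the function name, and the example numbers are assumptions, not details of GoSurfer itself.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """One-sided enrichment p-value P(X >= hits).

    population: number of genes in the background set
    annotated:  background genes that carry the GO term
    sample:     number of genes in the input list
    hits:       input genes that carry the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the hypergeometric probabilities for hits, hits + 1, ..., upper.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 10 background genes, 5 carry the term,
# and all 5 genes in the input list carry it.
p_value = hypergeom_enrichment_p(population=10, annotated=5, sample=5, hits=5)
```

When many GO terms are tested at once, the resulting p-values would normally also need a multiple-testing correction, which this sketch leaves out.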

13.
MOTIVATION: Despite advances in the gene annotation process, the functions of a large portion of gene products remain insufficiently characterized. In addition, the in silico prediction of novel Gene Ontology (GO) annotations for partially characterized gene functions or processes is highly dependent on reverse genetic or functional genomic approaches. To our knowledge, no prediction method has been demonstrated to be highly accurate for sparsely annotated GO terms (those associated with fewer than 10 genes). RESULTS: We propose a novel approach, information theory-based semantic similarity (ITSS), to automatically predict molecular functions of genes based on existing GO annotations. Using a 10-fold cross-validation, we demonstrate that the ITSS algorithm obtains prediction accuracies (precision 97%, recall 77%) comparable to other machine learning algorithms when compared in similar conditions over densely annotated portions of the GO datasets. This method is able to generate highly accurate predictions in sparsely annotated portions of GO, where previous algorithms have failed. As a result, our technique generates an order of magnitude more functional predictions than previous methods. A 10-fold cross validation demonstrated a precision of 90% at a recall of 36% for the algorithm over sparsely annotated networks of the recent GO annotations (about 1400 GO terms and 11,000 genes in Homo sapiens). To our knowledge, this article presents the first historical rollback validation for the predicted GO annotations, which may represent more realistic conditions than more widely used cross-validation approaches. By manually assessing a random sample of 100 predictions conducted in a historical rollback evaluation, we estimate that a minimum precision of 51% (95% confidence interval: 43-58%) can be achieved for the human GO Annotation file dated 2003. AVAILABILITY: The program is available on request. The 97,732 positive predictions of novel gene annotations from the 2005 GO Annotation dataset and other supplementary information are available at http://phenos.bsd.uchicago.edu/ITSS/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

20.
The equine genome sequence enables the use of high-throughput genomic technologies in equine research, but accurate identification of expressed gene products and interpretation of their biological relevance require additional structural and functional genome annotation. Here, we employ the equine genome sequence to identify predicted and known proteins using proteomics and model these proteins into biological pathways, identifying 582 proteins in normal cell-free equine bronchoalveolar lavage fluid (BALF). We improved structural and functional annotation by directly confirming the in vivo expression of 558 (96%) proteins, which were computationally predicted previously, and adding Gene Ontology (GO) annotations for 174 proteins, 108 of which lacked functional annotation. Bronchoalveolar lavage is commonly used to investigate equine respiratory disease, leading us to model the associated proteome and its biological functions. Modelling of protein functions using Ingenuity Pathway Analysis identified carbohydrate metabolism, cell-to-cell signalling, cellular function, inflammatory response, organ morphology, lipid metabolism and cellular movement as key biological processes in normal equine BALF. Comparative modelling of protein functions in normal cell-free bronchoalveolar lavage proteomes from horse, human, and mouse, performed by grouping GO terms sharing common ancestor terms, confirms conservation of functions across species. Ninety-one of 92 human GO categories and 105 of 109 mouse GO categories were conserved in the horse. Our approach confirms the utility of the equine genome sequence to characterize protein networks without antibodies or mRNA quantification, highlights the need for continued structural and functional annotation of the equine genome and provides a framework for equine researchers to aid in the annotation effort.
