Similar literature
 Found 20 similar articles (search time: 46 ms)
1.
The ascomycetous yeast Wickerhamomyces anomalus (formerly Pichia anomala and Hansenula anomala) exhibits antimicrobial activities and flavoring features that are responsible for its frequent association with food, beverage and feed products. However, limited information on the genetic background of this yeast and its multiple capabilities is currently available. Here, we present the draft genome sequence of the neotype strain W. anomalus DSM 6766. On the basis of pyrosequencing, a de novo assembly of this strain resulted in a draft genome sequence with a total size of 25.47 Mbp. An automatic annotation using RAPYD generated 11,512 protein-coding sequences. This annotation provided the basis to analyse metabolic capabilities, phylogenetic relationships, as well as biotechnologically important features and yielded novel candidate genes of W. anomalus DSM 6766 coding for proteins participating in antimicrobial activities.

2.

Background  

A necessary step for a genome-level analysis of cellular metabolism is the in silico reconstruction of the metabolic network from genome sequences. The available methods are mainly based on the annotation of genome sequences, which comprises two successive steps: the prediction of coding sequences (CDS) and the assignment of their functions. This annotation process is time-consuming, and the available methods often encounter difficulties when dealing with unfinished, error-containing genomic sequences.

3.

Background

Complete genome annotation is a necessary tool as Anopheles gambiae researchers probe the biology of this potent malaria vector.

Results

We reannotate the A. gambiae genome by synthesizing comparative and ab initio sets of predicted coding sequences (CDSs) into a single set using an exon-gene-union algorithm followed by an open-reading-frame-selection algorithm. The reannotation predicts 20,970 CDSs supported by at least two lines of evidence, and it lowers the proportion of CDSs lacking start and/or stop codons to only approximately 4%. The reannotated CDS set includes a set of 4,681 novel CDSs not represented in the Ensembl annotation but with EST support, and another set of 4,031 Ensembl-supported genes that undergo major structural and, therefore, probably functional changes in the reannotated set. The quality and accuracy of the reannotation was assessed by comparison with end sequences from 20,249 full-length cDNA clones, and evaluation of mass spectrometry peptide hit rates from an A. gambiae shotgun proteomic dataset confirms that the reannotated CDSs offer a high quality protein database for proteomics. We provide a functional proteomics annotation, ReAnoXcel, obtained by analysis of the new CDSs through the AnoXcel pipeline, which allows functional comparisons of the CDS sets within the same bioinformatic platform. CDS data are available for download.

Conclusion

Comprehensive A. gambiae genome reannotation is achieved through a combination of comparative and ab initio gene prediction algorithms.
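The abstract above does not describe how its exon-gene-union algorithm works. As a loose illustration of the general idea of combining predictions from several sources, the following minimal sketch merges overlapping exon intervals from two hypothetical prediction sets. The function name `union_exons`, the inclusive `(start, end)` coordinate convention and the toy coordinates are all assumptions made for illustration; this is not the authors' method.

```python
def union_exons(*prediction_sets):
    """Merge overlapping (start, end) exon intervals from several prediction sets.

    Coordinates are treated as inclusive; intervals that overlap or share an
    endpoint are fused into one. Illustrative sketch only.
    """
    intervals = sorted(iv for s in prediction_sets for iv in s)
    merged = []
    for start, end in intervals:
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend it if needed.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical toy predictions from a comparative and an ab initio predictor.
comparative = [(100, 200), (300, 400)]
ab_initio = [(150, 250), (500, 600)]
merged_exons = union_exons(comparative, ab_initio)
```

A real pipeline would also need to track strand, reading frame and supporting evidence, and would follow the union with a step that selects a consistent open reading frame, as the abstract indicates.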

4.
5.
Sequencing microbial genomes is important because microbes carry antibiotic and pathogenic activities. However, even with the help of new assembly software, finishing a whole genome is a time-consuming task. In most bacteria, pathogenic or antibiotic genes are carried in genomic islands. Therefore, a quick genomic island (GI) prediction method is useful for genomes whose sequencing is still in progress. In this work, we built a Web server called GI-POP (http://gipop.life.nthu.edu.tw) which integrates a sequence assembly tool, a functional annotation pipeline, and a high-performance GI prediction module based on a support vector machine (SVM) method called genomic island genomic profile scanning (GI-GPS). Draft genomes from ongoing genome projects, in contigs or scaffolds, can be submitted to our Web server, which returns the functional annotation and high-probability GI predictions. GI-POP is a comprehensive annotation Web server designed for ongoing genome project analysis. Researchers can perform annotation and obtain pre-analytic information, including possible GIs, coding/non-coding sequences and functional analysis, from their draft genomes. This pre-analytic system can provide useful information for finishing a genome sequencing project.

6.
The review considers the computational prediction of functionally related proteins by comparative genomics. The growing capabilities of genome sequencing technology are generating sequences for millions of genes. However, the functions of the majority of these genes remain unknown, and can be determined experimentally for only a few of them. Therefore, accurate and robust methods for in silico prediction (annotation) of gene functions are needed. We describe here the main techniques of comparative genomics, including the standard method based on transferring functions between homologous sequences and also context-based methods, including phylogenetic profiles and gene-neighbor approaches. Modern methods of comparative genomics allow correct functional annotations to be obtained for more than half of all proteins in an organism.
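The phylogenetic-profile approach mentioned in this abstract can be illustrated with a minimal sketch: each gene is represented by its presence or absence across a set of genomes, and genes with similar profiles are proposed as functionally linked. The gene names, the toy profiles, the Jaccard similarity measure and the 0.75 threshold below are assumptions chosen for illustration, not details taken from the review.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two presence/absence profiles (tuples of 0/1)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def linked_pairs(profiles, threshold=0.75):
    """Return (gene1, gene2, score) for pairs whose profiles reach the threshold."""
    pairs = []
    for g1, g2 in combinations(sorted(profiles), 2):
        score = jaccard(profiles[g1], profiles[g2])
        if score >= threshold:
            pairs.append((g1, g2, score))
    return pairs


# Hypothetical toy data: presence (1) or absence (0) of each gene in five genomes.
profiles = {
    "geneA": (1, 1, 0, 1, 0),
    "geneB": (1, 1, 0, 1, 0),
    "geneC": (0, 0, 1, 0, 1),
    "geneD": (1, 1, 0, 1, 1),
}
candidates = linked_pairs(profiles)
```

In this toy data, geneA and geneB co-occur in exactly the same genomes and would be the strongest candidates for a functional link. Practical implementations typically use many more genomes and a statistical score that corrects for phylogenetic relatedness.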

7.
Motivation: There is an imperative need to integrate functional genomics data to obtain a more comprehensive systems-biology view of the results. We believe that this is best achieved through the visualization of data within the biological context of metabolic pathways. Accordingly, metabolic pathway reconstruction was used to predict the metabolic composition for Medicago truncatula and these pathways were engineered to enable the correlated visualization of integrated functional genomics data. Results: Metabolic pathway reconstruction was used to generate a pathway database for M. truncatula (MedicCyc), which currently features more than 250 pathways with related genes, enzymes and metabolites. MedicCyc was assembled from more than 225,000 M. truncatula ESTs (MtGI Release 8.0) and available genomic sequences using the Pathway Tools software and the MetaCyc database. The predicted pathways in MedicCyc were verified through comparison with other plant databases such as AraCyc and RiceCyc. The comparison with other plant databases provided crucial information concerning enzymes still missing from the ongoing, but currently incomplete, M. truncatula genome sequencing project. MedicCyc was further manually curated to remove non-plant pathways, and Medicago-specific pathways including isoflavonoid, lignin and triterpene saponin biosynthesis were modified or added based upon available literature and in-house expertise. Additional metabolites identified in metabolic profiling experiments were also used for pathway predictions. Once the metabolic reconstruction was completed, MedicCyc was engineered to visualize M. truncatula functional genomics datasets within the biological context of metabolic pathways. Availability: freely accessible at http://www.noble.org/MedicCyc/

8.
9.
10.
The accurate prediction of higher eukaryotic gene structures and regulatory elements directly from genomic sequences is an important early step in the understanding of newly assembled contigs and finished genomes. As more new genomes are sequenced, comparative approaches are becoming increasingly practical and valuable for predicting genes and regulatory elements. We demonstrate the effectiveness of a comparative method called pattern filtering; it utilizes synteny between two or more genomic segments for the annotation of genomic sequences. Pattern filtering optimally detects the signatures of conserved functional elements despite the stochastic noise inherent in evolutionary processes, allowing more accurate annotation of gene models. We anticipate that pattern filtering will facilitate sequence annotation and the discovery of new functional elements by the genetics and genomics communities.

11.
Genomics now facilitates the identification of microbial virulence factors, a primary objective for antimicrobial drug and vaccine design. Many putative proteins that could act as potent drug targets are yet to be identified. Methods that appropriately combine several omics approaches for identifying putative and new drug targets are lacking or limited. The study emphasizes a combined bioinformatic and theoretical method of screening unique and putative drug targets that lack similarity with experimentally reported essential genes and drug targets. A synteny-based comparison was carried out with 11 streptococci, using S. gordonii as the reference genome. It revealed 534 non-homologous genes, of which 334 were putative. Similarity searches against the host proteome, metabolic pathway annotation and subcellular localization prediction identified 16 potent drug targets. This is a first attempt to combine several similarity-search approaches with target protein structural features for screening drug targets, yielding a pipeline that can be extended to other human pathogens.

12.

Background

The C. elegans genome has been extensively annotated by the WormBase consortium, which uses state-of-the-art bioinformatics pipelines, functional genomics and manual curation approaches. As a result, the in silico identification of novel genes in this model organism is becoming more challenging and requires new approaches. The oligonucleotide-oligosaccharide binding (OB) fold is a highly divergent protein family in which protein sequences, in spite of having the same fold, share very little sequence identity (5–25%). Therefore, evidence from sequence-based annotation may not be sufficient to identify all the members of this family. In C. elegans, the number of OB-fold proteins reported is remarkably low (n = 46) compared to other evolutionarily related eukaryotes, such as the yeast S. cerevisiae (n = 344) or the fruit fly D. melanogaster (n = 84). Gene loss during evolution or differences in the level of annotation for this protein family may explain these discrepancies.

Methodology/Principal Findings

This study examines the possibility that novel OB-fold coding genes exist in the worm. We developed a bioinformatics approach that uses the most sensitive sequence-sequence, sequence-profile and profile-profile similarity search methods, followed by 3D-structure prediction as a filtering step to eliminate false-positive candidate sequences. We have predicted 18 coding genes containing the OB-fold that, remarkably, have been only partially characterized in C. elegans.

Conclusions/Significance

This study raises the possibility that the annotation of highly divergent protein fold families can be improved in C. elegans. Similar strategies could be implemented for large-scale analysis by the WormBase consortium when new versions of the genome sequence of C. elegans, or of other evolutionarily related species, are released. This approach is of general interest to the scientific community since it can be used to annotate any genome.

13.
The chicken genome has been sequenced and this, together with microarray and other functional genomics technologies, makes post-genomic research possible in the chicken. At this time, however, such research is hindered by a lack of genomic structural and functional annotations. Bio-ontologies have been developed for different annotation requirements, as well as to facilitate data sharing and computational analysis, but these are not yet optimally utilized in the chicken. Here we discuss genomic annotation and bio-ontologies. We focus specifically on the Gene Ontology (GO), chicken GO annotations and how these can facilitate functional genomics in the chicken. The GO is the most developed and widely used bio-ontology. It is the de facto standard for functional annotation. Despite its critical importance in analyzing microarray and other functional genomics data, relatively few chicken gene products have any GO annotation. When these are available, the average quality of chicken gene product annotations (defined using evidence code weight and annotation depth) is much lower than in mouse. Moreover, tools allowing chicken researchers to easily and rapidly use the GO are either lacking or hard to use. To address all of these problems we developed ChickGO and AgBase. Chicken GO annotations are provided by complementary work at MSU-AgBase and EBI-GOA. The GO tools pipeline at AgBase uses GO to derive functional and biological significance from microarray and other functional genomics data. Improved genomic annotation and tools to use these annotations will not only benefit the chicken research community but will also facilitate research in other avian species and comparative genomics.

14.
The Institute for Genome Sciences (IGS) has developed a prokaryotic annotation pipeline that is used for coding gene/RNA prediction and functional annotation of Bacteria and Archaea. The fully automated pipeline accepts one or many genomic sequences as input and produces output in a variety of standard formats. Functional annotation is primarily based on similarity searches and motif finding combined with a hierarchical rule-based annotation system. The output annotations can also be loaded into a relational database and accessed through visualization tools.

15.
16.
The characterization and public release of genome sequences from thousands of organisms is expanding the scope for genetic variation studies. However, understanding the phenotypic consequences of genetic variation remains a challenge in eukaryotes due to the complexity of the genotype-phenotype map. One approach to this is the intensive study of model systems for which diverse sources of information can be accumulated and integrated. Saccharomyces cerevisiae is an extensively studied model organism, with well-known protein functions and thoroughly curated phenotype data. To develop and expand the available resources linking genomic variation with function in yeast, we aim to model the pan-genome of S. cerevisiae. To initiate the yeast pan-genome, we newly sequenced or re-sequenced the genomes of 25 strains that are commonly used in the yeast research community using advanced sequencing technology at high quality. We also developed a pipeline for automated pan-genome analysis, which integrates the steps of assembly, annotation, and variation calling. To assign strain-specific functional annotations, we identified genes that were not present in the reference genome. We classified these according to their presence or absence across strains and characterized each group of genes with known functional and phenotypic features. The functional roles of novel genes not found in the reference genome and associated with strains or groups of strains appear to be consistent with anticipated adaptations in specific lineages. As more S. cerevisiae strain genomes are released, our analysis can be used to collate genome data and relate it to lineage-specific patterns of genome evolution. Our new tool set will enhance our understanding of genomic and functional evolution in S. cerevisiae, and will be available to the yeast genetics and molecular biology community.

17.
18.
19.

Background  

In silico analysis has shown that all bacterial genomes contain a low percentage of ORFs with undetected frameshifts and in-frame stop codons. These interrupted coding sequences (ICDSs) may really be present in the organism or may result from misannotation based on sequencing errors. The reality or otherwise of these sequences has major implications for all subsequent functional characterization steps, including module prediction, comparative genomics and high-throughput proteomic projects.
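To make the notion of an interrupted coding sequence concrete, the following minimal sketch flags an annotated CDS that contains an in-frame stop codon before its final codon, or whose length is not a multiple of three. The function names are hypothetical, the stop codons are those of the standard genetic code, and the length check is only a crude proxy for a frameshift; the abstract does not describe the detection method actually used, which would more plausibly rely on comparison with homologous sequences.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def internal_stops(cds):
    """Return 0-based codon indices of stop codons located before the final codon."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    return [i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]


def looks_interrupted(cds):
    """Flag a CDS with an internal stop or a length that is not a multiple of 3."""
    return len(cds) % 3 != 0 or bool(internal_stops(cds))


# Hypothetical toy sequences.
intact = "ATGAAATAA"         # ATG AAA TAA: stop only at the end
with_stop = "ATGTAGAAATAA"   # ATG TAG AAA TAA: premature stop at codon 1
```

A sequence flagged this way would still need to be checked against the raw reads or resequenced to decide whether the interruption is real or a sequencing error.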

20.
The assembly of large recombinant DNA encoding a whole biochemical pathway or genome represents a significant challenge. Here, we report a new method, DNA assembler, which allows the assembly of an entire biochemical pathway in a single step via in vivo homologous recombination in Saccharomyces cerevisiae. We show that DNA assembler can rapidly assemble a functional d-xylose utilization pathway (∼9 kb DNA consisting of three genes), a functional zeaxanthin biosynthesis pathway (∼11 kb DNA consisting of five genes) and a functional combined d-xylose utilization and zeaxanthin biosynthesis pathway (∼19 kb consisting of eight genes) with high efficiencies (70–100%) either on a plasmid or on a yeast chromosome. As this new method only requires simple DNA preparation and one-step yeast transformation, it represents a powerful tool in the construction of biochemical pathways for synthetic biology, metabolic engineering and functional genomics studies.
