Found 20 similar articles (search time: 0 ms)
1.
2.
We examine the translated open reading frames (ORFs) of the yeast Saccharomyces cerevisiae, focusing on those that have FASTA matches in phyletically defined sets of completely sequenced genomes. On this basis, we identify archaeal yeast, bacterial yeast, universal yeast, and yeast ORFs that do not have a match in any of nine prokaryote genomes. Similarly, we examine the yeast mitochondrial genome and the subset of the yeast nuclear ORFs identified as being involved in mitochondrial biogenesis. For the yeast ORFs that match one or more ORFs in these prokaryote genomes, we examine the phyletic and functional distributions of these matches as a function of match strength. These results provide genome-level insights into the origin of the eukaryotic cell and the origin of mitochondria. More generally, they exemplify how the growing database of prokaryote genome sequences can help us understand eukaryote genomes.
3.
4.
Martin Hunt Taisei Kikuchi Mandy Sanders Chris Newbold Matthew Berriman Thomas D Otto 《Genome biology》2013,14(5):R47
Methods to reliably assess the accuracy of genome sequence data are lacking. Currently completeness is only described qualitatively and mis-assemblies are overlooked. Here we present REAPR, a tool that precisely identifies errors in genome assemblies without the need for a reference sequence. We have validated REAPR on complete genomes or de novo assemblies from bacteria, malaria and Caenorhabditis elegans, and demonstrate that 86% and 82% of the human and mouse reference genomes are error-free, respectively. When applied to an ongoing genome project, REAPR provides corrected assembly statistics allowing the quantitative comparison of multiple assemblies. REAPR is available at http://www.sanger.ac.uk/resources/software/reapr/.
5.
Prophage loci often remain under-annotated or even unrecognized in prokaryotic genome sequencing projects. A PHP application, Prophage Finder, has been developed and implemented to predict prophage loci, based upon clusters of phage-related gene products encoded within DNA sequences. This application provides results detailing several facets of these clusters to facilitate rapid prediction and analysis of prophage sequences. Prophage Finder was tested using previously annotated prokaryotic genomic sequences with manually curated prophage loci as benchmarks. Additional analyses from Prophage Finder searches of several draft prokaryotic genome sequences are available through the Web site (http://bioinformatics.uwp.edu/~phage/DOEResults.php) to illustrate the potential of this application.
6.
One of the most complex and computationally intensive tasks of genome sequence analysis is genome assembly. Even today, few centres have the resources, in both software and hardware, to assemble a genome from the thousands or millions of individual sequences generated in a whole-genome shotgun sequencing project. With the rapid growth in the number of sequenced genomes has come an increase in the number of organisms for which two or more closely related species have been sequenced. This has created the possibility of building a comparative genome assembly algorithm, which can assemble a newly sequenced genome by mapping it onto a reference genome. We describe here a novel algorithm for comparative genome assembly that can accurately assemble a typical bacterial genome in less than four minutes on a standard desktop computer. The software is available as part of the open-source AMOS project.
7.
8.
Assembly algorithms have been extensively benchmarked using simulated data so that results can be compared to ground truth. However, in de novo assembly, only crude metrics such as contig number and size are typically used to evaluate assembly quality. We present CGAL, a novel likelihood-based approach to assembly assessment in the absence of a ground truth. We show that likelihood is more accurate than other metrics currently used for evaluating assemblies, and describe its application to the optimization and comparison of assembly algorithms. Our methods are implemented in software that is freely available at http://bio.math.berkeley.edu/cgal/.
9.
V V Sukhodolets 《Genetika》1992,28(1):28-37
The peculiarities of bacterial chromosome organization are discussed, based mainly on data from Escherichia coli. Highly important for bacterial genome organization is its division into two approximately equal half-genomes that periodically undergo "exchanges" of some kind, manifested as continuous inversions that include the oriC region of replication initiation. It is believed that short oligonucleotides are contained in either of the half-genomes. These oligonucleotides are predominantly oriented as direct repeats, which makes possible the formation of tandem duplications consisting of identical genes under conditions of selection for enhanced function of the corresponding genes. Multiple tandem duplications capable of excision of plasmatic gene copies seem to initiate horizontal gene transfer in bacteria. Tandem gene duplications probably also form during bacterial genetic recombination, when, as a result of unequal crossing over, gene alleles derived from different strains are united into a tandem.
10.
The growing number of complete sequencing projects based on the next-generation sequencing (NGS) platforms necessitates quality evaluation. Therefore, the use of guaranteed measures such as N50, N80 and average size of contigs etc. to evaluate the quality of genome assemblies produced by ab initio methods remains vital. Herein, we prove that various treatment qualities and their influence on the whole genome products must be considered in genome assembly quality measurements.
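The contig metrics named in this abstract (N50, N80, average contig size) can be illustrated with a minimal sketch. It assumes the common definition of Nx as the length of the shortest contig in the smallest set of longest contigs that together cover at least x% of the total assembly length; the function name `nx` and the contig lengths are hypothetical and do not come from the cited article.

```python
def nx(lengths, x=50):
    """Return the Nx statistic for a list of contig lengths.

    Assumed definition: sort contigs from longest to shortest and
    return the length of the contig at which the running total first
    reaches x percent of the total assembly length.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    threshold = sum(lengths) * x / 100
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length


# Hypothetical contig lengths (arbitrary units), for illustration only.
contigs = [80, 70, 50, 40, 30, 20, 10]
n50 = nx(contigs, 50)
n80 = nx(contigs, 80)
mean_size = sum(contigs) / len(contigs)
print(n50, n80, round(mean_size, 2))
```

With these values the total length is 300, so N50 is reached at the second-longest contig and N80 at the fourth-longest. Note that all three numbers describe contiguity only and say nothing about whether the contigs are correct.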
11.
MreB, a major component of the bacterial cytoskeleton, exhibits high structural homology to its eukaryotic counterpart actin. Live cell microscopy studies suggest that MreB molecules organize into large filamentous spirals that support the cell membrane and play a key shape-determining function. However, the basic properties of MreB filament assembly remain unknown. Here, we studied the assembly of Thermotoga maritima MreB triggered by ATP in vitro and compared it to the well-studied assembly of actin. These studies show that MreB filament ultrastructure and polymerization depend crucially on temperature as well as the ions present in solution. At the optimal growth temperature of T. maritima, MreB assembly proceeded much faster than that of actin, without nucleation (or nucleation is highly favorable and fast) and with little or no contribution from filament end-to-end annealing. MreB exhibited rates of ATP hydrolysis and phosphate release similar to those of F-actin, however, with a critical concentration of approximately 3 nM, which is approximately 100-fold lower than that of actin. Furthermore, MreB assembled into filamentous bundles that have the ability to spontaneously form ring-like structures without auxiliary proteins. These findings suggest that despite high structural homology, MreB and actin display significantly different assembly properties.
12.
Background
Next generation sequencing technology has allowed efficient production of draft genomes for many organisms of interest. However, most draft genomes are just collections of independent contigs, whose relative positions and orientations along the genome being sequenced are unknown. Although several tools have been developed to order and orient the contigs of draft genomes, more accurate tools are still needed.
Results
In this study, we present a novel reference-based contig assembly (or scaffolding) tool, named as CAR, that can efficiently and more accurately order and orient the contigs of a prokaryotic draft genome based on a reference genome of a related organism. Given a set of contigs in multi-FASTA format and a reference genome in FASTA format, CAR can output a list of scaffolds, each of which is a set of ordered and oriented contigs. For validation, we have tested CAR on a real dataset composed of several prokaryotic genomes and also compared its performance with several other reference-based contig assembly tools. Consequently, our experimental results have shown that CAR indeed performs better than all these other reference-based contig assembly tools in terms of sensitivity, precision and genome coverage.
Conclusions
CAR serves as an efficient tool that can more accurately order and orient the contigs of a prokaryotic draft genome based on a reference genome. The web server of CAR is freely available at http://genome.cs.nthu.edu.tw/CAR/ and its stand-alone program can also be downloaded from the same website.
Electronic supplementary material
The online version of this article (doi:10.1186/s12859-014-0381-3) contains supplementary material, which is available to authorized users.
13.
Exhaustive gene identification is a fundamental goal in all metagenomics projects. However, most metagenomic sequences are unassembled anonymous fragments, and conventional gene-finding methods cannot be applied. We have developed a prokaryotic gene-finding program, MetaGene, which utilizes di-codon frequencies estimated by the GC content of a given sequence with other various measures. MetaGene can predict a whole range of prokaryotic genes based on the anonymous genomic sequences of a few hundred bases, with a sensitivity of 95% and a specificity of 90% for artificial shotgun sequences (700 bp fragments from 12 species). MetaGene has two sets of codon frequency interpolations, one for bacteria and one for archaea, and automatically selects the proper set for a given sequence using the domain classification method we propose. The domain classification works properly, correctly assigning domain information to more than 90% of the artificial shotgun sequences. Applied to the Sargasso Sea dataset, MetaGene predicted almost all of the annotated genes and a notable number of novel genes. MetaGene can be applied to a wide variety of metagenomic projects and expands the utility of metagenomics.
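The two quantities this abstract builds on, GC content and di-codon (hexamer) frequencies, can be computed as in the sketch below. This is only an illustration of the raw measurements and is not MetaGene's model: the abstract does not specify how frequencies are interpolated from GC content, so that step is omitted. The function names and the example sequence are hypothetical.

```python
from collections import Counter


def gc_content(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return sum(1 for base in seq if base in "GC") / len(seq)


def dicodon_counts(seq, frame=0):
    """Count in-frame di-codons (two adjacent codons, 6 bases).

    The window advances one codon (3 bases) at a time, starting at
    the given reading-frame offset, so consecutive windows overlap
    by one codon.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(frame, len(seq) - 5, 3):
        counts[seq[i:i + 6]] += 1
    return counts


# Hypothetical 12-base fragment, for illustration only.
fragment = "ATGGCCATGGCC"
print(round(gc_content(fragment), 3))
print(dicodon_counts(fragment))
```

On a real fragment of a few hundred bases, the counts would be normalized to frequencies and compared across all six reading frames; which frame and scoring scheme to use is a modelling decision outside this sketch.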
14.
We describe a new assembly algorithm, where a genome assembly with low sequence coverage, either throughout the genome or locally, due to cloning bias, is considerably improved through an assisting process via a related genome. We show that the information provided by aligning the whole-genome shotgun reads of the target against a reference genome can be used to substantially improve the quality of the resulting assembly.
15.
Although many bacteria with two chromosomes have been sequenced, the roles of such complex genome structuring are still unclear. To uncover levels of chromosome I (CI) and chromosome II (CII) sequence divergence, Mauve 2.2.0 was used to align the CI- and CII-specific sequences of bacteria with complex genome structuring in two sets of comparisons: the first set was conducted among the CI and CII of bacterial strains of the same species, while the second set was conducted among the CI and CII of species in Alphaproteobacteria that possess two chromosomes. The analyses revealed a rapid evolution of CII-specific DNA sequences compared with CI-specific sequences in a majority of organisms. In addition, levels of protein divergence between CI-specific and CII-specific genes were determined using phylogenetic analyses and confirmed the DNA alignment findings. Analysis of synonymous and nonsynonymous substitutions revealed that the structural and functional constraints on CI and CII genes are not significantly different. Also, horizontal gene transfer estimates in selected organisms demonstrated that CII in many species has acquired higher levels of horizontally transferred segments than CI. In summary, rapid evolution of CII may perform particular roles for organisms such as aiding in adapting to specialized niches.
16.
17.
Yanting Shen Jing Liu Haiying Geng Jixiang Zhang Yucheng Liu Haikuan Zhang Shilai Xing Jianchang Du Shisong Ma Zhixi Tian 《中国科学:生命科学英文版》2018,(8)
Soybean was domesticated in China and has become one of the most important oilseed crops. Due to bottlenecks in their introduction and dissemination, soybeans from different geographic areas exhibit extensive genetic diversity. Asia is the largest soybean market; therefore, a high-quality soybean reference genome from this area is critical for soybean research and breeding. Here, we report the de novo assembly and sequence analysis of a Chinese soybean genome for "Zhonghuang 13" by a combination of SMRT, Hi-C and optical mapping data. The assembled genome size is 1.025 Gb with a contig N50 of 3.46 Mb and a scaffold N50 of 51.87 Mb. Comparisons between this genome and the previously reported reference genome (cv. Williams82) uncovered more than 250,000 structural variations. A total of 52,051 protein-coding genes and 36,429 transposable elements were annotated for this genome, and a gene co-expression network including 39,967 genes was also established. This high-quality Chinese soybean genome and its sequence analysis will provide valuable information for soybean improvement in the future.
18.
Huang X Yang SP Chinwalla AT Hillier LW Minx P Mardis ER Wilson RK 《Nucleic acids research》2006,34(1):201-205
We introduce a data structure called a superword array for quickly finding matches between DNA sequences. The superword array possesses some desirable features of the lookup table and suffix array. We describe simple algorithms for constructing and using a superword array to find pairs of sequences that share a unique superword. The algorithms are implemented in a genome assembly program called PCAP.REP for computation of overlaps between reads. Experimental results produced by PCAP.REP and PCAP on a whole-genome dataset show that PCAP.REP produced a more accurate and contiguous assembly than PCAP.
19.
CONSORF is a fully automatic high-accuracy identification system that provides consensus prokaryotic CDS information. It first predicts the CDSs supported by consensus alignments. The alignments are derived from multiple genome-to-proteome comparisons with other prokaryotes using the FASTX program. Then, it fills the empty genomic regions with the CDSs supported by consensus ab initio predictions. From those consensus results, CONSORF provides prediction reliability scores, predicted frame-shifts, alternative start sites and best pair-wise match information against other prokaryotes. These results are easily accessed from a website.
20.
PeerGAD: a peer-review-based and community-centric web application for viewing and annotating prokaryotic genome sequences
PeerGAD is a web-based database-driven application that allows community-wide peer-reviewed annotation of prokaryotic genome sequences. The application was developed to support the annotation of the Pseudomonas syringae pv. tomato strain DC3000 genome sequence and is easily portable to other genome sequence annotation projects. PeerGAD incorporates several innovative design and operation features and accepts annotations pertaining to gene naming, role classification, gene translation and annotation derivation. The annotator tool in PeerGAD is built around a genome browser that offers users the ability to search and navigate the genome sequence. Because the application encourages annotation of the genome sequence directly by researchers and relies on peer review, it circumvents the need for an annotation curator while providing added value to the annotation data. Support for the Gene Ontology vocabulary, a structured and controlled vocabulary used in classification of gene roles, is emphasized throughout the system. Here we present the underlying concepts integral to the functionality of PeerGAD.