Similar articles
1.
It has been more than 10 years since the first bacterial genome sequence was published. Hundreds of bacterial genome sequences are now available for comparative genomics, and searching a given protein against more than a thousand genomes will soon be possible. This review addresses a relatively straightforward question: “What have we learned from this vast amount of new genomic data?” Perhaps one of the most important lessons has been that genetic diversity, at the level of large-scale variation amongst even genomes of the same species, is far greater than was thought. The classical textbook view of evolution relying on the relatively slow accumulation of mutational events at the level of individual bases scattered throughout the genome has changed. One of the most obvious conclusions from examining the sequences from several hundred bacterial genomes is the enormous amount of diversity, even in different genomes from the same bacterial species. This diversity is generated by a variety of mechanisms, including mobile genetic elements and bacteriophages. An examination of the 20 Escherichia coli genomes sequenced so far dramatically illustrates this, with the genome size ranging from 4.6 to 5.5 Mbp; much of the variation appears to be of phage origin. This review also addresses mobile genetic elements, including pathogenicity islands and the structure of transposable elements. There are at least 20 different methods available to compare bacterial genomes. Metagenomics offers the chance to study genomic sequences found in ecosystems, including genomes of species that are difficult to culture. It has become clear that a genome sequence represents more than just a collection of gene sequences for an organism and that information concerning the environment and growth conditions for the organism is important for interpretation of the genomic data. The newly proposed Minimal Information about a Genome Sequence standard has been developed to obtain this information.

2.
COMPAM is a tool for visualizing relationships among multiple whole genomes by combining all pairwise genome alignments. It displays shared conserved regions (blocks) and where these blocks occur (edges) as block relation graphs, which can be explored interactively. An unannotated genome, for example, can then be explored using information from well-annotated genomes, COG-based genome annotation and genes. COMPAM can run either as a stand-alone application or through an applet that is provided as a service to PLATCOM, a toolset for whole genome comparative analysis, where a wide variety of genomes can be easily selected. Features provided by COMPAM include the ability to export genome relationship information into file formats that can be used by other existing tools. AVAILABILITY: http://bio.informatics.indiana.edu/projects/compam/

3.
Functional and structural genomics using PEDANT   (total citations: 11; self-citations: 0; citations by others: 11)
MOTIVATION: Enormous demand for fast and accurate analysis of biological sequences is fuelled by the pace of genome analysis efforts. There is also an acute need for reliable up-to-date genomic databases integrating both functional and structural information. Here we describe the current status of the PEDANT software system for high-throughput analysis of large biological sequence sets and the genome analysis server associated with it. RESULTS: The principal features of PEDANT are: (i) completely automatic processing of data using a wide range of bioinformatics methods, (ii) manual refinement of annotation, (iii) automatic and manual assignment of gene products to a number of functional and structural categories, (iv) extensive hyperlinked protein reports, and (v) advanced DNA and protein viewers. The system is easily extensible and allows custom methods, databases, and categories to be included with minimal or no programming effort. PEDANT is actively used as a collaborative environment to support several on-going genome sequencing projects. The main purpose of the PEDANT genome database is to quickly disseminate well-organized information on completely sequenced and unfinished genomes. It currently includes 80 genomic sequences and in many cases serves as the only source of exhaustive information on a given genome. The database also acts as a vehicle for a number of research projects in bioinformatics. Using SQL queries, it is possible to correlate a large variety of pre-computed properties of gene products encoded in complete genomes with each other and compare them with data sets of special scientific interest. In particular, the availability of structural predictions for over 300 000 genomic proteins makes PEDANT the most extensive structural genomics resource available on the web.

4.
SUMMARY: GenColors is a new web-based software/database system aimed at improved and accelerated annotation of prokaryotic genomes, considering information on related genomes and making extensive use of genome comparison. It offers a seamless integration of data from ongoing sequencing projects and annotated genomic sequences obtained from GenBank. The genome comparison tools determine, for example, best-bidirectional hits, gene conservation, syntenies and gene core sets. Swiss-Prot/TrEMBL hits allow annotations in an effective manner. To further support annotation, base-specific quality data can also be displayed if available. With GenColors, dedicated genome browsers containing a group of related genomes can be easily set up and maintained. It has been efficiently used for Borrelia garinii and is currently applied to various ongoing genome projects. AVAILABILITY: Detailed information on GenColors is available at http://gencolors.imb-jena.de. Online usage of GenColors-based genome browsers is the preferred application mode. The system is also available upon request for local installation.
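The notion of best-bidirectional hits mentioned in this abstract can be illustrated with a minimal sketch. This is not GenColors code: the function name, the input layout (one dictionary of best hits per search direction) and the gene identifiers are all hypothetical, and the sketch assumes the best hit for each gene has already been chosen by some similarity search.

```python
def best_bidirectional_hits(best_a_to_b, best_b_to_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a.

    Both arguments map a gene identifier to the identifier of its single
    best hit in the other genome (hypothetical, precomputed input).
    """
    return sorted(
        (a, b)
        for a, b in best_a_to_b.items()
        if best_b_to_a.get(b) == a
    )


# Hypothetical best-hit tables for two genomes, A and B.
a_to_b = {"geneA1": "geneB1", "geneA2": "geneB2", "geneA3": "geneB2"}
b_to_a = {"geneB1": "geneA1", "geneB2": "geneA3"}

# geneA2 is dropped: its best hit geneB2 points back to geneA3 instead.
print(best_bidirectional_hits(a_to_b, b_to_a))
```

Requiring the hit to hold in both directions is what filters out one-way matches such as geneA2 in this toy input.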

5.
6.
Genome content analysis has been used as a source of phylogenetic information in large prokaryotic tree of life studies. Recently the sequencing of many eukaryotic genomes has allowed for the similar use of genome content analysis for these organisms too. In this communication we examine the utility of genome content analysis for recovering phylogenetic patterns in several eukaryotic groups. By constructing multiple matrices using different e-value cutoffs, we examine the effect of altering the e-value cutoff on five eukaryotic genome data sets. Our analysis indicates that the e-value cutoff that is used as a criterion in the construction of the genome content matrix is a critical factor in both the accuracy and information content of the analysis. Strikingly, genome content by itself is not a reliable or accurate source of characters for phylogenetic analysis of the taxa in the five data sets we analyzed. We discuss two problems, small genome attraction and genome duplications, as being involved in the rather poor performance of genome content data in recovering eukaryotic phylogeny.
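To make the role of the e-value cutoff concrete, the following sketch builds a presence/absence genome content matrix at two cutoffs. It is an illustration under stated assumptions, not the authors' pipeline: the function name, the taxon and gene-family labels, and the e-values are all invented, and the sketch assumes the best e-value per (genome, gene family) pair is already available.

```python
def genome_content_matrix(best_evalue, genomes, families, cutoff):
    """Build a presence/absence matrix: 1 if the family is detected at the cutoff.

    best_evalue maps (genome, family) to the best e-value observed for that
    family in that genome; missing pairs are treated as absent.
    """
    matrix = {}
    for genome in genomes:
        row = []
        for family in families:
            e = best_evalue.get((genome, family))
            row.append(1 if e is not None and e <= cutoff else 0)
        matrix[genome] = row
    return matrix


# Hypothetical best e-values for two taxa and two gene families.
best_evalue = {
    ("sp1", "famA"): 1e-50,
    ("sp1", "famB"): 1e-8,
    ("sp2", "famA"): 1e-30,
    ("sp2", "famB"): 1e-3,
}
genomes = ["sp1", "sp2"]
families = ["famA", "famB"]

# A stricter cutoff turns some "present" characters into "absent".
for cutoff in (1e-5, 1e-20):
    print(cutoff, genome_content_matrix(best_evalue, genomes, families, cutoff))
```

In this toy input the two taxa differ in famB at the looser cutoff but become identical at the stricter one, which is one way the choice of cutoff can change the information content of the matrix.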

7.
The Human Genome Project stimulated the development of efficient strategies and relevant hardware for complete genome sequencing. The comparative genomic approach extends the possibilities of using the sequencing data to identify new genes or conserved regulatory regions by means of nucleotide sequence alignment of the particular regions of the mouse and human genomes, or to trace the evolutionary events resulting in the genome structure of modern mammals. The review focuses on the use of new molecular cytogenetic methods along with computer-aided analysis of the genomes in vertebrates. Several factors hindering data analysis are considered. The currently available information on gene evolution rate inferred from comparative genomic data is presented. The origin and evolution of the genomes of several species are discussed.

9.
The study of genome size variation is important from a number of practical and theoretical perspectives. For example, the long-standing "C-value enigma" relating to the more than 200,000-fold range in eukaryotic genome sizes is best studied from a broad comparative standpoint. Genome size data are also required in detailed analyses of genome structure and evolution. The choice of future genome sequencing projects will be dependent on knowledge regarding the sizes of genomes to be sequenced, and so on. To date, genome size data have been acquired primarily by Feulgen microdensitometry or flow cytometry. Each has several advantages but also important limitations. In this review, we provide a practical guide to the new technique of Feulgen image analysis densitometry. The review is designed for those interested in genome size measurements but not extensively experienced in histochemistry, densitometry, or microscopy. Therefore, relevant historical and technical background information is included. For easy reference, we provide recipes for required reagents, guidelines for cell staining, and a checklist of steps for successful image analysis. We hope that the accuracy, rapidity, and cost-effectiveness of Feulgen image analysis demonstrated here will stimulate further surveys of genome sizes in a variety of taxa.

10.
The genetic effects of Pleistocene ice ages are approached by deduction from paleoenvironmental information, by induction from the genetic structure of populations and species, and by their combination to infer likely consequences. (1) Recent palaeoclimatic information indicates rapid global reversals and changes in ranges of species which would involve elimination with spreading from the edge. Leading edge colonization during a rapid expansion would be leptokurtic and lead to homozygosity and spatial assortment of genomes. In Europe and North America, ice age contractions were into southern refugia, which would promote genome reorganization. (2) The present-day genetic structure of species shows frequent geographic subdivision, with parapatric genomes, hybrid zones and suture zones. A survey of recent DNA phylogeographic information supports and extends earlier work. (3) The grasshopper Chorthippus parallelus is used to illustrate such data and processes. Its range in Europe is divided on DNA sequences into five parapatric races, with southern genomes showing greater haplotype diversity, probably due to southern mountain blocks acting as refugia and northern expansion reducing diversity. (4) Comparison with other recent studies shows a concordance of such phylogeographic data over Pleistocene time scales. (5) The role that ice age range changes may have played in changing adaptations is explored, including the limits of range, rapid change in new invasions and refugial differentiation in a variety of organisms. (6) The effects of these events in causing divergence and speciation are explored using Chorthippus as a paradigm. Repeated contraction and expansion would accumulate genome differences and adaptations, protected from mixing by hybrid zones, and such a composite mode of speciation could apply to many organisms.

11.
Pulsed-field gel electrophoresis (PFGE) in combination with infrequently cutting restriction enzymes was used to investigate the structure of the mitochondrial (mt) genome of the maize variety Black Mexican Sweet (BMS). The mt genome of this variety was found to resemble that of the closely related B37N variety, with one recombination and five insertion/deletion events being sufficient to account for the differences observed between the two genomes. The majority of the BMS genome is organized as a number of subgenomic chromosomes with circular restriction maps. Several large repeated sequences are found in the BMS mt genome, but not all appear to be in recombinational equilibrium. No molecules large enough to contain the entire mt genome were discernible using these techniques. The mapping approach described here provides a means of quickly analyzing the large and complex mt genomes of plants.

12.
MIPS: a database for protein sequences and complete genomes.   (total citations: 7; self-citations: 0; citations by others: 7)
The MIPS group [Munich Information Center for Protein Sequences of the German National Center for Environment and Health (GSF)] at the Max-Planck-Institute for Biochemistry, Martinsried near Munich, Germany, is involved in a number of data collection activities, including a comprehensive database of the yeast genome, a database reflecting the progress in sequencing the Arabidopsis thaliana genome, the systematic analysis of other small genomes and the collection of protein sequence data within the framework of the PIR-International Protein Sequence Database (described elsewhere in this volume). Through its WWW server (http://www.mips.biochem.mpg.de) MIPS provides access to a variety of generic databases, including a database of protein families as well as data generated automatically by the systematic application of sequence analysis algorithms. The yeast genome sequence and its related information were also compiled on CD-ROM to provide dynamic interactive access to the 16 chromosomes of the first eukaryotic genome unraveled.

13.
The rapidly emerging field of comparative genomics has yielded dramatic results. Comparative genome analysis has become feasible with the availability of a number of completely sequenced genomes. Comparison of complete genomes between organisms allows for global views on genome evolution, and the availability of many completely sequenced genomes increases the predictive power in deciphering the hidden information in genome design, function and evolution. Thus, comparison of human genes with genes from other genomes in a genomic landscape could help assign novel functions for un-annotated genes. Here, we discuss the recently used techniques for comparative genomics and their derived inferences in genome biology.

14.
With the ever-increasing amount of genomic data available, interest in generating biochemical pathways has grown tremendously. So far, mainly complete genomes have been used to reconstruct the biochemical pathways and their associated interactions. However, a large number of low coverage genomes, as well as other sources of partial genomic data, are currently available for many organisms. In order to be able to use incomplete data for metabolic reconstruction, the inherent properties of this procedure need to be investigated. In this short note, we describe the robustness and predictive power of metabolic reconstructions using partial information from Schizosaccharomyces pombe. We also discuss the implications of the results for reference genome projects as well as other large-scale sequencing data.

15.
Magnifying Genomes (MaGe) is a microbial genome annotation system based on a relational database containing information on bacterial genomes, as well as a web interface for carrying out genome annotation projects. Our system allows one to initiate the annotation of a genome at the early stage of the finishing phase. MaGe's main features are (i) integration of annotation data from bacterial genomes enhanced by a gene coding re-annotation process using accurate gene models, (ii) integration of results obtained with a wide range of bioinformatics methods, including exploration of gene context by searching for conserved synteny and reconstruction of metabolic pathways, and (iii) an advanced web interface allowing multiple users to refine the automatic assignment of gene product functions. MaGe is also linked to numerous well-known biological databases and systems. Our system has been thoroughly tested during the annotation of complete bacterial genomes (Acinetobacter baylyi ADP1, Pseudoalteromonas haloplanktis, Frankia alni) and is currently used in the context of several new microbial genome annotation projects. In addition, MaGe allows for annotation curation and exploration of already published genomes from various genera (e.g. Yersinia, Bacillus and Neisseria). MaGe can be accessed at http://www.genoscope.cns.fr/agc/mage.

16.
Next Generation Sequencing (NGS) is a disruptive technology that has found widespread acceptance in the life sciences research community. The high throughput and low cost of sequencing have encouraged researchers to undertake ambitious genomic projects, especially in de novo genome sequencing. Currently, NGS systems generate sequence data as short reads, and de novo genome assembly using these short reads is computationally very intensive. Due to lower cost of sequencing and higher throughput, NGS systems now provide the ability to sequence genomes at high depth. However, currently no report is available highlighting the impact of high sequence depth on genome assembly using real data sets and multiple assembly algorithms. Recently, some studies have evaluated the impact of sequence coverage, error rate and average read length on genome assembly using multiple assembly algorithms; however, these evaluations were performed using simulated datasets. One limitation of using simulated datasets is that variables such as error rates, read length and coverage, which are known to impact genome assembly, are carefully controlled. Hence, this study was undertaken to identify the minimum depth of sequencing required for de novo assembly for different sized genomes using graph based assembly algorithms and real datasets. Illumina reads for E. coli (4.6 Mb), S. kudriavzevii (11.18 Mb) and C. elegans (100 Mb) were assembled using SOAPdenovo, Velvet, ABySS, Meraculous and IDBA-UD. Our analysis shows that 50X is the optimum read depth for assembling these genomes using all assemblers except Meraculous, which requires 100X read depth. Moreover, our analysis shows that de novo assembly from 50X read data requires only 6–40 GB RAM depending on the genome size and assembly algorithm used. We believe that this information can be extremely valuable for researchers in designing experiments and multiplexing, which will enable optimum utilization of sequencing as well as analysis resources.
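As a rough illustration of what the read depths quoted in this abstract mean in practice, the sketch below applies the standard relation depth = (number of reads × read length) / genome size. The genome sizes and the 50X target come from the abstract; the 100 bp read length is an assumption made only for illustration, since the abstract does not state the read length used, and the function name is hypothetical.

```python
import math


def reads_needed(genome_size_bp, depth, read_length_bp):
    """Number of reads required to reach a target average sequencing depth."""
    return math.ceil(genome_size_bp * depth / read_length_bp)


# Genome sizes as quoted in the abstract, in base pairs.
genomes_bp = {
    "E. coli": 4_600_000,
    "S. kudriavzevii": 11_180_000,
    "C. elegans": 100_000_000,
}
READ_LENGTH_BP = 100  # assumed read length, not taken from the abstract
TARGET_DEPTH = 50     # the 50X depth reported as optimal for most assemblers

for name, size in genomes_bp.items():
    print(name, reads_needed(size, TARGET_DEPTH, READ_LENGTH_BP))
```

Under these assumptions the read count scales linearly with genome size, so the same calculation can be reused when planning how many samples to multiplex on one run.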

17.
The recently sequenced genome of the predatory delta-proteobacterium Bdellovibrio bacteriovorus provides many insights into its metabolism and evolution. Because its genes are reasonably uniform in G+C content, it was suggested that B. bacteriovorus actively resists recombination with foreign DNA and horizontal transfer of DNA from other bacteria. To investigate this further, we carried out a variety of phylogenetic and comparative genomics analyses using data from >200 microbial genomes, including several published delta-proteobacteria. Although there might be little evidence for the extensive recent transfer of genes, we demonstrate that ancient lateral gene acquisition has shaped the B. bacteriovorus genome to a great extent.

18.
Next-generation sequencing (NGS) technologies have made high-throughput sequencing available to medium- and small-size laboratories, culminating in a tidal wave of genomic information. The quantity of sequenced bacterial genomes has not only brought excitement to the field of genomics but also heightened expectations that NGS would boost antibacterial discovery and vaccine development. Although many possible drug and vaccine targets have been discovered, the success rate of genome-based analysis has remained below expectations. Furthermore, NGS has had consequences for genome quality, resulting in an exponential increase in draft (partial data) genome deposits in public databases. If no further interest is expressed in a particular bacterial genome, it is more likely that the sequencing of its genome will be limited to a draft stage, and the painstaking tasks of completing the sequencing of its genome and annotation will not be undertaken. It is important to know what is lost when we settle for a draft genome and to determine the "scientific value" of a newly sequenced genome. This review addresses the expected impact of newly sequenced genomes on antibacterial discovery and vaccinology. Also, it discusses the factors that could be leading to the increase in the number of draft deposits and the consequent loss of relevant biological information.

19.
Reference cDNA library facilities available from European sources   (total citations: 1; self-citations: 0; citations by others: 1)
cDNA libraries are the cornerstone of efforts to identify the relatively small regions of genomes that are responsible for biological effects. Gene hunters seeking candidate genes, via a variety of approaches, ultimately focus on the cloning, sequencing, and expression of cDNAs. Assistance is now available to researchers in the form of genome programs, whose initial goals include assembly of a complete collection of expressed sequences derived from the genome of interest. The concept of reference sets of cDNA libraries is that the aims of genome programs are served most effectively by different laboratories working on a common set of high-quality arrayed cDNA libraries, using different experimental approaches, thereby reducing unnecessary duplication of effort, and maximizing the amount of information that one set of resources can provide.

20.
Echinoderms have long served as model organisms for a variety of biological research, especially in the field of developmental biology. Although the genome of the purple sea urchin Strongylocentrotus purpuratus has been sequenced, it is the only echinoderm whose whole genome sequence has been reported. Nevertheless, data are rapidly accumulating on the chromosomes and genomic sequences of all five classes of echinoderms, including the mitochondrial genomes and Hox genes. These new data will be essential for estimating the phylogenetic relationships among echinoderms, and also for examining the underlying mechanisms by which the diverse morphologies of echinoderms have arisen.
