Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
2.
INE: a rice genome database with an integrated map view   Total citations: 7 (self-citations: 1; citations by others: 6)
The Rice Genome Research Program (RGP) launched a large-scale rice genome sequencing project in 1998 aimed at decoding all genetic information in rice. A new genome database called INE (INtegrated rice genome Explorer) has been developed to integrate all the genomic information accumulated so far and to correlate these data with the genome sequence. A web interface based on a Java applet provides rapid viewing of the database. The first operational version of the database has been completed, which includes a genetic map, a physical map using YAC (Yeast Artificial Chromosome) clones and PAC (P1-derived Artificial Chromosome) contigs. These maps are displayed graphically so that the positional relationships among the mapped markers on each chromosome can be easily resolved. INE incorporates the sequences and annotations of the PAC contig. A site on low quality information ensures that all submitted sequence data comply with the standard for accuracy. As a repository of rice genome sequence, INE will also serve as a common database of all sequence data obtained by collaborating members of the International Rice Genome Sequencing Project (IRGSP). The database can be accessed at http://www.dna.affrc.go.jp:82/giot/INE.html or its mirror site at http://www.staff.or.jp/giot/INE.html

3.
Restauro-G: A Rapid Genome Re-Annotation System for Comparative Genomics   Total citations: 1 (self-citations: 0; citations by others: 1)
Annotations of complete genome sequences submitted directly from sequencing projects are diverse in terms of annotation strategies and update frequencies. These inconsistencies make comparative studies difficult. To allow rapid data preparation of a large number of complete genomes, automation and speed are important for genome re-annotation. Here we introduce an open-source rapid genome re-annotation software system, Restauro-G, specialized for bacterial genomes. Restauro-G re-annotates a genome by similarity searches utilizing the BLAST-Like Alignment Tool, referring to protein databases such as UniProt KB, NCBI nr, NCBI COGs, Pfam, and PSORTb. Re-annotation by Restauro-G achieved over 98% accuracy for most bacterial chromosomes in comparison with the original manually curated annotation of EMBL releases. Restauro-G was developed in the generic bioinformatics workbench G-language Genome Analysis Environment and is distributed at http://restauro-g.iab.keio.ac.jp/ under the GNU General Public License.

4.
We have constructed a physical map of Arabidopsis thaliana chromosome 3 by ordering the clones from CIC YAC, P1, TAC and BAC libraries using the sequences of a variety of genetic and EST markers and terminal sequences of clones. The markers used were 112 DNA markers, 145 YAC end sequences, and 156 end sequences of P1, TAC and BAC clones. The entire genome of chromosome 3, except for the centromeric and telomeric regions, was covered by two large contigs, 13.6 Mb and 9.2 Mb long. This physical map will facilitate map-based cloning experiments as well as genome sequencing of chromosome 3. The map and end sequence information are available on the KAOS (Kazusa Arabidopsis data Opening Site) web site at http://www.kazusa.or.jp/arabi/.

5.
KEGG: Kyoto Encyclopedia of Genes and Genomes   Total citations: 85 (self-citations: 3; citations by others: 82)
KEGG (Kyoto Encyclopedia of Genes and Genomes) is a knowledge base for systematic analysis of gene functions, linking genomic information with higher order functional information. The genomic information is stored in the GENES database, which is a collection of gene catalogs for all the completely sequenced genomes and some partial genomes with up-to-date annotation of gene functions. The higher order functional information is stored in the PATHWAY database, which contains graphical representations of cellular processes, such as metabolism, membrane transport, signal transduction and cell cycle. The PATHWAY database is supplemented by a set of ortholog group tables for the information about conserved subpathways (pathway motifs), which are often encoded by positionally coupled genes on the chromosome and which are especially useful in predicting gene functions. A third database in KEGG is LIGAND for the information about chemical compounds, enzyme molecules and enzymatic reactions. KEGG provides Java graphics tools for browsing genome maps, comparing two genome maps and manipulating expression maps, as well as computational tools for sequence comparison, graph comparison and path computation. The KEGG databases are updated daily and made freely available (http://www.genome.ad.jp/kegg/).

6.
In the context of the international project aiming at sequencing the whole genome of Bacillus subtilis, we have developed NRSub, a non-redundant database of sequences from this organism. Starting from the B. subtilis sequences available in the repository collections, we removed all encountered duplications and then added extra annotations to the sequences (e.g. accession numbers for the genes, locations on the genetic map, codon usage index). We have also added cross-references with the EMBL/GenBank/DDBJ, MEDLINE, SWISS-PROT and ENZYME databases. NRSub is distributed through anonymous FTP as a text file in EMBL format and as an ACNUC database. It is also possible to access the database through two dedicated World Wide Web servers located in France (http://acnuc.univ-lyon1.fr/nrsub/nrsub.html) and in Japan (http://ddbjs4h.genes.nig.ac.jp/).
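
The duplicate-removal step described above can be illustrated with a minimal Python sketch. It is not NRSub's actual procedure, which the abstract does not detail; it assumes that "duplication" means an identical sequence after case and whitespace normalization. The function name `remove_duplicates` and the accession numbers in the usage note are hypothetical.

```python
from typing import Iterable


def remove_duplicates(
    records: Iterable[tuple[str, str]],
) -> list[tuple[list[str], str]]:
    """Collapse records that carry the same sequence.

    `records` is an iterable of (accession, sequence) pairs. Assumption:
    two records are duplicates when their sequences are identical after
    removing whitespace and upper-casing. Accessions of merged records
    are kept together so cross-references are not lost.
    """
    merged: dict[str, list[str]] = {}
    for accession, sequence in records:
        key = "".join(sequence.split()).upper()  # normalize the sequence
        merged.setdefault(key, []).append(accession)
    # dicts preserve insertion order, so first-seen order is retained
    return [(accessions, sequence) for sequence, accessions in merged.items()]
```

For example, `remove_duplicates([("X1", "atg cat"), ("X2", "ATGCAT"), ("X3", "GGGG")])` keeps two entries and records both `X1` and `X2` against the shared sequence. A real non-redundant database would also need to handle partial overlaps and reverse-complement entries, which this sketch ignores.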

7.
8.
InterPro was developed as a new integrated documentation resource for protein families, domains and functional sites to rationalize the complementary efforts of the PROSITE, PRINTS, Pfam and ProDom database projects. It has applications in computational functional classification of newly determined sequences lacking biochemical characterization and in comparative genome analysis. InterPro contains over 3500 entries, with more than 1,000,000 hits in SWISS-PROT and TrEMBL. The database is accessible for text- and sequence-based searches at http://www.ebi.ac.uk/interpro/. InterPro was used for whole proteome analysis of the pathogenic microorganism Mycobacterium tuberculosis, and comparison with the predicted protein coding sequences of the complete genomes of Bacillus subtilis and Escherichia coli. Of the M. tuberculosis proteins in the proteome, 64.8% matched InterPro entries, and these could be classified according to function. The comparison with B. subtilis and E. coli provided information on the most common protein families and domains, and the most highly represented families in each organism. InterPro thus provides a useful tool for global views of whole proteomes and their compositions.

9.
The Bacillus subtilis 168 chromosome was found to share extensive homology with the genome of bacteriophage phi 3T. At least three different regions of the bacterial genome hybridized to ribonucleic acid complementary to phi 3T deoxyribonucleic acid (DNA). The thymidylate synthetase gene, thyA, of B. subtilis and the sequences adjacent to it were shown to be homologous to the region in the phi 3T DNA containing the phage-encoded thymidylate synthetase gene, thyP3. SP beta, a temperate bacteriophage known to be integrated into the B. subtilis 168 chromosome, was demonstrated to be closely related to phi 3T. Other regions of the bacterial genome were also found to hybridize to the phi 3T probe. The nature and location of these sequences in the bacterial and phage chromosomes were not identified. It was shown, however, that they were not homologous to either the thyP3 gene or the DNA surrounding the thyP3 gene. The chromosomes of other Bacillus species were also screened for the presence of phi 3T homologous sequences, and the thyP3 gene was localized in the linear genomes of phages phi 3T and rho 11 by heteroduplex mapping. It is suggested that the presence of sequences of phage origin in the B. subtilis 168 chromosome might contribute to the restructuring and evolution of the viral and bacterial DNAs.

10.
11.
Chloroplast genomes have been widely used in studying plant phylogeny and evolution. Several chloroplast genome visualization tools have been developed to display the distribution of genes on the genome. However, these tools do not draw features such as exons, introns, repetitive elements, and variable sites, which prevents in-depth examination of the genome structures. Here, we developed and validated a software package called Chloroplast Genome Viewers (CPGView). CPGView can draw three maps showing (i) the distributions of genes, variable sites, and repetitive sequences, including microsatellites, tandem and dispersed repeats; (ii) the structure of the cis-splicing genes after adjusting the exon-intron boundary positions using a coordinate scaling algorithm; and (iii) the structure of the trans-splicing gene rps12. To test the accuracy of CPGView, we sequenced, assembled, and annotated 31 chloroplast genomes from 31 genera of 22 families. CPGView drew maps correctly for all 31 chloroplast genomes. Lastly, we used CPGView to examine 5998 publicly released chloroplast genomes from 2513 genera of 553 families. CPGView succeeded in plotting maps for 5882 but failed to plot maps for 116 chloroplast genomes. Further examination showed that the annotations of these 116 genomes had various errors needing manual correction. The test on newly generated data and publicly available data demonstrated the ability of CPGView to identify errors in the annotations of chloroplast genomes. CPGView will become a widely used tool to study the detailed structure of chloroplast genomes. The web version of CPGView can be accessed from http://www.1kmpg.cn/cpgview.

12.
DNA Data Bank of Japan (DDBJ) for genome scale research in life science   Total citations: 5 (self-citations: 0; citations by others: 5)
The DNA Data Bank of Japan (DDBJ, http://www.ddbj.nig.ac.jp) has made an effort to collect as much data as possible, mainly from Japanese researchers. The increase rates of the data we collected, annotated and released to the public in the past year are 43% for the number of entries and 52% for the number of bases. The rates of increase have accelerated even after the human genome was sequenced, because sequencing technology has been remarkably advanced and simplified, and research in life science has shifted from the gene scale to the genome scale. In addition, we have developed the Genome Information Broker (GIB, http://gib.genes.nig.ac.jp), which now includes data for more than 50 complete microbial genomes and the Arabidopsis genome. We have also developed a database of the human genome, the Human Genomics Studio (HGS, http://studio.nig.ac.jp). HGS provides a set of sequences that is as continuous as possible for any one of the 24 chromosomes. Both GIB and HGS have been updated, incorporating newly available data and retrieval tools.

13.
SUMMARY: Insertional mutagenesis is a powerful method for gene discovery. To identify the location of insertion sites in the genome, linker-based polymerase chain reaction (PCR) methods (such as splinkerette-PCR) may be employed. We have developed a web application called iMapper (Insertional Mutagenesis Mapping and Analysis Tool) for the efficient analysis of insertion site sequence reads against vertebrate and invertebrate Ensembl genomes. Taking linker-based sequences as input, iMapper scans and trims the sequence to remove the linker and sequences derived from the insertional mutagen. The software then identifies and removes contaminating sequences derived from chimeric genomic fragments, vector or the transposon concatamer and then presents the clipped sequence reads to a sequence mapping server which aligns them to an Ensembl genome. Insertion sites can then be navigated in Ensembl in the context of genomic features such as gene structures. iMapper also generates text-based format for nucleic acid or protein sequences (FASTA) and generic file format (GFF) files of the clipped sequence reads and provides a graphical overview of the mapped insertion sites against a karyotype. iMapper is designed for high-throughput applications and can efficiently process thousands of DNA sequence reads. AVAILABILITY: iMapper is web based and can be accessed at http://www.sanger.ac.uk/cgi-bin/teams/team113/imapper.cgi.
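
The trimming and FASTA export steps described above can be sketched in a few lines of Python. This is an illustration only, not iMapper's implementation: it assumes exact string matches for the mutagen-derived tag and the linker, whereas a real tool would tolerate sequencing errors. The names `clip_read` and `to_fasta`, and the tag and linker sequences in the usage note, are hypothetical.

```python
from typing import Iterable, Optional


def clip_read(read: str, mutagen_tag: str, linker: str) -> Optional[str]:
    """Return the genomic part of a read, or None if it cannot be clipped.

    Assumption: the genomic sequence lies between the end of the
    mutagen-derived tag and the start of the linker, and both are
    found by exact matching.
    """
    read = read.upper()
    tag_pos = read.find(mutagen_tag.upper())
    if tag_pos == -1:
        return None  # no mutagen sequence found: not an insertion read
    start = tag_pos + len(mutagen_tag)
    end = read.find(linker.upper(), start)
    if end == -1:
        end = len(read)  # read stopped before reaching the linker
    clipped = read[start:end]
    return clipped or None


def to_fasta(records: Iterable[tuple[str, str]], width: int = 60) -> str:
    """Format (name, sequence) pairs as FASTA text, wrapped at `width`."""
    lines: list[str] = []
    for name, sequence in records:
        lines.append(f">{name}")
        lines.extend(
            sequence[i:i + width] for i in range(0, len(sequence), width)
        )
    return "\n".join(lines) + "\n"
```

With the made-up tag `TTAA` and linker `GTAC`, `clip_read("ccTTAAGGGCCCATATGTACgg", "TTAA", "GTAC")` yields `GGGCCCATAT`, which can then be passed to `to_fasta` before alignment.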

14.
A fine physical map of the top arm of Arabidopsis thaliana chromosome 3 has been constructed by ordering P1, TAC and BAC clones using the sequences of a variety of DNA markers and end-sequences of clones. The marker sequences used in this study were derived from 58 DNA markers, 93 YAC end-sequences, and 807 end-sequences of P1, TAC and BAC clones. The entire top arm of chromosome 3, except for the centromeric and telomeric regions, was covered by a single contig 13.3 Mb long. This fine physical map will facilitate gene isolation by map-based cloning experiments as well as genome sequencing of the top arm of chromosome 3. The map and end-sequence information are available on the web site KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/arabi/.

15.
DNA Data Bank of Japan at work on genome sequence data.   Total citations: 5 (self-citations: 3; citations by others: 2)
We at the DNA Data Bank of Japan (DDBJ) (http://www.ddbj.nig.ac.jp) have recently begun receiving, processing and releasing EST and genome sequence data submitted by various Japanese genome projects. The data include those for human, Arabidopsis thaliana, rice, nematode, Synechocystis sp. and Escherichia coli. Since the quantity of data is very large, we organized teams to conduct preliminary discussions with project teams about data submission and handling for release to the public. We also developed a mass submission tool to cope with a large quantity of data. In addition, to provide genome data on the WWW, we developed a genome information system using Java. This system (http://mol.genes.nig.ac.jp/ecoli/) can in theory be used for any genome sequence data. These activities will facilitate processing of large quantities of EST and genome data.

16.
We developed dbCNS (http://yamasati.nig.ac.jp/dbcns), a new database for conserved noncoding sequences (CNSs). CNSs exist in many eukaryotes and are assumed to be involved in protein expression control. Version 1 of dbCNS, introduced here, includes a powerful and precise CNS identification pipeline for multiple vertebrate genomes. Mutations in CNSs may induce morphological changes and cause genetic diseases. For this reason, many vertebrate CNSs have been identified, with special reference to primate genomes. We integrated ∼6.9 million CNSs from many vertebrate genomes into dbCNS, which allows users to extract CNSs near genes of interest using keyword searches. In addition to CNSs, dbCNS contains published genome sequences of 161 species. With purposeful taxonomic sampling of genomes, users can employ CNSs as queries to reconstruct CNS alignments and phylogenetic trees, to evaluate CNS modifications, acquisitions, and losses, and to roughly identify species with CNSs having accelerated substitution rates. dbCNS also provides links to dbSNP for searching pathogenic single-nucleotide polymorphisms in human CNSs. Thus, dbCNS connects morphological changes with genetic diseases. A test analysis using 38 gnathostome genomes was completed within 30 s. dbCNS results can be used to evaluate CNSs identified by other stand-alone programs using genome-scale data.

17.
I have examined potential determinants of the asymmetric distribution of nucleotide sequences in the genome of Escherichia coli as cataloged in GenBank release 44. I have used the frequency of occurrence of all possible tetranucleotides in a given sequence catalog or derivative as a comparative measure of asymmetry. The GenBank-cataloged strand and its complement show statistically similar (not complementary) distributions. The distribution is statistically similar in comparisons between the protein coding subset and the total genome, the coding subset and selected non-coding genes, the coding subset and the remainder of the DNA, and the coding subset and stable RNA sequences. I have compared the distribution in the genome of E. coli with the distributions found in the cataloged genomes of Salmonella typhimurium, Bacillus subtilis, and of coliphages lambda and T7. The distribution summed in both strands of the cataloged DNA differs statistically only in comparisons with lytic bacteriophage T7 because only the two strands of T7 show statistically dissimilar distributions. Despite similarities in tetranucleotide distribution, the pattern of codon complementarity in B. subtilis is different from that documented for E. coli. Thus, sequence asymmetry does not seem related to specific DNA function or to documented similarities or differences in codon bias. The sequence asymmetry of the E. coli genome may thus reflect a hitherto unsuspected pattern impressed on both strands of DNA which is or can be packaged into bacterial genomes.
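
The tetranucleotide measure used above can be illustrated with a minimal Python sketch that counts all 256 possible tetranucleotides in a sequence and compares a strand with its reverse complement. The abstract does not name the statistical test used, so the `l1_distance` function below is only an illustrative stand-in, and all function names are hypothetical.

```python
from collections import Counter
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")
ALL_TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def tetranucleotide_frequencies(seq: str) -> dict[str, float]:
    """Relative frequency of each of the 256 tetranucleotides.

    Overlapping windows are counted; windows that contain characters
    other than A, C, G or T are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[t] for t in ALL_TETRAMERS)
    return {t: (counts[t] / total if total else 0.0) for t in ALL_TETRAMERS}


def l1_distance(freq_a: dict[str, float], freq_b: dict[str, float]) -> float:
    """Sum of absolute frequency differences (illustrative measure only)."""
    return sum(abs(freq_a[t] - freq_b[t]) for t in ALL_TETRAMERS)
```

Comparing `tetranucleotide_frequencies(seq)` with `tetranucleotide_frequencies(reverse_complement(seq))` gives a rough picture of how similar the two strands' distributions are; a distance near zero means the strands use tetranucleotides at nearly the same rates.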

18.
19.
KEGG Mapper for inferring cellular functions from protein sequences   Total citations: 1 (self-citations: 0; citations by others: 1)
KEGG is a reference knowledge base for biological interpretation of large‐scale molecular datasets, such as genome and metagenome sequences. It accumulates experimental knowledge about high‐level functions of the cell and the organism represented in terms of KEGG molecular networks, including KEGG pathway maps, BRITE hierarchies, and KEGG modules. By the process called KEGG mapping, a set of protein coding genes in the genome, for example, can be converted to KEGG molecular networks enabling interpretation of cellular functions and other high‐level features. Here we report a new version of KEGG Mapper, a suite of KEGG mapping tools available at the KEGG website (https://www.kegg.jp/ or https://www.genome.jp/kegg/), together with the KOALA family tools for automatic assignment of KO (KEGG Orthology) identifiers used in the mapping.

20.
We have designed and implemented a system to manage whole genome shotgun sequences and whole genome sequence assembly data flow. The Sequence Assembly Manager (SAM) consists primarily of a MySQL relational database and Perl applications designed to easily manipulate and coordinate the analysis of sequence information and to view and report genome assembly progress through its Common Gateway Interface (CGI) web interface. The application includes a tool to compare sequence assemblies to fingerprint maps that has been used successfully to improve and validate both maps and sequence assemblies of the Rhodococcus sp. RHA1 and Cryptococcus neoformans WM276 genomes.
