Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
The EMBL Nucleotide Sequence Database.   Total citations: 9, self-citations: 0, citations by others: 9
The EMBL Nucleotide Sequence Database is a comprehensive database of DNA and RNA sequences directly submitted from researchers and genome sequencing groups and collected from the scientific literature and patent applications. In collaboration with DDBJ and GenBank the database is produced, maintained and distributed at the European Bioinformatics Institute (EBI) and constitutes Europe's primary nucleotide sequence resource. Database releases are produced quarterly and are distributed on CD-ROM. EBI's network services allow access to the most up-to-date data collection via Internet and World Wide Web interface, providing database searching and sequence similarity facilities plus access to a large number of additional databases.

2.
The DNA Data Bank of Japan (DDBJ, http://www.ddbj.nig.ac.jp) has collected and released more entries and bases than last year. This is mainly due to large-scale submissions from Japanese sequencing teams on mouse, rice, chimpanzee, nematoda and other organisms. The contributions of DDBJ over the past year are 17.3% (entries) and 10.3% (bases) of the combined outputs of the International Nucleotide Sequence Databases (INSD). Our complete genome sequence database, Genome Information Broker (GIB), has been improved by incorporating XML. It is now possible to perform a more sophisticated database search against the new GIB than the ordinary BLAST or FASTA search.

3.
Codon usage in 87 602 genes has been calculated using the nucleotide sequence data obtained from the GenBank Genetic Sequence Data Bank (Release 90.0; September 1995). The database is called the CUTG Database; the complete form of the database can be obtained by anonymous ftp from DDBJ and a part of the database, which lists the frequency of codon use in each organism, is made searchable through our World Wide Web server.

4.
Plant protein annotation in the UniProt Knowledgebase   Total citations: 3, self-citations: 0, citations by others: 3
The Swiss-Prot, TrEMBL, Protein Information Resource (PIR), and DNA Data Bank of Japan (DDBJ) protein database activities have united to form the Universal Protein Resource (UniProt) Consortium. UniProt presents three database layers: the UniProt Archive, the UniProt Knowledgebase (UniProtKB), and the UniProt Reference Clusters. The UniProtKB consists of two sections: UniProtKB/Swiss-Prot (fully manually curated entries) and UniProtKB/TrEMBL (automated annotation, classification and extensive cross-references). New releases are published fortnightly. A specific Plant Proteome Annotation Program (http://www.expasy.org/sprot/ppap/) was initiated to cope with the increasing amount of data produced by the complete sequencing of plant genomes. Through UniProt, our aim is to provide the scientific community with a single, centralized, authoritative resource for protein sequences and functional information that will allow the plant community to fully explore and utilize the wealth of information available for both plant and non-plant model organisms.

5.
The EMBL nucleotide sequence database   Total citations: 1, self-citations: 0, citations by others: 1
The European Molecular Biology Laboratory Nucleotide Sequence Database receives sequence and sequence annotation data from genome projects, sequencing centers, individual scientists, and patent offices. Data may be most efficiently submitted to the database using the Internet-based submission tool WEBIN or via previously established genome project accounts. Biologist curators will review the data and provide accession numbers within two working days. Non-confidential data are exchanged daily in an international collaboration between EMBL, DDBJ (the DNA Databank of Japan) and GenBank (USA) and may be accessed and retrieved via the Internet with the Sequence Retrieval System (SRS). Sequence database searching algorithms (e.g., Blitz, Fasta, Blast) are available for comparison of query to database sequences.

6.
In the context of the international project aimed at sequencing the whole genome of Bacillus subtilis we have developed a non-redundant, fully annotated database of sequences from this organism. Starting from the B. subtilis sequences available in the EMBL, GenBank and DDBJ collections we have removed all encountered duplications and then added extra annotations to the sequences (e.g. accession numbers for the genes, locations on the genetic map, codon usage, etc.). We have also added cross-references to the EMBL, MEDLINE, SWISS-PROT and ENZYME data banks. The present system results from merging of the NRSub and SubtiList databases and the sequence contigs used in the two systems are identical. NRSub is distributed as a flatfile in EMBL format (which is supported by most sequence analysis software packages) and as an ACNUC database, while SubtiList is distributed as a relational database under 4th Dimension. It is possible to access the data through two dedicated World Wide Web servers located in France and Japan.

7.
The European Bioinformatics Institute (EBI) databases.   Total citations: 4, self-citations: 3, citations by others: 1
This paper describes the databases and services of the European Bioinformatics Institute (EBI). In collaboration with DDBJ and GenBank/NCBI, the EBI maintains and distributes the EMBL Nucleotide Sequence Database, Europe's primary nucleotide sequence data resource. The EBI also maintains and distributes the SWISS-PROT Protein Sequence Database, in collaboration with Amos Bairoch of the University of Geneva. Over thirty additional specialist molecular biology databases, as well as software and documentation of interest to molecular biologists, are also available. The EBI network services include database searching, entry retrieval, and sequence similarity searching facilities.

8.
CUTG (codon usage tabulated from GenBank) is a comprehensive database for codon usage. The codon usage for each full-length protein gene has been calculated using the nucleotide sequence obtained from the GenBank sequence database. The sum of the codon use of each organism has also been calculated. The data files can be obtained from anonymous ftp sites of DDBJ, DISC and EBI. The list of codon usage of genes in organisms was made searchable by name of organism through a web site, http://www.dna.affrc.go.jp/~nakamura/CUTG.html. The compilation is synchronized with major releases of GenBank.
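
The tabulation this abstract describes (per-gene codon counts, then a per-organism sum) can be illustrated with a minimal sketch. This is not the CUTG implementation; the function names `count_codons` and `codon_usage_per_thousand` are hypothetical, and the sketch assumes each input is a complete coding sequence already in reading frame, with any trailing partial codon ignored.

```python
from collections import Counter


def count_codons(cds: str) -> Counter:
    """Count codons in one coding sequence, read in frame from position 0.

    Assumption: the sequence is a complete CDS; a trailing partial codon
    (length not divisible by 3) is silently dropped.
    """
    seq = cds.upper().replace("U", "T")  # accept RNA or DNA alphabets
    usable = len(seq) - len(seq) % 3
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


def codon_usage_per_thousand(genes) -> dict:
    """Sum codon counts over all genes of one organism and
    express each codon as a frequency per 1000 codons."""
    total = Counter()
    for cds in genes:
        total.update(count_codons(cds))
    n = sum(total.values())
    if n == 0:
        return {}
    return {codon: 1000.0 * count / n for codon, count in total.items()}
```

Calling `codon_usage_per_thousand` on the list of all CDS strings for one organism gives that organism's summed table; calling `count_codons` on a single CDS gives the per-gene table.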

9.
The EMBL nucleotide sequence database.   Total citations: 7, self-citations: 5, citations by others: 2
The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl.html) constitutes Europe's primary nucleotide sequence resource. DNA and RNA sequences are directly submitted from researchers and genome sequencing groups and collected from the scientific literature and patent applications (Fig. 1). In collaboration with DDBJ and GenBank the database is produced, maintained and distributed at the European Bioinformatics Institute. Database releases are produced quarterly and are distributed on CD-ROM. EBI's network services allow access to the most up-to-date data collection via Internet and World Wide Web interface, providing database searching and sequence similarity facilities plus access to a large number of additional databases.

10.
High-performance next-generation sequencing (NGS) technologies are advancing genomics and molecular biological research. However, the immense amount of sequence data requires computational skills and suitable hardware resources that are a challenge to molecular biologists. The DNA Data Bank of Japan (DDBJ) of the National Institute of Genetics (NIG) has initiated a cloud computing-based analytical pipeline, the DDBJ Read Annotation Pipeline (DDBJ Pipeline), for a high-throughput annotation of NGS reads. The DDBJ Pipeline offers a user-friendly graphical web interface and processes massive NGS datasets using decentralized processing by NIG supercomputers currently free of charge. The proposed pipeline consists of two analysis components: basic analysis for reference genome mapping and de novo assembly and subsequent high-level analysis of structural and functional annotations. Users may smoothly switch between the two components in the pipeline, facilitating web-based operations on a supercomputer for high-throughput data analysis. Moreover, public NGS reads of the DDBJ Sequence Read Archive located on the same supercomputer can be imported into the pipeline through the input of only an accession number. This proposed pipeline will facilitate research by utilizing unified analytical workflows applied to the NGS data. The DDBJ Pipeline is accessible at http://p.ddbj.nig.ac.jp/.

11.
Expressed sequence tags (ESTs) from the Antarctic green alga Pyramimonas gelidicola were analyzed to obtain molecular information on cold acclimation of psychrophilic microorganisms. A total of 2,112 EST clones were sequenced, generating 222 contigs and 219 singletons, and 200 contigs and 391 singletons from control (4 °C) and cold-shock conditions (-2 °C), respectively. The complete EST sequences were deposited in the DDBJ EST database (http://www.ddbj.nig.ac.jp/index-e.html) and the nucleotide sequences reported in this study are available in DDBJ/EMBL/GenBank. These EST databases of Antarctic green algae can be used in a wide range of studies on psychrophilic genes expressed by polar microorganisms.

12.
The frequencies of each of the 257 468 complete protein-coding sequences (CDSs) have been compiled from the taxonomical divisions of the GenBank DNA sequence database. The sum of the codons used by 8792 organisms has also been calculated. The data files can be obtained from the anonymous ftp sites of DDBJ, Kazusa and EBI. A list of the codon usage of genes and the sum of the codons used by each organism can be obtained through the web site http://www.kazusa.or.jp/codon/. The present study also reports recent developments on the WWW site. The new web interface provides data in the CodonFrequency-compatible format as well as in the traditional table format. The use of the database is facilitated by keyword-based search analysis and the availability of codon usage tables for selected genes from each species. These new tools will provide users with the ability to further analyze variations in codon usage among different genomes.

13.
Frequencies for each of the 206 526 complete protein-coding genes (CDSs) have been compiled from taxonomical divisions of the GenBank DNA sequence database. The sum of the codon use of 7434 organisms has also been calculated. These data files can be obtained from anonymous ftp sites of DDBJ, DISC and EBI. The list of the codon usage of genes in an organism as well as the sum of the codon usage of the organism was made searchable by the name of organism through a web site http://www.dna.affrc.go.jp//CUTG.html

14.
PeroxiBase: a class III plant peroxidase database   Total citations: 7, self-citations: 0, citations by others: 7
Class III plant peroxidases (EC 1.11.1.7), which are encoded by multigenic families in land plants, are involved in several important physiological and developmental processes. Their varied functions are not yet clearly determined, but their characterization will certainly lead to a better understanding of plant growth, differentiation and interaction with the environment, and hence to many exciting applications. Since there is currently no central database for plant peroxidase sequences and many plant sequences are not deposited in the EMBL/GenBank/DDBJ repository or the UniProt KnowledgeBase, this prevents researchers from easily accessing all peroxidase sequences. Furthermore, gene expression data are poorly covered and annotations are inconsistent. In this rapidly moving field, there is a need for continual updating and correction of the peroxidase superfamily in plants. Moreover, consolidating information about peroxidases will allow for comparison of peroxidases between species and thus significantly help making correlations of function, structure or phylogeny. We report a new database (PeroxiBase) accessible through a web server with specific tools dedicated to facilitate query, classification and submission of peroxidase sequences. Recent developments in the field of plant peroxidase are also mentioned.

15.
DNA Data Bank of Japan at work on genome sequence data.   Total citations: 5, self-citations: 3, citations by others: 2
We at the DNA Data Bank of Japan (DDBJ) (http://www.ddbj.nig.ac.jp) have recently begun receiving, processing and releasing EST and genome sequence data submitted by various Japanese genome projects. The data include those for human, Arabidopsis thaliana, rice, nematode, Synechocystis sp. and Escherichia coli. Since the quantity of data is very large, we organized teams to conduct preliminary discussions with project teams about data submission and handling for release to the public. We also developed a mass submission tool to cope with a large quantity of data. In addition, to provide genome data on WWW, we developed a genome information system using Java. This system (http://mol.genes.nig.ac.jp/ecoli/) can in theory be used for any genome sequence data. These activities will facilitate processing of large quantities of EST and genome data.

16.
In the context of the international project aiming at sequencing the whole genome of Bacillus subtilis we have developed NRSub, a non-redundant database of sequences from this organism. Starting from the B. subtilis sequences available in the repository collections we have removed all encountered duplications, then we have added extra annotations to the sequences (e.g. accession numbers for the genes, locations on the genetic map, codon usage index). We have also added cross-references with EMBL/GenBank/DDBJ, MEDLINE, SWISS-PROT and ENZYME databases. NRSub is distributed through anonymous FTP as a text file in EMBL format and as an ACNUC database. It is also possible to access the database through two dedicated World Wide Web servers located in France (http://acnuc.univ-lyon1.fr/nrsub/nrsub.html) and in Japan (http://ddbjs4h.genes.nig.ac.jp/).

17.
Freshwater is a critical resource for human survival but severely threatened by anthropogenic activities and climate change. These changes strongly impact the abundance and diversity of the microbial communities which are key players in the functioning of these aquatic ecosystems. Although widely documented since the emergence of high-throughput sequencing approaches, the information on these natural microbial communities is scattered among thousands of publications and it is therefore difficult to investigate the temporal dynamics and the spatial distribution of microbial taxa within or across ecosystems. To fill this gap and in the FAIR principles context we built a manually curated and standardized microbial freshwater -omics database (FreshOmics). Based on recognized ontologies (ENVO, MIMICS, GO, ISO), FreshOmics describes 29 different types of freshwater ecosystems and uses standardized attributes to depict biological samples, sequencing protocols and article attributes for more than 2487 geographical locations across 71 countries around the world. The database contains 24,808 sequence identifiers (i.e., Run_Id / Exp_ID, mainly from SRA/DDBJ SRA/ENA, GSA and MG-RAST repositories) covering all sequence-based -omics approaches used to investigate bacteria, archaea, microbial eukaryotes, and viruses. Therefore, FreshOmics allows accurate and comprehensive analyses of microbial communities to answer questions related to their roles in freshwater ecosystems functioning and resilience, especially through meta-analysis studies. This collection also highlights different sorts of errors in published works (e.g., wrong coordinates, sample type, material, spelling).

18.
HCVDB   Total citations: 2, self-citations: 0, citations by others: 2
To date, more than 30 000 hepatitis C virus (HCV) sequences have been deposited in the generalist databases DNA Data Bank of Japan (DDBJ), EMBL Nucleotide Sequence Database (EMBL) and GenBank. The main difficulties with HCV sequences in these databases are their retrieval, annotation and analyses. To help HCV researchers face the increasing needs of HCV sequence analyses, we developed a specialised database of computer-annotated HCV sequences, called HCVDB. HCVDB is re-built every month from an up-to-date EMBL database by an automated process. HCVDB provides key data about the HCV sequences (e.g. genotype, genomic region, protein names and functions, known 3-dimensional structures) and ensures consistency of the annotations, which enables reliable keyword queries. The database is highly integrated with sequence and structure analysis tools and the SRS (LION bioscience) keywords query system. Thus, any user can extract subsets of sequences matching particular criteria or enter their own sequences and analyse them with various bioinformatics programs available on the same server. AVAILABILITY: HCVDB is available from http://hepatitis.ibcp.fr.

19.
A total of 687 DNA sequence accessions from the Mendel database (release 1.04, 3 November 1994) assigned standardized designations for plant genes and gene products were used in a BLAST similarity search of 7557 rice partial cDNA sequences and 287 other rice sequences from the Japanese Rice Genome Research Program. We describe procedures for data manipulation, import and export from and to Macintosh and Unix, and the use of the 4th Dimension relational database management system (RDBMS) in data processing. Altogether 275 sequences showed strong similarity hits. Using the CPGN nomenclature, we assign putative designations for genes and gene products. Assignments include representatives of 26 gene products, including 58 cDNA sequences similar to α-tubulins (TubA), 23 similar to β-tubulins (TubB) and 51 similar to cytosolic subunit C of glyceraldehyde-3-phosphate dehydrogenase (NAD) (GapC). The results of the similarity searches are listed and are also available electronically. The assignments have been submitted to the CPGN working groups for verification and for later inclusion in the GenBank/EMBL/DDBJ sequence databases, which will include the standardized designations in the accession data fields. Member of the ISPMB Commission on Plant Gene Nomenclature, representing the Rice Genome Research Program of Japan. Reprint requests to T. Sasaki.

20.
The application of novel and modern techniques in genetic engineering and genomics has resulted in an information explosion in genomics. The three major genome databases of the International Nucleotide Sequence Database Collaboration (NCBI, DDBJ and EMBL) provide a convenient platform for the submission of sequences, which they share among themselves. Many institutes in India under the Indian Council of Agricultural Research have scientists working on biotechnology and bioinformatics research. The various studies they conduct generate massive amounts of data related to the biological information of plants, animals, insects, microbes and fisheries. These scientists depend on NCBI, EMBL, DDBJ and other portals for their sequence submissions, analysis and other data mining tasks. Various limitations imposed on these sites, together with poor connectivity, prevent them from conducting their studies on these open-domain databases. The valuable information they generate needs to be shared with the scientific community to eliminate duplication of effort and expedite new findings. A secure common submission portal with user-friendly interfaces, integrated help and error-checking facilities has been developed in such a way that the backend database consists of a union of the items available in the above-mentioned databases. Standard database management concepts have been employed for systematic storage management. Extensive hardware resources in the form of a high-performance computing facility are being installed for deployment of this portal.

Availability

http://cabindb.iasri.res.in:8080/sequence_portal/
