Found 20 similar articles (search time: 15 ms)
1.
2.
Brooksbank C Camon E Harris MA Magrane M Martin MJ Mulder N O'Donovan C Parkinson H Tuli MA Apweiler R Birney E Brazma A Henrick K Lopez R Stoesser G Stoehr P Cameron G 《Nucleic acids research》2003,31(1):43-50
As the amount of biological data grows, so does the need for biologists to store and access this information in central repositories in a free and unambiguous manner. The European Bioinformatics Institute (EBI) hosts six core databases, which store information on DNA sequences (EMBL-Bank), protein sequences (SWISS-PROT and TrEMBL), protein structure (MSD), whole genomes (Ensembl) and gene expression (ArrayExpress). But just as a cell would be useless if it couldn't transcribe DNA or translate RNA, our resources would be compromised if each existed in isolation. We have therefore developed a range of tools that not only facilitate the deposition and retrieval of biological information, but also allow users to carry out searches that reflect the interconnectedness of biological information. The EBI's databases and tools are all available on our website at www.ebi.ac.uk.
3.
The Ensembl database makes genomic features available via its Genome Browser. It is also possible to access the underlying data through a Perl API for advanced querying. We have developed a full-featured Ruby API to the Ensembl databases, providing the same functionality as the Perl interface with additional features. A single Ruby API is used to access different releases of the Ensembl databases and is also able to query multi-species databases. Availability and Implementation: Most functionality of the API is provided using the ActiveRecord pattern. The library depends on introspection to make it release independent. The API is available through the Rubygem system and can be installed with the command gem install ruby-ensembl-api.
4.
BioMart and Bioconductor: a powerful link between biological databases and microarray data analysis   Total citations: 6, self-citations: 0, citations by others: 6
Durinck S Moreau Y Kasprzyk A Davis S De Moor B Brazma A Huber W 《Bioinformatics (Oxford, England)》2005,21(16):3439-3440
biomaRt is a new Bioconductor package that integrates BioMart data resources with data analysis software in Bioconductor. It can annotate a wide range of gene or gene product identifiers (e.g. Entrez-Gene and Affymetrix probe identifiers) with information such as gene symbol, chromosomal coordinates, Gene Ontology and OMIM annotation. Furthermore biomaRt enables retrieval of genomic sequences and single nucleotide polymorphism information, which can be used in data analysis. Fast and up-to-date data retrieval is possible as the package executes direct SQL queries to the BioMart databases (e.g. Ensembl). The biomaRt package provides a tight integration of large, public or locally installed BioMart databases with data analysis in Bioconductor creating a powerful environment for biological data mining.
5.
A brief introduction to genome databases   Total citations: 1, self-citations: 0, citations by others: 1
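Drawing on three internationally known genome databases installed at the Centre of Bioinformatics, Peking University (GDB, GenoList and Ensembl), this article introduces the genome databases in common use today, covering their content, data formats and usage, as well as the database management systems used to build them.
Abstract: A brief introduction to the genome databases GDB, GenoList and Ensembl is given. These databases, mirrored and maintained at the Centre of Bioinformatics, Peking University, provide useful information for genome research.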
6.
SUMMARY: With the availability of whole genome sequence in many species, linkage analysis, positional cloning and microarray are gradually becoming powerful tools for investigating the links between phenotype and genotype or genes. However, in these methods, causative genes underlying a quantitative trait locus, or a disease, are usually located within a large genomic region or a large set of genes. Examining the function of every gene is very time consuming and needs to retrieve and integrate the information from multiple databases or genome resources. PGMapper is a software tool for automatically matching phenotype to genes from a defined genome region or a group of given genes by combining the mapping information from the Ensembl database and gene function information from the OMIM and PubMed databases. PGMapper is currently available for candidate gene search of human, mouse, rat, zebrafish and 12 other species. AVAILABILITY: Available online at http://www.genediscovery.org/pgmapper/index.jsp.
7.
《遗传学报》2021,48(12):1122-1129
The origination of new genes contributes to the biological diversity of life. New genes may quickly build their network, exert important functions, and generate novel phenotypes. Dating gene age and inferring the origination mechanisms of new genes, like primate-specific genes, is the basis for the functional study of the genes. However, no comprehensive resource of gene age estimates across species is available. Here, we systematically date the age of 9,102,113 protein-coding genes from 565 species in the Ensembl and Ensembl Genomes databases, including 82 bacteria, 57 protists, 134 fungi, 58 plants, 56 metazoa, and 178 vertebrates, using a protein-family-based pipeline with Wagner parsimony algorithm. We also collect gene age estimate data from other studies and uniformly distribute the gene age estimates to time ranges in a million years for comparison across studies. All the data are cataloged into GenOrigin (http://genorigin.chenzxlab.cn/), a user-friendly new database of gene age estimates, where users can browse gene age estimates by species, age, and gene ontology. In GenOrigin, the information such as gene age estimates, annotation, gene ontology, ortholog, and paralog, as well as detailed gene presence/absence views for gene age inference based on the species tree with evolutionary timescale, is provided to researchers for exploring gene functions.
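The parsimony-based dating mentioned in this abstract can be illustrated with a minimal sketch. It is not the GenOrigin pipeline, whose details the abstract does not give. It assumes a binary presence/absence character per gene family, a toy species tree with placeholder names (`TREE`, `A`, `B`, `C`), user-chosen gain and loss costs, a family that is absent before the root, and ties resolved in favour of absence. The functions `infer_states` and `origin_node` are hypothetical names introduced for illustration.

```python
import math

# Hypothetical rooted species tree: internal node -> list of children.
TREE = {
    "root": ["anc_AB", "C"],
    "anc_AB": ["A", "B"],
}


def infer_states(tree, root, presence, gain_cost=1.0, loss_cost=1.0):
    """Infer presence (1) or absence (0) of a gene family at every node.

    Sankoff-style dynamic programming on a binary character, with separate
    costs for a gain (0 -> 1) and a loss (1 -> 0) along each branch.
    """
    step = {(0, 0): 0.0, (1, 1): 0.0, (0, 1): gain_cost, (1, 0): loss_cost}
    cost = {}

    def up(node):
        children = tree.get(node, [])
        if not children:
            # Leaf: the observed state is free, the other state is impossible.
            observed = 1 if presence[node] else 0
            cost[node] = {s: 0.0 if s == observed else math.inf for s in (0, 1)}
            return
        for child in children:
            up(child)
        cost[node] = {
            s: sum(min(step[(s, t)] + cost[c][t] for t in (0, 1)) for c in children)
            for s in (0, 1)
        }

    states = {}

    def down(node, state):
        states[node] = state
        for child in tree.get(node, []):
            # min() keeps the first minimum, so ties resolve to absence (0).
            best = min((0, 1), key=lambda t: step[(state, t)] + cost[child][t])
            down(child, best)

    up(root)
    # Assumption: the family is absent before the root, so presence at the
    # root costs one extra gain.
    root_state = min((0, 1), key=lambda s: cost[root][s] + (gain_cost if s else 0.0))
    down(root, root_state)
    return states


def origin_node(tree, root, states, leaf):
    """Oldest ancestor from which the family is continuously present down to leaf."""
    if not states[leaf]:
        return None
    parent = {c: p for p, children in tree.items() for c in children}
    node = leaf
    while node != root and states[parent[node]]:
        node = parent[node]
    return node


states = infer_states(TREE, "root", {"A": True, "B": True, "C": False})
print(origin_node(TREE, "root", states, "A"))  # anc_AB
```

In this toy case the family is present in A and B but not in C, so the cheapest history is a single gain on the branch leading to their common ancestor, and the gene in A is dated to that node. Converting a node into a time range in million years would require a dated species tree, which this sketch leaves out.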
8.
9.
10.
Homology Gene List (HOMGL) is a web-based tool for comparing gene lists with different accession numbers and identifiers and between different organisms. UniGene, LocusLink, HomoloGene and Ensembl databases are utilized to map between these lists and to retrieve upstream or transcribed sequences for genes in these lists. We illustrate the use of HOMGL with respect to microarray studies and promoter analysis. AVAILABILITY: http://homgl.biologie.hu-berlin.de/
11.
Cuticchia AJ Kulkarni RD Parris WE Cooley PC Hall RD Silk GW 《Cytogenetic and genome research》2006,112(1-2):1-5
One result of the publishing of the human genome sequence is the ability to define objects through their position on the consensus sequence. While this has simplified the process of creating order maps for genes on a chromosome, it has created discrepancies between the published cytolocations of human genes, as presented through genetic references, and those locations derived computationally from the genomic sequence. For the 6,830 records with HUGO gene symbols shared between the online version of Mendelian Inheritance in Man and Ensembl, 18% of the records have a discrepancy of at least one cytogenetic band between the datasets. Discordance between data sets at this frequency would have a significant impact on the utility of datasets created by the amalgamation of numerous biological databases.
12.
Boeckmann B Robinson-Rechavi M Xenarios I Dessimoz C 《Briefings in bioinformatics》2011,12(5):423-435
Phylogenomic databases provide orthology predictions for species with fully sequenced genomes. Although the goal seems well-defined, the content of these databases differs greatly. Seven ortholog databases (Ensembl Compara, eggNOG, HOGENOM, InParanoid, OMA, OrthoDB, Panther) were compared on the basis of reference trees. For three well-conserved protein families, we observed a generally high specificity of orthology assignments for these databases. We show that differences in the completeness of predicted gene relationships and in the phylogenetic information are, for the great majority, not due to the methods used, but to differences in the underlying database concepts. According to our metrics, none of the databases provides a fully correct and comprehensive protein classification. Our results provide a framework for meaningful and systematic comparisons of phylogenomic databases. In the future, a sustainable set of 'Gold standard' phylogenetic trees could provide a robust method for phylogenomic databases to assess their current quality status, measure changes following new database releases and diagnose improvements subsequent to an upgrade of the analysis procedure.
13.
Anne Parker Eugene Bragin Simon Brent Bethan Pritchard James A Smith Stephen Trevanion 《BMC bioinformatics》2010,11(1):239
Background
The Ensembl web site has provided access to genomic information for almost 10 years. During this time the amount of data available through Ensembl has grown dramatically. At the same time, the World Wide Web itself has become a dramatically more important component of the scientific workflow and the way that scientists share and access data and scientific information.
14.
Protein interactions are fundamental to the molecular processes occurring within an organism and can be utilized in network biology to help organize, simplify, and understand biological complexity. Currently, there are more than 10 publicly available Arabidopsis (Arabidopsis thaliana) protein interaction databases. However, there are limitations with these databases, including different types of interaction evidence, a lack of defined standards for protein identifiers, differing levels of information, and, critically, a lack of integration between them. In this paper, we present an interactive bioinformatics Web tool, ANAP (Arabidopsis Network Analysis Pipeline), which serves to effectively integrate the different data sets and maximize access to available data. ANAP has been developed for Arabidopsis protein interaction integration and network-based study to facilitate functional protein network analysis. ANAP integrates 11 Arabidopsis protein interaction databases, comprising 201,699 unique protein interaction pairs, 15,208 identifiers (including 11,931 The Arabidopsis Information Resource Arabidopsis Genome Initiative codes), 89 interaction detection methods, 73 species that interact with Arabidopsis, and 6,161 references. ANAP can be used as a knowledge base for constructing protein interaction networks based on user input and supports both direct and indirect interaction analysis. It has an intuitive graphical interface allowing easy network visualization and provides extensive detailed evidence for each interaction. In addition, ANAP displays the gene and protein annotation in the generated interactive network with links to The Arabidopsis Information Resource, the AtGenExpress Visualization Tool, the Arabidopsis 1,001 Genomes GBrowse, the Protein Knowledgebase, the Kyoto Encyclopedia of Genes and Genomes, and the Ensembl Genome Browser to significantly aid functional network analysis. 
The tool is available open access at http://gmdd.shgmo.org/Computational-Biology/ANAP.
15.
Using genomic databases for sequence-based biological discovery   Total citations: 1, self-citations: 0, citations by others: 1
Baxevanis AD 《Molecular medicine (Cambridge, Mass.)》2003,9(9-12):185-192
The inherent potential underlying the sequence data produced by the International Human Genome Sequencing Consortium and other systematic sequencing projects is, obviously, tremendous. As such, it becomes increasingly important that all biologists have the ability to navigate through and cull important information from key publicly available databases. The continued rapid rise in available sequence information, particularly as model organism data is generated at breakneck speed, also underscores the necessity for all biologists to learn how to effectively make their way through the expanding "sequence information space." This review discusses some of the more commonly used tools for sequence discovery; tools have been developed for the effective and efficient mining of sequence information. These include LocusLink, which provides a gene-centric view of sequence-based information, as well as the 3 major genome browsers: the National Center for Biotechnology Information Map Viewer, the University of California Santa Cruz Genome Browser, and the European Bioinformatics Institute's Ensembl system. An overview of the types of information available through each of these front-ends is given, as well as information on tutorials and other documentation intended to increase the reader's familiarity with these tools.
16.
JCoDA: a tool for detecting evolutionary selection   Total citations: 1, self-citations: 0, citations by others: 1
Steven N Steinway Ruth Dannenfelser Christopher D Laucius James E Hayes Sudhir Nayak 《BMC bioinformatics》2010,11(1):284
Background
The incorporation of annotated sequence information from multiple related species in commonly used databases (Ensembl, Flybase, Saccharomyces Genome Database, Wormbase, etc.) has increased dramatically over the last few years. This influx of information has provided a considerable amount of raw material for evaluation of evolutionary relationships. To aid in the process, we have developed JCoDA (Java Codon Delimited Alignment) as a simple-to-use visualization tool for the detection of site specific and regional positive/negative evolutionary selection amongst homologous coding sequences.
17.
18.
Finding the position of a gene is now easily done when the genome sequence is available: the gene position is generally found by a simple query of genomic databases such as those available at the Ensembl browser or the NCBI. We were interested in determining the position of 125 cancer-related rat genes and we found that the position of most of these genes (110) could indeed be identified in this manner. However, in 15 cases, the gene position was not available in these databases, or the results were ambiguous. We then explored a more specialized database, namely the Rat Genome Database, and experimentally mapped these genes using standard and radiation cell hybrids. The 15 genes in question could be localized unambiguously. In four cases, the radiation cell hybrids were indispensable: the sequence of these four genes could not be found in the rat genome sequence. On the basis of the sample we examined, it thus appears that a classical gene mapping method is still required to localize about 3% of the rat genes, as if 3% of the rat gene sequences were lacking in the current rat genome sequence.
19.
20.
Mathew W. Wright Tina A. Eyre Michael J. Lush Sue Povey Elspeth A. Bruford 《Mammalian genome》2005,16(11):827-828
The HGNC Comparison of Orthology Predictions search tool, HCOP (), enables users to compare predicted human and mouse orthologs for a specified gene, or set of genes, from either species according to the ortholog assertions from the Ensembl, HGNC, Homologene, Inparanoid, MGI and PhIGs databases. Users can assess the reliability of the prediction from the number of these different sources that identify a particular orthologous pair. HCOP provides a useful one-stop resource to summarise, compare and access various sources of human and mouse orthology data.
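The idea of judging an orthology prediction by how many sources assert it can be sketched in a few lines of Python. This is only an illustration of the counting step, not HCOP's implementation. The gene identifiers and the per-source assertions in `ASSERTIONS` are invented placeholders, and `support_by_pair` and `rank_pairs` are hypothetical helper names.

```python
from collections import defaultdict

# Hypothetical ortholog assertions: source -> set of (human gene, mouse gene) pairs.
# Identifiers and assertions are placeholders, not real database content.
ASSERTIONS = {
    "Ensembl": {("HS_GENE1", "Mm_gene1"), ("HS_GENE2", "Mm_gene2")},
    "HGNC": {("HS_GENE1", "Mm_gene1")},
    "Homologene": {("HS_GENE1", "Mm_gene1"), ("HS_GENE2", "Mm_gene2")},
    "Inparanoid": {("HS_GENE1", "Mm_gene1")},
    "MGI": {("HS_GENE1", "Mm_gene1"), ("HS_GENE2", "Mm_gene2")},
    "PhIGs": {("HS_GENE2", "Mm_gene2"), ("HS_GENE2", "Mm_gene3")},
}


def support_by_pair(assertions):
    """Map each ortholog pair to the sorted list of sources that assert it."""
    support = defaultdict(list)
    for source, pairs in assertions.items():
        for pair in pairs:
            support[pair].append(source)
    return {pair: sorted(sources) for pair, sources in support.items()}


def rank_pairs(assertions):
    """Return (pair, number of supporting sources), best supported first."""
    support = support_by_pair(assertions)
    return sorted(
        ((pair, len(sources)) for pair, sources in support.items()),
        key=lambda item: (-item[1], item[0]),
    )


for pair, n_sources in rank_pairs(ASSERTIONS):
    print(pair, n_sources)
```

A pair asserted by most sources would be treated as more reliable than one asserted by a single source. The count is a rough guide only, since the sources are not necessarily independent of one another.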