Similar literature
 20 similar articles found
1.
2.
As the amount of biological data grows, so does the need for biologists to store and access this information in central repositories in a free and unambiguous manner. The European Bioinformatics Institute (EBI) hosts six core databases, which store information on DNA sequences (EMBL-Bank), protein sequences (SWISS-PROT and TrEMBL), protein structure (MSD), whole genomes (Ensembl) and gene expression (ArrayExpress). But just as a cell would be useless if it couldn't transcribe DNA or translate RNA, our resources would be compromised if each existed in isolation. We have therefore developed a range of tools that not only facilitate the deposition and retrieval of biological information, but also allow users to carry out searches that reflect the interconnectedness of biological information. The EBI's databases and tools are all available on our website at www.ebi.ac.uk.

3.
The Ensembl database makes genomic features available via its Genome Browser. It is also possible to access the underlying data through a Perl API for advanced querying. We have developed a full-featured Ruby API to the Ensembl databases, providing the same functionality as the Perl interface with additional features. A single Ruby API is used to access different releases of the Ensembl databases and is also able to query multi-species databases. Availability and Implementation: Most functionality of the API is provided using the ActiveRecord pattern. The library depends on introspection to make it release independent. The API is available through the Rubygem system and can be installed with the command gem install ruby-ensembl-api.

4.
biomaRt is a new Bioconductor package that integrates BioMart data resources with data analysis software in Bioconductor. It can annotate a wide range of gene or gene product identifiers (e.g. Entrez-Gene and Affymetrix probe identifiers) with information such as gene symbol, chromosomal coordinates, Gene Ontology and OMIM annotation. Furthermore, biomaRt enables retrieval of genomic sequences and single nucleotide polymorphism information, which can be used in data analysis. Fast and up-to-date data retrieval is possible as the package executes direct SQL queries to the BioMart databases (e.g. Ensembl). The biomaRt package provides a tight integration of large, public or locally installed BioMart databases with data analysis in Bioconductor, creating a powerful environment for biological data mining.

5.
A brief introduction to genome databases   Total citations: 1, self-citations: 0, citations by others: 1
方刚, 陈蕴佳, 高歌, 刘翟, 何坤, 吴昕, 顾孝诚, 罗静初 《遗传》 2003, 25(4): 440-444
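Based on three internationally known genome databases installed at the Centre of Bioinformatics, Peking University (GDB, GenoList and Ensembl), this paper introduces the genome databases in common use today, covering their content, data formats and usage, as well as the database management systems used to build them. Abstract: A brief introduction to the genome databases GDB, GenoList and Ensembl is given. These databases, mirrored and maintained at the Centre of Bioinformatics, Peking University, provide useful information for genome research.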

6.
SUMMARY: With the availability of whole genome sequence in many species, linkage analysis, positional cloning and microarray analysis are gradually becoming powerful tools for investigating the links between phenotype and genotype or genes. However, in these methods, causative genes underlying a quantitative trait locus, or a disease, are usually located within a large genomic region or a large set of genes. Examining the function of every gene is very time consuming and requires retrieving and integrating information from multiple databases or genome resources. PGMapper is a software tool for automatically matching phenotype to genes from a defined genome region or a group of given genes by combining the mapping information from the Ensembl database and gene function information from the OMIM and PubMed databases. PGMapper is currently available for candidate gene search of human, mouse, rat, zebrafish and 12 other species. AVAILABILITY: Available online at http://www.genediscovery.org/pgmapper/index.jsp.

7.
《遗传学报》 2021, 48(12): 1122-1129
The origination of new genes contributes to the biological diversity of life. New genes may quickly build their network, exert important functions, and generate novel phenotypes. Dating gene age and inferring the origination mechanisms of new genes, like primate-specific genes, is the basis for the functional study of the genes. However, no comprehensive resource of gene age estimates across species is available. Here, we systematically date the age of 9,102,113 protein-coding genes from 565 species in the Ensembl and Ensembl Genomes databases, including 82 bacteria, 57 protists, 134 fungi, 58 plants, 56 metazoa, and 178 vertebrates, using a protein-family-based pipeline with the Wagner parsimony algorithm. We also collect gene age estimate data from other studies and uniformly distribute the gene age estimates to time ranges in millions of years for comparison across studies. All the data are cataloged into GenOrigin (http://genorigin.chenzxlab.cn/), a user-friendly new database of gene age estimates, where users can browse gene age estimates by species, age, and gene ontology. In GenOrigin, information such as gene age estimates, annotation, gene ontology, ortholog, and paralog, as well as detailed gene presence/absence views for gene age inference based on the species tree with evolutionary timescale, is provided to researchers for exploring gene functions.
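
The idea of dating a gene from its presence/absence pattern on a species tree can be illustrated with a small weighted-parsimony sketch. The code below is not the GenOrigin pipeline: it is a generic Sankoff-style dynamic program with separate gain and loss penalties, and the penalty values, the toy tree, the species names, and all function names are assumptions made for illustration only.

```python
INF = float("inf")

def leaves(node):
    """Species under a node; a leaf is a string, an internal node is a tuple of children."""
    if isinstance(node, str):
        return (node,)
    return tuple(s for child in node for s in leaves(child))

def costs(node, present, gain, loss):
    """Bottom-up pass: minimal subtree cost if the node is absent (index 0) or present (index 1)."""
    if isinstance(node, str):
        return (INF, 0.0) if node in present else (0.0, INF)
    total = [0.0, 0.0]
    for child in node:
        c = costs(child, present, gain, loss)
        total[0] += min(c[0], c[1] + gain)  # parent absent: child stays absent or gains the gene
        total[1] += min(c[1], c[0] + loss)  # parent present: child keeps the gene or loses it
    return tuple(total)

def trace_gains(node, parent_state, present, gain, loss, out):
    """Top-down pass: choose each node's cheapest state and record branches with a gain.

    Costs are recomputed at every node for brevity, which is fine for a toy tree.
    """
    c = costs(node, present, gain, loss)
    if parent_state == 0:
        state = 1 if c[1] + gain < c[0] else 0
    else:
        state = 0 if c[0] + loss < c[1] else 1
    if parent_state == 0 and state == 1:
        out.append(leaves(node))  # gene inferred to originate on the branch above this node
    if not isinstance(node, str):
        for child in node:
            trace_gains(child, state, present, gain, loss, out)
    return out

def infer_gains(tree, present, gain=2.0, loss=1.0):
    """Return the clades (as tuples of species) where the gene is inferred to be gained."""
    # The gene is treated as absent above the root, so presence at the root costs one gain.
    return trace_gains(tree, 0, present, gain, loss, [])

# Hypothetical rooted tree and presence pattern.
tree = ((("human", "chimp"), "mouse"), "zebrafish")
origin = infer_gains(tree, {"human", "chimp", "mouse"})
print(origin)
```

In this toy case a single gain on the branch leading to the human, chimp and mouse ancestor is cheaper than a gain at the root followed by a loss in zebrafish, so the gene would be dated to that ancestor. A real analysis would map the inferred branch onto a timescale for the species tree.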

8.
9.
10.
Homology Gene List (HOMGL) is a web-based tool for comparing gene lists with different accession numbers and identifiers and between different organisms. UniGene, LocusLink, HomoloGene and Ensembl databases are utilized to map between these lists and to retrieve upstream or transcribed sequences for genes in these lists. We illustrate the use of HOMGL with respect to microarray studies and promoter analysis. AVAILABILITY: http://homgl.biologie.hu-berlin.de/

11.
One result of the publishing of the human genome sequence is the ability to define objects through their position on the consensus sequence. While this has simplified the process of creating order maps for genes on a chromosome, it has created discrepancies between the published cytolocations of human genes, as presented through genetic references, and those locations derived computationally from the genomic sequence. For the 6,830 records with HUGO gene symbols shared between the online version of Mendelian Inheritance in Man and Ensembl, 18% of the records have a discrepancy of at least one cytogenetic band between the datasets. Discordance between data sets at this frequency would have a significant impact on the utility of datasets created by the amalgamation of numerous biological databases.

12.
Phylogenomic databases provide orthology predictions for species with fully sequenced genomes. Although the goal seems well-defined, the content of these databases differs greatly. Seven ortholog databases (Ensembl Compara, eggNOG, HOGENOM, InParanoid, OMA, OrthoDB, Panther) were compared on the basis of reference trees. For three well-conserved protein families, we observed a generally high specificity of orthology assignments for these databases. We show that differences in the completeness of predicted gene relationships and in the phylogenetic information are, for the great majority, not due to the methods used, but to differences in the underlying database concepts. According to our metrics, none of the databases provides a fully correct and comprehensive protein classification. Our results provide a framework for meaningful and systematic comparisons of phylogenomic databases. In the future, a sustainable set of 'Gold standard' phylogenetic trees could provide a robust method for phylogenomic databases to assess their current quality status, measure changes following new database releases and diagnose improvements subsequent to an upgrade of the analysis procedure.

13.

Background  

The Ensembl web site has provided access to genomic information for almost 10 years. During this time the amount of data available through Ensembl has grown dramatically. At the same time, the World Wide Web itself has become a dramatically more important component of the scientific workflow and the way that scientists share and access data and scientific information.

14.
Wang C, Marshall A, Zhang D, Wilson ZA 《Plant Physiology》 2012, 158(4): 1523-1533
Protein interactions are fundamental to the molecular processes occurring within an organism and can be utilized in network biology to help organize, simplify, and understand biological complexity. Currently, there are more than 10 publicly available Arabidopsis (Arabidopsis thaliana) protein interaction databases. However, there are limitations with these databases, including different types of interaction evidence, a lack of defined standards for protein identifiers, differing levels of information, and, critically, a lack of integration between them. In this paper, we present an interactive bioinformatics Web tool, ANAP (Arabidopsis Network Analysis Pipeline), which serves to effectively integrate the different data sets and maximize access to available data. ANAP has been developed for Arabidopsis protein interaction integration and network-based study to facilitate functional protein network analysis. ANAP integrates 11 Arabidopsis protein interaction databases, comprising 201,699 unique protein interaction pairs, 15,208 identifiers (including 11,931 The Arabidopsis Information Resource Arabidopsis Genome Initiative codes), 89 interaction detection methods, 73 species that interact with Arabidopsis, and 6,161 references. ANAP can be used as a knowledge base for constructing protein interaction networks based on user input and supports both direct and indirect interaction analysis. It has an intuitive graphical interface allowing easy network visualization and provides extensive detailed evidence for each interaction. In addition, ANAP displays the gene and protein annotation in the generated interactive network with links to The Arabidopsis Information Resource, the AtGenExpress Visualization Tool, the Arabidopsis 1,001 Genomes GBrowse, the Protein Knowledgebase, the Kyoto Encyclopedia of Genes and Genomes, and the Ensembl Genome Browser to significantly aid functional network analysis. The tool is available open access at http://gmdd.shgmo.org/Computational-Biology/ANAP.

15.
Using genomic databases for sequence-based biological discovery   Total citations: 1, self-citations: 0, citations by others: 1
The inherent potential underlying the sequence data produced by the International Human Genome Sequencing Consortium and other systematic sequencing projects is, obviously, tremendous. As such, it becomes increasingly important that all biologists have the ability to navigate through and cull important information from key publicly available databases. The continued rapid rise in available sequence information, particularly as model organism data is generated at breakneck speed, also underscores the necessity for all biologists to learn how to effectively make their way through the expanding "sequence information space." This review discusses some of the more commonly used tools for sequence discovery; tools have been developed for the effective and efficient mining of sequence information. These include LocusLink, which provides a gene-centric view of sequence-based information, as well as the 3 major genome browsers: the National Center for Biotechnology Information Map Viewer, the University of California Santa Cruz Genome Browser, and the European Bioinformatics Institute's Ensembl system. An overview of the types of information available through each of these front-ends is given, as well as information on tutorials and other documentation intended to increase the reader's familiarity with these tools.

16.
JCoDA: a tool for detecting evolutionary selection   Total citations: 1, self-citations: 0, citations by others: 1

Background  

The incorporation of annotated sequence information from multiple related species in commonly used databases (Ensembl, Flybase, Saccharomyces Genome Database, Wormbase, etc.) has increased dramatically over the last few years. This influx of information has provided a considerable amount of raw material for evaluation of evolutionary relationships. To aid in the process, we have developed JCoDA (Java Codon Delimited Alignment) as a simple-to-use visualization tool for the detection of site specific and regional positive/negative evolutionary selection amongst homologous coding sequences.

17.
18.
Finding the position of a gene is now easily done when the genome sequence is available: the gene position is generally found by a simple query of genomic databases such as those available at the Ensembl browser or the NCBI. We were interested in determining the position of 125 cancer-related rat genes and we found that the position of most of these genes (110) could indeed be identified in this manner. However, in 15 cases, the gene position was not available in these databases, or the results were ambiguous. We then explored a more specialized database, namely the Rat Genome Database, and experimentally mapped these genes using standard and radiation cell hybrids. The 15 genes in question could be localized unambiguously. In four cases, the radiation cell hybrids were indispensable: the sequence of these four genes could not be found in the rat genome sequence. On the basis of the sample we examined, it thus appears that a classical gene mapping method is still required to localize about 3% of the rat genes, as if 3% of the rat gene sequences were lacking in the current rat genome sequence.
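
The "about 3%" figure in this abstract follows directly from the counts it reports. A minimal check of that arithmetic, using only the numbers stated above (the variable names are illustrative):

```python
# Counts taken from the abstract above.
total_genes = 125      # cancer-related rat genes examined
found_by_query = 110   # positions identified by a simple database query
needed_hybrids = 4     # genes whose sequence was absent from the rat genome sequence

unresolved = total_genes - found_by_query       # genes needing further work
missing_fraction = needed_hybrids / total_genes  # share requiring classical mapping

print(f"unresolved by simple query: {unresolved}")
print(f"genes requiring radiation hybrids: {missing_fraction:.1%}")
```

The 15 unresolved genes match the count given in the abstract, and 4 of 125 is 3.2%, which the authors round to about 3%.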

19.
20.
The HGNC Comparison of Orthology Predictions search tool, HCOP, enables users to compare predicted human and mouse orthologs for a specified gene, or set of genes, from either species according to the ortholog assertions from the Ensembl, HGNC, Homologene, Inparanoid, MGI and PhIGs databases. Users can assess the reliability of the prediction from the number of these different sources that identify a particular orthologous pair. HCOP provides a useful one-stop resource to summarise, compare and access various sources of human and mouse orthology data.
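
The reliability measure described here, the number of sources that assert a given orthologous pair, can be sketched as a simple tally. This is not HCOP's implementation or data format: the gene pairs below are made-up placeholders, and the dictionary layout and function name are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical assertions: source -> set of (human_gene, mouse_gene) pairs.
# The pairs are placeholders, not real HCOP output.
assertions = {
    "Ensembl": {("GENE_A", "Gene_a"), ("GENE_B", "Gene_b")},
    "HGNC": {("GENE_A", "Gene_a")},
    "Homologene": {("GENE_A", "Gene_a"), ("GENE_B", "Gene_b2")},
    "Inparanoid": {("GENE_A", "Gene_a"), ("GENE_B", "Gene_b")},
    "MGI": {("GENE_A", "Gene_a")},
    "PhIGs": {("GENE_B", "Gene_b")},
}

def support_by_pair(assertions):
    """Map each ortholog pair to the sorted list of sources that assert it."""
    support = defaultdict(list)
    for source, pairs in assertions.items():
        for pair in pairs:
            support[pair].append(source)
    return {pair: sorted(sources) for pair, sources in support.items()}

support = support_by_pair(assertions)

# Rank pairs by how many sources agree, most supported first.
ranked = sorted(support.items(), key=lambda kv: (-len(kv[1]), kv[0]))
for pair, sources in ranked:
    print(pair, len(sources), sources)
```

A pair asserted by five of six sources would be read as a more reliable prediction than one asserted by a single source, which is the comparison the abstract describes.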
