Similar literature
20 similar documents were found (search time: 578 ms).
1.
There is now a rapidly expanding population of interlinked developmental biology databases on the World Wide Web that can be readily accessed from a desktop PC using programs such as Netscape or Mosaic. These databases cover popular organisms (Arabidopsis, Caenorhabditis, Drosophila, zebrafish, mouse, etc.) and include gene and protein sequences, lists of mutants, information on resources and techniques, and teaching aids. More complex are databases relating domains of gene expression to embryonic anatomy; these range from existing text-based systems for specific organs such as the kidney to a massive project under development that will cover gene expression during the whole of mouse embryogenesis. In this brief article, we review selected examples of databases currently available, look forward to what will be available soon, and explain how to gain access to the World Wide Web.

2.
Protein glycosylation serves critical roles in the cellular and biological processes of many organisms. Aberrant glycosylation has been associated with many illnesses, including hereditary and chronic diseases such as cancer, cardiovascular diseases, neurological disorders, and immunological disorders. Emerging mass spectrometry (MS) technologies that enable the high-throughput identification of glycoproteins and glycans have accelerated the analysis and made possible the creation of dynamic and expanding databases. Although glycosylation-related databases have been established by many laboratories and institutions, they are not yet widely known in the community. Our study reviews 15 different publicly available databases and identifies their key elements so that users can identify the most applicable platform for their analytical needs. These databases include biological information on experimentally identified glycans and glycopeptides from various cells and organisms such as human, rat, mouse, fly and zebrafish. The features of these databases (7 for glycoproteomic data, 6 for glycomic data, and 2 for glycan-binding proteins) are summarized, including the enrichment techniques used for glycoproteome and glycan identification. Furthermore, databases such as Unipep, GlycoFly and GlycoFish, recently established by our group, are introduced. The unique features of each database, such as the analytical methods used and the bioinformatics tools available, are summarized. This information will be a valuable resource for the glycobiology community, as it presents the analytical methods and glycosylation-related databases together in one compendium. It will also represent a step towards the desired long-term goal of integrating the different glycosylation databases in order to better characterize and categorize glycoproteins and glycans for biomedical research.

3.
4.
euGenes is a genome information system and database that provides a common summary of eukaryote genes and genomes, at http://iubio.bio.indiana.edu/eugenes/. Seven popular genomes are included: human, mouse, fruitfly, Caenorhabditis elegans worm, Saccharomyces yeast, Arabidopsis mustard weed and zebrafish, with more planned. This information, automatically extracted and updated from several source databases, offers features not readily available through other genome databases to bioscientists looking for gene relationships across organisms. The database describes 150 000 known, predicted and orphan genes, using consistent gene names along with their homologies and associations with a standard vocabulary of molecular functions, cell locations and biological processes. Usable whole-genome maps including features, chromosome locations and molecular data integration are available, as are options to retrieve sequences from these genomes. Search and retrieval methods for these data are easy to use and efficient, allowing one to ask combined questions of sequence features, protein functions and other gene attributes, and fetch results in reports, computable tabular outputs or bulk database forms. These summarized data are useful for integration in other projects, such as gene expression databases. euGenes provides an extensible, flexible genome information system for many organisms.
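The entry above mentions computable tabular outputs intended for integration into other projects. As a purely illustrative sketch (the column names and layout below are assumptions, not the actual euGenes export format), such a tab-delimited gene report could be grouped by function term across organisms in a few lines of Python:

```python
# Minimal sketch: grouping genes from a hypothetical tab-delimited gene report
# by molecular-function term. The columns and values are invented for the example.
import csv
import io
from collections import defaultdict

report = """gene_id\tsymbol\torganism\tmolecular_function
G001\tshh\tzebrafish\tsignaling molecule
G002\tShh\tmouse\tsignaling molecule
G003\tCG1234\tfruitfly\ttranscription factor
"""

by_function = defaultdict(list)
for row in csv.DictReader(io.StringIO(report), delimiter="\t"):
    by_function[row["molecular_function"]].append((row["organism"], row["symbol"]))

for function, genes in by_function.items():
    print(function, "->", genes)
```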

5.
Graph-based analysis and visualization of experimental results with ONDEX
MOTIVATION: Assembling the relevant information needed to interpret the output from high-throughput, genome-scale experiments such as gene expression microarrays is challenging. Analysis reveals genes that show statistically significant changes in expression levels, but more information is needed to determine their biological relevance. The challenge is to bring these genes together with biological information distributed across hundreds of databases or buried in the scientific literature (millions of articles). Software tools are needed to automate this task, which at present is labor-intensive and requires considerable informatics and biological expertise. RESULTS: This article describes ONDEX and how it can be applied to the task of interpreting gene expression results. ONDEX is a database system that combines the features of semantic database integration and text mining with methods for graph-based analysis. An overview of the ONDEX system is presented, concentrating on recently developed features for graph-based analysis and visualization. A case study is used to show how ONDEX can help to identify causal relationships between stress response genes and metabolic pathways from gene expression data. ONDEX also discovered functional annotations for most of the genes that emerged as significant in the microarray experiment but were previously of unknown function.
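The graph-based integration described here can be illustrated with a small, hypothetical example: genes, annotations and pathways become nodes, imported or text-mined evidence becomes edges, and paths connect significant genes to metabolic pathways. The node names and edge sources below are invented, and this is not the ONDEX API, just a sketch of the underlying idea using networkx:

```python
# Illustrative sketch of graph-based integration: nodes for genes, proteins and
# pathways, edges carrying the source of evidence, and path queries linking
# significant genes to a pathway of interest.
import networkx as nx

g = nx.Graph()
g.add_edge("geneA", "protein_A", source="sequence_db")
g.add_edge("protein_A", "trehalose biosynthesis", source="pathway_db")
g.add_edge("geneB", "osmotic stress response", source="text_mining")
g.add_edge("osmotic stress response", "trehalose biosynthesis", source="literature")

significant_genes = ["geneA", "geneB"]
for gene in significant_genes:
    if nx.has_path(g, gene, "trehalose biosynthesis"):
        path = nx.shortest_path(g, gene, "trehalose biosynthesis")
        print(gene, "->", " -> ".join(path[1:]))
```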

6.
Bioinformatics is a central discipline in modern life sciences aimed at describing the complex properties of living organisms starting from large-scale data sets of cellular constituents such as genes and proteins. In order for this wealth of information to provide useful biological knowledge, databases and software tools for data collection, analysis and interpretation need to be developed. In this paper, we review recent advances in the design and implementation of bioinformatics resources devoted to the study of metals in biological systems, a research field traditionally at the heart of bioinorganic chemistry. We show how metalloproteomes can be extracted from genome sequences, how structural properties can be related to function, how databases can be implemented, and how hints on interactions can be obtained from bioinformatics.
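As a toy illustration of the first step mentioned above, extracting candidate metalloproteins from sequence data, the sketch below scans a tiny mock proteome for a single simplified Cys-x(2,4)-Cys motif. Real metalloproteome predictions rely on curated patterns, profiles and structural information; the regex and sequences here are assumptions made only for the example:

```python
# Toy sketch of mining candidate metal-binding proteins from sequence alone.
# The single Cys-x(2,4)-Cys regex is only meant to illustrate the principle.
import re

zinc_like = re.compile(r"C.{2,4}C")

proteome = {
    "P1": "MKTAYCAHCIDLLKQENSV",
    "P2": "MSTNPKPQRKTKRNTNRRP",
}

for pid, seq in proteome.items():
    hits = zinc_like.findall(seq)
    if hits:
        print(pid, "candidate metal-binding motif(s):", hits)
```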

7.
Enormous amounts of data result from genome sequencing projects and new experimental methods. Within this tremendous amount of genomic data, 30-40 per cent of the genes identified in an organism remain unknown in terms of their biological function. As a consequence of this lack of information, the overall schema of all the biological functions occurring in a specific organism cannot be properly represented. To understand the functional properties of the genomic data, more experimental data must be collected. A pathway database is an effort to handle the current knowledge of biochemical pathways, and it can in addition be used for the interpretation of sequence data. Some of the existing pathway databases can be interpreted as detailed functional annotations of genomes because they are tightly integrated with genomic information. However, experimental data are often lacking in these databases. This paper summarises a list of pathway databases and some of their corresponding biological databases, and focuses on the content and structure of these databases, the organisation of the data and the reliability of the stored information from a biological point of view. Moreover, information about the representation of the pathway data and tools to work with the data is given. Advantages and disadvantages of the analysed databases are pointed out, and an overview for biological scientists on how to use these pathway databases is given.

8.
With the explosive growth of biological data, the development of new means of data storage was needed. More and more often, biological information is no longer published in the conventional way, via a publication in a scientific journal, but is only deposited in a database. In the last two decades these databases have become essential tools for researchers in the biological sciences. Biological databases can be classified according to the type of information they contain. There are basically three types of sequence-related databases (nucleic acid sequences, protein sequences and protein tertiary structures) as well as various specialized data collections. It is important to provide the users of biomolecular databases with a degree of integration between these databases, as all of them are by nature connected in a scientific sense and each is an important piece of biological complexity. In this review we highlight our effort in connecting biological information, as demonstrated in the SWISS-PROT protein database.
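The kind of cross-database integration highlighted here can be pictured with a minimal sketch: a protein record carrying cross-references to nucleotide-sequence and structure entries, and a helper that resolves them. The accession numbers and record layout are invented for illustration and do not reflect the actual SWISS-PROT format:

```python
# Minimal sketch of cross-referencing between database types: a protein record
# that points to entries in nucleotide-sequence and 3D-structure databases.
records = {
    "PROT_EXAMPLE": {
        "description": "Hypothetical example protein",
        "cross_references": {
            "nucleotide": ["NUC000123"],
            "structure": ["1ABC"],
        },
    }
}

def linked_entries(protein_id, database):
    """Return the cross-referenced entries of one type for a protein record."""
    record = records.get(protein_id, {})
    return record.get("cross_references", {}).get(database, [])

print(linked_entries("PROT_EXAMPLE", "structure"))  # ['1ABC']
```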

9.
The ecoinformatics community recognizes that ecological synthesis across studies, space, and time will require new informatics tools and infrastructure. Recent advances have been encouraging, but many problems still face ecologists who manage their own datasets, prepare data for archiving, and search data stores for synthetic research. In this paper, we describe how work by the Canopy Database Project (CDP) might enable use of database technology by field ecologists: increasing the quality of database design, improving data validation, and providing structural and semantic metadata, all of which might improve the quality of data archives and thereby help drive ecological synthesis. The CDP has experimented with conceptual components for database design, called templates, to address the information technology issues facing ecologists. Templates represent forest structures and observational measurements on these structures. Using our software, researchers select templates to represent their study's data and can generate normalized relational databases. Information hidden in those databases is used by ancillary tools, including data intake forms, simple data validation, data visualization, and metadata export. The primary question we address in this paper is which templates are the right templates. We argue for defining simple templates (with relatively few attributes) that describe the domain's major entities, and for coupling those with focused and flexible observation templates. We present a conceptual model for the observation data type, and show how we have implemented the model as an observation entity in the DataBank database designer and generator. We show how our visualization tool CanopyView exploits metadata made explicit by DataBank to help scientists with analysis and synthesis. We conclude by presenting future plans for tools that conduct statistical calculations common to forest ecology and that enhance data mining with DataBank databases. DataBank could be extended to another domain by replacing our forest-ecology-specific templates with those for the new domain. This work extends the basic computer science idea of abstract data types and user-defined types to ecology-specific database design tools for individual users, and applies to ecoinformatics the software engineering innovations of domain-specific languages, software patterns, components, refactoring, and end-user programming.
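A minimal sketch of the template idea, assuming invented table and column names rather than DataBank's actual schema, is to pair a simple structure entity (a tree) with a flexible observation entity holding measurements on it, generated as a small normalized SQLite database:

```python
# Sketch: one "structure" entity plus a generic "observation" entity with a
# foreign key, as a tiny normalized relational schema.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tree (
    tree_id INTEGER PRIMARY KEY,
    species TEXT NOT NULL
);
CREATE TABLE observation (
    obs_id INTEGER PRIMARY KEY,
    tree_id INTEGER NOT NULL REFERENCES tree(tree_id),
    variable TEXT NOT NULL,     -- e.g. 'height'
    value REAL NOT NULL,
    unit TEXT NOT NULL
);
""")
conn.execute("INSERT INTO tree VALUES (1, 'Pseudotsuga menziesii')")
conn.execute("INSERT INTO observation VALUES (1, 1, 'height', 64.3, 'm')")
for row in conn.execute(
    "SELECT species, variable, value, unit FROM observation JOIN tree USING (tree_id)"
):
    print(row)
```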

10.
The non-coding fraction of the human genome, approximately 98%, is mainly constituted by repeats. Transpositions, expansions and deletions of these repeat elements contribute to a number of diseases. None of the available databases consolidates information on both tandem and interspersed repeats with the flexibility of FASTA-based homology search with reference to disease genes. The Repeats in Diseases database (RiDs db) is a web-accessible relational database that aids the analysis of repeats associated with Mendelian disorders. It is a repository of disease genes, which can be searched with the FASTA program or by limited or free-text keywords. Unlike other databases, RiDs db contains the sequences of these genes together with access to corresponding information on both the interspersed and tandem repeats contained within them, on a unified platform. Comparative analysis of novel or patient sequences against the reference sequences in RiDs db using a FASTA search will indicate any change in the structure of the repeats associated with a particular disorder. The database also provides links to orthologs in model organisms such as zebrafish, mouse and Drosophila. AVAILABILITY: The database is available for free at http://115.111.90.196/ridsdb/index.php.
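A hedged illustration of the kind of comparison this entry describes is to check whether a tandem repeat has expanded in a patient sequence relative to a reference; the CAG tract and sequences below are made up, and a real analysis would use the FASTA search against RiDs db itself:

```python
# Illustrative check for expansion of a CAG tract (as in several triplet-repeat
# disorders) in a patient sequence relative to a reference. Sequences are invented.
import re

def longest_cag_tract(seq):
    runs = re.findall(r"(?:CAG)+", seq)
    return max((len(run) // 3 for run in runs), default=0)

reference = "ATGCAGCAGCAGCAGCAGGATTACA"           # 5 CAG units
patient = "ATGCAGCAGCAGCAGCAGCAGCAGCAGGATTACA"    # 8 CAG units

ref_n, pat_n = longest_cag_tract(reference), longest_cag_tract(patient)
print(f"reference: {ref_n} repeats, patient: {pat_n} repeats")
if pat_n > ref_n:
    print("possible repeat expansion relative to the reference")
```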

11.
Post-translational modifications (PTMs) are critical regulators of protein function, and nearly 200 different types of PTM have been identified. Advances in high-resolution mass spectrometry have led to the identification of an unprecedented number of PTM sites in numerous organisms, potentially facilitating a more complete understanding of how PTMs regulate cellular behavior. While databases have been created to house the resulting data, most of these resources focus on individual types of PTM, do not consider quantitative PTM analyses, or do not provide tools for the visualization and analysis of PTM data. Here, we describe the Functional Analysis Tools for Post-Translational Modifications (FAT-PTM) database (https://bioinformatics.cse.unr.edu/fat-ptm/), which currently supports eight different types of PTM and over 49 000 PTM sites identified in large-scale proteomic surveys of the model organism Arabidopsis thaliana. The FAT-PTM database currently supports tools to visualize protein-centric PTM networks, quantitative phosphorylation site data from over 10 different quantitative phosphoproteomic studies, PTM information displayed in protein-centric metabolic pathways, and groups of proteins that are co-modified by multiple PTMs. Overall, the FAT-PTM database provides users with a robust platform to share and visualize experimentally supported PTM data, develop hypotheses related to target proteins or identify emergent patterns in PTM data for signaling and metabolic pathways.
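One of the analyses mentioned above, grouping proteins co-modified by multiple PTMs, can be sketched from a flat site table. The PTM assignments below are invented for the example (the identifiers merely follow the Arabidopsis locus format):

```python
# Minimal sketch: find proteins carrying more than one type of PTM,
# starting from a flat (protein, ptm_type, site) table.
from collections import defaultdict

ptm_sites = [
    ("AT1G01010", "phosphorylation", "S12"),
    ("AT1G01010", "ubiquitination", "K45"),
    ("AT2G02020", "phosphorylation", "T8"),
]

ptm_types_by_protein = defaultdict(set)
for protein, ptm_type, site in ptm_sites:
    ptm_types_by_protein[protein].add(ptm_type)

co_modified = {p: sorted(t) for p, t in ptm_types_by_protein.items() if len(t) > 1}
print(co_modified)  # {'AT1G01010': ['phosphorylation', 'ubiquitination']}
```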

12.
Community databases have become crucial to the collection, ordering and retrieval of data gathered on model organisms, as well as to the ways in which these data are interpreted and used across a range of research contexts. This paper analyses the impact of community databases on research practices in model organism biology by focusing on the history and current use of four community databases: FlyBase, Mouse Genome Informatics, WormBase and The Arabidopsis Information Resource. We discuss the standards used by the curators of these databases for what counts as reliable evidence, acceptable terminology, appropriate experimental set-ups and adequate materials (e.g., specimens). On the one hand, these choices are informed by the collaborative research ethos characterising most model organism communities. On the other hand, the deployment of these standards in databases reinforces this ethos and gives it concrete and precise instantiations by shaping the skills, practices, values and background knowledge required of the database users. We conclude that the increasing reliance on community databases as vehicles to circulate data is having a major impact on how researchers conduct and communicate their research, which affects how they understand the biology of model organisms and its relation to the biology of other species.

13.
The role of pattern databases in sequence analysis
In the wake of the numerous now-fruitful genome projects, we are entering an era rich in biological data. The field of bioinformatics is poised to exploit this information in increasingly powerful ways, but the abundance and growing complexity both of the data and of the tools and resources required to analyse them are threatening to overwhelm us. Databases and their search tools are now an essential part of the research environment. However, the rate of sequence generation and the haphazard proliferation of databases have made it difficult to keep pace with developments. In an age of information overload, researchers want rapid, easy-to-use, reliable tools for functional characterisation of newly determined sequences. But what are those tools? How do we access them? Which should we use? This review focuses on a particular type of database that is increasingly used in the task of routine sequence analysis: the so-called pattern database. The paper aims to provide an overview of the current status of pattern databases in common use, outlining the methods behind them and giving pointers on their diagnostic strengths and weaknesses.
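To make the idea of a pattern database concrete, the sketch below converts a PROSITE-style pattern string into a regular expression and scans a toy sequence with it. The converter handles only a small subset of the syntax, and the pattern and sequence are simplified assumptions, not a real database entry:

```python
# Sketch: turn a PROSITE-style pattern into a regex and match it against a sequence.
import re

def prosite_to_regex(pattern):
    out = []
    for element in pattern.rstrip(".").split("-"):
        count = ""
        if "(" in element:                  # e.g. x(2,4) or x(3)
            element, rep = element.split("(")
            count = "{" + rep.rstrip(")") + "}"
        if element == "x":                  # any residue
            out.append("." + count)
        elif element.startswith("{"):       # {PG} means "any residue except P or G"
            out.append("[^" + element.strip("{}") + "]" + count)
        else:                               # literal residue or [ABC] class
            out.append(element + count)
    return re.compile("".join(out))

motif = prosite_to_regex("C-x(2,4)-C-x(3)-[LIVMFYWC]")
print(bool(motif.search("AKCPNCDRAFAAKLE")))  # True for this toy sequence
```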

14.
Genome annotation conceptually consists of inferring and assigning biological information to gene products. Over the years, numerous pipelines and computational tools have been developed to automate this task and assist researchers in gaining knowledge about their target genes. However, even with these technological advances, manual annotation or manual curation is still necessary, in which the information attributed to the gene products is verified and enriched. Despite being called the gold-standard process for depositing data in a biological database, manual curation requires significant time and effort from researchers, who sometimes have to parse through numerous products in various public databases. To assist with this problem, we present CODON, a tool for manual curation of genomic data that is capable of performing the prediction and annotation process. This software uses a finite state machine in the prediction process and automatically annotates products based on information obtained from the Uniprot database. CODON is equipped with a simple and intuitive graphical interface that assists with manual curation, enabling the user to make decisions about the analysis based on information such as identity, alignment length, and the name of the organism in which the product obtained a match. Further, visual analysis of all matches found in the database is possible, which has a significant impact on the curation task, since the user has at their disposal all the information available for a given product. An analysis performed on eleven organisms was used to test the efficiency of this tool by comparing the prediction and annotation results from CODON to those from the NCBI and RAST platforms.
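The finite state machine mentioned here can be illustrated generically (this is a simplified sketch, not CODON's actual state machine): scan codon by codon, switch from a searching state to a coding state at ATG, and emit an open reading frame at a stop codon:

```python
# Generic finite-state-machine sketch for ORF finding in a single reading frame.
STOPS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, frame=0):
    state, start, orfs = "searching", None, []
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if state == "searching" and codon == "ATG":
            state, start = "coding", i
        elif state == "coding" and codon in STOPS:
            orfs.append((start, i + 3))
            state, start = "searching", None
    return orfs

dna = "ATGAAATTTTAACGGATGCCCGGGTGA"
print(find_orfs(dna))  # [(0, 12), (15, 27)]
```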

15.
In the last 10-15 years, many new technologies and approaches have been implemented in pharmaceutical industry research; these include high-throughput screening and combinatorial chemistry, which result in a rapidly growing amount of biological assay and structural data in corporate databases. Efficient use of the data from this growing data mountain is a key success factor: 'provide as much knowledge as possible as early as possible and therefore enable research teams to make the best possible decision whenever this decision can be supported by stored data'. Here, an approach started several years ago to obtain as much information as possible out of historical assay data stored in the corporate database is described. It will be shown how important careful preprocessing of the stored data is for enhancing its information content. Different possibilities for accessing and analyzing the preconditioned data are in place; some of these are described in the examples.
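As a hedged sketch of the preprocessing step this entry emphasises, assuming invented column names and values rather than any real corporate schema, historical assay records can be harmonised to a common unit and replicate measurements collapsed before analysis:

```python
# Illustrative preprocessing of historical assay data: harmonise units, then
# collapse replicate measurements per compound.
import pandas as pd

raw = pd.DataFrame({
    "compound": ["CPD-1", "CPD-1", "CPD-2"],
    "ic50": [500.0, 0.6, 2.0],
    "unit": ["nM", "uM", "uM"],
})

# Convert every measurement to a common unit (nM) before aggregating.
to_nm = {"nM": 1.0, "uM": 1000.0}
raw["ic50_nM"] = raw["ic50"] * raw["unit"].map(to_nm)

# Replicates collapse to a single value per compound (here a simple mean).
clean = raw.groupby("compound", as_index=False)["ic50_nM"].mean()
print(clean)
```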

16.
Leitner F, Valencia A. FEBS Letters 2008, 582(8): 1178-1181.
We propose that the combination of human expertise and automatic text-mining systems can be used to create a first generation of electronically annotated information (EAI) that can be added to journal abstracts and that is directly related to the information in the corresponding text. The first experiments have concentrated on the annotation of gene/protein names and the names of organisms, as these are the best-resolved problems. A second generation of systems could then attempt to address the annotation of protein interactions and protein/gene functions, a more difficult task for text-mining systems. EAI will permit easier categorization of this information, help in the evaluation of papers for their curation in databases, and be invaluable for maintaining the links between the information in databases and the facts described in the text. Additionally, it will contribute to the efforts towards completing database information and creating collections of annotated text that can be used to train new generations of text-mining systems. The recent introduction of the first meta-server for the annotation of biological text, with the possibility of collecting annotations from the available text-mining systems, adds credibility to the technical feasibility of this proposal.
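A first-generation EAI annotation of the kind proposed here can be sketched as a simple dictionary lookup over an abstract; real systems use far richer lexica and disambiguation, and the dictionaries and sentence below are invented for the example:

```python
# Toy sketch of tagging gene/protein and organism names in an abstract with
# a dictionary lookup, producing character-offset annotations.
import re

lexicon = {
    "gene/protein": {"BRCA1", "p53"},
    "organism": {"Mus musculus", "Drosophila melanogaster"},
}

abstract = "We show that p53 and BRCA1 interact in Mus musculus cells."

annotations = []
for entity_type, names in lexicon.items():
    for name in names:
        for match in re.finditer(re.escape(name), abstract):
            annotations.append((match.start(), match.end(), entity_type, name))

for start, end, entity_type, name in sorted(annotations):
    print(f"{start:3d}-{end:3d}  {entity_type:12s}  {name}")
```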

17.
18.
19.
20.
