Similar articles
20 similar articles found.
1.
Clinical proteomics is an emerging field that deals with the use of proteomic technologies for medical applications. With a major objective of identifying proteins that are involved in pathological processes or that could serve as biomarkers, this field is already gaining momentum. Consequently, clinical proteomics data are being generated at a rapid pace, although mechanisms for sharing such data with the biomedical community lag far behind. Most of these data are either provided as supplementary information through journal web sites or made available directly by the authors through their own web resources. Integration of these data within a single resource that displays information in the context of individual proteins is likely to enhance the use of proteomic data in biomedical research. Human Proteinpedia is one such portal that unifies human proteomic data under a single banner. The goal of this resource is ultimately to capture and integrate all proteomic data obtained from individual studies on normal and diseased tissues. We anticipate that harnessing these data will help prioritize experiments related to protein targets and also permit meta-analysis to uncover molecular signatures of disease. Finally, we encourage all biomedical investigators to maximize dissemination of their valuable proteomic data to the rest of the community by active participation in existing repositories such as Human Proteinpedia.

2.
3.
Although data deposition is not yet common practice in the field of proteomics, several mass spectrometry (MS)-based proteomics repositories are publicly available to the scientific community. The main existing resources are: the Global Proteome Machine Database (GPMDB), PeptideAtlas, the PRoteomics IDEntifications database (PRIDE), Tranche, and NCBI Peptidome. In this review the capabilities of each of these will be described, paying special attention to four key properties: data types stored, applicable data submission strategies, supported formats, and available data mining and visualization tools. Additionally, the data contents from model organisms will be enumerated for each resource. There are other valuable smaller and/or more specialized repositories, but they will not be covered in this review. Finally, the concept behind the ProteomeXchange consortium, a collaborative effort among the main resources in the field, will be introduced.

4.
MS-based proteomics generates rapidly increasing amounts of precise and quantitative information. Analysis of individual proteomic experiments has made great strides, but the crucial ability to compare and store information across different proteome measurements still presents many challenges. For example, it has been difficult to avoid contamination of databases with low quality peptide identifications, to control for the inflation in false positive identifications when combining data sets, and to integrate quantitative data. Although, for example, the contamination with low quality identifications has been addressed by joint analysis of deposited raw data in some public repositories, we reasoned that there should be a role for a database specifically designed for high resolution and quantitative data. Here we describe a novel database termed MaxQB that stores and displays collections of large proteomics projects and allows joint analysis and comparison. We demonstrate the analysis tools of MaxQB using proteome data of 11 different human cell lines and 28 mouse tissues. The database-wide false discovery rate is controlled by adjusting the project specific cutoff scores for the combined data sets. The 11 cell line proteomes together identify proteins expressed from more than half of all human genes. For each protein of interest, expression levels estimated by label-free quantification can be visualized across the cell lines. Similarly, the expression rank order and estimated amount of each protein within each proteome are plotted. We used MaxQB to calculate the signal reproducibility of the detected peptides for the same proteins across different proteomes. Spearman rank correlation between peptide intensity and detection probability of identified proteins was greater than 0.8 for 64% of the proteome, whereas a minority of proteins have negative correlation. 
This information can be used to pinpoint false protein identifications, independently of peptide database scores. The information contained in MaxQB, including high resolution fragment spectra, is accessible to the community via a user-friendly web interface at http://www.biochem.mpg.de/maxqb.
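The Spearman rank correlation mentioned in this abstract can be illustrated with a minimal sketch: it is the Pearson correlation computed on ranks rather than on raw values. The function names and the sample numbers below are hypothetical and are not taken from MaxQB; the abstract does not say how MaxQB itself computes the coefficient.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)

# Hypothetical illustration data (not from the paper): five peptides of one protein.
intensity = [1.2e6, 3.4e6, 8.9e5, 5.1e6, 2.2e6]
detection_probability = [0.41, 0.77, 0.30, 0.93, 0.58]
rho = spearman(intensity, detection_probability)
```

Because only the ordering of the values matters, the coefficient is insensitive to the very different scales of the two quantities, which is one reason a rank-based measure suits intensity data.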

5.
The Arabidopsis information portal (AIP), a resource expected to provide access to all community data and combine outputs into a single user-friendly interface, has emerged from community discussions over the last 23 months. These discussions began during two closely linked workshops in early 2010 that established the International Arabidopsis Informatics Consortium (IAIC). The design of the AIP will provide core functionality while remaining flexible to encourage multiple contributors and constant innovation. An IAIC-hosted Design Workshop in December 2011 proposed a structure for the AIP to provide a framework for the minimal components of a functional community portal while retaining flexibility to rapidly extend the resource to other species. We now invite broader participation in the AIP development process so that the resource can be implemented in a timely manner.

6.
Since its launch in 1993, the ExPASy server has been a reference in the proteomics world. ExPASy users access various databases, many dedicated tools, and lists of resources, among other services. A significant part of the available resources is devoted to two-dimensional electrophoresis data. Our latest contribution to the expansion of the pool of on-line proteomics data is the World-2DPAGE Constellation, accessible at http://world-2dpage.expasy.org/. It is composed of the established WORLD-2DPAGE List of 2-D PAGE database servers, the World-2DPAGE Portal that simultaneously queries proteomics databases world-wide, and the recently created World-2DPAGE Repository. The latter component is a public standards-compliant repository for gel-based proteomics data linked to protein identifications published in the literature. It has been set up using the Make2D-DB package, a software tool that helps build SWISS-2DPAGE-like databases on one's own Web site. The lack of the informatics infrastructure needed to build and run a dedicated website is no longer an obstacle to making proteomics data publicly accessible on the Internet.

7.

Background

In the 1980s, Geographical Information System (GIS) technology was broadly introduced into the geo-spatial community through the establishment of a strong GIS industry. This technology quickly disseminated across many countries, and has now become established as an important research, planning and commercial tool for a wider community that includes organisations in the public and private health sectors. The broad acceptance of GIS technology and the nature of its functionality have meant that numerous datasets have been created over the past three decades. Most of these datasets have been created independently, and without any structured documentation systems in place. However, search and retrieval systems can only work if there is a mechanism for a dataset's existence to be discovered, and this is where proper metadata creation and management can greatly help. This situation must be addressed through support mechanisms such as Web-based portal technologies, metadata editor tools, automation, metadata standards and guidelines, and collaborative efforts with relevant individuals and organisations. Engagement with data developers or administrators should also include a strategy of identifying the benefits associated with metadata creation and publication.

Findings

The establishment of numerous Spatial Data Infrastructures (SDIs), and other Internet resources, is a testament to the recognition of the importance of supporting good data management and sharing practices across the geographic information community. These resources extend to health informatics in support of research, public services and teaching and learning. This paper identifies many of these resources available to the UK academic health informatics community. It also reveals the reluctance of many spatial data creators across the wider UK academic community to use these resources to create and publish metadata, or deposit their data in repositories for sharing. The Go-Geo! service is introduced as an SDI developed to provide UK academia with the necessary resources to address the concerns surrounding metadata creation and data sharing. The Go-Geo! portal, Geodoc metadata editor tool, ShareGeo spatial data repository, and a range of other support resources, are described in detail.

Conclusions

This paper describes a variety of resources available for the health research and public health sector to use for managing and sharing their data. The Go-Geo! service is one resource which offers an SDI for the eclectic range of disciplines using GIS in UK academia, including health informatics. The benefits of data management and sharing are immense, and in these times of cost constraints, these resources can be seen as a way to find cost savings which can be reinvested in more research.

8.
Genomics, 2019, 111(6): 1923-1928
An online portal, accessible at URL: http://mail.nbfgr.res.in/FisOmics/, was developed that features different genomic databases and tools. The portal, named FisOmics, acts as a platform for sharing fish genomic sequences and related information, in addition to facilitating access to high-performance computational resources for genome and proteome data analyses. It provides the ability to query, analyse and visualize genomic sequences and related information. The featured databases in FisOmics are already in the World Wide Web domain. The aim of developing the portal was to provide a nodal point from which to access the featured databases and work conveniently. Presently, FisOmics includes databases on barcode sequences, microsatellite markers, mitogenome sequences, hypoxia-responsive genes and karyology of fishes. It also links to other molecular resources and reports on the on-going activities and research achievements.

9.
Centralisation of tools for analysis of genomic data is paramount in ensuring that research is always carried out on the latest available data. As such, World Wide Web sites providing a range of online analyses and displays of data can play a crucial role in guaranteeing consistency of in silico work. In this respect, the protozoan parasite research community is served by several resources, either focussing on data and tools for one species or taking a broader view and providing tools for analysis of data from many species, thereby facilitating comparative studies. In this paper, we give a broad overview of the online resources available. We then focus on the GeneDB project, detailing the features and tools currently available through it. Finally, we discuss data curation and its importance in keeping genomic data 'relevant' to the research community.

10.
Taverna: a tool for the composition and enactment of bioinformatics workflows   (Cited by: 12 total; 0 self-citations; 12 by others)
MOTIVATION: In silico experiments in bioinformatics involve the co-ordinated use of computational tools and information repositories. A growing number of these resources are being made available with programmatic access in the form of Web services. Bioinformatics scientists will need to orchestrate these Web services in workflows as part of their analyses. RESULTS: The Taverna project has developed a tool for the composition and enactment of bioinformatics workflows for the life sciences community. The tool includes a workbench application which provides a graphical user interface for the composition of workflows. These workflows are written in a new language called the simple conceptual unified flow language (Scufl), whereby each step within a workflow represents one atomic task. Two examples are used to illustrate the ease with which in silico experiments can be represented as Scufl workflows using the workbench application.
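The idea of a workflow in which each step is one atomic task can be sketched in a few lines. This is an illustration only: it is plain Python, not Scufl and not the Taverna API, and the step names, the `run_workflow` helper and the toy sequence are all hypothetical. Steps are ordinary functions, and a dependency map decides the order in which they run.

```python
from graphlib import TopologicalSorter

def run_workflow(steps, dependencies, inputs):
    """Run each step once every step it depends on has produced a result.

    steps:        step name -> function taking a dict of upstream results
    dependencies: step name -> set of upstream names (steps or workflow inputs)
    inputs:       workflow input name -> value
    """
    results = dict(inputs)
    for name in TopologicalSorter(dependencies).static_order():
        if name in results:  # workflow inputs are already available
            continue
        upstream = {dep: results[dep] for dep in dependencies.get(name, ())}
        results[name] = steps[name](upstream)
    return results

# Three atomic tasks chained into a small workflow (hypothetical example).
steps = {
    "uppercase": lambda up: up["sequence"].upper(),
    "gc_content": lambda up: sum(up["uppercase"].count(b) for b in "GC")
    / len(up["uppercase"]),
    "report": lambda up: f"{up['uppercase']}: GC={up['gc_content']:.2f}",
}
dependencies = {
    "uppercase": {"sequence"},
    "gc_content": {"uppercase"},
    "report": {"uppercase", "gc_content"},
}
results = run_workflow(steps, dependencies, {"sequence": "atgcgt"})
```

In a real workflow system the steps would typically be calls to remote Web services rather than local functions, but the ordering problem is the same.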

11.
Since the advent of public data repositories for proteomics data, readily accessible results from high-throughput experiments have been accumulating steadily. Several large-scale projects in particular have contributed substantially to the number of identifications available to the community. Despite the considerable body of information amassed, very few successful analyses have been performed and published on these data, leaving the ultimate value of these projects far below their potential. A prominent reason published proteomics data are seldom reanalyzed lies in the heterogeneous nature of the original sample collection and the subsequent data recording and processing. To illustrate that at least part of this heterogeneity can be compensated for, we here apply a latent semantic analysis to the data contributed by the Human Proteome Organization's Plasma Proteome Project (HUPO PPP). Interestingly, despite the broad spectrum of instruments and methodologies applied in the HUPO PPP, our analysis reveals several obvious patterns that can be used to formulate concrete recommendations for optimizing proteomics project planning as well as the choice of technologies used in future experiments. It is clear from these results that the analysis of large bodies of publicly available proteomics data by noise-tolerant algorithms such as latent semantic analysis holds great promise and is currently underexploited.

12.
SUMMARY: The visualization-aided exploration of complex datasets will allow the research community to formulate novel functional hypotheses leading to a better understanding of biological processes at all levels. Therefore, we have developed a web resource termed VIS-O-BAC, designed for the functional investigation, based on a graphical display, of expression data for model systems such as bacterial pathogens. Genome-scale datasets derived from typical 'omic' approaches can be explored directly with respect to three biologically relevant aspects: the genome structure (operon organization), the organization of genes in pathways (KEGG) and the gene function with Gene Ontology (GO) terms. The integrated viewers can be used in parallel and combine expression data and functional annotations from different external data repositories. The graphical visualizations evidently accelerate both the validation of regulatory information and the detection of affected biological processes. AVAILABILITY: http://leger2.gbf.de/cgi-bin/vis-o-bac.pl. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

13.
The future bioinformatics needs of the Arabidopsis community as well as those of other scientific communities that depend on Arabidopsis resources were discussed at a pair of recent meetings held by the Multinational Arabidopsis Steering Committee and the North American Arabidopsis Steering Committee. There are extensive tools and resources for information storage, curation, and retrieval of Arabidopsis data that have been developed over recent years primarily through the activities of The Arabidopsis Information Resource, the Nottingham Arabidopsis Stock Centre, and the Arabidopsis Biological Resource Center, among others. However, the rapid expansion in many data types, the international basis of the Arabidopsis community, and changing priorities of the funding agencies all suggest the need for changes in the way informatics infrastructure is developed and maintained. We propose that there is a need for a single core resource that is integrated into a larger international consortium of investigators. We envision this to consist of a distributed system of data, tools, and resources, accessed via a single information portal and funded by a variety of sources, under shared international management of an International Arabidopsis Informatics Consortium (IAIC). This article outlines the proposal for the development, management, operations, and continued funding for the IAIC.
The Multinational Arabidopsis Steering Committee (MASC) and the North American Arabidopsis Steering Committee (NAASC) hosted workshops in Nottingham, UK (April 15 to 16, 2010) and Washington DC (May 10 to 11, 2010) to consider the future bioinformatics needs of the Arabidopsis community as well as other science communities that depend vitally on Arabidopsis resources. The outcomes of both workshops were presented and discussed at the International Conference on Arabidopsis Research (ICAR) in Yokohama, Japan.
The focus of the workshops was on Arabidopsis because of its unique and essential role as a reference organism for all seed plant species. The development of the highly annotated “gold standard” Arabidopsis genome sequence has been an invaluable resource for plant and crop sciences. This platform provides important information and working practices for other species and for comparative genomic and evolutionary studies. Arabidopsis tools and resources for information storage, curation, and retrieval have been developed over recent years primarily through the activities of The Arabidopsis Information Resource (TAIR), the Nottingham Arabidopsis Stock Centre (NASC), and the Arabidopsis Biological Resource Center, among others. However, the Arabidopsis community and funding agencies recognize the need for a single data management infrastructure. The key challenge is to develop and fund this resource in a sustainable and transparent manner.
Global challenges surrounding food and energy security require intelligent plant breeding strategies that will be dependent on a central Arabidopsis information resource to aid our understanding of gene function and associated phenotype in many different environments. The knowledge accrued in Arabidopsis informs our understanding of the genetic basis of plant processes and crop traits. To date, this has accumulated primarily through analysis of single genes. However, gene products do not act alone but rather in complex interacting networks. Thus, the challenge for the Arabidopsis community is to understand this higher level of complexity, to a significant extent through the application of new high volume, quantitative experimental techniques. The goals of these efforts are to develop gene/protein/metabolite networks that will enable systems-level modeling of plant processes and ultimately to translate these findings to crop plants.
To achieve these goals, we must develop novel approaches to data management, integration, and access.
The UK workshop addressed three principal issues: the types of data generated by the Arabidopsis community, the types of data used by the community, and future needs of the community. The objective was to produce recommendations for the type of infrastructure necessary to address the challenges and opportunities associated with the application of new technologies and recommendations for a sustainable funding model to support this infrastructure. These recommendations were considered and expanded upon at the US workshop with the ultimate goal of generating solutions to the issues discussed in the first meeting. It was recognized that cohesive, cooperative, and long-term international collaboration will be critical to successfully maintain an Arabidopsis database infrastructure that is essential for plant biology research worldwide.
The workshop participants concluded that there is a continued need for a central Arabidopsis information resource, based on the productivity of the Arabidopsis community and the critical importance of the findings generated by this community. For example, ∼3000 Arabidopsis publications are currently published in peer-reviewed journals each year, a nearly 10-fold increase since the early 1990s; and in 2009, TAIR was accessed by 335,692 unique visitors and had nearly 20 million page views. Furthermore, the importance of a current, well-organized, and carefully curated Arabidopsis genome to researchers studying other plants, including crops, cannot be overstated. In the future, this resource should be part of a larger infrastructure that would be dynamic and responsive to new directions in plant biology research.

14.
An object model and database for functional genomics   (Cited by: 2 total; 0 self-citations; 2 by others)
MOTIVATION: Large-scale functional genomics analysis is now feasible and presents significant challenges in data analysis, storage and querying. Data standards are required to enable the development of public data repositories and to improve data sharing. There is an established data format for microarrays (microarray gene expression markup language, MAGE-ML) and a draft standard for proteomics (PEDRo). We believe that all types of functional genomics experiments should be annotated in a consistent manner, and we hope to open up new ways of comparing multiple datasets used in functional genomics. RESULTS: We have created a functional genomics experiment object model (FGE-OM), developed from the microarray model, MAGE-OM, and two models for proteomics, PEDRo and our own model (Gla-PSI, the Glasgow Proposal for the Proteomics Standards Initiative). FGE-OM comprises three namespaces representing (i) the parts of the model common to all functional genomics experiments; (ii) microarray-specific components; and (iii) proteomics-specific components. We believe that FGE-OM should initiate discussion about the contents and structure of the next version of MAGE and the future of proteomics standards. A prototype database called RNA And Protein Abundance Database (RAPAD), based on FGE-OM, has been implemented and populated with data from microbial pathogenesis. AVAILABILITY: FGE-OM and the RAPAD schema are available from http://www.gusdb.org/fge.html, along with a set of more detailed diagrams. RAPAD can be accessed by registration at the site.

15.
Since the publication of the human genome, two key points have emerged. First, it is still not certain which regions of the genome code for proteins. Second, the number of discrete protein-coding genes is far fewer than the number of different proteins. Proteomics has the potential to address some of these postgenomic issues if the obstacles that we face can be overcome in our efforts to combine proteomic and genomic data. There are many challenges associated with high-throughput and high-output proteomic technologies. Consequently, for proteomics to continue at its current growth rate, new approaches must be developed to ease data management and data mining. Initiatives have been launched to develop standard data formats for exchanging mass spectrometry proteomic data, including the Proteomics Standards Initiative formed by the Human Proteome Organization. Databases such as SwissProt and Uniprot are publicly available repositories for protein sequences annotated for function, subcellular location and known potential post-translational modifications. The availability of bioinformatics solutions is crucial for proteomics technologies to fulfil their promise of adding further definition to the functional output of the human genome. The aim of the Oxford Genome Anatomy Project is to provide a framework for integrating molecular, cellular, phenotypic and clinical information with experimental genetic and proteomics data. This perspective also discusses models to make the Oxford Genome Anatomy Project accessible and beneficial for academic and commercial research and development.

16.
17.

18.
The applications of functional genomics, proteomics and informatics to cancer research have yielded a tremendous amount of information, which is growing all the time. Much of this information is available publicly on the Internet and ranges from general information about different cancers from a patient or clinical viewpoint, through to databases suitable for cancer researchers of all backgrounds, to very specific sites dedicated to individual genes or molecules. A simple search for 'cancer' from a typical Web browser search engine yields more than half a million hits; an even more specific search for 'leukaemia' (>40 000 hits) or 'p53' (>5700 hits) yields far too many hits to allow one to identify particular sites of interest. This review aims to provide a brief guide to some of the resources and databases that can be used as springboards to home in rapidly on information relevant to many fields of cancer research. As such, this article will not focus on a single website but hopes to illustrate some of the ways that postgenomic biology is revolutionizing cancer research. It will cover genomics and proteomics approaches that have been applied to studying global expression patterns in cancers, in addition to providing links ranging from general information about cancer to specific cancer gene mutation databases.

19.
20.
CressExpress is a user-friendly, online, coexpression analysis tool for Arabidopsis (Arabidopsis thaliana) microarray expression data that computes patterns of correlated expression between user-entered query genes and the rest of the genes in the genome. Unlike other coexpression tools, CressExpress allows characterization of tissue-specific coexpression networks through user-driven filtering of input data based on sample tissue type. CressExpress also performs pathway-level coexpression analysis on each set of query genes, identifying and ranking genes based on their common connections with two or more query genes. This allows identification of novel candidates for involvement in common processes and functions represented by the query group. Users launch experiments using an easy-to-use Web-based interface and then receive the full complement of results, along with a record of tool settings and parameters, via an e-mail link to the CressExpress Web site. Data sets featured in CressExpress are strictly versioned and include expression data from MAS5, GCRMA, and RMA array processing algorithms. To demonstrate applications for CressExpress, we present coexpression analyses of cellulose synthase genes, indolic glucosinolate biosynthesis, and flowering. We show that subselecting sample types produces a richer network for genes involved in flowering in Arabidopsis. CressExpress provides direct access to expression values via an easy-to-use URL-based Web service, allowing users to determine quickly if their query genes are coexpressed with each other and likely to yield informative pathway-level coexpression results. The tool is available at http://www.cressexpress.org.
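The pathway-level step described in this abstract, ranking genes by their common connections with two or more query genes, can be sketched as follows. This is a simplified illustration under stated assumptions and not CressExpress's actual implementation: the use of Pearson correlation, the 0.7 threshold, the gene names and the expression values are all hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length expression profiles."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    var_x = sum((a - mx) ** 2 for a in xs)
    var_y = sum((b - my) ** 2 for b in ys)
    return cov / sqrt(var_x * var_y)

def common_partners(expression, query_genes, threshold=0.7, min_links=2):
    """Rank non-query genes by how many query genes they are coexpressed with.

    A gene counts as connected to a query gene when their correlation is at
    least `threshold` (an assumed cutoff). Only genes connected to at least
    `min_links` query genes are kept.
    """
    hits = {}
    for gene, profile in expression.items():
        if gene in query_genes:
            continue
        linked = [q for q in query_genes
                  if pearson(profile, expression[q]) >= threshold]
        if len(linked) >= min_links:
            hits[gene] = linked
    # Most connections first; ties broken alphabetically for a stable order.
    return sorted(hits.items(), key=lambda item: (-len(item[1]), item[0]))

# Hypothetical expression values across five samples.
expression = {
    "query_1": [1, 2, 3, 4, 5],
    "query_2": [2, 4, 6, 8, 11],
    "query_3": [1, 3, 2, 5, 4],
    "cand_a": [2, 3, 4, 5, 6],   # follows all three query genes
    "cand_b": [5, 4, 3, 2, 1],   # anticorrelated with the query genes
    "cand_c": [1, 1, 2, 2, 6],   # follows query_1 and query_2 only
}
ranked = common_partners(expression, ["query_1", "query_2", "query_3"])
```

Restricting the sample columns before computing the correlations would correspond to the tissue-specific filtering the abstract describes.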
