Similar Literature
 20 similar records found (search time: 31 ms)
1.
Ecoinformatics: supporting ecology as a data-intensive science (cited 2 times: 0 self-citations, 2 by others)
Ecology is evolving rapidly and increasingly changing into a more open, accountable, interdisciplinary, collaborative and data-intensive science. Discovering, integrating and analyzing massive amounts of heterogeneous data are central to ecology as researchers address complex questions at scales from the gene to the biosphere. Ecoinformatics offers tools and approaches for managing ecological data and transforming the data into information and knowledge. Here, we review the state-of-the-art and recent advances in ecoinformatics that can benefit ecologists and environmental scientists as they tackle increasingly challenging questions that require voluminous amounts of data across disciplines and scales of space and time. We also highlight the challenges and opportunities that remain.

2.
Ecological Informatics, 2012, 7(6): 345–353
Camera traps and the images they generate are becoming an essential tool for field biologists studying and monitoring terrestrial animals, in particular medium to large terrestrial mammals and birds. In the last five years, camera traps have made the transition to digital technology, and these devices now produce hundreds of instantly available images per month and a large amount of ancillary metadata (e.g., date, time, temperature, image size, etc.). Despite this accelerated pace in the development of digital image capture, field biologists still lack adequate software solutions to process and manage the increasing amount of information in a cost-efficient way. In this paper we describe a software system that we have developed, called DeskTEAM, to address this issue. DeskTEAM has been developed in the context of the Tropical Ecology Assessment and Monitoring Network (TEAM), a global network that monitors terrestrial vertebrates. We describe the software architecture and functionality and its utility in managing and processing large amounts of digital camera trap data collected throughout the global TEAM network. DeskTEAM incorporates software features and functionality that make it relevant to the broad camera trapping community.
These include the ability to run the application locally on a laptop or desktop computer, without requiring an Internet connection, as well as the ability to run on multiple operating systems; an intuitive navigational user interface with multiple levels of detail (from individual images to whole groups of images) which allows users to easily manage hundreds or thousands of images; the ability to automatically extract EXIF and custom metadata information from digital images to increase standardization; the availability of embedded taxonomic lists to allow users to easily tag images with species identities; and the ability to export data packages consisting of data, metadata and images in standardized formats so that they can be transferred to online data warehouses for easy archiving and dissemination. Lastly, building these software tools for wildlife scientists provides valuable lessons for the ecoinformatics community.
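The data-package export described above can be sketched with nothing more than the Python standard library; the file layout and field names below are illustrative assumptions, not DeskTEAM's actual format:

```python
import json
import zipfile

def export_data_package(images, metadata, out_path="team_package.zip"):
    """Bundle camera-trap images and their metadata into one archive,
    mirroring the 'data package' export idea (hypothetical layout)."""
    with zipfile.ZipFile(out_path, "w") as zf:
        # Machine-readable metadata: species tags, timestamps, camera ID
        zf.writestr("metadata.json", json.dumps(metadata, indent=2))
        # Image payload; placeholder bytes stand in for real JPEG data
        for name, payload in images.items():
            zf.writestr(f"images/{name}", payload)
    return out_path

# Invented example records for illustration only
meta = [{"image": "IMG_0001.jpg", "species": "Panthera onca",
         "datetime": "2011-07-04T06:12:00", "camera": "CT-12"}]
imgs = {"IMG_0001.jpg": b"...jpeg bytes..."}
pkg = export_data_package(imgs, meta)
```

A single self-describing archive like this is what makes hand-off to an online data warehouse straightforward: the receiving side needs no side-channel to interpret the images.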

3.
Researchers are perpetually amassing biological sequence data. The computational approaches employed by ecologists for organizing this data (e.g. alignment, phylogeny, etc.) typically scale nonlinearly in execution time with the size of the dataset. This often serves as a bottleneck for processing experimental data since many molecular studies are characterized by massive datasets. To keep up with experimental data demands, ecologists are forced to choose between continually upgrading expensive in-house computer hardware or outsourcing the most demanding computations to the cloud. Outsourcing is attractive since it is the least expensive option, but does not necessarily allow direct user interaction with the data for exploratory analysis. Desktop analytical tools such as ARB are indispensable for this purpose, but they do not necessarily offer a convenient solution for the coordination and integration of datasets between local and outsourced destinations. Therefore, researchers are currently left with an undesirable tradeoff between computational throughput and analytical capability. To mitigate this tradeoff we introduce a software package to leverage the utility of the interactive exploratory tools offered by ARB with the computational throughput of cloud-based resources. Our pipeline serves as middleware between the desktop and the cloud allowing researchers to form local custom databases containing sequences and metadata from multiple resources and a method for linking data outsourced for computation back to the local database. A tutorial implementation of the toolkit is provided in the supporting information, S1 Tutorial. Availability: http://www.ece.drexel.edu/gailr/EESI/tutorial.php.

4.
Ten years after DNA barcoding was initially suggested as a tool to identify species, millions of barcode sequences from more than 1100 species are available in public databases. While several studies have reviewed the methods and potential applications of DNA barcoding, most have focused on species identification and discovery, and relatively few have addressed applications of DNA barcoding data to ecology. These data, and the associated information on the evolutionary histories of taxa that they can provide, offer great opportunities for ecologists to investigate questions that were previously difficult or impossible to address. We present an overview of potential uses of DNA barcoding relevant in the age of ecoinformatics, including applications in community ecology, species invasion, macroevolution, trait evolution, food webs and trophic interactions, metacommunities, and spatial ecology. We also outline some of the challenges and potential advances in DNA barcoding that lie ahead.

5.
Modern biological and chemical studies rely on life science databases as well as sophisticated software tools (e.g., homology search tools, modeling and visualization tools). These tools often have to be combined and integrated in order to support a given study. SIBIOS (System for the Integration of Bioinformatics Services) serves this purpose. These services comprise both life science database searches and software tools. The task engine is the core component of SIBIOS. It supports the execution of dynamic workflows that incorporate multiple bioinformatics services. This paper presents the architecture of SIBIOS and its approaches to addressing the heterogeneity and interoperability of bioinformatics services, including data integration.

6.
Dynamic publication model for neurophysiology databases (cited 2 times: 0 self-citations, 2 by others)
We have implemented a pair of database projects, one serving cortical electrophysiology and the other invertebrate neurones and recordings. The design for each combines aspects of two proven schemes for information interchange. The journal article metaphor determined the type, scope, organization and quantity of data to comprise each submission. Sequence databases encouraged intuitive tools for data viewing, capture, and direct submission by authors. Neurophysiology required transcending these models with new datatypes. Time-series, histogram and bivariate datatypes, including illustration-like wrappers, were selected by their utility to the community of investigators. As interpretation of neurophysiological recordings depends on context supplied by metadata attributes, searches are via visual interfaces to sets of controlled-vocabulary metadata trees. Neurones, for example, can be specified by metadata describing functional and anatomical characteristics. Permanence is advanced by data model and data formats largely independent of contemporary technology or implementation, including Java and the XML standard. All user tools, including dynamic data viewers that serve as a virtual oscilloscope, are Java-based, free, multiplatform, and distributed by our application servers to any contemporary networked computer. Copyright is retained by submitters; viewer displays are dynamic and do not violate copyright of related journal figures. Panels of neurophysiologists view and test schemas and tools, enhancing community support.

7.

Background

Epidemiologists and ecologists often collect data in the field and, on returning to their laboratory, enter their data into a database for further analysis. The recent introduction of mobile phones that utilise the open source Android operating system, and which include (among other features) both GPS and Google Maps, provides new opportunities for developing mobile phone applications which, in conjunction with web applications, allow two-way communication between field workers and their project databases.

Methodology

Here we describe a generic framework, consisting of mobile phone software, EpiCollect, and a web application located within www.spatialepidemiology.net. Data collected by multiple field workers can be submitted by phone, together with GPS data, to a common web database and can be displayed and analysed, along with previously collected data, using Google Maps (or Google Earth). Similarly, data from the web database can be requested and displayed on the mobile phone, again using Google Maps. Data filtering options allow the display of data submitted by the individual field workers or, for example, those data within certain values of a measured variable or a time period.
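As a rough illustration of the filtering step described above, a minimal Python sketch; the record structure and field names are hypothetical, not EpiCollect's actual schema:

```python
from datetime import datetime

# Invented GPS-tagged field records, as they might arrive from phones
records = [
    {"worker": "A", "lat": -1.29, "lon": 36.82,
     "timestamp": "2009-06-01T09:30:00", "cases": 4},
    {"worker": "B", "lat": -1.31, "lon": 36.80,
     "timestamp": "2009-06-15T14:00:00", "cases": 12},
    {"worker": "A", "lat": -1.28, "lon": 36.85,
     "timestamp": "2009-07-02T11:15:00", "cases": 7},
]

def filter_records(records, worker=None, start=None, end=None, min_cases=None):
    """Replicate the display filters: by field worker, time window,
    or a threshold on a measured variable."""
    out = []
    for r in records:
        ts = datetime.fromisoformat(r["timestamp"])
        if worker is not None and r["worker"] != worker:
            continue
        if start is not None and ts < start:
            continue
        if end is not None and ts > end:
            continue
        if min_cases is not None and r["cases"] < min_cases:
            continue
        out.append(r)
    return out

# e.g. only the June submissions, ready for display on a map layer
june = filter_records(records,
                      start=datetime(2009, 6, 1),
                      end=datetime(2009, 6, 30))
```

The same predicate logic can run server-side before records are returned to a phone, which keeps the payload small on slow mobile links.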

Conclusions

Data collection frameworks utilising mobile phones with data submission to and from central databases are widely applicable and can give field workers display and analysis tools on their mobile phones similar to those they would have when viewing the data in the laboratory via the web. We demonstrate their utility for epidemiological data collection and display, and briefly discuss their application in ecological and community data collection. Furthermore, such frameworks offer great potential for recruiting ‘citizen scientists’ to contribute data easily to central databases through their mobile phone.

8.
9.
The construction and analysis of networks is increasingly widespread in biological research. We have developed esyN (“easy networks”) as a free and open source tool to facilitate the exchange of biological network models between researchers. esyN acts as a searchable database of user-created networks from any field. We have developed a simple companion web tool that enables users to view and edit networks using data from publicly available databases. Both normal interaction networks (graphs) and Petri nets can be created. In addition to its basic tools, esyN contains a number of logical templates that can be used to create models more easily. The ability to use previously published models as building blocks makes esyN a powerful tool for the construction of models and network graphs. Users are able to save their own projects online and share them either publicly or with a list of collaborators. The latter can be given the ability to edit the network themselves, allowing online collaboration on network construction. esyN is designed to facilitate unrestricted exchange of this increasingly important type of biological information. Ultimately, the aim of esyN is to bring the advantages of Open Source software development to the construction of biological networks.

10.
Kebing Yu & Arthur R. Salomon. Proteomics, 2010, 10(11): 2113–2122
Recent advances in the speed and sensitivity of mass spectrometers and in analytical methods, the exponential acceleration of computer processing speeds, and the availability of genomic databases from an array of species and protein information databases have led to a deluge of proteomic data. The development of a lab-based automated proteomic software platform for the automated collection, processing, storage, and visualization of expansive proteomic data sets is critically important. The high-throughput autonomous proteomic pipeline described here is designed from the ground up to provide critically important flexibility for diverse proteomic workflows and to streamline the total analysis of a complex proteomic sample. This tool is composed of software that controls the acquisition of mass spectral data along with automation of post-acquisition tasks such as peptide quantification, clustered MS/MS spectral database searching, statistical validation, and data exploration within a user-configurable lab-based relational database. The software design of the high-throughput autonomous proteomic pipeline focuses on accommodating diverse workflows and providing missing software functionality to a wide range of proteomic researchers to accelerate the extraction of biological meaning from immense proteomic data sets. Although individual software modules in our integrated technology platform may have some similarities to existing tools, the true novelty of the approach described here is in the synergistic and flexible combination of these tools to provide an integrated and efficient analysis of proteomic samples.

11.
Existing software tools for topology-based pathway enrichment analysis are either computationally inefficient, suffer from poor statistical power, or require expert knowledge to leverage the methods’ capabilities. To address these limitations, we have overhauled NetGSA, an existing topology-based method, to provide a computationally efficient, user-friendly tool that offers interactive visualization. Pathway enrichment analysis for thousands of genes can be performed in minutes on a personal computer without sacrificing statistical power. The new software also removes the need for expert knowledge by directly curating gene-gene interaction information from multiple external databases. Lastly, by utilizing the capabilities of Cytoscape, the new software also offers interactive and intuitive network visualization.

12.
signatureSearch is an R/Bioconductor package that integrates a suite of existing and novel algorithms into an analysis environment for gene expression signature (GES) searching combined with functional enrichment analysis (FEA) and visualization methods to facilitate the interpretation of the search results. In a typical GES search (GESS), a query GES is searched against a database of GESs obtained from large numbers of measurements, such as different genetic backgrounds, disease states and drug perturbations. Database matches sharing correlated signatures with the query indicate related cellular responses frequently governed by connected mechanisms, such as drugs mimicking the expression responses of a disease. To identify which processes are predominantly modulated in the GESS results, we developed specialized FEA methods combined with drug-target network visualization tools. The provided analysis tools are useful for studying the effects of genetic, chemical and environmental perturbations on biological systems, as well as for searching single-cell GES databases to identify novel network connections or cell types. The signatureSearch software is unique in that it provides access to an integrated environment for GESS/FEA routines that includes several novel search and enrichment methods, efficient data structures, and access to pre-built GES databases, while also allowing users to work with custom databases.
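A minimal sketch of the correlation step in a GES search, assuming Spearman correlation as the similarity measure (signatureSearch offers several search methods; the query and database values below are invented for illustration):

```python
from statistics import mean

def ranks(xs):
    """Rank values (1 = smallest); ties are not handled in this sketch."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    for rank, i in enumerate(order, start=1):
        r[i] = float(rank)
    return r

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical query signature and GES database (log fold-changes, 5 genes)
query = [2.1, -1.3, 0.4, 3.0, -2.2]
database = {
    "drug_A": [1.9, -1.1, 0.2, 2.7, -2.0],   # mimics the query response
    "drug_B": [-2.0, 1.4, -0.3, -2.9, 2.1],  # reverses the query response
}
hits = sorted(database, key=lambda k: spearman(query, database[k]),
              reverse=True)
```

Ranking the database by correlation surfaces both mimics (high positive scores, e.g. a drug reproducing a disease signature) and reversers (strong negative scores, candidate therapeutics) of the query.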

13.
Predicting the distribution of metabolic fluxes in biochemical networks is of major interest in systems biology. Several databases provide metabolic reconstructions for different organisms. Software to analyze flux distributions exists, among others for the proprietary MATLAB environment. Given the large user community for the R computing environment, a simple implementation of flux analysis in R appears desirable and will facilitate easy interaction with computational tools to handle gene expression data. We extended the R software package BiGGR, an implementation of metabolic flux analysis in R. BiGGR makes use of public metabolic reconstruction databases, and contains the BiGG database and the reconstruction of human metabolism Recon2 as Systems Biology Markup Language (SBML) objects. Models can be assembled by querying the databases for pathways, genes or reactions of interest. Fluxes can then be estimated by maximization or minimization of an objective function using linear inverse modeling algorithms. Furthermore, BiGGR provides functionality to quantify the uncertainty in flux estimates by sampling the constrained multidimensional flux space. As a result, ensembles of possible flux configurations are constructed that agree with measured data within precision limits. BiGGR also features automatic visualization of selected parts of metabolic networks using hypergraphs, with hyperedge widths proportional to estimated flux values. BiGGR supports import and export of models encoded in SBML and is therefore interoperable with different modeling and analysis tools. As an application example, we calculated the flux distribution in healthy human brain using a model of central carbon metabolism. We introduce a new algorithm termed Least-squares with equalities and inequalities Flux Balance Analysis (Lsei-FBA) to predict flux changes from gene expression changes, for instance during disease. 
Our estimates of brain metabolic flux pattern with Lsei-FBA for Alzheimer’s disease agree with independent measurements of cerebral metabolism in patients. This second version of BiGGR is available from Bioconductor.
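The flux estimation step described above is, at its core, a linear program; a generic sketch of the standard flux balance analysis formulation (the notation here is generic, not BiGGR's own):

```latex
\max_{v}\; c^{\top} v
\quad \text{subject to} \quad
S\,v = 0, \qquad v_{\min} \le v \le v_{\max}
```

where \(S\) is the stoichiometric matrix of the reconstruction, \(v\) the vector of reaction fluxes, \(c\) the objective weights, and the bounds encode reaction reversibility and any measured constraints. Sampling the feasible region of this program is what yields the ensembles of flux configurations mentioned above.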

14.
Paleoecoinformatics, the development and use of paleoecological databases and tools, has a strong tradition of community support, and is rapidly growing as new tools bring new opportunities to advance understanding of ecological and evolutionary dynamics, from regional to global scales, and across the entire history of life on earth. Paleoecoinformatics occupies the intersection of ecoinformatics and geoinformatics. All face shared challenges, including developing frameworks to store heterogeneous and dynamic datasets, facilitating data contributions, linking heterogeneous data, and tracking improvements in data quality and scientific understanding. The three communities will benefit from increased engagement and exchange.

15.
National/ethnic mutation databases aim to document the genetic heterogeneity in various populations and ethnic groups worldwide. We have previously reported the development and upgrade of FINDbase (www.findbase.org), a database recording causative mutations and pharmacogenomic marker allele frequencies in various populations around the globe. Although this database has recently been upgraded, we continuously try to enhance its functionality by providing more advanced visualization tools that would further assist effective data querying and comparisons. We are currently experimenting with various visualization techniques on the existing FINDbase causative mutation data collection, aiming to provide a dynamic research tool for the worldwide scientific community. We have developed an interactive web-based application for population-based mutation data retrieval. It supports sophisticated data exploration, allowing users to apply advanced filtering criteria upon a set of multiple views of the underlying data collection, and enables browsing the relationships between individual datasets in a novel and meaningful way.

16.
We argue the significance of a fundamental shift in bioinformatics, from in-the-small to in-the-large. Adopting a large-scale perspective is a way to manage the problems endemic to the world of the small: constellations of incompatible tools for which the effort required to assemble an integrated system exceeds the perceived benefit of the integration. Where bioinformatics in-the-small is about data and tools, bioinformatics in-the-large is about metadata and dependencies. Dependencies represent the complexities of large-scale integration, including the requirements and assumptions governing the composition of tools. The popular make utility is a very effective system for defining and maintaining simple dependencies, and it offers a number of insights about the essence of bioinformatics in-the-large. Keeping an in-the-large perspective has been very useful to us in large bioinformatics projects. We give two fairly different examples, and extract lessons from them showing how it has helped. These examples both suggest the benefit of explicitly defining and managing knowledge flows and knowledge maps (which represent metadata regarding types, flows, and dependencies), and also suggest approaches for developing bioinformatics database systems. Generally, we argue that large-scale engineering principles can be successfully adapted from disciplines such as software engineering and data management, and that having an in-the-large perspective will be a key advantage in the next phase of bioinformatics development.
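The make-style dependency idea highlighted above can be illustrated with Python's standard-library graphlib; the pipeline targets and prerequisites below are invented for illustration:

```python
from graphlib import TopologicalSorter

# Hypothetical make-style dependency map: target -> its prerequisites
deps = {
    "alignment.fasta": {"sequences.fasta"},
    "tree.newick":     {"alignment.fasta"},
    "report.html":     {"tree.newick", "metadata.csv"},
}

# A valid build order lists every prerequisite before its dependents,
# which is exactly the guarantee make provides when it schedules rules
order = list(TopologicalSorter(deps).static_order())
```

Making the dependency map explicit, as here, is the "in-the-large" move: the same structure can record which tool versions and assumptions each derived artifact depends on.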

17.
The NCBI Gene Expression Omnibus (GEO) represents the largest public repository of microarray data. However, finding data in GEO can be challenging. We have developed GEOmetadb in an attempt to make querying the GEO metadata both easier and more powerful. All GEO metadata records, as well as the relationships between them, are parsed and stored in a local MySQL database. A powerful, flexible web search interface with several convenient utilities provides query capabilities not available via NCBI tools. In addition, a Bioconductor package, GEOmetadb, that utilizes a SQLite export of the entire GEOmetadb database is also available, rendering the entire GEO database accessible with the full power of SQL-based queries from within R. AVAILABILITY: The web interface and SQLite databases are available at http://gbnci.abcc.ncifcrf.gov/geo/. The Bioconductor package is available via the Bioconductor project. The corresponding MATLAB implementation is also available at the same website.
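The kind of SQL-based metadata querying GEOmetadb enables can be sketched with Python's built-in sqlite3; the table and column names below are a simplified stand-in, not GEOmetadb's real schema:

```python
import sqlite3

# A miniature in-memory stand-in for a metadata SQLite export;
# the real GEOmetadb schema differs -- this layout is illustrative only
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gse (gse TEXT, title TEXT, organism TEXT)")
conn.executemany(
    "INSERT INTO gse VALUES (?, ?, ?)",
    [("GSE100", "Liver microarray", "Homo sapiens"),
     ("GSE101", "Yeast heat shock", "Saccharomyces cerevisiae"),
     ("GSE102", "Brain expression atlas", "Homo sapiens")],
)

# SQL predicates and joins give query power that a keyword search lacks
human = conn.execute(
    "SELECT gse, title FROM gse WHERE organism = ? ORDER BY gse",
    ("Homo sapiens",),
).fetchall()
```

This is the appeal of shipping metadata as a relational file: any SQL-capable environment (R, Python, MATLAB) can run arbitrary structured queries locally, with no web round-trips.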

18.
This Special Feature includes contributions on data-processing of large ecological datasets under the heading ecoinformatics. Herewith the latter term is now also established in the Journal of Vegetation Science. Ecoinformatics is introduced as a rapidly growing field within community ecology which is generating exciting new developments in ecology and in particular vegetation ecology. In our field, ecoinformatics deals with the understanding of patterns of species distributions at local and regional scales, and with the assemblages of species in relation to their properties, the local environment and their distribution in the region. Community ecology using ecoinformatics is related to bioinformatics, community ecology, biogeography and macroecology. We make clear how ecoinformatics in vegetation science, and particularly the IAVS Working Group on Ecoinformatics, has developed from the work of the old Working Group for Data Processing, which was active during the 1970s and 1980s. Recent developments, including the creation of TURBOVEG and SynBioSys in Europe and VEGBANK in the USA, form a direct link with these pioneer activities, both scientifically and personally. The contributions collected in this Special Feature present examples of several types of the use of databases and the application of programmes and models. The main types are the study of long-term vegetation dynamics in different cases of primary and secondary succession and the understanding of successional developments in terms of species traits. Among the future developments of great significance we mention the use of a variety of different large datasets for the study of the distribution, ecology and conservation of rare and threatened species.

19.
Wang Rui & Hu Dehua. 《生物信息学》 (Bioinformatics), 2014, 12(4): 305–312
Using Web of Science as the data source, this study briefly outlines development trends in bioinformatics database research. The CiteSpace visualization tool is used to map the knowledge base and research hotspots of bioinformatics database research, providing a reference for related theoretical research and practical work and helping to advance the field. The results show that the 1990 paper by Altschul SF on the local alignment search tool BLAST is a key knowledge-source document for bioinformatics database research, and that hotspot topics concentrate on sequence databases, genome databases, taxonomy databases, protein databases, database updates and integrated systems.

20.
PaVESy: Pathway Visualization and Editing System (cited 1 time: 0 self-citations, 1 by others)
A data managing system for editing and visualization of biological pathways is presented. The main component of PaVESy (Pathway Visualization and Editing System) is a relational SQL database system. The database design allows storage of biological objects, such as metabolites, proteins, genes and respective relations, which are required to assemble metabolic and regulatory biological interactions. The database model accommodates highly flexible annotation of biological objects by user-defined attributes. In addition, specific roles of objects are derived from these attributes in the context of user-defined interactions, e.g. in the course of pathway generation or during editing of the database content. Furthermore, the user may organize and arrange the database content within a folder structure and is free to group and annotate database objects of interest within customizable subsets. Thus, we allow an individualized view on the database content and facilitate user customization. A JAVA-based class library was developed, which serves as the database programming interface to PaVESy. This API provides classes, which implement the concepts of object persistence in SQL databases, such as entries, interactions, annotations, folders and subsets. We created editing and visualization tools for navigation in and visualization of the database content. User approved pathway assemblies are stored and may be retrieved for continued modification, annotation and export. Data export is interfaced with a range of network visualization programs, such as Pajek or other software allowing import of SBML or GML data format. AVAILABILITY: http://pavsey.mpimp-golm.mpg.de
