Similar Documents
Found 20 similar documents (search time: 31 ms)
1.
Advances in structural biology are opening greater opportunities for understanding biological structures from the cellular to the atomic level. Particularly promising are the links that can be established between the information provided by electron microscopy and the atomic structures derived from X-ray crystallography and nuclear magnetic resonance spectroscopy. Combining such different kinds of structural data can yield novel biological information on the interaction of biomolecules in large supramolecular assemblies. As a consequence, the need to develop new databases in the field of structural biology that allow integrated access to data from all the experimental techniques is becoming critical. Pilot studies performed in recent years have already established a solid background regarding the basic information that an integrated macromolecular structure database should contain, as well as the basic principles for integration. These efforts started in the context of the BioImage project and resulted in a first complete database prototype that provided a versatile platform for linking atomic models or X-ray diffraction data with electron microscopy information. Analysis of the requirements for combining data at different levels of resolution has resulted in sets of specifications that make it possible to integrate all these different data types in a web environment. A structural study linking electron microscopy and X-ray data, already contained within the BioImage database and in the Protein Data Bank, is used here to illustrate the current approach, while a general discussion highlights the urgent need for integrated databases. Received: 26 January 2000 / Revised version: 15 May 2000 / Accepted: 15 May 2000

2.
Nowadays we are experiencing a remarkable growth in the number of databases that have become accessible over the Web. In a certain number of cases, however, for example BioImage, this information is not of a textual nature, posing new challenges in the design of tools to handle these data. In this work we concentrate on the development of new mechanisms aimed at "querying" these databases of complex data sets by their intrinsic content, rather than by their textual annotations only. We concentrate our efforts on a subset of BioImage containing 3D images (volumes) of biological macromolecules, implementing a first prototype of a "query-by-content" system. In the context of databases of complex data types, the term query-by-content refers to those data modeling techniques in which user-defined functions aim at "understanding" (to some extent) the informational content of the data sets. In these systems the matching criteria introduced by the user relate to intrinsic features of the 3D images themselves, hence complementing traditional queries by textual keywords only. Efficient computational algorithms are required to "extract" structural information from the 3D images prior to storing them in the database. Easy-to-use interfaces are also needed to obtain feedback from the expert. Our query-by-content prototype is used to construct a concrete query, making use of basic structural features, which is then evaluated over a set of three-dimensional images of biological macromolecules. This experimental implementation can be accessed via the Web at the BioImage server in Madrid, at http://www.bioimage.org/qbc/index.html.
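A minimal sketch of the query-by-content idea described above: reduce each 3D volume to a small vector of rotation-invariant structural features and rank database entries by feature-space distance. The features, threshold, and data here are invented for illustration and are not the ones used by the BioImage prototype.

```python
import numpy as np

def volume_features(vol, threshold=0.5):
    """Reduce a 3D density map to rotation-invariant features: occupied
    volume fraction, radius of gyration, and center-of-mass offset."""
    mask = vol > threshold * vol.max()
    coords = np.argwhere(mask).astype(float)
    com = coords.mean(axis=0)
    rgyr = np.sqrt(((coords - com) ** 2).sum(axis=1).mean())
    center = (np.array(vol.shape) - 1) / 2.0
    return np.array([mask.mean(), rgyr, np.linalg.norm(com - center)])

def query_by_content(query_vol, database_vols):
    """Rank database volumes by Euclidean distance in feature space."""
    q = volume_features(query_vol)
    dists = [np.linalg.norm(q - volume_features(v)) for v in database_vols]
    return np.argsort(dists)

# Toy usage: rank three random "density maps" against a random query.
rng = np.random.default_rng(0)
db = [rng.random((32, 32, 32)) for _ in range(3)]
print(query_by_content(rng.random((32, 32, 32)), db))
```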

3.
The BioImage database is a new scientific database for multidimensional microscopic images of biological specimens, available through the World Wide Web (WWW). The development of this database has followed an iterative approach, in which requirements and functionality have been revised and extended. The complexity and innovative use of the data meant that technical and biological expertise was crucial in the initial design of the data model. A controlled vocabulary was introduced to ensure data consistency, and pointers are used to reference information stored in other databases. The data model was built using InfoModeler as a database design tool. The database management system is the Informix Dynamic Server with Universal Data Option; this object-relational system allows the handling of complex data using features such as collection types, inheritance, and user-defined data types. Informix DataBlade modules provide additional functionality: the Web Integration Option enables WWW access to the database, and the Video Foundation Blade provides functionality for video handling.
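As a rough illustration of the object-relational features named above (inheritance, collection types, a controlled vocabulary, and pointers to other databases), here is a sketch in Python rather than Informix; all field names and vocabulary terms are invented.

```python
from dataclasses import dataclass, field
from typing import List

# Controlled vocabulary: only approved terms may be stored (illustrative set).
MICROSCOPY_TERMS = {"confocal", "electron", "video-enhanced"}

@dataclass
class ExternalRef:
    """Pointer to a record held in another database."""
    database: str   # e.g. "PDB"
    accession: str  # e.g. "1TUB"

@dataclass
class ImageRecord:
    title: str
    modality: str
    refs: List[ExternalRef] = field(default_factory=list)  # collection type

    def __post_init__(self):
        if self.modality not in MICROSCOPY_TERMS:
            raise ValueError(f"'{self.modality}' not in controlled vocabulary")

@dataclass
class VideoRecord(ImageRecord):  # inheritance, as in the object-relational model
    frame_rate: float = 25.0

rec = VideoRecord("tubulin dynamics", "video-enhanced",
                  refs=[ExternalRef("PDB", "1TUB")])
```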

4.
Nowadays it is possible to unravel complex information at all levels of cellular organization by obtaining multi-dimensional image information. At the macromolecular level, three-dimensional (3D) electron microscopy, together with other techniques, is able to reach resolutions at the nanometer or subnanometer level. The information is delivered in the form of 3D volumes containing samples of a given function, for example, the electron density distribution within a given macromolecule. The same situation occurs at the cellular level with the new forms of light microscopy, particularly confocal microscopy, all of which produce biological 3D volume information. Furthermore, it is possible to record sequences of images over time (videos), as well as sequences of volumes, bringing key information on the dynamics of living biological systems. It is in this context that work on BioImage started two years ago, and its first version is now presented here. In essence, BioImage is a database specifically designed to contain multi-dimensional images, perform queries, interactively work with the resulting multi-dimensional information on the World Wide Web, and accomplish the required cross-database links. Two sister home pages of BioImage can be accessed at http://www.bioimage.org and http://www-embl.bioimage.org.

5.
Electronic light microscopy: present capabilities and future prospects
Electronic light microscopy involves the combination of microscopic techniques with electronic imaging and digital image processing, resulting in dramatic improvements in image quality and ease of quantitative analysis. In this review, after a brief definition of digital images and a discussion of the sampling requirements for the accurate digital recording of optical images, I discuss the three most important imaging modalities in electronic light microscopy (video-enhanced contrast microscopy, digital fluorescence microscopy and confocal scanning microscopy), considering their capabilities, their applications, and recent developments that will increase their potential. Video-enhanced contrast microscopy permits the clear visualisation and real-time dynamic recording of minute objects such as microtubules, vesicles and colloidal gold particles, an order of magnitude smaller than the resolution limit of the light microscope. It has revolutionised the study of cellular motility, and permits the quantitative tracking of organelles and gold-labelled membrane-bound proteins. In combination with the technique of optical trapping (optical tweezers), it permits exquisitely sensitive force and distance measurements to be made on motor proteins. Digital fluorescence microscopy enables low-light-level imaging of fluorescently labelled specimens. Recent progress has involved improvements in cameras, fluorescent probes and fluorescent filter sets, particularly multiple bandpass dichroic mirrors, and developments in multiparameter imaging, which is becoming particularly important for in situ hybridisation studies and automated image cytometry, fluorescence ratio imaging, and time-resolved fluorescence. As software improves and small computers become more powerful, computational techniques for out-of-focus blur deconvolution and image restoration are becoming increasingly important. Confocal microscopy permits convenient, high-resolution, non-invasive, blur-free optical sectioning and 3D image acquisition, but suffers from a number of limitations. I discuss advances in confocal techniques that address the problems of temporal resolution, spherical and chromatic aberration, wavelength flexibility and cross-talk between fluorescent channels, and describe new optics to enhance axial resolution and the use of two-photon excitation to reduce photobleaching. Finally, I consider the desirability of establishing a digital image database, the BioImage database, which would permit the archival storage of, and public Internet access to, multidimensional image data from all forms of biological microscopy. Submission of images to the BioImage database would be made in coordination with the scientific publication of research results based upon these data. In the context of electronic light microscopy, this would be particularly useful for three-dimensional images of cellular structure and video sequences of dynamic cellular processes, which are otherwise hard to communicate. However, it has the wider significance of allowing correlative studies on data obtained from many different microscopies and from sequence and crystallographic investigations. It also opens the door to interactive hypermedia access to the multidimensional image data, and multimedia publishing ventures based upon this. Presented at the XXXVII Symposium of the Society for Histochemistry, 23 September 1995, Rigi Kaltbad, Switzerland.
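The sampling requirement mentioned at the start of the review is the Nyquist criterion: the effective pixel size at the specimen must be at most half the optical resolution limit. A quick worked example using the standard Rayleigh formula:

```python
def max_pixel_size_nm(wavelength_nm=500.0, numerical_aperture=1.4):
    """Nyquist criterion for digital recording of optical images: pixel
    size at the specimen <= half the Rayleigh resolution limit."""
    rayleigh = 0.61 * wavelength_nm / numerical_aperture  # ~218 nm here
    return rayleigh / 2.0

# For green light and a high-NA oil objective, pixels must be <= ~109 nm.
print(f"max pixel size: {max_pixel_size_nm():.0f} nm")
```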

6.
Distinct substructures within the nucleus are associated with a wide variety of important nuclear processes. Structures such as chromatin and nuclear pores have specific roles, while others such as Cajal bodies are more functionally varied. Understanding the roles of these membraneless intra-nuclear compartments requires extensive data sets covering nuclear and compartment-associated proteins. NSort/DB is a database providing access to intra- or sub-nuclear compartment associations for the mouse nuclear proteome. Based on resources ranging from large-scale curated data sets to detailed experiments, this data set provides a high-quality set of annotations of non-exclusive association of nuclear proteins with structures such as promyelocytic leukaemia bodies and chromatin. The database is searchable by protein identifier or compartment, and has a documented web service API. The search interface, web service and data download are all freely available online at http://www.nsort.org/db/. Availability of this data set will enable systematic analyses of the protein complements of nuclear compartments, improving our understanding of the diverse functional repertoire of these structures.
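The abstract documents a web service API but not its request format, so the endpoint shape below is purely an assumption for illustration; only the idea of querying compartment associations by protein identifier comes from the source.

```python
import json
import urllib.request

# Hypothetical endpoint layout; the real API is documented at nsort.org/db.
BASE = "http://www.nsort.org/db/api"

def compartments_for(protein_id):
    """Fetch the (assumed) JSON list of compartments for one protein."""
    with urllib.request.urlopen(f"{BASE}/protein/{protein_id}") as resp:
        return json.loads(resp.read())

# compartments_for("Q9JLI6")  # would return annotations if the service is live
```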

7.
The construction of a consistent protein chemical shift database is an important step toward making more extensive use of these data in structural studies. Unfortunately, progress in this direction has been hampered by the quality of the available data, particularly with respect to chemical shift referencing, which is often either inaccurate or inconsistently annotated. Preprocessing of the data is therefore required to detect and correct referencing errors. We have developed a program for performing this task, based on the comparison of reported and expected chemical shift distributions. This program, named CheckShift, does not require additional data and is therefore applicable to data sets for which structures are not available. CheckShift thus makes it possible to re-reference chemical shifts prior to their use as structural constraints.
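The core comparison of reported versus expected chemical shift distributions can be sketched as follows; the expected values below are rough textbook-style means used only for illustration, not CheckShift's actual reference distributions or estimator.

```python
import numpy as np

# Approximate mean C-alpha shifts (ppm) by residue type; illustrative only.
EXPECTED_CA = {"ALA": 53.1, "GLY": 45.4, "LEU": 55.6, "VAL": 62.5}

def referencing_offset(reported):
    """Estimate a global referencing error as the median gap between each
    reported shift and the expected mean for its residue type."""
    diffs = [shift - EXPECTED_CA[res] for res, shift in reported
             if res in EXPECTED_CA]
    return float(np.median(diffs))

reported = [("ALA", 54.9), ("GLY", 47.3), ("LEU", 57.4), ("VAL", 64.2)]
offset = referencing_offset(reported)              # ~1.8 ppm systematic offset
corrected = [(res, s - offset) for res, s in reported]
```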

8.
9.
高凡, 闫正龙, 黄强. 《生态学报》 2011, 31(21): 6363-6370
Construction of watershed-scale massive ecological-environment databases is the foundation of precision eco-environmental research. Taking the ecological-environment database of the Tarim River Basin as an example, this paper discusses the key techniques for building such databases: seamless data mosaicking, database specification design, feature-code design, spatial-index design, feature display tables, and one-click data import. To address the problem of gaps where the basin crosses projection zones, and starting from the sources of those gaps, the physical and logical data layers were separated and vector and raster data were treated separately, achieving seamless cross-zone mosaicking of massive data within a unified multi-scale spatial framework. Database specification design and feature-code design are key steps before data import; according to the actual conditions of the basin, database naming conventions were set using standardized English letters, and code standards were established from the map scales of the graphic data. Under the ArcSDE framework, spatial indexes were built using a grid index for vector data and a multi-level pyramid structure for raster data, speeding up data retrieval and browsing. Building feature display tables and adopting a "one-click import" strategy improved system response and data-import efficiency. The resulting watershed-scale database system enables effective storage and management of multi-source, multi-type, cross-zone massive ecological-environment data at the basin scale, providing basic data support for integrated basin management and eco-environmental research.
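The grid index the paper uses for vector data can be sketched in miniature: features are binned into fixed-size cells, and a window query inspects only the cells it overlaps. This is a generic illustration, not the ArcSDE implementation; the identifiers and cell size are invented.

```python
from collections import defaultdict

class GridIndex:
    """Minimal grid spatial index: features binned by (col, row) cell."""
    def __init__(self, cell_size):
        self.cell = cell_size
        self.bins = defaultdict(list)

    def _key(self, x, y):
        return (int(x // self.cell), int(y // self.cell))

    def insert(self, feature_id, x, y):
        self.bins[self._key(x, y)].append(feature_id)

    def query(self, xmin, ymin, xmax, ymax):
        """Return candidate features in cells overlapping the window."""
        c0, r0 = self._key(xmin, ymin)
        c1, r1 = self._key(xmax, ymax)
        return [fid for c in range(c0, c1 + 1) for r in range(r0, r1 + 1)
                for fid in self.bins.get((c, r), [])]

idx = GridIndex(cell_size=1000.0)        # 1 km cells, in map units
idx.insert("monitoring_site_01", 4250.0, 812.0)
print(idx.query(4000, 500, 5000, 1500))  # ['monitoring_site_01']
```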

10.
Problem: A series of long-term field experiments is described, with particular reference to monitoring and quality control. This paper addresses problems in data management of particular importance for long-term studies, including data manipulation, archiving, quality assessment, and flexible retrieval for analysis. Method: The problems were addressed using a purpose-built database system, built with commercial software and running under Microsoft Windows. Conclusion: The database system brings many advantages compared with generally available software, including significantly improved quality checking and access. The query system allows easy access to data sets, thus improving the efficiency of analysis. Quality assessment of the initial dataset demonstrated that the database system can also provide general insight into the types and magnitudes of error in data sets. Finally, the system can be generalised to include data from a number of different projects, thus simplifying data manipulation for meta-analysis.
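The kind of rule-driven quality check such a system applies can be sketched as a table of permitted ranges scanned against the records; the column names, ranges, and values below are hypothetical.

```python
import pandas as pd

# Hypothetical field-plot records; two values are deliberately out of range.
records = pd.DataFrame({
    "plot": ["A1", "A2", "A3"],
    "cover_pct": [85.0, 240.0, 12.0],
    "year": [1998, 1998, 2099],
})

RULES = {"cover_pct": (0.0, 100.0), "year": (1950, 2025)}

def quality_report(df, rules):
    """Flag every value falling outside its permitted range."""
    problems = []
    for col, (lo, hi) in rules.items():
        bad = df[(df[col] < lo) | (df[col] > hi)]
        problems.extend((r["plot"], col, r[col]) for _, r in bad.iterrows())
    return problems

print(quality_report(records, RULES))
# [('A2', 'cover_pct', 240.0), ('A3', 'year', 2099)]
```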

11.
MOTIVATION: Circular dichroism (CD) spectroscopy has become established as a key method for determining the secondary structure contents of proteins, and has had a significant impact on molecular biology. Many excellent mathematical protocols have been developed for this purpose and their quality is beyond question. However, reference database sets of proteins, with CD spectra matched to secondary structure components derived from X-ray structures, provide the key resource for this task. These databases were created many years ago, before most CD spectrophotometers became standardized and before it was commonplace to validate X-ray structures prior to publication. The analyses presented here were undertaken to investigate the overall quality of these reference databases in light of their extensive usage in determining protein secondary structure content from CD spectra. RESULTS: The analyses show that there are a number of significant problems associated with the CD reference database sets in current use. There are disparities between CD spectra for the same protein collected by different groups, including differences in magnitudes, peak positions, or both. Many current reference sets are amalgamations of spectra from these groups, introducing inconsistencies that can lead to inaccuracies in the determination of secondary structure components from CD spectra. A number of the X-ray structures used fall short of the validation criteria now employed as standard for structure determination; many have substantial percentages of residues in the disallowed regions of the Ramachandran plot, so their calculated secondary structure components, used as a foundation for the reference databases, are likely to be in error. Additionally, the coverage of secondary structure space in the reference datasets is poorly correlated with the secondary structure components found in the Protein Data Bank. One conclusion is that a new reference CD database is now needed, with cross-correlated, machine-independent CD spectra and validated X-ray structures that cover more secondary structure components, including diverse protein folds. Nevertheless, the fact that reasonably accurate values for the secondary structure content of proteins can be determined from spectra at all is a testament to the power of CD spectroscopy.
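The reason reference-set errors matter is that secondary structure analysis models the measured spectrum as a linear combination of basis spectra weighted by the structure fractions, so errors in the basis propagate directly into the estimated fractions. A noise-free toy decomposition (invented numbers; production methods add non-negativity and sum-to-one constraints):

```python
import numpy as np

# Toy basis spectra (columns: helix, sheet, coil) at four wavelengths.
basis = np.array([[ 8.0, -2.0, 1.0],
                  [-4.0,  6.0, 0.5],
                  [-9.0, -3.0, 2.0],
                  [ 2.0,  4.0, 1.5]])

fractions_true = np.array([0.55, 0.30, 0.15])
observed = basis @ fractions_true

# Least-squares estimate of the secondary structure fractions.
est, *_ = np.linalg.lstsq(basis, observed, rcond=None)
print(est.round(3))  # recovers [0.55, 0.3, 0.15] in this noise-free toy
```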

12.
In this paper, we seek to provide an introduction to the fast-moving field of digital video on the Internet, from the viewpoint of the biological microscopist who might wish to store or access videos, for instance in image databases such as the BioImage Database (http://www.bioimage.org). We describe and evaluate the principal methods used for encoding and compressing moving image data for digital storage and transmission over the Internet, which involve compromises between compression efficiency and retention of image fidelity, and describe the existing alternative software technologies for downloading or streaming compressed digitized videos using a Web browser. We report the results of experiments on video microscopy recordings and three-dimensional confocal animations of biological specimens to evaluate the compression efficiencies of the principal video compression-decompression algorithms (codecs) and to document the artefacts associated with each of them. Because MPEG-1 gives very high compression while retaining reasonable image quality, these studies lead us to recommend that video databases should store both a high-resolution original version of each video, ideally either uncompressed or losslessly compressed, and a separate edited and highly compressed MPEG-1 preview version that can be rapidly downloaded for interactive viewing by the database user.
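The recommended archive-plus-preview workflow can be sketched with the modern ffmpeg tool (which postdates the paper and is used here only to illustrate the strategy); the filenames and quality settings are arbitrary.

```python
import subprocess

def make_mpeg1_preview(master_path, preview_path, scale_width=320):
    """Write a small MPEG-1 preview alongside a losslessly stored master."""
    subprocess.run([
        "ffmpeg", "-i", master_path,
        "-vf", f"scale={scale_width}:-2",   # downscale, keep aspect ratio
        "-c:v", "mpeg1video", "-q:v", "6",  # MPEG-1, moderate quality
        "-an",                              # previews often omit audio
        preview_path,
    ], check=True)

# make_mpeg1_preview("recording_master.avi", "recording_preview.mpg")
```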

13.
The Conformation Angles DataBase (CADB) provides an online resource for accessing data on the conformation angles (both main-chain and side-chain) of protein structures available in the Protein Data Bank, in two data sets corresponding to 25% and 90% sequence identity between any two proteins. In addition, the database contains the necessary crystallographic parameters. The package has several flexible options and display facilities to visualize the main-chain and side-chain conformation angles for a particular amino acid residue, and can also be used to study the interrelationship between main-chain and side-chain conformation angles. A web-based Java graphics interface displays the information of interest to the user on the client machine. The database is updated at regular intervals and can be accessed over the World Wide Web at the following URL: http://144.16.71.148/cadb/.
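The quantities CADB stores are torsion (dihedral) angles, and the standard calculation from four atomic positions is short enough to show in full. The coordinates below are made up purely to exercise the function.

```python
import numpy as np

def dihedral(p0, p1, p2, p3):
    """Signed torsion angle (degrees) defined by four atoms; phi for
    residue i, for example, uses C(i-1), N(i), CA(i), C(i)."""
    b1, b2, b3 = p1 - p0, p2 - p1, p3 - p2
    n1, n2 = np.cross(b1, b2), np.cross(b2, b3)
    m1 = np.cross(n1, b2 / np.linalg.norm(b2))
    return np.degrees(np.arctan2(np.dot(m1, n2), np.dot(n1, n2)))

atoms = [np.array(a, dtype=float) for a in
         [(0, 0, 0), (1.3, 0, 0), (2.0, 1.2, 0), (3.4, 1.2, 0.8)]]
print(round(dihedral(*atoms), 1))
```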

14.
Translating the huge amount of genomic and biochemical data into new drugs is a costly and challenging task. Historically, there has been comparatively little focus on linking the biochemical and chemical worlds. To address this need, we have developed ChEMBL, an online resource of small-molecule SAR (structure-activity relationship) data, which can be used to support chemical biology, lead discovery and target selection in drug discovery. The database contains the abstracted structures, properties and biological activities for over 700,000 distinct compounds and more than 3 million bioactivity records abstracted from over 40,000 publications. Additional public domain resources can be readily integrated into the same data model (e.g. PubChem BioAssay data). The compounds in ChEMBL are largely extracted from the primary medicinal chemistry literature, and are therefore usually 'drug-like' or 'lead-like' small molecules with full experimental context. The data cover a significant fraction of the discovery of modern drugs, and are useful in a wide range of drug design and discovery tasks. In addition to the compound data, ChEMBL also contains information for over 8,000 protein, cell line and whole-organism 'targets', over 4,000 of which are proteins linked to their underlying genes. The database is searchable chemically, using an interactive compound sketch tool, protein sequences, family hierarchies, SMILES strings, compound research codes and keywords, and biologically, using a variety of gene identifiers, protein sequence similarity and protein families. The information retrieved can then be readily filtered and downloaded in various formats. ChEMBL can be accessed online at https://www.ebi.ac.uk/chembldb.
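ChEMBL is now also queryable from Python through the official chembl_webresource_client package (which postdates this abstract); a minimal sketch, assuming the package is installed and the service is reachable:

```python
# pip install chembl_webresource_client
from chembl_webresource_client.new_client import new_client

molecule = new_client.molecule
results = molecule.filter(pref_name__iexact="aspirin")
for mol in results[:1]:
    print(mol["molecule_chembl_id"])  # CHEMBL25
```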

15.
Despite the huge impact of data resources in genomics and structural biology, until now there has been no central archive for biological data across all imaging modalities. The BioImage Archive is a new data resource at the European Bioinformatics Institute (EMBL-EBI) designed to fill this gap. In its initial development, the BioImage Archive accepts bioimaging data associated with publications, in any format, from any imaging modality from the molecular to the organism scale, excluding medical imaging. The BioImage Archive will ensure the reproducibility of published studies that derive results from image data and reduce duplication of effort. Most importantly, the BioImage Archive will help scientists generate new insights through reuse of existing data to answer new biological questions, and through provision of training, testing and benchmarking data for the development of image analysis tools. The archive is available at https://www.ebi.ac.uk/bioimage-archive/.

16.

Background  

Modern, high-throughput biological experiments generate copious, heterogeneous, interconnected data sets. Research is dynamic, with frequently changing protocols, techniques, instruments, and file formats. Because of these factors, systems designed to manage and integrate modern biological data sets often end up as large, unwieldy databases that become difficult to maintain or evolve. The novel rule-based approach of the Ultra-Structure design methodology presents a potential solution to this problem. By representing both data and processes as formal rules within a database, an Ultra-Structure system constitutes a flexible framework that enables users to explicitly store domain knowledge in both a machine- and human-readable form. End users themselves can change the system's capabilities without programmer intervention, simply by altering database contents; no computer code or schemas need be modified. This provides flexibility in adapting to change, and allows integration of disparate, heterogeneous data sets within a small core set of database tables, facilitating joint analysis and visualization without becoming unwieldy. Here, we examine the application of Ultra-Structure to our ongoing research program for the integration of large proteomic and genomic data sets (proteogenomic mapping).
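The core Ultra-Structure idea, processes stored as rows that users can edit without touching code, can be sketched in a few lines; the rule columns and actions below are invented for illustration.

```python
# A toy "ruleform": behavior changes by editing rows, not code.
RULES = [
    # (condition_field, condition_value, action)
    ("file_type", "mzML",  "route_to_proteomics_pipeline"),
    ("file_type", "FASTA", "route_to_genomics_pipeline"),
    ("quality",   "low",   "flag_for_review"),
]

def apply_rules(record, rules):
    """Return the actions whose conditions match the record."""
    return [action for fld, value, action in rules if record.get(fld) == value]

print(apply_rules({"file_type": "mzML", "quality": "low"}, RULES))
# ['route_to_proteomics_pipeline', 'flag_for_review']
```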

17.
The HIV structural database (HIVSDB) is a comprehensive collection of structures of HIV protease, both of the unliganded enzyme and of its inhibitor complexes. It contains abstracts and crystallographic data, such as inhibitor and protein coordinates, for 248 data sets, of which only 141 are from the Protein Data Bank (PDB). Efficient annotation, indexing, and querying of the inhibitor data are crucial for their effective use in technological and industrial applications. The application of the IUPAC International Chemical Identifier (InChI) to index, curate, and query inhibitor structures in HIVSDB is described.
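Why InChI suits this job: it is a canonical string, so duplicate detection and indexing reduce to exact-match lookups. A sketch using RDKit (not the paper's software; the molecule and index entry are stand-ins):

```python
# Requires RDKit: pip install rdkit
from rdkit import Chem

mol = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")  # aspirin, as a stand-in
inchi = Chem.MolToInchi(mol)
key = Chem.InchiToInchiKey(inchi)  # fixed-length hash, handy as an index

index = {key: "inhibitor_0001"}    # hypothetical database index
print(Chem.InchiToInchiKey(Chem.MolToInchi(mol)) in index)  # True
```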

18.
The availability of age-matched normative data is an essential component of clinical gait analyses. Comparison of normative gait databases is difficult due to the high dimensionality and temporal nature of the various gait waveforms. The purpose of this study was to provide a method of comparing the sagittal joint angle data between two normative databases. We compared a modern gait database to the historical San Diego database using statistical classifiers developed by Tingley et al. (2002). Gait data were recorded from 60 children aged 1–13 years. A six-camera Vicon 512 motion analysis system and two force plates were used to obtain temporal-spatial, kinematic, and kinetic parameters during walking. Differences between the two normative data sets were explored using the classifier index scores and the mean and covariance structure of the joint angle data from each lab. Significant differences in sagittal angle data between the two databases were identified and attributed to technological advances and data processing techniques (data smoothing, sampling, and joint angle approximations). This work provides a simple method of database comparison using trainable statistical classifiers.
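A crude stand-in for the classifier index scores: compare the two databases' mean joint-angle curves point by point, scaled by pooled variability. The data below are synthetic, not the San Diego or modern curves.

```python
import numpy as np

rng = np.random.default_rng(1)
db_old = rng.normal(20.0, 4.0, size=(60, 51))  # subjects x % gait cycle
db_new = rng.normal(22.0, 4.0, size=(60, 51))

def mean_curve_distance(a, b):
    """Z-scored pointwise difference between database mean curves."""
    pooled_sd = np.sqrt((a.var(axis=0) + b.var(axis=0)) / 2.0)
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / pooled_sd

d = mean_curve_distance(db_old, db_new)
print(d.mean().round(2), d.max().round(2))  # overall and worst-case gaps
```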

19.
20.
Computational docking approaches are important both as a source of protein-protein complex structures and as a means to understand the principles of protein association. A key element in designing better docking approaches, including search procedures, potentials, and scoring functions, is their validation on experimentally determined structures; thus, databases of such structures (benchmark sets) are important. The previous, first release of the DOCKGROUND resource (Douguet et al., Bioinformatics 2006; 22:2612-2618) implemented a comprehensive database of cocrystallized (bound) protein-protein complexes in a relational database of annotated structures. The current release adds important features to the set of bound structures, such as regularly updated downloadable datasets: an automatically generated nonredundant set, built according to the most common criteria, and a manually curated set that includes only biological nonobligate complexes, along with a number of additional useful characteristics. The main focus of the current release is unbound (experimental and simulated) protein-protein complexes. Complexes from the bound dataset are used to identify crystallized unbound analogs; if such analogs do not exist, the unbound structures are simulated by rotamer library optimization. Thus, the database contains comprehensive sets of complexes suitable for large-scale benchmarking of docking algorithms. Advanced methodologies for simulating unbound conformations are being explored for the next release. Future releases will include datasets of modeled protein-protein complexes, and systematic sets of docking decoys obtained by different docking algorithms. The growing DOCKGROUND resource is designed to become a comprehensive public environment for developing and validating new docking methodologies.
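Benchmark sets like DOCKGROUND are typically consumed by scoring predicted complexes against the reference structures; a minimal sketch of such an evaluation loop (the RMSD cutoff and data are assumptions for illustration, not DOCKGROUND's criteria):

```python
import numpy as np

def rmsd(pred, ref):
    """RMSD between N x 3 coordinate arrays in the same atom order."""
    return float(np.sqrt(((pred - ref) ** 2).sum(axis=1).mean()))

def success_rate(predictions, references, cutoff=10.0):
    """Fraction of cases with at least one prediction under the cutoff."""
    hits = sum(any(rmsd(p, ref) <= cutoff for p in preds)
               for preds, ref in zip(predictions, references))
    return hits / len(references)

# Toy benchmark: two complexes, three decoys each at growing noise levels.
rng = np.random.default_rng(2)
refs = [rng.random((50, 3)) * 30 for _ in range(2)]
preds = [[r + rng.normal(0, s, r.shape) for s in (1.0, 5.0, 20.0)]
         for r in refs]
print(success_rate(preds, refs))
```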
