Similar literature
 20 similar articles found
1.
Modern structural genomics projects demand integrated methods for the interpretation and storage of nuclear magnetic resonance (NMR) data. Here we present version 2.1 of our program ARIA (Ambiguous Restraints for Iterative Assignment) for automated assignment of nuclear Overhauser enhancement (NOE) data and NMR structure calculation. We report on recent developments, most notably a graphical user interface, and the incorporation of the object-oriented data model of the Collaborative Computing Project for NMR (CCPN). The CCPN data model defines a storage model for NMR data, which greatly facilitates the transfer of data between different NMR software packages. Availability: A distribution with the source code of ARIA 2.1 is freely available at http://www.pasteur.fr/recherche/unites/Binfs/aria2.

2.
We present a suite of software for the complete and easy deposition of NMR data to the PDB and BMRB. This suite uses the CCPN framework and introduces a freely downloadable, graphical desktop application called CcpNmr Entry Completion Interface (ECI) for the secure editing of experimental information and associated datasets through the lifetime of an NMR project. CCPN projects can be created within the CcpNmr Analysis software or by importing existing NMR data files using the CcpNmr FormatConverter. After further data entry and checking with the ECI, the project can then be rapidly deposited to the PDBe using AutoDep, or exported as a complete deposition NMR-STAR file. In full CCPN projects created with ECI, it is straightforward to select chemical shift lists, restraint data sets, structural ensembles and all relevant associated experimental collection details, all of which are or will become mandatory when depositing to the PDB. Instructions and download information for the ECI are available from the PDBe web site at http://www.ebi.ac.uk/pdbe/nmr/deposition/eci.html.

3.
To address data management and data exchange problems in the nuclear magnetic resonance (NMR) community, the Collaborative Computing Project for the NMR community (CCPN) created a "Data Model" that describes all the different types of information needed in an NMR structural study, from molecular structure and NMR parameters to coordinates. This paper describes the development of a set of software applications that use the Data Model and its associated libraries, thus validating the approach. These applications are freely available and provide a pipeline for high-throughput analysis of NMR data. Three programs work directly with the Data Model: CcpNmr Analysis, an entirely new analysis and interactive display program; the CcpNmr FormatConverter, which allows transfer of data from programs commonly used in NMR to and from the Data Model; and the CLOUDS software for automated structure calculation and assignment (Carnegie Mellon University), which was rewritten to interact directly with the Data Model. The ARIA 2.0 software for structure calculation (Institut Pasteur) and the QUEEN program for validation of restraints (University of Nijmegen) were extended to provide conversion of their data to the Data Model. During these developments the Data Model has been thoroughly tested and used, demonstrating that applications can successfully exchange data via the Data Model. The software architecture developed by CCPN is now ready for new developments, such as integration with additional software applications and extensions of the Data Model into other areas of research.

4.

Background  

Current efforts in metabolomics, such as the Human Metabolome Project, collect structures of biological metabolites as well as data for their characterisation, such as spectra for identification of substances and measurements of their concentration. Still, only a fraction of existing metabolites and their spectral fingerprints are known. Computer-Assisted Structure Elucidation (CASE) of biological metabolites will be an important tool to address this lack of knowledge. Indispensable for CASE are modules to predict spectra for hypothetical structures. This paper evaluates different statistical and machine learning methods to perform predictions of proton NMR spectra based on data from our open database NMRShiftDB.

5.
One of the greatest challenges in metabolomics is the rapid and unambiguous identification and quantification of metabolites in a biological sample. Although one-dimensional (1D) proton nuclear magnetic resonance (NMR) spectra can be acquired rapidly, they are complicated by severe peak overlap that can significantly hinder the automated identification and quantification of metabolites. Furthermore, it is currently not reasonable to assume that NMR spectra of pure metabolites are available a priori for every metabolite in a biological sample. In this paper we develop and report on tests of methods that assist in the automatic identification of metabolites using proton two-dimensional (2D) correlation spectroscopy (COSY) NMR. Given a database of 2D COSY spectra for the metabolites of interest, our methods provide a list sorted by a heuristic likelihood of the metabolites present in a sample that has been analyzed using 2D COSY NMR. Our models attempt to correct the displacement of the peaks that can occur from one sample to the next, due to pH, temperature and matrix effects, using a statistical and chemical model. The correction of one peak can result in an implied correction of others due to spin–spin coupling. Furthermore, these displacements are not independent: they depend on the relative position of functional groups in the molecule. We report experimental results using defined mixtures of amino acids as well as real complex biological samples that demonstrate that our methods can be very effective at automatically and rapidly identifying metabolites.

6.
MOTIVATION: Comparative metabolic profiling by nuclear magnetic resonance (NMR) is showing increasing promise for identifying inter-individual differences in drug response. Two-dimensional (2D) 1H-13C NMR can reduce spectral overlap, a common problem of 1D 1H NMR. However, the peak alignment tools for 1D NMR spectra are not well suited for 2D NMR. An automated and statistically robust method for aligning 2D NMR peaks is required to enable comparative metabonomic analysis using 2D NMR. RESULTS: A novel statistical method was developed to align NMR peaks that represent the same chemical groups across multiple 2D NMR spectra. The degree of local pattern match among peaks in different spectra is assessed using a similarity measure, and a heuristic algorithm maximizes the similarity measure for peaks across the whole spectrum. This peak alignment method was used to align peaks in 2D NMR spectra of endogenous metabolites in liver extracts obtained from four inbred mouse strains in the study of acetaminophen-induced liver toxicity. This automated alignment method was validated by manual examination of the top 50 peaks as ranked by signal intensity. Manual inspection of 1872 peaks in 39 different spectra demonstrated that the automated algorithm correctly aligned 1810 (96.7%) peaks. AVAILABILITY: The algorithm is available upon request.

7.
SUMMARY: With the Dictyostelium Genome Project nearing completion, we initiated the construction of a data repository for all Dictyostelium discoideum genomic data. To date, this database, called DictyMOLD (Dicty Map Of Linked Data), incorporates the recently completed sequences of D. discoideum chromosomes 1 and 2 together with related annotations. To visualise maps, sequences and annotations and to provide access for the scientific community, a Perl-based browser was developed. AVAILABILITY: The DictyMOLD database is freely accessible via http://genome.imb-jena.de/dictyostelium/ CONTACT: gernot@imb-jena.de.

8.
Steinbeck C, Kuhn S. Phytochemistry, 2004, 65(19): 2711-2717
Compound identification and support for computer-assisted structure elucidation via a free community-built web database for organic structures and their NMR data is described. The new database NMRShiftDB is available on . As the first NMR database, NMRShiftDB allows not only open access to the database but also open and peer reviewed submission of datasets, enabling the natural products community to build its first free repository of assigned 1H and 13C NMR spectra. In addition to the open access, the underlying database software is built solely from free software and is available under an open source license. This allows collaborating laboratories to fully replicate the database and to create a highly available network of NMRShiftDB mirrors. The database contains about 10,000 structures and assigned spectra, with new datasets constantly added. Its functionality includes (sub-) spectra and (sub-) structure searches as well as shift prediction of 13C spectra based on the current database material.

9.
10.
11.
We present two new databases of NMR-derived distance and dihedral angle restraints: the Database Of Converted Restraints (DOCR) and the Filtered Restraints Database (FRED). These databases currently correspond to 545 proteins with NMR structures deposited in the Protein Databank (PDB). The criteria for inclusion were that these should be unique, monomeric proteins with author-provided experimental NMR data and coordinates available from the PDB capable of being parsed and prepared in a consistent manner. The Wattos program was used to parse the files, and the CcpNmr FormatConverter program was used to prepare them semi-automatically. New modules, including a new implementation of Aqua in the BioMagResBank (BMRB) software Wattos, were used to analyze the sets of distance restraints (DRs) for inconsistencies, redundancies, NOE completeness, classification and violations with respect to the original coordinates. Restraints that could not be associated with a known nomenclature were flagged. The coordinates of hydrogen atoms were recalculated from the positions of heavy atoms to allow for a full restraint analysis. The DOCR database contains restraint and coordinate data that are made consistent with each other and with IUPAC conventions. The FRED database is based on the DOCR data but is filtered for use by test calculation protocols and longitudinal analyses and validations. These two databases are available from websites of the BMRB and the Macromolecular Structure Database (MSD) in various formats: NMR-STAR, CCPN XML, and in formats suitable for direct use in the software packages CNS and CYANA. Supplementary material to this paper is available in electronic form at http://dx.doi.org/10.1007/s10858-005-2195-0

12.
MOTIVATION: Tandem mass spectrometry (MS/MS) identifies protein sequences using database search engines, at the core of which is a score that measures the similarity between peptide MS/MS spectra and a protein sequence database. The TANDEM application was developed as a freely available database search engine for the proteomics research community. To extend TANDEM as a platform for further research on developing improved database scoring methods, we modified the software to allow users to redefine the scoring function and replace the native TANDEM scoring function while leaving the remaining core application intact. Redefinition is performed at run time so multiple scoring functions are available to be selected and applied from a single search engine binary. We introduce the implementation of the pluggable scoring algorithm and also provide implementations of two TANDEM compatible scoring functions, one previously described scoring function compatible with PeptideProphet and one very simple scoring function that quantitative researchers may use to begin their development. This extension builds on the open-source TANDEM project and will facilitate research into and dissemination of novel algorithms for matching MS/MS spectra to peptide sequences. The pluggable scoring schema is also compatible with related search applications P3 and Hunter, which are part of the X! suite of database matching algorithms. The pluggable scores and the X! suite of applications are all written in C++. AVAILABILITY: Source code for the scoring functions is available from http://proteomics.fhcrc.org
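The idea of a scoring function that is selected at run time can be illustrated with a small registry. This is only a sketch of the general plug-in pattern, written in Python rather than the C++ of the X! suite; the names `SCORERS`, `register`, `shared_peaks`, `matched_fraction` and `score` are hypothetical and are not part of the TANDEM code base, and the two toy scores are not the scoring functions described in the abstract.

```python
from typing import Callable, Dict, Sequence

# Hypothetical registry: maps a scorer name to a scoring function.
ScoreFn = Callable[[Sequence[float], Sequence[float]], float]
SCORERS: Dict[str, ScoreFn] = {}


def register(name: str):
    """Decorator that adds a scoring function to the registry under `name`."""
    def wrap(fn):
        SCORERS[name] = fn
        return fn
    return wrap


@register("shared_peaks")
def shared_peaks(observed, theoretical, tol=0.5):
    """Count theoretical fragment m/z values that lie within `tol` of an observed peak."""
    return float(sum(any(abs(o - t) <= tol for o in observed) for t in theoretical))


@register("matched_fraction")
def matched_fraction(observed, theoretical, tol=0.5):
    """Fraction of theoretical fragment m/z values that are matched."""
    n = len(theoretical)
    return shared_peaks(observed, theoretical, tol) / n if n else 0.0


def score(name, observed, theoretical):
    """Look up the scorer by name at run time and apply it."""
    return SCORERS[name](observed, theoretical)
```

With this layout, a new score is added by registering another function; the calling code that iterates over spectra and candidate peptides does not change.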

13.
sMOL Explorer is a 2D ligand-based computational tool that provides three major functionalities through a Web interface: data management; information retrieval and extraction; and statistical analysis and data mining. With sMOL Explorer, users can create personal databases by adding each small molecule via a drawing interface or uploading the data files from internal and external projects into the sMOL database. Then, the database can be browsed and queried with textual and structural similarity search. The molecule can also be submitted to search against external public databases including PubChem, KEGG, DrugBank and eMolecules. Moreover, users can easily access a variety of data mining tools from Weka and R packages to perform analysis including (1) finding the frequent substructure, (2) clustering the molecular fingerprints, (3) identifying and removing irrelevant attributes from the data and (4) building the classification model of biological activity. AVAILABILITY: sMOL Explorer is an Open Source project and is freely available to all interested users at http://www.biotec.or.th/ISL/SMOL/.

14.
The applicability of 1H-NMR spectroscopy for the determination of the primary and tertiary structure of carbohydrate-containing molecules is demonstrated. For classes of known compounds the characterization can be based on chemical shifts observed in 1D NMR spectra with or without the aid of a computer database. For more complex structure determinations 2D NMR techniques are required. Here the application of 2D NMR is demonstrated for the primary structure determination of two bacterial exopolysaccharides, and for the spatial structure determination of a disaccharide and a glycoprotein hormone.

15.

Background

Identification of individual components in complex mixtures is an important and sometimes daunting task in several research areas like metabolomics and natural product studies. NMR spectroscopy is an excellent technique for analysis of mixtures of organic compounds and gives a detailed chemical fingerprint of most individual components above the detection limit. For the identification of individual metabolites in metabolomics, correlation or covariance between peaks in 1H NMR spectra has previously been successfully employed. Similar correlation of 2D 1H-13C Heteronuclear Single Quantum Correlation spectra was recently applied to investigate the structure of heparin. In this paper, we demonstrate how a similar approach can be used to identify metabolites in human biofluids (post-prostatic palpation urine).

Results

From 50 1H-13C Heteronuclear Single Quantum Correlation spectra, 23 correlation plots resembling pure metabolites were constructed. The identities of these metabolites were confirmed by comparing the correlation plots with reported NMR data, mostly from the Human Metabolome Database.

Conclusions

Correlation plots prepared by statistically correlating 1H-13C Heteronuclear Single Quantum Correlation spectra from human biofluids provide unambiguous identification of metabolites. The correlation plots highlight cross-peaks belonging to each individual compound and are not limited by long-range magnetization transfer, as conventional NMR experiments are.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0413-z) contains supplementary material, which is available to authorized users.
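The statistical correlation behind such correlation plots can be sketched as follows. This is a minimal illustration, assuming that the spectra are already aligned and stored as one NumPy array, and that a plain Pearson correlation against a single chosen "driver" cross-peak is used; the function name `correlation_plot` and its arguments are hypothetical, and the published procedure may differ in its details.

```python
import numpy as np


def correlation_plot(spectra, driver):
    """Pearson correlation, across samples, of every spectral point with a driver point.

    spectra: array of shape (n_samples, n_h, n_c), one 2D spectrum per sample.
    driver:  (i, j) index of the cross-peak used as the driver.
    Returns an (n_h, n_c) array of correlation coefficients.
    """
    n = spectra.shape[0]
    flat = spectra.reshape(n, -1).astype(float)
    centered = flat - flat.mean(axis=0)
    d = centered[:, np.ravel_multi_index(driver, spectra.shape[1:])]
    cov = centered.T @ d
    norm = np.linalg.norm(centered, axis=0) * np.linalg.norm(d)
    # Points with no variation across samples get a correlation of 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        r = np.where(norm > 0, cov / norm, 0.0)
    return r.reshape(spectra.shape[1:])
```

Points whose intensity rises and falls with the driver peak across samples score close to 1, which is why cross-peaks of the same compound stand out together in the resulting plot.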

16.
PartiGene--constructing partial genomes   Total citations: 4; self-citations: 0; citations by others: 4
Expressed sequence tags (ESTs) offer a low-cost approach to gene discovery and are being used by an increasing number of laboratories to obtain sequence information for a wide variety of organisms. The challenge lies in processing and organizing these data within a genomic context to facilitate large scale analyses. Here we present PartiGene, an integrated sequence analysis suite that uses freely available public domain software to (1) process raw trace chromatograms into sequence objects suitable for submission to dbEST; (2) place these sequences within a genomic context; (3) perform customizable first-pass annotation of the data; and (4) present the data as HTML tables and an SQL database resource. PartiGene has been used to create a number of non-model organism database resources including NEMBASE (http://www.nematodes.org) and LumbriBase (http://www.earthworms.org/). The packages are readily portable, freely available and can be run on simple Linux-based workstations. AVAILABILITY: PartiGene is available from http://www.nematodes.org/PartiGene and also forms part of the EST analysis software associated with the Natural Environment Research Council (UK) Bio-Linux project (http://envgen.nox.ac.uk/biolinux.html).

17.
18.
The deoxyribose hexanucleoside pentaphosphate (m5dC-dG)3 has been studied by 500 MHz 1H NMR in D2O (0.1 M NaCl) and in D2O/deuterated methanol mixtures. Two conformations, in slow equilibrium on the NMR time scale, were detected in methanolic solution. Two-dimensional nuclear Overhauser effect (NOE) experiments were used to assign the base and many of the sugar resonances as well as to determine structural features for both conformations. The results were consistent with an equilibrium in solution between B-DNA and Z-DNA. The majority of the molecules have a B-DNA structure in low-salt D2O and a Z-DNA structure at high methanol concentrations. A cross-strand NOE between methyl groups on adjacent cytosines is observed for Z-DNA but not B-DNA. The B-DNA conformation predominates at low methanol concentrations and is stabilized by increasing temperature, while the Z-DNA conformation predominates at high methanol concentrations and low temperatures. 31P NMR spectra gave results consistent with those obtained by 1H NMR. Comparison of the 31P spectra with those obtained on poly(dG-m5dC) allows assignment of the lower field resonances to GpC in the Z conformation.

19.
A modified Lorentzian distribution function is used to model peaks in two-dimensional (2D) 1H–13C heteronuclear single quantum coherence (HSQC) nuclear magnetic resonance (NMR) spectra. The model fit is used to determine accurate chemical shifts from genuine signals in complex metabolite mixtures such as blood. The algorithm can be used to extract features from a set of spectra from different samples for exploratory metabolomics. First, a reference spectrum is created in which the peak intensities are given by the median value over all samples at each point in the 2D spectra, so that 1H–13C correlations in any spectrum are accounted for. The mathematical model provides a footprint for each peak in the reference spectrum, which can be used to bin the 1H–13C correlations in each HSQC spectrum. The binned intensities are then used as variables in multivariate analyses, and those found to be discriminatory are rapidly identified by cross-referencing the chemical shifts of the bins with a database of 13C and 1H chemical shift correlations from known metabolites.
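The reference-spectrum and binning steps can be sketched in a few lines. This is a rough illustration under stated assumptions: the abstract does not give the form of the modified Lorentzian, so a plain product of two 1D Lorentzians stands in for it, and the function names, the width parameters `wh` and `wc`, and the `cutoff` that defines the footprint are all hypothetical.

```python
import numpy as np


def reference_spectrum(spectra):
    """Point-wise median over samples; spectra has shape (n_samples, n_h, n_c)."""
    return np.median(spectra, axis=0)


def lorentzian_2d(h, c, h0, c0, wh, wc):
    """Plain 2D Lorentzian (assumed stand-in), with height 1 at the peak centre (h0, c0)."""
    return 1.0 / ((1.0 + ((h - h0) / wh) ** 2) * (1.0 + ((c - c0) / wc) ** 2))


def bin_intensity(spectrum, h_axis, c_axis, h0, c0, wh, wc, cutoff=0.5):
    """Sum the intensities of one spectrum inside the footprint of one model peak.

    The footprint is taken as the region where the model exceeds `cutoff`
    (an assumption made for this sketch).
    """
    hh, cc = np.meshgrid(h_axis, c_axis, indexing="ij")
    footprint = lorentzian_2d(hh, cc, h0, c0, wh, wc) >= cutoff
    return float(spectrum[footprint].sum())
```

Calling `bin_intensity` once per peak and per sample yields the table of binned intensities that serves as input to the multivariate analysis.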

20.
As part of a long-term study of the chemical defenses of Norway spruce (Picea abies) against herbivores and pathogens, a phytochemical survey of the phenolics in the bark was carried out. Eight stilbene glucoside dimers, designated as piceasides A-H (1a-4b), were isolated as four 1:1 mixtures of inseparable diastereomers. Their structures were determined by extensive spectroscopic means including 1D (1H and 13C) and 2D NMR (1H-1H COSY, HSQC, HMBC, ROESY) spectra, and were supported by enzymatic hydrolysis and computational analysis.
