Similar documents
20 similar documents retrieved.
1.
Introduction: The availability of big data sets (‘OMICS’) has greatly impacted fundamental and translational science. High-throughput analysis of HLA class I and II associated peptidomes by mass spectrometry (MS) has generated large datasets, with the last decade witnessing tremendous growth in the breadth and number of studies.

Areas covered: We first analyzed naturally processed peptide (NP) data captured within the IEDB to survey and characterize the current state of NP data. We next asked to what extent NP data overlap with existing T cell epitope and MHC binding data.

Expert commentary: The current collection of NP data represents a large and diverse set of class I/II peptides, mostly derived from self-antigens. These data overlap only marginally with existing immunogenicity and binding data, so it is difficult to ascertain the correspondence between the different assay methodologies. This highlights a need for unbiased studies in model antigen systems benchmarking how well MHC binding and NP data predict immunogenicity. Going forward, efforts to generate an integrated process for capturing all NP data, curating associated metadata and accessing NP data from an immunological viewpoint will be important for developing novel methods for identifying optimal target antigens and for class I and II epitope prediction.


2.
Introduction: Multi-omic approaches promise a broader view of cellular processes and a deeper understanding of biological systems. With greatly improved high-throughput methods, the amounts of data generated have become huge, and handling them has become challenging.

Areas covered: New bioinformatic tools and pipelines for the integration of data from different omics disciplines continue to emerge and will help scientists interpret data reliably in the context of biological processes. Comprehensive data integration strategies will fundamentally improve systems biology and systems medicine. To present recent developments in integrative omics, the Göttingen Proteomics Forum (GPF) organized its 6th symposium on 23 November 2017, as part of its series of regular GPF symposia. More than 140 scientists attended the event, which highlighted the challenges and opportunities, but also the caveats, of integrating data from different omics disciplines.

Expert commentary: The continuous exponential growth of omics data requires comparable progress in the software that handles it. Integrative omics tools offer a way to meet this challenge, but thorough investigations and coordinated efforts are required to advance the field.


3.
Introduction: Despite the unquestionable advantages of Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry Imaging (MALDI-MSI) in visualizing the spatial distribution and the relative abundance of biomolecules directly on tissue, the resulting data are complex and high-dimensional. Analysis and interpretation of this large amount of information are therefore mathematically, statistically and computationally challenging.

Areas covered: This article reviews some of the challenges in data analysis, with particular emphasis on machine learning techniques employed in clinical applications, and can serve as an entry point for those who want to study the computational aspects. Several characteristics of data processing are described, highlighting their advantages and disadvantages. Different approaches to data analysis focused on clinical applications are also provided. A practical tutorial based on the Orange Canvas and Weka software is included to help readers become familiar with the data processing.

Expert commentary: MALDI-MSI has recently gained considerable attention and has been successfully employed for research and diagnostic purposes. Data dimensionality constitutes an important issue, and statistical methods for information-preserving data reduction represent one of the most challenging aspects. The most common data reduction methods collect independent observations into a single table; however, incorporating relational information can improve the discriminatory capability of the data.
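The data-reduction step described above can be illustrated with a minimal sketch that is not taken from the reviewed article: an MSI data cube is flattened into the pixel-by-m/z table most reduction methods assume, then reduced with PCA. The array shapes and variable names are hypothetical.

```python
# Minimal sketch: dimensionality reduction of a MALDI-MSI dataset.
# Assumptions (not from the reviewed article): the data cube is already
# loaded as a NumPy array of shape (rows, cols, n_mz_bins).
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
cube = rng.random((50, 60, 500))          # hypothetical 50x60 image, 500 m/z bins

# Collect "independent observations into a single table": one row per pixel.
table = cube.reshape(-1, cube.shape[-1])  # shape (3000, 500)

# Information-preserving reduction: keep components explaining 95% of variance.
pca = PCA(n_components=0.95)
scores = pca.fit_transform(table)
print(scores.shape, pca.explained_variance_ratio_.sum())
```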


4.
5.
Information Quality (IQ) is a critical factor for the success of many activities in the information age, including the development of data warehouses and implementation of data mining. The issue of IQ risk is recognized during the process of data mining; however, there is no formal methodological approach to dealing with such issues.

Consequently, it is essential to measure IQ risk in a data warehouse to ensure success in implementing data mining. This article presents a methodology to determine three IQ risk characteristics: accuracy, comprehensiveness, and non-membership. The methodology provides a set of quantitative models to examine how the quality risks of source information affect the quality of information outputs produced using the relational algebra operations Restriction, Projection, and Cubic product. It can be used to determine how quality risks associated with diverse data sources affect the derived data. The study also develops a data cube model and associated algebra to support IQ risk operations.
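As a rough illustration of the kind of propagation such quantitative models address, the sketch below pushes a simple per-relation accuracy-risk score through three relational operations under an independence assumption. The formulas and function names are hypothetical and are not the models proposed in the article.

```python
# Hypothetical illustration of propagating an accuracy-risk score through
# relational-algebra operations; the formulas assume independent errors and
# are NOT the quantitative models proposed in the article.

def restrict(acc_risk: float) -> float:
    """Restriction (selection) keeps whole tuples, so per-tuple accuracy risk is unchanged."""
    return acc_risk

def project(acc_risk: float, kept_fraction_of_attrs: float) -> float:
    """Projection drops attributes; assume risk scales with the fraction of attributes kept."""
    return acc_risk * kept_fraction_of_attrs

def cubic_product(acc_risk_r: float, acc_risk_s: float) -> float:
    """Product pairs tuples from both relations; a pair is accurate only if both tuples are."""
    return 1.0 - (1.0 - acc_risk_r) * (1.0 - acc_risk_s)

if __name__ == "__main__":
    r, s = 0.05, 0.10                     # hypothetical source accuracy risks
    print(restrict(r))                    # 0.05
    print(project(r, 0.5))                # 0.025
    print(round(cubic_product(r, s), 4))  # 0.145
```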


6.
The submission of multiple sequence alignment data to EMBL has grown 30-fold in the past 10 years, creating an archiving challenge. The EBI has developed a new public database of multiple sequence alignments called EMBL-Align. It has a dedicated web-based submission tool, Webin-Align. Together they represent a comprehensive data management solution for alignment data. Webin-Align accepts all the common alignment formats and can display data in CLUSTALW format as well as a new standard EMBL-Align flat file format. The alignments are stored in the EMBL-Align database and can be queried from the EBI SRS (Sequence Retrieval System) server. AVAILABILITY: Webin-Align: http://www.ebi.ac.uk/embl/Submission/align_top.html, EMBL-Align: ftp://ftp.ebi.ac.uk/pub/databases/embl/align, http://srs.ebi.ac.uk/

7.
Structural systems identification of genetic regulatory networks
MOTIVATION: Reverse engineering of genetic regulatory networks from experimental data is the first step toward the modeling of genetic networks. Linear state-space models, also known as linear dynamical models, have been applied to model genetic networks from gene expression time series data, but existing works have not taken available structural information into account. Without structural constraints, estimated models may contradict biological knowledge and estimation methods may overfit. RESULTS: In this report, we extended expectation-maximization (EM) algorithms to incorporate prior network structure and to estimate genetic regulatory networks that can track and predict gene expression profiles. We applied our method to synthetic data and to SOS data and showed that our method significantly outperforms the regular EM without structural constraints. AVAILABILITY: The Matlab code is available upon request and the SOS data can be downloaded from http://www.weizmann.ac.il/mcb/UriAlon/Papers/SOSData/, courtesy of Uri Alon. Zak's data is available from his website, http://www.che.udel.edu/systems/people/zak.
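To make the idea of structural constraints concrete, here is a minimal sketch (not the authors' structural EM algorithm) of fitting a linear dynamical model x_{t+1} = A x_t + noise while forcing entries of the transition matrix A that correspond to absent edges in a prior network to zero. The network, data and estimation by masked least squares are all simulated and hypothetical.

```python
# Minimal sketch of structure-constrained estimation for a linear gene-network
# model x_{t+1} = A x_t + noise.  This is NOT the authors' structural EM; it
# only illustrates how a prior adjacency mask constrains the transition matrix.
import numpy as np

rng = np.random.default_rng(1)
n_genes, T = 5, 200
mask = rng.random((n_genes, n_genes)) < 0.4       # hypothetical prior network (True = allowed edge)
A_true = mask * rng.normal(0, 0.3, (n_genes, n_genes))

spec = np.max(np.abs(np.linalg.eigvals(A_true)))
if spec > 0.9:
    A_true *= 0.9 / spec                          # keep the simulated system stable

x = np.zeros((T, n_genes))
for t in range(T - 1):                            # simulate an expression time series
    x[t + 1] = A_true @ x[t] + rng.normal(0, 0.1, n_genes)

# Estimate each row of A by least squares, using only the regulators the
# prior structure allows; disallowed entries stay exactly zero.
A_hat = np.zeros_like(A_true)
X_past, X_next = x[:-1], x[1:]
for i in range(n_genes):
    parents = np.flatnonzero(mask[i])
    if parents.size:
        coef, *_ = np.linalg.lstsq(X_past[:, parents], X_next[:, i], rcond=None)
        A_hat[i, parents] = coef

print(np.round(A_hat - A_true, 2))                # estimation error respects the prior structure
```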

8.
TableView is a generalized scientific visualization program for exploration of various biological data, including EST, SAGE, microarray and annotation data. Written in Java, TableView is portable, is easily used together with other software including DBMSs, and is versatile enough to be applied to any tabular data. AVAILABILITY: TableView is freely available at: http://ccgb.umn.edu/software/java/apps/TableView/.

9.
Introduction: The study of microbial communities based on the combined analysis of genomic and proteomic data – called metaproteogenomics – has gained increased research attention in recent years. This relatively young field aims to elucidate the functional and taxonomic interplay of proteins in microbiomes and its implications on human health and the environment.

Areas covered: This article reviews bioinformatics methods and software tools dedicated to the analysis of data from metaproteomics and metaproteogenomics experiments. In particular, it focuses on the creation of tailored protein sequence databases, on the optimal use of database search algorithms including methods of error rate estimation, and finally on taxonomic and functional annotation of peptide and protein identifications.

Expert opinion: Recently, various promising strategies and software tools have been proposed for handling typical data analysis issues in metaproteomics. However, severe challenges remain that are highlighted and discussed in this article; these include: (i) robust false-positive assessment of peptide and protein identifications, (ii) complex protein inference against a background of highly redundant data, (iii) taxonomic and functional post-processing of identification data, and finally, (iv) the assessment and provision of metrics and tools for quantitative analysis.
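For point (i), one common (though not the only) strategy is target-decoy false discovery rate estimation. The sketch below, with made-up scores, shows only the basic decoy counting; it is not the implementation of any tool discussed in the article.

```python
# Minimal target-decoy FDR sketch for peptide-spectrum matches (PSMs).
# Scores and 'is_decoy' flags are made up; real pipelines search a
# concatenated target+decoy database and record which hits are decoys.
from dataclasses import dataclass

@dataclass
class PSM:
    score: float
    is_decoy: bool

def fdr_at_threshold(psms, threshold):
    """Estimate FDR as (#decoy hits) / (#target hits) at or above the score threshold."""
    targets = sum(1 for p in psms if not p.is_decoy and p.score >= threshold)
    decoys = sum(1 for p in psms if p.is_decoy and p.score >= threshold)
    return decoys / targets if targets else 0.0

psms = [PSM(9.1, False), PSM(8.4, False), PSM(7.9, True),
        PSM(7.5, False), PSM(6.2, True), PSM(5.8, False)]
print(round(fdr_at_threshold(psms, 6.0), 2))   # 2 decoys / 3 targets ≈ 0.67
```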


10.
SUMMARY: OTUbase is an R package designed to facilitate the analysis of operational taxonomic unit (OTU) data and sequence classification (taxonomic) data. Currently there are programs that will cluster sequence data into OTUs and/or classify sequence data into known taxonomies. However, there is a need for software that can take the summarized output of these programs and organize it into easily accessed and manipulated formats. OTUbase provides this structure and organization within R, to allow researchers to easily manipulate the data with the rich library of R packages currently available for additional analysis. AVAILABILITY: OTUbase is an R package available through Bioconductor. It can be found at http://www.bioconductor.org/packages/release/bioc/html/OTUbase.html.

11.
ToxoDB: accessing the Toxoplasma gondii genome
ToxoDB (http://ToxoDB.org) provides a genome resource for the protozoan parasite Toxoplasma gondii. Several sequencing projects devoted to T. gondii have been completed or are in progress: an EST project (http://genome.wustl.edu/est/index.php?toxoplasma=1), a BAC clone end-sequencing project (http://www.sanger.ac.uk/Projects/T_gondii/) and an 8X random shotgun genomic sequencing project (http://www.tigr.org/tdb/e2k1/tga1/). ToxoDB was designed to provide a central point of access for all available T. gondii data, and a variety of data mining tools useful for the analysis of unfinished, un-annotated draft sequence during the early phases of the genome project. In later stages, as more and different types of data become available (microarray, proteomic, SNP, QTL, etc.), the database will provide an integrated data analysis platform facilitating user-defined queries across the different data types.

12.
Mediante is a MIAME-compliant microarray data manager that links together annotations and experimental data. Developed as a J2EE three-tier application, Mediante integrates a management system for production of long oligonucleotide microarrays, an experimental data repository suitable for home-made or commercial microarrays, and a user interface dedicated to the management of microarray projects. Several tools allow quality control of hybridizations and submission of validated data to public repositories. AVAILABILITY: http://www.microarray.fr. SUPPLEMENTARY INFORMATION: http://www.microarray.fr/SP/lebrigand2007/

13.
ProServer: a simple, extensible Perl DAS server
SUMMARY: The increasing size and complexity of biological databases have led to a growing trend to federate rather than duplicate them. In order to share data between federated databases, protocols for the exchange mechanism must be developed. One such data exchange protocol that is widely used is the Distributed Annotation System (DAS). For example, DAS has enabled small experimental groups to integrate their data into the Ensembl genome browser. We have developed ProServer, a simple, lightweight, Perl-based DAS server that does not depend on a separate HTTP server. The ProServer package is easily extensible, allowing data to be served from almost any underlying data model. Recent additions to the DAS protocol have enabled both structure and alignment (sequence and structural) data to be exchanged. ProServer allows both of these data types to be served. AVAILABILITY: ProServer can be downloaded from http://www.sanger.ac.uk/proserver/ or CPAN http://search.cpan.org/~rpettett/. Details on the system requirements and installation of ProServer can be found at http://www.sanger.ac.uk/proserver/.

14.
SUMMARY: Visual programming offers an intuitive means of combining known analysis and visualization methods into powerful applications. The system presented here enables users who are not programmers to manage microarray and genomic data flow and to customize their analyses by combining common data analysis tools to fit their needs. AVAILABILITY: http://www.ailab.si/supp/bi-visprog SUPPLEMENTARY INFORMATION: http://www.ailab.si/supp/bi-visprog.

15.
The R package mosclust (model order selection for clustering problems) implements algorithms based on the concept of stability for discovering significant structures in bio-molecular data. The software library provides stability indices obtained through different data perturbation methods (resampling, random projections, noise injection), as well as statistical tests to assess the significance of multi-level structures singled out from the data. Availability: http://homes.dsi.unimi.it/~valenti/SW/mosclust/download/mosclust_1.0.tar.gz. Supplementary information: http://homes.dsi.unimi.it/~valenti/SW/mosclust.
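Since mosclust itself is an R package, the Python sketch below only illustrates the general resampling-based stability idea it builds on, using hypothetical data and scikit-learn clustering; it does not reproduce the package's own indices or statistical tests.

```python
# Stability-based model order selection sketch: cluster two random subsamples,
# then measure agreement of the two labelings on the points they share.
# Illustrates the general idea behind perturbation-based stability indices,
# not the specific indices implemented in mosclust.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(0)
data = np.vstack([rng.normal(c, 0.3, (60, 2)) for c in ((0, 0), (3, 0), (0, 3))])

def stability(k, n_pairs=20, frac=0.8):
    scores = []
    for _ in range(n_pairs):
        a = rng.choice(len(data), int(frac * len(data)), replace=False)
        b = rng.choice(len(data), int(frac * len(data)), replace=False)
        shared = np.intersect1d(a, b)
        la = KMeans(k, n_init=10, random_state=0).fit_predict(data[a])
        lb = KMeans(k, n_init=10, random_state=0).fit_predict(data[b])
        # map shared points back to their positions within each subsample
        pos_a = {idx: i for i, idx in enumerate(a)}
        pos_b = {idx: i for i, idx in enumerate(b)}
        scores.append(adjusted_rand_score([la[pos_a[s]] for s in shared],
                                          [lb[pos_b[s]] for s in shared]))
    return float(np.mean(scores))

for k in (2, 3, 4, 5):
    print(k, round(stability(k), 2))   # the true k = 3 should score highest
```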

16.
During the past decade, molecular techniques have provided a wealth of data that have facilitated the resolution of several controversial questions in polyploid evolution. Herein we have focused on several of these issues: (1) the frequency of recurrent formation of polyploid species; (2) the genetic consequences of multiple polyploidizations within a species; (3) the prevalence and genetic attributes of autopolyploids; and (4) the genetic changes that occur in polyploid genomes following their formation.

Molecular data provide a more dynamic picture of polyploid evolution than has been traditionally espoused. Numerous studies have demonstrated multiple origins of both allopolyploids and autopolyploids. In several polyploid species studied in detail, multiple origins were found to be frequent on a local geographic scale, as well as during a short span of time. Molecular data strongly suggest that recurrent formation of polyploid species is the rule, rather than the exception. In addition, molecular data indicate that recurrent formation of polyploids has important genetic consequences, introducing considerable genetic variation from diploid progenitors into polyploid derivatives.

Molecular data also suggest a much more important role for natural autopolyploids than has been historically envisioned. In contrast to the longstanding view of autopolyploidy as being rare, molecular data continue to reveal steadily increasing numbers of well-documented autopolyploids having tetrasomic or higher-level polysomic inheritance. Although autopolyploidy undoubtedly occurs much less frequently than allopolyploidy in natural populations, it nonetheless has been a significant evolutionary mechanism. Molecular data also provide compelling genetic evidence that contradicts the traditional view of autopolyploidy as being maladaptive. Electrophoretic studies have revealed three important attributes of autopolyploids compared to their diploid progenitors: (1) enzyme multiplicity, (2) increased heterozygosity, and (3) increased allelic diversity. Genetic variability is, in fact, typically substantially higher in autopolyploids than in their diploid progenitors. These genetic attributes of autopolyploids are due to polysomic inheritance and provide strong genetic arguments for the potential success of autopolyploids in nature.

In addition to providing numerous important insights into the formation of polyploids and the immediate genetic consequences of polyploidy, molecular data also have been used to study the subsequent evolution of polyploid genomes. Common hypotheses on the subsequent evolution of polyploid genomes include (1) gene silencing, eventually leading to extensively diploidized polyploid genomes; (2) gene diversification, resulting in regulatory or functional divergence of duplicate genes; and (3) genome diversification, resulting in chromosomal repatterning. Compelling, but limited, genetic evidence for all of these factors has been obtained in molecular analyses of polyploid species. The occurrence of these processes in polyploid genomes indicates that polyploid genomes are plastic and susceptible to evolutionary change.

In summary, molecular data continue to demonstrate that polyploidization and the subsequent evolution of polyploid genomes are very dynamic processes.


17.
18.
MOTIVATION: BioPAX is a standard language for representing and exchanging models of biological processes at the molecular and cellular levels. It is widely used by different pathway databases and genomics data analysis software. Currently, the primary source of BioPAX data is direct exports from the curated pathway databases. It is still uncommon for wet-lab biologists to share and exchange pathway knowledge using BioPAX. Instead, pathways are usually represented as informal diagrams in the literature. In order to encourage formal representation of pathways, we describe a software package that allows users to create pathway diagrams using CellDesigner, a user-friendly graphical pathway-editing tool and save the pathway data in BioPAX Level 3 format. AVAILABILITY: The plug-in is freely available and can be downloaded at ftp://ftp.pantherdb.org/CellDesigner/plugins/BioPAX/ CONTACT: huaiyumi@usc.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

19.
20.
The genetic effective population size, Ne, can be estimated from the average gametic disequilibrium (r²) between pairs of loci, but such estimates require evaluation of assumptions and currently have few methods to estimate confidence intervals. speed-ne is a suite of MATLAB functions to estimate Ne from r², with a graphical user interface and a rich set of outputs that aid in understanding data patterns and comparing multiple estimators. speed-ne includes functions to either generate or input simulated genotype data to facilitate comparative studies of estimators under various population genetic scenarios. speed-ne was validated with data simulated under both time-forward and time-backward coalescent models of genetic drift. Three classes of estimators were compared with simulated data to examine several general questions: what are the impacts of microsatellite null alleles on r², how should missing data be treated, and does disequilibrium contributed by reduced recombination among some loci in a sample impact Ne estimates. Estimators differed greatly in precision in the scenarios examined, and a widely employed estimator exhibited the largest variances among replicate data sets. speed-ne implements several jackknife approaches to estimate confidence intervals, and simulated data showed that jackknifing over loci and jackknifing over individuals provided ~95% confidence interval coverage for some estimators and should be useful for empirical studies. speed-ne provides an open-source, extensible tool for estimation of Ne from empirical genotype data and for conducting simulations of both microsatellite and single nucleotide polymorphism (SNP) data types to develop expectations and to compare estimators.
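As background to the estimators being compared, the classical approximation for unlinked loci under random mating (often attributed to Hill 1981, with later bias corrections by Waples) is E[r²] ≈ 1/(3Ne) + 1/S for a sample of S individuals, which inverts to N̂e ≈ 1/(3(r̂² − 1/S)). The sketch below applies that textbook approximation to made-up r² values; it is not speed-ne's implementation, which provides multiple estimators, bias corrections and jackknife confidence intervals.

```python
# Classical LD-based Ne approximation for unlinked loci and random mating:
#   E[r^2] ~ 1/(3*Ne) + 1/S   =>   Ne_hat ~ 1/(3*(mean_r2 - 1/S))
# Textbook sketch with made-up r^2 values, not speed-ne's estimator.
import numpy as np

def ld_ne(r2_values, sample_size):
    """Point estimate of Ne from mean pairwise r^2, correcting for sampling (1/S)."""
    adjusted = np.mean(r2_values) - 1.0 / sample_size
    return np.inf if adjusted <= 0 else 1.0 / (3.0 * adjusted)

r2 = np.array([0.012, 0.018, 0.009, 0.015, 0.011])   # hypothetical pairwise r^2
print(round(ld_ne(r2, sample_size=200), 1))          # ~41.7 under these assumptions
```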
