Usage of a dataset of NMR resolved protein structures to test aggregation versus solubility prediction algorithms

Authors:	Daniel B Roche, Etienne Villain, Andrey V Kajava

Institution:	1. Centre de Recherche en Biologie cellulaire de Montpellier, CNRS-UMR 5237, Montpellier, France; 2. Institut de Biologie Computationnelle, Université de Montpellier, Montpellier, France; 3. University ITMO, 49 Kronverksky Pr, 197101, St. Petersburg, Russia

Abstract:	There has been increased interest in computational methods for amyloid and/or aggregate prediction, due to the prevalence of these aggregates in numerous diseases and their recently discovered functional importance. To evaluate these methods, several datasets have been compiled. Typically, aggregation-prone regions of proteins, which form aggregates or amyloids in vivo, are more than 15 residues long and intrinsically disordered. However, the number of such experimentally established amyloid-forming and non-forming sequences is limited, not exceeding one hundred entries in existing databases. In this work, we parsed all available NMR-resolved protein structures from the PDB and assembled a new, sevenfold larger dataset of unfolded sequences that are soluble at high concentrations. We propose using these sequences as a negative set for evaluating methods that predict aggregation in vivo. We also present the results of benchmarking cutting-edge tools for the prediction of aggregation versus solubility propensity.

Keywords:	NMR; soluble; database; aggregation; 3D structure; amyloid fibrils; computational approaches
