Adaptive machine learning for protein engineering |
| |
Affiliation: | 1. Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, 94305, USA; 2. Stanford ChEM-H, Stanford University, Stanford, CA, 94305, USA; 3. Microsoft Research New England, Cambridge, MA, 02142, USA |
| |
Abstract: | Machine-learning models that learn from data to predict how protein sequence encodes function are emerging as a useful protein engineering tool. However, when using these models to suggest new protein designs, one must deal with the vast combinatorial complexity of protein sequences. Here, we review how to use a sequence-to-function machine-learning surrogate model to select sequences for experimental measurement. First, we discuss how to select sequences through a single round of machine-learning optimization. Then, we discuss sequential optimization, where the goal is to discover optimized sequences and improve the model across multiple rounds of training, optimization, and experimental measurement. |
| |
Keywords: | Machine learning; Protein engineering; Model-based optimization; Adaptive sampling; Bayesian optimization; Gaussian process |
This article is indexed in ScienceDirect and other databases. |
|
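The sequential optimization described in the abstract (train a surrogate, optimize over candidate sequences, measure, repeat) can be illustrated with a minimal sketch. Everything here is an assumption for illustration and not the authors' method: the toy four-letter alphabet, the one-hot encoding, the hand-written Gaussian process with an RBF kernel, the upper-confidence-bound acquisition with top-k batch selection, and the hypothetical `measure` function standing in for a wet-lab assay.

```python
import itertools
import numpy as np

ALPHABET = "ACDE"  # toy alphabet (assumption); real proteins use 20 amino acids
LENGTH = 4         # toy sequence length (assumption)


def one_hot(seq):
    """Encode a sequence as a flat one-hot vector."""
    x = np.zeros((len(seq), len(ALPHABET)))
    for i, aa in enumerate(seq):
        x[i, ALPHABET.index(aa)] = 1.0
    return x.ravel()


def rbf_kernel(A, B, length_scale=2.0):
    """RBF kernel between the rows of A and the rows of B."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)


def gp_posterior(X_train, y_train, X_test, noise=1e-2):
    """Gaussian process posterior mean and standard deviation at X_test."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_test)
    y_mean = y_train.mean()
    mean = y_mean + K_s.T @ np.linalg.solve(K, y_train - y_mean)
    # Prior variance is 1 because the RBF kernel satisfies k(x, x) = 1.
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))


def measure(seq, target="ACDE"):
    """Hypothetical stand-in for an experiment: matches to a hidden target."""
    return float(sum(a == b for a, b in zip(seq, target)))


def sequential_optimization(n_rounds=5, batch_size=8, beta=2.0, seed=0):
    """Alternate between fitting the surrogate and measuring a new batch."""
    rng = np.random.default_rng(seed)
    pool = ["".join(p) for p in itertools.product(ALPHABET, repeat=LENGTH)]
    X_pool = np.array([one_hot(s) for s in pool])

    # Round 0: measure a random batch to obtain initial training data.
    measured = {}
    for i in rng.choice(len(pool), size=batch_size, replace=False):
        measured[pool[i]] = measure(pool[i])

    for _ in range(n_rounds):
        seqs = list(measured)
        X_train = np.array([one_hot(s) for s in seqs])
        y_train = np.array([measured[s] for s in seqs])
        mean, std = gp_posterior(X_train, y_train, X_pool)

        # Upper confidence bound: favor high predictions and high uncertainty.
        ucb = mean + beta * std
        for i, s in enumerate(pool):
            if s in measured:
                ucb[i] = -np.inf  # never re-measure a sequence

        # Simple batch heuristic: take the top-scoring unmeasured sequences.
        for i in np.argsort(ucb)[::-1][:batch_size]:
            measured[pool[i]] = measure(pool[i])
    return measured


if __name__ == "__main__":
    results = sequential_optimization()
    best = max(results, key=results.get)
    print(best, results[best])
```

Enumerating the full candidate pool only works for a toy space like this one. For realistic sequence lengths the acquisition function would have to be optimized by a search or generative procedure instead, which is the combinatorial difficulty the abstract points to.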