Discriminating physiological from non-physiological interfaces in structures of protein complexes: A community-wide study
Authors:Hugo Schweke  Qifang Xu  Gerardo Tauriello  Lorenzo Pantolini  Torsten Schwede  Frédéric Cazals  Alix Lhéritier  Juan Fernandez-Recio  Luis Angel Rodríguez-Lumbreras  Ora Schueler-Furman  Julia K. Varga  Brian Jiménez-García  Manon F. Réau  Alexandre M. J. J. Bonvin  Castrense Savojardo  Pier-Luigi Martelli  Rita Casadio  Jérôme Tubiana  Haim J. Wolfson  Romina Oliva  Didier Barradas-Bautista  Tiziana Ricciardelli  Luigi Cavallo  Česlovas Venclovas  Kliment Olechnovič  Raphael Guerois  Jessica Andreani  Juliette Martin  Xiao Wang  Genki Terashi  Daipayan Sarkar  Charles Christoffer  Tunde Aderinwale  Jacob Verburgt  Daisuke Kihara  Anthony Marchand  Bruno E. Correia  Rui Duan  Liming Qiu  Xianjin Xu  Shuang Zhang  Xiaoqin Zou  Sucharita Dey  Roland L. Dunbrack  Emmanuel D. Levy  Shoshana J. Wodak
Affiliation:1. Department of Chemical and Structural Biology, Weizmann Institute of Science, Rehovot, Israel;2. Institute for Cancer Research, Fox Chase Cancer Center, Philadelphia, Pennsylvania, USA;3. Biozentrum, University of Basel & SIB Swiss Institute of Bioinformatics, Basel, Switzerland;4. Centre Inria d'Université Côte d'Azur, Sophia-Antipolis, France;5. Amadeus SAS, Sophia-Antipolis, France;6. Instituto de Ciencias de la Vid y del Vino (ICVV), CSIC-UR-Gobierno de La Rioja, Logroño, Spain;7. Department of Microbiology and Molecular Genetics, The Institute for Medical Research Israel-Canada, Hebrew University-Hadassah Medical School, Jerusalem, Israel;8. Computational Structural Biology Group, Department of Chemistry, Bijvoet Centre, Faculty of Science, Utrecht University, Utrecht, The Netherlands; Zymvol Biomodeling SL, Barcelona, Spain;9. Computational Structural Biology Group, Department of Chemistry, Bijvoet Centre, Faculty of Science, Utrecht University, Utrecht, The Netherlands;10. Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy;11. Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel;12. Department of Sciences and Technologies, University of Naples “Parthenope”, Naples, Italy;13. Kaust Visualization Lab, Core lab Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia;14. Physical Sciences and Engineering Division, Kaust Catalysis Center, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia;15. Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania;16. Institute for Integrative Biology of the Cell (I2BC), Commissariat à l'Energie Atomique, CNRS, Université Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette, France;17. Univ Lyon, Université Claude Bernard Lyon 1, CNRS, UMR 5086 MMSB, Lyon, France;18. Department of Computer Science, Purdue University, West Lafayette, Indiana, USA;19. Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA;20. Laboratory of Protein Design and Immunoengineering, Ecole polytechnique fédérale de Lausanne (EPFL), Lausanne, Switzerland;21. Department of Physics and Astronomy, Department of Biochemistry, Dalton Cardiovascular Research Center, Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri, USA;22. Department of Bioscience and Bioengineering, Indian Institute of Technology Jodhpur, Karwar, Rajasthan, India;23. VIB-VUB Center for Structural Biology, Brussels, Belgium

Abstract:Reliably scoring and ranking candidate models of protein complexes and assigning their oligomeric state from the structure of the crystal lattice represent outstanding challenges. A community-wide effort was launched to tackle these challenges. The latest resources on protein complexes and interfaces were exploited to derive a benchmark dataset consisting of 1677 homodimer protein crystal structures, including a balanced mix of physiological and non-physiological complexes. The non-physiological complexes in the benchmark were selected to bury a similar or larger interface area than their physiological counterparts, making it more difficult for scoring functions to differentiate between them. Next, 252 functions for scoring protein-protein interfaces previously developed by 13 groups were collected and evaluated for their ability to discriminate between physiological and non-physiological complexes. Two combined predictors were then created: a simple consensus score generated using the best-performing score of each of the 13 groups, and a cross-validated Random Forest (RF) classifier. Both approaches showed excellent performance, with areas under the Receiver Operating Characteristic (ROC) curve of 0.93 and 0.94, respectively, outperforming individual scores developed by different groups. Additionally, AlphaFold2 engines recalled the physiological dimers with significantly higher accuracy than the non-physiological set, lending support to the reliability of our benchmark dataset annotations. Optimizing the combined power of interface scoring functions and evaluating it on challenging benchmark datasets appears to be a promising strategy.
Keywords:protein structure  protein interactions  crystal contacts  potential energy  homodimers
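
The abstract reports that a consensus of the groups' best scores outperforms any individual score when judged by ROC AUC. The sketch below illustrates that general idea only; it is not the study's method. The combination rule (averaging rank-normalized scores), the synthetic data, and all function names (`roc_auc`, `rank_normalize`, `consensus`, `make_synthetic`) are assumptions made for illustration, since the abstract does not specify how the consensus was computed. The Random Forest classifier is not reproduced here.

```python
import random


def roc_auc(labels, scores):
    """ROC AUC as the probability that a random positive outscores a
    random negative, counting ties as one half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rank_normalize(scores):
    """Map scores to [0, 1] by rank so that differently scaled scoring
    functions become comparable; tied values share their average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    denom = max(len(scores) - 1, 1)
    return [r / denom for r in ranks]


def consensus(score_columns):
    """Assumed combination rule: mean of the rank-normalized scores."""
    normalized = [rank_normalize(col) for col in score_columns]
    n_items = len(score_columns[0])
    return [sum(col[i] for col in normalized) / len(normalized)
            for i in range(n_items)]


def make_synthetic(n_items=400, n_scores=13, seed=0):
    """Synthetic stand-in for a benchmark: balanced 0/1 labels
    (1 = physiological) and noisy, independent scores per 'group'."""
    rng = random.Random(seed)
    labels = [i % 2 for i in range(n_items)]
    columns = [[label + rng.gauss(0, 1.5) for label in labels]
               for _ in range(n_scores)]
    return labels, columns


labels, columns = make_synthetic()
individual_aucs = [roc_auc(labels, col) for col in columns]
consensus_auc = roc_auc(labels, consensus(columns))
print("best individual AUC:", round(max(individual_aucs), 3))
print("consensus AUC:", round(consensus_auc, 3))
```

On this toy data the consensus benefits because the per-score noise is independent and averages out; real scoring functions are correlated, so the gain on an actual benchmark would depend on how complementary the scores are.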