Use of a structural alphabet to find compatible folds for amino acid sequences
Authors:Swapnil Mahajan  Alexandre G de Brevern  Yves‐Henri Sanejouand  Narayanaswamy Srinivasan  Bernard Offmann
Institution:1. Université de La Réunion, DSIMB, UMR‐S S1134, La Réunion, France;2. INSERM, UMR‐S 1134, DSIMB, Paris, France;3. Laboratoire d'Excellence, GR‐Ex, Paris, France;4. Université de Nantes, UFIP CNRS UMR 6286 Faculté des Sciences et Techniques, Nantes Cedex 03, France;5. Univ Paris Diderot, Sorbonne Paris Cité, UMR‐S 1134, Paris, France;6. Institut National de la Transfusion Sanguine (INTS), Paris, France;7. Molecular Biophysics Unit, Indian Institute of Science, Bangalore, Karnataka, India;8. Peaccel, Inc., Cambridge, Massachusetts
Abstract:The structural annotation of proteins for which sequence‐search methods detect no homologs of known 3D structure remains a major challenge. We propose an original method that uses a structural alphabet known as “Protein Blocks” (PBs) to compute the conditional probability that the amino‐acid sequence of a protein fits a known protein 3D structure. PBs constitute a library of 16 local structural prototypes that approximate every part of protein backbone structures. They are used to encode 3D protein structures into 1D PB sequences and to capture sequence‐to‐structure relationships. Our method relies on amino acid occurrence matrices, one for each PB, to score global and local threading of query amino acid sequences onto protein folds encoded as PB sequences. It uses no information from residue contacts or sequence‐search methods, and it does not explicitly incorporate the hydrophobic effect. The performance of the method was assessed with independent test datasets derived from SCOP 1.75A. With a Z‐score cutoff that achieved 95% specificity (i.e., less than 5% false positives), global and local threading showed sensitivities of 64.1% and 34.2%, respectively. We further tested the method on 57 difficult CASP10 targets that had no known homologs in the PDB: our approach identified 38 compatible templates, and 66% of these hits yielded correctly predicted structures. The method scales up well and is promising for structural annotation at the genomic level. It has been implemented as a freely available web server at http://www.bo-protscience.fr/forsa.
Keywords:protein structures  structural alphabet  fold recognition  protein domains  threading  sequence–structure relationship  structural annotation  protein blocks
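
The scoring idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a gapless, equal‐length alignment of the query to the template, a single amino acid count vector per PB, log‐odds scores with a pseudocount, and a Z‐score taken against shuffled versions of the query. The function names, the pseudocount and the shuffling null model are all hypothetical choices made for illustration; only the 16‐letter PB alphabet (a to p) and the use of per‐PB occurrence matrices come from the abstract.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PB_ALPHABET = "abcdefghijklmnop"  # the 16 Protein Blocks


def log_odds_matrix(pairs, pseudocount=1.0):
    """Build log(P(aa | PB) / P(aa)) from (amino acid seq, PB seq) pairs.

    Assumption: one count vector per PB; the pseudocount avoids log(0).
    """
    counts = {pb: {aa: pseudocount for aa in AMINO_ACIDS} for pb in PB_ALPHABET}
    for aa_seq, pb_seq in pairs:
        for aa, pb in zip(aa_seq, pb_seq):
            counts[pb][aa] += 1
    total = sum(sum(row.values()) for row in counts.values())
    background = {
        aa: sum(counts[pb][aa] for pb in PB_ALPHABET) / total for aa in AMINO_ACIDS
    }
    scores = {}
    for pb, row in counts.items():
        row_total = sum(row.values())
        scores[pb] = {
            aa: math.log((row[aa] / row_total) / background[aa]) for aa in AMINO_ACIDS
        }
    return scores


def global_score(aa_seq, pb_seq, scores):
    """Sum per-position log-odds for a gapless, full-length threading."""
    if len(aa_seq) != len(pb_seq):
        raise ValueError("this sketch only handles equal-length sequences")
    return sum(scores[pb][aa] for aa, pb in zip(aa_seq, pb_seq))


def z_score(aa_seq, pb_seq, scores, n_shuffles=200, seed=0):
    """Z-score of the observed score against shuffled queries (assumed null)."""
    rng = random.Random(seed)
    observed = global_score(aa_seq, pb_seq, scores)
    residues = list(aa_seq)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)
        null.append(global_score(residues, pb_seq, scores))
    mean = sum(null) / len(null)
    std = math.sqrt(sum((x - mean) ** 2 for x in null) / len(null))
    return (observed - mean) / std if std > 0 else 0.0


# Toy training data (invented for illustration, not from SCOP).
training = [("AAAAKKKK", "mmmmdddd")] * 5
scores = log_odds_matrix(training)
print(global_score("AAAAKKKK", "mmmmdddd", scores))
print(z_score("AAAAKKKK", "mmmmdddd", scores))
```

In this toy setting a query that matches the training composition scores higher than a mismatched one, and templates whose Z‐score exceeds a chosen cutoff would be reported as compatible folds. Local threading, as mentioned in the abstract, would additionally require scoring sub‐segments, which this sketch does not attempt.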