1.

PackHelix: a tool for helix-sheet packing during protein structure prediction

Hu C Koehl P Max N 《Proteins》2011,79(10):2828-2843

The three‐dimensional structure of a protein is organized around the packing of its secondary structure elements. Predicting the topology and constructing the geometry of structural motifs involving α‐helices and/or β‐strands are therefore key steps for accurate prediction of protein structure. While many efforts have focused on how to pack helices and on how to sample exhaustively the topologies and geometries of multiple strands forming a β‐sheet in a protein, there has been little progress on generating native‐like packings of helices on sheets. We describe a method that can generate the packing of multiple helices on a given β‐sheet for αβα sandwich type protein folds. This method mines the results of a statistical analysis of the conformations of αβ₂ motifs in protein structures to provide input values for the geometric attributes of the packing of a helix on a sheet. It then proceeds with a geometric builder that generates multiple arrangements of the helices on the sheet of interest by sampling through these values and performing consistency checks that guarantee proper loop geometry between the helices and the strands, minimal number of collisions between the helices, and proper formation of a hydrophobic core. The method is implemented as a module of ProteinShop. Our results show that it produces structures that are within 4–6 Å RMSD of the native one, regardless of the number of helices that need to be packed, though this number may increase if the protein has several helices between two consecutive strands in the sequence that pack on the sheet formed by these two strands. Proteins 2011; Published 2011 Wiley‐Liss, Inc.

2.

Helix‐sheet packing in proteins

Chengcheng Hu Patrice Koehl 《Proteins》2010,78(7):1736-1747

The three‐dimensional structure of a protein is organized around the packing of its secondary structure elements. Although much is known about the packing geometry observed between α‐helices and between β‐sheets, there has been little progress on characterizing helix–sheet interactions. We present an analysis of the conformation of αβ₂ motifs in proteins, corresponding to all occurrences of helices in contact with two strands that are hydrogen bonded. The geometry of the αβ₂ motif is characterized by the azimuthal angle θ between the helix axis and an average vector representing the two strands, the elevation angle ψ between the helix axis and the plane containing the two strands, and the distance D between the helix and the strands. We observe that the helix tends to align to the two strands, with a preference for an antiparallel orientation if the two strands are parallel; this preference is diminished for other topologies of the β‐sheet. Side‐chain packing at the interface between the helix and the strands is mostly hydrophobic, with a preference for aliphatic amino acids in the strand and aromatic amino acids in the helix. From the knowledge of the geometry and amino acid propensities of αβ₂ motifs in proteins, we have derived different statistical potentials that are shown to be efficient in picking native‐like conformations among a set of non‐native conformations in well‐known decoy datasets. The information on the geometry of αβ₂ motifs as well as the related statistical potentials have applications in the field of protein structure prediction. Proteins 2010. © 2010 Wiley‐Liss, Inc.
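
The three geometric attributes named in this abstract (θ, ψ, D) can be illustrated with a small vector calculation. The sketch below is not the authors' code: the function name `helix_sheet_geometry`, the way the strand plane is built (from the average strand direction and the vector joining the two strand centers), and the choice of D as the perpendicular distance from the helix center to that plane are all assumptions made for illustration.

```python
import math

def _add(a, b): return tuple(x + y for x, y in zip(a, b))
def _sub(a, b): return tuple(x - y for x, y in zip(a, b))
def _scale(a, k): return tuple(x * k for x in a)
def _dot(a, b): return sum(x * y for x, y in zip(a, b))
def _norm(a): return math.sqrt(_dot(a, a))
def _unit(a): return _scale(a, 1.0 / _norm(a))
def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def helix_sheet_geometry(helix_axis, helix_center,
                         s1_dir, s1_center, s2_dir, s2_center):
    """Return (theta, psi, dist): angles in degrees, distance in input units.

    Hypothetical conventions: strand directions are averaged as given, so for
    antiparallel strands the caller must flip one of them first.
    """
    h = _unit(helix_axis)
    avg = _unit(_add(_unit(s1_dir), _unit(s2_dir)))
    # Plane of the two strands: spanned by avg and the inter-strand vector.
    normal = _unit(_cross(avg, _sub(s2_center, s1_center)))
    out_of_plane = _dot(h, normal)
    # Elevation: angle between the helix axis and the strand plane.
    psi = math.degrees(math.asin(max(-1.0, min(1.0, out_of_plane))))
    # Azimuth: unsigned angle between the in-plane part of the axis and avg.
    proj = _sub(h, _scale(normal, out_of_plane))
    theta = math.degrees(math.atan2(_norm(_cross(avg, proj)), _dot(avg, proj)))
    # Distance: helix center to the strand plane (an assumed definition of D).
    mid = _scale(_add(s1_center, s2_center), 0.5)
    dist = abs(_dot(_sub(helix_center, mid), normal))
    return theta, psi, dist

# Two parallel strands along x, 4.8 units apart; helix tilted out of the plane.
theta, psi, dist = helix_sheet_geometry(
    (-1.0, 0.0, 1.0), (0.0, 2.4, 10.0),
    (1.0, 0.0, 0.0), (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0), (0.0, 4.8, 0.0))
```

The sign of ψ depends on the order in which the two strands are passed, because that order fixes the direction of the plane normal.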

3.

Contact patterns between helices and strands of sheet define protein folding patterns

Kamat AP Lesk AM 《Proteins》2007,66(4):869-876

Comparing and classifying protein folding patterns allows organizing the known structures and enumerating possible protein structural patterns including those not yet observed. We capture the essence of protein folding patterns in a concise tableau representation based on the order and contact patterns of secondary structures: helices and strands of sheet. The tableaux are intelligible to both humans and computers. They provide a database, derived from the Protein Data Bank, mineable in studies of protein architecture. Using this database, we have: (i) determined statistical properties of secondary structure contacts in an unbiased set of protein domains from ASTRAL, (ii) observed that in 98% of cases, the tableau is a faithful representation of the folding pattern as classified in SCOP, (iii) demonstrated that to a large extent the local structure of proteins indicates their complete folding topology, and (iv) studied the use of the representation for fold identification.

4.

BuildBeta—A system for automatically constructing beta sheets

Nelson Max ChengCheng Hu Oliver Kreylos Silvia Crivelli 《Proteins》2010,78(3):559-574

We describe a method that can thoroughly sample a protein conformational space given the protein primary sequence of amino acids and secondary structure predictions. Specifically, we target proteins with β‐sheets because they are particularly challenging for ab initio protein structure prediction because of the complexity of sampling long‐range strand pairings. Using some basic packing principles, inverse kinematics (IK), and β‐pairing scores, this method creates all possible β‐sheet arrangements including those that have the correct packing of β‐strands. It uses the IK algorithms of ProteinShop to move α‐helices and β‐strands as rigid bodies by rotating the dihedral angles in the coil regions. Our results show that our approach produces structures that are within 4–6 Å RMSD of the native one regardless of the protein size and β‐sheet topology although this number may increase if the protein has long loops or complex α‐helical regions. Proteins 2010. © Published 2009 Wiley‐Liss, Inc.

5.

Description and recognition of regular and distorted secondary structures in proteins using the automated protein structure analysis method

Sushilee Ranganathan Dmitry Izotov Elfi Kraka Dieter Cremer 《Proteins》2009,76(2):418-438

The Automated Protein Structure Analysis (APSA) method, which describes the protein backbone as a smooth line in three‐dimensional space and characterizes it by curvature κ and torsion τ as a function of arc length s, was applied to 77 proteins to determine all secondary structural units via specific κ(s) and τ(s) patterns. A total of 533 α‐helices and 644 β‐strands were recognized by APSA, whereas DSSP gives 536 and 651 units, respectively. Kinks and distortions were quantified and the boundaries (entry and exit) of secondary structures were classified. Similarity between proteins can be easily quantified using APSA, as was demonstrated for the roll architecture of proteins ubiquitin and spinach ferredoxin. A twenty‐by‐twenty comparison of all α domains showed that the curvature‐torsion patterns generated by APSA provide an accurate and meaningful similarity measurement for secondary, super secondary, and tertiary protein structure. APSA is shown to accurately reflect the conformation of the backbone effectively reducing three‐dimensional structure information to two‐dimensional representations that are easy to interpret and understand. Proteins 2009. © 2008 Wiley‐Liss, Inc.
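
To make the κ and τ descriptors concrete, the following sketch estimates curvature and torsion of a sampled space curve with finite differences, using the standard formulas κ = |r′ × r″| / |r′|³ and τ = (r′ × r″) · r‴ / |r′ × r″|². It is only an illustration of the quantities involved, not the APSA implementation, which fits a smooth line through the backbone; the function name `curvature_torsion` and the assumption of points sampled at a uniform parameter step `h` are hypothetical.

```python
import math

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def curvature_torsion(points, i, h):
    """Estimate (kappa, tau) at points[i] by central finite differences.

    Assumes the points are sampled at a uniform step h of the curve
    parameter and that i has two neighbours on each side.
    """
    p = points
    d1 = [(p[i + 1][k] - p[i - 1][k]) / (2 * h) for k in range(3)]
    d2 = [(p[i + 1][k] - 2 * p[i][k] + p[i - 1][k]) / h ** 2 for k in range(3)]
    d3 = [(p[i + 2][k] - 2 * p[i + 1][k] + 2 * p[i - 1][k] - p[i - 2][k])
          / (2 * h ** 3) for k in range(3)]
    c = _cross(d1, d2)
    c2 = _dot(c, c)
    kappa = math.sqrt(c2) / math.sqrt(_dot(d1, d1)) ** 3
    # Torsion is undefined on a straight segment; report 0.0 there.
    tau = _dot(c, d3) / c2 if c2 > 0.0 else 0.0
    return kappa, tau

# Ideal helix r(t) = (a cos t, a sin t, b t) with a = 1, b = 0.5,
# for which kappa = a / (a^2 + b^2) and tau = b / (a^2 + b^2).
step = 0.1
helix = [(math.cos(step * j), math.sin(step * j), 0.5 * step * j)
         for j in range(50)]
kappa, tau = curvature_torsion(helix, 25, step)
```

On the ideal helix the estimates should approach the analytic values as the step shrinks, which gives a simple sanity check before applying the same arithmetic to a fitted backbone curve.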

6.

Topology mapping to characterize cyanobacterial bicarbonate transporters: BicA (SulP/SLC26 family) and SbtA

G. Dean Price Susan M. Howitt 《Molecular membrane biology》2014,31(6):177-182

This mini-review addresses advances in understanding the transmembrane topologies of two unrelated, single-subunit bicarbonate transporters from cyanobacteria, namely BicA and SbtA. BicA is a Na⁺-dependent bicarbonate transporter that belongs to the SulP/SLC26 family that is widespread in both eukaryotes and prokaryotes. Topology mapping of BicA via the phoA/lacZ fusion reporter method identified 12 transmembrane helices with an unresolved hydrophobic region just beyond helix 8. Re-interpreting this data in the light of a recent topology study on rat prestin leads to a consensus topology of 14 transmembrane domains with a 7+7 inverted repeat structure. SbtA is also a Na⁺-dependent bicarbonate transporter, but of considerably higher affinity (K_m 2–5 μM versus >100 μM for BicA). Whilst SbtA is widespread in cyanobacteria and a few bacteria, it appears to be absent from eukaryotes. Topology mapping of SbtA via the phoA/lacZ fusion reporter method identified 10 transmembrane helices. The topology consists of a 5+5 inverted repeat, with the two repeats separated by a large intracellular loop. The unusual location of the N and C-termini outside the cell raises the possibility that SbtA forms a novel fold, not so far identified by structural and topological studies on transport proteins.

7.

The phospho-β-galactosidase and synaptotagmin predictions

Steven A. Benner Dietlind Gerloff Gareth Chelvanayagam 《Proteins》1995,23(3):446-453

Two bona fide consensus predictions of secondary and tertiary structure in a protein family, made and announced before experimental structures were known, are evaluated in light of the subsequently determined experimental structures. The first, for phospho-β-galactosidase, identified the core strands of an 8-fold α–β barrel, and identified the 8-fold α–β barrel itself, which was found in the subsequently determined experimental structure to be the core folding domain. The second, for synaptotagmin, identified seven out of eight β-strands in the structure correctly, missing only a noncore strand. Three preferred “topologies” were selected from several hundred thousand possible topologies of these seven predicted strands using a rule-based analysis. The subsequently determined experimental structure showed that these seven strands in synaptotagmin adopt one of the three preferred topologies. We were unable, however, to identify the correct topology from among these three topologies. © 1995 Wiley-Liss, Inc.

8.

Parallel nucleic acid helices with hoogsteen base pairing: Symmetry and structure

G. Raghunathan H. Todd Miles V. Sasisekharan 《Biopolymers》1994,34(12):1573-1581

Molecular structures for parallel DNA and RNA double helices with Hoogsteen pairing are proposed for the first time. The DNA helices have sugars in the C2′-endo region and the phosphodiester conformations are (trans, gauche^?), and the RNA helices have sugars in the C3′-endo region and the phosphodiester conformations are (gauche^?, gauche^?). A pseudorotational symmetry relates the two parallel strands of DNA helices and a screw symmetry relates the two strands of RNA helices, which have an associated tilt of the The conformational space of parallel helices with Hoogsteen base pairing, unlike the Watson-Crick duplex, is highly restricted due to the unique positioning of the symmetry axis in the former case. The features of the parallel double helix with Hoogsteen pairing are compared with the Watson-Crick duplex and the corresponding triple helix. © 1994 John Wiley & Sons, Inc.

9.

Ranking valid topologies of the secondary structure elements using a constraint graph

Al Nasr K Ranjan D Zubair M He J 《Journal of bioinformatics and computational biology》2011,9(3):415-430

Electron cryo-microscopy is a fast advancing biophysical technique to derive three-dimensional structures of large protein complexes. Using this technique, many density maps have been generated at intermediate resolution such as 6–10 Å resolution. Although it is challenging to derive the backbone of the protein directly from such density maps, secondary structure elements such as helices and β-sheets can be computationally detected. Our work in this paper provides an approach to enumerate the top-ranked possible topologies instead of enumerating the entire population of the topologies. This approach is particularly practical for large proteins. We developed a directed weighted graph, the topology graph, to represent the secondary structure assignment problem. We prove that the problem of finding the valid topology with the minimum cost is NP-hard. We developed an O(N² 2^N) dynamic programming algorithm to identify the topology with the minimum cost. The test of 15 proteins suggests that our dynamic programming approach is feasible to work with proteins of much larger size than we could before. The largest protein in the test contains 18 helical sticks detected from the density map out of 33 helices in the protein.
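
A running time of the form O(N² 2^N) is characteristic of dynamic programming over subsets. The sketch below shows that general technique (a Held–Karp style recurrence) on a simplified stand-in problem: finding the cheapest order in which to visit N helical sticks, given a pairwise transition cost. It is not the authors' topology-graph algorithm; the function name `min_cost_ordering` and the cost matrix are hypothetical.

```python
def min_cost_ordering(cost):
    """Minimum total cost of a path that visits every stick exactly once.

    cost[j][k] is the (hypothetical) cost of placing stick k directly
    after stick j. Runs in O(N^2 * 2^N) time and O(N * 2^N) memory.
    """
    n = len(cost)
    inf = float("inf")
    # best[mask][j]: cheapest path covering exactly the sticks in mask,
    # ending at stick j.
    best = [[inf] * n for _ in range(1 << n)]
    for j in range(n):
        best[1 << j][j] = 0.0
    for mask in range(1 << n):
        for j in range(n):
            current = best[mask][j]
            if current == inf:
                continue
            for k in range(n):
                if mask & (1 << k):
                    continue  # stick k is already on the path
                extended = mask | (1 << k)
                candidate = current + cost[j][k]
                if candidate < best[extended][k]:
                    best[extended][k] = candidate
    return min(best[(1 << n) - 1])

# Small asymmetric example with three sticks.
example_cost = [[0, 1, 5],
                [2, 0, 1],
                [4, 3, 0]]
cheapest = min_cost_ordering(example_cost)
```

Compared with enumerating all N! orderings, the subset table trades memory for time, which is what makes instances with a couple of dozen sticks tractable.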

10.

Uncovering symmetry-breaking vector and reliability order for assigning secondary structures of proteins from atomic NMR chemical shifts in amino acids

Yu W Lee W Lee W Kim S Chang I 《Journal of biomolecular NMR》2011,51(4):411-424

Unravelling the complex correlation between chemical shifts of ¹³C^α, ¹³C^β, ¹³C′, ¹H^α, ¹⁵N, ¹H^N atoms in amino acids of proteins from NMR experiment and local structural environments of amino acids facilitates the assignment of secondary structures of proteins. This is an important impetus for both determining the three-dimensional structure and understanding the biological function of proteins. The previous empirical correlation scores which relate chemical shifts of ¹³C^α, ¹³C^β, ¹³C′, ¹H^α, ¹⁵N, ¹H^N atoms to secondary structures led to progress toward assigning secondary structures of proteins. However, the physical-mathematical framework for these was elusive partly due to both the limited and orthogonal exploration of higher-dimensional chemical shifts of hetero-nucleus and the lack of physical-mathematical understanding underlying those correlation scores. Here we present a simple multi-dimensional hetero-nuclear chemical shift score function (MDHN-CSSF) which captures systematically the salient feature of such complex correlations without any references to a random coil state of proteins. We uncover the symmetry-breaking vector and its reliability order not only for distinguishing different secondary structures of proteins but also for capturing the delicate sensitivity interplayed among chemical shifts of ¹³C^α, ¹³C^β, ¹³C′, ¹H^α, ¹⁵N, ¹H^N atoms simultaneously, which then provides a straightforward framework toward assigning secondary structures of proteins. MDHN-CSSF could correctly assign secondary structures of training (validating) proteins with the favourable (comparable) Q3 scores in comparison with those from the previous correlation scores. MDHN-CSSF provides a simple and robust strategy for the systematic assignment of secondary structures of proteins and would facilitate the de novo determination of three-dimensional structures of proteins.

11.

Identifying the Tertiary Fold of Small Proteins with Different Topologies from Sequence and Secondary Structure using the Genetic Algorithm and Extended Criteria Specific for Strand Regions

《Journal of molecular biology》1996,256(3):645-660

Grid-free protein folding simulations based on sequence and secondary structure knowledge (using mostly experimentally determined secondary structure information but also analysing results from secondary structure predictions) were investigated using the genetic algorithm, a backbone representation, and standard dihedral angular conformations. Optimal structures are selected according to basic protein building principles. Having previously applied this approach to proteins with helical topology, we have now developed additional criteria and weights for β-strand-containing proteins, validated them on four small β-strand-rich proteins with different topologies, and tested the general performance of the method on many further examples from known protein structures with mixed secondary structural type and less than 100 amino acid residues. Topology predictions close to the observed experimental structures were obtained in four test cases together with fitness values that correlated with the similarity of the predicted topology to the observed structures. Root-mean-square deviation values of C^α atoms in the superposed predicted and observed structures, the latter of which had different topologies, were between 4.5 and 5.5 Å (2.9 to 5.1 Å without loops). Including 15 further protein examples with unique folds, root-mean-square deviation values ranged between 1.8 and 6.9 Å with loop regions and averaged 5.3 Å and 4.3 Å, including and excluding loop regions, respectively.

12.

Computational de novo design of a four-helix bundle protein—DND_4HB

Grant S Murphy Bharatwaj Sathyamoorthy Bryan S Der Mischa C Machius Surya V Pulavarti Thomas Szyperski Brian Kuhlman 《Protein science : a publication of the Protein Society》2015,24(4):434-445

The de novo design of proteins is a rigorous test of our understanding of the key determinants of protein structure. The helix bundle is an interesting de novo design model system due to the diverse topologies that can be generated from a few simple α-helices. Previously, noncomputational studies demonstrated that connecting amphipathic helices together with short loops can sometimes generate helix bundle proteins, regardless of the bundle's exact sequence. However, using such methods, the precise positions of helices and side chains cannot be predetermined. Since protein function depends on exact positioning of residues, we examined if sequence design tools in the program Rosetta could be used to design a four-helix bundle with a predetermined structure. Helix position was specified using a folding procedure that constrained the design model to a defined topology, and iterative rounds of rotamer-based sequence design and backbone refinement were used to identify a low energy sequence for characterization. The designed protein, DND_4HB, unfolds cooperatively (T_m >90°C) and an NMR solution structure shows that it adopts the target helical bundle topology. Helices 2, 3, and 4 agree very closely with the design model (backbone RMSD = 1.11 Å) and >90% of the core side chain χ1 and χ2 angles are correctly predicted. Helix 1 lies in the target groove against the other helices, but is displaced 3 Å along the bundle axis. This result highlights the potential of computational design to create bundles with atomic-level precision, but also points at remaining challenges for achieving specific positioning between amphipathic helices.

13.

Structures of domains I and IV from YbbR are representative of a widely distributed protein family

Barb AW Cort JR Seetharaman J Lew S Lee HW Acton T Xiao R Kennedy MA Tong L Montelione GT Prestegard JH 《Protein science : a publication of the Protein Society》2011,20(2):396-405

YbbR domains are widespread throughout Eubacteria and are expressed as monomeric units, linked in tandem repeats or cotranslated with other domains. Although the precise role of these domains remains undefined, the location of the multiple YbbR domain‐encoding ybbR gene in the Bacillus subtilis glmM operon and its previous identification as a substrate for a surfactin‐type phosphopantetheinyl transferase suggest a role in cell growth, division, and virulence. To further characterize the YbbR domains, structures of two of the four domains (I and IV) from the YbbR‐like protein of Desulfitobacterium hafniense Y51 were solved by solution nuclear magnetic resonance and X‐ray crystallography. The structures show the domains to have nearly identical topologies despite a low amino acid identity (23%). The topology is dominated by β‐strands, roughly following a “figure 8” pattern with some strands coiling around the domain perimeter and others crossing the center. A similar topology is found in the C‐terminal domain of two stress‐responsive bacterial ribosomal proteins, TL5 and L25. Based on these models, a structurally guided amino acid alignment identifies features of the YbbR domains that are not evident from naïve amino acid sequence alignments. A structurally conserved cis‐proline (cis‐Pro) residue was identified in both domains, though the local structure in the immediate vicinities surrounding this residue differed between the two models. The conservation and location of this cis‐Pro, plus anchoring Val residues, suggest this motif may be significant to protein function.

14.

Amyloidogenic sequences in native protein structures

Susan Tzotzos Andrew J. Doig 《Protein science : a publication of the Protein Society》2010,19(2):327-348

Numerous short peptides have been shown to form β‐sheet amyloid aggregates in vitro. Proteins that contain such sequences are likely to be problematic for a cell, due to their potential to aggregate into toxic structures. We investigated the structures of 30 proteins containing 45 sequences known to form amyloid, to see how the proteins cope with the presence of these potentially toxic sequences, studying secondary structure, hydrogen‐bonding, solvent accessible surface area and hydrophobicity. We identified two mechanisms by which proteins avoid aggregation: Firstly, amyloidogenic sequences are often found within helices, despite their inherent preference to form β structure. Helices may offer a selective advantage, since in order to form amyloid the sequence will presumably have to first unfold and then refold into a β structure. Secondly, amyloidogenic sequences that are found in β structure are usually buried within the protein. Surface exposed amyloidogenic sequences are not tolerated in strands, presumably because they lead to protein aggregation via assembly of the amyloidogenic regions. The use of α‐helices, where amyloidogenic sequences are forced into helix, despite their intrinsic preference for β structure, is thus a widespread mechanism to avoid protein aggregation.

15.

Amino acid composition analysis of human secondary transport proteins and implications for reliable membrane topology prediction

Massoud Saidijam Sonia Azizpour 《Journal of biomolecular structure & dynamics》2017,35(5):929-949

Secondary transporters in humans are a large group of proteins that transport a wide range of ions, metals, organic and inorganic solutes involved in energy transduction, control of membrane potential and osmotic balance, metabolic processes and in the absorption or efflux of drugs and xenobiotics. They are also emerging as important targets for development of new drugs and as target sites for drug delivery to specific organs or tissues. We have performed amino acid composition (AAC) and phylogenetic analyses and membrane topology predictions for 336 human secondary transport proteins and used the results to confirm protein classification and to look for trends and correlations with structural domains and specific substrates and/or function. Some proteins showed statistically high contents of individual amino acids or of groups of amino acids with similar physicochemical properties. One recurring trend was a correlation between high contents of charged and/or polar residues with misleading results in predictions of membrane topology, which was especially prevalent in Mitochondrial Carrier family proteins. We demonstrate how charged or polar residues located in the middle of transmembrane helices can interfere with their identification by membrane topology tools resulting in missed helices in the prediction. Comparison of AAC in the human proteins with that in 235 secondary transport proteins from Escherichia coli revealed similar overall trends along with differences in average contents for some individual amino acids and groups of similar amino acids that are presumed to result from a greater number of functions and complexity in the higher organism.

16.

Membrane topology mapping of the Na+-pumping NADH: quinone oxidoreductase from Vibrio cholerae by PhoA-green fluorescent protein fusion analysis

Duffy EB Barquera B 《Journal of bacteriology》2006,188(24):8343-8351

The membrane topologies of the six subunits of Na⁺-translocating NADH:quinone oxidoreductase (Na⁺-NQR) from Vibrio cholerae were determined by a combination of topology prediction algorithms and the construction of C-terminal fusions. Fusion expression vectors contained either bacterial alkaline phosphatase (phoA) or green fluorescent protein (gfp) genes as reporters of periplasmic and cytoplasmic localization, respectively. A majority of the topology prediction algorithms did not predict any transmembrane helices for NqrA. A lack of PhoA activity when fused to the C terminus of NqrA and the observed fluorescence of the green fluorescent protein C-terminal fusion confirm that this subunit is localized to the cytoplasmic side of the membrane. Analysis of four PhoA fusions for NqrB indicates that this subunit has nine transmembrane helices and that residue T236, the binding site for flavin mononucleotide (FMN), resides in the cytoplasm. Three fusions confirm that the topology of NqrC consists of two transmembrane helices with the FMN binding site at residue T225 on the cytoplasmic side. Fusion analysis of NqrD and NqrE showed almost mirror image topologies, each consisting of six transmembrane helices; the results for NqrD and NqrE are consistent with the topologies of Escherichia coli homologs YdgQ and YdgL, respectively. The NADH, flavin adenine dinucleotide, and Fe-S center binding sites of NqrF were localized to the cytoplasm. The determination of the topologies of the subunits of Na⁺-NQR provides valuable insights into the location of cofactors and identifies targets for mutagenesis to characterize this enzyme in more detail. The finding that all the redox cofactors are localized to the cytoplasmic side of the membrane is discussed.

17.

Optimal mutation sites for PRE data collection and membrane protein structure prediction

Chen H Ji F Olman V Mobley CK Liu Y Zhou Y Bushweller JH Prestegard JH Xu Y 《Structure (London, England : 1993)》2011,19(4):484-495

Nuclear magnetic resonance paramagnetic relaxation enhancement (PRE) measures long-range distances to isotopically labeled residues, providing useful constraints for protein structure prediction. The method usually requires labor-intensive conjugation of nitroxide labels to multiple locations on the protein, one at a time. Here a computational procedure, based on protein sequence and simple secondary structure models, is presented to facilitate optimal placement of a minimum number of labels needed to determine the correct topology of a helical transmembrane protein. Tests on DsbB (four helices) using just one label lead to correct topology predictions in four of five cases, with the predicted structures <6 Å to the native structure. Benchmark results using simulated PRE data show that we can generally predict the correct topology for five and six to seven helices using two and three labels, respectively, with an average success rate of 76% and structures of similar precision. The results show promise in facilitating experimentally constrained structure prediction of membrane proteins.

18.

Understanding the Role of Three-Dimensional Topology in Determining the Folding Intermediates of Group I Introns

Chunxia Chen Somdeb Mitra Magdalena Jonikas Joshua Martin Michael Brenowitz Alain Laederach 《Biophysical journal》2013,104(6):1326-1337

Many RNA molecules exert their biological function only after folding to unique three-dimensional structures. For long, noncoding RNA molecules, the complexity of finding the native topology can be a major impediment to correct folding to the biologically active structure. An RNA molecule may fold to a near-native structure but not be able to continue to the correct structure due to a topological barrier such as crossed strands or incorrectly stacked helices. Achieving the native conformation thus requires unfolding and refolding, resulting in a long-lived intermediate. We investigate the role of topology in the folding of two phylogenetically related catalytic group I introns, the Twort and Azoarcus group I ribozymes. The kinetic models describing the Mg²⁺-mediated folding of these ribozymes were previously determined by time-resolved hydroxyl (⋅OH) radical footprinting. Two intermediates formed by parallel intermediates were resolved for each RNA. These data and analytical ultracentrifugation compaction analyses are used herein to constrain coarse-grained models of these folding intermediates as we investigate the role of nonnative topology in dictating the lifetime of the intermediates. Starting from an ensemble of unfolded conformations, we folded the RNA molecules by progressively adding native constraints to subdomains of the RNA defined by the ⋅OH time-progress curves to simulate folding through the different kinetic pathways. We find that nonnative topologies (arrangement of helices) occur frequently in the folding simulations despite using only native constraints to drive the reaction, and that the initial conformation, rather than the folding pathway, is the major determinant of whether the RNA adopts nonnative topology during folding. From these analyses we conclude that biases in the initial conformation likely determine the relative flux through parallel RNA folding pathways.

19.

Double helix formation in α‐peptides: a theoretical study

Peter Schramm Hans‐Jörg Hofmann 《Journal of peptide science》2010,16(6):276-283

A complete overview on all possible hydrogen bonding patterns of double helices with antiparallel and parallel strand orientation in α‐peptide sequences is provided on the basis of ab initio molecular orbital theory. The most stable representatives belong to the group of antiparallel helices. The study on side chain influence shows that these double helices can only be realized if the strands are composed of L‐ and D‐amino acids in alternate order. The stability of the double helices is compared with that of competing single‐stranded helices. The data contribute to an understanding of secondary structure formation in peptides and provide a basis for a rational design of membrane channels. Copyright © 2010 European Peptide Society and John Wiley & Sons, Ltd.

20.

A new protein folding algorithm based on hydrophobic compactness: Rigid Unconnected Secondary Structure Iterative Assembly (RUSSIA). I: Methodology

Znamenskiy D Chomilier J Le Tuan K Mornon JP 《Protein engineering》2003,16(12):925-935

We present an algorithm that is able to propose compact models of protein 3D structures, only starting from the prediction of the nature and length of regular secondary structures. Helices are modeled by cylinders and sheets by helicoid surfaces, all strands of a sheet being considered as a single block. It means that relative topology of the strands inside one sheet is a prerequisite. Loops are only considered as constraints, given by the maximal distance between their Cα extremities according to their sequence length. Unconnected regular secondary structures are reduced to a single point, the center of their hydrophobic faces. These centers are then repeatedly moved in order to obtain a compact hydrophobic core. To prevent secondary structures from interpenetrating, a repulsive term is introduced in the function whose minimization leads to the compact structure. This RUSSIA (Rigid Unconnected Secondary Structure Assembly) algorithm has the advantage of relying on a small number of variables and therefore many initial conformations can be tested. Flexibility is produced in the following way: helices or sheets are allowed to rotate around the direction leading to the center of the model; residues in a sheet can slide along the main direction of the strand where they are embedded. RUSSIA is fast and simple and it produces on a test set several neighbor good models with an r.m.s. to the native structures in the range 1.4–3.7 Å. These models can be further treated by statistical potentials used in threading approaches in order to detect the best candidate. The limits of the present method are the following: small proteins with few secondary structures are excluded; multi domain proteins must be split into several compact globular domains from their sequences; sheets of more than five strands and completely buried helices are not treated.
In this first paper the algorithm is developed and in Part II, which follows, some applications are presented and the program is evaluated.