Similar articles
 20 similar articles found (search time: 375 ms)
1.
MOTIVATION: Compensating alterations during the evolution of protein families give rise to coevolving positions that contain important structural and functional information. However, a high background composed of random noise and phylogenetic components interferes with the identification of coevolving positions. RESULTS: We have developed a rapid, simple and general method based on information theory that accurately estimates the level of background mutual information for each pair of positions in a given protein family. Removal of this background results in a metric, MIp, that correctly identifies substantially more coevolving positions in protein families than any existing method. A significant fraction of these positions coevolve strongly with one or only a few positions. The vast majority of such position pairs are in contact in representative structures. The identification of strongly coevolving position pairs can be used to impose significant structural limitations and should be an important additional constraint for ab initio protein folding. AVAILABILITY: Alignments and program files can be found in the Supplementary Information.
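
The abstract above does not spell out how the background is estimated. As a rough sketch only, the following assumes the background for a pair of columns is the product of each column's mean mutual information divided by the overall mean (the "average product correction" form usually associated with MIp), and that MIp is the raw mutual information minus that term. The function names, the toy alignment, the use of natural logarithms, and the lack of sequence weighting or gap handling are all assumptions made for illustration, not the authors' program.

```python
import math
from collections import Counter


def mutual_information(col_a, col_b):
    """Mutual information (in nats) between two alignment columns.

    Each column is a string with one character per sequence.
    """
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        # p_ab / (p_a * p_b), written with raw counts
        mi += p_ab * math.log(p_ab * n * n / (count_a[a] * count_b[b]))
    return mi


def mi_and_mip(alignment):
    """Return (mi, mip) dicts keyed by column pairs (i, j) with i < j.

    Assumed background: mean MI of column i times mean MI of column j,
    divided by the mean MI over all pairs. Needs at least three columns
    for the correction to be informative.
    """
    length = len(alignment[0])
    cols = ["".join(seq[i] for seq in alignment) for i in range(length)]
    mi = {}
    for i in range(length):
        for j in range(i + 1, length):
            mi[(i, j)] = mutual_information(cols[i], cols[j])
    # Mean MI of each column with every other column.
    col_mean = [
        sum(mi[(min(i, j), max(i, j))] for j in range(length) if j != i)
        / (length - 1)
        for i in range(length)
    ]
    overall = sum(mi.values()) / len(mi)
    mip = {}
    for (i, j), value in mi.items():
        background = col_mean[i] * col_mean[j] / overall if overall > 0 else 0.0
        mip[(i, j)] = value - background
    return mi, mip


# Hypothetical toy alignment: columns 0 and 1 covary, column 2 is constant.
toy = ["ACA", "ACA", "GTA", "GTA"]
mi, mip = mi_and_mip(toy)
```

In this toy case the covarying pair keeps a positive score after the correction, while pairs involving the constant column stay at zero. A real alignment would need many more sequences for the estimate to be meaningful.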

2.

Background  

Profile-based analysis of multiple sequence alignments (MSA) allows for accurate comparison of protein families. Here, we address the problems of detecting statistically confident dissimilarities between (1) an MSA position and a set of predicted residue frequencies, and (2) two MSA positions. These problems are important for (i) evaluation and optimization of methods predicting residue occurrence at protein positions; (ii) detection of potentially misaligned regions in automatically produced alignments and their further refinement; and (iii) detection of sites that determine functional or structural specificity in two related families.

3.
The millions of protein sequences generated by genomics are expected to transform protein engineering and personalized medicine. To achieve these goals, tools for predicting outcomes of amino acid changes must be improved. Currently, advances are hampered by insufficient experimental data about nonconserved amino acid positions. Since the property “nonconserved” is identified using a sequence alignment, we designed experiments to recapitulate that context: Mutagenesis and functional characterization were carried out in 15 LacI/GalR homologs (rows) at 12 nonconserved positions (columns). Multiple substitutions were made at each position, to reveal how various amino acids of a nonconserved column were tolerated in each protein row. Results showed that amino acid preferences of nonconserved positions were highly context-dependent, had few correlations with physico-chemical similarities, and were not predictable from their occurrence in natural LacI/GalR sequences. Further, unlike the “toggle switch” behaviors of conserved positions, substitutions at nonconserved positions could be rank-ordered to show a “rheostatic”, progressive effect on function that spanned several orders of magnitude. Comparisons to various sequence analyses suggested that conserved and strongly co-evolving positions act as functional toggles, whereas other important, nonconserved positions serve as rheostats for modifying protein function. Both the presence of rheostat positions and the sequence analysis strategy appear to be generalizable to other protein families and should be considered when engineering protein modifications or predicting the impact of protein polymorphisms.

4.
Compensatory substitutions happen when one mutation is advantageously selected because it restores the loss of fitness induced by a previous deleterious mutation. How frequently such mutations occur in evolution, and what structural and functional context permits their emergence, remain open questions. We built an atlas of intra-protein compensatory substitutions using a phylogenetic approach and a dataset of 1,630 bacterial protein families for which high-quality sequence alignments and experimentally derived protein structures were available. We identified more than 51,000 positions coevolving by means of predicted compensatory mutations. Using the evolutionary and structural properties of the analyzed positions, we demonstrate that compensatory mutations are scarce (typically only a few in the protein history) but widespread (the majority of proteins experienced at least one). Typical coevolving residues are evolving slowly, are located in the protein core outside secondary structure motifs, and are more often in contact than expected by chance, even after accounting for their evolutionary rate and solvent exposure. An exception to this general scheme is residues coevolving for charge compensation, which are evolving faster than noncoevolving sites, in contradiction with predictions from simple coevolutionary models, but similar to stem pairs in RNA. While sites with a significant pattern of coevolution by compensatory mutations are rare, the comparative analysis of hundreds of structures ultimately permits a better understanding of the link between the three-dimensional structure of a protein and its fitness landscape.

5.
Correlated changes of nucleic or amino acids have provided strong information about the structures and interactions of molecules. Despite the rich literature in coevolutionary sequence analysis, previous methods often have to trade off between generality, simplicity, phylogenetic information, and specific knowledge about interactions. Furthermore, despite the evidence of coevolution in selected protein families, a comprehensive screening of coevolution among all protein domains is still lacking. We propose an augmented continuous-time Markov process model for sequence coevolution. The model can handle different types of interactions, incorporate phylogenetic information and sequence substitution, has only one extra free parameter, and requires no knowledge about interaction rules. We apply this model to large-scale screenings of the entire protein domain database (Pfam). Strikingly, with 0.1 trillion tests executed, the majority of the inferred coevolving protein domains are functionally related, and the coevolving amino acid residues are spatially coupled. Moreover, many of the coevolving positions are located at functionally important sites of proteins/protein complexes, such as the subunit linkers of superoxide dismutase, the tRNA binding sites of ribosomes, the DNA binding region of RNA polymerase, and the active and ligand binding sites of various enzymes. The results suggest sequence coevolution manifests structural and functional constraints of proteins. The intricate relations between sequence coevolution and various selective constraints are worth pursuing at a deeper level.

6.
The representation of protein structures as small-world networks facilitates the search for topological determinants, which may relate to functionally important residues. Here, we aimed to investigate the performance of residue centrality, viewed as a family fold characteristic, in identifying functionally important residues in protein families. Our study is based on 46 families, including 29 enzyme and 17 non-enzyme families. A total of 80% of these central positions corresponded to active site residues or residues in direct contact with these sites. For enzyme families, this percentage increased to 91%, while for non-enzyme families the percentage decreased substantially to 48%. A total of 70% of these central positions are located in catalytic sites in the enzyme families, 64% are in hetero-atom binding sites in those families binding hetero-atoms, and only 16% belong to protein-protein interfaces in families with protein-protein interaction data. These differences reflect the active site shape: enzyme active sites are located in surface clefts, hetero-atom binding residues are in deep cavities, while protein-protein interactions involve a more planar configuration. On the other hand, not all surface cavities or clefts are composed of central residues. Thus, closeness centrality identifies functionally important residues in enzymes. While here we focus on binding sites, we expect to identify key residues for the integration and transmission of the information to the rest of the protein, reflecting the relationship between fold and function. Residue centrality is more conserved than the protein sequence, emphasizing the robustness of protein structures.
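
To make the closeness centrality mentioned above concrete, here is a small sketch that scores each node of an unweighted, undirected graph by the inverse of its mean shortest-path distance to the nodes it can reach. How the study built its residue networks (contact cutoffs, which atoms count) is not stated in the abstract, so the adjacency dictionary, the function name, and the five-residue path used as input are purely hypothetical.

```python
from collections import deque


def closeness_centrality(adjacency):
    """Closeness centrality for an unweighted, undirected graph.

    adjacency maps each node to an iterable of its neighbours.
    The score is (reachable nodes - 1) / (sum of shortest-path lengths).
    """
    scores = {}
    for source in adjacency:
        # Breadth-first search gives shortest path lengths from source.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return scores


# Hypothetical contact graph: five residues in a simple chain 1-2-3-4-5.
contacts = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
scores = closeness_centrality(contacts)
most_central = max(scores, key=scores.get)
```

On this chain the middle residue scores highest, which matches the intuition that central residues are those with short paths to the rest of the structure.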

7.
Intraprotein side chain contacts can couple the evolutionary process of amino acid substitution at one position to that at another. This coupling, known as residue coevolution, may vary in strength. Conserved contacts thus not only define 3-dimensional protein structure, but also indicate which residue-residue interactions are crucial to a protein's function. Therefore, prediction of strongly coevolving residue-pairs helps clarify molecular mechanisms underlying function. Previously, various coevolution detectors have been employed separately to predict these pairs purely from multiple sequence alignments, while disregarding available structural information. This study introduces an integrative framework that improves the accuracy of such predictions, relative to previous approaches, by combining multiple coevolution detectors and incorporating structural contact information. This framework is applied to the ABC-B and ABC-C transporter families, which include the drug exporter P-glycoprotein involved in multidrug resistance of cancer cells, as well as the CFTR chloride channel linked to cystic fibrosis. The predicted coevolving pairs are further analyzed based on conformational changes inferred from outward- and inward-facing transporter structures. The analysis suggests that some pairs coevolved to directly regulate conformational changes of the alternating-access transport mechanism, while others coevolved to stabilize rigid-body-like components of the protein structure. Moreover, some identified pairs correspond to residues previously implicated in cystic fibrosis.

8.
9.
10.
Many protein pairs that share the same fold do not have any detectable sequence similarity, providing a valuable source of information for studying sequence-structure relationship. In this study, we use a stringent data set of structurally similar, sequence-dissimilar protein pairs to characterize residues that may play a role in the determination of protein structure and/or function. For each protein in the database, we identify amino-acid positions that show residue conservation within both close and distant family members. These positions are termed "persistently conserved". We then proceed to determine the "mutually" persistently conserved (MPC) positions: those structurally aligned positions in a protein pair that are persistently conserved in both pair mates. Because of their intra- and interfamily conservation, these positions are good candidates for determining protein fold and function. We find that 45% of the persistently conserved positions are mutually conserved. A significant fraction of them are located in critical positions for secondary structure determination, they are mostly buried, and many of them form spatial clusters within their protein structures. A substitution matrix based on the subset of MPC positions shows two distinct characteristics: (i) it is different from other available matrices, even those that are derived from structural alignments; (ii) its relative entropy is high, emphasizing the special residue restrictions imposed on these positions. Such a substitution matrix should be valuable for protein design experiments.

11.
12.
Dennis S, Camacho CJ, Vajda S. Proteins, 2000, 38(2): 176-188
To understand water-protein interactions in solution, the electrostatic field is calculated by solving the Poisson-Boltzmann equation, and the free energy surface of water is mapped by translating and rotating an explicit water molecule around the protein. The calculation is applied to T4 lysozyme with data available on the conservation of solvent binding sites in 18 crystallographically independent molecules. The free energy maps around the ordered water sites provide information on the relationship between water positions in crystal structure and in solution. Results show that almost all conserved sites and the majority of nonconserved sites are within 1.3 Å of local free energy minima. This finding is in sharp contrast to the behavior of randomly placed water molecules in the boundary layer, which, on average, must travel more than 3 Å to the nearest free energy minimum. Thus, the solvation sites are at least partially determined by protein-water interactions rather than by crystal packing alone. The characteristic water residence times, obtained from the free energies at the local minima, are in good agreement with nuclear magnetic resonance experiments. Only about half of the potential sites show up as ordered water in the 1.7 Å resolution X-ray structure. Crystal packing interactions can stabilize weak or mobile potential sites (in fact, some ordered water positions are not close to free energy minima) or can prevent water from occupying certain sites. Apart from a few buried water molecules that are strong binders, the free energies are not very different for conserved and nonconserved sites. We show that conservation of a water site between two crystals occurs if the positions of protein atoms, primarily contributing to the free energy at the local minimum, do not substantially change from one structure to the other. This requirement can be correlated with the nature of the side chain contacting the water molecule in the site.

13.
Tungtur S, Parente DJ, Swint-Kruse L. Proteins, 2011, 79(5): 1589-1608
Concomitant with the genomic era, many bioinformatics programs have been developed to identify functionally important positions from sequence alignments of protein families. To evaluate these analyses, many have used the LacI/GalR family and determined whether positions predicted to be "important" are validated by published experiments. However, we previously noted that predictions do not identify all of the experimentally important positions present in the linker regions of these homologs. In an attempt to reconcile these differences, we corrected and expanded the LacI/GalR sequence set commonly used in sequence/function analyses. Next, a variety of analyses were carried out (1) for the entire LacI/GalR sequence set and (2) for a subset of homologs with functionally-important "YxPxxxAxxL" motifs in their linkers. This strategy was devised to determine whether predictions could be improved by knowledge-based sequence sorting and, for some analyses, did increase the number of linker positions identified. However, two functionally important linker positions were not reliably identified by any analysis. Finally, we compared the new predictions to all known experimental data for E. coli LacI and three homologous linkers. From these, we estimate that >50% of positions are important to the functions of the LacI/GalR homologs. As a corollary, neutral positions might occur less frequently and might be easier to detect in sequence analyses. Although analyses have successfully guided mutations that partially exchange protein functions, a better experimental understanding of the sequence/function relationships in protein families would be helpful for uncovering the remaining rules used by nature to evolve new protein functions.

14.
Interspecific comparisons of protein sequences can reveal regions of evolutionary conservation that are under purifying selection because of functional constraints. Interpreting these constraints requires combining evolutionary information with structural, biochemical, and physiological data to understand the biological function of conserved regions. We take this integrative approach to investigate the evolution and function of the nuclear-encoded subunits of cytochrome c oxidase (COX). We find that the nuclear-encoded subunits evolved subsequent to the origin of mitochondria and the subunit composition of the holoenzyme varies across diverse taxa that include animals, yeasts, and plants. By mapping conserved amino acids onto the crystal structure of bovine COX, we show that conserved residues are structurally organized into functional domains. These domains correspond to some known functional sites as well as to other uncharacterized regions. We find that amino acids that are important for structural stability are conserved at frequencies higher than expected within each taxon, and groups of conserved residues cluster together at distances of less than 5 Å more frequently than do randomly selected residues. We, therefore, suggest that selection is acting to maintain the structural foundation of COX across taxa, whereas active sites vary or coevolve within lineages.
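
The comparison of conserved residues against randomly selected residues described above can be illustrated with a simple resampling sketch. It assumes one representative 3D coordinate per residue and compares the fraction of conserved residue pairs closer than 5 Å with the same fraction in random residue subsets of equal size. The abstract does not describe the actual statistical procedure, so the function names, the coordinates, the number of trials, and the fixed seed are all hypothetical.

```python
import itertools
import math
import random


def fraction_close(coords, subset, cutoff=5.0):
    """Fraction of residue pairs in subset closer than cutoff (same units as coords)."""
    pairs = list(itertools.combinations(subset, 2))
    if not pairs:
        return 0.0
    close = sum(1 for i, j in pairs if math.dist(coords[i], coords[j]) < cutoff)
    return close / len(pairs)


def clustering_p_value(coords, conserved, trials=1000, cutoff=5.0, seed=0):
    """Resampling estimate of how often random subsets cluster at least as tightly."""
    rng = random.Random(seed)
    observed = fraction_close(coords, conserved, cutoff)
    everyone = list(range(len(coords)))
    hits = 0
    for _ in range(trials):
        sample = rng.sample(everyone, len(conserved))
        if fraction_close(coords, sample, cutoff) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (trials + 1)


# Hypothetical coordinates: three nearby residues plus seven scattered ones.
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
coords += [(20.0 * k, 50.0, 50.0) for k in range(1, 8)]
observed, p_value = clustering_p_value(coords, conserved=[0, 1, 2])
```

A small estimated p-value here only says that the chosen subset is more tightly packed than random subsets of the same size in this toy layout. Real analyses would also need to control for burial and sequence neighbours.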

15.
16.
Multiple laboratory studies have evolved hosts against a nonevolving pathogen to address questions about evolution of immune responses. However, an ecologically more relevant scenario is one where hosts and pathogens can coevolve. Such coevolution between the antagonists, depending on the mutual selection pressure and additive variance in the respective populations, can potentially lead to a different pattern of evolution in the hosts compared to a situation where the host evolves against a nonevolving pathogen. In the present study, we used Drosophila melanogaster as the host and Pseudomonas entomophila as the pathogen. We let the host populations either evolve against a nonevolving pathogen or coevolve with the same pathogen. We found that the coevolving hosts on average evolved higher survivorship against the coevolving pathogen and ancestral (nonevolving) pathogen relative to the hosts evolving against a nonevolving pathogen. The coevolving pathogens evolved greater ability to induce host mortality even in nonlocal (novel) hosts compared to infection by an ancestral (nonevolving) pathogen. Thus, our results clearly show that the evolved traits in the host and the pathogen under coevolution can be different from one-sided adaptation. In addition, our results also show that the coevolving host–pathogen interactions can involve certain general mechanisms in the pathogen, leading to increased mortality induction in nonlocal or novel hosts.

17.
Reynolds KA, McLaughlin RN, Ranganathan R. Cell, 2011, 147(7): 1564-1575
Recent work indicates a general architecture for proteins in which sparse networks of physically contiguous and coevolving amino acids underlie basic aspects of structure and function. These networks, termed sectors, are spatially organized such that active sites are linked to many surface sites distributed throughout the structure. Using the metabolic enzyme dihydrofolate reductase as a model system, we show that: (1) the sector is strongly correlated to a network of residues undergoing millisecond conformational fluctuations associated with enzyme catalysis, and (2) sector-connected surface sites are statistically preferred locations for the emergence of allosteric control in vivo. Thus, sectors represent an evolutionarily conserved "wiring" mechanism that can enable perturbations at specific surface positions to rapidly initiate conformational control over protein function. These findings suggest that sectors enable the evolution of intermolecular communication and regulation.

18.
Amino acid substitutions at nonconserved protein positions can have noncanonical and “long-distance” outcomes on protein function. Such outcomes might arise from changes in the internal protein communication network, which is often accompanied by changes in structural flexibility. To test this, we calculated flexibilities and dynamic coupling for positions in the linker region of the lactose repressor protein. This region contains nonconserved positions for which substitutions alter DNA-binding affinity. We first chose to study 11 substitutions at position 52. In computations, substitutions showed long-range effects on flexibilities of DNA-binding positions, and the degree of flexibility change correlated with experimentally measured changes in DNA binding. Substitutions also altered dynamic coupling to DNA-binding positions in a manner that captured other experimentally determined functional changes. Next, we broadened calculations to consider the dynamic coupling between 17 linker positions and the DNA-binding domain. Experimentally, these linker positions exhibited a wide range of substitution outcomes: Four conserved positions tolerated hardly any substitutions (“toggle”), ten nonconserved positions showed progressive changes from a range of substitutions (“rheostat”), and three nonconserved positions tolerated almost all substitutions (“neutral”). In computations with wild-type lactose repressor protein, the dynamic couplings between the DNA-binding domain and these linker positions showed varied degrees of asymmetry that correlated with the observed toggle/rheostat/neutral substitution outcomes. Thus, we propose that long-range and noncanonical substitution outcomes at nonconserved positions arise from rewiring long-range communication among functionally important positions. Such calculations might enable predictions for substitution outcomes at a range of nonconserved positions.

19.
20.