Similar Articles
 Found 20 similar articles (search time: 9 ms)
1.
The structure of the domain of unknown function (DUF) YP_001302112.1, a protein secreted by the human intestinal microbiota, has been determined by NMR and represents the first structure for the Pfam family PF14466. Its NMR structure is classified as a new fold, which, nonetheless, shows limited similarities with representatives of the PLAT/LH2 domains from PF01477 and the C2 domains from PF00168, both of which bind Ca2+ for their physiological functions. Further experiments revealed affinity of YP_001302112.1 for Ca2+, and the NMR structure in the presence of CaCl2 was better defined than that of the apo‐protein. Overall, these NMR structures establish a new connection between structural representatives from two widely different Pfams that include the calcium‐binding domain of a sialidase from Vibrio cholerae and the α‐toxin from Clostridium perfringens, whereby these two proteins have only 7% sequence identity. Furthermore, it provides information toward the functional annotation of YP_001302112.1, based on its capacity to bind Ca2+, and thus adds to the structural and functional coverage of the protein sequence universe. © 2013 The Protein Society

2.
A hypothetical protein encoded by the gene YajQ of Haemophilus influenzae was selected, as part of a structural genomics project, for X-ray crystallographic structure determination and analysis to assist with the functional assignment. The protein is present in most bacteria, but not in archaea or eukaryotes. The amino acid sequence has no homology to that of other proteins. The YajQ protein was cloned and expressed, and the crystal structure was determined at 2.1-Å resolution by applying the multiwavelength anomalous dispersion method to a mercury derivative. The polypeptide chain is folded into two domains with identical folding topology. Each domain has a four-stranded antiparallel β-sheet flanked on one side by two α-helices. This structural motif is a characteristic feature of many RNA-binding proteins. The tetrameric structure observed in the crystal suggests a possibility of binding two stretches of double-stranded nucleic acid.

3.
A conserved cis proline residue located in the active site of Thermotoga maritima acetyl esterase (TmAcE) from the carbohydrate esterase family 7 (CE7) has been substituted by alanine. The residue was known to play a crucial role in determining the catalytic properties of the enzyme. To elucidate the structural role of the residue, the crystal structure of the Pro228Ala variant (TmAcEP228A) was determined at 2.1 Å resolution. The replacement does not affect the overall secondary, tertiary, and quaternary structures and moderately decreases the thermal stability. However, the wild type cis conformation of the 227–228 peptide bond adopts a trans conformation in the variant. Other conformational changes in the tertiary structure are restricted to residues 222–226, preceding this peptide bond and are located away from the active site. Overall, the results suggest that the conserved proline residue is responsible for the cis conformation of the peptide and shapes the geometry of the active site. Elimination of the pyrrolidine ring results in the loss of van der Waals and hydrophobic interactions with both the alcohol and acyl moieties of the ester substrate, leading to significant impairment of the activity and perturbation of substrate specificity. Furthermore, a cis‐to‐trans conformational change arising out of residue changes at this position may be associated with the evolution of divergent activity, specificity, and stability properties of members constituting the CE7 family. Proteins 2017; 85:694–708. © 2016 Wiley Periodicals, Inc.

4.
We present here the 2.6 Å resolution crystal structure of the pT26‐6p protein, which is encoded by an ORF of the plasmid pT26‐2, recently isolated from the hyperthermophilic archaeon, Thermococcus sp. 26,2. This large protein is present in all members of a new family of mobile elements that, besides pT26‐2, include several virus‐like elements integrated in the genomes of several Thermococcales and Methanococcales (phylum Euryarchaeota). Phylogenetic analysis suggested that this protein, together with its nearest neighbor (organized as an operon), has coevolved for a long time with the cellular hosts of the encoding mobile element. As the sequences of the N and C‐terminal regions suggested a possible membrane association, a deletion construct (739 amino acids) was used for structural analysis. The structure consists of two very similar β‐sheet domains with a new topology and a five helical bundle C‐terminal domain. Each of these domains corresponds to a unique fold that has presently not been found in cellular proteins. This result supports the idea that proteins encoded by plasmids and viruses that have no cellular homologues could be a reservoir of new folds for structural genomic studies.

5.
We present an in silico method to estimate the contribution of each residue in a protein to its overall stability using three database‐derived statistical potentials that are based on inter‐residue distances, backbone torsion angles and solvent accessibility, respectively. Residues that contribute very unfavorably to the folding free energy are defined as stability weaknesses, whereas residues that show a highly stabilizing contribution are called stability strengths. Strengths and/or weaknesses on residues that are in spatial contact are clustered into 3‐dimensional (3D) stability patches. The identification and analysis of strength‐ and weakness‐containing regions in a protein may reveal structural or functional characteristics, and/or interesting spots to introduce mutations. To illustrate the power of our method, we apply it to bovine seminal ribonuclease. This enzyme catalyzes the degradation of RNA strands, and has the peculiarity of undergoing 3D domain swapping in physiological conditions. The weaknesses and strengths were compared among the monomeric, dimeric and swapped dimeric forms. We identified weaknesses among the catalytic residues and a mixture of weaknesses and strengths among the substrate‐binding residues in the three forms. In the regions involved in 3D swapping, we observed an accumulation of weaknesses in the monomer, which disappear in the dimer and especially in the swapped dimer. Moreover, monomeric homologous proteins were found to exhibit fewer weaknesses in these regions, whereas mutants known to favor unswapped dimerization appear stabilized in this form. Our method has several perspectives for functional annotation, rational prediction of targeted mutations, and mapping of stability changes upon conformational rearrangements. Proteins 2016; 84:143–158. © 2015 Wiley Periodicals, Inc.
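
The clustering step described above (grouping flagged residues that are in spatial contact into 3D patches) can be illustrated with a small connected-components sketch. This is not the authors' implementation: the function name `cluster_patches`, the use of a single representative coordinate per residue, and the 8.0 Å contact cutoff are all assumptions made for illustration, since the abstract does not state how contact is defined.

```python
import math
from collections import deque


def cluster_patches(coords, flagged, cutoff=8.0):
    """Group flagged residues into patches of spatially connected residues.

    coords  : dict mapping residue id -> (x, y, z), one point per residue
              (an assumed simplification).
    flagged : residue ids already labeled as strengths or weaknesses.
    cutoff  : contact distance in angstroms (assumed value, not from the source).
    Returns a list of patches, each a sorted list of residue ids.
    """
    ordered = sorted(set(flagged))
    unseen = set(ordered)
    patches = []
    for start in ordered:
        if start not in unseen:
            continue  # already absorbed into an earlier patch
        unseen.discard(start)
        queue = deque([start])
        patch = []
        while queue:  # breadth-first walk over the contact graph
            res = queue.popleft()
            patch.append(res)
            near = [s for s in unseen
                    if math.dist(coords[res], coords[s]) <= cutoff]
            for s in near:
                unseen.discard(s)
                queue.append(s)
        patches.append(sorted(patch))
    return patches


# Toy coordinates (hypothetical, not taken from any real structure).
toy_coords = {1: (0, 0, 0), 2: (5, 0, 0), 3: (10, 0, 0),
              4: (30, 0, 0), 5: (33, 0, 0)}
print(cluster_patches(toy_coords, [1, 2, 3, 4, 5]))
```

Residues 1 and 3 in the toy data are farther apart than the cutoff but end up in the same patch because both contact residue 2; whether the published method links residues transitively in this way is not stated in the abstract.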

6.
The TAR RNA-binding Protein (TRBP) is a double-stranded RNA (dsRNA)-binding protein, which binds to Dicer and is required for the RNA interference pathway. TRBP consists of three dsRNA-binding domains (dsRBDs). The first and second dsRBDs (dsRBD1 and dsRBD2, respectively) have affinities for dsRNA, whereas the third dsRBD (dsRBD3) binds to Dicer. In this study, we prepared the single domain fragments of human TRBP corresponding to dsRBD1 and dsRBD2 and solved the crystal structure of dsRBD1 and the solution structure of dsRBD2. The two structures contain an α-β-β-β-α fold, which is common to the dsRBDs. The overall structures of dsRBD1 and dsRBD2 are similar to each other, except for a slight shift of the first α helix. The residues involved in dsRNA binding are conserved. We examined the small interfering RNA (siRNA)-binding properties of these dsRBDs by isothermal titration calorimetry measurements. The dsRBD1 and dsRBD2 fragments both bound to siRNA, with dissociation constants of 220 and 113 nM, respectively. In contrast, the full-length TRBP and its fragment with dsRBD1 and dsRBD2 exhibited much smaller dissociation constants (0.24 and 0.25 nM, respectively), indicating that the tandem dsRBDs bind simultaneously to one siRNA molecule. On the other hand, the loop between the first α helix and the first β strand of dsRBD2, but not dsRBD1, has a Trp residue, which forms hydrophobic and cation-π interactions with the surrounding residues. A circular dichroism analysis revealed that the thermal stability of dsRBD2 is higher than that of dsRBD1 and depends on the Trp residue.
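
To put the reported dissociation constants on an energy scale, the standard relation ΔG° = RT ln(Kd) can be applied. The sketch below is illustrative only: the temperature of 298.15 K and the 1 M standard state are assumptions (the abstract does not give the measurement temperature), and the helper name `binding_dg` is hypothetical.

```python
import math

R_KJ = 8.314462618e-3   # gas constant in kJ/(mol*K)
T_ASSUMED = 298.15      # assumed temperature in K; not stated in the abstract


def binding_dg(kd_molar, temperature=T_ASSUMED):
    """Standard binding free energy in kJ/mol from Kd in mol/L (1 M standard state)."""
    return R_KJ * temperature * math.log(kd_molar)


# Dissociation constants quoted in the abstract, converted from nM to M.
kd_values = {
    "dsRBD1": 220e-9,
    "dsRBD2": 113e-9,
    "full-length TRBP": 0.24e-9,
    "dsRBD1+dsRBD2 fragment": 0.25e-9,
}

for name, kd in kd_values.items():
    print(f"{name}: Kd = {kd:.2e} M, dG = {binding_dg(kd):.1f} kJ/mol")

# Fold increase in affinity of the tandem fragment over the tighter single domain.
print(f"fold change: {kd_values['dsRBD2'] / kd_values['dsRBD1+dsRBD2 fragment']:.0f}")
```

A more negative ΔG° for the tandem constructs than for either single domain is consistent with the abstract's conclusion that both dsRBDs engage one siRNA molecule at once.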

7.
Starch synthase III (SSIII), with a total of 1025 residues, is one of the enzymes involved in plant starch synthesis. SSIII from Arabidopsis thaliana contains a putative N-terminal transit peptide followed by a 557-amino acid SSIII-specific domain (SSIII-SD) with three internal repeats and a C-terminal catalytic domain of 450 amino acids. Here, using computational characterization techniques, we show that each of the three internal repeats encodes a starch-binding domain (SBD). Although the SSIII from A. thaliana and its close homologous proteins show no detectable sequence similarity with characterized SBD sequences, the amino acid residues known to be involved in starch binding are well conserved.

8.
We recently determined the first structures of inactivated and calcium-activated calcium-dependent protein kinases (CDPKs) from Apicomplexa. Calcium binding triggered a large conformational change that constituted a new mechanism in calcium signaling and a novel EF-hand fold (CAD, for CDPK activation domain). Thus we set out to determine if this mechanism was universal to all CDPKs. We solved additional CDPK structures, including one from the species Plasmodium. We highlight the similarities in sequence and structure across apicomplexan and plant CDPKs, and strengthen our observations that this novel mechanism could be universal to canonical CDPKs. Our new structures demonstrate more detailed steps in the mechanism of calcium activation and possible key players in regulation. Residues involved in making the largest conformational change are the most conserved across Apicomplexa, leading us to propose that the mechanism is indeed conserved. CpCDPK3_CAD and PfCDPK_CAD were captured at a possible intermediate conformation, lending insight into the order of activation steps. PfCDPK3_CAD adopts an activated fold, despite having an inactive EF-hand sequence in the N-terminal lobe. We propose that for most apicomplexan CDPKs, the mode of activation will be similar to that seen in our structures, while specific regulation of the inactive and active forms will require further investigation.

9.
10.
The domain structure of hog-kidney aminoacylase I was studied by limited proteolytic digestion with trypsin and characterization of the resulting fragments. In the native enzyme, the sequences from residue 6 to 196 and 307 to 406 are resistant to trypsin and remain tightly bound in nondenaturing solvents, while the intervening sequence (197–306) is efficiently degraded by trypsin. We conclude that the N-terminal half of the molecule and its C-terminal fourth form two independently folded domains. Both contain a peculiar PWW(A,L) sequence motif preceded by several strongly polar residues. We propose that these sequences form surface loops that mediate the membrane association of aminoacylase I. We further show that the three free cysteine residues and the essential Zn2+ ion reside in the trypsin-resistant domains, while the intervening sequence contains the only disulfide bond of the protein.

11.
The crystal structures of an unliganded and adenosine 5′‐monophosphate (AMP) bound, metal‐dependent phosphoesterase (YP_910028.1) from Bifidobacterium adolescentis are reported at 2.4 and 1.94 Å, respectively. Functional characterization of this enzyme was guided by computational analysis and then confirmed by experiment. The structure consists of a polymerase and histidinol phosphatase (PHP, Pfam: PF02811) domain with a second domain (residues 105‐178) inserted in the middle of the PHP sequence. The insert domain functions in binding AMP, but the precise function and substrate specificity of this domain are unknown. Initial bioinformatics analyses yielded multiple potential functional leads, with most of them suggesting DNA polymerase or DNA replication activity. Phylogenetic analysis indicated a potential DNA polymerase function that was somewhat supported by global structural comparisons identifying the closest structural match to the alpha subunit of DNA polymerase III. However, several other functional predictions, including phosphoesterase, could not be excluded. Theoretical microscopic anomalous titration curve shapes, a computational method for the prediction of active sites from protein 3D structures, identified potential reactive residues in YP_910028.1. Further analysis of the predicted active site and local comparison with its closest structure matches strongly suggested phosphoesterase activity, which was confirmed experimentally. Primer extension assays on both normal and mismatched DNA show neither extension nor degradation and provide evidence that YP_910028.1 has neither DNA polymerase activity nor DNA‐proofreading activity. These results suggest that many of the sequence neighbors previously annotated as having DNA polymerase activity may actually be misannotated. Proteins 2011. © 2011 Wiley‐Liss, Inc.

12.
Collagens are thought to represent one of the most important molecular innovations in the metazoan line. Basement membrane type IV collagen is present in all Eumetazoa and was found in Homoscleromorpha, a sponge group with a well-organized epithelium, which may represent the first stage of tissue differentiation during animal evolution. In contrast, spongin seems to be a demosponge-specific collagenous protein, which can totally substitute an inorganic skeleton, such as in the well-known bath sponge. In the freshwater sponge Ephydatia mülleri, we previously characterized a family of short-chain collagens that are likely to be main components of spongins. Using a combination of sequence- and structure-based methods, we present evidence of remote homology between the carboxyl-terminal noncollagenous NC1 domain of spongin short-chain collagens and type IV collagen. Unexpectedly, spongin short-chain collagen-related proteins were retrieved in nonsponge animals, suggesting that a family related to spongin constitutes an evolutionary sister to the type IV collagen family. Formation of the ancestral NC1 domain and divergence of the spongin short-chain collagen-related and type IV collagen families may have occurred before the parazoan-eumetazoan split, the earliest divergence among extant animal phyla. Molecular phylogenetics based on NC1 domain sequences suggest distinct evolutionary histories for spongin short-chain collagen-related and type IV collagen families that include spongin short-chain collagen-related gene loss in the ancestors of Ecdysozoa and of vertebrates. The fact that a majority of invertebrates encode spongin short-chain collagen-related proteins raises the important question as to the possible function of its members. Considering the importance of collagens for animal structure and substratum attachment, both families may have played crucial roles in animal diversification.

13.
C A Fields, D L Grady, R K Moyzis. Genomics, 1992, 13(2): 431-436
Fifteen examples of the transposon-like human element (THE) LTR and thirteen examples of the MstII interspersed repeat are aligned to generate new consensus sequences for these human repetitive elements. The consensus sequences of these elements are very similar, indicating that they compose subfamilies of a single human interspersed repetitive sequence family. Members of this highly polymorphic repeat family have been mapped to at least 11 chromosomes. Seven examples of the THE internal sequence are also aligned to generate a new consensus sequence for this element. Estimates of the abundance of this repetitive sequence family, derived from both hybridization analysis and frequency of occurrence in GenBank, indicate that THE-LTR/MstII sequences are present every 100-3000 kb in human DNA. The widespread occurrence of members of this family makes them useful landmarks, like Alu, L1, and (GT)n repeats, for physical and genetic mapping of human DNA.

14.
Szilágyi A. Proteins, 2008, 71(4): 2086-8; discussion 2089-90
In a paper titled "A topologically related singularity suggests a maximum preferred size for protein domains" (Zbilut et al., Proteins 2007;66:621-629), Zbilut et al. claim to have found a singularity in certain geometrical properties of protein structures, and suggest that this singularity may limit the maximum size of protein domains. They find further support for the singularity in their analysis of G-factors calculated by the PROCHECK program. Here, we show that the claimed singularity is a mathematical artifact with no physical meaning, and we reanalyze the G-factors to show that Zbilut et al.'s results are due to a single outlier in the data. Thus, the existence of an actual singularity in the topological properties of proteins is not supported by the findings of Zbilut et al.

15.
16.
An algorithm is presented for the fast and accurate definition of protein structural domains from coordinate data without prior knowledge of the number or type of domains. The algorithm explicitly locates domains that comprise one or two continuous segments of protein chain. Domains that include more than two segments are also located. The algorithm was applied to a nonredundant database of 230 protein structures and the results compared to domain definitions obtained from the literature, or by inspection of the coordinates on molecular graphics. For 70% of the proteins, the derived domains agree with the reference definitions, 18% show minor differences and only 12% (28 proteins) show very different definitions. Three screens were applied to identify the derived domains least likely to agree with the subjective definition set. These screens revealed a set of 173 proteins, 97% of which agree well with the subjective definitions. The algorithm represents a practical domain identification tool that can be run routinely on the entire structural database. Adjustment of parameters also allows smaller compact units to be identified in proteins.

17.
Rapid and accurate functional assignment of novel proteins is increasing in importance, given the completion of numerous genome sequencing projects and the vastly expanding list of unannotated proteins. Traditionally, global primary-sequence and structure comparisons have been used to determine putative function. These approaches, however, do not emphasize similarities in active site configurations that are fundamental to a protein's activity and highly conserved relative to the global and more variable structural features. The Comparison of Protein Active Site Structures (CPASS) database and software enable the comparison of experimentally identified ligand-binding sites to infer biological function and aid in drug discovery. The CPASS database comprises the ligand-defined active sites identified in the Protein Data Bank, and the CPASS program compares these ligand-defined active sites to determine sequence and structural similarity without maintaining sequence connectivity. CPASS will compare any set of ligand-defined protein active sites, irrespective of the identity of the bound ligand.

18.
In this study, we describe the identification of nine novel genes isolated from a unique human first-trimester cDNA library generated from the placental bed. One of these clones, called C2360 and located on chromosome 10q22, was selected as it showed restricted expression in placental bed tissue as well as in JEG3 choriocarcinoma cells with absent expression in adult tissues. We show that the expression is restricted to first-trimester proliferative trophoblasts of the proximal column and show that C2360 is a nuclear protein. No detectable transactivation potential was observed for different domains of the protein. Secondary structure prediction showed that C2360 is a representative member of a eukaryotic family of proteins with a low conservation at the amino acid level, but with strong conservation at the structural level, sharing the general domain (coiled coil 1)-(helix 1)-(coiled coil 2)-(helix 2), or CHCH domain. Each alpha-helix within this domain contains two cysteine amino acids, and these intrahelical cysteines are separated by nine amino acids (C-X(9)-C motif). The fixed position within each helix indicated that both helices could form a hairpin structure stabilized by two interhelical disulfide bonds. Other proteins belonging to the family include estrogen-induced gene 2 and the ethanol-induced 6 protein. The conserved motif was found in yeast, plant, Drosophila, Caenorhabditis elegans, mouse, and human proteins, indicating that the ancestor of this protein family is of eukaryotic origin. These results indicate that C2360 is a representative member of a multifamily of proteins, sharing a protein domain that is conserved in eukaryotes.
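
The C-X(9)-C motif described above (two cysteines separated by exactly nine residues) maps directly onto a regular expression. The following is a minimal sketch for scanning a one-letter amino acid sequence; the function name `find_cx9c` and the toy sequence are hypothetical, and the toy sequence is not the C2360 sequence.

```python
import re

# A cysteine, any nine residues, then a cysteine. The lookahead lets
# overlapping occurrences be reported as separate hits.
CX9C = re.compile(r"(?=(C.{9}C))")


def find_cx9c(sequence):
    """Return 0-based start positions of every C-X(9)-C occurrence."""
    return [m.start() for m in CX9C.finditer(sequence.upper())]


# Hypothetical toy sequence with two motifs separated by a short linker.
toy = "MA" + "C" + "A" * 9 + "C" + "GGGG" + "C" + "L" * 9 + "C" + "K"
hits = find_cx9c(toy)
print(hits)

# A twin C-X(9)-C arrangement, as in a CHCH domain, needs at least two hits.
print("twin motif candidate:", len(hits) >= 2)
```

A pattern match alone says nothing about helical structure or disulfide bonding; it only flags sequences worth inspecting with the secondary structure prediction the abstract mentions.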

19.
20.
The identification and characterization of the structural sites which contribute to protein function are crucial for understanding biological mechanisms, evaluating disease risk, and developing targeted therapies. However, the quantity of known protein structures is rapidly outpacing our ability to functionally annotate them. Existing methods for function prediction either do not operate on local sites, suffer from high false positive or false negative rates, or require large site‐specific training datasets, necessitating the development of new computational methods for annotating functional sites at scale. We present COLLAPSE (Compressed Latents Learned from Aligned Protein Structural Environments), a framework for learning deep representations of protein sites. COLLAPSE operates directly on the 3D positions of atoms surrounding a site and uses evolutionary relationships between homologous proteins as a self‐supervision signal, enabling learned embeddings to implicitly capture structure–function relationships within each site. Our representations generalize across disparate tasks in a transfer learning context, achieving state‐of‐the‐art performance on standardized benchmarks (protein–protein interactions and mutation stability) and on the prediction of functional sites from the PROSITE database. We use COLLAPSE to search for similar sites across large protein datasets and to annotate proteins based on a database of known functional sites. These methods demonstrate that COLLAPSE is computationally efficient, tunable, and interpretable, providing a general‐purpose platform for computational protein analysis.
