Similar articles
 20 similar articles found (search time: 46 ms)
1.
We determined the three-dimensional crystal structure of the protein YML079wp, encoded by a hypothetical open reading frame from Saccharomyces cerevisiae, to a resolution of 1.75 Å. The protein has no close homologs and its molecular and cellular functions are unknown. The structure of the protein is a jelly-roll fold consisting of ten beta-strands organized in two parallel packed beta-sheets. The protein has strong structural resemblance to the plant storage and ligand binding proteins (canavalin, glycinin, auxin binding protein) but also to some plant and bacterial enzymes (epimerase, germin). The protein forms homodimers in the crystal, confirming measurements of its molecular mass in solution. Two monomers have their beta-sheets packed together to form the dimer. The presence of a hydrophobic ligand in a well conserved pocket inside the barrel and local sequence similarity with bacterial epimerases may suggest a biochemical function for this protein.

2.
The solution structure of the 154-residue conserved hypothetical protein HI0004 has been determined using multidimensional heteronuclear NMR spectroscopy. HI0004 has sequence homologs in many organisms ranging from bacteria to humans and is believed to be essential in Haemophilus influenzae, although an exact function has yet to be defined. It has an alpha-beta-alpha sandwich architecture consisting of a central four-stranded beta-sheet with the alpha2-helix packed against one side of the beta-sheet and four alpha-helices (alpha1, alpha3, alpha4, alpha5) on the other side. There is structural homology with the eukaryotic matrix metalloproteases (MMPs), but little sequence similarity except for a conserved region containing three histidines that appears in both the MMPs and throughout the HI0004 family of proteins. The solution structure of HI0004 is compared with the X-ray structure of an Aquifex aeolicus homolog, AQ_1354, which has 36% sequence identity over 148 residues. Despite this level of sequence homology, significant differences exist between the two structures. These differences are described along with possible functional implications of the structures.

3.
Rubisco is a very large, complex protein and one of the most abundant in the world, comprising up to 50% of all soluble protein in plants. The activity of Rubisco, the enzyme that catalyzes CO2 assimilation in photosynthesis, is regulated by Rubisco activase (Rca). In the present study, we searched for hypothetical proteins of Vitis vinifera with putative Rubisco activase function. The Arabidopsis and tobacco Rubisco activase protein sequences were used as seed sequences to search against Vitis vinifera in the UniProtKB database. The selected hypothetical proteins of Vitis vinifera were subjected to sequence, structural and functional annotation. Subcellular localization predictions suggested that they are cytoplasmic proteins. Homology modelling was used to define the three-dimensional (3D) structure of the selected hypothetical proteins of Vitis vinifera. Template search revealed that all the hypothetical proteins share more than 80% sequence identity with the structure of green-type Rubisco activase from tobacco, indicating that the proteins are evolutionarily conserved. The homology models were generated using SWISS-MODEL. Several quality assessment and validation parameters indicated that the homology models are reliable. Further, functional annotation through PFAM, CATH, SUPERFAMILY and CDART suggested that the selected hypothetical proteins of Vitis vinifera contain the ATPase family associated with various cellular activities (AAA) and belong to the AAA+ superfamily of ring-shaped P-loop containing nucleoside triphosphate hydrolases. This study will lead to research on optimizing the functionality of Rubisco, which has large implications for the improvement of plant productivity and resource use efficiency.

4.
Mycobacterium leprae protein ML2640c belongs to a large family of conserved hypothetical proteins predominantly found in mycobacteria, some of them predicted as putative S-adenosylmethionine (AdoMet)-dependent methyltransferases (MTase). As part of a Structural Genomics initiative on conserved hypothetical proteins in pathogenic mycobacteria, we have determined the structure of ML2640c in two distinct crystal forms. As expected, ML2640c has a typical MTase core domain and binds the methyl donor substrate AdoMet in a manner consistent with other known members of this structural family. The putative acceptor substrate-binding site of ML2640c is a large internal cavity, mostly lined by aromatic and aliphatic side-chain residues, suggesting that a lipid-like molecule might be targeted for catalysis. A flap segment (residues 222-256), which isolates the binding site from the bulk solvent and is highly mobile in the crystal structures, could serve as a gateway to allow substrate entry and product release. The multiple sequence alignment of ML2640c-like proteins revealed that the central alpha/beta core and the AdoMet-binding site are very well conserved within the family. However, the amino acid positions defining the binding site for the acceptor substrate display a higher variability, suggestive of distinct acceptor substrate specificities. The ML2640c crystal structures offer the first structural glimpses at this important family of mycobacterial proteins and lend strong support to their functional assignment as AdoMet-dependent methyltransferases.

5.
Genome sequencing projects have led to an explosion in the number of gene products, many of which are hypothetical proteins with unknown function. Analyzing and annotating the functions of hypothetical proteins is important in Staphylococcus aureus, a pathogenic bacterium that causes multiple types of disease by infecting various sites in humans and animals. In this study, ten hypothetical proteins of Staphylococcus aureus were retrieved from NCBI and analyzed for their structural and functional characteristics using various bioinformatics tools and databases. The analysis revealed that some of them possess functionally important domains and families, as well as protein-protein interaction partners, including ABC transporter ATP-binding protein, Multiple Antibiotic Resistance (MAR) family, export proteins, helix-turn-helix domains, arsenate reductase, elongation factor, ribosomal proteins, cysteine protease precursor, type I restriction endonuclease and plasmid recombination enzyme, suggesting that the hypothetical proteins might have the same functions. Structure and binding-site predictions were also carried out for these proteins, which would be useful in docking studies to aid drug discovery.

6.
HI1506 is a 128-residue hypothetical protein of unknown function from Haemophilus influenzae. It was originally annotated as a shorter 85-residue protein, but a more detailed sequence analysis conducted in our laboratory revealed that the full-length protein has an additional 43 residues on the C terminus, corresponding to a region initially ascribed to HI1507. As part of a larger effort to understand the functions of hypothetical proteins from Gram-negative bacteria, and H. influenzae in particular, we report here the three-dimensional solution NMR structure for the corrected full-length HI1506 protein. The structure consists of two well-defined domains, an alpha/beta 50-residue N-domain and a 3-alpha 32-residue C-domain, separated by an unstructured 30-residue linker. Both domains have positively charged surface patches and weak structural homology with folds that are associated with RNA binding, suggesting a possible functional role in binding distal nucleic acid sites.

7.
The NMR structure of the conserved hypothetical protein TM0487 from Thermotoga maritima represents an alpha/beta-topology formed by the regular secondary structures alpha1-beta1-beta2-alpha2-beta3-beta4-alpha3-beta5-3(10)-alpha4, with a small anti-parallel beta-sheet of beta-strands 1 and 2, and a mixed parallel/anti-parallel beta-sheet of beta-strands 3-5. Similar folds have previously been observed in other proteins, with amino acid sequence identity as low as 3% and a variety of different functions. There are also 216 sequence homologs of TM0487, which all have the signature sequence of domains of unknown function 59 (DUF59), for which no three-dimensional structures have as yet been reported. The TM0487 structure thus presents a platform for homology modeling of this large group of DUF59 proteins. Conserved among most of the DUF59s are 13 hydrophobic residues, which are clustered in the core of TM0487. A putative active site of TM0487 consisting of residues D20, E22, L23, T51, T52, and C55 is conserved in 98 of the 216 DUF59 sequences. Asp20 is buried within the proposed active site without any compensating positive charge, which suggests that its pK(a) value may be perturbed. Furthermore, the DUF59 family includes ORFs that are part of a conserved chromosomal group of proteins predicted to be involved in Fe-S cluster metabolism.

8.
The Mycobacterium tuberculosis genome contains about 4000 genes, of which approximately a third code for proteins of unknown function or are classified as conserved hypothetical proteins. We have determined the three-dimensional structure of one of these, the rv0216 gene product, which has been shown to be essential for M. tuberculosis growth in vivo. The structure exhibits the greatest similarity to bacterial and eukaryotic hydratases that catalyse the R-specific hydration of 2-enoyl coenzyme A. However, only part of the catalytic machinery is conserved in Rv0216 and it showed no activity for the substrate crotonyl-CoA. The structure of Rv0216 allows us to assign new functional annotations to a family of seven other M. tuberculosis proteins, a number of which are essential for bacterial survival during infection and growth.

9.
The structure of Vibrio cholerae protein VC0424 was determined by NMR spectroscopy. VC0424 belongs to a conserved family of bacterial proteins of unknown function (COG 3076). The structure has an alpha-beta sandwich architecture consisting of two layers: a four-stranded antiparallel beta-sheet and three side-by-side alpha-helices. The secondary structure elements have the order alpha-beta-alpha-beta-beta-alpha-beta along the sequence. This fold is the same as the ferredoxin-like fold, except with an additional long N-terminal helix, making it a variation on this common motif. A cluster of conserved surface residues on the beta-sheet side of the protein forms a pocket that may be important for the biological function of this conserved family of proteins.

10.
We report herein the NMR structure of Tm0979, a structural proteomics target from Thermotoga maritima. The Tm0979 fold consists of four beta/alpha units, which form a central parallel beta-sheet with strand order 1234. The first three helices pack toward one face of the sheet and the fourth helix packs against the other face. The protein forms a dimer by adjacent parallel packing of the fourth helices sandwiched between the two beta-sheets. This fold is very interesting from several points of view. First, it represents the first structure determination for the DsrH family of conserved hypothetical proteins, which are involved in oxidation of intracellular sulfur but have no defined molecular function. Based on structure and sequence analysis, possible functions are discussed. Second, the fold of Tm0979 most closely resembles YchN-like folds; however the proteins that adopt these folds differ in secondary structural elements and quaternary structure. Comparison of these proteins provides insight into possible mechanisms of evolution of quaternary structure through a simple mechanism of hydrophobicity-changing mutations of one or two residues. Third, the Tm0979 fold is found to be similar to flavodoxin-like folds and beta/alpha barrel proteins, and may provide a link between these very abundant folds and putative ancestral half-barrel proteins.

11.
Butt AM, Batool M, Tong Y. Bioinformation 2011, 7(6):299-303
Mycoplasma genitalium is a human pathogen associated with several sexually transmitted diseases. The complete genome of M. genitalium G37 has been sequenced and provides an opportunity to understand its pathogenesis and identify therapeutic targets. However, complete understanding of bacterial function requires proper annotation of its proteins. The genome of M. genitalium encodes 475 proteins. Among these, 94 are without any known function and are described as 'hypothetical proteins'. We selected MG_237 for sequence and structural analysis using a bioinformatics approach. Primary and secondary structure analysis suggested that MG_237 is a hydrophilic protein containing a significant proportion of alpha helices, and subcellular localization predictions suggested it is a cytoplasmic protein. Homology modeling was used to define the three-dimensional (3D) structure of MG_237. A search for templates revealed that MG_237 shares 63% homology with a hypothetical protein of Mycoplasma pneumoniae, indicating that this protein is evolutionarily conserved. The refined 3D model was generated using the (PS)(2)-v2 server, which incorporates MODELLER. Several quality assessment and validation parameters were computed and indicated that the homology model is reliable. Furthermore, comparative genomics analysis suggested that MG_237 is a non-homologous protein involved in four different metabolic pathways. Experimental validation will provide more insight into the actual function of this protein in microbial pathways.

12.
TT1426, from Thermus thermophilus HB8, is a conserved hypothetical protein with a predicted phosphoribosyltransferase (PRTase) domain, as revealed by a Pfam database search. The 2.01 Å crystal structure of TT1426 has been determined by the multiwavelength anomalous dispersion (MAD) method. TT1426 comprises a core domain consisting of a central five-stranded beta sheet surrounded by four alpha-helices, and a subdomain in the C terminus. The core domain structure resembles those of the type I PRTase family proteins, although a significant structural difference exists in an inserted 43-residue region. The C-terminal subdomain corresponds to the "hood," which contains a substrate-binding site in the type I PRTases. The hood structure of TT1426 differs from those of the other type I PRTases, suggesting the possibility that TT1426 binds an unknown substrate. The structure-based sequence alignment provides clues about the amino acid residues involved in catalysis and substrate binding.

13.
A hypothetical protein is predicted to be expressed from an open reading frame without known experimental evidence of translation. Such proteins constitute a substantial fraction of proteomes. Domain extraction from these hypothetical sequences helps to search for protein coding genes for protein structural and functional annotation. We describe the analysis of prediction data in a sequence dataset of hypothetical protein orthologs of Pongo abelii (orangutan) and Sus scrofa (pig). It should be noted that these orangutan-pig orthologs are also non-homologous to human proteins. These predicted data find application in the genome-wide annotation of proteins in poorly understood genomes.

Abbreviations

PDB - Protein Data Bank, DEG - Database of Essential Genes, CDD - Conserved Domain Database, IUCN - International Union for Conservation of Nature.

14.
The latest crystallographic model of the cyanobacterial photosystem II (PS II) core complex added one transmembrane low molecular weight (LMW) component to the previous model, suggesting the presence of an unknown transmembrane LMW component in PS II. We have investigated the polypeptide composition in highly purified intact PS II core complexes from Thermosynechococcus elongatus, the species which yielded the PS II crystallographic models described above, to identify the unknown component. Using an electrophoresis system specialized for separation of LMW hydrophobic proteins, a novel protein of ∼5 kDa was identified as a PS II component. Its N-terminal amino acid sequence was identical to that of Ycf12. The corresponding gene is known as one of the ycf (hypothetical chloroplast reading frame) genes, ycf12, and is widely conserved in chloroplast and cyanobacterial genomes. Nonetheless, the localization and function of the gene product have never been assigned. Our finding shows, for the first time, that ycf12 is actually expressed as a component of the PS II complex in the cell, revealing that a previously unidentified transmembrane protein exists in the PS II core complex.

15.
16.
The TT1485 gene from Thermus thermophilus HB8 encodes a hypothetical protein of unknown function with about 20 sequence homologs of bacterial or archaeal origin. Together they form a family of uncharacterized proteins, the cluster of orthologous group COG3253. Using a combination of amino acid sequence analysis, three-dimensional structural studies and biochemical assays, we identified TT1485 as a novel heme-binding protein. The crystal structure reveals that this protein is a pentamer and each monomer exhibits a β-barrel fold. TT1485 is structurally similar to muconolactone isomerase, but this provided no functional clues. Amino acid sequence analysis revealed remote homology to a heme enzyme, chlorite dismutase. Strikingly, amino acid residues that are highly conserved in the homologous hypothetical proteins and chlorite dismutase cluster around a deep cavity on the surface of each monomer. Molecular modeling shows that the cavity can accommodate a heme group with a strictly conserved His as a heme ligand. TT1485 reconstituted with iron protoporphyrin IX chloride gave a low chlorite dismutase activity, indicating that TT1485 catalyzes a reaction other than chlorite degradation. The presence of a possible Fe–His–Asp triad in the heme proximal site suggests that TT1485 functions as a novel heme peroxidase to detoxify hydrogen peroxide within the cell.

17.
Based on a study involving structural comparisons of proteins sharing 25% or less sequence identity, three rounds of Psi-BLAST appear capable of identifying remote evolutionary homologs with greater than 95% confidence provided that more than 50% of the query sequence can be aligned with the target sequence. Since it seems that more than 80% of all homologous protein pairs may be characterized by a lack of significant sequence similarity, the experimental biologist is often confronted with a lack of guidance from conventional homology searches involving pair-wise sequence comparisons. The ability to disregard levels of sequence identity and expect value in Psi-BLAST if at least 50% of the query sequence has been aligned allows for generation of new hypotheses by consideration of matches that are conventionally disregarded. In one example, we suggest a possible evolutionary linkage between the cupredoxin and immunoglobulin fold families. A thermostable hypothetical protein of unknown function may be a circularly permuted homolog to phosphotriesterase, an enzyme capable of detoxifying organophosphate nerve agents. In a third example, the amino acid sequence of another hypothetical protein of unknown function reveals the ATP binding-site, metal binding site, and catalytic sidechain consistent with kinase activity of unknown specificity. This approach significantly expands the utility of existing sequence data to define the primary structure degeneracy of binding sites for substrates, cofactors and other proteins.
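
The coverage criterion in this abstract (more than 50% of the query sequence aligned with the target) can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the 1-based inclusive coordinates, and the choice to measure coverage as the union of aligned query segments are all assumptions made for illustration; only the 50% threshold comes from the abstract.

```python
from typing import Iterable, Tuple


def query_coverage(query_len: int, segments: Iterable[Tuple[int, int]]) -> float:
    """Fraction of query residues covered by at least one aligned segment.

    Segments are (start, end) query positions, 1-based and inclusive
    (an assumed convention). Overlapping segments are counted only once.
    """
    if query_len <= 0:
        raise ValueError("query_len must be positive")
    covered = 0
    last_end = 0  # highest query position already counted
    for start, end in sorted(segments):
        if end < start:
            raise ValueError("segment end precedes start")
        start = max(start, last_end + 1)  # skip residues already counted
        end = min(end, query_len)         # ignore positions past the query
        if end >= start:
            covered += end - start + 1
            last_end = end
    return covered / query_len


def passes_coverage_rule(query_len: int,
                         segments: Iterable[Tuple[int, int]],
                         threshold: float = 0.5) -> bool:
    """True if strictly more than `threshold` of the query is aligned."""
    return query_coverage(query_len, segments) > threshold


if __name__ == "__main__":
    # Hypothetical hit: two overlapping aligned segments on a 200-residue query.
    print(passes_coverage_rule(200, [(1, 60), (50, 120)]))
```

Under these assumptions, a hit that aligns residues 1-120 of a 200-residue query would be kept regardless of its sequence identity or expect value, whereas a hit covering only residues 10-100 would not.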

18.
TTHA0727 is a conserved hypothetical protein from Thermus thermophilus HB8, with a molecular mass of 12.6 kDa. TTHA0727 belongs to the carboxymuconolactone decarboxylase (CMD) family (Pfam 02627). A sequence comparison with its homologs suggested that TTHA0727 is a distinct protein from alkylhydroperoxidase AhpD and gamma-carboxymuconolactone decarboxylase in the CMD family. Here we report the 1.9 Å crystal structure of TTHA0727 (PDB ID: 2CWQ) determined by the multiwavelength anomalous dispersion method. The TTHA0727 monomer structure consists of seven alpha-helices (alpha1-alpha7) and one short 3(10)-helix. The crystal structure and the analytical ultracentrifugation revealed that TTHA0727 forms a hexameric ring structure in solution. The electrostatic potential distribution on the solvent-accessible surface of the TTHA0727 hexamer showed that positively charged regions exist on the side of the ring structure, suggesting that TTHA0727 interacts with some negatively charged molecules. A structural homology search revealed that the structure of three alpha-helices (alpha4-alpha6) is remarkably conserved, suggesting that it is the common structural motif for the CMD family proteins. In addition, the nine residues of the N-terminal tag bound to the cleft region between alpha1 and alpha3 in chains A and B of TTHA0727, implying that this region is the putative binding/active site for some small molecules.

19.
20.
Piwi-interacting RNAs (piRNAs) guide Piwi argonautes to their transposon targets for silencing. The highly conserved protein Maelstrom is linked to both piRNA biogenesis and effector roles in this pathway. One defining feature of Maelstrom is the predicted MAEL domain of unknown molecular function. Here, we present the first crystal structure of the MAEL domain from Bombyx Maelstrom, which reveals a nuclease fold. The overall architecture resembles that found in Mg2+- or Mn2+-dependent DEDD nucleases, but a clear distinguishing feature is the presence of a structural Zn2+ ion coordinated by the conserved ECHC residues. Strikingly, metazoan Maelstrom orthologs across the animal kingdom lack the catalytic DEDD residues, and as we show for Bombyx Maelstrom are inactive as nucleases. However, a MAEL domain-containing protein from amoeba having both sequence motifs (DEDD and ECHC) is robustly active as an exoribonuclease. Finally, we show that the MAEL domain of Bombyx Maelstrom displays a strong affinity for single-stranded RNAs. Our studies suggest that the ancient MAEL nuclease domain evolved to function as an RNA-binding module in metazoan Maelstrom.
