Similar articles
 20 similar articles found
1.
The structure of MTH538, a previously uncharacterized hypothetical protein from Methanobacterium thermoautotrophicum, has been determined by NMR spectroscopy. MTH538 is one of numerous structural genomics targets selected in a genome-wide survey of uncharacterized sequences from this organism. MTH538 is a so-called singleton, a sequence not closely related to any other known sequence. The structure of MTH538 closely resembles the known structures of receiver domains from two-component response regulator systems, such as CheY, and is similar to the structures of flavodoxins and GTP-binding proteins. Tests on MTH538 for characteristic activities of CheY and flavodoxin were negative. MTH538 did not become phosphorylated in the presence of acetyl phosphate and Mg2+, although it appeared to bind Mg2+. MTH538 also did not bind flavin mononucleotide (FMN) or coenzyme F420. Nevertheless, sequence and structure parallels between MTH538/CheY and two families of ATPase/phosphatase proteins suggest that MTH538 may have a role in a phosphorylation-independent two-component response regulator system.

2.
The dramatically increasing number of new protein sequences arising from genomics and proteomics creates a need for methods to rapidly and reliably infer the molecular and cellular functions of these proteins. One such approach, structural genomics, aims to delineate the total repertoire of protein folds in nature, thereby providing three-dimensional folding patterns for all proteins, and to infer the molecular functions of the proteins from the combined information of structures and sequences. The goal of obtaining protein structures on a genomic scale has motivated the development of high-throughput technologies and protocols for macromolecular structure determination that have begun to produce structures at a greater rate than previously possible. These new structures have revealed many unexpected functional inferences and evolutionary relationships that were hidden at the sequence level. Here, we present samples of structures determined at the Berkeley Structural Genomics Center and collaborators' laboratories to illustrate how structural information complements sequence information in deducing the functions of proteins with unknown molecular functions.
Two of the major premises of structural genomics are to discover a complete repertoire of protein folds in nature and to find the molecular functions of proteins whose functions are not predicted from sequence comparison alone. To achieve these objectives on a genomic scale, new methods, protocols, and technologies need to be developed by multi-institutional collaborations worldwide. As part of this effort, the Protein Structure Initiative has been launched in the United States (PSI; www.nigms.nih.gov/funding/psi.html).
Although infrastructure building and technology development are still the main focus of structural genomics programs [1–6], a considerable number of protein structures have already been produced, some of them coming directly out of semi-automated structure determination pipelines [6–10]. The Berkeley Structural Genomics Center (BSGC) has focused on the proteins of Mycoplasma or their homologues from other organisms as its structural genomics targets because of the minimal genome size of the Mycoplasmas as well as their relevance to human and animal pathogenicity (http://www.strgen.org). Here we present several protein examples encompassing a spectrum of functional inferences obtainable from their three-dimensional structures in five situations, where the inferences are new and testable, and are not predictable from protein sequence information alone.

3.
High efficiency capillary liquid chromatography-tandem mass spectrometry (LC-MS/MS) was used to examine the proteins extracted from Desulfovibrio vulgaris cells across six treatment conditions. While our previous study provided a proteomic overview of the cellular metabolism based on proteins with known functions [W. Zhang, M.A. Gritsenko, R.J. Moore, D.E. Culley, L. Nie, K. Petritis, E.F. Strittmatter, D.G. Camp II, R.D. Smith, F.J. Brockman, A proteomic view of the metabolism in Desulfovibrio vulgaris determined by liquid chromatography coupled with tandem mass spectrometry, Proteomics 6 (2006) 4286-4299], this study describes the global detection and functional inference for hypothetical D. vulgaris proteins. Using criteria that a given peptide of a protein is identified from at least two out of three independent LC-MS/MS measurements and that for any protein at least two different peptides are identified among the three measurements, 129 open reading frames (ORFs) originally annotated as hypothetical proteins were found to encode expressed proteins. Functional inference for the conserved hypothetical proteins was performed by a combination of several non-homology based methods: genomic context analysis, phylogenomic profiling, and analysis of a combination of experimental information, including peptide detection in cells grown under specific culture conditions and cellular location of the proteins. Using this approach we were able to assign possible functions to 20 conserved hypothetical proteins. This study demonstrated that a combination of proteomics and bioinformatics methodologies can provide verification of the expression of hypothetical proteins and improve genome annotation.
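
The two acceptance criteria quoted in the abstract above (a peptide seen in at least two of three LC-MS/MS measurements, and at least two different peptides per protein across the measurements) can be illustrated with a small filter. This is only a sketch under assumptions: the function name `expressed_proteins`, the `(protein, peptide)` input layout, and the ORF and peptide labels are hypothetical and do not come from the study. It takes the literal reading that one reproducible peptide per protein suffices; a stricter reading would require every counted peptide to be reproducible.

```python
from collections import defaultdict


def expressed_proteins(runs, min_runs=2, min_peptides=2):
    """Return proteins that pass both criteria (hypothetical helper).

    runs: one iterable of (protein_id, peptide) pairs per measurement.
    """
    # Record which measurements each (protein, peptide) pair appeared in.
    runs_per_peptide = defaultdict(set)
    for index, run in enumerate(runs):
        for protein, peptide in run:
            runs_per_peptide[(protein, peptide)].add(index)

    peptides = defaultdict(set)  # protein -> every peptide seen in any run
    reproducible = set()         # proteins with a peptide seen in >= min_runs runs
    for (protein, peptide), seen_in in runs_per_peptide.items():
        peptides[protein].add(peptide)
        if len(seen_in) >= min_runs:
            reproducible.add(protein)

    # Assumption: the second criterion counts distinct peptides over all runs.
    return sorted(p for p in reproducible if len(peptides[p]) >= min_peptides)


# Made-up identifications for three measurements.
runs = [
    [("ORF_A", "PEPTIDEK"), ("ORF_A", "SAMPLER"), ("ORF_B", "LONELYK")],
    [("ORF_A", "PEPTIDEK"), ("ORF_B", "LONELYK"), ("ORF_C", "ONCEK")],
    [("ORF_C", "TWICER")],
]
print(expressed_proteins(runs))  # ['ORF_A']
```

In this toy input ORF_B fails because only one distinct peptide supports it, and ORF_C fails because neither of its peptides is seen in two measurements.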

4.
Mycoplasma hyopneumoniae is a genome-reduced, cell wall-less bacterial pathogen with a predicted coding capacity of fewer than 700 proteins and is one of the smallest self-replicating pathogens. The cell surface of M. hyopneumoniae is extensively modified by processing events that target the P97 and P102 adhesin families. Here, we present analyses of the proteome of M. hyopneumoniae type strain J using protein-centric approaches (one- and two-dimensional GeLC–MS/MS) that enabled us to focus on global processing events in this species. While these approaches only identified 52% of the predicted proteome (347 proteins), our analyses identified 35 surface-associated proteins with widely divergent functions that were targets of unusual endoproteolytic processing events, including cell adhesins, lipoproteins and proteins with canonical functions in the cytosol that moonlight on the cell surface. Affinity chromatography assays that separately used heparin, fibronectin, actin and host epithelial cell surface proteins as bait recovered cleavage products derived from these processed proteins, suggesting these fragments interact directly with the bait proteins and display previously unrecognized adhesive functions. We hypothesize that protein processing is underestimated as a post-translational modification in genome-reduced bacteria and prokaryotes more broadly, and represents an important mechanism for creating cell surface protein diversity.

5.
The genomes of many organisms have been sequenced in the last 5 years. Typically about 30% of predicted genes from a newly sequenced genome cannot be given functional assignments using sequence comparison methods. In these situations three-dimensional structural predictions combined with a suite of computational tools can suggest possible functions for these hypothetical proteins. Suggesting functions may allow better interpretation of experimental data (e.g., microarray data and mass spectrometry data) and help experimentalists design new experiments. In this paper, we focus on three hypothetical proteins of Shewanella oneidensis MR-1 that are potentially related to iron transport/metabolism based on microarray experiments. The threading program PROSPECT was used for protein structural predictions and functional annotation, in conjunction with literature search and other computational tools. Computational tools were used to perform transmembrane domain predictions, coiled coil predictions, signal peptide predictions, sub-cellular localization predictions, motif prediction, and operon structure evaluations. Combined computational results from all tools were used to predict roles for the hypothetical proteins. This method, which uses a suite of computational tools that are freely available to academic users, can be used to annotate hypothetical proteins in general.

6.
7.
The discovery of biochemical and cellular functions of unannotated gene products begins with a database search for proteins that are structure or sequence homologues of known gene products. Very recently, a number of frontier groups in structural biology proposed a new paradigm to predict biological functions of an unknown protein on the basis of its three-dimensional structure on a genomic scale. Structural proteomics (genomics), a research area for structure-based functional discovery, aims to complete the protein-folding universe of all gene products in a cell. It would lead us to a complete understanding of a living organism from protein structure. Two major complementary experimental techniques, X-ray crystallography and NMR spectroscopy, combined with recently developed high-throughput methods, have played a central role in structural proteomics research; however, integrating these methodologies with comparative modeling and electron microscopy would speed progress toward a full dictionary of protein folding space in the near future.

8.
Advances in sequence genomics have resulted in an accumulation of a huge number of protein sequences derived from genome sequences. However, the functions of a large portion of them cannot be inferred with current methods for detecting sequence homology to proteins of known function. Three-dimensional structure can have an important impact in providing inference of the molecular function (physical and chemical function) of a protein of unknown function. Structural genomics centers worldwide have been determining many 3-D structures of proteins of unknown function, and possible molecular functions have been inferred from their structures. Combined with bioinformatics and enzymatic assay tools, the successful acceleration of protein structure determination through high-throughput pipelines enables the rapid functional annotation of a large fraction of hypothetical proteins. We present a brief summary of the process we used at the Berkeley Structural Genomics Center to infer molecular functions of proteins of unknown function.

9.
Mycoplasma hyopneumoniae, the etiological agent of swine enzootic pneumonia, is an important pathogen in the swine industry worldwide. Vaccination is the most cost-effective strategy for controlling and preventing this disease. However, investigations of pathogenicity mechanisms, current serological detection methods, and the development of new recombinant subunit vaccines are all hampered by the lack of known and well-characterized species-specific M. hyopneumoniae antigens. In this work, 54 predicted genes encoding proteins with potential to be used as subunit vaccines or as antigens in diagnostic tests were selected, amplified by PCR and cloned into Escherichia coli expression vectors. Recombinant protein expression, solubility and yields were analyzed. The majority of the recombinant proteins were expressed in inclusion bodies. After solubilization with urea or N-lauroyl sarcosine, recombinant proteins were purified by Ni2+ affinity chromatography. This approach allowed purification of thirty recombinant M. hyopneumoniae proteins, which will be evaluated as vaccine candidates and/or as antigens to be used in diagnostic tests.

10.
Protein profiling in five individual mouse strains showed strain-specific expression of three hypothetical proteins (HPs). As functional and structural assignment of HPs was based on predictions and low identity to known structures, HPs were identified by MALDI-TOF/TOF, and their proposed tentative functions were determined by enzyme assays. Three identified HPs were extracted from gels and renatured, and pyridoxal phosphate phosphatase, inorganic pyrophosphate phosphatase, and antioxidant activities were revealed, findings in agreement with functional predictions.

11.
12.
Surface protein antigens of Mycoplasma hyopneumoniae were identified by direct antibody-surface binding or by radioimmunoprecipitation of surface 125I-labeled proteins with a series of monoclonal antibodies (MAbs). Surface proteins p70, p65, p50, and p44 were shown to be integral membrane components by selective partitioning into the hydrophobic phase during Triton X-114 (TX-114)-phase fractionation, whereas p41 was concomitantly identified as a surface protein exclusively partitioning into the aqueous phase. Radioimmunoprecipitation of TX-114-phase proteins from cells labeled with [35S]methionine, 14C-amino acids, or [3H]palmitic acid showed that proteins p65, p50, and p44 were abundant and (with one other hydrophobic protein, p60) were selectively labeled with lipid. Covalent lipid attachment was established by high-performance liquid chromatography identification of [3H]methyl palmitate after acid methanolysis of delipidated proteins. An additional, unidentified methanolysis product suggested conversion of palmitate to another form of lipid also attached to these proteins. Alkaline hydroxylamine treatment of labeled proteins indicated linkage of lipids by amide or stable O-linked ester bonds. Proteins p65, p50, and p44 were highly immunogenic in the natural host as measured by immunoblots of TX-114-phase proteins with antisera from swine inoculated with whole organisms. These proteins were antigenically and structurally unrelated, since hyperimmune mouse antibodies to individual gel-purified proteins were monospecific and gave distinct proteolytic epitope maps. Intraspecies size variants of one surface antigen of M. hyopneumoniae were revealed by a MAb to p70 (defined in strain J, ATCC 25934), which recognized a larger p73 component on strain VPP11 (ATCC 25617). In addition, MAb to internal, aqueous-phase protein p82 of strain J failed to bind an analogous antigen in strain VPP11. These studies establish that a highly restricted set of distinct, lipid-modified hydrophobic membrane proteins are major surface antigens of M. hyopneumoniae and that structural variants of surface antigens occur within this species.

13.
Nature selected certain regions of the genome for encoding proteins. Most of the sequences were used to encode only RNA. What happened to the remaining sections of the genome? It is possible that some sequences were retired and retained as non-functional entities called pseudogenes. Though several evolutionary prospects with functional endpoints exist, we looked at the possibility that hypothetical proteins correlate with the emergence of pseudogenes, and at the potential of such genes to yield novel synthetic molecules. In this commentary, we consider two key questions: (1) does any correlation exist between hypothetical proteins and pseudogenes, and (2) can we make novel and functional proteins from pseudogenes?

14.
15.
Genome sequencing projects have led to an explosion in the number of known gene products, many of which are hypothetical proteins of unknown function. Analyzing and annotating the functions of hypothetical proteins is important in Staphylococcus aureus, a pathogenic bacterium that causes multiple types of disease by infecting various sites in humans and animals. In this study, ten hypothetical proteins of Staphylococcus aureus were retrieved from NCBI and analyzed for their structural and functional characteristics using various bioinformatics tools and databases. The analysis revealed that some of them possessed functionally important domains and families, and protein-protein interaction partners, including ABC transporter ATP-binding protein, the Multiple Antibiotic Resistance (MAR) family, export proteins, helix-turn-helix domains, arsenate reductase, elongation factor, ribosomal proteins, cysteine protease precursor, type I restriction endonuclease and plasmid recombination enzyme, suggesting that the hypothetical proteins might have the same functions. Structure prediction and binding-site prediction were also carried out for these proteins, which would be useful in docking studies to aid drug discovery.

16.
Rubisco is a very large, complex protein and one of the most abundant proteins in the world, comprising up to 50% of all soluble protein in plants. The activity of Rubisco, the enzyme that catalyzes CO2 assimilation in photosynthesis, is regulated by Rubisco activase (Rca). In the present study, we searched for hypothetical proteins of Vitis vinifera that have putative Rubisco activase function. The Arabidopsis and tobacco Rubisco activase protein sequences were used as seed sequences to search against Vitis vinifera in the UniProtKB database. The selected hypothetical proteins of Vitis vinifera were subjected to sequence, structural and functional annotation. Subcellular localization predictions suggested that they are cytoplasmic proteins. Homology modelling was used to define the three-dimensional (3D) structure of the selected hypothetical proteins of Vitis vinifera. Template search revealed that all the hypothetical proteins share more than 80% sequence identity with the structure of green-type Rubisco activase from tobacco, indicating that the proteins are evolutionarily conserved. The homology models were generated using SWISS-MODEL. Several quality assessment and validation parameters indicated that the homology models are reliable. Further, functional annotation through PFAM, CATH, SUPERFAMILY and CDART suggested that the selected hypothetical proteins of Vitis vinifera contain the ATPase family associated with various cellular activities (AAA) domain and belong to the AAA+ superfamily of ring-shaped P-loop containing nucleoside triphosphate hydrolases. This study will support research on optimizing the functionality of Rubisco, which has major implications for improving plant productivity and resource use efficiency.

17.
We describe a method to assign a protein structure to a functional family using family-specific fingerprints. Fingerprints represent amino acid packing patterns that occur in most members of a family but are rare in the background, a nonredundant subset of PDB; their information is additional to sequence alignments, sequence patterns, structural superposition, and active-site templates. Fingerprints were derived for 120 families in SCOP using Frequent Subgraph Mining. For a new structure, all occurrences of these family-specific fingerprints may be found by a fast algorithm for subgraph isomorphism; the structure can then be assigned to a family with a confidence value derived from the number of fingerprints found and their distribution in background proteins. In validation experiments, we infer the function of new members added to SCOP families and we discriminate between structurally similar, but functionally divergent TIM barrel families. We then apply our method to predict function for several structural genomics proteins, including orphan structures. Some predictions have been corroborated by other computational methods and some validated by subsequent functional characterization.

18.
To implement the 2-DE database of serogroup A Neisseria meningitidis (MenA) and improve its potential for investigating bacterial biology, cell extracts were separated by tricine-SDS-PAGE and 131 novel proteins were identified by microLC-ESI-IT-MS/MS. These identifications brought the number of MenA gene expression products characterized at the proteome level to 404, covering approximately 20% of the total ORFs predicted from the genome sequence. This technical approach was particularly useful in ascertaining the expression of ribosomal as well as hypothetical proteins. Particular attention was paid to the functional characterization of hypothetical proteins by means of software analyses and database searches.

19.

Background  

A hypothetical protein is defined as a protein that is predicted to be expressed from an open reading frame, but for which there is no experimental evidence of translation. Hypothetical proteins constitute a substantial fraction of the proteomes of humans and other eukaryotes. Given the general belief that the majority of hypothetical proteins are the product of pseudogenes, it is essential to have a tool capable of pinpointing the minority of hypothetical proteins with a high probability of being expressed.

20.
Colonization of the swine respiratory tract by Mycoplasma hyopneumoniae is accomplished by specific binding to the cilia of the mucosal epithelial cells. Previous studies have implicated a 97-kDa outer membrane-associated protein, P97, that appeared to mediate this interaction. In order to further define the role of P97 in adherence to porcine cilia, the structural gene was cloned and sequenced, and the recombinant products were analyzed. Monoclonal antibodies were used to identify recombinant clones in a genomic library expressed in an opal suppressor host because of alternate codon usage by mycoplasmas. The gene coding for P97 was then identified by Tn1000 mutagenesis of recombinant clones. DNA sequence analysis revealed an open reading frame coding for a 124.9-kDa protein with a hydrophobic transmembrane spanning domain. The N-terminal sequence of purified P97 mapped at amino acid position 195 of the translated sequence, indicating that a processing event had occurred in M. hyopneumoniae. Both recombinant P97 protein expressed in an Escherichia coli opal suppressor host and M. hyopneumoniae bound specifically to swine cilia, and the binding was inhibited by heparin and fucoidan, thus supporting the hypothesis that P97 was actively involved in binding to swine cilia in vivo.
