The dramatically increasing number of new protein sequences arising from genomics and proteomics creates a need for methods to rapidly and reliably infer the molecular and cellular functions of these proteins. One such approach, structural genomics, aims to delineate the total repertoire of protein folds in nature, thereby providing three-dimensional folding patterns for all proteins, and to infer the molecular functions of proteins from the combined information of structures and sequences. The goal of obtaining protein structures on a genomic scale has motivated the development of high-throughput technologies and protocols for macromolecular structure determination that have begun to produce structures at a greater rate than previously possible. These new structures have revealed many unexpected functional inferences and evolutionary relationships that were hidden at the sequence level. Here, we present samples of structures determined at the Berkeley Structural Genomics Center and collaborators' laboratories to illustrate how structural information complements sequence information in inferring the functions of proteins with unknown molecular functions.
Two of the major aims of structural genomics are to discover a complete repertoire of protein folds in nature and to find the molecular functions of proteins whose functions are not predicted from sequence comparison alone. To achieve these objectives on a genomic scale, new methods, protocols, and technologies need to be developed by multi-institutional collaborations worldwide. As part of this effort, the Protein Structure Initiative has been launched in the United States (PSI; www.nigms.nih.gov/funding/psi.html).
Although infrastructure building and technology development are still the main focus of structural genomics programs [1–6], a considerable number of protein structures have already been produced, some of them coming directly out of semi-automated structure determination pipelines [6–10]. The Berkeley Structural Genomics Center (BSGC) has focused on the proteins of Mycoplasma or their homologues from other organisms as its structural genomics targets because of the minimal genome size of the Mycoplasmas as well as their relevance to human and animal pathogenicity (http://www.strgen.org). Here we present several protein examples encompassing a spectrum of functional inferences obtainable from their three-dimensional structures in five situations, where the inferences are new and testable, and are not predictable from protein sequence information alone.

Advances in sequence genomics have resulted in the accumulation of a huge number of protein sequences derived from genome sequences. However, the functions of a large portion of them cannot be inferred with current methods for detecting sequence homology to proteins of known function. Three-dimensional structure can play an important role in inferring the molecular function (physical and chemical function) of a protein of unknown function. Structural genomics centers worldwide have been determining many 3-D structures of proteins of unknown function, and their possible molecular functions have been inferred from these structures. Combined with bioinformatics and enzymatic assay tools, the successful acceleration of protein structure determination through high-throughput pipelines enables the rapid functional annotation of a large fraction of hypothetical proteins. We present a brief summary of the process we used at the Berkeley Structural Genomics Center to infer molecular functions of proteins of unknown function.

The proteomes that make up the collection of proteins in contemporary organisms evolved through recombination and duplication of a limited set of domains. These protein domains are essentially the main components of globular proteins and are the most fundamental level at which protein function and protein interactions can be understood. An important aspect of domain evolution is their atomic structure and biochemical function, which are both specified by the information in the amino acid sequence. Changes in this information may bring about new folds, functions and protein architectures. With the present and still increasing wealth of sequences and annotation data brought about by genomics, new evolutionary relationships are constantly being revealed, unknown structures modeled and phylogenies inferred. Such investigations not only help predict the function of newly discovered proteins, but also assist in mapping unforeseen pathways of evolution and reveal crucial, co-evolving inter- and intra-molecular interactions. In turn this will help us describe how protein domains shaped cellular interaction networks and the dynamics with which they are regulated in the cell. Additionally, these studies can be used for the design of new and optimized protein domains for therapy. In this review, we aim to describe the basic concepts of protein domain evolution and illustrate recent developments in molecular evolution that have provided valuable new insights in the field of comparative genomics and protein interaction networks.

A protein's three-dimensional structure is encoded in its amino acid sequence. The "folding problem" consists of predicting the structure from the sequence. This classic problem of molecular biology has seen important steps forward in recent years. The raw power of today's computers, along with the mobilization of thousands of Internet users, has allowed several small proteins to be literally folded up in a computer, through simulations. Moreover, international programs for structural genomics aim to determine the experimental structures of hundreds of proteins in several organisms, and to model the others by homology to known structures. This will lead to a nearly complete map of the protein structure universe, shedding light on the past evolution and current functions of today's proteins, and suggesting new targets for therapeutic strategies.

Structural proteomics: a tool for genome annotation
In any newly sequenced genome, 30% to 50% of genes encode proteins with unknown molecular or cellular function. Fortunately, structural genomics is emerging as a powerful approach to functional annotation. Because of recent developments in high-throughput technologies, ongoing structural genomics projects are generating new structures at an unprecedented rate. In the past year, structural studies have identified many new structural motifs involved in enzymatic catalysis or in binding ligands or other macromolecules (DNA, RNA, protein). The efficiency with which function is deduced from structure can be further improved by the integration of structure with bioinformatics and other experimental approaches, such as screening for enzymatic activity or ligand binding.

Assigning function to structures is an important aspect of structural genomics projects, since they frequently provide structures for uncharacterized proteins. Similarities uncovered by structure alignment can suggest a similar function, even in the absence of sequence similarity. For proteins adopting novel folds or those with many functions, this strategy can fail, but functional clues can still come from comparison of local functional sites involving a few key residues. Here we assess the general applicability of functional site comparison through the study of 157 proteins solved by structural genomics initiatives. For 17, the method bolsters confidence in predictions made based on overall fold similarity. For another 12 with new folds, it suggests functions, including a putative phosphotyrosine binding site in the Archaeal protein Mth1187 and an active site for a ribose isomerase. The approach is applied weekly to all new structures, providing a resource for those interested in using structure to infer function.

Structural genomics efforts contribute new protein structures that often lack significant sequence and fold similarity to known proteins. Traditional sequence and structure-based methods may not be sufficient to annotate the molecular functions of these structures. Techniques that combine structural and functional modeling can be valuable for functional annotation. FEATURE is a flexible framework for modeling and recognition of functional sites in macromolecular structures. Here, we present an overview of the main components of the FEATURE framework, and describe the recent developments in its use. These include automating training set selection to increase functional coverage, coupling FEATURE to methods that generate structural diversity, such as molecular dynamics simulations and loop modeling, to improve performance, and using FEATURE in large-scale modeling and structure determination efforts.

As the number of complete genomes that have been sequenced keeps growing, unknown areas of the protein space are revealed and new horizons open up. Most of this information will be fully appreciated only when the structural information about the encoded proteins becomes available. The goal of structural genomics is to direct large-scale efforts of protein structure determination, so as to increase the impact of these efforts. This review focuses on current approaches in structural genomics aimed at selecting representative proteins as targets for structure determination. We will discuss the concept of representative structures/folds, the current methodologies for identifying those proteins, and computational techniques for identifying proteins which are expected to adopt new structural folds.

The first crucial step in any structural genomics project is the selection and prioritization of target proteins for structure determination. There may be a number of selection criteria to be satisfied, including that the proteins have novel folds, that they be representatives of large families for which no structure is known, and so on. The better the selection at this stage, the greater is the value of the structures obtained at the end of the experimental process. This value can be further enhanced once the protein structures have been solved if the functions of the given proteins can also be determined. Here we describe the methods used at either end of the experimental process: firstly, sensitive sequence comparison techniques for selecting a high-quality list of target proteins, and secondly the various computational methods that can be applied to the eventual 3D structures to determine the most likely biochemical function of the proteins in question.

Inferring protein functions from structures is a challenging task, as a large number of orphan protein structures from structural genomics projects are now solved without their biochemical functions characterized. For proteins binding to similar substrates or ligands and carrying out similar functions, their binding surfaces are under similar physicochemical constraints, and hence the sets of allowed and forbidden residue substitutions are similar. However, it is difficult to isolate such selection pressure due to protein function from selection pressure due to protein folding, and the evolutionary relationship reflected by global sequence and structure similarities between proteins is often unreliable for inferring protein function. We have developed a method, called pevoSOAR (pocket-based evolutionary search of amino acid residues), for predicting protein functions by uncovering the amino acid residue substitution pattern due to protein function and separating it from the substitution pattern due to protein folding. We incorporate evolutionary information specific to an individual binding region and match local surfaces on a large scale with millions of precomputed protein surfaces to identify those with similar functions. Our pevoSOAR method also generates a probabilistic model called the computed binding profile that characterizes protein-binding activities that may involve multiple substrates or ligands. We show that our method can be used to predict enzyme functions accurately. Our method can also assess enzyme binding specificity and promiscuity. In an objective large-scale test of 100 enzyme families with thousands of structures, our predictions are found to be sensitive and specific: at the stringent specificity level of 99.98%, we can correctly predict enzyme functions for 80.55% of the proteins.
The overall area under the receiver operating characteristic curve measuring the performance of our prediction is 0.955, close to the perfect value of 1.00. The best Matthews correlation coefficient is 86.6%. Our method also works well in predicting the biochemical functions of orphan proteins from structural genomics projects.

Following the complete genome sequencing of an increasing number of organisms, structural biology is engaging in a systematic approach of high-throughput structure determination called structural genomics to create a complete inventory of protein folds/structures that will help predict functions for all proteins. First results show that structural genomics will be highly effective in finding functional annotations for proteins of unknown function.

This paper describes the efforts of the structural genomics project in the nuclear magnetic resonance (NMR) laboratory at the University of Science and Technology of China. This structural genomics project is driven by biological function. Targets are mainly selected from two systems: proteins related to the regulation of gene expression in humans and other eukaryotes, and proteins found in cell junctions in humans. The majority of proteins selected from these two systems are related to human health and disease, and some are potential drug targets. Twenty-five protein structures from Homo sapiens and other eukaryotes have been determined during the last 5 years in this laboratory. NMR spectroscopy is highly suited to investigating molecular interactions under near-physiological conditions and is particularly suited to the study of low-affinity, transient complexes. It can provide information on protein surface interactions, the structures of complexes, and their dynamic properties during protein recognition. Several examples are given in this paper.

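In the post-genomic era, with complete genome sequences available for a large number of species, structural biologists face the new opportunities and challenges of structural genomics. Unlike traditional structural biology, structural genomics focuses mainly on proteins of unknown structure and function that have little similarity to previously studied proteins. More precisely, structural genomics aims to characterize the structures of all protein families through high-throughput protein expression and structure determination, so that function can be predicted from structure. The Joint Center for Structural Genomics in California has developed a highly automated pipeline for protein production, crystallization, and structure determination. However, because some proteins cannot be crystallized, covering all protein domains remains very difficult. Wuthrich's research group has addressed this problem with high-throughput methods for target protein screening and NMR structure determination. Compared with X-ray crystallography, NMR is complementary because it can determine solution structures that are closer to the physiological state. By providing information on protein stability, dynamics, and interactions in solution, as demonstrated in studies of prion proteins and SARS-related proteins, NMR plays an exciting role in areas ranging from expanding the database of known protein structures and discovering new protein functions to chemical biology research.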

This article reviews the advances in molecular genetics that have led to the identification of genes and markers associated with meat quality in pigs. The development of a considerable number of annotated livestock genome sequences represents an incredibly rich source of information that can be used to identify candidate genes responsible for complex traits and quantitative trait loci effects. In pigs, the huge amount of information emerging from the study of the genome has helped in the acquisition of new knowledge concerning biological systems, and it is opening new opportunities for the genetic selection of this species. Among the recently developed fields of genomics, functional genomics and proteomics, which allow many genes and proteins to be considered at the same time, are very useful tools for better understanding the function and regulation of genes and how they participate in the complex networks controlling the phenotypic characteristics of a trait. In particular, global gene expression profiling at the mRNA and protein level can provide a better understanding of the gene regulation that underlies the biological functions and physiology related to better pork quality. Moreover, an integrated approach combining genomics and proteomics with bioinformatics tools is essential to fully exploit the available molecular genetics information. The development of this knowledge will benefit scientists, industry and breeders, since the efficiency and accuracy of traditional pig selection schemes will be improved by incorporating molecular data into breeding programs.

The genome projects produce an enormous amount of sequence data that needs to be annotated in terms of molecular structure and biological function. These tasks have triggered additional initiatives like structural genomics. The intention is to determine as many protein structures as possible, in the most efficient way, and to exploit the solved structures for the assignment of biological function to hypothetical proteins. We discuss the impact of these developments on protein classification, gene function prediction, and protein structure prediction.

The flood of new genomic sequence information together with technological innovations in protein structure determination have led to worldwide structural genomics (SG) initiatives. The goals of SG initiatives are to accelerate the process of protein structure determination, to fill in protein fold space and to provide information about the function of uncharacterized proteins. In the long term, these outcomes are likely to have an impact on medical biotechnology and drug discovery, leading to a better understanding of disease as well as the development of new therapeutics. Here we describe the high-throughput pipeline established at the University of Queensland in Australia. In this focused pipeline, the targets for structure determination are proteins that are expressed in mouse macrophage cells and that are inferred to have a role in innate immunity. The aim is to characterize the molecular structure and the biochemical and cellular function of these targets by using a parallel processing pipeline. The pipeline is designed to work with tens to hundreds of target gene products and comprises target selection, cloning, expression, purification, crystallization and structure determination. The structures from this pipeline will provide insights into the function of previously uncharacterized macrophage proteins and could lead to the validation of new drug targets for chronic obstructive pulmonary disease and arthritis.

Complementary developments in comparative genomics, protein structure determination and in-depth comparison of protein sequences and structures have provided a better understanding of the prevailing trends in the emergence and diversification of protein domains. The investigation of deep relationships among different classes of proteins involved in key cellular functions, such as nucleic acid polymerases and other nucleotide-dependent enzymes, indicates that a substantial set of diverse protein domains evolved within the primordial, ribozyme-dominated RNA world.

The identification of protein biochemical functions based on their three-dimensional structures is much needed in the post-genome-sequencing era. We have developed a new method to identify and predict protein biochemical functions using the similarity of molecular surface geometries and of electrostatic potentials on the surfaces. Our prediction system consists of a similarity search method based on a clique search algorithm and the molecular surface database eF-site (electrostatic surface of functional-site in proteins). Using this system, functional sites similar to those of phosphoenolpyruvate carboxykinase were detected in several mononucleotide-binding proteins, which have different folds. We also applied our method to a hypothetical protein, MJ0226 from Methanococcus jannaschii, and detected the mononucleotide binding site from the similarity to other proteins having different folds.

A new approach to the functional classification of protein 3D structures is described with application to some examples from structural genomics. This approach is based on functional site prediction with THEMATICS and POOL. THEMATICS employs calculated electrostatic potentials of the query structure. POOL is a machine learning method that utilizes THEMATICS features and has been shown to predict accurate, precise, highly localized interaction sites. Extension to the functional classification of structural genomics proteins is now described. Predicted functionally important residues are structurally aligned with those of proteins with previously characterized biochemical functions. A 3D structure match at the predicted local functional site then serves as a more reliable predictor of biochemical function than an overall structure match. Annotation is confirmed for a structural genomics protein with the ribulose phosphate binding barrel (RPBB) fold. A putative glucoamylase from Bacteroides fragilis (PDB ID 3eu8) is shown to be probably not a glucoamylase. Finally, a structural genomics protein from Streptomyces coelicolor annotated as an enoyl-CoA hydratase (PDB ID 3g64) is shown to be misannotated. Its predicted active site does not match the well-characterized enoyl-CoA hydratases of similar structure but rather bears closer resemblance to those of a dehalogenase with similar fold.
