Similar articles
 Found 20 similar articles; search took 34 ms
1.
2.
Structural biology sheds light on the puzzle of genomic ORFans   Total citations: 5, self-citations: 0, citations by others: 5
Genomic ORFans are orphan open reading frames (ORFs) with no significant sequence similarity to other ORFs. ORFans comprise 20-30% of the ORFs of most completely sequenced genomes. Because nothing can be learnt about ORFans via sequence homology, the functions and evolutionary origins of ORFans remain a mystery. Furthermore, because relatively few ORFans have been experimentally characterized, it has been suggested that most ORFans are not likely to correspond to functional, expressed proteins, but rather to spurious ORFs, pseudo-genes or to rapidly evolving proteins with non-essential roles. As a snapshot view of current ORFan structural studies, we searched for ORFans among proteins whose three-dimensional structures have been recently determined. We find that functional and structural studies of ORFans are not as underemphasized as previously suggested. These recently determined structures correspond to ORFans from all kingdoms of life, and include proteins that have previously been functionally characterized, as well as structural genomics targets of unknown function labeled as "hypothetical proteins". This suggests that many of the ORFans in the databases are likely to correspond to expressed, functional (and even essential) proteins. Furthermore, the recently determined structures include examples of the various types of ORFans, suggesting that the functions and evolutionary origins of ORFans are diverse. Although this survey sheds some light on the ORFan mystery, further experimental studies are required to gain a better understanding of the role and origins of the tens of thousands of ORFans awaiting characterization.

3.
Large amounts of raw biological sequence data have been generated by the human genome project and related efforts. Understanding the structural information encoded by biological sequences is important for learning their biochemical functions but remains a fundamental challenge. Recent interest in RNA regulation has resulted in rapid growth in the number of RNA secondary structures deposited in various databases. However, the functional classification and characterization of RNA structures have only been partially addressed. This article introduces a novel interval-based distance metric for structure-based RNA function assignment. The characterization of RNA structures relies on distance vectors learned from a collection of predicted structures. The distance measure considers intersection, disjointness, and inclusion relations between intervals. A set of RNA pseudoknotted structures with known function is used, and the function of the query structure is determined by measuring structure similarity. This not only offers sequence distance criteria to measure the similarity of secondary structures but also aids the functional classification of RNA structures with pseudoknots.
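
The three interval relations named in this abstract can be illustrated with a small sketch. It treats each base pair as an interval (i, j) of sequence positions, classifies every pair of intervals, and compares two structures by their relation frequencies. This is not the article's metric: the function names, the frequency profile, and the L1 distance are assumptions chosen only to show how disjoint, inclusion, and intersected (crossing, as in pseudoknots) intervals can be told apart.

```python
from itertools import combinations

RELATIONS = ("disjoint", "inclusion", "intersected")


def interval_relation(a, b):
    """Classify two closed intervals (start, end) as one of RELATIONS."""
    (a1, a2), (b1, b2) = a, b
    if a2 < b1 or b2 < a1:
        return "disjoint"
    if (a1 <= b1 and b2 <= a2) or (b1 <= a1 and a2 <= b2):
        return "inclusion"
    # Overlapping but neither contains the other: crossing base pairs.
    return "intersected"


def relation_profile(base_pairs):
    """Fraction of interval pairs in each relation (hypothetical descriptor)."""
    counts = {name: 0 for name in RELATIONS}
    for a, b in combinations(base_pairs, 2):
        counts[interval_relation(a, b)] += 1
    total = sum(counts.values()) or 1  # avoid division by zero
    return [counts[name] / total for name in RELATIONS]


def profile_distance(p, q):
    """L1 distance between two relation profiles (an assumed choice)."""
    return sum(abs(x - y) for x, y in zip(p, q))


# Toy structures: a nested hairpin and a pseudoknot with crossing stems.
hairpin = [(1, 10), (2, 9), (3, 8)]
pseudoknot = [(1, 10), (2, 9), (5, 15), (6, 14)]
distance = profile_distance(relation_profile(hairpin),
                            relation_profile(pseudoknot))
```

In the toy data, the hairpin contains only nested (inclusion) pairs, while the pseudoknot mixes nested and crossing pairs, so the two profiles differ.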

4.
Comparative analysis of structure and function of macromolecules, such as proteins, is an integral part of modern evolutionary biology. The first and critical step in understanding evolution of homologous proteins is their amino acid sequence alignment. However, standard algorithms fail to provide unambiguous sequence alignment for proteins of poor homology. More reliable results can be provided by comparing experimental 3D structures obtained at atomic resolution with the aid of X-ray structural analysis. If such structures are lacking, homology modeling is used, which considers indirect experimental data on functional roles of individual amino acid residues. An important problem is that sequence alignment, which reflects genetic modifications, does not necessarily correspond to functional homology, which depends on 3D structures critical for natural selection. Since the alignment techniques relying only on the analysis of primary structures carry no information on the functional properties of proteins, the inclusion of 3D structures into consideration is of utmost importance. Here we consider several ion channels as examples to demonstrate that alignment of their 3D structures can significantly improve sequence alignment obtained by traditional methods.

5.
Integral membrane proteins are involved in a wide range of essential biological functions and the determination of their three-dimensional structures plays a central role in understanding their function. This review focuses on the structures of one class of integral membrane proteins: the functionally diverse all-alpha type membrane proteins. It gives an overview of all the structures determined by X-ray crystallography, describing each system and structure in turn. It shows that the structures of all-alpha type membrane proteins have made valuable contributions to understanding structure–function relationships in membrane proteins. These range from the first insights into the function of exciting individual proteins to an in-depth knowledge of protein function from entire biological systems.

6.
Among the greatest challenges facing biology today is the exploitation of huge amounts of genomic data, and their conversion into functional information about the proteins encoded. For example, the large-scale cDNA sequencing project of the German cDNA Consortium is providing vast numbers of open reading frames (ORFs) encoding novel proteins of completely unknown function. As a first step towards their characterization we have tagged over 500 of these with the green fluorescent protein (GFP), and examined the subcellular localizations of these fusion proteins in living cells. These data have allowed us to classify the proteins into subcellular groups, which determines the next step towards a detailed functional characterization. To make further use of these GFP-tagged constructs, a series of functional assays has been designed and implemented to assess the effect of these novel proteins on processes such as cell growth, cell death, and protein transport. Functional assays with such a large set of molecules are only possible with automation. Therefore, we have developed, and adapted, functional assays for use by robotic liquid handling stations and reading stations. A transport assay allows the identification of proteins which localize to distinct organelles of the secretory pathway and have the potential to be new regulators in protein transport; a proliferation assay helps identify proteins that stimulate or repress mitosis. Further assays to monitor the effects of the proteins in apoptosis and signal transduction pathways are in progress. Integrating the functional information that is generated in the assays with data from expression profiling and further functional genomics and proteomics approaches will ultimately allow us to identify functional networks of proteins in a morphological context, and will greatly contribute to our understanding of cell function.

7.
Domains are the evolutionary units that comprise proteins, and most proteins are built from more than one domain. Domains can be shuffled by recombination to create proteins with new arrangements of domains. Using structural domain assignments, we examined the combinations of domains in the proteins of 131 completely sequenced organisms. We found two-domain and three-domain combinations that recur in different protein contexts with different partner domains. The domains within these combinations have a particular functional and spatial relationship. These units are larger than individual domains and we term them "supra-domains". Amongst the supra-domains, we identified some 1400 (1203 two-domain and 166 three-domain) combinations that are statistically significantly over-represented relative to the occurrence and versatility of the individual component domains. Over one-third of all structurally assigned multi-domain proteins contain these over-represented supra-domains. This means that investigation of the structural and functional relationships of the domains forming these popular combinations would be particularly useful for an understanding of multi-domain protein function and evolution as well as for genome annotation. These and other supra-domains were analysed for their versatility, duplication, their distribution across the three kingdoms of life and their functional classes. By examining the three-dimensional structures of several examples of supra-domains in different biological processes, we identify two basic types of spatial relationships between the component domains: the combined function of the two domains is such that either the geometry of the two domains is crucial and there is a tight constraint on the interface, or the precise orientation of the domains is less important and they are spatially separate. Frequently, the role of the supra-domain becomes clear only once the three-dimensional structure is known. Since this is the case for only a quarter of the supra-domains, we provide a list of the most important unknown supra-domains as potential targets for structural genomics projects.

8.
Transmembrane (TM) proteins constitute 15-30% of the genome, but <1% of the structures in the Protein Data Bank. This discrepancy is disturbing, and emphasizes that structure determination of TM proteins remains challenging. The challenge is greatest for proteins from eukaryotes, the structures of which remain intractable despite tremendous advances that have been made towards structure determination of bacterial TM proteins. Notably, >50% of the membrane protein families in eukaryotes lack bacterial homologs. Therefore, it is conceivable that many more years will elapse before high-resolution structures of eukaryotic TM proteins emerge. Until then, integrated approaches that combine biochemical and computational analyses with low-resolution structures are likely to have increasingly important roles in providing frameworks for the mechanistic understanding of membrane-protein structure and function.

9.
This review describes the family of intrinsically disordered proteins, members of which fail to form rigid 3-D structures under physiological conditions, either along their entire lengths or only in localized regions. Instead, these intriguing proteins/regions exist as dynamic ensembles within which atom positions and backbone Ramachandran angles exhibit extreme temporal fluctuations without specific equilibrium values. Many of these intrinsically disordered proteins are known to carry out important biological functions which, in fact, depend on the absence of a specific 3-D structure. The existence of such proteins does not fit the prevailing structure–function paradigm, which states that a unique 3-D structure is a prerequisite to function. Thus, the protein structure–function paradigm has to be expanded to include intrinsically disordered proteins and alternative relationships among protein sequence, structure, and function. This shift in the paradigm represents a major breakthrough for biochemistry, biophysics and molecular biology, as it opens new levels of understanding with regard to the complex life of proteins. This review will try to answer the following questions: how were intrinsically disordered proteins discovered? Why don't these proteins fold? What is so special about intrinsic disorder? What are the functional advantages of disordered proteins/regions? What is the functional repertoire of these proteins? What are the relationships between intrinsically disordered proteins and human diseases?

10.
The bias in protein structure and function space resulting from experimental limitations and targeting of particular functional classes of proteins by structural biologists has long been recognized, but never continuously quantified. Using the Enzyme Commission and the Gene Ontology classifications as a reference frame, and integrating structure data from the Protein Data Bank (PDB), target sequences from the structural genomics projects, structure homology derived from the SUPERFAMILY database, and genome annotations from Ensembl and NCBI, we provide a quantified view, both at the domain and whole-protein levels, of the current and projected coverage of protein structure and function space relative to the human genome. Protein structures currently provide at least one domain that covers 37% of the functional classes identified in the genome; whole structure coverage exists for 25% of the genome. If all the structural genomics targets were solved (twice the current number of structures in the PDB), it is estimated that structures of one domain would cover 69% of the functional classes identified and complete structure coverage would be 44%. Homology models from existing experimental structures extend the 37% coverage to 56% of the genome as single domains and 25% to 31% for complete structures. Coverage from homology models is not evenly distributed by protein family, reflecting differing degrees of sequence and structure divergence within families. While these data provide coverage, conversely, they also systematically highlight functional classes of proteins for which structures should be determined. Current key functional families without structure representation are highlighted here; updated information on the "most wanted list" that should be solved is available on a weekly basis from http://function.rcsb.org:8080/pdb/function_distribution/index.html.

11.
BACKGROUND: Structures that have diverged from a common ancestor often retain functional and sequence similarity, although the latter may be very reduced. Even so, the overall fold of the structure is generally highly conserved. Now, however, several examples have been identified of proteins that have different functions but which have converged to a similar fold. These proteins will also have low sequence identities. RESULTS: By comparing the complete structure databank against itself, using sequence and structure alignment techniques, we have been able to identify six new examples of structurally related folds that have no apparent sequence or functional similarity. These related proteins include a family of crambin-like folds and a family of ferredoxin II folds. We found that all the similarities between structures are present in small proteins and occur as motifs within the core of a larger protein. CONCLUSION: The low sequence similarity and the lack of any obvious functional relationship between proteins with similar structures suggest that the proteins have diverged from independent ancestors. The similarities may therefore be of interest for understanding the various stereochemical and physical criteria that operate to generate a favourable fold.

12.
The biochemical processes of living cells involve numerous series of reactions that work with exceptional specificity and efficiency. The tight control of this intricate reaction network stems from the architecture of the proteins that drive the chemical reactions and mediate protein–protein interactions. Indeed, the structure of these proteins determines both their function and their interaction partners. A detailed understanding of the proximity and orientation of pivotal functional groups can reveal the molecular mechanistic basis for the activity of a protein. Together with X-ray crystallography and electron microscopy, NMR spectroscopy plays an important role in solving three-dimensional structures of proteins at atomic resolution. In the challenging field of membrane proteins, retinal-binding proteins are often employed as model systems and prototypes to develop biophysical techniques for the study of structural and functional mechanistic aspects. The recent determination of two 3D structures of seven-helical trans-membrane retinal proteins by solution-state NMR spectroscopy highlights the potential of solution NMR techniques in contributing to our understanding of membrane proteins. This review summarizes the multiple strategies available for expression of isotopically labeled membrane proteins. Different environments for mimicking lipid bilayers will be presented, along with the most important NMR methods and labeling schemes used to generate high-quality NMR spectra. The article concludes with an overview of types of conformational restraints used for generation of high-resolution structures of membrane proteins. This article is part of a Special Issue entitled: Retinal Proteins - You can teach an old dog new tricks.

13.
The genome sciences face the challenge of characterizing the structure and function of a vast number of novel genes. Sequence search techniques are used to infer functional and structural information from similarities to experimentally characterized genes or proteins. The persistent goal is to refine these techniques and to develop alternative and complementary methods to increase the range of reliable inference. Here, we focus on the structural and functional assignments that can be inferred from the known three-dimensional structures of proteins. The study uses all structures in the Protein Data Bank that were known by the end of 1997. The protein structures released in 1998 were then characterized in terms of functional and structural similarity to the previously known structures, yielding an estimate of the maximum amount of information on novel protein sequences that can be obtained from inference techniques. The 147 globular proteins corresponding to 196 domains released in 1998 have no clear sequence similarity to previously known structures. However, 75% of the domains have extensive structure similarity to previously known folds, and most importantly, in two out of three cases similarity in structure coincides with related function. In view of this analysis, full utilization of existing structure databases would provide information for many new targets even if the relationship is not accessible from sequence information alone. Currently, the most sophisticated techniques detect on the order of one-third of these relationships.

14.
Material remains of ancestral nucleotides and proteins are largely unavailable; thus, sequence comparison among homologous genes in present-day organisms forms the core of current knowledge of molecular evolution. Variation in protein three-dimensional structure is a basis for functional diversity. Studying the evolution of three-dimensional structures in related proteins would significantly improve our understanding of protein evolution and function. A protein may contain ancestral conformations that have been allosterically suppressed by evolutionarily additive structures. Using monoclonal antibody probes to detect such conformations in proteins after removing the suppressor structure, our study demonstrates three-dimensional structural evidence for the evolutionary relationship between troponin I and troponin T, two subunits of the troponin complex in the Ca2+-regulatory system of striated muscle, and among their muscle type-specific isoforms. The experimental data show the feasibility of detecting evolutionarily suppressed history-telling structural states in proteins by removing conformational modulator segments added during evolution. In addition to identifying structural modifications that were critical to the emergence of diverged proteins, investigating this novel mode of evolution will help us to understand the origin and functional potential of protein structures.

15.
The human cDNA and genomic sequencing projects will result in the identification and isolation of some 140,000 genes, the majority of which lack predicted functions and for which the cellular localizations are not known. The identification and characterization of protein components of specific cell structures and machineries are essential steps not only toward defining functions of genes but also toward understanding cell function and regulation. We describe here a new approach, termed PROLOC, which uses full-length cDNAs for systematic classification of novel proteins as a functional pointer. We have PCR-amplified 25 uncharacterized human genes and expressed the encoded proteins as GFP fusions in a human cell line. This pilot project has identified novel proteins associated with the nucleolus, mitochondria, the ER, the ER-Golgi-intermediate compartment (ERGIC), the GC, the plasma membrane, and cytoplasmic foci. This visual classification approach may be scaled up to handle a large number of novel genes and permit the generation of a global cellular protein localization map. Such information should be valuable for many aspects of functional genomics and cell biology.

16.
The key reaction of protein synthesis, peptidyl transfer, is catalysed in all living organisms by the ribosome - an advanced and highly efficient molecular machine. During the last decade extensive X-ray crystallographic and NMR studies of the three-dimensional structure of ribosomal proteins, ribosomal RNA components and their complexes with ribosomal proteins, and of several translation factors in different functional states have taken us to a new level of understanding of the mechanism of function of the protein synthesis machinery. Among the remarkable new features revealed by structural studies is the mimicry of the tRNA molecule by elongation factor G, ribosomal recycling factor and the eukaryotic release factor 1. Several other translation factors, for which three-dimensional structures are not yet known, are also expected to show some form of tRNA mimicry. The efforts of several crystallographic and biochemical groups have resulted in the determination by X-ray crystallography of the structures of the 30S and 50S subunits at moderate resolution, and of the structure of the 70S ribosome both by X-ray crystallography and cryo-electron microscopy (EM). In addition, low resolution cryo-EM models of the ribosome with different translation factors and tRNA have been obtained. The new ribosomal models allowed for the first time a clear identification of the functional centres of the ribosome and of the binding sites for tRNA and ribosomal proteins with known three-dimensional structure. The new structural data have opened a way for the design of new experiments aimed at deeper understanding at an atomic level of the dynamics of the system.

17.
The function of DNA‐ and RNA‐binding proteins can be inferred from the characterization and accurate prediction of their binding interfaces. However, the main pitfall of various structure‐based methods for predicting nucleic acid binding function is that they are all limited to a relatively small number of proteins for which high‐resolution three‐dimensional structures are available. In this study, we developed a pipeline for extracting functional electrostatic patches from surfaces of protein structural models, obtained using the I‐TASSER protein structure predictor. The largest positive patches are extracted from the protein surface using the patchfinder algorithm. We show that functional electrostatic patches extracted from an ensemble of structural models highly overlap the patches extracted from high‐resolution structures. Furthermore, by testing our pipeline on a set of 55 known nucleic acid binding proteins for which I‐TASSER produces high‐quality models, we show that the method accurately identifies the nucleic acid binding interface on structural models of proteins. Employing a combined patch approach, we show that patches extracted from an ensemble of models better predict the real nucleic acid binding interfaces compared with patches extracted from independent models. Overall, these results suggest that combining information from a collection of low‐resolution structural models could be a valuable approach for functional annotation. We suggest that our method will be further applicable for predicting other functional surfaces of proteins with unknown structure. Proteins 2012. © 2011 Wiley Periodicals, Inc.

18.
Lipocalins constitute a superfamily of extracellular proteins that are found in all three kingdoms of life. Although very divergent in their sequences and functions, they show remarkable similarity in 3-D structures. Lipocalins bind and transport small hydrophobic molecules. Earlier sequence-based phylogenetic studies of lipocalins highlighted that they have a long evolutionary history. However, the molecular and structural basis of their functional diversity is not completely understood. The main objective of the present study is to understand the functional diversity of the lipocalins using a structure-based phylogenetic approach. The present study with 39 protein domains from the lipocalin superfamily suggests that the clusters of lipocalins obtained by structure-based phylogeny correspond well with the functional diversity. The detailed analysis of each of the clusters and sub-clusters reveals that the 39 lipocalin domains cluster based on their mode of ligand binding, though the clustering was performed on the basis of gross domain structure. The outliers in the phylogenetic tree are often from single-member families. The structure-based phylogenetic approach has also provided pointers for assigning putative functions to the domains of unknown function in the lipocalin family. The approach employed in the present study can be used in the future for the functional identification of new lipocalin proteins and may be extended to other protein families where members show poor sequence similarity but high structural similarity.

19.
20.
A long-standing goal in biology is to establish the link between function, structure, and dynamics of proteins. Considering that protein function at the molecular level is understood by the ability of proteins to bind to other molecules, the limited structural data of proteins in association with other bio-molecules represents a major hurdle to understanding protein function at the structural level. Recent reports show that protein function can be linked to protein structure and dynamics through network centrality analysis, suggesting that the structures of proteins bound to natural ligands may be inferred computationally. In the present work, a new method is described to discriminate protein conformations relevant to the specific recognition of a ligand. The method relies on a scoring system that matches critical residues with central residues in different structures of a given protein. Central residues are the most traversed residues with the same frequency in networks derived from protein structures. We tested our method on a set of 24 different proteins and more than 260,000 structures of these in the absence of a ligand or bound to it. To illustrate the usefulness of our method in the study of the structure/dynamics/function relationship of proteins, we analyzed mutants of the yeast TATA-binding protein with impaired DNA binding. Our results indicate that critical residues for an interaction are preferentially found as central residues of protein structures in complex with a ligand. Thus, our scoring system effectively distinguishes protein conformations relevant to the function of interest.
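
The idea of "most traversed" residues in a structure-derived network can be sketched with betweenness centrality, which counts how often a node lies on shortest paths between other nodes. The abstract does not state which centrality measure or network construction the authors used, so the following is only an assumed illustration: the contact network is a hypothetical toy graph, and the function is a plain implementation of Brandes' algorithm for unweighted, undirected graphs.

```python
from collections import deque


def betweenness(adj):
    """Betweenness centrality (Brandes) for {node: [neighbours]} graphs."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: c / 2 for v, c in score.items()}


# Hypothetical residue contact network: a chain of five residues.
contacts = {
    "R1": ["R2"],
    "R2": ["R1", "R3"],
    "R3": ["R2", "R4"],
    "R4": ["R3", "R5"],
    "R5": ["R4"],
}
centrality = betweenness(contacts)
most_central = max(centrality, key=centrality.get)
```

In this toy chain the middle residue lies on the most shortest paths, so it is the one a centrality-based score would flag; a real analysis would build the network from atomic contacts in each structure.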
