Similar articles
20 similar articles found (search time: 15 ms)
1.
2.
Several types of domain occur in beta-1,4-glycanases. The best characterized of these are the catalytic domains and the cellulose-binding domains. The domains may be joined by linker sequences rich in proline or hydroxyamino acids or both. Some of the enzymes contain repeated sequences up to 150 amino acids in length. The enzymes can be grouped into families on the basis of sequence similarities between the catalytic domains. There are sequence similarities between the cellulose-binding domains, of which two types have been identified, and also between some domains of unknown function. The beta-1,4-glycanases appear to have arisen by the shuffling of a relatively small number of progenitor sequences.

3.
BACKGROUND: Several methods of structural classification have been developed to introduce some order to the large amount of data present in the Protein Data Bank. Such methods facilitate structural comparisons and provide a greater understanding of structure and function. The most widely used and comprehensive databases are SCOP, CATH and FSSP, which represent three unique methods of classifying protein structures: purely manual, a combination of manual and automated, and purely automated, respectively. In order to develop reliable template libraries and benchmarks for protein-fold recognition, a systematic comparison of these databases has been carried out to determine their overall agreement in classifying protein structures. RESULTS: Approximately two-thirds of the protein chains in each database are common to all three databases. Despite employing different methods, and basing their systems on different rules of protein structure and taxonomy, SCOP, CATH and FSSP agree on the majority of their classifications. Discrepancies and inconsistencies are accounted for by a small number of explanations. Other interesting features have been identified, and various differences between manual and automatic classification methods are presented. CONCLUSIONS: Using these databases requires an understanding of the rules upon which they are based; each method offers certain advantages depending on the biological requirements and knowledge of the user. The degree of discrepancy between the systems also has an impact on the reliability of prediction methods that employ these schemes as benchmarks. To generate accurate fold templates for threading, we extract information from a consensus database, encompassing agreements between SCOP, CATH and FSSP.

4.
In this work we examine how protein structural changes are coupled with sequence variation in the course of evolution of a family of homologs. The sequence-structure correlation analysis performed on 81 homologous protein families shows that the majority of them exhibit statistically significant linear correlation between the measures of sequence and structural similarity. We observed, however, that there are cases where structural variability cannot be mainly explained by sequence variation, such as protein families with a number of disulfide bonds. To understand whether structures from different families and/or folds evolve in the same manner, we compared the degrees of structural change per unit of sequence change ("the evolutionary plasticity of structure") between those families with a significant linear correlation. Using rigorous statistical procedures we find that, with a few exceptions, evolutionary plasticity does not show a statistically significant difference between protein families. Similar sequence-structure analysis performed for protein loop regions shows that evolutionary plasticity of loop regions is greater than for the protein core.
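
The linear sequence-structure relationship described in this abstract can be illustrated with an ordinary least-squares fit, where the slope plays the role of "structural change per unit of sequence change". This is only a sketch: the abstract does not say which similarity measures or statistical tests were used, so the choice of sequence divergence and RMSD as inputs, the function name `linear_fit`, and the numbers in the usage note are all assumptions made for illustration.

```python
from math import sqrt


def linear_fit(x: list[float], y: list[float]) -> tuple[float, float, float]:
    """Return (slope, intercept, Pearson r) of a least-squares line y = a*x + b.

    x: hypothetical sequence-divergence values for pairs of homologs
    y: hypothetical structural-distance values for the same pairs
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of squares and cross-products around the means
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    slope = sxy / sxx  # structural change per unit of sequence change
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # strength of the linear correlation
    return slope, intercept, r
```

For example, `linear_fit([0.1, 0.2, 0.4, 0.6], [0.5, 0.7, 1.1, 1.5])` uses invented, perfectly linear values and yields a slope of about 2.0. Comparing slopes between families would additionally need a significance test, which is not sketched here.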

5.
Cai XH, Jaroszewski L, Wooley J, Godzik A. Proteins. 2011;79(8):2389-2402
The protein universe can be organized in families that group proteins sharing common ancestry. Such families display variable levels of structural and functional divergence, from homogeneous families, where all members have the same function and very similar structure, to very divergent families, where large variations in function and structure are observed. For practical purposes of structure and function prediction, it would be beneficial to identify sub-groups of proteins with highly similar structures (iso-structural) and/or functions (iso-functional) within divergent protein families. We compared three algorithms in their ability to cluster large protein families and discuss whether any of these methods could reliably identify such iso-structural or iso-functional groups. We show that clustering using profile-sequence and profile-profile comparison methods closely reproduces clusters based on similarities between 3D structures or clusters of proteins with similar biological functions. In contrast, the still commonly used sequence-based methods with fixed thresholds result in vast overestimates of structural and functional diversity in protein families. As a result, these methods also overestimate the number of protein structures that have to be determined to fully characterize the structural space of such families. The fact that one can build reliable models based on apparently distantly related templates is crucial for extracting the maximal amount of information from new sequencing projects.

6.

Background

Optimization of high-affinity reagents is a significant bottleneck in medicine and the life sciences. The ability to synthetically create thousands of permutations of a lead high-affinity reagent and survey the properties of individual permutations in parallel could potentially relieve this bottleneck. Aptamers are single-stranded oligonucleotide affinity reagents isolated by in vitro selection processes and as a class have been shown to bind a wide variety of target molecules.

Methodology/Principal Findings

High-density DNA microarray technology was used to synthesize, in situ, arrays of approximately 3,900 aptamer sequence permutations in triplicate. These sequences were interrogated on-chip for their ability to bind the fluorescently labeled cognate target, immunoglobulin E (IgE), resulting in the parallel execution of thousands of experiments. Fluorescence intensity at each array feature was well resolved and shown to be a function of the sequence present. The data demonstrated high intra- and inter-chip correlation between the same features as well as among the sequence triplicates within a single array. Consistent with aptamer-mediated IgE binding, fluorescence intensity correlated strongly with specific aptamer sequences and the concentration of IgE applied to the array.

Conclusion and Significance

The massively parallel sequence-function analyses provided by this approach confirmed the importance of a consensus sequence found in all 21 of the original IgE aptamer sequences and support a common stem:loop structure as being the secondary structure underlying IgE binding. The microarray application, data and results presented illustrate an efficient, high-information-content approach to optimizing aptamer function. It also provides a foundation from which to better understand and manipulate this important class of high-affinity biomolecules.

7.
We assess the variability of protein function in protein sequence and structure space. Various regions in this space exhibit considerable difference in the local conservation of molecular function. We analyze and capture local function conservation by means of logistic curves. Based on this analysis, we propose a method for predicting molecular function of a query protein with known structure but unknown function. The prediction method is rigorously assessed and compared with a previously published function predictor. Furthermore, we apply the method to 500 functionally unannotated PDB structures and discuss selected examples. The proposed approach provides a simple yet consistent statistical model for the complex relations between protein sequence, structure, and function. The GOdot method is available online (http://godot.bioinf.mpi-inf.mpg.de).
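
To make the idea of a logistic curve for function conservation concrete, the sketch below maps a similarity score to a probability that two proteins share a function. The abstract states only that logistic curves are used; the parametrization here (a `midpoint` and a `steepness` applied to a similarity score) and the function name are assumptions chosen for illustration and should not be read as the GOdot model.

```python
from math import exp


def function_conservation(similarity: float, midpoint: float, steepness: float) -> float:
    """Logistic curve: probability-like score that function is conserved.

    similarity: hypothetical similarity score between a query and a neighbor
    midpoint:   similarity at which the curve returns 0.5 (assumed parameter)
    steepness:  how sharply conservation rises around the midpoint (assumed parameter)
    """
    return 1.0 / (1.0 + exp(-steepness * (similarity - midpoint)))
```

With such a curve, a region of sequence-structure space where function is well conserved would correspond to a low midpoint, so that even moderately similar neighbors receive a high score; the parameters would have to be fitted to annotated data, which is not shown here.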

8.
With the large amount of genomics and proteomics data that we are confronted with, computational support for the elucidation of protein function becomes more and more pressing. Many different kinds of biological data harbour signals of protein function, but these signals are often concealed. Computational methods that use protein sequence and structure data can be used for discovering these signals. They provide information that can substantially speed up experimental function elucidation. In this review we concentrate on such methods.

9.
While the number of sequenced genomes continues to grow, experimentally verified functional annotation of whole genomes remains patchy. Structural genomics projects are yielding many protein structures that have unknown function. Nevertheless, subsequent experimental investigation is costly and time-consuming, which makes computational methods for predicting protein function very attractive. There is an increasing number of noteworthy methods for predicting protein function from sequence and structural data alone, many of which are readily available to cell biologists who are aware of the strengths and pitfalls of each available technique.

10.
Classic studies of protein structure in the 1950s and 1960s demonstrated that green lacewing egg stalk silk possesses a rare native cross-beta sheet conformation. We have identified and sequenced the silk genes expressed by adult females of a green lacewing species. The two encoded silk proteins are 109 and 67 kDa in size and rich in serine, glycine and alanine. Over 70% of each protein sequence consists of highly repetitive regions with 16-residue periodicity. The repetitive sequences can be fitted to an elegant cross-beta sheet structural model with protein chains folded into regular 8-residue long beta strands. This model is supported by wide-angle X-ray scattering data and tensile testing from both our work and the original papers. We suggest that the silk proteins assemble into stacked beta sheet crystallites bound together by a network of cystine cross-links. This hierarchical structure gives the lacewing silk a high lateral stiffness, nearly threefold that of silkworm silk, enabling the egg stalks to effectively suspend eggs and protect them from predators.

11.
The OB-fold is found in all three kingdoms and is well represented in both sequence and structural databases. The OB-fold is a five-stranded closed beta barrel and the majority of OB-fold proteins use the same face for ligand binding or as an active site. Different OB-fold proteins use this 'fold-related binding face' to, variously, bind oligosaccharides, oligonucleotides, proteins, metal ions and catalytic substrates. Recently, a number of new structures with OB-folds have been reported that augment the variation seen for this set of proteins whilst conserving the characteristic fold and binding face. The conservation of fold and a functional binding face amongst many structures provides a model for investigating the evolutionary trajectory of sequence, structure and function.

12.
13.
We have identified conserved orthologs in completely sequenced genomes of double-strand DNA phages and arranged them into evolutionary families (phage orthologous groups [POGs]). Using this resource to analyze the collection of known phage genomes, we find that most orthologs are unique in their genomes (having no diverged duplicates [paralogs]), and while many proteins contain multiple domains, the evolutionary recombination of these domains does not appear to be a major factor in evolution of these orthologous families. The number of POGs has been rapidly increasing over the past decade, the percentage of genes in phage genomes that have orthologs in other phages has also been increasing, and the percentage of unknown "ORFans" is decreasing as more proteins find homologs and establish a family. Other properties of phage genomes have remained relatively stable over time, most notably the high fraction of genes that are never or only rarely observed in their cellular hosts. This suggests that despite the renowned ability of phages to transduce cellular genes, these cellular "hitchhiker" genes do not dominate the phage genomic landscape, and a large fraction of the genes in phage genomes maintain an evolutionary trajectory that is distinct from that of the host genes.

14.
Quadruplex DNA: sequence, topology and structure   (total citations: 31; self-citations: 20; citations by others: 11)
G-quadruplexes are higher-order DNA and RNA structures formed from G-rich sequences that are built around tetrads of hydrogen-bonded guanine bases. Potential quadruplex sequences have been identified in G-rich eukaryotic telomeres, and more recently in non-telomeric genomic DNA, e.g. in nuclease-hypersensitive promoter regions. The natural role and biological validation of these structures are starting to be explored, and there is particular interest in them as targets for therapeutic intervention. This survey focuses on the folding and structural features of quadruplexes formed from telomeric and non-telomeric DNA sequences, and examines fundamental aspects of topology and the emerging relationships with sequence. Emphasis is placed on information from the high-resolution methods of X-ray crystallography and NMR, and their scope and current limitations are discussed. Such information, together with biological insights, will be important for the discovery of drugs targeting quadruplexes from particular genes.
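
As a rough illustration of how potential quadruplex sequences can be located in G-rich DNA, the sketch below scans for a commonly used heuristic motif: four runs of at least three guanines separated by loops of one to seven nucleotides. The motif, the loop-length limits and the function name are assumptions brought in for illustration; the abstract does not define a search pattern, and a sequence match alone says nothing about whether or how a quadruplex actually folds.

```python
import re

# Assumed heuristic motif: G-run, loop, G-run, loop, G-run, loop, G-run
# (G-runs of >= 3 guanines, loops of 1-7 nucleotides).
PQS_PATTERN = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")


def find_putative_quadruplexes(dna: str) -> list[tuple[int, str]]:
    """Return (start index, matched sequence) for each non-overlapping motif hit."""
    return [(m.start(), m.group()) for m in PQS_PATTERN.finditer(dna.upper())]
```

Applied to four copies of the human telomeric repeat, `find_putative_quadruplexes("TTAGGGTTAGGGTTAGGGTTAGGG")` reports a single hit starting at index 3. Only the given strand is scanned; the complementary C-rich strand would need a separate pass.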

15.
16.
Prediction of protein function from protein sequence and structure   (total citations: 1; self-citations: 0; citations by others: 1)
The sequence of a genome contains the plans of the possible life of an organism, but implementation of genetic information depends on the functions of the proteins and nucleic acids that it encodes. Many individual proteins of known sequence and structure present challenges to the understanding of their function. In particular, a number of genes responsible for diseases have been identified but their specific functions are unknown. Whole-genome sequencing projects are a major source of proteins of unknown function. Annotation of a genome involves assignment of functions to gene products, in most cases on the basis of amino-acid sequence alone. 3D structure can aid the assignment of function, motivating the challenge of structural genomics projects to make structural information available for novel uncharacterized proteins. Structure-based identification of homologues often succeeds where sequence-alone-based methods fail, because in many cases evolution retains the folding pattern long after sequence similarity becomes undetectable. Nevertheless, prediction of protein function from sequence and structure is a difficult problem, because homologous proteins often have different functions. Many methods of function prediction rely on identifying similarity in sequence and/or structure between a protein of unknown function and one or more well-understood proteins. Alternative methods include inferring conservation patterns in members of a functionally uncharacterized family for which many sequences and structures are known. However, these inferences are tenuous. Such methods provide reasonable guesses at function, but are far from foolproof. It is therefore fortunate that the development of whole-organism approaches and comparative genomics permits other approaches to function prediction when the data are available. These include the use of protein-protein interaction patterns, and correlations between occurrences of related proteins in different organisms, as indicators of functional properties. Even if it is possible to ascribe a particular function to a gene product, the protein may have multiple functions. A fundamental problem is that function is in many cases an ill-defined concept. In this article we review the state of the art in function prediction and describe some of the underlying difficulties and successes.

17.

Background  

SCOP and CATH are widely used as gold standards to benchmark novel protein structure comparison methods as well as to train machine learning approaches for protein structure classification and prediction. The two hierarchies result from different protocols, which may result in differing classifications of the same protein. Ignoring such differences leads to problems when the hierarchies are used to train or benchmark automatic structure classification methods. Here, we propose a method to compare SCOP and CATH in detail and discuss possible applications of this analysis.
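
One simple way to quantify how far two classifications of the same proteins differ is to ask, for every pair of domains, whether both schemes make the same "same group or different group" decision (a Rand-index-style score). The sketch below is not the comparison method proposed in this abstract, which is not described here; the function name, the dictionary input format and the labels in the usage note are assumptions for illustration.

```python
from itertools import combinations


def pair_agreement(labels_a: dict[str, str], labels_b: dict[str, str]) -> float:
    """Fraction of domain pairs on which two classifications agree.

    labels_a, labels_b: hypothetical mappings from a domain identifier to the
    group assigned by each scheme (for example, a fold or topology label).
    Only domains present in both mappings are compared.
    """
    shared = sorted(set(labels_a) & set(labels_b))
    pairs = list(combinations(shared, 2))
    if not pairs:
        raise ValueError("need at least two domains classified by both schemes")
    # A pair counts as agreement when both schemes group it together,
    # or both schemes keep it apart.
    agreeing = sum(
        (labels_a[p] == labels_a[q]) == (labels_b[p] == labels_b[q])
        for p, q in pairs
    )
    return agreeing / len(pairs)
```

For instance, with invented labels `{"d1": "X", "d2": "X", "d3": "Y", "d4": "Y"}` and `{"d1": "1", "d2": "1", "d3": "1", "d4": "2"}`, three of the six pairs are treated consistently, giving 0.5. Pairs that disagree are the natural candidates to exclude or inspect before using either hierarchy as a benchmark.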

18.
Getz G, Vendruscolo M, Sachs D, Domany E. Proteins. 2002;46(4):405-415
We present an automated procedure to assign CATH and SCOP classifications to proteins whose FSSP score is available. CATH classification is assigned down to the topology level, and SCOP classification is assigned to the fold level. Because the FSSP database is updated weekly, this method makes it possible to update CATH and SCOP with the same frequency as well. Our predictions have a nearly perfect success rate when ambiguous cases are discarded. These ambiguous cases are intrinsic in any protein structure classification that relies on structural information alone. Hence, we introduce the "twilight zone for structure classification." We further suggest that to resolve these ambiguous cases, other criteria of classification, based also on information about sequence and function, must be used.

19.
In mass spectrometry (MS)-based bottom-up proteomics, protease digestion plays an essential role in profiling both proteome sequences and post-translational modifications (PTMs). Trypsin is the gold standard in digesting intact proteins into small-size peptides, which are more suitable for high-performance liquid chromatography (HPLC) separation and tandem MS (MS/MS) characterization. However, protein sequences lacking Lys and Arg cannot be cleaved by trypsin and may be missed in conventional proteomic analysis. Proteases with cleavage sites complementary to trypsin are widely applied in proteomic analysis to greatly improve the coverage of proteome sequences and PTM sites. In this review, we survey the common and newly emerging proteases used in proteomics analysis mainly in the last 5 years, focusing on their unique cleavage features and specific proteomics applications such as missing protein characterization, new PTM discovery, and de novo sequencing. In addition, we summarize the applications of proteases in structural proteomics and protein function analysis in recent years. Finally, we discuss the future development directions of new proteases and applications in proteomics.
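
The point that sequences lacking Lys and Arg escape tryptic cleavage can be seen in a minimal in-silico digest. The sketch assumes the simplified rule usually applied in software, cleavage on the C-terminal side of K or R unless the next residue is P, and ignores missed cleavages and modifications; the function name is hypothetical and the rule is a simplification, not something taken from this review.

```python
import re


def trypsin_digest(sequence: str) -> list[str]:
    """Split a one-letter protein sequence after each K or R not followed by P."""
    # Zero-width split points: preceded by K or R, and not followed by P.
    # Splitting on empty matches requires Python 3.7 or later.
    peptides = re.split(r"(?<=[KR])(?!P)", sequence.upper())
    return [p for p in peptides if p]  # drop the empty tail after a final K/R
```

A sequence such as `"AAGGLLP"`, which has no Lys or Arg, comes back as one uncleaved peptide, whereas `"MKWVTFR"` yields `["MK", "WVTFR"]`. Swapping the pattern for another protease's specificity is how a complementary digest would be sketched.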

20.
Hemopexin: structure, function, and regulation   (total citations: 1; self-citations: 0; citations by others: 1)
Hemopexin (HPX) is the plasma protein with the highest binding affinity to heme among known proteins. It is mainly expressed in the liver, and belongs to the acute phase reactants, the synthesis of which is induced after inflammation. Heme is potentially highly toxic because of its ability to intercalate into lipid membranes and to produce hydroxyl radicals. The binding strength between heme and HPX, and the presence of a specific heme-HPX receptor able to catabolize the complex and to induce intracellular antioxidant activities, suggest that hemopexin is the major vehicle for the transportation of heme in the plasma, thus preventing heme-mediated oxidative stress and heme-bound iron loss. In this review, we discuss the experimental data that support this view and show that the most important physiological role of HPX is to act as an antioxidant after blood heme overload, rather than to participate in iron metabolism. Particular attention is also paid to the structure of the protein and to its regulation during the acute phase reaction.
