Similar literature
 Found 20 similar documents
1.

Background  

The PD-(D/E)XK nuclease superfamily, initially identified in type II restriction endonucleases and later in many enzymes involved in DNA recombination and repair, is one of the most challenging targets for protein sequence analysis and structure prediction. Typically, the sequence similarity between these proteins is so low that most of the relationships between known members of the PD-(D/E)XK superfamily were identified only after the corresponding structures were determined experimentally. Thus, it is tempting to speculate that among the uncharacterized protein families there are potential nucleases that remain to be discovered, but their identification requires more sensitive tools than traditional PSI-BLAST searches.

2.
A comprehensive structural and functional in silico analysis of the medium-chain dehydrogenase/reductase (MDR) superfamily, covering 583 proteins, was carried out using extensive database mining and iterative blastp searches to identify all known members of the superfamily. Based on phylogenetic, sequence, and functional similarities, the protein members of the MDR superfamily were classified into three different taxonomic categories: (a) subfamilies, consisting of a closed group containing a set of ideally orthologous proteins that perform the same function; (b) families, each comprising a cluster of monophyletic subfamilies that possess significant sequence identity among them and may or may not share common substrates or reaction mechanisms; and (c) macrofamilies, each comprising a cluster of monophyletic protein families with protein members from the three domains of life, which includes at least one subfamily member that displays activity related to a very ancient metabolic pathway. In this context, a superfamily is a group of homologous protein families (and/or macrofamilies) of monophyletic origin that share at least a barely detectable sequence similarity but show the same 3D fold. The MDR superfamily encloses three macrofamilies, with eight families and 49 subfamilies. These subfamilies exhibit great functional diversity, including noncatalytic members with different subcellular, phylogenetic, and species distributions. This results from constant enzymogenesis and proteinogenesis within each kingdom, and highlights the huge plasticity that MDR superfamily members possess. Thus, through evolution a great number of taxa-specific new functions were acquired by MDRs. The generation of new functions fulfilled by proteins can be considered the essence of protein evolution. The mechanisms of protein evolution inside MDR are not constrained to conserve substrate specificity and/or chemistry of catalysis. Consequently, MDR functional diversity is more complex than sequence diversity. MDR is a very ancient protein superfamily that existed in the last universal common ancestor. It had at least two (and probably three) different ancestral activities related to formaldehyde metabolism and alcoholic fermentation. Eukaryotic members of this superfamily are more closely related to bacterial than to archaeal members; horizontal gene transfer among the domains of life appears to be a rare event in modern organisms.

3.

Background  

Based on sequence similarity, the superfamily of G protein-coupled receptors (GPRs) can be subdivided into several subfamilies, the members of which often share similar ligands. The sequence data provided by the human genome project allows us to identify new GPRs by in silico homology screening, and to predict their ligands.

4.

Background  

Inferences about protein function are often made based on sequence homology to other gene products of known activity. This approach is valuable for small families of conserved proteins but can be difficult to apply to large superfamilies of proteins with diverse functions. In this study we looked at sequence homology between members of the DJ-1/ThiJ/PfpI superfamily, which includes a human protein of unclear function, DJ-1, associated with inherited Parkinson's disease.

5.
6.

Background  

There are several evolutionarily unrelated and structurally dissimilar superfamilies of S-adenosylmethionine (AdoMet)-dependent methyltransferases (MTases). A new superfamily (SPOUT) has recently been characterized at the sequence level, and three structures of its members (1gz0, 1ipa, and 1k3r) have been solved. However, none of these structures includes the cofactor or the substrate. Due to the strong evolutionary divergence and the paucity of experimental information, no confident predictions of protein-ligand and protein-substrate interactions could be made, which hampered the study of sequence-structure-function relationships in the SPOUT superfamily.

7.

Background  

Studies of the structure-function relationship in proteins for which no 3D structure is available are often based on inspection of multiple sequence alignments. Many functionally important residues of proteins can be identified because they are conserved during evolution. However, residues that vary can also be critically important if their variation is responsible for diversity of protein function and improved phenotypes. If too few sequences are studied, the support for hypotheses on the role of a given residue will be weak, but analysis of large multiple alignments is too complex for simple inspection. When a large body of sequence and functional data is available for a protein family, mature data mining tools, such as machine learning, can be applied to extract information more easily, sensitively and reliably. We have undertaken such an analysis of voltage-gated potassium channels, a transmembrane protein family whose members play indispensable roles in electrically excitable cells.
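
The idea that conserved alignment columns point to functionally important residues can be illustrated with a minimal sketch. It scores each column of a multiple sequence alignment by Shannon entropy (0 bits means fully conserved). This is not the method used in the study above; the function names and the four-sequence toy alignment are assumptions made purely for illustration.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one column, ignoring gaps."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropies(alignment):
    """Return one entropy value per column of an equal-length alignment."""
    lengths = {len(seq) for seq in alignment}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    # zip(*alignment) walks the alignment column by column.
    return [column_entropy(col) for col in zip(*alignment)]


# Hypothetical toy alignment, not real channel sequences.
toy_alignment = ["GYGD", "GYGE", "GFGD", "GYGN"]
entropies = alignment_entropies(toy_alignment)
for position, value in enumerate(entropies, start=1):
    print(f"column {position}: {value:.3f} bits")
```

Low-entropy columns are candidates for conserved residues, while high-entropy columns are the variable positions that a machine-learning analysis would try to relate to functional differences.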

8.

Background  

Most non-coding RNA families exert their function by means of a conserved, common secondary structure. The Rfam database contains more than five hundred structurally annotated RNA families. Unfortunately, searching for new family members using covariance models (CMs) is very time consuming. Filtering approaches that use sequence conservation to reduce the number of CM searches are fast, but what they sacrifice in return is unknown.

9.

Background  

Gene-gene epistatic interactions likely play an important role in the genetic basis of many common diseases. Recently, machine-learning and data mining methods have been developed for learning epistatic relationships from data. A well-known combinatorial method that has been successfully applied for detecting epistasis is Multifactor Dimensionality Reduction (MDR). Jiang et al. created a combinatorial epistasis learning method called BNMBL to learn Bayesian network (BN) epistatic models. They compared BNMBL to MDR using simulated data sets. Each of these data sets was generated from a model that associates two SNPs with a disease and includes 18 unrelated SNPs. For each data set, BNMBL and MDR were used to score all 2-SNP models, and BNMBL learned significantly more correct models. In real data sets, we ordinarily do not know the number of SNPs that influence phenotype. BNMBL may not perform as well if we also scored models containing more than two SNPs. Furthermore, a number of other BN scoring criteria have been developed. They may detect epistatic interactions even better than BNMBL.
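
The exhaustive scoring of all 2-SNP models described above can be sketched as follows, assuming 20 SNPs per data set (the two associated SNPs plus the 18 unrelated ones). The scoring function here is a hypothetical stand-in, not the BNMBL or MDR score, and the SNP names are invented; the sketch only shows the enumeration and how the number of candidate models grows with model size.

```python
from itertools import combinations
from math import comb


def score_all_pairs(snp_names, score):
    """Score every unordered pair of SNPs with the supplied scoring function."""
    return {pair: score(pair) for pair in combinations(snp_names, 2)}


# Hypothetical data set layout: snp0 and snp1 are the two associated SNPs.
snps = [f"snp{i}" for i in range(20)]
causal = {"snp0", "snp1"}


def toy_score(pair):
    """Placeholder score: how many of the causal SNPs the pair contains."""
    return len(causal & set(pair))


scores = score_all_pairs(snps, toy_score)
best = max(scores, key=scores.get)

print(len(scores), "two-SNP models scored; best:", best)
print("three-SNP models that would need scoring:", comb(len(snps), 3))
```

With 20 SNPs there are 190 two-SNP models but 1140 three-SNP models, which gives a sense of why extending the search beyond pairs changes the problem.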

10.

Background  

In insects, hemocyanin superfamily proteins accumulate apparently to serve as sources of amino acids during metamorphosis, reproduction and development. Storage hexamerins are important members of the hemocyanin superfamily. Although insects possess storage hexamerins, very little is known about the character and specific functions of hexamerin 1 and storage protein 1 in insect development.

11.
12.

Background  

Annotation of sequences that share little similarity to sequences of known function remains a major obstacle in genome annotation. Some of the best methods of detecting remote relationships between protein sequences are based on matching sequence profiles. We analyse the superfamily-specific performance of sequence profile-profile matching. Our benchmark consists of a set of 16 protein superfamilies that are highly diverse at the sequence level. We relate the performance to the number of sequences in the profiles, the profile diversity and the extent of structural conservation in the superfamily.

13.

Background  

The detection of relationships between a protein sequence of unknown function and a sequence whose function has been characterised enables the transfer of functional annotation. However, in many cases these relationships cannot be identified easily from direct comparison of the two sequences. Methods which compare sequence profiles have been shown to improve the detection of these remote sequence relationships. However, the best method for building a profile of a known set of sequences has not been established. Here we examine how the type of profile built affects its performance, both in detecting remote homologs and in the resulting alignment accuracy. In particular, we consider whether it is better to model a protein superfamily using a single structure-based alignment that is representative of all known cases of the superfamily, or to use multiple sequence-based profiles each representing an individual member of the superfamily.

14.

Background  

In recent years, methods for remote homology detection have grown more and more sensitive and reliable. Automatic structure prediction servers relying on these methods can generate useful 3D models even below 20% sequence identity between the protein of interest and the known structure (template). When no homologs can be found in the protein structure database (PDB), the user would need to rerun the same search at regular intervals in order to make timely use of a template once it becomes available.
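
To make the "below 20% sequence identity" threshold concrete, here is a minimal sketch of one way to compute percent identity from a pairwise alignment. Definitions of sequence identity vary (in particular, which length goes in the denominator); this version assumes identity is counted over columns where neither sequence has a gap. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of gap-free aligned columns in which both sequences match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where both sequences carry a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Hypothetical aligned query and template fragments.
identity = percent_identity("ACDE-G", "ACQEFG")
print(f"{identity:.1f}% identity")
```

In this toy case four of the five gap-free columns match, so the two fragments are 80% identical; remote homologs of the kind discussed above would score far lower.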

15.

Background  

Odorant binding proteins (OBPs) are believed to shuttle odorants from the environment to the underlying odorant receptors, for which they could potentially serve as odorant presenters. Although several sequence-based search methods have been exploited for protein family prediction, less effort has been devoted to the prediction of OBPs from sequence data, and this area is more challenging due to poor sequence identity between these proteins.

16.

Background  

In animals, the biogenesis of some lipoprotein classes requires members of the ancient large lipid transfer protein (LLTP) superfamily, including the cytosolic large subunit of microsomal triglyceride transfer protein (MTP), vertebrate apolipoprotein B (apoB), vitellogenin (Vtg), and insect apolipophorin II/I precursor (apoLp-II/I). In most oviparous species, Vtg, a large glycolipoprotein, is the main egg yolk precursor protein.

17.

Background  

Conifers are a large group of gymnosperm trees which are separated from the angiosperms by more than 300 million years of independent evolution. Conifer genomes are extremely large and contain considerable amounts of repetitive DNA. Currently, conifer sequence resources exist predominantly as expressed sequence tags (ESTs) and full-length (FL) cDNAs. There is no genome sequence available for a conifer or any other gymnosperm. Conifer defence-related genes often group into large families with closely related members. The goals of this study are to assess the feasibility of targeted isolation and sequence assembly of conifer BAC clones containing specific genes from two large gene families, and to characterize large segments of genomic DNA sequence from a conifer for the first time.

18.

Background  

The SUPFAM database is a compilation of superfamily relationships between protein domain families of either known or unknown 3-D structure. In SUPFAM, sequence families from Pfam and structural families from SCOP are associated, using profile matching, to yield sequence superfamilies of known structure. Subsequently, all-against-all family profile matches are made to deduce a list of new potential superfamilies of as yet unknown structure.

19.

Background  

Recently, several members of the transforming growth factor-beta (TGF-beta) superfamily have been shown to be essential for regulating the growth and differentiation of ovarian follicles and thus fertility.

20.

Background  

General protein evolution models help determine the baseline expectations for the evolution of sequences, and they have been extensively useful in sequence analysis and for the computer simulation of artificial sequence data sets.
