Similar literature
Found 20 similar articles; search took 15 ms
1.
Membrane proteins play an important role in biochemical processes such as signal transduction and transmembrane transport. Membrane proteins are usually classified into five types [Chou, K.C., Elrod, D.W., 1999. Prediction of membrane protein types and subcellular locations. Proteins: Struct. Funct. Genet. 34, 137-153] or six types [Chou, K.C., Cai, Y.D., 2005. J. Chem. Inf. Modelling 45, 407-413]. Designing in silico methods to identify and classify membrane proteins can help us understand the structure and function of unknown proteins. This paper introduces an integrative approach, IAMPC, to classify membrane proteins based on protein sequences and protein profiles. Its modules extract the amino acid composition of the whole profiles, the amino acid composition of N-terminal and C-terminal profiles, the amino acid composition of profile segments and the dipeptide composition of the whole profiles. In the computational experiment, the overall accuracy of the proposed approach is comparable with that of the functional-domain-based method. In addition, the performance of the proposed approach is complementary to that of the functional-domain-based method for different membrane protein types.
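The composition features named in this abstract can be illustrated with a minimal sketch. It works on a plain sequence rather than on the profiles the paper uses, it leaves out the profile-segment features, and the function names and the `terminal_len` window are hypothetical choices, not details taken from IAMPC.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_composition(seq):
    """Return the 20 amino acid frequencies of seq, in AMINO_ACIDS order."""
    counts = Counter(seq)
    total = sum(counts[a] for a in AMINO_ACIDS)
    return [counts[a] / total if total else 0.0 for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return the 400 dipeptide frequencies of seq."""
    pairs = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    keys = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    total = sum(pairs[k] for k in keys)
    return [pairs[k] / total if total else 0.0 for k in keys]


def feature_vector(seq, terminal_len=10):
    """Concatenate whole, N-terminal, C-terminal and dipeptide compositions.

    terminal_len is an illustrative window size (an assumption).
    """
    return (aa_composition(seq)
            + aa_composition(seq[:terminal_len])
            + aa_composition(seq[-terminal_len:])
            + dipeptide_composition(seq))
```

A vector built this way has 20 + 20 + 20 + 400 = 460 entries and could be passed to any standard classifier.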

2.
MOTIVATION: Many studies have shown that database searches using position-specific score matrices (PSSMs) or profiles as queries are more effective at identifying distant protein relationships than are searches that use simple sequences as queries. One popular program for constructing a PSSM and comparing it with a database of sequences is Position-Specific Iterated BLAST (PSI-BLAST). RESULTS: This paper describes a new software package, IMPALA, designed for the complementary procedure of comparing a single query sequence with a database of PSI-BLAST-generated PSSMs. We illustrate the use of IMPALA to search a database of PSSMs for protein folds, and one for protein domains involved in signal transduction. IMPALA's sensitivity to distant biological relationships is very similar to that of PSI-BLAST. However, IMPALA employs a more refined analysis of statistical significance and, unlike PSI-BLAST, guarantees the output of the optimal local alignment by using the rigorous Smith-Waterman algorithm. Also, it is considerably faster when run with a large database of PSSMs than is BLAST or PSI-BLAST when run against the complete non-redundant protein database.
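To make the sequence-versus-PSSM comparison concrete, here is a minimal sketch of Smith-Waterman local alignment of one query sequence against one PSSM. It is not IMPALA's implementation: the PSSM layout (a list of per-column dictionaries), the linear gap penalty, and the default score of -1 for unlisted residues are all simplifying assumptions.

```python
def smith_waterman_pssm(seq, pssm, gap=4):
    """Return the best local alignment score of seq against a PSSM.

    pssm is a list of dicts, one per profile column, mapping residue -> score.
    gap is a linear gap penalty (an assumption; real tools use affine gaps).
    Residues missing from a column score -1 (also an assumption).
    """
    cols = len(seq)
    prev = [0] * (cols + 1)  # previous DP row (previous profile column)
    best = 0
    for column in pssm:
        cur = [0] * (cols + 1)
        for j in range(1, cols + 1):
            match = prev[j - 1] + column.get(seq[j - 1], -1)
            # Local alignment: a cell never drops below zero.
            cur[j] = max(0, match, prev[j] - gap, cur[j - 1] - gap)
            best = max(best, cur[j])
        prev = cur
    return best
```

Searching a database of PSSMs would then amount to calling this function once per PSSM and ranking the scores; the statistical significance analysis the abstract mentions is outside this sketch.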

3.
4.
5.
6.

Background  

Chitinases (EC 3.2.1.14) hydrolyze the β-1,4-linkages in chitin, an abundant N-acetyl-β-D-glucosamine polysaccharide that is a structural component of protective biological matrices such as insect exoskeletons and fungal cell walls. The glycoside hydrolase 18 (GH18) family of chitinases is an ancient gene family widely expressed in archaea, prokaryotes and eukaryotes. Mammals are not known to synthesize chitin or metabolize it as a nutrient, yet the human genome encodes eight GH18 family members. Some GH18 proteins lack an essential catalytic glutamic acid and are likely to act as lectins rather than as enzymes. This study used comparative genomic analysis to address the evolutionary history of the GH18 multiprotein family, from early eukaryotes to mammals, in an effort to understand the forces that shaped the human genome content of chitinase-related proteins.

7.
MOTIVATION: We present techniques for increasing the speed of sequence analysis using scoring matrices. Our techniques are based on calculating, for a given scoring matrix, the quantile function, which assigns a probability, or p, value to each segmental score. Our techniques also permit the user to specify a p threshold to indicate the desired trade-off between sensitivity and speed for a particular sequence analysis. The resulting increase in speed should allow scoring matrices to be used more widely in large-scale sequencing and annotation projects. RESULTS: We develop three techniques for increasing the speed of sequence analysis: probability filtering, lookahead scoring, and permuted lookahead scoring. In probability filtering, we compute the score threshold that corresponds to the user-specified p threshold. We use the score threshold to limit the number of segments that are retained in the search process. In lookahead scoring, we test intermediate scores to determine whether they will possibly exceed the score threshold. In permuted lookahead scoring, we score each segment in a particular order designed to maximize the likelihood of early termination. Our two lookahead scoring techniques substantially reduce the number of residues that must be examined. The fraction of residues examined ranges from 62% to 6%, depending on the p threshold chosen by the user. These techniques permit sequence analysis with scoring matrices at speeds that are several times faster than existing programs. On a database of 12 177 alignment blocks, our techniques permit sequence analysis at a speed of 225 residues/s for a p threshold of 10^-6, and 541 residues/s for a p threshold of 10^-20. In order to compute the quantile function, we may use either an independence assumption or a Markov assumption. We measure the effect of first- and second-order Markov assumptions and find that they tend to raise the p value of segments, when compared with the independence assumption, by average ratios of 1.30 and 1.69, respectively. We also compare our technique with the empirical 99.5th percentile scores compiled in the BLOCKSPLUS database, and find that they correspond on average to a p value of 1.5 x 10^-5. AVAILABILITY: The techniques described above are implemented in a software package called EMATRIX. This package is available from the authors for free academic use or for licensed commercial use. The EMATRIX set of programs is also available on the Internet at http://motif.stanford.edu/ematrix.
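The lookahead idea described in this abstract can be sketched as follows. This is an illustration of the general technique, not EMATRIX code: the matrix layout (a list of per-column dictionaries), the function names, and the choice to score unlisted residues with the column minimum are assumptions, and the score threshold is taken as given rather than derived from a p threshold.

```python
def max_suffix_scores(matrix):
    """suffix[i] is the largest score obtainable from columns i..end."""
    suffix = [0.0] * (len(matrix) + 1)
    for i in range(len(matrix) - 1, -1, -1):
        suffix[i] = suffix[i + 1] + max(matrix[i].values())
    return suffix


def lookahead_score(segment, matrix, threshold, suffix=None):
    """Score a segment column by column, stopping early when hopeless.

    Assumes len(segment) == len(matrix).
    Returns (score, residues_examined); score is None when the segment
    was abandoned because it could no longer reach the threshold.
    """
    if suffix is None:
        suffix = max_suffix_scores(matrix)
    total = 0.0
    for i, residue in enumerate(segment):
        # Unlisted residues get the column minimum (an assumption).
        total += matrix[i].get(residue, min(matrix[i].values()))
        # Even a perfect remainder cannot reach the threshold: stop here.
        if total + suffix[i + 1] < threshold:
            return None, i + 1
    return total, len(segment)
```

The second return value counts how many residues were actually examined, which is the quantity the abstract reports as a fraction. A permuted variant would visit the columns in a different order; how that order is chosen is not specified here.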

8.
We extend the concept of the motif as a tool for characterizing protein families and explore the feasibility of a sparse "motif" that is the length of the protein sequence itself. The type of motif discussed is a sparse family signature consisting of a set of N key residue positions (A1, A2...AN) preceded by gaps (G), thus: G1A1G2A2...GNAN. Both a residue and a gap can be variable. A signature is matched to a protein sequence and scored using a dynamic programming algorithm which permits variability in gap distance and residue type. Generating a signature involves identifying residues associated with points of contact in interactions between secondary structure elements. A raw signature consists of a set of positions with potential key structural roles sampled from a sequence alignment constructed with reference to this contact data. Raw signatures are refined by sampling different gap-residue pairs until the specificity of a signature for the family cannot be further improved. We summarize signatures for nine protein families of diverse fold and function and present results of scans against the OWL protein sequence database. The implications of such signatures are discussed.
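A signature of the form G1A1G2A2...GNAN can be matched by dynamic programming along the following lines. The scoring scheme here (one point per accepted residue, a fixed mismatch score, and a penalty proportional to the deviation from the expected gap) is a hypothetical stand-in, since the abstract does not give the authors' scoring function.

```python
def match_signature(seq, signature, gap_penalty=1.0, mismatch=-2.0):
    """Return the best score for placing a sparse signature on seq.

    signature is a list of (gap, residues) pairs: gap is the expected
    number of residues before the key position, residues is the set of
    accepted residue types. All scoring constants are illustrative.
    """
    n = len(seq)
    neg_inf = float("-inf")
    prev = None  # prev[i]: best score with the previous key position at i
    for k, (gap, residues) in enumerate(signature):
        cur = [neg_inf] * n
        for j in range(n):
            hit = 1.0 if seq[j] in residues else mismatch
            if k == 0:
                # The first gap is measured from the start of the sequence.
                base = -gap_penalty * abs(j - gap)
            else:
                base = max(
                    (prev[i] - gap_penalty * abs((j - i - 1) - gap)
                     for i in range(j)),
                    default=neg_inf,
                )
            cur[j] = base + hit
        prev = cur
    return max(prev) if prev else 0.0
```

This runs in O(N * L^2) time for N key positions and a sequence of length L, which is acceptable for a sketch but would need pruning for a database scan.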

9.
The RAP55 protein family is evolutionarily conserved in eukaryotes. Two highly conserved paralogues, RAP55A and RAP55B, exist in vertebrates; their functional properties and expression patterns remain to be compared. RAP55 proteins share multiple domains: the LSm14 domain, a serine/threonine-rich region, an FDF (phenylalanine-aspartate-phenylalanine) motif, an FFD-TFG box and RGG (arginine-glycine-glycine) repeats. Together these domains are responsible for RAP55 proteins participating in translational repression, incorporation into mRNP particles, protein-protein interactions, P-body formation and stress granule localisation. All RAP55A proteins localise to P-body-like complexes either in the germline or in somatic cells. Xenopus laevis RAP55B has been shown to be part of translationally repressed mRNP complexes in early oocytes. Together these findings suggest that this protein family has evolved a common and fundamental role in the control of mRNA translation. Furthermore, human RAP55A is an autoantigen detected in the serum of patients with primary biliary cirrhosis (PBC). The link between RAP55A, P-bodies and PBC remains to be elucidated.

10.
Genome-scale metabolic networks can be reconstructed. The systemic biochemical properties of these networks can now be studied. Here, genome-scale reconstructed metabolic networks were analysed using singular value decomposition (SVD). All the individual biochemical conversions contained in a reconstructed metabolic network are described by a stoichiometric matrix (S). SVD of S led to the definition of the underlying modes that characterize the overall biochemical conversions that take place in a network and rank-ordered their importance. The modes were shown to correspond to systemic biochemical reactions and they could be used to identify the groups and clusters of individual biochemical reactions that drive them. Comparative analysis of the Escherichia coli, Haemophilus influenzae, and Helicobacter pylori genome-scale metabolic networks showed that the four dominant modes in all three networks correspond to: (1) the conversion of ATP to ADP, (2) redox metabolism of NADP, (3) proton-motive force, and (4) inorganic phosphate metabolism. The sets of individual metabolic reactions deriving these systemic conversions, however, differed among the three organisms. Thus, we can now define systemic metabolic reactions, or eigen-reactions, for the study of systems biology of metabolism and have a basis for comparing the overall properties of genome-specific metabolic networks.
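As a small illustration of what a dominant mode of a stoichiometric matrix is, the sketch below approximates the largest singular value of S and its singular vectors by power iteration, using only the standard library. The function name and the toy matrix in the usage note are hypothetical and are not data from the paper; a full decomposition would normally come from a numerical library such as NumPy (`numpy.linalg.svd`).

```python
import math


def dominant_mode(S, iterations=200):
    """Approximate the leading singular triplet of S by power iteration.

    S is a list of rows (metabolites) by columns (reactions).
    Returns (sigma, u, v): u weights the metabolites and v the reactions
    in the dominant mode. Assumes the uniform start vector is not
    orthogonal to the leading right singular vector.
    """
    m, n = len(S), len(S[0])
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        u = [sum(S[i][j] * v[j] for j in range(n)) for i in range(m)]  # S v
        w = [sum(S[i][j] * u[i] for i in range(m)) for j in range(n)]  # S^T u
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    u = [sum(S[i][j] * v[j] for j in range(n)) for i in range(m)]
    sigma = math.sqrt(sum(x * x for x in u))
    if sigma > 0.0:
        u = [x / sigma for x in u]
    return sigma, u, v
```

For a toy matrix with rows ATP and ADP and three reactions, `[[-1, -1, 1], [1, 1, -1]]`, the left singular vector has equal and opposite weights on ATP and ADP, which is the kind of pattern the abstract describes as an ATP-to-ADP conversion mode.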

11.
Protein disorder is characterized by the lack of a stable 3D structure, and is considered to be involved in a number of important protein functions such as regulatory and signalling events. We developed a web application, POODLE-S, which predicts disordered regions from amino acid sequences by using physicochemical features and a reduced amino acid set of a position-specific scoring matrix. Availability: POODLE-S is available from http://mbs.cbrc.jp/poodle/poodle-s.html and can be used by both academic and commercial users.

12.
We designed and synthesized new, fluorescent, non-natural amino acids that emit fluorescence of wavelengths longer than 500 nm and are accepted by an Escherichia coli cell-free translation system. We synthesized p-aminophenylalanine derivatives linked with BODIPY fluorophores at the p-amino group and introduced them into streptavidin using the four-base codon CGGG in a cell-free translation system. Practically, the incorporation efficiency was high enough for BODIPYFL, BODIPY558 and BODIPY576. Next, we incorporated BODIPYFL-aminophenylalanine and BODIPY558-aminophenylalanine into different positions of calmodulin as a donor and acceptor pair for fluorescence resonance energy transfer (FRET) using two four-base codons. Fluorescence spectra and polarization measurements revealed that substantial FRET changes upon the binding of calmodulin-binding peptide occurred for the double-labeled calmodulins containing BODIPY558 at the N terminus and BODIPYFL at the Gly40, Phe99 and Leu112 positions. These results demonstrate the usefulness of FRET based on the position-specific double incorporation of fluorescent amino acids for analyzing conformational changes of proteins.

13.
We now know the structures of over 200 proteins to atomic resolution. Despite the impressive extent and quality of the results, crystal-structure analysis has often been thought of as limited in scope, not only in its restriction to samples that can be crystallized, but in the more important respect that taking ‘snapshots’ of proteins does not directly address the complex spatio-temporal organization of the processes in which proteins participate. It is suggested here that, as the field has matured, this second limitation is gradually being overcome. As we gain increased access to structures of proteins in different conformational states – for example, in conformations produced by different states of ligation – and to families of homologous proteins, we can proceed from the statics of protein structure to the dynamics of conformational change, function, and evolution. A new scientific speciality has grown up around the solved structures: it has as its goal the elucidation of general principles of protein structure and function, to provide a theoretical framework for understanding the properties of proteins revealed by experiment. In this article we shall discuss some of the activity in this field. It will emerge clearly, I believe, that the increasing number and variety of solved structures is exerting a cumulative force. General principles are emerging from comparisons of related proteins and contrasts of dissimilar ones: the whole corpus of data is greater than the sum of the parts.

14.

15.
Sialidases are glycohydrolytic enzymes, present from viruses to mammals, that remove sialic acid from oligosaccharide chains. Four different sialidase forms are known in vertebrates: the lysosomal NEU1, the cytosolic NEU2 and the membrane-associated NEU3 and NEU4. These enzymes modulate the cell sialic acid content and are involved in several cellular processes and pathological conditions. Molecular defects in NEU1 are responsible for sialidosis, an inherited disease characterized by lysosomal storage disorder and neurodegeneration. The studies on the biology of sialic acids and sialyltransferases, the anabolic counterparts of sialidases, have revealed a complex picture with more than 50 sialic acid variants selectively present in the different branches of the tree of life. The gain and loss of specific sialoconjugates have been proposed as key events in the evolution of deuterostomes and Homo sapiens, as well as in host-pathogen interactions. To date, less attention has been paid to the evolution of sialidases. Thus we have conducted a survey on the state of the sialidase family in metazoans. Using an in silico approach, we identified and characterized sialidase orthologs from 21 different organisms distributed among the evolutionary tree: Metazoa relative (Monosiga brevicollis), early Deuterostomia, precursor of Chordata and Vertebrata (teleost fishes, amphibians, reptiles, avians and early and recent mammals). We were able to reconstruct the evolution of the sialidase protein family from the ancestral sialidase NEU1 and identify a new form of the enzyme, NEU5, representing an intermediate step in the evolution leading to the modern NEU3, NEU4 and NEU2. Our study provides new insights on the mechanisms that shaped the substrate specificity and other peculiar properties of the modern mammalian sialidases. Moreover, we further confirm findings on the catalytic residues and identify enzyme loop portions that behave as rapidly diverging regions and may be involved in the evolution of specific properties of sialidases.

16.
17.
18.
Zinc metalloenzymes catalyze many important cellular reactions. Recently, the involvement of zinc in the catalysis of alkylation of sulfur groups has gained prominence. Current studies of the zinc metalloenzyme protein farnesyltransferase have shed light on its structure and catalytic mechanism, as well as the general mechanism of zinc-catalyzed sulfur alkylation.

19.
We present an analysis of 203 completed genomes in the Gene3D resource (including 17 eukaryotes), which demonstrates that the number of protein families is continually expanding over time and that singleton-sequences appear to be an intrinsic part of the genomes. A significant proportion of the proteomes can be assigned to fewer than 6000 well-characterized domain families with the remaining domain-like regions belonging to a much larger number of small uncharacterized families that are largely species specific. Our comprehensive domain annotation of 203 genomes enables us to provide more accurate estimates of the number of multi-domain proteins found in the three kingdoms of life than previous calculations. We find that 67% of eukaryotic sequences are multi-domain compared with 56% of sequences in prokaryotes. By measuring the domain coverage of genome sequences, we show that the structural genomics initiatives should aim to provide structures for less than a thousand structurally uncharacterized Pfam families to achieve reasonable structural annotation of the genomes. However, in large families, additional structures should be determined as these would reveal more about the evolution of the family and enable a greater understanding of how function evolves.

20.

Background  

A promising direction in the analysis of gene expression focuses on the changes in expression of specific predefined sets of genes that are known in advance to be related (e.g., genes coding for proteins involved in cellular pathways or complexes). Such an analysis can reveal features that are not easily visible from the variations in the individual genes and can lead to a picture of expression that is more biologically transparent and accessible to interpretation. In this article, we present a new method of this kind that operates by quantifying the level of 'activity' of each pathway in different samples. The activity levels, which are derived from singular value decompositions, form the basis for statistical comparisons and other applications.


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  京ICP备09084417号