Similar literature
 20 similar articles found (search time: 15 ms)
1.
2.
The EF-hand calcium-binding protein S100B has been shown to interact in vitro in a calcium-sensitive manner with many substrates. These potential S100B target proteins have been screened for the preservation of a previously identified consensus sequence across species. The results were compared to known structural and in vitro properties of the proteins to rationalize choices for potential binding partners. Our approach uncovered four oligomeric proteins, tubulin (alpha and beta), glial fibrillary acidic protein (GFAP), desmin, and vimentin, that have conserved regions matching the consensus sequence. In the type III intermediate filament proteins (GFAP, vimentin, and desmin), this region corresponds to a portion of a coiled-coil (helix 2A), the structural element responsible for their assembly. In tubulin, the sequence matches correspond to regions of alpha and beta tubulin found at the alpha-beta tubulin interface. In both cases, these consensus sequence matches provide a logical explanation for in vitro observations that S100B is able to inhibit oligomerization of these proteins.

3.

Background  

The evolution of the full repertoire of proteins encoded in a given genome is mostly driven by gene duplications, deletions, and sequence modifications of existing proteins. Indirect information about relative rates and other intrinsic parameters of these three basic processes is contained in the proteome-wide distribution of sequence identities of pairs of paralogous proteins.

4.
Summary: The nucleotide sequence of the RNA of the bacteriophage MS2 was examined by computer for internal patterns. We used a technique which analyzes a nucleotide sequence as a Markov chain. This led us to discover patterns within the translated and untranslated regions of the RNA in addition to those patterns formed by the codons. One of the more surprising results of this analysis was the discovery that the non-coding sequences in the genome are as highly ordered, although in a different sense, as the genes themselves. Also of interest was the discovery that the codon frequency distributions for the three genes are similar.
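
As a rough illustration of treating a nucleotide sequence as a Markov chain, the sketch below estimates first-order transition probabilities from adjacent-base counts. The abstract does not state the chain order or the estimator used, so the first-order model and the function name `transition_probabilities` are assumptions made for illustration only.

```python
from collections import Counter


def transition_probabilities(seq):
    """Estimate first-order Markov transition probabilities P(next | current).

    Assumption: a first-order chain estimated by simple counting; the
    original analysis may have used a different order or estimator.
    """
    pair_counts = Counter(zip(seq, seq[1:]))   # counts of adjacent (current, next) pairs
    from_counts = Counter(seq[:-1])            # how often each base has a successor
    return {
        (current, nxt): count / from_counts[current]
        for (current, nxt), count in pair_counts.items()
    }


# Toy RNA fragment (not MS2 data).
probs = transition_probabilities("AACA")
```

A strongly non-uniform row in this table (for example, one base almost always followed by the same base) is the kind of internal ordering such an analysis looks for.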

5.
Detection of homologous proteins by an intermediate sequence search (total citations: 2; self-citations: 0; citations by others: 2)
We developed a variant of the intermediate sequence search method (ISS(new)) for detection and alignment of weakly similar pairs of protein sequences. ISS(new) relates two query sequences by an intermediate sequence that is potentially homologous to both queries. The improvement was achieved by a more robust overlap score for a match between the queries through an intermediate. The approach was benchmarked on a data set of 2369 sequences of known structure with insignificant sequence similarity to each other (BLAST E-value larger than 0.001); 2050 of these sequences had a related structure in the set. ISS(new) performed significantly better than both PSI-BLAST and a previously described intermediate sequence search method. PSI-BLAST could not detect correct homologs for 1619 of the 2369 sequences. In contrast, ISS(new) assigned a correct homolog as the top hit for 121 of these 1619 sequences, while incorrectly assigning homologs for only nine targets; it did not assign homologs for the remainder of the sequences. By our estimate, ISS(new) may be able to assign the folds of domains in approximately 29,000 of the approximately 500,000 sequences unassigned by PSI-BLAST, with 90% specificity (1 - false positives fraction). In addition, we show that the 15 alignments with the most significant BLAST E-values include the nearly best alignments constructed by ISS(new).
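
The general idea of linking two queries through an intermediate can be sketched as follows. This is not the ISS(new) overlap score, which the abstract does not define; it only assumes that each query has a precomputed set of database hits with aligned regions on the hit sequence, and it ranks shared hits by the plain length of the overlapping region. The names `overlap_length`, `link_via_intermediates` and the `min_overlap` threshold are hypothetical.

```python
def overlap_length(region_a, region_b):
    """Length of the overlap between two half-open intervals (start, end)."""
    return max(0, min(region_a[1], region_b[1]) - max(region_a[0], region_b[0]))


def link_via_intermediates(hits_a, hits_b, min_overlap=30):
    """Return intermediates matched by both queries on overlapping regions.

    hits_a / hits_b map an intermediate sequence ID to the (start, end)
    region of that intermediate aligned to the respective query.
    Assumption: overlap length stands in for the paper's overlap score.
    """
    links = []
    for seq_id in hits_a.keys() & hits_b.keys():   # intermediates hit by both queries
        overlap = overlap_length(hits_a[seq_id], hits_b[seq_id])
        if overlap >= min_overlap:
            links.append((seq_id, overlap))
    # Best-supported intermediates first.
    return sorted(links, key=lambda item: item[1], reverse=True)
```

Requiring the two queries to match the same region of the intermediate is what guards against linking unrelated domains of a multi-domain intermediate.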

6.
Wang J, Feng JA. Protein Engineering, 2003, 16(11): 799-807
This paper reports an extensive sequence analysis of the alpha-helices of proteins. alpha-Helices were extracted from the Protein Data Bank (PDB) and were divided into groups according to their sizes. It was found that some amino acids had differential propensity values for adopting helical conformation in short, medium and long alpha-helices. Pro and Trp had a significantly higher propensity for helical conformation in short helices than in medium and long helices. Trp was the strongest helix conformer in short helices. Sequence patterns favoring helical conformation were derived from a neighbor-dependent sequence analysis of proteins, which calculated the effect of neighboring amino acid type on the propensity of residues for adopting a particular secondary structure in proteins. This method produced an enhanced statistical significance scale that allowed us to explore the positional preference of amino acids for alpha-helical conformations. It was shown that the amino acid pair preference for alpha-helix had a unique pattern and this pattern was not always predictable by assuming proportional contributions from the individual propensity values of the amino acids. Our analysis also yielded a series of amino acid dyads that showed preference for alpha-helix conformation. The data presented in this study, along with our previous study on loop sequences of proteins, should prove useful for developing potential 'codes' for recognizing sequence patterns that are favorable for specific secondary structural elements in proteins.
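
A propensity value of the kind discussed above is commonly computed as the frequency of a residue inside helices divided by its overall frequency. The sketch below uses that conventional single-residue definition; the neighbor-dependent analysis in the paper is more elaborate and is not reproduced here. The function name `helix_propensity` and the one-letter state string (with `H` marking helix) are assumptions.

```python
from collections import Counter


def helix_propensity(sequence, states):
    """Propensity of each residue type for helix.

    propensity = (share of the residue among helical positions)
                 / (share of the residue among all positions)
    Assumption: `states` is a string of the same length as `sequence`
    in which "H" marks a helical position.
    """
    total = Counter(sequence)
    in_helix = Counter(res for res, state in zip(sequence, states) if state == "H")
    n_all = len(sequence)
    n_helix = sum(in_helix.values())
    if n_helix == 0:
        raise ValueError("no helical positions in the input")
    return {
        res: (in_helix[res] / n_helix) / (total[res] / n_all)
        for res in total
    }
```

Values above 1 indicate a residue that is over-represented in helices. Computing the same ratio separately for short, medium and long helices would give length-dependent propensities of the sort the abstract compares.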

7.
8.
H Jörnvall. FEBS Letters, 1999, 456(1): 85-88
Motifer is a software tool able to find, directly in nucleotide databases, very distant homologues to an amino acid query sequence. It focuses searches on a specific amino acid pattern, scoring the matching and intervening residues as specified by the user. The program has been developed for searching databases of expressed sequence tags (ESTs), but it is also well suited to search genomic sequences. The query sequence can be a variable pattern with alternative amino acids or gaps and the sequences searched can contain introns or sequencing errors with accompanying frame shifts. Other features include options to generate a searchable output, set the maximal sequencing error frequency, limit searches to given species, or exclude already known matches. Motifer can find sequence homologues that other search algorithms would deem unrelated or would not find because of sequencing errors or too large a number of other homologues. The ability of Motifer to find relatives to a given sequence is exemplified by searches for members of the transforming growth factor-beta family and for proteins containing a WW-domain. The functions aimed at enhancing EST searches are illustrated by the 'in silico' cloning of a novel cytochrome P450 enzyme.

9.
10.
In previous work, we have shown that a set of characteristics, defined as (code frequency) pairs, can be derived from a protein family by the use of a signal-processing method. This method enables the location and extraction of sequence patterns by taking into account each (code frequency) pair individually. In the present paper, we propose to extend this method in order to detect and visualize patterns by taking into account several pairs simultaneously. Two ‘multifrequency’ methods are described. The first one is based on a rewriting of the sequences with new symbols which summarize the frequency information. The second method is based on a clustering of the patterns associated with each pair. Both methods lead to the definition of significant consensus sequences. Some results obtained with calcium-binding proteins and serine proteases are also discussed. Received on March 6, 1990; accepted on September 24, 1990

11.
We develop a probabilistic system for predicting the subcellular localization of proteins and estimating the relative population of the various compartments in yeast. Our system employs a Bayesian approach, updating a protein's probability of being in a compartment, based on a diverse range of 30 features. These range from specific motifs (e.g. signal sequences or the HDEL motif) to overall properties of a sequence (e.g. surface composition or isoelectric point) to whole-genome data (e.g. absolute mRNA expression levels or their fluctuations). The strength of our approach is the easy integration of many features, particularly the whole-genome expression data. We construct a training and testing set of approximately 1300 yeast proteins with an experimentally known localization from merging, filtering, and standardizing the annotation in the MIPS, Swiss-Prot and YPD databases, and we achieve 75% accuracy on individual protein predictions using this dataset. Moreover, we are able to estimate the relative protein population of the various compartments without requiring a definite localization for every protein. This approach, which is based on an analogy to formalism in quantum mechanics, gives better accuracy in determining relative compartment populations than that obtained by simply tallying the localization predictions for individual proteins (on the yeast proteins with known localization, 92% versus 74%). Our training and testing also highlight which of the 30 features are informative and which are redundant (19 being particularly useful). After developing our system, we apply it to the 4700 yeast proteins with currently unknown localization and estimate the relative population of the various compartments in the entire yeast genome. An unbiased prior is essential to this extrapolated estimate; for this, we use the MIPS localization catalogue, and adapt recent results on the localization of yeast proteins obtained by Snyder and colleagues using a minitransposon system. Our final localizations for all approximately 6000 proteins in the yeast genome are available over the web at: http://bioinfo.mbb.yale.edu/genome/localize.
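
The Bayesian updating step described above can be sketched in a few lines: a prior over compartments is multiplied by the likelihood of an observed feature value in each compartment and then renormalized. The compartment names, the numbers, and the function name `bayes_update` below are illustrative assumptions, not values or code from the paper, and treating features as independent is likewise an assumption.

```python
def bayes_update(prior, likelihood):
    """Update compartment probabilities after observing one feature.

    prior:      {compartment: P(compartment)}
    likelihood: {compartment: P(observed feature value | compartment)}
    Assumption: features are treated as conditionally independent, so
    the function can be applied once per feature in sequence.
    """
    unnormalized = {c: prior[c] * likelihood[c] for c in prior}
    total = sum(unnormalized.values())
    return {c: value / total for c, value in unnormalized.items()}


# Hypothetical two-compartment example.
prior = {"nucleus": 0.5, "cytoplasm": 0.5}
posterior = bayes_update(prior, {"nucleus": 0.8, "cytoplasm": 0.2})
```

Summing the posterior probabilities over all proteins, rather than counting each protein's single most likely compartment, is one simple way to estimate compartment populations without forcing a definite localization on every protein.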

12.
The amino-acid sequence of the bilin binding protein (BBP) of the butterfly Pieris brassicae has been determined. The apoprotein with a length of 173 amino-acid residues has a molecular mass of 19,676 Da. The sequence analysis was performed by automated Edman degradation of the intact apoprotein and of fragments as large as possible generated from different digestions. The 3-dimensional structure of BBP, determined by Huber et al. (Huber, R., Schneider, M., Epp, O., Mayr, I., Messerschmidt, A., Pflugrath, J. & Kayser, H. (1987) J. Mol. Biol. 195, 423-434 and Huber, R., Schneider, M., Mayr, I., Müller, R., Deutzmann, R., Suter, F., Zuber, H., Falk, H. & Kayser, H. (1987) J. Mol. Biol. 198, 499-513) down to 2-Å resolution, exhibits a similar conformation to the human retinol binding protein. Sawyer (Sawyer, L. (1987) Nature (London) 327, 659) demonstrated that proteins from a wide variety of sources can be gathered into a "superfamily". Computer searches of data banks yielded a new member of this superfamily, namely human alpha 1-acid glycoprotein. One of the functions of the listed proteins is to bind and transport small hydrophobic molecules in serum.

13.
A mathematical method has been developed to search for latent periodicity in protein amino-acid sequences and other symbolic sequences using dynamic programming and random matrices. The method allows the detection of latent periodicity with insertions and deletions at positions that are unknown beforehand. The method has been applied to search for periodicity in the amino-acid sequences of several proteins and in the euro/dollar exchange rate since 2001. The presence of a long period with insertions and deletions in amino-acid sequences is shown. A period length of seven amino acids is observed in proteins that contain supercoiled regions (a coiled-coil structure), as well as period lengths of six, five, or more amino acids. Period lengths of 6 and 7 days, as well as 24 and 25 h, are observed in the analyzed financial time series; note that this periodicity is detectable only when insertions and deletions are taken into account. The causes that underlie the occurrence of latent periodicity with insertions and deletions in amino-acid sequences and financial time series are discussed.

14.
The entire amino acid sequence of the protein subunit of phosphofructokinase from Bacillus stearothermophilus has been established mainly by sequence analysis of cyanogen bromide fragments and of peptides derived from these fragments by further digestion with proteolytic enzymes. Overlaps of the cyanogen bromide fragments as well as peptide sequences necessary to complement and to confirm tentative assignments within the larger peptide fragments were obtained from the sequences of selected peptides isolated from tryptic and chymotryptic digests of the intact S-[14C]-carboxymethylated protein. Sequence information was also provided by automated sequence analysis of the intact protein subunit and of some of the larger peptide fragments. The sequence is as follows: (See Text).

15.
Using information theory to search for co-evolving residues in proteins (total citations: 2; self-citations: 0; citations by others: 2)
MOTIVATION: Some functionally important protein residues are easily detected since they correspond to conserved columns in a multiple sequence alignment (MSA). However, important residues may also mutate, with compensatory mutations occurring elsewhere in the protein, which serve to preserve or restore functionality. It is difficult to distinguish these co-evolving sites from other non-conserved sites. RESULTS: We used Mutual Information (MI) to identify co-evolving positions. Using in silico evolved MSAs, we examined the effects of the number of sequences, the size of amino acid alphabet and the mutation rate on two sources of background MI: finite sample size effects and phylogenetic influence. We then assessed the performance of various normalizations of MI in enhancing detection of co-evolving positions and found that normalization by the pair entropy was optimal. Real protein alignments were analyzed and co-evolving isolated pairs were often found to be in contact with each other. AVAILABILITY: All data and program files can be found at http://www.biochem.uwo.ca/cgi-bin/CDD/index.cgi
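
The mutual information between two alignment columns, and its normalization by the pair (joint) entropy, can be written as MI(i, j) = H(i) + H(j) - H(i, j) and MI / H(i, j). The following is a minimal sketch using plain frequency counts; it makes no attempt to correct for the finite-sample or phylogenetic background that the abstract discusses, and the function names are assumptions.

```python
from collections import Counter
from math import log2


def entropy(symbols):
    """Shannon entropy (bits) of the empirical distribution of `symbols`."""
    n = len(symbols)
    return -sum((count / n) * log2(count / n) for count in Counter(symbols).values())


def normalized_mi(col_i, col_j):
    """Mutual information of two alignment columns divided by their pair entropy.

    col_i and col_j hold the residues of two columns, one per sequence,
    in the same sequence order. Returns 0.0 for fully conserved pairs,
    where the pair entropy is zero.
    """
    h_i, h_j = entropy(col_i), entropy(col_j)
    h_ij = entropy(list(zip(col_i, col_j)))   # entropy of the paired residues
    mi = h_i + h_j - h_ij
    return mi / h_ij if h_ij > 0 else 0.0
```

With this normalization, two columns that vary in perfect lockstep score 1, while two columns that vary independently score 0 regardless of how variable each is on its own.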

16.
We describe a web-based resource to identify, search and analyze sequence patterns conserved in the multiple sequence alignments of orthologous promoters from closely related or distant Saccharomyces spp. The webtool interfaces with a database where conserved sequence patterns (greater than 4 bp) have been previously extracted from genome-wide promoter alignments, allowing one to carry out user-defined genome-wide searches for conserved sequences to assist in the discovery of novel promoter elements based on comparative genomics. The web-based server can be accessed at http://www2.imtech.res.in/ anand/sacch_prom_pat.html.

17.
beta-Lactoglobulin isolated from horse colostrum is heterogeneous and contains two components: beta-lactoglobulin I and beta-lactoglobulin II. These two proteins are monomeric and show differences in their electrophoretic mobilities, chain lengths and primary structures. The complete amino-acid sequence of beta-lactoglobulin II was determined by automated Edman degradation of the intact protein and of the peptides derived from these by digestion with trypsin or chymotrypsin and by chemical cleavage with cyanogen bromide. Unlike other beta-lactoglobulins which contain 162 amino acids, horse beta-lactoglobulin II is unique in that it contains 166 amino acids. The additional four amino acids represent an insertion between positions 116 and 117 of other beta-lactoglobulins so far sequenced, including horse beta-lactoglobulin I. Sequence comparison of beta-lactoglobulins I and II from horse colostrum reveals 48 amino acid substitutions (30%). Such a diversity between members of the beta-lactoglobulin gene family has not been encountered before. Sequence comparison with bovine beta-lactoglobulin A shows 85 amino acid replacements accounting for 53% of the residues. The structural homology with human retinol-binding protein may reveal similar biological functions and clues to the origin of milk proteins.

18.
Metallothioneins: proteins in search of function (total citations: 43; self-citations: 0; citations by others: 43)
M Karin. Cell, 1985, 41(1): 9-10

19.
Heat shock proteins: the search for functions (total citations: 36; self-citations: 4; citations by others: 32)

20.

   

FASH (Fourier Alignment Sequence Heuristics) is a web application, based on the Fast Fourier Transform, for finding remote homologs within a long nucleic acid sequence. Given a query sequence and a long text sequence (e.g., the human genome), FASH detects subsequences within the text that are remotely similar to the query. FASH offers an alternative approach to BLAST/FASTA for querying long RNA/DNA sequences. FASH differs from these other approaches in that it does not depend on the existence of contiguous seed sequences in its initial detection phase. The FASH web server is user friendly and very easy to operate.
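
The abstract does not describe FASH's algorithm in detail. A standard way to use the Fast Fourier Transform for seed-free matching, shown here purely as an assumed illustration, is to encode each base as an indicator vector and cross-correlate query and text in the frequency domain, which yields the number of matching positions at every offset. The small recursive FFT and the names `fft` and `match_counts` are written for this sketch and are not FASH code.

```python
import cmath


def fft(values, invert=False):
    """Recursive radix-2 FFT; len(values) must be a power of two.

    With invert=True the result is the unnormalized inverse transform.
    """
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def match_counts(text, query, alphabet="ACGT"):
    """Number of positions where query matches text, for every offset."""
    size = 1
    while size < len(text) + len(query):   # zero-pad to avoid circular wrap-around
        size *= 2
    totals = [0.0] * size
    for base in alphabet:
        t = [1.0 if c == base else 0.0 for c in text] + [0.0] * (size - len(text))
        q = [1.0 if c == base else 0.0 for c in query] + [0.0] * (size - len(query))
        # Cross-correlation: inverse FFT of FFT(text) * conj(FFT(query)).
        product = [a * b.conjugate() for a, b in zip(fft(t), fft(q))]
        correlation = fft(product, invert=True)
        for i in range(size):
            totals[i] += correlation[i].real / size
    return [round(totals[s]) for s in range(len(text) - len(query) + 1)]
```

Offsets with a high count are candidate matches even when the matching positions are scattered, which is why a correlation-based scan does not need a contiguous seed.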
