Found 20 similar articles (search time: 15 ms)
1.
We identified latent periodicity in the catalytic domains of approximately 85% of serine/threonine and tyrosine protein kinases. Similar results were obtained for 22 other protein domains. We also designed a noise decomposition method, which aims to distinguish between different periodicity types of the same period length. The method is to be used in conjunction with cyclic profile alignment, and this combination is able to reveal structure-related or function-related patterns of latent periodicity. Possible origins of the periodic structure of protein kinase active sites are discussed. In summary, we presume that latent periodicity is a common property of many catalytic protein domains.
2.
Latent amino acid repeats seem to be widespread in genetic sequences and to reflect their structure, function, and evolution. We have recently identified latent periodicity in more than 150 protein families including protein kinases and various nucleotide-binding proteins. The latent repeats in these families were correlated to their structure and evolution. However, a majority of known protein families were not identified with our latent periodicity search algorithm. The presumed main reason for this was the inability of our techniques to identify periodicities interspersed with insertions and deletions. We designed a new latent periodicity search algorithm, which is capable of taking into account insertions and deletions. As a result, we identified many novel cases of latent periodicity peculiar to protein families. Possible origins of the periodic structure of these families are discussed. In summary, we presume that latent periodicity is present in a substantial portion of known protein families. The latent periodicity matrices and the results of Swiss-Prot scans are available from http://bioinf.narod.ru/del/.
3.
Silverman BD 《Journal of biomolecular structure & dynamics》2005,22(4):411-423
Hydropathy plots or window averages over local stretches of the sequence of residue hydrophobicity have revealed patterns related to various protein tertiary structural features. This has enabled identification of regions of the sequence that are at the surface or within the interior of globular soluble proteins, regions located within the lipid bilayer of transmembrane proteins, portions of the sequence that characterize repeating motifs, as well as motifs that usefully characterize different protein structural families. This, therefore, provides one example of the generally expressed maxim that "sequence determines structure". On the other hand, a number of previous investigations have shown the rapidly varying values of residue hydrophobicity along the sequence to be distributed approximately randomly. So one might question just how much of the sequence actually determines structure. It is, therefore, of interest to extract that part of this rapidly varying distribution of residue hydrophobicity that is responsible for the longer wavelength variations that correlate with protein tertiary structural features and to determine their prevalence within the entire distribution. This is accomplished by a finite Fourier analysis of the sequence of residue hydrophobicity and of a new measure of residue distance from the protein interior. Calculations are performed on a number of globins, immunoglobulins, cuprodoxins, and papain-like structures. The spectral power of the Fourier amplitudes of the frequencies extracted, whose inverse transforms underlie the windowed values of residue hydrophobicity is shown to be a small fraction of the total power of the hydrophobicity distribution and thereby consistent with a distribution that might appear to be predominantly random. 
The wide range of sequence identity between proteins having the same fold, all exhibiting similar small fractions of power amplitude that correlate with the longer wavelength inside-to-outside excursions of the amino acid residues, supports the general contention that close sequence identity is an expression of a close evolutionary relationship rather than an expression of structural similarity. Practical implications of the present analysis for protein structure prediction and engineering are also described.
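The Fourier decomposition described in this abstract can be illustrated with a small sketch. It is not the author's code: the Kyte-Doolittle hydropathy scale, the function names, and the `min_wavelength` cutoff are all assumptions chosen for illustration, and the abstract does not say which scale or cutoff was used. The sketch maps a sequence to a hydropathy profile, computes a one-sided power spectrum of the mean-centered profile, and reports the share of power carried by wavelengths at or above the cutoff.

```python
import cmath
import math

# Kyte-Doolittle hydropathy values (an assumed scale; the abstract names none).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence):
    """Map a protein sequence to per-residue hydropathy values."""
    return [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]


def power_spectrum(values):
    """Return (wavelength, power) pairs for the mean-centered signal.

    Only frequencies k = 1 .. n//2 are computed; each is weighted by 2 to
    account for its mirror frequency, except the Nyquist term for even n.
    """
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            v * cmath.exp(-2j * math.pi * k * t / n)
            for t, v in enumerate(centered)
        )
        weight = 1 if 2 * k == n else 2
        spectrum.append((n / k, weight * abs(coeff) ** 2))
    return spectrum


def long_wavelength_fraction(values, min_wavelength):
    """Fraction of total spectral power at wavelengths >= min_wavelength."""
    spectrum = power_spectrum(values)
    total = sum(power for _, power in spectrum)
    if total == 0:
        return 0.0
    long_part = sum(power for wl, power in spectrum if wl >= min_wavelength)
    return long_part / total
```

For a real protein one would call `long_wavelength_fraction(hydropathy_profile(seq), min_wavelength)`; a small value would be in line with the abstract's observation that the long-wavelength component is a minor share of the total power.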
4.
5.
6.
7.
8.
9.
10.
Prediction of DNA-binding residues from sequence (cited by: 2; self-citations: 0; citations by others: 2)
MOTIVATION: Thousands of proteins are known to bind to DNA; for most of them the mechanism of action and the residues that bind to DNA, i.e. the binding sites, are yet unknown. Experimental identification of binding sites requires expensive and laborious methods such as mutagenesis and binding assays. Hence, such studies are not applicable on a large scale. If the 3D structure of a protein is known, it is often possible to predict DNA-binding sites in silico. However, for most proteins, such knowledge is not available. RESULTS: It has been shown that DNA-binding residues have distinct biophysical characteristics. Here we demonstrate that these characteristics are so distinct that they enable accurate prediction of the residues that bind DNA directly from amino acid sequence, without requiring any additional experimental or structural information. In a cross-validation based on the largest non-redundant dataset of high-resolution protein-DNA complexes available today, we found that 89% of our predictions are confirmed by experimental data. Thus, it is now possible to identify DNA-binding sites on a proteomic scale even in the absence of any experimental data or 3D-structural information. AVAILABILITY: http://cubic.bioc.columbia.edu/services/disis.
11.
12.
13.
Prediction of protein coding regions by the 3-base periodicity analysis of a DNA sequence (cited by: 2; self-citations: 0; citations by others: 2)
With the exponential growth of genomic sequences, there is an increasing demand to accurately identify protein coding regions (exons) from genomic sequences. Despite considerable progress in the computational identification of protein coding regions during the last two decades, the performance and efficiency of the prediction methods still need to be improved. In addition, it is indispensable to develop different prediction methods, since combining different methods may greatly improve the prediction accuracy. A new method to predict protein coding regions is developed in this paper based on the fact that most exon sequences have a 3-base periodicity, while intron sequences do not have this feature. The method computes the 3-base periodicity and the background noise of the stepwise DNA segments of the target DNA sequences using nucleotide distributions in the three codon positions of the DNA sequences. Exon and intron sequences can be identified from trends of the ratio of the 3-base periodicity to the background noise in the DNA sequences. Case studies on genes from different organisms show that this method is an effective approach for exon prediction.
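A minimal sketch of a 3-base periodicity measure follows. It is not the paper's formula: it uses the standard Fourier power at frequency 1/3, which can be written purely in terms of nucleotide counts at the three codon positions, and it stands in for the paper's background-noise term with a simple division by window length. The function names, the `window` and `step` parameters, and that normalization are assumptions made for illustration.

```python
def codon_position_counts(seq):
    """Count each nucleotide at positions 0, 1, 2 modulo 3."""
    counts = {base: [0, 0, 0] for base in "ACGT"}
    for i, base in enumerate(seq.upper()):
        if base in counts:  # ambiguous symbols such as N are skipped
            counts[base][i % 3] += 1
    return counts


def period3_power(seq):
    """Fourier power at frequency 1/3, summed over the four nucleotides.

    For position counts (f0, f1, f2), |f0 + f1*w + f2*w^2|^2 with
    w = exp(-2*pi*i/3) equals f0^2 + f1^2 + f2^2 - f0*f1 - f1*f2 - f0*f2.
    """
    total = 0
    for f0, f1, f2 in codon_position_counts(seq).values():
        total += f0 * f0 + f1 * f1 + f2 * f2 - f0 * f1 - f1 * f2 - f0 * f2
    return total


def sliding_period3(seq, window=90, step=3):
    """Length-normalized period-3 power for stepwise windows.

    Dividing by the window length is an assumed, crude stand-in for the
    background-noise term; the abstract does not give the exact definition.
    """
    return [
        (start, period3_power(seq[start:start + window]) / window)
        for start in range(0, len(seq) - window + 1, step)
    ]
```

Plotting the second element of each pair from `sliding_period3` against the window start would give a trend line of the kind the abstract describes, with higher values expected over coding regions.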
14.
15.
F. Shira Neuman-Silberberg Eyal Schejter F. Michael Hoffmann Ben-Zion Shilo 《Cell》1984,37(3):1027-1033
Three Drosophila genes homologous to the Ha-ras probe were isolated and mapped to positions 85D, 64B, and 62B on chromosome 3. Two of these genes (termed Dras1 and Dras2) were sequenced. In the case of Dras1, which contains multiple introns, a cDNA clone was isolated and sequenced. In the case of Dras2, the nucleotide sequence of the genomic clone was determined. Each gene codes for a protein with a predicted molecular weight of 21.6 kd. Alignment of the amino acid sequence of Dras1 with the vertebrate Ha-ras protein shows that at the amino terminus and central portion (residues 1–121 and 137–164) the two proteins are remarkably similar, and have an overall homology of 75%. The Dras2 gene lacks significant homology to the vertebrate counterpart at the extreme amino terminus and is homologous only between positions 28–120 and 139–161 (overall homology of 50%). This result suggests that the N terminus of p21 forms a distinct regulatory or functional domain. At the carboxy terminus, the major region of variability among the vertebrate ras proteins, the two Drosophila sequences also display considerable variability. However, both appear to be more similar to exon 4B of the Ki-ras gene.
16.
Original spectral-statistical methods were developed to recognize a new type of latent periodicity in DNA, called latent profile periodicity, or latent profility. Searching for latent profility allows the detection of different levels of information coding in genes and local DNA segments.
17.
18.
19.
The DNA-binding protein of Pf1 filamentous bacteriophage: amino-acid sequence and structure of the gene (cited by: 9; self-citations: 0; citations by others: 9)
Maeda K Kneale GG Tsugita A Short NJ Perham RN Hill DF Petersen GB 《The EMBO journal》1982,1(2):255-261
The amino-acid sequence of the single-stranded DNA-binding protein of bacteriophage Pf1 and the nucleotide sequence of the corresponding gene have been determined. The protein has 144 amino acids and a molecular weight of 15 400; the gene consists of 435 nucleotides. The amino-acid sequence was determined by Edman degradation, carboxypeptidase A, B, and P digestion of intact protein and of peptides derived by chymotrypsin, Staphylococcus aureus V8 protease, and trypsin digestion. The nucleotide sequence was determined by the dideoxy method after random cloning of fragments of Pf1 DNA into M13. No sequence homology could be established between the amino-acid sequence of the DNA-binding protein of Pseudomonas aeruginosa-specific bacteriophage Pf1 and bacteriophage fd of Escherichia coli.
20.
H Gomi T Hozumi S Hattori C Tagawa F Kishimoto L Björck 《Journal of immunology (Baltimore, Md. : 1950)》1990,144(10):4046-4052
The gene for protein H, a novel bacterial cell wall protein with specific affinity for human IgG Fc, was cloned from a group A Streptococcus and expressed in Escherichia coli. Recombinant E. coli cells produced two forms of a human IgG Fc-binding protein, one with an apparent Mr of 42 kDa in a periplasmic fraction and the other with an apparent Mr of 45 kDa in a mixed fraction of cytoplasms and membranes. Both 42-kDa and 45-kDa protein preparations similarly bound to human IgG1 to IgG4, human IgG Fc, and rabbit IgG, but not to IgG of mouse, rat, bovine, sheep, goat, and human IgA, IgD, IgE, and IgM. The complete nucleotide sequence of the cloned 1.8-kb DNA fragment was determined. An open reading frame encoded a hypothetical protein of 376 amino acid residues (Mr = 42,498). The N-terminal amino acid sequence, consisting of 41 residues, which was removed post-translationally had typical characteristics of Gram-positive bacterial signal peptides. Thus, the mature form of protein H was suggested to consist of 335 residues (Mr = 38,162). There were 3 repeated sequences consisting of 42 residues that were highly homologous to those of protein Arp, an IgA-binding streptococcal cell wall protein, and streptococcal M6 and M24 proteins. The C-terminal amino acid sequence consisting of 93 residues, directly following the repeated sequences, was also highly homologous to that of M6 and M24 proteins. No sequence homology was found between protein H and protein A or protein G, two other IgG-binding bacterial cell wall proteins.