1.
Synonymous codon usage is a commonly used means for estimating gene expression levels of Escherichia coli genes and has also been used for predicting highly expressed genes for a number of prokaryotic genomes. By comparing expression level-dependent features in codon usage with protein abundance data from two proteome studies of exponentially growing E. coli and Bacillus subtilis cells, we try to evaluate whether the implicit assumption of this approach can be confirmed with experimental data. Log-odds ratio scores are used to model differences in codon usage between highly expressed genes and the genomic average. Using these, the strength and significance of expression level-dependent features in codon usage were determined for the genes of the Escherichia coli, Bacillus subtilis and Haemophilus influenzae genomes. The comparison of codon usage features with protein abundance data confirmed that a relationship between them exists, although exceptions, possibly related to functional context, were found. For species with expression level-dependent features in their codon usage, the applied methodology could be used to improve in silico simulations of the outcome of two-dimensional gel electrophoretic experiments.
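The log-odds scoring described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the pseudocount, and the use of overall codon frequencies (rather than frequencies within each synonymous family) are assumptions made here for brevity.

```python
import math
from collections import Counter


def codon_counts(seqs):
    """Count in-frame codons over a list of coding sequences.

    Trailing bases that do not fill a complete codon are ignored.
    """
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(0, len(seq) - len(seq) % 3, 3):
            counts[seq[i:i + 3]] += 1
    return counts


def log_odds_table(high_seqs, genome_seqs, pseudocount=1.0):
    """Return {codon: ln(f_high / f_genome)} with additive smoothing.

    The pseudocount is an assumption of this sketch; it keeps the ratio
    finite for codons missing from one of the two sets.
    """
    high = codon_counts(high_seqs)
    background = codon_counts(genome_seqs)
    codons = set(high) | set(background)
    n_high = sum(high.values()) + pseudocount * len(codons)
    n_bg = sum(background.values()) + pseudocount * len(codons)
    return {
        c: math.log(((high[c] + pseudocount) / n_high)
                    / ((background[c] + pseudocount) / n_bg))
        for c in codons
    }


def gene_score(seq, table):
    """Mean log-odds per codon; codons absent from the table score 0."""
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    if not codons:
        return 0.0
    return sum(table.get(c, 0.0) for c in codons) / len(codons)
```

A gene whose codons resemble the highly expressed reference set receives a positive score, and one that resembles the genomic average scores near zero.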
2.
Synonymous codon usage and cellular tRNA abundance are thought to have co-evolved to optimize translational efficiency in highly expressed genes. Here, taking advantage of publicly available gene expression data for rice and Arabidopsis, we demonstrate that tRNA gene copy number is not the only driving force favoring translational selection in all highly expressed genes of rice. We found that the forces favoring translational selection differ between GC-rich and GC-poor classes of genes. In support of these results, we also show that in highly expressed genes of the GC-poor class there is a perfect correspondence between the majority of preferred codons and tRNA gene copy number, which confers translational efficiency on this group of genes. However, tRNA gene copy number is not fully consistent with models of translational selection in the GC-rich group of genes, where constraints on mRNA secondary structure play a role in optimizing codon usage in highly expressed genes.
3.
4.
Rapid protein domain assignment from amino acid sequence using predicted secondary structure
Marsden RL McGuffin LJ Jones DT 《Protein science : a publication of the Protein Society》2002,11(12):2814-2824
The elucidation of the domain content of a given protein sequence in the absence of determined structure or significant sequence homology to known domains is an important problem in structural biology. Here we address how successfully the delineation of continuous domains can be accomplished in the absence of sequence homology using simple baseline methods, an existing prediction algorithm (Domain Guess by Size), and a newly developed method (DomSSEA). The study was undertaken with a view to measuring the usefulness of these prediction methods in terms of their application to fully automatic domain assignment. Thus, the sensitivity of each domain assignment method was measured by calculating the number of correctly assigned top scoring predictions. We have implemented a new continuous domain identification method using the alignment of predicted secondary structures of target sequences against observed secondary structures of chains with known domain boundaries as assigned by Class Architecture Topology Homology (CATH). Taking top predictions only, the success rate of the method in correctly assigning domain number to the representative chain set is 73.3%. The top prediction for domain number and location of domain boundaries was correct for 24% of the multidomain set (±20 residues). These results have been put into context in relation to the results obtained from the other prediction methods assessed.
5.
Synonymous codon usage analysis between thermophilic and mesophilic prokaryotes has gained wide attention in recent years. Although it is known that thermophilic and mesophilic prokaryotes use different subsets of synonymous codons, no reason for this difference is known so far. In the present communication, by analyzing a large number of thermophilic and mesophilic prokaryotes, we provide evidence that bias in the selection of synonymous codons between thermophilic and mesophilic prokaryotes is related to differential folding patterns of mRNA secondary structures. Moreover, we observe that the error-minimizing property has a significant influence in differentiating the synonymous codon usage between thermophilic and mesophilic prokaryotes. Biological implications of these results are discussed.
6.
About 200 mRNA sequences of Escherichia coli and human with matching protein secondary structure data were studied. The mRNA folding for each native sequence and for corresponding randomized sequences was calculated through free energy minimization. We have found that the folding energy of mRNA segments differs significantly between protein secondary structures. The average Z score is more negative for regular secondary structure (alpha-helix and beta-strand) than for coil. This suggests that the codon choice in native mRNA sequence coding for protein regular structure contributes more to the mRNA folding stability.
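The Z score mentioned in this abstract compares the folding energy of a native mRNA segment with the distribution of energies from its randomized counterparts. A minimal sketch follows, assuming the folding energies have already been computed by an external free-energy minimization program (not shown); the function name and the use of the sample standard deviation are choices made here, not details given in the abstract.

```python
import statistics


def folding_z_score(native_energy, randomized_energies):
    """Z = (E_native - mean(E_random)) / sd(E_random).

    Energies are folding free energies (e.g. in kcal/mol). A more
    negative Z means the native sequence folds more stably than its
    randomized counterparts.
    """
    if len(randomized_energies) < 2:
        raise ValueError("need at least two randomized energies")
    mean = statistics.mean(randomized_energies)
    sd = statistics.stdev(randomized_energies)  # sample standard deviation
    if sd == 0:
        raise ValueError("randomized energies have zero variance")
    return (native_energy - mean) / sd
```

Averaging such Z scores separately over segments that encode helix, strand, and coil gives the kind of comparison the abstract reports.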
7.
Jen Tsi Yang 《Journal of Protein Chemistry》1996,15(2):185-191
The conformational parameters $P_k$ for each amino acid species ($j = 1$–$20$) of sequential peptides in proteins are presented as the product of $P_{i,k}$, where $i$ is the number of the sequential residues in the $k$th conformational state ($k$ = α-helix, β-sheet, β-turn, or unordered structure). Since the average parameter for an $n$-residue segment is related to the average probability of finding the segment in the $k$th state, it becomes a geometric mean, $(P_k)_{av} = \left(\prod_{i=1}^{n} P_{i,k}\right)^{1/n}$, with amino acid residue $i$ increasing from 1 to $n$. We then used $\ln(P_k)_{av}$ to convert a multiplicative process to a summation, i.e., $\ln(P_k)_{av} = (1/n)\sum_{i=1}^{n} \ln P_{i,k}$, for ease of operation. However, this is unlike the popular Chou-Fasman algorithm, which has the flaw of using the arithmetic mean for relative probabilities. The Chou-Fasman algorithm happens to be close to our calculations in many cases mainly because the difference between their $P_k$ and our $\ln P_k$ is nearly constant for about one-half of the 20 amino acids. When stronger conformation formers and breakers exist, the difference becomes larger and the prediction at the N- and C-terminal α-helix or β-sheet could differ. If the average conformational parameters of the overlapping segments of any two states are too close for a unique solution, our calculations could lead to a different prediction.
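The contrast between the geometric mean used in this abstract and the arithmetic mean attributed to the Chou-Fasman algorithm can be sketched as follows. The function names are hypothetical, and the parameter values in the usage note are made up for illustration; they are not published conformational parameters.

```python
import math


def mean_log_param(params):
    """ln(P_k)_av = (1/n) * sum(ln P_i,k) over the n residues of a segment."""
    if not params or any(p <= 0 for p in params):
        raise ValueError("parameters must be a non-empty list of positive values")
    return sum(math.log(p) for p in params) / len(params)


def geometric_mean_param(params):
    """(P_k)_av as the geometric mean, computed through the log sum."""
    return math.exp(mean_log_param(params))


def arithmetic_mean_param(params):
    """Arithmetic mean of the same parameters, for comparison."""
    if not params:
        raise ValueError("parameters must be non-empty")
    return sum(params) / len(params)
```

For a two-residue segment with illustrative parameters 2.0 (a strong former) and 0.5 (a strong breaker), the geometric mean is 1.0 while the arithmetic mean is 1.25, which shows how the two averages diverge when strong formers and breakers are mixed.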
8.
Filamentous fungi are widely exploited in the food industry because of their ability to secrete large amounts of enzymes and metabolites. The recent availability of fungal genome sequences has provided an opportunity to explore the genomic characteristics of these food-related filamentous fungi. In this paper, we selected 12 representative filamentous fungi in the areas of food processing and safety (Aspergillus clavatus, A. flavus, A. fumigatus, A. nidulans, A. niger, A. oryzae, A. terreus, Monascus ruber, Neurospora crassa, Penicillium chrysogenum, Rhizopus oryzae and Trichoderma reesei) and carried out comparative studies of their tRNA gene distribution, codon usage pattern and amino acid composition. The results showed that copy numbers differed greatly among isoaccepting tRNA genes and that the distribution seemed to be related to the translation process. The results also revealed that genome compositional variation probably constrained the base choice at the third codon position and affected the overall amino acid composition, but seemed to have little effect on the integrated physicochemical characteristics of the overall amino acids. Further analysis suggested that wobble pairing and base modification were the important mechanisms in codon-anticodon interaction. To the authors' knowledge, this is the first report analysing the genomic characteristics of food-related filamentous fungi, and it should be informative for the analysis of filamentous fungal genome evolution and their practical application in the food industry.
9.
Retrospective analysis of a secondary structure prediction: the catalytic domain of matrix metalloproteinases.
E. E. Hodgkin I. C. Gillman R. J. Gilbert 《Protein science : a publication of the Protein Society》1994,3(6):984-986
Secondary structure prediction of the catalytic domain of matrix metalloproteinases is evaluated in the light of recently published experimentally determined structures. The prediction was made by combining conformational propensity, surface probability, and residue conservation calculated for an alignment of 19 sequences. The position of each observed secondary structure element was correctly predicted with a high degree of accuracy, with a single beta-strand falsely predicted. The domain fold was also anticipated from the prediction by analogy with the structural elements found in the distantly related metalloproteinases thermolysin, astacin, and adamalysin.
10.
Malkov SN Zivković MV Beljanski MV Hall MB Zarić SD 《Journal of molecular modeling》2008,14(8):769-775
The correlation between the primary and secondary structures of proteins was analysed using a large data set from the Protein Data Bank. Clear preferences of amino acids towards certain secondary structures classify amino acids into four groups: α-helix preferrers, strand preferrers, turn and bend preferrers, and His and Cys (the latter two amino acids show no clear preference for any secondary structure). Amino acids in the same group have similar structural characteristics at their Cβ and Cγ atoms that predict their preference for a particular secondary structure. All α-helix preferrers have neither polar heteroatoms on Cβ and Cγ atoms, nor branching or aromatic groups on the Cβ atom. All strand preferrers have aromatic groups or branching groups on the Cβ atom. All turn and bend preferrers have a polar heteroatom on the Cβ or Cγ atoms or do not have a Cβ atom at all. These new rules could be helpful in making predictions about non-natural amino acids.
11.
High-quality data about protein structures and their gene sequences are essential to understanding the relationship between protein folding and protein coding sequences. We first constructed the EcoPDB database, a high-quality database of Escherichia coli genes and their corresponding PDB structures. Based on EcoPDB, we present a novel information-theoretic approach to investigate three correlations in the E. coli genome: between cysteine synonymous codon usage and the local amino acids flanking cysteines, between cysteine synonymous codon usage and the synonymous codon usage of those flanking amino acids, and between cysteine synonymous codon usage and the disulfide bonding states of cysteines. The results indicate that the nearest neighboring residues on the C-terminal side, and their synonymous codons, have the greatest influence on the usage of the synonymous codons of cysteines, and that the usage of the synonymous codons has a specific correlation with the disulfide bond formation of cysteines in proteins. The correlations may result from the regulation of protein structure at the gene sequence level and reflect the functional constraint that cysteines pair to form disulfide bonds. The results may also be helpful in identifying residues that are important for synonymous codon selection of cysteines when introducing disulfide bridges in protein engineering and molecular biology. The approach presented in this paper can also serve as a complementary computational method for analysing synonymous codon usage in other model organisms.
12.
Viruses differ markedly in their specificity toward host organisms. Here, we test the level of general sequence adaptation that viruses display toward their hosts. We compiled a representative data set of viruses that infect hosts ranging from bacteria to humans. We consider their respective amino acid and codon usages and compare them among the viruses and their hosts. We show that bacteria‐infecting viruses are strongly adapted to their specific hosts, but that they differ from other unrelated bacterial hosts. Viruses that infect humans, but not those that infect other mammals or aves, show a strong resemblance to most mammalian and avian hosts, in terms of both amino acid and codon preferences. In groups of viruses that infect humans or other mammals, the highest observed level of adaptation of viral proteins to host codon usages is for those proteins that appear abundantly in the virion. In contrast, proteins that are known to participate in host‐specific recognition do not necessarily adapt to their respective hosts. The implication for the potential of viral infectivity is discussed.
13.
A protein short motif search tool using amino acid sequence and their secondary structure assignment
We present the development of a web server, a protein short motif search tool that allows users to simultaneously search for a protein sequence motif and its secondary structure assignments. The web server can run very short motif searches against PDB structural data from the RCSB Protein Data Bank, with users defining the secondary structure type of each amino acid in the sequence motif. The output uses 3D visualisation to highlight the position of the motif in the structure and on the corresponding sequence. Researchers can easily observe the locations and conformations of multiple motifs among the results. Protein short motif search also has an application programming interface (API) for interfacing with other bioinformatics tools. AVAILABILITY: The database is available for free at http://birg3.fbb.utm.my/proteinsms.
14.
Lomize AL Reibarkh MY Pogozheva ID 《Protein science : a publication of the Protein Society》2002,11(8):1984-2000
Van der Waals (vdW) interaction energies between different atom types, energies of hydrogen bonds (H-bonds), and atomic solvation parameters (ASPs) have been derived from the published thermodynamic stabilities of 106 mutants with available crystal structures by use of an originally designed model for the calculation of free-energy differences. The set of mutants included substitutions of uncharged, inflexible, water-inaccessible residues in alpha-helices and beta-sheets of T4, human, and hen lysozymes and HI ribonuclease. The determined energies of vdW interactions and H-bonds were smaller than in molecular mechanics and followed the "like dissolves like" rule, as expected in condensed media but not in vacuum. The depths of modified Lennard-Jones potentials were -0.34, -0.12, and -0.06 kcal/mole for similar atom types (polar-polar, aromatic-aromatic, and aliphatic-aliphatic interactions, respectively) and -0.10, -0.08, -0.06, -0.02, and nearly 0 kcal/mole for different types (sulfur-polar, sulfur-aromatic, sulfur-aliphatic, aliphatic-aromatic, and carbon-polar, respectively), whereas the depths of H-bond potentials were -1.5 to -1.8 kcal/mole. The obtained solvation parameters, that is, transfer energies from water to the protein interior, were 19, 7, -1, -21, and -66 cal/(mole·Å²) for aliphatic carbon, aromatic carbon, sulfur, nitrogen, and oxygen, respectively, which is close to the cyclohexane scale for aliphatic and aromatic groups but intermediate between octanol and cyclohexane for others. An analysis of additional replacements at the water-protein interface indicates that vdW interactions between protein atoms are reduced when they occur across water.
15.
Estimation of secondary structure in polypeptides is important for studying their structure, folding and dynamics. In NMR spectroscopy, such information is generally obtained after sequence specific resonance assignments are completed. We present here a new methodology for assignment of secondary structure type to spin systems in proteins directly from NMR spectra, without prior knowledge of resonance assignments. The methodology, named Combination of Shifts for Secondary Structure Identification in Proteins (CSSI-PRO), involves detection of specific linear combinations of backbone ¹Hα and ¹³C′ chemical shifts in a two-dimensional (2D) NMR experiment based on G-matrix Fourier transform (GFT) NMR spectroscopy. Such linear combinations of shifts facilitate editing of residues belonging to α-helical/β-strand regions into distinct spectral regions nearly independent of the amino acid type, thereby allowing the estimation of overall secondary structure content of the protein. Comparison of the predicted secondary structure content with those estimated based on their respective 3D structures and/or the method of Chemical Shift Index for 237 proteins gives a correlation of more than 90% and an overall rmsd of 7.0%, which is comparable to other biophysical techniques used for structural characterization of proteins. Taken together, this methodology has a wide range of applications in NMR spectroscopy such as rapid protein structure determination, monitoring conformational changes in protein-folding/ligand-binding studies and automated resonance assignment.
16.
We have modified and improved the GOR algorithm for protein secondary structure prediction by using the evolutionary information provided by multiple sequence alignments, adding triplet statistics, and optimizing various parameters. We have expanded the database used to include the 513 non-redundant domains collected recently by Cuff and Barton (Proteins 1999;34:508-519; Proteins 2000;40:502-511). We have introduced a variable size window that allowed us to include sequences as short as 20-30 residues. A significant improvement over the previous versions of the GOR algorithm was obtained by combining the PSI-BLAST multiple sequence alignments with the GOR method. The new algorithm will form the basis for the future GOR V release on an online prediction server. The average accuracy of the prediction of secondary structure with multiple sequence alignment and full jack-knife procedure was 73.5%. The accuracy of the prediction increases to 74.2% by limiting the prediction to 375 (of 513) sequences having at least 50 PSI-BLAST alignments. The average accuracy of the prediction of the new improved program without using multiple sequence alignments was 67.5%. This is approximately a 3% improvement over the preceding GOR IV algorithm (Garnier J, Gibrat JF, Robson B. Methods Enzymol 1996;266:540-553; Kloczkowski A, Ting K-L, Jernigan RL, Garnier J. Polymer 2002;43:441-449). We have discussed alternatives to the segment overlap (Sov) coefficient proposed by Zemla et al. (Proteins 1999;34:220-223).
17.
The conditional probability, P(sigma/x), is a statement of the probability that the value of sigma will be found given the prior information that a value of x has been observed. Here sigma represents any one of the secondary structure types, alpha, beta, tau, and rho for helix, sheet, turn, and random, respectively, and x represents a sequence attribute, including, but not limited to: (1) hydropathy; (2) hydrophobic moments assuming helix and sheet; (3) Richardson and Richardson helical N-cap and C-cap values; (4) Chou-Fasman conformational parameters for helix, P alpha, for sheet, P beta, and for turn, P tau; and (5) Garnier, Osguthorpe, and Robson (GOR) information values for helix, I alpha, for sheet, I beta, for turn, I tau, and for random structure, I rho. Plots of P(sigma/x) vs. x are demonstrated to provide information about the correlation between structure and attribute, sigma and x. The separations between different P(sigma/x) vs. x curves indicate the capacity of a given attribute to discriminate between different secondary structural types and permit comparison of different attributes. P(alpha/x), P(beta/x), P(tau/x) and P(rho/x) vs. x plots show that the most useful attributes for discriminating helix are, in order: hydrophobic moment assuming helix greater than P alpha much greater than N-cap greater than C-cap approximately I alpha approximately I tau. The information value for turns, I tau, was found to discriminate helix better than turns. Discrimination for sheet was found to be in the following order: I beta much greater than P beta approximately hydropathy greater than I rho approximately hydrophobic moment assuming sheet. Three attributes, at their low values, were found to give significant discrimination for the absence of helix: I alpha approximately P alpha approximately hydrophobic moment assuming helix. Also, three other attributes were found to indicate the absence of sheet: P beta much greater than I rho approximately hydropathy. 
Indications of the absence of sigma could be as useful for some applications as the indication of the presence of sigma.
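The conditional probability P(sigma/x) described in this abstract can be estimated from labelled residues by binning the attribute x and counting structure types within each bin. The sketch below is only an illustration under that assumption: the abstract does not say how the curves were computed, and the function name, the fixed-width binning, and the input format are all choices made here.

```python
import math
from collections import Counter, defaultdict


def conditional_probabilities(observations, bin_width):
    """Estimate P(sigma | x) on fixed-width bins of x.

    observations: iterable of (x, sigma) pairs, where x is a numeric
    sequence attribute (e.g. hydropathy) and sigma is a structure label.
    Returns {bin_index: {sigma: probability}}, where bin_index is
    floor(x / bin_width).
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    counts = defaultdict(Counter)
    for x, sigma in observations:
        counts[math.floor(x / bin_width)][sigma] += 1
    table = {}
    for bin_index, counter in counts.items():
        total = sum(counter.values())
        table[bin_index] = {s: n / total for s, n in counter.items()}
    return table
```

Plotting the per-bin probabilities for each structure type against the bin centres gives curves of the kind the abstract compares; bins where a type's probability falls close to zero correspond to the "absence of sigma" indications it mentions.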
18.
Conformational analysis of peptides corresponding to all the secondary structure elements of protein L B1 domain: secondary structure propensities are not conserved in proteins with the same fold.
M. Ramírez-Alvarado L. Serrano F. J. Blanco 《Protein science : a publication of the Protein Society》1997,6(1):162-174
The solution conformations of three peptides corresponding to the two beta-hairpins and the alpha-helix of the protein L B1 domain have been analyzed by circular dichroism (CD) and nuclear magnetic resonance spectroscopy (NMR). In aqueous solution, the three peptides show low populations of native and non-native locally folded structures, but no well-defined hairpin or helix structures are formed. In 30% aqueous trifluoroethanol (TFE), the peptide corresponding to the alpha-helix adopts a highly populated helical conformation three residues longer than in the protein. The hairpin peptides aggregate in TFE, and no significant conformational change occurs in the NMR observable fraction of molecules. These results indicate that the helical peptide has a significant intrinsic tendency to adopt its native structure and that the hairpin sequences seem to be selected as non-helical. This suggests that these sequences favor the structure finally attained in the protein, but the contribution of the local interactions alone is not enough to drive the formation of a detectable population of native secondary structures. This pattern of secondary structure tendencies is different from those observed in two structurally related proteins: ubiquitin and the protein G B1 domain. The only common feature is a certain propensity of the helical segments to form the native structure. These results indicate that for a protein to fold, there is no need for large native-like secondary structure propensities, although a minimum tendency to avoid non-native structures and to favor native ones could be required.
19.
Fuzzy cluster analysis of simple physicochemical properties of amino acids for recognizing secondary structure in proteins.
G. Mocz 《Protein science : a publication of the Protein Society》1995,4(6):1178-1187
Fuzzy cluster analysis has been applied to the 20 amino acids by using 65 physicochemical properties as a basis for classification. The clustering products, the fuzzy sets (i.e., classical sets with associated membership functions), have provided a new measure of amino acid similarities for use in protein folding studies. This work demonstrates that fuzzy sets of simple molecular attributes, when assigned to amino acid residues in a protein's sequence, can predict the secondary structure of the sequence with reasonable accuracy. An approach is presented for discriminating standard folding states, using near-optimum information splitting in half-overlapping segments of the sequence of assigned membership functions. The method is applied to a nonredundant set of 252 proteins and yields approximately 73% matching for correctly predicted and correctly rejected residues with approximately 60% overall success rate for the correctly recognized ones in three folding states: alpha-helix, beta-strand, and coil. The most useful attributes for discriminating these states appear to be related to size, polarity, and thermodynamic factors. Van der Waals volume, apparent average thickness of surrounding molecular free volume, and a measure of dimensionless surface electron density can explain approximately 95% of prediction results. Hydrogen bonding and hydrophobicity indices do not yet enable clear clustering and prediction.
20.
A series of Ala vs. Gly mutations at different helical and nonhelical positions of the chemotactic protein CheY, from E. coli, has been made. We have used this information to fit a general analytical equation that describes the free energy changes of an Ala to Gly mutation within ±0.45 kcal mol⁻¹ with 95% confidence. The equation includes three terms: (1) the change in solvent-accessible hydrophobic surface area, corrected for the possible closure of the cavity left by deleting the Cβ of the Ala; (2) the change in hydrophilic area of the nonintramolecularly hydrogen-bonded groups; and (3) the dihedral angles of the position being mutated. This last term extends the calculation to any conformation, not only α-helices. The general applicability of the equation for Ala vs. Gly mutations, when Ala or a small solvent-exposed polar residue is the wild-type residue, has been tested using data from other proteins: barnase, CI2 trypsin inhibitor, T4 lysozyme, and Staphylococcus nuclease. The predictive power of this simple approach offers the possibility of extending it to more complex mutations. © 1995 Wiley-Liss, Inc.