Similar literature
20 similar articles found (search time: 484 ms)
1.
Theory of molecular machines. I. Channel capacity of molecular machines   (Total citations: 4; self-citations: 0; citations by others: 4)
Like macroscopic machines, molecular-sized machines are limited by their material components, their design, and their use of power. One of these limits is the maximum number of states that a machine can choose from. The logarithm to the base 2 of the number of states is defined to be the number of bits of information that the machine could "gain" during its operation. The maximum possible information gain is a function of the energy that a molecular machine dissipates into the surrounding medium (P_y), the thermal noise energy which disturbs the machine (N_y), and the number of independently moving parts involved in the operation (d_space): C_y = d_space log2[(P_y + N_y)/N_y] bits per operation. This "machine capacity" is closely related to Shannon's channel capacity for communications systems. An important theorem that Shannon proved for communication channels also applies to molecular machines. With regard to molecular machines, the theorem states that if the amount of information which a machine gains is less than or equal to C_y, then the error rate (frequency of failure) can be made arbitrarily small by using a sufficiently complex coding of the molecular machine's operation. Thus, the capacity of a molecular machine is sharply limited by the dissipation and the thermal noise, but the machine failure rate can be reduced to whatever low level may be required for the organism to survive.
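The capacity formula quoted in the abstract above can be evaluated directly. The following is a minimal sketch: the function name `machine_capacity` and the example inputs are illustrative assumptions, not values taken from the paper.

```python
import math


def machine_capacity(d_space: int, p_y: float, n_y: float) -> float:
    """Return C_y = d_space * log2((P_y + N_y) / N_y) in bits per operation.

    d_space: number of independently moving parts
    p_y:     energy dissipated per operation
    n_y:     thermal noise energy (same units as p_y)
    """
    if d_space < 0 or p_y < 0 or n_y <= 0:
        raise ValueError("d_space and p_y must be >= 0, and n_y must be > 0")
    return d_space * math.log2((p_y + n_y) / n_y)


# Hypothetical inputs: one moving part, dissipation three times the noise energy.
# The ratio (P_y + N_y) / N_y is then 4.
print(machine_capacity(1, 3.0, 1.0))
```

With zero dissipation the ratio is 1 and the capacity is zero, which matches the abstract's point that capacity is limited by dissipation relative to thermal noise.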

2.
In this paper, we propose a communication model of evolution and investigate its information-theoretic bounds. The process of evolution is modeled as the retransmission of information over a protein communication channel, where the transmitted message is the organism's proteome encoded in the DNA. We compute the capacity and the rate distortion functions of the protein communication system for the three domains of life: Archaea, Bacteria, and Eukaryotes. The tradeoff between the transmission rate and the distortion in noisy protein communication channels is analyzed. As expected, comparison between the optimal transmission rate and the channel capacity indicates that the biological fidelity does not reach the Shannon optimal distortion. However, the relationship between the channel capacity and rate distortion achieved for different biological domains provides tremendous insight into the dynamics of the evolutionary processes of the three domains of life. We rely on these results to provide a model of genome sequence evolution based on the two major evolutionary driving forces: mutations and unequal crossovers.

3.
Shannon’s seminal approach to estimating information capacity is widely used to quantify information processing by biological systems. However, the Shannon information theory, which is based on power spectrum estimation, necessarily contains two sources of error: time delay bias error and random error. These errors are particularly important for systems with relatively large time delay values and for responses of limited duration, as is often the case in experimental work. The window function type and size chosen, as well as the values of inherent delays, cause changes in both the delay bias and random errors, with a possibly strong effect on the estimates of system properties. Here, we investigated the properties of these errors using white-noise simulations and analysis of experimental photoreceptor responses to naturalistic and white-noise light contrasts. Photoreceptors from several insect species were used, each characterized by different visual performance, behavior, and ecology. We show that the effect of random error on the spectral estimates of photoreceptor performance (gain, coherence, signal-to-noise ratio, Shannon information rate) is opposite to that of the time delay bias error: the former overestimates information rate, while the latter underestimates it. We propose a new algorithm for reducing the impact of time delay bias error and random error, based on finding and then using the window size at which these errors are equal in magnitude and opposite in sign, so that they cancel each other and allow minimally biased measurement of neural coding.

4.
Predicting protein folding rate from amino acid sequence is an important challenge in computational and molecular biology. Over the past few years, many methods have been developed to reflect the correlation between the folding rates and protein structures and sequences. In this paper, we present an effective method, a combined neural network-genetic algorithm approach, to predict protein folding rates only from amino acid sequences, without any explicit structural information. The originality of this paper is that, for the first time, it tackles the effect of sequence order. The proposed method provides a good correlation between the predicted and experimental folding rates. The correlation coefficient is 0.80 and the standard error is 2.65 for 93 proteins, the largest such database of proteins yet studied, when evaluated with a leave-one-out jackknife test. The comparative results demonstrate that this correlation is better than that of most other methods, and suggest the important contribution of sequence order information to the determination of protein folding rates.

5.
Quality control operates at different steps in translation to limit errors to approximately one mistranslated codon per 10,000 codons during mRNA-directed protein synthesis. Recent studies have suggested that error rates may actually vary considerably during translation under different growth conditions. Here we examined the misincorporation of Phe at Tyr codons during synthesis of a recombinant antibody produced in tyrosine-limited Chinese hamster ovary (CHO) cells. Tyr to Phe replacements were previously found to occur throughout the antibody at a rate of up to 0.7% irrespective of the identity or context of the Tyr codon translated. Despite this comparatively high mistranslation rate, no significant change in cellular viability was observed. Monitoring of Phe and Tyr levels revealed that changes in error rates correlated with changes in amino acid pools, suggesting that mischarging of tRNA(Tyr) with noncognate Phe by tyrosyl-tRNA synthetase was responsible for mistranslation. Steady-state kinetic analyses of CHO cytoplasmic tyrosyl-tRNA synthetase revealed a 25-fold lower specificity for Tyr over Phe as compared with previously characterized bacterial enzymes, consistent with the observed increase in translation error rates during tyrosine limitation. Functional comparisons of mammalian and bacterial tyrosyl-tRNA synthetase revealed key differences at residues responsible for amino acid recognition, highlighting differences in evolutionary constraints for translation quality control.
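To put the per-codon rates in the abstract above into perspective, the sketch below estimates what fraction of full-length proteins would contain no mistranslated residue. It assumes independent errors at a uniform per-codon rate, which is a simplification not claimed by the abstract; the function names and the codon counts (400 codons, 20 Tyr codons) are hypothetical.

```python
def error_free_fraction(per_codon_error: float, n_codons: int) -> float:
    """Probability that all n_codons are translated correctly,
    assuming independent errors at a uniform per-codon rate."""
    if not 0.0 <= per_codon_error <= 1.0:
        raise ValueError("per_codon_error must lie in [0, 1]")
    if n_codons < 0:
        raise ValueError("n_codons must be non-negative")
    return (1.0 - per_codon_error) ** n_codons


def expected_errors(per_codon_error: float, n_codons: int) -> float:
    """Expected number of mistranslated codons per protein."""
    return per_codon_error * n_codons


# Baseline rate from the abstract (1 per 10,000 codons), hypothetical 400-codon protein.
print(f"{error_free_fraction(1e-4, 400):.3f}")

# Elevated Tyr->Phe rate from the abstract (0.7%), hypothetical 20 Tyr codons.
print(f"{error_free_fraction(7e-3, 20):.3f}")
```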

6.
Here I systematically examine the information complexity of all primary sequences of natural proteins deposited in the Swiss-Prot database. The sequence complexity is assessed by determining the frequency of occurrence of each amino acid type on sequence windows of fixed length, calculating the Shannon entropy of the window and then averaging over all windows covering the sequence. The minimum value in information content obtained from the present-day record imposes a lower limit in the number of letters that a primeval amino acid alphabet must have had.
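The windowed complexity measure described in the abstract above can be sketched as follows. The abstract does not state the window length or whether windows overlap, so the step of one residue between windows is an assumption, as are the function names.

```python
import math
from collections import Counter


def window_entropy(window: str) -> float:
    """Shannon entropy (bits) of the residue frequencies within one window."""
    counts = Counter(window)
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mean_window_entropy(sequence: str, window_len: int) -> float:
    """Average the window entropy over all overlapping windows (step of 1, assumed)."""
    if window_len <= 0 or window_len > len(sequence):
        raise ValueError("window_len must be between 1 and len(sequence)")
    n_windows = len(sequence) - window_len + 1
    total = sum(
        window_entropy(sequence[i:i + window_len]) for i in range(n_windows)
    )
    return total / n_windows


# Toy sequence: every window of length 2 holds two distinct residues.
print(mean_window_entropy("ACAC", 2))
```

A window of length L can hold at most min(L, 20) distinct residues, so the entropy of a single window is bounded by log2 of that number.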

7.
This paper places models of language evolution within the framework of information theory. We study how signals become associated with meaning. If there is a probability of mistaking signals for each other, then evolution leads to an error limit: increasing the number of signals does not increase the fitness of a language beyond a certain limit. This error limit can be overcome by word formation: a linear increase of the word length leads to an exponential increase of the maximum fitness. We develop a general model of word formation and demonstrate the connection between the error limit and Shannon's noisy coding theorem.

8.
The Shannon information entropy of protein sequences.   (Total citations: 6; self-citations: 1; citations by others: 5)
A comprehensive data base is analyzed to determine the Shannon information content of a protein sequence. This information entropy is estimated by three methods: a k-tuplet analysis, a generalized Zipf analysis, and a "Chou-Fasman gambler." The k-tuplet analysis is a "letter" analysis, based on conditional sequence probabilities. The generalized Zipf analysis demonstrates the statistical linguistic qualities of protein sequences and uses the "word" frequency to determine the Shannon entropy. The Zipf analysis and k-tuplet analysis give Shannon entropies of approximately 2.5 bits/amino acid. This entropy is much smaller than the value of 4.18 bits/amino acid obtained from the nonuniform composition of amino acids in proteins. The "Chou-Fasman" gambler is an algorithm based on the Chou-Fasman rules for protein structure. It uses both sequence and secondary structure information to guess at the number of possible amino acids that could appropriately substitute into a sequence. As in the case for the English language, the gambler algorithm gives significantly lower entropies than the k-tuplet analysis. Using these entropies, the number of most probable protein sequences can be calculated. The number of most probable protein sequences is much less than the number of possible sequences but is still much larger than the number of sequences thought to have existed throughout evolution. Implications of these results for mutagenesis experiments are discussed.
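One way to read the "letter" analysis based on conditional sequence probabilities is as the entropy of a residue given the k-1 residues before it. The sketch below is a plug-in estimate under that reading; it is an assumption about the general technique, not the paper's actual estimator, and the function name is hypothetical. With k = 1 it reduces to the composition entropy, whose upper bound for 20 equally frequent amino acids is log2(20), about 4.32 bits.

```python
import math
from collections import Counter
from typing import Iterable


def conditional_entropy(sequences: Iterable[str], k: int) -> float:
    """Plug-in estimate (bits) of H(residue | previous k-1 residues)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    tuple_counts: Counter = Counter()
    context_counts: Counter = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            k_tuple = seq[i:i + k]
            tuple_counts[k_tuple] += 1
            context_counts[k_tuple[:-1]] += 1  # the first k-1 letters
    total = sum(tuple_counts.values())
    if total == 0:
        raise ValueError("no k-tuples found; sequences are shorter than k")
    # Sum over tuples of p(tuple) * log2 p(last letter | context).
    return -sum(
        (count / total) * math.log2(count / context_counts[k_tuple[:-1]])
        for k_tuple, count in tuple_counts.items()
    )


toy = ["ACACACAC"]  # toy data, not protein sequences from the paper
print(conditional_entropy(toy, 1))       # composition entropy of the toy data
print(abs(conditional_entropy(toy, 2)))  # next letter is fully determined here
```

Plug-in estimates of this kind are biased downward when the number of observed k-tuples is small relative to 20^k, so large k needs a correspondingly large database.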

9.
T Palzkill, D Botstein. Proteins 1992, 14(1):29-44
A new analytical mutagenesis technique is described that involves randomizing the DNA sequence of a short stretch of a gene (3-6 codons) and determining the percentage of all possible random sequences that produce a functional protein. A low percentage of functional random sequences in a complete library of random substitutions indicates that the region mutagenized is important for the structure and/or function of the protein. Repeating the mutagenesis over many regions throughout a protein gives a global perspective of which amino acid sequences in a protein are critical. We applied this method to 66 codons of the gene encoding TEM-1 beta-lactamase in 19 separate experiments. We found that TEM-1 beta-lactamase is extremely tolerant of amino acid substitutions: on average, 44% of all mutants with random substitutions function and 20% of the substitutions are expressed, secreted, and fold well enough to function at levels similar to those for the wild-type enzyme. We also found a few exceptional regions where only a few random sequences function. Examination of the X-ray structures of homologous beta-lactamases indicates that the regions most sensitive to substitution are in the vicinity of the active site pocket or buried in the hydrophobic core of the protein. DNA sequence analysis of functional random sequences has been used to obtain more detailed information about the amino acid sequence requirements for several regions and this information has been compared to sequence conservation among several related beta-lactamases.

10.
Huang JT, Tian J. Proteins 2006, 63(3):551-554
The significant correlation between protein folding rates and the sequence-predicted secondary structure suggests that folding rates are largely determined by the amino acid sequence. Here, we present a method for predicting the folding rates of proteins from sequences using the intrinsic properties of amino acids, which does not require any information on secondary structure prediction or structural topology. The contribution of each residue to the folding rate is expressed by the residue's Omega value. For a given residue, its Omega depends on the amino acid properties (amino acid rigidity and dislike of the amino acid for secondary structures). Our investigation achieves 82% correlation with folding rates determined experimentally for the simple, two-state proteins studied to date, suggesting that the amino acid sequence of a protein is an important determinant of the protein-folding rate and mechanism.

11.
12.
MOTIVATION: Sequence alignment techniques have been developed into extremely powerful tools for identifying the folding families and function of proteins in newly sequenced genomes. For a sufficiently low sequence identity it is necessary to incorporate additional structural information to positively detect homologous proteins. We have carried out an extensive analysis of the effectiveness of incorporating secondary structure information directly into the alignments for fold recognition and identification of distant protein homologs. A secondary structure similarity matrix based on a database of three-dimensionally aligned proteins was first constructed. An iterative application of dynamic programming was used which incorporates linear combinations of amino acid and secondary structure sequence similarity scores. Initially, only primary sequence information is used. Subsequently contributions from secondary structure are phased in and new homologous proteins are positively identified if their scores are consistent with the predetermined error rate. RESULTS: We used the SCOP40 database, where only PDB sequences that have 40% homology or less are included, to calibrate homology detection by the combined amino acid and secondary structure sequence alignments. Combining predicted secondary structure with sequence information results in an 8-15% increase in homology detection within SCOP40 relative to the pairwise alignments using only amino acid sequence data at an error rate of 0.01 errors per query; a 35% increase is observed when the actual secondary structure sequences are used. Incorporating predicted secondary structure information in the analysis of six small genomes yields an improvement in the homology detection of approximately 20% over SSEARCH pairwise alignments, but no improvement in the total number of homologs detected over PSI-BLAST, at an error rate of 0.01 errors per query. However, because the pairwise alignments based on combinations of amino acid and secondary structure similarity are different from those produced by PSI-BLAST and the error rates can be calibrated, it is possible to combine the results of both searches. An additional 25% relative improvement in the number of genes identified at an error rate of 0.01 is observed when the data is pooled in this way. Similarly for the SCOP40 dataset, PSI-BLAST detected 15% of all possible homologs, whereas the pooled results increased the total number of homologs detected to 19%. These results are compared with recent reports of homology detection using sequence profiling methods. AVAILABILITY: Secondary structure alignment homepage at http://lutece.rutgers.edu/ssas CONTACT: anders@rutchem.rutgers.edu; ronlevy@lutece.rutgers.edu SUPPLEMENTARY INFORMATION: Genome sequence/structure alignment results at http://lutece.rutgers.edu/ss_fold_predictions.
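The idea of scoring an alignment with a linear combination of amino acid and secondary structure similarity can be illustrated with a small dynamic-programming sketch. Several simplifications are assumptions made here and are not the authors' method: the alignment is global (Needleman-Wunsch) with a linear gap penalty, both similarity terms are plain match/mismatch scores standing in for real substitution matrices, and the names `combined_alignment_score`, `ss_weight`, and `gap` are hypothetical.

```python
def combined_alignment_score(
    aa1: str, ss1: str, aa2: str, ss2: str,
    ss_weight: float = 0.5, gap: float = -2.0,
) -> float:
    """Global alignment score where each aligned pair is scored as
    (1 - ss_weight) * residue_score + ss_weight * secondary_structure_score."""
    if len(aa1) != len(ss1) or len(aa2) != len(ss2):
        raise ValueError("each sequence needs one secondary-structure label per residue")

    def pair_score(i: int, j: int) -> float:
        # Placeholder +1/-1 scores; a real application would use similarity matrices.
        aa = 1.0 if aa1[i] == aa2[j] else -1.0
        ss = 1.0 if ss1[i] == ss2[j] else -1.0
        return (1.0 - ss_weight) * aa + ss_weight * ss

    n, m = len(aa1), len(aa2)
    prev = [j * gap for j in range(m + 1)]  # row 0: leading gaps only
    for i in range(1, n + 1):
        curr = [i * gap] + [0.0] * m
        for j in range(1, m + 1):
            curr[j] = max(
                prev[j - 1] + pair_score(i - 1, j - 1),  # align residue i with j
                prev[j] + gap,                           # gap in sequence 2
                curr[j - 1] + gap,                       # gap in sequence 1
            )
        prev = curr
    return prev[m]


# ss_weight = 0 uses sequence only; raising it phases in secondary structure.
for w in (0.0, 0.5):
    print(w, combined_alignment_score("ACD", "HHH", "AKD", "HHH", ss_weight=w))
```

Rerunning the alignment while stepping `ss_weight` up from zero loosely mirrors the description of starting from primary sequence information and then phasing in secondary structure contributions.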

13.
P H O'Farrell. Cell 1978, 14(3):545-557
Amino acid starvation is shown to decrease the fidelity of translation in E. coli. When proteins are analyzed by two-dimensional gel electrophoresis, missense errors are detected as an unusual heterogeneity in their isoelectric points, while premature termination of protein synthesis can be recognized by a decreased relative rate of synthesis of higher molecular weight proteins and by the accumulation of a complex group of new small polypeptides. The types of translational errors observed are amino acid-specific. For example, starvation of a rel- strain for histidine produces severe isoelectric point heterogeneity with little evidence of premature termination, while starvation for leucine has little effect on the isoelectric points, but produces a drastic decrease in the average molecular weight of the newly synthesized protein. These differences suggest codon-specific errors in reading the genetic code. In these rel- cells, the effect of amino acid starvation on the rates of synthesis of complete individual proteins is both protein- and amino acid-specific. For example, ribosomal protein L7/12, which lacks histidine, is made at a higher level during histidine starvation than during isoleucine or leucine starvation. This suggests that in rel- cells, the modulation of gene expression caused by the lack of a particular amino acid is, at least in part, a function of the abundance of that amino acid in particular proteins; that is, the response of rel- cells to starvation is consistent with the theory that the inhibition of protein synthesis and the accompanying increase in error frequency both result from low levels of the correct substrate. In marked contrast, virtually no starvation-induced translational errors are detected in a rel+ strain, and the response is not amino acid-specific. Various data strongly imply that in this rel+ strain, essentially all the changes caused by starvation are due to the accumulation of ppGpp, which independently reduces protein synthesis, thereby suppressing all the direct effects of amino acid limitation seen in rel- strains (where ppGpp does not accumulate upon starvation). A model is presented which describes how ppGpp might suppress the direct effects of starvation and avoid the loss of translational fidelity. In addition, the direct and specific effects of ppGpp on gene expression are examined independently of amino acid starvation.

14.
Human DNA polymerase nu (pol nu) is one of three A family polymerases conserved in vertebrates. Although its biological functions are unknown, pol nu has been implicated in DNA repair and in translesion DNA synthesis (TLS). Pol nu lacks intrinsic exonucleolytic proofreading activity and discriminates poorly against misinsertion of dNTP opposite template thymine or guanine, implying that it should copy DNA with low base substitution fidelity. To test this prediction and to comprehensively examine pol nu DNA synthesis fidelity as a clue to its function, here we describe human pol nu error rates for all 12 single base-base mismatches and for insertion and deletion errors during synthesis to copy the lacZ alpha-complementation sequence in M13mp2 DNA. Pol nu copies this DNA with average single-base insertion and deletion error rates of 7 × 10^-5 and 17 × 10^-5, respectively. This accuracy is comparable to that of replicative polymerases in the B family, lower than that of its A family homolog, human pol gamma, and much higher than that of Y family TLS polymerases. In contrast, the average single-base substitution error rate of human pol nu is 3.5 × 10^-3, which is inaccurate compared to the replicative polymerases and comparable to Y family polymerases. Interestingly, the vast majority of errors made by pol nu reflect stable misincorporation of dTMP opposite template G, at average rates that are much higher than for homologous A family members. This pol nu error is especially prevalent in sequence contexts wherein the template G is preceded by a C-G or G-C base pair, where error rates can exceed 10%. Amino acid sequence alignments based on the structures of more accurate A family polymerases suggest substantial differences in the O-helix of pol nu that could contribute to this unique error signature.

15.
The accuracy of protein synthesis in reticulocyte and HeLa cell lysates   (Total citations: 1; self-citations: 0; citations by others: 1)
The accuracy of translation in protein synthesis is measured as the rate of misincorporation of a particular amino acid, different from that specified by an mRNA codon, into protein. The cowpea variant of tobacco mosaic virus, CcTMV, contains no cysteine or methionine in its coat protein. Translation in vitro of purified CcTMV coat protein mRNA by rabbit reticulocyte and HeLa cell lysates has been performed. The coat protein product was purified by immunoprecipitation with specific antisera, and separated by sodium dodecyl sulfate-polyacrylamide gel electrophoresis. The error rate was measured by comparing the incorporation of [35S]cysteine with incorporation of [3H]leucine, and the total CcTMV coat protein synthesized was calculated from its known leucine content. An error rate of (1-2) × 10^-3 cysteines/CcTMV coat protein was obtained with reticulocyte lysates. If errors were cysteine incorporation in place of arginine, this number is converted to 3 × 10^-4 cysteine/codon. If cysteine was incorporated anywhere in the polypeptide, the rate is 9 × 10^-6 cysteines/amino acid. The error frequencies with HeLa cell lysates were 6-fold higher. Paromomycin, a eukaryotic misreading antibiotic, increased error rates 10-fold in both lysates. These data compare well with in vivo measurements and suggest that some transformed cells may survive with higher mistranslation rates.
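The conversions in the abstract above, from a per-protein error rate to a per-codon or per-residue rate, amount to dividing by the number of positions at which the error could occur. The sketch below shows that arithmetic only. The site count of 5 and the midpoint rate of 1.5 × 10^-3 are hypothetical inputs chosen to land in the quoted order of magnitude; they are not figures reported in the abstract, and the function name is an assumption.

```python
def per_site_error_rate(errors_per_protein: float, n_candidate_sites: int) -> float:
    """Spread a per-protein misincorporation rate evenly over the candidate sites."""
    if n_candidate_sites <= 0:
        raise ValueError("n_candidate_sites must be positive")
    if errors_per_protein < 0:
        raise ValueError("errors_per_protein must be non-negative")
    return errors_per_protein / n_candidate_sites


# Hypothetical inputs: midpoint of the quoted (1-2) x 10^-3 range, 5 candidate codons.
rate = per_site_error_rate(1.5e-3, 5)
print(f"{rate:.1e} errors per candidate codon")
```

The fewer positions the error is assumed to be restricted to, the higher the implied per-site rate, which is why the per-codon figure for a single codon class is larger than the per-residue figure for the whole polypeptide.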

16.
17.
Estimates of missense error rates (misreading) during protein synthesis vary from 10^-3 to 10^-4 per codon. The experiments reporting these rates have measured several distinct errors using several methods and reporter systems. Variation in reported rates may reflect real differences in rates among the errors tested or in sensitivity of the reporter systems. To develop a more accurate understanding of the range of error rates, we developed a system to quantify the frequency of every possible misreading error at a defined codon in Escherichia coli. This system uses an essential lysine in the active site of firefly luciferase. Mutations in Lys529 result in up to a 1600-fold reduction in activity, but the phenotype varies with amino acid. We hypothesized that residual activity of some of the mutant genes might result from misreading of the mutant codons by tRNA(Lys) (UUUU), the cognate tRNA for the lysine codons, AAA and AAG. Our data validate this hypothesis and reveal details about relative missense error rates of near-cognate codons. The error rates in E. coli do, in fact, vary widely. One source of variation is the effect of competition by cognate tRNAs for the mutant codons; higher error frequencies result from lower competition from low-abundance tRNAs. We also used the system to study the effect of ribosomal protein mutations known to affect error rates and the effect of error-inducing antibiotics, finding that they affect misreading on only a subset of near-cognate codons and that their effect may be less general than previously thought.

18.
Whether errors in protein synthesis play a role in aging has been a subject of intense debate. It has been suggested that rare mistakes in protein synthesis in young organisms may result in errors in the protein synthesis machinery, eventually leading to an increasing cascade of errors as organisms age. Studies that followed generally failed to identify a dramatic increase in translation errors with aging. However, whether translation fidelity plays a role in aging remained an open question. To address this issue, we examined the relationship between translation fidelity and maximum lifespan across 17 rodent species with diverse lifespans. To measure translation fidelity, we utilized sensitive luciferase‐based reporter constructs with mutations in an amino acid residue critical to luciferase activity, wherein misincorporation of amino acids at this mutated codon re‐activated the luciferase. The frequency of amino acid misincorporation at the first and second codon positions showed strong negative correlation with maximum lifespan. This correlation remained significant after phylogenetic correction, indicating that translation fidelity coevolves with longevity. These results give new life to the role of protein synthesis errors in aging: Although the error rate may not significantly change with age, the basal rate of translation errors is important in defining lifespan across mammals.

19.
By generating classes of random structures for trypsin inhibitor and carp myogen, each consistent with a given set of experimental or theoretical information, we have assessed the relative utility of various experiments and theories in deducing the conformation of macromolecules. We compare the calculated structures with known x-ray coordinates and compute for each class an average error. Small errors mean that the experimental or theoretical constraints limit the structures to the vicinity of the crystal structure, whereas large errors show that the constraints permit a wide variety of tertiary conformations. We find the following points to hold true: (1) Qualitative information on all the distances, as might be obtained from the correct prediction of interresidue contacts, effectively determines the structure (error approximately 1 Å). (2) Quantitative information on a limited number of distances, as might be obtained from NMR or crosslinking experiments, significantly restricts the range of possible structures only when the number of distances given is comparable to the number of residues (error approximately 3 Å). (3) Quantitative information on the distances of each residue to the center of mass of the molecule, as might in part be obtained from solvent accessibility and solution x-ray studies, is not particularly restrictive by itself (error approximately 5 Å). (4) Complete qualitative local distance information, as might be obtained from secondary prediction and CD/ORD studies, is clearly consistent with a wide variety of tertiary structures (error approximately 7 Å).

20.
SUMMARY: COPS predicts for all 20 naturally occurring amino acids whether the peptide bond in a protein is in cis or trans conformation. The algorithm is based only on secondary structure information of amino acid triplets without considering the amino acid sequence information. Conformation parameters are derived from solved 3D structures deposited in the PDB and led to propensities based on modified Chou-Fasman parameters. COPS analyses amino acid triplets taking only their respective secondary structure into consideration and upon application of a set of rules utilizing the conformation parameters, the N-terminal peptide bond conformation of the middle residue is predicted. COPS was tested on a random selection of protein datasets. AVAILABILITY: The COPS program and further information are freely available from the FMP website at http://www.fmp-berlin.de/nmr/cops CONTACT: labudde@fmp-berlin.de.
