Found 20 similar articles; search took 15 ms
1.
Different nonsynonymous changes may be under different selective pressure during evolution. Of the 190 possible interchanges among the 20 amino acids, only 75 can be attained by a single-base substitution. An evolutionary index (EI) can be empirically computed for each of the 75 elementary changes as the likelihood of substitutions, relative to that of synonymous changes. We used 280, 1,306, 2,488, and 309 orthologous genes from primates (human versus Old World monkey), rodents (mouse versus rat), yeast (S. cerevisiae versus S. paradoxus), and Drosophila (D. melanogaster versus D. simulans), respectively, to estimate the EIs. In each data set, EI varies more than 10-fold, and the correlation coefficients of EIs from the pairwise comparisons are high (e.g., r = 0.91 between rodent and yeast). The high correlations suggest that the amino acid properties are strong determinants of protein evolution, irrespective of the identities of the proteins or the taxa of interest. However, these properties are not well captured in conventional measures of amino acid exchangeability. We, therefore, propose a universal index of exchange (U): for any large data set, its EI can be expressed as U*R, where R is the average Ka/Ks for that data set. The codon-based, empirically determined EI (i.e., U*R) makes much better predictions on protein evolution than do previous methods.
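The relation EI = U*R stated in this abstract can be illustrated with a minimal Python sketch. The function name `evolutionary_index`, the two amino acid pairs, and every numeric value below are illustrative assumptions, not figures from the paper; only the multiplication itself comes from the abstract.

```python
def evolutionary_index(u, r):
    """Scale universal exchange indices U by a data set's average Ka/Ks (R).

    u: dict mapping an elementary amino acid change to its U value
       (hypothetical values, for illustration only)
    r: average Ka/Ks of the data set
    """
    return {change: value * r for change, value in u.items()}


# Hypothetical U values for two single-base amino acid interchanges.
u_values = {"Ser<->Thr": 2.0, "Asp<->Tyr": 0.5}

# Hypothetical average Ka/Ks for some large data set.
ei = evolutionary_index(u_values, 0.25)
print(ei)
```

Under this reading, U carries the taxon-independent part of exchangeability and R rescales it to the overall selective constraint of a given data set.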
2.
The amino acid sequences of some fiber proteins possibly have a periodic structure. This periodicity can be analyzed using the Fourier transform of the mathematical image of the symbol sequence of amino acid residues in proteins. One of several possible methods of Fourier transform has been chosen as optimal for the given study. This optimal Fourier transform has been used to analyze the periodic structures in several fiber proteins of bacteriophage T4. Amino acids from some groups form sequences of alternating elements with a relatively small period (T=15); those from other groups form sequences with other small periods (T=10 and T=8). A new finding is the presence of relatively large periods of amino acid arrangement, which divide the entire amino acid sequence of the protein into four or six equal parts. The data on protein structural periodicity make it possible to align the amino acid sequences according to the periodic structures of both types. The results obtained agree with the results of previous crystallographic and electron microscopic studies. Translated from Molekulyarnaya Biologiya, Vol. 39, No. 2, 2005, pp. 321–329. Original Russian Text Copyright © 2005 by Simakova, Simakov.
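As a rough illustration of Fourier analysis of a symbol sequence, the sketch below maps a sequence to a 0/1 indicator for one residue group and evaluates the spectral power at a chosen period. This is a generic discrete Fourier sum, not the particular transform the authors selected as optimal; the function name `periodicity_power` and the toy sequence are assumptions made for the example.

```python
import cmath


def periodicity_power(sequence, group, period):
    """Spectral power of a residue-group indicator at a given period.

    sequence: amino acid string (one-letter codes)
    group:    set of residues mapped to 1; all others map to 0
    period:   period T (in residues) at which to evaluate the Fourier sum
    """
    x = [1.0 if aa in group else 0.0 for aa in sequence]
    mean = sum(x) / len(x)
    # Subtract the mean so that composition alone does not produce power.
    total = sum(
        (value - mean) * cmath.exp(-2j * cmath.pi * k / period)
        for k, value in enumerate(x)
    )
    return abs(total) ** 2 / len(x)


# Toy sequence (hypothetical): one glycine every 10 residues.
toy = ("G" + "A" * 9) * 6
print(periodicity_power(toy, {"G"}, 10))  # strong signal at the true period
print(periodicity_power(toy, {"G"}, 7))   # much weaker at an unrelated period
```

Scanning `period` over a range of values and looking for peaks is one simple way to search for candidate periods such as the T=8, 10, and 15 mentioned above.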
3.
Summary We examine in this paper one of the expected consequences of the hypothesis that modern proteins evolved from random heteropeptide
sequences. Specifically, we investigate the lengthwise distributions of amino acids in a set of 1,789 protein sequences with
little sequence identity using the run test statistic (r_o) of Mood (1940, Ann. Math. Stat. 11, 367–392). The probability density of r_o
for a collection of random sequences has mean = 0 and variance = 1 [the N(0,1) distribution] and can be used to measure the tendency
of amino acids of a given type to cluster together in a sequence relative to that of a random sequence. We implement the run
test using binary representations of protein sequences in which the amino acids of interest are assigned a value of 1 and
all others a value of 0. We consider individual amino acids and sets of various combinations of them based upon hydrophobicity
(4 sets), charge (3 sets), volume (4 sets), and secondary structure propensity (3 sets). We find that any sequence chosen
randomly has a 90% or greater chance of having a lengthwise distribution of amino acids that is indistinguishable from the
random expectation regardless of amino acid type. We regard this as strong support for the random-origin hypothesis. However,
we do observe significant deviations from the random expectation as might be expected after billions of years of evolution. Two
important global trends are found: (1) Amino acids with a strong α-helix propensity show a strong tendency to cluster whereas
those with β-sheet or reverse-turn propensity do not. (2) Clustered rather than evenly distributed patterns tend to be preferred
by the individual amino acids and this is particularly so for methionine. Finally, we consider the problem of reconciling
the random nature of protein sequences with structurally meaningful periodic “patterns” that can be detected by sliding-window,
autocorrelation, and Fourier analyses. Two examples, rhodopsin and bacteriorhodopsin, show that such patterns are a natural
feature of random sequences.
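The binary encoding and run test described in this abstract can be sketched as follows. The sketch uses the standard Wald–Wolfowitz normal approximation for the number of runs; whether this matches the exact form and sign convention of Mood's r_o statistic is an assumption, and the function name `runs_z` and the toy sequences are hypothetical.

```python
import math


def runs_z(sequence, residues):
    """Standardized runs statistic for a binary-encoded protein sequence.

    Residues in `residues` are coded 1, all others 0. Under randomness the
    statistic is approximately N(0,1); with this sign convention, negative
    values mean fewer runs than expected, i.e. clustering.
    """
    bits = [1 if aa in residues else 0 for aa in sequence]
    n1 = sum(bits)
    n0 = len(bits) - n1
    n = n1 + n0
    if n1 == 0 or n0 == 0:
        raise ValueError("both classes must occur in the sequence")
    # A new run starts at every position where the symbol changes.
    runs = 1 + sum(1 for a, b in zip(bits, bits[1:]) if a != b)
    mean = 2.0 * n1 * n0 / n + 1.0
    var = 2.0 * n1 * n0 * (2.0 * n1 * n0 - n) / (n * n * (n - 1.0))
    if var <= 0:
        raise ValueError("sequence too short for the normal approximation")
    return (runs - mean) / math.sqrt(var)


print(runs_z("AAAAALLLLL", {"A"}))  # clustered: negative
print(runs_z("ALALALALAL", {"A"}))  # evenly alternating: positive
```

Passing a set of several residues (for example a hydrophobic group) instead of a single residue gives the grouped analyses mentioned in the abstract.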
4.
We develop an approximate maximum likelihood method to estimate flanking nucleotide context-dependent mutation rates and amino acid exchange-dependent selection in orthologous protein-coding sequences and use it to analyze genome-wide coding sequence alignments from mammals and yeast. Allowing context-dependent mutation provides a better fit to coding sequence data than simpler (context-independent or CpG "hotspot") models and significantly affects selection parameter estimates. Allowing asymmetric (nonreciprocal) selection on amino acid exchanges gives a better fit than simple dN/dS or symmetric selection models. Relative selection strength estimates from our models show good agreement with independent estimates derived from human disease-causing and engineered mutations. Selection strengths depend on local protein structure, showing expected biophysical trends in helical versus nonhelical regions and increased asymmetry on polar-hydrophobic exchanges with increased burial. The more stringent selection that has previously been observed for highly expressed proteins is primarily concentrated in buried regions, supporting the notion that such proteins are under stronger than average selection for stability. Our analyses indicate that a highly parameterized model of mutation and selection is computationally tractable and is a useful tool for exploring a variety of biological questions concerning protein and coding sequence evolution.
5.
On reduced amino acid alphabets for phylogenetic inference (total citations: 1; self-citations: 0; citations by others: 1)
We investigate the use of Markov models of evolution for reduced amino acid alphabets or bins of amino acids. The use of reduced amino acid alphabets can ameliorate effects of model misspecification and saturation. We present algorithms for 2 different ways of automating the construction of bins: minimizing criteria based on properties of rate matrices and minimizing criteria based on properties of alignments. By simulation, we show that in the absence of model misspecification, the loss of information due to binning is found to be insubstantial, and the use of Markov models at the binned level is found to be almost as effective as the more appropriate missing data approach. By applying these approaches to real data sets where compositional heterogeneity and/or saturation appear to be causing biased tree estimation, we find that binning can improve topological estimation in practice.
6.
The new form of L-arginine D-glutamate is monoclinic, P21, with a = 9.941(1), b = 4.668(2), c = 17.307(1) Å, β = 95.27(1)°, and Z = 2. In terms of composition, the new form differs from the old form in that the former is a monohydrate whereas the latter is a trihydrate. The structure has been solved by the direct methods and refined to R = 0.085 for 1012 observed reflections. The conformation of the arginine molecule is the same in both the forms whereas that of the glutamate ion is different. The change in the conformation of the glutamate ion is such that it facilitates extensive pseudosymmetry in the crystals. The molecules arrange themselves in double-layers stabilised by head-to-tail sequences involving main chains, in both the forms. However, considerable differences exist between the two forms in the interface, consisting of side chains and water molecules, between double-layers. A comparative study of the relationship between the crystal structures of L and DL amino acids on the one hand and that between the structures of LL and LD amino acid-amino acid complexes on the other, provides interesting insights into amino acid aggregation and the effect of chirality on it. The crystal structures of most hydrophobic amino acids are made up of double-layers and those of most hydrophilic amino acids contain single layers, irrespective of the chiralities of the amino acids involved. In most cases, the molecules tend to appropriately rearrange themselves to preserve the broad features of aggregation patterns when the chirality of half the molecules is reversed as in the structures of DL amino acids. The basic elements of aggregation in the LL and the LD complexes, are similar to those found in the crystals of L and DL amino acids. However, the differences between the LL and the LD complexes in the distribution of these elements are more pronounced than those between the distributions in the structures of L and DL amino acids.
7.
Willi Schmidt 《Journal of molecular evolution》1995,41(4):522-530
This paper presents an essentially new method used to construct phylogenetic trees from related amino acid sequences. The method is based on a new distance measure which describes sequence relationships by means of typical steric and physicochemical properties of the amino acids and is advantageous in some essential points. The method was applied to different sets of protein sequences and the results were compared with other well-established methods.
8.
Reinhard Lohmann Gisbert Schneider Dirk Behrens Paul Wrede 《Protein science : a publication of the Protein Society》1994,3(9):1597-1601
The architecture and weights of an artificial neural network model that predicts putative transmembrane sequences have been developed and optimized by the algorithm of structure evolution. The resulting filter is able to classify membrane/nonmembrane transition regions in sequences of integral human membrane proteins with high accuracy. Similar results have been obtained for both training and test set data, indicating that the network has focused on general features of transmembrane sequences rather than specializing on the training data. Seven physicochemical amino acid properties have been used for sequence encoding. The predictions are compared to hydrophobicity plots.
9.
C. Adán A. Ardévol X. Remesar M. Alemany J. A. Fernández-López 《Molecular and cellular biochemistry》1994,130(2):149-157
The changes in hind leg tissue (muscle and skin) amino acid pool size and arteriovenous balance were measured in rats subjected to 0–90 min of cold exposure (4°C). Tissue free amino acid pools presented a different composition pattern from protein amino acids. Muscle rapidly reacted to cold exposure by releasing small amounts of some amino acids (alanine, aspartate), with only small changes in pool size during the first 30 min. Amino acid oxidation was very limited during the whole period of cold exposure, since at all times tested there was either nil ammonia efflux or net absorption of ammonia and glutamine; i.e. the muscle was in positive nitrogen balance throughout the period studied. Thus most of the amino acid nitrogen taken up from the blood and not found in the free amino pools must have been incorporated into protein, since it was not oxidized, as shown by the glutamine and ammonia balance. The data on amino acid incorporation into proteins indicate that hind leg protein turnover is rapidly and widely modulated from a low initial setting upon cold exposure to a higher protein synthesis rate immediately afterwards, suggesting that protein turnover may have an important role in short-term events in cold-exposed muscle, in addition to its influence in long-term adaptation.
10.
11.
A simple and fast approach to prediction of protein secondary structure from multiply aligned sequences with accuracy above 70%.
P. K. Mehta J. Heringa P. Argos 《Protein science : a publication of the Protein Society》1995,4(12):2517-2525
To improve secondary structure predictions in protein sequences, the information residing in multiple sequence alignments of substituted but structurally related proteins is exploited. A database comprised of 70 protein families and a total of 2,500 sequences, some of which were aligned by tertiary structural superpositions, was used to calculate residue exchange weight matrices within alpha-helical, beta-strand, and coil substructures, respectively. Secondary structure predictions were made based on the observed residue substitutions in local regions of the multiple alignments and the largest possible associated exchange weights in each of the three matrix types. Comparison of the observed and predicted secondary structure on a per-residue basis yielded a mean accuracy of 72.2%. Individual alpha-helix, beta-strand, and coil states were respectively predicted at 66.7, and 75.8% correctness, representing a well-balanced three-state prediction. The accuracy level, verified by cross-validation through jack-knife tests on all protein families, dropped, on average, to only 70.9%, indicating the rigor of the prediction procedure. On the basis of robustness, conceptual clarity, accuracy, and executable efficiency, the method has considerable advantage, especially with its sole reliance on amino acid substitutions within structurally related proteins.
12.
A statistical approach was applied to select those models that best fit each individual mitochondrial (mt) protein at different taxonomic levels of metazoans. The existing mitochondrial replacement matrices, MtREV and MtMam, were found to be the best-fit models for the mt-proteins of vertebrates, with the exception of Nd6, at different taxonomic levels. Remarkably, existing mitochondrial matrices generally failed to best-fit invertebrate mt-proteins. In an attempt to better model the evolution of invertebrate mt-proteins, a new replacement matrix, named MtArt, was constructed based on arthropod mt-proteomes. The new model was found to best fit almost all analyzed invertebrate mt-protein data sets. The observed pattern of model fit across the different data sets indicates that no single replacement matrix is able to describe the general evolutionary properties of mt-proteins but rather that taxonomical biases and/or the existence of different mt-genetic codes have great influence on which model is selected.
13.
G. F. Rohrmann M. N. Pearson T. J. Bailey R. R. Becker G. S. Beaudreau 《Journal of molecular evolution》1981,17(6):329-333
Summary A phylogenetic tree for occluded baculoviruses was constructed based on the N-terminal amino acid sequence of occlusion body proteins from six baculoviruses including three lepidopteran nuclear polyhedrosis viruses (NPVs), [two unicapsid (Bombyx mori and Orgyia pseudotsugata) and one multicapsid (Orgyia pseudotsugata)]; one granulosis virus (Pieris brassicae); and NPVs from a hymenopteran (Neodiprion sertifer) and a dipteran (Tipula paludosa). Amino acid sequence data for the B. mori NPV were from a report by Serebryani et al. (1977) and that for the O. pseudotsugata NPVs were reported previously by us (Rohrmann et al. 1979). The other N-terminal amino acid sequences are presented in this paper. The phylogenetic relationships determined based on the molecular evolution of polyhedrin were also investigated by antigenic comparisons of the proteins using a solid phase radioimmune assay. The results indicate that the lepidopteran NPVs are the most closely related of the above group of viruses and are related to these viruses in the following order: N. sertifer NPV, P. brassicae granulosis virus, and T. paludosa NPV. These data, in conjunction with Baculovirus distribution and evidence concerning insect phylogeny, suggest that the Baculovirus have an ancient association with insects and may have evolved along with them.
14.
A key question associated with topology predictions for membrane proteins is whether there is sufficient variation in the biophysical properties of residues at the membrane interface to enable identification of TM spans in a robust and efficient manner using relatively simple methods of analysis. Here, a test for the homogeneity of multinomial populations is used to identify statistical differences between the residue compositions of windows within datasets of aligned non-homologous TM α-helices. Using this approach, the accuracy and robustness of the predicted boundaries for datasets of uncleaved signal (US) sequences and stop transfer sequences (ST) is tested. The validity of the 21 residue length, which is generally assumed for TM spans in membrane protein topology prediction is also investigated and it is suggested that ST sequences may be better represented by a length of 22 residues.
15.
Characterization of disease-associated single amino acid polymorphisms in terms of sequence and structure properties. (total citations: 7; self-citations: 0; citations by others: 7)
Carles Ferrer-Costa Modesto Orozco Xavier de la Cruz 《Journal of molecular biology》2002,315(4):771-786
In the present work, we use structural information to characterize a set of disease-associated single amino acid polymorphisms exhaustively. The analysis of different properties, such as substitution matrix elements, secondary structure, accessibility, free energies of transfer from water to octanol, amino acid volume, etc., suggests that many disease-causing mutations are associated with extreme changes in the value of parameters relating to protein stability. Overall, our results indicate that, while knowledge of protein structure clearly helps in understanding these mutations, a finer understanding can come only from a quantitative knowledge of protein stability and of the protein environment in the cell. Interestingly, evolutionary information from multiple sequence alignments can be used to increase our knowledge of disease-associated mutations.
16.
The use of amino acid sequence analysis in assessing evolution (total citations: 1; self-citations: 0; citations by others: 1)
The thirteen year history of assessing evolution by amino acid sequence analysis has made apparent the limitations imposed upon this system by the finite nature of the characters. This finiteness exists on several levels and ultimately expresses itself as parallelism, back mutation and the retention of primitive characters in the sequences of proteins from present day species and the putative ancestral protein chains. Sequence analysis shares these problems with other molecular approaches, but because it is concerned both with the nucleotide substitutions in the genome and with the functional roles of proteins, it has unique advantages. For example, the large fluctuation in the rate of fixation of mutations in a protein's evolution can be detected and used to point out the unreliability of any molecular clock for estimating divergence dates. Moreover, when consideration is given to studies which assign functional significance to specific amino acid sites in a protein, changes in function during the descent of a protein can be appreciated and their significance correlated with organismal evolution.
17.
Constructing amino acid residue substitution classes maximally indicative of local protein structure
Using an information theoretic formalism, we optimize classes of amino acid substitution to be maximally indicative of local protein structure. Our statistically-derived classes are loosely identifiable with the heuristic constructions found in previously published work. However, while these other methods provide a more rigid idealization of physicochemically constrained residue substitution, our classes provide substantially more structural information with many fewer parameters. Moreover, these substitution classes are consistent with the paradigmatic view of the sequence-to-structure relationship in globular proteins which holds that the three-dimensional architecture is predominantly determined by the arrangement of hydrophobic and polar side chains with weak constraints on the actual amino acid identities. More specific constraints are imposed on the placement of prolines, glycines, and the charged residues. These substitution classes have been used in highly accurate predictions of residue solvent accessibility. They could also be used in the identification of homologous proteins, the construction and refinement of multiple sequence alignments, and as a means of condensing and codifying the information in multiple sequence alignments for secondary structure prediction and tertiary fold recognition. © 1996 Wiley-Liss, Inc.
18.
Angel Montoya Maria J. Gómez-Lechón Jose V. Castell 《In vitro cellular & developmental biology. Plant》1989,25(4):358-364
Summary Supplementation of Ham's F12 culture medium with essential amino acids (EAA) up to the rat plasma levels increased the rates of synthesis of albumin and transferrin by cultured rat hepatocytes by 1.3 and 1.7, respectively. Fifty percent of this increase could be attributed to three of the EAA: the branched-chain amino acids (BCAA: Leu Ile and Val). Non-branched-chain essential amino acids (non-BC-EAA) stimulated only 25% of the increase produced by the whole EAA mixture. When each EAA was tested individually, none of them caused an appreciable increase in albumin and transferrin in culture medium. When the concentrations of all EAA were raised to rat postprandial portal levels, albumin and transferrin synthesis rates reached a maximum, increasing by 3.2 and 3.5, respectively. Supplementation with BCAA at postprandial portal concentrations increased albumin and transferrin synthesis rates by 2.2 and 2.0, respectively, and had no noteworthy effect on the synthesis of cellular proteins. Non-BC-EAA at their postprandial portal concentrations increased albumin and transferrin synthesis rates by 1.7 and 1.9, respectively. Supplementation with alanine to reach a nitrogen content equal to that of the modified EAA-enriched medium had no stimulatory effect. Our results show that EAA have a specific effect on the synthesis of plasma proteins by cultured hepatocytes, and that BCAA at physiologic concentrations account for the major part of this stimulatory effect. Consequently, EAA and particularly BCAA concentration should be elevated in serum-free nutrient media to sustain maximum plasma protein synthesis.
19.
Alternative splicing has been discovered in nearly all metazoan organisms as a mechanism to increase the diversity of gene products. However, the origin and evolution of alternatively spliced genes are still poorly understood. To understand the mechanisms for the evolution of alternatively spliced genes, it may be important to study the differences between alternatively and non-alternatively spliced genes. The aim of this research was to compare amino acid usage and protein length distribution between alternatively and non-alternatively spliced genes across six nearly complete eukaryotic genomes, including those of human (Homo sapiens), mouse (Mus musculus), rat (Rattus norvegicus), fruit fly (Drosophila melanogaster), Caenorhabditis elegans, and bovine (Bos taurus). Our results have suggested the following: (1) Across the six species, alternatively and non-alternatively spliced genes have very similar tendencies in amino acid usage, not only overall but also among highly expressed genes, with all of the highly expressed genes having preferred amino acids including A, E, G, K, L, P, S, V, R, T, and D. (2) Both overall and among highly expressed genes, the average length of the protein products of alternatively spliced genes is significantly greater than that of non-alternatively spliced ones. In contrast, distributions of protein lengths for the two groups of genes are very similar among all six species. Based on these results, we propose that alternatively spliced genes may have originated from non-alternatively spliced ones through events such as DNA mutations or gene fusion.
20.
Izydor Apostol Anthony Giletto Tomoko Komiyama WenLei Zhang Michael Laskowski Jr. 《Journal of Protein Chemistry》1993,12(4):419-433
Ovomucoids consist of a single polypeptide chain which is composed of three tandem Kazal domains. Each Kazal domain is an actual or putative protein inhibitor of serine proteinases. Ovomucoid third domains were already isolated and sequenced from 126 species of birds (Laskowski et al., 1987, 1990). This paper adds 27 new species. A number of generalizations are made on the basis of sequences from 153 species. The residues that are in contact with the enzyme in enzyme-inhibitor complexes are strikingly hypervariable. While the primary specificity residue, P1, is the most variable, substitutions occur predominantly among aliphatic, hydrophobic residues. Consensus sequences for an avian ovomucoid third domain, for a b-type Kazal domain (i.e., a COOH terminal domain of multidomain inhibitors) and for a general Kazal domain are given. Finally, the individual new sequences are briefly discussed.