Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
Amino acid substitution models represent the substitution rates among amino acids during the evolution of protein sequences. The models are a prerequisite for maximum likelihood or Bayesian methods to analyse the phylogenetic relationships among species based on their protein sequences. Estimating amino acid substitution models requires large protein datasets and intensive computation. In this paper, we present the estimation of both a time-reversible model (Q.met) and a time non-reversible model (NQ.met) for multicellular animals (Metazoa). Analyses showed that the Q.met and NQ.met models were significantly better than existing models in analysing metazoan protein sequences. Moreover, the time non-reversible model NQ.met enables us to reconstruct the rooted phylogenetic tree for Metazoa. We recommend that researchers employ the Q.met and NQ.met models in analysing metazoan protein sequences.
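
The difference between a time-reversible and a time non-reversible model can be illustrated with the detailed-balance condition, pi_i * q_ij = pi_j * q_ji. The sketch below is only an illustration under stated assumptions: the function name `is_time_reversible` and the 3-state rate matrices are hypothetical toy values, not the Q.met or NQ.met matrices.

```python
def is_time_reversible(q, pi, tol=1e-9):
    """Return True if pi[i] * q[i][j] == pi[j] * q[j][i] for all i != j."""
    n = len(pi)
    return all(
        abs(pi[i] * q[i][j] - pi[j] * q[j][i]) <= tol
        for i in range(n)
        for j in range(i + 1, n)
    )


# Hypothetical 3-state example (toy numbers, not real amino acid rates).
pi = [0.5, 0.3, 0.2]

# A reversible matrix can be built as q[i][j] = s[i][j] * pi[j], s symmetric.
s = [[0.0, 1.0, 2.0], [1.0, 0.0, 0.5], [2.0, 0.5, 0.0]]
q_rev = [[s[i][j] * pi[j] for j in range(3)] for i in range(3)]
for i in range(3):
    # Diagonal entries make each row of a rate matrix sum to zero.
    q_rev[i][i] = -sum(q_rev[i][j] for j in range(3) if j != i)

# A cyclic process (0 -> 1 -> 2 -> 0) has a uniform stationary
# distribution but violates detailed balance.
q_cyc = [[-1.0, 1.0, 0.0], [0.0, -1.0, 1.0], [1.0, 0.0, -1.0]]
pi_uniform = [1 / 3, 1 / 3, 1 / 3]

print(is_time_reversible(q_rev, pi))          # reversible construction
print(is_time_reversible(q_cyc, pi_uniform))  # non-reversible construction
```

Because a non-reversible process distinguishes the direction of time, the likelihood depends on where the root is placed, which is why such a model can in principle be used to infer a rooted tree.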

2.
The main goal of the protein evolutionist is the reconstruction of past events leading to the structures of contemporary proteins. The common strategy is to align amino acid sequences and make inferences about matters of common ancestry. The rate of change of amino acid sequence varies greatly from protein to protein, and this naturally affects how far back a given protein's ancestry can be traced. Happily, the rate of change of many proteins is slow enough that very ancient events can be inferred. Many mainstream metabolic enzymes, for example, are 40-50% identical in prokaryotes and eukaryotes, groups that diverged from a common ancestor more than 1.5 billion years ago. Moreover, some eukaryotic proteins like actin and tubulin change so slowly that they are seldom less than 60% identical, no matter from what source they are drawn. As it happens, prokaryotic counterparts for many eukaryotic cytoskeletal proteins are unknown. A recent exception involves the finding that a heat shock protein cognate is a relative of actin. The gene duplication that gave rise to these two proteins must have been an ancient event. The more recent invention of other proteins whose distribution is restricted to one or the other of the major kingdoms may be easier to trace. Among the factors that can confound the reconstruction of events, however, are occasional horizontal gene transfers and exon shuffling. The latter has led to a number of mosaic proteins, many of which contain various combinations of a relatively small set of modules like the epidermal growth factor domain.

3.
Hemoglobin is an important protein found in the red cells of many animals. In humans, hemoglobin is mainly found in the red blood cell. Single amino acid substitution is the main pathogenesis of most hemoglobin disorders. Here, the author used a new gene ontology technology to predict the molecular function and biological process of four important hemoglobin disorders with single substitution. The four abnormal hemoglobins (Hb) studied were Hb S, Hb E, Hb C, and Hb J-Baltimore. Using the GoFigure server, the molecular function and biological process of normal and abnormal hemoglobins were predicted. Compared with normal hemoglobin, all studied abnormal hemoglobins had the same function and biological process. This indicated that the overall function of oxygen transportation is not disturbed in the studied hemoglobin disorders. Clinical findings of oxygen depletion in abnormal hemoglobin should therefore be due to the other processes rather than genomics, proteomics, and expression levels.

4.
Aligned amino acid sequences of three functionally independent samples of transmembrane (TM) transport proteins have been analyzed. The concept of TM-kernel is proposed as the most probable transmembrane region of a sequence. The average amino acid composition of TM-kernels differs from the published amino acid composition of transmembrane segments. TM-kernels contain more alanines and glycines, and fewer polar, charged, and aromatic residues, in contrast to non-TM-proteins. There are also differences between TM-kernels of bacterial and eukaryotic proteins. We have constructed amino acid substitution matrices for bacterial TM-kernels, named the BATMAS (BActerial Transmembrane MAtrix of Substitutions) series. In TM-kernels, polar and charged residues, as well as proline and tyrosine, are highly conserved, whereas there are more substitutions within the group of hydrophobic residues, in contrast to non-TM-proteins that have fewer, relatively more conserved, hydrophobic residues. These results demonstrate that alignment of transmembrane proteins should be based on at least two amino acid substitution matrices, one for loops (e.g., the BLOSUM series) and one for TM-segments (the BATMAS series), and the choice of the TM-matrix should be different for eukaryotic and bacterial proteins.

5.
Published data on mean rates of genetic divergence for a substantial number of protein molecules are used to examine the hypothetical effect of variations in these rates upon the expected relationship between evolutionary time and Nei's (1972, American Naturalist, 106: 283) genetic distance, D. Results indicate that at higher values D can be expected to deviate significantly from stochastic linearity with time. However, over the sort of time scale over which D values are normally estimated, deviation is slight and likely to be insignificant when compared to other potential sources of error. It is concluded that for most practical purposes interprotein differences in mean rates of amino acid substitution need not be taken into account when calibrating genetic distance estimates against evolutionary time.
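
For readers unfamiliar with the quantity being calibrated, Nei's standard genetic distance is D = -ln(I), where I is the normalized identity of alleles between two populations averaged over loci. The sketch below is a minimal illustration; the function name `nei_distance` and the allele frequencies are hypothetical, and no sample-size bias correction is applied.

```python
import math


def nei_distance(freqs_x, freqs_y):
    """Nei's (1972) standard genetic distance, D = -ln(I).

    freqs_x, freqs_y: one list of allele frequencies per locus, with the
    alleles listed in the same order for both populations.
    """
    if not freqs_x or len(freqs_x) != len(freqs_y):
        raise ValueError("both populations need the same, non-zero number of loci")
    n_loci = len(freqs_x)
    # Average homozygosity within each population.
    j_x = sum(sum(p * p for p in locus) for locus in freqs_x) / n_loci
    j_y = sum(sum(p * p for p in locus) for locus in freqs_y) / n_loci
    # Average probability that two alleles, one from each population, match.
    j_xy = sum(
        sum(p * q for p, q in zip(lx, ly)) for lx, ly in zip(freqs_x, freqs_y)
    ) / n_loci
    identity = j_xy / math.sqrt(j_x * j_y)
    if identity <= 0:
        return math.inf  # no shared alleles at any locus
    return -math.log(identity)


# Hypothetical single-locus example: population X is fixed for allele 1,
# population Y carries both alleles at equal frequency.
d = nei_distance([[1.0, 0.0]], [[0.5, 0.5]])
print(round(d, 4))
```

Under a constant substitution rate the expected value of D grows linearly with divergence time, which is the relationship whose robustness to rate variation the abstract examines.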

6.
Summary: The genomic DNA of cloned recombinants containing the duck globin genes was compared to that of the analogous domains of the chicken. A 36 kb insert including the three alpha-type globin genes was isolated from a newly prepared duck genomic library in the cosmid PJB8; another recombinant contained a 45 kb insert with the four beta globin genes. In the alpha globin gene domain, the relative positions of genes, of repetitive sequences, and of the A+T-rich segments (AT-rich linkers, ATRLs) which frame the gene cluster (Moreau et al. 1982), were found to be closely maintained between duck and chicken. Although ATRLs and repetitive sequences also frame the gene cluster in the beta globin domains of duck and chicken, there is more genetic drift in their relative positions than in the alpha domain. It is of interest that several repetitive DNA segments were detected in the chicken beta globin domain which do not exist in corresponding positions in the duck. In view of the strict conservation in both species of genes and their relative positions in the cluster, this observation seems to exclude a simple function of repetitive sequences in the control of individual genes. The data are discussed with regard to the possible significance of repetitive and AT-rich DNA segments in genome organisation and function.

7.
Proteins evolve under a myriad of biophysical selection pressures that collectively control the patterns of amino acid substitutions. These evolutionary pressures are sufficiently consistent over time and across protein families to produce substitution patterns, summarized in global amino acid substitution matrices such as BLOSUM, JTT, WAG, and LG, which can be used to successfully detect homologs, infer phylogenies, and reconstruct ancestral sequences. Although the factors that govern the variation of amino acid substitution rates have received much attention, the influence of thermodynamic stability constraints remains unresolved. Here we develop a simple model to calculate amino acid substitution matrices from evolutionary dynamics controlled by a fitness function that reports on the thermodynamic effects of amino acid mutations in protein structures. This hybrid biophysical and evolutionary model accounts for nucleotide transition/transversion rate bias, multi‐nucleotide codon changes, the number of codons per amino acid, and thermodynamic protein stability. We find that our theoretical model accurately recapitulates the complex yet universal pattern observed in common global amino acid substitution matrices used in phylogenetics. These results suggest that selection for thermodynamically stable proteins, coupled with nucleotide mutation bias filtered by the structure of the genetic code, is the primary driver behind the global amino acid substitution patterns observed in proteins throughout the tree of life.

8.
Using an information theoretic formalism, we optimize classes of amino acid substitution to be maximally indicative of local protein structure. Our statistically-derived classes are loosely identifiable with the heuristic constructions found in previously published work. However, while these other methods provide a more rigid idealization of physicochemically constrained residue substitution, our classes provide substantially more structural information with many fewer parameters. Moreover, these substitution classes are consistent with the paradigmatic view of the sequence-to-structure relationship in globular proteins which holds that the three-dimensional architecture is predominantly determined by the arrangement of hydrophobic and polar side chains with weak constraints on the actual amino acid identities. More specific constraints are imposed on the placement of prolines, glycines, and the charged residues. These substitution classes have been used in highly accurate predictions of residue solvent accessibility. They could also be used in the identification of homologous proteins, the construction and refinement of multiple sequence alignments, and as a means of condensing and codifying the information in multiple sequence alignments for secondary structure prediction and tertiary fold recognition. © 1996 Wiley-Liss, Inc.

9.
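Cui Zhizhong (崔治中), Qin Aijian (秦爱建). 《生命科学》 (Life Sciences), 2000, 12(4): 155-156
Every protein epitope contains both a relatively conserved amino acid sequence that serves as an "anchor" for binding to the corresponding MHC molecule and a specific amino acid sequence that binds specifically to the TCR on particular cytotoxic T lymphocytes (CTL) or to an antibody. Molecular immunology has accumulated considerable data on the former, but there are few reports on the latter. By analysing the reactivity of nine different strains of Marek's disease virus to two monoclonal antibodies, together with their 38 kD phosphoprotein genes, we determined the amino acid composition that governs the specificity of two overlapping but distinct epitopes.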

10.
Summary: A method of estimating the number of nucleotide substitutions from amino acid sequence data is developed by using Dayhoff's mutation probability matrix. This method takes into account the effect of nonrandom amino acid substitutions and gives an estimate which is similar to the value obtained by Fitch's counting method, but larger than the estimate obtained under the assumption of random substitutions (Jukes and Cantor's formula). Computer simulations based on Dayhoff's mutation probability matrix have suggested that Jukes and Holmquist's method of estimating the number of nucleotide substitutions gives an overestimate when amino acid substitution is not random and the variance of the estimate is generally very large. It is also shown that when the number of nucleotide substitutions is small, this method tends to give an overestimate even when amino acid substitution is purely at random.
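
To make the "random substitutions" baseline concrete, the sketch below shows the familiar Jukes-Cantor correction for nucleotide sequences, d = -(3/4) ln(1 - 4p/3), where p is the observed proportion of differing sites. This is an assumption made for illustration: the abstract does not state the exact form of the formula applied to amino acid data, and the function name `jukes_cantor_distance` is hypothetical. The Dayhoff-based method of the abstract is not reproduced here.

```python
import math


def jukes_cantor_distance(p):
    """Expected substitutions per site under the Jukes-Cantor model.

    p: observed proportion of sites that differ between two aligned
    nucleotide sequences. The correction is undefined for p >= 0.75,
    the saturation level for four equally likely states.
    """
    if not 0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)


# The corrected distance exceeds the raw proportion because some sites
# have been hit more than once.
for p in (0.05, 0.30, 0.60):
    print(p, round(jukes_cantor_distance(p), 4))
```

The correction grows rapidly as p approaches saturation, so small errors in p translate into large errors in the estimate at high divergence.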

11.
The proportion of amino acid substitutions driven by adaptive evolution can potentially be estimated from polymorphism and divergence data by an extension of the McDonald-Kreitman test. We have developed a maximum-likelihood method to do this and have applied our method to several data sets from three Drosophila species: D. melanogaster, D. simulans, and D. yakuba. The estimated number of adaptive substitutions per codon is not uniformly distributed among genes, but follows a leptokurtic distribution. However, the proportion of amino acid substitutions fixed by adaptive evolution seems to be remarkably constant across the genome (i.e., the proportion of amino acid substitutions that are adaptive appears to be the same in fast-evolving and slow-evolving genes; fast-evolving genes have higher numbers of both adaptive and neutral substitutions). Our estimates do not seem to be significantly biased by selection on synonymous codon use or by the assumption of independence among sites. Nevertheless, an accurate estimate is hampered by the existence of slightly deleterious mutations and variations in effective population size. The analysis of several Drosophila data sets suggests that approximately 25% +/- 20% of amino acid substitutions were driven by positive selection in the divergence between D. simulans and D. yakuba.
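
The idea behind extending the McDonald-Kreitman test can be sketched with the simple point estimator alpha = 1 - (Ds * Pn) / (Dn * Ps), where Dn and Ds are nonsynonymous and synonymous fixed differences and Pn and Ps are the corresponding polymorphism counts. This is not the maximum-likelihood method described in the abstract, only the basic quantity it refines; the function name `mk_alpha` and the counts are hypothetical.

```python
def mk_alpha(dn, ds, pn, ps):
    """Point estimate of the proportion of adaptive amino acid substitutions.

    dn, ds: nonsynonymous and synonymous fixed differences between species.
    pn, ps: nonsynonymous and synonymous polymorphisms within a species.
    """
    if dn <= 0 or ps <= 0:
        raise ValueError("dn and ps must be positive")
    # Under strict neutrality Dn/Ds equals Pn/Ps, giving alpha = 0.
    return 1.0 - (ds * pn) / (dn * ps)


# Hypothetical counts for a single gene.
alpha = mk_alpha(dn=60, ds=100, pn=20, ps=50)
print(round(alpha, 3))
```

Slightly deleterious nonsynonymous polymorphisms inflate Pn and bias this estimator downwards, which is one of the difficulties the abstract mentions.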

12.
Several choices of amino acid substitution matrices are currently available for searching and alignment applications. These choices were evaluated using the BLAST searching program, which is extremely sensitive to differences among matrices, and the Prosite catalog, which lists members of hundreds of protein families. Matrices derived directly from either sequence-based or structure-based alignments of distantly related proteins performed much better overall than extrapolated matrices based on the Dayhoff evolutionary model. Similar results were obtained with the FASTA searching program. Improved performance appears to be general rather than family-specific, reflecting improved accuracy in scoring alignments. An implementation of a multiple matrix strategy was also tested. While no combination of three matrices performed as well as the single best matrix, BLOSUM 62, good results were obtained using a combination of sequence-based and structure-based matrices. This hybrid set of matrices is likely to be useful in certain situations. Our results illustrate the importance of matrix selection and value of a comprehensive approach to evaluation of protein comparison tools. © 1993 Wiley-Liss, Inc.

13.
The misuse and overuse of antibiotics have resulted in the emergence of resistant bacteria and fungi, creating an urgent need for new antimicrobial agents. Antimicrobial peptides have therefore attracted great attention from researchers. However, their low physiological stability in biological systems limits the application of naturally occurring antimicrobial peptides as novel therapeutics. In the present study, we synthesized derivatives of protonectin by substituting all the amino acid residues or the cationic lysine residue with the corresponding D‐amino acids. Both the D‐enantiomer of protonectin (D‐prt) and D‐Lys‐protonectin (D‐Lys‐prt) exhibited strong antimicrobial activity against bacteria and fungi. Moreover, D‐prt showed strong stability against trypsin, chymotrypsin and human serum, while D‐Lys‐prt only showed strong stability against trypsin. Circular dichroism analysis revealed that D‐Lys‐prt still kept a typical α‐helical structure in the membrane-mimicking environment, while D‐prt showed a left-handed α‐helical structure. In addition, propidium iodide uptake assays and bacteria and fungi killing experiments indicated that both the all-D‐amino acid and the partially D‐amino acid substituted analogs could disrupt the integrity of the membrane and lead to cell death. In summary, these findings suggest that D‐prt and D‐Lys‐prt might be promising candidate antibiotic agents for therapeutic application against infection by resistant bacteria and fungi. Copyright © 2017 European Peptide Society and John Wiley & Sons, Ltd.

14.
The amino acid sequences of proteins provide rich information for inferring distant phylogenetic relationships and for predicting protein functions. Estimating the rate matrix of residue substitutions from amino acid sequences is also important because the rate matrix can be used to develop scoring matrices for sequence alignment. Here we use a continuous time Markov process to model the substitution rates of residues and develop a Bayesian Markov chain Monte Carlo method for rate estimation. We validate our method using simulated artificial protein sequences. Because different local regions such as binding surfaces and the protein interior core experience different selection pressures due to functional or stability constraints, we use our method to estimate the substitution rates of local regions. Our results show that the substitution rates are very different for residues in the buried core and residues on the solvent-exposed surfaces. In addition, residues on the binding surfaces also have very different substitution rates from residues in the rest of the protein. Based on these findings, we further develop a method for protein function prediction by surface matching using scoring matrices derived from estimated substitution rates for residues located on the binding surfaces. We show with examples that our method is effective in identifying functionally related proteins that have overall low sequence identity, a task known to be very challenging.
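
In a continuous time Markov model, the probabilities of one residue being replaced by another after time t follow from the rate matrix Q as P(t) = exp(Qt). The sketch below illustrates only this relationship; it is not the Bayesian MCMC estimator of the abstract. The two-state rate matrix and the function names are hypothetical, and the truncated Taylor series is an assumption suitable only when the entries of Qt are small.

```python
import math


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def transition_probabilities(q, t, terms=40):
    """Approximate P(t) = exp(Q t) with a truncated Taylor series."""
    n = len(q)
    qt = [[q[i][j] * t for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term holds (Q t)^k / k! after this update.
        term = [[x / k for x in row] for row in mat_mul(term, qt)]
        result = [
            [result[i][j] + term[i][j] for j in range(n)] for i in range(n)
        ]
    return result


# Hypothetical two-state process with equal rates in both directions.
q = [[-1.0, 1.0], [1.0, -1.0]]
p = transition_probabilities(q, 0.5)
print([[round(x, 4) for x in row] for row in p])
```

For this symmetric two-state case the closed form is P_00(t) = 0.5 + 0.5 * exp(-2t), which gives a convenient check on the numerical result. A 20-state amino acid matrix works the same way, though a dedicated matrix-exponential routine is a safer choice for larger values of t.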

15.
It has long been suspected that analysis of correlated amino acid substitutions should uncover pairs or clusters of sites that are spatially proximal in mature protein structures. Accordingly, methods based on different mathematical principles such as information theory, correlation coefficients and maximum likelihood have been developed to identify co-evolving amino acids from multiple sequence alignments. Sets of pairs of sites whose behaviour is identified by these methods as correlated are often significantly enriched in pairs of spatially proximal residues. However, relatively high levels of false-positive predictions typically render such methods, in isolation, of little use in the ab initio prediction of protein structure. Misleading signal (or problems with the estimation of significance levels) can be caused by phylogenetic correlations between homologous sequences and from correlation due to factors other than spatial proximity (for example, correlation of sites which are not spatially close but which are involved in common functional properties of the protein). In recent years, several workers have suggested that information from correlated substitutions should be combined with other sources of information (secondary structure, solvent accessibility, evolutionary rates) in an attempt to reduce the proportion of false-positive predictions. We review methods for the detection of correlated amino acid substitutions, compare their relative performance in contact prediction and predict future directions in the field.
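
One of the information-theoretic approaches mentioned above scores a pair of alignment columns by their mutual information. The sketch below is a bare illustration of that idea, assuming ungapped columns given as strings; the function name `mutual_information` and the toy columns are hypothetical, and no correction for phylogenetic signal or small sample size is applied.

```python
import math
from collections import Counter


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns.

    col_a, col_b: sequences of residues, one per aligned sequence, in the
    same sequence order.
    """
    if not col_a or len(col_a) != len(col_b):
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        mi += p_ab * math.log2(p_ab / ((count_a[a] / n) * (count_b[b] / n)))
    return mi


# Toy columns from four hypothetical sequences.
print(mutual_information("AAVV", "LLII"))  # perfectly covarying sites
print(mutual_information("AAVV", "LILI"))  # independent sites
```

High scores from this raw measure can arise purely from shared ancestry among the sequences, which is one source of the false positives discussed in the abstract.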

16.
Escherichia coli is used extensively in the production of proteins within biotechnology for a number of therapeutic applications. Here, we discuss the production and overexpression of the potential biopharmaceutical human thioredoxin protein (rhTRX) within E. coli. Overexpression of foreign molecules within the cell can put an enormous amount of stress on the translation machinery. This can lead to a misfiring in the construction of a protein resulting in populations differing slightly in amino acid composition. Whilst this may still result in a population of active molecules being expressed, it does present significant problems with molecules that are destined for clinical applications. Amino acid misincorporation of this subset could potentially result in antibodies being raised to these unnatural proteins. Cross-reaction with a patient's endogenous thioredoxin could then lead to an autoimmune phenomenon and serious health implications. Generally, the issue of misincorporation appears not to be a routine regulatory concern (see ICH Q6B guidelines). Therefore, amino acid misincorporation may not have been detected, much less explored in the clinic, as the occurrence or absence of these random errors is not routinely reported. Using current technologies based on proteomics, the ability to find misincorporation critically depends upon the criteria for matching theoretical and experimental mass spectrometry data. Additionally, isolation and extraction of these mistranslated proteins from the production process is both difficult and expensive. Therefore, it is advantageous to find routes for removing their production during the upstream phase. In this study, we show how modern proteomic technology can be used to identify and quantify amino acid misincorporation. Using these techniques we have shown how manipulation of gene sequence and scoping of fermentation media composition can lead to the reduction and elimination of these misincorporations in rhTRX.

17.
The new form of L-arginine D-glutamate is monoclinic, P2₁, with a = 9.941(1), b = 4.668(2), c = 17.307(1) Å, β = 95.27(1)°, and Z = 2. In terms of composition, the new form differs from the old form in that the former is a monohydrate whereas the latter is a trihydrate. The structure has been solved by the direct methods and refined to R = 0.085 for 1012 observed reflections. The conformation of the arginine molecule is the same in both the forms whereas that of the glutamate ion is different. The change in the conformation of the glutamate ion is such that it facilitates extensive pseudosymmetry in the crystals. The molecules arrange themselves in double-layers stabilised by head-to-tail sequences involving main chains, in both the forms. However, considerable differences exist between the two forms in the interface, consisting of side chains and water molecules, between double-layers. A comparative study of the relationship between the crystal structures of L and DL amino acids on the one hand and that between the structures of LL and LD amino acid-amino acid complexes on the other, provides interesting insights into amino acid aggregation and the effect of chirality on it. The crystal structures of most hydrophobic amino acids are made up of double-layers and those of most hydrophilic amino acids contain single layers, irrespective of the chiralities of the amino acids involved. In most cases, the molecules tend to appropriately rearrange themselves to preserve the broad features of aggregation patterns when the chirality of half the molecules is reversed as in the structures of DL amino acids. The basic elements of aggregation in the LL and the LD complexes, are similar to those found in the crystals of L and DL amino acids. However, the differences between the LL and the LD complexes in the distribution of these elements are more pronounced than those between the distributions in the structures of L and DL amino acids.

18.
The bcr-abl chimeric gene of Philadelphia chromosome positive chronic myelogenous leukemias is only weakly transforming. This transformation activity is greatly enhanced by a Lys-for-Glu substitution at position 832 in the c-abl gene, as occurs in the highly transforming v-abl genes. It has been suggested that this mutation results in a significant structural change in the encoded protein product. Using conformational energy analysis, we have determined the allowed low-energy conformations for residues 828–836 of this protein with Lys and Glu at position 832. In both cases, the overwhelmingly preferred conformation for this region is a bend-helix motif. The helix terminates at residue 836, and there are no discernible differences in conformation between the Lys- and Glu-containing sequences. These results suggest that the activating amino acid substitution at position 832 in the c-abl protein product does not produce its effect via a local conformational change.

19.
20.
Mitochondrial DNA (mtDNA) sequences are widely used for inferring the phylogenetic relationships among species. Clearly, the assumed model of nucleotide or amino acid substitution used should be as realistic as possible. Dependence among neighboring nucleotides in a codon complicates modeling of nucleotide substitutions in protein-encoding genes. It seems preferable to model amino acid substitution rather than nucleotide substitution. Therefore, we present a transition probability matrix of the general reversible Markov model of amino acid substitution for mtDNA-encoded proteins. The matrix is estimated by the maximum likelihood (ML) method from the complete sequence data of mtDNA from 20 vertebrate species. This matrix represents the substitution pattern of the mtDNA-encoded proteins and shows some differences from the matrix estimated from the nuclear-encoded proteins. The use of this matrix would be recommended in inferring trees from mtDNA-encoded protein sequences by the ML method. Received: 3 May 1995 / Accepted: 31 October 1995
