Similar literature
 20 similar articles found (search time: 0 ms)
1.
Several choices of amino acid substitution matrices are currently available for searching and alignment applications. These choices were evaluated using the BLAST searching program, which is extremely sensitive to differences among matrices, and the Prosite catalog, which lists members of hundreds of protein families. Matrices derived directly from either sequence-based or structure-based alignments of distantly related proteins performed much better overall than extrapolated matrices based on the Dayhoff evolutionary model. Similar results were obtained with the FASTA searching program. Improved performance appears to be general rather than family-specific, reflecting improved accuracy in scoring alignments. An implementation of a multiple matrix strategy was also tested. While no combination of three matrices performed as well as the single best matrix, BLOSUM 62, good results were obtained using a combination of sequence-based and structure-based matrices. This hybrid set of matrices is likely to be useful in certain situations. Our results illustrate the importance of matrix selection and the value of a comprehensive approach to evaluation of protein comparison tools. © 1993 Wiley-Liss, Inc.

2.
The genomic era has seen a remarkable increase in the number of genomes being sequenced and annotated. Nonetheless, annotation remains a serious challenge for compositionally biased genomes. For the preliminary annotation, popular nucleotide and protein comparison methods such as BLAST are widely employed. These methods use matrices, such as the amino acid substitution matrices, to score alignments. Since a nucleotide bias leads to an overall bias in the amino acid composition of proteins, it is possible that a genome with nucleotide bias may have introduced atypical amino acid substitutions in its proteome. Consequently, standard matrices fail to perform well in sequence analysis of these genomes. To address this issue, we examined amino acid substitution in the AT-rich genome of Plasmodium falciparum, chosen as a reference, and reconstituted a substitution matrix in the genome's context. The matrix was used to generate protein sequence alignments for the parasite proteins that improved across the functional regions. We attribute this to the consistency that may have been achieved between the target and background frequencies calculated exclusively in our study. This study has important implications for the annotation of proteins that are of experimental interest but give poor sequence alignments with standard matrices.

3.
Automatic comparison of compositionally biased genomes, such as that of the malarial causative agent Plasmodium falciparum (82% adenosine + thymidine), with genomes of average composition, is currently limited. Indeed, popular tools such as BLAST require that amino acid distributions be similar in aligned sequences. However, the P. falciparum genome is so biased that six amino acids account for more than 50% of the protein composition. One reason for the comparison methods' failure lies in the compositional difference between the query and the subject proteomes, which is not taken into account in the amino acid substitution matrices. This paper introduces a method to derive substitution matrices, in particular BLOSUM 62, in the framework of information theory. It allows the construction of non-symmetrical matrices, taking into account the non-symmetric amino acid distributions. The dirAtPf family of matrices allowing the comparison of P. falciparum and A. thaliana is given as an example. This paper further provides an analysis of the obtained matrices in the framework of information theory, supporting the discrimination advantage they bring.

4.
Proteins evolve under a myriad of biophysical selection pressures that collectively control the patterns of amino acid substitutions. These evolutionary pressures are sufficiently consistent over time and across protein families to produce substitution patterns, summarized in global amino acid substitution matrices such as BLOSUM, JTT, WAG, and LG, which can be used to successfully detect homologs, infer phylogenies, and reconstruct ancestral sequences. Although the factors that govern the variation of amino acid substitution rates have received much attention, the influence of thermodynamic stability constraints remains unresolved. Here we develop a simple model to calculate amino acid substitution matrices from evolutionary dynamics controlled by a fitness function that reports on the thermodynamic effects of amino acid mutations in protein structures. This hybrid biophysical and evolutionary model accounts for nucleotide transition/transversion rate bias, multi‐nucleotide codon changes, the number of codons per amino acid, and thermodynamic protein stability. We find that our theoretical model accurately recapitulates the complex yet universal pattern observed in common global amino acid substitution matrices used in phylogenetics. These results suggest that selection for thermodynamically stable proteins, coupled with nucleotide mutation bias filtered by the structure of the genetic code, is the primary driver behind the global amino acid substitution patterns observed in proteins throughout the tree of life.

5.
Goonesekere NC, Lee B. Proteins, 2008, 71(2): 910-919
Sequence homology detection relies on score matrices, which reflect the frequency of amino acid substitutions observed in a dataset of homologous sequences. The substitution matrices in popular use today are usually constructed without consideration of the structural context in which the substitution takes place. Here, we present amino acid substitution matrices specific to the particular polar-nonpolar environment of the amino acid. As expected, these matrices [context-specific substitution matrices (CSSMs)] show striking differences from the popular BLOSUM62 matrix, which does not include structural information. When incorporated into BLAST and PSI-BLAST, CSSM outperformed BLOSUM matrices as assessed by ROC curve analyses of the number of true and false hits and by the accuracy of the sequence alignments to the hit sequences. These findings are also of relevance to profile-profile-based methods of homology detection, since CSSMs may help build a better profile. Profiles generated for protein sequences in PDB using CSSM-PSI-BLAST will be made available for searching via RPSBLAST through our web site http://lmbbi.nci.nih.gov/.

6.
MOTIVATION: Amino acid substitution matrices play a central role in protein alignment methods. Standard log-odds matrices, such as those of the PAM and BLOSUM series, are constructed from large sets of protein alignments having implicit background amino acid frequencies. However, these matrices frequently are used to compare proteins with markedly different amino acid compositions, such as transmembrane proteins or proteins from organisms with strongly biased nucleotide compositions. It has been argued elsewhere that standard matrices are not ideal for such comparisons and, furthermore, a rationale has been presented for transforming a standard matrix for use in a non-standard compositional context. RESULTS: This paper presents the mathematical details underlying the compositional adjustment of amino acid or DNA substitution matrices.
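The log-odds construction this abstract refers to can be illustrated with a minimal sketch. It assumes the standard definition s_ab = log2(q_ab / (p_a p_b)), where q_ab are target (aligned-pair) frequencies and p_a are background frequencies. The two-letter alphabet, the function name, and all numbers are hypothetical toy values, not taken from the paper, and the paper's compositional adjustment itself is not reproduced here.

```python
import math


def log_odds_matrix(target, background):
    """Return s[(a, b)] = log2(q_ab / (p_a * p_b)), in bits.

    target:     dict mapping (a, b) -> joint frequency q_ab of aligned pairs
    background: dict mapping a -> background frequency p_a
    """
    return {
        (a, b): math.log2(q / (background[a] * background[b]))
        for (a, b), q in target.items()
    }


# Hypothetical toy data for a two-letter alphabet (not from the paper).
target = {("A", "A"): 0.4, ("A", "B"): 0.1, ("B", "A"): 0.1, ("B", "B"): 0.4}
# Background frequencies implied by the target frequencies (row sums).
background = {"A": 0.5, "B": 0.5}

scores = log_odds_matrix(target, background)
```

Under this definition, a positive score marks a pair that is aligned more often than chance would predict from the background frequencies, and a negative score marks a pair aligned less often.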

7.
MOTIVATION: Protein and DNA are generally represented by sequences of letters. In a number of circumstances simplified alphabets (where one or more letters would be represented by the same symbol) have proved their potential utility in several fields of bioinformatics including searching for patterns occurring at an unexpected rate, studying protein folding and finding consensus sequences in multiple alignments. The main issue addressed in this paper is the possibility of finding a general approach that would allow an exhaustive analysis of all the possible simplified alphabets, using substitution matrices like PAM and BLOSUM as a measure for scoring. RESULTS: The computational approach presented in this paper has led to a computer program called AlphaSimp (Alphabet Simplifier) that can perform an exhaustive analysis of the possible simplified amino acid alphabets, using a branch and bound algorithm together with standard or user-defined substitution matrices. The program returns a ranked list of the highest-scoring simplified alphabets. When the extent of the simplification is limited and the simplified alphabets are maintained above ten symbols the program is able to complete the analysis in minutes or even seconds on a personal computer. However, the performance becomes worse, taking up to several hours, for highly simplified alphabets. AVAILABILITY: AlphaSimp and other accessory programs are available at http://bioinformatics.cribi.unipd.it/alphasimp

8.
The amino acid sequence of Egyptian goose lysozyme (EGL) from egg-white and its enzymatic properties were analyzed. The established sequence had the highest similarity to wood duck lysozyme (WDL), with five amino acid substitutions, and differed from hen egg-white lysozyme (HEL) by eighteen substitutions. Tyr34 and Gly37 were found at subsites E and F of the active site when compared with HEL. The experimental time-course characteristics of EGL against the N-acetylglucosamine pentamer substrate, (GlcNAc)₅, revealed higher production of (GlcNAc)₄ and lower production of (GlcNAc)₂ when compared with HEL. The saccharide-binding ability of subsites A-C in EGL was also found to be weaker than in HEL. An analysis of the enzymatic reactions of five mutants in respect of positions 34, 37 and 71 in HEL indicated the time-course characteristics of EGL to be caused by the combination of three substitutions (F34Y, N37G and G71R) between HEL and EGL. A computer simulation of the EGL-catalyzed reaction suggested that the time-course characteristics of EGL resulted from the difference in the binding free energy for subsites A, B, E and F and the rate constant of transglycosylation between EGL and HEL.

9.
The pattern of amino acid substitutions and sequence conservation over many structure-based alignments of protein sequences was analyzed as a function of percentage sequence identity. The statistics of the amino acid substitutions were converted into the form of log-odds amino acid substitution matrices to which eigenvalue decomposition was applied. It was found that the most important component of the substitution matrices exhibited a sharp transition at the sequence identity of 30-35%, which coincides with the twilight zone. Above the transition point, the most dominant component is related to the mutability of amino acids and it acts to disfavor any substitutions, whereas below the transition point, the most dominant component is related to the hydrophobicity of amino acids and substitutions between residues of similar hydrophobic character are positively favored. Implications for protein evolution and sequence analysis are discussed.
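As a rough illustration of what extracting a dominant component from a symmetric score matrix involves, the sketch below uses plain power iteration. This is an assumption made for illustration only: the abstract does not say which numerical method the authors used, "dominant" is taken here to mean the eigenvalue of largest magnitude, and the 3x3 matrix holds hypothetical toy values rather than real substitution scores.

```python
import math


def dominant_component(matrix, iterations=500):
    """Approximate the largest-magnitude eigenvalue of a symmetric matrix,
    and a unit eigenvector for it, by power iteration."""
    n = len(matrix)
    # Non-uniform start vector, so it is unlikely to be orthogonal
    # to the dominant eigenvector.
    vec = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # The Rayleigh quotient of the unit vector estimates the eigenvalue.
    mv = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(vec[i] * mv[i] for i in range(n))
    return eigenvalue, vec


# Hypothetical symmetric toy matrix with eigenvalues 3, 1 and -1.
toy = [
    [2.0, 1.0, 0.0],
    [1.0, 2.0, 0.0],
    [0.0, 0.0, -1.0],
]
value, vector = dominant_component(toy)
```

In this toy case the dominant eigenvector puts equal weight on the first two symbols and none on the third. Interpreting the eigenvector of a real substitution matrix, as the abstract does with mutability and hydrophobicity, relies on the same kind of reading of the component weights.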

10.
Aligned amino acid sequences of three functionally independent samples of transmembrane (TM) transport proteins have been analyzed. The concept of TM-kernel is proposed as the most probable transmembrane region of a sequence. The average amino acid composition of TM-kernels differs from the published amino acid composition of transmembrane segments. TM-kernels contain more alanine and glycine residues, and fewer polar, charged, and aromatic residues, than non-TM-proteins. There are also differences between TM-kernels of bacterial and eukaryotic proteins. We have constructed amino acid substitution matrices for bacterial TM-kernels, named the BATMAS (BActerial Transmembrane MAtrix of Substitutions) series. In TM-kernels, polar and charged residues, as well as proline and tyrosine, are highly conserved, whereas there are more substitutions within the group of hydrophobic residues, in contrast to non-TM-proteins that have fewer, relatively more conserved, hydrophobic residues. These results demonstrate that alignment of transmembrane proteins should be based on at least two amino acid substitution matrices, one for loops (e.g., the BLOSUM series) and one for TM-segments (the BATMAS series), and the choice of the TM-matrix should be different for eukaryotic and bacterial proteins.

11.
Molecular dynamics simulations were applied to helix folding of alanine-based synthetic peptides. A single alanine residue in the middle of the peptide was substituted with various nonpolar amino acids (leucine, isoleucine, valine, glycine or proline) to study the effect of the substitution. Unlike many other molecular dynamics simulations, nonhelical initial conformations were used in our simulations to study the folding process. An average solvent effect was included in the energy function to simplify the solvent calculation and to overcome the multiple minima problem. During the simulations, the peptides folded into helices in nanoseconds. Compact structures containing two helical segments were also observed. The calculated helical ratios of the peptides showed the same rank order as observed experimentally for the alanine-based peptides. Within a peptide, the helical ratio of each residue was calculated and a minimum was found near the center of the sequence for all peptides. The substitutions had different asymmetric effects on the helical ratios of the residues preceding and following the substitution site, indicating different helix capping preferences of the substituting amino acids. © 1997 John Wiley & Sons, Inc. Biopoly 42: 633–644, 1997

12.
13.
Hemophilia B Kashihara is a severe hemorrhagic disorder in which the factor IX antigen is present in normal amounts but factor IX biological activity is markedly reduced. In addition, purified factor IX Kashihara is not activated by purified factor XIa in the presence of calcium ions. Amino acid sequence analysis of one of the tryptic peptides isolated from factor IX Kashihara indicated that Val-182 (equivalent to Val-17 in the chymotrypsin numbering system) had been replaced by Phe. No substitution was found in the members of the catalytic triad His-221, Asp-269, and Ser-365 of factor IX Kashihara. The Val-to-Phe replacement found in factor IX Kashihara appears to sterically hinder the cleavage of Arg 180-Val 181 by factor XIa required for the activation of this zymogen.

14.
Yu X, Zheng X, Liu T, Dou Y, Wang J. Amino Acids, 2012, 42(5): 1619-1625
Apoptosis proteins are very important for understanding the mechanism of programmed cell death. Obtaining information on the subcellular location of apoptosis proteins is very helpful for understanding the apoptosis mechanism. In this paper, based on an amino acid substitution matrix and auto covariance transformation, we introduce a new sequence-based model, which not only quantitatively describes the differences between amino acids, but also partially incorporates the sequence-order information. This method is applied to predict the subcellular location of apoptosis proteins in two widely used datasets using a support vector machine classifier. The results obtained by the jackknife test are quite promising, indicating that the proposed method might serve as a potential and efficient prediction model for apoptosis protein subcellular location prediction.
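The auto covariance transformation named in this abstract can be sketched as follows. The sketch assumes a commonly used definition, AC(lag) = 1/(L - lag) * Σ (x_i - mean)(x_{i+lag} - mean), applied to a numeric profile obtained by mapping each residue to one column of a substitution matrix. The abstract does not give the paper's exact formulation, so the function names, the encoding step, and the toy scores are all hypothetical.

```python
def auto_covariance(values, lag):
    """Auto covariance of a numeric profile at the given lag:
    the mean of (x_i - mean) * (x_{i+lag} - mean) over all valid i."""
    length = len(values)
    if not 0 < lag < length:
        raise ValueError("lag must satisfy 0 < lag < len(values)")
    mean = sum(values) / length
    return sum(
        (values[i] - mean) * (values[i + lag] - mean)
        for i in range(length - lag)
    ) / (length - lag)


def encode(sequence, column):
    """Map each residue to its score in one (hypothetical) matrix column."""
    return [column[residue] for residue in sequence]


# Hypothetical per-residue scores standing in for one substitution-matrix column.
toy_column = {"A": 4.0, "L": -1.0, "K": -1.0, "G": 0.0}
profile = encode("ALKGA", toy_column)

# One feature per lag; a fixed set of lags gives a fixed-length vector
# regardless of sequence length, which suits a classifier such as an SVM.
features = [auto_covariance(profile, lag) for lag in (1, 2)]
```

Repeating the encoding for every column of the matrix and concatenating the per-lag values would give one fixed-length feature vector per protein. Whether the paper builds its vectors exactly this way is not stated in the abstract.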

15.
The local environment of an amino acid in a folded protein determines the acceptability of mutations at that position. In order to characterize and quantify these structural constraints, we have made a comparative analysis of families of homologous proteins. Residues in each structure are classified according to amino acid type, secondary structure, accessibility of the side chain, and existence of hydrogen bonds from the side chains. Analysis of the pattern of observed substitutions as a function of local environment shows that there are distinct patterns, especially for buried polar residues. The substitution data tables are available on diskette with Protein Science. Given the fold of a protein, one is able to predict sequences compatible with the fold (profiles or templates) and potentially to discriminate between a correctly folded and misfolded protein. Conversely, analysis of residue variation across a family of aligned sequences in terms of substitution profiles can allow prediction of secondary structure or tertiary environment.

16.
Resistance acquired by the tick Rhipicephalus microplus (Canestrini) to different types of ixodicides in Mexico has had a negative impact on national and local livestock, mainly due to the transmission of diseases such as babesiosis and anaplasmosis, among others. Resistance was diagnosed using the bioassays specified in the Norma Oficial Mexicana (NOM-006-ZOO-1994). The purpose of this investigation was to determine resistance to pyrethroids through an isoleucine-to-phenylalanine mutation in the KDR gene, in a population of ticks from Montemorelos, NL, Mexico. Preliminary bioassays demonstrated resistance to cypermethrin and deltamethrin (27.4%) and flumethrin (36.7–34.7%). To identify the mutation, DNA was extracted from 100 mg pools of larvae. Ten pools were assessed by PCR, in which a pair of primers designed with the programs Oligo 2.0 and Amplify 1.2 amplified a 136 bp fragment containing the mutation. The PCR product was subsequently sequenced to confirm the presence of the mutation. A strain susceptible to pyrethroid insecticides (Mora strain) was used as control, and it did not show the mutation. However, the mutation was detected in 4 out of 10 samples of the Montemorelos strain.

17.
18.
19.
Hemoglobin I is an uncommon hemoglobin variant in which the lysine residue at position 16 of the α chain has been replaced by glutamic acid. Lysine is the invariant residue in all myoglobin and hemoglobin subunits that have been sequenced, with the exception of the hemoglobin of the lamprey. Replacement of invariant residues is generally reflected in altered functional properties of the hemoglobin molecule and such invariance may be indicative of a unique functional role. However, a study of the oxygen equilibrium and kinetic properties of hemoglobin I showed the functional properties of this hemoglobin to be indistinguishable from those of normal adult hemoglobin.

20.
Protein A affinity chromatography is the standard purification process for the capture of therapeutic antibodies. The individual IgG‐binding domains of protein A (E, D, A, B, C) have highly homologous amino acid sequences. From a previous report, it has been assumed that the C domain has superior resistance to alkaline conditions compared to the other domains. We investigated several properties of the C domain as an IgG‐Fc capture ligand. Based on cleavage site analysis of a recombinant protein A using a protein sequencer, the C domain was found to be the only domain to have neither of the potential alkaline cleavage sites. Circular dichroism (CD) analysis also indicated that the C domain has good physicochemical stability. Additionally, we evaluated the amino acid substitutions at the Gly‐29 position of the C domain, as the Z domain (an artificial B domain) acquired alkaline resistance through a G29A mutation. The G29A mutation proved to increase the alkaline resistance of the C domain, based on BIACORE analysis, although the improvement was significantly smaller than that observed for the B domain. Interestingly, a number of other amino acid mutations at the same position increased alkaline resistance more than did the G29A mutation. This result supports the notion that even a single mutation on the originally alkali‐stable C domain would improve its alkaline stability. An engineered protein A based on this C domain is expected to show remarkable performance as an affinity ligand for immunoglobulin.
