Similar literature
20 similar documents found; search took 31 ms
1.
We describe an algorithm for finding nucleotide residues strongly correlated with the amino acid acceptor functions of transfer RNAs. The algorithm exploits the fact that each tRNA accepts only one of 20 amino acids. The algorithm is applied to 37 Saccharomyces cerevisiae transfer RNAs. Received on January 28, 1987

2.
The profile hidden Markov model (PHMM) is widely used to assign protein sequences to their respective families. A major limitation of a PHMM is the assumption that, given the states, the observations (amino acids) are independent. To overcome this limitation, the dependency between amino acids in a multiple sequence alignment (MSA), which is the representative of a PHMM, can be appended to the PHMM. Because the amino acid sequences in an MSA are biologically related, the one-by-one dependency between two amino acids can be considered. In other words, based on the MSA, the dependency between an amino acid and the corresponding amino acid located above it can be combined with the PHMM. For this purpose, a new emission probability matrix that considers the one-by-one dependencies between amino acids is constructed. The parameters of a PHMM are of two types, transition and emission probabilities, which are usually estimated using an EM algorithm called the Baum-Welch algorithm. We have generalized the Baum-Welch algorithm using a similarity emission matrix constructed by integrating the new emission probability matrix with the common emission probability matrix. Then, the performance of the similarity emission is discussed by applying it to the top twenty protein families in the Pfam database. We show that using the similarity emission in the Baum-Welch algorithm significantly outperforms the common Baum-Welch algorithm in the task of assigning protein sequences to protein families.

3.
Noroviruses account for 96% of viral gastroenteritis cases worldwide, with GII.4 strains responsible for >80% of norovirus outbreaks. Histo-blood group antigens (HBGAs) are norovirus binding ligands, and antigenic and preferential HBGA binding profiles vary over time as new GII.4 strains emerge. The capsid P2 subdomain facilitates HBGA binding, contains neutralizing antibody epitopes, and likely evolves in response to herd immunity. To identify amino acids regulating HBGA binding and antigenic differences over time, we created chimeric virus-like particles (VLPs) between the GII.4-1987 and GII.4-2006 strains by exchanging amino acids in putative epitopes and characterized their antigenic and HBGA binding profiles using anti-GII.4-1987 and -2006 mouse monoclonal antibodies (MAbs) and polyclonal sera, 1988 outbreak human sera, and synthetic HBGAs. The exchange of amino acids 393 to 395 between GII.4-1987 and GII.4-2006 resulted in altered synthetic HBGA binding compared to parental strains. Introduction of GII.4-1987 residues 294, 297 to 298, 368, and 372 (epitope A) into GII.4-2006 resulted in reactivity with three anti-GII.4-1987 MAbs and reduced reactivity with four anti-GII.4-2006 MAbs. The three anti-GII.4-1987 MAbs also blocked chimeric VLP-HBGA interaction, while an anti-GII.4-2006 blocking antibody did not, indicating that epitope A amino acids comprise a potential neutralizing epitope for GII.4-1987 and GII.4-2006. We also tested GII.4-1987-immunized mouse polyclonal sera and 1988 outbreak human sera for the ability to block chimeric VLP-HBGA interaction and found that epitope A amino acids contribute significantly to the GII.4-1987 blockade response. Our data provide insights that help explain the emergence of new GII.4 epidemic strains over time, may aid development of norovirus therapeutics, and may help predict the emergence of future epidemic strains.

4.
MOTIVATION: We propose representing amino acids by bit-patterns so they may be used in a filter algorithm for similarity searches over protein databases, to rapidly eliminate non-homologous regions of database sequences. The filter algorithm would be based on dynamic programming optimization. It would have the advantage over previous filter algorithms that its substitution scoring function distinguishes between conservative and non-conservative amino acid substitutions. RESULTS: Simulated annealing was used to search for the best five-bit or three-bit patterns to represent amino acids, where similar amino acids were given similar bit-patterns. The similarity between amino acids was estimated from the BLOSUM45 matrix. Representing amino acids by these five-bit and three-bit patterns, the Escherichia coli PhoE precursor and the bacteriophage PA2 LC precursor were aligned. The alignments were nearly the same as that obtained when BLOSUM45 was used to score substitutions. AVAILABILITY: The C code of the optimization algorithm for searching for the optimal bit-pattern representation of amino acids is available from the authors upon request.
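The bit-pattern idea in this abstract can be illustrated with a minimal Python sketch. The five-bit codes below are hypothetical and chosen only for illustration, since the abstract does not list the patterns found by simulated annealing. The scoring rule (number of matching bits) is likewise an assumption, not the authors' published function.

```python
# Hypothetical 5-bit codes for a few residues, chosen only for illustration.
# The abstract does not list the patterns found by simulated annealing.
BIT_PATTERNS = {
    "L": 0b00001,
    "I": 0b00011,
    "V": 0b00111,
    "D": 0b11000,
    "E": 0b11100,
}
N_BITS = 5


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two patterns differ."""
    return bin(a ^ b).count("1")


def substitution_score(x: str, y: str) -> int:
    """Score a substitution as the number of matching bits (assumed rule)."""
    return N_BITS - hamming(BIT_PATTERNS[x], BIT_PATTERNS[y])


if __name__ == "__main__":
    # A conservative pair (L, I) should score higher than a
    # non-conservative pair (L, D) under these illustrative codes.
    print(substitution_score("L", "I"), substitution_score("L", "D"))
```

With codes like these, similar residues differ in few bits, so a single XOR and bit count can stand in for a substitution-matrix lookup inside a fast filter.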

5.
The antigenic index: a novel algorithm for predicting antigenic determinants   Total citations: 39, self-citations: 0, citations by others: 39
In this paper, we introduce a computer algorithm which can be used to predict the topological features of a protein directly from its primary amino acid sequence. The computer program generates values for surface accessibility parameters and combines these values with those obtained for regional backbone flexibility and predicted secondary structure. The output of this algorithm, the antigenic index, is used to create a linear surface contour profile of the protein. Because most, if not all, antigenic sites are located within surface exposed regions of a protein, the program offers a reliable means of predicting potential antigenic determinants. We have tested the ability of this program to generate accurate surface contour profiles and predict antigenic sites from the linear amino acid sequences of well-characterized proteins and found a strong correlation between the predictions of the antigenic index and known structural and biological data. Received on August 17, 1987; accepted on December 31, 1987

6.
A method is introduced and analysed that uses support vector machines (SVMs) to identify the positions in an amino acid sequence that are critical for the catalytic activity of a protein. SVMs are supported by an efficient learning algorithm and can utilize some prior knowledge about the structure of the problem. The amino acid sequences of the variants of a protein, created by inducing mutations, along with their fitness are required as input data by the method to predict its critical positions. To investigate the performance of this algorithm, variants of the beta-lactamase enzyme were created in silico using simulations of both mutagenesis and recombination protocols. Results from the literature on beta-lactamase were used to test the accuracy of this method. It was also compared with the results from a simple search algorithm. The algorithm was also shown to be able to predict critical positions that can tolerate two different amino acids and retain function.

7.
After ingestion of the parasporal crystals of Bacillus sphaericus, mosquito larvae process the 42-kilodalton (kDa) toxin to a protein of 39 kDa, which has an increased toxicity (A. H. Broadwell and P. Baumann, Appl. Environ. Microbiol. 53:1333-1337, 1987). A similar activation is performed by trypsin and chymotrypsin. Using site-directed mutagenesis, we have constructed derivatives of the 42-kDa toxin with a deletion of 10 amino acids at the N terminus and deletions of 7, 17, or 20 amino acids at the C terminus. Toxicity for mosquito larvae was retained upon deletion of 7 or 17 amino acids but was lost upon deletion of 20 amino acids. Evidence is presented indicating that the protein containing deletions of 10 amino acids at the N terminus and 17 amino acids at the C terminus (corresponding to potential chymotrypsin cleavage sites) is similar to the 39-kDa protein produced in mosquito larvae or by digestion with chymotrypsin. Digestion with trypsin appears to generate a protein lacking 16 or 19 amino acids from the N terminus and 7 amino acids from the C terminus. As is the case with the recombinant-made 42-kDa protein, toxicity of its derivatives is dependent on the presence of a 51-kDa protein which is a component of the parasporal crystal of B. sphaericus 2362.

8.
In this paper, a new algorithm is presented that makes possible multilevel comparison of BLOSUM protein substitution matrices based on data from different groups of organisms. As an example, a comparison between substitution matrices based on data from two groups of bacterial genomes with different GC content is presented. Our approach includes evaluating the number of amino acid pairs in BLOCKS databases created separately for the two groups of bacteria using protein sequences deposited in the COG database. Differences between the distributions of amino acid pair counts are tested using the chi-squared based G-test. Different analysis levels make it possible to distinguish different patterns of amino acid substitution. Application of the algorithm reveals statistically significant differences in amino acid substitution patterns between AT-rich and GC-rich groups of bacterial organisms. The differences are particularly visible in the overall substitution pattern, the amino acid conservation pattern and in the comparison of substitution patterns for single amino acids. The algorithm presented in this paper can be considered a novel method for multi-level comparison of amino acid substitution patterns. The presented approach is not limited to bacterial organisms and BLOSUM substitution matrices. Statistically significant differences between substitution patterns in the two groups of bacterial organisms with respect to the amino acid conservation pattern may be evidence of different rates of evolutionary change between AT-rich and GC-rich bacterial organisms.
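As a rough illustration of the G-test mentioned in this abstract, the Python sketch below compares two vectors of pair counts using the standard likelihood-ratio statistic G = 2 Σ O ln(O/E), with expected counts taken from the row and column totals of the 2 × k table. The function name and the example counts are assumptions; the abstract does not describe the authors' implementation.

```python
import math


def g_statistic(counts_a, counts_b):
    """Likelihood-ratio G statistic for a 2 x k table of counts.

    counts_a and counts_b hold the observed counts of the same k
    categories (for example amino acid pairs) in two groups.
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("both groups need the same categories")
    total_a, total_b = sum(counts_a), sum(counts_b)
    grand_total = total_a + total_b
    g = 0.0
    for a, b in zip(counts_a, counts_b):
        column_total = a + b
        for observed, row_total in ((a, total_a), (b, total_b)):
            if observed > 0:  # a zero count contributes nothing to the sum
                expected = row_total * column_total / grand_total
                g += observed * math.log(observed / expected)
    return 2.0 * g


if __name__ == "__main__":
    # Made-up counts for two categories in two groups.
    print(round(g_statistic([10, 20], [20, 10]), 3))
```

For a 2 × k table the statistic is compared against a chi-squared distribution with k - 1 degrees of freedom.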

9.
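In ancient agricultural systems, all fertilizers were organic, and people did not know which substances plants depended on to grow and reproduce. After the German chemist Liebig (J.U.Liebig) published the book "The Application of Chemistry in Agriculture and Plant Physiology" in 1840, people gained a new understanding of plant nutrition and recognized the importance of minerals to plants. Liebig stated in the book: "The minerals in the soil are the only nutrients of all green plants. Plants can grow and develop relying entirely on inorganic substances." This view is the "mineral nutrition theory of plants". The theory was later fully confirmed by production practice, and became agricultural chemistry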

10.
The maximum specific growth rate of Streptococcus lactis and Streptococcus cremoris on synthetic medium containing glutamate but no glutamine decreases rapidly above pH 7. Growth of these organisms is extended to pH values in excess of 8 in the presence of glutamine. These results can be explained by the kinetic properties of glutamate and glutamine transport (B. Poolman, E. J. Smid, and W. N. Konings, J. Bacteriol. 169:2755-2761, 1987). At alkaline pH the rate of growth in the absence of glutamine is limited by the capacity to accumulate glutamate due to the decreased availability of glutamic acid, the transported species of the glutamate-glutamine transport system. Kinetic analysis of leucine and valine transport shows that the maximal rate of uptake of these amino acids by the branched-chain amino acid transport system is 10 times higher in S. lactis cells grown on synthetic medium containing amino acids than in cells grown in complex broth. For cells grown on synthetic medium, the maximal rate of transport exceeds by about 5 times the requirements at maximum specific growth rates for leucine, isoleucine, and valine (on the basis of the amino acid composition of the cell). The maximal rate of phenylalanine uptake by the aromatic amino acid transport system is in small excess of the requirement for this amino acid at maximum specific growth rates. Analysis of the internal amino acid pools of chemostat-grown cells indicates that passive influx of (some) aromatic amino acids may contribute to the net uptake at high dilution rates.

11.
Summary: The marine polychaete Phragmatopoma californica (Fewkes) (Sabellariidae) lives within a tube that it constructs by cementing together material such as sand and shells. All of the carbon and nitrogen in the cement (determined by CHN combustion elemental analysis) can be accounted for as protein, although other organic constituents were not specifically looked for. Although the cement may be composed of more than one protein, amino acid analysis reveals a similarity to the silk protein, sericin, which is the sticky outer covering on silk fibers. The short-chain amino acids comprise 60% of the total residues (glycine: 24%, alanine: 7%, and serine: 29%). Lysine is the next most abundant residue (12%), with basic amino acids totalling 19% of the total residues. Amino acids with hydroxyl side-chains account for 35% of the total. The amino acid DOPA (3,4-dihydroxyphenylalanine), which is present as 2.6% of the total residues, probably acts to stabilize the material through quinone tanning and/or by forming adhesive-type complexes with substrata. The cement thus displays structural and functional similarities with the cement of Mytilus spp. (Waite 1987) and with silk. Abbreviations: DOPA, 3,4-dihydroxyphenylalanine; HPLC, high-pressure liquid chromatography; ODS, octadecylsilane; OPA, o-phthalaldehyde.

12.
Li T, Fan K, Wang J, Wang W. Protein Engineering, 2003, 16(5): 323-330
It is well known that there are some similarities among various naturally occurring amino acids. Thus, the complexity in protein systems could be reduced by sorting these amino acids with similarities into groups, and protein sequences can then be simplified by reduced alphabets. This paper discusses how to group similar amino acids and whether there is a minimal amino acid alphabet by which proteins can be folded. Various reduced alphabets are obtained by reserving the maximal information for the simplified protein sequence compared with the parent sequence using global sequence alignment. With these reduced alphabets and simplified similarity matrices, we achieve recognition of the protein fold based on the similarity score of the sequence alignment. The coverage in dataset SCOP40 is obtained for various levels of reduction of the amino acid types, where coverage is the ratio of the number of homologous pairs detected by the program BLAST to the number marked by SCOP40. For the reduced alphabets containing 10 types of amino acids, the ability to detect distantly related folds remains almost at the same level as that of the alphabet of 20 types of amino acids, which implies that 10 types of amino acids may be the degree of freedom for characterizing the complexity in proteins.
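To make the notion of a reduced alphabet concrete, here is a small Python sketch that rewrites a protein sequence using one representative letter per group. The ten groups below are a hypothetical grouping based on common chemical intuition, chosen only for illustration. They are not the alphabets derived in the paper, which the abstract does not list.

```python
# Hypothetical grouping of the 20 amino acids into 10 classes.
# These groups are illustrative only and are NOT taken from the paper.
GROUPS = ["LVIM", "C", "A", "G", "ST", "P", "FYW", "EDNQ", "KR", "H"]

# Map every residue to the first letter of its group.
REDUCED = {residue: group[0] for group in GROUPS for residue in group}


def reduce_sequence(sequence: str) -> str:
    """Rewrite a one-letter protein sequence in the reduced alphabet."""
    return "".join(REDUCED[residue] for residue in sequence.upper())


if __name__ == "__main__":
    print(reduce_sequence("MKVLD"))
```

Sequences simplified this way can be aligned with a correspondingly reduced similarity matrix, which is the kind of comparison the abstract describes.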

13.
Yu YP, Wu SH. Chirality, 2001, 13(5): 231-235
Among the three chiral columns, CHIROBIOTIC T, CHIRALPAK WH, and CHIRALCEL OD-R, tested for the separation of racemic amino acids and N-acetyl-amino acids, only the CHIROBIOTIC T chiral column, which is based on the covalently bonded amphoteric glycopeptide teicoplanin as the stationary phase ligand, could be successfully developed to enantiomerically separate racemic amino acids and N-acetyl-amino acids simultaneously. This method can be used to determine the enantiomeric composition of amino acids and N-acetyl-amino acids in the catalysis of D-aminoacylase or L-aminoacylase and the conversion rate of N-acylamino acid racemases.

14.
Amino acids and neutral sugars in particulate matter were measured weekly from April 1987 to March 1988 in Lake Nakanuma, Japan. Changes in concentrations of total amino acids and total neutral sugars corresponded to those of chlorophyll a. The composition of amino acids varied little seasonally and vertically. In contrast, the composition of neutral sugars changed seasonally and vertically. We discuss the relationship between the changes in the biochemical components and environmental factors.

15.
Identifying the subcellular localization of proteins is particularly helpful in the functional annotation of gene products. In this study, we use Machine Learning and Exploratory Data Analysis (EDA) techniques to examine and characterize amino acid sequences of human proteins localized in nine cellular compartments. A dataset of 3,749 protein sequences representing human proteins was extracted from the SWISS-PROT database. Feature vectors were created to capture specific amino acid sequence characteristics. Relative to a Support Vector Machine, a Multi-layer Perceptron, and a Naive Bayes classifier, the C4.5 Decision Tree algorithm was the most consistent performer across all nine compartments in reliably predicting the subcellular localization of proteins based on their amino acid sequences (average Precision=0.88; average Sensitivity=0.86). Furthermore, EDA graphics characterized essential features of proteins in each compartment. As examples, proteins localized to the plasma membrane had higher proportions of hydrophobic amino acids; cytoplasmic proteins had higher proportions of neutral amino acids; and mitochondrial proteins had higher proportions of neutral amino acids and lower proportions of polar amino acids. These data showed that the C4.5 classifier and EDA tools can be effective for characterizing and predicting the subcellular localization of human proteins based on their amino acid sequences.
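The two metrics quoted in this abstract follow the usual definitions: precision is TP / (TP + FP) and sensitivity is TP / (TP + FN). The Python sketch below computes both for one compartment. The counts are made up for illustration and are not data from the study.

```python
def precision(true_pos: int, false_pos: int) -> float:
    """Fraction of predicted members of a compartment that are correct."""
    return true_pos / (true_pos + false_pos)


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of actual members of a compartment that are recovered."""
    return true_pos / (true_pos + false_neg)


if __name__ == "__main__":
    # Hypothetical counts for a single compartment (not from the study).
    tp, fp, fn = 88, 12, 14
    print(round(precision(tp, fp), 2), round(sensitivity(tp, fn), 2))
```

Averaging these per-compartment values over all nine compartments would give figures of the kind the abstract reports.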

16.
Growing-finishing pigs should consume each day the minimum amounts of energy and amino acids needed for maximum lean deposition. This should optimize performance traits, carcass leanness, and N excretion. These ideal conditions are difficult to achieve under experimental or farm conditions due to the factors affecting amino acid requirements and feed intake on a daily basis. Lean deposition rate and sex are two of the major factors affecting amino acid needs. If possible, maximum lean deposition rates should be determined for each herd in order to customize feeding programs, and split-sex feeding will improve N utilization.

Amino acid requirements have been determined empirically and by the factorial method. The latter is preferred if the efficiency of use of absorbed amino acids can be accurately determined. Development of computer models will likely be needed to accomplish this. Apparent ileal digestibility of amino acids is the most practical means of estimating amino acid absorption at present, although it likely overestimates amino acid availability for some amino acids.

Crystalline amino acids can be used to improve amino acid balance and reduce excessive intake of protein, which should improve feed efficiency. A portion of the high-quality protein feeds in pig diets can be replaced by synthetic amino acids without sacrificing performance, but the effects of these substitutions on carcass merit are uncertain.

Excretion of N, and the concomitant reduction of N in manure that has to be disposed of, can be manipulated nutritionally by increased use of crystalline amino acids to lower dietary protein, by use of highly digestible feedstuffs and by precise matching of amino acid needs to amino acid supply. Use of these factors could lead to a reduction in total N wastes of 20–30%.


17.
Cammarata KV, Schmidt GW. Biochemistry, 1992, 31(10): 2779-2789
AB96, a gene encoding a Pisum sativum chlorophyll a/b binding protein [Coruzzi et al. (1983) J. Biol. Chem. 258, 1399-1402], can be expressed in Escherichia coli and reconstituted with pigments by the procedure described by Plumley and Schmidt [(1987) Proc. Natl. Acad. Sci. U.S.A. 84, 146-150]. Following purification by polyacrylamide gel electrophoresis, the reconstituted pigment-protein complex (CP2) is shown to have similar pigment-binding characteristics to native CP2 complexes isolated from thylakoid membranes. Therefore, the AB96 gene product contains binding sites for chlorophylls a and b and xanthophylls, all of which are necessary for optimal reconstitution in vitro. Absorption, fluorescence, and circular dichroism spectroscopy indicate that the pigments are oriented accurately and that chlorophylls a and b are adjoined for energy transfer. Studies with proteins produced after deletion mutagenesis of AB96 indicate that NH2-terminal amino acids 1-21 and COOH-terminal amino acids 219-228 do not play a role in pigment binding. In contrast, amino acids 50-57 and 204-212 (encompassing one of three conserved histidine residues) are essential for reconstitution. Residues near the presumed NH2- and COOH-terminal alpha-helix boundaries (22-49 and 213-218, respectively) affect the stability of reconstituted CP2 during electrophoresis at 4 degrees C. Correlation of diminished chlorophyll a binding with disappearance of a negative circular dichroism near 684 nm suggests that amino acids 213-218 near the COOH-terminal boundary of the third membrane-spanning helix affect the binding of some chlorophyll a molecules.

18.
An integrated family of amino acid sequence analysis programs   Total citations: 12, self-citations: 0, citations by others: 12
In recent years abundant sequence data has become available due to the rapid progress in protein and DNA sequencing techniques. The exact three-dimensional structures, however, are available only for a fraction of proteins with known sequences. For many purposes the primary amino acid sequence of a protein can be directly used to predict important structural parameters. However, mathematical presentation of the calculated values often makes interpretation difficult, especially if many proteins must be analysed and compared. Here we introduce a broad-based, user-defined analysis of amino acid sequence information. The program package is based on published algorithms and is designed to access standard protein data bases, calculate hydropathy, surface probability and flexibility values and perform secondary structure predictions. The data output is in an ‘easy-to-read’ graphic format and several parameters can be superimposed within a single plot in order to simplify data interpretations. Additionally, this package includes a novel algorithm for the prediction of potential antigenic sites. Thus the software package presented here offers a powerful means of analysing an amino acid sequence for the purpose of structure/function studies as well as antigenic site analyses. These algorithms were written to function in context with the UWGCG (University of Wisconsin Genetics Computer Group) program collection, and are now distributed within that package. Received on March 20, 1987; accepted on September 4, 1987

19.
I have observed that in multiple regression the number of codons specifying amino acids in the genetic code is positively correlated with the isoelectric point of amino acids and their molecular weight. Therefore basic amino acids are, on average, codified in the genetic code by a larger number of codons, which seems to imply that the genetic code originated in an acidic 'intracellular' environment. Moreover, I compare the proteins from Picrophilus torridus and Thermoplasma volcanium, which have different intracellular pH and I define the ranks of acidophily for the amino acids. A simple index of acidophily (AI), which can be easily obtained from acidophily ranks, can be associated to any protein and, therefore, can also be associated to the genetic code if the number of synonymous codons attributed to the amino acids in the code is assumed to be the frequency with which the amino acids appeared in ancestral proteins. Finally, the sampling of the variable AI among organisms having an intracellular pH less than or equal to 6.6 and those having a non-acidic intracellular pH leads to the conclusion that the value of the genetic code's AI is not typical of proteins of the latter organisms. As the genetic code's AI value is also statistically not different from that of proteins of the organisms having an acidic intracellular pH, this supports the hypothesis that the structuring of the genetic code took place in acidic pH conditions.
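The quantity this abstract starts from, the number of codons assigned to each amino acid, can be tabulated directly from the standard genetic code. The Python sketch below does only that. It does not reproduce the regression or the acidophily index, and the variable names are assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# Standard genetic code, listed for codons in TCAG order
# (first base varies slowest, third base fastest); '*' marks stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons per amino acid, with stop codons excluded.
CODON_COUNTS = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

if __name__ == "__main__":
    for amino_acid, count in sorted(CODON_COUNTS.items()):
        print(amino_acid, count)
```

These counts could then serve as the per-residue frequencies that the abstract assumes when it assigns an acidophily index to the code itself.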

20.

Background

The standard genetic code (SGC) is a unique set of rules which assign amino acids to codons. Similar amino acids tend to have similar codons indicating that the code evolved to minimize the costs of amino acid replacements in proteins, caused by mutations or translational errors. However, if such optimization in fact occurred, many different properties of amino acids must have been taken into account during the code evolution. Therefore, this problem can be reformulated as a multi-objective optimization task, in which the selection constraints are represented by measures based on various amino acid properties.

Results

To study the optimality of the SGC we applied a multi-objective evolutionary algorithm and we used the representatives of eight clusters, which grouped over 500 indices describing various physicochemical properties of amino acids. In this way we avoided an arbitrary choice of amino acid features as optimization criteria. As a consequence, we were able to conduct a more general study on the properties of the SGC than the ones presented so far in other papers on this topic. We considered two models of the genetic code, one preserving the characteristic codon block structure of the SGC and the other without this restriction. The results revealed that the SGC could be significantly improved in terms of error minimization; hence, it is not fully optimized. Its structure differs significantly from the structure of the codes optimized to minimize the costs of amino acid replacements. On the other hand, using newly defined quality measures that placed the SGC in the global space of theoretical genetic codes, we showed that the SGC is definitely closer to the codes that minimize the costs of amino acid replacements than to those maximizing them.

Conclusions

The standard genetic code most likely represents only a partially optimized system, which emerged under the influence of many different factors. Our findings can be useful to researchers involved in modifying the genetic code of living organisms and designing artificial ones.
