Found 20 similar articles; search took 15 ms
1.
Sequence alignment is a common method for finding protein structurally conserved/similar regions. However, sequence alignment
is often not accurate if sequence identities between to-be-aligned sequences are less than 30%. This is because, for these
sequences, different residues may play similar structural roles and are incorrectly aligned during sequence alignment
using a substitution matrix consisting of 20 types of residues. Based on the similarity of physicochemical features, residues
can be clustered into a few groups. Using such simplified alphabets, the complexity of protein sequences is reduced and at
the same time the key information encoded in the sequences remains. As a result, the accuracy of sequence alignment might
be improved if the residues are properly clustered. Here, by using a database of aligned protein structures (DAPS), a new
clustering method based on the substitution scores is proposed for the grouping of residues, and substitution matrices of
residues at different levels of simplification are constructed. The validity of the reduced alphabets is confirmed by relative
entropy analysis. The reduced alphabets are applied to recognition of protein structurally conserved/similar regions by sequence
alignment. The results indicate that the accuracy or efficiency of sequence alignment can be improved with the optimal reduced
alphabet with N around 9.
Supported by the National Natural Science Foundation of China (Grant Nos. 90403120, 10474041 and 10021001) and the Nonlinear
Project (973) of the NSM
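The idea of a reduced alphabet described in this abstract can be illustrated with a minimal sketch. The 9-group partition below is an assumption chosen by rough physicochemical similarity for illustration only; it is not the clustering derived from DAPS in the paper, and the helper names `reduce_sequence` and `identity` are hypothetical.

```python
# Illustrative 9-group alphabet (an assumption, NOT the paper's clustering).
# Each group is represented by its first letter.
GROUPS = ["C", "G", "P", "AST", "DE", "NQ", "KRH", "ILVM", "FWY"]
REDUCED = {aa: group[0] for group in GROUPS for aa in group}


def reduce_sequence(seq):
    """Rewrite a protein sequence in the reduced alphabet."""
    return "".join(REDUCED[aa] for aa in seq.upper())


def identity(a, b):
    """Fraction of identical positions in two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


a, b = "ILKDF", "VMRES"
full_identity = identity(a, b)
reduced_identity = identity(reduce_sequence(a), reduce_sequence(b))
```

In this toy pair no position matches in the 20-letter alphabet, while most positions match after reduction, which is the effect the abstract relies on for low-identity sequences.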
2.
《中国科学:生命科学英文版》2007,(3)
Sequence alignment is a common method for finding protein structurally conserved/similar regions. However, sequence alignment is often not accurate if sequence identities between to-be-aligned sequences are less than 30%. This is because, for these sequences, different residues may play similar structural roles and are incorrectly aligned during sequence alignment using a substitution matrix consisting of 20 types of residues. Based on the similarity of physicochemical features, residues can be clustered into a few groups. Using such simplified alphabets, the complexity of protein sequences is reduced and at the same time the key information encoded in the sequences remains. As a result, the accuracy of sequence alignment might be improved if the residues are properly clustered. Here, by using a database of aligned protein structures (DAPS), a new clustering method based on the substitution scores is proposed for the grouping of residues, and substitution matrices of residues at different levels of simplification are constructed. The validity of the reduced alphabets is confirmed by relative entropy analysis. The reduced alphabets are applied to recognition of protein structurally conserved/similar regions by sequence alignment. The results indicate that the accuracy or efficiency of sequence alignment can be improved with the optimal reduced alphabet with N around 9.
3.
Reduced or simplified amino acid alphabets group the 20 naturally occurring amino acids into a smaller number of representative protein residues. To date, several reduced amino acid alphabets have been proposed, which have been derived and optimized by a variety of methods. The resulting reduced amino acid alphabets have been applied to pattern recognition, generation of consensus sequences from multiple alignments, protein folding, and protein structure prediction. In this work, amino acid substitution matrices and statistical potentials were derived based on several reduced amino acid alphabets, and their performance was assessed in a large benchmark for the tasks of sequence alignment and fold assessment of protein structure models, using the standard alphabet of 20 amino acids as a reference frame. The results showed that a large reduction in the total number of residue types does not necessarily translate into a significant loss of discriminative power for sequence alignment and fold assessment. Therefore, some definitions of a few residue types are able to encode most of the relevant sequence/structure information that is present in the 20 standard amino acids. Based on these results, we suggest that the use of reduced amino acid alphabets may help increase the accuracy of current substitution matrices and statistical potentials for the prediction of protein structure of remote homologs.
4.
Compounds of the DAP-AA series, each composed of an amino acid (AA) and a dialkyl phosphoryl group (DAP), are basic elements of life chemistry. Self-catalysis of DAP-AA yields self-assembled oligopeptides, even in aqueous medium at 38°C. Oligonucleotides can also be assembled through the phosphorylation of nucleosides by DAP-AA. DAP-AA acts as both the energy source and the phosphoryl donor for the synthesis of nucleic acids and proteins. A general expression for the self-assembly system is proposed.
5.
Summary: The lipophilicity (or hydrophobicity) of amino acids is an important property relevant for protein folding and therefore of great interest in protein engineering. For peptides or peptidomimetics of potential therapeutic interest, lipophilicity is related to absorption and distribution, and thus indirectly relates to their bioactivity. A rationalization of peptide lipophilicity requires basic knowledge of the lipophilicity of the constituent amino acids. In the present contribution we review methods to measure or calculate the lipophilicities of amino acids, including unusual amino acids, and compare various lipophilicity scales.
6.
Correlations of amino acids in proteins (total citations: 2; self-citations: 0; citations by others: 2)
A correlation analysis among the 20 amino acids is performed for four protein structural classes (α, β, α/β, and α+β) in a total of 204 proteins. The correlation relationships among amino acids can be classified into the following four types: (1) strong positive correlation, (2) strong negative correlation, (3) weak correlation, and (4) no correlation. The correlation relationships are different for different proteins and are correlated with the features of their structural classes. The amino acids with a weak correlation relationship can be treated as independent basis functions for the space in which proteins are defined. The amino acids with large correlation coefficients are linearly correlated with each other and are not independent. The strong correlation among amino acids reflects their mutually constrained relationship, as exhibited by their relevant structural features. The information obtained through the correlation analysis is used for predicting protein structural classes, and a better prediction quality is obtained than with simple geometric distance methods that do not take the correlation effects into account.
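As a rough illustration of what a correlation between two amino acids across a set of proteins could look like, the sketch below computes a Pearson coefficient between their per-protein composition fractions. This is an assumption: the abstract does not state which quantity or coefficient the authors used, and the function names and toy sequences are hypothetical.

```python
from math import sqrt


def fraction(seq, aa):
    """Fraction of residues in seq that are the amino acid aa."""
    return seq.count(aa) / len(seq)


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def composition_correlation(proteins, aa1, aa2):
    """Correlate the content of aa1 with the content of aa2 across proteins."""
    return pearson([fraction(p, aa1) for p in proteins],
                   [fraction(p, aa2) for p in proteins])


# Toy sequences (made up): A and L rise together while G falls.
toy = ["ALGGGG", "AALLGG", "AAALLL"]
r_al = composition_correlation(toy, "A", "L")
r_ag = composition_correlation(toy, "A", "G")
```

Under this reading, a coefficient near +1 or -1 would correspond to the strong positive or negative correlation types listed in the abstract, and values near zero to weak or no correlation.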
7.
Intrinsically disordered regions (IDRs) play an important role in key biological processes and are closely related to human diseases. IDRs have great potential to serve as targets for drug discovery, most notably in disordered binding regions. Accurate prediction of IDRs is challenging because their genome-wide occurrence and low ratio of disordered residues make them difficult targets for traditional classification techniques. Existing computational methods mostly rely on sequence profiles to improve accuracy, which is time-consuming and computationally expensive. This article describes an ab initio, sequence-only prediction method, based on reduced amino acid alphabets and convolutional neural networks (CNNs), that tries to overcome the challenge of accurate prediction posed by IDRs. We experiment with six different 3-letter reduced alphabets. We argue that the dimensional reduction in the input alphabet facilitates the detection of complex patterns within the sequence by the convolutional step. Experimental results show that our proposed IDR predictor performs at the same level as or outperforms other state-of-the-art methods in the same class, achieving an accuracy of 0.76 and an AUC of 0.85 on the publicly available Critical Assessment of protein Structure Prediction dataset (CASP10). Therefore, our method is suitable for proteome-wide disorder prediction, yielding similar or better accuracy than existing approaches at a faster speed.
8.
Taurine (Tau) and the small neutral amino acids glycine (Gly), serine (Ser), threonine (Thr), and alanine (Ala) were measured in 53 brain areas of 3- and 29-month-old male Fischer 344 rats. The ratio of the highest to the lowest level was 34 for Tau, 9.1 for Thr, 7.6 for Gly and Ser, and 6.5 for Ala. The heterogeneity was found in numerous areas; for example, Tau levels were more than 90 nmol/mg protein in 6 areas, and less than 20 nmol/mg protein in 10 areas. Similar heterogeneity was found with the other amino acids. The relative distribution of the small neutral amino acids showed several similarities; Tau distribution was different. With age, four amino acids decreased in 10–18 areas, and increased in only 1–3, while Thr increased in more areas than it decreased. The five amino acids of this paper, and the four of the previous paper, are among the amino acids at the highest levels in the brain; the sequence in their levels shows considerable regional heterogeneity.
9.
Knowing protein structure and inferring function from it is one of the main issues of computational structural biology, and often the first step is studying protein secondary structure. There have been many attempts to predict protein secondary structure content. Previous attempts assumed that the content of protein secondary structure can be predicted successfully using information on the amino acid composition of a protein. Recent methods achieved remarkable prediction accuracy by using expanded composition information. The overall average error of the most successful method is 3.4%. Here, we demonstrate that even if we use only the simple amino acid composition information, it is possible to improve the prediction accuracy significantly if evolutionary information is included. The idea is motivated by the observation that evolutionarily related proteins share similar structures. After calculating the homolog-averaged amino acid composition of a protein, which can be easily obtained from the multiple sequence alignment produced by running PSI-BLAST, those 20 numbers are learned by multiple linear regression, an artificial neural network, and support vector regression. The overall average error of the method using support vector regression is 3.3%. It is remarkable that we obtain comparable accuracy without utilizing expanded composition information such as pair-coupled amino acid composition. This work again demonstrates that amino acid composition is a fundamental characteristic of a protein. It is anticipated that our novel idea can be applied to many areas of protein bioinformatics where amino acid composition information is utilized, such as subcellular localization prediction, enzyme subclass prediction, domain boundary prediction, signal sequence prediction, and prediction of unfolded segments in a protein sequence, to name a few.
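The homolog-averaged amino acid composition mentioned in this abstract can be sketched as follows. The unweighted mean over homologs is an assumption (the abstract does not give the averaging scheme), PSI-BLAST itself is not run here, and the function names are hypothetical; the homolog sequences are taken as already extracted from an alignment.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """20-element vector of amino acid fractions; gaps and unknown symbols are skipped."""
    residues = [aa for aa in seq.upper() if aa in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    return [residues.count(aa) / len(residues) for aa in AMINO_ACIDS]


def homolog_averaged_composition(homologs):
    """Unweighted mean of the composition vectors of a protein and its homologs."""
    vectors = [composition(seq) for seq in homologs]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Toy input (made up): a query and one homolog taken from an alignment row.
avg = homolog_averaged_composition(["AAAA", "AA--CC"])
```

The resulting 20 numbers are the kind of feature vector that the abstract describes feeding to a regression model.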
10.
Paik MJ, Lee HJ, Kim KR 《Journal of chromatography. B, Analytical technologies in the biomedical and life sciences》2005,821(1):94-104
Simultaneous profiling analysis of urinary amino acids (AAs) and carboxylic acids (CAs) was combined with retention index (I) analysis for graphic recognition of abnormal metabolic states. The temperature-programmed I values of the AA and CA standards measured as ethoxycarbonyl (EOC)/methoxime (MO)/tert-butyldimethylsilyl (TBDMS) derivatives were used as the reference I values. Urine samples were subjected to sequential EOC, MO and TBDMS reactions for analysis by gas chromatography (GC) and GC-mass spectrometry. The complex GC profiles were then transformed into their respective I patterns in bar-graph form by plotting the normalized peak area ratios (%) of the identified AAs and CAs against their reference I values as the identification numbers. When the present method was applied to infant urine specimens from normal controls and patients with inherited metabolic diseases such as phenylketonuria, maple syrup urine disease, methylmalonic aciduria or isovaleric aciduria, each bar-graph I pattern more distinctly displayed the quantitative abundances of urinary AAs and CAs on a qualitative I scale, thus allowing graphic discrimination between normal and abnormal states.
11.
Lorenzo de Napoli, Ernesto Fattorusso, Luciano Mayol, Ettore Novellino 《Biochemical Systematics and Ecology》1984,12(1):19-21
Fifteen free protein amino acids have been quantitatively determined in nine green algae belonging to the order Siphonales. In all the species examined, aspartic acid, glutamic acid, alanine, glycine and serine dominate. Wide differences in the relative amounts of these amino acids among the various species, regardless of genus, were observed.
12.
Yoshizawa F 《Biochemical and biophysical research communications》2004,313(2):417-422
Recent advances in the understanding of mRNA translation have facilitated molecular studies on the regulation of protein synthesis by nutrients and the interplay between nutrients and hormonal signals. Numerous reports have established that, in skeletal muscle, the branched-chain amino acids (BCAAs) have the unique ability to initiate signal transduction pathways that modulate translation initiation. Of the BCAAs, leucine is the most potent. Oral administration of leucine to food-deprived rats enhances muscle protein synthesis, in part, through activation of the mRNA binding step of translation initiation. Interestingly, leucine signaling in skeletal muscle differs from that in liver, suggesting that the responses may be tissue specific. The purpose of this paper was to briefly review the current knowledge of how BCAAs act as regulators of protein synthesis in physiologically important tissues, with particular focus on the mechanisms by which BCAAs regulate translation initiation.
13.
Knowledge of amino acid composition alone is verified here to be sufficient for recognizing the structural class (α, β, α+β, or α/β) of a given protein with an accuracy of 81%. This is supported by results from exhaustive enumerations of all conformations for all sequences of simple, compact lattice models consisting of two types (hydrophobic and polar) of residues. Different compositions exhibit strong affinities for certain folds. Within the limits of validity of the lattice models, two factors appear to determine the choice of particular folds: 1) the coordination numbers of individual sites and 2) the size and geometry of non-bonded clusters. These two properties, collectively termed the distribution of non-bonded contacts, are quantitatively assessed by an eigenvalue analysis of the so-called Kirchhoff or adjacency matrices obtained by considering the non-bonded interactions on a lattice. The analysis permits the identification of conformations that possess the same distribution of non-bonded contacts. Furthermore, some distributions of non-bonded contacts are favored entropically, due to their high degeneracies. Thus, a competition between enthalpic and entropic effects is effective in determining the choice of a distribution for a given composition. Based on these findings, an analysis of non-bonded contacts in protein structures was made. The analysis shows that proteins belonging to the four distinct folding classes exhibit significant differences in their distributions of non-bonded contacts, which more directly explains the success in predicting structural class from amino acid composition. Proteins 29:172–185, 1997. Published 1997 Wiley-Liss, Inc. This article is a US Government work and, as such, is in the public domain in the United States of America.
14.
Uterine tubal fluids (UTF) were collected daily over a 214-day period (March through August) from three mares. Individual UTF samples identified by day of estrous cycle for five complete cycles within this six-month span were analyzed for free amino acids and total protein. Biochemical comparisons were made to blood plasma by drawing samples daily from each mare. Free amino acids and total protein were also determined in follicular fluids collected from three different mares on days 5 and 6 of standing estrus. The free amino acid level of UTF was significantly greater than the amino acid concentration in blood plasma or follicular fluid. The highest concentration of amino acids in UTF was on day 13. Cyclic trends were observed for the amino acids histidine, methionine, half-cystine, serine, proline, glycine, alanine, isoleucine, and leucine. Glycine and alanine were found in the highest concentrations in UTF, peaking on day 17 of the estrous cycle. Protein concentration in UTF was highest on day 13 and lowest on days 7 and 19. Protein values for diestrus (33.1 mg/ml) were significantly greater (p<0.05) than for estrus (28.0 mg/ml).
15.
Heterogeneous distribution of functionally important amino acids in brain areas of adult and aging humans (total citations: 2; self-citations: 0; citations by others: 2)
The regional distribution of seven amino acids thought to have inhibitory neurotransmitter or neurotransmitter precursor function (GABA, glycine, taurine, serine, threonine, phenylalanine, and tyrosine) was determined in 52 discrete brain areas of adult and old humans. Significant heterogeneity was found, with 3- to 16-fold differences in levels in the various regions analyzed. The patterns of distribution were somewhat different from those in the adult or old rat brain. Relatively few changes were seen in old brain. Heterogeneity in distribution has to be taken into account in assessing physiological changes in amino acid levels and metabolism. Special issue dedicated to Dr. Claude Baxter.
16.
Diastereoisomeric 4-substituted acidic amino acids occur in characteristic associations in the green parts of some species of the Filicinae. Subspecies of Phyllitis scolopendrium accumulate 2(S),4(R)-4-methylglutamic acid, 2(S)-4-methyleneglutamic acid and the two diastereoisomers of 2(S)-4-hydroxy-4-methylglutamic acid, the last two occurring at relative concentrations of 3:1. All Asplenium species investigated were distinctive in accumulating 2(S),4(R)-4-methylglutamic acid, the two diastereoisomers of 2(S)-4-hydroxy-4-methylglutamic acid, and the two diastereoisomers of 2(S)-4-hydroxy-2-aminopimelic acid in a characteristic concentration ratio. Some Polystichum species do not accumulate 4-substituted acidic amino acids, whereas others accumulate both diastereoisomers of 2(S)-4-hydroxy-4-methylglutamic acid and of 2(S)-4-hydroxy-2-aminopimelic acid, and thus resemble Asplenium species. The seasonal variation in the concentration of 4-substituted acidic amino acids in the green parts of Phyllitis, Asplenium and Polystichum species has also been determined.
17.
《Animal : an international journal of animal bioscience》2023,17(1):100684
Dietary proteins need to be digested first while free amino acids (AAs) and small peptides are readily available for absorption and rapidly appear in the blood. The rapid postprandial appearance of dietary AA in the systemic circulation may result in inefficient AA utilisation for protein synthesis of peripheral tissues if other nutrients implicated in AA and protein metabolism are not available at the same time. The objective of this experiment was to compare the postprandial concentrations of plasma AA and other metabolites after the ingestion of a diet that provided AA either as proteins or as free AA and small peptides. Twenty-four male growing pigs (38.8 ± 2.67 kg) fitted with a jugular catheter were assigned to one of three diets that provided AA either in protein form (INT), free AA and small peptides (HYD), or as free AA (FAA). After an overnight fast and initial blood sampling, a small meal was given to each pig followed by serial blood collection for 360 min. Postprandial concentrations of plasma AA, glucose, insulin, and urea were then measured from the collected blood. Non-linear regression was used to summarise the postprandial plasma AA kinetics. Fasting concentrations of urea and some AA were higher (P < 0.05) while postprandial plasma insulin and glucose were lower (P < 0.01) for INT than for HYD and FAA. The area under the curve of plasma concentration after meal distribution was lower for INT for most AAs (P < 0.05), resulting in a flatter curve compared to HYD and FAA. This was the result of the slower appearance of dietary AA in the plasma when proteins are fed instead of free AA and small peptides. The flatter curve may also result from more AAs being metabolised by the intestine and liver when INT was fed. The metabolism of AA of the intestine and liver was higher for HYD than FAA. Providing AA as proteins or as free AA and small peptides affected the postprandial plasma kinetics of AA, urea, insulin, and glucose. 
Whether the flat kinetics when feeding proteins has a positive or negative effect on AA metabolism still needs to be explored.
18.
By using three-dimensional (3D) structure alignments and a previously published method to determine Conserved Key Amino Acid Positions (CKAAPs), we propose a theoretical method to design mutations that can be used to morph protein folds. The original Paracelsus challenge, met by several groups, called for the engineering of a stable but different structure by modifying less than 50% of the amino acid residues. We have used the sequences with the Protein Data Bank (PDB) identifiers 1ROP and 2CRO, which were previously used in the Paracelsus challenge by those groups, and suggest mutations to CKAAPs to morph the protein fold. The total number of mutations suggested is less than 40% of the starting sequence, theoretically improving on the challenge results. From secondary structure prediction experiments on the proposed mutant sequences, we observe that each of the suggested mutant protein sequences likely folds to a different, non-native, potentially stable target structure. These results are an early indicator that analyses using structure alignments leading to the CKAAPs of a given structure are of value in protein engineering experiments.
19.
Summary: The enzymatic resolution of racemic phenylglycine, phenylglycinol and phenylalaninol has been studied in organic solvents under a variety of experimental conditions. Subtilisin in 3-methyl-3-pentanol was effective for the resolution of phenylglycine esters via N-acylation with trifluoroethyl butyrate. Porcine pancreatic lipase in ethyl acetate gave satisfactory results in the resolution of phenylglycinol and phenylalaninol; the α or β position of the phenyl group was found to influence both the rate and the chemoselectivity of the reaction.
20.
M. H. M. N. Senden, A. J. G. M. Van Der Meer, J. Limborgh, H. Th. Wolterbeek 《Plant and Soil》1992,142(1):81-89
Major amino acids and organic acids in xylem exudates of tomato plants were separated by reversed phase high performance liquid
chromatography (RP-HPLC) and quantified by UV detection. Before separation, amino acids were converted into their phenylisothiocyanate
(PITC) derivatives. In a single run, Asp, Glu, Ser, Gln, His, Thr, Ala, Tyr, Val, Met, Cys, Ile, Leu, Phe, and Lys could be
separated and detected down to the pmol level. Unresolved peaks were obtained for Asn and Gly and for Arg and Pro. For organic
acid analysis, exudates were pre-treated by perfusion over a prepacked Adsorbex SCX cation exchange column, to eliminate exudate
amino acids. Elution recoveries for organic acids were close to 100%. The exudate organic acids were separated by ion-suppression
RP-HPLC, and peaks could be resolved for L-malic acid, malonic acid, maleic acid, citric acid and fumaric acid,
down to the pmol level. UV signals for exudate ascorbic acid and succinic acid were below the limits of detection. Determination
of oxalic acid and tartaric acid was impossible, due to the presence of the exudate salt peak in the chromatogram. The results
indicate the potential of the methods applied, and show the applicability of RP-HPLC analysis for the determination of both
amino acids and organic acids in xylem exudates.