Similar literature
1.
There is indirect evidence that the amino acid composition of proteins depends on their dimension. The amino acid composition of a nonredundant set of about 550,000 proteins was determined and it was observed that, in the range of 50-200 residues, the percentage of occurrence of most of the residue types significantly depends on protein dimension. This result should prove useful in analyzing protein sequences and genomics.
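
The length dependence described above can be explored with a small composition tally. This is a minimal sketch, not the authors' procedure: the 50-residue bin width, the pooling of all sequences within a bin, and the function names are assumptions made for illustration.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Return the percent occurrence of each standard residue in seq."""
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return {aa: 0.0 for aa in AMINO_ACIDS}
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}


def composition_by_length(seqs, bin_width=50):
    """Group sequences into length bins and return the pooled composition per bin.

    Keys are the lower bound of each bin (0, 50, 100, ... for bin_width=50).
    Pooling weights every residue equally, so long chains count more than
    short ones within a bin (an assumption of this sketch).
    """
    pooled = {}
    for seq in seqs:
        start = (len(seq) // bin_width) * bin_width
        pooled.setdefault(start, []).append(seq)
    return {start: composition("".join(group))
            for start, group in sorted(pooled.items())}
```

Comparing the per-bin percentages for one residue type across the 50-200 residue range would show whether its frequency drifts with chain length.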

2.
Liang HK, Huang CM, Ko MT, Hwang JK. Proteins 2005, 59(1): 58-63
Structural analysis is useful in elucidating structural features responsible for enhanced thermal stability of proteins. However, due to the rapid increase of sequenced genomic data, there are far more protein sequences than the corresponding three-dimensional (3D) structures. The usual sequence-based amino acid composition analysis provides useful but simplified clues about the amino acid types related to thermal stability of proteins. In this work, we developed a statistical approach to identify the significant amino acid coupling sequence patterns in thermophilic proteins. The amino acid coupling sequence pattern is defined as any 2 types of amino acids separated by 1 or more amino acids. Using this approach, we construct the rho profiles for the coupling patterns. The rho value gives a measure of the relative occurrence of a coupling pattern in thermophiles compared with mesophiles. We found that thermophiles and mesophiles exhibit significant bias in their amino acid coupling patterns. We showed that such bias is mainly due to temperature adaptation instead of species or GC content variations. Though no single outstanding coupling pattern can adequately account for protein thermostability, we can use a group of amino acid coupling patterns having strong statistical significance (p values < 10^-7) to distinguish between thermophilic and mesophilic proteins. We found a good correlation between the optimal growth temperatures of the genomes and the occurrences of the coupling patterns (the correlation coefficient is 0.89). Furthermore, we can separate the thermophilic proteins from their mesophilic orthologs using the amino acid coupling patterns. These results may be useful in the study of the enhanced stability of proteins from thermophiles, especially when structural information is scarce. Proteins 2005. © 2005 Wiley-Liss, Inc.
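
A coupling pattern as defined above (two residue types separated by a fixed number of residues) is straightforward to count. The sketch below is illustrative only: the abstract does not give the formula for the rho value, so `relative_occurrence` uses a plain smoothed frequency ratio as a stand-in, and the pseudocount and all function names are assumptions.

```python
from collections import Counter


def coupling_counts(seqs, gap):
    """Count ordered residue pairs (a, b) separated by exactly `gap` residues."""
    step = gap + 1  # distance between the two coupled positions
    counts = Counter()
    for seq in seqs:
        for i in range(len(seq) - step):
            counts[(seq[i], seq[i + step])] += 1
    return counts


def relative_occurrence(thermo_seqs, meso_seqs, gap, pseudocount=1.0):
    """Ratio of smoothed pattern frequencies, thermophile over mesophile.

    This is NOT the paper's rho statistic; it is a simple stand-in that is
    greater than 1 when a pattern is relatively enriched in thermophiles.
    """
    thermo = coupling_counts(thermo_seqs, gap)
    meso = coupling_counts(meso_seqs, gap)
    patterns = set(thermo) | set(meso)
    t_denominator = sum(thermo.values()) + pseudocount * len(patterns)
    m_denominator = sum(meso.values()) + pseudocount * len(patterns)
    return {
        p: ((thermo[p] + pseudocount) / t_denominator)
           / ((meso[p] + pseudocount) / m_denominator)
        for p in patterns
    }
```

Running `relative_occurrence` for each gap from 1 upward gives one value per pattern and gap, which is the kind of profile the abstract describes; a significance test would still be needed before drawing conclusions.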

3.
Mononucleotide repeats (MNRs) have been systematically investigated in the genomes of eukaryotic and prokaryotic organisms. However, detailed information on the distribution of MNRs in viral genomes is limited. In this study, we examined the distribution of MNRs in 256 fully sequenced virus genomes. The distribution varied extensively across viral genomes and was significantly influenced by both genome size and CG content. Furthermore, the ratio of the observed to the expected number of MNRs (O/E ratio) appears to be influenced by both the host range and genome type of a particular virus. Additionally, the densities and frequencies of MNRs in genic regions are lower than in non-coding regions, suggesting that selective pressure acts on viral genomes. We also discuss the potential functional roles that these MNR loci could play in virus genomes. To our knowledge, this is the first analysis focusing on MNRs in viruses, and our study could have potential implications for a deeper understanding of virus genome stability and the co-evolution that occurs between a virus and its host.
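
An O/E ratio of the kind mentioned above can be sketched as follows. The abstract does not state the minimum repeat length or the null model, so both are assumptions here: repeats are maximal runs of at least `min_len` identical bases, and the expected count assumes independent bases drawn at the genome's own base frequencies.

```python
from collections import Counter
from itertools import groupby


def observed_mnrs(genome, min_len=6):
    """Count maximal runs of a single base with length >= min_len, per base."""
    counts = Counter()
    for base, run in groupby(genome.upper()):
        if base in "ACGT" and sum(1 for _ in run) >= min_len:
            counts[base] += 1
    return counts


def expected_mnrs(genome, min_len=6):
    """Expected number of such runs if bases were independent (assumed null model).

    A qualifying run starts at position 1 with probability p**min_len, and at
    each of the other n - min_len possible start positions with probability
    (1 - p) * p**min_len, because the preceding base must differ.
    """
    seq = [b for b in genome.upper() if b in "ACGT"]
    n = len(seq)
    if n < min_len:
        return {base: 0.0 for base in "ACGT"}
    freqs = Counter(seq)
    expected = {}
    for base in "ACGT":
        p = freqs[base] / n
        expected[base] = p ** min_len + (n - min_len) * (1 - p) * p ** min_len
    return expected


def oe_ratio(genome, min_len=6):
    """Observed over expected MNR count, summed over the four bases."""
    total_expected = sum(expected_mnrs(genome, min_len).values())
    if total_expected == 0:
        return float("nan")
    return sum(observed_mnrs(genome, min_len).values()) / total_expected
```

A ratio above 1 would indicate more repeats than base composition alone predicts; a ratio below 1 would indicate fewer.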

4.
Huntley MA, Golding GB. Proteins 2002, 48(1): 134-140
Simple sequence is abundant in the proteins that have been sequenced to date. However, unusual protein features such as simple sequence are not present at the same high frequency in structural databases. A subset of these simple sequences, a group with a highly repetitive nature, has been shown to be abundant in eukaryotes but not in prokaryotes. In this study, an examination of the eukaryotic proteins in the Protein Data Bank (PDB) revealed a large deficiency of low-complexity, highly repetitive protein repeats. Using simulated databases of similar samples of eukaryotic proteins taken from the National Center for Biotechnology Information (NCBI) database, it is shown that the PDB contains significantly less highly repetitive simple sequence than artificial databases of similar composition randomly derived from NCBI. When the structural data for those few PDB sequences that did contain highly repetitive simple sequence are examined in detail, it is found that in most cases the tertiary structure is unknown for the regions consisting of simple sequence. This lack of simple sequence, both in the PDB and in the structural information, suggests that this type of simple sequence may produce disordered structures that make structural characterization difficult.

5.
We present a new method for the identification of conserved patterns in a set of unaligned related protein sequences. It is able to discover patterns of a quite general form, allowing for both ambiguous positions and for variable length wildcard regions. It allows the user to define a class of patterns (e.g., the degree of ambiguity allowed and the length and number of gaps), and the method is then guaranteed to find the conserved patterns in this class scoring highest according to a significance measure defined. Identified patterns may be refined using one of two new algorithms. We present a new (nonstatistical) significance measure for flexible patterns. The method is shown to recover known motifs for PROSITE families and is also applied to some recently described families from the literature.
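
To make "ambiguous positions" and "variable length wildcard regions" concrete, the sketch below checks how many sequences contain one given flexible pattern written in PROSITE-style notation. It only matches a pattern that is already known; it does not implement the discovery or significance scoring described above. The function names are hypothetical, and only a subset of the notation is handled (single residues, `[..]`, `{..}`, `x`, and `(n)` or `(n,m)` repeats).

```python
import re

# One pattern element: optional "<", a core, an optional repeat, optional ">".
_ELEMENT = re.compile(
    r"(<?)(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?(>?)"
)


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern such as 'C-x(2,4)-C' into a regex."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        start, core, low, high, end = match.groups()
        if core == "x":                       # wildcard position
            core = "."
        elif core.startswith("{"):            # any residue except those listed
            core = "[^" + core[1:-1] + "]"
        if low is None:
            repeat = ""
        elif high is None:
            repeat = "{" + low + "}"
        else:                                 # variable-length region
            repeat = "{" + low + "," + high + "}"
        parts.append(("^" if start else "") + core + repeat + ("$" if end else ""))
    return "".join(parts)


def support(pattern, seqs):
    """Number of sequences that contain at least one match to the pattern."""
    regex = re.compile(prosite_to_regex(pattern))
    return sum(1 for seq in seqs if regex.search(seq))
```

The fraction of family members matched, `support(pattern, seqs) / len(seqs)`, is one simple way to judge how well a candidate pattern is conserved.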

6.
Statistical approaches have been applied to examine amino acid pairing preferences within parallel beta-sheets. The main chain hydrogen bonding pattern in parallel beta-sheets means that, for each residue pair, only one of the residues is involved in main chain hydrogen bonding with the strand containing the partner residue. We call this the hydrogen bonded (HB) residue and the partner residue the non-hydrogen bonded (nHB) residue, and differentiate between the favorability of a pair and that of its reverse pair, e.g. Asn(HB)-Thr(nHB) versus Thr(HB)-Asn(nHB). Significantly (p ≤ 0.000001) favoured pairings were rationalised using stereochemical arguments. For instance, Asn(HB)-Thr(nHB) and Arg(HB)-Thr(nHB) were favoured pairs, where the residues adopted favoured chi1 rotamer positions that allowed side-chain interactions to occur. In contrast, Thr(HB)-Asn(nHB) and Thr(HB)-Arg(nHB) were not significantly favoured, and could only form side-chain interactions if the residues involved adopted less favourable chi1 conformations. The favourability of hydrophobic pairs, e.g. Ile(HB)-Ile(nHB), Val(HB)-Val(nHB) and Leu(HB)-Ile(nHB), was explained by the residues adopting their most preferred chi1 and chi2 conformations, which enabled them to form nested arrangements. Cysteine-cysteine pairs are significantly favoured, although these do not form intrasheet disulphide bridges. Interactions between positively and negatively charged residues were asymmetrically preferred: those with the negatively charged residue at the HB position were more favoured. This trend was accounted for by the presence of general electrostatic interactions, which, based on analysis of distances between charged atoms, were likely to be stronger when the negatively charged residue is the HB partner. The Arg(HB)-Asp(nHB) interaction was an exception to this trend and its favorability was rationalised by the formation of specific side-chain interactions. This research provides rules that could be applied to protein structure prediction, comparative modelling and protein engineering and design. The methods used to analyse the pairing preferences are automated and detailed results are available (http://www.rubic.rdg.ac.uk/betapairprefsparallel/).

7.
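A spherical strain and a normal strain of shiitake mushroom (Lentinula edodes) were compared in terms of morphology, amino acid profile, and protein quality. Protein quality was evaluated against the current international amino acid reference patterns using several indices: amino acid score (AAS), ratio coefficient of amino acid score (SRC), the Institute of Medicine (IOM) pattern score, chemical score (CS), essential amino acid index (EAAI), and protein digestibility corrected amino acid score (PDCAAS). The results showed that, compared with the normal strain, the spherical strain did not differentiate into gills and stipe, and its nutrient composition also changed. Its crude protein content was 1.37 times that of the normal strain (32.32% versus 23.60%), and its mean total amino acid content (209.58 mg/g) was 1.28 times that of the normal strain (163.10 mg/g). These characteristics make the spherical strain a candidate breeding material for new cultivars.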
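
The two fold changes quoted above can be re-derived from the reported values, and two of the listed indices have simple textbook forms. The sketch below is an illustration under assumptions: it uses the common definitions of AAS (the lowest sample-to-reference ratio among essential amino acids) and EAAI (100 times the geometric mean of those ratios), expects inputs in mg per g protein, and leaves the reference pattern to the caller because the abstract does not list one. The function names are hypothetical.

```python
import math

# Fold changes recomputed from the values reported in the abstract.
crude_protein_fold = round(32.32 / 23.60, 2)    # spherical vs normal strain
total_amino_acid_fold = round(209.58 / 163.10, 2)


def amino_acid_ratios(sample, reference):
    """Ratio of each essential amino acid in the sample to the reference pattern.

    Both arguments map amino acid names to mg per g protein; the reference
    pattern must be supplied by the caller.
    """
    return {aa: sample[aa] / ref for aa, ref in reference.items()}


def amino_acid_score(sample, reference):
    """AAS: the ratio of the first limiting (lowest-ratio) amino acid."""
    return min(amino_acid_ratios(sample, reference).values())


def essential_amino_acid_index(sample, reference):
    """EAAI: 100 times the geometric mean of the ratios (all must be > 0)."""
    ratios = list(amino_acid_ratios(sample, reference).values())
    return 100.0 * math.exp(sum(math.log(r) for r in ratios) / len(ratios))
```

Here `crude_protein_fold` evaluates to 1.37 and `total_amino_acid_fold` to 1.28, matching the figures in the abstract.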

8.
Poor protein solubility is a common problem in high-resolution structural studies, formulation of protein pharmaceuticals, and biochemical characterization of proteins. One popular strategy to improve protein solubility is to use site-directed mutagenesis to make hydrophobic to hydrophilic mutations on the protein surface. However, a systematic investigation of the relative contributions of all 20 amino acids to protein solubility has not been done. Here, 20 variants at the completely solvent-exposed position 76 of ribonuclease (RNase) Sa are made to compare the contributions of each amino acid. Stability measurements were also made for these variants, which occur at the i+1 position of a type II beta-turn. Solubility measurements in ammonium sulfate solutions were made at high positive net charge, low net charge, and high negative net charge. Surprisingly, there was a wide range of contributions to protein solubility even among the hydrophilic amino acids. The results suggest that aspartic acid, glutamic acid, and serine contribute significantly more favorably than the other hydrophilic amino acids especially at high net charge. Therefore, to increase protein solubility, asparagine, glutamine, or threonine should be replaced with aspartic acid, glutamic acid or serine.

9.
McCaldon P, Argos P. Proteins 1988, 4(2): 99-122
We have examined oligopeptides with lengths ranging from 2 to 11 residues in protein sequences that show no obvious evolutionary relationship. All sequences in the Protein Identification Resource database were carefully classified by sensitive homology searches into superfamilies to obtain unbiased oligopeptide counts. The results, contrary to previous studies, show clear prejudices in protein sequences. The oligopeptide preferences were used to help decide the significance of sequence homologies and to improve the more general methods for detecting protein coding regions within nucleotide sequences.

10.
Cell-free protein synthesis (CFPS) systems are an attractive method to complement the usual cell-based synthesis of proteins, especially for screening approaches. The literature describes a wide variety of CFPS systems, but their performance is difficult to compare since the reaction components are often used at different concentrations. Therefore, we have developed a calculation tool based on amino acid balancing to evaluate the performance of CFPS by determining the fractional yield as the ratio of the experimentally achieved to the theoretically achievable protein molar concentration. This tool was applied to a series of experiments from our lab and to various systems described in the literature to identify systems that synthesize proteins very efficiently and those that still have potential for higher yields. The well-established Escherichia coli system showed a high efficiency in the utilization of amino acids, but interestingly, less considered systems, such as those based on Vibrio natriegens or Leishmania tarentolae, also showed exceptional fractional yields of over 70% and 90%, respectively, implying very efficient conversions of amino acids. The methods and tools described here can quickly identify when a system has reached its maximum or has limitations. We believe that this approach will facilitate the evaluation and optimization of existing CFPS systems and provide the basis for the systematic development of new CFPS systems.
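
The amino acid balancing idea can be sketched in a few lines. This is an assumed reading of the approach, not the authors' tool: the theoretical limit is taken as the protein concentration reached when the scarcest amino acid, relative to how often the protein uses it, is fully consumed, and the fractional yield is the achieved concentration divided by that limit. Function and parameter names are hypothetical.

```python
from collections import Counter


def max_protein_conc(protein_seq, aa_supply_mM):
    """Theoretical maximum protein concentration (mM) set by the limiting amino acid.

    aa_supply_mM maps one-letter residue codes to their concentration in the
    reaction; residues missing from the mapping are treated as absent (0 mM).
    """
    needs = Counter(protein_seq.upper())
    return min(aa_supply_mM.get(aa, 0.0) / n for aa, n in needs.items())


def fractional_yield(achieved_mM, protein_seq, aa_supply_mM):
    """Achieved protein concentration as a fraction of the theoretical maximum."""
    limit = max_protein_conc(protein_seq, aa_supply_mM)
    if limit <= 0:
        raise ValueError("at least one required amino acid is not supplied")
    return achieved_mM / limit
```

For a toy protein `MKK` with 2 mM each of M and K, lysine is limiting and the ceiling is 1 mM, so an achieved 0.7 mM corresponds to a fractional yield of 0.7.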

11.
Compartmentalization of cellular amino acid pools occurs in cultures of cardiac and skeletal muscle cells, but the factors involved in this are not clear. We have further defined this problem by analyzing the intracellular free leucine and the transfer RNA (tRNA)-bound leucine pool in cultures of skeletal and cardiac muscle incubated with 3H-leucine in the presence and absence of serum and amino acids. Withdrawal of nitrogen substrates caused substantial changes in leucine pool relationships, in particular a change in the degree to which intracellular free leucine and tRNA-leucine were derived from the culture medium. In separate experiments, the validity of our tRNA measurements was confirmed by measurements of the specific activity of newly synthesized ferritin after iron induction. We discuss the implications of these findings with regard to factors involved in the control of amino acid flux through the cell, as well as with regard to design of experiments using isotopic amino acids to measure rates of amino acid utilization.

12.
Amino acid analyses of the band 3 protein purified from erythrocyte membranes of control and epileptic children showed that no major structural abnormalities of this protein could be linked with the red blood cell membrane alterations previously described in child epilepsy and, consequently, the molecular basis of these alterations should be looked for elsewhere.

13.
In the present work, we use structural information to characterize a set of disease-associated single amino acid polymorphisms exhaustively. The analysis of different properties, such as substitution matrix elements, secondary structure, accessibility, free energies of transfer from water to octanol, amino acid volume, etc., suggests that many disease-causing mutations are associated with extreme changes in the value of parameters relating to protein stability. Overall, our results indicate that, while knowledge of protein structure clearly helps in understanding these mutations, a finer understanding can come only from a quantitative knowledge of protein stability and of the protein environment in the cell. Interestingly, evolutionary information from multiple sequence alignments can be used to increase our knowledge of disease-associated mutations.

14.
15.
16.
17.
The conditional probability, P(σ/x), is a statement of the probability that the value of σ will be found given the prior information that a value of x has been observed. Here σ represents any one of the secondary structure types, α, β, τ, and ρ for helix, sheet, turn, and random, respectively, and x represents a sequence attribute, including, but not limited to: (1) hydropathy; (2) hydrophobic moments assuming helix and sheet; (3) Richardson and Richardson helical N-cap and C-cap values; (4) Chou-Fasman conformational parameters for helix, Pα, for sheet, Pβ, and for turn, Pτ; and (5) Garnier, Osguthorpe, and Robson (GOR) information values for helix, Iα, for sheet, Iβ, for turn, Iτ, and for random structure, Iρ. Plots of P(σ/x) vs. x are demonstrated to provide information about the correlation between structure and attribute, σ and x. The separations between different P(σ/x) vs. x curves indicate the capacity of a given attribute to discriminate between different secondary structural types and permit comparison of different attributes. P(α/x), P(β/x), P(τ/x) and P(ρ/x) vs. x plots show that the most useful attributes for discriminating helix are, in order: hydrophobic moment assuming helix > Pα >> N-cap > C-cap ≈ Iα ≈ Iτ. The information value for turns, Iτ, was found to discriminate helix better than turns. Discrimination for sheet was found to be in the following order: Iβ >> Pβ ≈ hydropathy > Iρ ≈ hydrophobic moment assuming sheet. Three attributes, at their low values, were found to give significant discrimination for the absence of helix: Iα ≈ Pα ≈ hydrophobic moment assuming helix. Also, three other attributes were found to indicate the absence of sheet: Pβ >> Iρ ≈ hydropathy. Indications of the absence of σ could be as useful for some applications as the indication of the presence of σ.
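
The conditional probability of a structure type given an attribute value can be estimated from labeled residues by binning the attribute. This is a minimal sketch under assumptions: the abstract does not say how x was discretized, so fixed-width bins are used here, and the function name and input layout are hypothetical.

```python
import math
from collections import Counter, defaultdict


def conditional_probabilities(observations, bin_width=0.5):
    """Estimate P(structure | attribute bin) from (x, structure) pairs.

    observations: iterable of (x, structure) tuples, one per residue, where
    x is the attribute value and structure is a label such as 'alpha'.
    Returns {bin_lower_bound: {structure: probability}}.
    """
    counts = defaultdict(Counter)
    for x, structure in observations:
        counts[math.floor(x / bin_width)][structure] += 1
    table = {}
    for bin_index, per_structure in sorted(counts.items()):
        total = sum(per_structure.values())
        table[bin_index * bin_width] = {
            structure: n / total for structure, n in per_structure.items()
        }
    return table
```

Plotting each structure's probability against the bin lower bounds gives one curve per structure type; the wider the gap between curves, the better the attribute separates those types.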

18.
Secondary transporters in humans are a large group of proteins that transport a wide range of ions, metals, organic and inorganic solutes involved in energy transduction, control of membrane potential and osmotic balance, metabolic processes and in the absorption or efflux of drugs and xenobiotics. They are also emerging as important targets for development of new drugs and as target sites for drug delivery to specific organs or tissues. We have performed amino acid composition (AAC) and phylogenetic analyses and membrane topology predictions for 336 human secondary transport proteins and used the results to confirm protein classification and to look for trends and correlations with structural domains and specific substrates and/or function. Some proteins showed statistically high contents of individual amino acids or of groups of amino acids with similar physicochemical properties. One recurring trend was a correlation between high contents of charged and/or polar residues and misleading results in predictions of membrane topology, which was especially prevalent in Mitochondrial Carrier family proteins. We demonstrate how charged or polar residues located in the middle of transmembrane helices can interfere with their identification by membrane topology tools, resulting in missed helices in the prediction. Comparison of AAC in the human proteins with that in 235 secondary transport proteins from Escherichia coli revealed similar overall trends along with differences in average contents for some individual amino acids and groups of similar amino acids that are presumed to result from a greater number of functions and complexity in the higher organism.

19.
20.
Partition ratios of 8 free L-amino acids (Gln, Glu, His, Lys, Met, Ser, Thr, and Tyr) were measured in 10 different polymer/polymer aqueous two-phase systems containing 0.15 M NaCl in 0.01 M phosphate buffer, pH 7.4. The solute-specific coefficients representing the solute dipole/dipole, hydrogen-bonding and electrostatic interactions with the aqueous environment of the amino acids were determined by multiple linear regression analysis using a modified linear solvation energy relationship. The solute-specific coefficients determined in this study, together with the solute-specific coefficients reported previously for amino acids with non-polar side-chains, were used in a Quantitative Structure/Property Relationship analysis. It is shown that linear combinations of these solute-specific coefficients are correlated well with various physicochemical, structural, and biological properties of amino acids.
