1.
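Protein fold recognition is an important approach to analyzing protein structure. Using proteins with low sequence similarity as the training set, sequence-derived frequency information and properties such as hydrophobicity were extracted as fold-type features. A dataset covering 1,393 fold types was built from proteins already classified in the SCOP database, and an SVM was used to predict these 1,393 fold types. Accuracy was 99.6122% in the closed test and 79.6329% in the SCOP-based open test. In an open test on another authoritative test set, fold-level accuracy reached 64.7059% and SCOP class-level accuracy reached 76.4706%. The method can therefore predict protein folds effectively and provides a reference for ab initio protein structure prediction.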
2.
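Based on the conservation of amino acids within protein folds, and using amino acid identity, polarity, charge, and hydrophilicity/hydrophobicity as parameters, 27 fold classes were recognized starting from the amino acid sequence. A "one-versus-rest" classification strategy was adopted: scoring matrices were constructed, sequence pattern fragments were selected, and five similarity scoring functions were applied. The best prediction accuracy reached 83.46%. The results show that scoring matrices are an effective method for multi-class protein fold prediction.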
3.
Background
Although Transmembrane Proteins (TMPs) are highly important in various biological processes and pharmaceutical developments, general prediction of TMP structures is still far from satisfactory. Because TMPs have significantly different physicochemical properties from soluble proteins, current protein structure prediction tools for soluble proteins may not work well for TMPs. With the increasing number of experimental TMP structures available, template-based methods have the potential to become broadly applicable for TMP structure prediction. However, the current fold recognition methods for TMPs are not as well developed as they are for soluble proteins.
Methodology
We developed a novel TMP Fold Recognition method, TMFR, to recognize TMP folds based on sequence-to-structure pairwise alignment. The method utilizes topology-based features in alignment together with sequence profile and solvent accessibility. It also incorporates a gap penalty that depends on predicted topology structure segments. Given the difference between α-helical transmembrane proteins (αTMPs) and β-strand transmembrane proteins (βTMPs), the parameters of the scoring functions are trained separately for these two protein categories using 58 αTMPs and 17 βTMPs in a non-redundant training dataset.
Results
We compared our method with HHalign, a leading alignment tool, using a non-redundant testing dataset including 72 αTMPs and 30 βTMPs. Our method achieved 10% and 9% better accuracies than HHalign in αTMPs and βTMPs, respectively. The raw score generated by TMFR is negatively correlated with the structure similarity between the target and the template, which indicates its effectiveness for fold recognition. The result demonstrates that TMFR provides an effective TMP-specific fold recognition and alignment method.
4.
AAindex is a database of numerical indices representing various physicochemical and biochemical properties of amino acids and pairs of amino acids. It consists of two sections: AAindex1 for the amino acid index of 20 numerical values and AAindex2 for the amino acid mutation matrix of 210 numerical values. Each entry of either AAindex1 or AAindex2 consists of the definition, the reference information, a list of related entries in terms of the correlation coefficient, and the actual data. The database may be accessed through the DBGET/LinkDB system at GenomeNet (http://www.genome.ad.jp/dbget/) or may be downloaded by anonymous FTP (ftp://ftp.genome.ad.jp/db/genomenet/aaindex/).
5.
Protein products of highly expressed genes tend to favor amino acids that have lower average biosynthetic costs (i.e., they exhibit metabolic efficiency). While this trend has been observed in several studies, the specific sites where cost-reducing substitutions accumulate have not been well characterized. Toward that end, weighted costs in conserved and variable positions were evaluated across a total of 9,119 homologous proteins in four mammalian orders (primate, carnivore, rodent, and artiodactyls), which together contain a total of 20,457,072 amino acids. Degree of conservation at homologous positions in these mammalian proteins and average-weighted cost across all positions within a single protein are significantly correlated. Dividing human genes into two classes (those with and those without CpG islands in their promoters) suggests that humans also preferentially utilize less costly amino acids in highly expressed genes. In contrast to the intuitive expectation that the relatively weak selective force associated with metabolic efficiency would be a selection pressure in complex multicellular organisms, the overall level of selective constraint within the variable regions of mammalian proteins allows the metabolic efficiency to derive a reduction of overall biosynthetic cost, particularly in genes with the highest levels of expression.
6.
The paper focuses on the development of a software tool for clustering proteins according to their amino acid content. All known human proteins were clustered according to the relative frequencies of their amino acids, starting from the UniProtKB/Swiss-Prot reference database and making use of hierarchical cluster analysis. Results were compared to those based on sequence similarities. Results: Proteins display different clustering patterns according to type. Many extracellular proteins with highly specific and repetitive sequences (keratins, collagens, etc.) cluster clearly, confirming the accuracy of the clustering method. In these cases, clustering by sequence and clustering by amino acid content overlap. Proteins with a more complex structure and multiple domains (catalytic, extracellular, transmembrane, etc.), even if classified as very similar according to sequence similarity and function (aquaporins, cadherins, steroid 5-alpha reductase, etc.), showed different clustering according to amino acid content. Availability of essential amino acids according to local conditions (starvation, low or high oxygen, cell cycle phase, etc.) may be a limiting factor in protein synthesis, whatever the mRNA level. This type of protein clustering may therefore prove a valuable tool in identifying so far unknown metabolic connections and constraints.
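The idea of clustering by amino acid content can be illustrated with a minimal sketch. It is not the tool described in the abstract: the function names, the Euclidean distance, the average-linkage rule, and the toy sequences are all assumptions chosen for illustration, and only the Python standard library is used.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Return the relative frequency of each standard amino acid in seq."""
    counts = Counter(c for c in seq.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [counts[a] / total for a in AMINO_ACIDS]


def distance(u, v):
    """Euclidean distance between two composition vectors (an assumed metric)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cluster(vectors, k):
    """Naive average-linkage agglomerative clustering down to k clusters."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > k:
        def linkage(pair):
            a, b = pair
            total = sum(distance(vectors[i], vectors[j])
                        for i in clusters[a] for j in clusters[b])
            return total / (len(clusters[a]) * len(clusters[b]))

        # Merge the two clusters with the smallest average pairwise distance.
        a, b = min(combinations(range(len(clusters)), 2), key=linkage)
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return clusters


# Hypothetical toy sequences: two Gly/Pro-rich and two Leu/Lys/Glu-rich peptides.
toy = ["GPPGPPGAPGPP", "GPAGPPGPPGEP", "LLKKLLEELLKK", "LKELLKKLLEEL"]
groups = cluster([composition(s) for s in toy], k=2)
```

With these toy inputs the two repetitive Gly/Pro-rich peptides end up in one group and the Leu/Lys/Glu-rich peptides in the other, which mirrors the abstract's observation that proteins with repetitive sequences cluster clearly by composition.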
7.
Jun-ichi Kira, Gladys E. Deibler, Henry C. Krutzsch, Russell E. Martenson. Journal of neurochemistry, 1985, 44(1):134-142
The myelin basic protein (BP) of pig brain was cleaved into its constituent tryptic peptides and the amino acid composition of each was determined. Those tryptic peptides that had not been sequenced previously were cleaved with dipeptidyl peptidases and the resulting dipeptides were trimethylsilylated, separated by gas chromatography, and identified by mass spectrometry. Carboxypeptidases B and Y were used to establish the COOH-terminal sequences of some of the tryptic peptides; one tryptic peptide (sequence 76-92) was cleaved with thermolysin and the thermolytic peptides were analyzed. From the results of the present study together with those reported previously, it has been possible to determine the complete amino acid sequence of the protein. The protein consists of 172 residues and has a theoretical molecular weight of 18,604. Its amino acid sequence is identical with that reported for the homologous bovine protein with the following exceptions: Ser replaces (bovine) Ala2; His-Gly is inserted between Arg9 and Ser10; Ala replaces Ser45; His and Gly replace Gly76 and His77, respectively; Pro replaces Ser131 and Ser135; Ala is inserted between Gly142 and His143; and Gln replaces His143.
8.
9.
10.
11.
Protein Content and Amino Acid Composition of Certain Fungi Evaluated for Microbial Protein Production
C. Christias, C. Couvaraki, S. G. Georgopoulos, B. Macris, V. Vomvoyanni. Applied microbiology, 1975, 29(2):250-254
The protein and total amino acid contents of four mycelial fungal strains and one yeast were approximately the same for cultures harvested in the mid-log and early stationary growth phases. It was found that Fusarium oxysporum and Fusarium moniliforme contained approximately 30% more protein and total amino acids than Aspergillus niger. The amino acid composition of mycelial protein compares favorably with that of British Petroleum yeast protein Toprina produced commercially on hydrocarbon substrates. Fusarium spp. may be suitable for commercial production of microbial protein, especially when low-cost agricultural or industrial waste products are readily available as energy sources. Genetic manipulation of these fungi, such as induction of mutant strains through irradiation, may be desirable to obtain a mycelial product of improved yield and/or quality.
12.
Radoslaw Michalski, Jacek Zielonka, Ewa Gapys, Andrzej Marcinek, Joy Joseph, Balaraman Kalyanaraman. The Journal of biological chemistry, 2014, 289(32):22536-22553
Hydroperoxides of amino acid and amino acid residues (tyrosine, cysteine, tryptophan, and histidine) in proteins are formed during oxidative modification induced by reactive oxygen species. Amino acid hydroperoxides are unstable intermediates that can further propagate oxidative damage in proteins. The existing assays (oxidation of ferrous cation and iodometric assays) cannot be used in real-time measurements. In this study, we show that the profluorescent coumarin boronic acid (CBA) probe reacts with amino acid and protein hydroperoxides to form the corresponding fluorescent product, 7-hydroxycoumarin. 7-Hydroxycoumarin formation was catalase-independent. Based on this observation, we have developed a fluorometric, real-time assay that is adapted to a multiwell plate format. This is the first report showing real-time monitoring of amino acid and protein hydroperoxides using the CBA-based assay. This approach was used to detect protein hydroperoxides in cell lysates obtained from macrophages exposed to visible light and photosensitizer (rose bengal). We also measured the rate constants for the reaction between amino acid hydroperoxides (tyrosyl, tryptophan, and histidine hydroperoxides) and CBA, and these values (7–23 M⁻¹ s⁻¹) were significantly higher than that measured for H2O2 (1.5 M⁻¹ s⁻¹). Using the CBA-based competition kinetics approach, the rate constants for amino acid hydroperoxides with ebselen, a glutathione peroxidase mimic, were also determined, and the values were within the range of 1.1–1.5 × 10³ M⁻¹ s⁻¹. Both ebselen and boronates may be used as small molecule scavengers of amino acid and protein hydroperoxides. Here we also show formation of tryptophan hydroperoxide from tryptophan exposed to co-generated fluxes of nitric oxide and superoxide. This observation reveals a new mechanism for amino acid and protein hydroperoxide formation in biological systems.
13.
Masanori Kohmura, Noriki Nio, Yasuo Ariyoshi. Bioscience, biotechnology, and biochemistry, 2013, 77(9):2219-2224
The sweet protein monellin consists of two noncovalently associated polypeptide chains, the A chain of 44 amino acid residues and the B chain of 50 residues. Two different primary structures have been reported for each of these chains. The complete amino acid sequence of monellin was determined by a combination of FAB- and ESI-mass spectrometry, and by automatic Edman degradation.
14.
Widely used models of protein evolution ignore protein structure. Therefore, these models do not predict spatial clustering of amino acid replacements with respect to tertiary structure. One formal and biologically implausible possibility is that there is no tendency for amino acid replacements to be spatially clustered during evolution. An alternative to this is that amino acid replacements are spatially clustered and this spatial clustering can be fully explained by a tendency for similar rates of amino acid replacement at sites that are nearby in protein tertiary structure. A third possibility is that the amount of clustering exceeds that which can be explained solely on the basis of independently evolving protein sites with spatially clustered replacement rates. We introduce two simple and not very parametric hypothesis tests that help distinguish these three possibilities. We then apply these tests to 273 homologous protein families. The null hypothesis of no spatial clustering is rejected for 102 of 273 families. The explanation of spatially clustered rates but independent change among sites is rejected for 43 families. These findings need to be reconciled with the common practice of basing evolutionary inferences on models that assume independent change among sites. [Reviewing Editor: Dr. David Pollock]
15.
Protein remote homology detection is one of the most important problems in bioinformatics. Discriminative methods such as support vector machines (SVM) have shown superior performance. However, the performance of SVM-based methods depends on the vector representations of the protein sequences. Prior works have demonstrated that sequence-order effects are relevant for discrimination, but little work has explored how to incorporate the sequence-order information along with the amino acid physicochemical properties into the prediction. In order to incorporate the sequence-order effects into the protein remote homology detection, the physicochemical distance transformation (PDT) method is proposed. Each protein sequence is converted into a series of numbers by using the physicochemical property scores in the amino acid index (AAIndex), and then the sequence is converted into a fixed length vector by PDT. The sequence-order information can be efficiently included into the feature vector with little computational cost by this approach. Finally, the feature vectors are input into a support vector machine classifier to detect the protein remote homologies. Our experiments on a well-known benchmark show the proposed method SVM-PDT achieves superior or comparable performance with current state-of-the-art methods and its computational cost is considerably superior to those of other methods. When the evolutionary information extracted from the frequency profiles is combined with the PDT method, the profile-based PDT approach can improve the performance by 3.4% and 11.4% in terms of ROC score and ROC50 score respectively. The local sequence-order information of the protein can be efficiently captured by the proposed PDT and the physicochemical properties extracted from the amino acid index are incorporated into the prediction. The physicochemical distance transformation provides a general framework, which would be a valuable tool for protein-level study.
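The conversion from a variable-length sequence to a fixed-length vector can be sketched as follows. The abstract does not give the PDT formula, so this sketch assumes one plausible form: for each property index and each residue separation d from 1 to max_distance, the feature is the mean squared difference between the property values of residues d positions apart. The Kyte-Doolittle hydropathy scale stands in for an AAindex entry, and the function names are hypothetical.

```python
from statistics import mean, pstdev

# Kyte-Doolittle hydropathy values, used here as a stand-in for one AAindex entry.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def normalize(index):
    """Scale an index to zero mean and unit variance over the 20 residues."""
    mu = mean(index.values())
    sd = pstdev(index.values())
    return {aa: (value - mu) / sd for aa, value in index.items()}


def pdt_features(seq, indices, max_distance):
    """Assumed PDT form: one feature per (property index, separation d) pair.

    The result always has len(indices) * max_distance entries, regardless of
    sequence length. Non-standard residues raise KeyError.
    """
    features = []
    for index in indices:
        values = [index[aa] for aa in seq]
        for d in range(1, max_distance + 1):
            n = len(values) - d
            if n <= 0:
                raise ValueError("sequence is shorter than max_distance + 1")
            squared = ((values[j] - values[j + d]) ** 2 for j in range(n))
            features.append(sum(squared) / n)
    return features


# Two sequences of different lengths map to vectors of the same length.
short_vec = pdt_features("ACDEFGHIK", [normalize(KYTE_DOOLITTLE)], max_distance=3)
long_vec = pdt_features("ACDEFGHIKLMNPQRSTVWY", [normalize(KYTE_DOOLITTLE)], max_distance=3)
```

Because the vector length depends only on the number of indices and on max_distance, vectors like these could be passed to any fixed-input classifier, such as the SVM the abstract mentions.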
16.
The advance of next-generation sequencing technologies has made exome sequencing rapid and relatively inexpensive. A major application of exome sequencing is the identification of genetic variations likely to cause Mendelian diseases. This requires processing large amounts of sequence information and therefore computational approaches that can accurately and efficiently identify the subset of disease-associated variations are needed. The accuracy and high false positive rates of existing computational tools leave much room for improvement. Here, we develop a boosted tree regression machine-learning approach to predict human disease-associated amino acid variations by utilizing a comprehensive combination of protein sequence and structure features. On comparing our method, ENTPRISE, to the state-of-the-art methods SIFT, PolyPhen-2, MUTATIONASSESSOR, MUTATIONTASTER, and FATHMM, ENTPRISE exhibits significant improvement. In particular, on a testing dataset consisting of only proteins with balanced disease-associated and neutral variations, defined as having the ratio of neutral/disease-associated variations between 0.3 and 3, the Matthews Correlation Coefficient by ENTPRISE is 0.493 as compared to 0.432 by PPH2-HumVar, 0.406 by SIFT, 0.403 by MUTATIONASSESSOR, 0.402 by PPH2-HumDiv, 0.305 by MUTATIONTASTER, and 0.181 by FATHMM. ENTPRISE is then applied to nucleic acid binding proteins in the human proteome. Disease-associated predictions are shown to be highly correlated with the number of protein-protein interactions. Both these predictions and the ENTPRISE server are freely available for academic users as a web service at http://cssb.biology.gatech.edu/entprise/.
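The abstract compares methods by Matthews Correlation Coefficient without defining it. For reference, the standard binary-classification definition is (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The small sketch below computes it from confusion-matrix counts; the function name `mcc` and the convention of returning 0.0 when the denominator is zero are choices made here, not details taken from the paper.

```python
from math import sqrt


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect prediction).
    """
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: an undefined MCC is reported as 0.0.
        return 0.0
    return (tp * tn - fp * fn) / denominator


perfect = mcc(tp=50, tn=50, fp=0, fn=0)
chance = mcc(tp=25, tn=25, fp=25, fn=25)
```

Unlike plain accuracy, the coefficient uses all four cells of the confusion matrix, which is one reason it is commonly reported when disease-associated and neutral variants are unevenly represented.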
17.
Agustin Luz-Madrigal, Alexander Asanov, Aldo R. Camacho-Zarco, Alicia Sampieri, Luis Vaca. Journal of virology, 2013, 87(21):11894-11907
Baculoviridae is a large family of double-stranded DNA viruses that selectively infect insects. Autographa californica multiple nucleopolyhedrovirus (AcMNPV) is the best-studied baculovirus from the family. Many studies over the last several years have shown that AcMNPV can enter a wide variety of mammalian cells and deliver genetic material for foreign gene expression. While most animal viruses studied so far have developed sophisticated mechanisms to selectively infect specific cells and tissues in an organism, AcMNPV can penetrate and deliver foreign genes into most cells studied to this date. The details about the mechanisms of internalization have been partially described. In the present study, we have identified a cholesterol recognition amino acid consensus (CRAC) domain present in the AcMNPV envelope fusion protein GP64. We demonstrated the association of a CRAC domain with cholesterol, which is important to facilitate the anchoring of the virus at the mammalian cell membrane. Furthermore, this initial anchoring favors AcMNPV endocytosis via a dynamin- and clathrin-dependent mechanism. Under these conditions, efficient baculovirus-driven gene expression is obtained. In contrast, when cholesterol is reduced from the plasma membrane, AcMNPV enters the cell via a dynamin- and clathrin-independent mechanism. The result of using this alternative internalization pathway is a reduced level of baculovirus-driven gene expression. This study is the first to document the importance of a novel CRAC domain in GP64 and its role in modulating gene delivery in AcMNPV.
18.
Rapid identification of proteins based on their amino acid composition    Cited by: 1 (self-citations: 0, citations by others: 1)
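The conventional way to identify a protein is to determine its amino acid sequence, which requires a protein sequencer, demands high sample purity, and is time-consuming and costly. By comparison, a protein's amino acid composition and molecular weight are easy to measure experimentally. This paper describes a method for rapid protein identification based on composition and molecular weight. The basic idea is to compute the amino acid composition and molecular weight of every sequence in a protein sequence database, yielding a database of protein lengths, compositions, and molecular weights. Comparing the composition and related data of a target protein against this database retrieves proteins whose composition and molecular weight are close to it, giving a preliminary identification of the target. In some cases the target can even be related quite accurately to one or more proteins in the database. Following this principle, a program was designed to search the protein composition database by amino acid composition. Composition analyses of proinsulin, cellular tumor antigen p53, ubiquitin, and several other proteins confirm that amino acid composition supports reasonably good protein identification.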
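The lookup this abstract describes can be sketched roughly as below. This is an illustration under stated assumptions, not the authors' program: sequence length is used as a simple stand-in for the molecular-weight filter, the score is a sum of squared differences in percent composition, and the function names, the 10% tolerance, and the toy database entries are all hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition_percent(seq):
    """Percent composition of each standard amino acid in seq."""
    counts = Counter(c for c in seq.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return {a: 100.0 * counts[a] / total for a in AMINO_ACIDS}


def rank_candidates(query_comp, query_length, database, length_tolerance=0.1):
    """Return database entry names ordered from best to worst composition match.

    Entries whose length differs from query_length by more than the given
    fraction are skipped (length stands in for molecular weight here).
    """
    hits = []
    for name, seq in database.items():
        if abs(len(seq) - query_length) > length_tolerance * query_length:
            continue
        comp = composition_percent(seq)
        score = sum((comp[a] - query_comp.get(a, 0.0)) ** 2 for a in AMINO_ACIDS)
        hits.append((score, name))
    return [name for score, name in sorted(hits)]


# Hypothetical toy database; real use would precompute compositions once.
toy_database = {
    "toy_a": "GPPGPPGAPGPP",
    "toy_b": "LLKKLLEELLKK",
    "toy_long": "G" * 40,
}
measured = composition_percent("GPPGPAGAPGPP")  # pretend experimental composition
ranking = rank_candidates(measured, query_length=12, database=toy_database)
```

Here `toy_long` is dropped by the length filter and `toy_a` ranks first. In practice the measured composition would come from amino acid analysis rather than from a known sequence, so some residue pairs (for example Asn/Asp after acid hydrolysis) may need to be pooled before scoring.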
19.
Inferring the chronology of protein architectures is one of the central topics in origin-of-life studies and has been given much attention. Based on an amino acid evolutionary model that late amino acids were bio-synthesized prior to early counterparts, we addressed the issue by examining the structures of amino acid synthases. Despite the limited structural information on amino acid synthases, our deduction revealed that α/β was the oldest protein class, which is in good agreement with the prior fold-usage-based conclusion.
20.
A subunit of molecular weight 18,300 has been separated and isolated from seeds of Brassica campestris L. This subunit was cleaved by using cyanogen bromide, trypsin, Staphylococcus aureus V8 protease, and chymotrypsin; the fragments obtained from enzymatic and chemical cleavages were separated and isolated by polyacrylamide gel electrophoresis and gel filtration. Amino acid analyses were carried out. The complete amino acid sequence of the subunit, which contains 172 amino acid residues, has been established by the manual Edman method.