1.

Prediction of hot spot residues at protein-protein interfaces by combining machine learning and energy-based methods

Stefano Lise Cedric Archambeau Massimiliano Pontil David T Jones 《BMC bioinformatics》2009,10(1):365

Background

Alanine scanning mutagenesis is a powerful experimental methodology for investigating the structural and energetic characteristics of protein complexes. Individual amino acids are systematically mutated to alanine and the resulting changes in free energy of binding (ΔΔG) are measured. Several experiments have shown that protein-protein interactions are critically dependent on just a few residues ("hot spots") at the interface. Hot spots make a dominant contribution to the free energy of binding and, if mutated, they can disrupt the interaction. As mutagenesis studies require significant experimental effort, there is a need for accurate and reliable computational methods. Such methods would also add to our understanding of the determinants of affinity and specificity in protein-protein recognition.

2.

Prediction of mRNA polyadenylation sites by support vector machine (cited by: 3; self-citations: 0; cited by others: 3)

Cheng Y Miura RM Tian B 《Bioinformatics (Oxford, England)》2006,22(19):2320-2325

3.

Prediction of matrix metal proteinases-12 inhibitors by machine learning approaches

Bingke Li Li Hu Ying Xue Min Yang Long Huang Zhentao Zhang 《Journal of biomolecular structure & dynamics》2019,37(10):2627-2640

4.

Accurate prediction of stability changes in protein mutants by combining machine learning with structure based computational mutagenesis

Masso M Vaisman II 《Bioinformatics (Oxford, England)》2008,24(18):2002-2009

5.

Prediction of side chain orientations in proteins by statistical machine learning methods

Yan A Kloczkowski A Hofmann H Jernigan RL 《Journal of biomolecular structure & dynamics》2007,25(3):275-288

We develop ways to predict the side chain orientations of residues within a protein structure by using several different statistical machine learning methods. Here side chain orientation of a given residue i is measured by an angle Ω(i) between the vector pointing from the center of the protein structure to the Cα(i) atom and the vector pointing from the Cα(i) atom to the center of its side chain atoms. To predict the Ω(i) angles, we construct statistical models by using several different methods such as general linear regression, a regression tree and bagging, a neural network, and a support vector machine. The root mean square errors for the different models range only from 36.67 to 37.60 degrees and the correlation coefficients are all between 30% and 34%. The performances of different models in the test set are, thus, quite similar, and show the relative predictive power of these models to be significant in comparison with random side chain orientations.
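
The angle defined in this abstract can be illustrated with a minimal Python sketch. The function name, the argument layout, and the use of an unweighted centroid for the side chain are assumptions made here for illustration; the abstract does not say how the center of the protein or of the side chain is computed.

```python
import math

def side_chain_orientation(center, ca, side_chain_atoms):
    """Angle in degrees between the vector (protein center -> C-alpha)
    and the vector (C-alpha -> centroid of the side chain atoms).

    All coordinates are (x, y, z) tuples. The unweighted centroid is an
    assumption; a mass-weighted center would work the same way.
    """
    n = len(side_chain_atoms)
    centroid = [sum(atom[k] for atom in side_chain_atoms) / n for k in range(3)]
    v1 = [ca[k] - center[k] for k in range(3)]    # center -> C-alpha
    v2 = [centroid[k] - ca[k] for k in range(3)]  # C-alpha -> side chain center
    dot = sum(a * b for a, b in zip(v1, v2))
    norm = math.sqrt(sum(a * a for a in v1)) * math.sqrt(sum(b * b for b in v2))
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cos_omega = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_omega))
```

Under this definition a side chain pointing straight away from the protein center gives an angle near 0 degrees, and one pointing back toward the center gives an angle near 180 degrees.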

6.

A new method for species identification via protein-coding and non-coding DNA barcodes by combining machine learning with bioinformatic methods

Zhang AB Feng J Ward RD Wan P Gao Q Wu J Zhao WZ 《PloS one》2012,7(2):e30986

Species identification via DNA barcodes is contributing greatly to current bioinventory efforts. The initial, and widely accepted, proposal was to use the protein-coding cytochrome c oxidase subunit I (COI) region as the standard barcode for animals, but recently non-coding internal transcribed spacer (ITS) genes have been proposed as candidate barcodes for both animals and plants. However, achieving a robust alignment for non-coding regions can be problematic. Here we propose two new methods (DV-RBF and FJ-RBF) to address this issue for species assignment by both coding and non-coding sequences that take advantage of the power of machine learning and bioinformatics. We demonstrate the value of the new methods with four empirical datasets, two representing typical protein-coding COI barcode datasets (neotropical bats and marine fish) and two representing non-coding ITS barcodes (rust fungi and brown algae). Using two random sub-sampling approaches, we demonstrate that the new methods significantly outperformed existing Neighbor-joining (NJ) and Maximum likelihood (ML) methods for both coding and non-coding barcodes when there was complete species coverage in the reference dataset. The new methods also outperformed NJ and ML methods for non-coding sequences in circumstances of potentially incomplete species coverage, although then the NJ and ML methods performed slightly better than the new methods for protein-coding barcodes. A 100% success rate of species identification was achieved with the two new methods for 4,122 bat queries and 5,134 fish queries using COI barcodes, with 95% confidence intervals (CI) of 99.75-100%. The new methods also obtained a 96.29% success rate (95% CI: 91.62-98.40%) for 484 rust fungi queries and a 98.50% success rate (95% CI: 96.60-99.37%) for 1,094 brown algae queries, both using ITS barcodes.

7.

Prediction of promiscuous p-glycoprotein inhibition using a novel machine learning scheme

Leong MK Chen HB Shih YH 《PloS one》2012,7(3):e33829

Background

P-glycoprotein (P-gp) is an ATP-dependent membrane transporter that plays a pivotal role in eliminating xenobiotics by active extrusion of xenobiotics from the cell. Multidrug resistance (MDR) is highly associated with the over-expression of P-gp by cells, resulting in increased efflux of chemotherapeutical agents and reduction of intracellular drug accumulation. It is of clinical importance to develop a P-gp inhibition predictive model in the process of drug discovery and development.

Methodology/Principal Findings

An in silico model was derived to predict the inhibition of P-gp using the newly invented pharmacophore ensemble/support vector machine (PhE/SVM) scheme based on the data compiled from the literature. The predictions by the PhE/SVM model were found to be in good agreement with the observed values for those structurally diverse molecules in the training set (n = 31, r² = 0.89, q² = 0.86, RMSE = 0.40, s = 0.28), the test set (n = 88, r² = 0.87, RMSE = 0.39, s = 0.25) and the outlier set (n = 11, r² = 0.96, RMSE = 0.10, s = 0.05). The generated PhE/SVM model also showed high accuracy when subjected to those validation criteria generally adopted to gauge the predictivity of a theoretical model.

Conclusions/Significance

This accurate, fast and robust PhE/SVM model, which takes into account the promiscuous nature of P-gp, can be applied to predict the P-gp inhibition of structurally diverse compounds, a task that no other method can otherwise perform in a high-throughput fashion. It can thus facilitate drug discovery and development by helping to design drug candidates with a better metabolism profile.

8.

Improving indicator species analysis by combining groups of sites (cited by: 2; self-citations: 0; cited by others: 2)

Miquel De Cáceres Pierre Legendre Marco Moretti 《Oikos》2010,119(10):1674-1684

Indicator species are species that are used as ecological indicators of community or habitat types, environmental conditions, or environmental changes. In order to determine indicator species, the characteristic to be predicted is represented in the form of a classification of the sites, which is compared to the patterns of distribution of the species found at the sites. Indicator species analysis should take into account the fact that species have different niche breadths: if a species is related to the conditions prevailing in two or more groups of sites, an indicator species analysis undertaken on individual groups of sites may fail to reveal this association. In this paper, we suggest improving indicator species analysis by considering all possible combinations of groups of sites and selecting the combination for which the species can be best used as indicator. When using a correlation index, such as the point‐biserial correlation, the method yields the combination where the difference between the observed and expected abundance/frequency of the species is the largest. When an indicator value index (IndVal) is used, the method provides the set of site‐groups that best matches the observed distribution pattern of the species. We illustrate the advantages of the method in three different examples. Consideration of combinations of groups of sites provides an extra flexibility to qualitatively model the habitat preferences of the species of interest. The method also allows users to cross multiple classifications of the same sites, increasing the amount of information resulting from the analysis. When applied to community types, it allows one to distinguish those species that characterize individual types from those that characterize the relationships between them. This distinction is useful to determine the number of types that maximizes the number of indicator species.
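
The idea of scoring every combination of site groups with a correlation index can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation: the function names are hypothetical, the point-biserial correlation is computed as a plain Pearson correlation against a 0/1 membership vector, and the significance testing that a real analysis would need is omitted.

```python
from itertools import combinations
import math

def pearson(x, y):
    """Pearson correlation; with a 0/1 vector y this is the point-biserial r.
    Raises ZeroDivisionError if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def best_group_combination(abundance, site_groups):
    """Return (combination, r) for the combination of site groups whose
    membership vector correlates most strongly with species abundance.

    abundance   -- abundance of one species at each site
    site_groups -- group label of each site, same length as abundance
    """
    groups = sorted(set(site_groups))
    best = None
    # The full set of groups is skipped: its membership vector is constant,
    # so the correlation is undefined.
    for size in range(1, len(groups)):
        for combo in combinations(groups, size):
            member = [1.0 if g in combo else 0.0 for g in site_groups]
            r = pearson(abundance, member)
            if best is None or r > best[1]:
                best = (combo, r)
    return best
```

For a species that is abundant in groups A and B but absent from group C, the sketch selects the pair (A, B) rather than either group alone, which is the situation the abstract describes for species with broader niches.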

9.

Prediction of methionine oxidation risk in monoclonal antibodies using a machine learning method

《MABS-AUSTIN》2013,5(8):1281-1290

ABSTRACT

Monoclonal antibodies (mAbs) have become a major class of protein therapeutics that target a spectrum of diseases ranging from cancers to infectious diseases. Similar to any protein molecule, mAbs are susceptible to chemical modifications during the manufacturing process, long-term storage, and in vivo circulation that can impair their potency. One such modification is the oxidation of methionine residues. Chemical modifications that occur in the complementarity-determining regions (CDRs) of mAbs can lead to the abrogation of antigen binding and reduce the drug’s potency and efficacy. Thus, it is highly desirable to identify and eliminate any chemically unstable residues in the CDRs during the therapeutic antibody discovery process. To provide increased throughput over experimental methods, we extracted features from the mAbs’ sequences, structures, and dynamics, and used random forests to identify important features and to develop a quantitative and highly predictive in silico methionine oxidation model.

10.

Prediction of beta-turns with learning machines (cited by: 3; self-citations: 0; cited by others: 3)

Cai YD Liu XJ Li YX Xu XB Chou KC 《Peptides》2003,24(5):665-669

The support vector machine approach was introduced to predict the beta-turns in proteins. The overall self-consistency rate by the re-substitution test for the training or learning dataset reached 100%. Both the training dataset and independent testing dataset were taken from Chou [J. Pept. Res. 49 (1997) 120]. The success prediction rates by the jackknife test for the beta-turn subset of 455 tetrapeptides and non-beta-turn subset of 3807 tetrapeptides in the training dataset were 58.1 and 98.4%, respectively. The success rates with the independent dataset test for the beta-turn subset of 110 tetrapeptides and non-beta-turn subset of 30,231 tetrapeptides were 69.1 and 97.3%, respectively. The results obtained from this study support the conclusion that the residue-coupled effect along a tetrapeptide is important for the formation of a beta-turn.

11.

1H nuclear magnetic resonance study of the two calcium-binding sites of porcine intestinal calcium-binding protein (cited by: 1; self-citations: 0; cited by others: 1)

J G Shelling B D Sykes 《The Journal of biological chemistry》1985,260(14):8342-8347

1H nuclear magnetic resonance has been employed to study the calcium-binding properties of the NH2- and COOH-terminal calcium-binding sites of the porcine intestinal calcium-binding protein. The protein was titrated with calcium in the presence of the chelator EDTA in order to determine the association constants of the protein for calcium relative to the known association constant of EDTA for calcium. The resulting data were compared with various models for the binding of calcium to two sites on the protein. Models were considered for which the two sites in the apoprotein have either intrinsically equal or unequal affinities for calcium. For each of these two cases, positive cooperativity, no cooperativity, and negative cooperativity were considered. The data fit best for the case of random binding to two independent sites with equivalent association constants of (1.0 ± 0.1) × 10⁷ M⁻¹. The case of ordered binding to two sites with intrinsically different affinities, with concomitant positive affinity between the two sites so that the effective association constants were made equal, could not be mathematically excluded when only one protein NMR resonance is considered but can be shown to be implausible when the whole spectrum is considered.
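
For orientation, the best-fitting case reported in this abstract (two independent sites with the same association constant K) corresponds to the standard binding isotherm below. The notation is introduced here for illustration and is not taken from the paper: ν̄ is the mean number of calcium ions bound per protein molecule and [Ca²⁺] is the free calcium concentration.

```latex
\bar{\nu} \;=\; \frac{2K\,[\mathrm{Ca}^{2+}]}{1 + K\,[\mathrm{Ca}^{2+}]},
\qquad K \approx 1.0 \times 10^{7}\ \mathrm{M}^{-1}
```

With this form, half of the sites are occupied (ν̄ = 1) when [Ca²⁺] = 1/K, which for the reported constant is about 10⁻⁷ M.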

12.

Predicting binding sites of hydrolase-inhibitor complexes by combining several methods

Taner Z Sen Andrzej Kloczkowski Robert L Jernigan Changhui Yan Vasant Honavar Kai-Ming Ho Cai-Zhuang Wang Yungok Ihm Haibo Cao Xun Gu Drena Dobbs 《BMC bioinformatics》2004,5(1):1-11

Background

Protein-protein interactions play a critical role in protein function. Completion of many genomes is being followed rapidly by major efforts to identify interacting protein pairs experimentally in order to decipher the networks of interacting, coordinated-in-action proteins. Identification of protein-protein interaction sites and detection of specific amino acids that contribute to the specificity and the strength of protein interactions is an important problem with broad applications ranging from rational drug design to the analysis of metabolic and signal transduction networks.

Results

In order to increase the power of predictive methods for protein-protein interaction sites, we have developed a consensus methodology for combining four different methods. These approaches include: data mining using Support Vector Machines, threading through protein structures, prediction of conserved residues on the protein surface by analysis of phylogenetic trees, and the Conservatism of Conservatism method of Mirny and Shakhnovich. Results obtained on a dataset of hydrolase-inhibitor complexes demonstrate that the combination of all four methods yields improved predictions over the individual methods.

Conclusions

We developed a consensus method for predicting protein-protein interface residues by combining sequence and structure-based methods. The success of our consensus approach suggests that similar methodologies can be developed to improve prediction accuracies for other bioinformatic problems.

13.

Prediction of protein binding sites in protein structures using hidden Markov support vector machine

Bin Liu Xiaolong Wang Lei Lin Buzhou Tang Qiwen Dong Xuan Wang 《BMC bioinformatics》2009,10(1):381

Background

Predicting the binding sites between two interacting proteins provides important clues to the function of a protein. Recent research on protein binding site prediction has been mainly based on widely known machine learning techniques, such as artificial neural networks, support vector machines, and conditional random fields. However, the prediction performance is still too low to be used in practice. It is necessary to explore new algorithms, theories and features to further improve the performance.

14.

Characterization of the calcium-binding sites of Listeria monocytogenes InlB

Marino M Banerjee M Copp J Dramsi S Chapman T van der Geer P Cossart P Ghosh P 《Biochemical and biophysical research communications》2004,316(2):379-386

The Listeria monocytogenes protein InlB promotes invasion of mammalian cells through activation of the receptor tyrosine kinase Met. The InlB N-cap, an approximately 40-residue part of the domain that binds Met, was previously observed to bind two calcium ions in a novel and unusually exposed manner. Because subsequent work raised questions about the existence of these calcium-binding sites, we assayed calcium binding in solution to the InlB N-cap. We show that calcium ions are bound with dissociation constants in the low micromolar range at the two identified sites, and that the sites interact with one another. We demonstrate that the calcium ions are not required for structure, and also find that they have no appreciable effect on Met activation or intracellular invasion. Therefore, our results indicate that the sites are fortuitous in InlB, but also suggest that the simple architecture of the sites may be adaptable for protein engineering purposes.

15.

65-kilodalton protein phosphorylated by interleukin 2 stimulation bears two putative actin-binding sites and two calcium-binding sites (cited by: 8; self-citations: 0; cited by others: 8)

Y L Zu K Shigesada E Nishida I Kubota M Kohno M Hanaoka Y Namba 《Biochemistry》1990,29(36):8319-8324

We have previously characterized a 65-kilodalton protein (p65) as an interleukin 2 stimulated phosphoprotein in human T cells and showed that three endopeptide sequences of p65 are present in the sequence of l-plastin [Zu et al. (1990) Biochemistry 29, 1055-1062]. In this paper, we present the complete primary structure of p65 based on the cDNA isolated from a human T lymphocyte (KUT-2) cDNA library. Analysis of p65 sequences and the amino acid composition of cleaved p65 N-terminal peptide indicated that the deduced p65 amino acid sequence exactly coincides with that of l-plastin over the C-terminal 580 residues [Lin et al. (1988) Mol. Cell. Biol. 8, 4659-4668] and has a 57-residue extension at the N-terminus relative to l-plastin. Computer-assisted structural analysis revealed that p65 is a multidomain molecule involving at least three intriguing functional domains: two putative calcium-binding sites along the N-terminal 80 amino acid residues; a putative calmodulin-binding site following the calcium-binding region; and two tandem repeats of putative actin-binding domains in its middle and C-terminal parts, each containing approximately 240 amino acid residues. These results suggest that p65 belongs to the actin-binding proteins.

16.

Fold recognition by combining profile-profile alignment and support vector machine (cited by: 1; self-citations: 0; cited by others: 1)

Han S Lee BC Yu ST Jeong CS Lee S Kim D 《Bioinformatics (Oxford, England)》2005,21(11):2667-2673

MOTIVATION: Currently, the most accurate fold-recognition method is to perform profile-profile alignments and estimate the statistical significance of those alignments by calculating a Z-score or E-value. Although this scheme is reliable in recognizing relatively close homologs related at the family level, it has difficulty in finding remote homologs that are related at the superfamily or fold level. RESULTS: In this paper, we present an alternative method to estimate the significance of the alignments. The alignment between a query protein and a template of length n in the fold library is transformed into a feature vector of length n + 1, which is then evaluated by a support vector machine (SVM). The SVM output is converted to a posterior probability that the query sequence is related to the template. The results show that the new method performs significantly better than PSI-BLAST and profile-profile alignment with the Z-score scheme. While PSI-BLAST and the Z-score scheme detect 16 and 20% of superfamily-related proteins, respectively, at 90% specificity, the new method detects 46% of these proteins, a more than 2-fold increase in sensitivity. More significantly, at the fold level, the new method can detect 14% of remotely related proteins at 90% specificity, a remarkable result considering that the other methods detect almost none at the same level of specificity.
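
The "sensitivity at 90% specificity" figures quoted in this abstract can be read as follows: choose a score threshold that rejects at least 90% of the unrelated (negative) pairs, then report the fraction of truly related (positive) pairs that score above it. A minimal Python sketch of that calculation is given below; the function name and the tie-handling rule are assumptions, and the paper's exact evaluation protocol may differ.

```python
import math

def sensitivity_at_specificity(pos_scores, neg_scores, specificity=0.90):
    """Fraction of positives scoring strictly above the lowest threshold
    that leaves at least `specificity` of the negatives at or below it.

    Assumption: a pair is called 'related' only if its score is strictly
    greater than the threshold, so ties with the threshold are rejected.
    """
    neg_sorted = sorted(neg_scores)
    # Number of negatives that must fall at or below the threshold.
    k = math.ceil(specificity * len(neg_sorted))
    threshold = neg_sorted[k - 1]
    true_positives = sum(1 for s in pos_scores if s > threshold)
    return true_positives / len(pos_scores)
```

Any scoring scheme (a Z-score, an E-value with the sign flipped, or an SVM-derived probability) can be compared on this basis, as long as higher scores mean stronger evidence of a relationship.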

17.

Lanthanide ion probes of calcium-binding sites on cellular membranes (cited by: 1; self-citations: 0; cited by others: 1)

Cristobal G. dos Remedios 《Cell calcium》1981,2(1):29-51

The chemical basis for the similarity between the lanthanide series of ions and calcium is outlined together with the experimental difficulties associated with the use of these ions. A number of properties of the lanthanide ions are highlighted which make them potentially valuable probe elements. In this context the use of lanthanum and the lanthanide ions in probing calcium sites on cellular membranes is reviewed. In most instances, the lanthanide ions displace membrane-bound Ca and inhibit Ca-mediated membrane function, but, unlike Ca, these ions do not appear to be transported across cellular membranes (mitochondria may be an exception). Generally two relationships can be demonstrated between the inhibition of the Ca-mediated function and the atomic number of the lanthanide ion. Extracellular membranes appear to respond selectively to Tm(III) while intracellular membranes lack the Tm peak and instead exhibit a broad trend in which the smaller ions are the least effective inhibitors.

18.

Subunit distribution of calcium-binding sites in Lumbricus terrestris hemoglobin

Kuchumov AR Loo JA Vinogradov SN 《Journal of Protein Chemistry》2000,19(2):139-149

The giant, 3.6-MDa hexagonal bilayer hemoglobin (Hb) of Lumbricus terrestris consists of twelve 213-kDa globin subassemblies, each comprised of three disulfide-bonded trimers and three monomer globin chains, tethered to a central scaffolding of 36–42 linkers L1–L4 (24–32 kDa). It is known to contain 50–80 Ca and 2–4 Cu and Zn; the latter are thought to be responsible for the superoxide dismutase activity of the Hb. Total reflection X-ray fluorescence spectrometry was used to determine the Ca, Cu, and Zn contents of the Hb dissociated at pH 2.2, the globin dodecamer subassembly, and linker subunits L2 and L4. Although the dissociated Hb retained 20 Ca²⁺ and all the Cu and Zn, the globin subassembly had 0.4 to 3 Ca²⁺, depending on the method of isolation, and only traces of Cu and Zn. The linkers L2 and L4, isolated by reversed-phase high-pressure liquid chromatography at pH 2.2, had 1 Ca per mole and very little Cu and Zn. Electrospray ionization mass spectrometry of linker L3 at pH 2.2 and at neutral pH demonstrated avid binding of 1 Ca²⁺ and additional weaker binding of 7 Ca²⁺ in the presence of added Ca²⁺. Based on these and previous results which document the heterogeneous nature of the Ca²⁺-binding sites in Lumbricus Hb, we propose three classes of Ca²⁺-binding sites with affinities increasing in the following order: (i) a large number of sites (>100) with affinities lower than EDTA associated with linker L3 and dodecamer subassembly, (ii) 30 sites with affinities higher than EDTA occurring within the cysteine-rich domains of linker L3 and dodecamer subassembly, and (iii) 25 very high affinity sites associated with the linker subunits L1, L2, and L4. It is likely that the low-affinity type (i) sites are the ones involved in the effects of 1–100 mM Group IIA cations on Lumbricus Hb structure and function, namely increased stability of its quaternary structure and increased affinity and cooperativity of its oxygen binding.

19.

Biomedical informatics with optimization and machine learning

Shuai Huang Jiayu Zhou Zhangyang Wang Qing Ling Yang Shen 《EURASIP Journal on Bioinformatics and Systems Biology》2017,2017(1):4

20.

Calcium binding to calmodulin. Cooperativity of the calcium-binding sites (cited by: 3; self-citations: 0; cited by others: 3)

S Iida J D Potter 《Journal of biochemistry》1986,99(6):1765-1772

The effects of Mg2+ ion, pH, and KCl concentration on Ca2+ binding to calmodulin were studied by using a Ca2+ ion-sensitive electrode. The Ca2+ ion affinity of calmodulin increased with increasing pH or decreasing KCl concentration. Cooperativity between the Ca2+-binding sites was observed, and increased with decreasing pH or increasing KCl concentration. In the presence of small amounts of Ca2+ ion, the free Ca2+ ion concentration was decreased by adding MgCl2 at low Mg2+ concentrations and increased at higher concentrations. The decrease of free Ca2+ ion concentration by Mg2+ ion strongly suggests cooperativity between the Ca2+-binding sites, and it is difficult to explain the decrease in terms of the ordered binding models previously proposed. These results can be explained by a simple model which has four equivalent binding sites that bind Ca2+ and Mg2+ competitively and that show cooperativity when either Ca2+ or Mg2+ is bound. Mg2+ ion binding to calmodulin was measured in the presence or absence of Ca2+ to confirm the validity of this model, and no Mg2+-specific site was observed.