Similar literature
1.
Proteins play important roles in living organisms, and their function is directly linked with their structure. Because of the growing gap between the number of proteins being discovered and the number that have been functionally characterized (in particular as a result of experimental limitations), reliable prediction of protein function through computational means has become crucial. This paper reviews the machine learning techniques used in the literature, following their evolution from simple algorithms such as logistic regression to more advanced methods like support vector machines and modern deep neural networks. Hyperparameter optimization methods adopted to boost prediction performance are presented. In parallel, the evolution of the features used by these algorithms, from classical physicochemical properties and amino acid composition to text-derived features from biomedical literature and feature representations learned with autoencoders, is reviewed together with feature selection and dimensionality reduction techniques. The success stories in the application of these techniques to both general and specific protein function prediction are discussed.

2.
Identification and characterization of antigenic determinants on proteins has received considerable attention, using both experimental and computational methods. Computational routines have mostly used structural and physicochemical parameters to predict the antigenic propensity of protein sites. However, the performance of computational routines has been low compared to experimental alternatives. Here we describe the construction of machine learning based classifiers to enhance the prediction quality for identifying linear B-cell epitopes on proteins. Our approach combines several parameters previously associated with antigenicity, and includes novel parameters based on frequencies of amino acids and amino acid neighborhood propensities. We used machine learning algorithms to derive antigenicity classification functions that assign an antigenic propensity to each amino acid of a given protein sequence. We compared the prediction quality of the novel classifiers with that of established routines for epitope scoring, and tested prediction accuracy on experimental data available for HIV proteins. The major finding is that machine learning classifiers clearly outperform the reference classification systems on the HIV epitope validation set.

3.
Understanding epigenetic processes holds immense promise for medical applications. Advances in Machine Learning (ML) are critical to realizing this promise. Previous studies used epigenetic data sets associated with the germline transmission of epigenetic transgenerational inheritance of disease, together with novel ML approaches, to predict genome-wide locations of critical epimutations. A combination of Active Learning (ACL) and Imbalanced Class Learning (ICL) was used to address past problems with ML, to develop a more efficient feature selection process, and to address the imbalance problem found in all genomic data sets. The results suggest the power of this novel ML approach and of our ability to predict epigenetic phenomena and associated disease. The current approach requires extensive computation of features over the genome. A promising new approach is to introduce Deep Learning (DL) for the generation and simultaneous computation of novel genomic features tuned to the classification task. This approach can be used with any genomic or biological data set applied to medicine. The application of molecular epigenetic data in advanced machine learning analysis to medicine is the focus of this review.

4.
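With advances in mass spectrometry and the development of bioinformatics and statistical algorithms, the Human Proteome Project, which has disease research as one of its main aims, is progressing rapidly. Protein biomarkers are of great significance for early disease diagnosis and clinical treatment, and research on strategies and methods for discovering them has become an important and active field. Feature selection and machine learning are effective at handling the "high dimensionality" and "sparsity" of proteomic data, and have therefore gradually come into wide use in protein biomarker discovery. This article describes strategies for discovering protein biomarkers, together with the principles, application examples, and scope of applicability of the feature selection and machine learning methods involved. It also discusses the prospects and limitations of deep learning in this field, with the aim of providing a reference for related research.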

5.
Biological imaging techniques are the most efficient way to locally measure the variation of different parameters on tissue sections. These analyses have attracted growing interest over the past 20 years and allow extremely complex biological phenomena to be observed at ever smaller time and resolution scales. Nevertheless, most of them target only a very few compounds of interest, which are chosen a priori, because of their low resolving power and sensitivity. New chemical imaging techniques have to be introduced in order to overcome these limitations, leading to more informative and sensitive analyses for biologists and physicians. Two major mass spectrometry methods can be used efficiently to generate the distribution of biological compounds over a tissue section. Matrix-Assisted Laser Desorption/Ionisation-Mass Spectrometry (MALDI-MS) requires co-crystallization of the sample with a matrix before irradiation by a laser, whereas in Secondary Ion Mass Spectrometry (SIMS) experiments the analyte is directly desorbed by primary ion bombardment. In both cases, the energy used for desorption/ionization is deposited locally (over some tens of microns for the laser and some hundreds of nanometers for the ion beam), meaning that small areas of the sample surface can be analyzed separately. Step-by-step analysis allows spectra to be acquired over the tissue section, and the data are processed by modern software to create ion density maps, i.e., plots of the intensity of one specific ion versus the (x,y) position. The main advantages of SIMS and MALDI compared to other chemical imaging techniques lie in the simultaneous acquisition of a large number of biological compounds in a mixture, with the excellent sensitivity obtained with a Time-of-Flight (ToF) mass analyzer. Moreover, data treatment is done a posteriori, because no compound is selectively marked, which gives access to the localization of different lipid classes in a single complete acquisition.
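The ion density map described in abstract 5 (the intensity of one specific ion plotted against the (x,y) position) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `ion_density_map`, the pixel data layout, the m/z tolerance, and the tiny 2 x 2 raster are all hypothetical and do not come from the abstract or from any particular software package.

```python
def ion_density_map(pixels, target_mz, tol=0.1):
    """Build a 2D intensity image for one ion.

    pixels: list of (x, y, spectrum) tuples, where spectrum is a list of
            (mz, intensity) peaks acquired at raster position (x, y).
    Returns a list of rows, indexed as image[y][x].
    """
    width = max(x for x, _, _ in pixels) + 1
    height = max(y for _, y, _ in pixels) + 1
    image = [[0.0] * width for _ in range(height)]
    for x, y, spectrum in pixels:
        for mz, intensity in spectrum:
            # Sum every peak that falls inside the m/z window of the target ion.
            if abs(mz - target_mz) <= tol:
                image[y][x] += intensity
    return image


# Hypothetical 2 x 2 raster; the m/z values and intensities are made up.
pixels = [
    (0, 0, [(760.6, 10.0), (788.6, 3.0)]),
    (1, 0, [(760.6, 4.0)]),
    (0, 1, [(788.6, 7.0)]),
    (1, 1, [(760.55, 2.0), (760.65, 1.0)]),
]
image = ion_density_map(pixels, target_mz=760.6)
```

Because the whole spectrum is stored at every pixel, the target ion can be chosen after acquisition, which is the a posteriori data treatment the abstract mentions.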

6.
Introduction: Mass spectrometry imaging (MSI) is a label-free, multiplex imaging technology able to simultaneously record the distributions of hundreds to thousands of species, and which may be configured to study metabolites, lipids, glycans, peptides, and proteins simply by changing the tissue preparation protocol.

Areas covered: The capability of MSI to complement established histopathological practice through the identification of biomarkers for differential diagnosis, patient prognosis, and response to therapy; the capability of MSI to annotate tissues on the basis of each pixel’s mass spectral signature; the development of reproducible MSI through multicenter studies.

Expert commentary: We discuss how MSI can be combined with microsampling/microdissection technologies in order to investigate, with more depth of coverage, the molecular changes uncovered by MSI.


7.
Introduction: The last 20 years have seen significant improvements in the analytical capabilities of biological mass spectrometry (MS). Studies using advanced MS have resulted in new insights into cell biology and the etiology of diseases, and have led to its use in clinical applications.

Areas covered: This review discusses recent developments in MS-based technologies and their cancer-related applications with a focus on proteomics. It also discusses the issues around translating the research findings to the clinic and provides an outline of where the field is moving.

Expert commentary: Proteomics has been problematic to adapt for the clinical setting. However, MS-based techniques continue to demonstrate potential in novel clinical uses beyond classical cancer proteomics.


8.
Protein–protein interactions are intrinsic to virtually every cellular process. Predicting the binding affinity of protein–protein complexes is one of the challenging problems in computational and molecular biology. In this work, we related sequence features of protein–protein complexes to their binding affinities using machine learning approaches. We set up a database of 185 protein–protein complexes for which the interacting pairs are heterodimers and experimental binding affinities are available. We also developed a set of 610 features from the sequences of the protein complexes and used the Ranker search method, which combines an attribute evaluator with the Ranker method, to select specific features. We analyzed several machine learning algorithms for discriminating protein–protein complexes into high and low affinity groups based on their Kd values. Our results showed a 10-fold cross-validation accuracy of 76.1% with a combination of nine features using support vector machines. Further, we observed an accuracy of 83.3% on an independent test set of 30 complexes. We suggest that our method would serve as an effective tool for identifying the interacting partners in protein–protein interaction networks and human–pathogen interactions based on the strength of interactions. Proteins 2014; 82:2088–2096. © 2014 Wiley Periodicals, Inc.
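The 10-fold cross-validation reported in abstract 8 can be sketched as follows. This is a minimal illustration and not the authors' pipeline: a one-dimensional nearest-mean classifier stands in for the support vector machine, the single synthetic feature replaces the nine selected features, and every function name and data value here is an assumption made for the example. Only the sample count of 185 is taken from the abstract.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def fit_nearest_mean(xs, ys):
    """Stand-in classifier (not an SVM): store the mean feature per class."""
    means = {}
    for label in set(ys):
        values = [x for x, y in zip(xs, ys) if y == label]
        means[label] = sum(values) / len(values)
    return means


def predict_nearest_mean(means, x):
    """Assign the class whose stored mean is closest to x."""
    return min(means, key=lambda label: abs(x - means[label]))


def cross_validated_accuracy(features, labels, k=10, seed=0):
    """Fraction of samples classified correctly while held out."""
    correct = 0
    for train, test in k_fold_indices(len(labels), k, seed):
        model = fit_nearest_mean([features[i] for i in train],
                                 [labels[i] for i in train])
        correct += sum(predict_nearest_mean(model, features[i]) == labels[i]
                       for i in test)
    return correct / len(labels)


# Synthetic data: 185 complexes, one made-up feature, two affinity groups.
features = [float(i) for i in range(185)]
labels = ["high" if i < 92 else "low" for i in range(185)]
accuracy = cross_validated_accuracy(features, labels)
```

The key point of the design is that the model is refitted inside every fold, so no held-out complex influences the classifier that predicts it.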

9.
The purpose of this narrative review is to provide a critical reflection on how analytical machine learning approaches could provide the platform to harness variability of patient presentation to enhance clinical prediction. The review includes a summary of current knowledge on the physiological adaptations present in people with spinal pain. We discuss how contemporary evidence highlights the importance of not relying on single features when characterizing patients, given the variability of physiological adaptations present in people with spinal pain. The advantages and disadvantages of current analytical strategies in contemporary basic science and epidemiological research are reviewed, and we consider how analytical machine learning approaches could provide the platform to harness the variability of patient presentations to enhance clinical prediction of pain persistence or recurrence. We propose that machine learning techniques can be leveraged to translate a potentially heterogeneous set of variables into clinically useful information with the potential to enhance patient management.

10.
Purpose: Artificial intelligence (AI) models are playing an increasing role in biomedical research and healthcare services. This review focuses on challenging points that need to be clarified about how to develop AI applications as clinical decision support systems in the real-world context.

Methods: A narrative review has been performed, including a critical assessment of articles published between 1989 and 2021 that guided the sections on these challenging points.

Results: We first illustrate the architectural characteristics of machine learning (ML)/radiomics and deep learning (DL) approaches. For ML/radiomics, the phases of feature selection and of training, validation, and testing are described. DL models are presented as multi-layered artificial/convolutional neural networks, allowing us to directly process images. The data curation section includes technical steps such as image labelling, image annotation (with segmentation as a crucial step in radiomics), data harmonization (enabling compensation for differences in imaging protocols that typically generate noise in non-AI imaging studies) and federated learning. Thereafter, we dedicate specific sections to: sample size calculation, considering multiple testing in AI approaches; procedures for data augmentation to work with limited and unbalanced datasets; and the interpretability of AI models (the so-called black box issue). Pros and cons of choosing ML versus DL to implement AI applications in medical imaging are finally presented in a synoptic way.

Conclusions: Biomedicine and healthcare systems are among the most important fields for AI applications, and medical imaging is probably the most suitable and promising domain. Clarification of specific challenging points facilitates the development of such systems and their translation to clinical practice.

11.
12.
MS imaging (MSI) is a remarkable new technology that enables us to determine the distribution of biological molecules present in tissue sections by direct ionization and detection. This technique is now widely used for in situ imaging of endogenous or exogenous molecules such as proteins, lipids, drugs and their metabolites, and it is a potential tool for pathological analysis and the investigation of disease mechanisms. MSI is also thought to be a technique that could be used for biomarker discovery with spatial information. The application of MSI to the study of endogenous metabolites has received considerable attention because metabolites are the result of the interactions of a system's genome with its environment, and the total set of these metabolites more closely represents the phenotype of an organism under a given set of conditions. Recent studies have suggested the importance of in situ metabolite imaging in biological discovery and biomedical applications, but several issues regarding the technical limits of MSI still remain to be resolved. In this review, we describe the capabilities of the latest MSI techniques for the imaging of endogenous metabolites in biological samples, and also discuss the technical problems and new challenges that need to be addressed for effective and widespread application of MSI in both preclinical and clinical settings.

13.
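Research on protein function is the foundation for unraveling the mysteries of life, and machine learning techniques are already widely applied in this field. Using the support vector machine (SVM) method, a general platform for predicting protein functional sites was constructed. The platform first extracts non-homologous protein sequences and then encodes features for these sequences (including basic sequence information, physicochemical features, structural information, and sequence conservation features). The encoded samples are used as training data for the SVM, and evaluation metrics such as sensitivity, specificity, Matthews correlation coefficient, accuracy, and the ROC curve are obtained. After repeated testing, the SVM model with the best evaluation metrics can be used to predict functional sites on protein sequences. Besides predicting protein functional sites, the platform can also be applied to the prediction and analysis of disease-associated single nucleotide polymorphisms (SNPs), the prediction and analysis of protein domains, interactions between biomolecules, and similar tasks.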
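The evaluation metrics named in abstract 13 (sensitivity, specificity, Matthews correlation coefficient, and accuracy) all follow from the four counts of a binary confusion matrix. The sketch below uses the standard definitions; the function name `binary_metrics` and the example counts are hypothetical and are not results from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Made-up counts for illustration only.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
```

The Matthews correlation coefficient is useful alongside accuracy because it stays informative when functional sites are much rarer than non-sites.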

14.
In order to quantify small molecules at the early stage of drug discovery, we developed a quantitation approach based on mass spectrometry imaging (MSI) using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) without the use of a labeled compound. We describe a method intended to respond to the main challenges encountered in quantification through MALDI imaging dedicated to whole-body or single heterogeneous organ samples (brain, eye, liver). These include the high dependence of the detected signal on the matrix deposition, the MALDI ionization yield of specific target molecules, and lastly, the ion suppression effect on the tissue. To address these challenges, we based our approach on the use of a normalization factor called the TEC (Tissue Extinction Coefficient). This factor takes into account the ion suppression effect, which is both tissue- and drug-specific. Through this protocol, the amount of drug per gram of tissue was determined, which, in turn, was compared with the results of other analytical techniques such as liquid chromatography-tandem mass spectrometry (LC-MS/MS).

15.
16.
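Lysine succinylation is a novel post-translational modification that plays an important role in protein regulation and the control of cellular function, so accurate identification of succinylation sites in proteins is necessary. Traditional experiments are costly in materials and money. Computational prediction has recently been proposed as an efficient alternative. In this study, we developed a new prediction method, iSucc-PseAAC, which combines multiple classification algorithms with different feature extraction methods. We found that, with feature extraction based on coupled sequences (PseAAC), the support vector machine gave the best classification performance, and we combined it with ensemble learning to address the data imbalance problem. Compared with existing methods, iSucc-PseAAC is more meaningful and practical for distinguishing lysine succinylation sites.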
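Abstract 16 states that ensemble learning was used to address data imbalance but does not say which scheme. One common option, shown here purely as an assumption and not as the iSucc-PseAAC procedure, is to split the larger negative set into chunks about the size of the positive set, train one classifier per balanced subset, and combine the predictions by majority vote. The names `balanced_subsets` and `majority_vote` and the toy site identifiers are hypothetical.

```python
import random


def balanced_subsets(positives, negatives, seed=0):
    """Pair all positives with successive chunks of the shuffled negatives.

    Each chunk has len(positives) items, except possibly the last one,
    which holds whatever negatives remain.
    """
    shuffled = list(negatives)
    random.Random(seed).shuffle(shuffled)
    size = len(positives)
    return [(list(positives), shuffled[i:i + size])
            for i in range(0, len(shuffled), size)]


def majority_vote(votes):
    """Return True when strictly more than half of the votes are True."""
    return sum(votes) * 2 > len(votes)


# Toy data: 3 positive sites and 10 negative sites, identified by integers.
positives = [0, 1, 2]
negatives = list(range(100, 110))
subsets = balanced_subsets(positives, negatives)
```

Each `(positives, chunk)` pair would be used to train one base classifier, and `majority_vote` would then merge the per-classifier predictions for a candidate site. A tie counts as a negative prediction in this sketch.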

17.
MALDI imaging mass spectrometry (‘MALDI imaging’) is an increasingly recognized technique for biomarker research. After years of method development in the scientific community, the technique is now increasingly applied in clinical research. In this article, we discuss the use of MALDI imaging in clinical proteomics and put it in context with classical proteomics techniques. We also highlight a number of upcoming challenges for personalized medicine, development of targeted therapies and diagnostic molecular pathology where MALDI imaging could help.

18.
Machine learning for Big Data analytics in plants (total citations: 2; self-citations: 0; citations by others: 2)

19.
Glycosylation is one of the most important posttranslational modifications of proteins and plays essential roles in various biological processes. Aberration in the glycan moieties of glycoproteins is associated with many diseases. It is especially critical to develop rapid and sensitive methods for the analysis of aberrant glycoproteins associated with diseases. Mass spectrometry (MS) has become a powerful tool for glycoprotein analysis. In particular, tandem mass spectrometry can provide highly informative fragments for structural identification of glycoproteins. This review provides an overview of the development of MS technologies in recent years and their applications in identifying abnormal glycoproteins and glycans in human serum to screen for cancer biomarkers.

20.
Now in its 6th year, the East Midlands Proteomics workshop held in November 2007 brought together over 200 scientists with a common interest in proteomic techniques and their application to complex biological and biomedical problems. For the first time, this meeting was jointly supported by the British Society for Proteome Research (BSPR) and the British Mass Spectrometry Society (BMSS).

