期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Learning divisive normalization in primary visual cortex

Max F. Burg Santiago A. Cadena George H. Denfield Edgar Y. Walker Andreas S. Tolias Matthias Bethge Alexander S. Ecker 《PLoS computational biology》2021,17(6)

相似文献

2.

Face Recognition with Multi-Resolution Spectral Feature Images

Zhan-Li Sun Kin-Man Lam Zhao-Yang Dong Han Wang Qing-Wei Gao Chun-Hou Zheng 《PloS one》2013,8(2)

The one-sample-per-person problem has become an active research topic for face recognition in recent years because of its challenges and significance for real-world applications. However, achieving relatively higher recognition accuracy is still a difficult problem due to, usually, too few training samples being available and variations of illumination and expression. To alleviate the negative effects caused by these unfavorable factors, in this paper we propose a more accurate spectral feature image-based 2DLDA (two-dimensional linear discriminant analysis) ensemble algorithm for face recognition, with one sample image per person. In our algorithm, multi-resolution spectral feature images are constructed to represent the face images; this can greatly enlarge the training set. The proposed method is inspired by our finding that, among these spectral feature images, features extracted from some orientations and scales using 2DLDA are not sensitive to variations of illumination and expression. In order to maintain the positive characteristics of these filters and to make correct category assignments, the strategy of classifier committee learning (CCL) is designed to combine the results obtained from different spectral feature images. Using the above strategies, the negative effects caused by those unfavorable factors can be alleviated efficiently in face recognition. Experimental results on the standard databases demonstrate the feasibility and efficiency of the proposed method. 相似文献

3.

Calculating the target exposure index using a deep convolutional neural network and a rule base

《Physica medica : PM : an international journal devoted to the applications of physics to medicine and biology : official journal of the Italian Association of Biomedical Physics (AIFB)》2020

PurposeThe objective of this study is to determine the quality of chest X-ray images using a deep convolutional neural network (DCNN) and a rule base without performing any visual assessment. A method is proposed for determining the minimum diagnosable exposure index (EI) and the target exposure index (EIt).MethodsThe proposed method involves transfer learning to assess the lung fields, mediastinum, and spine using GoogLeNet, which is a type of DCNN that has been trained using conventional images. Three detectors were created, and the image quality of local regions was rated. Subsequently, the results were used to determine the overall quality of chest X-ray images using a rule-based technique that was in turn based on expert assessment. The minimum EI required for diagnosis was calculated based on the distribution of the EI values, which were classified as either suitable or non-suitable and then used to ascertain the EIt.ResultsThe accuracy rate using the DCNN and the rule base was 81%. The minimum EI required for diagnosis was 230, and the EIt was 288.ConclusionThe results indicated that the proposed method using the DCNN and the rule base could discriminate different image qualities without any visual assessment; moreover, it could determine both the minimum EI required for diagnosis and the EIt. 相似文献

4.

Deep Supervised,but Not Unsupervised,Models May Explain IT Cortical Representation

Seyed-Mahdi Khaligh-Razavi Nikolaus Kriegeskorte 《PLoS computational biology》2014,10(11)

Inferior temporal (IT) cortex in human and nonhuman primates serves visual object recognition. Computational object-vision models, although continually improving, do not yet reach human performance. It is unclear to what extent the internal representations of computational models can explain the IT representation. Here we investigate a wide range of computational model representations (37 in total), testing their categorization performance and their ability to account for the IT representational geometry. The models include well-known neuroscientific object-recognition models (e.g. HMAX, VisNet) along with several models from computer vision (e.g. SIFT, GIST, self-similarity features, and a deep convolutional neural network). We compared the representational dissimilarity matrices (RDMs) of the model representations with the RDMs obtained from human IT (measured with fMRI) and monkey IT (measured with cell recording) for the same set of stimuli (not used in training the models). Better performing models were more similar to IT in that they showed greater clustering of representational patterns by category. In addition, better performing models also more strongly resembled IT in terms of their within-category representational dissimilarities. Representational geometries were significantly correlated between IT and many of the models. However, the categorical clustering observed in IT was largely unexplained by the unsupervised models. The deep convolutional network, which was trained by supervision with over a million category-labeled images, reached the highest categorization performance and also best explained IT, although it did not fully explain the IT data. Combining the features of this model with appropriate weights and adding linear combinations that maximize the margin between animate and inanimate objects and between faces and other objects yielded a representation that fully explained our IT data. Overall, our results suggest that explaining IT requires computational features trained through supervised learning to emphasize the behaviorally important categorical divisions prominently reflected in IT. 相似文献

5.

Automating chest radiograph imaging quality control

《Physica medica : PM : an international journal devoted to the applications of physics to medicine and biology : official journal of the Italian Association of Biomedical Physics (AIFB)》2021

PurposeTo automate diagnostic chest radiograph imaging quality control (lung inclusion at all four edges, patient rotation, and correct inspiration) using convolutional neural network models.MethodsThe data comprised of 2589 postero-anterior chest radiographs imaged in a standing position, which were divided into train, validation, and test sets. We increased the number of images for the inclusion by cropping appropriate images, and for the inclusion and the rotation by flipping the images horizontally. The image histograms were equalized, and the images were resized to a 512 × 512 resolution. We trained six convolutional neural networks models to detect the image quality features using manual image annotations as training targets. Additionally, we studied the inter-observer variability of the image annotation.ResultsThe convolutional neural networks’ areas under the receiver operating characteristic curve were >0.88 for the inclusions, and >0.70 and >0.79 for the rotation and the inspiration, respectively. The inter-observer agreement between two human annotators for the assessed image-quality features were: 92%, 90%, 82%, and 88% for the inclusion at patient’s left, patient’s right, cranial, and caudal edges, and 78% and 89% for the rotation and inspiration, respectively. Higher inter-observer agreement was related to a smaller variance in the network confidence.ConclusionsThe developed models provide automated tools for the quality control in a radiological department. Additionally, the convolutional neural networks could be used to obtain immediate feedback of the chest radiograph image quality, which could serve as an educational instrument. 相似文献

6.

CO-WOA: Novel Optimization Approach for Deep Learning Classification of Fish Image

Rabia Musheer Aziz Rajul Mahto Aryan Das Saboor Uddin Ahmed Priyanka Roy Saurav Mallik Aimin Li 《化学与生物多样性》2023,20(8):e202201123

The most significant groupings of cold-blooded creatures are the fish family. It is crucial to recognize and categorize the most significant species of fish since various species of seafood diseases and decay exhibit different symptoms. Systems based on enhanced deep learning can replace the area's currently cumbersome and sluggish traditional approaches. Although it seems straightforward, classifying fish images is a complex procedure. In addition, the scientific study of population distribution and geographic patterns is important for advancing the field's present advancements. The goal of the proposed work is to identify the best performing strategy using cutting-edge computer vision, the Chaotic Oppositional Based Whale Optimization Algorithm (CO-WOA), and data mining techniques. Performance comparisons with leading models, such as Convolutional Neural Networks (CNN) and VGG-19, are made to confirm the applicability of the suggested method. The suggested feature extraction approach with Proposed Deep Learning Model was used in the research, yielding accuracy rates of 100 %. The performance was also compared to cutting-edge image processing models with an accuracy of 98.48 %, 98.58 %, 99.04 %, 98.44 %, 99.18 % and 99.63 % such as Convolutional Neural Networks, ResNet150V2, DenseNet, Visual Geometry Group-19, Inception V3, Xception. Using an empirical method leveraging artificial neural networks, the Proposed Deep Learning model was shown to be the best model. 相似文献

7.

Prediction of tyrosine sulfation with mRMR feature selection and analysis

Niu S Huang T Feng K Cai Y Li Y 《Journal of proteome research》2010,9(12):6490-6497

Protein tyrosine sulfation is a ubiquitous post-translational modification (PTM) of secreted and transmembrane proteins that pass through the Golgi apparatus. In this study, we developed a new method for protein tyrosine sulfation prediction based on a nearest neighbor algorithm with the maximum relevance minimum redundancy (mRMR) method followed by incremental feature selection (IFS). We incorporated features of sequence conservation, residual disorder, and amino acid factor, 229 features in total, to predict tyrosine sulfation sites. From these 229 features, 145 features were selected and deemed as the optimized features for the prediction. The prediction model achieved a prediction accuracy of 90.01% using the optimal 145-feature set. Feature analysis showed that conservation, disorder, and physicochemical/biochemical properties of amino acids all contributed to the sulfation process. Site-specific feature analysis showed that the features derived from its surrounding sites contributed profoundly to sulfation site determination in addition to features derived from the sulfation site itself. The detailed feature analysis in this paper might help understand more of the sulfation mechanism and guide the related experimental validation. 相似文献

8.

Enabling full-length evolutionary profiles based deep convolutional neural network for predicting DNA-binding proteins from sequence

Sucheta Chauhan Shandar Ahmad 《Proteins》2020,88(1):15-30

Sequence based DNA-binding protein (DBP) prediction is a widely studied biological problem. Sliding windows on position specific substitution matrices (PSSMs) rows predict DNA-binding residues well on known DBPs but the same models cannot be applied to unequally sized protein sequences. PSSM summaries representing column averages and their amino-acid wise versions have been effectively used for the task, but it remains unclear if these features carry all the PSSM's predictive power, traditionally harnessed for binding site predictions. Here we evaluate if PSSMs scaled up to a fixed size by zero-vector padding (pPSSM) could perform better than the summary based features on similar models. Using multilayer perceptron (MLP) and deep convolutional neural network (CNN), we found that (a) Summary features work well for single-genome (human-only) data but are outperformed by pPSSM for diverse PDB-derived data sets, suggesting greater summary-level redundancy in the former, (b) even when summary features work comparably well with pPSSM, a consensus on the two outperforms both of them (c) CNN models comprehensively outperform their corresponding MLP models and (d) actual predicted scores from different models depend on the choice of input feature sets used whereas overall performance levels are model-dependent in which CNN leads the accuracy. 相似文献

9.

Prediction of Protein Cleavage Site with Feature Selection by Random Forest

Bi-Qing Li Yu-Dong Cai Kai-Yan Feng Gui-Jun Zhao 《PloS one》2012,7(9)

Proteinases play critical roles in both intra and extracellular processes by binding and cleaving their protein substrates. The cleavage can either be non-specific as part of degradation during protein catabolism or highly specific as part of proteolytic cascades and signal transduction events. Identification of these targets is extremely challenging. Current computational approaches for predicting cleavage sites are very limited since they mainly represent the amino acid sequences as patterns or frequency matrices. In this work, we developed a novel predictor based on Random Forest algorithm (RF) using maximum relevance minimum redundancy (mRMR) method followed by incremental feature selection (IFS). The features of physicochemical/biochemical properties, sequence conservation, residual disorder, amino acid occurrence frequency, secondary structure and solvent accessibility were utilized to represent the peptides concerned. Here, we compared existing prediction tools which are available for predicting possible cleavage sites in candidate substrates with ours. It is shown that our method makes much more reliable predictions in terms of the overall prediction accuracy. In addition, this predictor allows the use of a wide range of proteinases. 相似文献

10.

Distinguishing features of 16S rDNA gene for five dominating bacterial genus observed in bioremediation.

D V Raje H J Purohit R N Singh 《Journal of computational biology》2002,9(6):819-829

Defining a microbial community and identifying bacteria, at least at the genus level, is a first step in predicting the behavior of a microbial community in bioremediation. In biological treatment systems, the most dominating groups observed are Pseudomonas, Moraxella, Acinetobactor, Burkholderia, and Alcaligenes. Our interest lies in identifying the distinguishing features of these bacterial groups based on their 16S rDNA sequence data, which could be used further for generating genus-specific probes. Accordingly, 20 sequences representing different species from each genus above were retrieved, which constituted a training set. A 16-dimensional feature vector comprised of transition probabilities of nucleotides was considered and each sampled sequence was expressed in terms of these features. A stepwise feature selection method was used to identify features that are distinct across the species of these five groups. Wilk's lambda selection criterion was used and resulted in a subset with six distinguishing features. The discriminating efficacy of this subset was tested through multiple group discriminant analysis. Two linear composites, as a function of these features, could discriminate the test set of forty-five sequences from these groups with 95% accuracy, thereby ascertaining the relevance of the identified features. The geometric representation of feature correlation in the reduced discriminant space demonstrated the dominance of identified features in specific groups. These features independently or in combination could be used to generate genus-specific patterns to design probes, so as to develop a tracking tool for the selected group of bacteria. 相似文献

11.

Second order dimensionality reduction using minimum and maximum mutual information models

Fitzgerald JD Rowekamp RJ Sincich LC Sharpee TO 《PLoS computational biology》2011,7(10):e1002249

Conventional methods used to characterize multidimensional neural feature selectivity, such as spike-triggered covariance (STC) or maximally informative dimensions (MID), are limited to Gaussian stimuli or are only able to identify a small number of features due to the curse of dimensionality. To overcome these issues, we propose two new dimensionality reduction methods that use minimum and maximum information models. These methods are information theoretic extensions of STC that can be used with non-Gaussian stimulus distributions to find relevant linear subspaces of arbitrary dimensionality. We compare these new methods to the conventional methods in two ways: with biologically-inspired simulated neurons responding to natural images and with recordings from macaque retinal and thalamic cells responding to naturalistic time-varying stimuli. With non-Gaussian stimuli, the minimum and maximum information methods significantly outperform STC in all cases, whereas MID performs best in the regime of low dimensional feature spaces. 相似文献

12.

基于X线图像的膝关节周围原发性骨肿瘤辅助诊断的机器学习模型研究

下载免费PDF全文

何方舟牛凯唐顺张熠丹谢璐王冀川夏楚藜赵志庆贺志强郭卫《现代生物医学进展》2021,(15):2842-2847

摘要目的：开发机器学习模型,并评估其在膝关节周围原发性骨肿瘤诊断方面的准确性。方法：本文将深度卷积神经网络（DCNN）这一深度学习方法应用于膝关节X线图像的影像组学分析,探讨其辅助诊断膝关节周围原发性骨肿瘤的临床价值。结果：该深度学习模型在区分正常与肿瘤影像方面展现出优异的诊断准确性,使用DCNN模型进行5轮测试的总体准确性为（99.8±0.4）%,而阳性预测值和阴性预测值分别为（100.0±0.0）%和（99.6±0.8）%,各个数据集的曲线下面积（AUC）分别为0.99、1.00、1.00、1.0和1.0,平均AUC为（0.998±0.004）;进一步使用DCNN模型进行了10轮测试显示其在区分良性与恶性骨肿瘤方面的总体准确性为（71.2±1.6）%,且达到了强阳性预测值（91.9±8.5）%,各个数据集的AUC分别为0.63、0.63、0.58、0.69、0.55、0.63、0.54、0.57、0.73、0.63,平均AUC为（0.62±0.06）。结论：本文是首个将人工智能技术应用于骨肿瘤诊断的X线图像影像组学分析方面的研究,人工智能影像组学模型能够帮助医生自动地快速筛查骨肿瘤,确定良性或恶性肿瘤时,阳性预测值较高。相似文献

13.

DLMC-Net: Deeper lightweight multi-class classification model for plant leaf disease detection

《Ecological Informatics》2023

Plant-leaf disease detection is one of the key problems of smart agriculture which has a significant impact on the global economy. To mitigate this, intelligent agricultural solutions are evolving that aid farmer to take preventive measures for improving crop production. With the advancement of deep learning, many convolutional neural network models have blazed their way to the identification of plant-leaf diseases. However, these models are limited to the detection of specific crops only. Therefore, this paper presents a new deeper lightweight convolutional neural network architecture (DLMC-Net) to perform plant leaf disease detection across multiple crops for real-time agricultural applications. In the proposed model, a sequence of collective blocks is introduced along with the passage layer to extract deep features. These benefits in feature propagation and feature reuse, which results in handling the vanishing gradient problem. Moreover, point-wise and separable convolution blocks are employed to reduce the number of trainable parameters. The efficacy of the proposed DLMC-Net model is validated across four publicly available datasets, namely citrus, cucumber, grapes, and tomato. Experimental results of the proposed model are compared against seven state-of-the-art models on eight parameters, namely accuracy, error, precision, recall, sensitivity, specificity, F1-score, and Matthews correlation coefficient. Experiments demonstrate that the proposed model has surpassed all the considered models, even under complex background conditions, with an accuracy of 93.56%, 92.34%, 99.50%, and 96.56% on citrus, cucumber, grapes, and tomato, respectively. Moreover, the proposed DLMC-Net requires only 6.4 million trainable parameters, which is the second best among the compared models. Therefore, it can be asserted that the proposed model is a viable alternative to perform plant leaf disease detection across multiple crops. 相似文献

14.

Feature decomposition architectures for neural networks: algorithms, error bounds, and applications

Wang H Mukhopadhyay S Fang S 《International journal of neural systems》2002,12(1):69-81

In recent years, systems consisting of multiple modular neural networks have attracted substantial interest in the neural networks community because of various advantages they offer over a single large monolithic network. In this paper, we propose two basic feature decomposition models (namely, parallel model and tandem model) in which each of the neural network modules processes a disjoint subset of the input features. A novel feature decomposition algorithm is introduced to partition the input space into disjoint subsets solely based on the available training data. Under certain assumptions, the approximation error due to decomposition can be proved to be bounded by any desired small value over a compact set. Finally, the performance of feature decomposition networks is compared with that of a monolithic network in real world bench mark pattern recognition and modeling problems. 相似文献

15.

Multi-view features fusion for birdsong classification

《Ecological Informatics》2022

As important members of the ecosystem, birds are good monitors of the ecological environment. Bird recognition, especially birdsong recognition, has attracted more and more attention in the field of artificial intelligence. At present, traditional machine learning and deep learning are widely used in birdsong recognition. Deep learning can not only classify and recognize the spectrums of birdsong, but also be used as a feature extractor. Machine learning is often used to classify and recognize the extracted birdsong handcrafted feature parameters. As the data samples of the classifier, the feature of birdsong directly determines the performance of the classifier. Multi-view features from different methods of feature extraction can obtain more perfect information of birdsong. Therefore, aiming at enriching the representational capacity of single feature and getting a better way to combine features, this paper proposes a birdsong classification model based multi-view features, which combines the deep features extracted by convolutional neural network (CNN) and handcrafted features. Firstly, four kinds of handcrafted features are extracted. Those are wavelet transform (WT) spectrum, Hilbert-Huang transform (HHT) spectrum, short-time Fourier transform (STFT) spectrum and Mel-frequency cepstral coefficients (MFCC). Then CNN is used to extract the deep features from WT, HHT and STFT spectrum, and the minimal-redundancy-maximal-relevance (mRMR) to select optimal features. Finally, three classification models (random forest, support vector machine and multi-layer perceptron) are built with the deep features and handcrafted features, and the probability of classification results of the two types of features are fused as the new features to recognize birdsong. Taking sixteen species of birds as research objects, the experimental results show that the three classifiers obtain the accuracy of 95.49%, 96.25% and 96.16% respectively for the features of the proposed method, which are better than the seven single features and three fused features involved in the experiment. This proposed method effectively combines the deep features and handcrafted features from the perspectives of signal. The fused features can more comprehensively express the information of the bird audio itself, and have higher classification accuracy and lower dimension, which can effectively improve the performance of bird audio classification. 相似文献

16.

Deep self-supervised transformation learning for leukocyte classification

Xinwei Chen Guolin Zheng Liwei Zhou Zuoyong Li Haoyi Fan 《Journal of biophotonics》2023,16(3):e202200244

The scarcity of training annotation is one of the major challenges for the application of deep learning technology in medical image analysis. Recently, self-supervised learning provides a powerful solution to alleviate this challenge by extracting useful features from a large number of unlabeled training data. In this article, we propose a simple and effective self-supervised learning method for leukocyte classification by identifying the different transformations of leukocyte images, without requiring a large batch of negative sampling or specialized architectures. Specifically, a convolutional neural network backbone takes different transformations of leukocyte image as input for feature extraction. Then, a pretext task of self-supervised transformation recognition on the extracted feature is conducted by a classifier, which helps the backbone learn useful representations that generalize well across different leukocyte types and datasets. In the experiment, we systematically study the effect of different transformation compositions on useful leukocyte feature extraction. Compared with five typical baselines of self-supervised image classification, experimental results demonstrate that our method performs better in different evaluation protocols including linear evaluation, domain transfer, and finetuning, which proves the effectiveness of the proposed method. 相似文献

17.

A Robust and Accurate Method for Feature Selection and Prioritization from Multi-Class OMICs Data

Vittorio Fortino Pia Kinaret Nanna Fyhrquist Harri Alenius Dario Greco 《PloS one》2014,9(9)

Selecting relevant features is a common task in most OMICs data analysis, where the aim is to identify a small set of key features to be used as biomarkers. To this end, two alternative but equally valid methods are mainly available, namely the univariate (filter) or the multivariate (wrapper) approach. The stability of the selected lists of features is an often neglected but very important requirement. If the same features are selected in multiple independent iterations, they more likely are reliable biomarkers. In this study, we developed and evaluated the performance of a novel method for feature selection and prioritization, aiming at generating robust and stable sets of features with high predictive power. The proposed method uses the fuzzy logic for a first unbiased feature selection and a Random Forest built from conditional inference trees to prioritize the candidate discriminant features. Analyzing several multi-class gene expression microarray data sets, we demonstrate that our technique provides equal or better classification performance and a greater stability as compared to other Random Forest-based feature selection methods. 相似文献

18.

A Comparison of Supervised Machine Learning Algorithms and Feature Vectors for MS Lesion Segmentation Using Multimodal Structural MRI

Elizabeth M. Sweeney Joshua T. Vogelstein Jennifer L. Cuzzocreo Peter A. Calabresi Daniel S. Reich Ciprian M. Crainiceanu Russell T. Shinohara 《PloS one》2014,9(4)

Machine learning is a popular method for mining and analyzing large collections of medical data. We focus on a particular problem from medical research, supervised multiple sclerosis (MS) lesion segmentation in structural magnetic resonance imaging (MRI). We examine the extent to which the choice of machine learning or classification algorithm and feature extraction function impacts the performance of lesion segmentation methods. As quantitative measures derived from structural MRI are important clinical tools for research into the pathophysiology and natural history of MS, the development of automated lesion segmentation methods is an active research field. Yet, little is known about what drives performance of these methods. We evaluate the performance of automated MS lesion segmentation methods, which consist of a supervised classification algorithm composed with a feature extraction function. These feature extraction functions act on the observed T1-weighted (T1-w), T2-weighted (T2-w) and fluid-attenuated inversion recovery (FLAIR) MRI voxel intensities. Each MRI study has a manual lesion segmentation that we use to train and validate the supervised classification algorithms. Our main finding is that the differences in predictive performance are due more to differences in the feature vectors, rather than the machine learning or classification algorithms. Features that incorporate information from neighboring voxels in the brain were found to increase performance substantially. For lesion segmentation, we conclude that it is better to use simple, interpretable, and fast algorithms, such as logistic regression, linear discriminant analysis, and quadratic discriminant analysis, and to develop the features to improve performance. 相似文献

19.

大肠杆菌基因组中重叠基因注释的机器学习优化方法

杜明伦黄君君马香唐燕琼刘柱《中国生物化学与分子生物学报》2018,34(8):861-867

细菌基因组上存在着大量的重叠基因,这不但缩减基因组尺寸,增加对遗传信息的有效利用,而且参与转录及转录后水平的调控。目前重叠基因的形成原因尚不清楚,缺少预测重叠基因是否存在的特征信息,不利于对重叠基因的注释。本研究通过机器学习中的卷积神经网络算法对基因相关区域进行扫描,发现基因编码区前54 bp的区域可以作为判定重叠基因的标记信息,并采用支持向量机算法确证以上预测结果的准确性。通过对卷积神经网络模型的训练与优化,成功构建卷积神经网络模型,并用于大肠杆菌基因组中重叠基因的注释,对重叠基因的研究有重要意义。已训练好的模型和使用方法已经发布于GitHub,具体内容参看以下网址：https://github.com/breadpot/Convolutional_Neural_Network_Bacteria_overlapping_genes_prediction。相似文献

20.

A method to distinguish between lysine acetylation and lysine ubiquitination with feature selection and analysis

You Zhou Ning Zhang Bi-Qing Li 《Journal of biomolecular structure & dynamics》2013,31(11):2479-2490

Lysine acetylation and ubiquitination are two primary post-translational modifications (PTMs) in most eukaryotic proteins. Lysine residues are targets for both types of PTMs, resulting in different cellular roles. With the increasing availability of protein sequences and PTM data, it is challenging to distinguish the two types of PTMs on lysine residues. Experimental approaches are often laborious and time consuming. There is an urgent need for computational tools to distinguish between lysine acetylation and ubiquitination. In this study, we developed a novel method, called DAUFSA (distinguish between lysine acetylation and lysine ubiquitination with feature selection and analysis), to discriminate ubiquitinated and acetylated lysine residues. The method incorporated several types of features: PSSM (position-specific scoring matrix) conservation scores, amino acid factors, secondary structures, solvent accessibilities, and disorder scores. By using the mRMR (maximum relevance minimum redundancy) method and the IFS (incremental feature selection) method, an optimal feature set containing 290 features was selected from all incorporated features. A dagging-based classifier constructed by the optimal features achieved a classification accuracy of 69.53%, with an MCC of .3853. An optimal feature set analysis showed that the PSSM conservation score features and the amino acid factor features were the most important attributes, suggesting differences between acetylation and ubiquitination. Our study results also supported previous findings that different motifs were employed by acetylation and ubiquitination. The feature differences between the two modifications revealed in this study are worthy of experimental validation and further investigation. 相似文献