首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
《Genomics》2020,112(3):2524-2534
The development of embryonic cells involves several continuous stages, and some genes are related to embryogenesis. To date, few studies have systematically investigated changes in gene expression profiles during mammalian embryogenesis. In this study, a computational analysis using machine learning algorithms was performed on the gene expression profiles of mouse embryonic cells at seven stages. First, the profiles were analyzed through a powerful Monte Carlo feature selection method for the generation of a feature list. Second, increment feature selection was applied on the list by incorporating two classification algorithms: support vector machine (SVM) and repeated incremental pruning to produce error reduction (RIPPER). Through SVM, we extracted several latent gene biomarkers, indicating the stages of embryonic cells, and constructed an optimal SVM classifier that produced a nearly perfect classification of embryonic cells. Furthermore, some interesting rules were accessed by the RIPPER algorithm, suggesting different expression patterns for different stages.  相似文献   

2.
Bayesian networks are knowledge representation tools that model the (in)dependency relationships among variables for probabilistic reasoning. Classification with Bayesian networks aims to compute the class with the highest probability given a case. This special kind is referred to as Bayesian network classifiers. Since learning the Bayesian network structure from a dataset can be viewed as an optimization problem, heuristic search algorithms may be applied to build high-quality networks in medium- or large-scale problems, as exhaustive search is often feasible only for small problems. In this paper, we present our new algorithm, ABC-Miner, and propose several extensions to it. ABC-Miner uses ant colony optimization for learning the structure of Bayesian network classifiers. We report extended computational results comparing the performance of our algorithm with eight other classification algorithms, namely six variations of well-known Bayesian network classifiers, cAnt-Miner for discovering classification rules and a support vector machine algorithm.  相似文献   

3.
Four different neural network algorithms, binary adaptive resonance theory (ART1), self-organizing map, learning vector quantization and back-propagation, were compared in the diagnosis of acute appendicitis with different parameter groups. The results show that supervised learning algorithms learning vector quantization and back-propagation were better than unsupervised algorithms in this medical decision making problem. The best results were obtained with the learning vector quantization. The self-organizing map algorithm showed good specificity, but this was in conjunction with lower sensitivity. The best parameter group was found to be the clinical signs. It seems beneficial to design a decision support system which uses these methods in the decision making process.  相似文献   

4.
支持向量机与神经网络的关系研究   总被引:2,自引:0,他引:2  
支持向量机是一种基于统计学习理论的新颖的机器学习方法,由于其出色的学习性能,该技术已成为当前国际机器学习界的研究热点,该方法已经广泛用于解决分类和回归问题.本文将结构风险函数应用于径向基函数网络学习中,同时讨论了支持向量回归模型和径向基函数网络之间的关系.仿真实例表明所给算法提高了径向基函数网络的泛化性能.  相似文献   

5.
A new paradigm of neural network architecture is proposed that works as associative memory along with capabilities of pruning and order-sensitive learning. The network has a composite structure wherein each node of the network is a Hopfield network by itself. The Hopfield network employs an order-sensitive learning technique and converges to user-specified stable states without having any spurious states. This is based on geometrical structure of the network and of the energy function. The network is so designed that it allows pruning in binary order as it progressively carries out associative memory retrieval. The capacity of the network is 2n, where n is the number of basic nodes in the network. The capabilities of the network are demonstrated by experimenting on three different application areas, namely a Library Database, a Protein Structure Database and Natural Language Understanding.  相似文献   

6.
Large-scale artificial neural networks have many redundant structures, making the network fall into the issue of local optimization and extended training time. Moreover, existing neural network topology optimization algorithms have the disadvantage of many calculations and complex network structure modeling. We propose a Dynamic Node-based neural network Structure optimization algorithm (DNS) to handle these issues. DNS consists of two steps: the generation step and the pruning step. In the generation step, the network generates hidden layers layer by layer until accuracy reaches the threshold. Then, the network uses a pruning algorithm based on Hebb’s rule or Pearson’s correlation for adaptation in the pruning step. In addition, we combine genetic algorithm to optimize DNS (GA-DNS). Experimental results show that compared with traditional neural network topology optimization algorithms, GA-DNS can generate neural networks with higher construction efficiency, lower structure complexity, and higher classification accuracy.  相似文献   

7.

Background  

Cancer diagnosis and clinical outcome prediction are among the most important emerging applications of gene expression microarray technology with several molecular signatures on their way toward clinical deployment. Use of the most accurate classification algorithms available for microarray gene expression data is a critical ingredient in order to develop the best possible molecular signatures for patient care. As suggested by a large body of literature to date, support vector machines can be considered "best of class" algorithms for classification of such data. Recent work, however, suggests that random forest classifiers may outperform support vector machines in this domain.  相似文献   

8.
Few studies have looked at the potential of using diffusion tensor imaging (DTI) in conjunction with machine learning algorithms in order to automate the classification of healthy older subjects and subjects with mild cognitive impairment (MCI). Here we apply DTI to 40 healthy older subjects and 33 MCI subjects in order to derive values for multiple indices of diffusion within the white matter voxels of each subject. DTI measures were then used together with support vector machines (SVMs) to classify control and MCI subjects. Greater than 90% sensitivity and specificity was achieved using this method, demonstrating the potential of a joint DTI and SVM pipeline for fast, objective classification of healthy older and MCI subjects. Such tools may be useful for large scale drug trials in Alzheimer's disease where the early identification of subjects with MCI is critical.  相似文献   

9.
表面肌电信号(Surface Electromyography,sEMG)是通过相应肌群表面的传感器记录下来的一维时间序列非平稳生物电信号,不但反映了神经肌肉系统活动,对于反映相应动作肢体活动信息同样重要。而模式识别是肌电应用领域的基础和关键。为了在应用基于表面肌电信号模式识别中选取合适算法,本文拟对基于表面肌电信号的人体动作识别算法进行回顾分析,主要包括模糊模式识别算法、线性判别分析算法、人工神经网络算法和支持向量机算法。模糊模式识别能自适应提取模糊规则,对初始化规则不敏感,适合处理s EMG这样具有严格不重复的生物电信号;线性判别分析对数据进行降维,计算简单,但不适合大数据;人工神经网络可以同时描述训练样本输入输出的线性关系和非线性映射关系,可以解决复杂的分类问题,学习能力强;支持向量机处理小样本、非线性的高维数据优势明显,计算速度快。比较各方法的优缺点,为今后处理此类问题模式识别算法选取提供了参考和依据。  相似文献   

10.
Microarray data classification using automatic SVM kernel selection   总被引:1,自引:0,他引:1  
Nahar J  Ali S  Chen YP 《DNA and cell biology》2007,26(10):707-712
Microarray data classification is one of the most important emerging clinical applications in the medical community. Machine learning algorithms are most frequently used to complete this task. We selected one of the state-of-the-art kernel-based algorithms, the support vector machine (SVM), to classify microarray data. As a large number of kernels are available, a significant research question is what is the best kernel for patient diagnosis based on microarray data classification using SVM? We first suggest three solutions based on data visualization and quantitative measures. Different types of microarray problems then test the proposed solutions. Finally, we found that the rule-based approach is most useful for automatic kernel selection for SVM to classify microarray data.  相似文献   

11.
12.
MOTIVATION: Apoptosis has drawn the attention of researchers because of its importance in treating some diseases through finding a proper way to block or slow down the apoptosis process. Having understood that caspase cleavage is the key to apoptosis, we find novel methods or algorithms are essential for studying the specificity of caspase cleavage activity and this helps the effective drug design. As bio-basis function neural networks have proven to outperform some conventional neural learning algorithms, there is a motivation, in this study, to investigate the application of bio-basis function neural networks for the prediction of caspase cleavage sites. RESULTS: Thirteen protein sequences with experimentally determined caspase cleavage sites were downloaded from NCBI. Bayesian bio-basis function neural networks are investigated and the comparisons with single-layer perceptrons, multilayer perceptrons, the original bio-basis function neural networks and support vector machines are given. The impact of the sliding window size used to generate sub-sequences for modelling on prediction accuracy is studied. The results show that the Bayesian bio-basis function neural network with two Gaussian distributions for model parameters (weights) performed the best and the highest prediction accuracy is 97.15 +/- 1.13%. AVAILABILITY: The package of Bayesian bio-basis function neural network can be obtained by request to the author.  相似文献   

13.
《IRBM》2020,41(4):229-239
Feature selection algorithms are the cornerstone of machine learning. By increasing the properties of the samples and samples, the feature selection algorithm selects the significant features. The general name of the methods that perform this function is the feature selection algorithm. The general purpose of feature selection algorithms is to select the most relevant properties of data classes and to increase the classification performance. Thus, we can select features based on their classification performance. In this study, we have developed a feature selection algorithm based on decision support vectors classification performance. The method can work according to two different selection criteria. We tested the classification performances of the features selected with P-Score with three different classifiers. Besides, we assessed P-Score performance with 13 feature selection algorithms in the literature. According to the results of the study, the P-Score feature selection algorithm has been determined as a method which can be used in the field of machine learning.  相似文献   

14.
This paper presents a pruning method for artificial neural networks (ANNs) based on the 'Lempel-Ziv complexity' (LZC) measure. We call this method the 'silent pruning algorithm' (SPA). The term 'silent' is used in the sense that SPA prunes ANNs without causing much disturbance during the network training. SPA prunes hidden units during the training process according to their ranks computed from LZC. LZC extracts the number of unique patterns in a time sequence obtained from the output of a hidden unit and a smaller value of LZC indicates higher redundancy of a hidden unit. SPA has a great resemblance to biological brains since it encourages higher complexity during the training process. SPA is similar to, yet different from, existing pruning algorithms. The algorithm has been tested on a number of challenging benchmark problems in machine learning, including cancer, diabetes, heart, card, iris, glass, thyroid, and hepatitis problems. We compared SPA with other pruning algorithms and we found that SPA is better than the 'random deletion algorithm' (RDA) which prunes hidden units randomly. Our experimental results show that SPA can simplify ANNs with good generalization ability.  相似文献   

15.
16.
目前,基于计算机数学方法对基因的功能注释已成为热点及挑战,其中以机器学习方法应用最为广泛。生物信息学家不断提出有效、快速、准确的机器学习方法用于基因功能的注释,极大促进了生物医学的发展。本文就关于机器学习方法在基因功能注释的应用与进展作一综述。主要介绍几种常用的方法,包括支持向量机、k近邻算法、决策树、随机森林、神经网络、马尔科夫随机场、logistic回归、聚类算法和贝叶斯分类器,并对目前机器学习方法应用于基因功能注释时如何选择数据源、如何改进算法以及如何提高预测性能上进行讨论。  相似文献   

17.
In this paper, a novel efficient learning algorithm towards self-generating fuzzy neural network (SGFNN) is proposed based on ellipsoidal basis function (EBF) and is functionally equivalent to a Takagi-Sugeno-Kang (TSK) fuzzy system. The proposed algorithm is simple and efficient and is able to generate a fuzzy neural network with high accuracy and compact structure. The structure learning algorithm of the proposed SGFNN combines criteria of fuzzy-rule generation with a pruning technology. The Kalman filter (KF) algorithm is used to adjust the consequent parameters of the SGFNN. The SGFNN is employed in a wide range of applications ranging from function approximation and nonlinear system identification to chaotic time-series prediction problem and real-world fuel consumption prediction problem. Simulation results and comparative studies with other algorithms demonstrate that a more compact architecture with high performance can be obtained by the proposed algorithm. In particular, this paper presents an adaptive modeling and control scheme for drug delivery system based on the proposed SGFNN. Simulation study demonstrates the ability of the proposed approach for estimating the drug's effect and regulating blood pressure at a prescribed level.  相似文献   

18.
Trained radial basis function networks are well-suited for use in extracting rules and explanations because they contain a set of locally tuned units. However, for rule extraction to be useful, these networks must first be pruned to eliminate unnecessary weights. The pruning algorithm cannot search the network exhaustively because of the computational effort involved. It is shown that using multiple pruning methods with smart ordering of the pruning candidates, the number of weights in a radial basis function network can be reduced to a small fraction of the original number. The complexity of the pruning algorithm is quadratic (instead of exponential) in the number of network weights. Pruning performance is shown using a variety of benchmark problems from the University of California, Irvine machine learning database.  相似文献   

19.
Tandem mass spectrometry (MS/MS) combined with protein database searching has been widely used in protein identification. A validation procedure is generally required to reduce the number of false positives. Advanced tools using statistical and machine learning approaches may provide faster and more accurate validation than manual inspection and empirical filtering criteria. In this study, we use two feature selection algorithms based on random forest and support vector machine to identify peptide properties that can be used to improve validation models. We demonstrate that an improved model based on an optimized set of features reduces the number of false positives by 58% relative to the model which used only search engine scores, at the same sensitivity score of 0.8. In addition, we develop classification models based on the physicochemical properties and protein sequence environment of these peptides without using search engine scores. The performance of the best model based on the support vector machine algorithm is at 0.8 AUC, 0.78 accuracy, and 0.7 specificity, suggesting a reasonably accurate classification. The identified properties important to fragmentation and ionization can be either used in independent validation tools or incorporated into peptide sequencing and database search algorithms to improve existing software programs.  相似文献   

20.
Mark A. Hallen 《Proteins》2019,87(1):62-73
Protein design algorithms must search an enormous conformational space to identify favorable conformations. As a result, those that perform this search with guarantees of accuracy generally start with a conformational pruning step, such as dead-end elimination (DEE). However, the mathematical assumptions of DEE-based pruning algorithms have up to now severely restricted the biophysical model that can feasibly be used in protein design. To lift these restrictions, I propose to prune local unrealistic geometries (PLUG) using a linear programming-based method. PLUG's biophysical model consists only of well-known lower bounds on interatomic distances. PLUG is intended as preprocessing for energy-based protein design calculations, whose biophysical model need not support DEE pruning. Based on 96 test cases, PLUG is at least as effective at pruning as DEE for larger protein designs—the type that most require pruning. When combined with the LUTE protein design algorithm, PLUG greatly facilitates designs that account for continuous entropy, large multistate designs with continuous flexibility, and designs with extensive continuous backbone flexibility and advanced nonpairwise energy functions. Many of these designs are tractable only with PLUG, either for empirical reasons (LUTE's machine learning step achieves an accurate fit only after PLUG pruning), or for theoretical reasons (many energy functions are fundamentally incompatible with DEE).  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号