Similar literature
20 similar articles found.
1.
Recent progress in bioinformatics research has led to the accumulation of huge quantities of biological data at various data sources. DNA microarray technology makes it possible to analyze a large number of genes simultaneously across different samples. Clustering of microarray data can reveal hidden gene expression patterns in large quantities of expression data, which in turn offers tremendous possibilities in functional genomics, comparative genomics, disease diagnosis and drug development. The k-means clustering algorithm is widely used in many practical applications, but the original k-means algorithm has several drawbacks: it is computationally expensive, and it generates locally optimal solutions that depend on the random choice of the initial centroids. Several methods have been proposed in the literature for improving the performance of the k-means algorithm. Harmony search, a meta-heuristic optimization algorithm, helps find near-global optimal solutions by searching the entire solution space. The low clustering accuracy of existing algorithms limits their use in many crucial applications in the life sciences. In this paper we propose a novel Harmony Search-K means Hybrid (HSKH) algorithm for clustering gene expression data. Experimental results show that the proposed algorithm produces clusters with better accuracy than the existing algorithms.
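The dependence of k-means on its initial centroids, noted in the abstract above, can be illustrated with a minimal sketch. This is plain Lloyd's k-means with optional random restarts, not the HSKH algorithm; the function names (`kmeans`, `kmeans_restarts`) and the four-point toy data set are assumptions made for illustration only.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(cluster):
    """Coordinate-wise mean of a non-empty list of tuples."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))

def kmeans(points, centroids, max_iter=100):
    """Lloyd's k-means from explicit initial centroids; returns (centroids, sse)."""
    centroids = [tuple(c) for c in centroids]
    k = len(centroids)
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        updated = [mean(c) if c else centroids[j] for j, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    sse = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, sse

def kmeans_restarts(points, k, n_restarts=10, seed=0):
    """Hypothetical helper: keep the best of several random initialisations."""
    rng = random.Random(seed)
    runs = (kmeans(points, rng.sample(points, k)) for _ in range(n_restarts))
    return min(runs, key=lambda run: run[1])

# Toy data: two tight pairs of points far apart along the x axis.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
_, sse_bad = kmeans(points, [(0, 0), (0, 1)])    # both seeds on the left: sse 100.0
_, sse_good = kmeans(points, [(0, 0), (10, 0)])  # one seed per side: sse 1.0
```

With both initial centroids on the same side, the run converges to a poor local optimum; with one on each side it reaches the best partition. Random restarts are one simple remedy, whereas the abstract proposes a harmony search hybrid whose details are not given here.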

2.
In this work, a new plant-inspired optimization algorithm, hybrid artificial root foraging optimization (HARFO), is proposed, which mimics iterative root foraging behaviors for complex optimization. In the HARFO model, two innovative strategies are developed. The first is a root-to-root communication strategy, which enables individuals to exchange information with each other over different efficient topologies and can essentially improve exploration ability. The second is a co-evolution strategy, which structures a hierarchical spatial population driven by the evolutionary pressure of multiple sub-populations, ensuring that the diversity of the root population is well maintained. The proposed algorithm is benchmarked against four classical evolutionary algorithms on well-designed test function suites including both classical and composition test functions. Rigorous performance analysis of all these tests highlights a significant performance improvement, and the comparative results show the superiority of the proposed algorithm.

3.
An improved algorithm for clustering gene expression data   (total citations: 1; self-citations: 0; citations by others: 1)
MOTIVATION: Recent advancements in microarray technology allow simultaneous monitoring of the expression levels of a large number of genes over different time points. Clustering is an important tool for analyzing such microarray data, typical properties of which are its inherent uncertainty, noise and imprecision. In this article, a two-stage clustering algorithm, which employs a recently proposed variable string length genetic scheme and a multiobjective genetic clustering algorithm, is proposed. It is based on the novel concept of points having significant membership to multiple classes. An iterated version of the well-known Fuzzy C-Means is also utilized for clustering. RESULTS: The significant superiority of the proposed two-stage clustering algorithm over the average linkage method, Self Organizing Map (SOM) and a recently developed weighted Chinese restaurant-based clustering method (CRC), all widely used methods for clustering gene expression data, is established on a variety of artificial and publicly available real-life data sets. The biological relevance of the clustering solutions is also analyzed.

4.
Aim: The aim of this study is to construct and evaluate Pseudo-CT images (P-CTs) for electron density calculation to facilitate external radiotherapy treatment planning. Background: Despite numerous benefits, a computed tomography (CT) scan does not provide accurate information on soft tissue contrast, which often makes it difficult to precisely differentiate target tissues from the organs at risk and determine the tumor volume. MRI can therefore reduce the variability of results when registered with a CT scan. Materials and methods: In this research, a fuzzy clustering algorithm was used to segment images into different tissues, and linear regression methods were used to design a regression model based on the feature extraction method and the brightness intensity values. The results of the proposed algorithm for the dose-volume histogram (DVH), isodose curves, and gamma analysis were investigated using the RayPlan treatment planning system and VeriSoft software. Furthermore, various statistical indices such as Mean Absolute Error (MAE), Mean Error (ME), and Structural Similarity Index (SSIM) were calculated. Results: An MAE in the range of 45–55 was obtained with the proposed methods. The relative difference error between the PTV region of the CT and the Pseudo-CT was 0.5, and the best gamma rate was 95.4%, based on the polar coordinate feature and the proposed polynomial regression model. Conclusion: The proposed method could support the generation of P-CT data for different parts of the brain region from a collection of MRI series with an acceptable average error rate under different evaluation criteria.

5.
Bioluminescence tomography (BLT) is a preclinical imaging modality for locating and quantifying internal bioluminescent sources from surface measurements, and it has experienced rapid growth in the last 10 years. However, multiple‐source resolving remains a challenging issue in BLT. In this study, it is treated as an unsupervised pattern recognition problem based on the reconstruction result, and a novel hybrid clustering algorithm combining the advantages of affinity propagation (AP) and K‐means is developed to identify multiple sources automatically. Moreover, we incorporate the clustering analysis into a general multiple‐source reconstruction framework, which can provide stable reconstruction and accurate resolving results without requiring the number of targets to be specified. Numerical simulations and in vivo experiments on a 4T1‐luc2 mouse model were conducted to assess the performance of the proposed method in multiple‐source resolving. The encouraging results demonstrate the significant effectiveness and potential of our method in preclinical BLT applications.

6.
In heterogeneous distributed computing systems such as cloud computing, mapping tasks to resources is a major issue that can strongly affect system performance. Because of heterogeneous and dynamic features and the dependencies among requests, task scheduling is known to be an NP-complete problem. In this paper, we propose a hybrid heuristic method (HSGA), based on a genetic algorithm, to find a suitable schedule for a workflow graph; it aims to obtain a response quickly while also optimizing makespan, load balancing on resources and speedup ratio. First, the HSGA algorithm prioritizes the tasks in a complex graph according to their impact on other tasks, based on the graph topology. This technique is effective in reducing the completion time of the application. Then, it merges the Best-Fit and Round Robin methods to build an optimal initial population so that a good solution is obtained quickly, and applies suitable operations such as mutation to control and steer the algorithm toward an optimized solution. The algorithm evaluates solutions by considering relevant parameters of the cloud environment. Finally, the proposed algorithm gives better results than the other studied algorithms as the number of tasks in the application graph increases.

7.
This paper proposes a novel artificial bee colony algorithm with dynamic population (ABC-DP), which incorporates the idea of an extended life-cycle evolving model to balance the tradeoff between exploration and exploitation. The proposed ABC-DP is a more realistic bee-colony model in which bees can reproduce and die dynamically throughout the foraging process, so the population size varies as the algorithm runs. ABC-DP is then used to solve the optimal power flow (OPF) problem in power systems, with cost, loss, and emission impacts as the objective functions. The IEEE 30-bus test system is used to illustrate the application of the proposed algorithm. The simulation results, which are also compared with those of the nondominated sorting genetic algorithm II (NSGA-II) and multi-objective ABC (MOABC), illustrate the effectiveness and robustness of the proposed method.

8.
Peng Bo, Li Lei. Cognitive neurodynamics, 2015, 9(2): 249-256
Wireless sensor networks (WSNs) are widely used in many applications. A WSN is a decentralized wireless network composed of nodes that autonomously set up a network. Node localization, that is, determining the position of a node in the network, is an essential part of many sensor network operations and applications. Existing localization algorithms can be classified into two categories: range-based and range-free. Range-based localization algorithms place requirements on hardware and are therefore expensive to implement in practice. Range-free localization algorithms reduce the hardware cost. Because of the hardware limitations of WSN devices, range-free solutions are being pursued as a cost-effective alternative to the more expensive range-based approaches. However, these techniques usually have higher localization error than the range-based algorithms. DV-Hop is a typical range-free localization algorithm that uses hop-distance estimation. In this paper, we propose an improved DV-Hop algorithm based on a genetic algorithm. Simulation results show that our proposed algorithm improves localization accuracy compared with previous algorithms.
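The hop-distance estimation used by baseline DV-Hop can be sketched as follows. This covers only the standard DV-Hop step (average hop size per anchor, then distance as hop size times hop count); the genetic algorithm refinement proposed in the abstract is not specified there and is not shown. The function names, the dictionary layout and the three-anchor example are assumptions for illustration.

```python
import math

def hop_size(anchor, anchors, hops):
    """Average per-hop distance seen by one anchor.

    anchors: {name: (x, y)} with known anchor positions.
    hops[a][b]: minimum hop count between anchors a and b.
    """
    others = [b for b in anchors if b != anchor]
    ax, ay = anchors[anchor]
    total_dist = sum(math.hypot(ax - anchors[b][0], ay - anchors[b][1])
                     for b in others)
    total_hops = sum(hops[anchor][b] for b in others)
    return total_dist / total_hops

def estimate_distances(node_hops, anchors, hops):
    """Estimate distances from an unknown node to every anchor.

    node_hops: {anchor name: hop count from the unknown node}.
    The node adopts the hop size of its nearest anchor (fewest hops).
    """
    nearest = min(node_hops, key=node_hops.get)
    size = hop_size(nearest, anchors, hops)
    return {a: size * h for a, h in node_hops.items()}

# Hypothetical layout: three anchors forming a 30-40-50 right triangle.
anchors = {"A": (0, 0), "B": (30, 0), "C": (0, 40)}
hops = {"A": {"B": 3, "C": 4}, "B": {"A": 3, "C": 5}, "C": {"A": 4, "B": 5}}
estimates = estimate_distances({"A": 1, "B": 2, "C": 4}, anchors, hops)
```

The estimated distances would then feed a position solver such as multilateration; an improved variant might instead search for the position that best fits them, but that design is left open here.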

9.
Data clustering is commonly employed in many disciplines. The aim of clustering is to partition a set of data into clusters such that objects within the same cluster are similar to one another and dissimilar to objects in other clusters. Over the past decade, evolutionary algorithms have been commonly used to solve clustering problems. This study presents a novel algorithm based on simplified swarm optimization, an emerging population-based stochastic optimization approach with the advantages of simplicity, efficiency, and flexibility. The approach combines variable vibrating search (VVS) and a rapid centralized strategy (RCS) to deal with the clustering problem. VVS is an exploitation search scheme that refines the quality of solutions by searching the extreme points near the global best position. RCS is developed to accelerate the convergence rate of the algorithm by using the arithmetic average. To evaluate the performance of the proposed algorithm empirically, experiments are conducted on 12 benchmark datasets, and the results are compared with those of recent works. Statistical analysis indicates that the proposed algorithm is competitive in terms of solution quality.

10.
Chang Luyao, Li Fan, Niu Xinzheng, Zhu Jiahui. Cluster computing, 2022, 25(4): 3005-3017

To collect data more effectively while balancing energy consumption, wireless sensor networks (WSNs) need to be divided into clusters. Dividing the network into clusters gives it a hierarchical organizational structure, which balances the network load and prolongs the life cycle of the system. In a clustering routing algorithm, the quality of the clustering algorithm directly affects the result of cluster division. This paper addresses the defects of random cluster head election and uneven clustering in the traditional LEACH protocol clustering algorithm for WSNs by proposing an algorithm that selects cluster heads based on node distribution density and then allocates the remaining nodes. Experiments show that the algorithm achieves rapid selection of cluster heads and division of clusters, which is effective for node clustering and helps equalize energy consumption.


11.
Phylogeny reconstruction is a difficult computational problem, because the number of possible solutions increases with the number of included taxa. For example, for only 14 taxa, there are more than seven trillion possible unrooted phylogenetic trees. For this reason, phylogenetic inference methods commonly use clustering algorithms (e.g., the neighbor-joining method) or heuristic search strategies to minimize the amount of time spent evaluating nonoptimal trees. Even heuristic searches can be painfully slow, especially when computationally intensive optimality criteria such as maximum likelihood are used. I describe here a different approach to heuristic searching (using a genetic algorithm) that can tremendously reduce the time required for maximum-likelihood phylogenetic inference, especially for data sets involving large numbers of taxa. Genetic algorithms are simulations of natural selection in which individuals are encoded solutions to the problem of interest. Here, labeled phylogenetic trees are the individuals, and differential reproduction is effected by allowing the number of offspring produced by each individual to be proportional to that individual's rank likelihood score. Natural selection increases the average likelihood in the evolving population of phylogenetic trees, and the genetic algorithm is allowed to proceed until the likelihood of the best individual ceases to improve over time. An example is presented involving rbcL sequence data for 55 taxa of green plants. The genetic algorithm described here required only 6% of the computational effort required by a conventional heuristic search using tree bisection/reconnection (TBR) branch swapping to obtain the same maximum-likelihood topology.
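Rank-proportional reproduction, as described in the abstract above, might look like the following sketch. The abstract does not give the exact ranking scheme, so linear ranks (1 for the worst tree, N for the best) are an assumption, as are the function names and the placeholder tree labels; mutation and recombination of trees are omitted.

```python
import random

def rank_weights(log_likelihoods):
    """Selection probabilities proportional to rank (worst = 1, best = N)."""
    n = len(log_likelihoods)
    # Indices ordered from lowest (worst) to highest (best) log-likelihood.
    order = sorted(range(n), key=lambda i: log_likelihoods[i])
    ranks = [0] * n
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    total = n * (n + 1) // 2  # sum of ranks 1..N
    return [r / total for r in ranks]

def next_generation(population, log_likelihoods, rng):
    """Draw parents with replacement so that the expected number of
    offspring per individual is proportional to its rank."""
    weights = rank_weights(log_likelihoods)
    return rng.choices(population, weights=weights, k=len(population))

# Hypothetical population of three trees with their log-likelihood scores.
trees = ["tree_a", "tree_b", "tree_c"]
scores = [-120.0, -100.0, -110.0]
weights = rank_weights(scores)  # ranks 1, 3, 2 out of a rank sum of 6
offspring = next_generation(trees, scores, random.Random(1))
```

Selecting on rank rather than on the raw likelihood keeps selection pressure independent of the scale of the scores, which is one common reason for choosing a rank-based scheme.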

12.
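The k-means clustering algorithm is an iterative relocation algorithm widely used in cluster analysis of gene expression data. It usually represents the relationships between genes with a distance measure, which cannot effectively reflect the interdependence among genes. To address this, a k-modes clustering algorithm based on information theory is proposed, which overcomes this drawback. In addition, a pseudo-F statistic is introduced: on the one hand, it allows points that partially overlap in space to be classified effectively; on the other hand, it gives the optimal number of clusters, thereby compensating for a shortcoming of the k-modes clustering method. Together these make it a very effective algorithm that achieves better clustering results.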

13.
Photosynthesis response to carbon dioxide concentration can provide data on a number of important parameters related to leaf physiology. The genetic algorithm (GA), a robust stochastic evolutionary computational algorithm inspired by both natural selection and natural genetics, is proposed to simultaneously estimate the parameters of the photosynthesis models presented by Farquhar, von Caemmerer and Berry, and by Ethier and Livingston. These parameters include the maximum carboxylation rate allowed by ribulose 1,5-bisphosphate carboxylase/oxygenase (Rubisco) (Vcmax), the potential light-saturated electron transport rate (Jmax), triose-phosphate utilization (TPU), leaf dark respiration in the light (Rd) and mesophyll conductance (gm). The results show that, by properly constraining the parameter bounds, the GA-based estimation methods can effectively and efficiently obtain globally (or at least near-globally) optimal solutions, which are as good as or better than those obtained by the non-linear curve fitting methods used in previous studies. More complicated problems, such as taking the variation of gm in response to CO2 into account, can be easily formulated and solved by using GA. The influence of the crossover probability (Pc), mutation probability (Pm), population size and number of generations on the performance of GA was also investigated.

14.
This paper describes a novel evolutionary algorithm for regulatory motif discovery in DNA promoter sequences. The algorithm uses data clustering to logically distribute the evolving population across the search space. Mating then takes place within local regions of the population, promoting overall solution diversity and encouraging discovery of multiple solutions. Experiments using synthetic data sets have demonstrated the algorithm's capacity to find position frequency matrix models of known regulatory motifs in relatively long promoter sequences. These experiments have also shown the algorithm's ability to maintain diversity during search and discover multiple motifs within a single population. The utility of the algorithm for discovering motifs in real biological data is demonstrated by its ability to find meaningful motifs within muscle-specific regulatory sequences.

15.
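Protein energy minimization is an important part of protein folding. A new hybrid evolutionary algorithm for protein folding combines crossover with Cauchy mutation. Examples of protein energy minimization based on the toy model show that the new hybrid evolutionary algorithm is effective.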

16.
The prediction of protein side-chain conformation is central to understanding protein functions. Side-chain packing is a sub-problem of protein folding, and its computational complexity has been shown to be NP-hard. We investigated the capabilities of a hybrid (genetic algorithm/simulated annealing) technique for side-chain packing and for the generation of an ensemble of low-energy side-chain conformations. Our method first relies on obtaining a near-optimal low-energy protein conformation by optimizing its amino-acid side-chains. Upon convergence, the genetic algorithm is allowed to undergo forward and "backward" evolution by alternating selection pressures between minimal and higher energy setpoints. We show that this technique is very efficient for obtaining distributions of solutions centered at any desired energy from the minimum. We outline the general concepts of our evolutionary sampling methodology using three different alternating selective pressure schemes. The quality of the method was assessed by using it for protein pKa prediction.

17.
The paper presents a new approach to medical image segmentation. Exudates are a visible sign of diabetic retinopathy, which is the major cause of vision loss in patients with diabetes. If the exudates extend into the macular area, blindness may occur. Automated detection of exudates will assist ophthalmologists in early diagnosis. The segmentation process includes a new mechanism for clustering the elements of high-resolution images in order to improve precision and reduce computation time. The system applies K-means clustering to image segmentation after it has been optimized by the Pillar algorithm; pillars are constructed in such a way that they can withstand the pressure. The improved pillar algorithm can optimize K-means clustering for image segmentation in terms of precision and computation time. The proposed approach is evaluated by comparing it with K-means and Fuzzy C-means on a medical image. Using this method, identification of dark spots in the retina becomes easier. The proposed algorithm is applied to diabetic retinal images of all stages to identify hard and soft exudates, whereas the existing pillar K-means is more appropriate for brain MRI images. The proposed system helps doctors identify the problem at an early stage and suggest a better drug for preventing further retinal damage.

18.
Microarray-CGH (comparative genomic hybridization) experiments are used to detect and map chromosomal imbalances. A CGH profile can be viewed as a succession of segments that represent homogeneous regions in the genome whose representative sequences share the same relative copy number on average. Segmentation methods constitute a natural framework for the analysis, but they do not provide a biological status for the detected segments. We propose a new model for this segmentation/clustering problem, combining a segmentation model with a mixture model. We present a new hybrid algorithm called dynamic programming-expectation maximization (DP-EM) to estimate the parameters of the model by maximum likelihood. This algorithm combines DP and the EM algorithm. We also propose a model selection heuristic to select the number of clusters and the number of segments. An example of our procedure is presented, based on publicly available data sets. We compare our method to segmentation methods and to hidden Markov models, and we show that the new segmentation/clustering model is a promising alternative that can be applied in the more general context of signal processing.

19.
20.
We propose a new particle swarm optimization algorithm for problems where objective functions are subject to zero-mean, independent, and identically distributed stochastic noise. While particle swarm optimization has been successfully applied to solve many complex deterministic nonlinear optimization problems, straightforward applications of particle swarm optimization to noisy optimization problems are subject to failure because the noise in objective function values can lead the algorithm to incorrectly identify positions as the global/personal best positions. Instead of having the entire swarm follow a global best position based on the sample average of objective function values, the proposed new algorithm works with a set of statistically global best positions that include one or more positions with objective function values that are statistically equivalent, which is achieved using a combination of statistical subset selection and clustering analysis. The new PSO algorithm can be seamlessly integrated with adaptive resampling procedures to enhance the capability of PSO to cope with noisy objective functions. Numerical experiments demonstrate that the new algorithm is able to consistently find better solutions than the canonical particle swarm optimization algorithm in the presence of stochastic noise in objective function values with different resampling procedures.
