Similar literature
 20 similar articles found
1.
Merz P  Katayama K 《Bio Systems》2004,78(1-3):99-118
This paper presents a memetic algorithm, a highly effective evolutionary algorithm incorporating local search, for solving the unconstrained binary quadratic programming problem (BQP). To justify the approach, a fitness landscape analysis is conducted experimentally for several instances of the BQP. The results of the analysis show that recombination-based variation operators are well suited for evolutionary algorithms with local search. Therefore, the proposed approach includes, besides a highly effective randomized k-opt local search, a new variation operator that has been tailored specially for application in the hybrid evolutionary framework. The operator is called innovative variation and is fundamentally different from traditional crossover operators, since new genetic material is included in the offspring that is not contained in either of the parents. The evolutionary heuristic is tested on 35 publicly available BQP instances, and it is shown experimentally that the algorithm is capable of finding best-known solutions to large BQPs in a short time and with a high frequency. In comparison to other approaches for the BQP, the approach appears to be much more effective, particularly for large instances of 1000 or 2500 binary variables.
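As a rough illustration of the local-search component, the sketch below applies a plain 1-flip hill climber to a small BQP instance, maximizing x^T Q x over binary vectors. It is a simplified stand-in, not the randomized k-opt procedure or the innovative variation operator of the paper; the function names and the 2x2 matrix are assumptions made for the example.

```python
def bqp_value(Q, x):
    """Objective x^T Q x for a binary vector x and a square matrix Q."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def one_flip_local_search(Q, x):
    """Flip single bits while any flip strictly improves the objective.

    Simplified illustration only (assumption): the paper uses a randomized
    k-opt local search, which changes several bits per move.
    """
    x = list(x)
    best = bqp_value(Q, x)
    improved = True
    while improved:
        improved = False
        for i in range(len(x)):
            x[i] ^= 1                      # tentatively flip bit i
            value = bqp_value(Q, x)
            if value > best:
                best = value               # keep the improving flip
                improved = True
            else:
                x[i] ^= 1                  # undo a non-improving flip
    return x, best


# Hypothetical toy instance, not taken from the benchmark set.
Q_example = [[1, -2],
             [-2, 1]]
solution, value = one_flip_local_search(Q_example, [1, 1])
```

In a memetic algorithm, a routine of this kind would be applied to each offspring after variation, so that the population consists of local optima only.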

2.
Nonlinear system modelling via optimal design of neural trees   (Cited by: 1; self-citations: 0; citations by others: 1)
This paper introduces a flexible neural tree model. The model is computed as a flexible multi-layer feed-forward neural network. A hybrid learning/evolutionary approach to automatically optimize the neural tree model is also proposed. The approach includes a modified probabilistic incremental program evolution algorithm (MPIPE) to evolve and determine an optimal structure of the neural tree and a parameter learning algorithm to optimize the free parameters embedded in the neural tree. The performance and effectiveness of the proposed method are evaluated using function approximation, time series prediction and system identification problems and compared with the related methods.

3.
As the ultimate source of genetic variation, spontaneous mutation is essential to evolutionary change. Theoretical studies over several decades have revealed the dependence of evolutionary consequences of mutation on specific mutational properties, including genomic mutation rates, U, and the effects of newly arising mutations on individual fitness, s. The recent resurgence of empirical effort to infer these properties for diverse organisms has not achieved consensus. Estimates, which have been obtained by methods that assume mutations are unidirectional in their effects on fitness, are imprecise. Both because a general approach must allow for occurrence of fitness-enhancing mutations, even if these are rare, and because recent evidence demands it, we present a new method for inferring mutational parameters. For the distribution of mutational effects, we retain Keightley's assumption of the gamma distribution, to take advantage of the flexibility of its shape. Because the conventional gamma is one sided, restricting it to unidirectional effects, we include an additional parameter, rho, as the amount by which it is displaced from zero. Estimation is accomplished by Markov chain Monte Carlo maximum likelihood. Through a limited set of simulations, we verify the accuracy of this approach. We apply it to analyze data on two reproductive fitness components from a 17-generation mutation-accumulation study of a Columbia accession of Arabidopsis thaliana in which 40 lines sampled in three generations were assayed simultaneously. For these traits, U ≈ 0.1-0.2, with distributions of mutational effects broadly spanning zero, such that roughly half the mutations reduce reproductive fitness. One evolutionary consequence of these results is lower extinction risks of small populations of A. thaliana than expected from the process of mutational meltdown. A comprehensive view of the evolutionary consequences of mutation will depend on quantitatively accounting for fitness-enhancing, as well as fitness-reducing, mutations.

4.
Evolutionary developmental biology and the problem of variation   (Cited by: 11; self-citations: 0; citations by others: 11)
Abstract. One of the oldest problems in evolutionary biology remains largely unsolved. Which mutations generate evolutionarily relevant phenotypic variation? What kinds of molecular changes do they entail? What are the phenotypic magnitudes, frequencies of origin, and pleiotropic effects of such mutations? How is the genome constructed to allow the observed abundance of phenotypic diversity? Historically, the neo‐Darwinian synthesizers stressed the predominance of micromutations in evolution, whereas others noted the similarities between some dramatic mutations and evolutionary transitions to argue for macromutationism. Arguments on both sides have been biased by misconceptions of the developmental effects of mutations. For example, the traditional view that mutations of important developmental genes always have large pleiotropic effects can now be seen to be a conclusion drawn from observations of a small class of mutations with dramatic effects. It is possible that some mutations, for example, those in cis‐regulatory DNA, have few or no pleiotropic effects and may be the predominant source of morphological evolution. In contrast, mutations causing dramatic phenotypic effects, although superficially similar to hypothesized evolutionary transitions, are unlikely to fairly represent the true path of evolution. Recent developmental studies of gene function provide a new way of conceptualizing and studying variation that contrasts with the traditional genetic view that was incorporated into neo‐Darwinian theory and population genetics. This new approach in developmental biology is as important for micro‐evolutionary studies as the actual results from recent evolutionary developmental studies. In particular, this approach will assist in the task of identifying the specific mutations generating phenotypic variation and elucidating how they alter gene function. These data will provide the currently missing link between molecular and phenotypic variation in natural populations.

5.
In this paper, we present a novel approach that combines methods to find an appropriate neural network architecture and weights using an evolutionary least square based algorithm (GALS). This paper focuses on aspects such as the heuristics of updating weights using an evolutionary least square based algorithm, finding the number of hidden neurons for a two-layer feed-forward neural network, the stopping criterion for the algorithm, and finally some comparisons of the results with other existing methods for searching for optimal or near-optimal solutions in the multidimensional complex search space comprising the architecture and the weight variables. We explain how the weight updating algorithm using the evolutionary least square based approach can be combined with the growing architecture model to find the optimum number of hidden neurons. We also discuss the issues of finding a probabilistic solution space as a starting point for the least square method and address the problems involving fitness breaking. We apply the proposed approach to the XOR problem, the 10-bit odd parity problem and many real-world benchmark data sets such as the handwriting data set from CEDAR and the breast cancer and heart disease data sets from the UCI ML repository. The comparative results based on classification accuracy and time complexity are discussed.

6.
The study of emotions in human–computer interaction is a growing research area. This paper describes an attempt to select the most significant features for emotion recognition in spoken Basque and Spanish using different methods for feature selection. The RekEmozio database was used as the experimental data set. Several Machine Learning paradigms were used for the emotion classification task. Experiments were executed in three phases, using different sets of features as classification variables in each phase. Moreover, feature subset selection was applied at each phase in order to search for the most relevant feature subset. The three-phase approach was selected to check the validity of the proposed approach. The results show that an instance-based learning algorithm using feature subset selection techniques based on evolutionary algorithms is the best Machine Learning paradigm in automatic emotion recognition, with all different feature sets, obtaining a mean emotion recognition rate of 80.05% in Basque and 74.82% in Spanish. In order to check the goodness of the proposed process, a greedy search approach (FSS-Forward) has been applied and a comparison between them is provided. Based on the results, a set of the most relevant non-speaker-dependent features is proposed for both languages and new perspectives are suggested.

7.
Metabolic flux analysis is important for metabolic system regulation and intracellular pathway identification. A popular approach for intracellular flux estimation involves using 13C tracer experiments to label states that can be measured by nuclear magnetic resonance spectrometry or gas chromatography mass spectrometry. However, the bilinear balance equations derived from 13C tracer experiments and the noisy measurements require a nonlinear optimization approach to obtain the optimal solution. In this paper, the flux quantification problem is formulated as an error-minimization problem with equality and inequality constraints through the 13C balance and stoichiometric equations. The stoichiometric constraints are transformed to a null space by singular value decomposition. Self-adaptive evolutionary algorithms are then introduced for flux quantification. The performance of the evolutionary algorithm is compared with ordinary least squares estimation by the simulation of the central pentose phosphate pathway. The proposed algorithm is also applied to the central metabolism of Corynebacterium glutamicum under lysine-producing conditions. A comparison between the results from the proposed algorithm and data from the literature is given. The complexity of a metabolic system with bidirectional reactions is also investigated by analyzing the fluctuations in the flux estimates when available measurements are varied.
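The null-space transformation mentioned in this abstract can be written out generically as follows. This is only a sketch of the standard linear-algebra step; the symbols S (stoichiometric matrix), v (flux vector), N and u are assumptions and are not notation taken from the paper.

```latex
\begin{aligned}
S v &= 0, \qquad S = U \Sigma V^{\top} \quad \text{(singular value decomposition)},\\
v &= N u, \qquad N = \text{the columns of } V \text{ spanning the null space of } S
     \text{ (zero singular values)},\\
S N &= 0 \;\Longrightarrow\; S v = S N u = 0 \quad \text{for every } u .
\end{aligned}
```

Searching over the free parameters u instead of v means every candidate satisfies the stoichiometric equalities by construction, so an evolutionary algorithm only has to handle the remaining constraints.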

8.
The evolutionary tree reconstruction algorithm SEMPHY, which uses structural expectation maximization (SEM), is an efficient approach but suffers from local optima. To improve SEMPHY, a new algorithm named HSEMPHY, based on the homotopy continuation principle, is proposed in the present study for reconstructing evolutionary trees. The HSEMPHY algorithm computes the conditional probability of the hidden variables in the structure using the maximum entropy principle. It can reduce the influence of the initial value on the final solution by simulating the process of the homotopy principle and by introducing the homotopy parameter beta. HSEMPHY is tested on real and simulated datasets and compared with SEMPHY and the two most popular reconstruction approaches, PHYML and RAXML. Experimental results show that HSEMPHY is at least as good as PHYML and RAXML and is very robust to poor starting trees.

9.
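Protein energy minimization is an important part of protein folding. A new hybrid evolutionary algorithm for protein folding combines crossover with Cauchy mutation. Protein energy minimization examples based on the toy model show that the new hybrid evolutionary algorithm is effective.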

10.
Community detection has drawn a lot of attention as it can provide invaluable help in understanding the function and visualizing the structure of networks. Since single-objective optimization methods have intrinsic drawbacks in identifying multiple significant community structures, some methods formulate community detection as a multi-objective problem and adopt population-based evolutionary algorithms to obtain multiple community structures. Evolutionary algorithms have strong global search ability, but have difficulty in locating local optima efficiently. In this study, in order to identify multiple significant community structures more effectively, a multi-objective memetic algorithm for community detection is proposed by combining a multi-objective evolutionary algorithm with a local search procedure. The local search procedure is designed by addressing three issues. Firstly, nondominated solutions generated by evolutionary operations and solutions in the dominant population are set as initial individuals for the local search procedure. Then, a new direction vector, called the pseudonormal vector, is proposed to integrate the two objective functions into a single fitness function. Finally, a network-specific local search strategy based on the label propagation rule is extended to search for locally optimal solutions efficiently. Extensive experiments on both artificial and real-world networks evaluate the proposed method from three aspects. Firstly, experiments on the influence of the local search procedure demonstrate that it can speed up the convergence to better partitions and make the algorithm more stable. Secondly, comparisons with a set of classic community detection methods illustrate that the proposed method can find single partitions effectively. Finally, the method is applied to identify hierarchical structures of networks, which are beneficial for analyzing networks at multiple resolution levels.

11.
Data clustering is commonly employed in many disciplines. The aim of clustering is to partition a set of data into clusters such that objects within the same cluster are similar to one another and dissimilar to objects in other clusters. Over the past decade, evolutionary algorithms have been commonly used to solve clustering problems. This study presents a novel algorithm based on simplified swarm optimization, an emerging population-based stochastic optimization approach with the advantages of simplicity, efficiency, and flexibility. This approach combines variable vibrating search (VVS) and rapid centralized strategy (RCS) to deal with the clustering problem. VVS is an exploitation search scheme that can refine the quality of solutions by searching the extreme points near the global best position. RCS is developed to accelerate the convergence rate of the algorithm by using the arithmetic average. To empirically evaluate the performance of the proposed algorithm, experiments are conducted on 12 benchmark datasets, and the corresponding results are compared with recent works. Results of statistical analysis indicate that the proposed algorithm is competitive in terms of the quality of solutions.

12.
Huang HL  Lee CC  Ho SY 《Bio Systems》2007,90(1):78-86
It is essential to select a minimal number of relevant genes from microarray data while maximizing classification accuracy for the development of inexpensive diagnostic tests. However, it is intractable to simultaneously optimize gene selection and classification accuracy, which is a large parameter optimization problem. We propose an efficient evolutionary approach to gene selection from microarray data which can be combined with the optimal design of various multiclass classifiers. The proposed method (named GeneSelect) consists of three cooperating parts: an efficient encoding scheme of candidate solutions, a generalized fitness function, and an intelligent genetic algorithm (IGA). An existing hybrid approach based on a genetic algorithm and maximum likelihood classification (GA/MLHD) was proposed to select a small number of relevant genes for accurate classification of samples. To evaluate the performance of GeneSelect, the gene selection is combined with the same maximum likelihood classification (named IGA/MLHD) for convenient comparison. IGA/MLHD is applied to 11 cancer-related human gene expression datasets. The simulation results show that IGA/MLHD is superior to GA/MLHD in terms of the number of selected genes, classification accuracy, and robustness of selected genes and accuracy.

13.
Molecular phylogeny based on nucleotide or amino acid sequence comparison has become a widespread tool for general taxonomy and evolutionary analyses. It seems the only means to establish a natural classification of microorganisms, since their phenotypic traits are not always consistent with genealogy. After an optimistic period during which comprehensive microbial evolutionary pictures appeared, the discovery of several pitfalls affecting molecular phylogenetic reconstruction challenged the general validity of this approach. In addition to biological factors, such as horizontal gene transfer, some methodological problems may produce misleading phylogenies. They are essentially (i) loss of phylogenetic signal by the accumulation of overlapping mutations, (ii) incongruity between the real evolutionary process and the assumed models of sequence evolution, and (iii) differences of evolutionary rates among species or among positions within a sequence. Here, we discuss these problems and some strategies proposed to overcome their effects.

14.
A new software tool making use of a genetic algorithm for multi-objective experimental optimization (GAME.opt) was developed based on a strength Pareto evolutionary algorithm. The software deals with high-dimensional variable spaces and unknown interactions of design variables. This approach was evaluated by means of multi-objective test problems replacing the experimental results. A default parameter setting is proposed, enabling users without expert knowledge to minimize the experimental effort (small population sizes and few generations).

15.
A. R. Templeton  C. F. Sing 《Genetics》1993,134(2):659-669
We previously developed an analytical strategy based on cladistic theory to identify subsets of haplotypes that are associated with significant phenotypic deviations. Our initial approach was limited to segments of DNA in which little recombination occurs. In such cases, a cladogram can be constructed from the restriction site data to estimate the evolutionary steps that interrelate the observed haplotypes to one another. The cladogram is then used to define a nested statistical design for identifying mutational steps associated with significant phenotypic deviations. The central assumption behind this strategy is that a mutation responsible for a particular phenotypic effect is embedded within the evolutionary history that is represented by the cladogram. The power of this approach depends on the accuracy of the cladogram in portraying the evolutionary history of the DNA region. This accuracy can be diminished both by recombination and by uncertainty in the estimated cladogram topology. In a previous paper, we presented an algorithm for estimating the set of likely cladograms and recombination events. In this paper we present an algorithm for defining a nested statistical design under cladogram uncertainty and recombination. Given the nested design, phenotypic associations can be examined using either a nested analysis of variance (for haploids or homozygous strains) or permutation testing (for outcrossed, diploid gene regions). In this paper we also extend this analytical strategy to include categorical phenotypes in addition to quantitative phenotypes. Some worked examples are presented using Drosophila data sets. These examples illustrate that having some recombination may actually enhance the biological inferences that may be derived from a cladistic analysis. In particular, recombination can be used to assign a physical localization to a given subregion for mutations responsible for significant phenotypic effects.

16.
Du QS  Wang CH  Liao SM  Huang RB 《PloS one》2010,5(10):e13207

Background

It has been widely recognized that mutations in specific directions are caused by functional constraints in a protein family and that directional mutations at certain positions control the evolutionary direction of the protein family. Mutations at different positions, even distantly separated ones, are mutually coupled and form an evolutionary network. Finding the controlling mutative positions and the mutative network among residues is of primary importance for rational protein design and enzyme engineering.

Methodology

A computational approach, namely amino acid position conservation-mutation correlation analysis (CMCA), is developed to predict mutually mutative positions and find the evolutionary network in a protein family. The amino acid position mutative function, which is the foundational equation of CMCA and measures the mutation of a residue at a position, is derived from the MSA (multiple structure alignment) database of the protein evolutionary family. The position conservation correlation matrix and the position mutation correlation matrix are then constructed from the amino acid position mutative equation. Unlike the traditional SCA (statistical coupling analysis) approach, which is based on the statistical analysis of position conservations, CMCA focuses on the correlation analysis of position mutations.

Conclusions

As an example, the CMCA approach is used to study the PDZ domain protein family. The results illustrate the distant allosteric mechanism in the PDZ protein family and identify the functional mutative network among residues. We expect that the CMCA approach may find applications in protein engineering and may suggest new strategies to improve the bioactivities and physicochemical properties of enzymes.

17.
《Journal of molecular biology》2019,431(13):2449-2459
Nearly one-third of non-synonymous single-nucleotide polymorphisms (nsSNPs) are deleterious to human health, but recognition of the disease-associated mutations remains a significant unsolved problem. We proposed a new algorithm, DAMpred, to identify disease-causing nsSNPs through the coupling of evolutionary profiles with structure predictions of proteins and protein–protein interactions. The pipeline was trained by a novel Bayes-guided artificial neural network algorithm that incorporates posterior probabilities of distinct feature classifiers with the network training process. DAMpred was tested on a large-scale data set involving 10,635 nsSNPs from 2154 ORFs in the human genome and recognized disease-associated nsSNPs with an accuracy of 0.80 and a Matthews correlation coefficient of 0.601, which is 9.1% higher than the best of other state-of-the-art methods. In the blind test on the TP53 gene, DAMpred correctly recognized the mutations causative of Li–Fraumeni-like syndrome with a Matthews correlation coefficient that is 27% higher than the control methods. The study demonstrates an efficient avenue to quantitatively model the association of nsSNPs with human diseases from low-resolution protein structure prediction, which should prove useful in the diagnosis and treatment of genetic diseases.
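For readers unfamiliar with the two metrics reported in this abstract, the sketch below computes accuracy and the Matthews correlation coefficient from a binary confusion matrix. The function names and the counts are illustrative assumptions; they are not DAMpred results.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when the denominator is zero, a common convention
    (assumption: the paper does not state how it treats this case).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts chosen only to make the arithmetic easy to follow.
acc = accuracy(tp=8, tn=8, fp=2, fn=2)
mcc = matthews_corrcoef(tp=8, tn=8, fp=2, fn=2)
```

Unlike accuracy, the coefficient ranges from -1 to 1 and stays near 0 for uninformative predictions, which is why it is often reported alongside accuracy for data sets that may be unbalanced.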

18.
The structure and organization of natural plant populations can be understood by estimating the genetic parameters related to mating behavior, recombination frequency, and gene associations with DNA-based markers typed throughout the genome. We developed a statistical and computational model for estimating and testing these parameters from multilocus data collected in a natural population. This model, constructed by a maximum likelihood approach and implemented within the EM algorithm, is shown to be robust for simultaneously estimating the outcrossing rate, recombination frequencies and linkage disequilibria. The algorithm built with three or more markers allows the characterization of crossover interference in meiosis and high-order disequilibria among different genes, thus providing a powerful tool for illustrating a detailed picture of genetic diversity and organization in natural populations. Computer simulations demonstrate the statistical properties of the proposed model. This multilocus model will be useful for studying the pattern and amount of genetic variation within and among populations to further infer the evolutionary history of a plant species.

19.
In haploid budding yeast, evolutionary adaptation to constitutive DNA replication stress alters three genome maintenance modules: DNA replication, the DNA damage checkpoint, and sister chromatid cohesion. We asked how these trajectories depend on genomic features by comparing the adaptation in three strains: haploids, diploids, and recombination deficient haploids. In all three, adaptation happens within 1000 generations at rates that are correlated with the initial fitness defect of the ancestors. Mutations in individual genes are selected at different frequencies in populations with different genomic features, but the benefits these mutations confer are similar in the three strains, and combinations of these mutations reproduce the fitness gains of evolved populations. Despite the differences in the selected mutations, adaptation targets the same three functional modules in strains with different genomic features, revealing a common evolutionary response to constitutive DNA replication stress.

20.
We present an original approach to identifying sequence variants in a mixed DNA population from sequence trace data. The heart of the method is based on parsimony: given a wildtype DNA sequence, a set of observed variations at each position collected from sequencing data, and a complete catalog of all possible mutations, determine the smallest set of mutations from the catalog that could fully explain the observed variations. The algorithmic complexity of the problem is analyzed for several classes of mutations, including block substitutions, single-range deletions, and single-range insertions. The reconstruction problem is shown to be NP-complete for single-range insertions and deletions, while for block substitutions, single character insertion, and single character deletion mutations, polynomial time algorithms are provided. Once a minimum set of mutations compatible with the observed sequence is found, the relative frequency of those mutations is recovered by solving a system of linear equations. Simulation results show the algorithm successfully deconvolving mutations in p53 known to cause cancer. An extension of the algorithm is proposed as a new method of high throughput screening for single nucleotide polymorphisms by multiplexing DNA.
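The parsimony step described in this abstract can be illustrated with a brute-force search for the smallest set of catalog mutations whose combined effects match the observed variations. This is a minimal sketch under stated assumptions, not the authors' algorithm: the catalog entries, the (position, symbol) encoding, and the function name are all hypothetical, and the exhaustive search is exponential in the catalog size.

```python
from itertools import combinations


def smallest_explaining_set(observed, catalog):
    """Return a smallest list of mutation names whose effects equal `observed`.

    `catalog` maps a mutation name to the set of variations it would produce.
    Subsets are tried in order of increasing size, so the first match is
    minimal. Returns None if no subset explains the observations exactly.
    """
    observed = set(observed)
    names = sorted(catalog)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            explained = set()
            for name in subset:
                explained |= catalog[name]
            if explained == observed:
                return list(subset)
    return None


# Hypothetical catalog: each variation is a (position, observed symbol) pair.
catalog_example = {
    "sub_A": {(3, "T")},
    "del_B": {(5, "-"), (6, "-")},
    "block_C": {(3, "T"), (5, "-"), (6, "-")},
    "ins_D": {(9, "G")},
}
observed_example = {(3, "T"), (5, "-"), (6, "-"), (9, "G")}
explanation = smallest_explaining_set(observed_example, catalog_example)
```

Here two mutations suffice even though three single-purpose entries would also explain the data. An exhaustive search like this is only practical for tiny catalogs, which is in keeping with the abstract's result that some mutation classes make the problem NP-complete.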
