Similar articles
1.
2.
3.
Codon models of evolution have facilitated the interpretation of selective forces operating on genomes. These models, however, assume a single rate of non-synonymous substitution irrespective of the nature of the amino acids being exchanged. Recent developments have shown that models which allow amino acid pairs to have independent rates of substitution offer improved fit over single-rate models. However, these approaches have been limited by the need for large alignments in their estimation. An alternative approach is to assume that substitution rates between amino acid pairs can be subdivided into rate classes, dependent on the information content of the alignment. However, given the combinatorially large number of such models, an efficient model search strategy is needed. Here we develop a Genetic Algorithm (GA) method for the estimation of such models. A GA is used to assign amino acid substitution pairs to a series of rate classes, where the number of classes is estimated from the alignment. Other parameters of the phylogenetic Markov model, including substitution rates, character frequencies and branch lengths, are estimated using standard maximum likelihood optimization procedures. We apply the GA to empirical alignments and show improved model fit over existing models of codon evolution. Our results suggest that current models are poor approximations of protein evolution and thus gene- and organism-specific multi-rate models that incorporate amino acid substitution biases are preferred. We further anticipate that the clustering of amino acid substitution rates into classes will be biologically informative, such that genes with similar functions exhibit similar clustering, and hence this clustering will be useful for the evolutionary fingerprinting of genes.

4.
Nutrient concentrations (N, P, K) were determined within stemwood in an age series of eucalyptus stands. Four trees per stand were selected according to their size to represent the whole range of basal areas in 1-, 2-, 3-, 4-, 5-, 6- and 7-year-old stands. Cross-sections were sampled every 4 m from the ground to the top of the tree, and chemical analyses were performed for each annual ring in the cross-sections. We constructed a new and generic model to describe the dynamics of nutrient concentrations within the stemwood. Three main parameters were used: (1) the initial concentration of the ring, Ic; (2) the final concentration of the ring at harvest, Fc; and (3) the rate of change in concentration, k. The model is very flexible and was adapted to describe N, P and K concentrations within the stems, and their dynamics over time. An analysis of the parameters showed that k was constant for a given nutrient. Ic varied with height within the tree for P, whereas for N and K it was a function of (1) the age of the tree when the ring was initiated and (2) height within the tree. Fc was constant for N, and dependent on the age of the tree when the ring was initiated for K and P. The final models showed a low Root Mean Square Error for a limited number of parameters (fewer than seven). When validated on an independent sample, the models were shown to have high predictive quality.

5.
We investigate some discrete structural properties of evolutionary trees generated under simple null models of speciation, such as the Yule model. These models have been used as priors in Bayesian approaches to phylogenetic analysis, and also to test hypotheses concerning the speciation process. In this paper we describe new results for three properties of trees generated under such models. Firstly, for a rooted tree generated by the Yule model we describe the probability distribution on the depth (number of edges from the root) of the most recent common ancestor of a random subset of k species. Next we show that, for trees generated under the Yule model, the approximate position of the root can be estimated from the associated unrooted tree, even for trees with a large number of leaves. Finally, we analyse a biologically motivated extension of the Yule model and describe its distribution on tree shapes when speciation occurs in rapid bursts.  相似文献   
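The first property above (the depth of the most recent common ancestor of a random subset of k species) can be explored by simulation. The following is a minimal sketch, not the authors' derivation: it assumes the standard construction of Yule tree shapes in which a uniformly chosen leaf is split at each step, and all function names (`yule_tree`, `mrca_depth`, `simulate_mrca_depths`) are hypothetical.

```python
import random


def yule_tree(n_leaves, rng):
    """Grow a rooted binary tree by repeatedly splitting a uniformly chosen leaf.

    Returns (parent, leaves): parent maps node id -> parent id (root -> None).
    """
    parent = {0: None}
    leaves = [0]
    next_id = 1
    while len(leaves) < n_leaves:
        i = rng.randrange(len(leaves))
        node = leaves[i]
        left, right = next_id, next_id + 1
        next_id += 2
        parent[left] = node
        parent[right] = node
        # The split leaf becomes an internal node; its two children are new leaves.
        leaves[i] = left
        leaves.append(right)
    return parent, leaves


def ancestors(node, parent):
    """Nodes on the path from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path


def mrca_depth(subset, parent):
    """Depth (number of edges from the root) of the MRCA of the leaves in `subset`."""
    common = set(ancestors(subset[0], parent))
    for leaf in subset[1:]:
        common &= set(ancestors(leaf, parent))
    # Shared ancestors form a chain from the MRCA to the root,
    # so the chain length minus one is the MRCA depth.
    return len(common) - 1


def simulate_mrca_depths(n_leaves, k, n_trees, seed=0):
    """Empirical distribution of the MRCA depth of k random leaves (assumed setup)."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_trees):
        parent, leaves = yule_tree(n_leaves, rng)
        depth = mrca_depth(rng.sample(leaves, k), parent)
        counts[depth] = counts.get(depth, 0) + 1
    return {d: c / n_trees for d, c in sorted(counts.items())}
```

Calling `simulate_mrca_depths(50, 5, 10000)` gives an empirical distribution that could be compared against an analytical result; the tree size, subset size and replicate count here are illustrative choices only.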

6.
Discrete state‐space models are used in ecology to describe the dynamics of wild animal populations, with parameters, such as the probability of survival, being of ecological interest. For a particular parametrization of a model it is not always clear which parameters can be estimated. The inability to estimate all parameters is known as parameter redundancy; such a model is also described as nonidentifiable. In this paper we develop methods that can be used to detect parameter redundancy in discrete state‐space models. An exhaustive summary is a combination of parameters that fully specifies a model. To use general methods for detecting parameter redundancy, a suitable exhaustive summary is required. This paper proposes two methods for the derivation of an exhaustive summary for discrete state‐space models using discrete analogues of methods for continuous state‐space models. We also demonstrate that combining multiple data sets, through the use of an integrated population model, may result in a model in which all parameters are estimable, even though models fitted to the separate data sets may be parameter redundant.

7.
Computational simulation models can provide a way of understanding and predicting insect population dynamics and evolution of resistance, but the usefulness of such models depends on generating or estimating the values of key parameters. In this paper, we describe four numerical algorithms generating or estimating key parameters for simulating four different processes within such models. First, we describe a novel method to generate an offspring genotype table for one- or two-locus genetic models for simulating evolution of resistance, and how this method can be extended to create offspring genotype tables for models with more than two loci. Second, we describe how we use a generalized inverse matrix to find a least-squares solution to an over-determined linear system for estimation of parameters in probit models of kill rates. This algorithm can also be used for the estimation of parameters of Freundlich adsorption isotherms. Third, we describe a simple algorithm to randomly select initial frequencies of genotypes either without any special constraints or with some pre-selected frequencies. Also we give a simple method to calculate the “stable” Hardy–Weinberg equilibrium proportions that would result from these initial frequencies. Fourth we describe how the problem of estimating the intrinsic rate of natural increase of a population can be converted to a root-finding problem and how the bisection algorithm can then be used to find the rate. We implemented all these algorithms using MATLAB and Python code; the key statements in both codes consist of only a few commands and are given in the appendices. The results of numerical experiments are also provided to demonstrate that our algorithms are valid and efficient.  相似文献   
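The fourth algorithm above (estimating the intrinsic rate of natural increase by root finding) can be sketched as follows. This is a minimal illustration, not the authors' appendix code: it assumes the discrete Euler-Lotka equation, sum over ages x of l_x * m_x * exp(-r * x) = 1, as the equation being solved, and the function names and the life table in the usage note are hypothetical.

```python
import math


def euler_lotka(r, ages, survivorship, fecundity):
    """Left-hand side of the (assumed) Euler-Lotka equation minus 1."""
    return sum(
        l * m * math.exp(-r * x)
        for x, l, m in zip(ages, survivorship, fecundity)
    ) - 1.0


def bisect_root(f, lo, hi, tol=1e-10, max_iter=200):
    """Find a root of f in [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0:
        return lo
    if f_hi == 0:
        return hi
    if f_lo * f_hi > 0:
        raise ValueError("root is not bracketed by [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_mid == 0 or (hi - lo) / 2 < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid            # sign change lies in the lower half
        else:
            lo, f_lo = mid, f_mid  # sign change lies in the upper half
    return 0.5 * (lo + hi)


def intrinsic_rate(ages, survivorship, fecundity, lo=-5.0, hi=5.0):
    """Intrinsic rate of natural increase r for a life table (bracket is an assumption)."""
    return bisect_root(
        lambda r: euler_lotka(r, ages, survivorship, fecundity), lo, hi
    )
```

For a made-up life table such as `intrinsic_rate([1, 2, 3], [0.8, 0.5, 0.2], [0.0, 2.0, 3.0])`, the function returns the r that balances the equation, provided the default bracket contains a sign change. Bisection is slower than Newton-type methods but needs no derivative and cannot diverge once the root is bracketed.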

8.
Tree vigor is often used as a covariate when tree mortality is predicted from tree growth in tropical forest dynamic models, but it is rarely explicitly accounted for in a coherent modeling framework. We quantify tree vigor at the individual tree level, based on the difference between expected and observed growth. Because the available methods for joining nonlinear tree growth and mortality processes are not commonly used by forest ecologists, we develop an inference methodology based on an MCMC approach, which allows us to sample the parameters of the growth and mortality model from their posterior distribution using the joint model likelihood. We apply our framework to a set of data on the 20‐year dynamics of a forest in Paracou, French Guiana, taking advantage of functional trait‐based growth and mortality models already developed independently. Our results show that growth and mortality are intimately linked and that the vigor estimator is an essential predictor of mortality, highlighting that trees growing more than expected have a far lower probability of dying. Our joint model methodology is sufficiently generic to be used to join two longitudinal and punctual linked processes and thus may be applied to a wide range of growth and mortality models. In the context of global changes, such joint models are urgently needed to analyze, and then predict, the effects of the ongoing changes on tree dynamics in hyperdiverse tropical forests.

9.
For a model of molecular evolution to be useful for phylogenetic inference, the topology of evolutionary trees must be identifiable. That is, from a joint distribution the model predicts, it must be possible to recover the tree parameter. We establish tree identifiability for a number of phylogenetic models, including a covarion model and a variety of mixture models with a limited number of classes. The proof is based on the introduction of a more general model, allowing more states at internal nodes of the tree than at leaves, and the study of the algebraic variety formed by the joint distributions to which it gives rise. Tree identifiability is first established for this general model through the use of certain phylogenetic invariants.  相似文献   

10.
Several stochastic models of character change, when implemented in a maximum likelihood framework, are known to give a correspondence between the maximum parsimony method and the method of maximum likelihood. One such model has an independently estimated branch-length parameter for each site and each branch of the phylogenetic tree. This model--the no-common-mechanism model--has many parameters, and, in fact, the number of parameters increases as fast as the alignment is extended. We take a Bayesian approach to the no-common-mechanism model and place independent gamma prior probability distributions on the branch-length parameters. We are able to analytically integrate over the branch lengths, and this allowed us to implement an efficient Markov chain Monte Carlo method for exploring the space of phylogenetic trees. We were able to reliably estimate the posterior probabilities of clades for phylogenetic trees of up to 500 sequences. However, the Bayesian approach to the problem, at least as implemented here with an independent prior on the length of each branch, does not tame the behavior of the branch-length parameters. The integrated likelihood appears to be a simple rescaling of the parsimony score for a tree, and the marginal posterior probability distribution of the length of a branch is dependent upon how the maximum parsimony method reconstructs the characters at the interior nodes of the tree. The method we describe, however, is of potential importance in the analysis of morphological character data and also for improving the behavior of Markov chain Monte Carlo methods implemented for models in which sites share a common branch-length parameter.  相似文献   

11.
Multilabeled trees or MUL-trees, for short, are trees whose leaves are labeled by elements of some nonempty finite set X such that more than one leaf may be labeled by the same element of X. This class of trees includes phylogenetic trees and tree shapes. MUL-trees arise naturally in, for example, biogeography and gene evolution studies and also in the area of phylogenetic network reconstruction. In this paper, we introduce novel metrics which may be used to compare MUL-trees, most of which generalize well-known metrics on phylogenetic trees and tree shapes. These metrics can be used, for example, to better understand the space of MUL-trees or to help visualize collections of MUL-trees. In addition, we describe some relationships between the MUL-tree metrics that we present and also give some novel diameter bounds for these metrics. We conclude by briefly discussing some open problems as well as pointing out how MUL-tree metrics may be used to define metrics on the space of phylogenetic networks.  相似文献   

12.
Although a large body of work investigating tests of correlated evolution of two continuous characters exists, hypotheses such as character displacement are really tests of whether substantial evolutionary change has occurred on a particular branch or branches of the phylogenetic tree. In this study, we present a methodology for testing such a hypothesis using ancestral character state reconstruction and simulation. Furthermore, we suggest how to investigate the robustness of the hypothesis test by varying the reconstruction methods or simulation parameters. As a case study, we tested a hypothesis of character displacement in body size of Caribbean Anolis lizards. We compared squared-change, weighted squared-change, and linear parsimony reconstruction methods, gradual Brownian motion and speciational models of evolution, and several resolution methods for linear parsimony. We used ancestor reconstruction methods to infer the amount of body size evolution, and tested whether evolutionary change in body size was greater on branches of the phylogenetic tree in which a transition from occupying a single-species island to a two-species island occurred. Simulations were used to generate null distributions of reconstructed body size change. The hypothesis of character displacement was tested using Wilcoxon Rank-Sums. When tested against simulated null distributions, all of the reconstruction methods resulted in more significant P-values than when standard statistical tables were used. These results confirm that P-values for tests using ancestor reconstruction methods should be assessed via simulation rather than from standard statistical tables. Linear parsimony can produce an infinite number of most parsimonious reconstructions in continuous characters. We present an example of assessing the robustness of our statistical test by exploring the sample space of possible resolutions. We compare ACCTRAN and DELTRAN resolutions of ambiguous character reconstructions in linear parsimony to the most and least conservative resolutions for our particular hypothesis.

13.
Dynamical systems which generate periodic signals are of interest as models of biological central pattern generators and in a number of robotic applications. A basic functionality that is required in both biological modelling and robotics is frequency modulation. This leads to the question of whether there are generic mechanisms to control the frequency of neural oscillators. Here we describe why this objective is of a different nature, and more difficult to achieve, than modulating other oscillation characteristics (like amplitude, offset, signal shape). We propose a generic way to solve this task which makes use of a simple linear controller. It rests on the insight that there is a bidirectional dependency between the frequency of an oscillation and geometric properties of the neural oscillator’s phase portrait. By controlling the geometry of the neural state orbits, it is possible to control the frequency on the condition that the state space can be shaped such that it can be pushed easily to any frequency.  相似文献   

14.
Contemporary methods for visualizing phenotypic evolution, such as phylomorphospaces, often reveal patterns which depart strongly from a naïve expectation of consistently divergent branching and expansion. Instead, branches regularly crisscross as convergence, reversals, or other forms of homoplasy occur, forming patterns described as “birds’ nests”, “flies in vials”, or less elegantly, “a mess”. In other words, the phenotypic tree of life often appears highly tangled. Various explanations are given for this, such as differential degrees of developmental constraint, adaptation, or lack of adaptation. However, null expectations for the magnitude of disorder or “tangling” have never been established, so it is unclear which or even whether various evolutionary factors are required to explain messy patterns of evolution. I simulated evolution along phylogenies under a number of varying parameters (number of taxa and number of traits) and models (Brownian motion, Ornstein–Uhlenbeck (OU)-based, early burst, and character displacement (CD)) and quantified disorder using 2 measures. All models produce substantial amounts of disorder. Disorder increases with tree size and the number of phenotypic traits. OU models produced the largest amounts of disorder: adaptive peaks influence lineages to evolve within restricted areas, with concomitant increases in crossing of branches and density of evolution. Large early changes in trait values can be important in minimizing disorder. CD consistently produced trees with low (but not absent) disorder. Overall, neither constraints nor a lack of adaptation is required to explain messy phylomorphospaces; both stochastic and deterministic processes can act to produce the tantalizingly tangled phenotypic tree of life.

15.
The field of phylogenetic tree estimation has been dominated by three broad classes of methods: distance-based approaches, parsimony and likelihood-based methods (including maximum likelihood (ML) and Bayesian approaches). Here we introduce two new approaches to tree inference: pairwise likelihood estimation and a distance-based method that estimates the number of substitutions along the paths through the tree. Our results include the derivation of the formulae for the probability that two leaves will be identical at a site given a number of substitutions along the path connecting them. We also derive the posterior probability of the number of substitutions along a path between two sequences. The calculations for the posterior probabilities are exact for group-based, symmetric models of character evolution, but are only approximate for more general models.  相似文献   
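To make the "probability that two leaves will be identical at a site given a number of substitutions along the path" concrete, here is a small sketch under a strong simplifying assumption that is not taken from the paper: a fully symmetric r-state model in which every substitution moves to one of the other r - 1 states with equal probability. Under that assumption the probability satisfies P_0 = 1 and P_{k+1} = (1 - P_k) / (r - 1), which solves to P_k = 1/r + (1 - 1/r) * (-1/(r - 1))^k. The function names are hypothetical.

```python
def p_identical_closed(k, r=4):
    """Closed form for P(two leaves identical | k substitutions on the path).

    Assumes a fully symmetric r-state model (an illustrative assumption).
    """
    return 1.0 / r + (1.0 - 1.0 / r) * (-1.0 / (r - 1)) ** k


def p_identical_recursive(k, r=4):
    """Same quantity via the recurrence P_{k+1} = (1 - P_k) / (r - 1), P_0 = 1."""
    p = 1.0
    for _ in range(k):
        # The leaves match after one more substitution only if they currently
        # differ and the substitution lands on the matching state.
        p = (1.0 - p) / (r - 1)
    return p
```

Comparing the two functions for small k is a quick consistency check on the algebra. As k grows, both converge to 1/r, the chance of identity between two unrelated states.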

16.
With growing amounts of genome data and constant improvement of models of molecular evolution, phylogenetic reconstruction has become more reliable. However, our knowledge of the real process of molecular evolution is still limited. When sufficiently large data sets are analyzed, any subtle biases in statistical models can significantly support incorrect topologies because of the high signal-to-noise ratio. We propose a procedure to locate sequences in a multidimensional vector space (MVS), in which the geometry of the space is uniquely determined in such a way that the vectors of sequence evolution are orthogonal among different branches. In this paper, the MVS approach is developed to detect and remove biases in models of molecular evolution caused by unrecognized convergent evolution among lineages or unexpected patterns of substitutions. Biases in the estimated pairwise distances are identified as deviations (outliers) of sequence spatial vectors from the expected orthogonality. Modifications to the estimated distances are made by minimizing an index that quantifies the deviations. In this way, it becomes possible to reconstruct the phylogenetic tree, taking account of possible biases in the model of molecular evolution. The efficacy of the modification procedure was verified by simulating evolution on various topologies with rate heterogeneity and convergent change. The phylogeny of placental mammals in previous analyses of large data sets has varied according to the genes being analyzed. Systematic deviations caused by convergent evolution were detected by our procedure in all representative data sets and were found to strongly affect the tree structure. However, the bias correction yielded a consistent topology among data sets. The existence of strong biases was validated by examining the sites of convergent evolution between the hedgehog and other species in the mitochondrial data set. This convergent evolution explains why it has been difficult to determine the phylogenetic placement of the hedgehog in previous studies.

17.
The rate at which a given site in a gene sequence alignment evolves over time may vary. This phenomenon--known as heterotachy--can bias or distort phylogenetic trees inferred from models of sequence evolution that assume rates of evolution are constant. Here, we describe a phylogenetic mixture model designed to accommodate heterotachy. The method sums the likelihood of the data at each site over more than one set of branch lengths on the same tree topology. A branch-length set that is best for one site may differ from the branch-length set that is best for some other site, thereby allowing different sites to have different rates of change throughout the tree. Because rate variation may not be present in all branches, we use a reversible-jump Markov chain Monte Carlo algorithm to identify those branches in which reliable amounts of heterotachy occur. We implement the method in combination with our 'pattern-heterogeneity' mixture model, applying it to simulated data and five published datasets. We find that complex evolutionary signals of heterotachy are routinely present over and above variation in the rate or pattern of evolution across sites, that the reversible-jump method requires far fewer parameters than conventional mixture models to describe it, and serves to identify the regions of the tree in which heterotachy is most pronounced. The reversible-jump procedure also removes the need for a posteriori tests of 'significance' such as the Akaike or Bayesian information criterion tests, or Bayes factors. Heterotachy has important consequences for the correct reconstruction of phylogenies as well as for tests of hypotheses that rely on accurate branch-length information. These include molecular clocks, analyses of tempo and mode of evolution, comparative studies and ancestral state reconstruction. The model is available from the authors' website, and can be used for the analysis of both nucleotide and morphological data.  相似文献   
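The core idea above, summing each site's likelihood over several branch-length sets on one topology, can be illustrated on the smallest possible case. The sketch below is not the authors' software: it assumes a two-taxon tree, the Jukes-Cantor (JC69) substitution model, and explicit mixture weights, all of which are illustrative choices, and the function names are hypothetical.

```python
import math


def jc69_prob(same, t):
    """JC69 transition probability over path length t (expected substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if same else 0.25 - 0.25 * e


def site_likelihood(x, y, t):
    """Likelihood of observing states x and y at one site of a two-taxon tree."""
    return 0.25 * jc69_prob(x == y, t)  # 0.25 is the uniform base frequency


def mixture_log_likelihood(seq_a, seq_b, branch_sets, weights):
    """Log-likelihood where each site is summed over all branch-length sets.

    branch_sets: one path length per mixture component (a full tree would
    carry a whole vector of branch lengths per component).
    weights: assumed mixture weights that sum to 1.
    """
    total = 0.0
    for x, y in zip(seq_a, seq_b):
        site = sum(
            w * site_likelihood(x, y, t) for w, t in zip(weights, branch_sets)
        )
        total += math.log(site)
    return total
```

For example, `mixture_log_likelihood("ACGTAC", "ACGAAT", [0.05, 1.0], [0.7, 0.3])` lets conserved sites draw most of their likelihood from the short path and variable sites from the long one, which is the sense in which different sites can have different rates of change across the tree.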

18.
Variations of nucleotidic composition affect phylogenetic inference conducted under stationary models of evolution. In particular, they may cause unrelated taxa sharing similar base composition to be grouped together in the resulting phylogeny. To address this problem, we developed a nonstationary and nonhomogeneous model accounting for compositional biases. Unlike previous nonstationary models, which are branchwise, that is, assume that base composition only changes at the nodes of the tree, in our model, the process of compositional drift is totally uncoupled from the speciation events. In addition, the total number of events of compositional drift distributed across the tree is directly inferred from the data. We implemented the method in a Bayesian framework, relying on Markov Chain Monte Carlo algorithms, and applied it to several nucleotidic data sets. In most cases, the stationarity assumption was rejected in favor of our nonstationary model. In addition, we show that our method is able to resolve a well-known artifact. By Bayes factor evaluation, we compared our model with 2 previously developed nonstationary models. We show that the coupling between speciations and compositional shifts inherent to branchwise models may lead to an overparameterization, resulting in a lesser fit. In some cases, this leads to incorrect conclusions, concerning the nature of the compositional biases. In contrast, our compound model more flexibly adapts its effective number of parameters to the data sets under investigation. Altogether, our results show that accounting for nonstationary sequence evolution may require more elaborate and more flexible models than those currently used.  相似文献   

19.
The MMSOM identification method, previously presented by the authors, is extended to multiple modeling by the irregular self-organizing map (MMISOM) using the irregular SOM (ISOM). The inputs to the neural network are the parameters of the instantaneous model, computed adaptively at every instant. The neural network learns these models. The reference vectors of its output nodes are estimates of the parameters of the local models. At every instant, the model whose output is closest to the plant output is selected as the model of the plant. The ISOM used in this paper is a graph of all the nodes and some of the weighted links between them, chosen to form a minimum spanning tree. It is shown that new models can be added if the initial number of models is smaller than the appropriate number. The MMISOM shows more flexibility in covering the linear model space of the plant when the space is concave.

