Similar literature
Found 20 similar articles (search time: 15 ms)
1.
In genetics, many evolutionary pathways can be modeled by the ordered accumulation of permanent changes. Mixture models of mutagenetic trees have been used to describe disease progression in cancer and in HIV. In cancer, progression is modeled by the accumulation of chromosomal gains and losses in tumor cells; in HIV, the accumulation of drug resistance-associated mutations in the viral genome is known to be associated with disease progression. From such evolutionary models, genetic progression scores can be derived that assign measures for the disease state to single patients. Rtreemix is an R package for estimating mixture models of evolutionary pathways from observed cross-sectional data and for estimating associated genetic progression scores. The package also provides extended functionality for estimating confidence intervals for estimated model parameters and for evaluating the stability of the estimated evolutionary mixture models.

2.

Background  

Mixture models of mutagenetic trees are evolutionary models that capture several pathways of ordered accumulation of genetic events observed in different subsets of patients. They were used to model HIV progression by accumulation of resistance mutations in the viral genome under drug pressure and cancer progression by accumulation of chromosomal aberrations in tumor cells. From the mixture models a genetic progression score (GPS) can be derived that estimates the genetic status of single patients according to the corresponding progression along the tree models. GPS values were shown to have predictive power for estimating drug resistance in HIV or the survival time in cancer. Still, the reliability of the exact values of such complex markers derived from graphical models can be questioned.

3.
MOTIVATION: In cancer research, prediction of time to death or relapse is important for a meaningful tumor classification and selecting appropriate therapies. Survival prognosis is typically based on clinical and histological parameters. There is increasing interest in identifying genetic markers that better capture the status of a tumor in order to improve on existing predictions. The accumulation of genetic alterations during tumor progression can be used for the assessment of the genetic status of the tumor. For modeling dependences between the genetic events, evolutionary tree models have been applied. RESULTS: Mixture models of oncogenetic trees provide a probabilistic framework for the estimation of typical pathogenetic routes. From these models we derive a genetic progression score (GPS) that estimates the genetic status of a tumor. GPS is calculated for glioblastoma patients from loss of heterozygosity measurements and for prostate cancer patients from comparative genomic hybridization measurements. Cox proportional hazard models are then fitted to observed survival times of glioblastoma patients and to times until PSA relapse following radical prostatectomy of prostate cancer patients. It turns out that the genetically defined GPS is predictive even after adjustment for classical clinical markers and thus can be considered a medically relevant prognostic factor. AVAILABILITY: Mtreemix, a software package for estimating tree mixture models, is freely available for non-commercial users at http://mtreemix.bioinf.mpi-sb.mpg.de. The raw cancer datasets and R code for the analysis with Cox models are available upon request from the corresponding author.

4.
5.
A working guide to boosted regression trees   Total citations: 33, self-citations: 0, citations by others: 33
1. Ecologists use statistical models for both explanation and prediction, and need techniques that are flexible enough to express typical features of their data, such as nonlinearities and interactions. 2. This study provides a working guide to boosted regression trees (BRT), an ensemble method for fitting statistical models that differs fundamentally from conventional techniques that aim to fit a single parsimonious model. Boosted regression trees combine the strengths of two algorithms: regression trees (models that relate a response to their predictors by recursive binary splits) and boosting (an adaptive method for combining many simple models to give improved predictive performance). The final BRT model can be understood as an additive regression model in which individual terms are simple trees, fitted in a forward, stagewise fashion. 3. Boosted regression trees incorporate important advantages of tree-based methods, handling different types of predictor variables and accommodating missing data. They have no need for prior data transformation or elimination of outliers, can fit complex nonlinear relationships, and automatically handle interaction effects between predictors. Fitting multiple trees in BRT overcomes the biggest drawback of single tree models: their relatively poor predictive performance. Although BRT models are complex, they can be summarized in ways that give powerful ecological insight, and their predictive performance is superior to most traditional modelling methods. 4. The unique features of BRT raise a number of practical issues in model fitting. We demonstrate the practicalities and advantages of using BRT through a distributional analysis of the short-finned eel (Anguilla australis Richardson), a native freshwater fish of New Zealand. We use a data set of over 13 000 sites to illustrate effects of several settings, and then fit and interpret a model using a subset of the data. 
We provide code and a tutorial to enable the wider use of BRT by ecologists.
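The forward, stagewise fitting described in this abstract can be illustrated with a minimal sketch. It is not the authors' code or tutorial: it assumes a single numeric predictor, squared-error loss, and one-split trees (stumps), and it omits the stochastic subsampling and multi-split trees that practical BRT implementations typically use. The function names and default settings (`fit_brt`, `n_trees`, `learning_rate`) are hypothetical.

```python
def fit_stump(x, residuals):
    """Find the single binary split on a 1-D predictor that minimises squared error."""
    thresholds = sorted(set(x))[:-1]  # splitting above the largest value would leave one side empty
    if not thresholds:
        raise ValueError("need at least two distinct predictor values")
    best = None
    for threshold in thresholds:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1], best[2], best[3]


def predict_stump(stump, xi):
    """Return the fitted mean of the side of the split that xi falls on."""
    threshold, left_mean, right_mean = stump
    return left_mean if xi <= threshold else right_mean


def fit_brt(x, y, n_trees=100, learning_rate=0.1):
    """Forward stagewise boosting: each new stump is fitted to the current residuals."""
    intercept = sum(y) / len(y)
    predictions = [intercept] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        # Shrink each tree's contribution by the learning rate before adding it.
        predictions = [pi + learning_rate * predict_stump(stump, xi)
                       for xi, pi in zip(x, predictions)]
    return intercept, learning_rate, stumps


def predict_brt(model, xi):
    """Additive model: intercept plus the shrunken sum of all stump predictions."""
    intercept, learning_rate, stumps = model
    return intercept + learning_rate * sum(predict_stump(s, xi) for s in stumps)
```

Because every stump is fitted to what the previous stumps left unexplained, the final prediction is a sum of simple trees, which matches the abstract's description of BRT as an additive regression model.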

6.
Carrie L. Woods. Biotropica 2017, 49(4): 452-460
Epiphytes are integral to tropical forests yet little is understood about how succession proceeds in these communities. As trees increase in size they create microhabitats for late‐colonizing species in both small and large branches while maintaining small tree microhabitats for early colonizing species in the small and young branches. Thus, epiphyte succession may follow different models depending on the scale: at the scale of the entire tree, epiphytes may follow a species accumulation model where species are continuously added to the tree as trees increase in size but at the scale of one zone on a branch (e.g., inner crown: 0–2 m from the trunk), they may follow the replacement model of succession seen in terrestrial ecosystems. Assuming tree size as an indicator of tree age, I surveyed 61 Virola koschnyi trees of varying size (2.5–103.3 cm diameter at breast height) in lowland wet tropical forest in Costa Rica to examine how epiphyte communities change through succession. Epiphyte communities in small trees were nested subsets of those in large trees and epiphyte communities became more similar to the largest trees as trees increased in size. Furthermore, epiphyte species in small trees were replaced by mid‐ and late‐successional species in the oldest parts of the tree crown but dispersed toward the younger branches as trees increased in size. Thus, epiphyte succession followed a replacement model in particular zones within tree crowns but a species accumulation model at the scale of the entire tree crown.

7.
One tool in the study of the forces that determine species diversity is the null, or simple, model. The fit of predictions to observations, good or bad, leads to a useful paradigm or to knowledge of forces not accounted for, respectively. It is shown how simple models of speciation and extinction lead directly to predictions of the structure of phylogenetic trees. These predictions include both essential attributes of phylogenetic trees: lengths, in the form of internode distances; and topology, in the form of internode links. These models also lead directly to statistical tests which can be used to compare predictions with phylogenetic trees that are estimated from data. Two different models and eight data sets are considered. A model without species extinction consistently yielded predictions closer to observations than did a model that included extinction. It is proposed that it may be useful to think of the diversification of recently formed monophyletic groups as a random speciation process without extinction.

8.
Simple stochastic models for phylogenetic trees on species have been well studied. But much paleontology data concerns time series or trees on higher-order taxa, and any broad picture of relationships between extant groups requires use of higher-order taxa. A coherent model for trees on (say) genera should involve both a species-level model and a model for the classification scheme by which species are assigned to genera. We present a general framework for such models, and describe three alternate classification schemes. Combining with the species-level model of Aldous and Popovic (Adv Appl Probab 37:1094–1115, 2005), one gets models for higher-order trees, and we initiate analytic study of such models. In particular we derive formulas for the lifetime of genera, for the distribution of number of species per genus, and for the offspring structure of the tree on genera. David Aldous’s research was supported by NSF Grant DMS-0704159.

9.
10.
Trees are usually grown in containers in the nursery until they reach a certain size, whereupon they are transplanted to a permanent location. Infrastructure development has often led to the removal of large trees. To maintain lush foliage and trees of a size that benefit urban ecology, trees can be grown in containers. Containerized trees can be moved from one location to another, and this relocation does not require root pruning or crown-size reduction. The drawback to having trees in containers is the small and confined volume of the container, which limits tree root development and thus affects containerized tree stability. The objective of this study was to understand the failure mechanisms for and the effect of the root dimensions on the stability of containerized trees. Therefore, small-scale stability model tests were conducted which were verified using numerical and analytical models. The results identified two failure modes that were likely to occur: tree overturning and container overturning. The mode of failure was dependent on the root dimensions. When the trees had extended their roots deep into the container, the whole container would overturn in the event of failure due to increased root confinement and shear resistance of the soil. On the other hand, the main failure mechanism when there was shallow root development was the uplifting of the tree from the container while the container remained upright. The results from numerical and analytical models were consistent with those obtained during the small-scale model stability tests.

11.
The phenology of wood formation is a critical process to consider for predicting how trees from the temperate and boreal zones may react to climate change. Compared to leaf phenology, however, the determinism of wood phenology is still poorly known. Here, we compared for the first time three alternative ecophysiological model classes (threshold models, heat‐sum models and chilling‐influenced heat‐sum models) and an empirical model in their ability to predict the starting date of xylem cell enlargement in spring, for four major Northern Hemisphere conifers (Larix decidua, Pinus sylvestris, Picea abies and Picea mariana). We fitted models with Bayesian inference to wood phenological data collected for 220 site‐years over Europe and Canada. The chilling‐influenced heat‐sum model received most support for all the four studied species, predicting validation data with a 7.7‐day error, which is within one day of the observed data resolution. We conclude that both chilling and forcing temperatures determine the onset of wood formation in Northern Hemisphere conifers. Importantly, the chilling‐influenced heat‐sum model showed virtually no spatial bias whichever the species, despite the large environmental gradients considered. This suggests that the spring onset of wood formation is far less affected by local adaptation than by environmentally driven plasticity. In a context of climate change, we therefore expect rising winter–spring temperature to exert ambivalent effects on the spring onset of wood formation, tending to hasten it through the accumulation of forcing temperature, but imposing a higher forcing temperature requirement through the lower accumulation of chilling.
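As a rough illustration of the heat-sum model class named in this abstract, the sketch below predicts onset as the first day on which accumulated forcing reaches a critical sum. It is a generic textbook form, not the authors' fitted model: the function name, the parameters `base_temp`, `critical_sum` and `start_day`, and the linear forcing function are all assumptions. A chilling-influenced variant would, in the same spirit, make `critical_sum` a decreasing function of accumulated chilling, which is omitted here.

```python
def heat_sum_onset(daily_mean_temps, base_temp, critical_sum, start_day=1):
    """Return the 1-based day on which the forcing sum first reaches critical_sum.

    Forcing on a given day is the excess of the mean temperature over base_temp
    (zero when the day is colder than base_temp). Returns None when the critical
    sum is never reached within the supplied series.
    """
    accumulated = 0.0
    for day, temp in enumerate(daily_mean_temps, start=1):
        if day < start_day:
            continue  # forcing is only counted from start_day onward
        accumulated += max(0.0, temp - base_temp)
        if accumulated >= critical_sum:
            return day
    return None
```

Under this form, warmer springs reach the critical sum earlier, which corresponds to the hastening effect of forcing temperature that the abstract describes.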

12.
Large-scale association studies hold promise for discovering the genetic basis of common human disease. These studies will consist of a large number of individuals, as well as a large number of genetic markers, such as single nucleotide polymorphisms (SNPs). The potential size of the data and the resulting model space require the development of efficient methodology to unravel associations between phenotypes and SNPs in dense genetic maps. Our approach uses a genetic algorithm (GA) to construct logic trees consisting of Boolean expressions involving strings or blocks of SNPs. These blocks or nodes of the logic trees consist of SNPs in high linkage disequilibrium (LD), that is, SNPs that are highly correlated with each other due to evolutionary processes. At each generation of our GA, a population of logic tree models is modified using selection, cross-over and mutation moves. Logic trees are selected for the next generation using a fitness function based on the marginal likelihood in a Bayesian regression framework. Mutation and cross-over moves use LD measures to propose changes to the trees, and facilitate the movement through the model space. We demonstrate our method and the flexibility of logic tree structure with variable nodal lengths on simulated data from a coalescent model, as well as data from a candidate gene study of quantitative genetic variation.

13.
The mealy plum aphid, Hyalopterus pruni (Geoffroy) (Hemiptera: Aphididae) is a pest of prune trees in California. The impact of aphids as pests is well characterized by their population growth rate, a parameter integrating their age-specific development, survivorship, and fecundity. These population parameters were measured at five constant temperatures on potted prune trees. Development rates increased with temperature up to an optimum. The relationship between development rate and temperature was described by linear and nonlinear models. Developmental threshold temperature was greater for the nonlinear model than for the linear model. Thermal requirement for development and maximum lethal temperature determined by these models were similar to those for other aphids. The greatest proportional survivorship of nymphs occurred at 26 °C. Mean daily fecundity was lowest at 14 °C and highest at 22 °C. Adult longevity decreased with temperature. Population growth rates for H. pruni were estimated from measurements of fecundity and development time and were highest at 22 °C. This is the first study to document the temperature dependence of the life history parameters for H. pruni and the first to generate a degree-day model for the prediction of phenological events.
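The linear development-rate model mentioned in this abstract is commonly written as rate = a + b·T, from which the lower developmental threshold is -a/b and the thermal requirement is 1/b degree-days. The sketch below shows that standard calculation with ordinary least squares. It is an assumption that the study used this exact parameterisation, and the function names are hypothetical; no data from the study are reproduced.

```python
def fit_linear_development(temps, rates):
    """Ordinary least-squares fit of development rate (1/days) against temperature."""
    n = len(temps)
    mean_t = sum(temps) / n
    mean_r = sum(rates) / n
    sxx = sum((t - mean_t) ** 2 for t in temps)
    sxy = sum((t - mean_t) * (r - mean_r) for t, r in zip(temps, rates))
    slope = sxy / sxx
    intercept = mean_r - slope * mean_t
    return intercept, slope


def degree_day_parameters(intercept, slope):
    """Derive the lower threshold temperature and the thermal constant.

    threshold: temperature at which the fitted rate is zero (-intercept / slope).
    thermal_constant: degree-days above the threshold needed to complete
    development (1 / slope).
    """
    threshold = -intercept / slope
    thermal_constant = 1.0 / slope
    return threshold, thermal_constant
```

The linear form only applies below the optimum temperature; above it, development rate falls again, which is why the abstract also fits a nonlinear model.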

14.
Carrying out simultaneous tree-building and alignment of sequence data is a difficult computational task, and the methods currently available are either limited to a few sequences or restricted to highly simplified models of alignment and phylogeny. A method is given here for overcoming these limitations by Bayesian sampling of trees and alignments simultaneously. The method uses a standard substitution matrix model for residues together with a hidden Markov model structure that allows affine gap penalties. It escapes the heavy computational burdens of other models by using an approximation called the "*" rule, which replaces missing data by a sum over all possible values of variables. The behavior of the model is demonstrated on test sets of globins. Received: 25 May 1998 / Accepted: 8 December 1998

15.
Genome-scale sequence data have become increasingly available in phylogenetic studies for understanding the evolutionary histories of species. However, it is challenging to develop probabilistic models to account for heterogeneity of phylogenomic data. The multispecies coalescent model describes gene trees as independent random variables generated from a coalescence process occurring along the lineages of the species tree. Since the multispecies coalescent model allows gene trees to vary across genes, coalescent-based methods have been popularly used to account for heterogeneous gene trees in phylogenomic data analysis. In this paper, we summarize and evaluate the performance of coalescent-based methods for estimating species trees from genome-scale sequence data. We investigate the effects of deep coalescence and mutation on the performance of species tree estimation methods. We found that the coalescent-based methods perform well in estimating species trees for a large number of genes, regardless of the degree of deep coalescence and mutation. The performance of the coalescent methods is negatively correlated with the lengths of internal branches of the species tree.

16.
We introduce a mechanism for analytically deriving upper bounds on the maximum likelihood for genetic sequence data on sets of phylogenies. A simple 'partition' bound is introduced for general models. Tighter bounds are developed for the simplest model of evolution, the two state symmetric model of nucleotide substitution under the molecular clock. This follows earlier theoretical work which has been restricted to this model by analytic complexity. A weakness of current numerical computation is that reported 'maximum likelihood' results cannot be guaranteed, either for a specified tree (because of the possibility of multiple maxima) or over the full tree space (as the computation is intractable for large sets of trees). The bounds we develop here can be used to conclusively eliminate large proportions of tree space in the search for the maximum likelihood tree. This is vital in the development of a branch and bound search strategy for identifying the maximum likelihood tree. We report the results from a simulation study of approximately 10^6 data sets generated on clock-like trees of five leaves. In each trial a likelihood value of one specific instance of a parameterised tree is compared to the bound determined for each of the 105 possible rooted binary trees. The proportion of trees that are eliminated from the search for the maximum likelihood tree ranged from 92% to almost 98%, indicating a computational speed-up factor of between 12 and 44.
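Two figures in this abstract can be checked with a short calculation. The count of 105 follows from the standard formula (2n-3)!! for rooted binary trees on n labelled leaves. The speed-up is read here as the reciprocal of the fraction of trees that remain after elimination, which is an assumption about how the authors computed it; the function names are hypothetical.

```python
def rooted_binary_tree_count(n_leaves):
    """Number of rooted binary trees on n labelled leaves: (2n-3)!! for n >= 2."""
    if n_leaves < 1:
        raise ValueError("n_leaves must be at least 1")
    count = 1
    for k in range(3, 2 * n_leaves - 2, 2):  # odd factors 3, 5, ..., 2n-3
        count *= k
    return count


def speedup_factor(fraction_eliminated):
    """Speed-up if only the non-eliminated trees must still be evaluated."""
    if not 0.0 <= fraction_eliminated < 1.0:
        raise ValueError("fraction_eliminated must be in [0, 1)")
    return 1.0 / (1.0 - fraction_eliminated)
```

For five leaves the formula gives 3 × 5 × 7 = 105, matching the abstract. Eliminating 92% of trees gives a factor of 12.5, and a factor of 44 corresponds to eliminating about 97.7%, consistent with "almost 98%".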

17.
A general stochastic model is presented that simulates the time course of flowering of individual trees and populations, integrating the synchronization of flowering both between and within trees. Making some hypotheses, a simplified expression of the model, called the 'shoot' model, is proposed, in which the synchronization of flowering both between and within trees is characterized by specific parameters. Two derived models, the 'tree' model and the 'population' model, are presented. They neglect the asynchrony of flowering, respectively, within trees, and between and within trees. Models were fitted and tested using data on flowering of Psidium cattleianum observed at study sites at elevations of 200, 520 and 890 m in Reunion Island. The 'shoot' model fitted the data best and reproduced the strong irregularities in flowering shown by empirical data. The asynchrony of flowering in P. cattleianum was more pronounced within than between trees. Simulations showed that various flowering patterns can be reproduced by the 'shoot' model. The use of different levels of organization of the general model is discussed.

18.
MOTIVATION: The computation of large phylogenetic trees with statistical models such as maximum likelihood or Bayesian inference is computationally extremely intensive. It has repeatedly been demonstrated that these models are able to recover the true tree or a tree which is topologically closer to the true tree more frequently than less elaborate methods such as parsimony or neighbor joining. Due to the combinatorial and computational complexity the size of trees which can be computed on a biologist's PC workstation within reasonable time is limited to trees containing approximately 100 taxa. RESULTS: In this paper we present the latest release of our program RAxML-III for rapid maximum likelihood-based inference of large evolutionary trees which allows for computation of 1,000-taxon trees in less than 24 hours on a single PC processor. We compare RAxML-III to the currently fastest implementations for maximum likelihood and Bayesian inference: PHYML and MrBayes. Whereas RAxML-III performs worse than PHYML and MrBayes on synthetic data it clearly outperforms both programs on all real data alignments used in terms of speed and final likelihood values. AVAILABILITY AND SUPPLEMENTARY INFORMATION: RAxML-III including all alignments and final trees mentioned in this paper is freely available as open source code at http://wwwbode.cs.tum/~stamatak CONTACT: stamatak@cs.tum.edu.

19.
The objective of the present study was to develop an empirical cold hardiness model applicable to several taxa of deciduous trees. Cold hardiness expressed as lowest survival temperature of Acer rubrum, Betula nigra, Liquidambar styraciflua, Fraxinus pennsylvanica, Prunus serotina and Quercus alba was evaluated at approximately weekly intervals during the winters of three consecutive years. Plant samples and meteorological data were collected from Georgia Experiment Station, Griffin, Georgia. Maximum, minimum and average temperatures, hourly chill and heat accumulation, day length and time of year were used as input variables for model development. Stepwise regression analysis was employed to select variables for the model. Based on the assumption that model components should be the same for all taxa included in this study and all three winters, the following independent model variables were selected as valid inputs: day length, number of accumulated hours with temperature above 20°C and number of accumulated hours with temperature below 10°C. Equation coefficients of species-specific models were determined for each species. Cold hardiness predictions were compared to actual observations for each species. The model components were interpreted as representing two processes: (1) internally regulated and independent of actual temperature, and (2) externally regulated and dependent on the amount of accumulated chill or heat. The model allowed for comparisons of cold hardening and dehardening between the studied taxa and between years.

20.
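A functional-structural model of Chinese pine (Pinus tabulaeformis) based on GreenLab   Total citations: 5, self-citations: 1, citations by others: 4
Functional-structural models (FSMs) of plants combine structural models with process models to describe plant growth driven by environmental mechanisms and to output the three-dimensional structure of the plant. GreenLab is a generic functional-structural plant model based on source-sink relationships that has been under continuous development in recent years; it has mostly been applied to crops, and its applications to trees remain few. Taking young Chinese pine (Pinus tabulaeformis) as the study object, this paper applies the GreenLab model to the study of virtual tree growth for the first time. Using destructive sampling, the morphological structure, topological structure and organ biomass of nine young Chinese pine trees were measured, and the data were organized according to a topological coding system. The direct parameters of the model were obtained from the measured data, and the hidden parameters were obtained by inverse fitting with nonlinear least squares. The model assumptions were verified and the simulation performance of the model was evaluated. The results showed that the coefficients of determination of the regressions between observed and simulated values of total internode fresh mass, total needle fresh mass per tree, internode fresh mass and internode length ranged from 0.78 to 0.91, so the model reflects the structure and growth process of Chinese pine fairly realistically. The proposed methods for measuring and coding tree structure and biomass can serve as a reference for building functional-structural models of conifers.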
