Similar literature
20 similar documents found.
1.
The objective of this study was to obtain a quantitative assessment of the monophyly of morning glory taxa, specifically the genus Ipomoea and the tribe Argyreieae. Previous systematic studies of morning glories intimated the paraphyly of Ipomoea by suggesting that the genera within the tribe Argyreieae are derived from within Ipomoea; however, no quantitative estimates of statistical support were developed to address these questions. We applied a Bayesian analysis to provide quantitative estimates of monophyly in an investigation of morning glory relationships using DNA sequence data. We also explored various approaches for examining convergence of the Markov chain Monte Carlo (MCMC) simulation of the Bayesian analysis by running 18 separate analyses varying in length. We found convergence of the important components of the phylogenetic model (the tree with the maximum posterior probability, branch lengths, the parameter values from the DNA substitution model, and the posterior probabilities for clade support) for these data after one million generations of the MCMC simulations. In the process, we identified a run where the parameter values obtained were often outside the range of values obtained from the other runs, suggesting an aberrant result. In addition, we compared the Bayesian method of phylogenetic analysis to maximum likelihood and maximum parsimony. The results from the Bayesian analysis and the maximum likelihood analysis were similar for topology, branch lengths, and parameters of the DNA substitution model. Topologies also were similar in the comparison between the Bayesian analysis and maximum parsimony, although the posterior probabilities and the bootstrap proportions exhibited some striking differences. 
In a Bayesian analysis of three data sets (ITS sequences, waxy sequences, and ITS + waxy sequences), no support for the monophyly of the genus Ipomoea, or for the tribe Argyreieae, was observed, with the estimated probability of monophyly of these taxa being less than 3.4 × 10^-7.

2.
Among-site rate variation (alpha) and transition bias (kappa) have been shown, most often as independent parameters, to be important dynamics in DNA evolution. Accounting for these dynamics should result in better estimates of phylogenetic relationships. To test this idea, we simultaneously estimated overall (averaged over all codon positions) and codon-specific values of alpha and kappa, using maximum likelihood analyses of cytochrome b data from all genera of pipits and wagtails (Aves: Motacillidae), and six outgroup species, using initial trees generated with default values. Estimates of alpha and kappa were robust to initial tree topology and suggested substantial among-site rate variation even within codon classes; alpha was lowest (large among-site rate variation) at second-codon and highest (low among-site rate variation) at third-codon positions. When overall values were applied, there were shifts in tree topology and dramatic and statistically significant improvements in log-likelihood scores of trees compared with the scores from application of default values. Applying codon-specific values resulted in yet another highly significant increase in likelihood. However, although incorporating substitution dynamics into maximum likelihood, maximum parsimony, and neighbor-joining analyses resulted in increases in congruence among trees, there were only minor improvements in phylogenetic signal, and none of the successive-approximation tree topologies were statistically distinguishable from one another by the data. We suggest that the bushlike nature of many higher-level phylogenies in birds makes estimating the dynamics of DNA evolution less sensitive to tree topology but also less susceptible to improvement via weighting.

3.
MIXED MODEL APPROACHES FOR ESTIMATING GENETIC VARIANCES AND COVARIANCES   Total citations: 62, self-citations: 4, citations by others: 58
The limitations of analysis of variance (ANOVA) methods in estimating genetic variances are discussed. Among the three methods for mixed linear models (maximum likelihood, ML; restricted maximum likelihood, REML; and minimum norm quadratic unbiased estimation, MINQUE), the MINQUE method is presented with formulae for estimating variance components and covariance components and for predicting genetic effects. Several genetic models that cannot be appropriately analyzed by ANOVA methods are introduced in the form of mixed linear models. Genetic models with independent random effects can be analyzed by the MINQUE(1) method, which is a MINQUE method with all prior values set to 1. The MINQUE(1) method can give unbiased estimates of variance components and covariance components, and linear unbiased prediction (LUP) of genetic effects. There are more complicated genetic models for plant seeds which involve correlated random effects. The MINQUE(0/1) method, which is a MINQUE method with all prior covariances set to 0 and all prior variances set to 1, is suitable for estimating variance and covariance components in these models. Mixed model approaches have an advantage over ANOVA methods in their capacity to analyze unbalanced data and complicated models. Some problems concerning estimation and hypothesis testing by the MINQUE method are discussed.

4.
By a maximum likelihood analysis of mitochondrial DNA sequences, we examine Graur and Higgins' hypothesis of the Ruminantia/Cetacea clade with Suiformes as an outgroup. Graur and Higgins analyzed these sequences by the neighbor-joining and parsimony methods, as well as by the maximum likelihood method under the assumption that the substitution rate is the same for all sites. The Ruminantia/Suiformes clade assumed by the traditional taxonomy was rejected strongly by this analysis and the Ruminantia/Cetacea clade was supported. Adoption of a more realistic model distinguishing among rates at different codon positions in the maximum likelihood analysis of the same data, however, grossly reduces the significance level on the Graur-Higgins hypothesis. Thus, although the Ruminantia/Suiformes grouping is indeed least likely from Graur and Higgins' data set of mitochondrial DNA, this traditional tree cannot be rejected with statistical significance under the new analysis, and more data are needed to settle the issue. In the same way, we examine Irwin and Arnason's suggestion of the Hippopotamus/Cetacea clade by using cytochrome b and hemoglobins alpha and beta, and it turns out that their suggestion is also fragile. This analysis demonstrates the importance of selecting an appropriate model among the alternatives in the maximum likelihood analysis and of using many different genes from many relevant species in order to make reliable phylogenetic inferences.

5.
ABSTRACT: BACKGROUND: Linkage analysis is a useful tool for detecting genetic variants that regulate a trait of interest, especially genes associated with a given disease. Although penetrance parameters play an important role in determining gene location, they are assigned arbitrary values according to the researcher's intuition or as estimated by the maximum likelihood principle. Several methods exist by which to evaluate the maximum likelihood estimates of penetrance, although not all of these are supported by software packages and some are biased by marker genotype information, even when disease development is due solely to the genotype of a single allele. FINDINGS: Programs for exploring the maximum likelihood estimates of penetrance parameters were developed using the R statistical programming language supplemented by external C functions. The software returns a vector of polynomial coefficients of penetrance parameters, representing the likelihood of pedigree data. From the likelihood polynomial supplied by the proposed method, the likelihood value and its gradient can be precisely computed. To reduce the effect of the supplied dataset on the likelihood function, feasible parameter constraints can be introduced into maximum likelihood estimates, thus enabling flexible exploration of the penetrance estimates. An auxiliary program generates a perspective plot allowing visual validation of the model's convergence. The functions are collectively available as the MLEP R package. CONCLUSIONS: Linkage analysis using penetrance parameters estimated by the MLEP package enables feasible localization of a disease locus. This is shown through a simulation study and by demonstrating how the package is used to explore maximum likelihood estimates. Although the input dataset tends to bias the likelihood estimates, the method yields accurate results superior to the analysis using intuitive penetrance values for disease with low allele frequencies. 
MLEP is part of the Comprehensive R Archive Network and is freely available at http://cran.r-project.org/web/packages/MLEP/index.html.

6.
A method to estimate genetic variance components in populations partially pedigreed by DNA fingerprinting is presented. The focus is on aquaculture, where breeding procedures may produce thousands of individuals. In aquaculture populations the individuals available for measurement will often be selected, i.e. will come from the upper tail of a size‐at‐age distribution, or the lower tail of an age‐at‐maturity distribution etc. Selection typically occurs by size grading during grow‐out and/or choice of superior fish as broodstock. The method presented in this paper enables us to estimate genetic variance components when only a small proportion of individuals, those with extreme phenotypes, have been identified by DNA fingerprinting. We replace the usual normal density by appropriate robust least favourable densities to ensure the robustness of our estimates. Standard analysis of variance or maximum likelihood estimation cannot be used when only the extreme progeny have been pedigreed because of the biased nature of the estimates. In our model‐based procedure a full robust likelihood function is defined, in which the missing information about non‐extreme progeny has been taken into account. This robust likelihood function is transformed into a computable function which is maximized to get the estimates. The estimates of sire and dam additive variance components are significantly and uniformly more accurate than those obtained by any of the standard methods when tested on simulated population data and have desirable robustness properties.

7.
Summary: A maximum likelihood method for inferring evolutionary trees from DNA sequence data was developed by Felsenstein (1981). In evaluating the extent to which the maximum likelihood tree is a significantly better representation of the true tree, it is important to estimate the variance of the difference between the log likelihoods of different tree topologies. Bootstrap resampling can be used for this purpose (Hasegawa et al. 1988; Hasegawa and Kishino 1989), but it imposes a great computational burden. To overcome this difficulty, we developed a new method for estimating the variance by expressing it explicitly. The method was applied to DNA sequence data from primates in order to evaluate the maximum likelihood branching order among Hominoidea. It was shown that, although the orangutan is convincingly placed as an outgroup of a human and African ape clade, the branching order among human, chimpanzee, and gorilla cannot be determined confidently from the DNA sequence data presently available when evolutionary rate constancy is not assumed.
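
The explicit variance estimate mentioned above can be illustrated with a small sketch. It assumes the commonly cited form in which the variance of the total log-likelihood difference between two topologies is estimated from the per-site differences d_i as n/(n-1) times the sum of (d_i - mean(d))^2. That formula, the function name `loglik_difference_se`, and the toy per-site values are assumptions of this sketch and are not given in the abstract.

```python
import math


def loglik_difference_se(site_lnl_a, site_lnl_b):
    """Return (total difference, standard error) for two topologies.

    site_lnl_a and site_lnl_b hold per-site log likelihoods of the same
    alignment under topology A and topology B (hypothetical inputs).
    """
    n = len(site_lnl_a)
    if n != len(site_lnl_b) or n < 2:
        raise ValueError("need two equally long lists with at least two sites")
    # Per-site differences in log likelihood.
    diffs = [a - b for a, b in zip(site_lnl_a, site_lnl_b)]
    total = sum(diffs)
    mean = total / n
    # Assumed explicit estimate of the variance of the summed difference.
    variance = n / (n - 1) * sum((d - mean) ** 2 for d in diffs)
    return total, math.sqrt(variance)


# Toy per-site values, for illustration only.
total, se = loglik_difference_se([-1.0, -2.0, -3.0], [-1.5, -2.0, -2.0])
```

A total difference that is small relative to its standard error would suggest that the two topologies cannot be distinguished by the data, which is the situation the abstract reports for human, chimpanzee, and gorilla.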

8.
Summary: Analysis of variance and principal components methods have been suggested for estimating repeatability. In this study, six estimation procedures are compared: ANOVA, principal components based on the sample covariance matrix and also on the sample correlation matrix, a related multivariate method (structural analysis) based on the sample covariance matrix and also on the sample correlation matrix, and maximum likelihood estimation. A simulation study indicates that when the standard linear model assumptions are met, the estimators are quite similar except when the repeatability is small. Overall, maximum likelihood appears the preferred method. If the assumption of equal variance is relaxed, the methods based on the sample correlation matrix perform better although others are surprisingly robust. The structural analysis method (with sample correlation matrix) appears to be best. Paper number 776 from the Department of Meat and Animal Science, University of Wisconsin-Madison.

9.
Summary: The efficiency of obtaining the correct tree by the maximum likelihood method (Felsenstein 1981) for inferring trees from DNA sequence data was compared with that of distance methods. It was shown that the maximum likelihood method is superior to distance methods in efficiency, particularly when the evolutionary rate differs among lineages.

10.
Parameters of the two-parameter logistic model are generally estimated via the expectation-maximization algorithm, which improves initial values for all parameters iteratively until convergence is reached. Effects of initial values are rarely discussed in item response theory (IRT), but initial values were recently found to affect item parameters when estimating the latent distribution with full non-parametric maximum likelihood. However, this method is rarely used in practice. Hence, the present study investigated effects of initial values on item parameter bias and on recovery of item characteristic curves in BILOG-MG 3, a widely used IRT software package. Results showed notable effects of initial values on item parameters. For tighter convergence criteria, effects of initial values decreased, but item parameter bias increased, and the recovery of the latent distribution worsened. For practical application, it is advised to use the BILOG default convergence criterion with appropriate initial values when estimating the latent distribution from data.
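
For readers unfamiliar with the model, the item characteristic curve of the two-parameter logistic model can be sketched as follows. The function name `icc_2pl` is hypothetical, and the sketch assumes the plain logistic metric without the scaling constant (often written D = 1.7) that some software applies; it does not reproduce anything from BILOG-MG 3.

```python
import math


def icc_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model.

    theta: latent ability, a: item discrimination, b: item difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# At theta equal to the difficulty b, the probability is one half.
p_at_difficulty = icc_2pl(0.5, 1.2, 0.5)
```

The parameters a and b are the item parameters whose estimates, according to the abstract, depend on the initial values supplied to the estimation algorithm.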

11.
Estimation of variance components in linear mixed models is important in clinical trial and longitudinal data analysis. It is also important in animal and plant breeding for accurately partitioning total phenotypic variance into genetic and environmental variances. The restricted maximum likelihood (REML) method is often preferred over the maximum likelihood (ML) method for variance component estimation because REML takes into account the degrees of freedom lost in estimating the fixed effects. The original restricted likelihood function involves a linear transformation of the original response variable (a collection of error contrasts). Harville's final form of the restricted likelihood function does not involve the transformation and thus is much easier to manipulate than the original restricted likelihood function. There are several different ways to show that the two forms of the restricted likelihood are equivalent. In this study, I present a much simpler way to derive Harville's restricted likelihood function. I first treat the fixed effects as random effects and call such a mixed model a pseudo random model (PDRM). I then construct a likelihood function for the PDRM. Finally, I let the variance of the pseudo random effects be infinity and show that the limit of the likelihood function of the PDRM is the restricted likelihood function.
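
The degrees-of-freedom point can be seen in the simplest possible case. The sketch below is not the derivation described in the abstract; it assumes an intercept-only model y_i = mu + e_i with a single variance component, where the ML estimate of the residual variance divides the sum of squares by n and the REML estimate divides it by n - p, with p = 1 fixed effect. The function name `variance_ml_reml` and the data are illustrative assumptions.

```python
def variance_ml_reml(y):
    """Return (ML, REML) residual variance estimates for an intercept-only model."""
    n = len(y)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(y) / n
    sum_sq = sum((v - mean) ** 2 for v in y)
    p = 1  # number of fixed effects estimated (the intercept only)
    # ML ignores the degree of freedom used to estimate the mean; REML does not.
    return sum_sq / n, sum_sq / (n - p)


ml_var, reml_var = variance_ml_reml([2.0, 4.0, 4.0, 6.0])
```

With these four observations the sum of squares is 8, so the ML estimate is 2 and the REML estimate is 8/3, and the gap between the two shrinks as n grows relative to the number of fixed effects.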

12.
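To investigate the effect of evolutionary models on DNA barcode-based classification, this study obtained COI gene sequences from specimens of 44 species of Noctuidae from Wuling Mountain. Phylogenetic trees were constructed using the neighbor-joining, maximum parsimony, maximum likelihood, and Bayesian inference methods, and the success rates of 12 models for neighbor-joining, 7 models for maximum likelihood, and 2 models for Bayesian inference were evaluated. The results showed that the success rates of the 12 neighbor-joining models differed little and were relatively stable; the success rates of different models under maximum likelihood and Bayesian inference differed markedly and were unstable; maximum parsimony is not model-based, and its success rate was relatively stable. Neighbor-joining and maximum likelihood shared six models, and the success rates of these six models differed between the two methods. In addition, cases in the molecular data where a species was represented by only a single sequence significantly reduced the model success rate, indicating that multiple samples per species are needed in DNA barcoding studies.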

13.
The usefulness of fluorescence techniques for the study of macromolecular structure and dynamics depends on the accuracy and sensitivity of the methods used for data analysis. Many methods for data analysis have been proposed and used, but little attention has been paid to the maximum likelihood method, generally known as the most powerful statistical method for parameter estimation. In this paper we study the properties and behavior of maximum likelihood estimates by using simulated fluorescence intensity decay data. We show that the maximum likelihood method provides generally more accurate estimates of lifetimes and fractions than does the standard least-squares approach, especially when the lifetime ratios between individual components are small. Three novelties to the field of fluorescence decay analysis are also introduced and studied in this paper: a) discretization of the convolution integral based on the generalized integral mean value theorem; b) the likelihood ratio test as a tool to determine the number of exponential decay components in a given decay profile; and c) separability and detectability indices, which provide measures of how accurately a particular decay component can be detected. Based on the experience gained from this and from our previous study of the Padé-Laplace method, we make some recommendations on how the complex problem of deconvolution and parameter estimation of multiexponential functions might be approached in an experimental setting. Offprint requests to: F. G. Prendergast

14.
In this article we give a procedure for the common estimation of parameters corresponding to several treatment groups. We assume that the distribution functions of the groups belong to the same family and differ only in the parameter values. The procedure allows the common estimation of some of these parameters. The parameters themselves are estimated by the maximum likelihood method; the estimators are calculated iteratively by the Newton-Raphson method. To test whether common estimation is possible, we propose the maximum likelihood ratio test. Finally, we show the application of our procedure in the case of probit analysis.

15.
Summary: Based on mitochondrial DNA (mtDNA) sequence data from a wide range of primate species, branching order in the evolution of primates was inferred by the maximum likelihood method of Felsenstein without assuming rate constancy among lineages. Bootstrap probabilities for being the maximum likelihood tree topology among alternatives were estimated without performing a maximum likelihood estimation for each resampled data set. Variation in the evolutionary rate among lineages was examined for the maximum likelihood tree by a method developed by Kishino and Hasegawa. From these analyses it appears that the transition rate of mtDNA evolution in the lemur has been extremely low, only about 1/10 that in other primate lines, whereas the transversion rate does not differ significantly from that of other primates. Furthermore, the transition rate in catarrhines, except the gibbon, is higher than those in the tarsier and in platyrrhines, and the transition rate in the gibbon is lower than those in other catarrhines. Branching dates in primate evolution were estimated by a molecular clock analysis of mtDNA, taking into account the rate of variation among different lines, and the results were compared with those estimated from nuclear DNA. Under the most likely model, where the evolutionary rate of mtDNA has been uniform within a great apes/human clade, human/chimpanzee clustering is preferred to the alternative branching orders among human, chimpanzee, and gorilla.

16.
Several maximum likelihood and distance matrix methods for estimating phylogenetic trees from homologous DNA sequences were compared when substitution rates at sites were assumed to follow a gamma distribution. Computer simulations were performed to estimate the probabilities that various tree estimation methods recover the true tree topology. The case of four species was considered, and a few combinations of parameters were examined. Attention was paid to discriminating among different sources of error in tree reconstruction, i.e., the inconsistency of the tree estimation method, the sampling error in the estimated tree due to limited sequence length, and the sampling error in the estimated probability due to the number of simulations being limited. Compared to the least squares method based on pairwise distance estimates, the joint likelihood analysis is found to be more robust when rate variation over sites is present but ignored and an assumption is thus violated. With limited data, the likelihood method has a much higher probability of recovering the true tree and is therefore more efficient than the least squares method. The concept of statistical consistency of a tree estimation method and its implications were explored, and it is suggested that, while the efficiency (or sampling error) of a tree estimation method is a very important property, statistical consistency of the method over a wide range of, if not all, parameter values is a prerequisite.

17.
Mitochondrial DNA sequences from the 12S rRNA gene, four tRNA genes, and a portion of two protein coding genes were used to investigate the relationships among myliobatoid genera. In addition, we conducted an investigation of the sister group to the freshwater stingrays by sampling additional DNA sequences from GenBank. Consequently, two datasets were used to examine myliobatoid relationships. The first consisted of the genes sequenced in this study. The second dataset was compiled by combining the first dataset with cytochrome b sequences from GenBank. The second dataset, however, included a number of missing characters due to differences in sampling. The effect of the missing characters on both maximum parsimony and maximum likelihood analysis was investigated by conducting a simulation study. Results of the simulation study indicated that maximum likelihood was not sensitive to the missing data, whereas the accuracy of maximum parsimony analysis was expected to decrease. Phylogenetic analysis of this group had several areas concordant with morphological studies; however, the analysis also revealed two novel relationships. In addition, the placement of two taxa (Gymnura and Himantura) was dependent on both the dataset and the analytical method used.

18.
The available information on sample size requirements of mixture analysis methods is insufficient to permit a precise evaluation of the potential problems facing practical applications of mixture analysis. We use results from Monte Carlo simulation to assess the sample size requirements of a simple mixture analysis method under conditions relevant to biological applications of mixture analysis. The mixture model used includes two univariate normal components with equal variances but assumes that the researcher is ignorant as to the equality of the variances. The method used relies on the EM algorithm to compute the maximum likelihood estimates of the mixture parameters, and the likelihood ratio test to assess the number of components in the mixtures. Our results suggest that sample sizes close to 500 or 1000 observations may be required to adequately solve mixtures commonly found in biology. Sample sizes of 500 or 1000 are difficult to achieve. However, use of this mixture analysis method may be a reasonable option when the researcher deals with problems which are intractable by other means. Copyright 1999 Academic Press.
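
The EM step of such a method can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure evaluated in the abstract: the function names, the starting values (the sorted sample split in half), the fixed number of iterations, and the simulated data are all choices made for this sketch. The two components are fitted with separate variances, mirroring the stated ignorance about their equality, and the likelihood ratio test for the number of components is not covered.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of a univariate normal distribution."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def weighted_sd(xs, weights, mu, total_weight, min_sigma):
    """Weighted standard deviation around mu, floored to avoid collapse."""
    var = sum(w * (x - mu) ** 2 for w, x in zip(weights, xs)) / total_weight
    return max(math.sqrt(var), min_sigma)


def em_two_normals(data, n_iter=200, min_sigma=1e-6):
    """Fit a two-component normal mixture by EM; returns (pi1, mu1, s1, mu2, s2)."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    # Crude starting values (an assumption of this sketch).
    mu1 = sum(xs[:half]) / half
    mu2 = sum(xs[half:]) / (n - half)
    grand = sum(xs) / n
    s1 = s2 = weighted_sd(xs, [1.0] * n, grand, n, min_sigma)
    pi1 = 0.5
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to component 1.
        resp1 = []
        for x in xs:
            p1 = pi1 * normal_pdf(x, mu1, s1)
            p2 = (1.0 - pi1) * normal_pdf(x, mu2, s2)
            resp1.append(p1 / (p1 + p2))
        resp2 = [1.0 - r for r in resp1]
        # M-step: weighted updates of proportions, means and standard deviations.
        w1 = sum(resp1)
        w2 = n - w1
        mu1 = sum(r * x for r, x in zip(resp1, xs)) / w1
        mu2 = sum(r * x for r, x in zip(resp2, xs)) / w2
        s1 = weighted_sd(xs, resp1, mu1, w1, min_sigma)
        s2 = weighted_sd(xs, resp2, mu2, w2, min_sigma)
        pi1 = w1 / n
    return pi1, mu1, s1, mu2, s2


# Simulated sample of 500 observations from two well-separated components.
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(250)]
sample += [random.gauss(4.0, 1.0) for _ in range(250)]
pi1, mu1, s1, mu2, s2 = em_two_normals(sample)
```

The simulated components here are four standard deviations apart, which is a far easier case than the poorly separated mixtures for which the abstract reports that 500 to 1000 observations may be needed.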

19.
Three methods of estimating the parameters of the Johnson S_B distribution were tested by simulation. The maximum likelihood method, the method based on percentiles of a sample, and the method based on moments of a transformed random variable were taken into consideration. Many sets of samples were generated, differing in size and in the actual values of the parameters, whereupon the parameters were estimated by the three methods. It was shown that if the sample is small or the skewness of the distribution is considerable, the maximum likelihood estimates can assume preposterous values. The method based on moments is recommended due to its simplicity and to the fact that the estimates, though usually biased, never assume absurd values.

20.
Markatou M. Biometrics, 2000, 56(2): 483-486
Problems associated with the analysis of data from a mixture of distributions include the presence of outliers in the sample, the fact that a component may not be well represented in the data, and the problem of biases that occur when the model is slightly misspecified. We study the performance of weighted likelihood in this context. The method produces estimates with low bias and mean squared error, and it is useful in that it unearths data substructures in the form of multiple roots. This in turn indicates multiple potential mixture model fits due to the presence of more components than originally specified in the model. To compute the weighted likelihood estimates, we use as starting values the method of moment estimates computed on bootstrap subsamples drawn from the data. We address a number of important practical issues involving bootstrap sample size selection, the role of starting values, and the behavior of the roots. The algorithm used to compute the weighted likelihood estimates is competitive with EM, and it is similar to EM when the components are not well separated. Moreover, we propose a new statistical stopping rule for the termination of the algorithm. An example and a small simulation study illustrate the above points.
