Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
《动物分类学报》2017,(1):46-58
Distinguishing species or populations from morphometric data is generally done through multivariate analyses, in particular discriminant analysis. We explored another approach based on the maximum likelihood method. Simple statistics based on the assumption of a normal distribution for a single variable allow one to compute the chance of observing a particular datum (or sample) in a given reference group. When data are described by more than one variable, the maximum likelihood (MLi) approach allows these chances to be combined to find the best fit for the data. Such an approach assumes independence between variables. The assumptions of normally distributed variables and independence between them are frequently not met in morphometrics, but improvements may be obtained after some mathematical transformations. Provided there is strict anatomical correspondence of variables between unknown and reference data, the MLi classification produces consistent results. We explored this approach using various input data, and compared validated classification scores with those obtained from Mahalanobis distance-based classification. The simplicity of the method, its fast computation, performance and versatility make it an interesting complement to other classification techniques.
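The classification rule summarized in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`fit_reference`, `log_likelihood`, `classify`) and the toy measurements are hypothetical, and each variable is assumed to be normally distributed and independent of the others within a reference group, so the per-variable log-likelihoods simply add up.

```python
import math


def fit_reference(samples):
    """Estimate (mean, standard deviation) per variable for one reference group."""
    n = len(samples)
    params = []
    for column in zip(*samples):
        mean = sum(column) / n
        var = sum((x - mean) ** 2 for x in column) / (n - 1)  # sample variance
        params.append((mean, math.sqrt(var)))
    return params


def log_likelihood(x, params):
    """Sum of univariate normal log-densities (independence assumption)."""
    total = 0.0
    for value, (mean, sd) in zip(x, params):
        z = (value - mean) / sd
        total += -0.5 * z * z - math.log(sd) - 0.5 * math.log(2 * math.pi)
    return total


def classify(x, references):
    """Return the name of the reference group with the highest log-likelihood."""
    return max(references, key=lambda name: log_likelihood(x, references[name]))


# Toy reference data (invented for illustration): two variables per specimen.
references = {
    "A": fit_reference([[1.0, 2.0], [1.2, 2.1], [0.9, 1.8], [1.1, 2.2]]),
    "B": fit_reference([[2.0, 3.0], [2.2, 3.1], [1.9, 2.8], [2.1, 3.2]]),
}

print(classify([1.05, 2.0], references))
```

Working in log-likelihoods rather than raw probabilities avoids numerical underflow when many variables are combined. Any transformation applied to the reference data to improve normality would also have to be applied to the unknown specimen.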

2.
This study presents an effective procedure for the determination of a biologically inspired, black-box model of cultures of microorganisms (including yeasts, bacteria, plant and animal cells) in bioreactors. This procedure is based on sets of experimental data measuring the time-evolution of a few extracellular species concentrations, and makes use of maximum likelihood principal component analysis to determine, independently of the kinetics, an appropriate number of macroscopic reactions and their stoichiometry. In addition, this paper provides a discussion of the geometric interpretation of a stoichiometric matrix and the potential equivalent reaction schemes. The procedure is carefully evaluated within the stoichiometric identification framework of the growth of the yeast Kluyveromyces marxianus on cheese whey. Using Monte Carlo studies, it is also compared with two other previously published approaches.

3.

Background  

The aim of protein design is to predict amino-acid sequences compatible with a given target structure. Traditionally envisioned as a purely thermodynamic question, this problem can also be understood in a wider context, where additional constraints are captured by learning the sequence patterns displayed by natural proteins of known conformation. In this latter perspective, however, we still need a theoretical formalization of the question, leading to general and efficient learning methods, and allowing for the selection of fast and accurate objective functions quantifying sequence/structure compatibility.

4.
A maximum likelihood approach to two-dimensional crystals   Total citations: 1, self-citations: 0, citations by others: 1
Maximum likelihood (ML) processing of transmission electron microscopy images of protein particles can produce reconstructions of superior resolution due to a reduced reference bias. We have investigated an ML processing approach to images centered on the unit cells of two-dimensional (2D) crystal images. The implemented software makes use of the predictive lattice node tracking in the MRC software, which is used to window particle stacks. These are then noise-whitened and subjected to ML processing. Resulting ML maps are translated into amplitudes and phases for further processing within the 2dx software package. Compared with ML processing for randomly oriented single particles, the required computational costs are greatly reduced as the 2D crystals restrict the parameter search space. The software was applied to images of negatively stained or frozen hydrated 2D crystals of different crystal order. We find that the ML algorithm is not free from reference bias, even though its sensitivity to noise correlation is lower than for pure cross-correlation alignment. Compared with crystallographic processing, the newly developed software yields better resolution for 2D crystal images of lower crystal quality, and it performs equally well for well-ordered crystal images.

5.
Summary: A non-linear method of ordinating vegetation samples based on the fitting of bell-shaped response curves is described. For each species two Gaussian curves were fitted, one to quantitative values where the species was present, the other to probabilities of absence. A maximum likelihood approach was then used to obtain a best approximation of the positions of the samples along a one-dimensional gradient. By an iterative process successively better approximations were obtained. The method was successful in recovering gradients based on hypothetical data. With two sets of real data the gradient produced was more ecologically satisfying and far less distorted than that revealed by principal component analysis.

6.
Modeling residue usage in aligned protein sequences via maximum likelihood   Total citations: 9, self-citations: 6, citations by others: 3
A computational method is presented for characterizing residue usage, i.e., site-specific residue frequencies, in aligned protein sequences. The method obtains frequency estimates that maximize the likelihood of the sequences in a simple model for sequence evolution, given a tree or a set of candidate trees computed by other methods. These maximum-likelihood frequencies constitute a profile of the sequences, and thus the method offers a rigorous alternative to sequence weighting for constructing such a profile. The ability of this method to discard misleading phylogenetic effects allows the biochemical propensities of different positions in a sequence to be more clearly observed and interpreted.

7.

Background  

Inference of population stratification and individual admixture from genetic markers is an integral part of studies in diverse situations, such as association mapping and evolutionary studies. Bayesian methods have been proposed for population stratification and admixture inference using multilocus genotypes and are widely used in practice. However, these Bayesian methods demand intensive computational resources and may run into convergence problems in Markov chain Monte Carlo-based posterior sampling.

8.
With the expansion of offender/arrestee DNA profile databases, genetic forensic identification has become commonplace in the United States criminal justice system. Implementation of familial searching has been proposed to extend forensic identification to family members of individuals with profiles in offender/arrestee DNA databases. In familial searching, a partial genetic profile match between a database entrant and a crime scene sample is used to implicate genetic relatives of the database entrant as potential sources of the crime scene sample. In addition to concerns regarding civil liberties, familial searching poses unanswered statistical questions. In this study, we define confidence intervals on estimated likelihood ratios for familial identification. Using these confidence intervals, we consider familial searching in a structured population. We show that relatives and unrelated individuals from population samples with lower gene diversity over the loci considered are less distinguishable. We also consider cases where the most appropriate population sample for individuals considered is unknown. We find that as a less appropriate population sample, and thus allele frequency distribution, is assumed, relatives and unrelated individuals become more difficult to distinguish. In addition, we show that relationship distinguishability increases with the number of markers considered, but decreases for more distant genetic familial relationships. All of these results indicate that caution is warranted in the application of familial searching in structured populations, such as in the United States.

9.
Genetic data are useful for estimating the genealogical relationship or relatedness between individuals of unknown ancestry. We present a computer program, ML-Relate, that calculates maximum likelihood estimates of relatedness and relationship. ML-Relate is designed for microsatellite data and can accommodate null alleles. It uses simulation to determine which relationships are consistent with genotype data and to compare putative relationships with alternatives. ML-Relate runs on the Microsoft Windows operating system and is available from http://www.montana.edu/kalinowski.

10.
Despite the increasing number of published protein structures, and the fact that each protein's function relies on its three-dimensional structure, there is limited access to automatic programs used for the identification of critical residues from the protein structure, compared with those based on protein sequence. Here we present a new algorithm based on network analysis applied exclusively on protein structures to identify critical residues. Our results show that this method identifies critical residues for protein function with high reliability and improves automatic sequence-based approaches and previous network-based approaches. The reliability of the method depends on the conformational diversity screened for the protein of interest. We have designed a web site to give access to this software at http://bis.ifc.unam.mx/jamming/. In summary, a new method is presented that relates critical residues for protein function with the most traversed residues in networks derived from protein structures. A unique feature of the method is the inclusion of the conformational diversity of proteins in the prediction, thus reproducing a basic feature of the structure/function relationship of proteins.

11.
Identification of phenotypic modules, semiautonomous sets of highly correlated traits, can be accomplished through exploratory (e.g., cluster analysis) or confirmatory approaches (e.g., RV coefficient analysis). Although statistically more robust, confirmatory approaches are generally unable to compare across different model structures. For example, RV coefficient analysis finds support for both two‐ and six‐module models for the therian mammalian skull. Here, we present a maximum likelihood approach that takes into account model parameterization. We compare model log‐likelihoods of trait correlation matrices using the finite‐sample corrected Akaike Information Criterion, allowing for comparison of hypotheses across different model structures. Simulations varying model complexity and within‐ and between‐module contrast demonstrate that this method correctly identifies model structure and parameters across a wide range of conditions. We further analyzed a dataset of 3‐D data, consisting of 61 landmarks from 181 macaque (Macaca fuscata) skulls, distributed among five age categories, testing 31 models, including no modularity among the landmarks and various partitions of two, three, six, and eight modules. Our results clearly support a complex six‐module model, with separate within‐ and intermodule correlations. Furthermore, this model was selected for all five age categories, demonstrating that this complex pattern of integration in the macaque skull appears early and is highly conserved throughout postnatal ontogeny. Subsampling analyses demonstrate that this method is robust to relatively low sample sizes, as is commonly encountered in rare or extinct taxa. This new approach allows for the direct comparison of models with different parameterizations, providing an important tool for the analysis of modularity across diverse systems.
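The model comparison step mentioned in this abstract, ranking models with different numbers of parameters by the finite-sample corrected Akaike Information Criterion (AICc), can be sketched as follows. This is only an illustration of the standard AICc formula, not the authors' code: the function name `aicc`, the model labels, and all log-likelihoods, parameter counts and the sample size are made-up values, not results from the study.

```python
def aicc(log_likelihood, k, n):
    """Finite-sample corrected AIC: 2k - 2 ln L + 2k(k + 1) / (n - k - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    return 2 * k - 2 * log_likelihood + (2 * k * (k + 1)) / (n - k - 1)


# Hypothetical candidates: name -> (maximized log-likelihood, number of parameters).
candidates = {
    "simple": (-520.0, 2),
    "medium": (-505.0, 5),
    "complex": (-503.0, 20),
}
n = 60  # hypothetical sample size

scores = {name: aicc(ll, k, n) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)  # lower AICc is better
print(best, scores)
```

In this toy setup the most heavily parameterized model has the highest log-likelihood but is not selected, because the penalty terms grow with the parameter count. This penalty is what lets models with different structures be compared on one scale.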

12.
PAML 4: phylogenetic analysis by maximum likelihood   Total citations: 41, self-citations: 1, citations by others: 41
PAML, currently in version 4, is a package of programs for phylogenetic analyses of DNA and protein sequences using maximum likelihood (ML). The programs may be used to compare and test phylogenetic trees, but their main strengths lie in the rich repertoire of evolutionary models implemented, which can be used to estimate parameters in models of sequence evolution and to test interesting biological hypotheses. Uses of the programs include estimation of synonymous and nonsynonymous rates (dS and dN) between two protein-coding DNA sequences, inference of positive Darwinian selection through phylogenetic comparison of protein-coding genes, reconstruction of ancestral genes and proteins for molecular restoration studies of extinct life forms, combined analysis of heterogeneous data sets from multiple gene loci, and estimation of species divergence times incorporating uncertainties in fossil calibrations. This note discusses some of the major applications of the package, which includes example data sets to demonstrate their use. The package is written in ANSI C, and runs under Windows, Mac OSX, and UNIX systems. It is available at http://abacus.gene.ucl.ac.uk/software/paml.html.

13.
To obtain the correlation dimension and entropy from an experimental time series we derive estimators for these quantities together with expressions for their variances using a maximum likelihood approach. The validity of these expressions is supported by Monte Carlo simulations. We illustrate the use of the estimators with a local recording of atrial fibrillation obtained from a conscious dog.

14.

Background  

The covarion hypothesis of molecular evolution holds that selective pressures on a given amino acid or nucleotide site are dependent on the identity of other sites in the molecule that change throughout time, resulting in changes of evolutionary rates of sites along the branches of a phylogenetic tree. At the sequence level, covarion-like evolution at a site manifests as conservation of nucleotide or amino acid states among some homologs where the states are not conserved in other homologs (or groups of homologs). Covarion-like evolution has been shown to relate to changes in functions at sites in different clades, and, if ignored, can adversely affect the accuracy of phylogenetic inference.

15.
MOTIVATION: In recent years there has been increased interest in producing large and accurate phylogenetic trees using statistical approaches. However, for a large number of taxa, it is not feasible to construct large and accurate trees using only a single processor. A number of specialized parallel programs have been produced in an attempt to address the huge computational requirements of maximum likelihood. We express a number of concerns about the current set of parallel phylogenetic programs, which are severely limiting the widespread availability and use of parallel computing in maximum likelihood-based phylogenetic analysis. RESULTS: We have identified the suitability of phylogenetic analysis to large-scale heterogeneous distributed computing. We have completed a distributed and fully cross-platform phylogenetic tree building program called distributed phylogeny reconstruction by maximum likelihood. It uses an already proven maximum likelihood-based tree building algorithm and a popular phylogenetic analysis library for all its likelihood calculations. It offers one of the most extensive sets of DNA substitution models currently available. We are the first, to our knowledge, to report the completion of a distributed phylogenetic tree building program that can achieve near-linear speedup while only using the idle clock cycles of machines. For those in an academic or corporate environment with hundreds of idle desktop machines, we have shown how distributed computing can deliver a 'free' ML supercomputer.

16.
Ronald A. Fisher, the founder of maximum likelihood estimation (ML estimation), criticized Bayes estimation with a uniform prior distribution, because the estimates can be changed arbitrarily by changing the transformation applied before the analysis. Thus, Bayes estimates lack scientific objectivity, especially when the amount of data is small. However, we can use Bayes estimates as an approximation to the objective ML estimates if we use an appropriate transformation that makes the posterior distribution close to a normal distribution. A one-to-one correspondence exists between a uniform prior distribution under a transformed scale and a non-uniform prior distribution under the original scale. For this reason, the Bayes estimation of ML estimates is essentially identical to estimation using the Jeffreys prior.
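The correspondence between priors on the two scales can be written out with the standard change-of-variables rule. The notation below ($\theta$ for the original parameter, $\phi = g(\theta)$ for the transformed one, $I(\theta)$ for the Fisher information) is an assumption made for illustration, and the binomial case is a textbook example rather than one taken from the paper.

```latex
% Uniform prior on the transformed scale \phi = g(\theta), g monotone and differentiable:
\[
  p(\phi) \propto 1
  \quad\Longrightarrow\quad
  p(\theta) = p(\phi)\,\left|\frac{d\phi}{d\theta}\right| \propto \left|g'(\theta)\right| .
\]
% Jeffreys prior for a one-dimensional parameter:
\[
  p_J(\theta) \propto \sqrt{I(\theta)} .
\]
% The two coincide whenever g'(\theta) \propto \sqrt{I(\theta)}.
% Textbook example, binomial proportion with n trials:
\[
  I(\theta) = \frac{n}{\theta(1-\theta)}, \qquad
  g(\theta) = \arcsin\sqrt{\theta}, \qquad
  g'(\theta) = \frac{1}{2\sqrt{\theta(1-\theta)}} \propto \sqrt{I(\theta)} .
\]
```

In this example a uniform prior on the arcsine-square-root scale corresponds to the Jeffreys prior on the original proportion scale.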

17.
An approximation to maximum likelihood estimates in reduced models   Total citations: 2, self-citations: 0, citations by others: 2
COX D. R.; WERMUTH NANNY 《Biometrika》1990,77(4):747-761

18.
Fitting regression models to case-control data by maximum likelihood   Total citations: 3, self-citations: 0, citations by others: 3
SCOTT A. J.; WILD C. J. 《Biometrika》1997,84(1):57-71

19.
THESEUS is a command line program for performing maximum likelihood (ML) superpositions and analysis of macromolecular structures. While conventional superpositioning methods use ordinary least-squares (LS) as the optimization criterion, ML superpositions provide substantially improved accuracy by down-weighting variable structural regions and by correcting for correlations among atoms. ML superpositioning is robust and insensitive to the specific atoms included in the analysis, and thus it does not require subjective pruning of selected variable atomic coordinates. Output includes both likelihood-based and frequentist statistics for accurate evaluation of the adequacy of a superposition and for reliable analysis of structural similarities and differences. THESEUS performs principal components analysis for analyzing the complex correlations found among atoms within a structural ensemble. AVAILABILITY: ANSI C source code and selected binaries for various computing platforms are available under the GNU open source license from http://monkshood.colorado.edu/theseus/ or http://www.theseus3d.org.

20.