Similar articles
Found 20 similar articles (search time: 31 ms)
1.
Prosthetic devices need to be controlled by their users, typically using physiological signals. People tend to look at objects before reaching for them and we have shown that combining eye movements with other continuous physiological signal sources enhances control. This approach suffers when subjects also look at non-targets, a problem we addressed with a probabilistic mixture over targets where subject gaze information is used to identify target candidates. However, this approach would be ineffective if a user wanted to move towards targets that have not been foveated. Here we evaluated how the accuracy of prior target information influenced decoding accuracy, as the availability of neural control signals was varied. We also considered a mixture model where we assumed that the target may be foveated or, alternatively, that the target may not be foveated. We tested the accuracy of the models at decoding natural reaching data, and also in a closed-loop robot-assisted reaching task. The mixture model worked well in the face of high target uncertainty. Furthermore, errors due to inaccurate target information were reduced by including a generic model that relied on neural signals only.

2.
Phylogenetic analyses of DNA sequences were conducted to evaluate four alternative hypotheses of phrynosomatine sand lizard relationships. Sequences comprising 2871 aligned base pair positions representing the regions spanning ND1-COI and cyt b-tRNA(Thr) of the mitochondrial genome from all recognized sand lizard species were analyzed using unpartitioned parsimony and likelihood methods, likelihood methods with assumed partitions, Bayesian methods with assumed partitions, and Bayesian mixture models. The topology (Uma, (Callisaurus, (Cophosaurus, Holbrookia))) and thus monophyly of the "earless" taxa, Cophosaurus and Holbrookia, is supported by all analyses. Previously proposed topologies in which Uma and Callisaurus are sister taxa and those in which Holbrookia is the sister group to all other sand lizard taxa are rejected using both parsimony and likelihood-based significance tests with the combined, unpartitioned data set. Bayesian hypothesis tests also reject those topologies using six assumed partitioning strategies, and the two partitioning strategies presumably associated with the most powerful tests also reject a third previously proposed topology, in which Callisaurus and Cophosaurus are sister taxa. For both maximum likelihood and Bayesian methods with assumed partitions, those partitions defined by codon position and tRNA stem and nonstems explained the data better than other strategies examined. Bayes factor estimates comparing results of assumed partitions versus mixture models suggest that mixture models perform better than assumed partitions when the latter were not based on functional characteristics of the data, such as codon position and tRNA stem and nonstems. However, assumed partitions performed better than mixture models when functional differences were incorporated.
We reiterate the importance of accounting for heterogeneous evolutionary processes in the analysis of complex data sets and emphasize the importance of implementing mixed model likelihood methods.

3.
Hall DB. Biometrics 2000, 56(4): 1030-1039
In a 1992 Technometrics paper, Lambert (1992, 34, 1-14) described zero-inflated Poisson (ZIP) regression, a class of models for count data with excess zeros. In a ZIP model, a count response variable is assumed to be distributed as a mixture of a Poisson(lambda) distribution and a distribution with point mass of one at zero, with mixing probability p. Both p and lambda are allowed to depend on covariates through canonical link generalized linear models. In this paper, we adapt Lambert's methodology to an upper bounded count situation, thereby obtaining a zero-inflated binomial (ZIB) model. In addition, we add to the flexibility of these fixed effects models by incorporating random effects so that, e.g., the within-subject correlation and between-subject heterogeneity typical of repeated measures data can be accommodated. We motivate, develop, and illustrate the methods described here with an example from horticulture, where both upper bounded count (binomial-type) and unbounded count (Poisson-type) data with excess zeros were collected in a repeated measures designed experiment.
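Illustrative note (not from the paper): the ZIP mixture described in this abstract can be written as a probability mass function. The sketch below assumes the usual reading, in which a count is a structural zero with probability p and is otherwise drawn from Poisson(lambda); the function name `zip_pmf` is hypothetical, and covariates and random effects are left out.

```python
import math

def zip_pmf(k: int, lam: float, p: float) -> float:
    """P(Y = k) under a zero-inflated Poisson model.

    With probability p the count is a structural zero; with probability
    1 - p it is drawn from Poisson(lam). Assumed parameterisation.
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    point_mass_at_zero = 1.0 if k == 0 else 0.0
    return p * point_mass_at_zero + (1.0 - p) * poisson

# Zeros are more likely than under a plain Poisson with the same lambda.
plain_zero = math.exp(-2.0)
inflated_zero = zip_pmf(0, lam=2.0, p=0.3)
```

In the regression setting, p and lambda would each be linked to covariates; here they are fixed numbers so that only the mixture structure is visible.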

4.
Mixed models are now well‐established methods in ecology and evolution because they allow accounting for and quantifying within‐ and between‐individual variation. However, the required normal distribution of the random effects can often be violated by the presence of clusters among subjects, which leads to multi‐modal distributions. In such cases, using what is known as mixture regression models might offer a more appropriate approach. These models are widely used in psychology, sociology, and medicine to describe the diversity of trajectories occurring within a population over time (e.g. psychological development, growth). In ecology and evolution, however, these models are seldom used even though understanding changes in individual trajectories is an active area of research in life‐history studies. Our aim is to demonstrate the value of using mixture models to describe variation in individual life‐history tactics within a population, and hence to promote the use of these models by ecologists and evolutionary ecologists. We first ran a set of simulations to determine whether and when a mixture model allows teasing apart latent clustering, and to contrast the precision and accuracy of estimates obtained from mixture models versus mixed models under a wide range of ecological contexts. We then used empirical data from long‐term studies of large mammals to illustrate the potential of using mixture models for assessing within‐population variation in life‐history tactics. Mixture models performed well in most cases, except for variables following a Bernoulli distribution and when sample size was small. The four selection criteria we evaluated [Akaike information criterion (AIC), Bayesian information criterion (BIC), and two bootstrap methods] performed similarly well, selecting the right number of clusters in most ecological situations. 
We then showed that the normality of random effects implicitly assumed by evolutionary ecologists when using mixed models was often violated in life‐history data. Mixed models were quite robust to this violation in the sense that fixed effects were unbiased at the population level. However, fixed effects at the cluster level and random effects were better estimated using mixture models. Our empirical analyses demonstrated that using mixture models facilitates the identification of the diversity of growth and reproductive tactics occurring within a population. Therefore, using this modelling framework allows testing for the presence of clusters and, when clusters occur, provides reliable estimates of fixed and random effects for each cluster of the population. In the presence or expectation of clusters, using mixture models offers a suitable extension of mixed models, particularly when evolutionary ecologists aim at identifying how ecological and evolutionary processes change within a population. Mixture regression models therefore provide a valuable addition to the statistical toolbox of evolutionary ecologists. As these models are complex and have their own limitations, we provide recommendations to guide future users.
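Illustrative note (not from the paper): the abstract uses AIC and BIC to choose the number of clusters. A minimal sketch of that selection step follows, using the standard definitions of the two criteria. The helper `best_n_clusters` and the log-likelihood values are hypothetical, and the bootstrap criteria mentioned in the abstract are not covered.

```python
import math

def aic(log_lik: float, n_params: int) -> float:
    """Akaike information criterion; lower is better."""
    return 2 * n_params - 2 * log_lik

def bic(log_lik: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion; penalises each parameter by log(n_obs)."""
    return n_params * math.log(n_obs) - 2 * log_lik

def best_n_clusters(fits: dict, n_obs: int) -> int:
    """fits maps a candidate cluster count to (log_lik, n_params).

    Returns the cluster count with the lowest BIC.
    """
    return min(fits, key=lambda k: bic(fits[k][0], fits[k][1], n_obs))

# Made-up fits for 1, 2 and 3 clusters, for illustration only.
example_fits = {1: (-520.0, 3), 2: (-480.0, 7), 3: (-478.5, 11)}
chosen = best_n_clusters(example_fits, n_obs=200)
```

The small likelihood gain from two to three clusters does not offset the four extra parameters, so the two-cluster fit is selected in this toy case.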

5.
Roy J, Daniels MJ. Biometrics 2008, 64(2): 538-545
Summary. In this article we consider the problem of fitting pattern mixture models to longitudinal data when there are many unique dropout times. We propose a marginally specified latent class pattern mixture model. The marginal mean is assumed to follow a generalized linear model, whereas the mean conditional on the latent class and random effects is specified separately. Because the dimension of the parameter vector of interest (the marginal regression coefficients) does not depend on the assumed number of latent classes, we propose to treat the number of latent classes as a random variable. We specify a prior distribution for the number of classes, and calculate (approximate) posterior model probabilities. In order to avoid the complications with implementing a fully Bayesian model, we propose a simple approximation to these posterior probabilities. The ideas are illustrated using data from a longitudinal study of depression in HIV-infected women.

6.
A method is proposed that aims at identifying clusters of individuals that show similar patterns when observed repeatedly. We consider linear‐mixed models that are widely used for the modeling of longitudinal data. In contrast to the classical assumption of a normal distribution for the random effects, a finite mixture of normal distributions is assumed. Typically, the number of mixture components is unknown and has to be chosen, ideally by data-driven tools. For this purpose, an EM algorithm‐based approach is considered that uses a penalized normal mixture as random effects distribution. The penalty term shrinks the pairwise distances of cluster centers based on the group lasso and the fused lasso method. The effect is that individuals with similar time trends are merged into the same cluster. The strength of regularization is determined by one penalization parameter. For finding the optimal penalization parameter a new model choice criterion is proposed.

7.
As larger, more complex data sets are being used to infer phylogenies, accuracy of these phylogenies increasingly requires models of evolution that accommodate heterogeneity in the processes of molecular evolution. We investigated the effect of improper data partitioning on phylogenetic accuracy, as well as the type I error rate and sensitivity of Bayes factors, a commonly used method for choosing among different partitioning strategies in Bayesian analyses. We also used Bayes factors to test empirical data for the need to divide data in a manner that has no expected biological meaning. Posterior probability estimates are misleading when an incorrect partitioning strategy is assumed. The error was greatest when the assumed model was underpartitioned. These results suggest that model partitioning is important for large data sets. Bayes factors performed well, giving a 5% type I error rate, which is remarkably consistent with standard frequentist hypothesis tests. The sensitivity of Bayes factors was found to be quite high when the across-class model heterogeneity reflected that of empirical data. These results suggest that Bayes factors represent a robust method of choosing among partitioning strategies. Lastly, results of tests for the inclusion of unexpected divisions in empirical data mirrored the simulation results, although the outcome of such tests is highly dependent on accounting for rate variation among classes. We conclude by discussing other approaches for partitioning data, as well as other applications of Bayes factors.

8.
Model checking for ROC regression analysis (cited 1 time: 0 self-citations, 1 by others)
Cai T, Zheng Y. Biometrics 2007, 63(1): 152-163
Summary. The receiver operating characteristic (ROC) curve is a prominent tool for characterizing the accuracy of a continuous diagnostic test. To account for factors that might influence the test accuracy, various ROC regression methods have been proposed. However, as in any regression analysis, when the assumed models do not fit the data well, these methods may render invalid and misleading results. To date, practical model-checking techniques suitable for validating existing ROC regression models are not yet available. In this article, we develop cumulative residual-based procedures to graphically and numerically assess the goodness of fit for some commonly used ROC regression models, and show how specific components of these models can be examined within this framework. We derive asymptotic null distributions for the residual processes and discuss resampling procedures to approximate these distributions in practice. We illustrate our methods with a dataset from the cystic fibrosis registry.

9.
Reliable statistical validation of peptide and protein identifications is a top priority in large-scale mass spectrometry-based proteomics. PeptideProphet is one of the computational tools commonly used for assessing the statistical confidence in peptide assignments to tandem mass spectra obtained using database search programs such as SEQUEST, MASCOT, or X! TANDEM. We present two flexible methods, the variable component mixture model and the semiparametric mixture model, that remove the restrictive parametric assumptions in the mixture modeling approach of PeptideProphet. Using a control protein mixture data set generated on a linear ion trap Fourier transform (LTQ-FT) mass spectrometer, we demonstrate that both methods improve on parametric models in terms of the accuracy of probability estimates and the power to detect correct identifications while controlling the false discovery rate to the same degree. The statistical approaches presented here require that the data set contain a sufficient number of decoy (known to be incorrect) peptide identifications, which can be obtained using the target-decoy database search strategy.

10.
Success of maximum likelihood phylogeny inference in the four-taxon case (cited 12 times: 4 self-citations, 8 by others)
We used simulated data to investigate a number of properties of maximum-likelihood (ML) phylogenetic tree estimation for the case of four taxa. Simulated data were generated under a broad range of conditions, including wide variation in branch lengths, differences in the ratio of transition and transversion substitutions, and the absence or presence of gamma-distributed site-to-site rate variation. Data were analyzed in the ML framework with two different substitution models, and we compared the ability of the two models to reconstruct the correct topology. Although both models were inconsistent for some branch-length combinations in the presence of site-to-site variation, the models were efficient predictors of topology under most simulation conditions. We also examined the performance of the likelihood ratio (LR) test for significant positive interior branch length. This test was found to be misleading under many simulation conditions, rejecting too often under some simulation conditions. Under the null hypothesis of a zero-length internal branch, LR statistics are assumed to be asymptotically distributed as χ²(1); with limited data, the distribution of LR statistics under the null hypothesis varies from χ²(1).
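Illustrative note (not from the paper): the LR test mentioned here compares twice the log-likelihood difference with a chi-square distribution on one degree of freedom. A minimal sketch of the nominal calculation, using only the standard library; the function names and the log-likelihood values are hypothetical. As the abstract itself cautions, the chi-square reference distribution is only an asymptotic assumption and the p-value may be unreliable with limited data.

```python
import math

def lr_statistic(log_lik_null: float, log_lik_alt: float) -> float:
    """Likelihood ratio statistic: twice the gain in log-likelihood."""
    return 2.0 * (log_lik_alt - log_lik_null)

def chi2_1_pvalue(stat: float) -> float:
    """Upper-tail probability of a chi-square(1) variable.

    If Z is standard normal, Z**2 is chi-square(1), so
    P(Z**2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(stat / 2.0))

# Hypothetical log-likelihoods with the interior branch fixed at zero
# (null) and freely estimated (alternative).
stat = lr_statistic(log_lik_null=-1234.60, log_lik_alt=-1231.90)
p_value = chi2_1_pvalue(stat)
```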

11.
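Forest fire is an important factor affecting forest ecosystems, and building scientifically sound, accurate fire prediction and forecasting models is essential for fire management. Using meteorological factors as the main predictors, this study built forest fire occurrence models for Fujian Province based on logistic regression and generalized linear mixed-effects models, and assessed the suitability of mixed-effects models for fire forecasting by comparing the goodness of fit and predictive accuracy of the basic logistic model and the generalized linear mixed-effects models. The basic logistic model had an area under the receiver operating characteristic curve (AUC) of 0.664 and a validation accuracy of 60.4%. Adding random effects improved both fit and validation accuracy. The two-level mixed-effects model accounting for administrative division and elevation effects performed best, raising AUC by 0.057 and validation accuracy by 6.0% relative to the basic model. Fire occurrence probabilities predicted with this model for the regions of Fujian indicated that the northwest and south of the province are areas of medium to high fire occurrence and the southwest and east are areas of low occurrence, consistent with the observed distribution of fire points. The mixed-effects model outperformed the basic logistic model in both data fitting and fire occurrence prediction and can serve as an important tool for fire prediction and management.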
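Illustrative note (not from the paper): the AUC reported in this abstract can be read as the probability that a randomly chosen fire record receives a higher predicted probability than a randomly chosen non-fire record. The sketch below computes it by direct pairwise comparison; the function name `auc` and the toy scores are hypothetical, and this is not the authors' code.

```python
def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison.

    Counts the share of (positive, negative) pairs in which the positive
    case has the higher score; ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy predicted probabilities: label 1 = fire occurred, 0 = no fire.
toy_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

The pairwise form is quadratic in the number of records, which is fine for an illustration; a rank-based computation would be preferable for large validation sets.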

12.
We investigate the performance of phylogenetic mixture models in reducing a well-known and pervasive artifact of phylogenetic inference known as the node-density effect, comparing them to partitioned analyses of the same data. The node-density effect refers to the tendency for the amount of evolutionary change in longer branches of phylogenies to be underestimated compared to that in regions of the tree where there are more nodes and thus branches are typically shorter. Mixture models allow more than one model of sequence evolution to describe the sites in an alignment without prior knowledge of the evolutionary processes that characterize the data or how they correspond to different sites. If multiple evolutionary patterns are common in sequence evolution, mixture models may be capable of reducing node-density effects by characterizing the evolutionary processes more accurately. In gene-sequence alignments simulated to have heterogeneous patterns of evolution, we find that mixture models can reduce node-density effects to negligible levels or remove them altogether, performing as well as partitioned analyses based on the known simulated patterns. The mixture models achieve this without knowledge of the patterns that generated the data and even in some cases without specifying the full or true model of sequence evolution known to underlie the data. The latter result is especially important in real applications, as the true model of evolution is seldom known. We find the same patterns of results for two real data sets with evidence of complex patterns of sequence evolution: mixture models substantially reduced node-density effects and returned better likelihoods compared to partitioning models specifically fitted to these data. 
We suggest that the presence of more than one pattern of evolution in the data is a common source of error in phylogenetic inference and that mixture models can often detect these patterns even without prior knowledge of their presence in the data. Routine use of mixture models alongside other approaches to phylogenetic inference may often reveal hidden or unexpected patterns of sequence evolution and can improve phylogenetic inference.

13.
Analyses of animal movement data have primarily focused on understanding patterns of space use and the behavioural processes driving them. Here, we analyzed animal movement data to infer components of individual fitness, specifically parturition and neonate survival. We predicted that parturition and neonate loss events could be identified by sudden and marked changes in female movement patterns. Using GPS radio‐telemetry data from female woodland caribou (Rangifer tarandus caribou), we developed and tested two novel movement‐based methods for inferring parturition and neonate survival. The first method estimated movement thresholds indicative of parturition and neonate loss from population‐level data then applied these thresholds in a moving‐window analysis on individual time‐series data. The second method used an individual‐based approach that discriminated among three a priori models representing the movement patterns of non‐parturient females, females with surviving offspring, and females losing offspring. The models assumed that step lengths (the distance between successive GPS locations) were exponentially distributed and that abrupt changes in the scale parameter of the exponential distribution were indicative of parturition and offspring loss. Both methods predicted parturition with near certainty (>97% accuracy) and produced appropriate predictions of parturition dates. Prediction of neonate survival was affected by data quality for both methods; however, when using high quality data (i.e., with few missing GPS locations), the individual‐based method performed better, predicting neonate survival status with an accuracy rate of 87%. Understanding ungulate population dynamics often requires estimates of parturition and neonate survival rates. With GPS radio‐collars increasingly being used in research and management of ungulates, our movement‐based methods represent a viable approach for estimating rates of both parameters.
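Illustrative note (not from the paper): the individual-based method rests on exponentially distributed step lengths whose scale parameter changes abruptly. The sketch below is a deliberately simplified stand-in, assuming a single abrupt break and comparing it with a no-break model by maximised log-likelihood. The paper's three a priori models are not reproduced, and the function names, the `min_seg` parameter and the step lengths are hypothetical.

```python
import math

def exp_loglik(steps):
    """Maximised log-likelihood of i.i.d. exponential step lengths.

    The maximum-likelihood estimate of the scale parameter is the sample
    mean, which gives log L = -n * (log(mean) + 1).
    """
    n = len(steps)
    mean = sum(steps) / n
    return -n * (math.log(mean) + 1.0)

def best_single_break(steps, min_seg=3):
    """Return (break_index, log_lik) for the best single change in scale.

    Each side of the break must contain at least min_seg steps.
    """
    best = None
    for k in range(min_seg, len(steps) - min_seg + 1):
        ll = exp_loglik(steps[:k]) + exp_loglik(steps[k:])
        if best is None or ll > best[1]:
            best = (k, ll)
    return best

# Made-up step lengths (metres) that drop sharply after the fifth fix.
steps = [100.0, 120.0, 90.0, 110.0, 95.0, 5.0, 8.0, 6.0, 4.0, 7.0]
break_index, ll_break = best_single_break(steps)
ll_no_break = exp_loglik(steps)
```

A real analysis would compare the candidate models with an information criterion instead of raw likelihoods, since the break model has more parameters and always fits at least as well.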

14.
Predictive habitat distribution models are normally assumed to sacrifice generality for precision and reality. Nevertheless, such models are often applied to predict the distribution of a species outside the area for which the model has been calibrated.
We investigated how the geographic extent of the data used for calibration influenced the performance of habitat distribution models applied on independent data. We took a multi-scale logistic regression approach by varying the grain size to develop six habitat models for capercaillie Tetrao urogallus in Switzerland: three regional models, for the northern Pre-Alps, eastern Central Alps and Jura mountains, respectively, and three pooled models, each using data from two of the three regions. The six models were validated with data from the region(s) not used for model building. We used Cohen's Kappa and the area under the receiver operating characteristics curve as accuracy measures. The regional models performed well in the region where they had been calibrated, but poorly to moderately well in the other regions. The pooled models classified almost as well in their calibration regions as the corresponding regional models, but generally better when validated on data from the independent region. Hence, models built with data from single regions provide less certain predictions of species' distributions in other regions. We recommend building more general models using data pooled from several regions, when the aim is to predict species' distributions in independent regions.
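Illustrative note (not from the paper): Cohen's Kappa, one of the two accuracy measures named in this abstract, corrects observed agreement between predicted and observed presence for the agreement expected by chance. A minimal sketch for a 2 x 2 confusion table follows; the function name and the counts are hypothetical.

```python
def cohens_kappa(tp: int, fp: int, fn: int, tn: int) -> float:
    """Cohen's Kappa for a 2 x 2 table of predicted vs observed presence.

    tp/tn are correct presence/absence predictions; fp/fn are errors.
    """
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return (observed - expected) / (1.0 - expected)

# Made-up validation counts from an independent region.
kappa = cohens_kappa(tp=40, fp=10, fn=20, tn=30)
```

Unlike the AUC, Kappa depends on the probability threshold used to turn model output into presence or absence, so the two measures can rank models differently.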

15.
Dunson DB, Weinberg CR. Biometrics 2000, 56(1): 288-292
The probability of conception in a given menstrual cycle is closely related to the timing of intercourse relative to ovulation. Although commonly used markers of time of ovulation are known to be error prone, most fertility models assume the day of ovulation is measured without error. We develop a mixture model that allows the day to be misspecified. We assume that the measurement errors are i.i.d. across menstrual cycles. Heterogeneity among couples in the per cycle likelihood of conception is accounted for using a beta mixture model. Bayesian estimation is straightforward using Markov chain Monte Carlo techniques. The methods are applied to a prospective study of couples at risk of pregnancy. In the absence of validation data or multiple independent markers of ovulation, the identifiability of the measurement error distribution depends on the assumed model. Thus, the results of studies relating the timing of intercourse to the probability of conception should be interpreted cautiously.

16.
Species with similar geographical distribution patterns are often assumed to have a shared biogeographical history, an assumption that can be tested with a combination of molecular, spatial, and environmental data. This study investigates three lineages of Hyperolius frogs with concordant ranges within the Eastern Afromontane Biodiversity Hotspot to determine whether allopatric populations of co‐distributed lineages shared a parallel biogeographical response to their shared paleoclimatic histories. The roles of refugial distributions, isolation, and climate cycles in shaping their histories are examined through Hierarchical Approximate Bayesian Computation, comparative phylogeography, and comparisons of current and past geographical distributions using ecological niche models. Results from these analyses show these three lineages to have independent evolutionary histories, which current spatial configurations of sparsely available habitat (montane wetlands) have moulded into convergent geographical ranges. In spite of independent phylogeographical histories, diversification events are temporally concentrated, implying that past vicariant events were significant at the generic level. This mixture of apparently disparate histories is likely due to quantifiably different patterns of expansion and retreat among species in response to past climate cycles. Combining climate modelling and phylogeographical data can reveal unrecognized complexities in the evolution of co‐distributed taxa.

17.
Robinson J. Biometrics 1976, 32(1): 61-68
We consider models for the release of transmitter in response to nerve impulses, where it is assumed that quanta of transmitter are released from some of n sites, the probability of release from any site being p. It is assumed that the quantal size is either a constant or is distributed as a normal or a gamma variate. Observations on both spontaneous potentials and evoked potentials are used to obtain moment estimates of n and p. Large sample estimates of the standard errors of these estimates are given.
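Illustrative note (not from the paper): for the constant quantal size case, moment estimates of n and p follow from the binomial mean and variance. The sketch assumes an evoked potential V = qK, where K is binomial(n, p) and q is a known quantal size (for example, the mean spontaneous potential), so that E[V] = qnp and Var[V] = q^2 np(1 - p). The paper's own estimators, including those for normal or gamma quantal sizes, are not given in the abstract; the function name and the data are hypothetical.

```python
import statistics

def moment_estimates(evoked, quantal_size):
    """Moment estimates (n_hat, p_hat) under a constant quantal size.

    Solves mean = q*n*p and variance = q**2 * n*p*(1 - p) for n and p.
    """
    mean = statistics.fmean(evoked)
    var = statistics.pvariance(evoked)
    p_hat = 1.0 - var / (quantal_size * mean)
    n_hat = mean / (quantal_size * p_hat)
    return n_hat, p_hat

# Made-up evoked potentials (mV) with an assumed quantal size of 0.5 mV.
n_hat, p_hat = moment_estimates([1.0, 1.5, 1.5, 2.0], quantal_size=0.5)
```

The estimate of n is generally not an integer, and p_hat can fall outside (0, 1) when the sample variance is large relative to the mean, which is one reason standard errors matter for these estimators.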

18.
Likelihood ratio tests are derived for bivariate normal structural relationships in the presence of group structure. These tests may also be applied to less restrictive models where only errors are assumed to be normally distributed. Tests for a common slope amongst those from several datasets are derived for three different cases: when the assumed ratio of error variances is the same across datasets and either known or unknown, and when the standardised major axis model is used. Estimation of the slope in the case where the ratio of error variances is unknown could be considered as a maximum likelihood grouping method. The derivations are accompanied by some small sample simulations, and the tests are applied to data arising from work on seed allometry.

19.
Temporal variation in the detectability of a species can bias estimates of relative abundance if not handled correctly. For example, when effort varies in space and/or time it becomes necessary to take variation in detectability into account when data are analyzed. We demonstrate the importance of incorporating seasonality into the analysis of data with unequal sample sizes due to lost traps at a particular density of a species. A case study of count data was simulated using a spring-active carabid beetle. Traps were 'lost' randomly during high beetle activity in high abundance sites and during low beetle activity in low abundance sites. Five different models were fitted to datasets with different levels of loss. If sample sizes were unequal and a seasonality variable was not included in models that assumed the number of individuals was log-normally distributed, the models severely under- or overestimated the true effect size. Results did not improve when seasonality and number of trapping days were included in these models as offset terms; the models performed well only when the response variable was specified as following a negative binomial distribution. Finally, if seasonal variation of a species is unknown, which is often the case, seasonality can be added as a free factor, resulting in well-performing negative binomial models. Based on these results we recommend (a) adding sampling effort (number of trapping days in our example) to the models as an offset term; (b) if precise information is available on seasonal variation in detectability of a study object, adding seasonality to the models as an offset term; (c) if information on seasonal variation in detectability is inadequate, adding seasonality as a free factor; and (d) specifying the response variable of count data as following a negative binomial or over-dispersed Poisson distribution.
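Illustrative note (not from the paper): recommendations (a), (c) and (d) can be summarised in one model statement. The notation is an assumption made for this note: y_i is the count in trap i, d_i the number of trapping days (entering as an offset, that is, with its coefficient fixed at 1), x_i the covariate of interest, gamma a seasonality factor with one level per sampling period s(i), and theta the negative binomial dispersion parameter.

```latex
\log \mu_i = \log(d_i) + \beta_0 + \beta_1 x_i + \gamma_{s(i)},
\qquad \mu_i = \mathbb{E}[y_i],
\qquad \operatorname{Var}(y_i) = \mu_i + \frac{\mu_i^{2}}{\theta}
```

Under recommendation (b), the estimated factor would be replaced by a second offset, the logarithm of a known seasonal activity level.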

20.
Summary. Statistical models that include random effects are commonly used to analyze longitudinal and correlated data, often with the assumption that the random effects follow a Gaussian distribution. Via theoretical and numerical calculations and simulation, we investigate the impact of misspecification of this distribution on both how well the predicted values recover the true underlying distribution and the accuracy of prediction of the realized values of the random effects. We show that, although the predicted values can vary with the assumed distribution, the prediction accuracy, as measured by mean square error, is little affected for mild‐to‐moderate violations of the assumptions. Thus, standard approaches, readily available in statistical software, will often suffice. The results are illustrated using data from the Heart and Estrogen/Progestin Replacement Study using models to predict future blood pressure values.
