Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
Dominance hierarchies have been widely used for describing the outcome of competitive interactions in an animal group. We present a procedure for estimating the linear dominance hierarchy. The procedure uses the statistical method of paired comparisons, assuming weak stochastic transitivity to model interactions within a linear dominance hierarchy. The linear dominance hierarchy is estimated using a maximum likelihood ranking procedure. This method allows unequal numbers of encounters between pairs and does not require all pairs to have observed encounters. The method is illustrated by application to behavioural data from a group of 10 baboons (Papio cynocephalus anubis).

2.
While many approaches have been proposed to identify the signal onset in EMG recordings, there is no standardized method for performing this task. Here, we propose to use a change-point detection procedure based on singular spectrum analysis to determine the onset of EMG signals. This method is suitable for automated real-time implementation, can be applied directly to the raw signal, and does not require any prior knowledge of the EMG signal’s properties. The algorithm proposed by Moskvina and Zhigljavsky (2003) was applied to EMG segments recorded from wrist and trunk muscles. Wrist EMG data was collected from 9 Parkinson’s disease patients with and without tremor, while trunk EMG data was collected from 13 healthy able-bodied individuals. Along with the change-point detection analysis, two threshold-based onset detection methods were applied, as well as visual estimates of the EMG onset by trained practitioners. In the case of wrist EMG data without tremor, the change-point analysis showed comparable or superior frequency and quality of detection results, as compared to other automatic detection methods. In the case of wrist EMG data with tremor and trunk EMG data, performance suffered because other changes occurring in these signals caused larger changes in the detection statistic than the changes caused by the initial muscle activation, suggesting that additional criteria are needed to identify the onset from the detection statistic other than its magnitude alone. Once this issue is resolved, change-point detection should provide an effective EMG-onset detection method suitable for automated real-time implementation.

3.
In ecology, if the considered area or space is large, the spatial distribution of individuals of a given plant species is never homogeneous; plants form different patches. The change in homogeneity over space or time (in particular, the related change-point problem) is an important research subject in mathematical statistics. In this paper, two areas along a straight line are considered for a given data system, where the data of each area come from different discrete distributions with unknown parameters. A method is presented for estimating the distribution change-point between the two areas, and an estimate is given for the distributions separated by the obtained change-point. The solution of this problem is based on the maximum likelihood method. Furthermore, based on an adaptation of the well-known bootstrap resampling, a method for estimating the so-called change-interval is also given. The latter approach is very general, since it applies not only to the maximum-likelihood estimation of the change-point but can also be used starting from any other change-point estimation known in the ecological literature. The proposed model is validated against typical ecological situations, providing at the same time a verification of the applied algorithms.
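
As a rough illustration of maximum-likelihood change-point estimation along a line, the sketch below assumes that both areas follow Poisson distributions with unknown rates. The Poisson choice, the function names, and the sample counts are assumptions made for illustration only; the abstract does not specify the discrete distributions used, and the bootstrap change-interval step is not shown.

```python
import math


def poisson_loglik(counts):
    """Maximised Poisson log-likelihood of one segment (rate = sample mean).

    The log(k!) terms are dropped: they are identical for every split,
    so they do not affect which split is best.
    """
    total = sum(counts)
    if total == 0:
        return 0.0  # estimated rate is 0 and all counts are 0
    rate = total / len(counts)
    return total * math.log(rate) - len(counts) * rate


def estimate_change_point(counts):
    """Return the index k that maximises the two-segment log-likelihood.

    The first area is counts[:k], the second is counts[k:].
    """
    best_k, best_ll = None, -math.inf
    for k in range(1, len(counts)):
        ll = poisson_loglik(counts[:k]) + poisson_loglik(counts[k:])
        if ll > best_ll:
            best_k, best_ll = k, ll
    return best_k


# Hypothetical plant counts in consecutive quadrats along a transect.
counts = [1, 0, 2, 1, 1, 7, 9, 8, 6, 10]
k = estimate_change_point(counts)
print(k, sum(counts[:k]) / k, sum(counts[k:]) / (len(counts) - k))
```

The exhaustive scan over all splits is quadratic in the number of quadrats, which is acceptable for short transects.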

4.
State‐space models (SSMs) are a popular tool for modeling animal abundances. Inference difficulties for simple linear SSMs are well known, particularly in relation to simultaneous estimation of process and observation variances. Several remedies to overcome estimation problems have been studied for relatively simple SSMs, but whether these challenges and proposed remedies apply for nonlinear stage‐structured SSMs, an important class of ecological models, is less well understood. Here we identify improvements for inference about nonlinear stage‐structured SSMs fit with biased sequential life stage data. Theoretical analyses indicate parameter identifiability requires covariates in the state processes. Simulation studies show that plugging in externally estimated observation variances, as opposed to jointly estimating them with other parameters, reduces bias and standard error of estimates. In contrast to previous results for simple linear SSMs, strong confounding between jointly estimated process and observation variance parameters was not found in the models explored here. However, when observation variance was also estimated in the motivating case study, the resulting process variance estimates were implausibly low (near‐zero). As SSMs are used in increasingly complex ways, understanding when inference can be expected to be successful, and what aids it, becomes more important. Our study illustrates (a) the need for relevant process covariates and (b) the benefits of using externally estimated observation variances for inference about nonlinear stage‐structured SSMs.

5.
This article describes the application of a change-point algorithm to the analysis of stochastic signals in biological systems whose underlying state dynamics consist of transitions between discrete states. Applications of this analysis include molecular-motor stepping, fluorophore bleaching, electrophysiology, particle and cell tracking, detection of copy number variation by sequencing, tethered-particle motion, etc. We present a unified approach to the analysis of processes whose noise can be modeled by Gaussian, Wiener, or Ornstein-Uhlenbeck processes. To fit the model, we exploit explicit, closed-form algebraic expressions for maximum-likelihood estimators of model parameters and estimated information loss of the generalized noise model, which can be computed extremely efficiently. We implement change-point detection using the frequentist information criterion (which, to our knowledge, is a new information criterion). The frequentist information criterion specifies a single, information-based statistical test that is free from ad hoc parameters and requires no prior probability distribution. We demonstrate this information-based approach in the analysis of simulated and experimental tethered-particle-motion data.

6.
Researchers usually estimate benchmark dose (BMD) for dichotomous experimental data using a binomial model with a single response function. Several forms of response function have been proposed to fit dose–response models to estimate the BMD and the corresponding benchmark dose lower bound (BMDL). However, if the assumed response function is not correct, then the estimated BMD and BMDL from the fitted model may not be accurate. To account for model uncertainty, model averaging (MA) methods are proposed to estimate BMD averaging over a model space containing a finite number of standard models. Usual model averaging focuses on a pre-specified list of parametric models leading to pitfalls when none of the models in the list is the correct model. Here, an alternative which augments an initial list of parametric models with an infinite number of additional models having varying response functions has been proposed to estimate BMD for dichotomous response data. In addition, different methods for estimating BMDL based on the family of response functions are derived. The proposed approach is compared with MA in a simulation study and applied to a real dataset. Simulation studies are also conducted to compare the four methods of estimating BMDL.

7.
This paper develops a model for repeated binary regression when a covariate is measured with error. The model allows for estimating the effect of the true value of the covariate on a repeated binary response. The choice of a probit link for the effect of the error-free covariate, coupled with normal measurement error for the error-free covariate, results in a probit model after integrating over the measurement error distribution. We propose a two-stage estimation procedure where, in the first stage, a linear mixed model is used to fit the repeated covariate. In the second stage, a model for the correlated binary responses conditional on the linear mixed model estimates is fit to the repeated binary data using generalized estimating equations. The approach is demonstrated using nutrient safety data from the Diet Intervention of School Age Children (DISC) study.

8.
In this work, we fit pattern-mixture models to data sets with responses that are potentially missing not at random (MNAR, Little and Rubin, 1987). In estimating the regression parameters that are identifiable, we use the pseudo maximum likelihood method based on exponential families. This procedure provides consistent estimators when the mean structure is correctly specified for each pattern, with further information on the variance structure giving an efficient estimator. The proposed method can be used to handle a variety of continuous and discrete outcomes. A test built on this approach is also developed for model simplification in order to improve efficiency. Simulations are carried out to compare the proposed estimation procedure with other methods. In combination with sensitivity analysis, our approach can be used to fit parsimonious semi-parametric pattern-mixture models to outcomes that are potentially MNAR. We apply the proposed method to an epidemiologic cohort study to examine cognitive decline among the elderly.

9.
Digesta flow models have been based on linear compartment theory that assumes exponential retention times, and on a generalized theory that incorporates nonexponential (Erlang) retention times (Matis, 1987, Journal of Theoretical Biology 124, 371-376). This paper develops a new family of passage models for heterogeneous digesta by mixing the previous models with assumed parametric, usually gamma, mixing distributions. The utility of the resulting models is demonstrated with experimental data on two treatments, namely a chopped and a ground straw, given to each of four cows. Treatment differences are apparent in the preferred model form and in the means of the estimated mean residence times. The models are relatively easy to fit to data using standard estimation procedures, and they should have broad application to other compartment modeling problems with "heterogeneous particles."

10.
A score‐type test is proposed for testing the hypothesis of independent binary random variables against positive correlation in linear logistic models with sparse data and cluster specific covariates. The test is developed for univariate and multivariate one‐sided alternatives. The main advantage of using the score test is that it requires estimation of the model only under the null hypothesis, which in this case corresponds to the binomial maximum likelihood fit. The score‐type test is developed from a class of estimating equations with block‐diagonal structure in which the coefficients of the linear logistic model are estimated simultaneously with the correlation. The simplicity of the score test is illustrated in two particular examples.

11.
In many situations one wishes to fit a piecewise regression which enables one to obtain estimates of the join points as well as the slopes and intercepts of the fitted submodels. This study develops a technique for fitting piecewise models to data which contain measurement error in an independent variable. The technique developed here combines the HUDSON (1966) procedure for estimating parameters in piecewise regression and the WALD (1940) Grouping Technique which obviates the problem of measurement error. If one assumes some knowledge of the position of the join point in relation to the data, methodology has been developed to estimate the parameters and study the asymptotic properties of the means and variances of the parameter estimates. However, in the more realistic case, when additional knowledge is limited, it is only possible to obtain the parameter estimates using an iterative technique (TEETER, 1982). The general technique for obtaining the join point estimate in the presence of measurement error is presented here and an example is given using data on women's basal body temperature during menstrual cycles.

12.
Aspects of parameter estimation in ascertainment sampling schemes.
It has recently been suggested that ascertainment sampling estimation procedures commonly used are not fully efficient in that the number of unobserved families is an unknown parameter that should be estimated (contrary to common practice) along with the genetic parameters for fully efficient estimation. It has also been suggested that the frequency distribution of family size contains unknown parameters that should similarly be estimated with the genetic parameters. These two suggestions are considered in this paper. It is shown by means of an equivalence theorem that in both cases the estimates and their variances obtained by adopting the suggested procedure are identical with those found by ignoring the unobserved families and by ignoring the family-size distribution. This demonstration leads to a formal justification of further procedures, in particular: (1) use of "method-of-moments" estimators, (2) ignoring the ascertainment scheme in some cases when estimating parameters, and (3) forming estimates of parameters when various parts of the data are obtained by different ascertainment schemes.

13.
Measurements in populations which serve as valid indicators of biological relationship should be proportional to genetic distance. In order to test the utility of discrete cranial traits for estimating genetic distances among populations, estimates of admixture are obtained for gene frequency data and nonmetric cranial data in São Paulo mulattos (M). The gene frequency data serve as a control that the three populations are related as stated: estimates of admixture are obtained by using São Paulo whites (W) and blacks (B) as parental populations and by estimating the parameter of admixture, m, in the model $p_M = (1 - m)\,p_W + m\,p_B$ (Elston, 1971) where the p's are either gene frequencies or nonmetric trait frequencies. A test of goodness of fit of the model provides a means of ascertaining whether or not the data fit this linear model. While the gene frequency data indicate distances among the three populations which are highly compatible with the linear model of admixture, the nonmetric data show significant deviations from the model. This implies that the frequencies of the nonmetric traits in the populations used in this analysis are not a linear function of genetic distance. This discourages the use of nonmetric traits in making quantitative conclusions about genetic relationships. It also suggests the need for investigation of the use of other skeletal characters for estimating genetic distance, as well as approaches for such investigations through the study of hybrid individuals.
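
To make the linear admixture model concrete, the sketch below estimates m by ordinary least squares across several traits, using the rearranged form p_M - p_W = m (p_B - p_W). Ordinary least squares is an assumption chosen for simplicity and is not necessarily the estimator of Elston (1971); the function name and the frequencies are hypothetical.

```python
def estimate_admixture(p_m, p_w, p_b):
    """Least-squares estimate of m in p_M = (1 - m) * p_W + m * p_B.

    Each argument is a list of frequencies, one entry per locus or trait.
    Rearranging gives (p_M - p_W) = m * (p_B - p_W), a regression through
    the origin whose slope is the admixture proportion.
    """
    num = sum((pm - pw) * (pb - pw) for pm, pw, pb in zip(p_m, p_w, p_b))
    den = sum((pb - pw) ** 2 for pw, pb in zip(p_w, p_b))
    if den == 0:
        raise ValueError("parental frequencies are identical; m is not identifiable")
    return num / den


# Hypothetical frequencies for three traits in the two parental populations.
p_w = [0.10, 0.50, 0.80]
p_b = [0.60, 0.20, 0.30]
# Hybrid frequencies generated exactly from the model with m = 0.4.
p_m = [(1 - 0.4) * w + 0.4 * b for w, b in zip(p_w, p_b)]

print(estimate_admixture(p_m, p_w, p_b))
```

With real data the residuals p_M - [(1 - m) p_W + m p_B] would then feed a goodness-of-fit test of the kind the abstract describes.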

14.
To date, most statistical developments in QTL detection methodology have been directed at continuous traits with an underlying normal distribution. This paper presents a method for QTL analysis of non-normal traits using a generalized linear mixed model approach. Development of this method has been motivated by a backcross experiment involving two inbred lines of mice that was conducted in order to locate a QTL for litter size. A Poisson regression form is used to model litter size, with allowances made for under- as well as over-dispersion, as suggested by the experimental data. In addition to fixed parity effects, random animal effects have also been included in the model. However, the method is not fully parametric as the model is specified only in terms of means, variances and covariances, and not as a full probability model. Consequently, a generalized estimating equations (GEE) approach is used to fit the model. For statistical inferences, permutation tests and bootstrap procedures are used. This method is illustrated with simulated as well as experimental mouse data. Overall, the method is found to be quite reliable, and with modification, can be used for QTL detection for a range of other non-normally distributed traits.

15.
In allometry, researchers are commonly interested in estimating the slope of the major axis or standardized major axis (methods of bivariate line fitting related to principal components analysis). This study considers the robustness of two tests for a common slope amongst several axes. It is of particular interest to measure the robustness of these tests to slight violations of assumptions that may not be readily detected in sample datasets. Type I error is estimated in simulations of data generated with varying levels of nonnormality, heteroscedasticity and nonlinearity. The assumption failures introduced in simulations were difficult to detect in a moderately sized dataset, with an expert panel only able to correctly detect assumption violations 34-45% of the time. While the common slope tests were robust to nonnormal and heteroscedastic errors from the line, Type I error was inflated if the two variables were related in a slightly nonlinear fashion. Similar results were also observed for the linear regression case. The common slope tests were more liberal when the simulated data had greater nonlinearity, and this effect was more evident when the underlying distribution had longer tails than the normal. This result raises concerns for common slopes testing, as slight nonlinearities such as those in simulations are often undetectable in moderately sized datasets. Consequently, practitioners should take care in checking for nonlinearity and interpreting the results of a test for common slope. This work has implications for the robustness of inference in linear models in general.

16.
The use of multiple hypothesis testing procedures has recently received a lot of attention from statisticians in DNA microarray analysis. The traditional FWER controlling procedures are not very useful in this situation since the experiments are exploratory by nature and researchers are more interested in controlling the rate of false positives rather than controlling the probability of making a single erroneous decision. This has led to increased use of FDR (False Discovery Rate) controlling procedures. Genovese and Wasserman proposed a single-step FDR procedure that is an asymptotic approximation to the original Benjamini and Hochberg stepwise procedure. In this paper, we modify the Genovese-Wasserman procedure to force the FDR control closer to the level alpha in the independence setting. Assuming that the data come from a mixture of two normals, we also propose to make this procedure adaptive by first estimating the parameters using the EM algorithm and then plugging these estimated parameters into the above modification of the Genovese-Wasserman procedure. We compare this procedure with the original Benjamini-Hochberg and the SAM thresholding procedures. The FDR control and other properties of this adaptive procedure are verified numerically.
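
For orientation, the sketch below implements the standard Benjamini-Hochberg step-up procedure that the abstract uses as a baseline. It is not the modified Genovese-Wasserman procedure or its adaptive EM-based variant, whose details the abstract does not give; the function name and the p-values are illustrative.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Standard Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, True where the hypothesis is rejected.
    With m hypotheses, find the largest rank k such that the k-th smallest
    p-value is <= alpha * k / m, then reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k_max = rank  # keep the largest rank that passes (step-up)
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for four genes.
print(benjamini_hochberg([0.041, 0.001, 0.205, 0.008], alpha=0.05))
```

Because the procedure is step-up, a p-value that fails its own threshold is still rejected whenever a larger p-value passes at a higher rank.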

17.
Wang Y, Sun G, Ji Z, Xing C, Liang Y. PLoS ONE. 2012;7(1):e29860.
In previous work, we proposed a method for detecting differential gene expression based on the change-point of the expression profile. This non-parametric change-point method gave promising results in both a simulation study and a public dataset experiment. However, its performance is still limited by low sensitivity to the right bound, and the statistical significance of the statistic has not been fully explored. To overcome the insensitivity to the right bound, we modified the original method by adding a weight function to the D(n) statistic. A simulation study showed that the weighted change-point statistics method is significantly better than the original NPCPS in terms of ROC, false positive rate, and change-point estimate. The mean absolute error of the change-point estimated by the weighted change-point method was 0.03, reduced by more than 50% compared with the original 0.06, and the mean FPR was reduced by more than 55%. An experiment on microarray Dataset I yielded 3974 differentially expressed genes out of a total of 5293 genes; an experiment on microarray Dataset II yielded 9983 differentially expressed genes out of a total of 12576 genes. In summary, the method proposed here is an effective modification of the previous method, especially when only a small subset of cancer samples has DGE.

18.
Micropipette aspiration (MA) has been widely used to measure the biomechanical properties of cells and biomaterials. To estimate material parameters from MA experimental data, analytical half-space models and inverse finite element (FE) analyses are typically used. The half-space model is easy to implement but cannot account for nonlinear material properties and complex geometrical boundary conditions that are inherent to MA. Inverse FE approaches can account for geometrical and material nonlinearities, but their implementation is resource-intensive and not widely available. Here, by making analogy between an analytical uniaxial tension model and a FE model of MA, we proposed an easily implementable and accurate method to estimate the material parameters of tissues tested by MA. We first adopted a strain invariant-based isotropic exponential constitutive model and implemented it in both the analytical uniaxial tension model and the FE model. The two models were fit to experimental data generated by MA of porcine aortic valve tissue (45 spots on four leaflets) to estimate material parameters. We found no significant differences between the effective moduli estimated by the two models ( $p > 0.39$ ), with the effective moduli estimated by the uniaxial tension model correlating significantly with those estimated by the FE model ( $p < 0.001; R^{2}= 0.96$ ) with a linear regression slope that was not different than unity ( $p = 0.38$ ). Thus, the analytical uniaxial tension model, which avoids solving resource-intensive numerical problems, is as accurate as the FE model in estimating the effective modulus of valve tissue tested by MA.

19.

Background

Most studies on genomic prediction with reference populations that include multiple lines or breeds have used linear models. Data heterogeneity due to using multiple populations may conflict with model assumptions used in linear regression methods.

Methods

In an attempt to alleviate potential discrepancies between assumptions of linear models and multi-population data, two types of alternative models were used: (1) a multi-trait genomic best linear unbiased prediction (GBLUP) model that modelled trait by line combinations as separate but correlated traits and (2) non-linear models based on kernel learning. These models were compared to conventional linear models for genomic prediction for two lines of brown layer hens (B1 and B2) and one line of white hens (W1). The three lines each had 1004 to 1023 training and 238 to 240 validation animals. Prediction accuracy was evaluated by estimating the correlation between observed phenotypes and predicted breeding values.

Results

When the training dataset included only data from the evaluated line, non-linear models yielded at best a similar accuracy as linear models. In some cases, when adding a distantly related line, the linear models showed a slight decrease in performance, while non-linear models generally showed no change in accuracy. When only information from a closely related line was used for training, linear models and non-linear radial basis function (RBF) kernel models performed similarly. The multi-trait GBLUP model took advantage of the estimated genetic correlations between the lines. Combining linear and non-linear models improved the accuracy of multi-line genomic prediction.

Conclusions

Linear models and non-linear RBF models performed very similarly for genomic prediction, despite the expectation that non-linear models could deal better with the heterogeneous multi-population data. This heterogeneity of the data can be overcome by modelling trait by line combinations as separate but correlated traits, which avoids the occasional occurrence of large negative accuracies when the evaluated line was not included in the training dataset. Furthermore, when using a multi-line training dataset, non-linear models provided information on the genotype data that was complementary to the linear models, which indicates that the underlying data distributions of the three studied lines were indeed heterogeneous.

Electronic supplementary material

The online version of this article (doi:10.1186/s12711-014-0075-3) contains supplementary material, which is available to authorized users.

20.
We present prediction and variable importance (VIM) methods for longitudinal data sets containing continuous and binary exposures subject to missingness. We demonstrate the use of these methods for prognosis of medical outcomes of severe trauma patients, a field in which current medical practice involves rules of thumb and scoring methods that only use a few variables and ignore the dynamic and high-dimensional nature of trauma recovery. Well-principled prediction and VIM methods can provide a tool to make care decisions informed by the patient’s high-dimensional physiological and clinical history. Our VIM parameters are analogous to slope coefficients in adjusted regressions, but they do not depend on a specific statistical model, nor do they require a certain functional form of the prediction regression to be estimated. In addition, they can be causally interpreted under causal and statistical assumptions as the expected outcome under time-specific clinical interventions, related to changes in the mean of the outcome if each individual experiences a specified change in the variable (keeping other variables in the model fixed). Better yet, the targeted MLE used is doubly robust and locally efficient. Because the proposed VIM does not constrain the prediction model fit, we use a very flexible ensemble learner (the SuperLearner), which returns a linear combination of a list of user-given algorithms. Not only is such a prediction algorithm intuitively appealing, it has theoretical justification as being asymptotically equivalent to the oracle selector. The results of the analysis show effects whose size and significance would not have been found using a parametric approach (such as stepwise regression or LASSO). In addition, the procedure is even more compelling as the predictor on which it is based showed significant improvements in cross-validated fit, for instance area under the curve (AUC) for a receiver-operator curve (ROC).
Thus, given that our VIM 1) applies to any model fitting procedure, 2) under assumptions has meaningful clinical (causal) interpretations, and 3) has asymptotic (influence-curve) based robust inference, it provides a compelling alternative to existing methods for estimating variable importance in high-dimensional clinical (or other) data.
