Similar literature
 20 similar documents found (search time: 31 ms)
1.
Entropy-based measures have been frequently used in symbolic sequence analysis. A symmetrized and smoothed form of Kullback-Leibler divergence or relative entropy, the Jensen-Shannon divergence (JSD), is of particular interest because it shares properties with families of other divergence measures and is interpretable in different domains, including statistical physics, information theory and mathematical statistics. The uniqueness and versatility of this measure arise from a number of attributes, including generalization to any number of probability distributions and association of weights with the distributions. Furthermore, its entropic formulation allows its generalization in different statistical frameworks, such as non-extensive Tsallis statistics and higher-order Markovian statistics. We revisit these generalizations and propose a new generalization of JSD in the integrated Tsallis and Markovian statistical framework. We show that this generalization can be interpreted in terms of mutual information. We also investigate the performance of different JSD generalizations in deconstructing chimeric DNA sequences assembled from bacterial genomes, including those of E. coli, S. enterica typhi, Y. pestis and H. influenzae. Our results show that the JSD generalizations bring more pronounced improvements when the sequences being compared are from phylogenetically proximal organisms, which are often difficult to distinguish because of their compositional similarity. While small but noticeable improvements were observed with the Tsallis statistical JSD generalization, relatively large improvements were observed with the Markovian generalization. The proposed Tsallis-Markovian generalization yielded more pronounced improvements than either the Tsallis or the Markovian generalization, specifically when the sequences being compared arose from phylogenetically proximal organisms.
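
As an illustration of the baseline measure this abstract starts from, here is a minimal sketch of the standard weighted JSD for any number of distributions, written as the entropy of the mixture minus the weighted mean of the individual entropies. It does not implement the Tsallis, Markovian or Tsallis-Markovian generalizations; the function names and the equal-weight default are assumptions of this sketch.

```python
import math


def shannon_entropy(p):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def jensen_shannon_divergence(distributions, weights=None):
    """Weighted JSD: H(sum_i w_i p_i) - sum_i w_i H(p_i)."""
    n = len(distributions)
    if weights is None:
        weights = [1.0 / n] * n  # equal weights (an assumption of this sketch)
    # Mixture distribution: weighted average of the inputs, symbol by symbol.
    mixture = [
        sum(w * p[k] for w, p in zip(weights, distributions))
        for k in range(len(distributions[0]))
    ]
    return shannon_entropy(mixture) - sum(
        w * shannon_entropy(p) for w, p in zip(weights, distributions)
    )
```

With base-2 logarithms and equal weights, two distributions with disjoint support give the maximum value of 1 bit, and identical distributions give 0.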

2.
3.
We develop a general theory of organism movement in heterogeneous populations that can explain the leptokurtic movement distributions commonly measured in nature. We describe population heterogeneity in a state-structured framework, employing advection-diffusion as the fundamental movement process of individuals occupying different movement states. Our general analysis shows that population heterogeneity in movement behavior can be defined as the existence of different movement states and among-individual variability in the time individuals spend in these states. A presentation of moment-based metrics of movement illustrates the role of these attributes in general dispersal processes. We also present a special case of the general theory: a model population composed of individuals occupying one of two movement states with linear transitions, or exchange, between the two states. This two-state "exchange model" can be viewed as a correlated random walk and provides a generalization of the telegraph equation. By exploiting the main result of our general analysis, we characterize the exchange model by deriving moment-based metrics of its movement process and identifying an analytical representation of the model's time-dependent solution. Our results provide general and specific theoretical explanations for empirical patterns in organism movement; the results also provide conceptual and analytical bases for extending diffusion-based dispersal theory in several directions, thereby facilitating mechanistic links between individual behavior and spatial population dynamics.

4.
The stationary birth-only, or Yule-Furry, process for rooted binary trees has been analysed with a view to developing explicit expressions for two fundamental statistical distributions: the probability that a randomly selected leaf is preceded by N nodes, or “ancestors”, and the probability that two randomly selected leaves are separated by N nodes. For continuous-time Yule processes, the first of these distributions is presented in closed analytical form as a function of time, with time being measured with respect to the moment of “birth” of the common ancestor (which is essentially inaccessible to phylogenetic analysis), or with respect to the instant at which the first bifurcation occurred. The second distribution is shown to follow in an iterative manner from a hierarchy of second-order ordinary differential equations. For Yule trees of a given number n of tips, expressions have been derived for the mean and variance for each of these distributions as functions of n, as well as for the distributions themselves. In addition, it is shown how the methods developed to obtain these distributions can be employed to find, with minor effort, expressions for the expectation values of two statistics on Yule trees, the Sackin index (sum over all root-to-leaf distances), and the sum over all leaf-to-leaf distances.
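
The Sackin index mentioned above can be illustrated with a small simulation. The sketch below grows a Yule tree by repeatedly splitting a uniformly chosen leaf and sums the leaf depths, measured here in edges from the first bifurcation (a convention assumed for this sketch, not necessarily the paper's). The closed form 2n(1/2 + 1/3 + ... + 1/n) is the commonly cited Yule expectation, not a formula taken from this abstract, and all function names are hypothetical.

```python
import random


def yule_leaf_depths(n_tips, rng):
    """Grow a Yule tree to n_tips (>= 2) leaves; return each leaf's depth in edges."""
    depths = [1, 1]  # the first bifurcation gives two leaves one edge below the root
    while len(depths) < n_tips:
        i = rng.randrange(len(depths))  # every leaf is equally likely to split next
        d = depths.pop(i)
        depths.extend([d + 1, d + 1])  # the split leaf is replaced by two children
    return depths


def sackin_index(depths):
    """Sum of root-to-leaf distances."""
    return sum(depths)


def expected_sackin_yule(n_tips):
    """Commonly cited Yule expectation: 2n * sum_{k=2}^{n} 1/k."""
    return 2 * n_tips * sum(1.0 / k for k in range(2, n_tips + 1))
```

For n = 3 there is only one tree shape, so the simulated index always equals the expected value of 5; for larger n, averaging `sackin_index` over many simulated trees should approach `expected_sackin_yule(n)`.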

5.
Several studies using test-day models show clear heterogeneity of residual variance along lactation. A changepoint technique to account for this heterogeneity is proposed. The data set included 100 744 test-day records of 10 869 Holstein-Friesian cows from northern Spain. A three-stage hierarchical model using the Wood lactation function was employed. Two unknown changepoints at times T1 and T2 (0 < T1 < T2 < tmax), with continuity of residual variance at these points, were assumed. Also, a nonlinear relationship between residual variance and the number of days of milking t was postulated. The residual variance at a time t() in the lactation phase i was modeled as: for (i = 1, 2, 3), where λi is a phase-specific parameter. A Bayesian analysis using Gibbs sampling and the Metropolis-Hastings algorithm for marginalization was implemented. After a burn-in of 20 000 iterations, 40 000 samples were drawn to estimate posterior features. The posterior modes of T1, T2, λ1, λ2, λ3, , , were 53.2 and 248.2 days; 0.575, -0.406, 0.797 and 0.702, 34.63 and 0.0455 kg², respectively. The residual variances predicted using these point estimates were 2.64, 6.88, 3.59 and 4.35 kg² at days of milking 10, 53, 248 and 305, respectively. This technique requires less restrictive assumptions and the model has fewer parameters than other methods proposed to account for the heterogeneity of residual variance during lactation.

6.
To evaluate whether environmental heterogeneity contributes to the genetic heterogeneity in Anopheles triannulatus, larval habitat characteristics across the Brazilian states of Roraima and Pará and genetic sequences were examined. A comparison with Anopheles goeldii was utilised to determine whether high genetic diversity was unique to An. triannulatus. Student's t-test and analysis of variance found no differences in habitat characteristics between the species. Analysis of population structure of An. triannulatus and An. goeldii revealed distinct demographic histories in a largely overlapping geographic range. Cytochrome oxidase I sequence parsimony networks found geographic clustering for both species; however, nuclear marker networks depicted An. triannulatus with a more complex history of fragmentation, secondary contact and recent divergence. Evidence of Pleistocene expansions suggests both species are more likely to be genetically structured by geographic and ecological barriers than demography. We hypothesise that niche partitioning is a driving force for diversity, particularly in An. triannulatus.

7.
In this paper the situation of extra population heterogeneity is discussed from an analysis of variance point of view. We first provide a non‐iterative way of estimating the variance of the heterogeneity distribution without estimating the heterogeneity distribution itself for Poisson and binomial counts. The consequences of the presence of heterogeneity in the estimation of the mean are discussed. We show that if the homogeneity assumption holds, the pooled mean is optimal, while in the presence of strong heterogeneity, the simple (arithmetic) mean is an optimal estimator of the mean SMR or mean proportion. These results lead to the problem of finding an optimal estimator for situations not represented by these two extreme cases. We propose an iterative solution to this problem. Illustrations for the application of these findings are provided with examples from various areas.
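
To make the two extreme estimators concrete, the sketch below computes the pooled mean and the simple arithmetic mean from per-unit counts. For proportions, `sizes` holds the sample sizes; for SMRs it would hold the expected counts. The function names are hypothetical, and the paper's variance estimator and iterative compromise estimator are not reproduced here.

```python
def pooled_mean(events, sizes):
    """Total events over total denominator; large units dominate the result."""
    return sum(events) / sum(sizes)


def simple_mean(events, sizes):
    """Unweighted arithmetic mean of the per-unit proportions (or SMRs)."""
    rates = [x / n for x, n in zip(events, sizes)]
    return sum(rates) / len(rates)
```

With `events = [5, 50]` and `sizes = [10, 1000]`, the pooled mean is 55/1010 (about 0.054) while the simple mean is 0.275, which shows how far the two can diverge when the units are heterogeneous.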

8.
We report on a diffusive analysis of the motion of flagellate protozoa species. These parasites are the etiological agents of neglected tropical diseases: leishmaniasis caused by Leishmania amazonensis and Leishmania braziliensis, African sleeping sickness caused by Trypanosoma brucei, and Chagas disease caused by Trypanosoma cruzi. By tracking the positions of these parasites and evaluating the variance related to the radial positions, we find that their motions are characterized by a short-time transient superdiffusive behavior. Also, the probability distributions of the radial positions are self-similar and can be approximated by a stretched Gaussian distribution. We further investigate the probability distributions of the radial velocities of individual trajectories. Among several candidates, we find that the generalized gamma distribution shows a good agreement with these distributions. The velocity time series have long-range correlations, displaying a strong persistent behavior (Hurst exponents close to one). The prevalence of “universal” patterns across all analyzed species indicates that similar mechanisms may be ruling the motion of these parasites, despite their differences in morphological traits. In addition, further analysis of these patterns could become a useful tool for investigating the activity of new candidate drugs against these and other neglected tropical diseases.
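
A common way to classify diffusive behavior of this kind is to fit the scaling of the positional variance with time, variance ∝ t^α, where α = 1 is normal diffusion and α > 1 is superdiffusion. The sketch below estimates α as the least-squares slope on log-log axes; it is a generic illustration under that assumption, not the authors' analysis pipeline, and the function name is hypothetical.

```python
import math


def diffusion_exponent(times, variances):
    """Least-squares slope of log(variance) against log(time)."""
    xs = [math.log(t) for t in times]
    ys = [math.log(v) for v in variances]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    # Ordinary least-squares slope: covariance of (x, y) over variance of x.
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx
```

Because the superdiffusive regime reported above is a short-time transient, such a fit would need to be restricted to the corresponding time window rather than applied to the whole trajectory.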

9.
10.
Investigating differences between means of more than two groups or experimental conditions is a routine research question addressed in biology. In order to assess differences statistically, multiple comparison procedures are applied. The most prominent procedures of this type, the Dunnett and Tukey-Kramer tests, control the probability of reporting at least one false positive result when the data are normally distributed and when the sample sizes and variances do not differ between groups. All three assumptions are unrealistic in biological research and any violation leads to an increased number of reported false positive results. Based on a general statistical framework for simultaneous inference and robust covariance estimators, we propose a new statistical multiple comparison procedure for assessing multiple means. In contrast to the Dunnett or Tukey-Kramer tests, no assumptions regarding the distribution, sample sizes or variance homogeneity are necessary. The performance of the new procedure is assessed by means of its familywise error rate and power under different distributions. The practical merits are demonstrated by a reanalysis of fatty acid phenotypes of the bacterium Bacillus simplex from the “Evolution Canyons” I and II in Israel. The simulation results show that even under severely varying variances, the procedure controls the number of false positive findings very well. Thus, the procedure presented here works well under biologically realistic scenarios of unbalanced group sizes, non-normality and heteroscedasticity.

11.
Extinction models for cancer stem cell therapy   (cited by 1: 0 self-citations, 1 citation by others)
Cells with stem cell-like properties are now viewed as initiating and sustaining many cancers. This suggests that cancer can be cured by driving these cancer stem cells to extinction. The problem with this strategy is that ordinary stem cells are apt to be killed in the process. This paper sets bounds on the killing differential (difference between death rates of cancer stem cells and normal stem cells) that must exist for the survival of an adequate number of normal stem cells. Our main tools are birth-death Markov chains in continuous time. In this framework, we investigate the extinction times of cancer stem cells and normal stem cells. Application of extreme value theory from mathematical statistics yields an accurate asymptotic distribution and corresponding moments for both extinction times. We compare these distributions for the two cell populations as a function of the killing rates. Perhaps a more telling comparison involves the number of normal stem cells NH at the extinction time of the cancer stem cells. Conditioning on the asymptotic time to extinction of the cancer stem cells allows us to calculate the asymptotic mean and variance of NH. The full distribution of NH can be retrieved by the finite Fourier transform and, in some parameter regimes, by an eigenfunction expansion. Finally, we discuss the impact of quiescence (the resting state) on stem cell dynamics. Quiescence can act as a sanctuary for cancer stem cells and imperils the proposed therapy. We approach the complication of quiescence via multitype branching process models and stochastic simulation. Improvements to the τ-leaping method of stochastic simulation make it a versatile tool in this context. We conclude that the proposed therapy must target quiescent cancer stem cells as well as actively dividing cancer stem cells. The current cancer models demonstrate the virtue of attacking the same quantitative questions from a variety of modeling, mathematical, and computational perspectives.  
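
The extinction-time question can be explored numerically with a direct stochastic simulation. The sketch below simulates a linear birth-death chain in continuous time, with per-cell birth and death rates, until the population reaches zero. It is a plain Gillespie-style simulation, not the τ-leaping variant or the asymptotic theory described in the abstract; the function name and parameters are hypothetical, and it assumes the death rate exceeds the birth rate so that extinction happens in reasonable time.

```python
import random


def extinction_time(n0, birth_rate, death_rate, rng):
    """Simulate a linear birth-death chain until the population reaches zero."""
    n, t = n0, 0.0
    while n > 0:
        total_rate = n * (birth_rate + death_rate)
        t += rng.expovariate(total_rate)  # exponential waiting time to the next event
        if rng.random() < birth_rate / (birth_rate + death_rate):
            n += 1  # a cell divides
        else:
            n -= 1  # a cell dies
    return t
```

In the pure-death case (birth rate 0) the mean extinction time from n0 cells is (1/death_rate)(1 + 1/2 + ... + 1/n0), which gives a simple sanity check for the simulation.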

12.
In this paper, a new approach based on eigen-system pseudo-spectral estimation methods, namely Eigenvector (EV) and MUSIC, and a Multilayer Perceptron (MLP) neural network is introduced. In this approach, the calculated EEG (electroencephalogram) spectrum is divided into smaller frequency sub-bands. Then, a set of features, {maximum, entropy, average, standard deviation, mobility}, is extracted from these sub-bands. Next, a feature vector is formed by combining a set of EEG time-domain features, {standard deviation, complexity measure}, with the spectral feature set. The feature vector is then fed into an MLP neural network to classify the signal into the following three states: normal (healthy), epileptic patient signal in a seizure-free interval (inter-ictal), and epileptic patient signal in a full seizure interval (ictal). The experimental results show that classification of the EEG signals may be achieved with approximately 97.5% accuracy and a variance of 0.095% using an available public EEG signals database. The results are among the best reported for classifying the three aforementioned states. The method is fast and accurate, with a low misclassification rate, so it can make practical, real-time detection of this chronic disease feasible.

13.
The scale-invariant and intermittent dynamics of animal behavior are attracting scientific interest. Recent findings concerning the statistical laws of behavioral organization shared between healthy humans and wild-type mice (WT) and their alterations in human depression patients and circadian clock gene (Period 2; Per2) mutant mice indicate that clock genes play functional roles in intermittent, ultradian locomotor dynamics. They also claim the clinical and biological importance of the laws as objective biobehavioral measures or endophenotypes for psychiatric disorders. In this study, to elucidate the roles of breakdown of the broader circadian regulatory circuit in intermittent behavioral dynamics, we studied the statistical properties and rhythmicity of locomotor activity in Per2 mutants and mice deficient in other clock genes (Bmal1, Clock). We performed wavelet analysis to examine circadian and ultradian rhythms and estimated the cumulative distributions of resting period durations during which locomotor activity levels are continuously lower than a predefined threshold value. The wavelet analysis revealed significant amplification of ultradian rhythms in the BMAL1-deficient mice, and instability in the Per2 mutants. The resting period distributions followed a power-law form in all mice. While the distributions for the BMAL1-deficient and Clock mutant mice were almost identical to those for the WT mice, with no significant differences in their parameter (power-law scaling exponent), only the Per2 mutant mice showed consistently and significantly lower values of the scaling exponent, indicating the increased intermittency in ultradian locomotor dynamics. Furthermore, based on a stochastic priority queuing model, we explained the power-law nature of resting period distributions, as well as its alterations shared with human depressive patients and Per2 mutant mice. 
Our findings lead to the development of a novel mathematical model for abnormal behaviors in psychiatric disorders.
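
The scaling exponent of a power-law cumulative distribution, P(X ≥ x) = (x / x_min)^(-γ), can be estimated by maximum likelihood as γ = n / Σ ln(x_i / x_min). The sketch below implements that estimator together with a sampler for checking it; the choice of x_min, the function names, and the use of maximum likelihood rather than whatever fitting procedure the authors used are all assumptions of this sketch.

```python
import math
import random


def cumulative_exponent_mle(durations, x_min):
    """MLE of gamma in P(X >= x) = (x / x_min) ** (-gamma), for x >= x_min."""
    tail = [x for x in durations if x >= x_min]  # only durations above the cutoff
    return len(tail) / sum(math.log(x / x_min) for x in tail)


def sample_power_law(n, gamma, x_min, rng):
    """Inverse-transform sampling from the same distribution, for checking."""
    # 1 - rng.random() lies in (0, 1], which avoids raising zero to a negative power.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / gamma) for _ in range(n)]
```

A lower estimated γ means a heavier tail, that is, relatively more long resting periods, which is the direction of change the abstract reports for the Per2 mutant mice.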

14.
Since its birth in the early 1960s, Italian operaismo (workerism) has provided an optimistic reading of working-class militancy, a theoretically stimulating account of capitalist transformation, and a set of highly productive conceptual categories. Despite a shared provenance, however, operaista-influenced movements and theorists have since taken these categories in quite varied directions. Given this conceptual heterogeneity, I consider herein one such category, the “social factory”, and its conceptual reworking by Antonio Negri, as elaborated in his 2017 book, Marx and Foucault. To pursue this inquiry I employ an anthropological lens, drawing on anthropological theory and ethnographic research. My aim is to build toward a reconception of the social factory analytic for use in a contemporary anthropology of state formation.

15.
16.

Background

The X chromosome plays an important role in human diseases and traits. However, few X-linked associations have been reported in genome-wide association studies, partly due to analytical complications and low statistical power.

Results

In this study, we propose tests of X-linked association that capitalize on variance heterogeneity caused by various factors, predominantly the process of X-inactivation. In the presence of X-inactivation, the expression of one copy of the chromosome is randomly silenced. Due to the consequent elevated randomness of expressed variants, females that are heterozygotes for a quantitative trait locus might exhibit higher phenotypic variance for that trait. We propose three tests that build on this phenomenon: 1) A test for inflated variance in heterozygous females; 2) A weighted association test; and 3) A combined test. Test 1 captures the novel signal proposed herein by directly testing for higher phenotypic variance in heterozygous than in homozygous females. As a test of variance it is generally less powerful than standard tests of association that consider means, which is supported by extensive simulations. Test 2 is similar to a standard association test in considering the phenotypic mean, but differs by accounting for (rather than testing) the variance heterogeneity. As expected in light of X-inactivation, this test is slightly more powerful than a standard association test. Finally, test 3 further improves power by combining the results of the first two tests. We applied these tests to the ARIC cohort data and identified a novel X-linked association near gene AFF2 with blood pressure, which was not significant based on standard association testing of mean blood pressure.

Conclusions

Variance-based tests examine overdispersion, thereby providing a complementary type of signal to a standard association test. Our results point to the potential to improve power of detecting X-linked associations in the presence of variance heterogeneity.
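
The idea behind a test for inflated variance can be sketched with a generic permutation test. The code below compares the spread of a phenotype in heterozygous and homozygous females using absolute deviations from each group's median (in the style of the Brown-Forsythe test) and returns a one-sided permutation p-value. This is a stand-in illustration, not the authors' test 1; the function names and arguments are hypothetical.

```python
import random
import statistics


def abs_deviations(values):
    """Absolute deviations from the group median (Brown-Forsythe style)."""
    med = statistics.median(values)
    return [abs(v - med) for v in values]


def variance_inflation_pvalue(het, hom, n_perm, rng):
    """One-sided permutation p-value for larger spread in `het` than in `hom`."""
    dev_het, dev_hom = abs_deviations(het), abs_deviations(hom)
    observed = statistics.mean(dev_het) - statistics.mean(dev_hom)
    pooled = dev_het + dev_hom
    k = len(dev_het)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign deviations to the two groups at random
        diff = statistics.mean(pooled[:k]) - statistics.mean(pooled[k:])
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction keeps p strictly positive
```

Centering each group on its own median means that a difference in phenotypic means alone does not produce a small p-value, which keeps this variance signal separate from the mean-based signal of a standard association test.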

17.

Background

The movement patterns of wild animals depend crucially on the spatial and temporal availability of resources in their habitat. To date, most attempts to model this relationship were forced to rely on simplified assumptions about the spatiotemporal distribution of food resources. Here we demonstrate how advances in statistics permit the combination of sparse ground sampling with remote sensing imagery to generate biologically relevant, spatially and temporally explicit distributions of food resources. We illustrate our procedure by creating a detailed simulation model of fruit production patterns for Dipteryx oleifera, a keystone tree species, on Barro Colorado Island (BCI), Panama.

Methodology and Principal Findings

Aerial photographs providing GPS positions for large, canopy trees, the complete census of a 50-ha and 25-ha area, diameter at breast height data from haphazardly sampled trees and long-term phenology data from six trees were used to fit 1) a point process model of tree spatial distribution and 2) a generalized linear mixed-effect model of temporal variation of fruit production. The fitted parameters from these models are then used to create a stochastic simulation model which incorporates spatio-temporal variations of D. oleifera fruit availability on BCI.

Conclusions and Significance

We present a framework that can provide a statistical characterization of the habitat that can be included in agent-based models of animal movements. When environmental heterogeneity cannot be exhaustively mapped, this approach can be a powerful alternative. The results of our model on the spatio-temporal variation in D. oleifera fruit availability will be used to understand behavioral and movement patterns of several species on BCI.

18.
The additive genetic variance–covariance matrix (G) summarizes the multivariate genetic relationships among a set of traits. The geometry of G describes the distribution of multivariate genetic variance, and generates genetic constraints that bias the direction of evolution. Determining if and how the multivariate genetic variance evolves has been limited by a number of analytical challenges in comparing G-matrices. Current methods for the comparison of G typically share several drawbacks: metrics that lack a direct relationship to evolutionary theory, the inability to be applied in conjunction with complex experimental designs, difficulties with determining statistical confidence in inferred differences and an inherently pair-wise focus. Here, we present a cohesive and general analytical framework for the comparative analysis of G that addresses these issues, and that incorporates and extends current methods with a strong geometrical basis. We describe the application of random skewers, common subspace analysis, the 4th-order genetic covariance tensor and the decomposition of the multivariate breeder's equation, all within a Bayesian framework. We illustrate these methods using data from an artificial selection experiment on eight traits in Drosophila serrata, where a multi-generational pedigree was available to estimate G in each of six populations. One method, the tensor, elegantly captures all of the variation in genetic variance among populations, and allows the identification of the trait combinations that differ most in genetic variance. The tensor approach is likely to be the most generally applicable method to the comparison of G-matrices from any sampling or experimental design.
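
Of the methods listed above, random skewers is the simplest to sketch. Random selection gradients β are applied to two G-matrices, and the predicted responses Gβ are compared by their vector correlation (cosine). The code below is a point-estimate illustration only: it ignores the Bayesian treatment of uncertainty described in the abstract, and the function names are hypothetical.

```python
import math
import random


def mat_vec(matrix, vector):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def cosine(u, v):
    """Vector correlation (cosine of the angle) between two response vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def random_skewers(g1, g2, n_skewers, rng):
    """Mean cosine between the responses G1*beta and G2*beta to random beta."""
    total = 0.0
    for _ in range(n_skewers):
        beta = [rng.gauss(0.0, 1.0) for _ in g1]  # random direction in trait space
        total += cosine(mat_vec(g1, beta), mat_vec(g2, beta))
    return total / n_skewers
```

Identical or proportional matrices give a mean correlation of 1, since only the direction of the response is compared; lower values indicate that the two populations would respond to the same selection in different directions.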

19.
I present a general diffusion-based modeling framework for the analysis of animal movements in heterogeneous landscapes, including terms representing advection, mortality, and edge-mediated behavior. I use adjoint operator theory to develop mathematical machinery for the assessment of a number of biologically relevant quantities, such as occupancy times, hitting probabilities, quasi-stationary distributions, the backwards equation, and conditional probability densities. I derive finite-element approximations, which can be used to obtain numerical solutions in domains which do not allow for an analytical treatment. As an example, I model the movements of the butterfly Melitaea cinxia on an island consisting of a set of habitat patches and the intervening matrix habitat. I illustrate the behavior of the model and the mathematical theory by examining the effects of a hypothetical movement barrier and advection caused by prevailing wind conditions.

20.
A mathematical framework for a rigorous theory of general systems is constructed, using the notions of the theory of Categories and Functors introduced by Eilenberg and MacLane (1945, Trans. Am. Math. Soc., 58, 231–94). A short discussion of the basic ideas is given, and their possible application to the theory of biological systems is discussed. On the basis of these considerations, a number of results are proved, including the possibility of selecting a unique representative (a “canonical form”) from a family of mathematical objects, all of which represent the same system. As an example, the representation of the neural net and the finite automaton is constructed in terms of our general theory.
