Similar documents
20 similar documents found (search time: 31 ms)
1.
The iterated birth and death process is defined as an n-fold iteration of a stochastic process that combines instantaneous random killing of individuals in a population, with a given survival probability s, with a Markov birth and death process describing the subsequent population dynamics. A long-standing problem, computing the distribution of the number of clonogenic tumor cells surviving a fractionated radiation schedule consisting of n equal doses separated by equal time intervals τ, is solved within the framework of iterated birth and death processes. For any initial tumor size i, an explicit formula is found for the distribution of the number M of surviving clonogens at time τ after the end of treatment. It is shown that if i → ∞ and s → 0 so that i·s^n tends to a finite positive limit, the distribution of the random variable M converges to a probability distribution, and a formula for the latter is obtained. This result generalizes the classical theorem on the Poisson limit of a sequence of binomial distributions. The exact and limiting distributions are also found for the number of surviving clonogens immediately after the nth exposure; in this case, the limiting distribution turns out to be a Poisson distribution.
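The Poisson limit invoked here can be illustrated numerically. A minimal sketch (not the paper's full iterated birth-death model): when there are no births or deaths between fractions, each of i clonogens survives all n doses independently with probability s^n, so the survivor count is Binomial(i, s^n), which approaches Poisson(λ) as i grows with λ = i·s^n held fixed.

```python
import math

def binom_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    return lam**k * math.exp(-lam) / math.factorial(k)

# Hold lam = i * s**n fixed while the initial tumor size i grows;
# the binomial survivor distribution approaches Poisson(lam).
lam = 2.0
for i in (10, 100, 10000):
    p = lam / i          # plays the role of s**n
    max_diff = max(abs(binom_pmf(k, i, p) - poisson_pmf(k, lam))
                   for k in range(15))
    print(i, round(max_diff, 5))
```

The maximum pointwise difference shrinks as i increases, which is the binomial-to-Poisson convergence that the paper's theorem generalizes.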

2.
Power investigations, for example in statistical procedures for assessing agreement among multiple raters, often require the simultaneous simulation of several dependent binomial or Poisson distributions to model the stochastic dependencies between the raters' results appropriately. Given the rather large dimensions of the random vectors to be generated, and the even larger number of interactions to be introduced into the simulation scenarios to determine all necessary information on the dependence structure of their distributions, efficient and fast algorithms are needed for simulating multivariate Poisson and binomial distributions. Two equivalent models for the multivariate Poisson distribution are therefore combined to obtain an algorithm for quickly implementing its multivariate dependence structure. Simulating the multivariate Poisson distribution then becomes feasible by first generating and then convoluting independent univariate Poisson variates with appropriate expectations, which can be computed via linear recursion formulae. Similar means of simulation are also considered for the binomial setting. In this scenario, however, it turns out that exact computation of the probability function is even easier to perform; corresponding linear recursion formulae for the point probabilities of multivariate binomial distributions are therefore presented, which require only the index parameter and the (simultaneous) success probabilities, that is, the multivariate dependence structure among the binomial marginals.
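Convoluting independent univariate Poisson variates, as described, yields correlated Poisson marginals. A bivariate sketch via the common-component (trivariate reduction) construction; this is an illustration of the convolution idea, not the authors' recursion-based algorithm:

```python
import math
import random

def poisson_variate(lam, rng):
    """Knuth's multiplication method; adequate for small lam."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def bivariate_poisson(lam1, lam2, lam12, rng):
    # X = Y1 + Y12, Z = Y2 + Y12 with independent Y1 ~ Poi(lam1),
    # Y2 ~ Poi(lam2), Y12 ~ Poi(lam12).  Marginals are
    # Poi(lam1 + lam12) and Poi(lam2 + lam12); Cov(X, Z) = lam12.
    y12 = poisson_variate(lam12, rng)
    return poisson_variate(lam1, rng) + y12, poisson_variate(lam2, rng) + y12

rng = random.Random(0)
pairs = [bivariate_poisson(1.0, 2.0, 0.5, rng) for _ in range(20000)]
mx = sum(x for x, _ in pairs) / len(pairs)   # about 1.5
mz = sum(z for _, z in pairs) / len(pairs)   # about 2.5
print(round(mx, 2), round(mz, 2))
```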

3.
The question of how to characterize the bacterial density in a body of water when data are available as counts from a number of small-volume samples was examined for cases where either the Poisson or negative binomial probability distributions could be used to describe the bacteriological data. The suitability of the Poisson distribution when replicate analyses were performed under carefully controlled conditions and of the negative binomial distribution for samples collected from different locations and over time were illustrated by two examples. In cases where the negative binomial distribution was appropriate, a procedure was given for characterizing the variability by dividing the bacterial counts into homogeneous groups. The usefulness of this procedure was illustrated for the second example based on survey data for Lake Erie. A further illustration of the difference between results based on the Poisson and negative binomial distributions was given by calculating the probability of obtaining all samples sterile, assuming various bacterial densities and sample sizes.
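The all-samples-sterile calculation is straightforward under either distribution. A sketch with illustrative numbers (not the Lake Erie data), using the facts that P(count = 0) is e^(−m) for a Poisson count with mean m and (k/(k+m))^k for a negative binomial count with mean m and shape k:

```python
import math

def p_all_sterile_poisson(density, volume, n_samples):
    # One sample of the given volume is sterile w.p. exp(-density * volume);
    # independent samples multiply.
    return math.exp(-density * volume * n_samples)

def p_all_sterile_negbin(mean, k, n_samples):
    # Negative binomial count with mean `mean` and shape k: P(0) = (k/(k+mean))**k
    return (k / (k + mean)) ** (k * n_samples)

# Illustrative: density 0.5 organisms/mL, 10 mL samples, 3 samples
m = 0.5 * 10
print(p_all_sterile_poisson(0.5, 10, 3))
# Overdispersed alternative with the same per-sample mean: for any shape k,
# (1 + m/k)**(-k) >= exp(-m), so the NB gives a higher sterile probability.
print(p_all_sterile_negbin(m, 0.7, 3))
```

The contrast mirrors the paper's point: for the same mean density, the clumped (negative binomial) model makes all-sterile outcomes far more likely.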

4.
We provide a computationally realistic mathematical framework for the NP-hard problem of the multichromosomal breakpoint median for linear genomes that can be used in constructing phylogenies. A novel approach is provided that can handle signed, unsigned, and partially signed cases of the multichromosomal breakpoint median problem. Our method provides an avenue for incorporating biological assumptions (whenever available) such as the number of chromosomes in the ancestor, and thus it can be tailored to obtain a more biologically relevant picture of the median. We demonstrate the usefulness of our method by performing an empirical study on both simulated and real data with a comparison to other methods.

5.
Motivated by the trend of genome sequencing without completing the sequences of whole genomes, we study the problem of filling an incomplete multichromosomal genome (or scaffold) I with respect to a complete target genome G. The objective is to minimize the resulting genomic distance between I' and G, where I' is the corresponding filled scaffold. We call this the one-sided scaffold filling problem. In this paper, we conduct a systematic study of the scaffold filling problem under the breakpoint distance and its variants, for both unichromosomal and multichromosomal genomes (with and without gene repetitions). When the input genome contains no gene repetition (i.e., is a fragment of a permutation), we show that the two-sided scaffold filling problem (i.e., G is also incomplete) is polynomially solvable for unichromosomal genomes under the breakpoint distance and for multichromosomal genomes under the genomic (DCJ, Double-Cut-and-Join) distance. However, when the input genome contains repeated genes, even the one-sided scaffold filling problem becomes NP-complete when the similarity measure is the maximum number of adjacencies between two sequences. For this problem, we also present efficient constant-factor approximation algorithms: factor-2 for the general case and factor-1.33 for the one-sided case.
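The adjacency similarity measure named above can be made concrete with a small sketch (illustrative only: gene orientations/signs are ignored, and this is counting shared adjacencies, not the paper's approximation algorithm):

```python
from collections import Counter

def adjacency_multiset(genes):
    # Multiset of unordered neighbouring gene pairs; sorting each pair
    # makes (a, b) and (b, a) count as the same adjacency.
    return Counter(tuple(sorted(p)) for p in zip(genes, genes[1:]))

def shared_adjacencies(g1, g2):
    # Multiset intersection: a repeated adjacency counts once per shared copy.
    return sum((adjacency_multiset(g1) & adjacency_multiset(g2)).values())

# Toy unichromosomal genomes with gene 'a' repeated
print(shared_adjacencies(list("abcab"), list("bacba")))   # 4 shared adjacencies
```

With repeated genes, maximizing this count over the possible insertions of missing genes is exactly the NP-complete objective the abstract describes.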

6.
MOTIVATION: The double cut and join (DCJ) operation has been used extensively for genomic rearrangement. Although the DCJ distance between signed genomes with both linear and circular (uni- and multi-) chromosomes is well studied, the only known result for the NP-complete unsigned DCJ distance problem is an approximation algorithm for unsigned linear unichromosomal genomes. In this article, we study the problem of computing the DCJ distance between two unsigned linear multichromosomal genomes (abbreviated UDCJ). RESULTS: We devise a 1.5-approximation algorithm for UDCJ by exploiting the distance formula for signed genomes. In addition, we show that UDCJ admits a weak kernel of size 2k and hence an FPT algorithm running in O(2^(2k)·n) time.

7.
We prove that the generalized Poisson distribution GP(θ, η) (η ≥ 0) is a mixture of Poisson distributions; this is a new property for a distribution that is the topic of the book by Consul (1989). Because we find that the fits to count data of the generalized Poisson and negative binomial distributions are often similar, to understand their differences we compare the probability mass functions and skewnesses of the generalized Poisson and negative binomial distributions with the first two moments fixed. They differ only slightly in many situations, but their zero-inflated distributions, with the masses at zero, means and variances fixed, can differ more. These probabilistic comparisons are helpful in selecting a better-fitting distribution for modelling count data with long right tails. Through a real example of count data with a large zero fraction, we illustrate how the generalized Poisson and negative binomial distributions, as well as their zero-inflated counterparts, can be discriminated.
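The moment-matched comparison of the two mass functions can be reproduced in a few lines. A sketch using the standard GP(θ, η) pmf θ(θ+kη)^(k−1)e^(−θ−kη)/k! with mean θ/(1−η) and variance θ/(1−η)³ (standard parameterization assumed; the matched mean and variance below are illustrative):

```python
import math

def gp_pmf(k, theta, eta):
    # Generalized Poisson GP(theta, eta), 0 <= eta < 1
    return theta * (theta + k * eta) ** (k - 1) * math.exp(-(theta + k * eta)) / math.factorial(k)

def nb_pmf(k, r, p):
    # Negative binomial with shape r and success probability p
    return math.exp(math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)) * p ** r * (1 - p) ** k

def match_first_two_moments(mean, var):
    # GP: mean = theta/(1-eta), var = theta/(1-eta)**3
    eta = 1 - math.sqrt(mean / var)
    theta = mean * (1 - eta)
    # NB: var = mean + mean**2/r, p = r/(r+mean)
    r = mean ** 2 / (var - mean)
    return (theta, eta), (r, r / (r + mean))

(theta, eta), (r, p) = match_first_two_moments(4.0, 10.0)
max_diff = max(abs(gp_pmf(k, theta, eta) - nb_pmf(k, r, p)) for k in range(40))
print(round(max_diff, 4))   # the pointwise differences are small
```

With mean and variance fixed, the two pmfs are close throughout their range, consistent with the paper's observation that fits to count data are often similar.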

8.
We describe a method for the recursive computation of exact probability distributions for the number of neutral mutations segregating in samples of arbitrary size and configuration. Construction of the recursions requires only characterization of evolutionary changes as a Markov process and determination of one-step transition matrices. We address the pattern of nucleotide diversity at a neutral marker locus linked to a determinant of mating type. Under a reformulation of parameters, the method also applies directly to metapopulation models with island migration among demes. Characterization of complete probability distributions facilitates parameter estimation and hypothesis testing by likelihood- as well as moment-based approaches.

9.
Toward a neutral evolutionary model of gene expression
Khaitovich P, Pääbo S, Weiss G. Genetics 2005, 170(2):929-939

10.
We present a model of gene duplication by means of unequal crossover (UCO) where the probability of any given pairing between homologous sequences scales as a penalty factor p^z ≤ 1, with z the number of mismatches due to asymmetric sequence alignment. From this general representation, we derive several limiting-case models of UCO, some of which have been treated elsewhere in the literature. One limiting case is random unequal crossover (RUCO), obtained by setting p = 1 (corresponding to equiprobable pairings at each site). Another limiting-case scenario (the ‘Krueger-Vogel’ model) proposes an optimal ‘endpoint’ alignment that strongly penalizes both overhang and deviations from endpoint matching positions. For both of these scenarios, we make use of the symmetry properties of the transition operator (together with the more general UCO properties of copy-number conservation and equal parent-offspring mean copy number) to derive the stationary distribution of gene copy number generated by UCO. For RUCO, the stationary distribution of genotypes is shown to be a negative binomial or, alternatively, a convolution of geometric distributions on ‘haplotype’ frequencies. A different type of model derived from the general representation allows only recombination without overhang (internal UCO, or IntUCO). This process has the special property of converging to a single copy length, or a distribution on a pair of copy lengths, in the absence of any other evolutionary forces. For UCO systems in general, we also show that selection can readily act on gene copy number in all of the UCO systems we investigate, owing to the perfect heritability (h² = 1) imposed by conservation of copy number. Finally, some preliminary work is presented which suggests that the more general models based on misalignment probabilities also seem to converge to stationary distributions, most likely functions of the parameter value p. An erratum to this article is available.

11.
GRIMM: genome rearrangements web server
SUMMARY: Genome Rearrangements In Man and Mouse (GRIMM) is a tool for analyzing rearrangements of gene orders in pairs of unichromosomal and multichromosomal genomes, with either signed or unsigned gene data. Although there are several programs for analyzing rearrangements in unichromosomal genomes, this is the first to analyze rearrangements in multichromosomal genomes. GRIMM also provides a new algorithm for analyzing comparative maps for which gene directions are unknown. AVAILABILITY: A web server, with instructions and sample data, is available at http://www-cse.ucsd.edu/groups/bioinformatics/GRIMM.

12.
MOTIVATION: Gene genealogies offer a powerful context for inferences about the evolutionary process based on presently segregating DNA variation. In many cases, it is the distribution of population parameters, marginalized over the effectively infinite-dimensional tree space, that is of interest. Our evolutionary forest (EF) algorithm uses Monte Carlo methods to generate posterior distributions of population parameters. A novel feature is the updating of parameter values based on a probability measure defined on an ensemble of histories (a forest of genealogies), rather than a single tree. RESULTS: The EF algorithm generates samples from the correct marginal distribution of population parameters. Applied to actual data from closely related fruit fly species, it rapidly converged to posterior distributions that closely approximated the exact posteriors generated through massive computational effort. Applied to simulated data, it generated credible intervals that covered the actual parameter values in accordance with the nominal probabilities. AVAILABILITY: A C++ implementation of this method is freely accessible at http://www.isds.duke.edu/~scl13

13.
14.
Aims: Fits of species-abundance distributions to empirical data are increasingly used to evaluate models of diversity maintenance and community structure and to infer properties of communities, such as species richness. Two distributions predicted by several models are the Poisson lognormal (PLN) and the negative binomial (NB) distribution; however, at least three different ways to parameterize the PLN have been proposed, which differ in whether unobserved species contribute to the likelihood and in whether the likelihood is conditional upon the total number of individuals in the sample. Each of these has an analogue for the NB. Here, we propose a new formulation of the PLN and NB that includes the number of unobserved species as one of the estimated parameters. We investigate the performance of parameter estimates obtained from this reformulation, as well as the existing alternatives, for drawing inferences about the shape of species-abundance distributions and for estimating species richness.
Methods: We simulate the random sampling of a fixed number of individuals from lognormal and gamma community relative-abundance distributions, using a previously developed 'individual-based' bootstrap algorithm. We use a range of sample sizes, community species-richness levels and shape parameters for the species-abundance distributions that span much of the realistic range for empirical data, generating 1,000 simulated data sets for each parameter combination. We then fit each of the alternative likelihoods to each of the simulated data sets, and we assess the bias, sampling variance and estimation error for each method.
Important findings: Parameter estimates behave reasonably well for most parameter values, exhibiting modest levels of median error. However, for the NB, median error becomes extremely large as the NB approaches either of two limiting cases. For both the NB and PLN, >90% of the variation in the error in model parameters across parameter sets is explained by three quantities that correspond to the proportion of species not observed in the sample, the expected number of species observed in the sample and the discrepancy between the true NB or PLN distribution and a Poisson distribution with the same mean. There are relatively few systematic differences between the four alternative likelihoods. In particular, failing to condition the likelihood on the total sample size does not appear to systematically increase the bias in parameter estimates. Indeed, overall, the classical likelihood performs slightly better than the alternatives. However, our reparameterized likelihood, for which species richness is a fitted parameter, has important advantages over existing approaches for estimating species richness from fitted species-abundance models.
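The individual-based sampling step in the Methods can be sketched as follows (a simplification of the authors' bootstrap; the lognormal shape parameter, richness and sample size are illustrative):

```python
import math
import random

def observed_species(richness, n_individuals, sigma, rng):
    """Draw a fixed number of individuals from a lognormal community
    relative-abundance distribution; return how many distinct species
    appear in the sample."""
    # Unnormalized lognormal relative abundances; choices() normalizes weights.
    weights = [math.exp(rng.gauss(0.0, sigma)) for _ in range(richness)]
    sample = rng.choices(range(richness), weights=weights, k=n_individuals)
    return len(set(sample))

rng = random.Random(1)
obs = [observed_species(100, 50, 1.0, rng) for _ in range(200)]
mean_obs = sum(obs) / len(obs)
print(round(mean_obs, 1))   # well below the true richness of 100
```

The gap between observed and true richness is exactly the quantity that the paper's unobserved-species parameter is meant to recover.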

15.
A family of trivariate binomial mixtures with respect to their exponent parameter is introduced and its structure is studied by the use of probability generating functions. Expressions for probabilities, factorial moments and factorial cumulants are given. Conditional distributions are also examined. Illustrative examples include the trivariate Poisson, binomial, negative binomial and modified logarithmic series distributions. In addition, properties of the compounded trivariate Poisson distribution are discussed. Finally biological, medical and ecological applications are indicated.

16.
In this paper we study the effect of selection procedures on certain parameters of the distribution function (d.f.) of a quantitative characteristic X (such as the mean and percentiles) in successive generations, when the d.f. is governed by a single locus. We study the changes in gene frequencies under the truncation and genotype selection procedures by obtaining approximations to the gene frequencies, since exact expressions are not available. Using these approximations, we compute the selection differentials of X for different values of n, the number of generations. We also obtain the limiting distributions as n → ∞ and compute the number of generations required for the above parameters of the d.f. to come close enough to their limiting values.
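Genotype (viability) selection at a single diallelic locus follows the standard gene-frequency recursion p' = p(p·w_AA + q·w_Aa)/w̄. A minimal sketch of iterating it over generations (the textbook recursion with illustrative fitnesses, not the paper's approximations):

```python
def next_freq(p, w_AA, w_Aa, w_aa):
    """One generation of viability selection at a single diallelic locus."""
    q = 1 - p
    w_bar = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa  # mean fitness
    return p * (p * w_AA + q * w_Aa) / w_bar

def iterate(p0, n, w=(1.0, 0.9, 0.8)):
    # Default fitnesses are illustrative additive directional selection for A.
    p = p0
    for _ in range(n):
        p = next_freq(p, *w)
    return p

print(round(iterate(0.1, 200), 4))   # allele A approaches fixation
```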

17.
Promotion time models have recently been adapted to the context of infectious diseases to take into account discrete and multiple exposures. However, a Poisson distribution for the number of pathogens transmitted at each exposure is a very strong assumption and does not allow for inter-individual heterogeneity. The Bernoulli, negative binomial and compound Poisson distributions are proposed as alternatives to the Poisson distribution for the promotion time model with time-changing exposure; all are derived within the frailty model framework, and all have a point mass at zero to account for non-infected people. The Bernoulli distribution, i.e. the two-component cure rate model, is extended to multiple exposures. Unlike the negative binomial and compound Poisson distributions, the Bernoulli distribution does not make it possible to connect the number of pathogens transmitted to the delay between transmission and detection of infection; moreover, only the two former distributions can account for inter-individual heterogeneity. The delay to surgical site infection serves as an example of single exposure: the probability of infection is very low, so estimates of the effect of selected risk factors on that probability obtained with the Bernoulli and Poisson distributions are very close. The delay to nosocomial urinary tract infection is a multiple-exposure example, in which the probabilities of pathogen transmission during catheter placement and catheter presence are estimated. Here inter-individual heterogeneity is very high, and the fit is better with the compound Poisson and negative binomial distributions. The proposed models also prove to be mechanistic; the negative binomial and compound Poisson distributions are useful alternatives for accounting for inter-individual heterogeneity.
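In the promotion-time framework, the population survival function is the probability-generating function of the pathogen count evaluated at 1 − F(t), where F is the activation-time cdf. A numeric sketch comparing three of the distributions discussed, with all parameter values illustrative assumptions (not the paper's estimates); the three models below are parameterized to share the same cure fraction e^(−θ):

```python
import math

def surv_poisson(t, theta, F):
    # Poisson pathogen count: S(t) = exp(-theta * F(t)); cure fraction exp(-theta)
    return math.exp(-theta * F(t))

def surv_bernoulli(t, pi_cure, F):
    # Two-component cure model: cured w.p. pi_cure, else event time ~ F
    return pi_cure + (1 - pi_cure) * (1 - F(t))

def surv_negbin(t, r, p, F):
    # N ~ NB(r, p) pathogens: S(t) = pgf of N at 1 - F(t); cure fraction p**r
    z = 1 - F(t)
    return (p / (1 - (1 - p) * z)) ** r

F = lambda t: 1 - math.exp(-0.3 * t)   # illustrative exponential activation cdf
theta = 0.5
pi = math.exp(-theta)                  # Bernoulli cure fraction matched to Poisson
r, p = 0.25, math.exp(-theta / 0.25)   # NB with p**r = exp(-theta)
for t in (0.0, 1.0, 5.0, 50.0):
    print(t, surv_poisson(t, theta, F), surv_bernoulli(t, pi, F), surv_negbin(t, r, p, F))
```

All three curves start at 1 and share the plateau e^(−θ), but they drop at different rates in between, which is where the heterogeneity assumptions differ.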

18.
Three models are presented that describe the aggregation of objects into groups and the distributions of group sizes and group numbers within habitats. The processes considered are pure accumulation processes that involve only the formation and invasion of groups. Invasion represents the special case of fusion in which only single objects, not groups, join a group of a certain size. The basic model is governed by a single parameter, the formation probability q, which is the probability that an object forms a new group. A novel, discrete and finite distribution of group sizes is deduced from this aggregation process, and it is shown to converge to a geometric distribution as the number of objects tends to infinity. Two extensions of this model, both of which converge to the Waring distribution, are added: the model can be extended either with a beta-distributed formation probability or with the assumption that the invasion probability depends on the group size. Relationships between the limiting distributions involved are discussed.
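A simulation sketch of such a formation/invasion accumulation process. Note the invasion rule here (the invading object picks an existing group uniformly at random) is an assumption for illustration; the paper's basic model is defined analytically and may weight targets differently:

```python
import random

def aggregate(n_objects, q, rng):
    """Sequential aggregation: each arriving object founds a new group with
    probability q, otherwise it invades an existing group (chosen uniformly
    at random here -- an assumed invasion rule)."""
    groups = []
    for _ in range(n_objects):
        if not groups or rng.random() < q:
            groups.append(1)                           # formation
        else:
            groups[rng.randrange(len(groups))] += 1    # invasion
    return groups

rng = random.Random(42)
sizes = aggregate(100000, 0.25, rng)
mean_size = sum(sizes) / len(sizes)
print(len(sizes), round(mean_size, 2))   # mean group size near 1/q = 4
```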

19.
We begin with a review of the areas of application of the signed-rank tests (SRTs) and conclude that the results are exact only if there are no ties among non-null differences. In order to apply the SRTs according to Wilcoxon and according to Pratt in the presence of ties as well, by assigning midranks, we derive their null distributions. The null distributions for the problem without ties are obtained as special cases. To save the practising statistician the time-consuming calculation of the distribution functions, we compute tables of critical values (for reasons of volume, these are published only as part of the reprints). For N0 = 0(1)5 null differences and M = 1(1)10 non-null differences, the critical values of all distributions with all possible tie vectors are calculated. Instructions are provided, and an example illustrates the use of the table. Extensions of the tables are obtained by means of counting formulas given in the text. Approximations are provided to make the tests applicable to larger samples as well. It is shown that approximating the null distribution in the presence of ties by the null distribution under the assumption of no ties sometimes overstates and sometimes understates the exact rejection probability. For N0 = 0(1)10 and M = 1(1)10, all distributions with all possible tie vectors for the SRTs with Wilcoxon and Pratt ranking are examined with respect to the lattice type of the test statistic; the result is given in Table 6. It is evident that the proportion of Pratt distributions with lattice character decreases as the number of null differences increases. Continuity corrections are obtained for the asymptotic normal distribution that take into account the lattice character of the distribution of the test statistic.
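For small M, the exact null distribution of the signed-rank statistic with midranks can be obtained by enumerating all 2^M sign vectors. This brute-force sketch stands in for the paper's counting formulas (Wilcoxon ranking only; the Pratt treatment of null differences is not reproduced):

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def midranks(abs_diffs):
    """Midranks of absolute differences (average rank within each tie group)."""
    s = sorted(abs_diffs)
    rank_of = {}
    i = 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1
        rank_of[s[i]] = (i + 1 + j) / 2   # average of ranks i+1 .. j
        i = j
    return [rank_of[d] for d in abs_diffs]

def signed_rank_null(ranks):
    """Exact null distribution of W+ (sum of ranks of positive differences):
    each sign is + or - with probability 1/2, independently."""
    dist = Counter()
    for signs in product((0, 1), repeat=len(ranks)):
        dist[sum(r for r, s in zip(ranks, signs) if s)] += 1
    total = 2 ** len(ranks)
    return {w: Fraction(c, total) for w, c in sorted(dist.items())}

# Tied absolute differences 1, 1, 2 receive midranks 1.5, 1.5, 3
mr = midranks([1, 1, 2])
dist = signed_rank_null(mr)
print(dist)
```

With the midranks (1.5, 1.5, 3), the statistic takes values 0, 1.5, 3, 4.5, 6, and the lattice of attainable values is coarser than in the tie-free case, which is the phenomenon the paper's continuity corrections address.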

20.
Unusual probability distribution profiles, including transient multi-peak distributions, have been observed in computer simulations of cell signaling dynamics. The emergence of these complex distributions cannot be explained using either deterministic chemical kinetics or a simple Gaussian noise approximation. To develop physical insights into the origin of complex distributions in stochastic cell signaling, we compared our approximate analytical solutions of signaling dynamics with exact numerical simulations. Our results are based on studying signaling in 2-step and 3-step enzyme amplification cascades, which are among the most common building blocks of cellular protein signaling networks. We have found that while the multi-peak distributions are typically transient, eventually evolving into single-peak distributions, in certain cases they may be stable in the limit of long times. We have also shown that introducing positive feedback loops reduces the complexity of the probability distributions.
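The exact numerical simulations referred to above are typically done with the Gillespie stochastic simulation algorithm. A 2-step amplification-cascade sketch with illustrative rate constants (assumed, not the paper's parameter values):

```python
import random

def gillespie_cascade(t_end, rng, a1=1.0, d1=0.5, a2=2.0, d2=0.5):
    """Gillespie simulation of a 2-step amplification cascade:
    0 -> X1 at rate a1,        X1 -> 0 at rate d1*x1,
    X1 -> X1 + X2 at rate a2*x1,  X2 -> 0 at rate d2*x2."""
    t, x1, x2 = 0.0, 0, 0
    while True:
        rates = [a1, d1 * x1, a2 * x1, d2 * x2]
        total = sum(rates)
        t += rng.expovariate(total)
        if t > t_end:
            return x1, x2
        u = rng.random() * total
        if u < rates[0]:
            x1 += 1
        elif u < rates[0] + rates[1]:
            x1 -= 1
        elif u < rates[0] + rates[1] + rates[2]:
            x2 += 1
        else:
            x2 -= 1

rng = random.Random(7)
samples = [gillespie_cascade(20.0, rng) for _ in range(500)]
mean_x1 = sum(x1 for x1, _ in samples) / len(samples)   # steady state a1/d1 = 2
mean_x2 = sum(x2 for _, x2 in samples) / len(samples)   # (a2/d2)*E[X1] = 8
print(round(mean_x1, 2), round(mean_x2, 2))
```

Collecting many such end-state samples gives an empirical distribution of X2; the amplified species carries more than Poisson noise, which is the kind of non-Gaussian behaviour the abstract discusses.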


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号