Similar literature
 Found 20 similar documents; search took 15 ms
1.
DNA and protein sequence comparisons are performed by a number of computational algorithms. Most of these algorithms search for the alignment of two sequences that optimizes some alignment score. It is an important problem to assess the statistical significance of a given score. In this paper we use newly developed methods for Poisson approximation to derive estimates of the statistical significance of k-word matches on a diagonal of a sequence comparison. We require at least q of the k letters of the words to match, where 0 < q ≤ k. The distribution of the number of matches on a diagonal is approximated, as well as the distribution of the order statistics of the sizes of clumps of matches on the diagonal. These methods provide an easily computed approximation of the distribution of the longest exact matching word between sequences. The methods are validated using comparisons of vertebrate and E. coli protein sequences. In addition, we compare two HLA class II transplantation antigens by this method and contrast the results with a dynamic programming approach. Several open problems are outlined in the last section. This work was supported by grants DMS 90-05833 from NSF and GM 36230 from NIH.
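
As a rough illustration of the idea in item 1, the sketch below covers only the simplest case: exact matches (q = k) on a single diagonal of length n, assuming each position matches independently with probability p. Under those assumptions, the number of clumps of at least k consecutive matches is approximately Poisson with mean λ = p^k((n − k)(1 − p) + 1), so the probability that the longest exact match is shorter than k is about exp(−λ). The function names and the example values of n, p, and k are hypothetical and not taken from the paper; an exact dynamic-programming calculation is included only as a check on the approximation.

```python
import math


def poisson_no_match_prob(n: int, p: float, k: int) -> float:
    """Poisson approximation to P(no run of k consecutive matches in n positions).

    Assumes independent positions, each matching with probability p.
    """
    if k > n:
        return 1.0
    # A clump starts at position 1 (probability p**k) or at a later position
    # that is preceded by a mismatch (probability (1 - p) * p**k).
    lam = p ** k * ((n - k) * (1.0 - p) + 1.0)
    return math.exp(-lam)


def exact_no_match_prob(n: int, p: float, k: int) -> float:
    """Exact P(no run of k consecutive matches), by dynamic programming.

    state[j] is the probability that the current run has length j (j < k)
    and that no run of length k has occurred so far.
    """
    state = [0.0] * k
    state[0] = 1.0
    for _ in range(n):
        new = [0.0] * k
        new[0] = (1.0 - p) * sum(state)   # a mismatch resets the run
        for j in range(k - 1):
            new[j + 1] = p * state[j]     # a match extends the run
        # Mass leaving state[k - 1] on a match has reached length k and is dropped.
        state = new
    return sum(state)


if __name__ == "__main__":
    n, p, k = 200, 0.25, 5  # illustrative values only
    print("Poisson approximation:", poisson_no_match_prob(n, p, k))
    print("Exact (DP):           ", exact_no_match_prob(n, p, k))
```

The approximation is closest when p^k is small, that is, when long matches are rare; the paper's treatment of inexact matches (q < k) is not covered by this sketch.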

2.
This paper considers some approximations for the Borel-Tanner (Generalized Poisson) sums by using (i) Gram-Charlier Poisson expansion, (ii) Mixture of two Poisson distributions, (iii) Variance stabilizing technique, and (iv) negative binomial distribution. It has been found that the approximation obtained by using the negative binomial distribution seems to be more efficient than the other approximations.

3.
Assessing the exceptionality of network motifs.   Cited by: 1 (self-citations: 0, citations by others: 1)
Obtaining and analyzing biological interaction networks is at the core of systems biology. To help understand these complex networks, many recent works have suggested focusing on motifs which occur more frequently than expected at random. To identify such exceptional motifs in a given network, we propose a statistical and analytical method which does not require any simulation. For this, we first provide an analytical expression of the mean and variance of the count under any exchangeable random graph model. Then we approximate the motif count distribution by a compound Poisson distribution whose parameters are derived from the mean and variance of the count. Thanks to simulations, we show that the compound Poisson approximation outperforms the Gaussian approximation. The compound Poisson distribution can then be used to get an approximate p-value and to decide if an observed count is significantly high or not. Our methodology is applied on protein-protein interaction (PPI) networks, and statistical issues related to exceptional motif detection are discussed.
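
To make the moment-matching step in item 3 concrete, here is a minimal sketch that assumes one particular compound Poisson family, the geometric-Poisson (Pólya-Aeppli) distribution: a Poisson(λ) number of clumps, each of geometric size with P(size = j) = (1 − a)a^(j−1). The abstract does not say which family is used, so this choice, the function names, and the example numbers are all assumptions. For this family the mean is λ/(1 − a) and the variance is λ(1 + a)/(1 − a)², which can be inverted to obtain (λ, a) from a given mean and variance.

```python
import math


def fit_geometric_poisson(mean: float, variance: float) -> tuple[float, float]:
    """Match (lam, a) of a geometric-Poisson law to a given mean and variance."""
    if mean <= 0 or variance < mean:
        raise ValueError("need mean > 0 and variance >= mean (overdispersion)")
    d = variance / mean            # dispersion index, equals (1 + a) / (1 - a)
    a = (d - 1.0) / (d + 1.0)
    lam = mean * (1.0 - a)
    return lam, a


def geometric_poisson_pmf(lam: float, a: float, n_max: int) -> list[float]:
    """Return P(N = 0), ..., P(N = n_max) using the Panjer recursion
    for a compound Poisson sum: p_n = (lam / n) * sum_j j * f_j * p_{n-j}."""
    pmf = [math.exp(-lam)]
    for n in range(1, n_max + 1):
        total = 0.0
        for j in range(1, n + 1):
            clump = (1.0 - a) * a ** (j - 1)   # P(clump size = j)
            total += j * clump * pmf[n - j]
        pmf.append(lam / n * total)
    return pmf


def upper_tail_pvalue(observed: int, mean: float, variance: float) -> float:
    """Approximate P(N >= observed) under the fitted geometric-Poisson law."""
    if observed <= 0:
        return 1.0
    lam, a = fit_geometric_poisson(mean, variance)
    pmf = geometric_poisson_pmf(lam, a, observed - 1)
    return max(0.0, 1.0 - sum(pmf))


if __name__ == "__main__":
    # Illustrative numbers only: expected count 4, variance 12, observed count 15.
    print(fit_geometric_poisson(4.0, 12.0))
    print(upper_tail_pvalue(15, 4.0, 12.0))
```

When the variance equals the mean, a = 0 and the sketch reduces to an ordinary Poisson tail probability. The recursion is quadratic in the observed count, which is adequate for a sketch but not tuned for very large counts.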

4.
Arora N, Bashford D. Proteins 2001, 43(1): 12-27
In calculations involving many displacements of an interacting pair of biomolecules, such as Brownian dynamics, the docking of a substrate/ligand to an enzyme/receptor, or the screening of a large number of ligands as prospective inhibitors for a particular receptor site, there is a need for rapid evaluation of the desolvation penalties of the interacting pair. Although continuum electrostatic treatments with distinct dielectric constants for solute and solvent provide an account of the electrostatics of solvation and desolvation, it is necessary to re-solve the Poisson equation, at considerable computational cost, for each displacement of the interacting pair. We present a new method that uses a formulation of continuum electrostatic solvation in terms of the solvation energy density and approximates desolvation in terms of the occlusion of this density. We call it the SEDO approximation. It avoids the need to re-solve the Poisson equation, as desolvation is now estimated by an integral over the occluded volume. Test calculations are presented for some simple model systems and for some real systems that have previously been studied using the Poisson equation approach: MHC class I protein-peptide complexes and a congeneric series of human immunodeficiency virus type 1 (HIV-1) protease-ligand complexes. For most of the systems considered, the trends and magnitudes of the desolvation component of interaction energies obtained using the SEDO approximation are in reasonable correlation with those obtained by re-solving the Poisson equation. In most cases, the error introduced by the SEDO approximation is much less than that of the often-used test-charge approximation for the charge-charge components of intermolecular interactions. Proteins 2001;43:12-27.

5.
Tobias' repair-misrepair (RMR) model of cell survival is formulated as a Markov process, a sequence of discrete repair steps occurring at random times, and the probability of a sequence of viable repairs is calculated. The Markov formulation describes the time evolution of the probability distribution for the number of lesions in a cell. The probability of cell survival is calculated from the distribution of the initial number of lesions and the probabilities of the repair events. The production of lesions is formulated in accordance with the principles of microdosimetry, and the distribution of the initial number of lesions is obtained as an approximation for high and low linear energy transfer cases. The Markov formulation of the RMR model uses the same biological hypotheses as the original version with two statistical approximations deleted. These approximations are the neglect of the effect of statistical fluctuations in calculating the average rate of repair of lesions and the assumption that the final number of unrepaired and lethally misrepaired lesions has a Poisson distribution. The quantitative effect of these approximations is calculated, and a basis is provided for an alternative approach to calculating survival probabilities.

6.
Objectives: The assumption that nuclear decays are governed by Poisson statistics is an approximation. This approximation becomes unjustified when data acquisition times longer than or even comparable with the half-lives of the radioisotope in the sample are considered. In this work, the limits of the Poisson-statistics approximation are investigated. Methods: The formalism for the statistics of radioactive decay based on the binomial distribution is derived. The theoretical factor describing the deviation of the variance of the number of decays predicted by the Poisson distribution from the true variance is defined and investigated for several commonly used radiotracers such as 18F, 15O, 82Rb, 13N, 99mTc, 123I, and 201Tl. Results: The variance of the number of decays estimated using the Poisson distribution is significantly different from the true variance for a 5-minute observation time of 11C, 15O, 13N, and 82Rb. Conclusions: Durations of nuclear medicine studies often are relatively long; they may be even a few times longer than the half-lives of some short-lived radiotracers. Our study shows that in such situations Poisson statistics are unsuitable and should not be applied to describe the statistics of the number of decays in radioactive samples. However, the above statement does not directly apply to counting statistics at the level of event detection. Low sensitivities of detectors which are used in imaging studies make the Poisson approximation near perfect.
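
The binomial argument in item 6 can be worked through in a few lines. If each of N0 nuclei decays independently during an observation time T with probability p = 1 − 2^(−T/T½), the number of decays is Binomial(N0, p), with variance N0·p·(1 − p), whereas the Poisson model assigns variance N0·p. The ratio of the two is 1/(1 − p) = 2^(T/T½), independent of N0. The sketch below evaluates that ratio; it is not necessarily the exact factor defined in the paper, and the half-lives are approximate textbook values supplied here as assumptions, not figures from the abstract.

```python
# Approximate half-lives in minutes (assumed values, not taken from the abstract).
HALF_LIFE_MIN = {
    "18F": 109.8,
    "11C": 20.4,
    "13N": 9.97,
    "15O": 2.04,
    "82Rb": 1.27,
}


def decay_probability(t_min: float, half_life_min: float) -> float:
    """Probability that a single nucleus decays within t_min minutes."""
    return 1.0 - 2.0 ** (-t_min / half_life_min)


def poisson_to_binomial_variance_ratio(t_min: float, half_life_min: float) -> float:
    """Ratio of the Poisson variance N0*p to the binomial variance N0*p*(1 - p)."""
    p = decay_probability(t_min, half_life_min)
    return 1.0 / (1.0 - p)


if __name__ == "__main__":
    observation_min = 5.0  # the 5-minute window mentioned in the abstract
    for isotope, half_life in HALF_LIFE_MIN.items():
        ratio = poisson_to_binomial_variance_ratio(observation_min, half_life)
        print(f"{isotope}: Poisson variance / binomial variance = {ratio:.2f}")
```

For an observation time equal to one half-life the Poisson variance is twice the binomial variance; for long-lived tracers such as 18F over 5 minutes the ratio stays close to 1.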

7.
We present numerical solutions for the one-dimensional Nernst-Planck and Poisson system of equations for steady-state electrodiffusion. Commonly used approximate solutions to these equations invoke assumptions of local electroneutrality (Planck approximation) or constant electric field (Goldman approximation). Calculations were performed to test the ranges over which these approximate theories are valid. For a dilutional junction of a 1:1 electrolyte, separated from adjoining perfectly stirred solutions by sharp boundaries, the Planck approximation is valid for values of κ_d L greater than 10, where 1/κ_d is the Debye length of the more dilute solution. The Goldman approximation is valid for κ_c L less than 0.1, where 1/κ_c is the Debye length of the more concentrated solution. These results suggest that the modeling of electrodiffusive flows in and near membrane ion channels may require numerical solutions of this set of equations rather than the use of either limiting case.

8.
It is shown that any discrete distribution with non-negative support has a representation in terms of an extended Poisson process (or pure birth process). A particular extension of the simple Poisson process is proposed: one that admits a variety of distributions; the equations for such processes may be readily solved numerically. An analytical approximation for the solution is given, leading to approximate mean-variance relationships. The resulting distributions are then applied to analyses of some biological data-sets.

9.
A spherical glow discharge with a pointlike anode is considered in a self-consistent drift-diffusion approximation. The model includes the time-dependent continuity equations for ions and electrons in the drift-diffusion approximation and Poisson’s equation for the radial electric field. In finding steady-state distributions, Ohm’s law is used to relate the discharge voltage and discharge current. Steady-state distributions of the plasma parameters across the discharge gap, current-voltage characteristics, and cathode characteristics for an abnormal spherical discharge in molecular nitrogen are obtained. In a subnormal glow-discharge regime, oscillations in the conduction current, potential, and other discharge parameters are revealed. Similar regimes are also observed in conventional discharges in tubes.

10.
This paper discusses the limit distribution of the number of rare mutants in a mutation process. The result of the paper generalizes that of [1]. The author also discusses a compound Poisson approximation theorem for a kind of random sum.

11.
The use of an approximation to the median of the Poisson distribution to represent each occurrence of mutations in a growing clone permits the prediction of the number of mutants per clone without the limitations imposed by more heuristic expressions. Its application to the evaluation of mutation rates yields results comparable to those obtained by fluctuation analysis.

12.
A theorem for Poisson convergence on realizations of two-dimensional Branching Random Walks with an underlying continuous time Markov Branching Process is proved. This result can be used to gain an approximation for the number of cells having sustained a certain deficiency after a long time in multistage carcinogenesis.

13.
Time-saving procedures unifying Monte Carlo and self-consistent field approaches for the calculation of equilibrium potentials and density distributions of mobile ions around a polyion in a polyelectrolyte system are considered. In the final version of the method the region around the polyion is divided into two zones, internal and external; all the ions of the internal zone are accounted for explicitly in a Monte Carlo procedure, while in the external zone the self-consistent field approximation is applied with an exchange of ions between regions. Simulations are carried out for cylindrical and spherical polyions in solutions with mono- and divalent ions and their mixtures. The results are compared with the Poisson-Boltzmann approximation and experimental data on intrinsic viscosity.

14.
A PCR primer sequence is called degenerate if some of its positions have several possible bases. The degeneracy of the primer is the number of unique sequence combinations it contains. We study the problem of designing a pair of primers with prescribed degeneracy that match a maximum number of given input sequences. Such problems occur when studying a family of genes that is known only in part, or is known in a related species. We prove that various simplified versions of the problem are hard, show the polynomiality of some restricted cases, and develop approximation algorithms for one variant. Based on these algorithms, we implemented a program called HYDEN for designing highly-degenerate primers for a set of genomic sequences. We report on the success of the program in an experimental scheme for identifying all human olfactory receptor (OR) genes. In that project, HYDEN was used to design primers with degeneracies up to 10^10 that amplified with high specificity many novel genes of that family, tripling the number of OR genes known at the time.
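
The definition of degeneracy in item 14 can be illustrated with a short sketch. It assumes primers are written with the standard IUPAC nucleotide ambiguity codes, so the degeneracy is the product over positions of the number of bases each code allows. The function names and example primers are hypothetical, and this is only the counting and matching step, not the HYDEN design algorithm.

```python
import itertools
import math

# Standard IUPAC nucleotide ambiguity codes and the bases each one allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct plain sequences a degenerate primer represents."""
    return math.prod(len(IUPAC[code]) for code in primer.upper())


def expand(primer: str) -> list[str]:
    """Enumerate every plain sequence; only practical for small degeneracy."""
    choices = [IUPAC[code] for code in primer.upper()]
    return ["".join(bases) for bases in itertools.product(*choices)]


def matches(primer: str, target: str) -> bool:
    """True if the plain sequence `target` is one of the primer's sequences."""
    if len(primer) != len(target):
        return False
    return all(base in IUPAC[code]
               for code, base in zip(primer.upper(), target.upper()))


if __name__ == "__main__":
    primer = "ACRYN"  # illustrative primer: R = A/G, Y = C/T, N = any base
    print(degeneracy(primer))        # 1 * 1 * 2 * 2 * 4
    print(matches(primer, "ACGTA"))
```

Degeneracies as large as those reported in the abstract cannot be enumerated explicitly, which is why `degeneracy` multiplies the per-position counts instead of calling `expand`.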

15.
A generalization of an earlier paper (Capocelli and Ricciardi, 1971), dealing with a diffusion approximation for a neuron subject to one excitatory and one inhibitory Poisson input, is provided by not imposing any restrictions on the number and magnitude of synaptic inputs. An equation for the neuron's transition p.d.f. is derived, use of which is made to determine the moments of the membrane potential. It is finally shown that a diffusion approximation is possible and that the resulting diffusion process is characterized by constant infinitesimal variance and linear drift.

16.
In the malaria model of Dietz, Molineaux, and Thomas [Bull. WHO 50:347–357 (1974)] the inoculation rate depends on a pseudoequilibrium approximation to a differential equation describing mosquito dynamics. By biasing a key parameter, the approximation can match the predictions of the differential equation; with fixed parameters, the approximation sometimes predicts qualitatively different disease behavior than does its approximand. The model's recovery rate depends on an approximation to a full time-dependent formulation of Macdonald's superinfection hypothesis. Judged by the ability to fit data, the approximation performs better than its approximand. Alternative implementations of the model yield significantly different estimates of scientifically meaningful parameters.

17.
18.
The site-frequency spectrum, representing the distribution of allele frequencies at a set of polymorphic sites, is a commonly used summary statistic in population genetics. Explicit forms of the spectrum are known for both models with and without selection if independence among sites is assumed. The availability of these explicit forms has allowed for maximum likelihood estimation of selection, developed first in the Poisson random field model of Sawyer and Hartl, which is now the primary method for estimating selection directly from DNA sequence data. The independence assumption, which amounts to assuming free recombination between sites, is, however, a limiting case for many population genetics models. Here, we extend the site-frequency spectrum theory to consider the case where the sites are completely linked. We use diffusion approximation to calculate the joint distribution of the allele frequencies of linked sites for models without selection and for models with equal coefficient selection. The joint distribution is derived by first constructing Green’s functions corresponding to multiallele diffusion equations. We show that the site-frequency spectrum is highly correlated between frequencies that are complementary (i.e., sum to 1), and the correlation is significantly elevated by positive selection. The results presented here can be used to extend the Poisson random field to allow for estimating selection for correlated sites. More generally, the Green’s function construction should be able to aid in studying the genetic drift of multiple alleles in other cases.

19.
Jung BC, Jhun M, Lee JW. Biometrics 2005, 61(2): 626-628
Ridout, Hinde, and Demétrio (2001, Biometrics 57, 219-223) derived a score test for testing a zero-inflated Poisson (ZIP) regression model against zero-inflated negative binomial (ZINB) alternatives. They mentioned that the score test using the normal approximation might underestimate the nominal significance level, possibly in small-sample cases. To remedy this problem, a parametric bootstrap method is proposed. It is shown that the bootstrap method keeps the significance level close to the nominal one and has uniformly greater power than the existing normal approximation for testing the hypothesis.

20.
Chao L, Rang CU, Wong LE. Journal of Virology 2002, 76(7): 3276-3281
When a parent virus replicates inside its host, it must first use its own genome as the template for replication. However, once progeny genomes are produced, the progeny can in turn act as templates. Depending on whether the progeny genomes become templates, the distribution of mutants produced by an infection varies greatly. While information on the distribution is important for many population genetic models, it is also useful for inferring the replication mode of a virus. We have analyzed the distribution of mutants emerging from single bursts in the RNA bacteriophage phi6 and find that the distribution closely matches a Poisson distribution. The match suggests that replication in this bacteriophage is effectively by a stamping machine model in which the parental genome is the main template used for replication. However, because the distribution deviates slightly from a Poisson distribution, the stamping machine is not perfect and some progeny genomes must replicate. By fitting our data to a replication model in which the progeny genomes become replicative at a given rate or probability per round of replication, we estimated the rate to be very low and on the order of 10^-4. We discuss whether different replication modes may confer an adaptive advantage to viruses.

