Similar literature
Found 20 similar articles; search took 421 ms
1.
A growth model for topological trees is formulated as a generalization of the terminal and segmental growth model. For this parameterized growth model, expressions are derived for the partition probabilities (probabilities of subtree pairs of certain degrees). The probabilities of complete trees are easily derived from these partition probabilities.

2.
This paper aims to identify net and partial-crude probabilities in the competing-risk life table context, by using probabilistic approaches. Five types of lifelength random variables are defined to formulate these nonidentifiable probabilities. General expressions for net and partial-crude probabilities are first derived under independent risks assumptions. Two sets of explicit formulas for estimating the net and partial-crude probabilities are then derived in terms of the identifiable overall and crude probabilities by making the additional assumption of piecewise uniform distribution of the lifelength random variables. A study of the degree to which nonidentifiability can affect the net and partial-crude probabilities in a variety of situations is developed. An example from cross-sectional studies is employed to illustrate the methodology developed.

3.
Multilocus genotype probabilities, estimated using the assumption of independent association of alleles within and across loci, are subject to sampling fluctuation, since allele frequencies used in such computations are derived from samples drawn from a population. We derive exact sampling variances of estimated genotype probabilities and provide a simple approximation of sampling variances. Computer simulations conducted using real DNA typing data indicate that, while the sampling distribution of estimated genotype probabilities is not symmetric around the point estimate, the confidence interval of estimated (single-locus or multilocus) genotype probabilities can be obtained from the sampling of a logarithmic transformation of the estimated values. This, in turn, allows an examination of heterogeneity of estimators derived from data on different reference populations. Applications of this theory to DNA typing data at VNTR loci suggest that use of different reference population data may yield significantly different estimates. However, significant differences generally occur with rare (less than 1 in 40,000) genotype probabilities. Conservative estimates of five-locus DNA profile probabilities are always less than 1 in 1 million in an individual from the United States, irrespective of the racial/ethnic origin.
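
The independence assumption named in this abstract is commonly applied as a product rule. The Python sketch below is a minimal illustration, assuming Hardy-Weinberg proportions within each locus and independence across loci. The function names and allele frequencies are hypothetical and are not taken from the paper, and the sampling variances it derives are not reproduced here.

```python
from math import prod


def single_locus_probability(p_a, p_b=None):
    """Genotype probability at one locus under Hardy-Weinberg proportions.

    One allele frequency means a homozygote (p^2);
    two allele frequencies mean a heterozygote (2pq).
    """
    if p_b is None:
        return p_a ** 2
    return 2.0 * p_a * p_b


def multilocus_probability(loci):
    """Multiply single-locus probabilities, assuming independence across loci."""
    return prod(single_locus_probability(*alleles) for alleles in loci)


# Hypothetical allele frequencies for a three-locus profile (illustration only):
# heterozygote, homozygote, heterozygote.
profile = [(0.1, 0.2), (0.05,), (0.3, 0.1)]
estimate = multilocus_probability(profile)
```

Because the allele frequencies are themselves sample estimates, `estimate` inherits their sampling error, which is the fluctuation the abstract goes on to quantify.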

4.
Teuscher F, Broman KW. Genetics, 2007, 175(3): 1267-1274
Recombinant inbred lines (RIL) derived from multiple inbred strains can serve as a powerful resource for the genetic dissection of complex traits. The use of such multiple-strain RIL requires a detailed knowledge of the haplotype structure in such lines. Broman (2005) derived the two- and three-point haplotype probabilities for 2^n-way RIL; the former required hefty computation to infer the symbolic results, and the latter were strictly numerical. We describe a simpler approach for the calculation of these probabilities, which allowed us to derive the symbolic form of the three-point haplotype probabilities. We also extend the two-point results for the case of additional generations of intermating, including the case of 2^n-way intermated recombinant inbred populations (IRIP).

5.
Amino acid sequences of peptides are often inferred from their amino acid compositions by comparison with homologous peptides of known sequence. The probability is considered that such an approach introduces errors because of balanced double changes, i.e. reciprocal substitutions, between two homologous peptides of identical composition. Formulae are derived for calculating these probabilities as a function of peptide length and evolutionary distance. Because such calculations require too much computer time, however, the probabilities of reciprocal substitutions are estimated by simulating evolutionary changes in peptides. The resulting data indicate that, for many purposes, the possible errors in amino acid sequences partially inferred from amino acid compositions are acceptably small.

6.
Understanding the biology and conducting effective conservation of migratory species requires an understanding of migratory connectivity – the geographic linkages of populations between stages of the annual cycle. Unfortunately, for most species, we are lacking such information. The North American Bird Banding Laboratory (BBL) houses an extensive database of marking, recaptures and recoveries, and such data could provide migratory connectivity information for many species. To date, however, few species have been analyzed for migratory connectivity largely because heterogeneous re‐encounter probabilities make interpretation problematic. We accounted for regional variation in re‐encounter probabilities by borrowing information across species and by using effort covariates on recapture and recovery probabilities in a multistate capture–recapture and recovery model. The effort covariates were derived from recaptures and recoveries of species within the same regions. We estimated the migratory connectivity for three tern species breeding in North America and over‐wintering in the tropics, common (Sterna hirundo), roseate (Sterna dougallii), and Caspian terns (Hydroprogne caspia). For western breeding terns, model‐derived estimates of migratory connectivity differed considerably from those derived directly from the proportions of re‐encounters. Conversely, for eastern breeding terns, estimates were merely refined by the inclusion of re‐encounter probabilities. In general, eastern breeding terns were strongly connected to eastern South America, and western breeding terns were strongly linked to the more western parts of the nonbreeding range under both models. Through simulation, we found this approach is likely useful for many species in the BBL database, although precision improved with higher re‐encounter probabilities and stronger migratory connectivity. 
We describe an approach to deal with the inherent biases in BBL banding and re‐encounter data to demonstrate that this large dataset is a valuable source of information about the migratory connectivity of the birds of North America.

7.
Dupuis JA. Biometrika, 1995, 82(4): 761-772
The Arnason–Schwarz model is usually used for estimating survival and movement probabilities of animal populations from capture-recapture data. The missing data structure of this capture-recapture model is exhibited and summarised via a directed graph representation. Taking advantage of this structure we implement a Gibbs sampling algorithm from which Bayesian estimates and credible intervals for survival and movement probabilities are derived. Convergence of the algorithm is proved using a duality principle. We illustrate our approach through a real example.

8.
9.
Insertions and deletions in a profile hidden Markov model (HMM) are modeled by transition probabilities between insert, delete and match states. These are estimated by combining observed data and prior probabilities. The transition prior probabilities can be defined either ad hoc or by maximum likelihood (ML) estimation. We show that the choice of transition prior greatly affects the HMM's ability to discriminate between true and false hits. HMM discrimination was measured using the HMMER 2.2 package applied to 373 families from Pfam. We measured the discrimination between true members and noise sequences employing various ML transition priors and also systematically scanned the parameter space of ad hoc transition priors. Our results indicate that ML priors produce far from optimal discrimination, and we present an empirically derived prior that considerably decreases the number of misclassifications compared to ML. Most of the difference stems from the probabilities for exiting a delete state. The ML prior, which is unaware of noise sequences, estimates a delete-to-delete probability that is relatively high and does not penalize noise sequences enough for optimal discrimination.

10.
Chao A, Chu W, Hsu CH. Biometrics, 2000, 56(2): 427-433
We consider a capture-recapture model in which capture probabilities vary with time and with behavioral response. Two inference procedures are developed under the assumption that recapture probabilities bear a constant relationship to initial capture probabilities. These two procedures are the maximum likelihood method (both unconditional and conditional types are discussed) and an approach based on optimal estimating functions. The population size estimators derived from the two procedures are shown to be asymptotically equivalent when population size is large enough. The performance and relative merits of various population size estimators for finite cases are discussed. The bootstrap method is suggested for constructing a variance estimator and confidence interval. An example of the deer mouse analyzed in Otis et al. (1978, Wildlife Monographs 62, 93) is given for illustration.

11.
Logistic regression in capture-recapture models (cited 6 times: 1 self-citation, 5 by others)
Alho JM. Biometrics, 1990, 46(3): 623-635
The effect of population heterogeneity in capture-recapture, or dual registration, models is discussed. An estimator of the unknown population size based on a logistic regression model is introduced. The model allows different capture probabilities across individuals and across capture times. The probabilities are estimated from the observed data using conditional maximum likelihood. The resulting population estimator is shown to be consistent and asymptotically normal. A variance estimator under population heterogeneity is derived. The finite-sample properties of the estimators are studied via simulation. An application to Finnish occupational disease registration data is presented.

12.
We evaluate statistical models used in two-hypothesis tests for identifying peptides from tandem mass spectrometry data. The null hypothesis H(0), that a peptide matches a spectrum by chance, requires information on the probability of by-chance matches between peptide fragments and peaks in the spectrum. Likewise, the alternate hypothesis H(A), that the spectrum is due to a particular peptide, requires probabilities that the peptide fragments would indeed be observed if it was the causative agent. We compare models for these probabilities by determining the identification rates produced by the models using an independent data set. The initial models use different probabilities depending on fragment ion type, but uniform probabilities for each ion type across all of the labile bonds along the backbone. More sophisticated models for probabilities under both H(A) and H(0) are introduced that do not assume uniform probabilities for each ion type. In addition, the performance of these models using a standard likelihood model is compared to an information theory approach derived from the likelihood model. Also, a simple but effective model for incorporating peak intensities is described. Finally, a support-vector machine is used to discriminate between correct and incorrect identifications based on multiple characteristics of the scoring functions. The results are shown to reduce the misidentification rate significantly when compared to a benchmark cross-correlation based approach.
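
A two-hypothesis test of this kind is often scored with a log-likelihood ratio. The Python sketch below is a rough illustration of the "initial model" described in the abstract, with one match probability per fragment ion type under each hypothesis. It assumes that fragments match independently. The probabilities, ion types and function name are hypothetical and are not taken from the paper.

```python
from math import log


def log_likelihood_ratio(observations, p_ha, p_h0):
    """Sum per-fragment terms log P(obs | H_A) - log P(obs | H_0).

    observations: iterable of (ion_type, matched) pairs.
    p_ha, p_h0: match probability per ion type under H_A and H_0.
    """
    score = 0.0
    for ion_type, matched in observations:
        pa, p0 = p_ha[ion_type], p_h0[ion_type]
        if matched:
            score += log(pa / p0)
        else:
            score += log((1.0 - pa) / (1.0 - p0))
    return score


# Hypothetical per-ion-type match probabilities (illustration only).
P_HA = {"b": 0.6, "y": 0.7}
P_H0 = {"b": 0.05, "y": 0.05}

# Each tuple is (ion type, whether a peak matched the predicted fragment).
observed = [("b", True), ("y", True), ("b", False), ("y", True)]
score = log_likelihood_ratio(observed, P_HA, P_H0)
```

A positive `score` favours H(A) and a negative one favours H(0). The non-uniform models mentioned in the abstract would replace the two dictionaries with probabilities that also depend on bond position.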

13.
In a competing risks problem where a well-defined population is exposed simultaneously to several causes of death, interest has centered on the estimation of the probability of death from a given cause when one or more other causes have been eliminated. A basic component of all available procedures for estimating these probabilities is the assumption that the several causes of death act independently, an unrealistic assumption in the context of human and animal populations. This article considers the estimation of these probabilities assuming the existence of interdependencies among the various causes of death. A general formula is derived based on a given set of crude probabilities of death as well as the characteristics of the joint distribution of random variables indicating death from the various causes. This formula identifies alternative assumptions, less restrictive than that of independent risks, which may be used for estimation purposes.

14.
The expected values of the probabilities of identity by descent are derived for the circular stepping-stone model. The results are more easily interpreted than those derived previously.

15.
Three bivariate generalizations of the Poisson binomial distribution are introduced. The probabilities, moments, conditional distributions and regression functions for these distributions are obtained in terms of bipartitional polynomials. Recurrences for the probabilities and moments are also given. Parameter estimators are derived using the methods of moments and zero frequencies and the three distributions are fitted to some ecological data.

16.
A procedure for obtaining the theoretical probability of heritability (h^2) estimates from full-sib analysis exceeding unity is described. Probability densities useful in the evaluation of these probabilities are also derived. These are used for obtaining the probabilities for several combinations of sire/dam numbers and three levels of h^2 (0.10, 0.25, 0.50), assuming an additive genetic model and two full-sibs per mating.

17.
A model is derived to estimate the survival probability of a time interval when censorings occur. The time interval is divided into partial intervals in order to obtain the conditional survival probabilities, each of which is a parameter of a binomially distributed random variable. To allow for the dependence between the events in the different intervals these parameters are transformed. Corresponding a priori density functions are formulated regarding both the Bayesian uniform distribution and the special model. The a posteriori density function is derived for the product of the conditional survival probabilities, and formulae for the Bayesian confidence interval and the expectation are given. Lower and upper bounds for the confidence interval and the expectation are derived. Some examples are given to compare the results with other methods.
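
The quantity studied in this abstract is the product of conditional survival probabilities over partial intervals. The Python sketch below computes a plain point estimate of that product from counts, assuming censored subjects have already been removed from the number at risk before each partial interval. It does not reproduce the Bayesian posterior or the confidence bounds described in the abstract. The counts and function name are hypothetical.

```python
def interval_survival(partial_intervals):
    """Multiply the conditional survival fractions of consecutive partial intervals.

    partial_intervals: list of (at_risk, deaths) pairs, one per partial interval.
    """
    survival = 1.0
    for at_risk, deaths in partial_intervals:
        if at_risk <= 0:
            raise ValueError("each partial interval needs at least one subject at risk")
        if not 0 <= deaths <= at_risk:
            raise ValueError("deaths must lie between 0 and the number at risk")
        # Conditional probability of surviving this partial interval.
        survival *= 1.0 - deaths / at_risk
    return survival


# Hypothetical counts; the number at risk drops through both deaths and censoring.
counts = [(100, 10), (80, 8), (60, 6)]
estimate = interval_survival(counts)
```

Here each partial interval has a conditional survival fraction of 0.9, so `estimate` is 0.9 cubed, that is 0.729.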

18.
A mathematical model taking into account the observed diurnal variations in cell kinetics is presented. Its principle is to divide each phase of the cell cycle into a definite number of compartments and to assume time-dependent probabilities of transition from one compartment to the following; general properties of the model are derived. The particular case where the only time-dependent transition probabilities are those corresponding to the G1 phase is studied. A characterization of the joint variations in the percentages of S and M cells is given. The application of the model to interpretation of published experimental data obtained in hamster cheek pouch epithelium is given.

19.
An approximate formula is derived for determination of the sample sizes needed to detect a difference between two binomial probabilities.
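
The abstract does not state its formula. As a point of comparison, the Python sketch below uses the widely taught normal-approximation formula for a two-sided test with equal group sizes, which may differ from the formula derived in the paper. The function name and default values are assumptions.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate per-group sample size to detect p1 != p2 (two-sided test).

    Textbook normal approximation, equal group sizes, no continuity correction.
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2.0
    # Standard deviation terms under the null and under the alternative.
    term_null = z_alpha * sqrt(2.0 * p_bar * (1.0 - p_bar))
    term_alt = z_beta * sqrt(p1 * (1.0 - p1) + p2 * (1.0 - p2))
    return ceil((term_null + term_alt) ** 2 / (p1 - p2) ** 2)


# Hypothetical design: detect 0.6 versus 0.4 at 5% significance and 80% power.
n = sample_size_per_group(0.6, 0.4)
```

The required size grows roughly with the inverse square of the difference, so halving the difference to be detected roughly quadruples `n`.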

20.
The identification of proton contacts from NOE spectra remains the major bottleneck in NMR protein structure calculations. We describe an automated assignment-free system for deriving proton contact probabilities from NOESY peak lists that can be viewed as a quantitative extension of manual assignment techniques. Rather than assigning contacts to NOESY crosspeaks, a rigorous Bayesian methodology is used to transform initial proton contact probabilities derived from a set of 2992 protein structures into posterior probabilities using the observed crosspeaks as evidence. Given a target protein, the Bayesian approach is used to derive probabilities for all possible proton contacts. We evaluated the accuracy of this approach at predicting proton contacts on 60 15N-separated NOESY and 13C-separated NOESY datasets simulated from experimentally determined NMR structures and compared it to CYANA, an established method for proton constraint assignment. On average, at the highest confidence level, our method accurately identifies 3.16/3.17 long range contacts per residue and 12.11/12.18 interresidue proton contacts per residue. These accuracies represent a significant increase over the performance of CYANA on the same data set. On a difficult real dataset that is publicly available, the coverage is lower but our method retains its advantage in accuracy over CANDID/CYANA. The algorithm is publicly available via the Protinfo NMR webserver.
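
The prior-to-posterior step described in this abstract follows Bayes' rule. The Python sketch below shows that rule for a single candidate proton contact and a single observed crosspeak. It is a toy illustration and not the paper's algorithm. The prior, the two likelihood values and the function name are hypothetical, since the abstract does not give the likelihood model.

```python
def posterior_contact_probability(prior, p_peak_given_contact, p_peak_given_no_contact):
    """Bayes' rule for one candidate contact given one observed crosspeak."""
    if not 0.0 <= prior <= 1.0:
        raise ValueError("prior must be a probability")
    numerator = prior * p_peak_given_contact
    evidence = numerator + (1.0 - prior) * p_peak_given_no_contact
    if evidence == 0.0:
        raise ValueError("the observed peak has zero probability under both hypotheses")
    return numerator / evidence


# Hypothetical numbers: a 10% structure-derived prior, and a crosspeak that is
# eight times more likely when the two protons are in contact.
posterior = posterior_contact_probability(0.10, 0.80, 0.10)
```

With these inputs the posterior is 0.08 / 0.17, about 0.47, so one supporting crosspeak raises a 10% prior to nearly even odds.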
