Similar Literature
Found 20 similar articles (search time: 23 ms)
1.
The output process of an infinite-server queue with a Poisson process input is observed starting at time 0 with an empty queue. It is assumed that the service time distribution is known. This article discusses statistical inference about the input intensity. A controversial issue in the study of multiple sclerosis is addressed as a motivation for the model and methods developed.
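
A minimal sketch of one way such inference could proceed, not necessarily the article's method. It relies on the standard result that departures from an infinite-server queue with Poisson input that starts empty form a nonhomogeneous Poisson process with rate lambda * G(t), where G is the service-time CDF. The maximum likelihood estimate of lambda from departures observed on [0, T] is then the departure count divided by the integral of G over [0, T]. The function names and the exponential service distribution are illustrative assumptions.

```python
import math


def mle_input_intensity(departure_times, horizon, service_cdf, steps=10_000):
    """Estimate the Poisson input intensity from departures seen on [0, horizon].

    Assumes the queue starts empty at time 0 and service_cdf is the known
    service-time CDF G, so departures have rate lambda * G(t).
    """
    # Trapezoidal rule for the integral of G over [0, horizon].
    h = horizon / steps
    total = 0.5 * (service_cdf(0.0) + service_cdf(horizon))
    for k in range(1, steps):
        total += service_cdf(k * h)
    integral = total * h

    # Count only departures inside the observation window.
    n = sum(1 for t in departure_times if 0.0 <= t <= horizon)
    return n / integral


def exponential_cdf(rate):
    """CDF of an exponential service time (an assumed example distribution)."""
    return lambda t: 1.0 - math.exp(-rate * t) if t > 0 else 0.0
```

For example, `mle_input_intensity(times, 10.0, exponential_cdf(1.0))` divides the number of observed departures by roughly 9, because early departures are rare while the queue is still filling.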

2.
The calculation of multipoint likelihoods is computationally challenging, with the exact calculation of multipoint probabilities only possible on small pedigrees with many markers or large pedigrees with few markers. This paper explores the utility of calculating multipoint likelihoods using data on markers flanking a hypothesized position of the trait locus. The calculation of such likelihoods is often feasible, even on large pedigrees with missing data and complex structures. Performance characteristics of the flanking marker procedure are assessed through the calculation of multipoint heterogeneity LOD scores on data simulated for Genetic Analysis Workshop 14 (GAW14). Analysis is restricted to data on the Aipotu population on chromosomes 1, 3, and 4, where chromosomes 1 and 3 are known to contain disease loci. The flanking marker procedure performs well, even when missing data and genotyping errors are introduced.

3.
Centromeric-mapping methods have been used to investigate the association between altered recombination and meiotic nondisjunction in humans. For trisomies, current methods are based on the genotypes from a trisomic offspring and both parents. Because it is sometimes difficult to obtain samples from both parents and because the ability to use sources of DNA previously not available (e.g., stored paraffin-embedded pathological samples) has increased, we have been interested in creating similar maps for trisomic populations in which one of the parents of the trisomic individual is unavailable for genotyping. In this paper, we derive multipoint likelihoods for both missing-parent data and conventional two-parent data. We find that likelihoods for two-parent data and for data generated without a sample from the correctly disjoining parent can be maximized in exactly the same way but also that missing-parent data has a high frequency of partial data of the same sort produced by intercross matings. Previously published centromeric-mapping methods use incorrect likelihoods for intercross matings and thus can perform poorly on missing-parent data. We wrote a FORTRAN program to maximize our multipoint likelihoods and used it in simulation studies to demonstrate the biases in the previous methods.

4.
Efficiency and robustness of pedigree segregation analysis.   Total citations: 18 (self-citations: 13; citations by others: 5)
Different pedigree structures and likelihoods are examined to determine their efficiency for parameter estimation under one-locus models. For the cases simulated, family size has little effect; estimates based on unconditional likelihoods are generally more efficient than those based on conditional likelihoods. The proposed method of pedigree analysis under a one-locus model is found to be robust in the analysis of nuclear families: skewness of the data and polygenic inheritance will not lead to the spurious detection of major loci unless they occur simultaneously, and together with a moderate amount of environmental correlation among sibs.

5.
A concept for the application of complex pedigree analysis to multilocus DNA fingerprinting is described. By following this approach, the extent to which the DNA fingerprints of grandparents influence the phenotype likelihoods of their offspring was determined. It was demonstrated by simulation that approximately 90% of paternity disputes can be solved if mother, child, and paternal grandparents, instead of the putative father, are tested. If only phenotype information on a single paternal sib is allowed for, true paternity will be detected with reasonable persuasive power in up to 64% of cases. Exclusion of false paternity remains possible for 40% of cases. Finally, the analysis concept is modified by reducing the number of genotype variations considered in likelihood computations. This time-saving procedure is shown to yield sufficiently accurate likelihoods in the analysis of both simulation data and multilocus DNA fingerprints obtained in two large families.

6.
Wang J. Genetics, 2012, 191(1): 183-194
Quite a few methods have been proposed to infer sibship and parentage among individuals from their multilocus marker genotypes. They are all based on Mendelian laws either qualitatively (exclusion methods) or quantitatively (likelihood methods), have different optimization criteria, and use different algorithms in searching for the optimal solution. The full-likelihood method assigns sibship and parentage relationships among all sampled individuals jointly. It is by far the most accurate method, but is computationally prohibitive for large data sets with many individuals and many loci. In this article I propose a new likelihood-based method that is computationally efficient enough to handle large data sets. The method uses the sum of the log likelihoods of pairwise relationships in a configuration as the score to measure its plausibility, where log likelihoods of pairwise relationships are calculated only once and stored for repeated use. By analyzing several empirical and many simulated data sets, I show that the new method is more accurate than pairwise likelihood and exclusion-based methods, but is slightly less accurate than the full-likelihood method. However, the new method is computationally much more efficient than the full-likelihood method, and for the cases of both sexes polygamous and markers with genotyping errors, it can be several orders of magnitude faster. The new method can handle a large sample with thousands of individuals, with the number of markers limited only by the computer memory.
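
The scoring idea in this abstract (sum the pairwise log likelihoods in a configuration, computing each pairwise value only once) can be sketched as follows. This is an illustrative sketch, not the author's implementation: `make_scorer`, `toy_loglik`, and the numeric values are hypothetical, and the genotype-based likelihood calculation itself is not shown.

```python
from functools import lru_cache


def make_scorer(pair_loglik):
    """Build a configuration scorer around a pairwise log-likelihood function.

    pair_loglik(i, j, relationship) is a hypothetical user-supplied function;
    the real genotype-based calculation is not shown here.
    """

    @lru_cache(maxsize=None)
    def cached(i, j, relationship):
        # Each (pair, relationship) value is computed once, then reused.
        return pair_loglik(i, j, relationship)

    def score(configuration):
        # configuration maps a pair (i, j) with i < j to a relationship label.
        return sum(cached(i, j, rel) for (i, j), rel in configuration.items())

    return score, cached


# Toy stand-in for a genotype-based log likelihood (made-up values).
TOY_VALUES = {"full_sib": -1.0, "half_sib": -1.5, "unrelated": -2.0}
calls = []


def toy_loglik(i, j, relationship):
    calls.append((i, j, relationship))  # record how often we really compute
    return TOY_VALUES[relationship]


score, cached = make_scorer(toy_loglik)
config_a = {(0, 1): "full_sib", (0, 2): "unrelated", (1, 2): "unrelated"}
config_b = {(0, 1): "half_sib", (0, 2): "unrelated", (1, 2): "unrelated"}
```

Scoring `config_a` and then `config_b` triggers only one new pairwise computation for the second configuration, since the two configurations share all but one entry. That reuse is what makes repeated scoring cheap during a search over configurations.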

7.
In the present paper, techniques of genealogy reconstruction based on genetic likelihoods of parent-offspring relationships are explored. Previous applications of such techniques have involved human populations, with emphasis placed on identification of parent pairs followed by reconstruction of families. In natural populations, this approach is neither practical nor necessarily a realistic representation of population structure. It is proposed that for natural populations emphasis should be placed first on locating the most likely mothers and fathers for a given individual, then seeking the most likely pair among that subset of genetically possible parents. Thus the genealogy is ultimately represented as a set of genotype triplets consisting of each individual coupled with its mother and father. Mathematical analyses show a strong positive correlation between single parent and parent pair likelihoods within triplets; this result is corroborated by statistical investigation of data from a natural plant population. Therefore the practice of constructing parent pairs using only likely single parents is justifiable on statistical grounds.
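
The two-stage search described above (shortlist likely single parents, then pick the most likely pair among them) might look like the sketch below. All names are hypothetical, the likelihood functions are assumed to be supplied by the caller, and the `top_k` cutoff is an illustrative choice that the abstract does not specify.

```python
import math


def most_likely_triplet(child, mothers, fathers, single_loglik, pair_loglik, top_k=3):
    """Return (child, mother, father), or None if no possible pair exists.

    single_loglik(child, parent) and pair_loglik(child, mother, father) are
    hypothetical callables; -inf marks a genetically impossible relationship.
    """

    def shortlist(candidates):
        # Stage 1: keep genetically possible parents, best single likelihood first.
        possible = [p for p in candidates if single_loglik(child, p) > -math.inf]
        possible.sort(key=lambda p: single_loglik(child, p), reverse=True)
        return possible[:top_k]

    # Stage 2: evaluate pairs only within the shortlisted subset.
    pairs = [(m, f) for m in shortlist(mothers) for f in shortlist(fathers)]
    pairs = [(m, f) for m, f in pairs if pair_loglik(child, m, f) > -math.inf]
    if not pairs:
        return None
    mother, father = max(pairs, key=lambda mf: pair_loglik(child, mf[0], mf[1]))
    return (child, mother, father)
```

Restricting stage 2 to the shortlist keeps the number of pair evaluations at most `top_k * top_k` per individual instead of one per possible mother-father combination.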

8.
Cook RJ, Farewell VT. Biometrics, 1999, 55(1): 284-288
We highlight a feature of likelihood-based methods that provides flexibility in model formulation and inference. In particular, overall likelihoods that consist of likelihood contributions with different forms are considered. The particular forms may be predetermined by design criteria or may be selected based on features of the data. Inferences based on such mixed-form likelihoods are valid provided standard regularity conditions hold and the parameters of interest have the same interpretation in the various forms. The advantages of constructing overall likelihoods in this way are illustrated by applications involving the analysis of 2 x 2 x K tables and left-censored water quality data.

9.
The analysis of linkage data with multiple markers is a complex problem, only soluble with the help of computer programs. A package of programs is presented which allows the analysis of X-linked data with reasonable speed. Published data relating Duchenne muscular dystrophy and seven X-specific DNA probes are analysed. The results presented are the maximum likelihoods of the eight possible orders and the interlocus distances of the most likely order. Also calculated are the mean and standard deviation of risk for selected cases with information derived from both single probes and pairs of probes bridging the disease locus.

10.
On extended pedigrees with extensive missing data, the calculation of multilocus likelihoods for linkage analysis is often beyond the computational bounds of exact methods. Growing interest therefore surrounds the implementation of Monte Carlo estimation methods. In this paper, we demonstrate the speed and accuracy of a new Markov chain Monte Carlo method for the estimation of linkage likelihoods through an analysis of real data from a study of early-onset Alzheimer's disease. For those data sets where comparison with exact analysis is possible, we achieved up to a 100-fold increase in speed. Our approach is implemented in the program lm_bayes within the framework of the freely available MORGAN 2.6 package for Monte Carlo genetic analysis (http://www.stat.washington.edu/thompson/Genepi/MORGAN/Morgan.shtml).

11.
I here consider the question of when to formulate a likelihood over the whole data set, as opposed to conditioning the likelihood on subsets of the data (i.e., joint vs. conditional likelihoods). I show that when certain conditions are met, these two likelihoods are guaranteed to be equivalent, and thus that it is generally preferable to condition on subsets, since that likelihood is mathematically and computationally simpler. However, I show that when these conditions are not met, conditioning on subsets of the data is equivalent to introducing additional df into our genetic model, df that we may not have been aware of. I discuss the implications of these facts for ascertainment corrections and other genetic problems.

12.
Gene mapping and genetic epidemiology require large-scale computation of likelihoods based on human pedigree data. Although computation of such likelihoods has become increasingly sophisticated, fast calculations are still impeded by complex pedigree structures, by models with many underlying loci and by missing observations on key family members. The current paper introduces a new method of array factorization that substantially accelerates linkage calculations with large numbers of markers. This method is not limited to nuclear families or to families with complete phenotyping. Vectorization and parallelization are two general-purpose hardware techniques for accelerating computations. These techniques can assist in the rapid calculation of genetic likelihoods. We describe our experience using both of these methods with the existing program MENDEL. A vectorized version of MENDEL was run on an IBM 3090 supercomputer. A parallelized version of MENDEL was run on parallel machines of different architectures and on a network of workstations. Applying these revised versions of MENDEL to two challenging linkage problems yields substantial improvements in computational speed.

13.
Using an extension of a statistical model given by E. Lander and M. Waterman, we define the a posteriori probability of a clone ordering based upon oligonucleotide hybridization data. We give algorithms for computing the likelihood of a clone ordering and for finding a clone ordering of maximum likelihood. The dynamic programming algorithm for computing likelihoods runs in time O(mnc), where m is the number of oligonucleotide probes, n is the number of clones, and c is the coverage of the clone library. We use the Expectation-Maximization technique to maximize likelihoods.

14.
Selective genotyping (i.e., genotyping only those individuals with extreme phenotypes) can greatly improve the power to detect and map quantitative trait loci in genetic association studies. Because selection depends on the phenotype, the resulting data cannot be properly analyzed by standard statistical methods. We provide appropriate likelihoods for assessing the effects of genotypes and haplotypes on quantitative traits under selective-genotyping designs. We demonstrate that the likelihood-based methods are highly effective in identifying causal variants and are substantially more powerful than existing methods.

15.
Leung Lai T, Shih MC, Wong SP. Biometrics, 2006, 62(1): 159-167
To circumvent the computational complexity of likelihood inference in generalized mixed models that assume linear or more general additive regression models of covariate effects, Laplace's approximations to multiple integrals in the likelihood have been commonly used without addressing the issue of adequacy of the approximations for individuals with sparse observations. In this article, we propose a hybrid estimation scheme to address this issue. The likelihoods for subjects with sparse observations use Monte Carlo approximations involving importance sampling, while Laplace's approximation is used for the likelihoods of other subjects that satisfy a certain diagnostic check on the adequacy of Laplace's approximation. Because of its computational tractability, the proposed approach allows flexible modeling of covariate effects by using regression splines and model selection procedures for knot and variable selection. Its computational and statistical advantages are illustrated by simulation and by application to longitudinal data from a fecundity study of fruit flies, for which overdispersion is modeled via a double exponential family.

16.
We present an approach to integrate physical properties of DNA, such as DNA bendability or GC content, into our probabilistic promoter recognition system McPROMOTER. In the new model, a promoter is represented as a sequence of consecutive segments represented by joint likelihoods for DNA sequence and profiles of physical properties. Sequence likelihoods are modeled with interpolated Markov chains, physical properties with Gaussian distributions. The background uses two joint sequence/profile models for coding and non-coding sequences, each consisting of a mixture of a sense and an anti-sense submodel. On a large Drosophila test set, we achieved a reduction of about 30% of false positives when compared with a model solely based on sequence likelihoods.

17.
Although it is widely agreed that data from multiple sources are necessary to confidently resolve phylogenetic relationships, procedures for accommodating and incorporating heterogeneity in such data remain underdeveloped. We explored the use of partitioned, model-based analyses of heterogeneous molecular data in the context of a phylogenetic study of swallowtail butterflies (Lepidoptera: Papilionidae). Despite substantial basic and applied study, phylogenetic relationships among the major lineages of this prominent group remain contentious. We sequenced 3.3 kb of mitochondrial and nuclear DNA (2.3 kb of cytochrome oxidase I and II and 1.0 kb of elongation factor-1 alpha, respectively) from 22 swallowtails, including representatives of Baroniinae, Parnassiinae, and Papilioninae, and from several moth and butterfly outgroups. Using parsimony, we encountered considerable difficulty in resolving the deepest splits among these taxa. We therefore chose two outgroups with undisputed relationships to each other and to Papilionidae and undertook detailed likelihood analyses of alternative topologies. Following from previous studies that have demonstrated substantial heterogeneity in the evolutionary dynamics among process partitions of these genes, we estimated evolutionary parameters separately for gene-based and codon-based partitions. These values were then used as the basis for examining the likelihoods of possible resolutions and rootings under several partitioned and unpartitioned likelihood models. Partitioned models gave markedly better fits to the data than did unpartitioned models and supported different topologies. However, the most likely topology varied from model to model. The most likely ingroup topology under the best-fitting, six-partition GTR + gamma model favors a paraphyletic Parnassiinae. However, when examining the likelihoods of alternative rootings of this tree relative to rootings of the classical hypothesis, two rootings of the latter emerge as most likely. Of these two, the most likely rooting is within the Papilioninae, although a rooting between Baronia and the remaining Papilionidae is only nonsignificantly less likely.

18.
In wireless sensor networks, when a sensor node detects events in the surrounding environment, the sensing period for learning detailed information is likely to be short. However, the short sensing cycle increases the data traffic of the sensor nodes in a routing path. Since the high traffic load causes a data queue overflow in the sensor nodes, important information about urgent events could be lost. In addition, since the battery energy of the sensor nodes is quickly exhausted, the entire lifetime of wireless sensor networks would be shortened. In this paper, to address these problems, a new routing protocol is proposed based on a lightweight genetic algorithm. In the proposed method, the sensor nodes are aware of the data traffic rate to monitor the network congestion. In addition, the fitness function is designed from both the average and the standard deviation of the traffic rates of sensor nodes. Based on dominant gene sets in a genetic algorithm, the proposed method selects suitable data forwarding sensor nodes to avoid heavy traffic congestion. In experiments, the proposed method demonstrates efficient data transmission due to much less queue overflow and supports fair data transmission for all sensor nodes. From the results, it is evident that the proposed method not only enhances the reliability of data transmission but also distributes the energy consumption across wireless sensor networks.
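
The abstract states only that the fitness function combines the average and the standard deviation of node traffic rates; the exact formula is not given. The sketch below shows one plausible form under that assumption: `route_fitness`, the weights `alpha` and `beta`, and the reciprocal shape are all hypothetical choices, not the paper's definition.

```python
from statistics import mean, pstdev


def route_fitness(traffic_rates, alpha=1.0, beta=1.0):
    """Score a candidate set of forwarding nodes; higher is better.

    traffic_rates holds one traffic rate per node on the candidate path.
    A low average means little congestion; a low standard deviation means
    the load is spread evenly. alpha and beta are assumed weights.
    """
    penalty = alpha * mean(traffic_rates) + beta * pstdev(traffic_rates)
    return 1.0 / (1.0 + penalty)
```

With this form, two paths that carry the same average load are ranked by how evenly that load is spread, which matches the abstract's stated goals of avoiding congestion and distributing energy consumption.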

19.
The ant subfamily Formicinae is a large assemblage (2458 species; J. Nat. Hist. 29 (1995) 1037), including species that weave leaf nests together with larval silk and in which the metapleural gland, the ancestrally defining ant character, has been secondarily lost. We used sequences from two mitochondrial genes (cytochrome b and cytochrome oxidase 2) from 18 formicine and 4 outgroup taxa to derive a robust phylogeny, employing a search for tree islands using 10,000 randomly constructed trees as starting points and deriving a maximum likelihood consensus tree from the ML tree and those not significantly different from it. Non-parametric bootstrapping showed that the ML consensus tree fit the data significantly better than three scenarios based on morphology, with that of Bolton (Identification Guide to the Ant Genera of the World, Harvard University Press, Cambridge, MA) being the best among these alternative trees. Trait mapping showed that weaving had arisen at least four times and possibly been lost once. A maximum likelihood analysis showed that loss of the metapleural gland is significantly associated with the weaver life-pattern. The graph of the frequencies with which trees were discovered versus their likelihood indicates that trees with high likelihoods have much larger basins of attraction than those with lower likelihoods. While this result indicates that single searches are more likely to find high- than low-likelihood tree islands, it also indicates that searching only for the single best tree may lose important information.

20.
Genome-scale metabolic models provide a powerful means to harness information from genomes to deepen biological insights. With exponentially increasing sequencing capacity, there is an enormous need for automated reconstruction techniques that can provide more accurate models in a short time frame. Current methods for automated metabolic network reconstruction rely on gene and reaction annotations to build draft metabolic networks and algorithms to fill gaps in these networks. However, automated reconstruction is hampered by database inconsistencies, incorrect annotations, and gap filling largely without considering genomic information. Here we develop an approach for applying genomic information to predict alternative functions for genes and estimate their likelihoods from sequence homology. We show that computed likelihood values were significantly higher for annotations found in manually curated metabolic networks than those that were not. We then apply these alternative functional predictions to estimate reaction likelihoods, which are used in a new gap filling approach called likelihood-based gap filling to predict more genomically consistent solutions. To validate the likelihood-based gap filling approach, we applied it to models where essential pathways were removed, finding that likelihood-based gap filling identified more biologically relevant solutions than parsimony-based gap filling approaches. We also demonstrate that models gap filled using likelihood-based gap filling provide greater coverage and genomic consistency with metabolic gene functions compared to parsimony-based approaches. Interestingly, despite these findings, we found that likelihoods did not significantly affect consistency of gap filled models with Biolog and knockout lethality data. This indicates that the phenotype data alone cannot necessarily be used to discriminate between alternative solutions for gap filling and therefore, that the use of other information is necessary to obtain a more accurate network. All described workflows are implemented as part of the DOE Systems Biology Knowledgebase (KBase) and are publicly available via API or command-line web interface.
