Similar articles
 Found 20 similar articles (search time: 93 ms)
1.
Allele-sharing models: LOD scores and accurate linkage tests.   Total citations: 40 (self-citations: 16; citations by others: 24)
Starting with a test statistic for linkage analysis based on allele sharing, we propose an associated one-parameter model. Under general missing-data patterns, this model allows exact calculation of likelihood ratios and LOD scores and has been implemented by a simple modification of existing software. Most important, accurate linkage tests can be performed. Using an example, we show that some previously suggested approaches to handling less than perfectly informative data can be unacceptably conservative. Situations in which this model may not perform well are discussed, and an alternative model that requires additional computations is suggested.

2.
Maximum likelihood haplotyping for general pedigrees   Total citations: 3 (self-citations: 0; citations by others: 3)
Haplotype data is valuable in mapping disease-susceptibility genes in the study of Mendelian and complex diseases. We present algorithms for inferring a most likely haplotype configuration for general pedigrees, implemented in the newest version of the genetic linkage analysis system SUPERLINK. In SUPERLINK, genetic linkage analysis problems are represented internally using Bayesian networks. The use of Bayesian networks enables efficient maximum likelihood haplotyping for more complex pedigrees than was previously possible. Furthermore, to support efficient haplotyping for larger pedigrees, we have also incorporated a novel algorithm for determining a better elimination order for the variables of the Bayesian network. The presented optimization algorithm also improves likelihood computations. We present experimental results for the new algorithms on a variety of real and semiartificial data sets, and use our software to evaluate MCMC approximations for haplotyping.

3.
A mixture model approach is presented for the mapping of one or more quantitative trait loci (QTLs) in complex populations. In order to exploit the full power of complete linkage maps, the simultaneous likelihood of phenotype and a multilocus (all markers and putative QTLs) genotype is computed. Maximum likelihood estimation in our mixture models is implemented via an Expectation-Maximization algorithm: exact, stochastic or Monte Carlo EM by using a simple and flexible Gibbs sampler. Parameters include allele frequencies of markers and QTLs, discrete or normal effects of biallelic or multiallelic QTLs, and homogeneous or heterogeneous residual variances. As an illustration, a dairy cattle data set consisting of twenty half-sib families has been reanalyzed. We discuss the potential which our and other approaches have for realistic multiple-QTL analyses in complex populations.

4.
For pedigrees with multiple loops, exact likelihoods cannot be computed in an acceptable time frame, and thus approximate methods are used. Some of these methods are based on breaking loops and approximating complex pedigree likelihoods using the exact likelihood of the corresponding zero-loop pedigree. Because loops are ignored, this method results in a loss of genetic information and a decrease in the power to detect linkage. To minimize this loss, an optimal set of loop breakers has to be selected. In this paper, we present a graph-theory-based algorithm for automatic selection of an optimal set of loop breakers. We propose using the total relationship between measured pedigree members as a proxy for power. To minimize the loss of genetic information, we suggest selecting those breakers whose duplication in a pedigree is accompanied by a minimal loss of total relationship between measured pedigree members. We show that our algorithm compares favorably with other existing loop-breaker selection algorithms in terms of conservation of genetic information, statistical power and CPU time of subsequent linkage analysis. We implemented our method in a software package LOOP_EDGE, which is available at http://mga.bionet.nsc.ru/nlru/.

5.
There are three assumptions of independence or conditional independence that underlie linkage likelihood computations on sets of related individuals. The first is the independence of meioses, which gives rise to the conditional independence of haplotypes carried by offspring, given those of their parents. The second derives from the assumption of absence of genetic interference, which gives rise to the conditional independence of inheritance vectors, given the inheritance vector at an intermediate location. The third is the assumption of independence of allelic types, at the population level, both among haplotypes of unrelated individuals and also over the loci along a given haplotype. These three assumptions have been integral to likelihood computations since the first lod scores were computed, and remain key components in analysis of modern genetic data. In this paper we trace the role of these assumptions through the history of linkage likelihood computation, through to a new framework of genetic linkage analysis in the era of dense genomic marker data.

6.
On estimating the heterozygosity and polymorphism information content value   Total citations: 1 (self-citations: 0; citations by others: 1)
The polymorphism information content (PIC) value is commonly used in genetics as a measure of polymorphism for a marker locus used in linkage analysis. In this communication we have derived the uniformly minimum variance unbiased estimator of PIC along with its exact variance. We have also calculated the exact variance of the maximum likelihood estimator of PIC, which is asymptotically an unbiased estimator. In order to find this variance we have derived a recursive formula to calculate the moments of any polynomial in a set of variables that are multinomially distributed.
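
For orientation, the quantities discussed in this abstract can be sketched as follows. This is a minimal illustration of the standard definitions of heterozygosity, H = 1 - Σ p_i², and of PIC, H - Σ_{i<j} 2 p_i² p_j², evaluated at sample allele frequencies (the plug-in, maximum likelihood estimate). It is not the unbiased estimator derived in the paper, and the function names are hypothetical.

```python
def allele_frequencies(counts):
    """Sample allele frequencies from a list of allele counts (assumes a non-empty sample)."""
    total = sum(counts)
    return [c / total for c in counts]


def heterozygosity(freqs):
    """Expected heterozygosity: H = 1 - sum_i p_i^2."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content:
    PIC = 1 - sum_i p_i^2 - sum_{i<j} 2 * p_i^2 * p_j^2.
    """
    n = len(freqs)
    cross = sum(
        2.0 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )
    return heterozygosity(freqs) - cross


if __name__ == "__main__":
    # Illustrative counts only, not data from the paper.
    p = allele_frequencies([30, 10])
    print(heterozygosity(p), pic(p))
```

Because the plug-in value is computed from estimated frequencies, it is biased in small samples, which is the issue the abstract addresses.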

7.
In an effort to accelerate likelihood computations on pedigrees, Lange and Goradia defined a genotype-elimination algorithm that aims to identify those genotypes that need not be considered during the likelihood computation. For pedigrees without loops, they showed that their algorithm was optimal, in the sense that it identified all genotypes that lead to a Mendelian inconsistency. Their algorithm, however, is not optimal for pedigrees with loops, which continue to pose daunting computational challenges. We present here a simple extension of the Lange-Goradia algorithm that we prove is optimal on pedigrees with loops, and we give examples of how our new algorithm can be used to detect genotyping errors. We also introduce a more efficient and faster algorithm for carrying out the fundamental step in the Lange-Goradia algorithm, namely genotype elimination within a nuclear family. Finally, we improve a common algorithm for computing the likelihood of a pedigree with multiple loops. This algorithm breaks each loop by duplicating a person in that loop and then carrying out a separate likelihood calculation for each vector of possible genotypes of the loop breakers. This algorithm, however, does unnecessary computations when the loop-breaker vector is inconsistent. In this paper we present a new recursive loop breaker-elimination algorithm that solves this problem and illustrate its effectiveness on a pedigree with six loops.

8.
MOTIVATION: Genetic linkage analysis is a useful statistical tool for mapping disease genes and for associating functionality of genes with their location on the chromosome. There is a need for a program that computes multipoint likelihood on general pedigrees with many markers and that also deals with two-locus disease models. RESULTS: In this paper we present algorithms for performing exact multipoint likelihood calculations on general pedigrees with a large number of highly polymorphic markers, taking into account a variety of disease models. We have implemented these algorithms in a new computer program called SUPERLINK, which outperforms leading linkage software with regard to functionality, speed, memory requirements and extensibility.

9.
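The EM algorithm is a general method for maximum likelihood estimation of parameters from incomplete data. This paper derives EM algorithms for estimating the genetic recombination fraction between two loci under different marker types, covering three patterns (codominant-codominant, codominant-dominant and dominant-dominant), together with a bootstrap method for obtaining the sampling variance of the recombination fraction, and extends them to recombination fraction estimation when some individuals have missing marker genotypes (no electrophoretic band detected). Extensive Monte Carlo simulations show that: (1) when linkage is tight, sample size has little effect on the estimate of the recombination fraction, whereas when linkage is loose, a larger sample is needed to detect linkage and to estimate the recombination fraction reasonably precisely; (2) estimating the recombination fraction from all individuals, including those with missing markers, is more accurate than using only the individuals without missing markers, and can significantly increase the statistical power of linkage detection.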
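
To make the idea concrete, here is a minimal sketch of an EM iteration for the codominant-codominant case only. It assumes an F2 population derived from a coupling-phase (AB/ab) F1, which the abstract does not specify, and it is not the authors' implementation. Under that assumption every F2 genotype class reveals the number of recombinant gametes it carries (0, 1 or 2), except the double heterozygote AaBb, which arises either from two parental gametes or from two recombinant gametes. The function name and argument layout are hypothetical.

```python
def em_recombination_f2(n0, n1, n2, n_dh, r=0.25, tol=1e-12, max_iter=10000):
    """EM estimate of the recombination fraction r between two codominant
    markers in an F2 from a coupling-phase F1 (an assumed design).

    n0   -- individuals with 0 recombinant gametes (AABB, aabb)
    n1   -- individuals with 1 recombinant gamete (AABb, AaBB, Aabb, aaBb)
    n2   -- individuals with 2 recombinant gametes (AAbb, aaBB)
    n_dh -- double heterozygotes (AaBb), whose gamete origin is ambiguous
    Assumes at least one individual in total.
    """
    n_total = n0 + n1 + n2 + n_dh
    for _ in range(max_iter):
        # E-step: probability that a double heterozygote carries two
        # recombinant gametes rather than two parental gametes.
        w = r * r / ((1.0 - r) ** 2 + r * r)
        expected_recombinants = n1 + 2.0 * n2 + 2.0 * w * n_dh
        # M-step: share of recombinants among the 2 * n_total gametes.
        r_new = expected_recombinants / (2.0 * n_total)
        if abs(r_new - r) < tol:
            return r_new
        r = r_new
    return r


if __name__ == "__main__":
    # Illustrative counts only (expected class sizes for r = 0.1, N = 1000).
    print(em_recombination_f2(405, 180, 5, 410))
```

Dominant markers merge further genotype classes, so their E-step needs more terms; the bootstrap variance mentioned in the abstract could be approximated by resampling individuals and rerunning this function.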

10.
Statistical packages for constructing genetic linkage maps in inbred lines are well developed and applied extensively, while linkage analysis in outcrossing species faces some statistical challenges because of their complicated genetic structures. In this article, we present a multilocus linkage analysis via hidden Markov models for a linkage group of markers in a full-sib family. The advantage of this method is the simultaneous estimation of the recombination fractions between adjacent markers that possibly segregate in different ratios, and the calculation of likelihood for a certain order of the markers. When the number of markers decreases to two or three, the multilocus linkage analysis becomes traditional two-point or three-point linkage analysis, respectively. Monte Carlo simulations are performed to show that the recombination fraction estimates of multilocus linkage analysis are more accurate than those just using two-point linkage analysis and that the likelihood as an objective function for ordering marker loci is the most powerful method compared with other methods. By incorporating this multilocus linkage analysis, we have developed a Windows software package, FsLinkageMap, for constructing genetic maps in a full-sib family. A real example is presented for illustrating linkage maps constructed by using mixed segregation markers. Our multilocus linkage analysis provides a powerful method for constructing high-density genetic linkage maps in some outcrossing plant species, especially in forest trees.

11.
Markov chain Monte Carlo (MCMC) has recently gained use as a method of estimating required probability and likelihood functions in pedigree analysis, when exact computation is impractical. However, when a multiallelic locus is involved, irreducibility of the constructed Markov chain, an essential requirement of the MCMC method, may fail. Solutions proposed by several researchers, which do not identify all the noncommunicating sets of genotypic configurations, are inefficient with highly polymorphic loci. This is a particularly serious problem in linkage analysis, because highly polymorphic markers are much more informative and thus are preferred. In the present paper, we describe an algorithm that finds all the noncommunicating classes of genotypic configurations on any pedigree. This leads to a more efficient method of defining an irreducible Markov chain. Examples, including a pedigree from a genetic study of familial Alzheimer disease, are used to illustrate how the algorithm works and how penetrances are modified for specific individuals to ensure irreducibility.

12.
Incorporating genotypes of relatives into a test of linkage disequilibrium.   Total citations: 3 (self-citations: 0; citations by others: 3)
Genetic data from autosomal loci in diploids generally consist of genotype data for which no phase information is available, making it difficult to implement a test of linkage disequilibrium. In this paper, we describe a test of linkage disequilibrium based on an empirical null distribution of the likelihood of a sample. Information on the genotypes of related individuals is explicitly used to help reconstruct the gametic phase of the independent individuals. Simulation studies show that the present approach improves on estimates of linkage disequilibrium gathered from samples of completely independent individuals but only if some offspring are sampled together with their parents. The failure to incorporate some parents sharply decreases the sensitivity and accuracy of the test. Simulations also show that for multiallelic data (more than two alleles) our testing procedure is not as powerful as an exact test based on known haplotype frequencies, owing to the interaction between departure from Hardy-Weinberg equilibrium and linkage disequilibrium.

13.
This paper provides an alternative to Albert's (1991, Biometrics 47, 1371-1381) approximation to the E-step when using the EM algorithm for parameter estimation in Markov mixture models. Use of a recursive algorithm of Baum et al. (1970, Annals of Mathematical Statistics 41, 164-171) results in exact evaluation of the likelihood, optimal parameter estimates, and very efficient computation. Applications to time series of seizure counts and fetal movements clearly show the advantages of this exact approach.
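
The exact likelihood evaluation mentioned here can be illustrated with the forward recursion for a hidden Markov chain. The sketch below assumes Poisson-distributed counts in each hidden state, a plausible reading for count series but an assumption on our part; the function names and parameter layout are hypothetical. The forward vector is rescaled at each step to avoid numerical underflow, and the log-likelihood is the sum of the log scaling constants.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of observing count k with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def forward_loglik(obs, init, trans, rates):
    """Exact log-likelihood of a count series under a Markov mixture
    (hidden Markov) model with Poisson emissions (an assumed emission model).

    obs   -- non-empty sequence of non-negative integer counts
    init  -- initial state probabilities
    trans -- trans[i][j] = P(next state j | current state i)
    rates -- Poisson rate for each hidden state
    """
    n_states = len(init)
    alpha = [init[s] * poisson_pmf(obs[0], rates[s]) for s in range(n_states)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the chain, then weight by the emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n_states))
                * poisson_pmf(obs[t], rates[j])
                for j in range(n_states)
            ]
        scale = sum(alpha)
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik


if __name__ == "__main__":
    # Illustrative two-state model and counts, not data from the paper.
    counts = [0, 1, 5, 7, 0, 2]
    print(forward_loglik(counts, [0.5, 0.5], [[0.9, 0.1], [0.2, 0.8]], [1.0, 6.0]))
```

The cost is linear in the length of the series and quadratic in the number of states, which is why the exact E-step remains inexpensive.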

14.
Accurate parameter estimation of allometric equations is a question of considerable interest. Various techniques that address this problem exist. In this paper it is assumed that the measured values are normally distributed and a maximum likelihood estimation approach is used. The computations involved in this procedure are reducible to relatively simple forms, and an efficient numerical algorithm is used. A listing of the computer program is included as an appendix.

15.
Mapping in forest trees generally relies on outbred pedigrees in which genetic segregation is the result of meiotic recombination from both parents. The currently available mapping packages are not optimal for outcrossed pedigrees as they either cannot order phase-ambiguous data or only use pairwise information when ordering loci within linkage groups. A new package, OUTMAP, has been developed for mapping codominant loci in outcrossed trees. A comparison of maps produced using linkage data from two pedigrees of Acacia mangium Willd demonstrated that the marker orders produced using OUTMAP were consistently of higher likelihood than those produced by JOINMAP. In addition, the maps were produced more efficiently, without the need for recoding data or the detailed investigation of pairwise recombination fractions which was necessary to select the optimal marker order using JOINMAP. Distances between markers often varied from those calculated by JOINMAP, resulting in an increase in the estimated genome length. OUTMAP can be used with all segregation types to determine phase and to calculate the likelihood of alternative marker orders, with a choice of three optimisation methods.

16.
OBJECTIVE: When numerous single nucleotide polymorphisms (SNPs) have been identified in a candidate gene, a relevant and still unanswered question is to determine how many and which of these SNPs should be optimally tested to detect an association with the disease. Testing them all is expensive and often unnecessary. Alleles at different SNPs may be associated in the population because of the existence of linkage disequilibrium, so that knowing the alleles carried at one SNP could provide exact or partial knowledge of alleles carried at a second SNP. We present here a method to select the most appropriate subset of SNPs in a candidate gene based on the pairwise linkage disequilibrium between the different SNPs. METHOD: The best subset is identified through power computations performed under different genetic models, assuming that one of the SNPs identified is the disease susceptibility variant. RESULTS: We applied the method on two data sets, an empirical study of the APOE gene region and a simulated study concerning one of the major genes (MG1) from the Genetic Analysis Workshop 12. For these two genes, the sets of SNPs selected were compared to the ones obtained using two other methods that need the reconstruction of multilocus haplotypes in order to identify haplotype-tag SNPs (htSNPs). We showed that with both data sets, our method performed better than the other selection methods.

17.
We apply the method of "blocking Gibbs" sampling to a problem of great importance and complexity: linkage analysis. Blocking Gibbs sampling combines exact local computations with Gibbs sampling, in a way that complements the strengths of both. The method is able to handle problems with very high complexity, such as linkage analysis in large pedigrees with many loops, a task that no other known method is able to handle. New developments of the method are outlined, and it is applied to a highly complex linkage problem in a human pedigree.

18.
We model the recombination process of fungal systems via chromatid exchange in meiosis, which accounts for any type of bivalent configuration in a genetic interval in any specified order of genetic markers, for both random spore and tetrad data. First, a probability model framework is developed for two genes and then generalized for an arbitrary number of genes. Maximum likelihood estimators (MLEs) for both random spore and tetrad data are developed. It is shown that the MLE of recombination for tetrad data is uniformly more efficient than that from random spore data, usually by a factor of at least 4. The MLE for the generalized probability framework is computed using the expectation-maximization (EM) algorithm. Pearson's chi-squared statistic is computed as a measure of goodness of fit using a product-multinomial setup. We implement our model with genetic marker data on the whole genome of Neurospora crassa. Simulated annealing is used to search for the best order of genetic markers for each chromosome, and the goodness of fit value is evaluated for model assumptions. Inferred map orders are corroborated by genomic sequence, with the exception of linkage groups I, II, and V.

19.
An algorithm for automatic genotype elimination.   Total citations: 13 (self-citations: 4; citations by others: 9)
Automatic genotype elimination algorithms for a single locus play a central role in making likelihood computations on human pedigree data feasible. We present a simple algorithm that is fully efficient in pedigrees without loops. This algorithm can be easily coded and has been instrumental in greatly reducing computing times for pedigree analysis. A contrived counter-example demonstrates that some superfluous genotypes cannot be excluded for inbred pedigrees.
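
As a rough illustration of what genotype elimination involves, the sketch below handles one nuclear family at a single autosomal locus. For every pair of candidate parental genotypes it checks whether each child has at least one candidate genotype obtainable from that pair, and it keeps only genotypes that take part in some consistent combination. This is a sketch under our own assumptions, not the published code: genotypes are represented as sorted allele tuples and all names are hypothetical. In a full pedigree this pass would be repeated over all nuclear families until no list shrinks further.

```python
from itertools import product


def child_genotypes(father, mother):
    """Unordered genotypes a child can inherit from the two parental genotypes."""
    return {tuple(sorted((a, b))) for a in father for b in mother}


def eliminate_nuclear_family(father_gs, mother_gs, children_gs):
    """One elimination pass over a single nuclear family.

    father_gs, mother_gs -- iterables of candidate genotypes, e.g. (1, 2)
    children_gs          -- list with one iterable of candidate genotypes per child
    Returns the retained genotype sets (father, mother, list per child).
    """
    keep_father, keep_mother = set(), set()
    keep_children = [set() for _ in children_gs]
    for gf, gm in product(father_gs, mother_gs):
        offspring = child_genotypes(gf, gm)
        compatible = [offspring & set(c) for c in children_gs]
        # The parental pair is consistent only if every child has
        # at least one candidate genotype it could have inherited.
        if all(compatible):
            keep_father.add(gf)
            keep_mother.add(gm)
            for kept, comp in zip(keep_children, compatible):
                kept |= comp
    return keep_father, keep_mother, keep_children


if __name__ == "__main__":
    # Untyped father, mother typed 1/1, child typed 1/2 (illustrative only).
    print(eliminate_nuclear_family([(1, 1), (1, 2), (2, 2)], [(1, 1)], [[(1, 2)]]))
```

In the usage example, the father's candidate list loses genotype 1/1 because he must have transmitted allele 2 to the child.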

20.
Constructing dense genetic linkage maps   Total citations: 4 (self-citations: 0; citations by others: 4)
This paper describes a novel combination of techniques for the construction of dense genetic linkage maps. The construction of such maps is hampered by the occurrence of even small proportions of typing errors. Simulated annealing is used to obtain the best map according to the optimality criterion: the likelihood or the total number of recombination events. Spatial sampling of markers is used to obtain a framework map. The construction of a framework map is essential if the steps used for simulated annealing are required to be simple. For missing-data imputation the Gibbs sampler is used. Map construction using simulated annealing and missing-data imputation are used in an iterative way. In order to obtain some measure of precision of the genetic linkage map obtained, the Metropolis-Hastings algorithm is used to obtain posterior intervals for the positions of markers. The process of map construction is embedded in a framework of pre-mapping and post-mapping diagnostics. The techniques described are illustrated using a practical application. Received: 1 June 2000 / Accepted: 21 September 2000
