Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
Optimizing exact genetic linkage computations.   Total citations: 3 (self-citations: 0; citations by others: 3)
Genetic linkage analysis is a challenging application which requires Bayesian networks consisting of thousands of vertices. Consequently, computing the probability of data, which is needed for learning linkage parameters, using exact computation procedures calls for an extremely efficient implementation that carefully optimizes the order of conditioning and summation operations. In this paper, we present the use of stochastic greedy algorithms for optimizing this order. Our algorithm has been incorporated into the newest version of SUPERLINK, which is a fast genetic linkage program for exact likelihood computations in general pedigrees. We demonstrate an order of magnitude improvement in run times of likelihood computations using our new optimization algorithm and hence enlarge the class of problems that can be handled effectively by exact computations.
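As an illustration only, the general idea of a stochastic greedy search for an elimination order can be sketched in Python. This is not SUPERLINK's algorithm: the sketch covers only the summation (elimination) order and ignores conditioning, and the function names, the `top_k` and `trials` parameters, and the cost measure (the largest neighbor set met while eliminating) are all assumptions made for the example.

```python
import random


def stochastic_greedy_order(adj, rng, top_k=2):
    """One randomized greedy pass over an undirected graph.

    adj maps each vertex to the set of its neighbors (no self-loops).
    Returns (order, width), where width is the largest neighbor set
    seen at the moment a vertex was eliminated (an assumed cost).
    """
    graph = {v: set(n) for v, n in adj.items()}  # work on a copy
    order, width = [], 0
    while graph:
        # Greedy part: rank vertices by current degree.
        # Stochastic part: pick at random among the top_k best.
        candidates = sorted(graph, key=lambda v: (len(graph[v]), v))[:top_k]
        v = rng.choice(candidates)
        nbrs = graph.pop(v)
        width = max(width, len(nbrs))
        # Eliminating v connects all of its remaining neighbors.
        for u in nbrs:
            graph[u].discard(v)
            graph[u] |= nbrs - {u}
        order.append(v)
    return order, width


def best_order(adj, trials=50, seed=0):
    """Repeat the randomized pass and keep the lowest-width order."""
    rng = random.Random(seed)
    runs = (stochastic_greedy_order(adj, rng) for _ in range(trials))
    return min(runs, key=lambda r: r[1])
```

Repeating the randomized pass and keeping the cheapest result is what separates this from a plain deterministic greedy heuristic; a real linkage program would presumably use a cost tied to table sizes rather than the simple width used here.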

2.
MOTIVATION: Genetic linkage analysis is a useful statistical tool for mapping disease genes and for associating functionality of genes with their location on the chromosome. There is a need for a program that computes multipoint likelihood on general pedigrees with many markers that also deals with two-locus disease models. RESULTS: In this paper we present algorithms for performing exact multipoint likelihood calculations on general pedigrees with a large number of highly polymorphic markers, taking into account a variety of disease models. We have implemented these algorithms in a new computer program called SUPERLINK which outperforms leading linkage software with regard to functionality, speed, memory requirements and extensibility.

3.
Single-nucleotide polymorphisms (SNPs) are rapidly replacing microsatellites as the markers of choice for genetic linkage studies and many other studies of human pedigrees. Here, we describe an efficient approach for modeling linkage disequilibrium (LD) between markers during multipoint analysis of human pedigrees. Using a gene-counting algorithm suitable for pedigree data, our approach enables rapid estimation of allele and haplotype frequencies within clusters of tightly linked markers. In addition, with the use of a hidden Markov model, our approach allows for multipoint pedigree analysis with large numbers of SNP markers organized into clusters of markers in LD. Simulation results show that our approach resolves previously described biases in multipoint linkage analysis with SNPs that are in LD. An updated version of the freely available Merlin software package uses the approach described here to perform many common pedigree analyses, including haplotyping and haplotype frequency estimation, parametric and nonparametric multipoint linkage analysis of discrete traits, variance-components and regression-based analysis of quantitative traits, calculation of identity-by-descent or kinship coefficients, and case selection for follow-up association studies. To illustrate the possibilities, we examine a data set that provides evidence of linkage of psoriasis to chromosome 17.

4.
An algorithm for automatic genotype elimination.   Total citations: 13 (self-citations: 4; citations by others: 9)
Automatic genotype elimination algorithms for a single locus play a central role in making likelihood computations on human pedigree data feasible. We present a simple algorithm that is fully efficient in pedigrees without loops. This algorithm can be easily coded and has been instrumental in greatly reducing computing times for pedigree analysis. A contrived counter-example demonstrates that some superfluous genotypes cannot be excluded for inbred pedigrees.
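To make the idea of single-locus genotype elimination concrete, here is a minimal Python sketch of one pruning step on a single nuclear family. It is written in the spirit of the abstract and is not the authors' code: the function names are assumptions, genotypes are represented as sorted tuples of allele labels, and a full algorithm would repeat this step over every nuclear family in the pedigree until no genotype list changes.

```python
from itertools import product


def child_genotypes(father, mother):
    """Unordered genotypes a child can receive from one parental pair."""
    return {tuple(sorted((a, b))) for a in father for b in mother}


def eliminate_nuclear_family(father_list, mother_list, children_lists):
    """Keep only genotypes that take part in a Mendelian-consistent family.

    Each argument lists candidate genotypes as sorted allele tuples,
    e.g. ("A", "B"). A parental pair is kept only if every child has at
    least one candidate genotype compatible with that pair.
    """
    keep_f, keep_m = set(), set()
    keep_c = [set() for _ in children_lists]
    for f, m in product(father_list, mother_list):
        possible = child_genotypes(f, m)
        compatible = [set(c) & possible for c in children_lists]
        if all(compatible):  # an empty set marks an incompatible child
            keep_f.add(f)
            keep_m.add(m)
            for kept, comp in zip(keep_c, compatible):
                kept |= comp
    return keep_f, keep_m, keep_c
```

For example, with an untyped father (candidates AA, AB, BB), a mother typed AA, and two children typed AB and AA, only AB survives in the father's list, because he must transmit B to one child and A to the other.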

5.
Haplotyping in pedigrees provides valuable information for genetic studies (e.g., linkage analysis and association study). In order to identify a set of haplotype configurations with the highest likelihoods for a large pedigree with a large number of linked loci, in our previous work, we proposed a conditional enumeration haplotyping method which sets a threshold for the conditional probabilities of the possible ordered genotypes at every unordered individual-marker to delete some ordered genotypes with low conditional probabilities and then eliminate some haplotype configurations with low likelihoods. In this article we present a rapid haplotyping algorithm based on a modification of our previous method by setting an additional threshold for the ratio of the conditional probability of a haplotype configuration to the largest conditional probability of all haplotype configurations in order to eliminate those configurations with relatively low conditional probabilities. The new algorithm is much more efficient than our previous method and the widely used software SimWalk2.
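The ratio threshold described above can be illustrated with a few lines of Python. This is only a sketch of the pruning rule as the abstract states it, not the published algorithm; the function name `prune_by_ratio`, the dictionary representation, and the example probabilities are assumptions.

```python
def prune_by_ratio(config_probs, ratio_threshold):
    """Drop configurations whose probability is small relative to the best.

    config_probs maps a haplotype-configuration label to its conditional
    probability (assumed positive). A configuration is kept when
    p / max(p) >= ratio_threshold.
    """
    if not config_probs:
        return {}
    best = max(config_probs.values())
    return {c: p for c, p in config_probs.items() if p / best >= ratio_threshold}


# Hypothetical probabilities for three candidate configurations.
kept = prune_by_ratio({"h1": 0.5, "h2": 0.3, "h3": 0.01}, ratio_threshold=0.1)
```

Here "h3" is removed because 0.01 / 0.5 falls below the 0.1 threshold, while "h1" and "h2" are kept.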

6.
QTL analysis in arbitrary pedigrees with incomplete marker information   Total citations: 3 (self-citations: 0; citations by others: 3)
Vogl C, Xu S. Heredity, 2002, 89(5): 339-345
Mapping quantitative trait loci (QTL) in arbitrary outbred pedigrees is complicated by the combinatorial possibilities of allele flow relationships and of the founder allelic configurations. Exact methods are only available for rather short and simple pedigrees. Stochastic simulation using Markov chain Monte Carlo (MCMC) integration offers more flexibility. MCMC methods are less natural in a frequentist than in a Bayesian context, which we therefore adopt. Among the MCMC algorithms for updating marker locus genotypes, we implement the descent-graph algorithm. It can be used to update marker locus allele flow relationships and can handle arbitrarily complex pedigrees and missing marker information. Compared with updating marker genotypic information, updating QTL parameters, such as position, effects, and the allele flow relationships is relatively easy with MCMC. We treat the effect of each diploid combination of founder alleles as a random variable and only estimate the variance of these effects, i.e., we model diploid genotypic effects instead of the usual partition in additive and dominance effects. This is a variant of the random model approach. The number of QTL alleles is generally unknown. In the Bayesian context, the number of QTL present on a linkage group can be treated as variable. Computer simulations suggest that the algorithm can indeed handle complex pedigrees and detect two QTL on a linkage group, but that the number of individuals in a single extended family is limited to about 50 to 100 individuals.

7.
C. Stricker, R. L. Fernando, R. C. Elston. Genetics, 1995, 141(4): 1651-1656
This paper presents an extension of the finite polygenic mixed model of FERNANDO et al. (1994) to linkage analysis. The finite polygenic mixed model, extended for linkage analysis, leads to a likelihood that can be calculated using efficient algorithms developed for oligogenic models. For comparison, linkage analysis of 5 simulated 4021-member pedigrees was performed using the usual mixed model of inheritance, approximated by HASSTEDT (1982), and the finite polygenic mixed model extended for linkage analysis presented here. Maximum likelihood estimates of the finite polygenic mixed model could be inferred to be closer to the simulated values in these pedigrees.

8.
The problem of determining haplotypes from genotypes has gained considerable prominence in the research community. Here the focus is on determining sets of SNP values on individual chromosomes since such information captures the genetic causes of diseases. The most efficient algorithmic tool for haplotyping is based on perfect phylogenetic trees. A drawback of this method is that it cannot be applied in situations when the data contains homoplasies (multiple mutations of the same character) or recombinations. Recently, Song et al. (2005) studied the two cases: haplotyping via imperfect phylogenies with a single homoplasy and via galled-tree networks with one gall. In Gupta et al. (2010), we have shown that the haplotyping via galled-tree networks is NP-hard, even if we restrict to the case when every gall contains at most 3 mutations. We present a polynomial algorithm for haplotyping via galled-tree networks with simple galls (each having two mutations) for genotype matrices which satisfy a natural condition which is implied by presence of at least one 1 in each column that contains a 2. In the end, we give the experimental results comparing our algorithm with PHASE on simulated data.

9.
Recently developed algorithms permit nonparametric linkage analysis of large, complex pedigrees with multiple inbreeding loops. We have used one such algorithm, implemented in the package SimWalk2, to reanalyze previously published genome-screen data from a Costa Rican kindred segregating for severe bipolar disorder. Our results are consistent with previous linkage findings on chromosome 18 and suggest a new locus on chromosome 5 that was not identified using traditional linkage analysis.

10.
Conditional probability methods for haplotyping in pedigrees   Total citations: 3 (self-citations: 0; citations by others: 3)
Gao G, Hoeschele I, Sorensen P, Du F. Genetics, 2004, 167(4): 2055-2065
Efficient haplotyping in pedigrees is important for the fine mapping of quantitative trait locus (QTL) or complex disease genes. To reconstruct haplotypes efficiently for a large pedigree with a large number of linked loci, two algorithms based on conditional probabilities and likelihood computations are presented. The first algorithm (the conditional probability method) produces a single, approximately optimal haplotype configuration, with computing time increasing linearly in the number of linked loci and the pedigree size. The other algorithm (the conditional enumeration method) identifies a set of haplotype configurations with high probabilities conditional on the observed genotype data for a pedigree. Its computing time increases less than exponentially with the size of a subset of the set of person-loci with unordered genotypes and linearly with its complement. The size of the subset is controlled by a threshold parameter. The set of identified haplotype configurations can be used to estimate the identity-by-descent (IBD) matrix at a map position for a pedigree. The algorithms have been tested on published and simulated data sets. The new haplotyping methods are much faster and provide more information than several existing stochastic and rule-based methods. The accuracies of the new methods are equivalent to or better than those of these existing methods.

11.
Variance component modeling for linkage analysis of quantitative traits is a powerful tool for detecting and locating genes affecting a trait of interest, but the presence of genetic heterogeneity will decrease the power of a linkage study and may even give biased estimates of the location of the quantitative trait loci. Many complex diseases are believed to be influenced by multiple genes and therefore genetic heterogeneity is likely to be present for many real applications of linkage analysis. We consider a mixture of multivariate normals to model locus heterogeneity by allowing only a proportion of the sampled pedigrees to segregate trait-influencing allele(s) at a specific locus. However, for mixtures of normals the classical asymptotic distribution theory of the maximum likelihood estimates does not hold, so tests of linkage and/or heterogeneity are evaluated using resampling methods. It is shown that allowing for genetic heterogeneity leads to an increase in power to detect linkage. This increase is more prominent when the genetic effect of the locus is small or when the percentage of pedigrees not segregating trait-influencing allele(s) at the locus is high.

12.
Late-onset familial Alzheimer disease (LOFAD) is a genetically heterogeneous and complex disease for which only one locus, APOE, has been definitively identified. Difficulties in identifying additional loci are likely to stem from inadequate linkage analysis methods. Nonparametric methods suffer from low power because of limited use of the data, and traditional parametric methods suffer from limitations in the complexity of the genetic model that can be feasibly used in analysis. Alternative methods that have recently been developed include Bayesian Markov chain-Monte Carlo methods. These methods allow multipoint linkage analysis under oligogenic trait models in pedigrees of arbitrary size; at the same time, they allow for inclusion of covariates in the analysis. We applied this approach to an analysis of LOFAD on five chromosomes with previous reports of linkage. We identified strong evidence of a second LOFAD gene on chromosome 19p13.2, which is distinct from APOE on 19q. We also obtained weak evidence of linkage to chromosome 10 at the same location as a previous report of linkage but found no evidence for linkage of LOFAD age-at-onset loci to chromosomes 9, 12, or 21.

13.
Recursive likelihood calculations for genetic analysis with ungenotyped pedigree data employ variations of the Elston-Stewart (ES) or the Lander-Green (LG) algorithms. With the ES algorithm, the number of loci may be limited but not the pedigree size. With the LG algorithm, the reverse is the case. We introduce two new algorithms for the computation of regressive likelihoods for pedigrees with multivariate traits. The first is an alternative formulation of our existing model, which leads to a simpler form in the binary trait, polygenic and mixed model cases. The second is an approximation model, which is computationally efficient. These methods apply to both continuous and binary traits, in the oligogenic and polygenic cases. Both methods coincide in the binary case. We considered these methods for cases in which all the traits are controlled by a single locus, with each trait controlled by one locus independent of the others. Simulation studies and analysis of a real data set are presented for segregation analysis as illustrations. These methods can also be used in other model-based analyses. These methods are implemented in G.E.M.S., the genetic epidemiology models software.

14.
Efficient computations in multilocus linkage analysis.   Total citations: 22 (self-citations: 11; citations by others: 11)
This paper describes efficient methods for likelihood calculations and maximum-likelihood estimation in multilocus linkage analysis of reference families and general disease pedigrees, and it documents their performance as implemented in the LINKAGE programs. This information should be of considerable value in determining computing needs for linkage investigations, and in evaluating the merits of alternative algorithms.

15.
Du FX, Hoeschele I. Genetics, 2000, 156(4): 2051-2062
Elimination of genotypes or alleles for each individual or meiosis, which are inconsistent with observed genotypes, is a component of various genetic analyses of complex pedigrees. Computational efficiency of the elimination algorithm is critical in some applications such as genotype sampling via descent graph Markov chains. We present an allele elimination algorithm and two genotype elimination algorithms for complex pedigrees with incomplete genotype data. We modify all three algorithms to incorporate inheritance restrictions imposed by a complete or incomplete descent graph such that every inconsistent complete descent graph is detected in any pedigree, and every inconsistent incomplete descent graph is detected in any pedigree without loops with the genotype elimination algorithms. Allele elimination requires less CPU time and memory, but does not always eliminate all inconsistent alleles, even in pedigrees without loops. The first genotype algorithm produces genotype lists for each individual, which are identical to those obtained from the Lange-Goradia algorithm, but exploits the half-sib structure of some populations and reduces CPU time. The second genotype elimination algorithm deletes more inconsistent genotypes in pedigrees with loops and detects more illegal, incomplete descent graphs in such pedigrees.

16.
We study the problem of reconstructing haplotype configurations from genotypes on pedigree data with missing alleles under the Mendelian law of inheritance and the minimum-recombination principle, which is important for the construction of haplotype maps and genetic linkage/association analyses. Our previous results show that the problem of finding a minimum-recombinant haplotype configuration (MRHC) is in general NP-hard. This paper presents an effective integer linear programming (ILP) formulation of the MRHC problem with missing data and a branch-and-bound strategy that utilizes a partial order relationship and some other special relationships among variables to decide the branching order. Nontrivial lower and upper bounds on the optimal number of recombinants are introduced at each branching node to effectively prune the search tree. When multiple solutions exist, a best haplotype configuration is selected based on a maximum likelihood approach. The paper also shows for the first time how to incorporate marker interval distance into a rule-based haplotyping algorithm. Our results on simulated data show that the algorithm could recover haplotypes with 50 loci from a pedigree of size 29 in seconds on a Pentium IV computer. Its accuracy is more than 99.8% for data with no missing alleles and 98.3% for data with 20% missing alleles in terms of correctly recovered phase information at each marker locus. A comparison with a statistical approach SimWalk2 on simulated data shows that the ILP algorithm runs much faster than SimWalk2 and reports better or comparable haplotypes on average than the first and second runs of SimWalk2. As an application of the algorithm to real data, we present some test results on reconstructing haplotypes from a genome-scale SNP dataset consisting of 12 pedigrees that have 0.8% to 14.5% missing alleles.

17.
Aul'chenko IU, Aksenovich TI. Genetika, 1999, 35(9): 1294-1301
The study is a further development of the methods for genetic analysis using pedigree data. Methods for approximation of the likelihood based on cutting of all loops are often used in analysis of large pedigrees with multiple loops. In this study, a fast efficient algorithm for calculating likelihood is proposed. This algorithm allows short inbred loops to be processed without cutting them and, hence, prevents the loss of genetic information. The approach proposed may be important for analysis of the pedigrees of farm and laboratory animals, where inbred crosses resulting in short inbred loops are common. The results of a stochastic genetic experiment agree with this suggestion: the use of the algorithm proposed considerably increases the accuracy of estimation of model parameters and testing of genetic hypotheses.

18.
This paper is concerned with efficient strategies for gene mapping using pedigrees containing small numbers of affecteds and identity-by-descent data from closely spaced markers throughout the genome. Particular attention is paid to additive traits involving phenocopies and/or locus heterogeneity. For a sample of pedigrees containing a particular configuration of affecteds, e.g., pairs of siblings together with a first cousin, we use a likelihood analysis to find 1-df statistics that are very efficient over a broad range of penetrances and allele frequencies. We identify configurations of affecteds that are particularly powerful for detecting linkage, and we show how pedigrees containing different numbers and configurations of affecteds can be efficiently combined in an overall test statistic.

19.
With the advent of RFLPs, genetic linkage maps are now being assembled for a number of organisms including both inbred experimental populations such as maize and outbred natural populations such as humans. Accurate construction of such genetic maps requires multipoint linkage analysis of particular types of pedigrees. We describe here a computer package, called MAPMAKER, designed specifically for this purpose. The program uses an efficient algorithm that allows simultaneous multipoint analysis of any number of loci. MAPMAKER also includes an interactive command language that makes it easy for a geneticist to explore linkage data. MAPMAKER has been applied to the construction of linkage maps in a number of organisms, including the human and several plants, and we outline the mapping strategies that have been used.

20.
Our Markov chain Monte Carlo (MCMC) methods were used in linkage analyses of the Framingham Heart Study data using all available pedigrees. Our goal was to detect and map loci associated with covariate-adjusted traits log triglyceride (lnTG) and high-density lipoprotein cholesterol (HDL) using multipoint LOD score analysis, Bayesian oligogenic linkage analysis and identity-by-descent (IBD) scoring methods. Each method used all marker data for all markers on a chromosome. Bayesian linkage analysis detected a linkage signal on chromosome 7 for lnTG and HDL, corroborating previously published results. However, these results were not replicated in a classical linkage analysis of the data or by using IBD scoring methods. We conclude that Bayesian linkage analysis provides a powerful paradigm for mapping trait loci but interpretation of the Bayesian linkage signals is subjective. In the absence of a LOD score method accommodating genetically complex traits and linkage heterogeneity, validation of these signals remains elusive.
