首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 46 毫秒
1.
In an effort to accelerate likelihood computations on pedigrees, Lange and Goradia defined a genotype-elimination algorithm that aims to identify those genotypes that need not be considered during the likelihood computation. For pedigrees without loops, they showed that their algorithm was optimal, in the sense that it identified all genotypes that lead to a Mendelian inconsistency. Their algorithm, however, is not optimal for pedigrees with loops, which continue to pose daunting computational challenges. We present here a simple extension of the Lange-Goradia algorithm that we prove is optimal on pedigrees with loops, and we give examples of how our new algorithm can be used to detect genotyping errors. We also introduce a more efficient and faster algorithm for carrying out the fundamental step in the Lange-Goradia algorithm-namely, genotype elimination within a nuclear family. Finally, we improve a common algorithm for computing the likelihood of a pedigree with multiple loops. This algorithm breaks each loop by duplicating a person in that loop and then carrying out a separate likelihood calculation for each vector of possible genotypes of the loop breakers. This algorithm, however, does unnecessary computations when the loop-breaker vector is inconsistent. In this paper we present a new recursive loop breaker-elimination algorithm that solves this problem and illustrate its effectiveness on a pedigree with six loops.  相似文献   

2.
For pedigrees with multiple loops, exact likelihoods could not be computed in an acceptable time frame and thus, approximate methods are used. Some of these methods are based on breaking loops and approximations of complex pedigree likelihoods using the exact likelihood of the corresponding zero-loop pedigree. Due to ignoring loops, this method results in a loss of genetic information and a decrease in the power to detect linkage. To minimize this loss, an optimal set of loop breakers has to be selected. In this paper, we present a graph theory based algorithm for automatic selection of an optimal set of loop breakers. We propose using a total relationship between measured pedigree members as a proxy to power. To minimize the loss of genetic information, we suggest selection of such breakers whose duplication in a pedigree would be accompanied by a minimal loss of total relationship between measured pedigree members. We show that our algorithm compares favorably with other existing loop-breaker selection algorithms in terms of conservation of genetic information, statistical power and CPU time of subsequent linkage analysis. We implemented our method in a software package LOOP_EDGE, which is available at http://mga.bionet.nsc.ru/nlru/.  相似文献   

3.
Simulation of genotypes in pedigrees is an important tool to evaluate the power of a linkage or an association study and to assess the empirical significance of results. SLINK is a widely-used package for pedigree simulations, but its implementation has not previously been described in a published paper. SLINK was initially derived from the LINKAGE programs. Over the 20 years since its release, SLINK has been modified to incorporate faster algorithms, notably from the linkage analysis package FASTLINK, also derived from LINKAGE. While SLINK can simulate genotypes on pedigrees of high complexity, one limitation of SLINK, as with most methods based on peeling algorithms to evaluate pedigree likelihoods, is the small number of linked markers that can be generated. The software package SUP includes an elegant wrapper for SLINK that circumvents the limitation on number of markers by using descent markers generated by SLINK to simulate a much larger number of markers on the same chromosome, linked and possibly associated with a trait locus. We have released new coordinated versions of SLINK (3.0; available from http://watson.hgen.pitt.edu) and SUP (v090804; available from http://mlemire.freeshell.org/software or http://watson.hgen.pitt.edu) that integrate the two software packages. Thereby, we have removed some of the previous limitations on the joint functionality of the programs, such as the number of founders in a pedigree. We review the history of SLINK and describe how SLINK and SUP are now coordinated to permit the simulation of large numbers of markers linked and possibly associated with a trait in large pedigrees.  相似文献   

4.
Du FX  Hoeschele I 《Genetics》2000,156(4):2051-2062
Elimination of genotypes or alleles for each individual or meiosis, which are inconsistent with observed genotypes, is a component of various genetic analyses of complex pedigrees. Computational efficiency of the elimination algorithm is critical in some applications such as genotype sampling via descent graph Markov chains. We present an allele elimination algorithm and two genotype elimination algorithms for complex pedigrees with incomplete genotype data. We modify all three algorithms to incorporate inheritance restrictions imposed by a complete or incomplete descent graph such that every inconsistent complete descent graph is detected in any pedigree, and every inconsistent incomplete descent graph is detected in any pedigree without loops with the genotype elimination algorithms. Allele elimination requires less CPU time and memory, but does not always eliminate all inconsistent alleles, even in pedigrees without loops. The first genotype algorithm produces genotype lists for each individual, which are identical to those obtained from the Lange-Goradia algorithm, but exploits the half-sib structure of some populations and reduces CPU time. The second genotype elimination algorithm deletes more inconsistent genotypes in pedigrees with loops and detects more illegal, incomplete descent graphs in such pedigrees.  相似文献   

5.
Homozygosity mapping is a powerful strategy for mapping rare recessive traits in children of consanguineous marriages. Practical applications of this strategy are currently limited by the inability of conventional linkage analysis software to compute, in reasonable time, multipoint LOD scores for pedigrees with inbreeding loops. We have developed a new algorithm for rapid multipoint likelihood calculations in small pedigrees, including those with inbreeding loops. The running time of the algorithm grows, at most, linearly with the number of loci considered simultaneously. The running time is not sensitive to the presence of inbreeding loops, missing genotype information, and highly polymorphic loci. We have incorporated this algorithm into a software package, MAPMAKER/HOMOZ, that allows very rapid multipoint mapping of disease genes in nuclear families, including homozygosity mapping. Multipoint analysis with dozens of markers can be carried out in minutes on a personal workstation.  相似文献   

6.
An algorithm for automatic genotype elimination.   总被引:13,自引:4,他引:9       下载免费PDF全文
Automatic genotype elimination algorithms for a single locus play a central role in making likelihood computations on human pedigree data feasible. We present a simple algorithm that is fully efficient in pedigrees without loops. This algorithm can be easily coded and has been instrumental in greatly reducing computing times for pedigree analysis. A contrived counter-example demonstrates that some superfluous genotypes cannot be excluded for inbred pedigrees.  相似文献   

7.
MOTIVATION: Giving a meaningful representation of a pedigree is not obvious when it includes consanguinity loops, individuals with multiple mates or several related families. RESULTS: We show that finding a perfectly meaningful representation of a pedigree is equivalent to the interval graph sandwich problem and we propose an algorithm for drawing pedigrees.  相似文献   

8.
Recognition of binding patterns common to a set of protein structures is important for recognition of function, prediction of binding, and drug design. We consider protein binding sites represented by a set of 3D points with assigned physico-chemical and geometrical properties important for protein-ligand interactions. We formulate the multiple binding site alignment problem as detection of the largest common set of such 3D points. We discuss the computational problem of multiple common point set detection and, particularly, the matching problem in K-partite-epsilon graphs, where K partitions are associated with K structures and edges are defined between epsilon-close points. We show that the K-partite-epsilon matching problem is NP-hard in the Euclidean space with dimension larger than one. Consequently, we show that the largest common point set problem between three point sets is NP-hard. On the practical side, we present a novel computational method, MultiBind, for recognition of binding patterns common to a set of protein structures. It performs a multiple alignment between protein binding sites in the absence of overall sequence, fold, or binding partner similarity. Despite the NP-hardness results, in our applications, we practically overcome the exponential number of multiple alignment combinations by applying an efficient branchand- bound filtering procedure. We show applications of MultiBind to several biological targets. The method recognizes patterns which are responsible for binding small molecules, such as estradiol, ATP/ANP, and transition state analogues.  相似文献   

9.

We study an extension of the standard framework for pedigree analysis, in which we allow pedigree founders to be inbred. This solves a number of practical challenges in calculating coefficients of relatedness, including condensed identity coefficients. As a consequence we expand considerably the class of pedigrees for which such coefficients may be efficiently computed. An application of this is the modelling of background inbreeding as a continuous effect. We also use inbred founders to shed new light on constructibility of relatedness coefficients, i.e., the problem of finding a genealogy yielding a given set of coefficients. In particular, we show that any theoretically admissible coefficients for a pair of noninbred individuals can be produced by a finite pedigree with inbred founders. Coupled with our computational methods, implemented in the R package ribd, this allows for the first time computer analysis of general constructibility solutions, thus making them accessible for practical use.

  相似文献   

10.
A heuristic algorithm for finding gene transmission patterns on large and complex pedigrees with partially observed genotype data is proposed. The method can be used to generate an initial point for a Markov chain Monte Carlo simulation or to check that the given pedigree and the genotype data are consistent. In small pedigrees, the algorithm is exact by exhaustively enumerating all possibilities, but, in large pedigrees, with a considerable amount of unknown data, only a subset of promising configurations can actually be checked. For that purpose, the configurations are ordered by combining the approximative conditional probability distribution of the unknown genotypes with the information on the relationships between individuals. We also introduce a way to divide the task into subparts, which has been shown to be useful in large pedigrees. The algorithm has been implemented in a program called APE (Allelic Path Explorer) and tested in three different settings with good results.  相似文献   

11.
We apply the method of "blocking Gibbs" sampling to a problem of great importance and complexity-linkage analysis. Blocking Gibbs sampling combines exact local computations with Gibbs sampling, in a way that complements the strengths of both. The method is able to handle problems with very high complexity, such as linkage analysis in large pedigrees with many loops, a task that no other known method is able to handle. New developments of the method are outlined, and it is applied to a highly complex linkage problem in a human pedigree.  相似文献   

12.
This paper presents a new approximation to the likelihood for a pedigree with loops, based on cutting all loops and extending the pedigree at the cuts. An opimum loop-cutting strategy and an iterative extension technique are presented. The likelihood for a pedigree with loops is then approximated by the conditional likelihood for the entire cut-extended pedigree given the extended part. The approximate likelihoods are compared with the exact likelihoods obtained using the program MENDEL for several small pedigrees with loops. The approximation is efficient for large pedigrees with complex loops in terms of computing speed and memory requirements.  相似文献   

13.
We describe a set of computational problems motivated by certain analysis tasks in genome resequencing. These are assembly problems for which multiple distinct sequences must be assembled, but where the relative positions of reads to be assembled are already known. This information is obtained from a common reference genome and is characteristic of resequencing experiments. The simplest variant of the problem aims at determining a minimum set of superstrings such that each sequenced read matches at least one superstring. We give an algorithm with time complexity O(N), where N is the sum of the lengths of reads, substantially improving on previous algorithms for solving the same problem. We also examine the problem of finding the smallest number of reads to remove such that the remaining reads are consistent with k superstrings. By exploiting a surprising relationship with the minimum cost flow problem, we show that this problem can be solved in polynomial time when nested reads are excluded. If nested reads are permitted, this problem of removing the minimum number of reads becomes NP-hard. We show that permitting mismatches between reads and their nearest superstrings generally renders these problems NP-hard.  相似文献   

14.
The molecular biology of viruses can be effectively described by kinetic logic as several feedback loops are implicated in all viral cycles and as viral proteins generally display several functions. We applied this method to the study of the rhabdovirus cycle.Formally, the dynamics of the model are explored on the basis of a discrete caricature (kinetic logic), with special emphasis on the role of the constitutive feedback loops to determine the essential dynamical behaviour of the viral cycle. From a biological point of view, our model accounts for several stable regimes or attractors: healthy cells, acute infection and different kinds of persistent infections, a multistationarity in good agreement with the existence of several positive feedback loops in our system.  相似文献   

15.
Pairs of helices in transmembrane (TM) proteins are often tightly packed. We present a scoring function and a computational methodology for predicting the tertiary fold of a pair of alpha-helices such that its chances of being tightly packed are maximized. Since the number of TM protein structures solved to date is small, it seems unlikely that a reliable scoring function derived statistically from the known set of TM protein structures will be available in the near future. We therefore constructed a scoring function based on the qualitative insights gained in the past two decades from the solved structures of TM and soluble proteins. In brief, we reward the formation of contacts between small amino acid residues such as Gly, Cys, and Ser, that are known to promote dimerization of helices, and penalize the burial of large amino acid residues such as Arg and Trp. As a case study, we show that our method predicts the native structure of the TM homodimer glycophorin A (GpA) to be, in essence, at the global score optimum. In addition, by correlating our results with empirical point mutations on this homodimer, we demonstrate that our method can be a helpful adjunct to mutation analysis. We present a data set of canonical alpha-helices from the solved structures of TM proteins and provide a set of programs for analyzing it (http://ashtoret.tau.ac.il/~sarel). From this data set we derived 11 helix pairs, and conducted searches around their native states as a further test of our method. Approximately 73% of our predictions showed a reasonable fit (RMS deviation <2A) with the native structures compared to the success rate of 8% expected by chance. The search method we employ is less effective for helix pairs that are connected via short loops (<20 amino acid residues), indicating that short loops may play an important role in determining the conformation of alpha-helices in TM proteins.  相似文献   

16.
Relaxation dispersion spectroscopy is one of the most widely used techniques for the analysis of protein dynamics. To obtain a detailed understanding of the protein function from the view point of dynamics, it is essential to fit relaxation dispersion data accurately. The grid search method is commonly used for relaxation dispersion curve fits, but it does not always find the global minimum that provides the best-fit parameter set. Also, the fitting quality does not always improve with increase of the grid size although the computational time becomes longer. This is because relaxation dispersion curve fitting suffers from a local minimum problem, which is a general problem in non-linear least squares curve fitting. Therefore, in order to fit relaxation dispersion data rapidly and accurately, we developed a new fitting program called GLOVE that minimizes global and local parameters alternately, and incorporates a Monte-Carlo minimization method that enables fitting parameters to pass through local minima with low computational cost. GLOVE also implements a random search method, which sets up initial parameter values randomly within user-defined ranges. We demonstrate here that the combined use of the three methods can find the global minimum more rapidly and more accurately than grid search alone.  相似文献   

17.
In designing a study to demonstrate the existence of a major locus for a quantitative trait, an investigator chooses a sampling rule to ascertain pedigrees. The choice of sampling rule can significantly affect the study's power. Here, we compare two types of sampling rules for family studies: fixed-structure rules, in which the same set of relatives are sampled for each proband, and sequential rules, in which the relative or relatives to be sampled next may depend on the trait values of the individuals already observed. We compare fixed-structure and sequential sampling in the setting of extended pedigrees, a quantitative trait, and the genetic mixed model. Using computer simulation, we show that sequential sampling can increase power to detect segregation at a dominant major locus by over 60% in comparison with fixed-structure sampling. Just as important, this substantially increased power is obtained with an easily implemented sampling rule, one that might reasonably be employed in a family study of a quantitative trait.  相似文献   

18.
The calculation of maximum likelihood pedigrees for related organisms using genotypic data is considered. The problem is formulated so that the domain of optimization is a permutation space. This is a feature shared by the travelling salesman problem, for which simulated annealing is known to be effective. Using this technique it is found that pedigrees can be reconstructed with minimal error using genotypic data of a quality currently realizable. In complex pedigrees accurate reconstruction can be done with no a priori age or sex information. For smaller numbers of individuals a method of efficiently enumerating all admissible pedigrees of nonzero likelihood is given.  相似文献   

19.
In genetic linkage studies, while the pedigrees are generally known, background relatedness between the founding individuals, assumed by definition to be unrelated, can seriously affect the results of the analysis. Likelihood approaches to relationship estimation from genetic marker data can all be expressed in terms of finding the most likely pedigree connecting the individuals of interest. When the true relationship is the main focus, the set of all possible alternative pedigrees can be too large to consider. However, prior information is often available which, when incorporated in a formal and structured way, can restrict this set to a manageable size thus enabling the calculation of a posterior distribution from which inferences can be drawn. Here, the unknown relationships are more of a nuisance factor than of interest in their own right, so the focus is on adjusting the results of the analysis rather than on direct estimation. In this paper, we show how prior information on founder relationships can be exploited in some applications to generate a set of candidate extended pedigrees. We then weight the relevant pedigree-specific likelihoods by their posterior probabilities to adjust the lod score statistics.  相似文献   

20.
A new approach to loop analysis is presented in which decompositions of the total elasticity of a population projection matrix over a set of life history pathways are obtained as solutions of a constrained system of linear equations. In loop analysis, life history pathways are represented by loops in the life cycle graph, and the elasticity of the loop is interpreted as a measure of the contribution of the life history pathway to the population growth rate. Associated with the life cycle graph is a vector space -- the cycle space of the graph -- which is spanned by the loops. The elasticities of the transitions in the life cycle graph can be represented by a vector in the cycle space, and a loop decomposition of the life cycle graph is then defined to be any nonnegative linear combination of the loops which sum to the vector of elasticities. In contrast to previously published algorithms for carrying out loop analysis, we show that a given life cycle graph admits of either a unique loop decomposition or an infinite set of loop decompositions which can be characterized as a bounded convex set of nonnegative vectors. Using this approach, loop decompositions which minimize or maximize a linear objective function can be obtained as solutions of a linear programming problem, allowing us to place lower and upper bounds on the contributions of life history pathways to the population growth rate. Another consequence of our approach to loop analysis is that it allows us to identify the exact tradeoffs in contributions to the population growth rate that must exist between life history pathways.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号