Similar articles
1.
Conditional probability methods for haplotyping in pedigrees
Gao G, Hoeschele I, Sorensen P, Du F. Genetics 2004, 167(4): 2055-2065
Efficient haplotyping in pedigrees is important for the fine mapping of quantitative trait loci (QTL) or complex disease genes. To reconstruct haplotypes efficiently for a large pedigree with a large number of linked loci, two algorithms based on conditional probabilities and likelihood computations are presented. The first algorithm (the conditional probability method) produces a single, approximately optimal haplotype configuration, with computing time increasing linearly in the number of linked loci and the pedigree size. The other algorithm (the conditional enumeration method) identifies a set of haplotype configurations with high probabilities conditional on the observed genotype data for a pedigree. Its computing time increases less than exponentially with the size of a subset of the set of person-loci with unordered genotypes and linearly with its complement. The size of the subset is controlled by a threshold parameter. The set of identified haplotype configurations can be used to estimate the identity-by-descent (IBD) matrix at a map position for a pedigree. The algorithms have been tested on published and simulated data sets. The new haplotyping methods are much faster and provide more information than several existing stochastic and rule-based methods. The accuracies of the new methods are equivalent to or better than those of these existing methods.

2.
An algorithm for automatic genotype elimination
Automatic genotype elimination algorithms for a single locus play a central role in making likelihood computations on human pedigree data feasible. We present a simple algorithm that is fully efficient in pedigrees without loops. This algorithm can be easily coded and has been instrumental in greatly reducing computing times for pedigree analysis. A contrived counter-example demonstrates that some superfluous genotypes cannot be excluded for inbred pedigrees.
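
As a rough illustration of single-locus genotype elimination within one nuclear family, the sketch below keeps a parental genotype only if it belongs to some father-mother pair that can produce at least one still-allowed genotype for every child. The function names and the sorted-tuple genotype encoding are assumptions made for this example; it is not claimed to reproduce the paper's exact procedure.

```python
from itertools import product


def offspring_genotypes(father_gt, mother_gt):
    """Return every unordered genotype a child can inherit from the two parents."""
    return {tuple(sorted((a, b))) for a in father_gt for b in mother_gt}


def eliminate_in_nuclear_family(father, mother, children):
    """One elimination pass over a nuclear family (hypothetical helper).

    father, mother: sets of candidate genotypes (sorted allele tuples).
    children: list of candidate-genotype sets, one per child.
    Returns the reduced (father, mother, children) candidate sets.
    """
    keep_father, keep_mother = set(), set()
    keep_children = [set() for _ in children]
    for gf, gm in product(father, mother):
        zygotes = offspring_genotypes(gf, gm)
        # Save the parental pair only if every child remains explainable.
        if all(child & zygotes for child in children):
            keep_father.add(gf)
            keep_mother.add(gm)
            for kept, child in zip(keep_children, children):
                kept |= child & zygotes
    return keep_father, keep_mother, keep_children


# Untyped father (any genotype over alleles 1-3), typed mother and children.
father = {(1, 1), (1, 2), (1, 3), (2, 2), (2, 3), (3, 3)}
mother = {(1, 2)}
children = [{(1, 3)}, {(2, 3)}]
father, mother, children = eliminate_in_nuclear_family(father, mother, children)
print(sorted(father))  # [(1, 3), (2, 3), (3, 3)]
```

In a full pedigree, a pass like this would be repeated over all nuclear families until no candidate list changes.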

3.
Gene content is the number of copies of a particular allele in the genotype of an animal. Gene content can be used to study the additive gene action of a candidate gene. Usually, genotype data are available for only part of the population, and for the rest gene contents have to be calculated from typed relatives. Methods to calculate expected gene content for animals in large complex pedigrees are relatively complex. In this paper we propose a practical method to calculate gene content using linear regression. The method does not estimate genotype probabilities, but these can be approximated from gene content assuming Hardy-Weinberg proportions. The approach was compared with other methods on multiple simulated data sets for real bovine pedigrees of 1 082 and 907 903 animals. Different allelic frequencies (0.4 and 0.2) and proportions of missing genotypes (90, 70, and 50%) were considered in the simulation. The simulation showed that the proposed method has a similar capability to predict gene content as the iterative peeling method; however, it requires less time and can be more practical for large pedigrees. The method was also applied to real data on the bovine myostatin locus in a large dual-purpose Belgian Blue pedigree of 235 133 animals. It was demonstrated that the proposed method can be easily adapted for particular pedigrees.
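
One plausible reading of "approximated from gene content assuming Hardy-Weinberg proportions" is sketched below: treat half the predicted gene content as an allele frequency and expand it into the three genotype probabilities. The function name is hypothetical, and the abstract does not confirm that the paper uses exactly this mapping.

```python
def genotype_probs_from_gene_content(gene_content):
    """Approximate P(0, 1, 2 copies) from a gene content in [0, 2].

    Assumption for this sketch: p = gene_content / 2 is used as the
    allele frequency, and Hardy-Weinberg proportions are applied to it.
    """
    if not 0.0 <= gene_content <= 2.0:
        raise ValueError("gene content must lie between 0 and 2")
    p = gene_content / 2.0
    return {0: (1.0 - p) ** 2, 1: 2.0 * p * (1.0 - p), 2: p ** 2}


probs = genotype_probs_from_gene_content(0.8)
# The expected number of copies under these probabilities equals the input.
expected_copies = sum(copies * prob for copies, prob in probs.items())
print(probs, expected_copies)
```

The probabilities always sum to one, and their expectation returns the original gene content, so the approximation is at least consistent with the predicted value.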

4.
A method for estimating genotypic and identity-by-descent probabilities in complex pedigrees is described. The method consists of an algorithm for drawing independent genotype samples which are consistent with the pedigree and observed genotype. The probability distribution function for samples obtained using the algorithm can be evaluated up to a normalizing constant, and combined with the likelihood to produce a weight for each sample. Importance sampling is then used to estimate genotypic and identity-by-descent probabilities. On small but complex pedigrees, the genotypic probability estimates are demonstrated to be empirically unbiased. On large complex pedigrees, while the algorithm for obtaining genotype samples is feasible, importance sampling may require an infeasible number of samples to estimate genotypic probabilities with accuracy.
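
The weighting step described above can be illustrated with generic self-normalized importance sampling, which works even when the sampling density is known only up to a constant. Everything in the sketch (the function name, the toy target and proposal, the fixed sample list) is an assumption made for illustration rather than the paper's sampler.

```python
def importance_estimate(samples, target_unnorm, proposal_unnorm, event):
    """Self-normalized importance-sampling estimate of P(event).

    samples: draws from the proposal distribution.
    target_unnorm(x): unnormalized target, e.g. a likelihood.
    proposal_unnorm(x): proposal density, known up to a constant.
    event(x): True when the sample belongs to the event of interest.
    """
    weights = [target_unnorm(x) / proposal_unnorm(x) for x in samples]
    total = sum(weights)
    return sum(w for w, x in zip(weights, samples) if event(x)) / total


# Toy target over genotypes, proportional to 1 : 2 : 1.
target = {"AA": 1.0, "AB": 2.0, "BB": 1.0}
# Fixed draws standing in for samples from a uniform proposal.
draws = ["AA", "AB", "BB"]
estimate = importance_estimate(
    draws,
    target_unnorm=lambda g: target[g],
    proposal_unnorm=lambda g: 1.0,
    event=lambda g: g == "AB",
)
print(estimate)  # 0.5
```

Because the weights are normalized by their own sum, unknown constants in either density cancel; the price is that a few very large weights can dominate, which is one way to see why large pedigrees may need very many samples.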

5.
We propose an analytical approximation method for the estimation of multipoint identity by descent (IBD) probabilities in pedigrees containing a moderate number of distantly related individuals. We show that in large pedigrees where cases are related through untyped ancestors only, it is possible to formulate the hidden Markov model of the Lander-Green algorithm in terms of the IBD configurations of the cases. We use a first-order Markov approximation to model the changes in this IBD-configuration variable along the chromosome. In simulated and real data sets, we demonstrate that estimates of parametric and nonparametric linkage statistics based on the first-order Markov approximation are accurate. The computation time is exponential in the number of cases instead of in the number of meioses separating the cases. We have implemented our approach in the computer program ALADIN (accurate linkage analysis of distantly related individuals). ALADIN can be applied to general pedigrees and marker types and has the ability to model marker-marker linkage disequilibrium with a clustered-markers approach. Using ALADIN is straightforward: It requires no parameters to be specified and accepts standard input files.

6.
Du FX, Hoeschele I. Genetics 2000, 156(4): 2051-2062
Elimination of genotypes or alleles for each individual or meiosis, which are inconsistent with observed genotypes, is a component of various genetic analyses of complex pedigrees. Computational efficiency of the elimination algorithm is critical in some applications such as genotype sampling via descent graph Markov chains. We present an allele elimination algorithm and two genotype elimination algorithms for complex pedigrees with incomplete genotype data. We modify all three algorithms to incorporate inheritance restrictions imposed by a complete or incomplete descent graph such that every inconsistent complete descent graph is detected in any pedigree, and every inconsistent incomplete descent graph is detected in any pedigree without loops with the genotype elimination algorithms. Allele elimination requires less CPU time and memory, but does not always eliminate all inconsistent alleles, even in pedigrees without loops. The first genotype algorithm produces genotype lists for each individual, which are identical to those obtained from the Lange-Goradia algorithm, but exploits the half-sib structure of some populations and reduces CPU time. The second genotype elimination algorithm deletes more inconsistent genotypes in pedigrees with loops and detects more illegal, incomplete descent graphs in such pedigrees.

7.
QTL analysis in arbitrary pedigrees with incomplete marker information
Vogl C, Xu S. Heredity 2002, 89(5): 339-345
Mapping quantitative trait loci (QTL) in arbitrary outbred pedigrees is complicated by the combinatorial possibilities of allele flow relationships and of the founder allelic configurations. Exact methods are only available for rather short and simple pedigrees. Stochastic simulation using Markov chain Monte Carlo (MCMC) integration offers more flexibility. MCMC methods are less natural in a frequentist than in a Bayesian context, which we therefore adopt. Among the MCMC algorithms for updating marker locus genotypes, we implement the descent-graph algorithm. It can be used to update marker locus allele flow relationships and can handle arbitrarily complex pedigrees and missing marker information. Compared with updating marker genotypic information, updating QTL parameters, such as position, effects, and the allele flow relationships, is relatively easy with MCMC. We treat the effect of each diploid combination of founder alleles as a random variable and only estimate the variance of these effects, i.e., we model diploid genotypic effects instead of the usual partition into additive and dominance effects. This is a variant of the random model approach. The number of QTL alleles is generally unknown. In the Bayesian context, the number of QTL present on a linkage group can be treated as variable. Computer simulations suggest that the algorithm can indeed handle complex pedigrees and detect two QTL on a linkage group, but that the number of individuals in a single extended family is limited to about 50 to 100.

8.
MOTIVATION: Haplotype reconstruction is an essential step in genetic linkage and association studies. Although many methods have been developed to estimate haplotype frequencies and reconstruct haplotypes for a sample of unrelated individuals, haplotype reconstruction in large pedigrees with a large number of genetic markers remains a challenging problem. METHODS: We have developed an efficient computer program, HAPLORE (HAPLOtype REconstruction), to identify all haplotype sets that are compatible with the observed genotypes in a pedigree for tightly linked genetic markers. HAPLORE consists of three steps that can serve different needs in applications. In the first step, a set of logic rules is used to reduce the number of compatible haplotypes of each individual in the pedigree as much as possible. After this step, the haplotypes of all individuals in the pedigree can be completely or partially determined. These logic rules are applicable to completely linked markers and they can be used to impute missing data and check genotyping errors. In the second step, a haplotype-elimination algorithm similar to the genotype-elimination algorithms used in linkage analysis is applied to delete incompatible haplotypes derived from the first step. All superfluous haplotypes of the pedigree members will be excluded after this step. In the third step, the expectation-maximization (EM) algorithm combined with the partition and ligation technique is used to estimate haplotype frequencies based on the inferred haplotype configurations through the first two steps. Only compatible haplotype configurations with haplotypes having frequencies greater than a threshold are retained. RESULTS: We test the effectiveness and the efficiency of HAPLORE using both simulated and real datasets. Our results show that the rule-based algorithm is very efficient for completely genotyped pedigrees. In this case, almost all of the families have one unique haplotype configuration. In the presence of missing data, the number of compatible haplotypes can be substantially reduced by HAPLORE, and the program will provide all possible haplotype configurations of a pedigree under different circumstances, if such multiple configurations exist. These inferred haplotype configurations, as well as the haplotype frequencies estimated by the EM algorithm, can be used in genetic linkage and association studies. AVAILABILITY: The program can be downloaded from http://bioinformatics.med.yale.edu.

9.
Sobel E, Sengul H, Weeks DE. Human Heredity 2001, 52(3): 121-131
OBJECTIVES: To describe, implement, and test an efficient algorithm to obtain multipoint identity-by-descent (IBD) probabilities at arbitrary positions among marker loci for general pedigrees. Unlike existing programs, our algorithm can analyze data sets with large numbers of people and markers. The algorithm has been implemented in the SimWalk2 computer package. METHODS: Using a rigorous testing regimen containing five pedigrees of various sizes with realistic marker data, we compared several widely used IBD computation programs: Allegro, Aspex, GeneHunter, MapMaker/Sibs, Mendel, Sage, SimWalk2, and Solar. RESULTS: The testing revealed a few discrepancies, particularly on consanguineous pedigrees, but overall excellent results in the deterministic multipoint packages. SimWalk2 was also found to be in good agreement with the deterministic multipoint programs, usually matching to two decimal places the kinship coefficient that ranges from 0 to 1. However, the packages based on single-point IBD estimation, while consistent with each other, often showed poor results, disagreeing with the multipoint kinship results by as much as 0.5. CONCLUSIONS: Our testing has clearly shown that multipoint IBD estimation is much better than single-point estimation. In addition, our testing has validated our algorithm for estimating IBD probabilities at arbitrary positions on general pedigrees.

10.
An increased availability of genotypes at marker loci has prompted the development of models that include the effect of individual genes. Selection based on these models is known as marker-assisted selection (MAS). MAS is known to be efficient especially for traits that have low heritability and non-additive gene action. BLUP methodology under non-additive gene action is not feasible for large inbred or crossbred pedigrees. It is easy to incorporate non-additive gene action in a finite locus model. Under such a model, the unobservable genotypic values can be predicted using the conditional mean of the genotypic values given the data. To compute this conditional mean, conditional genotype probabilities must be computed. In this study these probabilities were computed using iterative peeling and three Markov chain Monte Carlo (MCMC) methods: scalar Gibbs, blocking Gibbs, and a sampler that combines the Elston-Stewart algorithm with iterative peeling (ESIP). The performance of these four methods was assessed using simulated data. For pedigrees with loops, iterative peeling fails to provide accurate genotype probability estimates for some pedigree members. Also, computing time is exponentially related to the number of loci in the model. For MCMC methods, a linear relationship can be maintained by sampling genotypes one locus at a time. Of the three MCMC methods considered, ESIP performed the best while scalar Gibbs performed the worst.

11.
Lin S, Ding J, Dong C, Liu Z, Ma ZJ, Wan S, Xu Y. BMC Genetics 2005, 6(Z1): S76
We compare and contrast the performance of SIMPLE, a Monte Carlo-based software package, with that of several other methods for linkage and haplotype analyses, focusing on the simulated data from the New York City population. First, a whole-genome scan study based on the microsatellite markers was performed using GENEHUNTER. Because GENEHUNTER had to drop individuals for many of the pedigrees, we performed a follow-up study focusing on several regions of interest using SIMPLE, which can handle all pedigrees in their entirety. Second, three haplotyping programs, including that in SIMPLE, were used to reconstruct haplotypic configurations in pedigrees. SIMPLE emerges clearly as a preferred tool, as it can handle large pedigrees and produces haplotypic configurations without double recombinant haplotypes. For this study, we had knowledge of the simulating models at the time we performed the analysis.

12.
Gao G, Hoeschele I. Genetics 2005, 171(1): 365-376
Identity-by-descent (IBD) matrix calculation is an important step in quantitative trait loci (QTL) analysis using variance component models. To calculate IBD matrices efficiently for large pedigrees with large numbers of loci, an approximation method based on the reconstruction of haplotype configurations for the pedigrees is proposed. The method uses a subset of haplotype configurations with high likelihoods identified by a haplotyping method. The new method is compared with a Markov chain Monte Carlo (MCMC) method (Loki) in terms of QTL mapping performance on simulated pedigrees. Both methods yield almost identical results for the estimation of QTL positions and variance parameters, while the new method is much more computationally efficient than the MCMC approach for large pedigrees and large numbers of loci. The proposed method is also compared with an exact method (Merlin) in small simulated pedigrees, where both methods produce nearly identical estimates of position-specific kinship coefficients. The new method can be used for fine mapping with joint linkage disequilibrium and linkage analysis, which improves the power and accuracy of QTL mapping.

13.
Minimum-recombinant haplotyping in pedigrees
This article presents a six-rule algorithm for the reconstruction of multiple minimum-recombinant haplotype configurations in pedigrees. The algorithm has three major features: First, it allows exhaustive search of all possible haplotype configurations under the criterion that there are minimum recombinants between markers. Second, its computational requirement is on the order of O(J²L³) in current implementation, where J is the family size and L is the number of marker loci under analysis. Third, it applies to various pedigree structures, with and without consanguinity relationship, and allows missing alleles to be imputed, during the haplotyping process, from their identical-by-descent copies. Haplotyping examples are provided using both published and simulated data sets.
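
To make the minimum-recombinant criterion concrete, the sketch below counts recombinants along one transmitted haplotype, given the grandparental origin of the allele at each locus. It illustrates only the quantity being minimized, not the six-rule algorithm; the function name and the 0/1/None encoding are assumptions for this example.

```python
def count_recombinants(origins):
    """Count origin switches along a transmitted haplotype.

    origins: one entry per marker locus, in map order:
      0    allele came from the parent's paternal haplotype,
      1    allele came from the parent's maternal haplotype,
      None origin cannot be determined at this locus.
    Undetermined loci are skipped, which yields the smallest number
    of recombinants consistent with the determined loci.
    """
    informative = [o for o in origins if o is not None]
    return sum(1 for a, b in zip(informative, informative[1:]) if a != b)


print(count_recombinants([0, 0, None, 1, 1, 0]))  # 2
```

A minimum-recombinant configuration is then one whose total count, summed over all transmitted haplotypes in the pedigree, is as small as possible.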

14.
In an effort to accelerate likelihood computations on pedigrees, Lange and Goradia defined a genotype-elimination algorithm that aims to identify those genotypes that need not be considered during the likelihood computation. For pedigrees without loops, they showed that their algorithm was optimal, in the sense that it identified all genotypes that lead to a Mendelian inconsistency. Their algorithm, however, is not optimal for pedigrees with loops, which continue to pose daunting computational challenges. We present here a simple extension of the Lange-Goradia algorithm that we prove is optimal on pedigrees with loops, and we give examples of how our new algorithm can be used to detect genotyping errors. We also introduce a more efficient and faster algorithm for carrying out the fundamental step in the Lange-Goradia algorithm, namely, genotype elimination within a nuclear family. Finally, we improve a common algorithm for computing the likelihood of a pedigree with multiple loops. This algorithm breaks each loop by duplicating a person in that loop and then carrying out a separate likelihood calculation for each vector of possible genotypes of the loop breakers. This algorithm, however, does unnecessary computations when the loop-breaker vector is inconsistent. In this paper we present a new recursive loop breaker-elimination algorithm that solves this problem and illustrate its effectiveness on a pedigree with six loops.

15.
We propose the technique of Adaptive Allele Consolidation, which greatly improves the performance of the Lange-Goradia algorithm for genotype elimination in pedigrees while still producing equivalent output. Genotype elimination consists of removing from a pedigree those genotypes that are impossible according to the Mendelian law of inheritance. This is used to find errors in genetic data and is useful as a preprocessing step in other analyses (such as linkage analysis or haplotype imputation). The problem of genotype elimination is intrinsically combinatorial, and Allele Consolidation is an existing technique where several alleles are replaced by a single "lumped" allele in order to reduce the number of combinations of genotypes that have to be considered, possibly at the expense of precision. In existing Allele Consolidation techniques, alleles are lumped once and for all before performing genotype elimination. The idea of Adaptive Allele Consolidation is to dynamically change the set of alleles that are lumped together during the execution of the Lange-Goradia algorithm, so that both high performance and precision are achieved. We have implemented the technique in a tool called Celer and evaluated it on a large set of scenarios, with good results.
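
The static, once-and-for-all form of allele consolidation mentioned above can be sketched as follows: every allele that no typed individual carries is mapped to one lumped symbol, which shrinks the genotype space. This is not the adaptive technique or the Celer implementation; the function names and the "LUMP" symbol are hypothetical.

```python
def lump_unobserved_alleles(all_alleles, typed_genotypes, lumped="LUMP"):
    """Map each allele to itself if observed in a typed genotype, else to `lumped`."""
    observed = {allele for genotype in typed_genotypes for allele in genotype}
    return {a: (a if a in observed else lumped) for a in all_alleles}


def count_unordered_genotypes(alleles):
    """Number of unordered genotypes over the distinct alleles given."""
    n = len(set(alleles))
    return n * (n + 1) // 2


alleles = range(1, 11)               # a locus with 10 alleles
typed = [(1, 2), (2, 2)]             # only alleles 1 and 2 are observed
mapping = lump_unobserved_alleles(alleles, typed)
print(count_unordered_genotypes(alleles))           # 55
print(count_unordered_genotypes(mapping.values()))  # 6
```

With ten alleles an untyped individual starts with 55 candidate genotypes; after lumping the eight unobserved alleles, only 6 remain to be considered.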

16.
17.
This paper is concerned with efficient strategies for gene mapping using pedigrees containing small numbers of affecteds and identity-by-descent data from closely spaced markers throughout the genome. Particular attention is paid to additive traits involving phenocopies and/or locus heterogeneity. For a sample of pedigrees containing a particular configuration of affecteds, e.g., pairs of siblings together with a first cousin, we use a likelihood analysis to find 1-df statistics that are very efficient over a broad range of penetrances and allele frequencies. We identify configurations of affecteds that are particularly powerful for detecting linkage, and we show how pedigrees containing different numbers and configurations of affecteds can be efficiently combined in an overall test statistic.

18.
For wildlife populations, it is often difficult to determine biological parameters that indicate breeding patterns and population mixing, but knowledge of these parameters is essential for effective management. A pedigree encodes the relationship between individuals and can provide insight into the dynamics of a population over its recent history. Here, we present a method for the reconstruction of pedigrees for wild populations of animals that live long enough to breed multiple times over their lifetime and that have complex or unknown generational structures. Reconstruction was based on microsatellite genotype data along with ancillary biological information: sex and observed body size class as an indicator of relative age of individuals within the population. Using body size-class data to infer relative age has not been considered previously in wildlife genealogy and provides a marked improvement in accuracy of pedigree reconstruction. Body size-class data are particularly useful for wild populations because they are much easier to collect noninvasively than absolute age data. This new pedigree reconstruction system, PR-genie, performs reconstruction using maximum likelihood with optimization driven by the cross-entropy method. We demonstrated pedigree reconstruction performance on simulated populations (comparing reconstructed pedigrees to known true pedigrees) over a wide range of population parameters and under assortative and intergenerational mating schema. Reconstruction accuracy increased with the presence of size-class data and as the amount and quality of genetic data increased. We provide recommendations as to the amount and quality of data necessary to provide insight into detailed familial relationships in a wildlife population using this pedigree reconstruction technique.

19.
Methods for detecting genetic linkage are more powerful when they fully use all of the data collected from pedigrees. We first discuss a method for obtaining the probability that a pedigree member has a given genotype, conditional on the phenotypes of his relatives. We then develop a rapid method to obtain the conditional probabilities of identity-by-descent sharing of marker alleles for all related pairs of individuals from extended pedigrees. The method assumes that the individuals are noninbred and that the relationship between genotype and phenotype is known for the marker locus studied. The probabilities of identity-by-descent sharing among relative pairs, conditional on marker phenotype information, can then be used in any of the model-free tests for linkage between a trait locus and a marker locus.

20.
Abney M. Genetics 2008, 179(3): 1577-1590
Computing identity-by-descent sharing between individuals connected through a large, complex pedigree is a computationally demanding task that often cannot be done using exact methods. What I present here is a rapid computational method for estimating, in large complex pedigrees, the probability that pairs of alleles are IBD given the single-point genotype data at that marker for all individuals. The method can be used on pedigrees of essentially arbitrary size and complexity without the need to divide the individuals into separate subpedigrees. I apply the method to do qualitative trait linkage mapping using the nonparametric sharing statistic S(pairs). The validity of the method is demonstrated via simulation studies on a 13-generation 3028-person pedigree with 700 genotyped individuals. An analysis of an asthma data set of individuals in this pedigree finds four loci with P-values < 10⁻³ that were not detected in prior analyses. The mapping method is fast and can complete analyses of approximately 150 affected individuals within this pedigree for thousands of markers in a matter of hours.
