Similar articles
 Found 20 similar articles; search time 15 ms
1.
A method for estimating genotypic and identity-by-descent probabilities in complex pedigrees is described. The method consists of an algorithm for drawing independent genotype samples which are consistent with the pedigree and observed genotype. The probability distribution function for samples obtained using the algorithm can be evaluated up to a normalizing constant, and combined with the likelihood to produce a weight for each sample. Importance sampling is then used to estimate genotypic and identity-by-descent probabilities. On small but complex pedigrees, the genotypic probability estimates are demonstrated to be empirically unbiased. On large complex pedigrees, while the algorithm for obtaining genotype samples is feasible, importance sampling may require an infeasible number of samples to estimate genotypic probabilities with accuracy.
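The weighting step summarized in this abstract can be illustrated with a generic self-normalized importance sampling estimator. This is only a minimal sketch, not the authors' pedigree sampler: the single-individual toy target, the uniform proposal, and the names `self_normalized_estimate` and `demo` are assumptions made for illustration.

```python
import random


def self_normalized_estimate(samples, weights, indicator):
    """Estimate P(indicator) from weighted samples.

    Weights only need to be known up to a common constant,
    because the estimate divides by their sum.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    hit = sum(w for s, w in zip(samples, weights) if indicator(s))
    return hit / total


def demo(n=20000, seed=0):
    """Toy example (assumed, not from the paper): one individual, three genotypes."""
    rng = random.Random(seed)
    genotypes = ["AA", "Aa", "aa"]
    # Hypothetical proposal: uniform over genotypes.
    proposal = {"AA": 1 / 3, "Aa": 1 / 3, "aa": 1 / 3}
    # Hypothetical unnormalized target; it normalizes to 0.25, 0.5, 0.25.
    target_unnorm = {"AA": 1.0, "Aa": 2.0, "aa": 1.0}
    samples = rng.choices(genotypes, weights=[proposal[g] for g in genotypes], k=n)
    # Weight = unnormalized target density / proposal density.
    weights = [target_unnorm[g] / proposal[g] for g in samples]
    return self_normalized_estimate(samples, weights, lambda g: g == "Aa")
```

Calling `demo()` should return a value close to 0.5, the normalized target probability of the heterozygote in this toy setting.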

2.
Du FX, Hoeschele I. Genetics 2000, 156(4):2051-2062
Elimination of genotypes or alleles for each individual or meiosis, which are inconsistent with observed genotypes, is a component of various genetic analyses of complex pedigrees. Computational efficiency of the elimination algorithm is critical in some applications such as genotype sampling via descent graph Markov chains. We present an allele elimination algorithm and two genotype elimination algorithms for complex pedigrees with incomplete genotype data. We modify all three algorithms to incorporate inheritance restrictions imposed by a complete or incomplete descent graph such that every inconsistent complete descent graph is detected in any pedigree, and every inconsistent incomplete descent graph is detected in any pedigree without loops with the genotype elimination algorithms. Allele elimination requires less CPU time and memory, but does not always eliminate all inconsistent alleles, even in pedigrees without loops. The first genotype algorithm produces genotype lists for each individual, which are identical to those obtained from the Lange-Goradia algorithm, but exploits the half-sib structure of some populations and reduces CPU time. The second genotype elimination algorithm deletes more inconsistent genotypes in pedigrees with loops and detects more illegal, incomplete descent graphs in such pedigrees.
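As a rough illustration of what genotype elimination means, the following sketch applies one elimination pass to a single nuclear family, in the spirit of the Lange-Goradia approach mentioned above. It is not the algorithm of this paper: the function names, the representation of genotypes as sorted tuples of allele labels, and the restriction to one family are assumptions. In a full pedigree, such a pass would typically be repeated over all nuclear families until no genotype list changes.

```python
from itertools import product


def child_genotypes(father, mother):
    """All unordered genotypes a child can inherit from this parental pair."""
    return {tuple(sorted((a, b))) for a in father for b in mother}


def eliminate_nuclear_family(father_list, mother_list, children_lists):
    """One elimination pass on a single nuclear family.

    Each argument holds candidate genotypes as sorted tuples, e.g. ("A", "B").
    A parental genotype pair is kept only if every child has at least one
    candidate genotype that the pair can produce; child genotypes are kept
    only if some surviving parental pair can produce them.
    """
    keep_f, keep_m = set(), set()
    keep_c = [set() for _ in children_lists]
    for f, m in product(father_list, mother_list):
        possible = child_genotypes(f, m)
        compatible = [possible & set(c) for c in children_lists]
        if all(compatible):  # an empty set is falsy: that child rules out the pair
            keep_f.add(f)
            keep_m.add(m)
            for kept, comp in zip(keep_c, compatible):
                kept |= comp
    return keep_f, keep_m, keep_c
```

For example, with an untyped father (all three genotypes at a two-allele locus), a mother typed `("A", "A")`, and one child typed `("A", "B")`, the pass removes `("A", "A")` from the father's list, since he must carry a `B` allele.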

3.
Methods for detecting Quantitative Trait Loci (QTL) without markers have generally used iterative peeling algorithms for determining genotype probabilities. These algorithms have considerable shortcomings in complex pedigrees. A Monte Carlo Markov chain (MCMC) method which samples the pedigree of the whole population jointly is described. Simultaneous sampling of the pedigree was achieved by sampling descent graphs using the Metropolis-Hastings algorithm. A descent graph describes the inheritance state of each allele and provides pedigrees guaranteed to be consistent with Mendelian sampling. Sampling descent graphs overcomes most, if not all, of the limitations incurred by iterative peeling algorithms. The algorithm was able to find the QTL in most of the simulated populations. However, when the QTL was not modeled or not found, its effect was ascribed to the polygenic component. No QTL were detected when they were not simulated.

4.
M C Bink, J A Van Arendonk. Genetics 1999, 151(1):409-420
Augmentation of marker genotypes for ungenotyped individuals is implemented in a Bayesian approach via the use of Markov chain Monte Carlo techniques. Marker data on relatives and phenotypes are combined to compute conditional posterior probabilities for marker genotypes of ungenotyped individuals. The presented procedure allows the analysis of complex pedigrees with ungenotyped individuals to detect segregating quantitative trait loci (QTL). Allelic effects at the QTL were assumed to follow a normal distribution with a covariance matrix based on known QTL position and identity by descent probabilities derived from flanking markers. The Bayesian approach estimates variance due to the single QTL, together with polygenic and residual variance. The method was empirically tested through analyzing simulated data from a complex granddaughter design. Ungenotyped dams were related to one or more sons or grandsires in the design. Heterozygosity of the marker loci and size of QTL were varied. Simulation results indicated a significant increase in power when ungenotyped dams were included in the analysis.

5.
A heuristic algorithm for finding gene transmission patterns on large and complex pedigrees with partially observed genotype data is proposed. The method can be used to generate an initial point for a Markov chain Monte Carlo simulation or to check that the given pedigree and the genotype data are consistent. In small pedigrees, the algorithm is exact by exhaustively enumerating all possibilities, but in large pedigrees with a considerable amount of unknown data, only a subset of promising configurations can actually be checked. For that purpose, the configurations are ordered by combining the approximate conditional probability distribution of the unknown genotypes with the information on the relationships between individuals. We also introduce a way to divide the task into subparts, which has been shown to be useful in large pedigrees. The algorithm has been implemented in a program called APE (Allelic Path Explorer) and tested in three different settings with good results.

6.
Gene content is the number of copies of a particular allele in the genotype of an animal. Gene content can be used to study additive gene action of a candidate gene. Usually genotype data are available only for part of the population, and for the rest gene contents have to be calculated based on typed relatives. Methods to calculate expected gene content for animals on large complex pedigrees are relatively complex. In this paper we propose a practical method to calculate gene content using a linear regression. The method does not estimate genotype probabilities, but these can be approximated from gene content assuming Hardy-Weinberg proportions. The approach was compared with other methods on multiple simulated data sets for real bovine pedigrees of 1 082 and 907 903 animals. Different allelic frequencies (0.4 and 0.2) and proportions of the missing genotypes (90, 70, and 50%) were considered in simulation. The simulation showed that the proposed method has similar capability to predict gene content as the iterative peeling method; however, it requires less time and can be more practical for large pedigrees. The method was also applied to real data on the bovine myostatin locus on a large dual-purpose Belgian Blue pedigree of 235 133 animals. It was demonstrated that the proposed method can be easily adapted for particular pedigrees.
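The two quantities named in this abstract can be sketched in a few lines. The abstract does not spell out how genotype probabilities are approximated from gene content, so the mapping below is one plausible reading, stated as an assumption: half the predicted gene content is treated as an individual-level allele probability, and Hardy-Weinberg proportions are applied to it. The function names are hypothetical.

```python
def gene_content(genotype, allele):
    """Number of copies (0, 1 or 2) of `allele` in a diploid genotype."""
    if len(genotype) != 2:
        raise ValueError("a diploid genotype must have exactly two alleles")
    return sum(1 for a in genotype if a == allele)


def hw_genotype_probs(predicted_gene_content):
    """Approximate genotype probabilities from a predicted gene content.

    Assumption for illustration: p = gene content / 2 is used as the allele
    probability, and Hardy-Weinberg proportions give the probabilities of
    carrying 2, 1 or 0 copies of the allele.
    """
    if not 0.0 <= predicted_gene_content <= 2.0:
        raise ValueError("gene content must lie between 0 and 2")
    p = predicted_gene_content / 2.0
    return {2: p * p, 1: 2.0 * p * (1.0 - p), 0: (1.0 - p) ** 2}
```

For instance, a predicted gene content of 1.0 maps to probabilities 0.25, 0.5 and 0.25 for two, one and zero copies under this assumption.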

7.
Haplotyping in pedigrees provides valuable information for genetic studies (e.g., linkage analysis and association studies). In order to identify a set of haplotype configurations with the highest likelihoods for a large pedigree with a large number of linked loci, in our previous work, we proposed a conditional enumeration haplotyping method which sets a threshold for the conditional probabilities of the possible ordered genotypes at every unordered individual-marker to delete some ordered genotypes with low conditional probabilities and then eliminate some haplotype configurations with low likelihoods. In this article we present a rapid haplotyping algorithm based on a modification of our previous method by setting an additional threshold for the ratio of the conditional probability of a haplotype configuration to the largest conditional probability of all haplotype configurations in order to eliminate those configurations with relatively low conditional probabilities. The new algorithm is much more efficient than our previous method and the widely used software SimWalk2.

8.
Conditional probability methods for haplotyping in pedigrees   Total citations: 3; self-citations: 0; citations by others: 3
Gao G, Hoeschele I, Sorensen P, Du F. Genetics 2004, 167(4):2055-2065
Efficient haplotyping in pedigrees is important for the fine mapping of quantitative trait loci (QTL) or complex disease genes. To reconstruct haplotypes efficiently for a large pedigree with a large number of linked loci, two algorithms based on conditional probabilities and likelihood computations are presented. The first algorithm (the conditional probability method) produces a single, approximately optimal haplotype configuration, with computing time increasing linearly in the number of linked loci and the pedigree size. The other algorithm (the conditional enumeration method) identifies a set of haplotype configurations with high probabilities conditional on the observed genotype data for a pedigree. Its computing time increases less than exponentially with the size of a subset of the set of person-loci with unordered genotypes and linearly with its complement. The size of the subset is controlled by a threshold parameter. The set of identified haplotype configurations can be used to estimate the identity-by-descent (IBD) matrix at a map position for a pedigree. The algorithms have been tested on published and simulated data sets. The new haplotyping methods are much faster and provide more information than several existing stochastic and rule-based methods. The accuracies of the new methods are equivalent to or better than those of these existing methods.

9.
We propose an analytical approximation method for the estimation of multipoint identity by descent (IBD) probabilities in pedigrees containing a moderate number of distantly related individuals. We show that in large pedigrees where cases are related through untyped ancestors only, it is possible to formulate the hidden Markov model of the Lander-Green algorithm in terms of the IBD configurations of the cases. We use a first-order Markov approximation to model the changes in this IBD-configuration variable along the chromosome. In simulated and real data sets, we demonstrate that estimates of parametric and nonparametric linkage statistics based on the first-order Markov approximation are accurate. The computation time is exponential in the number of cases instead of in the number of meioses separating the cases. We have implemented our approach in the computer program ALADIN (accurate linkage analysis of distantly related individuals). ALADIN can be applied to general pedigrees and marker types and has the ability to model marker-marker linkage disequilibrium with a clustered-markers approach. Using ALADIN is straightforward: It requires no parameters to be specified and accepts standard input files.

10.
Albers CA, Heskes T, Kappen HJ. Genetics 2007, 177(2):1101-1116
We present CVMHAPLO, a probabilistic method for haplotyping in general pedigrees with many markers. CVMHAPLO reconstructs the haplotypes by assigning in every iteration a fixed number of the ordered genotypes with the highest marginal probability, conditioned on the marker data and ordered genotypes assigned in previous iterations. CVMHAPLO makes use of the cluster variation method (CVM) to efficiently estimate the marginal probabilities. We focused on single-nucleotide polymorphism (SNP) markers in the evaluation of our approach. In simulated data sets where exact computation was feasible, we found that the accuracy of CVMHAPLO was high and similar to that of maximum-likelihood methods. In simulated data sets where exact computation of the maximum-likelihood haplotype configuration was not feasible, the accuracy of CVMHAPLO was similar to that of state-of-the-art Markov chain Monte Carlo (MCMC) maximum-likelihood approximations when all ordered genotypes were assigned and higher when only a subset of the ordered genotypes was assigned. CVMHAPLO was faster than the MCMC approach and provided more detailed information about the uncertainty in the inferred haplotypes. We conclude that CVMHAPLO is a practical tool for the inference of haplotypes in large complex pedigrees.

11.
A method was derived to estimate effects of quantitative trait loci (QTL) using incomplete genotype information in large outbreeding populations with complex pedigrees. The method accounts for background genes by estimating polygenic effects. The basic equations used are very similar to the usual linear mixed model equations for polygenic models, and segregation analysis was used to estimate the probabilities of the QTL genotypes for each animal. Method R was used to estimate the polygenic heritability simultaneously with the QTL effects. Also, initial allele frequencies were estimated. The method was tested in a simulated data set of 10,000 animals evenly distributed over 10 generations, where 0, 400 or 10,000 animals were genotyped for a candidate gene. In the absence of selection, the bias of the QTL estimates was <2%. Selection biased the estimate of the Aa genotype slightly, when zero animals were genotyped. Estimates of the polygenic heritability were 0.251 and 0.257, in absence and presence of selection, respectively, while the simulated value was 0.25. Although not tested in this study, marker information could be accommodated by adjusting the transmission probabilities of the genotypes from parent to offspring according to the marker information. This renders a QTL mapping study in large multi-generation pedigrees possible.

12.
Detection and Integration of Genotyping Errors in Statistical Genetics   Total citations: 15; self-citations: 0; citations by others: 15
Detection of genotyping errors and integration of such errors in statistical analysis are relatively neglected topics, given their importance in gene mapping. A few inopportunely placed errors, if ignored, can tremendously affect evidence for linkage. The present study takes a fresh look at the calculation of pedigree likelihoods in the presence of genotyping error. To accommodate genotyping error, we present extensions to the Lander-Green-Kruglyak deterministic algorithm for small pedigrees and to the Markov-chain Monte Carlo stochastic algorithm for large pedigrees. These extensions can accommodate a variety of error models and refrain from simplifying assumptions, such as allowing, at most, one error per pedigree. In principle, almost any statistical genetic analysis can be performed taking errors into account, without actually correcting or deleting suspect genotypes. Three examples illustrate the possibilities. These examples make use of the full pedigree data, multiple linked markers, and a prior error model. The first example is the estimation of genotyping error rates from pedigree data. The second, and currently most useful, example is the computation of posterior mistyping probabilities. These probabilities cover both Mendelian-consistent and Mendelian-inconsistent errors. The third example is the selection of the true pedigree structure connecting a group of people from among several competing pedigree structures. Paternity testing and twin zygosity testing are typical applications.

13.
An increased availability of genotypes at marker loci has prompted the development of models that include the effect of individual genes. Selection based on these models is known as marker-assisted selection (MAS). MAS is known to be efficient especially for traits that have low heritability and non-additive gene action. BLUP methodology under non-additive gene action is not feasible for large inbred or crossbred pedigrees. It is easy to incorporate non-additive gene action in a finite locus model. Under such a model, the unobservable genotypic values can be predicted using the conditional mean of the genotypic values given the data. To compute this conditional mean, conditional genotype probabilities must be computed. In this study these probabilities were computed using iterative peeling and three Markov chain Monte Carlo (MCMC) methods: scalar Gibbs, blocking Gibbs, and a sampler that combines the Elston-Stewart algorithm with iterative peeling (ESIP). The performance of these four methods was assessed using simulated data. For pedigrees with loops, iterative peeling fails to provide accurate genotype probability estimates for some pedigree members. Also, computing time is exponentially related to the number of loci in the model. For MCMC methods, a linear relationship can be maintained by sampling genotypes one locus at a time. Of the three MCMC methods considered, ESIP performed the best while scalar Gibbs performed the worst.

14.
This paper describes a method for predicting additive effects of a cluster of tightly linked QTLs for outbred populations of animals in the situation where the QTLs are located on a chromosome segment surrounded by multiple linked DNA markers. We present a mixed model method for best linear unbiased prediction (conditional on the marker data) of the additive effects of the QTL-cluster and of the remaining QTLs unlinked to the marker linkage group. This method takes into consideration the identity-by-descent proportion (IBDP) for the particular chromosomal segment, in contrast to some other methods which use IBD probabilities at one specific location. In this method, fully informative data on different flanking markers are used to calculate the values of the expectations of the IBDPs (EIBDPs) between gametes for animals to be evaluated. Then the expected values are used as the elements of the gametic relationship matrix required in the best linear unbiased prediction. Giving a small numerical example, we illustrate how the present method can be used for the prediction of the QTL-cluster effects and for genetic evaluation of animals in outbred populations. A computational strategy is discussed on the basis of the calculation of the EIBDPs and the inverted gametic relationship matrix in complex pedigrees.

15.
An algorithm for drawing large, complex pedigrees containing inbred loops and multiple-mate families is presented. The algorithm is based on a step-by-step approach to imaging, when the researcher determines the direction of further extension of the scheme. The algorithm is implemented as the PedigreeQuery software package written in Java. The software has a convenient graphical interface. The software package permits constructing not only whole pedigrees, but also their fragments that are particularly interesting for research. It also allows for adding new information on the phenotypes and genotypes of pedigree members. PedigreeQuery is distributed free of charge; it is available at http://mga.bionet.msc.ru/PedigreeQuery/PedigreeQuery.html and ftp://mga.bionet.nsc.ru/PedigreeQuery/.

16.
The two alleles an individual carries at a locus are identical by descent (ibd) if they have descended from a single ancestral allele in a reference population, and the probability of such identity is the inbreeding coefficient of the individual. Inbreeding coefficients can be predicted from pedigrees with founders constituting the reference population, but estimation from genetic data is not possible without data from the reference population. Most inbreeding estimators that make explicit use of sample allele frequencies as estimates of allele probabilities in the reference population are confounded by average kinships with other individuals. This means that the ranking of those estimates depends on the scope of the study sample, and we show the variation in rankings for common estimators applied to different subdivisions of 1000 Genomes data. Allele-sharing estimators of within-population inbreeding relative to average kinship in a study sample, however, do have invariant rankings across all studies including those individuals. They are unbiased with a large number of SNPs. We discuss how allele-sharing estimates are the relevant quantities for a range of empirical applications. Subject terms: Population genetics, Evolutionary biology, Molecular ecology
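The pedigree-based prediction mentioned in this abstract (founders as the reference population) can be sketched with the standard recursive kinship definition, where an individual's inbreeding coefficient equals the kinship between its parents. This is a textbook recursion, not the allele-sharing estimator discussed in the paper. The function name and the input layout (a dict mapping each individual to a `(sire, dam)` tuple, with `None` for unknown parents and parents listed before their offspring) are assumptions for illustration.

```python
from functools import lru_cache


def inbreeding_coefficients(pedigree):
    """Pedigree inbreeding coefficients, with founders as the reference population.

    `pedigree` maps individual -> (sire, dam); None marks an unknown parent.
    Assumption: every parent appears as a key before any of its offspring.
    """
    order = {ind: k for k, ind in enumerate(pedigree)}

    @lru_cache(maxsize=None)
    def kinship(a, b):
        if a is None or b is None:
            return 0.0  # unknown parents are treated as unrelated and non-inbred
        if a == b:
            sire, dam = pedigree[a]
            return 0.5 * (1.0 + kinship(sire, dam))
        if order[a] < order[b]:
            a, b = b, a  # always recurse through the later-born individual
        sire, dam = pedigree[a]
        return 0.5 * (kinship(sire, b) + kinship(dam, b))

    # F of an individual is the kinship between its two parents.
    return {ind: kinship(*pedigree[ind]) for ind in pedigree}
```

As a check on the recursion, an offspring of a full-sib mating between the children of two unrelated founders has an inbreeding coefficient of 0.25.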

17.
Kirichenko AV. Genetika 2004, 40(10):1425-1428
An algorithm for drawing large, complex pedigrees containing inbred loops and multiple-mate families is presented. The algorithm is based on a step-by-step approach to imaging, when the researcher determines the direction of further extension of the scheme. The algorithm is implemented as the PedigreeQuery software package written in Java. The software has a convenient graphical interface. The software package permits constructing not only whole pedigrees, but also their fragments that are particularly interesting for research. It also allows for adding new information on the phenotypes and genotypes of pedigree members. PedigreeQuery is distributed free of charge; it is available at http://mga.bionet.msc.ru/PedigreeQuery/PedigreeQuery.html and ftp://mga.bionet.msc.ru/PedigreeQuery/.

18.
Here, we introduce the idea of probabilities of line origins for alleles in general pedigrees as found in crosses between outbred lines. We also present software for calculating these probabilities. The proposed algorithm is based on the linear regression method of Haley, Knott and Elsen (1994) combined with the Markov chain Monte Carlo (MCMC) method for estimating quantitative trait locus coefficients used as regressors. We compared the relative precision of our method and the original method as proposed by Haley et al. (1994). The scenarios studied varied in the allelic distribution of marker alleles in parental lines and in the frequency of missing marker genotypes. We found that the MCMC method achieves a higher accuracy in all scenarios considered. The benefits of using MCMC approximation are substantial if the frequency of missing marker data is high or the number of marker alleles is low and the allelic frequency distribution is similar in both parental lines.

19.
Founder-origin probability methods are used to trace specific chromosomal segments in individual offspring. A haplotypic method was developed for calculating founder-origin probabilities in three-generation outbred pedigrees suited to quantitative trait locus (QTL) analysis. Estimators for expected founder-origin proportions were derived for a linkage group segment, an entire linkage group and a complete haplotype. If the founders are truly outbred, the haplotypic method gives a close approximation when compared with the Haley et al. (1994) method that simultaneously uses all marker information for QTL analysis, and it is less computationally demanding. The chief limitation of the haplotypic method is that some information in two-allele intercross marker-type configurations is ignored. Informativeness of marker arrays is discussed in the framework of founder-origin probabilities and proportions. The haplotypic method can be extended to more complex pedigrees with additional generations.

20.
Sobel E, Sengul H, Weeks DE. Human Heredity 2001, 52(3):121-131
OBJECTIVES: To describe, implement, and test an efficient algorithm to obtain multipoint identity-by-descent (IBD) probabilities at arbitrary positions among marker loci for general pedigrees. Unlike existing programs, our algorithm can analyze data sets with large numbers of people and markers. The algorithm has been implemented in the SimWalk2 computer package. METHODS: Using a rigorous testing regimen containing five pedigrees of various sizes with realistic marker data, we compared several widely used IBD computation programs: Allegro, Aspex, GeneHunter, MapMaker/Sibs, Mendel, Sage, SimWalk2, and Solar. RESULTS: The testing revealed a few discrepancies, particularly on consanguineous pedigrees, but overall excellent results in the deterministic multipoint packages. SimWalk2 was also found to be in good agreement with the deterministic multipoint programs, usually matching to two decimal places the kinship coefficient that ranges from 0 to 1. However, the packages based on single-point IBD estimation, while consistent with each other, often showed poor results, disagreeing with the multipoint kinship results by as much as 0.5. CONCLUSIONS: Our testing has clearly shown that multipoint IBD estimation is much better than single-point estimation. In addition, our testing has validated our algorithm for estimating IBD probabilities at arbitrary positions on general pedigrees.
