首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
Du FX  Hoeschele I 《Genetics》2000,156(4):2051-2062
Elimination of genotypes or alleles for each individual or meiosis, which are inconsistent with observed genotypes, is a component of various genetic analyses of complex pedigrees. Computational efficiency of the elimination algorithm is critical in some applications such as genotype sampling via descent graph Markov chains. We present an allele elimination algorithm and two genotype elimination algorithms for complex pedigrees with incomplete genotype data. We modify all three algorithms to incorporate inheritance restrictions imposed by a complete or incomplete descent graph such that every inconsistent complete descent graph is detected in any pedigree, and every inconsistent incomplete descent graph is detected in any pedigree without loops with the genotype elimination algorithms. Allele elimination requires less CPU time and memory, but does not always eliminate all inconsistent alleles, even in pedigrees without loops. The first genotype algorithm produces genotype lists for each individual, which are identical to those obtained from the Lange-Goradia algorithm, but exploits the half-sib structure of some populations and reduces CPU time. The second genotype elimination algorithm deletes more inconsistent genotypes in pedigrees with loops and detects more illegal, incomplete descent graphs in such pedigrees.  相似文献   

2.
An increased availability of genotypes at marker loci has prompted the development of models that include the effect of individual genes. Selection based on these models is known as marker-assisted selection (MAS). MAS is known to be efficient especially for traits that have low heritability and non-additive gene action. BLUP methodology under non-additive gene action is not feasible for large inbred or crossbred pedigrees. It is easy to incorporate non-additive gene action in a finite locus model. Under such a model, the unobservable genotypic values can be predicted using the conditional mean of the genotypic values given the data. To compute this conditional mean, conditional genotype probabilities must be computed. In this study these probabilities were computed using iterative peeling, and three Markov chain Monte Carlo (MCMC) methods – scalar Gibbs, blocking Gibbs, and a sampler that combines the Elston Stewart algorithm with iterative peeling (ESIP). The performance of these four methods was assessed using simulated data. For pedigrees with loops, iterative peeling fails to provide accurate genotype probability estimates for some pedigree members. Also, computing time is exponentially related to the number of loci in the model. For MCMC methods, a linear relationship can be maintained by sampling genotypes one locus at a time. Out of the three MCMC methods considered, ESIP, performed the best while scalar Gibbs performed the worst.  相似文献   

3.
No exact method for determining genotypic and identity-by-descent probabilities is available for large complex pedigrees. Approximate methods for such pedigrees cannot be guaranteed to be unbiased. A new method is proposed that uses the Metropolis-Hastings algorithm to sample a Markov chain of descent graphs which fit the pedigree and known genotypes. Unknown genotypes are determined from each descent graph. Genotypic probabilities are estimated as their means. The algorithm is shown to be unbiased for small complex pedigrees and feasible and consistent for moderately large complex pedigrees.  相似文献   

4.
Markov chain Monte Carlo (MCMC) methods have been proposed to overcome computational problems in linkage and segregation analyses. This approach involves sampling genotypes at the marker and trait loci. Scalar-Gibbs is easy to implement, and it is widely used in genetics. However, the Markov chain that corresponds to scalar-Gibbs may not be irreducible when the marker locus has more than two alleles, and even when the chain is irreducible, mixing has been observed to be slow. These problems do not arise if the genotypes are sampled jointly from the entire pedigree. This paper proposes a method to jointly sample genotypes. The method combines the Elston-Stewart algorithm and iterative peeling, and is called the ESIP sampler. For a hypothetical pedigree, genotype probabilities are estimated from samples obtained using ESIP and also scalar-Gibbs. Approximate probabilities were also obtained by iterative peeling. Comparisons of these with exact genotypic probabilities obtained by the Elston-Stewart algorithm showed that ESIP and iterative peeling yielded genotypic probabilities that were very close to the exact values. Nevertheless, estimated probabilities from scalar-Gibbs with a chain of length 235 000, including a burn-in of 200 000 steps, were less accurate than probabilities estimated using ESIP with a chain of length 10 000, with a burn-in of 5 000 steps. The effective chain size (ECS) was estimated from the last 25 000 elements of the chain of length 125 000. For one of the ESIP samplers, the ECS ranged from 21 579 to 22 741, while for the scalar-Gibbs sampler, the ECS ranged from 64 to 671. Genotype probabilities were also estimated for a large real pedigree consisting of 3 223 individuals. For this pedigree, it is not feasible to obtain exact genotype probabilities by the Elston-Stewart algorithm. ESIP and iterative peeling yielded very similar results. However, results from scalar-Gibbs were less accurate.  相似文献   

5.
Simulation of genotypes in pedigrees is an important tool to evaluate the power of a linkage or an association study and to assess the empirical significance of results. SLINK is a widely-used package for pedigree simulations, but its implementation has not previously been described in a published paper. SLINK was initially derived from the LINKAGE programs. Over the 20 years since its release, SLINK has been modified to incorporate faster algorithms, notably from the linkage analysis package FASTLINK, also derived from LINKAGE. While SLINK can simulate genotypes on pedigrees of high complexity, one limitation of SLINK, as with most methods based on peeling algorithms to evaluate pedigree likelihoods, is the small number of linked markers that can be generated. The software package SUP includes an elegant wrapper for SLINK that circumvents the limitation on number of markers by using descent markers generated by SLINK to simulate a much larger number of markers on the same chromosome, linked and possibly associated with a trait locus. We have released new coordinated versions of SLINK (3.0; available from http://watson.hgen.pitt.edu) and SUP (v090804; available from http://mlemire.freeshell.org/software or http://watson.hgen.pitt.edu) that integrate the two software packages. Thereby, we have removed some of the previous limitations on the joint functionality of the programs, such as the number of founders in a pedigree. We review the history of SLINK and describe how SLINK and SUP are now coordinated to permit the simulation of large numbers of markers linked and possibly associated with a trait in large pedigrees.  相似文献   

6.
Multipoint quantitative-trait linkage analysis in general pedigrees.   总被引:49,自引:12,他引:37       下载免费PDF全文
Multipoint linkage analysis of quantitative-trait loci (QTLs) has previously been restricted to sibships and small pedigrees. In this article, we show how variance-component linkage methods can be used in pedigrees of arbitrary size and complexity, and we develop a general framework for multipoint identity-by-descent (IBD) probability calculations. We extend the sib-pair multipoint mapping approach of Fulker et al. to general relative pairs. This multipoint IBD method uses the proportion of alleles shared identical by descent at genotyped loci to estimate IBD sharing at arbitrary points along a chromosome for each relative pair. We have derived correlations in IBD sharing as a function of chromosomal distance for relative pairs in general pedigrees and provide a simple framework whereby these correlations can be easily obtained for any relative pair related by a single line of descent or by multiple independent lines of descent. Once calculated, the multipoint relative-pair IBDs can be utilized in variance-component linkage analysis, which considers the likelihood of the entire pedigree jointly. Examples are given that use simulated data, demonstrating both the accuracy of QTL localization and the increase in power provided by multipoint analysis with 5-, 10-, and 20-cM marker maps. The general pedigree variance component and IBD estimation methods have been implemented in the SOLAR (Sequential Oligogenic Linkage Analysis Routines) computer package.  相似文献   

7.
Pérez-Enciso M 《Genetics》2003,163(4):1497-1510
We present a Bayesian method that combines linkage and linkage disequilibrium (LDL) information for quantitative trait locus (QTL) mapping. This method uses jointly all marker information (haplotypes) and all available pedigree information; i.e., it is not restricted to any specific experimental design and it is not required that phases are known. Infinitesimal genetic effects or environmental noise ("fixed") effects can equally be fitted. A diallelic QTL is assumed and both additive and dominant effects can be estimated. We have implemented a combined Gibbs/Metropolis-Hastings sampling to obtain the marginal posterior distributions of the parameters of interest. We have also implemented a Bayesian variant of usual disequilibrium measures like D' and r(2) between QTL and markers. We illustrate the method with simulated data in "simple" (two-generation full-sib families) and "complex" (four-generation) pedigrees. We compared the estimates with and without using linkage disequilibrium information. In general, using LDL resulted in estimates of QTL position that were much better than linkage-only estimates when there was complete disequilibrium between the mutant QTL allele and the marker. This advantage, however, decreased when the association was only partial. In all cases, additive and dominant effects were estimated accurately either with or without disequilibrium information.  相似文献   

8.
Conditional probability methods for haplotyping in pedigrees   总被引:3,自引:0,他引:3  
Gao G  Hoeschele I  Sorensen P  Du F 《Genetics》2004,167(4):2055-2065
Efficient haplotyping in pedigrees is important for the fine mapping of quantitative trait locus (QTL) or complex disease genes. To reconstruct haplotypes efficiently for a large pedigree with a large number of linked loci, two algorithms based on conditional probabilities and likelihood computations are presented. The first algorithm (the conditional probability method) produces a single, approximately optimal haplotype configuration, with computing time increasing linearly in the number of linked loci and the pedigree size. The other algorithm (the conditional enumeration method) identifies a set of haplotype configurations with high probabilities conditional on the observed genotype data for a pedigree. Its computing time increases less than exponentially with the size of a subset of the set of person-loci with unordered genotypes and linearly with its complement. The size of the subset is controlled by a threshold parameter. The set of identified haplotype configurations can be used to estimate the identity-by-descent (IBD) matrix at a map position for a pedigree. The algorithms have been tested on published and simulated data sets. The new haplotyping methods are much faster and provide more information than several existing stochastic and rule-based methods. The accuracies of the new methods are equivalent to or better than those of these existing methods.  相似文献   

9.
The introduction of stochastic methods in pedigree analysis has enabled geneticists to tackle computations intractable by standard deterministic methods. Until now these stochastic techniques have worked by running a Markov chain on the set of genetic descent states of a pedigree. Each descent state specifies the paths of gene flow in the pedigree and the founder alleles dropped down each path. The current paper follows up on a suggestion by Elizabeth Thompson that genetic descent graphs offer a more appropriate space for executing a Markov chain. A descent graph specifies the paths of gene flow but not the particular founder alleles traveling down the paths. This paper explores algorithms for implementing Thompson's suggestion for codominant markers in the context of automatic haplotyping, estimating location scores, and computing gene-clustering statistics for robust linkage analysis. Realistic numerical examples demonstrate the feasibility of the algorithms.  相似文献   

10.
Recent technological development in genetics has made large-scale marker genotyping fast and practicable, facilitating studies for detection of QTL in large general pedigrees. We developed a method that speeds up restricted maximum-likelihood (REML) algorithms for QTL analysis by simplifying the inversion of the variance-covariance matrix of the trait vector. The method was tested in an experimental chicken pedigree including 767 phenotyped individuals and 14 genotyped markers on chicken chromosome 1. The computation time in a chromosome scan covering 475 cM was reduced by 43% when the analysis was based on linkage only and by 72% when linkage disequilibrium information was included. The relative advantage of using our method increases with pedigree size, marker density, and linkage disequilibrium, indicating even greater improvements in the future.  相似文献   

11.
Gene content is the number of copies of a particular allele in a genotype of an animal. Gene content can be used to study additive gene action of candidate gene. Usually genotype data are available only for a part of population and for the rest gene contents have to be calculated based on typed relatives. Methods to calculate expected gene content for animals on large complex pedigrees are relatively complex. In this paper we proposed a practical method to calculate gene content using a linear regression. The method does not estimate genotype probabilities but these can be approximated from gene content assuming Hardy-Weinberg proportions. The approach was compared with other methods on multiple simulated data sets for real bovine pedigrees of 1 082 and 907 903 animals. Different allelic frequencies (0.4 and 0.2) and proportions of the missing genotypes (90, 70, and 50%) were considered in simulation. The simulation showed that the proposed method has similar capability to predict gene content as the iterative peeling method, however it requires less time and can be more practical for large pedigrees. The method was also applied to real data on the bovine myostatin locus on a large dual-purpose Belgian Blue pedigree of 235 133 animals. It was demonstrated that the proposed method can be easily adapted for particular pedigrees.  相似文献   

12.
Markov chain Monte Carlo (MCMC) methods have been proposed to overcome computational problems in linkage and segregation analyses. This approach involves sampling genotypes at the marker and trait loci. Among MCMC methods, scalar-Gibbs is the easiest to implement, and it is used in genetics. However, the Markov chain that corresponds to scalar-Gibbs may not be irreducible when the marker locus has more than two alleles, and even when the chain is irreducible, mixing has been observed to be slow. Joint sampling of genotypes has been proposed as a strategy to overcome these problems. An algorithm that combines the Elston-Stewart algorithm and iterative peeling (ESIP sampler) to sample genotypes jointly from the entire pedigree is used in this study. Here, it is shown that the ESIP sampler yields an irreducible Markov chain, regardless of the number of alleles at a locus. Further, results obtained by ESIP sampler are compared with other methods in the literature. Of the methods that are guaranteed to be irreducible, ESIP was the most efficient.  相似文献   

13.
In sexual species, autosomal alleles are transmitted through multigeneration organismal pedigrees via pathways of descent involving both genders. Here, models assess the sampling properties of these gender-described transmission pathways. An isolation-by-distance model of mating was used to construct a series of computer population pedigrees by systematically varying neighborhood size and the timing of isolation events in sundered populations. For each known pedigree, a matrix of true coancestry coefficients between all individuals in the final generation was calculated and compared (using cophenetic correlations) to mean pairwise times to common ancestry as estimated by sampling varying numbers of gender-defined lineage routes available to individual alleles through that pedigree. When few lineage routes were sampled, agreement between the estimated and the true pedigree was poor and showed a large variance. Agreement improved as more lineage routes were incorporated and asymptotically approached plateau levels predictably relatable to the magnitude of population structure. Results underscore a distinction between the composite genealogical information in a population pedigree and the subsets of that information registered in allelic lineage pathways.  相似文献   

14.
Understanding and correctly utilizing relatedness among samples is essential for genetic analysis; however, managing sample records and pedigrees can often be error prone and incomplete. Data sets ascertained by random sampling often harbor cryptic relatedness that can be leveraged in genetic analyses for maximizing power. We have developed a method that uses genome-wide estimates of pairwise identity by descent to identify families and quickly reconstruct and score all possible pedigrees that fit the genetic data by using up to third-degree relatives, and we have included it in the software package PRIMUS (Pedigree Reconstruction and Identification of the Maximally Unrelated Set). Here, we validate its performance on simulated, clinical, and HapMap pedigrees. Among these samples, we demonstrate that PRIMUS can verify reported pedigree structures and identify cryptic relationships. Finally, we show that PRIMUS reconstructed pedigrees, all of which were previously unknown, for 203 families from a cohort collected in Starr County, TX (1,890 samples).  相似文献   

15.

Background  

Identity by descent (IBD) matrix estimation is a central component in mapping of Quantitative Trait Loci (QTL) using variance component models. A large number of algorithms have been developed for estimation of IBD between individuals in populations at discrete locations in the genome for use in genome scans to detect QTL affecting various traits of interest in experimental animal, human and agricultural pedigrees. Here, we propose a new approach to estimate IBD as continuous functions rather than as discrete values.  相似文献   

16.
The problem of ascertainment in segregation analysis arises when families are selected for study through ascertainment of affected individuals. In this case, ascertainment must be corrected for in data analysis. However, methods for ascertainment correction are not available for many common sampling schemes, e.g., sequential sampling of extended pedigrees (except in the case of "single" selection). Concerns about whether ascertainment correction is even required for large pedigrees, about whether and how multiple probands in the same pedigree can be taken into account properly, and about how to apply sequential sampling strategies have occupied many investigators in recent years. We address these concerns by reconsidering a central issue, namely, how to handle pedigree structure (including size). We introduce a new distinction, between sampling in such a way that observed pedigree structure does not depend on which pedigree members are probands (proband-independent [PI] sampling) and sampling in such a way that observed pedigree structure does depend on who are the probands (proband-dependent [PD] sampling). This distinction corresponds roughly (but not exactly) to the distinction between fixed-structure and sequential sampling. We show that conditioning on observed pedigree structure in ascertained data sets obtained under PD sampling is not in general correct (with the exception of "single" selection), while PI sampling of pedigree structures larger than simple sibships is generally not possible. Yet, in practice one has little choice but to condition on observed pedigree structure. We conclude that the problem of genetic modeling in ascertained data sets is, in most situations, literally intractable. We recommend that future efforts focus on the development of robust approximate approaches to the problem.  相似文献   

17.
Regions of the genome influencing wood and fibre traits in Eucalyptus globulus Labill. have been identified in two full-sib pedigrees that share a common male parent. The first pedigree, cross A, contains 148 progeny, and the second pedigree, cross B, contains 135 progeny. Subsets of progeny of these two controlled crosses were planted at seven sites throughout Australia in 1990. Wood cores were taken at 0.9 m above ground in 1997, and wood and fibre traits were analysed for each individual. Three quantitative trait loci (QTL) affecting wood density, one QTL affecting pulp yield and one QTL affecting microfibril angle have been located in both pedigrees, using single-factor analysis of variance. Other QTLs affecting these traits, as well as fibre length and cellulose content were located in cross A only.  相似文献   

18.
In livestock populations, missing genotypes on a large proportion of the animals is a major problem when implementing gene-assisted breeding value estimation for genes with known effect. The objective of this study was to compare different methods to deal with missing genotypes on accuracy of gene-assisted breeding value estimation for identified bi-allelic genes using Monte Carlo simulation. A nested full-sib half-sib structure was simulated with a mixed inheritance model with one bi-allelic quantitative trait loci (QTL) and a polygenic effect due to infinite number of polygenes. The effect of the QTL was included in gene-assisted BLUP either by random regression on predicted gene content, i.e. the number of positive alleles, or including haplotype effects in the model with an inverse IBD matrix to account for identity-by-descent relationships between haplotypes using linkage analysis information (IBD-LA). The inverse IBD matrix was constructed using segregation indicator probabilities obtained from multiple marker iterative peeling. Gene contents for unknown genotypes were predicted using either multiple marker iterative peeling or mixed model methodology. For both methods, gene-assisted breeding value estimation increased accuracies of total estimated breeding value (EBV) with 0% to 22% for genotyped animals in comparison to conventional breeding value estimation. For animals that were not genotyped, the increase in accuracy was much lower (0% to 5%), but still substantial when the heritability was 0.1 and when the QTL explained at least 15% of the genetic variance. Regression on predicted gene content yielded higher accuracies than IBD-LA. Allele substitution effects were, however, overestimated, especially when only sires and males in the last generation were genotyped. For juveniles without phenotypic records and traits measured only on females, the superiority of regression on gene content over IBD-LA was larger than when all animals had phenotypes. Missing gene contents were predicted with higher accuracy using multiple-marker iterative peeling than with using mixed model methodology, but the difference in accuracy of total EBV was negligible and mixed model methodology was computationally much faster than multiple iterative peeling. For large livestock populations it can be concluded that gene-assisted breeding value estimation can be practically best performed by regression on gene contents, using mixed model methodology to predict missing marker genotypes, combining phenotypic information of genotyped and ungenotyped animals in one evaluation. This technique would be, in principle, also feasible for genomic selection. It is expected that genomic selection for ungenotyped animals using predicted single nucleotide polymorphism gene contents might be beneficial especially for low heritable traits.  相似文献   

19.
Optimizing quantitative trait locus (QTL) mapping experiments requires a generalized measure of marker informativeness because variable information is obtained from different marker systems, marker distribution and pedigree types. Such a measure can be derived from the concept of Shannon entropy, a central concept in information theory. Here we introduce entropy-based founder informativeness (EFI), a new measure of information content generalized across pedigrees, maps, marker systems and mating configurations. We derived equations for inbred- and outbred-derived mapping populations. Mathematical properties of EFI include enhanced sensitivity to mapping population type and extension to any number of founders. To illustrate the use of EFI, we compared experimental designs for QTL mapping for three examples: (i) different marker systems for an F2 pedigree, (ii) different marker densities and sampling sizes for a BC1 pedigree and (iii) a comparison of haplotypic versus zygotic analyses of an outbred pedigree. As an a priori generalized measure of information content, EFI does not require phenotypic data for optimizing experimental designs for QTL mapping.  相似文献   

20.
随着DNA分子标记技术的迅速发展,QTL定位已成为当前生物学研究领域的前沿。迄今已对许多种动、植物定位了许多重要性状的QTL。这些研究促进了遗传学的发展,并将作为育种的新策略应用。与作物相比,林木QTL定位有其特性。本文详细讨论了林木的生物学特性对QTL定位的影响、QTL定位的系谱设计和统计分析方法。  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号