Similar literature
 20 similar articles found (search time: 15 ms)
1.
Lee SH, Van der Werf JH, Tier B. Genetics, 2005, 171(4): 2063-2072
A linkage analysis for finding inheritance states and haplotype configurations is an essential process for linkage and association mapping. The linkage analysis is routinely based upon observed pedigree information and marker genotypes for individuals in the pedigree. It is not feasible for exact methods to use all such information for a large complex pedigree especially when there are many missing genotypic data. Proposed Markov chain Monte Carlo approaches such as a single-site Gibbs sampler or the meiosis Gibbs sampler are able to handle a complex pedigree with sparse genotypic data; however, they often have reducibility problems, causing biased estimates. We present a combined method, applying the random walk approach to the reducible sites in the meiosis sampler. Therefore, one can efficiently obtain reliable estimates such as identity-by-descent coefficients between individuals based on inheritance states or haplotype configurations, and a wider range of data can be used for mapping of quantitative trait loci within a reasonable time.

2.
An increased availability of genotypes at marker loci has prompted the development of models that include the effect of individual genes. Selection based on these models is known as marker-assisted selection (MAS). MAS is known to be efficient especially for traits that have low heritability and non-additive gene action. BLUP methodology under non-additive gene action is not feasible for large inbred or crossbred pedigrees. It is easy to incorporate non-additive gene action in a finite locus model. Under such a model, the unobservable genotypic values can be predicted using the conditional mean of the genotypic values given the data. To compute this conditional mean, conditional genotype probabilities must be computed. In this study these probabilities were computed using iterative peeling, and three Markov chain Monte Carlo (MCMC) methods: scalar Gibbs, blocking Gibbs, and a sampler that combines the Elston-Stewart algorithm with iterative peeling (ESIP). The performance of these four methods was assessed using simulated data. For pedigrees with loops, iterative peeling fails to provide accurate genotype probability estimates for some pedigree members. Also, computing time is exponentially related to the number of loci in the model. For MCMC methods, a linear relationship can be maintained by sampling genotypes one locus at a time. Out of the three MCMC methods considered, ESIP performed the best while scalar Gibbs performed the worst.

3.
Thompson E, Basu S. Human Heredity, 2003, 56(1-3): 119-125
Our objective is the development of robust methods for assessment of evidence for linkage of loci affecting a complex trait to a marker linkage group, using data on extended pedigrees. Using Markov chain Monte Carlo (MCMC) methods, it is possible to sample realizations from the distribution of gene identity by descent (IBD) patterns on a pedigree, conditional on observed data Y_M at multiple marker loci. Measures of gene IBD, W, which capture joint genome sharing in extended pedigrees often have unknown and highly skewed distributions, particularly when conditioned on marker data. MCMC provides a direct estimate of the distribution of such measures. Let W be the IBD measure from data Y_M, and W* the IBD measure from pseudo-data Y*_M simulated with the same data availability and genetic marker model as the true data Y_M, but in the absence of linkage. Then measures of the difference in distributions of W and W* provide evidence for linkage. This approach extracts more information from the data Y_M than either comparison to the pedigree prior distribution of W or use of statistics that are expectations of W given the data Y_M. A small example is presented.

4.
Markov chain Monte Carlo (MCMC) methods have been widely used to overcome computational problems in linkage and segregation analyses. Many variants of this approach exist and are practiced; among the most popular is the Gibbs sampler. The Gibbs sampler is simple to implement but has (in its simplest form) mixing and reducibility problems; furthermore, in order to initiate a Gibbs sampling chain we need a starting genotypic or allelic configuration which is consistent with the marker data in the pedigree and which has suitable weight in the joint distribution. We outline a procedure for finding such a configuration in pedigrees which have too many loci to allow for exact peeling. We also explain how this technique could be used to implement a blocking Gibbs sampler.

5.
Markov chain Monte Carlo (MCMC) methods have been proposed to overcome computational problems in linkage and segregation analyses. This approach involves sampling genotypes at the marker and trait loci. Among MCMC methods, scalar-Gibbs is the easiest to implement, and it is used in genetics. However, the Markov chain that corresponds to scalar-Gibbs may not be irreducible when the marker locus has more than two alleles, and even when the chain is irreducible, mixing has been observed to be slow. Joint sampling of genotypes has been proposed as a strategy to overcome these problems. An algorithm that combines the Elston-Stewart algorithm and iterative peeling (ESIP sampler) to sample genotypes jointly from the entire pedigree is used in this study. Here, it is shown that the ESIP sampler yields an irreducible Markov chain, regardless of the number of alleles at a locus. Further, results obtained by ESIP sampler are compared with other methods in the literature. Of the methods that are guaranteed to be irreducible, ESIP was the most efficient.

6.
Joint linkage of multiple loci for a complex disorder.   Total citations: 5 (self-citations: 4; citations by others: 1)
Many investigators who have been searching for linkage to complex diseases have by now accumulated a drawer full of negative results. If disease is actually caused by genes at several loci, these data might contain multiple-locus system (MLS) information that the investigator does not realize. Trying to obtain this information formally, through the MLS likelihood, leads to severe computational and statistical difficulties. Therefore, we propose a scheme of inference based on single-locus (SL) statistics, considered jointly. By simulation, we find that the MLS lod score is closely approximated by the sum of SL lod scores. However, we also find that for moderately large systems, say three or four loci, both MLS and SL lod scores are likely to be inconclusive. Nonetheless, MLS can often be detected through the correlation of individual pedigree SL lod scores. Significant correlation is itself evidence of an MLS, because, in the absence of linkage, false-positive lod scores are necessarily random. Under epistasis SL lod scores tend to be positively correlated among pedigrees, while under independent action SL lod scores from high-density samples tend to be negatively correlated.

7.
Markov chain Monte Carlo (MCMC) methods have been proposed to overcome computational problems in linkage and segregation analyses. This approach involves sampling genotypes at the marker and trait loci. Scalar-Gibbs is easy to implement, and it is widely used in genetics. However, the Markov chain that corresponds to scalar-Gibbs may not be irreducible when the marker locus has more than two alleles, and even when the chain is irreducible, mixing has been observed to be slow. These problems do not arise if the genotypes are sampled jointly from the entire pedigree. This paper proposes a method to jointly sample genotypes. The method combines the Elston-Stewart algorithm and iterative peeling, and is called the ESIP sampler. For a hypothetical pedigree, genotype probabilities are estimated from samples obtained using ESIP and also scalar-Gibbs. Approximate probabilities were also obtained by iterative peeling. Comparisons of these with exact genotypic probabilities obtained by the Elston-Stewart algorithm showed that ESIP and iterative peeling yielded genotypic probabilities that were very close to the exact values. Nevertheless, estimated probabilities from scalar-Gibbs with a chain of length 235 000, including a burn-in of 200 000 steps, were less accurate than probabilities estimated using ESIP with a chain of length 10 000, with a burn-in of 5 000 steps. The effective chain size (ECS) was estimated from the last 25 000 elements of the chain of length 125 000. For one of the ESIP samplers, the ECS ranged from 21 579 to 22 741, while for the scalar-Gibbs sampler, the ECS ranged from 64 to 671. Genotype probabilities were also estimated for a large real pedigree consisting of 3 223 individuals. For this pedigree, it is not feasible to obtain exact genotype probabilities by the Elston-Stewart algorithm. ESIP and iterative peeling yielded very similar results. However, results from scalar-Gibbs were less accurate.
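The abstract above does not say which estimator of effective chain size was used. As a rough sketch only, assuming the common autocorrelation-based definition ECS = n / (1 + 2 * sum of lag autocorrelations), the idea could be illustrated as follows. The function name, the truncation rule (stop at the first non-positive autocorrelation) and the simulated chains are all assumptions made for illustration, not part of the cited study.

```python
import random

def effective_chain_size(chain):
    """Estimate ECS = n / (1 + 2 * sum of positive-lag autocorrelations).

    Assumption: the autocorrelation sum is truncated at the first
    non-positive estimate, which is one simple convention among several.
    """
    n = len(chain)
    mean = sum(chain) / n
    centered = [x - mean for x in chain]
    var = sum(c * c for c in centered) / n
    if var == 0.0:
        # A constant chain carries no autocorrelation information.
        return float(n)
    rho_sum = 0.0
    for lag in range(1, n):
        cov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = cov / var
        if rho <= 0.0:
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)

# Hypothetical chains: one with independent draws, one slowly mixing AR(1).
random.seed(1)
n = 5000
iid_chain = [random.gauss(0.0, 1.0) for _ in range(n)]
ar_chain = [0.0]
for _ in range(n - 1):
    ar_chain.append(0.95 * ar_chain[-1] + random.gauss(0.0, 1.0))

ecs_iid = effective_chain_size(iid_chain)
ecs_ar = effective_chain_size(ar_chain)
```

A slowly mixing chain of the same length yields a much smaller ECS, which is the kind of gap the abstract reports between scalar-Gibbs and ESIP.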

8.
Markov chain Monte Carlo (MCMC) techniques for multipoint mapping of quantitative trait loci have been developed on nuclear-family and extended-pedigree data. These methods are based on repeated sampling: peeling and gene dropping of genotype vectors, and random sampling of each of the model parameters from their full conditional distributions, given phenotypes, markers, and other model parameters. We further refine such approaches by improving the efficiency of the marker haplotype-updating algorithm and by adopting a new proposal for adding loci. Incorporating these refinements, we have performed an extensive simulation study on simulated nuclear-family data, varying the number of trait loci, family size, displacement, and other segregation parameters. Our simulation studies show that our MCMC algorithm identifies the locations of the true trait loci and estimates their segregation parameters well, provided that the total number of sibship pairs in the pedigree data is reasonably large, heritability of each individual trait locus is not too low, and the loci are not too close together. Our MCMC algorithm was shown to be significantly more efficient than LOKI (Heath 1997) in our simulation study using nuclear-family data.

9.
Preliminary ranking procedures for multilocus ordering   Total citations: 12 (self-citations: 0; citations by others: 12)
D E Weeks, K Lange. Genomics, 1987, 1(3): 236-242
N linked loci can be arranged in N!/2 possible orders. We describe two criteria for providing a preliminary ranking of the possible orders based on the N(N-1)/2 pairwise lod score curves for the loci. For a given order the first criterion is the sum of the N-1 maximal lod scores corresponding to the adjacent pairs of loci in the order. The second criterion is the minimum of a least-squares problem due to J.M. Lalouel (1977, Heredity 38(1): 61-77). This least-squares problem requires the maximum likelihood recombination fraction estimates and their standard errors. For N small it is feasible to evaluate these measures for every possible order. For N large we use a simulated annealing algorithm. This gives a fairly complete listing of the best-candidate orders without sampling every possible order. These ranking methods are applied to data from linkage groups on chromosomes 1, 6, 11, and 13.
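The first criterion in the abstract above can be sketched for small N by exhaustive enumeration. This is a minimal illustration, not the authors' program: the function name and the pairwise maximal lod scores are hypothetical, and an order and its reverse are treated as the same map, which is why N loci give N!/2 orders.

```python
from itertools import permutations

def rank_orders(loci, max_lod):
    """Rank locus orders by the sum of maximal lod scores of adjacent pairs."""
    ranked = []
    for perm in permutations(loci):
        # An order and its reverse describe the same map; keep one of the two.
        if perm[0] > perm[-1]:
            continue
        score = sum(max_lod[frozenset(pair)] for pair in zip(perm, perm[1:]))
        ranked.append((score, perm))
    # Highest adjacent-pair lod sum first.
    ranked.sort(key=lambda item: -item[0])
    return ranked

# Hypothetical maximal pairwise lod scores for four loci (illustrative only).
max_lod = {
    frozenset("AB"): 8.0, frozenset("BC"): 7.0, frozenset("CD"): 9.0,
    frozenset("AC"): 3.0, frozenset("BD"): 2.5, frozenset("AD"): 1.0,
}
ranked = rank_orders("ABCD", max_lod)
best_score, best_order = ranked[0]
```

With four loci the list holds 4!/2 = 12 candidate orders. Exhaustive enumeration like this is only practical for small N, which is where the abstract turns to simulated annealing.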

10.
The power to detect linkage for likelihood and nonparametric (Haseman-Elston, affected-sib-pair, and affected-pedigree-member) methods is compared for the case of a common, dichotomous trait resulting from the segregation of two loci. Pedigree data for several two-locus epistatic and heterogeneity models have been simulated, with one of the loci linked to a marker locus. Replicate samples of 20 three-generation pedigrees (16 individuals/pedigree) were simulated and then ascertained for having at least 6 affected individuals. The power of linkage detection calculated under the correct two-locus model is only slightly higher than that under a single locus model with reduced penetrance. As expected, the nonparametric linkage methods have somewhat lower power than does the lod-score method, the difference depending on the mode of transmission of the linked locus. Thus, for many pedigree linkage studies, the lod-score method will have the best power. However, this conclusion depends on how many times the lod score will be calculated for a given marker. The Haseman-Elston method would likely be preferable to calculating lod scores under a large number of genetic models (i.e., varying both the mode of transmission and the penetrances), since such an analysis requires an increase in the critical value of the lod criterion. The power of the affected-pedigree-member method is lower than the other methods, which can be shown to be largely due to the fact that marker genotypes for unaffected individuals are not used.

11.
J D Terwilliger, Y Ding, J Ott. Genomics, 1992, 13(4): 951-956
Molecular biologists are often confronted with the problem of whether they should try to generate large numbers of very closely linked markers of low heterozygosity or smaller numbers of less closely linked markers of high heterozygosity. In other words, what is more important for gene mapping, high marker heterozygosity or dense marker spacing? We investigated that problem by analytically computing the expected lod score per meiosis in which the new locus is informative and phase known. We also looked at the length of the 1-unit-of-lod-score support interval for the expected lod score from 100 such meioses. We found that while both quantities have an influence on the number of meioses needed to find linkage, the length of the support interval is almost entirely dependent on the intermarker distance, for heterozygosities between 20 and 100%. However, the probability of any given meiosis being phase known and the ability to develop an accurate map of the markers are functions of marker heterozygosity, further complicating the issue.

12.
The problem of ascertainment for linkage analysis.   Total citations: 2 (self-citations: 0; citations by others: 2)
It is generally believed that ascertainment corrections are unnecessary in linkage analysis, provided individuals are selected for study solely on the basis of trait phenotype and not on the basis of marker genotype. The theoretical rationale for this is that standard linkage analytic methods involve conditioning likelihoods on all the trait data, which may be viewed as an application of the ascertainment assumption-free (AAF) method of Ewens and Shute. In this paper, we show that when the observed pedigree structure depends on which relatives within a pedigree happen to have been the probands (proband-dependent, or PD, sampling) conditioning on all the trait data is not a valid application of the AAF method and will result in asymptotically biased estimates of genetic parameters (except under single ascertainment). Furthermore, this result holds even if the recombination fraction R is the only parameter of interest. Since the lod score is proportional to the likelihood of the marker data conditional on all the trait data, this means that when data are obtained under PD sampling the lod score will yield asymptotically biased estimates of R, and that so-called mod scores (i.e., lod scores maximized over both R and parameters theta of the trait distribution) will yield asymptotically biased estimates of R and theta. Furthermore, the problem appears to be intractable, in the sense that it is not possible to formulate the correct likelihood conditional on observed pedigree structure. In this paper we do not investigate the numerical magnitude of the bias, which may be small in many situations. On the other hand, virtually all linkage data sets are collected under PD sampling. Thus, the existence of this bias will be the rule rather than the exception in the usual applications.

13.
Previously we have conducted a genome-wide search for inflammatory bowel disease susceptibility loci in a large European cohort. Results from this study demonstrated suggestive evidence of linkage to loci at chromosomes 1q, 6p, and 10p and replicated linkages on chromosomes 12 and 16. Recently, NOD2/CARD15 on chromosome 16q12 has been found to be strongly associated with Crohn's disease. In order to determine if there are other loci in the genome that interact with the three associated functional variants in CARD15 (R702W, G908R, 1007fs), we have stratified our large inflammatory bowel disease genome scan cohort by dividing pedigrees into two groups stratified by CARD15 variant genotype. The two pedigree groups were analysed using non-parametric allele sharing methods. The group of pedigrees that contained one of the three CARD15 variants had two suggestive linkage results occurring in 6p (lod = 3.06 at D6S197, IBD phenotype) and 10p (lod = 2.29 at D10S197, CD phenotype). In addition, at 16q12 where CARD15 is located, the original genome scan had a peak lod score of 2.18 at D16S415 (CD phenotype). The stratified pedigree cohort containing one of three CARD15 variants had a peak lod score of 0.90 at D16S415 (CD phenotype), accounting for less than half of the genetic evidence for linkage at this locus. This result is in agreement with the existence of a substantial number of private variants at the NOD2/CARD15 locus. Interaction with NOD2/CARD15 needs to be considered in future gene identification efforts on chromosomes 6 and 10.

14.
Lee SH, Van der Werf JH. Genetics, 2006, 173(4): 2329-2337
Within a small region (e.g., <10 cM), there can be multiple quantitative trait loci (QTL) underlying phenotypes of a trait. Simultaneous fine mapping of closely linked QTL needs an efficient tool to remove confounded shade effects among QTL within such a small region. We propose a variance component method using combined linkage disequilibrium (LD) and linkage information and a reversible jump Markov chain Monte Carlo (MCMC) sampling for model selection. QTL identity-by-descent (IBD) coefficients between individuals are estimated by a hybrid MCMC combining the random walk and the meiosis Gibbs sampler. These coefficients are used in a mixed linear model and an empirical Bayesian procedure combines residual maximum likelihood (REML) to estimate QTL effects and a reversible jump MCMC that samples the number of QTL and the posterior QTL intensities across the tested region. Note that two MCMC processes are used, i.e., an (internal) MCMC for IBD estimation and an (external) MCMC for model selection. In a simulation study, the use of the multiple-QTL model clearly removes the shade effects between three closely linked QTL located at 1.125, 3.875, and 7.875 cM across the region of 10 cM, using 40 markers at 0.25-cM intervals. It is shown that the use of combined LD and linkage information gives much more useful information compared to using linkage information alone for both single- and multiple-QTL analyses. When using a lower marker density (11 markers at 1-cM intervals), the signal of the second QTL can disappear. Extreme values of past effective size (resulting in extreme levels of LD) decrease the mapping accuracy.

15.
Genetic linkage between hereditary hemochromatosis and HLA.   Total citations: 19 (self-citations: 14; citations by others: 5)
A large Mormon pedigree of a proband with hemochromatosis was studied, using transferrin saturation as the quantitative phenotypic trait. The analysis indicated that the inheritance of hemochromatosis was recessive, with partial expression in some heterozygotes. The lod score of 6.88 (theta = 0.0) was strongly indicative of linkage between the hemochromatosis locus and the human major histocompatibility (HLA) loci.

16.
Yi N. Genetics, 2004, 167(2): 967-975
In this article, a unified Markov chain Monte Carlo (MCMC) framework is proposed to identify multiple quantitative trait loci (QTL) for complex traits in experimental designs, based on a composite space representation of the problem that has fixed dimension. The proposed unified approach includes the existing Bayesian QTL mapping methods using the reversible jump MCMC algorithm as special cases. We also show that a variety of Bayesian variable selection methods using Gibbs sampling can be applied to the composite model space for mapping multiple QTL. The unified framework not only results in some new algorithms, but also gives useful insight into some of the important factors governing the performance of Gibbs sampling and reversible jump for mapping multiple QTL. Finally, we develop strategies to improve the performance of MCMC algorithms.

17.
Thomas A. Human Heredity, 2007, 64(1): 16-26
We review recent developments of MCMC integration methods for computations on graphical models for two applications in statistical genetics: modelling allelic association and pedigree based linkage analysis. We discuss and illustrate estimation of graphical models from haploid and diploid genotypes, and the importance of MCMC updating schemes beyond what is strictly necessary for irreducibility. We then outline an approach combining these methods to compute linkage statistics when alleles at the marker loci are in linkage disequilibrium. Other extensions suitable for analysis of SNP genotype data in pedigrees are also discussed and programs that implement these methods, and which are available from the author's web site, are described. We conclude with a discussion of how this still experimental approach might be further developed.

18.
Summary: Data from five nuclear families within a large kindred with multiple endocrine neoplasia type-2 (MEN-2) or Sipple's syndrome exclude linkage between the disease locus and the loci in the HLA complex. There were seven recombinants and seven non-recombinants in the four families with phase known data and a total lod score of -2.659 for a recombination fraction of 0.1.

19.
Methods to handle missing data have been an area of statistical research for many years. Little has been done within the context of pedigree analysis. In this paper we present two methods for imputing missing data for polygenic models using family data. The imputation schemes take into account familial relationships and use the observed familial information for the imputation. A traditional multiple imputation approach and multiple imputation or data augmentation approach within a Gibbs sampler for the handling of missing data for a polygenic model are presented. We used both the Genetic Analysis Workshop 13 simulated missing phenotype and the complete phenotype data sets as the means to illustrate the two methods. We looked at the phenotypic trait systolic blood pressure and the covariate gender at time point 11 (1970) for Cohort 1 and time point 1 (1971) for Cohort 2. Comparing the results for three replicates of complete and missing data incorporating multiple imputation, we find that multiple imputation via a Gibbs sampler produces more accurate results. Thus, we recommend the Gibbs sampler for imputation purposes because of the ease with which it can be extended to more complicated models, the consistency of the results, and the accountability of the variation due to imputation.

20.
In human genetics, two loci are declared to be linked when the lod score at the maximum likelihood recombination fraction theta exceeds the threshold of 3.0. Since recombination rates differ between the sexes, one can alternatively detect linkage by estimating separate recombination rates, theta_m and theta_f, for male and female meiosis and examining the corresponding sex-specific lod scores. The question arises: In order to maintain the same chance of falsely declaring linkage, what is the correct threshold for declaring linkage when sex-specific lod scores are used? We show here that the appropriate threshold is about 3.5. If the restriction that theta_f is greater than theta_m is added, the appropriate threshold falls to about 3.25. We also discuss the relative efficiency of detecting linkage by using sex-specific and sex-averaged lod scores.
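To make the comparison of sex-averaged and sex-specific lod scores concrete, here is a minimal sketch. It assumes the simplest textbook setting of phase-known, fully informative meioses, where the likelihood is theta^k (1 - theta)^(n - k) for k recombinants in n meioses. The function names and the meiosis counts are hypothetical; only the thresholds 3.0 and 3.5 come from the abstract.

```python
import math

def lod_score(recombinants, meioses, theta):
    """Two-point lod score at recombination fraction theta (0 <= theta <= 0.5)."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    k, n = recombinants, meioses
    if theta == 0.0:
        # Any observed recombinant is impossible when theta is exactly 0.
        return n * math.log10(2.0) if k == 0 else float("-inf")
    # log10 of L(theta) / L(0.5), with L(theta) = theta^k * (1 - theta)^(n - k).
    return (k * math.log10(theta)
            + (n - k) * math.log10(1.0 - theta)
            + n * math.log10(2.0))

def max_lod(recombinants, meioses):
    """Return (theta_hat, lod) at the maximum likelihood theta, capped at 0.5."""
    if meioses == 0:
        return 0.5, 0.0
    theta_hat = min(recombinants / meioses, 0.5)
    return theta_hat, lod_score(recombinants, meioses, theta_hat)

def sex_specific_lod(male, female):
    """Sum of separately maximized lod scores; each argument is (k, n)."""
    return max_lod(*male)[1] + max_lod(*female)[1]

# Hypothetical counts: 7 male meioses with 0 recombinants, 8 female with 1.
sex_averaged = max_lod(1, 15)[1]
sex_specific = sex_specific_lod((0, 7), (1, 8))

# Thresholds quoted in the abstract above.
linked_sex_averaged = sex_averaged > 3.0
linked_sex_specific = sex_specific > 3.5
```

Because the sex-specific model fits two recombination fractions instead of one, its maximized lod score is never smaller than the sex-averaged one on the same data, which is why it needs the higher threshold.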
