Similar literature
 Found 20 similar articles; search took 62 ms
1.
2.
A chromosome in an individual of recently admixed ancestry resembles a mosaic of chromosomal segments, or ancestry blocks, each derived from a particular ancestral population. We consider the problem of inferring ancestry along the chromosomes in an admixed individual and thereby delineating the ancestry blocks. Using a simple population model, we infer gene-flow history in each individual. Compared with existing methods, which are based on a hidden Markov model, the Markov-hidden Markov model (MHMM) we propose has the advantage of accounting for the background linkage disequilibrium (LD) that exists in ancestral populations. When there are more than two ancestral groups, we allow each ancestral population to admix at a different time in history. We use simulations to illustrate the accuracy of the inferred ancestry as well as the importance of modeling the background LD; not accounting for background LD between markers may lead to false inferences about mixed ancestry in an indigenous population. The MHMM makes it possible to identify genomic blocks of a particular ancestry by use of any high-density single-nucleotide-polymorphism panel. One application of our method is to perform admixture mapping without genotyping special ancestry-informative-marker panels.

3.
There is considerable interest in identifying and characterizing block-like patterns of linkage disequilibrium (LD; haplotype blocks) in the human genome as these may facilitate the identification of complex disease genes via genome-wide association studies. Although recombination hot-spots have been suggested as the primary mechanism to explain the block-like pattern of LD, other forces, such as genetic drift, may also be important. To this end, we have studied the effect of various recombination models on patterns of LD by using extensive simulations. As expected, haplotype blocks were observed under a model allowing recombination hot-spots. However, we also observed similar block-like patterns in the models where recombination crossovers are randomly and uniformly distributed, and we demonstrate that these blocks are generated by genetic drift. We caution that genetic drift may be an alternative mechanism (in addition to recombination hot-spots) that can lead to block-like patterns of LD. Our findings highlight the necessity of characterizing haplotype blocks in world-wide populations.

4.
Multilocus analysis of single nucleotide polymorphism haplotypes is a promising approach to dissecting the genetic basis of complex diseases. We propose a coalescent-based model for association mapping that potentially increases the power to detect disease-susceptibility variants in genetic association studies. The approach uses Bayesian partition modelling to cluster haplotypes with similar disease risks by exploiting evolutionary information. We focus on candidate gene regions with densely spaced markers and model chromosomal segments in high linkage disequilibrium therein assuming a perfect phylogeny. To make this assumption more realistic, we split the chromosomal region of interest into sub-regions or windows of high linkage disequilibrium. The haplotype space is then partitioned into disjoint clusters, within which the phenotype–haplotype association is assumed to be the same. For example, in case-control studies, we expect chromosomal segments bearing the causal variant on a common ancestral background to be more frequent among cases than controls, giving rise to two separate haplotype clusters. The novelty of our approach arises from the fact that the distance used for clustering haplotypes has an evolutionary interpretation, as haplotypes are clustered according to the time to their most recent common ancestor. Our approach is fully Bayesian and we develop a Markov Chain Monte Carlo algorithm to sample efficiently over the space of possible partitions. We compare the proposed approach to both single-marker analyses and recently proposed multi-marker methods and show that the Bayesian partition modelling performs similarly in localizing the causal allele while yielding lower false-positive rates. Also, the method is computationally quicker than other multi-marker approaches. 
We present an application to real genotype data from the CYP2D6 gene region, which has a confirmed role in drug metabolism, where we succeed in mapping the location of the susceptibility variant within a small error.

5.
We propose an analytical approximation method for the estimation of multipoint identity by descent (IBD) probabilities in pedigrees containing a moderate number of distantly related individuals. We show that in large pedigrees where cases are related through untyped ancestors only, it is possible to formulate the hidden Markov model of the Lander-Green algorithm in terms of the IBD configurations of the cases. We use a first-order Markov approximation to model the changes in this IBD-configuration variable along the chromosome. In simulated and real data sets, we demonstrate that estimates of parametric and nonparametric linkage statistics based on the first-order Markov approximation are accurate. The computation time is exponential in the number of cases instead of in the number of meioses separating the cases. We have implemented our approach in the computer program ALADIN (accurate linkage analysis of distantly related individuals). ALADIN can be applied to general pedigrees and marker types and has the ability to model marker-marker linkage disequilibrium with a clustered-markers approach. Using ALADIN is straightforward: it requires no parameters to be specified and accepts standard input files.

6.
With the completion of the first draft of the human genome sequencing project, a new challenge is to characterize patterns of linkage disequilibrium and haplotype structure across genomic regions to identify mutations associated with complex disease. Recent work shows considerable linkage disequilibrium heterogeneity, where genomic regions of extended haplotype blocks are punctuated by recombination hotspots. In this review we explore some of the current approaches to defining and characterizing 'hapblocks', mechanisms by which hapblocks may be generated, and the implications this block-like structure may have for successfully mapping mutations associated with complex disease.

7.
8.
Linkage disequilibrium in the North American Holstein population (total citations: 2; self-citations: 0; cited by others: 2)
Linkage disequilibrium was estimated using 7119 single nucleotide polymorphism markers across the genome and 200 animals from the North American Holstein cattle population. The analysis of maternally inherited haplotypes revealed strong linkage disequilibrium (r² > 0.8) in genomic regions of ∼50 kb or less. While linkage disequilibrium decays as a function of genomic distance, genomic regions within genes showed greater linkage disequilibrium and greater variation in linkage disequilibrium compared with intergenic regions. Identification of haplotype blocks could characterize the most common haplotypes. Although maximum haplotype block size was over 1 Mb, mean block size was 26–113 kb by various definitions, which was larger than that observed in humans (∼10 kb). Effective population size of the dairy cattle population was estimated from linkage disequilibrium between single nucleotide polymorphism marker pairs in various haplotype ranges. Rapid reduction of effective population size of dairy cattle was inferred from linkage disequilibrium in recent generations. This result implies a loss of genetic diversity because of the high rate of inbreeding and high selection intensity in dairy cattle. The pattern observed in this study indicated linkage disequilibrium in the current dairy cattle population could be exploited to refine mapping resolution. Changes in effective population size during past generations imply a necessity of plans to maintain polymorphism in the Holstein population.
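
As a rough illustration of the r² statistic quoted above, the following sketch computes pairwise LD between two biallelic markers from phased haplotypes. The function name `r_squared` and the 0/1 allele coding are assumptions made for this example; the abstract does not describe the software the authors actually used.

```python
def r_squared(hap_a, hap_b):
    """Return r^2 between two biallelic loci from phased haplotypes.

    hap_a and hap_b are equal-length sequences of 0/1 alleles, one entry
    per haplotype (a hypothetical coding chosen for this sketch).
    """
    n = len(hap_a)
    if n == 0 or n != len(hap_b):
        raise ValueError("haplotype vectors must be non-empty and of equal length")
    p_a = sum(hap_a) / n  # frequency of allele 1 at locus A
    p_b = sum(hap_b) / n  # frequency of allele 1 at locus B
    p_ab = sum(a * b for a, b in zip(hap_a, hap_b)) / n  # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d * d / denom
```

Two markers whose alleles always co-occur give r² = 1, while markers whose alleles combine at random give r² = 0.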

9.
In this study, we obtained sequence and population genetic data for three X-linked short tandem repeat markers (X-STRs; DXS7129, DXS2500, G10583). We investigated their population genetics and estimated their forensic parameters in 214 healthy unrelated individuals from the Han population of Northern China (105 males and 109 females). We showed that DXS2500 and G10583 were highly polymorphic and thus have potential for application in forensic medicine. We also estimated the overall linkage disequilibrium between pairs of loci, specific multiallelic or interallelic associations, and haplotype frequencies in males. We showed that the three X-STR loci segregate as stable haplotype blocks; this could be a powerful tool for haplotype analysis in kinship testing.

10.
Although permutation testing has been the gold standard for assessing significance levels in studies using multiple markers, it is time-consuming. A Bonferroni correction to the nominal p-value that uses the underlying pair-wise linkage disequilibrium (LD) structure among the markers to determine the number of effectively independent tests has recently been proposed. We propose using the number of independent LD blocks plus the number of independent single-nucleotide polymorphisms for correction. Using the Collaborative Study on the Genetics of Alcoholism LD data for chromosome 21, we simulated 1,000 replicates of parent-child trio data under the null hypothesis with two levels of LD: moderate and high. Assuming haplotype blocks were independent, we calculated the number of independent statistical tests using three haplotype blocking algorithms. We then compared the type I error rates using a principal components-based method, the three blocking methods, a traditional Bonferroni correction, and the unadjusted p-values obtained from FBAT. Under high LD conditions, the principal components-based method and one of the blocking methods were slightly conservative, whereas the two other blocking methods exceeded the target type I error rate. Under conditions of moderate LD, we show that the blocking algorithm corrections are closest to the desired type I error, although still slightly conservative, with the principal components-based method being almost as conservative as the traditional Bonferroni correction.
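
The proposed correction can be sketched as a Bonferroni adjustment in which the divisor is the number of independent LD blocks plus the number of SNPs lying outside any block. The function name, its parameters, and the default significance level of 0.05 are assumptions for illustration only; the abstract does not give an implementation.

```python
def block_corrected_alpha(n_blocks, n_independent_snps, alpha=0.05):
    """Per-test significance threshold under a block-based Bonferroni correction.

    n_blocks: number of haplotype blocks treated as independent tests.
    n_independent_snps: number of SNPs that fall outside every block.
    alpha: desired family-wise type I error rate (assumed default).
    """
    n_effective = n_blocks + n_independent_snps  # effective number of tests
    if n_effective <= 0:
        raise ValueError("at least one block or independent SNP is required")
    return alpha / n_effective
```

For example, 8 blocks and 2 unblocked SNPs give 10 effective tests, so each test is judged against a threshold of about 0.005 rather than the far stricter value a per-SNP Bonferroni correction would impose.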

11.
Haplotype inference in random population samples (total citations: 16; self-citations: 0; cited by others: 16)
Contemporary genotyping and sequencing methods do not provide information on linkage phase in diploid organisms. The application of statistical methods to infer and reconstruct linkage phase in samples of diploid sequences is a potentially time- and labor-saving method. The Stephens-Smith-Donnelly (SSD) algorithm is one such method, which incorporates concepts from population genetics theory in a Markov chain-Monte Carlo technique. We applied a modified SSD method, as well as the expectation-maximization and partition-ligation algorithms, to sequence data from eight loci spanning >1 Mb on the human X chromosome. We demonstrate that the accuracy of the modified SSD method is better than that of the other algorithms and is superior in terms of the number of sites that may be processed. Also, we find phase reconstructions by the modified SSD method to be highly accurate over regions with high linkage disequilibrium (LD). If only polymorphisms with a minor allele frequency >0.2 are analyzed and scored according to the fraction of neighbor relations correctly called, reconstructions are 95.2% accurate over entire 100-kb stretches and are 98.6% accurate within blocks of high LD.

12.
Genome-wide association (GWA) studies are currently one of the most powerful tools in identifying disease-associated genes or variants. In typical GWA studies, single-nucleotide polymorphisms (SNPs) are often used as genetic markers. Therefore, it is critical to estimate the percentage of genetic variations which can be covered by SNPs through linkage disequilibrium (LD). In this study, we use the concept of haplotype blocks to evaluate the coverage of five SNP sets, including the HapMap and four commercial arrays, for every exon in the human genome. We show that although some chips can reach coverage similar to that of the HapMap, only about 50% of exons are completely covered by haplotype blocks of HapMap SNPs. We suggest that further high-resolution genotyping methods are required to provide adequate genome-wide power for identifying variants.

13.
HaploBlockFinder: haplotype block analyses (total citations: 8; self-citations: 0; cited by others: 8)
Recent studies have unveiled discrete block-like structures of linkage disequilibrium (LD) in the human genome. We have developed a set of computer programs to analyze the block-like LD structures (haplotype blocks) based on haplotype data. Three definitions of haplotype block are supported, including minimal LD range, no historic recombination, and chromosome coverage. Tagged SNPs that uniquely distinguish common haplotypes are identified. A greedy algorithm was used to improve the efficiency. Two separate utilities were also provided to assist visual inspection of haplotype block structure and pattern of linkage disequilibrium. AVAILABILITY: A web interface for the HaploBlockFinder is available at http://cgi.uc.edu/cgi-bin/kzhang/haploBlockFinder.cgi; the source code is also freely available on the web site.

14.
Deterministic theory suggests that reciprocal recombination and intragenic, interallelic conversion have different effects on the linkage disequilibrium between a pair of genetic markers. Under a model of reciprocal recombination, the decay rate of linkage disequilibrium depends on the distance between the two markers, while under conversion the decay rate is independent of this distance, provided that conversion tracts are short. A population genetic three-locus model provides a function Q of two-locus linkage disequilibria. Viewed as a random variable, Q is the basis for a test of the relative impact of conversion and recombination. This test requires haplotype frequency data of a sufficiently variable three-locus system. One of the few examples currently available is data from the Human Leukocyte Antigen (HLA) class I genes of three Amerindian populations. We find that conversion may have played a dominant role in shaping haplotype patterns over short stretches of DNA, whereas reciprocal recombination may have played a greater role over longer stretches of DNA. However, in order to draw firm conclusions more independent data are necessary.
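
The distance dependence under reciprocal recombination can be made concrete with the standard deterministic two-locus result for a randomly mating population, D_t = D_0 (1 - c)^t, where c is the recombination fraction between the markers (larger for markers that are farther apart). The sketch below covers only this textbook case; it does not implement the three-locus function Q or the conversion model of the abstract, and the function name is hypothetical.

```python
def ld_after_generations(d0, c, generations):
    """Expected disequilibrium D after a number of generations of random mating.

    d0: initial disequilibrium coefficient D_0.
    c: recombination fraction between the two markers (0 <= c <= 0.5).
    generations: number of generations elapsed (non-negative integer).
    """
    if not 0.0 <= c <= 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5]")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    # Each generation, a fraction c of haplotypes is broken up by recombination.
    return d0 * (1.0 - c) ** generations
```

Because c grows with physical distance, widely separated markers lose LD faster. Under the conversion model described above, the analogous rate would not change with distance as long as conversion tracts are short.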

15.
A significant proportion of the human genome is contained within haplotype blocks across which pairwise linkage disequilibrium (LD) is very high. However, LD is also often high between markers at more remote distances, and within different haplotype blocks. Here, we evaluate the origins of haplotype block structure in the three genes for alpha1 adrenergic receptors (alpha1-AR) in the human genome (ADRA1A, ADRA1B and ADRA1D) by genotyping dense single-nucleotide polymorphism (SNP) marker maps, and show that LD signals between distant markers are due to the presence of extended haplotype superblocks in individuals with ancient chromosomes which have escaped historic recombination. ARs mediate the physiological effects of epinephrine and norepinephrine, and are targets of many therapeutic drugs. This work has identified haplotype backgrounds of alpha1-AR missense variants, haplotype block structures in US Caucasians and African Americans, and haplotype tag SNPs for each block, and we present strong evidence for ancient haplotype block superstructure at these genes which has been partially disrupted by recombination, and evidence for reinstatement of linkage disequilibrium by subsequent recombination events. ADRA1A is comprised of four haplotype blocks in US Caucasians, while in African Americans Block 1 is split. ADRA1B has four blocks in US Caucasians, but in African Americans only the first two blocks are present. ADRA1D has two blocks in US Caucasians, and the first block is replaced by two smaller blocks in African Americans. For both ADRA1A and ADRA1B, haplotype superstructures may represent a novel, higher-level hierarchy in the human genome, which may reduce redundancy of testing by further aggregation of genotype data. Electronic Supplementary Material: Supplementary material is available in the online version of this article. Communicated by W. R. McCombie.

16.
Bayesian spatial modeling of haplotype associations (total citations: 9; self-citations: 0; cited by others: 9)
We review methods for relating the risk of disease to a collection of single nucleotide polymorphisms (SNPs) within a small region. Association studies using case-control designs with unrelated individuals could be used either to test for a direct effect of a candidate gene and characterize the responsible variant(s), or to fine map an unknown gene by exploiting the pattern of linkage disequilibrium (LD). We consider a flexible class of logistic penetrance models based on haplotypes and compare them with an alternative formulation based on unphased multilocus genotypes. The likelihood for haplotype-based models requires summation over all possible haplotype assignments consistent with the observed genotype data, and can be fitted using either Expectation-Maximization (E-M) or Markov chain Monte Carlo (MCMC) methods. Subtleties involving ascertainment correction for case-control studies are discussed. There has been great interest in methods for LD mapping based on the coalescent or ancestral recombination graphs as well as methods based on haplotype sharing, both of which we review briefly. Because of their computational complexity, we propose some alternative empirical modeling approaches using techniques borrowed from the Bayesian spatial statistics literature. Here, space is interpreted in terms of a distance metric describing the similarity of any pair of haplotypes to each other, and hence their presumed common ancestry. Specifically, we discuss the conditional autoregressive model and two spatial clustering models: Potts and Voronoi. We conclude with a discussion of the implications of these methods for modeling cryptic relatedness, haplotype blocks, and haplotype tagging SNPs, and suggest a Bayesian framework for the HapMap project.

17.
Haplotype block structure is conserved across mammals (total citations: 2; self-citations: 0; cited by others: 2)
Genetic variation in genomes is organized in haplotype blocks, and species-specific block structure is defined by differential contribution of population history effects in combination with mutation and recombination events. Haplotype maps characterize the common patterns of linkage disequilibrium in populations and have important applications in the design and interpretation of genetic experiments. Although evolutionary processes are known to drive the selection of individual polymorphisms, their effect on haplotype block structure dynamics has not been shown. Here, we present a high-resolution haplotype map for a 5-megabase genomic region in the rat and compare it with the orthologous human and mouse segments. Although the size and fine structure of haplotype blocks are species dependent, there is a significant interspecies overlap in structure and a tendency for blocks to encompass complete genes. Extending these findings to the complete human genome using haplotype map phase I data reveals that linkage disequilibrium values are significantly higher for equally spaced positions in genic regions, including promoters, as compared to intergenic regions, indicating that a selective mechanism exists to maintain combinations of alleles within potentially interacting coding and regulatory regions. Although this characteristic may complicate the identification of causal polymorphisms underlying phenotypic traits, conservation of haplotype structure may be employed for the identification and characterization of functionally important genomic regions.

18.
We present a statistical model for patterns of genetic variation in samples of unrelated individuals from natural populations. This model is based on the idea that, over short regions, haplotypes in a population tend to cluster into groups of similar haplotypes. To capture the fact that, because of recombination, this clustering tends to be local in nature, our model allows cluster memberships to change continuously along the chromosome according to a hidden Markov model. This approach is flexible, allowing for both "block-like" patterns of linkage disequilibrium (LD) and gradual decline in LD with distance. The resulting model is also fast and, as a result, is practicable for large data sets (e.g., thousands of individuals typed at hundreds of thousands of markers). We illustrate the utility of the model by applying it to dense single-nucleotide-polymorphism genotype data for the tasks of imputing missing genotypes and estimating haplotypic phase. For imputing missing genotypes, methods based on this model are as accurate as or more accurate than existing methods. For haplotype estimation, the point estimates are slightly less accurate than those from the best existing methods (e.g., for unrelated Centre d'Etude du Polymorphisme Humain individuals from the HapMap project, switch error was 0.055 for our method vs. 0.051 for PHASE) but require a small fraction of the computational cost. In addition, we demonstrate that the model accurately reflects uncertainty in its estimates, in that probabilities computed using the model are approximately well calibrated. The methods described in this article are implemented in a software package, fastPHASE, which is available from the Stephens Lab Web site.
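
The switch error figures quoted above measure how often the inferred phase flips between consecutive heterozygous sites. The sketch below shows one common way to compute it for a single individual. The function name and the 0/1 haplotype encoding are assumptions for this example, it assumes the estimate is consistent with the true genotype, and it is not taken from fastPHASE or PHASE.

```python
def switch_error(true_h1, true_h2, est_h1):
    """Fraction of adjacent heterozygous-site pairs whose relative phase is wrong.

    true_h1, true_h2: the individual's two true haplotypes (0/1 alleles).
    est_h1: one of the two estimated haplotypes, aligned to the same sites.
    """
    # Only heterozygous sites carry phase information.
    het = [i for i, (a, b) in enumerate(zip(true_h1, true_h2)) if a != b]
    if len(het) < 2:
        raise ValueError("at least two heterozygous sites are needed")
    # At each heterozygous site, does the estimate follow true haplotype 1?
    follows_h1 = [est_h1[i] == true_h1[i] for i in het]
    # A switch occurs whenever that answer changes between neighbouring sites.
    switches = sum(1 for x, y in zip(follows_h1, follows_h1[1:]) if x != y)
    return switches / (len(het) - 1)
```

With four heterozygous sites and an estimate that follows one true haplotype for the first two sites and the other for the last two, there is one switch among three adjacent pairs, giving a switch error of 1/3.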

19.
Chemokine (C-C-motif) receptor 3 (CCR3), playing an important role in endometrium related metabolic pathways, may influence the onset of menarche. To test linkage and/or association between CCR3 polymorphisms and the variation of age at menarche (AAM) in Caucasian females, we recruited a sample of 1,048 females from 354 Caucasian nuclear families and genotyped 16 SNPs spanning the entire CCR3 gene. Linkage disequilibrium and haplotype blocks were inferred by Haploview. Both single-SNP markers and haplotypes were tested for linkage and/or association with AAM using QTDT (quantitative transmission disequilibrium test). We also tested associations between CCR3 polymorphisms and AAM in a selected random sample of daughters using ANOVA (analysis of variance). We identified two haplotype blocks. Only block two showed significant results. After correction for multiple testing, significant total associations of SNP7 and SNP9 with AAM were detected (P = 0.009 and 0.006, respectively). We also detected significant within-family association of SNP9 (P = 0.01). SNP14 was linked to AAM (P = 0.02) at the nominal level. In addition, there was evidence of significant total association and nominally significant linkage (P = 0.008 and 0.03, respectively) with AAM for the haplotype AGA reconstructed by SNP7, SNP9 and SNP13. ANOVA confirmed the results by QTDT. We report for the first time that CCR3 is linked and associated with AAM variation in Caucasian women. However, further studies are necessary to substantiate our conclusions. Fang Yang and Dong-hai Xiong contributed equally to this work.

20.
We have created a high-density SNP resource encompassing 7.87 million polymorphic loci across 49 inbred strains of the laboratory mouse by combining data available from public databases and training a hidden Markov model to impute missing genotypes in the combined data. The strong linkage disequilibrium found in dense sets of SNP markers in the laboratory mouse provides the basis for accurate imputation. Using genotypes from eight independent SNP resources, we empirically validated the quality of the imputed genotypes and demonstrated that they are highly reliable for most inbred strains. The imputed SNP resource will be useful for studies of natural variation and complex traits. It will facilitate association study designs by providing high-density SNP genotypes for large numbers of mouse strains. We anticipate that this resource will continue to evolve as new genotype data become available for laboratory mouse strains. The data are available for bulk download or query at /. Electronic supplementary material: The online version of this article (doi:) contains supplementary material, which is available to authorized users.
