Similar articles
1.
HaploBlockFinder: haplotype block analyses (total citations: 8; self-citations: 0; citations by others: 8)
Recent studies have unveiled discrete block-like structures of linkage disequilibrium (LD) in the human genome. We have developed a set of computer programs to analyze the block-like LD structures (haplotype blocks) based on haplotype data. Three definitions of haplotype block are supported: minimal LD range, no historic recombination, and chromosome coverage. Tagged SNPs that uniquely distinguish common haplotypes are identified. A greedy algorithm was used to improve the efficiency. Two separate utilities are also provided to assist visual inspection of haplotype block structure and pattern of linkage disequilibrium. AVAILABILITY: A web interface for HaploBlockFinder is available at http://cgi.uc.edu/cgi-bin/kzhang/haploBlockFinder.cgi; the source code is also freely available on the web site.
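
The tag-SNP step described above can be illustrated with a minimal sketch. It is not the HaploBlockFinder implementation; the function name `greedy_tag_snps`, the string encoding of haplotypes, and the tie-breaking rule (lowest SNP index wins) are all assumptions made for illustration. At each step the sketch picks the SNP that separates the largest number of haplotype pairs that are still indistinguishable.

```python
from itertools import combinations


def greedy_tag_snps(haplotypes):
    """Pick SNP indices that together distinguish every pair of distinct haplotypes.

    haplotypes: list of equal-length strings, one character per SNP allele
    (hypothetical encoding, e.g. "0101").
    """
    n_sites = len(haplotypes[0])
    # Pairs of distinct haplotypes not yet told apart by the chosen SNPs.
    unresolved = {
        (a, b)
        for a, b in combinations(range(len(haplotypes)), 2)
        if haplotypes[a] != haplotypes[b]
    }
    tags = []
    while unresolved:
        # Greedy choice: the SNP that separates the most unresolved pairs.
        best = max(
            range(n_sites),
            key=lambda site: sum(
                haplotypes[a][site] != haplotypes[b][site] for a, b in unresolved
            ),
        )
        tags.append(best)
        # Keep only the pairs that the new tag SNP fails to separate.
        unresolved = {
            (a, b) for a, b in unresolved if haplotypes[a][best] == haplotypes[b][best]
        }
    return sorted(tags)


common = ["0000", "0011", "1100", "1111"]
print(greedy_tag_snps(common))
```

In this toy block, SNPs 0 and 1 carry the same information, as do SNPs 2 and 3, so two tag SNPs suffice. A greedy choice is fast but is not guaranteed to return the smallest possible tag set.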

2.
With the widespread availability of SNP genotype data, there is great interest in analyzing pedigree haplotype data. Intermarker linkage disequilibrium for microsatellite markers is usually low due to their physical distance; however, for dense maps of SNP markers, there can be strong linkage disequilibrium between marker loci. Linkage analysis (parametric and nonparametric) and family-based association studies are currently being carried out using dense maps of SNP marker loci. Monte Carlo methods are often used for both linkage and association studies; however, to date there are no programs available which can generate haplotype and/or genotype data consisting of a large number of loci for pedigree structures. SimPed is a program that quickly generates haplotype and/or genotype data for pedigrees of virtually any size and complexity. Marker data either in linkage disequilibrium or equilibrium can be generated for greater than 20,000 diallelic or multiallelic marker loci. Haplotypes and/or genotypes are generated for pedigree structures using specified genetic map distances and haplotype and/or allele frequencies. The simulated data generated by SimPed is useful for a variety of purposes, including evaluating methods that estimate haplotype frequencies for pedigree data, evaluating type I error due to intermarker linkage disequilibrium and estimating empirical p values for linkage and family-based association studies.

3.
Linkage disequilibrium in the North American Holstein population (total citations: 2; self-citations: 0; citations by others: 2)
Linkage disequilibrium was estimated using 7119 single nucleotide polymorphism markers across the genome and 200 animals from the North American Holstein cattle population. The analysis of maternally inherited haplotypes revealed strong linkage disequilibrium (r² > 0.8) in genomic regions of ∼50 kb or less. While linkage disequilibrium decays as a function of genomic distance, genomic regions within genes showed greater linkage disequilibrium and greater variation in linkage disequilibrium compared with intergenic regions. Identification of haplotype blocks could characterize the most common haplotypes. Although maximum haplotype block size was over 1 Mb, mean block size was 26–113 kb by various definitions, which was larger than that observed in humans (∼10 kb). Effective population size of the dairy cattle population was estimated from linkage disequilibrium between single nucleotide polymorphism marker pairs in various haplotype ranges. Rapid reduction of effective population size of dairy cattle was inferred from linkage disequilibrium in recent generations. This result implies a loss of genetic diversity because of the high rate of inbreeding and high selection intensity in dairy cattle. The pattern observed in this study indicated linkage disequilibrium in the current dairy cattle population could be exploited to refine mapping resolution. Changes in effective population size during past generations imply a necessity of plans to maintain polymorphism in the Holstein population.
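
For readers unfamiliar with the r² statistic quoted above, here is a minimal sketch of how it is computed for one pair of biallelic SNPs from phased haplotypes, using the standard definition r² = D² / (p_A(1 − p_A) p_B(1 − p_B)) with D = p_AB − p_A p_B. The function name `r_squared` and the 0/1 allele coding are assumptions for illustration, not part of the study's pipeline.

```python
def r_squared(haplotypes):
    """r^2 between two biallelic SNPs.

    haplotypes: list of (allele_at_snp1, allele_at_snp2) tuples, alleles coded 0/1
    (hypothetical input format).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n        # frequency of allele 1 at SNP 1
    p_b = sum(b for _, b in haplotypes) / n        # frequency of allele 1 at SNP 2
    p_ab = sum(a * b for a, b in haplotypes) / n   # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                           # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d * d / denom


# Two SNPs that always travel together: complete LD.
print(r_squared([(1, 1)] * 5 + [(0, 0)] * 5))
# All four haplotypes equally common: no LD.
print(r_squared([(1, 1), (1, 0), (0, 1), (0, 0)]))
```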

4.
With the completion of the first draft of the human genome sequencing project, a new challenge is to characterize patterns of linkage disequilibrium and haplotype structure across genomic regions to identify mutations associated with complex disease. Recent work shows considerable linkage disequilibrium heterogeneity, where genomic regions of extended haplotype blocks are punctuated by recombination hotspots. In this review we explore some of the current approaches to defining and characterizing 'hapblocks', mechanisms by which hapblocks may be generated, and the implications this block-like structure may have for successfully mapping mutations associated with complex disease.

5.
Disequilibrium Pattern Analysis. I. Theory (total citations: 5; self-citations: 3; citations by others: 2)
We have developed a method, disequilibrium pattern analysis, for examining the disequilibrium distribution of the entire array of two locus multiallelic haplotypes in a population. It is shown that a selected haplotype will produce a distinct pattern of linkage disequilibrium values for all generations while the selection is acting. This pattern will also presumably be maintained for many generations after the selection event, until the disequilibrium pattern is eventually broken down by genetic drift and recombination. Related haplotypes, sharing an allele with a selected haplotype, assume a value of linkage disequilibrium proportional to the frequency of the unshared allele and have a single negative value of the normalized linkage disequilibrium. The analysis assumes zero linkage disequilibrium for all allelic combinations initially. The same basic results continue to apply if the selection involves a new mutant, the occurrence of which creates linkage disequilibrium for some haplotypes. The disequilibrium pattern predicted under selection is robust with respect to the influence of migration and random genetic drift. This method is applicable to population data having linked polymorphic loci including that determined from protein or DNA sequencing.
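
The "entire array" of normalized disequilibrium values mentioned above can be tabulated as in the following sketch. It assumes the usual Lewontin normalization (D′ = D / D_max) applied to every allele pair of two multiallelic loci; whether the paper uses exactly this normalization is an assumption, and the function name `normalized_ld` and the dictionary input format are hypothetical.

```python
def normalized_ld(hap_freqs):
    """Normalized LD (D') for every allele pair of two multiallelic loci.

    hap_freqs: dict mapping (allele_at_locus1, allele_at_locus2) to the
    haplotype frequency; frequencies are assumed to sum to 1.
    """
    p, q = {}, {}
    # Marginal allele frequencies at each locus.
    for (i, j), f in hap_freqs.items():
        p[i] = p.get(i, 0.0) + f
        q[j] = q.get(j, 0.0) + f
    table = {}
    for i in p:
        for j in q:
            d = hap_freqs.get((i, j), 0.0) - p[i] * q[j]
            # Largest |D| attainable given the allele frequencies.
            if d < 0:
                d_max = min(p[i] * q[j], (1 - p[i]) * (1 - q[j]))
            else:
                d_max = min(p[i] * (1 - q[j]), (1 - p[i]) * q[j])
            table[(i, j)] = d / d_max if d_max > 0 else 0.0
    return table


freqs = {("A1", "B1"): 0.5, ("A2", "B2"): 0.5}
for pair, value in sorted(normalized_ld(freqs).items()):
    print(pair, round(value, 3))
```

Scanning such a table for one strongly positive haplotype whose related haplotypes share a common negative value is the kind of pattern the abstract describes; the sketch only computes the table and does not test for selection.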

6.
Lou XY, Casella G, Littell RC, Yang MC, Johnson JA, Wu R. Genetics 2003, 163(4): 1533-1548
For tightly linked loci, cosegregation may lead to nonrandom associations between alleles in a population. Because of its evolutionary relationship with linkage, this phenomenon is called linkage disequilibrium. Linkage disequilibrium-based mapping has become a major focus of recent genome research into complex traits. In this article, we present a new statistical method for mapping quantitative trait loci (QTL) of additive, dominant, and epistatic effects in equilibrium natural populations. Our method is based on haplotype analysis of multilocus linkage disequilibrium and exhibits two significant advantages over current disequilibrium mapping methods. First, we have derived closed-form solutions for estimating the marker-QTL haplotype frequencies within the maximum-likelihood framework implemented by the EM algorithm. The allele frequencies of putative QTL and their linkage disequilibria with the markers are estimated by solving a system of regular equations. This procedure has significantly improved the computational efficiency and the precision of parameter estimation. Second, our method can detect marker-QTL disequilibria of different orders and QTL epistatic interactions of various kinds on the basis of a multilocus analysis. This can not only enhance the precision of parameter estimation, but also make it possible to perform whole-genome association studies. We carried out extensive simulation studies to examine the robustness and statistical performance of our method. The application of the new method was validated using a case study from humans, in which we successfully detected significant QTL affecting human body height. Finally, we discuss the implications of our method for genome projects and its extension to broader circumstances. The computer program for the method proposed in this article is available at the webpage http://www.ifasstat.ufl.edu/genome/~LD.
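
To give a feel for EM-based haplotype frequency estimation, the following sketch implements the textbook EM for two biallelic marker loci with unphased genotypes. It is not the marker-QTL method of the article, which involves an unobserved QTL and closed-form updates; the function name `em_haplotype_freqs`, the 3×3 count layout, and the fixed iteration count are assumptions. Only double heterozygotes are phase-ambiguous, so the E-step splits them between the AB/ab and Ab/aB phases in proportion to the current frequency estimates.

```python
def em_haplotype_freqs(counts, n_iter=100):
    """EM estimate of two-locus haplotype frequencies from unphased genotypes.

    counts[g1][g2]: number of individuals carrying g1 copies of allele A at
    locus 1 and g2 copies of allele B at locus 2 (g1, g2 in 0..2).
    """
    n = sum(sum(row) for row in counts)
    # Haplotype counts contributed by phase-unambiguous genotypes.
    base = {
        "AB": 2 * counts[2][2] + counts[2][1] + counts[1][2],
        "Ab": 2 * counts[2][0] + counts[2][1] + counts[1][0],
        "aB": 2 * counts[0][2] + counts[1][2] + counts[0][1],
        "ab": 2 * counts[0][0] + counts[1][0] + counts[0][1],
    }
    n_double_het = counts[1][1]
    freqs = {h: 0.25 for h in base}  # start from linkage equilibrium, equal frequencies
    for _ in range(n_iter):
        # E-step: probability that a double heterozygote has phase AB/ab.
        cis = freqs["AB"] * freqs["ab"]
        trans = freqs["Ab"] * freqs["aB"]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: expected haplotype counts divided by total chromosomes.
        expected = dict(base)
        expected["AB"] += w * n_double_het
        expected["ab"] += w * n_double_het
        expected["Ab"] += (1 - w) * n_double_het
        expected["aB"] += (1 - w) * n_double_het
        freqs = {h: c / (2 * n) for h, c in expected.items()}
    return freqs


genotype_counts = [[5, 0, 0], [0, 10, 0], [0, 0, 5]]  # made-up illustrative data
print(em_haplotype_freqs(genotype_counts))
```

A production implementation would iterate until the log-likelihood stops improving rather than for a fixed number of rounds, and would try several starting points because EM can converge to a local maximum.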

7.
Haploview: analysis and visualization of LD and haplotype maps (total citations: 134; self-citations: 0; citations by others: 134)
SUMMARY: Research over the last few years has revealed significant haplotype structure in the human genome. The characterization of these patterns, particularly in the context of medical genetic association studies, is becoming a routine research activity. Haploview is a software package that provides computation of linkage disequilibrium statistics and population haplotype patterns from primary genotype data in a visually appealing and interactive interface. AVAILABILITY: http://www.broad.mit.edu/mpg/haploview/ CONTACT: jcbarret@broad.mit.edu

8.
Hagenblad J, Nordborg M. Genetics 2002, 161(1): 289-298
Linkage disequilibrium in highly selfing organisms is expected to extend well beyond the scale of individual genes. The pattern of polymorphism in such species must thus be studied over a larger scale. We sequenced 14 short (0.5-1 kb) fragments from a 400-kb region surrounding the flowering time locus FRI in a sample of 20 accessions of Arabidopsis thaliana. The distribution of allele frequencies, as quantified by Tajima's D, varies considerably over the region and is incompatible with a standard neutral model. The region is characterized by extensive haplotype structure, with linkage disequilibrium decaying over 250 kb. In particular, recombination is evident within 35 kb of FRI in a haplotype associated with a functionally important allele. This suggests that A. thaliana may be highly suitable for linkage disequilibrium mapping.
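
Tajima's D, used above to summarize the allele frequency distribution, compares the mean number of pairwise differences (π) with the number of segregating sites (S) scaled by a sample-size constant. The sketch below follows the standard formula from Tajima (1989); the function name `tajimas_d` and the string-based alignment input are assumptions, and the toy sequences are made up.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for a list of equal-length aligned sequences (strings)."""
    n = len(seqs)
    # S: number of segregating (polymorphic) alignment columns.
    s = sum(len(set(column)) > 1 for column in zip(*seqs))
    if n < 4 or s == 0:
        raise ValueError("need at least 4 sequences and 1 segregating site")
    # pi: mean number of differences over all sequence pairs.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)
    # Sample-size constants of the standard formula.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Every variant is a singleton: excess of rare alleles, so D is negative.
print(tajimas_d(["1000", "0100", "0010", "0001"]))
# Two deeply diverged haplotype classes: excess of intermediate frequencies, so D is positive.
print(tajimas_d(["11", "11", "00", "00"]))
```

Computing D separately for each sequenced fragment along a region, as the study does, shows whether the frequency spectrum shifts from place to place.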

9.
Archeogenetics has been revolutionary, revealing insights into demographic history and recent positive selection. However, most studies to date have ignored the nonrandom association of genetic variants at different loci (i.e. linkage disequilibrium). This may be in part because basic properties of linkage disequilibrium in samples from different times are still not well understood. Here, we derive several results for summary statistics of haplotypic variation under a model with time-stratified sampling: (1) The correlation between the number of pairwise differences observed between time-staggered samples (πΔt) in models with and without strict population continuity; (2) The product of the linkage disequilibrium coefficient, D, between ancient and modern samples, which is a measure of haplotypic similarity between modern and ancient samples; and (3) The expected switch rate in the Li and Stephens haplotype copying model. The latter has implications for genotype imputation and phasing in ancient samples with modern reference panels. Overall, these results provide a characterization of how haplotype patterns are affected by sample age, recombination rates, and population sizes. We expect these results will help guide the interpretation and analysis of haplotype data from ancient and modern samples.

10.
OBJECTIVES: Linkage disequilibrium (LD) between closely spaced SNPs can be accommodated in linkage analysis by specifying the multi-SNP haplotype frequencies, if known. Phased haplotypes in candidate regions can provide gold standard haplotype frequency estimates, and may be of inherent interest as markers. We evaluated the effects of different methods of haplotype frequency estimation, and the use of marker phase information, on linkage analysis of a multi-SNP cluster in a candidate region for Alzheimer's disease (AD). METHODS: We performed parametric linkage analysis of a five-SNP cluster in extended pedigrees to compare the use of: (1) haplotype frequencies estimated by molecular phase determination, maximum likelihood estimation, or by assuming linkage equilibrium (LE); (2) AD families or controls as the frequency source; and (3) unphased or molecularly phased SNP data. RESULTS: There was moderate to strong pairwise LD among the five SNPs. Falsely assuming LE substantially inflated the LOD score, but the method of haplotype frequency estimation and particular sample used made little difference provided that LD was accommodated. Use of phased haplotypes produced a modest increase in the LOD score over unphased SNPs. CONCLUSIONS: Ignoring LD between markers can lead to substantially inflated evidence for linkage in LOD score analysis of extended pedigrees with missing data. Use of marker phase information in linkage analysis may be important in disease studies where the costs of family recruitment and phenotyping greatly exceed the costs of phase determination.

11.
Summary: In 237 French families with cystic fibrosis (CF), restriction fragment length polymorphisms (RFLPs) were detected by two DNA probes, XV-2c and KM-19, which are tightly linked to the CF allele. As in other European populations, linkage disequilibrium is found between haplotype B (XV-2c, allele 1; KM-19, allele 2) and the CF allele. Linkage disequilibrium alters the probability that a person bearing a given haplotype is a carrier.

12.
We compared the accuracy of haplotype inferences at a 6 Mb region on chromosome 7 where significant linkage between a brain oscillation phenotype and a cholinergic muscarinic receptor gene was previously reported. Individual haplotype assignments and haplotype frequencies were estimated using 5, 10, and 14 consecutive Illumina single-nucleotide polymorphisms (SNPs) within the 1-LOD unit support interval of the chromosome 7 linkage peak. Initially, haplotypes were constructed incorporating phase information provided by relatives using the pedigree analysis package MERLIN. Population-based haplotypes were inferred using the haplotype estimation software HAPLO.STATS and PHASE, using unrelated individuals. The 14 SNPs within this region exhibited markedly low linkage disequilibrium, and the average D' estimate between SNPs was 0.18 (range: 0.01-0.97). In comparison to the family-based haplotypes calculated in MERLIN, the computational inferences of individual haplotype assignments were most accurate when considering 5 consecutive SNPs, but decayed dramatically when considering 10 or 14 SNPs in both PHASE and HAPLO.STATS. When comparing the two haplotype inference methods, both PHASE and HAPLO.STATS performed poorly. These analyses underscore the difficulties of haplotype estimation in the presence of low linkage disequilibrium and stress the importance of careful consideration of confidence measures when using estimated haplotype frequencies and individual assignments in biomedical research.

13.
Greenspan G, Geiger D. Genetics 2006, 172(4): 2583-2599
Models of background variation in genomic regions form the basis of linkage disequilibrium mapping methods. In this work we analyze a background model that groups SNPs into haplotype blocks and represents the dependencies between blocks by a Markov chain. We develop an error measure to compare the performance of this model against the common model that assumes that blocks are independent. By examining data from the International Haplotype Mapping project, we show how the Markov model over haplotype blocks is most accurate when representing blocks in strong linkage disequilibrium. This contrasts with the independent model, which is rendered less accurate by linkage disequilibrium. We provide a theoretical explanation for this surprising property of the Markov model and relate its behavior to allele diversity.

14.
A recently described region on chromosome 2q contains seven restriction fragment length polymorphisms (RFLPs) revealed by single-copy probes isolated from a 20-kilobase (kb) segment of a single cosmid insert. Analysis of six of these loci demonstrates modest amounts of linkage disequilibrium. This reflects the presence of a substantial number of different haplotypes in this chromosome region and indicates that the region could be used as one highly polymorphic locus. No consistent relationship is found between the amount of linkage disequilibrium and the physical distance between pairs of loci. For seven of the 10 pairs of diallelic loci studied, the observed disequilibrium can be attributed primarily to the absence of the minor haplotype from the population. These results suggest that, for small regions of the genome, factors such as mutation, genetic drift, and population admixture may have effects that outweigh those of recombination. In addition, results are reviewed which show that estimates of linkage disequilibrium coefficients for tightly linked loci are very imprecise. Thus, the inference of gene order from linkage disequilibrium values must be regarded with caution.

15.
In population- and family-based association studies, it is useful to have some knowledge of the patterns of linkage disequilibrium that exist between markers in candidate regions. When such studies are carried out with multiallelic markers, it is often convenient to group the alleles into a biallelic system, for analysis. In this study, we specifically examined the interleukin-1 (IL-1) gene cluster on chromosome 2, a region containing candidates for many inflammatory and autoimmune disorders. Data were collected on eight markers, four of which were multiallelic. Using these data, we investigated the effect of three allele-grouping strategies, including a novel method, on the detection of linkage disequilibrium. The novel approach, termed the "delta method," measures the deviation from the expected haplotype frequencies under linkage equilibrium, for each allelic combination. This information is then used to group the alleles, in an attempt to avoid the grouping together of alleles at one locus that are in opposite disequilibrium with the same allele at the second locus. The estimate haplotype frequencies (EH) program was used to estimate haplotype frequencies and the disequilibrium measure. In our data it was found that the delta method compared well with the other two strategies. Using this method, we found that there was a reasonable correlation between disequilibrium and physical distance in the region (r=-.540, P=.001, one-tailed). We also identified a common, eight-locus haplotype of the IL-1 gene cluster.

16.
Exome sequencing identifies thousands of DNA variants and a proportion of these are involved in disease. Genotypes derived from exome sequences provide particularly high-resolution coverage enabling study of the linkage disequilibrium structure of individual genes. The extent and strength of linkage disequilibrium reflects the combined influences of mutation, recombination, selection and population history. By constructing linkage disequilibrium maps of individual genes, we show that genes containing OMIM-listed disease variants are significantly under-represented amongst genes with complete or very strong linkage disequilibrium (P = 0.0004). In contrast, genes with disease variants are significantly over-represented amongst genes with levels of linkage disequilibrium close to the average for genes not known to contain disease variants (P = 0.0038). Functional clustering reveals, amongst genes with particularly strong linkage disequilibrium, significant enrichment of essential biological functions (e.g. phosphorylation, cell division, cellular transport and metabolic processes). Strong linkage disequilibrium, corresponding to reduced haplotype diversity, may reflect selection in utero against deleterious mutations which have profound impact on the function of essential genes. Genes with very weak linkage disequilibrium show enrichment of functions requiring greater allelic diversity (e.g. sensory perception and immune response). This category is not enriched for genes containing disease variation. In contrast, there is significant enrichment of genes containing disease variants amongst genes with more average levels of linkage disequilibrium. Mutations in these genes may be less likely to lead to in utero lethality and may be subject to less intense selection.

17.
18.
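王雅文, 朱小泉, 宋玉国, 孙亮, 杨泽. 遗传 2007, 29(7): 805-812
To search for new susceptibility genes associated with ankylosing spondylitis (AS) in the Chinese population and to locate them, we selected 11 SNPs within the HLA region on the short arm of chromosome 6, which is strongly linked to AS, and performed a case-control analysis of 79 AS patients and 132 healthy controls from the Jilin region of China. The TT mutant genotype at TNF-α -850 was more frequent in the AS group than in the control group (P = 0.027), and the difference in the distribution of the mutant T allele between the AS and control groups was even more significant (P = 0.002). Multilocus linkage disequilibrium analysis showed that five SNPs in the LTA, TNF-α, LST1 and NCR3 genes are in linkage disequilibrium over a range of 15 kb. Among the haplotypes formed by these five SNPs, the distribution of the TCTTC haplotype differed significantly between the AS and control groups (χ² = 7.406, P = 0.0065), and this haplotype contains the statistically significant mutant T allele of TNF-α -850. These results suggest that a locus increasing susceptibility to AS may lie within the 15 kb region spanned by the four genes LTA, TNF-α, NCR3 and LST1; it may be the TNF-α -850 C→T mutation itself or another site near TNF-α -850.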

19.
The gene for variegate porphyria (VP), an autosomal dominant disease with a high prevalence in South Africa, evidently due to a founder effect, was previously mapped to chromosome 14q32. In the current study this localization was evaluated by linkage and haplotype analyses using microsatellite markers spanning a region of more than 20 cM on chromosome 14q32. In many recent studies linkage disequilibrium between disease and marker loci has been utilized to map genes in founder populations, but we could not find any association between VP and the markers used in this study. Our data suggest that the allocation of VP to chromosome 14q32 may be incorrect. Received: 1 September 1995 / Revised: 1 November 1995

20.
We simulated the evolution of a three-site haplotype system, two restriction fragment length polymorphisms flanking one short tandem repeat polymorphism, under five different demographic scenarios, three with constant population size and two with population growth. The simulation was designed to observe the effects of population history, recombination fraction, and mutation rate on allele and haplotype frequencies, haplotype diversity, frequency of ancestral alleles, and linkage disequilibrium. The known ancestral haplotypes were often found at low frequencies and even became extinct after 5,000 generations, especially with small effective population sizes. The original linkage disequilibrium was eroded and even reversed.
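
As a baseline for the erosion of linkage disequilibrium reported above, the expected LD coefficient in a very large random-mating population, with recombination as the only force, decays as D_t = D_0 (1 − c)^t, where c is the recombination fraction. The sketch below evaluates that textbook expectation; it is not the simulation used in the study, and the function names `ld_after` and `ld_half_life` are made up for illustration. Drift, mutation, and demography, which the simulation does model, are what allow LD to overshoot zero and change sign.

```python
import math


def ld_after(d0, c, generations):
    """Expected LD coefficient after the given number of generations.

    d0: initial LD coefficient D; c: recombination fraction per generation.
    Deterministic expectation only: no drift, mutation, or selection.
    """
    return d0 * (1.0 - c) ** generations


def ld_half_life(c):
    """Number of generations for the expected D to fall to half its value."""
    return math.log(0.5) / math.log(1.0 - c)


# Tightly linked sites (c = 0.001) keep most of their LD for hundreds of generations.
for t in (0, 100, 1000, 5000):
    print(t, ld_after(0.25, 0.001, t))
print(ld_half_life(0.001))
```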
