Similar literature
1.
The genotyping of closely spaced single-nucleotide polymorphism (SNP) markers frequently yields highly correlated data, owing to extensive linkage disequilibrium (LD) between markers. The extent of LD varies widely across the genome and drives the number of frequent haplotypes observed in small regions. Several studies have illustrated the possibility that LD or haplotype data could be used to select a subset of SNPs that optimize the information retained in a genomic region while reducing the genotyping effort and simplifying the analysis. We propose a method based on the spectral decomposition of the matrices of pairwise LD between markers, and we select markers on the basis of their contributions to the total genetic variation. We also modify Clayton's "haplotype tagging SNP" selection method, which utilizes haplotype information. For both methods, we propose sliding window-based algorithms that allow the methods to be applied to large chromosomal regions. Our procedures require genotype information about a small number of individuals for an initial set of SNPs and selection of an optimum subset of SNPs that could be efficiently genotyped on larger numbers of samples while retaining most of the genetic variation in samples. We identify suitable parameter combinations for the procedures, and we show that a sample size of 50-100 individuals achieves consistent results in studies of simulated data sets in linkage equilibrium and LD. When applied to experimental data sets, both procedures were similarly effective at reducing the genotyping requirement while maintaining the genetic information content throughout the regions. We also show that haplotype-association results that Hosking et al. obtained near CYP2D6 were almost identical before and after marker selection.

2.
Analysis of haplotypes based on multiple single-nucleotide polymorphisms (SNPs) is becoming common for both candidate gene and fine-mapping studies. Before embarking on studies of haplotypes from genetically distinct populations, however, it is important to consider variation both in linkage disequilibrium (LD) and in haplotype frequencies within and across populations. Such diversity will influence the choice of "tagging" SNPs for candidate gene or whole-genome association studies because some markers will not be polymorphic in all samples and some haplotypes will be poorly represented or completely absent. Here we analyze 11 genes, originally chosen as candidate genes for oral clefts, where multiple markers were genotyped on individuals from four populations. Estimated haplotype frequencies, measures of pairwise LD, and genetic diversity were computed for 135 European-Americans, 57 Chinese-Singaporeans, 45 Malay-Singaporeans, and 46 Indian-Singaporeans. Patterns of pairwise LD were compared across these four populations and haplotype frequencies were used to assess genetic variation. Although these populations are fairly similar in allele frequencies and overall patterns of LD, both haplotype frequencies and genetic diversity varied significantly across populations. Such haplotype diversity has implications for designing studies of association involving samples from genetically distinct populations.
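The pairwise LD measures mentioned in this abstract are usually D, |D'| and r². The following is a minimal sketch of how they can be computed for two biallelic loci from phased haplotypes; it is not the authors' code. The function name `pairwise_ld`, the 0/1 allele coding, and the toy haplotype sample are assumptions made for illustration only.

```python
def pairwise_ld(haplotypes):
    """Return (D, |D'|, r^2) for two biallelic loci.

    `haplotypes` is a list of (allele_at_locus_1, allele_at_locus_2) pairs,
    with alleles coded 0 or 1 (hypothetical coding chosen for this sketch).
    """
    n = len(haplotypes)
    p_a = sum(h[0] for h in haplotypes) / n          # frequency of allele 1 at locus 1
    p_b = sum(h[1] for h in haplotypes) / n          # frequency of allele 1 at locus 2
    p_ab = sum(1 for h in haplotypes if h == (1, 1)) / n  # frequency of the 1-1 haplotype

    d = p_ab - p_a * p_b                             # raw disequilibrium coefficient
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0   # normalised |D'|

    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0         # squared allelic correlation
    return d, d_prime, r2


# Toy sample of 10 phased haplotypes (invented data, not from any study above).
sample = [(1, 1)] * 4 + [(1, 0)] + [(0, 1)] + [(0, 0)] * 4
d, d_prime, r2 = pairwise_ld(sample)
print(round(d, 3), round(d_prime, 3), round(r2, 3))
```

With unphased genotype data, haplotype frequencies would first have to be estimated (for example by an EM algorithm) before a calculation like this applies.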

3.
Linkage disequilibrium (LD) is a major concern in many genetic studies because of the markedly increased density of SNP (single-nucleotide polymorphism) genotype markers. This dramatic increase in the number of SNPs may cause problems in statistical analyses, such as by introducing multiple comparisons in hypothesis testing and collinearity in logistic regression models, because of the presence of complex LD structures. Inferences must be made about the underlying genetic variation through the LD structure before applying statistical models to the data. Therefore, we introduced the textile plot to provide a visualization of LD to improve the analysis of the genetic variation present in multiple-SNP genotype data. The plot can accentuate LD by displaying specific geometrical shapes, allowing the underlying haplotype structure to be inferred without any haplotype-phasing algorithms. Application of this technique to simulated and real data sets illustrated the potential usefulness of the textile plot as an aid to the interpretation of LD in multiple-SNP genotype data. The initial results of LD mapping and haplotype analyses of disease genes are encouraging, indicating that the textile plot may be useful in disease association studies.

4.
5.
Appaloosa horses are predisposed to equine recurrent uveitis (ERU), an immune‐mediated disease characterized by recurring inflammation of the uveal tract in the eye, which is the leading cause of blindness in horses. Nine genetic markers from the ECA1 region responsible for the spotted coat color of Appaloosa horses, and 13 microsatellites spanning the equine major histocompatibility complex (ELA) on ECA20, were evaluated for association with ERU in a group of 53 Appaloosa ERU cases and 43 healthy Appaloosa controls. Three markers were significantly associated (corrected P‐value <0.05): a SNP within intron 11 of the TRPM1 gene on ECA1, an ELA class I microsatellite located near the boundary of the ELA class III and class II regions and an ELA class II microsatellite located in intron 1 of the DRA gene. Association between these three genetic markers and the ERU phenotype was confirmed in a second population of 24 insidious ERU Appaloosa cases and 16 Appaloosa controls. The relative odds of being an ERU case for each allele of these three markers were estimated by fitting a logistic mixed model with each of the associated markers independently and with all three markers simultaneously. The risk model using these markers classified ~80% of ERU cases and 75% of controls in the second population as moderate or high risk, and low risk respectively. Future studies to refine the associations at ECA1 and ELA loci and identify functional variants could uncover alleles conferring susceptibility to ERU in Appaloosa horses.

6.
Psoriasis is a common skin disorder of multifactorial origin. Genomewide scans for disease susceptibility have repeatedly demonstrated the existence of a major locus, PSORS1 (psoriasis susceptibility 1), contained within the major histocompatibility complex (MHC), on chromosome 6p21. Subsequent refinement studies have highlighted linkage disequilibrium (LD) with psoriasis, along a 150-kb segment that includes at least three candidate genes (encoding human leukocyte antigen-C [HLA-C], alpha-helix-coiled-coil-rod homologue, and corneodesmosin), each of which has been shown to harbor disease-associated alleles. However, the boundaries of the minimal PSORS1 region remain poorly defined. Moreover, interpretations of allelic association with psoriasis are compounded by limited insight into LD conservation within the MHC class I interval. To address these issues, we have pursued a high-resolution genetic characterization of the PSORS1 locus. We resequenced genomic segments along a 220-kb region at chromosome 6p21 and identified a total of 119 high-frequency SNPs. Using 59 SNPs (18 coding and 41 noncoding SNPs) whose position was representative of the overall marker distribution, we genotyped a data set of 171 independently ascertained parent-affected offspring trios. Family-based association analysis of this cohort highlighted two SNPs (n.7 and n.9) respectively lying 7 and 4 kb proximal to HLA-C. These markers generated highly significant evidence of disease association (P < 10^-9), several orders of magnitude greater than the observed significance displayed by any other SNP that has previously been associated with disease susceptibility. This observation was replicated in a Gujarati Indian case/control data set. Haplotype-based analysis detected overtransmission of a cluster of chromosomes, which probably originated by ancestral mutation of a common disease-bearing haplotype. The only markers exclusive to the overtransmitted chromosomes are SNPs n.7 and n.9, which define a 10-kb PSORS1 core risk haplotype. These data demonstrate the power of SNP haplotype-based association analyses and provide high-resolution dissection of genetic variation across the PSORS1 interval, the major susceptibility locus for psoriasis.

7.
Patterns of linkage disequilibrium in the MHC region on human chromosome 6p (cited 5 times in total: 0 self-citations, 5 by others)
Single nucleotide polymorphisms (SNPs) in the human genome are thought to be organised into blocks of high internal linkage disequilibrium (LD), separated by intermittent recombination hotspots. Since understanding haplotype structure is critical for an accurate assessment of inter-individual genetic differences, we investigated up to 968 SNPs from a 10-Mb region on chromosome 6p21, including the human major histocompatibility complex (MHC), in five different population samples (45–550 individuals). Regions of well-defined block structure were found to coexist alongside large areas lacking any clear structure; occasional long-range LD was observed in all five samples. The four white populations analysed were remarkably similar in terms of the extent and spatial distribution of local LD. In US African Americans, the distribution of LD was similar to that in the white populations but the observed haplotype diversity was higher. The existence of large regions without any clear block structure renders the systematic and thorough construction of SNP haplotype maps a crucial prerequisite for disease-association studies. Electronic supplementary material is available in the online version of this article.

8.
Drought often delays developmental events so that plant height and above-ground biomass are reduced, resulting in yield loss due to inadequate photosynthate. In this study, plant height and biomass measured by the Normalized Difference Vegetation Index (NDVI) were used as criteria for drought tolerance. A total of 305 lines representing temperate, tropical and subtropical maize germplasm were genotyped using two single nucleotide polymorphism (SNP) chips each containing 1536 markers, from which 2052 informative SNPs and 386 haplotypes each constructed with two or more SNPs were used for linkage disequilibrium (LD) or association mapping. Single SNP- and haplotype-based LD mapping identified two significant SNPs and three haplotype loci [a total of four quantitative trait loci (QTL)] for plant height under well-watered and water-stressed conditions. For biomass, 32 SNPs and 12 haplotype loci (30 QTL) were identified using NDVIs measured at seven stages under the two water regimes. Some significant SNP and haplotype loci for NDVI were shared by different stages. Comparing significant loci identified by single SNP- and haplotype-based LD mapping, we found that six out of the 14 chromosomal regions defined by haplotype loci each included at least one significant SNP for the same trait. Significant SNP haplotype loci explained much higher phenotypic variation than individual SNPs. Moreover, we found that two significant SNPs (two QTL) and one haplotype locus were shared by plant height and NDVI. The results indicate the power of comparative LD mapping using single SNPs and SNP haplotypes with QTL shared by plant height and biomass as secondary traits for drought tolerance in maize.

9.
An integrated haplotype map of the human major histocompatibility complex (cited 26 times in total: 0 self-citations, 26 by others)
Numerous studies have clearly indicated a role for the major histocompatibility complex (MHC) in susceptibility to autoimmune diseases. Such studies have focused on the genetic variation of a small number of classical human-leukocyte-antigen (HLA) genes in the region. Although these genes represent good candidates, given their immunological roles, linkage disequilibrium (LD) surrounding these genes has made it difficult to rule out neighboring genes, many with immune function, as influencing disease susceptibility. It is likely that a comprehensive analysis of the patterns of LD and variation, by using a high-density map of single-nucleotide polymorphisms (SNPs), would enable a greater understanding of the nature of the observed associations, as well as lead to the identification of causal variation. We present herein an initial analysis of this region, using 201 SNPs, nine classical HLA loci, two TAP genes, and 18 microsatellites. This analysis suggests that LD and variation in the MHC, aside from the classical HLA loci, are essentially no different from those in the rest of the genome. Furthermore, these data show that multi-SNP haplotypes will likely be a valuable means for refining association signals in this region.

10.
Single-nucleotide polymorphisms (SNPs) are rapidly replacing microsatellites as the markers of choice for genetic linkage studies and many other studies of human pedigrees. Here, we describe an efficient approach for modeling linkage disequilibrium (LD) between markers during multipoint analysis of human pedigrees. Using a gene-counting algorithm suitable for pedigree data, our approach enables rapid estimation of allele and haplotype frequencies within clusters of tightly linked markers. In addition, with the use of a hidden Markov model, our approach allows for multipoint pedigree analysis with large numbers of SNP markers organized into clusters of markers in LD. Simulation results show that our approach resolves previously described biases in multipoint linkage analysis with SNPs that are in LD. An updated version of the freely available Merlin software package uses the approach described here to perform many common pedigree analyses, including haplotyping and haplotype frequency estimation, parametric and nonparametric multipoint linkage analysis of discrete traits, variance-components and regression-based analysis of quantitative traits, calculation of identity-by-descent or kinship coefficients, and case selection for follow-up association studies. To illustrate the possibilities, we examine a data set that provides evidence of linkage of psoriasis to chromosome 17.

11.
Segregation distortion was found for a haplotype of the equine lymphocyte antigen (ELA) system in an extended family of American Standardbred horses. In one sire family, consisting of a stallion and his 17 sons and grandsons, the gene for ELA A10 (A10) was transmitted to 57.7% of 638 offspring scored (P=0.0001). Significant segregation distortion was not seen for mares or for unrelated stallions, regardless of the ELA markers they possessed. Since the effect was seen for this one sire family and not seen for other stallions with A10, it is unlikely that the gene for A10 is the cause of this phenomenon, but rather A10 is linked to another major histocompatibility complex (MHC) gene causing this trait. This trait appeared analogous to the segregation distortion observed for the T/t complex of the mouse. Since segregation distortion involving MHC genes has been seen in other species, genes for this trait may be a general feature of the MHC.
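A transmission rate such as the one reported in this abstract can be checked against the Mendelian expectation of 50% with a binomial test. The sketch below is one way to do that and is not necessarily the test the authors used. The count of 368 transmissions is an assumption obtained by rounding 57.7% of 638, and the function name `binomial_two_sided_p` is hypothetical.

```python
from math import comb


def binomial_two_sided_p(successes, trials):
    """Exact two-sided binomial p-value under H0: transmission probability = 0.5.

    Because the null distribution is symmetric, the two-sided p-value is
    twice the tail probability of the more extreme side (capped at 1).
    """
    k = max(successes, trials - successes)
    tail = sum(comb(trials, i) for i in range(k, trials + 1)) / 2 ** trials
    return min(1.0, 2 * tail)


# Assumed count: 57.7% of 638 offspring is roughly 368 transmissions.
p_value = binomial_two_sided_p(368, 638)
print(f"two-sided p = {p_value:.2e}")
```

Under these assumptions the p-value is on the order of 10^-4, the same order of magnitude as the P value quoted in the abstract.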

12.
Hao Z  Li X  Xie C  Weng J  Li M  Zhang D  Liang X  Liu L  Liu S  Zhang S 《植物学报(英文版)》2011,53(8):641-652
Single nucleotide polymorphism (SNP) is a common form of genetic variation and is widespread in the maize genome. An Illumina GoldenGate assay with 1536 SNP markers was used to genotype maize inbred lines and to identify the functional genetic variations underlying drought tolerance by association analysis. Across 80 lines, 1006 polymorphic SNPs (65.5% of the total) in the assay with good call quality were used to estimate the pattern of genetic diversity, population structure, and familial relatedness. The analysis showed the best number of fixed subgroups was six, which was consistent with their original sources and results using only simple sequence repeat markers. Pairwise linkage disequilibrium (LD) and association mapping with phenotypic traits investigated under water-stressed and well-watered regimes showed rapid LD decline within 100-500 kb along the physical distance of each chromosome, and that 29 SNPs were associated with at least two phenotypic traits in one or more environments, which were related to drought-tolerant or drought-responsive genes. These drought-tolerance-associated SNPs could be converted into functional markers and then used for maize improvement by marker-assisted selection.

13.
Gattepaille LM  Jakobsson M 《Genetics》2012,190(1):159-174
High-throughput genotyping and sequencing technologies can generate dense sets of genetic markers for large numbers of individuals. For most species, these data will contain many markers in linkage disequilibrium (LD). To utilize such data for population structure inference, we investigate the use of haplotypes constructed by combining the alleles at single-nucleotide polymorphisms (SNPs). We introduce a statistic derived from information theory, the gain of informativeness for assignment (GIA), which quantifies the additional information for assigning individuals to populations using haplotype data compared to using individual loci separately. Using a two-loci-two-allele model, we demonstrate that combining markers in linkage equilibrium into haplotypes always leads to nonpositive GIA, suggesting that combining the two markers is not advantageous for ancestry inference. However, for loci in LD, GIA is often positive, suggesting that assignment can be improved by combining markers into haplotypes. Using GIA as a criterion for combining markers into haplotypes, we demonstrate for simulated data a significant improvement of assigning individuals to candidate populations. For the many cases that we investigate, incorrect assignment was reduced between 26% and 97% using haplotype data. For empirical data from French and German individuals, the incorrectly assigned individuals can, for example, be decreased by 73% using haplotypes. Our results can be useful for challenging population structure and assignment problems, in particular for studies where large-scale population-genomic data are available.

14.
Interleukin-10 (IL-10) is a cytokine that seems to function as a downregulator of the innate (nonadaptive) immune system. Approximately three-quarters of interindividual variability in human IL-10 levels has been attributed to genetic variation, and there is evidence suggesting a potential role for IL-10 in a range of human diseases. To provide a basis for haplotype analysis and future disease association studies, we characterized genetic variation in IL10 by sequencing all exons, and 2.5 kb of the 5'- and the 3'-flanking region in a panel of DNA samples from 24 African Americans, 23 European Americans, and 24 Hispanic Americans. The region sequenced was found to contain 28 single-nucleotide polymorphisms (SNPs), 16 with frequency >2% and 14 with frequency >5%. All SNPs with frequency >5% were present in subjects from all three populations. No SNP caused amino acid changes. Differences in pairwise linkage-disequilibrium (LD) patterns and in SNP and haplotype frequency distributions among the three populations may be of potential importance for disease association studies.

15.
This report describes single-nucleotide polymorphisms (SNPs) in the sheep major histocompatibility complex (MHC) class II and class III regions and provides insights into the internal structure of this important genomic complex. MHC haplotypes were deduced from sheep family trios based on genotypes from 20 novel SNPs representative of the class II region and 10 previously described SNPs spanning the class III region. All 30 SNPs exhibited Hardy-Weinberg proportions in the sheep population studied. Recombination within an extended sire haplotype was observed within the class II region for 4 of 20 sheep chromosomes, thereby supporting the presence of separated IIa and IIb subregions similar to those present in cattle. SNP heterozygosity varied across the class II and III regions. One segment of the class IIa subregion manifested very low heterozygosity for several SNPs spanning approximately 120 Kbp. This feature corresponds to a subregion within the human MHC class II region previously described as a 'SNP desert' because of its paucity of SNPs. Linkage disequilibrium (LD) was reduced at the junction separating the putative class IIb and IIa subregions and also between the class IIa and the class III subregions. The latter observation is consistent with either an unmapped physical separation at this location or more likely a boundary characterized by more frequent recombination between two conserved subregions, each manifesting high within-block LD. These results identify internal blocks of loci in the sheep MHC, within which recombination is relatively rare.
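Whether a SNP "exhibits Hardy-Weinberg proportions" is commonly checked with a chi-square goodness-of-fit test on genotype counts. The sketch below shows that standard test for one biallelic SNP; it is an illustration under stated assumptions, not the procedure used in the study. The function name `hwe_chi_square` and the genotype counts are invented for the example.

```python
from math import erfc, sqrt


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 degree of freedom) for Hardy-Weinberg proportions.

    Takes observed counts of the three genotypes at one biallelic SNP and
    returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A
    q = 1 - p                            # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical genotype counts: the first fits HWE exactly, the second
# shows a strong deficit of heterozygotes.
print(hwe_chi_square(25, 50, 25))
print(hwe_chi_square(40, 20, 40))
```

For small samples or rare alleles an exact test is usually preferred over the chi-square approximation.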

16.
Lu Y  Shah T  Hao Z  Taba S  Zhang S  Gao S  Liu J  Cao M  Wang J  Prakash AB  Rong T  Xu Y 《PloS one》2011,6(9):e24861
Understanding genetic diversity and linkage disequilibrium (LD) decay in diverse maize germplasm is fundamentally important for maize improvement. A total of 287 tropical and 160 temperate inbred lines were genotyped with 1943 single nucleotide polymorphism (SNP) markers of high quality and compared for genetic diversity and LD decay using the SNPs and their haplotypes developed from genic and intergenic regions. Intronic SNPs revealed substantially higher variation than exonic SNPs. The large-window haplotypes (3-SNP sliding window covering 2160 kb on average) revealed much higher genetic diversity than the 10 kb-window and gene-window haplotypes. The polymorphic information content values revealed by the haplotypes (0.436-0.566) were generally much higher than those of individual SNPs (0.247-0.259). Cluster analysis classified the 447 maize lines into two major groups, corresponding to temperate and tropical types. The level of genetic diversity and subpopulation structure were associated with the germplasm origin and post-domestication selection. Compared to temperate lines, the tropical lines had a much higher level of genetic diversity with no significant subpopulation structure identified. Significant variation in LD decay distance (2-100 kb) was found across the genome, chromosomal regions and germplasm groups. The average LD decay distance (10-100 kb) in the temperate germplasm was two to ten times larger than that in the tropical germplasm (5-10 kb). In conclusion, tropical maize not only hosts high genetic diversity that can be exploited for future plant breeding, but also shows rapid LD decay that provides more opportunity for selection.
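Polymorphic information content (PIC) is conventionally computed from allele frequencies as PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j². The short sketch below applies that standard formula; whether the authors used exactly this form is an assumption, and the function name `pic` and the example frequencies are illustrative only.

```python
def pic(freqs):
    """Polymorphic information content for one marker.

    `freqs` is a list of allele (or haplotype) frequencies summing to 1.
    """
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1 - homozygosity - cross_term


# A biallelic SNP reaches its maximum PIC at equal allele frequencies.
print(pic([0.5, 0.5]))                 # 0.375
# A marker with four equally frequent haplotypes (hypothetical) scores higher.
print(pic([0.25, 0.25, 0.25, 0.25]))   # 0.703125
```

Because a biallelic marker cannot exceed 0.375 under this formula, multi-allelic haplotypes can reach PIC values that no single SNP can, which is consistent with the ranges reported in the abstract.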

17.
The ability of natural populations to adapt to new environmental conditions is crucial for their survival and partly determined by the standing genetic variation in each population. Populations with higher genetic diversity are more likely to contain individuals that are better adapted to new circumstances than populations with lower genetic diversity. Here, we use both neutral and major histocompatibility complex (MHC) markers to test whether small and highly fragmented populations hold lower genetic diversity than large ones. We use black grouse as it is distributed across Europe and found in populations with varying degrees of isolation and size. We sampled 11 different populations; five continuous, three isolated, and three small and isolated. We tested patterns of genetic variation in these populations using three different types of genetic markers: nine microsatellites and 21 single nucleotide polymorphisms (SNPs) which both were found to be neutral, and two functional MHC genes that are presumably under selection. The small isolated populations displayed significantly lower neutral genetic diversity compared to continuous populations. A similar trend, but not as pronounced, was found for genotypes at MHC class II loci. Populations were less divergent at MHC genes compared to neutral markers. Measures of genetic diversity and population genetic structure were positively correlated among microsatellites and SNPs, but none of them were correlated to MHC when comparing all populations. Our results suggest that balancing selection at MHC loci does not counteract the power of genetic drift when populations get small and fragmented.

18.
Yoo YK  Ke X  Hong S  Jang HY  Park K  Kim S  Ahn T  Lee YD  Song O  Rho NY  Lee MS  Lee YS  Kim J  Kim YJ  Yang JM  Song K  Kimm K  Weir B  Cardon LR  Lee JE  Hwang JJ 《Genetics》2006,174(1):491-497
The International HapMap Project aims to generate detailed human genome variation maps by densely genotyping single-nucleotide polymorphisms (SNPs) in CEPH, Chinese, Japanese, and Yoruba samples. This will undoubtedly become an important facility for genetic studies of diseases and complex traits in the four populations. To address how the genetic information contained in such variation maps is transferable to other populations, the Korean government, industries, and academics have launched the Korean HapMap project to genotype high-density Encyclopedia of DNA Elements (ENCODE) regions in 90 Korean individuals. Here we show that the LD pattern, block structure, haplotype diversity, and recombination rate are highly concordant between Korean and the two HapMap Asian samples, particularly Japanese. The availability of information from both Chinese and Japanese samples helps to predict more accurately the possible performance of HapMap markers in Korean disease-gene studies. Tagging SNPs selected from the two HapMap Asian maps, especially the Japanese map, were shown to be very effective for Korean samples. These results demonstrate that the HapMap variation maps are robust in related populations and will serve as an important resource for the studies of the Korean population in particular.

19.
Kim KJ  Lee HJ  Park MH  Cha SH  Kim KS  Kim HT  Kimm K  Oh B  Lee JY 《Genomics》2006,88(5):535-540
Understanding patterns of linkage disequilibrium (LD) across genomes may facilitate association mapping studies to localize genetic variants influencing complex diseases, a recognition that led to the International Haplotype Mapping Project (HapMap). Divergent patterns of haplotype frequency and LD across global populations require that the HapMap database be supplemented with haplotype and LD data from additional populations. We conducted a pilot study of the LD and haplotype structure of a genomic region in a Korean population. A total of 165 SNPs were identified in a 200-kb region of 22q13.2 by direct sequencing. Unphased genotype data were generated for 76 SNPs in 90 unrelated Korean individuals. LD, haplotype diversity, and recombination rates were assessed in this region and compared with the HapMap database. The pattern of LD and haplotype frequencies of Korean samples showed a high degree of similarity with Japanese data. There was a strong correlation between high LD and low recombination frequency in this region. We found considerable similarities in local LD patterns between three Asian populations (Han Chinese, Japanese, and Korean) and the CEPH population. Haplotype frequencies were, however, significantly different between them. Our results should further the understanding of distinctive Korean genomic features and assist in designing appropriate association studies.

20.
OBJECTIVES: Linkage disequilibrium (LD) between closely spaced SNPs can be accommodated in linkage analysis by specifying the multi-SNP haplotype frequencies, if known. Phased haplotypes in candidate regions can provide gold standard haplotype frequency estimates, and may be of inherent interest as markers. We evaluated the effects of different methods of haplotype frequency estimation, and the use of marker phase information, on linkage analysis of a multi-SNP cluster in a candidate region for Alzheimer's disease (AD). METHODS: We performed parametric linkage analysis of a five-SNP cluster in extended pedigrees to compare the use of: (1) haplotype frequencies estimated by molecular phase determination, maximum likelihood estimation, or by assuming linkage equilibrium (LE); (2) AD families or controls as the frequency source; and (3) unphased or molecularly phased SNP data. RESULTS: There was moderate to strong pairwise LD among the five SNPs. Falsely assuming LE substantially inflated the LOD score, but the method of haplotype frequency estimation and particular sample used made little difference provided that LD was accommodated. Use of phased haplotypes produced a modest increase in the LOD score over unphased SNPs. CONCLUSIONS: Ignoring LD between markers can lead to substantially inflated evidence for linkage in LOD score analysis of extended pedigrees with missing data. Use of marker phase information in linkage analysis may be important in disease studies where the costs of family recruitment and phenotyping greatly exceed the costs of phase determination.
