Similar literature
 20 similar articles found (search time: 31 ms)
1.
The population of Costa Rica (CR) represents an admixture of major continental populations. An investigation of the CR population structure would provide an important foundation for mapping genetic variants underlying common diseases and traits. We conducted an analysis of 1,301 women from the Guanacaste region of CR using 27,904 single nucleotide polymorphisms (SNPs) genotyped on a custom Illumina InfiniumII iSelect chip. The program STRUCTURE was used to compare the CR Guanacaste sample with four continental reference samples, including HapMap Europeans (CEU), East Asians (JPT+CHB), West African Yoruba (YRI), as well as Native Americans (NA) from the Illumina iControl database. Our results show that the CR Guanacaste sample comprises a three-way admixture estimated to be 43% European, 38% Native American and 15% West African. An estimated 4% residual Asian ancestry may be within the error range. Results from principal components analysis reveal a correlation between genetic and geographic distance. The magnitude of linkage disequilibrium (LD), measured by the number of tagging SNPs required to cover the same region of the genome, appeared to be weaker in the CR Guanacaste sample than in the CEU, JPT+CHB and NA reference samples but stronger than in the HapMap YRI sample. Based on the clustering pattern observed in both STRUCTURE and principal components analysis, two subpopulations were identified that differ by approximately 20% in LD block size averaged over all LD blocks identified by Haploview. We also show, in a simulated association study conducted within the two subpopulations, that failure to account for population stratification (PS) could lead to a noticeable inflation in the false positive rate. However, we further demonstrate that existing PS adjustment approaches can reduce the inflation to an acceptable level for gene discovery.

2.
Linkage disequilibrium (LD) has received much attention recently because of its value in localizing disease-causing genes. Due to the extensive LD between neighboring loci in the human genome, it is believed that a subset of the single nucleotide polymorphisms in a region (tagSNPs) can be selected to capture most of the remaining SNP variants. In this study, we examined LD patterns and HapMap tagSNP transferability in more than 300 individuals. A South Indian sample and an African Mbuti Pygmy population sample were included to evaluate the performance of HapMap tagSNPs in geographically distinct and genetically isolated populations. Our results show that HapMap tagSNPs selected with r² ≥ 0.8 can capture more than 85% of the SNPs in populations that are from the same continental group. Combined tagSNPs from HapMap CEU and CHB+JPT serve as the best reference for the Indian sample. The HapMap YRI are a sufficient reference for tagSNP selection in the Pygmy sample. In addition to our findings, we reviewed over 25 recent studies of tagSNP transferability and propose a general guideline for selecting tagSNPs from HapMap populations.
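
The r² ≥ 0.8 capture criterion described above can be illustrated with a minimal sketch. It assumes unphased genotypes coded as 0/1/2 allele counts and uses the squared Pearson correlation between genotype vectors as a stand-in for r²; HapMap tagSNP selection is usually based on haplotype r², so this is only an approximation. The function names and the toy data are hypothetical and are not taken from the study.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 allele counts)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # monomorphic SNP: correlation undefined, treated as not tagged
    return cov * cov / (var_x * var_y)


def fraction_captured(genotypes, tag_ids, threshold=0.8):
    """Share of SNPs that are tags themselves or reach r^2 >= threshold with some tag."""
    captured = 0
    for snp, g in genotypes.items():
        if snp in tag_ids or any(
            r_squared(g, genotypes[t]) >= threshold for t in tag_ids
        ):
            captured += 1
    return captured / len(genotypes)


# Hypothetical toy data: six individuals, four SNPs.
toy = {
    "snp1": [0, 1, 2, 0, 1, 2],
    "snp2": [0, 1, 2, 0, 1, 2],  # identical to snp1
    "snp3": [2, 1, 0, 2, 1, 0],  # allele coding flipped relative to snp1
    "snp4": [0, 0, 1, 2, 2, 1],  # uncorrelated with snp1
}
coverage = fraction_captured(toy, {"snp1"})
```

Evaluating transferability would then amount to choosing `tag_ids` in a reference panel and computing `fraction_captured` on genotypes from the target population.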

3.
SNP markers provide the primary data for population structure analysis. In this study, we employed whole-genome autosomal SNPs as a marker set (54,836 SNP markers) and tested their possible effects on genetic ancestry using 320 subjects covering 24 regional groups including Northern (n = 16) and Southern (n = 3) Asians, Amerindians (n = 1), and four HapMap populations (YRI, CEU, JPT, and CHB). Additionally, we evaluated the effectiveness and robustness of 50K autosomal SNPs with various clustering methods, along with their dependencies on recombination hotspots (RH), linkage disequilibrium (LD), missing calls and regional specific markers. The RH- and LD-free multi-dimensional scaling (MDS) method showed a broad picture of human migration from Africa to North-East Asia on our genome map, supporting results from previous haploid DNA studies. Of the Asian groups, the East Asian group showed greater differentiation than the Northern and Southern Asian groups with respect to Fst statistics. By extension, the analysis of monomorphic markers implied that nine out of ten historical regions in South Korea, and Tokyo in Japan, showed signs of genetic drift caused by the later settlement of East Asia (South Korea, Japan and China), while Gyeongju in South East Korea showed signs of the earliest settlement in East Asia. In the genome map, the gene flow to the Korean Peninsula from its neighboring countries indicated that some genetic signals from Northern populations such as the Siberians and Mongolians still remain in the South East and West regions, while few signals remain from the early Southern lineages.

4.
5.

Background  

Because single nucleotide polymorphisms (SNPs) are genetic variations that distinguish any two unrelated individuals, they can be used to identify an individual's source population. For efficient population identification with the HapMap genotype data, as few informative SNPs as possible should be selected from the original 4 million SNPs. Recently, Park et al. (2006) adopted the nearest shrunken centroid method to classify three populations, namely Utah residents with ancestry from Northern and Western Europe (CEU), Yoruba in Ibadan, Nigeria, in West Africa (YRI), and Han Chinese in Beijing together with Japanese in Tokyo (CHB+JPT); 100,736 SNPs were obtained, of which the top 82 could completely classify the three populations.

6.
7.
The International HapMap Project is a resource for researchers containing genotype, sequencing, and expression information for EBV-transformed lymphoblastoid cell lines derived from populations across the world. The expansion of the HapMap beyond the four initial populations of Phase 2, referred to as Phase 3, has increased the sample number and ethnic diversity available for investigation. However, differences in the rate of cellular proliferation between the populations can serve as confounders in phenotype-genotype studies using these cell lines. Within the Phase 2 populations, the JPT and CHB cell lines grow faster (p < 0.0001) than the CEU or YRI cell lines. Phase 3 YRI cell lines grow significantly slower than Phase 2 YRI lines (p < 0.0001), with no widespread genetic differences based on common SNPs. In addition, we found significant growth differences between the cell lines in the Phase 2 ASN populations and the Han Chinese from the Denver metropolitan area panel in Phase 3 (p < 0.0001). Therefore, studies that separate HapMap panels into discovery and replication sets must take this into consideration.

8.
Efficacy assessment of SNP sets for genome-wide disease association studies
The power of a genome-wide disease association study depends critically upon the properties of the marker set used, particularly the number and physical spacing of markers, and the level of inter-marker association due to linkage disequilibrium. Extending our previously devised theoretical framework for the entropy-based selection of genetic markers, we have developed a local measure of the efficacy of a marker set, relative to including a maximally polymorphic single nucleotide polymorphism (SNP) at the map position of interest. Using this quantitative criterion, we evaluated five currently available SNP sets, namely Affymetrix 100K and 500K, and Illumina 100K, 300K and 550K in the CEU, YRI and JPT + CHB HapMap populations. At 50% relative efficacy, the commercial marker sets cover between 19 and 68% of the human genome, depending upon the population under study. An optimal technology-independent 500K marker set constructed from HapMap for Caucasians, in contrast, would achieve 73% coverage at the same relative efficacy.

9.
The vitamin D receptor (VDR) is an essential protein related to bone metabolism. Some VDR alleles are differentially distributed among ethnic populations and display variable patterns of linkage disequilibrium (LD). In this study, 200 unrelated Brazilians were genotyped using 21 VDR single nucleotide polymorphisms (SNPs) and 28 ancestry informative markers. The patterns of LD and haplotype distribution were compared between Brazilians and the HapMap populations of African (YRI), European (CEU) and Asian (JPT+CHB) origins. Conditional regression and haplotype-specific analyses were performed using estimates of individual genetic ancestry in Brazilians as a quantitative trait. Similar patterns of LD were observed in the 5' and 3' gene regions. However, the frequency distribution of haplotype blocks varied among populations. Conditional regression analysis identified haplotypes associated with European and Amerindian ancestry, but not with the proportion of African ancestry. Individual ancestry estimates were associated with VDR haplotypes. These findings reinforce the need to correct for population stratification when performing genetic association studies in admixed populations.

10.
Lu JT, Wang Y, Gibbs RA, Yu F. Genome Biology, 2012, 13(2): R15-11

Background

Indels are an important cause of human variation and central to the study of human disease. The 1000 Genomes Project Low-Coverage Pilot identified over 1.3 million indels shorter than 50 bp, of which over 890 were identified as potentially disruptive variants. Yet, despite their ubiquity, the local genomic characteristics of indels remain unexplored.

Results

Herein we describe population- and minor allele frequency-based differences in linkage disequilibrium and imputation characteristics for indels included in the 1000 Genomes Project Low-Coverage Pilot for the CEU, YRI and CHB+JPT populations. Common indels were well tagged by nearby SNPs in all studied populations, and were also tagged at a similar rate to common SNPs. Both neutral and functionally deleterious common indels were imputed with greater than 95% concordance from HapMap Phase 3 and OMNI SNP sites. Further, 38 to 56% of low frequency indels were tagged by low frequency SNPs. We were able to impute heterozygous low frequency indels with over 50% concordance. Lastly, our analysis also revealed evidence of ascertainment bias. This bias prevents us from extending the applicability of our results to highly polymorphic indels that could not be identified in the Low-Coverage Pilot.

Conclusions

Although further scope exists to improve the imputation of low frequency indels, our study demonstrates that there are already ample opportunities to retrospectively impute indels for prior genome-wide association studies and to incorporate indel imputation into future case/control studies.

11.
Genotype imputation, used in genome-wide association studies to expand coverage of single nucleotide polymorphisms (SNPs), has performed poorly in African Americans compared to less admixed populations. Overall, imputation has typically relied on HapMap reference haplotype panels from Africans (YRI), European Americans (CEU), and Asians (CHB/JPT). The 1000 Genomes project offers a wider range of reference populations, such as African Americans (ASW), but their imputation performance has had limited evaluation. Using 595 African Americans genotyped on Illumina’s HumanHap550v3 BeadChip, we compared imputation results from four software programs (IMPUTE2, BEAGLE, MaCH, and MaCH-Admix) and three reference panels consisting of different combinations of 1000 Genomes populations (February 2012 release): (1) 3 specifically selected populations (YRI, CEU, and ASW); (2) 8 populations of diverse African (AFR) or European (EUR) descent; and (3) all 14 available populations (ALL). Based on chromosome 22, we calculated three performance metrics: (1) concordance (percentage of masked genotyped SNPs with imputed and true genotype agreement); (2) imputation quality score (IQS; concordance adjusted for chance agreement, which is particularly informative for low minor allele frequency [MAF] SNPs); and (3) average r2hat (estimated correlation between the imputed and true genotypes, for all imputed SNPs). Across the reference panels, IMPUTE2 and MaCH had the highest concordance (91%–93%), but IMPUTE2 had the highest IQS (81%–83%) and average r2hat (0.68 using YRI+ASW+CEU, 0.62 using AFR+EUR, and 0.55 using ALL). Imputation quality for most programs was reduced by the addition of more distantly related reference populations, due entirely to the introduction of low frequency SNPs (MAF≤2%) that are monomorphic in the more closely related panels. While imputation was optimized by using IMPUTE2 with reference to the ALL panel (average r2hat = 0.86 for SNPs with MAF>2%), use of the ALL panel for African American studies requires careful interpretation of the population specificity and imputation quality of low frequency SNPs.
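
The difference between raw concordance and a chance-adjusted score can be made concrete with a small sketch. It assumes the chance-adjusted score takes the kappa-style form (P_obs − P_chance) / (1 − P_chance), which matches the abstract's description of IQS as "concordance adjusted for chance agreement"; the exact formula used in the study is not given in the text, so treat this as an illustration. Function names and the toy genotypes are hypothetical.

```python
from collections import Counter


def concordance(true_gt, imputed_gt):
    """Fraction of masked genotypes where the imputed call equals the true call."""
    matches = sum(t == i for t, i in zip(true_gt, imputed_gt))
    return matches / len(true_gt)


def chance_adjusted_agreement(true_gt, imputed_gt):
    """Kappa-style score: (P_obs - P_chance) / (1 - P_chance).

    Assumed form of a chance-adjusted concordance; not confirmed by the source.
    """
    n = len(true_gt)
    p_obs = concordance(true_gt, imputed_gt)
    true_counts = Counter(true_gt)
    imputed_counts = Counter(imputed_gt)
    # Agreement expected if calls were assigned independently with the same margins.
    p_chance = sum(true_counts[g] * imputed_counts[g] for g in true_counts) / (n * n)
    if p_chance == 1.0:
        return 0.0  # both vectors constant and identical: score is not informative
    return (p_obs - p_chance) / (1.0 - p_chance)


# Hypothetical rare variant: one heterozygote among ten individuals,
# and an imputation that simply calls everyone homozygous reference.
true_calls = [0] * 9 + [1]
imputed_calls = [0] * 10
raw = concordance(true_calls, imputed_calls)
adjusted = chance_adjusted_agreement(true_calls, imputed_calls)
```

In this toy case the raw concordance is high even though the imputation never detects the rare allele, while the adjusted score is zero, which is why a chance-adjusted metric is more informative for low-MAF SNPs.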

12.
In order to analyze the pattern of DNA polymorphism in detail, we have developed a simple method using a new statistic θ(i), which estimates 4Nμ from the number of segregating sites whose allelic nucleotide frequency is i/n among n DNA sequences, where N is the effective population size and μ is the mutation rate per generation per nucleotide site. Under the assumption that mutations are selectively neutral and population size is constant, the expectation of θ(i) is equal to that of θ, which estimates 4Nμ from the total number of segregating sites, so that the distribution of θ(i) is flat. Therefore, the departure of the distribution of θ(i) from the horizontal line, which represents the value of θ, reflects changes in population size and natural selection. Results of the coalescent simulation show that the distributions of θ(i) in populations that experienced expansion and reduction are U-shaped and upside-down U-shaped, respectively. The distributions of θ(i) in some populations that experienced a bottleneck are W-shaped. Furthermore, we have applied this method to the SNP data in the International HapMap Project. Results of data analyses show that the distributions of θ(i) in the CEU (European), CHB and JPT (Asian) populations are different from that in the YRI population (African). From these results in nuclear DNA and the already known pattern of polymorphism in human mitochondrial DNA, we infer that the CEU, CHB and JPT populations experienced a bottleneck.
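
The flatness argument above can be sketched numerically. The example assumes the unfolded site frequency spectrum and the standard neutral expectation E[ξ_i] = θ/i for the number of sites ξ_i at which the derived allele appears i times, so that i·ξ_i estimates θ for each class; the paper's exact definition of its per-class statistic (for instance folding or per-site scaling) is not given in the abstract and may differ. Watterson's estimator, S divided by the harmonic sum a_n, is the usual estimator based on the total number of segregating sites. Function names and the toy spectrum are hypothetical.

```python
def watterson_theta(num_segregating, n):
    """Watterson's estimator: S / a_n, with a_n = sum_{k=1}^{n-1} 1/k."""
    a_n = sum(1.0 / k for k in range(1, n))
    return num_segregating / a_n


def theta_by_class(sfs):
    """Per-class estimates i * xi_i from an unfolded site frequency spectrum.

    sfs[i - 1] is the number of segregating sites whose derived allele
    occurs i times among the n sequences (i = 1 .. n - 1).
    """
    return [i * count for i, count in enumerate(sfs, start=1)]


# Hypothetical spectrum for n = 5 sequences that matches the neutral
# expectation theta / i exactly with theta = 12.
n_sequences = 5
toy_sfs = [12, 6, 4, 3]
per_class = theta_by_class(toy_sfs)
overall = watterson_theta(sum(toy_sfs), n_sequences)
```

With real data, plotting `per_class` against i and comparing it with the horizontal line at `overall` gives the kind of profile the abstract describes as flat, U-shaped, or W-shaped.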

13.
Pharmacogenomic variant information is well known for major human populations; however, this information is less commonly studied in minorities. In the present study, we genotyped 85 very important pharmacogenetic (VIP) variants (selected from the PharmGKB database) in the Kyrgyz population and compared our data with four other major human populations: Han Chinese in Beijing, China (CHB), the Japanese in Tokyo, Japan (JPT), a northern and western European population (CEU), and the Yoruba in Ibadan, Nigeria (YRI). Genotype frequencies of 13, 12 and 16 of the selected VIP variants in the Kyrgyz differed from those of the CHB, JPT and CEU, respectively (p < 0.005). Compared with the YRI, 32 variants differed (p < 0.005). Genotype frequencies of ADH1B, AHR, CYP3A5, PTGS2, VDR, and VKORC1 in the Kyrgyz differed widely from those in the four populations. Haplotype analyses also showed differences between the Kyrgyz and the other four populations. Our results complement the information on the Kyrgyz available in pharmacogenomics databases. We provide a theoretical basis for safer drug administration and individualized treatment plans for the Kyrgyz. We also provide a template for the study of pharmacogenomics in various ethnic minority groups in China.

14.

Background

The Chinese Hui population, as the second largest minority ethnic group in China, may have a different genetic background from Han people because of its unique demographic history. In this study, we aimed to identify genetic differences between Han and Hui Chinese from the Ningxia region of China by comparing eighteen single nucleotide polymorphisms in cancer-related genes.

Methods

DNA samples were collected from 99 Hui and 145 Han people from the Ningxia Hui Autonomous Region in China, and SNPs were detected using an improved multiplex ligase detection reaction method. Genotyping data from six 1000 Genomes Project population samples (99 Utah residents with northern and western European ancestry (CEU), 107 Toscani in Italy (TSI), 108 Yoruba in Ibadan (YRI), 61 of African ancestry in the southwestern US (ASW), 103 Han Chinese in Beijing (CHB), and 104 Japanese in Tokyo (JPT)) were also included in this study. Differences in the distribution of alleles among the populations were assessed using χ² tests, and FST was used to measure the degree of population differentiation.

Results

We found that the genetic diversity of many SNPs in cancer-related genes in the Hui Chinese in Ningxia was different from that in the Han Chinese in Ningxia. For example, the allele frequencies of four SNPs (rs13361707, rs2274223, rs465498, and rs753955) showed different genetic distributions (p<0.05) between Chinese Ningxia Han and Chinese Ningxia Hui. Five SNPs (rs730506, rs13361707, rs2274223, rs465498 and rs753955) had different FST values (FST >0.000) between the Hui and Han populations.

Conclusions

These results suggest that some SNPs associated with cancer-related genes vary among different Chinese ethnic groups. We suggest that population differences should be carefully considered in evaluating cancer risk and prognosis as well as the efficacy of cancer therapy.
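
The two statistics named in the Methods above, a χ² test on allele counts and FST, can be sketched for a single biallelic SNP in two populations. The sketch assumes a Pearson χ² test on a 2×2 table of allele counts (1 degree of freedom, no continuity correction) and the simple heterozygosity-based definition FST = (H_T − H_S) / H_T with equal population weights; the study does not state which estimator or correction it used. The allele counts are invented for illustration and are not data from the study.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def fst_two_populations(p1, p2):
    """FST = (H_T - H_S) / H_T for one biallelic SNP, equal population weights."""
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_total == 0.0:
        return 0.0  # SNP is monomorphic in the pooled sample
    return (h_total - h_within) / h_total


# Hypothetical allele counts (allele A, allele a) in two populations.
pop1_counts = (60, 140)
pop2_counts = (100, 100)
stat, p_value = chi_square_2x2(*pop1_counts, *pop2_counts)
fst = fst_two_populations(
    pop1_counts[0] / sum(pop1_counts),
    pop2_counts[0] / sum(pop2_counts),
)
```

A sample-size-weighted or bias-corrected estimator (for example Weir and Cockerham's) would be a more careful choice for real data with unequal sample sizes such as 99 versus 145 individuals.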

15.
Previous theory indicates that zygotic linkage disequilibrium (LD) is more informative than gametic or composite digenic LD in revealing natural population history. Further, the difference between the composite digenic and maximum zygotic LDs can be used to detect epistatic selection for fitness. Here we corroborate the theory by investigating genome-wide zygotic LDs in HapMap phase III human populations. Results show that non-African populations have many more significant zygotic LDs than do African populations. African populations (ASW, LWK, MKK, and YRI) possess more significant zygotic LDs for the double homozygotes (D_AABB) than any other significant zygotic LDs (D_AABb, D_AaBB, and D_AaBb), while non-African populations generally have more significant D_AaBb values than any other significant zygotic LDs (D_AABB, D_AABb, and D_AaBB). Average r-squares for any significant zygotic LDs increase generally in the order of populations YRI, MKK, CEU, CHB, LWK, JPT, CHD, TSI, GIH, ASW, and MEX. Average r-squares are greater for D_AABB and D_AaBb than for D_AaBB and D_AABb in each population. YRI and MKK can be separated from LWK and ASW in terms of the pattern of average r-squares. All population divergences in zygotic LDs can be interpreted with the Out of Africa model of modern human origins. We have also detected 19,735 to 95,921 SNP pairs exhibiting strong signals of epistatic selection in different populations. Gene-gene interactions for some epistatic SNP pairs are evident from empirical findings, but many more epistatic SNP pairs await evidence. Epistatic SNP pairs common to all populations are rare, but pairs shared within distinct regions (Africa, Europe, and East Asia) exist, which helps in understanding geographical genomic medicine.

16.
We screened for the major essential single-nucleotide polymorphism (SNP) variant that might be associated with the MSH2 gene based on the data available from three types of human tissue samples [156 lymphoblastoid cell lines (LCL), 160 epidermis, 166 fat]. An association analysis confirmed that the KCNK12 SNP variant (rs748780) was highly associated (p value 9 × 10⁻⁴) with the MSH2 gene for all three samples. Using SNP identification, we further found that the recognized SNP was also relevant among HapMap populations. Techniques that display specific SNPs associated with the gene of interest or nearby genes provide more reliable genetic associations than techniques that rely on data from individual SNPs. We investigated the MSH2 gene regional linkage association with the determined SNP (rs748780), KCNK12 variant (Allele T>C) in the intronic region, in HapMap3 full dataset populations, Yoruba in Ibadan, Nigeria (YRI), Utah residents with ancestry from northern Europe (CEU), Han Chinese in Beijing, China (CHB), and a population of Mexican ancestry in Los Angeles, California (MEX). A gene-based SNP association analysis analyzes the combined impact of every variant within the gene while creating referrals to linkage disequilibrium or connections between markers. Our results indicated that among the four populations studied, this association was highest in the MEX population based on the r² value; a similar pattern was also observed in the other three populations. The relevant SNP rs748780 in KCNK12 is related to a superfamily of potassium channel pore-forming P-domain proteins as well as to other non-pore-forming proteins and has been shown to be relevant to neurological disorder predisposition in MEX as well as in other populations.

17.
Prolongation of the electrocardiographic QT interval, a measure of cardiac repolarization, predisposes one to ventricular arrhythmias and sudden cardiac death. Since NOS1AP, a regulator of neuronal nitric oxide synthase, was discovered in a genome-wide association study (GWAS) as a novel target that modulates cardiac repolarization, several loci have been linked to the QT interval in studies (QTGEN and QTSCD) of individuals of European descent. However, there has been no GWAS of the QT interval in Asian populations. We conducted a GWAS of the QT interval in Korea Association Resource (KARE [n = 6,805]) cohorts. Replication studies in independent Korean (n = 4,686) and Japanese (n = 2,687) populations validated the association between a SNP, rs13017846, which maps near SLC8A1 (sodium/calcium exchanger 1 precursor, overall p = 8.0 × 10⁻¹⁴), and the QT interval. The minor allele frequency (MAF) of rs13017846 varies widely between ethnicities: 0.053 in Europeans (HapMap CEU [Utah residents with ancestry from northern and western Europe from the Centre d'étude du Polymorphisme Humain collection] samples) versus 0.080 in Africans (HapMap YRI [Yoruba in Ibadan, Nigeria] samples), whereas a MAF of 0.500 has been reported in Asians (HapMap CHB [Han Chinese in Beijing, China] and JPT [Japanese in Tokyo, Japan] samples). This might explain why this locus has not been identified in Europeans in previous studies.

18.
19.
MOTIVATION: The tag SNP approach is a valuable tool in whole genome association studies, and a variety of algorithms have been proposed to identify the optimal tag SNP set. Currently, most tag SNP selection is based on two-marker (pairwise) linkage disequilibrium (LD). Recent literature has shown that multiple-marker LD also contains useful information that can further increase the genetic coverage of the tag SNP set. Thus, tag SNP selection methods that incorporate multiple-marker LD are expected to have advantages in terms of genetic coverage and statistical power. RESULTS: We propose a novel algorithm to select tag SNPs in an iterative procedure. In each iteration, the SNP that captures the most neighboring SNPs (through pairwise and multiple-marker LD) is selected as a tag SNP. We optimize the algorithm and computer program to make our approach feasible on today's typical workstations. Benchmarked using HapMap release 21, our algorithm outperforms the standard pairwise LD approach in several aspects. (i) It improves genetic coverage (e.g. by 7.2% for 200 K tag SNPs in HapMap CEU) compared to its conventional pairwise counterpart, when conditioning on a fixed tag SNP number. (ii) It saves genotyping costs substantially when conditioning on fixed genetic coverage (e.g. 34.1% saving in HapMap CEU at 90% coverage). (iii) Tag SNPs identified using multiple-marker LD have good portability across closely related ethnic groups and (iv) show higher statistical power in association tests than those selected using conventional methods. AVAILABILITY: A computer software suite, multiTag, has been developed based on this novel algorithm. The program is freely available by written request to the author at ke_hao@merck.com
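
The iterative selection step described above resembles a greedy set-cover loop, which can be sketched as follows. This is not the multiTag implementation: it uses only a precomputed pairwise "captures" relation (for example, r² above some threshold) and omits the multiple-marker LD that the abstract identifies as the method's main contribution. The function name, the input format, and the toy data are all hypothetical.

```python
def greedy_tag_snps(captures):
    """Greedy tag selection over a pairwise 'captures' relation.

    captures maps each SNP id to the set of SNP ids it captures.
    Every SNP is treated as capturing itself, so the loop always terminates.
    Ties are broken alphabetically to keep the result deterministic.
    """
    uncovered = set(captures)
    tags = []
    while uncovered:
        candidates = sorted(s for s in captures if s not in tags)
        # Pick the SNP that captures the most SNPs not yet covered.
        best = max(
            candidates,
            key=lambda s: len((captures[s] | {s}) & uncovered),
        )
        tags.append(best)
        uncovered -= captures[best] | {best}
    return tags


# Hypothetical LD structure: two small blocks and one singleton SNP.
toy_captures = {
    "a": {"a", "b", "c"},
    "b": {"a", "b"},
    "c": {"a", "c"},
    "d": {"d", "e"},
    "e": {"d", "e"},
    "f": {"f"},
}
selected = greedy_tag_snps(toy_captures)
```

Extending the sketch toward the approach in the abstract would mean letting combinations of already selected tags capture additional SNPs, which changes how `uncovered` is updated after each pick.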

20.

Background

Genetic isolates such as the Ashkenazi Jews (AJ) potentially offer advantages in mapping novel loci in whole genome disease association studies. To analyze patterns of genetic variation in AJ, genotypes of 101 healthy individuals were determined using the Affymetrix EAv3 500 K SNP array and compared to 60 CEPH-derived HapMap (CEU) individuals. 435,632 SNPs overlapped and met annotation criteria in the two groups.

Results

A small but significant global difference in allele frequencies between AJ and CEU was demonstrated by a mean FST of 0.009 (P < 0.001); large regions that differed were found on chromosomes 2 and 6. Haplotype blocks inferred from pairwise linkage disequilibrium (LD) statistics (Haploview) as well as by expectation-maximization haplotype phase inference (HAP) showed a greater number of haplotype blocks in AJ compared to CEU by Haploview (50,397 vs. 44,169) or by HAP (59,269 vs. 54,457). Average haplotype blocks were smaller in AJ compared to CEU (e.g., 36.8 kb vs. 40.5 kb HAP). Analysis of global patterns of local LD decay for closely-spaced SNPs in CEU demonstrated more LD, while for SNPs further apart, LD was slightly greater in the AJ. A likelihood ratio approach showed that runs of homozygous SNPs were approximately 20% longer in AJ. A principal components analysis was sufficient to completely resolve the CEU from the AJ.

Conclusion

LD in the AJ versus CEU was lower than expected by some measures and higher by others. Any putative advantage in whole genome association mapping using the AJ population will be highly dependent on regional LD structure.
