Found 20 similar documents (search time: 15 ms)
1.
Yohan Bossé François Bacot Alexandre Montpetit Johan Rung Hui-Qi Qu James C. Engert Constantin Polychronakos Thomas J. Hudson Philippe Froguel Robert Sladek Martin Desrosiers 《Human genetics》2009,125(3):305-318
The success of genome-wide association studies (GWAS) to identify risk loci of complex diseases is now well-established. One
persistent major hurdle is the cost of those studies, which puts them beyond the reach of most research groups. Performing
GWAS on pools of DNA samples may be an effective strategy to reduce the costs of these studies. In this study, we performed
pooling-based GWAS with more than 550,000 SNPs in two case-control cohorts consisting of patients with Type II diabetes (T2DM)
and with chronic rhinosinusitis (CRS). In the T2DM study, the results of the pooling experiment were compared to individual
genotypes obtained from a previously published GWAS. TCF7L2 and HHEX SNPs associated with T2DM by the traditional GWAS were
among the top ranked SNPs in the pooling experiment. This dataset was also used to refine the best strategy to correctly identify
SNPs that will remain significant based on individual genotyping. In the CRS study, the top hits from the pooling-based GWAS
located within ten kilobases of known genes were validated by individual genotyping of 1,536 SNPs. Forty-one percent (598
out of the 1,457 SNPs that passed quality control) were associated with CRS at a nominal P value of 0.05, confirming the potential of pooling-based GWAS to identify SNPs that differ in allele frequencies between
two groups of subjects. Overall, our results demonstrate that a pooling experiment on high-density genotyping arrays can accurately
determine the minor allelic frequency as compared to individual genotyping and produce a list of top ranked SNPs that captures
genuine allelic differences between a group of cases and controls. The low cost associated with a pooling-based GWAS clearly
justifies its use in screening for genetic determinants of complex diseases.
Electronic supplementary material The online version of this article (doi:) contains supplementary material, which is available to authorized users.
2.
As millions of single-nucleotide polymorphisms (SNPs) have been identified and high-throughput genotyping technologies have been rapidly developed, large-scale genomewide association studies will soon be within reach. However, because a genomewide association study involves a large number of SNPs, it is nearly impossible to ensure a genomewide significance level of 0.05 using the available statistics; the use of tagging SNPs alleviates the multiple-test problem, but not sufficiently. One strategy to circumvent the multiple-test problem associated with genomewide association tests is to develop novel test statistics with high power. In this report, we introduce several nonlinear tests, which are based on nonlinear transformation of allele or haplotype frequencies. We investigate the power of the nonlinear test statistics and demonstrate that under certain conditions, some nonlinear test statistics have much higher power than the standard χ² test statistic. Type I error rates of the nonlinear tests are validated using simulation studies. We also show that a class of similarity measure-based test statistics is based on the quadratic function of allele or haplotype frequencies, and thus they belong to nonlinear tests. To evaluate their performance, the nonlinear test statistics are also applied to three real data sets. Our study shows that nonlinear test statistics have great potential in association studies of complex diseases.
3.
Guidelines for genotyping in genomewide linkage studies: single-nucleotide-polymorphism maps versus microsatellite maps
Genomewide linkage scans have traditionally employed panels of microsatellite markers spaced at intervals of approximately 10 cM across the genome. However, there is a growing realization that a map of closely spaced single-nucleotide polymorphisms (SNPs) may offer equal or superior power to detect linkage, compared with low-density microsatellite maps. We performed a series of simulations to calculate the information content associated with microsatellite and SNP maps across a range of different marker densities and heterozygosities for sib pairs (with and without parental genotypes), sib trios, and sib quads. In the case of microsatellite markers, we varied density across 11 levels (1 marker every 0.5, 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 cM) and marker heterozygosity across 6 levels (2, 3, 4, 5, 10, or 20 equally frequent alleles), whereas, in the case of SNPs, we varied marker density across 4 levels (1 marker every 0.1, 0.2, 0.5, or 1 cM) and minor-allele frequency across 7 levels (0.5, 0.4, 0.3, 0.2, 0.1, 0.05, and 0.01). When parental genotypes were available, a map consisting of microsatellites spaced every 2 cM or a relatively sparse map of SNPs (i.e., at least 1 SNP/cM) was sufficient to extract most of the inheritance information from the map (>95% in most cases). However, when parental genotypes were unavailable, it was important to use as dense a map of markers as possible to extract the greatest amount of inheritance information. It is important to note that the information content associated with a traditional map of microsatellite markers (i.e., 1 marker every ~10 cM) was significantly lower than the information content associated with a dense map of SNPs or microsatellites. These results strongly suggest that previous linkage studies that employed sparse microsatellite maps could benefit substantially from reanalysis by use of a denser map of markers.
4.
As the extent of human genetic variation becomes more fully characterized, the research community is faced with the challenging task of using this information to dissect the heritable components of complex traits. Genomewide association studies offer great promise in this respect, but their analysis poses formidable difficulties. In this article, we describe a computationally efficient approach to mining genotype-phenotype associations that scales to the size of the data sets currently being collected in such studies. We use discrete graphical models as a data-mining tool, searching for single- or multilocus patterns of association around a causative site. The approach is fully Bayesian, allowing us to incorporate prior knowledge on the spatial dependencies around each marker due to linkage disequilibrium, which considerably reduces the number of possible graphical structures. A Markov chain Monte Carlo scheme is developed that yields samples from the posterior distribution of graphs conditional on the data, from which probabilistic statements about the strength of any genotype-phenotype association can be made. Using data simulated under scenarios that vary in marker density, genotype relative risk of a causative allele, and mode of inheritance, we show that the proposed approach has better localization properties and leads to lower false-positive rates than do single-locus analyses. Finally, we present an application of our method to a quasi-synthetic data set in which data from the CYP2D6 region are embedded within simulated data on 100K single-nucleotide polymorphisms. Analysis is quick (<5 min), and we are able to localize the causative site to a very short interval.
5.
Efficient genotyping methods and the availability of a large collection of single-nucleotide polymorphisms provide valuable tools for genetic studies of human disease. The standard χ² statistic for case-control studies, which uses a linear function of allele frequencies, has limited power when the number of marker loci is large. We introduce a novel test statistic for genetic association studies that uses Shannon entropy and a nonlinear function of allele frequencies to amplify the differences in allele and haplotype frequencies to maintain statistical power with large numbers of marker loci. We investigate the relationship between the entropy-based test statistic and the standard χ² statistic and show that, in most cases, the power of the entropy-based statistic is greater than that of the standard χ² statistic. The distribution of the entropy-based statistic and the type I error rates are validated using simulation studies. Finally, we apply the new entropy-based test statistic to two real data sets, one for the COMT gene and schizophrenia and one for the MMP-2 gene and esophageal carcinoma, to evaluate the performance of the new method for genetic association studies. The results show that the entropy-based statistic obtained smaller P values than did the standard χ² statistic.
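For orientation, the baseline that this abstract compares against can be sketched as follows. This is a minimal illustration of the standard 1-df allelic χ² test on a 2×2 table of allele counts, plus the Shannon entropy of a set of allele frequencies. It does not reproduce the entropy-based statistic itself, whose exact form is not given in the abstract; the function names and the example counts are assumptions made for illustration.

```python
import math


def allelic_chi2(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square (1 df) for a 2x2 table of allele counts.

    Rows are cases and controls, columns are the two alleles.
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    den = ((case_a + case_b) * (ctrl_a + ctrl_b)
           * (case_a + ctrl_a) * (case_b + ctrl_b))
    if den == 0:
        raise ValueError("every row and column total must be positive")
    return n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / den


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


def shannon_entropy(freqs):
    """Shannon entropy (natural log) of allele or haplotype frequencies."""
    return -sum(p * math.log(p) for p in freqs if p > 0)


if __name__ == "__main__":
    # Hypothetical counts: allele A seen 60/100 times in cases, 40/100 in controls.
    stat = allelic_chi2(60, 40, 40, 60)
    print(stat, chi2_sf_1df(stat))
    print(shannon_entropy([0.6, 0.4]), shannon_entropy([0.4, 0.6]))
```

The χ² statistic depends on the allele frequencies only through the squared difference of counts, which is the sense in which the abstract calls it a function of a linear contrast; the entropy is a nonlinear function of the same frequencies.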
6.
Identification of genetic variants contributing to cisplatin-induced cytotoxicity by use of a genomewide approach (total citations: 2; self-citations: 0; citations by others: 2)
Huang RS Duan S Shukla SJ Kistner EO Clark TA Chen TX Schweitzer AC Blume JE Dolan ME 《American journal of human genetics》2007,81(3):427-437
Cisplatin, a platinating agent commonly used to treat several cancers, is associated with nephrotoxicity, neurotoxicity, and ototoxicity, which has hindered its utility. To gain a better understanding of the genetic variants associated with cisplatin-induced toxicity, we present a stepwise approach integrating genotypes, gene expression, and sensitivity of HapMap cell lines to cisplatin. Cell lines derived from 30 trios of European descent (CEU) and 30 trios of African descent (YRI) were used to develop a preclinical model to identify genetic variants and gene expression that contribute to cisplatin-induced cytotoxicity in two different populations. Cytotoxicity was determined as cell-growth inhibition at increasing concentrations of cisplatin for 48 h. Gene expression in 176 HapMap cell lines (87 CEU and 89 YRI) was determined using the Affymetrix GeneChip Human Exon 1.0 ST Array. We identified six, two, and nine representative SNPs that contribute to cisplatin-induced cytotoxicity through their effects on 8, 2, and 16 gene expressions in the combined, Centre d'Etude du Polymorphisme Humain (CEPH), and Yoruban populations, respectively. These genetic variants contribute to 27%, 29%, and 45% of the overall variation in cell sensitivity to cisplatin in the combined, CEPH, and Yoruban populations, respectively. Our whole-genome approach can be used to elucidate the expression of quantitative trait loci contributing to a wide range of cellular phenotypes.
7.
Published genomewide association (GWA) studies typically analyze and report single-nucleotide polymorphisms (SNPs) and their neighboring genes with the strongest evidence of association (the “most-significant SNPs/genes” approach), while paying little attention to the rest. Borrowing ideas from microarray data analysis, we demonstrate that pathway-based approaches, which jointly consider multiple contributing factors in the same pathway, might complement the most-significant SNPs/genes approach and provide additional insights into interpretation of GWA data on complex diseases.
8.
9.
10.
Homer N Tembe WD Szelinger S Redman M Stephan DA Pearson JV Nelson SF Craig D 《Bioinformatics (Oxford, England)》2008,24(17):1896-1902
For many genome-wide association (GWA) studies, individually genotyping one million or more SNPs provides a marginal increase in coverage at a substantial cost. Much of the information gained is redundant due to the correlation structure inherent in the human genome. Pooling-based GWA studies could benefit significantly by utilizing this redundancy to reduce noise, improve the accuracy of the observations and increase genomic coverage. We introduce a measure of correlation between individual genotyping and pooling, under the same framework in which r² provides a measure of linkage disequilibrium (LD) between pairs of SNPs. We then report a new non-haplotype multimarker multi-loci method that leverages the correlation structure between SNPs in the human genome to increase the efficacy of pooling-based GWA studies. We first give a theoretical framework and derivation of our multimarker method. Next, we evaluate simulations using this multimarker approach in comparison to single marker analysis. Finally, we experimentally evaluate our method using different pools of HapMap individuals on the Illumina 450S Duo, Illumina 550K and Affymetrix 5.0 platforms for a combined total of 1 333 631 SNPs. Our results show that use of multimarker analysis reduces noise specific to pooling-based studies, allows for efficient integration of multiple microarray platforms and provides more accurate measures of significance than single marker analysis. Additionally, this approach can be extended to allow for imputing the association significance for SNPs not directly observed using neighboring SNPs in LD. This multimarker method can now be used to cost-effectively complete pooling-based GWA studies with multiple platforms across over one million SNPs and to impute neighboring SNPs weighted for the loss of information due to pooling.
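The LD measure that this abstract uses as its reference point can be written down directly. The sketch below computes the usual pairwise r² between two biallelic SNPs from a haplotype frequency and two allele frequencies; it does not attempt the pooling correlation measure or the multimarker method, neither of which is specified in the abstract. The function and argument names are assumptions.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Pairwise r^2 between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and B at SNP 2
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    """
    for p in (p_a, p_b):
        if not 0.0 < p < 1.0:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


if __name__ == "__main__":
    # Hypothetical frequencies: partial LD between two common SNPs.
    print(ld_r_squared(p_ab=0.35, p_a=0.5, p_b=0.5))
```

When r² between two SNPs is high, one SNP carries most of the information about the other, which is the redundancy the abstract proposes to exploit.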
11.
Enriching the analysis of genomewide association studies with hierarchical modeling (total citations: 1; self-citations: 0; citations by others: 1)
Genomewide association studies (GWAs) initially investigate hundreds of thousands of single-nucleotide polymorphisms (SNPs), and the most promising SNPs are further evaluated with additional subjects, for replication or a joint analysis. Deciding which SNPs merit follow-up is one of the most crucial aspects of these studies. We present here an approach for selecting the most-promising SNPs that incorporates into a hierarchical model both conventional results and other existing information about the SNPs. The model is developed for general use, its potential value is shown by application, and tools are provided for undertaking hierarchical modeling. By quantitatively harnessing all available information in GWAs, hierarchical modeling may more clearly distinguish true causal variants from noise.
12.
Highly sensitive method for genomewide detection of allelic composition in nonpaired, primary tumor specimens by use of Affymetrix single-nucleotide-polymorphism genotyping microarrays
Yamamoto G Nannya Y Kato M Sanada M Levine RL Kawamata N Hangaishi A Kurokawa M Chiba S Gilliland DG Koeffler HP Ogawa S 《American journal of human genetics》2007,81(1):114-126
Loss of heterozygosity (LOH), either with or without accompanying copy-number loss, is a cardinal feature of cancer genomes that is tightly linked to cancer development. However, detection of LOH is frequently hampered by the presence of normal cell components within tumor specimens and the limitation in availability of constitutive DNA. Here, we describe a simple but highly sensitive method for genomewide detection of allelic composition, based on the Affymetrix single-nucleotide-polymorphism genotyping microarray platform, without dependence on the availability of constitutive DNA. By sensing subtle distortions in allele-specific signals caused by allelic imbalance with the use of anonymous controls, sensitive detection of LOH is enabled with accurate determination of allele-specific copy numbers, even in the presence of up to 70%-80% normal cell contamination. The performance of the new algorithm, called "AsCNAR" (allele-specific copy-number analysis using anonymous references), was demonstrated by detecting the copy-number neutral LOH, or uniparental disomy (UPD), in a large number of acute leukemia samples. We next applied this technique to detection of UPD involving the 9p arm in myeloproliferative disorders (MPDs), which is tightly associated with a homozygous JAK2 mutation. It revealed an unexpectedly high frequency of 9p UPD that otherwise would have been undetected and also disclosed the existence of multiple subpopulations having distinct 9p UPD within the same MPD specimen. In conclusion, AsCNAR should substantially improve our ability to dissect the complexity of cancer genomes and should contribute to our understanding of the genetic basis of human cancers.
13.
A weighted-Holm procedure accounting for allele frequencies in genomewide association studies
In the context of genomewide association studies where hundreds of thousands of polymorphisms are tested, stringent thresholds on the raw association test P-values are generally used to limit false-positive results. Instead of using thresholds based on raw P-values as in Bonferroni and sequential Sidak (SidakSD) corrections, we propose here to use a weighted-Holm procedure with weights depending on allele frequency of the polymorphisms. This method is shown to substantially improve the power to detect associations, in particular by favoring the detection of rare variants with high genetic effects over more frequent ones with lower effects.
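A generic weighted-Holm step-down procedure can be sketched as below. The abstract does not say how the weights are derived from allele frequency, so the weights here are simply passed in by the caller, and the example values are hypothetical. With equal weights the procedure reduces to the ordinary Holm correction.

```python
def weighted_holm(pvalues, weights, alpha=0.05):
    """Generic weighted-Holm step-down procedure.

    Hypotheses are visited in increasing order of p_i / w_i. Hypothesis i is
    rejected while p_i <= alpha * w_i / (sum of weights not yet rejected);
    the procedure stops at the first failure.
    Returns a list of booleans aligned with the input order.
    """
    if len(pvalues) != len(weights):
        raise ValueError("pvalues and weights must have the same length")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be positive")

    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i] / weights[i])
    remaining_weight = float(sum(weights))
    rejected = [False] * len(pvalues)
    for i in order:
        if pvalues[i] <= alpha * weights[i] / remaining_weight:
            rejected[i] = True
            remaining_weight -= weights[i]
        else:
            break  # step-down: no later hypothesis can be rejected
    return rejected


if __name__ == "__main__":
    # Hypothetical example: the first SNP is up-weighted (e.g. a rare variant).
    print(weighted_holm([0.03, 0.2], [3.0, 1.0]))
    print(weighted_holm([0.03, 0.2], [1.0, 1.0]))
```

In the hypothetical example, the first p-value passes its threshold only when it carries the larger weight, which illustrates how up-weighting a class of variants can raise the power to detect them at the expense of the others.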
14.
Yijun Zuo 《Journal of theoretical biology》2010,262(4):576-583
Genomewide association studies (GWAS) are being conducted to unravel the genetic etiology of complex diseases, in which complex epistasis may play an important role. A one-stage method, in which interactions are tested using all samples at one time, may be computationally problematic, may have low power as the number of markers tested increases, and may not be cost-efficient. A common two-stage method, which uses all samples in both stages, may be a reasonable and powerful approach for detecting interacting genes. In this study, we introduce an alternative two-stage method, in which some promising markers are selected using a proportion of samples in the first stage and interactions are then tested using the remaining samples in the second stage. We call this the mixed two-stage method. We then investigate the power of both the one-stage method and the mixed two-stage method to detect interacting disease loci for a range of two-locus epistatic models in a case-control study design. Our results suggest that the mixed two-stage method may be more powerful than the one-stage method if we choose about 30% of samples for single-locus tests in the first stage and identify at most 1% of markers for follow-up interaction tests. In addition, we compare the two two-stage methods and find that our mixed two-stage method loses power because it uses only part of the samples in each stage.
15.
Garner C 《Human heredity》2006,61(1):22-26
BACKGROUND: The optimal control sample would be ethnically-matched and at minimal risk of developing the disease. Alternatively, one could collect random individuals from the population or select individuals to reduce the number of at-risk individuals in the sample. The effect of randomly selected individuals in a control sample on the statistical power and the odds ratio estimate was investigated. METHODS: Case and control genotype distributions were simulated using standard genetic models with an additional term representing the proportion of unidentified cases in the control sample. Power and odds ratio were calculated from the genotype distributions generated under different sampling scenarios using established methods. RESULTS: Random sampling of controls resulted in a loss in power and a reduction in the odds ratio estimate to a degree that is determined by the proportion of random sampling and the prevalence of the disease. Random sampling resulted in a 19% loss in power for a disease having prevalence of 0.20, compared to a control sample that contained no at-risk individuals. Having random controls results in a decrease in the odds ratio estimate. CONCLUSIONS: Investigators planning case-control genetic association studies should be aware of the statistical costs of different ascertainment approaches.
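The attenuation of the odds ratio described in the RESULTS can be illustrated with a simplified allele-level sketch. It assumes that a randomly sampled control group is a mixture of unaffected individuals and a fraction of unidentified cases equal to the disease prevalence. The abstract's own simulations work on genotype distributions under standard genetic models, so this is only an approximation of the idea; all names and numbers below are hypothetical.

```python
def allelic_odds_ratio(p_case, p_control):
    """Allelic odds ratio from risk-allele frequencies in cases and controls."""
    return (p_case / (1.0 - p_case)) / (p_control / (1.0 - p_control))


def mixed_control_frequency(p_case, p_unaffected, case_fraction):
    """Risk-allele frequency in a control sample containing hidden cases.

    case_fraction is the proportion of unidentified cases among the controls;
    for purely random population sampling it equals the disease prevalence.
    """
    return case_fraction * p_case + (1.0 - case_fraction) * p_unaffected


if __name__ == "__main__":
    p_case, p_unaffected, prevalence = 0.30, 0.20, 0.20  # hypothetical values
    p_random = mixed_control_frequency(p_case, p_unaffected, prevalence)
    print(allelic_odds_ratio(p_case, p_unaffected))  # screened controls
    print(allelic_odds_ratio(p_case, p_random))      # random controls
```

Because the hidden cases pull the control allele frequency toward the case frequency, the odds ratio estimated with random controls sits closer to 1 than the one estimated with screened controls, and the gap grows with prevalence.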
16.
Kyung-Won Hong Cheong-Sik Kim Haesook Min Seon-Joo Park Jae Kyung Park Younjhin Ahn Sung Soo Kim Yeonjung Kim 《Genes & genomics.》2013,35(1):69-75
Early menarche is associated with adverse health outcomes, including breast cancer, endometrial cancer, obesity, type 2 diabetes, and cardiovascular disease. Recently, a genomewide association study (GWAS) of age at menarche (AAM) in 104,533 individuals of European ancestry was reported by the ReproGen consortium. They identified 42 known and novel loci that were linked to age at menarche. Because age at menarche varies between ethnic groups, we decided to investigate whether these results would be replicated in the Korean population. To this end, we examined the association of the SNPs reported in the ReproGen GWAS with AAM in 3,194 individuals from the Korean Genome and Epidemiology Study (KoGES) cohort. Genotype data for a total of 17 SNPs (6 genotyped SNPs and 11 imputed SNPs) were available for the association analysis, which used linear regression for age at menarche, controlling for current age, waist-to-hip ratio, and body mass index as covariates. We found replication of the ReproGen study in two SNPs: one SNP (rs466639) in the retinoic acid receptor gamma gene (RXRG), showing a significant association with early menarche (beta = −0.224 ± 0.065, p value = 5.2 × 10⁻⁴, Bonferroni-corrected p value = 0.009), and the other (rs10899489), in GRB2 (growth factor receptor bound protein 2)-associated binding protein 2 (GAB2), linked to late menarche (beta = 0.140 ± 0.047, p value = 2.8 × 10⁻³, Bonferroni-corrected p value = 0.049). This result possibly suggests that genetic factors governing AAM in the Korean population would be distinct from those in the Europeans, implying roles of modulating or interacting factors in determining AAM, including environmental factors such as nutritional status.
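The Bonferroni-corrected values quoted in this abstract can be checked with a one-line calculation, assuming the correction multiplies each raw p value by the 17 SNPs tested (the abstract does not state the multiplier explicitly). The function name is an assumption.

```python
def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p value, capped at 1."""
    return min(1.0, p_value * n_tests)


if __name__ == "__main__":
    n_snps = 17  # SNPs available for analysis, per the abstract
    for label, p in (("rs466639", 5.2e-4), ("rs10899489", 2.8e-3)):
        print(label, bonferroni(p, n_snps))
```

Under that assumption the first SNP gives 0.00884, matching the reported 0.009. The second gives about 0.048 from the rounded raw p value, close to the reported 0.049; the small gap is presumably due to rounding of the raw p value in the abstract.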
17.
Meta-analysis of genetic association studies (total citations: 11; self-citations: 0; citations by others: 11)
Meta-analysis, a statistical tool for combining results across studies, is becoming popular as a method for resolving discrepancies in genetic association studies. Persistent difficulties in obtaining robust, replicable results in genetic association studies are almost certainly because genetic effects are small, requiring studies with many thousands of subjects to be detected. In this article, we describe how meta-analysis works and consider whether it will solve the problem of underpowered studies or whether it is another affliction visited by statisticians on geneticists. We show that meta-analysis has been successful in revealing unexpected sources of heterogeneity, such as publication bias. If heterogeneity is adequately recognized and taken into account, meta-analysis can confirm the involvement of a genetic variant, but it is not a substitute for an adequately powered primary study.
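As a rough illustration of how results are combined across studies, the sketch below implements a fixed-effect, inverse-variance-weighted meta-analysis together with Cochran's Q as a heterogeneity statistic. The abstract does not say which model the article uses, so this is one common choice rather than the article's method; the function name and the example effect sizes (log odds ratios) are hypothetical.

```python
import math


def fixed_effect_meta(effects, std_errors):
    """Inverse-variance fixed-effect pooling.

    effects:    per-study effect estimates (e.g. log odds ratios)
    std_errors: their standard errors
    Returns (pooled_effect, pooled_standard_error, cochran_q).
    """
    if len(effects) != len(std_errors) or not effects:
        raise ValueError("need one standard error per effect, and at least one study")
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, effects)) / total
    pooled_se = math.sqrt(1.0 / total)
    # Cochran's Q: weighted squared deviations from the pooled estimate;
    # large values relative to (number of studies - 1) suggest heterogeneity.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, effects))
    return pooled, pooled_se, q


if __name__ == "__main__":
    # Hypothetical log odds ratios from three studies of one variant.
    print(fixed_effect_meta([0.20, 0.35, 0.10], [0.10, 0.15, 0.08]))
```

Precise studies dominate the pooled estimate because each weight is the reciprocal of the squared standard error, and the pooled standard error shrinks as studies are added, which is why pooling can detect small genetic effects that individual studies miss.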
18.
Probabilistic graphical models have been widely recognized as a powerful formalism in the bioinformatics field, especially in gene expression studies and linkage analysis. Although less well known in association genetics, many successful methods have recently emerged to dissect the genetic architecture of complex diseases. In this review article, we cover the applications of these models to the population association studies' context, such as linkage disequilibrium modeling, fine mapping and candidate gene studies, and genome-scale association studies. Significant breakthroughs of the corresponding methods are highlighted, but emphasis is also given to their current limitations, in particular, to the issue of scalability. Finally, we give promising directions for future research in this field.
19.
Genotyping technology now allows the rapid and affordable generation of million-SNP profiles for humans, leading to considerable activity in association mapping. Similar activity is anticipated for many plant species, including Brassica. These plant association mapping activities will require the same care in quality control and quality assurance as for humans. The subsequent analyses may draw upon the same body of theory that is described here in the language of quantitative genetics.
20.
The genetic analysis of mate choice is fraught with difficulties. Males produce complex signals and displays that can consist of a combination of acoustic, visual, chemical and behavioural phenotypes. Furthermore, female preferences for these male traits are notoriously difficult to quantify. During mate choice, genes not only affect the phenotypes of the individual they are in, but can influence the expression of traits in other individuals. How can genetic analyses be conducted to encompass this complexity? Tighter integration of classical quantitative genetic approaches with modern genomic technologies promises to advance our understanding of the complex genetic basis of mate choice.