
1.

Bayesian meta-analysis of genetic association studies with different sets of markers

Verzilli C Shah T Casas JP Chapman J Sandhu M Debenham SL Boekholdt MS Khaw KT Wareham NJ Judson R Benjamin EJ Kathiresan S Larson MG Rong J Sofat R Humphries SE Smeeth L Cavalleri G Whittaker JC Hingorani AD 《American journal of human genetics》2008,82(4):859-872

Robust assessment of genetic effects on quantitative traits or complex-disease risk requires synthesis of evidence from multiple studies. Frequently, studies have genotyped partially overlapping sets of SNPs within a gene or region of interest, hampering attempts to combine all the available data. By using the example of C-reactive protein (CRP) as a quantitative trait, we show how linkage disequilibrium in and around its gene facilitates use of Bayesian hierarchical models to integrate informative data from all available genetic association studies of this trait, irrespective of the SNP typed. A variable selection scheme, followed by contextualization of SNPs exhibiting independent associations within the haplotype structure of the gene, enhanced our ability to infer likely causal variants in this region with population-scale data. This strategy, based on data from a literature-based systematic review and substantial new genotyping, facilitated the most comprehensive evaluation to date of the role of variants governing CRP levels, providing important information on the minimal subset of SNPs necessary for comprehensive evaluation of the likely causal relevance of elevated CRP levels for coronary-heart-disease risk by Mendelian randomization. The same method could be applied to evidence synthesis of other quantitative traits, whenever the typed SNPs vary among studies, and to assist fine mapping of causal variants.

2.

Exploration of relationships between production and fertility traits in dairy cattle via association studies of SNPs within candidate genes derived by expression profiling

Pimentel EC Bauersachs S Tietze M Simianer H Tetens J Thaller G Reinhardt F Wolf E König S 《Animal genetics》2011,42(3):251-262

The objective of this work was to integrate findings from functional genomics studies with genome-wide association studies for fertility and production traits in dairy cattle. Association analyses of production and fertility traits with SNPs located within or close to 170 candidate genes derived from two gene expression studies and from the literature were performed. Data from 2294 Holstein bulls genotyped for 39557 SNPs were used. A total of 111 SNPs were located on chromosomal segments covered by a candidate gene. Allele substitution effects for each SNP were estimated using a mixed model with a fixed effect of marker and a random polygenic effect. Assumed covariance was derived either from marker or from pedigree information. Results from the analysis with the kinship matrix built from marker genotypes were more conservative than from the analysis with the pedigree-derived relationship matrix. From sixteen SNPs with significant effects on both classes of traits, ten provided evidence of an antagonistic relationship between productivity and fertility. However, we found four SNPs with favourable effects on fertility and on yield traits, one SNP with favourable effects on fertility and percentage traits, and one SNP with antagonistic effects on two fertility traits. While most quantitative genetic studies have proven genetic antagonisms between yield and functional traits, improvements in both production and functionality may be possible when focusing on a few relevant SNPs. Investigations combining input from quantitative genetics and functional genomics with association analysis may be applied for the identification of such SNPs.

3.

Genome‐wide association studies for hematological traits in swine

J. Y. Wang Y. R. Luo W. X. Fu X. Lu J. P. Zhou X. D. Ding J. F. Liu Q. Zhang 《Animal genetics》2013,44(1):34-43

Improving immune capacity may increase the profitability of animal production if it enables animals to better cope with infections. Hematological traits play pivotal roles in animal immune capacity and disease resistance. Thus far, few studies have been conducted using a high‐density swine SNP chip panel to unravel the genetic mechanism of the immune capability in domestic animals. In this study, using mixed model‐based single‐locus regression analyses, we carried out genome‐wide association studies, using the Porcine SNP60 BeadChip, for immune responses in piglets for 18 hematological traits (seven leukocyte traits, seven erythrocyte traits, and four platelet traits) after being immunized with classical swine fever vaccine. After adjusting for multiple testing based on permutations, 10, 24, and 77 chromosome‐wise significant SNPs were identified for the leukocyte traits, erythrocyte traits, and platelet traits respectively, of which 10 reached genome‐wise significance level. Among the 53 SNPs for mean platelet volume, 29 are located in a linkage disequilibrium block between 32.77 and 40.59 Mb on SSC6. Four genes of interest are located within the block, providing genetic evidence that this genomic segment may be considered a candidate region relevant to the platelet traits. Other candidate genes of interest for red blood cell, hemoglobin, and red blood cell volume distribution width also have been found near the significant SNPs. Our genome‐wide association study provides a list of significant SNPs and candidate genes that offer valuable information for future dissection of molecular mechanisms regulating hematological traits.

4.

Genome wide association studies for milk production traits in Chinese Holstein population (cited by: 5; self-citations: 0; citations by others: 5)

Jiang L Liu J Sun D Ma P Ding X Yu Y Zhang Q 《PloS one》2010,5(10):e13661

Genome-wide association studies (GWAS) based on high-throughput SNP genotyping technologies open a broad avenue for exploring genes associated with milk production traits in dairy cattle. Motivated by the goal of pinpointing novel quantitative trait nucleotides (QTN) across the Bos taurus genome, the present study performs a GWAS to identify genes affecting milk production traits using current state-of-the-art SNP genotyping technology, i.e., the Illumina BovineSNP50 BeadChip. The analyses involve the five most commonly evaluated milk production traits: milk yield (MY), milk fat yield (FY), milk protein yield (PY), milk fat percentage (FP) and milk protein percentage (PP). Estimated breeding values (EBVs) of 2,093 daughters from 14 paternal half-sib families are considered as phenotypes within the framework of a daughter design. Association tests between each trait and the 54K SNPs are carried out via two different analysis approaches, a paternal transmission disequilibrium test (TDT)-based approach (L1-TDT) and a mixed model based regression analysis (MMRA). In total, 105 SNPs were detected to be significantly associated genome-wise with one or multiple milk production traits. Of the 105 SNPs, 38 were detected by both methods, while four and 63 were detected solely by L1-TDT and MMRA, respectively. The majority (86 out of 105) of the significant SNPs are located within reported QTL regions, and some are within or close to reported candidate genes. In particular, two SNPs, ARS-BFGL-NGS-4939 and BFGL-NGS-118998, are located close to the DGAT1 gene (160bp apart) and within the GHR gene, respectively. Our findings not only provide confirmatory evidence for previous findings, but also reveal a suite of novel SNPs associated with milk production traits, and thus form a solid basis for eventually unraveling the causal mutations for milk production traits in dairy cattle.

5.

Statistical Estimation of Correlated Genome Associations to a Quantitative Trait Network

Seyoung Kim Eric P. Xing 《PLoS genetics》2009,5(8)

Many complex disease syndromes, such as asthma, consist of a large number of highly related, rather than independent, clinical or molecular phenotypes. This raises a new technical challenge in identifying genetic variations associated simultaneously with correlated traits. In this study, we propose a new statistical framework called graph-guided fused lasso (GFlasso) to directly and effectively incorporate the correlation structure of multiple quantitative traits such as clinical metrics and gene expressions in association analysis. Our approach represents correlation information explicitly among the quantitative traits as a quantitative trait network (QTN) and then leverages this network to encode structured regularization functions in a multivariate regression model over the genotypes and traits. The result is that the genetic markers that jointly influence subgroups of highly correlated traits can be detected jointly with high sensitivity and specificity. While most of the traditional methods examined each phenotype independently and combined the results afterwards, our approach analyzes all of the traits jointly in a single statistical framework. This allows our method to borrow information across correlated phenotypes to discover the genetic markers that perturb a subset of the correlated traits synergistically. Using simulated datasets based on the HapMap consortium and an asthma dataset, we compared the performance of our method with other methods based on single-marker analysis and regression-based methods that do not use any of the relational information in the traits. We found that our method showed an increased power in detecting causal variants affecting correlated traits. 
Our results showed that, when correlation patterns among traits in a QTN are considered explicitly and directly during a structured multivariate genome association analysis using our proposed methods, the power of detecting true causal SNPs with possibly pleiotropic effects increased significantly without compromising performance on non-pleiotropic SNPs.
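
The structured regularization described in this abstract can be illustrated with a small sketch. The abstract does not state the penalty's exact form, so everything below is an assumption made for illustration: a lasso term on all SNP effects plus a fusion term that, for each edge of the trait network, pulls the effects of a SNP on the two connected traits toward each other (sign-adjusted for negatively correlated traits). The edge weight `abs(r)` and the names `gflasso_penalty`, `lam` and `gamma` are hypothetical choices, not taken from the paper.

```python
def gflasso_penalty(beta, edges, lam, gamma):
    """Evaluate an assumed graph-guided fusion penalty.

    beta[j][k] is the effect of SNP j on trait k.
    edges is a list of (m, l, r) tuples: traits m and l are connected in the
    trait network with correlation r (hypothetical input format).
    """
    # Lasso term: encourages most SNP-trait effects to be exactly zero.
    lasso = sum(abs(b) for row in beta for b in row)

    # Fusion term: for each network edge, penalize differences between a
    # SNP's effects on the two traits, flipping sign when r is negative.
    fusion = 0.0
    for m, l, r in edges:
        sign = 1.0 if r >= 0 else -1.0
        weight = abs(r)  # assumed weighting; other choices are possible
        fusion += weight * sum(abs(row[m] - sign * row[l]) for row in beta)

    return lam * lasso + gamma * fusion


# Two SNPs, two positively correlated traits (r = 0.8).
beta = [[1.0, 1.0],   # SNP 0 affects both traits equally: no fusion cost
        [0.0, 0.5]]   # SNP 1 affects only trait 1: pays a fusion cost
penalty = gflasso_penalty(beta, edges=[(0, 1, 0.8)], lam=1.0, gamma=2.0)
```

Under these assumptions, a SNP whose effects agree across correlated traits incurs no fusion cost, which is one way to read the abstract's claim that markers jointly influencing correlated traits are detected together.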

6.

A powerful global test for spliceQTL effects

Renee X. de Menezes Armin Rauschenberger BIOS Consortium Peter A. C. 't Hoen Marianne A. Jonker 《Biometrical journal. Biometrische Zeitschrift》2023,65(1):2100123

Statistical methods to test for effects of single nucleotide polymorphisms (SNPs) on exon inclusion exist but often rely on testing of associations between multiple exon–SNP pairs, with sometimes subsequent summarization of results at the gene level. Such approaches require heavy multiple testing corrections and detect mostly events with large effect sizes. We propose here a test to find spliceQTL (splicing quantitative trait loci) effects that takes all exons and all SNPs into account simultaneously. For any chosen gene, this score-based test looks for an association between the set of exon expressions and the set of SNPs, via a random-effects model framework. It is efficient to compute and can be used if the number of SNPs is larger than the number of samples. In addition, the test is powerful in detecting effects that are relatively small for individual exon–SNP pairs but are observed for many pairs. Furthermore, test results are more often replicated across datasets than pairwise testing results. This makes our test more robust to exon–SNP pair-specific effects, which do not extend to multiple pairs within the same gene. We conclude that the test we propose here offers more power and better replicability in the search for spliceQTL effects.

7.

A Multi-Trait, Meta-analysis for Detecting Pleiotropic Polymorphisms for Stature, Fatness and Reproduction in Beef Cattle

Sunduimijid Bolormaa Jennie E. Pryce Antonio Reverter Yuandan Zhang William Barendse Kathryn Kemper Bruce Tier Keith Savin Ben J. Hayes Michael E. Goddard 《PLoS genetics》2014,10(3)

Polymorphisms that affect complex traits or quantitative trait loci (QTL) often affect multiple traits. We describe two novel methods (1) for finding single nucleotide polymorphisms (SNPs) significantly associated with one or more traits using a multi-trait, meta-analysis, and (2) for distinguishing between a single pleiotropic QTL and multiple linked QTL. The meta-analysis uses the effect of each SNP on each of n traits, estimated in single trait genome wide association studies (GWAS). These effects are expressed as a vector of signed t-values (t) and the error covariance matrix of these t values is approximated by the correlation matrix of t-values among the traits calculated across the SNP (V). Consequently, t′V⁻¹t is approximately distributed as a chi-squared with n degrees of freedom. An attractive feature of the meta-analysis is that it uses estimated effects of SNPs from single trait GWAS, so it can be applied to published data where individual records are not available. We demonstrate that the multi-trait method can be used to increase the power (numbers of SNPs validated in an independent population) of GWAS in a beef cattle data set including 10,191 animals genotyped for 729,068 SNPs with 32 traits recorded, including growth and reproduction traits. We can distinguish between a single pleiotropic QTL and multiple linked QTL because multiple SNPs tagging the same QTL show the same pattern of effects across traits. We confirm this finding by demonstrating that when one SNP is included in the statistical model the other SNPs have a non-significant effect. In the beef cattle data set, cluster analysis yielded four groups of QTL with similar patterns of effects across traits within a group. A linear index was used to validate SNPs having effects on multiple traits and to identify additional SNPs belonging to these four groups.
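
The multi-trait statistic in this abstract is a quadratic form, and a small worked example may make it concrete. The sketch below computes t′V⁻¹t for one SNP from its vector of signed t-values and a trait correlation matrix V, using only the Python standard library. The function names and the example numbers are illustrative assumptions, not taken from the paper, and the p-value helper is deliberately restricted to the n = 2 case, where the chi-squared survival function has the closed form exp(-x/2).

```python
import math


def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("V is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def multi_trait_statistic(t_values, corr):
    """Return t' V^{-1} t for one SNP (t_values) and trait correlations (corr)."""
    v_inv_t = solve_linear(corr, t_values)
    return sum(t * w for t, w in zip(t_values, v_inv_t))


def chi2_sf_2df(stat):
    """Upper-tail probability of a chi-squared variable with exactly 2 df."""
    return math.exp(-stat / 2.0)


# Hypothetical SNP with t-values on two traits whose t-values correlate at 0.5.
stat = multi_trait_statistic([3.0, -1.0], [[1.0, 0.5], [0.5, 1.0]])
p_value = chi2_sf_2df(stat)
```

In this made-up example the two t-values have opposite signs while the traits are positively correlated, so the statistic (52/3, about 17.3) is larger than the value of 10 obtained when the traits are treated as uncorrelated. For more than two traits a general chi-squared survival function would be needed in place of `chi2_sf_2df`.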

8.

In silico characterization of functional SNP within the oestrogen receptor gene

MAHA REBAÏ AHMED REBAÏ 《Journal of genetics》2016,95(4):865-874


9.

Genome-wide association study identified a narrow chromosome 1 region associated with chicken growth traits (cited by: 2; self-citations: 0; citations by others: 2)

Xie L Luo C Zhang C Zhang R Tang J Nie Q Ma L Hu X Li N Da Y Zhang X 《PloS one》2012,7(2):e30910

Chicken growth traits are important economic traits in broilers. A large number of studies are available on finding genetic factors affecting chicken growth. However, most of these studies identified chromosome regions containing putative quantitative trait loci and finding causal mutations is still a challenge. In this genome-wide association study (GWAS), we identified a narrow 1.5 Mb region (173.5-175 Mb) of chicken (Gallus gallus) chromosome (GGA) 1 to be strongly associated with chicken growth using 47,678 SNPs and 489 F2 chickens. The growth traits included aggregate body weight (BW) at 0-90 d of age measured weekly, biweekly average daily gains (ADG) derived from weekly body weight, and breast muscle weight (BMW), leg muscle weight (LMW) and wing weight (WW) at 90 d of age. Five SNPs in the 1.5 Mb KPNA3-FOXO1A region at GGA1 had the highest significant effects for all growth traits in this study, including a SNP at 8.9 Kb upstream of FOXO1A for BW at 22-48 d and 70 d, a SNP at 1.9 Kb downstream of FOXO1A for WW, a SNP at 20.9 Kb downstream of ENSGALG00000022732 for ADG at 29-42 d, a SNP in INTS6 for BW at 90 d, and a SNP in KPNA3 for BMW and LMW. The 1.5 Mb KPNA3-FOXO1A region contained two microRNA genes that could bind to messenger ribonucleic acid (mRNA) of IGF1, FOXO1A and KPNA3. It was further indicated that the 1.5 Mb GGA1 region had the strongest effects on chicken growth during 22-42 d.

10.

Combined sequence and sequence-structure-based methods for analyzing RAAS gene SNPs: a computational approach

《Journal of receptor and signal transduction research》2013,33(6):513-526

The renin–angiotensin–aldosterone system (RAAS) plays a key role in the regulation of blood pressure (BP). Mutations in the genes that encode components of the RAAS play a significant role in genetic susceptibility to hypertension and have been intensively scrutinized. Identifying such probably causal mutations not only provides insight into the RAAS but may also yield antihypertensive therapeutic targets and diagnostic markers. Analyzing a huge dataset that contains both functional and neutral SNPs is challenging, because the experimental approach requires testing every SNP to determine its biological significance. To explore the functional significance of these genetic mutations (SNPs), we adopted a combined sequence- and sequence-structure-based SNP analysis algorithm. Of 3864 SNPs reported in dbSNP, we found 108 missense SNPs in the coding region, with the remainder in the non-coding region. In this study, we report a coding-region SNP as deleterious only when three or more tools predict it to be deleterious and the resulting structure has a high RMSD from the native structure. Based on these analyses, we identified two SNPs of the REN gene, eight SNPs of the AGT gene, three SNPs of the ACE gene, two SNPs of the AT1R gene, three SNPs of the CYP11B2 gene and three SNPs of the CMA1 gene in the coding region as deleterious. Further, this type of study will help reduce the cost and time of identifying potential SNPs and help select candidate SNPs from the SNP pool for experimental study.

11.

Fast identification of biological pathways associated with a quantitative trait using group lasso with overlaps

Silver M Montana G; Alzheimer's Disease Neuroimaging Initiative 《Statistical applications in genetics and molecular biology》2012,11(1):Article 7

Where causal SNPs (single nucleotide polymorphisms) tend to accumulate within biological pathways, the incorporation of prior pathways information into a statistical model is expected to increase the power to detect true associations in a genetic association study. Most existing pathways-based methods rely on marginal SNP statistics and do not fully exploit the dependence patterns among SNPs within pathways. We use a sparse regression model, with SNPs grouped into pathways, to identify causal pathways associated with a quantitative trait. Notable features of our "pathways group lasso with adaptive weights" (P-GLAW) algorithm include the incorporation of all pathways in a single regression model, an adaptive pathway weighting procedure that accounts for factors biasing pathway selection, and the use of a bootstrap sampling procedure for the ranking of important pathways. P-GLAW takes account of the presence of overlapping pathways and uses a novel combination of techniques to optimise model estimation, making it fast to run, even on whole genome datasets. In a comparison study with an alternative pathways method based on univariate SNP statistics, our method demonstrates high sensitivity and specificity for the detection of important pathways, showing the greatest relative gains in performance where marginal SNP effect sizes are small.

12.

A latent variable partial least squares path modeling approach to regional association and polygenic effect with applications to a human obesity study

Xue F Li S Luan J Yuan Z Luben RN Khaw KT Wareham NJ Loos RJ Zhao JH 《PloS one》2012,7(2):e31927

Genetic association studies are now routinely used to identify single nucleotide polymorphisms (SNPs) linked with human diseases or traits through single SNP-single trait tests. Here we introduce partial least squares path modeling (PLSPM) for association between single or multiple SNPs and a latent trait that can involve single or multiple correlated measurement(s). Furthermore, the framework naturally provides estimators of polygenic effect by appropriately weighting trait-attributing alleles. We conducted computer simulations to assess the performance via multiple SNPs and human obesity-related traits as measured by body mass index (BMI), waist and hip circumferences. Our results showed that the association statistics had type I error rates close to the nominal level and were powerful for a range of effect and sample sizes. When applied to 12 candidate regions in data (N = 2,417) from the European Prospective Investigation of Cancer (EPIC)-Norfolk study, a region in FTO was found to have stronger association (rs7204609∼rs9939881 at the first intron, P = 4.29×10⁻⁷) than single SNP analysis (all with P>10⁻⁴) and a latent quantitative phenotype was obtained using a subset sample of EPIC-Norfolk (N = 12,559). We believe our method is appropriate for assessment of regional association and polygenic effect on a single or multiple traits.

13.

A latent model for prioritization of SNPs for functional studies

Fridley BL Iversen E Tsai YY Jenkins GD Goode EL Sellers TA 《PloS one》2011,6(6):e20764


14.

MultiPhen: joint model of multiple phenotypes can increase discovery in GWAS

O'Reilly PF Hoggart CJ Pomyen Y Calboli FC Elliott P Jarvelin MR Coin LJ 《PloS one》2012,7(5):e34861

The genome-wide association study (GWAS) approach has discovered hundreds of genetic variants associated with diseases and quantitative traits. However, despite clinical overlap and statistical correlation between many phenotypes, GWAS are generally performed one-phenotype-at-a-time. Here we compare the performance of modelling multiple phenotypes jointly with that of the standard univariate approach. We introduce a new method and software, MultiPhen, that models multiple phenotypes simultaneously in a fast and interpretable way. By performing ordinal regression, MultiPhen tests the linear combination of phenotypes most associated with the genotypes at each SNP, and thus potentially captures effects hidden to single phenotype GWAS. We demonstrate via simulation that this approach provides a dramatic increase in power in many scenarios. There is a boost in power for variants that affect multiple phenotypes and for those that affect only one phenotype. While other multivariate methods have similar power gains, we describe several benefits of MultiPhen over these. In particular, we demonstrate that other multivariate methods that assume the genotypes are normally distributed, such as canonical correlation analysis (CCA) and MANOVA, can have highly inflated type-1 error rates when testing case-control or non-normal continuous phenotypes, while MultiPhen produces no such inflation. To test the performance of MultiPhen on real data we applied it to lipid traits in the Northern Finland Birth Cohort 1966 (NFBC1966). In these data MultiPhen discovers 21% more independent SNPs with known associations than the standard univariate GWAS approach, while applying MultiPhen in addition to the standard approach provides 37% increased discovery. 
The most associated linear combinations of the lipids estimated by MultiPhen at the leading SNPs accurately reflect the Friedewald Formula, suggesting that MultiPhen could be used to refine the definition of existing phenotypes or uncover novel heritable phenotypes.

15.

Imputation-based analysis of association studies: candidate regions and quantitative traits

Servin B Stephens M 《PLoS genetics》2007,3(7):e114

We introduce a new framework for the analysis of association studies, designed to allow untyped variants to be more effectively and directly tested for association with a phenotype. The idea is to combine knowledge on patterns of correlation among SNPs (e.g., from the International HapMap project or resequencing data in a candidate region of interest) with genotype data at tag SNPs collected on a phenotyped study sample, to estimate ("impute") unmeasured genotypes, and then assess association between the phenotype and these estimated genotypes. Compared with standard single-SNP tests, this approach results in increased power to detect association, even in cases in which the causal variant is typed, with the greatest gain occurring when multiple causal variants are present. It also provides more interpretable explanations for observed associations, including assessing, for each SNP, the strength of the evidence that it (rather than another correlated SNP) is causal. Although we focus on association studies with quantitative phenotype and a relatively restricted region (e.g., a candidate gene), the framework is applicable and computationally practical for whole genome association studies. Methods described here are implemented in a software package, Bim-Bam, available from the Stephens Lab website http://stephenslab.uchicago.edu/software.html.

16.

Association Test Based on SNP Set: Logistic Kernel Machine Based Test vs. Principal Component Analysis

Yang Zhao Feng Chen Rihong Zhai Xihong Lin Nancy Diao David C. Christiani 《PloS one》2012,7(9)

GWAS has greatly facilitated the discovery of risk SNPs associated with complex diseases. Traditional methods analyze SNPs individually and are limited by low power and reproducibility since correction for multiple comparisons is necessary. Several methods have been proposed based on grouping SNPs into SNP sets using biological knowledge and/or genomic features. In this article, we compare the linear kernel machine based test (LKM) and principal components analysis based approach (PCA) using simulated datasets under the scenarios of 0 to 3 causal SNPs, as well as simple and complex linkage disequilibrium (LD) structures of the simulated regions. Our simulation study demonstrates that both LKM and PCA can control the type I error at the significance level of 0.05. If the causal SNP is in strong LD with the genotyped SNPs, both the PCA with a small number of principal components (PCs) and the LKM with kernel of linear or identical-by-state function are valid tests. However, if the LD structure is complex, such as several LD blocks in the SNP set, or when the causal SNP is not in the LD block in which most of the genotyped SNPs reside, more PCs should be included to capture the information of the causal SNP. Simulation studies also demonstrate the ability of LKM and PCA to combine information from multiple causal SNPs and to provide increased power over individual SNP analysis. We also apply LKM and PCA to analyze two SNP sets extracted from an actual GWAS dataset on non-small cell lung cancer.

17.

Leveraging Genetic Variability across Populations for the Identification of Causal Variants

Noah Zaitlen Bogdan Paşaniuc Tom Gur Elad Ziv Eran Halperin 《American journal of human genetics》2010,86(1):23-33

Genome-wide association studies have been performed extensively in the last few years, resulting in many new discoveries of genomic regions that are associated with complex traits. It is often the case that a SNP found to be associated with the condition is not the causal SNP, but a proxy to it as a result of linkage disequilibrium. For the identification of the actual causal SNP, fine-mapping follow-up is performed, either with the use of dense genotyping or by sequencing of the region. In either case, if the causal SNP is in high linkage disequilibrium with other SNPs, the fine-mapping procedure will require a very large sample size for the identification of the causal SNP. Here, we show that by leveraging genetic variability across populations, we significantly increase the localization success rate (LSR) for a causal SNP in a follow-up study that involves multiple populations as compared to a study that involves only one population. Thus, the average power for detection of the causal variant will be higher in a joint analysis than that in studies in which only one population is analyzed at a time. On the basis of this observation, we developed a framework to efficiently search for a follow-up study design: our framework searches for the best combination of populations from a pool of available populations to maximize the LSR for detection of a causal variant. This framework and its accompanying software can be used to considerably enhance the power of fine-mapping studies.

18.

A Post-GWAS Replication Study Confirming the PTK2 Gene Associated with Milk Production Traits in Chinese Holstein

Haifei Wang Li Jiang Xuan Liu Jie Yang Julong Wei Jingen Xu Qin Zhang Jian-Feng Liu 《PloS one》2013,8(12)

Our initial genome-wide association study (GWAS) demonstrated that two SNPs (ARS-BFGL-NGS-33248, UA-IFASA-9288) within the protein tyrosine kinase 2 (PTK2) gene were significantly associated with milk production traits in Chinese Holstein dairy cattle. To further validate whether the statistical evidence provided by the GWAS represented true-positive findings, a replication study was performed herein through genotype-phenotype associations. The two tested SNPs were found to show significant associations with milk production traits, which confirmed the associations observed in the original study. Specifically, SNPs lying in the PTK2 gene were also detected by sequencing 14 unrelated sires in Chinese Holsteins and a total of thirty-three novel SNPs were identified. Thirteen of these identified SNPs were genotyped and tested for association with milk production traits in an independent resource population. After Bonferroni correction for multiple testing, twelve SNPs were statistically significant for more than two milk production traits. Pairwise D′ measures of linkage disequilibrium (LD) between all SNPs were also analyzed. Two haplotype blocks were inferred and the association study at haplotype level revealed similar effects on milk production traits. In addition, the RNA expression analyses revealed that a non-synonymous coding SNP (g.4061098T>G) was involved in the regulation of gene expression. Thus the findings presented here provide strong evidence for associations of PTK2 variants with dairy production traits and may be applied in Chinese Holstein breeding programs.
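
Two quantities named in this abstract, the Bonferroni-corrected threshold and the pairwise D′ measure of linkage disequilibrium, have standard textbook definitions that a short sketch can illustrate. The code below is not the authors' pipeline: the function names, the input frequencies and the number of tests are hypothetical, and D′ is computed from haplotype frequencies that are assumed to be already known (in practice they are usually estimated from genotype data).

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def d_prime(p_ab, p_a, p_b):
    """Signed D' between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B.
    p_a, p_b: frequencies of allele A and allele B.
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d > 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        raise ValueError("D' is undefined when a locus is monomorphic")
    return d / d_max


# Hypothetical inputs: 10 tests at alpha = 0.05, and one pair of loci.
threshold = bonferroni_threshold(0.05, 10)
ld = d_prime(p_ab=0.4, p_a=0.6, p_b=0.5)
```

The sketch returns the signed value; many reports quote the absolute value |D′| instead, so a sign convention should be checked before comparing numbers across studies.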

19.

Calibrating the performance of SNP arrays for whole-genome association studies

Hao K Schadt EE Storey JD 《PLoS genetics》2008,4(6):e1000109

To facilitate whole-genome association studies (WGAS), several high-density SNP genotyping arrays have been developed. Genetic coverage and statistical power are the primary benchmark metrics in evaluating the performance of SNP arrays. Ideally, such evaluations would be done on a SNP set and a cohort of individuals that are both independently sampled from the original SNPs and individuals used in developing the arrays. Without utilization of an independent test set, previous estimates of genetic coverage and statistical power may be subject to an overfitting bias. Additionally, the SNP arrays' statistical power in WGAS has not been systematically assessed on real traits. One robust setting for doing so is to evaluate statistical power on thousands of traits measured from a single set of individuals. In this study, 359 newly sampled Americans of European descent were genotyped using both Affymetrix 500K (Affx500K) and Illumina 650Y (Ilmn650K) SNP arrays. From these data, we were able to obtain estimates of genetic coverage, which are robust to overfitting, by constructing an independent test set from among these genotypes and individuals. Furthermore, we collected liver tissue RNA from the participants and profiled these samples on a comprehensive gene expression microarray. The RNA levels were used as a large-scale set of quantitative traits to calibrate the relative statistical power of the commercial arrays. Our genetic coverage estimates are lower than previous reports, providing evidence that previous estimates may be inflated due to overfitting. The Ilmn650K platform showed reasonable power (50% or greater) to detect SNPs associated with quantitative traits when the signal-to-noise ratio (SNR) is greater than or equal to 0.5 and the causal SNP's minor allele frequency (MAF) is greater than or equal to 20% (N=359). 
In testing each of the more than 40,000 gene expression traits for association to each of the SNPs on the Ilmn650K and Affx500K arrays, we found that the Ilmn650K yielded 15% more discoveries than the Affx500K at the same false discovery rate (FDR) level.

20.

BcSNPdb: bovine coding region single nucleotide polymorphisms located proximal to quantitative trait loci

Moon S Shin HD Cheong HS Cho HY Namgoong S Kim EM Han CS Sung S Kim H 《Journal of biochemistry and molecular biology》2007,40(1):95-99

Bovine coding region single nucleotide polymorphisms located proximal to quantitative trait loci were identified to facilitate bovine QTL fine mapping research. A total of 692,763 bovine SNPs were extracted from 39,432 UniGene clusters, and 53,446 candidate SNPs were found to have a depth >3. In order to validate the in silico SNPs experimentally, 186 animals representing 14 breeds and 100 mixed breeds were analyzed. Genotyping of 40 randomly selected candidate SNPs revealed that 43% of these SNPs ranged in frequency from 0.009 to 0.498. To identify non-synonymous SNPs and to correct for possible frameshift errors in the ESTs at the predicted SNP positions, we designed a program that determines coding regions by protein-sequence referencing, and identified 17,735 nsSNPs. The SNPs and bovine quantitative trait loci information were integrated into a bovine SNP database: BcSNPdb (http://snugenome.snu.ac.kr/BtcSNP/). Currently there are 43 different kinds of quantitative traits available. Thus, these SNPs would serve as valuable resources for exploiting genomic variation that influences economically and agriculturally important traits in cows.