Similar articles
 20 similar articles found (search time: 328 ms)
1.
Genomic best linear unbiased prediction (GBLUP) assumes equal variance for all marker effects, which is suitable for traits that conform to the infinitesimal model. For traits controlled by major genes, Bayesian methods with shrinkage priors or genome-wide association study (GWAS) methods can be used to identify causal variants effectively. The information from Bayesian/GWAS methods can be used to construct the weighted genomic relationship matrix (G). However, it remains unclear which methods perform best for traits varying in genetic architecture. Therefore, we developed several methods to optimize the performance of weighted GBLUP and compared them with other available methods using simulated and real data sets. First, two types of methods (marker effects with local shrinkage or normal prior) were used to obtain test statistics and estimates for each marker effect. Second, three weighted G matrices were constructed based on the marker information from the first step: (1) the genomic-feature-weighted G, (2) the estimated marker-variance-weighted G, and (3) the absolute value of the estimated marker-effect-weighted G. Following the above process, six different weighted GBLUP methods (local shrinkage/normal-prior GF/EV/AEWGBLUP) were proposed for genomic prediction. Analyses with both simulated and real data demonstrated that these options offer flexibility for optimizing the weighted GBLUP for traits with a broad spectrum of genetic architectures. The advantage of weighting methods over GBLUP in terms of accuracy was trait dependent, ranging from 14.8% to marginal for simulated traits and from 44% to marginal for real traits. Local-shrinkage prior EVWGBLUP is superior for traits mainly controlled by loci of large effect. Normal-prior AEWGBLUP performs well for traits mainly controlled by loci of moderate effect. For traits controlled by some loci with large effects (explaining 25–50% of the genetic variance) and a range of loci with small effects, GFWGBLUP has advantages. 
In conclusion, the optimal weighted GBLUP method for genomic selection should take both the genetic architecture and the number of QTLs of the trait into careful consideration.
Subject terms: Quantitative trait, Genome-wide association studies, Animal breeding
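The abstract above does not give the formula for its weighted G matrices, so the following is only a minimal sketch of one common formulation: a VanRaden-style relationship matrix in which each centred marker is multiplied by a per-marker weight. The function name `weighted_g`, the 0/1/2 genotype coding, and the rescaling of weights to a mean of 1 are assumptions made for illustration, not the authors' implementation.

```python
def weighted_g(genotypes, weights):
    """Weighted genomic relationship matrix (illustrative sketch).

    genotypes: one list per individual, holding 0/1/2 allele counts per marker.
    weights:   one non-negative weight per marker (hypothetical input, e.g.
               an estimated marker variance or an absolute marker effect).
    Assumes at least one marker is polymorphic, otherwise the scale is zero.
    """
    n = len(genotypes)
    m = len(weights)
    # Allele frequency of the counted allele at each marker.
    p = [sum(ind[j] for ind in genotypes) / (2.0 * n) for j in range(m)]
    # Centre each genotype by twice the allele frequency.
    z = [[ind[j] - 2.0 * p[j] for j in range(m)] for ind in genotypes]
    # Rescale weights to mean 1 so that equal weights give the unweighted G.
    mean_w = sum(weights) / m
    d = [w / mean_w for w in weights]
    scale = 2.0 * sum(p[j] * (1.0 - p[j]) for j in range(m))
    return [
        [sum(z[a][j] * d[j] * z[b][j] for j in range(m)) / scale
         for b in range(n)]
        for a in range(n)
    ]
```

With equal weights this reduces to the ordinary (unweighted) G used by GBLUP, so the weights are the only place where information from a first-step analysis would enter.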

2.

Background

Genome-wide association studies (GWAS) aim to identify causal variants and genes for complex disease by independently testing a large number of SNP markers for disease association. Although genes have been implicated in these studies, few utilise the multiple-hit model of complex disease to identify causal candidates. A major benefit of multi-locus comparison is that it compensates for some shortcomings of current statistical analyses that test the frequency of each SNP in isolation for the phenotype population versus control.

Results

Here we developed and benchmarked several protocols for GWAS data analysis using different in-silico gene prediction and prioritisation methodologies. We adopted a high sensitivity approach to the data, using less conservative statistical SNP associations. Multiple gene search spaces, either fixed-width or proximity-based, were generated around each SNP marker. We used the candidate disease gene prediction system Gentrepid to identify candidates based on shared biomolecular pathways or domain-based protein homology. Predictions were made either with phenotype-specific known disease genes as input, or without a priori knowledge, by exhaustive comparison of genes in distinct loci. Because Gentrepid uses biomolecular data to find interactions and common features between genes in distinct loci of the search spaces, it takes advantage of the multi-locus aspect of the data.

Conclusions

Results suggest testing multiple SNP-to-gene search spaces compensates for differences in phenotypes, populations and SNP platforms. Surprisingly, domain-based homology information was more informative when benchmarked against gene candidates reported by GWA studies compared to previously determined disease genes, possibly suggesting a larger contribution of gene homologs to complex diseases than Mendelian diseases.

3.
Genome-wide association studies (GWAS) have defined over 150 genomic regions unequivocally containing variation predisposing to immune-mediated disease. Inferring disease biology from these observations, however, hinges on our ability to discover the molecular processes being perturbed by these risk variants. It has previously been observed that different genes harboring causal mutations for the same Mendelian disease often physically interact. We sought to evaluate the degree to which this is true of genes within strongly associated loci in complex disease. Using sets of loci defined in rheumatoid arthritis (RA) and Crohn's disease (CD) GWAS, we build protein-protein interaction (PPI) networks for genes within associated loci and find abundant physical interactions between protein products of associated genes. We apply multiple permutation approaches to show that these networks are more densely connected than chance expectation. To confirm biological relevance, we show that the components of the networks tend to be expressed in similar tissues relevant to the phenotypes in question, suggesting the network indicates common underlying processes perturbed by risk loci. Furthermore, we show that the RA and CD networks have predictive power by demonstrating that proteins in these networks, not encoded in the confirmed list of disease associated loci, are significantly enriched for association to the phenotypes in question in extended GWAS analysis. Finally, we test our method in 3 non-immune traits to assess its applicability to complex traits in general. We find that genes in loci associated to height and lipid levels assemble into significantly connected networks but did not detect excess connectivity among Type 2 Diabetes (T2D) loci beyond chance. 
Taken together, our results constitute evidence that, for many of the complex diseases studied here, common genetic associations implicate regions encoding proteins that physically interact in a preferential manner, in line with observations in Mendelian disease.

4.
Complex traits such as susceptibility to diseases are determined in part by variants at multiple genetic loci. Genome-wide association studies can identify these loci, but most phenotype-associated variants lie distal to protein-coding regions and are likely involved in regulating gene expression. Understanding how these genetic variants affect complex traits depends on the ability to predict and test the function of the genomic elements harboring them. Community efforts such as the ENCODE Project provide a wealth of data about epigenetic features associated with gene regulation. These data enable the prediction of testable functions for many phenotype-associated variants.

5.
Genomic prediction utilizes single nucleotide polymorphism (SNP) chip data to predict animal genetic merit. It has the advantage of potentially capturing the effects of the majority of loci that contribute to genetic variation in a trait, even when the effects of the individual loci are very small. To implement genomic prediction, marker effects are estimated with a training set, including individuals with marker genotypes and trait phenotypes; subsequently, genomic estimated breeding values (GEBV) for any genotyped individual in the population can be calculated using the estimated marker effects. In this study, we aimed to: (i) evaluate the potential of genomic prediction to predict GEBV for nematode resistance traits and BW in sheep, within and across populations; (ii) evaluate the accuracy of these predictions through within-population cross-validation; and (iii) explore the impact of population structure on the accuracy of prediction. Four data sets comprising 752 lambs from a Scottish Blackface population, 2371 from a Sarda×Lacaune backcross population, 1000 from a Martinik Black-Belly×Romane backcross population and 64 from a British Texel population were used in this study. Traits available for the analysis were faecal egg count for Nematodirus and Strongyles and BW at different ages or as average effect, depending on the population. Moreover, immunoglobulin A was also available for the Scottish Blackface population. Results show that GEBV had moderate to good within-population predictive accuracy, whereas across-population predictions had accuracies close to zero. This can be explained by our finding that in most cases the accuracy was due mainly to additive genetic relatedness between animals, rather than to linkage disequilibrium between SNPs and quantitative trait loci. 
Therefore, our results suggest that genomic prediction for nematode resistance and BW may be of value in closely related animals, but that with the current SNP chip, genomic predictions are unlikely to work across breeds.
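The two-step procedure described above (estimate marker effects in a training set, then apply them to any genotyped animal) can be sketched as follows. This is a minimal illustration, assuming 0/1/2 genotype coding and taking predictive accuracy as the Pearson correlation between GEBV and observed phenotype in a validation set; the names `gebv` and `pearson` are hypothetical, and the study's own estimation method is not reproduced here.

```python
import math


def gebv(genotype, marker_effects):
    """Genomic estimated breeding value: sum of allele count times marker effect."""
    return sum(g * a for g, a in zip(genotype, marker_effects))


def pearson(x, y):
    """Pearson correlation, used here as a simple measure of predictive accuracy."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

In a cross-validation setting, `gebv` would be applied to animals left out of the training set and `pearson` computed against their recorded phenotypes.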

6.
Genome-wide linkage analysis using microsatellite markers has been successful in the identification of numerous Mendelian and complex disease loci. The recent availability of high-density single-nucleotide polymorphism (SNP) maps provides a potentially more powerful option. Using the simulated and Collaborative Study on the Genetics of Alcoholism (COGA) datasets from the Genetics Analysis Workshop 14 (GAW14), we examined how altering the density of SNP marker sets impacted the overall information content, the power to detect trait loci, and the number of false positive results. For the simulated data we used SNP maps with density of 0.3 cM, 1 cM, 2 cM, and 3 cM. For the COGA data we combined the marker sets from Illumina and Affymetrix to create a map with average density of 0.25 cM and then, using a sub-sample of these markers, created maps with density of 0.3 cM, 0.6 cM, 1 cM, 2 cM, and 3 cM. For each marker set, multipoint linkage analysis using MERLIN was performed for both dominant and recessive traits derived from marker loci. Our results showed that information content increased with increased map density. For the homogeneous, completely penetrant traits we created, there was only a modest difference in ability to detect trait loci. Additionally, as map density increased there was only a slight increase in the number of false positive results when there was linkage disequilibrium (LD) between markers. The presence of LD between markers may have led to an increased number of false positive regions but no clear relationship between regions of high LD and locations of false positive linkage signals was observed.

7.
Genome-wide association studies (GWAS) have become a widely used approach for genetic association studies of various human traits. A few GWAS have been conducted with the goal of identifying novel loci for pigmentation traits, melanoma, and non-melanoma skin cancer. Nevertheless, the phenotype variation explained by the genetic markers identified so far is limited. In this review, we discuss the GWAS study design and its application in pigmentation and skin cancer research. Furthermore, we summarize recent developments in post-GWAS activities such as meta-analysis, pathway analysis, and risk prediction.

8.

Background  

Genome-wide association studies (GWAS) based on single nucleotide polymorphisms (SNPs) revolutionized our perception of the genetic regulation of complex traits and diseases. Copy number variations (CNVs) promise to shed additional light on the genetic basis of monogenic as well as complex diseases and phenotypes. Indeed, the number of detected associations between CNVs and certain phenotypes is constantly increasing. However, while several software packages support the determination of CNVs from SNP chip data, the downstream statistical inference of CNV-phenotype associations is still subject to complicated and inefficient in-house solutions, thus strongly limiting the performance of GWAS based on CNVs.

9.
Genome-wide association studies (GWAS) are widely applied to analyze the genetic effects on phenotypes. With the availability of high-throughput technologies for metabolite measurements, GWAS successfully identified loci that affect metabolite concentrations and underlying pathways. In most GWAS, the effect of each SNP on the phenotype is assumed to be additive. Other genetic models such as recessive, dominant, or overdominant were considered only by very few studies. In contrast to this, there are theories that emphasize the relevance of nonadditive effects as a consequence of physiologic mechanisms. This might be especially important for metabolites because these intermediate phenotypes are closer to the underlying pathways than other traits or diseases. In this study we systematically analyzed nonadditive effects on a large panel of serum metabolites and all possible ratios (22,801 total) in a population-based study [Cooperative Health Research in the Region of Augsburg (KORA) F4, N = 1,785]. We applied four different 1-degree-of-freedom (1-df) tests corresponding to an additive, dominant, recessive, and overdominant trait model as well as a genotypic model with two degrees of freedom (2-df) that allows a more general consideration of genetic effects. Twenty-three loci were found to be genome-wide significantly associated (Bonferroni corrected P ≤ 2.19 × 10⁻¹²) with at least one metabolite or ratio. For five of them, we show evidence of nonadditive effects. We replicated 17 loci, including 3 loci with nonadditive effects, in an independent study (TwinsUK, N = 846). In conclusion, we found that most genetic effects on metabolite concentrations and ratios were indeed additive, which verifies the practice of using the additive model for analyzing SNP effects on metabolites.
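The four 1-df models named above differ only in how the three genotypes are recoded into a single numeric predictor before testing. A minimal sketch, assuming genotypes are stored as counts (0, 1, 2) of the allele being tested; the names `CODINGS` and `encode` are illustrative, not taken from the study.

```python
# Numeric coding of genotypes 0, 1, 2 (copies of the tested allele)
# under each 1-df genetic model.
CODINGS = {
    "additive": (0, 1, 2),      # each copy adds the same effect
    "dominant": (0, 1, 1),      # one copy is enough for the full effect
    "recessive": (0, 0, 1),     # two copies are needed for any effect
    "overdominant": (0, 1, 0),  # only heterozygotes differ
}


def encode(genotypes, model):
    """Recode a list of 0/1/2 genotypes under the named genetic model."""
    table = CODINGS[model]
    return [table[g] for g in genotypes]
```

The 2-df genotypic model mentioned in the abstract would instead keep the heterozygote and the homozygote as two separate indicator variables, which is why it uses one more degree of freedom.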

10.
Genomics, 2021, 113(3): 1396-1406
Rice is one of the most important cereal crops, providing the daily dietary intake for approximately 50% of the global human population. Here, we re-sequenced 259 rice accessions, generating 1371.65 Gb of raw data. Furthermore, we performed genome-wide association studies (GWAS) on 13 agronomic traits using 2.8 million single nucleotide polymorphisms (SNPs) characterized in the 259 rice accessions. Phenotypic data and best linear unbiased prediction (BLUP) values for each of the 13 traits over two years were used for the GWAS. The results showed that 816 SNP signals were significantly associated with the 13 agronomic traits. We then detected candidate genes related to target traits within 200 kb upstream and downstream of the associated SNP loci, based on linkage disequilibrium (LD) blocks in the whole rice genome. These candidate genes were further identified through haplotype block construction. This comprehensive study provides a timely and important genomic resource for breeding high-yielding rice cultivars.

11.
An important task of human genetics studies is to predict accurately disease risks in individuals based on genetic markers, which allows for identifying individuals at high disease risks, and facilitating their disease treatment and prevention. Although hundreds of genome-wide association studies (GWAS) have been conducted on many complex human traits in recent years, there has been only limited success in translating these GWAS data into clinically useful risk prediction models. The predictive capability of GWAS data is largely bottlenecked by the available training sample size due to the presence of numerous variants carrying only small to modest effects. Recent studies have shown that different human traits may share common genetic bases. Therefore, an attractive strategy to increase the training sample size and hence improve the prediction accuracy is to integrate data from genetically correlated phenotypes. Yet, the utility of genetic correlation in risk prediction has not been explored in the literature. In this paper, we analyzed GWAS data for bipolar and related disorders and schizophrenia with a bivariate ridge regression method, and found that jointly predicting the two phenotypes could substantially increase prediction accuracy as measured by the area under the receiver operating characteristic curve. We also found similar prediction accuracy improvements when we jointly analyzed GWAS data for Crohn’s disease and ulcerative colitis. The empirical observations were substantiated through our comprehensive simulation studies, suggesting that a gain in prediction accuracy can be obtained by combining phenotypes with relatively high genetic correlations. Through both real data and simulation studies, we demonstrated pleiotropy can be leveraged as a valuable asset that opens up a new opportunity to improve genetic risk prediction in the future.

12.
Population genetics of genomics-based crop improvement methods   Cited by: 1 (self-citations: 0, citations by others: 1)
Many genome-wide association studies (GWAS) in humans are concluding that, even with very large sample sizes and high marker densities, most of the genetic basis of complex traits may remain unexplained. At the same time, recent research in plant GWAS is showing much greater success with fewer resources. Both GWAS and genomic selection (GS), a method for predicting phenotypes by the use of genome-wide marker data, are receiving considerable attention among plant breeders. In this review we explore how differences in population genetic histories, as well as past selection for traits of interest, have produced trait architectures and patterns of linkage disequilibrium (LD) that frequently differ dramatically between domesticated plants and humans, making detection of quantitative trait loci (QTL) effects in crops more rewarding and less costly than in humans.

13.
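Genome-wide association studies (GWAS) are a common approach for mapping genes underlying complex traits in animals and plants. The application of high-throughput genotyping technologies has greatly advanced the development of GWAS. In plants, GWAS can not only identify, at relatively high resolution and on a genome-wide scale, genes or regions associated with specific traits in various natural populations, but can also reveal a panoramic view of the genetic architecture of phenotypic variation. To date, GWAS has been used to uncover quantitative trait loci (QTL) and candidate gene loci significantly associated with various traits in model plants and important crop lines, including Arabidopsis (Arabidopsis thaliana), rice (Oryza sativa), wheat (Triticum aestivum), maize (Zea mays), and soybean (Glycine max). These findings have clarified the genetic basis of the traits, provided candidate genes for uncovering the underlying molecular mechanisms, and offered a theoretical basis for breeding high-yield, high-quality crop varieties. This article describes in detail the methods of GWAS, the factors that influence it, and the data analysis workflow, with the aim of providing a reference for related research.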

14.
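To determine whether tomato (Lycopersicon esculentum) in Yinchuan has been affected by tomato spotted wilt virus (TSWV), 14 leaf samples collected from Yinchuan tomato plants with suspected TSWV infection were identified at the molecular level using the national standard TSWV RT-PCR detection method. The cloned sequences of the nucleocapsid protein gene N (Nucleocapsid) were subjected to multiple sequence alignment and phylogenetic tree analysis, and the PCR-positive samples were then tested at the protein level. The results showed that a 394 bp TSWV N gene sequence was amplified from 8 of the 14 diseased leaf samples, and that the 8 sequences were identical. The Yinchuan tomato TSWV isolate was relatively closely related to TSWV isolates from Yunnan tomato, Chinese lettuce (Lactuca sativa), Chinese iris (Iris tectorum), and Chongqing pepper (Capsicum annuum), and relatively distantly related to isolates from Shandong, Heilongjiang, Beijing, and other regions of China, as well as to isolates from other countries. Further testing of the 8 PCR-positive samples by Western blot with an anti-TSWV antibody also confirmed TSWV infection in these samples. This study is the first to demonstrate, by molecular identification and protein detection, that TSWV infection is present in Yinchuan tomato, and it indicates that the breeding of TSWV-resistant tomato varieties needs to be accelerated.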

15.
Body fat distribution, particularly centralized obesity, is associated with metabolic risk above and beyond total adiposity. We performed genome-wide association of abdominal adipose depots quantified using computed tomography (CT) to uncover novel loci for body fat distribution among participants of European ancestry. Subcutaneous and visceral fat were quantified in 5,560 women and 4,997 men from 4 population-based studies. Genome-wide genotyping was performed using standard arrays and imputed to ~2.5 million HapMap SNPs. Each study performed a genome-wide association analysis of subcutaneous adipose tissue (SAT), visceral adipose tissue (VAT), VAT adjusted for body mass index, and VAT/SAT ratio (a metric of the propensity to store fat viscerally as compared to subcutaneously) in the overall sample and in women and men separately. A weighted z-score meta-analysis was conducted. For the VAT/SAT ratio, our most significant association was at rs11118316 in the LYPLAL1 gene (p = 3.1 × 10⁻⁹), previously identified in association with waist-hip ratio. For SAT, the most significant SNP was in the FTO gene (p = 5.9 × 10⁻⁸). Given the known gender differences in body fat distribution, we performed sex-specific analyses. Our most significant finding was for VAT in women, rs1659258 near THNSL2 (p = 1.6 × 10⁻⁸), but not men (p = 0.75). Validation of this SNP in the GIANT consortium data demonstrated a similar sex-specific pattern, with observed significance in women (p = 0.006) but not men (p = 0.24) for BMI and waist circumference (p = 0.04 [women], p = 0.49 [men]). Finally, we interrogated our data for the 14 recently published loci for body fat distribution (measured by waist-hip ratio adjusted for BMI); associations were observed at 7 of these loci. In contrast, we observed associations at only 7/32 loci previously identified in association with BMI; the majority of overlap was observed with SAT. 
Genome-wide association for visceral and subcutaneous fat revealed a SNP for VAT in women. More refined phenotypes for body composition and fat distribution can detect new loci not previously uncovered in large-scale GWAS of anthropometric traits.
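The abstract above says a weighted z-score meta-analysis was conducted but does not state the weights. The sketch below assumes the widely used sample-size weighting (weight equal to the square root of each study's sample size) and a two-sided normal p-value; the function name `weighted_z_meta` and both of these choices are illustrative assumptions.

```python
import math


def weighted_z_meta(z_scores, sample_sizes):
    """Combine per-study z-scores with square-root-of-sample-size weights.

    Returns the combined z-score and its two-sided p-value under a
    standard normal null distribution.
    """
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    z_combined = numerator / denominator
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z_combined) / math.sqrt(2.0))
    return z_combined, p_value
```

Each z-score is expected to carry the sign of the effect for the same reference allele in every study, so that effects in opposite directions cancel rather than add.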

16.
Bread wheat is a leading cereal crop worldwide. The limited number of known superior allele loci has restricted the progress of molecular improvement in wheat breeding. Here, we revealed the distribution of new allelic variation for 13 yield-related traits in a series of genome-wide association studies (GWAS) using the wheat 90K genotyping assay in 163 bread wheat cultivars. Agronomic traits were investigated in 14 environments at three locations over 3 years. After filtering the SNP data sets, GWAS using 20 689 high-quality SNPs identified 1769 significant loci that explained, on average, ~20% of the phenotypic variation, including both previously reported loci and promising new genomic regions. Of these, repetitive and pleiotropic SNPs on chromosomes 6AS, 6AL, 6BS, 5BL and 7AS were significantly linked to thousand kernel weight, for example BS00021705_51 on 6BS and wsnp_Ex_c32624_41252144 on 6AS, with phenotypic variation explained (PVE) of ~24%, consistently identified in 12 and 13 of the 14 environments, respectively. Kernel length-related SNPs were mainly identified on chromosomes 7BS, 6AS, 5AL and 5BL. Plant height-related SNPs on chromosomes 4DS, 6DL, 2DS and 1BL were each identified in more than 11 environments, with an average PVE of ~55%. Four SNPs were confirmed to be important genetic loci in two RIL populations. Based on repeatability and PVE, a total of 41 SNP loci possibly played a key role in modulating yield-related traits of the cultivars surveyed. The distribution of superior alleles at the 41 SNP loci indicated that superior alleles became more common over time and that modern cultivars had integrated many superior alleles, especially those related to peduncle length and plant height. However, 19 SNP loci still showed superior-allele percentages below 50% in modern cultivars, suggesting that they deserve more attention in improving yield-related traits of cultivars in the Yellow and Huai wheat region. 
This study could provide useful information for the dissection of yield-related traits and valuable genetic loci for marker-assisted selection in Chinese wheat breeding programmes.

17.
Local interactions between neighbouring SNPs are hypothesized to be able to capture variants missing from genome-wide association studies (GWAS) via haplotype effects but have not been thoroughly explored. We have used a new high-throughput analysis tool to probe this underexplored area through full pair-wise genome scans and conventional GWAS in diastolic and systolic blood pressure and six metabolic traits in the Northern Finland Birth Cohort 1966 (NFBC1966) and the Atherosclerosis Risk in Communities study cohort (ARIC). Genome-wide significant interactions were detected in ARIC for systolic blood pressure between PLEKHA7 (a known GWAS locus for blood pressure) and GPR180 (which plays a role in vascular remodelling), and also for triglycerides as local interactions within the 11q23.3 region (replicated significantly in NFBC1966), which notably harbours several loci (BUD13, ZNF259 and APOA5) contributing to triglyceride levels. Tests of the local interactions within the 11q23.3 region conditional on the top GWAS signal suggested the presence of two independent functional variants, each with supportive evidence for their roles in gene regulation. Local interactions captured 9 additional GWAS loci identified in this study (3 significantly replicated) and 73 from previous GWAS (24 in the eight traits and 49 in related traits). We conclude that the detection of local interactions requires adequate SNP coverage of the genome and that such interactions are only likely to be detectable between SNPs in low linkage disequilibrium. Analysing local interactions is a potentially valuable complement to GWAS and can provide new insights into the biology underlying variation in complex traits.

18.
New sources of genetic diversity must be incorporated into plant breeding programs if they are to continue increasing grain yield and quality, and tolerance to abiotic and biotic stresses. Germplasm collections provide a source of genetic and phenotypic diversity, but characterization of these resources is required to increase their utility for breeding programs. We used a barley SNP iSelect platform with 7,842 SNPs to genotype 2,417 barley accessions sampled from the USDA National Small Grains Collection of 33,176 accessions. Most of the accessions in this core collection are categorized as landraces or cultivars/breeding lines and were obtained from more than 100 countries. Both STRUCTURE and principal component analysis identified five major subpopulations within the core collection, mainly differentiated by geographical origin and spike row number (an inflorescence architecture trait). Different patterns of linkage disequilibrium (LD) were found across the barley genome and many regions of high LD contained traits involved in domestication and breeding selection. The genotype data were used to define ‘mini-core’ sets of accessions capturing the majority of the allelic diversity present in the core collection. These ‘mini-core’ sets can be used for evaluating traits that are difficult or expensive to score. Genome-wide association studies (GWAS) of ‘hull cover’, ‘spike row number’, and ‘heading date’ demonstrate the utility of the core collection for locating genetic factors determining important phenotypes. The GWAS results were referenced to a new barley consensus map containing 5,665 SNPs. Our results demonstrate that GWAS and high-density SNP genotyping are effective tools for plant breeders interested in accessing genetic diversity in large germplasm collections.

19.
Genome-wide association studies (GWAS) have become a preferred method to identify new genetic susceptibility loci. This technique aims to understand the molecular etiology of common diseases, but in many cases, it has led to the identification of loci with no obvious biological relevance. Herein, we show that previously unrecognized sequence homologies have caused single-nucleotide polymorphism (SNP) microarrays to incorrectly associate a phenotype with a given locus when in fact the linkage is to another distant locus. Using genetic differences between male and female subjects as a model to study the effect of one specific genomic region on the whole SNP microarray, we provide strong evidence that the use of standard methods for GWAS can be misleading. We suggest a new systematic quality control step in the biological interpretation of previous and future GWAS.

20.
Genome-wide association studies (GWAS) simultaneously investigating hundreds of thousands of single nucleotide polymorphisms (SNPs) have become a powerful tool in the investigation of new disease susceptibility loci. Haplotypes are sometimes thought to be superior to SNPs and are promising in genetic association analyses. The application of genome-wide haplotype analysis, however, is hindered by the complexity of haplotypes themselves and by the computational effort involved. We systematically analyzed the haplotype effects for breast cancer risk among 5,761 African American women (3,016 cases and 2,745 controls) using a sliding window approach on the genome-wide scale. Three regions on chromosomes 1, 4 and 18 exhibited moderate haplotype effects. Furthermore, among 21 breast cancer susceptibility loci previously established in European populations, 10p15 and 14q24 are likely to harbor novel haplotype effects. We also proposed a heuristic for determining the significance level and the effective number of independent tests by permutation analysis of chromosome 22 data. It suggests that the effective number was approximately half of the total (7,794 out of 15,645); thus half the number of tests could serve as a quick reference for evaluating genome-wide significance if a similar sliding window approach to haplotype analysis is adopted in similar populations using similar genotype density.
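The abstract does not spell out its permutation heuristic, so the following is only a sketch of one common way to turn permutation results into an effective number of tests: take the smallest p-value from each permuted data set, use the empirical alpha-quantile of those minima as the significance threshold, and divide alpha by that threshold. The function names and the quantile rule are assumptions for illustration, not the authors' procedure.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level for n_tests independent tests."""
    return alpha / n_tests


def effective_tests_from_permutations(min_pvalues, alpha=0.05):
    """Estimate a significance threshold and an effective number of tests.

    min_pvalues: the smallest p-value observed in each permuted data set
                 (hypothetical input, one value per permutation).
    Returns (threshold, effective_number_of_tests).
    """
    ordered = sorted(min_pvalues)
    # Index of the empirical alpha-quantile of the permutation minima.
    k = max(int(alpha * len(ordered)) - 1, 0)
    threshold = ordered[k]
    return threshold, alpha / threshold
```

Using the figures reported in the abstract, `bonferroni_threshold(0.05, 7794)` gives a per-test level of roughly 6.4 × 10⁻⁶ for chromosome 22, about twice the level obtained from all 15,645 windows.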
