Similar articles
20 similar articles found.
1.
An increasing number of field studies have shown that the phenotype of an individual plant depends not only on its genotype but also on those of neighboring plants; however, this fact is not taken into consideration in genome-wide association studies (GWAS). Based on the Ising model of ferromagnetism, we incorporated neighbor genotypic identity into a regression model, named “Neighbor GWAS”. Our simulations showed that the effective range of neighbor effects could be estimated using an observed phenotype when the proportion of phenotypic variation explained (PVE) by neighbor effects peaked. The spatial scale of the first nearest neighbors gave the maximum power to detect the causal variants responsible for neighbor effects, unless their effective range was too broad. However, if the effective range of the neighbor effects was broad and minor allele frequencies were low, there was collinearity between the self and neighbor effects. To suppress the false positive detection of neighbor effects, the fixed effect and variance components involved in the neighbor effects should be tested in comparison with a standard GWAS model. We applied neighbor GWAS to field herbivory data from 199 accessions of Arabidopsis thaliana and found that neighbor effects explained 8% more of the PVE of the observed damage than standard GWAS. The neighbor GWAS method provides a novel tool that could facilitate the analysis of complex traits in spatially structured environments and is available as an R package at CRAN (https://cran.r-project.org/package=rNeighborGWAS). Subject terms: Quantitative trait, Plant ecology, Ecological genetics
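As a rough illustration of what "neighbor genotypic identity" could look like as a regression covariate, the sketch below computes, for each plant, the mean product of its genotype with the genotypes of plants within a given distance. The -1/+1 coding (in the spirit of Ising spins), the function name `neighbor_covariate`, and the Euclidean distance cutoff are assumptions made for this example; they are not taken from the abstract or from the rNeighborGWAS package.

```python
def neighbor_covariate(genotypes, coords, radius):
    """Mean of x_i * x_j over all plants j within `radius` of plant i.

    genotypes: list of -1/+1 codes at one SNP (assumed coding).
    coords:    list of (row, col) field positions, same order as genotypes.
    A plant with no neighbor inside the radius gets 0.0.
    """
    n = len(genotypes)
    covariate = []
    for i in range(n):
        xi = genotypes[i]
        ri, ci = coords[i]
        products = []
        for j in range(n):
            if j == i:
                continue
            dist = ((coords[j][0] - ri) ** 2 + (coords[j][1] - ci) ** 2) ** 0.5
            if dist <= radius:
                # +1 when the neighbor shares the genotype, -1 otherwise
                products.append(xi * genotypes[j])
        covariate.append(sum(products) / len(products) if products else 0.0)
    return covariate


# Four plants in a row; the two halves carry different alleles.
example = neighbor_covariate([1, 1, -1, -1],
                             [(0, 0), (0, 1), (0, 2), (0, 3)],
                             radius=1)
```

The resulting vector could then enter a linear model alongside the plant's own genotype; widening `radius` corresponds to testing a broader effective range of neighbor effects.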

2.
The landscape of analytical tools for population genomics continues to evolve. However, these tools are scattered across programming languages, making them largely inaccessible for many biologists. In this issue of Molecular Ecology Resources, Hemstrom and Jones, 2022 (Mol Ecol Resour; 962) introduce a new R package, snpR. This package combines a large number of existing analyses to provide a one-stop shop for population genomics. F-statistics, admixture analyses, effective population size inferences, genome-wide association studies (GWAS), and parentage analyses are all implemented natively within the package. A variety of third-party software can also be run without leaving the R environment. The authors pay particular attention to data structure – avoiding redundancy – and allowing analyses to be run across multiple sample or single-nucleotide polymorphism (SNP) groupings. Because of its great accessibility and wide range of analyses, snpR has the potential to become a favourite within the Molecular Ecology community.

3.
Genome-wide association studies (GWAS) have rapidly become a standard method for disease gene discovery. A substantial number of recent GWAS indicate that for most disorders, only a few common variants are implicated and the associated SNPs explain only a small fraction of the genetic risk. This review is written from the viewpoint that findings from the GWAS provide preliminary genetic information that is available for additional analysis by statistical procedures that accumulate evidence, and that these secondary analyses are very likely to provide valuable information that will help prioritize the strongest constellations of results. We review and discuss three analytic methods to combine preliminary GWAS statistics to identify genes, alleles, and pathways for deeper investigations. Meta-analysis seeks to pool information from multiple GWAS to increase the chances of finding true positives among the false positives and provides a way to combine associations across GWAS, even when the original data are unavailable. Testing for epistasis within a single GWAS study can identify the stronger results that are revealed when genes interact. Pathway analysis of GWAS results is used to prioritize genes and pathways within a biological context. Following a GWAS, association results can be assigned to pathways and tested in aggregate with computational tools and pathway databases. Reviews of published methods with recommendations for their application are provided within the framework for each approach.
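To make concrete how associations can be combined across GWAS when only summary statistics are available, here is a minimal sketch of fixed-effect inverse-variance meta-analysis for a single SNP. This is one common pooling scheme chosen for illustration; the abstract does not say which meta-analysis methods the review recommends, and the function name `fixed_effect_meta` is hypothetical.

```python
import math
from statistics import NormalDist


def fixed_effect_meta(betas, ses):
    """Pool per-study effect estimates for one SNP.

    betas: effect estimates (e.g. log odds ratios), one per study.
    ses:   their standard errors, same order.
    Returns (pooled beta, pooled SE, z score, two-sided p-value).
    """
    weights = [1.0 / se ** 2 for se in ses]          # inverse-variance weights
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))       # normal approximation
    return beta, se, z, p


# Two studies with equal precision: the pooled effect is their average.
pooled_beta, pooled_se, pooled_z, pooled_p = fixed_effect_meta([0.1, 0.3], [0.1, 0.1])
```

Only per-study estimates and standard errors are needed, which is why this kind of pooling works without access to individual-level data.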

4.
5.
GWAS have emerged as popular tools for identifying genetic variants that are associated with disease risk. Standard analysis of a case-control GWAS involves assessing the association between each individual genotyped SNP and disease risk. However, this approach suffers from limited reproducibility and difficulties in detecting multi-SNP and epistatic effects. As an alternative analytical strategy, we propose grouping SNPs together into SNP sets on the basis of proximity to genomic features such as genes or haplotype blocks, then testing the joint effect of each SNP set. Testing of each SNP set proceeds via the logistic kernel-machine-based test, which is based on a statistical framework that allows for flexible modeling of epistatic and nonlinear SNP effects. This flexibility and the ability to naturally adjust for covariate effects are important features of our test that make it appealing in comparison to individual SNP tests and existing multimarker tests. Using simulated data based on the International HapMap Project, we show that SNP-set testing can have improved power over standard individual-SNP analysis under a wide range of settings. In particular, we find that our approach has higher power than individual-SNP analysis when the median correlation between the disease-susceptibility variant and the genotyped SNPs is moderate to high. When the correlation is low, both individual-SNP analysis and the SNP-set analysis tend to have low power. We apply SNP-set analysis to analyze the Cancer Genetic Markers of Susceptibility (CGEMS) breast cancer GWAS discovery-phase data.

6.
We describe methods for interactive visualization and analysis of density maps available in the UCSF Chimera molecular modeling package. The methods enable segmentation, fitting, coarse modeling, measuring and coloring of density maps for elucidating structures of large molecular assemblies such as virus particles, ribosomes, microtubules, and chromosomes. The methods are suitable for density maps with resolutions in the range spanned by electron microscope single particle reconstructions and tomography. All of the tools described are simple, robust and interactive, involving computations taking only seconds. An advantage of the UCSF Chimera package is its integration of a large collection of interactive methods. Interactive tools are sufficient for performing simple analyses and also serve to prepare input for and examine results from more complex, specialized, and algorithmic non-interactive analysis software. While both interactive and non-interactive analyses are useful, we discuss only interactive methods here.

7.
Genome-wide association studies (GWAS) have rapidly become a powerful tool in genetic studies of complex diseases and traits. Traditionally, single marker-based tests have been used prevalently in GWAS and have uncovered tens of thousands of disease-associated SNPs. Network-assisted analysis (NAA) of GWAS data is an emerging area in which network-related approaches are developed and utilized to perform advanced analyses of GWAS data in order to study various human diseases or traits. Progress has been made in both methodology development and applications of NAA in GWAS data, and it has already been demonstrated that NAA results may enhance our interpretation and prioritization of candidate genes and markers. Inspired by the strong interest in and high demand for advanced GWAS data analysis, in this review article, we discuss the methodologies and strategies that have been reported for the NAA of GWAS data. Many NAA approaches search for subnetworks and assess the combined effects of multiple genes participating in the resultant subnetworks through a gene set analysis. With no restriction to pre-defined canonical pathways, NAA has the advantage of defining subnetworks with the guidance of the GWAS data under investigation. In addition, some NAA methods prioritize genes from GWAS data based on their interconnections in the reference network. Here, we summarize NAA applications to various diseases and discuss the available options and potential caveats related to their practical usage. Additionally, we provide perspectives regarding this rapidly growing research area.

8.
Maria Masotti, Bin Guo, Baolin Wu. Biometrics, 2019, 75(4): 1076-1085
Genetic variants associated with disease outcomes can be used to develop personalized treatment. To reach this precision medicine goal, hundreds of large‐scale genome‐wide association studies (GWAS) have been conducted in the past decade to search for promising genetic variants associated with various traits. They have successfully identified tens of thousands of disease‐related variants. However, in total these identified variants explain only part of the variation for most complex traits. There remain many genetic variants with small effect sizes to be discovered, which calls for the development of (a) GWAS with more samples and more comprehensively genotyped variants, for example, the NHLBI Trans‐Omics for Precision Medicine (TOPMed) Program is planning to conduct whole genome sequencing on over 100 000 individuals; and (b) novel and more powerful statistical analysis methods. The current dominating GWAS analysis approach is the “single trait” association test, despite the fact that many GWAS are conducted in deeply phenotyped cohorts including many correlated and well‐characterized outcomes, which can help improve the power to detect novel variants if properly analyzed, as suggested by increasing evidence that pleiotropy, where a genetic variant affects multiple traits, is the norm in genome‐phenome associations. We aim to develop pleiotropy informed powerful association test methods across multiple traits for GWAS. Since it is generally very hard to access individual‐level GWAS phenotype and genotype data for those existing GWAS, due to privacy concerns and various logistical considerations, we develop rigorous statistical methods for pleiotropy informed adaptive multitrait association test methods that need only summary association statistics publicly available from most GWAS. We first develop a pleiotropy test, which has powerful performance for truly pleiotropic variants but is sensitive to the pleiotropy assumption. 
We then develop a pleiotropy informed adaptive test that has robust and powerful performance under various genetic models. We develop accurate and efficient numerical algorithms to compute the analytical P‐value for the proposed adaptive test without the need of resampling or permutation. We illustrate the performance of proposed methods through application to joint association test of GWAS meta‐analysis summary data for several glycemic traits. Our proposed adaptive test identified several novel loci missed by individual trait based GWAS meta‐analysis. All the proposed methods are implemented in a publicly available R package.

9.
Genomewide association studies (GWAS) aim to identify genetic markers strongly associated with quantitative traits by utilizing linkage disequilibrium (LD) between candidate genes and markers. However, because of LD between nearby genetic markers, the standard GWAS approaches typically detect a number of correlated SNPs covering long genomic regions, making corrections for multiple testing overly conservative. Additionally, the high dimensionality of modern GWAS data poses considerable challenges for GWAS procedures such as permutation tests, which are computationally intensive. We propose a cluster‐based GWAS approach that first divides the genome into many large nonoverlapping windows and uses linkage disequilibrium network analysis in combination with principal component (PC) analysis as dimensional reduction tools to summarize the SNP data to independent PCs within clusters of loci connected by high LD. We then introduce single‐ and multilocus models that can efficiently conduct the association tests on such high‐dimensional data. The methods can be adapted to different model structures and used to analyse samples collected from the wild or from biparental F2 populations, which are commonly used in ecological genetics mapping studies. We demonstrate the performance of our approaches with two publicly available data sets from a plant (Arabidopsis thaliana) and a fish (Pungitius pungitius), as well as with simulated data.

10.
The prevailing method of analyzing GWAS data is still to test each marker individually, although from a statistical point of view it is quite obvious that in the case of complex traits such single marker tests are not ideal. Recently several model selection approaches for GWAS have been suggested, most of them based on LASSO-type procedures. Here we will discuss an alternative model selection approach which is based on a modification of the Bayesian Information Criterion (mBIC2) which was previously shown to have certain asymptotic optimality properties in terms of minimizing the misclassification error. Heuristic search strategies are introduced which attempt to find the model which minimizes mBIC2, and which are efficient enough to allow the analysis of GWAS data. Our approach is implemented in a software package called MOSGWA. Its performance in case control GWAS is compared with the two algorithms HLASSO and d-GWASelect, as well as with single marker tests, where we performed a simulation study based on real SNP data from the POPRES sample. Our results show that MOSGWA performs slightly better than HLASSO, where specifically for more complex models MOSGWA is more powerful with only a slight increase in Type I error. On the other hand, according to our simulations GWASelect does not at all control the Type I error when used to automatically determine the number of important SNPs. We also reanalyze the GWAS data from the Wellcome Trust Case-Control Consortium and compare the findings of the different procedures, where MOSGWA detects for complex diseases a number of interesting SNPs which are not found by other methods.

11.
Bayesian phylogenetics with BEAUti and the BEAST 1.7 (cited 7 times: 0 self-citations, 7 citations by others)
Computational evolutionary biology, statistical phylogenetics and coalescent-based population genetics are becoming increasingly central to the analysis and understanding of molecular sequence data. We present the Bayesian Evolutionary Analysis by Sampling Trees (BEAST) software package version 1.7, which implements a family of Markov chain Monte Carlo (MCMC) algorithms for Bayesian phylogenetic inference, divergence time dating, coalescent analysis, phylogeography and related molecular evolutionary analyses. This package includes an enhanced graphical user interface program called Bayesian Evolutionary Analysis Utility (BEAUti) that enables access to advanced models for molecular sequence and phenotypic trait evolution that were previously available to developers only. The package also provides new tools for visualizing and summarizing multispecies coalescent and phylogeographic analyses. BEAUti and BEAST 1.7 are open source under the GNU lesser general public license and available at http://beast-mcmc.googlecode.com and http://beast.bio.ed.ac.uk.

12.
Hong CB, Kim YJ, Moon S, Shin YA, Go MJ, Kim DJ, Lee JY, Cho YS. BMB reports, 2012, 45(1): 44-46
Recent advances in high-throughput genotyping technologies have enabled us to conduct a genome-wide association study (GWAS) on a large cohort. However, analyzing millions of single nucleotide polymorphisms (SNPs) is still a difficult task for researchers conducting a GWAS. Researchers often run into difficulties such as compatibility and dependency problems when installing analytical tools. This is a huge obstacle to any research institute without computing facilities and specialists. Therefore, a proper research environment is an urgent need for researchers working on GWAS. We developed BioSMACK to provide a research environment for GWAS that requires no configuration and is easy to use. BioSMACK is based on the Ubuntu Live CD that offers a complete Linux-based operating system environment without installation. Moreover, we provide users with a GWAS manual consisting of a series of guidelines for GWAS and useful examples. BioSMACK is freely available at http://ksnp.cdc.go.kr/biosmack.

13.
MOTIVATION: Genome-wide association studies (GWAS) based on single nucleotide polymorphism (SNP) arrays are the most widely used approach to detect loci associated with human traits. Due to the complexity of the methods and software packages available, each with its particular format requiring intricate management workflows, the analysis of GWAS usually confronts scientists with steep learning curves. Indeed, the wide variety of tools makes the parsing and manipulation of data the most time-consuming and error-prone part of a study. To help resolve these issues, we present GWASpi, a user-friendly, multiplatform, desktop-able application for the management and analysis of GWAS data, with a novel approach on database technologies to leverage the most out of commonly available desktop hardware. GWASpi aims to be a start-to-finish GWAS management application, from raw data to results, containing the most common analysis tools. As a result, GWASpi is easy to use and reduces by up to two orders of magnitude the time needed to perform the fundamental steps of a GWAS. AVAILABILITY: Freely available on the web at http://www.gwaspi.org. Implemented in Java, Apache-Derby and NetCDF-3, with all major operating systems supported. CONTACT: gwaspi@upf.edu; arcadi.navarro@upf.edu.

14.
SUMMARY: The availability of advanced profile-profile comparison tools, such as PRC or HHsearch, demands sophisticated visualization tools not presently available. We introduce an approach built upon the concept of HMM logos. The method illustrates the similarities of pairs of protein family profiles in an intuitive way. Two HMM logos, one for each profile, are drawn one upon the other. The aligned states are then highlighted and connected. AVAILABILITY: A web interface offering online creation of pairwise HMM logos is available at http://www.sanger.ac.uk/Software/analysis/logomat-p. Furthermore, software developers may download a Perl package that includes methods for creation of pairwise HMM logos locally. CONTACT: bsb@sanger.ac.uk.

15.
Because most macroecological and biodiversity data are spatially autocorrelated, special tools for describing spatial structures and dealing with hypothesis testing are usually required. Unfortunately, most of these methods have not been available in a single statistical package. Consequently, using these tools is still a challenge for most ecologists and biogeographers. In this paper, we present sam (Spatial Analysis in Macroecology), a new, easy-to-use, freeware package for spatial analysis in macroecology and biogeography. Through an intuitive, fully graphical interface, this package allows the user to describe spatial patterns in variables and provides an explicit spatial framework for standard techniques of regression and correlation. Moran's I autocorrelation coefficient can be calculated based on a range of matrices describing spatial relationships, for original variables as well as for residuals of regression models, which can also include filtering components (obtained by standard trend surface analysis or by principal coordinates of neighbour matrices). sam also offers tools for correcting the number of degrees of freedom when calculating the significance of correlation coefficients. Explicit spatial modelling using several forms of autoregression and generalized least-squares models is also available. We believe this new tool will provide researchers with the basic statistical tools to resolve autocorrelation problems and, simultaneously, to explore spatial components in macroecological and biogeographical data. Although the program was designed primarily for applications in macroecology and biogeography, most of sam's statistical tools will be useful for all kinds of surface pattern spatial analysis. The program is freely available at http://www.ecoevol.ufg.br/sam (permanent URL at http://purl.oclc.org/sam/).
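For readers unfamiliar with the statistic mentioned above, the following sketch computes Moran's I from a vector of values and a spatial weight matrix using the standard textbook definition, I = (n / S0) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where S0 is the sum of all weights. It is not code from the sam package; the function name `morans_i` and the binary adjacency weights in the example are assumptions for illustration.

```python
def morans_i(values, weights):
    """Moran's I autocorrelation coefficient.

    values:  list of n observations.
    weights: n x n nested list; weights[i][j] is the spatial weight
             between sites i and j (0 on the diagonal).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)            # total weight
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    variance_sum = sum(d * d for d in dev)
    return (n / s0) * (cross / variance_sum)


# Four sites along a transect; adjacent sites are neighbors.
chain = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]
gradient_i = morans_i([1, 2, 3, 4], chain)      # smooth gradient: positive I
alternating_i = morans_i([1, -1, 1, -1], chain)  # checkerboard: negative I
```

The same function can be applied to regression residuals rather than raw values to check whether a fitted model has absorbed the spatial structure.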

16.
Quantitative traits analyzed in Genome-Wide Association Studies (GWAS) are often nonnormally distributed. For such traits, association tests based on standard linear regression are subject to reduced power and inflated type I error in finite samples. Applying the rank-based inverse normal transformation (INT) to nonnormally distributed traits has become common practice in GWAS. However, the different variations on INT-based association testing have not been formally defined, and guidance is lacking on when to use which approach. In this paper, we formally define and systematically compare the direct (D-INT) and indirect (I-INT) INT-based association tests. We discuss their assumptions, underlying generative models, and connections. We demonstrate that the relative powers of D-INT and I-INT depend on the underlying data generating process. Since neither approach is uniformly most powerful, we combine them into an adaptive omnibus test (O-INT). O-INT is robust to model misspecification, protects the type I error, and is well powered against a wide range of nonnormally distributed traits. Extensive simulations were conducted to examine the finite sample operating characteristics of these tests. Our results demonstrate that, for nonnormally distributed traits, INT-based tests outperform the standard untransformed association test, both in terms of power and type I error rate control. We apply the proposed methods to GWAS of spirometry traits in the UK Biobank. O-INT has been implemented in the R package RNOmni, which is available on CRAN.
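The rank-based inverse normal transformation itself is simple to state: each observation is replaced by the standard normal quantile of its offset-adjusted fractional rank, INT(u_i) = Phi^{-1}((rank(u_i) - k) / (n - 2k + 1)). The sketch below is a plain-Python illustration of that formula, not the RNOmni implementation; the function name, the default offset k = 3/8 (the Blom offset, a common choice), and the use of average ranks for ties are assumptions of this example.

```python
from statistics import NormalDist


def rank_inverse_normal(values, k=0.375):
    """Rank-based inverse normal transform with offset k.

    Ties receive the average of the ranks they span.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        # extend the run while the next sorted value ties with this one
        while end + 1 < n and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1           # ranks are 1-based
        for t in range(pos, end + 1):
            ranks[order[t]] = average_rank
        pos = end + 1
    normal = NormalDist()
    return [normal.inv_cdf((r - k) / (n - 2 * k + 1)) for r in ranks]


# A heavily skewed trait: the transform depends only on the ranks.
transformed = rank_inverse_normal([10, 1, 100])
```

Roughly, a direct approach would apply this transform to the phenotype before regression, while an indirect approach would apply it to residuals after regressing the phenotype on covariates; the abstract gives no further detail, so the exact procedures are left to the paper.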

17.
Genome-wide association studies (GWAS) are routinely conducted for both quantitative and binary (disease) traits. We present two analytical tools for use in the experimental design of GWAS. Firstly, we present power calculations quantifying power in a unified framework for a range of scenarios. In this context we consider the utility of quantitative scores (e.g. endophenotypes) that may be available on cases only or both cases and controls. Secondly, we consider the accuracy of prediction of genetic risk from genome-wide SNPs and derive an expression for genomic prediction accuracy using a liability threshold model for disease traits in a case-control design. The expected values based on our derived equations for both power and prediction accuracy agree well with observed estimates from simulations.

18.
The detrimental effects of the winner’s curse, including overestimation of the genetic effects of associated variants and underestimation of sufficient sample sizes for replication studies are well-recognized in genome-wide association studies (GWAS). These effects can be expected to worsen as the field moves from GWAS into whole genome sequencing. To date, few studies have reported statistical adjustments to the naive estimates, due to the lack of suitable statistical methods and computational tools. We have developed an efficient genome-wide non-parametric method that explicitly accounts for the threshold, ranking, and allele frequency effects in whole genome scans. Here, we implement the method to provide bias-reduced estimates via bootstrap re-sampling (BR-squared) for association studies of both disease status and quantitative traits, and we report the results of applying BR-squared to GWAS of psoriasis and HbA1c. We observed over 50% reduction in the genetic effect size estimation for many associated SNPs. This translates into a greater than fourfold increase in sample size requirements for successful replication studies, which in part explains some of the apparent failures in replicating the original signals. Our analysis suggests that adjusting for the winner’s curse is critical for interpreting findings from whole genome scans and planning replication and meta-GWAS studies, as well as in attempts to translate findings into the clinical setting.

19.
The power of genome-wide association studies (GWAS) rests on several foundations: (i) there is a significant amount of additive genetic variation, (ii) individual causal polymorphisms often have sizable effects and (iii) they segregate at moderate-to-intermediate frequencies, or will be effectively ‘tagged’ by polymorphisms that do. Each of these assumptions has recently been questioned. (i) Why should genetic variation appear additive given that the underlying molecular networks are highly nonlinear? (ii) A new generation of relatedness-based analyses directs us back to the nearly infinitesimal model for effect sizes that quantitative genetics was long based upon. (iii) Larger effect causal polymorphisms are often low frequency, as selection might lead us to expect. Here, we review these issues and other findings that appear to question many of the foundations of the optimism GWAS prompted. We then present a roadmap emerging as one possible future for quantitative genetics. We argue that, in future, GWAS should move beyond purely statistical grounds. One promising approach is to build upon the combination of population genetic models and molecular biological knowledge. This combined treatment, however, requires fitting experimental data to models that are very complex, as well as accurate capturing of the uncertainty of resulting inference. This problem can be resolved through Bayesian analysis and tools such as approximate Bayesian computation, a method growing in popularity in population genetic analysis. We show a case example of anterior–posterior segmentation in Drosophila, and argue that similar approaches will be helpful as a GWAS augmentation, in human and agricultural research.

20.