Similar articles
 Found 20 similar articles (search took 15 ms)
1.
Genome-wide association studies (GWAS) are designed to identify the portion of single-nucleotide polymorphisms (SNPs) in genome sequences associated with a complex trait. Strategies based on the gene list enrichment concept are currently applied for the functional analysis of GWAS, according to which a significant overrepresentation of candidate genes associated with a biological pathway is used as a proxy to infer overrepresentation of candidate SNPs in the pathway. Here we show that such inference is not always valid and introduce the program SNP2GO, which implements a new method to properly test for the overrepresentation of candidate SNPs in biological pathways.
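For orientation, the gene-list enrichment idea that this abstract questions is commonly implemented as a one-sided hypergeometric test. The sketch below illustrates that conventional gene-level test only; it is not the SNP2GO method, whose details the abstract does not give. The function name and the example counts are hypothetical.

```python
from math import comb

def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) for a hypergeometric X.

    population: total number of genes considered
    successes:  genes annotated to the pathway
    draws:      candidate genes selected
    k:          candidate genes that fall in the pathway
    """
    total = comb(population, draws)
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / total

# Hypothetical counts: 20 genes, 5 in the pathway, 4 candidates, 2 of them in the pathway.
p_value = hypergeom_upper_tail(2, 20, 5, 4)
```

A small p_value would suggest that candidate genes are overrepresented in the pathway; the abstract's point is that this does not necessarily carry over to candidate SNPs.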

2.
New sources of genetic diversity must be incorporated into plant breeding programs if they are to continue increasing grain yield and quality, and tolerance to abiotic and biotic stresses. Germplasm collections provide a source of genetic and phenotypic diversity, but characterization of these resources is required to increase their utility for breeding programs. We used a barley SNP iSelect platform with 7,842 SNPs to genotype 2,417 barley accessions sampled from the USDA National Small Grains Collection of 33,176 accessions. Most of the accessions in this core collection are categorized as landraces or cultivars/breeding lines and were obtained from more than 100 countries. Both STRUCTURE and principal component analysis identified five major subpopulations within the core collection, mainly differentiated by geographical origin and spike row number (an inflorescence architecture trait). Different patterns of linkage disequilibrium (LD) were found across the barley genome and many regions of high LD contained traits involved in domestication and breeding selection. The genotype data were used to define ‘mini-core’ sets of accessions capturing the majority of the allelic diversity present in the core collection. These ‘mini-core’ sets can be used for evaluating traits that are difficult or expensive to score. Genome-wide association studies (GWAS) of ‘hull cover’, ‘spike row number’, and ‘heading date’ demonstrate the utility of the core collection for locating genetic factors determining important phenotypes. The GWAS results were referenced to a new barley consensus map containing 5,665 SNPs. Our results demonstrate that GWAS and high-density SNP genotyping are effective tools for plant breeders interested in accessing genetic diversity in large germplasm collections.

3.
4.
Increasingly large numbers of proteins require methods for functional annotation. This is typically based on pairwise inference from the homology of either protein sequence or structure. Recently, similarity networks have been presented to leverage both the ability to visualize relationships between proteins and assess the transferability of functional inference. Here we present PANADA, a novel toolkit for the visualization and analysis of protein similarity networks in Cytoscape. Networks can be constructed based on pairwise sequence or structural alignments either on a set of proteins or, alternatively, by database search from a single sequence. The PANADA web server, a downloadable executable, examples, and extensive help files are available at http://protein.bio.unipd.it/panada/.

5.
Genome-wide association studies (GWAS) are routinely conducted for both quantitative and binary (disease) traits. We present two analytical tools for use in the experimental design of GWAS. Firstly, we present power calculations quantifying power in a unified framework for a range of scenarios. In this context we consider the utility of quantitative scores (e.g. endophenotypes) that may be available on cases only or both cases and controls. Secondly, we consider the accuracy of prediction of genetic risk from genome-wide SNPs and derive an expression for genomic prediction accuracy using a liability threshold model for disease traits in a case-control design. The expected values based on our derived equations for both power and prediction accuracy agree well with observed estimates from simulations.
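As a rough illustration of what a GWAS power calculation involves, the sketch below computes power for a single SNP and a quantitative trait. It is not the authors' unified framework. It assumes a two-sided 1-df test under a normal approximation, with non-centrality parameter N·q²/(1−q²), where q² is the fraction of trait variance the SNP explains. The function name and the default significance level are assumptions made for this example.

```python
from statistics import NormalDist

def gwas_power(n, q2, alpha=5e-8):
    """Approximate power of a two-sided 1-df association test.

    n:     sample size
    q2:    fraction of trait variance explained by the SNP (0 <= q2 < 1)
    alpha: per-test significance level (assumed genome-wide default)
    """
    if not 0 <= q2 < 1:
        raise ValueError("q2 must lie in [0, 1)")
    ncp = n * q2 / (1 - q2)           # assumed non-centrality parameter
    std = NormalDist()
    z_crit = std.inv_cdf(1 - alpha / 2)
    shift = ncp ** 0.5                # mean of the test statistic on the z scale
    # Probability that |Z + shift| exceeds the critical value.
    return std.cdf(shift - z_crit) + std.cdf(-shift - z_crit)

# Hypothetical design question: how much does a tenfold larger sample help?
power_small = gwas_power(1_000, 0.005)
power_large = gwas_power(10_000, 0.005)
```

Under these assumptions, power rises with both sample size and effect size, and equals alpha when the SNP explains no variance.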

6.
7.
8.
PLoS One, 2013, 8(4)
Asbestos exposure is the main risk factor for malignant pleural mesothelioma (MPM), a rare aggressive tumor. Nevertheless, only 5–17% of those exposed to asbestos develop MPM, suggesting the involvement of other environmental and genetic risk factors. To identify the genetic risk factors that may contribute to the development of MPM, we conducted a genome-wide association study (GWAS; 370,000 genotyped SNPs, 5 million imputed SNPs) in Italy, among 407 MPM cases and 389 controls with a complete history of asbestos exposure. A replication study was also undertaken and included 428 MPM cases and 1269 controls from Australia. Although no single marker reached the genome-wide significance threshold, several associations were supported by haplotype-, chromosomal region-, gene- and gene-ontology process-based analyses. Most of these SNPs were located in regions reported to harbor aberrant alterations in mesothelioma (SLC7A14, THRB, CEBP350, ADAMTS2, ETV1, PVT1 and MMP14 genes), causing at most a 2–3-fold increase in MPM risk. The Australian replication study showed significant associations in five of these chromosomal regions (3q26.2, 4q32.1, 7p22.2, 14q11.2, 15q14). Multivariate analysis suggested an independent contribution of 10 genetic variants, with an Area Under the ROC Curve (AUC) of 0.76 when only exposure and covariates were included in the model, and of 0.86 when the genetic component was also included, with a substantial increase of asbestos exposure risk estimation (odds ratio, OR: 45.28, 95% confidence interval, CI: 21.52–95.28). These results showed that genetic risk factors may play an additional role in the development of MPM, and that these should be taken into account to better estimate individual MPM risk in individuals who have been exposed to asbestos.

9.
10.
11.
Efforts to identify loci underlying complex traits generally assume that most genetic variance is additive. Here, we examined the genetics of Arabidopsis thaliana root length and found that the genomic narrow-sense heritability for this trait in the examined population was statistically zero. The low amount of additive genetic variance that could be captured by the genome-wide genotypes likely explains why no associations to root length could be found using standard additive-model-based genome-wide association (GWA) approaches. However, as the broad-sense heritability for root length was significantly larger, and primarily due to epistasis, we also performed an epistatic GWA analysis to map loci contributing to the epistatic genetic variance. Four interacting pairs of loci were revealed, involving seven chromosomal loci that passed a standard multiple-testing corrected significance threshold. The genotype-phenotype maps for these pairs revealed epistasis that cancelled out the additive genetic variance, explaining why these loci were not detected in the additive GWA analysis. Small population sizes, such as in our experiment, increase the risk of identifying false epistatic interactions due to testing for associations with very large numbers of multi-marker genotypes in few phenotyped individuals. Therefore, we estimated the false-positive risk using a new statistical approach that suggested half of the associated pairs to be true positive associations. Our experimental evaluation of candidate genes within the seven associated loci suggests that this estimate is conservative; we identified functional candidate genes that affected root development in four loci that were part of three of the pairs. The statistical epistatic analyses were thus indispensable for confirming known, and identifying new, candidate genes for root length in this population of wild-collected A. thaliana accessions. 
We also illustrate how epistatic cancellation of the additive genetic variance explains the insignificant narrow-sense and significant broad-sense heritability by using a combination of careful statistical epistatic analyses and functional genetic experiments.
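The cancellation described in this abstract can be reproduced in a toy two-locus model. The sketch below is an idealised illustration, not the authors' analysis. It assumes two independent loci in homozygous lines, alleles coded −1 and +1, and all four genotype classes equally frequent. The function name and genotypic values are hypothetical.

```python
from itertools import product

def variance_components(values):
    """Split genetic variance into additive and epistatic parts.

    values maps each genotype (a, b), with a and b in {-1, +1}, to its
    genotypic value. The four classes are assumed equally frequent.
    Returns (total, additive, epistatic).
    """
    classes = list(product((-1, 1), repeat=2))
    mean = sum(values[g] for g in classes) / 4
    total = sum((values[g] - mean) ** 2 for g in classes) / 4
    # Marginal effect of each locus: half the difference between
    # the mean of the +1 classes and the mean of the -1 classes.
    eff_a = (sum(values[(1, b)] for b in (-1, 1))
             - sum(values[(-1, b)] for b in (-1, 1))) / 4
    eff_b = (sum(values[(a, 1)] for a in (-1, 1))
             - sum(values[(a, -1)] for a in (-1, 1))) / 4
    additive = eff_a ** 2 + eff_b ** 2
    return total, additive, total - additive

# Pure interaction: the genotypic value is the product of the two allele codes.
pure_epistasis = {(a, b): a * b for a, b in product((-1, 1), repeat=2)}
total, additive, epistatic = variance_components(pure_epistasis)
```

In this toy case all genetic variance is epistatic and none is additive, so an additive single-locus scan has nothing to detect even though the loci fully determine the trait.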

12.

Introduction

Gene expression data is often assumed to be normally-distributed, but this assumption has not been tested rigorously. We investigate the distribution of expression data in human cancer genomes and study the implications of deviations from the normal distribution for translational molecular oncology research.

Methods

We conducted a central moments analysis of five cancer genomes and performed empiric distribution fitting to examine the true distribution of expression data both on the complete-experiment and on the individual-gene levels. We used a variety of parametric and nonparametric methods to test the effects of deviations from normality on gene calling, functional annotation, and prospective molecular classification using a sixth cancer genome.

Results

Central moments analyses reveal statistically-significant deviations from normality in all of the analyzed cancer genomes. We observe as much as 37% variability in gene calling, 39% variability in functional annotation, and 30% variability in prospective, molecular tumor subclassification associated with this effect.

Conclusions

Cancer gene expression profiles are not normally-distributed, either on the complete-experiment or on the individual-gene level. Instead, they exhibit complex, heavy-tailed distributions characterized by statistically-significant skewness and kurtosis. The non-Gaussian distribution of this data affects identification of differentially-expressed genes, functional annotation, and prospective molecular classification. These effects may be reduced in some circumstances, although not completely eliminated, by using nonparametric analytics. This analysis highlights two unreliable assumptions of translational cancer gene expression analysis: that “small” departures from normality in the expression data distributions are analytically-insignificant and that “robust” gene-calling algorithms can fully compensate for these effects.
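The skewness and kurtosis mentioned here are standardized third and fourth central moments. A minimal sketch of how they might be computed for one gene's expression values follows. It uses population (biased) moment estimators, and the function names and sample values are hypothetical. The abstract does not specify the authors' estimators or significance tests.

```python
def central_moment(xs, k):
    """k-th central moment of a sample (population form, divides by n)."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** k for x in xs) / len(xs)

def skewness(xs):
    """Third standardized moment; 0 for a symmetric distribution."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5

def excess_kurtosis(xs):
    """Fourth standardized moment minus 3; 0 for a normal distribution."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2 - 3

# Hypothetical expression values for one gene with a single extreme sample.
expression = [0.0, 0.0, 0.0, 0.0, 10.0]
gene_skew = skewness(expression)
```

Clearly positive skewness or excess kurtosis, as in this right-tailed example, is the kind of departure from normality the abstract reports.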

13.

Background

In plants, 14-3-3 proteins are encoded by a large multigene family and are involved in signaling pathways to regulate plant development and protection from stress. Although twelve Populus 14-3-3s were identified based on the Populus trichocarpa genome V1.1 in a previous study, no systematic analysis including genome organization, gene structure, duplication relationship, evolutionary analysis and expression compendium has been conducted in Populus based on the latest P. trichocarpa genome V3.0.

Principal Findings

Here, a comprehensive analysis of the Populus 14-3-3 family is presented. Two new 14-3-3 genes were identified based on the latest P. trichocarpa genome. In P. trichocarpa, fourteen 14-3-3 genes were grouped into ε and non-ε groups. Exon-intron organizations of Populus 14-3-3s are highly conserved within the same group. Genomic organization analysis indicated that purifying selection plays a pivotal role in the retention and maintenance of the Populus 14-3-3 family. Protein conformational analysis indicated that Populus 14-3-3 consists of a bundle of nine α-helices (α1-α9); the first four are essential for formation of the dimer, while α3, α5, α7, and α9 form a conserved peptide-binding groove. In addition, α1, α3, α5, α7, and α9 were evolving at a lower rate, while α2, α4, and α6 were evolving at a relatively faster rate. Microarray analyses showed that most Populus 14-3-3s are differentially expressed across tissues and upon exposure to various stresses.

Conclusions

The gene structures of Populus 14-3-3s and the structures of their encoded proteins are highly conserved among group members, suggesting that members of the same group might also have conserved functions. Microarray and qRT-PCR analyses showed that most Populus 14-3-3s were differentially expressed in various tissues and were induced by various stresses. Our investigation provided a better understanding of the complexity of the 14-3-3 gene family in poplars.

14.
15.
While available evidence supports the role of genetics in the pathogenesis of placental abruption (PA), PA-related placental genome variations and maternal-placental genetic interactions have not been investigated. Maternal blood and placental samples collected from participants in the Peruvian Abruptio Placentae Epidemiology study were genotyped using Illumina’s Cardio-Metabochip platform. We examined 118,782 genome-wide SNPs and 333 SNPs in 32 candidate genes from mitochondrial biogenesis and oxidative phosphorylation pathways in placental DNA from 280 PA cases and 244 controls. We assessed maternal-placental interactions in the candidate gene SNPs and two imprinted regions (IGF2/H19 and C19MC). Univariate and penalized logistic regression models were fit to estimate odds ratios. We examined the combined effect of multiple SNPs on PA risk using weighted genetic risk scores (WGRS) with repeated ten-fold cross-validations. A multinomial model was used to investigate maternal-placental genetic interactions. In placental genome-wide and candidate gene analyses, no SNP was significant after false discovery rate correction. The top genome-wide association study (GWAS) hits were rs544201, rs1484464 (CTNNA2), rs4149570 (TNFRSF1A) and rs13055470 (ZNRF3) (p-values: 1.11e-05 to 3.54e-05). The top 200 SNPs of the GWAS overrepresented genes involved in cell cycle, growth and proliferation. The top candidate gene hits were rs16949118 (COX10) and rs7609948 (THRB) (p-values: 6.00e-03 and 8.19e-03). Participants in the highest quartile of WGRS based on cross-validations using SNPs selected from the GWAS and candidate gene analyses had an 8.40-fold (95% CI: 5.8–12.56) and a 4.46-fold (95% CI: 2.94–6.72) higher odds of PA compared to participants in the lowest quartile. We found maternal-placental genetic interactions on PA risk for two SNPs in PPARG (chr3:12313450 and chr3:12412978) and maternal imprinting effects for multiple SNPs in the C19MC and IGF2/H19 regions. 
Variations in the placental genome and interactions between maternal-placental genetic variations may contribute to PA risk. Larger studies may help advance our understanding of PA pathogenesis.
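A weighted genetic risk score of the kind mentioned in this abstract is typically a weighted sum of risk-allele counts. The sketch below assumes the common convention of weighting each SNP by the natural log of its odds ratio. The abstract does not state the weights the authors used, and the function name and all numbers here are hypothetical.

```python
from math import log

def weighted_grs(risk_allele_counts, odds_ratios):
    """Weighted genetic risk score for one individual.

    risk_allele_counts: 0, 1 or 2 copies of the risk allele per SNP
    odds_ratios:        per-SNP odds ratios; log(OR) is the assumed weight
    """
    if len(risk_allele_counts) != len(odds_ratios):
        raise ValueError("one odds ratio is required per SNP")
    return sum(log(oratio) * count
               for count, oratio in zip(risk_allele_counts, odds_ratios))

# Hypothetical individual genotyped at three SNPs.
score = weighted_grs([2, 1, 0], [1.0, 2.0, 1.5])
```

Individuals would then be ranked by score and grouped into quartiles, ideally with weights estimated on held-out folds, as the cross-validation in the abstract suggests.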

16.
17.
PLoS One, 2009, 4(7)
To identify loci affecting the electrocardiographic QT interval, a measure of cardiac repolarisation associated with risk of ventricular arrhythmias and sudden cardiac death, we conducted a meta-analysis of three genome-wide association studies (GWAS) including 3,558 subjects from the TwinsUK and BRIGHT cohorts in the UK and the DCCT/EDIC cohort from North America. Five loci were significantly associated with QT interval at P < 1×10^−6. To validate these findings we performed an in silico comparison with data from two QT consortia: QTSCD (n = 15,842) and QTGEN (n = 13,685). Analysis confirmed the association between common variants near NOS1AP (P = 1.4×10^−83) and the phospholamban (PLN) gene (P = 1.9×10^−29). The most associated SNP near NOS1AP (rs12143842) explains 0.82% variance; the SNP near PLN (rs11153730) explains 0.74% variance of QT interval duration. We found no evidence for interaction between these two SNPs (P = 0.99). PLN is a key regulator of cardiac diastolic function and is involved in regulating intracellular calcium cycling; it has only recently been identified as a susceptibility locus for QT interval. These data offer further mechanistic insights into genetic influence on the QT interval which may predispose to life threatening arrhythmias and sudden cardiac death.

18.
19.
20.

SUMMARY

N-glycosylation of proteins is one of the most prevalent posttranslational modifications in nature. Accordingly, a pathway with shared commonalities is found in all three domains of life. While excellent model systems have been developed for studying N-glycosylation in both Eukarya and Bacteria, an understanding of this process in Archaea was hampered until recently by a lack of effective molecular tools. However, within the last decade, impressive advances in the study of the archaeal version of this important pathway have been made for halophiles, methanogens, and thermoacidophiles, combining glycan structural information obtained by mass spectrometry with bioinformatic, genetic, biochemical, and enzymatic data. These studies reveal both features shared with the eukaryal and bacterial domains and novel archaeon-specific aspects. Unique features of N-glycosylation in Archaea include the presence of unusual dolichol lipid carriers, the use of a variety of linking sugars that connect the glycan to proteins, the presence of novel sugars as glycan constituents, the presence of two very different N-linked glycans attached to the same protein, and the ability to vary the N-glycan composition under different growth conditions. These advances are the focus of this review, with an emphasis on N-glycosylation pathways in Haloferax, Methanococcus, and Sulfolobus.
