Similar literature
 20 similar documents found; search time 31 ms
1.

Background  

In order to interpret the results obtained from a microarray experiment, researchers often shift focus from analysis of individual differentially expressed genes to analyses of sets of genes. These gene-set analysis (GSA) methods use previously accumulated biological knowledge to group genes into sets and then aim to rank these gene sets in a way that reflects their relative importance in the experimental situation in question. We suspect that the presence of paralogs affects the ability of GSA methods to accurately identify the most important sets of genes for subsequent research.

2.
3.

Background

Gene Set Analysis (GSA) identifies differentially expressed gene sets across phenotypes. The results of published papers in this field are inconsistent, and there is no consensus on the best method. In this paper, two new GSA methods are introduced and compared with existing ones.

Methods

The MMGSA and MRGSA methods, based on multivariate nonparametric techniques, are presented. Five existing GSA methods (Hotelling's T2, Globaltest, Abs_Cat, Med_Cat and Rs_Cat) and the two novel methods were compared on their ability to detect differential gene expression between phenotypes, using simulated and real microarray data sets.

Results

In a real dataset, the power of MMGSA and MRGSA was as good as that of Globaltest and Tsai. The MRGSA method did not perform well on the simulated dataset.

Conclusions

Globaltest was the best method on both the real and the simulated datasets. MMGSA performed well on the simulated dataset for small gene sets. The GLS methods did not perform well on the simulated data, except for Med_Cat with large gene sets.

4.

Background

Gene set analysis (GSA) methods test the association of sets of genes with phenotypes in gene expression microarray studies. While GSA methods for a single binary or categorical phenotype abound, little attention has been paid to the case of a continuous phenotype, and there is no method to accommodate multiple correlated continuous phenotypes.

Result

We propose here an extension of the linear combination test (LCT) to multiple continuous phenotypes, incorporating correlations among gene expressions of functionally related gene sets as well as correlations among the phenotypes. Further, we extend the new method to a nonlinear version, referred to as the nonlinear combination test (NLCT), to test potential nonlinear associations of gene sets with multiple phenotypes. A simulation study and a real microarray example demonstrate the practical aspects of the proposed methods.

Conclusion

The proposed approaches are effective in controlling type I errors and powerful in testing associations between gene sets and multiple continuous phenotypes. Both are computationally efficient. Naively (univariately) analyzing a group of multiple correlated phenotypes could be dangerous. R code to perform LCT and NLCT for multiple continuous phenotypes is available at http://www.ualberta.ca/~yyasui/homepage.html.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-260) contains supplementary material, which is available to authorized users.

5.

Introduction

Gene-set analysis (GSA) methods are used as complementary approaches to genome-wide association studies (GWASs). The single marker association estimates of a predefined set of genes are contrasted either with those of all remaining genes or with a null non-associated background. To pool the p-values from several GSAs, it is important to take into account the concordance of the observed patterns resulting from single marker association point estimates across any given gene set. Here we propose META-GSA, an enhanced version of Fisher’s inverse χ2-method that weights each study to account for imperfect correlation between association patterns.
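As a point of reference for the pooling step, the following minimal sketch implements the classical, unweighted Fisher inverse χ2 combination of independent p-values. It is not META-GSA: the study weighting described in the abstract is not specified there and is deliberately left out. The function names `chi2_sf_even_df` and `fisher_combine` are hypothetical and chosen for illustration only.

```python
import math


def chi2_sf_even_df(x, df):
    """Survival function P(X > x) of a chi-square variable with even df.

    For df = 2k the tail probability has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combine(p_values):
    """Pool independent p-values with Fisher's unweighted method.

    Returns the test statistic -2 * sum(log p) and the pooled p-value,
    which refers the statistic to a chi-square with 2 * len(p_values) df.
    """
    if not p_values or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    return statistic, chi2_sf_even_df(statistic, 2 * len(p_values))
```

Calling `fisher_combine([0.05, 0.05])` pools two study-level p-values into one; a weighted variant such as the one the abstract proposes would change how each study contributes to the statistic.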

Simulation and Power

We investigated the performance of META-GSA by simulating GWASs with 500 cases and 500 controls at 100 diallelic markers in 20 different scenarios, simulating different relative risks between 1 and 1.5 in gene sets of 10 genes. Wilcoxon’s rank sum test was applied as GSA for each study. We found that META-GSA has greater power to discover truly associated gene sets than simple pooling of the p-values, e.g. 59% versus 37% when the true relative risk for 5 of 10 genes was assumed to be 1.5. Under the null hypothesis of no difference in the true association pattern between the gene set of interest and the set of remaining genes, the results of both approaches are almost uncorrelated. We recommend not relying on p-values alone when combining the results of independent GSAs.

Application

We applied META-GSA to pool the results of four case-control GWASs of lung cancer risk (Central European Study, Toronto/Lunenfeld-Tanenbaum Research Institute Study, German Lung Cancer Study and MD Anderson Cancer Center Study), which had already been analyzed separately with four different GSA methods (EASE, SLAT, mSUMSTAT and GenGen). This application revealed the pathway GO0015291 “transmembrane transporter activity” as significantly enriched with associated genes (GSA-method: EASE, p = 0.0315 corrected for multiple testing). Similar results were found for GO0015464 “acetylcholine receptor activity” but only when not corrected for multiple testing (all GSA-methods applied; p≈0.02).

6.

Introduction

Strains of Shiga-toxin producing Escherichia coli O157 (STEC O157) are important foodborne pathogens in humans, and outbreaks of illness have been associated with consumption of undercooked beef. Here, we determine the most effective intervention strategies to reduce the prevalence of STEC O157 contaminated beef carcasses using a modelling approach.

Method

A computational model simulated events and processes in the beef harvest chain. Information from empirical studies was used to parameterise the model. Variance-based global sensitivity analysis (GSA) using the Saltelli method identified variables with the greatest influence on the prevalence of STEC O157 contaminated carcasses. Following a baseline scenario (no interventions), a series of simulations systematically introduced and tested interventions based on influential variables identified by repeated Saltelli GSA, to determine the most effective intervention strategy.
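To illustrate what a variance-based global sensitivity analysis computes, here is a minimal sketch of first-order Sobol index estimation using a Saltelli-style sampling design (two independent sample matrices A and B, plus matrices in which one column of A is replaced by the matching column of B). The beef harvest chain model itself is not available in the abstract, so the example assumes a hypothetical model with independent inputs uniformly distributed on [0, 1]; the function name `first_order_sobol` is likewise an assumption.

```python
import random


def first_order_sobol(model, n_inputs, n_samples=20000, seed=0):
    """Estimate first-order Sobol indices for inputs uniform on [0, 1].

    model: callable taking a list of n_inputs floats, returning a float.
    Returns a list with one estimated index per input.
    """
    rng = random.Random(seed)
    # Two independent base sample matrices.
    a = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    b = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    f_a = [model(row) for row in a]
    f_b = [model(row) for row in b]

    # Total output variance, estimated from both base samples.
    outputs = f_a + f_b
    mean = sum(outputs) / len(outputs)
    variance = sum((y - mean) ** 2 for y in outputs) / (len(outputs) - 1)

    indices = []
    for i in range(n_inputs):
        # Rows of A with column i taken from B.
        f_ab = [model(ra[:i] + [rb[i]] + ra[i + 1:]) for ra, rb in zip(a, b)]
        # Partial variance attributed to input i alone.
        v_i = sum(fb * (fab - fa)
                  for fb, fab, fa in zip(f_b, f_ab, f_a)) / n_samples
        indices.append(v_i / variance)
    return indices
```

For the toy model `lambda x: x[0] + 2.0 * x[1]` the exact first-order indices are 0.2 and 0.8, so the estimates should land near those values. In a model with interacting inputs the first-order indices sum to less than one, which is the kind of interaction effect the abstract reports.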

Results

Transfer of STEC O157 from hide or gastro-intestinal tract to carcass (improved abattoir hygiene) had the greatest influence on the prevalence of contaminated carcasses. Due to interactions between inputs (identified by Saltelli GSA), combinations of interventions based on improved abattoir hygiene achieved a greater reduction in maximum prevalence than would be expected from an additive effect of single interventions. The most effective combination was improved abattoir hygiene with vaccination, which achieved a greater than ten-fold decrease in maximum prevalence compared to the baseline scenario.

Conclusion

Study results suggest that effective interventions to reduce the prevalence of STEC O157 contaminated carcasses should initially be based on improved abattoir hygiene. However, the effect of improved abattoir hygiene on the distribution of STEC O157 concentration on carcasses is an important information gap: further empirical research is required to determine whether reduced prevalence of contaminated carcasses is likely to result in reduced incidence of STEC O157 associated illness in humans. This is the first use of variance-based GSA to assess the drivers of STEC O157 contamination of beef carcasses.

7.

Purpose

Identification of key inputs and their effect on results from Life Cycle Assessment (LCA) models is fundamental. Because parameter importance varies greatly between cases due to the interaction of sensitivity and uncertainty, these features should never be defined a priori. However, exhaustive parametric uncertainty analyses can be complicated and demanding, with both analytical and sampling methods. Therefore, we propose a systematic method for selection of critical parameters based on a simplified analytical formulation that unifies the concepts of sensitivity and uncertainty in a Global Sensitivity Analysis (GSA) framework.

Methods

The proposed analytical method, based on the calculation of sensitivity coefficients (SC), is evaluated against Monte Carlo sampling on traditional uncertainty assessment procedures, both for individual parameters and for full parameter sets. Three full-scale waste management scenarios are modelled with the dedicated waste LCA model EASETECH and a full range of ILCD recommended impact categories. Common uncertainty ranges of 10 % are used for all parameters, which we assume to be normally distributed. The applicability of the concepts of additivity of variances and GSA is tested on results from both uncertainty propagation methods. Then, we examine the differences in discernibility analysis results obtained with varying numbers of sampling points and parameters.
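The comparison between analytical propagation and Monte Carlo sampling can be sketched generically as follows. This is not the EASETECH formulation: it assumes a first-order (linearised) propagation in which each parameter contributes its squared local slope times its variance, reads the 10 % uncertainty as a coefficient of variation of independent normal parameters, and uses a hypothetical three-parameter toy model. All function names are assumptions for illustration.

```python
import math
import random


def analytical_cv(model, means, cvs, rel_step=1e-6):
    """First-order estimate of the output coefficient of variation.

    Adds up the variance contributions (slope_i * sd_i)^2 of independent
    parameters, with slopes taken by forward finite differences.
    """
    base = model(means)
    variance = 0.0
    for i, (mu, cv) in enumerate(zip(means, cvs)):
        step = abs(mu) * rel_step or rel_step
        shifted = list(means)
        shifted[i] = mu + step
        slope = (model(shifted) - base) / step
        variance += (slope * cv * abs(mu)) ** 2
    return math.sqrt(variance) / abs(base)


def monte_carlo_cv(model, means, cvs, n_samples=20000, seed=0):
    """Output coefficient of variation from sampling normal parameters."""
    rng = random.Random(seed)
    outputs = [
        model([rng.gauss(mu, cv * abs(mu)) for mu, cv in zip(means, cvs)])
        for _ in range(n_samples)
    ]
    mean = sum(outputs) / n_samples
    variance = sum((y - mean) ** 2 for y in outputs) / (n_samples - 1)
    return math.sqrt(variance) / abs(mean)


def toy_model(x):
    """Hypothetical stand-in for an LCA impact score."""
    return x[0] * x[1] + x[2]
```

With `means = [2.0, 3.0, 4.0]` and `cvs = [0.1, 0.1, 0.1]`, the two estimates agree closely because the toy model is nearly linear over the sampled range. The individual squared terms in `analytical_cv` also show which parameters dominate the output variance.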

Results and discussion

The proposed analytical method complies with the Monte Carlo results for all scenarios and impact categories, but offers substantially simpler mathematical formulation and shorter computation times. The coefficients of variation obtained with the analytical method and Monte Carlo differ only by 1 %, indicating that the analytical method provides a reliable representation of uncertainties and allows determination of whether a discernibility analysis is required. The additivity of variances and the GSA approach show that the uncertainty in results is determined by a limited set of important parameters. The results of the discernibility analysis based on these critical parameters vary only by 1 % from discernibility analyses based on the full set, but require significantly fewer Monte Carlo runs.

Conclusions

The proposed method and GSA framework provide a fast and valuable approximation for uncertainty quantification. Uncertainty can be represented sparsely by contextually identifying important parameters in a systematic manner. The proposed method integrates with existing step-wise approaches for uncertainty analysis by introducing a global importance analysis before uncertainty propagation.

8.

Background  

The paper by Liu, Gaido and Wolfinger on gene expression during the division cycle of HeLa cells, which uses the data of Whitfield et al., is discussed in order to see whether their analysis is related to gene expression during the division cycle.

9.

Background  

Gene clustering has been widely used to group genes with similar expression patterns in microarray data analysis. Subsequent enrichment analysis using predefined gene sets can provide clues on which functional themes or regulatory sequence motifs are associated with individual gene clusters. In spite of their potential utility, gene clustering and enrichment analysis have been used on separate platforms; thus, the development of an integrative algorithm linking both methods is highly challenging.

10.

Background

Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with gene set analysis (GSA) methods for chemical treatment identification, for pharmacological mechanism elucidation, and for comparing compound toxicity profiles.

Methods

We created 30,211 chemical response-specific gene sets for human and mouse by next-gen TM, and derived 1,189 (human) and 588 (mouse) gene sets from the Comparative Toxicogenomics Database (CTD). We tested for significant differential expression (SDE) (false discovery rate-corrected p-values < 0.05) of the next-gen TM-derived gene sets and the CTD-derived gene sets in gene expression (GE) data sets of five chemicals (from experimental models). We tested for SDE of gene sets for six fibrates in a peroxisome proliferator-activated receptor alpha (PPARA) knock-out GE dataset and compared these to results from the Connectivity Map. We tested for SDE of 319 next-gen TM-derived gene sets for environmental toxicants in three GE data sets of triazoles, and tested for SDE of 442 gene sets associated with embryonic structures. We compared the gene sets to triazole effects seen in the Whole Embryo Culture (WEC), and used principal component analysis (PCA) to discriminate triazoles from other chemicals.
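The abstract does not name the false discovery rate procedure used. As an illustration of how FDR-corrected p-values are commonly obtained, the sketch below implements the Benjamini-Hochberg step-up adjustment; its use here is an assumption, and the function name `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is scaled by m / rank (rank in ascending order), then a
    running minimum from the largest rank downward keeps the adjusted
    values monotone.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

A gene set would then be called significantly differentially expressed when its adjusted p-value falls below the chosen threshold, which is 0.05 in the abstract.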

Results

Next-gen TM-derived gene sets matching the chemical treatment were significantly altered in three GE data sets, and the corresponding CTD-derived gene sets were significantly altered in five GE data sets. Six next-gen TM-derived and four CTD-derived fibrate gene sets were significantly altered in the PPARA knock-out GE dataset. None of the fibrate signatures in cMap scored significant against the PPARA GE signature. 33 environmental toxicant gene sets were significantly altered in the triazole GE data sets. 21 of these toxicants had a similar toxicity pattern as the triazoles. We confirmed embryotoxic effects, and discriminated triazoles from other chemicals.

Conclusions

Gene set analysis with next-gen TM-derived chemical response-specific gene sets is a scalable method for identifying similarities in gene responses to other chemicals, from which one may infer potential mode of action and/or toxic effect.

11.

Background  

Previous differential coexpression analyses focused on identification of differentially coexpressed gene pairs, revealing many insightful biological hypotheses. However, this method could not detect coexpression relationships between pairs of gene sets. Considering the success of many set-wise analysis methods for microarray data, a coexpression analysis based on gene sets may elucidate underlying biological processes provoked by the conditional changes. Here, we propose a differentially coexpressed gene sets (dCoxS) algorithm that identifies the differentially coexpressed gene set pairs between conditions.

12.
13.

Background  

Hierarchical clustering is a common clustering method in the analysis of gene expression data. Usually the analysis involves selecting clusters by cutting the tree at a suitable level and/or analysing a sorted gene list obtained from the tree. Cutting the hierarchical tree requires the selection of a suitable level and results in the loss of information on the other levels. Sorted gene lists depend on the method used to sort the joined clusters. The author proposes that the clusters should be selected using the gene classifications.

14.

Background  

The accurate determination of orthology and inparalogy relationships is essential for comparative sequence analysis, functional gene annotation and evolutionary studies. Various methods have been developed based on simple BLAST all-versus-all pairwise comparisons and/or time-consuming phylogenetic tree analyses.

15.

Background  

Traditional phylogenetic analysis within a gene family is mainly based on DNA or amino acid sequence homologies. However, these phylogenetic tree analyses are not suitable for "non-traditional" gene families, such as microRNAs, with very short sequences. For normal protein-coding gene families, low bootstrap values are frequently encountered at some nodes, suggesting low confidence in, or likely inappropriateness of, the placement of members at those nodes.

16.

Background

The asexual blood stages of the human malaria parasite Plasmodium falciparum produce highly immunogenic polymorphic antigens that are expressed on the surface of the host cell. In contrast, few studies have examined the surface of the gametocyte-infected erythrocyte.

Methodology/Principal Findings

We used flow cytometry to detect antibodies recognising the surface of live cultured erythrocytes infected with gametocytes of P. falciparum strain 3D7 in the plasma of 200 Gambian children. The majority of children had been identified as carrying gametocytes after treatment for malaria, and each donated blood for mosquito-feeding experiments. None of the plasma recognised the surface of erythrocytes infected with developmental stages of gametocytes (I–IV), but 66 of 194 (34.0%) plasma contained IgG that recognised the surface of erythrocytes infected with mature (stage V) gametocytes. Thirty-four (17.0%) of 200 plasma tested recognised erythrocytes infected with trophozoites and schizonts, but there was no association with recognition of the surface of gametocyte-infected erythrocytes (odds ratio 1.08, 95% C.I. 0.434–2.57; P = 0.851). Plasma antibodies with the ability to recognise gametocyte surface antigens (GSA) were associated with the presence of antibodies that recognise the gamete antigen Pfs230, but not Pfs48/45. Antibodies recognising GSA were associated with donors having lower gametocyte densities 4 weeks after antimalarial treatment.

Conclusions/Significance

We provide evidence that GSA are distinct from antigens detected on the surface of asexual 3D7 parasites. Our findings suggest a novel strategy for the development of transmission-blocking vaccines.

17.

Background  

Gene set enrichment analysis (GSEA) is a microarray data analysis method that uses predefined gene sets and ranks of genes to identify significant biological changes in microarray data sets. GSEA is especially useful when gene expression changes in a given microarray data set are minimal or moderate.
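The core of this rank-based idea can be sketched as an unweighted running-sum statistic: walk down the ranked gene list, step up at genes in the set and down at genes outside it, and record the largest deviation from zero. The sketch assumes a ranked list of unique gene identifiers and omits both the correlation weighting and the permutation-based significance assessment used in full GSEA implementations. The function name `enrichment_score` and the gene identifiers in the usage note are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score for one gene set.

    ranked_genes: unique gene identifiers, most up-regulated first.
    gene_set: identifiers of the predefined gene set.
    Returns the signed maximum deviation of the running sum from zero.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hits = len(members)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")
    running = 0.0
    best = 0.0
    for gene in ranked_genes:
        if gene in members:
            running += 1.0 / n_hits   # step up at a set member
        else:
            running -= 1.0 / n_miss   # step down at a non-member
        if abs(running) > abs(best):
            best = running
    return best
```

A set concentrated at the top of the ranking gives a score near +1 and a set concentrated at the bottom gives a score near -1. For example, `enrichment_score(["g1", "g2", "g3", "g4", "g5", "g6"], {"g1", "g2"})` places both members at the top of the list.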

18.

Background  

Many methods have been developed to test the enrichment of genes related to certain phenotypes or cell states in gene sets. These approaches usually combine gene expression data with functionally related gene sets as defined in databases such as GeneOntology (GO), KEGG, or BioCarta. The results based on gene set analysis are generally more biologically interpretable, accurate and robust than the results based on individual gene analysis. However, while most available methods for gene set enrichment analysis test the enrichment of the entire gene set, it is more likely that only a subset of the genes in the gene set may be related to the phenotypes of interest.

19.

Background  

Vertebrate genes often appear to cluster within the background of nontranscribed genomic DNA. Here an analysis of the physical distribution of gene structures on human chromosome 7 was performed to confirm the presence of clustering, and to elucidate possible underlying statistical and biological mechanisms.

20.

Background  

Whether for cell culture studies of protein function, construction of mouse models to enable in vivo analysis of disease epidemiology, or ultimately gene therapy of human diseases, a critical enabling step is the ability to achieve finely controlled regulation of gene expression. Previous efforts to achieve this goal have explored inducible drug regulation of gene expression, and construction of synthetic promoters based on two-hybrid paradigms, among others.
