Similar literature
20 similar articles found
1.
In this paper, we introduce the R package gendist that computes the probability density function, the cumulative distribution function, the quantile function and generates random values for several generated probability distribution models including the mixture model, the composite model, the folded model, the skewed symmetric model and the arc tan model. These models are extensively used in the literature and the R functions provided here are flexible enough to accommodate various univariate distributions found in other R packages. We also show its applications in graphing, estimation, simulation and risk measurements.
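To make the four operations concrete for one of the models named above, here is a minimal sketch of a two-component mixture in Python. It is not the gendist API: the function names, the choice of normal components, and the bisection-based quantile are all assumptions made for illustration.

```python
import random
from statistics import NormalDist

# Hypothetical helpers (not gendist functions): a two-component mixture
# with weight w on component d1 and weight 1 - w on component d2.

def mixture_pdf(x, w, d1, d2):
    """Density: weighted sum of the component densities."""
    return w * d1.pdf(x) + (1 - w) * d2.pdf(x)

def mixture_cdf(x, w, d1, d2):
    """Distribution function: weighted sum of the component CDFs."""
    return w * d1.cdf(x) + (1 - w) * d2.cdf(x)

def mixture_quantile(p, w, d1, d2, tol=1e-10):
    """Quantile by bisection on the mixture CDF.

    The mixture quantile always lies between the two component
    quantiles, so they serve as the initial bracket.
    """
    q1, q2 = d1.inv_cdf(p), d2.inv_cdf(p)
    lo, hi = min(q1, q2), max(q1, q2)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mixture_cdf(mid, w, d1, d2) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def mixture_sample(n, w, d1, d2, rng):
    """Random values: pick a component with probability w, then draw from it."""
    draws = []
    for _ in range(n):
        d = d1 if rng.random() < w else d2
        draws.append(rng.gauss(d.mean, d.stdev))
    return draws

# Example components (assumed for illustration only).
left = NormalDist(-1.0, 1.0)
right = NormalDist(1.0, 1.0)
```

The same pattern extends to other component families as long as each one supplies a density, a distribution function and a sampler; only the quantile needs numerical inversion.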

2.
3.
4.
Gene set analysis aims to identify predefined sets of functionally related genes that are differentially expressed between two conditions. Although gene set analysis has been very successful, by incorporating biological knowledge about the gene sets and enhancing statistical power over gene-by-gene analyses, it does not take into account the correlation (association) structure among the genes. In this work, we present CoGA (Co-expression Graph Analyzer), an R package for the identification of groups of differentially associated genes between two phenotypes. The analysis is based on concepts of Information Theory applied to the spectral distributions of the gene co-expression graphs, such as the spectral entropy to measure the randomness of a graph structure and the Jensen-Shannon divergence to discriminate classes of graphs. The package also includes common measures to compare gene co-expression networks in terms of their structural properties, such as centrality, degree distribution, shortest path length, and clustering coefficient. Besides the structural analyses, CoGA also includes graphical interfaces for visual inspection of the networks, ranking of genes according to their “importance” in the network, and the standard differential expression analysis. We show by both simulation experiments and analyses of real data that the statistical tests performed by CoGA indeed control the rate of false positives and that CoGA is able to identify differentially co-expressed genes that other methods failed to detect.
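The two information-theoretic quantities mentioned above can be sketched as follows. This is not CoGA code: it assumes the graph eigenvalues have already been computed elsewhere, and it approximates the spectral distribution with a plain histogram, which is a simplification chosen here to keep the example short. All function names are hypothetical.

```python
import math

def histogram(values, lo, hi, bins):
    """Normalized histogram of values on [lo, hi] with equal-width bins."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp so that values on or beyond the edges land in the end bins.
        i = max(0, min(int((v - lo) / width), bins - 1))
        counts[i] += 1
    total = len(values)
    return [c / total for c in counts]

def entropy(p):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: H(M) - (H(P) + H(Q)) / 2, with M = (P + Q) / 2."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return entropy(m) - (entropy(p) + entropy(q)) / 2
```

Under these assumptions, two graphs with identical binned spectra give a divergence of 0, and two with completely non-overlapping spectra give the maximum value, log 2.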

5.
Contemporary genetic studies are revealing the genetic complexity of many traits in humans and model organisms. Two hallmarks of this complexity are epistasis, meaning gene-gene interaction, and pleiotropy, in which one gene affects multiple phenotypes. Understanding the genetic architecture of complex traits requires addressing these phenomena, but interpreting the biological significance of epistasis and pleiotropy is often difficult. While epistasis reveals dependencies between genetic variants, it is often unclear how the activity of one variant is specifically modifying the other. Epistasis found in one phenotypic context may disappear in another context, rendering the genetic interaction ambiguous. Pleiotropy can suggest either redundant phenotype measures or gene variants that affect multiple biological processes. Here we present an R package, R/cape, which addresses these interpretation ambiguities by implementing a novel method to generate predictive and interpretable genetic networks that influence quantitative phenotypes. R/cape integrates information from multiple related phenotypes to constrain models of epistasis, thereby enhancing the detection of interactions that simultaneously describe all phenotypes. The networks inferred by R/cape are readily interpretable in terms of directed influences that indicate suppressive and enhancing effects of individual genetic variants on other variants, which in turn account for the variance in quantitative traits. We demonstrate the utility of R/cape by analyzing a mouse backcross, thereby discovering novel epistatic interactions influencing phenotypes related to obesity and diabetes. R/cape is an easy-to-use, platform-independent R package and can be applied to data from both genetic screens and a variety of segregating populations including backcrosses, intercrosses, and natural populations. The package is freely available under the GPL-3 license at http://cran.r-project.org/web/packages/cape.
This is a PLOS Computational Biology Software Article

6.
I introduce an open-source R package ‘dcGOR’ that makes it easy for the bioinformatics community to analyse ontologies and protein domain annotations, particularly those in the dcGO database. The dcGO is a comprehensive resource for protein domain annotations using a panel of ontologies including Gene Ontology. Although increasing in popularity, this database needs statistical and graphical support to meet its full potential. Moreover, there are no bioinformatics tools specifically designed for domain ontology analysis. As an add-on package built in the R software environment, dcGOR offers a basic infrastructure with great flexibility and functionality. It implements a new data structure to represent domains, ontologies, annotations, and all analytical outputs as well. For each ontology, it provides various mining facilities, including: (i) domain-based enrichment analysis and visualisation; (ii) construction of a domain (semantic similarity) network according to ontology annotations; and (iii) significance analysis for estimating a contact (statistical significance) network. To reduce runtime, most analyses support high-performance parallel computing. Taking as input a list of protein domains of interest, the package is able to easily carry out in-depth analyses in terms of functional, phenotypic and disease relevance, and network-level understanding. More importantly, dcGOR is designed to allow users to import and analyse their own ontologies and annotations on domains (taken from SCOP, Pfam and InterPro) and RNAs (from Rfam) as well. The package is freely available at CRAN for easy installation, and also at GitHub for version control. The dedicated website with reproducible demos can be found at http://supfam.org/dcGOR.
This is a PLOS Computational Biology Software Article

7.
Recent advances in big data and analytics research have provided a wealth of large data sets that are too big to be analyzed in their entirety, due to restrictions on computer memory or storage size. New Bayesian methods have been developed for data sets that are large only due to large sample sizes. These methods partition big data sets into subsets and perform independent Bayesian Markov chain Monte Carlo analyses on the subsets. The methods then combine the independent subset posterior samples to estimate a posterior density given the full data set. These approaches were shown to be effective for Bayesian models including logistic regression models, Gaussian mixture models and hierarchical models. Here, we introduce the R package parallelMCMCcombine which carries out four of these techniques for combining independent subset posterior samples. We illustrate each of the methods using a Bayesian logistic regression model for simulation data and a Bayesian Gamma model for real data; we also demonstrate features and capabilities of the R package. The package assumes the user has carried out the Bayesian analysis and has produced the independent subposterior samples outside of the package. The methods are primarily suited to models with unknown parameters of fixed dimension that exist in continuous parameter spaces. We envision this tool will allow researchers to explore the various methods for their specific applications and will assist future progress in this rapidly developing field.
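The abstract does not name the four combination techniques, so the following is only a sketch of one widely known idea of this kind: a precision-weighted average of subset draws for a single scalar parameter, in the style of consensus Monte Carlo. Whether it corresponds to any function in parallelMCMCcombine is an assumption, and the function name is hypothetical.

```python
from statistics import variance

def combine_weighted(subsamples):
    """Combine subposterior draws of one scalar parameter.

    subsamples: one list of MCMC draws per data subset.
    Each subset is weighted by the inverse of its sample variance, and the
    t-th combined draw is the weighted average of the t-th draw of every subset.
    """
    weights = [1.0 / variance(s) for s in subsamples]
    total = sum(weights)
    n = min(len(s) for s in subsamples)  # use the common number of draws
    return [
        sum(w * s[t] for w, s in zip(weights, subsamples)) / total
        for t in range(n)
    ]
```

As in the package description, the draws are taken as given: the subset MCMC runs happen elsewhere, and this step only merges their output.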

8.
9.
10.
Gene expression signatures can predict the activation of oncogenic pathways and other phenotypes of interest via quantitative models that combine the expression levels of multiple genes. However, as the number of platforms to measure genome-wide gene expression proliferates, there is an increasing need to develop models that can be ported across diverse platforms. Because of the range of technologies that measure gene expression, the resulting signal values can vary greatly. To understand how this variation can affect the prediction of gene expression signatures, we have investigated the ability of gene expression signatures to predict pathway activation across Affymetrix and Illumina microarrays. We hybridized the same RNA samples to both platforms and compared the resultant gene expression readings, as well as the signature predictions. Using a new approach to map probes across platforms, we found that the genes in the signatures from the two platforms were highly similar, and that the predictions they generated were also strongly correlated. This demonstrates that our method can map probes from Affymetrix and Illumina microarrays, and that this mapping can be used to predict gene expression signatures across platforms.

11.
12.
The R package COPASutils provides a logical workflow for the reading, processing, and visualization of data obtained from the Union Biometrica Complex Object Parametric Analyzer and Sorter (COPAS) or the BioSorter large-particle flow cytometers. Data obtained from these powerful experimental platforms can be unwieldy, making them difficult to process and visualize using existing tools. Researchers studying small organisms, such as Caenorhabditis elegans, Anopheles gambiae, and Danio rerio, and using these devices will benefit from this streamlined and extensible R package. COPASutils offers a powerful suite of functions for the rapid processing and analysis of large high-throughput screening data sets.

13.
Although the use of (neo-)adjuvant chemotherapy in breast cancer patients has resulted in improved outcome, not all patients benefit equally. We have evaluated the utility of an in vitro chemosensitivity assay in predicting response to neoadjuvant chemotherapy. Pre-therapeutic biopsies were obtained from 30 breast cancer patients assigned to neoadjuvant epirubicin 75 mg/m² and docetaxel 75 mg/m² (Epi/Doc) in a prospectively randomized clinical trial. Biopsies were subjected to a standardized ATP-based Epi/Doc chemosensitivity assay, and to gene expression profiling. Patients then received 3 cycles of chemotherapy, and response was evaluated by changes in tumor diameter and Ki67 expression. The efficacy of Epi/Doc in vitro was correlated with differential changes in tumor cell proliferation in response to Epi/Doc in vivo (p = 0.0011; r = 0.73670, Spearman's rho), but did not predict changes in tumor size. While a pre-therapeutic gene expression signature identified tumors with a clinical response to Epi/Doc, no such signature could be found for tumors that responded to Epi/Doc in vitro, or tumors in which Epi/Doc exerted an antiproliferative effect in vivo. This is the first prospective clinical trial to demonstrate the utility of a standardized in vitro chemosensitivity assay in predicting the individual biological response to chemotherapy in breast cancer.

14.
15.
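Objective: To study the expression of the nucleolar stem cell factor nucleostemin (NS) gene in epithelial ovarian tumors and to explore its relationship with tumor pathological classification. Methods: RT-PCR and Western blot were used to detect the expression of the nucleostemin gene and the corresponding protein in surgical specimens of 36 ovarian cancer tissues, 32 benign epithelial ovarian tumor tissues, and 12 normal ovarian tissues. NS gene and protein expression was compared across the three groups using a grouped-control design, with relative quantification. Statistical methods were used to test whether NS gene expression was associated with clinicopathological grade and serum CA125. Results: (1) The positive expression rate of NS in ovarian cancer tissue was significantly higher than in benign tumor tissue and normal ovarian tissue; (2) within ovarian cancer tissue, NS expression was higher in the lymph node metastasis group than in the non-metastasis group; (3) expression was higher in the clinical stage III group than in the stage IB group; (4) expression was higher in the moderately and poorly differentiated groups than in the well-differentiated group. Conclusion: The NS gene is highly expressed in ovarian cancer tissue; its expression level is unrelated to histological type but is positively correlated with clinical TNM stage and histological grade.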

16.

Background

The analysis of microbial communities through DNA sequencing brings many challenges: the integration of different types of data with methods from ecology, genetics, phylogenetics, multivariate statistics, visualization and testing. With the increased breadth of experimental designs now being pursued, project-specific statistical analyses are often needed, and these analyses are often difficult (or impossible) for peer researchers to independently reproduce. The vast majority of the requisite tools for performing these analyses reproducibly are already implemented in R and its extensions (packages), but with limited support for high throughput microbiome census data.

Results

Here we describe a software project, phyloseq, dedicated to the object-oriented representation and analysis of microbiome census data in R. It supports importing data from a variety of common formats, as well as many analysis techniques. These include calibration, filtering, subsetting, agglomeration, multi-table comparisons, diversity analysis, parallelized Fast UniFrac, ordination methods, and production of publication-quality graphics; all in a manner that is easy to document, share, and modify. We show how to apply functions from other R packages to phyloseq-represented data, illustrating the availability of a large number of open source analysis techniques. We discuss the use of phyloseq with tools for reproducible research, a practice common in other fields but still rare in the analysis of highly parallel microbiome census data. We have made available all of the materials necessary to completely reproduce the analysis and figures included in this article, an example of best practices for reproducible research.

Conclusions

The phyloseq project for R is a new open-source software package, freely available on the web from both GitHub and Bioconductor.

17.
18.
19.
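Objective: To investigate the expression and clinical significance of the tumor suppressor gene p53 in liver cancer tissue, and to provide new ideas for the treatment of liver cancer. Methods: Clinical data from 40 liver cancer patients admitted between January 2006 and August 2009 were analyzed retrospectively. Immunohistochemistry was used to measure p53 expression in liver cancer tissue, adjacent non-tumor tissue, and normal tissue, and p53 expression was analyzed in patients with different degrees of tumor differentiation. Results: The p53 positive expression rate was 47.5% in the liver cancer group and 2.5% in the adjacent-tissue group; no positive p53 expression was detected in the normal group. The difference among the three groups was significant (χ² = 6.515, P = 0.024). The positive expression rate of p53 protein was low in well-differentiated carcinomas and high in moderately and poorly differentiated carcinomas; the between-group comparisons were all significant (P < 0.05), whereas the difference between the moderately and poorly differentiated groups was not statistically significant (P > 0.05). Conclusion: The p53 gene is highly expressed in liver cancer tissue, and its positive expression is related to the degree of tumor differentiation.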

20.
To understand the molecular mechanism for intramuscular fat deposition, the expression of the obese gene was examined in response to fasting. Food deprivation for 48 h induced a decrease in the level of obese mRNA in pooled adipose tissues (abdominal, perirenal, subcutaneous, intermuscular and intramuscular). The expression of obese mRNA was examined for individual adipose tissue from several fat depots. It was highly expressed in perirenal adipose tissue, but fasting did not affect its expression level in this tissue. Moderate levels were detected in subcutaneous and intermuscular adipose tissues, and a fasting-induced decrease in obese mRNA was apparent in these tissues. The expression level of the obese gene in intramuscular adipose tissue was very low and did not respond to fasting.
