Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
SUMMARY: OrderedList is a Bioconductor compliant package for meta-analysis based on ordered gene lists like those resulting from differential gene expression analysis. Our package quantifies the similarity between gene lists. The significance of the similarity score is estimated from random scores computed on perturbed data. OrderedList illustrates list similarity in intuitive plots and determines the score-driving genes for further analysis. AVAILABILITY: http://www.bioconductor.org CONTACT: claudio.lottaz@molgen.mpg.de SUPPLEMENTARY INFORMATION: Please visit our webpage on http://compdiag.molgen.mpg.de/software.
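
The idea of scoring the similarity of two ordered gene lists and judging that score against random scores can be sketched in Python. This is a minimal illustration, not the OrderedList code: the exponentially decaying weights, the `alpha` parameter, the function names, and the use of plain list shuffling in place of the package's perturbed data are all assumptions made for the example.

```python
import math
import random


def overlap_score(list_a, list_b, alpha=0.1):
    """Sum, over every depth n, the number of genes shared by the top n
    entries of both lists, weighted by exp(-alpha * n) (assumed weighting)."""
    depth = min(len(list_a), len(list_b))
    score = 0.0
    for n in range(1, depth + 1):
        shared = len(set(list_a[:n]) & set(list_b[:n]))
        score += math.exp(-alpha * n) * shared
    return score


def permutation_p_value(list_a, list_b, n_random=1000, alpha=0.1, seed=0):
    """Empirical p-value: how often a randomly shuffled copy of list_b
    scores at least as high as the observed pair (a stand-in for scores
    computed on perturbed data)."""
    rng = random.Random(seed)
    observed = overlap_score(list_a, list_b, alpha)
    shuffled = list(list_b)
    hits = 0
    for _ in range(n_random):
        rng.shuffle(shuffled)
        if overlap_score(list_a, shuffled, alpha) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (hits + 1) / (n_random + 1)
```

Two rankings that agree near the top receive a high score because early depths carry the largest weights; a larger `alpha` concentrates the score on the first few genes.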

2.
SUMMARY: By linking differential gene expression to the chromosomal localization of genes, one can investigate microarray data for characteristic patterns of expression phenomena involving sizeable parts of specific chromosomes. We have implemented a statistical approach for identifying significantly differentially expressed chromosome regions. We demonstrate the applicability of the approach on a publicly available data set on acute lymphocytic leukemia. AVAILABILITY: The R-package MACAT can be obtained from http://www.compdiag.molgen.mpg.de/software/macat.shtml SUPPLEMENTARY INFORMATION: http://www.compdiag.molgen.mpg.de/software/macat.shtml.

3.
Screening for differential gene expression in microarray studies leads to difficult large-scale multiple testing problems. The local false discovery rate is a statistical concept for quantifying uncertainty in multiple testing. We introduce a novel estimator for the local false discovery rate that is based on an algorithm which splits all genes into two groups, representing induced and noninduced genes, respectively. Starting from the full set of genes, we successively exclude genes until the gene-wise p-values of the remaining genes look like a typical sample from a uniform distribution. In comparison to other methods, our algorithm performs comparably in detecting the shape of the local false discovery rate and has a smaller bias with respect to estimating the overall percentage of noninduced genes. Our algorithm is implemented in the Bioconductor compatible R package TWILIGHT version 1.0.1, which is available from http://compdiag.molgen.mpg.de/software or from the Bioconductor project at http://www.bioconductor.org.
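
The two quantities discussed here, the percentage of noninduced genes and the local false discovery rate, can be illustrated with a short Python sketch. It does not reproduce the successive exclusion algorithm of TWILIGHT. Instead it swaps in a simpler, well-known tail-based estimate of the null proportion (counting p-values above a cutoff `lam`) and a histogram density estimate; the function names, `lam`, and `n_bins` are assumptions made for the example.

```python
def estimate_pi0(p_values, lam=0.5):
    """Estimate the proportion of noninduced genes from the p-values above
    `lam`, where mostly null (uniformly distributed) p-values remain."""
    m = len(p_values)
    tail = sum(1 for p in p_values if p > lam)
    return min(1.0, tail / (m * (1.0 - lam)))


def local_fdr_by_bin(p_values, n_bins=20, lam=0.5):
    """Return (pi0, fdr) where fdr[k] approximates the local false discovery
    rate pi0 / f(p) for p-values falling into the k-th histogram bin."""
    pi0 = estimate_pi0(p_values, lam)
    m = len(p_values)
    counts = [0] * n_bins
    for p in p_values:
        # p == 1.0 is assigned to the last bin.
        counts[min(int(p * n_bins), n_bins - 1)] += 1
    fdr = []
    for count in counts:
        density = count * n_bins / m  # histogram estimate of f(p)
        fdr.append(1.0 if density == 0 else min(1.0, pi0 / density))
    return pi0, fdr
```

Under the null hypothesis p-values are uniform, so their density is 1; wherever the observed density rises well above `pi0`, the local false discovery rate drops below 1 and the genes in that bin are more plausibly induced.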

4.
MOTIVATION: Statistical methods based on controlling the false discovery rate (FDR) or positive false discovery rate (pFDR) are now well established in identifying differentially expressed genes in DNA microarray data. Several authors have recently raised the important issue that FDR or pFDR may give misleading inference when specific genes are of interest because they average the genes under consideration with genes that show stronger evidence for differential expression. The paper proposes a flexible and robust mixture model for estimating the local FDR, which quantifies how plausible it is that each specific gene is differentially expressed. RESULTS: We develop a special mixture model tailored to multiple testing by requiring the P-value distribution for the differentially expressed genes to be stochastically smaller than the P-value distribution for the non-differentially expressed genes. A smoothing mechanism is built in. The proposed model gives robust estimation of local FDR for any reasonable underlying P-value distribution. It also provides a single framework for estimating the proportion of differentially expressed genes, pFDR, negative predictive values, sensitivity and specificity. A cervical cancer study shows that the local FDR gives more specific and relevant quantification of the evidence for differential expression that can be substantially different from pFDR. AVAILABILITY: An R function implementing the proposed model is available at http://www.geocities.com/jg_liao/software

5.
MOTIVATION: Comparative sequence analysis is the essence of many approaches to genome annotation. Heuristic alignment algorithms utilize similar seed pairs to anchor an alignment. Some applications of local alignment algorithms (e.g. phylogenetic footprinting) would benefit from including prior knowledge (e.g. binding site motifs) in the alignment building process. RESULTS: We introduce predefined sequence patterns as anchor points into a heuristic local alignment strategy. We extended the BLASTZ program for this purpose. A set of seed patterns is given either as consensus sequences in IUPAC code or as position weight matrices. Phylogenetic footprinting of promoter regions is one of many potential applications for the SITEBLAST software. AVAILABILITY: The source code is freely available to the academic community from http://corg.molgen.mpg.de/software
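
How a consensus sequence in IUPAC code can serve as a source of anchor points is sketched below in Python. This is an illustration only and not the SITEBLAST or BLASTZ implementation: the function names are hypothetical, matching is exact on one strand, and pairing every hit in one sequence with every hit in the other is an assumed simplification of the seeding step.

```python
import re

# Standard IUPAC nucleotide ambiguity codes as regular expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_sites(consensus, sequence):
    """Return the start positions of all (possibly overlapping) matches of
    an IUPAC consensus pattern in a DNA sequence."""
    body = "".join(IUPAC[symbol] for symbol in consensus.upper())
    # A zero-width lookahead lets the scan report overlapping matches.
    lookahead = re.compile("(?=" + body + ")")
    return [match.start() for match in lookahead.finditer(sequence.upper())]


def candidate_anchors(consensus, seq_a, seq_b):
    """Pair every pattern hit in seq_a with every hit in seq_b; each pair is
    a candidate anchor from which an alignment could be extended."""
    hits_b = find_sites(consensus, seq_b)
    return [(i, j) for i in find_sites(consensus, seq_a) for j in hits_b]
```

For example, `find_sites("TATAWA", promoter)` lists the positions where a TATA-box-like motif occurs in the string `promoter`, and `candidate_anchors` turns the hits from two homologous promoters into coordinate pairs.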

6.
The regulation of intragenic miRNAs by their own intronic promoters is one of the open problems of miRNA biogenesis. Here, we describe PROmiRNA, a new approach for miRNA promoter annotation based on a semi-supervised statistical model trained on deepCAGE data and sequence features. We validate our results with existing annotation, PolII occupancy data and read coverage from RNA-seq data. Compared to previous methods, PROmiRNA increases the detection rate of intronic promoters by 30%, allowing us to perform a large-scale analysis of their genomic features, as well as elucidate their contribution to tissue-specific regulation. PROmiRNA can be downloaded from http://promirna.molgen.mpg.de.

7.
Xdigitise: visualization of hybridization experiments   Cited by: 1 (self-citations: 0, citations by others: 1)
Xdigitise is a software system for visualization of hybridization experiments giving the user facilities to analyze the corresponding images manually or automatically. Images of the high-density DNA arrays are displayed as well as the results of an external image analysis bundled with Xdigitise, e.g. the spot locations are marked and the duplicate correlations are shown by a color scale. AVAILABILITY: Xdigitise can be downloaded from http://www.molgen.mpg.de/~xdigitise.

8.
9.
10.
Epigenome mapping consortia are generating resources of tremendous value for studying epigenetic regulation. To maximize their utility and impact, new tools are needed that facilitate interactive analysis of epigenome datasets. Here we describe EpiExplorer, a web tool for exploring genome and epigenome data on a genomic scale. We demonstrate EpiExplorer's utility by describing a hypothesis-generating analysis of DNA hydroxymethylation in relation to public reference maps of the human epigenome. All EpiExplorer analyses are performed dynamically within seconds, using an efficient and versatile text indexing scheme that we introduce to bioinformatics. EpiExplorer is available at http://epiexplorer.mpi-inf.mpg.de.

11.
Predictive medicine by cytomics: potential and challenges   Cited by: 2 (self-citations: 0, citations by others: 2)
Predictive medicine by cytomics represents a new concept which provides disease course predictions for individual patients. The predictive information is derived from the molecular cell phenotypes as they are determined by the patient's genotype and exposure to external or internal influences. The predictions are dynamic because they are therapy dependent. They may provide a therapeutic lead time for preventive therapy or for the diminution of disease-associated irreversible tissue damage. Multiparametric data from cytometry, multiple clinical chemistry assays, chip or bead arrays serve as input for an algorithmic data sieving procedure (http://www.biochem.mpg.de/valet/classif1.html). Data sieving enriches the discriminatory parameters in the form of standardized data masks for predictive or diagnostic disease classification in the individual patient (http://www.biochem.mpg.de/valet/cellclas.html). Besides predictive and diagnostic utility, the data patterns can be used in a bottom-up approach for the development of scientific hypotheses on disease-inducing mechanisms in complex inflammatory, infectious, allergic, malignant or degenerative diseases.

12.
The Database of Ribosomal Cross-links: an update.   Cited by: 4 (self-citations: 1, citations by others: 3)
The Database of Ribosomal Cross-links (DRC) was created in 1997. Here we describe new data incorporated into this database and several new features of the DRC. The DRC is freely available via the World Wide Web at http://visitweb.com/database/ or http://www.mpimg-berlin-dahlem.mpg.de/~ag_ribo/ag_brimacombe/drc/

13.
14.
15.
MOTIVATION: An estimated 25% of all eukaryotic proteins contain repeats, which underlines the importance of duplication for evolving new protein functions. Internal repeats often correspond to structural or functional units in proteins. Methods capable of identifying diverged repeated segments or domains at the sequence level can therefore assist in predicting domain structures, inferring hypotheses about function and mechanism, and investigating the evolution of proteins from smaller fragments. RESULTS: We present HHrepID, a method for the de novo identification of repeats in protein sequences. It is able to detect the sequence signature of structural repeats in many proteins that were not previously known to possess internal sequence symmetry, such as outer membrane beta-barrels. HHrepID uses HMM-HMM comparison to exploit evolutionary information in the form of multiple sequence alignments of homologs. In contrast to a previous method, the new method (1) generates a multiple alignment of repeats; (2) utilizes the transitive nature of homology through a novel merging procedure with fully probabilistic treatment of alignments; (3) improves alignment quality through an algorithm that maximizes the expected accuracy; (4) is able to identify different kinds of repeats within complex architectures by a probabilistic domain boundary detection method and (5) improves sensitivity through a new approach to assess statistical significance. AVAILABILITY: Server: http://toolkit.tuebingen.mpg.de/hhrepid; Executables: ftp://ftp.tuebingen.mpg.de/pub/protevo/HHrepID

16.
We assess the variability of protein function in protein sequence and structure space. Various regions in this space exhibit considerable difference in the local conservation of molecular function. We analyze and capture local function conservation by means of logistic curves. Based on this analysis, we propose a method for predicting molecular function of a query protein with known structure but unknown function. The prediction method is rigorously assessed and compared with a previously published function predictor. Furthermore, we apply the method to 500 functionally unannotated PDB structures and discuss selected examples. The proposed approach provides a simple yet consistent statistical model for the complex relations between protein sequence, structure, and function. The GOdot method is available online (http://godot.bioinf.mpi-inf.mpg.de).

17.
MOTIVATION: There is a need for software to set up and analyze complex mathematical models for cellular systems in a modular way that also integrates the experimental environment of the cells. RESULTS: A computer framework is described which allows the building of modularly structured models using an abstract, modular and general modeling methodology. With this methodology, reusable modeling entities are introduced which lead to the development of a modeling library within the modeling tool ProMot. The simulation environment Diva is used for numerical analysis and parameter identification of the models. The simulation environment provides a number of tools and algorithms to simulate and analyze complex biochemical networks. The described tools are the first steps towards an integrated computer-based modeling, simulation and visualization environment. AVAILABILITY: Available on request to the authors. The software itself is free for scientific purposes but requires commercial libraries. SUPPLEMENTARY INFORMATION: http://www.mpi-magdeburg.mpg.de/projects/promot

18.
SUMMARY: Mixture models of mutagenetic trees constitute a class of probabilistic models for describing evolutionary processes that are characterized by the accumulation of permanent genetic changes. They have been applied to model the accumulation of chromosomal gains and losses in tumor development and the development of drug resistance-associated mutations in the HIV genome. Mtreemix is a software package for estimating mutagenetic trees mixture models from observed cross-sectional data and for using these models for predictions. We provide programs for model fitting, model selection, simulation, likelihood computation and waiting time estimation. AVAILABILITY: Mtreemix, including source code, documentation, sample data files and precompiled Solaris and Linux binaries, is freely available for non-commercial users at http://mtreemix.bioinf.mpi-sb.mpg.de/

19.
ROCR: visualizing classifier performance in R   Cited by: 2 (self-citations: 0, citations by others: 2)
SUMMARY: ROCR is a package for evaluating and visualizing the performance of scoring classifiers in the statistical language R. It features over 25 performance measures that can be freely combined to create two-dimensional performance curves. Standard methods for investigating trade-offs between specific performance measures are available within a uniform framework, including receiver operating characteristic (ROC) graphs, precision/recall plots, lift charts and cost curves. ROCR integrates tightly with R's powerful graphics capabilities, thus allowing for highly adjustable plots. Being equipped with only three commands and reasonable default values for optional parameters, ROCR combines flexibility with ease of use. AVAILABILITY: http://rocr.bioinf.mpi-sb.mpg.de. ROCR can be used under the terms of the GNU General Public License. Running within R, it is platform-independent. CONTACT: tobias.sing@mpi-sb.mpg.de.
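
What a receiver operating characteristic (ROC) graph for a scoring classifier contains can be shown with a few lines of Python. The sketch below is not ROCR and does not use its R interface; `roc_points` and `auc` are hypothetical helper names, labels are assumed to be coded as 1 (positive) and 0 (negative), and both classes are assumed to be present.

```python
def roc_points(scores, labels):
    """Return the ROC curve as (false positive rate, true positive rate)
    points, obtained by lowering the decision threshold over the scores."""
    pairs = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(1 for _, label in pairs if label == 1)
    n_neg = len(pairs) - n_pos  # assumes n_pos > 0 and n_neg > 0
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Handle tied scores together so ties yield one diagonal segment.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

Plotting the second coordinate against the first gives the ROC graph; other two-dimensional performance curves, such as precision/recall plots, follow the same pattern with different measures on the axes.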

20.
False discovery rate (FDR) methodologies are essential in the study of high-dimensional genomic and proteomic data. The R package 'fdrtool' facilitates such analyses by offering a comprehensive set of procedures for FDR estimation. Its distinctive features include: (i) many different types of test statistics are allowed as input data, such as P-values, z-scores, correlations and t-scores; (ii) simultaneously, both local FDR and tail area-based FDR values are estimated for all test statistics and (iii) empirical null models are fit where possible, thereby taking account of potential over- or underdispersion of the theoretical null. In addition, 'fdrtool' provides readily interpretable graphical output, and can be applied to very large scale (in the order of millions of hypotheses) multiple testing problems. Consequently, 'fdrtool' implements a flexible FDR estimation scheme that is unified across different test statistics and variants of FDR. AVAILABILITY: The program is freely available from the Comprehensive R Archive Network (http://cran.r-project.org/) under the terms of the GNU General Public License (version 3 or later). CONTACT: strimmer@uni-leipzig.de.
