1.
Arakawa K Suzuki H Fujishima K Fujimoto K Ueda S Matsui M Tomita M 《Genomics, Proteomics & Bioinformatics》2005,3(3):179-188
We have developed a comprehensive software suite for bioinformatics research of cDNAs; it is aimed at rapid characterization of the features of genes and the proteins they encode. Methods implemented include the detection of translation initiation and termination signals, statistical analysis of codon usage, comparative study of amino acid composition, comparative modeling of the structures of product proteins, prediction of alternative splice forms, and metabolic pathway reconstruction. The software package is freely available under the GNU General Public License at http://www.g-language.org/data/cdna/.
2.
DAMBE (data analysis in molecular biology and evolution) is an integrated software package for converting, manipulating, statistically and graphically describing, and analyzing molecular sequence data with a user-friendly Windows 95/98/2000/NT interface. DAMBE is free and can be downloaded from http://web.hku.hk/~xxia/software/software.htm. The current version is 4.0.36.
3.
Tom C. Freeman Sebastian Horsewell Anirudh Patir Josh Harling-Lee Tim Regan Barbara B. Shih James Prendergast David A. Hume Tim Angus 《PLoS computational biology》2022,18(7)
Graphia is an open-source platform created for the graph-based analysis of the huge amounts of quantitative and qualitative data currently being generated from the study of genomes, genes, proteins, metabolites and cells. Core to Graphia’s functionality is support for the calculation of correlation matrices from any tabular matrix of continuous or discrete values, whereupon the software is designed to rapidly visualise the often very large graphs that result in 2D or 3D space. Following graph construction, an extensive range of measurement algorithms, routines for graph transformation, and options for the visualisation of node and edge attributes are available for graph exploration and analysis. Combined, these provide a powerful solution for the interpretation of high-dimensional data from many sources, or data already in the form of a network or equivalent adjacency matrix. Several use cases of Graphia are described to showcase its wide range of applications in the analysis of biological data. Graphia runs on all major desktop operating systems, is extensible through the deployment of plugins and is freely available to download from https://graphia.app/.
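As a rough illustration of the correlation-graph idea in this abstract, the sketch below computes pairwise Pearson correlations between the rows of a small table and keeps an edge wherever the correlation reaches a threshold. It is not Graphia's implementation; the function names, the toy table, and the 0.9 threshold are assumptions made for the example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_graph(rows, threshold=0.9):
    """Return edges (name_a, name_b, r) for every row pair with r >= threshold."""
    edges = []
    for (name_a, x), (name_b, y) in combinations(rows.items(), 2):
        r = pearson(x, y)
        if r >= threshold:
            edges.append((name_a, name_b, r))
    return edges


# Hypothetical toy table: rows are genes, columns are samples.
table = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 6.0, 8.2],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
edges = correlation_graph(table, threshold=0.9)
for name_a, name_b, r in edges:
    print(f"{name_a} -- {name_b}  r={r:.4f}")
```

Only positively correlated pairs are kept here; a tool working with real data would also need to decide how to treat negative correlations and constant rows, which this sketch does not handle.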
4.
Malinowska A Kistowski M Bakun M Rubel T Tkaczyk M Mierzejewska J Dadlez M 《Journal of Proteomics》2012,75(13):4062-4073
Mass spectrometry-based global proteomics experiments generate large sets of data that can be converted into useful information only with an appropriate statistical approach. We present Diffprot, a software tool for statistical analysis of MS-derived quantitative data. With its implemented resampling-based statistical test and local variance estimate, Diffprot allows significant results to be drawn from small-scale experiments and effectively eliminates false positive results. To demonstrate the advantages of this software, we performed two spike-in tests with complex biological matrices, one label-free and one based on iTRAQ quantification; in addition, we performed an iTRAQ experiment on bacterial samples. In the spike-in tests, protein ratios were estimated and were in good agreement with theoretical values; statistical significance was assigned to spiked proteins, and single or no false positive results were obtained with Diffprot. We compared the performance of Diffprot with other statistical tests: the widely used t-test and the non-parametric Wilcoxon test. In contrast to Diffprot, both generated many false positive hits in the spike-in experiment. This proved the superiority of the resampling-based method in terms of specificity, making Diffprot a rational choice for small-scale high-throughput experiments, when the need to control the false positive rate is particularly pressing.
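The abstract does not spell out Diffprot's resampling procedure, so the following is only a generic sketch of a resampling-based test: an exact two-sample permutation test on the difference in group means. The function name and the toy intensity values are assumptions, and Diffprot's local variance estimate is not modeled.

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)

    def abs_mean_diff(sum_a):
        # Absolute difference between the mean of one group and the rest.
        return abs(sum_a / n_a - (total - sum_a) / n_b)

    observed = abs_mean_diff(sum(group_a))
    extreme = 0
    n_splits = 0
    # Enumerate every way of relabeling n_a of the pooled values as group A.
    for idx in combinations(range(len(pooled)), n_a):
        n_splits += 1
        if abs_mean_diff(sum(pooled[i] for i in idx)) >= observed - 1e-12:
            extreme += 1
    return extreme / n_splits


# Hypothetical protein abundance values for two small groups.
control = [1.0, 1.1, 0.9, 1.05]
treated = [2.0, 2.2, 1.9, 2.1]
p = permutation_p_value(control, treated)
print(f"permutation p-value: {p:.4f}")
```

Exact enumeration is feasible only for small groups (here 70 relabelings); larger designs would typically sample random permutations instead.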
5.
Background
Successful application of crosslinking combined with mass spectrometry for studying proteins and protein complexes requires specifically-designed crosslinking reagents, experimental techniques, and data analysis software. Using isotopically-coded ("heavy and light") versions of the crosslinker and cleavable crosslinking reagents is analytically advantageous for mass spectrometric applications and provides a "handle" that can be used to distinguish crosslinked peptides of different types, and to increase the confidence of the identification of the crosslinks.
6.
This set of applied programs, SPECSTAT, has been written in Turbo Pascal-5 and is adapted for an IBM compatible PC-XT/AT using
MS-DOS. SPECSTAT is a small software package (approximately 300K) which has high efficacy when performing calculations. SPECSTAT
provides some new computational opportunities not provided by the existing statistical packages for analyzing allozyme population
genetic data. Furthermore, it is able to carry out some simple transformations of quantitative traits (QT) and the genotypic
data matrix. These transformations are convenient for the investigation of QT and allele frequencies and for the analysis
of the correlations between QT and allozyme genotypic codes.
From the vectors of genotypic codes for each sample, the following parameters are computed: i) the allele frequencies with
standard errors (SE), ii) chi-squared values of goodness-of-fit for the observed and expected Hardy-Weinberg genotype frequencies
and iii) Ho, Hs and Fis values. The Ht, Fit, Fst, Dst′ and Fst′ statistics, chi-squared values in the heterogeneity test of the predominant alleles, and G-statistics of all-allelic heterogeneity
among samples with Williams' correction (Sokal & Rohlf, 1981) are also available. Calculations are performed to produce single-locus
and combined matrices of Nei's genetic distances Dn and Dm (standard and minimal unbiased), and the matrix of combined Kalabushkin's
similarity metrics, which maximize small differences. Other computational opportunities are provided as well.
Output data are simple ASCII files and are organized in a compact mode which can either be used directly for publication or
used after minor changes in further calculations.
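The single-sample quantities listed in this abstract (allele frequencies with standard errors, the Hardy-Weinberg goodness-of-fit chi-squared, Ho and Fis) can be sketched for one biallelic locus as follows. This is not SPECSTAT code; the function name and the genotype counts are hypothetical, the standard error is the plain binomial one, and the expected heterozygosity is the simple uncorrected estimate, which may differ from what SPECSTAT reports.

```python
from math import sqrt


def hardy_weinberg_summary(n_AA, n_Aa, n_aa):
    """Summary statistics for one biallelic locus from genotype counts."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)      # frequency of allele A
    q = 1 - p                            # frequency of allele a
    se_p = sqrt(p * q / (2 * n))         # binomial standard error of p

    # Expected genotype counts under Hardy-Weinberg proportions.
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_AA, n_Aa, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    h_obs = n_Aa / n                     # observed heterozygosity (Ho)
    h_exp = 2 * p * q                    # expected heterozygosity (uncorrected)
    f_is = 1 - h_obs / h_exp             # inbreeding coefficient (Fis)
    return {"p": p, "se_p": se_p, "chi2": chi2,
            "Ho": h_obs, "He": h_exp, "Fis": f_is}


# Hypothetical sample: 30 AA, 40 Aa, 30 aa individuals.
stats = hardy_weinberg_summary(30, 40, 30)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

A positive Fis, as in this toy sample, indicates fewer heterozygotes than Hardy-Weinberg proportions predict.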
7.
Makarenkov V Kevorkov D Zentilli P Gagarin A Malo N Nadon R 《Bioinformatics (Oxford, England)》2006,22(11):1408-1409
MOTIVATION: High-throughput screening (HTS) plays a central role in modern drug discovery, allowing for testing of >100,000 compounds per screen. The aim of our work was to develop and implement methods for minimizing the impact of systematic error in the analysis of HTS data. To the best of our knowledge, two new data correction methods included in HTS-Corrector are not available in any existing commercial software or freeware. RESULTS: This paper describes HTS-Corrector, a software application for the analysis of HTS data, detection and visualization of systematic error, and corresponding correction of HTS signals. Three new methods for the statistical analysis and correction of raw HTS data are included in HTS-Corrector: background evaluation, well correction and hit-sigma distribution procedures intended to minimize the impact of systematic errors. We discuss the main features of HTS-Corrector and demonstrate the benefits of the algorithms.
8.
Keegan KP Trimble WL Wilkening J Wilke A Harrison T D'Souza M Meyer F 《PLoS computational biology》2012,8(6):e1002541
We provide a novel method, DRISEE (duplicate read inferred sequencing error estimation), to assess sequencing quality (alternatively referred to as "noise" or "error") within and/or between sequencing samples. DRISEE provides positional error estimates that can be used to inform read trimming within a sample. It also provides global (whole sample) error estimates that can be used to identify samples with high or varying levels of sequencing error that may confound downstream analyses, particularly in the case of studies that utilize data from multiple sequencing samples. For shotgun metagenomic data, we believe that DRISEE provides estimates of sequencing error that are more accurate and less constrained by technical limitations than existing methods that rely on reference genomes or the use of scores (e.g. Phred). Here, DRISEE is applied to (non amplicon) data sets from both the 454 and Illumina platforms. The DRISEE error estimate is obtained by analyzing sets of artifactual duplicate reads (ADRs), a known by-product of both sequencing platforms. We present DRISEE as an open-source, platform-independent method to assess sequencing error in shotgun metagenomic data, and utilize it to discover previously uncharacterized error in de novo sequence data from the 454 and Illumina sequencing platforms.
9.
A comparison of statistical methods for analysis of high density oligonucleotide array data
Rajagopalan D 《Bioinformatics (Oxford, England)》2003,19(12):1469-1476
10.
The EpiGRAPH web service enables biologists to uncover hidden associations in vertebrate genome and epigenome datasets. Users can upload sets of genomic
regions and EpiGRAPH will test multiple attributes (including DNA sequence, chromatin structure, epigenetic modifications
and evolutionary conservation) for enrichment or depletion among these regions. Furthermore, EpiGRAPH learns to predictively
identify similar genomic regions. This paper demonstrates EpiGRAPH's practical utility in a case study on monoallelic gene
expression and describes its novel approach to reproducible bioinformatic analysis.
11.
Hill EG Schwacke JH Comte-Walters S Slate EH Oberg AL Eckel-Passow JE Therneau TM Schey KL 《Journal of proteome research》2008,7(8):3091-3101
We describe biological and experimental factors that induce variability in reporter ion peak areas obtained from iTRAQ experiments. We demonstrate how these factors can be incorporated into a statistical model for use in evaluating differential protein expression and highlight the benefits of using analysis of variance to quantify fold change. We demonstrate the model's utility based on an analysis of iTRAQ data derived from a spike-in study.
12.
Cyclops is a new computer program designed as a graphical front-end that allows easy control and interaction with tasks and programs for 3D reconstruction of biological complexes using cryo-electron microscopy. Cyclops' current plug-ins are designed for automated particle picking and include two new algorithms, automated carbon masking and quaternion based rotation space sampling, which are also presented here. Additional plug-ins are in the pipeline. Cyclops allows straightforward organization and visualization of all data and tasks and allows both interactive and batch-wise processing. Furthermore, it was designed for straightforward implementation in grid architectures. As a front-end to a collection of programs it provides a common interface to these programs, thus enhancing the usability of the suite and the productivity of the user.
13.
《Chirality》2017,29(5):178-192
The program CDSpecTech was developed to facilitate the analysis of chiroptical spectra, which include the following: vibrational circular dichroism (VCD) and corresponding vibrational absorption (VA) spectra; vibrational Raman optical activity (VROA) and corresponding vibrational Raman spectra; electronic circular dichroism (ECD) and corresponding electronic absorption (EA) spectra. In addition, the program allows for generating optical rotatory dispersion (ORD) as the Kramers–Kronig transform of ECD spectra. The simulation of theoretical spectra from transition strengths can be achieved using different bandshape profiles. The experimental and simulated theoretical spectra can be visually compared by displaying them together. A unique feature of CDSpecTech is performing spectral analysis using the ratio spectra; i.e., the dimensionless dissymmetry factor (DF) spectrum, which is the ratio of CD to absorption spectra, and the dimensionless circular intensity difference (CID) spectrum, which is the ratio of VROA to vibrational Raman spectra. The quantitative agreement between experimental and simulated theoretical spectra can also be assessed from the numerical similarity overlap between them. Two different similarity overlap methods are available. The program uses a graphical user interface which allows for ease of use and facilitates the analysis. All these features make CDSpecTech a valuable tool for the analysis of chiroptical spectra. The program is freely available on the World Wide Web.
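The ratio-spectrum and similarity ideas in this abstract can be sketched as follows. The dissymmetry factor is taken directly from the abstract's definition (CD divided by absorption); the similarity measure shown is one common normalized overlap and is an assumption, since the abstract does not say which two overlap methods CDSpecTech provides. Function names, the `min_abs` cutoff, and the spectra are hypothetical.

```python
from math import sqrt


def dissymmetry_factor(cd, absorption, min_abs=1e-6):
    """Pointwise ratio CD / absorption; None where absorption is near zero."""
    return [
        c / a if abs(a) > min_abs else None
        for c, a in zip(cd, absorption)
    ]


def similarity_overlap(f, g):
    """Normalized overlap of two spectra on the same grid (range -1 to 1)."""
    num = sum(a * b for a, b in zip(f, g))
    den = sqrt(sum(a * a for a in f) * sum(b * b for b in g))
    return num / den


# Hypothetical spectra sampled on a shared grid.
experimental_cd = [0.0, 1.0e-4, -2.0e-4, 0.5e-4]
simulated_cd = [0.0, 0.9e-4, -2.1e-4, 0.6e-4]
absorption = [0.0, 0.5, 1.0, 0.25]

df = dissymmetry_factor(experimental_cd, absorption)
sim = similarity_overlap(experimental_cd, simulated_cd)
print("DF spectrum:", df)
print(f"similarity overlap: {sim:.4f}")
```

Masking points where the absorption is close to zero avoids dividing by noise, which is the usual practical concern when working with ratio spectra.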
14.
Verde PE Geracitano LA Amado LL Rosa CE Bianchini A Monserrat JM 《Mutation research》2006,604(1-2):71-82
A novel approach for statistical analysis of comet assay data (i.e., tail moment) is proposed, employing public-domain statistical software, the R system. The analytical strategy takes into account that the distribution of comet assay data, like the tail moment, is usually skewed and does not follow a normal distribution. Probability distributions used to model comet assay data included the Weibull, the exponential, the logistic, the normal, the log-normal and the log-logistic distributions. In this approach it was also considered that heterogeneity observed among experimental units is a random feature of the comet assay data. This statistical model can be characterized with a location parameter m(ij), a scale parameter r and a between-experimental-units variability parameter theta. In the logarithmic scale, the parameter m(ij) depends additively on treatment and random effects, as follows: log(m(ij)) = a0 + a1 x(ij) + b(i), where exp(a0) represents approximately the mean value of the control group, exp(a1) can be interpreted as the relative risk of damage with respect to the control group, x(ij) is an indicator of experimental group and exp(b(i)) is the individual risk effect, assumed to follow a Gamma distribution with mean 1 and variance theta. Model selection is based on Akaike's information criterion (AIC). Real data coming from comet analysis of blood samples taken from the flounder Paralichthys orbignyanus (Teleostei: Paralichthyidae) and from samples of cell suspensions obtained from the estuarine polychaete Laeonereis acuta (Nereididae) were employed. This statistical approach showed that the comet assay data should be analyzed under a modeling framework that takes into account the important features of these measurements. Model selection and heterogeneity between experimental units play central roles in the analysis of these data.
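The AIC-based model selection step can be illustrated with a much simpler setting than the paper's: three of the candidate distributions (exponential, normal, log-normal) fitted to a single sample by closed-form maximum likelihood, with no treatment covariate and no Gamma random effect. The authors worked in R; this Python sketch, its function names, and the toy tail-moment values are assumptions for illustration only.

```python
from math import log, pi


def aic_exponential(x):
    """AIC of an exponential fit (1 parameter, rate = 1 / mean)."""
    n = len(x)
    rate = n / sum(x)
    loglik = n * log(rate) - rate * sum(x)
    return 2 * 1 - 2 * loglik


def aic_normal(x):
    """AIC of a normal fit (2 parameters, ML mean and variance)."""
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / n
    loglik = -0.5 * n * (log(2 * pi * var) + 1)
    return 2 * 2 - 2 * loglik


def aic_lognormal(x):
    """AIC of a log-normal fit (2 parameters, fitted on the log scale)."""
    logs = [log(v) for v in x]
    n = len(x)
    mu = sum(logs) / n
    var = sum((v - mu) ** 2 for v in logs) / n
    # The extra -sum(logs) term is the Jacobian of the log transform.
    loglik = -0.5 * n * (log(2 * pi * var) + 1) - sum(logs)
    return 2 * 2 - 2 * loglik


# Hypothetical, right-skewed tail-moment measurements.
tail_moment = [0.5, 0.8, 1.0, 1.2, 1.5, 2.0, 3.5, 8.0]
aics = {
    "exponential": aic_exponential(tail_moment),
    "normal": aic_normal(tail_moment),
    "lognormal": aic_lognormal(tail_moment),
}
best = min(aics, key=aics.get)
for name, value in sorted(aics.items(), key=lambda kv: kv[1]):
    print(f"{name}: AIC = {value:.2f}")
print("lowest AIC:", best)
```

The model with the lowest AIC is preferred; the penalty of 2 per parameter is what keeps more flexible distributions from winning by default.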
15.
Background
Microarray-CGH experiments are used to detect and map chromosomal imbalances, by hybridizing targets of genomic DNA from a test and a reference sample to sequences immobilized on a slide. These probes are genomic DNA sequences (BACs) that are mapped on the genome. The signal has a spatial coherence that can be handled by specific statistical tools. Segmentation methods seem to be a natural framework for this purpose. A CGH profile can be viewed as a succession of segments that represent homogeneous regions in the genome whose BACs share the same relative copy number on average. We model a CGH profile by a random Gaussian process whose distribution parameters are affected by abrupt changes at unknown coordinates. Two major problems arise: determining which parameters are affected by the abrupt changes (the mean and the variance, or the mean only), and selecting the number of segments in the profile.
16.
Valin Reja Alister Kwok Glenn Stone Linsong Yang Andreas Missel Christoph Menzel Brant Bassam 《Methods (San Diego, Calif.)》2010,50(4):S10-S14
Background: High resolution melting (HRM) is an emerging method for interrogating and characterizing DNA samples. An important aspect of this technology is data analysis. Traditional HRM curves can be difficult to interpret and the method has been criticized for lack of statistical interrogation and arbitrary interpretation of results. Methods: Here we report the basic principles and first applications of a new statistical approach to HRM analysis addressing these concerns. Our method allows automated genotyping of unknown samples coupled with formal statistical information on the likelihood that an unknown sample is of a known genotype (by discriminant analysis or “supervised learning”). It can also determine the assortment of alleles present (by cluster analysis or “unsupervised learning”) without a priori knowledge of the genotypes present. Conclusion: The new algorithms provide highly sensitive and specific auto-calling of genotypes from HRM data in both supervised and unsupervised analysis modes. The method is based on pure statistical interrogation of the data set with a high degree of standardization. The hypothesis-free unsupervised mode offers various possibilities for de novo HRM applications such as mutation discovery.
17.
SUMMARY: Among classical methods for module detection, SpaCEM(3) provides ad hoc algorithms that were shown to be particularly well adapted to specific features of biological data: high dimensionality, interactions between components (genes) and integrated treatment of missingness in observations. The software, currently in its version 2.0, is developed in C++ and can be used either via the command line or with the GUI under Linux and Windows environments. AVAILABILITY: The SpaCEM(3) software, documentation and datasets are available from http://spacem3.gforge.inria.fr/.
18.
Dyer RJ 《Molecular ecology resources》2009,9(1):110-113
The analysis of genetic marker data is increasingly being conducted in the context of the spatial arrangement of strata (e.g. populations), necessitating a more flexible set of analysis tools. GeneticStudio consists of four interacting programs: (i) Geno, a spreadsheet-like interface for the analysis of spatially explicit marker-based genetic variation; (ii) Graph, software for the analysis of Population Graph and network topologies; (iii) Manteller, a general-purpose program for matrix analysis; and (iv) SNPFinder, a program for identifying single nucleotide polymorphisms. The GeneticStudio suite is available as source code as well as binaries for OSX and Windows and is distributed under the GNU General Public License.
19.