Similar articles
 Found 20 similar articles (search time: 609 ms)
1.
2.
Summary: DeconMSn accurately determines the monoisotopic mass and charge state of parent ions from high-resolution tandem mass spectrometry data, offering significant improvement for LTQ_FT and LTQ_Orbitrap instruments over the commercially delivered Thermo Fisher Scientific's extract_msn tool. Optimal parent ion mass tolerance values can be determined using accurate mass information, thus improving peptide identifications for high-mass measurement accuracy experiments. For low-resolution data from LCQ and LTQ instruments, DeconMSn incorporates a support-vector-machine-based charge detection algorithm that identifies the most likely charge of a parent species through peak characteristics of its fragmentation pattern. Availability: http://ncrr.pnl.gov/software/ or http://www.proteomicsresource.org/ Contact: rds{at}pnl.gov Supplementary information: PowerPoint presentation/Poster on http://ncrr.pnl.gov/software/. Associate Editor: Alfonso Valencia
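The relationship between isotopic peak spacing, charge state and monoisotopic mass can be illustrated with a minimal Python sketch. This is not DeconMSn's algorithm, which the abstract does not describe in detail; it only shows the textbook arithmetic, assuming positive-mode ions carrying `z` protons. The function names are hypothetical.

```python
# Illustrative only: textbook charge/mass arithmetic, not DeconMSn's method.
PROTON_MASS = 1.007276466      # Da, mass of a proton
ISOTOPE_SPACING = 1.0033548    # Da, mass difference between 13C and 12C


def charge_from_spacing(mz_peaks):
    """Estimate charge from the m/z gap between two adjacent isotopic peaks."""
    spacing = mz_peaks[1] - mz_peaks[0]
    # Adjacent isotopic peaks of a z-charged ion are about 1.00335/z apart.
    return round(ISOTOPE_SPACING / spacing)


def neutral_monoisotopic_mass(mz, charge):
    """Convert the monoisotopic m/z of an [M + zH]z+ ion to neutral mass."""
    return charge * (mz - PROTON_MASS)


if __name__ == "__main__":
    peaks = [500.0, 500.5017]          # hypothetical isotopic peaks
    z = charge_from_spacing(peaks)
    print(z, neutral_monoisotopic_mass(peaks[0], z))
```

In high-resolution data the isotopic peaks are resolved, so the spacing can be read directly; in low-resolution data it cannot, which is why the abstract mentions a separate classifier-based charge detection step.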

3.
Model-based deconvolution of genome-wide DNA binding   Total citations: 1, self-citations: 0, citations by others: 1
Motivation: Chromatin immunoprecipitation followed by hybridization to a genomic tiling microarray (ChIP-chip) is a routinely used protocol for localizing the genomic targets of DNA-binding proteins. The resolution to which binding sites in this assay can be identified is commonly considered to be limited by two factors: (1) the resolution at which the genomic targets are tiled in the microarray and (2) the large and variable lengths of the immunoprecipitated DNA fragments. Results: We have developed a generative model of binding sites in ChIP-chip data and an approach, MeDiChI, for efficiently and robustly learning that model from diverse data sets. We have evaluated MeDiChI's performance using simulated data, as well as on several diverse ChIP-chip data sets collected on widely different tiling array platforms for two different organisms (Saccharomyces cerevisiae and Halobacterium salinarium NRC-1). We find that MeDiChI accurately predicts binding locations to a resolution greater than that of the probe spacing, even for overlapping peaks, and can increase the effective resolution of tiling array data by a factor of 5x or better. Moreover, the method's performance on simulated data provides insights into effectively optimizing the experimental design for increased binding site localization accuracy and efficacy. Availability: MeDiChI is available as an open-source R package, including all data, from http://baliga.systemsbiology.net/medichi. Contact: dreiss{at}systemsbiology.org Supplementary information: Supplementary data are available at Bioinformatics online. Associate Editor: Martin Bishop

4.
Motivation: In searching for differentially expressed (DE) genes in microarray data, we often observe a fraction of the genes to have unequal variability between groups. This is not an issue in large samples, where a valid test exists that uses individual variances separately. The problem arises in the small-sample setting, where the approximately valid Welch test lacks sensitivity, while the more sensitive moderated t-test assumes equal variance. Methods: We introduce a moderated Welch test (MWT) that allows unequal variance between groups. It is based on (i) weighting of pooled and unpooled standard errors and (ii) improved estimation of the gene-level variance that exploits the information from across the genes. Results: When a non-trivial proportion of genes has unequal variability, false discovery rate (FDR) estimates based on the standard t and moderated t-tests are often too optimistic, while the standard Welch test has low sensitivity. The MWT is shown to (i) perform better than the standard t, the standard Welch and the moderated t-tests when the variances are unequal between groups and (ii) perform similarly to the moderated t, and better than the standard t and Welch tests when the group variances are equal. These results mean that MWT is more reliable than other existing tests over a wider range of data conditions. Availability: R package to perform MWT is available at http://www.meb.ki.se/~yudpaw Contact: yudi.pawitan{at}ki.se Supplementary information: Supplementary data are available at Bioinformatics online. Associate Editor: Martin Bishop
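To make the two ingredients named above concrete, the following Python sketch computes the standard pooled (equal-variance) t statistic and the standard Welch statistic with its Welch-Satterthwaite degrees of freedom for a single gene. It does not implement the MWT itself: the abstract does not give the weighting scheme or the variance moderation, so neither is reproduced here. The function names are hypothetical.

```python
import math
from statistics import mean, variance


def pooled_t(x, y):
    """Standard two-sample t statistic assuming equal group variances."""
    nx, ny = len(x), len(y)
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    se = math.sqrt(sp2 * (1.0 / nx + 1.0 / ny))
    return (mean(x) - mean(y)) / se, nx + ny - 2


def welch_t(x, y):
    """Welch statistic: unpooled standard error, Satterthwaite df."""
    nx, ny = len(x), len(y)
    ax, ay = variance(x) / nx, variance(y) / ny
    se2 = ax + ay
    t = (mean(x) - mean(y)) / math.sqrt(se2)
    df = se2 ** 2 / (ax ** 2 / (nx - 1) + ay ** 2 / (ny - 1))
    return t, df


if __name__ == "__main__":
    group_a = [1.0, 2.0, 3.0, 4.0]   # hypothetical expression values
    group_b = [2.0, 4.0, 6.0, 8.0]
    print(pooled_t(group_a, group_b))
    print(welch_t(group_a, group_b))
```

With equal group sizes the two statistics coincide and only the degrees of freedom differ; the lower Welch degrees of freedom are one source of the sensitivity loss in small samples that the abstract describes.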

5.
Motivation: Genomes contain biologically significant information that extends beyond that encoded in genes. Some of this information relates to various short dispersed repeats distributed throughout the genome. The goal of this work was to combine tools for detection of statistically significant dispersed repeats in DNA sequences with tools to aid development of hypotheses regarding their possible physiological functions in an easy-to-use web-based environment. Results: Ab Initio Motif Identification Environment (AIMIE) was designed to facilitate investigations of dispersed sequence motifs in prokaryotic genomes. We used AIMIE to analyze the Escherichia coli and Haemophilus influenzae genomes in order to demonstrate the utility of the new environment. AIMIE detected repeated extragenic palindrome (REP) elements, CRISPR repeats, uptake signal sequences, intergenic dyad sequences and several other over-represented sequence motifs. Distributional patterns of these motifs were analyzed using the tools included in AIMIE. Availability: AIMIE and the related software can be accessed at our web site http://www.cmbl.uga.edu/software.html. Contact: mrazek{at}uga.edu Associate Editor: Alex Bateman

6.
Motivation: Genomic methylation analysis is useful for typing bacteria that have a high number of expressed type II methyltransferases. Methyltransferases are usually committed to Restriction and Modification (R-M) systems, in which the restriction endonuclease imposes high pressure on the expression of the cognate methyltransferase, which hinders R-M system loss. Conventional clustering methods do not reflect this tendency. An algorithm was developed for dendrogram construction reflecting the propensity for conservation of R-M Type II systems. Results: The new algorithm was applied to 52 Helicobacter pylori strains from different geographical regions and compared with conventional clustering methods. The algorithm works by first grouping strains that share a common minimum set of R-M systems and then gradually adding strains according to the number of R-M systems acquired. Dendrograms revealed a cluster of African strains, which suggests that R-M systems have been present in the H. pylori genome since its human host migrated from Africa. Availability: The software files are available at http://www.ff.ul.pt/paginas/jvitor/Bioinformatics/MCRM_algorithm.zip Contact: filipavale{at}fe.ucp.pt Supplementary information: Supplementary data are available at Bioinformatics online. Associate Editor: Martin Bishop

7.
Motivation: A plethora of alignment tools have been created that are designed to best fit different types of alignment conditions. While some of these are made for aligning Illumina Sequence Analyzer reads, none of them fully utilizes its probability (prb) output. In this article, we introduce a new alignment approach (Slider) that reduces the alignment problem space by utilizing each read base's probabilities given in the prb files. Results: Compared with other aligners, Slider has higher alignment accuracy and efficiency. In addition, given that Slider matches bases with probabilities other than the most probable, it significantly reduces the percentage of base mismatches. The result is that its SNP predictions are more accurate than other SNP prediction approaches used today that start from the most probable sequence, including those using base quality. Contact: nmalhis{at}bcgsc.ca Supplementary information and availability: http://www.bcgsc.ca/platform/bioinfo/software/slider Associate Editor: Dmitrij Frishman

8.
Summary: FAMHAP is an established software for haplotype association analysis of nuclear families. We have released a major update that comprises various new features for case-control data. Furthermore, we provide an additional program runFamhap that allows users to start the same method repeatedly for varying sets of genetic markers. In addition, a platform-independent graphical user interface (GUI) was developed to simplify the usage of both FAMHAP and runFamhap. The runFamhap program greatly facilitates the application of FAMHAP to genome-wide association studies (GWAS) and supports flexible genome-wide haplotype analysis. As an example, we describe application to HapMap data. Availability: The software is available at http://famhap.meb.uni-bonn.de Contact: herold{at}imbie.meb.uni-bonn.de; becker{at}imbie.meb.uni-bonn.de Supplementary information: Supplementary data are available at Bioinformatics online. Associate Editor: Alex Bateman

9.
10.
Summary: Using literature databases one can find not only known and true relations between processes but also less studied, non-obvious associations. The main problem with discovering this type of relevant biological information is ‘selection’. The ability to distinguish between a true correlation (e.g. between different types of biological processes) and random chance that this correlation is statistically significant is crucial for any bio-medical research, literature mining being no exception. This problem is especially visible when searching for information which has not been studied and described in many publications. Therefore, a novel bio-linguistic statistical method is required, capable of ‘selecting’ true correlations, even when they are low-frequency associations. In this article, we present such a statistical approach based on Z-score and implemented in a web-based application ‘e-LiSe’. Availability: The software is available at http://miron.ibb.waw.pl/elise/ Contact: piotr{at}ibb.waw.pl Supplementary information: Supplementary materials are available at http://miron.ibb.waw.pl/elise/supplementary/ Associate Editor: Alfonso Valencia
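As a rough illustration of how a Z-score can separate a real co-occurrence from chance, the Python sketch below scores how often two terms appear in the same document against a null model of independent occurrence. The null model (a hypergeometric distribution) and the function name are assumptions made for this example; the abstract does not specify the statistic used by e-LiSe.

```python
import math


def cooccurrence_z_score(n_docs, n_a, n_b, n_both):
    """Z-score of observed co-occurrences under an independence null.

    n_docs: total documents; n_a, n_b: documents mentioning term A / term B;
    n_both: documents mentioning both. Assumes a hypergeometric null model.
    """
    expected = n_a * n_b / n_docs
    variance = (n_a * n_b * (n_docs - n_a) * (n_docs - n_b)) / (
        n_docs ** 2 * (n_docs - 1)
    )
    return (n_both - expected) / math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical counts: a low-frequency pair in a 1000-document corpus.
    print(cooccurrence_z_score(n_docs=1000, n_a=50, n_b=40, n_both=10))
```

Because the score is scaled by the variance expected under chance, a pair seen together only a handful of times can still stand out when both terms are individually rare.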

11.
Summary: We developed an interactive gene ontology (GO) browser named GOTreePlus that superimposes annotation information over GO structures. It can facilitate the identification of important GO terms through interactive visualization of them in the GO structure. The interactive pie chart summarizing an annotation distribution for a selected GO term provides users with a succinct context-sensitive overview of their experimental results. We tested our GOTreePlus using a proteome profiling dataset obtained on differentiation of retinal pigment epithelial cells where 399 proteins were quantified. Availability: http://bioinformatics.cnmcresearch.org/GOTreePlus/ Contact: jseo{at}cnmcresearch.org Associate Editor: John Quackenbush

12.
MMG: a probabilistic tool to identify submodules of metabolic pathways   Total citations: 1, self-citations: 0, citations by others: 1
Motivation: A fundamental task in systems biology is the identification of groups of genes that are involved in the cellular response to particular signals. At its simplest level, this often reduces to identifying biological quantities (mRNA abundance, enzyme concentrations, etc.) which are differentially expressed in two different conditions. Popular approaches involve using t-test statistics, based on modelling the data as arising from a mixture distribution. A common assumption of these approaches is that the data are independent and identically distributed; however, biological quantities are usually related through a complex (weighted) network of interactions, and often the more pertinent question is which subnetworks are differentially expressed, rather than which genes. Furthermore, in many interesting cases (such as high-throughput proteomics and metabolomics), only very partial observations are available, resulting in the need for efficient imputation techniques. Results: We introduce Mixture Model on Graphs (MMG), a novel probabilistic model to identify differentially expressed submodules of biological networks and pathways. The method can easily incorporate information about weights in the network, is robust against missing data and can be easily generalized to directed networks. We propose an efficient sampling strategy to infer posterior probabilities of differential expression, as well as posterior probabilities over the model parameters. We assess our method on artificial data demonstrating significant improvements over standard mixture model clustering. Analysis of our model results on quantitative high-throughput proteomic data leads to the identification of biologically significant subnetworks, as well as the prediction of the expression level of a number of enzymes, some of which are then verified experimentally.
Availability: MATLAB code is available from http://www.dcs.shef.ac.uk/~guido/software.html Contact: guido{at}dcs.shef.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. Associate Editor: Jonathan Wren

13.
Motivation: High-density DNA microarrays provide us with useful tools for analyzing DNA and RNA comprehensively. However, the background signal caused by the non-specific binding (NSB) between probe and target makes it difficult to obtain accurate measurements. To remove the background signal, there is a set of background probes on Affymetrix Exon arrays to represent the amount of non-specific signals, and an accurate estimation of non-specific signals using these background probes is desirable for improvement of microarray analyses. Results: We developed a thermodynamic model of NSB on short nucleotide microarrays in which the NSBs are modeled by duplex formation of probes and multiple hypothetical targets. We fitted the observed signal intensities of the background probes with those expected by the model to obtain the model parameters. As a result, we found that the presented model can improve the accuracy of prediction of non-specific signals in comparison with previously proposed methods. This result will provide a useful method to correct for the background signal in oligonucleotide microarray analysis. Availability: The software is implemented in the R language and can be downloaded from our website (http://www-shimizu.ist.osaka-u.ac.jp/shimizu_lab/MSNS/). Contact: furusawa{at}ist.osaka-u.ac.jp Supplementary information: Supplementary data are available at Bioinformatics online. The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors. Associate Editor: Trey Ideker

14.
15.
Summary: Cross-mapping of gene and protein identifiers between different databases is a tedious and time-consuming task. To overcome this, we developed CRONOS, a cross-reference server that contains entries from five mammalian organisms presented by major gene and protein information resources. Sequence similarity analysis of the mapped entries shows that the cross-references are highly accurate. In total, up to 18 different identifier types can be used for identification of cross-references. The quality of the mapping could be improved substantially by exclusion of ambiguous gene and protein names which were manually validated. Organism-specific lists of ambiguous terms, which are valuable for a variety of bioinformatics applications like text mining, are available for download. Availability: CRONOS is freely available to non-commercial users at http://mips.gsf.de/genre/proj/cronos/index.html, web services are available at http://mips.gsf.de/CronosWSService/CronosWS?wsdl. Contact: brigitte.waegele{at}helmholtz-muenchen.de Supplementary information: Supplementary data are available at Bioinformatics online. The online Supplementary Material contains all figures and tables referenced by this article. Associate Editor: Martin Bishop

16.
Summary: Automated analysis of flow cytometry (FCM) data is essential for it to become successful as a high throughput technology. We believe that the principles of Trellis graphics can be adapted to provide useful visualizations that can aid such automation. In this article, we describe the R/Bioconductor package flowViz that implements such visualizations. Availability: flowViz is available as an R package from the Bioconductor project: http://bioconductor.org Contact: dsarkar{at}fhcrc.org Associate Editor: Olga Troyanskaya

17.
Summary: We present CellLine, a simulator of the dynamics of gene regulatory networks (GRN) in the cells of a lineage. From user-defined reactions and initial substance quantities, it generates cell lineages, i.e. genealogic pedigrees of cells related through mitotic division. Each cell's dynamics is driven by a delayed stochastic simulation algorithm (delayed SSA), allowing multiple time delayed reactions. The cells of the lineage can be individually subject to ‘perturbations’, such as gene deletion, duplication and mutation. External interventions, such as adding or removing a substance at a given moment, can be specified. Cell differentiation lineages, where differentiation is stochastically driven or externally induced, can be modeled as well. Finally, CellLine can generate and simulate the dynamics of multiple copies of any given cell of the lineage. As examples of CellLine use, we simulate the following systems: cell lineages containing a model of the P53-Mdm2 feedback loop, a differentiation lineage where each cell contains a 4 gene repressilator (a bistable circuit), a model of the differentiation of the cells of the retinal mosaic required for color vision in Drosophila melanogaster, where the differentiation pathway depends on one substance's concentration that is controlled by a stochastic process, and a 9 gene GRN to illustrate the advantage of using CellLine rather than simulating multiple independent cells, in cases where the cells of the lineage are dynamically correlated. Availability: The CellLine program, instructions and examples are available at http://www.cs.tut.fi/~sanchesr/CellLine/CellLine.html Contact: andre.sanchesribeiro{at}tut.fi Associate Editor: Limsoon Wong
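The idea of a delayed SSA can be sketched in a few lines of Python for a single cell. The example below is not CellLine's code. It is a minimal, assumed toy model with one gene whose protein product appears a fixed delay after the production event and then degrades. It follows the commonly described scheme of keeping a wait list of pending products and releasing one whenever its release time precedes the next sampled reaction. All names and parameter values are hypothetical.

```python
import heapq
import random


def delayed_ssa(k_prod, k_deg, delay, t_end, seed=0):
    """Toy delayed SSA: production (delayed release) and degradation.

    Assumes k_prod > 0 so the total propensity is never zero.
    Returns a list of (time, protein_count) pairs.
    """
    rng = random.Random(seed)
    t, protein = 0.0, 0
    pending = []                      # min-heap of product release times
    trajectory = [(t, protein)]
    while True:
        a_prod = k_prod               # propensity of starting a production
        a_deg = k_deg * protein       # propensity of degrading one protein
        a_total = a_prod + a_deg
        tau = rng.expovariate(a_total)
        if pending and pending[0] <= t + tau:
            # A delayed product matures first: jump there, discard tau.
            t = heapq.heappop(pending)
            if t > t_end:
                break
            protein += 1
        else:
            t += tau
            if t > t_end:
                break
            if rng.random() * a_total < a_prod:
                heapq.heappush(pending, t + delay)   # product appears later
            else:
                protein -= 1
        trajectory.append((t, protein))
    return trajectory


if __name__ == "__main__":
    traj = delayed_ssa(k_prod=5.0, k_deg=0.1, delay=2.0, t_end=50.0, seed=1)
    print(len(traj), traj[-1])
```

A lineage simulator would additionally copy or partition the state, including the wait list, at each division; that part is not shown here.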

18.
GENOME: a rapid coalescent-based whole genome simulator   Total citations: 1, self-citations: 0, citations by others: 1
Summary: GENOME proposes a rapid coalescent-based approach to simulate whole genome data. In addition to features of standard coalescent simulators, the program allows for recombination rates to vary along the genome and for flexible population histories. Within small regions, we have evaluated samples simulated by GENOME to verify that GENOME provides the expected LD patterns and frequency spectra. The program can be used to study the sampling properties of any statistic for a whole genome study. Availability: The program and C++ source code are available online at http://www.sph.umich.edu/csg/liang/genome/ Contact: lianglim{at}umich.edu Supplementary information: Supplementary data are available at Bioinformatics online. Associate Editor: Martin Bishop
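For readers unfamiliar with coalescent simulation, the following Python sketch draws the waiting times between coalescence events in the basic constant-population coalescent without recombination. It is only the textbook building block, not GENOME's algorithm, which additionally handles variable recombination rates and population histories. Times are in coalescent units, and the function names are hypothetical.

```python
import random


def coalescent_waiting_times(n_samples, seed=0):
    """Sample waiting times between successive coalescence events.

    While k lineages remain, the time to the next coalescence is
    exponential with rate k*(k-1)/2 (standard coalescent time units).
    """
    rng = random.Random(seed)
    times = []
    for k in range(n_samples, 1, -1):
        rate = k * (k - 1) / 2.0
        times.append(rng.expovariate(rate))
    return times


def expected_tmrca(n_samples):
    """Expected time to the most recent common ancestor: 2 * (1 - 1/n)."""
    return 2.0 * (1.0 - 1.0 / n_samples)


if __name__ == "__main__":
    waits = coalescent_waiting_times(10, seed=42)
    print(sum(waits), expected_tmrca(10))
```

Averaging `sum(waits)` over many seeds should approach `expected_tmrca(n)`, which is a simple sanity check of the kind the abstract describes for LD patterns and frequency spectra.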

19.
Motivation: Reliable structural modelling of protein–protein complexes has widespread application, from drug design to advancing our knowledge of protein interactions and function. This work addresses three important issues in protein–protein docking: implementing backbone flexibility, incorporating prior indications from experiment and bioinformatics, and providing public access via a server. 3D-Garden (Global And Restrained Docking Exploration Nexus), our benchmarked and server-ready flexible docking system, allows sophisticated programming of surface patches by the user via a facet representation of the interactors’ molecular surfaces (generated with the marching cubes algorithm). Flexibility is implemented as a weighted exhaustive conformer search for each clashing pair of molecular branches in a set of 5000 models filtered from around 340 000 initially. Results: In a non-global assessment, carried out strictly according to the protocols for number of models considered and model quality of the Critical Assessment of Protein Interactions (CAPRI) experiment, over the widely-used Benchmark 2.0 of 84 complexes, 3D-Garden identifies a set of ten models containing an acceptable or better model in 29/45 test cases, including one with large conformational change. In 19/45 cases an acceptable or better model is ranked first or second out of 340 000 candidates. Availability: http://www.sbg.bio.ic.ac.uk/3dgarden (server) Contact: v.lesk{at}ic.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. Associate Editor: Burkhard Rost

20.