首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 46 毫秒
1.
2.
3.
Summary: DeconMSn accurately determines the monoisotopic massand charge state of parent ions from high-resolution tandemmass spectrometry data, offering significant improvement forLTQ_FT and LTQ_Orbitrap instruments over the commercially deliveredThermo Fisher Scientific's extract_msn tool. Optimal parention mass tolerance values can be determined using accurate massinformation, thus improving peptide identifications for high-massmeasurement accuracy experiments. For low-resolution data fromLCQ and LTQ instruments, DeconMSn incorporates a support-vector-machine-basedcharge detection algorithm that identifies the most likely chargeof a parent species through peak characteristics of its fragmentationpattern. Availability: http://ncrr.pnl.gov/software/ or http://www.proteomicsresource.org/ Contact: rds{at}pnl.gov Supplementary information: PowerPoint presentation/Poster onhttp://ncrr.pnl.gov/software/. Associate Editor: Alfonso Valencia  相似文献   

4.
Motivation: High-density DNA microarrays provide us with usefultools for analyzing DNA and RNA comprehensively. However, thebackground signal caused by the non-specific binding (NSB) betweenprobe and target makes it difficult to obtain accurate measurements.To remove the background signal, there is a set of backgroundprobes on Affymetrix Exon arrays to represent the amount ofnon-specific signals, and an accurate estimation of non-specificsignals using these background probes is desirable for improvementof microarray analyses. Results: We developed a thermodynamic model of NSB on shortnucleotide microarrays in which the NSBs are modeled by duplexformation of probes and multiple hypothetical targets. We fittedthe observed signal intensities of the background probes withthose expected by the model to obtain the model parameters.As a result, we found that the presented model can improve theaccuracy of prediction of non-specific signals in comparisonwith previously proposed methods. This result will provide auseful method to correct for the background signal in oligonucleotidemicroarray analysis. Availability: The software is implemented in the R languageand can be downloaded from our website (http://www-shimizu.ist.osaka-u.ac.jp/shimizu_lab/MSNS/). Contact: furusawa{at}ist.osaka-u.ac.jp Supplementary information: Supplementary data are availableat Bioinformatics online. The authors wish it to be known that, in their opinion, thefirst two authors should be regarded as joint First Authors. Associate Editor: Trey Ideker  相似文献   

5.
Motivation: The success of genome sequencing has resulted inmany protein sequences without functional annotation. We presentConFunc, an automated Gene Ontology (GO)-based protein functionprediction approach, which uses conserved residues to generatesequence profiles to infer function. ConFunc split sets of sequencesidentified by PSI-BLAST into sub-alignments according to theirGO annotations. Conserved residues are identified for each GOterm sub-alignment for which a position specific scoring matrixis generated. This combination of steps produces a set of feature(GO annotation) derived profiles from which protein functionis predicted. Results: We assess the ability of ConFunc, BLAST and PSI-BLASTto predict protein function in the twilight zone of sequencesimilarity. ConFunc significantly outperforms BLAST & PSI-BLASTobtaining levels of recall and precision that are not obtainedby either method and maximum precision 24% greater than BLAST.Further for a large test set of sequences with homologues oflow sequence identity, at high levels of presicision, ConFuncobtains recall six times greater than BLAST. These results demonstratethe potential for ConFunc to form part of an automated genomicsannotation pipeline. Availability: http://www.sbg.bio.ic.ac.uk/confunc Contact: m.sternberg{at}imperial.ac.uk Supplementary information: Supplementary data are availableat Bioinformatics online. Associate Editor: Dmitrij Frishman  相似文献   

6.
GENOME: a rapid coalescent-based whole genome simulator   总被引:1,自引:0,他引:1  
Summary: GENOME proposes a rapid coalescent-based approach tosimulate whole genome data. In addition to features of standardcoalescent simulators, the program allows for recombinationrates to vary along the genome and for flexible population histories.Within small regions, we have evaluated samples simulated byGENOME to verify that GENOME provides the expected LD patternsand frequency spectra. The program can be used to study thesampling properties of any statistic for a whole genome study. Availability: The program and C++ source code are availableonline at http://www.sph.umich.edu/csg/liang/genome/ Contact: lianglim{at}umich.edu Supplementary information: Supplementary data are availableat Bioinformatics online. Associate Editor: Martin Bishop  相似文献   

7.
8.
Summary: Cross-mapping of gene and protein identifiers betweendifferent databases is a tedious and time-consuming task. Toovercome this, we developed CRONOS, a cross-reference serverthat contains entries from five mammalian organisms presentedby major gene and protein information resources. Sequence similarityanalysis of the mapped entries shows that the cross-referencesare highly accurate. In total, up to 18 different identifiertypes can be used for identification of cross-references. Thequality of the mapping could be improved substantially by exclusionof ambiguous gene and protein names which were manually validated.Organism-specific lists of ambiguous terms, which are valuablefor a variety of bioinformatics applications like text miningare available for download. Availability: CRONOS is freely available to non-commercial usersat http://mips.gsf.de/genre/proj/cronos/index.html, web servicesare available at http://mips.gsf.de/CronosWSService/CronosWS?wsdl. Contact: brigitte.waegele{at}helmholtz-muenchen.de Supplementary information: Supplementary data are availableat Bioinformatics online. The online Supplementary Materialcontains all figures and tables referenced by this article. Associate Editor: Martin Bishop  相似文献   

9.
Motivation: Reliable structural modelling of protein–proteincomplexes has widespread application, from drug design to advancingour knowledge of protein interactions and function. This workaddresses three important issues in protein–protein docking:implementing backbone flexibility, incorporating prior indicationsfrom experiment and bioinformatics, and providing public accessvia a server. 3D-Garden (Global And Restrained Docking ExplorationNexus), our benchmarked and server-ready flexible docking system,allows sophisticated programming of surface patches by the uservia a facet representation of the interactors’ molecularsurfaces (generated with the marching cubes algorithm). Flexibilityis implemented as a weighted exhaustive conformer search foreach clashing pair of molecular branches in a set of 5000 modelsfiltered from around 340 000 initially. Results: In a non-global assessment, carried out strictly accordingto the protocols for number of models considered and model qualityof the Critical Assessment of Protein Interactions (CAPRI) experiment,over the widely-used Benchmark 2.0 of 84 complexes, 3D-Gardenidentifies a set of ten models containing an acceptable or bettermodel in 29/45 test cases, including one with large conformationalchange. In 19/45 cases an acceptable or better model is rankedfirst or second out of 340 000 candidates. Availability: http://www.sbg.bio.ic.ac.uk/3dgarden (server) Contact: v.lesk{at}ic.ac.uk Supplementary information: Supplementary data are availableat Bioinformatics online. Associate Editor: Burkhard Rost  相似文献   

10.
11.
MAGIC Tool: integrated microarray data analysis   总被引:5,自引:1,他引:4  
Summary: Several programs are now available for analyzing thelarge datasets arising from cDNA microarray experiments. Mostprograms are expensive commercial packages or require expensivethird party software. Some are freely available to academicresearchers, but are limited to one operating system. MicroArrayGenome Imaging and Clustering Tool (MAGIC Tool) is an open sourceprogram that works on all major platforms, and takes users ‘fromtiff to gif’. Several unique features of MAGIC Tool areparticularly useful for research and teaching. Availability: http://www.bio.davidson.edu/MAGIC Contact: laheyer{at}davidson.edu  相似文献   

12.
Motivation: The genomic methylation analysis is useful to typebacteria that have a high number of expressed type II methyltransferases.Methyltransferases are usually committed to Restriction andModification (R-M) systems, in which the restriction endonucleaseimposes high pressure on the expression of the cognate methyltransferasethat hinder R-M system loss. Conventional cluster methods donot reflect this tendency. An algorithm was developed for dendrogramconstruction reflecting the propensity for conservation of R-MType II systems. Results: The new algorithm was applied to 52 Helicobacter pyloristrains from different geographical regions and compared withconventional clustering methods. The algorithm works by firstgrouping strains that share a common minimum set of R-M systemsand gradually adds strains according to the number of the R-Msystems acquired. Dendrograms revealed a cluster of Africanstrains, which suggest that R-M systems are present in H.pylorigenome since its human host migrates from Africa. Availability: The software files are available at http://www.ff.ul.pt/paginas/jvitor/Bioinformatics/MCRM_algorithm.zip Contact: filipavale{at}fe.ucp.pt Supplementary information: Supplementary data are availableat Bioinformatics online. Associate Editor: Martin Bishop  相似文献   

13.
Summary: Using literature databases one can find not only knownand true relations between processes but also less studied,non-obvious associations. The main problem with discoveringsuch type of relevant biological information is ‘selection’.The ability to distinguish between a true correlation (e.g.between different types of biological processes) and randomchance that this correlation is statistically significant iscrucial for any bio-medical research, literature mining beingno exception. This problem is especially visible when searchingfor information which has not been studied and described inmany publications. Therefore, a novel bio-linguistic statisticalmethod is required, capable of ‘selecting’ truecorrelations, even when they are low-frequency associations.In this article, we present such statistical approach basedon Z-score and implemented in a web-based application ‘e-LiSe’. Availability: The software is available at http://miron.ibb.waw.pl/elise/ Contact: piotr{at}ibb.waw.pl Supplementary information: Supplementary materials are availableat http://miron.ibb.waw.pl/elise/supplementary/ Associate Editor: Alfonso Valencia  相似文献   

14.
Motivation: A plethora of alignment tools have been createdthat are designed to best fit different types of alignment conditions.While some of these are made for aligning Illumina SequenceAnalyzer reads, none of these are fully utilizing its probability(prb) output. In this article, we will introduce a new alignmentapproach (Slider) that reduces the alignment problem space byutilizing each read base's probabilities given in the prb files. Results: Compared with other aligners, Slider has higher alignmentaccuracy and efficiency. In addition, given that Slider matchesbases with probabilities other than the most probable, it significantlyreduces the percentage of base mismatches. The result is thatits SNP predictions are more accurate than other SNP predictionapproaches used today that start from the most probable sequence,including those using base quality. Contact: nmalhis{at}bcgsc.ca Supplementary information and availability: http://www.bcgsc.ca/platform/bioinfo/software/slider Associate Editor: Dmitrij Frishman  相似文献   

15.
Motivation: Inferring population structures using genetic datasampled from a group of individuals is a challenging task. Manymethods either consider a fixed population number or ignorethe correlation between populations. As a result, they can losesensitivity and specificity in detecting subtle stratifications.In addition, when a large number of genetic markers are used,many existing algorithms perform rather inefficiently. Result: We propose a new Bayesian method to infer populationstructures using multiple unlinked single nucleotide polymorphisms(SNPs). Our approach explicitly considers the population correlationthrough a tree hierarchy, and treat the population number asa random variable. Using both simulated and real datasets ofworldwide samples, we demonstrate that an incorporated treecan consistently improve the power in detecting subtle populationstratifications. A tree-based model often involves a large numberof unknown parameters, and the corresponding estimation procedurecan be highly inefficient. We further implement a partitionmethod to analytically integrate out all nuisance parametersin the tree. As a result, our method can analyze large SNP datasetswith significantly improved convergence rate. Availability: http://www.stat.psu.edu/~yuzhang/tips.tar Contact: yuzhang{at}stat.psu.edu Supplementary information: Supplementary data are availableat Bioinformatics online. Associate Editor: Keith Crandall  相似文献   

16.
SUMMARY: Here, we describe a tool for VARiability Analysis of DNA microarrays experiments (VARAN), a freely available Web server that performs a signal intensity based analysis of the log2 expression ratio variability deduced from DNA microarray data (one or two channels). Two modules are proposed: VARAN generator to compute a sliding windows analysis of the experimental variability (mean and SD) and VARAN analyzer to compare experimental data with an asymptotic variability model previously built with the generator module from control experiments. Both modules provide normalized intensity signals with five possible methods, log ratio values and a list of genes showing significant variations between conditions. AVAILABILITY: http://www.bionet.espci.fr/varan/ SUPPLEMENTARY INFORMATION: http://www.bionet.espci.fr/varan/help.html  相似文献   

17.
Motivation: The nucleotide sequencing process produces not onlythe sequence of nucleotides, but also associated quality values.Quality values provide valuable information, but are primarilyused only for trimming sequences and generally ignored in subsequentanalyses. Results: This article describes how the scoring schemes of standardalignment algorithms can be modified to take into account qualityvalues to produce improved alignments and statistically moreaccurate scores. A prototype implementation is also provided,and used to post-process a set of BLAST results. Quality-adjustedalignment is a natural extension of standard alignment methods,and can be implemented with only a small constant factor performancepenalty. The method can also be applied to related methods includingheuristic search algorithms like BLAST and FASTA. Availability: Software is available at http://malde.org/~ketil/qaa. Contact: ketil.malde{at}imr.no Supplementary information: Supplementary data are availableat Bioinformatics online. Associate Editor: Limsoon Wong  相似文献   

18.
19.
Motivation: Most genome-wide association studies rely on singlenucleotide polymorphism (SNP) analyses to identify causal loci.The increased stringency required for genome-wide analyses (withper-SNP significance threshold typically 10–7) meansthat many real signals will be missed. Thus it is still highlyrelevant to develop methods with improved power at low typeI error. Haplotype-based methods provide a promising approach;however, they suffer from statistical problems such as abundanceof rare haplotypes and ambiguity in defining haplotype blockboundaries. Results: We have developed an ancestral haplotype clustering(AncesHC) association method which addresses many of these problems.It can be applied to biallelic or multiallelic markers typedin haploid, diploid or multiploid organisms, and also handlesmissing genotypes. Our model is free from the assumption ofa rigid block structure but recognizes a block-like structureif it exists in the data. We employ a Hidden Markov Model (HMM)to cluster the haplotypes into groups of predicted common ancestralorigin. We then test each cluster for association with diseaseby comparing the numbers of cases and controls with 0, 1 and2 chromosomes in the cluster. We demonstrate the power of thisapproach by simulation of case-control status under a rangeof disease models for 1500 outcrossed mice originating fromeight inbred lines. Our results suggest that AncesHC has substantiallymore power than single-SNP analyses to detect disease association,and is also more powerful than the cladistic haplotype clusteringmethod CLADHC. Availability: The software can be downloaded from http://www.imperial.ac.uk/medicine/people/l.coin Contact: I.coin{at}imperial.ac.uk Supplementary Information: Supplementary data are availableat Bioinformatics online. Associate Editor: Martin Bishop  相似文献   

20.
Motivation: Mass spectrometry (MS), such as the surface-enhancedlaser desorption and ionization time-of-flight (SELDI-TOF) MS,provides a potentially promising proteomic technology for biomarkerdiscovery. An important matter for such a technology to be usedroutinely is its reproducibility. It is of significant interestto develop quantitative measures to evaluate the quality andreliability of different experimental methods. Results: We compare the quality of SELDI-TOF MS data using unfractionated,fractionated plasma samples and abundant protein depletion methodsin terms of the numbers of detected peaks and reliability. Severalstatistical quality-control and quality-assessment techniquesare proposed, including the Graeco–Latin square designfor the sample allocation on a Protein chip, the use of thepairwise Pearson correlation coefficient as the similarity measurebetween the spectra in conjunction with multi-dimensional scaling(MDS) for graphically evaluating similarity of replicates andassessing outlier samples; and the use of the reliability ratiofor evaluating reproducibility. Our results show that the numberof peaks detected is similar among the three sample preparationtechnologies, and the use of the Sigma multi-removal kit doesnot improve peak detection. Fractionation of plasma samplesintroduces more experimental variability. The peaks detectedusing the unfractionated plasma samples have the highest reproducibilityas determined by the reliability ratio. Availability: Our algorithm for assessment of SELDI-TOF experimentquality is available at http://www.biostat.harvard.edu/~xlin Contact: harezlak{at}post.harvard.edu Supplementary information: Supplementary data are availableat Bioinformatics online. Associate Editor: Thomas Lengauer  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号