Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
2.
Smilde et al. Bioinformatics (2005), 21(13); 3043–3048 The above paper by Smilde et al. inappropriately quotes results

3.
Motivation: Genomic methylation analysis is useful to type bacteria that have a high number of expressed type II methyltransferases. Methyltransferases are usually committed to Restriction and Modification (R-M) systems, in which the restriction endonuclease imposes high pressure on the expression of the cognate methyltransferase, which hinders R-M system loss. Conventional clustering methods do not reflect this tendency. An algorithm was developed for dendrogram construction reflecting the propensity for conservation of R-M Type II systems. Results: The new algorithm was applied to 52 Helicobacter pylori strains from different geographical regions and compared with conventional clustering methods. The algorithm works by first grouping strains that share a common minimum set of R-M systems and then gradually adding strains according to the number of R-M systems acquired. Dendrograms revealed a cluster of African strains, which suggests that R-M systems have been present in the H. pylori genome since its human host migrated from Africa. Availability: The software files are available at http://www.ff.ul.pt/paginas/jvitor/Bioinformatics/MCRM_algorithm.zip Contact: filipavale{at}fe.ucp.pt Supplementary information: Supplementary data are available at Bioinformatics online. Associate Editor: Martin Bishop

4.
5.
A fuzzy guided genetic algorithm for operon prediction   Total citations: 4, self-citations: 0, citations by others: 4
Motivation: The operon structure of the prokaryotic genome is a critical input for the reconstruction of regulatory networks at the whole genome level. As experimental methods for the detection of operons are difficult and time-consuming, efforts are being put into developing computational methods that can use available biological information to predict operons. Method: A genetic algorithm is developed to evolve a starting population of putative operon maps of the genome into progressively better predictions. Fuzzy scoring functions based on multiple criteria are used for assessing the ‘fitness’ of the newly evolved operon maps and guiding their evolution. Results: The algorithm organizes the whole genome into operons. The fuzzy guided genetic algorithm-based approach makes it possible to use diverse biological information like genome sequence data, functional annotations and conservation across multiple genomes, to guide the organization process. This approach does not require any prior training with experimental operons. The predictions from this algorithm for Escherichia coli K12 and Bacillus subtilis are evaluated against experimentally discovered operons for these organisms. The accuracy of the method is evaluated using an ROC (receiver operating characteristic) analysis. The area under the ROC curve is around 0.9, which indicates excellent accuracy. Contact: roschen_csir{at}rediffmail.com
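
The area under the ROC curve quoted in this abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function name `roc_auc` and the scores and labels below are hypothetical. It relies only on the standard fact that the AUC equals the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as one half.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    scores: predicted confidence for each item (hypothetical values here).
    labels: 1 for a true positive example, 0 for a negative example.
    """
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs ranked correctly; a tie counts as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical scores for five candidate predictions (not data from the paper).
example_scores = [0.9, 0.8, 0.7, 0.6, 0.2]
example_labels = [1, 1, 0, 1, 0]
example_auc = roc_auc(example_scores, example_labels)
```

The pairwise form is quadratic in the number of examples, which is acceptable for a small illustration; a rank-based formula would be the usual choice for large evaluation sets.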

6.
Variation at the leucine aminopeptidase (Lap), glucose phosphate isomerase (Gpi) and tetrazolium oxidase (To) loci was investigated in samples of three populations, Al-Mayana (MAY), Shigita (SH) and Mina Salman (MS), of Pinctada radiata from pearl oyster beds around Bahrain. The To locus was monomorphic. Significant Lap and Gpi heterozygote deficiencies were evident and it is suggested that these were generated by selection. The MS population, to the East of Bahrain, differed significantly in Gpi allele frequencies from both Northern populations (MAY, SH) and Nei's genetic identity indicates a close relationship between the Northern populations. Measurements of shell morphometrics were used both as ratios of one dimension to another, and as regressions of one dimension on another to examine relatedness between populations. These two morphometric approaches gave different results from each other and also differed from the electrophoretic data. It is concluded that estimates of relatedness in pearl oysters based on electrophoretic data will be more reliable than those based on shell shape. (Received 20 November 1990; accepted 12 April 1991)

7.
Motivation: Genomes contain biologically significant information that extends beyond that encoded in genes. Some of this information relates to various short dispersed repeats distributed throughout the genome. The goal of this work was to combine tools for detection of statistically significant dispersed repeats in DNA sequences with tools to aid development of hypotheses regarding their possible physiological functions in an easy-to-use web-based environment. Results: Ab Initio Motif Identification Environment (AIMIE) was designed to facilitate investigations of dispersed sequence motifs in prokaryotic genomes. We used AIMIE to analyze the Escherichia coli and Haemophilus influenzae genomes in order to demonstrate the utility of the new environment. AIMIE detected repeated extragenic palindrome (REP) elements, CRISPR repeats, uptake signal sequences, intergenic dyad sequences and several other over-represented sequence motifs. Distributional patterns of these motifs were analyzed using the tools included in AIMIE. Availability: AIMIE and the related software can be accessed at our web site http://www.cmbl.uga.edu/software.html. Contact: mrazek{at}uga.edu Associate Editor: Alex Bateman

8.
Summary: The suffix tree is one of the most fundamental data structures in string algorithms and biological sequence analysis. Unfortunately, when it comes to implementing those algorithms and applying them to real genomic sequences, the main memory size often becomes the bottleneck. This is easily explained by the fact that while a DNA sequence of length n from alphabet Σ = {A, C, G, T} can be stored in n log |Σ| = 2n bits, its suffix tree occupies O(n log n) bits. In practice, the size difference easily reaches a factor of 50. We provide an implementation of the compressed suffix tree very recently proposed by Sadakane (Theory of Computing Systems, in press). The compressed suffix tree occupies space proportional to the text size, i.e. O(n log |Σ|) bits, and supports all typical suffix tree operations with at most a log n factor slowdown. Our experiments show that, e.g. on a 10 MB DNA sequence, the compressed suffix tree takes 10% of the space of a normal suffix tree. Typical operations are slowed down by a factor of 60. Availability: The C++ implementation under GNU license is available at http://www.cs.helsinki.fi/group/suds/cst/. An example program implementing a typical pattern discovery task is included. Experimental results in this note correspond to version 0.95. Contact: vmakinen{at}cs.helsinki.fi
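
The 2n-bit figure for a DNA sequence follows from needing two bits to distinguish four symbols. The sketch below illustrates only that storage bound; it is not part of the cited C++ implementation and does not build a suffix tree. The names `pack_dna` and `unpack_dna` and the particular bit layout are assumptions made for this example.

```python
# Two-bit code for each base (an arbitrary but fixed assignment).
CODES = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack_dna(seq: str) -> bytes:
    """Pack a DNA string into ceil(2n / 8) bytes, four bases per byte."""
    out = bytearray((len(seq) + 3) // 4)
    for i, base in enumerate(seq):
        # Base i occupies bits 2*(i % 4) and 2*(i % 4) + 1 of byte i // 4.
        out[i // 4] |= CODES[base] << (2 * (i % 4))
    return bytes(out)


def unpack_dna(packed: bytes, n: int) -> str:
    """Recover the first n bases from a buffer produced by pack_dna."""
    return "".join(
        BASES[(packed[i // 4] >> (2 * (i % 4))) & 3] for i in range(n)
    )


packed = pack_dna("ACGTACGTAC")      # 10 bases -> 20 bits -> 3 bytes
restored = unpack_dna(packed, 10)    # the length is stored separately
```

The sequence length has to be kept alongside the packed buffer, because the final byte may be only partly used.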

9.
Motivation: Reliable structural modelling of protein–protein complexes has widespread application, from drug design to advancing our knowledge of protein interactions and function. This work addresses three important issues in protein–protein docking: implementing backbone flexibility, incorporating prior indications from experiment and bioinformatics, and providing public access via a server. 3D-Garden (Global And Restrained Docking Exploration Nexus), our benchmarked and server-ready flexible docking system, allows sophisticated programming of surface patches by the user via a facet representation of the interactors’ molecular surfaces (generated with the marching cubes algorithm). Flexibility is implemented as a weighted exhaustive conformer search for each clashing pair of molecular branches in a set of 5000 models filtered from around 340 000 initially. Results: In a non-global assessment, carried out strictly according to the protocols for number of models considered and model quality of the Critical Assessment of Protein Interactions (CAPRI) experiment, over the widely-used Benchmark 2.0 of 84 complexes, 3D-Garden identifies a set of ten models containing an acceptable or better model in 29/45 test cases, including one with large conformational change. In 19/45 cases an acceptable or better model is ranked first or second out of 340 000 candidates. Availability: http://www.sbg.bio.ic.ac.uk/3dgarden (server) Contact: v.lesk{at}ic.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. Associate Editor: Burkhard Rost

10.
Motivation: In searching for differentially expressed (DE) genes in microarray data, we often observe a fraction of the genes to have unequal variability between groups. This is not an issue in large samples, where a valid test exists that uses individual variances separately. The problem arises in the small-sample setting, where the approximately valid Welch test lacks sensitivity, while the more sensitive moderated t-test assumes equal variance. Methods: We introduce a moderated Welch test (MWT) that allows unequal variance between groups. It is based on (i) weighting of pooled and unpooled standard errors and (ii) improved estimation of the gene-level variance that exploits the information from across the genes. Results: When a non-trivial proportion of genes has unequal variability, false discovery rate (FDR) estimates based on the standard t and moderated t-tests are often too optimistic, while the standard Welch test has low sensitivity. The MWT is shown to (i) perform better than the standard t, the standard Welch and the moderated t-tests when the variances are unequal between groups and (ii) perform similarly to the moderated t, and better than the standard t and Welch tests when the group variances are equal. These results mean that MWT is more reliable than other existing tests over a wider range of data conditions. Availability: R package to perform MWT is available at http://www.meb.ki.se/~yudpaw Contact: yudi.pawitan{at}ki.se Supplementary information: Supplementary data are available at Bioinformatics online. Associate Editor: Martin Bishop
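
To make the contrast between pooled and unpooled standard errors concrete, the sketch below computes the two textbook statistics the abstract compares against: the standard (pooled, equal-variance) t statistic and the standard Welch statistic with its Welch–Satterthwaite degrees of freedom. It does not implement the moderated Welch test itself, since the abstract does not give the weighting or the variance moderation. The function names and sample values are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(x, y):
    """Welch statistic: unpooled standard error, approximate degrees of freedom."""
    n1, n2 = len(x), len(y)
    a, b = variance(x) / n1, variance(y) / n2   # per-group squared standard errors
    t = (mean(x) - mean(y)) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df


def pooled_t(x, y):
    """Standard two-sample t statistic: pooled variance, assumes equal variances."""
    n1, n2 = len(x), len(y)
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    t = (mean(x) - mean(y)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


# Hypothetical expression values for one gene in two groups of five samples.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]
t_welch, df_welch = welch_t(group_a, group_b)
t_pooled, df_pooled = pooled_t(group_a, group_b)
```

With equal group sizes the two statistics coincide and only the degrees of freedom differ, which is one way to see why the Welch test loses sensitivity in small samples: it pays for unequal variances with fewer effective degrees of freedom.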

11.
Motivation: Although the outbreak of the severe acute respiratory syndrome (SARS) is currently over, it is expected that it will return to attack human beings. A critical challenge to scientists from various disciplines worldwide is to study the specificity of cleavage activity of SARS-related coronavirus (SARS-CoV) and use the knowledge obtained from the study for effective inhibitor design to fight the disease. The most commonly used inductive programming methods for knowledge discovery from data assume that the elements of input patterns are orthogonal to each other. Suppose a sub-sequence is denoted as P2P1P1'P2'; the conventional inductive programming method may result in a rule like ‘if P1 = Q, then the sub-sequence is cleaved, otherwise non-cleaved’. If the site P1 is not orthogonal to the others (for instance, P2, P1' and P2'), the prediction power of this kind of rule may be limited. Therefore this study is aimed at developing a novel method for constructing non-orthogonal decision trees for mining protease data. Result: Eighteen sequences of coronavirus polyprotein were downloaded from NCBI (http://www.ncbi.nlm.nih.gov). Among these sequences, 252 cleavage sites were experimentally determined. These sequences were scanned using a sliding window with size k to generate about 50 000 k-mer sub-sequences (for short, k-mers). The value of k varies from 4 to 12 with a gap of two. The bio-basis function proposed by Thomson et al. is used to transform the k-mers to a high-dimensional numerical space on which an inductive programming method is applied for the purpose of deriving a decision tree for decision-making. The process of this transform is referred to as a bio-mapping. The constructed decision trees select about 10 out of 50 000 k-mers. This small set of selected k-mers is regarded as a set of decisive templates. By doing so, non-orthogonal decision trees are constructed using the selected templates and the prediction accuracy is significantly improved.
Availability: The program for bio-mapping can be obtained by request to the author. Contact: z.r.yang{at}exeter.ac.uk
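
The sliding-window scan described in this abstract can be sketched as follows. Only the window step is shown. The bio-basis function and the decision-tree construction are not reproduced, and the function name `sliding_kmers` and the short peptide string are placeholders rather than data from the study.

```python
def sliding_kmers(sequence: str, k: int) -> list:
    """Return every length-k sub-sequence obtained by sliding a window one residue at a time."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


# Window sizes as stated in the abstract: k from 4 to 12 in steps of two.
WINDOW_SIZES = list(range(4, 13, 2))

# Placeholder peptide used only to demonstrate the scan.
peptide = "AVLQSGFRKMAF"
kmers_by_size = {k: sliding_kmers(peptide, k) for k in WINDOW_SIZES}
```

A sequence of length n yields n - k + 1 windows for each k, so the number of k-mers grows roughly linearly with the total length of the scanned sequences.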

12.
Motivation: High-throughput experimental and computational methods are generating a wealth of protein–protein interaction data for a variety of organisms. However, data produced by current state-of-the-art methods include many false positives, which can hinder the analyses needed to derive biological insights. One way to address this problem is to assign confidence scores that reflect the reliability and biological significance of each interaction. Most previously described scoring methods use a set of likely true positives to train a model to score all interactions in a dataset. A single positive training set, however, may be biased and not representative of true interaction space. Results: We demonstrate a method to score protein interactions by utilizing multiple independent sets of training positives to reduce the potential bias inherent in using a single training set. We used a set of benchmark yeast protein interactions to show that our approach outperforms other scoring methods. Our approach can also score interactions across data types, which makes it more widely applicable than many previously proposed methods. We applied the method to protein interaction data from both Drosophila melanogaster and Homo sapiens. Independent evaluations show that the resulting confidence scores accurately reflect the biological significance of the interactions. Contact: rfinley{at}wayne.edu Supplementary information: Supplementary data are available at Bioinformatics Online. Associate Editor: Burkhard Rost

13.
Journal of Plankton Research, 8, 973–983, 1986 Fig. 2. Time-dependent changes in the gut content (percentage of initial ng pigment) of E. graciloides at different temperatures under simultaneous feeding. Fig. 4. The relationship between instantaneous evacuation rate and temperature of E. graciloides. The regression equation for feeding animals: y = 0.0044 e(0.141 ) (r² = 0.90). For comparison the results of non-feeding animals are indicated with open circles.

14.
Chorda tympani responses to sugars were greater in diabetic (db/db) than in non-diabetic control mice. A kinetic analysis suggested that the greater sugar responses in db/db mice were unlikely to be due to an increased number of sugar receptors. Chem. Senses 21: 59–63, 1996.

15.
Model-based deconvolution of genome-wide DNA binding   Total citations: 1, self-citations: 0, citations by others: 1
Motivation: Chromatin immunoprecipitation followed by hybridization to a genomic tiling microarray (ChIP-chip) is a routinely used protocol for localizing the genomic targets of DNA-binding proteins. The resolution to which binding sites in this assay can be identified is commonly considered to be limited by two factors: (1) the resolution at which the genomic targets are tiled in the microarray and (2) the large and variable lengths of the immunoprecipitated DNA fragments. Results: We have developed a generative model of binding sites in ChIP-chip data and an approach, MeDiChI, for efficiently and robustly learning that model from diverse data sets. We have evaluated MeDiChI's performance using simulated data, as well as on several diverse ChIP-chip data sets collected on widely different tiling array platforms for two different organisms (Saccharomyces cerevisiae and Halobacterium salinarium NRC-1). We find that MeDiChI accurately predicts binding locations to a resolution greater than that of the probe spacing, even for overlapping peaks, and can increase the effective resolution of tiling array data by a factor of 5x or better. Moreover, the method's performance on simulated data provides insights into effectively optimizing the experimental design for increased binding site localization accuracy and efficacy. Availability: MeDiChI is available as an open-source R package, including all data, from http://baliga.systemsbiology.net/medichi. Contact: dreiss{at}systemsbiology.org Supplementary information: Supplementary data are available at Bioinformatics online. Associate Editor: Martin Bishop

16.
CORRIGENDUM     
EAMES, F. E., 1968. New name for a Pakistan Eocene Turbonilla. Proc. malac. Soc. Lond. 38, 167. Line 4: For J. Conch., Lond. p. 95, read Ann. Mag. Nat. Hist. (6) (6), 95.

17.
Summary: We present In silico Biochemical Reaction Network Analysis (IBRENA), a software package which facilitates multiple functions including cellular reaction network simulation and sensitivity analysis (both forward and adjoint methods), coupled with principal component analysis, singular-value decomposition and model reduction. The software features a graphical user interface that aids simulation and plotting of in silico results. While the primary focus is to aid formulation, testing and reduction of theoretical biochemical reaction networks, the program can also be used for analysis of high-throughput genomic and proteomic data. Availability: The software package, manual and examples are available at http://www.eng.buffalo.edu/~neel/ibrena Contact: neel{at}eng.buffalo.edu Associate Editor: Limsoon Wong

18.
Motivation: A plethora of alignment tools have been created that are designed to best fit different types of alignment conditions. While some of these are made for aligning Illumina Sequence Analyzer reads, none of them fully utilizes its probability (prb) output. In this article, we introduce a new alignment approach (Slider) that reduces the alignment problem space by utilizing each read base's probabilities given in the prb files. Results: Compared with other aligners, Slider has higher alignment accuracy and efficiency. In addition, given that Slider matches bases with probabilities other than the most probable, it significantly reduces the percentage of base mismatches. The result is that its SNP predictions are more accurate than other SNP prediction approaches used today that start from the most probable sequence, including those using base quality. Contact: nmalhis{at}bcgsc.ca Supplementary information and availability: http://www.bcgsc.ca/platform/bioinfo/software/slider Associate Editor: Dmitrij Frishman

19.
20.