1.
Background
With the number of available genome sequences increasing rapidly, the magnitude of sequence data required for multiple-genome analyses is a challenging problem. When large-scale rearrangements break the collinearity of gene orders among genomes, genome comparison algorithms must first identify sets of short well-conserved sequences present in each genome, termed anchors. Previously, anchor identification among multiple genomes has been achieved using pairwise alignment tools like BLASTZ through progressive alignment tools like TBA, but the computational requirements for sequence comparisons of multiple genomes quickly become a limiting factor as the number and scale of genomes grow.
Methodology/Principal Findings
Our algorithm, named Murasaki, makes it possible to identify anchors within multiple large sequences on the scale of several hundred megabases in a few minutes using a single CPU. Two advanced features of Murasaki are (1) adaptive hash function generation, which enables efficient use of arbitrary mismatch patterns (spaced seeds) and therefore the comparison of multiple mammalian genomes in a practical amount of computation time, and (2) parallelizable execution that decreases the required wall-clock and CPU times. Murasaki can perform a sensitive anchoring of eight mammalian genomes (human, chimp, rhesus, orangutan, mouse, rat, dog, and cow) in 21 hours CPU time (42 minutes wall time). This is the first single-pass in-core anchoring of multiple mammalian genomes. We evaluated Murasaki by comparing it with the genome alignment programs BLASTZ and TBA. We show that Murasaki can anchor multiple genomes in near linear time, compared to the quadratic time requirements of BLASTZ and TBA, while improving overall accuracy.
Conclusions/Significance
Murasaki provides an open source platform to take advantage of long patterns, cluster computing, and novel hash algorithms to produce accurate anchors across multiple genomes with computational efficiency significantly greater than existing methods. Murasaki is available under GPL at http://murasaki.sourceforge.net.
2.
3.
Patricio Jeraldo Krishna Kalari Xianfeng Chen Jaysheel Bhavsar Ashutosh Mangalam Bryan White Heidi Nelson Jean-Pierre Kocher Nicholas Chia 《PloS one》2014,9(12)
Motivation
16S rDNA hypervariable tag sequencing has become the de facto method for accessing microbial diversity. Illumina paired-end sequencing, which produces two separate reads for each DNA fragment, has become the platform of choice for this application. However, when the two reads do not overlap, existing computational pipelines analyze data from each read separately and underutilize the information contained in the paired-end reads.
Results
We created a workflow known as Illinois Mayo Taxon Organization from RNA Dataset Operations (IM-TORNADO) for processing non-overlapping reads while retaining maximal information content. Using synthetic mock datasets, we show that the use of both reads produced answers with greater correlation to those from full length 16S rDNA when looking at taxonomy, phylogeny, and beta-diversity.
Availability and Implementation
IM-TORNADO is freely available at http://sourceforge.net/projects/imtornado and produces BIOM format output for cross compatibility with other pipelines such as QIIME, mothur, and phyloseq.
4.
Christopher K Hobbs Michelle Leung Herbert H Tsang H Alexander Ebhardt 《BMC bioinformatics》2014,15(1)
Background
A typical affinity purification coupled to mass spectrometry (AP-MS) experiment includes the purification of a target protein (bait) using an antibody and subsequent mass spectrometry analysis of all proteins co-purifying with the bait (aka prey proteins). Like any other systems biology approach, AP-MS experiments generate a lot of data and visualization has been challenging, especially when integrating AP-MS experiments with orthogonal datasets.
Results
We present Circular Interaction Graph for Proteomics (CIG-P), which generates circular diagrams for a visually appealing final representation of AP-MS data. Through a Java based GUI, the user inputs experimental and reference data as a file in CSV format. The resulting circular representation can be manipulated live within the GUI before exporting the diagram as a vector graphic in PDF format. The strength of CIG-P is the ability to integrate orthogonal datasets with each other, e.g. affinity purification data of kinase PRPF4B in relation to the functional components of the spliceosome. Further, various AP-MS experiments can be compared to each other.
Conclusions
CIG-P helps present AP-MS data to a wider audience and we envision that the tool finds other applications too, e.g. kinase – substrate relationships as a function of perturbation. CIG-P is available at http://sourceforge.net/projects/cig-p/
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2105-15-344) contains supplementary material, which is available to authorized users.
5.
Background
High-throughput RNA interference (RNAi) screening has become a widely used approach to elucidating gene functions. However, analysis and annotation of large data sets generated from these screens have been a challenge for researchers without a programming background. Over the years, numerous data analysis methods were produced for plate quality control and hit selection and implemented by a few open-access software packages. Recently, strictly standardized mean difference (SSMD) has become a widely used method for RNAi screening analysis mainly due to its better control of false negative and false positive rates and its ability to quantify RNAi effects with a statistical basis. We have developed GUItars to enable researchers without a programming background to use SSMD as both a plate quality and a hit selection metric to analyze large data sets.
Results
The software is accompanied by an intuitive graphical user interface for easy and rapid analysis workflow. SSMD analysis methods have been provided to the users along with traditionally-used z-score, normalized percent activity, and t-test methods for hit selection. GUItars is capable of analyzing large-scale data sets from screens with or without replicates. The software is designed to automatically generate and save numerous graphical outputs known to be among the most informative high-throughput data visualization tools capturing plate-wise and screen-wise performances. Graphical outputs are also written in HTML format for easy access, and a comprehensive summary of screening results is written into tab-delimited output files.
Conclusion
With GUItars, we demonstrated a robust SSMD-based analysis workflow on a 3840-gene small interfering RNA (siRNA) library and identified 200 siRNAs that increased and 150 siRNAs that decreased the assay activities with moderate to stronger effects. GUItars enables rapid analysis and illustration of data from large- or small-scale RNAi screens using SSMD and other traditional analysis methods. The software is freely available at http://sourceforge.net/projects/guitars/.
6.
Background
Next-generation sequencing technologies are rapidly generating whole-genome datasets for an increasing number of organisms. However, phylogenetic reconstruction of genomic data remains difficult because de novo assembly for non-model genomes and multi-genome alignment are challenging.
Results
To greatly simplify the analysis, we present an Assembly and Alignment-Free (AAF) method (https://sourceforge.net/projects/aaf-phylogeny) that constructs phylogenies directly from unassembled genome sequence data, bypassing both genome assembly and alignment. Using mathematical calculations, models of sequence evolution, and simulated sequencing of published genomes, we address both evolutionary and sampling issues caused by direct reconstruction, including homoplasy, sequencing errors, and incomplete sequencing coverage. From these results, we calculate the statistical properties of the pairwise distances between genomes, allowing us to optimize parameter selection and perform bootstrapping. As a test case with real data, we successfully reconstructed the phylogeny of 12 mammals using raw sequencing reads. We also applied AAF to 21 tropical tree genome datasets with low coverage to demonstrate its effectiveness on non-model organisms.
Conclusion
Our AAF method opens up phylogenomics for species without an appropriate reference genome or high sequence coverage, and rapidly creates a phylogenetic framework for further analysis of genome structure and diversity among non-model organisms.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1647-5) contains supplementary material, which is available to authorized users.
7.
Ernest D Benavente Francesc Coll Nick Furnham Ruth McNerney Judith R Glynn Susana Campino Arnab Pain Fady R Mohareb Taane G Clark 《BMC bioinformatics》2015,16(1)
Background
Phylogenetic-based classification of M. tuberculosis and other bacterial genomes is a core analysis for studying evolutionary hypotheses, disease outbreaks and transmission events. Whole genome sequencing is providing new insights into the genomic variation underlying intra- and inter-strain diversity, thereby assisting with the classification and molecular barcoding of the bacteria. One roadblock to strain investigation is the lack of user-interactive solutions to interrogate and visualise variation within a phylogenetic tree setting.
Results
We have developed a web-based tool called PhyTB (http://pathogenseq.lshtm.ac.uk/phytblive/index.php) to assist phylogenetic tree visualisation and identification of M. tuberculosis clade-informative polymorphism. Variant Call Format files can be uploaded to determine a sample's position within the tree. A map view summarises the geographical distribution of alleles and strain-types. The utility of PhyTB is demonstrated on sequence data from 1,601 M. tuberculosis isolates.
Conclusion
PhyTB contextualises M. tuberculosis genomic variation within epidemiological, geographical and phylogenetic settings. Further tool utility is possible by incorporating large variants and phenotypic data (e.g. drug-resistance profiles), and an assessment of genotype-phenotype associations. Source code is available to develop similar websites for other organisms (http://sourceforge.net/projects/phylotrack).
8.
Background
Identifying genes with essential roles in resisting environmental stress rates high in agronomic importance. Although massive DNA microarray gene expression data have been generated for plants, current computational approaches underutilize these data for studying genotype-trait relationships. Some advanced gene identification methods have been explored for human diseases, but typically these methods have not been converted into publicly available software tools and cannot be applied to plants for identifying genes with agronomic traits.
Methodology
In this study, we used 22 sets of Arabidopsis thaliana gene expression data from GEO to predict the key genes involved in water tolerance. We applied an SVM-RFE (Support Vector Machine-Recursive Feature Elimination) feature selection method for the prediction. To address small sample sizes, we developed a modified approach for SVM-RFE by using bootstrapping and leave-one-out cross-validation. We also expanded our study to predict genes involved in water susceptibility.
Conclusions
We analyzed the top 10 genes predicted to be involved in water tolerance. Seven of them are connected to known biological processes in drought resistance. We also analyzed the top 100 genes in terms of their biological functions. Our study shows that the SVM-RFE method is a highly promising method in analyzing plant microarray data for studying genotype-phenotype relationships. The software is freely available with source code at http://ccst.jlu.edu.cn/JCSB/RFET/.
9.
10.
I Kuepfer C Schmid M Allan A Edielu EP Haary A Kakembo S Kibona J Blum C Burri 《PLoS neglected tropical diseases》2012,6(8):e1695
Objective
Assessment of the safety and efficacy of a 10-day melarsoprol schedule in second stage T.b. rhodesiense patients and the effect of suramin-pretreatment on the incidence of encephalopathic syndrome (ES) during melarsoprol therapy.
Design
Sequential conduct of a proof-of-concept trial (n = 60) and a utilization study (n = 78) using historic controls as comparator.
Setting
Two trial centres in the T.b. rhodesiense endemic regions of Tanzania and Uganda.
Participants
Consenting patients with confirmed second stage disease and a minimum age of 6 years were eligible for participation. Unconscious and pregnant patients were excluded.
Main Outcome Measures
The primary outcome measures were safety and efficacy at end of treatment. The secondary outcome measure was efficacy during follow-up after 3, 6 and 12 months.
Results
The incidence of ES in the trial population was 11.2% (CI 5–17%) and 13% (CI 9–17%) in the historic data. The respective case fatality rates were 8.4% (CI 3–13.8%) and 9.3% (CI 6–12.6%). All patients discharged alive were free of parasites at end of treatment. Twelve months after discharge, 96% of patients were clinically cured. The mean hospitalization time was reduced from 29 to 13 days (p<0.0001) per patient.
Conclusions
The 10-day melarsoprol schedule does not expose patients to a higher risk of ES or death than does treatment according to national schedules in current use. The efficacy of the 10-day melarsoprol schedule was highly satisfactory. No benefit could be attributed to the suramin pre-treatment.
Trial Registration
Current Controlled Trials ISRCTN40537886
11.
Speich B Ame SM Ali SM Alles R Hattendorf J Utzinger J Albonico M Keiser J 《PLoS neglected tropical diseases》2012,6(6):e1685
Background
The currently used anthelmintic drugs, in single oral application, have low efficacy against Trichuris trichiura infection, and hence novel anthelmintic drugs are needed. Nitazoxanide has been suggested as a potential drug candidate.
Methodology
The efficacy and safety of a single oral dose of nitazoxanide (1,000 mg), or albendazole (400 mg), and a nitazoxanide-albendazole combination (1,000 mg–400 mg), with each drug administered separately on two consecutive days, were assessed in a double-blind, randomized, placebo-controlled trial in two schools on Pemba, Tanzania. Cure and egg reduction rates were calculated by per-protocol analysis and by available case analysis. Adverse events were assessed and graded before treatment and four times after treatment.
Principal Findings
Complete data for the per-protocol analysis were available from 533 T. trichiura-positive children. Cure rates against T. trichiura were low regardless of the treatment (nitazoxanide-albendazole, 16.0%; albendazole, 14.5%; and nitazoxanide, 6.6%). Egg reduction rates were 54.9% for the nitazoxanide-albendazole combination, 45.6% for single albendazole, and 13.4% for single nitazoxanide. Similar cure and egg reduction rates were calculated using the available case analysis. Children receiving nitazoxanide had significantly more adverse events compared to placebo recipients. Most of the adverse events were mild and had resolved within 24 hours posttreatment.
Conclusions/Significance
Nitazoxanide shows no effect on T. trichiura infection. The low efficacy of albendazole against T. trichiura in the current setting characterized by high anthelmintic drug pressure is confirmed. There is a pressing need to develop new anthelmintics against trichuriasis.
Trial Registration
Controlled-Trials.com ISRCTN08336605
12.
Goodin DS Jones J Li D Traboulsee A Reder AT Beckmann K Konieczny A Knappertz V;and the -Year Long-Term Follow-up Study Investigators 《PloS one》2011,6(11):e22444
Context
Establishing the long-term benefit of therapy in chronic diseases has been challenging. Long-term studies require non-randomized designs and, thus, are often confounded by biases. For example, although disease-modifying therapy in MS has a convincing benefit on several short-term outcome-measures in randomized trials, its impact on long-term function remains uncertain.
Objective
Data from the 16-year Long-Term Follow-up study of interferon-beta-1b are used to assess the relationship between drug-exposure and long-term disability in MS patients.
Design/Setting
To mitigate the bias of outcome-dependent exposure variation in non-randomized long-term studies, drug-exposure was measured as the medication-possession-ratio, adjusted up or down according to multiple different weighting-schemes based on MS severity and MS duration at treatment initiation. A recursive-partitioning algorithm assessed whether exposure (using any weighting-scheme) affected long-term outcome. The optimal cut-point that was used to define “high” or “low” exposure-groups was chosen by the algorithm. Subsequent to verification of an exposure-impact that included all predictor variables, the two groups were compared using a weighted propensity-stratified analysis in order to mitigate any treatment-selection bias that may have been present. Finally, multiple sensitivity-analyses were undertaken using different definitions of long-term outcome and different assumptions about the data.
Main Outcome Measure
Long-Term Disability.
Results
In these analyses, the same weighting-scheme was consistently selected by the recursive-partitioning algorithm. This scheme reduced (down-weighted) the effectiveness of drug exposure as either disease duration or disability at treatment-onset increased. Applying this scheme and using propensity-stratification to further mitigate bias, high-exposure had a consistently better clinical outcome compared to low-exposure (Cox proportional hazard ratio = 0.30–0.42; p<0.0001).
Conclusions
Early initiation and sustained use of interferon-beta-1b have a beneficial impact on long-term outcome in MS. Our analysis strategy provides a methodological framework for bias-mitigation in the analysis of non-randomized clinical data.
Trial Registration
Clinicaltrials.gov NCT00206635
13.
Yaoliang Chen Ji Hong Wanyun Cui Jacques Zaneveld Wei Wang Richard Gibbs Yanghua Xiao Rui Chen 《PloS one》2013,8(4)
Background
Next generation sequencing platforms have greatly reduced sequencing costs, leading to the production of unprecedented amounts of sequence data. BWA is one of the most popular alignment tools due to its relatively high accuracy. However, mapping reads using BWA is still the most time consuming step in sequence analysis. Increasing mapping efficiency would allow the community to better cope with ever expanding volumes of sequence data.
Results
We designed a new program, CGAP-align, that achieves a performance improvement over BWA without sacrificing recall or precision. This is accomplished through the use of Suffix Tarray, a novel data structure combining elements of Suffix Array and Suffix Tree. We also utilize a tighter lower bound estimation for the number of mismatches in a read, allowing for more effective pruning during inexact mapping. Evaluation of both simulated and real data suggests that CGAP-align consistently outperforms the current version of BWA and can achieve over twice its speed under certain conditions, all while obtaining nearly identical results.
Conclusion
CGAP-align is a new time efficient read alignment tool that extends and improves BWA. The increase in alignment speed will be of critical assistance to all sequence-based research and medicine. CGAP-align is freely available to the academic community at http://sourceforge.net/p/cgap-align under the GNU General Public License (GPL).
14.
15.
Introduction
Fibromyalgia is difficult to treat and requires the use of multiple approaches. This study is a randomized controlled trial of qigong compared with a wait-list control group in fibromyalgia.
Methods
One hundred participants were randomly assigned to immediate or delayed practice groups, with the delayed group receiving training at the end of the control period. Qigong training (level 1 Chaoyi Fanhuan Qigong, CFQ), given over three half-days, was followed by weekly review/practice sessions for eight weeks; participants were also asked to practice at home for 45 to 60 minutes per day for this interval. Outcomes were pain, impact, sleep, physical function and mental function, and these were recorded at baseline, eight weeks, four months and six months. Immediate and delayed practice groups were analyzed individually compared to the control group, and as a combination group.
Results
In both the immediate and delayed treatment groups, CFQ demonstrated significant improvements in pain, impact, sleep, physical function and mental function when compared to the wait-list/usual care control group at eight weeks, with benefits extending beyond this time. Analysis of combined data indicated significant changes for all measures at all times for six months, with only one exception. Post-hoc analysis based on self-reported practice times indicated greater benefit with the per protocol group compared to minimal practice.
Conclusions
This study demonstrates that CFQ, a particular form of qigong, provides long-term benefits in several core domains in fibromyalgia. CFQ may be a useful adjuvant self-care treatment for fibromyalgia.
Trial registration
clinicaltrials.gov NCT00938834.
16.
Kaushalya C Amarasinghe Jason Li Sally M Hunter Georgina L Ryland Prue A Cowin Ian G Campbell Saman K Halgamuge 《BMC genomics》2014,15(1)
Background
Using whole exome sequencing to predict aberrations in tumours is a cost effective alternative to whole genome sequencing; however, it is predominantly used for variant detection and infrequently utilised for detection of somatic copy number variation.
Results
We propose a new method to infer copy number and genotypes using whole exome data from paired tumour/normal samples. Our algorithm uses two Hidden Markov Models to predict copy number and genotypes and computationally resolves polyploidy/aneuploidy, normal cell contamination and signal baseline shift. Our method explicitly detects chromosome-arm-level events, which are commonly found in tumour samples. The methods are combined into a package named ADTEx (Aberration Detection in Tumour Exome). We applied our algorithm to a cohort of 17 in-house generated and 18 TCGA paired ovarian cancer/normal exomes and evaluated the performance by comparing against the copy number variations and genotypes predicted using Affymetrix SNP 6.0 data of the same samples. Further, we carried out a comparison study to show that ADTEx outperformed its competitors in terms of precision and F-measure.
Conclusions
Our proposed method, ADTEx, uses both depth of coverage ratios and B allele frequencies calculated from whole exome sequencing data, to predict copy number variations along with their genotypes. ADTEx is implemented as a user friendly software package using Python and R statistical language. Source code and sample data are freely available under GNU license (GPLv3) at http://adtex.sourceforge.net/.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-732) contains supplementary material, which is available to authorized users.
17.
G Quattrocchi A Nicoletti B Marin E Bruno M Druet-Cabanac PM Preux 《PLoS neglected tropical diseases》2012,6(8):e1775
Objective
Human toxocariasis is a zoonotic infection caused by the larval stages of Toxocara canis (T. canis) and less frequently Toxocara cati (T. cati). A relationship between toxocariasis and epilepsy has been hypothesized. We conducted a systematic review and a meta-analysis of available data to evaluate the strength of association between epilepsy and Toxocara spp. seropositivity and to propose some guidelines for future surveys.
Data Sources
Electronic databases, the database from the Institute of Neuroepidemiology and Tropical Neurology of the University of Limoges (http://www-ient.unilim.fr/) and the reference lists of all relevant papers and books were screened up to October 2011.
Methods
We performed a systematic review of literature on toxocariasis (the exposure) and epilepsy (the outcome). Two authors independently assessed eligibility and study quality and extracted data. A common odds ratio (OR) was estimated using a random-effects meta-analysis model of aggregated published data.
Results
Seven case-control studies met the inclusion criteria, for a total of 1867 participants (850 cases and 1017 controls). The percentage of seropositivity (presence of anti-Toxocara spp. antibodies) was higher among people with epilepsy (PWE) in all the included studies, although the association between epilepsy and Toxocara spp. seropositivity was statistically significant in only 4 studies, with crude ORs ranging from 2.04 to 2.85. Another study bordered statistical significance, while in 2 of the included studies no significant association was found. A significant (p<0.001) common OR of 1.92 [95% confidence interval (CI) 1.50–2.44] was estimated. Similar results were found when meta-analysis was restricted to the studies considering an exclusively juvenile population and to surveys using Western Blot as confirmatory or diagnostic serological assay.
Conclusion
Our results support the existence of a positive association between Toxocara spp. seropositivity and epilepsy. Further studies, possibly including incident cases, should be performed to better investigate the relationship between toxocariasis and epilepsy.
18.
Plant and animal genomes are replete with large gene families, making the task of ortholog identification difficult and labor
intensive. OrthoRBH is an automated reciprocal blast pipeline tool enabling the rapid identification of specific gene families of
interest in related species, streamlining the collection of homologs prior to downstream molecular evolutionary analysis. The
efficacy of OrthoRBH is demonstrated with the identification of the 13-member PYR/PYL/RCAR gene family in Hordeum vulgare
using Oryza sativa query sequences. OrthoRBH runs on the Linux command line and is freely available at SourceForge.
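The abstract does not describe OrthoRBH's internals, so the following is only a minimal sketch of the general reciprocal-best-hit idea, assuming BLAST results have already been parsed into (query, subject, bit score) tuples. The function names, the tuple layout, and the sequence identifiers are hypothetical and are not taken from OrthoRBH.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical record layout: (query_id, subject_id, bit_score).
Hit = Tuple[str, str, float]


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to its highest-scoring subject (first hit wins ties)."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _score) in best.items()}


def reciprocal_best_hits(forward: Iterable[Hit],
                         reverse: Iterable[Hit]) -> List[Tuple[str, str]]:
    """Return sorted (a, b) pairs where b is a's best hit and a is b's best hit."""
    fwd = best_hits(forward)   # species A queries against species B
    rev = best_hits(reverse)   # species B queries against species A
    return sorted((a, b) for a, b in fwd.items() if rev.get(b) == a)


if __name__ == "__main__":
    # Made-up identifiers and scores, for illustration only.
    a_vs_b = [("rice_1", "barley_A", 310.0), ("rice_1", "barley_B", 120.0),
              ("rice_2", "barley_B", 250.0), ("rice_3", "barley_B", 90.0)]
    b_vs_a = [("barley_A", "rice_1", 305.0), ("barley_B", "rice_2", 248.0),
              ("barley_B", "rice_3", 95.0)]
    print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

In this toy input, rice_3 is dropped because its best hit, barley_B, points back to rice_2 instead. Ties and score thresholds are deliberately ignored here; a real pipeline would need a policy for both.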
Availability
http://sourceforge.net/projects/orthorbh/
19.
Background
Telephone helplines are frequently and repeatedly used by individuals with chronic mental health problems and web interventions may be an effective tool for reducing depression in this population.
Aim
To evaluate the effectiveness of a 6 week, web-based cognitive behaviour therapy (CBT) intervention with and without proactive weekly telephone tracking in the reduction of depression in callers to a helpline service.
Method
155 callers to a national helpline service with moderate to high psychological distress were recruited and randomised to receive either Internet CBT plus weekly telephone follow-up; Internet CBT only; weekly telephone follow-up only; or treatment as usual.
Results
Depression was lower in participants in the web intervention conditions both with and without telephone tracking compared to the treatment as usual condition both at post intervention and at 6 month follow-up. Telephone tracking provided by a lay telephone counsellor did not confer any additional advantage in terms of symptom reduction or adherence.
Conclusions
A web-based CBT program is effective both with and without telephone tracking for reducing depression in callers to a national helpline.
Trial Registration
Controlled-Trials.com ISRCTN93903959
20.
Djie Tjwan Thung Joep de Ligt Lisenka EM Vissers Marloes Steehouwer Mark Kroon Petra de Vries Eline P Slagboom Kai Ye Joris A Veltman Jayne Y Hehir-Kwa 《Genome biology》2014,15(10)
Mobile elements are major drivers in changing genomic architecture and can cause disease. The detection of mobile elements is hindered due to the low mappability of their highly repetitive sequences. We have developed an algorithm, called Mobster, to detect non-reference mobile element insertions in next generation sequencing data from both whole genome and whole exome studies. Mobster uses discordant read pairs and clipped reads in combination with consensus sequences of known active mobile elements. Mobster has a low false discovery rate and high recall rate for both L1 and Alu elements. Mobster is available at http://sourceforge.net/projects/mobster.
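The two kinds of read-level evidence named in this abstract, discordant read pairs and clipped reads, can be illustrated with a small sketch. It is not Mobster's implementation: the `ReadPair` record, the `max_insert` threshold of 600 bp, and the function names are assumptions made for the example, and mate orientation is ignored for brevity. The CIGAR handling follows the usual SAM convention that `S` marks soft-clipped bases and `H` hard-clipped bases.

```python
import re
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ReadPair:
    """Hypothetical minimal record of where the two mates of a pair mapped."""
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int


def is_discordant(pair: ReadPair, max_insert: int = 600) -> bool:
    """Flag a pair whose mates map to different chromosomes or too far apart.

    max_insert is an assumed library-specific threshold, not a Mobster default.
    """
    if pair.chrom1 != pair.chrom2:
        return True
    return abs(pair.pos2 - pair.pos1) > max_insert


_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clips(cigar: str) -> Tuple[int, int]:
    """Return (left, right) soft-clipped lengths from a SAM CIGAR string."""
    # Hard clips (H) sit outside soft clips, so drop them before inspecting ends.
    ops = [(int(n), op) for n, op in _CIGAR_OP.findall(cigar) if op != "H"]
    if not ops:
        return (0, 0)
    left = ops[0][0] if ops[0][1] == "S" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return (left, right)


if __name__ == "__main__":
    print(is_discordant(ReadPair("chr1", 1000, "chr1", 1350)))
    print(is_discordant(ReadPair("chr1", 1000, "chr7", 52000)))
    print(soft_clips("20S80M"))
```

Reads flagged this way would only be candidates; per the abstract, Mobster additionally compares such reads against consensus sequences of known active mobile elements, a step this sketch does not attempt.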