Similar articles
20 similar articles found.
1.

Background

With the number of available genome sequences increasing rapidly, the magnitude of sequence data required for multiple-genome analyses is a challenging problem. When large-scale rearrangements break the collinearity of gene orders among genomes, genome comparison algorithms must first identify sets of short well-conserved sequences present in each genome, termed anchors. Previously, anchor identification among multiple genomes has been achieved using pairwise alignment tools like BLASTZ through progressive alignment tools like TBA, but the computational requirements for sequence comparisons of multiple genomes quickly become a limiting factor as the number and scale of genomes grow.

Methodology/Principal Findings

Our algorithm, named Murasaki, makes it possible to identify anchors within multiple large sequences on the scale of several hundred megabases in a few minutes using a single CPU. Two advanced features of Murasaki are (1) adaptive hash function generation, which enables efficient use of arbitrary mismatch patterns (spaced seeds) and therefore the comparison of multiple mammalian genomes in a practical amount of computation time, and (2) parallelizable execution that decreases the required wall-clock and CPU times. Murasaki can perform a sensitive anchoring of eight mammalian genomes (human, chimp, rhesus, orangutan, mouse, rat, dog, and cow) in 21 hours CPU time (42 minutes wall time). This is the first single-pass in-core anchoring of multiple mammalian genomes. We evaluated Murasaki by comparing it with the genome alignment programs BLASTZ and TBA. We show that Murasaki can anchor multiple genomes in near linear time, compared to the quadratic time requirements of BLASTZ and TBA, while improving overall accuracy.
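The idea of spaced-seed anchoring can be illustrated with a minimal sketch. This is not Murasaki's implementation: it uses a plain Python dictionary instead of an adaptive hash function, and the function names and the example pattern are hypothetical. A pattern such as "1101" means that positions marked 1 must match while positions marked 0 may differ.

```python
from collections import defaultdict

def spaced_keys(seq, pattern):
    """Yield (position, key) for each window; the key keeps only bases where pattern has '1'."""
    keep = [i for i, c in enumerate(pattern) if c == "1"]
    for pos in range(len(seq) - len(pattern) + 1):
        window = seq[pos:pos + len(pattern)]
        yield pos, "".join(window[i] for i in keep)

def find_anchor_candidates(seqs, pattern):
    """Return {key: [positions in seq 0, positions in seq 1, ...]} for keys found in every sequence."""
    index = defaultdict(lambda: [[] for _ in seqs])
    for n, seq in enumerate(seqs):
        for pos, key in spaced_keys(seq, pattern):
            index[key][n].append(pos)
    # Keep only keys that occur at least once in each input sequence.
    return {key: hits for key, hits in index.items() if all(hits)}

if __name__ == "__main__":
    print(find_anchor_candidates(["ACGTACGT", "ACCTACGA"], "1101"))
```

Because every sequence is indexed in a single pass, the work grows with total sequence length rather than with the number of sequence pairs, which is the intuition behind the near-linear behaviour described above.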

Conclusions/Significance

Murasaki provides an open source platform to take advantage of long patterns, cluster computing, and novel hash algorithms to produce accurate anchors across multiple genomes with computational efficiency significantly greater than existing methods. Murasaki is available under GPL at http://murasaki.sourceforge.net.

2.
3.

Motivation

16S rDNA hypervariable tag sequencing has become the de facto method for accessing microbial diversity. Illumina paired-end sequencing, which produces two separate reads for each DNA fragment, has become the platform of choice for this application. However, when the two reads do not overlap, existing computational pipelines analyze data from each read separately and underutilize the information contained in the paired-end reads.

Results

We created a workflow known as Illinois Mayo Taxon Organization from RNA Dataset Operations (IM-TORNADO) for processing non-overlapping reads while retaining maximal information content. Using synthetic mock datasets, we show that the use of both reads produced answers with greater correlation to those from full length 16S rDNA when looking at taxonomy, phylogeny, and beta-diversity.

Availability and Implementation

IM-TORNADO is freely available at http://sourceforge.net/projects/imtornado and produces BIOM format output for cross compatibility with other pipelines such as QIIME, mothur, and phyloseq.

4.

Background

A typical affinity purification coupled to mass spectrometry (AP-MS) experiment includes the purification of a target protein (bait) using an antibody and subsequent mass spectrometry analysis of all proteins co-purifying with the bait (aka prey proteins). Like any other systems biology approach, AP-MS experiments generate a lot of data and visualization has been challenging, especially when integrating AP-MS experiments with orthogonal datasets.

Results

We present Circular Interaction Graph for Proteomics (CIG-P), which generates circular diagrams for visually appealing final representation of AP-MS data. Through a Java based GUI, the user inputs experimental and reference data as files in CSV format. The resulting circular representation can be manipulated live within the GUI before exporting the diagram as a vector graphic in PDF format. The strength of CIG-P is the ability to integrate orthogonal datasets with each other, e.g. affinity purification data of kinase PRPF4B in relation to the functional components of the spliceosome. Further, various AP-MS experiments can be compared to each other.

Conclusions

CIG-P helps present AP-MS data to a wider audience, and we envision that the tool will find other applications too, e.g. kinase – substrate relationships as a function of perturbation. CIG-P is available at: http://sourceforge.net/projects/cig-p/

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-344) contains supplementary material, which is available to authorized users.

5.

Background

High-throughput RNA interference (RNAi) screening has become a widely used approach to elucidating gene functions. However, analysis and annotation of large data sets generated from these screens has been a challenge for researchers without a programming background. Over the years, numerous data analysis methods were produced for plate quality control and hit selection and implemented by a few open-access software packages. Recently, strictly standardized mean difference (SSMD) has become a widely used method for RNAi screening analysis mainly due to its better control of false negative and false positive rates and its ability to quantify RNAi effects with a statistical basis. We have developed GUItars to enable researchers without a programming background to use SSMD as both a plate quality and a hit selection metric to analyze large data sets.

Results

The software is accompanied by an intuitive graphical user interface for easy and rapid analysis workflow. SSMD analysis methods have been provided to the users along with traditionally-used z-score, normalized percent activity, and t-test methods for hit selection. GUItars is capable of analyzing large-scale data sets from screens with or without replicates. The software is designed to automatically generate and save numerous graphical outputs known to be among the most informative high-throughput data visualization tools capturing plate-wise and screen-wise performances. Graphical outputs are also written in HTML format for easy access, and a comprehensive summary of screening results is written into tab-delimited output files.
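The two hit-selection scores mentioned above can be sketched as follows. This is a minimal illustration, not GUItars code: it assumes the method-of-moments SSMD estimate for two independent groups (difference of means divided by the square root of the summed variances), and the function names and example values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def ssmd(group_a, group_b):
    """SSMD estimate for two independent groups (assumed method-of-moments form)."""
    return (mean(group_a) - mean(group_b)) / sqrt(stdev(group_a) ** 2 + stdev(group_b) ** 2)

def z_scores(values):
    """Classical z-score of each value relative to the sample mean and standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

if __name__ == "__main__":
    # Hypothetical replicate readings for one siRNA and for the negative control.
    print(ssmd([10, 12, 14], [1, 2, 3]))
    print(z_scores([1, 2, 3]))
```

Unlike the z-score, SSMD accounts for the variability of both groups, which is one reason it can serve as a plate quality metric as well as a hit selection metric.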

Conclusion

With GUItars, we demonstrated robust SSMD-based analysis workflow on a 3840-gene small interfering RNA (siRNA) library and identified 200 siRNAs that increased and 150 siRNAs that decreased the assay activities with moderate to stronger effects. GUItars enables rapid analysis and illustration of data from large- or small-scale RNAi screens using SSMD and other traditional analysis methods. The software is freely available at http://sourceforge.net/projects/guitars/.

6.

Background

Next-generation sequencing technologies are rapidly generating whole-genome datasets for an increasing number of organisms. However, phylogenetic reconstruction of genomic data remains difficult because de novo assembly for non-model genomes and multi-genome alignment are challenging.

Results

To greatly simplify the analysis, we present an Assembly and Alignment-Free (AAF) method (https://sourceforge.net/projects/aaf-phylogeny) that constructs phylogenies directly from unassembled genome sequence data, bypassing both genome assembly and alignment. Using mathematical calculations, models of sequence evolution, and simulated sequencing of published genomes, we address both evolutionary and sampling issues caused by direct reconstruction, including homoplasy, sequencing errors, and incomplete sequencing coverage. From these results, we calculate the statistical properties of the pairwise distances between genomes, allowing us to optimize parameter selection and perform bootstrapping. As a test case with real data, we successfully reconstructed the phylogeny of 12 mammals using raw sequencing reads. We also applied AAF to 21 tropical tree genome datasets with low coverage to demonstrate its effectiveness on non-model organisms.
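As a rough illustration of how a pairwise distance can be computed without assembly or alignment, the sketch below compares the k-mer sets of two read collections. The distance formula (negative log of the shared k-mer fraction, divided by k) is an assumption made for this example and is not taken from the abstract; the function names are hypothetical, and no filtering of sequencing errors is attempted.

```python
from math import log

def kmers(reads, k):
    """Set of all k-mers occurring in a collection of reads."""
    return {r[i:i + k] for r in reads for i in range(len(r) - k + 1)}

def kmer_distance(reads_a, reads_b, k):
    """Alignment-free distance from the fraction of shared k-mers (assumed formula)."""
    a, b = kmers(reads_a, k), kmers(reads_b, k)
    shared = len(a & b)
    if shared == 0:
        return float("inf")  # no shared k-mers: distance is undefined, reported as infinite
    return -log(shared / min(len(a), len(b))) / k

if __name__ == "__main__":
    print(kmer_distance(["ACGTAC"], ["ACGTTT"], 3))
```

A matrix of such pairwise distances could then be passed to any distance-based tree-building method.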

Conclusion

Our AAF method opens up phylogenomics for species without an appropriate reference genome or high sequence coverage, and rapidly creates a phylogenetic framework for further analysis of genome structure and diversity among non-model organisms.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1647-5) contains supplementary material, which is available to authorized users.

7.

Background

Phylogenetic-based classification of M. tuberculosis and other bacterial genomes is a core analysis for studying evolutionary hypotheses, disease outbreaks and transmission events. Whole genome sequencing is providing new insights into the genomic variation underlying intra- and inter-strain diversity, thereby assisting with the classification and molecular barcoding of the bacteria. One roadblock to strain investigation is the lack of user-interactive solutions to interrogate and visualise variation within a phylogenetic tree setting.

Results

We have developed a web-based tool called PhyTB (http://pathogenseq.lshtm.ac.uk/phytblive/index.php) to assist phylogenetic tree visualisation and identification of M. tuberculosis clade-informative polymorphism. Variant Call Format files can be uploaded to determine a sample's position within the tree. A map view summarises the geographical distribution of alleles and strain-types. The utility of PhyTB is demonstrated on sequence data from 1,601 M. tuberculosis isolates.

Conclusion

PhyTB contextualises M. tuberculosis genomic variation within epidemiological, geographical and phylogenetic settings. Further tool utility is possible by incorporating large variants and phenotypic data (e.g. drug-resistance profiles), and an assessment of genotype-phenotype associations. Source code is available to develop similar websites for other organisms (http://sourceforge.net/projects/phylotrack).

8.
Liang Y, Zhang F, Wang J, Joshi T, Wang Y, Xu D. PLoS ONE 2011, 6(7): e21750

Background

Identifying genes with essential roles in resisting environmental stress rates high in agronomic importance. Although massive DNA microarray gene expression data have been generated for plants, current computational approaches underutilize these data for studying genotype-trait relationships. Some advanced gene identification methods have been explored for human diseases, but typically these methods have not been converted into publicly available software tools and cannot be applied to plants for identifying genes with agronomic traits.

Methodology

In this study, we used 22 sets of Arabidopsis thaliana gene expression data from GEO to predict the key genes involved in water tolerance. We applied an SVM-RFE (Support Vector Machine-Recursive Feature Elimination) feature selection method for the prediction. To address small sample sizes, we developed a modified approach for SVM-RFE by using bootstrapping and leave-one-out cross-validation. We also expanded our study to predict genes involved in water susceptibility.

Conclusions

We analyzed the top 10 genes predicted to be involved in water tolerance. Seven of them are connected to known biological processes in drought resistance. We also analyzed the top 100 genes in terms of their biological functions. Our study shows that the SVM-RFE method is a highly promising method in analyzing plant microarray data for studying genotype-phenotype relationships. The software is freely available with source code at http://ccst.jlu.edu.cn/JCSB/RFET/.

9.
10.

Objective

Assessment of the safety and efficacy of a 10-day melarsoprol schedule in second stage T.b. rhodesiense patients and the effect of suramin-pretreatment on the incidence of encephalopathic syndrome (ES) during melarsoprol therapy.

Design

Sequential conduct of a proof-of-concept trial (n = 60) and a utilization study (n = 78) using historic controls as comparator.

Setting

Two trial centres in the T.b. rhodesiense endemic regions of Tanzania and Uganda. Participants: Consenting patients with confirmed second stage disease and a minimum age of 6 years were eligible for participation. Unconscious and pregnant patients were excluded.

Main Outcome Measures

The primary outcome measures were safety and efficacy at end of treatment. The secondary outcome measure was efficacy during follow-up after 3, 6 and 12 months.

Results

The incidence of ES in the trial population was 11.2% (CI 5–17%) and 13% (CI 9–17%) in the historic data. The respective case fatality rates were 8.4% (CI 3–13.8%) and 9.3% (CI 6–12.6%). All patients discharged alive were free of parasites at end of treatment. Twelve months after discharge, 96% of patients were clinically cured. The mean hospitalization time was reduced from 29 to 13 days (p<0.0001) per patient.

Conclusions

The 10-day melarsoprol schedule does not expose patients to a higher risk of ES or death than does treatment according to national schedules in current use. The efficacy of the 10-day melarsoprol schedule was highly satisfactory. No benefit could be attributed to the suramin pre-treatment.

Trial Registration

Current Controlled Trials ISRCTN40537886

11.

Background

The currently used anthelmintic drugs, in single oral application, have low efficacy against Trichuris trichiura infection, and hence novel anthelmintic drugs are needed. Nitazoxanide has been suggested as potential drug candidate.

Methodology

The efficacy and safety of a single oral dose of nitazoxanide (1,000 mg), or albendazole (400 mg), and a nitazoxanide-albendazole combination (1,000 mg–400 mg), with each drug administered separately on two consecutive days, were assessed in a double-blind, randomized, placebo-controlled trial in two schools on Pemba, Tanzania. Cure and egg reduction rates were calculated by per-protocol analysis and by available case analysis. Adverse events were assessed and graded before treatment and four times after treatment.

Principal Findings

Complete data for the per-protocol analysis were available from 533 T. trichiura-positive children. Cure rates against T. trichiura were low regardless of the treatment (nitazoxanide-albendazole, 16.0%; albendazole, 14.5%; and nitazoxanide, 6.6%). Egg reduction rates were 54.9% for the nitazoxanide-albendazole combination, 45.6% for single albendazole, and 13.4% for single nitazoxanide. Similar cure and egg reduction rates were calculated using the available case analysis. Children receiving nitazoxanide had significantly more adverse events compared to placebo recipients. Most of the adverse events were mild and had resolved within 24 hours posttreatment.

Conclusions/Significance

Nitazoxanide shows no effect on T. trichiura infection. The low efficacy of albendazole against T. trichiura in the current setting characterized by high anthelmintic drug pressure is confirmed. There is a pressing need to develop new anthelmintics against trichuriasis.

Trial Registration

Controlled-Trials.com ISRCTN08336605

12.

Context

Establishing the long-term benefit of therapy in chronic diseases has been challenging. Long-term studies require non-randomized designs and, thus, are often confounded by biases. For example, although disease-modifying therapy in MS has a convincing benefit on several short-term outcome-measures in randomized trials, its impact on long-term function remains uncertain.

Objective

Data from the 16-year Long-Term Follow-up study of interferon-beta-1b is used to assess the relationship between drug-exposure and long-term disability in MS patients.

Design/Setting

To mitigate the bias of outcome-dependent exposure variation in non-randomized long-term studies, drug-exposure was measured as the medication-possession-ratio, adjusted up or down according to multiple different weighting-schemes based on MS severity and MS duration at treatment initiation. A recursive-partitioning algorithm assessed whether exposure (using any weighting scheme) affected long-term outcome. The optimal cut-point that was used to define “high” or “low” exposure-groups was chosen by the algorithm. Subsequent to verification of an exposure-impact that included all predictor variables, the two groups were compared using a weighted propensity-stratified analysis in order to mitigate any treatment-selection bias that may have been present. Finally, multiple sensitivity-analyses were undertaken using different definitions of long-term outcome and different assumptions about the data.

Main Outcome Measure

Long-Term Disability.

Results

In these analyses, the same weighting-scheme was consistently selected by the recursive-partitioning algorithm. This scheme reduced (down-weighted) the effectiveness of drug exposure as either disease duration or disability at treatment-onset increased. Applying this scheme and using propensity-stratification to further mitigate bias, high-exposure had a consistently better clinical outcome compared to low-exposure (Cox proportional hazard ratio = 0.30–0.42; p<0.0001).

Conclusions

Early initiation and sustained use of interferon-beta-1b has a beneficial impact on long-term outcome in MS. Our analysis strategy provides a methodological framework for bias-mitigation in the analysis of non-randomized clinical data.

Trial Registration

Clinicaltrials.gov NCT00206635

13.

Background

Next generation sequencing platforms have greatly reduced sequencing costs, leading to the production of unprecedented amounts of sequence data. BWA is one of the most popular alignment tools due to its relatively high accuracy. However, mapping reads using BWA is still the most time consuming step in sequence analysis. Increasing mapping efficiency would allow the community to better cope with ever expanding volumes of sequence data.

Results

We designed a new program, CGAP-align, that achieves a performance improvement over BWA without sacrificing recall or precision. This is accomplished through the use of Suffix Tarray, a novel data structure combining elements of Suffix Array and Suffix Tree. We also utilize a tighter lower bound estimation for the number of mismatches in a read, allowing for more effective pruning during inexact mapping. Evaluation of both simulated and real data suggests that CGAP-align consistently outperforms the current version of BWA and can achieve over twice its speed under certain conditions, all while obtaining nearly identical results.

Conclusion

CGAP-align is a new time efficient read alignment tool that extends and improves BWA. The increase in alignment speed will be of critical assistance to all sequence-based research and medicine. CGAP-align is freely available to the academic community at http://sourceforge.net/p/cgap-align under the GNU General Public License (GPL).

14.
15.

Introduction

Fibromyalgia is difficult to treat and requires the use of multiple approaches. This study is a randomized controlled trial of qigong compared with a wait-list control group in fibromyalgia.

Methods

One hundred participants were randomly assigned to immediate or delayed practice groups, with the delayed group receiving training at the end of the control period. Qigong training (level 1 Chaoyi Fanhuan Qigong, CFQ), given over three half-days, was followed by weekly review/practice sessions for eight weeks; participants were also asked to practice at home for 45 to 60 minutes per day for this interval. Outcomes were pain, impact, sleep, physical function and mental function, and these were recorded at baseline, eight weeks, four months and six months. Immediate and delayed practice groups were analyzed individually compared to the control group, and as a combination group.

Results

In both the immediate and delayed treatment groups, CFQ demonstrated significant improvements in pain, impact, sleep, physical function and mental function when compared to the wait-list/usual care control group at eight weeks, with benefits extending beyond this time. Analysis of combined data indicated significant changes for all measures at all times for six months, with only one exception. Post-hoc analysis based on self-reported practice times indicated greater benefit with the per protocol group compared to minimal practice.

Conclusions

This study demonstrates that CFQ, a particular form of qigong, provides long-term benefits in several core domains in fibromyalgia. CFQ may be a useful adjuvant self-care treatment for fibromyalgia.

Trial registration

clinicaltrials.gov NCT00938834.

16.

Background

Using whole exome sequencing to predict aberrations in tumours is a cost effective alternative to whole genome sequencing; however, it is predominantly used for variant detection and infrequently used for detection of somatic copy number variation.

Results

We propose a new method to infer copy number and genotypes using whole exome data from paired tumour/normal samples. Our algorithm uses two Hidden Markov Models to predict copy number and genotypes and computationally resolves polyploidy/aneuploidy, normal cell contamination and signal baseline shift. Our method explicitly detects chromosome-arm-level events, which are commonly found in tumour samples. The methods are combined into a package named ADTEx (Aberration Detection in Tumour Exome). We applied our algorithm to a cohort of 17 in-house generated and 18 TCGA paired ovarian cancer/normal exomes and evaluated the performance by comparing against the copy number variations and genotypes predicted using Affymetrix SNP 6.0 data of the same samples. Further, we carried out a comparison study to show that ADTEx outperformed its competitors in terms of precision and F-measure.

Conclusions

Our proposed method, ADTEx, uses both depth of coverage ratios and B allele frequencies calculated from whole exome sequencing data, to predict copy number variations along with their genotypes. ADTEx is implemented as a user friendly software package using Python and R statistical language. Source code and sample data are freely available under GNU license (GPLv3) at http://adtex.sourceforge.net/.
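The two input signals named above can be illustrated with a short sketch. It does not reproduce ADTEx or its Hidden Markov Models; it only shows one plausible way to compute a library-size-normalised depth-of-coverage ratio and a B allele frequency from read counts. The function names, the normalisation by total mapped reads, and the example counts are assumptions made for illustration.

```python
def depth_ratio(tumour_depth, normal_depth, tumour_total, normal_total):
    """Tumour/normal coverage ratio for one exon, each depth scaled by its library's total reads."""
    return (tumour_depth / tumour_total) / (normal_depth / normal_total)

def b_allele_frequency(ref_reads, alt_reads):
    """Fraction of reads supporting the non-reference (B) allele; None when the site has no coverage."""
    total = ref_reads + alt_reads
    return alt_reads / total if total else None

if __name__ == "__main__":
    # Hypothetical counts: a ratio near 1.5 would be compatible with a single-copy gain
    # in a pure diploid tumour, before any correction for contamination or ploidy.
    print(depth_ratio(150, 100, 1000, 1000))
    print(b_allele_frequency(30, 10))
```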

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-732) contains supplementary material, which is available to authorized users.

17.

Objective

Human toxocariasis is a zoonotic infection caused by the larval stages of Toxocara canis (T. canis) and less frequently Toxocara cati (T. cati). A relationship between toxocariasis and epilepsy has been hypothesized. We conducted a systematic review and a meta-analysis of available data to evaluate the strength of association between epilepsy and Toxocara spp. seropositivity and to propose some guidelines for future surveys.

Data Sources

Electronic databases, the database from the Institute of Neuroepidemiology and Tropical Neurology of the University of Limoges (http://www-ient.unilim.fr/) and the reference lists of all relevant papers and books were screened up to October 2011.

Methods

We performed a systematic review of literature on toxocariasis (the exposure) and epilepsy (the outcome). Two authors independently assessed eligibility and study quality and extracted data. A common odds ratio (OR) was estimated using a random-effects meta-analysis model of aggregated published data.
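A common odds ratio under a random-effects model can be sketched as below. The abstract does not state which estimator was used, so this example assumes the DerSimonian-Laird method applied to log odds ratios; the function names are hypothetical and the 2x2 counts in the usage example are invented for illustration, not taken from the included studies.

```python
from math import exp, log, sqrt

def log_or(a, b, c, d):
    """Log odds ratio and its variance for one 2x2 table.
    a = seropositive cases, b = seronegative cases,
    c = seropositive controls, d = seronegative controls."""
    return log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d

def pooled_or_random_effects(tables):
    """Pooled OR with 95% CI using the DerSimonian-Laird estimator (assumed here)."""
    effects = [log_or(*t) for t in tables]
    y = [e[0] for e in effects]
    v = [e[1] for e in effects]
    w = [1 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(tables) - 1)) / c) if c > 0 else 0.0
    w_star = [1 / (vi + tau2) for vi in v]
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se = sqrt(1 / sum(w_star))
    return exp(pooled), exp(pooled - 1.96 * se), exp(pooled + 1.96 * se)

if __name__ == "__main__":
    # Invented counts for three hypothetical case-control studies.
    print(pooled_or_random_effects([(20, 80, 10, 90), (30, 70, 20, 80), (15, 85, 12, 88)]))
```

When the between-study variance estimate is zero, the result coincides with the inverse-variance fixed-effect estimate.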

Results

Seven case-control studies met the inclusion criteria, for a total of 1867 participants (850 cases and 1017 controls). The percentage of seropositivity (presence of anti-Toxocara spp. antibodies) was higher among people with epilepsy (PWE) in all the included studies, although the association between epilepsy and Toxocara spp. seropositivity was statistically significant in only 4 studies, with crude ORs ranging from 2.04 to 2.85. Another study bordered on statistical significance, while in 2 of the included studies no significant association was found. A significant (p<0.001) common OR of 1.92 [95% confidence interval (CI) 1.50–2.44] was estimated. Similar results were found when the meta-analysis was restricted to the studies considering an exclusively juvenile population and to surveys using Western Blot as confirmatory or diagnostic serological assay.

Conclusion

Our results support the existence of a positive association between Toxocara spp. seropositivity and epilepsy. Further studies, possibly including incident cases, should be performed to better investigate the relationship between toxocariasis and epilepsy.

18.
Plant and animal genomes are replete with large gene families, making the task of ortholog identification difficult and labor intensive. OrthoRBH is an automated reciprocal blast pipeline tool enabling the rapid identification of specific gene families of interest in related species, streamlining the collection of homologs prior to downstream molecular evolutionary analysis. The efficacy of OrthoRBH is demonstrated with the identification of the 13-member PYR/PYL/RCAR gene family in Hordeum vulgare using Oryza sativa query sequences. OrthoRBH runs on the Linux command line and is freely available at SourceForge.
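The reciprocal best hit criterion behind such a pipeline can be sketched in a few lines. This is not OrthoRBH code: it assumes the two BLAST searches have already been run and parsed into (query, subject, bit score) tuples, and the function names and sequence identifiers are hypothetical.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject; hits are (query, subject, bitscore) tuples."""
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}

def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is the best hit of a and a is the best hit of b."""
    forward, reverse = best_hits(a_vs_b), best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)

if __name__ == "__main__":
    # Hypothetical hits: rice queries against barley, then barley against rice.
    rice_vs_barley = [("osPYL1", "hvA", 200), ("osPYL1", "hvB", 150), ("osPYL2", "hvB", 180)]
    barley_vs_rice = [("hvA", "osPYL1", 190), ("hvB", "osPYL1", 160), ("hvB", "osPYL2", 120)]
    print(reciprocal_best_hits(rice_vs_barley, barley_vs_rice))
```

In this example only the pair (osPYL1, hvA) is retained, because hvB's best hit in the reverse search is osPYL1 rather than osPYL2.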

Availability

http://sourceforge.net/projects/orthorbh/

19.

Background

Telephone helplines are frequently and repeatedly used by individuals with chronic mental health problems and web interventions may be an effective tool for reducing depression in this population.

Aim

To evaluate the effectiveness of a 6 week, web-based cognitive behaviour therapy (CBT) intervention with and without proactive weekly telephone tracking in the reduction of depression in callers to a helpline service.

Method

155 callers to a national helpline service with moderate to high psychological distress were recruited and randomised to receive either Internet CBT plus weekly telephone follow-up; Internet CBT only; weekly telephone follow-up only; or treatment as usual.

Results

Depression was lower in participants in the web intervention conditions both with and without telephone tracking compared to the treatment as usual condition both at post intervention and at 6 month follow-up. Telephone tracking provided by a lay telephone counsellor did not confer any additional advantage in terms of symptom reduction or adherence.

Conclusions

A web-based CBT program is effective both with and without telephone tracking for reducing depression in callers to a national helpline.

Trial Registration

Controlled-Trials.com ISRCTN93903959

20.
Mobile elements are major drivers in changing genomic architecture and can cause disease. The detection of mobile elements is hindered due to the low mappability of their highly repetitive sequences. We have developed an algorithm, called Mobster, to detect non-reference mobile element insertions in next generation sequencing data from both whole genome and whole exome studies. Mobster uses discordant read pairs and clipped reads in combination with consensus sequences of known active mobile elements. Mobster has a low false discovery rate and high recall rate for both L1 and Alu elements. Mobster is available at http://sourceforge.net/projects/mobster.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-014-0488-x) contains supplementary material, which is available to authorized users.
