Similar literature
 20 similar articles found (search time: 62 ms)
1.
2.

Background

The overall metabolic/functional potential of any given environmental niche is a function of the sum total of genes/proteins/enzymes that are encoded and expressed by the various interacting microbes residing in that niche. Consequently, prior (collated) information on the genes and enzymes encoded by the resident microbes can help to indirectly (re)construct or infer the metabolic/functional potential of a given microbial community from its taxonomic abundance profile. In this study, we present Vikodak, a multi-modular package that is based on the above assumption and automates inferring and/or comparing the functional characteristics of an environment using taxonomic abundance generated from one or more environmental sample datasets. With the underlying assumptions of co-metabolism and independent contributions of different microbes in a community, a concerted effort has been made to accommodate microbial co-existence patterns in various modules incorporated in Vikodak.
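The "independent contributions" assumption can be illustrated with a minimal sketch: each taxon contributes its relative abundance multiplied by the number of copies of an enzyme its genome encodes, and contributions are summed across taxa. The function name, the taxa, the enzyme identifiers and the copy numbers below are all hypothetical illustrations; this is not Vikodak's actual code or reference data.

```python
from collections import defaultdict


def infer_function_profile(taxon_abundance, enzyme_copies):
    """Sum abundance * enzyme copy number over all taxa (independent contributions)."""
    profile = defaultdict(float)
    for taxon, abundance in taxon_abundance.items():
        # Taxa without a reference entry contribute nothing.
        for enzyme, copies in enzyme_copies.get(taxon, {}).items():
            profile[enzyme] += abundance * copies
    return dict(profile)


# Hypothetical toy inputs: relative abundances and per-genome enzyme copy numbers.
taxon_abundance = {"Bacteroides": 0.6, "Escherichia": 0.4}
enzyme_copies = {
    "Bacteroides": {"EC:3.2.1.23": 2, "EC:2.7.1.2": 1},
    "Escherichia": {"EC:3.2.1.23": 1},
}

profile = infer_function_profile(taxon_abundance, enzyme_copies)
for enzyme, value in sorted(profile.items()):
    print(enzyme, round(value, 3))
```

A co-metabolism model would need an additional rule for pathways whose steps are split across taxa, which this sketch deliberately leaves out.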

Results

Validation experiments on over 1400 metagenomic samples have confirmed the utility of Vikodak in (a) deciphering enzyme abundance profiles of any KEGG metabolic pathway, (b) functional resolution of distinct metagenomic environments, (c) inferring patterns of functional interaction between resident microbes, and (d) automating statistical comparison of functional features of studied microbiomes. Novel features incorporated in Vikodak also facilitate automatic removal of false positives and spurious functional predictions.

Conclusions

With novel provisions for comprehensive functional analysis, inclusion of algorithms based on microbial co-existence patterns, automated inter-environment comparisons, in-depth analysis of individual metabolic pathways, and greater flexibility at the user end, Vikodak is expected to be a valuable addition to the family of existing tools for 16S-based function prediction.

Availability and Implementation

A web implementation of Vikodak can be publicly accessed at: http://metagenomics.atc.tcs.com/vikodak. This web service is freely available for all categories of users (academic as well as commercial).

3.

Background

Next generation sequencing (NGS) offers a rapid and comprehensive method of screening for mutations associated with retinitis pigmentosa and related disorders. However, certain sequence alterations such as large insertions or deletions may remain undetected using standard NGS pipelines. One such mutation is a recently-identified Alu insertion into the Male Germ Cell-Associated Kinase (MAK) gene, which is missed by standard NGS-based variant callers. Here, we developed an in silico method of searching NGS raw sequence reads to detect this mutation, without the need to recalculate sequence alignments or to screen every sample by PCR.

Methods

The Linux program grep was used to search for a 23 bp “probe” sequence containing the known junction sequence of the insert. A corresponding search was performed with the wildtype sequence. The matching reads were counted and further compared to the known sequences of the full wildtype and mutant genomic loci. (See https://github.com/MEEIBioinformaticsCenter/grepsearch.)
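The probe search described above can be sketched in a few lines: count the reads that contain the mutant junction probe and the reads that contain the corresponding wildtype probe. The two 23 bp probe sequences below are made-up placeholders, not the real MAK-Alu junction; searching the reverse complement as well and the simple genotype rule are assumptions added for illustration, not details taken from the abstract.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def count_probe_hits(reads, probe):
    """Count reads containing the probe on either strand (exact match, like grep)."""
    rc = reverse_complement(probe)
    return sum(1 for read in reads if probe in read or rc in read)


def classify(mutant_hits, wildtype_hits):
    """Hypothetical rule of thumb; any candidate would still need confirmation."""
    if mutant_hits > 0 and wildtype_hits == 0:
        return "homozygous candidate"
    if mutant_hits > 0:
        return "heterozygous candidate"
    return "no insertion detected"


# Placeholder probes (23 bp each); NOT the published junction sequences.
MUTANT_PROBE = "ACGTACGTTTGACCGGTTAACCA"
WILDTYPE_PROBE = "ACGTACGTTTGAGGCATCGATCG"

reads = [
    "NN" + MUTANT_PROBE + "GG",                # forward-strand mutant read
    reverse_complement(WILDTYPE_PROBE) + "A",  # reverse-strand wildtype read
    "TTTT",                                    # unrelated read
]
mutant_hits = count_probe_hits(reads, MUTANT_PROBE)
wildtype_hits = count_probe_hits(reads, WILDTYPE_PROBE)
print(mutant_hits, wildtype_hits, classify(mutant_hits, wildtype_hits))
```

Exact substring matching is what makes the approach fast, but it also means a sequencing error inside the probe region causes a read to be missed.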

Results

In a test sample set consisting of eleven previously published homozygous mutants, detection of the MAK-Alu insertion was validated with 100% sensitivity and specificity. As a discovery cohort, raw NGS reads from 1,847 samples (including custom and whole exome selective capture) were searched in ~1 hour on a local computer cluster, yielding an additional five samples with MAK-Alu insertions and solving two previously unsolved pedigrees. Of these, one patient was homozygous for the insertion, one was compound heterozygous with a missense change on the other allele (c.46G>A; p.Gly16Arg), and three were heterozygous carriers.

Conclusions

Using the MAK-Alu grep program proved to be a rapid and effective method of finding a known, disease-causing Alu insertion in a large cohort of patients with NGS data. This simple approach avoids wet-lab assays or computationally expensive algorithms, and could also be used for other known disease-causing insertions and deletions.

4.

Background

The assembly of viral or endosymbiont genomes from Next Generation Sequencing (NGS) data is often hampered by the predominant abundance of reads originating from the host organism. These reads increase the memory and CPU time usage of the assembler and can lead to misassemblies.

Results

We developed RAMBO-K (Read Assignment Method Based On K-mers), a tool which allows rapid and sensitive removal of unwanted host sequences from NGS datasets. Reaching a speed of 10 Megabases/s on 4 CPU cores and a standard hard drive, RAMBO-K is faster than any tool we tested, while showing a consistently high sensitivity and specificity across different datasets.
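The abstract does not describe RAMBO-K's scoring model, so the following is only a simplified stand-in for k-mer-based read assignment: a read is assigned to whichever reference shares more of its k-mers. The function names, the toy reference sequences and the tie-handling rule are hypothetical.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def assign_read(read, host_kmers, target_kmers, k):
    """Assign a read to the reference that shares more of its k-mers."""
    read_kmers = kmers(read, k)
    host_hits = len(read_kmers & host_kmers)
    target_hits = len(read_kmers & target_kmers)
    if host_hits > target_hits:
        return "host"
    if target_hits > host_hits:
        return "target"
    return "unassigned"  # tie, including the case of no shared k-mers


# Toy reference sequences (hypothetical).
K = 4
host_kmers = kmers("AAAAACCCCCAAAAACCCCC", K)
target_kmers = kmers("GGTTGGTTGGTTAGCTAGCT", K)

for read in ["AAACCCCC", "GGTTGGTT", "ACGTACGT"]:
    print(read, assign_read(read, host_kmers, target_kmers, K))
```

Reads labelled "host" would then be discarded before assembly, which is the use case the abstract describes.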

Conclusions

RAMBO-K rapidly and reliably separates reads from different species without data preprocessing. It is suitable as a straightforward standard solution for workflows dealing with mixed datasets. Binaries and source code (Java and Python) are available from http://sourceforge.net/projects/rambok/.

5.

Background

Understanding the taxonomic composition of a sample, whether from patient, food or environment, is important to several types of studies including pathogen diagnostics, epidemiological studies, biodiversity analysis and food quality regulation. With the decreasing costs of sequencing, metagenomic data is quickly becoming the preferred type of data for such analysis.

Results

Rapidly defining the taxonomic composition (both taxonomic profile and relative frequency) in a metagenomic sequence dataset is challenging because mapping millions of sequence reads from a metagenomic study to a database such as the NCBI non-redundant nucleotide database (nt) is computationally intensive. We have developed a robust subsampling-based algorithm implemented in a tool called CensuScope, meant to take a ‘sneak peek’ into the population distribution and estimate taxonomic composition as if a census was taken of the metagenomic landscape. CensuScope is a rapid and accurate metagenome taxonomic profiling tool that randomly extracts a small number of reads (based on user input) and maps them to NCBI’s nt database. This process is repeated multiple times to ascertain the taxonomic composition that is found in the majority of the iterations, thereby providing a robust estimate of the population and measures of the accuracy of the results.
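The repeated-subsampling idea can be sketched as follows. To keep the example self-contained, the expensive step of mapping each read to nt is replaced by reads that already carry a taxon label; the function name, the seed parameter and the "more than half of the iterations" rule for the majority criterion are assumptions for illustration, not CensuScope's implementation.

```python
import random
from collections import Counter


def census(read_taxa, sample_size, iterations, seed=0):
    """Estimate taxon fractions from repeated random subsamples.

    read_taxa: one taxon label per read (stand-in for mapping reads to nt).
    Returns {taxon: (mean fraction, share of iterations in which it was seen)}
    for taxa seen in more than half of the iterations.
    """
    rng = random.Random(seed)
    fraction_sum = Counter()
    seen_in = Counter()
    for _ in range(iterations):
        sample = rng.sample(read_taxa, sample_size)  # without replacement
        for taxon, n in Counter(sample).items():
            fraction_sum[taxon] += n / sample_size
            seen_in[taxon] += 1
    return {
        taxon: (fraction_sum[taxon] / iterations, seen_in[taxon] / iterations)
        for taxon in fraction_sum
        if seen_in[taxon] > iterations / 2
    }


# Toy dataset: 90% of reads from taxon "A", 10% from taxon "B".
read_taxa = ["A"] * 9000 + ["B"] * 1000
result = census(read_taxa, sample_size=250, iterations=50)
for taxon, (fraction, support) in sorted(result.items()):
    print(taxon, round(fraction, 3), support)
```

The share of iterations in which a taxon appears gives a rough measure of how stable its detection is at the chosen subsample size.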

Conclusion

CensuScope can be run on a laptop or on a high-performance computer. Based on our analysis we are able to provide some recommendations in terms of the number of sequence reads to analyze and the number of iterations to use. For example, to quantify taxonomic groups present in the sample at a level of 1% or higher a subsampling size of 250 random reads with 50 iterations yields a statistical power of >99%. Windows and UNIX versions of CensuScope are available for download at https://hive.biochemistry.gwu.edu/dna.cgi?cmd=censuscope. CensuScope is also available through the High-performance Integrated Virtual Environment (HIVE) and can be used in conjunction with other HIVE analysis and visualization tools.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-918) contains supplementary material, which is available to authorized users.

6.

Background

The exponential growth of next generation sequencing (NGS) data has posed big challenges to data storage, management and archiving. Data compression is one of the effective solutions, where reference-based compression strategies can typically achieve superior compression ratios compared to those that do not rely on any reference.

Results

This paper presents a lossless, light-weight, reference-based compression algorithm named LW-FQZip to compress FASTQ data. The three components of any given input, i.e., metadata, short reads and quality score strings, are first parsed into three data streams in which redundant information is identified and eliminated independently. In particular, well-designed incremental and run-length-limited encoding schemes are used to compress the metadata and quality score streams, respectively. To handle the short reads, LW-FQZip uses a novel light-weight mapping model to rapidly map them against external reference sequence(s) and produce concise alignment results for storage. The three processed data streams are then packed together with a general-purpose compression algorithm such as LZMA. LW-FQZip was evaluated on eight real-world NGS data sets and achieved compression ratios in the range of 0.111-0.201. This is comparable or superior to other state-of-the-art lossless NGS data compression algorithms.
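The stream-splitting step can be sketched as below: a FASTQ text is parsed into metadata, read and quality streams, the quality strings are run-length encoded with a capped run length, and a stream is packed with LZMA from the Python standard library. Interpreting "run-length-limited" as a cap on the run length is an assumption, the function names are hypothetical, and the reference-based read mapping and incremental metadata encoding are not reproduced here.

```python
import lzma


def split_fastq(text):
    """Split 4-line FASTQ records into metadata, read and quality streams."""
    lines = text.strip().split("\n")
    headers, reads, quals = [], [], []
    for i in range(0, len(lines), 4):
        headers.append(lines[i])
        reads.append(lines[i + 1])
        quals.append(lines[i + 3])  # lines[i + 2] is the '+' separator
    return headers, reads, quals


def run_length_encode(s, max_run=255):
    """Encode s as (character, run length) pairs, capping each run at max_run."""
    pairs = []
    i = 0
    while i < len(s):
        run = 1
        while i + run < len(s) and s[i + run] == s[i] and run < max_run:
            run += 1
        pairs.append((s[i], run))
        i += run
    return pairs


def run_length_decode(pairs):
    """Invert run_length_encode."""
    return "".join(ch * n for ch, n in pairs)


def pack_stream(lines):
    """Compress one stream with LZMA."""
    return lzma.compress("\n".join(lines).encode("ascii"))


def unpack_stream(blob):
    """Invert pack_stream."""
    return lzma.decompress(blob).decode("ascii").split("\n")


fastq = "@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\nIIH#\n"
headers, reads, quals = split_fastq(fastq)
encoded = [run_length_encode(q) for q in quals]
print(headers, reads, encoded)
```

Because every step has an exact inverse, the scheme stays lossless, which is the property the abstract emphasises.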

Conclusions

LW-FQZip is a program that enables efficient lossless FASTQ data compression. It contributes to state-of-the-art applications for NGS data storage and transmission. LW-FQZip is freely available online at: http://csse.szu.edu.cn/staff/zhuzx/LWFQZip.

7.

Background

Human leukocyte antigen (HLA) genes are critical genes involved in important biomedical aspects, including organ transplantation, autoimmune diseases and infectious diseases. The gene family contains the most polymorphic genes in humans, and in many cases the difference between two alleles is only a single base pair substitution. Next generation sequencing (NGS) technologies could be used for high-throughput HLA typing, but in silico methods are still needed to correctly assign the alleles of a sample. Computer scientists have developed such methods for various NGS platforms, such as Illumina, Roche 454 and Ion Torrent, based on the characteristics of the reads they generate. However, methods for PacBio reads have received less attention, probably owing to the platform's high error rates. The PacBio system has the longest read length among available NGS platforms, and is therefore the only platform capable of capturing exon 2 and exon 3 of HLA genes on the same read to unequivocally resolve the ambiguity caused by the “phasing” issue.

Results

We propose a new method, BayesTyping1, to assign HLA alleles for PacBio circular consensus sequencing reads using Bayes’ theorem. The method was applied to simulated data of the three loci HLA-A, HLA-B and HLA-DRB1. The experimental results showed its capability to tolerate the disturbance of sequencing errors and external noise reads.
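How Bayes' theorem can rank candidate alleles under sequencing error is shown in the minimal sketch below. It assumes a single (haploid) allele, ungapped reads of the same length as the allele, independent per-base substitution errors at a fixed rate, and a uniform prior; the allele names and sequences are made up. It is not the BayesTyping1 algorithm, whose model is not given in the abstract.

```python
import math


def log_likelihood(read, allele, error_rate):
    """log P(read | allele) with independent per-base substitution errors."""
    total = 0.0
    for r, a in zip(read, allele):
        # A wrong base is assumed to be one of the 3 other bases, equally likely.
        total += math.log(1 - error_rate) if r == a else math.log(error_rate / 3)
    return total


def allele_posteriors(reads, alleles, error_rate=0.05, priors=None):
    """Posterior P(allele | reads) via Bayes' theorem, computed in log space."""
    names = list(alleles)
    if priors is None:
        priors = {name: 1 / len(names) for name in names}  # uniform prior
    log_post = {
        name: math.log(priors[name])
        + sum(log_likelihood(read, alleles[name], error_rate) for read in reads)
        for name in names
    }
    peak = max(log_post.values())  # subtract the maximum for numerical stability
    weights = {name: math.exp(value - peak) for name, value in log_post.items()}
    norm = sum(weights.values())
    return {name: weight / norm for name, weight in weights.items()}


# Hypothetical alleles differing by a single base (position 4).
alleles = {"allele_1": "ACGTACGTAC", "allele_2": "ACGTTCGTAC"}
# Three reads from allele_1; the last one carries an unrelated sequencing error.
reads = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGAAC"]
posteriors = allele_posteriors(reads, alleles)
print({name: round(p, 6) for name, p in posteriors.items()})
```

The error in the third read lowers its likelihood under both alleles equally, so it does not change which allele is favoured; this is the sense in which such a model tolerates sequencing errors.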

Conclusions

The BayesTyping1 method could overcome the problems of HLA typing using PacBio reads, which mostly arise from sequencing errors of PacBio reads and the divergence of HLA genes, to some extent.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-296) contains supplementary material, which is available to authorized users.

8.
9.

Background

Next generation sequencing (NGS) technology typically offers ultra-high throughput, but its read length is remarkably short compared to conventional Sanger sequencing. Paired-end NGS can computationally extend the read length, but with considerable practical inconvenience because of the inherent gaps. Now that Illumina paired-end sequencing can read both ends of 600 bp or even 800 bp DNA fragments, how to fill in the gaps between paired ends to produce accurate long reads is an intriguing but challenging question.

Results

We have developed a new technology, referred to as pseudo-Sanger (PS) sequencing. It fills in the gaps between paired ends and can generate near error-free sequences equivalent to conventional Sanger reads in length but with the high throughput of next generation sequencing. The major novelty of the PS method is that the gap filling is based on local assembly of paired-end reads that overlap at either end. Thus, we are able to fill in the gaps in repetitive genomic regions correctly. PS sequencing starts with short reads from NGS platforms, using a series of paired-end libraries of stepwise decreasing insert sizes. A computational method is introduced to transform these special paired-end reads into long and near error-free PS sequences, which correspond in length to those with the largest insert sizes. The PS construction has 3 advantages over untransformed reads: gap filling, error correction and heterozygote tolerance. Among the many applications of the PS construction is de novo genome assembly, which we tested in this study. Assembly of PS reads from a non-isogenic strain of Drosophila melanogaster yields an N50 contig of 190 kb, a 5-fold improvement over existing de novo assembly methods and a 3-fold advantage over the assembly of long reads from 454 sequencing.

Conclusions

Our method generated near error-free long reads from NGS paired-end sequencing. We demonstrated that de novo assembly could benefit considerably from these Sanger-like reads. In addition, these long reads could be applied to tasks such as structural variation detection and metagenomics.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-14-711) contains supplementary material, which is available to authorized users.

10.

Objective

To investigate the effectiveness of an educational poster in improving secondary school students' knowledge of emergency management of dental trauma.

Methods

A cluster randomised controlled trial was conducted. Sixteen schools with a total of 671 secondary students who could read Chinese or English were randomised at the school level into intervention (poster, 8 schools, 364 students) and control (8 schools, 305 students) groups. Baseline knowledge of dental trauma was obtained by a questionnaire. A poster containing information on dental trauma management was displayed in a classroom for 2 weeks in each school in the intervention group, whereas no such poster was displayed in the control group. Students of both groups completed the same questionnaire after 2 weeks.

Results

Two-week display of posters improved the knowledge score by 1.25 (p-value = 0.0407) on average.

Conclusion

An educational poster on dental trauma management significantly improved the level of knowledge of secondary school students in Hong Kong.

Trial Registration

HKClinicalTrial.com HKCTR-1343 ClinicalTrials.gov NCT01809457

11.
12.
Horizontal Gene Transfer (HGT) events, initially thought to be rare in Mycobacterium tuberculosis, have recently been shown to be involved in the acquisition of virulence operons in M. tuberculosis. We have developed a new HGT prediction algorithm based on a partitioning framework, called Grid3M, and applied it to the prediction of HGTs in Mycobacteria. Validation and testing using simulated and real microbial genomes indicated better performance of Grid3M as compared with other widely used HGT prediction methods. Specific analysis of the genes belonging to dormancy/reactivation regulons across 14 mycobacterial genomes indicated that horizontal acquisition is specifically restricted to important accessory proteins. The results also revealed Burkholderia species to be a probable source of HGT genes belonging to these regulons. The current study provides a basis for similar analyses investigating the functional/evolutionary aspects of HGT genes in other pathogens. A database of Grid3M predicted HGTs in completely sequenced genomes is available at https://metagenomics.atc.tcs.com/Grid3M/ .

13.

Introduction

Stories may be an effective tool to communicate with patients because of their ability to engage the reader. Our objective was to evaluate the effectiveness of story booklets compared to standard information sheets for parents attending the emergency department (ED) with a child with croup.

Methods

Parents were randomized to receive story booklets (n=208) or standard information sheets (n=205) during their ED visit. The primary outcome was change in anxiety between triage and ED discharge as measured by the State-Trait Anxiety Inventory. Follow-up telephone interviews were conducted at 1 and 3 days after discharge, then every other day until 9 days (or until resolution of symptoms), and at 1 year. Secondary outcomes included expected future anxiety, event impact, parental knowledge, satisfaction, decision regret, healthcare utilization, and time to symptom resolution.

Results

There was no significant difference in the primary outcome of change in parental anxiety between recruitment and ED discharge (change of 5 points for the story group vs. 6 points for the comparison group, p=0.78). The story group showed significantly greater decision regret regarding their decision to go to the ED (p<0.001): 6.7% of the story group vs. 1.5% of the comparison group strongly disagreed with the statement “I would go for the same choice if I had to do it over again”. The story group reported shorter time to resolution of symptoms (mean 3.7 days story group vs. 4.0 days comparison group, median 3 days both groups; log rank test, p=0.04). No other outcomes were different between study groups.

Conclusions

Stories about parent experiences managing a child with croup did not reduce parental anxiety. The story group showed significantly greater decision regret and quicker time to resolution of symptoms. Further research is needed to better understand whether stories can be effective in improving patient-important outcomes.

Trial Registration

Current Controlled Trials, ISRCTN39642997 (http://www.controlled-trials.com/ISRCTN39642997)

14.
15.

Background

Revision knee arthroplasty is assumed to be even more painful than primary knee arthroplasty and predominantly performed in chronic pain patients, which challenges postoperative pain treatment. We hypothesized that the adductor canal block, effective for pain relief after primary total knee arthroplasty, may reduce pain during knee flexion (primary endpoint: at 4 h) compared with placebo after revision total knee arthroplasty. Secondary endpoints were pain at rest, morphine consumption and morphine-related side effects.

Methods

We included patients scheduled for revision knee arthroplasty in general anesthesia into this blinded, placebo-controlled, randomized trial. Patients were allocated to an adductor canal block via a catheter with either ropivacaine or placebo; bolus of 0.75% ropivacaine/saline, followed by infusion of 0.2% ropivacaine/saline. Clinicaltrials.gov ID: NCT01191593.

Results

We enrolled 36 patients, of which 30 were analyzed. Mean pain scores during knee flexion at 4 h (primary endpoint) were: 52±22 versus 71±25 mm (mean difference 19, 95% CI: 1 to 37, P = 0.04), ropivacaine and placebo group respectively. When calculated as area under the curve (1–8 h/7 h) pain scores were 55±21 versus 69±21 mm during knee flexion (P = 0.11) and 39±18 versus 45±23 mm at rest (P = 0.43), ropivacaine and placebo group respectively. Groups were similar regarding morphine consumption and morphine-related side effects (P>0.05).

Conclusions

The only statistically significant difference found between groups was in the primary endpoint: pain during knee flexion at 4 h. However, due to a larger than anticipated dropout rate and heterogeneous study population, the study was underpowered.

Trial Registration

Clinicaltrials.gov NCT01191593

16.

Objectives

To examine the extent to which individual and ecological-level cognitive and structural social capital are associated with common mental disorder (CMD), the role played by physical characteristics of the neighbourhood in moderating this association, and the longitudinal change of the association between ecological level cognitive and structural social capital and CMD.

Design

Cross-sectional and longitudinal study of 40 disadvantaged London neighbourhoods. We used a contextual measure of the physical characteristics of each neighbourhood to examine how the neighbourhood moderates the association between types of social capital and mental disorder. We analysed the association between ecological-level measures of social capital and CMD longitudinally.

Participants

4,214 adults aged 16-97 (44.4% men) were randomly selected from 40 disadvantaged London neighbourhoods.

Main Outcome Measures

General Health Questionnaire (GHQ-12).

Results

Structural rather than cognitive social capital was significantly associated with CMD after controlling for socio-demographic variables. However, the two measures of structural social capital used, social networks and civic participation, were negatively and positively associated with CMD respectively. ‘Social networks’ was negatively associated with CMD at both the individual and ecological levels. This result was maintained when contextual aspects of the physical environment (neighbourhood incivilities) were introduced into the model, suggesting that ‘social networks’ was independent from characteristics of the physical environment. When ecological-level longitudinal analysis was conducted, ‘social networks’ was not statistically significant after controlling for individual-level social capital at follow up.

Conclusions

If we conceptually distinguish between cognitive and structural components as the quality and quantity of social capital respectively, the conclusion of this study is that the quantity rather than quality of social capital is important in relation to CMD at both the individual and ecological levels in disadvantaged urban areas. Thus, policy should support interventions that create and sustain social networks. One of these is explored in this article.

Trial Registration

Controlled-Trials.com ISRCTN68175121 http://www.controlled-trials.com/ISRCTN68175121

17.

Objective

To assess the clinical effect of medication monitoring using the West Wales Adverse Drug Reaction (ADR) Profile for Respiratory Medicine.

Design

Single-site parallel-arm pragmatic trial using stratified randomisation.

Setting

Nurse-led respiratory outpatient clinic in general hospital in South Wales.

Participants

54 patients with chronic respiratory disease receiving bronchodilators, corticosteroids or leukotriene receptor antagonists.

Intervention

Following initial observation of usual nursing care, we allocated participants at random to receive at follow up: either the West Wales ADR Profile for Respiratory Medicine in addition to usual care (‘intervention arm’ with 26 participants); or usual care alone (‘control arm’ with 28 participants).

Main Outcome Measures

Problems reported and actions taken.

Results

We followed up all randomised participants, and analysed data in accordance with treatment allocated. The increase in numbers of problems per participant identified at follow up was significantly higher in the intervention arm, where the median increase was 20.5 [inter-quartile range (IQR) 13–26], while that in the control arm was −1 [−3 to +2] [Mann-Whitney U test: z = 6.28, p<0.001]. The increase in numbers of actions per participant taken at follow up was also significantly higher in the intervention arm, where the median increase was 2.5 [1–4], while that in the control arm was 0 [−1.75 to +1] [Mann-Whitney U test: z = 4.40, p<0.001].

Conclusion

When added to usual nursing care, the West Wales ADR Profile identified more problems and prompted more nursing actions. Our ADR Profile warrants further investigation as a strategy to optimise medication management.

Trial Registration

Controlled-trials.com ISRCTN10386209

18.
19.
20.

Background

The Smith-Waterman algorithm, which produces the optimal pairwise alignment between two sequences, is frequently used as a key component of fast heuristic read mapping and variation detection tools for next-generation sequencing data. Though various fast Smith-Waterman implementations have been developed, they are either designed as monolithic protein database searching tools, which do not return detailed alignments, or are embedded into other tools. These issues make reusing these efficient Smith-Waterman implementations impractical.

Results

To facilitate easy integration of the fast Single-Instruction-Multiple-Data Smith-Waterman algorithm into third-party software, we wrote a C/C++ library, which extends Farrar’s Striped Smith-Waterman (SSW) to return alignment information in addition to the optimal Smith-Waterman score. In this library we developed a new method to generate the full optimal alignment results and a suboptimal score in linear space at little cost to efficiency. This improvement makes the fast Single-Instruction-Multiple-Data Smith-Waterman algorithm genuinely useful in genomic applications. SSW is available both as a C/C++ software library and as a stand-alone alignment tool at: https://github.com/mengyao/Complete-Striped-Smith-Waterman-Library.
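For readers unfamiliar with the underlying recurrence, the sketch below computes the optimal Smith-Waterman local alignment score in linear space by keeping only two rows of the dynamic programming matrix. It is a plain scalar version with a linear gap penalty and made-up default scores; it does not implement Farrar's striped SIMD layout, affine gaps, or the library's method for recovering the full alignment, and the function name is hypothetical rather than part of the SSW API.

```python
def smith_waterman_score(query, target, match=2, mismatch=-2, gap=-3):
    """Optimal local alignment score, using O(len(target)) memory."""
    prev = [0] * (len(target) + 1)  # previous row of the DP matrix
    best = 0
    for i in range(1, len(query) + 1):
        curr = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            same = query[i - 1] == target[j - 1]
            diagonal = prev[j - 1] + (match if same else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diagonal, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```

Keeping only two rows is enough for the score; recovering the alignment itself in linear space needs extra bookkeeping, which is the part the abstract credits to the library.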

Conclusions

The SSW library has been used in the primary read mapping tool MOSAIK, the split-read mapping program SCISSORS, the MEI detector TANGRAM, and the read-overlap graph generation program RZMBLR. The speed of each of these tools improved significantly when its ordinary or banded Smith-Waterman module was replaced with the SSW library.
