Similar articles
20 similar articles found.
1.

Background

Comparing and aligning genomes is a key step in analyzing closely related genomes. Despite the development of many genome aligners in the last 15 years, the problem is not yet fully resolved, even when aligning closely related bacterial genomes of the same species. In addition, no procedures are available to assess the quality of genome alignments or to compare genome aligners.

Results

We designed an original method for pairwise genome alignment, named YOC, which employs a highly sensitive similarity detection method together with a recent collinear chaining strategy that allows overlaps. YOC improves the reliability of collinear genome alignments, while preserving or even improving sensitivity. We also propose an original qualitative evaluation criterion for measuring the relevance of genome alignments. We used this criterion to compare and benchmark YOC with five recent genome aligners on large bacterial genome datasets, and showed it is suitable for identifying the specificities and the potential flaws of their underlying strategies.
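The collinear chaining step mentioned above can be illustrated with a minimal sketch. It is not YOC's implementation: the `Anchor` type, its field names, and the quadratic dynamic program are assumptions made for illustration, and unlike YOC's strategy this simplified version does not allow overlaps between consecutive anchors.

```python
from typing import List, NamedTuple


class Anchor(NamedTuple):
    """A similar region shared by two genomes (hypothetical representation)."""
    start_a: int
    end_a: int
    start_b: int
    end_b: int
    score: float


def best_chain(anchors: List[Anchor]) -> List[Anchor]:
    """Return the highest-scoring chain of collinear, non-overlapping anchors."""
    # Sorting by start position guarantees that any valid predecessor of an
    # anchor appears earlier in the list (assuming start <= end).
    order = sorted(anchors, key=lambda a: (a.start_a, a.start_b))
    n = len(order)
    if n == 0:
        return []
    best = [a.score for a in order]  # best chain score ending at each anchor
    prev = [-1] * n                  # back-pointers for chain reconstruction
    for j in range(n):
        for i in range(j):
            # Anchor i may precede anchor j only if it ends before j starts
            # on BOTH genomes (strict collinearity, no overlap allowed here).
            if order[i].end_a < order[j].start_a and order[i].end_b < order[j].start_b:
                candidate = best[i] + order[j].score
                if candidate > best[j]:
                    best[j] = candidate
                    prev[j] = i
    # Walk back from the anchor that ends the best-scoring chain.
    j = max(range(n), key=best.__getitem__)
    chain = []
    while j != -1:
        chain.append(order[j])
        j = prev[j]
    return chain[::-1]
```

The quadratic loop keeps the sketch short; chaining algorithms used on whole genomes typically rely on faster data structures.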

Conclusions

The YOC prototype is available at https://github.com/ruricaru/YOC. It has several advantages over existing genome aligners: (1) it is based on a simplified two-phase alignment strategy, (2) it is easy to parameterize, and (3) it produces reliable genome alignments, which are easier to analyze and to use.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-015-0530-3) contains supplementary material, which is available to authorized users.

2.
3.

Motivation

Biologists often wish to use their knowledge on a few experimental models of a given molecular system to identify homologs in genomic data. We developed a generic tool for this purpose.

Results

Macromolecular System Finder (MacSyFinder) provides a flexible framework to model the properties of molecular systems (cellular machinery or pathway) including their components, evolutionary associations with other systems and genetic architecture. Modelled features also include functional analogs and the multiple uses of the same component by different systems. Models are used to search for molecular systems in complete genomes or in unstructured data like metagenomes. The components of the systems are searched by sequence similarity using hidden Markov model (HMM) protein profiles. The assignment of hits to a given system is decided based on compliance with the content and organization of the system model. A graphical interface, MacSyView, facilitates the analysis of the results by showing overviews of component content and genomic context. To exemplify the use of MacSyFinder, we built models to detect and classify CRISPR-Cas systems following a previously established classification. We show that MacSyFinder makes it easy to define an accurate “Cas-finder” using publicly available protein profiles.

Availability and Implementation

MacSyFinder is a standalone application implemented in Python. It requires Python 2.7, Hmmer and makeblastdb (version 2.2.28 or higher). It is freely available with its source code under a GPLv3 license at https://github.com/gem-pasteur/macsyfinder. It is compatible with all platforms supporting Python and Hmmer/makeblastdb. The “Cas-finder” (models and HMM profiles) is distributed as a compressed tarball archive as Supporting Information.

4.

Background

First pass methods based on BLAST match are commonly used as an initial step to separate the different phylogenetic histories of genes in microbial genomes, and target putative horizontal gene transfer (HGT) events. This will continue to be necessary given the rapid growth of genomic data and the technical difficulties in conducting large-scale explicit phylogenetic analyses. However, these methods often produce misleading results due to their inability to resolve indirect phylogenetic links and their vulnerability to stochastic events.

Results

A new computational method for rapid, exhaustive and genome-wide detection of HGT was developed, featuring the systematic analysis of BLAST hit distribution patterns in the context of a priori defined hierarchical evolutionary categories. Genes that fall beyond a series of statistically determined thresholds are identified as not adhering to the typical vertical history of the organisms in question, but instead having a putative horizontal origin. Tests on simulated genomic data suggest that this approach effectively targets atypically distributed genes that are highly likely to be HGT-derived, and exhibits robust performance compared to conventional BLAST-based approaches. This method was further tested on real genomic datasets, including Rickettsia genomes, and was compared to previous studies. Results show consistency with currently employed categories of HGT prediction methods. In-depth analysis of both simulated and real genomic data suggests that the method is notably insensitive to stochastic events such as gene loss, rate variation and database error, which are common challenges to the current methodology. An automated pipeline was created to implement this approach and was made publicly available at: https://github.com/DittmarLab/HGTector. The program is versatile, easily deployed, and has a low requirement for computational resources.

Conclusions

HGTector is an effective tool for initial or standalone large-scale discovery of candidate HGT-derived genes.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-717) contains supplementary material, which is available to authorized users.

5.

Background

Programs based on hash tables and the Burrows-Wheeler transform are very fast for mapping short reads to genomes but have low accuracy in the presence of mismatches and gaps. Such reads can be aligned accurately with the Smith-Waterman algorithm, but it can take hours or days to map millions of reads, even for bacterial genomes.

Results

We introduce a GPU program called MaxSSmap with the aim of achieving comparable accuracy to Smith-Waterman but with faster runtimes. Like most programs, MaxSSmap first identifies a local region of the genome and then performs exact alignment. Instead of using hash tables or the Burrows-Wheeler transform in the first part, MaxSSmap calculates the maximum scoring subsequence score between the read and disjoint fragments of the genome in parallel on a GPU and selects the highest scoring fragment for exact alignment. We evaluate MaxSSmap’s accuracy and runtime when mapping simulated Illumina E. coli and human chromosome one reads of different lengths and 10% to 30% mismatches with gaps to the E. coli genome and human chromosome one. We also demonstrate applications on real data by mapping ancient horse DNA reads to modern genomes and unmapped paired reads from NA12878 in 1000 Genomes.
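The fragment-selection idea described above can be sketched on the CPU as follows. This is only an illustration under stated assumptions, not the MaxSSmap code: the match/mismatch scores, the function names, the choice of fragments exactly as long as the read, and the sequential loop (standing in for the GPU parallelism) are all hypothetical.

```python
from typing import Tuple

# Hypothetical scoring values chosen for illustration only.
MATCH, MISMATCH = 1, -1


def max_scoring_subsequence(read: str, fragment: str) -> int:
    """Kadane's algorithm over position-wise match/mismatch scores."""
    best = current = 0
    for r, g in zip(read, fragment):
        # Extend the running subsequence, or restart it when it drops below 0.
        current = max(0, current + (MATCH if r == g else MISMATCH))
        best = max(best, current)
    return best


def best_fragment(read: str, genome: str) -> Tuple[int, int]:
    """Return (fragment_start, score) for the highest-scoring disjoint fragment."""
    n = len(read)
    best_start, best_score = 0, -1
    # Each iteration is independent, which is what makes the scan easy to
    # parallelise; here the fragments are simply visited one after another.
    for start in range(0, len(genome), n):
        score = max_scoring_subsequence(read, genome[start:start + n])
        if score > best_score:
            best_start, best_score = start, score
    return best_start, best_score
```

The selected fragment would then be handed to an exact aligner such as Smith-Waterman, which this sketch does not include.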

Conclusions

We show that MaxSSmap attains comparably high accuracy and low error to fast Smith-Waterman programs yet has much lower runtimes. We show that MaxSSmap can map reads rejected by BWA and NextGenMap with high accuracy and low error much faster than if Smith-Waterman were used. At short read lengths of 36 and 51, both MaxSSmap and Smith-Waterman have lower accuracy than at longer read lengths. On real data MaxSSmap produces many alignments with high score and mapping quality that are not given by NextGenMap and BWA. The MaxSSmap source code in CUDA and OpenCL is freely available from http://www.cs.njit.edu/usman/MaxSSmap.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-969) contains supplementary material, which is available to authorized users.

6.

Background

While current recommendations on exercise type and volume have strong experimental bases, there is no clear evidence from large-sized studies indicating whether increasing training intensity provides additional benefits to subjects with type 2 diabetes.

Objective

To compare the effects of moderate-to-high intensity (HI) versus low-to-moderate intensity (LI) training of equal energy cost, i.e. exercise volume, on modifiable cardiovascular risk factors.

Design

Pre-specified sub-analysis of the Italian Diabetes and Exercise Study (IDES), a randomized multicenter prospective trial comparing a supervised exercise intervention with standard care for 12 months (2005–2006).

Setting

Twenty-two outpatient diabetes clinics across Italy.

Patients

Sedentary patients with type 2 diabetes assigned to twice-a-week supervised progressive aerobic and resistance training plus exercise counseling (n = 303).

Interventions

Subjects were randomized by center to LI (n = 142, 136 completed) or HI (n = 161, 152 completed) progressive aerobic and resistance training, i.e. at 55% or 70% of predicted maximal oxygen consumption and at 60% or 80% of predicted 1-Repetition Maximum, respectively, of equal volume.

Main Outcome Measure(s)

Hemoglobin (Hb) A1c and other cardiovascular risk factors; 10-year coronary heart disease (CHD) risk scores.

Results

Volume of physical activity, both supervised and non-supervised, was similar in LI and HI participants. Compared with LI training, HI training produced only clinically marginal, though statistically significant, improvements in HbA1c (mean difference −0.17% [95% confidence interval −0.44,0.10], P = 0.03), triglycerides (−0.12 mmol/l [−0.34,0.10], P = 0.02) and total cholesterol (−0.24 mmol/l [−0.46, −0.01], P = 0.04), but not in other risk factors and CHD risk scores. However, intensity was not an independent predictor of reduction of any of these parameters. Adverse event rate was similar in HI and LI subjects.

Conclusions

Data from the large IDES cohort indicate that, in low-fitness individuals such as sedentary subjects with type 2 diabetes, increasing exercise intensity is not harmful, but does not provide additional benefits on cardiovascular risk factors.

Trial Registration

www.ISRCTN.org ISRCTN-04252749.

7.

Background

Assembling genes from next-generation sequencing data is not only time consuming but computationally difficult, particularly for taxa without a closely related reference genome. Assembling even a draft genome using de novo approaches can take days, even on a powerful computer, and these assemblies typically require data from a variety of genomic libraries. Here we describe software that will alleviate these issues by rapidly assembling genes from distantly related taxa using a single library of paired-end reads: aTRAM, automated Target Restricted Assembly Method. The aTRAM pipeline uses a reference sequence, BLAST, and an iterative approach to target and locally assemble the genes of interest.

Results

Our results demonstrate that aTRAM rapidly assembles genes across distantly related taxa. In comparative tests with a closely related taxon, aTRAM assembled the same sequence as reference-based and de novo approaches taking on average < 1 min per gene. As a test case with divergent sequences, we assembled >1,000 genes from six taxa ranging from 25 – 110 million years divergent from the reference taxon. The gene recovery was between 97 – 99% from each taxon.

Conclusions

aTRAM can quickly assemble genes across distantly related taxa, obviating the need for draft genome assembly of all taxa of interest. Because aTRAM uses a targeted approach, loci can be assembled in minutes depending on the size of the target. Our results suggest that this software will be useful in rapidly assembling genes for phylogenomic projects covering a wide taxonomic range, as well as other applications. The software is freely available at http://www.github.com/juliema/aTRAM.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-015-0515-2) contains supplementary material, which is available to authorized users.

8.
9.

Objective

To evaluate the effectiveness of a school-based intervention involving the families and teachers that aimed to promote healthy eating habits in adolescents; the ultimate aim of the intervention was to reduce the increase in body mass index (BMI) of the students.

Design

Paired cluster randomized school-based trial conducted with a sample of fifth graders.

Setting

Twenty classes were randomly assigned into either an intervention group or a control group.

Participants

From a total of 574 eligible students, 559 students participated in the study (intervention: 10 classes with 277 participants; control: 10 classes with 282 participants). The mean age of students was 11 years.

Intervention

Students attended 9 nutritional education sessions during the 2010 academic year. Parents/guardians and teachers received information on the same subjects.

Main Outcome Measurement

Changes in BMI and percentage of body fat.

Results

Intention-to-treat analysis showed that changes in BMI were not significantly different between the 2 groups (β = 0.003; p = 0.75). There was a major reduction in the consumption of sugar-sweetened beverages and cookies in the intervention group; students in this group also consumed more fruits.

Conclusion

Encouraging the adoption of healthy eating habits promoted important changes in the adolescent diet, but this did not lead to a reduction in BMI gain. Strategies based exclusively on the quality of diet may not reduce weight gain among adolescents.

Trial Registration

Clinicaltrials.gov NCT01046474.

10.

Background & Objective

Managing data from large-scale projects (such as The Cancer Genome Atlas (TCGA)) for further analysis is an important and time-consuming step for research projects. Several efforts, such as the Firehose project, make TCGA pre-processed data publicly available via web services and data portals, but this information must be managed, downloaded and prepared for subsequent steps. We have developed an open-source and extensible R-based data client for pre-processed data from Firehose, and demonstrate its use with sample case studies. Results show that our RTCGAToolbox can facilitate data management for researchers interested in working with TCGA data. The RTCGAToolbox can also be integrated with other analysis pipelines for further data processing.

Availability and implementation

The RTCGAToolbox is open-source and licensed under the GNU General Public License Version 2.0. All documentation and source code for RTCGAToolbox are freely available at http://mksamur.github.io/RTCGAToolbox/ for Linux and Mac OS X operating systems.

11.

Background

Sampling genomes with Fosmid vectors and sequencing of pooled Fosmid libraries on the Illumina platform for massive parallel sequencing is a novel and promising approach to optimizing the trade-off between sequencing costs and assembly quality.

Results

In order to sequence the genome of Norway spruce, which is of great size and complexity, we developed and applied a new technology based on the massive production, sequencing, and assembly of Fosmid pools (FP). The spruce chromosomes were sampled with ~40,000 bp Fosmid inserts to obtain around two-fold genome coverage, in parallel with traditional whole genome shotgun sequencing (WGS) of haploid and diploid genomes. Compared to the WGS results, the contiguity and quality of the FP assemblies were high, and they allowed us to fill WGS gaps resulting from repeats, low coverage, and allelic differences. The FP contig sets were further merged with WGS data using a novel software package GAM-NGS.

Conclusions

By exploiting FP technology, the first published assembly of a conifer genome was sequenced entirely with massively parallel sequencing. Here we provide a comprehensive report on the different features of the approach and the optimization of the process. We have made public the input data (FASTQ format) for the set of pools used in this study: ftp://congenie.org/congenie/Nystedt_2013/Assembly/ProcessedData/FosmidPools/ (alternatively accessible via http://congenie.org/downloads). The software used for running the assembly process is available at http://research.scilifelab.se/andrej_alexeyenko/downloads/fpools/.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-439) contains supplementary material, which is available to authorized users.

12.
13.

Background

Tree nut consumption has been associated with reduced diabetes risk; however, results from randomized trials on glycemic control have been inconsistent.

Objective

To provide better evidence for diabetes guidelines development, we conducted a systematic review and meta-analysis of randomized controlled trials to assess the effects of tree nuts on markers of glycemic control in individuals with diabetes.

Data Sources

MEDLINE, EMBASE, CINAHL, and Cochrane databases through 6 April 2014.

Study Selection

Randomized controlled trials ≥3 weeks conducted in individuals with diabetes that compare the effect of diets emphasizing tree nuts to isocaloric diets without tree nuts on HbA1c, fasting glucose, fasting insulin, and HOMA-IR.

Data Extraction and Synthesis

Two independent reviewers extracted relevant data and assessed study quality and risk of bias. Data were pooled by the generic inverse variance method and expressed as mean differences (MD) with 95% CIs. Heterogeneity was assessed (Cochran Q-statistic) and quantified (I²).
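The pooling and heterogeneity statistics named above can be sketched as follows. This is a minimal fixed-effect illustration, not the authors' analysis: the function name, the returned dictionary keys, the normal-approximation 95% interval (z = 1.96), and the inputs used in the usage note are assumptions, and no trial data from the review are reproduced.

```python
import math
from typing import Dict, Sequence


def inverse_variance_pool(effects: Sequence[float],
                          std_errors: Sequence[float]) -> Dict[str, float]:
    """Fixed-effect generic inverse variance pooling with Cochran's Q and I²."""
    # Each study is weighted by the inverse of its variance (1 / SE²).
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_weight
    se_pooled = math.sqrt(1.0 / total_weight)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I²: share of total variation attributed to heterogeneity, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {
        "pooled": pooled,
        "ci_low": pooled - 1.96 * se_pooled,
        "ci_high": pooled + 1.96 * se_pooled,
        "q": q,
        "i2": i2,
    }
```

For example, calling `inverse_variance_pool([0.0, 1.0], [0.5, 0.5])` on two made-up studies pools them with equal weight, since their standard errors are equal.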

Results

Twelve trials (n = 450) were included. Diets emphasizing tree nuts at a median dose of 56 g/d significantly lowered HbA1c (MD = −0.07% [95% CI: −0.10, −0.03%]; P = 0.0003) and fasting glucose (MD = −0.15 mmol/L [95% CI: −0.27, −0.02 mmol/L]; P = 0.03) compared with control diets. No significant treatment effects were observed for fasting insulin and HOMA-IR; however, the direction of effect favoured tree nuts.

Limitations

The majority of trials were of short duration and poor quality.

Conclusions

Pooled analyses show that tree nuts improve glycemic control in individuals with type 2 diabetes, supporting their inclusion in a healthy diet. Owing to the uncertainties in our analyses there is a need for longer, higher quality trials with a focus on using nuts to displace high-glycemic index carbohydrates.

Trial Registration

ClinicalTrials.gov NCT01630980

14.

Background

Exome sequencing allows researchers to study the human genome in unprecedented detail. Among the many types of variants detectable through exome sequencing, one of the most overlooked types of mutation is internal deletion of exons. Internal exon deletions (IEDs) are the absence of consecutive exons in a gene. Such deletions have potentially significant biological meaning, and they are often too short to be considered copy number variation. Therefore, there is a need for efficient detection of such deletions using exome sequencing data.

Results

We present ExonDel, a tool specially designed to detect homozygous exon deletions efficiently. We tested ExonDel on exome sequencing data generated from 16 breast cancer cell lines and identified both novel and known IEDs. Subsequently, we verified our findings using RNAseq and PCR technologies. Further comparisons with multiple sequencing-based CNV tools showed that ExonDel is capable of detecting unique IEDs not found by other CNV tools.

Conclusions

ExonDel is an efficient way to screen for novel and known IEDs using exome sequencing data. ExonDel and its source code can be downloaded freely at https://github.com/slzhao/ExonDel.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-332) contains supplementary material, which is available to authorized users.

15.

Background

The immunoglobulins (IG) and the T cell receptors (TR) play a key role in antigen recognition during the adaptive immune response. Recent progress in next-generation sequencing technologies has provided an opportunity for deep T cell receptor repertoire profiling. However, specialised software is required for the rational analysis of the massive data generated by next-generation sequencing.

Results

Here we introduce tcR, a new R package, representing a platform for the advanced analysis of T cell receptor repertoires, which includes diversity measures, shared T cell receptor sequences identification, gene usage statistics computation and other widely used methods. The tool has proven its utility in recent research studies.

Conclusions

tcR is an R package for the advanced analysis of T cell receptor repertoires after primary TR sequences extraction from raw sequencing reads. The stable version can be directly installed from The Comprehensive R Archive Network (http://cran.r-project.org/mirrors.html). The source code and development version are available at tcR GitHub (http://imminfo.github.io/tcr/) along with the full documentation and typical usage examples.

16.

Importance

Poor mental health places a burden on individuals and populations. Resilient persons are able to adapt to life’s challenges and maintain high quality of life and function. Finding effective strategies to bolster resilience in individuals and populations is of interest to many stakeholders.

Objectives

To synthesize the evidence for resiliency training programs in improving mental health and capacity in 1) diverse adult populations and 2) persons with chronic diseases.

Data Sources

Electronic databases, clinical trial registries, and bibliographies. We also contacted study authors and field experts.

Study Selection

Randomized trials assessing the efficacy of any program intended to enhance resilience in adults and published after 1990. No restrictions were made based on outcome measured or comparator used.

Data Extraction and Synthesis

Reviewers worked independently and in duplicate to extract study characteristics and data. These were confirmed with authors. We conducted a random effects meta-analysis on available data and tested for interaction in planned subgroups.

Main Outcomes

The standardized mean difference (SMD) effect of resiliency training programs on 1) resilience/hardiness, 2) quality of life/well-being, 3) self-efficacy/activation, 4) depression, 5) stress, and 6) anxiety.

Results

We found 25 small trials at moderate to high risk of bias. Interventions varied in format and theoretical approach. Random effects meta-analysis showed a moderate effect of generalized stress-directed programs on enhancing resilience [pooled SMD 0.37 (95% CI 0.18, 0.57) p = .0002; I2 = 41%] within 3 months of follow up. Improvement in other outcomes was favorable to the interventions and reached statistical significance after removing two studies at high risk of bias. Trauma-induced stress-directed programs significantly improved stress [−0.53 (−1.04, −0.03) p = .03; I2 = 73%] and depression [−0.51 (−0.92, −0.10) p = .04; I2 = 61%].

Conclusions

We found evidence warranting low confidence that resiliency training programs have a small to moderate effect at improving resilience and other mental health outcomes. Further study is needed to better define the resilience construct and to design interventions specific to it.

Registration Number

PROSPERO #CRD42014007185

17.
18.

Background

Next-generation sequencing technologies are rapidly generating whole-genome datasets for an increasing number of organisms. However, phylogenetic reconstruction of genomic data remains difficult because de novo assembly for non-model genomes and multi-genome alignment are challenging.

Results

To greatly simplify the analysis, we present an Assembly and Alignment-Free (AAF) method (https://sourceforge.net/projects/aaf-phylogeny) that constructs phylogenies directly from unassembled genome sequence data, bypassing both genome assembly and alignment. Using mathematical calculations, models of sequence evolution, and simulated sequencing of published genomes, we address both evolutionary and sampling issues caused by direct reconstruction, including homoplasy, sequencing errors, and incomplete sequencing coverage. From these results, we calculate the statistical properties of the pairwise distances between genomes, allowing us to optimize parameter selection and perform bootstrapping. As a test case with real data, we successfully reconstructed the phylogeny of 12 mammals using raw sequencing reads. We also applied AAF to 21 tropical tree genome datasets with low coverage to demonstrate its effectiveness on non-model organisms.
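To make the idea of alignment-free pairwise distances concrete, the sketch below compares the k-mer content of two sets of raw reads. The abstract does not state the formula AAF uses, so the distance shown here (negative log of the shared k-mer fraction, divided by k) is an assumption, as are the function names. The sketch also ignores reverse complements, sequencing-error filtering, and coverage effects that the abstract says the method addresses.

```python
import math
from typing import Iterable, Set


def kmer_set(reads: Iterable[str], k: int) -> Set[str]:
    """Collect every distinct k-mer found in a collection of unassembled reads."""
    kmers: Set[str] = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])
    return kmers


def kmer_distance(reads_a: Iterable[str], reads_b: Iterable[str], k: int) -> float:
    """Assumed distance: -(1/k) * ln(shared k-mers / smaller k-mer set size)."""
    a, b = kmer_set(reads_a, k), kmer_set(reads_b, k)
    shared = len(a & b)
    total = min(len(a), len(b))
    if shared == 0:
        # No k-mers in common (or no k-mers at all): treat as infinitely distant.
        return math.inf
    return -math.log(shared / total) / k
```

A matrix of such pairwise distances could then be passed to a distance-based tree-building method, a step left out of this sketch.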

Conclusion

Our AAF method opens up phylogenomics for species without an appropriate reference genome or high sequence coverage, and rapidly creates a phylogenetic framework for further analysis of genome structure and diversity among non-model organisms.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1647-5) contains supplementary material, which is available to authorized users.

19.

Background

New treatments need to be evaluated in real-world clinical practice to account for co-morbidities, adherence and polypharmacy.

Methods

Patients with chronic obstructive pulmonary disease (COPD), ≥40 years old, with exacerbation in the previous 3 years are randomised 1:1 to once-daily fluticasone furoate 100 μg/vilanterol 25 μg in a novel dry-powder inhaler versus continuing their existing therapy. The primary endpoint is the mean annual rate of COPD exacerbations; an electronic medical record allows real-time collection and monitoring of endpoint and safety data.

Conclusions

The Salford Lung Study is the world’s first pragmatic randomised controlled trial of a pre-licensed medication in COPD.

Trial registration

Clinicaltrials.gov identifier NCT01551758.

20.

Background

There is an ongoing debate about whether stroke patients presenting with minor or moderate symptoms benefit from thrombolysis. To date, stroke severity on admission has typically been measured with the NIHSS and subsequently used for treatment decisions.

Hypothesis

Acute MRI lesion volume assessment can aid in therapy decision for iv-tPA in minor stroke.

Methods

We analysed 164 patients with NIHSS 0–7 from a prospective stroke MRI registry, the 1000+ study (clinicaltrials.org NCT00715533). Patients were examined in a 3 T MRI scanner and either received (n = 62) or did not receive thrombolysis (n = 102). DWI (diffusion weighted imaging) and PI (perfusion imaging) at admission were evaluated for diffusion - perfusion mismatch. Our primary outcome parameter was final lesion volume, defined by lesion volume on day 6 FLAIR images.

Results

The association between t-PA and FLAIR lesion volume on day 6 was significantly different for patients with smaller DWI volume compared to patients with larger DWI volume (interaction between DWI and t-PA: p = 0.021). Baseline DWI lesion volume was dichotomized at the median (0.7 ml): final lesion volume at day 6 was larger in patients with large baseline DWI volumes without t-PA treatment (median difference 3, IQR −0.4–9.3 ml). Conversely, in patients with larger baseline DWI volumes final lesion volumes were smaller after t-PA treatment (median difference 0, IQR −4.1–5 ml). However, this did not translate into a significant difference in the mRS at day 90 (p = 0.577).

Conclusion

Though this study is only hypothesis-generating considering the number of cases, we believe that the size of DWI lesion volume may support therapy decisions in patients with minor stroke.

Trial Registration

Clinicaltrials.org NCT00715533
