Found 20 similar articles (search time: 156 ms)
1.
Background
Comparing and aligning genomes is a key step in analyzing closely related genomes. Despite the development of many genome aligners in the last 15 years, the problem is not yet fully resolved, even when aligning closely related bacterial genomes of the same species. In addition, no procedures are available to assess the quality of genome alignments or to compare genome aligners.
Results
We designed an original method for pairwise genome alignment, named YOC, which employs a highly sensitive similarity detection method together with a recent collinear chaining strategy that allows overlaps. YOC improves the reliability of collinear genome alignments, while preserving or even improving sensitivity. We also propose an original qualitative evaluation criterion for measuring the relevance of genome alignments. We used this criterion to compare and benchmark YOC against five recent genome aligners on large bacterial genome datasets, and showed it is suitable for identifying the specificities and the potential flaws of their underlying strategies.
Conclusions
The YOC prototype is available at https://github.com/ruricaru/YOC. It has several advantages over existing genome aligners: (1) it is based on a simplified two-phase alignment strategy, (2) it is easy to parameterize, and (3) it produces reliable genome alignments, which are easier to analyze and to use.
Electronic supplementary material
The online version of this article (doi:10.1186/s12859-015-0530-3) contains supplementary material, which is available to authorized users.
2.
3.
Motivation
Biologists often wish to use their knowledge of a few experimental models of a given molecular system to identify homologs in genomic data. We developed a generic tool for this purpose.
Results
Macromolecular System Finder (MacSyFinder) provides a flexible framework to model the properties of molecular systems (cellular machinery or pathway) including their components, evolutionary associations with other systems and genetic architecture. Modelled features also include functional analogs, and the multiple uses of the same component by different systems. Models are used to search for molecular systems in complete genomes or in unstructured data like metagenomes. The components of the systems are searched by sequence similarity using hidden Markov model (HMM) protein profiles. The assignment of hits to a given system is decided based on compliance with the content and organization of the system model. A graphical interface, MacSyView, facilitates the analysis of the results by showing overviews of component content and genomic context. To exemplify the use of MacSyFinder we built models to detect and classify CRISPR-Cas systems following a previously established classification. We show that MacSyFinder makes it easy to define an accurate “Cas-finder” using publicly available protein profiles.
Availability and Implementation
MacSyFinder is a standalone application implemented in Python. It requires Python 2.7, Hmmer and makeblastdb (version 2.2.28 or higher). It is freely available with its source code under a GPLv3 license at https://github.com/gem-pasteur/macsyfinder. It is compatible with all platforms supporting Python and Hmmer/makeblastdb. The “Cas-finder” (models and HMM profiles) is distributed as a compressed tarball archive as Supporting Information.
4.
Background
First-pass methods based on BLAST matches are commonly used as an initial step to separate the different phylogenetic histories of genes in microbial genomes, and target putative horizontal gene transfer (HGT) events. This will continue to be necessary given the rapid growth of genomic data and the technical difficulties in conducting large-scale explicit phylogenetic analyses. However, these methods often produce misleading results due to their inability to resolve indirect phylogenetic links and their vulnerability to stochastic events.
Results
A new computational method for rapid, exhaustive and genome-wide detection of HGT was developed, featuring the systematic analysis of BLAST hit distribution patterns in the context of a priori defined hierarchical evolutionary categories. Genes that fall beyond a series of statistically determined thresholds are identified as not adhering to the typical vertical history of the organisms in question, but instead having a putative horizontal origin. Tests on simulated genomic data suggest that this approach effectively targets atypically distributed genes that are highly likely to be HGT-derived, and exhibits robust performance compared to conventional BLAST-based approaches. This method was further tested on real genomic datasets, including Rickettsia genomes, and was compared to previous studies. Results show consistency with currently employed categories of HGT prediction methods. In-depth analysis of both simulated and real genomic data suggests that the method is notably insensitive to stochastic events such as gene loss, rate variation and database error, which are common challenges to the current methodology. An automated pipeline was created to implement this approach and was made publicly available at: https://github.com/DittmarLab/HGTector. The program is versatile, easily deployed, and has a low requirement for computational resources.
Conclusions
HGTector is an effective tool for initial or standalone large-scale discovery of candidate HGT-derived genes.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-717) contains supplementary material, which is available to authorized users.
5.
Background
Programs based on hash tables and Burrows-Wheeler are very fast for mapping short reads to genomes but have low accuracy in the presence of mismatches and gaps. Such reads can be aligned accurately with the Smith-Waterman algorithm, but it can take hours or days to map millions of reads even for bacterial genomes.
Results
We introduce a GPU program called MaxSSmap with the aim of achieving comparable accuracy to Smith-Waterman but with faster runtimes. Similar to most programs, MaxSSmap identifies a local region of the genome followed by exact alignment. Instead of using hash tables or Burrows-Wheeler in the first part, MaxSSmap calculates the maximum scoring subsequence score between the read and disjoint fragments of the genome in parallel on a GPU and selects the highest scoring fragment for exact alignment. We evaluate MaxSSmap’s accuracy and runtime when mapping simulated Illumina E. coli and human chromosome one reads of different lengths and 10% to 30% mismatches with gaps to the E. coli genome and human chromosome one. We also demonstrate applications on real data by mapping ancient horse DNA reads to modern genomes and unmapped paired reads from NA12878 in 1000 genomes.
Conclusions
We show that MaxSSmap attains comparable high accuracy and low error to fast Smith-Waterman programs yet has much lower runtimes. We show that MaxSSmap can map reads rejected by BWA and NextGenMap with high accuracy and low error much faster than if Smith-Waterman were used. At short read lengths of 36 and 51, both MaxSSmap and Smith-Waterman have lower accuracy than at longer read lengths. On real data MaxSSmap produces many alignments with high score and mapping quality that are not given by NextGenMap and BWA. The MaxSSmap source code in CUDA and OpenCL is freely available from http://www.cs.njit.edu/usman/MaxSSmap.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-969) contains supplementary material, which is available to authorized users.
6.
Stefano Balducci Silvano Zanuso Patrizia Cardelli Laura Salvi Alessandra Bazuro Luca Pugliese Carla Maccora Carla Iacobini Francesco G. Conti Antonio Nicolucci Giuseppe Pugliese for the Italian Diabetes Exercise Study Investigators 《PloS one》2012,7(11)
Background
While current recommendations on exercise type and volume have strong experimental bases, there is no clear evidence from large-sized studies indicating whether increasing training intensity provides additional benefits to subjects with type 2 diabetes.
Objective
To compare the effects of moderate-to-high intensity (HI) versus low-to-moderate intensity (LI) training of equal energy cost, i.e. exercise volume, on modifiable cardiovascular risk factors.
Design
Pre-specified sub-analysis of the Italian Diabetes and Exercise Study (IDES), a randomized multicenter prospective trial comparing a supervised exercise intervention with standard care for 12 months (2005–2006).
Setting
Twenty-two outpatient diabetes clinics across Italy.
Patients
Sedentary patients with type 2 diabetes assigned to twice-a-week supervised progressive aerobic and resistance training plus exercise counseling (n = 303).
Interventions
Subjects were randomized by center to LI (n = 142, 136 completed) or HI (n = 161, 152 completed) progressive aerobic and resistance training, i.e. at 55% or 70% of predicted maximal oxygen consumption and at 60% or 80% of predicted 1-Repetition Maximum, respectively, of equal volume.
Main Outcome Measure(s)
Hemoglobin (Hb) A1c and other cardiovascular risk factors; 10-year coronary heart disease (CHD) risk scores.
Results
Volume of physical activity, both supervised and non-supervised, was similar in LI and HI participants. Compared with LI training, HI training produced only clinically marginal, though statistically significant, improvements in HbA1c (mean difference −0.17% [95% confidence interval −0.44, 0.10], P = 0.03), triglycerides (−0.12 mmol/l [−0.34, 0.10], P = 0.02) and total cholesterol (−0.24 mmol/l [−0.46, −0.01], P = 0.04), but not in other risk factors and CHD risk scores. However, intensity was not an independent predictor of reduction of any of these parameters. Adverse event rate was similar in HI and LI subjects.
Conclusions
Data from the large IDES cohort indicate that, in low-fitness individuals such as sedentary subjects with type 2 diabetes, increasing exercise intensity is not harmful, but does not provide additional benefits on cardiovascular risk factors.
Trial Registration
www.ISRCTN.org ISRCTN-04252749.
7.
Background
Assembling genes from next-generation sequencing data is not only time-consuming but computationally difficult, particularly for taxa without a closely related reference genome. Assembling even a draft genome using de novo approaches can take days, even on a powerful computer, and these assemblies typically require data from a variety of genomic libraries. Here we describe software that will alleviate these issues by rapidly assembling genes from distantly related taxa using a single library of paired-end reads: aTRAM, automated Target Restricted Assembly Method. The aTRAM pipeline uses a reference sequence, BLAST, and an iterative approach to target and locally assemble the genes of interest.
Results
Our results demonstrate that aTRAM rapidly assembles genes across distantly related taxa. In comparative tests with a closely related taxon, aTRAM assembled the same sequence as reference-based and de novo approaches, taking on average <1 min per gene. As a test case with divergent sequences, we assembled >1,000 genes from six taxa ranging from 25–110 million years divergent from the reference taxon. The gene recovery was between 97–99% from each taxon.
Conclusions
aTRAM can quickly assemble genes across distantly related taxa, obviating the need for draft genome assembly of all taxa of interest. Because aTRAM uses a targeted approach, loci can be assembled in minutes depending on the size of the target. Our results suggest that this software will be useful in rapidly assembling genes for phylogenomic projects covering a wide taxonomic range, as well as other applications. The software is freely available at http://www.github.com/juliema/aTRAM.
Electronic supplementary material
The online version of this article (doi:10.1186/s12859-015-0515-2) contains supplementary material, which is available to authorized users.
8.
9.
Objective
To evaluate the effectiveness of a school-based intervention involving the families and teachers that aimed to promote healthy eating habits in adolescents; the ultimate aim of the intervention was to reduce the increase in body mass index (BMI) of the students.
Design
Paired cluster randomized school-based trial conducted with a sample of fifth graders.
Setting
Twenty classes were randomly assigned into either an intervention group or a control group.
Participants
From a total of 574 eligible students, 559 students participated in the study (intervention: 10 classes with 277 participants; control: 10 classes with 282 participants). The mean age of students was 11 years.
Intervention
Students attended 9 nutritional education sessions during the 2010 academic year. Parents/guardians and teachers received information on the same subjects.
Main Outcome Measurement
Changes in BMI and percentage of body fat.
Results
Intention-to-treat analysis showed that changes in BMI were not significantly different between the 2 groups (β = 0.003; p = 0.75). There was a major reduction in the consumption of sugar-sweetened beverages and cookies in the intervention group; students in this group also consumed more fruits.
Conclusion
Encouraging the adoption of healthy eating habits promoted important changes in the adolescent diet, but this did not lead to a reduction in BMI gain. Strategies based exclusively on the quality of diet may not reduce weight gain among adolescents.
Trial Registration
ClinicalTrials.gov NCT01046474
10.
Mehmet Kemal Samur 《PloS one》2014,9(9)
Background & Objective
Managing data from large-scale projects (such as The Cancer Genome Atlas (TCGA)) for further analysis is an important and time-consuming step for research projects. Several efforts, such as the Firehose project, make TCGA pre-processed data publicly available via web services and data portals, but this information must be managed, downloaded and prepared for subsequent steps. We have developed an open source and extensible R-based data client for pre-processed data from the Firehose, and demonstrate its use with sample case studies. Results show that our RTCGAToolbox can facilitate data management for researchers interested in working with TCGA data. The RTCGAToolbox can also be integrated with other analysis pipelines for further data processing.
Availability and implementation
The RTCGAToolbox is open-source and licensed under the GNU General Public License Version 2.0. All documentation and source code for RTCGAToolbox are freely available at http://mksamur.github.io/RTCGAToolbox/ for Linux and Mac OS X operating systems.
11.
Andrey Alexeyenko Björn Nystedt Francesco Vezzi Ellen Sherwood Rosa Ye Bjarne Knudsen Martin Simonsen Benjamin Turner Pieter de Jong Cheng-Cang Wu Joakim Lundeberg 《BMC genomics》2014,15(1)
Background
Sampling genomes with Fosmid vectors and sequencing of pooled Fosmid libraries on the Illumina platform for massive parallel sequencing is a novel and promising approach to optimizing the trade-off between sequencing costs and assembly quality.
Results
In order to sequence the genome of Norway spruce, which is of great size and complexity, we developed and applied a new technology based on the massive production, sequencing, and assembly of Fosmid pools (FP). The spruce chromosomes were sampled with ~40,000 bp Fosmid inserts to obtain around two-fold genome coverage, in parallel with traditional whole genome shotgun sequencing (WGS) of haploid and diploid genomes. Compared to the WGS results, the contiguity and quality of the FP assemblies were high, and they allowed us to fill WGS gaps resulting from repeats, low coverage, and allelic differences. The FP contig sets were further merged with WGS data using a novel software package GAM-NGS.
Conclusions
By exploiting FP technology, the first published assembly of a conifer genome was sequenced entirely with massively parallel sequencing. Here we provide a comprehensive report on the different features of the approach and the optimization of the process. We have made public the input data (FASTQ format) for the set of pools used in this study: ftp://congenie.org/congenie/Nystedt_2013/Assembly/ProcessedData/FosmidPools/ (alternatively accessible via http://congenie.org/downloads). The software used for running the assembly process is available at http://research.scilifelab.se/andrej_alexeyenko/downloads/fpools/.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-439) contains supplementary material, which is available to authorized users.
12.
13.
Effie Viguiliouk Cyril W. C. Kendall Sonia Blanco Mejia Adrian I. Cozma Vanessa Ha Arash Mirrahimi Viranda H. Jayalath Livia S. A. Augustin Laura Chiavaroli Lawrence A. Leiter Russell J. de Souza David J. A. Jenkins John L. Sievenpiper 《PloS one》2014,9(7)
Background
Tree nut consumption has been associated with reduced diabetes risk; however, results from randomized trials on glycemic control have been inconsistent.
Objective
To provide better evidence for diabetes guidelines development, we conducted a systematic review and meta-analysis of randomized controlled trials to assess the effects of tree nuts on markers of glycemic control in individuals with diabetes.
Data Sources
MEDLINE, EMBASE, CINAHL, and Cochrane databases through 6 April 2014.
Study Selection
Randomized controlled trials ≥3 weeks conducted in individuals with diabetes that compare the effect of diets emphasizing tree nuts to isocaloric diets without tree nuts on HbA1c, fasting glucose, fasting insulin, and HOMA-IR.
Data Extraction and Synthesis
Two independent reviewers extracted relevant data and assessed study quality and risk of bias. Data were pooled by the generic inverse variance method and expressed as mean differences (MD) with 95% CIs. Heterogeneity was assessed (Cochran Q-statistic) and quantified (I²).
Results
Twelve trials (n = 450) were included. Diets emphasizing tree nuts at a median dose of 56 g/d significantly lowered HbA1c (MD = −0.07% [95% CI: −0.10, −0.03%]; P = 0.0003) and fasting glucose (MD = −0.15 mmol/L [95% CI: −0.27, −0.02 mmol/L]; P = 0.03) compared with control diets. No significant treatment effects were observed for fasting insulin and HOMA-IR; however, the direction of effect favoured tree nuts.
Limitations
The majority of trials were of short duration and poor quality.
Conclusions
Pooled analyses show that tree nuts improve glycemic control in individuals with type 2 diabetes, supporting their inclusion in a healthy diet. Owing to the uncertainties in our analyses there is a need for longer, higher quality trials with a focus on using nuts to displace high-glycemic index carbohydrates.
Trial Registration
ClinicalTrials.gov NCT01630980
14.
Yan Guo Shilin Zhao Brian D Lehmann Quanhu Sheng Timothy M Shaver Thomas P Stricker Jennifer A Pietenpol Yu Shyr 《BMC bioinformatics》2014,15(1)
Background
Exome sequencing allows researchers to study the human genome in unprecedented detail. Among the many types of variants detectable through exome sequencing, one of the most overlooked types of mutation is internal deletion of exons. Internal exon deletions (IEDs) are the absence of consecutive exons in a gene. Such deletions have potentially significant biological meaning, and they are often too short to be considered copy number variation. Therefore, there is a need for efficient detection of such deletions using exome sequencing data.
Results
We present ExonDel, a tool specially designed to detect homozygous exon deletions efficiently. We tested ExonDel on exome sequencing data generated from 16 breast cancer cell lines and identified both novel and known IEDs. Subsequently, we verified our findings using RNAseq and PCR technologies. Further comparisons with multiple sequencing-based CNV tools showed that ExonDel is capable of detecting unique IEDs not found by other CNV tools.
Conclusions
ExonDel is an efficient way to screen for novel and known IEDs using exome sequencing data. ExonDel and its source code can be downloaded freely at https://github.com/slzhao/ExonDel.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2105-15-332) contains supplementary material, which is available to authorized users.
15.
Vadim I. Nazarov Mikhail V. Pogorelyy Ekaterina A. Komech Ivan V. Zvyagin Dmitry A. Bolotin Mikhail Shugay Dmitry M. Chudakov Yury B. Lebedev Ilgar Z. Mamedov 《BMC bioinformatics》2015,16(1)
Background
The immunoglobulins (IG) and the T cell receptors (TR) play a key role in antigen recognition during the adaptive immune response. Recent progress in next-generation sequencing technologies has provided an opportunity for deep T cell receptor repertoire profiling. However, specialised software is required for the rational analysis of the massive data generated by next-generation sequencing.
Results
Here we introduce tcR, a new R package that provides a platform for the advanced analysis of T cell receptor repertoires, including diversity measures, identification of shared T cell receptor sequences, computation of gene usage statistics and other widely used methods. The tool has proven its utility in recent research studies.
Conclusions
tcR is an R package for the advanced analysis of T cell receptor repertoires after primary TR sequences extraction from raw sequencing reads. The stable version can be directly installed from The Comprehensive R Archive Network (http://cran.r-project.org/mirrors.html). The source code and development version are available at tcR GitHub (http://imminfo.github.io/tcr/) along with the full documentation and typical usage examples.
16.
Aaron L. Leppin Pavithra R. Bora Jon C. Tilburt Michael R. Gionfriddo Claudia Zeballos-Palacios Megan M. Dulohery Amit Sood Patricia J. Erwin Juan Pablo Brito Kasey R. Boehmer Victor M. Montori 《PloS one》2014,9(10)
Importance
Poor mental health places a burden on individuals and populations. Resilient persons are able to adapt to life’s challenges and maintain high quality of life and function. Finding effective strategies to bolster resilience in individuals and populations is of interest to many stakeholders.
Objectives
To synthesize the evidence for resiliency training programs in improving mental health and capacity in 1) diverse adult populations and 2) persons with chronic diseases.
Data Sources
Electronic databases, clinical trial registries, and bibliographies. We also contacted study authors and field experts.
Study Selection
Randomized trials assessing the efficacy of any program intended to enhance resilience in adults and published after 1990. No restrictions were made based on outcome measured or comparator used.
Data Extraction and Synthesis
Reviewers worked independently and in duplicate to extract study characteristics and data. These were confirmed with authors. We conducted a random effects meta-analysis on available data and tested for interaction in planned subgroups.
Main Outcomes
The standardized mean difference (SMD) effect of resiliency training programs on 1) resilience/hardiness, 2) quality of life/well-being, 3) self-efficacy/activation, 4) depression, 5) stress, and 6) anxiety.
Results
We found 25 small trials at moderate to high risk of bias. Interventions varied in format and theoretical approach. Random effects meta-analysis showed a moderate effect of generalized stress-directed programs on enhancing resilience [pooled SMD 0.37 (95% CI 0.18, 0.57) p = .0002; I² = 41%] within 3 months of follow up. Improvement in other outcomes was favorable to the interventions and reached statistical significance after removing two studies at high risk of bias. Trauma-induced stress-directed programs significantly improved stress [−0.53 (−1.04, −0.03) p = .03; I² = 73%] and depression [−0.51 (−0.92, −0.10) p = .04; I² = 61%].
Conclusions
We found evidence warranting low confidence that resiliency training programs have a small to moderate effect at improving resilience and other mental health outcomes. Further study is needed to better define the resilience construct and to design interventions specific to it.
Registration Number
PROSPERO #CRD42014007185
17.
18.
Background
Next-generation sequencing technologies are rapidly generating whole-genome datasets for an increasing number of organisms. However, phylogenetic reconstruction of genomic data remains difficult because de novo assembly for non-model genomes and multi-genome alignment are challenging.
Results
To greatly simplify the analysis, we present an Assembly and Alignment-Free (AAF) method (https://sourceforge.net/projects/aaf-phylogeny) that constructs phylogenies directly from unassembled genome sequence data, bypassing both genome assembly and alignment. Using mathematical calculations, models of sequence evolution, and simulated sequencing of published genomes, we address both evolutionary and sampling issues caused by direct reconstruction, including homoplasy, sequencing errors, and incomplete sequencing coverage. From these results, we calculate the statistical properties of the pairwise distances between genomes, allowing us to optimize parameter selection and perform bootstrapping. As a test case with real data, we successfully reconstructed the phylogeny of 12 mammals using raw sequencing reads. We also applied AAF to 21 tropical tree genome datasets with low coverage to demonstrate its effectiveness on non-model organisms.
Conclusion
Our AAF method opens up phylogenomics for species without an appropriate reference genome or high sequence coverage, and rapidly creates a phylogenetic framework for further analysis of genome structure and diversity among non-model organisms.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1647-5) contains supplementary material, which is available to authorized users.
19.
Nawar Diar Bakerly Ashley Woodcock John P. New J. Martin Gibson Wei Wu David Leather Jørgen Vestbo 《Respiratory research》2015,16(1)
Background
New treatments need to be evaluated in real-world clinical practice to account for co-morbidities, adherence and polypharmacy.
Methods
Patients with chronic obstructive pulmonary disease (COPD), ≥40 years old, with exacerbation in the previous 3 years are randomised 1:1 to once-daily fluticasone furoate 100 μg/vilanterol 25 μg in a novel dry-powder inhaler versus continuing their existing therapy. The primary endpoint is the mean annual rate of COPD exacerbations; an electronic medical record allows real-time collection and monitoring of endpoint and safety data.
Conclusions
The Salford Lung Study is the world’s first pragmatic randomised controlled trial of a pre-licensed medication in COPD.
Trial registration
ClinicalTrials.gov identifier: NCT01551758
20.
Kersten Villringer Ulrike Grittner Lars-Arne Schaafs Christian H. Nolte Heinrich Audebert Jochen B. Fiebach 《PloS one》2014,9(10)