Similar articles
 Found 20 similar articles (search time: 31 ms)
1.

Background

A typical human genome differs from the reference genome at 4-5 million sites. This diversity is increasingly catalogued in repositories such as ExAC/gnomAD, which comprise >15,000 whole genomes and >126,000 exome sequences from different individuals. Despite this enormous diversity, resequencing data workflows are still based on a single human reference genome. Identification and genotyping of genetic variants are typically carried out on short-read data aligned to a single reference, disregarding the underlying variation.

Results

We propose a new unified framework for variant calling with short-read data utilizing a representation of human genetic variation – a pan-genomic reference. We provide a modular pipeline that can be seamlessly incorporated into existing sequencing data analysis workflows. Our tool is open source and available online: https://gitlab.com/dvalenzu/PanVC.

Conclusions

Our experiments show that by replacing a standard human reference with a pan-genomic one we achieve an improvement in single-nucleotide variant calling accuracy and in short indel calling accuracy over the widely adopted Genome Analysis Toolkit (GATK) in difficult genomic regions.

2.

Background

High-throughput custom designed genotyping arrays are a valuable resource for biologically focused research studies and increasingly for validation of variation predicted by next-generation sequencing (NGS) technologies. We investigate the Illumina GoldenGate chemistry using custom designed VeraCode and sentrix array matrix (SAM) assays for each of these applications, respectively. We highlight applications for interpretation of Illumina generated genotype cluster plots to maximise data inclusion and reduce genotyping errors.

Findings

We illustrate the dramatic effect of outliers in genotype calling and data interpretation, as well as suggest simple means to avoid genotyping errors. Furthermore we present this platform as a successful method for two-cluster rare or non-autosomal variant calling. The success of high-throughput technologies to accurately call rare variants will become an essential feature for future association studies. Finally, we highlight additional advantages of the Illumina GoldenGate chemistry in generating unusually segregated cluster plots that identify potential NGS generated sequencing error resulting from minimal coverage.

Conclusions

We demonstrate the importance of visually inspecting genotype cluster plots generated by the Illumina software and issue warnings regarding commonly accepted quality control parameters. In addition to suggesting applications to minimise data exclusion, we propose that the Illumina cluster plots may be helpful in identifying potential input sequence errors, which is particularly important for studies that validate NGS-generated variation.

3.
4.

Introduction

Concerning NMR-based metabolomics, 1D spectra processing often requires an expert eye for disentangling the intertwined peaks.

Objectives

The objective of NMRProcFlow is to assist the expert in this task as effectively as possible, without requiring programming skills.

Methods

NMRProcFlow was developed to be a graphical and interactive 1D NMR (1H & 13C) spectra processing tool.

Results

NMRProcFlow (http://nmrprocflow.org), dedicated to metabolic fingerprinting and targeted metabolomics, covers all spectra processing steps including baseline correction, chemical shift calibration and alignment.

Conclusion

Biologists and NMR spectroscopists can easily interact and develop synergies by visualizing the NMR spectra along with their corresponding experimental-factor levels, thus setting a bridge between experimental design and subsequent statistical analyses.

5.

Introduction

Botanicals containing iridoid and phenylethanoid/phenylpropanoid glycosides are used worldwide for the treatment of inflammatory musculoskeletal conditions that are primary causes of human years lived with disability, such as arthritis and lower back pain.

Objectives

We report the analysis of candidate anti-inflammatory metabolites of several endemic Scrophularia species and Verbascum thapsus used medicinally by peoples of North America.

Methods

Leaves, stems, and roots were analyzed by ultra-performance liquid chromatography-mass spectrometry (UPLC-MS) and partial least squares-discriminant analysis (PLS-DA) was performed in MetaboAnalyst 3.0 after processing the datasets in Progenesis QI.

Results

Comparison of the datasets revealed significant and differential accumulation of iridoid and phenylethanoid/phenylpropanoid glycosides in the tissues of the endemic Scrophularia species and Verbascum thapsus.

Conclusions

Our investigation identified several species of pharmacological interest as good sources for harpagoside and other important anti-inflammatory metabolites.

6.

Objective

To examine the activities of residual enzymes in dried shiitake mushrooms, which are a traditional foodstuff in Japanese cuisine, for possible applications in food processing.

Results

Polysaccharide-degrading enzymes remained intact in dried shiitake mushrooms, and the activities of amylase, β-glucosidase and pectinase were high. Potato digestion was tested using dried shiitake powder. The enzymes reacted with potato tuber specimens to solubilize sugars even under a heterogeneous solid-state condition, and their reaction modes differed at 38 and 50 °C.

Conclusion

Dried shiitake mushrooms have a potential use in food processing as an enzyme preparation.

7.

Introduction

Feces are easy to collect and give direct access to endogenous and microbial metabolites.

Objectives

In a context of lack of consensus about fecal sample preparation, especially in animal species, we developed a robust protocol allowing untargeted LC-HRMS fingerprinting.

Methods

The conditions of extraction (quantity, preparation, solvents, dilutions) were investigated in bovine feces.

Results

A rapid and simple protocol was developed, involving feces extraction with methanol (1/3, M/V) followed by centrifugation and a filtration step (10 kDa).

Conclusion

The workflow generated repeatable and informative fingerprints for robust metabolome characterization.

8.

Introduction

Data processing is one of the biggest problems in metabolomics, given the high number of samples analyzed and the need of multiple software packages for each step of the processing workflow.

Objectives

To merge, in a single platform, the steps required for metabolomics data processing.

Methods

KniMet is a workflow for the processing of mass spectrometry-metabolomics data based on the KNIME Analytics platform.

Results

The approach includes key steps to follow in metabolomics data processing: feature filtering, missing value imputation, normalization, batch correction and annotation.

Conclusion

KniMet provides the user with a local, modular and customizable workflow for the processing of both GC–MS and LC–MS open profiling data.
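The first three steps named in the Results (feature filtering, missing value imputation, normalization) can be illustrated with a minimal sketch. This is not KniMet code and does not use KNIME. The function name `process_features`, the 50% missingness cut-off, half-minimum imputation and total-signal normalization are all assumptions chosen only for illustration.

```python
def process_features(table, max_missing=0.5):
    """Filter, impute and normalize a feature table.

    table: dict mapping feature name -> list of intensities per sample,
           with None marking a missing value (hypothetical layout).
    """
    # 1. Feature filtering: drop features missing in too many samples.
    kept = {
        name: values
        for name, values in table.items()
        if sum(v is None for v in values) / len(values) <= max_missing
    }
    if not kept:
        return {}

    # 2. Missing value imputation: half of the smallest observed value
    #    (an assumed, commonly used rule; other rules are possible).
    imputed = {}
    for name, values in kept.items():
        observed = [v for v in values if v is not None]
        fill = min(observed) / 2
        imputed[name] = [fill if v is None else v for v in values]

    # 3. Normalization: divide each sample by its total signal.
    n_samples = len(next(iter(imputed.values())))
    totals = [sum(values[i] for values in imputed.values()) for i in range(n_samples)]
    return {
        name: [v / t for v, t in zip(values, totals)]
        for name, values in imputed.items()
    }


# Usage with made-up intensities for three features and three samples.
example = {
    "A": [10.0, None, 30.0],
    "B": [None, None, 5.0],   # missing in 2 of 3 samples, so it is filtered out
    "C": [90.0, 45.0, 70.0],
}
result = process_features(example)
```

Batch correction and annotation are left out because they depend on study design and reference libraries that the abstract does not describe.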

9.

Introduction

Processing delays after blood collection are a common pre-analytical condition in large epidemiologic studies. It is critical to evaluate the suitability of blood samples with processing delays for metabolomics analysis, as such delays are a potential source of variation that could attenuate associations between metabolites and disease outcomes.

Objectives

We aimed to evaluate the reproducibility of metabolites over extended processing delays up to 48 h. We also aimed to test the reproducibility of the metabolomics platform.

Methods

Blood samples were collected from 18 healthy volunteers. Blood was stored in the refrigerator and processed for plasma at 0, 15, 30, and 48 h after collection. Plasma samples were metabolically profiled using an untargeted, ultrahigh performance liquid chromatography–tandem mass spectrometry (UPLC–MS/MS) platform. Reproducibility of 1012 metabolites over processing delays and reproducibility of the platform were determined by intraclass correlation coefficients (ICCs) with variance components estimated from mixed-effects models.

Results

The majority of metabolites (approximately 70% of 1012) were highly reproducible (ICCs ≥ 0.75) over 15-, 30- or 48-h processing delays. Nucleotides, energy-related metabolites, peptides, and carbohydrates were most affected by processing delays. The platform was highly reproducible with a median technical ICC of 0.84 (interquartile range 0.68–0.93).

Conclusion

Most metabolites measured by the UPLC–MS/MS platform show acceptable reproducibility up to 48-h processing delays. Metabolites of certain pathways need to be interpreted cautiously in relation to outcomes in epidemiologic studies with prolonged processing delays.
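An intraclass correlation coefficient is the share of total variance attributable to differences between subjects. The sketch below uses the simple one-way ANOVA estimator rather than the mixed-effects models described in the Methods, so it is an approximation under stated assumptions: a balanced design with no missing values, and a hypothetical function name `icc_oneway`. The 0.75 threshold is the one quoted in the Results.

```python
from statistics import mean


def icc_oneway(rows):
    """One-way random-effects ICC for a balanced subjects x time-points table.

    rows: list of per-subject lists, one value per processing delay.
    """
    n = len(rows)        # number of subjects
    k = len(rows[0])     # number of measurements per subject
    grand = mean([v for row in rows for v in row])
    subject_means = [mean(row) for row in rows]

    # Between-subject and within-subject mean squares.
    msb = k * sum((m - grand) ** 2 for m in subject_means) / (n - 1)
    msw = sum(
        (v - m) ** 2 for row, m in zip(rows, subject_means) for v in row
    ) / (n * (k - 1))

    return (msb - msw) / (msb + (k - 1) * msw)


# Made-up values for one metabolite: 3 subjects measured at 2 delays.
icc = icc_oneway([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
highly_reproducible = icc >= 0.75
```

With these made-up values the between-subject spread dominates the drift between delays, so the metabolite would be classed as highly reproducible.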

10.

Background

Advances in high-throughput next-generation sequencing (NGS) have opened innumerable opportunities for new genomic research. However, the abundance of NGS data makes it harder to distinguish true biological variants from sequencing errors during downstream analysis. Many error correction methods have been developed to correct erroneous NGS reads before further analysis, but independent evaluation of the impact of such dataset features as read length, genome size, and coverage depth on their performance is lacking. This comparative study aims to investigate the strengths, weaknesses, and limitations of some of the newest k-spectrum-based methods and to provide recommendations for users in selecting suitable methods for specific NGS datasets.

Methods

Six k-spectrum-based methods, i.e., Reptile, Musket, Bless, Bloocoo, Lighter, and Trowel, were compared using six simulated sets of paired-end Illumina sequencing data. These NGS datasets varied in coverage depth (10× to 120×), read length (36 to 100 bp), and genome size (4.6 to 143 MB). The Error Correction Evaluation Toolkit (ECET) was employed to derive a suite of metrics (i.e., true positives, false positives, false negatives, recall, precision, gain, and F-score) for assessing the correction quality of each method.

Results

Results from computational experiments indicate that Musket had the best overall performance across the spectra of examined variants reflected in the six datasets. The lowest accuracy of Musket (F-score = 0.81) occurred on a dataset with a medium read length (56 bp), a medium coverage (50×), and a small-sized genome (5.4 MB). The other five methods underperformed (F-score < 0.80) and/or failed to process one or more datasets.

Conclusions

This study demonstrates that various factors such as coverage depth, read length, and genome size may influence the performance of individual k-spectrum-based error correction methods. Care must therefore be taken in choosing appropriate methods for error correction of specific NGS datasets. Based on our comparative study, we recommend Musket as the top choice because of its consistently superior performance across all six testing datasets. Further extensive studies are warranted to assess these methods using experimental datasets generated by NGS platforms (e.g., 454, SOLiD, and Ion Torrent) under more diversified parameter settings (k-mer values and edit distances) and to compare them against other non-k-spectrum-based classes of error correction methods.
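The metrics listed in the Methods can all be derived from the true positive (TP), false positive (FP) and false negative (FN) counts. The sketch below is not ECET code. It assumes the usual definitions, with gain taken as (TP − FP) / (TP + FN), and the function name `correction_metrics` is hypothetical.

```python
def correction_metrics(tp, fp, fn):
    """Derive recall, precision, gain and F-score from error-correction counts.

    tp: erroneous bases that were corrected properly
    fp: correct bases that were wrongly changed
    fn: erroneous bases that were left uncorrected
    """
    errors = tp + fn      # all true errors in the reads
    changes = tp + fp     # all changes made by the tool
    recall = tp / errors if errors else 0.0
    precision = tp / changes if changes else 0.0
    # Gain penalizes newly introduced errors; it can be negative.
    gain = (tp - fp) / errors if errors else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return {"recall": recall, "precision": precision, "gain": gain, "f_score": f_score}


# Made-up counts, for illustration only.
metrics = correction_metrics(tp=80, fp=10, fn=20)
```

A negative gain means the tool introduced more errors than it removed, which recall and precision alone do not make obvious.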

11.

Background

Pathogen identification is critical for the proper diagnosis and precise treatment of infective endocarditis (IE). Although blood and valve cultures are the gold standard for IE pathogen detection, many cases are culture-negative, especially in patients who have received long-term antibiotic treatment, and precise diagnosis has therefore become a major challenge in the clinic. Metagenomic sequencing can provide information on both the pathogenic strain and the antibiotic susceptibility profile of patient samples without culturing, offering a powerful method to deal with culture-negative cases.

Methods

To assess the feasibility of a metagenomic approach to detect the causative pathogens in resected valves from IE patients, we employed both next-generation sequencing and Oxford Nanopore Technologies MinION nanopore sequencing for pathogen and antimicrobial resistance detection in seven IE patients who were clinically culture-negative according to the modified Duke criteria. Using our in-house bioinformatics pipeline, we analyzed the sequencing results from both platforms to identify pathogens directly from the resected valves.

Results

Our results showed that both metagenomic methods can be applied to detect the causative pathogen in all IE samples. Moreover, we were able to simultaneously characterize the respective antimicrobial resistance features.

Conclusion

Metagenomic methods for IE detection can provide clinicians with valuable information to diagnose and treat IE patients after valve replacement surgery. However, more efforts should be made to optimize protocols for sample processing, sequencing and bioinformatics analysis.

12.

Background

There are several reports on anatomical differences of the meniscus. However, there are only a few reports on abnormalities in both menisci and anatomical differences in anterior cruciate ligament insertions.

Case presentation

This is a case report of a 36-year-old Hispanic man presenting symptoms, including knee pain, locking, and effusion, with an anatomical abnormality of the menisci corresponding to the fusion of the posterior horns of the menisci in tandem with the insertion of the posterior meniscus fibers in the anterior cruciate ligament.

Conclusions

This is the first study describing a meniscus anatomical variant with isolated posterior junction of the posterior horn with an anomalous insertion to the anterior cruciate ligament. The recognition of meniscus variants is important as they can be mistaken for more significant pathology on magnetic resonance images.

13.

Background

Studies that ascertain families containing multiple relatives affected by disease can be useful for identification of causal, rare variants from next-generation sequencing data.

Results

We present the R package SimRVPedigree, which allows researchers to simulate pedigrees ascertained on the basis of multiple, affected relatives. By incorporating the ascertainment process in the simulation, SimRVPedigree allows researchers to better understand the within-family patterns of relationship amongst affected individuals and ages of disease onset.

Conclusions

Through simulation, we show that affected members of a family segregating a rare disease variant tend to be more numerous and cluster in relationships more closely than those for sporadic disease. We also show that the family ascertainment process can lead to apparent anticipation in the age of onset. Finally, we use simulation to gain insight into the limit on the proportion of ascertained families segregating a causal variant. SimRVPedigree should be useful to investigators seeking insight into the family-based study design through simulation.

14.
15.

Background

Existing clustering approaches for microarray data do not adequately differentiate between subsets of co-expressed genes. We devised a novel approach that integrates expression and sequence data in order to generate functionally coherent and biologically meaningful subclusters of genes. Specifically, the approach clusters co-expressed genes on the basis of similar content and distributions of predicted statistically significant sequence motifs in their upstream regions.

Results

We applied our method to several sets of co-expressed genes and were able to define subsets with enrichment in particular biological processes and specific upstream regulatory motifs.

Conclusions

These results show the potential of our technique for functional prediction and regulatory motif identification from microarray data.

16.

Introduction

Human plasma metabolomics offers powerful tools for understanding disease mechanisms and identifying clinical biomarkers for diagnosis, efficacy prediction and patient stratification. Although storage conditions can affect the reliability of data from metabolites, strict control of these conditions remains challenging, particularly when clinical samples are included from multiple centers. Therefore, it is necessary to consider the stability profile of each analyte.

Objectives

The purpose of this study was to extract unstable metabolites from vast metabolome data and identify factors that cause instability.

Method

Plasma samples were obtained from five healthy volunteers, were stored under ten different conditions of time and temperature and were quantified using leading-edge metabolomics. Instability was evaluated by comparing quantitation values under each storage condition with those obtained after −80 °C storage.

Result

Stability profiling of the 992 metabolites showed time- and temperature-dependent increases in numbers of significantly changed metabolites. This large volume of data enabled comparisons of unstable metabolites with their related molecules and allowed identification of causative factors, including compound-specific enzymatic activity in plasma and chemical reactivity. Furthermore, these analyses indicated extreme instability of 1-docosahexaenoylglycerol, 1-arachidonoylglycerophosphate, cystine, cysteine and N6-methyladenosine.

Conclusion

A large volume of data regarding storage stability was obtained. These data are a contribution to the discovery of biomarker candidates without misselection based on unreliable values and to the establishment of suitable handling procedures for targeted biomarker quantification.

17.

Introduction

Modern omics experiments pertain not only to the measurement of many variables but also follow complex experimental designs where many factors are manipulated at the same time. This data can be conveniently analyzed using multivariate tools like ANOVA-simultaneous component analysis (ASCA) which allows interpretation of the variation induced by the different factors in a principal component analysis fashion. However, while in general only a subset of the measured variables may be related to the problem studied, all variables contribute to the final model and this may hamper interpretation.

Objectives

We introduce here a sparse implementation of ASCA termed group-wise ANOVA-simultaneous component analysis (GASCA) with the aim of obtaining models that are easier to interpret.

Methods

GASCA is based on the concept of group-wise sparsity introduced in group-wise principal components analysis where structure to impose sparsity is defined in terms of groups of correlated variables found in the correlation matrices calculated from the effect matrices.

Results

The GASCA model, containing only selected subsets of the original variables, is easier to interpret and describes relevant biological processes.

Conclusions

GASCA is applicable to any kind of omics data obtained through designed experiments such as, but not limited to, metabolomic, proteomic and gene expression data.

18.
19.

Background

Metagenomic methods directly sequence and analyse genome information from microbial communities. There are usually hundreds of genomes from different microbial species in the same community, and the main computational tasks in metagenomic data analysis include taxonomical and functional component examination of all genomes in the microbial community. Metagenomic data analysis is both data- and computation-intensive and requires extensive computational power. Most current metagenomic data analysis software was designed to be used on a single computer or a single computer cluster, which cannot keep up with the computational requirements of the fast-increasing number of large metagenomic projects. Therefore, advanced computational methods and pipelines have to be developed to meet this need for efficient analyses.

Result

In this paper, we propose Parallel-META, a GPU- and multi-core-CPU-based open-source pipeline for metagenomic data analysis, which enables the efficient and parallel analysis of multiple metagenomic datasets and the visualization of the results for multiple samples. In Parallel-META, the similarity-based database search is parallelized using GPU computing and multi-core CPU computing optimization. Experiments have shown that Parallel-META achieves at least a 15-fold speed-up over a traditional metagenomic data analysis method, with the same accuracy of results (http://www.computationalbioenergy.org/parallel-meta.html).

Conclusion

Parallel processing of current metagenomic data is very promising: with the current speed-up of 15 times and above, binning is no longer a very time-consuming process. Deeper analysis of the metagenomic data, such as the comparison of different samples, therefore becomes feasible in the pipeline, and some of these functionalities have been included in the Parallel-META pipeline.

20.

Introduction

Raspberries are becoming increasingly popular due to their reported health-beneficial properties. Despite the presence of only trace amounts of anthocyanins, yellow varieties seem to show similar or better effects in comparison to conventional raspberries.

Objectives

The aim of this work is to characterize the metabolic differences between red and yellow berries, focussing on the compounds showing a higher concentration in yellow varieties.

Methods

The metabolomic profile of 13 red and 12 yellow raspberries (of different varieties, locations and collection dates) was determined by UPLC–TOF-MS. A novel approach based on Pearson correlation on the extracted ion chromatograms was implemented to extract the pseudospectra of the most relevant biomarkers from high energy LC–MS runs. The raw data will be made publicly available on MetaboLights (MTBLS333).

Results

Among the metabolites showing higher concentration in yellow raspberries it was possible to identify a series of compounds showing a pseudospectrum similar to that of A-type procyanidin polymers. The annotation of this group of compounds was confirmed by specific MS/MS experiments and performing standard injections.

Conclusions

In berries lacking anthocyanins the polyphenol metabolism might be shifted to the formation of a novel class of A-type procyanidin polymers.
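The Methods mention extracting pseudospectra by Pearson correlation on extracted ion chromatograms (EICs). The idea is that ions originating from the same compound co-elute, so their EICs are highly correlated. The sketch below is not the authors' implementation. The function names, the 0.9 correlation cut-off and the m/z values in the usage example are assumptions made for illustration.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length intensity traces."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat trace carries no co-elution information
    return cov / sqrt(var_a * var_b)


def pseudospectrum(eics, target_mz, min_r=0.9):
    """Return the m/z values whose EIC correlates with the target ion's EIC.

    eics: dict mapping m/z -> list of intensities over the same scans.
    """
    reference = eics[target_mz]
    return sorted(
        mz for mz, trace in eics.items() if pearson(reference, trace) >= min_r
    )


# Made-up traces: two co-eluting ions and one unrelated background ion.
eics = {
    575.12: [0.0, 10.0, 50.0, 10.0, 0.0],
    287.05: [0.0, 5.0, 25.0, 5.0, 0.0],
    449.10: [30.0, 20.0, 10.0, 20.0, 30.0],
}
ions = pseudospectrum(eics, target_mz=575.12)
```

In this toy example the two co-eluting ions are grouped together and the background ion, whose trace moves in the opposite direction, is excluded.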

