首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
2.

Introduction

Adoption of automatic profiling tools for 1H-NMR-based metabolomic studies still lags behind other approaches in the absence of the flexibility and interactivity necessary to adapt to the properties of study data sets of complex matrices.

Objectives

To provide an open source tool that fully integrates these needs and enables the reproducibility of the profiling process.

Methods

rDolphin incorporates novel techniques to optimize exploratory analysis, metabolite identification, and validation of profiling output quality.

Results

The information and quality achieved in two public datasets of complex matrices are maximized.

Conclusion

rDolphin is an open-source R package (http://github.com/danielcanueto/rDolphin) able to provide the best balance between accuracy, reproducibility and ease of use.
  相似文献   

3.

Background

Horizontal gene transfer (HGT), a process of acquisition and fixation of foreign genetic material, is an important biological phenomenon. Several approaches to HGT inference have been proposed. However, most of them either rely on approximate, non-phylogenetic methods or on the tree reconciliation, which is computationally intensive and sensitive to parameter values.

Results

We investigate the locus tree inference problem as a possible alternative that combines the advantages of both approaches. We present several algorithms to solve the problem in the parsimony framework. We introduce a novel tree mapping, which allows us to obtain a heuristic solution to the problems of locus tree inference and duplication classification.

Conclusions

Our approach allows for faster comparisons of gene and species trees and improves known algorithms for duplication inference in the presence of polytomies in the species trees. We have implemented our algorithms in a software tool available at https://github.com/mciach/LocusTreeInference.
  相似文献   

4.

Background

Various algorithms have been developed to predict fetal trisomies using cell-free DNA in non-invasive prenatal testing (NIPT). As basis for prediction, a control group of non-trisomy samples is needed. Prediction accuracy is dependent on the characteristics of this group and can be improved by reducing variability between samples and by ensuring the control group is representative for the sample analyzed.

Results

NIPTeR is an open-source R Package that enables fast NIPT analysis and simple but flexible workflow creation, including variation reduction, trisomy prediction algorithms and quality control. This broad range of functions allows users to account for variability in NIPT data, calculate control group statistics and predict the presence of trisomies.

Conclusion

NIPTeR supports laboratories processing next-generation sequencing data for NIPT in assessing data quality and determining whether a fetal trisomy is present. NIPTeR is available under the GNU LGPL v3 license and can be freely downloaded from https://github.com/molgenis/NIPTeR or CRAN.
  相似文献   

5.

Introduction

The field of metabolomics has expanded greatly over the past two decades, both as an experimental science with applications in many areas, as well as in regards to data standards and bioinformatics software tools. The diversity of experimental designs and instrumental technologies used for metabolomics has led to the need for distinct data analysis methods and the development of many software tools.

Objectives

To compile a comprehensive list of the most widely used freely available software and tools that are used primarily in metabolomics.

Methods

The most widely used tools were selected for inclusion in the review by either ≥ 50 citations on Web of Science (as of 08/09/16) or the use of the tool being reported in the recent Metabolomics Society survey. Tools were then categorised by the type of instrumental data (i.e. LC–MS, GC–MS or NMR) and the functionality (i.e. pre- and post-processing, statistical analysis, workflow and other functions) they are designed for.

Results

A comprehensive list of the most used tools was compiled. Each tool is discussed within the context of its application domain and in relation to comparable tools of the same domain. An extended list including additional tools is available at https://github.com/RASpicer/MetabolomicsTools which is classified and searchable via a simple controlled vocabulary.

Conclusion

This review presents the most widely used tools for metabolomics analysis, categorised based on their main functionality. As future work, we suggest a direct comparison of tools’ abilities to perform specific data analysis tasks e.g. peak picking.
  相似文献   

6.

Background

Today researchers can choose from many bioinformatics protocols for all types of life sciences research, computational environments and coding languages. Although the majority of these are open source, few of them possess all virtues to maximize reuse and promote reproducible science. Wikipedia has proven a great tool to disseminate information and enhance collaboration between users with varying expertise and background to author qualitative content via crowdsourcing. However, it remains an open question whether the wiki paradigm can be applied to bioinformatics protocols.

Results

We piloted PyPedia, a wiki where each article is both implementation and documentation of a bioinformatics computational protocol in the python language. Hyperlinks within the wiki can be used to compose complex workflows and induce reuse. A RESTful API enables code execution outside the wiki. Initial content of PyPedia contains articles for population statistics, bioinformatics format conversions and genotype imputation. Use of the easy to learn wiki syntax effectively lowers the barriers to bring expert programmers and less computer savvy researchers on the same page.

Conclusions

PyPedia demonstrates how wiki can provide a collaborative development, sharing and even execution environment for biologists and bioinformaticians that complement existing resources, useful for local and multi-center research teams.

Availability

PyPedia is available online at: http://www.pypedia.com. The source code and installation instructions are available at: https://github.com/kantale/PyPedia_server. The PyPedia python library is available at: https://github.com/kantale/pypedia. PyPedia is open-source, available under the BSD 2-Clause License.
  相似文献   

7.
8.

Background

Single-cell RNA sequencing (scRNA-Seq) is an emerging technology that has revolutionized the research of the tumor heterogeneity. However, the highly sparse data matrices generated by the technology have posed an obstacle to the analysis of differential gene regulatory networks.

Results

Addressing the challenges, this study presents, as far as we know, the first bioinformatics tool for scRNA-Seq-based differential network analysis (scdNet). The tool features a sample size adjustment of gene-gene correlation, comparison of inter-state correlations, and construction of differential networks. A simulation analysis demonstrated the power of scdNet in the analyses of sparse scRNA-Seq data matrices, with low requirement on the sample size, high computation efficiency, and tolerance of sequencing noises. Applying the tool to analyze two datasets of single circulating tumor cells (CTCs) of prostate cancer and early mouse embryos, our data demonstrated that differential gene regulation plays crucial roles in anti-androgen resistance and early embryonic development.

Conclusions

Overall, the tool is widely applicable to datasets generated by the emerging technology to bring biological insights into tumor heterogeneity and other studies. MATLAB implementation of scdNet is available at https://github.com/ChenLabGCCRI/scdNet.
  相似文献   

9.

Background

Flux balance analysis (FBA) is a widely-used method for analyzing metabolic networks. However, most existing tools that implement FBA require downloading software and writing code. Furthermore, FBA generates predictions for metabolic networks with thousands of components, so meaningful changes in FBA solutions can be difficult to identify. These challenges make it difficult for beginners to learn how FBA works.

Results

To meet this need, we present Escher-FBA, a web application for interactive FBA simulations within a pathway visualization. Escher-FBA allows users to set flux bounds, knock out reactions, change objective functions, upload metabolic models, and generate high-quality figures without downloading software or writing code. We provide detailed instructions on how to use Escher-FBA to replicate several FBA simulations that generate real scientific hypotheses.

Conclusions

We designed Escher-FBA to be as intuitive as possible so that users can quickly and easily understand the core concepts of FBA. The web application can be accessed at https://sbrg.github.io/escher-fba.
  相似文献   

10.

Background

The integration of high-quality, genome-wide analyses offers a robust approach to elucidating genetic factors involved in complex human diseases. Even though several methods exist to integrate heterogeneous omics data, most biologists still manually select candidate genes by examining the intersection of lists of candidates stemming from analyses of different types of omics data that have been generated by imposing hard (strict) thresholds on quantitative variables, such as P-values and fold changes, increasing the chance of missing potentially important candidates.

Methods

To better facilitate the unbiased integration of heterogeneous omics data collected from diverse platforms and samples, we propose a desirability function framework for identifying candidate genes with strong evidence across data types as targets for follow-up functional analysis. Our approach is targeted towards disease systems with sparse, heterogeneous omics data, so we tested it on one such pathology: spontaneous preterm birth (sPTB).

Results

We developed the software integRATE, which uses desirability functions to rank genes both within and across studies, identifying well-supported candidate genes according to the cumulative weight of biological evidence rather than based on imposition of hard thresholds of key variables. Integrating 10 sPTB omics studies identified both genes in pathways previously suspected to be involved in sPTB as well as novel genes never before linked to this syndrome. integRATE is available as an R package on GitHub (https://github.com/haleyeidem/integRATE).

Conclusions

Desirability-based data integration is a solution most applicable in biological research areas where omics data is especially heterogeneous and sparse, allowing for the prioritization of candidate genes that can be used to inform more targeted downstream functional analyses.
  相似文献   

11.

Background

The fundamental challenge in optimally aligning homologous sequences is to define a scoring scheme that best reflects the underlying biological processes. Maximising the overall number of matches in the alignment does not always reflect the patterns by which nucleotides mutate. Efficiently implemented algorithms that can be parameterised to accommodate more complex non-linear scoring schemes are thus desirable.

Results

We present Cola, alignment software that implements different optimal alignment algorithms, also allowing for scoring contiguous matches of nucleotides in a nonlinear manner. The latter places more emphasis on short, highly conserved motifs, and less on the surrounding nucleotides, which can be more diverged. To illustrate the differences, we report results from aligning 14,100 sequences from 3' untranslated regions of human genes to 25 of their mammalian counterparts, where we found that a nonlinear scoring scheme is more consistent than a linear scheme in detecting short, conserved motifs.

Conclusions

Cola is freely available under LPGL from https://github.com/nedaz/cola.
  相似文献   

12.
13.

Background

Motif analysis methods have long been central for studying biological function of nucleotide sequences. Functional genomics experiments extend their potential. They typically generate sequence lists ranked by an experimentally acquired functional property such as gene expression or protein binding affinity. Current motif discovery tools suffer from limitations in searching large motif spaces, and thus more complex motifs may not be included. There is thus a need for motif analysis methods that are tailored for analyzing specific complex motifs motivated by biological questions and hypotheses rather than acting as a screen based motif finding tool.

Methods

We present Regmex (REGular expression Motif EXplorer), which offers several methods to identify overrepresented motifs in ranked lists of sequences. Regmex uses regular expressions to define motifs or families of motifs and embedded Markov models to calculate exact p-values for motif observations in sequences. Biases in motif distributions across ranked sequence lists are evaluated using random walks, Brownian bridges, or modified rank based statistics. A modular setup and fast analytic p value evaluations make Regmex applicable to diverse and potentially large-scale motif analysis problems.

Results

We demonstrate use cases of combined motifs on simulated data and on expression data from micro RNA transfection experiments. We confirm previously obtained results and demonstrate the usability of Regmex to test a specific hypothesis about the relative location of microRNA seed sites and U-rich motifs. We further compare the tool with an existing motif discovery tool and show increased sensitivity.

Conclusions

Regmex is a useful and flexible tool to analyze motif hypotheses that relates to large data sets in functional genomics. The method is available as an R package (https://github.com/muhligs/regmex).
  相似文献   

14.

Background

Studies that aim at explaining phenotypes or disease susceptibility by genetic or epigenetic variants often rely on clustering methods to stratify individuals or samples. While statistical associations may point at increased risk for certain parts of the population, the ultimate goal is to make precise predictions for each individual. This necessitates tools that allow for the rapid inspection of each data point, in particular to find explanations for outliers.

Results

ACES is an integrative cluster- and phenotype-browser, which implements standard clustering methods, as well as multiple visualization methods in which all sample information can be displayed quickly. In addition, ACES can automatically mine a list of phenotypes for cluster enrichment, whereby the number of clusters and their boundaries are estimated by a novel method. For visual data browsing, ACES provides a 2D or 3D PCA or Heat Map view. ACES is implemented in Java, with a focus on a user-friendly, interactive, graphical interface.

Conclusions

ACES has been proven an invaluable tool for analyzing large, pre-filtered DNA methylation data sets and RNA-Sequencing data, due to its ease to link molecular markers to complex phenotypes. The source code is available from https://github.com/GrabherrGroup/ACES.
  相似文献   

15.

Background

Centrifugation is an indispensable procedure for plasma sample preparation, but applied conditions can vary between labs.

Aim

Determine whether routinely used plasma centrifugation protocols (1500×g 10 min; 3000×g 5 min) influence non-targeted metabolomic analyses.

Methods

Nuclear magnetic resonance spectroscopy (NMR) and High Resolution Mass Spectrometry (HRMS) data were evaluated with sparse partial least squares discriminant analyses and compared with cell count measurements.

Results

Besides significant differences in platelet count, we identified substantial alterations in NMR and HRMS data related to the different centrifugation protocols.

Conclusion

Already minor differences in plasma centrifugation can significantly influence metabolomic patterns and potentially bias metabolomics studies.
  相似文献   

16.

Introduction

Polycystic ovary syndrome (PCOS) is a complex, heterogeneous endocrinological disorder with uncertain pathogenesis and is very common in women of reproductive age. There are few reports of utilizing metabolomics approach to understand the complex pathophysiology of PCOS. However, excluding one previous NMR-based metabolomics study, none of the study was conducted in Indian population.

Objective

The study aims to compare the serum metabolomic profile of PCOS women with controls from the Eastern region of India.

Methods

PCOS women (n?=?35) and healthy control women (n?=?30) undergoing tubal ligation were recruited for this study. Serum metabolic profiles were generated using liquid chromatography–tandem mass spectrometry (LC-MS/MS) and gas chromatography–mass spectrometry (GC-MS). Multivariate statistical analysis was applied to spectral data obtained from both the LC-MS/MS and GC/MS.

Results

Nine metabolites were identified to be most significantly dysregulated in sera of PCOS women; however, few other identified metabolites were also altered but with lesser significance. Amongst these metabolites, riboflavin, sucrose, adenine and N-acetyl glycine, phosphoric acid and cortisol were down-regulated, whereas, thymine, cystathionine, and phenylalanine were up-regulated in PCOS when compared with controls. The observed changes in metabolite expression suggested alterations in aminoacyl-tRNA biosynthesis, metabolism of nitrogen, alanine-aspartate-glutamate, galactose, glycine-serine-threonine, and pyrimidine-purine among several metabolic pathways possibly implicated in these PCOS women.

Conclusion

The altered metabolites identified in PCOS women of Eastern Indian population, provide insight into current perceptive of the disease pathology, metabolic involvements, and may be considered as putative markers of PCOS.
  相似文献   

17.

Background

Improvements in sequencing technology now allow easy acquisition of large datasets; however, analyzing these data for phylogenetics can be challenging. We have developed a novel method to rapidly obtain homologous genomic data for phylogenetics directly from next-generation sequencing reads without the use of a reference genome. This software, called SISRS, avoids the time consuming steps of de novo whole genome assembly, multiple genome alignment, and annotation.

Results

For simulations SISRS is able to identify large numbers of loci containing variable sites with phylogenetic signal. For genomic data from apes, SISRS identified thousands of variable sites, from which we produced an accurate phylogeny. Finally, we used SISRS to identify phylogenetic markers that we used to estimate the phylogeny of placental mammals. We recovered eight phylogenies that resolved the basal relationships among mammals using datasets with different levels of missing data. The three alternate resolutions of the basal relationships are consistent with the major hypotheses for the relationships among mammals, all of which have been supported previously by different molecular datasets.

Conclusions

SISRS has the potential to transform phylogenetic research. This method eliminates the need for expensive marker development in many studies by using whole genome shotgun sequence data directly. SISRS is open source and freely available at https://github.com/rachelss/SISRS/releases.
  相似文献   

18.

Background

Thyroid cancer is the most common endocrine tumor with a steady increase in incidence. It is classified into multiple histopathological subtypes with potentially distinct molecular mechanisms. Identifying the most relevant genes and biological pathways reported in the thyroid cancer literature is vital for understanding of the disease and developing targeted therapeutics.

Results

We developed a large-scale text mining system to generate a molecular profiling of thyroid cancer subtypes. The system first uses a subtype classification method for the thyroid cancer literature, which employs a scoring scheme to assign different subtypes to articles. We evaluated the classification method on a gold standard derived from the PubMed Supplementary Concept annotations, achieving a micro-average F1-score of 85.9% for primary subtypes. We then used the subtype classification results to extract genes and pathways associated with different thyroid cancer subtypes and successfully unveiled important genes and pathways, including some instances that are missing from current manually annotated databases or most recent review articles.

Conclusions

Identification of key genes and pathways plays a central role in understanding the molecular biology of thyroid cancer. An integration of subtype context can allow prioritized screening for diagnostic biomarkers and novel molecular targeted therapeutics. Source code used for this study is made freely available online at https://github.com/chengkun-wu/GenesThyCan.
  相似文献   

19.
20.

Background

Cancer is an evolutionary process characterized by the accumulation of somatic mutations in a population of cells that form a tumor. One frequent type of mutations is copy number aberrations, which alter the number of copies of genomic regions. The number of copies of each position along a chromosome constitutes the chromosome’s copy-number profile. Understanding how such profiles evolve in cancer can assist in both diagnosis and prognosis.

Results

We model the evolution of a tumor by segmental deletions and amplifications, and gauge distance from profile \(\mathbf {a}\) to \(\mathbf {b}\) by the minimum number of events needed to transform \(\mathbf {a}\) into \(\mathbf {b}\). Given two profiles, our first problem aims to find a parental profile that minimizes the sum of distances to its children. Given k profiles, the second, more general problem, seeks a phylogenetic tree, whose k leaves are labeled by the k given profiles and whose internal vertices are labeled by ancestral profiles such that the sum of edge distances is minimum.

Conclusions

For the former problem we give a pseudo-polynomial dynamic programming algorithm that is linear in the profile length, and an integer linear program formulation. For the latter problem we show it is NP-hard and give an integer linear program formulation that scales to practical problem instance sizes. We assess the efficiency and quality of our algorithms on simulated instances.
  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号