Found 20 similar documents (search time: 31 ms)
1.
Background
One of the recent challenges of computational biology is the development of new algorithms, tools and software to facilitate predictive modeling of big data generated by high-throughput technologies in biomedical research.
Results
To meet these demands we developed PROPER, a package for visual evaluation of ranking classifiers for biological big data mining studies in the MATLAB environment.
Conclusion
PROPER is an efficient tool for optimization and comparison of ranking classifiers, providing over 20 different two- and three-dimensional performance curves.
2.
Background
The current literature establishes the importance of gene functional category and expression in promoting or suppressing duplicate gene loss after whole genome doubling in plants, a process known as fractionation. Inspired by studies that have reported gene expression to be the dominating factor in preventing duplicate gene loss, we analyzed the relative effect of functional category and expression.
Methods
We use multivariate methods to study data sets on gene retention, function and expression in rosids and asterids to estimate effects and assess their interaction.
Results
Our results suggest that the effects of functional category and expression on duplicate gene retention during fractionation are independent and have no statistical interaction.
Conclusion
In plants, functional category is the more dominant factor in explaining duplicate gene loss.
3.
Chao Xie Chin Lui Wesley Goi Daniel H. Huson Peter F. R. Little Rohan B. H. Williams 《BMC bioinformatics》2016,17(19):508
Background
Taxonomic profiling of microbial communities is often performed using small subunit ribosomal RNA (SSU) amplicon sequencing (16S or 18S), while environmental shotgun sequencing is often focused on functional analysis. Large shotgun datasets contain a significant number of SSU sequences and these can be exploited to perform an unbiased SSU-based taxonomic analysis.
Results
Here we present a new program called RiboTagger that identifies and extracts taxonomically informative ribotags located in a specified variable region of the SSU gene in a high-throughput fashion.
Conclusions
RiboTagger permits fast recovery of SSU-RNA sequences from shotgun nucleic acid surveys of complex microbial communities. The program targets all three domains of life, exhibits high sensitivity and specificity and is substantially faster than comparable programs.
4.
Introduction
Untargeted metabolomics is a powerful tool for biological discoveries. To analyze the complex raw data, significant advances in computational approaches have been made, yet it is not clear how exhaustive and reliable the data analysis results are.
Objectives
Assessment of the quality of raw data processing in untargeted metabolomics.
Methods
Five published untargeted metabolomics studies were reanalyzed.
Results
Omissions of at least 50 relevant compounds from the original results as well as examples of representative mistakes were reported for each study.
Conclusion
Incomplete raw data processing shows unexplored potential of current and legacy data.
5.
Jack W. Kent Jr 《BMC genetics》2016,17(Z2):S5
Background
New technologies for acquisition of genomic data, while offering unprecedented opportunities for genetic discovery, also impose severe burdens of interpretation and penalties for multiple testing.
Methods
The Pathway-based Analyses Group of the Genetic Analysis Workshop 19 (GAW19) sought reduction of multiple-testing burden through various approaches to aggregation of high-dimensional data in pathways informed by prior biological knowledge.
Results
Experimental methods tested included the use of "synthetic pathways" (random sets of genes) to estimate power and false-positive error rate of methods applied to simulated data; data reduction via independent components analysis, single-nucleotide polymorphism (SNP)-SNP interaction, and use of gene sets to estimate genetic similarity; and general assessment of the efficacy of prior biological knowledge to reduce the dimensionality of complex genomic data.
Conclusions
The work of this group explored several promising approaches to managing high-dimensional data, with the caveat that these methods are necessarily constrained by the quality of external bioinformatic annotation.
6.
Andrei Prodan Sultan Imangaliyev Henk S. Brand Martijn N. A. Rosema Evgeni Levin Wim Crielaard Bart J. F. Keijser Enno C. I. Veerman 《Metabolomics : Official journal of the Metabolomic Society》2016,12(9):147
Introduction
Understanding the changes occurring in the oral ecosystem during development of gingivitis could help improve prevention and treatment strategies for oral health. Erythritol is a non-caloric polyol proposed to have beneficial effects on oral health.
Objectives
To examine the effect of experimental gingivitis and the effect of erythritol on the salivary metabolome and salivary functional biochemistry.
Methods
In a two-week experimental gingivitis challenge intervention study, non-targeted, mass spectrometry-based metabolomic profiling was performed on saliva samples from 61 healthy adults, collected at five time-points. The effect of erythritol was studied in a randomized, controlled trial setting. Fourteen salivary biochemistry variables were measured with antibody- or enzymatic activity-based assays.
Results
Bacterial amino acid catabolites (cadaverine, N-acetylcadaverine, and α-hydroxyisovalerate) and end-products of bacterial alkali-producing pathways (N-α-acetylornithine and γ-aminobutyrate) increased significantly during the experimental gingivitis. Significant changes were found in a set of 13 salivary metabolite ratios composed of host cell membrane lipids involved in cell signaling, host responses to bacteria, and defense against free radicals. An increase in mevalonate was also observed. There were no significant effects of erythritol. No significant changes were found in functional salivary biochemistry.
Conclusions
The findings underline a dynamic interaction between the host and the oral microbial biofilm during an experimental induction of gingivitis.
7.
Background
High-throughput technologies, such as DNA microarray, have significantly advanced biological and biomedical research by enabling researchers to carry out genome-wide screens. One critical task in analyzing genome-wide datasets is to control the false discovery rate (FDR) so that the proportion of false positive features among those called significant is restrained. Recently a number of FDR control methods have been proposed and widely practiced, such as the Benjamini-Hochberg (BH) approach, the Storey approach and Significance Analysis of Microarrays (SAM).
Methods
This paper presents a straightforward yet powerful FDR control method termed miFDR, which aims to minimize FDR when calling a fixed number of significant features. We theoretically proved that the strategy used by miFDR is able to find the optimal number of significant features when the desired FDR is fixed.
Results
We compared miFDR with the BH approach, the Storey approach and SAM on both simulated datasets and public DNA microarray datasets. The results demonstrated that miFDR outperforms the others by identifying more significant features under the same FDR cut-offs. A literature search showed that many genes called only by miFDR are indeed relevant to the underlying biology of interest.
Conclusions
FDR control has been widely applied to the analysis of high-throughput datasets, allowing for rapid discoveries. Under the same FDR threshold, miFDR is able to identify more significant features than its competitors at a comparable level of complexity. Therefore, it can potentially have a great impact on biological and biomedical research.
Availability
miFDR is available from the authors upon request.
8.
Background
Existing clustering approaches for microarray data do not adequately differentiate between subsets of co-expressed genes. We devised a novel approach that integrates expression and sequence data in order to generate functionally coherent and biologically meaningful subclusters of genes. Specifically, the approach clusters co-expressed genes on the basis of similar content and distributions of predicted statistically significant sequence motifs in their upstream regions.
Results
We applied our method to several sets of co-expressed genes and were able to define subsets with enrichment in particular biological processes and specific upstream regulatory motifs.
Conclusions
These results show the potential of our technique for functional prediction and regulatory motif identification from microarray data.
9.
Background
With the ever-increasing amount of available data on biological networks, modeling and understanding the structure of these large networks is an important problem with profound biological implications. Cellular functions and biochemical events are coordinately carried out by groups of proteins interacting with each other in biological modules. Identifying such modules in protein interaction networks is very important for understanding the structure and function of these fundamental cellular networks. Therefore, developing an effective computational method to uncover biological modules is both highly challenging and indispensable.
Results
The purpose of this study is to introduce a new quantitative measure, modularity density, into the field of biomolecular networks and to develop new algorithms for detecting functional modules in protein-protein interaction (PPI) networks. Specifically, we adopt simulated annealing (SA) to maximize the modularity density and evaluate its efficiency on simulated networks. In order to address the computational complexity of the SA procedure, we devise a spectral method for optimizing the index and apply it to a yeast PPI network.
Conclusions
Our analysis of the modules detected by the present method suggests that most of these modules are biologically significant in the context of protein complexes. Comparison with MCL and the modularity-based methods shows the efficiency of our method.
10.
N. Cesbron A.-L. Royer Y. Guitton A. Sydor B. Le Bizec G. Dervilly-Pinel 《Metabolomics : Official journal of the Metabolomic Society》2017,13(8):99
Introduction
Collecting feces is easy, and it offers direct access to endogenous and microbial metabolites.
Objectives
Given the lack of consensus about fecal sample preparation, especially in animal species, we developed a robust protocol allowing untargeted LC-HRMS fingerprinting.
Methods
The conditions of extraction (quantity, preparation, solvents, dilutions) were investigated in bovine feces.
Results
A rapid and simple protocol involving feces extraction with methanol (1/3, M/V) followed by centrifugation and a filtration step (10 kDa) was developed.
Conclusion
The workflow generated repeatable and informative fingerprints for robust metabolome characterization.
11.
M. G. L. Henquet M. Roelse R. C. H. de Vos A. Schipper G. Polder N. C. A. de Ruijter R. D. Hall M. A. Jongsma 《Metabolomics : Official journal of the Metabolomic Society》2016,12(7):115
Introduction
Metabolomics has become a valuable tool in many research areas. However, generating metabolomics-based biochemical profiles without any related bioactivity is only of indirect value in understanding a biological process. Therefore, metabolomics research could greatly benefit from tools that directly determine the bioactivity of the detected compounds.
Objective
We aimed to combine LC–MS metabolomics with a cell-based receptor assay. This combination could increase the understanding of biological processes and may provide novel opportunities for functional metabolomics.
Methods
We developed a flow-through biosensor with human cells expressing both TRPV1, a calcium ion channel which responds to capsaicin, and the fluorescent intracellular calcium ion reporter YC3.6. We analysed three contrasting Capsicum varieties. Two were selected with contrasting degrees of spiciness for characterization by HPLC coupled to high mass resolution MS. Subsequently, the biosensor was used to link individual pepper compounds with TRPV1 activity.
Results
Among the compounds in the crude pepper fruit extracts, we confirmed capsaicin and also identified both nordihydrocapsaicin and dihydrocapsaicin as true agonists of the TRPV1 receptor. Furthermore, the biosensor was able to detect receptor activity in extracts of both Capsicum fruits as well as a commercial product. Sensitivity of the biosensor to this commercial product was similar to the sensory threshold of a human sensory panel.
Conclusion
Our results demonstrate that the TRPV1 biosensor is suitable for detecting bioactive metabolites. Novel opportunities may lie in the development of a continuous functional assay, where the biosensor is directly coupled to the LC–MS.
12.
Background
Heme-protein interactions are essential for various biological processes such as electron transfer, catalysis, signal transduction and the control of gene expression. Knowledge of heme binding residues can provide crucial clues for understanding these activities and aid in functional annotation; however, insufficient work has been done on identifying heme binding residues from protein sequence information.
Methods
We propose a sequence-based approach for accurate prediction of heme binding residues by a novel integrative sequence profile coupling position specific scoring matrices with heme specific physicochemical properties. In order to select the informative physicochemical properties, we design an intuitive feature selection scheme by combining a greedy strategy with correlation analysis.
Results
Our integrative sequence profile approach for prediction of heme binding residues outperforms the conventional methods using amino acid and evolutionary information on the 5-fold cross validation and the independent tests.
Conclusions
The novel feature of an integrative sequence profile achieves good performance using a reduced set of feature vector elements.
13.
Background
Essential proteins are indispensable to the survival and development process of living organisms. To understand the functional mechanisms of essential proteins, which can be applied to the analysis of disease and design of drugs, it is important to identify essential proteins from a set of proteins first. As traditional experimental methods designed to test for essential proteins are usually expensive and laborious, computational methods, which utilize biological and topological features of proteins, have attracted more attention in recent years. Protein-protein interaction networks, together with other biological data, have been explored to improve the performance of essential protein prediction.
Results
The proposed method SCP is evaluated on Saccharomyces cerevisiae datasets and compared with five other methods. The results show that our method SCP outperforms the other five methods in terms of accuracy of essential protein prediction.
Conclusions
In this paper, we propose a novel algorithm named SCP, which combines the ranking by a modified PageRank algorithm based on subcellular compartment information with the ranking by Pearson correlation coefficient (PCC) calculated from gene expression data. Experiments show that subcellular localization information is promising in boosting essential protein prediction.
14.
Li Li Chang-Sheng Wu Guan-Mei Hou Ming-Zhe Dong Zhen-Bo Wang Yi Hou Heide Schatten Gui-Rong Zhang Qing-Yuan Sun 《Reproductive biology and endocrinology : RB&E》2018,16(1):110
Background
Diabetes induces many complications including reduced fertility and low oocyte quality, but whether it causes increased mtDNA mutations is unknown.
Methods
We generated a T2D mouse model by using high-fat-diet (HFD) and Streptozotocin (STZ) injection. We examined mtDNA mutations in oocytes of diabetic mice by high-throughput sequencing techniques.
Results
T2D mice showed glucose intolerance, insulin resistance and low fecundity compared to the control group. T2D oocytes showed increased mtDNA mutation sites and mutation numbers compared to the control counterparts. mtDNA mutation examination in F1 mice showed that the mitochondrial bottleneck could eliminate mtDNA mutations.
Conclusions
T2D mice have increased mtDNA mutation sites and mtDNA mutation numbers in oocytes compared to their control counterparts, while these adverse effects can be eliminated by the bottleneck effect in their offspring. This is the first study using a small number of oocytes to examine mtDNA mutations in diabetic mothers and offspring.
15.
Background
With the improvements in biosensors and high-throughput image acquisition technologies, life science laboratories are able to perform an increasing number of experiments that involve the generation of a large number of images at different imaging modalities/scales. This stresses the need for computer vision methods that automate image classification tasks.
Results
We illustrate the potential of our image classification method in cell biology by evaluating it on four datasets of images related to protein distributions or subcellular localizations, and red-blood cell shapes. Accuracy results are quite good without any specific pre-processing or incorporation of domain knowledge. The method is implemented in Java and available upon request for evaluation and research purposes.
Conclusion
Our method is directly applicable to any image classification problem. We foresee the use of this automatic approach as a baseline method and first try on various biological image classification problems.
16.
Rachel A. Spicer Christoph Steinbeck 《Metabolomics : Official journal of the Metabolomic Society》2018,14(1):16
Introduction
Data sharing is being increasingly required by journals and has been heralded as a solution to the ‘replication crisis’.
Objectives
(i) Review data sharing policies of journals publishing the most metabolomics papers associated with open data and (ii) compare these journals’ policies to those that publish the most metabolomics papers.
Methods
A PubMed search was used to identify metabolomics papers. Metabolomics data repositories were manually searched for linked publications.
Results
Journals that support data sharing are not necessarily those with the most papers associated with open metabolomics data.
Conclusion
Further efforts are required to improve data sharing in metabolomics.
17.
Background
With the development of high-throughput genotyping and sequencing technology, there is growing evidence of association between genetic variants and complex traits. In spite of thousands of genetic variants discovered, such genetic markers have been shown to explain only a very small proportion of the underlying genetic variance of complex traits. Gene-gene interaction (GGI) analysis is expected to unveil a large portion of unexplained heritability of complex traits.
Methods
In this work, we propose IGENT, an Information theory-based GEnome-wide gene-gene iNTeraction method. IGENT is an efficient algorithm for identifying genome-wide gene-gene interactions (GGI) and gene-environment interactions (GEI). For detecting significant GGIs on a genome-wide scale, it is important to reduce the computational burden significantly. Our method uses information gain (IG) and evaluates its significance without resampling.
Results
Through our simulation studies, the power of IGENT is shown to be better than or equivalent to that of BOOST. The proposed method successfully detected GGIs for bipolar disorder in the Wellcome Trust Case Control Consortium (WTCCC) data and for age-related macular degeneration (AMD).
Conclusions
The proposed method is implemented in C++ and available on Windows, Linux and MacOSX.
18.
Background
Whether or not a protein's number of physical interactions with other proteins plays a role in determining its rate of evolution has been a contentious issue. A recent analysis suggested that the observed correlation between number of interactions and evolutionary rate may be due to experimental biases in high-throughput protein interaction data sets.
Discussion
The number of interactions per protein, as measured by some protein interaction data sets, shows no correlation with evolutionary rate. Other data sets, however, do reveal a relationship. Furthermore, even when experimental biases of these data sets are taken into account, a real correlation between number of interactions and evolutionary rate appears to exist.
Summary
A strong and significant correlation between a protein's number of interactions and evolutionary rate is apparent for interaction data from some studies. The extremely low agreement between different protein interaction data sets indicates that interaction data are still of low coverage and/or quality. These limitations may explain why some data sets reveal no correlation with evolutionary rates.
19.
Jia Tu Yandong Yin Meimei Xu Ruohong Wang Zheng-Jiang Zhu 《Metabolomics : Official journal of the Metabolomic Society》2018,14(1):5
Introduction
The absolute quantitation of lipids at the lipidome-wide scale is a challenge but plays an important role in the comprehensive study of lipid metabolism.
Objectives
We aim to develop a high-throughput quantitative lipidomics approach to enable the simultaneous identification and absolute quantification of hundreds of lipids in a single experiment. Then, we will systematically characterize lipidome-wide changes in the aging mouse brain and provide a link between aging and disordered lipid homeostasis.
Methods
We created an in-house lipid spectral library, containing 76,361 lipids and 181,300 MS/MS spectra in total, to support accurate lipid identification. Then, we developed a response factor-based approach for the large-scale absolute quantification of lipids.
Results
Using the lipidomics approach, we absolutely quantified 1212 and 864 lipids in human cells and mouse brains, respectively. The quantification accuracy was validated using the traditional approach with a median relative error of 12.6%. We further characterized the lipidome-wide changes in aging mouse brains, and dramatic changes were observed in both glycerophospholipids and sphingolipids. Sphingolipids with longer acyl chains tend to accumulate in aging brains. Membrane-esterified fatty acids demonstrated diverse changes with aging, while most polyunsaturated fatty acids consistently decreased.
Conclusion
We developed a high-throughput quantitative lipidomics approach and systematically characterized the lipidome-wide changes in aging mouse brains. The results proved a link between aging and disordered lipid homeostasis.
20.
Lili Fu Binhui Jiang Jinliang Liu Xin Zhao Qian Liu Xiaomin Hu 《Biotechnology letters》2016,38(3):447-453