Similar literature
1.

Introduction

In systems biology, where a main goal is acquiring knowledge of biological systems, one of the challenges is inferring biochemical interactions from different molecular entities such as metabolites. In this area, the metabolome holds a unique place in reflecting “true exposure” because it is sensitive to variation coming from genetics, time, and environmental stimuli. Although the metabolome is influenced by many different reactions, research interest often needs to focus on variation coming from a specific source, i.e. a certain covariable \(\mathbf {X}_m\).

Objective

Here, we use network analysis methods to recover a set of metabolite relationships, by finding metabolites sharing a similar relation to \(\mathbf {X}_m\). Metabolite values are based on information coming from individuals’ \(\mathbf {X}_m\) status which might interact with other covariables.

Methods

As an alternative to using the original metabolite values, the total information is decomposed using a linear regression model, and only the part relevant to \(\mathbf {X}_m\) is used further. For two datasets, two different network estimation methods are considered. The first is weighted gene co-expression network analysis, which is based on correlation coefficients. The second is the graphical LASSO, which is based on partial correlations.
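The decomposition described above can be illustrated with a minimal sketch. It assumes a model with one covariable of interest `xm`, one other covariable `z`, and their interaction; the abstract does not give the exact model, so the design matrix, the function names (`solve`, `ols`, `xm_part`, `pearson`) and the choice of "main effect plus interaction" as the part relevant to \(\mathbf {X}_m\) are all illustrative assumptions. The extracted parts of two metabolites are then compared by Pearson correlation, the quantity on which correlation-based networks are built.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(design, y):
    """Least-squares coefficients via the normal equations."""
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)


def xm_part(y, xm, z):
    """Fitted contribution of xm: its main effect plus its interaction with z.

    Assumed model (hypothetical): y = b0 + b1*xm + b2*z + b3*xm*z + error.
    """
    design = [[1.0, a, b, a * b] for a, b in zip(xm, z)]
    _, b_xm, _, b_int = ols(design, y)
    return [b_xm * a + b_int * a * b for a, b in zip(xm, z)]


def pearson(u, v):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return cov / (su * sv)
```

In this sketch the pairwise correlations of the `xm_part` vectors, rather than of the raw metabolite values, would be passed on to the network estimation step.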

Results

We observed that when using the parts related to the specific covariable of interest, resulting estimated networks display higher interconnectedness. Additionally, several groups of biologically associated metabolites (very large density lipoproteins, lipoproteins, etc.) were identified in the human data example.

Conclusions

This work demonstrates how information on the study design can be incorporated to estimate metabolite networks. As a result, sets of interconnected metabolites can be clustered together with respect to their relation to a covariable of interest.

2.

Background

In this work, we present a new coarse-grained representation of RNA dynamics. It is based on adjacency matrices and their interaction patterns obtained from molecular dynamics simulations. RNA molecules are well suited for this representation because their composition is mainly modular and assessable by the secondary structure alone. These interactions can be represented as adjacency matrices of k nucleotides. Based on those, we define transitions between states as changes in the adjacency matrices, which form Markovian dynamics. The intense computational demand of deriving the transition probability matrices prompted us to develop StreAM-\(T_g\), a stream-based algorithm for generating such Markov models of k-vertex adjacency matrices representing the RNA.
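The idea of treating adjacency matrices as Markov states can be sketched as follows. This is a naive batch computation for a single, fixed set of k vertices, not the stream-based StreAM-\(T_g\) algorithm itself; the frame format (one set of contact pairs per simulation frame) and the names `adjacency_id` and `transition_matrix` are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def adjacency_id(edges, vertices):
    """Encode the adjacency matrix induced by `vertices` as an integer.

    Each unordered vertex pair gets one bit; the bit is set when the pair
    is in contact in this frame.
    """
    code = 0
    for bit, (u, v) in enumerate(combinations(vertices, 2)):
        if (u, v) in edges or (v, u) in edges:
            code |= 1 << bit
    return code


def transition_matrix(frames, vertices):
    """Estimate transition probabilities between adjacency states.

    `frames` is a time-ordered list of edge sets. The result maps each
    observed state to a dict of successor states and their probabilities.
    """
    states = [adjacency_id(edges, vertices) for edges in frames]
    counts = defaultdict(Counter)
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    return {
        state: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for state, row in counts.items()
    }
```

For k vertices there are up to 2^(k(k-1)/2) possible states, which hints at why computing such matrices for all vertex subsets is computationally demanding.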

Results

We benchmark StreAM-\(T_g\) (a) on random and RNA unit sphere dynamic graphs and (b) for the robustness of our method against different parameters. Moreover, we address a riboswitch design problem by applying StreAM-\(T_g\) to six long-term molecular dynamics simulations of a synthetic tetracycline-dependent riboswitch (500 ns) in combination with five different antibiotics.

Conclusions

The proposed algorithm performs well on large simulated as well as real world dynamic graphs. Additionally, StreAM-\(T_g\) provides insights into nucleotide based RNA dynamics in comparison to conventional metrics like the root-mean square fluctuation. In the light of experimental data our results show important design opportunities for the riboswitch.

3.

Background

The basic RNA secondary structure prediction problem, or single sequence folding (SSF) problem, was solved 35 years ago by a now well-known \(O(n^3)\)-time dynamic programming method. Recently, three methodologies (Valiant, Four-Russians, and Sparsification) have been applied to speed up RNA secondary structure prediction. The Sparsification method exploits two properties of the input: the number of subsequences Z with endpoints belonging to the optimal folding set, and the maximum number of base pairs L. These sparsity properties satisfy \(0 \le L \le n / 2\) and \(n \le Z \le n^2 / 2\), and the method reduces the algorithmic running time to O(LZ). The Four-Russians method, in contrast, tabulates partial results.
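For orientation, the baseline \(O(n^3)\)-time dynamic program can be sketched for the base-pair maximization variant (a Nussinov-style recurrence). This is only the classical baseline, not the Four-Russians or Sparsification speedups; the allowed pair set (Watson-Crick plus G-U wobble) and the `min_loop` parameter are assumptions chosen for the example.

```python
def max_base_pairs(seq, min_loop=0):
    """Maximum number of nested base pairs in `seq` (cubic-time DP).

    dp[i][j] holds the best score for the subsequence seq[i..j].
    A pair (k, j) is allowed only if at least `min_loop` bases lie between.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case: position j stays unpaired
            for k in range(i, j - min_loop):  # case: j pairs with some k
                if (seq[k], seq[j]) in pairs:
                    left = dp[i][k - 1] if k > i else 0
                    inner = dp[k + 1][j - 1] if k + 1 <= j - 1 else 0
                    best = max(best, left + inner + 1)
            dp[i][j] = best
    return dp[0][n - 1]
```

The three nested loops over `span`, `i` and `k` are the source of the cubic running time that the speedup methodologies target.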

Results

In this paper, we explore three different algorithmic speedups. First, we expand and reformulate the single sequence folding Four-Russians \(\Theta \left(\frac{n^3}{\log ^2 n}\right)\)-time algorithm to utilize an on-demand lookup table. Second, we create a framework that combines the fastest Sparsification and new fastest on-demand Four-Russians methods. This combined method has a worst-case running time of \(O(\tilde{L}\tilde{Z})\), where \(\frac{{L}}{\log n} \le \tilde{L}\le \min \left({L},\frac{n}{\log n}\right)\) and \(\frac{{Z}}{\log n}\le \tilde{Z} \le \min \left({Z},\frac{n^2}{\log n}\right)\). Third, we update the Four-Russians formulation to achieve an on-demand \(O( n^2/ \log ^2n )\)-time parallel algorithm. This then leads to an asymptotic speedup of \(O(\tilde{L}\tilde{Z_j})\), where \(\frac{{Z_j}}{\log n}\le \tilde{Z_j} \le \min \left({Z_j},\frac{n}{\log n}\right)\) and \(Z_j\) is the number of subsequences with the endpoint j belonging to the optimal folding set.

Conclusions

The on-demand formulation not only removes all extraneous computation and allows us to incorporate more realistic scoring schemes, but also lets us take advantage of the sparsity properties. Through asymptotic analysis and empirical testing on the base-pair maximization variant and a more biologically informative scoring scheme, we show that this Sparse Four-Russians framework achieves a speedup on every problem instance that is asymptotically never worse, and empirically better, than that achieved by the minimum of the two methods alone.

4.

Introduction

The Elongator complex, comprising six subunits (Elp1p-Elp6p), is required for formation of 5-carbamoylmethyl (ncm5) and 5-methoxycarbonylmethyl (mcm5) side chains on wobble uridines in 11 out of 42 tRNA species in Saccharomyces cerevisiae. Loss of these side chains reduces the efficiency of tRNA decoding during translation, resulting in pleiotropic phenotypes. Overexpression of hypomodified \( {\text {tRNA}_{{\rm s^{2} {\rm UUU}}}^{{\rm Lys}} , {\rm tRNA}_{{\rm s^{2} {\rm UUG}}}^{{\rm Gln }} \;{\rm and}\;{\rm tRNA}_{{\rm s^{2} {\rm UUC}}}^{{\rm Glu}}} \), which in wild-type strains are modified with mcm5s2U, partially suppresses phenotypes of an elp3Δ strain.

Objectives

To identify metabolic alterations in an elp3Δ strain and elucidate whether these metabolic alterations are suppressed by overexpression of hypomodified \( {\text {tRNA}_{{\rm s^{2} {\rm UUU}}}^{{\rm Lys}} , {\rm tRNA}_{{\rm s^{2} {\rm UUG}}}^{{\rm Gln }} \;{\rm and}\;{\rm tRNA}_{{\rm s^{2} {\rm UUC}}}^{{\rm Glu}}} \).

Method

Metabolic profiles were obtained using untargeted GC-TOF-MS of a temperature-sensitive elp3Δ strain carrying either an empty low-copy vector, an empty high-copy vector, a low-copy vector harboring the wild-type ELP3 gene, or a high-copy vector overexpressing \( {\text {tRNA}_{{\rm s^{2} {\rm UUU}}}^{{\rm Lys}} , {\rm tRNA}_{{\rm s^{2} {\rm UUG}}}^{{\rm Gln }} \;{\rm and}\;{\rm tRNA}_{{\rm s^{2} {\rm UUC}}}^{{\rm Glu}}} \). The temperature sensitive elp3Δ strain derivatives were cultivated at permissive (30 °C) or semi-permissive (34 °C) growth conditions.

Results

Culturing an elp3Δ strain at 30 or 34 °C resulted in altered metabolism of 36 and 46 %, respectively, of all metabolites detected when compared to an elp3Δ strain carrying the wild-type ELP3 gene. Overexpression of hypomodified \( {\text {tRNA}_{{\rm s^{2} {\rm UUU}}}^{{\rm Lys}} , {\rm tRNA}_{{\rm s^{2} {\rm UUG}}}^{{\rm Gln }} \;{\rm and}\;{\rm tRNA}_{{\rm s^{2} {\rm UUC}}}^{{\rm Glu}}} \) suppressed a subset of the metabolic alterations observed in the elp3Δ strain.

Conclusion

Our results suggest that the presence of ncm5 and mcm5 side chains on wobble uridines in tRNA is important for metabolic homeostasis.

5.

Background

Cancer is an evolutionary process characterized by the accumulation of somatic mutations in a population of cells that form a tumor. One frequent type of mutation is the copy number aberration, which alters the number of copies of genomic regions. The number of copies of each position along a chromosome constitutes the chromosome’s copy-number profile. Understanding how such profiles evolve in cancer can assist in both diagnosis and prognosis.

Results

We model the evolution of a tumor by segmental deletions and amplifications, and gauge distance from profile \(\mathbf {a}\) to \(\mathbf {b}\) by the minimum number of events needed to transform \(\mathbf {a}\) into \(\mathbf {b}\). Given two profiles, our first problem aims to find a parental profile that minimizes the sum of distances to its children. Given k profiles, the second, more general problem, seeks a phylogenetic tree, whose k leaves are labeled by the k given profiles and whose internal vertices are labeled by ancestral profiles such that the sum of edge distances is minimum.
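The distance defined above can be made concrete with a brute-force sketch for very short profiles. The abstract does not spell out the event semantics, so the code assumes that a segmental deletion (amplification) decreases (increases) by one every non-zero entry in a contiguous interval and that a zero entry can never be regained. The breadth-first search, the copy-number `cap` that keeps the search finite, and the names `apply_event` and `cn_distance` are illustrative choices, not the pseudo-polynomial algorithm or the integer linear program of the paper.

```python
from collections import deque


def apply_event(profile, start, end, delta):
    """Apply one segmental event to positions start..end (inclusive).

    Assumption: only non-zero entries change; lost segments stay at zero.
    """
    return tuple(
        max(c + delta, 0) if start <= i <= end and c > 0 else c
        for i, c in enumerate(profile)
    )


def cn_distance(a, b, cap=None):
    """Minimum number of events turning profile a into b, or None.

    Exhaustive breadth-first search; only practical for tiny profiles.
    `cap` bounds the copy number explored (an assumption of this sketch).
    """
    a, b = tuple(a), tuple(b)
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    if cap is None:
        cap = max(a + b, default=0) + 1
    n = len(a)
    seen = {a}
    queue = deque([(a, 0)])
    while queue:
        profile, dist = queue.popleft()
        if profile == b:
            return dist
        for start in range(n):
            for end in range(start, n):
                for delta in (-1, 1):
                    nxt = apply_event(profile, start, end, delta)
                    if nxt not in seen and max(nxt) <= cap:
                        seen.add(nxt)
                        queue.append((nxt, dist + 1))
    return None  # b is unreachable from a under the assumed model
```

Under these assumptions the distance is not symmetric, which matches the wording "from profile \(\mathbf {a}\) to \(\mathbf {b}\)": a profile with a zero entry cannot reach one where that position is positive.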

Conclusions

For the former problem we give a pseudo-polynomial dynamic programming algorithm that is linear in the profile length, and an integer linear program formulation. For the latter problem we show it is NP-hard and give an integer linear program formulation that scales to practical problem instance sizes. We assess the efficiency and quality of our algorithms on simulated instances.

6.

Background

Isometric gene tree reconciliation is a gene tree/species tree reconciliation problem where both the gene tree and the species tree include branch lengths, and these branch lengths must be respected by the reconciliation. The problem was introduced by Ma et al. in 2008 in the context of reconstructing evolutionary histories of genomes in the infinite sites model.

Results

In this paper, we show that the original algorithm by Ma et al. is incorrect, and we propose a modified algorithm that addresses the problems that we discovered. We have also improved the running time from \(O(N^2)\) to \(O(N\log N)\), where N is the total number of nodes in the two input trees. Finally, we examine two new variants of the problem: reconciliation of two unrooted trees and scaling of branch lengths of the gene tree during reconciliation of two rooted trees.

Conclusions

We provide several new algorithms for isometric reconciliation of trees. Some questions in this area remain open, most importantly extensions of the problem that allow for imprecise estimates of branch lengths.

7.
Zeng Chao, Hamada Michiaki. BMC Genomics, 2018, 19(10): 906-49

Background

With the increasing number of annotated long noncoding RNAs (lncRNAs) from the genome, researchers are continually updating their understanding of lncRNAs. Recently, thousands of lncRNAs have been reported to be associated with ribosomes in mammals. However, their biological functions or mechanisms are still unclear.

Results

In this study, we tried to investigate the sequence features involved in the ribosomal association of lncRNA. We have extracted ninety-nine sequence features corresponding to different biological mechanisms (i.e., RNA splicing, putative ORF, k-mer frequency, RNA modification, RNA secondary structure, and repeat element). An \(\mathcal {L}1\)-regularized logistic regression model was applied to screen these features. Finally, we obtained fifteen and nine important features for the ribosomal association of human and mouse lncRNAs, respectively.

Conclusion

To our knowledge, this is the first study to characterize ribosome-associated lncRNAs and ribosome-free lncRNAs from the perspective of sequence features. These sequence features that were identified in this study may shed light on the biological mechanism of the ribosomal association and provide important clues for functional analysis of lncRNAs.

8.

Introduction

Concerning NMR-based metabolomics, 1D spectra processing often requires an expert eye for disentangling the intertwined peaks.

Objectives

The objective of NMRProcFlow is to assist the expert in this task as effectively as possible without requiring programming skills.

Methods

NMRProcFlow was developed to be a graphical and interactive 1D NMR (1H & 13C) spectra processing tool.

Results

NMRProcFlow (http://nmrprocflow.org), dedicated to metabolic fingerprinting and targeted metabolomics, covers all spectra processing steps including baseline correction, chemical shift calibration and alignment.

Conclusion

Biologists and NMR spectroscopists can easily interact and develop synergies by visualizing the NMR spectra along with their corresponding experimental-factor levels, thus setting a bridge between experimental design and subsequent statistical analyses.

9.
10.

Background

Patterns with wildcards in specified positions, namely spaced seeds, are increasingly used instead of k-mers in many bioinformatics applications that require indexing, querying and rapid similarity search, as they can provide better sensitivity. Many of these applications require computing the hash of each position in the input sequences with respect to a given spaced seed, or to multiple spaced seeds. While the hashes of k-mers can be computed rapidly by exploiting the large overlap between consecutive k-mers, spaced seed hashes are usually computed from scratch for each position in the input sequence, resulting in slower processing.
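The contrast drawn above can be sketched in a few lines. The code shows a rolling k-mer hash that reuses the previous value, and the from-scratch spaced seed hash that serves as the traditional baseline; it is not the FSH method. The 2-bit nucleotide encoding, the seed notation (`"1"` for a position that counts, `"0"` for a wildcard) and the function names are assumptions made for the example.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed 2-bit encoding


def kmer_hashes(seq, k):
    """Hash every k-mer by rolling: shift in one base, mask out the oldest."""
    mask = (1 << (2 * k)) - 1
    h = 0
    out = []
    for i, base in enumerate(seq):
        h = ((h << 2) | CODE[base]) & mask
        if i >= k - 1:
            out.append(h)
    return out


def spaced_seed_hashes(seq, seed):
    """Hash every position with respect to `seed`, recomputing from scratch.

    Only positions marked '1' in the seed contribute to the hash.
    """
    care = [i for i, c in enumerate(seed) if c == "1"]
    out = []
    for start in range(len(seq) - len(seed) + 1):
        h = 0
        for offset in care:
            h = (h << 2) | CODE[seq[start + offset]]
        out.append(h)
    return out
```

With a seed consisting only of `"1"` characters the two functions agree, but the rolling version does constant work per position while the from-scratch version does work proportional to the seed weight.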

Results

The method proposed in this paper, fast spaced-seed hashing (FSH), exploits the similarity of the hash values of spaced seeds computed at adjacent positions in the input sequence. In our experiments we compute the hash for each position of metagenomics reads from several datasets, with respect to different spaced seeds. We also propose a generalized version of the algorithm for the simultaneous computation of multiple spaced seed hashes. In the experiments, our algorithm computes the hash values of spaced seeds with a speedup, with respect to the traditional approach, of between 1.6\(\times\) and 5.3\(\times\), depending on the structure of the spaced seed.

Conclusions

Spaced seed hashing is a routine task for several bioinformatics applications. FSH performs this task efficiently and raises the question of whether other hashing schemes can be exploited to further improve the speedup. This has the potential for major impact in the field, making spaced seed applications not only accurate, but also faster and more efficient.

Availability

The software FSH is freely available for academic use at: https://bitbucket.org/samu661/fsh/overview.

11.

Background

Suffix arrays, augmented by additional data structures, allow many string processing problems to be solved efficiently. The external memory construction of the generalized suffix array for a string collection is a fundamental task when the size of the input collection or the data structure exceeds the available internal memory.

Results

In this article we present and analyze \(\mathsf {eGSA}\) [introduced in CPM (External memory generalized suffix and \(\mathsf {LCP}\) arrays construction. In: Proceedings of CPM. pp 201–10, 2013)], the first external memory algorithm to construct generalized suffix arrays augmented with the longest common prefix array for a string collection. Our algorithm relies on a combination of buffers, induced sorting and a heap to avoid direct string comparisons. We performed experiments that covered different aspects of our algorithm, including running time, efficiency, external memory access, internal phases and the influence of different optimization strategies. On real datasets of size up to 24 GB and using 2 GB of internal memory, \(\mathsf {eGSA}\) showed a competitive performance when compared to \(\mathsf {eSAIS}\) and \(\mathsf {SAscan}\), which are efficient algorithms for a single string according to the related literature. We also show the effect of disk caching managed by the operating system on our algorithm.
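To make the two data structures concrete, the following sketch builds a generalized suffix array and its longest common prefix (LCP) array for a small collection in internal memory. It is a naive construction by direct string comparison, which is exactly what \(\mathsf {eGSA}\) is designed to avoid, and it is only meant to show what the output looks like. The conventions used here (entries are `(string index, offset)` pairs, ties between equal suffixes are broken by string index, empty suffixes are omitted) are assumptions of the sketch.

```python
def generalized_suffix_array(strings):
    """Sort all non-empty suffixes of all strings lexicographically.

    Each entry is (string index, offset). Equal suffixes from different
    strings are ordered by string index, as if every string ended in its
    own terminator symbol that sorts before all other characters.
    """
    suffixes = [(i, j) for i, s in enumerate(strings) for j in range(len(s))]
    suffixes.sort(key=lambda p: (strings[p[0]][p[1]:], p[0]))
    return suffixes


def lcp_array(strings, gsa):
    """LCP of each suffix with its predecessor in `gsa` (first entry is 0)."""
    if not gsa:
        return []

    def common(a, b):
        n = 0
        while n < len(a) and n < len(b) and a[n] == b[n]:
            n += 1
        return n

    lcp = [0]
    for (i1, j1), (i2, j2) in zip(gsa, gsa[1:]):
        lcp.append(common(strings[i1][j1:], strings[i2][j2:]))
    return lcp
```

Materializing and comparing every suffix takes quadratic space and time in the worst case, which illustrates why buffers, induced sorting and a heap are needed once the collection no longer fits in internal memory.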

Conclusions

The proposed algorithm was validated through performance tests using real datasets from different domains, in various combinations, and showed competitive performance. Our algorithm can also construct the generalized Burrows-Wheeler transform of a string collection at no additional cost beyond the output time.

12.

Main conclusion

Starch granule size distributions in plant tissues, when determined in high resolution and specified properly as a frequency function, could provide useful information on granule formation and growth.

Abstract

To better understand genetic control of physical properties of starch granules, we attempted a new approach to analyze developmental and genotypic effects on morphology and size distributions of starch granules in sweetpotato storage roots. Starch granules in sweetpotatoes exhibited low sphericity, many shapes that appeared to be independent of genotypes or developmental stages, and non-randomly distributed sizes. Granule size distributions of sweetpotato starches were determined in high resolution as differential volume-percentage distributions of volume-equivalent spherical diameters, rigorously curve-fitted to be lognormal, and specified using their geometric means \(\bar{x}^{*}\) and multiplicative standard deviations \(s^{*}\) in a \(\bar{x}^{*} \times /({\text{multiply/divide}})s^{*}\) form. The scale (\(\bar{x}^{*}\)) and shape (\(s^{*}\)) of these distributions were independently variable, ranging from 14.02 to 19.36 μm and 1.403 to 1.567, respectively, among 22 cultivars/clones. The shape (\(s^{*}\)) of granule lognormal volume-size distributions of sweetpotato starch was found to be highly significantly and inversely correlated with apparent amylose content. More importantly, granule lognormal volume-size distributions of starches in developing sweetpotatoes displayed the same self-preserving kinetics, i.e., preserving the shape but shifting the scale upward, as those of particles undergoing agglomeration, which strongly indicated involvement of agglomeration in the formation and growth of starch granules. Furthermore, QTL analysis of a segregating null allele at one of three homoeologous starch synthase II loci in a reciprocal-cross population, which was identified through profiling starch granule-bound proteins in sweetpotatoes of diverse genotypes, showed that the locus is a QTL modulating the scale of granule volume-size distributions of starch in sweetpotatoes.
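The \(\bar{x}^{*} \times / s^{*}\) notation can be illustrated with a small sketch: the geometric mean is the exponential of the mean of the log-values, and the multiplicative standard deviation is the exponential of their standard deviation. The sketch works on an unweighted sample of diameters and uses the sample standard deviation, whereas the study fitted volume-percentage distributions, so it shows the notation rather than the authors' fitting procedure; the function name and the example values are hypothetical.

```python
import math
import statistics


def lognormal_summary(values):
    """Return (geometric mean, multiplicative standard deviation).

    Both are computed on the log scale and transformed back. For lognormal
    data the interval [gm / msd, gm * msd] covers roughly 68% of values.
    """
    logs = [math.log(v) for v in values]
    gm = math.exp(statistics.mean(logs))
    msd = math.exp(statistics.stdev(logs))  # sample standard deviation
    return gm, msd


# Hypothetical diameters in micrometres, chosen so the result is easy to check.
gm, msd = lognormal_summary([10.0, 20.0, 40.0])
interval = (gm / msd, gm * msd)
```

Because \(s^{*}\) is a dimensionless ratio, it describes the shape of the distribution independently of its scale, which is why the two parameters can vary independently among cultivars.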

13.

Background

Centrifugation is an indispensable procedure for plasma sample preparation, but applied conditions can vary between labs.

Aim

Determine whether routinely used plasma centrifugation protocols (1500×g 10 min; 3000×g 5 min) influence non-targeted metabolomic analyses.

Methods

Nuclear magnetic resonance spectroscopy (NMR) and High Resolution Mass Spectrometry (HRMS) data were evaluated with sparse partial least squares discriminant analyses and compared with cell count measurements.

Results

Besides significant differences in platelet count, we identified substantial alterations in NMR and HRMS data related to the different centrifugation protocols.

Conclusion

Even minor differences in plasma centrifugation can significantly influence metabolomic patterns and potentially bias metabolomics studies.

14.

Background

The mutual exclusivity of somatic genome alterations (SGAs), such as somatic mutations and copy number alterations, is an important observation in tumors and is widely used to search for cancer signaling pathways or SGAs related to tumor development. However, one problem with current methods that use mutual exclusivity is that they are not signal-based; another is that they use heuristic algorithms to handle the NP-hard problems, which cannot guarantee finding the optimal solutions of their models.

Method

In this study, we propose a novel signal-based method that utilizes the intrinsic relationship between SGAs on signaling pathways and expression changes of downstream genes regulated by those pathways to identify cancer signaling pathways using the mutual exclusivity property. We also present a relatively efficient exact algorithm that is guaranteed to obtain the optimal solution of the new computational model.

Results

We have applied our new model and exact algorithm to breast cancer data. The results reveal that our new approach increases the capability of finding better solutions in cancer research applications. Our new exact algorithm has a time complexity of \(O^{*}(1.325^{m})\) (following recent convention, the star indicates that the polynomial part of the time complexity is neglected), and it solves the NP-hard problem of our model efficiently.

Conclusion

Our new method and algorithm can discover the true causes behind the phenotypes, such as which SGA events lead to abnormality of the cell cycle or cause cell metastasis to go out of control in tumors; thus, it identifies target candidates for precision (or targeted) therapeutics.

15.

Introduction

Despite the use of buffering agents the 1H NMR spectra of biofluid samples in metabolic profiling investigations typically suffer from extensive peak frequency shifting between spectra. These chemical shift changes are mainly due to differences in pH and divalent metal ion concentrations between the samples. This frequency shifting results in a correspondence problem: it can be hard to register the same peak as belonging to the same molecule across multiple samples. The problem is especially acute for urine, which can have a wide range of ionic concentrations between different samples.

Objectives

To investigate the acid, base and metal ion dependent 1H NMR chemical shift variations and limits of the main metabolites in a complex biological mixture.

Methods

Urine samples from five different individuals were collected and pooled, and pre-treated with Chelex-100 ion exchange resin. Urine samples were either treated with HCl or NaOH, or were supplemented with various concentrations of CaCl2, MgCl2, NaCl or KCl, and their 1H NMR spectra were acquired.

Results

Nonlinear fitting was used to derive acid dissociation constants and acid and base chemical shift limits for peaks from 33 identified metabolites. Peak pH titration curves for a further 65 unidentified peaks were also obtained for future reference. Furthermore, the peak variations induced by the main metal ions present in urine, Na+, K+, Ca2+ and Mg2+, were also measured.
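A minimal sketch of this kind of fit is given here for a peak governed by a single ionizable group in fast exchange, where the observed shift is the population-weighted average of the acid and base limits. The single-pKa model, the grid search over pKa combined with a closed-form linear fit of the two limits, and all names and numbers are assumptions of the sketch; the abstract states only that nonlinear fitting was used.

```python
def protonated_fraction(ph, pka):
    """Fraction of the protonated form (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def shift(ph, pka, delta_acid, delta_base):
    """Observed chemical shift as a weighted average of the two limits."""
    f = protonated_fraction(ph, pka)
    return f * delta_acid + (1.0 - f) * delta_base


def fit_titration(phs, shifts, grid_step=0.01):
    """Fit (pKa, acid limit, base limit) to a peak titration curve.

    For a fixed pKa the model is linear in the two limits, so each grid
    point needs only a simple linear regression of shift on the
    protonated fraction; the pKa with the smallest squared error wins.
    """
    n = len(phs)
    mean_d = sum(shifts) / n
    best = None
    for step in range(int(round(14.0 / grid_step)) + 1):
        pka = step * grid_step
        fs = [protonated_fraction(ph, pka) for ph in phs]
        mean_f = sum(fs) / n
        var_f = sum((f - mean_f) ** 2 for f in fs)
        if var_f == 0.0:
            continue  # fraction does not vary; limits are not identifiable
        slope = sum((f - mean_f) * (d - mean_d)
                    for f, d in zip(fs, shifts)) / var_f
        delta_base = mean_d - slope * mean_f
        delta_acid = delta_base + slope
        sse = sum((delta_base + slope * f - d) ** 2
                  for f, d in zip(fs, shifts))
        if best is None or sse < best[0]:
            best = (sse, pka, delta_acid, delta_base)
    return best[1:]
```

Peaks influenced by more than one ionizable group would need additional pKa terms, and the limits are well determined only when the titration covers pH values on both sides of the pKa.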

Conclusion

These data will be a valuable resource for 1H NMR metabolite profiling experiments and for the development of automated metabolite alignment and identification algorithms for 1H NMR spectra.

16.

Background

The gene family-free framework for comparative genomics aims at providing methods for gene order analysis that do not require prior gene family assignment, but work directly on a sequence similarity graph. We study two problems related to the breakpoint median of three genomes, which asks for the construction of a fourth genome that minimizes the sum of breakpoint distances to the input genomes.

Methods

We present a model for constructing a median of three genomes in this family-free setting, based on maximizing an objective function that generalizes the classical breakpoint distance by integrating sequence similarity in the score of a gene adjacency. We study its computational complexity and we describe an integer linear program (ILP) for its exact solution. We further discuss a related problem called family-free adjacencies for k genomes for the special case of \(k \le 3\) and present an ILP for its solution. However, for this problem, the computation of exact solutions remains intractable for sufficiently large instances. We then proceed to describe a heuristic method, FFAdj-AM, which performs well in practice.

Results

The developed methods compute accurate positional orthologs for genomes comparable in size to bacterial genomes, on simulated data and on genomic data acquired from the OMA orthology database. In particular, FFAdj-AM performs equally well or better when compared to the well-established gene family prediction tool MultiMSOAR.

Conclusions

We study the computational complexity of a new family-free model and present algorithms for its solution. With FFAdj-AM, we propose an appealing alternative to established tools for identifying higher confidence positional orthologs.

17.

Introduction

Experiments in metabolomics rely on the identification and quantification of metabolites in complex biological mixtures. This remains one of the major challenges in NMR/mass spectrometry analysis of metabolic profiles. These capabilities are necessary for metabolomics to become a general approach for testing a priori formulated hypotheses on the basis of exhaustive metabolome characterization, rather than an exploratory tool dealing with unknown metabolic features.

Objectives

In this article we propose a method, named ASICS, that is based on a strong statistical theory and automatically handles metabolite identification and quantification in proton NMR spectra.

Methods

A statistical linear model is built to explain a complex spectrum using a library containing pure metabolite spectra. This model can handle local or global chemical shift variations due to experimental conditions using a warping function. A statistical lasso-type estimator identifies and quantifies the metabolites in the complex spectrum. This estimator shows good statistical properties and handles peak overlapping issues.

Results

The performance of the method was investigated on known mixtures (such as synthetic urine) and on plasma datasets from duck and human. The method performed well and outperformed existing methods.

Conclusion

ASICS is a completely automated procedure to identify and quantify metabolites in 1H NMR spectra of biological mixtures. It will empower NMR-based metabolomics by helping experts obtain metabolic profiles quickly and accurately.

18.

Introduction

The differences in fecal metabolome between ankylosing spondylitis (AS)/rheumatoid arthritis (RA) patients and healthy individuals could be the reason for an autoimmune disorder.

Objectives

The study explored the fecal metabolome difference between AS/RA patients and healthy controls to clarify human immune disturbance.

Methods

Fecal samples from 109 individuals (34 healthy controls, 40 AS, and 35 RA) were analyzed by 1H NMR spectroscopy. Data were analyzed with principal component analysis (PCA) and orthogonal projection to latent structures discriminant analysis (OPLS-DA).

Results

Significant differences in the fecal metabolic profiles could distinguish AS/RA patients from healthy controls but could not distinguish between AS and RA patients. The significantly decreased metabolites in AS/RA patients were butyrate, propionate, methionine, and hypoxanthine. Significantly increased metabolites in AS/RA patients were taurine, methanol, fumarate, and tryptophan.

Conclusion

The metabolome variations in feces indicated AS and RA were two homologous diseases that could not be distinguished by 1H NMR metabolomics.

19.

Background

Diabetes induces many complications including reduced fertility and low oocyte quality, but whether it causes increased mtDNA mutations is unknown.

Methods

We generated a T2D mouse model by using high-fat-diet (HFD) and Streptozotocin (STZ) injection. We examined mtDNA mutations in oocytes of diabetic mice by high-throughput sequencing techniques.

Results

T2D mice showed glucose intolerance, insulin resistance, and low fecundity compared to the control group. T2D oocytes showed increased mtDNA mutation sites and mutation numbers compared to their control counterparts. Examination of mtDNA mutations in F1 mice showed that the mitochondrial bottleneck could eliminate mtDNA mutations.

Conclusions

T2D mice have increased mtDNA mutation sites and mtDNA mutation numbers in oocytes compared to their control counterparts, but these adverse effects can be eliminated by the bottleneck effect in their offspring. This is the first study using a small number of oocytes to examine mtDNA mutations in diabetic mothers and offspring.

20.

Introduction

Collecting feces is easy, and it offers direct access to endogenous and microbial metabolites.

Objectives

Given the lack of consensus about fecal sample preparation, especially in animal species, we developed a robust protocol allowing untargeted LC-HRMS fingerprinting.

Methods

The conditions of extraction (quantity, preparation, solvents, dilutions) were investigated in bovine feces.

Results

A rapid and simple protocol involving feces extraction with methanol (1/3, M/V) followed by centrifugation and a filtration step (10 kDa) was developed.

Conclusion

The workflow generated repeatable and informative fingerprints for robust metabolome characterization.