Similar literature
 20 similar articles found
1.
Lyu  Chuqiao  Wang  Lei  Zhang  Juhua 《BMC genomics》2018,19(10):905-165

Background

DNase I hypersensitive sites (DHSs) are associated with cis-regulatory DNA elements. An efficient method of identifying DHSs can enhance the understanding of chromatin accessibility. Despite a multitude of resources available online, including experimental datasets and computational tools, the complex language of DHSs remains incompletely understood.

Methods

Here, we address this challenge using an approach based on a state-of-the-art machine learning method. We present a novel convolutional neural network (CNN) that combines Inception-like networks with a gating mechanism to respond to multiple patterns and long-term associations in DNA sequences, and we use it to predict multi-scale DHSs in Arabidopsis, rice and Homo sapiens.

Results

Our method obtains 0.961 area under curve (AUC) on Arabidopsis, 0.969 AUC on rice and 0.918 AUC on Homo sapiens.
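As a reading aid for the AUC figures above, the following minimal sketch computes the area under the ROC curve from labels and scores using the pairwise (rank-sum) formulation. The function name `roc_auc` and the toy data are illustrative assumptions, not the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 class labels (1 = positive, e.g. a DHS).
    scores: iterable of predicted scores, higher meaning "more positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy example: 2 positives, 2 negatives.
    print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

The quadratic loop is fine for small illustrations; for large datasets a sort-based rank computation gives the same value more efficiently.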

Conclusions

Our method provides an efficient and accurate way to identify multi-scale DHS sequences by deep learning.

2.

Background

Adverse drug reactions (ADRs) are unintended and harmful reactions caused by normal uses of drugs. Predicting and preventing ADRs in the early stage of the drug development pipeline can help to enhance drug safety and reduce financial costs.

Methods

In this paper, we developed machine learning models, including a deep learning framework, that can simultaneously predict ADRs and identify the molecular substructures associated with those ADRs without defining the substructures a priori.

Results

We evaluated the performance of our model with ten different state-of-the-art fingerprint models and found that neural fingerprints from the deep learning model outperformed all other methods in predicting ADRs. Via feature analysis on drug structures, we identified important molecular substructures that are associated with specific ADRs and assessed their associations via statistical analysis.

Conclusions

The deep learning model with feature analysis, substructure identification, and statistical assessment provides a promising solution for identifying risky components within molecular structures and can potentially help to improve drug safety evaluation.

3.

Background

Hot spot residues are functional sites in protein interaction interfaces. The identification of hot spot residues is time-consuming and laborious using experimental methods. In order to address the issue, many computational methods have been developed to predict hot spot residues. Moreover, most prediction methods are based on structural features, sequence characteristics, and/or other protein features.

Results

This paper proposed an ensemble learning method to predict hot spot residues that only uses sequence features and the relative accessible surface area of amino acid sequences. In this work, a novel feature selection technique was developed, an auto-correlation function combined with a sliding window technique was applied to obtain the characteristics of amino acid residues in protein sequence, and an ensemble classifier with SVM and KNN base classifiers was built to achieve the best classification performance.
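To make the idea of an auto-correlation function applied over a sliding window more concrete, here is a minimal sketch. It assumes each residue has already been mapped to a single numeric property value; the names `auto_covariance` and `window_features`, the window shape and the lag range are all hypothetical choices, not the feature definition used in the paper.

```python
def auto_covariance(values, lag):
    """Mean-centred auto-covariance of a numeric sequence at the given lag."""
    n = len(values)
    if lag < 1 or lag >= n:
        raise ValueError("lag must satisfy 1 <= lag < len(values)")
    mean = sum(values) / n
    total = sum((values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag))
    return total / (n - lag)


def window_features(values, center, half_width, max_lag):
    """Auto-covariance features for the window centred on one residue.

    The window is clipped at the sequence ends (an assumption of this sketch).
    """
    lo = max(0, center - half_width)
    hi = min(len(values), center + half_width + 1)
    window = values[lo:hi]
    return [auto_covariance(window, lag) for lag in range(1, max_lag + 1)]


if __name__ == "__main__":
    # Hypothetical per-residue property values (e.g. a hydrophobicity scale).
    prop = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(window_features(prop, center=2, half_width=1, max_lag=1))
```

Each residue then gets a short fixed-length feature vector that could be passed to base classifiers such as SVM and KNN.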

Conclusion

The experimental results showed that our model yields the highest F1 score of 0.92 and an MCC value of 0.87 on the ASEdb dataset. Compared with other machine learning methods, our model achieves a substantial improvement in hot spot prediction.

4.

Background

The non-receptor tyrosine kinase, SRMS (Src-related kinase lacking C-terminal regulatory tyrosine and N-terminal myristoylation sites) is a member of the BRK family kinases (BFKs) which represents an evolutionarily conserved relative of the Src family kinases (SFKs). Tyrosine kinases are known to regulate a number of cellular processes and pathways via phosphorylating substrate proteins directly and/or by partaking in signaling cross-talks leading to the indirect modulation of various signaling intermediates. In a previous study, we profiled the tyrosine-phosphoproteome of SRMS and identified multiple candidate substrates of the kinase. The broader cellular signaling intermediates of SRMS are unknown.

Methods

In order to uncover the broader SRMS-regulated phosphoproteome and identify the SRMS-regulated indirect signaling intermediates, we performed label-free global phosphoproteomics analysis on cells expressing wild-type SRMS. Using computational database searching and bioinformatics analyses we characterized the dataset.

Results

Our analyses identified 60 hyperphosphorylated (phosphoserine/phosphothreonine) proteins mapped from 140 hyperphosphorylated peptides. Bioinformatics analyses identified a number of significantly enriched biological and cellular processes, among which DNA repair pathways were found to be upregulated while apoptotic pathways were found to be downregulated. Analyses of motifs derived from the upregulated phosphosites identified Casein kinase 2 alpha (CK2α) as one of the major potential kinases contributing to the SRMS-dependent indirect regulation of signaling intermediates.

Conclusions

Overall, our phosphoproteomics analyses identified serine/threonine phosphorylation dynamics as important secondary events of the SRMS-regulated phosphoproteome with implications in the regulation of cellular and biological processes.

5.

Background

Tks5/FISH is a scaffold protein comprising five SH3 domains and one PX domain. Tks5 is a substrate of the tyrosine kinase Src and is required for the organization of podosomes/invadopodia implicated in invasion of tumor cells. Recent data have suggested that a close homologue of Tks5, Tks4, is implicated in EGF signaling.

Results

Here, we report that Tks5 is a component of the EGF signaling pathway. In EGF-treated cells, Tks5 is tyrosine phosphorylated within minutes and the level of phosphorylation is sustained for at least 2 hours. Using specific kinase inhibitors, we demonstrate that tyrosine phosphorylation of Tks5 is catalyzed by Src tyrosine kinase. We show that treatment of cells with EGF results in plasma membrane translocation of Tks5. In addition, treatment of cells with LY294002, an inhibitor of PI 3-kinase, or mutation of the PX domain reduces tyrosine phosphorylation and membrane translocation of Tks5.

Conclusions

Our results identify Tks5 as a novel component of the EGF signaling pathway.

6.

Background

The application of machine learning to classification problems that depend only on positive examples is gaining attention in the computational biology community. We and others have described the use of two-class machine learning to identify novel miRNAs. These methods require the generation of an artificial negative class. However, designation of the negative class can be problematic and if it is not properly done can affect the performance of the classifier dramatically and/or yield a biased estimate of performance. We present a study using one-class machine learning for microRNA (miRNA) discovery and compare one-class to two-class approaches using naïve Bayes and Support Vector Machines. These results are compared to published two-class miRNA prediction approaches. We also examine the ability of the one-class and two-class techniques to identify miRNAs in newly sequenced species.

Results

Of all methods tested, we found that two-class naïve Bayes and Support Vector Machines gave the best accuracy using our selected features and optimally chosen negative examples. One-class methods showed average accuracies of 70–80% versus 90% for the two two-class methods on the same feature sets. However, some one-class methods outperform some recently published two-class approaches with different selected features. Using the EBV genome as an external validation of the method, we found one-class machine learning to work as well as or better than a two-class approach in identifying true miRNAs as well as predicting new miRNAs.

Conclusion

One-class and two-class methods can both give useful classification accuracies when the negative class is well characterized. The advantage of one-class methods is that they eliminate guessing at the optimal features for the negative class when they are not well defined. In these cases one-class methods can be superior to two-class methods when the features chosen as representative of the positive class are well defined.

Availability

The OneClassmiRNA program is available at: [1]

7.

Background

p38 mitogen-activated protein kinase has been implicated in both skeletal muscle atrophy and hypertrophy. T317 phosphorylation of the p38 substrate mitogen-activated protein kinase-activated protein kinase 2 (MK2) correlates with muscle weight in atrophic and hypertrophic denervated muscle and may influence the nuclear and cytoplasmic distribution of p38 and/or MK2. The present study investigates expression and phosphorylation of p38, MK2 and related proteins in cytosolic and nuclear fractions from atrophic and hypertrophic 6-day denervated skeletal muscles compared to innervated controls.

Methods

Expression and phosphorylation of p38, MK2 and Hsp25 (heat shock protein 25 in rodents, 27 in humans; Hsp25/27) and Hsp70 protein expression were studied semi-quantitatively using Western blots with separated nuclear and cytosolic fractions from innervated and denervated hypertrophic hemidiaphragm and atrophic anterior tibial muscles. Unfractionated innervated and denervated atrophic pooled gastrocnemius and soleus muscles were also studied.

Results

No support was obtained for a differential nuclear/cytosolic localization of p38 or MK2 in denervated hypertrophic and atrophic muscle. The differential effect of denervation on T317 phosphorylation of MK2 in denervated hypertrophic and atrophic muscle was not reflected in p38 phosphorylation nor in the phosphorylation of the MK2 substrate Hsp25. Hsp25 phosphorylation increased 3-30-fold in all denervated muscles studied. The expression of Hsp70 increased 3-5-fold only in denervated hypertrophic muscles.

Conclusions

The study confirms a differential response of MK2 T317 phosphorylation in denervated hypertrophic and atrophic muscles and suggests that Hsp70 may be important for this. Increased Hsp25 phosphorylation in all denervated muscles studied indicates a role for factors other than MK2 in the regulation of this phosphorylation.

8.

Objectives

To design a new system for the in vivo phosphorylation of proteins in Escherichia coli using the co-expression of the α-subunit of casein kinase II (CKIIα) and a target protein (Nanofitin) fused with a phosphorylatable tag.

Results

The level of the co-expressed CKIIα was controlled by the arabinose promoter and optimal phosphorylation was obtained with 2 % (w/v) arabinose as inducer. The effectiveness of the phosphorylation system was demonstrated by electrophoretic mobility shift assay (NUT-PAGE) and staining with a specific phosphoprotein-staining gel. The resulting phosphorylated tag was also used to purify the phosphoprotein by immobilized metal affinity chromatography, which relies on the specific interaction of phosphate moieties with Fe(III).

Conclusion

The use of a single tag for both the purification and protein array anchoring provides a simple and straightforward system for protein analysis.

9.

Introduction

Natural products from culture collections have enormous impact in advancing discovery programs for metabolites of biotechnological importance. These discovery efforts rely on the metabolomic characterization of strain collections.

Objective

Many emerging approaches compare metabolomic profiles of such collections, but few enable the analysis and prioritization of thousands of samples from diverse organisms while delivering chemistry specific read outs.

Method

In this work we utilize untargeted LC–MS/MS based metabolomics together with molecular networking to inventory the chemistries associated with 1000 marine microorganisms.

Result

This approach annotated 76 molecular families (a spectral match rate of 28 %), including clinically and biotechnologically important molecules such as valinomycin, actinomycin D, and desferrioxamine E. Targeting a molecular family produced primarily by one microorganism led to the isolation and structure elucidation of two new molecules designated maridric acids A and B.

Conclusion

Molecular networking-guided exploration of large culture collections allows for rapid dereplication of known molecules and can highlight producers of unique metabolites. These methods, together with large culture collections and growing databases, allow for data-driven strain prioritization with a focus on novel chemistries.

10.

Introduction

Direct injection Fourier-transform mass spectrometry (FT-MS) allows for the high-throughput and high-resolution detection of thousands of metabolite-associated isotopologues. However, spectral artifacts can generate large numbers of spectral features (peaks) that do not correspond to known compounds. Misassignment of these artifactual features creates interpretive errors and limits our ability to discern the role of representative features within living systems.

Objectives

Our goal is to develop rigorous methods that identify and handle spectral artifacts within the context of high-throughput FT-MS-based metabolomics studies.

Results

We observed three types of artifacts unique to FT-MS that we named high peak density (HPD) sites: fuzzy sites, ringing and partial ringing. While ringing artifacts are well-known, fuzzy sites and partial ringing have not been previously well-characterized in the literature. We developed new computational methods based on comparisons of peak density within a spectrum to identify regions of spectra with fuzzy sites. We used these methods to identify and eliminate fuzzy site artifacts in an example dataset of paired cancer and non-cancer lung tissue samples and evaluated the impact of these artifacts on classification accuracy and robustness.
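The peak-density comparison described above can be illustrated with a small sketch: count peaks in fixed-width m/z windows and flag windows whose count is far above the typical non-empty window. The function names, the window width, the step and the `factor` threshold are assumptions made for illustration only; the abstract does not specify the authors' actual criteria.

```python
from bisect import bisect_left
from statistics import median


def peak_density(mzs, width, step):
    """Return (window_start, peak_count) pairs for windows [start, start + width)."""
    mzs = sorted(mzs)
    windows = []
    i = 0
    while True:
        start = mzs[0] + i * step  # computed from i to avoid accumulating float error
        if start > mzs[-1]:
            break
        count = bisect_left(mzs, start + width) - bisect_left(mzs, start)
        windows.append((start, count))
        i += 1
    return windows


def high_density_windows(mzs, width, step, factor):
    """Window starts whose peak count exceeds factor * median non-empty count."""
    windows = peak_density(mzs, width, step)
    nonzero = [c for _, c in windows if c > 0]
    cutoff = factor * median(nonzero)
    return [start for start, count in windows if count > cutoff]


if __name__ == "__main__":
    # Hypothetical peak list with a dense cluster between m/z 103 and 104.
    peaks = [100.0, 101.0, 102.0, 103.0, 103.1, 103.2, 103.3, 103.4, 104.0, 105.0]
    print(high_density_windows(peaks, width=1.0, step=1.0, factor=3))
```

Flagged windows would then be candidates for inspection or removal before classification.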

Conclusion

Our methods robustly identified consistent fuzzy site artifacts in our FT-MS metabolomics spectral data. Without artifact identification and removal, 91.4% classification accuracy was achieved on an example lung cancer dataset; however, these classifiers rely heavily on artifactual features present in fuzzy sites. Proper removal of fuzzy site artifacts produces a more robust classifier based on non-artifactual features, with slightly improved accuracy of 92.4% in our example analysis.

11.

Introduction

Collecting feces is easy. It offers direct access to endogenous and microbial metabolites.

Objectives

In a context of lack of consensus about fecal sample preparation, especially in animal species, we developed a robust protocol allowing untargeted LC-HRMS fingerprinting.

Methods

The conditions of extraction (quantity, preparation, solvents, dilutions) were investigated in bovine feces.

Results

A rapid and simple protocol involving feces extraction with methanol (1/3, M/V) followed by centrifugation and a filtration step (10 kDa) was developed.

Conclusion

The workflow generated repeatable and informative fingerprints for robust metabolome characterization.

12.
Zhang  Wen  Zhu  Xiaopeng  Fu  Yu  Tsuji  Junko  Weng  Zhiping 《BMC bioinformatics》2017,18(13):464-11

Background

Alternative splicing is the critical process in a single gene coding, which removes introns and joins exons, and splicing branchpoints are indicators for the alternative splicing. Wet experiments have identified a great number of human splicing branchpoints, but many branchpoints are still unknown. In order to guide wet experiments, we develop computational methods to predict human splicing branchpoints.

Results

Because an intron may have multiple branchpoints, we formulate branchpoint prediction as a multi-label learning problem and attempt to predict branchpoint sites from intron sequences. First, we investigate a variety of intron sequence-derived features, such as sparse profile, dinucleotide profile, position weight matrix profile, Markov motif profile and polypyrimidine tract profile. Second, we consider several multi-label learning methods (partial least squares regression, canonical correlation analysis and regularized canonical correlation analysis) and use them as the basic classification engines. Third, we propose two ensemble learning schemes which integrate different features and different classifiers to build ensemble learning systems for branchpoint prediction. One is the genetic algorithm-based weighted average ensemble method; the other is the logistic regression-based ensemble method.
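The weighted average ensemble mentioned above can be sketched as follows. This shows only the combination step for given weights; how the weights are found (a genetic algorithm in the abstract) is not reproduced here, and the function name and toy numbers are illustrative assumptions.

```python
def weighted_average_ensemble(score_lists, weights):
    """Combine per-position scores from several base predictors.

    score_lists: one list of scores per base predictor, all of equal length.
    weights: one non-negative weight per base predictor (normalised internally).
    """
    if len(score_lists) != len(weights):
        raise ValueError("need exactly one weight per base predictor")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n = len(score_lists[0])
    if any(len(scores) != n for scores in score_lists):
        raise ValueError("all score lists must have the same length")
    return [
        sum(w * scores[i] for w, scores in zip(weights, score_lists)) / total
        for i in range(n)
    ]


if __name__ == "__main__":
    # Hypothetical scores for two intron positions from two base predictors.
    print(weighted_average_ensemble([[0.2, 0.8], [0.6, 0.4]], weights=[3, 1]))
```

In a multi-label setting, each position's combined score could then be thresholded to decide whether it is called a branchpoint.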

Conclusions

In the computational experiments, two ensemble learning methods outperform benchmark branchpoint prediction methods, and can produce high-accuracy results on the benchmark dataset.

13.

Background

MicroRNAs (miRNAs) are a large class of non-coding RNAs with important functions, widespread in animals, plants and viruses. Studies have shown that an RNase III family member called Drosha recognizes most miRNAs, initiates their processing and determines the mature miRNAs. Identifying Drosha processing sites will shed light on both miRNA identification and the mechanism of Drosha processing.

Methods

We developed a computational method for Drosha processing site prediction, named DroshaPSP, which employs a two-layer mathematical model to integrate structure features in the first layer and sequence features in the second layer. The performance of DroshaPSP was estimated by 5-fold cross-validation and measured by ACC (accuracy), Sn (sensitivity), Sp (specificity), P (precision) and MCC (Matthews correlation coefficient).
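For reference, the five measures listed above have standard definitions in terms of confusion-matrix counts. The sketch below computes them; the function name `binary_metrics` and the example counts are illustrative assumptions and are not taken from the DroshaPSP evaluation.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification measures from confusion-matrix counts."""

    def ratio(num, den):
        # Return 0.0 when a measure is undefined (empty denominator).
        return num / den if den else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "ACC": ratio(tp + tn, tp + fp + tn + fn),  # accuracy
        "Sn": ratio(tp, tp + fn),                  # sensitivity (recall)
        "Sp": ratio(tn, tn + fp),                  # specificity
        "P": ratio(tp, tp + fp),                   # precision
        "MCC": ratio(tp * tn - fp * fn, mcc_den),  # Matthews correlation coefficient
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    for name, value in binary_metrics(tp=8, fp=2, tn=9, fn=1).items():
        print(f"{name}: {value:.3f}")
```

Under k-fold cross-validation, these values are typically computed per fold and averaged, or computed once on the pooled out-of-fold predictions.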

Results

The results of testing DroshaPSP on the miRNA data of Drosophila melanogaster indicated that Sn, Sp and MCC reach 0.86, 0.99 and 0.86, respectively.

Conclusions

We found the Shannon entropy, a chemical kinetics feature, is a significant feature in telling the true sites among the nearby sites and improving the performance.

14.

Background

Human cancers are complex ecosystems composed of cells with distinct molecular signatures. Such intratumoral heterogeneity poses a major challenge to cancer diagnosis and treatment. Recent advancements of single-cell techniques such as scRNA-seq have brought unprecedented insights into cellular heterogeneity. Subsequently, a challenging computational problem is to cluster high dimensional noisy datasets with substantially fewer cells than the number of genes.

Methods

In this paper, we introduced a consensus clustering framework, conCluster, for cancer subtype identification from single-cell RNA-seq data. Using an ensemble strategy, conCluster fuses multiple basic partitions into consensus clusters.
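One common way to fuse several basic partitions is a co-association matrix: record how often each pair of cells lands in the same cluster, then group cells that co-cluster in most partitions. The sketch below illustrates that generic technique; it is not conCluster's actual fusion procedure, which the abstract does not describe, and the function names and the 0.5 threshold are assumptions.

```python
def co_association(partitions):
    """Fraction of partitions in which each pair of items shares a cluster."""
    n = len(partitions[0])
    matrix = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    matrix[i][j] += 1.0 / len(partitions)
    return matrix


def consensus_clusters(partitions, threshold=0.5):
    """Consensus labels: connected components of the thresholded co-association graph."""
    matrix = co_association(partitions)
    n = len(matrix)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:  # depth-first search over strongly co-clustered items
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and matrix[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


if __name__ == "__main__":
    # Three hypothetical basic partitions of four cells (label names are arbitrary).
    basic = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 0, 1]]
    print(consensus_clusters(basic))
```

Because only co-membership is counted, the scheme is insensitive to how individual partitions name their clusters.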

Results

Applied to real cancer scRNA-seq datasets, conCluster can more accurately detect cancer subtypes than the widely used scRNA-seq clustering methods. Further, we conducted co-expression network analysis for the identified melanoma subtypes.

Conclusions

Our analysis demonstrates that these subtypes exhibit distinct gene co-expression networks and significant gene sets with different functional enrichment.

15.

Introduction

Data sharing is being increasingly required by journals and has been heralded as a solution to the ‘replication crisis’.

Objectives

(i) Review data sharing policies of journals publishing the most metabolomics papers associated with open data and (ii) compare these journals’ policies to those that publish the most metabolomics papers.

Methods

A PubMed search was used to identify metabolomics papers. Metabolomics data repositories were manually searched for linked publications.

Results

Journals that support data sharing are not necessarily those with the most papers associated with open metabolomics data.

Conclusion

Further efforts are required to improve data sharing in metabolomics.

16.

Background

Cytoplasmic stress granules (SGs) are specialized storage sites of untranslated mRNAs whose formation occurs under different stress conditions and is often associated with cell survival. SG-inducing stresses include radiation, hypoxia, viral infections, and chemical inhibitors of specific translation initiation factors. The FDA-approved drug bortezomib (Velcade®) is a peptide boronate inhibitor of the 26S proteasome that is very efficient for the treatment of myelomas and other hematological tumors. Solid tumors are largely refractory to bortezomib. In the present study, we investigated the formation of SGs following bortezomib treatment.

Results

We show that bortezomib efficiently induces the formation of SGs in cancer cells. This process involves the phosphorylation of translation initiation factor eIF2α by heme-regulated inhibitor kinase (HRI). Depletion of HRI prevents bortezomib-induced formation of SGs and promotes apoptosis.

Conclusions

This is the first study describing the formation of SGs by a chemotherapeutic compound. We speculate that the activation of HRI and the formation of SGs might constitute a mechanism by which cancer cells resist bortezomib-mediated apoptosis.

17.
18.

Objectives

To improve its phosphate accumulating abilities for phosphate recycling from wastewater, a magnetotactic bacterium, Magnetospirillum gryphiswaldense, was genetically modified to over-express polyphosphate kinase.

Results

Polyphosphate kinase was over-expressed in the bacterium. The recombinant strain accumulated ninefold more polyphosphate from synthetic wastewater than the original wild type. The magnetic property of the recombinant M. gryphiswaldense strain was retained.

Conclusions

The recombinant M. gryphiswaldense can be used for phosphate removal and recovery in bioremediation.

19.

Background

Scaffold proteins have an important role in the regulation of signal propagation. These proteins do not possess any enzymatic activity but can contribute to the formation of multiprotein complexes. Although scaffold proteins are present in all cell types, the nervous system contains them in the largest amount. Caskin proteins are typically present in neuronal cells, particularly, in the synapses. However, the signaling mechanisms by which Caskin proteins are regulated are largely unknown.

Results

Here we demonstrate that EphB1 receptor tyrosine kinase can recruit Caskin1 through the adaptor protein Nck. Upon activation of the receptor kinase, the SH2 domain of Nck binds to one of its tyrosine residues, while Nck SH3 domains interact with the proline-rich domain of Caskin1. Complex formation of the receptor, adaptor and scaffold proteins results in the tyrosine phosphorylation of Caskin1 on its SH3 domain. The phosphorylation sites were identified by mass-spectrometry as tyrosines 296 and 336. To reveal the structural consequence of this phosphorylation, CD spectroscopy was performed. This measurement suggests that upon tyrosine phosphorylation the structure of the Caskin1 SH3 domain changes significantly.

Conclusion

Taken together, we propose that the scaffold protein Caskin1 can form a complex with the EphB1 tyrosine kinase via the Nck protein as a linker. Complex formation results in tyrosine phosphorylation of the Caskin1 SH3 domain. Although we were not able to identify any physiological partner of the SH3 domain so far, we could demonstrate that phosphorylation on conserved tyrosine residues results in marked changes in the structure of the SH3 domain.

20.

Introduction

While the evolutionary adaptation of enzymes to their own substrates is a well-assessed and rationalized field, how molecules were originally selected to initiate and assemble convenient metabolic pathways is a fascinating but still debated question.

Objectives

The aim of the present study is to give a rationale for the preferential selection of specific molecules to generate metabolic pathways.

Methods

The comparison of structural features of molecules, through an inductive methodological approach, offers a key for cautiously proposing a determining factor in their metabolic recruitment.

Results

Starting with some commonplaces occurring in the structural representation of relevant carbohydrates, such as glucose, fructose and ribose, arguments are presented that associate stable structural determinants of these molecules with their peculiar occurrence in metabolic pathways.

Conclusions

Among other possible factors, the reliability of the structural asset of a molecule may be relevant for its selection among structurally and, a priori, functionally similar molecules.
