Similar articles
 Found 20 similar articles (search time: 750 ms)
1.
Chemical graph generators are software packages that generate computer representations of chemical structures adhering to certain boundary conditions. Their development is a research topic of cheminformatics. Chemical graph generators are used in areas such as virtual library generation in drug design, molecular design with specified properties (called inverse QSAR/QSPR), organic synthesis design, retrosynthesis, and systems for computer-assisted structure elucidation (CASE). CASE systems have regained interest for the structure elucidation of unknowns in computational metabolomics, a current area of computational biology.

2.
3.
Practical identifiability of Systems Biology models has received a lot of attention in recent scientific research. It addresses the crucial question for models’ predictability: how accurately can the models’ parameters be recovered from available experimental data? The methods based on profile likelihood are among the most reliable methods of practical identification. However, these methods are often computationally demanding or lead to inaccurate estimations of parameters’ confidence intervals. Development of methods that can accurately produce parameters’ confidence intervals in reasonable computational time is of utmost importance for Systems Biology and QSP modeling. We propose an algorithm, Confidence Intervals by Constraint Optimization (CICO), based on profile likelihood and designed to speed up confidence interval estimation and reduce computational cost. The numerical implementation of the algorithm includes settings to control the accuracy of confidence interval estimates. The algorithm was tested on a number of Systems Biology models, including the Taxol treatment model and the STAT5 Dimerization model, discussed in the current article. The CICO algorithm is implemented in a software package freely available in Julia (https://github.com/insysbio/LikelihoodProfiler.jl) and Python (https://github.com/insysbio/LikelihoodProfiler.py).
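
As a rough illustration of the profile-likelihood idea this abstract builds on, the sketch below computes a 95% confidence interval for the slope of a straight-line model by profiling out the intercept and locating the points where the profile crosses the chi-square threshold. It is the textbook approach, not the CICO algorithm, and it does not use the LikelihoodProfiler API; the model, the data, the known noise level `sigma`, and all function names are assumptions made for the example.

```python
# Textbook profile-likelihood interval for one parameter (NOT the CICO algorithm).
# Assumed model: y = a*x + b + Gaussian noise with known sigma.

CHI2_95_1DOF = 3.841458820694124  # 0.95 quantile of chi-square, 1 degree of freedom


def ls_slope(xs, ys):
    """Least-squares slope, i.e. the maximum-likelihood estimate of a."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def profile_nll2(a, xs, ys, sigma):
    """-2 log-likelihood (up to a constant) at slope a, minimized over b."""
    # For a fixed slope, the best intercept is the mean residual.
    b = sum(y - a * x for x, y in zip(xs, ys)) / len(xs)
    return sum((y - a * x - b) ** 2 for x, y in zip(xs, ys)) / sigma ** 2


def bisect(f, lo, hi, tol=1e-10):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) differ in sign."""
    f_lo_positive = f(lo) > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == f_lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def profile_ci(xs, ys, sigma, width=10.0):
    """95% interval: slopes whose profile lies within the threshold of the minimum."""
    a_hat = ls_slope(xs, ys)
    best = profile_nll2(a_hat, xs, ys, sigma)

    def excess(a):
        return profile_nll2(a, xs, ys, sigma) - best - CHI2_95_1DOF

    # `width` is an assumed search range that must bracket both endpoints.
    return bisect(excess, a_hat - width, a_hat), bisect(excess, a_hat, a_hat + width)


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 2.1, 3.9, 6.2, 7.8]
a_hat = ls_slope(xs, ys)
ci_low, ci_high = profile_ci(xs, ys, sigma=1.0)
```

In this toy model the inner minimization over `b` has a closed form. In a realistic Systems Biology model each profile evaluation requires a numerical re-fit of all remaining parameters, which is the cost the abstract describes as computationally demanding.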

4.
Recent advances in metagenomic sequencing have enabled discovery of diverse, distinct microbes and viruses. Bacteriophages, the most abundant biological entity on Earth, evolve rapidly, and therefore, detection of unknown bacteriophages in sequence datasets is a challenge. Most of the existing detection methods rely on sequence similarity to known bacteriophage sequences, impeding the identification and characterization of distinct, highly divergent bacteriophage families. Here we present Seeker, a deep-learning tool for alignment-free identification of phage sequences. Seeker allows rapid detection of phages in sequence datasets and differentiation of phage sequences from bacterial ones, even when those phages exhibit little sequence similarity to established phage families. We comprehensively validate Seeker's ability to identify previously unidentified phages, and employ this method to detect unknown phages, some of which are highly divergent from the known phage families. We provide a web portal (seeker.pythonanywhere.com) and a user-friendly Python package (github.com/gussow/seeker) allowing researchers to easily apply Seeker in metagenomic studies, for the detection of diverse unknown bacteriophages.

5.
The rapid spread of COVID-19 is motivating development of antivirals targeting conserved SARS-CoV-2 molecular machinery. The SARS-CoV-2 genome includes conserved RNA elements that offer potential small-molecule drug targets, but most of their 3D structures have not been experimentally characterized. Here, we provide a compilation of chemical mapping data from our and other labs, secondary structure models, and 3D model ensembles based on Rosetta's FARFAR2 algorithm for SARS-CoV-2 RNA regions including the individual stems SL1-8 in the extended 5′ UTR; the reverse complement of the 5′ UTR SL1-4; the frameshift stimulating element (FSE); and the extended pseudoknot, hypervariable region, and s2m of the 3′ UTR. For eleven of these elements (the stems in SL1–8, reverse complement of SL1–4, FSE, s2m and 3′ UTR pseudoknot), modeling convergence supports the accuracy of predicted low energy states; subsequent cryo-EM characterization of the FSE confirms modeling accuracy. To aid efforts to discover small molecule RNA binders guided by computational models, we provide a second set of similarly prepared models for RNA riboswitches that bind small molecules. Both datasets (‘FARFAR2-SARS-CoV-2’, at https://github.com/DasLab/FARFAR2-SARS-CoV-2; and ‘FARFAR2-Apo-Riboswitch’, at https://github.com/DasLab/FARFAR2-Apo-Riboswitch) include up to 400 models for each RNA element, which may facilitate drug discovery approaches targeting dynamic ensembles of RNA molecules.

6.
The binding affinities of protein-nucleic acid interactions can be altered by missense mutations occurring in DNA- or RNA-binding proteins, resulting in various diseases. Unfortunately, a systematic comparison and prediction of the effects of mutations on protein-DNA and protein-RNA interactions (these two mutation classes are termed MPDs and MPRs, respectively) is still lacking. Here, we demonstrated that these two classes of mutations could generate similar or different tendencies for binding free energy changes in terms of the properties of mutated residues. We then developed regression algorithms separately for MPDs and MPRs by introducing novel geometric partition-based energy features and interface-based structural features. Through feature selection and ensemble learning, similar computational frameworks that integrated energy- and nonenergy-based models were established to estimate the binding affinity changes resulting from MPDs and MPRs, but the selected features for the final models were different and therefore reflected the specificity of these two mutation classes. Furthermore, the proposed methodology was extended to the identification of mutations that significantly decreased the binding affinities. Extensive validations indicated that our algorithm generally performed better than the state-of-the-art methods on both the regression and classification tasks. The webserver and software are freely available at http://liulab.hzau.edu.cn/PEMPNI and https://github.com/hzau-liulab/PEMPNI.

7.
Phenotypic profiling of large three-dimensional microscopy data sets has not been widely adopted due to the challenges posed by cell segmentation and feature selection. The computational demands of automated processing further limit analysis of hard-to-segment images such as those of neurons and organoids. Here we describe a comprehensive shallow-learning framework for automated quantitative phenotyping of three-dimensional (3D) image data using unsupervised data-driven voxel-based feature learning, which enables computationally facile classification, clustering and advanced data visualization. We demonstrate the analysis potential on complex 3D images by investigating the phenotypic alterations of neurons in response to apoptosis-inducing treatments and the morphogenesis of oncogene-expressing human mammary gland acinar organoids. Our novel implementation of image analysis algorithms, called Phindr3D, allowed rapid implementation of data-driven voxel-based feature learning into 3D high content analysis (HCA) operations and constitutes a major practical advance, as the computed assignments represent the biology while preserving the heterogeneity of the underlying data. Phindr3D is provided as Matlab code and as a stand-alone program (https://github.com/DWALab/Phindr3D).

8.
G-quadruplex DNA structures have become attractive drug targets, and native mass spectrometry can provide detailed characterization of drug binding stoichiometry and affinity, potentially at high throughput. However, the G-quadruplex DNA polymorphism poses problems for interpreting ligand screening assays. In order to establish standardized MS-based screening assays, we studied 28 sequences with documented NMR structures in (usually ∼100 mM) potassium, and report here their circular dichroism (CD), melting temperature (Tm), NMR spectra and electrospray mass spectra in 1 mM KCl/100 mM trimethylammonium acetate. Based on these results, we make a short-list of sequences that adopt the same structure in the MS assay as reported by NMR, and provide recommendations on using them for MS-based assays. We also built an R-based open-source application to build and consult a database, wherein further sequences can be incorporated in the future. The application automatically handles most of the data processing, and allows generating custom figures and reports. The database is included in the g4dbr package (https://github.com/EricLarG4/g4dbr) and can be explored online (https://ericlarg4.github.io/G4_database.html).

9.
The lifestyle of parasitic plants is associated with peculiar morphological, genetic, and physiological adaptations that existing online plant-specific resources fail to adequately represent. Here, we introduce the Web Application for the Research of Parasitic Plants (WARPP) as an online resource dedicated to advancing research and development of parasitic plant biology. WARPP is a framework to facilitate international efforts by providing a central hub of curated evolutionary, ecological, and genetic data. The first version of WARPP provides a community hub for researchers to test this web application, for which curated data revolving around the economically important Broomrape family (Orobanchaceae) is readily accessible. The initial set of WARPP online tools includes a genome browser that centralizes genomic information for sequenced parasitic plant genomes, an orthogroup summary detailing the presence and absence of orthologous genes in parasites compared with nonparasitic plants, and an ancestral trait explorer showing the evolution of life-history preferences along phylogenies. WARPP represents a project under active development and relies on the scientific community to populate the web app’s database and further the development of new analysis tools. The first version of WARPP can be securely accessed at https://parasiticplants.app. The source code is licensed under GNU GPLv2 and is available at https://github.com/wickeLab/WARPP.

The WARPP online resource is a new, expandable, and interactive parasitic plant-specific data hub that provides online tools tailored to the peculiarities of parasitic angiosperms.

10.
11.
Extensive amounts of multi-omics data and multiple cancer subtyping methods have emerged rapidly and generate discrepant clustering results, which poses challenges for cancer molecular subtype research. Thus, the development of methods for the identification of cancer consensus molecular subtypes is essential. The lack of intuitive and easy-to-use analytical tools has posed a barrier. Here, we report on the development of the COnsensus Molecular SUbtype of Cancer (COMSUC) web server. With COMSUC, users can explore consensus molecular subtypes of more than 30 cancers based on eight clustering methods, five types of omics data from public reference datasets or users’ private data, and three consensus clustering methods. The web server provides interactive and modifiable visualization, and publishable output of analysis results. Researchers can also exchange consensus subtype results with collaborators via project IDs. COMSUC is now publicly and freely available with no login requirement at http://comsuc.bioinforai.tech/ (IP address: http://59.110.25.27/). For a video summary of this web server, see S1 Video and S1 File.

12.
13.
Understanding the relationships between biological processes is paramount to unraveling pathophysiological mechanisms. These relationships can be modeled with Transfer Functions (TFs), with no need for a priori hypotheses as to the shape of the transfer function. Here we present Iliski, a software tool dedicated to TF computation between two signals. It includes different pre-treatment routines and TF computation processes: deconvolution, and deterministic and non-deterministic optimization algorithms that are adapted to disparate datasets. We apply Iliski to data on neurovascular coupling, an ensemble of cellular mechanisms that link neuronal activity to local changes in blood flow, highlighting the software's benefits and caveats in the computation and evaluation of TFs. We also propose a workflow that will help users choose the best computation according to the dataset. Iliski is available under the open-source license CC BY 4.0 on GitHub (https://github.com/alike-aydin/Iliski) and can be used on the most common operating systems, either within the MATLAB environment or as a standalone application.

14.
A streaming assembly pipeline utilising real-time Oxford Nanopore Technology (ONT) sequencing data is important for saving sequencing resources and reducing time-to-result. A previous approach implemented in npScarf provided an efficient streaming algorithm for hybrid assembly but was relatively prone to mis-assemblies compared to other graph-based methods. Here we present npGraph, a streaming hybrid assembly tool using the assembly graph instead of the separated pre-assembly contigs. It is able to produce a more complete genome assembly by resolving the path-finding problem on the assembly graph using long reads as the traversing guide. Application to synthetic and real data from bacterial isolate genomes shows improved accuracy while still maintaining a low computational cost. npGraph also provides a graphical user interface (GUI) that gives a real-time visualisation of the progress of assembly. The tool and source code are available at https://github.com/hsnguyen/assembly.

15.
16.
Identifying cooperating modules of driver alterations can provide insights into cancer etiology and advance the development of effective personalized treatments. We present Cancer Rule Set Optimization (CRSO) for inferring the combinations of alterations that cooperate to drive tumor formation in individual patients. Application to 19 TCGA cancer types revealed a mean of 11 core driver combinations per cancer, comprising 2–6 alterations per combination and accounting for a mean of 70% of samples per cancer type. CRSO is distinct from methods based on statistical co‐occurrence, which we demonstrate is a suboptimal criterion for investigating driver cooperation. CRSO identified well‐studied driver combinations that were not detected by other approaches and nominated novel combinations that correlate with clinical outcomes in multiple cancer types. Novel synergies were identified in NRAS‐mutant melanomas that may be therapeutically relevant. Core driver combinations involving NFE2L2 mutations were identified in four cancer types, supporting the therapeutic potential of NRF2 pathway inhibition. CRSO is available at https://github.com/mikekleinsgit/CRSO/.

17.
Different miRNA profiling protocols and technologies introduce differences in the resulting quantitative expression profiles. These include differences in the presence (and measurability) of certain miRNAs. We present and examine a method based on quantile normalization, Adjusted Quantile Normalization (AQuN), to combine miRNA expression data from multiple studies in breast cancer into a single joint dataset for integrative analysis. By pooling multiple datasets, we obtain increased statistical power, surfacing patterns that do not emerge as statistically significant when separately analyzing these datasets. To merge several datasets, as we do here, one needs to overcome both technical and batch differences between these datasets. We compare several approaches for merging and jointly analyzing miRNA datasets. We investigate the statistical confidence for known results and highlight potential new findings that resulted from the joint analysis using AQuN. In particular, we detect several miRNAs to be differentially expressed in estrogen receptor (ER) positive versus ER negative samples. In addition, we identify new potential biomarkers and therapeutic targets for both clinical groups. As a specific example, using the AQuN-derived dataset we detect hsa-miR-193b-5p to have a statistically significant over-expression in the ER positive group, a phenomenon that was not previously reported. Furthermore, as demonstrated by functional assays in breast cancer cell lines, overexpression of hsa-miR-193b-5p resulted in decreased cell viability in addition to inducing apoptosis. Together, these observations suggest a novel functional role for this miRNA in breast cancer. Packages implementing AQuN are provided for Python and Matlab: https://github.com/YakhiniGroup/PyAQN.
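
For readers unfamiliar with the baseline that AQuN adjusts, the following is a minimal sketch of plain quantile normalization, assuming every sample measures the same features with no missing values. It is not the AQuN method and does not use the PyAQN package; the function name and the toy values are hypothetical.

```python
from statistics import mean


def quantile_normalize(samples):
    """Plain quantile normalization of equal-length samples (lists of floats).

    Each value is replaced by the mean, across samples, of the values that
    share its rank. Ties are broken by position, which is a simplification.
    """
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all samples must have the same length")

    # Sort every sample, then average across samples at each rank.
    sorted_samples = [sorted(s) for s in samples]
    rank_means = [mean(s[i] for s in sorted_samples) for i in range(n)]

    normalized = []
    for s in samples:
        # Feature indices ordered by ascending value within this sample.
        order = sorted(range(n), key=lambda i: s[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Two toy "studies" measuring the same four features on different scales.
study_a = [5.0, 2.0, 3.0, 4.0]
study_b = [4.0, 1.0, 4.5, 2.0]
result = quantile_normalize([study_a, study_b])
```

After normalization every sample has the same set of values, so only the within-sample ranking of features is preserved. The equal-length assumption is exactly what breaks down when protocols differ in which miRNAs they can measure, the situation the abstract says AQuN is meant to handle.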

18.
Drug combinations have demonstrated great potential in cancer treatments. They alleviate drug resistance and improve therapeutic efficacy. The fast-growing number of anti-cancer drugs has made the experimental investigation of all drug combinations costly and time-consuming. Computational techniques can improve the efficiency of drug combination screening. Despite recent advances in applying machine learning to synergistic drug combination prediction, several challenges remain. First, the performance of existing methods is suboptimal. There is still much room for improvement. Second, biological knowledge has not been fully incorporated into the model. Finally, many models lack interpretability, limiting their clinical applications. To address these challenges, we have developed a knowledge-enabled and self-attention transformer boosted deep learning model, TranSynergy, which improves the performance and interpretability of synergistic drug combination prediction. TranSynergy is designed so that the cellular effect of drug actions can be explicitly modeled through cell-line gene dependency, gene-gene interaction, and genome-wide drug-target interaction. A novel Shapley Additive Gene Set Enrichment Analysis (SA-GSEA) method has been developed to deconvolute genes that contribute to the synergistic drug combination and improve model interpretability. Extensive benchmark studies demonstrate that TranSynergy outperforms the state-of-the-art method, suggesting the potential of mechanism-driven machine learning. Novel pathways that are associated with the synergistic combinations are revealed and supported by experimental evidence. They may provide new insights into identifying biomarkers for precision medicine and discovering new anti-cancer therapies. Several new synergistic drug combinations have been predicted with high confidence for ovarian cancer, which has few treatment options.
The code is available at https://github.com/qiaoliuhub/drug_combination.

19.
Adaptive introgression, the flow of adaptive genetic variation between species or populations, has attracted significant interest in recent years and has been implicated in a number of cases of adaptation, from pesticide resistance and immunity to local adaptation. Despite this, methods for identification of adaptive introgression from population genomic data are lacking. Here, we present Ancestry_HMM-S, a hidden Markov model-based method for identifying genes undergoing adaptive introgression and quantifying the strength of selection acting on them. Through extensive validation, we show that this method performs well on moderately sized data sets for realistic population and selection parameters. We apply Ancestry_HMM-S to a data set of an admixed Drosophila melanogaster population from South Africa and we identify 17 loci which show signatures of adaptive introgression, four of which have previously been shown to confer resistance to insecticides. Ancestry_HMM-S provides a powerful method for inferring adaptive introgression in data sets that are typically collected when studying admixed populations. This method will enable powerful insights into the genetic consequences of admixture across diverse populations. Ancestry_HMM-S can be downloaded from https://github.com/jesvedberg/Ancestry_HMM-S/.

20.
In the past few years, a wealth of sample-specific network construction methods and structural network control methods has been proposed to identify sample-specific driver nodes for supporting the Sample-Specific network Control (SSC) analysis of biological networked systems. However, there is no comprehensive evaluation of these state-of-the-art methods. Here, we conducted a performance assessment of 16 SSC analysis workflows, formed by combining 4 sample-specific network reconstruction methods with 4 representative structural control methods. This study includes simulation evaluation of representative biological networks, personalized driver gene prioritization on multiple cancer bulk expression datasets with matched patient samples from TCGA, and identification of cell marker genes and key time points related to cell differentiation on single-cell RNA-seq datasets. By broadly comparing existing SSC analysis workflows, we provide the following recommendations and benchmarking workflows. (i) The performance of a network control method is strongly dependent on the upstream sample-specific network method, and the Cell-Specific Network construction (CSN) method and the Single-Sample Network (SSN) method are the preferred sample-specific network construction methods. (ii) After constructing the sample-specific networks, the undirected network-based control methods are more effective than the directed network-based control methods. In addition, the data and evaluation pipeline are freely available at https://github.com/WilfongGuo/Benchmark_control.
