Similar literature
 Found 20 similar articles (search time: 359 ms)
1.
Chemical graph generators are software packages that generate computer representations of chemical structures adhering to certain boundary conditions. Their development is a research topic in cheminformatics. Chemical graph generators are used in areas such as virtual library generation in drug design, molecular design with specified properties (called inverse QSAR/QSPR), organic synthesis design, retrosynthesis, and systems for computer-assisted structure elucidation (CASE). CASE systems have regained interest for the structure elucidation of unknowns in computational metabolomics, a current area of computational biology.

2.
3.
4.
5.
The binding affinities of protein-nucleic acid interactions could be altered due to missense mutations occurring in DNA- or RNA-binding proteins, therefore resulting in various diseases. Unfortunately, a systematic comparison and prediction of the effects of mutations on protein-DNA and protein-RNA interactions (these two mutation classes are termed MPDs and MPRs, respectively) is still lacking. Here, we demonstrated that these two classes of mutations could generate similar or different tendencies for binding free energy changes in terms of the properties of mutated residues. We then developed regression algorithms separately for MPDs and MPRs by introducing novel geometric partition-based energy features and interface-based structural features. Through feature selection and ensemble learning, similar computational frameworks that integrated energy- and nonenergy-based models were established to estimate the binding affinity changes resulting from MPDs and MPRs, but the selected features for the final models were different and therefore reflected the specificity of these two mutation classes. Furthermore, the proposed methodology was extended to the identification of mutations that significantly decreased the binding affinities. Extensive validations indicated that our algorithm generally performed better than the state-of-the-art methods on both the regression and classification tasks. The webserver and software are freely available at http://liulab.hzau.edu.cn/PEMPNI and https://github.com/hzau-liulab/PEMPNI.

6.
dadi is a popular but computationally intensive program for inferring models of demographic history and natural selection from population genetic data. I show that running dadi on a Graphics Processing Unit can dramatically speed computation compared with the CPU implementation, with minimal user burden. Motivated by this speed increase, I also extended dadi to four- and five-population models. This functionality is available in dadi version 2.1.0, https://bitbucket.org/gutenkunstlab/dadi/.

7.
Approximate Bayesian computation (ABC) constitutes a class of computational methods rooted in Bayesian statistics. In all model-based statistical inference, the likelihood function is of central importance, since it expresses the probability of the observed data under a particular statistical model, and thus quantifies the support data lend to particular values of parameters and to choices among different models. For simple models, an analytical formula for the likelihood function can typically be derived. However, for more complex models, an analytical formula might be elusive or the likelihood function might be computationally very costly to evaluate. ABC methods bypass the evaluation of the likelihood function. In this way, ABC methods widen the realm of models for which statistical inference can be considered. ABC methods are mathematically well-founded, but they inevitably make assumptions and approximations whose impact needs to be carefully assessed. Furthermore, the wider application domain of ABC exacerbates the challenges of parameter estimation and model selection. ABC has rapidly gained popularity over the last years and in particular for the analysis of complex problems arising in biological sciences (e.g., in population genetics, ecology, epidemiology, and systems biology).
This is a “Topic Page” article for PLOS Computational Biology.
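The likelihood-free idea in the ABC abstract above can be illustrated with the simplest variant, rejection sampling. This is a minimal sketch under stated assumptions, not the method of any particular package: the function `abc_rejection`, the toy Gaussian model, the uniform prior, the sample mean as summary statistic, and the tolerance of 0.2 are all hypothetical choices made for illustration.

```python
import random
import statistics


def abc_rejection(observed, prior_sampler, simulate, summary, tolerance, n_draws, rng):
    """Keep prior draws whose simulated summary lands close to the observed one.

    The likelihood is never evaluated; only forward simulation is required.
    """
    observed_summary = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)                       # draw a candidate parameter
        simulated = simulate(theta, len(observed), rng)  # simulate data under it
        if abs(summary(simulated) - observed_summary) <= tolerance:
            accepted.append(theta)                       # approximate posterior sample
    return accepted


# Toy problem (assumed): infer the mean of a unit-variance Gaussian.
rng = random.Random(0)
observed = [rng.gauss(3.0, 1.0) for _ in range(50)]

posterior = abc_rejection(
    observed,
    prior_sampler=lambda r: r.uniform(-10.0, 10.0),
    simulate=lambda theta, n, r: [r.gauss(theta, 1.0) for _ in range(n)],
    summary=statistics.fmean,
    tolerance=0.2,
    n_draws=5000,
    rng=rng,
)
```

The accepted values approximate the posterior only as well as the summary statistic and tolerance allow, which is the kind of approximation the abstract says must be carefully assessed.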

8.
It is computationally challenging to detect variation by aligning single-molecule sequencing (SMS) reads, or contigs from SMS assemblies. One approach to efficiently align SMS reads is sparse dynamic programming (SDP), where optimal chains of exact matches are found between the sequence and the genome. While straightforward implementations of SDP penalize gaps with a cost that is a linear function of gap length, biological variation is more accurately represented when gap cost is a concave function of gap length. We have developed a method, lra, that uses SDP with a concave-cost gap penalty, and used lra to align long-read sequences from PacBio and Oxford Nanopore (ONT) instruments as well as de novo assembly contigs. This alignment approach increases sensitivity and specificity for SV discovery, particularly for variants above 1 kb and when discovering variation from ONT reads, with runtimes that are comparable (1.05-3.76×) to current methods. When applied to calling variation from de novo assembly contigs, there is a 3.2% increase in Truvari F1 score compared to minimap2+htsbox. lra is available in bioconda (https://anaconda.org/bioconda/lra) and github (https://github.com/ChaissonLab/LRA).
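To make the difference between linear and concave gap costs concrete, the sketch below chains exact-match anchors with a simple quadratic-time dynamic program. It is an illustration only and not lra's algorithm, which the abstract describes as sparse dynamic programming. The anchor format `(query_pos, target_pos, length)`, the logarithmic cost function, its `scale` parameter, and the definition of a gap as the change in diagonal are all assumptions.

```python
import math


def concave_gap_cost(gap, scale=2.0):
    """Assumed concave penalty: grows with gap length, but ever more slowly."""
    return scale * math.log1p(gap)


def chain_anchors(anchors, gap_cost=concave_gap_cost):
    """Return (best_score, chain) for anchors given as (query_pos, target_pos, length).

    Quadratic-time chaining for illustration; a sparse DP would avoid
    comparing every pair of anchors.
    """
    anchors = sorted(anchors)
    if not anchors:
        return 0.0, []
    score = [float(length) for _, _, length in anchors]
    back = [-1] * len(anchors)
    for i, (qi, ti, li) in enumerate(anchors):
        for j in range(i):
            qj, tj, lj = anchors[j]
            # Anchor j may precede anchor i only if it ends before i starts
            # in both the query and the target.
            if qj + lj <= qi and tj + lj <= ti:
                gap = abs((qi - qj) - (ti - tj))  # shift between diagonals
                candidate = score[j] + li - gap_cost(gap)
                if candidate > score[i]:
                    score[i] = candidate
                    back[i] = j
    best = max(range(len(anchors)), key=score.__getitem__)
    best_score = score[best]
    chain = []
    while best != -1:  # follow back-pointers to recover the chain
        chain.append(anchors[best])
        best = back[best]
    chain.reverse()
    return best_score, chain


# Three collinear-in-order anchors separated by a 1000-base shift, plus one stray anchor.
anchors = [(0, 0, 20), (30, 30, 20), (60, 1060, 20), (500, 100, 5)]
concave_score, concave_chain = chain_anchors(anchors)
linear_score, linear_chain = chain_anchors(anchors, gap_cost=lambda g: 0.1 * g)
```

Under the concave cost the chain spans the 1000-base shift and keeps all three ordered anchors, whereas the linear cost makes that jump too expensive and the chain stops after the second anchor.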

9.
Flow cytometry bioinformatics is the application of bioinformatics to flow cytometry data, which involves storing, retrieving, organizing, and analyzing flow cytometry data using extensive computational resources and tools. Flow cytometry bioinformatics requires extensive use of and contributes to the development of techniques from computational statistics and machine learning. Flow cytometry and related methods allow the quantification of multiple independent biomarkers on large numbers of single cells. The rapid growth in the multidimensionality and throughput of flow cytometry data, particularly in the 2000s, has led to the creation of a variety of computational analysis methods, data standards, and public databases for the sharing of results. Computational methods exist to assist in the preprocessing of flow cytometry data, identifying cell populations within it, matching those cell populations across samples, and performing diagnosis and discovery using the results of previous steps. For preprocessing, this includes compensating for spectral overlap, transforming data onto scales conducive to visualization and analysis, assessing data for quality, and normalizing data across samples and experiments. For population identification, tools are available to aid traditional manual identification of populations in two-dimensional scatter plots (gating), to use dimensionality reduction to aid gating, and to find populations automatically in higher dimensional space in a variety of ways. It is also possible to characterize data in more comprehensive ways, such as the density-guided binary space partitioning technique known as probability binning, or by combinatorial gating. Finally, diagnosis using flow cytometry data can be aided by supervised learning techniques, and discovery of new cell types of biological importance by high-throughput statistical methods, as part of pipelines incorporating all of the aforementioned methods. 
Open standards, data, and software are also key parts of flow cytometry bioinformatics. Data standards include the widely adopted Flow Cytometry Standard (FCS) defining how data from cytometers should be stored, but also several new standards under development by the International Society for Advancement of Cytometry (ISAC) to aid in storing more detailed information about experimental design and analytical steps. Open data is slowly growing with the opening of the CytoBank database in 2010 and FlowRepository in 2012, both of which allow users to freely distribute their data, and the latter of which has been recommended as the preferred repository for MIFlowCyt-compliant data by ISAC. Open software is most widely available in the form of a suite of Bioconductor packages, but is also available for web execution on the GenePattern platform.
This is a “Topic Page” article for PLOS Computational Biology.

10.
Recent advances in metagenomic sequencing have enabled discovery of diverse, distinct microbes and viruses. Bacteriophages, the most abundant biological entity on Earth, evolve rapidly, and therefore, detection of unknown bacteriophages in sequence datasets is a challenge. Most of the existing detection methods rely on sequence similarity to known bacteriophage sequences, impeding the identification and characterization of distinct, highly divergent bacteriophage families. Here we present Seeker, a deep-learning tool for alignment-free identification of phage sequences. Seeker allows rapid detection of phages in sequence datasets and differentiation of phage sequences from bacterial ones, even when those phages exhibit little sequence similarity to established phage families. We comprehensively validate Seeker's ability to identify previously unidentified phages, and employ this method to detect unknown phages, some of which are highly divergent from the known phage families. We provide a web portal (seeker.pythonanywhere.com) and a user-friendly Python package (github.com/gussow/seeker) allowing researchers to easily apply Seeker in metagenomic studies, for the detection of diverse unknown bacteriophages.

11.
Understanding the relationships between biological processes is paramount to unravel pathophysiological mechanisms. These relationships can be modeled with Transfer Functions (TFs), with no need for a priori hypotheses as to the shape of the transfer function. Here we present Iliski, software dedicated to TF computation between two signals. It includes different pre-treatment routines and TF computation processes: deconvolution, and deterministic and non-deterministic optimization algorithms that are adapted to disparate datasets. We apply Iliski to data on neurovascular coupling, an ensemble of cellular mechanisms that link neuronal activity to local changes of blood flow, highlighting the software's benefits and caveats in the computation and evaluation of TFs. We also propose a workflow that will help users to choose the best computation according to the dataset. Iliski is available under the open-source license CC BY 4.0 on GitHub (https://github.com/alike-aydin/Iliski) and can be used on the most common operating systems, either within the MATLAB environment, or as a standalone application.

12.
Practical identifiability of Systems Biology models has received a lot of attention in recent scientific research. It addresses the crucial question for models’ predictability: how accurately can the models’ parameters be recovered from available experimental data? The methods based on profile likelihood are among the most reliable methods of practical identification. However, these methods are often computationally demanding or lead to inaccurate estimations of parameters’ confidence intervals. Development of methods that can accurately produce parameters’ confidence intervals in reasonable computational time is of utmost importance for Systems Biology and QSP modeling. We propose an algorithm, Confidence Intervals by Constraint Optimization (CICO), based on profile likelihood, designed to speed up confidence interval estimation and reduce computational cost. The numerical implementation of the algorithm includes settings to control the accuracy of confidence interval estimates. The algorithm was tested on a number of Systems Biology models, including the Taxol treatment model and the STAT5 Dimerization model, discussed in the current article. The CICO algorithm is implemented in a software package freely available in Julia (https://github.com/insysbio/LikelihoodProfiler.jl) and Python (https://github.com/insysbio/LikelihoodProfiler.py).
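For readers unfamiliar with profile-likelihood confidence intervals, the sketch below shows the classical construction on a toy problem. It is not the CICO algorithm and does not use the LikelihoodProfiler API; the Gaussian model, the function names, and the bisection search are assumptions chosen so that the nuisance parameter can be profiled out in closed form. The threshold is the 0.95 quantile of the chi-square distribution with one degree of freedom.

```python
import math
import random
import statistics

CHI2_95_DF1 = 3.841458820694124  # 0.95 quantile of chi-square, 1 degree of freedom


def profile_deviance(mu, data):
    """Profile deviance for the mean of a Gaussian with unknown variance.

    For each fixed mu the variance is replaced by its maximum-likelihood value,
    giving n * log(sigma2(mu) / sigma2(mu_hat)).
    """
    n = len(data)
    mu_hat = statistics.fmean(data)
    s2_hat = sum((y - mu_hat) ** 2 for y in data) / n
    s2_mu = sum((y - mu) ** 2 for y in data) / n
    return n * math.log(s2_mu / s2_hat)


def bisect_crossing(g, inside, outside, iters=100):
    """Locate the point between `inside` (g < 0) and `outside` (g >= 0) where g crosses 0."""
    for _ in range(iters):
        mid = 0.5 * (inside + outside)
        if g(mid) < 0:
            inside = mid
        else:
            outside = mid
    return 0.5 * (inside + outside)


def profile_interval(data, threshold=CHI2_95_DF1):
    """Confidence interval endpoints where the profile deviance equals the threshold."""
    mu_hat = statistics.fmean(data)

    def g(mu):
        return profile_deviance(mu, data) - threshold

    step = statistics.pstdev(data)
    # Widen the bracket until both sides lie outside the interval.
    while g(mu_hat - step) < 0 or g(mu_hat + step) < 0:
        step *= 2
    lower = bisect_crossing(g, mu_hat, mu_hat - step)
    upper = bisect_crossing(g, mu_hat, mu_hat + step)
    return lower, upper


rng = random.Random(1)
data = [rng.gauss(5.0, 2.0) for _ in range(40)]
lower, upper = profile_interval(data)
```

In realistic models each evaluation of the profile requires a numerical re-optimization of all other parameters, which is the computational cost the abstract refers to.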

13.
G-quadruplex DNA structures have become attractive drug targets, and native mass spectrometry can provide detailed characterization of drug binding stoichiometry and affinity, potentially at high throughput. However, the G-quadruplex DNA polymorphism poses problems for interpreting ligand screening assays. In order to establish standardized MS-based screening assays, we studied 28 sequences with documented NMR structures in (usually ∼100 mM) potassium, and report here their circular dichroism (CD), melting temperature (Tm), NMR spectra and electrospray mass spectra in 1 mM KCl/100 mM trimethylammonium acetate. Based on these results, we make a short-list of sequences that adopt the same structure in the MS assay as reported by NMR, and provide recommendations on using them for MS-based assays. We also built an R-based open-source application to build and consult a database, wherein further sequences can be incorporated in the future. The application automatically handles most of the data processing, and allows generating custom figures and reports. The database is included in the g4dbr package (https://github.com/EricLarG4/g4dbr) and can be explored online (https://ericlarg4.github.io/G4_database.html).

14.
Dbf4-dependent kinase (DDK) and cyclin-dependent kinase (CDK) are essential to initiate DNA replication at individual origins. During replication stress, the S-phase checkpoint inhibits the DDK- and CDK-dependent activation of late replication origins. Rad53 kinase is a central effector of the replication checkpoint and both binds to and phosphorylates Dbf4 to prevent late-origin firing. The molecular basis for the Rad53-Dbf4 physical interaction is not clear but occurs through the Dbf4 N terminus. Here we found that both Rad53 FHA1 and FHA2 domains, which specifically recognize phospho-threonine (pT), interacted with Dbf4 through an N-terminal sequence and an adjacent BRCT domain. Purified Rad53 FHA1 domain (but not FHA2) bound to a pT Dbf4 peptide in vitro, suggesting a possible phospho-threonine-dependent interaction between FHA1 and Dbf4. The Dbf4-Rad53 interaction is governed by multiple contacts that are separable from the Cdc5- and Msa1-binding sites in the Dbf4 N terminus. Importantly, abrogation of the Rad53-Dbf4 physical interaction blocked Dbf4 phosphorylation and allowed late-origin firing during replication checkpoint activation. This indicated that Rad53 must stably bind to Dbf4 to regulate its activity.

15.
16.
The Saccharomyces cerevisiae type 2C protein phosphatase Ptc1 is required for a wide variety of cellular functions, although only a few cellular targets have been identified. A genetic screen in search of mutations in protein kinase–encoding genes able to suppress multiple phenotypic traits caused by the ptc1 deletion yielded a single gene, MKK1, coding for a MAPK kinase (MAPKK) known to activate the cell-wall integrity (CWI) Slt2 MAPK. In contrast, mutation of the MKK1 paralog, MKK2, had a less significant effect. Deletion of MKK1 abolished the increased phosphorylation of Slt2 induced by the absence of Ptc1 both under basal and CWI pathway stimulatory conditions. We demonstrate that Ptc1 acts at the level of the MAPKKs of the CWI pathway, but only the Mkk1 kinase activity is essential for ptc1 mutants to display high Slt2 activation. We also show that Ptc1 is able to dephosphorylate Mkk1 in vitro. Our results reveal the preeminent role of Mkk1 in signaling through the CWI pathway and strongly suggest that hyperactivation of Slt2 caused by upregulation of Mkk1 is at the basis of most of the phenotypic defects associated with lack of Ptc1 function.

17.
18.
Directional export of messenger RNA (mRNA) protein particles (mRNPs) through nuclear pore complexes (NPCs) requires multiple factors. In Saccharomyces cerevisiae, the NPC proteins Nup159 and Nup42 are asymmetrically localized to the cytoplasmic face and have distinct functional domains: a phenylalanine-glycine (FG) repeat domain that docks mRNP transport receptors and domains that bind the DEAD-box ATPase Dbp5 and its activating cofactor Gle1, respectively. We speculated that the Nup42 and Nup159 FG domains play a role in positioning mRNPs for the terminal mRNP-remodeling steps carried out by Dbp5. Here we find that deletion (Δ) of both the Nup42 and Nup159 FG domains results in a cold-sensitive poly(A)+ mRNA export defect. The nup42ΔFG nup159ΔFG mutant also has synthetic lethal genetic interactions with dbp5 and gle1 mutants. RNA cross-linking experiments further indicate that the nup42ΔFG nup159ΔFG mutant has a reduced capacity for mRNP remodeling during export. To further analyze the role of these FG domains, we replaced the Nup159 or Nup42 FG domains with FG domains from other Nups. These FG “swaps” demonstrate that only certain FG domains are functional at the NPC cytoplasmic face. Strikingly, fusing the Nup42 FG domain to the carboxy-terminus of Gle1 bypasses the need for the endogenous Nup42 FG domain, highlighting the importance of proximal positioning for these factors. We conclude that the Nup42 and Nup159 FG domains target the mRNP to Gle1 and Dbp5 for mRNP remodeling at the NPC. Moreover, these results provide key evidence that character and context play a direct role in FG domain function and mRNA export.

19.
Mycobacterium simiae is a non-tuberculosis mycobacterium causing pulmonary infections in both immunocompetent and immunocompromised patients. We announce the draft genome sequence of M. simiae DSM 44165T. The 5,782,968-bp genome with 65.15% GC content (one chromosome, no plasmid) contains 5,727 open reading frames (33% with unknown function and 11 ORFs larger than 5,000 bp), three rRNA operons, 52 tRNAs, one 66-bp tmRNA matching tmRNA tags from Mycobacterium avium, Mycobacterium tuberculosis, Mycobacterium bovis, Mycobacterium microti, Mycobacterium marinum, and Mycobacterium africanum, and 389 DNA repetitive sequences. When ORFs and size distribution were compared between M. simiae and five other Mycobacterium species, M. simiae clustered with M. abscessus and M. smegmatis. A 40-kb prophage was predicted in addition to two prophage-like elements, 7 kb and 18 kb in size, but no mycobacteriophage was seen after the observation of 106 M. simiae cells. Fifteen putative CRISPRs were found. Three genes were predicted to encode resistance to aminoglycosides, beta-lactams and macrolide-lincosamide-streptogramin B. A total of 163 CAZymes were annotated. M. simiae contains ESX-1 to ESX-5 genes encoding a type VII secretion system. Availability of the genome sequence may help depict the unique properties of this environmental, opportunistic pathogen.

20.
Meiosis is a tightly regulated process requiring coordination of diverse events. A conserved ERK/MAPK-signaling cascade plays an essential role in the regulation of meiotic progression. The Thousand And One (TAO) kinase is a MAPK kinase kinase, the meiotic role of which is unknown. We have analyzed the meiotic functions of KIN-18, the homolog of mammalian TAO kinases, in Caenorhabditis elegans. We found that KIN-18 is essential for normal meiotic progression; mutants exhibit accelerated meiotic recombination as detected both by analysis of recombination intermediates and by crossover outcome. In addition, ectopic germ-cell differentiation and enhanced levels of apoptosis were observed in kin-18 mutants. These defects correlate with ectopic activation of MPK-1 that includes premature, missing, and reoccurring MPK-1 activation. Late progression defects in kin-18 mutants are suppressed by inhibiting an upstream activator of MPK-1 signaling, KSR-2. However, the acceleration of recombination events observed in kin-18 mutants is largely MPK-1-independent. Our data suggest that KIN-18 coordinates meiotic progression by modulating the timing of MPK-1 activation and the progression of recombination events. The regulation of the timing of MPK-1 activation ensures the proper timing of apoptosis and is required for the formation of functional oocytes. Meiosis is a conserved process; thus, revealing that KIN-18 is a novel regulator of meiotic progression in C. elegans would help to elucidate TAO kinase’s role in germline development in higher eukaryotes.
