期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

scDeepSort: a pre-trained cell-type annotation method for single-cell transcriptomics using deep learning with a weighted graph neural network

Xin Shao Haihong Yang Xiang Zhuang Jie Liao Penghui Yang Junyun Cheng Xiaoyan Lu Huajun Chen Xiaohui Fan 《Nucleic acids research》2021,49(21):e122

相似文献

2.

CellCall: integrating paired ligand–receptor and transcription factor activities for cell–cell communication

Yang Zhang Tianyuan Liu Xuesong Hu Mei Wang Jing Wang Bohao Zou Puwen Tan Tianyu Cui Yiying Dou Lin Ning Yan huang Shuan Rao Dong Wang Xiaoyang Zhao 《Nucleic acids research》2021,49(15):8520

相似文献

3.

scGET: Predicting Cell Fate Transition During Early Embryonic Development by Single-cell Graph Entropy

Jiayuan Zhong Chongyin Han Xuhang Zhang Pei Chen Rui Liu 《基因组蛋白质组与生物信息学报(英文版)》2021,19(3):461-474

During early embryonic development, cell fate commitment represents a critical transition or"tipping point"of embryonic differentiation, at which there is a drastic and qualitative shift of the cell populations. In this study, we presented a computational approach, scGET, to explore the gene–gene associations based on single-cell RNA sequencing (scRNA-seq) data for critical transition prediction. Specifically, by transforming the gene expression data to the local network entropy, the single-cell graph entropy (SGE) value quantitatively characterizes the stability and criticality of gene regu-latory networks among cell populations and thus can be employed to detect the critical signal of cell fate or lineage commitment at the single-cell level. Being applied to five scRNA-seq datasets of embryonic differentiation, scGET accurately predicts all the impending cell fate transitions. After identifying the"dark genes"that are non-differentially expressed genes but sensitive to the SGE value, the underlying signaling mechanisms were revealed, suggesting that the synergy of dark genes and their downstream targets may play a key role in various cell development processes. The application in all five datasets demonstrates the effectiveness of scGET in analyzing scRNA-seq data from a network perspective and its potential to track the dynamics of cell differentiation. The source code of scGET is accessible at https://github.com/zhongjiayuna/scGET_Project. 相似文献

4.

STRIDE: accurately decomposing and integrating spatial transcriptomics using single-cell RNA sequencing

Dongqing Sun Zhaoyang Liu Taiwen Li Qiu Wu Chenfei Wang 《Nucleic acids research》2022,50(7):e42

相似文献

5.

coupleCoC+: An information-theoretic co-clustering-based transfer learning framework for the integrative analysis of single-cell genomic data

Pengcheng Zeng Zhixiang Lin 《PLoS computational biology》2021,17(6)

Technological advances have enabled us to profile multiple molecular layers at unprecedented single-cell resolution and the available datasets from multiple samples or domains are growing. These datasets, including scRNA-seq data, scATAC-seq data and sc-methylation data, usually have different powers in identifying the unknown cell types through clustering. So, methods that integrate multiple datasets can potentially lead to a better clustering performance. Here we propose coupleCoC+ for the integrative analysis of single-cell genomic data. coupleCoC+ is a transfer learning method based on the information-theoretic co-clustering framework. In coupleCoC+, we utilize the information in one dataset, the source data, to facilitate the analysis of another dataset, the target data. coupleCoC+ uses the linked features in the two datasets for effective knowledge transfer, and it also uses the information of the features in the target data that are unlinked with the source data. In addition, coupleCoC+ matches similar cell types across the source data and the target data. By applying coupleCoC+ to the integrative clustering of mouse cortex scATAC-seq data and scRNA-seq data, mouse and human scRNA-seq data, mouse cortex sc-methylation and scRNA-seq data, and human blood dendritic cells scRNA-seq data from two batches, we demonstrate that coupleCoC+ improves the overall clustering performance and matches the cell subpopulations across multimodal single-cell genomic datasets. coupleCoC+ has fast convergence and it is computationally efficient. The software is available at https://github.com/cuhklinlab/coupleCoC_plus. 相似文献

6.

scIMC: a platform for benchmarking comparison and visualization analysis of scRNA-seq data imputation methods

Chichi Dai Yi Jiang Chenglin Yin Ran Su Xiangxiang Zeng Quan Zou Kenta Nakai Leyi Wei 《Nucleic acids research》2022,50(9):4877

相似文献

7.

PseudoGA: cell pseudotime reconstruction based on genetic algorithm

Pronoy Kanti Mondal Udit Surya Saha Indranil Mukhopadhyay 《Nucleic acids research》2021,49(14):7909

相似文献

8.

TSCAN: Pseudo-time reconstruction and evaluation in single-cell RNA-seq analysis

Zhicheng Ji Hongkai Ji 《Nucleic acids research》2016,44(13):e117

相似文献

9.

UFold: fast and accurate RNA secondary structure prediction with deep learning

Laiyi Fu Yingxin Cao Jie Wu Qinke Peng Qing Nie Xiaohui Xie 《Nucleic acids research》2022,50(3):e14

For many RNA molecules, the secondary structure is essential for the correct function of the RNA. Predicting RNA secondary structure from nucleotide sequences is a long-standing problem in genomics, but the prediction performance has reached a plateau over time. Traditional RNA secondary structure prediction algorithms are primarily based on thermodynamic models through free energy minimization, which imposes strong prior assumptions and is slow to run. Here, we propose a deep learning-based method, called UFold, for RNA secondary structure prediction, trained directly on annotated data and base-pairing rules. UFold proposes a novel image-like representation of RNA sequences, which can be efficiently processed by Fully Convolutional Networks (FCNs). We benchmark the performance of UFold on both within- and cross-family RNA datasets. It significantly outperforms previous methods on within-family datasets, while achieving a similar performance as the traditional methods when trained and tested on distinct RNA families. UFold is also able to predict pseudoknots accurately. Its prediction is fast with an inference time of about 160 ms per sequence up to 1500 bp in length. An online web server running UFold is available at https://ufold.ics.uci.edu. Code is available at https://github.com/uci-cbcl/UFold. 相似文献

10.

svaseq: removing batch effects and other unwanted noise from sequencing data

Jeffrey T. Leek 《Nucleic acids research》2014,42(21):e161

It is now known that unwanted noise and unmodeled artifacts such as batch effects can dramatically reduce the accuracy of statistical inference in genomic experiments. These sources of noise must be modeled and removed to accurately measure biological variability and to obtain correct statistical inference when performing high-throughput genomic analysis. We introduced surrogate variable analysis (sva) for estimating these artifacts by (i) identifying the part of the genomic data only affected by artifacts and (ii) estimating the artifacts with principal components or singular vectors of the subset of the data matrix. The resulting estimates of artifacts can be used in subsequent analyses as adjustment factors to correct analyses. Here I describe a version of the sva approach specifically created for count data or FPKMs from sequencing experiments based on appropriate data transformation. I also describe the addition of supervised sva (ssva) for using control probes to identify the part of the genomic data only affected by artifacts. I present a comparison between these versions of sva and other methods for batch effect estimation on simulated data, real count-based data and FPKM-based data. These updates are available through the sva Bioconductor package and I have made fully reproducible analysis using these methods available from: https://github.com/jtleek/svaseq. 相似文献

11.

CHANCE: comprehensive software for quality control and validation of ChIP-seq data

Aaron Diaz Abhinav Nellore Jun S Song 《Genome biology》2012,13(10):R98

ChIP-seq is a powerful method for obtaining genome-wide maps of protein-DNA interactions and epigenetic modifications. CHANCE (CHip-seq ANalytics and Confidence Estimation) is a standalone package for ChIP-seq quality control and protocol optimization. Our user-friendly graphical software quickly estimates the strength and quality of immunoprecipitations, identifies biases, compares the user''s data with ENCODE''s large collection of published datasets, performs multi-sample normalization, checks against quantitative PCR-validated control regions, and produces informative graphical reports. CHANCE is available at https://github.com/songlab/chance. 相似文献

12.

A Real-Time All-Atom Structural Search Engine for Proteins

Gabriel Gonzalez Brett Hannigan William F. DeGrado 《PLoS computational biology》2014,10(7)

Protein designers use a wide variety of software tools for de novo design, yet their repertoire still lacks a fast and interactive all-atom search engine. To solve this, we have built the Suns program: a real-time, atomic search engine integrated into the PyMOL molecular visualization system. Users build atomic-level structural search queries within PyMOL and receive a stream of search results aligned to their query within a few seconds. This instant feedback cycle enables a new “designability”-inspired approach to protein design where the designer searches for and interactively incorporates native-like fragments from proven protein structures. We demonstrate the use of Suns to interactively build protein motifs, tertiary interactions, and to identify scaffolds compatible with hot-spot residues. The official web site and installer are located at http://www.degradolab.org/suns/ and the source code is hosted at https://github.com/godotgildor/Suns (PyMOL plugin, BSD license), https://github.com/Gabriel439/suns-cmd (command line client, BSD license), and https://github.com/Gabriel439/suns-search (search engine server, GPLv2 license).

This is a PLOS Computational Biology Software Article

相似文献

13.

POMAShiny: A user-friendly web-based workflow for metabolomics and proteomics data analysis

Pol Castellano-Escuder Raúl Gonzlez-Domínguez Francesc Carmona-Pontaque Cristina Andrs-Lacueva Alex Snchez-Pla 《PLoS computational biology》2021,17(7)

Metabolomics and proteomics, like other omics domains, usually face a data mining challenge in providing an understandable output to advance in biomarker discovery and precision medicine. Often, statistical analysis is one of the most difficult challenges and it is critical in the subsequent biological interpretation of the results. Because of this, combined with the computational programming skills needed for this type of analysis, several bioinformatic tools aimed at simplifying metabolomics and proteomics data analysis have emerged. However, sometimes the analysis is still limited to a few hidebound statistical methods and to data sets with limited flexibility. POMAShiny is a web-based tool that provides a structured, flexible and user-friendly workflow for the visualization, exploration and statistical analysis of metabolomics and proteomics data. This tool integrates several statistical methods, some of them widely used in other types of omics, and it is based on the POMA R/Bioconductor package, which increases the reproducibility and flexibility of analyses outside the web environment. POMAShiny and POMA are both freely available at https://github.com/nutrimetabolomics/POMAShiny and https://github.com/nutrimetabolomics/POMA, respectively. 相似文献

14.

VDJtools: Unifying Post-analysis of T Cell Receptor Repertoires

Mikhail Shugay Dmitriy V. Bagaev Maria A. Turchaninova Dmitriy A. Bolotin Olga V. Britanova Ekaterina V. Putintseva Mikhail V. Pogorelyy Vadim I. Nazarov Ivan V. Zvyagin Vitalina I. Kirgizova Kirill I. Kirgizov Elena V. Skorobogatova Dmitriy M. Chudakov 《PLoS computational biology》2015,11(11)

Despite the growing number of immune repertoire sequencing studies, the field still lacks software for analysis and comprehension of this high-dimensional data. Here we report VDJtools, a complementary software suite that solves a wide range of T cell receptor (TCR) repertoires post-analysis tasks, provides a detailed tabular output and publication-ready graphics, and is built on top of a flexible API. Using TCR datasets for a large cohort of unrelated healthy donors, twins, and multiple sclerosis patients we demonstrate that VDJtools greatly facilitates the analysis and leads to sound biological conclusions. VDJtools software and documentation are available at https://github.com/mikessh/vdjtools. 相似文献

15.

A multi-objective genetic algorithm to find active modules in multiplex biological networks

Elva María Novoa-del-Toro Efrn Mezura-Montes Matthieu Vignes Morgane Trzol Frdrique Magdinier Laurent Tichit Anaïs Baudot 《PLoS computational biology》2021,17(8)

The identification of subnetworks of interest—or active modules—by integrating biological networks with molecular profiles is a key resource to inform on the processes perturbed in different cellular conditions. We here propose MOGAMUN, a Multi-Objective Genetic Algorithm to identify active modules in MUltiplex biological Networks. MOGAMUN optimizes both the density of interactions and the scores of the nodes (e.g., their differential expression). We compare MOGAMUN with state-of-the-art methods, representative of different algorithms dedicated to the identification of active modules in single networks. MOGAMUN identifies dense and high-scoring modules that are also easier to interpret. In addition, to our knowledge, MOGAMUN is the first method able to use multiplex networks. Multiplex networks are composed of different layers of physical and functional relationships between genes and proteins. Each layer is associated to its own meaning, topology, and biases; the multiplex framework allows exploiting this diversity of biological networks. We applied MOGAMUN to identify cellular processes perturbed in Facio-Scapulo-Humeral muscular Dystrophy, by integrating RNA-seq expression data with a multiplex biological network. We identified different active modules of interest, thereby providing new angles for investigating the pathomechanisms of this disease.Availability: MOGAMUN is available at https://github.com/elvanov/MOGAMUN and as a Bioconductor package at https://bioconductor.org/packages/release/bioc/html/MOGAMUN.html. Contact: rf.uma-vinu@toduab.siana 相似文献

16.

Segmentation and Tracking of Adherens Junctions in 3D for the Analysis of Epithelial Tissue Morphogenesis

Rodrigo Cilla Vinodh Mechery Beatriz Hernandez de Madrid Steven Del Signore Ivan Dotu Victor Hatini 《PLoS computational biology》2015,11(4)

Epithelial morphogenesis generates the shape of tissues, organs and embryos and is fundamental for their proper function. It is a dynamic process that occurs at multiple spatial scales from macromolecular dynamics, to cell deformations, mitosis and apoptosis, to coordinated cell rearrangements that lead to global changes of tissue shape. Using time lapse imaging, it is possible to observe these events at a system level. However, to investigate morphogenetic events it is necessary to develop computational tools to extract quantitative information from the time lapse data. Toward this goal, we developed an image-based computational pipeline to preprocess, segment and track epithelial cells in 4D confocal microscopy data. The computational pipeline we developed, for the first time, detects the adherens junctions of epithelial cells in 3D, without the need to first detect cell nuclei. We accentuate and detect cell outlines in a series of steps, symbolically describe the cells and their connectivity, and employ this information to track the cells. We validated the performance of the pipeline for its ability to detect vertices and cell-cell contacts, track cells, and identify mitosis and apoptosis in surface epithelia of Drosophila imaginal discs. We demonstrate the utility of the pipeline to extract key quantitative features of cell behavior with which to elucidate the dynamics and biomechanical control of epithelial tissue morphogenesis. We have made our methods and data available as an open-source multiplatform software tool called TTT (http://github.com/morganrcu/TTT) 相似文献

17.

Wham: Identifying Structural Variants of Biological Consequence

Zev N. Kronenberg Edward J. Osborne Kelsey R. Cone Brett J. Kennedy Eric T. Domyan Michael D. Shapiro Nels C. Elde Mark Yandell 《PLoS computational biology》2015,11(12)

Existing methods for identifying structural variants (SVs) from short read datasets are inaccurate. This complicates disease-gene identification and efforts to understand the consequences of genetic variation. In response, we have created Wham (Whole-genome Alignment Metrics) to provide a single, integrated framework for both structural variant calling and association testing, thereby bypassing many of the difficulties that currently frustrate attempts to employ SVs in association testing. Here we describe Wham, benchmark it against three other widely used SV identification tools–Lumpy, Delly and SoftSearch–and demonstrate Wham’s ability to identify and associate SVs with phenotypes using data from humans, domestic pigeons, and vaccinia virus. Wham and all associated software are covered under the MIT License and can be freely downloaded from github (https://github.com/zeeev/wham), with documentation on a wiki (http://zeeev.github.io/wham/). For community support please post questions to https://www.biostars.org/.

This is PLOS Computational Biology software paper.

相似文献

18.

UniqTag: Content-Derived Unique and Stable Identifiers for Gene Annotation

Shaun D. Jackman Joerg Bohlmann ?nan? Birol 《PloS one》2015,10(5)

When working on an ongoing genome sequencing and assembly project, it is rather inconvenient when gene identifiers change from one build of the assembly to the next. The gene labelling system described here, UniqTag, addresses this common challenge. UniqTag assigns a unique identifier to each gene that is a representative k-mer, a string of length k, selected from the sequence of that gene. Unlike serial numbers, these identifiers are stable between different assemblies and annotations of the same data without requiring that previous annotations be lifted over by sequence alignment. We assign UniqTag identifiers to ten builds of the Ensembl human genome spanning eight years to demonstrate this stability. The implementation of UniqTag in Ruby and an R package are available at https://github.com/sjackman/uniqtag sjackman/uniqtag. The R package is also available from CRAN: install.packages ("uniqtag"). Supplementary material and code to reproduce it is available at https://github.com/sjackman/uniqtag-paper. 相似文献

19.

SpeciesRax: A Tool for Maximum Likelihood Species Tree Inference from Gene Family Trees under Duplication,Transfer, and Loss

Benoit Morel Paul Schade Sarah Lutteropp Tom A Williams Gergely J Szll&#x;si Alexandros Stamatakis 《Molecular biology and evolution》2022,39(2)

Species tree inference from gene family trees is becoming increasingly popular because it can account for discordance between the species tree and the corresponding gene family trees. In particular, methods that can account for multiple-copy gene families exhibit potential to leverage paralogy as informative signal. At present, there does not exist any widely adopted inference method for this purpose. Here, we present SpeciesRax, the first maximum likelihood method that can infer a rooted species tree from a set of gene family trees and can account for gene duplication, loss, and transfer events. By explicitly modeling events by which gene trees can depart from the species tree, SpeciesRax leverages the phylogenetic rooting signal in gene trees. SpeciesRax infers species tree branch lengths in units of expected substitutions per site and branch support values via paralogy-aware quartets extracted from the gene family trees. Using both empirical and simulated data sets we show that SpeciesRax is at least as accurate as the best competing methods while being one order of magnitude faster on large data sets at the same time. We used SpeciesRax to infer a biologically plausible rooted phylogeny of the vertebrates comprising 188 species from 31,612 gene families in 1 h using 40 cores. SpeciesRax is available under GNU GPL at https://github.com/BenoitMorel/GeneRax and on BioConda. 相似文献

20.

cLoops2: a full-stack comprehensive analytical tool for chromatin interactions

Yaqiang Cao Shuai Liu Gang Ren Qingsong Tang Keji Zhao 《Nucleic acids research》2022,50(1):57

Investigating chromatin interactions between regulatory regions such as enhancer and promoter elements is vital for understanding the regulation of gene expression. Compared to Hi-C and its variants, the emerging 3D mapping technologies focusing on enriched signals, such as TrAC-looping, reduce the sequencing cost and provide higher interaction resolution for cis-regulatory elements. A robust pipeline is needed for the comprehensive interpretation of these data, especially for loop-centric analysis. Therefore, we have developed a new versatile tool named cLoops2 for the full-stack analysis of these 3D chromatin interaction data. cLoops2 consists of core modules for peak-calling, loop-calling, differentially enriched loops calling and loops annotation. It also contains multiple modules for interaction resolution estimation, data similarity estimation, features quantification, feature aggregation analysis, and visualization. cLoops2 with documentation and example data are open source and freely available at GitHub: https://github.com/KejiZhaoLab/cLoops2. 相似文献