Similar literature
1.
Regulatory DNA elements, short genomic segments that regulate gene expression, have been implicated in developmental disorders and human disease. Despite this clinical urgency, only a small fraction of the regulatory DNA repertoire has been confirmed through reporter gene assays, and the overall success rate of functional validation of candidate regulatory elements is low. Moreover, the number and diversity of datasets from which putative regulatory elements can be identified are large and rapidly increasing. We generated a flexible and user-friendly tool to integrate the information from different types of genomic datasets (e.g. ATAC-seq, ChIP-seq, conservation), aiming to increase the ease and success rate of functional prediction. To this end, we developed the EMERGE program, which merges all datasets that the user considers informative and uses a logistic regression framework, based on validated functional elements, to assign optimal weights to these datasets. ROC curve analysis shows that a combination of datasets leads to improved prediction of tissue-specific enhancers in human, mouse and Drosophila genomes. Functional assays based on this prediction can be expected to have substantially higher success rates. The resulting integrated signal for prediction of functional elements can be plotted in a built-in genome browser or exported for further analysis.
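
The weighting idea in this abstract can be illustrated with a minimal sketch. This is not the EMERGE implementation: the two binary "tracks", the toy labels, and the helper names `fit_logistic`, `combined_score` and `roc_auc` are all hypothetical, and plain gradient descent stands in for whatever fitting procedure the tool actually uses. The sketch fits one logistic-regression weight per track on validated examples and compares ROC AUC for a single track against the combined score.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit per-track weights and a bias by batch gradient descent on the log-loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def combined_score(x, w, b):
    """Integrated signal for one region: weighted sum of tracks passed through a sigmoid."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: each region has two binary tracks (e.g. "has ATAC peak",
# "is conserved"); label 1 means the region was validated as functional.
X = [(1, 1), (1, 1), (1, 0), (0, 1), (0, 0),
     (0, 0), (0, 0), (1, 0), (0, 1), (1, 1)]
y = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

w, b = fit_logistic(X, y)
print("weights:", w, "bias:", b)
print("AUC, track 1 alone:", roc_auc([x[0] for x in X], y))
print("AUC, combined:", roc_auc([combined_score(x, w, b) for x in X], y))
```

Real tracks would be continuous signals over genomic windows rather than binary flags, and the AUC would normally be estimated on held-out validated elements.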

2.
With the rapid accumulation of biological omics datasets, decoding the underlying relationships of cross-dataset genes has become an important issue. Previous studies have attempted to identify differentially expressed genes across datasets, but they struggle to detect interrelated ones. Moreover, existing correlation-based algorithms can only measure the relationship between genes within a single dataset or between two multi-modal datasets from the same samples. It is still unclear how to quantify the strength of association of the same gene across two biological datasets with different samples. To this end, we propose Approximate Distance Correlation (ADC) to select interrelated genes with statistical significance across two different biological datasets. ADC first obtains the k most correlated genes for each target gene as its approximate observations, and then calculates the distance correlation (DC) for the target gene across two datasets. ADC repeats this process for all genes and then performs the Benjamini-Hochberg adjustment to control the false discovery rate. We demonstrate the effectiveness of ADC with simulation data and four real applications to select highly interrelated genes across two datasets. These four applications include 21 cancer RNA-seq datasets of different tissues; six single-cell RNA-seq (scRNA-seq) datasets of mouse hematopoietic cells across six different cell types along the hematopoietic cell lineage; five scRNA-seq datasets of pancreatic islet cells across five different technologies; and coupled single-cell ATAC-seq (scATAC-seq) and scRNA-seq data of peripheral blood mononuclear cells (PBMC). Extensive results demonstrate that ADC is a powerful tool to uncover interrelated genes with strong biological implications and is scalable to large-scale datasets. Moreover, the number of such genes can serve as a metric to measure the similarity between two datasets, which could characterize the relative difference of diverse cell types and technologies.
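
Two building blocks named in this abstract, sample distance correlation and the Benjamini-Hochberg adjustment, can be sketched as follows. This is a textbook-style illustration rather than the ADC code: it works on scalar observations, it omits the step that selects the k most correlated genes and the significance test that produces the p-values, and the function names are hypothetical.

```python
import math

def _centered_distances(values):
    """Pairwise |a - b| distances, double-centered by row, column and grand means."""
    n = len(values)
    d = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in d]  # matrix is symmetric: row means = column means
    grand = sum(row_means) / n
    return [[d[i][j] - row_means[i] - row_means[j] + grand for j in range(n)]
            for i in range(n)]

def distance_correlation(x, y):
    """Sample distance correlation between two equal-length lists of numbers."""
    n = len(x)
    A, B = _centered_distances(x), _centered_distances(y)
    dcov2 = sum(A[i][j] * B[i][j] for i in range(n) for j in range(n)) / n ** 2
    dvar_x = sum(a * a for row in A for a in row) / n ** 2
    dvar_y = sum(b * b for row in B for b in row) / n ** 2
    denom = math.sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        return 0.0  # one of the inputs is constant
    return math.sqrt(max(dcov2, 0.0) / denom)

def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

print(distance_correlation([1, 2, 3, 4, 5], [3, 5, 7, 9, 11]))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Genes whose adjusted p-value falls below a chosen threshold would then be reported as interrelated; how ADC derives the p-value for each distance correlation is not stated in the abstract and is left out here.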

3.
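Chen Min, Zhang Zheng, Meng Ziyuan, Zhang Xuejun. Hereditas (Beijing) (《遗传》), 2020, (4): 347-353
Assay for transposase-accessible chromatin with high-throughput sequencing (ATAC-seq) is a high-throughput sequencing technique that uses Tn5 transposase to study chromatin accessibility. ATAC-seq can map chromatin accessibility genome-wide, revealing transcription factor binding sites and nucleosome positions. In medicine, ATAC-seq is a powerful new-generation tool for studying the pathogenesis of major diseases, mechanisms of drug action, new drug development, and the function of biomarkers. This article reviews the advantages of ATAC-seq and its applications and prospects in complex disease research, with the aim of providing a reference for research on the regulatory mechanisms of gene expression in human complex diseases.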

4.
Single-cell ATAC-seq detects open chromatin in individual cells. Currently data are sparse, but combining information from many single cells can identify determinants of cell-to-cell chromatin variation.

5.
6.
7.
Assay for transposase-accessible chromatin with high-throughput sequencing (ATAC-seq) is a technique widely used to investigate genome-wide chromatin accessibility. The recently published Omni-ATAC-seq protocol substantially improves the signal/noise ratio and reduces the input cell number. High-quality data are critical to ensure accurate analysis. Several tools have been developed for assessing sequencing quality and insertion size distribution for ATAC-seq data; however, key quality control (QC) metrics have not yet been established to accurately determine the quality of ATAC-seq data. Here, we optimized the analysis strategy for ATAC-seq and defined a series of QC metrics for ATAC-seq data, including reads under peak ratio (RUPr), background (BG), promoter enrichment (ProEn), subsampling enrichment (SubEn), and other measurements. We incorporated these QC tests into our recently developed ATAC-seq Integrative Analysis Package (AIAP) to provide a complete ATAC-seq analysis system, including quality assurance, improved peak calling, and downstream differential analysis. We demonstrated a significant improvement in sensitivity (20%–60%) in both peak calling and differential analysis by processing paired-end ATAC-seq datasets using AIAP. AIAP is packaged as a Docker/Singularity image and can be executed with a single command line to generate a comprehensive QC report. We used ENCODE ATAC-seq data to benchmark and generate QC recommendations, and developed qATACViewer for user-friendly interaction with the QC report. The software, source code, and documentation of AIAP are freely available at https://github.com/Zhang-lab/ATAC-seq_QC_analysis.
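
As a rough illustration of one of the QC metrics listed above, the sketch below computes a reads-under-peak ratio on toy intervals. It assumes that RUPr means the fraction of reads overlapping at least one called peak; the abstract does not give the definition, so AIAP's actual computation may differ. The function names, the single-chromosome simplification, and the half-open integer coordinates are all assumptions made for the example.

```python
from bisect import bisect_right

def merge_intervals(intervals):
    """Merge overlapping or touching half-open [start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged

def reads_under_peak_ratio(reads, peaks):
    """Fraction of reads that overlap any peak (assumed meaning of RUPr)."""
    if not reads:
        return 0.0
    merged = merge_intervals(peaks)
    starts = [s for s, _ in merged]
    hits = 0
    for r_start, r_end in reads:
        # Last merged peak that starts before the read ends; because merged
        # peaks are disjoint, it is the only one that can overlap the read.
        i = bisect_right(starts, r_end - 1) - 1
        if i >= 0 and merged[i][1] > r_start:
            hits += 1
    return hits / len(reads)

# Hypothetical toy data on one chromosome.
peaks = [(100, 200), (150, 300), (500, 600)]
reads = [(90, 110), (250, 260), (300, 350), (590, 640), (700, 750)]
print(reads_under_peak_ratio(reads, peaks))
```

In practice the reads and peaks would come from BAM and peak files and be grouped per chromosome before this kind of overlap count.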

8.
Genetic variants and de novo mutations in regulatory regions of the genome are typically discovered by whole-genome sequencing (WGS); however, WGS is expensive and most WGS reads come from non-regulatory regions. The Assay for Transposase-Accessible Chromatin (ATAC-seq) generates reads from regulatory sequences and could potentially be used as a low-cost ‘capture’ method for regulatory variant discovery, but its use for this purpose has not been systematically evaluated. Here we apply seven variant callers to bulk and single-cell ATAC-seq data and evaluate their ability to identify single nucleotide variants (SNVs) and insertions/deletions (indels). In addition, we develop an ensemble classifier, VarCA, which combines features from individual variant callers to predict variants. The Genome Analysis Toolkit (GATK) is the best-performing individual caller, with precision/recall on a bulk ATAC test dataset of 0.92/0.97 for SNVs and 0.87/0.82 for indels within ATAC-seq peak regions with at least 10 reads. On bulk ATAC-seq reads, VarCA achieves superior performance, with precision/recall of 0.99/0.95 for SNVs and 0.93/0.80 for indels. On single-cell ATAC-seq reads, VarCA attains precision/recall of 0.98/0.94 for SNVs and 0.82/0.82 for indels. In summary, ATAC-seq reads can be used to accurately discover non-coding regulatory variants in the absence of whole-genome sequencing data, and our ensemble method, VarCA, has the best overall performance.
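
The precision/recall figures quoted above follow the usual definitions, which can be sketched on a toy call set. The variant tuples and the `precision_recall` helper are hypothetical and are not taken from VarCA; the sketch only shows how such numbers are derived from a set of calls and a truth set.

```python
def precision_recall(called, truth):
    """Precision = TP / calls made; recall = TP / true variants."""
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall

# Hypothetical variants keyed by (chromosome, position, ref, alt).
truth = [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
         ("chr2", 40, "G", "A"), ("chr2", 90, "T", "C")]
called = [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
          ("chr2", 40, "G", "A"), ("chr3", 10, "A", "T")]

print(precision_recall(called, truth))
```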

9.
10.
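Research progress in assay for transposase-accessible chromatin with high-throughput sequencing (ATAC-seq)   Cited by: 1 (self-citations: 0; citations by others: 1)
Wu Jie, Quan Jianping, Ye Yong, Wu Zhenfang, Yang Jie, Yang Ming, Zheng Enqin. Hereditas (Beijing) (《遗传》), 2020, (4): 333-346
Assay for transposase-accessible chromatin with high-throughput sequencing (ATAC-seq), introduced in 2013, is faster, more sensitive, and simpler than deoxyribonuclease I hypersensitive site sequencing (DNase-seq) and micrococcal nuclease sequencing (MNase-seq), and is currently a popular technique for analyzing open chromatin regions genome-wide. The technique yields information on open chromatin regions, from which the binding regions of regulatory proteins such as transcription factors, as well as nucleosome positioning, can be mapped; it is therefore of great significance for studying the molecular mechanisms of epigenetics. This article compares the advantages and disadvantages of five techniques for profiling open chromatin regions, focuses on the principle and main workflow of ATAC-seq, and describes the development of ATAC-seq-based studies of open chromatin regions and related applications of ATAC-seq, in the hope of providing a reference for genome-wide studies of open chromatin regions in eukaryotes, the identification of cis-regulatory elements, and the dissection of genetic regulatory networks.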

11.
Chromatin immunoprecipitation sequencing (ChIP-seq) and the Assay for Transposase-Accessible Chromatin with high-throughput sequencing (ATAC-seq) have become essential technologies to effectively measure protein–DNA interactions and chromatin accessibility. However, there is a need for a scalable and reproducible pipeline that incorporates proper normalization between samples, correction of copy number variations, and integration of new downstream analysis tools. Here we present Containerized Bioinformatics workflow for Reproducible ChIP/ATAC-seq Analysis (CoBRA), a modularized computational workflow which quantifies ChIP-seq and ATAC-seq peak regions and performs unsupervised and supervised analyses. CoBRA provides a comprehensive state-of-the-art ChIP-seq and ATAC-seq analysis pipeline that can be used by scientists with limited computational experience. This enables researchers to gain rapid insight into protein–DNA interactions and chromatin accessibility through sample clustering, differential peak calling, motif enrichment, comparison of sites to a reference database, and pathway analysis. CoBRA is publicly available online at https://bitbucket.org/cfce/cobra

12.
Many flexible extensions of the Cox proportional hazards model incorporate time-dependent (TD) and/or nonlinear (NL) effects of time-invariant covariates. In contrast, little attention has been given to the assessment of such effects for continuous time-varying covariates (TVCs). We propose a flexible regression B-spline–based model for TD and NL effects of a TVC. To account for sparse TVC measurements, we added to this model the effect of time elapsed since last observation (TEL), which acts as an effect modifier. TD, NL, and TEL effects are estimated with the iterative alternative conditional estimation algorithm. Furthermore, a simulation extrapolation (SIMEX)-like procedure was adapted to correct the estimated effects for random measurement errors in the observed TVC values. In simulations, TD and NL estimates were unbiased if the TVC was measured at high frequency. With sparse measurements, the strength of the effects was underestimated, but the TEL estimate helped reduce the bias, and SIMEX further helped to correct for bias toward the null due to “white noise” measurement errors. We reassessed the effects of systolic blood pressure (SBP) and total cholesterol, measured at two-year intervals, on cardiovascular risks in women participating in the Framingham Heart Study. Accounting for TD effects of SBP, cholesterol and age, the NL effect of cholesterol, and the TEL effect of SBP substantially improved the model's fit to the data. Flexible estimates yielded clinically important insights regarding the role of these risk factors. These results illustrate the advantages of flexible modeling of TVC effects.

13.
Traveler's dilemma (TD) is one of the social dilemmas that has been well studied in the economics community, but it has attracted little attention in the physics community. The TD game is a two-person game. Each player can select an integer value between and () as a pure strategy. If both of them select the same value, the payoff to each of them will be that value. If the players select different values, say and (), then the payoff to the player who chooses the small value will be and the payoff to the other player will be . We term the player who selects a large value the cooperator, and the one who chooses a small value the defector, because if both of them select large values, the total payoff will be large. The Nash equilibrium of the TD game is to choose the smallest value . However, in previous behavioral studies, players in the TD game typically select values that are much larger than , and the average selected value exhibits an inverse relationship with . To explain such anomalous behavior, in this paper we study the evolution of cooperation in the spatial traveler's dilemma game, where the players are located on a square lattice and each player plays TD games with his neighbors. Players in our model can adopt their neighbors' strategies following two standard models of spatial game dynamics. Monte-Carlo simulation is applied to our model, and the results show that the cooperation level of the system, which is proportional to the average value of the strategies, decreases with increasing until is greater than the critical value where cooperation vanishes. Our findings indicate that spatial reciprocity promotes the evolution of cooperation in the TD game and that the spatial TD game model can explain the anomalous behavior observed in previous behavioral experiments.
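
The symbols for the payoff rule are missing from the abstract above, so the sketch below assumes the classic traveler's dilemma payoffs: equal claims pay each player the claim, and otherwise the lower claimant receives the lower claim plus a bonus while the other receives the lower claim minus the same amount. The parameter names `low`, `high` and `bonus` are hypothetical, and the paper's exact parameterization may differ. The sketch shows why repeated best responses unravel to the smallest allowed value, the Nash equilibrium mentioned in the abstract.

```python
def td_payoffs(claim_a, claim_b, bonus):
    """Payoffs (to A, to B) under the assumed classic traveler's dilemma rule."""
    if claim_a == claim_b:
        return claim_a, claim_b
    lower = min(claim_a, claim_b)
    if claim_a < claim_b:
        return lower + bonus, lower - bonus
    return lower - bonus, lower + bonus

def best_response(opponent_claim, low, high, bonus):
    """Claim in [low, high] that maximizes the player's own payoff against a fixed opponent."""
    return max(range(low, high + 1),
               key=lambda claim: td_payoffs(claim, opponent_claim, bonus)[0])

def iterate_best_responses(start, low, high, bonus):
    """Repeatedly best-respond to the previous claim until the claim stops changing."""
    claim = start
    while True:
        nxt = best_response(claim, low, high, bonus)
        if nxt == claim:
            return claim
        claim = nxt

print(td_payoffs(4, 5, bonus=2))
print(iterate_best_responses(100, low=2, high=100, bonus=2))
```

The spatial model in the abstract replaces this best-response reasoning with imitation of neighbors on a lattice, which is not reproduced here.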

14.
An open problem in the field of computational neuroscience is how to link synaptic plasticity to system-level learning. A promising framework in this context is temporal-difference (TD) learning. Experimental evidence that supports the hypothesis that the mammalian brain performs temporal-difference learning includes the resemblance of the phasic activity of the midbrain dopaminergic neurons to the TD error and the discovery that cortico-striatal synaptic plasticity is modulated by dopamine. However, as the phasic dopaminergic signal does not reproduce all the properties of the theoretical TD error, it is unclear whether it is capable of driving behavior adaptation in complex tasks. Here, we present a spiking temporal-difference learning model based on the actor-critic architecture. The model dynamically generates a dopaminergic signal with realistic firing rates and exploits this signal to modulate the plasticity of synapses as a third factor. The predictions of our proposed plasticity dynamics are in good agreement with experimental results with respect to dopamine, pre- and post-synaptic activity. An analytical mapping from the parameters of our proposed plasticity dynamics to those of the classical discrete-time TD algorithm reveals that the biological constraints of the dopaminergic signal entail a modified TD algorithm with self-adapting learning parameters and an adapting offset. We show that the neuronal network is able to learn a task with sparse positive rewards as fast as the corresponding classical discrete-time TD algorithm. However, the performance of the neuronal network is impaired with respect to the traditional algorithm on a task with both positive and negative rewards and breaks down entirely on a task with purely negative rewards. Our model demonstrates that the asymmetry of a realistic dopaminergic signal enables TD learning when learning is driven by positive rewards but not when driven by negative rewards.
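
For reference, the classical discrete-time TD algorithm that this abstract compares against can be sketched as tabular TD(0) on a toy chain. This is the textbook update, not the spiking actor-critic model: the chain task, the function name `td0_chain`, and the values of the learning rate `alpha` and discount `gamma` are assumptions chosen for illustration.

```python
def td0_chain(n_states=4, alpha=0.1, gamma=0.9, episodes=2000):
    """Tabular TD(0) on a deterministic chain 0 -> 1 -> ... -> terminal.

    The only reward (+1) is given on the transition into the terminal state,
    so the true value of state s is gamma ** (n_states - 1 - s).
    """
    values = [0.0] * (n_states + 1)  # last entry is the terminal state, fixed at 0
    for _ in range(episodes):
        for state in range(n_states):
            next_state = state + 1
            reward = 1.0 if next_state == n_states else 0.0
            # TD error: how much better or worse the outcome was than predicted.
            td_error = reward + gamma * values[next_state] - values[state]
            values[state] += alpha * td_error
    return values[:n_states]

print(td0_chain())
```

The TD error computed in the inner loop is the quantity that the phasic dopaminergic signal is hypothesized to resemble.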

15.
16.
17.
18.
19.
20.
Thiamine deficiency (TD) impairs hippocampal neurogenesis. However, the mechanisms involved have not been identified. In this work, a TD mouse model was generated using a thiamine-depleted diet, with two time points, TD9 and TD14, corresponding to 9 and 14 days of TD, respectively. The activities of pyruvate dehydrogenase (PDH), α-ketoglutarate dehydrogenase (KGDH), glucose-6-phosphate dehydrogenase (G6PD), and transketolase (TK), as well as the contents of NADP+ and NADPH, were determined in the whole brain, isolated cortex, and hippocampus of TD mice. The effects of TK silencing on the growth and migratory ability of cultured hippocampal progenitor cells (HPC), as well as on neuritogenesis of hippocampal neurons, were explored. The results showed that TD specifically reduced TK activity in both cortex and hippocampus, without significantly affecting the activities of PDH, KGDH, and G6PD in the TD9 and TD14 groups. The levels of whole-brain and hippocampal NADPH in the TD14 group were significantly lower than those of the control group. TK silencing significantly inhibited the proliferation, growth, and migratory abilities of cultured HPC, without affecting neuritogenesis of cultured hippocampal neurons. Taken together, these results demonstrate that decreased TK activity leads to pentose-phosphate pathway dysfunction and contributes to impaired hippocampal neurogenesis induced by TD. TK and the pentose-phosphate pathway may be considered new targets for investigating hippocampal neurogenesis.
