Similar literature
 20 similar articles found (search time: 46 ms)
1.
COHCAP (City of Hope CpG Island Analysis Pipeline) is an algorithm to analyze single-nucleotide resolution DNA methylation data produced by either an Illumina methylation array or targeted bisulfite sequencing. The goal of the COHCAP algorithm is to identify CpG islands that show a consistent pattern of methylation among CpG sites. COHCAP is currently the only DNA methylation package that provides integration with gene expression data to identify a subset of CpG islands that are most likely to regulate downstream gene expression, and it can generate lists of differentially methylated CpG islands with ∼50% concordance with gene expression from both cell line data and heterogeneous patient data. For example, this article describes known breast cancer biomarkers (such as estrogen receptor) with a negative correlation between DNA methylation and gene expression. COHCAP also provides visualization for quality control metrics, regions of differential methylation and correlation between methylation and gene expression. This software is freely available at https://sourceforge.net/projects/cohcap/.

2.
Copy number variation (CNV) is one of the most prevalent genetic variations in the genome, leading to an abnormal number of copies of moderate to large genomic regions. High-throughput technologies such as next-generation sequencing often identify thousands of CNVs involved in biological or pathological processes. Despite the growing demand to filter and classify CNVs by factors such as frequency in population, biological features, and function, surprisingly, no online web server for CNV annotations has been made available to the research community. Here, we present CNVannotator, a web server that accepts an input set of human genomic positions in a user-friendly tabular format. CNVannotator can perform genomic overlaps of the input coordinates using various functional features, including a list of the reported 356,817 common CNVs, 181,261 disease CNVs, as well as 140,342 SNPs from genome-wide association studies. In addition, CNVannotator incorporates 2,211,468 genomic features, including ENCODE regulatory elements, cytoband, segmental duplication, genome fragile site, pseudogene, promoter, enhancer, CpG island, and methylation site. For cancer research community users, CNVannotator can apply various filters to retrieve a subgroup of CNVs pinpointed in hundreds of tumor suppressor genes and oncogenes. In total, 5,277,234 unique genomic coordinates with functional features are available to generate an output in a plain text format that is free to download. In summary, we provide a comprehensive web resource for human CNVs. The annotated results along with the server can be accessed at http://bioinfo.mc.vanderbilt.edu/CNVannotator/.
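
The coordinate-overlap step mentioned in this abstract can be illustrated with a minimal sketch. The function name `overlaps`, the tuple layout and the half-open interval convention are assumptions made for illustration only; this is not CNVannotator's implementation, and the coordinates are hypothetical.

```python
from typing import List, Tuple

# (chromosome, start, end), half-open [start, end) -- an assumed convention
Interval = Tuple[str, int, int]


def overlaps(query: Interval, features: List[Interval]) -> List[Interval]:
    """Return the features that share at least one base with the query interval."""
    chrom, start, end = query
    return [
        (f_chrom, f_start, f_end)
        for f_chrom, f_start, f_end in features
        if f_chrom == chrom and f_start < end and start < f_end
    ]


# Hypothetical coordinates, for illustration only.
features = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
hits = overlaps(("chr1", 150, 550), features)
print(hits)
```

A linear scan like this is enough for small inputs; a server annotating millions of features would more plausibly use an indexed structure such as an interval tree.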

3.
Advances in biotechnology have resulted in large-scale studies of DNA methylation. A differentially methylated region (DMR) is a genomic region with multiple adjacent CpG sites that exhibit different methylation statuses among multiple samples. Many so-called “supervised” methods have been established to identify DMRs between two or more comparison groups. Methods for the identification of DMRs without reference to phenotypic information are, however, less well studied. An alternative “unsupervised” approach is proposed, in which DMRs in the studied samples are identified by taking into account the natural dependence structure of methylation measurements between neighboring probes from tiling arrays. Through a simulation study, we investigated the effects of dependencies between neighboring probes on determining DMRs, where many spurious signals would be produced if the methylation data were analyzed independently of the probe. In contrast, our newly proposed method could successfully correct for this effect with a well-controlled false positive rate and a comparable sensitivity. By applying it to two real datasets, we demonstrated that our method could provide a global picture of methylation variation in the studied samples. R source code implementing the proposed method is freely available at http://www.csjfann.ibms.sinica.edu.tw/eag/programlist/ICDMR/ICDMR.html.

4.
5.

Background

Whole genome sequencing of bisulfite converted DNA (‘methylC-seq’) provides comprehensive information on DNA methylation. An important application of these whole genome methylation maps is classifying each position as a methylated versus non-methylated nucleotide. A widely used current method for this purpose, the so-called binomial method, is intuitive and straightforward, but lacks power when the sequence coverage and the genome-wide methylation level are low. These problems present a particular challenge when analyzing sparsely methylated genomes, such as those of many invertebrates and plants.

Results

We demonstrate that the number of sequence reads per position from methylC-seq data displays a large variance and can be modeled as a shifted negative binomial distribution. We also show that DNA methylation levels of adjacent CpG sites are correlated, and this similarity in local DNA methylation levels extends several kilobases. Taking these observations into account, we propose a new method based on Bayesian classification to infer DNA methylation status while considering the neighborhood DNA methylation levels of a specific site. We show that our approach has higher sensitivity and better classification performance than the binomial method via multiple analyses, including computational simulations, Area Under Curve (AUC) analyses, and improved consistencies across biological replicates. This method is especially advantageous in the analyses of sparsely methylated genomes with low coverage.

Conclusions

Our method improves the existing binomial method for binary methylation calls by utilizing a posterior odds framework and incorporating local methylation information. This method should be widely applicable to the analyses of methylC-seq data from diverse sparsely methylated genomes. Bis-Class and example data are provided at a dedicated website (http://bibs.snu.ac.kr/software/Bisclass).

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-608) contains supplementary material, which is available to authorized users.
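
The binomial method that this abstract takes as its baseline can be sketched as follows. This is a minimal illustration, not the Bis-Class code: it assumes the common formulation in which a site is called methylated when its count of methylated reads is unlikely to arise from errors alone, and the names `binomial_tail` and `call_methylated`, the error rate of 0.01 and the cutoff of 0.01 are all hypothetical choices.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def call_methylated(meth_reads: int, total_reads: int,
                    error_rate: float = 0.01, alpha: float = 0.01) -> bool:
    """Call a site methylated when its methylated read count is unlikely under error alone."""
    if total_reads == 0:
        return False  # no coverage, so no call is made
    return binomial_tail(meth_reads, total_reads, error_rate) < alpha


# Same observed methylation level (25%), different coverage.
print(call_methylated(1, 4))   # False: 1 of 4 reads is compatible with error
print(call_methylated(5, 20))  # True: 5 of 20 reads is not
```

The two calls show the loss of power at low coverage that the abstract describes: the same methylation level yields a call only when enough reads support it.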

6.

Background

Epigenome-wide association scans (EWAS) are an increasingly powerful and widely-used approach to assess the role of epigenetic variation in human complex traits. However, this rapidly emerging field lacks dedicated visualisation tools that can display features specific to epigenetic datasets.

Result

We developed coMET, an R package and online tool for visualisation of EWAS results in a genomic region of interest. coMET generates a regional plot of epigenetic-phenotype association results and the estimated DNA methylation correlation between CpG sites (co-methylation), with further options to visualise genomic annotations based on ENCODE data, gene tracks, reference CpG-sites, and user-defined features. The tool can be used to display phenotype association signals and correlation patterns of microarray or sequencing-based DNA methylation data, such as Illumina Infinium 450k, WGBS, or MeDIP-seq, as well as other types of genomic data, such as gene expression profiles. The software is available as a user-friendly online tool from http://epigen.kcl.ac.uk/comet and as an R Bioconductor package. Source code, examples, and full documentation are also available from GitHub.

Conclusion

Our new software allows visualisation of EWAS results with functional genomic annotations and with estimation of co-methylation patterns. coMET is available to a wide audience as an online tool and R package, and can be a valuable resource to interpret results in the fast growing field of epigenetics. The software is designed for epigenetic data, but can also be applied to genomic and functional genomic datasets in any species.

7.
A key benefit of long-read nanopore sequencing technology is the ability to detect modified DNA bases, such as 5-methylcytosine. The lack of R/Bioconductor tools for the effective visualization of nanopore methylation profiles between samples from different experimental groups led us to develop the NanoMethViz R package. Our software can handle methylation output generated from a range of different methylation callers and manages large datasets using a compressed data format. To fully explore the methylation patterns in a dataset, NanoMethViz allows plotting of data at various resolutions. At the sample-level, we use dimensionality reduction to look at the relationships between methylation profiles in an unsupervised way. We visualize methylation profiles of classes of features such as genes or CpG islands by scaling them to relative positions and aggregating their profiles. At the finest resolution, we visualize methylation patterns across individual reads along the genome using the spaghetti plot and heatmaps, allowing users to explore particular genes or genomic regions of interest. In summary, our software makes the handling of methylation signal more convenient, expands upon the visualization options for nanopore data and works seamlessly with existing methylation analysis tools available in the Bioconductor project. Our software is available at https://bioconductor.org/packages/NanoMethViz.

8.
Spontaneous preterm birth (PTB, <37 weeks gestation) is a major public health concern, and children born preterm have a higher risk of morbidity and mortality throughout their lives. Recent studies suggest that fetal DNA methylation of several genes varies across a range of gestational ages (GA), but it is not yet clear if fetal epigenetic changes associate with PTB. The objective of this study is to interrogate methylation patterns across the genome in fetal leukocyte DNA from African Americans with early PTB (24 1/7–34 0/7 weeks; N = 22) or term births (39 0/7–40 6/7 weeks; N = 28) and to evaluate the association of each CpG site with PTB and GA. DNA methylation was assessed across the genome with the HumanMethylation450 BeadChip. For each individual sample and CpG site, the proportion of DNA methylation was estimated. The associations between methylation and PTB or GA were evaluated by fitting a separate linear model for each CpG site, adjusting for relevant covariates. Overall, 29 CpG sites associated with PTB (FDR < .05; 5.7×10^−10 < p < 2.9×10^−6) independent of GA. Also, 9637 sites associated with GA (FDR < .05; 9.5×10^−16 < p < 1.0×10^−3), with 61.8% decreasing in methylation with shorter GA. GA-associated CpG sites were depleted in the CpG islands of their respective genes (p < 2.2×10^−16). Gene set enrichment analysis (GSEA) supported enrichment of GA-associated CpG sites in genes that play a role in embryonic development as well as the extracellular matrix. Additionally, this study replicated the association of several CpG sites associated with gestational age in other studies (CRHBP, PIK3CD and AVP). Dramatic differences in fetal DNA methylation are evident in fetuses born preterm versus at term, and the patterns established at birth may provide insight into the long-term consequences associated with PTB.
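
Because a separate model is fitted for each CpG site, the per-site p-values have to be adjusted for multiple testing before the FDR < .05 cutoff is applied. The abstract does not name the FDR procedure; the sketch below assumes the Benjamini-Hochberg adjustment, and the function name and p-values are hypothetical.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


p = [0.001, 0.04, 0.03, 0.2]  # hypothetical per-CpG p-values
q = benjamini_hochberg(p)
significant = [i for i, value in enumerate(q) if value < 0.05]
print(q, significant)
```

With these four values only the first site passes the 0.05 threshold, although three of the raw p-values are below 0.05.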

9.
10.
11.
Smoking increases the risk of many diseases and could act through changes in DNA methylation patterns. The aims of this study were to determine the association between smoking and DNA methylation throughout the genome at cytosine-phosphate-guanine (CpG) site level and genomic regions. A discovery cross-sectional epigenome-wide association study nested in the follow-up of the REGICOR cohort was designed and included 645 individuals. Blood DNA methylation was assessed using the Illumina HumanMethylation450 BeadChip. Smoking status was self-reported using a standardized questionnaire. We identified 66 differentially methylated CpG sites associated with smoking, located in 38 genes. In most of these CpG sites, we observed a trend among those quitting smoking to recover methylation levels typical of never smokers. A CpG site located in a novel smoking-associated gene (cg06394460 in LNX2) was hypomethylated in current smokers. Moreover, we validated two previously reported CpG sites (cg05886626 in THBS1, and cg24838345 in MTSS1) for their potential relation to atherosclerosis and cancer diseases, using several different approaches: CpG site methylation, gene expression, and plasma protein level determinations. Smoking was also associated with higher THBS1 gene expression but with lower levels of thrombospondin-1 in plasma. Finally, we identified differential methylation regions in 13 genes and in four non-coding RNAs. In summary, this study replicated previous findings and identified and validated a new CpG site located in LNX2 associated with smoking.

12.
The DNMT3A and DNMT3B de novo DNA methyltransferases (DNMTs) are responsible for setting genomic DNA methylation patterns, a key layer of epigenetic information. Here, using an in vivo episomal methylation assay and extensive bisulfite methylation sequencing, we show that human DNMT3A and DNMT3B possess significant and distinct flanking sequence preferences for target CpG sites. Selection for high or low efficiency sites is mediated by the base composition at the −2 and +2 positions flanking the CpG site for DNMT3A, and at the −1 and +1 positions for DNMT3B. This intrinsic preference reproducibly leads to the formation of specific de novo methylation patterns characterized by up to 34-fold variations in the efficiency of DNA methylation at individual sites. Furthermore, analysis of the distribution of signature methylation hotspot and coldspot motifs suggests that DNMT flanking sequence preference has contributed to shaping the composition of CpG islands in the human genome. Our results also show that the DNMT3L stimulatory factor modulates the formation of de novo methylation patterns in two ways. First, DNMT3L selectively focuses the DNA methylation machinery on properly chromatinized DNA templates. Second, DNMT3L attenuates the impact of the intrinsic DNMT flanking sequence preference by providing a much greater boost to the methylation of poorly methylated sites, thus promoting the formation of broader and more uniform methylation patterns. This study offers insights into the manner by which DNA methylation patterns are deposited and reveals a new level of interplay between members of the de novo DNMT family.

13.
Prenatal stress has been widely associated with a number of short- and long-term pathological outcomes. Epigenetic mechanisms are thought to partially mediate these environmental insults into the fetal physiology. One of the main targets of developmental programming is the hypothalamic-pituitary-adrenal (HPA) axis as it is the main regulator of the stress response. Accordingly, an increasing number of researchers have recently focused on the putative association between DNA methylation at the glucocorticoid receptor gene (NR3C1) and prenatal stress, among other types of psychosocial stress. The current study aims to systematically review and meta-analyze the existing evidence linking several forms of prenatal stress with DNA methylation at the region 1F of the NR3C1 gene. The inclusion of relevant articles allowed combining empirical evidence from 977 individuals by meta-analytic techniques, whose methylation assessments showed overlap across 5 consecutive CpG sites (GRCh37/hg19 chr5:142,783,607-142,783,639). From this information, methylation levels at CpG site 36 displayed a significant correlation to prenatal stress (r = 0.14, 95% CI: 0.05–0.23, P = 0.002). This result supports the proposed association between a specific CpG site located at the NR3C1 promoter and prenatal stress. Several confounders, such as gender, methylation at other glucocorticoid-related genes, and adjustment for pharmacological treatments during pregnancy, should be taken into account in further studies.

14.
15.
Although CpG methylation clearly distributes genome-wide in vertebrate nuclear DNA, the state of methylation in the vertebrate mitochondrial genome has been unclear. Several recent reports using immunoprecipitation, mass spectrometry, and enzyme-linked immunosorbent assay methods concluded that human mitochondrial DNA (mtDNA) has much more than the 2 to 5% CpG methylation previously estimated. However, these methods do not provide information as to the sites or frequency of methylation at each CpG site. Here, we have used the more definitive bisulfite genomic sequencing method to examine CpG methylation in HCT116 human cells and primary human cells to independently answer these two questions. We found no evidence of CpG methylation at a biologically significant level in these regions of the human mitochondrial genome. Furthermore, unbiased next-generation sequencing of sodium bisulfite treated total DNA from HCT116 cells and analysis of genome-wide sodium bisulfite sequencing data sets from several other DNA sources confirmed this absence of CpG methylation in mtDNA. Based on our findings using regionally specific and genome-wide approaches with multiple human cell sources, we can definitively conclude that CpG methylation is absent in mtDNA. It is highly unlikely that CpG methylation plays any role in direct control of mitochondrial function.

16.
Reduced representation bisulfite sequencing (RRBS) was used to analyze DNA methylation patterns across the mouse brain genome in mice carrying a deletion of the Prader-Willi syndrome imprinting center (PWS-IC) on either the maternally- or paternally-inherited chromosome. Within the ∼3.7 Mb imprinted Angelman/Prader-Willi syndrome (AS/PWS) domain, 254 CpG sites were interrogated for changes in methylation due to PWS-IC deletion. Paternally-inherited deletion of the PWS-IC increased methylation levels ∼2-fold at each CpG site (compared to wild-type controls) at differentially methylated regions (DMRs) associated with 5′ CpG island promoters of paternally-expressed genes; these methylation changes extended, to a variable degree, into the adjacent CpG island shores. Maternal PWS-IC deletion yielded little or no changes in methylation at these DMRs, and methylation of CpG sites outside of promoter DMRs also was unchanged upon maternal or paternal PWS-IC deletion. Using stringent ascertainment criteria, ∼750,000 additional CpG sites were also interrogated across the entire mouse genome. This analysis identified 26 loci outside of the imprinted AS/PWS domain showing altered DNA methylation levels of ≥25% upon PWS-IC deletion. Curiously, altered methylation at 9 of these loci was a consequence of maternal PWS-IC deletion (maternal PWS-IC deletion by itself is not known to be associated with a phenotype in either humans or mice), and 10 of these loci exhibited the same changes in methylation irrespective of the parental origin of the PWS-IC deletion. These results suggest that the PWS-IC may affect DNA methylation at these loci by directly interacting with them, or may affect methylation at these loci through indirect downstream effects due to PWS-IC deletion. They further suggest the PWS-IC may have a previously uncharacterized function outside of the imprinted AS/PWS domain.

17.
18.
Spinal muscular atrophy (SMA) is a monogenic neurodegenerative disorder subdivided into four different types. Whole genome methylation analysis revealed 40 CpG sites associated with genes that are significantly differentially methylated between SMA patients and healthy individuals of the same age. To investigate the contribution of methylation changes to SMA severity, we compared the methylation levels of the identified CpG sites, designated as “targets”, as well as of the nearest CpG sites in the regulatory regions of ARHGAP22, CDK2AP1, CHML, NCOR2, SLC23A2 and RPL9 in three groups of SMA patients. Notably, compared to type I SMA male patients, a target CpG site and one nearby CpG site belonging to the 5’UTR of SLC23A2 were significantly hypomethylated (by 19–22%) in type III-IV patients. In contrast to type I SMA male patients, type III-IV patients demonstrated a 16% decrease in the methylation levels of a target CpG site belonging to the 5’UTR of NCOR2. To conclude, this study validates the data of our previous study and confirms that significant methylation changes in the SLC23A2 and NCOR2 regulatory regions correlate with SMA severity.

19.
20.
Environmentally induced epigenetic transgenerational inheritance of disease and phenotypic variation involves germline transmitted epimutations. The primary epimutations identified involve altered differential DNA methylation regions (DMRs). Different environmental toxicants have been shown to promote exposure (i.e., toxicant) specific signatures of germline epimutations. Analysis of genomic features associated with these epimutations identified low-density CpG regions (<3 CpG / 100bp) termed CpG deserts and a number of unique DNA sequence motifs. The rat genome was annotated for these and additional relevant features. The objective of the current study was to use a machine learning computational approach to predict all potential epimutations in the genome. A number of previously identified sperm epimutations were used as training sets. A novel machine learning approach using a sequential combination of Active Learning and Imbalance Class Learner analysis was developed. The transgenerational sperm epimutation analysis identified approximately 50K individual sites with a 1 kb mean size and 3,233 regions that had a minimum of three adjacent sites with a mean size of 3.5 kb. A select number of the most relevant genomic features were identified with the low density CpG deserts being a critical genomic feature of the features selected. A similar independent analysis with transgenerational somatic cell epimutation training sets identified a smaller number of 1,503 regions of genome-wide predicted sites and differences in genomic feature contributions. The predicted genome-wide germline (sperm) epimutations were found to be distinct from the predicted somatic cell epimutations. Validation of the genome-wide germline predicted sites used two recently identified transgenerational sperm epimutation signature sets from the pesticides dichlorodiphenyltrichloroethane (DDT) and methoxychlor (MXC) exposure lineage F3 generation. 
Analysis of this positive validation data set showed a 100% prediction accuracy for all the DDT-MXC sperm epimutations. Observations further elucidate the genomic features associated with transgenerational germline epimutations and identify a genome-wide set of potential epimutations that can be used to facilitate identification of epigenetic diagnostics for ancestral environmental exposures and disease susceptibility.
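
The CpG desert criterion quoted in this abstract (<3 CpG / 100bp) can be made concrete with a small sketch. The function names and the test sequences are hypothetical, and the abstract does not say how density windows were defined, so this simply scores a whole sequence.

```python
def cpg_per_100bp(sequence: str) -> float:
    """Number of CG dinucleotides per 100 bp of the given sequence."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    # "CG" cannot overlap itself, so str.count gives the exact dinucleotide count.
    return seq.count("CG") * 100 / len(seq)


def is_cpg_desert(sequence: str, threshold: float = 3.0) -> bool:
    """True when CpG density falls below the threshold (default: 3 CpG per 100 bp)."""
    return cpg_per_100bp(sequence) < threshold


dense = "ACGT" * 25           # 100 bp containing 25 CpG dinucleotides
sparse = "ATCG" + "AT" * 48   # 100 bp containing a single CpG
print(cpg_per_100bp(dense), is_cpg_desert(dense))
print(cpg_per_100bp(sparse), is_cpg_desert(sparse))
```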

