Similar literature
 20 similar articles found (search time: 15 ms)
1.
D Wang  Y Zhang  Y Huang  P Li  M Wang  R Wu  L Cheng  W Zhang  Y Zhang  B Li  C Wang  Z Guo 《Gene》2012,506(1):36-42
Some researchers normalize DNA methylation array data to remove technical artifacts introduced by experimental differences in sample preparation, array processing and other factors. Others analyze DNA methylation arrays without normalization, on the grounds that current normalization methods may distort real differences between normal and cancer samples, because cancer genomes may be extensively hypomethylated and the total amount of CpG methylation might differ substantially among samples. In this study, using eight datasets generated with the Infinium HumanMethylation27 assay, we systematically analyzed the global distribution of DNA methylation changes in cancer compared with normal controls and its effect on data normalization for selecting differentially methylated (DM) genes. We showed that more DM genes could be found in the Quantile/Lowess-normalized data than in the non-normalized data. The DM genes additionally selected in the Quantile/Lowess-normalized data showed significantly consistent methylation states in another independent dataset for the same cancer, indicating that these extra DM genes were effective biological signals related to the disease. These results suggest that normalization can increase the power of detecting DM genes in the context of diagnostic markers, which are usually characterized by relatively large effect sizes. We also evaluated the reproducibility of DM discoveries for a particular cancer type and found that most of the DM genes additionally detected in one dataset showed the same methylation directions in the other dataset for the same cancer type, indicating that these DM genes were effective biological signals in the other dataset. Furthermore, we showed that some DM genes detected in different studies of a particular cancer type were significantly reproducible at the functional level.
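The quantile normalization mentioned in this abstract can be illustrated with a minimal sketch. It assumes the textbook rank-based procedure, with ties broken by position; it is not the authors' pipeline, and the function name `quantile_normalize` and the toy beta values are hypothetical.

```python
from statistics import mean


def quantile_normalize(samples):
    """Quantile-normalize equal-length samples (one list of values per sample).

    Each value is replaced by the mean, across samples, of the values that
    share its rank. Ties are broken by position, which is a simplification.
    """
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all samples must have the same number of probes")
    # Reference distribution: mean of the k-th smallest value over samples.
    sorted_samples = [sorted(s) for s in samples]
    reference = [mean(s[k] for s in sorted_samples) for k in range(n)]
    normalized = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])  # probe indices by rank
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]
        normalized.append(out)
    return normalized


# Hypothetical beta values for three probes in two samples.
beta_a = [0.10, 0.80, 0.40]
beta_b = [0.30, 0.90, 0.20]
norm_a, norm_b = quantile_normalize([beta_a, beta_b])
```

After normalization both samples share the same set of values and differ only in which probe holds which value. This forced equality of distributions is the property the abstract says some researchers worry about when global methylation truly differs between samples.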

2.
Aberrant DNA methylation in the blood of patients with major depressive disorder (MDD) has been reported in several previous studies. However, no comprehensive studies using medication-free subjects with MDD have been conducted. Furthermore, the majority of these previous studies have been limited to the analysis of CpG sites in CpG islands (CGIs) in gene promoter regions. The main aim of the present study is to identify DNA methylation markers that distinguish patients with MDD from non-psychiatric controls. Genome-wide DNA methylation profiling of peripheral leukocytes was conducted in two sets of samples, a discovery set (20 medication-free patients with MDD and 19 controls) and a replication set (12 medication-free patients with MDD and 12 controls), using Infinium HumanMethylation450 BeadChips. Significant diagnostic differences in DNA methylation were observed at 363 CpG sites in the discovery set. All of these loci demonstrated lower DNA methylation in patients with MDD than in the controls, and most of them (85.7%) were located in CGIs in gene promoter regions. We were able to distinguish patients with MDD from the control subjects with high accuracy in a discriminant analysis using the top DNA methylation markers. We also validated these selected DNA methylation markers in the replication set. Our results indicate that multiplex DNA methylation markers may be useful for distinguishing patients with MDD from non-psychiatric controls.

3.
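Liu Yang  Wang Liru  Zhang Yan 《生物信息学》 (Chinese Journal of Bioinformatics) 2021,19(4):240-248
This study aimed to identify prognosis-related subtypes of colon adenocarcinoma by analyzing DNA methylation profiles. Methylation data for colon adenocarcinoma patients were obtained from the TCGA database. CpG sites significantly associated with prognosis were selected through differential methylation analysis and Cox proportional hazards regression modeling, and seven subtypes were identified by consensus clustering. Survival analysis and tests of clinical characteristics showed that prognosis differed significantly among the seven subtypes and that subtype features were reflected in multiple clinical characteristics. In addition, a prediction model based on SMO (sequential minimal optimization), built from the differentially methylated sites identified among the seven subtypes, achieved high AUC values for every subtype and was validated on a test set. In summary, this study used bioinformatics algorithms to identify seven colon adenocarcinoma subtypes with different prognoses and mined their specific methylation markers. The results may allow the prognosis of colon adenocarcinoma to be assessed more precisely and offer new ideas for early diagnosis and treatment planning.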

4.
5.
Redundancy Analysis (RDA) is a well‐known method used to describe the directional relationship between related data sets. Recently, we proposed sparse Redundancy Analysis (sRDA) for high‐dimensional genomic data analysis to find the explanatory variables that explain the most variance of the response variables. As more and more biomolecular data become available from different biological levels, such as genotypic and phenotypic data from different omics domains, a natural research direction is to apply an integrated analysis approach in order to explore the underlying biological mechanism of certain phenotypes of the given organism. We show that the multiset sparse Redundancy Analysis (multi‐sRDA) framework is a promising candidate for high‐dimensional omics data analysis, since it accounts for the directional information transfer between omics sets and its sparse solutions improve the interpretability of the result. In this paper, we also describe a software implementation for multi‐sRDA, based on the Partial Least Squares Path Modeling algorithm. We test our method through simulation and real omics data analysis with data sets of 364,134 methylation markers, 18,424 gene expression markers, and 47 cytokine markers measured on 37 patients with Marfan syndrome.

6.
7.
Epigenome-wide association studies (EWAS) have focused primarily on DNA methylation as a chemically stable and functional epigenetic modification. However, the stability and accuracy of methylation measurements in different tissues and extraction types are still being actively studied, and the longitudinal stability of DNA methylation in commonly studied peripheral tissues is of great interest. Here, we used data from two studies, three tissue types, and multiple time points to assess the stability of DNA methylation measured with the Illumina Infinium HumanMethylation450 BeadChip array. Redundancy analysis enabled visual assessment of the overall agreement of replicate samples and showed good agreement after removing the effects of tissue type, age, and sex. At the probe level, analysis of variance contrasts separating technical and biological replicates clearly showed better agreement between technical replicates than between longitudinal samples, and suggested increased stability for buccal cells versus blood or blood spots. Intraclass correlations (ICCs) demonstrated that inter-individual variability is of similar magnitude to within-sample variability at many probes; however, as inter-individual variability increased, so did ICC. Furthermore, we were able to demonstrate decreasing agreement in methylation levels with time, despite a maximal sampling interval of only 576 days. Finally, at 6 popular candidate genes, there was a large range of stability across probes. Our findings highlight important sources of technical and biological variation in DNA methylation across different tissues over time. These data will help to inform longitudinal sampling strategies of future EWAS.
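The observation that ICC rises with inter-individual variability can be illustrated with a small sketch. The abstract does not say which ICC form was used, so this assumes the one-way random-effects ICC(1,1); the function name `icc_oneway` and the toy methylation values are hypothetical.

```python
from statistics import mean


def icc_oneway(subjects):
    """One-way random-effects ICC(1,1); `subjects` holds k repeats per subject."""
    n = len(subjects)
    k = len(subjects[0])
    if n < 2 or k < 2 or any(len(s) != k for s in subjects):
        raise ValueError("need >= 2 subjects, each with the same k >= 2 repeats")
    grand = mean(v for s in subjects for v in s)
    subject_means = [mean(s) for s in subjects]
    # Between-subject and within-subject mean squares from a one-way ANOVA.
    msb = k * sum((m - grand) ** 2 for m in subject_means) / (n - 1)
    msw = sum(
        (v - m) ** 2 for s, m in zip(subjects, subject_means) for v in s
    ) / (n * (k - 1))
    denom = msb + (k - 1) * msw
    if denom == 0:
        raise ValueError("no variation in the data; ICC is undefined")
    return (msb - msw) / denom


# Two hypothetical probes with identical within-subject noise:
# one barely varies between subjects, the other varies strongly.
probe_low = [[0.50, 0.52], [0.51, 0.49], [0.50, 0.51]]
probe_high = [[0.20, 0.22], [0.51, 0.49], [0.80, 0.81]]
icc_low = icc_oneway(probe_low)
icc_high = icc_oneway(probe_high)
```

With the same measurement noise, the probe with little between-subject spread gets a low (here negative) ICC, while the strongly varying probe gets an ICC close to 1. A low ICC can therefore reflect limited inter-individual variability and not necessarily poor measurement.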

8.
Mariza de Andrade  E. Warwick Daw  Aldi T. Kraja  Virginia Fisher  Lan Wang  Ke Hu  Jing Li  Razvan Romanescu  Jenna Veenstra  Rui Sun  Haoyi Weng  Wenda Zhou 《BMC genetics》2018,19(1):119-125
Background

GAW20 working group 5 brought together researchers who contributed 7 papers with the aim of evaluating methods to detect genetic by epigenetic interactions. GAW20 distributed real data from the Genetics of Lipid Lowering Drugs and Diet Network (GOLDN) study, including single-nucleotide polymorphism (SNP) markers, methylation (cytosine-phosphate-guanine [CpG]) markers, and phenotype information on up to 995 individuals. In addition, a simulated data set based on the real data was provided.

Results

The 7 contributed papers analyzed these data sets with a number of different statistical methods, including generalized linear mixed models, mediation analysis, machine learning, W-test, and sparsity-inducing regularized regression. These methods generally appeared to perform well. Several papers confirmed a number of causative SNPs in either the large number of simulation sets or the real data on chromosome 11. Findings were also reported for different SNPs, CpG sites, and SNP–CpG site interaction pairs.

Conclusions

In the simulation (200 replications), power appeared generally good for large interaction effects, but smaller effects will require larger studies or consortium collaboration for realizing a sufficient power.


9.
Many epigenetic association studies have attempted to identify DNA methylation markers in blood that are able to mirror those in target tissues. Although some have suggested potential utility of surrogate epigenetic markers in blood, few studies have collected data to directly compare DNA methylation across tissues from the same individuals. Here, epigenomic data were collected from adipose tissue and blood in 143 subjects using the Illumina HumanMethylation450 BeadChip array. The top axis of epigenome-wide variation differentiates adipose tissue from blood, which is confirmed internally using cross-validation and externally with independent data from the two tissues. We identified 1,285 discordant genes and 1,961 concordant genes between blood and adipose tissue. RNA expression data for the two classes of genes show patterns consistent with those observed in DNA methylation. The discordant genes are enriched in biological functions related to immune response, leukocyte activation or differentiation, and blood coagulation. We distinguish the CpG-specific correlation from the within-subject correlation and emphasize that the magnitude of within-subject correlation does not guarantee the utility of surrogate epigenetic markers. The study reinforces the critical role of DNA methylation in regulating gene expression and cellular phenotypes across tissues, and highlights the caveats of using methylation markers in blood to mirror the corresponding profile in the target tissue.

10.
Introduction

Advances in high-throughput technologies have generated diverse informative molecular markers for cancer outcome prediction. Long non-coding RNA (lncRNA) and DNA methylation, as new classes of promising markers, are emerging as key molecules in human cancers; however, the prognostic utility of such diverse molecular data remains to be explored.

Results

Using the IDFO approach, we obtained good predictive performance of the molecular datasets (bootstrap accuracy: 0.71–0.97) in five cancer types. Notably, lncRNA was identified as the best prognostic predictor in the validated cohorts of four cancer types, followed by DNA methylation, mRNA, and then microRNA. We found that incorporating multiple types of molecular data gave predictive power similar to that of single-type data, except for the lncRNA + DNA methylation combinations in two cancers. Survival analysis with proportional hazards models confirmed that lncRNA and DNA methylation are highly robust prognostic factors, independent of traditional clinical variables.

Conclusion

Our study provides insight into the prognostic performance of diverse molecular data, both singly and in combination, and may serve as a reference for subsequent related studies.

12.
The genetic diversity of Cistus ladanifer ssp. ladanifer (Cistaceae) growing on ultramafic and non-ultramafic (basic and schist) soils in the NE of Portugal was studied in order to identify molecular markers that could distinguish the metal-tolerant ecotypes of this species. Random Amplified Polymorphic DNA (RAPD) markers were used to estimate genetic variation and differences between populations. The RAPD dataset was analysed by means of a cluster analysis and an analysis of molecular variance (AMOVA). Our results indicate a significant partitioning of molecular variance between ultramafic and non-ultramafic populations of Cistus ladanifer, although the highest percentage of this variance was found at the intra-population level. Mantel's test showed no relationship between inter-population genetic and geographic distances. A series of RAPD bands that could be related to heavy metal tolerance were observed. The identification of such markers will enable the use of Cistus ladanifer in phytoremediation procedures.

13.
Diagnostic or screening tests are widely used in medical fields to classify patients according to their disease status. Several statistical models for meta‐analysis of diagnostic test accuracy studies have been developed to synthesize the sensitivity and specificity of a diagnostic test of interest. Because test sensitivity and specificity are correlated, modeling the two measures with a bivariate model is recommended. In this paper, we extend the current standard bivariate linear mixed model (LMM) by proposing two variance‐stabilizing transformations: the arcsine square root and the Freeman–Tukey double arcsine transformation. We compared the performance of the proposed methods with the standard method through simulations using several performance measures. The simulation results showed that our proposed methods performed better than the standard LMM in terms of bias, root mean square error, and coverage probability in most of the scenarios, even when data were generated assuming the standard LMM. We also illustrated the methods using two real data sets.
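The two variance-stabilizing transformations named in this abstract can be written down in a few lines. This is a minimal sketch: the Freeman–Tukey version below uses the original two-term sum without a factor of 1/2, which is an assumption because the abstract does not state the paper's exact parameterization, and the function names and counts are hypothetical.

```python
import math


def arcsine_sqrt(events, n):
    """Arcsine square root transform of the proportion events / n."""
    return math.asin(math.sqrt(events / n))


def freeman_tukey(events, n):
    """Freeman-Tukey double arcsine transform (two-term sum, no 1/2 factor)."""
    return (math.asin(math.sqrt(events / (n + 1)))
            + math.asin(math.sqrt((events + 1) / (n + 1))))


# Hypothetical study: 45 true positives among 50 diseased patients,
# so the observed sensitivity is 0.9.
sens_asin = arcsine_sqrt(45, 50)
sens_ft = freeman_tukey(45, 50)
```

Unlike the logit, both transforms stay finite when a study reports a sensitivity or specificity of exactly 0 or 1, so no continuity correction is needed for such studies.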

14.
In recent years it has become apparent that epigenetic events are potentially as responsible for cancer initiation and progression as genetic abnormalities. DNA methylation is the main epigenetic modification in humans. Two DNA methylation lesions coexist in human neoplasms: hypermethylation of promoter regions of specific genes within a context of genomic hypomethylation. Aberrant methylation is found at early stages of carcinogenesis, and distinct types of cancer exhibit specific patterns of methylation changes. Tumor-specific DNA is readily obtainable from different clinical samples, and methylation status analysis often permits sensitive disease detection. Methylation markers may also serve prognostic and predictive purposes, as they often reflect metastatic potential and sensitivity to therapy. As current findings show great potential for recently characterised methylation markers, more studies in the field are needed, especially large clinical studies of newly developed markers. The review describes the diagnostic potential of DNA methylation markers.

15.
Wu G  Yi N  Absher D  Zhi D 《PloS one》2011,6(6):e21034

Background/Aims

Recently, next-generation sequencing-based technologies have enabled DNA methylation profiling at high resolution and low cost. Methyl-Seq and Reduced Representation Bisulfite Sequencing (RRBS) are two such technologies that interrogate methylation levels at CpG sites throughout the entire human genome. With rapid reduction of sequencing costs, these technologies will enable epigenotyping of large cohorts for phenotypic association studies. Existing quantification methods for sequencing-based methylation profiling are simplistic and do not deal with the noise due to the random sampling nature of sequencing and various experimental artifacts. Therefore, there is a need to investigate the statistical issues related to the quantification of methylation levels for these emerging technologies, with the goal of developing an accurate quantification method.

Methods

In this paper, we propose two methods for Methyl-Seq quantification. The first method, the Maximum Likelihood estimate, is both conceptually intuitive and computationally simple. However, this estimate is biased at extreme methylation levels and does not provide variance estimation. The second method, based on Bayesian hierarchical model, allows variance estimation of methylation levels, and provides a flexible framework to adjust technical bias in the sequencing process.

Results

We compare the previously proposed binary method, the Maximum Likelihood (ML) method, and the Bayesian method. In both simulation and real data analysis of Methyl-Seq data, the Bayesian method offers the most accurate quantification. The ML method is slightly less accurate than the Bayesian method, but both of our proposed methods outperform the original binary method in Methyl-Seq. In addition, we applied these quantification methods to simulation data and show that, with sequencing depth above 40–300 per cleavage site (the threshold varies with tissue sample), Methyl-Seq offers quantification consistency comparable to that of microarrays.
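The contrast drawn in the Methods paragraph, a point estimate without a variance versus a Bayesian estimate with one, can be illustrated on a deliberately simplified model. The sketch assumes plain binomial read counts at a single site with a Beta(1, 1) prior; it is not the Methyl-Seq likelihood or the hierarchical model of the paper, and all names and counts are hypothetical.

```python
def ml_estimate(methylated, total):
    """Binomial maximum likelihood estimate of the methylation level."""
    if total <= 0:
        raise ValueError("total read count must be positive")
    return methylated / total


def beta_posterior(methylated, total, a=1.0, b=1.0):
    """Posterior mean and variance under a Beta(a, b) prior (assumed prior)."""
    a_post = a + methylated
    b_post = b + (total - methylated)
    s = a_post + b_post
    post_mean = a_post / s
    post_var = a_post * b_post / (s * s * (s + 1.0))
    return post_mean, post_var


# Hypothetical low-coverage site: 0 methylated reads out of 5.
ml = ml_estimate(0, 5)
post_mean, post_var = beta_posterior(0, 5)
```

At this site the ML estimate is exactly 0 and carries no uncertainty, whereas the posterior mean is pulled away from the boundary and comes with a variance that shrinks as coverage grows.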

16.
During differentiation, in vitro organogenesis calls for the adjustment of the gene expression program toward a new fate. A role for epigenetic mechanisms including DNA methylation has been suggested, but little is known about the loci affected by DNA methylation changes, particularly in agronomic plants for which in vitro technologies are useful, such as sugar beet. Here, three pairs of organogenic and non-organogenic in vitro cell lines originating from different sugar beet (Beta vulgaris altissima) cultivars were used to assess the dynamics of DNA methylation at the global or genic levels during shoot or root regeneration. The restriction landmark genome scanning for methylation approach was applied to provide a direct quantitative epigenetic assessment of several CG methylated genes without prior knowledge of gene sequence, which makes it particularly suitable for studies on crop plants without a fully sequenced genome. The cloned sequences had putative roles in cell proliferation or differentiation, or unknown functions, and displayed organ-specific DNA methylation polymorphism and changes in expression during in vitro organogenesis. Among them, a potential ubiquitin extension protein 6 (UBI6) was shown, in different cultivars, to exhibit repeatable variations of DNA methylation and gene expression during shoot regeneration. In addition, abnormal development and callogenesis were observed in a T-DNA insertion mutant (ubi6) for a homologous sequence in Arabidopsis. Our data showed that DNA methylation changes in an organ-specific way for genes that exhibit variations of expression and play a potential role during organogenesis. These epialleles could be conserved between parental lines, opening perspectives for molecular markers.

17.
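Objective: To explore a method for methylation analysis of specific sequences using the MassArray (molecular weight array) platform. Methods: The characteristics of methylation analysis on the MassArray platform were compared by mass spectrometry under different amplification conditions, amplification efficiencies, and sample treatments. Results: Comparison with bisulfite sequencing results confirmed that the MassArray platform reflects the true level of methylation. Results for PCR amplification efficiency and methylation analysis under different conditions showed that amplification efficiency is the key factor limiting methylation analysis on the MassArray platform, whereas product storage time and the different treatments had no obvious effect on the methylation analysis. Conclusion: The MassArray methylation analysis platform is an efficient and rapid platform for detecting methylation; when using it, reasonable quality control should be applied according to the actual experimental conditions.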

18.
DNA pooling is a potential tool for the efficient analysis of the large numbers of samples and DNA markers that are necessary for genome-wide association studies. A simple, accurate method for measuring total allele differences in comparisons between two pools containing large numbers of DNA samples is presented. This method compares relative peak height differences between electrophoretograms for each allele of a microsatellite. The method was evaluated by the analysis of 11 microsatellite markers and DNA pooled sample sizes of 50, 100, and 200 individual DNA samples from the same number of different subjects. Pools were created from previously individually genotyped subjects and constructed so that the pool comparisons would provide real total allele differences varying from 0% to 55%. Calculated pool differences were then compared with the real total allele differences determined from individual genotyping results. Together, over 200 comparisons demonstrated a correlation coefficient of 0.96, which compared favorably with previous methods of analysis. This method could provide a rapid screen for total allele differences of greater than 10%, a threshold that should be applicable to detecting low relative risk genes in common diseases. These studies therefore suggest that DNA pooling could be a useful tool in association studies for the determination of candidate regions for a range of complex genetic diseases.

19.
We analyzed the genetic diversity and population genetic structure of four artificial populations of wild barley (Hordeum brevisubulatum); 96 plants collected from the Songnen Prairie in northeastern China were analyzed using amplified fragment length polymorphism (AFLP), specific-sequence amplified polymorphism (SSAP) and methylation-sensitive amplified polymorphism (MSAP) markers. Indices of (epi-)genetic diversity, (epi-)genetic distance, gene flow, genotype frequency, cluster analysis, PCA and AMOVA generated from MSAP, AFLP and SSAP markers showed the same trend. We found a high level of correlation in the artificial populations between MSAP, SSAP and AFLP markers by the Mantel test (r > 0.8). This is incongruent with previous findings showing that there is virtually no correlation between DNA methylation polymorphism and classical genetic variation; the high level of genetic polymorphism could be a result of epigenetic regulation. We compared our results with data from natural populations. The population diversity of the artificial populations was lower. However, in contrast to the AFLP and SSAP results, the MSAP results showed that the methylation polymorphism of the artificial populations was not significantly reduced. This leads us to suggest that the DNA methylation pattern change in H. brevisubulatum populations is not only related to DNA sequence variation, but is also regulated by other controlling systems.

20.
MOTIVATION: Methylation of cytosines in DNA plays an important role in the regulation of gene expression, and the analysis of methylation patterns is fundamental for the understanding of cell differentiation, aging processes, diseases and cancer development. Such analysis has been limited, because technologies for detailed and efficient high-throughput studies have not been available. We have developed a novel quantitative methylation analysis algorithm and workflow based on direct DNA sequencing of PCR products from bisulfite-treated DNA with high-throughput sequencing machines. This technology is a prerequisite for the success of the Human Epigenome Project, the first large genome-wide sequencing study of DNA methylation in many different tissues. Methylation in tissue samples, which are mixtures of different cells, is quantitative information represented by cytosine/thymine proportions after bisulfite conversion of unmethylated cytosines to uracil and PCR. Calculating quantitative methylation information from base proportions represented by different dye signals in four-dye sequencing trace files requires a specific algorithm that handles imbalanced and overscaled signals, incomplete conversion, quality problems and basecaller artifacts. RESULTS: The algorithm we developed has several key properties: it analyzes trace files from PCR products of bisulfite-treated DNA sequenced directly on ABI machines; it yields quantitative methylation measurements for individual cytosine positions after alignment with genomic reference sequences, signal normalization and estimation of the effectiveness of bisulfite treatment; it works in a fully automated pipeline including data quality monitoring; it is efficient and avoids the usual cost of multiple sequencing runs on subclones to estimate DNA methylation. The power of our new algorithm is demonstrated with data from two test systems based on mixtures with known base compositions and defined methylation. In addition, the applicability is proven by identifying CpGs that are differentially methylated in real tissue samples.
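The idea of reading methylation from cytosine/thymine proportions, with a correction for incomplete bisulfite conversion, can be sketched under a simple assumed model: a fraction `conversion_rate` of unmethylated cytosines is converted, so the observed C fraction equals m + (1 - m)(1 - conversion_rate). This is an illustration only, not the published trace-file algorithm; the function name and signal values are hypothetical.

```python
def methylation_fraction(c_signal, t_signal, conversion_rate=1.0):
    """Estimate methylation at one cytosine from C and T signal intensities.

    Assumed model: observed C fraction = m + (1 - m) * (1 - conversion_rate).
    Solving for m gives the correction applied below.
    """
    total = c_signal + t_signal
    if total <= 0:
        raise ValueError("C and T signals must sum to a positive value")
    if not 0.0 < conversion_rate <= 1.0:
        raise ValueError("conversion_rate must be in (0, 1]")
    raw = c_signal / total
    corrected = (raw - (1.0 - conversion_rate)) / conversion_rate
    # Noise can push the corrected value slightly outside [0, 1]; clamp it.
    return min(1.0, max(0.0, corrected))


# Hypothetical normalized peak heights at one CpG position.
uncorrected = methylation_fraction(32.0, 68.0)
corrected = methylation_fraction(32.0, 68.0, conversion_rate=0.98)
```

With complete conversion the estimate is simply C / (C + T); with 98% conversion the same signals give a slightly lower value, because some of the cytosine signal comes from unconverted, unmethylated bases.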
