Similar articles
 20 similar articles found (search time: 15 ms)
1.
2.
3.

Background

Adenocarcinomas located near the gastroesophageal junction have unclear etiology and are difficult to classify. We used DNA methylation analysis to identify subtype-specific markers and new subgroups of gastroesophageal adenocarcinomas, and studied their association with epidemiological risk factors and clinical outcomes.

Methodology/Principal Findings

We used logistic regression models and unsupervised hierarchical cluster analysis of 74 DNA methylation markers on 45 tumor samples (44 patients) of esophageal and gastric adenocarcinomas obtained from a population-based case-control study to uncover epigenetic markers and cluster groups of gastroesophageal adenocarcinomas. No distinct epigenetic differences were evident between subtypes of gastric and esophageal cancers. However, we identified two gastroesophageal adenocarcinoma subclusters based on DNA methylation profiles. Group membership was best predicted by GATA5 DNA methylation status. We analyzed the associations between these two epigenetic groups and exposure using logistic regression, and the associations with survival time using Cox regression in a larger set of 317 tumor samples (278 patients). There were more males with esophageal and gastric cardia cancers in Cluster Group 1 characterized by higher GATA5 DNA methylation values (all p<0.05). This group also showed associations of borderline statistical significance with having ever smoked (p-value = 0.07), high body mass index (p-value = 0.06), and symptoms of gastroesophageal reflux (p-value = 0.07). Subjects in cluster Group 1 showed better survival than those in Group 2 after adjusting for tumor differentiation grade, but this was not found to be independent of tumor stage.
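The unsupervised hierarchical clustering step described above can be illustrated with a minimal sketch. The abstract does not state the distance metric or linkage rule, so Euclidean distance and average linkage are assumptions here, and the function names and toy methylation values are hypothetical.

```python
from math import dist


def average_linkage(c1, c2, data):
    """Mean Euclidean distance between all sample pairs of two clusters."""
    return sum(dist(data[i], data[j]) for i in c1 for j in c2) / (len(c1) * len(c2))


def hierarchical_clusters(data, k=2):
    """Agglomerative clustering: merge the two closest clusters until k remain.

    data is a list of samples, each a list of methylation values per marker.
    Returns k lists of sample indices.
    """
    clusters = [[i] for i in range(len(data))]
    while len(clusters) > k:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = average_linkage(clusters[x], clusters[y], data)
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return sorted(sorted(c) for c in clusters)


# Toy example: four samples, two markers (values are made up).
samples = [[0.90, 0.80], [0.85, 0.90], [0.10, 0.20], [0.15, 0.10]]
groups = hierarchical_clusters(samples, k=2)
```

Cutting the tree at two clusters mirrors the two subclusters reported in the abstract; a real analysis would also have to choose how many clusters to retain.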

Conclusions/Significance

DNA methylation profiling can be used in population-based studies to identify epigenetic subclasses of gastroesophageal adenocarcinomas and class-specific DNA methylation markers that can be linked to epidemiological data and clinical outcome. Two new epigenetic subgroups of gastroesophageal adenocarcinomas were identified that differ to some extent in their survival rates, risk factors of exposure, and GATA5 DNA methylation.

4.

Background

Protein sequence profile-profile alignment is an important approach to recognizing remote homologs and generating accurate pairwise alignments. It plays an important role in protein sequence database search, protein structure prediction, protein function prediction, and phylogenetic analysis.

Results

In this work, we integrate predicted solvent accessibility, torsion angles and evolutionary residue coupling information with the pairwise Hidden Markov Model (HMM) based profile alignment method to improve profile-profile alignments. The evaluation results demonstrate that adding predicted relative solvent accessibility and torsion angle information improves the accuracy of profile-profile alignments. The evolutionary residue coupling information is helpful in some cases, but its contribution to the improvement is not consistent.

Conclusion

Incorporating new structural information, such as predicted solvent accessibility and torsion angles, into profile-profile alignment is a useful way to improve pairwise profile-profile alignment methods.

5.

Background

Guide-trees are used as part of an essential heuristic to enable the calculation of multiple sequence alignments. They have been the focus of much method development, but there has been little effort to determine systematically which guide-trees, if any, give the best alignments. Some guide-tree construction schemes are based on pairwise distances amongst unaligned sequences. Others try to emulate an underlying evolutionary tree and involve various iteration methods.

Results

We explore all possible guide-trees for a set of protein alignments of up to eight sequences. We find that pairwise distance based default guide-trees sometimes outperform evolutionary guide-trees, as measured by structure derived reference alignments. However, default guide-trees fall well short of the optimum attainable scores. On average, chained guide-trees perform better than balanced ones but are not better than default guide-trees for small alignments.
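To give a sense of why an exhaustive search is limited to about eight sequences, the sketch below counts rooted, labelled binary trees and builds the two extreme shapes mentioned above. It assumes that "all possible guide-trees" means rooted binary topologies over labelled leaves, which the abstract does not state; the function names are hypothetical.

```python
def count_rooted_binary_trees(n):
    """Number of rooted binary trees on n labelled leaves: (2n - 3)!!."""
    count = 1
    for k in range(3, 2 * n - 2, 2):  # 3, 5, ..., 2n - 3
        count *= k
    return count


def chained_tree(labels):
    """Fully unbalanced (caterpillar) tree: sequences are added one at a time."""
    tree = labels[0]
    for label in labels[1:]:
        tree = (tree, label)
    return tree


def balanced_tree(labels):
    """Tree that splits the leaf set in half at every level."""
    if len(labels) == 1:
        return labels[0]
    mid = len(labels) // 2
    return (balanced_tree(labels[:mid]), balanced_tree(labels[mid:]))


n_trees_for_eight = count_rooted_binary_trees(8)
chained = chained_tree(["A", "B", "C", "D"])
balanced = balanced_tree(["A", "B", "C", "D"])
```

Under this assumption the count grows from 15 trees for four sequences to 135,135 for eight, so each additional sequence multiplies the search space by an odd factor of increasing size.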

Conclusions

Alignment methods that use consistency or hidden Markov models to make alignments are less susceptible to sub-optimal guide-trees than simpler methods that essentially use conventional sequence alignment between profiles. The latter appear to be affected positively by evolutionary based guide-trees for difficult alignments and negatively for easy alignments. One phylogeny aware alignment program can strongly discriminate between good and bad guide-trees. The results for randomly chained guide-trees improve with the number of sequences.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-338) contains supplementary material, which is available to authorized users.

6.

Background

Analysis of targeted amplicon sequencing data presents some unique challenges in comparison to the analysis of random fragment sequencing data. Whereas reads from randomly fragmented DNA have arbitrary start positions, the reads from amplicon sequencing have fixed start positions that coincide with the amplicon boundaries. As a result, any variants near the amplicon boundaries can cause misalignments of multiple reads that can ultimately lead to false-positive or false-negative variant calls.

Results

We show that amplicon boundaries are variant calling blind spots where the variant calls are highly inaccurate. We propose that an effective strategy to avoid these blind spots is to incorporate the primer bases in obtaining read alignments and post-processing of the alignments, thereby effectively moving these blind spots into the primer binding regions (which are not used for variant calling). Targeted sequencing data analysis pipelines can provide better variant calling accuracy when primer bases are retained and sequenced.
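The strategy above (align with primer bases retained, then discard them just before variant calling) might be sketched as follows. This is an illustration only: it assumes an ungapped alignment and half-open reference coordinates, and the function name and coordinates are hypothetical rather than taken from any published pipeline.

```python
def clip_primers(ref_start, read_seq, insert_start, insert_end):
    """Clip an already aligned read to the amplicon insert region.

    ref_start    -- reference position of the first read base
    read_seq     -- read bases, aligned without gaps (a simplifying assumption)
    insert_start -- first reference position after the forward primer
    insert_end   -- first reference position of the reverse primer (exclusive)

    Returns (new_ref_start, clipped_sequence).
    """
    read_end = ref_start + len(read_seq)
    keep_from = max(ref_start, insert_start)
    keep_to = min(read_end, insert_end)
    if keep_from >= keep_to:
        return keep_from, ""  # read lies entirely within primer regions
    return keep_from, read_seq[keep_from - ref_start:keep_to - ref_start]


# A 14-base read with 4 primer bases at each end (made-up coordinates).
clipped_start, clipped_seq = clip_primers(100, "AAAACGTACGTTTT", 104, 110)
```

Because clipping happens after alignment, the primer bases still anchor the read ends during mapping, which is the point the abstract makes about moving the blind spots into the primer binding regions.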

Conclusions

Read bases beyond the variant site are necessary for analysis of amplicon sequencing data. Enzymatic primer digestion, if used in the target enrichment process, should leave at least a few primer bases to ensure that these bases are available during data analysis. The primer bases should only be removed immediately before the variant calling step to ensure that the variants can be called irrespective of where they occur within the amplicon insert region.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-1073) contains supplementary material, which is available to authorized users.

7.

Background

Epigenome-wide association studies of human disease and other quantitative traits are becoming increasingly common. A series of papers reporting age-related changes in DNA methylation profiles in peripheral blood have already been published. However, blood is a heterogeneous collection of different cell types, each with a very different DNA methylation profile.

Results

Using a statistical method that permits estimating the relative proportion of cell types from DNA methylation profiles, we examine data from five previously published studies, and find strong evidence of cell composition change across age in blood. We also demonstrate that, in these studies, cellular composition explains much of the observed variability in DNA methylation. Furthermore, we find high levels of confounding between age-related variability and cellular composition at the CpG level.
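The idea of estimating cell-type proportions from methylation profiles can be illustrated with a deliberately reduced case: a mixture of only two cell types with known reference profiles. The statistical method used in the study is not specified in the abstract and handles more cell types; the least-squares estimator, function name, and profile values below are assumptions for illustration.

```python
def estimate_proportion(mixed, profile_a, profile_b):
    """Least-squares estimate of the fraction of cell type A in a mixture.

    Models each CpG as mixed = p * profile_a + (1 - p) * profile_b and
    solves for p in closed form, clipping the result to [0, 1].
    """
    num = sum((y - b) * (a - b) for y, a, b in zip(mixed, profile_a, profile_b))
    den = sum((a - b) ** 2 for a, b in zip(profile_a, profile_b))
    if den == 0:
        raise ValueError("reference profiles are identical")
    return min(1.0, max(0.0, num / den))


# Made-up reference methylation values at three CpGs.
type_a = [0.9, 0.1, 0.5]
type_b = [0.1, 0.8, 0.5]
blood = [0.3 * a + 0.7 * b for a, b in zip(type_a, type_b)]
p_hat = estimate_proportion(blood, type_a, type_b)
```

Only CpGs whose methylation differs between the reference profiles carry information about composition; the third CpG in the toy data contributes nothing to the estimate.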

Conclusions

Our findings underscore the importance of considering cell composition variability in epigenetic studies based on whole blood and other heterogeneous tissue sources. We also provide software for estimating and exploring this composition confounding for the Illumina 450k microarray.

8.

Background

Severe refractory asthma is a heterogeneous disease. We sought to determine statistical clusters from the British Thoracic Society Severe refractory Asthma Registry and to examine cluster-specific outcomes and stability.

Methods

Factor analysis and statistical cluster modelling were undertaken to determine the number of clusters and their membership (N = 349). Cluster-specific outcomes were assessed after a median follow-up of 3 years. A classifier was programmed to determine cluster stability and was validated in an independent cohort of new patients recruited to the registry (n = 245).

Findings

Five clusters were identified. Cluster 1 (34%) were atopic with early onset disease, cluster 2 (21%) were obese with late onset disease, cluster 3 (15%) had the least severe disease, cluster 4 (15%) were eosinophilic with late onset disease and cluster 5 (15%) had significant fixed airflow obstruction. At follow-up, the proportion of subjects treated with oral corticosteroids increased in all groups with an increase in body mass index. Exacerbation frequency decreased significantly in clusters 1, 2 and 4 and was associated with a significant fall in the peripheral blood eosinophil count in clusters 2 and 4. Stability of cluster membership at follow-up was 52% for the whole group with stability being best in cluster 2 (71%) and worst in cluster 4 (25%). In an independent validation cohort, the classifier identified the same 5 clusters with similar patient distribution and characteristics.

Interpretation

Statistical cluster analysis can identify distinct phenotypes with specific outcomes. Cluster membership can be determined using a classifier, but when treatment is optimised, cluster stability is poor.

9.

Background

The classical candidate-gene approach has failed to identify novel breast cancer susceptibility genes. Massively parallel sequencing technology now makes possible studies that were unaffordable a few years ago. However, analysis protocols are not yet sufficiently developed to extract all information from the huge amount of data obtained.

Methodology/Principal Findings

In this study, we performed high throughput sequencing in two regions located on chromosomes 3 and 6, recently identified by linkage studies by our group as candidate regions for harbouring breast cancer susceptibility genes. In order to enrich for the coding regions of all described genes located in both candidate regions, a hybrid-selection method on tiling microarrays was performed.

Conclusions/Significance

We developed an analysis pipeline based on SOAP aligner to identify candidate variants with a high real positive confirmation rate (0.89), with which we identified eight variants considered candidates for functional studies. The results suggest that the present strategy might be a valid second step for identifying high penetrance genes.

10.

Background

We recently described FastTree, a tool for inferring phylogenies for alignments with up to hundreds of thousands of sequences. Here, we describe improvements to FastTree that improve its accuracy without sacrificing scalability.

Methodology/Principal Findings

Where FastTree 1 used nearest-neighbor interchanges (NNIs) and the minimum-evolution criterion to improve the tree, FastTree 2 adds minimum-evolution subtree-pruning-regrafting (SPRs) and maximum-likelihood NNIs. FastTree 2 uses heuristics to restrict the search for better trees and estimates a rate of evolution for each site (the “CAT” approximation). Nevertheless, for both simulated and genuine alignments, FastTree 2 is slightly more accurate than a standard implementation of maximum-likelihood NNIs (PhyML 3 with default settings). Although FastTree 2 is not quite as accurate as methods that use maximum-likelihood SPRs, most of the splits that disagree are poorly supported, and for large alignments, FastTree 2 is 100–1,000 times faster. FastTree 2 inferred a topology and likelihood-based local support values for 237,882 distinct 16S ribosomal RNAs on a desktop computer in 22 hours and 5.8 gigabytes of memory.
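A nearest-neighbor interchange, the basic rearrangement mentioned above, can be shown on the smallest case: four subtrees around one internal edge. The sketch below is not FastTree's implementation; the tuple representation, function names, and the scoring callback are hypothetical, and the minimum-evolution and likelihood criteria themselves are not implemented.

```python
def nni_neighbours(tree):
    """Return the two alternative topologies around the central edge.

    tree is ((a, b), (c, d)), where a, b, c, d are leaves or subtrees
    joined by one internal edge.
    """
    (a, b), (c, d) = tree
    return [((a, c), (b, d)), ((a, d), (b, c))]


def best_after_nni(tree, score):
    """Keep whichever of the current tree and its two NNI neighbours
    has the lowest score (for example, a tree length to be minimised)."""
    return min([tree] + nni_neighbours(tree), key=score)


start = (("A", "B"), ("C", "D"))
neighbours = nni_neighbours(start)


def toy_score(tree):
    # Toy criterion (an assumption): prefer the tree that pairs A with C.
    return 0 if tree[0] == ("A", "C") else 1


improved = best_after_nni(start, toy_score)
```

Each internal edge has exactly two NNI neighbours, so one round of NNIs over a tree costs time proportional to the number of edges, which is part of why such moves scale to very large alignments.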

Conclusions/Significance

FastTree 2 allows the inference of maximum-likelihood phylogenies for huge alignments. FastTree 2 is freely available at http://www.microbesonline.org/fasttree.

11.

Background

Proteins are composed of domains, protein segments that fold independently from the rest of the protein and have a specific function. During evolution the arrangement of domains can change: domains are gained, lost or their order is rearranged. To facilitate the analysis of these changes we propose the use of multiple domain alignments.

Results

We developed an alignment program, called MDAT, which aligns multiple domain arrangements. MDAT extends earlier programs which perform pairwise alignments of domain arrangements. MDAT uses a domain similarity matrix to score domain pairs and aligns the domain arrangements using a consistency supported progressive alignment method.
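The pairwise building block that MDAT extends, aligning two domain arrangements with a domain similarity score, can be sketched with standard global (Needleman-Wunsch) dynamic programming. This is not MDAT's code: the scoring values, gap penalty, and domain identifiers are assumptions, and the consistency-supported progressive step for multiple arrangements is not shown.

```python
def align_domains(a, b, score, gap=-1):
    """Globally align two domain arrangements (lists of domain identifiers).

    score(x, y) plays the role of a lookup in a domain similarity matrix.
    Returns (alignment_score, list of (domain_a, domain_b) pairs), where
    None marks a gap.
    """
    n, m = len(a), len(b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(dp[i - 1][j - 1] + score(a[i - 1], b[j - 1]),
                           dp[i - 1][j] + gap,
                           dp[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + score(a[i - 1], b[j - 1]):
            pairs.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + gap:
            pairs.append((a[i - 1], None))
            i -= 1
        else:
            pairs.append((None, b[j - 1]))
            j -= 1
    pairs.reverse()
    return dp[n][m], pairs


def identity_score(x, y):
    # Stand-in for a similarity matrix: +2 for identical domains, -1 otherwise.
    return 2 if x == y else -1


total, aligned = align_domains(["PF1", "PF2", "PF3"], ["PF1", "PF3"], identity_score)
```

In the toy example the middle domain of the first arrangement is aligned to a gap, which is how a domain loss or gain would appear in such an alignment.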

Conclusion

MDAT will be useful for analysing changes in domain arrangements within and between protein families and will thus provide valuable insights into the evolution of proteins and their domains. MDAT is coded in C++, and the source code is freely available for download at http://www.bornberglab.org/pages/mdat.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0442-7) contains supplementary material, which is available to authorized users.

12.
13.

Purpose

Aberrant promoter DNA methylation can serve as a predictive biomarker for improved clinical responses to certain chemotherapeutics. One of the major advantages of methylation biomarkers is the ease of detection and clinical application. In order to identify methylation biomarkers predictive of a response to a taxane-platinum based chemotherapy regimen in advanced NSCLC we performed an unbiased methylation analysis of 1,536 CpG dinucleotides in cancer-associated gene loci and correlated results with clinical outcomes.

Methods

We studied a cohort of 49 patients (median age 62 years) with advanced NSCLC treated at the Atlanta VAMC between 1999 and 2010. Methylation analysis was done on the Illumina GoldenGate Cancer panel 1 methylation microarray platform. Methylation data were correlated with clinical response and adjusted for false discovery rates.
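Adjusting many per-CpG p-values for the false discovery rate is often done with the Benjamini-Hochberg procedure. The abstract does not say which FDR method was used, so the choice of procedure below is an assumption, and the p-values are made up.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is scaled by m / rank (rank 1 = smallest), then a running
    minimum from the largest rank downwards enforces monotonicity.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
```

With 1,536 CpG sites tested, a raw p-value has to be very small to survive this kind of adjustment, which is why raw p-values and FDR values are typically reported side by side.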

Results

CAV1 methylation emerged as a powerful predictor for achieving disease stabilization following platinum-taxane based chemotherapy (p = 1.21E-05, FDR significance = 0.018176). In Cox regression analysis after multivariate adjustment for age, performance status, gender, histology and the use of bevacizumab, CAV1 methylation was significantly associated with improved overall survival (HR 0.18 (95% CI: 0.03–0.94)). Silencing of CAV1 expression in lung cancer cell lines (A549, EKVX) by shRNA led to alterations in taxane retention.

Conclusions

CAV1 methylation is a predictor of disease stabilization and improved overall survival following chemotherapy with a taxane-platinum combination regimen in advanced NSCLC. CAV1 methylation may predict improved outcomes for other chemotherapeutic agents which are subject to cellular clearance mediated by caveolae.

14.

Background

Human induced pluripotent stem cells (iPSCs) have a wide range of applications throughout the fields of basic research, disease modeling and drug screening. Epigenetically unstable iPSCs with aberrant DNA methylation may divide and differentiate into cancer cells. Unfortunately, little effort has been made to compare the epigenetic variation in iPSCs with that in differentiated cells. Here, we developed an analytical procedure to decipher the DNA methylation heterogeneity of mixed cells and further exploited it to quantitatively assess the DNA methylation variation in the methylomes of adipose-derived stem cells (ADS), mature adipocytes differentiated from ADS cells (ADS-adipose) and iPSCs reprogrammed from ADS cells (ADS-iPSCs).

Results

We observed that the degree of DNA methylation variation varies across distinct genomic regions with promoter and 5’UTR regions exhibiting low methylation variation and Satellite showing high methylation variation. Compared with differentiated cells, ADS-iPSCs possess globally decreased methylation variation, in particular in repetitive elements. Interestingly, DNA methylation variation decreases in promoter regions during differentiation but increases during reprogramming. Methylation variation in promoter regions is negatively correlated with gene expression. In addition, genes showing a bipolar methylation pattern, with both completely methylated and completely unmethylated reads, are related to the carbohydrate metabolic process, cellular development, cellular growth, proliferation, etc.

Conclusions

This study delivers a way to detect cell-subset specific methylation genes in a mixed cell population and provides a better understanding of methylation dynamics during stem cell differentiation and reprogramming.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-978) contains supplementary material, which is available to authorized users.

15.

Background

A commonplace analysis in high-throughput DNA methylation studies is the comparison of methylation extent between different functional regions, computed by averaging methylation states within region types and then comparing averages between regions. For example, it has been reported that methylation is more prevalent in coding regions as compared to their neighboring introns or UTRs, leading to hypotheses about novel forms of epigenetic regulation.

Results

We have identified and characterized a bias present in these seemingly straightforward comparisons that results in the false detection of differences in methylation intensities across region types. This bias arises due to differences in conservation rates, rather than methylation rates, and is broadly present in the published literature. When controlling for conservation at coding start sites, the differences in DNA methylation rates disappear. Moreover, a re-evaluation of methylation rates at intron-exon junctions reveals that the magnitude of previously reported differences is greatly exaggerated. We introduce two correction methods to address this bias, an inference-based matrix completion algorithm and an averaging approach, tailored to address different underlying biological questions. We evaluate how analysis using these corrections affects the detection of differences in DNA methylation across functional boundaries.

Conclusions

We report here on a bias in DNA methylation comparative studies that originates in conservation rate differences and manifests itself in the false discovery of differences in DNA methylation intensities and their extents. We have characterized this bias and its broad implications, and show how to control for it so as to enable the study of a variety of biological questions.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1604-3) contains supplementary material, which is available to authorized users.

16.
17.

Background

To date, pathological examination of specimens remains largely qualitative. Quantitative measures of tissue spatial features are generally not captured. To gain additional mechanistic and prognostic insights, a need for quantitative architectural analysis arises in studying immune cell-cancer interactions within the tumor microenvironment and tumor-draining lymph nodes (TDLNs).

Methodology/Principal Findings

We present a novel, quantitative image analysis approach incorporating 1) multi-color tissue staining, 2) high-resolution, automated whole-section imaging, 3) custom image analysis software that identifies cell types and locations, and 4) spatial statistical analysis. As a proof of concept, we applied this approach to study the architectural patterns of T and B cells within tumor-draining lymph nodes from breast cancer patients versus healthy lymph nodes. We found that the spatial grouping patterns of T and B cells differed between healthy and breast cancer lymph nodes, and this could be attributed to the lack of B cell localization in the extrafollicular region of the TDLNs.
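Once cell types and locations have been extracted from an image, a spatial statistic can be computed from the coordinates alone. The abstract does not name the statistics used, so the mean nearest-neighbour distance below is only one plausible choice; the function name and coordinates are hypothetical.

```python
from math import dist


def mean_nearest_neighbour_distance(sources, targets):
    """Average distance from each source cell to its closest target cell.

    sources and targets are lists of (x, y) coordinates, for example the
    centroids of T cells and B cells identified in a tissue section.
    """
    if not sources or not targets:
        raise ValueError("both cell lists must be non-empty")
    return sum(min(dist(s, t) for t in targets) for s in sources) / len(sources)


# Made-up centroids in arbitrary units.
t_cells = [(0, 0), (10, 0)]
b_cells = [(0, 3), (10, 4)]
mean_t_to_b = mean_nearest_neighbour_distance(t_cells, b_cells)
```

A statistic of this kind depends on where cells sit and not only on how many there are, which is the distinction the abstract draws between spatial and numerical changes.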

Conclusions/Significance

Our integrative approach has made quantitative analysis of complex visual data possible. Our results highlight spatial alterations of immune cells within lymph nodes from breast cancer patients as an independent variable from numerical changes. This opens up new areas of investigations in research and medicine. Future application of this approach will lead to a better understanding of immune changes in the tumor microenvironment and TDLNs, and how they affect clinical outcomes.

18.
19.

Background

DNA methylation is an important epigenetic mechanism in several human diseases, most notably cancer. Quantitative DNA methylation patterns have the potential to serve as diagnostic and prognostic biomarkers; however, there is currently a lack of consensus regarding the optimal methodologies to quantify methylation status. To address this issue we compared five analytical methods: (i) MethyLight qPCR, (ii) MethyLight digital PCR (dPCR), methylation-sensitive and -dependent restriction enzyme (MSRE/MDRE) digestion followed by (iii) qPCR or (iv) dPCR, and (v) bisulfite amplicon next generation sequencing (NGS). The techniques were evaluated for linearity, accuracy and precision.

Results

MethyLight qPCR displayed the best linearity across the range of tested samples. Observed methylation measured by MethyLight- and MSRE/MDRE-qPCR and -dPCR were not significantly different to expected values whilst bisulfite amplicon NGS analysis over-estimated methylation content. Bisulfite amplicon NGS showed good precision, whilst the lower precision of qPCR and dPCR analysis precluded discrimination of differences of < 25% in methylation status. A novel dPCR MethyLight assay is also described as a potential method for absolute quantification that simultaneously measures both sense and antisense DNA strands following bisulfite treatment.
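The three evaluation criteria can each be reduced to a simple summary statistic. The abstract does not define how linearity, accuracy and precision were computed, so the definitions below (squared Pearson correlation, mean signed error, and coefficient of variation) are assumptions, and the measurements are made up.

```python
from statistics import mean, stdev


def r_squared(expected, observed):
    """Linearity: squared Pearson correlation of observed vs expected values."""
    mx, my = mean(expected), mean(observed)
    sxy = sum((x - mx) * (y - my) for x, y in zip(expected, observed))
    sxx = sum((x - mx) ** 2 for x in expected)
    syy = sum((y - my) ** 2 for y in observed)
    return sxy * sxy / (sxx * syy)


def mean_bias(expected, observed):
    """Accuracy: average signed difference, observed minus expected."""
    return mean([o - e for e, o in zip(expected, observed)])


def cv_percent(replicates):
    """Precision: coefficient of variation of replicate measurements, in %."""
    return 100 * stdev(replicates) / mean(replicates)


# Made-up percent-methylation values for a dilution series and one replicate set.
expected_pct = [0, 25, 50, 75, 100]
observed_pct = [5, 30, 55, 80, 105]
linearity = r_squared(expected_pct, observed_pct)
bias = mean_bias(expected_pct, observed_pct)
precision = cv_percent([48, 50, 52])
```

The toy data show why the criteria are reported separately: a method can be perfectly linear while still over-estimating methylation by a constant amount.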

Conclusions

Our findings comprise a comprehensive benchmark for the quantitative accuracy of key methods for methylation analysis and demonstrate their applicability to the quantification of circulating tumour DNA biomarkers by using sample concentrations that are representative of typical clinical isolates.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-1174) contains supplementary material, which is available to authorized users.

20.

Background

A low birth weight has been extensively related to poor adult health outcomes. Birth weight can be seen as a proxy for environmental conditions during prenatal development. Identical twin pairs discordant for birth weight provide an extraordinary model for investigating the association between birth weight and adult life health while controlling for not only genetics but also postnatal rearing environment. We performed an epigenome-wide profiling on blood samples from 150 pairs of adult monozygotic twins discordant for birth weight to look for molecular evidence of epigenetic signatures in association with birth weight discordance.

Results

Our association analysis revealed no CpG site with genome-wide statistical significance (FDR < 0.05) for either qualitative (larger or smaller) or quantitative discordance in birth weight. Even with selected samples of extremely birth weight discordant twin pairs, no significant site was found except for 3 CpGs that displayed age-dependent intra-pair differential methylation with FDRs 0.014 (cg26856578, p = 3.42e-08), 0.0256 (cg15122603, p = 1.25e-07) and 0.0258 (cg16636641, p = 2.05e-07). Among the three sites, intra-pair differential methylation increased with age for cg26856578 but decreased with age for cg15122603 and cg16636641. There was no genome-wide statistical significance for sex-dependent effects on intra-pair differential methylation in either the whole samples or the extremely discordant twins.

Conclusions

Genome-wide DNA methylation profiling did not reveal epigenetic signatures of birth weight discordance although some sites displayed age-dependent intra-pair differential methylation in the extremely discordant twin pairs.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-1062) contains supplementary material, which is available to authorized users.
