Similar literature
1.
We propose a statistical algorithm MethylPurify that uses regions with bisulfite reads showing discordant methylation levels to infer tumor purity from tumor samples alone. MethylPurify can identify differentially methylated regions (DMRs) from individual tumor methylome samples, without genomic variation information or prior knowledge from other datasets. In simulations with mixed bisulfite reads from cancer and normal cell lines, MethylPurify correctly inferred tumor purity and identified over 96% of the DMRs. From patient data, MethylPurify gave satisfactory DMR calls from tumor methylome samples alone, and revealed potential missed DMRs by tumor to normal comparison due to tumor heterogeneity.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-014-0419-x) contains supplementary material, which is available to authorized users.

2.
He Feifei, Li Yang, Tang Yu-Hang, Ma Jian, Zhu Huaiqiu. 《BMC Genomics》 2016, 17(1): 141-151
Background

The identification of inversions of DNA segments shorter than the read length (e.g., 100 bp), defined as micro-inversions (MIs), remains challenging for next-generation sequencing reads. MIs are acknowledged to be an important class of genomic variation and may play a role in causing genetic disease. However, current alignment methods are generally insensitive to MIs. Here we develop a novel tool, MID (Micro-Inversion Detector), to identify MIs in human genomes using next-generation sequencing reads.

Results

The algorithm of MID is designed based on a dynamic programming path-finding approach. What makes MID different from other variant detection tools is that MID can handle small MIs and multiple breakpoints within an unmapped read. Moreover, MID improves reliability in low coverage data by integrating multiple samples. Our evaluation demonstrated that MID outperforms Gustaf, which can currently detect inversions from 30 bp to 500 bp.

Conclusions

To our knowledge, MID is the first method that can efficiently and reliably identify MIs from unmapped short next-generation sequencing reads. MID is reliable on low coverage data, which is suitable for large-scale projects such as the 1000 Genomes Project (1KGP). MID identified previously unknown MIs from the 1KGP that overlap with genes and regulatory elements in the human genome. We also identified MIs in cancer cell lines from Cancer Cell Line Encyclopedia (CCLE). Therefore our tool is expected to be useful to improve the study of MIs as a type of genetic variant in the human genome. The source code can be downloaded from: http://cqb.pku.edu.cn/ZhuLab/MID.


3.
Introduction: Cancer is often diagnosed at late stages, when the chance of cure is relatively low. Although research initiatives in oncology discover many potential cancer biomarkers, few transition to clinical applications. This review addresses the current landscape of cancer biomarker discovery and translation with a focus on proteomics and beyond.

Areas covered: The review examines proteomic and genomic techniques for cancer biomarker detection and outlines advantages and challenges of integrating multiple omics approaches to achieve optimal sensitivity and address tumor heterogeneity. This discussion is based on a systematic literature review and direct participation in translational studies.

Expert commentary: Identifying aggressive cancers early on requires improved sensitivity and implementation of biomarkers representative of tumor heterogeneity. During the last decade of genomic and proteomic research, significant advancements have been made in next generation sequencing and mass spectrometry techniques. This in turn has led to a dramatic increase in identification of potential genomic and proteomic cancer biomarkers. However, translation of these discoveries into clinical practice has had limited success. We believe that the integration of these omics approaches is the most promising molecular tool for comprehensive cancer evaluation, early detection and transition to Precision Medicine in oncology.


4.
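Abstract. Objective: To compare phenotypic differences among monoclones derived from the same tumor cells. Methods: Single cells of the HCT116 colon cancer cell line were obtained by limiting dilution under suspension culture conditions. Wells containing a single cell were expanded to obtain daughter monoclones, and monoclones were picked again in the same way to obtain three successive generations of clones. Based on clone morphology, three representative third-generation monoclones were selected, and differences in SOX2, EpCAM and Vimentin protein expression were compared by Western blot and immunofluorescence. Radiotherapy was applied to observe dynamic changes in Vimentin protein in the three monoclones and to study temporal heterogeneity under radiotherapy, and Transwell in vitro invasion assays were used to compare the invasiveness of the clones. Results: The three homologous third-generation subclones expanded from single cells still showed marked biological differences. Clone morphology differed clearly between spherical and irregular clones. Irregular clones tended to show low SOX2 expression and high Vimentin expression. At the single-cell level, protein expression (Vimentin; EpCAM) also differed among individual cells within the same monoclonal population. Observation of Vimentin fluorescence intensity at different time points before and after radiotherapy showed that tumor monoclonal cells exhibit temporal heterogeneity. Transwell in vitro invasion assays also showed marked differences among the three homologous clones. Conclusion: Homologous third-generation monoclones obtained by successive single-cell expansion still show marked biological differences, suggesting that intratumoral heterogeneity is an inherent feature of tumors and that therapeutic intervention can also give rise to temporal heterogeneity.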

5.
Li Xin, Wu Yufeng. 《BMC Bioinformatics》 2023, 23(8): 1-16
Background

Structural variation (SV), which ranges from 50 bp to ~3 Mb in size, is an important type of genetic variation. Deletion is a type of SV in which a part of a chromosome or a sequence of DNA is lost during DNA replication. Three types of signals, namely discordant read pairs, read depth and split reads, are commonly used for SV detection from high-throughput sequence data. Many tools have been developed for detecting SVs by using one or more of these signals.
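
As an illustration of the first signal, discordant read pairs are often flagged by comparing each pair's insert size with the typical insert size of the library. The sketch below is a generic example and is not taken from any tool named in this abstract; the function name, the median/MAD cutoff and the parameter `k` are assumptions made for illustration.

```python
from statistics import median

def discordant_pairs(insert_sizes, k=3.0):
    """Return indices of read pairs whose insert size is unusually large.

    Hypothetical helper: the cutoff is median + k * 1.4826 * MAD, a robust
    stand-in for mean + k * SD that is not distorted by the outliers themselves.
    """
    med = median(insert_sizes)
    mad = median(abs(s - med) for s in insert_sizes)
    cutoff = med + k * 1.4826 * mad
    return [i for i, s in enumerate(insert_sizes) if s > cutoff]

# Toy library: inserts near 300 bp plus one pair that spans a large gap.
sizes = [290, 295, 300, 305, 310] * 4 + [5000]
print(discordant_pairs(sizes))
```

A pair mapped much farther apart than expected suggests that reference sequence between the mates is missing from the sample, which is why such pairs serve as deletion candidates.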

Results

In this paper, we develop a new method called EigenDel for detecting germline submicroscopic genomic deletions. EigenDel first uses discordant read pairs and clipped reads to obtain initial deletion candidates, and then clusters similar candidates using unsupervised learning methods. After that, EigenDel uses a carefully designed approach for calling true deletions from each cluster. We conduct various experiments to evaluate the performance of EigenDel on low coverage sequence data.

Conclusions

Our results show that EigenDel outperforms other major methods in balancing accuracy and sensitivity and in reducing bias. EigenDel can be downloaded from https://github.com/lxwgcool/EigenDel.


6.
Zhu Fangfang, Li Jiang, Liu Juan, Min Wenwen. 《BMC Genetics》 2021, 22(1): 1-10
Background

Next-generation sequencing (NGS) has profoundly changed the approach to genetic/genomic research. Particularly, the clinical utility of NGS in detecting mutations associated with disease risk has contributed to the development of effective therapeutic strategies. Recently, comprehensive analysis of somatic genetic mutations by NGS has also been used as a new approach for controlling the quality of cell substrates for manufacturing biopharmaceuticals. However, the quality evaluation of cell substrates by NGS largely depends on the limit of detection (LOD) for rare somatic mutations. The purpose of this study was to develop a simple method for evaluating the ability of whole-exome sequencing (WES) by NGS to detect mutations with low allele frequency. To estimate the LOD of WES for low-frequency somatic mutations, we repeatedly and independently performed WES of a reference genomic DNA using the same NGS platform and assay design. LOD was defined as the allele frequency with a relative standard deviation (RSD) value of 30% and was estimated by a moving average curve of the relation between RSD and allele frequency.
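
The LOD definition above can be sketched in code. This is a minimal illustration, assuming replicate allele-frequency measurements for each mutation are available as plain lists; the function names, the window size and the linear interpolation at the 30% crossing are assumptions, since the abstract does not specify how the moving-average curve was computed.

```python
from statistics import mean, stdev

def rsd_percent(values):
    """Relative standard deviation (sample SD / mean) in percent."""
    return 100.0 * stdev(values) / mean(values)

def estimate_lod(replicates, threshold=30.0, window=3):
    """Estimate the allele frequency at which the smoothed RSD equals `threshold`.

    replicates: one list per mutation, holding its allele frequency in each run.
    Returns None when the smoothed curve never crosses the threshold.
    """
    # One (mean allele frequency, RSD) point per mutation, ordered by frequency.
    points = sorted((mean(r), rsd_percent(r)) for r in replicates)
    # Moving average over `window` consecutive mutations (assumed smoothing).
    smoothed = [
        (mean(p[0] for p in points[i:i + window]),
         mean(p[1] for p in points[i:i + window]))
        for i in range(len(points) - window + 1)
    ]
    # Scan from low to high frequency for the first downward crossing.
    for (af_lo, rsd_lo), (af_hi, rsd_hi) in zip(smoothed, smoothed[1:]):
        if rsd_lo >= threshold > rsd_hi:
            frac = (rsd_lo - threshold) / (rsd_lo - rsd_hi)
            return af_lo + frac * (af_hi - af_lo)
    return None
```

In this sketch a mutation measured at very low frequency varies strongly between runs (high RSD), while a common one is stable (low RSD); the LOD is the frequency at which the run-to-run variation falls to the chosen 30% level.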

Results

Allele frequencies of 20 mutations in the reference material that had been pre-validated by droplet digital PCR (ddPCR) were obtained from 5, 15, 30, or 40 giga base pairs (Gbp) of sequencing data per run. There was a significant association between the allele frequencies measured by WES and those pre-validated by ddPCR, and its p-value decreased as the sequencing data size increased. By this method, the LOD of allele frequency in WES with 15 Gbp or more of sequencing data was estimated to be between 5 and 10%.

Conclusions

For properly interpreting the WES data of somatic genetic mutations, it is necessary to have a cutoff threshold of low allele frequencies. The in-house LOD estimated by the simple method shown in this study provides a rationale for setting the cutoff.


7.
Background

A metagenome is a collection of genomes, usually in a micro-environment, and sequencing a metagenomic sample en masse is a powerful means for investigating the community of the constituent microorganisms. One of the challenges is in distinguishing between similar organisms due to rampant multiple possible assignments of sequencing reads, resulting in false positive identifications. We map the problem to a topological data analysis (TDA) framework that extracts information from the geometric structure of data. Here the structure is defined by multi-way relationships between the sequencing reads using a reference database.

Results

Based primarily on the patterns of co-mapping of the reads to multiple organisms in the reference database, we use two models: one a subcomplex of a Barycentric subdivision complex and the other a Čech complex. The Barycentric subcomplex allows a natural mapping of the reads along with their coverage of organisms while the Čech complex takes simply the number of reads into account to map the problem to homology computation. Using simulated genome mixtures we show not just enrichment of signal but also microbe identification with strain-level resolution.

Conclusions

In particular, in the most refractory of cases where alternative algorithms that exploit unique reads (i.e., mapped to unique organisms) fail, we show that the TDA approach continues to show consistent performance. The Čech model that uses less information is equally effective, suggesting that even partial information when augmented with the appropriate structure is quite powerful.


8.
ABSTRACT

Introduction: The recent development of checkpoint blockade immunotherapy for cancer has led to impressive clinical results across multiple tumor types. There is mounting evidence that immune recognition of tumor derived MHC class I (MHC-I) restricted epitopes bearing cancer specific mutations and alterations is a crucial mechanism in successfully triggering immune-mediated tumor rejection. Therapeutic targeting of these cancer specific epitopes (neoepitopes) is emerging as a promising opportunity for the generation of personalized cancer vaccines and adoptive T cell therapies. However, one major obstacle limiting the broader application of neoepitope based therapies is the difficulty of selecting highly immunogenic neoepitopes among the wide array of presented non-immunogenic HLA ligands derived from self-proteins.

Areas covered: In this review, we present an overview of the MHC-I processing and presentation pathway, as well as highlight key areas that contribute to the complexity of the associated MHC-I peptidome. We cover recent technological advances that simplify and optimize the identification of targetable neoepitopes for cancer immunotherapeutic applications.

Expert commentary: Recent advances in computational modeling, bioinformatics, and mass spectrometry are unlocking the underlying mechanisms governing antigen processing and presentation of tumor-derived neoepitopes.

9.
Clonal evolution is the process by which genetic and epigenetic diversity is created within malignant tumor cells. This process culminates in a heterogeneous tumor, consisting of multiple subpopulations of cancer cells that often do not contain the same underlying mutations. Continuous selective pressure permits outgrowth of clones that harbor lesions that are capable of enhancing disease progression, including those that contribute to therapy resistance, metastasis and relapse. Clonal evolution and the resulting intratumoral heterogeneity pose a substantial challenge to biomarker identification, personalized cancer therapies and the discovery of underlying driver mutations in cancer. The purpose of this Review is to highlight the unique strengths of zebrafish cancer models in assessing the roles that intratumoral heterogeneity and clonal evolution play in cancer, including transgenesis, imaging technologies, high-throughput cell transplantation approaches and in vivo single-cell functional assays.

KEY WORDS: Cancer stem cell, Fluorescence, Intratumoral, Single cell, Targeted therapy, Tumor

10.

Background

Many different genetic alterations are observed in cancer cells. Individual cancer genes display point mutations such as base changes, insertions and deletions that initiate and promote cancer growth and spread. Somatic hypermutation is a powerful mechanism for generation of different mutations. It was shown previously that somatic hypermutability of proto-oncogenes can induce development of lymphomas.

Methodology/Principal Findings

We found an exceptionally high incidence of single-base mutations in the tumor suppressor genes RASSF1 and RBSP3 (CTDSPL) both located in 3p21.3 regions, LUCA and AP20 respectively. These regions contain clusters of tumor suppressor genes involved in multiple cancer types such as lung, kidney, breast, cervical, head and neck, nasopharyngeal, prostate and other carcinomas. Altogether in 144 sequenced RASSF1A clones (exons 1–2), 129 mutations were detected (mutation frequency, MF = 0.23 per 100 bp) and in 98 clones of exons 3–5 we found 146 mutations (MF = 0.29). In 85 sequenced RBSP3 clones, 89 mutations were found (MF = 0.10). The mutations were not cytidine-specific, as would be expected from alterations generated by AID/APOBEC family enzymes, and appeared de novo during cell proliferation. They diminished the ability of corresponding transgenes to suppress cell and tumor growth implying a loss of function. These high levels of somatic mutations were found both in cancer biopsies and cancer cell lines.
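
The mutation frequencies quoted above are consistent with a simple normalization, sketched below under the assumption that MF is the number of mutations per 100 bp of sequence summed over all clones. The abstract does not state the sequenced length per clone, so the helper `implied_bp_per_clone` only back-calculates the length that this assumed definition would imply; both function names are hypothetical.

```python
def mutation_frequency_per_100bp(n_mutations, n_clones, bp_per_clone):
    """Mutations per 100 bp, pooled over all sequenced clones (assumed definition)."""
    total_bp = n_clones * bp_per_clone
    return 100.0 * n_mutations / total_bp

def implied_bp_per_clone(n_mutations, n_clones, mf_per_100bp):
    """Sequenced length per clone implied by a reported MF (back-calculation only)."""
    return 100.0 * n_mutations / (n_clones * mf_per_100bp)

# Figures reported in the abstract for RASSF1A exons 1-2.
length = implied_bp_per_clone(129, 144, 0.23)
print(round(length), "bp per clone implied under the assumed definition")
```

Under this reading, 129 mutations in 144 clones at MF = 0.23 would correspond to a few hundred base pairs sequenced per clone, which is plausible for a two-exon amplicon but is an inference rather than a figure given in the source.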

Conclusions/Significance

This is the first report of high frequencies of somatic mutations in RASSF1 and RBSP3 in different cancers, suggesting that it may underlie the mutator phenotype of cancer. Somatic hypermutations in tumor suppressor genes involved in major human malignancies offer a novel insight into cancer development, progression and spread.

11.
Single-cell sequencing is a powerful tool for delineating clonal relationships and identifying key driver genes for personalized cancer management. Here we performed single-cell sequencing analysis of a case of colon cancer. Population genetics analyses identified two independent clones in the tumor cell population. The major tumor clone harbored APC and TP53 mutations as early oncogenic events, whereas the minor clone contained preponderant CDC27 and PABPC1 mutations. The absence of APC and TP53 mutations in the minor clone supports that these two clones were derived from two cellular origins. Examination of somatic mutation allele frequency spectra of an additional 21 whole-tissue exome-sequenced cases revealed the heterogeneity of clonal origins in colon cancer. Next, we identified a mutated gene SLC12A5 that showed a high frequency of mutation at the single-cell level but exhibited low prevalence at the population level. Functional characterization of mutant SLC12A5 revealed its potential oncogenic effect in colon cancer. Our study provides the first exome-wide evidence at the single-cell level supporting that colon cancer could be of a biclonal origin, and suggests that low-prevalence mutations in a cohort may also play important protumorigenic roles at the individual level.

12.
13.
Hybrid assembly strategies that combine long reads from Oxford Nanopore's MinION device with high-depth Illumina paired-end reads have enabled completion and circularization of both plasmids and chromosomes from multiple bacterial strains. Here we demonstrate the utility of supplementing Illumina paired-end reads from a previously published draft genome of P. syringae pv. pisi PP1 with long reads to generate a complete genome sequence for this strain. The phylogenetic placement and genomic repertoire of virulence factors within this strain provide a unique perspective on virulence evolution within P. syringae phylogroup 2, and highlight that strains can rapidly acquire virulence factors through horizontal gene transfer by acquisition of plasmids as well as through chromosomal recombination.

14.
15.
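[Objective] To illustrate the importance of nitrite reductases in the biological nitrogen cycle in the environment, we studied their distribution. [Methods] Known nitrite reductase sequences were used to search databases of sequenced genomes to study the distribution of the enzyme; sequence similarity was compared by multiple sequence alignment, evolutionary relationships were compared by constructing phylogenetic trees, and the distribution in marine metagenomes was studied with metagenomic methods. [Results] The two classes of nitrite reductase were found in 397 and 812 sequenced bacterial and archaeal genomes, accounting for 8% and 15.7% of the total, respectively; almost all archaea contain the class II enzyme. Sequence similarity within each class is high: the substrate-binding sites of the class I and class II enzymes and the copper-binding sites of the class II enzyme are highly conserved, indicating that the sequence conservation of the enzyme matches its environmental function. Phylogenetic analysis suggests that the two classes may share a common evolutionary origin. In marine metagenomes, an average of 6 class I and 35 class II sequences were found per 100,000 reads, and both classes were most abundant in the tropical South Pacific. [Conclusion] NIR may play an important role in the biological nitrogen cycle and in environmental remediation.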

16.
17.
Intra-tumor heterogeneity reflects cancer genome evolution and provides key information for diagnosis and treatment. When bulk tumor tissues are profiled for somatic copy number alterations (sCNA) and point mutations, it may be difficult to estimate their cellular fractions when a mutation falls within a sCNA. We present the Clonal Heterogeneity Analysis Tool, which estimates cellular fractions for both sCNAs and mutations, and uses their distributions to inform macroscopic clonal architecture. In a set of approximately 700 breast tumors, more than half appear to contain multiple recognizable aneuploid tumor clones, and many show subtype-specific differences in clonality for known cancer genes.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-014-0473-4) contains supplementary material, which is available to authorized users.

18.
Summary

Using larval data of zoeae from selected genera of majids, we determined tree topologies, levels of homoplasy, and frequencies of reduction under three different assumptions of character argumentation: ordered reduction events, unordered reduction events, and outgroup comparison. Under each assumption we provided a phylogenetic hypothesis for some majid genera and evaluated the assumption that structural reduction can be assumed a priori as a criterion to infer character transformation polarity in phylogenetic reconstruction of decapods. The results indicate that the a priori assumption of “reduction” as the derived condition is not justified because under this assumption, reduction is not always maintained throughout the resulting phylogenetic hypothesis. Furthermore, we also found that this criterion fails to provide the most parsimonious explanation of the data set. Therefore, we reject the use of a “reduction=derived” criterion to infer polarity in phylogenetic reconstruction. Phylogenetic analysis using outgroup comparison provided a phylogenetic hypothesis with a better fit and a lower frequency of reduction events. However, we found that statements of homology may be problematic when the number of larval stages in the outgroup differs from that of the ingroup. To overcome this problem, we suggest that, in the absence of evidence for developmental homology, all larval stages should be considered as potential homologues. Using this approach to homology of larval stages, we provide a new phylogenetic hypothesis for 15 genera of majids based on larval morphology. Within Majidae, representative members of Majinae formed a highly nested monophyletic group with the following topology: ((Jacquinotia+Notomithrax) (Leptomithrax+Maja)). In contrast, the Oregoniinae (Hyas+Chionoecetes) formed a basal monophyletic group. Contrary to established ideas for the monophyly of Inachinae, Macrocheira is basal to the Oregoniinae. Other taxa did not form monophyletic groupings based on classical assignment to subfamilies.

19.
20.

Background

It has been an abiding belief among geneticists that multicellular organisms’ genomes can be analyzed under the assumption that a single individual has a uniform genome in all its cells. Despite some evidence to the contrary, this belief has been used as an axiomatic assumption in most genome analysis software packages. In this paper we present observations in human whole genome data, human whole exome data and in mouse whole genome data to challenge this assumption. We show that heterogeneity is in fact ubiquitous and readily observable in ordinary Next Generation Sequencing (NGS) data.

Results

Starting with the assumption that a single NGS read (or read pair) must come from one haplotype, we built a procedure for directly observing haplotypes at a local level by examining 2 or 3 adjacent single nucleotide polymorphisms (SNPs) which are close enough on the genome to be spanned by individual reads. We applied this procedure to NGS data from three different sources: whole genome data of a Central European trio from the 1000 Genomes Project, whole genome data from laboratory-bred strains of mouse, and whole exome data from a set of patients with head and neck tumors. Thousands of loci were found in each genome where reads spanning 2 or 3 SNPs displayed more than two haplotypes, indicating that the locus is heterogeneous. We show that such loci are ubiquitous in the genome and cannot be explained by segmental duplications. We explain them on the basis of cellular heterogeneity at the genomic level. Such heterogeneous loci were found in all normal and tumor genomes examined.
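
The local-haplotype idea described above can be sketched as follows. This is a minimal illustration rather than the authors' procedure: reads are represented as hypothetical dictionaries mapping genomic position to the observed base, and the `min_reads` support filter is an assumption added here to suppress haplotypes created by isolated sequencing errors.

```python
from collections import Counter

def local_haplotypes(reads, snp_positions, min_reads=2):
    """Count haplotypes observed on reads that span every SNP in `snp_positions`.

    reads: iterable of dicts mapping position -> base (hypothetical representation).
    Haplotypes supported by fewer than `min_reads` reads are dropped.
    """
    counts = Counter()
    for read in reads:
        # Only reads covering all SNPs can report a complete local haplotype.
        if all(pos in read for pos in snp_positions):
            counts[tuple(read[pos] for pos in snp_positions)] += 1
    return Counter({hap: n for hap, n in counts.items() if n >= min_reads})

def is_heterogeneous(reads, snp_positions, min_reads=2):
    """A diploid locus allows at most two haplotypes; more suggests heterogeneity."""
    return len(local_haplotypes(reads, snp_positions, min_reads)) > 2

# Toy example: three well-supported haplotypes across two adjacent SNPs.
reads = (
    [{100: "A", 120: "C"}] * 3
    + [{100: "G", 120: "C"}] * 3
    + [{100: "G", 120: "T"}] * 2
    + [{100: "A"}]  # does not span both SNPs, so it is ignored
)
print(is_heterogeneous(reads, [100, 120]))
```

Because each read comes from a single DNA molecule, the bases it carries at adjacent SNPs are phased by construction, so no statistical phasing step is needed at this local scale.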

Conclusions

Our results highlight the need for new methods to analyze genomic variation because existing ones do not systematically consider local haplotypes. Identification of cancer somatic mutations is complicated because of tumor heterogeneity. It is further complicated if, as we show, normal tissues are also heterogeneous. Methods for biomarker discovery must consider contextual haplotype information rather than just whether a variant “is present”.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-418) contains supplementary material, which is available to authorized users.
