Similar literature
 Found 20 similar articles; search time 15 ms
1.
SomaticSeq is an accurate somatic mutation detection pipeline implementing a stochastic boosting algorithm to produce highly accurate somatic mutation calls for both single nucleotide variants and small insertions and deletions. The workflow currently incorporates five state-of-the-art somatic mutation callers, and extracts over 70 individual genomic and sequencing features for each candidate site. A training set is provided to an adaptively boosted decision tree learner to create a classifier for predicting mutation statuses. We validate our results with both synthetic and real data. We report that SomaticSeq is able to achieve better overall accuracy than any individual tool incorporated.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-015-0758-2) contains supplementary material, which is available to authorized users.

2.
What is the lineage relation among the cells of an organism? The answer is sought by developmental biology, immunology, stem cell research, brain research, and cancer research, yet complete cell lineage trees have been reconstructed only for simple organisms such as Caenorhabditis elegans. We discovered that somatic mutations accumulated during normal development of a higher organism implicitly encode its entire cell lineage tree with very high precision. Our mathematical analysis of known mutation rates in microsatellites (MSs) shows that the entire cell lineage tree of a human embryo, or a mouse, in which no cell is a descendant of more than 40 divisions, can be reconstructed from information on somatic MS mutations alone with no errors, with probability greater than 99.95%. Analyzing all approximately 1.5 million MSs of each cell of an organism may not be practical at present, but we also show that in a genetically unstable organism, analyzing only a few hundred MSs may suffice to reconstruct portions of its cell lineage tree. We demonstrate the utility of the approach by reconstructing cell lineage trees from DNA samples of a human cell line displaying MS instability. Our discovery and its associated procedure, which we have automated, may point the way to a future "Human Cell Lineage Project" that would aim to resolve fundamental open questions in biology and medicine by reconstructing ever larger portions of the human cell lineage tree.

3.
Fate maps depict how cells relate to one another through past lineage relationships, and are useful tools for studying developmental and somatic processes. However, with existing technologies, it has not been possible to generate detailed fate maps of complex organisms such as the mouse. We and others have therefore proposed a novel approach, "phylogenetic fate mapping," where patterns of somatic mutation carried by the individual cells of an animal are used to retrospectively deduce lineage relationships through phylogenetic inference. Here, we have cataloged genomic polymorphisms at 324 mutation-prone polyguanine tracts for nearly 300 cells isolated from a single mouse, and have explored the cells' lineage relationships both phylogenetically and through a network-based approach. We present a model of mouse embryogenesis, where an early period of substantial cell mixing is followed by more coherent growth of clones later. We find that cells from certain tissues have greater numbers of close relatives in other specific tissues than expected from chance, suggesting that those populations arise from a similar pool of ancestral lineages. Finally, we have investigated the dynamics of cell turnover (the frequency of cell loss and replacement) in postnatal tissues. This work offers a longitudinal study of developmental lineages, from conception to adulthood, and provides insight into basic questions of mouse embryology as well as the somatic processes that occur after birth.

4.
Mutations are an inevitable consequence of cell division. Just as DNA sequence differences allow evolutionary relationships between organisms to be inferred, we and others have recently demonstrated how somatic mutations may be exploited for phylogenetically reconstructing lineages of individual cells during development in multicellular organisms. However, a problem with such "phylogenetic fate maps" is that they cannot be verified experimentally; distinguishing actual lineages within clonal populations requires direct observation of cell growth, as was used to construct the fate map of Caenorhabditis elegans, but this is not possible in higher organisms. Here we employ computer simulation of mitotic cell division to determine how factors such as the quantity of cells, mutation rate, and the number of examined marker sequences contribute to the fidelity of phylogenetic fate maps and to explore statistical methods for assessing accuracy. To experimentally evaluate these factors, as well as for the purpose of investigating the developmental origins of connective tissue, we have produced a lineage map of fibroblasts harvested from various organs of an adult mouse. Statistical analysis demonstrates that the inferred relationships between cells in the phylogenetic fate map reflect biological information regarding the origin of fibroblasts and are suggestive of cell migration during mesenchymal development.
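The simulation idea in this abstract can be illustrated with a toy model. The sketch below is not the authors' simulator: it assumes a fully balanced binary lineage tree, a stepwise marker model in which each marker shifts by +1 or -1 with probability `mu` per division, and a simple sum-of-absolute-differences distance. All function names and parameters are hypothetical.

```python
import random


def simulate_lineage(depth, n_markers, mu, seed=0):
    """Grow a balanced binary lineage tree by repeated mitotic division.

    Each cell carries n_markers repeat-length offsets (0 in the zygote).
    At every division, each marker in each daughter shifts by +1 or -1
    with probability mu (assumed stepwise mutation model).
    Returns the leaf genotypes; leaves 2k and 2k+1 are sister cells.
    """
    rng = random.Random(seed)
    cells = [[0] * n_markers]
    for _ in range(depth):
        next_generation = []
        for genotype in cells:
            for _daughter in range(2):
                child = [
                    g + rng.choice((-1, 1)) if rng.random() < mu else g
                    for g in genotype
                ]
                next_generation.append(child)
        cells = next_generation
    return cells


def distance(a, b):
    """Sum of absolute marker differences between two genotypes."""
    return sum(abs(x - y) for x, y in zip(a, b))


def divisions_since_split(i, j):
    """Divisions separating leaf i (or j) from their last common ancestor."""
    return (i ^ j).bit_length()


def mean_distance_by_split(cells):
    """Average genetic distance for leaf pairs, grouped by true split depth."""
    totals, counts = {}, {}
    for i in range(len(cells)):
        for j in range(i + 1, len(cells)):
            d = divisions_since_split(i, j)
            totals[d] = totals.get(d, 0) + distance(cells[i], cells[j])
            counts[d] = counts.get(d, 0) + 1
    return {d: totals[d] / counts[d] for d in sorted(totals)}
```

Calling `mean_distance_by_split(simulate_lineage(6, 100, 0.05))` and varying `depth`, `n_markers`, and `mu` gives a rough feel for how the number of cells, the marker count, and the mutation rate affect how well genetic distance tracks true lineage distance.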

5.
Single cell genomics performed on individual human subjects' tumors, neural tissues, and sperm samples has revealed genetic heterogeneity arising through mutations in exomes; deletions, recombinations, and duplications of DNA sequences; and aneuploidy. These genetic changes happen during cell cycles followed by cell division. This review focuses strictly on single cell human genomics and aims to deliver information that can help refine fundamental knowledge of the genetic causes of cellular heterogeneity in both healthy and disease states. Allogenic heterogeneity, as well as heterogeneity among cells possessing the same genome but different gene expression patterns, is not the subject of this review. Future research still requires: a) improved methods for complete and error-free DNA acquisition and sequencing of the whole genome rather than only selected parts, and b) analyses of more samples that contain millions of cells. These data will deliver a more precise comparative representation of genetic diversity among single cells in an individual human subject. Consequently, we will be able to better distinguish between the roles of genetic, epigenetic, and stochastic factors in the cellular diversity of the more than 30 trillion cells present in a human body.

6.
Accurate discovery of somatic mutations in a cell is a challenge that partly lies in the immaturity of dedicated analytical approaches. Approaches comparing a cell’s genome to a control bulk sample miss common mutations, while approaches to find such mutations from bulk suffer from low sensitivity. We developed a tool, All2, which enables accurate filtering of mutations in a cell without the need for data from bulk(s). It is based on pair-wise comparisons of all cells to each other where every call for base pair substitution and indel is classified as either a germline variant, mosaic mutation, or false positive. As All2 allows for considering dropped-out regions, it is applicable to whole genome and exome analysis of cloned and amplified cells. By applying the approach to a variety of available data, we showed that its application reduces false positives, enables sensitive discovery of high frequency mutations, and is indispensable for conducting high resolution cell lineage tracing.

7.
8.
A theoretical model is developed of the fate of mutations for organisms with such life-history characteristics as indeterminate growth and clonal reproduction. It focuses on how the fate of a particular mutant depends on whether it arises during mitotic cell division (somatic mutation) or during meiotic cell division (meiotic mutation). At gamete production, individuals carrying somatic mutations will produce some proportion of gametes reflecting the original, zygotic genotype and some proportion reflecting genotypes carrying the somatic mutation. Focusing on allele frequencies at gamete production allows the effects of growth and clonal reproduction to be summarized. The relative strengths of somatic and meiotic mutation can be determined, as well as the conditions under which the change in allele frequency due to one is greater than that due to the other. Examples from a published demographic study of clonal corals are used to compare somatic and meiotic mutation. When there is no selection acting on either type of mutation, only a few cell divisions per time unit on average are needed for the change in allele frequency due to somatic mutation to be greater, given empirically based mutation rates. When somatic selection is added, the most dramatic effect is seen with fairly strong negative selection acting against the somatic mutation within individuals. In this case, selection within organisms can effectively counteract the effects of somatic mutation, and the change in allele frequency due to somatic mutations will not be greater than that due to meiotic mutations for reasonable numbers of within-generation cell divisions. The majority of the mutation load, which would have been due to somatic mutation, is purged by selection within the individual organism.

9.
Variation at nuclear- and chloroplast-encoded microsatellite loci was studied among and within clonally propagated individuals of Eastern white pine. Total DNA was extracted and assayed from gamete-bearing tissue (megagametophytes) located on six different branch positions on each of 12 individual genets. No within-individual variation was observed among 12 loci studied. Estimates of numbers of mitotic cell divisions required to produce the tissue used as the source of genomic DNA were obtained by combining tree growth and anatomical data. This allowed for the calculation of upper bound estimates of numbers of mutations per locus per somatic cell division. The estimated somatic mutation rate was found to be substantially lower than those published for genomic microsatellite mutation rates in other plant species.
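When no mutations are observed, an upper bound on the per-division rate can still be derived. The sketch below shows one standard way to do this and is an assumption, not necessarily the calculation used in the study: if each of N independent locus-divisions mutates with probability mu, then observing zero mutations has probability (1 - mu)^N, and the one-sided upper confidence bound is the mu at which that probability falls to 1 - confidence. The function name and the example inputs are hypothetical.

```python
def zero_event_upper_bound(n_trials, confidence=0.95):
    """Upper confidence bound on a per-trial event rate when 0 events were seen.

    Solves (1 - mu) ** n_trials == 1 - confidence for mu, where n_trials
    is the total number of independent opportunities for an event (here,
    loci multiplied by the estimated number of cell divisions).
    """
    if n_trials <= 0:
        raise ValueError("n_trials must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    return 1.0 - (1.0 - confidence) ** (1.0 / n_trials)


# Hypothetical inputs for illustration only: 12 loci, 1000 divisions each.
example_bound = zero_event_upper_bound(12 * 1000)
```

For large `n_trials` the 95% bound is close to the familiar "rule of three" value 3 / n_trials, which makes a quick sanity check on the estimate.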

10.
Advances in plant chromosome identification and cytogenetic techniques   Total citations: 7 (self-citations: 0, citations by others: 7)
Recent developments that improve our ability to distinguish slightly diverged genomes from each other, as well as to distinguish each of the nonhomologous chromosomes within a genome, add a new dimension to the study of plant genomics. Differences in repetitive sequences among different species have been used to develop multicolor fluorescent in situ hybridization techniques that can define the components of allopolyploids in detail and reveal introgression between species. Bacterial artificial chromosome probes and repetitive sequence arrays have been used to distinguish each of the nonhomologous somatic chromosomes within a species. Such karyotype analysis opens new avenues for the study of chromosomal variation and behavior, as well as for the localization of individual genes and transgenes to genomic position.

11.
12.
Somatic mutations and aging: a re-evaluation   Total citations: 14 (self-citations: 0, citations by others: 14)
Vijg J. Mutation Research, 2000, 447(1): 117-135
Aging has been explained in terms of an accumulation of mutations in the genome of somatic cells, leading to tissue atrophy and neoplasms, as well as increased loss of function. Recent advances in transgenic mouse modeling and genomics technology have created, for the first time, the opportunity to begin testing this theory. In this paper the existing evidence for a possible role of somatic mutation accumulation in aging will be re-evaluated on the basis of the evolutionary logic of aging and recent insights in genome structure and function. New strategies for investigating the relationship between genome instability, mutation accumulation and aging will be discussed.

13.
14.
The ability to segregate a committed germ stem cell (GSC) lineage distinct from somatic cell lineages is a characteristic of bilaterian Metazoans. However, the occurrence of GSC lineage specification in basally branching Metazoan phyla, such as Cnidaria, is uncertain. Without an independently segregated GSC lineage, germ cells and their precursors must be specified throughout adulthood from continuously dividing somatic stem cells, generating the risk of propagating somatic mutations within the individual and its gametes. To address the potential for existence of a GSC lineage in Anthozoa, the sister-group to all remaining Cnidaria, we identified moderate- to high-frequency somatic mutations and their potential for gametic transfer in the long-lived coral Orbicella faveolata (Anthozoa, Cnidaria) using a 2b-RAD sequencing approach. Our results demonstrate that somatic mutations can drift to high frequencies (up to 50%) and can also generate substantial intracolonial genetic diversity. However, these somatic mutations are not transferable to gametes, signifying the potential for an independently segregated GSC lineage in O. faveolata. In conjunction with previous research on germ cell development in other basally branching Metazoan species, our results suggest that the GSC system may be a Eumetazoan characteristic that evolved in association with the emergence of greater complexity in animal body plan organization and greater specificity of stem cell functions.

15.
Recently, lineage tracing technology using CRISPR/Cas9 genome editing has enabled simultaneous readouts of gene expression and lineage barcodes, which allows for the reconstruction of the cell division tree and makes it possible to reconstruct ancestral cell types and trace the origin of each cell type. Meanwhile, trajectory inference methods are widely used to infer cell trajectories and pseudotime in a dynamic process using gene expression data of present-day cells. Here, we present TedSim (single-cell temporal dynamics simulator), which simulates the cell division events from the root cell to present-day cells, simultaneously generating two data modalities for each single cell: the lineage barcode and gene expression data. TedSim is a framework that connects the two problems: lineage tracing and trajectory inference. Using TedSim, we conducted analysis to show that (i) TedSim generates realistic gene expression and barcode data, as well as realistic relationships between these two data modalities; (ii) trajectory inference methods can recover the underlying cell state transition mechanism with balanced cell type compositions; and (iii) integrating gene expression and barcode data can provide more insights into the temporal dynamics in cell differentiation compared to using only one type of data, but better integration methods need to be developed.

16.
Research over the past two decades has made substantial inroads into our understanding of somatic mutations. Recently, these studies have focused on understanding their presence in homeostatic tissue. In parallel, agent-based mechanistic models have emerged as an important tool for understanding somatic mutation in tissue; yet no common methodology currently exists to provide base-pair resolution data for these models. Here, we present Gattaca as the first method for introducing and tracking somatic mutations at base-pair resolution within agent-based models that typically lack nuclei. With nuclei that incorporate human reference genomes, mutational context, and sequence coverage/error information, Gattaca is able to realistically evolve sequence data, facilitating comparisons between in silico cell tissue modeling and experimental human somatic mutation data. This user-friendly method, incorporated into each in silico cell, allows us to fully capture somatic mutation spectra and evolution.

17.
During the several-week course of an immune response, B cells undergo a process of clonal expansion, somatic hypermutation of the immunoglobulin (Ig) genes and affinity-dependent selection. Over a lifetime, each B cell may participate in multiple rounds of affinity maturation as part of different immune responses. These two time-scales for selection are apparent in the structure of B-cell lineage trees, which often contain a ‘trunk’ consisting of mutations that are shared across all members of a clone, and several branches that form a ‘canopy’ consisting of mutations that are shared by a subset of clone members. The influence of affinity maturation on the B-cell population can be inferred by analysing the pattern of somatic mutations in the Ig. While global analysis of mutation patterns has shown evidence of strong selection pressures shaping the B-cell population, the effect of different time-scales of selection and diversification has not yet been studied. Analysis of B cells from blood samples of three healthy individuals identifies a range of clone sizes with lineage trees that can contain long trunks and canopies indicating the significant diversity introduced by the affinity maturation process. We here show that observed mutation patterns in the framework regions (FWRs) are determined by an almost purely purifying selection on both short and long time-scales. By contrast, complementarity determining regions (CDRs) are affected by a combination of purifying and antigen-driven positive selection on the short term, which leads to a net positive selection in the long term. In both the FWRs and CDRs, long-term selection is strongly dependent on the heavy chain variable gene family.

18.
Myofiber cultures give rise to myogenic as well as to non-myogenic cells. Whether these myofiber-associated non-myogenic cells develop from resident stem cells that possess mesenchymal plasticity or from other stem cells such as mesenchymal stem cells (MSCs) remains unresolved. To address this question, we applied a method for reconstructing cell lineage trees from somatic mutations to MSCs and myogenic and non-myogenic cells from individual myofibers that were cultured at clonal density. Our analyses show that (i) in addition to myogenic progenitors, myofibers also harbor non-myogenic progenitors of a distinct, yet close, lineage; (ii) myofiber-associated non-myogenic and myogenic cells share the same muscle-bound primordial stem cells of a lineage distinct from bone marrow MSCs; (iii) these muscle-bound primordial stem cells first part to individual muscles and then differentiate into myogenic and non-myogenic stem cells.

19.
While the bulk of the finished microbial genomes sequenced to date are derived from cultured bacterial and archaeal representatives, the vast majority of microorganisms elude current culturing attempts, severely limiting the ability to recover complete or even partial genomes from these environmental species. Single cell genomics is a novel culture-independent approach, which enables access to the genetic material of an individual cell. No single cell genome has to our knowledge been closed and finished to date. Here we report the completed genome from an uncultured single cell of Candidatus Sulcia muelleri DMIN. Digital PCR on single symbiont cells isolated from the bacteriome of the green sharpshooter Draeculacephala minerva allowed us to determine that this bacterium is polyploid, with genome copies ranging from approximately 200–900 per cell, making it a most suitable target for single cell finishing efforts. For single cell shotgun sequencing, an individual Sulcia cell was isolated and whole genome amplified by multiple displacement amplification (MDA). Sanger-based finishing methods allowed us to close the genome. To verify the correctness of our single cell genome and exclude MDA-derived artifacts, we independently shotgun sequenced and assembled the Sulcia genome from pooled bacteriomes using a metagenomic approach, yielding a nearly identical genome. Four variations we detected appear to be genuine biological differences between the two samples. Comparison of the single cell genome with bacteriome metagenomic sequence data detected two single nucleotide polymorphisms (SNPs), indicating extremely low genetic diversity within a Sulcia population. This study demonstrates the power of single cell genomics to generate a complete, high quality, non-composite reference genome within an environmental sample, which can be used for population genetic analyses.

20.
A key challenge in genomics is to identify genetic variants that distinguish patients with different survival time following diagnosis or treatment. While the log-rank test is widely used for this purpose, nearly all implementations of the log-rank test rely on an asymptotic approximation that is not appropriate in many genomics applications. This is because the two populations determined by a genetic variant may have very different sizes, and because the evaluation of many possible variants demands highly accurate computation of very small p-values. We demonstrate this problem for cancer genomics data where the standard log-rank test leads to many false positive associations between somatic mutations and survival time. We develop and analyze a novel algorithm, Exact Log-rank Test (ExaLT), that accurately computes the p-value of the log-rank statistic under an exact distribution that is appropriate for populations of any size. We demonstrate the advantages of ExaLT on data from published cancer genomics studies, finding significant differences from the reported p-values. We analyze somatic mutations in six cancer types from The Cancer Genome Atlas (TCGA), finding mutations with known association to survival as well as several novel associations. In contrast, standard implementations of the log-rank test report dozens to hundreds of likely false positive associations as more significant than these known associations.
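To make the contrast between asymptotic and exact p-values concrete, the sketch below computes a log-rank p-value by brute-force enumeration of every way to assign the variant label to patients. This is not the ExaLT algorithm, whose details are not given in the abstract; it is a naive illustration that is feasible only for very small cohorts. It also makes two simplifying assumptions: it uses the unstandardised observed-minus-expected numerator as the test statistic, and it treats all relabellings as equally likely. All function names are hypothetical.

```python
from itertools import combinations


def logrank_o_minus_e(times, events, in_group):
    """Observed minus expected events in the flagged group.

    times: follow-up time per patient; events: True if the event was
    observed (False if censored); in_group: True for variant carriers.
    """
    total = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        deaths = [i for i in at_risk if times[i] == t and events[i]]
        n1 = sum(1 for i in at_risk if in_group[i])
        d1 = sum(1 for i in deaths if in_group[i])
        total += d1 - len(deaths) * n1 / len(at_risk)
    return total


def exact_permutation_pvalue(times, events, in_group):
    """Two-sided p-value from full enumeration of group relabellings."""
    n = len(times)
    k = sum(1 for flag in in_group if flag)
    observed = abs(logrank_o_minus_e(times, events, in_group))
    hits = 0
    count = 0
    for subset in combinations(range(n), k):
        chosen = set(subset)
        labels = [i in chosen for i in range(n)]
        stat = abs(logrank_o_minus_e(times, events, labels))
        if stat >= observed - 1e-12:  # tolerance for floating-point ties
            hits += 1
        count += 1
    return hits / count
```

The enumeration visits C(n, k) relabellings, so its cost explodes as cohorts grow; that scaling is the reason a dedicated exact algorithm is needed in practice.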
