Similar articles
 Found 20 similar articles; search took 15 ms
1.
Somatic variant analysis of a tumour sample and its matched normal has been widely used in cancer research to distinguish germline polymorphisms from somatic mutations. However, due to the extensive intratumour heterogeneity of cancer, sequencing data from a single tumour sample may greatly underestimate the overall mutational landscape. In recent studies, multiple spatially or temporally separated tumour samples from the same patient were sequenced to identify the regional distribution of somatic mutations and study intratumour heterogeneity. There are a number of tools to perform somatic variant calling from matched tumour-normal next-generation sequencing (NGS) data; however, none of these allows joint analysis of multiple same-patient samples. We discuss the benefits and challenges of multisample somatic variant calling and present multiSNV, a software package for calling single nucleotide variants (SNVs) using NGS data from multiple same-patient samples. Instead of performing multiple pairwise analyses of a single tumour sample and a matched normal, multiSNV jointly considers all available samples under a Bayesian framework to increase sensitivity of calling shared SNVs. By leveraging information from all available samples, multiSNV is able to detect rare mutations with variant allele frequencies down to 3% from whole-exome sequencing experiments.

2.
The molecular diagnosis of muscle disorders is challenging: genetic heterogeneity (>100 causal genes for skeletal and cardiac muscle disease) precludes exhaustive clinical testing, prioritizing sequencing of specific genes is difficult due to the similarity of clinical presentation, and the number of variants returned through exome sequencing can make the identification of the disease-causing variant difficult. We have filtered variants found through exome sequencing by prioritizing variants in genes known to be involved in muscle disease while examining the quality and depth of coverage of those genes. We ascertained two families with autosomal dominant limb-girdle muscular dystrophy of unknown etiology. To identify the causal mutations in these families, we performed exome sequencing on five affected individuals using the Agilent SureSelect Human All Exon 50 Mb kit and the Illumina HiSeq 2000 (2×100 bp). We identified causative mutations in desmin (IVS3+3A>G) and filamin C (p.W2710X), and augmented the phenotype data for individuals with muscular dystrophy due to these mutations. We also discuss challenges encountered due to depth of coverage variability at specific sites and the annotation of a functionally proven splice site variant as an intronic variant.

3.

Background

Significant clinical and research applications are driving large-scale adoption of individualized tumor sequencing in cancer in order to identify tumor-specific mutations. When a matched germline sample is available, somatic mutations may be identified using comparative callers. However, matched germline samples are frequently not available, such as with archival tissues, which makes it difficult to distinguish somatic from germline variants. While population databases may be used to filter out known germline variants, recent studies have shown that private germline variants result in an inflated false positive rate in unmatched tumor samples, and the number of germline false positives in an individual may be related to ancestry.

Methods

First, we examined the relationship between the germline false positives and ancestry. Then we developed and implemented a tumor only caller (LumosVar) that leverages differences in allelic frequency between somatic and germline variants in impure tumors. We used simulated data to systematically examine how copy number alterations, tumor purity, and sequencing depth should affect the sensitivity of our caller. Finally, we evaluated the caller on real data.

Results

We find the germline false-positive rate is significantly higher for individuals of non-European ancestry, largely due to the limited diversity in public polymorphism databases and due to population-specific characteristics such as admixture or recent expansions. Our Bayesian tumor only caller (LumosVar) is able to greatly reduce false positives from private germline variants, and our sensitivity is similar to predictions based on simulated data.

Conclusions

Taken together, our results suggest that studies of individuals of non-European ancestry would most benefit from our approach. However, high sensitivity requires sufficiently impure tumors and adequate sequencing depth. Even in impure tumors, there are copy number alterations that result in germline and somatic variants having similar allele frequencies, limiting the sensitivity of the approach. We believe our approach could greatly improve the analysis of archival samples in a research setting where the normal is not available.

4.
De novo mutations are recognized both as an important source of genetic variation and as a prominent cause of sporadic disease in humans. Mutations identified as de novo are generally assumed to have occurred during gametogenesis and, consequently, to be present as germline events in an individual. Because Sanger sequencing does not provide the sensitivity to reliably distinguish somatic from germline mutations, the proportion of de novo mutations that occur somatically rather than in the germline remains largely unknown. To determine the contribution of post-zygotic events to de novo mutations, we analyzed a set of 107 de novo mutations in 50 parent-offspring trios. Using four different sequencing techniques, we found that 7 (6.5%) of these presumed germline de novo mutations were in fact present as mosaic mutations in the blood of the offspring and were therefore likely to have occurred post-zygotically. Furthermore, genome-wide analysis of de novo variants in the proband led to the identification of 4/4,081 variants that were also detectable in the blood of one of the parents, implying parental mosaicism as the origin of these variants. Thus, our results show that an important fraction of de novo mutations presumed to be germline in fact occurred either post-zygotically in the offspring or were inherited as a consequence of low-level mosaicism in one of the parents.
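The 6.5% figure in the abstract above follows directly from its counts (7 of 107). The sketch below reproduces that arithmetic and also illustrates one common way deep-sequencing read counts could be used to flag a candidate mosaic variant: a one-sided exact binomial test of whether the alternate-allele fraction falls below the 50% expected for a heterozygous germline variant. The test itself, the function names, the significance threshold and the example read counts are assumptions made for illustration; the abstract does not describe the authors' statistical procedure.

```python
from math import comb


def binomial_lower_tail(alt_reads: int, depth: int, p: float = 0.5) -> float:
    """Exact P(X <= alt_reads) for X ~ Binomial(depth, p).

    Adequate for modest read depths (a few hundred reads); very deep data
    would need a log-space or library implementation.
    """
    if not 0 <= alt_reads <= depth:
        raise ValueError("alt_reads must lie between 0 and depth")
    return sum(
        comb(depth, k) * p**k * (1 - p) ** (depth - k)
        for k in range(alt_reads + 1)
    )


def looks_mosaic(alt_reads: int, depth: int, alpha: float = 0.01) -> bool:
    """Flag a variant whose alternate-allele count is significantly below 50%.

    alpha is a hypothetical threshold chosen for this example only.
    """
    return binomial_lower_tail(alt_reads, depth) < alpha


# Proportion reported in the abstract: 7 mosaic out of 107 de novo mutations.
mosaic_percent = 100 * 7 / 107
print(f"{mosaic_percent:.1f}% of presumed germline de novo mutations were mosaic")

# Hypothetical read counts: 20/100 alternate reads versus 48/100.
print(looks_mosaic(20, 100), looks_mosaic(48, 100))
```

A variant supported by 20 of 100 reads is flagged, while one supported by 48 of 100 is compatible with a heterozygous germline call. In practice, sequencing error, mapping bias and copy number would also need to be modelled.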

5.
Next-generation sequencing (NGS) has enabled the high-throughput discovery of germline and somatic mutations. However, NGS-based variant detection is still prone to errors, resulting in inaccurate variant calls. Here, we categorized the variants detected by NGS according to total read depth (TD) and SNP quality (SNPQ), and performed Sanger sequencing with 348 selected non-synonymous single nucleotide variants (SNVs) for validation. Using the SAMtools and GATK algorithms, the validation rate was positively correlated with SNPQ but showed no correlation with TD. In addition, common variants called by both programs had a higher validation rate than caller-specific variants. We further examined several parameters to improve the validation rate, and found that strand bias (SB) was a key parameter. SB in NGS data showed a strong difference between the variants passing validation and those that failed validation, showing a validation rate of more than 92% (filtering cutoff value: alternate allele forward [AF]≥20 and AF<80 in SAMtools, SB<–10 in GATK). Moreover, the validation rate increased significantly (up to 97–99%) when the variant was filtered together with the suggested values of mapping quality (MQ), SNPQ and SB. This detailed and systematic study provides comprehensive recommendations for improving validation rates, saving time and lowering cost in NGS analyses.
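The SAMtools strand-bias cutoff quoted above (AF≥20 and AF<80) can be sketched as a simple filter. The sketch assumes that AF is the percentage of alternate-allele reads that map to the forward strand, which is one plausible reading of the abstract; the `Variant` class and its field names are hypothetical and do not correspond to any SAMtools output format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variant:
    """Hypothetical container for per-strand alternate-allele read counts."""
    name: str
    alt_forward: int  # alternate-allele reads on the forward strand
    alt_reverse: int  # alternate-allele reads on the reverse strand


def alt_forward_percent(v: Variant) -> Optional[float]:
    """Percentage of alternate-allele reads on the forward strand."""
    total = v.alt_forward + v.alt_reverse
    if total == 0:
        return None  # no alternate reads, so strand balance is undefined
    return 100.0 * v.alt_forward / total


def passes_strand_filter(v: Variant, low: float = 20.0, high: float = 80.0) -> bool:
    """Keep variants with low <= AF < high, as in the quoted cutoff."""
    pct = alt_forward_percent(v)
    return pct is not None and low <= pct < high


calls = [
    Variant("balanced", 10, 10),      # 50% forward
    Variant("forward_only", 19, 1),   # 95% forward
    Variant("edge_low", 2, 8),        # exactly 20% forward
]
kept = [v.name for v in calls if passes_strand_filter(v)]
print(kept)
```

Variants whose supporting reads come almost entirely from one strand are dropped, which matches the abstract's observation that strand bias separates validated from non-validated calls.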

6.
We sequenced 11 germline exomes from five families with familial pancreatic cancer (FPC). One proband had a germline nonsense variant in ATM with somatic loss of the variant allele. Another proband had a nonsense variant in PALB2 with somatic loss of the variant allele. Both variants were absent in a relative with FPC. These findings question the causal mechanisms of ATM and PALB2 in these families and highlight challenges in identifying the causes of familial cancer syndromes using exome sequencing.

7.

Background

Target enrichment and resequencing is a widely used approach for identification of cancer genes and genetic variants associated with diseases. Although cost effective compared to whole genome sequencing, analysis of many samples constitutes a significant cost, which could be reduced by pooling samples before capture. Another limitation to the number of cancer samples that can be analyzed is often the amount of available tumor DNA. We evaluated the performance of whole genome amplified DNA and the power to detect subclonal somatic single nucleotide variants in non-indexed pools of cancer samples using the HaloPlex technology for target enrichment and next generation sequencing.

Results

We captured a set of 1528 putative somatic single nucleotide variants and germline SNPs, which were identified by whole genome sequencing, with the HaloPlex technology and sequenced to a depth of 792–1752. We found that the allele fractions of the analyzed variants are well preserved during whole genome amplification and that capture specificity or variant calling is not affected. We detected a large majority of the known single nucleotide variants present uniquely in one sample with allele fractions as low as 0.1 in non-indexed pools of up to ten samples. We also identified and experimentally validated six novel variants in the samples included in the pools.

Conclusion

Our work demonstrates that whole genome amplified DNA can be used for target enrichment equally well as genomic DNA and that accurate variant detection is possible in non-indexed pools of cancer samples. These findings show that analysis of a large number of samples is feasible at low cost, even when only small amounts of DNA are available, and thereby significantly increases the chances of identifying recurrent mutations in cancer samples.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-14-856) contains supplementary material, which is available to authorized users.

8.
Inherited colorectal cancers (CRC), such as hereditary nonpolyposis colorectal cancer (HNPCC), Gardner syndrome and familial adenomatous polyposis (FAP), account for about 20% of CRC cases. A four-generation Han Chinese family was found to be affected with colonic polyposis. As inferred from the pedigree structure, the disease in this family showed an autosomal dominant inheritance pattern. To locate the causal mutations in this family, genomic DNA was extracted and next-generation sequencing of 5 genes related to colon cancer was performed on an Ion Torrent Personal Genome Machine with a 314 chip. The reads were aligned to the human reference genome hg19 to call variants in the 5 genes. After analysis, 14 variants were detected in the sequenced sample, 13 of which had been recorded in the dbSNP database and assigned an rs identification number. Of these variants, 9 were synonymous, 4 missense and 1 nonsense. Among them, 2 rare variants (c.694C>T in APC and c.1690A>G in MSH2) might be the putative causal mutations for FAP, given the rarity of the mutated alleles in normal controls. c.694C>T was detected only in affected members and generated a premature stop codon in APC. It is presumably a de novo germline mutation that makes APC transcripts containing this stop codon targets for nonsense-mediated mRNA decay (NMD). c.1690A>G in MSH2 was detected not only in affected members but also in unaffected members of the family. Functional prediction indicated that the amino acid change caused by this variant had no effect on the function of MSH2. Here, we report a de novo germline mutation of APC as the causal variant in a Chinese family with inherited colon cancer, identified by next-generation sequencing.

9.
Understanding the genetic causes of neurodegenerative disease (ND) can be useful for their prevention and treatment. Among the genetic variations responsible for ND, heritable germline variants have been discovered in genome-wide association studies (GWAS), and nonheritable somatic mutations have been discovered in sequencing projects. Distinguishing the important initiating genes in ND and comparing the importance of heritable and nonheritable genetic variants for treating ND are important challenges. In this study, we analysed GWAS results, somatic mutations and drug targets of ND from large databanks by performing directed network-based analysis considering a randomised network hypothesis testing procedure. A disease-associated biological network was created in the context of the functional interactome, and the nonrandom topological characteristics of directed-edge classes were interpreted. Hierarchical network analysis indicated that drug targets tend to lie upstream of somatic mutations and germline variants. Furthermore, using directed path length information and biological explanations, we provide information on the most important genes in these created node classes and their associated drugs. Finally, we identified nine germline variants overlapping with drug targets for ND, seven somatic mutations close to drug targets from the hierarchical network analysis and six crucial genes in controlling other genes from the network analysis. Based on these findings, some drugs have been proposed for treating ND via drug repurposing. Our results provide new insights into the therapeutic actionability of GWAS results and somatic mutations for ND. The interesting properties of each node class and the existing relationships between them can broaden our knowledge of ND.

10.
Next-generation sequencing technologies have revolutionized our ability to identify genetic variants, either germline or somatic point mutations, that occur in cancer. Parallelization and miniaturization of DNA sequencing enable massive data throughput and, for the first time, large-scale, nucleotide-resolution views of cancer genomes can be achieved. Systematic, large-scale sequencing surveys have revealed that the genetic spectrum of mutations in cancers appears to be highly complex, with numerous low-frequency bystander somatic variations and a limited number of common, frequently mutated genes. Large sample sizes and deeper resequencing are much needed to resolve the clinical and biological relevance of the mutations as well as to detect somatic variants in heterogeneous samples and cancer cell sub-populations. However, even with next-generation sequencing technologies, the overwhelming size of the human genome and the need for very high fold coverage represent a major challenge for up-scaling cancer genome sequencing projects. Assays to target, capture, enrich or partition disease-specific regions of the genome offer immediate solutions for reducing the complexity of the sequencing libraries. Integration of targeted DNA capture assays and next-generation deep resequencing improves the ability to identify clinically and biologically relevant mutations.

11.
We determined the frequency and types of K-ras mutations in colorectal and lung cancer. The ADx-K-ras kit (real-time/double-loop probe PCR) was used to detect somatic tumor gene mutations and was compared with Sanger DNA sequencing using 583 colorectal and 244 lung cancer paraffin-embedded clinical samples. Genomic DNA was used in both methods; mutation rates at codons 12/13 and the frequency of each mutation were detected and compared. The data show that 91.4% of colorectal and 59.0% of lung carcinoma samples were detected conclusively by DNA sequencing, whereas 100% of colorectal and lung samples were detected by the ADx-K-ras kit. K-ras gene mutations were detected in 32.9% and 27.4% of colorectal samples by the kit and sequencing methods, respectively, and in 10.6% and 8.3% of lung cancer samples by the kit and sequencing methods, respectively. Notably, 172/677 samples showed mutations and 467/677 showed wild type by both methods; 38 samples showed mutations with the kit but wild type with sequencing. Mutations in colorectal samples were as follows: GGT → GAT/codon-12 (35.1%); GGC → GAC/codon-13 (26.6%); GGT → GTT/codon-12 (18.2%); and GGT → GCT/codon-12 (1.6%). Mutations in lung samples were as follows: GGT → GTT/codon-12 (40.9%) and GGT → GCT/codon-12 (4.5%). In conclusion, K-ras mutations involved 32.2% of colorectal and 10.6% of lung samples in this cohort. ADx-K-ras real-time PCR showed higher detection rates (P < 0.05). The kit method has good clinical applicability as it is simple, fast and less prone to contamination, and hence can be used effectively and reliably for clinical screening of somatic tumor gene mutations.
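The paired counts in the abstract above (172 mutant by both methods, 467 wild type by both, 38 mutant by kit only, out of 677) are enough to work out how often the two methods agree. The sketch below computes overall agreement and Cohen's kappa. Two points are assumptions rather than reported results: the fourth cell (mutant by sequencing only) is taken to be 0 because 677 − 172 − 467 − 38 = 0, and kappa is an added summary statistic that the abstract does not report.

```python
def agreement_and_kappa(both_pos: int, both_neg: int,
                        kit_only: int, seq_only: int) -> tuple[float, float]:
    """Overall agreement and Cohen's kappa for a 2x2 method comparison."""
    n = both_pos + both_neg + kit_only + seq_only
    observed = (both_pos + both_neg) / n

    # Marginal positive counts for each method.
    kit_pos = both_pos + kit_only
    seq_pos = both_pos + seq_only

    # Agreement expected by chance if the two methods were independent.
    expected = (kit_pos * seq_pos + (n - kit_pos) * (n - seq_pos)) / n**2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Counts from the abstract; seq_only = 0 is inferred from the total of 677.
observed, kappa = agreement_and_kappa(both_pos=172, both_neg=467,
                                      kit_only=38, seq_only=0)
print(f"agreement = {observed:.1%}, kappa = {kappa:.3f}")
```

Under these assumptions the methods agree on 639 of 677 samples, and every disagreement is a kit-positive, sequencing-negative call.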

12.
Targeted mutagenesis is one of the key methods for functional gene analysis. A simplified variant of gene targeting uses direct microinjection of custom-designed Zinc Finger Nuclease (ZFN) mRNAs into Drosophila embryos. To evaluate the applicability of this method to gene targeting in another insect, we mutagenized the Bombyx mori epidermal color marker gene BmBLOS2, which controls the formation of uric acid granules in the larval epidermis. Our results revealed that ZFN mRNA injection is effective at inducing somatic, as well as germline, mutations in a targeted gene by non-homologous end joining (NHEJ). The ZFN-induced NHEJ mutations lack end-filling and blunt ligation products, and include mainly 7 bp or longer deletions, as well as single nucleotide insertions. These observations suggest that the B. mori double-strand break repair system relies on microhomologies rather than on a canonical ligase IV-dependent mechanism. The frequency of germline mutants in G1 was sufficient to be used for gene targeting relying on a screen based solely on molecular methods.

13.
Genomics, 2021, 113(4): 1930-1939
Gene mutation detection and the resulting precision-medicine therapy are transforming clinical practice. Here, we report the use of a custom-developed, medium-sized, pan-cancer probe panel for the detection of somatic and germline mutations. We used a hybridization capture-based NGS assay for targeted deep sequencing of all exons and selected introns of 181 key cancer driver genes, covering both inherited risks and somatic mutations. We performed paired-variant calling on tumor samples and their matched normal samples. We processed clinical patient samples of formalin-fixed, paraffin-embedded tumors (FFPE samples) and cell-free peripheral blood (cfDNA samples). We found germline mutations conferring inherited cancer risk at a rate of 9% and discovered a novel germline mutation in BRCA1. The somatic mutation rate in driver genes was 73.1%, much higher than previously reported. On recommending precision-medicine therapeutics, we achieved 91.6% for patients with FFPE samples.

14.
Goh L, Chen GB, Cutcutache I, Low B, Teh BT, Rozen S, Tan P. PLoS ONE, 2011, 6(3): e17810
Next generation sequencing technology has revolutionized the study of cancers. Through matched normal-tumor pairs, it is now possible to identify genome-wide germline and somatic mutations. The generation and analysis of the data require rigorous quality checks and filtering, and the current analytical pipeline is constantly undergoing improvements. We noted, however, that in analyzing matched pairs there is an implicit assumption that the sequenced data are matched, without any quality check such as those implemented in association studies. There are serious implications in this assumption, as identification of germline and rare somatic variants depends on the normal sample being the matched pair. Using a genetics concept for measuring relatedness between individuals, we demonstrate that the matchedness of tumor-normal pairs can be quantified and should be included as part of a quality protocol in the analysis of sequenced data. Despite the mutation changes in cancer samples, matched tumor-normal pairs are still relatively similar in sequence compared to non-matched pairs. We demonstrate that the approach can be used to assess the mutation landscape between individuals.
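The idea of quantifying "matchedness" can be illustrated with a very simple identity-by-state (IBS) similarity between two samples genotyped at the same SNP sites. This is a simplified stand-in chosen for illustration, not the relatedness measure used by the authors, which the abstract does not specify. Genotypes are assumed to be coded as alternate-allele counts (0, 1 or 2), with `None` for a missing call; the sample genotypes below are invented.

```python
from typing import Optional, Sequence

Genotype = Optional[int]  # 0, 1 or 2 alternate alleles; None if not called


def ibs_similarity(a: Sequence[Genotype], b: Sequence[Genotype]) -> float:
    """Mean proportion of alleles shared identical-by-state across sites.

    Each site scores 1.0 for identical genotypes, 0.5 when the genotypes
    differ by one allele, and 0.0 for opposite homozygotes.
    """
    if len(a) != len(b):
        raise ValueError("samples must be genotyped at the same sites")
    scores = [
        1 - abs(x - y) / 2
        for x, y in zip(a, b)
        if x is not None and y is not None
    ]
    if not scores:
        raise ValueError("no sites with calls in both samples")
    return sum(scores) / len(scores)


# Invented genotypes at six sites.
normal = [0, 1, 2, 1, 0, 2]
matched_tumor = [0, 1, 2, 1, 0, 1]    # one site altered, e.g. by LOH
unrelated_tumor = [2, 0, 0, 1, 2, 0]

print(ibs_similarity(normal, matched_tumor))
print(ibs_similarity(normal, unrelated_tumor))
```

A genuinely matched pair should score close to 1 despite somatic changes, whereas a sample swap shows up as a markedly lower score. A threshold for flagging mismatches would have to be calibrated on real data.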

15.

Background

Matched sequencing of both tumor and normal tissue is routinely used to classify variants of uncertain significance (VUS) into somatic vs. germline. However, assays used in molecular diagnostics focus on known somatic alterations in cancer genes and often only sequence tumors. Therefore, an algorithm that reliably classifies variants would be helpful for retrospective exploratory analyses. Contamination of tumor samples with normal cells results in differences in expected allelic fractions of germline and somatic variants, which can be exploited to accurately infer genotypes after adjusting for local copy number. However, existing algorithms for determining tumor purity, ploidy and copy number are not designed for unmatched short read sequencing data.

Results

We describe a methodology and corresponding open source software for estimating tumor purity, copy number, loss of heterozygosity (LOH), and contamination, and for classification of single nucleotide variants (SNVs) by somatic status and clonality. This R package, PureCN, is optimized for targeted short read sequencing data, integrates well with standard somatic variant detection pipelines, and has support for matched and unmatched tumor samples. Accuracy is demonstrated on simulated data and on real whole exome sequencing data.

Conclusions

Our algorithm provides accurate estimates of tumor purity and ploidy, even if matched normal samples are not available. This in turn allows accurate classification of SNVs. The software is provided as open source (Artistic License 2.0) R/Bioconductor package PureCN (http://bioconductor.org/packages/PureCN/).
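The Background above notes that normal-cell contamination makes germline and somatic variants differ in expected allelic fraction once local copy number is taken into account. A minimal sketch of that relationship follows, using the standard mixture model for a sample of tumor and diploid normal cells. It is an illustration under stated assumptions, not PureCN's implementation or API: the function name and parameters are hypothetical, the germline variant is assumed to be heterozygous in normal cells, and sequencing noise is ignored.

```python
def expected_allelic_fraction(purity: float, tumor_copies: int,
                              multiplicity: int, germline: bool,
                              normal_copies: int = 2) -> float:
    """Expected variant allelic fraction in a tumor/normal cell mixture.

    purity        fraction of cells that are tumor cells (0..1)
    tumor_copies  total copy number of the locus in tumor cells
    multiplicity  number of tumor copies carrying the variant allele
    germline      True for a heterozygous germline variant, False for somatic
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    if not 0 <= multiplicity <= tumor_copies:
        raise ValueError("multiplicity must be between 0 and tumor_copies")

    # Normal cells carry one variant copy only if the variant is germline.
    normal_alt = 1 if germline else 0
    alt = purity * multiplicity + (1 - purity) * normal_alt
    total = purity * tumor_copies + (1 - purity) * normal_copies
    if total == 0:
        raise ValueError("locus is absent from every cell in the mixture")
    return alt / total


# Copy-neutral locus, 60% purity: somatic and germline fractions differ.
print(expected_allelic_fraction(0.6, 2, 1, germline=False))
print(expected_allelic_fraction(0.6, 2, 1, germline=True))

# Pure tumor: the two are indistinguishable from allelic fraction alone.
print(expected_allelic_fraction(1.0, 2, 1, germline=False),
      expected_allelic_fraction(1.0, 2, 1, germline=True))
```

The last line shows why this kind of classification depends on impure tumors: at 100% purity a clonal heterozygous somatic variant and a heterozygous germline variant have the same expected fraction.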

16.
Congenital melanocytic nevi (CMN) are cutaneous malformations whose prevalence is inversely correlated with projected adult size. CMN are caused by somatic mutations, but epidemiological studies suggest that germline genetic factors may influence CMN development. In CMN patients from the U.K., genetic variants in MC1R, such as p.V92M and loss‐of‐function variants, have been previously associated with larger CMN. We analyzed the association of MC1R variants with CMN characteristics in two distinct cohorts of medium‐to‐giant CMN patients from Spain (N = 113) and from France, Norway, Canada, and the United States (N = 53), similar at the clinical and phenotypical level except for the number of nevi per patient. We found that the p.V92M or loss‐of‐function MC1R variants either alone or in combination did not correlate with CMN size, in contrast to the U.K. CMN patients. An additional case–control analysis with 259 unaffected Spanish individuals showed a higher frequency of MC1R compound heterozygous or homozygous variant genotypes in Spanish CMN patients compared to the control population (15.9% vs. 9.3%; p = .075). Altogether, this study suggests that MC1R variants are not associated with CMN size in these non‐UK cohorts. Additional studies are required to define the potential role of MC1R as a risk factor in CMN development.

17.

Background

Colorectal cancer with metastases limited to the liver (liver-limited mCRC) is a distinct clinical subset characterized by possible cure with surgery. We performed high-depth sequencing of over 750 cancer-associated genes and copy number profiling in matched primary, metastasis and normal tissues to characterize genomic progression in 18 patients with liver-limited mCRC.

Results

High depth Illumina sequencing and use of three different variant callers enable comprehensive and accurate identification of somatic variants down to 2.5% variant allele frequency. We identify a median of 11 somatic single nucleotide variants (SNVs) per tumor. Across patients, a median of 79.3% of somatic SNVs present in the primary are present in the metastasis and 81.7% of all alterations present in the metastasis are present in the primary. Private alterations are found at lower allele frequencies; a different mutational signature characterized shared and private variants, suggesting distinct mutational processes. Using B-allele frequencies of heterozygous germline SNPs and copy number profiling, we find that broad regions of allelic imbalance and focal copy number changes, respectively, are generally shared between the primary tumor and metastasis.

Conclusions

Our analyses point to high genomic concordance of primary tumor and metastasis, with a thick common trunk and smaller genomic branches in general support of the linear progression model in most patients with liver-limited mCRC. More extensive studies are warranted to further characterize genomic progression in this important clinical population.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-015-0589-1) contains supplementary material, which is available to authorized users.
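The two concordance percentages reported in this abstract (share of primary-tumor SNVs also found in the metastasis, and the reverse) are per-patient set overlaps. The sketch below shows that calculation for a single patient. The variant identifiers are invented placeholders and the function name is hypothetical; the study's actual call sets and any filtering steps are not described in the abstract.

```python
def shared_fractions(primary: set[str], metastasis: set[str]) -> tuple[float, float]:
    """Return (fraction of primary SNVs seen in the metastasis,
               fraction of metastasis SNVs seen in the primary)."""
    if not primary or not metastasis:
        raise ValueError("both call sets must contain at least one variant")
    shared = primary & metastasis
    return len(shared) / len(primary), len(shared) / len(metastasis)


# Invented variant identifiers for one hypothetical patient.
primary_calls = {"var_a", "var_b", "var_c", "var_d"}
metastasis_calls = {"var_b", "var_c", "var_d", "var_e", "var_f"}

in_met, in_primary = shared_fractions(primary_calls, metastasis_calls)
print(f"{in_met:.0%} of primary SNVs are in the metastasis")
print(f"{in_primary:.0%} of metastasis SNVs are in the primary")

# Variants private to each lesion, i.e. the "branches" of the tumor phylogeny.
print(sorted(primary_calls - metastasis_calls),
      sorted(metastasis_calls - primary_calls))
```

Taking the median of these per-patient fractions across a cohort would give summary figures of the kind quoted in the abstract.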

18.
The base excision repair gene MYH protects against damage to DNA from reactive oxygen species, which are commonly found in cigarette smoke. Inherited mutations in MYH predispose to colorectal adenomas and carcinomas that show a characteristic pattern of somatic G:C→T:A mutations in the APC gene. A similar pattern of somatic mutations in the TP53 gene is reported in smoking-related lung cancers. We therefore tested whether germline changes in MYH may also contribute to the development of lung cancer by screening for variants in 276 patients with lung carcinoma and 106 normal controls. No patients harboured truncating mutations in MYH and only a single patient was a carrier of the G382D missense mutation. We identified three common coding region (V22M, Q324H and S501F) and three intronic (157+30A>G, 462+35G>A and 1435–40G>C) variants, but none were over-represented in the patient samples, indicating that MYH variants are unlikely to predispose significantly to the risk of lung cancer.

19.
20.
Distinguishing single-nucleotide variants (SNVs) from errors in whole-genome sequences remains challenging. Here we describe a set of filters, together with a freely accessible software tool, that selectively reduce error rates and thereby facilitate variant detection in data from two short-read sequencing technologies, Complete Genomics and Illumina. By sequencing the nearly identical genomes from monozygotic twins and considering shared SNVs as 'true variants' and discordant SNVs as 'errors', we optimized thresholds for 12 individual filters and assessed which of the 1,048 filter combinations were effective in terms of sensitivity and specificity. Cumulative application of all effective filters reduced the error rate by 290-fold, facilitating the identification of genetic differences between monozygotic twins. We also applied an adapted, less stringent set of filters to reliably identify somatic mutations in a highly rearranged tumor and to identify variants in the NA19240 HapMap genome relative to a reference set of SNVs.
