Similar articles
1.
2.
3.
Exome sequencing has been widely used to detect pathogenic nonsynonymous single nucleotide variants (SNVs) in human inherited diseases. However, traditional statistical genetics methods are ineffective for analyzing exome sequencing data, owing to the large number of sequenced variants, the presence of a non-negligible fraction of pathogenic rare variants or de novo mutations, and the limited size of affected and normal populations. Indeed, widespread applications of exome sequencing call for an effective computational method for identifying causative nonsynonymous SNVs among a large number of sequenced variants. Here, we propose a bioinformatics approach called SPRING (Snv PRioritization via the INtegration of Genomic data) for identifying pathogenic nonsynonymous SNVs for a given query disease. Based on six functional effect scores calculated by existing methods (SIFT, PolyPhen2, LRT, MutationTaster, GERP and PhyloP) and five association scores derived from a variety of genomic data sources (gene ontology, protein-protein interactions, protein sequences, protein domain annotations and gene pathway annotations), SPRING calculates the statistical significance that an SNV is causative for a query disease and hence provides a means of prioritizing candidate SNVs. With a series of comprehensive validation experiments, we demonstrate that SPRING is valid for diseases whose genetic bases are either partly known or completely unknown and effective for diseases with a variety of modes of inheritance. In applications of our method to real exome sequencing data sets, we show the capability of SPRING in detecting causative de novo mutations for autism, epileptic encephalopathies and intellectual disability. We further provide an online service, the standalone software and genome-wide predictions of causative SNVs for 5,080 diseases at http://bioinfo.au.tsinghua.edu.cn/spring.
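The abstract above does not say how the eleven scores are combined into a single significance value. As a rough illustration only, the sketch below assumes each score has already been converted to a p-value and combines them with Fisher's method, then ranks candidates by the combined value. The function names and the toy candidates are hypothetical, and Fisher's method assumes independent p-values, which the six functional effect scores are unlikely to satisfy in practice.

```python
import math
from typing import Dict, List, Sequence, Tuple


def fisher_combined_pvalue(p_values: Sequence[float]) -> float:
    """Combine p-values with Fisher's method (assumes independence)."""
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    # Fisher statistic X = -2 * sum(ln p); we work with X / 2 directly.
    half_stat = -sum(math.log(p) for p in p_values)
    # Chi-square survival function with 2k degrees of freedom has the
    # closed form exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!.
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


def prioritize(candidates: Dict[str, Sequence[float]]) -> List[Tuple[str, float]]:
    """Rank candidate SNVs by combined p-value, most significant first."""
    scored = [(name, fisher_combined_pvalue(ps)) for name, ps in candidates.items()]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical candidates, each with made-up per-score p-values.
toy = {
    "snv_A": [0.01, 0.03, 0.20],
    "snv_B": [0.40, 0.55, 0.70],
}
ranking = prioritize(toy)
```

With these toy values `snv_A` ranks ahead of `snv_B`; a real pipeline would also need a multiple-testing correction across candidates.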

4.
Whether certain Epstein-Barr virus (EBV) strains are associated with the pathogenesis of nasopharyngeal carcinoma (NPC) is still an unresolved question. In the present study, the EBV genome contained in a primary NPC tumor biopsy was amplified by polymerase chain reaction (PCR) and sequenced using next-generation (Illumina) and conventional dideoxy-DNA sequencing. The EBV genome, designated HKNPC1 (GenBank accession number JQ009376), is a type 1 EBV of approximately 171.5 kb. The virus appears to be a uniform strain, in line with the accepted monoclonal nature of EBV in NPC, but is heterogeneous at 172 nucleotide positions. Phylogenetic analysis with the four published EBV strains, B95-8, AG876, GD1, and GD2, indicated that HKNPC1 was more closely related to the Chinese NPC patient-derived strains, GD1 and GD2. HKNPC1 contains 1,589 single nucleotide variations (SNVs) and 132 insertions or deletions (indels) in comparison to the reference EBV sequence (accession number NC007605). When compared to AG876, a strain derived from Ghanaian Burkitt's lymphoma, we found 322 SNVs, of which 76 were non-synonymous SNVs and were shared amongst the Chinese GD1, GD2 and HKNPC1 isolates. We observed 88 non-synonymous SNVs shared only by HKNPC1 and GD2, the only other NPC tumor-derived strain reported thus far. Non-synonymous SNVs were mainly found in the latent, tegument and glycoprotein genes. The same point mutations were found in glycoprotein (BLLF1 and BALF4) genes of GD1, GD2 and HKNPC1 strains and might affect cell type specific binding. Variations in LMP1 and EBNA3B epitopes and mutations in Cp (11404 C>T) and Qp (50134 G>C) found in GD1, GD2 and HKNPC1 could potentially affect CD8(+) T cell recognition and latent gene expression pattern in NPC, respectively. In conclusion, we showed that whole genome sequencing of EBV in NPC may facilitate discovery of previously unknown variations of pathogenic significance.

5.
6.
7.
8.
9.
Early analytical clone screening is important during Chinese hamster ovary (CHO) cell line development of biotherapeutic proteins to select a clonally derived cell line with the most favorable stability and product quality. Sensitive sequence confirmation methods using mass spectrometry have limitations in throughput and turnaround time. Next‐generation sequencing (NGS) technologies have emerged as alternatives for CHO clone analytics. We report an efficient NGS workflow applying the targeted locus amplification (TLA) strategy for genomic screening of antibody-expressing CHO clones. In contrast to previously reported RNA sequencing approaches, TLA allows for targeted sequencing of genomically integrated transgenic DNA without prior locus information, as well as robust detection of single‐nucleotide variants (SNVs) and transgenic rearrangements. During clone selection, TLA/NGS revealed CHO clones with high‐level SNVs within the antibody gene, and in another case we report the utility of TLA/NGS for identifying rearrangements at the transgenic DNA level. We also determined detection limits for SNV calling and the potential to identify clone contaminations by TLA/NGS. TLA/NGS also allows identification of genetically identical clones. In summary, we demonstrate that TLA/NGS is a robust screening method useful for routine clone analytics during cell line development, with the potential to process up to 24 CHO clones in less than 7 workdays.

10.
For the robust practice of genomic medicine, sequencing results must be compatible, regardless of the sequencing technologies and algorithms used. Presently, genome sequencing is still an imprecise science and is complicated by differences in the chemistry, coverage, alignment, and variant-calling algorithms. We identified ∼3.33 million single nucleotide variants (SNVs) and ∼3.62 million SNVs in the SJK genome using SOLiD and Illumina data, respectively. Approximately 3 million SNVs were concordant between the two platforms while 68,532 SNVs were discordant; 219,616 SNVs were SOLiD-specific and 516,080 SNVs were Illumina-specific (i.e., platform-specific). Concordant, discordant, and platform-specific SNVs were further analyzed and characterized. Overall, a large portion of heterozygous SNVs that were discordant with genotyping calls of single nucleotide polymorphism chips were highly confident. Approximately 70% of the platform-specific SNVs were located in regions containing repetitive sequences. Such platform-specificity may arise from differences between platforms, with regard to read length (36 bp and 72 bp vs. 50 bp), insert size (∼100–300 bp vs. ∼1–2 kb), sequencing chemistry (sequencing-by-synthesis using single nucleotides vs. ligation-based sequencing using oligomers), and sequencing quality. When data from the two platforms were merged for variant calling, the proportion of callable regions of the reference genome increased to 99.66%, which was 1.43% higher than the average callability of the two platforms, representing ∼40 million bases. In this study, we compared the differences in sequencing results between two sequencing platforms. Approximately 90% of the SNVs were concordant between the two platforms, yet ∼10% of the SNVs were either discordant or platform-specific, indicating that each platform had its own strengths and weaknesses. 
When data from the two platforms were merged, both the overall callability of the reference genome and the overall accuracy of the SNVs improved, demonstrating the likelihood that a re-sequenced genome can be revised using complementary data.
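The four categories reported above (concordant, discordant, and the two platform-specific groups) can be sketched as set operations over two call sets. This is a minimal illustration, assuming that "discordant" means a site called on both platforms with differing genotypes; the abstract does not define the term, and the type aliases, function name and toy calls below are hypothetical.

```python
from typing import Dict, Tuple

Variant = Tuple[str, int]    # (chromosome, 1-based position)
Calls = Dict[Variant, str]   # site -> genotype string, e.g. "A/G"


def compare_platforms(a: Calls, b: Calls) -> Dict[str, int]:
    """Count concordant, discordant and platform-specific SNV sites."""
    shared = a.keys() & b.keys()
    # Assumption: a shared site is concordant only if genotypes match.
    concordant = sum(1 for site in shared if a[site] == b[site])
    return {
        "concordant": concordant,
        "discordant": len(shared) - concordant,
        "a_specific": len(a.keys() - b.keys()),
        "b_specific": len(b.keys() - a.keys()),
    }


# Toy call sets standing in for the SOLiD and Illumina results.
solid = {("chr1", 100): "A/G", ("chr1", 250): "C/C", ("chr2", 40): "T/T"}
illumina = {("chr1", 100): "A/G", ("chr1", 250): "C/T", ("chr3", 7): "G/G"}
summary = compare_platforms(solid, illumina)
```

Here `summary` counts one concordant site, one discordant site, and one site specific to each platform.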

11.
12.
13.
14.
Genetic variants and de novo mutations in regulatory regions of the genome are typically discovered by whole-genome sequencing (WGS); however, WGS is expensive and most WGS reads come from non-regulatory regions. The Assay for Transposase-Accessible Chromatin (ATAC-seq) generates reads from regulatory sequences and could potentially be used as a low-cost ‘capture’ method for regulatory variant discovery, but its use for this purpose has not been systematically evaluated. Here we apply seven variant callers to bulk and single-cell ATAC-seq data and evaluate their ability to identify single nucleotide variants (SNVs) and insertions/deletions (indels). In addition, we develop an ensemble classifier, VarCA, which combines features from individual variant callers to predict variants. The Genome Analysis Toolkit (GATK) is the best-performing individual caller, with precision/recall on a bulk ATAC test dataset of 0.92/0.97 for SNVs and 0.87/0.82 for indels within ATAC-seq peak regions with at least 10 reads. On bulk ATAC-seq reads, VarCA achieves superior performance with precision/recall of 0.99/0.95 for SNVs and 0.93/0.80 for indels. On single-cell ATAC-seq reads, VarCA attains precision/recall of 0.98/0.94 for SNVs and 0.82/0.82 for indels. In summary, ATAC-seq reads can be used to accurately discover non-coding regulatory variants in the absence of whole-genome sequencing data, and our ensemble method, VarCA, has the best overall performance.
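The precision/recall figures quoted above compare a caller's output against a truth set. A minimal sketch of that calculation follows, assuming variants are matched by exact identity of a (chromosome, position, ref, alt) tuple; the function name and the toy sets are hypothetical and do not reproduce the evaluation used in the study.

```python
from typing import Set, Tuple

Variant = Tuple[str, int, str, str]  # (chromosome, position, ref, alt)


def precision_recall(called: Set[Variant], truth: Set[Variant]) -> Tuple[float, float]:
    """Return (precision, recall) of a call set against a truth set."""
    true_positives = len(called & truth)
    # Guard against empty sets so the ratios are always defined.
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Toy data: three of four calls are correct, one true variant is missed.
truth_set = {
    ("chr1", 10, "A", "G"),
    ("chr1", 20, "C", "T"),
    ("chr2", 5, "G", "A"),
    ("chr2", 9, "T", "C"),
}
call_set = {
    ("chr1", 10, "A", "G"),
    ("chr1", 20, "C", "T"),
    ("chr2", 5, "G", "A"),
    ("chr3", 1, "A", "T"),
}
precision, recall = precision_recall(call_set, truth_set)
```

In practice the comparison would be restricted to the regions being evaluated, such as ATAC-seq peaks with at least 10 reads, before the sets are intersected.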

15.
Somatic single nucleotide variants (SNVs) in cancer genomes affect gene expression through various mechanisms depending on their genomic location. While somatic SNVs near canonical splice sites have been reported to cause abnormal splicing of cancer-related genes, whether these SNVs can affect gene expression through other mechanisms remains an open question. Here, we analyzed RNA sequencing and exome data from 4,998 cancer patients covering ten cancer types and identified 152 somatic SNVs near splice sites that were associated with abnormal intronic polyadenylation (IPA). IPA-associated somatic variants were preferentially located near donor splice sites rather than acceptor splice sites. A proportion of SNV-associated IPA events overlapped with premature cleavage and polyadenylation events triggered by U1 small nuclear ribonucleoprotein (snRNP) inhibition. GC content, intron length and polyadenylation signal were three genomic features that differentiated SNV-associated IPA from intron retention. Notably, IPA-associated SNVs were enriched in tumor suppressor genes (TSGs), including well-known TSGs such as PTEN and CDH1 with recurrent SNV-associated IPA events. Minigene assays confirmed that SNVs from PTEN, CDH1, VEGFA, GRHL2, CUL3 and WWC2 could lead to IPA. This work reveals that IPA acts as a novel mechanism explaining the functional consequence of somatic SNVs in human cancer.

16.
PURPOSE: Relapsed/refractory pediatric cancers show poor prognosis; however, their genomic patterns remain unknown. To investigate the genetic mechanisms of tumor relapse and therapy resistance, we characterized genomic alterations in diagnostic and relapsed lesions in patients with relapsed/refractory pediatric solid tumors using targeted deep sequencing. PATIENTS AND METHODS: A targeted sequencing panel covering the exons of 381 cancer genes was used to characterize 19 paired diagnostic and relapsed samples from patients with relapsed/refractory pediatric solid tumors. RESULTS: The mean coverage for all samples was 930.6× (SD = 213.8). Among the 381 genes, 173 single nucleotide variations (SNVs)/insertion-deletions (InDels), 100 copy number alterations, and 1 structural variation were detected. A total of 72.6% of SNVs in primary tumors were also found in recurrent lesions, and 27.2% of SNVs in recurrent tumors had newly occurred. Among SNVs/InDels detected only in recurrent lesions, 71% had a low variant allele fraction (<10%). Patients were classified into three categories based on the mutation patterns after cancer treatment. A significant association between the major mutation patterns and clinical outcome was observed. Patients whose relapsed tumor had fewer mutations than the diagnostic sample tended to be older, had longer progression-free survival, and achieved complete remission after relapse. In contrast, patients whose genetic profile had only concordant mutations without any change had the worst outcome. CONCLUSIONS: We characterized genomic changes in recurrent pediatric solid tumors. These findings could help to understand the biology of relapsed childhood cancer and to develop personalized treatment based on their genetic profile.
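The three percentages in the RESULTS above (variants retained at relapse, variants newly acquired, and newly acquired variants with a variant allele fraction below 10%) can be illustrated for a single diagnostic/relapse pair. The sketch assumes each variant is stored as (alternate reads, total depth) and that the variant allele fraction is their ratio; the function names and read counts are hypothetical.

```python
from typing import Dict, Tuple

ReadCounts = Tuple[int, int]  # (alternate-allele reads, total depth)


def vaf(counts: ReadCounts) -> float:
    """Variant allele fraction: alternate reads divided by total depth."""
    alt_reads, depth = counts
    return alt_reads / depth if depth else 0.0


def summarize_pair(
    primary: Dict[str, ReadCounts],
    relapse: Dict[str, ReadCounts],
    low_vaf: float = 0.10,
) -> Dict[str, float]:
    """Summarize how variants change between diagnosis and relapse."""
    shared = primary.keys() & relapse.keys()
    new = relapse.keys() - primary.keys()
    new_low = [v for v in new if vaf(relapse[v]) < low_vaf]
    return {
        # Fraction of diagnostic variants still present at relapse.
        "retained": len(shared) / len(primary) if primary else 0.0,
        # Fraction of relapse variants absent from the diagnostic sample.
        "newly_acquired": len(new) / len(relapse) if relapse else 0.0,
        # Fraction of newly acquired variants below the VAF threshold.
        "new_low_vaf": len(new_low) / len(new) if new else 0.0,
    }


# Toy pair with made-up read counts.
diagnostic = {"v1": (400, 900), "v2": (90, 900), "v3": (300, 1000), "v4": (50, 1000)}
relapsed = {"v1": (450, 900), "v2": (100, 950), "v3": (310, 1000),
            "v5": (45, 900), "v6": (300, 1000)}
pair_summary = summarize_pair(diagnostic, relapsed)
```

For this toy pair, three of four diagnostic variants are retained, two of five relapse variants are new, and one of those two falls below the 10% threshold.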

17.
Chronic obstructive pulmonary disease (COPD) is a risk factor for the development of lung cancer. The aim of this study was to identify early diagnosis biomarkers for lung squamous cell carcinoma (SQCC) in COPD patients and to determine the potential pathogenetic mechanisms. The GSE12472 data set was downloaded from the Gene Expression Omnibus database. Differentially co‐expressed links (DLs) and differentially expressed genes (DEGs) in both COPD and normal tissues, or in both SQCC + COPD and COPD samples, were used to construct a dynamic network associated with high‐risk genes for the SQCC pathogenetic process. Enrichment analysis was performed based on Gene Ontology annotations and Kyoto Encyclopedia of Genes and Genomes pathway analysis. We used the gene expression data and the clinical information to identify the co‐expression modules based on weighted gene co‐expression network analysis (WGCNA). In total, 205 dynamic DEGs, 5034 DLs and one pathway including CDKN1A, TP53, RB1 and MYC were found to correlate with the pathogenetic process. The pathogenetic mechanisms shared by both SQCC and COPD are closely related to oxidative stress, the immune response and infection. WGCNA identified 11 co‐expression modules, of which the magenta and black modules were correlated with the “time to distant metastasis,” and the “surgery due to” trait was closely related to the brown and blue modules. In conclusion, a pathway that includes TP53, CDKN1A, RB1 and MYC may play a vital role in driving COPD towards SQCC. Inflammatory processes and the immune response participate in COPD‐related carcinogenesis.

18.
19.
Gorlin syndrome (GS) is an autosomal dominant disorder that predisposes affected individuals to developmental defects and tumorigenesis, and is caused mainly by heterozygous germline PTCH1 mutations. Despite exhaustive analysis, PTCH1 mutations cannot be identified in some patients; the failure to detect mutations is presumably because the mutations occur in other causative genes or outside the analyzed regions of PTCH1, or because of copy number alterations (CNAs). In this study, we subjected a cohort of GS-affected individuals from six unrelated families to next-generation sequencing (NGS) analysis for the combined screening of causative alterations in Hedgehog signaling pathway-related genes. Specific single nucleotide variations (SNVs) of PTCH1 causing inferred amino acid changes were identified in four families (seven affected individuals), whereas CNAs within or around PTCH1 were found in two families in whom possible causative SNVs were not detected. Through targeted resequencing of all coding exons, as well as simultaneous evaluation of copy number status using the alignment map files obtained via NGS, we found that GS phenotypes could be explained by PTCH1 mutations or deletions in all affected patients. Because it is advisable to evaluate CNAs of candidate causative genes in point mutation-negative cases, NGS methodology appears to be useful for improving molecular diagnosis through the simultaneous detection of both SNVs and CNAs in the targeted genes/regions.

20.

Background

RNA-seq has spurred important gene fusion discoveries in a number of different cancers, including lung, prostate, breast, brain, thyroid and bladder carcinomas. Gene fusion discovery can potentially lead to the development of novel treatments that target the underlying genetic abnormalities.

Results

In this study, we provide a comprehensive view of the gene fusion landscape in 185 glioblastoma multiforme (GBM) patients from two independent cohorts. Fusions occur in approximately 30-50% of GBM patient samples. In the Ivy Center cohort of 24 patients, 33% of samples harbored fusions that were validated by qPCR and Sanger sequencing. We were able to identify high-confidence gene fusions from RNA-seq data in 53% of the samples in a TCGA cohort of 161 patients. We identified 13 cases (8%) with fusions retaining a tyrosine kinase domain in the TCGA cohort and one case in the Ivy Center cohort. Ours is the first study to describe recurrent fusions involving non-coding genes. Genomic locations 7p11 and 12q14-15 harbor the majority of the fusions. Fusions on 7p11 are formed in the focally amplified EGFR locus, whereas 12q14-15 fusions are formed by complex genomic rearrangements. All the fusions detected in this study can be further visualized and analyzed using our website: http://ivygap.swedish.org/fusions.

Conclusions

Our study highlights the prevalence of gene fusions as one of the major genomic abnormalities in GBM. The majority of the fusions are private fusions, and a minority of these recur with low frequency. A small subset of patients with fusions of receptor tyrosine kinases can benefit from existing FDA-approved drugs and drugs available in various clinical trials. Because clinically relevant fusions are rare, RNA-seq of GBM patient samples will be a vital tool for the identification of patient-specific fusions that can drive personalized therapy.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-14-818) contains supplementary material, which is available to authorized users.
