Similar literature
 20 similar documents found
1.

Background

Matched sequencing of both tumor and normal tissue is routinely used to classify variants of uncertain significance (VUS) into somatic vs. germline. However, assays used in molecular diagnostics focus on known somatic alterations in cancer genes and often only sequence tumors. Therefore, an algorithm that reliably classifies variants would be helpful for retrospective exploratory analyses. Contamination of tumor samples with normal cells results in differences in expected allelic fractions of germline and somatic variants, which can be exploited to accurately infer genotypes after adjusting for local copy number. However, existing algorithms for determining tumor purity, ploidy and copy number are not designed for unmatched short read sequencing data.
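The dependence of allelic fraction on normal-cell contamination and local copy number can be illustrated with the standard two-population mixture formula. The sketch below is not PureCN's implementation; the function and parameter names are hypothetical, and it assumes a diploid normal genome and, for germline variants, a heterozygous genotype in normal cells.

```python
def expected_allelic_fraction(purity, tumor_copies, variant_copies, germline):
    """Expected fraction of reads carrying the variant allele.

    purity         -- fraction of tumor cells in the sample (0..1)
    tumor_copies   -- total local copy number in tumor cells
    variant_copies -- copies carrying the variant in tumor cells
    germline       -- True for a heterozygous germline variant (one variant
                      copy in normal cells), False for a somatic variant
    """
    normal_variant = 1 if germline else 0  # assumption: diploid normal cells
    variant_dna = purity * variant_copies + (1 - purity) * normal_variant
    total_dna = purity * tumor_copies + (1 - purity) * 2
    return variant_dna / total_dna


# Hypothetical sample: 60% purity, copy-neutral locus, one variant copy.
somatic_af = expected_allelic_fraction(0.6, 2, 1, germline=False)
germline_af = expected_allelic_fraction(0.6, 2, 1, germline=True)
print(somatic_af, germline_af)
```

In this toy setting the somatic variant is expected at a lower allelic fraction than the heterozygous germline variant, which is the difference a tumor-only classifier can exploit once purity and copy number are known.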

Results

We describe a methodology and corresponding open source software for estimating tumor purity, copy number, loss of heterozygosity (LOH), and contamination, and for classification of single nucleotide variants (SNVs) by somatic status and clonality. This R package, PureCN, is optimized for targeted short read sequencing data, integrates well with standard somatic variant detection pipelines, and has support for matched and unmatched tumor samples. Accuracy is demonstrated on simulated data and on real whole exome sequencing data.

Conclusions

Our algorithm provides accurate estimates of tumor purity and ploidy, even if matched normal samples are not available. This in turn allows accurate classification of SNVs. The software is provided as open source (Artistic License 2.0) R/Bioconductor package PureCN (http://bioconductor.org/packages/PureCN/).

2.

Background

Structural variations (SVs) are widespread in human genomes and may have important implications in disease-related and evolutionary studies. High-throughput sequencing (HTS) has become a major platform for SV detection, and simulation serves as a powerful and cost-effective approach for benchmarking SV detection algorithms. Accurate performance assessment by simulation requires a simulator capable of generating data with all important features of real data, such as GC biases in HTS data and various complexities in tumor data. However, no available package has systematically addressed all issues in data simulation for SV benchmarking.

Results

Pysim-sv is a package for simulating HTS data to evaluate performance of SV detection algorithms. Pysim-sv can introduce a wide spectrum of germline and somatic genomic variations. The package contains functionalities to simulate tumor data with aneuploidy and heterogeneous subclones, which is very useful in assessing algorithm performance in tumor studies. Furthermore, Pysim-sv can introduce GC-bias, the most important and prevalent bias in HTS data, in the simulated HTS data.

Conclusions

Pysim-sv provides an unbiased toolkit for evaluating HTS-based SV detection algorithms.

3.

Background

Noninvasive prenatal screening (NIPS) of common aneuploidies using cell-free DNA from maternal plasma is part of routine prenatal care and is widely used in both high-risk and low-risk patient populations. High specificity is needed for clinically acceptable positive predictive values. Maternal copy-number variants (mCNVs) have been reported as a source of false-positive aneuploidy results that compromises specificity.
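Why specificity dominates the positive predictive value (PPV) for a rare condition follows from Bayes' rule. The sketch below uses hypothetical sensitivity, specificity, and prevalence values chosen only for illustration; they are not figures from the study.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a positive screen is a true positive (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical screen: 99% sensitivity, condition present in 1 of 1,000.
ppv_high_spec = positive_predictive_value(0.99, 0.999, 0.001)
ppv_low_spec = positive_predictive_value(0.99, 0.99, 0.001)
print(round(ppv_high_spec, 3), round(ppv_low_spec, 3))
```

With these assumed inputs, a drop in specificity of less than one percentage point cuts the PPV several-fold, which is why even an uncommon source of false positives matters in a low-prevalence population.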

Methods

We surveyed the mCNV landscape in 87,255 patients undergoing NIPS. We evaluated both previously reported and novel algorithmic strategies for mitigating the effects of mCNVs on the screen’s specificity. Further, we analyzed the frequency, length, and positional distribution of CNVs in our large dataset to investigate the curation of novel fetal microdeletions, which can be identified by NIPS but are challenging to interpret clinically.

Results

mCNVs are common, with 65% of expectant mothers harboring an autosomal CNV spanning more than 200 kb, underscoring the need for robust NIPS analysis strategies. By analyzing empirical and simulated data, we found that general, outlier-robust strategies reduce the rate of mCNV-caused false positives but not as appreciably as algorithms specifically designed to account for mCNVs. We demonstrate that large-scale tabulation of CNVs identified via routine NIPS could be clinically useful: together with the gene density of a putative microdeletion region, we show that the region’s relative tolerance to duplications versus deletions may aid the interpretation of microdeletion pathogenicity.

Conclusions

Our study thoroughly investigates a common source of NIPS false positives and demonstrates how to bypass its corrupting effects. Our findings offer insight into the interpretation of NIPS results and inform the design of NIPS algorithms suitable for use in screening in the general obstetric population.

4.

Background

Multiple primary cancers (MPC) have been identified as two or more cancers without any subordinate relationship that occur either simultaneously or metachronously in the same or different organs of an individual. Lynch syndrome is an autosomal dominant genetic disorder that increases the risk of many types of cancers. Lynch syndrome patients who suffer more than two cancers can also be considered as MPC; patients of this kind provide unique resources to learn how genetic mutation causes MPC in different tissues.

Methods

We performed whole genome sequencing on blood cells and two tumor samples of a Lynch syndrome patient who was diagnosed with five primary cancers. The mutational landscape of the tumors, including somatic point mutations and copy number alterations, was characterized. We also compared Lynch syndrome with sporadic cancers and proposed a model to illustrate the mutational process by which Lynch syndrome progresses to MPC.

Results

We revealed a novel pathologic mutation in the MSH2 gene (G504 splicing) that is associated with Lynch syndrome. Systematic comparison of the mutation landscape revealed that multiple cancers in the proband were evolutionarily independent. Integrative analysis showed that truncating mutations of DNA mismatch repair (MMR) genes were significantly enriched in the patient. A mutation progression model that included germline mutations of MMR genes, double hits of the MMR system, mutations in tissue-specific driver genes, and rapid accumulation of additional passenger mutations was proposed to illustrate how MPC occurs in Lynch syndrome patients.

Conclusion

Our findings demonstrate that both germline and somatic alterations are driving forces of carcinogenesis, which may resolve the carcinogenic theory of Lynch syndrome.

5.

Background

Osteosarcoma (OS) is a prevalent primary malignant bone tumour with unknown etiology. These highly metastasizing tumours are among the most frequent causes of cancer-related deaths. Thus, there is an urgent need for different markers, and in this study we aimed to find novel biomarkers for OS.

Methods

To that end, we analysed the whole exome of tumorous and non-tumorous bone tissue from the same patient with OS using next-generation sequencing. For data analysis, we used several software tools and combined the exome data with RNA-seq data from our previous study.

Results

In the tumour exome, we found wide genomic rearrangements, which should qualify as chromothripsis: we detected almost 3,000 somatic single nucleotide variants (SNVs) and small indels and more than 2,000 copy number variants (CNVs) in different chromosomes. Furthermore, the somatic changes seem to be associated with bone tumours, whereas germline mutations seem to be associated with cancer in general. We confirmed the previous findings that the most significant pathway involved in OS pathogenesis is probably the WNT/β-catenin signalling pathway. Also, the IGF1/IGF2 and IGF1R homodimer signalling and TP53 (including downstream tumour suppressor gene EI24) pathways may have a role. Additionally, the mucin family genes, especially MUC4, and the cell cycle controlling gene CDC27 may be considered as potential biomarkers for OS.

Conclusions

The genes in which mutations were detected may be considered as targets for finding biomarkers for OS. As the study is based on a single case and only DNA and RNA analysis, further confirmatory studies are required.

6.

Background

Tandem affinity purification coupled with mass spectrometry (TAP/MS) analysis is a popular method for identifying novel endogenous protein-protein interactions (PPIs) on a large scale. Computational analysis of TAP/MS data is a critical step, particularly for high-throughput datasets, yet it remains challenging due to the noisy nature of TAP/MS data.

Results

We investigated several major TAP/MS data analysis methods for identifying PPIs and developed an advanced method that incorporates an improved statistical approach to filter out false positives using the negative controls. Our method is named PPIRank, which stands for PPI ranking in TAP/MS data. We compared PPIRank with several existing methods in analyzing two pathway-specific TAP/MS PPI datasets from Drosophila.

Conclusion

Experimental results show that PPIRank is more capable than other approaches of identifying known interactions collected in the BioGRID PPI database. Specifically, PPIRank captures more true interactions and, at the same time, fewer false positives in both the Insulin and Hippo pathways of Drosophila melanogaster.

7.

Background

While determining continental-level ancestry from genomic information is relatively simple, distinguishing between individuals from closely associated sub-populations (e.g., from the same continent) remains a difficult challenge.

Methods

We study the problem of predicting human biogeographical ancestry from genomic data under resource constraints. In particular, we focus on the case where the analysis is constrained to using single nucleotide polymorphisms (SNPs) from just one chromosome. We propose methods to construct such ancestry informative SNP panels using correlation-based and outlier-based methods.

Results

We assessed the performance of the proposed SNP panels derived from just one chromosome, using data from the 1000 Genomes Project, Phase 3. For continental-level ancestry classification, we achieved an overall classification rate of 96.75% using 206 single nucleotide polymorphisms (SNPs). For sub-population level ancestry prediction, we achieved the following average pairwise binary classification rates: subpopulations in Europe: 76.6% (58 SNPs); Africa: 87.02% (87 SNPs); East Asia: 73.30% (68 SNPs); South Asia: 81.14% (75 SNPs); America: 85.85% (68 SNPs).

Conclusion

Our results demonstrate that one single chromosome (in particular, Chromosome 1), if carefully analyzed, could hold enough information for accurate prediction of human biogeographical ancestry. This has significant implications in terms of the computational resources required for analysis of ancestry, and in the applications of such analyses, such as in studies of genetic diseases, forensics, and soft biometrics.

8.
9.

Background

False occurrences of functional motifs in protein sequences can be considered as random events due solely to the sequence composition of a proteome. Here we use a numerical approach to investigate the random appearance of functional motifs with the aim of addressing biological questions such as: How are organisms protected from undesirable occurrences of motifs otherwise selected for their functionality? Has the random appearance of functional motifs in protein sequences been affected during evolution?

Results

Here we analyse the occurrence of functional motifs in random sequences and compare it to that observed in biological proteomes; the behaviour of random motifs is also studied. Most motifs exhibit a number of false positives significantly similar to the number of times they appear in randomized proteomes (=expected number of false positives). Interestingly, about 3% of the analysed motifs show a different kind of behaviour and appear in biological proteomes less than they do in random sequences. In some of these cases, a mechanism of evolutionary negative selection is apparent; this helps to prevent unwanted functionalities which could interfere with cellular mechanisms.
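The comparison between observed motif counts and counts in randomized sequences can be sketched as follows. This is an illustration, not the authors' procedure: the toy sequences and function names are hypothetical, randomization here shuffles residues within each sequence (one of several possible null models), and the example pattern is the PROSITE N-glycosylation motif written as a regular expression.

```python
import random
import re


def count_motif(sequences, pattern):
    """Count non-overlapping regex matches of a motif across sequences."""
    regex = re.compile(pattern)
    return sum(len(regex.findall(seq)) for seq in sequences)


def expected_random_count(sequences, pattern, n_shuffles=200, seed=0):
    """Mean motif count over shuffled copies of the sequences.

    Assumption: shuffling residues within each sequence preserves its
    composition and length, so matches arise from composition alone.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(n_shuffles):
        shuffled = []
        for seq in sequences:
            residues = list(seq)
            rng.shuffle(residues)
            shuffled.append("".join(residues))
        total += count_motif(shuffled, pattern)
    return total / n_shuffles


toy_proteome = ["MKNGTALLNPSA", "ANASQQNITK"]  # hypothetical sequences
n_glyco = r"N[^P][ST][^P]"  # N-{P}-[ST]-{P}
observed = count_motif(toy_proteome, n_glyco)
expected = expected_random_count(toy_proteome, n_glyco)
print(observed, expected)
```

A motif whose observed count sits well below the shuffled expectation would be a candidate for the negative selection described above; a real analysis would need a full proteome and a significance test.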

Conclusion

Our thorough statistical and biological analysis showed that several mechanisms and evolutionary constraints affect the appearance of functional motifs in protein sequences.

10.

Background

Metastasis is the primary cause of mortality in cancer patients. Therefore, elucidating the genetics and epigenetics of metastatic tumor cells and the mechanisms by which tumor cells acquire metastatic properties constitute significant challenges in cancer research.

Objective

To summarize the current understanding of the specific genotype and phenotype of metastatic tumor cells.

Method and Result

In-depth genetic analysis of tumor cells, especially with advances in next-generation sequencing, has yielded insights into the genotypes of metastatic tumor cells. Also, studies have shown that the cancer stem cell (CSC) and epithelial to mesenchymal transition (EMT) phenotypes are associated with the metastatic cascade.

Conclusion

In this review, we will discuss recent advances in the field by focusing on the genomic instability and phenotypic dynamics of metastatic tumor cells.

11.

Background

Charge states of tandem mass spectra from low-resolution collision-induced dissociation cannot be determined by mass spectrometry. As a result, such spectra with multiple charges are usually searched multiple times, once for each possible charge state. This strategy not only increases the overall database search time but also yields more false positives. Hence, it is advantageous to determine the charge states of such spectra before the database search.

Results

We propose a new approach capable of determining the charge states of low-resolution tandem mass spectra. Four novel and discriminant features are introduced to describe tandem mass spectra and are used in a Gaussian mixture model to distinguish doubly and triply charged peptides. Tests on three independent datasets with known validity show that this method can assign charge states to low-resolution tandem mass spectra more accurately than existing methods.

Conclusions

The proposed method can be used to improve the speed and reliability of peptide identification.

12.

Introduction

Efforts to harmonize lipidomic methodologies have been limited within the community. Here, we aimed to capitalize on the recent National Institute of Standards and Technology lipidomics interlaboratory comparison exercise by implementing a questionnaire that assessed current methodologies, quantitation strategies, standard operating procedures (SOPs), and quality control activities employed by the lipidomics community.

Objectives

Lipidomics is a rapidly developing field with diverse applications. At present, there are no community-vetted methods to assess measurement comparability or data quality. Thus, a major impetus of this questionnaire was to profile current efforts, highlight areas of need, and establish future objectives in an effort to harmonize lipidomics workflows.

Methods

The 54-question survey inquired about laboratory demographics, lipidomic methodologies and SOPs, analytical platforms, quantitation, reference materials, quality control procedures, and opinions regarding challenges existing within the community.

Results

A total of 125 laboratories participated in the questionnaire. A broad overview of results highlighted a wide methodological diversity within current lipidomic workflows. The impact of this diversity on lipid measurement and quantitation is currently unknown and needs to be explored further. While some laboratories do incorporate SOPs and quality control activities, these concepts have not been fully embraced by the community. The top five perceived challenges within the lipidomics community were a lack of standardization amongst methods/protocols, lack of lipid standards, software/data handling and quantification, and over-reporting/false positives.

Conclusion

The questionnaire provided an overview of current lipidomics methodologies and further promoted the need for community-accepted guidelines and protocols. The questionnaire also served as a platform to help determine and prioritize metrological issues to be investigated.

13.
14.

Introduction

Lung cancer is the leading cause of cancer-related mortality, owing to the advanced stage at which it is usually detected: the available diagnostic tests are expensive and invasive and therefore cannot be used for general screening.

Objectives

To increase the robustness of biomarker panels previously proposed by the authors, which are based on metabolites in sweat samples, new samples were collected within different intervals (4 months and 2 years) and analyzed at different times (2012 and 2014, respectively) by different analysts to discriminate between lung cancer (LC) patients and smokers at risk.

Methods

Sweat analysis was carried out by LC–MS/MS with minimum sample preparation and the generated analytical data were then integrated to minimize variability in statistical analysis.

Results

Panels capable of discriminating LC patients from smokers at risk were obtained, taking into account the variability between the two cohorts arising from the different sample collection intervals, the times at which the analyses were carried out, and the influence of the analyst. Two panels of metabolites built with the PanelomiX tool reduce false negatives (at 95 % specificity) and false positives (at 95 % sensitivity). The first panel (96.9 % specificity and 83.8 % sensitivity) is composed of monoglyceride MG(22:2), muconic, suberic and urocanic acids, and a tetrahexose; the second panel (81.2 % specificity and 97.3 % sensitivity) is composed of monoglyceride MG(22:2), muconic, nonanedioic and urocanic acids, and a tetrahexose.

Conclusion

The study yielded a prediction model more robust than that obtained in the authors' previous study.

15.

Background

Motor- (MEP) and somatosensory-evoked potentials (SSEP) are susceptible to the effects of intraoperative environmental factors.

Methods

Over a 5-year period, 250 patients with adolescent idiopathic scoliosis (AIS) who underwent corrective surgery with intraoperative monitoring (IOM) were retrospectively analyzed for MEP suppression (MEPS).

Results

Our results show that four distinct groups of MEPS were encountered over the study period. None of the 12 patients sustained any neurological deficits postoperatively. However, comparison of groups 1 and 2 suggests that neither the duration of anesthesia nor the speed of surgical or anesthetic intervention was associated with recovery to a level beyond the criteria for MEPS. For group 3, spontaneous MEPS recovery despite the lack of surgical intervention suggests that anesthetic intervention may play a role in this process. However, spontaneous MEPS recovery was also seen in group 4, suggesting that in certain circumstances neither surgical nor anesthetic intervention was required. In addition, neither the time to the first surgical manoeuver nor the time from surgical manoeuver to MEPS was related to recovery of MEPS. None of the patients had suppression of SSEPs intraoperatively.

Conclusion

This study suggests that in susceptible individuals, MEPS may in rare cases occur unpredictably, independent of surgical or anesthetic intervention. However, our findings favor anesthetic before surgical intervention as a proposed protocol. Early recognition of MEPS is important to prevent false positives in the course of IOM for spinal surgery.

16.

Introduction

Collecting feces is easy, and feces offer direct access to endogenous and microbial metabolites.

Objectives

Given the lack of consensus on fecal sample preparation, especially in animal species, we developed a robust protocol allowing untargeted LC-HRMS fingerprinting.

Methods

The conditions of extraction (quantity, preparation, solvents, dilutions) were investigated in bovine feces.

Results

A rapid and simple protocol involving feces extraction with methanol (1/3, M/V) followed by centrifugation and a filtration step (10 kDa) was developed.

Conclusion

The workflow generated repeatable and informative fingerprints for robust metabolome characterization.

17.

Background

As studies of molecular biology systems attempt to achieve a comprehensive understanding of a particular system, Type 2 errors (false negatives) may be a significant problem. However, few investigators are inclined to accept the increase in Type 1 errors (false positives) that may result when less stringent statistical cut-off values are used. To address this dilemma, we developed an analysis strategy that used a stringent statistical analysis to create a list of differentially expressed genes that served as "bait" to "fish out" other genes with similar patterns of expression.

Results

Comparing two strains of mice (NOD and C57Bl/6), we identified 93 genes with statistically significant differences in their patterns of expression. Hierarchical clustering identified an additional 39 genes with similar patterns of expression differences between the two strains. Pathway analysis was then employed to: 1) identify the central genes and define biological processes that may be regulated by the genes identified, and 2) identify genes on the lists that could not be connected to each other in pathways (potential false positives). For networks created by both gene lists, the most connected (central) genes were interferon gamma (IFN-γ) and tumor necrosis factor alpha (TNF-α). These two cytokines are relevant to the biological differences between the two strains of mice. Furthermore, the network created by the list of 39 genes also suggested other biological differences between the strains.

Conclusion

Taken together, these data demonstrate how stringent statistical analysis, combined with hierarchical clustering and pathway analysis, may offer deeper insight into the biological processes reflected in a set of expression array data. This approach allows us to "recapture" false negative genes that otherwise would have been missed by the statistical analysis.

18.

Background

Observations of recurrent somatic mutations in tumors have led to identification and definition of signaling and other pathways that are important for cancer progression and therapeutic targeting. As tumor cells contain both an individual’s inherited genetic variants and somatic mutations, challenges arise in distinguishing these events in massively parallel sequencing datasets. Typically, both a tumor sample and a “normal” sample from the same individual are sequenced and compared; variants observed only in the tumor are considered to be somatic mutations. However, this approach requires two samples for each individual.

Results

We evaluate a method of detecting somatic mutations in tumor samples for which only a subset of normal samples are available. We describe tuning of the method for detection of mutations in tumors, filtering to remove inherited variants, and comparison of detected mutations to several matched tumor/normal analysis methods. Filtering steps include the use of population variation datasets to remove inherited variants as well as a subset of normal samples to remove technical artifacts. We then directly compare mutation detection with tumor-only and tumor-normal approaches using the same sets of samples. Comparisons are performed using an internal targeted gene sequencing dataset (n = 3380) as well as whole exome sequencing data from The Cancer Genome Atlas project (n = 250). Tumor-only mutation detection shows similar recall (43–60%) but lower precision (20–21%) than current matched tumor/normal approaches (recall 43–73%, precision 30–82%) when compared to a “gold-standard” tumor/normal approach. The inclusion of a small pool of normal samples improves precision, although many variants are still uniquely detected in the tumor-only analysis.
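The filtering and evaluation steps described here can be sketched with set operations. Everything in this example is hypothetical: the variant tuples, the population set, and the function names are illustrations and do not reproduce the authors' pipeline or data.

```python
def filter_inherited(calls, population_variants):
    """Drop calls that also appear in a population variation dataset."""
    return set(calls) - set(population_variants)


def precision_recall(called, truth):
    """Precision and recall of a call set against a gold-standard set."""
    called, truth = set(called), set(truth)
    true_pos = len(called & truth)
    precision = true_pos / len(called) if called else 0.0
    recall = true_pos / len(truth) if truth else 0.0
    return precision, recall


# Variants as (chromosome, position, ref, alt); all values are made up.
gold = {("1", 100, "A", "T"), ("2", 200, "G", "C"),
        ("3", 300, "C", "T"), ("4", 400, "T", "G")}
tumor_only = {("1", 100, "A", "T"), ("2", 200, "G", "C"),
              ("5", 500, "A", "G"), ("6", 600, "C", "A"),
              ("7", 700, "G", "T")}
population = {("5", 500, "A", "G"), ("6", 600, "C", "A")}

raw = precision_recall(tumor_only, gold)
filtered = precision_recall(filter_inherited(tumor_only, population), gold)
print(raw, filtered)
```

In this toy case, removing population variants raises precision while leaving recall unchanged, which mirrors the role the filtering steps play in a tumor-only analysis.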

Conclusions

A detailed method for somatic mutation detection without matched normal samples enables study of larger numbers of tumor samples, as well as tumor samples for which a matched normal is not available. As sensitivity/recall is similar to tumor/normal mutation detection but precision is lower, tumor-only detection is more appropriate for classification of samples based on known mutations. Although matched tumor-normal analysis is preferred due to higher precision, we demonstrate that mutation detection without matched normal samples is possible for certain applications.

19.
20.

Objective

To explore the impact of taurine on monoclonal antibody (mAb) basic charge variants in Chinese hamster ovary (CHO) cell culture.

Results

In fed-batch culture, adding taurine to the feed medium slightly increased the maximum viable cell density and mAb titers in CHO cells. Moreover, taurine significantly decreased the lysine variant and oxidized variant levels, which in turn decreased basic variant content from 32 to 27%. The lysine variant content in the taurine culture was approximately 4% lower than that in the control condition, which was the main reason for the decrease in basic variants. Real-time PCR and a cell-free assay revealed that taurine played a critical role in upregulating the related basic carboxypeptidase and stimulating extracellular basic carboxypeptidase activities.

Conclusion

Taurine noticeably lowers basic charge variants, mainly through decreases in lysine variants and oxidized protein variants.
