Similar articles
Found 20 similar articles; search took 15 ms.
1.
Insertional mutagenesis screens in mice are used to identify individual genes that drive tumor formation. In these screens, candidate cancer genes are identified if their genomic location is proximal to a common insertion site (CIS) defined by high rates of transposon or retroviral insertions in a given genomic window. In this article, we describe a new method for defining CISs based on a Poisson distribution, the Poisson Regression Insertion Model, and show that this new method is an improvement over previously described methods. We also describe a modification of the method that can identify pairs and higher orders of co-occurring common insertion sites. We apply these methods to two data sets, one generated in a transposon-based screen for gastrointestinal tract cancer genes and another based on the set of retroviral insertions in the Retroviral Tagged Cancer Gene Database. We show that the new methods identify more relevant candidate genes and candidate gene pairs than found using previous methods. Identification of the biologically relevant set of mutations that occur in a single cell and cause tumor progression will aid in the rational design of single and combinatorial therapies in the upcoming age of personalized cancer therapy.
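To make the idea of a Poisson-based CIS call concrete, here is a minimal sketch. It is not the Poisson Regression Insertion Model itself: it assumes a uniform background rate over fixed, non-overlapping windows and simply asks how surprising each window's insertion count is. The function names, the window size, and the toy coordinates are hypothetical.

```python
import math

def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed as 1 minus the lower tail."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)   # P(X = 0)
    cdf = term
    for i in range(1, k):   # accumulate P(X = 1) ... P(X = k - 1)
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)

def window_pvalues(positions, genome_length, window):
    """Count insertions per window and score each count against a
    uniform Poisson background (assumption: no regional insertion bias)."""
    n_windows = math.ceil(genome_length / window)
    counts = [0] * n_windows
    for p in positions:
        counts[p // window] += 1
    lam = len(positions) / n_windows   # expected insertions per window
    return [(i * window, c, poisson_sf(c, lam)) for i, c in enumerate(counts)]

# Toy data: five insertions cluster in the first 100 bp window.
positions = [5, 12, 18, 25, 30, 150, 420, 777]
results = window_pvalues(positions, genome_length=1000, window=100)
for start, count, pval in results:
    if pval < 0.01:
        print(start, count, pval)
```

In a real screen the per-window p-values would still need multiple-testing correction, and a regression on covariates would replace the single uniform rate.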

2.

Background

Animal models of cancer are useful to generate complementary datasets for comparison to human tumor data. Insertional mutagenesis screens, such as those utilizing the Sleeping Beauty (SB) transposon system, provide a model that recapitulates the spontaneous development and progression of human disease. This approach has been widely used to model a variety of cancers in mice. Comprehensive mutation profiles are generated for individual tumors through amplification of transposon insertion sites followed by high-throughput sequencing. Subsequent statistical analyses identify common insertion sites (CISs), which are predicted to be functionally involved in tumorigenesis. Current methods utilized for SB insertion site analysis have some significant limitations. For one, they do not account for transposon footprints – a class of mutation generated following transposon remobilization. Existing methods also discard quantitative sequence data due to uncertainty regarding the extent to which it accurately reflects mutation abundance within a heterogeneous tumor. Additionally, computational analyses generally assume that all potential insertion sites have an equal probability of being detected under non-selective conditions, an assumption without sufficient relevant data. The goal of our study was to address these potential confounding factors in order to enhance functional interpretation of insertion site data from tumors.

Results

We describe here a novel method to detect footprints generated by transposon remobilization, which revealed minimal evidence of positive selection in tumors. We also present extensive characterization data demonstrating an ability to reproducibly assign semi-quantitative information to individual insertion sites within a tumor sample. Finally, we identify apparent biases for detection of inserted transposons in several genomic regions that may lead to the identification of false positive CISs.

Conclusion

The information we provide can be used to refine analyses of data from insertional mutagenesis screens, improving functional interpretation of results and facilitating the identification of genes important in cancer development and progression.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-1150) contains supplementary material, which is available to authorized users.

3.
Somatic transposon mutagenesis in mice is an efficient strategy to investigate the genetic mechanisms of tumorigenesis. The identification of tumor driving transposon insertions traditionally requires the generation of large tumor cohorts to obtain information about common insertion sites. Tumor driving insertions are also characterized by their clonal expansion in tumor tissue, a phenomenon that is facilitated by the slow and evolving transformation process of transposon mutagenesis. We describe here an improved approach for the detection of tumor driving insertions that assesses the clonal expansion of insertions by quantifying the relative proportion of sequence reads obtained in individual tumors. To this end, we have developed a protocol for insertion site sequencing that utilizes acoustic shearing of tumor DNA and Illumina sequencing. We analyzed various solid tumors generated by PiggyBac mutagenesis and for each tumor >10^6 reads corresponding to >10^4 insertion sites were obtained. In each tumor, 9 to 25 insertions stood out by their enriched sequence read frequencies when compared to frequencies obtained from tail DNA controls. These enriched insertions are potential clonally expanded tumor driving insertions, and thus identify candidate cancer genes. The candidate cancer genes of our study comprised many established cancer genes, but also novel candidate genes such as Mastermind-like1 (Mamld1) and Diacylglycerol kinase delta (Dgkd). We show that clonal expansion analysis by high-throughput sequencing is a robust approach for the identification of candidate cancer genes in insertional mutagenesis screens on the level of individual tumors.
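The comparison of read frequencies between a tumor and its tail DNA control can be sketched as follows. This is only an illustration of the idea, not the authors' protocol: the fold-change threshold, the pseudocount, the function name, and the site labels are all assumptions made for the example.

```python
def enriched_insertions(tumor_reads, control_reads, min_fold=10.0, pseudocount=1.0):
    """Return (site, fold) pairs for insertion sites whose share of reads in
    the tumor exceeds their share in the control by at least min_fold.
    The pseudocount (an assumption) avoids division by zero for sites
    absent from the control."""
    n_sites = len(set(tumor_reads) | set(control_reads))
    tumor_total = sum(tumor_reads.values()) + pseudocount * n_sites
    control_total = sum(control_reads.values()) + pseudocount * n_sites
    hits = []
    for site, reads in tumor_reads.items():
        tumor_frac = (reads + pseudocount) / tumor_total
        control_frac = (control_reads.get(site, 0) + pseudocount) / control_total
        fold = tumor_frac / control_frac
        if fold >= min_fold:
            hits.append((site, fold))
    # Most strongly enriched (most likely clonally expanded) first.
    return sorted(hits, key=lambda h: h[1], reverse=True)

# Hypothetical read counts per insertion site.
tumor = {"chr1:1000": 5000, "chr2:200": 40, "chr3:30": 60}
tail = {"chr1:1000": 10, "chr2:200": 500, "chr3:30": 490}
hits = enriched_insertions(tumor, tail)
print(hits)
```

Working with read fractions rather than raw counts keeps samples with different sequencing depths comparable.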

4.
5.
Retroviral insertional mutagenesis screens, which identify genes involved in tumor development in mice, have yielded a substantial number of retroviral integration sites, and this number is expected to grow substantially due to the introduction of high-throughput screening techniques. The data of various retroviral insertional mutagenesis screens are compiled in the publicly available Retroviral Tagged Cancer Gene Database (RTCGD). Integrally analyzing these screens for the presence of common insertion sites (CISs, i.e., regions in the genome that have been hit by viral insertions in multiple independent tumors significantly more than expected by chance) requires an approach that corrects for the increased probability of finding false CISs as the amount of available data increases. Moreover, significance estimates of CISs should be established taking into account both the noise, arising from the random nature of the insertion process, and the bias, stemming from preferential insertion sites present in the genome and the data retrieval methodology. We introduce a framework, the kernel convolution (KC) framework, to find CISs in a noisy and biased environment using a predefined significance level while controlling the family-wise error (FWE) (the probability of detecting false CISs). Where previous methods use one, two, or three predetermined fixed scales, our method is capable of operating at any biologically relevant scale. This creates the possibility to analyze the CISs in a scale space by varying the width of the CISs, providing new insights into the behavior of CISs across multiple scales. Our method also features the possibility of including models for background bias. Using simulated data, we evaluate the KC framework using three kernel functions, the Gaussian, triangular, and rectangular kernel function. We applied the Gaussian KC to the data from the combined set of screens in the RTCGD and found that 53% of the CISs do not reach the significance threshold in this combined setting. Still, with the FWE under control, application of our method resulted in the discovery of eight novel CISs, which each have a probability less than 5% of being false detections.
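The core smoothing step of a kernel convolution approach can be sketched as below: a Gaussian kernel of a chosen scale is placed on every insertion, and peaks in the summed density become candidate CISs. This sketch assumes nothing about how the KC framework sets its significance threshold or controls the FWE; it only shows the density estimate and peak picking. The function names, grid spacing, and coordinates are hypothetical.

```python
import math

def gaussian_kernel_density(positions, grid, scale):
    """Sum a Gaussian kernel of width `scale` centred on each insertion,
    evaluated at every point of `grid`."""
    norm = 1.0 / (scale * math.sqrt(2.0 * math.pi))
    return [
        sum(norm * math.exp(-0.5 * ((x - p) / scale) ** 2) for p in positions)
        for x in grid
    ]

def local_maxima(grid, density):
    """Grid points whose density strictly exceeds both neighbours:
    candidate CIS peaks, before any significance testing."""
    return [
        (grid[i], density[i])
        for i in range(1, len(grid) - 1)
        if density[i] > density[i - 1] and density[i] > density[i + 1]
    ]

# Toy data: three nearby insertions and one isolated insertion.
positions = [1000, 1020, 1040, 5000]
grid = list(range(0, 6001, 20))
density = gaussian_kernel_density(positions, grid, scale=50)
peaks = local_maxima(grid, density)
print(peaks)
```

Re-running with a larger or smaller `scale` gives the scale-space view described in the abstract: wide kernels merge neighbouring insertions into one peak, narrow kernels split them.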

6.
The understanding of bacterial gene function has been greatly enhanced by recent advancements in the deep sequencing of microbial genomes. Transposon insertion sequencing methods combine next-generation sequencing techniques with transposon mutagenesis for the exploration of the essentiality of genes under different environmental conditions. We propose a model-based method that uses regularized negative binomial regression to estimate the change in transposon insertions attributable to gene-environment changes in this genetic interaction study without transformations or uniform normalization. An empirical Bayes model for estimating the local false discovery rate combines unique and total count information to test for genes that show a statistically significant change in transposon counts. When applied to RB-TnSeq (randomized barcode transposon sequencing) and Tn-seq (transposon sequencing) libraries made in strains of Caulobacter crescentus using both total and unique count data, the model was able to identify a set of conditionally beneficial or conditionally detrimental genes for each target condition that shed light on their functions and roles during various stress conditions.

7.
Listeria monocytogenes is a Gram-positive, food-borne pathogen of humans and animals. L. monocytogenes is considered to be a potential public health risk by the U.S. Food and Drug Administration (FDA), as this bacterium can easily contaminate ready-to-eat (RTE) foods and cause an invasive, life-threatening disease (listeriosis). Bacteria can adhere and grow on multiple surfaces and persist within biofilms in food processing plants, providing resistance to sanitizers and other antimicrobial agents. While whole genome sequencing has led to the identification of biofilm synthesis gene clusters in many bacterial species, bioinformatics has not identified the biofilm synthesis genes within the L. monocytogenes genome. To identify genes necessary for L. monocytogenes biofilm formation, we performed a transposon mutagenesis library screen using a recently constructed Himar1 mariner transposon. Approximately 10,000 transposon mutants within L. monocytogenes strain 10403S were screened for biofilm formation in 96-well polyvinyl chloride (PVC) microtiter plates, and 70 Himar1 insertion mutants were identified that produced significantly less biofilm. DNA sequencing of the transposon insertion sites within the isolated mutants revealed transposon insertions within 38 distinct genetic loci. The identification of mutants bearing insertions within several flagellar motility genes previously known to be required for the initial stages of biofilm formation validated the ability of the mutagenesis screen to identify L. monocytogenes biofilm-defective mutants. Two newly identified genetic loci, dltABCD and phoPR, were selected for deletion analysis and both ΔdltABCD and ΔphoPR bacterial strains displayed biofilm formation defects in the PVC microtiter plate assay, confirming these loci contribute to biofilm formation by L. monocytogenes.

8.
Using a luxAB reporter transposon, seven mutants of Sinorhizobium meliloti were identified as containing insertions in four cold shock loci. LuxAB activity was strongly induced (25- to 160-fold) after a temperature shift from 30 to 15°C. The transposon and flanking host DNA from each mutant were cloned, and the nucleic acid sequence of the insertion site was determined. Unexpectedly, five of the seven luxAB mutants contained transposon insertions in the 16S and 23S rRNA genes of two of the three rrn operons of S. meliloti. Directed insertion of luxAB genes into each of the three rrn operons revealed that all three operons were similarly affected by cold shock. Two other insertions were found to be located downstream of a homolog of the major Escherichia coli cold shock gene, cspA. Although the cold shock loci were highly induced in response to a shift to low temperature, none of the insertions resulted in a statistically significant decrease in growth rate at 15°C.

9.
AKXD recombinant inbred (RI) strains develop a variety of leukemias and lymphomas due to somatically acquired insertions of retroviral DNA into the genome of hematopoietic cells that can mutate cellular proto-oncogenes and tumor suppressor genes. We generated a new set of tumors from nine AKXD RI strains selected for their propensity to develop B-cell tumors, the most common type of human hematopoietic cancer. We employed a PCR technique called viral insertion site amplification (VISA) to rapidly isolate genomic sequence at the site of provirus insertion. Here we describe 550 VISA sequence tags (VSTs) that identify 74 common insertion sites (CISs), of which 21 have not been identified previously. Several suspected proto-oncogenes and tumor suppressor genes lie near CISs, providing supportive evidence for their roles in cancer. Furthermore, numerous previously uncharacterized genes lie near CISs, providing a pool of candidate disease genes for future research. Pathway analysis of candidate genes identified several signaling pathways as common and powerful routes to blood cancer, including Notch, E-protein, NFκB, and Ras signaling. Misregulation of several Notch signaling genes was confirmed by quantitative RT-PCR. Our data suggest that analyses of insertional mutagenesis on a single genetic background are biased toward the identification of cooperating mutations. This tumor collection represents the most comprehensive study of the genetics of B-cell leukemia and lymphoma development in mice. We have deposited the VST sequences, CISs in a genome viewer, histopathology, and molecular tumor typing data in a public web database called VISION (Viral Insertion Sites Identifying Oncogenes), which is located at . Keith C. Weiser and Bin Liu are authors who contributed equally to this work.

10.
11.
Understanding how complex networks of genes integrate to produce dividing cells is an important goal that is limited by the difficulty in defining the function of individual genes. Current resources for the systematic identification of gene function such as siRNA libraries and collections of deletion strains are costly and organism specific. We describe here integration profiling, a novel approach to identify the function of eukaryotic genes based upon dense maps of transposon integration. As a proof of concept, we used the transposon Hermes to generate a library of 360,513 insertions in the genome of Schizosaccharomyces pombe. On average, we obtained one insertion for every 29 bp of the genome. Hermes integrated more often into nucleosome-free sites and 33% of the insertions occurred in ORFs. We found that ORFs with low integration densities successfully identified the genes that are essential for cell division. Importantly, the nonessential ORFs with intermediate levels of insertion correlated with the nonessential genes that have functions required for colonies to reach full size. This finding indicates that integration profiles can measure the contribution of nonessential genes to cell division. While integration profiling succeeded in identifying genes necessary for propagation, it also has the potential to identify genes important for many other functions such as DNA repair, stress response, and meiosis.
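The per-ORF integration density that this kind of profiling relies on can be sketched in a few lines. This is an illustrative calculation only, assuming half-open ORF coordinates on a single chromosome; the function name, gene names, and coordinates are hypothetical, and no density cutoff for calling a gene essential is implied.

```python
from bisect import bisect_left

def insertion_density(orfs, insertions):
    """Insertions per kb for each ORF.
    orfs maps a gene name to (start, end) with end exclusive;
    insertions is a list of insertion coordinates on the same chromosome."""
    sites = sorted(insertions)
    density = {}
    for name, (start, end) in orfs.items():
        # Number of insertion sites with start <= site < end.
        hits = bisect_left(sites, end) - bisect_left(sites, start)
        density[name] = 1000.0 * hits / (end - start)
    return density

# Toy example: geneA is densely hit, geneB sparsely, geneC not at all.
orfs = {"geneA": (0, 1000), "geneB": (2000, 2500), "geneC": (3000, 4000)}
insertions = [10, 200, 450, 800, 999, 2100]
density = insertion_density(orfs, insertions)
print(density)
```

ORFs with unusually low density relative to the genome-wide average would then be the candidates to inspect for essentiality.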

12.
13.
14.
The recombinant retrovirus, MoFe2-MuLV (MoFe2), was constructed by replacing the U3 region of Moloney murine leukemia virus (M-MuLV) with homologous sequences from the FeLV-945 LTR. NIH/Swiss mice neonatally inoculated with MoFe2 developed T-cell lymphomas of immature thymocyte surface phenotype. MoFe2 integrated infrequently (0 to 9%) near common insertion sites (CISs) previously identified for either parent virus. Using three different strategies, CISs in MoFe2-induced tumors were identified at six loci, none of which had been previously reported as CISs in tumors induced by either parent virus in wild-type animals. Two of the newly identified CISs had not previously been implicated in lymphoma in any retrovirus model. One of these, designated 3-19, encodes the p101 regulatory subunit of phosphoinositide-3-kinase-gamma. The other, designated Rw1, is predicted to encode a protein that functions in the immune response to virus infection. Thus, substitution of FeLV-945 U3 sequences into the M-MuLV long terminal repeat (LTR) did not alter the target tissue for M-MuLV transformation but significantly altered the pattern of CIS utilization in the induction of T-cell lymphoma. These observations support a growing body of evidence that the distinctive sequence and/or structure of the retroviral LTR determines its pattern of insertional activation. The findings also demonstrate the oligoclonal nature of retrovirus-induced lymphomas by demonstrating proviral insertions at CISs in subdominant populations in the tumor mass. Finally, the findings demonstrate the utility of novel recombinant retroviruses such as MoFe2 to contribute new genes potentially relevant to the induction of lymphoid malignancy.

15.
Site-selected insertion (SSI) is a PCR-based technique which uses primers located within the transposon and a target gene for detection of transposon insertions into cloned genes. We screened tomato plants bearing single or multiple copies of maize Ac or Ds transposable elements for somatic insertions at one close-range target and two long-range targets. Eight close-range Ds insertions near the right border of the T-DNA were recovered. Sequence analysis showed a precise junction between the transposon and the target for all insertions. Two insertions in separate plants occurred at the same site, but others appeared dispersed in the region of the right T-DNA border with no target specificity. However, insertions showed a preference for one orientation of the transposon. Use of plants with multiple Ac (HiAc) or Ds (HiDs) elements allowed detection of somatic insertions at two single-copy genes, PG (polygalacturonase) and DFR (dihydroflavonol 4-reductase). Certain HiDs plants showed much higher rates of insertion into PG than others. Insertions in PG and DFR were found throughout the gene regions monitored and, with the exception of one insertion in PG, the junctions between transposon and target were exact. SSI analysis of progeny from the HiDs parents revealed that in some cases the tendency to incur high levels of somatic insertions in PG was inherited. Inheritance of this character is an indication that SSI could be used to direct a search for germinal PG insertions in tomato.

16.
We report here the in vivo expression of the synthetic transposase gene himar1(a) in Streptomyces coelicolor M145 and Streptomyces albus. Using the synthetic himar1(a) gene adapted for Streptomyces codon usage, we showed random insertion of the transposon into the streptomycetes genome. The insertion frequency for the Himar1-derived minitransposons is nearly 100% of transformed Streptomyces cells, and insertions are stably inherited in the absence of an antibiotic selection. The minitransposons contain different antibiotic resistance selection markers (apramycin, hygromycin, and spectinomycin), site-specific recombinase target sites (rox and/or loxP), I-SceI meganuclease target sites, and an R6Kγ origin of replication for transposon rescue. We identified transposon insertion loci by random sequencing of more than 100 rescue plasmids. The majority of insertions were mapped to putative open-reading frames on the S. coelicolor M145 and S. albus chromosomes. These insertions included several new regulatory genes affecting S. coelicolor M145 growth and actinorhodin biosynthesis.

17.
A native composite transposon was isolated from Corynebacterium glutamicum ATCC 14751. This transposon comprises two functional copies of a corynebacterial IS31831-like insertion sequence organized as converging terminal inverted repeats. This novel 20.3-kb element, Tn14751, carries 17.4 kb of C. glutamicum chromosomal DNA containing various genes, including genes involved in purine biosynthesis but not genes related to bacterial warfare, such as genes encoding mediators of antibiotic resistance or extracellular toxins. A derivative of this element carrying a kanamycin resistance cassette, minicomposite Tn14751, transposed into the genome of C. glutamicum at an efficiency of 1.8 × 10^2 transformants per μg of DNA. Random insertion of the Tn14751 derivative carrying the kanamycin resistance cassette into the chromosome was verified by Southern hybridization. This work paves the way for realization of the concept of minimum genome factories in the search for metabolic engineering via genome-scale directed evolution through a combination of random and directed approaches.

18.
Transposon mutagenesis, in combination with parallel sequencing, is becoming a powerful tool for en-masse mutant analysis. A probability generating function was used to explain observed miniHimar transposon insertion patterns, and gene essentiality calls were made by transposon insertion frequency analysis (TIFA). TIFA incorporated the observed genome and sequence motif bias of the miniHimar transposon. The gene essentiality calls were compared to: 1) previous genome-wide direct gene-essentiality assignments; and, 2) flux balance analysis (FBA) predictions from an existing genome-scale metabolic model of Shewanella oneidensis MR-1. A three-way comparison between FBA, TIFA, and the direct essentiality calls was made to validate the TIFA approach. The refinement in the interpretation of observed transposon insertions demonstrated that genes without insertions are not necessarily essential, and that genes that contain insertions are not always nonessential. The TIFA calls were in reasonable agreement with direct essentiality calls for S. oneidensis, but agreed more closely with E. coli essentiality calls for orthologs. The TIFA gene essentiality calls were in good agreement with the MR-1 FBA essentiality predictions, and the agreement between TIFA and FBA predictions was substantially better than between the FBA and the direct gene essentiality predictions.
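Why a gene without insertions is not necessarily essential can be seen from a back-of-the-envelope calculation. The sketch below is not TIFA: it ignores the genome and sequence motif bias that TIFA models and assumes every insertion lands uniformly and independently on one of a fixed number of candidate sites. The function name and all numbers are hypothetical.

```python
def prob_no_insertion(gene_sites, total_sites, n_insertions):
    """Probability that a gene receives zero insertions purely by chance,
    assuming each of n_insertions independently hits one of total_sites
    uniformly at random (a simplifying assumption)."""
    return (1.0 - gene_sites / total_sites) ** n_insertions

# A short gene with 5 candidate sites out of 100,000, in a library
# of 20,000 independent insertions.
p_short = prob_no_insertion(gene_sites=5, total_sites=100_000, n_insertions=20_000)

# A longer gene with 50 candidate sites in the same library.
p_long = prob_no_insertion(gene_sites=50, total_sites=100_000, n_insertions=20_000)

print(p_short, p_long)
```

Under these assumptions the short gene escapes insertion roughly a third of the time, so an empty short gene is weak evidence of essentiality, whereas an empty long gene is much harder to explain by chance.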

19.
Genome-wide association studies (GWAS) have successfully identified susceptibility loci from marginal association analysis of SNPs. Valuable insight into genetic variation underlying complex diseases will likely be gained by considering functionally related sets of genes simultaneously. One approach is to further develop gene set enrichment analysis methods, which are initiated in gene expression studies, to account for the distinctive features of GWAS data. These features include the large number of SNPs per gene, the modest and sparse SNP associations, and the additional information provided by linkage disequilibrium (LD) patterns within genes. We propose a “gene set ridge regression in association studies (GRASS)” algorithm. GRASS summarizes the genetic structure for each gene as eigenSNPs and uses a novel form of regularized regression technique, termed group ridge regression, to select representative eigenSNPs for each gene and assess their joint association with disease risk. Compared with existing methods, the proposed algorithm greatly reduces the high dimensionality of GWAS data while still accounting for multiple hits and/or LD in the same gene. We show by simulation that this algorithm performs well in situations in which there are a large number of predictors compared to sample size. We applied the GRASS algorithm to a genome-wide association study of colon cancer and identified nicotinate and nicotinamide metabolism and transforming growth factor beta signaling as the top two significantly enriched pathways. Elucidating the role of variation in these pathways may enhance our understanding of colon cancer etiology.

20.
The CRISPR (clusters of regularly interspaced short palindromic repeats)–Cas adaptive immune system is an important defense system in bacteria, providing targeted defense against invasions of foreign nucleic acids. CRISPR–Cas systems consist of CRISPR loci and cas (CRISPR-associated) genes: sequence segments of invaders are incorporated into host genomes at CRISPR loci to generate specificity, while adjacent cas genes encode proteins that mediate the defense process. We pursued an integrated approach to identifying putative cas genes from genomes and metagenomes, combining similarity searches with genomic neighborhood analysis. Application of our approach to bacterial genomes and human microbiome datasets allowed us to significantly expand the collection of cas genes: the sequence space of the Cas9 family, the key player in the recently engineered RNA-guided platforms for genome editing in eukaryotes, is expanded by at least two-fold with metagenomic datasets. We found genes in cas loci encoding other functions, for example, toxins and antitoxins, confirming the recently discovered potential of coupling between adaptive immunity and the dormancy/suicide systems. We further identified 24 novel Cas families; one novel family contains 20 proteins, all identified from the human microbiome datasets, illustrating the importance of metagenomics projects in expanding the diversity of cas genes.
