Similar literature
 Found 20 similar articles; search time: 15 ms
1.
Retroviral insertional mutagenesis screens, which identify genes involved in tumor development in mice, have yielded a substantial number of retroviral integration sites, and this number is expected to grow substantially due to the introduction of high-throughput screening techniques. The data of various retroviral insertional mutagenesis screens are compiled in the publicly available Retroviral Tagged Cancer Gene Database (RTCGD). Integrally analyzing these screens for the presence of common insertion sites (CISs, i.e., regions in the genome that have been hit by viral insertions in multiple independent tumors significantly more than expected by chance) requires an approach that corrects for the increased probability of finding false CISs as the amount of available data increases. Moreover, significance estimates of CISs should be established taking into account both the noise, arising from the random nature of the insertion process, as well as the bias, stemming from preferential insertion sites present in the genome and the data retrieval methodology. We introduce a framework, the kernel convolution (KC) framework, to find CISs in a noisy and biased environment using a predefined significance level while controlling the family-wise error (FWE) (the probability of detecting false CISs). Where previous methods use one, two, or three predetermined fixed scales, our method is capable of operating at any biologically relevant scale. This creates the possibility to analyze the CISs in a scale space by varying the width of the CISs, providing new insights in the behavior of CISs across multiple scales. Our method also features the possibility of including models for background bias. Using simulated data, we evaluate the KC framework using three kernel functions, the Gaussian, triangular, and rectangular kernel function. 
We applied the Gaussian KC to the data from the combined set of screens in the RTCGD and found that 53% of the CISs do not reach the significance threshold in this combined setting. Still, with the FWE under control, application of our method resulted in the discovery of eight novel CISs, each of which has a probability of less than 5% of being a false detection.
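
The kernel convolution idea in this abstract can be illustrated with a minimal sketch. It assumes a single linear genome, a Gaussian kernel, and a uniform null model for random insertions; the abstract's background-bias models are not reproduced here. The function names (kernel_density, fwe_threshold, find_cis) and all parameters are hypothetical and are not taken from the paper.

```python
import math
import random


def kernel_density(insertions, grid, scale):
    """Sum a Gaussian kernel of width `scale` centred on every insertion."""
    return [
        sum(math.exp(-0.5 * ((x - pos) / scale) ** 2) for pos in insertions)
        for x in grid
    ]


def fwe_threshold(n_insertions, genome_length, grid, scale,
                  n_permutations=200, alpha=0.05, seed=0):
    """Peak height that the maximum of a random insertion set exceeds in
    roughly `alpha` of the simulations (assumption: uniform null model)."""
    rng = random.Random(seed)
    max_peaks = []
    for _ in range(n_permutations):
        null = [rng.uniform(0, genome_length) for _ in range(n_insertions)]
        max_peaks.append(max(kernel_density(null, grid, scale)))
    max_peaks.sort()
    index = min(len(max_peaks) - 1, math.ceil((1 - alpha) * len(max_peaks)))
    return max_peaks[index]


def find_cis(insertions, genome_length, scale, step=None, **kwargs):
    """Return (peaks, threshold); peaks are local maxima above the threshold."""
    step = step or scale / 2
    grid = [i * step for i in range(int(genome_length / step) + 1)]
    density = kernel_density(insertions, grid, scale)
    threshold = fwe_threshold(len(insertions), genome_length, grid, scale,
                              **kwargs)
    peaks = []
    for i in range(1, len(grid) - 1):
        is_local_max = (density[i] >= density[i - 1]
                        and density[i] >= density[i + 1])
        if is_local_max and density[i] > threshold:
            peaks.append((grid[i], density[i]))
    return peaks, threshold
```

Because the threshold is taken from the distribution of the maximum peak over the whole genome, a single comparison per peak controls the family-wise error in this sketch. Re-running find_cis with different values of scale gives a crude view of the scale space mentioned in the abstract.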

2.
Mobile genetic elements (MGEs) account for a significant fraction of eukaryotic genomes and are implicated in altered gene expression and disease. We present an efficient computational protocol for MGE insertion site analysis. ELAN, the suite of tools described here, uses standard techniques to identify different MGEs and their distribution on the genome. One component, DNASCANNER, analyses known insertion sites of MGEs for the presence of signals that are based on a combination of local physical and chemical properties. ISF (insertion site finder) is a machine-learning tool that incorporates information derived from DNASCANNER. ISF permits classification of a given DNA sequence as a potential insertion site or not, using a support vector machine. We have studied the genomes of Homo sapiens, Mus musculus, Drosophila melanogaster and Entamoeba histolytica via a protocol whereby DNASCANNER is used to identify a common set of statistically important signals flanking the insertion sites in the various genomes. These are used in ISF for insertion site prediction, and the current accuracy of the tool is over 65%. We find similar signals at gene boundaries and splice sites. Together, these data are suggestive of a common insertion mechanism that operates in a variety of eukaryotes.

3.
4.
There have been many different approaches employed to define the "consensus" sequence of various DNA binding sites and to use the definition obtained to locate and rank members of a given sequence family. The analysis presented here enlists two of these approaches, each in modified form, to develop a highly efficient search protocol for Escherichia coli promoters and to provide a relative ranking of these sites showing good agreement with in vitro measurements of promoter strength. Schneider et al. have applied Shannon's index of information content to evaluate the significance of each position within the consensus of a family of aligned sequences. In a formal sense, this index is only applicable to a group of sequences, providing at each position a negative entropy value between zero (random) and two bits (total conservation of a single base) for sequences in which all bases are equally represented. A method for evaluating how well an individual sequence conforms to the information content pattern of the consensus is described. A function is derived, by analogy to the information content of the sequence family, for application to individual sequences. Since this function is a measure of conformity, it can be used in a search protocol to identify new members of the family represented by the consensus. A protocol for locating E. coli promoters is presented. The Berg-von Hippel statistical-mechanical function is also tested in a similar application. While the information content function provides a superior search protocol, the Berg-von Hippel function, when scaled at each position by the information content, does well at ranking promoters according to their strength as measured in vitro.
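
The per-position information content described in this abstract (zero bits for a random position, two bits for a fully conserved base) can be sketched as follows. The sketch assumes equal background frequencies for the four bases and omits any small-sample correction. The function name information_content is hypothetical, and the per-sequence conformity function derived in the paper is not reproduced because the abstract does not give its form.

```python
import math
from collections import Counter

BASES = "ACGT"


def information_content(aligned):
    """Per-position information (in bits) for equal-length aligned DNA
    sequences, assuming all four bases are equally likely a priori."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    profile = []
    for column in zip(*aligned):
        counts = Counter(column)
        entropy = 0.0
        for base in BASES:
            freq = counts[base] / len(column)
            if freq > 0:
                entropy -= freq * math.log2(freq)
        # Two bits is the maximum uncertainty over four equally likely bases.
        profile.append(2.0 - entropy)
    return profile
```

For example, in the alignment ACGT, ACGA, ACGC, ACGG the first three positions are fully conserved and the last is uniformly distributed over the four bases, so the profile runs from two bits down to zero at the final position.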

5.
6.
The exact site of transgene insertion into a plant host genome is one feature of the genetic transformation process that cannot, at present, be controlled and is often poorly understood. The site of transgene insertion may have implications for transgene stability and for potential unintended effects of the transgene on plant metabolism. To increase our understanding of transgene insertion sites in barley, a detailed analysis of transgene integration in independently derived transgenic barley lines was carried out. Fluorescence in situ hybridization (FISH) was used to physically map 23 transgene integration sites from 19 independent barley lines. Genetic mapping further confirmed the location of the transgenes in 11 of these lines. Transgene integration sites were present only on five of the seven barley chromosomes. The pattern of transgene integration appeared to be nonrandom and there was evidence of clustering of independent transgene insertion events within the barley genome. In addition, barley genomic regions flanking the transgene insertion site were isolated for seven independent lines. The data from the transgene flanking regions indicated that transgene insertions were preferentially located in gene-rich areas of the genome. These results are discussed in relation to the structure of the barley genome.

7.
Coupled mutagenesis screens and genetic mapping in zebrafish   Cited by: 4 (self-citations: 0; citations by others: 4)
Forward genetic analysis is one of the principal advantages of the zebrafish model system. However, managing zebrafish mutant lines derived from mutagenesis screens and mapping the corresponding mutations and integrating them into the larger collection of mutations remain arduous tasks. To simplify and focus these endeavors, we developed an approach that facilitates the rapid mapping of new zebrafish mutations as they are generated through mutagenesis screens. We selected a minimal panel of 149 simple sequence length polymorphism markers for a first-pass genome scan in crosses involving C32 and SJD inbred lines. We also conducted a small chemical mutagenesis screen that identified several new mutations affecting zebrafish embryonic melanocyte development. Using our first-pass marker panel in bulked-segregant analysis, we were able to identify the genetic map positions of these mutations as they were isolated in our screen. Rapid mapping of the mutations facilitated stock management, helped direct allelism tests, and should accelerate identification of the affected genes. These results demonstrate the efficacy of coupling mutagenesis screens with genetic mapping.

8.
9.
Chromosomal insertion sites for phages and plasmids.   Cited by: 14 (self-citations: 14; citations by others: 14)

10.
Hu YJ. Nucleic Acids Research 2003, 31(13): 3446-3449
RNA molecules play an important role in many biological activities. Knowing their secondary structure can help us better understand a molecule's ability to function. The methods for RNA structure determination have traditionally been implemented through biochemical, biophysical and phylogenetic analyses. With the advance of computer technology, an increasing number of computational approaches have recently been developed. They have different goals and apply various algorithms. For example, some focus on secondary structure prediction for a single sequence; some aim at finding a global alignment of multiple sequences. Some predict the structure based on free energy minimization; some make comparative sequence analyses to determine the structure. In this paper, we describe how to correctly use GPRM, a genetic programming approach to finding common secondary structure elements in a set of unaligned coregulated or homologous RNA sequences. GPRM can be accessed at http://bioinfo.cis.nctu.edu.tw/service/gprm/.

11.
Shaham S. PLoS ONE 2007, 2(11): e1117
In genetic screens, the number of mutagenized gametes examined is an important parameter for evaluating screen progress, the number of genes of a given mutable phenotype, gene size, cost, and labor. Since genetic screens often entail examination of thousands or tens of thousands of animals, strategies for optimizing genetic screens are important for minimizing effort while maximizing the number of mutagenized gametes examined. To date, such strategies have not been described for genetic screens in the nematode Caenorhabditis elegans. Here we review general principles of genetic screens in C. elegans, and use a modified binomial strategy to obtain a general expression for the number of mutagenized gametes examined in a genetic screen. We use this expression to calculate optimal screening parameters for a large range of genetic screen types. In addition, we developed a simple online genetic-screen-optimization tool that can be used independently of this paper. Our results demonstrate that choosing the optimal F2-to-F1 screening ratio can significantly improve screen efficiency.

12.
Humans are mammals, not bacteria or plants, yeast or nematodes, insects or fish. Mice are also mammals, but unlike gorilla and goat, fox and ferret, giraffe and jackal, they are suited perfectly to the laboratory environment and genetic experimentation. In this review, we will summarize the tools, tricks and techniques for executing forward genetic screens in the mouse and argue that this approach is now accessible to most biologists, rather than being the sole domain of large national facilities and specialized genetics laboratories.

13.
Maize (Zea mays) is an excellent model for basic research. Genetic screens have informed our understanding of developmental processes, meiosis, epigenetics and biochemical pathways--not only in maize but also in other cereal crops. We discuss the forward and reverse genetic screens that are possible in this organism, and emphasize the available tools. Screens exploit the well-studied behaviour of transposon systems, and the distinctive chromosomes allow an integration of cytogenetics into mutagenesis screens and analyses. The imminent completion of the maize genome sequence provides the essential resource to move seamlessly from gene to phenotype and back.

14.
Graph-based methods have been widely used for the analysis of biological networks. Their application to metabolic networks has been much discussed, in particular noting that an important weakness in such methods is that reaction stoichiometry is neglected. In this study, we show that reaction stoichiometry can be incorporated into path-finding approaches via mixed-integer linear programming. This major advance at the modeling level results in improved prediction of topological and functional properties in metabolic networks.

15.
16.
Non-acute transforming retroviruses like mouse mammary tumor virus (MMTV) cause cancer, at least in part, through integration near cellular genes involved in growth control, thereby de-regulating their expression. It is well-established that MMTV commonly integrates near and activates expression of members of the Wnt and Fgf pathways in mammary tumors. However, there are a significant number of tumors for which the proviral integration sites have not been identified. Here, we used high-throughput screening to identify common integration sites (CISs) in MMTV-induced tumors from C3H/HeN and BALB/c mice. As expected, members of both the Wnt and Fgf families were identified in this screen. In addition, a number of novel CISs were found, including Tcf7l2, Antxr1/Tem8, and Arhgap18. We show here that expression of these three putative oncogenes in normal murine mammary gland cells altered their growth kinetics and caused their morphological transformation when grown in three-dimensional cultures. Additionally, expression of Tcf7l2 and Antxr1/Tem8 sensitized cells to exogenous WNT ligand. As Tcf7l2, Antxr1/Tem8, and Arhgap18 have been associated with human breast and other cancers, these data demonstrate that MMTV-induced insertional mutation remains an important means for identifying genes involved in breast cancer.

17.
We developed a dynamic programming approach for computing common sequence structure patterns between two RNAs given their primary sequences and their secondary structures. Common patterns between two RNAs are defined to share the same local sequential and structural properties. The locality is based on the connections of nucleotides given by their phosphodiester and hydrogen bonds. The idea of interpreting secondary structures as chains of structure elements leads us to develop an efficient dynamic programming approach in time O(nm) and space O(nm), where n and m are the lengths of the RNAs. The biological motivation is given by detecting common, local regions of RNAs, although they do not necessarily share global sequential and structural properties. This might happen if RNAs fold into different structures but share a lot of local, stable regions. Here, we illustrate our algorithm on Hepatitis C virus internal ribosome entry sites. Our method is useful for detecting and describing local motifs as well. An implementation in C++ is available and can be obtained by contacting one of the authors.

18.
Vectors based on γ-retroviruses or lentiviruses have been shown to stably express therapeutic transgenes and effectively cure different hematological diseases. Molecular follow-up of the insertional repertoire of gene corrected cells in patients and preclinical animal models revealed different integration preferences in the host genome including clusters of integrations in small genomic areas (CIS; common integration sites). In the majority, these CIS were found in or near genes, with the potential to influence the clonal fate of the affected cell. To determine whether the observed degree of clustering is statistically compatible with an assumed standard model of spatial distribution of integrants, we have developed various methods and computer programs for γ-retroviral and lentiviral integration site distribution. In particular, we have devised and implemented mathematical and statistical approaches for comparing two experimental samples with different numbers of integration sites with respect to the propensity to form CIS as well as for the analysis of coincidences of integration sites obtained from different blood compartments. The programs and statistical tools described here are available as workspaces in R code and allow the fast detection of excessive clustering of integration sites from any retrovirally transduced sample and thus contribute to the assessment of potential treatment-related risks in preclinical and clinical retroviral gene therapy studies.

19.
This paper describes a generic algorithm for finding restriction sites within DNA sequences. The ‘genericity’ of the algorithm is made possible through the use of set theory. Basic elements of DNA sequences, i.e. nucleotides (bases), are represented in sets, and DNA sequences, whether specific, ambiguous or even protein-coding, are represented as sequences of those sets. The set intersection operation demonstrates its ability to perform pattern-matching correctly on various DNA sequences. The performance analysis showed that the degree of complexity of the pattern matching is reduced from exponential to linear. An example is given to show the actual and potential restriction sites, derived by the generic algorithm, in the DNA sequence template coding for a synthetic calmodulin. Received on October 2, 1990; accepted on December 18, 1990
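
The set-based matching described in this abstract can be sketched with the standard IUPAC nucleotide codes, where each symbol stands for a set of bases and two symbols are compatible when their sets intersect. This is an illustration, not the paper's implementation: the names IUPAC, to_sets and find_sites are hypothetical, and treating a hit as "actual" when every sequence set is contained in the site set (and "potential" otherwise) is an assumed reading of the abstract's wording.

```python
# Each IUPAC symbol is represented as the set of bases it may stand for.
IUPAC = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "R": {"A", "G"}, "Y": {"C", "T"}, "S": {"C", "G"}, "W": {"A", "T"},
    "K": {"G", "T"}, "M": {"A", "C"},
    "B": {"C", "G", "T"}, "D": {"A", "G", "T"},
    "H": {"A", "C", "T"}, "V": {"A", "C", "G"},
    "N": {"A", "C", "G", "T"},
}


def to_sets(sequence):
    """Translate a (possibly ambiguous) DNA string into a list of base sets."""
    return [IUPAC[symbol] for symbol in sequence.upper()]


def find_sites(sequence, site):
    """Return (start, is_actual) for every window compatible with `site`.

    A window is compatible when each position's set intersects the site's
    set. It is flagged as actual when every sequence set is a subset of the
    site set (assumption), and as potential otherwise.
    """
    seq_sets = to_sets(sequence)
    site_sets = to_sets(site)
    hits = []
    for start in range(len(seq_sets) - len(site_sets) + 1):
        window = seq_sets[start:start + len(site_sets)]
        if all(a & b for a, b in zip(window, site_sets)):
            is_actual = all(a <= b for a, b in zip(window, site_sets))
            hits.append((start, is_actual))
    return hits
```

Calling find_sites("AAGAATTCGG", "GAATTC") reports the EcoRI site at offset 2 as actual, while replacing one base of the sequence with N turns the same offset into a potential site. Each window is checked position by position, so ambiguous symbols never need to be expanded into all the concrete sequences they could represent.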

20.
This article summarizes the general principles of selections and screens in Escherichia coli. The focus is on the lac operon, owing to its inherent simplicity and versatility. Examples of different strategies for mutagenesis and mutant discovery are described. In particular, the usefulness and effectiveness of simple colour-based screens are illustrated. The power of lac genetics can be applied to almost any bacterial system with gene fusions that hook any gene of interest to lacZ, which is the structural gene that encodes beta-galactosidase. The diversity of biological processes that can be studied with lac genetics is remarkable and includes DNA metabolism, gene regulation and signal transduction, protein localization and folding, and even electron transport.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号