Similar articles
 20 similar articles found.
1.
Nested effects models (NEMs) are a class of probabilistic models introduced to analyze the effects of gene perturbation screens visible in high-dimensional phenotypes like microarrays or cell morphology. NEMs reverse engineer upstream/downstream relations of cellular signaling cascades. NEMs take as input a set of candidate pathway genes and phenotypic profiles of perturbing these genes. NEMs return a pathway structure explaining the observed perturbation effects. Here, we describe the package nem, open-source software for efficiently inferring NEMs from data. Our software implements several search algorithms for model fitting and is applicable to a wide range of different data types and representations. The methods we present summarize the current state of the art in NEMs. AVAILABILITY: Our software is written in the R language and freely available via the Bioconductor project at http://www.bioconductor.org.

2.
In this article we highlight recent developments in computational functional genomics to identify networks of functionally related genes and proteins based on diverse sources of genomic data. Our specific focus is on statistical methods to identify genetic networks. We discuss integrated analysis of microarray datasets, methods to combine heterogeneous data sources, and the analysis of high-dimensional phenotyping screens, and we describe efforts to establish a reliable and unbiased gold standard for method comparison and evaluation.

3.
Nested Effects Models (NEMs) are a class of graphical models introduced to analyze the results of gene perturbation screens. NEMs explore noisy subset relations between the high-dimensional outputs of phenotyping studies, e.g., the effects showing in gene expression profiles or as morphological features of the perturbed cell. In this paper we expand the statistical basis of NEMs in four directions. First, we derive a new formula for the likelihood function of a NEM, which generalizes previous results for binary data. Second, we prove model identifiability under mild assumptions. Third, we show that the new formulation of the likelihood allows efficient traversal of model space. Fourth, we incorporate prior knowledge and an automated variable selection criterion to decrease the influence of noise in the data.

4.
5.
Nested effects models have been used successfully for learning subcellular networks from high-dimensional perturbation effects that result from RNA interference (RNAi) experiments. Here, we further develop the basic nested effects model using high-content single-cell imaging data from RNAi screens of cultured cells infected with human rhinovirus. RNAi screens with single-cell readouts are becoming increasingly common, and they often reveal high cell-to-cell variation. As a consequence of this cellular heterogeneity, knock-downs result in variable effects among cells and lead to weak average phenotypes on the cell population level. To address this confounding factor in network inference, we explicitly model the stimulation status of a signaling pathway in individual cells. We extend the framework of nested effects models to probabilistic combinatorial knock-downs and propose NEMix, a nested effects mixture model that accounts for unobserved pathway activation. We analyzed the identifiability of NEMix and developed a parameter inference scheme based on the Expectation Maximization algorithm. In an extensive simulation study, we show that NEMix improves learning of pathway structures over classical NEMs significantly in the presence of hidden pathway stimulation. We applied our model to single-cell imaging data from RNAi screens monitoring human rhinovirus infection, where limited infection efficiency of the assay results in uncertain pathway stimulation. Using a subset of genes with known interactions, we show that the inferred NEMix network has high accuracy and outperforms the classical nested effects model without hidden pathway activity. NEMix is implemented as part of the R/Bioconductor package ‘nem’ and available at www.cbg.ethz.ch/software/NEMix.

6.
High‐content imaging using automated microscopy and computer vision allows multivariate profiling of single‐cell phenotypes. Here, we present methods for the application of the CRISPR‐Cas9 system in large‐scale, image‐based, gene perturbation experiments. We show that CRISPR‐Cas9‐mediated gene perturbation can be achieved in human tissue culture cells in a timeframe that is compatible with image‐based phenotyping. We developed a pipeline to construct a large‐scale arrayed library of 2,281 sequence‐verified CRISPR‐Cas9 targeting plasmids and profiled this library for genes affecting cellular morphology and the subcellular localization of components of the nuclear pore complex (NPC). We conceived a machine‐learning method that harnesses genetic heterogeneity to score gene perturbations and identify phenotypically perturbed cells for in‐depth characterization of gene perturbation effects. This approach enables genome‐scale image‐based multivariate gene perturbation profiling using CRISPR‐Cas9.

7.
RNA interference (RNAi)-mediated loss-of-function screening in Drosophila melanogaster tissue culture cells is a powerful method for identifying the genes underlying cell biological functions and for annotating the fly genome. Here we describe the development of living-cell microarrays for screening large collections of RNAi-inducing double-stranded RNAs (dsRNAs) in Drosophila cells. The features of the microarrays consist of clusters of cells 200 μm in diameter, each with an RNAi-mediated depletion of a specific gene product. Because of the small size of the features, thousands of distinct dsRNAs can be screened on a single chip. The microarrays are suitable for quantitative and high-content cellular phenotyping and, in combination screens, for the identification of genetic suppressors, enhancers and synthetic lethal interactions. We used a prototype cell microarray with 384 different dsRNAs to identify previously unknown genes that affect cell proliferation and morphology, and, in a combination screen, that regulate dAkt/dPKB phosphorylation in the absence of dPTEN expression.

8.
Automated analysis of C. elegans behaviour is a rapidly developing field, offering the possibility of behaviour-based, high-throughput drug screens and systematic phenotyping. Standard methods for parameterizing worm shapes and movements are emerging, and progress has been made towards overcoming the difficulties introduced by interactions between worms, as well as worm coiling and omega turning. Current methods have facilitated the identification of subtle phenotypes and the characterisation of roles of neurones in forward locomotion and chemotaxis, as well as the quantitative characterisation of behaviour choice and circadian patterns of activity. Given the speed with which C. elegans has been deployed in genetic screens and chemical screens, it is to be hoped that wormtrackers may eventually provide similar rapidity in assaying behavioural phenotypes. However, considerable progress must be made before this can be accomplished. In the case of genome-wide RNAi screens, for example, the presence in the worm genome of some 19,000 genes means that even the minimal user intervention in an automatic phenotyping system will be very costly. Nonetheless, recent advances have shown that drug actions on large numbers of worms can be tracked, raising hopes that high-throughput behavioural screens may soon be available.

9.
Pollock DD, Larkin JC. Genetics, 2004, 168(1): 489-502.
Large-scale screens for loss-of-function mutants have played a significant role in recent advances in developmental biology and other fields. In such mutant screens, it is desirable to estimate the degree of "saturation" of the screen (i.e., what fraction of the possible target genes has been identified). We applied Bayesian and maximum-likelihood methods for estimating the number of loci remaining undetected in large-scale screens and produced credibility intervals to assess the uncertainty of these estimates. Since different loci may mutate to alleles with detectable phenotypes at different rates, we also incorporated variation in the degree of mutability among genes, using either gamma-distributed mutation rates or multiple discrete mutation rate classes. We examined eight published data sets from large-scale mutant screens and found that credibility intervals are much broader than implied by previous assumptions about the degree of saturation of screens. The likelihood methods presented here are a significantly better fit to data from published experiments than estimates based on the Poisson distribution, which implicitly assumes a single mutation rate for all loci. The results are reasonably robust to different models of variation in the mutability of genes. We tested our methods against mutant allele data from a region of the Drosophila melanogaster genome for which there is an independent genomics-based estimate of the number of undetected loci and found that the number of such loci falls within the predicted credibility interval for our models. The methods we have developed may also be useful for estimating the degree of saturation in other types of genetic screens in addition to classical screens for simple loss-of-function mutants, including genetic modifier screens and screens for protein-protein interactions using the yeast two-hybrid method.
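The single-rate Poisson baseline that this abstract argues against can be illustrated with a short calculation. The sketch below is not the authors' method: it assumes every locus mutates at the same rate, fits a zero-truncated Poisson distribution to the allele counts of the detected loci, and extrapolates the number of loci with no alleles. The function names and the example counts are hypothetical and chosen only for illustration.

```python
import math


def truncated_poisson_rate(mean_hits, tol=1e-10):
    """Solve lam / (1 - exp(-lam)) = mean_hits for lam by bisection.

    mean_hits is the average number of alleles per *detected* locus.
    Under a zero-truncated Poisson model this mean is always above 1.
    """
    if mean_hits <= 1.0:
        raise ValueError("mean alleles per detected locus must exceed 1")
    # The left-hand side increases with lam, tends to 1 as lam -> 0,
    # and is larger than lam, so the root lies in (0, mean_hits).
    lo, hi = 1e-12, mean_hits
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if mid / -math.expm1(-mid) < mean_hits:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def estimate_undetected(alleles_per_locus):
    """Return (rate, estimated number of undetected loci).

    Assumption: one shared mutation rate for all loci (the Poisson
    baseline), which the abstract reports fits published data poorly.
    """
    observed = len(alleles_per_locus)
    total = sum(alleles_per_locus)
    lam = truncated_poisson_rate(total / observed)
    p_detect = -math.expm1(-lam)      # probability a locus has >= 1 allele
    n_total = observed / p_detect     # estimated total number of target loci
    return lam, n_total - observed


# Hypothetical screen: six loci detected, with these allele counts.
rate, missing = estimate_undetected([1, 1, 2, 3, 1, 2])
print(f"rate per locus: {rate:.3f}, estimated undetected loci: {missing:.2f}")
```

Because this baseline forces one rate on all loci, it yields a single point estimate; the abstract's gamma-distributed or multi-class rate models relax exactly that assumption and add credibility intervals.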

10.
Adaptive quality-based clustering of gene expression profiles   Cited by: 17 (self-citations: 0, citations by others: 17)
MOTIVATION: Microarray experiments generate a considerable amount of data, which, when analyzed properly, helps us gain a huge amount of biologically relevant information about the global cellular behaviour. Clustering (grouping genes with similar expression profiles) is one of the first steps in data analysis of high-throughput expression measurements. A number of clustering algorithms have proved useful to make sense of such data. These classical algorithms, though useful, suffer from several drawbacks (e.g. they require the predefinition of arbitrary parameters like the number of clusters; they force every gene into a cluster despite a low correlation with other cluster members). In the following we describe a novel adaptive quality-based clustering algorithm that tackles some of these drawbacks. RESULTS: We propose a heuristic iterative two-step algorithm: First, we find in the high-dimensional representation of the data a sphere where the "density" of expression profiles is locally maximal (based on a preliminary estimate of the radius of the cluster: the quality-based approach). In a second step, we derive an optimal radius of the cluster (adaptive approach) so that only the significantly coexpressed genes are included in the cluster. This estimation is achieved by fitting a model to the data using an EM-algorithm. By inferring the radius from the data itself, the biologist is freed from finding an optimal value for this radius by trial-and-error. The computational complexity of this method is approximately linear in the number of gene expression profiles in the data set. Finally, our method is successfully validated using existing data sets. AVAILABILITY: http://www.esat.kuleuven.ac.be/~thijs/Work/Clustering.html

11.
Readily-accessible and standardised capture of genotypic variation has revolutionised our understanding of the genetic contribution to disease. Unfortunately, the corresponding systematic capture of patient phenotypic variation needed to fully interpret the impact of genetic variation has lagged far behind. Exploiting deep and systematic phenotyping of a cohort of 197 patients presenting with heterogeneous developmental disorders and whose genomes harbour de novo CNVs, we systematically applied a range of commonly-used functional genomics approaches to identify the underlying molecular perturbations and their phenotypic impact. Grouping patients into 408 non-exclusive patient-phenotype groups, we identified a functional association amongst the genes disrupted in 209 (51%) groups. We find evidence for a significant number of molecular interactions amongst the association-contributing genes, including a single highly-interconnected network disrupted in 20% of patients with intellectual disability, and show using microcephaly how these molecular networks can be used as baits to identify additional members whose genes are variant in other patients with the same phenotype. Exploiting the systematic phenotyping of this cohort, we observe phenotypic concordance amongst patients whose variant genes contribute to the same functional association but note that (i) this relationship shows significant variation across the different approaches used to infer a commonly perturbed molecular pathway, and (ii) that the phenotypic similarities detected amongst patients who share the same inferred pathway perturbation result from these patients sharing many distinct phenotypes, rather than sharing a more specific phenotype, inferring that these pathways are best characterized by their pleiotropic effects.

12.
Functional genomics screens using multi-parametric assays are powerful approaches for identifying genes involved in particular cellular processes. However, they suffer from problems like noise, and often provide little insight into molecular mechanisms. A bottleneck for addressing these issues is the lack of computational methods for the systematic integration of multi-parametric phenotypic datasets with molecular interactions. Here, we present Integrative Multi Profile Analysis of Cellular Traits (IMPACT). The main goal of IMPACT is to identify the most consistent phenotypic profile among interacting genes. This approach utilizes two types of external information: sets of related genes (IMPACT-sets) and network information (IMPACT-modules). Based on the notion that interacting genes are more likely to be involved in similar functions than non-interacting genes, this data is used as a prior to inform the filtering of phenotypic profiles that are similar among interacting genes. IMPACT-sets selects the most frequent profile among a set of related genes. IMPACT-modules identifies sub-networks containing genes with similar phenotype profiles. The statistical significance of these selections is subsequently quantified via permutations of the data. IMPACT (1) handles multiple profiles per gene, (2) rescues genes with weak phenotypes and (3) accounts for multiple biases e.g. caused by the network topology. Application to a genome-wide RNAi screen on endocytosis showed that IMPACT improved the recovery of known endocytosis-related genes, decreased off-target effects, and detected consistent phenotypes. Those findings were confirmed by rescreening 468 genes. Additionally we validated an unexpected influence of the IGF-receptor on EGF-endocytosis. IMPACT facilitates the selection of high-quality phenotypic profiles using different types of independent information, thereby supporting the molecular interpretation of functional screens.

13.
A system is constructed to automatically infer a genetic network by application of graphical Gaussian modeling to the expression profile data. Our system is composed of two parts: one part is automatic determination of cluster boundaries of profiles in hierarchical clustering, and the other part is inference of a genetic network by application of graphical Gaussian modeling to the clustered profiles. Since thousands or tens of thousands of gene expression profiles are measured under only one hundred conditions, the profiles naturally show some similar patterns. Therefore, a preprocessing step that systematically clusters the profiles is a prerequisite for inferring the relationship between the genes. For this purpose, a method for automatic determination of cluster boundaries is newly developed that requires no biological knowledge and no additional analyses. Then, the profiles for each cluster are analyzed by graphical Gaussian modeling to infer the relationship between the clusters. Thus, our system automatically provides a graph between clusters from the profile data alone. The performance of the present system is validated by 2467 profiles from yeast genes. The clusters and the genetic network obtained by our system are discussed in terms of the gene function and the known regulatory relationship between genes.

14.
Systematic perturbation screens provide comprehensive resources for the elucidation of cancer driver genes. The perturbation of many genes in relatively few cell lines in such functional screens necessitates the development of specialized computational tools with sufficient statistical power. Here we developed APSiC (Analysis of Perturbation Screens for identifying novel Cancer genes) to identify genetic drivers and effectors in perturbation screens even with few samples. Applying APSiC to the shRNA screen Project DRIVE, APSiC identified well-known and novel putative mutational and amplified cancer genes across all cancer types and in specific cancer types. Additionally, APSiC discovered tumor-promoting and tumor-suppressive effectors, respectively, for individual cancer types, including genes involved in cell cycle control, Wnt/β-catenin and hippo signalling pathways. We functionally demonstrated that LRRC4B, a putative novel tumor-suppressive effector, suppresses proliferation by delaying cell cycle and modulates apoptosis in breast cancer. We demonstrate APSiC is a robust statistical framework for discovery of novel cancer genes through analysis of large-scale perturbation screens. The analysis of DRIVE using APSiC is provided as a web portal and represents a valuable resource for the discovery of novel cancer genes.

15.
Reverse genetic screens have driven gene annotation and target discovery in model organisms. However, many disease‐relevant genotypes and phenotypes cannot be studied in lower organisms. It is therefore essential to overcome technical hurdles associated with large‐scale reverse genetics in human cells. Here, we establish a reverse genetic approach based on highly robust and sensitive multiplexed RNA sequencing of mutant human cells. We conduct 10 parallel screens using a collection of engineered haploid isogenic cell lines with knockouts covering tyrosine kinases and identify known and unexpected effects on signaling pathways. Our study provides proof of concept for a scalable approach to link genotype to phenotype in human cells, which has broad applications. In particular, it clears the way for systematic phenotyping of still poorly characterized human genes and for systematic study of uncharacterized genomic features associated with human disease.

16.
To successfully treat cancer we will likely need a much more detailed understanding of the genes and pathways meaningfully altered in individual cancer cases. One method for achieving this goal is to derive cancers in model organisms using unbiased forward genetic screens that allow cancer gene candidate discovery. We have developed a method using a “cut-and-paste” DNA transposon system called Sleeping Beauty (SB) to perform forward genetic screens for cancer genes in mice. Although the approach is conceptually similar to the use of replication competent retroviruses for cancer gene identification, the SB system promises to allow such screens in tissues previously not amenable to forward genetic screens such as the gastrointestinal tract, brain, and liver. This article describes the strains useful for SB-based screens for cancer genes in mice and how they are deployed in an experiment.

17.
Prokopenko SN, He Y, Lu Y, Bellen HJ. Genetics, 2000, 156(4): 1691-1715.
In our quest for novel genes required for the development of the embryonic peripheral nervous system (PNS), we have performed three genetic screens using MAb 22C10 as a marker of terminally differentiated neurons. A total of 66 essential genes required for normal PNS development were identified, including 49 novel genes. To obtain information about the molecular nature of these genes, we decided to complement our genetic screens with a molecular screen. From transposon-tagged mutations identified on the basis of their phenotype in the PNS we selected 31 P-element strains representing 26 complementation groups on the second and third chromosomes to clone and sequence the corresponding genes. We used plasmid rescue to isolate and sequence 51 genomic fragments flanking the sites of these P-element insertions. Database searches using sequences derived from the ends of plasmid rescues allowed us to assign genes to one of four classes: (1) previously characterized genes (11), (2) first mutations in cloned genes (1), (3) P-element insertions in genes that were identified, but not characterized molecularly (1), and (4) novel genes (13). Here, we report the cloning, sequence, Northern analysis, and the embryonic expression pattern of candidate cDNAs for 10 genes: astray, chrowded, dalmatian, gluon, hoi-polloi, melted, pebble, skittles, sticky ch1, and vegetable. This study allows us to draw conclusions about the identity of proteins required for the development of the nervous system in Drosophila and provides an example of a molecular approach to characterize en masse transposon-tagged mutations identified in genetic screens.

18.
There is a need for improved appreciation of the importance of genome-wide mRNA and protein expression measurements and their role in understanding translation and in relation to genome-wide mathematical frameworks for gene expression regulation. We investigated the use of a high-density microarray technique for mRNA expression analysis and a two-dimensional protein electrophoresis-tandem mass spectrometry method for protein analysis to monitor changes in gene expression. We applied these analytical tools in the context of an environmental perturbation of Escherichia coli cells: the addition of varying amounts of IPTG. We also tested the application of these tools to the study of a genetic perturbation of Escherichia coli cells: the ability of certain strains to hypersecrete the hemolysin protein. We observed a lack of correspondence between mRNA and protein expression profiles. Although our data do not include measurements on all expressed genes (because the ability to measure protein expression profiles is limiting), we observed that the qualitative and quantitative behavior of the measurements of a subset of expressed genes is similar to the behavior of the entire system. The change in observed average mRNA and protein amplification factors for 77 and 52 genes coincided with the observed change in mRNA amplification factor for the entire system. Furthermore, we found that the use of relative changes in expression could be used to elucidate mechanisms of gene expression regulation for the system studied, even when measurements were made on a small subset of the system.

19.
20.
The mouse is a proven model for studying human disease. Many strains exist that exhibit either natural or engineered genetic variation and thereby enable the elucidation of pathways involved in the development of cardiovascular disease. Although those mouse models have been fundamental to advancing our knowledge base, we are still at an early stage in understanding how genes contribute to complex disorders. There remains a need for new animal models that closely represent human disease. To expedite their development, we have established the Center for New Mouse Models of Heart, Lung, Blood, and Sleep Disorders at The Jackson Laboratory. We are using a phenotype-driven approach to identify mutations leading to atherosclerosis, hypertension, obesity, blood disorders, lung dysfunction, thrombosis, and disordered sleep. Our high-throughput, comprehensive phenotyping draws from two sources for new models: 1) the natural variation among over 40 inbred mouse strains and 2) chemically induced, whole-genome mutagenized mice. Here, we review our cardiovascular screens and present some hypertensive, obese, and cardiovascular models identified with this approach.
