1.
Kauffman S 《Journal of theoretical biology》2004,230(4):22-590
Understanding the genetic regulatory network comprising genes, RNA, proteins and the network connections and dynamical control rules among them is a major task of contemporary systems biology. I focus here on the use of the ensemble approach to find one or more well-defined ensembles of model networks whose statistical features match those of real cells and organisms. Such ensembles should help explain and predict features of real cells and organisms. More precisely, an ensemble of model networks is defined by constraints on the "wiring diagram" of regulatory interactions, and the "rules" governing the dynamical behavior of regulated components of the network. The ensemble consists of all networks consistent with those constraints. Here I discuss ensembles of random Boolean networks, scale-free Boolean networks, "medusa" Boolean networks, continuous variable networks, and others. For each ensemble, M statistical features are measured, such as the size distribution of avalanches in gene activity changes unleashed by transiently altering the activity of a single gene, the distribution of distances between gene activities in different cell types, and others. This creates an M-dimensional space, where each ensemble corresponds to a cluster of points or distributions. Using current and future experimental techniques, such as gene arrays, these M properties are to be measured for real cells and organisms, again yielding a cluster of points or distributions in the M-dimensional space. The procedure then finds ensembles close to those of real cells and organisms, and hill climbs to attempt to match the observed M features. One thus obtains one or more ensembles that should predict and explain many features of the regulatory networks in cells and organisms.
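One of the statistical features named in this abstract, the avalanche of gene activity changes that follows a transient flip of a single gene, can be illustrated on a small random Boolean network. The following is only a minimal sketch under stated assumptions: the network size, the number of inputs per gene, synchronous updating, and all function names are hypothetical choices made for illustration and are not taken from the paper.

```python
import random

def make_network(n, k, rng):
    """Build a random Boolean network: each of n genes gets k random inputs
    and a random truth table over those inputs (an illustrative ensemble)."""
    inputs = [rng.sample(range(n), k) for _ in range(n)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables

def step(state, inputs, tables):
    """Synchronously update every gene from the current values of its inputs."""
    new_state = []
    for ins, table in zip(inputs, tables):
        index = 0
        for i in ins:
            index = (index << 1) | state[i]
        new_state.append(table[index])
    return new_state

def avalanche_size(state, gene, inputs, tables, steps):
    """Count genes whose activity ever differs between an unperturbed run
    and a run in which one gene was flipped at time zero."""
    reference = list(state)
    perturbed = list(state)
    perturbed[gene] ^= 1
    damaged = {gene}
    for _ in range(steps):
        reference = step(reference, inputs, tables)
        perturbed = step(perturbed, inputs, tables)
        damaged.update(i for i in range(len(state)) if reference[i] != perturbed[i])
    return len(damaged)

# Hypothetical parameters: 50 genes, 2 inputs each, 20 update steps.
rng = random.Random(0)
n, k = 50, 2
inputs, tables = make_network(n, k, rng)
state = [rng.randint(0, 1) for _ in range(n)]
sizes = [avalanche_size(state, g, inputs, tables, steps=20) for g in range(n)]
```

The list `sizes` is one sample of the avalanche size distribution for one network; repeating this over many networks drawn under the same constraints would give the kind of ensemble-level distribution the abstract describes.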
2.
Background
With the advances in high-throughput gene profiling technologies, a large volume of gene interaction maps has been constructed. A higher-level layer of gene-gene interaction, namely modulated gene interaction, is composed of gene pairs whose interaction strengths are modulated by (i.e., dependent on) the expression level of a key modulator gene. Systematic investigations into the modulation by estrogen receptor (ER), the best-known modulator gene, have revealed its functional and prognostic significance in breast cancer. However, a genome-wide identification of key modulator genes that may further unveil the landscape of modulated gene interaction is still lacking.
Results
We proposed a systematic workflow to screen for key modulators based on genome-wide gene expression profiles. We designed four modularity parameters to measure the ability of a putative modulator to perturb gene interaction networks. Applying the method to a dataset of 286 breast tumors, we comprehensively characterized the modularity parameters and identified a total of 973 key modulator genes. The modularity of these modulators was verified in three independent breast cancer datasets. ESR1, the gene encoding ER, appeared in the list, and many novel modulators were revealed. For instance, a prognostic predictor of breast cancer, SFRP1, was found to be the second-ranked modulator. Functional annotation analysis of the 973 modulators revealed involvement in ER-related cellular processes as well as immune- and tumor-associated functions.
Conclusions
Here we present, as far as we know, the first comprehensive analysis of key modulator genes on a genome-wide scale. The validity of the filtering parameters and the conservation of modulators across cohorts were corroborated. Our data bring new insights into the modulated layer of gene-gene interaction and provide candidates for further biological investigations.
3.
ABSTRACT: BACKGROUND: Genome-wide gene-gene interaction analysis using single nucleotide polymorphisms (SNPs) is an attractive way to identify genetic components that confer susceptibility to human complex diseases. However, individual hypothesis testing of SNP-SNP pairs, as in a common genome-wide association study (GWAS), makes it difficult to set an overall p-value because of the complicated correlation structure; this multiple testing problem causes unacceptable false negative results. The number of SNP-SNP pairs far exceeds the sample size, the so-called large p small n problem, which precludes simultaneous analysis using multiple regression. A method that overcomes these issues is thus needed. RESULTS: We adopt an up-to-date method for ultrahigh-dimensional variable selection termed sure independence screening (SIS) [17] to handle the very large number of SNP-SNP interactions appropriately by including them as predictor variables in logistic regression. We propose a ranking strategy using promising dummy coding methods, followed by a variable selection procedure in the SIS method suitably modified for gene-gene interaction analysis. We also implemented the procedures in a software program, EPISIS, using cost-effective GPGPU (general-purpose computing on graphics processing units) technology. EPISIS can complete an exhaustive search for SNP-SNP interactions in a standard GWAS dataset within several hours. The proposed method works successfully in simulation experiments and in application to real WTCCC (Wellcome Trust Case-Control Consortium) data. CONCLUSIONS: Based on the machine-learning principle, the proposed method provides a powerful and flexible genome-wide search for various patterns of gene-gene interaction.
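The screening idea in this abstract, ranking every SNP-SNP interaction term by a marginal measure and keeping only the top candidates, can be sketched as follows. This is a simplified stand-in and not the EPISIS procedure: it assumes a product coding of the interaction term and ranks by absolute Pearson correlation with case status instead of a marginal logistic regression fit. The function names and the toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def screen_pairs(genotypes, status, keep):
    """Rank all SNP pairs by the marginal association of their product term
    with case status and return the `keep` highest-ranked pairs."""
    scored = []
    for i, j in combinations(range(len(genotypes[0])), 2):
        term = [row[i] * row[j] for row in genotypes]  # assumed product coding
        scored.append((abs(pearson(term, status)), (i, j)))
    scored.sort(key=lambda item: -item[0])
    return [pair for _, pair in scored[:keep]]

# Toy data (hypothetical): three binary markers, status driven by markers 0 and 1.
toy_genotypes = [[a, b, c] for a in (0, 1) for b in (0, 1) for c in (0, 1)]
toy_status = [row[0] * row[1] for row in toy_genotypes]
top_pairs = screen_pairs(toy_genotypes, toy_status, keep=1)
```

In a real analysis the retained pairs would then enter a joint variable selection step, which is omitted here.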
4.
5.
6.
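Complex diseases result from gene-gene and gene-environment interactions, and detecting high-dimensional gene interactions poses a major computational challenge. Over the past 20 years, machine learning methods have been applied to the detection of gene-gene interactions with some success. This paper reviews research progress on machine learning methods for detecting gene interactions. It systematically introduces the principles and limitations of neural networks (NN), random forest (RF), support vector machines (SVM), and multifactor dimensionality reduction (MDR) for detecting gene interactions in genome-wide association studies (GWAS), and discusses prospects for future research.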
7.
Protein-protein interaction plays a major role in all biological processes. The currently available genetic methods such as the two-hybrid system and the protein recruitment system are relatively limited in their ability to identify interactions with integral membrane proteins. Here we describe the development of a reverse Ras recruitment system (reverse RRS), in which the bait used encodes a membrane protein. The bait is expressed in its natural environment, the membrane, whereas the protein partner (the prey) is fused to a cytoplasmic Ras mutant. Protein-protein interaction between the proteins encoded by the prey and the bait results in Ras membrane translocation and activation of a viability pathway in yeast. We devised the expression of the bait and prey proteins under the control of dual distinct inducible promoters, thus enabling a rapid selection of transformants in which growth is attributed solely to specific protein-protein interaction. The reverse RRS approach greatly extends the usefulness of the protein recruitment systems and the use of integral membrane proteins as baits. The system serves as an attractive approach to explore novel protein-protein interactions with high specificity and selectivity, where other methods fail.
8.
9.
Background
With the development of high-throughput genotyping and sequencing technology, there is growing evidence of associations between genetic variants and complex traits. Although thousands of genetic variants have been discovered, such genetic markers have been shown to explain only a very small proportion of the underlying genetic variance of complex traits. Gene-gene interaction (GGI) analysis is expected to unveil a large portion of the unexplained heritability of complex traits.
Methods
In this work, we propose IGENT, an Information theory-based GEnome-wide gene-gene iNTeraction method. IGENT is an efficient algorithm for identifying genome-wide gene-gene interactions (GGI) and gene-environment interactions (GEI). To detect significant GGIs on a genome-wide scale, it is important to reduce the computational burden substantially. Our method uses information gain (IG) and evaluates its significance without resampling.
Results
Through our simulation studies, the power of IGENT is shown to be better than or equivalent to that of BOOST. The proposed method successfully detected GGIs for bipolar disorder in the Wellcome Trust Case Control Consortium (WTCCC) data and for age-related macular degeneration (AMD).
Conclusions
The proposed method is implemented in C++ and is available on Windows, Linux and MacOSX.
10.
Background
Since the introduction of large-scale genotyping methods that can be utilized in genome-wide association (GWA) studies for deciphering complex diseases, statistical genetics has faced the tremendous challenge of how to analyze such data most appropriately. A plethora of advanced model-based methods for genetic mapping of traits has been available for more than 10 years in animal and plant breeding. However, most such methods are computationally intractable in the context of genome-wide studies. Therefore, it is hardly surprising that GWA analyses have in practice been dominated by simple statistical tests concerned with a single marker locus at a time, while the more advanced approaches have appeared only relatively recently in the biomedical and statistical literature.
11.
Lou XY Chen GB Yan L Ma JZ Mangold JE Zhu J Elston RC Li MD 《American journal of human genetics》2008,83(4):457-467
Widespread multifactor interactions present a significant challenge in determining risk factors of complex diseases. Several combinatorial approaches, such as the multifactor dimensionality reduction (MDR) method, have emerged as promising tools for better detecting gene-gene (G x G) and gene-environment (G x E) interactions. We recently developed a general combinatorial approach, namely the generalized multifactor dimensionality reduction (GMDR) method, which can entertain both qualitative and quantitative phenotypes and allows for both discrete and continuous covariates to detect G x G and G x E interactions in a sample of unrelated individuals. In this article, we report the development of an algorithm that can be used to study G x G and G x E interactions for family-based designs, called pedigree-based GMDR (PGMDR). Compared to the available method, our proposed method has several major improvements, including allowing for covariate adjustments and being applicable to arbitrary phenotypes, arbitrary pedigree structures, and arbitrary patterns of missing marker genotypes. Our Monte Carlo simulations provide evidence that the PGMDR method outperforms the MDR-pedigree disequilibrium test (PDT) in identifying epistatic loci. Finally, we applied our proposed approach to a genetic data set on tobacco dependence and found a significant interaction between two taste receptor genes (i.e., TAS2R16 and TAS2R38) in affecting nicotine dependence.
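The basic MDR step that GMDR and PGMDR build on can be sketched for a single pair of markers: pool individuals into genotype-combination cells, label each cell high or low risk by its case-to-control ratio, and score the resulting one-dimensional classifier. This is a minimal sketch of plain MDR only, assuming unrelated individuals, a binary phenotype, no covariates, and no cross-validation; it is not the GMDR or PGMDR algorithm, and the function names and toy data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def mdr_pair(genotypes, status, i, j):
    """Return (high_risk_cells, balanced_accuracy) for markers i and j.
    Assumes status holds 0 (control) or 1 (case) with both classes present."""
    n_cases = sum(status)
    n_controls = len(status) - n_cases
    counts = defaultdict(lambda: [0, 0])  # cell -> [controls, cases]
    for row, y in zip(genotypes, status):
        counts[(row[i], row[j])][y] += 1
    # A cell is high risk when its case:control ratio exceeds the overall ratio.
    high = {cell for cell, (ctrl, case) in counts.items()
            if case * n_controls > ctrl * n_cases}
    true_pos = sum(case for cell, (ctrl, case) in counts.items() if cell in high)
    true_neg = sum(ctrl for cell, (ctrl, case) in counts.items() if cell not in high)
    balanced_accuracy = 0.5 * (true_pos / n_cases + true_neg / n_controls)
    return high, balanced_accuracy

def best_pair(genotypes, status):
    """Exhaustively pick the marker pair with the highest balanced accuracy."""
    n_markers = len(genotypes[0])
    return max(combinations(range(n_markers), 2),
               key=lambda pair: mdr_pair(genotypes, status, *pair)[1])

# Toy data (hypothetical): status is the XOR of markers 0 and 1; marker 2 is noise.
toy_genotypes = [[a, b, c] for a in (0, 1) for b in (0, 1) for c in (0, 1)]
toy_status = [row[0] ^ row[1] for row in toy_genotypes]
high_cells, accuracy = mdr_pair(toy_genotypes, toy_status, 0, 1)
selected = best_pair(toy_genotypes, toy_status)
```

The XOR pattern in the toy data has no marginal effect at either marker, which is the kind of purely epistatic signal that combinatorial methods are designed to detect.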
12.
Lisa G. Shaffer Colleen K. Jackson-Cook Joanne M. Meyer Judith A. Brown J. Edward Spence 《Human genetics》1991,86(4):375-382
Summary The largest class of de novo chromosomal rearrangements in Down syndrome are rea(21q21q). Classically, these rearrangements have been termed Robertsonian translocations, implying an attachment of two different chromosome 21 homologues. Additionally, a Robertsonian translocation between two chromosomes 21 cannot be distinguished from an isochromosome composed of genetically identical arms by cytogenetic analyses. Therefore, we have used molecular techniques to differentiate between true Robertsonian translocations and isochromosomes. Samples were obtained from 12 probands, ascertained for de novo rearrangements between homologous chromosomes 21 [11 rea(21q21q) and 1 rea(21;21)(q22;q22)], their parents (n = 24) and available siblings (n = 7). The parental origins of the de novo rearrangements were assigned using molecular and cytogenetic analyses. Although not statistically significant, there was a two-fold increase in the number of paternally derived de novo rearrangements (n = 8) as compared with maternally derived rearrangements (n = 4). To distinguish between rob(21q21q) and i(21q), we used restriction fragment length polymorphisms (RFLPs) spanning the length of chromosome 21. Using all informative and partially informative RFLPs, we used the method of maximum likelihood to assign the most likely rearrangement definition (i or rob) and parental origin in each family. The maximum likelihood estimates indicated that all rearrangements tested (n = 8) were isochromosomes. C-banding revealed two centromeres in three cases indicating that a U-type exchange occurred between sister chromatids in these rearrangements. Our results suggest that the majority of de novo rea(21q21q) are isochromosomes derived from a single parental chromosome 21.
13.
MOTIVATION: Protein-Protein Interaction (PPI) networks are believed to be important sources of information related to biological processes and complex metabolic functions of the cell. The presence of biologically relevant functional modules in these networks has been theorized by many researchers. However, the application of traditional clustering algorithms for extracting these modules has not been successful, largely due to the presence of noisy false positive interactions as well as specific topological challenges in the network. RESULTS: In this article, we propose an ensemble clustering framework to address this problem. For base clustering, we introduce two topology-based distance metrics to counteract the effects of noise. We develop a PCA-based consensus clustering technique, designed to reduce the dimensionality of the consensus problem and yield informative clusters. We also develop a soft consensus clustering variant to assign multifaceted proteins to multiple functional groups. We conduct an empirical evaluation of different consensus techniques using topology-based, information theoretic and domain-specific validation metrics and show that our approaches can provide significant benefits over other state-of-the-art approaches. Our analysis of the consensus clusters obtained demonstrates that ensemble clustering can (a) produce improved biologically significant functional groupings; and (b) facilitate soft clustering by discovering multiple functional associations for proteins. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
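The general idea of combining several base clusterings into a consensus can be sketched with a co-association matrix. Note that this swaps in a simpler, well-known technique (evidence accumulation with a fixed threshold) in place of the PCA-based consensus method the abstract describes; the threshold value, the function names, and the toy labels are all hypothetical.

```python
def co_association(clusterings):
    """Fraction of base clusterings in which each pair of items shares a cluster."""
    n_items = len(clusterings[0])
    n_runs = len(clusterings)
    return [[sum(labels[i] == labels[j] for labels in clusterings) / n_runs
             for j in range(n_items)]
            for i in range(n_items)]

def consensus_clusters(clusterings, threshold=0.5):
    """Link items that co-cluster in more than `threshold` of the runs and
    return the connected components as consensus cluster labels."""
    matrix = co_association(clusterings)
    n_items = len(matrix)
    labels = [-1] * n_items
    current = 0
    for start in range(n_items):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n_items):
                if labels[j] == -1 and matrix[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Toy input (hypothetical): three base clusterings of five proteins.
base_clusterings = [
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0],
]
consensus = consensus_clusters(base_clusterings)
```

Protein 2 is placed with proteins 3 and 4 in only one of the three runs, so the majority vote keeps it with proteins 0 and 1. A soft variant could instead report the co-association values themselves as graded memberships.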
14.
Background
Protein residue-residue contact prediction is important for protein model generation and model evaluation. Here we develop a conformation ensemble approach to improve residue-residue contact prediction. We collect a number of structural models stemming from a variety of methods and implementations. The various models capture slightly different conformations and contain complementary information which can be pooled together to capture recurrent, and therefore more likely, residue-residue contacts.
Results
We applied our conformation ensemble approach to free modeling targets from both CASP8 and CASP9. Given a diverse ensemble of models, the method is able to achieve accuracies of 0.48 for the top L/5 medium range contacts and 0.36 for the top L/5 long range contacts for CASP8 targets (L being the target domain length). When applied to targets from CASP9, the accuracies of the top L/5 medium and long range contact predictions were 0.34 and 0.30 respectively.
Conclusions
When operating on a moderately diverse ensemble of models, the conformation ensemble approach is an effective means to identify medium and long range residue-residue contacts. An immediate benefit of the method is that when tied with a scoring scheme, it can be used to successfully rank models.
15.
Background
With the rapid advancement of array-based genotyping techniques, genome-wide association studies (GWAS) have successfully identified common genetic variants associated with common complex diseases. However, it has been shown that only a small proportion of the genetic etiology of complex diseases could be explained by the genetic factors identified from GWAS. This missing heritability could possibly be explained by gene-gene interaction (epistasis) and rare variants. There has been an exponential growth of gene-gene interaction analysis for common variants in terms of methodological developments and practical applications. Also, the recent advancement of high-throughput sequencing technologies makes it possible to conduct rare variant analysis. However, little progress has been made in gene-gene interaction analysis for rare variants.
Results
Here, we propose GxGrare, a new gene-gene interaction method for rare variants in the framework of multifactor dimensionality reduction (MDR) analysis. The proposed method consists of three steps: 1) collapsing the rare variants, 2) MDR analysis of the collapsed rare variants, and 3) detecting the top candidate interaction pairs. GxGrare can be used to detect not only gene-gene interactions but also interactions within a single gene. The proposed method is illustrated with whole exome sequencing data from 1080 samples of the Korean population in order to identify causal gene-gene interactions among rare variants for type 2 diabetes.
Conclusion
The proposed GxGrare performs well for gene-gene interaction detection with collapsing of rare variants. GxGrare is available at http://bibs.snu.ac.kr/software/gxgrare which contains simulation data and documentation. Supported operating systems include Linux and OS X.
16.
A novel approach for the identification of protein–protein interaction with integral membrane proteins
Protein–protein interaction plays a major role in all biological processes. The currently available genetic methods such as the two-hybrid system and the protein recruitment system are relatively limited in their ability to identify interactions with integral membrane proteins. Here we describe the development of a reverse Ras recruitment system (reverse RRS), in which the bait used encodes a membrane protein. The bait is expressed in its natural environment, the membrane, whereas the protein partner (the prey) is fused to a cytoplasmic Ras mutant. Protein–protein interaction between the proteins encoded by the prey and the bait results in Ras membrane translocation and activation of a viability pathway in yeast. We devised the expression of the bait and prey proteins under the control of dual distinct inducible promoters, thus enabling a rapid selection of transformants in which growth is attributed solely to specific protein–protein interaction. The reverse RRS approach greatly extends the usefulness of the protein recruitment systems and the use of integral membrane proteins as baits. The system serves as an attractive approach to explore novel protein–protein interactions with high specificity and selectivity, where other methods fail.
17.
Background
The rapid advance in large-scale SNP-chip technologies offers us great opportunities in elucidating the genetic basis of complex diseases. Methods for large-scale interaction analysis have been under development from several sources. Due to several difficult issues (e.g., sparseness of data in high dimensions and low replication or validation rate), development of fast, powerful and robust methods for detecting various forms of gene-gene interactions continues to be a challenging task.
Methodology/Principal Findings
In this article, we have developed an evolution-based method to search for genome-wide epistasis in a case-control design. From an evolutionary perspective, we take the view that human diseases originate from ancient mutations and consider that the underlying genetic variants play a role in differentiating the human population into the healthy and the diseased. Based on this concept, a traditional evolutionary measure, the fixation index (Fst) for two unlinked loci, which measures the genetic distance between populations, should be able to reveal the genetic interplays responsible for disease traits. To validate our proposal, we first investigated the theoretical distribution of Fst by using extensive simulations. Then, we explored its power for detecting gene-gene interactions via SNP markers, and compared it with the conventional Pearson Chi-square test, a mutual information based test and a linkage disequilibrium based test under several disease models. The proposed evolution-based method outperformed these methods in dominant and additive models, regardless of the disease allele frequencies. However, its performance was relatively poor in a recessive model. Finally, we applied the proposed evolution-based method to the analysis of a published dataset. Our results showed that the P value of the Fst-based statistic is smaller than those obtained by the LD-based statistic or Poisson regression models.
Conclusions/Significance
With rapidly growing large-scale genetic association studies, the proposed evolution-based method can be a promising tool in the identification of epistatic effects.
18.
Renuka C Pillutla Ku-chuan Hsiao Renee Brissette Paul S Eder Tony Giordano Paul W Fletcher Michael Lennick Arthur J Blume Neil I Goldstein 《BMC biotechnology》2001,1(1):6-9
Background
Modern drug discovery is concerned with identification and validation of novel protein targets from among the 30,000 genes or more postulated to be present in the human genome. While protein-protein interactions may be central to many disease indications, it has been difficult to identify new chemical entities capable of regulating these interactions as either agonists or antagonists.
19.
Association mapping and gene-gene interaction for stem rust resistance in CIMMYT spring wheat germplasm
Yu LX Lorenz A Rutkoski J Singh RP Bhavani S Huerta-Espino J Sorrells ME 《TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik》2011,123(8):1257-1268
The recent emergence of wheat stem rust Ug99 and the evolution of new races within the lineage threaten global wheat production because they overcome widely deployed stem rust resistance (Sr) genes that had been effective for many years. To identify loci conferring adult plant resistance to races of Ug99 in wheat, we employed an association mapping approach for 276 current spring wheat breeding lines from the International Maize and Wheat Improvement Center (CIMMYT). Breeding lines were genotyped with Diversity Array Technology (DArT) and microsatellite markers. Phenotypic data were collected on these lines for stem rust race Ug99 resistance at the adult plant stage in the stem rust resistance screening nursery in Njoro, Kenya in seasons 2008, 2009 and 2010. Fifteen marker loci were found to be significantly associated with stem rust resistance. Several markers appeared to be linked to known Sr genes, while other significant markers were located in chromosome regions where no Sr genes have been previously reported. Most of these new loci colocalized with QTLs identified recently in different biparental populations. Using the same data and Q + K covariate matrices, we investigated the interactions among marker loci using linear regression models to calculate P values for pairwise marker interactions. Resistance marker loci including the Sr2 locus on 3BS and the wPt1859 locus on 7DL had significant interaction effects with other loci in the same chromosome arm and with markers on chromosome 6B. Other resistance marker loci had significant pairwise interactions with markers on different chromosomes. Based on these results, we propose that a complex network of gene-gene interactions is, in part, responsible for resistance to Ug99. Further investigation may provide insight for understanding mechanisms that contribute to this resistance gene network.
20.
MOTIVATION: To understand cancer etiology, it is important to explore molecular changes in cellular processes from the normal state to the cancerous state. Because genes interact with each other during cellular processes, carcinogenesis-related genes may form differential co-expression patterns with other genes in different cell states. In this study, we develop a statistical method for identifying differential gene-gene co-expression patterns in different cell states. RESULTS: For efficient pattern recognition, we extend the traditional F-statistic and obtain an Expected Conditional F-statistic (ECF-statistic), which incorporates statistical information of location and correlation. We also propose a statistical method for data transformation. Our approach is applied to a microarray gene expression dataset from a prostate cancer study. For a gene of interest, our method can select other genes that have differential gene-gene co-expression patterns with this gene in different cell states. The 10 most frequently selected genes include hepsin, GSTP1 and AMACR, which have recently been proposed to be associated with prostate carcinogenesis. However, genes GSTP1 and AMACR cannot be identified by studying differential gene expression alone. By using tumor suppressor genes TP53, PTEN and RB1, we identify seven genes that also include hepsin, GSTP1 and AMACR. We show that genes associated with cancer may have differential gene-gene expression patterns with many other genes in different cell states. By discovering such patterns, we may be able to identify carcinogenesis-related genes.