首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 0 毫秒
1.
MOTIVATION: Side-chain positioning is a central component of homology modeling and protein design. In a common formulation of the problem, the backbone is fixed, side-chain conformations come from a rotamer library, and a pairwise energy function is optimized. It is NP-complete to find even a reasonable approximate solution to this problem. We seek to put this hardness result into practical context. RESULTS: We present an integer linear programming (ILP) formulation of side-chain positioning that allows us to tackle large problem sizes. We relax the integrality constraint to give a polynomial-time linear programming (LP) heuristic. We apply LP to position side chains on native and homologous backbones and to choose side chains for protein design. Surprisingly, when positioning side chains on native and homologous backbones, optimal solutions using a simple, biologically relevant energy function can usually be found using LP. On the other hand, the design problem often cannot be solved using LP directly; however, optimal solutions for large instances can still be found using the computationally more expensive ILP procedure. While different energy functions also affect the difficulty of the problem, the LP/ILP approach is able to find optimal solutions. Our analysis is the first large-scale demonstration that LP-based approaches are highly effective in finding optimal (and successive near-optimal) solutions for the side-chain positioning problem.  相似文献   

2.
Reconstruction of phylogenetic trees is a fundamental problem in computational biology. While excellent heuristic methods are available for many variants of this problem, new advances in phylogeny inference will be required if we are to be able to continue to make effective use of the rapidly growing stores of variation data now being gathered. In this paper, we present two integer linear programming (ILP) formulations to find the most parsimonious phylogenetic tree from a set of binary variation data. One method uses a flow-based formulation that can produce exponential numbers of variables and constraints in the worst case. The method has, however, proven extremely efficient in practice on datasets that are well beyond the reach of the available provably efficient methods, solving several large mtDNA and Y-chromosome instances within a few seconds and giving provably optimal results in times competitive with fast heuristics than cannot guarantee optimality. An alternative formulation establishes that the problem can be solved with a polynomial-sized ILP. We further present a web server developed based on the exponential-sized ILP that performs fast maximum parsimony inferences and serves as a front end to a database of precomputed phylogenies spanning the human genome.  相似文献   

3.
Signal transduction is an important process that transmits signals from the outside of a cell to the inside to mediate sophisticated biological responses. Effective computational models to unravel such a process by taking advantage of high-throughput genomic and proteomic data are needed to understand the essential mechanisms underlying the signaling pathways. In this article, we propose a novel method for uncovering signal transduction networks (STNs) by integrating protein interaction with gene expression data. Specifically, we formulate STN identification problem as an integer linear programming (ILP) model, which can be actually solved by a relaxed linear programming algorithm and is flexible for handling various prior information without any restriction on the network structures. The numerical results on yeast MAPK signaling pathways demonstrate that the proposed ILP model is able to uncover STNs or pathways in an efficient and accurate manner. In particular, the prediction results are found to be in high agreement with current biological knowledge and available information in literature. In addition, the proposed model is simple to be interpreted and easy to be implemented even for a large-scale system.  相似文献   

4.
We study the problem of reconstructing haplotype configurations from genotypes on pedigree data with missing alleles under the Mendelian law of inheritance and the minimum-recombination principle, which is important for the construction of haplotype maps and genetic linkage/association analyses. Our previous results show that the problem of finding a minimum-recombinant haplotype configuration (MRHC) is in general NP-hard. This paper presents an effective integer linear programming (ILP) formulation of the MRHC problem with missing data and a branch-and-bound strategy that utilizes a partial order relationship and some other special relationships among variables to decide the branching order. Nontrivial lower and upper bounds on the optimal number of recombinants are introduced at each branching node to effectively prune the search tree. When multiple solutions exist, a best haplotype configuration is selected based on a maximum likelihood approach. The paper also shows for the first time how to incorporate marker interval distance into a rule-based haplotyping algorithm. Our results on simulated data show that the algorithm could recover haplotypes with 50 loci from a pedigree of size 29 in seconds on a Pentium IV computer. Its accuracy is more than 99.8% for data with no missing alleles and 98.3% for data with 20% missing alleles in terms of correctly recovered phase information at each marker locus. A comparison with a statistical approach SimWalk2 on simulated data shows that the ILP algorithm runs much faster than SimWalk2 and reports better or comparable haplotypes on average than the first and second runs of SimWalk2. As an application of the algorithm to real data, we present some test results on reconstructing haplotypes from a genome-scale SNP dataset consisting of 12 pedigrees that have 0.8% to 14.5% missing alleles.  相似文献   

5.

Background

Traditional drug discovery methods focused on the efficacy of drugs rather than their toxicity. However, toxicity and/or lack of efficacy are produced when unintended targets are affected in metabolic networks. Thus, identification of biological targets which can be manipulated to produce the desired effect with minimum side-effects has become an important and challenging topic. Efficient computational methods are required to identify the drug targets while incurring minimal side-effects.

Results

In this paper, we propose a graph-based computational damage model that summarizes the impact of enzymes on compounds in metabolic networks. An efficient method based on Integer Linear Programming formalism is then developed to identify the optimal enzyme-combination so as to minimize the side-effects. The identified target enzymes for known successful drugs are then verified by comparing the results with those in the existing literature.

Conclusions

Side-effects reduction plays a crucial role in the study of drug development. A graph-based computational damage model is proposed and the theoretical analysis states the captured problem is NP-completeness. The proposed approaches can therefore contribute to the discovery of drug targets. Our developed software is available at “http://hkumath.hku.hk/~wkc/APBC2018-metabolic-network.zip”.
  相似文献   

6.
Deregulation of cell signaling pathways plays a crucial role in the development of tumors. The identification of such pathways requires effective analysis tools that facilitate the interpretation of expression differences. Here, we present a novel and highly efficient method for identifying deregulated subnetworks in a regulatory network. Given a score for each node that measures the degree of deregulation of the corresponding gene or protein, the algorithm computes the heaviest connected subnetwork of a specified size reachable from a designated root node. This root node can be interpreted as a molecular key player responsible for the observed deregulation. To demonstrate the potential of our approach, we analyzed three gene expression data sets. In one scenario, we compared expression profiles of non-malignant primary mammary epithelial cells derived from BRCA1 mutation carriers and of epithelial cells without BRCA1 mutation. Our results suggest that oxidative stress plays an important role in epithelial cells of BRCA1 mutation carriers and that the activation of stress proteins may result in avoidance of apoptosis leading to an increased overall survival of cells with genetic alterations. In summary, our approach opens new avenues for the elucidation of pathogenic mechanisms and for the detection of molecular key players.  相似文献   

7.
Lin CH  Cheng HP  Yang CB  Yang CN 《Bio Systems》2007,90(1):242-252
An algorithm based on a modified sticker model accompanied with an advanced MEMS-based microarray technology is demonstrated to solve SAT problem, which has long served as a benchmark in DNA computing. Unlike conventional DNA computing algorithms needing an initial data pool to cover correct and incorrect answers and further executing a series of separation procedures to destroy the unwanted ones, we built solutions in parts to satisfy one clause in one step, and eventually solve the entire Boolean formula through steps. No time-consuming sample preparation procedures and delicate sample applying equipment were required for the computing process. Moreover, experimental results show the bound DNA sequences can sustain the chemical solutions during computing processes such that the proposed method shall be useful in dealing with large-scale problems.  相似文献   

8.
Alignment of molecular networks by integer quadratic programming   总被引:3,自引:0,他引:3  
MOTIVATION: With more and more data on molecular networks (e.g. protein interaction networks, gene regulatory networks and metabolic networks) available, the discovery of conserved patterns or signaling pathways by comparing various kinds of networks among different species or within a species becomes an increasingly important problem. However, most of the conventional approaches either restrict comparative analysis to special structures, such as pathways, or adopt heuristic algorithms due to computational burden. RESULTS: In this article, to find the conserved substructures, we develop an efficient algorithm for aligning molecular networks based on both molecule similarity and architecture similarity, by using integer quadratic programming (IQP). Such an IQP can be relaxed into the corresponding quadratic programming (QP) which almost always ensures an integer solution, thereby making molecular network alignment tractable without any approximation. The proposed framework is very flexible and can be applied to many kinds of molecular networks including weighted and unweighted, directed and undirected networks with or without loops. AVAILABILITY: Matlab code and data are available from http://zhangroup.aporc.org/bioinfo/MNAligner or http://intelligent.eic.osaka-sandai.ac.jp/chenen/software/MNAligner, or upon request from authors. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.  相似文献   

9.
In the conservation literature, heuristic procedures have been employed to incorporate spatial considerations in reserve network selection with the presumption that computationally convenient optimization models would be too difficult or impossible to formulate. This paper extends the standard set-covering formulation to incorporate a particular spatial selection criterion, namely reducing the reserve boundary to the extent possible, when selecting a reserve network that represents a set of target species at least once. Applying the model to a dataset on the occurrence of breeding birds in Berkshire, UK, demonstrated that the technique resulted in significant reductions in reserve boundary length relative to solutions produced by the standard set-covering formulation. Computational results showed that moderately large reserve network selection problems could be solved without issue. Alternative solutions may be produced to explore trade-offs between boundary length, number of sites required or alternative criteria.  相似文献   

10.
对一般的0—1整数规划问题提出了一种半自动化的DNA计算模型。首先产生所给定的0—1整数规划问题的所有可能解,然后设置对应于0—1整数规划问题的约束不等式的探针,利用这些探针设计半自动化装置对所有可能解进行自动分离,最终找出0—1整数规划问题的解。该模型的最大优点在于具有自动化的特点;同时,从理论上来讲,该模型适合含有任意变量的任意0—1整数规划问题的求解。  相似文献   

11.
In automated production systems like flexible manufacturing systems (FMSs), an important issue is to find an adequate workload for each machine for each time period. Many integer linear programming (ILP) models have been proposed to solve the FMS loading problems, but not all of them take tools into account. Those that do not consider tooling are quite unrealistic, especially when setup times are important with respect to processing times. When tool loading has to be handled by the model, the load assignment may have to be changed completely. In this article we consider FMSs with a tool management of the following type: the system works in time periods whose durations are fixed or not; and tools are loaded on the machines at the beginning of each time period and stay there for the whole time period. Tool changes may occur only at the end of each time period when the system is stopped. We present some integer programming models for handling these situations with several types of objectives. Emphasis is laid on the ILP formulations. Computational complexities are discussed.  相似文献   

12.
Xiao D  Li W  Zhang Z  He L 《Bio Systems》2005,82(3):203-207
In this paper, we consider a procedure for solving maximum cut problems in the Adleman-Lipton model. The procedure works in O(n2) steps for maximum cut problems of an undirected graph with n vertices.  相似文献   

13.
A new method is proposed for the optimization of biochemical systems. The method, based on the separation of the stoichiometric and kinetic aspects of the system, follows the general approach used in the previously presented indirect optimization method (IOM) developed within biochemical systems theory. It is called GMA-IOM because it makes use of the generalized mass action (GMA) as the model system representation form. The GMA representation avoids flux aggregation and thus prevents possible stoichiometric errors. The optimization of a system is used to illustrate and compare the features, advantages and shortcomings of both versions of the IOM method as a general strategy for designing improved microbial strains of biotechnological interest. Special attention has been paid to practical problems for the actual implementation of the new proposed strategy, such as the total protein content of the engineered strain or the deviation from the original steady state and its influence on cell viability.  相似文献   

14.
This paper presents a novel scheme for the use of linear programming to calculate muscle contraction forces in models describing musculoskeletal system biomechanics. Models of this kind are frequently found in the biomechanics literature. In most cases they involve muscle contraction force calculations that are statically indeterminate, and hence use optimization techniques to make those calculations. We present a linear programming optimization technique that solves a two-objective problem with two sequential linear programs. We use the technique here to minimize muscle intensity and joint compression force, since those are commonly used objectives. The two linear program model has the advantages of low computation cost, ready implementation on a micro-computer, and stable solutions. We show how to solve the model analytically in simple cases. We also discuss the use of the dual problem of linear programming to gain understanding of the solution it provides.  相似文献   

15.
16.
In this article, we present COMSAT, a hybrid framework for residue contact prediction of transmembrane (TM) proteins, integrating a support vector machine (SVM) method and a mixed integer linear programming (MILP) method. COMSAT consists of two modules: COMSAT_SVM which is trained mainly on position–specific scoring matrix features, and COMSAT_MILP which is an ab initio method based on optimization models. Contacts predicted by the SVM model are ranked by SVM confidence scores, and a threshold is trained to improve the reliability of the predicted contacts. For TM proteins with no contacts above the threshold, COMSAT_MILP is used. The proposed hybrid contact prediction scheme was tested on two independent TM protein sets based on the contact definition of 14 Å between Cα‐Cα atoms. First, using a rigorous leave‐one‐protein‐out cross validation on the training set of 90 TM proteins, an accuracy of 66.8%, a coverage of 12.3%, a specificity of 99.3% and a Matthews' correlation coefficient (MCC) of 0.184 were obtained for residue pairs that are at least six amino acids apart. Second, when tested on a test set of 87 TM proteins, the proposed method showed a prediction accuracy of 64.5%, a coverage of 5.3%, a specificity of 99.4% and a MCC of 0.106. COMSAT shows satisfactory results when compared with 12 other state‐of‐the‐art predictors, and is more robust in terms of prediction accuracy as the length and complexity of TM protein increase. COMSAT is freely accessible at http://hpcc.siat.ac.cn/COMSAT/ . Proteins 2016; 84:332–348. © 2016 Wiley Periodicals, Inc.  相似文献   

17.
Maximum likelihood paired comparison ranking by linear programming   总被引:2,自引:0,他引:2  
DECANI  JOHN S. 《Biometrika》1969,56(3):537-545
  相似文献   

18.
This paper presents a novel linear programming approach to do protein 3-dimensional (3D) structure prediction via threading. Based on the contact map graph of the protein 3D structure template, the protein threading problem is formulated as a large scale integer programming (IP) problem. The IP formulation is then relaxed to a linear programming (LP) problem, and then solved by the canonical branch-and-bound method. The final solution is globally optimal with respect to energy functions. In particular, our energy function includes pairwise interaction preferences and allowing variable gaps which are two key factors in making the protein threading problem NP-hard. A surprising result is that, most of the time, the relaxed linear programs generate integral solutions directly. Our algorithm has been implemented as a software package RAPTOR-RApid Protein Threading by Operation Research technique. Large scale benchmark test for fold recognition shows that RAPTOR significantly outperforms other programs at the fold similarity level. The CAFASP3 evaluation, a blind and public test by the protein structure prediction community, ranks RAPTOR as top 1, among individual prediction servers, in terms of the recognition capability and alignment accuracy for Fold Recognition (FR) family targets. RAPTOR also performs very well in recognizing the hard Homology Modeling (HM) targets. RAPTOR was implemented at the University of Waterloo and it can be accessed at http://www.cs.uwaterloo.ca/~j3xu/RAPTOR_form.htm.  相似文献   

19.
This paper addresses an extension of the resource-constrained project scheduling problem that takes into account storage resources which may be produced or consumed by activities. To solve this problem, we propose the generalization of two existing mixed integer linear programming models for the classical resource-constrained project scheduling problem, as well as one novel formulation based on the concept of event. Computational results are reported to compare these formulations with each other, as well as with a reference method from the literature. Conclusions are drawn on the merits and drawbacks of each model according to the instance characteristics.  相似文献   

20.
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号