Similar literature
 Found 20 similar articles; search took 15 ms
1.
2.
Local network alignment is an important component of the analysis of protein-protein interaction networks that may lead to the identification of evolutionarily related complexes. We present AlignNemo, a new algorithm that, given the networks of two organisms, uncovers subnetworks of proteins that relate in biological function and topology of interactions. The discovered conserved subnetworks have a general topology and need not correspond to specific interaction patterns, so that they more closely fit the models of functional complexes proposed in the literature. The algorithm is able to handle sparse interaction data with an expansion process that at each step explores the local topology of the networks beyond the proteins directly interacting with the current solution. To assess the performance of AlignNemo, we ran a series of benchmarks using statistical measures as well as biological knowledge. Based on reference datasets of protein complexes, AlignNemo shows better performance than other methods in terms of both precision and recall. We show our solutions to be biologically sound using the concept of semantic similarity applied to Gene Ontology vocabularies. The binaries of AlignNemo and supplementary details about the algorithms and the experiments are available at: sourceforge.net/p/alignnemo.

3.
4.
A novel database search algorithm is presented for the qualitative identification of proteins over a wide dynamic range, both in simple and complex biological samples. The algorithm has been designed for the analysis of data originating from data independent acquisitions, whereby multiple precursor ions are fragmented simultaneously. Measurements used by the algorithm include retention time, ion intensities, charge state, and accurate masses on both precursor and product ions from LC‐MS data. The search algorithm uses an iterative process whereby each iteration incrementally increases the selectivity, specificity, and sensitivity of the overall strategy. Increased specificity is obtained by utilizing a subset database search approach, whereby for each subsequent stage of the search, only those peptides from securely identified proteins are queried. Tentative peptide and protein identifications are ranked and scored by their relative correlation to a number of models of known and empirically derived physicochemical attributes of proteins and peptides. In addition, the algorithm utilizes decoy database techniques for automatically determining the false positive identification rates. The search algorithm has been tested by comparing the search results from a four‐protein mixture, the same four‐protein mixture spiked into a complex biological background, and a variety of other “system” type protein digest mixtures. The method was validated independently by data dependent methods, while concurrently relying on replication and selectivity. Comparisons were also performed with other commercially and publicly available peptide fragmentation search algorithms. The presented results demonstrate the ability to correctly identify peptides and proteins from data independent acquisition strategies with high sensitivity and specificity. 
They also illustrate a more comprehensive analysis of the samples studied, providing approximately 20% more protein identifications than a more conventional data directed approach using the same identification criteria, with a concurrent increase in both sequence coverage and the number of modified peptides.
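The decoy database technique mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common target-decoy estimate, in which the false positive rate at a score threshold is approximated by the ratio of decoy hits to target hits above that threshold. The abstract does not state which estimator the authors used, and the function name `decoy_fdr` and the scores below are hypothetical.

```python
def decoy_fdr(scored_hits, threshold):
    """Estimate the false discovery rate at a score threshold.

    scored_hits: iterable of (score, is_decoy) pairs (hypothetical input format).
    Returns (#decoy hits) / (#target hits) among hits scoring >= threshold,
    or 0.0 when no target hit passes the threshold.
    """
    targets = 0
    decoys = 0
    for score, is_decoy in scored_hits:
        if score >= threshold:
            if is_decoy:
                decoys += 1
            else:
                targets += 1
    if targets == 0:
        return 0.0
    return decoys / targets


# Made-up search results: (score, matched a decoy sequence?)
hits = [
    (9.1, False), (8.7, False), (8.2, True), (7.9, False),
    (7.5, False), (6.0, True), (5.5, False),
]

for cutoff in (8.5, 7.0):
    print(cutoff, decoy_fdr(hits, cutoff))
```

Raising the threshold trades identifications for a lower estimated error rate, which is the trade-off an iterative search of this kind has to manage at each stage.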

5.
Structural characterization of protein-protein interactions is essential for our ability to study life processes at the molecular level. Computational modeling of protein complexes (protein docking) is important as the source of their structure and as a way to understand the principles of protein interaction. Rapidly evolving comparative docking approaches utilize target/template similarity metrics, which are often based on the protein structure. Although structural similarity generally yields good performance, other characteristics of the interacting proteins (e.g., function, biological process, and localization) may improve the prediction quality, especially in the case of weak target/template structural similarity. For the ranking of a pool of models for each target, we tested scoring functions that quantify similarity of Gene Ontology (GO) terms assigned to target and template proteins in three ontology domains: biological process, molecular function, and cellular component (GO-score). The scoring functions were tested in docking of bound, unbound, and modeled proteins. The results indicate that the combined structural and GO-terms functions improve the scoring, especially in the twilight zone of structural similarity, typical for protein models of limited accuracy.

6.
Positive feedback loops are common regulatory elements in metabolic and protein signalling pathways. The length of such feedback loops determines stability and sensitivity to network perturbations. Here we provide a mathematical analysis of arbitrary length positive feedback loops with protein production and degradation. These loops serve as an abstraction of typical regulation patterns in protein signalling pathways. We first perform a steady state analysis and, independently of the chain length, identify exactly two steady states that represent either biological activity or inactivity. We thereby provide two formulas for the steady state protein concentrations as a function of feedback length, strength of feedback, as well as protein production and degradation rates. Using a control theory approach, analysing the frequency response of the linearisation of the system and exploiting the Small Gain Theorem, we provide conditions for local stability for both steady states. Our results demonstrate that, under some parameter relationships, once a biologically meaningful "on" steady state arises, it is stable, while the "off" steady state, where all proteins are inactive, becomes unstable. We apply our results to a three-tier feedback of caspase activation in apoptosis and demonstrate how an intermediary protein in such a loop may be used as a signal amplifier within the cascade. Our results provide a rigorous mathematical analysis of positive feedback chains of arbitrary length, thereby relating pathway structure and stability.

7.
Translation is the final stage of gene expression where messenger RNA is used as a template for protein polymerization from appropriate amino acids. Release of the completed protein requires a release factor protein acting at the termination/stop codon to liberate it. In this paper we focus on a complex feedback control mechanism involved in the translation and synthesis of release factor proteins, which has been observed in different systems. These release factor proteins are involved in the termination stage of their own translation. Further, mutations in the release factor gene can result in a premature stop codon. In this case translation can result either in early termination and the production of a truncated protein or readthrough of the premature stop codon and production of the complete release factor protein. Thus during translation of the release factor mRNA containing a premature stop codon, the full length protein negatively regulates its production by its action on a premature stop codon, while positively regulating its production by its action on the regular stop codon. This paper develops a mathematical modelling framework to investigate this complex feedback control system involved in translation. A series of models is established to carefully investigate the role of individual mechanisms and how they work together. The steady state and dynamic behaviour of the resulting models are examined both analytically and numerically.  相似文献   

8.
Summary In this article we argue that an organismic perspective in character identification can alleviate a structural deficiency of mathematical models in biology relative to the ones in the physical sciences. The problem with many biological theories is that they do not contain the conditions of their validity or a method of identifying objects that are appropriate instances of the models. Here functionally important biological characters are introduced as conceptual abstractions derived within the context of an ontologically prior object, such as a cell or an organism. To illustrate this approach, we present an analytical method of character decomposition based on the notion of the quasi-independence of traits. Two cases are analyzed: context dependent units of inheritance and a model of character identification in adaptive evolution. We demonstrate that in each case the biological process as represented by a mathematical theory entails the conditions for the individualization of characters. Our approach also requires a conceptual re-orientation in the way we build biological models. Rather than defining a set of biological characters a priori, functionally relevant characters are identified in the context of a higher level biological process.  相似文献   

9.
A mathematical model has been developed to describe the growth and infection of insect cells by recombinant baculoviruses. The model parameters were determined from a series of independent experiments involving batch suspension culture. The profiles generated by the model for cell growth, virus production and protein production agree with those observed in experiments. Presently, the model simulates only systems where cells are not growth-limited. The model is useful in aiding the design and optimization of large-scale systems for production of biological insecticides as well as recombinant proteins and in delineating those areas which are limiting the process and require further, more fundamental, investigation.

10.
11.
Rho S, You S, Kim Y, Hwang D. BMB Reports, 2008, 41(3): 184-193
Living organisms are comprised of various systems at different levels, i.e., organs, tissues, and cells. Each system carries out its diverse functions in response to environmental and genetic perturbations, by utilizing biological networks, in which nodal components, such as, DNA, mRNAs, proteins, and metabolites, closely interact with each other. Systems biology investigates such systems by producing comprehensive global data that represent different levels of biological information, i.e., at the DNA, mRNA, protein, or metabolite levels, and by integrating this data into network models that generate coherent hypotheses for given biological situations. This review presents a systems biology framework, called the 'Integrative Proteomics Data Analysis Pipeline' (IPDAP), which generates mechanistic hypotheses from network models reconstructed by integrating diverse types of proteomic data generated by mass spectrometry-based proteomic analyses. The devised framework includes a serial set of computational and network analysis tools. Here, we demonstrate its functionalities by applying these tools to several conceptual examples.  相似文献   

12.
The enzyme cellulase, a multienzyme complex made up of several proteins, catalyzes the conversion of cellulose to glucose in an enzymatic hydrolysis-based biomass-to-ethanol process. Production of cellulase enzyme proteins in large quantities using the fungus Trichoderma reesei requires understanding the dynamics of growth and enzyme production. The method of neural network parameter function modeling, which combines the approximation capabilities of neural networks with fundamental process knowledge, is utilized to develop a mathematical model of this dynamic system. In addition, kinetic models are also developed. Laboratory data from bench-scale fermentations involving growth and protein production by T. reesei on lactose and xylose are used to estimate the parameters in these models. The relative performances of the various models and the results of optimizing these models on two different performance measures are presented. An approximately 33% lower root-mean-squared error (RMSE) in protein predictions and about 40% lower total RMSE is obtained with the neural network-based model as opposed to kinetic models. Using the neural network-based model, the RMSE in predicting optimal conditions for two performance indices, is about 67% and 40% lower, respectively, when compared with the kinetic models. Thus, both model predictions and optimization results from the neural network-based model are found to be closer to the experimental data than the kinetic models developed in this work. It is shown that the neural network parameter function modeling method can be useful as a "macromodeling" technique to rapidly develop dynamic models of a process.  相似文献   

13.
This paper proposes a model for the expected probability distribution for a certain class of biological structures. In particular, a model is derived for the distribution of lengths of helices, sheets, turns, and coils as a function of the length of the structure divided by the length of the protein it is contained in. A fit between the derived lognormal function and the structures for some proteins whose three-dimensional structure is known was significant. The fit produces fundamental parameters particular to each structure type that are related to the underlying structure and its morphogenesis. The importance of the result is that a universal mathematical distribution can be used to explain certain protein morphogeneses. Also, these fundamental parameters can be used as an aid in predicting whether a given sequence is a particular secondary structure or not, without a knowledge of its three-dimensional structure.
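As a rough illustration of fitting a lognormal function to relative structure lengths, the sketch below assumes the standard two-parameter lognormal density and estimates its parameters from the mean and standard deviation of the log-transformed data. The abstract does not give the authors' exact parametrization or fitting procedure, so the function names and the sample values here are hypothetical.

```python
import math
import statistics


def fit_lognormal(relative_lengths):
    """Return (mu, sigma) of a lognormal fitted by moments of the logs.

    relative_lengths: structure length divided by protein length, each > 0.
    """
    logs = [math.log(x) for x in relative_lengths]
    mu = statistics.fmean(logs)
    sigma = statistics.pstdev(logs)  # population standard deviation of the logs
    return mu, sigma


def lognormal_pdf(x, mu, sigma):
    """Standard lognormal density evaluated at x > 0."""
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


# Made-up helix lengths (in residues) within a hypothetical 150-residue protein.
protein_length = 150
helix_lengths = [6, 9, 12, 12, 15, 21, 30]
ratios = [length / protein_length for length in helix_lengths]

mu, sigma = fit_lognormal(ratios)
print("mu =", mu, "sigma =", sigma)
print("density at ratio 0.08:", lognormal_pdf(0.08, mu, sigma))
```

Under this reading, one (mu, sigma) pair would be fitted per structure type (helix, sheet, turn, coil), and the density of a candidate segment's relative length under each fit could serve as one piece of evidence for its structure type.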

14.
A new method has been developed to compute the probability that each amino acid in a protein sequence is in a particular secondary structural element. Each of these probabilities is computed using the entire sequence and a set of predefined structural class models. This set of structural classes is patterned after Jane Richardson's taxonomy for the domains of globular proteins. For each structural class considered, a mathematical model is constructed to represent constraints on the pattern of secondary structural elements characteristic of that class. These are stochastic models having discrete state spaces (referred to as hidden Markov models by researchers in signal processing and automatic speech recognition). Each model is a mathematical generator of amino acid sequences; the sequence under consideration is modeled as having been generated by one model in the set of candidates. The probability that each model generated the given sequence is computed using a filtering algorithm. The protein is then classified as belonging to the structural class having the most probable model. The secondary structure of the sequence is then analyzed using a "smoothing" algorithm that is optimal for that structural class model. For each residue position in the sequence, the smoother computes the probability that the residue is contained within each of the defined secondary structural elements of the model. This method has two important advantages: (1) the probability of each residue being in each of the modeled secondary structural elements is computed using the totality of the amino acid sequence, and (2) these probabilities are consistent with prior knowledge of realizable domain folds as encoded in each model. As an example of the method's utility, we present its application to flavodoxin, a prototypical alpha/beta protein having a central beta-sheet, and to thioredoxin, which belongs to a similar structural class but shares no significant sequence similarity.
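The classification step described in this abstract (score the sequence under each structural class model, then pick the most probable model) can be sketched with the textbook forward recursion for hidden Markov models. This is only an illustration: the two toy models, their two states, the two-letter alphabet, and every probability below are made up, and the per-residue smoothing step is not shown.

```python
def forward_likelihood(obs, start, trans, emit):
    """Return P(obs | model) using the forward (filtering) recursion."""
    states = list(start)
    # Initialise with the start distribution and the first emission.
    alpha = {s: start[s] * emit[s][obs[0]] for s in states}
    for symbol in obs[1:]:
        # Propagate through the transitions, then weight by the emission.
        alpha = {
            s: emit[s][symbol] * sum(alpha[r] * trans[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())


def classify(obs, models, priors):
    """Return (most probable model name, posterior over models) via Bayes' rule."""
    joint = {
        name: priors[name] * forward_likelihood(obs, *models[name])
        for name in models
    }
    total = sum(joint.values())
    posterior = {name: value / total for name, value in joint.items()}
    return max(posterior, key=posterior.get), posterior


# Hypothetical emission table shared by both toy models: state H favours
# residue "A", state E favours residue "V".
emit = {"H": {"A": 0.7, "V": 0.3}, "E": {"A": 0.2, "V": 0.8}}

helix_rich = (
    {"H": 0.8, "E": 0.2},
    {"H": {"H": 0.9, "E": 0.1}, "E": {"H": 0.5, "E": 0.5}},
    emit,
)
strand_rich = (
    {"H": 0.2, "E": 0.8},
    {"H": {"H": 0.5, "E": 0.5}, "E": {"H": 0.1, "E": 0.9}},
    emit,
)

models = {"helix_rich": helix_rich, "strand_rich": strand_rich}
priors = {"helix_rich": 0.5, "strand_rich": 0.5}

label, posterior = classify("AAAA", models, priors)
print(label, posterior)
```

For real protein sequences the products of probabilities underflow quickly, so a practical version would rescale `alpha` at each step or work in log space.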

15.
In this work we propose a model that simultaneously optimizes the process variables and the structure of a multiproduct batch plant for the production of recombinant proteins. The complete model includes process performance models for the unit stages and a posynomial representation for the multiproduct batch plant. Although the constant time and size factor models are the most commonly used to model multiproduct batch processes, process performance models describe these time and size factors as functions of the process variables selected for optimization. These process performance models are expressed as algebraic equations obtained from the analytical integration of simplified mass balances and kinetic expressions that describe each unit operation. They are kept as simple as possible while retaining the influence of the process variables selected to optimize the plant. The resulting mixed-integer nonlinear program simultaneously calculates the plant structure (parallel units in or out of phase, and allocation of intermediate storage tanks), the batch plant decision variables (equipment sizes, batch sizes, and operating times of semicontinuous items), and the process decision variables (e.g., final concentration at selected stages, volumetric ratio of phases in the liquid-liquid extraction). A noteworthy feature of the proposed approach is that the mathematical model for the plant is the same as that used in the constant factor model. The process performance models are handled as extra constraints. A plant consisting of eight stages operating in the single product campaign mode (one fermentation, two microfiltrations, two ultrafiltrations, one homogenization, one liquid-liquid extraction, and one chromatography) for producing four different recombinant proteins by the genetically engineered yeast Saccharomyces cerevisiae was modeled and optimized. 
Using this example, it is shown that the presence of additional degrees of freedom introduced by the process performance models, with respect to a fixed size and time factor model, represents an important development in improving plant design.

16.
With the increasing flow of biological data there is a growing demand for mathematical tools whereby essential aspects of complex causal dynamic models can be captured and detected by simpler mathematical models without sacrificing too much of the realism provided by the original ones. Given the presence of a time scale hierarchy, singular perturbation techniques represent an elegant method for making such minimised mathematical representations. Any reduction of a complex model by singular perturbation methods is a targeted reduction by the fact that one has to pick certain mechanisms, processes or aspects thought to be essential in a given explanatory context. Here we illustrate how such a targeted reduction of a complex model of melanogenesis in mammals recently developed by the authors provides a way to improve the understanding of how the melanogenic system may behave in a switch-like manner between production of the two major types of melanins. The reduced model is shown by numerical means to be in good quantitative agreement with the original model. Furthermore, it is shown how the reduced model discloses hidden robustness features of the full model, and how the making of a reduced model represents an efficient analytical sensitivity analysis. In addition to yielding new insights concerning the melanogenic system, the paper provides an illustration of a protocol that could be followed to make validated simplifications of complex biological models possessing time scale hierarchies.  相似文献   

17.
18.
19.
We analyze a basic building block of gene regulatory networks using a stochastic/geometric model in search of a mathematical backing for the discrete modeling frameworks. We consider a network consisting only of two interacting genes: a source gene and a target gene. The target gene is activated by the proteins encoded by the source gene. The interaction is therefore mediated by activator proteins that travel, like a signal, from the source to the target. We calculate the production curve of the target proteins in response to a constant-rate production of activator proteins. The latter has a sigmoidal shape (like a simple delay line) that is sharper and taller when the two genes are closer to each other. This provides further support for the use of discrete models in the analysis of gene regulatory networks. Moreover, it suggests an evolutionary pressure towards making the interacting genes closer to each other to make their interactions more efficient and more reliable.

20.
In this paper, we report an experimental setup and mathematical algorithm for determination of relative protein abundance from directly labeled native protein samples applied to an array of antibodies. The application of the proposed experimental system compensates internally at each array element for a number of deficiencies in array experiments such as differential labeling efficiency in dual color assay systems, differential solubility of protein molecules in dual color assay systems, and differential affinity of capture reagents toward proteins labeled with two different fluorescent dyes. This system offers full compensation for variable amounts of capture reagents on separate array structures, as well as limited compensation for nonspecific interactions between capture reagents and analytes. The proposed experimental strategy enables the use of a large number of capture reagents to develop a true multiplex analysis system that will yield complete relative protein abundance information in two biological systems.
