Similar Literature
20 similar documents found.
1.
By rearranging naturally occurring genetic components, gene networks can be created that display novel functions. When designing these networks, the kinetic parameters describing DNA/protein binding are of great importance, as they strongly influence the behavior of the resulting network. This article presents an optimization method based on simulated annealing to locate combinations of kinetic parameters that produce a desired behavior in a genetic network. Since gene expression is an inherently stochastic process, the simulation component of the optimization uses an accurate multiscale simulation algorithm to calculate an ensemble of network trajectories at each iteration of the simulated annealing algorithm. Using the three-gene repressilator of Elowitz and Leibler as an example, we show that gene network optimizations can be conducted using a mechanistically realistic model integrated stochastically. The repressilator is optimized to give oscillations of an arbitrary specified period. These optimized designs may then provide a starting point for the selection of genetic components needed to realize an in vivo system.
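A minimal sketch of the simulated-annealing loop the abstract describes. The cost function and all constants below are invented stand-ins: the paper evaluates an ensemble of stochastic network trajectories, not a closed-form period.

```python
import math
import random

def simulated_anneal(cost, x0, step=0.1, t0=100.0, cooling=0.995, iters=2000, seed=0):
    """Metropolis-style simulated annealing with geometric cooling (toy sketch)."""
    rng = random.Random(seed)
    x, fx = x0, cost(x0)
    t = t0
    for _ in range(iters):
        cand = x + rng.uniform(-step, step)
        fc = cost(cand)
        # Always accept improvements; accept uphill moves with
        # Boltzmann probability exp(-(fc - fx) / t).
        if fc < fx or rng.random() < math.exp((fx - fc) / t):
            x, fx = cand, fc
        t *= cooling
    return x, fx

# Hypothetical stand-in: tune a rate constant k so that a (toy) period of
# 100 / k matches a desired oscillation period of 25 time units.
target = 25.0
cost = lambda k: (100.0 / k - target) ** 2
k_opt, err = simulated_anneal(cost, x0=1.0)
```

In the paper's setting, `cost` would instead compare the desired period against the period estimated from an ensemble of stochastic simulations of the repressilator.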

2.
SUMMARY: In recent years, the problem of finding optimal drug or vaccine protocols has been tackled using systems biology modeling. These approaches are usually computationally expensive. Our previous experience in optimizing vaccine or drug protocols with genetic algorithms required a high-performance computing infrastructure for a couple of days. In the present article we show that, with an appropriate use of a different optimization algorithm, simulated annealing, we have been able to reduce the computational effort by a factor of 10^2. The new algorithm requires a computational effort achievable on current-generation personal computers. AVAILABILITY: Software and additional data can be found at http://www.immunomics.eu/SA/

3.
In modern genetic epidemiology studies, the association between the disease and a genomic region, such as a candidate gene, is often investigated using multiple SNPs. We propose a multilocus test of genetic association that can account for genetic effects that might be modified by variants in other genes or by environmental factors. We consider use of the venerable and parsimonious Tukey's 1-degree-of-freedom model of interaction, which is natural when individual SNPs within a gene are associated with disease through a common biological mechanism; in contrast, many standard regression models are designed as if each SNP has unique functional significance. On the basis of Tukey's model, we propose a novel but computationally simple generalized test of association that can simultaneously capture both the main effects of the variants within a genomic region and their interactions with the variants in another region or with an environmental exposure. We compared performance of our method with that of two standard tests of association, one ignoring gene-gene/gene-environment interactions and the other based on a saturated model of interactions. We demonstrate major power advantages of our method both in analysis of data from a case-control study of the association between colorectal adenoma and DNA variants in the NAT2 genomic region, which are well known to be related to a common biological phenotype, and under different models of gene-gene interactions with use of simulated data.
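As a sketch of the model class the abstract refers to (the notation is mine; the paper's exact parameterisation may differ), Tukey's 1-degree-of-freedom interaction structure for SNPs $x_1,\dots,x_p$ in one region and variants or exposures $z_1,\dots,z_q$ in another can be written as:

```latex
\operatorname{logit}\Pr(D=1\mid x,z)
  = \alpha
  + \sum_{i=1}^{p}\beta_i x_i
  + \sum_{j=1}^{q}\gamma_j z_j
  + \theta \sum_{i=1}^{p}\sum_{j=1}^{q}\beta_i \gamma_j\, x_i z_j .
```

The single scalar $\theta$ carries all $p \times q$ interaction terms, which is what gives the interaction 1 degree of freedom instead of the $p \times q$ parameters of a saturated interaction model.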

4.
As a more complete picture of the genetic and enzymatic composition of cells becomes available, there is a growing need to describe how cellular regulatory elements interact with the cellular environment to affect cell physiology. One means for describing intracellular regulatory mechanisms is concurrent measurement of multiple metabolic pathways and their interactions by metabolic flux analysis. Flux of carbon through a metabolic pathway responds to all cellular regulatory systems, including changes in enzyme and substrate concentrations, enzyme activation or inhibition, and ultimately genetic control. The extent to which metabolic flux analysis can describe cellular physiology depends on the number of pathways in the model and the quality of the data. Intracellular information is obtainable from isotopic tracer experiments, the most extensive being the determination of the isotopomer distribution, or specific labeling pattern, of intracellular metabolites. We present a rapid and novel solution method that determines the flux of carbon through complex pathway models using isotopomer data. This time-consuming problem was solved with the introduction of isotopomer path tracing, which drastically reduces the number of isotopomer variables to the number of isotopomers observed experimentally. We propose a partitioned solution method that takes advantage of the nearly linear relationship between fluxes and isotopomers. The stoichiometric and isotopomer matrices are invertible and can be solved directly, while simulated annealing and the Newton-Raphson method are used for the nonlinear components. Reversible reactions are described by a new parameter, the association factor, which scales hyperbolically with the rate of metabolite exchange. Automating the solution method permits a variety of models to be compared, thus enhancing the accuracy of results. A simplified example that contains all of the complexities of a comprehensive pathway model is presented.
Copyright John Wiley & Sons, Inc.
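A minimal sketch of the Newton-Raphson iteration named above as the solver for the nonlinear components. The residual function below is a hypothetical scalar stand-in, not an actual isotopomer balance:

```python
def newton_raphson(f, dfdx, x0, tol=1e-12, max_iter=50):
    """Newton-Raphson root finding: x <- x - f(x) / f'(x),
    stopping when the update step falls below `tol`."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / dfdx(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton-Raphson did not converge")

# Toy residual: find the value x with x**3 = 8 (root at x = 2).
root = newton_raphson(lambda x: x ** 3 - 8.0, lambda x: 3.0 * x ** 2, 3.0)
```

In the partitioned scheme the abstract describes, iterations like this would be applied to the nonlinear balance equations, while the linear (matrix-invertible) parts are solved directly.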

5.
6.
MOTIVATION: Diffusible and non-diffusible gene products play a major role in body plan formation. A quantitative understanding of the spatio-temporal patterns formed during body plan formation, obtained with simulation models, is an important complement to experimental observation. The inverse modelling approach consists of describing body plan formation by a rule-based model and fitting the model parameters to observed data. In body plan formation, the data are usually obtained from fluorescent immunohistochemistry or in situ hybridizations. Inferring model parameters by comparing such data to simulated data is a major computational bottleneck. An important aspect of this process is the choice of method for parameter estimation. When no prior information on the parameters is available, parameter estimation is mostly done with heuristic algorithms. RESULTS: We show that parameter estimation for pattern formation models can be performed efficiently using an evolution strategy (ES). As a case study we use a quantitative spatio-temporal model of the regulatory network for early development in Drosophila melanogaster. To estimate the parameters, the simulated results are compared to a time series of gene products involved in the network, obtained with immunohistochemistry. We demonstrate that a (mu,lambda)-ES can be used to find good-quality solutions in the parameter estimation. We also show that an ES with multiple populations is 5-140 times as fast as parallel simulated annealing for this case study, and that combining an ES with a local search results in an efficient parameter estimation method.
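A minimal sketch of a (mu,lambda) evolution strategy with fixed mutation strength. The quadratic cost is a toy stand-in for the paper's fit between simulated and observed gene-expression patterns; the population sizes and sigma below are invented:

```python
import random

def mu_lambda_es(cost, dim, mu=5, lam=20, sigma=0.3, gens=100, seed=1):
    """(mu, lambda) evolution strategy: each generation, the mu best of
    lam Gaussian-mutated offspring become the new parents. Comma selection
    means the old parents are always discarded."""
    rng = random.Random(seed)
    parents = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(mu)]
    for _ in range(gens):
        offspring = []
        for _ in range(lam):
            p = rng.choice(parents)
            offspring.append([x + rng.gauss(0, sigma) for x in p])
        offspring.sort(key=cost)   # best (lowest-cost) offspring first
        parents = offspring[:mu]
    return min(parents, key=cost)

# Toy cost standing in for the mismatch to immunohistochemistry data.
best = mu_lambda_es(lambda v: sum(x * x for x in v), dim=3)
```

A production ES would normally also self-adapt sigma per individual; that refinement is omitted here for brevity.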

7.
Quantitative trait nucleotide analysis using Bayesian model selection
Although much attention has been given to statistical genetic methods for the initial localization and fine mapping of quantitative trait loci (QTLs), little methodological work has been done to date on the problem of statistically identifying the most likely functional polymorphisms using sequence data. In this paper we provide a general statistical genetic framework, called Bayesian quantitative trait nucleotide (BQTN) analysis, for assessing the likely functional status of genetic variants. The approach requires the initial enumeration of all genetic variants in a set of resequenced individuals. These polymorphisms are then typed in a large number of individuals (potentially in families), and marker variation is related to quantitative phenotypic variation using Bayesian model selection and averaging. For each sequence variant a posterior probability of effect is obtained and can be used to prioritize additional molecular functional experiments. An example of this quantitative nucleotide analysis is provided using the GAW12 simulated data. The results show that the BQTN method may be useful for choosing the most likely functional variants within a gene (or set of genes). We also include instructions on how to use our computer program, SOLAR, for association analysis and BQTN analysis.

8.
Background/Aims: Structural Equation Modeling (SEM) is an analysis approach that accounts for both the causal relationships between variables and the errors associated with the measurement of these variables. In this paper, a framework for implementing structural equation models (SEMs) in family data is proposed. Methods: This framework includes both a latent measurement model and a structural model with covariates. It allows for a wide variety of models, including latent growth curve models. Environmental, polygenic and other genetic variance components can be included in the SEM. Kronecker notation makes it easy to separate the SEM process from a familial correlation model. A limited information method of model fitting is discussed. We show how missing data and ascertainment may be handled. We give several examples of how the framework may be used. Results: A simulation study shows that our method is computationally feasible, and has good statistical properties. Conclusion: Our framework may be used to build and compare causal models using family data without any genetic marker data. It also allows for a nearly endless array of genetic association and/or linkage tests. A preliminary Matlab program is available, and we are currently implementing a more complete and user-friendly R package.

9.
A parallel genetic algorithm for optimization is outlined, and its performance on both mathematical and biomechanical optimization problems is compared to a sequential quadratic programming algorithm, a downhill simplex algorithm and a simulated annealing algorithm. When high-dimensional non-smooth or discontinuous problems with numerous local optima are considered, only the simulated annealing and the genetic algorithm, which are both characterized by a weak search heuristic, are successful in finding the optimal region in parameter space. The key advantage of the genetic algorithm is that it can easily be parallelized at negligible overhead.
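A minimal real-coded genetic algorithm sketch (truncation selection, blend crossover, Gaussian mutation; all operators and constants are my choices, not the paper's). Because the fitness evaluations within a generation are independent, they can be farmed out to workers, which is the "negligible overhead" parallelism the abstract refers to; here they run serially for simplicity:

```python
import random

def genetic_algorithm(cost, dim, pop_size=40, gens=80, p_mut=0.2, seed=2):
    """Elitist real-coded GA: keep the best quarter of the population,
    refill with blend-crossover children, occasionally mutate one gene."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(gens):
        elite = sorted(pop, key=cost)[: pop_size // 4]
        pop = list(elite)                      # elitism: keep the best as-is
        while len(pop) < pop_size:
            a, b = rng.sample(elite, 2)
            w = rng.random()
            child = [w * x + (1 - w) * y for x, y in zip(a, b)]
            if rng.random() < p_mut:           # mutate a single coordinate
                i = rng.randrange(dim)
                child[i] += rng.gauss(0, 0.5)
            pop.append(child)
    return min(pop, key=cost)

# Toy smooth cost; the paper's interest is in non-smooth problems,
# where the same loop applies with a different `cost`.
best = genetic_algorithm(lambda v: sum(x * x for x in v), dim=4)
```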

10.
Constructing dense genetic linkage maps
This paper describes a novel combination of techniques for the construction of dense genetic linkage maps. The construction of such maps is hampered by the occurrence of even small proportions of typing errors. Simulated annealing is used to obtain the best map according to the optimality criterion: the likelihood or the total number of recombination events. Spatial sampling of markers is used to obtain a framework map. The construction of a framework map is essential if the steps used for simulated annealing are required to be simple. For missing-data imputation the Gibbs sampler is used. Map construction using simulated annealing and missing-data imputation are used in an iterative way. In order to obtain some measure of precision of the genetic linkage map obtained, the Metropolis-Hastings algorithm is used to obtain posterior intervals for the positions of markers. The process of map construction is embedded in a framework of pre-mapping and post-mapping diagnostics. The techniques described are illustrated using a practical application. Received: 1 June 2000 / Accepted: 21 September 2000

11.
Summary. To detect association between a genetic marker and a disease in case-control studies, the Cochran-Armitage trend test is typically used. The trend test is locally optimal when the genetic model is correctly specified. However, in practice, the underlying genetic model, and hence the optimal trend test, are usually unknown. In this case, Pearson's chi-squared test, the maximum of three trend test statistics (optimal for the recessive, additive, and dominant models), and the test based on genetic model selection (GMS) are useful. In this article, we first modify the existing GMS method so that it can be used when the risk allele is unknown. Then we propose a new approach by excluding a genetic model that is not supported by the data. Using either the model selection or exclusion, the alternative space is reduced conditional on the observed data, and hence the power to detect a true association can be increased. Simulation results are reported and the proposed methods are applied to the genetic markers identified from the genome-wide association studies conducted by the Wellcome Trust Case Control Consortium. The results demonstrate that the genetic model exclusion approach usually performs better than existing methods in its worst-case scenario across the scientifically plausible genetic models we considered.
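A minimal sketch of the Cochran-Armitage trend statistic and of the "maximum of three trend tests" idea mentioned above. The variance here omits the finite-sample N/(N-1) correction, and the example counts are invented:

```python
import math

def trend_test(cases, controls, scores):
    """Cochran-Armitage trend statistic for a 2x3 genotype table.
    cases/controls: counts for genotypes with 0, 1, 2 risk alleles;
    scores: (0, 1, 2) additive, (0, 1, 1) dominant, (0, 0, 1) recessive.
    Returns a Z value, asymptotically N(0, 1) under no association."""
    r = sum(cases)
    n = [c + d for c, d in zip(cases, controls)]
    total = sum(n)
    p = r / total
    num = sum(s * (c - p * ni) for s, c, ni in zip(scores, cases, n))
    mean_s = sum(s * ni for s, ni in zip(scores, n)) / total
    var = p * (1 - p) * sum(ni * (s - mean_s) ** 2 for s, ni in zip(scores, n))
    return num / math.sqrt(var)

# Invented counts for genotypes (0, 1, 2 risk alleles).
cases, controls = (10, 40, 50), (50, 40, 10)
z_additive = trend_test(cases, controls, (0, 1, 2))
# MAX3: the maximum over the recessive, additive, and dominant scorings.
z_max = max(abs(trend_test(cases, controls, s))
            for s in [(0, 0, 1), (0, 1, 2), (0, 1, 1)])
```

The GMS and model-exclusion methods in the abstract go further by using the data (e.g. Hardy-Weinberg-based statistics) to choose or discard one of these scorings before testing.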

12.
We describe a method by which a single experiment can reveal both association model (pathway and constants) and low-resolution structures of a self-associating system. Small-angle scattering data are collected from solutions at a range of concentrations. These scattering data curves are mass-weighted linear combinations of the scattering from each oligomer. Singular value decomposition of the data yields a set of basis vectors from which the scattering curve for each oligomer is reconstructed using coefficients that depend on the association model. A search identifies the association pathway and constants that provide the best agreement between reconstructed and observed data. Using simulated data with realistic noise, our method finds the correct pathway and association constants. Depending on the simulation parameters, reconstructed curves for each oligomer differ from the ideal by 0.05-0.99% in median absolute relative deviation. The reconstructed scattering curves are fundamental to further analysis, including interatomic distance distribution calculation and low-resolution ab initio shape reconstruction of each oligomer in solution. This method can be applied to x-ray or neutron scattering data from small angles to moderate (or higher) resolution. Data can be taken under physiological conditions, or particular conditions (e.g., temperature) can be varied to extract fundamental association parameters (ΔH_ass, ΔS_ass).
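A toy illustration of the core premise that a mixture curve is a mass-weighted linear combination of species curves. For brevity this sketch recovers the weights by ordinary least squares with a known two-species basis, rather than by the paper's SVD of a concentration series; the Gaussian-shaped "curves" and all constants are invented:

```python
import math

def two_component_fit(b1, b2, y):
    """Solve min ||w1*b1 + w2*b2 - y|| for (w1, w2) via the
    2x2 normal equations (plain least squares, no regularisation)."""
    s11 = sum(a * a for a in b1)
    s22 = sum(a * a for a in b2)
    s12 = sum(a * b for a, b in zip(b1, b2))
    t1 = sum(a * c for a, c in zip(b1, y))
    t2 = sum(a * c for a, c in zip(b2, y))
    det = s11 * s22 - s12 * s12
    return (s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det

# Hypothetical Guinier-like monomer and dimer curves on a q-grid.
q = [0.01 * i for i in range(1, 101)]
monomer = [math.exp(-10.0 * qi * qi) for qi in q]
dimer = [2.0 * math.exp(-18.0 * qi * qi) for qi in q]
# A noiseless mixture that is 70% monomer, 30% dimer by mass.
mixture = [0.7 * m + 0.3 * d for m, d in zip(monomer, dimer)]
w1, w2 = two_component_fit(monomer, dimer, mixture)
```

In the paper the species curves themselves are unknown and are reconstructed from the SVD basis, with the weights tied to the association constants through the assumed pathway.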

13.
One of the most difficult and time-consuming aspects of building compartmental models of single neurons is assigning values to free parameters to make models match experimental data. Automated parameter-search methods potentially represent a more rapid and less labor-intensive alternative to choosing parameters manually. Here we compare the performance of four different parameter-search methods on several single-neuron models. The methods compared are conjugate-gradient descent, genetic algorithms, simulated annealing, and stochastic search. Each method has been tested on five different neuronal models ranging from simple models with between 3 and 15 parameters to a realistic pyramidal cell model with 23 parameters. The results demonstrate that genetic algorithms and simulated annealing are generally the most effective methods. Simulated annealing was overwhelmingly the most effective method for simple models with small numbers of parameters, but the genetic algorithm method was equally effective for more complex models with larger numbers of parameters. The discussion considers possible explanations for these results and makes several specific recommendations for the use of parameter searches on neuronal models.

14.
Genomic best linear-unbiased prediction (GBLUP) assumes equal variance for all marker effects, which is suitable for traits that conform to the infinitesimal model. For traits controlled by major genes, Bayesian methods with shrinkage priors or genome-wide association study (GWAS) methods can be used to identify causal variants effectively. The information from Bayesian/GWAS methods can be used to construct the weighted genomic relationship matrix (G). However, it remains unclear which methods perform best for traits varying in genetic architecture. Therefore, we developed several methods to optimize the performance of weighted GBLUP and compare them with other available methods using simulated and real data sets. First, two types of methods (marker effects with local shrinkage or normal prior) were used to obtain test statistics and estimates for each marker effect. Second, three weighted G matrices were constructed based on the marker information from the first step: (1) the genomic-feature-weighted G, (2) the estimated marker-variance-weighted G, and (3) the absolute value of the estimated marker-effect-weighted G. Following the above process, six different weighted GBLUP methods (local shrinkage/normal-prior GF/EV/AEWGBLUP) were proposed for genomic prediction. Analyses with both simulated and real data demonstrated that these options offer flexibility for optimizing the weighted GBLUP for traits with a broad spectrum of genetic architectures. The advantage of weighting methods over GBLUP in terms of accuracy was trait dependent, ranging from 14.8% to marginal for simulated traits and from 44% to marginal for real traits. Local-shrinkage prior EVWGBLUP is superior for traits mainly controlled by loci of a large effect. Normal-prior AEWGBLUP performs well for traits mainly controlled by loci of moderate effect. For traits controlled by some loci with large effects (explaining 25-50% of the genetic variance) and a range of loci with small effects, GFWGBLUP has advantages.
In conclusion, the optimal weighted GBLUP method for genomic selection should carefully take both the genetic architecture and the number of QTLs of a trait into consideration. Subject terms: Quantitative trait, Genome-wide association studies, Animal breeding
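A minimal sketch of building a weighted genomic relationship matrix of the kind the abstract describes, in the common form G = M diag(w) M' / Σ_j 2p_j(1-p_j)w_j with M the centred genotype matrix. The toy genotypes, allele frequencies, and equal weights below are invented; with all weights equal this reduces to the standard unweighted G of GBLUP:

```python
def weighted_g_matrix(genotypes, freqs, weights):
    """Weighted genomic relationship matrix (sketch).
    genotypes: n x m matrix of allele counts (0/1/2);
    freqs: allele frequency p_j per marker; weights: per-marker weight w_j,
    e.g. derived from Bayesian posterior variances or GWAS statistics."""
    n, m = len(genotypes), len(freqs)
    # Centre each genotype column by twice its allele frequency.
    centred = [[genotypes[i][j] - 2.0 * freqs[j] for j in range(m)]
               for i in range(n)]
    denom = sum(2.0 * p * (1.0 - p) * w for p, w in zip(freqs, weights))
    return [[sum(weights[k] * centred[i][k] * centred[j][k] for k in range(m)) / denom
             for j in range(n)] for i in range(n)]

# Toy data: 3 individuals x 4 markers, p = 0.5 everywhere, equal weights.
geno = [[0, 1, 2, 1], [2, 1, 0, 0], [1, 1, 1, 2]]
p = [0.5, 0.5, 0.5, 0.5]
G = weighted_g_matrix(geno, p, [1.0, 1.0, 1.0, 1.0])
```

The methods compared in the paper differ precisely in how `weights` is obtained (genomic features, estimated marker variances, or absolute estimated effects, under local-shrinkage or normal priors).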

15.
16.
Association mapping studies aim to determine the genetic basis of a trait. A common experimental design uses a sample of unrelated individuals classified into 2 groups, for example cases and controls. If the trait has a complex genetic basis, consisting of many quantitative trait loci (QTLs), each group needs to be large. Each group must be genotyped at marker loci covering the region of interest; for dense coverage of a large candidate region, or a whole-genome scan, the number of markers will be very large. The total amount of genotyping required for such a study is formidable. DNA pooling, a technique that reduces laboratory effort, could cut the amount of genotyping required, but the data generated are less informative and require novel methods for efficient analysis. In this paper, a Bayesian statistical analysis of the classic model of McPeek and Strahs is proposed. In contrast to previous work on this model, I assume that data are collected using DNA pooling, so individual genotypes are not directly observed, and also account for experimental errors. A complete analysis can be performed using analytical integration, a propagation algorithm for a hidden Markov model, and quadrature. The method developed here is both statistically and computationally efficient. It allows simultaneous detection and mapping of a QTL, in a large-scale association mapping study, using data from pooled DNA. The method is shown to perform well on data sets simulated under a realistic coalescent-with-recombination model, and is shown to outperform classical single-point methods. The method is illustrated on data consisting of 27 markers in an 880-kb region around the CYP2D6 gene.

17.
Data mining applied to linkage disequilibrium mapping
We introduce a new method for linkage disequilibrium mapping: haplotype pattern mining (HPM). The method, inspired by data mining methods, is based on discovery of recurrent patterns. We define a class of useful haplotype patterns in genetic case-control data and use the algorithm for finding disease-associated haplotypes. The haplotypes are ordered by their strength of association with the phenotype, and all haplotypes exceeding a given threshold level are used for prediction of disease susceptibility-gene location. The method is model-free, in the sense that it does not require (and is unable to utilize) any assumptions about the inheritance model of the disease. The statistical model is nonparametric. The haplotypes are allowed to contain gaps, which improves the method's robustness to mutations and to missing and erroneous data. Experimental studies with simulated microsatellite and SNP data show that the method has good localization power in data sets with high proportions of phenocopies and large amounts of missing and erroneous data. The power of HPM is roughly identical for marker maps at a density of 3 single-nucleotide polymorphisms/cM or 1 microsatellite/cM. The capacity to handle high proportions of phenocopies makes the method promising for complex disease mapping. An example of correct disease susceptibility-gene localization with HPM is given with real marker data from families from the United Kingdom affected by type 1 diabetes. The method is extendable to include environmental covariates or phenotype measurements or to find several genes simultaneously.

18.
Resequencing is an emerging tool for identification of rare disease-associated mutations. Rare mutations are difficult to tag with SNP genotyping, as genotyping studies are designed to detect common variants. However, studies have shown that genetic heterogeneity is a probable scenario for common diseases, in which multiple rare mutations together explain a large proportion of the genetic basis for the disease. Thus, we propose a weighted-sum method to jointly analyse a group of mutations in order to test for groupwise association with disease status. For example, such a group of mutations may result from resequencing a gene. We compare the proposed weighted-sum method to alternative methods and show that it is powerful for identifying disease-associated genes, on both simulated and ENCODE data. Using the weighted-sum method, a resequencing study can identify a disease-associated gene with an overall population attributable risk (PAR) of 2%, even when each individual mutation has much lower PAR, using 1,000 to 7,000 affected and unaffected individuals, depending on the underlying genetic model. This study thus demonstrates that resequencing studies can identify important genetic associations, provided that specialised analysis methods, such as the weighted-sum method, are used.
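A minimal sketch in the spirit of the weighted-sum idea: per-individual burden scores in which rarer variants get larger weights, with allele frequencies estimated from the unaffected group (plus a pseudo-count so unobserved alleles get a finite weight). The cohort below is invented, and the full method would compare the groups with a rank-sum statistic calibrated by permutation rather than a raw mean difference:

```python
import math

def weighted_sum_scores(genotypes, affected):
    """Per-individual weighted-sum scores. genotypes: list of
    minor-allele-count vectors; affected: parallel list of 1/0 status.
    Variant j gets weight 1 / sqrt(n * q_j * (1 - q_j)), with q_j the
    allele frequency estimated in the unaffected group."""
    n = len(genotypes)
    unaffected = [g for g, a in zip(genotypes, affected) if a == 0]
    nu = len(unaffected)
    weights = []
    for j in range(len(genotypes[0])):
        mu = sum(g[j] for g in unaffected)
        q = (mu + 1.0) / (2.0 * nu + 2.0)          # pseudo-count estimate
        weights.append(1.0 / math.sqrt(n * q * (1.0 - q)))
    return [sum(w * gj for w, gj in zip(weights, g)) for g in genotypes]

# Toy cohort: variant 0 is absent in the unaffected and enriched in the
# affected, so it receives a large weight and drives the group difference.
geno = [[1, 0], [1, 1], [0, 1], [1, 0],   # 4 affected
        [0, 0], [0, 1], [0, 0], [0, 1]]   # 4 unaffected
status = [1, 1, 1, 1, 0, 0, 0, 0]
s = weighted_sum_scores(geno, status)
mean_aff = sum(s[:4]) / 4
mean_unaff = sum(s[4:]) / 4
```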

19.
Sequencing studies are increasingly being conducted to identify rare variants associated with complex traits. The limited power of classical single-marker association analysis for rare variants poses a central challenge in such studies. We propose the sequence kernel association test (SKAT), a supervised, flexible, computationally efficient regression method to test for association between genetic variants (common and rare) in a region and a continuous or dichotomous trait while easily adjusting for covariates. As a score-based variance-component test, SKAT can quickly calculate p values analytically by fitting the null model containing only the covariates, and so can easily be applied to genome-wide data. Using SKAT to analyze a genome-wide sequencing study of 1000 individuals, by segmenting the whole genome into 30 kb regions, requires only 7 hr on a laptop. Through analysis of simulated data across a wide range of practical scenarios and triglyceride data from the Dallas Heart Study, we show that SKAT can substantially outperform several alternative rare-variant association tests. We also provide analytic power and sample-size calculations to help design candidate-gene, whole-exome, and whole-genome sequence association studies.
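A minimal sketch of the variance-component score statistic at the heart of a SKAT-style test, Q = Σ_j w_j (Σ_i (y_i - ȳ) g_ij)², for a continuous trait under an intercept-only null model. The real SKAT adjusts for covariates and obtains p values analytically from a mixture of chi-square distributions; this sketch only forms Q, and the toy data are invented:

```python
def skat_q(y, genotypes, weights):
    """SKAT-style score statistic: weighted sum of squared covariances
    between null-model residuals and each variant's allele counts.
    genotypes: per-individual vectors of minor-allele counts."""
    n = len(y)
    ybar = sum(y) / n
    resid = [yi - ybar for yi in y]            # intercept-only null residuals
    q = 0.0
    for j, w in enumerate(weights):
        s = sum(r * g[j] for r, g in zip(resid, genotypes))
        q += w * s * s
    return q

# Toy example: one variant perfectly tracking the trait, versus the same
# trait values shuffled so the association vanishes.
geno = [[0], [0], [1], [1], [2], [2]]
y_assoc = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0]
y_null = [0.0, 2.0, 1.0, 1.0, 2.0, 0.0]
q_assoc = skat_q(y_assoc, geno, [1.0])
q_null = skat_q(y_null, geno, [1.0])
```

Because Q depends on the residuals only through their covariance with the genotypes, the null model is fitted once, which is what makes genome-wide application cheap.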

20.
A new functional representation of NMR-derived distance constraints, the flexible restraint potential, has been implemented in the program CONGEN (Bruccoleri RE, Karplus M, 1987, Biopolymers 26:137-168) for molecular structure generation. In addition, flat-bottomed restraint potentials for representing dihedral angle and vicinal scalar coupling constraints have been introduced into CONGEN. An effective simulated annealing (SA) protocol that combines both weight annealing and temperature annealing is described. Calculations have been performed using ideal simulated NMR constraints, in order to evaluate the use of restrained molecular dynamics (MD) with these target functions as implemented in CONGEN. In this benchmark study, internuclear distance, dihedral angle, and vicinal coupling constant constraints were calculated from the energy-minimized X-ray crystal structure of the 46-amino acid polypeptide crambin (1CRN). Three-dimensional structures of crambin that satisfy these simulated NMR constraints were generated using restrained MD and SA. Polypeptide structures with extended backbone and side-chain conformations were used as starting conformations. Dynamical annealing calculations using extended starting conformations and assignments of initial velocities taken randomly from a Maxwellian distribution were found to adequately sample the conformational space consistent with the constraints. These calculations also show that loosened internuclear constraints can allow molecules to overcome local minima in the search for a global minimum with respect to both the NMR-derived constraints and conformational energy. This protocol and the modified version of the CONGEN program described here are shown to be reliable and robust, and are applicable generally for protein structure determination by dynamical simulated annealing using NMR data.
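A minimal sketch of a flat-bottomed restraint penalty of the general kind the abstract mentions: zero inside the allowed range, harmonic outside it. The purely harmonic walls and unit force constant are simplifying assumptions; CONGEN's flexible restraint potential is more elaborate:

```python
def flat_bottom_restraint(d, lower, upper, k=1.0):
    """Flat-bottomed restraint penalty: no cost while the measured
    quantity d (e.g. an internuclear distance) lies in [lower, upper],
    quadratic growth with force constant k outside the range."""
    if d < lower:
        return k * (lower - d) ** 2
    if d > upper:
        return k * (d - upper) ** 2
    return 0.0
```

The flat bottom is what lets "loosened" constraints help escape local minima: conformations anywhere inside the allowed range feel no restraint force at all.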


Copyright © Beijing Qinyun Technology Development Co., Ltd. 京ICP备09084417号