Similar Documents
 Found 20 similar documents (search time: 15 ms)
1.
MetaboNexus is an interactive metabolomics data analysis platform that integrates pre-processing of raw peak data with in-depth statistical analysis and metabolite identity search. It is designed to run as a desktop application, so large files need not be uploaded to web servers. This can speed up data analysis, because server queries and queues are avoided, while keeping confidential clinical data secure on a local computer. With MetaboNexus, users can progress from data pre-processing through multi- and univariate analysis to metabolite identity search for significant molecular features, thereby seamlessly integrating the critical steps of metabolite biomarker discovery. Data exploration can first be performed using principal components analysis, while prediction and variable importance can be calculated using partial least squares-discriminant analysis and Random Forest. After identifying putative features from multi- and univariate analyses (e.g. t test, ANOVA, Mann–Whitney U test and Kruskal–Wallis test), users can seamlessly determine the molecular identity of these putative features. To assist users in data interpretation, MetaboNexus also automatically generates graphical outputs, such as score plots, diagnostic plots, boxplots, receiver operating characteristic plots and heatmaps. The metabolite search function matches mass spectrometric peak data to three major metabolite repositories, namely HMDB, MassBank and METLIN, using a comprehensive range of molecular adducts. Biological pathways can also be searched within MetaboNexus. MetaboNexus is available, with an installation guide and tutorial, at http://www.sph.nus.edu.sg/index.php/research-services/research-centres/ceohr/metabonexus, and is meant for the Windows operating system, XP and onwards (preferably 64-bit).
In summary, MetaboNexus is a desktop-based platform that seamlessly integrates the entire data analytical workflow and further provides the putative identities of mass spectrometric data peaks by matching them to databases.
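As a sketch of the workflow described above (PCA for exploration, then a univariate screen such as the Mann–Whitney U test), the following uses a synthetic peak-intensity table, not MetaboNexus itself; all names and values are illustrative:

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)

# Synthetic peak-intensity table: 20 samples x 50 features, with
# feature 0 shifted in the second group (hypothetical data).
X = rng.normal(size=(20, 50))
X[10:, 0] += 2.0
groups = np.array([0] * 10 + [1] * 10)

# Data exploration: PCA via SVD on the mean-centered matrix.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = U * S                      # coordinates for a PCA score plot

# Univariate screen: Mann-Whitney U test per feature.
pvals = np.array([mannwhitneyu(X[groups == 0, j], X[groups == 1, j]).pvalue
                  for j in range(X.shape[1])])
print("most significant feature:", int(pvals.argmin()))
```

In a real analysis the significant features would then be matched against HMDB, MassBank and METLIN.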

2.

Background

Biochemical equilibria are usually modeled iteratively: given one or a few fitted models, if there is a lack of fit or overfitting, a new model with additional or fewer parameters is fitted, and the process is repeated. The problem with this approach is that different analysts can propose and select different models, and thus extract different binding parameter estimates from the same data. An alternative is to first generate a comprehensive, standardized list of plausible models, and then to fit them exhaustively, or semi-exhaustively.

Results

A framework is presented in which equilibria are modeled as pairs (g, h), where g = 0 maps total reactant concentrations (system inputs) into free reactant concentrations (system states), which h then maps into expected values of measurements (system outputs). By letting dissociation constants K_d be either freely estimated, infinity, zero, or equal to other K_d, and by letting undamaged protein fractions be either freely estimated or 1, many g models are formed. A standard space of g models for ligand-induced protein dimerization equilibria is given. Coupled to an h model, the resulting (g, h) pairs were fitted to dTTP-induced R1 dimerization data (R1 is the large subunit of ribonucleotide reductase). Models with the fewest parameters were fitted first. Thereafter, upon fitting a batch, the next batch of models (with one more parameter) was fitted only if the current batch yielded a model that was better (based on the Akaike Information Criterion) than the best model in the previous batch (with one less parameter). Within batches, models were fitted in parallel. This semi-exhaustive approach yielded the same best models as an exhaustive model space fit, but in approximately one-fifth the time.
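The batch-wise, AIC-gated search described above can be sketched generically. Here polynomial degree stands in for a batch's parameter count, with toy data rather than the R1 dimerization models:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 1, 40)
y = 1.0 + 2.0 * x + rng.normal(scale=0.1, size=x.size)  # true model is linear

def fit_poly_aic(degree):
    """Fit a polynomial of the given degree and return its AIC."""
    coef = np.polyfit(x, y, degree)
    resid = y - np.polyval(coef, x)
    n, k = x.size, degree + 1
    rss = float(resid @ resid)
    return n * np.log(rss / n) + 2 * k

# Batches ordered by parameter count; stop once a batch with one more
# parameter fails to beat the best AIC so far (a generic sketch of the
# semi-exhaustive strategy, not the paper's exact model space).
best_aic, best_deg = np.inf, None
for degree in range(6):
    aic = fit_poly_aic(degree)
    if aic < best_aic:
        best_aic, best_deg = aic, degree
    else:
        break   # one extra parameter did not improve AIC: stop
print("selected degree:", best_deg)
```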

Conclusion

Comprehensive model-space-based biochemical equilibrium model selection methods are realizable. Their significance to systems biology, as mappings of data into mathematical models, warrants their development.

3.
Lisacek F. Proteomics 2006, 6(Z2):22-32
This tutorial focuses on three MS/MS data analysis programs currently available via a web interface: Mascot, Phenyx and X!Tandem. Although these programs process the same input and often produce comparable outputs, subtle differences remain. The use of parameters that are requested in the on-line forms and the subsequent interpretation of results are illustrated and explained via a single example.

4.

Background

Species distribution models are often used to characterize a species' native range climate, so as to identify sites elsewhere in the world that may be climatically similar and therefore at risk of invasion by the species. This endeavor provoked intense public controversy over recent attempts to model areas at risk of invasion by the Indian Python (Python molurus). We evaluated a number of MaxEnt models on this species to assess MaxEnt's utility for vertebrate climate matching.

Methodology/Principal Findings

Overall, we found MaxEnt models to be very sensitive to modeling choices and to the selection of input localities and background regions. As used, MaxEnt invoked minimal protections against data dredging, multi-collinearity of explanatory axes, and overfitting. As used, MaxEnt endeavored to identify a single ideal climate, whereas different climatic considerations may determine range boundaries in different parts of the native range. MaxEnt was extremely sensitive both to the choice of background locations for the python and to the selection of presence points: inclusion of just four erroneous localities was responsible for Pyron et al.'s conclusion that no additional portions of the U.S. mainland were at risk of python invasion. When used with default settings, MaxEnt overfit the realized climate space, identifying models with about 60 parameters, about five times the number of parameters justifiable when optimized on the basis of Akaike's Information Criterion.

Conclusions/Significance

When used with default settings, MaxEnt may not be an appropriate vehicle for identifying all sites at risk of colonization. Model instability and dearth of protections against overfitting, multi-collinearity, and data dredging may combine with a failure to distinguish fundamental from realized climate envelopes to produce models of limited utility. A priori identification of biologically realistic model structure, combined with computational protections against these statistical problems, may produce more robust models of invasion risk.
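To illustrate why information-criterion-based selection favors far fewer than 60 parameters, here is a hypothetical AICc comparison of two models with equal fit quality; the sample size and log-likelihood values are invented for illustration, not taken from the python dataset:

```python
# AICc (small-sample corrected AIC): -2*logLik + 2k + 2k(k+1)/(n-k-1).
# Extra parameters must buy a substantial likelihood gain to be justified.
def aicc(loglik, k, n):
    return -2 * loglik + 2 * k + 2 * k * (k + 1) / (n - k - 1)

n = 100            # number of presence localities (assumed)
loglik = -250.0    # identical fit quality for both models (assumed)
print("12-parameter model AICc:", round(aicc(loglik, 12, n), 1))
print("60-parameter model AICc:", round(aicc(loglik, 60, n), 1))
```

At equal likelihood, the parsimonious model wins decisively; only a large likelihood improvement could justify the extra parameters.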

5.
Experimental simulator studies are frequently performed to evaluate wear behavior in total knee replacement. It is vital that the simulation conditions match the physiological situation as closely as possible. To date, few experimental wear studies have examined the effects of joint laxity on wear and joint kinematics, and the absence of the anterior cruciate ligament has not been sufficiently taken into account in simulator wear studies. The aim of this study was to investigate different ligament and soft tissue models with respect to wear and kinematics. A virtual soft tissue control system was used to simulate different motion restraints in a force-controlled knee wear simulator. The application of more realistic and sophisticated ligament models that considered the absence of the anterior cruciate ligament led to a significant increase in polyethylene wear (p=0.02) and significant changes in joint kinematics (p<0.01). We recommend the use of more complex ligament models to appropriately simulate the function of the human knee joint and to evaluate the wear behavior of total knee replacements. A feasible simulation model is presented.

6.
7.
How anger works     
Anger appears to be a neurocognitive adaptation designed to bargain for better treatment, and is primarily triggered by indications that another individual values the focal individual insufficiently. Once activated, anger orchestrates cognitive, physiological, and behavioral responses geared to incentivize the target individual to place more weight on the welfare of the focal individual. Here, we evaluate the hypothesis that anger works by matching in intensity the various outputs it controls to the magnitude of the current input: the precise degree to which the target appears to undervalue the focal individual. By magnitude-matching outputs to inputs, the anger system balances the competing demands of effectiveness and economy and avoids the dual errors of excessive diffidence and excessive belligerence in bargaining. To test this hypothesis, we measured the degree to which audiences devalue each of 39 negative traits in others, and how individuals would react, for each of those 39 traits, if someone slandered them as possessing those traits. We observed the hypothesized magnitude-matching. The intensities of the anger feeling and of various motivations of anger (telling the offender to stop, insulting the offender, physically attacking the offender, stopping talking to the offender, and denying help to the offender) vary in proportion to: (i) one another, and (ii) the reputational cost that the slanderer imposes on the slandered (proxied by audience devaluation). These patterns of magnitude-matching were observed both within and between the United States and India. These quantitative findings echo laypeople's folk understanding of anger and suggest that there are cross-cultural regularities in the functional logic and content of anger.

8.
In this paper, we investigate the application of a new method, the Finite Difference and Stochastic Gradient (Hybrid) method, for history matching in reservoir models. History matching is a process of solving an inverse problem by calibrating reservoir models to the dynamic behaviour of the reservoir, in which an objective function is formulated based on a Bayesian approach for optimization. The goal of history matching is to identify the minimum value of an objective function that expresses the misfit between the predicted and measured data of a reservoir. To address the optimization problem, we present a novel application of a combination of the stochastic gradient and finite difference methods for solving inverse problems. The optimization is constrained by a linear equation that contains the reservoir parameters. We reformulate the reservoir model's parameters and dynamic data by operating on the objective function, whose approximate gradient can guarantee convergence. At each iteration step, we compare the magnitudes of the components of the stochastic gradient to identify its relatively 'important' elements, substitute those elements with values from the finite difference method to form a new gradient, and then iterate with the new gradient. Through the application of the Hybrid method, we efficiently and accurately optimize the objective function. We present a number of numerical simulations in this paper that show that the method is accurate and computationally efficient.
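A generic sketch of the hybrid idea, replacing the largest-magnitude components of a stochastic (SPSA-style) gradient estimate with central finite differences, is shown below on a toy quadratic misfit; the objective, dimension, and step sizes are all assumptions, not the paper's reservoir model:

```python
import numpy as np

rng = np.random.default_rng(2)

def objective(m):
    """Toy misfit between predicted and 'measured' data (a stand-in
    for the reservoir objective function)."""
    return float(np.sum((m - np.arange(m.size)) ** 2))

def spsa_gradient(f, m, c=1e-2):
    """Simultaneous-perturbation stochastic gradient estimate."""
    delta = rng.choice([-1.0, 1.0], size=m.size)
    return (f(m + c * delta) - f(m - c * delta)) / (2 * c) * delta

def fd_component(f, m, i, h=1e-5):
    """Central finite-difference derivative along coordinate i."""
    e = np.zeros(m.size)
    e[i] = h
    return (f(m + e) - f(m - e)) / (2 * h)

def hybrid_gradient(f, m, n_fd=3):
    """Substitute the largest-magnitude stochastic components with
    finite-difference values (a sketch of the hybrid idea)."""
    g = spsa_gradient(f, m)
    for i in np.argsort(np.abs(g))[-n_fd:]:
        g[i] = fd_component(f, m, i)
    return g

m = np.full(8, 5.0)                 # initial parameter guess
for _ in range(200):
    m -= 0.05 * hybrid_gradient(objective, m)
print("final misfit:", objective(m))
```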

9.

Background

The vast computational resources that became available during the past decade enabled the development and simulation of increasingly complex mathematical models of cancer growth. These models typically involve many free parameters whose determination is a substantial obstacle to model development. Direct measurement of biochemical parameters in vivo is often difficult and sometimes impracticable, while fitting them under data-poor conditions may result in biologically implausible values.

Results

We discuss different methodological approaches to estimating parameters in complex biological models. We make use of the high computational power of the Blue Gene technology to perform an extensive study of the parameter space in a model of avascular tumor growth. We explicitly show that the landscape of the cost function used to optimize the model to the data has a very rugged surface in parameter space. Many local minima of this cost function correspond to unrealistic solutions, including the global minimum, which corresponds to the best fit.

Conclusions

The case studied in this paper shows one example in which model parameters that optimally fit the data are not necessarily the best ones from a biological point of view. To avoid force-fitting a model to a dataset, we propose that the best model parameters should be found by choosing, among suboptimal parameters, those that match criteria other than the ones used to fit the model. We also conclude that the model, data and optimization approach form a new complex system, and point to the need for a theory that addresses this problem more generally.
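The ruggedness problem can be illustrated with a multi-start local optimization on a toy one-dimensional cost surface (purely illustrative, not the tumor-growth model): different starting points converge to different local minima.

```python
import numpy as np
from scipy.optimize import minimize

def cost(p):
    """Toy rugged cost surface with many local minima."""
    x = p[0]
    return float(np.sin(5 * x) ** 2 + 0.1 * x ** 2)

# Multi-start local optimization: each start descends into its own
# basin, so several distinct local minima are recovered.
starts = np.linspace(-3, 3, 25)
minima = {round(float(minimize(cost, [s]).x[0]), 2) for s in starts}
print("distinct local minima found:", len(minima))
```

On a rugged surface like this, the minimum reached depends entirely on the starting point, which is why a single local fit can land on a biologically implausible solution.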

10.

Background

Antagonistic species often interact via matching of phenotypes, and interactions between brood parasitic common cuckoos (Cuculus canorus) and their hosts constitute classic examples. The outcome of a parasitic event is often determined by the match between host and cuckoo eggs, giving rise to potentially strong associations between fitness and egg phenotype. Yet, empirical efforts aiming to document and understand the resulting evolutionary outcomes are in short supply.

Methods/Principal Findings

We used avian color space models to analyze patterns of egg color variation within and between the cuckoo and two closely related hosts, the nomadic brambling (Fringilla montifringilla) and the site-fidelic chaffinch (F. coelebs). We found that there is pronounced opportunity for disruptive selection on brambling egg coloration. The corresponding cuckoo host race has evolved egg colors that maximize fitness in both sympatric and allopatric brambling populations. By contrast, the chaffinch has a more bimodal egg color distribution consistent with the evolutionary direction predicted for the brambling. Whereas the brambling and its cuckoo host race show little geographical variation in their egg color distributions, the chaffinch's distribution becomes increasingly dissimilar to the brambling's distribution towards the core area of the brambling cuckoo host race.

Conclusion

High rates of brambling gene flow are likely to cool down coevolutionary hot spots by cancelling out the selection imposed by a patchily distributed cuckoo host race, thereby promoting a matching equilibrium. By contrast, the site-fidelic chaffinch is more likely to respond to selection from adapting cuckoos, resulting in a markedly more bimodal egg color distribution. The geographic variation in the chaffinch's egg color distribution could reflect a historical gradient in parasitism pressure. Finally, marked cuckoo egg polymorphisms are unlikely to evolve in these systems unless the hosts evolve even more exquisite egg recognition capabilities than they currently possess.

11.
12.

Background

The distribution of residual effects in linear mixed models in animal breeding applications is typically assumed to be normal, which makes inferences vulnerable to outlier observations. In order to mute the impact of outliers, one option is to fit models with residuals having a heavy-tailed distribution. Here, a Student's t model was considered for the distribution of the residuals, with the degrees of freedom treated as unknown. Bayesian inference, via Markov chain Monte Carlo methods, was used to investigate a bivariate Student's t (BSt) model in a simulation study and in an analysis of field data on gestation length and birth weight, permitting study of the practical implications of fitting heavy-tailed residual distributions in linear mixed models.

Methods

In the simulation study, bivariate residuals were generated using a Student's t distribution with 4 or 12 degrees of freedom, or a normal distribution. Sire models with bivariate Student's t or normal residuals were fitted to each simulated dataset using a hierarchical Bayesian approach. For the field data, consisting of gestation length and birth weight records on 7,883 Italian Piemontese cattle, a sire-maternal grandsire model including fixed effects of sex-age of dam and uncorrelated random herd-year-season effects was fitted using a hierarchical Bayesian approach. Residuals were defined to follow bivariate normal or Student's t distributions with unknown degrees of freedom.

Results

Posterior mean estimates of the degrees of freedom parameters appeared accurate and unbiased in the simulation study. Estimates of sire and herd variances were similar, if not identical, across fitted models. In the field data, there was strong support, based on predictive log-likelihood values, for the Student's t error model. Most of the posterior density for the degrees of freedom was below 4. Posterior means of direct and maternal heritabilities for birth weight were smaller in the Student's t model than in the normal model. Re-rankings of sires were observed between the heavy-tailed and normal models.

Conclusions

Reliable estimates of degrees of freedom were obtained in all simulated heavy-tailed and normal datasets. The predictive log-likelihood was able to distinguish the correct model among the models fitted to heavy-tailed datasets. There was no disadvantage to fitting a heavy-tailed model when the true model was normal. Predictive log-likelihood values indicated that heavy-tailed models with low degrees of freedom fitted the gestation length and birth weight data better than a model with normally distributed residuals. Heavy-tailed and normal models resulted in different estimates of direct and maternal heritabilities, and different sire rankings. Heavy-tailed models may be more appropriate for reliable estimation of genetic parameters from field data.
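As a toy illustration of comparing normal and Student's t error models by log-likelihood, the following fits both to synthetic heavy-tailed residuals; this is not the hierarchical sire model or the cattle data, just the model-comparison idea:

```python
import numpy as np
from scipy import stats

# Residuals simulated from a Student's t distribution with 4 degrees
# of freedom, mirroring the heavy-tailed simulation scenario.
resid = stats.t.rvs(df=4, size=2000, random_state=42)

# Normal error model: plug-in mean and standard deviation.
mu, sigma = np.mean(resid), np.std(resid)
ll_normal = stats.norm.logpdf(resid, mu, sigma).sum()

# Student's t error model with the degrees of freedom estimated.
df_hat, loc, scale = stats.t.fit(resid)
ll_t = stats.t.logpdf(resid, df_hat, loc, scale).sum()

print("t-model log-likelihood advantage:", round(ll_t - ll_normal, 1))
```

With truly heavy-tailed residuals, the t model's log-likelihood exceeds the normal model's, the same direction of support the field data showed.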

13.
Single neuron models have a long tradition in computational neuroscience. Detailed biophysical models such as the Hodgkin-Huxley model, as well as simplified neuron models such as the class of integrate-and-fire models, relate the input current to the membrane potential of the neuron. These types of models have been extensively fitted to in vitro data, where the input current is controlled. They are, however, of little use for characterizing intracellular in vivo recordings, since the input to the neuron is not known. Here we propose a novel single neuron model that characterizes the statistical properties of in vivo recordings. More specifically, we propose a stochastic process in which the subthreshold membrane potential follows a Gaussian process and the spike emission intensity depends nonlinearly on the membrane potential as well as on the spiking history. We first show that the model has a rich dynamical repertoire, since it can capture arbitrary subthreshold autocovariance functions, firing-rate adaptation, and arbitrary shapes of the action potential. We then show that this model can be efficiently fitted to data without overfitting. Finally, we show that this model can be used to characterize, and therefore precisely compare, various intracellular in vivo recordings from different animals and experimental conditions.
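A generative sketch of such a model: an Ornstein-Uhlenbeck process (one convenient Gaussian process) for the subthreshold voltage, with an exponential spike-intensity nonlinearity. All constants are invented for illustration, and the spiking-history dependence is omitted:

```python
import numpy as np

rng = np.random.default_rng(4)

# Subthreshold membrane potential as an Ornstein-Uhlenbeck process
# (a Gaussian process); time constant and noise scale are illustrative.
dt, n = 1e-3, 5000
tau, sigma = 0.02, 3.0
v = np.zeros(n)
for t in range(1, n):
    v[t] = v[t-1] - (dt / tau) * v[t-1] + sigma * np.sqrt(dt) * rng.normal()

def intensity(volt, v_th=0.3, beta=5.0, rate0=5.0):
    """Spike emission intensity: a nonlinear (exponential) function
    of the membrane potential."""
    return rate0 * np.exp(beta * (volt - v_th))

# Bernoulli thinning approximation of the inhomogeneous point process.
spikes = rng.random(n) < intensity(v) * dt
print("simulated spike count:", int(spikes.sum()))
```

The full model additionally conditions the intensity on the spiking history, which gives firing-rate adaptation.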

14.

Background

Sensitivity and robustness are essential properties of circadian clock systems, enabling them to respond to the environment while resisting noisy variations. These properties should be recapitulated in computational models of the circadian clock. Highly nonlinear kinetics and multiple loops are often incorporated into models to match experimental time-series data, but these choices also affect the properties of the resulting clock models.

Methodology/Principal Findings

Here, we study the consequences of complicated structure and nonlinearity using simple Goodwin-type oscillators and the complex Arabidopsis circadian clock models. Sensitivity analysis of the simple oscillators implies that an interlocked multi-loop structure reinforces sensitivity/robustness properties, enhancing the response to external and internal variations. Furthermore, we found that reducing the degree of nonlinearity could sometimes enhance the robustness of models, implying that ad hoc incorporation of nonlinearity could be detrimental to a model's perceived credibility.

Conclusion

The correct multi-loop structure and degree of nonlinearity are therefore critical in contributing to the desired properties of a model as well as its capacity to match experimental data.

15.
The multidimensional computations performed by many biological systems are often characterized with limited information about the correlations between inputs and outputs. Given this limitation, our approach is to construct the maximum noise entropy response function of the system, leading to a closed-form and minimally biased model consistent with a given set of constraints on the input/output moments; the result is equivalent to conditional random field models from machine learning. For systems with binary outputs, such as neurons encoding sensory stimuli, the maximum noise entropy models are logistic functions whose arguments depend on the constraints. A constraint on the average output turns the binary maximum noise entropy models into minimum mutual information models, allowing for the calculation of the information content of the constraints and an information theoretic characterization of the system's computations. We use this approach to analyze the nonlinear input/output functions in macaque retina and thalamus; although these systems have been previously shown to be responsive to two input dimensions, the functional form of the response function in this reduced space had not been unambiguously identified. A second order model based on the logistic function is found to be both necessary and sufficient to accurately describe the neural responses to naturalistic stimuli, accounting for an average of 93% of the mutual information with a small number of parameters. Thus, despite the fact that the stimulus is highly non-Gaussian, the vast majority of the information in the neural responses is related to first and second order correlations. Our results suggest a principled and unbiased way to model multidimensional computations and determine the statistics of the inputs that are being encoded in the outputs.
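The second-order logistic form described above can be written directly: the spike probability is a logistic function of first- and second-order stimulus terms. The parameter values below are invented for illustration, not fitted to retina or thalamus data:

```python
import numpy as np

rng = np.random.default_rng(5)

def p_spike(s, a, b, C):
    """Second-order maximum noise entropy model for a binary output:
    P(spike | s) = logistic(a + b.s + s^T C s)."""
    z = a + b @ s + s @ C @ s
    return 1.0 / (1.0 + np.exp(-z))

a = -1.0                            # bias (average-output constraint)
b = np.array([1.5, -0.5])           # first-order (linear) weights
C = np.array([[0.5, 0.2],           # second-order (quadratic) weights
              [0.2, -0.3]])

s = rng.normal(size=2)              # a stimulus in the 2-D reduced space
p = p_spike(s, a, b, C)
print("spike probability:", round(float(p), 3))
```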

16.
This software note announces a new open‐source release of the Maxent software for modeling species distributions from occurrence records and environmental data, and describes a new R package for fitting such models. The new release (ver. 3.4.0) will be hosted online by the American Museum of Natural History, along with future versions. It contains small functional changes, most notably use of a complementary log‐log (cloglog) transform to produce an estimate of occurrence probability. The cloglog transform derives from the recently‐published interpretation of Maxent as an inhomogeneous Poisson process (IPP), giving it a stronger theoretical justification than the logistic transform which it replaces by default. In addition, the new R package, maxnet, fits Maxent models using the glmnet package for regularized generalized linear models. We discuss the implications of the IPP formulation in terms of model inputs and outputs, treating occurrence records as points rather than grid cells and interpreting the exponential Maxent model (raw output) as an estimate of relative abundance. With these two open‐source developments, we invite others to freely use and contribute to the software.
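The cloglog transform itself is simple. This sketch shows the generic form mapping the exponential-model output eta to an occurrence probability (Maxent additionally shifts eta by a fitted constant before transforming, which is omitted here):

```python
import numpy as np

def cloglog(eta):
    """Complementary log-log transform: 1 - exp(-exp(eta))."""
    return 1.0 - np.exp(-np.exp(eta))

print(round(float(cloglog(0.0)), 3))   # 0.632, i.e. 1 - 1/e
```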

17.
The light color of mice that inhabit the sandy dunes of Florida's coast has served as a textbook example of adaptation for nearly a century, despite the fact that the selective advantage of crypsis has never been directly tested or quantified in nature. Using plasticine mouse models of light and dark color, we demonstrate a strong selective advantage for mice that match their local background substrate. Further, our data suggest that stabilizing selection maintains color matching within a single habitat, as models that are either lighter or darker than their local environment are selected against. These results provide empirical evidence in support of the hypothesis that visually hunting predators shape color patterning in Peromyscus mice, and suggest a mechanism by which selection drives the pronounced color variation among populations.

18.
Although much of the information regarding gene expression is encoded in the genome, deciphering this information has been very challenging. We reexamined Beer and Tavazoie's (BT) approach to predicting the mRNA expression patterns of 2,587 genes in Saccharomyces cerevisiae from the information in their respective promoter sequences. Instead of fitting complex Bayesian network models, we trained naïve Bayes classifiers using only the sequence-motif matching scores provided by BT. Our simple models correctly predict expression patterns for 79% of the genes, based on the same criterion and the same cross-validation (CV) procedure as BT, which compares favorably to BT's 73% accuracy. The fact that our approach achieved higher prediction accuracy without using the position and orientation information of the predicted binding sites motivated us to investigate a few biological predictions made by BT. We found that some of their predictions, especially those related to motif orientations and positions, are at best circumstantial. For example, the combinatorial rules suggested by BT for the PAC and RRPE motifs are not unique to the cluster of genes from which the predictive model was inferred, and there are simpler rules that are statistically more significant than BT's. We also show that the CV procedure used by BT to estimate their method's prediction accuracy is inappropriate and may have overestimated the prediction accuracy by about 10%.
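A naïve Bayes classifier of the kind described can be sketched in a few lines. The 'motif scores' below are synthetic Gaussian features, not BT's actual scores, and the Gaussian class-conditional assumption is one simple choice:

```python
import numpy as np

rng = np.random.default_rng(6)

# Synthetic 'motif score' features for two expression classes: only the
# first two features carry signal.
n, d = 200, 5
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, d)) + y[:, None] * np.array([1.0, 0.8, 0, 0, 0])

def fit_gnb(X, y):
    """Per-class feature means, variances, and class priors."""
    params = {}
    for c in (0, 1):
        Xc = X[y == c]
        params[c] = (Xc.mean(0), Xc.var(0) + 1e-6, len(Xc) / len(y))
    return params

def predict_gnb(params, X):
    """Classify by the largest sum of per-feature Gaussian log-likelihoods
    plus the log prior (the naive independence assumption)."""
    scores = []
    for c in (0, 1):
        mu, var, prior = params[c]
        ll = -0.5 * (((X - mu) ** 2) / var + np.log(2 * np.pi * var))
        scores.append(ll.sum(1) + np.log(prior))
    return np.argmax(np.stack(scores, 1), 1)

acc = float((predict_gnb(fit_gnb(X, y), X) == y).mean())
print("training accuracy:", round(acc, 2))
```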

19.
With advances in sequencing technology, research in the field of landscape genetics can now be conducted at unprecedented spatial and genomic scales. This has been especially evident when using sequence data to visualize patterns of genetic differentiation across a landscape due to demographic history, including changes in migration. Two recent model‐based visualization methods that can highlight unusual patterns of genetic differentiation across a landscape, SpaceMix and EEMS, are increasingly used. While SpaceMix's model can infer long‐distance migration, EEMS' model is more sensitive to short‐distance changes in genetic differentiation, and it is unclear how these differences may affect their results in various situations. Here, we compare SpaceMix and EEMS side by side using landscape genetics simulations representing different migration scenarios. While both methods excel when patterns of simulated migration closely match their underlying models, they can produce either unintuitive or misleading results when the simulated migration patterns match their models less well, and this may be difficult to assess in empirical data sets. We also introduce unbundled principal components (un‐PC), a fast, model‐free method to visualize patterns of genetic differentiation by combining principal components analysis (PCA), which is already used in many landscape genetics studies, with the locations of sampled individuals. Un‐PC has characteristics of both SpaceMix and EEMS and works well with simulated and empirical data. Finally, we introduce msLandscape, a collection of tools that streamline the creation of customizable landscape‐scale simulations using the popular coalescent simulator ms and the conversion of the simulated data for use with un‐PC, SpaceMix and EEMS.

20.
Noninformative vision improves haptic spatial perception
Previous studies have attempted to map somatosensory space via haptic matching tasks and have shown that individuals make large and systematic matching errors, the magnitude and angular direction of which vary systematically through the workspace. Based upon such demonstrations, it has been suggested that haptic space is non-Euclidean. This conclusion assumes that spatial perception is modality specific, and it largely ignores the fact that tactile matching tasks involve active, exploratory arm movements. Here we demonstrate that, when individuals match two bar stimuli (i.e., make them parallel) in circumstances favoring extrinsic (visual) coordinates, providing noninformative visual information significantly increases the accuracy of haptic perception. In contrast, when individuals match the same bar stimuli in circumstances favoring the coding of movements in intrinsic (limb-based) coordinates, providing identical noninformative visual information either has no effect or decreases the accuracy of haptic perception. These results are consistent with optimal integration models of sensory integration, in which the weighting given to visual and somatosensory signals depends upon the precision of that information, and provide important evidence for the task-dependent integration of visual and somatosensory signals during the construction of a representation of peripersonal space.
