Similar Documents
20 similar documents found (search time: 15 ms)
1.
Multiple patient-specific parameters, such as wall thickness, wall strength, and constitutive properties, are required for the computational assessment of abdominal aortic aneurysm (AAA) rupture risk. Unfortunately, many of these quantities are not easily accessible and can only be determined by invasive procedures, which defeats the purpose of a noninvasive computational rupture risk assessment. This study investigates two different approaches to predicting these quantities using regression models in combination with a multitude of noninvasively accessible explanatory variables. We have gathered a large dataset comprising tensile tests performed on AAA specimens and supplementary patient information based on blood analysis, the patient's medical history, and geometric features of the AAAs. Using this unique database, we harness state-of-the-art Bayesian regression techniques to infer probabilistic models for multiple quantities of interest. After a brief presentation of our experimental results, we show that we can effectively reduce the predictive uncertainty in the assessment of several patient-specific parameters, most importantly the thickness and failure strength of the AAA wall. Notably, the more elaborate Bayesian regression approach based on Gaussian processes consistently outperforms standard linear regression. Moreover, our study contains a comparison to a previously proposed model for the wall strength.
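The core idea above, Gaussian process regression delivering a prediction together with a predictive uncertainty, can be illustrated with a minimal numpy sketch. The data, kernel, and hyperparameters here are illustrative stand-ins, not the study's model:

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, variance=1.0):
    # Squared-exponential kernel between the rows of A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / length_scale ** 2)

def gp_predict(X, y, Xs, noise=0.1):
    # GP posterior mean and pointwise variance at the test inputs Xs.
    K = rbf_kernel(X, X) + noise ** 2 * np.eye(len(X))
    Ks = rbf_kernel(Xs, X)
    alpha = np.linalg.solve(K, y)
    mean = Ks @ alpha
    cov = rbf_kernel(Xs, Xs) - Ks @ np.linalg.solve(K, Ks.T)
    return mean, np.diag(cov)

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(40, 1))                  # stand-in explanatory variable
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(40)   # stand-in quantity of interest
Xs = np.linspace(-3, 3, 50)[:, None]
mean, var = gp_predict(X, y, Xs)                      # prediction plus uncertainty
```

The posterior variance `var` shrinks near training points, which is exactly the "reduced predictive uncertainty" that such probabilistic models offer over a plain point-estimate regression.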

2.
Large contingency tables summarizing categorical variables arise in many areas. One example is in biology, where large numbers of biomarkers are cross-tabulated according to their discrete expression level. Interactions of the variables are of great interest and are generally studied with log-linear models. The structure of a log-linear model can be visually represented by a graph, from which the conditional independence structure can then be easily read off. However, since the number of parameters in a saturated model grows exponentially in the number of variables, this generally comes with a heavy computational burden. Even if we restrict ourselves to models of lower-order interactions or other sparse structures, we are faced with a large number of cells, which play the role of the sample size. This is in sharp contrast to high-dimensional regression or classification procedures because, in addition to a high-dimensional parameter, we also have to deal with the analogue of a huge sample size. Furthermore, high-dimensional tables naturally feature a large number of sampling zeros, which often leads to the nonexistence of the maximum likelihood estimate. We therefore present a decomposition approach, in which we first divide the problem into several lower-dimensional problems and then combine these to form a global solution. Our methodology is computationally feasible for log-linear interaction models with many categorical variables, some or all of which may have many levels. We demonstrate the proposed method on simulated data and apply it to a biomedical problem in cancer research.
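For intuition, the simplest log-linear model, independence in a two-way table, can be fitted by iterative proportional fitting (IPF), the classical margin-matching scheme underlying log-linear model fitting. This is a generic sketch on a toy table, not the decomposition algorithm of the abstract:

```python
import numpy as np

def ipf_independence(table, iters=50):
    # Fit the independence log-linear model to a two-way table by
    # alternately rescaling fitted values to match row and column margins.
    fitted = np.ones_like(table, dtype=float)
    for _ in range(iters):
        fitted *= table.sum(1, keepdims=True) / fitted.sum(1, keepdims=True)
        fitted *= table.sum(0, keepdims=True) / fitted.sum(0, keepdims=True)
    return fitted

obs = np.array([[30.0, 10.0], [20.0, 40.0]])
exp = ipf_independence(obs)
# For independence the MLE has the closed form row * col / total:
closed = np.outer(obs.sum(1), obs.sum(0)) / obs.sum()
```

For higher-order interaction models no closed form exists and IPF-style cycling over margins is the workhorse; the exponential growth in cells, and the sampling zeros mentioned above, are what make the naive approach infeasible in high dimensions.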

3.
The relative solvent accessibility (RSA) of an amino acid residue in a protein structure is a real number that represents the solvent exposed surface area of this residue in relative terms. The problem of predicting the RSA from the primary amino acid sequence can therefore be cast as a regression problem. Nevertheless, RSA prediction has so far typically been cast as a classification problem. Consequently, various machine learning techniques have been used within the classification framework to predict whether a given amino acid exceeds some (arbitrary) RSA threshold and would thus be predicted to be "exposed," as opposed to "buried." We have recently developed novel methods for RSA prediction using nonlinear regression techniques which provide accurate estimates of the real-valued RSA and outperform classification-based approaches with respect to commonly used two-class projections. However, while their performance seems to provide a significant improvement over previously published approaches, these Neural Network (NN) based methods are computationally expensive to train and involve several thousand parameters. In this work, we develop alternative regression models for RSA prediction which are computationally much less expensive, involve orders-of-magnitude fewer parameters, and are still competitive in terms of prediction quality. In particular, we investigate several regression models for RSA prediction using linear L1-support vector regression (SVR) approaches as well as standard linear least squares (LS) regression. Using rigorously derived validation sets of protein structures and extensive cross-validation analysis, we compare the performance of the SVR with that of LS regression and NN-based methods. In particular, we show that the flexibility of the SVR (as encoded by metaparameters such as the error insensitivity and the error penalization terms) can be very beneficial to optimize the prediction accuracy for buried residues. 
We conclude that the simple and computationally much more efficient linear SVR performs comparably to nonlinear models and can thus be used to facilitate further attempts to design more accurate RSA prediction methods, with applications to fold recognition and de novo protein structure prediction.
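A toy sketch of a linear ε-insensitive SVR, trained by subgradient descent and compared with ordinary least squares on synthetic data (this is an illustrative implementation under my own assumptions, not the authors' method; the ε and regularization terms are the "metaparameters" the abstract refers to):

```python
import numpy as np

def linear_svr(X, y, eps=0.1, lam=1e-3, lr=0.05, epochs=1000):
    # Minimize mean epsilon-insensitive loss + (lam/2)*||w||^2 by subgradient descent.
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        r = X @ w + b - y
        g = np.where(np.abs(r) > eps, np.sign(r), 0.0)  # subgradient of the loss
        w -= lr * (X.T @ g / n + lam * w)
        b -= lr * g.mean()
    return w, b

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.05 * rng.standard_normal(200)

w, b = linear_svr(X, y)                                   # epsilon-insensitive fit
wls, *_ = np.linalg.lstsq(np.c_[X, np.ones(200)], y, rcond=None)  # least squares
```

Both recover the underlying coefficients here; the practical appeal of the linear SVR, as in the abstract, is the tiny parameter count and the tunable error insensitivity.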

4.
Lee S, Lee BC, Kim D. Proteins 2006, 62(4):1107-1114
Determining protein structure and inferring function from structure are among the central problems of computational structural biology, and the first step is often the study of protein secondary structure. There have been many attempts to predict protein secondary structure content. Previous attempts assumed that the content of protein secondary structure can be predicted successfully using information on the amino acid composition of a protein. Recent methods achieved remarkable prediction accuracy by using expanded composition information. The overall average error of the most successful method is 3.4%. Here, we demonstrate that even if we use the simple amino acid composition alone, it is possible to improve the prediction accuracy significantly if evolutionary information is included. The idea is motivated by the observation that evolutionarily related proteins share similar structures. After calculating the homolog-averaged amino acid composition of a protein, which can be easily obtained from the multiple sequence alignment produced by running PSI-BLAST, those 20 numbers are used as input to multiple linear regression, an artificial neural network, and support vector regression. The overall average error of the support vector regression method is 3.3%. It is remarkable that we obtain comparable accuracy without utilizing expanded composition information such as pair-coupled amino acid composition. This work again demonstrates that the amino acid composition is a fundamental characteristic of a protein. It is anticipated that this idea can be applied to many areas of protein bioinformatics where amino acid composition information is utilized, such as subcellular localization prediction, enzyme subclass prediction, domain boundary prediction, signal sequence prediction, and prediction of unfolded segments in a protein sequence, to name a few.
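The regression step is simple enough to sketch end to end. Below, the "compositions" and "propensities" are synthetic placeholders; the point is only the pipeline shape: 20 composition numbers per protein, fitted to a content value by multiple linear regression:

```python
import numpy as np

rng = np.random.default_rng(2)
n_prot, n_aa = 300, 20
# Hypothetical homolog-averaged amino acid compositions (each row sums to 1).
comp = rng.dirichlet(np.ones(n_aa), size=n_prot)
true_coef = rng.uniform(0, 1, n_aa)           # hypothetical helix propensities
helix = comp @ true_coef + 0.01 * rng.standard_normal(n_prot)

# Multiple linear regression of secondary structure content on composition.
A = np.c_[comp, np.ones(n_prot)]
coef, *_ = np.linalg.lstsq(A, helix, rcond=None)
mae = np.mean(np.abs(A @ coef - helix))       # overall average error
```

In the paper the same 20 inputs also feed a neural network and a support vector regression; the linear fit above is the baseline those methods are compared against.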

5.
6.
Vesicle docking in regulated exocytosis
In electron micrographs, many secretory and synaptic vesicles are found 'docked' at the target membrane, but it is unclear why and how. It is generally assumed that docking is a necessary first step in the secretory pathway before vesicles can acquire fusion competence (through 'priming'), but recent studies challenge this. New biophysical methods have become available to detect how vesicles are tethered at the target membrane, and genetic manipulations have implicated many genes in tethering, docking and priming. However, these studies have not yet led to consistent working models for these steps. In this study, we review recent attempts to characterize these early steps and the cellular factors that orchestrate them. We discuss whether assays for docking, tethering and priming report on the same phenomena and whether all vesicles necessarily follow the same linear docking-priming-fusion pathway. We conclude that most evidence to date is consistent with such a linear pathway, assuming several refinements that imply that some vesicles can be nonfunctionally docked ('dead-end' docking) or, conversely, that the linear pathway can be greatly accelerated (crash fusion).

7.
Much remains unknown about the computational role of inhibitory cells in the sensory cortex. While modeling studies could potentially shed light on the critical role played by inhibition in cortical computation, there is a gap between the simplicity of many models of sensory coding and the biological complexity of the inhibitory subpopulation. In particular, many models do not respect the fact that inhibition must be implemented in a separate subpopulation, with those inhibitory interneurons having a diversity of tuning properties and characteristic E/I cell ratios. In this study we demonstrate a computational framework for implementing inhibition in dynamical systems models that better respects these biophysical observations about inhibitory interneurons. The main approach leverages recent work on decomposing matrices into low-rank and sparse components via convex optimization, and explicitly exploits the fact that models and input statistics often have low-dimensional structure that permits efficient implementations. While this approach is applicable to a wide range of sensory coding models (including a family of models based on Bayesian inference in a linear generative model), for concreteness we demonstrate the approach on a network implementing sparse coding. We show that the resulting implementation stays faithful to the original coding goals while using inhibitory interneurons that are much more biophysically plausible.
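The low-rank-plus-sparse decomposition at the heart of this framework can be sketched with a simple alternating-projections heuristic (the paper uses convex optimization; this toy version, with a known rank and a hand-picked threshold, is my own simplification for illustration):

```python
import numpy as np

def lowrank_sparse_split(M, rank, thresh, iters=50):
    # Alternate a truncated SVD (low-rank part) with hard
    # thresholding of the residual (sparse part).
    S = np.zeros_like(M)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(M - S, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        R = M - L
        S = np.where(np.abs(R) > thresh, R, 0.0)
    return L, S

rng = np.random.default_rng(3)
u = rng.uniform(0.5, 1.5, 30)
v = rng.uniform(0.5, 1.5, 30)
L_true = np.outer(u, v)                                  # rank-1 structure
S_true = np.zeros((30, 30))
S_true.flat[rng.choice(900, 20, replace=False)] = 10.0   # sparse outliers
M = L_true + S_true
L_est, S_est = lowrank_sparse_split(M, rank=1, thresh=3.0)
```

In the model of the abstract, the low-rank part captures the dominant low-dimensional structure of connectivity and input statistics, while the sparse part absorbs what cannot be expressed that way.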

8.
Leung Lai T, Shih MC, Wong SP. Biometrics 2006, 62(1):159-167
To circumvent the computational complexity of likelihood inference in generalized mixed models that assume linear or more general additive regression models of covariate effects, Laplace's approximations to multiple integrals in the likelihood have been commonly used without addressing the issue of adequacy of the approximations for individuals with sparse observations. In this article, we propose a hybrid estimation scheme to address this issue. The likelihoods for subjects with sparse observations use Monte Carlo approximations involving importance sampling, while Laplace's approximation is used for the likelihoods of other subjects that satisfy a certain diagnostic check on the adequacy of Laplace's approximation. Because of its computational tractability, the proposed approach allows flexible modeling of covariate effects by using regression splines and model selection procedures for knot and variable selection. Its computational and statistical advantages are illustrated by simulation and by application to longitudinal data from a fecundity study of fruit flies, for which overdispersion is modeled via a double exponential family.
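The trade-off between the two approximation schemes is easy to see on a one-dimensional integral with a known answer. Here the integrand is the Gamma(5) kernel, so the exact integral is 4! = 24; Laplace's approximation is slightly biased, while importance sampling with the Laplace Gaussian as proposal is unbiased (a generic textbook illustration, not the paper's hybrid scheme or diagnostic):

```python
import math
import random

# Integrand of Gamma(a): f(u) = u**(a-1) * exp(-u); exact integral is 24 for a = 5.
a = 5.0
def log_f(u):
    return (a - 1) * math.log(u) - u

# Laplace: second-order expansion of log f around its mode u* = a - 1.
u_star = a - 1
h = (a - 1) / u_star ** 2              # -(d2/du2) log_f at the mode
laplace = math.exp(log_f(u_star)) * math.sqrt(2 * math.pi / h)

# Monte Carlo importance sampling with the Laplace Gaussian as proposal.
random.seed(0)
n = 50_000
sigma = math.sqrt(1 / h)
log_norm = math.log(sigma * math.sqrt(2 * math.pi))
total = 0.0
for _ in range(n):
    u = random.gauss(u_star, sigma)
    if u > 0:                          # the integrand lives on (0, inf)
        log_q = -0.5 * ((u - u_star) / sigma) ** 2 - log_norm
        total += math.exp(log_f(u) - log_q)
mc = total / n
```

With sparse data the effective "shape" parameter is small, the integrand is far from Gaussian, and the Laplace bias grows; that is exactly the regime in which the proposed scheme switches to the Monte Carlo estimate.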

9.
It is quite difficult to construct circuits of spiking neurons that can carry out complex computational tasks. On the other hand, even randomly connected circuits of spiking neurons can in principle be used for complex computational tasks such as time-warp-invariant speech recognition. This is possible because such circuits have an inherent tendency to integrate incoming information in such a way that simple linear readouts can be trained to transform the current circuit activity into the target output for a very large number of computational tasks. Consequently, we propose to analyze circuits of spiking neurons in terms of their roles as analog fading memory and non-linear kernels, rather than as implementations of specific computational operations and algorithms. This article is a sequel to [W. Maass, T. Natschläger, H. Markram, Real-time computing without stable states: a new framework for neural computation based on perturbations, Neural Comput. 14 (11) (2002) 2531-2560], and contains new results about the performance of generic neural microcircuit models for the recognition of speech that is subject to linear and non-linear time-warps, as well as for computations on time-varying firing rates. These computations rely, apart from general properties of generic neural microcircuit models, just on the capabilities of simple linear readouts trained by linear regression. This article also provides detailed data on the fading memory property of generic neural microcircuit models, and a quick review of other new results on the computational power of such circuits of spiking neurons.
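The "generic circuit plus trained linear readout" principle can be sketched with a rate-based echo state network rather than spiking neurons (a common simplification; everything here is illustrative). The untrained random recurrent network provides a fading memory of the input stream, and a ridge-regression readout recovers the input from several time steps earlier:

```python
import numpy as np

rng = np.random.default_rng(4)
N, T, delay, wash = 200, 2000, 5, 100
W = rng.standard_normal((N, N)) / np.sqrt(N)
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # spectral radius < 1: fading memory
w_in = 0.5 * rng.standard_normal(N)

u = rng.uniform(-1, 1, T)                 # continuous input stream
states = np.empty((T, N))
x = np.zeros(N)
for t in range(T):
    x = np.tanh(W @ x + w_in * u[t])      # generic, untrained circuit dynamics
    states[t] = x

target = np.r_[np.zeros(delay), u[:-delay]]       # recall the input 5 steps back
A, b = states[wash:1500], target[wash:1500]
w_out = np.linalg.solve(A.T @ A + 1e-6 * np.eye(N), A.T @ b)  # linear readout
pred = states[1500:] @ w_out
nmse = np.mean((pred - target[1500:]) ** 2) / np.var(target[1500:])
```

Only `w_out` is trained, by plain linear regression; the recurrent weights stay random, which is the essence of the fading-memory/kernel view of such circuits.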

10.
The developments in biochemistry and molecular biology over the past 30 years have produced an impressive parts list of cellular components. It has become increasingly clear that we need to understand how components come together to form systems. One area where this approach has been growing is cell signalling research. Here, instead of focusing on individual or small groups of signalling proteins, researchers are now using a more holistic perspective. This approach attempts to view how many components work together in concert to process information and to orchestrate cellular phenotypic changes. Additionally, the experimental techniques for measuring and visualizing many cellular components at once continue to grow in diversity and accuracy. The multivariate data produced by these experiments introduce new and exciting challenges for computational biologists, who develop models of cellular systems made up of interacting cellular components. The integration of high-throughput experimental results and information from the legacy literature is expected to produce computational models that will rapidly enhance our understanding of the detailed workings of mammalian cells.

11.
Quantitative predictions in computational life sciences are often based on regression models. The advent of machine learning has led to highly accurate regression models that have gained widespread acceptance. While there are statistical methods available to estimate the global performance of regression models on a test or training dataset, it is often not clear how well this performance transfers to other datasets or how reliable an individual prediction is, a fact that often reduces a user's trust in a computational method. In analogy to the concept of an experimental error, we sketch how estimators for individual prediction errors can be used to provide confidence intervals for individual predictions. Two novel statistical methods, named CONFINE and CONFIVE, can estimate the reliability of an individual prediction based on the local properties of nearby training data. The methods can be applied equally to linear and non-linear regression methods with very little computational overhead. We compare our confidence estimators with other existing confidence and applicability-domain estimators on two biologically relevant problems (MHC-peptide binding prediction and quantitative structure-activity relationship (QSAR) modelling). Our results suggest that the proposed confidence estimators perform comparably to or better than previously proposed estimation methods. Given a sufficient amount of training data, the estimators exhibit error estimates of high quality. In addition, we observed that the quality of estimated confidence intervals is predictable. We discuss how confidence estimation is influenced by noise, the number of features, and the dataset size. Estimating the confidence in an individual prediction in terms of error intervals represents an important step from plain, non-informative predictions towards transparent and interpretable predictions that will help to improve the acceptance of computational methods in the biological community.
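A generic sketch of the underlying idea, estimating an individual prediction's error bar from the residuals of nearby training points (this is a plain k-nearest-neighbour heuristic of my own, not the actual CONFINE or CONFIVE formulas):

```python
import numpy as np

def local_error_estimate(x_query, X_train, resid, k=10):
    # Error bar = RMS residual among the k training points closest to the query.
    d = np.linalg.norm(X_train - x_query, axis=1)
    nearest = np.argsort(d)[:k]
    return float(np.sqrt(np.mean(resid[nearest] ** 2)))

rng = np.random.default_rng(5)
X = rng.uniform(-2, 2, size=(500, 1))
noise_sd = np.where(X[:, 0] > 0, 0.5, 0.05)   # heteroscedastic noise
y = X[:, 0] + noise_sd * rng.standard_normal(500)

# Fit a simple linear model and keep its training residuals.
A = np.c_[X, np.ones(500)]
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ coef

hi = local_error_estimate(np.array([1.5]), X, resid)   # noisy region
lo = local_error_estimate(np.array([-1.5]), X, resid)  # quiet region
```

The estimator correctly reports a wider error bar in the noisy region than in the quiet one, which is the kind of individual, local confidence information the abstract argues for.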

12.
Regulated transport of the plant hormone auxin is central to many aspects of plant development. Directional transport, mediated by membrane transporters, produces patterns of auxin distribution in tissues that trigger developmental processes, such as vascular patterning or leaf formation. Experimentation has produced a wealth of largely qualitative data providing strong evidence for multiple feedback systems between auxin and its transport. However, the exact mechanisms remain elusive and the experiments required to evaluate alternative hypotheses are challenging. Because of this, computational modelling now plays an important role in auxin transport research. Here we review some current approaches and underlying assumptions of computational auxin transport models. We focus on self-organising models for polar auxin transport and on recent attempts to unify conflicting mechanistic explanations. In addition, we discuss in general how these computer simulations are proving to be increasingly effective in hypothesis generation and testing, and how simulation can be used to direct future experiments.
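The basic ingredient shared by the models reviewed here, diffusion plus carrier-mediated polar transport, can be shown in a deliberately minimal one-dimensional toy simulation (a file of cells with a PIN-like rightward pump; all parameters are hypothetical, and no published model is reproduced):

```python
import numpy as np

n_cells, steps, dt = 20, 20000, 0.01
D, P = 0.1, 0.5                     # diffusion and polar transport rates
a = np.ones(n_cells)                # auxin per cell, closed file of cells
for _ in range(steps):
    flux = np.zeros(n_cells + 1)    # flux[i]: rightward flow across boundary i
    flux[1:-1] = -D * (a[1:] - a[:-1])   # diffusion down the gradient
    flux[1:-1] += P * a[:-1]             # PIN-like carriers pump rightward
    a += dt * (flux[:-1] - flux[1:])     # conservative update per cell
```

Even this caricature self-organises a steep auxin maximum at the pumped end while conserving total auxin, illustrating how directional transport alone can generate patterned distributions.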

13.
CRISCAT is a computer program for the analysis of grouped survival data with competing risks via weighted least squares methods. Competing risks adjustments are obtained from general matrix operations using many of the strategies employed in a previously developed program (GENCAT) for multivariate categorical data. CRISCAT computes survival rates at several time points for multiple causes of failure, where each rate is adjusted for other causes in the sense that failure due to these other causes has been eliminated as a risk. The program can generate functions of the adjusted survival rates, to which asymptotic regression models may be fit. CRISCAT yields test statistics for hypotheses involving either these functions or estimated model parameters. Thus, this computational algorithm links competing risks theory to linear models methods for contingency table analysis and provides a unified approach to estimation and hypothesis testing of functions involving competing risks adjusted rates.
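The notion of a cause-eliminated survival rate can be illustrated on a tiny grouped dataset (a classical actuarial sketch assuming independent risks and no censoring; the data are hypothetical and this is not CRISCAT's weighted least squares machinery):

```python
# Grouped follow-up data: per interval, number entering, deaths from
# cause 1 and deaths from cause 2 (hypothetical numbers, no censoring).
intervals = [
    (1000, 30, 10),
    (960, 25, 15),
    (920, 20, 20),
]

s_all, s_elim = 1.0, 1.0
for n, d1, d2 in intervals:
    s_all *= 1 - (d1 + d2) / n      # all-cause survival
    s_elim *= 1 - d1 / n            # survival with cause 2 eliminated as a risk
```

The cause-eliminated rate `s_elim` is necessarily at least as large as the crude all-cause rate `s_all`; CRISCAT's contribution is to estimate such adjusted rates with proper standard errors and to fit regression models to functions of them.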

14.
Simple cells in primary visual cortex were famously found to respond to low-level image components such as edges. Sparse coding and independent component analysis (ICA) emerged as the standard computational models for simple cell coding because they linked their receptive fields to the statistics of visual stimuli. However, a salient feature of image statistics, the occlusion of image components, is not considered by these models. Here we ask whether occlusions have an effect on the predicted shapes of simple cell receptive fields. We use a comparative approach to answer this question and investigate two models for simple cells: a standard linear model and an occlusive model. For both models we simultaneously estimate optimal receptive fields, sparsity and stimulus noise. The two models are identical except for their component superposition assumption. We find that the image encoding and receptive fields predicted by the models differ significantly. While both models predict many Gabor-like fields, the occlusive model predicts a much sparser encoding and high percentages of 'globular' receptive fields. This relatively new center-surround type of simple cell response has been observed in experimental studies since reverse correlation came into use. While high percentages of 'globular' fields can be obtained using specific choices of sparsity and overcompleteness in linear sparse coding, no or only low proportions are reported in the vast majority of studies on linear models (including all ICA models). Likewise, for the linear model investigated here with optimal sparsity, only low proportions of 'globular' fields are observed. In comparison, the occlusive model robustly infers high proportions and can match the experimentally observed high proportions of 'globular' fields well. Our computational study therefore suggests that 'globular' fields may be evidence for an optimal encoding of visual occlusions in primary visual cortex.
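The linear sparse coding baseline that the occlusive model is compared against amounts to an L1-penalized inference over dictionary coefficients. A minimal sketch using ISTA (iterative soft thresholding) on a random dictionary, as a generic stand-in for the models discussed, not the paper's estimation procedure:

```python
import numpy as np

def ista(D, x, lam=0.1, iters=200):
    # Minimize 0.5*||x - D a||^2 + lam*||a||_1 by iterative soft thresholding.
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(iters):
        g = D.T @ (D @ a - x)
        z = a - g / L
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return a

rng = np.random.default_rng(6)
D = rng.standard_normal((64, 128))
D /= np.linalg.norm(D, axis=0)             # unit-norm dictionary "receptive fields"
a_true = np.zeros(128)
a_true[rng.choice(128, 5, replace=False)] = rng.uniform(1, 2, 5)
x = D @ a_true                             # image patch = linear superposition
a_hat = ista(D, x)                         # sparse coefficients inferred
```

The linear superposition `x = D a` in the generative step is precisely the assumption the occlusive model replaces; changing it changes which receptive field shapes the learned dictionary prefers.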

15.
Simultaneous inference in general parametric models
Simultaneous inference is a common problem in many areas of application. If multiple null hypotheses are tested simultaneously, the probability of erroneously rejecting at least one of them increases beyond the pre-specified significance level. Simultaneous inference procedures that adjust for multiplicity, and thus control the overall type I error rate, have to be used. In this paper we describe simultaneous inference procedures in general parametric models, where the experimental questions are specified through a linear combination of elemental model parameters. The framework described here is quite general and extends the canonical theory of multiple comparison procedures in ANOVA models to linear regression problems, generalized linear models, linear mixed-effects models, the Cox model, robust linear models, etc. Several examples using a variety of different statistical models illustrate the breadth of the results. For the analyses we use the R add-on package multcomp, which provides a convenient interface to the general approach adopted here.

16.

Background

Early and accurate identification of adverse drug reactions (ADRs) is critically important for drug development and clinical safety. Computer-aided prediction of ADRs has attracted increasing attention in recent years, and many computational models have been proposed. However, because of the lack of systematic analysis and comparison of the different computational models, there remain limitations in designing more effective algorithms and selecting more useful features. There is therefore an urgent need to review and analyze previous computational models, in order to obtain general conclusions that can guide the construction of more effective computational models for predicting ADRs.

Principal Findings

In the current study, the main aim is to compare and analyze the performance of existing computational methods for predicting ADRs, also implementing and evaluating algorithms that were previously used for predicting drug targets. Our results indicate that topological and intrinsic features are complementary to an extent, and that the Jaccard coefficient has an important and general effect on the prediction of drug-ADR associations. By comparing the structures of the algorithms, we found that their final formulas could all be converted to linear form; based on this finding, we propose a new algorithm, the general weighted profile method, which yielded the best overall performance among the algorithms investigated in this paper.

Conclusion

Several meaningful conclusions and useful findings regarding the prediction of ADRs are provided to guide the selection of optimal features and algorithms.
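The role of the Jaccard coefficient in drug-ADR association scoring can be sketched with a toy weighted-profile predictor: a candidate ADR for a query drug is scored by the Jaccard-weighted vote of drugs with known ADR profiles. The drug and ADR names are hypothetical, and this generic scheme is only an illustration, not the paper's general weighted profile method:

```python
def jaccard(a, b):
    # Jaccard similarity between two sets.
    union = len(a | b)
    return len(a & b) / union if union else 0.0

# Hypothetical known drug -> ADR profiles.
known = {
    "drugA": {"nausea", "rash", "headache"},
    "drugB": {"nausea", "rash"},
    "drugC": {"dizziness"},
}

def weighted_profile_score(query_adrs, candidate_adr):
    # Score a candidate ADR from the Jaccard-weighted profiles of known drugs.
    num = den = 0.0
    for adrs in known.values():
        w = jaccard(query_adrs, adrs)
        num += w * (candidate_adr in adrs)
        den += w
    return num / den if den else 0.0

s1 = weighted_profile_score({"nausea", "rash"}, "headache")
s2 = weighted_profile_score({"nausea", "rash"}, "dizziness")
```

Drugs with ADR profiles similar to the query dominate the vote, so "headache" (shared by a close neighbour) scores higher than "dizziness" (seen only in a dissimilar drug).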

17.
18.
Chen Yan, Song Yuqin, Wang Wei. Acta Ecologica Sinica (生态学报) 2018, 38(7):2384-2394
As a country with vast grassland resources, China faces severe grassland degradation. Understanding the historical trends of grassland vegetation cover is the basis for identifying the drivers of degradation and assessing its risk. Most existing studies estimate vegetation cover with parametric regression methods, without fully considering their demanding statistical assumptions. Using Landsat satellite imagery and ground-based vegetation cover monitoring data, we built a nonparametric regression model, random forest regression, and compared it with traditional linear regression; on this basis we applied the random forest model to estimate the trend of grassland vegetation cover in Burqin County over the past decade and analyzed the uncertainty of the results. The results show that traditional linear regression can hardly satisfy its basic statistical assumptions, whereas the random forest model requires no such assumption tests and also predicts more accurately than the widely used linear models. Retrievals based on Landsat ETM+ standard data were generally lower than those from TM and OLI data; although surface reflectance data can greatly reduce the influence of sensor differences on the retrievals, an uncertainty of about ±10% remains. The study area contains many grassland types; to improve retrieval accuracy, follow-up work should compute vegetation indices separately for each type and further reduce the uncertainty introduced by sensor differences.
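The nonparametric-versus-linear comparison can be reproduced in miniature with a hand-rolled bagged regression tree ensemble on synthetic nonlinear data (a simplified stand-in for random forest regression, with my own synthetic "vegetation index to cover" relationship; no remote sensing data are modeled here):

```python
import numpy as np

def build_tree(X, y, depth, min_leaf=5):
    # Greedy CART-style regression tree; leaves store the mean response.
    if depth == 0 or len(y) < 2 * min_leaf:
        return float(y.mean())
    best = None
    for j in range(X.shape[1]):
        order = np.argsort(X[:, j])
        xs, ys = X[order, j], y[order]
        for i in range(min_leaf, len(y) - min_leaf):
            if xs[i] == xs[i - 1]:
                continue
            sse = ys[:i].var() * i + ys[i:].var() * (len(y) - i)
            if best is None or sse < best[0]:
                best = (sse, j, (xs[i] + xs[i - 1]) / 2)
    if best is None:
        return float(y.mean())
    _, j, thr = best
    left = X[:, j] <= thr
    return (j, thr, build_tree(X[left], y[left], depth - 1, min_leaf),
            build_tree(X[~left], y[~left], depth - 1, min_leaf))

def predict_tree(tree, x):
    while isinstance(tree, tuple):
        j, thr, l, r = tree
        tree = l if x[j] <= thr else r
    return tree

def forest_predict(trees, X):
    return np.array([np.mean([predict_tree(t, x) for t in trees]) for x in X])

rng = np.random.default_rng(8)
X = rng.uniform(0, 1, size=(400, 2))
y = np.sin(6 * X[:, 0]) + 0.1 * rng.standard_normal(400)  # nonlinear response
Xte = rng.uniform(0, 1, size=(200, 2))
yte = np.sin(6 * Xte[:, 0])

trees = []
for _ in range(20):                          # bootstrap-aggregated trees
    idx = rng.integers(0, 400, 400)
    trees.append(build_tree(X[idx], y[idx], depth=4))
rf_mse = np.mean((forest_predict(trees, Xte) - yte) ** 2)

A = np.c_[X, np.ones(400)]
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
lin_mse = np.mean((np.c_[Xte, np.ones(200)] @ coef - yte) ** 2)
```

On a nonlinear response the tree ensemble beats the linear fit without any distributional assumptions, which mirrors the advantage reported for random forest regression in this study.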

19.
Rapid advances in molecular genetics push the need for efficient data analysis. Advanced algorithms are necessary for extracting all possible information from large experimental data sets. We present a general linear algebra framework for quantitative trait loci (QTL) mapping, using both linear regression and maximum likelihood estimation. The formulation simplifies future comparisons between, and theoretical analyses of, the methods. We show how the common structure of QTL analysis models can be used to improve the kernel algorithms, drastically reducing the computational effort while retaining the original analysis results. We have evaluated our new algorithms on data sets originating from two large F(2) populations of domestic animals. Using an updating approach, we show that a 1-3 orders of magnitude reduction in computational demand can be achieved for matrix factorizations. For interval-mapping/composite-interval-mapping settings using a maximum likelihood model, we also show how to use the original EM algorithm instead of the ECM approximation, significantly improving the convergence and further reducing the computational time. The algorithmic improvements make it feasible to perform analyses which have previously been deemed impractical or even impossible. For example, using the new algorithms, it is reasonable to perform permutation testing using exhaustive search on populations of 200 individuals using an epistatic two-QTL model.
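The flavour of reusing a factorization across many related regressions can be shown in a toy genome scan: factor the fixed covariates once with QR, then each marker's effect is a scalar computation on residualized data, which agrees exactly with a full per-marker regression by the Frisch-Waugh-Lovell theorem (a generic illustration under synthetic data, not the paper's updating algorithms):

```python
import numpy as np

rng = np.random.default_rng(9)
n, n_markers = 200, 500
C = np.c_[np.ones(n), rng.standard_normal((n, 3))]          # fixed covariates
M = rng.integers(0, 3, size=(n, n_markers)).astype(float)   # genotypes 0/1/2
beta_true = np.zeros(n_markers)
beta_true[42] = 1.0                                         # one causal marker
y = C @ np.array([1.0, 0.5, -0.5, 0.2]) + M @ beta_true \
    + 0.5 * rng.standard_normal(n)

# Factor the fixed part ONCE, then reuse it for every marker.
Q, _ = np.linalg.qr(C)
y_r = y - Q @ (Q.T @ y)              # residualized phenotype
M_r = M - Q @ (Q.T @ M)              # residualized genotypes
beta = (M_r.T @ y_r) / (M_r ** 2).sum(axis=0)  # all 500 effects at once

# Equivalence check against one full joint regression for marker 42.
full, *_ = np.linalg.lstsq(np.c_[C, M[:, 42]], y, rcond=None)
```

One QR factorization replaces 500 separate least-squares solves, the same kind of shared-structure saving that the abstract reports at a much larger scale.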

20.
It has previously been shown that generic cortical microcircuit models can perform complex real-time computations on continuous input streams, provided that these computations can be carried out with a rapidly fading memory. We investigate the computational capability of such circuits in the more realistic case where not only readout neurons, but in addition a few neurons within the circuit, have been trained for specific tasks. This is essentially equivalent to the case where the output of trained readout neurons is fed back into the circuit. We show that this new model overcomes the limitation of a rapidly fading memory. In fact, we prove that in the idealized case without noise it can carry out any conceivable digital or analog computation on time-varying inputs. But even with noise, the resulting computational model can perform a large class of biologically relevant real-time computations that require a nonfading memory. We demonstrate these computational implications of feedback both theoretically, and through computer simulations of detailed cortical microcircuit models that are subject to noise and have complex inherent dynamics. We show that the application of simple learning procedures (such as linear regression or perceptron learning) to a few neurons enables such circuits to represent time over behaviorally relevant long time spans, to integrate evidence from incoming spike trains over longer periods of time, and to process new information contained in such spike trains in diverse ways according to the current internal state of the circuit. In particular we show that such generic cortical microcircuits with feedback provide a new model for working memory that is consistent with a large set of biological constraints. Although this article examines primarily the computational role of feedback in circuits of neurons, the mathematical principles on which its analysis is based apply to a variety of dynamical systems. 
Hence they may also throw new light on the computational role of feedback in other complex biological dynamical systems, such as genetic regulatory networks.
