Similar articles
 20 similar articles found (search time: 15 ms)
1.
The recent increase in the high-throughput capacity of 'omics datasets, combined with advances and interest in machine learning (ML), has created great opportunities for systems metabolic engineering. In this regard, data-driven modeling methods have become increasingly valuable to metabolic strain design. In this review, the nature of 'omics is discussed and a broad introduction is provided to the ML algorithms that combine these datasets into predictive models of metabolism and metabolic rewiring. Next, this review highlights recent work in the literature that utilizes such data-driven methods to inform various metabolic engineering efforts for different classes of application, including product maximization, understanding and profiling phenotypes, de novo metabolic pathway design, and creation of robust system-scale models for biotechnology. Overall, this review aims to highlight the potential and promise of using ML algorithms with metabolic engineering and systems biology related datasets.

2.
Background

Statistical models are regularly used in the forecasting and surveillance of infectious diseases to guide public health. Variable selection assists in determining factors associated with disease transmission; however, the evaluation and suitability of the statistical model used in forecasting disease transmission and outbreaks is often overlooked in this process. Here we aim to evaluate several modelling methods to optimise predictive modelling of Ross River virus (RRV) disease notifications and outbreaks in epidemiologically important regions of Victoria and Western Australia.

Methodology/Principal findings

We developed several statistical methods using meteorological and RRV surveillance data from July 2000 until June 2018 in Victoria and from July 1991 until June 2018 in Western Australia. Models were developed for 11 Local Government Areas (LGAs) in Victoria and seven LGAs in Western Australia. We found generalised additive models and generalised boosted regression models, and generalised additive models and negative binomial models to be the best fit models when predicting RRV outbreaks and notifications, respectively. No association was found with a model’s ability to predict RRV notifications in LGAs with greater RRV activity, or for outbreak predictions to have a higher accuracy in LGAs with greater RRV notifications. Moreover, we assessed the use of factor analysis to generate independent variables used in predictive modelling. In the majority of LGAs, this method did not result in better model predictive performance.

Conclusions/Significance

We demonstrate that models which are developed and used for predicting disease notifications may not be suitable for predicting disease outbreaks, or vice versa. Furthermore, poor predictive performance in modelling disease transmissions may be the result of inappropriate model selection methods. Our findings provide approaches and methods to facilitate the selection of the best fit statistical model for predicting mosquito-borne disease notifications and outbreaks used for disease surveillance.

3.
Background

Machine learning (ML) has been gradually integrated into oncologic research but seldom applied to predict cervical cancer (CC), and no model has been reported to predict survival and site-specific recurrence simultaneously. Thus, we aimed to develop ML models to predict survival and site-specific recurrence in CC and to guide individual surveillance.

Methods

We retrospectively collected data on CC patients from 2006 to 2017 in four hospitals. The survival or recurrence predictive value of the variables was analyzed using multivariate Cox, principal component, and K-means clustering analyses. The predictive performances of eight ML models were compared with logistic or Cox models. A novel web-based predictive calculator was developed based on the ML algorithms.

Results

This study included 5112 women for analysis (268 deaths, 343 recurrences): (1) For site-specific recurrence, larger tumor size was associated with local recurrence, while positive lymph nodes were associated with distant recurrence. (2) The ML models exhibited better prognostic predictive performance than traditional models. (3) The ML models were superior to traditional models when multiple variables were used. (4) A novel predictive web-based calculator was developed and externally validated to predict survival and site-specific recurrence.

Conclusion

ML models might be a better analytic approach in CC prognostic prediction than traditional models as they can predict survival and site-specific recurrence simultaneously, especially when using multiple variables. Moreover, our novel web-based calculator may provide clinicians with useful information and help them make individual postoperative follow-up plans and further treatment strategies.

4.
Vänttinen, T, Blomqvist, M, Nyman, K, and Häkkinen, K. Changes in body composition, hormonal status, and physical fitness in 11-, 13-, and 15-year-old Finnish regional youth soccer players during a two-year follow-up. J Strength Cond Res 25(12): 3342-3351, 2011. The purpose of this study was to examine the changes in body composition, hormonal status, and physical fitness in 10.8 ± 0.3-year-old (n = 13), 12.7 ± 0.2-year-old (n = 14), and 14.7 ± 0.3-year-old (n = 12) Finnish regional youth soccer players during a 2-year monitoring period and to compare physical fitness characteristics of soccer players with those of age-matched controls (10.7 ± 0.3 years, n = 13; 14.7 ± 0.3 years, n = 10) not participating in soccer. Body composition was measured in terms of height, weight, muscle mass, percentage of body fat, and lean body weight of trunk, legs, and arms. Hormonal status was monitored by concentrations of serum testosterone and cortisol. Physical fitness was measured in terms of sprinting speed, agility, isometric maximal strength (leg extensors, abdominal, back, grip), explosive strength, and endurance. Age-related development was detected in all other measured variables except in the percentage of body fat. The results showed that the physical fitness of regional soccer players was better than that of the control groups in all age groups, especially in cardiovascular endurance (p < 0.01-0.001) and in agility (p < 0.01-0.001). In conclusion, playing in a regional level soccer team seems to provide training adaptation, which is beyond normal development and which in all likelihood leads to positive health effects over a prolonged period of time.

5.
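Objective

To construct a noninvasive diagnostic model for predicting liver fibrosis in chronic hepatitis B (CHB) based on clinical data.

Methods

The records of 165 CHB patients admitted to Ningbo Medical Center Lihuili Hospital (宁波市医疗中心李惠利医院) between January 2021 and July 2023 were analyzed retrospectively. Based on liver biopsy pathology, patients were divided into a no-fibrosis group (S0, n = 22) and a fibrosis group (≥S1, n = 143). Serological markers and clinical data were collected; univariate and multivariate logistic regression analyses were used to identify independent predictors and build the model, and the receiver operating characteristic (ROC) curve was used to evaluate the model's predictive performance.

Results

Univariate analysis showed that the two groups differed in albumin, aspartate aminotransferase (AST), triglycerides, total bile acid, cholinesterase, prothrombin time, BMI, serum type IV collagen, and serum hyaluronic acid (P < 0.05). Multivariate logistic regression yielded the liver fibrosis model S-risk score = −4.30 + 0.12 × albumin + 0.02 × AST − 0.05 × alkaline phosphatase + 0.29 × triglycerides + 0.06 × total bile acid − 0.47 × prothrombin time + 0.20 × BMI + 0.03 × serum type IV collagen + 0.02 × serum hyaluronic acid. The area under the ROC curve for this score was 0.866, and its accuracy in predicting liver fibrosis was clearly better than that of the APRI and FIB-4 scoring models.

Conclusion

The S-risk score model we constructed has good predictive ability for liver fibrosis in CHB patients, with higher predictive accuracy than both the APRI and FIB-4 scoring models.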
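
As a rough illustration of how the reported S-risk score could be evaluated for a single patient, the sketch below applies the coefficients exactly as printed in the abstract. The dictionary keys, the example patient values, and the units of every input are assumptions (the abstract states none of them). Converting the score to a probability with the logistic function is also an assumption, made only because the model was fitted by logistic regression.

```python
import math

# Coefficients copied from the abstract. The units of each input are NOT
# given there, so any real use would need the full paper.
INTERCEPT = -4.30
COEFFICIENTS = {
    "albumin": 0.12,
    "ast": 0.02,                 # aspartate aminotransferase
    "alp": -0.05,                # alkaline phosphatase
    "triglycerides": 0.29,
    "total_bile_acid": 0.06,
    "prothrombin_time": -0.47,
    "bmi": 0.20,
    "collagen_iv": 0.03,         # serum type IV collagen
    "hyaluronic_acid": 0.02,     # serum hyaluronic acid
}


def s_risk_score(patient):
    """Linear predictor: intercept plus the weighted sum of the nine inputs."""
    return INTERCEPT + sum(
        weight * patient[name] for name, weight in COEFFICIENTS.items()
    )


def logistic(score):
    """Map a linear predictor to (0, 1); assumes a standard logistic link."""
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical patient, invented purely to exercise the formula.
example_patient = {
    "albumin": 40, "ast": 30, "alp": 80, "triglycerides": 1.5,
    "total_bile_acid": 5, "prothrombin_time": 12, "bmi": 23,
    "collagen_iv": 60, "hyaluronic_acid": 80,
}

if __name__ == "__main__":
    score = s_risk_score(example_patient)
    print(f"S-risk score: {score:.3f}, logistic(score): {logistic(score):.3f}")
```

The abstract does not report a decision threshold for the score, so the sketch deliberately stops at the raw value.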


6.
Abstract. Conservation management has significant gaps between (1) collection and storage of biological data, (2) data analysis, and (3) application of results. In order to improve management decision-making, it is necessary to bridge these gaps. One of the most promising approaches uses computer-based decision support systems (DSS): interactive models of the system in question—for example, a nature reserve. One kind of DSS is scenario modeling: spatially-based models which (1) use expert opinion and data on vegetation, geology, hydrology, and management, (2) to project changes in landscape through time, (3) on the basis of changes in driving environmental factors. Scenario models are essentially graphic hypotheses, predicting changes in landscape with a specified change in driving factors, which can then be verified or falsified by monitoring. This paper presents an application of this approach to an Israeli nature reserve, the En Afeq Reserve in western Galilee. Our project tests the possibility of improving Israeli conservation management by using methods now standard for nature reserves in the Netherlands.

7.
The purpose of this study was to determine the effectiveness of white-box decision tree models (DTM) for predicting the rating of perceived exertion (RPE). The second aim was to examine the relationship between RPE and external measures of intensity in youth soccer training at the group and individual level. Training load data from 18 youth soccer players were collected during an in-season competition period. A total of 804 training observations were undertaken, with a total of 43 ± 17 sessions per player (range 12–76). External measures of intensity were determined using a 10 Hz GPS and included total distance (TD, m/min), high-speed running distance (HSR, m/min), PlayerLoad (PL, n/min), impacts (n/min), distance in acceleration/deceleration (TD ACC/TD DEC, m/min) and the number of accelerations/decelerations (ACC/DEC, n/min). Data were analysed with decision tree models. Global and individualized models were constructed. Aggregated importance revealed HSR as the strongest predictor of RPE with relative importance of 0.61. HSR was the most important factor in predicting RPE for half of the players. The prediction error (root mean square error [RMSE] 0.755 ± 0.014) for the individualized models was lower compared to the population model (RMSE 1.621 ± 0.001). The findings demonstrate that individual models should be used for the assessment of players’ response to external load. Furthermore, the study demonstrates that DTM provide straightforward interpretation, with the possibility of visualization. This method can be used to prescribe daily training loads on the basis of predicted, desired player responses (exertion).

8.
A molecular modeling study using Comparative Molecular Field Analysis (CoMFA) was undertaken to develop a predictive model for combretastatin binding to the colchicine binding site of tubulin. Furthermore, we examined the potential contribution of lipophilicity (log P) and molecular dipole moment and were unable to correlate these properties to the observed biological data. In this study we first confirmed that tubulin polymerization inhibition (IC₅₀) correlated (R² = 0.92) with [³H]colchicine displacement. Although these data correlated quite well, we developed two independent models for each set of data to quantify structural features that may contribute to each biological property independently. To develop our predictive model we first examined a series of molecular alignments for the training set and ultimately found that overlaying the respective trimethoxyphenyl rings (A ring) of the analogues generated the best correlated model. The CoMFA yielded a cross-validated R² = 0.41 (optimum number of components equal to 5) for the tubulin polymerization model and an R² = 0.38 (optimum number of components equal to 5) for [³H]colchicine inhibition. Final non-cross-validation generated models for tubulin polymerization (R² of 0.93) and colchicine inhibition (R² of 0.91). These models were validated by predicting both biological properties for compounds not used in the training set. These models accurately predicted the IC₅₀ for tubulin polymerization with an R² of 0.88 (n = 6) and those of [³H]colchicine displacement with an R² of 0.80 (n = 7). This study represents the first predictive model for the colchicine binding site over a wide range of combretastatin analogues.

9.
Ecological systems are governed by complex interactions which are mainly nonlinear. In order to capture the inherent complexity and nonlinearity of ecological, and in general biological, systems, empirical models have recently gained popularity. However, although these models, particularly connectionist approaches such as multilayered backpropagation networks, are commonly applied as predictive models in ecology to a wide variety of ecosystems and questions, there are no studies to date aiming to assess the performance, both in terms of data fitting and generalizability, and the applicability of empirical models in ecology. Our aim is hence to provide an overview of the nature of the wide range of data sets and predictive variables, from both aquatic and terrestrial ecosystems with different scales of time-dependent dynamics, and of the applicability and robustness of predictive modeling methods on such data sets, by comparing different empirical modeling approaches. The models used in this study range from predicting the occurrence of submerged plants in shallow lakes to predicting nest occurrence of bird species from environmental variables and satellite images. The methods considered include k-nearest neighbor (k-NN), linear and quadratic discriminant analysis (LDA and QDA), generalized linear models (GLM), feedforward multilayer backpropagation networks, and the pseudo-supervised network ARTMAP.

Our results show that the predictive performances of the models on training data could be misleading, and one should consider the predictive performance of a given model on an independent test set for assessing its predictive power. Moreover, our results suggest that for ecosystems involving time-dependent dynamics and periodicities whose frequencies are possibly less than the time scale of the data considered, GLM and connectionist neural network models appear to be most suitable and robust, provided that a predictive variable reflecting these time-dependent dynamics is included in the model either implicitly or explicitly. For spatial data, which does not include any time-dependence comparable to the time scale covered by the data, on the other hand, neighborhood based methods such as k-NN and ARTMAP proved to be more robust than other methods considered in this study. In addition, for predictive modeling purposes, a suitable, computationally inexpensive method should first be applied to the problem at hand; good predictive performance from it would render the computational cost and effort associated with complex variants unnecessary.

10.
Short-term forecasts based on time series of counts or survey data are widely used in population biology to provide advice concerning the management, harvest and conservation of natural populations. A common approach to produce these forecasts uses time-series models, of different types, fit to time series of counts. Similar time-series models are used in many other disciplines; however, relative to the data available in these other disciplines, population data are often unusually short and noisy, and models that perform well for data from other disciplines may not be appropriate for population data. In order to study the performance of time-series forecasting models for natural animal population data, we assembled 2379 time series of vertebrate population indices from actual surveys. Our data comprised three vastly different types: highly variable (marine fish productivity), strongly cyclic (adult salmon counts), and small variance but long-memory (bird and mammal counts). We tested the predictive performance of 49 different forecasting models grouped into three broad classes: autoregressive time-series models, non-linear regression-type models and non-parametric time-series models. Low-dimensional parametric autoregressive models gave the most accurate forecasts across a wide range of taxa; the most accurate model was one that simply treated the most recent observation as the forecast. More complex parametric and non-parametric models performed worse, except when applied to highly cyclic species. Across taxa, certain life history characteristics were correlated with lower forecast error; specifically, we found that better forecasts were correlated with attributes of slow growing species: large maximum age and size for fishes and high trophic level for birds.

Synthesis

Evaluating the data support for multiple plausible models has been an integral focus of many ecological analyses. However, the most commonly used tools to quantify support have weighted models’ hindcasting and forecasting abilities. For many applications, predicting the past may be of little interest. Concentrating only on the future predictive performance of time series models, we performed a forecasting competition among many different kinds of statistical models, applying each to many different kinds of vertebrate time series of population abundance. Low-dimensional (simple) models performed well overall, but more complex models did slightly better when applied to time series of cyclic species (e.g. salmon).
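
The abstract's simplest model, which treats the most recent observation as the forecast, can be sketched in a few lines. The example below compares it with a long-run-mean forecast using rolling one-step-ahead mean absolute error. The function names, the error metric, the `min_history` parameter, and the made-up count series are all assumptions for illustration and are not taken from the study.

```python
def naive_forecast(history):
    """Forecast the next value as the most recent observation."""
    return history[-1]


def mean_forecast(history):
    """Forecast the next value as the mean of all past observations."""
    return sum(history) / len(history)


def rolling_mae(series, forecaster, min_history=3):
    """One-step-ahead mean absolute error.

    At each time t >= min_history, the forecaster sees only series[:t]
    and its prediction is compared with the observed series[t].
    """
    errors = []
    for t in range(min_history, len(series)):
        prediction = forecaster(series[:t])
        errors.append(abs(series[t] - prediction))
    return sum(errors) / len(errors)


# Invented, slowly drifting population index (not data from the study).
counts = [120, 118, 121, 119, 117, 118, 116, 115, 113, 114]

if __name__ == "__main__":
    print("naive MAE:", rolling_mae(counts, naive_forecast))
    print("mean  MAE:", rolling_mae(counts, mean_forecast))
```

Evaluating each forecast only on data the model has not yet seen mirrors the abstract's emphasis on future predictive performance rather than hindcasting.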

11.
Background

The purpose of this study was to characterize pre-treatment non-contrast computed tomography (CT) and 18F-fluorodeoxyglucose positron emission tomography (PET) based radiomics signatures predictive of pathological response and clinical outcomes in rectal cancer patients treated with neoadjuvant chemoradiotherapy (NACRT).

Materials and methods

An exploratory analysis was performed using pre-treatment non-contrast CT and PET imaging dataset. The association of tumor regression grade (TRG) and neoadjuvant rectal (NAR) score with pre-treatment CT and PET features was assessed using machine learning algorithms. Three separate predictive models were built for composite features from CT + PET.

Results

The patterns of pathological response were TRG 0 (n = 13; 19.7%), 1 (n = 34; 51.5%), 2 (n = 16; 24.2%), and 3 (n = 3; 4.5%). There were 20 (30.3%) patients with low, 22 (33.3%) with intermediate and 24 (36.4%) with high NAR scores. Three separate predictive models were built for composite features from CT + PET and analyzed separately for clinical endpoints. Composite features with α = 0.2 resulted in the best predictive power using logistic regression. For pathological response prediction, the signature resulted in 88.1% accuracy in predicting TRG 0 vs. TRG 1–3; 91% accuracy in predicting TRG 0–1 vs. TRG 2–3. For the surrogate of DFS and OS, it resulted in 67.7% accuracy in predicting low vs. intermediate vs. high NAR scores.

Conclusion

The pre-treatment composite radiomics signatures were highly predictive of pathological response in rectal cancer treated with NACRT. A larger cohort is warranted for further validation.

12.

Background

Fenofibrate (Fb) is a known treatment for elevated triglyceride (TG) levels. The Genetics of Lipid Lowering Drugs and Diet Network (GOLDN) study was designed to investigate potential contributors to the effects of Fb on TG levels. Here, we summarize the analyses of 8 papers whose authors had access to the GOLDN data and were grouped together because they pursued investigations into Fb treatment responses as part of GAW20. These papers report explorations of a variety of genetics, epigenetics, and study design questions. Data regarding treatment with 160 mg of micronized Fb per day for 3 weeks included pretreatment and posttreatment TG and methylation levels (ML) at approximately 450,000 epigenetic markers (cytosine-phosphate-guanine [CpG] sites). In addition, approximately 1 million single-nucleotide polymorphisms (SNPs) were genotyped or imputed in each of the study participants, drawn from 188 pedigrees.

Results

The analyses of a variety of subsets of the GOLDN data used a number of analytic approaches such as linear mixed models, a kernel score test, penalized regression, and artificial neural networks.

Conclusions

Results indicate that (a) CpG ML are responsive to Fb; (b) CpG ML should be included in models predicting the TG level responses to Fb; (c) common and rare variants are associated with TG responses to Fb; (d) the interactions of common variants and CpG ML should be included in models predicting the TG response; and (e) sample size is a critical factor in the successful construction of predictive models representing the response to Fb.

13.
14.
The aim of the study was to construct a model forecasting the birch pollen season characteristics in Cracow on the basis of an 18-year data series. The study was performed using the volumetric method (Lanzoni/Burkard trap). The 98/95 % method was used to calculate the pollen season. Spearman’s correlation test was applied to find the relationship between the meteorological parameters and pollen season characteristics. To construct the predictive model, backward stepwise multiple regression analysis was used, including the multi-collinearity of variables. The predictive models best fitted the pollen season start and end, especially models containing two independent variables. The peak concentration value was predicted with a higher prediction error. Also, the accuracy of the models predicting the pollen season characteristics in 2009 was higher in comparison with 2010. Both the multi-variable model and the one-variable model for the beginning of the pollen season included air temperature during the last 10 days of February, while the multi-variable model also included humidity at the beginning of April. The models forecasting the end of the pollen season were based on temperature in March–April, while the peak day was predicted using the temperature during the last 10 days of March.

15.
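Zhang Xia (张霞), Li Zhanbin (李占斌), Zhang Zhenwen (张振文), Deng Yan (邓彦). Acta Ecologica Sinica (《生态学报》), 2012, 32(21): 6788-6794
To predict groundwater dynamics in the Luohui Canal irrigation district of Shaanxi, and on the basis of a comprehensive analysis of existing methods for studying groundwater dynamics, a prediction method based on support vector machines and an improved BP neural network model was proposed. The corresponding programs were written in MATLAB and groundwater dynamics prediction models were built. Using multi-year measured data from the irrigation district as learning and test samples, the groundwater prediction performance of the two models was compared. The study showed that both the support vector machine model and the BP network model achieved high simulation accuracy during sample training, whereas in the sample learning stage the prediction accuracy of the support vector machine was clearly better than that of the BP network, and it described the complex coupled relationships of groundwater dynamics well. The support vector machine method is practical and feasible, is better suited to predicting groundwater dynamics in large irrigation districts, and supplements and improves on traditional methods for studying groundwater dynamics.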

16.
The purpose of this study was to compare the effects of combined strength and plyometric training with strength training alone on power-related measurements in professional soccer players. Subjects in the intervention team were randomly divided into 2 groups. Group ST (n = 6) performed heavy strength training twice a week for 7 weeks in addition to 6 to 8 soccer sessions a week. Group ST+P (n = 8) performed a plyometric training program in addition to the same training as the ST group. The control group (n = 7) performed 6 to 8 soccer sessions a week. Pretests and posttests were 1 repetition maximum (1RM) half squat, countermovement jump (CMJ), squat jump (SJ), 4-bounce test (4BT), peak power in half squat with 20 kg, 35 kg, and 50 kg (PP20, PP35, and PP50, respectively), sprint acceleration, peak sprint velocity, and total time on 40-m sprint. There were no significant differences between the ST+P group and ST group. Thus, the groups were pooled into 1 intervention group. The intervention group significantly improved in all measurements except CMJ, while the control group showed significant improvements only in PP20. There was a significant difference in relative improvement between the intervention group and control group in 1RM half squat, 4BT, and SJ. However, a significant difference between groups was not observed in PP20, PP35, sprint acceleration, peak sprinting velocity, and total time on 40-m sprint. The results suggest that there are no significant performance-enhancing effects of combining strength and plyometric training in professional soccer players concurrently performing 6 to 8 soccer sessions a week compared to strength training alone. However, heavy strength training leads to significant gains in strength and power-related measurements in professional soccer players.

17.
This study tested the accuracy of a novel, limited-availability web application (H2Q™) for predicting sweat rates in a variety of sports using estimates of energy expenditure and air temperature only. The application of predictions for group water planning was investigated for soccer match play. Fourteen open literature studies were identified where group sweat rates were reported (n = 20 group means comprising 230 individual observations from 179 athletes) with fidelity. Sports represented included: walking, cycling, swimming, and soccer match play. The accuracy of H2Q™ sweat rates was tested by comparing to measured group sweat rates using the concordance correlation coefficient (CCC) with 95% confidence interval [CI]. The relative absolute error (RAE) with 95% [CI] was also assessed, whereby the mean absolute error was expressed relative to an acceptance limit of 0.250 L/h. The CCC was 0.98 [0.95, 0.99] and the RAE was 0.449 [0.279, 0.620], indicating that the prediction error was on average 0.112 L/h. The RAE was < 1.0 for 19/20 observations (95%). Drink volumes modeled as a proxy for sweat losses during soccer match play prevented dehydration (< 1% loss of body mass). The H2Q™ web application demonstrated high group sweat prediction accuracy for the variety of sports activities tested. Water planning for soccer match play suggests the feasibility of easily and accurately predicting sweat rates to plan group water needs and promote optimal hydration in training and/or competition.
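
The two agreement statistics named in this abstract can be sketched as follows. The CCC is written in Lin's standard form, 2·cov / (var_m + var_p + (mean_m − mean_p)²) with 1/n variances, which is assumed here because the abstract does not say which estimator was used. The RAE follows the abstract's own description (mean absolute error divided by the 0.250 L/h acceptance limit). Function names and the sample numbers are hypothetical.

```python
def concordance_correlation(measured, predicted):
    """Lin's concordance correlation coefficient (population 1/n moments)."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    var_m = sum((m - mean_m) ** 2 for m in measured) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum(
        (m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted)
    ) / n
    return 2 * cov / (var_m + var_p + (mean_m - mean_p) ** 2)


def relative_absolute_error(measured, predicted, acceptance_limit=0.250):
    """Mean absolute error expressed as a fraction of the acceptance limit."""
    n = len(measured)
    mae = sum(abs(m - p) for m, p in zip(measured, predicted)) / n
    return mae / acceptance_limit


# Consistency check on the reported figures: an RAE of 0.449 against a
# 0.250 L/h limit implies a mean absolute error of about 0.112 L/h.
implied_mae = 0.449 * 0.250

# Invented group sweat rates in L/h (not data from the study).
measured_rates = [0.60, 0.85, 1.10, 1.40]
predicted_rates = [0.70, 0.80, 1.20, 1.30]

if __name__ == "__main__":
    print("CCC:", concordance_correlation(measured_rates, predicted_rates))
    print("RAE:", relative_absolute_error(measured_rates, predicted_rates))
    print("implied MAE (L/h):", implied_mae)
```

An RAE below 1.0 means the average error sits inside the acceptance limit, which is how the abstract's "19/20 observations" figure is to be read.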

18.
This narrative review paper aimed to discuss the literature on machine learning applications in soccer with an emphasis on injury risk assessment. A secondary aim was to provide practical tips for the health and performance staff in soccer clubs on how machine learning can provide a competitive advantage. Performance analysis is the area with the majority of research so far. Other domains of soccer science and medicine with machine learning use are injury risk assessment, players’ workload and wellness monitoring, movement analysis, players’ career trajectory, club performance, and match attendance. Regarding injuries, which is a hot topic, machine learning does not seem to have a high predictive ability at the moment (model specificity ranged from 74.2% to 97.7% and sensitivity from 15.2% to 55.6%, with area under the curve of 0.66–0.83). It seems, though, that machine learning can help to identify the early signs of elevated risk for a musculoskeletal injury. Future research should account for musculoskeletal injuries’ dynamic nature for machine learning to provide more meaningful results for practitioners in soccer.
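
To make the sensitivity and specificity ranges quoted above concrete, the sketch below computes both from binary labels. The function name and the ten made-up player outcomes are assumptions for illustration; they are not drawn from any of the reviewed studies. The code assumes both classes occur at least once in `actual`.

```python
def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) for binary labels (1 = injured)."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # share of real injuries that were flagged
    specificity = tn / (tn + fp)  # share of uninjured players left unflagged
    return sensitivity, specificity


# Invented outcomes: four injured players, six uninjured.
actual_injuries = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
model_flags = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

if __name__ == "__main__":
    sens, spec = sensitivity_specificity(actual_injuries, model_flags)
    print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

A model in this position rarely raises false alarms but misses most injuries, which is the pattern behind the review's cautious assessment of current predictive ability.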

19.
Clinicians and patients would benefit if accurate methods of predicting and monitoring bone strength in vivo were available. A group of 51 human femurs (age range 21-93; 23 females, 28 males) were evaluated for bone density and geometry using quantitative computed tomography (QCT) and dual energy X-ray absorptiometry (DXA). Regional bone density and dimensions obtained from QCT and DXA were used to develop statistical models to predict femoral strength ex vivo. The QCT data also formed the basis of three-dimensional finite element (FE) models to predict structural stiffness. The femurs were separated into two groups; a model training set (n = 25) was used to develop statistical models to predict ultimate load, and a test set (n = 26) was used to validate these models. The main goal of this study was to test the ability of DXA, QCT and FE techniques to predict fracture load non-invasively, in a simple load configuration which produces predominantly femoral neck fractures. The load configuration simulated the single stance phase portion of normal gait; in 87% of the specimens, clinical-appearing sub-capital fractures were produced. The training/test study design provided a tool to validate that the predictive models were reliable when used on specimens with "unknown" strength characteristics. The FE method explained at least 20% more of the variance in strength than the DXA models. Planned refinements of the FE technique are expected to further improve these results. Three-dimensional FE models are a promising method for predicting fracture load, and may be useful in monitoring strength changes in vivo.

20.