Similar Articles (20 results found)
1.
We developed linguistics-driven prediction models to estimate the risk of suicide. These models were generated from unstructured clinical notes taken from a national sample of U.S. Veterans Administration (VA) medical records. We created three matched cohorts: veterans who committed suicide, veterans who used mental health services and did not commit suicide, and veterans who neither used mental health services nor committed suicide during the observation period (n = 70 in each group). From the clinical notes, we generated datasets of single keywords and multi-word phrases, and constructed prediction models using a machine-learning algorithm based on a genetic programming framework. The resulting inference accuracy was consistently 65% or more. Our data therefore suggest that computerized text analytics can be applied to unstructured medical records to estimate the risk of suicide. The resulting system could allow clinicians to screen seemingly healthy patients at the primary care level and to continuously evaluate suicide risk among psychiatric patients.
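The paper's genetic programming framework is not described in enough detail to reproduce here, but the keyword/phrase pipeline it feeds can be sketched. The snippet below builds single-keyword and two-word-phrase features from notes and trains a stand-in classifier (scikit-learn logistic regression, not the authors' algorithm); all note text and labels are hypothetical.

```python
# Sketch of a keyword-based risk classifier. Stand-in model: logistic
# regression, NOT the paper's genetic-programming framework.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

notes = [  # hypothetical clinical-note snippets
    "patient reports hopelessness and insomnia",
    "routine follow-up, no acute complaints",
    "expressed feelings of being a burden",
    "annual physical, labs within normal limits",
]
labels = [1, 0, 1, 0]  # 1 = suicide cohort, 0 = control cohort

# Single keywords plus two-word phrases, mirroring the paper's feature sets.
pipeline = make_pipeline(
    CountVectorizer(ngram_range=(1, 2), stop_words="english"),
    LogisticRegression(max_iter=1000),
)
scores = cross_val_score(pipeline, notes, labels, cv=2)
print("cross-validated accuracy:", scores.mean())
```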

2.
This article surveys efforts on text mining of the pharmacogenomics literature, mainly from the period 2008 to 2011. Pharmacogenomics (or pharmacogenetics) is the field that studies how human genetic variation impacts drug response. Therefore, publications span the intersection of research in genotypes, phenotypes and pharmacology, a topic that has increasingly become a focus of active research in recent years. This survey covers efforts dealing with the automatic recognition of relevant named entities (e.g. genes, gene variants and proteins, diseases and other pathological phenomena, drugs and other chemicals relevant for medical treatment), as well as various forms of relations between them. A wide range of text genres is considered, such as scientific publications (abstracts as well as full texts), patent texts and clinical narratives. We also discuss infrastructure and resources needed for advanced text analytics, e.g. document corpora annotated with corresponding semantic metadata (gold standards and training data), biomedical terminologies and ontologies providing domain-specific background knowledge at different levels of formality and specificity, software architectures for building complex and scalable text analytics pipelines and Web services grounded in them, as well as comprehensive ways to disseminate and interact with the typically huge amounts of semiformal knowledge structures extracted by text mining tools. Finally, we consider some of the novel applications that have already been developed in the field of pharmacogenomic text mining and point out perspectives for future research.

3.
We have built a computational model for individual aging trajectories of health and survival, which contains physical, functional, and biological variables, and is conditioned on demographic, lifestyle, and medical background information. We combine techniques of modern machine learning with an interpretable interaction network, where health variables are coupled by explicit pair-wise interactions within a stochastic dynamical system. Our dynamic joint interpretable network (DJIN) model is scalable to large longitudinal data sets, is predictive of individual high-dimensional health trajectories and survival from baseline health states, and infers an interpretable network of directed interactions between the health variables. The network identifies plausible physiological connections between health variables as well as clusters of strongly connected health variables. We use English Longitudinal Study of Ageing (ELSA) data to train our model and show that it performs better than multiple dedicated linear models for health outcomes and survival. We compare our model with flexible lower-dimensional latent-space models to explore the dimensionality required to accurately model aging health outcomes. Our DJIN model can be used to generate synthetic individuals that age realistically, to impute missing data, and to simulate future aging outcomes given arbitrary initial health states.
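To make the "explicit pair-wise interactions within a stochastic dynamical system" concrete, here is a minimal numerical sketch in the spirit of DJIN, not the authors' implementation: each health variable drifts under a directed interaction matrix W and diffuses with noise, integrated with the Euler–Maruyama scheme. The matrix, drift form, and all values are illustrative.

```python
import numpy as np

# Minimal Euler–Maruyama sketch of pairwise-coupled health variables,
# in the spirit of the DJIN model (illustrative, not the authors' code).
rng = np.random.default_rng(0)
d = 5                                 # number of health variables
W = rng.normal(0, 0.1, (d, d))        # hypothetical directed interaction network
np.fill_diagonal(W, -0.5)             # self-damping keeps trajectories bounded
sigma = 0.05                          # noise amplitude
dt = 0.1                              # time step (e.g., a fraction of a year)

x = rng.normal(0, 1, d)               # baseline health state (standardized)
trajectory = [x.copy()]
for _ in range(100):
    drift = W @ x                     # pairwise interactions drive the dynamics
    x = x + drift * dt + sigma * np.sqrt(dt) * rng.normal(size=d)
    trajectory.append(x.copy())
trajectory = np.array(trajectory)     # shape (101, d): one simulated trajectory
print(trajectory.shape)
```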

4.
In recent years, cell population models have become increasingly common. In contrast to classic single-cell models, population models allow for the study of cell-to-cell variability, a crucial phenomenon in most populations of primary cells, cancer cells, and stem cells. Unfortunately, tools for in-depth analysis of population models are still missing. This problem originates from the complexity of population models. Particularly important are methods to determine the source of heterogeneity (e.g., genetic or epigenetic differences) and to select potential (bio-)markers. We propose an analysis based on visual analytics to tackle this problem. Our approach combines parallel-coordinates plots, used for a visual assessment of the high-dimensional dependencies, and nonlinear support vector machines, for the quantification of effects. The method can be employed to study qualitative and quantitative differences among cells. To illustrate the different components, we perform a case study using the proapoptotic signal transduction pathway involved in cellular apoptosis.
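As a concrete sketch of the proposed combination — parallel-coordinates plots for visual assessment plus a nonlinear SVM to quantify which parameters separate subpopulations — the snippet below uses pandas/matplotlib and scikit-learn on synthetic single-cell parameters. Parameter names, group labels, and data are hypothetical.

```python
import numpy as np
import pandas as pd
from pandas.plotting import parallel_coordinates
import matplotlib.pyplot as plt
from sklearn.svm import SVC

# Synthetic single-cell parameters for two subpopulations (hypothetical).
rng = np.random.default_rng(1)
n = 100
df = pd.DataFrame({
    "k_synthesis": np.r_[rng.normal(1.0, 0.2, n), rng.normal(1.6, 0.2, n)],
    "k_degradation": rng.normal(0.5, 0.1, 2 * n),
    "initial_caspase": np.r_[rng.normal(0.2, 0.05, n), rng.normal(0.4, 0.05, n)],
    "subpopulation": ["responder"] * n + ["non-responder"] * n,
})

# Visual assessment of the high-dimensional dependencies.
parallel_coordinates(df, "subpopulation", alpha=0.3)
plt.savefig("cells_parallel_coordinates.png")

# Nonlinear SVM quantifies how well the parameters separate the groups.
X = df.drop(columns="subpopulation").to_numpy()
y = (df["subpopulation"] == "responder").astype(int)
svm = SVC(kernel="rbf").fit(X, y)
print("training separability:", svm.score(X, y))
```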

5.
Big data and deep learning will profoundly change various professions and research areas in the future. This will also happen in medicine, and in medical imaging in particular. As medical physicists, we should look beyond the concept of technical quality and extend our methodology and competence towards measuring and optimising diagnostic value in terms of how it is connected to care outcomes. Functional implementation of such methodology requires data processing utilities, starting from data collection and management and culminating in data analysis methods. Data quality control and validation are prerequisites for deep learning applications, in order to provide reliable analysis, classification, interpretation, and probabilistic and predictive modelling from vast, heterogeneous big data. Challenges in practical data analytics relate to both horizontal and longitudinal aspects of analysis. Quantitative aspects of data validation, quality control, physically meaningful measures, parameter connections and system modelling for future artificial intelligence (AI) methods are positioned firmly within the medical physics profession. It is in our interest to ensure that our professional education, continuous training and competence keep pace with this significant global development.

6.

Background

Mobile health applications are complex interventions that essentially require changes to the behavior of the health care professionals who will use them and changes to the systems or processes through which care is delivered. Our aim has been to meet the technical needs of Health Extension Workers (HEWs) and midwives for maternal health using appropriate mobile technology tools.

Methods

We have developed and evaluated a set of appropriate smartphone health applications using open source components, including a data collection tool adapted to the local language, user-friendly dashboard analytics for health workers and managers, and maternal-newborn protocols. This is an eighteen-month follow-up of an ongoing observational research study in northern Ethiopia involving two districts, twenty HEWs, and twelve midwives.

Results

Most health workers rapidly learned to use the touch-screen devices and became comfortable with them, so only limited technical support was needed. Unrestricted use of the smartphones generated a strong sense of ownership and empowerment among the health workers. Ownership of the phones was a strong motivator for the health workers, who recognised the value and usefulness of the devices and so took care to look after them. Low rates of smartphone breakage (8.3%, 3 of 36) and loss (2.7%) were reported. Each health worker made an average of 160 minutes of voice calls and downloaded 27 MB of data per month; however, we found very low usage of the short message service (fewer than 3 messages per month).

Conclusions

Although it is too early to show a direct link between mobile technologies and health outcomes, mobile technologies allow health managers quicker and more reliable access to data, which can help identify where there are issues in service delivery. Achieving a strong sense of ownership and empowerment among health workers is a prerequisite for the successful introduction of any mobile health program.

7.
The biopharmaceutical industry continuously seeks to optimize critical quality attributes to maintain the reliability and cost-effectiveness of its products. Such optimization demands a scalable and optimal control strategy to meet the process constraints and objectives. This work uses a model predictive controller (MPC) to compute an optimal feeding strategy leading to maximized cell growth and metabolite production in fed-batch cell culture processes. The lack of high-fidelity physics-based models and the high complexity of cell culture processes motivated us to use machine learning algorithms in the forecast model to aid our development. We took advantage of linear regression, Gaussian process, and neural network models in the MPC design to maximize the daily protein production for each batch. The control scheme of the cell culture process solves an optimization problem while maintaining all metabolites and cell culture process variables within the specification. The linear and nonlinear models are developed based on real cell culture process data, and the performance of the designed controllers is evaluated by running several real-time experiments.
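The control scheme described here can be sketched as a receding-horizon optimization over candidate feed rates using a learned one-step forecast model. The toy below uses a linear stand-in model and a grid search over feeds; the dynamics, constraint bound, and objective are hypothetical, not the authors' controller.

```python
import numpy as np

# Toy receding-horizon (MPC-style) feed optimization with a learned
# one-step forecast model. All dynamics and bounds are hypothetical.
def forecast(state, feed):
    """Learned one-step model: next [cell_density, metabolite]."""
    A = np.array([[1.05, 0.0], [0.10, 0.95]])  # stand-in linear dynamics
    b = np.array([0.8, 0.2])                   # response to feeding
    return A @ state + b * feed

def choose_feed(state, horizon=3, feeds=np.linspace(0, 1, 21)):
    """Pick the feed maximizing predicted growth within the specification."""
    best_feed, best_obj = 0.0, -np.inf
    for f in feeds:
        s, obj = state.copy(), 0.0
        for _ in range(horizon):
            s = forecast(s, f)
            if s[1] > 5.0:          # metabolite constraint (specification)
                obj = -np.inf
                break
            obj += s[0]             # objective: cumulative cell growth
        if obj > best_obj:
            best_feed, best_obj = f, obj
    return best_feed

state = np.array([1.0, 0.5])
for day in range(5):
    feed = choose_feed(state)       # re-optimize each day (receding horizon)
    state = forecast(state, feed)   # apply only the first move
    print(f"day {day}: feed={feed:.2f}, state={state.round(2)}")
```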

8.
Data from the electronic medical record comprise numerous structured but uncoded elements, which are not linked to standard terminologies. Reuse of such data for secondary research purposes has gained in importance recently. However, the identification of relevant data elements and the creation of database jobs for extraction, transformation and loading (ETL) are challenging: With current methods such as data warehousing, it is not feasible to efficiently maintain and reuse semantically complex data extraction and transformation routines. We present an ontology-supported approach to overcome this challenge by making use of abstraction: Instead of defining ETL procedures at the database level, we use ontologies to organize and describe the medical concepts of both the source system and the target system. Instead of using unique, specifically developed SQL statements or ETL jobs, we define declarative transformation rules within ontologies and illustrate how these constructs can then be used to automatically generate SQL code to perform the desired ETL procedures. This demonstrates how a suitable level of abstraction may not only aid the interpretation of clinical data, but can also foster the reutilization of methods for unlocking it.
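To illustrate the idea of declarative transformation rules compiled into SQL, here is a minimal sketch: each rule maps a source concept (table, columns, condition) to a target concept, and a generator emits the corresponding INSERT…SELECT. The rule format, schema, and names are hypothetical, not the paper's ontology constructs.

```python
# Minimal sketch: declarative mapping rules compiled to SQL (hypothetical
# rule format and schema; the paper encodes such rules in ontologies).
RULES = [
    {
        "target_table": "fact_diagnosis",
        "target_columns": ["patient_id", "icd10_code"],
        "source_table": "emr_notes_coded",
        "source_columns": ["pat_nr", "local_dx_code"],
        "condition": "local_dx_code IS NOT NULL",
    },
]

def rule_to_sql(rule):
    """Generate an INSERT ... SELECT ETL statement from one rule."""
    return (
        f"INSERT INTO {rule['target_table']} "
        f"({', '.join(rule['target_columns'])})\n"
        f"SELECT {', '.join(rule['source_columns'])}\n"
        f"FROM {rule['source_table']}\n"
        f"WHERE {rule['condition']};"
    )

for rule in RULES:
    print(rule_to_sql(rule))
```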

9.
Background

Among patients who are discharged from the Emergency Department (ED), about 3% return within 30 days. Revisits can be related to the nature of the disease, medical errors, and/or inadequate diagnosis and treatment during the initial ED visit. Identification of high-risk patient populations can help devise new strategies for improved ED care with reduced ED utilization.

Conclusions

Our ED 30-day revisit model was prospectively validated on the Maine State HIN secure statewide data system. Future integration of our ED predictive analytics into the ED care workflow may create opportunities for targeted care interventions that reduce the ED resource burden and overall healthcare expense, and improve outcomes.

10.
Geoscience observations and model simulations are generating vast amounts of multi-dimensional data. Effectively analyzing these data is essential for geoscience studies. However, the task is challenging for geoscientists because processing such massive amounts of data is both compute- and data-intensive: the analytics require complex procedures and multiple tools. To tackle these challenges, a scientific workflow framework is proposed for big geoscience data analytics. The framework leverages cloud computing, MapReduce, and Service-Oriented Architecture (SOA). Specifically, HBase is adopted for storing and managing big geoscience data across distributed computers; a MapReduce-based algorithm framework is developed to support parallel processing of geoscience data; and a service-oriented workflow architecture is built to support on-demand complex data analytics in the cloud environment. A proof-of-concept prototype tests the performance of the framework. Results show that this framework significantly improves the efficiency of big geoscience data analytics by reducing data processing time and simplifying analytical procedures for geoscientists.
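As a language-agnostic illustration of the MapReduce pattern the framework builds on (shown here in plain Python rather than on Hadoop/HBase), the sketch below computes a per-grid-cell mean over distributed records; the keys, records, and aggregation are hypothetical.

```python
from collections import defaultdict

# Plain-Python illustration of the MapReduce pattern (the framework itself
# runs on Hadoop/HBase); records and keys are hypothetical.
records = [
    ("cell_42", 15.2), ("cell_42", 15.8), ("cell_7", 9.1), ("cell_7", 9.5),
]

def map_phase(records):
    for cell_id, temperature in records:       # emit (key, value) pairs
        yield cell_id, temperature

def reduce_phase(pairs):
    groups = defaultdict(list)                 # shuffle: group by key
    for key, value in pairs:
        groups[key].append(value)
    return {k: sum(v) / len(v) for k, v in groups.items()}  # per-key mean

print(reduce_phase(map_phase(records)))        # {'cell_42': 15.5, 'cell_7': 9.3}
```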

11.
Clinical prediction models play a key role in risk stratification, therapy assignment and many other fields of medical decision making. Before they can enter clinical practice, their usefulness has to be demonstrated using systematic validation. Methods to assess their predictive performance have been proposed for continuous, binary, and time-to-event outcomes, but the literature on validation methods for discrete time-to-event models with competing risks is sparse. The present paper tries to fill this gap and proposes new methodology to quantify discrimination, calibration, and prediction error (PE) for discrete time-to-event outcomes in the presence of competing risks. In our case study, the goal was to predict the risk of ventilator-associated pneumonia (VAP) attributed to Pseudomonas aeruginosa in intensive care units (ICUs). Competing events are extubation, death, and VAP due to other bacteria. The aim of this application is to validate complex prediction models developed in previous work on more recently available validation data.
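For readers unfamiliar with the discrete-time competing-risks setup, the core quantity is the cause-specific discrete hazard. A standard formulation — notation mine, consistent with the discrete time-to-event literature rather than copied from the paper — is:

```latex
% Cause-specific discrete hazard for event type j at time t, given covariates x:
\lambda_j(t \mid x) = P(T = t,\; E = j \mid T \ge t,\; x), \qquad j = 1, \dots, J
% Overall survival to time t is the product of surviving all causes:
S(t \mid x) = \prod_{s \le t} \Bigl( 1 - \sum_{j=1}^{J} \lambda_j(s \mid x) \Bigr)
% Cumulative incidence of cause j (the quantity scored for calibration and PE):
F_j(t \mid x) = \sum_{s \le t} \lambda_j(s \mid x)\, S(s - 1 \mid x)
```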

12.
13.
As the biopharmaceutical industry evolves to include more diverse protein formats and processes, more robust control of Critical Quality Attributes (CQAs) is needed to maintain processing flexibility without compromising quality. Active control of CQAs has been demonstrated using model predictive control techniques, which allow development of processes that are robust against disturbances associated with raw material variability and other potentially flexible operating conditions. Wide adoption of model predictive control in biopharmaceutical cell culture processes has been hampered, however, in part due to the large amount of data and expertise required to build a predictive model of the controlled CQAs, a prerequisite for model predictive control. Here we developed a highly automated perfusion apparatus to systematically and efficiently generate predictive models through application of system identification approaches. We successfully created a predictive model of %galactosylation using data obtained by manipulating the galactose concentration in the perfusion apparatus in serialized step-change experiments. We then demonstrated the use of the model in a model predictive controller in a simulated control scenario, successfully achieving a %galactosylation set point in a simulated fed-batch culture. The automated model identification approach demonstrated here can potentially be generalized to many CQAs, could be a more efficient, faster, and highly automated alternative to batch experiments for developing predictive models in cell culture processes, and could allow the wider adoption of model predictive control in biopharmaceutical processes. © 2017 The Authors Biotechnology Progress published by Wiley Periodicals, Inc. on behalf of American Institute of Chemical Engineers Biotechnol. Prog., 33:1647–1661, 2017
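System identification from serialized step-change experiments can be sketched as fitting a low-order ARX model by least squares: predict the attribute (here %galactosylation) from its own lag and the lagged input (galactose concentration). The model order, data, and coefficients below are hypothetical, not the paper's identified model.

```python
import numpy as np

# Sketch: identify a first-order ARX model  y[t] = a*y[t-1] + b*u[t-1] + c
# by least squares from step-change data (all values hypothetical).
rng = np.random.default_rng(0)
u = np.repeat([0.0, 2.0, 0.0, 4.0], 25)   # galactose-concentration step changes
y = np.zeros_like(u)                      # %galactosylation response
for t in range(1, len(u)):                # simulate a "true" plant with noise
    y[t] = 0.9 * y[t - 1] + 0.3 * u[t - 1] + rng.normal(0, 0.05)

# Stack lagged regressors and solve the least-squares problem.
X = np.column_stack([y[:-1], u[:-1], np.ones(len(u) - 1)])
a, b, c = np.linalg.lstsq(X, y[1:], rcond=None)[0]
print(f"identified ARX model: y[t] = {a:.2f}*y[t-1] + {b:.2f}*u[t-1] + {c:.2f}")
```

The identified model can then serve as the forecast model inside a model predictive controller, as in the simulated control scenario the abstract describes.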

14.
As the discipline of biomedical science continues to apply new technologies capable of producing unprecedented volumes of noisy and complex biological data, it has become evident that available methods for deriving meaningful information from such data are simply not keeping pace. In order to achieve useful results, researchers require methods that consolidate, store and query combinations of structured and unstructured data sets efficiently and effectively. As we move towards personalized medicine, the need to combine unstructured data, such as medical literature, with large amounts of highly structured and high-throughput data such as human variation or expression data from very large cohorts, is especially urgent. For our study, we investigated a likely biomedical query using the Hadoop framework. We ran queries using native MapReduce tools we developed as well as other open source and proprietary tools. Our results suggest that the available technologies within the Big Data domain can reduce the time and effort needed to utilize and apply distributed queries over large datasets in practical clinical applications in the life sciences domain. The methodologies and technologies discussed in this paper set the stage for a more detailed evaluation that investigates how various data structures and data models are best mapped to the proper computational framework.

15.
Gemcitabine is a nucleoside analog effective against several solid tumors. Standard treatment consists of an intravenous infusion over 30 min. This is an invasive, uncomfortable and often painful method, involving recurring visits to the hospital and costs associated with medical staff and equipment. Gemcitabine's activity is significantly limited by numerous factors, including metabolic inactivation, rapid systemic clearance of gemcitabine and transporter deficiency-associated resistance. As such, there have been research efforts to improve the efficacy of gemcitabine-based therapy, as well as strategies to enhance its oral bioavailability. In this work, in vitro and clinical data for gemcitabine were analyzed and in silico tools were used to study the pharmacokinetics of gemcitabine after oral administration following different regimens. Several physiologically based pharmacokinetic (PBPK) models were developed using the simulation software GastroPlus™, predicting the PK parameters and plasma concentration–time profiles. The integrative biomedical data analyses presented here are promising, with some regimens of oral administration reaching higher AUC in comparison to the traditional IV infusion, supporting this route of administration as a viable alternative to IV infusions. This study further contributes to personalized health care based on potential new formulations for oral administration of gemcitabine, as well as nanotechnology-based drug delivery systems.
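GastroPlus itself is proprietary, but the AUC comparison the study reports can be illustrated with a textbook one-compartment model: for an IV dose, AUC = Dose/CL, while for oral dosing AUC = F·Dose/CL, so a regimen's exposure hinges on bioavailability F and total dose. A minimal numerical sketch follows; all parameter values are hypothetical, not gemcitabine's actual PK.

```python
import numpy as np

# One-compartment PK sketch comparing an IV dose with a multi-dose oral
# regimen. All parameters are hypothetical, not gemcitabine's actual PK.
CL = 150.0     # clearance (L/h)
V = 50.0       # volume of distribution (L)
ka = 1.2       # first-order absorption rate constant (1/h)
F = 0.3        # oral bioavailability
ke = CL / V    # elimination rate constant (1/h)

t = np.linspace(0, 12, 2000)

def oral_conc(D, t):
    """Plasma concentration after one oral dose D (standard Bateman form)."""
    return F * D * ka / (V * (ka - ke)) * (np.exp(-ke * t) - np.exp(-ka * t))

dose_iv, dose_oral = 1000.0, 3 * 2000.0   # single IV dose vs. 3 oral doses
auc_iv = dose_iv / CL                     # closed form: AUC = Dose / CL
auc_oral = F * dose_oral / CL             # closed form: AUC = F * Dose / CL

c = oral_conc(2000.0, t)                  # numeric check for one oral dose
auc_one_dose = float(np.sum((c[1:] + c[:-1]) / 2 * np.diff(t)))  # trapezoid
print(f"AUC IV: {auc_iv:.2f}, AUC oral: {auc_oral:.2f} "
      f"(numeric: {3 * auc_one_dose:.2f})")
```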

16.
Computational modeling of biological networks permits the comprehensive analysis of cells and tissues to define molecular phenotypes and novel hypotheses. Although a large number of software tools have been developed, the versatility of these tools is limited by mathematical complexities that prevent their broad adoption and effective use by molecular biologists. This study clarifies the basic aspects of molecular modeling, how to convert data into useful input, as well as the number of time points and molecular parameters that should be considered for molecular regulatory models with both explanatory and predictive potential. We illustrate the necessary experimental preconditions for converting data into a computational model of network dynamics. This model requires neither a thorough background in mathematics nor precise data on intracellular concentrations, binding affinities or reaction kinetics. Finally, we show how an interactive model of crosstalk between signal transduction pathways in primary human articular chondrocytes allows insight into processes that regulate gene expression.
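A minimal example of the kind of semi-quantitative network model the article advocates — one needing neither concentrations nor kinetic constants — is a Boolean network, where each node's next state is a logic function of its regulators. The pathway wiring below is a hypothetical toy, not the chondrocyte model itself.

```python
# Toy Boolean network: qualitative pathway crosstalk without kinetic
# parameters. The wiring below is hypothetical, not the chondrocyte model.
def step(s):
    return {
        "IL1": s["IL1"],                          # external input, held fixed
        "NFkB": s["IL1"],                         # activated by IL1
        "TGFb": s["TGFb"],                        # external input, held fixed
        "SMAD": s["TGFb"] and not s["NFkB"],      # crosstalk: NFkB inhibits
        "MMP13": s["NFkB"] and not s["SMAD"],     # matrix-degrading output
    }

state = {"IL1": True, "TGFb": True, "NFkB": False, "SMAD": False, "MMP13": False}
for i in range(4):                                # iterate to a fixed point
    state = step(state)
    print(i, state)
```

Even this toy shows the qualitative payoff: with both inputs on, NFkB suppresses SMAD, so the network settles into an MMP13-on state despite TGFb signaling.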

17.
ABSTRACT: BACKGROUND: Medical records accumulate data concerning patient health and the natural history of disease progression. However, methods to systematically mine this information in a form other than an electronic health record are not yet available. The purpose of this study was to develop an object modeling technique as a first step towards a formal database of medical records. METHODS: Live Sequence Charts (LSC) were used to formalize the narrative text obtained during a patient interview. LSCs utilize a visual scenario-based programming language to build object models. LSC extends the classical language of UML message sequence charts (MSC), predominantly through the addition of modalities and the provision of executable semantics. Interobject scenarios were defined to specify natural history event interactions and different scenarios in the narrative text. RESULTS: A simulated medical record was specified in LSC formalism by translating the text into an object model comprising a set of entities and events. The entities described the participating components (i.e., doctor, patient and record) and the events described the interactions between elements. A conceptual model is presented to illustrate the approach. An object model was generated from data extracted from an actual new-patient interview, where the individual was eventually diagnosed as suffering from Chronic Fatigue Syndrome (CFS). This yielded a preliminary formal designated vocabulary for CFS that provides a basis for future formalization of these records. CONCLUSIONS: Translation of medical records into object models created the basis for a formal database of the patient narrative that temporally depicts the events preceding disease, the diagnosis, and the treatment approach. The LSC object model of the medical narrative provided an intuitive, visual representation of the natural history of the patient's disease.
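LSC tooling aside, the core of the approach — representing the narrative as interacting entities exchanging time-ordered events — can be sketched with plain data structures. The entities, events, and CFS timeline below are hypothetical illustrations of the object model, not the paper's formal vocabulary.

```python
from dataclasses import dataclass, field

# Sketch of the narrative-as-object-model idea: entities (doctor, patient,
# record) exchange time-ordered events. All details are hypothetical.
@dataclass
class Event:
    time: int            # ordinal position in the narrative
    source: str          # entity that initiates the interaction
    target: str          # entity that receives it
    action: str          # e.g., "reports symptom", "records diagnosis"
    detail: str = ""

@dataclass
class MedicalNarrative:
    entities: set = field(default_factory=lambda: {"doctor", "patient", "record"})
    events: list = field(default_factory=list)

    def add(self, *args, **kwargs):
        self.events.append(Event(*args, **kwargs))

story = MedicalNarrative()
story.add(1, "patient", "doctor", "reports symptom", "persistent fatigue > 6 months")
story.add(2, "doctor", "patient", "orders test", "rule out anemia, thyroid disease")
story.add(3, "doctor", "record", "records diagnosis", "Chronic Fatigue Syndrome")

for e in sorted(story.events, key=lambda e: e.time):
    print(f"t{e.time}: {e.source} -> {e.target}: {e.action} ({e.detail})")
```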

18.
Although most statistical methods for the analysis of longitudinal data have focused on retrospective models of association, new advances in mobile health data have presented opportunities for predicting future health status by leveraging an individual's behavioral history alongside data from similar patients. Methods that incorporate both individual-level and sample-level effects are critical to using these data to its full predictive capacity. Neural networks are powerful tools for prediction, but many assume input observations are independent even when they are clustered or correlated in some way, such as in longitudinal data. Generalized linear mixed models (GLMM) provide a flexible framework for modeling longitudinal data but have poor predictive power particularly when the data are highly nonlinear. We propose a generalized neural network mixed model that replaces the linear fixed effect in a GLMM with the output of a feed-forward neural network. The model simultaneously accounts for the correlation structure and complex nonlinear relationship between input variables and outcomes, and it utilizes the predictive power of neural networks. We apply this approach to predict depression and anxiety levels of schizophrenic patients using longitudinal data collected from passive smartphone sensor data.
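The proposed model can be written compactly. In notation consistent with the abstract (my transcription, not the authors' exact symbols), the linear fixed effect of a GLMM is replaced by a feed-forward network f with weights θ, while the random effects retain the usual Gaussian structure:

```latex
% GLMM:   g(E[y_{ij} \mid b_i]) = x_{ij}^\top \beta + z_{ij}^\top b_i
% GNMM:   replace the linear fixed effect with a feed-forward network f:
g\bigl(E[y_{ij} \mid b_i]\bigr) = f(x_{ij};\, \theta) + z_{ij}^\top b_i,
\qquad b_i \sim \mathcal{N}(0, \Sigma)
```

Here i indexes patients, j repeated observations, and the random effects b_i capture the within-patient correlation that a plain neural network would ignore.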

19.
Data summarization and triage is one of the current top challenges in visual analytics. The goal is to let users visually inspect large data sets and examine or request data with particular characteristics. The need for summarization and visual analytics is also felt when dealing with digital representations of DNA sequences. Genomic data sets are growing rapidly, making their analysis increasingly more difficult and raising the need for new, scalable tools. For example, being able to look at very large DNA sequences while immediately identifying potentially interesting regions would provide the biologist with a flexible exploratory and analytical tool. In this paper we present a new concept, the "information profile", which provides a quantitative measure of the local complexity of a DNA sequence, independently of the direction of processing. Computing the information profiles is tractable: we show that it can be done in time proportional to the length of the sequence. We also describe a tool to compute the information profiles of a given DNA sequence, and use the genome of the fission yeast Schizosaccharomyces pombe strain 972 h− and five human chromosomes 22 for illustration. We show that information profiles are useful for detecting large-scale genomic regularities by visual inspection. Several discovery strategies are possible, including the standalone analysis of single sequences, the comparative analysis of sequences from individuals of the same species, and the comparative analysis of sequences from different organisms. The comparison scale can be varied, allowing users to zoom in on specific details or obtain a broad overview of a long segment. Software applications have been made available for non-commercial use at http://bioinformatics.ua.pt/software/dna-at-glance.
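The authors derive their information profiles from finite-context (Markov) model estimates; as a simplified stand-in that conveys the same idea — a per-position, efficiently computable measure of local sequence complexity — the sketch below computes a sliding-window Shannon entropy over a DNA string. The window size and sequence are arbitrary.

```python
import math
from collections import Counter

# Simplified stand-in for an "information profile": sliding-window Shannon
# entropy per position. The paper uses finite-context model estimates; this
# toy only conveys the idea of a cheap local-complexity measure.
def information_profile(seq, window=9):
    half = window // 2
    profile = []
    for i in range(len(seq)):
        chunk = seq[max(0, i - half): i + half + 1]
        counts = Counter(chunk)
        h = -sum((c / len(chunk)) * math.log2(c / len(chunk))
                 for c in counts.values())
        profile.append(h)          # low entropy -> repetitive/regular region
    return profile

seq = "ACGTACGTACGTAAAAAAAAAAACGTACGT"
for pos, h in enumerate(information_profile(seq)):
    print(pos, round(h, 2))       # the poly-A run shows up as an entropy dip
```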

20.
The recent improvements in mass spectrometry instruments and new analytical methods are increasing the intersection between proteomics and big data science. In addition, bioinformatics analysis is becoming increasingly complex and convoluted, involving multiple algorithms and tools. A wide variety of methods and software tools have been developed for computational proteomics and metabolomics during recent years, and this trend is likely to continue. However, most computational proteomics and metabolomics tools are designed as single-tiered software applications in which the analytics tasks cannot be distributed, limiting the scalability and reproducibility of the data analysis. In this paper the key steps of metabolomics and proteomics data processing, including the main tools and software used to perform the data analysis, are summarized. The combination of software containers with workflow environments for large-scale metabolomics and proteomics analysis is discussed. Finally, a new approach for reproducible and large-scale data analysis based on BioContainers and two of the most popular workflow environments, Galaxy and Nextflow, is introduced to the proteomics and metabolomics communities.
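The building block that workflow engines such as Galaxy or Nextflow orchestrate is the invocation of a single containerized tool with mounted data. The Python sketch below shows that step via the Docker CLI; the image name and tool command are hypothetical placeholders, not a specific BioContainers image.

```python
import subprocess
from pathlib import Path

# Sketch: run one pipeline step inside a software container via Docker.
# Image name and tool command are hypothetical placeholders.
def run_containerized_step(image, command, workdir):
    """Execute `command` in `image` with `workdir` mounted at /data."""
    return subprocess.run(
        ["docker", "run", "--rm",
         "-v", f"{Path(workdir).resolve()}:/data",  # mount the data directory
         image, *command],
        check=True,
    )

# Hypothetical step: convert a raw mass-spectrometry file inside a container.
run_containerized_step(
    image="example/msconvert:latest",             # placeholder image name
    command=["msconvert", "/data/sample.raw", "--mzML", "-o", "/data"],
    workdir=".",
)
```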
