Article

EHRs Connect Research and Practice: Where Predictive Modeling, Artificial Intelligence, and Clinical Decision Support Intersect


Abstract

Objectives: Electronic health records (EHRs) are only a first step in capturing and utilizing health-related data; the challenge is turning that data into useful information. Furthermore, EHRs increasingly include data on patient outcomes, functionality such as clinical decision support, and genetic information, and as such can be seen as repositories of increasingly valuable information about patients' health conditions and responses to treatment over time. Methods: We describe a case study of 423 patients treated by Centerstone within Tennessee and Indiana, in which we utilized electronic health record data to generate predictive algorithms of individual patient treatment response. Multiple models were constructed using predictor variables derived from clinical, financial, and geographic data. Results: Of the 423 patients, 101 deteriorated, 223 improved, and in 99 there was no change in clinical condition. Based on modeling of various clinical indicators at baseline, the highest accuracy in predicting individual patient response ranged from 70% to 72% across the models tested. In terms of individual predictors, the Centerstone Assessment of Recovery Level - Adult (CARLA) baseline score was most significant in predicting outcome over time (odds ratio 4.1 ± 2.27). Other variables with a consistently significant impact on outcome included payer, diagnostic category, location, and provision of case management services. Conclusions: This approach represents a promising avenue toward reducing the current gap between research and practice across healthcare, developing data-driven clinical decision support based on real-world populations, and serving as a component of embedded clinical artificial intelligences that "learn" over time.
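To make the Methods concrete, the sketch below shows one way such a treatment-response model could be set up: a three-class classifier (deteriorated / no change / improved) over baseline indicators such as the CARLA score, payer, diagnostic category, location, and case management, with cross-validated accuracy and coefficient odds ratios. This is a minimal illustration under stated assumptions, not the paper's actual pipeline; the file and column names are hypothetical.

```python
# A minimal sketch, assuming a prepared file "patients.csv" with one row
# per patient; all column names are hypothetical, not Centerstone's schema.
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

df = pd.read_csv("patients.csv")
X = df[["carla_baseline", "payer", "diagnostic_category",
        "location", "case_management"]]
y = df["response"]  # "deteriorated" | "no_change" | "improved"

pre = ColumnTransformer([
    ("num", StandardScaler(), ["carla_baseline"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"),
     ["payer", "diagnostic_category", "location", "case_management"]),
])
model = Pipeline([("pre", pre), ("clf", LogisticRegression(max_iter=1000))])

# Cross-validated accuracy, analogous to the 70-72% figures reported.
print(cross_val_score(model, X, y, cv=5, scoring="accuracy").mean())

# Odds ratios from the fitted coefficients (cf. the CARLA odds ratio).
model.fit(X, y)
print(np.exp(model.named_steps["clf"].coef_))
```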

... Or to put it another way: the real question is which patients have future trajectories (in terms of both costs and outcomes) that are actually changeable? In real-world clinical settings, we want actionable information [13]. A patient may be at risk, but there may be nothing we can do about it, or it may be too late to do so. ...
... While we are concerned with the technical development of an artificial intelligence (AI) system for diabetes here, we are also concerned with the "people side" of the equation and how such a system could integrate with existing practices of providers and patients within healthcare systems. The latter is key for successful implementation and user adoption [13,16]. ...
... A critical part of modeling real-world healthcare datasets is the creation of "meta-data" from the underlying raw data derived from backend databases, which falls under the concept of feature engineering. The goal is to create information by intersecting across data sources and fields, constructing "meaning" out of individual data fields that may otherwise lack it [13]. In healthcare, this often takes the form of combining subject matter expertise (SME) with analytic techniques. ...
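As a concrete illustration of the feature-engineering point in this excerpt, a minimal pandas sketch follows: raw clinical, financial, and geographic tables are intersected on a patient key and rolled up into derived "meta-data" features. All table and column names here are hypothetical.

```python
# A minimal sketch, assuming three hypothetical source tables keyed by
# patient_id; the derived features are illustrative "meta-data".
import pandas as pd

clinical = pd.read_csv("clinical.csv")    # patient_id, dx_code, visit_date
financial = pd.read_csv("claims.csv")     # patient_id, payer, cost
geographic = pd.read_csv("geo.csv")       # patient_id, county, urban_flag

df = (clinical.merge(financial, on="patient_id")
              .merge(geographic, on="patient_id"))
df["visit_date"] = pd.to_datetime(df["visit_date"])

# Intersect fields into higher-level signals no single source carries.
features = df.groupby("patient_id").agg(
    n_visits=("visit_date", "count"),
    cost_per_visit=("cost", "mean"),
    rural=("urban_flag", lambda s: int(s.iloc[0] == 0)),
)
```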
Article
Full-text available
The CDOI outcome measure – a patient-reported outcome (PRO) instrument utilizing direct client feedback – was implemented in a large, real-world behavioral healthcare setting in order to evaluate previous findings from smaller controlled studies. PROs provide an alternative window into treatment effectiveness based on client perception and facilitate detection of problems/symptoms for which there is no discernible measure (e.g. pain). The principal focus of the study was to evaluate the utility of the CDOI for predictive modeling of outcomes in a live clinical setting. Implementation factors were also addressed within the framework of the Theory of Planned Behavior by linking adoption rates to implementation practices and clinician perceptions. The results showed that the CDOI does contain significant capacity to predict outcome delta over time based on baseline and early change scores in a large, real-world clinical setting, as suggested in previous research. The implementation analysis revealed a number of critical factors affecting successful implementation and adoption of the CDOI outcome measure, though there was a notable disconnect between clinician intentions and actual behavior. Most importantly, the predictive capacity of the CDOI underscores the utility of direct client feedback measures such as PROs and their potential use as the basis for next generation clinical decision support tools and personalized treatment approaches.
... An interesting review of risk prediction models is presented in [9]. Machine learning and predictive models have also been used in pathology screening and detection [10], [11]. In [10], for instance, a set of data-mining models, including neural networks, Bayesian networks, random forests, decision trees, and logistic regression, was implemented using EHR data to predict individual patient treatment response. ...
... Machine learning and predictive models have also been used in pathology screening and detection [10], [11]. In [10], for instance, a set of data-mining models, including neural networks, Bayesian networks, random forests, decision trees, and logistic regression, was implemented using EHR data to predict individual patient treatment response. A comprehensive review of predictive data-mining applications in medicine is presented in [3]. ...
... Although this value is high compared to the real number of tests needed, it is significantly smaller than the actual number of tests prescribed in the center (about 18,500 tests) for this period under current practices. The optimal ECG rationing solution is x_j = 1, i.e., subclass j is given an ECG, for nodes 10, 14, 17, 23, 24, 25, 33, 34, 39, and 40, and y_j = 1, i.e., subclass j is not given an ECG, for nodes 3, 18, and 22, where leaves are highlighted. Note that the optimal ECG rationing is not always made at the leaf level, and both x_j = 1 and y_j = 1 are selected for some nonleaf nodes. ...
Article
Medical test selection is a recurring problem in health prevention and consists of proposing a set of tests to each subject for diagnosis and treatment of pathologies. The problem is characterized by the unknown risk probability distribution across the population and two contradictory objectives: minimizing the number of tests and giving the medical test to all at-risk populations. This article sets this problem in a general framework of chance-constrained medical test rationing with unknown subject distribution over an attribute space and unknown risk probability but with a given sample population. A new approach combining decision-tree and Bayesian inference is proposed to allocate relevant medical tests according to the subjects' profile. Case studies on screening of hypertension and diabetes are conducted, and the performance of the proposed approach is evaluated. Significant savings on unnecessary tests are achieved with limited numbers of subjects needing but not receiving necessary tests.
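A minimal sketch of how the decision-tree-plus-Bayesian-inference idea could be realized is given below: a tree partitions the sample population into subclasses (its leaves and nodes), a Beta posterior estimates each subclass's unknown risk probability, and tests are allocated where the estimated risk exceeds a tolerance. The prior, threshold, and column names are assumptions for illustration, not the paper's actual parameters.

```python
# A minimal sketch, assuming "sample_population.csv" holds subject
# attributes plus a 0/1 at_risk label; prior and tolerance are assumptions.
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv("sample_population.csv")
X, y = df.drop(columns="at_risk"), df["at_risk"]

# The tree partitions the sample population into subclasses (its leaves).
tree = DecisionTreeClassifier(max_depth=4, min_samples_leaf=50).fit(X, y)
leaf = tree.apply(X)  # subclass id for each subject

alpha0, beta0 = 1.0, 1.0  # uniform Beta prior on a subclass's risk probability
epsilon = 0.05            # tolerated probability of missing an at-risk subject

give_test = {}
for node in np.unique(leaf):
    k, n = int(y[leaf == node].sum()), int((leaf == node).sum())
    risk = (alpha0 + k) / (alpha0 + beta0 + n)  # Beta posterior mean
    give_test[node] = risk > epsilon  # x_j = 1 for risky subclasses, else y_j = 1
```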
... Electronic health records (EHRs) have become one of the most important sources of information about patients, providing insight into diagnoses [19] and prognoses [11], as well as assisting in the development of cost-effective treatment and management programs [1,12]. ...
... It is rewritten with respect to mode-n matricization (1). This is our decomposition goal in the rest of this paper. Solving the problem (1) while preserving privacy is technically challenging because the tensor residual term X − O inherently contains other hospitals' patient data that involve sensitive information. ...
... This assumption is reasonable because all hospitals aim to have the same phenotypes and share them with others. By assuming Eq. (3), the horizontal concatenation of the local factor matrices of patient mode A^(1)_k forms the (global) factor matrix A^(1) (Fig. 2): ...
Article
Tensor factorization models offer an effective approach to convert massive electronic health records into meaningful clinical concepts (phenotypes) for data analysis. These models need a large amount of diverse samples to avoid population bias. An open challenge is how to derive phenotypes jointly across multiple hospitals, in which direct patient-level data sharing is not possible (e.g., due to institutional policies). In this paper, we developed a novel solution to enable federated tensor factorization for computational phenotyping without sharing patient-level data. We developed secure data harmonization and federated computation procedures based on alternating direction method of multipliers (ADMM). Using this method, the multiple hospitals iteratively update tensors and transfer secure summarized information to a central server, and the server aggregates the information to generate phenotypes. We demonstrated with real medical datasets that our method resembles the centralized training model (based on combined datasets) in terms of accuracy and phenotypes discovery while respecting privacy.
... Electronic health records (EHRs) have become one of the most important sources of information about patients, providing insight into diagnoses [19] and prognoses [11], as well as assisting in the development of cost-effective treatment and management programs [1,12]. But meaningful use of EHRs is also accompanied by many challenges, for example, diversity of populations, heterogeneity of information, and data sparseness. ...
... The objective function of the tensor factorization with regularization terms for pairwise distinct constraints [28] is formulated as (1). It is rewritten with respect to mode-n matricization, where Π^(n) = A^(N) ⊚ … ⊚ A^(n+1) ⊚ A^(n−1) ⊚ … ⊚ A^(1). This is our decomposition goal in the rest of this paper. ...
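For readers parsing the notation in these excerpts, the underlying model is a standard CP tensor factorization; a hedged reconstruction of the objective and its mode-n matricized form (omitting the paper's pairwise-distinct regularization terms) is:

```latex
\min_{A^{(1)},\dots,A^{(N)}}\;
  \tfrac{1}{2}\bigl\| \mathcal{X} - [\![ A^{(1)},\dots,A^{(N)} ]\!] \bigr\|_F^2,
\qquad
X_{(n)} \approx A^{(n)} \bigl(\Pi^{(n)}\bigr)^{\top},
\qquad
\Pi^{(n)} = A^{(N)} \odot \cdots \odot A^{(n+1)} \odot A^{(n-1)} \odot \cdots \odot A^{(1)}
```

Here the Khatri–Rao (column-wise Kronecker) product, written ⊚ in the excerpts, appears as ⊙; in the federated setting each hospital k holds its own rows A^(1)_k of the patient-mode factor, and the ADMM procedure exchanges only summarized factor information with the server rather than patient-level data.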
Article
Objective: Diabetes is a major public health problem in the United States, affecting roughly 30 million people. Diabetes complications, along with the mental health comorbidities that often co-occur with them, are major drivers of high healthcare costs, poor outcomes, and reduced treatment adherence in diabetes. Here, we evaluate in a large state-wide population whether we can use artificial intelligence (AI) techniques to identify clusters of patient trajectories within the broader diabetes population in order to create cost-effective, narrowly focused case management intervention strategies to reduce the development of complications. Methods: This approach combined data from: 1) claims, 2) case management notes, and 3) social determinants of health from ∼300,000 real patients between 2014 and 2016. We categorized complications as five types: Cardiovascular, Neuropathy, Ophthalmic, Renal, and Other. Modeling was performed combining a variety of machine learning algorithms, including supervised classification, unsupervised clustering, natural language processing of unstructured care notes, and feature engineering. Results: The results showed that we can predict the development of diabetes complications roughly 83.5% of the time using claims data or social determinants of health data. They also showed that we can reveal meaningful clusters in the patient population related to complications and mental health, which can be used to design a cost-effective screening program, reducing the number of patients to be screened by 85%. Conclusion: This study outlines the creation of an AI framework to develop protocols to better address the mental health comorbidities that lead to complications development in the diabetes population. Future work is described that outlines potential lines of research and the need to better address the "people side" of the equation.
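The clustering step described here can be pictured with a short sketch: unsupervised clusters over claims-derived features, with screening then focused on the clusters enriched for complications. The feature names, cluster count, and enrichment rule below are assumptions for illustration, not the study's actual design.

```python
# A minimal sketch, assuming "diabetes_population.csv" carries
# claims-derived features plus an observed 0/1 complication flag;
# feature names, k=8, and the enrichment rule are illustrative.
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("diabetes_population.csv")
feats = ["age", "n_claims", "mental_health_dx", "a1c_last"]
Z = StandardScaler().fit_transform(df[feats])

df["cluster"] = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(Z)

# Screen only clusters enriched for complications, shrinking the screened
# population (the study reports an 85% reduction).
rates = df.groupby("cluster")["complication"].mean()
to_screen = df[df["cluster"].isin(rates[rates > rates.mean()].index)]
```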
... CDSS tools, both those based on expert system models and otherwise, have had a mixed history of success (Garg et al., 2005; Jaspers, Smeulers, Vermeulen, & Peute, 2011; Kawamoto, Houlihan, Balas, & Lobach, 2005). Many are based on evidence-based guidelines (typically derived from expert opinion or statistical averages) that prescribe a one-size-fits-all treatment regimen for every patient, or a standardized sequence of treatment options (Bauer, 2002; Bennett, Doub, & Selove, 2012; Green, 2008). However, real-world patients display individualized characteristics and symptoms that impact treatment effectiveness. ...
... For instance, these AI approaches can "learn" from the clinical data. Some of them can even evaluate the accuracy of their predictions/recommendations and further "learn" from their mistakes (Bennett & Doub, 2010; Bennett et al., 2011, 2012). Systems such as these can discover patterns that not even human experts may be aware of, and they can do so in an automated fashion. ...
Chapter
Artificial intelligence (AI) based tools hold potential to extend the current capabilities of clinicians, to deal with complex problems and ever-expanding information streams that stretch the limits of human ability. In contrast to previous generations of AI and expert systems, these approaches are increasingly dynamical and less computationalist – less about “rules” and more about leveraging the dynamic interplay of action and observation over time. The (treatment) choices we make change what we observe (clinically, or otherwise), which changes future choices, which affects future observations, and so forth. As humans (clinicians or otherwise), we leverage this fact every day to act “intelligently” in our environment. To best assist us, our clinical computing tools should approximate the same process. Such an approach ties to future developments across the broader healthcare space, e.g., cognitive computing, smart homes, and robotics.
... Scientific advances since the completion of the human genome project have confirmed that the genetic composition of individual humans has a significant role to play in predisposition to common diseases and therapeutic interventions. The traditional medicine model has relied on best practices emerging from large population studies and dictates a one-size-fits-all approach [1]. Although synthesized evidence is essential to demonstrate the overall safety and efficacy of medical approaches, it falls short in explaining the individual variations that exist among patients. ...
... Although there is an overwhelming amount of clinical and genomic data being captured and collected, the data are not being analyzed in a manner to produce actionable information [1]. As of yet, this represents lost opportunities for making improvements to personalized healthcare. ...
Article
Full-text available
Non-small cell lung cancer (NSCLC) constitutes the most common type of lung cancer and is frequently diagnosed at advanced stages. Clinical studies have shown that molecular targeted therapies increase survival and improve quality of life in patients. Nevertheless, the realization of personalized therapies for NSCLC faces a number of challenges, including the integration of clinical and genetic data and a lack of clinical decision support tools to assist physicians with patient selection. To address this problem, we used frequent pattern mining to establish the relationships between patient characteristics and tumor response in advanced NSCLC. Univariate analysis determined that smoking status, histology, EGFR mutation, and targeted drug were significantly associated with response to targeted therapy. We applied four classifiers to predict treatment outcome from EGFR-TKIs. Overall, the highest classification accuracy was 76.56% and the AUC was 0.76. The decision tree used a combination of EGFR mutations, histology, and smoking status to predict tumor response, and the output was both easily understandable and in keeping with current knowledge. Our findings suggest that support vector machines and decision trees are a promising approach for clinical decision support in patient selection for targeted therapy in advanced NSCLC.
... There are also an increasing number of people who are unaware that they are at risk of a chronic condition, and it would be helpful if such patients could be detected or diagnosed at an earlier stage. The key to achieving effective treatment and intervention is improved clinical decision-making. With regard to clinical practice, Ammerman, Smith, and Calancie (2014) and Bennett, Doub, and Selove (2012) noted that it took too long for research to be implemented to have real benefit for patients at large. There has to be a more effective and faster manner in which clinical practice can be improved. ...
... As new ways are being sought to further improve patient care, Bennett et al. (2012) proposed a data-driven healthcare approach that is able to narrow the gap between research and practice. This would mean requiring the use of massive and extensive amounts of data, which fortunately has been made possible due to HIT. ...
Chapter
Full-text available
Decision making is such an integral aspect of healthcare routine that the ability to make the right decisions at crucial moments can lead to patient health improvements. Evidence-based practice (EBP), the paradigm used to make those informed decisions, relies on the use of current best evidence from systematic research such as randomized controlled trials (RCTs). Limitations of the outcomes from RCTs, such as the "quantity" and "quality" of the evidence generated, have lowered healthcare professionals' confidence in using EBP. An alternative paradigm of practice-based evidence (PBE) has evolved, the key being evidence drawn from practice settings. Through the use of health information technology, electronic health records capture relevant clinical practice "evidence". A data-driven approach is proposed to capitalize on the benefits of EHRs. The issues of data privacy, security, and integrity are diminished by an information accountability concept. A data warehouse architecture completes the data-driven approach by integrating health data from multi-source systems, unique within the healthcare environment.
... typing dynamics). The data collection is not enforced as in a clinical trial or case-control study, but contains the types of messy data typically seen in real-world clinical datasets and electronic health records [24]. A primary goal of this paper is to evaluate how machine learning modeling of mood instability may work in real-world clinical care of bipolar patients, outside the scope of controlled clinical studies. ...
... normalizing the features and rebalancing the target classes. Performance was estimated using multiple performance metrics, based on 5-fold cross-validation, following standard machine learning guidelines [24]. Given that various types of analysis were performed, more specific details are provided in the Results (Section 3), where appropriate. ...
Article
Full-text available
Modeling smartphone keyboard dynamics as the foundation of an early warning system (EWS) for mood instability holds potential to expand the reach of healthcare beyond the traditional clinic walls, which may lead to better ongoing care for chronic mental illnesses such as bipolar disorder. Here, we investigate the feasibility of such a system using a real-world open-science dataset. In particular, we are interested in whether passive technology interaction patterns in real-world datasets reflect findings from more controlled research trials, and the implications for clinical care. Data from 328 people who downloaded an open-science app were analyzed using a variety of machine learning methods, including different modeling methods (random forests, gradient boosting, neural networks), different types of class rebalancing, and pre-processing techniques. The aim was to predict fluctuations in PHQ scores in the weeks before the fluctuation occurred. Various feature selection methods were also employed to identify the top features driving the predictive patterns (out of a total of 54 starting features). Results showed predictive accuracy of approximately 90%, similar to controlled research trials, while revealing a number of interesting features (e.g. PTSD and mood instability) that suggest future research avenues. The findings from our analysis appear to indicate that real-world interaction data from smartphones can be utilized as an EWS monitoring tool for mood disorders like bipolar disorder. We also discuss the broader applicability of ecological momentary assessment (EMA) approaches to connected systems combining different forms of pervasive technology interaction (smartphones, wearables, social robots) to track everyday health status.
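A minimal sketch of the evaluation pipeline described above (multiple model families, class rebalancing, several metrics under 5-fold cross-validation) might look like the following, assuming a prepared table of aggregated typing features and a binary PHQ-fluctuation label; SMOTE from imbalanced-learn is one plausible rebalancing choice, not necessarily the paper's.

```python
# A minimal sketch, assuming "keyboard_features.csv" holds the aggregated
# typing features plus a binary phq_fluctuation label; SMOTE is one
# plausible rebalancing choice, not necessarily the paper's.
import pandas as pd
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_validate

df = pd.read_csv("keyboard_features.csv")
X, y = df.drop(columns="phq_fluctuation"), df["phq_fluctuation"]

for clf in (RandomForestClassifier(), GradientBoostingClassifier()):
    pipe = Pipeline([("rebalance", SMOTE()), ("clf", clf)])  # rebalance inside CV folds
    scores = cross_validate(pipe, X, y, cv=5, scoring=["accuracy", "roc_auc", "f1"])
    print(type(clf).__name__,
          {m: round(scores[f"test_{m}"].mean(), 3)
           for m in ("accuracy", "roc_auc", "f1")})
```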
... Many fields use human-centered co-design studios to develop and pre-test technological options (Hetrick et al. 2018;Russell et al. 2020). Focus groups, studies, and surveys have suggested ways to improve technology in health care through involvement and supervision of caregiver(s), better visuals, more immediate chat options and feedback between patients and clinicians, gamification and incentives, reminder alarms/alerts (MacKintosh et al. 2019), and seamless integration of data from apps and mobile phones into patient portals (Bennett et al. 2012). ...
... All of these issues should be evaluated during the technology design process so appropriate encryption and data security features are built in (Luxton et al. 2012). Challenges include sensor precision, power, location, analytic procedure, communication, data acquisition, and processing, as well as performance issues between devices, accuracy of data versus gold standards, error rates, and other parameters (Patel et al. 2002). Sensor and wearable technologies have not been adopted significantly to date, partly due to security, affordability, user-friendliness, compatibility with EHR systems (Bennett et al. 2012; Dinh-Le et al. 2019), and financial and policy issues that have not been resolved even though virtual care codes were approved for some services in the U.S. in 2019. Manufacturers and healthcare providers also must consider certification and approval requirements related to US, European Union, and other international regulatory standards. ...
Article
Full-text available
Sensor, wearable, and remote patient monitoring technologies are typically used in conjunction with video and/or in-person care for a variety of interventions and care outcomes. This scoping review identifies clinical skills (i.e., competencies) needed to ensure quality care and approaches for organizations to implement and evaluate these technologies. The literature search focused on four concept areas: (1) competencies; (2) sensors, wearables, and remote patient monitoring; (3) mobile, asynchronous, and synchronous technologies; and (4) behavioral health. From 2846 potential references, two authors assessed abstracts for 2828 and, full text for 521, with 111 papers directly relevant to the concept areas. These new technologies integrate health, lifestyle, and clinical care, and they contextually change the culture of care and training—with more time for engagement, continuity of experience, and dynamic data for decision-making for both patients and clinicians. This poses challenges for users (e.g., keeping up, education/training, skills) and healthcare organizations. Based on the clinical studies and informed by clinical informatics, video, social media, and mobile health, a framework of competencies is proposed with three learner levels (novice/advanced beginner, competent/proficient, advanced/expert). Examples are provided to apply the competencies to care, and suggestions are offered on curricular methodologies, faculty development, and institutional practices (e-culture, professionalism, change). Some academic health centers and health systems may naturally assume that clinicians and systems are adapting, but clinical, technological, and administrative workflow—much less skill development—lags. Competencies need to be discrete, measurable, implemented, and evaluated to ensure the quality of care and integrate missions.
... An interesting review of risk prediction models is presented in [8]. Machine learning and predictive models have also been used in pathology screening and detection [9], [10]. In [9], for instance, a set of data-mining models, including neural networks, Bayesian networks, random forests, decision trees, and logistic regression, was implemented using EHR data to predict individual patient treatment response. ...
... Machine learning and predictive models have also been used in pathology screening and detection [9], [10]. In [9], for instance, a set of data-mining models, including neural networks, Bayesian networks, random forests, decision trees, and logistic regression, was implemented using EHR data to predict individual patient treatment response. A comprehensive review of predictive data-mining applications in medicine is presented in [2]. ...
Preprint
Medical test selection is a recurring problem in healthcare prevention and consists of proposing a set of tests to each subject for the diagnosis and treatment of pathologies. The problem is characterized by the unknown risk probability distribution across the population and two contradictory objectives: minimizing the number of tests and giving the medical test to all at-risk populations. This paper sets this problem in a more general framework of chance-constrained medical test rationing with unknown subject distribution over an attribute space and unknown risk probability but with a given sample population. A new approach combining decision trees and Bayesian inference is proposed to allocate relevant medical tests according to subjects' profiles. A case study on ECG allocation in health prevention is conducted and the performance of the proposed approach evaluated. The results show that significant savings on unnecessary tests can be achieved with limited numbers of subjects needing but not receiving ECG.
... However, the focus has been the prediction of particular medical conditions and diagnoses. Studies addressing the implementation of computer-aided decision-making and decision support systems in medicine can be found in [4], [8], [9]. Few studies have addressed test selection [2], [3], and to the best of our knowledge, machine learning-based strategies to solve this problem as part of the design of optimal preventive health evaluation programs have not been addressed. ...
... Few studies have addressed test selection [2], [3], and to the best of our knowledge, machine learning-based strategies to solve this problem as part of the design of optimal preventive health evaluation programs have not been addressed. However, a clear trend towards data-driven applications and the use of artificial intelligence in this field can be noticed [4], [10], [8], [5], [9]. To fill some gaps in data-driven decision support and health prevention program design, the present study introduces two machine learning strategies for subject profiling and medical test selection. ...
... A data-driven approach will facilitate the analysis of large volumes of time-series data useful for pattern discovery and predictive modelling [2]. However, emphasis must be placed on reliable data and the aggregation of data from large and diverse populations to produce reliable and replicable findings [1]. The Institute of Medicine (IOM) envisions future clinical decision making that leverages "personal health knowledge bases" to support practitioners in the aggregation, integration and transformation of information into actionable decisions [14]. ...
Article
Full-text available
A commitment in 2010 by the Australian Federal Government to spend $466.7 million on the implementation of personally controlled electronic health records (PCEHR) heralded a shift to a more effective and safer patient-centric eHealth system. However, deployment of the PCEHR has met with much criticism, emphasised by poor adoption rates over the first 12 months of operation. An indifferent response by the public and healthcare providers, largely sceptical of its utility and safety, speaks to the complex sociotechnical drivers and obstacles inherent in embedding large (national) scale eHealth projects. With government efforts to inflate consumer and practitioner engagement numbers giving rise to further consumer disillusionment, broader utilitarian opportunities available with the PCEHR are at risk. This paper discusses the implications of establishing the PCEHR as the cornerstone of a holistic eHealth strategy for the aggregation of longitudinal patient information. A viewpoint is offered that the real value in patient data lies not just in the collection of data but in the integration of this information into clinical processes within the framework of a commoditised data-driven approach. Consideration is given to the eHealth-as-a-Service (eHaaS) construct as a disruptive next step for co-ordinated, individualised healthcare in the Australian context.
... Real-world patient populations are notoriously different from those seen in controlled, experimental studies, which in a healthcare sense also necessitates certain practice-based evidence approaches [22,23]. It is therefore important to empirically study how SARs might be used as tools for preventive healthcare in home environments, as technologies that can improve people's quality of life by affecting their health status over time as part of their everyday lives. ...
Conference Paper
Full-text available
This paper presents the results of a pilot study measuring and evaluating the intervention effects of voluntary in-home use of a socially assistive robot by older adults diagnosed with depression. The study was performed with 8 older adult patients over the course of one month, during which participants were provided the robot to use as they desired in their own homes. During the in-home study, several types of data were collected, including robotic sensor data from a collar worn by the robot, daily activity levels via a wristband (Jawbone) worn by the older adults, and weekly health outcome measures. Results of the data analysis suggest that: 1) use of the Paro robot in participants' homes significantly reduced the symptoms of depression for a majority of patients, and 2) weekly fluctuations in patient depression levels can be predicted using a combination of robotic sensor data and Jawbone activity data (i.e., measuring their general activity levels and their interactions with the robot).
... Approximately half of all primary clinics had not implemented a basic EHR system by the end of 2015 (Office of the National Coordinator for Health Information Technology, 2015). Continuous improvement and understanding are critical as healthcare companies continue to transform and transition toward modern EHR technology (Bennett, Doub, & Selove, 2012). Therefore, we established the following research question: What are rural primary care physicians' and physician assistants' lived experiences and perceptions of complex adaptive systems as they pertain to overcoming barriers to implementing electronic health records? ...
Article
Medicare-eligible physicians at primary care practices (PCP) that did not implement an electronic health record (EHR) system by the end of 2015 faced stiff penalties. One year prior to the 2015 deadline, approximately half of all primary clinics had not implemented a basic EHR system. The purpose of this phenomenological study was to explore rural primary care physicians' and physician assistants' experiences regarding overcoming barriers to implementing EHRs. Complex adaptive systems formed the conceptual framework for this study. Data were collected through face-to-face interviews with a purposeful sample of 21 physicians and physician assistants across 2 rural PCPs in the southeastern region of Missouri. Participant perceptions were elicited regarding overcoming barriers to implementing EHR systems as mandated by federal legislation. Interviews were transcribed and processed through qualitative software to discern themes of how rural PCP physicians and physician assistants might overcome barriers to implementing electronic health records. Through the exploration of the narrative segments, 4 emergent themes were common among the participants: (a) limited finances to support EHRs, (b) health information exchange issues, (c) lack of business education, and (d) lack of change management at rural medical practices. This study may provide rural primary care physicians and administrators with strategies to promote the adoption of EHRs, provide cost-efficient business services, and improve change management plans.
... As a complementary approach to the well-known evidence-based practice (EBP) or evidence-based medicine (EBM), practice-based evidence (PBE) is an approach to decision making in which meaningful evidence of clinical practices performed by healthcare professionals as part of their care routine is captured and stored in electronic health records, and then used to support and inform clinical decision making toward the care of individual patients [5]. One of the major constraints of EBP has been the difficulty in applying the evidence gathered from it to actual patients [6][7][8], especially those with multiple chronic conditions. This is partly due to the reliance on randomised controlled trials (RCTs) as the main preferred method of generating clinical evidence [9,10] to direct treatments or interventions, which itself has considerable limitations. ...
Conference Paper
The adoption of Electronic Health Record (EHR) systems has been widespread both locally and globally. The result of such adoptions has been the generation of huge amounts of digital healthcare data, assets which are valuable for providing better care and management of patients. While studies conducted on the secondary use of EHR data have found it to be beneficial, such use is still in its infancy. As such, a complementary approach of practice-based evidence (PBE) to decision making, which leverages EHR data as practical clinical evidence, has been proposed. As part of evaluating the feasibility of the PBE approach to decision making, this paper aims to study the perceptions Singapore doctors have of the clinical benefits of using EHR systems and the usefulness of EHR data in assisting with decision making. The findings from this study will aid in understanding the potential of utilising EHR data as practical clinical evidence in the PBE approach to decision making.
... When developed and implemented properly, CDS has the ability to process large amounts of complex data, such as WGS data, and present actionable, evidence-based recommendations to clinicians at the point of care [15]. In doing so, CDS has been shown to be effective in reducing errors, improving clinician performance, and ultimately improving the quality of care in clinical settings [16]. ...
Article
The ease with which whole genome sequence (WGS) information can be obtained is rapidly approaching the point where it can become useful for routine clinical care. However, significant barriers will inhibit widespread adoption unless clinicians are able to effectively integrate this information into patient care and decision-making. Electronic health records (EHR) and clinical decision support (CDS) systems may play a critical role in this integration. A previously published set of technical desiderata focused primarily on the integration of genomic data into the EHR. This manuscript extends the previous desiderata by specifically addressing needs related to the integration of genomic information with CDS. The objective of this study is to develop and validate a guiding set of technical desiderata for supporting the clinical use of WGS through CDS. A panel of domain experts in genomics and CDS developed a proposed set of seven additional requirements. These desiderata were reviewed by 63 experts in genomics and CDS through an online survey and refined based on the experts' comments. These additional desiderata provide important guiding principles for the technical development of CDS capabilities for the clinical use of WGS information.
... The aim is to sort out which citizens really need EMS, predict diseases, and react early enough if something critical happens [42][43][44]. If predictive analytics can show that a citizen's health risk level is rising, a physician can recommend at-home care and start appropriate treatments before something critical happens [45]. The analyzed data can also show when a citizen does not need EMS and should instead be guided to social welfare services for the help they need [46]. ...
Article
Full-text available
The field of emergency medical services (EMS) is a challenging environment for ensuring fluent information exchange between stakeholders because several different kinds of organizations are involved in EMS missions. Solutions for information and communication technology can vary significantly depending on the organization. This study aims to identify current communication bottlenecks between EMS professionals, understand the technological challenges behind them, and describe technologies that can improve EMS communication in the future. Information for the study about current EMS processes, technologies, and technology needs was collected from EMS professionals during three workshops, five personal interviews, and one email questionnaire. All surveyed health care professionals were working in the county of Northern Ostrobothnia. Information about proposed technologies for EMS was obtained from literature and interviews with five technology companies. The principal problem in EMS communication is scattered health data. This leads to a lack of common situational awareness for professionals and incomplete medical histories for patients. The reasons behind those problems are different information systems which do not communicate with each other and the lack of a common electronic patient care record (ePCR) for use by stakeholders. Personal health measurements, sensors, telemedicine, and artificial intelligence will create opportunities for further improving the flow of communication in EMS, provided those tools can be integrated into decision-making systems.
... Providing decision support to clinicians results in improved decision making, leading to improved quality and efficiency in patient care [31]. Implementing clinical support rule(s) and monitoring compliance with the rules are among the core set of objectives to demonstrate meaningful use of EHRs by clinical organizations [32]. We propose an automated EHR method to recognize dysglycemia, to coherently align dispersed actionable glucose data, and to provide clinical recommendations through a messaging system at the point of care. ...
Article
Multiple factors hinder the management of diabetes in hospitals. Amid the demands of practice, health care providers must collect, collate, and analyze multiple data points to optimally interpret glucose control and manage insulin dosing. Such data points are commonly dispersed in different sections of electronic health records (EHR), and the systems for data display and physician interaction with the EHR are often poorly conducive to seamless clinical decision making. In this perspective article, we examine challenges in the process of EHR data retrieval, interpretation, and decision making, using glucose management as an exemplar. We propose a conceptual, systems-based design for closing the loop between data gathering, analysis, and decision making in the management of inpatient diabetes. This concept capitalizes on attributes of the EHR that can enable automated recognition of cases and provision of clinical recommendations.
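The closing-the-loop idea can be reduced to a toy rule: scan glucose values gathered from dispersed EHR sections, flag dysglycemia, and emit a point-of-care message. The sketch below uses the commonly cited inpatient cut-offs of <70 mg/dL and >180 mg/dL, and every name in it is illustrative rather than taken from the article.

```python
# A toy sketch: all names and the alert wording are hypothetical; the
# <70 / >180 mg/dL cut-offs follow common inpatient glycemic targets.
from dataclasses import dataclass

@dataclass
class GlucoseReading:
    patient_id: str
    mg_dl: float

def dysglycemia_alerts(readings):
    """Flag dispersed glucose values and return point-of-care messages."""
    alerts = []
    for r in readings:
        if r.mg_dl < 70:
            alerts.append((r.patient_id, "hypoglycemia: review insulin dosing"))
        elif r.mg_dl > 180:
            alerts.append((r.patient_id, "hyperglycemia: consider dose adjustment"))
    return alerts

print(dysglycemia_alerts([GlucoseReading("p1", 62.0), GlucoseReading("p2", 210.0)]))
```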
... In their work, Bennett et al. (2012) recognize the potential of transforming the EHR into a decision support system through predictive modeling [45]. More specifically, the EHR likely contains data and functionality that can support computational approaches in the health domain. ...
Article
Background: According to European legislation, a clinical trial is research involving patients, which also includes a research end-product. The main objective of the clinical trial is to prove that the research product, i.e., a proposed medication or treatment, is effective and safe for patients. The implementation, development, and operation of a patient database, which will function as a matrix of samples with the appropriate parameterization, may provide appropriate tools to generate samples for clinical trials. Aim: The aim of the present work is to review the literature with respect to up-to-date progress on the development of databases for clinical trials and patient recruitment using free and open-source software in the field of endocrinology. Methods: An electronic literature search was conducted by the authors covering 1984 to June 2019. Original articles and systematic reviews were selected, the titles and abstracts of papers were screened to determine whether they met the eligibility criteria, and the full texts of the selected articles were retrieved. Results: The present review indicates that electronic health records relate to both patient recruitment and decision support systems in the domain of endocrinology. Free and open-source software provides integrated solutions concerning electronic health records, patient recruitment, and decision support systems. Conclusions: Patient recruitment relates closely to the electronic health record. There is maturity at the academic and research level, which may lead to good practices for the deployment of the electronic health record in selecting the right patients for clinical trials.
... Models were generally run using the default parameters in scikit-learn. Performance was estimated using multiple performance metrics (e.g., accuracy and AUC), based on 5-fold cross-validation, following standard machine learning guidelines [51,52]. In order to predict the game-state target using the standard ML approach, feature data were "collapsed" into aggregated data across each 15-minute interaction by calculating averages/percentages for each feature across the entire window, resulting in a single row of data for each game-state target. ...
Article
Full-text available
The development of new approaches for creating more “life-like” artificial intelligence (AI) capable of natural social interaction is of interest to a number of scientific fields, from virtual reality to human–robot interaction to natural language speech systems. Yet how such “Social AI” agents might be manifested remains an open question. Previous research has shown that both behavioral factors related to the artificial agent itself as well as contextual factors beyond the agent (i.e., interaction context) play a critical role in how people perceive interactions with interactive technology. As such, there is a need for customizable agents and customizable environments that allow us to explore both sides in a simultaneous manner. To that end, we describe here the development of a cooperative game environment and Social AI using a data-driven approach, which allows us to simultaneously manipulate different components of the social interaction (both behavioral and contextual). We conducted multiple human–human and human–AI interaction experiments to better understand the components necessary for creation of a Social AI virtual avatar capable of autonomously speaking and interacting with humans in multiple languages during cooperative gameplay (in this case, a social survival video game) in context-relevant ways.
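The data-driven side of this setup resembles the "collapsing" step described in the excerpt above; a minimal version, assuming each 15-minute interaction is logged as one session with a game-state label, is:

```python
# A minimal sketch, assuming each 15-minute interaction is logged as one
# session_id with a game_state label; file and column names are assumed.
import pandas as pd

df = pd.read_csv("interaction_log.csv")  # session_id, game_state, feature columns
collapsed = (df.groupby(["session_id", "game_state"])
               .mean(numeric_only=True)   # averages across the whole window
               .reset_index())            # one row per game-state target
```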
... The rapid adoption of electronic health record (EHR) systems incentivized by the Health Information Technology for Economic and Clinical Health (HITECH) Act of 2009 has enabled the digital transformation of clinical data and large-scale data-driven research [1,2]. In particular, the longitudinal, voluminous, and dense data offered by the EHR fuel the development of modern machine learning (ML) techniques for predicting disease trajectories and health outcomes, offering unique opportunities for real-time clinical decision support, risk management, and personalized patient monitoring [3,4]. In alignment with the vision of evidence-based care and precision medicine, medical decisions can be tailored to the individual patient by leveraging predictive models trained on longitudinal EHR data [5]. ...
Preprint
Full-text available
Translation of predictive modeling algorithms into routine clinical care workflows faces challenges in the form of varying data quality-related issues caused by the heterogeneity of electronic health record (EHR) systems. To better understand these issues, we retrospectively assessed and compared the variability of data produced from two different EHR systems. We considered three dimensions of data quality in the context of EHR-based predictive modeling for three distinct translational stages: model development (data completeness), model deployment (data variability), and model implementation (data timeliness). The case study was conducted based on predicting post-surgical complications using both structured and unstructured data. Our study discovered a consistent level of data completeness, high syntactic variability, and moderate-to-high semantic variability across the two EHR systems, for which the quality of data is context-specific and closely related to the documentation workflow and the functionality of individual EHR systems.
... The completion of the human genome project [14] has allowed a move from the traditional medical model of targeting a large population with a one-size-fits-all approach [15] towards personalized therapies. Information from genomic and genetic data provides new opportunities for patient care, prevention, and diagnosis [16]. ...
Preprint
Full-text available
Lung cancer caused by mutations in the epidermal growth factor receptor (EGFR) is a major cause of cancer deaths worldwide. EGFR tyrosine kinase inhibitors (TKIs) have been developed and have shown increased survival rates and quality of life in clinical studies. However, drug resistance is a major issue, and treatment efficacy is lost after about a year. Therefore, predicting the response to targeted therapies for lung cancer patients is a significant research problem. In this work, we address this issue and propose a personalized model to predict the drug response of lung cancer patients. This model uses clinical information, geometrical properties of the drug binding site, and the binding free energy of the drug-protein complex. The proposed model achieves state-of-the-art performance with 97.5% accuracy, 100% recall, 95% precision, and a 96.3% F1-score with a random forest classifier. This model can also be tested on other types of cancer and diseases, and we believe that it may help in making optimal clinical decisions for treating patients with targeted therapies.
... The adoption of electronic health record (EHR) systems has simultaneously changed clinical practice and expanded the breadth of biomedical research. For clinical research studies, EHR data are used alone or integrated with other established data sources such as registries, genomic data from biobanks, and administrative databases [1][2][3][4][5][6][7]. EHR clinical data typically include diagnostic billing codes, laboratory orders and results, procedure codes, and medication prescriptions. ...
Article
Full-text available
The increasing availability of electronic health record (EHR) systems has created enormous potential for translational research. However, it is difficult to know all the relevant codes related to a phenotype due to the large number of codes available. Traditional data mining approaches often require the use of patient-level data, which hinders the ability to share data across institutions. In this project, we demonstrate that multi-center, large-scale code embeddings can be used to efficiently identify relevant features related to a disease of interest. We constructed large-scale code embeddings for a wide range of codified concepts from EHRs from two large medical centers. We developed knowledge extraction via sparse embedding regression (KESER) for feature selection and integrative network analysis. We evaluated the quality of the code embeddings and assessed the performance of KESER in feature selection for eight diseases. In addition, we developed an integrated clinical knowledge map combining embedding data from both institutions. The features selected by KESER were comprehensive compared to lists of codified data generated by domain experts. Features identified via KESER resulted in comparable performance to those built upon features selected manually or with patient-level data. The knowledge map created using an integrative analysis identified disease-disease and disease-drug pairs more accurately compared to those identified using single-institution data. Analysis of code embeddings via KESER can effectively reveal clinical knowledge and infer relatedness among codified concepts. KESER bypasses the need for patient-level data in individual analyses, providing a significant advance in enabling multi-center studies using EHR data.
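The core of KESER's sparse embedding regression can be sketched in a few lines: regress the target code's embedding vector on all other code embeddings with a lasso penalty, and take the codes receiving nonzero weights as the selected related features. The files, the example code identifier, and the penalty strength below are assumptions for illustration.

```python
# A minimal sketch, assuming precomputed embeddings aligned with a code
# list; the target code and alpha are illustrative.
import numpy as np
from sklearn.linear_model import Lasso

embeddings = np.load("code_embeddings.npy")   # (n_codes, dim), assumed precomputed
codes = open("codes.txt").read().split()      # aligned code identifiers

target = codes.index("PheCode:250.2")         # hypothetical target disease code
X = np.delete(embeddings, target, axis=0).T   # columns: other codes' vectors
y = embeddings[target]                        # response: target code's vector

fit = Lasso(alpha=0.01).fit(X, y)
others = [c for i, c in enumerate(codes) if i != target]
selected = [c for c, w in zip(others, fit.coef_) if w != 0.0]  # related features
```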
... It is important for developers of technologies to apply best practices in user experience to account for user distraction, connectivity problems, limited processing power, and differences in input devices (e.g., smaller buttons) that may cause errors. Data from apps and mobile phones need to be seamlessly integrated into electronic health record (EHR) portals, but this has not been widely achieved (Bennett et al., 2012), despite many start-up organizations' enabling technologies (Dinh-Le et al., 2019). Based on a thorough review of the extant usability literature, PACMAD (People at the Centre of Mobile Application Development) proposed a usability model designed to address the limitations of existing usability models when applied to mobile devices (Harrison et al., 2013). ...
Article
Full-text available
Sensors and wearables measure physiological and behavioral data in real time for behavioral health, using a variety of methods, interventions, and outcomes. A six-stage scoping review of 10 literature databases focused on keywords in four concept areas: (1) mobile technologies; (2) sensors, wearables, and remote monitoring; (3) mood and anxiety disorders, as well as stress; and (4) behavioral health care. Two authors independently screened results based on titles and abstracts, reviewed the full-text articles, and used inclusion/exclusion criteria to find research that studied self-report or management of symptoms and interventions. Out of a total of 5468 potential references, 76 papers were selected and an additional 16 studies were discovered in references. Of the 92 studies, 54 (58.7%) focused on mood (depressive, N = 28; bipolar, N = 26), 18 (19.6%) on anxiety disorders, and 20 (21.7%) on psychological stress/stress disorders. There were 7 (7.6%) randomized controlled trials, and 31 (33.7%) comparison studies. Research is shifting toward standardized methods, interventions, and evaluation measures, with longitudinal correlation, prediction, and/or biomarking/digital phenotyping of patients’ outcomes. These technologies pose several challenges for users, clinicians (e.g., selection, training, skills), healthcare systems (e.g., technology, integration into workflow, privacy), and organizations (e.g., training, creating a professional e-culture, change). Future research is needed on clinical health outcomes; human–computer interaction; medico-legal, professional, and privacy policy issues; models of service delivery; and effectiveness at a population level, across cultures, and related to economic costs. Clinician and institutional competencies could ensure quality of care, integration of missions, and institutional change.
... In the following decades, we have witnessed improvements in computing power (Koomey et al., 2010), the abundance of data thanks to the Internet (Rajaraman and Ullman, 2011), and noticeable advances in the fields of computer vision (Dougherty, 2009) and natural language processing (Banko and Brill, 2001; Mikolov et al., 2013; Brown et al., 2020). These have led to a large number of applications of AI in healthcare, including, but not limited to, radiology (Li et al., 2020; Chockley and Emanuel, 2016), screening (Patcas et al., 2019; McKinney et al., 2020), psychiatry (Graham et al., 2019; Fulmer et al., 2018), primary care (Blease et al., 2019; Liyanage et al., 2019), disease diagnosis (Alić et al., 2017), telehealth (Pacis et al., 2018), analysis of electronic health records (Bennett et al., 2012), prediction of drug interactions and creation of new drugs (Bokharaeian et al., 2016; Christopoulou et al., 2020; Zhou et al., 2018), prediction of injuries of football players (Borchert and Schnackenburg, 2020), and others. ...
Chapter
Full-text available
Artificial intelligence (AI) is the next step of the industrial revolution. It aims to automate human or manual decision making. AI has started to disrupt nearly every industry, including healthcare. However, we have just started to scratch the surface, as there are many more AI opportunities for healthcare that will allow us to improve patient care while cutting waiting times and costs. In this chapter, we provide an introduction to AI and its applications in healthcare. We then examine possible future opportunities for how AI could skyrocket healthcare. Next, we look at the challenges in and around AI research, the impact of AI on our society, fears, education, and the need for data literacy for everyone, including physicians and patients. We also discuss how these challenges could be solved. This chapter also serves as a foundation for other book chapters that present further AI applications in healthcare.
Article
Present-day society shows keen interest in the field of medical treatment, and the diagnostic mode is now developing toward doctor-patient shared decision-making. A patient's source of medical information is therefore quite important, and that source needs to be reliable, accurate, and easily accessible. Ensuring that informational sources meet these requirements becomes a challenge as the informational network develops, the amount of material available online steadily increases, and the general public becomes more aware of health- and medical-treatment-related information. Focusing on the medical information seeker, this paper discusses two user identities: patients and healthcare professionals. For patients, online medical articles are a major source of medical information; patients with concerns about diseases often search for their symptoms on the Internet and look for related medical information. However, online medical articles are usually long, so patients sometimes self-diagnose their disease or judge the severity of their condition based on only part of an article, or on limited, incomplete, or even inaccurate information in several articles related to the symptoms searched. Consequently, patients may misdiagnose their condition or underestimate its severity and delay treatment. In addition, medical technology advances rapidly, so physicians and other healthcare professionals must obtain the latest medical information from the Internet; however, searching for and reading professional, in-depth medical articles to find required, critical information online is time-consuming, creating a time-management challenge. To address these problems, this paper develops an Automatic Key Medical Information Generating model, uses medical articles as the basis of analysis, and designs a key-information-generating methodology applicable to medical article retrieval and reading. Word segmentation is implemented according to the Chinese Knowledge and Information Processing (CKIP) system of Academia Sinica, and the medical articles are then distributed to various clusters by the model's clustering technology, so that the medical information seeker can rapidly search for the required medical article information. When the medical information seeker finds the target medical article, the article's key statements are screened out by the keyword rule base created in this paper, and key statement scores are calculated. The key information is sequenced according to the key statements so as to generate the medical article key information table. In addition, a web-based key-medical-information-generating system is built based on the proposed model, and the effectiveness and feasibility of the model and technology are evaluated using a real-world case. In summary, this paper presents a model that analyzes the keywords and key statements of medical articles to generate a medical article key information table. This model can help the medical information seeker look for required health information rapidly and accurately on the Internet, shortening the time for screening medical information and increasing the probability of obtaining the required information.
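A minimal sketch of the key-statement scoring step described above: sentences are scored against a keyword rule base, and the top scorers form the key-information table. The rule base, weights, and sample text are invented, and the paper's CKIP Chinese word segmentation is replaced here by simple whitespace tokenization.

```python
# Sketch: scoring sentences against a keyword rule base to build a
# key-information table. The rule base and article text are invented;
# the paper itself segments Chinese text with CKIP, omitted here.
import re

keyword_weights = {"symptom": 3.0, "diagnosis": 3.0, "treatment": 2.5,
                   "dose": 2.0, "risk": 1.5}

def key_statements(article: str, top_k: int = 3):
    sentences = [s.strip() for s in re.split(r"[.!?]", article) if s.strip()]
    scored = []
    for s in sentences:
        words = s.lower().split()
        score = sum(keyword_weights.get(w, 0.0) for w in words)
        scored.append((score, s))
    scored.sort(key=lambda t: t[0], reverse=True)
    return scored[:top_k]

text = ("Early diagnosis improves outcomes. The main symptom is fatigue. "
        "Treatment usually starts at a low dose. The weather was pleasant.")
for score, sent in key_statements(text):
    print(f"{score:4.1f}  {sent}")
```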
Chapter
The healthcare industry is one of the most attractive domains in which to realize actionable knowledge discovery objectives. This chapter studies recent research on knowledge discovery and data mining applications in the healthcare industry and proposes a new classification of these applications. Studies show that knowledge discovery and data mining applications in the healthcare industry can be classified into three major classes: patient view, market view, and system view. The patient view includes papers that performed pure data mining on healthcare industry data. The market view includes papers that treated patients as customers. The system view includes papers that developed a decision support system. The goal of this classification is to identify research opportunities and gaps for researchers interested in this context.
Chapter
Decision making is such an integral aspect of routine health care that the ability to make the right decisions at crucial moments can lead to patient health improvements. Evidence-based practice (EBP), the paradigm used to make those informed decisions, relies on current best evidence from systematic research such as randomized controlled trials (RCTs). Limitations of the outcomes from RCTs, such as the "quantity" and "quality" of the evidence generated, have lowered healthcare professionals' confidence in using EBP. An alternate paradigm of practice-based evidence has evolved, its key being evidence drawn from practice settings. Through the use of health information technology, electronic health records capture relevant clinical practice "evidence". A data-driven approach is proposed to capitalize on the benefits of EHRs. The issues of data privacy, security, and integrity are diminished by an information accountability concept. A data warehouse architecture completes the data-driven approach by integrating health data from multi-source systems, unique within the healthcare environment.
Preprint
Objective: The increasing availability of electronic health record (EHR) systems has created enormous potential for translational research. Even with a working knowledge of EHRs, it is difficult to know all the relevant codes related to a phenotype due to the large number of codes available. Traditional data mining approaches often require the use of patient-level data, which hinders the ability to share data across institutions to establish a cooperative and integrated knowledge network. In this project, we demonstrate that multi-center large-scale code embeddings can be used to efficiently identify relevant features related to a disease or condition of interest. Method: We constructed large-scale code embeddings for a wide range of codified concepts, including diagnosis codes, medications, procedures, and laboratory tests, from EHRs from two large medical centers. We developed knowledge extraction via sparse embedding regression (KESER) for feature selection and integrative network analysis based on the trained code embeddings. We evaluated the quality of the code embeddings and assessed the performance of KESER in feature selection for eight diseases. In addition, we developed an integrated clinical knowledge map combining embedding data from both institutions. Results: The features selected by KESER were comprehensive compared to lists of codified data generated by domain experts. Additionally, features identified automatically via KESER and used in the development of phenotype algorithms resulted in performance comparable to algorithms built upon features selected manually or identified via existing feature selection methods with patient-level data. The knowledge map created using an integrative analysis identified disease-disease and disease-drug pairs more accurately than those identified using single-institution data. Conclusion: Analysis of code embeddings via KESER can effectively reveal clinical knowledge and infer relatedness among diseases, treatments, procedures, and laboratory measurements. This approach automates the grouping of clinical features, facilitating studies of a condition. KESER bypasses the need for patient-level data in individual analyses, providing a significant advance in enabling multi-center studies using EHR data.
Article
Introduction: This study builds upon prior knowledge to integrate data from an EHR system to investigate the effect of EHR implementation on patient flow within a pediatric practice. We compare pre-implementation administrative data from a practice management system with paper-based documents against post-implementation data from a cloud-based EHR system. Methods: This study reports on visits to one clinic within a network of eleven pediatric clinics during the period April 16, 2012 to April 15, 2014. Results: 2448 independent patient visits were used in the study: 838 pre-implementation visit records for April 16, 2012 to May 15, 2012, 789 visit records for April 16, 2013 to May 15, 2013, and 821 visit records for April 16, 2014 to May 15, 2014. Overall mean process time increased to 81.43 min immediately after implementation of the new EHR system, followed post-implementation by a decrease (16.83 min) in time from check-in to check-out. Discussion: There were significant improvements in patient flow relative to initial EHR adoption; such improvements resulted in gains in operational efficiency in several steps within the process. Conclusion: Findings suggest that the effective use of knowledge-sharing among employees, in complement with EHR training, cannot be overlooked. While expected gains in operational efficiency may initially be achieved within some steps of the process, sustained overall gains can only be accomplished by overcoming the barriers and challenges to organizational learning.
Article
Medical knowledge is disseminated and shared without boundary thanks to the free and convenient sharing of medical documents. However, improper vocabulary and colors in medical documents tend to have an adverse impact on the perception and emotions of medical knowledge demanders. Hence, this paper develops a "Medical Documents Rewriting Model based on Medical Knowledge Demanders' Feelings and Emotions" and analyzes the provocative words and negative colors of medical articles. The words and colors of the target medical article are rewritten by calculating synonyms and suitable color codes. This paper also establishes a web-based medical document rewriting system and conducts a case study to verify the feasibility of the model. The verification results show that when the system maintains about 600 medical documents, the average satisfaction score improves to 3.81 (76.2%). Hence, the developed system performs at a stable, high level in rewriting medical documents. The model and system can be applied to medical article sharing websites (e.g., A+ medicine and National Taiwan University Hospital), and a user's negative emotions when reading medical documents can be reduced by the rewritten articles. For medical knowledge demanders, the probability of obtaining friendly, high-quality medical knowledge is thereby enhanced.
Article
Full-text available
Background: There is an urgent need for the development of global analytic frameworks that can perform analyses in a privacy-preserving federated environment across multiple institutions without privacy leakage. A few studies on federated medical analysis have been conducted recently, each focusing on particular algorithms. However, none of them have addressed similar patient matching, which is useful for applications such as cohort construction for cross-institution observational studies, disease surveillance, and clinical trial recruitment. Objective: The aim of this study was to present a privacy-preserving platform in a federated setting for patient similarity learning across institutions. Without sharing patient-level information, our model can find similar patients from one hospital to another. Methods: We proposed a federated patient hashing framework and developed a novel algorithm to learn context-specific hash codes to represent patients across institutions. The similarities between patients can be efficiently computed using the resulting hash codes of the corresponding patients. To avoid security attacks from reverse engineering on the model, we applied homomorphic encryption to patient similarity search in the federated setting. Results: We used sequential medical events extracted from the Multiparameter Intelligent Monitoring in Intensive Care-III database to evaluate the proposed algorithm in predicting the incidence of five diseases independently. Our algorithm achieved average areas under the curve of 0.9154 and 0.8012 with balanced and imbalanced data, respectively, in k-nearest neighbor search with k=3. We also confirmed privacy preservation in similarity search by using homomorphic encryption. Conclusions: The proposed algorithm can help search for similar patients across institutions effectively, supporting federated data analysis in a privacy-preserving manner.
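A rough illustration of hash-based patient similarity search, assuming plain random-projection (LSH) hashing in place of the paper's learned, context-specific hash codes, and omitting the homomorphic encryption layer entirely. Patient vectors and all sizes are simulated.

```python
# Sketch: hash-code patient similarity via random hyperplane projections.
# The framework in the abstract *learns* its hash codes and encrypts the
# search; both are replaced here by unlearned LSH for illustration.
import numpy as np

rng = np.random.default_rng(42)
n_patients, n_features, n_bits = 200, 50, 32

X = rng.normal(size=(n_patients, n_features))   # simulated patient vectors
planes = rng.normal(size=(n_features, n_bits))  # random hyperplanes

hashes = (X @ planes > 0).astype(np.uint8)      # (n_patients, n_bits) in {0,1}

def most_similar(query_hash: np.ndarray, k: int = 3) -> np.ndarray:
    # Hamming distance = number of differing bits; the query patient
    # itself appears first at distance 0.
    dists = (hashes != query_hash).sum(axis=1)
    return np.argsort(dists)[:k]

print("nearest to patient 0:", most_similar(hashes[0]))
```

The appeal of compact hash codes in a federated setting is that institutions can compare bit strings rather than exchanging raw patient vectors.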
Chapter
Artificial intelligence (AI) is a field of science devoted to the study and design of intelligent and smart systems or machines. To people unfamiliar with AI, intelligent machines may at first seem like charismatic computers or robots mimicking humans, such as those found in science fiction; others may consider AI a technology akin to mysterious computers in research laboratories, or an advancement that will only become reality in the long run. However, examples such as drones used for surveillance, driverless cars, and emerging super-intelligent robots have generated growing general awareness of the topic. AI technologies and techniques are already in vogue all around us, at times behind the scenes. Many applications of AI are used so commonly that we do not notice that those applications in fact use AI, e.g., weather forecasting, logistics planning, manufacturing, banking, and monitoring trade market trends. Moreover, AI-enabled technology is also employed in automotive and aircraft guidance systems, smartphones with voice recognition applications such as Apple's Siri, Internet web browsers, and a spectrum of other applications in the world of work. These technologies address problems and execute tasks with better reliability, efficiency, and effectiveness. The behavioral and mental healthcare fields, too, are reaping the fruits of advancements in AI. For example, AI-driven computing techniques for learning, understanding, and reasoning can greatly help healthcare professionals with quality clinical decision-making, testing, diagnosis, and healthcare management. AI can also further self-care diagnostic tools that support healthier lives, such as interactive mobile health apps that learn from the habits and preferences of their users. AI has bettered public health by providing timely detection of health risks and counselling about remedies. This chapter demonstrates the possibilities for using AI technologies to provide need-based services in the behavioral and mental healthcare fields. The review provides fundamentals for readers new to AI, enumerates expert systems in behavioral and mental healthcare, and discusses the benefits AI can offer to behavioral and mental healthcare.
Article
Full-text available
Receiver Operating Characteristics (ROC) graphs are useful for organizing classifiers and visualizing their performance. ROC graphs are commonly used in medical decision making, and in recent years have been used increasingly in machine learning and data mining research. Although ROC graphs are apparently simple, there are some common misconceptions and pitfalls when using them in practice. The purpose of this article is to serve as an introduction to ROC graphs and as a guide for using them in research.
Article
Full-text available
Research has found that client change occurs earlier rather than later in the treatment process, and that the client's subjective experience of meaningful change in the first few sessions is critical. If improvement in the client's subjective sense of well-being does not occur in the first few sessions, then the likelihood of a positive outcome significantly decreases. Recent studies have found significant improvements in both retention and outcome when therapists receive formal, real-time feedback from clients regarding the process and outcome of therapy. However, the instruments most used in these feedback studies are long and take up valuable therapy time to complete. It has been found that most therapists are unlikely to use any feedback instrument that takes more than five minutes to complete, score, and interpret. This article reports the results of an evaluation of two very brief instruments for monitoring the process and outcome of therapy, the Outcome Rating Scale (ORS) and the Session Rating Scale (SRS), in a study involving 75 therapists and 6,424 clients over a two-year period. The two instruments were found to be valid and reliable and had a high use-rate among the therapists. The findings are discussed in light of the current emphasis on evidence-based practice.
Article
Full-text available
There is an industry-wide trend toward making outcome evaluation a routine part of therapeutic services, yet most measures are infeasible for everyday clinical use. Consequently, the Outcome Rating Scale (ORS) was developed and recently validated by its authors (Miller, Duncan, Brown, Sparks, & Claud, 2003). This article reports the findings of an independent replication study evaluating the reliability and concurrent validity of the ORS as studied in a non-clinical sample. Concurrent validity was tested by comparing the ORS with the Outcome Questionnaire 45.2 (OQ) using correlation statistics. The findings re-confirm that the ORS has high test-retest reliability, strong internal consistency, and moderate concurrent validity. Implications for clinical practice and future research are discussed.
Article
Full-text available
To extend two previous surveys of specific decision support system (DSS) applications covering the period January 1971 to December 1994, we conducted a follow-up survey covering 1995 to 2001. A total of 210 published applications are identified. To examine the development pattern of a specific DSS over time, we analysed and summarized the survey results according to (1) the area of application, (2) the year of publication in each area of application, (3) the distribution of underlying tools in DSSs, (4) a classification based on Alter's taxonomy, and (5) the management level (operational, tactical, or strategic) for which the DSS was designed.
Article
Full-text available
This survey investigated psychologists' use of outcome measures in clinical practice. Of the respondents, 37% indicated that they used some form of outcome assessment in practice. A wide variety of measures were used that were rated by the client or clinician. Clinicians who assess outcome in practice are more likely to be younger, have a cognitive-behavioral orientation, conduct more hours of therapy per week, provide services for children and adolescents, and work in institutional settings. Clinicians who do not use outcome measures endorse practical (e.g., cost, time) and philosophical (e.g., relevance) barriers to their use. Both users and nonusers of outcome measures were interested in similar types of information, including client progress since entering treatment, current strengths and weaknesses, and determining if there is a need to alter treatment. Implications for practicing clinicians are discussed.
Article
Full-text available
Motivation: Predicting the metastatic potential of primary malignant tissues has direct bearing on the choice of therapy. Several microarray studies yielded gene sets whose expression profiles successfully predicted survival (Ramaswamy et al 2003; Sorlie et al 2001; van't Veer et al 2003). Nevertheless, the overlap between these gene sets is almost zero. Such small overlaps were also observed in other complex diseases (Lossos et al 2003; Miklos and Maleszka 2004), and the variables that could account for the differences have evoked wide interest. One of the main open questions in this context is whether the disparity can be attributed only to trivial reasons such as different technologies, different patients, and different types of analysis. Results: To answer this question we concentrated on one single breast cancer dataset, and analyzed it by one single method, the one used by van't Veer et al to produce a set of outcome-predictive genes. We showed that in fact the resulting set of genes is not unique; it is strongly influenced by the subset of patients used for gene selection. Many equally predictive lists could have been produced from the same analysis. Three main properties of the data explain this sensitivity: (a) many genes are correlated with survival; (b) the differences between these correlations are small; (c) the correlations fluctuate strongly when measured over different subsets of patients. A possible biological explanation for these properties is discussed.
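The instability the authors describe is easy to reproduce: rank genes by their correlation with outcome on two different patient subsets and compare the top-k lists. The sketch below uses purely synthetic data, which makes the effect extreme; real data would show partial overlap.

```python
# Sketch: gene-list instability under patient resampling. All data is
# synthetic noise, so the two top-k lists should barely overlap.
import numpy as np

rng = np.random.default_rng(0)
n_patients, n_genes, k = 80, 5000, 70
expr = rng.normal(size=(n_patients, n_genes))   # expression matrix
outcome = rng.normal(size=n_patients)           # survival surrogate

def top_genes(idx):
    # Pearson correlation of each gene with outcome on the subset.
    sub_e, sub_o = expr[idx], outcome[idx]
    corr = ((sub_e - sub_e.mean(0)) * (sub_o - sub_o.mean())[:, None]).mean(0)
    corr /= sub_e.std(0) * sub_o.std()
    return set(np.argsort(-np.abs(corr))[:k])

a = top_genes(rng.choice(n_patients, 60, replace=False))
b = top_genes(rng.choice(n_patients, 60, replace=False))
print(f"overlap of two top-{k} lists: {len(a & b)} genes")
```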
Article
Full-text available
The development and initial psychometric studies for the Ohio Youth Problems, Functioning, and Satisfaction Scales (Ohio Scales) are described. The Ohio Scales were developed to be practical yet rigorous, multi-content, multi-source measures of outcome for children and adolescents receiving mental health services. Initial studies suggest that the Ohio Scales are promising (reliable, valid, and sensitive to change) measures that can be used to track the effectiveness of mental health interventions for youth with serious emotional disorders. Additional studies are warranted to expand the situations and populations within which the scales are valid.
Article
Full-text available
Purpose: The goal of this study was to evaluate the effects of a data-driven clinical productivity system that leverages electronic health record (EHR) data to provide productivity decision support functionality in a real-world clinical setting. The system was implemented for a large behavioral health care provider seeing over 75,000 distinct clients a year. Design/methodology/approach: The key metric in this system is a "VPU", which simultaneously optimizes multiple aspects of clinical care. The resulting mathematical value of clinical productivity was hypothesized to tightly link the organization's performance to its expectations and, through transparency and decision support tools at the clinician level, to effect significant changes in productivity, quality, and consistency relative to traditional models of clinical productivity. Findings: In only 3 months, every variable integrated into the VPU system showed significant improvement, including a 30% rise in revenue, a 10% rise in clinical percentage, a 25% rise in treatment plan completion, and a 20% rise in case rate eligibility, along with similar improvements in compliance/audit issues, outcomes collection, access, etc. Practical implications: A data-driven clinical productivity system employing decision support functionality is effective because of its impact on clinician behavior relative to traditional clinical productivity systems. Critically, the model is also extensible to integration with outcomes-based productivity. Originality/value: EHRs are only a first step - the problem is turning that data into useful information. Technology can leverage the data to produce actionable information that can inform clinical practice and decision-making. Without additional technology, EHRs are essentially just copies of paper-based records stored in electronic form.
Article
Full-text available
In the feature subset selection problem, a learning algorithm is faced with the problem of selecting a relevant subset of features upon which to focus its attention, while ignoring the rest. To achieve the best possible performance with a particular learning algorithm on a particular training set, a feature subset selection method should consider how the algorithm and the training set interact. We explore the relation between optimal feature subset selection and relevance. Our wrapper method searches for an optimal feature subset tailored to a particular algorithm and a domain. We study the strengths and weaknesses of the wrapper approach and show a series of improved designs. We compare the wrapper approach to induction without feature subset selection and to Relief, a filter approach to feature subset selection. Significant improvement in accuracy is achieved for some datasets for the two families of induction algorithms used: decision trees and Naive-Bayes.
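A minimal sketch of the wrapper idea from the abstract above: greedy forward selection in which each candidate feature is scored by the cross-validated accuracy of the very learner that will consume it. The dataset and base learner are arbitrary stand-ins.

```python
# Sketch: wrapper feature subset selection via greedy forward search.
# The "wrapper" property: candidate subsets are scored by the target
# learner's own cross-validated accuracy, not by a filter statistic.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

def wrapper_forward_select(X, y, estimator, cv=5):
    selected, best_score = [], -np.inf
    remaining = list(range(X.shape[1]))
    while remaining:
        scores = [(cross_val_score(estimator, X[:, selected + [j]], y, cv=cv).mean(), j)
                  for j in remaining]
        score, j = max(scores)
        if score <= best_score:   # stop when no candidate improves the score
            break
        best_score = score
        selected.append(j)
        remaining.remove(j)
    return selected, best_score

X, y = load_breast_cancer(return_X_y=True)
features, acc = wrapper_forward_select(X, y, GaussianNB())
print(f"selected {features} with CV accuracy {acc:.3f}")
```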
Conference Paper
Full-text available
We present a method for constructing ensembles from libraries of thousands of models. Model libraries are generated using different learning algorithms and parameter settings. Forward stepwise selection is used to add to the ensemble the models that maximize its performance. Ensemble selection allows ensembles to be optimized to a performance metric such as accuracy, cross entropy, mean precision, or ROC area. Experiments with seven test problems and ten metrics demonstrate the benefit of ensemble selection.
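A compact sketch of forward stepwise ensemble selection, assuming a library of binary-probability predictions on a held-out hillclimbing set; models are added greedily, with replacement, whenever they raise the averaged ensemble's accuracy. The simulated model library is invented for illustration.

```python
# Sketch: forward stepwise ensemble selection from a model library.
import numpy as np

def ensemble_select(probs, y_true, n_steps=20):
    """probs: (n_models, n_samples) predicted P(class 1) on the hillclimb set."""
    chosen, ensemble_sum = [], np.zeros(probs.shape[1])
    for _ in range(n_steps):
        # Accuracy of the averaged ensemble if each candidate were added.
        accs = [np.mean(((ensemble_sum + p) / (len(chosen) + 1) > 0.5) == y_true)
                for p in probs]
        best = int(np.argmax(accs))
        chosen.append(best)                 # selection with replacement
        ensemble_sum += probs[best]
    return chosen

rng = np.random.default_rng(1)
y = rng.integers(0, 2, 300)
# Simulated library: noisy copies of the truth with varying quality.
noise = rng.uniform(0.3, 1.0, size=10)
library = np.stack([np.clip(y + rng.normal(0, s, y.size), 0, 1) for s in noise])
picks = ensemble_select(library, y)
print("selection counts per model:", np.bincount(picks, minlength=len(library)))
```

Good models (low noise) should be picked repeatedly; selection with replacement is what lets the procedure weight strong models more heavily.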
Article
Full-text available
Of numerous proposals to improve the accuracy of naive Bayes by weakening its attribute independence assumption, both LBR and super-parent TAN have demonstrated remarkable error performance. However, both techniques obtain this outcome at a considerable computational cost. We present a new approach to weakening the attribute independence assumption by averaging all of a constrained class of classifiers. In extensive experiments this technique delivers comparable prediction accuracy to LBR and super-parent TAN with substantially improved computational efficiency at test time relative to the former and at training time relative to the latter. The new algorithm is shown to have low variance and is suited to incremental learning.
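A rough sketch of the averaged one-dependence estimators (AODE) approach this abstract describes: each attribute takes a turn as "superparent", and the class score averages the resulting one-dependence estimates. Counts use crude Laplace smoothing, the usual minimum-frequency threshold on superparents is omitted, and the toy data is invented; this is a simplification, not the authors' algorithm verbatim.

```python
# Sketch: averaged one-dependence estimators (AODE) for categorical data.
import numpy as np

def aode_fit(X, y):
    # Lazy variant: keep the data and recount at prediction time.
    return {"classes": np.unique(y), "X": X, "y": y}

def aode_predict(model, x, alpha=1.0):
    X, y, classes = model["X"], model["y"], model["classes"]
    n, d = X.shape
    best_c, best_score = None, -np.inf
    for c in classes:
        Xc = X[y == c]
        score = 0.0
        for p in range(d):                    # superparent attribute
            mask = Xc[:, p] == x[p]
            joint = (mask.sum() + alpha) / (n + alpha * len(classes))  # ~P(c, x_p)
            cond = 1.0
            for i in range(d):                # ~prod_i P(x_i | c, x_p)
                if i == p:
                    continue
                match = np.sum(mask & (Xc[:, i] == x[i]))
                v_i = len(np.unique(X[:, i]))
                cond *= (match + alpha) / (mask.sum() + alpha * v_i)
            score += joint * cond             # sum over superparents = averaging
        if score > best_score:
            best_c, best_score = c, score
    return best_c

rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(200, 4))
y = (X[:, 0] + X[:, 1] > 2).astype(int)       # label depends on two attributes
model = aode_fit(X, y)
preds = [aode_predict(model, row) for row in X]
print("train accuracy:", np.mean(np.array(preds) == y))
```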
Article
Full-text available
Storing and using specific instances improves the performance of several supervised learning algorithms. These include algorithms that learn decision trees, classification rules, and distributed networks. However, no investigation has analyzed algorithms that use only specific instances to solve incremental learning tasks. In this paper, we describe a framework and methodology, called instance-based learning, that generates classification predictions using only specific instances. Instance-based learning algorithms do not maintain a set of abstractions derived from specific instances. This approach extends the nearest neighbor algorithm, which has large storage requirements. We describe how storage requirements can be significantly reduced with, at most, minor sacrifices in learning rate and classification accuracy. While the storage-reducing algorithm performs well on several real-world databases, its performance degrades rapidly with the level of attribute noise in training instances. Therefore, we extended it with a significance test to distinguish noisy instances. This extended algorithm's performance degrades gracefully with increasing noise levels and compares favorably with a noise-tolerant decision tree algorithm.
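A minimal sketch of the storage-reduction idea from the abstract above: process training instances sequentially and store only those that the instances kept so far would misclassify (in the spirit of the IB2 variant); prediction is plain nearest neighbor. The data and parameters are toy values.

```python
# Sketch: instance-based learning with storage reduction (IB2-like):
# an instance is stored only if the current store misclassifies it.
import numpy as np

def ib2_fit(X, y):
    stored_X, stored_y = [X[0]], [y[0]]
    for xi, yi in zip(X[1:], y[1:]):
        dists = np.linalg.norm(np.array(stored_X) - xi, axis=1)
        if stored_y[int(np.argmin(dists))] != yi:   # misclassified -> keep it
            stored_X.append(xi)
            stored_y.append(yi)
    return np.array(stored_X), np.array(stored_y)

def nn_predict(stored_X, stored_y, x):
    return stored_y[int(np.argmin(np.linalg.norm(stored_X - x, axis=1)))]

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(4, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
perm = rng.permutation(len(y))                     # present in random order
X, y = X[perm], y[perm]

sX, sy = ib2_fit(X, y)
acc = np.mean([nn_predict(sX, sy, xi) == yi for xi, yi in zip(X, y)])
print(f"stored {len(sy)}/{len(y)} instances, training accuracy {acc:.2f}")
```

On well-separated classes, only a small fraction of instances near the decision boundary ends up stored, which is the storage saving the abstract quantifies; as it also notes, noisy instances would be stored too, motivating the significance-test extension.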
Article
Full-text available
Evaluation of the overall effectiveness of decision support systems (DSS) has been a research topic since the early 1980s. As artificial intelligence methods have been incorporated into systems to create intelligent decision support systems (IDSS), researchers have attempted to quantify the value of the additional capabilities. Despite the useful and relevant insights generated by previous research, existing evaluation methodologies offer only a fragmented and incomplete view of IDSS value and the contribution of its technical infrastructure. This paper proposes an integrative, multiple criteria IDSS evaluation framework through a model that links the decision value of an IDSS to both the outcome from, and process of, decision making and down to specific components of the IDSS. The proposed methodology provides the designer and developer specific guidance on the intelligent tools most useful for a specific user with a particular decision problem. The proposed framework is illustrated by evaluating an actual IDSS that coordinates management of urban infrastructures.
Article
Full-text available
Receiver Operating Characteristics (ROC) graphs are a useful technique for organizing classifiers and visualizing their performance. ROC graphs are commonly used in medical decision making, and in recent years have been increasingly adopted in the machine learning and data mining research communities. Although ROC graphs are apparently simple, there are some common misconceptions and pitfalls when using them in practice. This article serves both as a tutorial introduction to ROC graphs and as a practical guide for using them in research.
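A minimal sketch of building a ROC curve by sweeping the decision threshold over the classifier scores and computing AUC with the trapezoidal rule. Tied scores are handled only approximately, and the labels and scores are simulated.

```python
# Sketch: ROC points from scores (threshold sweep) plus trapezoidal AUC.
import numpy as np

def roc_points(y_true, scores):
    order = np.argsort(-scores)             # descending score
    y = y_true[order]
    tps = np.cumsum(y)                      # true positives at each cut
    fps = np.cumsum(1 - y)                  # false positives at each cut
    tpr = tps / y.sum()
    fpr = fps / (len(y) - y.sum())
    return np.concatenate(([0.0], fpr)), np.concatenate(([0.0], tpr))

rng = np.random.default_rng(7)
y_true = rng.integers(0, 2, 1000)
scores = y_true * 0.8 + rng.normal(0, 0.5, 1000)   # informative but noisy

fpr, tpr = roc_points(y_true, scores)
auc = np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2)  # trapezoidal rule
print(f"AUC = {auc:.3f}")
```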
Article
Full-text available
Electronic health records (EHRs) are only a first step in capturing and utilizing health-related data - the problem is turning that data into useful information. Models produced via data mining and predictive analysis profile inherited risks and environmental/behavioral factors associated with patient disorders, which can be utilized to generate predictions about treatment outcomes. This can form the backbone of clinical decision support systems driven by live data from the actual population; the advantage of such an approach is that it is "adaptive". Here, we evaluate the predictive capacity of the clinical EHR of a large mental healthcare provider (~75,000 distinct clients a year) to provide decision support information in a real-world clinical setting. Initial research has achieved a 70% success rate in predicting treatment outcomes using these methods.
Article
Full-text available
Clinical information system (CIS) developers and implementers have begun to look to other scientific disciplines for new methods, tools, and techniques to help them better understand clinicians and their organizational structures, clinical work environments, capabilities of clinical information and communications technology, and the way these structures and processes interact. The goal of this article is to help CIS researchers, developers, implementers, and evaluators better understand the methods, tools, techniques, and literature of the field of human factors. We developed a framework that explains how six key human factors topics relate to the design, implementation, and evaluation of CISs. Using this framework we discuss the following six topics: 1) informatics and patient safety; 2) user interface design and evaluation; 3) workflow and task analysis; 4) clinical decision making and decision support; 5) distributed cognition; and 6) mental workload and situation awareness. Integrating the methods, tools, and lessons learned from each of these six areas of human factors research early in CIS design and incorporating them iteratively during development can improve user performance, user satisfaction, and integration into clinical workflow. Ultimately, this approach will improve clinical information systems and healthcare delivery.
Article
Full-text available
In December 2008, version 2.0 of the data analysis platform KNIME was released. It includes several new features, which we will describe in this paper. We also provide a short introduction to KNIME for new users.
Article
Full-text available
The Konstanz Information Miner is a modular environment which enables easy visual assembly and interactive execution of a data pipeline. It is designed as a teaching, research and collaboration platform, which enables easy integration of new algorithms, data manipulation or visualization methods as new modules or nodes. In this paper we describe some of the design aspects of the underlying architecture and briefly sketch how new nodes can be incorporated.
Article
Full-text available
Despite wide distribution and promotion of clinical practice guidelines, adherence among Dutch general practitioners (GPs) is not optimal. To improve adherence to guidelines, an analysis of barriers to implementation is advocated. Because different recommendations within a guideline can have different barriers, in this study we focus on key recommendations rather than guidelines as a whole, and explore the barriers to implementation perceived by Dutch GPs. A qualitative study using six focus groups was conducted, in which 30 GPs participated, with an average of seven per session. Fifty-six key recommendations were derived from twelve national guidelines. In each focus group, barriers to the implementation of the key recommendations of two clinical practice guidelines were discussed. Focus group discussions were audiotaped and transcribed verbatim. Data were analysed using an existing framework of barriers. The barriers varied widely within guidelines, with each key recommendation having a unique pattern of barriers. The most commonly perceived barriers were lack of agreement with the recommendations due to lack of applicability or lack of evidence (68% of key recommendations), environmental factors such as organisational constraints (52%), lack of knowledge regarding the guideline recommendations (46%), and guideline factors such as unclear or ambiguous recommendations (43%). Our findings suggest a broad range of barriers. As the barriers differ widely within guidelines, tailored and barrier-driven implementation strategies focusing on key recommendations are needed to improve adherence in practice. In addition, guidelines should be more transparent concerning the underlying evidence and applicability, and further efforts are needed to address complex issues such as comorbidity. Finally, it might be useful to include focus groups in continuing medical education as an innovative medium for guideline education and implementation.
Article
Full-text available
Despite the overall efficacy of psychotherapy, dropouts are substantial, many clients do not benefit, therapists vary in effectiveness, and there may be a crisis of confidence among consumers. A research paradigm called patient-focused research--a method of enhancing outcome via continuous progress feedback--holds promise to address these problems. Although feedback has been demonstrated to improve individual psychotherapy outcomes, no studies have examined couple therapy. The current study investigated the effects of providing treatment progress and alliance information to both clients and therapists during couple therapy. Outpatients (N = 410) at a community family counseling clinic were randomly assigned to 1 of 2 groups: treatment as usual (TAU) or feedback. Couples in the feedback condition demonstrated significantly greater improvement than those in the TAU condition at posttreatment, achieved nearly 4 times the rate of clinically significant change, and maintained a significant advantage on the primary measure at 6-month follow-up while attaining a significantly lower rate of separation or divorce. Mounting evidence of feedback effects with different measures and populations suggests that the time for routine tracking of client progress has arrived.
Article
Bagging predictors is a method for generating multiple versions of a predictor and using these to get an aggregated predictor. The aggregation averages over the versions when predicting a numerical outcome and does a plurality vote when predicting a class. The multiple versions are formed by making bootstrap replicates of the learning set and using these as new learning sets. Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy. The vital element is the instability of the prediction method. If perturbing the learning set can cause significant changes in the predictor constructed, then bagging can improve accuracy.
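A minimal sketch of bagging as described above: bootstrap replicates of the learning set, one tree per replicate, plurality vote to aggregate. The dataset and tree learner are arbitrary stand-ins.

```python
# Sketch: bagging: fit one tree per bootstrap replicate, aggregate by vote.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
rng = np.random.default_rng(0)

trees = []
for _ in range(25):
    idx = rng.integers(0, len(y), len(y))       # bootstrap replicate
    trees.append(DecisionTreeClassifier().fit(X[idx], y[idx]))

votes = np.stack([t.predict(X) for t in trees])  # (n_trees, n_samples)
bagged = (votes.mean(axis=0) > 0.5).astype(int)  # plurality vote (binary case)
print("bagged training accuracy:", np.mean(bagged == y))
```

The gain comes precisely from the instability the abstract names: unpruned trees change substantially across bootstrap samples, so averaging them reduces variance.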
Article
Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node yields error rates that compare favorably to Adaboost (Y. Freund & R. Schapire, Machine Learning: Proceedings of the Thirteenth International conference, ***, 148–156), but are more robust with respect to noise. Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the splitting. Internal estimates are also used to measure variable importance. These ideas are also applicable to regression.
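For completeness, here are the two ingredients the abstract names (bootstrap sampling plus a random feature subset considered at each split), along with the internal out-of-bag error estimate and variable importances it mentions, as exposed by scikit-learn. The dataset and hyperparameters are placeholders.

```python
# Sketch: random forest with out-of-bag error estimate and importances.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
forest = RandomForestClassifier(
    n_estimators=200,
    max_features="sqrt",   # random feature subset considered per split
    oob_score=True,        # internal generalization-error estimate
    random_state=0,
).fit(X, y)

print("OOB accuracy:", round(forest.oob_score_, 3))
print("largest variable importance:", round(forest.feature_importances_.max(), 3))
```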
Article
A survey of specific DSS applications published between 1971 and April 1988 indicates the development of a wide variety of DSS applications in many different fields. Despite two decades of cooperative efforts by practitioners and theoreticians to develop specific DSSs, many goals in the DSS field remain unfulfilled. The critical issue is to implement a system that integrates organizational decision making vertically (among strategic, tactical, and operational levels) and horizontally (among many functional fields at the same level) to coordinate and manage conflicts among the various subunits of the organization.
Article
The validity and reliability of the Outcome Rating Scale (ORS) and the Session Rating Scale (SRS) were evaluated against existing longer measures, including the Outcome Questionnaire-45, Working Alliance Inventory, Depression Anxiety Stress Scale-21, Quality of Life Scale, Rosenberg Self-Esteem Scale and General Self-efficacy Scale. The measures were administered to patients referred for psychological services to a rural primary health-care service. Participants were recruited from both current and new patients of psychologists providing the service. Both the ORS and SRS demonstrated good reliability and concurrent validity with their longer alternatives. The ORS also evidenced significant correlations with measures of self-esteem, self-efficacy, and quality of life. The ORS and SRS offer benefits such as cost-effectiveness, brevity, simple administration, and easy interpretation of results in the measurement of clinical outcomes when compared to their longer counterparts. These results provide clear support for the adoption of brief outcome assessment measures in psychological practice.
Article
The Konstanz Information Miner is a modular environment, which enables easy visual assembly and interactive execution of a data pipeline. It is designed as a teaching, research and collaboration platform, which enables simple integration of new algorithms and tools as well as data manipulation or visualization methods in the form of new modules or nodes. In this paper we describe some of the design aspects of the underlying architecture, briefly sketch how new nodes can be incorporated, and highlight some of the new features of version 2.0.
Book
The growing interest in data mining is motivated by a common problem across disciplines: how does one store, access, model, and ultimately describe and understand very large data sets? Historically, different aspects of data mining have been addressed independently by different disciplines. This is the first truly interdisciplinary text on data mining, blending the contributions of information science, computer science, and statistics. The book consists of three sections. The first, foundations, provides a tutorial overview of the principles underlying data mining algorithms and their application. The presentation emphasizes intuition rather than rigor. The second section, data mining algorithms, shows how algorithms are constructed to solve specific problems in a principled manner. The algorithms covered include trees and rules for classification and regression, association rules, belief networks, classical statistical models, nonlinear models such as neural networks, and local "memory-based" models. The third section shows how all of the preceding analysis fits together when applied to real-world data mining problems. Topics include the role of metadata, how to handle missing data, and data preprocessing.
Conference Paper
We describe a discrete event simulator developed for daily prediction of WIP position in an operational 300 mm wafer fabrication factory to support tactical decision-making. The simulator is distinctive in that its intended prediction horizon is relatively short, on the order of a few days, while its modeling scope is relatively large. The simulation includes over 90% of the wafers being processed in the fab and all process, measurement and testing tools. The model parameters are automatically updated using statistical analyses performed on the historical event logs generated by the factory. This paper describes the simulation model and the parameter estimation methods. A key requirement to support daily and weekly decision-making is good validation results of the simulation against actual fab performance. Therefore, we also present validation results that compare simulated production metrics against those obtained from the actual fab, for fab-wide, process, tool and product specific metrics.
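A toy skeleton of the simulator pattern described above: a time-ordered event heap drives lots through a single tool with a queue. The fab-scale model adds hundreds of tools, routes, and statistically estimated parameters; the arrival times and processing time below are invented.

```python
# Sketch: minimal discrete event simulation of one tool with a queue.
import heapq, random

random.seed(0)
PROC_TIME = 2.0                        # hours per lot on the tool
events, queue, eid, busy = [], [], 0, False

def schedule(t, kind, lot):
    global eid
    heapq.heappush(events, (t, eid, kind, lot))  # eid breaks time ties
    eid += 1

for lot in range(5):                   # lots arrive over the first 10 hours
    schedule(random.uniform(0, 10), "arrive", lot)

while events:
    now, _, kind, lot = heapq.heappop(events)
    if kind == "arrive":
        if busy:
            queue.append(lot)          # tool occupied: wait in queue
        else:
            busy = True
            schedule(now + PROC_TIME, "finish", lot)
    else:                              # "finish"
        print(f"t={now:5.2f}h  lot {lot} done")
        if queue:
            schedule(now + PROC_TIME, "finish", queue.pop(0))
        else:
            busy = False
```

Estimating parameters like PROC_TIME from historical event logs, as the paper does, is what turns such a skeleton into a daily WIP-prediction tool.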
Chapter
Chapter 4 discusses the fusion of label outputs. Four types of classifier outputs are listed: class labels (abstract level), ranked class labels (rank level), degree of support for the classes (measurement level), and correct/incorrect decision (oracle level). Combination methods for class-label outputs are presented, including majority vote, plurality vote, weighted majority vote, naive Bayes, multinomial combiners (Behavior Knowledge Space (BKS) and Wernecke's methods), probabilistic combination, and singular value decomposition (SVD) combination.
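Two of the label-level combiners listed above, sketched for a single instance: plurality vote and weighted majority vote with the classic log-odds weights. The labels and accuracies are invented.

```python
# Sketch: plurality vote and weighted majority vote over class labels.
import numpy as np

def plurality_vote(labels):
    """labels: (n_classifiers,) class labels for one instance."""
    values, counts = np.unique(labels, return_counts=True)
    return values[np.argmax(counts)]

def weighted_majority_vote(labels, accuracies):
    # Classic weighting: w_i = log(p_i / (1 - p_i)) for accuracy p_i < 1.
    weights = np.log(np.asarray(accuracies) / (1 - np.asarray(accuracies)))
    scores = {}
    for lbl, w in zip(labels, weights):
        scores[lbl] = scores.get(lbl, 0.0) + w
    return max(scores, key=scores.get)

labels = np.array([1, 0, 1, 1, 0])
print(plurality_vote(labels))                                   # -> 1
print(weighted_majority_vote(labels, [0.9, 0.6, 0.55, 0.52, 0.95]))  # -> 0
```

The example is chosen so the two rules disagree: two highly accurate classifiers outvote three weak ones once the log-odds weights are applied.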
Article
Evolutionary computing (EC) is an exciting development in computer science. It amounts to building, applying and studying algorithms based on the Darwinian principles of natural selection. In this paper we briefly introduce the main concepts behind evolutionary computing. We present the main components of all evolutionary algorithms (EAs), sketch the differences between different types of EAs, and survey application areas ranging from optimization, modeling and simulation to entertainment.
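A minimal evolutionary algorithm showing the components the paper surveys (selection, crossover, mutation) on the toy OneMax problem of maximizing the number of 1-bits; all parameters are arbitrary.

```python
# Sketch: minimal genetic algorithm on OneMax (maximize count of 1-bits).
import random

random.seed(0)
GENOME, POP, GENS, MUT = 30, 40, 60, 0.02

def fitness(g):
    return sum(g)

pop = [[random.randint(0, 1) for _ in range(GENOME)] for _ in range(POP)]
for _ in range(GENS):
    nxt = []
    for _ in range(POP):
        # Tournament selection of two parents.
        a, b = (max(random.sample(pop, 3), key=fitness) for _ in range(2))
        cut = random.randrange(1, GENOME)          # one-point crossover
        child = a[:cut] + b[cut:]
        # Bit-flip mutation.
        child = [1 - bit if random.random() < MUT else bit for bit in child]
        nxt.append(child)
    pop = nxt

print("best fitness:", fitness(max(pop, key=fitness)), "of", GENOME)
```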
Article
Successfully supporting managerial decision-making is critically dependent upon the availability of integrated, high quality information organized and presented in a timely and easily understood manner. Data warehouses have emerged to meet this need. They serve as an integrated repository for internal and external data—intelligence critical to understanding and evaluating the business within its environmental context. With the addition of models, analytic tools, and user interfaces, they have the potential to provide actionable information resources—business intelligence that supports effective problem and opportunity identification, critical decision-making, and strategy formulation, implementation, and evaluation. Four themes frame our analysis: integration, implementation, intelligence, and innovation.
Conference Paper
The conditional independence assumption of naive Bayes essentially ignores attribute dependencies and is often violated. On the other hand, although a Bayesian network can represent arbitrary attribute dependencies, learning an optimal Bayesian network from data is intractable. The main reason is that learning the optimal structure of a Bayesian network is extremely time-consuming. Thus, a Bayesian model without structure learning is desirable. In this paper, we propose a novel model, called hidden naive Bayes (HNB). In an HNB, a hidden parent is created for each attribute which combines the influences from all other attributes. We present an approach to creating hidden parents using the average of weighted one-dependence estimators. HNB inherits the structural simplicity of naive Bayes and can be easily learned without structure learning. We propose an algorithm for learning HNB based on conditional mutual information. We experimentally test HNB in terms of classification accuracy, using the 36 UCI data sets recommended by Weka (Witten & Frank 2000), and compare it to naive Bayes (Langley, Iba, & Thomas 1992), C4.5 (Quinlan 1993), SBC (Langley & Sage 1994), NBTree (Kohavi 1996), CL-TAN (Friedman, Geiger, & Goldszmidt 1997), and AODE (Webb, Boughton, & Wang 2005). The experimental results show that HNB outperforms naive Bayes, C4.5, SBC, NBTree, and CL-TAN, and is competitive with AODE.
Article
A great many tools have been developed for supervised classification, ranging from early methods such as linear discriminant analysis through to modern developments such as neural networks and support vector machines. A large number of comparative studies have been conducted in attempts to establish the relative superiority of these methods. This paper argues that these comparisons often fail to take into account important aspects of real problems, so that the apparent superiority of more sophisticated methods may be something of an illusion. In particular, simple methods typically yield performance almost as good as more sophisticated methods, to the extent that the difference in performance may be swamped by other sources of uncertainty that generally are not considered in the classical supervised classification paradigm.
Article
The area under the ROC curve (AUC) is a very widely used measure of performance for classification and diagnostic rules. It has the appealing property of being objective, requiring no subjective input from the user. On the other hand, the AUC has disadvantages, some of which are well known. For example, the AUC can give potentially misleading results if ROC curves cross. However, the AUC also has a much more serious deficiency, and one which appears not to have been previously recognised. This is that it is fundamentally incoherent in terms of misclassification costs: the AUC uses different misclassification cost distributions for different classifiers. This means that using the AUC is equivalent to using different metrics to evaluate different classification rules. It is equivalent to saying that, using one classifier, misclassifying a class 1 point is p times as serious as misclassifying a class 0 point, but, using another classifier, misclassifying a class 1 point is P times as serious, where p ≠ P. This is nonsensical because the relative severities of different kinds of misclassifications of individual points are a property of the problem, not of the classifiers that happen to have been chosen. This property is explored in detail, and a simple valid alternative to the AUC is proposed.
Article
This article takes the position that mental health (MH) services for youths are unlikely to improve without a system of measurement that is administered frequently, is concurrent with treatment, and provides feedback. The system, which I characterize as a measurement feedback system (MFS), should include clinical processes (mediators), contexts (moderators), outcomes, and feedback to clinicians and supervisors. In spite of the routine call to collect and use outcome data in real-world treatment, progress has been painstakingly slow [1-3]. For example, Garland and colleagues [4] found that even when outcome assessments were required, more than 90% of the clinicians surveyed used their own judgment and paid little heed to the data. A more recent national survey of MH service organizations serving children and families indicated that almost 75% reported collecting some standardized outcome data [5]. However, just collecting data on an annual basis will not result in improvement. Measurement is not enough: feedback from clients and families naturally occurs in treatment, but it is highly filtered, biased, and subject to distortions caused by the use of cognitive heuristics and schemas [6]. This informal and flawed feedback needs to be supplemented by an MFS that uses valid, reliable, and standardized measures. Such a system is central to quality improvement and professional development, as well as to enhancing accountability. Feedback has been successfully applied outside MH for several decades [7,8]; however, the application of a fully implemented MFS is in its infancy in MH. An MFS has been shown to improve outcomes in adult MH, especially for clients who were either not improving or deteriorating while in therapy [9]. It has rarely been applied in children's MH.
Article
Recognizing an urgent need for increased access to evidence-based psychological treatments, public health authorities have recently allocated over $2 billion to better disseminate these interventions. In response, implementation of these programs has begun, some on a very large scale, with substantial implications for the science and profession of psychology. But methods to transport treatments to service delivery settings have developed independently, without strong evidence for, or even a consensus on, best practices for accomplishing this task or for measuring successful training outcomes. This article reviews current leading efforts at the national, state, and individual treatment-developer levels to integrate evidence-based interventions into service delivery settings. Programs are reviewed in the context of the accumulated wisdom of dissemination and implementation science and of methods for assessing training outcomes. Recommendations for future implementation strategies will derive from evaluating the outcomes of training procedures and developing a consensus on the necessary training elements to be used in these efforts.
Article
Increasing interest in end users' reactions to health information technology (IT) has elevated the importance of theories that predict and explain health IT acceptance and use. This paper reviews the application of one such theory, the Technology Acceptance Model (TAM), to health care. We reviewed 16 data sets analyzed in over 20 studies of clinicians using health IT for patient care. Studies differed greatly in samples and settings, health ITs studied, research models, relationships tested, and construct operationalization. Certain TAM relationships were consistently found to be significant, whereas others were inconsistent. Several key relationships were infrequently assessed. Findings show that TAM predicts a substantial portion of the use or acceptance of health IT, but that the theory may benefit from several additions and modifications. Aside from improved study quality, standardization, and theoretically motivated additions to the model, an important future direction for TAM is to adapt the model specifically to the health care context, using beliefs elicitation methods.