Problem-oriented patient record summary: An early report on a Watson application

Murthy Devarakonda, Dongyang Zhang, Ching-Huei Tsou, Mihaela Bornea
IBM Research and Watson Group
Yorktown Heights, NY
Abstract: As the use of Electronic Medical Records (EMRs)
becomes widespread, the amount of data in an EMR becomes a
challenge for its comprehension. As a part of a larger effort of
adapting IBM Watson to the medical domain, we developed
problem-oriented EMR summarization to address this issue. The
problem-orientation refers to the central role of a patient's
medical problems in the summary. The summarization uses a
generated problem list, relates these generated medical problems
with relevant clinical data, and organizes the clinical data in a
medically meaningful manner. Watson analytics are used for
creating the summarization. This is a step in building the next
generation EMR, one that is based not merely on record keeping but
on a conceptual understanding of medicine, thereby
crossing the threshold from record storage to an intelligent entity
for clinical decision making.
Keywords—Electronic Medical Records; Problem-oriented
patient record summary; Summarization; Clinical summarization;
Medical concepts; Watson; UMLS; Text analysis;
I. INTRODUCTION
As Electronic Medical Records (EMRs) are widely
adopted in patient care, the data they store for each patient has
grown accordingly. A typical EMR contains several hundred
unstructured plain-text clinical notes, as well as large
amounts of semi-structured data, such as medications ordered,
lab test values, procedures, and vitals. So, the very technology
that allows recording every aspect of patient care is also
making it (quite unintentionally) difficult to comprehend a
patient record quickly. Since manual summarization is time consuming and
prone to errors, there is a pressing need for automatic methods.
Summarization, in particular text summarization, is a well-
known problem in Artificial Intelligence. The task is one of
maximizing the information coverage while minimizing the
redundancy within a limited amount of space. Developing
accurate patient record summaries requires sophisticated
medical semantic analysis of EMR data and is a fertile ground
for applying the IBM Watson technology.
Watson effectively analyzed vast amounts of unstructured
text to answer natural language questions when it defeated two
all-time champions on the American TV quiz show
Jeopardy! [1] [2]. Since then, we have been adapting Watson to the
medical domain. The value Watson provides in EMR
summarization is in identifying key relationships among
clinical concepts at a granularity that matches clinical
decision making, e.g. inferring whether a specific medication
that a patient is taking is intended to cure a disease or to
provide palliative relief of symptoms.
II. RELATED WORK
Text summarization research goes back to the 1950s [3].
Today, it is generally accepted that a good summary should
include the most important information and it should be short
[4] [5]. While text summarization is researched extensively,
clinical summarization, developing a summary of a patient’s
clinical data, is at a nascent stage. The key difference is in the
nature of data from which the summary is produced. Unlike in
text summarization, a patient’s clinical data is a mix of
unstructured plain text and semi-structured data. While the
purpose of text summarization is often amorphous, clinical
summarization has one clear goal, that is, to help a physician
care for a patient, which is the goal of our summarization.
The cognitive process in manually summarizing a patient
record sheds some light on the requirements for automatic
summarization. When asked to create a summary from a
previously unseen EMR, it was reported [6] that physicians
spend significant time studying clinical notes and labs.
Diagnostic procedures and medications are the next most
reviewed items. Physicians used a strategy of identify, validate,
and ascertain status, as a way to understand patient problems.
An automated summary should efficiently provide the
information accessed in the manual process, and indeed that is
a part of our summarization.
In the seminal paper on keeping effective patient records,
Weed [7] suggested that medical records should be organized
by patient problems. He called records so organized
problem-oriented medical records. Diagnosing, treating, and
managing a patient’s medical problems should be central to
keeping a patient record. Therefore, it makes sense to organize
the patient summary around patient problems.
Succinct visualization of a patient record can be
considered as a form of summarization [8]. The AnamneVis [9]
framework uses the journalistic approach of Five W’s (who,
when, what, where, and why) to show a patient record. A
medical incident is shown as a connected chain of symptoms,
tests, diagnoses, and treatment. Our goal is to develop
information content for summary, but not its visualization per
se, and therefore, our summary can drive this or other similar
visualization techniques [10].
III. PATIENT RECORD SUMMARIZATION
What should the summarization model be, given that its purpose
is to provide a clinician with a quick and easy way to grasp the
most important information about a patient? What are the
semantic elements in this model where the Watson technology
plays an important role? This section discusses these topics.
To appear in IEEE HealthCom 2014 (16th Int’l Conf. on E-health: Networking, Applications & Services), Natal, RN, Brazil.
An approach to clinical summarization involving
increasingly sophisticated abstractions of aggregation,
organization, reduction and/or transformation, interpretation,
and synthesis is proposed in [11]. Such a linear abstraction
works well for a lab or a single patient problem, but a model
for the extensive collection of data types found in a typical
EMR should include semantic relationships that exist among
various data types. For instance, a lab may be associated with a
problem in the sense that it is indicative of the problem status.
So, our model consists of multiple types of clinical data, as
well as relationships among the data. We group the elements of
the data aggregates in a clinically meaningful way. Numerical
data is interpreted and presented concisely, and detailed data is
only one or two clicks away. Details are described below.
A. Summarization Model
Since a patient record contains various collections of data
about a patient and their care, i.e. problems, medications, labs,
procedures, allergies, and so on, the natural way to achieve the
coverage and brevity as needed for summarization is to start
with aggregates of these collections, which we call clinical
data aggregates of a patient.
Elements of each of these aggregates may themselves be
summarized to some level of abstraction as conceptualized in
[11]. For example, results of a lab test may be organized,
transformed and interpreted such that the summary shows the
latest value and an indication as to whether it is now, or has
ever been, out of the normal range. By clicking on it (as
explained later) a detailed timeline can be seen with abnormal
values highlighted.
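The lab abstraction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the application's implementation: the `LabResult` type, the `summarize_lab` function, the sample values, and the reference range are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class LabResult:
    taken_on: date
    value: float


def summarize_lab(results, low, high):
    """Reduce a lab's history to its latest value plus out-of-range flags."""
    ordered = sorted(results, key=lambda r: r.taken_on)
    latest = ordered[-1]

    def out_of_range(r):
        return r.value < low or r.value > high

    return {
        "latest_value": latest.value,
        "latest_date": latest.taken_on,
        "abnormal_now": out_of_range(latest),
        "abnormal_ever": any(out_of_range(r) for r in ordered),
        # Dates that a detailed timeline view would highlight.
        "abnormal_dates": [r.taken_on for r in ordered if out_of_range(r)],
    }


# Hypothetical Hemoglobin A1C history; the range is illustrative only.
history = [
    LabResult(date(2013, 1, 10), 7.4),
    LabResult(date(2013, 7, 2), 6.1),
    LabResult(date(2014, 1, 15), 5.5),
]
summary = summarize_lab(history, low=4.0, high=5.6)
```

In this sketch the summary row carries only the latest value and two flags, while `abnormal_dates` keeps enough detail to drive the highlighted timeline.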
The next key part of our summarization is the clinical
relationships, which identify semantic relations between the
elements of the aggregates. For example, a problem is treated
by one or more medications. Neither the problem data
aggregate nor the medications data aggregate contains this
important semantic association. These relationships are not
directly present in an EMR, but they are the result of a
physician’s judgment. As described later, we apply the Watson
technology to identify such semantic relations.
The next element of the model is the similarity of elements
in a data aggregate. The nearness attribute identifies how
closely an element is related to the other elements of the
aggregate. For example, for the medications aggregate, the
clinically relevant feature space for determining the nearness
consists of the pharmacologic mechanisms of a medication and
the classes of pharmacologic effects on human physiology.
This is an example of how our summarization determines the
clinically meaningful grouping of aggregates.
One of the key data aggregates is the patient encounter
clinical notes, i.e. clinician-written notes for patient contact
points. A clinician may be a primary care physician, specialist,
emergency medicine doctor, or a nurse. Each contact results in
a clinical note being written, so clinical notes and patient
encounters correspond one to one. The encounters, and therefore the
clinical notes, need to be categorized by practice for
subsequent reference, e.g. to help answer the question,
when did the patient last see a cardiologist? While the clinical
notes are a significant part of an EMR, the practice and
specialty data is often missing from the header of a clinical note,
especially when the service is provided by a physician from an
outside clinic. So, our summarization involves analytics to
identify this missing data and then uses it to categorize clinical
notes (and thus encounters).
Yet another element of the summarization which we have
partially implemented is a filter that determines the data to
show and/or prioritize based on the specialty of the clinician.
For example, a cardiologist may want to see only heart related
problems, medications, labs, and so on, or may want this data
prioritized over the rest.
B. Problem-Oriented Summary
The central aggregate of this summarization is a generated
problem list, and hence we refer to this summarization as the
problem-oriented patient summary. The problem list, which is
a list of the most important medical disorders of a patient that
require care and treatment [7], is abstracted or “generated” by
our application from the clinical notes text and other data in the
patient’s EMR. This is different from (and more accurate than)
the data in the problems section of an EMR, which is typically
entered by the clinical staff (and not curated by physicians,
hence not consistently reliable). The details of our problem list
generation are beyond the scope of this paper, but we note that
the recall and precision of the generated problem list are far
higher than those of the entered problem list, based on the ground truth
created by medical experts on a set of actual patient records.
Navigation to other clinical aggregates works best from
the problems list aggregate because all the clinical relationships
start with it. For navigational purposes, the other aggregates are
secondary to the problem list. It is expected that a physician
would start with the problem list and then explore the other
data aggregates.
The problem-oriented summarization model described so
far is shown in Figure 1. Notice the clinical data aggregates of
the summary, the centrality of the problem list, and the clinical
relationships of a problem to other clinical data. The value of
such a summarization is the ability to see the most relevant
patient data from a problem perspective. It is, however,
possible to consider more than one problem at a time, and in
that case, the relationships would represent the “union” of
relationships.
Figure 1. Summarization model showing generated problems list, the other data aggregates, and clinical relationships among them.
Our patient record summarization consists of the following
data aggregates:
• Generated problem list
• Medications
• Lab tests
• Procedures
• Vitals
• Timeline of patient encounters
• Social history, allergies, and demographics
Summarization automatically generates the following clinical
semantics:
• Relationships between the problem list entries and the elements of the other clinical data aggregates
• Clinically meaningful grouping of elements in each data aggregate
• Categorization of patient encounters based on the physician specialty
• Filtered and/or prioritized summary data based on the specialty of the physician using the summary
C. Visualization of Patient Record Summarization
Figure 2 shows the visualization of the patient record
summarization. Each table in the view holds a data aggregate,
and it has a default presentation based on the clinical grouping,
but can also be re-ordered by date, alphabetically, or by other
aggregate-specific characteristics. For example, the generated
problems list table is shown with clinical grouping by default;
however, the table can be re-ordered to show problems by
diagnosis date.
The patient encounters are shown in a timeline and they
are categorized by the clinician type. The Specialties category
can be expanded to see the most frequently visited specialists.
The timeline can be narrowed to focus on a shorter period of
time, rather than the entire time range.
Selecting one or more problems changes the visualization
of several data aggregates in order to highlight elements in
them that are clinically related to the problem(s). As shown in
Figure 3, when Diabetes Mellitus, Non-insulin Dependent is
selected, the related medications that the patient is taking,
Metformin and Glipizide, are highlighted and shown at the top
of the list. Similarly, related labs, procedures, and clinical
encounters are highlighted when a problem is selected. A
physician viewing this summary can therefore quickly grasp
this patient’s treatments and labs for the selected problem(s)
and quickly find relevant notes from previous encounters.
Figure 3. When a medical problem is selected, the dashboard
highlights related patient medications and brings them to the top.
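The selection behavior described above (highlight related items, move them to the top, and take the union of relationships when several problems are selected) could look roughly like the following Python sketch. The `treated_by` map, the `prioritize` and `related_to` helpers, and the Hypertension/Lisinopril entry are hypothetical; in the application the relationships would come from the relationship analytics rather than a hand-written table.

```python
def prioritize(items, related):
    """Return (item, highlighted) pairs with related items first.

    sorted() is stable, so items keep their existing order (for example
    the clinical grouping order) within each of the two partitions.
    """
    ranked = sorted(items, key=lambda item: item not in related)
    return [(item, item in related) for item in ranked]


# Hypothetical relationship map: problem -> medications that treat it.
treated_by = {
    "Diabetes Mellitus, Non-insulin Dependent": {"Metformin", "Glipizide"},
    "Hypertension": {"Lisinopril"},
}


def related_to(selected_problems):
    """Union of relationships when more than one problem is selected."""
    related = set()
    for problem in selected_problems:
        related |= treated_by.get(problem, set())
    return related


medications = ["Aspirin", "Glipizide", "Lisinopril", "Metformin"]
view = prioritize(
    medications, related_to(["Diabetes Mellitus, Non-insulin Dependent"])
)
```

Using a stable sort here is a design choice: it lets the highlighted rows rise to the top without disturbing whatever ordering the table already had.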
D. One or Two Click Access to Raw Data
If a physician needs to access detailed clinical data about a
patient, in our summary visualization, he/she can do so rapidly
without unnecessary mouse clicks and mouse movement. For
instance, if a physician needs to see the history of a lab,
clicking on the specific lab in the labs table opens a new
window that shows the historical values of the lab (see Figure
4). Similarly, clicking on a medication in the medications table
will bring up the timeline for it.
Figure 2. A dashboard-style visualization of a patient record summary, showing clinical data in tables and patient contacts as a timeline.
Reviewing clinical notes from previous encounters is
sometimes necessary. Clicking on the markers in the
encounters timeline in the summary view opens a window
showing the corresponding clinical note. Relevant clinical
notes for a problem can also be accessed by clicking on the
problem. A list of relevant clinical notes appears, each with a
brief synopsis. The physician can preview the synopsis and
then click to fully open the corresponding clinical note. In the
clinical note, references to the problem are highlighted.
Figure 4. One-click access to lab test results (Hemoglobin A1C) to
see data, as well as a plot with reference high and low.
IV. ANALYSIS AND ACCURACY
The summarization described above depends on natural
language processing, information retrieval, and semantic
reasoning techniques from the Watson system. The foundation
of the analysis is the identification of medical concepts in an
EMR’s clinical notes and in its metadata, which we
describe next.
A. UMLS concepts extraction
Our analyses use Concept Unique Identifiers (CUIs) defined by
the Unified Medical Language System (UMLS) [12] to
reason about medical concepts in the EMR data. UMLS
concepts are now commonly used in medical text analytics, as
they facilitate reasoning in a standardized vocabulary. Published
literature often cites the UMLS MetaMap software [12] for
mapping plain text to UMLS concepts; however, we use the
Watson NLP and medical concept analytics, which offer
significant functional refinement and runtime improvement.
Figure 5 shows a typical clinical note and how the text is
annotated to identify UMLS concepts. The natural language
processing component of Watson includes an English language
parser, a concept mapper, a negation detector, and related
technologies. As seen in the figure, we identify various UMLS
concepts (e.g. Diabetes Mellitus) and their semantic types (e.g.
Disease or Syndrome) [13].
Figure 5. Medical concepts in the EMR clinical notes are identified
as UMLS concepts in preparation for reasoning about the EMR
contents using the UMLS standardized vocabulary.
In addition to the clinical notes text, we identify UMLS
concepts for the entries in the EMR semi-structured data, such
as the name of a medication. Here, there is no sentence
structure and the term represents a certain clinical entity (e.g. a
medication). Therefore, we can directly find the term’s UMLS
concepts in the corresponding semantic type. This helps to find
accurate concept identifiers for the term.
B. Relationship Scoring
As mentioned earlier, an important part of the
summarization is to establish clinically meaningful
relationships between the generated medical problems and the
elements of the other clinical data aggregates. In order to do so,
the summarization needs to quantify pair-wise clinical
association between the problems and medications, labs, and
procedures.
Watson used a combination of rule-based and statistical
approaches to learn relations between entities from the broad-
domain corpora for the Jeopardy! game [14]. This approach was
later extended to relations between medical concepts in
adapting Watson to the medical domain [15] and was also
enhanced using the UMLS relations between medical concepts
[12] [16]. In addition, Latent Semantic Analysis [17] applied to
the medical corpus can also provide an association score
between medical concepts. An even more accurate approach
called Distributional Relation Detection, incorporating
Distributional Semantics [18], is being developed for scoring
associations between medical concepts in Watson.
We applied two of these methods, the Latent Semantic
Analysis and the Distributional Semantics, to score relations
between problems and elements from the other clinical
aggregates (e.g. medications). We measured the accuracy of
the two methods by testing with the “ground truth” created by
medical experts for twenty de-identified medical records of
actual patients made available to us by Cleveland Clinic under
an IRB protocol for the study. The medical experts reviewed
the patient medical records and identified the relationships.
Table 1 shows the accuracy of the relations scoring algorithms
for problems and medications compared to the ground truth.
While the accuracy improvement is still in progress, the
preliminary results are encouraging for the Distributional
Semantics approach.
Table 1. Accuracy of determining whether a medication treats a problem, for the two analysis methods we tried. The area under the curve (1.0 is the best) is calculated from the precision-recall curve at different threshold values for positive association.
Relationship Detection Algorithm    Area Under the Precision-Recall Curve
Latent Semantic Analysis (LSA)      0.36
Distributional Semantics            0.54
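For readers unfamiliar with the metric in Table 1, the following Python sketch shows one common way to compute the area under a precision-recall curve from association scores and expert labels. The paper does not say how the area was integrated, so the trapezoidal rule, the function names, and the toy scores are assumptions made for illustration.

```python
def precision_recall_points(scores, labels):
    """(recall, precision) at each distinct score threshold, high to low."""
    total_pos = sum(labels)
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = []
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Admit every pair at this threshold before recording a point.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((tp / total_pos, tp / (tp + fp)))
    return points


def area_under_pr(points):
    """Trapezoidal area; the curve is anchored at recall 0 using the
    precision of the first (highest-threshold) point."""
    area = 0.0
    prev_recall, prev_precision = 0.0, points[0][1]
    for recall, precision in points:
        area += (recall - prev_recall) * (precision + prev_precision) / 2
        prev_recall, prev_precision = recall, precision
    return area


# Toy data: one score per (problem, medication) pair, with label 1 when
# the experts marked the medication as treating the problem.
scores = [0.9, 0.8, 0.6, 0.4]
labels = [1, 0, 1, 0]
auc = area_under_pr(precision_recall_points(scores, labels))
```

A scorer that ranks every true pair above every false pair reaches an area of 1.0 under this definition, which matches the "1.0 is the best" reading in the table caption.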
C. Relating Problems to Notes
To show the clinical notes relevant to a problem, we
identify UMLS disorders (i.e. medical concepts that belong to
the semantic type disorders in UMLS) in a clinical note and
match them with (meaning equal to or close variants of) the
concept unique identifiers of the problem. For example, for
Diabetes Mellitus from the problem list, clinical notes that
contain one or more UMLS concept identifiers matching that
of Diabetes Mellitus are identified as relevant to this problem.
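A minimal Python sketch of this matching step follows. It assumes that each note has already been reduced to the set of disorder CUIs found in it, and that the problem is represented by its own CUI plus any close variants; how close variants are chosen is not specified here. The identifiers are placeholders, not real UMLS CUIs, and `relevant_notes` is a hypothetical helper.

```python
def relevant_notes(problem_cuis, notes):
    """Return ids of notes whose disorder CUIs overlap the problem's CUIs."""
    return [
        note_id
        for note_id, disorder_cuis in notes.items()
        if problem_cuis & disorder_cuis
    ]


# Placeholder identifiers standing in for UMLS CUIs.
diabetes_cuis = {"CUI-DM", "CUI-DM-TYPE2"}  # the concept plus a close variant

notes = {
    "note-1": {"CUI-DM", "CUI-HTN"},
    "note-2": {"CUI-ASTHMA"},
    "note-3": {"CUI-DM-TYPE2"},
}
matches = relevant_notes(diabetes_cuis, notes)
```

Set intersection keeps the check independent of how many times a concept is mentioned in a note; a ranking by mention count would be a separate design decision.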
D. Grouping Analysis
The clinical grouping analysis for medications starts with
an unordered list of medications from an EMR, and ends with a
clinically ordered medications list in which related medications
are together. The analysis first maps each medication to a set of
general classes from The National Drug File Reference
Terminology (NDF-RT) [19], which models each drug in terms
of various classes including its ingredients, chemical structure,
dose form, mechanism of action and pharmacokinetics. The
next step in the analysis clusters the medications based on the
similarity of their classes. The clustering is a bottom-up
hierarchical method using cosine similarity of their class
vectors. The resulting hierarchical clustering is shown for a
patient’s medications in Figure 6.
Notice that in the clinically grouped medications list, the
patient’s steroidal asthma treatments - Prednisone,
Dexamethasone, and Medrol - are close to each other, but as a
group they are distant from the patient’s antipyretics and
analgesics - Aspirin, Acetaminophen, and Motrin.
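The grouping analysis can be illustrated with a small pure-Python sketch of bottom-up clustering over class vectors using cosine similarity. Several details are assumptions: the class labels below are simplified stand-ins rather than actual NDF-RT classes, the vectors are binary indicators, and average linkage is used because the linkage criterion is not stated in the text.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def class_vector(classes, vocabulary):
    """Binary indicator vector over a fixed class vocabulary."""
    return [1.0 if c in classes else 0.0 for c in vocabulary]


def cluster(vectors):
    """Bottom-up clustering; returns merges in the order they happen.

    Average linkage is an assumption made for this sketch.
    """
    clusters = [[name] for name in vectors]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sims = [cosine(vectors[a], vectors[b])
                        for a in clusters[i] for b in clusters[j]]
                score = sum(sims) / len(sims)
                if best is None or score > best[0]:
                    best = (score, i, j)
        score, i, j = best
        merges.append((clusters[i], clusters[j], score))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Illustrative class assignments only, not taken from NDF-RT.
drug_classes = {
    "Prednisone": {"glucocorticoid", "anti-inflammatory"},
    "Dexamethasone": {"glucocorticoid", "anti-inflammatory"},
    "Aspirin": {"analgesic", "antipyretic", "anti-inflammatory"},
    "Acetaminophen": {"analgesic", "antipyretic"},
}
vocabulary = sorted(set().union(*drug_classes.values()))
vectors = {name: class_vector(cls, vocabulary)
           for name, cls in drug_classes.items()}
merges = cluster(vectors)
```

With these toy vectors the two steroids merge first, the two analgesics merge next, and the two groups join last at a much lower similarity, which mirrors the kind of separation described for Figure 6.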
A similar grouping analysis is conducted for the medical
problems using MeSH [20] Class 1 descriptors, under diseases
and mental disorders, from UMLS to create class vectors, and
then using the same clustering method used for the
medications. The process yields a clinically meaningful
grouping of the problems list of each patient.
E. Note Type Categorization
Another analysis we used is categorizing clinical notes by
the type of practice that created them, i.e. whether a note was
created by a primary care physician, a specialist, a nurse, or
an Emergency Department doctor. We call this note
categorization for short. The clinical note metadata
(description) in the EMR is not a reliable means of identifying
its note category. However, in presenting the timeline of a
patient’s encounters with clinicians, it is useful to correctly
categorize the encounters by practice because such a
categorized timeline allows a physician viewing the summary
to easily find the note from a particular type of previous
encounter. Once such a note is identified in the timeline, using
the one click access function described in section III.D, the
physician can quickly open the needed note.
We use a machine learning algorithm to identify the note
category. Machine learning features extracted from each note
for this purpose include UMLS medical concepts occurring in
the note text, whether there are certain informal sections (e.g.
previous medical history, assessment & plan) in the note, and
any physician specialty information in the note. We developed
the training and test data sets from about 2100 notes with the
help of medical experts, who categorized the notes by
practice. We used 1300 notes from the ground truth to train a
maximum entropy model, and used the remaining 800 to test
the model. Results as shown in Table 2 indicate reasonable
accuracy (overall F1 score of 0.782) for the model.
Table 2. Accuracy of note categorization analysis; each note is categorized as one of the five shown types using a maximum entropy model; the overall F1 score is reasonably high.
Note Type       Precision   Recall   F1 Score
Primary Care    0.636       0.677    0.656
Specialties     0.804       0.830    0.817
Emergency       0.824       0.737    0.778
Nursing         1.000       0.500    0.667
Other           0.746       0.798    0.771
Total           0.782       0.782    0.782
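As a quick consistency check on Table 2, the F1 column can be recomputed from the reported precision and recall using the standard harmonic-mean definition. The short Python sketch below does this; the `table2` dictionary simply transcribes the table, and the `f1` helper is introduced here for illustration.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) transcribed from Table 2.
table2 = {
    "Primary Care": (0.636, 0.677, 0.656),
    "Specialties": (0.804, 0.830, 0.817),
    "Emergency": (0.824, 0.737, 0.778),
    "Nursing": (1.000, 0.500, 0.667),
    "Other": (0.746, 0.798, 0.771),
    "Total": (0.782, 0.782, 0.782),
}

recomputed = {name: round(f1(p, r), 3) for name, (p, r, _) in table2.items()}
```

Every recomputed value agrees with the reported F1 to three decimals. The identical precision, recall, and F1 in the Total row are what one would expect if the totals are micro-averaged over notes that each receive exactly one category, although the averaging method is not stated.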
Figure 6. Clinically related medications are grouped together in our summarization.
V. FUTURE WORK AND SUMMARY
The application and analytics described here are the
beginning of an effort to apply the Watson technologies to
analysis of a patient record. The patient record summary
described here includes a generated problem list and clinical
data aggregates such as medications, lab tests, procedures, and
clinical encounter notes. The Watson analytics provide
clinically relevant relationships between problems and the
other clinical data. The analytics also provide a means to group
data aggregates semantically, and to categorize clinical notes
(and therefore, encounters). The Watson analytics are also used
for the problem list generation, but the method is not described
in this paper. The summary can be visualized in a dashboard of
clinical data aggregates and clinical note timelines. The
dashboard also shows semantic relations, grouping, and clinical
note categorization. It also provides rapid access to
actual notes, and to the current and historical values of
medications and labs, via a single click in the application. The
intent of the summarization is to help physicians quickly grasp
all of the important aspects of a patient record, with easy access
to details as needed.
The larger goal of this research is to apply Watson
technology to build a clinical decision support system that
works directly with a complete Electronic Medical Record of a
patient. As a near term goal, we will further improve patient
record summarization and conduct experiments to assess the
effectiveness of this record summary in patient care. Improving
patient record summarization is the process of establishing
increasingly richer clinical relationships, including disease
progression and causal associations, in a patient’s EMR. Many
of the Watson technologies, including Deep Question
Answering (DeepQA), can help develop the necessary algorithms.
VI. ACKNOWLEDGEMENTS
We thank the physicians and IT staff at Cleveland Clinic
who guided definition of the requirements for this application
and provided de-identified EMRs under an IRB protocol for
the study. We also acknowledge the groundbreaking work of
our Watson team colleagues, past and present, which made this
application possible.
VII. REFERENCES
[1] D. Ferrucci, E. Brown, J. Chu-Carroll, J. Fan, D. Gondek, A. A. Kalyanpur, A. Lally, J. W. Murdock, E. Nyberg, J. Prager, N. Schlaefer and C. Welty, "Building Watson: An overview of the DeepQA project," AI Magazine, vol. 31, no. 3, pp. 59-79, 2010.
[2] "This Is Watson," IBM Journal of Research and Development, vol. 56, no. 3.4, pp. 1:1 - 1:15, 2012.
[3] D. Das and A. F. T. Martins, "A Survey on Automatic Text Summarization," Carnegie Mellon University, 2007.
[4] R. Alterman, "Understanding and Summarization," Artificial Intelligence Review, vol. 5, no. 4, pp. 239-254, 1991.
[5] D. R. Radev, E. Hovy and K. McKeown, "Introduction to the special issue on text summarization," Computational Linguistics, vol. 28, no. 4, December 2002.
[6] D. Reichert, D. Kaufman, B. Bloxham, H. Chase and N. Elhadad, "Cognitive Analysis of the Summarization of Longitudinal Patient Records," in AMIA Annu Symp Proc, 2010.
[7] L. L. Weed, "Medical Records That Guide and Teach," New England Journal of Medicine, pp. 652-657, March 1968.
[8] C. Plaisant, R. Mushlin, A. Snyder, J. Li, D. Heller and B. Shneiderman, "LifeLines: Using Visualization to Enhance Navigation and Analysis of Patient Records," in AMIA Annu Symp Proc, 1998.
[9] Z. Zhang, F. Ahmed, A. Mittal, I. Ramakrishnan, R. Zhao, A. Viccellio and K. Mueller, "AnamneVis: A Framework for the Visualization of Patient History and Medical Diagnostics Chains," in Workshop on Visual Analytics in Healthcare: Understanding the Physician Perspective, Providence, RI, 2011.
[10] T. D. Wang, C. Plaisant, A. J. Quinn, R. Stanchak and B. Shneiderman, "Aligning temporal data by sentinel events: discovering patterns in electronic health records," in Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI '08), 2008.
[11] J. C. Feblowitz, A. Wright, H. Singh, L. Samal and D. F. Sittig, "Summarization of clinical information: A conceptual model," Journal of Biomedical Informatics, vol. 44, pp. 688-699, 2011.
[12] "UMLS Reference Manual," National Library of Medicine (US), September 2009. [Online]. Available: http://www.ncbi.nlm.nih.gov/books/NBK9675/. [Accessed 15 04 2014].
[13] "UMLS Semantic Groups," National Library of Medicine (US), [Online]. Available: http://semanticnetwork.nlm.nih.gov/SemGroups/SemGroups.txt. [Accessed 15 04 2014].
[14] C. Wang, A. A. Kalyanpur, J. Fan, B. Boguraev and D. Gondek, "Relation Extraction and Scoring in DeepQA," IBM Journal of Research and Development, 2012.
[15] D. Ferrucci, A. Levas, S. Bagchi, D. Gondek and R. T. Mueller, "Watson: Beyond Jeopardy!," Artificial Intelligence, pp. 93-105, 2013.
[16] C. Wang and J. Fan, "Medical Relation Extraction with Manifold Models," in The 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014), 2014.
[17] S. Deerwester, S. T. Dumais, G. W. Furnas, T. K. Landauer and R. Harshman, "Indexing by Latent Semantic Analysis," Journal of the American Society for Information Science, vol. 41, no. 6, pp. 391-407, September 1990.
[18] A. Gliozzo, "Beyond Jeopardy! Adapting Watson to New Domains Using Distributional Semantics," [Online]. Available: https://www.icsi.berkeley.edu/icsi/sites/default/files/events/talk_20121109_gliozzo.pdf. [Accessed 18 04 2014].
[19] "National Drug File - Reference Terminology (NDF-RT)," National Library of Medicine (US), [Online]. Available: http://www.nlm.nih.gov/research/umls/sourcereleasedocs/current/NDFRT. [Accessed 15 04 2014].
[20] "MeSH," National Library of Medicine (US), [Online]. Available: http://www.nlm.nih.gov/mesh/meshhome.html. [Accessed 16 04 2014].
... The reasoning chain is visualized through multi-stage flow charts enriched with examination data. Devarakonda et al. developed a visualization method based on summarisation of Electronic Medical Records (EMRs) created by Watson analytics, which relates a patient's problem to relevant clinical data [13]. Dabek et al. [14] described methods for aggregating and summarizing of electronic health records. ...
Conference Paper
Full-text available
Since pathology is supported by information tech- nology new opportunities and questions have arisen. The digital age enables analyzing histopathological data with artificial intel- ligence methods to reveal further information and correlations. In this paper existing approaches to visualization of medical decision processes are presented as well as the relevance of explainability in decision making. The first step for implementing decision-paths in systems is to retrace an experienced patholo- gist’s diagnosis finding process. Recording a route through a landscape composed of human tissue in terms of a roadbook is one possible approach to collect information on how diagnoses are found. Choosing the roadbook metaphor provides a simple schema, that holds basic directions enriched with metadata regarding landmarks on a rally - in the context of pathology such landmarks provide information on the decision finding process.
... IBM offers a variety of services in terms of predictive analysis such as Watson analytics and SPSS [24][25][26][27][28][29][30][31][32][33][34][35][36]. AlFaris et al. reviewed the smart technologies; the interface and integration of the meters, sensors and monitoring systems with the home energy management system (HEMS) within the IoT with the outline that the smart home in practice provides the ability to the house to be net-zero energy building. ...
Article
Full-text available
Standard solutions for handling a large amount of measured data obtained from intelligent buildings are currently available as software tools in IoT platforms. These solutions optimize the operational and technical functions managing the quality of the indoor environment and factor in the real needs of residents. The paper examines the possibilities of increasing the accuracy of CO₂ predictions in Smart Home Care (SHC) using the IBM SPSS software tools in the IoT to determine the occupancy times of a monitored SHC room. The processed data were compared at daily, weekly and monthly intervals for the spring and autumn periods. The Radial Basis Function (RBF) method was applied to predict CO₂ levels from the measured indoor and outdoor temperatures and relative humidity. The most accurately predicted results were obtained from data processed at a daily interval. To increase the accuracy of CO₂ predictions, a wavelet transform was applied to remove additive noise from the predicted signal. The prediction accuracy achieved in the selected experiments was greater than 95%.
... In [16] it has been used to parse medical texts through the combination of deep linguistic learning analysis and background resources to detect and match entities and relations [16]. In [8] it has been used to build a system able to summarize the great amount of information contained in medical texts to create a new generation of Electronic Medical Records (EMR). ...
... For the training of their system, they use an active learning methodology, in which the user interactively provides the desired output. In [13], the authors highlight the power of IBM Watson in identifying key relationships among clinical concepts. They aggregate data by type, e.g. ...
Conference Paper
Clinical summarization means the collection and synthesis of a patient's significant data, undertaken in order to support health-care providers in the process of patient care. Considering that medical information comes from multiple sources, a system for the automatic generation of problem lists could prove to be very effective in terms of saving time in the analysis of large amounts of medical data. In this paper, we propose a system able to acquire and present relevant references to medical disorders from a patient's history, producing a subject-oriented summary. The implemented system relies on an NLP pipeline, for the extraction of relevant medical entities contained in narrative health records, and on several queries, necessary for the scanning of structured documents. The tool aggregates any medical problems, performed procedures, and prescribed medications, providing the healthcare practitioner with a visual summary of the patient's data.
Article
The surge in text data has driven extensive research into developing diverse automatic summarization approaches to effectively handle vast textual information. There are several reviews on this topic, yet no large‐scale analysis based on quantitative approaches has been conducted. To provide a comprehensive overview of the field, this study conducted a bibliometric analysis of 3108 papers published from 2010 to 2022, focusing on automatic summarization research regarding topics and trends, top sources, countries/regions, institutions, researchers, and scientific collaborations. We have identified the following trends. First, the number of papers has experienced 65% growth, with the majority being published in computer science conferences. Second, Asian countries and institutions, notably China and India, actively engage in this field and demonstrate a strong inclination toward inter‐regional international collaboration, contributing to more than 24% and 20% of the output, respectively. Third, researchers show a high level of interest in multihead and attention mechanisms, graph‐based semantic analysis, and topic modeling and clustering techniques, with each topic having a prevalence of over 10%. Finally, scholars have been increasingly interested in self‐supervised and zero/few‐shot learning, multihead and attention mechanisms, and temporal analysis and event detection. This study is valuable when it comes to enhancing scholars' and practitioners' understanding of the current hotspots and future directions in automatic summarization. This article is categorized under: Algorithmic Development > Text Mining
Article
Background and aim: Cognitive computing systems are intelligent systems that think, understand, and augment the capabilities of the human brain by blending the technologies of Artificial Intelligence, Machine Learning and Natural Language Processing. In recent days, maintenance or enhancement of health by preclusion, prognosis, and analysis of diseases has become a challenging task. The increasing number of diseases and their causes has become a big question before humanity. Limited risk analysis, meticulous training process, and automated critical decision-making are some of the issues of cognitive computing. To overcome these issues, cognitive computing in healthcare works like a medical prodigy which anticipates the disease or illness of the human being and helps the doctors with technological facts to take timely action. The main aim of this survey article is to explore the present and futuristic technological trends of cognitive computing in healthcare. In this work, different cognitive computing applications are reviewed, and the best application is recommended to the clinicians. Based on this recommendation, the clinicians are able to monitor and analyze the physical health of patients. Methods: This article presents the systematic literature on the different aspects of cognitive computing in healthcare. Nearly seven online databases such as SCOPUS, IEEE Xplore, Google Scholar, DBLP, Web of Science, Springer and PubMed were screened, and the published articles related to cognitive computing in healthcare were collected from 2014 to 2021. In total, 75 articles were selected and examined, and their pros and cons were analyzed. The analysis is done with respect to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines.
Results: The basic findings of this review article and their significance for theory and practice are mindmaps portraying the cognitive computing platforms, cognitive applications in healthcare, and use cases of cognitive computing in healthcare. A detailed discussion section highlights the present issues, future research directions and recent applications of cognitive computing in healthcare. Accuracy analysis of different cognitive systems concludes that the Medical Sieve achieves 0.95 and Watson For Oncology (WFO) achieves 0.93, which hence prove to be the prominent computing systems for healthcare. Conclusions: Cognitive computing, an evolving technology in healthcare, augments the clinical thought process and enables the doctors to make the right diagnosis and preserve the patient's health in good condition. These systems provide timely care and optimal, cost-effective treatment. This article provides an extensive survey of the importance of cognitive computing in the health sector by highlighting the platforms, techniques, tools, algorithms, applications, and use cases. This survey also explores the works in the literature on present issues and proposes future research directions for applying cognitive systems in healthcare.
Preprint
This article describes the use of the PI ProcessBook software tool for visualization and indirect monitoring of occupancy of SHC rooms from the measured operational and technical quantities, for monitoring of daily living activities in support of independent life of elderly persons. The proposed method for data processing (predicting the CO2 course using neural networks from the measured indoor temperature Ti (°C), outdoor temperature To (°C) and indoor relative humidity rHi (%)) was implemented, verified and compared in the MATLAB SW tool and the IBM SPSS SW tool with IoT platform connectivity. Within the proposed method, the Stationary Wavelet Transform denoising algorithm was used to remove the noise of the resulting predicted course. In order to verify the method, two long-term experiments were performed (specifically from February 8 to February 15, 2015, and from June 8 to June 15, 2015) and two short-term experiments (from February 8, 2015 and from June 8, 2015). For the best results of the trained ANN BRM within the prediction of CO2, the correlation coefficient R for the proposed method was up to 90%. The verification of the proposed method confirmed that the detected presence of persons in the monitored SHC premises can be used for ADL monitoring of rooms.
Article
Background: This study demonstrates clinical named entity recognition (NER) methods on the clinical texts of rheumatism patients in South Korea. Despite the recent increase in the adoption rate of the electronic health record (EHR) system in global health institutions, health information technologies for handling and acquisition of information from numerous unstructured texts in the EHR system are still in their developing stages. The aim of this study is to verify the conventional named entity recognition (NER) methods, namely dictionary-lookup-based string matching and conditional random fields (CRFs). Methods: We selected discharge summaries for 200 rheumatic patients from the EHR system of the Seoul National University Hospital and attempted to identify heterogeneous semantic types present in the clinical notes of each patient's history. Results: CRFs outperform string matching in extracting most semantic types (median F1 = 0.761, minimum = 0.705, maximum = 0.906). String matching is found to be better suited for identifying hospital visit information. The performance of both methods is comparable for identifying medications. The 10-fold cross-validation shows that CRFs had median F1 = 0.811 (minimum = 0.752, maximum = 0.918), and exhibited good performance even when trained with simple features. Conclusion: CRFs are a good candidate for implementing clinical NER in Korean clinical narrative documents. Increasing the training data and incorporating sophisticated feature engineering might improve the accuracy of identifying health information, enabling automated patient history summarization in the future.
Article
We present a new model of patient record search, called SemanticFind, which goes beyond traditional textual and medical synonym matches by locating patient data that a clinician would want to see rather than just what they ask for. The new model is implemented by making extensive use of the UMLS semantic network, distributional semantics, and NLP, to match query terms along several dimensions in a patient record with the returned matches organized accordingly. The new approach finds all clinically related concepts without the user having to ask for them. An evaluation of the accuracy of SemanticFind shows that it found twice as many relevant matches compared to those found by literal (traditional) search alone, along with very high precision and recall. These results suggest potential uses for SemanticFind in clinical practice, retrospective chart reviews, and in automated extraction of quality metrics.
Article
The medical history or anamnesis of a patient is the factual information obtained by a physician for the medical diagnostics of a patient. This information includes current symptoms, history of present illness, previous treatments, available data, current medications, past history, family history, and others. Based on this information the physician follows through a medical diagnostics chain that includes requests for further data, diagnosis, treatment, follow-up, and eventually a report of treatment outcome. Patients often have rather complex medical histories, and visualization and visual analytics can offer large benefits for the navigation and reasoning with this information. Here we present AnamneVis, a system where the patient is represented as a radial sunburst visualization that captures all health conditions of the past and present to serve as a quick overview to the interrogating physician. The patient's body is represented as a stylized body map that can be zoomed into for further anatomical detail. On the other hand, the reasoning chain is represented as a multi-stage flow chart, composed of date, symptom, data, diagnosis, treatment, and outcome.
Article
This paper presents a vision for applying the Watson technology to health care and describes the steps needed to adapt and improve performance in a new domain. Specifically, it elaborates upon a vision for an evidence-based clinical decision support system, based on the DeepQA technology, that affords exploration of a broad range of hypotheses and their associated evidence, as well as uncovers missing information that can be used in mixed-initiative dialog. It describes the research challenges, the adaptation approach, and finally reports results on the first steps we have taken toward this goal.
Article
The increasing availability of online information has necessitated intensive research in the area of automatic text summarization within the Natural Language Processing (NLP) community. Over the past half a century, the problem has been addressed from many different perspectives, in varying domains and using various paradigms. This survey intends to investigate some of the most relevant approaches both in the areas of single-document and multiple-document summarization, giving special emphasis to empirical methods and extractive techniques. Some promising approaches that concentrate on specific details of the summarization problem are also discussed. Special attention is devoted to automatic evaluation of summarization systems, as future research on summarization is strongly dependent on progress in this area.
Conference Paper
In this paper, we present a manifold model for medical relation extraction. Our model is built upon a medical corpus containing 80M sentences (11 gigabyte text) and designed to accurately and efficiently detect the key medical relations that can facilitate clinical decision making. Our approach integrates domain specific parsing and typing systems, and can utilize labeled as well as unlabeled examples. To provide users with more flexibility, we also take label weight into consideration. Effectiveness of our model is demonstrated both theoretically with a proof to show that the solution is a closed-form solution and experimentally with positive results in experiments.
Article
Detecting semantic relations in text is an active problem area in natural-language processing and information retrieval. For question answering, there are many advantages of detecting relations in the question text because it allows background relational knowledge to be used to generate potential answers or find additional evidence to score supporting passages. This paper presents two approaches to broad-domain relation extraction and scoring in the DeepQA question-answering framework, i.e., one based on manual pattern specification and the other relying on statistical methods for pattern elicitation, which uses a novel transfer learning technique, i.e., relation topics. These two approaches are complementary; the rule-based approach is more precise and is used by several DeepQA components, but it requires manual effort, which allows for coverage on only a small targeted set of relations (approximately 30). Statistical approaches, on the other hand, automatically learn how to extract semantic relations from the training data and can be applied to detect a large number of relations (approximately 7,000). Although the precision of the statistical relation detectors is not as high as that of the rule-based approach, their overall impact on the system through passage scoring is statistically significant because of their broad coverage of knowledge.
Conference Paper
Electronic Health Records (EHRs) and other temporal databases contain hidden patterns that reveal important cause-and-effect phenomena. Finding these patterns is a challenge when using traditional query languages and tabular displays. We present an interactive visual tool that complements query formulation by providing operations to align, rank and filter the results, and to visualize estimates of the intervals of validity of the data. Display of patient histories aligned on sentinel events (such as a first heart attack) enables users to spot precursor, co-occurring, and aftereffect events. A controlled study demonstrates the benefits of providing alignment (with a 61% speed improvement for complex tasks). A qualitative study and interviews with medical professionals demonstrate that the interface can be learned quickly and seems to address their needs.
Article
This article is an overview of the literature on narrative summarization. The capacity to summarize is a fundamental property of intelligence and has significance for several areas of artificial intelligence research and development. The first part of the paper includes a description of four critical features of a summary. The bulk of this review is concerned with sorting available summarization frameworks and techniques. A latter section of the paper describes the significance of summarization technology to three current topics in artificial intelligence: explanation-based learning, case-based reasoning, and plan evaluation.