
Task 1 of the CLEF eHealth Evaluation Lab 2014: Visual-Interactive Search and Exploration of eHealth Data


Hanna Suominen1, Tobias Schreck2, Gondy Leroy3, Harry Hochheiser4, Lorraine
Goeuriot5, Liadh Kelly5, Danielle L Mowery4, Jaume Nualart6, Gabriela Ferraro7,
Daniel Keim2
1 NICTA, The Australian National University, University of Canberra, and University of Turku,
Canberra, ACT, Australia, corresponding author
2 University of Konstanz, Germany, {tobias.schreck, daniel.keim}@uni-
3 University of Arizona, Tucson, AZ, USA,
4 University of Pittsburgh, Pittsburgh, PA, USA, {harryh, dlm31}
5 Dublin City University, Dublin, Ireland, {lgoeuriot, lkelly}
6 NICTA, University of Canberra, and University of Barcelona, Canberra, ACT, Australia,
7 NICTA and The Australian National University, Canberra, ACT, Australia,
Abstract. Discharge summaries serve a variety of aims, ranging from clinical
care to legal purposes. They are also important tools in patient empowerment,
but a patient’s comprehension of the information is often suboptimal. Continu-
ing in the tradition of focusing on automated approaches to increasing patient
comprehension, the CLEFeHealth2014 lab tasked participants to visualize the
information in discharge summaries while also providing connections to addi-
tional online information. Participants were provided with six cases containing
a discharge summary, patient profile and information needs. Of fifty registra-
tions, only the FLPolytech team completed all requirements related to the task.
They augmented the discharge summary by linking to external resources, inserting
structure related to timing of the information need (past, present, future), enriching
the content, i.e., with definitions, and providing meta-information, e.g.,
how to make future appointments. Four panellists evaluated the submission.
Overall, they were positive about the enhancements, but all agreed that addi-
tional visualization could further improve the provided solution.
Keywords: Comprehension, Information Retrieval, Information Visualization,
Evaluation, Medical Informatics, Patient Education, Records as Topic, Software
Design, Test-set Generation, Text Classification, User-Computer Interface
Contributor Statement: HS, TS, GL, HH, DK, LG, and LK designed the task
and its evaluation methodology. Together with JN and GF, they developed the
task description. LG and LK chose the six patient cases and extracted the respective
subset from the CLEFeHealth2013 data. HS and DLM manually completed the
de-identification of the discharge summaries in this subset. HS, HH, TS, JN, and
GL prepared initial example designs as a starting and inspiration point for par-
ticipants. HS, TS, and GL led the task as a part of the CLEFeHealth2014 evalu-
ation lab, chaired by LG and LK. HS drafted this paper and after this all authors
expanded and revised it. All authors have read and approved the final version.
1 Introduction
Discharge summaries transfer information in health care services between working
shifts and geographical locations. They are written or dictated by nurses, physicians,
radiologists, specialists, therapists, or other clinicians responsible for patient care to
describe the course of treatment, the status at release, and care plans. Their primary
purpose is to support the care continuum as a handoff note between clinicians, but
they also serve legal, financial, and administrative purposes.
However, patients, their next-of-kin, and other laypersons are likely to perceive the
readability of discharge summaries as poor, in other words, have difficulties in under-
standing their content (Fig. 1) [1]. Improving the readability of these summaries can
empower patients, providing partial control and mastery over health and care, leading
to patients making better health/care decisions, being more independent from health
care services, and decreasing the associated costs [2]. Specifically, supportive,
patient-friendly, personalized language can help patients have an active role in their
health care and make informed decisions. Making the right decisions depends on pa-
tients’ access to the right information at the right time; therefore, it is crucial to pro-
vide patients with personalized and readable information about their health conditions
for their empowerment.
Fig. 1. Summary of the CLEFeHealth2013 tasks and outcomes
The overall problem of the CLEFeHealth2014 Task 1: Visual-Interactive Search
and Exploration of eHealth Data was to help patients (or their next-of-kin) with these
readability issues.1 The CLEFeHealth2013 Tasks 1–3 developed and evaluated auto-
mated approaches for discharge summaries (see Section 2, and Fig. 1):
1. terminology standardization for medical diseases/disorders (e.g., heartburn as op-
posed to gastroesophageal reflux disease),
2. shorthand expansion (e.g., heartburn as opposed to GERD), and
3. text linkage with further information available on the Internet (e.g., care guidelines
for heartburn).
With the 2014 Task 1, we challenged participants to design interactive visualizations
that help patients better understand their discharge summaries and explore additional
relevant documents in light of a large document corpus and their various facets
in context.
As a scenario, assume that an English-speaking, discharged patient (or her next of
kin) is in her home in the USA and wants to learn about her clinical treatment history
and implications for future behavior, possible symptoms or developments, and situational
awareness related to her own health and healthcare in general. That is, the targeted
users were layperson patients (as opposed to clinical experts).
We asked participants to design an interactive visual representation of the dis-
charge summary and potentially relevant documents available on the Internet. The
goal of this tool was to provide an effective, usable, and trustworthy environment for
navigating, exploring, and interpreting both the discharge summary and the Internet
documents, as needed to promote understanding and informed decision-making. More
precisely, the participants were challenged to provide a prototype that demonstrates
the effectiveness of the proposed solutions. Although functioning prototypes were
preferred, we also accepted paper, mock screenshots or other low-fidelity prototypes.
We assumed a standard application environment as given, including a networked
desktop system and mobile device (e.g., smartphone or tablet). The challenge was
structured into two different but connected tasks:
1a: Discharge Resolution Challenge and
1b: Visual Exploration Challenge.
The participants could choose to work on these tasks separately, or address both to-
gether in an integrated task (i.e., Grand Challenge).
The rest of the paper is organized as follows: In Section 2, we justify the novelty of
our task by reviewing related work and evaluation labs (a.k.a. shared tasks, challeng-
es, or hackathons where participants’ goal is to solve the same problem, typically
using the same data set) for clinical text processing and information visualization. In
Section 3, we introduce our data set and its access policy, detail our challenge, and
specify our evaluation process for participant submissions. In Sections 4 and 5 respec-
tively we present and discuss our results.
1 (accessed 16 April 2014)
2 Related Work
In this section, we first describe previous evaluation labs for clinical text processing.
Then, we continue to justify the novelty of this task by relating the first subsection
with related evaluation labs and visual analysis of health-oriented data.
2.1 Evaluation Labs for Clinical Text Processing
Language/text technologies to generate, search, and analyze spoken or written natural,
human language were already being recognized as ways to automate text analysis in
health care in the 1970s [3, 4, 5, 6, 7]. However, their development and flow to health
care services was – and still is – substantially hindered by the barriers of lack of access
to shared data; insufficient common conventions and standards for data, techniques,
and evaluations; inabilities to reproduce the results; limited collaboration; and
lack of user-centricity [8]. Evaluation labs began addressing these barriers in the early
2000s [9].
The first evaluation labs related to clinical language were in the Text REtrieval
Conference (TREC).2 This on-going series of annual evaluation labs, conferences, and
workshops was established in 1992 with its focus on information search. In 2000, the
TREC filtering track considered user profiling to filter in only the relevant documents
[10].3 Its data set contained approximately 350,000 abstracts related to biomedical
sciences over five years, manually created topics, and a topic set based on the stand-
ardized Medical Subject Headings (MeSH). In 2003–2007, the TREC genomics track
organized annual shared tasks with problems ranging from ad-hoc search to classifica-
tion, passage retrieval, and entity-based question answering [11].4 Its data sets origi-
nated from biomedical papers and clinical reports. In 2011–2012, the TREC medical
records track challenged the participants to develop search engines for identifying
patient cohorts from clinical reports for recruitment as populations in comparative
effectiveness studies [12].5 Its data set consisted of de-identified clinical reports,
searches that resemble eligibility criteria of clinical studies, and associated rele-
vance assessments.
In 1997, a Japanese counterpart of TREC, called NII Test Collection for Infor-
mation Retrieval Systems (NTCIR), was launched.6 In 2013 and 2014, its MedNLP
track considered clinical documents (i.e., simulated medical reports in Japanese)
[13].7 Tasks of this track included text de-identification in 2013; complaint/diagnosis
extraction in 2013 and 2014; complaint/diagnosis normalization in 2014; and an open
challenge, where participants were given the freedom to try to solve any other natural
language processing (NLP) task on the clinical data set of the task in 2013 and 2014.
In 2014, participant submissions are due by August 1.8
2 (accessed 16 April 2014)
3 (accessed 16 April 2014)
4 (accessed 16 April 2014)
5 (accessed 16 April 2014)
6 (accessed 16 April 2014)
7 (accessed 16 April 2014)
In 2000, the Conference and Labs of the Evaluation Forum (CLEF) began as a Eu-
ropean counterpart of TREC.9 In 2005, ImageCLEFmed introduced annual tasks on
accessing biomedical images in papers and on the Internet [14].10 In 2005–2014, it
targeted language-independent techniques for annotating images with concepts; multi-
lingual and multimodal (i.e., images and text) information search; and automated
form filling related to analyzing computed tomography scans. In 2013, the Question
Answering for Machine Reading Evaluation (QA4MRE) track introduced a pilot task
on machine reading on biomedical text about Alzheimer's disease and in 2014,
QA4MRE organized a task on biomedical semantic indexing and question answering
[15].11 In 2012, CLEFeHealth was created as a new CLEF track dedicated to electron-
ic clinical documents [16].12 In 2012, it organized a workshop to prepare an evalua-
tion lab and in 2013 and 2014 (called ShARe/CLEF eHealth), it ran both evaluation
labs and workshops. The 2013 tasks aimed to improve patients’ understanding of their
clinical documents and consisted of three tasks: 1) disease/disorder extraction and
normalization; 2) abbreviation/acronym normalization; and 3) information search on
the Internet to address questions patients may have when reading their clinical rec-
ords. The tasks used a subset of 300 de-identified clinical reports (i.e., discharge
summaries together with electrocardiogram, echocardiogram, and radiology reports)
in English from about 30,000 US intensive care patients and also used approximately
one million web documents (predominantly health and medicine sites). In 2014, the
data set was similar and the tasks included visual-interactive search and exploration –
as described in this paper – together with revisions of the 2013 tasks 1 and 3 [17].
In 2006–2014, the Informatics for Integrating Biology and the Bedside (i2B2) con-
sidered clinical documents through its following seven evaluation labs [18]:13 text de-
identification and identification of smoking status in 2006; recognition of obesity and
comorbidities in 2008; medication information extraction in 2009; concept, assertion,
and relation recognition in 2010; co-reference analysis in 2011; temporal-relation
analysis in 2012; and text de-identification and identification of risk factors for heart
disease over time in 2014. Data sets for these labs originated from the USA, were in
English, and included approximately 1,500 de-identified, expert-annotated dis-
charge summaries.
In 2007 and 2011, the Medical NLP Challenges addressed automated diagnosis
coding of radiology reports and classifying the emotions found in suicide notes [19].14
In 2007, its data set included nearly two thousand de-identified radiology reports in
English from a US radiology department for children and in 2011, over a thousand
suicide notes in English were used.
8 (accessed 16 April 2014)
9 (accessed 16 April 2014)
10 (accessed 16 April 2014)
11 (accessed 16 April 2014)
12 (accessed 16 April 2014)
13 (accessed 16 April 2014)
14 (accessed 16 April 2014)
In 1998 through 2004, Senseval Workshops promoted system development for
word sense disambiguation for thirteen different languages including English, Italian,
Basque, Estonian, and Swedish. In 2007, Senseval transitioned to become SemEval
shifting the focus on other semantic tasks such as semantic role labeling, information
extraction, frame extraction, temporal annotation, etc. while continuing to address
multilingual texts. In 2014, SemEval addressed unsupervised learning of dis-
ease/disorder annotations from the ShARe/CLEF eHealth 2013 clinical texts includ-
ing 440 de-identified reports of four types – discharge summaries, electrocardio-
grams, echocardiograms, and radiology reports [20].15 These reports were in English
from the US.
2.2 Related Evaluation Labs and Visual Analysis of Health-Oriented Data
As described above, CLEFeHealth2013 is, to the best of our knowledge, the first
evaluation lab dedicated to improving patients’ understanding of their clinical docu-
ments using text processing and visual-interactive techniques. The novelty of the
2014 Task 1 described in this paper lies in combining this timely topic with promising
techniques from the field of Information Visualization.
A number of related research challenges have considered visualization and analysis
of health-oriented data before. In 2013, the Health Design Challenge had an evalua-
tion lab aiming to make clinical documents more usable by and meaningful to pa-
tients, their families, and others who take care of them.16 This design/visualization
task attracted over 230 teams to participate. However, this challenge did not specifically
address text documents or their processing. Furthermore, it mainly aimed at static
designs, whereas our challenge aims at interactive approaches that allow users to
query, navigate, and explore the data using visual representations.
The international VAST Challenge series asks researchers to design and practically
apply visual analysis systems that allow analyzing and understanding of large and
complex data sets which are provided as part of the challenge definition. Typically,
the challenge data contains unrevealed relationships and facts which need to be dis-
covered by the participants, as part of the evaluation approach. The VAST Challenge
has previously defined challenge data sets which relate to health-oriented problems.17
Specifically, in the 2011 and 2010 challenges, analysis of epidemic spread scenarios
has been proposed. Based on synthetic social media data and hospitalization records,
the task was to characterize and identify possible root causes of hypothetical epidemic spread.
Research in Information Visualization has previously addressed design of visual-
interactive systems to help understand and relate clinical and health record data. One
example work is the LifeLines2 system, which allows comparison of categorical
events (exemplified on electronic health record data) for ranking, summarizing, and
comparing event series [21]. Further works that support analysis and exploration of
electronic clinical documents have recently been surveyed in [22]. Many of these
works are oriented towards expert use by physicians and clinical researchers, and less
for layperson patient use. The latter is the focus of our lab definition.
15 (accessed 30 May 2014)
16 (accessed 16 April 2014)
17 (accessed 30 May 2014)
A strong recent interest in the research of visualization and visual analysis of
health-oriented data is also evident from a number of scientific workshops organized
previously in conjunction with the IEEE VIS conference. These include the Workshop
on Visual Analytics in Healthcare, which has started in 2011,18 and the Public
Health's Wicked Problems: Can InfoVis Save Lives? 19 Workshop.
3 Materials and Methods
In this section, we introduce our data set and its access policy, detail our challenge,
and specify our evaluation process for participant submissions. In summary, we used
both discharge summaries and relevant Internet documents; participants’ task was to
design an interactive visual representation of these data; and the evaluation process
followed the standard peer-review practice and consisted of optional draft submission
in March 2014, followed by final submission two months later.
3.1 Dataset
The input data provided to participants consisted of six carefully chosen cases from the
CLEFeHealth2013 data set [16]. Using the first case was mandatory for all partici-
pants and the other five cases were optional.
Each case consisted of a discharge summary, including the disease/disorder spans
marked and mapped to Systematized Nomenclature of Medicine Clinical Terms, Con-
cept Unique Identifiers (SNOMED-CT), and the shorthand spans marked and mapped
to the Unified Medical Language System (UMLS). Each discharge summary was also
associated with a profile (e.g., for Case 2, A forty year old woman, who seeks information
about her condition) to describe the patient, a narrative to describe
her information need (e.g., description of what type of disease hypothyroidism is), a
query to address this information need by searching the Internet documents, and the
list of the documents that were judged as relevant to the query. Each query consisted
of a description (e.g., What is hypothyroidism) and title (e.g., Hypothyroidism).
To access the data set on the PhysioNetWorks workspaces, the participants had to
first register to CLEF2014 and agree to our data use agreement.20 The dataset was
accessible to authorized users from December, 2013. Participant access to these doc-
uments was facilitated by HS. The data set is to be opened for all registered
PhysioNetWorks users in October 2014.
18 (accessed 30 May 2014)
19 (accessed 30 May 2014)
20 (accessed 16 April 2014)
Case 1 (mandatory).
1. Patient profile: This 55-year old woman with a chronic pancreatitis is worried that
her condition is getting worse. She wants to know more about jaundice and
her condition
2. De-identified discharge summary, including the disease/disorder spans marked and
mapped to SNOMED-CT, and the shorthand spans marked and mapped to UMLS
(Fig. 2)
3. Information need: chronic alcoholic induced pancreatitis and jaundice in connec-
tion with it
4. Query (Fig. 3): is jaundice an indication that the pancreatitis has advanced
(a) Title: chronic alcoholic induced pancreatitis and jaundice
(b) 113 returned documents of which 26 are relevant
Fig. 2. Partial screenshot of the case 1 discharge summary
Fig. 3. Case 1 query
Case 2 (optional).
1. Patient profile: A forty year old woman, who seeks information about her condition
2. De-identified discharge summary, including the disease/disorder spans marked and
mapped to SNOMED-CT, and the shorthand spans marked and mapped to UMLS
3. Information need: description of what type of disease hypothyroidism is
4. Query: What is hypothyroidism
(a) Title: Hypothyroidism
(b) 96 returned documents of which 15 are relevant
Case 3 (optional).
1. Patient profile: This 50-year old female is worried about what is MI, that her father
has and is this condition hereditary. She does not want additional trouble on top of
her current illness
2. De-identified discharge summary, including the disease/disorder spans marked and
mapped to SNOMED-CT, and the shorthand spans marked and mapped to UMLS
3. Information need: description of what type of disease hypothyroidism is
4. Query: MI
(a) Title: MI and hereditary
(b) 132 returned documents of which 14 are relevant
Case 4 (optional).
1. Patient profile: This 87-year old female has had several incidences of abdominal
pain with no clear reason. The family now wants to seek information about her
bruises and raccoon eyes. Could they be a cause of some blood disease
2. De-identified discharge summary
3. Information need: can bruises and raccoon eyes be symptoms of blood disease
4. Query: bruises and raccoon eyes and blood disease
(a) Title: bruises and raccoon eyes and blood disease
(b) 110 returned documents of which 5 are relevant
Content of Fig. 3 (Case 1 query):
<title>chronic alcoholic induced pancreatitis and jaundice</title>
<desc>is jaundice an indication that the pancreatitis has advanced</desc>
<narr>chronic alcoholic induced pancreatitis and jaundice in
connection with it</narr>
<profile>This 55-year old woman with a chronic pancreatitis is
worried that her condition is getting worse. She wants to
know more about jaundice and her condition.</profile>
Case 5 (optional).
1. Patient profile: A 60-year-old male who knows that helicobacter pylori is causing
cancer and now wants to know if his current abdominal pain could be a symptom
of cancer
2. De-identified discharge summary, including the disease/disorder spans marked and
mapped to SNOMED-CT, and the shorthand spans marked and mapped to UMLS
3. Information need: is abdominal pain due to helicobacter pylori a symptom
of cancer
4. Query: cancer, helicobacter pylori and abdominal pain
(a) Title: abnominal pain and helicobacter pylori and cancer
(b) 674 returned documents of which 610 are relevant
Case 6 (optional).
1. Patient profile: A 43-year old male with down Syndrome lives in an extended care
facility. The personnel wants to know if they can avoid frothy sputum in connection
with the patient's chronic aspiration and status post laryngectomy
2. De-identified discharge summary, including the disease/disorder spans marked and
mapped to SNOMED-CT, and the shorthand spans marked and mapped to UMLS
3. Information need: how to avoid frothy sputum
4. Query: frothy sputum and how to avoid and care for this condition
(a) Title: frothy sputum and care
(b) 169 returned documents of which 7 are relevant
Discharge Summaries. Six discharge summaries were selected from a larger anno-
tated data set, the Shared Annotated Resources (ShARe) corpus. The ShARe corpus
was selected from an extensive database, Multiparameter Intelligent Monitoring in
Intensive Care (MIMIC-II)21 that contains intensive care unit data including de-
mographics, billing codes, orders, tests, monitoring device reads, and clinical free-text
notes. The data was originally automatically de-identified using de-identification
software, all dates were shifted, and realistic surrogates were added for names, geo-
graphic locations, medical record numbers, dates, and other identifying information.
For this task, two authors (HS and DLM) independently reviewed the six discharge
summaries and manually removed other types of information that could potentially
re-identify a patient, e.g., the name of a facility the patient was transferred from. We
replaced the exact character span of this information with “*”s to ensure the original
CLEFeHealth2013 annotation offsets were preserved. We provided our consensus
annotations to a MIMIC-II representative for review.
21 (accessed 30 May 2014)
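The offset-preserving replacement described above can be illustrated with a minimal Python sketch. It is not the procedure actually used by the authors: the function name mask_span, the example sentence, and the facility name "Example Hospital" are hypothetical. The sketch only shows why overwriting a span with the same number of "*" characters leaves the character offsets of later annotations unchanged.

```python
def mask_span(text: str, start: int, end: int, fill: str = "*") -> str:
    """Overwrite text[start:end] with the same number of fill characters."""
    if not 0 <= start <= end <= len(text):
        raise ValueError("span lies outside the text")
    return text[:start] + fill * (end - start) + text[end:]


# Hypothetical sentence; "Example Hospital" stands in for an identifying facility name.
note = "Transferred from Example Hospital with chronic pancreatitis."
facility = "Example Hospital"
start = note.index(facility)
masked = mask_span(note, start, start + len(facility))

# An annotation made on the original text still points at the same characters.
disorder = "chronic pancreatitis"
disorder_start = note.index(disorder)
```

Because the masked text has the same length as the original, an annotation stored as a (start, end) pair remains valid without any re-alignment.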
Query Set. Six real patient queries (i.e., the six cases) generated from the six discharge
summaries, a set of approximately one million health-related documents (predominantly
health and medicine sites) that the queries can be searched on, and a list
of the documents which were judged to be relevant to each of the queries (named
result set) were used (Fig. 3). This document set originated from the Khresmoi project.22
The queries were manually generated – as a part of the CLEFeHealth2013 Task
3 – by healthcare professionals from a manually extracted set of highlighted disorders
from the discharge summaries. A mapping between each query and the associated
matching discharge summary (from which the disorder was taken) was provided. We
used the TREC format to capture the query title, description, and narrative and
supplemented it with the following two fields:
1. discharge_summary: matching discharge summary, and
2. profile: details about the patient extracted, or inferred, from the discharge summary
(which is required for determining the information which is being sought by
the patient).
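As an illustration of how a participant might read such a query, the following Python sketch extracts the five fields with regular expressions. It is a sketch under assumptions: the tag names title, desc, narr, and profile follow Fig. 3, the tag name discharge_summary is taken from the field list above, and the function name parse_topic and the file name in the example are hypothetical. The exact layout of the distributed query file may differ.

```python
import re

# Tag names: the first four appear in Fig. 3; the last is assumed from the field list.
FIELDS = ("title", "desc", "narr", "profile", "discharge_summary")


def parse_topic(raw: str) -> dict:
    """Return a dict with one whitespace-normalized string per field found."""
    topic = {}
    for field in FIELDS:
        match = re.search(rf"<{field}>(.*?)</{field}>", raw, flags=re.DOTALL)
        if match:
            topic[field] = " ".join(match.group(1).split())
    return topic


# Query text from Case 1; the discharge summary file name is a placeholder.
raw_topic = """
<discharge_summary>example-discharge-summary.txt</discharge_summary>
<title>chronic alcoholic induced pancreatitis and jaundice</title>
<desc>is jaundice an indication that the pancreatitis has advanced</desc>
<narr>chronic alcoholic induced pancreatitis and jaundice in
connection with it</narr>
"""
topic = parse_topic(raw_topic)
```

Fields that are absent from the input (here, profile) are simply omitted from the result, so a caller can use topic.get("profile") to handle both the original TREC fields and the two supplementary ones.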
Document Set. Documents consisted of pages on a broad range of health topics and
targeted at both the general public and healthcare professionals. They were made
available as DAT files, including the original Internet address (i.e., Uniform Resource
Locator, called #URL) and the document text (called #CONTENT). For example, for
the mandatory query, the folder included 113 files with their size varying from two to
eighty kilobytes (1.28 megabytes in total) (Fig. 4).
22 (accessed 30 May 2014)
Fig. 4. Example document for Case 1
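A DAT file of this kind could be split into its two fields as sketched below in Python. The sketch assumes, without confirmation from the data set documentation, that each field is introduced by its marker followed by a colon ("#URL:" and "#CONTENT:") and that the address precedes the content. The function name parse_dat and the example address are hypothetical.

```python
import re


def parse_dat(raw: str) -> dict:
    """Split one document into its address and its whitespace-normalized text.

    Assumes the '#URL:' field comes before the '#CONTENT:' field.
    """
    url = re.search(r"#URL:\s*(\S+)", raw)
    content = re.search(r"#CONTENT:\s*(.*)", raw, flags=re.DOTALL)
    return {
        "url": url.group(1) if url else None,
        "content": " ".join(content.group(1).split()) if content else "",
    }


# Shortened example text; the address is a placeholder, not a real document URL.
raw_doc = (
    "#URL: http://example.org/jaundice\n"
    "#CONTENT: Jaundice is a condition produced when excess amounts of bilirubin\n"
    "circulating in the blood stream dissolve in the subcutaneous fat.\n"
)
doc = parse_dat(raw_doc)
```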
Result Set. The document contents were judged for relevance, using the DAT file
names, called document IDs, as a reference. The relevance judgments were performed
by medical professionals – as a part of the CLEFeHealth2013 Task 3 – and mapped to
a 2-point scale of Irrelevant (0) and Relevant (1). The relevance assessments were
provided in a file in the standard TREC format with four columns: the first column
refers to the query number, the third column refers to the document ID, and the fourth
column indicates if the document is relevant (1) or not relevant (0) to the query. We
did not need the second column in this task, so it was always given the value of 0. For
example, for the mandatory query, 113 documents were judged, resulting in 26
relevant and 87 irrelevant documents.
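The four-column file can be summarized per query with a few lines of Python. The following is a minimal sketch: the function name count_judgments and the document IDs in the example are hypothetical, and whitespace-separated columns are assumed.

```python
from collections import Counter


def count_judgments(lines):
    """Count relevant (1) and irrelevant (0) judgments per query number."""
    counts = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        query_id, _unused, _doc_id, relevance = parts
        counts.setdefault(query_id, Counter())[int(relevance)] += 1
    return counts


# Made-up judgments: query number, unused column (always 0), document ID, relevance.
qrels = [
    "1 0 doc_a.dat 1",
    "1 0 doc_b.dat 0",
    "1 0 doc_c.dat 0",
    "2 0 doc_d.dat 1",
]
counts = count_judgments(qrels)
```

Applied to the real file, the counts for the mandatory query should reproduce the figures above, that is, 26 relevant plus 87 irrelevant judgments, which sum to the 113 judged documents.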
3.2 Challenge
Challenges were deliberately defined in a creative way and involved visual interactive
design and, ideally, a combination of automatic, visual, and interactive techniques.
Content of Fig. 4 (example document for Case 1):
Document ID: virtu4909_12_000034.dat
#CONTENT: Jaundice Jaundice Jaundice is a condition produced when excess amounts of bilirubin
circulating in the blood stream dissolve in the subcutaneous fat (the layer of fat just beneath the skin),
causing a yellowish appearance of the skin and the whites of the eyes. With the exception of normal
newborn jaundice in the first week of life, all other jaundice indicates overload or damage to the liver, or
inability to move bilirubin from the liver through the biliary tract to the gut. Review Date: 4/17/2011
Reviewed By: David C. Dugdale, III, MD, Professor of Medicine, Division of General Medicine,
Department of Medicine, University of Washington School of Medicine; George F. Longstreth, MD,
Department of Gastroenterology, Kaiser Permanente Medical Care Program, San Diego, California. Also
reviewed by David Zieve, MD, MHA, Medical Director, A.D.A.M., Inc. The information provided
herein should not be used during any medical emergency or for the diagnosis or treatment of any
medical condition. A licensed medical professional should be consulted for diagnosis and treatment of any
and all medical conditions. Call 911 for all medical emergencies. Links to other sites are provided for
information only -- they do not constitute endorsements of those other sites. © 1997- A.D.A.M., Inc.
Any duplication or distribution of the information contained herein is strictly prohibited.
Task 1a: Discharge Resolution Challenge. The goal was to visualize a given discharge
summary together with the disorder standardization and shorthand expansion
data in an effective and understandable way for laypeople. An interactive visualization
was to be designed based on the input discharge summary, including the disorder
spans marked and mapped to SNOMED-CT, and the shorthand spans marked and
mapped to the UMLS. The design should allow the patient (or his/her next of kin) to
perceive the original document together with the appropriate processing (i.e., disorder
standardization and shorthand expansion), thereby conveying an informative display
of the discharge summary information. Solutions were to include visualizations of the
space of processed terms, including their location in the SNOMED-CT/UMLS terminologies.
Appropriate interaction methods, including linked navigation and detail on
demand, should support navigating the original discharge summary and the processed
terms, and foster the understanding of the record and the trust of the users in the presented
information. Although side-by-side linked views of the original discharge summary
and the space of processed terms could make for an effective visualization, participants
were encouraged to explore additional views and presentations, including
presentation of similarity between terms, identification of multiple abbreviations for a
given term, and semantic relationships between abbreviated and non-abbreviated terms.
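As one very simple reading of the detail-on-demand idea in Task 1a, the Python sketch below wraps annotated spans of a summary in HTML abbr elements whose tooltip carries the expansion. It is a hypothetical baseline rather than a submitted or recommended design: the function name annotate, the example sentence, and the span offsets are made up, and a real solution would draw the spans and their SNOMED-CT/UMLS mappings from the provided annotations.

```python
from html import escape


def annotate(text, spans):
    """Wrap each (start, end, explanation) span in an <abbr> tooltip element.

    Spans are assumed not to overlap; they are processed in offset order.
    """
    pieces, cursor = [], 0
    for start, end, explanation in sorted(spans):
        pieces.append(escape(text[cursor:start]))
        pieces.append(
            f'<abbr title="{escape(explanation, quote=True)}">'
            f"{escape(text[start:end])}</abbr>"
        )
        cursor = end
    pieces.append(escape(text[cursor:]))
    return "".join(pieces)


# Made-up sentence; the GERD/heartburn pairing follows the example in Section 1.
summary = "Pt has GERD."
spans = [
    (0, 2, "patient"),
    (7, 11, "gastroesophageal reflux disease (heartburn)"),
]
html_fragment = annotate(summary, spans)
```

The original wording stays visible and in place, while the processed term appears only when the reader hovers over it, which corresponds to perceiving the original document together with the appropriate processing.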
Task 1b: Visual Exploration Challenge. Given that discharge summaries had been
understood by the patients, the goal then was to explore a space of relevant documents
from a large corpus of documents. As a scenario, we assumed a forty year old woman,
who seeks information about her condition of hypothyroidism. She wanted to find a
document that describes what type of disease hypothyroidism is by using the query:
“What is hypothyroidism?” We assumed that a search engine is given and this engine
can retrieve and rank the large collection of documents from the Internet. Each docu-
ment consisted of text and possibly also images and links to other documents. Given
this scenario, the goal was to design a visual exploration approach that will provide an
effective overview over a larger set of possibly relevant documents to meet the pa-
tient’s information need. The successful design should include appropriate aggrega-
tion of result documents according to categories relevant to the documents, and/or by
applying automatic clustering techniques that help to understand the distribution of
relevant aspects in the answer set. Basic solutions should support the visual interac-
tive exploration of the top three to twenty relevant documents. However, participants
were encouraged to also consider larger result sets in their solutions, supported, for
example, by means of visual document navigation or aggregation by appropriate visu-
alization approaches. To that end, the system should include interactive facilities to
efficiently change the level of detail by which results are shown and to navigate in the
potentially very large space of search results, possibly using concept and/or document
Grand Challenge: Integrating 1a and 1b. We encouraged interested participants to
work on an integrated solution, which addresses both Task 1a and 1b in an integrated
approach. A key aspect of an effective integrated solution was the possibility to navi-
gate seamlessly between individual concepts from the discharge summary and explore
relevant Internet documents from the perspective of the concepts identified in the
reports, by ad-hoc querying for concepts found both in the discharge summary and the
currently viewed Internet documents. To that end, participants could (but did not have
to) implement their own term expansion and document retrieval algorithms, or reuse
results from the 2013 challenge. Ideally, solutions would provide full interactive sup-
port for term expansion and document retrieval, possibly also considering uncertain-
ties of the automatic expansion algorithm (if applicable) or user-adaptive functions
that assess the relevance of documents. Integrated solutions should also consider the
inclusion of external information sources into the exploration process by appropriate
navigation and search functionality. Possible sources could include but are not limited
to Wikipedia, Flickr, and YouTube.
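As a minimal sketch of the concept-driven navigation described for the Grand Challenge, the following Python code builds an inverted index from concepts to documents and ranks documents by the number of concepts they share with a discharge summary. The concept lists, document identifiers, and function names are hypothetical; the task did not prescribe any particular data structure or retrieval algorithm.

```python
from collections import defaultdict


def build_concept_index(doc_concepts):
    """Map each (lower-cased) concept to the set of document ids mentioning it."""
    index = defaultdict(set)
    for doc_id, concepts in doc_concepts.items():
        for concept in concepts:
            index[concept.lower()].add(doc_id)
    return index


def related_documents(summary_concepts, index):
    """Rank documents by how many discharge-summary concepts they share."""
    shared = defaultdict(set)
    for concept in summary_concepts:
        for doc_id in index.get(concept.lower(), ()):
            shared[doc_id].add(concept.lower())
    # Most shared concepts first; ties are broken by document id.
    return sorted(shared.items(), key=lambda item: (-len(item[1]), item[0]))


# Hypothetical example data (not taken from the task corpus).
doc_concepts = {
    "doc-1": ["hypothyroidism", "thyroid hormone"],
    "doc-2": ["hypertension"],
    "doc-3": ["Hypothyroidism", "fatigue", "hypertension"],
}
index = build_concept_index(doc_concepts)
ranking = related_documents(["hypothyroidism", "hypertension"], index)
```

A visual interface could use such a ranking to order the documents shown when the user selects a concept in the discharge summary view.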
3.3 Evaluation
Participants were given an option to submit to two evaluations via the official
EasyChair system of the task on the Internet:
1. By 1 February 2014 (optional, extended to 1 March 2014): drafts for comments.
Based on this submission, we provided participants with comments to help them
prepare their final submission. We encouraged all participants to submit this
draft, but this submission was not mandatory.
2. By 1 May 2014: final submissions to be used to determine the final evaluation
results. Final submissions needed to encompass the following mandatory items:
(a) a concise report of the design, implementation (if applicable), and a discussion
of application results, in the form of an extended abstract that highlights the
obtained findings, possibly supported by an informal user study or other means of
validation, and
(b) two demonstration videos illustrating the relevant functionality of the
functional design or paper prototype in application to the provided task data.
(i) In the first video, the user should be from the development team (i.e., a per-
son who knows the functionality).
(ii) In the second video, the user should be a novice, that is, a person with no
previous experience in using the functionality, and the video should also
explain how the novice was trained to use the functionality.
Solutions were supposed to address the task problems by appropriate visual-interactive
design and needed to demonstrate their effectiveness. Participants were encouraged
to implement prototypical solutions, but pure designs without implementation
were also allowed.
Submissions were judged on the rationale for their design, including selection
of appropriate visual-interactive data representations and reference to state-of-the-art
techniques in information visualization, natural language processing, information
retrieval, machine learning, and document visualization. They had to:
1. Demonstrate that the posed problems are addressed, in the sense that the layperson
patient is helped in their complex information need,
2. Provide a compelling use-case driven discussion of the workflow supported and
exemplary results obtained, and
3. Highlight the evaluation approach and obtained findings.
Each final submission was assessed by a team of four evaluation panelists, supported
by an organizer. Primary evaluation criteria included the effectiveness and
originality of the presented submissions. Following [23], submissions were judged
on usability, visualization, interaction, and aesthetics. Our usability heuristics were:
1. Minimal Actions: whether the number of steps needed to get to the solution
is acceptable,
2. Flexibility: whether there is an easy/obvious way to proceed to the next/other
task, and
3. Orientation and Help: ease of undoing actions, going back to main screen and
finding help.
Our visualization heuristics were:
1. Information Encoding: whether the necessary/required information is shown,
2. Dataset Reduction: whether the required information is easy to perceive,
3. Recognition rather than Recall: whether users can carry out tasks and understand
the presented information without having to remember or memorize information,
4. Spatial Organization: layout and efficient use of space, and
5. Remove the Extraneous: whether the display is uncluttered.
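To illustrate how ratings against these heuristics could be aggregated across panelists, the following Python sketch averages scores per heuristic and per category. The heuristic names follow the two lists above; the 1 to 5 rating scale, the panelist data, and the function name are assumptions for illustration and do not describe the procedure the panel actually used.

```python
from statistics import mean

# Heuristic names follow the lists above; the 1-5 scale is an assumption.
HEURISTICS = {
    "usability": ["Minimal Actions", "Flexibility", "Orientation and Help"],
    "visualization": [
        "Information Encoding",
        "Dataset Reduction",
        "Recognition rather than Recall",
        "Spatial Organization",
        "Remove the Extraneous",
    ],
}


def summarize(ratings):
    """Average each heuristic over panelists, then average per category.

    `ratings` holds one dict per panelist, mapping heuristic name to score.
    """
    summary = {}
    for category, names in HEURISTICS.items():
        per_heuristic = {
            name: mean(panelist[name] for panelist in ratings) for name in names
        }
        summary[category] = {
            "per_heuristic": per_heuristic,
            "overall": mean(per_heuristic.values()),
        }
    return summary


# Two hypothetical panelists (invented scores, not the real evaluation).
all_names = [name for names in HEURISTICS.values() for name in names]
panelist_a = {name: 4 for name in all_names}
panelist_b = {name: 2 for name in all_names}
panelist_b["Flexibility"] = 5
summary = summarize([panelist_a, panelist_b])
```

Keeping the per-heuristic averages alongside the category average makes it possible to see which specific heuristic drives a low or high category score.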
Depending on the field of all submissions, we promised to give recognition to the
best submissions along a number of categories. Prospective categories included but
were not limited to effective use of visualization, effective use of interaction, effective
combination of interactive visualization with computational analysis, solutions adapting
to different environments (e.g., desktop, mobile/tablet or print for presentation),
best use of external information resources (e.g., Wikipedia, Social Media, Flickr, or
YouTube), best solutions for Task 1a, 1b, and Grand Challenge, and best integration of
external information resources.
4 Results
In this section, we first introduce the organizers’ initialization of the problem, which
was intended to help the participants get started. Then, we briefly describe the
participant submission (we received one final submission to the task) together with its
evaluation. We received 50 registrations in total, but only two teams (and three
organizers) were granted data access because the other registrants did not return the
data use agreement. Finally, to enable comparisons, we provide the organizers’ approach.
The initialization and the organizers’ comparative approach are not model solutions but
are rather intended to inspire critical thinking and new ideas.
4.1 Organizers’ Initialization
As starting points, we gave the participants reading recommendations [24, 25, 26, 27],
pointers to related labs (i.e., the aforementioned Health Design Challenge and IEEE VIS
Workshop on Public Health's Wicked Problems 2013), and software recommendations [28].
In addition, we provided example designs to inspire all participants (Fig. 5–7).
Fig. 5. Task 1a inspiration: Designs for presenting discharge
record data using layout to indicate record structure and color to indicate status. We asked
participants to consider adapted designs for various output devices (e.g., desktop, tablet or print
output). See for original images.
Fig. 6. Task 1b inspiration: Example design of an information
landscape for overviewing a set of answer documents. In this case, documents are mapped
according to relevance and document complexity, with document metadata mapped to color
and shape of document marks. See for original images.
Fig. 7. Grand Challenge inspiration: Indicative sketch of a workflow supporting integrated
analysis of discharge report and querying for related documents.
4.2 Participant Submission
We received one final submission to the task. This submission was also assessed
during the draft round. See [29] for the final submission description.
The submission was from the FLPolytech team. It is a partnership between Florida
Polytechnic University’s Department of Advanced Technology and the commercial
information science firm Retrivika. Florida Polytechnic is a public university located
in Lakeland, Florida. The Advanced Technology department is committed to excel-
lence in education and research in the areas of data analytics, cloud computing and
health informatics. Retrivika is a commercial software development company operat-
ing in the domain of information science applications, specifically electronic discov-
ery (eDiscovery) and electronic health records (eHealth). The team members are Dr.
Harvey Hyman and Warren Fridy.
Dr. Harvey Hyman is an assistant professor of advanced technology at Florida Pol-
ytechnic University in Lakeland, Florida, USA. He is a commercial software develop-
er and an inventor of three U.S. patents in the domain of electronic document search
and information retrieval. He holds the following advanced degrees: PhD in Infor-
mation Systems from University of South Florida (2012), MBA from Charleston
Southern University (2006), and JD from University of Miami, Florida (1993). He has
a diverse background that includes over 20 years of experience in complex litigation,
technology development and business process modeling. His current research projects
include: Information Retrieval Models and Processes, Exploration Behaviors in Elec-
tronic Search, Project Management Success Predictors, and Health Informatics Sup-
port Systems. His book Systems Acquisition, Integration and Implementation for
Engineers and IT Professionals is a best practice guide for software design and devel-
opment life cycle. It is available through Sentia Publishing. He may be contacted by
Warren Fridy is co-founder, Chief Technology Officer, and Director of Product
Design and Development at Retrivika, a cloud based eDiscovery software service
innovator. His love of computers and technology began at a very early age. When he
was just a junior in high school, he opened his first computer consulting company.
That passion for technology has continued through his Bachelor's and Master's degrees
in Computer Science and into his professional career. His knowledge and experience
span a wide variety of fields including insurance, finance, and education. In addition
to his professional work, he collaborates with Dr. Harvey Hyman on a variety of
computational and data related topics, and recently trained several interns through an
internship program at a local college. He can be contacted at
and twitter @wfridy
The submission addressed both Tasks 1a and 1b together with their integration as
the Grand Challenge solution. It related to the task evaluation category of Effective
use of interaction. Although the submission did not describe tests with real expert
and/or novice users, the described system seems to be rather good at these two. The
final submission was evaluated by four evaluation panelists and additionally reviewed
by the organizers. The draft submission was reviewed by five organizers.
4.3 Organizers’ Comparative Approach
For comparison purposes, we present the organizers’ viewpoint on the system design
in Figures 8–13. Namely, we developed a digital design (Fig. 8) and a printable design
(Fig. 9–12). The workflow of producing these contents is described in Figure 13. Our
fundamental principle was to prioritize simplicity. We used WebBook and Web Forager,23
QWiki,24 primary school books, and health pamphlets as our sources
of inspiration.
Both designs divided a given patient’s discharge summary by time into
sections for Past, Present, and Future information. The present section consisted of a
summary image together with subsections for admission and discharge dates; participating
healthcare services and care team members; patient identifiers; history of
present illness; and hospital course. The future section had subsections related to the
patient discharge together with recommended Internet sites, search phrases, and
glossary terms for further information. The past section included all other content of
the discharge summary.
23 (accessed 1 May 2014)
24 (accessed 1 May 2014)
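The time-based division described above can be sketched in Python as a simple lookup on section headings. The present-section entries mirror the subsections listed in the text; the exact heading strings, the future-section headings, and the function name are assumptions for illustration and are not taken from the organizers' implementation.

```python
# Present-section headings follow the description above; the exact heading
# strings and the future-section headings are assumptions for illustration.
PRESENT = {
    "admission date",
    "discharge date",
    "service",
    "care team",
    "patient identifiers",
    "history of present illness",
    "hospital course",
}
FUTURE = {
    "discharge instructions",
    "discharge medications",
    "follow-up",
}


def split_by_time(sections):
    """Group (heading, text) pairs into past, present, and future buckets."""
    buckets = {"past": [], "present": [], "future": []}
    for heading, text in sections:
        key = heading.strip().lower()
        if key in PRESENT:
            buckets["present"].append((heading, text))
        elif key in FUTURE:
            buckets["future"].append((heading, text))
        else:
            # All other content goes to the past section.
            buckets["past"].append((heading, text))
    return buckets


# Hypothetical discharge summary sections (invented text).
sections = [
    ("Past Medical History", "Hypothyroidism."),
    ("Hospital Course", "Uneventful."),
    ("Follow-up", "See primary care physician in two weeks."),
]
buckets = split_by_time(sections)
```

Treating the past section as the default bucket matches the rule that it includes all content not assigned elsewhere.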
The enriched or altered content was indicated as follows: All expanded shorthand
was marked with a faded underline in the digital and printable versions (Fig. 8 and 10).
Relevant diseases and disorders were marked as definition hyperlinks in the digital
version (Fig. 8) and as glossary terms in the printable version (Fig. 11). The recommended
Internet sites and search phrases originated from the query and result sets (Fig. 8 and 11).
We assumed that the healthcare provider gave its patients access to this electronic
and interactive glossary and digital version on the Internet.
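As an illustration of how expanded shorthand could be marked in a digital version, the Python sketch below replaces known shorthand with its expansion and wraps it in an underline element. The shorthand dictionary, the `expanded` class name, and the function name are hypothetical; the faded appearance would be set in a stylesheet, and this is not the organizers' implementation.

```python
import html
import re


def mark_expansions(text, expansions):
    """Replace each known shorthand with its expansion wrapped in a <u> tag."""
    if not expansions:
        return html.escape(text)
    # Longest keys first so that longer shorthand wins over its prefixes.
    keys = sorted(expansions, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(k) for k in keys) + r")(?!\w)"
    )
    parts = []
    last = 0
    for match in pattern.finditer(text):
        # Escape the untouched text between matches.
        parts.append(html.escape(text[last:match.start()]))
        full = html.escape(expansions[match.group(1)])
        short = html.escape(match.group(1), quote=True)
        # Keep the original shorthand available as a tooltip.
        parts.append(f'<u class="expanded" title="{short}">{full}</u>')
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)


# Hypothetical shorthand dictionary; a real one would come from the task data.
expansions = {"HTN": "hypertension", "s/p": "status post"}
marked = mark_expansions("Pt with HTN, s/p surgery.", expansions)
```

Keeping the original shorthand in the `title` attribute lets a reader compare the expansion with what was written in the source record.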
The content was supplemented with a privacy statement, a return address for lost
pamphlets, contact details for healthcare services, and a description of this imaginary
hospital’s project for making health documents easier for its patients to understand
(Fig. 9).
Fig. 8. Digital version: Closer look of Fig. 1
Fig. 9. Printable pamphlet design for the optional case 4. This is intended for double-sided A4
printing. When the bottom figure is visible, the right hand side is to be folded first, followed by
the left hand side. This results in page 1 being on top and page 6 on the bottom. The design is
also available at (accessed 11 June 2014).
Fig. 10. Present section
Fig. 11. Future section
Fig. 12. Past section
Fig. 13. Workflow of producing electronic and paper-based documents. The design is also
available at (accessed 11 June 2014).
5 Discussion
Continuing a tradition of evaluation labs that started in the 90s with TREC and since
2000 with CLEF, the CLEF tasks aim to provide a forum where different solutions for
the same problem can be compared and contrasted. While the CLEFeHealth2013 task
focused on readability, the 2014 Task also delved into the next step: interactive visu-
alization for increasing comprehension of a discharge summary and connecting to
additional online information. In this 2014 visualization challenge, 50 teams regis-
tered, two teams were granted data access and one team completed the tasks. This
team augmented the given discharge information with textual visualization, e.g., add-
ing structure, definitions and links. The panelists who reviewed the submission agreed
that more advanced visualization would be beneficial. However, a user study would
need to be conducted to verify that such augmentations do not make the material more
complex, especially for patients with low health literacy or limited computer skills. A
natural first step would be the evaluation of the effectiveness of the proposed interface
changes, e.g. the imposed time structure. For example, such a change could potential-
ly reduce cognitive load in patients upon discharge from the hospital by structuring
and managing the acquisition of additional information.
More generally, we argue that visual-interactive displays can be very effective tools to
help users navigate, explore and relate complex information spaces. Information Vis-
ualization to date has researched a variety of techniques, many of which are potential-
ly applicable to tasks in understanding discharge records and related information re-
sources. Respective Information Visualization techniques include, for example, doc-
ument visualization for over-viewing and navigating document collections, network
visualization for communicating relationships between facts and concepts, and time-
oriented visualization for understanding developments happening over time. While to
date, a number of systems exist for visualization of health records data [22], these
often are geared toward expert usage, and we expect further work is needed to enable
lay persons to take advantage of the analytic capabilities of such expert systems.
We recognize that the defined task was indeed a challenge in that it implied substan-
tial interdisciplinary work: Medical domain data understanding had to be paired with
techniques from Information Retrieval, text analysis and interactive data visualization.
Our task definition also implied work on implementation, application and user evalua-
tion, which in turn require expertise in software engineering and usability studies.
Given this indeed challenging task, we are glad to have received one contribution
which tackled the posed problems from the perspective of Information Retrieval. We
hope that our task definition, the presented data, instantiation and results will foster
more interest in the community to work on the problem of visual-interactive access to
personal health information by lay persons. We consider this task an important and
challenging problem with potentially high benefit for individuals and society alike.
Acknowledgements
This shared task was partially supported by the CLEF Initiative; the Khresmoi project
(funded by the European Union Seventh Framework Programme (FP7/2007-2013)
under grant agreement no 257528); NICTA (funded by the Australian Government
through the Department of Communications and the Australian Research Council
through the ICT Centre of Excellence Program); PhysioNetWorks workspaces; and the
MIMIC (Multiparameter Intelligent Monitoring in Intensive Care) II database.
We gratefully acknowledge the participating team’s hard work. We thank them for
their submission and interest in the task.
We greatly appreciate the hard work and feedback of our evaluation panelists Hilary
Cinis, Senior User Experience Designer, NICTA, Sydney, NSW, Australia; Chih-Hao
(Justin) Ku, Assistant Professor in Text mining and information visualization,
Lawrence Technological University, Southfield, MI, USA; Lin Shao, PhD student in
Computer and Information Science, University of Konstanz, Konstanz, Germany; and
Mitchell Whitelaw, Associate Professor in Media Arts & Production, University of
Canberra, Canberra, ACT, Australia. We also acknowledge the contribution of George
Moody, Harvard-MIT, Cambridge, MA, USA in proofing and supporting the release
of our six double de-identified (manually and automatically) discharge summaries.
We thank Dr. Wendy W Chapman, University of Utah, Salt Lake City, UT, USA
and Dr. Preben Hansen, SICS and Stockholm University, Stockholm, Sweden for
their contribution to the initial Task conceptualization.
References
1. Suominen, H. (ed.): The Proceedings of the CLEFeHealth2012: the CLEF 2012 Workshop
on Cross-Language Evaluation of Methods, Applications, and Resources for eHealth
Document Analysis. NICTA, Canberra, ACT, Australia (2012)
2. McAllister, M., Dunn, G., Payne, K., Davies, L., Todd, C.: Patient empowerment: The
need to consider it as a measurable patient-reported outcome for chronic conditions. BMC
Health. Serv. Res. 12, 15 (2012)
3. Zweigenbaum, P., Demner-Fushman, D., Yu, H., Cohen, K.B.: Frontiers of biomedical
text mining: current progress. Review. Brief. Bioinform. 8, 358–375 (2007)
4. Meystre, S.M., Savova, G.K., Kipper-Schuler, K.C., Hurdle, J.F.: Extracting information
from textual documents in the electronic health record: a review of recent research. Re-
view. Yearb. Med. Inform., 128–144 (2008)
5. Demner-Fushman, D., Chapman, W.W., McDonald, C.J.: What can natural language
processing do for clinical decision support? Review. J. Biomed. Inform. 42, 760–772 (2009)
6. Nadkarni, P.M., Ohno-Machado, L., Chapman, W.W.: Natural language processing: an
introduction. J. Am. Med. Inform. Assoc. 18, 544–551 (2011)
7. Friedman, C., Elhadad, N.: Natural language processing in health care and biomedicine. In:
Shortliffe, E.H., Cimino, J.J. (eds.) Biomedical Informatics: Computer Applications in
Health Care and Biomedicine, pp. 255–284. Springer-Verlag, London, UK (2014)
8. Chapman, W.W., Nadkarni, P.M., Hirschman, L., D'Avolio, L.W., Savova, G.K., Uzuner,
O.: Overcoming barriers to NLP for clinical text: the role of shared tasks and the need for
additional creative solutions. Editorial. J. Am. Med. Inform. Assoc. 18, 540–543 (2011)
9. Suominen, H.: Text mining and information analysis of health documents. Guest
editorial. Artif. Intell. Med. (2014) In Press.
10. Robertson, S., Hull, D.: The TREC-9 filtering track final report. In: Voorhees, E.M., Har-
man, D.K. (eds.): The 9th Text REtrieval Conference (TREC-9), NIST Special Publication
500-249, pp. 25–41. Department of Commerce, National Institute of Standards and Tech-
nology, Gaithersburg, MD, USA (2000)
11. Roberts, P.M., Cohen, A.M., Hersh, W.R.: Tasks, topics and relevance judging for the
TREC genomics track: Five years of experience evaluating biomedical text information re-
trieval systems. Information Retrieval 12, 81–97 (2009)
12. Voorhees, E.M., Hersh, W.: Overview of the TREC 2012 medical records track. In: Voor-
hees, E.M., Buckland, L.P. (eds.) The Twenty-First Text REtrieval Conference Proceed-
ings (TREC 2012), NIST Special Publication 500-298, pp. 25–41. Department of Com-
merce, National Institute of Standards and Technology, Gaithersburg, MD, USA (2012)
13. Morita, M., Kono, Y., Ohkuma, T., Miyabe, M., Aramaki, E.: Overview of the NTCIR-10
MedNLP task. In: Proceedings of the 10th NTCIR Conference, pp. 696–701. NTCIR, To-
kyo, Japan (2013)
14. de Herrera, A.G.S., Cramer, J.K., Demner-Fushman, D., Antani, S., Müller, H.: Overview
of the ImageCLEF 2013 medical tasks. In: Forner, P., Navigli, R., Tufis, D. (eds.) CLEF
2013 Evaluation Labs and Workshop: Online Working Notes. CLEF, Valencia, Spain (2013)
15. Morante, R., Krallinger, M., Valencia, A., Daelemans, W.: Machine reading of biomedical
texts about Alzheimer's disease. In: Forner, P., Navigli, R., Tufis, D. (eds.) CLEF 2013
Evaluation Labs and Workshop: Online Working Notes. CLEF, Valencia, Spain (2013)
16. Suominen, H., Salanterä, S., Velupillai, S., Chapman, W.W., Savova, G., Elhadad, N.,
Pradhan, S., South, B.R., Mowery, D.L., Jones, G.J.F., Leveling, J., Kelly, L., Goeuriot,
L., Martinez, D., Zuccon, G.: Overview of the ShARe/CLEF eHealth evaluation lab 2013.
In: Forner, P., Müller, H., Paredes, R., Rosso, P., Stein, B. (eds.) Information Access
Evaluation. Multilinguality, Multimodality, and Visualization, LNCS 8138, pp. 212–231 (2013)
17. Goeuriot et al. Overview of the CLEF eHealth evaluation lab 2014. In: LNCS (2014)
18. Uzuner, O., South, B.R., Shen, S., DuVall, S.: 2010 i2b2/VA challenge on concepts, asser-
tions, and relations in clinical text. J. Am. Med. Inform. Assoc. 18, 552–556 (2011)
19. Pestian, J., Matykiewicz, P., Linn-Gust, M., South, B.R., Uzuner, O., Wiebe, J., Cohen, K.,
Hurdle, J., Brew, C.: Sentiment analysis of suicide notes: A shared task. Biomed. Inform.
Insights 5, 3–16 (2011)
20. Pradhan, S., Savova, G., Chapman, W.W., Elhadad, N.: SemEval-2014, task 7: Analysis of
clinical text. In: Proceedings of the International Workshop on Semantic Evaluations, Dub-
lin, Ireland (2014) In Press.
21. Wang, T.D., Plaisant, C., Quinn, A., Stanchak, A., Shneiderman, B., Murphy, S.: Aligning
temporal data by sentinel events: Discovering patterns in electronic health records. In:
Mackay, W.E., Brewster, S., Bodker, S. (chairs) Proceedings of the ACM SIGCHI Conference
on Human Factors in Computing Systems (CHI 2008). ACM, New York, NY, USA (2008)
22. Rind, A., Wang, T., Aigner, W., Miksch, S., Wongsuphasawat, K., Plaisant, C., Shneider-
man, B.: Interactive information visualization for exploring and querying electronic health
records: A systematic review. Foundations and Trends in Human Computer Interaction 5,
207–298 (2013)
23. Forsell, C., Johansson, J.: An heuristic set for evaluation in information visualization. In:
Santucci, G. (ed.) AVI '10 Proceedings of the International Conference on Advanced Visual
Interfaces, pp. 199–206. ACM, New York, NY, USA (2010)
24. Shneiderman, B.: The eyes have it: a task by data type taxonomy for information visualiza-
tions. In: Spencer Sipple, R. (ed.) Proceedings of the IEEE Symposium on Visual Lan-
guages, pp. 336–343. IEEE Computer Society Press, Los Alamitos, CA, USA (1996)
25. White, R.W., Roth, R.A.: Exploratory Search: Beyond the Query-Response Paradigm:
Synthesis Lectures on Information Concepts, Retrieval, and Services. Morgan & Claypool
Publishers, San Rafael, CA, USA (2009)
26. Keim, D., Kohlhammer, J., Ellis, G., Mansmann, F. (eds.): Mastering the Information Age:
Solving Problems with Visual Analytics. Druckhaus Thomas Müntzer GmbH, Bad
Langensalza, Germany (2010)
27. Ward, M.O., Grinstein, G., Keim, D.: Interactive Data Visualization: Foundations, Tech-
niques, and Applications. A K Peters, Natick, MA, USA (2010)
28. Bostock, M., Ogievetsky, V., Heer, J.: D3: Data-Driven Documents. IEEE Trans.
Visualization & Comp. Graphics (Proc. InfoVis), 358–375 (2011)
29. Hyman, H., Fridy, W.: An eHealth process model of visualization and exploration to
support improved patient discharge record understanding and medical knowledge
enhancement. In: Cappellato, L., Ferro, N., Halvey, M., Kraaij, W. (eds.): CLEF 2014 Evaluation
Labs and Workshop: Online Working Notes. CLEF, Sheffield, UK (2014). In Press.
... The second CLEFeHealth [5] expanded our year-one efforts and again organized three tasks. Specifically, Task 1 aimed to help patients (or their next-of-kin) by addressing visualisation and readability issues related to their hospital discharge documents and related information search on the Internet [6]. Task 2 continued the IE work of the 2013 CLEFeHealth lab, specifically focusing on IE of disorder attributes from clinical text [7]. ...
Full-text available
In this paper, we provide an overview of the seventh annual edition of the CLEF eHealth evaluation lab. CLEF eHealth 2019 continues our evaluation resource building efforts around the easing and support of patients, their next-of-kins, clinical staff, and health scientists in understanding, accessing, and authoring electronic health information in a multilingual setting. This year’s lab advertised three tasks: Task 1 on indexing non-technical summaries of German animal experiments with International Classification of Diseases, Version 10 codes; Task 2 on technology assisted reviews in empirical medicine building on 2017 and 2018 tasks in English; and Task 3 on consumer health search in mono- and multilingual settings that builds on the 2013–18 Information Retrieval tasks. In total nine teams took part in these tasks (six in Task 1 and three in Task 2). Herein, we describe the resources created for these tasks and evaluation methodology adopted. We also provide a brief summary of participants of this year’s challenges and results obtained. As in previous years, the organizers have made data and tools associated with the lab tasks available for future research and development.
... Discharge summaries describe the course of treatment, the patient's condition and care plan. Their main purpose is to support the process of patient care as well as the handover notes among doctors [1]. Along with the development of information technology, the medical documents are now digitized. ...
Conference Paper
Full-text available
On reviewing clinical documents, doctors, researchers and caregivers all expect to know the time when a patient's disorders appear (in the past, present, and future or from the past until now...) in comparison with the time when the documents are written. The information about this period of time is very useful in building a treatment regimen and an inquiry system for the patient, and summarizing the relevant documents. This paper proposes a hybrid approach between the rules and the machine learning to classify the relationship between a patient's disorders and the time of writing clinical documents. The hybrid approach has achieved a result of accuracy 0.5194, which is higher than the best ranking system (0.328) in the ShARE/CLEF eHealth 2014 Evaluation Lab.
... The proposed solutions support questions from general analysis of research data and data exploration, to specific applications like health data record visualization [28] or the detection of adverse drug reactions [29]. So far, also several design challenges have been conducted to arrive at useful results 1 [30]. In our own previous work [31], we have discussed a workflow for analysis of biomedical data based on subspace clustering analysis. ...
Full-text available
Medical doctors and researchers in bio-medicine are increasingly confronted with complex patient data, posing new and difficult analysis challenges. These data are often comprising high-dimensional descriptions of patient conditions and measurements on the success of certain therapies. An important analysis question in such data is to compare and correlate patient conditions and therapy results along with combinations of dimensions. As the number of dimensions is often very large, one needs to map them to a smaller number of relevant dimensions to be more amenable for expert analysis. This is because irrelevant, redundant, and conflicting dimensions can negatively affect effectiveness and efficiency of the analytic process (the so-called curse of dimensionality). However, the possible mappings from high- to low-dimensional spaces are ambiguous. For example, the similarity between patients may change by considering different combinations of relevant dimensions (subspaces). We demonstrate the potential of subspace analysis for the interpretation of high-dimensional medical data. Specifically, we present SubVIS, an interactive tool to visually explore subspace clusters from different perspectives, introduce a novel analysis workflow, and discuss future directions for high-dimensional (medical) data analysis and its visual exploration. We apply the presented workflow to a real-world dataset from the medical domain and show its usefulness with a domain expert evaluation.
... Such tools can identify, extract, filter and generate information from clinical reports that assist patients and their families in understanding the patient's health status and their continued care. The ShARe/CLEF eHealth 2014 shared task [8] focused on facilitating understanding of information in narrative clinical reports, such as discharge summaries, by visualizing and interactively searching previous eHealth data (Task 1) [9], identifying and normalizing disorder attributes (Task 2), and retrieving documents from the health and medicine websites for addressing questions monoand multi-lingual patients may have about the disease/disorders in the clinical notes (Task 3) [10]. In this paper, we discuss Task 2: disorder template filling. ...
Full-text available
This report outlines the Task 1 of the ShARe/CLEF eHealth evaluation lab pilot. This task focused on identification (1a) and normalization (1b) of diseases and disorders in clinical reports. It used annotations from the ShARe corpus. A total of 22 teams competed in Task 1a and 17 of them also participated Task 1b. The best systems had an F1 score of 0.75 (0.80 Precision, 0.71 Recall) in Task 1a and an accuracy of 0.59 in Task 1b. The organizers have made the text corpora, annotations, and evaluation tools available for future research and development.
... This years' lab expands our year one efforts and supports evaluation of information visualisation (Task 1), information extraction (Task 2) and information retrieval (Task 3) approaches for the space. Specifically, Task 1 [5] aims to help patients (or their next-of-kin) in readability issues related to their hospital discharge documents and related information search on the Internet. Task 2 [6] continues the information extraction work of the 2013 CLEFeHealth lab, specifically focusing on information extraction of disorder attributes from clinical text. ...
Conference Paper
Full-text available
This paper reports on the 3rd CLEFeHealth evaluation lab, which continues our evaluation resource building activities for the medical domain. In this edition of the lab, we focus on easing patients and nurses in authoring, understanding, and accessing eHealth information. The 2015 CLEFeHealth evaluation lab was structured into two tasks, focusing on evaluating methods for information extraction (IE) and information retrieval (IR). The IE task introduced two new challenges. Task 1a focused on clinical speech recognition of nursing handover notes; Task 1b focused on clinical named entity recognition in languages other than English, specifically French. Task 2 focused on the retrieval of health information to answer queries issued by general consumers seeking information to understand their health symptoms or conditions. The number of teams registering their interest was 47 in Tasks 1 (2 teams in Task 1a and 7 teams in Task 1b) and 53 in Task 2 (12 teams) for a total of 20 unique teams. The best system recognized 4, 984 out of 6, 818 test words correctly and generated 2, 626 incorrect words (i.e., \(38.5 \%\) error) in Task 1a; had the F-measure of 0.756 for plain entity recognition, 0.711 for normalized entity recognition, and 0.872 for entity normalization in Task 1b; and resulted in P@10 of 0.5394 and nDCG@10 of 0.5086 in Task 2. These results demonstrate the substantial community interest and capabilities of these systems in addressing challenges faced by patients and nurses. As in previous years, the organizers have made data and tools available for future research and development.
This research brings together data analysis with software engineering and visualisation, with a specific focus on text mining and large document collections. My aim is to devise new, rich, and simple visualisation interfaces, which I call deep interfaces. With deep interfaces I introduce the idea-rich content as a product of the statistical analysis combined with human curation of labels and interpreted as a flow of subjectivity, complexity, and diversity between reader and interface and vice versa. The focus of such interfaces is not the representation of textual document collections as in Moretti’s distant reading, but to revisit traditional reading from the point of view of state-of-the-art methods of textual analysis. Thus, the proposed interfaces can help us discover and explore text document collections by reading their contents. This is a practice-led research project that develops theoretical issues through the generation of practical artefacts. The research process is cumulative, following a reflexive methodology. The key outcomes of the project are embodied in an interface to a large collection of ANZAC war diaries: Diggers’ Diaries.
In 2013, the tenth edition of the medical task of the ImageCLEF benchmark was organized. For the first time, the ImageCLEFmed workshop took place in the United States of America at the annual AMIA (American Medical Informatics Association) meeting, even though the task was organized as in previous years in connection with the other ImageCLEF tasks. As in 2012, a subset of the open access collection of PubMed Central was distributed. This year, there were four subtasks: modality classification, compound figure separation, image-based retrieval, and case-based retrieval. The compound figure separation task was included because of the large number of multipanel images available in the literature and the importance of separating them for targeted retrieval. More compound figures were also included in the modality classification task to make it correspond to the distribution in the full database. The retrieval tasks remained in the same format as in previous years, but a larger number of tasks were available for image-based and case-based tasks. This paper presents an analysis of the techniques applied by the ten groups participating in ImageCLEFmed 2013.
Health policy in the UK and elsewhere is prioritising patient empowerment and patient evaluations of healthcare. Patient reported outcome measures now take centre-stage in implementing strategies to increase patient empowerment. This article argues for consideration of patient empowerment itself as a directly measurable patient reported outcome for chronic conditions, highlights some issues in adopting this approach, and outlines a research agenda to enable healthcare evaluation on the basis of patient empowerment. Patient empowerment is not a well-defined construct. A range of condition-specific and generic patient empowerment questionnaires have been developed; each captures a different construct e.g. personal control, self-efficacy/self-mastery, and each is informed by a different implicit or explicit theoretical framework. This makes it currently problematic to conduct comparative evaluations of healthcare services on the basis of patient empowerment. A case study (clinical genetics) is used to (1) illustrate that patient empowerment can be a valued healthcare outcome, even if patients do not obtain health status benefits, (2) provide a rationale for conducting work necessary to tighten up the patient empowerment construct (3) provide an exemplar to inform design of interventions to increase patient empowerment in chronic disease. Such initiatives could be evaluated on the basis of measurable changes in patient empowerment, if the construct were properly operationalised as a patient reported outcome measure. To facilitate this, research is needed to develop an appropriate and widely applicable generic theoretical framework of patient empowerment to inform (re)development of a generic measure. This research should include developing consensus between patients, clinicians and policymakers about the content and boundaries of the construct before operationalisation. 
This article also considers a number of issues for society and for healthcare providers raised by adopting the patient empowerment paradigm. Healthcare policy is driving the need to consider patient empowerment as a measurable patient outcome from healthcare services. Research is needed to (1) tighten up the construct (2) develop consensus about what is important to include (3) (re)develop a generic measure of patient empowerment for use in evaluating healthcare (4) understand if/how people make trade-offs between empowerment and gain in health status.
This paper reports on a shared task involving the assignment of emotions to suicide notes. Two features distinguished this task from previous shared tasks in the biomedical domain. One is that it resulted in a corpus of fully anonymized clinical text and annotated suicide notes. This resource is permanently available and will (we hope) facilitate future research. The other key feature of the task is that it required categorization with respect to a large set of labels. The number of participants was larger than in any previous biomedical challenge task. We describe the data production process and the evaluation measures, and give a preliminary analysis of the results. Many systems performed at levels approaching the inter-coder agreement, suggesting that human-like performance on this task is within the reach of currently available technologies.
This report describes the task Machine reading of biomedical texts about Alzheimer's disease, which is a task of the Question Answering for Machine Reading Evaluation (QA4MRE) Lab at CLEF 2013. The task aims at exploring the ability of a machine reading system to answer questions about a scientific topic, namely Alzheimer's disease. As in the QA4MRE task, participant systems were asked to read a document and identify the answers to a set of questions about information that is stated or implied in the text. A background collection was provided for systems to acquire background knowledge. Three teams participated in the task submitting a total of 13 runs. The highest score obtained by a team was 0.42 c@1, which is clearly above baseline.
This report describes the task Machine reading of biomedical texts about Alzheimer's disease, which is a pilot task of the Question Answering for Machine Reading Evaluation (QA4MRE) Lab at CLEF 2012. The task aims at exploring the ability of a machine reading system to answer questions about a scientific topic, namely Alzheimer's disease. As in the QA4MRE task, participant systems were asked to read a document and identify the answers to a set of questions about information that is stated or implied in the text. A background collection was provided for systems to acquire background knowledge. The background collection is a corpus newly compiled for this task, the Alzheimer's Disease Literature Corpus. Seven teams participated in the task submitting a total of 43 runs. The highest score obtained by a team was 0.55 c@1, which is clearly above baseline.
After reading this chapter, you should know the answers to these questions: Why is natural language processing important? What are the potential uses for natural language processing (NLP) in the biomedical and health domains? What forms of knowledge are used in NLP? What are the principal techniques of NLP? What are challenges for NLP in the clinical, biological, and health consumer domains? Keywords: Noun Phrase, Natural Language Processing, Regular Expression, Parse Tree, Unify Medical Language System.
Discharge summaries and other free-text reports in healthcare transfer information between working shifts and geographic locations. Patients are likely to have difficulties in understanding their content, because of their medical jargon, non-standard abbreviations, and ward-specific idioms. This paper reports on an evaluation lab with an aim to support the continuum of care by developing methods and resources that make clinical reports in English easier to understand for patients, and that help them find information related to their condition. This ShARe/CLEFeHealth2013 lab offered student mentoring and shared tasks: identification and normalisation of disorders (1a and 1b) and normalisation of abbreviations and acronyms (2) in clinical reports with respect to terminology standards in healthcare as well as information retrieval (3) to address questions patients may have when reading clinical reports. The focus on patients’ information needs as opposed to the specialised information needs of physicians and other healthcare workers was the main feature of the lab distinguishing it from previous shared tasks. De-identified clinical reports for the three tasks were from US intensive care and originated from the MIMIC II database. Other text documents for Task 3 were from the Internet and originated from the Khresmoi project. Task 1 annotations originated from the ShARe annotations. For Tasks 2 and 3, new annotations, queries, and relevance assessments were created. 64, 56, and 55 people registered their interest in Tasks 1, 2, and 3, respectively. 34 unique teams (3 members per team on average) participated with 22, 17, 5, and 9 teams in Tasks 1a, 1b, 2 and 3, respectively. The teams were from Australia, China, France, India, Ireland, Republic of Korea, Spain, UK, and USA. Some teams developed and used additional annotations, but this strategy contributed to the system performance only in Task 2.
The best systems had an F1 score of 0.75 in Task 1a; accuracies of 0.59 and 0.72 in Tasks 1b and 2; and precision at 10 of 0.52 in Task 3. The results demonstrate the substantial community interest and capabilities of these systems in making clinical reports easier to understand for patients. The organisers have made data and tools available for future research and development.
We examine recent published research on the extraction of information from textual documents in the Electronic Health Record (EHR). Literature review of the research published after 1995, based on PubMed, conference proceedings, and the ACM Digital Library, as well as on relevant publications referenced in papers already included. 174 publications were selected and are discussed in this review in terms of methods used, pre-processing of textual documents, contextual features detection and analysis, extraction of information in general, extraction of codes and of information for decision-support and enrichment of the EHR, information extraction for surveillance, research, automated terminology management, and data mining, and de-identification of clinical text. Performance of information extraction systems with clinical text has improved since the last systematic review in 1995, but they are still rarely applied outside of the laboratory they have been developed in. Competitive challenges for information extraction from clinical text, along with the availability of annotated clinical text corpora, and further improvements in system performance are important factors to stimulate advances in this field and to increase the acceptance and usage of these systems in concrete clinical and biomedical research contexts.