Demo: HyperMind Builder – Pervasive User Interface to Create Intelligent Interactive Documents
Shoya Ishimaru
Nicolas Großmann
Andreas Dengel
German Research Center for
Artificial Intelligence (DFKI)
Kaiserslautern, Germany
Carina Heisel
Pascal Klein
Jochen Kuhn
University of Kaiserslautern
Kaiserslautern, Germany
Ko Watanabe
Yutaka Arakawa
Nara Institute of Science and Technology (NAIST)
Nara, Japan
Permission to make digital or hard copies of part or all of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. Copyrights for third-party components of this work must be honored.
For all other uses, contact the owner/author(s).
Copyright held by the owner/author(s).
UbiComp/ISWC’18 Adjunct, October 8–12, 2018, Singapore, Singapore
ACM 978-1-4503-5966-5/18/10.
Abstract
We introduce a Graphical User Interface (GUI) that lets everyone create intelligent interactive documents. An intelligent interactive document displays its contents dynamically according to a reader's behavior. To the best of our knowledge, creating such documents has so far required considerable effort and implementation skills. With our system, users without technical expertise can create interactive documents without any programming. By supporting a broad audience, our system widens the possibilities for designing new human-document interactions.
Author Keywords
Graphical User Interface; Human-Document Interaction
ACM Classification Keywords
H.5.m [Information interfaces and presentation (e.g., HCI)]: Miscellaneous
Introduction
Every reader has different preferences. For example, some people need background details for deeper understanding, while others do not; it depends on who reads what. However, documents have traditionally been static. We believe reading experiences would become more immersive and engaging if documents behaved differently for each individual reader.
Figure 1: Overview of the workflow in an educational scenario. Teachers create an interactive digital textbook, which displays information for students based on gaze (i.e., an eye tracker measures visual attention and employs it for vivid interaction).
Since the emergence of human sensing technologies such as eye tracking, real-time collection of reading behavior has become more affordable in several environments [5, 6]. These technologies have enabled researchers to design intelligent interactive documents [4]. Furthermore, Text 2.0 [1], a framework to create gaze-oriented dynamic documents in HTML and JavaScript, has helped software developers implement interactive documents.
However, implementing interactions on documents is difficult, especially for the people who need them (e.g., teachers, publishers, and researchers in education). They still need the help of someone who can write programs to create interactive documents. In such a collaboration, misunderstandings can arise: it is difficult for teachers to explain their ideas completely, and software developers may misinterpret them.
To give everyone an opportunity to create intelligent interactive documents, we propose HyperMind Builder, a GUI for creating intelligent interactive documents without requiring any programming skills. This paper presents an overview of the proposed system, an application scenario, and an initial observation to guide further improvements.
Figure 1 shows an overview of the workflow of our proposed system. It consists of two processes: the creation and the activation of interactions.
Creating the Intelligent Interactive Document
We focus on a system that requires no programming and allows intuitive operation. The screen is divided into three columns. The central column provides an open-source rich text editor. Since it is a WYSIWYG editor, a user can easily write text, modify styles, or copy and paste text from other shared content. On each side of the editor, we arrange a column of material containers. A user adds supplementary materials (e.g., images and videos related to the content) to a container by drag-and-drop. Placing a column on each side of the editor allows a user to add materials close to the relevant context. After inserting materials, the user draws hidden rectangles on the main content and creates relations between the inserted materials and the rectangles. Finally, the user can export the document and share it in the format of HyperMind Reader [4]. In summary, our system requires only writing the content, drag-and-drop, and mouse clicks.
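The paper does not specify the HyperMind Reader export format; purely as an illustration, a minimal sketch of how such an export could link hidden rectangles to side-column materials might look as follows (all field names and values are assumptions, not the actual format):

```python
import json

# Hypothetical export structure: each hidden rectangle drawn on the main
# content references one supplementary material by id.
document = {
    "content": "<p>Newton's second law relates force and acceleration.</p>",
    "materials": [
        {"id": "m1", "type": "video", "src": "materials/newton.mp4"},
    ],
    "rectangles": [
        # x, y, width, height in page coordinates; "material" links by id
        {"id": "r1", "x": 40, "y": 120, "w": 200, "h": 24, "material": "m1"},
    ],
}

exported = json.dumps(document, indent=2)
```

A reader application would then only need to resolve each rectangle's `material` id to know what to show on activation.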
Activating the Intelligent Interactive Document
The supplementary materials around the main content are displayed when they are required. The current implementation supports activation based on a reader's attention: it utilizes an eye tracker, and the related material is displayed when the reader's gaze rests on a hidden rectangle longer than a threshold. Other activation rules (e.g., interest, comprehension, mental workload) can be registered if additional sensors are utilized.
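As an illustration of this dwell-time rule (not the paper's actual implementation; the rectangle geometry and threshold value are assumptions), the activation logic can be sketched as:

```python
# Sketch of gaze-based activation: a material is shown once the gaze
# has dwelt inside its hidden rectangle longer than a threshold.
DWELL_THRESHOLD_MS = 800  # assumed value; the paper gives no number

def inside(rect, x, y):
    """Check whether a gaze point (x, y) falls inside a rectangle."""
    return (rect["x"] <= x <= rect["x"] + rect["w"]
            and rect["y"] <= y <= rect["y"] + rect["h"])

def update_dwell(rect, x, y, dt_ms, state):
    """Accumulate dwell time per rectangle; return True on activation."""
    rid = rect["id"]
    if inside(rect, x, y):
        state[rid] = state.get(rid, 0) + dt_ms
    else:
        state[rid] = 0  # reset when the gaze leaves the rectangle
    return state[rid] >= DWELL_THRESHOLD_MS

# Example: gaze samples arriving every 100 ms over one rectangle
rect = {"id": "r1", "x": 40, "y": 120, "w": 200, "h": 24}
state = {}
activated = [update_dwell(rect, 100, 130, 100, state) for _ in range(9)]
```

With 100 ms samples and an 800 ms threshold, the eighth consecutive sample inside the rectangle triggers the material.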
Figure 2: Examples of intelligent interactive documents created with our system. Document sources: OpenStax, MILINE Library, and Twitter Bootstrap (CC license).
Application Scenario
Figure 2 presents examples of documents created with our system. The most promising use case is textbooks: supplementary materials that are displayed when a student is interested in (or has trouble understanding) the content should improve the learning experience. In addition, gaze-oriented interaction is useful in several other scenarios, including reading a musical score or program code.
To explore how simple and useful our system is for users, we conducted a small study. The following section describes the conditions and the analysis results.
We asked 10 college students aged between 20 and 29 to participate in our study. We provided sample texts, supporting materials (videos and images), and multiple-choice questions related to the text for measuring comprehension. The tasks for the participants were (1) to create an interactive document, (2) to read a document created by another participant, and (3) to solve the multiple-choice questions. Before the tasks, we instructed the participants in the use of our system. Afterwards, they answered a NASA-TLX survey [3] and two free-writing questions.
Results and Discussion
Participants placed 5.4 ± 2.1 supporting materials on a document. Figure 3 shows the NASA-TLX results. From these results, we calculated a weight for each factor by pairwise comparison, as shown in Figure 4. A higher weight indicates that the factor contributed more to the participant's workload during the task. For instance, for Participant 9, Performance received the highest weight; hence, according to this participant, our system should improve in that respect. Overall, the results suggest that Mental Demand, Effort, and Temporal Demand are the factors with the most room for improvement, while Physical Demand appears to be low.
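The pairwise weighting used here follows the standard NASA-TLX procedure: each of the six factors is weighted by how often it is chosen across the 15 pairwise comparisons, and the weighted ratings are averaged. A minimal sketch (the ratings and tallies are invented example values, not study data):

```python
# NASA-TLX weighted workload: each factor's 0-100 rating is weighted by
# how many of the 15 pairwise comparisons it won (weights sum to 15).
FACTORS = ["Mental Demand", "Physical Demand", "Temporal Demand",
           "Performance", "Effort", "Frustration"]

def weighted_tlx(ratings, tally):
    """ratings: factor -> 0-100 score; tally: factor -> comparison wins."""
    assert sum(tally.values()) == 15, "expected 15 pairwise comparisons"
    return sum(ratings[f] * tally[f] for f in FACTORS) / 15.0

# Invented example values for one participant
ratings = {"Mental Demand": 70, "Physical Demand": 10, "Temporal Demand": 50,
           "Performance": 40, "Effort": 60, "Frustration": 30}
tally = {"Mental Demand": 5, "Physical Demand": 0, "Temporal Demand": 3,
         "Performance": 2, "Effort": 4, "Frustration": 1}

score = weighted_tlx(ratings, tally)  # overall workload on a 0-100 scale
```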
Regarding the free-writing question “How was the usage of our system?”, we received some answers such as “I was a bit confused until I saw the example” and “Drag and drop was a bit confusing”. We chose drag-and-drop because it is an intuitive operation, but we should consider offering alternatives such as selecting from a list. Overall, however, we received several pieces of positive feedback, including “It was really intuitive and useful” and “The usage of the system was straightforward and easy”. We also asked participants about improvements (“Do you have any idea of an additional function for the system?”), but there was no feedback questioning the concept of the system.
Related Work
The closest concept to our system is visual programming languages such as Scratch [7]. Scratch is an open-source, media-rich programming environment that has allowed many users to learn the concepts of programming through an intuitive drag-and-drop method, motivating them and lowering the entry barrier to programming. Our GUI toolkit has a similar goal: to lower the barrier to creating interactive documents.
Another related concept was proposed by Cheng et al. [2]: gaze-based gray-shading annotation while reading. The aim of their proposal is to increase reading speed and understanding by guiding one's reading flow in the way experts (teachers) read. They found that the annotations improved non-experts' comprehension performance, and participants perceived the gaze annotations as helpful. This study confirms the importance of intelligent interactive documents.
Figure 3: Result of NASA-TLX
Figure 4: Weight of factors
Conclusion and Future Work
We have implemented HyperMind Builder, a GUI to create intelligent interactive documents. Our initial observation provides evidence for the toolkit's ease of use. Our next aim is to identify effective activation rules for interactions. Thereby, we can add a new function that allows creators to design not only what but also when and how supporting materials are displayed.
Acknowledgments
HyperMind, as a sub-project of “U.EDU: Unified Education – Medienbildung entlang der Lehrerbildungskette”, is part of the “Qualitätsoffensive Lehrerbildung”, a joint initiative of the Federal Government and the Länder which aims to improve the quality of teacher training. The program is funded by the Federal Ministry of Education and Research. This work was also partially supported by JSPS KAKENHI (16H01721).
References
1. Ralf Biedert, Georg Buscher, Sven Schwarz, Manuel Möller, Andreas Dengel, and Thomas Lottermann. 2010. The Text 2.0 framework: writing web-based gaze-controlled realtime applications quickly and easily. In Proceedings of the 2010 Workshop on Eye Gaze in Intelligent Human Machine Interaction. ACM, 114–117.
2. Shiwei Cheng, Zhiqiang Sun, Lingyun Sun, Kirsten Yee, and Anind K. Dey. 2015. Gaze-based annotations for reading comprehension. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems. ACM, 1569–1572.
3. Sandra G. Hart. 1986. NASA Task Load Index (TLX). Volume 1.0; Computerized Version.
4. Shoya Ishimaru, Syed Saqib Bukhari, Carina Heisel, Nicolas Großmann, Pascal Klein, Jochen Kuhn, and Andreas Dengel. 2018. Augmented Learning on Anticipating Textbooks with Eye Tracking. In Positive Learning in the Age of Information. Springer, 387–398.
5. Shoya Ishimaru, Soumy Jacob, Apurba Roy, Syed Saqib Bukhari, Carina Heisel, Nicolas Großmann, Michael Thees, Jochen Kuhn, and Andreas Dengel. 2017. Cognitive State Measurement on Learning Materials by Utilizing Eye Tracker and Thermal Camera. In Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition, Vol. 8. IEEE, 32–36.
6. Kai Kunze, Susana Sanchez, Tilman Dingler, Olivier Augereau, Koichi Kise, Masahiko Inami, and Tsutomu Terada. 2015. The augmented narrative: toward estimating reader engagement. In Proceedings of the 6th Augmented Human International Conference. ACM.
7. Mitchel Resnick, John Maloney, Andrés Monroy-Hernández, Natalie Rusk, Evelyn Eastmond, Karen Brennan, Amon Millner, Eric Rosenbaum, Jay Silver, Brian Silverman, and others. 2009. Scratch: programming for all. Commun. ACM 52, 11 (2009).