Abstract—This paper presents a novel wearable
single-channel electrooculography (EOG) based
human-computer interface (HCI) with a simple system design
and robust performance. In the proposed system, EOG
signals are generated from double eye blinks, collected by a
commercial wearable device (the NeuroSky MindWave
headset), and then converted into a sequence of commands
that control cursor navigation and actions. The
EOG-based cursor control system was tested on 8 subjects,
and the average accuracy was about 84% for indoor use.
Compared with other EOG-based HCI systems, this system is
highly user-friendly and does not require any training.
Therefore, this system has the potential to provide an
easy-to-use and low-cost assistive technique for locked-in
patients who have lost most of their muscular abilities but
retain normal eye condition.
Keywords: electrooculography, human-computer
interface, assistive devices, cursor control, augmentative and
alternative communication
I. INTRODUCTION
In recent years, numerous human-computer interface
(HCI) systems have been developed as assistive
technologies for improving the quality of life of people with
neuromuscular disabilities [6]. Examples of these assistive
technologies include specially designed joysticks,
infrared oculography, tongue-computer interfaces and
brain-computer interfaces [7-8]. Generally, all these HCI
systems serve as the bridge between the human and the
computer by translating or decoding the signals generated
from physiological processes into control commands.
Particularly, because the eyes and related facial muscles are
rarely affected by neuromuscular mobility impairments,
many HCI systems translate electrooculography (EOG)
signals generated by intended actions of these intact organs
into control commands. A majority of existing EOG-HCI
systems [1-4] rely on multiple wet electrodes, because they
achieve a high signal-to-noise ratio (SNR) and provide more
discriminative information for recognizing different types of
eye activity. As a result, the characteristic structure of EOG
in the temporal or spatial domain can be extracted more
accurately, and such systems are better able to classify
different types of eye movements, such as looking in
different directions, which yields a higher system
performance score.
Although HCI systems based on multi-channel EOG
normally perform well, they are not considered
user-friendly or practical. The overall quality of an
assistive technology can be evaluated by two types of
factors: ergonomic factors and system performance
factors. System performance factors, such as the accuracy of
classifying signals encoded by different intentions, the false
alarm rate, the execution speed, and the information transfer
rate, are often regarded as the most important factors when
designing an HCI system, particularly in the laboratory.
However, for practical applications, ergonomic factors such
as comfort, portability, cost effectiveness, and time spent on
training are crucial, because an HCI system ultimately has to
be used by an end user for a long period of time and in an
easy-to-use manner.
Research supported by a Hong Kong RGC GRF Grant (HKU
785913M).
A. M. S. Ang, Z. G. Zhang and Y. S. Hung are with the Department of
Electrical and Electronic Engineering, the University of Hong Kong,
Pokfulam Road, Hong Kong (e-mail: angms@hku.hk;
zgzhang@eee.hku.hk; yshung@eee.hku.hk).
J. N. F. Mak is with NeuroSky Hong Kong, Science Park, Sha Tin, Hong
Kong (e-mail: jmak@neurosky.com).
Ergonomic factors are given less consideration in many
existing EOG-based HCI systems developed in the laboratory,
and therefore these systems cannot provide a practical
assistive technique for users. For example, current HCI
systems based on EOG activities usually involve bulky
hardware, such as wiring and amplifiers, which is not
user-friendly for disabled people because of its higher
setup and preparation cost. Another important ergonomic
factor is the time users spend on training. Most
EOG-based HCI systems require users to practice
intensively in a training session, or to memorize certain kinds
of eye movements for specific control commands, which is
not user-friendly and may not be achievable for patients with
cognitive impairments.
In this paper, we develop a novel EOG-based HCI
system which is aimed at maximizing both ergonomic
factors (usability, less training time, etc.) and system
performance factors (accuracy, information transfer rate,
etc.). The proposed system uses EOG produced by only one
type of eye activity, the double blink (DB), to encode the
user’s intentions. In the proposed system, EOG activities are
captured by a single-channel commercial headset, the
NeuroSky MindWave Mobile Headset (NeuroSky, CA,
US). The headset is essentially a single-channel sensor with a
dry electrode made of stainless steel. The sensor is attached
to the forehead of the user to continuously collect electrical
signals generated by the brain and muscles. The
system then sends the collected signals wirelessly through the
Bluetooth communication protocol to a computer for
processing. The processing blocks consist of filtering,
activity detection and classification. The recording is first
filtered to remove noise and any undesired components,
and then the continuous recording goes through an
activity detector to extract a short data segment containing
EOG activities. The extracted data segment is analyzed
to yield a set of discriminative features, which are fed into a
classifier. Finally, the classification output is used to control
the mouse cursor for multiple applications on a computer.
The control of the cursor is based on the following
switching control scheme: (1) the mouse cursor keeps
moving either vertically (from top to bottom) or horizontally
(from left to right) on the screen at a constant speed if no
double blink is detected; (2) when the first double blink (i.e.,
a control command) is detected from EOG, the cursor
switches its moving direction (from vertically to horizontally,
or from horizontally to vertically); (3) when the second
double blink occurs, a click is performed at
the current cursor location, and the mouse cursor is
reset to the top-left corner of the screen and starts moving again.
A User-friendly Wearable Single-channel EOG-based
Human-Computer Interface for Cursor Control
A. M. S. Ang, Z. G. Zhang, Member IEEE, Y. S. Hung, Senior Member IEEE, J. N. F. Mak
CONFIDENTIAL. Limited circulation. For review only.
Preprint submitted to 7th International IEEE EMBS Conference
on Neural Engineering. Received November 24, 2014.
Compared with existing EOG-based HCI systems,
the advantages of the new system are threefold. First, it is
based on a commercial device, the NeuroSky MindWave
headset, so it is highly mobile and can potentially be
used in more scenarios. Second, the system uses only one
eye action, the double blink, to encode users’ intentions, so
the control strategy is simple and user-friendly. Third,
because of the high inter-subject consistency of
double-blink EOG signals, the system is capable of
performing cross-subject decoding, which means that no
training is required for new users.
Unlike its multi-channel counterparts, the proposed
system has only a single electrode to collect information,
and is therefore more sensitive to noise and has a lower SNR.
To handle these problems, the system adopts advanced signal
processing methods (such as wavelet filtering and a support
vector machine) to boost the system performance.
The rest of the paper is organized as follows. Section
II describes the system architecture, data analysis methods and
implementation of the new EOG-based system. The
experimental results and discussion are given
in Section III. Conclusions are drawn in Section
IV.
II. METHODS
A. System Architecture
Fig. 1 shows the structural diagram of the proposed
system. EOG signals generated by the user are collected
by the NeuroSky MindWave Mobile headset at a
sampling rate of 512 Hz. The raw digital signals are packed
and then transmitted to a computer through the Bluetooth
communication protocol. Data packet parsing is then
performed on the computer to obtain the raw numerical
values of the signals. These raw signals then go through
pre-processing blocks, which include de-trending and
wavelet de-noising, to remove noise and undesired
components. The filtered signals then pass through an
activity detector, which extracts the signal segments that
contain double blinks (or other major eye activities) and
stores each extracted segment in a 1-second buffer.
Subsequently, feature extraction is performed on each
buffer and the extracted features are fed into the SVM
classifier. All the functional blocks are cascaded: once an
activity that is not of interest is detected, the system discards
that segment immediately to save computational
resources.
Fig. 1. The cascade architecture of the whole system.
B. EOG Signals
EOG signals are the electrical activity generated by the
movements of the eyeballs or the eyelid muscles. As shown
in our previous study, various kinds of EOG signals, such as
looking towards different directions and blinking in
different ways, can be used to indicate the user’s intentions
[5]. However, because the system proposed in this paper
focuses primarily on practical aspects and aims to
provide an alternative communication pathway for
disabled people, only both-eye double blinks (DB) are
used as control signals in this paper.
Fig. 2. The raw double blink EOGs collected from 6
subjects. Signals are aligned by peaks.
C. Filtering and Real-time Detection
Although the EOGs look highly consistent, filtering is
necessary for noise removal and enhancement of the
reliability of the detector and the extracted features. Wavelet
filtering is used instead of traditional bandpass filtering to
remove noise from raw recordings because of its better
de-noising ability and smaller phase distortion.
Activity detection is used in the system to isolate DB
signals from continuous recordings. Because there is a great
difference in terms of magnitude between the DB signals
and the background recording, a magnitude-based detection
method is used. When the signal magnitude exceeds a
certain threshold, a 1-second segment at that instant is
extracted and stored in a buffer. To handle the
non-stationarity of DB signals, the threshold value $\theta$
is dynamically updated by adjusting the previous threshold value
with the current estimated magnitude of the background
recordings. That is, $\theta$ is calculated as follows:
$$\theta_{\mathrm{New}} = \theta_{\mathrm{Previous}} \times \frac{\text{new noise magnitude}}{\text{previous noise magnitude}}. \quad (1)$$
Experimental data show that the dynamic threshold $\theta$ for
the DB signal is 0.0498 mV ± 0.0072 mV (mean ± SD).
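The detection step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of the mean absolute value as the noise-magnitude estimate, the 1-second noise window, and the choice to start each segment at the threshold crossing are all assumptions made for illustration. Eq. (1) is read here as scaling the previous threshold by the ratio of the new to the previous noise magnitude.

```python
from typing import List, Tuple

FS = 512  # headset sampling rate in Hz, as stated in Section II-A


def mean_abs(samples: List[float]) -> float:
    """Mean absolute value, an assumed stand-in for the noise-magnitude estimate."""
    return sum(abs(s) for s in samples) / len(samples)


def update_threshold(prev_threshold: float, prev_noise: float, new_noise: float) -> float:
    """Eq. (1) as read here: previous threshold times new/previous noise magnitude."""
    return prev_threshold * new_noise / prev_noise


def detect_segments(
    signal: List[float],
    threshold: float,
    noise: float,
    seg_len: int = FS,    # 1-second buffer
    noise_win: int = FS,  # assumed window for re-estimating the background
) -> Tuple[List[Tuple[int, List[float]]], float]:
    """Return (start index, segment) pairs and the final threshold value."""
    segments = []
    quiet: List[float] = []  # background samples seen since the last update
    i = 0
    while i < len(signal):
        crossing = abs(signal[i]) > threshold
        if crossing and i + seg_len <= len(signal):
            # Buffer one segment starting at the crossing (assumption).
            segments.append((i, signal[i:i + seg_len]))
            quiet = []
            i += seg_len
            continue
        quiet.append(signal[i])
        if len(quiet) == noise_win:
            new_noise = mean_abs(quiet)
            if new_noise > 0:
                threshold = update_threshold(threshold, noise, new_noise)
                noise = new_noise
            quiet = []
        i += 1
    return segments, threshold
```

In this sketch the initial `threshold` and `noise` values would come from a calibration step; both must be positive for the ratio in Eq. (1) to be defined.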
D. Feature Extraction and Classification
From the 1-s data segments we extract features that can
differentiate the DB signal of interest from signals generated by
other eye movements, such as single blinks. We extracted
numerous features (in either the time domain or the frequency
domain) from the data segments and performed feature
selection by comparing the inter-cluster distances of each
feature. Finally, the following three features were selected for
classifying DB activity and non-DB activity: L1-norm,
Kurtosis and Entropy. L1-norm measures the magnitude of
the signal by summing up the absolute values of all the
samples in the signal vector. Kurtosis measures the
peakedness of the signal and entropy measures the amount
of information in the signal.
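The three features can be sketched in plain Python as follows. The paper does not give the exact definitions, so this sketch assumes the non-excess (Pearson) form of kurtosis and a histogram-based Shannon entropy with a hypothetical bin count of 16; the function names are illustrative only.

```python
import math
from collections import Counter
from typing import List


def l1_norm(x: List[float]) -> float:
    """Sum of the absolute values of all samples in the segment."""
    return sum(abs(v) for v in x)


def kurtosis(x: List[float]) -> float:
    """Fourth central moment over squared variance (non-excess form, assumed)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    if m2 == 0:
        raise ValueError("kurtosis is undefined for a constant segment")
    return m4 / (m2 * m2)


def shannon_entropy(x: List[float], bins: int = 16) -> float:
    """Shannon entropy in bits of an amplitude histogram (assumed definition)."""
    lo, hi = min(x), max(x)
    if hi == lo:
        return 0.0  # a constant segment carries no information
    width = (hi - lo) / bins
    # The maximum sample would fall just past the last bin, so clamp it.
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in x)
    n = len(x)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def extract_features(segment: List[float]) -> List[float]:
    """Three-element feature vector for one buffered segment."""
    return [l1_norm(segment), kurtosis(segment), shannon_entropy(segment)]
```

A feature vector built this way has mixed scales (the L1-norm grows with segment length and amplitude), so some form of feature scaling before the classifier would likely be needed in practice.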
After feature extraction, the three features are fed into the
classifier, which is a kernelized support vector
machine (SVM). In training the SVM, both regularization
and cross-validation are performed. The classification
output is translated into digital commands for
controlling the mouse cursor.
E. Cursor Control
DB signals are used as a switch in the system to control
the following two actions: (1) switching the moving
direction of the mouse cursor and (2) left-clicking. Suppose the
mouse cursor has the coordinates (X,Y) on the screen with
an initial condition of X=Y=1 (the top-left corner of the
screen). When the system starts running, the mouse cursor
starts moving horizontally (i.e., Y is kept constant while X
keeps increasing with a step size of Δx). When a DB activity
is detected, the cursor stops at the current location and
starts moving vertically (i.e., X is kept constant and Y keeps
increasing with a step size of Δy). When the second DB
activity is detected, the mouse left-clicks at that location
(X,Y), and then its coordinates are reset to X=Y=1. The
whole process repeats so that complex functions, such as
text input, can be realized. The cursor control paradigm is
summarized in Table I.
TABLE I. FLOWCHART OF CURSOR CONTROL USING DOUBLE BLINKS
BEGIN
  XMax ← Screen X; YMax ← Screen Y;
  Δx ← XMax / 300; Δy ← YMax / 100;
  X ← 0; Y ← 0;
  Direction ← Horizontal;
  WHILE running
    IF Direction == Horizontal
      X ← (X + Δx) mod XMax;
      IF Double-Blink detected
        Direction ← Vertical;
      END IF
    ELSE IF Direction == Vertical
      Y ← (Y + Δy) mod YMax;
      IF Double-Blink detected
        Mouse click;
        Direction ← Horizontal;
        X ← 0; Y ← 0;
      END IF
    END IF
  END WHILE
END
The default values of XMax and YMax are the screen width
and height, while Δx and Δy are the step sizes along the
horizontal and vertical directions, respectively. These values
affect the cursor moving speed and thus can be tuned to meet
different people’s needs. In this paper the screen resolution
is 1920×1080.
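The switching scheme of Table I can be sketched as a small state machine in Python. This is an illustrative sketch rather than the authors' code: the class name `CursorScanner` and the `step` method are hypothetical, a click is recorded in a list instead of being sent to the operating system, and each call to `step` is assumed to consume at most one double-blink event.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CursorScanner:
    x_max: int = 1920  # screen width used in the paper
    y_max: int = 1080  # screen height used in the paper
    x: float = 0.0
    y: float = 0.0
    horizontal: bool = True  # scanning starts in the horizontal direction
    clicks: List[Tuple[float, float]] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Step sizes as given in Table I.
        self.dx = self.x_max / 300
        self.dy = self.y_max / 100

    def step(self, double_blink: bool) -> None:
        """Advance the cursor by one step and handle one double-blink event."""
        if self.horizontal:
            self.x = (self.x + self.dx) % self.x_max
            if double_blink:
                self.horizontal = False  # first DB: switch to vertical scan
        else:
            self.y = (self.y + self.dy) % self.y_max
            if double_blink:
                self.clicks.append((self.x, self.y))  # second DB: click
                self.horizontal = True
                self.x, self.y = 0.0, 0.0  # reset to the top-left corner
```

Calling `step(False)` repeatedly scans the cursor across the screen; passing `True` on the step where the classifier reports a double blink first locks the column and then, on the second double blink, records the click position.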
F. Experiment Set up
Eight subjects (aged 18-40; all male) with normal eye
condition participated in the experiment. Data were
recorded in two scenarios: indoor (in a quiet laboratory)
and outdoor (in a canteen with different sources of noise to
simulate real-life situations). We collected indoor data from
6 subjects and outdoor data from 2 subjects. Each subject was
seated in a chair with the headset sensor attached to the
forehead, sampling at 512 Hz. A computer
was placed about 50-100 cm in front of the subject. The
experiment consisted of two sessions: a calibration session
and a testing session. In the calibration session, we collected
a few (fewer than 5) DB signals from each subject; these
signals were used to check whether the sensor was detached
from the subject’s forehead and to calibrate the threshold
according to (1). No training was performed after the calibration. The
system was then tested using the virtual on-screen keyboard. All
subjects gave their written informed consent, and the local
ethics committee approved the experimental procedures.
G. Performance evaluation
To evaluate the system performance, the accuracy, the
information transfer rate and the processing time are
used. The accuracy is the ratio of correctly classified trials
to the total number of signal trials. The information transfer
rate (ITR) measures the number of bits transferred per
minute. It is calculated as
$$\mathrm{ITR} = \frac{60}{T}\left[\log_2 N + p\log_2 p + (1-p)\log_2\!\left(\frac{1-p}{N-1}\right)\right], \quad (2)$$
where T is the time interval (in seconds) between two consecutive
commands, N is the number of commands, and p is the
classification accuracy. In the proposed system N=2 and
T = 2.0179 s ± 0.59 s. Notice that ITR is a function of both the time
interval between commands and the accuracy, so a higher
accuracy does not always imply a higher ITR.
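As a worked illustration of (2), the following Python sketch evaluates the ITR for a given accuracy, number of commands and command interval. The function name and argument names are chosen here for illustration; the terms that contain p log2 p with p equal to 0 or 1 are treated as zero, which is the usual limiting convention.

```python
import math


def itr_bits_per_min(p: float, n: int, t: float) -> float:
    """Information transfer rate of Eq. (2) in bits per minute.

    p: classification accuracy in [0, 1]
    n: number of possible commands (at least 2)
    t: time between two consecutive commands, in seconds
    """
    if not 0.0 <= p <= 1.0 or n < 2 or t <= 0:
        raise ValueError("expected 0 <= p <= 1, n >= 2 and t > 0")
    bits = math.log2(n)
    if p > 0.0:
        bits += p * math.log2(p)
    if p < 1.0:
        bits += (1.0 - p) * math.log2((1.0 - p) / (n - 1))
    return 60.0 / t * bits
```

For N=2, a perfect classifier issuing one command per second gives 60 bits/min, while an accuracy of 50% gives 0 bits/min regardless of speed, which illustrates why accuracy and command interval have to be considered together.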
Finally, the time required for the user to input a
short English phrase is also evaluated. To test cursor control
for text entry, the Windows On-Screen Keyboard is
used, as shown in Fig. 3. When the mouse moves to the
corresponding button, the user performs a DB to “press” the
button. The English phrase “hello world”, which contains 11
characters (space included), is used to simulate daily
computer use for word input.
Fig. 3. The Windows On-Screen keyboard and the input
text.
III. RESULTS
On average, the accuracy and ITR for indoor and
outdoor testing are (84.41%, 45.47 bits/min) and (71.50%,
41.39 bits/min), respectively. The following tables show the
results for each subject in the indoor and outdoor environments.
It is important to note that the SVM classifier was trained on
Subject 1 and applied to all subjects. In other words, the
performance of Subjects 2-6 listed in the following tables is
from cross-subject prediction, and these subjects did not
undergo any training phase before they used the EOG-HCI
system.
TABLE II-A. INDOOR SYSTEM PERFORMANCE OF THE EOG-HCI SYSTEM
Subject               1      2      3      4      5      6
Accuracy (%)          95.12  80.00  88.10  85.29  88.00  70.00
ITR (bit/min)         54.60  48.15  50.68  39.78  38.37  38.70
t_hello-world (min)   2.1    3.3    3.1    3.5    4.0    3.8
TABLE II-B. OUTDOOR SYSTEM PERFORMANCE OF THE EOG-HCI SYSTEM
Subject               7      8
Accuracy (%)          69.00  74.00
ITR (bit/min)         40.00  42.77
t_hello-world (min)   5.5    5.2
The times spent (t_hello-world) on typing “hello world”
differ considerably between subjects, for the
following reason. The computational complexity
of the proposed system is moderately high. Because of the
advanced signal processing algorithms used in this system,
there is a small time delay (0.1-0.5 seconds).
This delay makes it possible for users to click at the
wrong location on the screen. Therefore, the time spent on
typing “hello world” includes the time spent on
pressing “backspace” to delete wrongly entered
characters. To address the problem of lag, the following
improvements can be made: (1) smaller values of Δx and Δy
should be selected, (2) a larger cursor (set through the
operating system settings) should be used, (3) the computer
screen should be larger so that the buttons of the
virtual on-screen keyboard are larger and easier
to press. It is also suggested that the movable range of the
cursor position be constrained. For example, when
initializing the cursor position, the top-left corner of the
virtual on-screen keyboard should be used instead of the
position (0, 0).
In addition, we surveyed the subjects on the user-friendliness
of the system. All subjects agreed that they felt no
discomfort or pain during the experiment, and that
the system is very easy to use.
The following table compares the
proposed system with other established systems in the
literature. It can be seen that the new system has an
acceptably high accuracy, while its usability is
higher than that of existing counterparts (mainly because of
the NeuroSky MindWave system).
TABLE III. PERFORMANCES OF DIFFERENT SYSTEMS
System         [1]   [2]      [3]     [4]   This paper
Accuracy (%)   95    82-100   78-97   86    72-84
#channels      3     16       5       8     1
IV. CONCLUSION
This paper presented a computer-access solution for
people who can only move their eyes, which lets them control
the computer by translating eye blinks into a sequence of mouse
cursor control commands. With only a single channel, the
system has a higher degree of usability but can still achieve
an acceptable accuracy. The system can be used not only in
the laboratory but also in outdoor environments. The system
requires only the double-blink action, which is natural to those
with normal eye condition and causes no discomfort. Thus,
both ergonomic factors and system
performance factors are maximized. The system has the
potential to find many applications in daily computer usage,
such as cursor control, text processing and web browsing. In
this paper, only DB signals are utilized in the system. In
the future, the proposed system can be extended into a
more powerful system by utilizing other non-DB activities,
to achieve a higher information transfer rate.
REFERENCES
[1]. E. English, A. Hung, E. Kesten, D. Latulipe, and Z. P. Jin,
“EyePhone: A mobile EOG-based human-computer
interface for assistive healthcare,” in Proc. IEEE Conf.
EMBS NER, 2013.
[2]. S. L. Wu, L. D. Liao, S. W. Lu, W. L. Jiang, S. A. Chen, and
C. T. Lin, “Controlling a human-computer interface system
with a novel classification method that uses
electrooculography signals,” IEEE Trans. Biomed. Eng., vol.
60, no. 8, pp. 2133-2141, Aug. 2013.
[3]. Y. Nam, B. Koo, A. Cichocki, and S. Choi, “GOM-Face:
GKP, EOG, and EMG-based multimodal interface with
application to humanoid robot control,” IEEE Trans. Biomed.
Eng., vol. 61, no. 2, pp. 453-462, Feb. 2014.
[4]. T. Yagi, Y. Kuno, K. Koga, and T. Mukai, “Drifting and
blinking compensation in electro-oculography (EOG)
eye-gaze interface,” in Proc. IEEE Conf. SMC, 2006.
[5]. J. F. Wu, A. M. S. Ang, K. M. Tsui, H. C. Wu, Y. S. Hung, Y.
Hu, J. N. F. Mak, S. C. Chan, and Z. G. Zhang, “Efficient
implementation and design of a new single-channel
electrooculography-based human-machine interface
system,” IEEE Trans. Circuits and Systems II, in press.
[6]. A. D. N. Edwards, Ed., Extraordinary Human-Computer
Interaction: Interfaces for Users with Disabilities, vol. 7.
CUP Archive, 1995.
[7]. L. N. S. Andreasen Struijk, “An inductive tongue computer
interface for control of computers and assistive devices,”
IEEE Trans. Biomed. Eng., vol. 53, no. 12, pp.
2594-2597, 2006.
[8]. J. R. Wolpaw, D. J. McFarland, G. W. Neat, and C. A.
Forneris, “An EEG-based brain-computer interface for
cursor control,” Electroencephalography Clin.
Neurophysiology, vol. 78, no. 3, pp. 252-259, 1991.