ANALYSING THE QUALITY OF EXPERIENCE OF MULTISENSORY MEDIA
FROM MEASUREMENTS OF PHYSIOLOGICAL RESPONSES
Jacob Donley, Christian Ritz, Muawiyath Shujau
School of Electrical Computer and Telecommunications Engineering, University of Wollongong,
Wollongong, NSW, Australia, 2522, jrd089@uowmail.edu.au, critz@uow.edu.au, ms970@uowmail.edu.au
ABSTRACT
This paper investigates the Quality of Experience (QoE) of
multisensory media by analysing biosignals collected by
electroencephalography (EEG) and eye gaze sensors and
comparing with subjective ratings. Also investigated is the impact
on QoE of various levels of synchronicity between the sensory
effect and the target video scene. Results confirm findings from previous research showing that sensory effects added to videos increase the QoE rating. While there was no statistical difference
observed for the QoE ratings for different levels of sensory effect
synchronicity, an analysis of raw EEG data showed 25% more
activity in the temporal lobe during asynchronous effects and 20-
25% more activity in the occipital lobe during synchronous effects.
The eye gaze data showed more deviation for a video with
synchronous effects and the EEG showed correlating occipital lobe
activity for this instance. These differences in physiological
responses indicate sensory effect synchronicity may affect QoE
despite subjective ratings appearing similar.
Index Terms— quality of experience (QoE), multisensory
media, gaze tracking (GT), electroencephalography (EEG),
synchronicity
1. INTRODUCTION
Multisensory media systems provide an enhanced user Quality of Experience (QoE), which in this context is also known as Sensory Experience [1], by stimulating human senses with vibration, blown air and ambient lighting effects at time-points that are linked to an audio/visual scene in a multimedia
presentation [2, 3]. Previous research investigating the use of a
multisensory media system to provide these sensory effects for
video demonstrated an enhanced user experience compared to the
same videos without the sensory effects [3]. This was based on
subjective testing, where a participant selected their vote from an
international standard scale for subjective quality assessment of
multimedia [2-4]. In [2, 5], research further investigated the
enhancement of specific emotional responses provided by sensory
effects added to video sequences. This existing research is based
on QoE or emotional response ratings provided by each user at the
end of the test stimulus. In contrast, this paper investigates the
temporal variation of QoE during the presentation of the sensory-
enhanced test stimulus. This is achieved by collecting and
analysing physiological responses using biosensors whilst a subject
participates in a QoE evaluation of the test sequences. Presented
here is a system that integrates an electroencephalography (EEG)
headset and an eye gaze tracker within a multisensory media
presentation and QoE evaluation system [2, 3].
EEG is a method of reading the electrical activity of the brain by measuring the potential difference between two different electrodes positioned on the surface of the scalp. Analysing EEG
recordings can be used to identify a person’s emotional state for
both mapping to the six primary emotions elicited by viewing
images [6] as well as distinguishing between like and dislike when
viewing video advertisements [7]. Recent research [8] has
investigated the use of EEG data for understanding the time-
varying QoE of multimedia and research in [9] investigates EEG
responses for comparison with user ‘tags’ added temporally during
presentation of the multisensory media at time-points chosen by
the user. Eye gaze tracking data collected while users watch a
media presentation can be analysed to detect when a user glances
or stares at objects of interest within an audio/visual scene. This is
directly related to what a human brain is processing when engaged
in an activity [10]. Jointly analysing gaze and EEG activity can be
used to investigate how a person feels about what they are
observing onscreen at that particular time. Such analysis is
explored in this work to see how it relates to QoE for multisensory
media applications.
Previous research [4] has concluded that different sensory
effects have varying impact on the overall QoE and emotion as
judged by a participant. Results have also shown that the level of
QoE enhancement is related to the genre of the video. Such
research provides useful information with regard to modelling
sensory effects to ensure QoE is maximised. Poor synchronisation
of audio and visual content is known to provide a poor user
experience [11, 12]. Recent work [1] has found that synchronicity
of olfactory sensory effects with the target video scene may also
influence QoE. This paper provides a new investigation into the
impact of synchronicity between a video scene and other sensory
effects (wind, vibration, and lighting) on the resulting QoE. This is
based on analysing subjective QoE ratings as well as biosensor
signals (EEG and eye gaze) collected for sequences containing no
effects, effects synchronised to the desired time-location and
effects asynchronous with the desired time-location.
Section 2 of this paper reviews techniques for EEG and eye
gaze monitoring and describes the methods adopted in this research.
Section 3 describes the biosensor-based QoE evaluation system
while Section 4 describes the multisensory media QoE evaluation
incorporating biosensors. Results from subjective QoE testing and
analysis of EEG and eye gaze data for a selection of sensory
enhanced videos with different types and synchronisation of effects
are presented in Section 5 with conclusions provided in Section 6.
2. MEASURING EMOTIONAL RESPONSES FROM EEG
AND EYE GAZE SIGNALS
This section reviews existing approaches to measuring emotional
response from EEG and eye gaze signals and describes the
techniques adopted in this work.
2.1. Electroencephalography (EEG)
Accurately and reliably testing for emotions and user experiences by collecting EEG data is difficult when most test equipment for these applications is uncomfortable and cumbersome. For example, the standard 10-10 EEG system requires 81 separate wires and corresponding electrodes [13], usually held in a net or cap strapped over the head. More recently, less obtrusive EEG headsets have become available, such as the Emotiv EEG [14] that is used in this work. This new style of EEG headset fits neatly and comfortably on a person’s head and includes a built-in accelerometer to record head movement.
Using multiple sensors (14 channels in the case of the EEG
headset used here) allows for recording signals from different parts
of the brain. For example, the frontal lobe of the cerebral cortex is associated with behavioural and emotional responses; it also functions in close relationship with other regions of the brain in aspects of memory, learning, attention and motivation [15]. These complex relationships make it possible to infer emotional behaviour from the electrode signals. As well as recording raw signals, the
Emotiv EEG system includes software for deriving parameters
from these signals, such as ‘frustration’ values, that can be linked
to emotion. Being a proprietary algorithm, there is little
information about how these ‘frustration’ values are determined
(further explained in Section 3). Hence, this paper also analyses the
raw EEG signals using the standardised low resolution brain
electromagnetic tomography (sLORETA) method [16] for spatial
analysis. This method was chosen as it yields images of
standardised current density with zero localisation error [16] and is
an alternative to analysing P300 responses or sub-band frequency
powers [17] which are both temporal analysis methods. For an in-
depth explanation of sLORETA the reader is referred to [16]. The advantage of spatial analysis is that it can distinguish between regions of activity over time. sLORETA helps quantify these active regions of the brain to assist in deducing emotions, with the added benefit of visualising brain activity.
2.2. Eye gaze tracking
In addition to EEG, eye gaze tracking data can be analysed to
identify audio/visual content linked to emotional responses (see
Section 5.6). Similar to older EEG headsets, early eye gaze trackers required mounting cumbersome equipment on the head, such as a camera, glasses, reflective plates and camera mounts [18, 19]. Recently there have been improvements to eye gaze software that make it possible to track eye gaze from video without calibration [18] and from standard low-resolution web cameras [20, 21]. Compared to more sophisticated systems involving special glasses or other hardware, these newer systems minimise distractions that may otherwise influence the test results.
To increase the accuracy of the eye gaze tracker, the system
used in this paper includes a PS3Eye [22], varifocal lens and an
infrared (IR) light source pointing towards the eyes of the subject.
The varifocal lens was added to the PS3Eye to improve focus. The
PS3Eye manufactured by Sony was chosen as it was designed in
collaboration with sensor chip manufacturer OmniVision
Technologies to perform well in variable lighting [23]. Hence, the captured image of the eye has improved resolution and quality because the setup accounts for the distance to the subject and for varying lighting conditions. The high frame rate of this camera (60
frames per second) makes it possible to sample faster and average
multiple image frames together to improve the accuracy of the
tracker. The IR light source causes only the cornea to reflect the
light back at the camera. This creates a distinct pupil in the image
which can then be segmented from the image using software
algorithms that recognise the pupil and edges of the eye [20]. The
work in this paper uses the ITU Gaze Tracker (GT) [24], which has
been shown to provide accurate performance [21], to collect eye
gaze data as a user views videos.
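To make the frame-averaging step concrete, the following minimal Python sketch (illustrative only, not the authors' software; the 8-bit greyscale frame format is an assumption) averages a short burst of consecutive camera frames before pupil detection:

```python
# Minimal sketch (not the authors' software): averaging consecutive camera
# frames to reduce sensor noise before pupil detection. Assumes frames are
# delivered as 8-bit greyscale NumPy arrays at 60 fps.
import numpy as np

def average_frames(frames):
    """Average a list of frames into one denoised image."""
    stack = np.stack([f.astype(np.float32) for f in frames])
    return np.clip(stack.mean(axis=0), 0, 255).astype(np.uint8)

# e.g. averaging every 4 frames trades the 60 fps rate for a less noisy
# 15 Hz effective gaze sampling rate.
```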
3. INTEGRATED BIOSENSOR-BASED QOE EVALUATION
SYSTEM
In order to record all of the information, a software package was written that integrates the ITU GT, Emotiv EEG headset and the
Philips amBX multisensory media system. The ITU GT was added
to the package so that the original interface could be controlled to
perform the calibration and recordings. The integrated system is
shown in Figure 1.
The Emotiv EEG headset comes with a Software
Development Kit (SDK) which allows recording of raw EEG and
filtered signals. The software records values from the Expressiv™
and Affectiv™ suites, saving information on facial expressions and
real-time changes in subjective emotions, respectively. The
experiments in this paper analyse the ‘frustration’ values recorded
from the Affectiv™ suite as well as the raw EEG signals. Emotiv
have informally mentioned [25] that Affectiv™ values are
calculated from algorithms written to correlate to subjective studies
as explained in [26]. A live display was added to show the values
from the Affectiv™ suite, Expressiv™ suite, gyroscope and
contact quality of the sensors (see Figure 1).
The amBX multisensory media system is packaged with an
SDK allowing the software package to control the equipment precisely.

Figure 1: Layout of evaluation system

Figure 2: Hardware configuration and test setup

A media player was coded into the user interface to
display audio and video content. The program is coded based on
the MPEG-V standard [27] so that it can read vibration, wind and
lighting effects from compatible sensory effects metadata files. An
auto extraction feature was written for the lighting effects based on
previous work [28]. The multisensory media section of the
software package records the times when events occur in the video.
These times are saved along with the EEG and GT logs relative to
a common timestamp.
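As an illustration of how the three data streams can share a common timestamp, the following minimal Python sketch (an assumed structure, not the released software package) logs effect, EEG and gaze events against a single clock:

```python
# Minimal sketch (assumed structure, not the released software package):
# logging sensory-effect events, EEG samples and gaze samples against one
# shared clock so the three streams can be aligned in post-processing.
import csv
import time

class SessionLogger:
    def __init__(self, path):
        self.t0 = time.monotonic()                  # common timestamp reference
        self.file = open(path, "w", newline="")
        self.writer = csv.writer(self.file)
        self.writer.writerow(["t_rel_s", "stream", "payload"])

    def log(self, stream, payload):
        # stream is e.g. "effect", "eeg" or "gaze"; payload is a short string
        self.writer.writerow([time.monotonic() - self.t0, stream, payload])

    def close(self):
        self.file.close()
```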
4. SUBJECTIVE TESTING METHODOLOGY
The testing methodology was designed to align with the research in [2, 3, 29]. The design was based on a single
stimulus testing method where subjects were introduced to the
equipment using a training phase [4]. Videos and their effects were
chosen based on [2, 4]. The asynchronous effects were designed so
that all effects (wind, vibration and light) preceded the audio and
video by 500ms. This value was chosen as it is the average skew in
the positive and negative direction considered perceptually
noticeable for media synchronisation [11, 12]. This section
describes the procedure used to evaluate the subjects with
biosensors and relates to previous discrete results [2, 3] by
incorporating a QoE voting stage for the videos. The equipment
setup and test configuration is depicted in Figure 2.
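The following minimal Python sketch (using a hypothetical in-memory effect representation rather than the MPEG-V metadata format) illustrates how the asynchronous condition can be generated by advancing every effect by 500 ms:

```python
# Minimal sketch (hypothetical effect representation, not the MPEG-V parser):
# the asynchronous condition is produced by advancing every effect so that
# wind, vibration and light precede the audio/video by 500 ms.
SKEW_MS = 500

def make_asynchronous(effects):
    """effects: list of dicts such as {"type": "wind", "start_ms": 12000}."""
    return [dict(e, start_ms=max(0, e["start_ms"] - SKEW_MS)) for e in effects]
```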
4.1. Introduction and pre-questionnaire
The evaluation begins by describing the procedure, set of
tasks and rating method to the subject. The subject is told the
working definition of QoE as stated in [30] and that they will be
asked to base their votes on their QoE of the presented
multisensory media. Following their consent, the evaluation
begins. A pre-questionnaire establishes basic demographic data
(name, age, gender). This was used to later determine that there
were no significant differences in test results due to age or gender
[4].
4.2. Biosensor setup
The biosensors are set up before the training phase to ensure the
participants become familiar with the equipment prior to the main
evaluation. The EEG headset is set up first using an appropriate
saline solution to ensure sensor contacts are fitted to provide for
the highest signal quality. The GT calibration takes place next and
the subject is positioned approximately 1 metre from the screen
[4]. The camera is optically zoomed so that any black background
visible in the image is minimised and the eyes are located near the
centre of the image. The colour of the GT calibration background
is set to a light grey colour; this allows for miosis in the dim
environment which gives a clearer distinction between the sclera,
iris and pupil. Individual calibration points are re-calibrated if the
GT cannot determine which quadrant of the screen the subject is
gazing at.
4.3. Training phase
A training phase takes place after the biosensor setup, as
recommended in [4]. This is designed to eliminate the surprise
effect [4] by helping the participants to become familiar with the
stimulus presentation and style of QoE voting. The design was
adapted from the training phases used in [4, 29]. The results of the
training phase are not included in the final analysis. The training
phase for all evaluations used a shortened version of the publicly
available ‘Titanic (1997)’ trailer (2012 3D release). The genre of the trailer is Drama, and it is presented with 18 wind effects and 13 vibration effects. The training video was shown consecutively
three times with different effects picked randomly from: no effects;
asynchronous effects; and synchronous effects. After each video
the subject was asked to vote their QoE on an eleven-grade
numerical quasi-continuous quality scale from 0 to 100 [3, 4, 31].
4.4. Main evaluation
Once the subjects are familiar with the equipment, effects
and the voting stages, the EEG and GT logging commences for the
main evaluation. The 15 videos seen in Figure 3 (dataset available
at [32]) are each randomly presented three times: without effects, with
asynchronous effects and with synchronous effects. No two videos
with the same title are shown consecutively [29]. At the end of
each video the subject is asked to record their QoE. The voting
takes no more than 10 seconds as suggested in [31]. The QoE votes
are stored together with logs for the biosensors. The single
stimulus assessment method is adopted from [4]. The videos have
an average duration of 30 seconds [32] and the entire evaluation
takes approximately 22.5 minutes.
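The randomised presentation order, with the constraint that no two videos with the same title are shown consecutively, could be generated, for example, as in the following Python sketch (an illustrative rejection-sampling approach; the playlist structure is assumed):

```python
# Minimal sketch (assumed playlist structure): shuffling the 45 trials
# (15 titles x 3 effect conditions) and rejecting any order in which two
# clips with the same title would be shown consecutively.
import random

def build_playlist(titles, conditions=("none", "async", "sync")):
    trials = [(title, cond) for title in titles for cond in conditions]
    while True:
        random.shuffle(trials)
        if all(a[0] != b[0] for a, b in zip(trials, trials[1:])):
            return trials
```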
4.5. Post-questionnaire
At the end of the main evaluation the EEG headset is
removed and the subject is asked to complete a post-questionnaire
[2, 4]. The post-questionnaire is given to gain feedback on the
evaluation task and gives the subject opportunity to provide
recommendations to the procedure.
5. RESULTS
For the subjective testing 10 subjects (6 males and 4 females) were
chosen from an initial set of 15 who were invited to participate in the experiment.

Figure 3: QoE votes (MOS, %) per shortened video title with 95% confidence intervals, for the conditions without effects, with asynchronous effects and with synchronous effects.

Five subjects were excluded based on EEG
artefacts. Physiological tests can be highly complex and we found
reliability to be a key issue. The mean age for this set of
participants is 28.8 and ranges from 19 to 59 with a sample standard deviation (σ) of 12.2. The post-questionnaire showed
that some subjects commented on the vibration effect with various
recommendations on placement, realism and timing. Some subjects
also stated that they thought there was a difference between the
videos with synchronous and asynchronous effects; however, they
were unsure of exactly what the difference was.
5.1. Discrete QoE voting
Each subject voted on their QoE for each video as described
in Sections 4.1 and 4.4. The votes were analysed for each video
and the Mean Opinion Score (MOS) was plotted for each video
and effect type as recommended in [4]. The results for this
evaluation can be seen in Figure 3 and are presented with a 95%
confidence interval. Full video titles can be found at [32]. The
videos are ordered left to right from highest mean vote to lowest
mean vote, respectively.
A single factor Analysis of Variance (ANOVA) was applied
to the votes for each video to determine if there was a discernible difference between the different effect types. ANOVA was applied with alpha equal to 0.05. The p-values of the ANOVA tests showed that for 80% of the videos there was a discernible difference between effect types, and so Student’s t-tests were then applied to
refine the differences.
Figure 4 shows the probabilities calculated using a Student’s
t-test analysis of the QoE votes [2, 4], using a one tail distribution
and a two-sample equal variance. This was conducted three times
under the following null hypotheses:

H_{0,1}: μ_NE = μ_AE;  H_{0,2}: μ_NE = μ_SE;  H_{0,3}: μ_AE = μ_SE    (1)

The mean of the QoE votes is denoted by μ_NE, μ_AE and μ_SE for without effects, with asynchronous effects and with synchronous effects, respectively. The alternative hypotheses for these tests are:

H_{1,1}: μ_NE < μ_AE;  H_{1,2}: μ_NE < μ_SE;  H_{1,3}: μ_AE < μ_SE    (2)

The critical value is 0.05 (5%) and Figure 4 shows that the null hypotheses H_{0,1} and H_{0,2} are rejected 80% of the time, and also shows that the alternative hypothesis H_{1,3} is rejected 100% of
the time. This shows that videos with effects have a significantly
larger mean QoE observed than videos without and agrees with [2-
4, 28]. It also shows that there is no significant difference in mean
QoE observed between asynchronous and synchronous effects.
This indiscernible difference may be due to the asynchronicity
being on one side of the perceptual threshold for some effects and
on the other for other effects. It may also be possible that the large
contrast between no effects and effects reduced the apparent
difference between async and sync effects.
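For reference, the statistical procedure of this section can be reproduced along the lines of the following Python sketch (illustrative only; it uses SciPy rather than the authors' original tools, and the function and variable names are assumptions):

```python
# Minimal sketch (illustrative, using SciPy rather than the authors' tools):
# a one-way ANOVA over the three effect conditions followed by one-tailed,
# two-sample equal-variance t-tests, mirroring the analysis in Section 5.1.
from scipy import stats

def compare_conditions(votes_none, votes_async, votes_sync, alpha=0.05):
    results = {"anova_p": stats.f_oneway(votes_none, votes_async, votes_sync).pvalue}
    if results["anova_p"] < alpha:
        # One-tailed alternatives: effects yield a larger mean QoE vote.
        results["none_vs_async"] = stats.ttest_ind(
            votes_none, votes_async, equal_var=True, alternative="less").pvalue
        results["none_vs_sync"] = stats.ttest_ind(
            votes_none, votes_sync, equal_var=True, alternative="less").pvalue
        results["async_vs_sync"] = stats.ttest_ind(
            votes_async, votes_sync, equal_var=True, alternative="less").pvalue
    return results
```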
5.2. Temporal physiological responses
The biosensor analysis was completed with the two videos
that had the highest mean vote for QoE. The videos were ‘Tron’
and ‘Berrecloth’ with the highest and second highest mean vote,
respectively. EEG analysis using ‘frustration’ values was completed using both videos, whereas the eye gaze and raw EEG analyses were performed on the ‘Tron’ video only. The data was analysed
to show responses directly after major effects. Vibration was found
to be the most dominant effect when compared to light and wind in
[4] and so the first vibration with 100% strength was examined.
5.3. Electroencephalography (EEG)
For this analysis all 10 sets of data were used to find the
‘frustration’ values throughout the ‘Tron’ and ‘Berrecloth’ videos.
Frustration was chosen as it is synonymous with the annoyance stated in the working definition of QoE [30]. The descriptions of the synchronous effects presented to the subjects for the videos ‘Tron’ and ‘Berrecloth’ are available at [32]. The EEG data for all subjects was linearly interpolated to a common sampling rate of 10 Hz and then filtered using a moving average filter with a window size of 10 samples. The filtered data was normalised so that all subjects were within the same range of amplitude, and the gradients of the normalised signals were calculated. The frustration
gradients can be related to the time that the effects occurred using
the effect metadata.
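The pre-processing of the ‘frustration’ traces described above can be summarised by the following Python sketch (a minimal illustration with an assumed array layout, not the authors' code):

```python
# Minimal sketch (assumed array layout, not the authors' code): resample the
# 'frustration' trace to a common 10 Hz rate, smooth with a 10-sample moving
# average, normalise per subject and take the gradient.
import numpy as np

def preprocess_frustration(t, x, fs=10.0, window=10):
    t_new = np.arange(t[0], t[-1], 1.0 / fs)
    x_rs = np.interp(t_new, t, x)                        # linear interpolation
    x_sm = np.convolve(x_rs, np.ones(window) / window, mode="same")
    x_nm = (x_sm - x_sm.min()) / (x_sm.max() - x_sm.min() + 1e-12)
    return t_new, np.gradient(x_nm, 1.0 / fs)            # frustration gradient
```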
To determine whether there is a significant increase or decrease in frustration, a statistical analysis was applied. The same method described in Section 5.1 is adopted; however, it is now two-tailed and is applied to the frustration gradients between the start of the first full-strength vibration and the second full-strength vibration. The tests asked whether the frustration gradients were observed to be statistically greater than zero. The results of the t-tests show that 30% of subjects were observed to have a significant increase in frustration for no effects and asynchronous effects, and 50% for synchronous effects.
The analysis used for ‘Tron’ was then applied to
‘Berrecloth’. The period is once again equal to the gap between the
first and second full-strength vibrations. The results of the t-tests show that 10% of subjects were observed to have a significant increase in frustration for no effects and synchronous effects, and 30% for asynchronous effects. Frustration in these cases may be caused by unanticipated and/or unfavourable effects. It is difficult to draw conclusions because the ‘frustration’ algorithm is proprietary, and so this paper also explores the recorded raw EEG potentials.
Figure 4: Student’s t-test probabilities that the mean QoE would be observed the same (alpha = 0.05), per shortened video title, for the pairs without effects & async, without effects & sync, and async & sync, with the critical value marked.
5.4. Spatial EEG analysis
The sLORETA method was used to analyse the raw EEG
data due to the proprietary Affectiv™ algorithms. This method was
used to determine the location at which the propagating electrical
potentials originated. Data from the ‘Tron’ video was used for this
analysis due to the stronger first vibration and higher QoE vote.
The sLORETA method can provide images of standardised current
density and quantification of brain lobes and Brodmann areas. The
data was analysed using an overlapping moving window with a period of 0.5 s covering the duration of the effect and the period after the effect. Averaging was applied to the EEG signal over 10 s to remove DC bias, and a Fast Fourier Transform (FFT) was applied for the Delta, Theta, Alpha and Beta bands. The images presented in Figure 5
and Figure 6 show a model brain at 5% opacity from directly
above with the frontal lobe positioned at the top. The coloured
areas show the most active regions of the brain with a threshold of
25%. These images are provided in two dimensions (2D) for print;
however, three dimensional (3D) visualisations are possible.
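The windowed band-power pre-processing described above is illustrated by the following Python sketch (the sLORETA inverse solution itself is not reproduced; the 128 Hz sampling rate and 50% window overlap are assumptions):

```python
# Minimal sketch of the pre-processing described above (the sLORETA inverse
# solution itself is not reproduced): remove DC bias using a 10 s average,
# then estimate Delta/Theta/Alpha/Beta band powers over 0.5 s moving windows.
# A 128 Hz sampling rate and 50% window overlap are assumptions.
import numpy as np

BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}

def band_powers(eeg, fs=128, win_s=0.5, dc_s=10.0):
    eeg = eeg - eeg[: int(dc_s * fs)].mean()             # remove DC bias
    n = int(win_s * fs)
    windows = []
    for start in range(0, len(eeg) - n, n // 2):         # overlapping windows
        seg = eeg[start:start + n]
        spec = np.abs(np.fft.rfft(seg)) ** 2
        freqs = np.fft.rfftfreq(n, 1.0 / fs)
        windows.append({band: spec[(freqs >= lo) & (freqs < hi)].sum()
                        for band, (lo, hi) in BANDS.items()})
    return windows
```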
5.5. Synchronicity
The temporal lobe of the brain can be related to memory of
temporal events and the senses these stimulate, which can be
related to the synchronicity of the sensory effects. Figure 5 shows a particular subject’s brain for the three different versions of the video ‘Tron’, where it can be seen that for asynchronous effects the temporal lobe is the most active. Furthermore, Figure 7 shows there is a 25% increase in temporal lobe activity for asynchronous effects across all subjects, whereas no activity in this lobe was observed under the no-effects and synchronous-effects conditions. There is also an increase in activity in the order of 20-25% for the occipital lobe during synchronous effects; this is discussed in more detail in Section 5.6.
5.6. Eye gaze tracking
A method with preliminary results for correlating brain lobe
activity with gaze deviation is presented in this section. Eye gaze
analysis was performed on two individual subjects, one female and
one male, because during the videos it was common for subjects to drift off camera, causing distorted eye gaze logs. This could be circumvented by keeping subjects still for the length of the test, reducing the length of the test and/or eye tracking with a wider field of view. The two individuals chosen had the least distorted eye gaze logs. Blinking caused null values, which were removed using linear interpolation. The data was then upsampled and filtered using the same methods as for the EEG data in Section 5.3. It should be noted that these two subjects had different opinions of the sensory effects; this was apparent from their QoE votes. Named subjects one and two, they are shown on the left and right side for each effect type in Figure 8, respectively. Subject one had an average QoE
vote of 47, 69 and 68, and subject two, 54, 44 and 45 for no effects,
async effects and sync effects, respectively. The order of the
videos was async effects, no effects and sync effects for subject
one and async effects, sync effects and no effects for subject two.
Subject one shows a much larger standard deviation for synchronous effects, whereas subject two shows only a slightly larger one. At this time and for this effect, subject one also has more activity in the occipital lobe (Figure 6), which may be correlated with the increased gaze deviation. A significant part of the occipital lobe is the primary visual cortex, whose activity is consistent with increased gaze deviation. Subject two lacks activity in this lobe, and this is reflected in the uniform gaze deviation across effect types.
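The blink removal and gaze-deviation computation can be illustrated with the following minimal Python sketch (the log format, with NaN marking lost samples, is an assumption):

```python
# Minimal sketch (the log format, with NaN marking lost samples, is an
# assumption): remove blink-induced null values by linear interpolation and
# compute the standard deviation of gaze position per coordinate (Figure 8).
import numpy as np

def gaze_deviation(x, y):
    """x, y: gaze coordinates in pixels, NaN where the tracker lost the eye."""
    def fill(v):
        v = np.array(v, dtype=float)                     # copy to avoid mutation
        bad = np.isnan(v)
        v[bad] = np.interp(np.flatnonzero(bad), np.flatnonzero(~bad), v[~bad])
        return v
    return float(np.std(fill(x))), float(np.std(fill(y)))
```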
6. CONCLUSIONS
This paper presents results for QoE assessment of multisensory
media using an EEG neuroheadset and eye gaze tracker. The
response to sensory effects when complementing audio/visual
content under asynchronous and synchronous conditions is
compared to content with no effects. The results show that sensory effects enhance the QoE; however, there was a statistically indiscernible difference between the synchronicity conditions. Furthermore, the EEG results show correlating brain activity, with a 20-25% decrease in frontal lobe activity for both asynchronous and synchronous effects. The EEG results also show an increase in temporal lobe activity of 25% for asynchronous effects and in occipital lobe activity of 20% for synchronous effects. The preliminary gaze tracker results may support this by showing that gaze deviation and occipital lobe activity increase together.
Future work includes identifying the influence of a wider
range of synchronicities on QoE by conducting further user studies
and data mining to gather information on alternate correlations between biosensors and QoE, thus providing better knowledge for future user studies in the area.

Figure 5: Subject with increased brain activity in the temporal lobe. No effects (A), async effects (B) and sync effects (C).

Figure 6: Subject with increased brain activity in the occipital lobe. No effects (A), async effects (B) and sync effects (C).

Figure 7: Most active brain lobes (frontal, parietal, limbic, occipital, temporal) for each effect type (no effects, async, sync), shown as a percentage of subjects, together with the QoE MOS.

Figure 8: Standard deviations of gaze location (pixels, X and Y coordinates for subjects one and two) for each effect type, with the most active brain lobe labelled.
7. ACKNOWLEDGEMENTS
The authors would like to thank all participants for volunteering
their time, the Commonwealth Scientific and Industrial Research
Organisation (CSIRO) for their help with EEG analysis and
Christian Timmerer and Benjamin Rainer for their insightful and
useful preliminary discussions related to this work.
8. REFERENCES
[1] C. Timmerer, M. Waltl, B. Rainer, and N. Murray, "Sensory
Experience: Quality of Experience Beyond Audio-Visual," in
Quality of Experience, ed: Springer, 2014, pp. 351-365.
[2] B. Rainer, M. Waltl, E. Cheng, M. Shujau, C. Timmerer, S.
Davis, et al., "Investigating the impact of sensory effects on
the quality of experience and emotional response in web
videos," in Quality of Multimedia Experience (QoMEX),
2012 Fourth International Workshop on, 2012, pp. 278-283.
[3] M. Waltl, C. Timmerer, and H. Hellwagner, "Improving the
quality of multimedia experience through sensory effects," in
Quality of Multimedia Experience (QoMEX), 2010 Second
International Workshop on, 2010, pp. 124-129.
[4] C. Timmerer, B. Rainer, and M. Waltl, "A utility model for
sensory experience," in Quality of Multimedia Experience
(QoMEX), 2013 Fifth International Workshop on, 2013, pp.
224-229.
[5] C. Timmerer, M. Waltl, B. Rainer, and H. Hellwagner,
"Assessing the quality of sensory experience for multimedia
presentations," Signal Processing: Image Communication,
vol. 27, pp. 909-916, 2012.
[6] P. C. Petrantonakis and L. J. Hadjileontiadis, "Emotion
recognition from brain signals using hybrid adaptive filtering
and higher order crossings analysis," Affective Computing,
IEEE Transactions on, vol. 1, pp. 81-97, 2010.
[7] R. B. Silberstein and G. E. Nield, "Measuring Emotion in
Advertising Research: Prefrontal Brain Activity," Pulse,
IEEE, vol. 3, pp. 24-27, 2012.
[8] J.-N. Antons, S. Arndt, R. Schleicher, and S. Möller, "Brain
Activity Correlates of Quality of Experience," in Quality of
Experience, ed: Springer, 2014, pp. 109-119.
[9] E. Cheng, S. Davis, I. Burnett, and C. Ritz, "An ambient
multimedia user experience feedback framework based on
user tagging and EEG biosignals," 4th International
Workshop on Semantic Ambient Media Experience (SAME
2011), pp. 1-5, 2011.
[10] J. G. Carrier, "Mind, Gaze and Engagement Understanding
the Environment," Journal of Material Culture, vol. 8, pp. 5-
23, 2003.
[11] R. Steinmetz, "Human perception of jitter and media
synchronization," Selected Areas in Communications, IEEE
Journal on, vol. 14, pp. 61-72, 1996.
[12] W. Yaodu, X. Xiang, K. Jingming, and H. Xinlu, "A speech-
video synchrony quality metric using CoIA," in Packet Video
Workshop (PV), 2010 18th International, 2010, pp. 173-177.
[13] V. Jurcak, D. Tsuzuki, and I. Dan, "10/20, 10/10, and 10/5
systems revisited: their validity as relative head-surface-based
positioning systems," Neuroimage, vol. 34, pp. 1600-1611,
2007.
[14] Emotiv. (2014, May). EEG Features [Online]. Available:
http://emotiv.com/eeg/
[15] M. S. Buchsbaum, "Frontal cortex function," American
Journal of Psychiatry, vol. 161, pp. 2178-2178, 2004.
[16] R. Pascual-Marqui, "Standardized low-resolution brain
electromagnetic tomography (sLORETA): technical details,"
Methods Find Exp Clin Pharmacol, vol. 24, pp. 5-12, 2002.
[17] R. Gupta, S. Arndt, J.-N. Antons, R. Schleicher, S. Moller,
and T. H. Falk, "Neurophysiological experimental facility for
Quality of Experience (QoE) assessment," in Integrated
Network Management (IM 2013), 2013 IFIP/IEEE
International Symposium on, 2013, pp. 1300-1305.
[18] S. Kohlbecher, S. Bardinst, K. Bartl, E. Schneider, T.
Poitschke, and M. Ablassmeier, "Calibration-free eye tracking
by reconstruction of the pupil ellipse in 3D space," in
Proceedings of the 2008 symposium on Eye tracking research
& applications, 2008, pp. 135-138.
[19] C. W. Huang, Z. S. Jiang, W. F. Kao, and Y. L. Huang,
"Building a Low-Cost Eye-Tracking System," Applied
Mechanics and Materials, vol. 263, pp. 2399-2402, 2013.
[20] R. Valenti, J. Staiano, N. Sebe, and T. Gevers, "Webcam-
based visual gaze estimation," Image Analysis and
Processing–ICIAP 2009, pp. 662-671, 2009.
[21] J. San Agustin, H. Skovsgaard, E. Mollenbach, M. Barret, M.
Tall, D. W. Hansen, et al., "Evaluation of a low-cost open-
source gaze tracker," in Proceedings of the 2010 Symposium
on Eye-Tracking Research & Applications, 2010, pp. 77-80.
[22] Sony. (2014, April 24). PS3 EYE CAMERA [Online].
Available: http://www.sony.com.au/product/playstation+eye
[23] N. G. Croal. (2007, May). Geek Out: The Playstation Eye is
Nearly Upon Us. Dr. Richard Marks Takes Us Behind the
Scenes of its Birth. [Online]. Available:
http://archive.today/WESxm
[24] ITUGazeGroup. (2014, May). ITU Gaze Tracker [Online].
Available: http://www.gazegroup.org/downloads/23
[25] Emotiv. (2014, July). Scientific background for the emotions
in the affectiv suite, what is it? [online]. Available:
emotiv.com/forum/messages/forum4/topic1262/message7424/
[26] K. M. Gilleade and A. Dix, "Using frustration in the design of
adaptive videogames," in Proceedings of the 2004 ACM
SIGCHI International Conference on Advances in computer
entertainment technology, 2004, pp. 228-232.
[27] C. Timmerer, S. Hasegawa, and S. Kim, "Working Draft of
ISO/IEC 23005 Sensory Information," ISO/IEC JTC 1/SC
29/WG 11 N, vol. 10618, 2009.
[28] M. Waltl, C. Timmerer, and H. Hellwagner, "Increasing the
user experience of multimedia presentations with sensory
effects," in Image Analysis for Multimedia Interactive
Services (WIAMIS), 2010 11th International Workshop on,
2010, pp. 1-4.
[29] F. De Simone, M. Naccari, M. Tagliasacchi, F. Dufaux, S.
Tubaro, and T. Ebrahimi, "Subjective assessment of H.264/AVC video sequences transmitted over a noisy channel,"
in Quality of Multimedia Experience, 2009. QoMEx 2009.
International Workshop on, 2009, pp. 204-209.
[30] P. Le Callet, S. Möller, and A. Perkis, "Qualinet White Paper
on Definitions of Quality of Experience," European Network
on Quality of Experience in Multimedia Systems and Services
(COST Action IC 1003), Lausanne, Switzerland, Tech. Rep,
vol. 1.2, p. 6, 2013.
[31] ITU-T Recommendation P.910, "Subjective video quality assessment methods for multimedia applications," 1999.
[32] (2014, April). Sensory Experience Lab (SELab). Available:
http://selab.itec.aau.at