IMPACT OF VIDEO RESOLUTION CHANGES ON QoE FOR
ADAPTIVE VIDEO STREAMING
Avşar Asan, Werner Robitza§, Is-haka Mkwawa, Lingfen Sun, Emmanuel Ifeachor, Alexander Raake
Signal Processing and Multimedia Communications Lab, Plymouth University, Plymouth, U.K.
§Telekom Innovation Labs, Deutsche Telekom AG, Berlin, Germany
Audiovisual Technology Group, Technical University of Ilmenau, Ilmenau, Germany
ABSTRACT
HTTP adaptive streaming (HAS) has become the de-facto
standard for video streaming to ensure continuous multime-
dia service delivery under irregularly changing network con-
ditions. Many studies already investigated the detrimental
impact of various playback characteristics on the Quality of
Experience of end users, such as initial loading, stalling or
quality variations. However, dedicated studies tackling the
impact of resolution adaptation are still missing. This pa-
per presents the results of an immersive audiovisual quality
assessment test comprising 84 test sequences from four dif-
ferent video content types, emulated with an HAS adaptation
mechanism. We employed a novel approach based on system-
atic creation of adaptivity conditions which were assigned to
source sequences based on their spatio-temporal characteris-
tics. Our experiment investigates the resolution switch effect
with respect to the degradations in MOS for certain adapta-
tion patterns. We further demonstrate that the content type
and resolution change patterns have a significant impact on
the perception of resolution changes. These findings will help
develop better QoE models and adaptation mechanisms for
HAS systems in the future.
Index Terms: Quality of Experience, Video Quality,
Resolution Switch, HTTP Adaptive Streaming
1. INTRODUCTION
Today, HTTP Adaptive Streaming (HAS) is the most popu-
lar method of streaming videos to end user devices over the
web infrastructure. It is cost-effective and ensures multimedia
service constancy and stability. HAS adapts the video play-
back according to the network characteristics. This is typi-
cally achieved by switching between representations of dif-
ferent bitrate and resolution of video. The impact of such
resolution changes during the playout on the users’ perceived
quality is an important factor; previous work [1] has already
shown how the Quality of Experience (QoE) can be influ-
enced by buffering events or variations in quality over time.
QoE also significantly affects users' decisions on whether to
use a service or not [2]. Negatively affected QoE due to unsta-
ble network conditions may trigger a chain reaction, starting
from individual service abandonment up to users leaving their
service/content providers (i.e., user churn) in the long term.
Video resolution switch phenomena and their effects on
QoE have not yet been fully investigated. The main objective
of this work is to provide a systematic analysis of resolution
changes and their impact on QoE. We present the results of
a quality assessment test which investigates resolution switch
effects. In our work, the term resolution switch refers
to the video player switching from one played resolution to
another. We also define adaptivity as an overall effect, i.e. the
sum of resolution switch events in a sequence. With the aid of
our systematic approach, it is possible to analytically investi-
gate adaptivity patterns with respect to their Mean Opinion
Score (MOS).
We begin by describing related work in Section 2. We then
propose a novel theoretical framework for the assessment of
resolution adaptivity in Section 3. Our audiovisual test setup
is explained in Section 4. In Section 5 we interpret the results
of our assessment. Finally, in Section 6 we discuss our find-
ings and list future work. The paper is concluded in Section 7.
2. RELATED WORK AND MOTIVATION
Although the detrimental effects of various playback impair-
ments such as initial loading, stalling or quality variations
have been widely investigated (e.g., comprehensive surveys
are found in [1, 3]), there is still a need for dedicated and sys-
tematic studies to tackle the impact of resolution adaptation.
In addition to the obvious visual effects during video play-
back, other factors contribute to user QoE as well: our liter-
ature analysis focuses on the key influencing factors on QoE
for HAS adaptivity.
Human perception system characteristics play a key role
in subjective quality assessment tasks. Cranley et al. [4] em-
phasize that human visual perception is able to adapt to a spe-
cific video quality only after a few seconds. The authors noted
that impairment effects become more annoying if the quality
changes happen frequently in a very short time period. Al-
though HAS is technically capable of changing the quality
every few seconds while streaming, in practice, adaptation is
carried out more slowly to prevent large quality variation pe-
riods or oscillations. The authors of [5] investigated up to six
quality changes in a 20-second video, which in the light of the
aforementioned considerations goes beyond what is realistic.
Ecological validity refers to how useful and valid results
from a laboratory study are when they are applied in real life.
Experiments with artificial settings or test scenarios based on
an imaginary situation produce decontextualized results and
may not be implemented in daily life, as recently discussed
in [6]. In the domain of QoE, it is known that short-term video
quality prediction models (e.g., as shown in [7,8]) can obtain
high performance, but the ecological validity of these models
is questionable, especially for longer video durations and their
applicability to HAS algorithms. Finally, a recent study [9]
revealed that millennials (18–34) tend to spend around 14
hours per week on video streaming services and that longer
video durations are preferred.
Traditional testing methods show short, non-entertaining
stimuli, with repeating contents, which is known to bore
users. Having users be immersed and entertained is one of
the key factors to get more ecologically valid results from a
lab-based quality test. However, this paradigm has so far rarely
been applied. When they are immersed,
users feel “sucked” into the media [10]. Immersion may make them
less aware of their surroundings, increase their enjoyment, and help
reduce stress during an experiment. Pinson et al. [11] first
suggested a new test method: a source stimulus should be
used only once so that the subjects can focus on the content
rather than evaluating the same sequences over and over. In
the same work, it was also shown that in an immersive test de-
sign, boredom and fatigue can be significantly reduced. Rob-
itza et al. [12] successfully applied the immersive test design
for HAS QoE. They note that stimuli should be entertaining
and meaningfully complete for a more ecologically valid test.
Using different content types with various characteristics
helps in developing more general statements about QoE. Con-
tent may differ in genre and enjoyability, but also in technical
parameters such as spatiotemporal complexity, the latter hav-
ing a significant impact on the quality of compressed video
encodes. In [13, 14] the only content is computer-generated
graphics. Other studies, such as [5], did not analyse the
impact of content on their results. In one study [15]
investigating an adaptive streaming model, the authors chose
seven content types with almost the same spatiotemporal
complexity, although the video stimuli came from different genres.
Consequently, the work could not establish a logical relation-
ship between spatiotemporal characteristics and the content
type. Additionally, the impact of quality switches was not inves-
tigated.
Rodriguez et al. [7] modelled the impact of video quality
level switching. However, the authors assumed that the im-
pact of a switch would be the same, no matter if it happened
in good or bad quality regions. Our results, however, will show
that the impact is more complex.

Fig. 1. Conceptual framework for QoE in HAS services. The framework spans the Encoding Level (video codec, bit rate, resolution, frame rate, QP), the Receiver Level (re-buffering, stalling, start-up delay, quality variations, and resolution changes with the specific parameters change depth, change frequency and change direction), the Presentation Level (display resolution, display colour scheme, contrast, brightness), the Perception Level (visual adaptability, visibility of impairments), and the Quality of Experience Level (quality awareness, immersion, boredom/enjoyment, content preference), connected through the transmission medium.

Finally, Liu et al. [16] in-
vestigated the quality level variation factors depending on av-
erage level, number of quality changes and average change
magnitude. However, the underlying quality metric they use
does not model the impact of quality switches. Also, their
work assumes a 1:1 relationship between bitrates and resolu-
tions, which does not hold in practice.
3. CONCEPTUAL QoE FRAMEWORK
From the points insufficiently tackled in previous literature
we developed a conceptual framework (see Figure 1). It com-
prises all steps in the transmission chain – from the source to
the user – and highlights the factors that need to be investi-
gated to fully understand the impact of adaptivity on QoE. It
can guide the creation of a study setup, the interpretation of our
results, and the development of our future agenda. In our framework, we sim-
plified concepts from [17], but added specific parameters for
our investigation purpose.
The Encoding Level is the primary phase for the prepa-
ration of videos. It emphasizes the non-linear relationship
between encoding parameters and QoE. By encoding param-
eters we mean the combined configuration of video bitrate,
framerate, resolution and the codec chosen for different HAS
video representations. On the Receiver Level, we look at ef-
fects of unstable network conditions, shown as typical HAS
impairments. Specifically, we focus on adaptivity – together
with its parameters. The Presentation Level is about how mul-
timedia is presented on the user side. It denotes viewing con-
ditions and types of devices. Our subjective test has been
designed with a focus on that level. The Perception Level is
about the way humans consume multimedia services. Influ-
ence factors in this level are subjective: for example, they
depend on whether people can visually adapt to the impair-
ments, perceive any impairments at all or really pay attention
to what is happening on the screen.

Table 1. Adaptivity Patterns
Reference conditions       1080, 480, 240
Single drop or increase    1080→240, 1080→480, 240→1080, 480→1080, 240→480, 480→240
Symmetrical drop           1080→240→1080, 1080→480→1080, 480→240→480
Fluctuations               1080→240→480, 480→240→1080, 240→1080→480, 480→1080→240
Constant drop or increase  1080→480→240, 240→480→1080
Symmetrical increase       480→1080→480, 240→480→240, 240→1080→240

Those are aspects
that seem intangible at first, but are very important for con-
sideration in future work. Finally, the Quality of Experience
Level is where the sense of quality is formed after perception.
Quality awareness is a cognitive gate component between the
Perception and Quality of Experience levels. It relates to user
anticipation, content preference, enjoyment and immersion as
a function of the subjects’ desired quality features [10]. In or-
der to have an understanding of the users’ perceived quality,
the preceding levels in our framework should be well under-
stood first. This is where our test comes into play.
4. EXPERIMENT SETUP
4.1. Source Stimuli
Many existing databases do not take into account factors
such as immersiveness and enjoyment. We took special care
to include the above-mentioned considerations on ecologi-
cally valid conditions, immersive source video selection and
assignment of conditions based on spatiotemporal charac-
teristics. Our 43 original videos were obtained from vari-
ous online portals, choosing the popular genres sports, cook-
ing, sightseeing and music videos. Due to the fact that
such sequences have already been compressed by the con-
tent provider, only 4K (3840 × 2160) sources were chosen
to ensure a high enough bitrate. We additionally checked ev-
ery video to make sure its quality was pristine (e.g., free of
camera noise, shakiness, or compression artifacts). The 43 videos
were then cut to logically complete test scenes of 45 seconds
length, resulting in 84 source clips (from here on: SRCs).
4.2. Conditions (Adaptivity Patterns)
Our test conditions (i.e. the way the resolution switches occur
over time) were based on three resolution levels: 240p, 480p
and 1080p. We first defined three reference conditions with
those constant resolutions; all other conditions had one or two
resolution switches in them. For our systematic design we con-
sidered 21 adaptivity patterns in total, as listed in Table 1. The
conditions allow comparisons against each other, which we will
detail in Section 5.
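For illustration, the 21 patterns can be reproduced programmatically as all sequences of one to three resolution levels in which consecutive levels differ. The following Python sketch is ours and was not part of the original test tooling; it merely demonstrates the systematic construction.

```python
from itertools import product

LEVELS = (240, 480, 1080)

def adaptivity_patterns(max_segments=3):
    """Enumerate all level sequences of length 1..max_segments in which
    consecutive levels differ, i.e., every segment boundary is a switch."""
    patterns = []
    for length in range(1, max_segments + 1):
        for seq in product(LEVELS, repeat=length):
            if all(a != b for a, b in zip(seq, seq[1:])):
                patterns.append(seq)
    return patterns

# 3 reference conditions + 6 single switches + 12 double switches = 21
print(len(adaptivity_patterns()))  # -> 21
```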
Fig. 2. Spatial Information values for all source sequences (density of per-frame SI; x-axis: Spatial Information, y-axis: Density).
4.3. SRC–Condition Assignment
Spatiotemporal characteristics play a key role in comparing
different contents in terms of how much spatial detail and
motion there is. Especially in our test design, every SRC is
shown only once; hence, it must be ensured that their
characteristics are equally spread out. While they may orig-
inate from the same original content (e.g. a longer sports
sequence), individual portions of the clip may differ in their
characteristics. We hypothesize that these characteristics have
a direct impact on the visibility of resolution switches.
We first calculated the Spatial Information (SI) values (ac-
cording to ITU-T Rec. P.910) for every frame in our SRCs.
From those, we obtained an SI density function, as shown in
Figure 2, giving us the entire range of SI values in our test. By
looking at the thresholds of the 33% and 66% quantiles (47.8
and 71.1), we can then classify any SI value as high, middle,
or low.
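As a sketch of the SI computation: ITU-T Rec. P.910 defines per-frame SI as the spatial standard deviation of the Sobel-filtered luminance plane, and the class thresholds then follow from the pooled quantiles. The code below assumes frames are already available as 2-D luma arrays (frame extraction is omitted) and is only an illustration of the procedure.

```python
import numpy as np
from scipy import ndimage

def spatial_information(luma: np.ndarray) -> float:
    """Per-frame SI as in ITU-T Rec. P.910: standard deviation of the
    Sobel-filtered luminance plane."""
    luma = luma.astype(np.float64)
    gradient = np.hypot(ndimage.sobel(luma, axis=0), ndimage.sobel(luma, axis=1))
    return float(np.std(gradient))

def make_classifier(all_si_values):
    """Build a high/middle/low classifier from the 33% and 66% quantiles of
    the pooled per-frame SI distribution (47.8 and 71.1 in our test)."""
    q33, q66 = np.quantile(all_si_values, [0.33, 0.66])
    return lambda si: "L" if si < q33 else ("M" if si < q66 else "H")
```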
For the allocation of our SRCs to the conditions, we first
calculated the average SI for each third of every SRC, i.e.
from 0–15, 15–30 and 30–45 s. We then assigned the average
SI to the above-mentioned classes high, middle or low. For
example, if one SRC was split into three parts and had the
average SI values 33.5, 50.9 and 79.2, it was classified as
L-M-H. The same procedure was repeated with the SRCs split
in half (i.e., from 0–22.5 and 22.5–45 s).
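Under the same assumptions, the SI characteristic of an SRC is obtained by averaging the per-frame SI values within each third (or half) of the clip and mapping the averages to the class labels; a minimal sketch:

```python
import numpy as np

def src_characteristic(frame_si, classify, n_parts=3):
    """Split a clip's per-frame SI values into n_parts equal parts, average
    each part, and return a label such as 'L-M-H'."""
    parts = np.array_split(np.asarray(frame_si, dtype=float), n_parts)
    return "-".join(classify(part.mean()) for part in parts)

# Example: per-third averages of 33.5, 50.9 and 79.2 yield 'L-M-H'
# with the thresholds 47.8 and 71.1 from above.
```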
These SI characteristics were then systematically paired
with the conditions under the following rules: 1) Each con-
dition should have a SRC with an SI characteristic match-
ing that condition. For example, for a condition with 240→480,
there would be a SRC with the SI characteristic L-M.
2) The SRC with the inverse SI characteristic should also be
assigned to that condition, e.g. M-L in the previous case.
3) Two SRCs with constant-high and constant-low SI would
be mapped to the condition, too. This led to four SRCs being
applied to every condition (21 conditions × 4 SRCs = 84 sequences).
4.4. Video Encoding and Test Sequence Generation
For encoding the final sequences, we chose to simulate HAS
offline. We first divided the SRC clips into two or three
equally sized parts, depending on the condition assigned to
them. These parts were then encoded with ffmpeg and x264
and downscaled (if necessary) to match the condition pattern.
x264 was set to use a Constant Rate Factor of 23 to ensure
constant quality across the encode, with a one-pass encod-
ing mode. The maximum bitrate was constrained for differ-
ent resolutions – similar to what popular video streaming ser-
vices implement: 400 kbps for 240p, 1.5 Mbps for 480p and
5.5 Mbps for 1080p. After that, the parts were upscaled to
1080p and concatenated to form the final processed video se-
quences (PVSes). Audio was not compressed during the PVS
generation and played throughout the whole sequence.
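To make the offline processing chain concrete, the sketch below shows how one SRC part could be encoded with ffmpeg/x264 under the settings described above (one-pass, CRF 23, per-resolution bitrate caps) and then upscaled back to 1080p. File names, the bufsize choice and the near-lossless re-encode used for upscaling are our assumptions, not details taken from the original tool chain.

```python
import subprocess

# Maximum bitrates per resolution, similar to popular streaming services
MAX_BITRATE = {240: "400k", 480: "1500k", 1080: "5500k"}

def encode_part(src_part: str, encoded: str, upscaled: str, height: int) -> None:
    """Encode one SRC part at the target resolution with a capped bitrate,
    then upscale the result to 1080p for concatenation into the PVS."""
    subprocess.run([
        "ffmpeg", "-y", "-i", src_part,
        "-vf", f"scale=-2:{height}",            # downscale to the condition's resolution
        "-c:v", "libx264", "-crf", "23",        # one-pass, constant-quality encoding
        "-maxrate", MAX_BITRATE[height],
        "-bufsize", MAX_BITRATE[height],        # bufsize choice is ours
        "-an", encoded,
    ], check=True)
    subprocess.run([
        "ffmpeg", "-y", "-i", encoded,
        "-vf", "scale=-2:1080",                 # upscale back to 1080p
        "-c:v", "libx264", "-crf", "18",        # near-transparent re-encode (our choice)
        "-an", upscaled,
    ], check=True)

# The upscaled parts are then concatenated (e.g., with ffmpeg's concat demuxer)
# and the uncompressed audio track is added to form the final PVS.
```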
4.5. Test Environment and Protocol
Our test was conducted in a standards-compliant environment
(according to ITU-T Rec. P.910). The sequences were shown
on a 42” LCD display with 1920 × 1080 resolution. Subjects
were seated 3H (three times the height of the display) from
the monitor.
First, subjects were introduced to the topic. They were
then checked for visual acuity and colour blindness and had
to fill out a simple demographic questionnaire. For the main
experiment part, the PVSs were presented one after another, with
a randomized playlist for each subject in order to minimize
ordering effects. Before the actual PVSes were shown, we
displayed five “training” clips whose ratings were not taken
into consideration later. Subjects were asked to rate the vi-
sual quality of the stimuli. The ratings themselves were given
on a standard Absolute Category Rating (ACR) scale with la-
bels from Bad to Excellent (see ITU-T P.910), using the open
source AVRate software. Finally, subjects filled in a post-test
questionnaire on what they had seen.
5. RESULTS
In our test, 30 subjects took part, 20 of whom were female. Their
age ranged from 19 to 51 (average: 30). In order to eliminate
unreliable viewers, we used the following procedure [12]: We
first calculated the Pearson correlation between each subject’s
vote and the overall MOS for every PVS. Then, once a sub-
ject’s correlation was below 0.70, they were removed from the
pool and the procedure repeated. This led to the exclusion
of three subjects, meaning that our shown results are based on
27 assessors.
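The screening procedure can be written as a short iterative loop. The sketch below assumes a ratings matrix with one row per subject and one column per PVS, and one possible reading of the procedure: the least consistent subject is removed first and the panel MOS is recomputed before the next check.

```python
import numpy as np

def screen_subjects(ratings: np.ndarray, threshold: float = 0.70) -> list:
    """Iteratively drop subjects whose Pearson correlation with the panel MOS
    (computed over all PVSs) is below the threshold; returns kept indices.
    `ratings` has shape (n_subjects, n_pvs)."""
    keep = list(range(ratings.shape[0]))
    while len(keep) > 1:
        mos = ratings[keep].mean(axis=0)                       # current panel MOS per PVS
        corrs = [np.corrcoef(ratings[s], mos)[0, 1] for s in keep]
        worst = int(np.argmin(corrs))
        if corrs[worst] >= threshold:
            break                                              # everyone remaining is consistent enough
        del keep[worst]                                        # remove the least consistent subject and repeat
    return keep
```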
Our overall range of MOS is 1.48–4.70 (average: 3.13),
showing a good use of the rating scale, which stems from a
balanced test. In the following, we will explain the impact of
certain experimental factors on the subjective ratings.
5.1. Impact of Conditions
First we want to verify that the chosen conditions have the
expected impact on the overall ratings. Figure 3 shows all
MOS values, grouped by condition, with the different content
types highlighted.

Table 2. MOS impact for one resolution switch, averaged over all content types.
Ref.   Resolution Switching Pattern   MOS Impact
1080   1080→240                       -1.66
1080   1080→480                       -0.49
480    480→1080                       0.45
480    480→240                        -1.12
240    240→1080                       1.31
240    240→480                        1.36

We can see that the pattern indeed has a
strong effect on the ratings over the entire range of conditions.
Our systematic approach makes it possible to directly
compare one pattern against another. This is especially
useful when comparing, for example, a reference condition
(e.g., continuous 1080p) against a condition with a resolution
change (e.g. 1080→480). Here we can directly formulate the
impact of the switch on the MOS as the difference between the
MOS of the switching condition and the MOS for the reference. In
the above case, this is 4.01 - 4.50 = -0.49. Thus, we can say
that generally, when switching from 1080 to 480, this incurs
a MOS impact of -0.49.
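With this design, computing the per-switch impact reduces to a group-wise difference between each switching condition and the constant condition at its starting resolution. A small sketch (the column names "condition" and "mos" are our illustrative assumptions):

```python
import pandas as pd

def switch_impacts(df: pd.DataFrame) -> pd.Series:
    """df holds one row per PVS with columns 'condition' (e.g. '1080-480')
    and 'mos'; returns the MOS impact of every switching condition relative
    to the constant reference at its starting resolution."""
    cond_mos = df.groupby("condition")["mos"].mean()
    impacts = {}
    for cond, mos in cond_mos.items():
        if "-" not in cond:
            continue                      # skip constant reference conditions
        ref = cond.split("-")[0]          # starting resolution, e.g. '1080'
        impacts[cond] = mos - cond_mos[ref]
    return pd.Series(impacts)

# Example: MOS('1080-480') = 4.01 and MOS('1080') = 4.50 give an impact of -0.49.
```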
Table 2 shows an overview of the different MOS im-
pacts identified for the patterns with one switch. It shall be
noted that these values correspond to averages over all content
types; however, we believe that those will be more useful for a
content-independent modelling of QoE. From this table it can
be seen that the depth (as measured in vertical pixels, i.e., 600
and 840) of the switch has a significant impact, as shown by
an ANOVA between the depth and ratings (p < 0.02). Gen-
erally, we also observe that a change in resolution is worse
when it occurs at a lower level.
The direction of the change appears to have an impact too:
when users start with low resolution, they score any upwards
resolution more positively than if they had experienced a drop
in resolution. This could also be explained by a positional
effect: when low quality is played at the last few seconds of
the sequence, subjects may have given more weight to these
portions. This is also called a “recency effect”. It is visible
in other conditions, too. For example, 480→1080, 240→480
and 240→1080 are scored significantly higher than their
reversed counterparts that started well, but end at low quality.
The same holds true for conditions with only one switch. We
further conducted an ANOVA between the change directions
(“down” or “up”) and the ratings, which showed a significant
effect (p = 0.04), confirming our results.
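The significance checks for switch depth and direction correspond to one-way ANOVAs of the individual ratings grouped by the respective factor; a sketch using scipy (the DataFrame layout and column names are assumptions):

```python
from scipy import stats

def one_way_anova(df, factor: str, dv: str = "rating"):
    """One-way ANOVA of individual ratings grouped by a factor column,
    e.g. factor='depth' (600 vs. 840 vertical pixels) or factor='direction'
    ('up' vs. 'down')."""
    groups = [g[dv].to_numpy() for _, g in df.groupby(factor)]
    return stats.f_oneway(*groups)

# f_stat, p_value = one_way_anova(ratings_df, "direction")
```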
5.2. Impact of Content and Other Experimental Factors
As mentioned in Section 4, we included different content
types in our experiment to get a more balanced MOS estima-
tion for a given pattern, under the hypothesis that any system-
atic content effect could be averaged out. This is especially
relevant in practical quality monitoring applications, where
the content type may not be known.

Fig. 3. MOS for each PVS, sorted by condition (from constant 240p up to constant 1080p). Content types (cooking, music, sightseeing, sports) are highlighted in different colors.

Fig. 4. MOS for each content type, averaged over all patterns, shown with 95% confidence intervals.
However, to correctly analyse the subjective test results,
the content type cannot be neglected: Figure 4 shows the av-
erage ratings for a given content type, considering all con-
ditions. As clearly visible, there is a significant effect of the
content type on the MOS ratings. We conducted a one-way
ANOVA between content type and ratings (p < 0.02) to confirm
this effect. A post-hoc test (Tukey HSD) revealed that only
the difference between sports and cooking videos was signif-
icant (p < 0.01). As we will later see in the questionnaire
results, this is due to the visual characteristics of the content
itself, not because of its enjoyability.
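The content-type analysis can be reproduced with the same kind of one-way ANOVA followed by a Tukey HSD post-hoc test; the sketch below uses scipy and statsmodels and again assumes illustrative column names ("rating", "content_type"):

```python
import pandas as pd
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

def content_effect(df: pd.DataFrame):
    """One-way ANOVA of ratings by content type, followed by a Tukey HSD
    post-hoc test of all pairwise content-type differences."""
    groups = [g["rating"].to_numpy() for _, g in df.groupby("content_type")]
    anova = stats.f_oneway(*groups)
    tukey = pairwise_tukeyhsd(df["rating"], df["content_type"])
    return anova, tukey
```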
5.3. Questionnaire
From our pre- and post-assessment questionnaires we gath-
ered more insight into the MOS ratings: subjects were asked
to rank the four content types according to their liking. Only
13% of the subjects marked “cooking” as first priority. This
contrasts with our quality ratings, where cooking content was
judged significantly higher than others. This leads us to con-
clude that subjects found resolution switches less disturbing
for this content, and that content preference and quality rat-
ings are not necessarily correlated. We attribute these findings
to the visual characteristics of the chosen cooking sequences
which may have made the switches less visible. However, fur-
ther analysis and tests are needed to confirm that hypothesis,
which would require the inclusion of more content types with
varying spatiotemporal characteristics.
6. DISCUSSION
As can be seen from the MOS results, our test is well-
balanced in terms of the range of conditions and content
types. The obtained MOS degradation values in Table 2 can
be used as a component in QoE models, when it is necessary
to quantify the impact of a single switch. Of course – as is al-
ways the case for subjective studies – the factors shown here
are just a small part in the big picture of HAS QoE, and we
will conduct further test series in the future. In other words, it
is impossible to design a test in which one can investigate all
factors reliably. However, previous research has rarely been
that systematic: the design of conditions should be done in
such a way that they can be compared against each other.
For future tests we can re-use some of the shown clips as an-
chor points, which will allow us to create a bigger database
of adaptivity conditions that can also be systematically com-
pared.
We could successfully apply an “immersive” paradigm in
our test, meaning that entertaining sequences were used, with-
out repeating the same source. Our results indicate a strong
impact of the content characteristics on the perceived quality.
At this stage, we believe that an attempt to model the im-
pact of resolution switches would result in a view that is too narrow.
In fact, it would require at least another study to serve as a
database for validating any created model. Hence, our focus
lies on producing a series of complementary tests, in order to
be able to create more robust models in the end. For example,
this process has also been successfully used for the models
standardised by ITU-T.
7. CONCLUSION
In this paper we presented the results of a first study on the im-
pact of resolution changes on user-perceived QoE. We were
motivated by the fact that the current literature is still inconclusive
about single impairment factors that are typical for
HAS services, with resolution switches being one of them.
Our novel, systematic test design – in which we could com-
pare reference against adaptivity conditions – allowed us to
predict the effect of a specific switch in terms of MOS degra-
dation.
The experiment shown here is just one of a series of tests
that we will conduct, in order to give a full picture on resolu-
tion switches. Our conceptual framework lists all these points
as a guideline for future research: it includes factors such as
switch visibility, positional effects of switches in longer se-
quences, impact of different device types, and – among oth-
ers – socio-economic factors related to user demographics and
pricing.
Acknowledgements
The work presented in this paper is funded by the European
Union in the context of Horizon 2020 Research and Innova-
tion Programme under Marie Sklodowska-Curie Innovative
Training Networks (MSCA-ITN-2014-ETN) – Grant Agree-
ment No. 643072, Network QoE-NET.
8. REFERENCES
[1] M. Seufert, S. Egger, M. Slanina, T. Zinner, T. Hoßfeld, and P. Tran-Gia, “A survey on Quality of Experience of HTTP adaptive streaming,” IEEE Communications Surveys & Tutorials, vol. 17, no. 1, pp. 469–492, 2015.

[2] A. Sackl, P. Zwickl, S. Egger, and P. Reichl, “The role of cognitive dissonance for QoE evaluation of multimedia services,” in 2012 IEEE Globecom Workshops. IEEE, 2012, pp. 1352–1356.

[3] M.-N. Garcia, F. De Simone, S. Tavakoli, N. Staelens, S. Egger, K. Brunnström, and A. Raake, “Quality of Experience and HTTP adaptive streaming: A review of subjective studies,” in QoMEX 2014. IEEE, 2014, pp. 141–146.

[4] N. Cranley, P. Perry, and L. Murphy, “User perception of adapting video quality,” International Journal of Human-Computer Studies, vol. 64, no. 8, pp. 637–647, 2006.

[5] S. Egger, B. Gardlo, M. Seufert, and R. Schatz, “The impact of adaptation strategies on perceived quality of HTTP adaptive streaming,” in Workshop on Design, Quality and Deployment of Adaptive Video Streaming. ACM, 2014, pp. 31–36.

[6] M. Peeters, C. Megens, C. Hummels, A. Brombacher, and W. Ijsselsteijn, “Experiential Design Landscapes: Design Research in the Wild,” in Nordic Design Research Conference 2013, 2013.

[7] D. Z. Rodríguez, Z. Wang, R. L. Rosa, and G. Bressan, “The impact of video-quality-level switching on user Quality of Experience in dynamic adaptive streaming over HTTP,” EURASIP Journal on Wireless Communications and Networking, vol. 2014, no. 1, pp. 1–15, 2014.

[8] M.-N. Garcia, P. List, S. Argyropoulos, D. Lindegren, M. Pettersson, B. Feiten, J. Gustafsson, and A. Raake, “Parametric model for audiovisual quality assessment in IPTV: ITU-T Rec. P.1201.2,” in Multimedia Signal Processing (MMSP), 2013 IEEE 15th International Workshop on. IEEE, 2013, pp. 482–487.

[9] C. International, “Video streaming survey 2016,” Tech. Rep.

[10] W. Robitza, S. Schönfellner, and A. Raake, “A Theoretical Approach to the Formation of Quality of Experience and User Behavior in Multimedia Services,” in 5th ISCA/DEGA Workshop on Perceptual Quality of Systems (PQS 2016), 2016, pp. 39–43.

[11] M. Pinson, M. Sullivan, and A. Catellier, “A new method for immersive audiovisual subjective testing,” in 8th International Workshop on Video Processing and Quality Metrics for Consumer Electronics (VPQM), 2014.

[12] W. Robitza, M. N. Garcia, and A. Raake, “At home in the lab: Assessing audiovisual quality of HTTP-based adaptive streaming with an immersive test paradigm,” in Seventh International Workshop on Quality of Multimedia Experience (QoMEX). IEEE, 2015, pp. 1–6.

[13] M. Grafl and C. Timmerer, “Representation switch smoothing for adaptive HTTP streaming,” in 4th International Workshop on Perceptual Quality of Systems (PQS 2013), 2013, pp. 178–183.

[14] J. De Vriendt, D. De Vleeschauwer, and D. Robinson, “Model for estimating QoE of video delivered using HTTP adaptive streaming,” in 2013 IFIP/IEEE International Symposium on Integrated Network Management (IM 2013). IEEE, 2013, pp. 1288–1293.

[15] S. Tavakoli, K. Brunnström, K. Wang, B. Andrén, M. Shahid, and N. Garcia, “Subjective quality assessment of an adaptive video streaming model,” in IS&T/SPIE Electronic Imaging. International Society for Optics and Photonics, 2014, pp. 90160K–90160K.

[16] Y. Liu, S. Dey, F. Ulupinar, M. Luby, and Y. Mao, “Deriving and validating user experience model for DASH video streaming,” IEEE Transactions on Broadcasting, vol. 61, no. 4, pp. 651–665, 2015.

[17] A. Raake and S. Egger, “Quality and Quality of Experience,” in Quality of Experience, pp. 11–33. Springer, 2014.