An Open Source Platform for Perceived Video Quality
Evaluation
Lamine Amour, Sami Souihi, Said Hoceini and Abdelhamid Mellouk
University of Paris-Est Créteil Val de Marne (UPEC)
Image, Signal and Intelligent Systems Lab-LiSSi and Netw & Telecoms Dept, IUT CV
122 rue Paul Armangot, 94400 Vitry sur Seine, France
(lamine.amour, sami.souihi, hoceini, mellouk) @u-pec.fr
ABSTRACT
To ensure the best multimedia service quality and meet users' expectations, a new concept named Quality of Experience (QoE) has emerged. Two methods can be used to evaluate user satisfaction: a subjective one and an objective one. The subjective approach is based on real measured data; the problem is that no existing dataset is large enough to evaluate QoE properly. In this work, we present our approach to building a dataset for subjective evaluation, based on a categorization approach and open source software.
Keywords
Quality of Experience (QoE); Controlled environment; Crowd-
sourcing; Mean opinion score; Video; Framework.
1. INTRODUCTION
A number of works have already been carried out in the QoE area. For example, Qualinet tries to quantify QoE and to propose efficient QoE estimation models. To build a more accurate model, the first need is a consistent database. User perception can be influenced by a huge number of parameters (user profile, network parameters, application parameters, ...). These factors are called QoE Influence Factors (QoE IFs) [2]. Nowadays, no available database is large enough to include all these QoE IFs and to support a more accurate estimation model. In this context, we present CLLF (Controlled LiSSi Lab Framework), an open source platform that helps researchers build a large QoE/QoE IFs database for video streaming services (YouTube).
2. RELATED WORK
QoE is a hot topic, as shown by the large number of works that can be found in the literature. In this section, we present two video QoE frameworks.
- The first framework is proposed by Figuerola Salas et al. [3]. The system has been used on a large scale, with preliminary results of a validation study. It is based on an HTML5 Web-based tool that collects ratings of videos encoded at different bitrates. This work focuses on the High Definition (720p
Permission to make digital or hard copies of part or all of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. Copyrights for third-
party components of this work must be honored. For all other uses, contact
the Owner/Author(s). Copyright is held by the owner/author(s).
Q2SWinet’15, November 2–6, 2015, Cancun, Mexico.
ACM 978-1-4503-3757-1/15/11.
DOI: http://dx.doi.org/10.1145/2815317.2815344.
(HD)) video service commonly employed for video delivery over the Internet. In this framework, the authors used videos encoded at 784 Kbps and 1 Mbps with the AVC/H.264 codec, and used the Mean Opinion Score (MOS) method to record the user's perception.
- The second system is proposed by Hanhart et al. [4]. In this framework, the authors investigate two approaches to assess multi-view video plus depth (MVD) content on 2D displays: first, by using a virtual view; second, by using a free-viewpoint video, which corresponds to a smooth camera motion during a time freeze. They conducted crowdsourcing experiments using seven MVD sequences encoded at different bit rates, tested in both a lab-based and a crowd-based evaluation. The single-stimulus (SS) methodology was chosen, combined with the MOS method.
3. FRAMEWORK DESCRIPTION
3.1 Objectives
The main objective of our proposed platform [1] is to evaluate user perception of video streaming services in a controlled environment. It builds on our previous proposal, the 'QoE IFs hierarchical classification' introduced in [2]. Based on the QoE IFs categories (Network, User profile, Application, Device and User feedback, Figure 1), we explain how to build a framework covering a large number of QoE IFs in a controlled environment.
Figure 1: QoE IFs categories presentation.
The platform goals are summarized in the following elements: (i) an application has been developed and installed to evaluate video streaming services in controlled experimentation; (ii) a platform has been set up to implement our architecture [2]; (iii) several resolutions were considered (144p to 1080p (Full HD)) and several types of videos were used in the experiments (e.g. sport, movie, documentary, news, music, etc.). In the following section, we explain how we proceed to collect the QoE IFs and achieve these goals.
3.2 How to collect the QoE IFs categories?
To build the dataset, we set up a testbed in a totally controlled environment. Figure 2 presents the overall system components.
Figure 2: System components.
In this testbed, users give their MOS at the end of each video. To account for all the QoE IFs that can have an impact on the user's perception, we use and combine several software tools, such as the VideoLAN Client player (VLC), NetEm (Network Emulator), the Python language and a MySQL database server. Below, we describe how we proceed to collect the QoE IFs for each category (Figure 1).
3.2.1 UF and UP
User Feedback (UF) is a score given by the user to subjectively evaluate a service (MOS). It is captured by the questions below. The answer to each question is between 1 and 5, where 1 indicates that the quality is not acceptable or very bad (very long start-up time, very high lag between picture and audio, etc.) and 5 indicates that the quality is very good (no buffering time, no lag, etc.).
User Feedback
1) Video start-up time: 1: very long / 5: very fast.
2) Lag between image and audio: 1: very big lag / 5: no lag.
3) Image quality: 1: very bad / 5: very good.
4) Sound quality: 1: very bad / 5: very good.
5) MOS: 1: very bad / 5: very good.
In addition, the platform implements an ergonomic interface to ask users about their profiles (age, sex, study level and experience with the video service).
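As an illustration, the five questions map naturally onto a small record check before the answers are stored. The sketch below is ours, not the platform's actual code, and the field names are hypothetical:

```python
# Hypothetical sketch: validate one user-feedback record before storing it.
# Field names are illustrative, not the platform's actual database schema.

FEEDBACK_FIELDS = [
    "startup_time",      # 1: very long .. 5: very fast
    "audio_video_lag",   # 1: very big lag .. 5: no lag
    "image_quality",     # 1: very bad .. 5: very good
    "sound_quality",     # 1: very bad .. 5: very good
    "mos",               # 1: very bad .. 5: very good
]

def validate_feedback(record: dict) -> dict:
    """Check that every question was answered with an integer score in 1..5."""
    for field in FEEDBACK_FIELDS:
        score = record.get(field)
        if not isinstance(score, int) or not 1 <= score <= 5:
            raise ValueError(f"{field}: expected an integer in 1..5, got {score!r}")
    return record
```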
3.2.2 QoD
Another important aspect is the quality of the device (QoD). The proposed platform can collect device information such as the screen size and resolution, CPU performance and available memory.
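The paper does not show the platform's collection code; a minimal, standard-library-only sketch of gathering such device characteristics could look like the following (screen size and resolution would require a GUI toolkit or browser API and are omitted here):

```python
# Hypothetical sketch: gather basic device information with the Python
# standard library only. The platform's actual collection code is not
# shown in the paper.
import os
import platform

def collect_device_info() -> dict:
    """Return a small dictionary of device characteristics."""
    return {
        "os": platform.system(),            # e.g. "Linux"
        "machine": platform.machine(),      # e.g. "x86_64"
        "processor": platform.processor(),  # CPU identifier (may be empty)
        "cpu_count": os.cpu_count(),        # number of logical CPUs
    }
```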
3.2.3 QoS
To evaluate the impact of network QoS, the platform uses a network emulator called NetEm. A preliminary study was done to help us fix the QoS variation values, in order to identify the QoS combinations that cause visible problems in the video stream, such as video blocking, launch failure, or lag between sound and image. In this experimentation, we test 3 QoS factors (delay, loss and rate) [5, 6] and 117 combinations, summarized in Table 1. These 3 QoS factors are varied in a random manner.
QoE IFs | Values
delay   | 0, 100, 200, 400 (ms)
rate    | 256, 512, 768, 1024, 1536 (kbit/s)
loss    | 0, 5, 10 (%)
Table 1: QoS variations on the controlled laboratory testbed.
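To make the emulation step concrete, the values of Table 1 can be enumerated and turned into NetEm commands. The exact tc invocation used by the platform is not given in the paper; the sketch below follows the common `tc qdisc ... netem` form and is only an assumption:

```python
# Hypothetical sketch: enumerate the Table 1 QoS values and build the
# corresponding tc/netem command strings. The platform's actual tc
# invocation is not shown in the paper.
import itertools
import random

DELAYS_MS = [0, 100, 200, 400]
RATES_KBIT = [256, 512, 768, 1024, 1536]
LOSSES_PCT = [0, 5, 10]

def netem_command(delay_ms: int, rate_kbit: int, loss_pct: int,
                  iface: str = "eth0") -> str:
    """Build a tc/netem command string for one QoS combination."""
    return (f"tc qdisc replace dev {iface} root netem "
            f"delay {delay_ms}ms loss {loss_pct}% rate {rate_kbit}kbit")

# All combinations of the three factors, applied in random order,
# matching the paper's random variation of the QoS factors.
combinations = list(itertools.product(DELAYS_MS, RATES_KBIT, LOSSES_PCT))
random.shuffle(combinations)
```

Note that a full Cartesian product of the listed values yields 60 combinations; the 117 combinations mentioned above presumably include values or repetitions beyond those listed in Table 1.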
3.2.4 QoA
To collect the application-category factors for the CLLF framework, we worked on two sides. The first part involves harvesting traditional application parameters (e.g. bitrate, frame rate, etc.); our platform uses a modified version of the VideoLAN Client player (VLC) to gather all these QoE IFs. The second part concerns the video content (e.g. video type, degree of motion and codec). To implement it, (i) we started with the selection of 8 video types based on existing work; (ii) once the choice was made, we downloaded 24 videos with free rights (YouTube Creative Commons) and with different resolutions (between 144p and 1080p); (iii) selected portions of 30 seconds with a specific motion degree; (iv) uploaded them again to YouTube; and (v) called them by their URL from the player. Figure 3 presents screenshots of the used videos.
Figure 3: Screenshots of the used videos.
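The selection pipeline (steps i-v) ends with clips fetched by URL in the player. A minimal sketch of one catalogue entry and the corresponding VLC invocation is shown below; the type names, the URL and the catalogue structure are illustrative, not the actual corpus:

```python
# Hypothetical sketch: one catalogue entry for a 30-second clip and the
# command used to open it in VLC. The URL and field values are placeholders.
from dataclasses import dataclass

@dataclass
class Clip:
    video_type: str   # one of the 8 selected types, e.g. "sport", "news"
    resolution: str   # "144p" .. "1080p"
    url: str          # YouTube URL of the re-uploaded 30 s portion

def vlc_command(clip: Clip) -> list:
    """Command-line invocation to play one clip with VLC and exit when done."""
    return ["vlc", "--play-and-exit", clip.url]
```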
4. CONCLUSION
Quality of Experience (QoE) appeared as a way to improve network control by taking into account the quality really perceived by the user. To achieve this goal, we have studied the QoE factors that can impact user satisfaction in a mobile controlled environment, in order to implement a platform that can be used to build large and consistent databases. The latter can help us improve QoE estimation models. Our open source platform is still in development, so as to keep introducing new QoE IFs and to enlarge the dataset.
5. REFERENCES
[1] L. Amour, S. Souihi, S. Hoceini, and A. Mellouk.
Platform video. https://youtu.be/pMvfVQYplVk, 2015.
[2] L. Amour, S. Souihi, S. Hoceini, and A. Mellouk. A
hierarchical classification model of qoe influence factors.
13th International Conference on Wired and Wireless
Internet Communications, May 25-27, 2015.
[3] O. Figuerola Salas, V. Adzic, A. Shah, and H. Kalva.
Assessing internet video quality using crowdsourcing. In
Proceedings of the 2nd ACM International Workshop on
Crowdsourcing for Multimedia, CrowdMM ’13, pages
23–28, New York, NY, USA, 2013. ACM.
[4] P. Hanhart, P. Korshunov, and T. Ebrahimi.
Crowd-based quality assessment of multiview video plus
depth coding. 2014.
[5] A. Khan, L. Sun, and E. Ifeachor. Content clustering
based video quality prediction model for mpeg4 video
streaming over wireless networks. In Communications,
2009. ICC ’09. IEEE International Conference on, pages
1–5, June 2009.
[6] M. Mushtaq, B. Augustin, and A. Mellouk. Empirical
study based on machine learning approach to assess the
qos/qoe correlation. In Networks and Optical
Communications (NOC), 2012 17th European Conference
on, pages 1–7, June 2012.