A Preliminary Study on Quality of Experience
Assessment of Compressed Audio File Format
Jawwad Ali Baloch
Department of Computer Science
ILMA University
Karachi, Pakistan
jawwad.balooch001@gmail.com
Vania V. Estrela
Department of Telecommunication
Fluminense Federal University (UFF)
RJ, Brazil
vania.estrela.phd@ieee.org
Awais Khan Jumani
Department of Computer Science
ILMA University
Karachi, Pakistan
awaisjumani@yahoo.com
Ricardo T. Lopes
LIN-COPPE, UFRJ
RJ, Brazil
ricardo@lin.ufrj.br
Asif Ali Laghari
Department of Computer Science
Sindh Madressatul Islam University
Karachi, Pakistan
asif.laghari@smiu.edu.pk
Abstract—Audio compression formats reduce the file size
before downloading and uploading on the Cloud and social
media platforms. An audio codec performs this operation. An
uncompressed audio file can be converted into a lossless or a
lossy compressed file, and compression shrinks the file size.
Converting an uncompressed file to a lossy format not only
decreases the original size but also reduces the audio quality
and the bit rate. One assesses the quality of experience (QoE) of
compressed audio file formats by comparing them to the original
audio, which requires subjective QoE assessment experiments on
both formats. This can be done by uploading a compressed file to
the Cloud at a given quality and sample rate to determine the
satisfaction level of the end-user and which audio quality is
suitable for upload to the Cloud and to social media platforms.
Keywords—Quality of Experience (QoE), Audio File Formats,
MP3, AAC.
I. INTRODUCTION
Connectivity mediums differ across the platforms where
users share feelings, events, and joys globally. Mobile
devices with internet access can upload and download
audio files from social media platforms, e.g., Twitter,
SoundCloud, YouTube, Facebook, and Instagram. These platforms
compress video and audio files to reduce the file
size and save storage space. Many algorithms permit file size
reduction with video codecs, viz. MPEG-4, H.264/AVC, and
HEVC [1]. The same goes for audio, where codecs compress
the data and convert the audio into digital format. Lossy
compressed audio formats, such as MP3 and AAC, discard part
of the audio content to reduce the audio file size.
Audio formats such as uncompressed WAV and AIFF contain
the full, original audio at the expense of disk space.
Others are compressed formats, such as the lossless Free
Lossless Audio Codec (FLAC) and the lossy MP3 and Advanced
Audio Coding (AAC) [2]. The former reduces the file size
without losing content, while the latter lose part of the
audio content.
Audio digitization requires sampling at an adequate
frequency, such as the standard CD sample rate of 44.1 kHz
at 16 bits per sample [3]. Video file formats are composed of
containers and codecs. The former encloses video data,
including audio data, metadata, and subtitles, while the
latter decodes the compressed video or audio [5]. Because a
container does not fully specify the codec, different
containers may work better with different codecs. A single
file can thus combine, e.g., H.264 or MPEG-4 video, MP3
audio, and XML metadata. Most social media platforms
recommend MP4 with H.264 or MPEG-4 compression if the format
is not known [6, 7]. It is also evident that a compressed
video uploaded to social media will not remain the same as
the original because of noise and distortion [4].
Recently, Quality of Experience (QoE) has become critical
for evaluating users' positive or negative experiences with
products or services, i.e., the user's satisfaction level. It
provides an assessment, based on subjective or objective
measures of a service, that helps enhance user satisfaction
with performance and trustworthiness [8, 9]. This investigation
evaluates the user's QoE for different compressed
audio file formats. It analyzes how the compression happens
and which file formats suit end-users. We use subjective
methods to collect perceived answers while considering the end
user's satisfaction level with file formats of different
quality. This approach enables us to measure the satisfaction
level of the end-user. Several QoE assessments (QoEAs) have
been made of video quality on Clouds, including video
streaming. Each compressed audio format performs
differently from the others in terms of audio content. QoE
has been used in different domains to gather user reviews
about devices and products and to improve services. Here,
subjective QoE is collected from end users for different
audio file formats, listening qualities, sample rates, and
bit-rates. To our knowledge, no previous research work has
investigated QoEA of compressed audio files and their quality
on the Cloud with respect to audio compression parameters.
2021 IEEE URUCON | 978-1-6654-2443-1/21/$31.00 ©2021 IEEE | DOI: 10.1109/URUCON53396.2021.9647114
Section II surveys the literature. Section III explains
audio details. Section IV describes the QoEA experiments, and
Section V presents results and discussion. Finally,
Section VI concludes our work.
II. RELATED WORK
Yang et al. [10] explored double MP3 compression, in which
a low bit-rate MP3 is recompressed to a high bit-rate. Such
recompression produces the many fake-quality MP3 files
available on the internet, so different methods have been
used to detect double compression and identify these files.
Moreover, Benford's law has been used to examine the
distribution of MP3 digits. Assorted experiments
help distinguish between single and double compression audio
files.
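As a rough illustration of the Benford's-law idea mentioned above, the sketch below computes the expected first-digit probabilities P(d) = log10(1 + 1/d) and measures how far the first-digit histogram of a list of numbers deviates from them. The function names and the sample data are hypothetical, and this is not the detection method of [10], which works on data extracted from MP3 files.

```python
import math
from collections import Counter


def benford_expected():
    """Expected first-digit probabilities under Benford's law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(x):
    """Leading non-zero decimal digit of a non-zero number."""
    x = abs(x)
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)


def first_digit_histogram(values):
    """Observed relative frequency of each leading digit (zeros are skipped)."""
    digits = [first_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


def benford_deviation(values):
    """Sum of absolute differences between observed and expected frequencies."""
    expected = benford_expected()
    observed = first_digit_histogram(values)
    return sum(abs(observed[d] - expected[d]) for d in range(1, 10))


# Hypothetical data: powers of 2 are known to follow Benford's law closely.
sample = [2 ** k for k in range(1, 200)]
print(round(benford_deviation(sample), 3))
```

A small deviation means the data follow the Benford distribution closely; a detector built on this idea would flag data whose deviation is unusually large.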
Similarly, Rassool [11] reported that the less time-consuming
PEAQ algorithm worked very well for objective QoEA of audio
quality at low bit-rates. The perceptual audio
quality depends on codecs and encoders. Furthermore, Sloan
et al. [12] compared the objective metric Virtual Speech Quality
Objective Listener (ViSQOL) against others, e.g., Perceptual
Evaluation of Audio Quality (PEAQ), Perception Model
Quality Assessment (PEMO-Q), and Perceptual Objective
Listening Quality Assessment (POLQA). ViSQOL
outperformed the other models in accuracy and
performance on two out of three datasets.
A web-based QoE platform gauges and improves customer
satisfaction, provides QoS, offers backend storage, and
allows customers to register complaints. An evaluation
application system has been created for this purpose. The
system can analyze customer feedback and provide solutions,
and it can also evaluate the performance of an employee.
Based on this QoE and quality of service (QoS), the banking
sector can determine its performance and the internal
structural problems that are the biggest impediments to the
best possible service delivery.
Laghari et al. [13] investigated QoE between the Cloud and
the end-user. The QoE experiment handled different video
formats with different quality features hosted at two
locations, one far and one nearby. The ping command was
used to measure the network delay. Users watched the
videos from the Cloud and shared their
satisfaction level. The findings showed that the nearby location
got higher ratings for video streaming than the far Cloud,
which was attributed to network delay.
Karim et al. [14] investigated video quality in social
Clouds, performing video quality analysis (VQA)
and exploring reference metrics for quality degradation.
Experiments involved uploading selected low-quality videos
to social clouds. Objective video metrics for cloud hosting
helped check emotion during a social media interaction. The
quality and resolution of the videos were affected
by the compression. Gunawan et al. [15] investigated QoE
assessment (QoEA) of the IEEE 1857.2 Advanced Audio
Coding, a lossless audio codec that is an alternative to other
codecs. This study checked the codec performance and operation
and found that the pre-processing block improved lossless
compression, although it increased
the encoding time and algorithm complexity. For lossless
compression, the IEEE 1857.2 compression performance is
much better than that of the others. Bouraqia et al. [16]
analyzed QoEA for diverse networks and reviewed measurement
methods and useful definitions. Online users' satisfaction with
streaming using search engines needs new rating acquisition
procedures.
III. AUDIO TEXT
Audio quality is an essential feature that attracts users to
listen to audio and share their responses as feedback. The
audio quality relies on some crucial elements such as sample
rate, bit rate, bit depth, and audio file format. The motivation
for this research is to relate QoE to the compression of
audio file formats. As far as format is concerned, digital audio
is stored on a computer as a file, which can be uncompressed or
compressed. The uncompressed audio file formats are WAV,
AIFF, and AU; they contain the complete, original audio
and take a lot of disk space.
In contrast, the compressed formats are of two kinds:
lossless, such as FLAC, and lossy, such as MP3 and AAC. The
former reduces the file size without losing the audio content,
while the latter reduce the audio file size and lose part of the
audio content. An uncompressed audio file
consists of the original audio waveform with the full
data content as captured. Because there is no compression,
it takes a lot of space to store. E.g., WAV is an
uncompressed format with PCM. Still, some argue that
WAV is not an uncompressed audio file format but a Windows
container that can hold multiple audio formats. A lossy audio
file format like MP3 reduces the audio file size and loses audio
quality during data compression. MP3 files work easily on
different audio devices because of their smaller size. The
format also discards sound components that are not audible
to the user. A lossless audio file
format reduces the file size during
compression without losing any audio data. The Free
Lossless Audio Codec (FLAC) is an open-source format of this
kind.
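The difference between lossless and lossy size reduction can be sketched with a toy example. In the sketch below, zlib (a general-purpose compressor, used here only as a stand-in for a lossless codec such as FLAC) restores the 16-bit PCM samples exactly, while cutting each sample down to 8 bits (a crude stand-in for lossy coding, not how MP3 or AAC actually work) cannot be undone. The 440 Hz test tone, the 8 kHz sample rate, and all names are assumptions made for illustration.

```python
import math
import struct
import zlib

SAMPLE_RATE = 8000  # Hz, assumed for this toy signal

# One second of a 440 Hz tone as signed 16-bit PCM samples.
samples = [
    int(20000 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE))
    for n in range(SAMPLE_RATE)
]
pcm16 = struct.pack(f"<{len(samples)}h", *samples)

# Lossless: compress, decompress, and get the identical bytes back.
packed = zlib.compress(pcm16)
lossless_ok = zlib.decompress(packed) == pcm16

# Lossy: keep only the top 8 bits of each sample, then scale back up.
reduced = [s >> 8 for s in samples]    # values now fit in 8 bits
restored = [r << 8 for r in reduced]   # back to the 16-bit range
max_error = max(abs(a - b) for a, b in zip(samples, restored))

print("lossless round trip exact:", lossless_ok)
print("16-bit PCM:", len(pcm16), "bytes; after zlib:", len(packed), "bytes")
print("8-bit version:", len(reduced), "bytes (one byte per sample)")
print("largest sample error after 8-bit reduction:", max_error)
```

The 8-bit version is half the size of the 16-bit original, but the discarded low-order bits are gone for good, which is the essence of a lossy format.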
Fig. 1. Process of compression
Fig. 1 shows compression from 16-bit to 8-bit files.
The sample rate is measured during audio digitization; a
standard sample rate for CDs is 44.1 kHz at 16 bits per sample.
Video file formats have containers and codecs. The former
contains video data, including audio, metadata, and subtitles,
while the latter decodes compressed video or audio. In the
past, a higher sample rate took more storage and power to
process, but nowadays, digital storage has resolved this issue.
Bit rate defines the audio stream quality, and a higher bit
rate means better audio quality. A high sample rate and
bit depth increase the amount of audio content and the file
size, which requires high bandwidth for production-quality
audio. A low bit rate takes less bandwidth and storage but
degrades the audio quality. Bit depth is the number of bits per
sample, which determines the dynamic range of the sound. The
dynamic range is the difference between the loudest and the
quietest sound in the recording. The bit rate
calculation follows:
bit rate = sample rate × channels × bit depth
An analog-to-digital (A/D) converter takes a snapshot of the
sound and produces a series of binary numbers (bits) to
describe each sample. Digitization thus means that many
samples, or snapshots, of the analog audio pass through an A/D
converter. Table 1 below shows some multimedia applications
together with the bit rates and sample rates of
each.
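The snapshot-taking described above can be sketched in a few lines. The code below is a minimal model of an A/D converter, assuming an input signal with values in [-1, 1]; the 1 kHz tone, the 8 kHz sample rate, the 8-bit depth, and the function names are illustrative choices, not parameters taken from this study.

```python
import math


def digitize(signal, duration_s, sample_rate_hz, bit_depth):
    """Sample signal(t) (values in [-1, 1]) and quantize to signed integers."""
    max_code = 2 ** (bit_depth - 1) - 1      # e.g. 127 for 8 bits
    n_samples = round(duration_s * sample_rate_hz)
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz               # time of the n-th snapshot
        codes.append(round(signal(t) * max_code))
    return codes


def tone(t):
    """Hypothetical analog input: a 1 kHz sine wave."""
    return math.sin(2 * math.pi * 1000 * t)


codes = digitize(tone, duration_s=0.001, sample_rate_hz=8000, bit_depth=8)

# Show each snapshot as the binary number an 8-bit converter would emit
# (two's complement for negative values), next to its integer value.
for code in codes:
    print(format(code & 0xFF, "08b"), code)
```

One millisecond of audio at 8 kHz yields eight snapshots; raising the sample rate adds snapshots per second, and raising the bit depth adds bits per snapshot.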
TABLE I. APPLICATION WITH SAMPLE RATE AND BIT RATE
Quality | Mono/Stereo | Bits per Sample | Sample Rate (kHz) | Frequency Band | Data Rate
Telephone | Mono | 8 | 8 | 200-3,400 Hz | 8k bytes/sec
AM Radio | Mono | 8 | 11.025 | - | 11.0k bytes/sec
FM Radio | Stereo | 16 | 22.050 | - | 88.2k bytes/sec
CD | Stereo | 16 | 44.1 | 20-20,000 Hz | 176.4k bytes/sec
DAT | Stereo | 16 | 48 | 20-20,000 Hz | 192.0k bytes/sec
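The data rates in Table 1 follow from the bit rate formula given earlier (sample rate times channels times bit depth), divided by 8 to convert bits to bytes. The short sketch below recomputes them; the function name is hypothetical and the inputs are the table's own rows.

```python
def data_rate_bytes_per_s(sample_rate_hz, channels, bit_depth):
    """bit rate = sample rate * channels * bit depth, converted to bytes/s."""
    bit_rate = sample_rate_hz * channels * bit_depth
    return bit_rate / 8


# (name, sample rate in Hz, channels, bits per sample) as listed in Table 1
rows = [
    ("Telephone", 8_000, 1, 8),
    ("AM Radio", 11_025, 1, 8),
    ("FM Radio", 22_050, 2, 16),
    ("CD", 44_100, 2, 16),
    ("DAT", 48_000, 2, 16),
]

for name, rate, channels, bits in rows:
    value = data_rate_bytes_per_s(rate, channels, bits)
    print(f"{name:10s} {value:>10,.0f} bytes/s")
```

For the CD row, 44,100 × 2 × 16 = 1,411,200 bits per second, i.e. 176,400 bytes per second, which matches the 176.4k entry in the table.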
IV. QUALITY OF EXPERIENCE
The uncompressed WAV and AIFF file formats have
better quality. The file size of these audio formats is much
larger than that of the compressed ones, so they take a lot of
space to store. On the other hand, the compressed audio file
formats are twofold: lossless and lossy. A lossless format like
FLAC reduces the file size during compression without
compromising the quality of the audio content. A compressed
lossy format, such as MP3 or AAC, reduces the original file
size and loses content. We used two kinds of WAV audio:
two music clips and two voice speeches. The first music clip
was 56 seconds long and the second was 54 seconds long, while
both voice speeches were 46 seconds long. We then converted
these uncompressed WAV files into the compressed lossy
formats MP3 and AAC, using different techniques to perform
the conversion. Then, we compared the QoE of the
compressed formats MP3 and AAC. After that, the
resulting MP3 and AAC compressed files were
transferred to the listening devices for
QoEA in order to find the better compressed audio format of
the two. Fig. 2 depicts this study step by step,
from audio compression to QoEA.
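To give a feel for the size difference at stake, the sketch below estimates file sizes for the four clip durations stated above. The recording parameters are not given in the text, so CD-quality WAV (44.1 kHz, stereo, 16-bit) and a constant 128 kbps for the lossy files are assumptions, and header or container overhead is ignored; the function names are hypothetical.

```python
def pcm_size_bytes(duration_s, sample_rate_hz=44_100, channels=2, bit_depth=16):
    """Approximate size of the uncompressed PCM payload (WAV header ignored)."""
    return duration_s * sample_rate_hz * channels * bit_depth // 8


def encoded_size_bytes(duration_s, bit_rate_bps=128_000):
    """Approximate size of a constant-bit-rate stream (overhead ignored)."""
    return duration_s * bit_rate_bps // 8


# Clip durations in seconds, as stated in the text.
clips = {"music 1": 56, "music 2": 54, "speech 1": 46, "speech 2": 46}

for name, seconds in clips.items():
    wav = pcm_size_bytes(seconds)
    lossy = encoded_size_bytes(seconds)
    print(f"{name}: WAV ~{wav / 1e6:.2f} MB, 128 kbps ~{lossy / 1e6:.2f} MB, "
          f"ratio ~{wav / lossy:.1f}:1")
```

Under these assumptions every clip shrinks by roughly a factor of 11, independent of its duration, because both sizes grow linearly with time.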
Fig. 2. QoE model for compressed audio format
To investigate QoEA, the MP3 and AAC compressed
audio files were transferred to listening devices. Next, a
questionnaire-based listening poll for the QoE study involved
200 participants. Most of them were undergraduate students and
master's degree holders. Of the 200 participants, 152 were male
and 48 were female, and their ages were
between 21 and 35 years. Opinions were collected through
distributed questionnaires for rating the audio. First, we
played the original audio through the listening devices. Then,
we played the MP3 and AAC compressed audio and allowed
participants to rate the compressed formats. The rating
mechanism relied on questionnaires in tabular form.
We used the absolute category rating (ACR) scale in Table 2 to
rank the compressed audio file formats. The mean
opinion score (MOS) technique was used for the QoE
assessment to get the score for each audio format.
TABLE II. QUALITY PARAMETERS
Quality | Score
Excellent | 5
Good | 4
Fine | 3
Fair | 2
Poor | 1
V. RESULTS AND DISCUSSION
VLC Player version 3.0 and Windows Media Player
version 12 were used for audio compression. The
uncompressed WAV files helped in checking the difference in
audio quality and size between the compressed lossy and
uncompressed formats. The WAV format carries the original
audio, and it outclassed all other formats in quality. However,
the drawback of WAV is its size: using it instead of
compressed audio formats consumes storage space. This is why
people prefer compression schemes like MP3 and AAC over WAV.
The WAV files were compressed into MP3 and AAC, and the QoEA
experiment relies on these compressed formats. For that purpose,
we conducted a user opinion poll on these
compressed audio formats, in which participants
listened to the formats and gave their opinion on the best
audio format.
TABLE III. FEEDBACK FOR AAC
Advanced Audio Coding (AAC)
Quality | Participants
Excellent | 59
Good | 67
Fine | 65
Fair | 9
Poor | 0
TABLE IV. FEEDBACK FOR MP3
MP3
Quality | Participants
Excellent | 42
Good | 40
Fine | 44
Fair | 63
Poor | 11
Tables 3 and 4 display the absolute category rating (ACR)
outcomes for the 200 participants. For AAC, 59 participants
marked Excellent, 67 Good, 65 Fine, and 9 Fair, and none chose
Poor. For MP3, 42 marked Excellent, 40 Good, 44 Fine, 63 Fair,
and 11 Poor.
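A mean opinion score can be derived from these counts by weighting each category with its score from Table 2 and dividing by the number of participants. The sketch below does this for the counts in Tables 3 and 4; the resulting values (about 3.88 for AAC and about 3.195 for MP3) are computed here from those tables and are not figures quoted from the authors. The function and variable names are hypothetical.

```python
# Category scores from Table 2.
SCORES = {"Excellent": 5, "Good": 4, "Fine": 3, "Fair": 2, "Poor": 1}

# Participant counts from Tables 3 and 4.
aac = {"Excellent": 59, "Good": 67, "Fine": 65, "Fair": 9, "Poor": 0}
mp3 = {"Excellent": 42, "Good": 40, "Fine": 44, "Fair": 63, "Poor": 11}


def mean_opinion_score(counts):
    """Weighted average of the category scores over all participants."""
    total = sum(counts.values())
    weighted = sum(SCORES[label] * n for label, n in counts.items())
    return weighted / total


for name, counts in (("AAC", aac), ("MP3", mp3)):
    print(f"{name}: {sum(counts.values())} ratings, "
          f"MOS = {mean_opinion_score(counts):.3f}")
```

Both tables sum to 200 ratings, so the two scores are directly comparable.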
Fig. 3. Comparison of audio formats (AAC and MP3)
Results show users' MOS in Table 5 and Fig. 3, with higher
scores for AAC than for MP3. As discussed above, both
audio formats belong to the same lossy family, but the
more effective compression algorithm of AAC performs better
than MP3. The AAC audio file format retains more of the
original content than MP3 at
identical file sizes and bit rates. In addition,
AAC performs much better than MP3 at low bit-rates,
such as below 128 kbps. Consequently, the above tables and
Fig. 4 illustrate how AAC performs better than
MP3.
Fig. 4. Audio formats' comparison
VI. CONCLUSION
This paper measured the quality of experience (QoE) of
audio file formats through a public opinion poll scored with
the mean opinion score (MOS). The uncompressed WAV files were
compressed into MP3 and AAC using VLC Player, Windows
Media Player, and other online platforms. The QoEA involved 200
participants invited to hear the audio formats and rate the
audio quality. As per the QoEA, the AAC audio
format quality is better than that of MP3, although MP3 is more
popular among the masses. This research work provides an
opinion on compressed file formats, which have lower quality
than uncompressed audio formats like WAV. However, we have
also found that users most commonly use
compressed audio formats, because the
compression scheme makes these audio files take
less storage space.
REFERENCES
[1] N. Sadeghi, M. Fahiminia, and M. Teimouri, “Dataset for file fragment
classification of video file formats,” BMC Res. Notes, vol. 13, no. 1,
2020, doi: 10.1186/s13104-020-05037-x.
[2] J. Oh and E. S. Jang, “MP3-based point cloud compression,” 2021 Int.
Conf. Electron. Information, Commun. ICEIC 2021, 2021, doi:
10.1109/ICEIC51217.2021.9369717.
[3] G. Schuller, “Predictive Lossless Audio Coding,” Filter Banks and
Audio Coding, pp. 161–166, 2020, doi: 10.1007/978-3-030-51249-1_7.
[4] A. Alfa et al., “A Comparative Study of Methods for Hiding Large Size
Audio File in Smaller Image Carriers,” Commun. Comput. Inf. Sci.,
vol. 985, pp. 179–191, 2019, doi: 10.1007/978-981-13-8300-7_15.
[5] Q. Huang, R. Wang, D. Yan, and J. Zhang, “AAC audio compression
detection based on QMDCT coefficient,” Lect. Notes Comput. Sci.
(including Subser. Lect. Notes Artif. Intell. Lect. Notes
Bioinformatics), vol. 11068 LNCS, pp. 347–359, 2018, doi:
10.1007/978-3-030-00021-9_32.
[6] S. Cunningham and I. McGregor, “Subjective Evaluation of Music
Compressed with the ACER Codec Compared to AAC, MP3, and
Uncompressed PCM,” Int. J. Digit. Multimed. Broadcast., vol. 2019,
2019, doi: 10.1155/2019/8265301.
[7] K. D. Singh, Y. Hadjadj-Aoul, and G. Rubino, “Quality of experience
estimation for adaptive HTTP/TCP video streaming using
H.264/AVC,” 2012 IEEE Consum. Commun. Netw. Conf.
CCNC’2012, pp. 127–131, 2012, doi: 10.1109/CCNC.2012.6181070.
[8] A. Laghari, H. He, S. Karim, H. A. Shah, and N. K. Karn, “Quality of
Experience Assessment of Video Quality in Social Clouds,” Wirel.
Commun. Mob. Comput., vol. 2017, 2017, doi:
10.1155/2017/8313942.
[9] F. De Turck, S. Petrangeli, J. Van Der Hooft, and T. Wauters, “Quality
of experience-centric management of adaptive video streaming
services: Status and challenges,” ACM Trans. Multimed. Comput.
Commun. Appl., vol. 14, no. 2s, 2018, doi: 10.1145/3165266.
[10] R. Yang, Y. Q. Shi, and J. Huang, “Detecting double compression of
audio signal,” Media Forensics Secur. II, vol. 7541, p. 75410K, 2010,
doi: 10.1117/12.838695.
[11] R. Rassool, “VMAF reproducibility: Validating a perceptual practical
video quality metric,” IEEE Int. Symp. Broadband Multimed. Syst.
Broadcast. BMSB, 2017, doi: 10.1109/BMSB.2017.7986143.
[12] Sloan, N. Harte, D. Kelly, A. C. Kokaram, and A. Hines, “Objective
Assessment of Perceptual Audio Quality Using ViSQOLAudio,” IEEE
Trans. Broadcast., vol. 63, no. 4, pp. 693–705, 2017, doi:
10.1109/TBC.2017.2704421.
[13] A. Laghari, H. He, M. Shafiq, and A. Khan, “Assessing effect of Cloud
distance on end user’s Quality of Experience (QoE),” 2016 2nd IEEE
Int. Conf. Comput. Commun. ICCC 2016 - Proc., pp. 500–505, 2017,
doi: 10.1109/CompComm.2016.7924751.
[14] S. Karim, H. He, A. R. Junejo, and M. Sattar, “Measurement of
Objective Video Quality in Social Cloud Based on Reference Metric,”
Wirel. Commun. Mob. Comput., vol. 2020, 2020, doi:
10.1155/2020/5028132.
[15] T. S. Gunawan, M. K. Mat Zain, F. A. Muin, and M. Kartiwi,
“Investigation of lossless audio compression using IEEE 1857.2
advanced audio coding,” Indones. J. Electr. Eng. Comput. Sci., vol. 6,
no. 2, pp. 422–430, 2017, doi: 10.11591/ijeecs.v6.i2.pp422-430.
[16] K. Bouraqia, E. Sabir, M. Sadik, and L. Ladid, “Quality of Experience
for Streaming Services: Measurements, Challenges and Insights,”
IEEE Access, vol. 8, pp. 13341–13361, 2020, doi:
10.1109/ACCESS.2020.2965099.