A.M. Kondoz

Loughborough University, Loughborough, England, United Kingdom

Publications (372) · 226.24 Total impact

  • Source

    Full-text · Dataset · Nov 2015
  • Source

    Full-text · Dataset · Nov 2015
  • M. Stefanovic · Y.D. Cho · S. Villette · A.M. Kondoz
    ABSTRACT: Speech coding at very low bit rates has many applications, such as answering machines, IP telephony, mobile communications and military communications. In this paper we describe a speech coder that operates at both 2.4 and 1.2 kb/s and produces good-quality synthesised speech. The coder is based on a frequency-domain vocoding method in which the LP excitation is classified into voiced and unvoiced parts. The rate of the coder can be switched from 2.4 kb/s, which operates on 20 ms frames, to 1.2 kb/s by using 60 ms frames. Both rates use the same analysis and synthesis building blocks over 20 ms. Reliable pitch estimation and very elaborate voiced/unvoiced mixture determination algorithms render the algorithm robust to background noise. However, in order to communicate in very severe noisy conditions, a noise pre-processor has been integrated within the speech encoder. Owing to robust quantisation methods and careful encoding/index assignment algorithms, the coder maintains its good quality with 1% random bit errors and 3% random frame (20 ms) erasures.
    No preview · Article · Mar 2015
  • Haiyue Yuan · Janko Calic · Ahmet Kondoz
    ABSTRACT: Quality of Experience has recently become a paramount research topic in multimedia systems, especially in the emerging areas of high-definition and 3D video content. There has been a significant amount of research focusing on 3D content production, compression and delivery. However very little research has been dedicated to the emerging challenges in assessing user experience when interacting with the 3D video content. Interaction tasks such as pointing and selection are critical to the consumer's experience of the 3D video technology. This paper studies the impact of pointing modalities on the quality of interaction experience with stereoscopic 3D television. The conducted user study compares and evaluates three pointing modalities: standard mouse-based interaction, virtual laser pointer and hand gesture modality, using the ISO 9241-9 standard for multi-directional tapping task. The results suggest that the virtual laser pointer modality can provide better quality of interaction experience than other modalities in terms of user performance and user satisfaction.
    No preview · Article · Jan 2015
  • H. Lim · C. Kim · E. Ekmekcioglu · S. Dogan · A.P. Hill · A.M. Kondoz · X. Shi
    ABSTRACT: This paper proposes an immersive audio rendering scheme for networked 3D multimedia systems. The spatial audio rendering method based on wave field synthesis is particularly useful for applications where multiple listeners experience a true spatial soundscape while being free to move without losing spatial sound properties. The proposed approach can be considered as a general solution to the static listening restriction imposed by conventional methods, which rely on an accurate sound reproduction within a sweet spot only. The paper reports on the results of numerical analysis and experimental validation using various sound sources. It is demonstrated and confirmed that while covering the majority of the listening area, the developed approach can create a variety of virtual audio objects at target positions with very high accuracy. Subjective evaluation results show that an accurate spatial impression can be achieved with multiple simultaneous audible depth cues improving localization accuracy over single object rendering.
    No preview · Article · Jan 2015
  • Source
    C. Ozcinar · E. Ekmekcioglu · A. Kondoz
    ABSTRACT: Streaming 3D multi-view video to multiple clients simultaneously remains a highly challenging problem due to the high volume of data involved and the inherent limitations imposed by the delivery networks. Delivery of multimedia streams over Peer-to-Peer (P2P) networks has gained great interest due to its ability to maximise link utilisation, preventing the transport of multiple copies of the same packet for many users. On the other hand, the quality of experience can still be significantly degraded by dynamic variations caused by congestion, unless content-aware precautionary mechanisms and adaptation methods are deployed. In this paper, a novel, adaptive multi-view video streaming system over P2P is introduced, which addresses the next generation high resolution multi-view users' experiences with autostereoscopic displays. The solution comprises the extraction of low-overhead supplementary metadata at the media encoding server that is distributed through the network and used by clients performing network adaptation. In the proposed concept, pre-selected views are discarded at times of network congestion and reconstructed with high quality using the metadata and the neighbouring views. The experimental results show that the robustness of P2P multi-view streaming using the proposed adaptation scheme is significantly increased under congestion.
    Full-text · Article · Jan 2015
  • ABSTRACT: This paper addresses the problem caused by motion vector coding dependencies on the error resilience performance of the emergent High Efficiency Video Coding (HEVC) standard. We propose a method based on the prediction dependency of motion vectors (MV) to select the most relevant ones for redundant coding with reduced overhead. The spatial dependencies are analysed in the encoder to prioritise the MVs that should be selected for redundancy, based on the number of subsequent dependent coding units. Then, a subset of prioritised MVs is transmitted as redundancy (referred to as side information in the paper), to reduce the use and propagation of mismatched MV predictions in case of transmission errors or data loss. The simulation results show that the proposed MV selection method can effectively identify the most relevant motion field, achieving improved error robustness with a reduced redundancy overhead. Exploiting only 30% of the generated MVs for redundancy, average quality gains of up to 1 dB are achieved compared to a uniform MV selection scheme, and up to 2 dB compared to the original HEVC standard with no redundant encoded information.
    No preview · Article · Jan 2015
  • E. Ekmekcioglu · V. De Silva · S. Dogan · A. Kondoz
    ABSTRACT: Asymmetric quality stereoscopic coding has proved to be an effective method for reducing bandwidth without altering the visual quality. Visual attention cues can also be used in asymmetric coding, but this combination has not yet been widely studied. This study reports the performance limits of visual attention aided asymmetric stereoscopic video coding over conventional asymmetric coding.
    No preview · Article · Dec 2014
  • ABSTRACT: The high level of compression efficiency achieved by HEVC coding techniques decreases the error resilience performance under error prone conditions. This paper addresses the error resiliency of the HEVC standard, focusing on the new motion estimation tools. It is shown that the temporal dependency of motion information is comparatively higher than that in the H.264/AVC standard, causing an increase in the error propagation. Based on this evidence, this paper proposes a method to make intelligent use of temporal motion vector (MV) candidates during the motion estimation process, in order to decrease the temporal dependency, and improve the error resiliency without penalising the rate-distortion performance. The simulation results show that the proposed method improves the error resilience under tested conditions by increasing the video quality by up to 1.7 dB on average, compared to the reference method that always enables temporal MV candidates.
    No preview · Article · Nov 2014
  • Source
    G. Nur Yilmaz · H.K. Arachchi · S. Dogan · A. Kondoz
    ABSTRACT: 3-Dimensional (3D) video adaptation decision taking is an open field in which not many researchers have carried out investigations yet compared to 3D video display, coding, etc. Moreover, utilizing ambient illumination as an environmental context for 3D video adaptation decision taking has particularly not been studied in literature to date. In this paper, a user perception model, which is based on determining perception characteristics of a user for a 3D video content viewed under a particular ambient illumination condition, is proposed. Using the proposed model, a 3D video bit rate adaptation decision taking technique is developed to determine the adapted bit rate for the 3D video content to maintain 3D video quality perception by considering the ambient illumination condition changes. Experimental results demonstrate that the proposed technique is capable of exploiting the changes in ambient illumination level to use network resources more efficiently without sacrificing the 3D video quality perception.
    Full-text · Article · Sep 2014
  • ABSTRACT: Demand for multimedia content by consumers' handheld devices over wireless channels is on the increase. In view of the accelerated trend towards consumption of high quality video, power utilization by mobile devices is expected to increase excessively. Hence, it becomes equally important to advance more efficient power minimization techniques, in light of the short battery life in portable devices. However, power minimization algorithms that adopt consumers' perceptual quality of video have not been adequately researched. This paper proposes a joint optimization of energy and quality requirements in a multiuser orthogonal frequency-division multiplexing environment. A multi-objective optimization problem is formulated with the aim to identify bitrate allocations among users such that total power is minimized, and average quality is maximized. For this, a content-aware and energy-efficient resource allocation scheme (CaERAS) is proposed based on genetic and greedy algorithms. Simulation results show that CaERAS, as a low-complexity scheme, outperforms comparable methods in terms of efficiency and selectivity of suboptimal solutions. It is shown to acquire a suboptimal solution in as low as 0.0025 of the search space in previous methods. Also, a significant average saving of 85.66% in required energy is observed in broadcast transmission as opposed to unicast transmission.
    No preview · Conference Paper · Aug 2014
  • ABSTRACT: Demand for multimedia content by consumers' handheld devices over wireless channels is on the increase. In view of the accelerated trend towards consumption of high quality video, power utilization by mobile devices is expected to increase excessively. Hence, it becomes equally important to advance more efficient power minimization techniques, in light of the short battery life in portable devices. However, power minimization algorithms that adopt consumers' perceptual quality of video have not been adequately researched. This paper proposes a joint optimization of energy and quality requirements in a multiuser orthogonal frequency-division multiplexing environment. A multi-objective optimization problem is formulated with the aim to identify bitrate allocations among users such that total power is minimized, and average quality is maximized. For this, a content-aware and energy-efficient resource allocation scheme (CaERAS) is proposed based on genetic and greedy algorithms. Simulation results show that CaERAS, as a low-complexity scheme, outperforms comparable methods in terms of efficiency and selectivity of suboptimal solutions. It is shown to acquire a suboptimal solution in as low as 0.0025 of the search space in previous methods. Also, a significant average saving of 85.66% in required energy is observed in broadcast transmission as opposed to unicast transmission.
    No preview · Article · Aug 2014 · IEEE Transactions on Consumer Electronics
  • ABSTRACT: An inherent problem of Depth Image Based Rendering (DIBR) is the visual presence of disocclusions in the rendered views. This poses a significant challenge when the subjective assessment of these views is utilised for evaluating the quality of the depth maps used in the rendering process. Although various techniques are available to address this challenge, they result in concealing distortions, which are directly caused by the depth map imperfections. For the purposes of depth map quality evaluation, there is a need for an approach that deals with the presence of disocclusions without having further impact on other distortions. The aim of this approach is to enable the subjective assessments of rendered views to provide results, which are more representative of the quality of the depth map used in the rendering process.
    No preview · Conference Paper · Jul 2014
  • Haiyue Yuan · Janko Calic · Ahmet Kondoz
    ABSTRACT: In spite of a large body of research focused on 3D video technology, very little attention has been dedicated to the design practices of stereoscopic 3D video interaction devices. Interaction tasks such as pointing and selection are critical to the consumer's experience of the 3D video technology. This paper presents implementation and investigation of pointing modalities in the context of stereoscopic 3D display devices. The conducted investigation implements three pointing modalities: standard mouse-based interaction, virtual laser pointer implemented using Wiimote, and hand movement modality using Kinect. The study explores its utilisation in interactive tasks such as two-handed interaction and zoom functionalities. Finally, we investigate the definition of interaction space for hand movement modality to facilitate effective and comfortable pointing.
    No preview · Conference Paper · Jul 2014
  • Source
    ABSTRACT: Owing to its wide interoperability, stereoscopic 3D video format in High Definition (HD) is a popular choice for 3D entertainment media distribution. However, the delivery over bandwidth constrained networks exhibits challenges in terms of intermittent congestion in the network traffic, which forces the delivery system to perform perception-aware coding to save bandwidth. In the scope of stereoscopic 3D video, asymmetric quality adaptation has proved to be an effective method in terms of maintaining the perceived quality while reducing the required transmission bandwidth. On the other hand, Region-Of-Interest (ROI) based coding in accordance with the visual attention cues, which offers non-uniform quality assignment to regions of different saliency levels, has not been widely studied in combination with asymmetric coding of stereoscopic 3D video. In this work, the effectiveness of using visual attention aided non-uniform asymmetric 3D video coding is explored. The importance of incorporating compression artefacts in the formulation of visual attention model is also revealed. The discussions in this paper are based on a comprehensive subjective test with 8 stereoscopic video sequences of different spatial and temporal characteristics at different conditions.
    Full-text · Article · Jun 2014 · IEEE Journal of Selected Topics in Signal Processing
  • Source
    ABSTRACT: This article proposes a converged broadcast and broadband platform in order to deliver 3D media to both mobile and fixed users with guaranteed minimum quality of experience (QoE). The work presented offers an ideal business model for operators having both digital video broadcast and Internet Protocol (IP)-based media services. To that end, the DVB and peer-to-peer Internet technologies will be combined to provide sufficient resources for supporting high-bandwidth high-quality 3D multiview video. The motivations behind combining these technologies are outlined with an emphasis on their complementary characteristics. In addition, the overall design of the proposed architecture is presented focusing on the protocols that are exploited to achieve the interworking of the underlying technologies. Moreover, innovative key techniques for supporting both fixed and mobile users in an efficient manner are introduced.
    Full-text · Article · Jun 2014 · IEEE Wireless Communications
  • G C V Perera · V De Silva · A M Kondoz · S Dogan
    ABSTRACT: With the exponential growth of stereoscopic imaging in various applications, it has become very demanding to have a reliable quality assessment technique to measure the human perception of stereoscopic images. Quality assessment of stereoscopic visual content in the presence of artefacts caused by compression and transmission is a key component of end-to-end 3D media delivery systems. Despite a few recent attempts to develop stereoscopic image/video quality metrics, there is still a lack of a robust stereoscopic image quality metric. Towards addressing this issue, this paper proposes a full reference stereoscopic image quality metric, which mimics the human perception while viewing stereoscopic images. A signal processing model that is consistent with physiological literature is developed in the paper to simulate the behaviour of simple and complex cells of the primary visual cortex in the Human Visual System (HVS). The model is trained with two publicly available stereoscopic image databases to match the perceptual judgement of impaired stereoscopic images. The experimental results demonstrate a significant improvement in prediction performance as compared with several state-of-the-art stereoscopic image quality metrics.
    No preview · Conference Paper · May 2014
  • Source
    V De Silva · A M Kondoz · S Dogan · C Galkandage
    ABSTRACT: With the exponential growth of stereoscopic imaging in various applications, it has become very demanding to have a reliable quality assessment technique to measure the human perception of stereoscopic images. Quality assessment of stereoscopic visual content in the presence of artefacts caused by compression and transmission is a key component of end-to-end 3D media delivery systems. Despite a few recent attempts to develop stereoscopic image/video quality metrics, there is still a lack of a robust stereoscopic image quality metric. Towards addressing this issue, this paper proposes a full reference stereoscopic image quality metric, which mimics the human perception while viewing stereoscopic images. A signal processing model that is consistent with physiological literature is developed in the paper to simulate the behaviour of simple and complex cells of the primary visual cortex in the Human Visual System (HVS). The model is trained with two publicly available stereoscopic image databases to match the perceptual judgement of impaired stereoscopic images. The experimental results demonstrate a significant improvement in prediction performance as compared with several state-of-the-art stereoscopic image quality metrics.
    Full-text · Conference Paper · May 2014
  • Ikhwana Elfitri · Xiyu Shi · Ahmet Kondoz
    ABSTRACT: This study presents a novel spatial audio coding (SAC) technique, called analysis by synthesis SAC (AbS-SAC), with a capability of minimising signal distortion introduced during the encoding processes. The reverse one-to-two (R-OTT), a module applied in the MPEG Surround to down-mix two channels as a single channel, is first configured as a closed-loop system. This closed-loop module offers a capability to reduce the quantisation errors of the spatial parameters, leading to an improved quality of the synthesised audio signals. Moreover, a sub-optimal AbS optimisation, based on the closed-loop R-OTT module, is proposed. This algorithm addresses a problem of practicality in implementing an optimal AbS optimisation while it is still capable of improving further the quality of the reconstructed audio signals. In terms of algorithm complexity, the proposed sub-optimal algorithm provides scalability. The results of objective and subjective tests are presented. It is shown that significant improvement of the objective performance, when compared to the conventional open-loop approach, is achieved. On the other hand, subjective tests show that the proposed technique achieves higher subjective difference grade scores than the tested advanced audio coding multichannel.
    No preview · Article · Feb 2014 · IET Signal Processing

Publication Stats

2k Citations
226.24 Total Impact Points

Institutions

  • 2014
    • Loughborough University
      Loughborough, England, United Kingdom
  • 2-2014
    • University of Surrey
      • Department of Electronic Engineering
      • Centre for Communication Systems Research (CCSR)
      • I-Lab Multimedia and DSP Research Group
      • Department of Computing
      Guildford, England, United Kingdom
  • 2011-2013
    • Multimedia Communications Research Laboratory
      Ottawa, Ontario, Canada
  • 2008
    • Deutsche Telekom Laboratories
      Berlin, Berlin, Germany
  • 2004
    • Eastern Mediterranean University
      Magosa, Ammochostos, Cyprus
  • 2001
    • Kocaeli University
      Kocaeli, Kocaeli, Turkey