Quan Huynh-Thu’s research while affiliated with Technicolor and other places


Publications (31)


Fig. 1. (A) Schematic representation of the experimental apparatus used in the study. S3D refers to the stereoscopic 3D display used in the conflict viewing condition and for the initial fixation in the match condition. BS means beam splitter. The second display presents the fixation target for the match viewing condition. To compare the vergence response in both conditions, the second display distance was adjusted on the optical bench to match the 3D stereoscopic distance used in the S3D condition. The disparity amplitude is determined by the convergence angle at initial position (h) minus the angle at stimulus depth (d). In the present case, the disparity amplitude is negative and corresponds to a convergent disparity step. Drawing is not to scale. (B) Stimulus used to measure the vergence response to a disparity step. (a) is the fixation target and (b) is the frame composed of small squares to help maintain stereoscopic fusion. This stimulus introduces relative disparity between the fixation target and the frame. 
Fig. 2. Schematic representation of the disparity step value used in Experiment 1 and portrayal of their relation to the accommodative demand. The panels plot absolute vergence demand as a function of stimulus presentation time (A) and accommodative demand as a function of vergence demand (B). The left figure (A) shows the possible (unsigned) disparity step magnitudes (0.75°, 1°, 1.25° and 1.5°) as a function of trial duration (in seconds). The left ordinate axis represents the corresponding vergence demand scale (MA = Meter Angles). The random step stimulus is represented with a black rectangle with the three possible latencies (0.5, 1 and 1.5 s). The right figure (B) plots the accommodative demand as a function of vergence demand for the possible step stimuli in the conflict condition (white squares) and the match condition (white circles). The main diagonal black line corresponds to the natural vision line. The dark grey zone is the Panum's fusion area (15 arc min, Hoffman et al., 2008), the medium grey zone is the depth-of-focus (0.3 diopter) and the light grey zone corresponds to the Percival's zone of comfort.
Fig. 4. (A) Plot of main sequence of amplitude vs. peak velocity data points, for symmetric vergence step responses for all participants combined (N = 10). The linear regression lines are represented for convergence on the left and divergence on the right in both the match condition (black lines) and the conflict viewing condition (gray lines). The Pearson's correlation coefficients range from 0.80 to 0.89. The black circles correspond to data averaged over participants for the match condition and the grey triangles represent the data for the conflict viewing condition. (B) Left: vergence gain as a function of viewing condition (match vs. conflict) and vergence direction (convergence vs. divergence). Right: mean latency of vergence movements as function of viewing condition and vergence direction. Error bars denote the standard error of means and, where applicable, brackets with accompanying stars indicate significant differences (*p < 0.05 and **p < 0.01).
Fig. 5. Schematic representation of the disparity step values used in the experimental phase of Experiment 2 and their relation to the accommodative demand. (A) shows an example of disparity amplitudes observed during the experimental phase. The disparity steps of Session 1 (small disparity values) are represented in black and those of Session 2 are represented in grey (large disparity values). Negative values correspond to convergence steps and positive values to divergence steps. The stereo-demand implied by the moving-in-depth annulus is also represented with the oscillating pattern both in Session 1 and Session 2. (B) plots the accommodative demand as a function of vergence demand for the possible step stimulus in Session 1 (using small disparity values, white diamonds on the figure) and Session 2 (large disparity values, white squares on the figure). The main diagonal black line corresponds to the natural vision line. The dark grey zone is the Panum's fusion area, the medium grey zone is the depth-of-focus and the light grey zone corresponds to the Percival's zone of comfort.
Fig. 6. Main sequences for one observer, for symmetric vergence responses based on raw data. The upper part of the figure represents the data for Session 1 (small disparity amplitudes) and the lower part represents the data for Session 2 (large disparity amplitudes). The black triangles are data in pre-test and the grey squares represent data in post-test. The linear regression lines are represented for convergence on the left and divergence on the right for both the pre-test (black lines) and the post-test (grey lines). The Pearson's correlation coefficients range from 0.65 to 0.86. 


Effect of the accommodation-vergence conflict on vergence eye movements
  • Article
  • Full-text available

May 2014

·

576 Reads

·

59 Citations

Vision Research


With the broader use of stereoscopic displays, a flurry of research activity about the accommodation-vergence conflict has emerged to highlight the implications for the human visual system. In stereoscopic displays, the introduction of binocular disparities requires the eyes to make vergence movements. In this study, we examined vergence dynamics with regard to the conflict between the stimulus-to-accommodation and the stimulus-to-vergence. In a first experiment, we evaluated the immediate effect of the conflict on vergence responses by presenting stimuli with conflicting disparity and focus on a stereoscopic display (i.e. increasing the stereoscopic demand) or by presenting stimuli with matched disparity and focus using an arrangement of displays and a beam splitter (i.e. focus and disparity specifying the same locations). We found that the dynamics of vergence responses were slower overall in the first case due to the conflict between accommodation and vergence. In a second experiment, we examined the effect of a prolonged exposure to the accommodation-vergence conflict on vergence responses, in which participants judged whether an oscillating depth pattern was in front of or behind the fixation plane. An increase in peak velocity was observed, thereby suggesting that the vergence system has adapted to the stereoscopic demand. A slight increase in vergence latency was also observed, thus indicating a small decline of vergence performance. These findings offer a better understanding of, and document, how the vergence system behaves in stereoscopic displays. We describe what stimuli in stereo-movies might produce these oculomotor effects, and discuss potential application perspectives.
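The Fig. 1 caption defines the disparity amplitude as the convergence angle at the initial position (h) minus the angle at the stimulus depth (d), and Fig. 2 expresses vergence demand in meter angles. A minimal sketch of that arithmetic follows, assuming symmetric fixation straight ahead. The interpupillary distance of 63 mm and the viewing distances are hypothetical values chosen for illustration, not figures from the study, and the function names are likewise illustrative.

```python
import math


def vergence_angle_deg(distance_m, ipd_m=0.063):
    """Convergence angle in degrees for symmetric fixation at distance_m.

    ipd_m is an assumed interpupillary distance (hypothetical default).
    """
    return math.degrees(2.0 * math.atan(ipd_m / (2.0 * distance_m)))


def disparity_step_deg(initial_distance_m, stimulus_distance_m, ipd_m=0.063):
    """Angle at the initial position (h) minus the angle at stimulus depth (d).

    A nearer stimulus gives a negative value, i.e. a convergent step,
    matching the sign convention stated in the Fig. 1 caption.
    """
    angle_h = vergence_angle_deg(initial_distance_m, ipd_m)
    angle_d = vergence_angle_deg(stimulus_distance_m, ipd_m)
    return angle_h - angle_d


def meter_angles(distance_m):
    """Vergence demand in meter angles (MA): reciprocal of distance in meters."""
    return 1.0 / distance_m


if __name__ == "__main__":
    # Hypothetical geometry: fixation at 1.0 m, target steps to 0.8 m.
    print(disparity_step_deg(1.0, 0.8))  # negative: convergent step
    print(meter_angles(0.8))
```

Under this convention a target stepping nearer than the fixation plane yields a negative amplitude and a target stepping farther yields a positive one.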



Visual storytelling in 2D and stereoscopic 3D video: effect of blur on visual attention

March 2013

·

125 Reads

·

4 Citations

Proceedings of SPIE - The International Society for Optical Engineering

Visual attention is an inherent mechanism that plays an important role in the human visual perception. As our visual system has limited capacity and cannot efficiently process the information from the entire visual field, we focus our attention on specific areas of interest in the image for detailed analysis of these areas. In the context of media entertainment, the viewers' visual attention deployment is also influenced by the art of visual storytelling. To this date, visual editing and composition of scenes in stereoscopic 3D content creation still mostly follows those used in 2D. In particular, out-of-focus blur is often used in 2D motion pictures and photography to drive the viewer's attention towards a sharp area of the image. In this paper, we study specifically the impact of defocused foreground objects on visual attention deployment in stereoscopic 3D content. For that purpose, we conducted a subjective experiment using an eyetracker. Our results bring more insights on the deployment of visual attention in stereoscopic 3D content viewing, and provide further understanding on visual attention behavior differences between 2D and 3D. Our results show that a traditional 2D scene compositing approach such as the use of foreground blur does not necessarily produce the same effect on visual attention deployment in 2D and 3D. Implications for stereoscopic content creation and visual fatigue are discussed.


The Influence of Subjects and Environment on Audiovisual Subjective Tests: An International Study

October 2012

·

809 Reads

·

97 Citations

IEEE Journal of Selected Topics in Signal Processing

Traditionally, audio quality and video quality are evaluated separately in subjective tests. Best practices within the quality assessment community were developed before many modern mobile audiovisual devices and services came into use, such as internet video, smart phones, tablets and connected televisions. These devices and services raise unique questions that require jointly evaluating both the audio and the video within a subjective test. However, audiovisual subjective testing is a relatively under-explored field. In this paper, we address the question of determining the most suitable way to conduct audiovisual subjective testing on a wide range of audiovisual quality. Six laboratories from four countries conducted a systematic study of audiovisual subjective testing. The stimuli and scale were held constant across experiments and labs; only the environment of the subjective test was varied. Some subjective tests were conducted in controlled environments and some in public environments (a cafeteria, patio or hallway). The audiovisual stimuli spanned a wide range of quality. Results show that these audiovisual subjective tests were highly repeatable from one laboratory and environment to the next. The number of subjects was the most important factor. Based on this experiment, 24 or more subjects are recommended for Absolute Category Rating (ACR) tests. In public environments, 35 subjects were required to obtain the same Student's t-test sensitivity. The second most important variable was individual differences between subjects. Other environmental factors had minimal impact, such as language, country, lighting, background noise, wall color, and monitor calibration. Analyses indicate that Mean Opinion Scores (MOS) are relative rather than absolute. Our analyses show that the results of experiments done in pristine, laboratory environments are highly representative of those devices in actual use, in a typical user environment.
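The abstract above reports that the number of subjects was the most important factor for Absolute Category Rating tests. One way to see why is to compute a Mean Opinion Score with a confidence interval, whose width shrinks with the square root of the panel size. The sketch below is illustrative only: the ratings are made-up numbers, the function name is hypothetical, and the 1.96 multiplier is a normal approximation rather than the Student's t-test analysis used in the paper.

```python
import math
import statistics


def mean_opinion_score(ratings, z=1.96):
    """Return (MOS, half-width of an approximate 95% confidence interval).

    ratings: per-subject ACR scores for one stimulus (e.g. 1 to 5).
    z: normal-approximation multiplier (an assumption; a t critical
       value would be more appropriate for small panels).
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two ratings to estimate spread")
    mos = statistics.mean(ratings)
    standard_error = statistics.stdev(ratings) / math.sqrt(n)
    return mos, z * standard_error


if __name__ == "__main__":
    # Hypothetical ratings from five subjects for a single clip.
    mos, half_width = mean_opinion_score([5, 4, 4, 3, 4])
    print(f"MOS = {mos:.2f} +/- {half_width:.2f}")
```

With the same per-subject spread, quadrupling the number of subjects halves the interval, which is consistent with panel size dominating test sensitivity.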


Figure 4: spectral measure example
Figure 5: 1° and 16° field observed at the center of display 
55.1: Diversity and Coherence of 3D Crosstalk Measurements

August 2012

·

110 Reads

·

8 Citations

SID Symposium Digest of Technical Papers

3D crosstalk is a major contributor to 3D quality loss and visual fatigue on stereoscopic displays. This paper presents several 3D crosstalk measurement methods and discusses the coherence between methods, towards the derivation of meaningful quality indicators. It also identifies the need of synthetic indicators for complex crosstalk effects.


Physiological-Based Affect Event Detector for Entertainment Video Applications

July 2012

·

75 Reads

·

75 Citations

IEEE Transactions on Affective Computing

In this paper, we propose a methodology to build a real-time affect detector dedicated to video viewing and entertainment applications. This detector combines the acquisition of traditional physiological signals, namely, galvanic skin response, heart rate, and electromyogram, and the use of supervised classification techniques by means of Gaussian processes. It aims at detecting the emotional impact of a video clip in a new way by first identifying emotional events in the affective stream (fast increase of the subject excitation) and then by giving the associated binary valence (positive or negative) of each detected event. The study was conducted to be as close as possible to realistic conditions by especially minimizing the use of active calibrations and considering on-the-fly detection. Furthermore, the influence of each physiological modality is evaluated through three different key-scenarios (mono-user, multi-user and extended multi-user) that may be relevant for consumer applications. A complete description of the experimental protocol and processing steps is given. The performances of the detector are evaluated on manually labeled sequences, and its robustness is discussed considering the different single and multi-user contexts.


The accuracy of PSNR in predicting video quality for different video scenes and frame rates

June 2012

·

2,380 Reads

·

186 Citations

Telecommunication Systems

Peak Signal-to-Noise Ratio (PSNR) is widely used as a video quality metric or performance indicator. Some studies have indicated that it correlates poorly with subjective quality, whilst others have used it on the basis that it provides a good correlation with subjective data. Existing literature seems to provide conflicting evidence of the accuracy of PSNR as a video quality metric. Based on experimental results, we explain a scenario where PSNR provides a reliable indication of the variation of subjective video quality and scenarios where PSNR is not a reliable video quality metric. We show that PSNR follows a monotonic relationship with subjective quality in the case of full frame rate encoding when the video content and codec are fixed. We provide evidence that PSNR becomes an unreliable and inaccurate quality metric when several videos with different content are jointly assessed. Furthermore, PSNR is inaccurate in measuring video quality of a video content encoded at different frame rates because it is not capable of assessing the perceptual trade-off between the spatial and temporal qualities. Finally, where PSNR is not a reliable video quality metric across different video contents and frame rates, we show that a perceptual video model recently approved by the International Telecommunication Union (ITU) provides quality predictions highly correlating with subjective scores even if different video scenes coded at different frame rates are considered in the test set.
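For readers unfamiliar with the metric discussed above, PSNR is conventionally computed from the mean squared error between a reference and a distorted frame as 10·log10(MAX² / MSE). The following sketch uses that standard definition on flat lists of pixel values. The function name, the 8-bit peak of 255, and the sample values are assumptions for illustration and are not taken from the paper.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equal-length pixel lists.

    max_value is the peak pixel value (255 assumes 8-bit samples).
    Returns math.inf when the two inputs are identical.
    """
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    # Hypothetical 2-pixel example with an error of one grey level per pixel.
    print(psnr([100, 100], [101, 99]))
```

Because the value depends only on pixel-wise error, two clips with the same PSNR can differ in perceived quality, which is the limitation the abstract describes for mixed content and frame rates.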


3D cinema to 3DTV content adaptation

February 2012

·

22 Reads

·

1 Citation

Proceedings of SPIE - The International Society for Optical Engineering

3D cinema and 3DTV have grown in popularity in recent years. Filmmakers have a significant opportunity in front of them given the recent success of 3D films. In this paper we investigate whether this opportunity could be extended to the home in a meaningful way. "3D" perceived from viewing stereoscopic content depends on the viewing geometry. This implies that the stereoscopic-3D content should be captured for a specific viewing geometry in order to provide a satisfactory 3D experience. However, although it would be possible, it is clearly not viable to produce and transmit multiple streams of the same content for different screen sizes. To address this problem, we analyze the performance of six different disparity-based transformation techniques, which could be used for cinema-to-3DTV content conversion. Subjective tests are performed to evaluate the effectiveness of the algorithms in terms of depth effect, visual comfort and overall 3D quality. The resultant 3DTV experience is also compared to that of cinema. We show that by applying the proper transformation technique on the content originally captured for cinema, it is possible to enhance the 3DTV experience. The selection of the appropriate transformation is highly dependent on the content characteristics.


Method and Simulation to Study 3D Crosstalk Perception

February 2012

·

17 Reads

·

1 Citation

Proceedings of SPIE - The International Society for Optical Engineering

To various degrees, all modern 3DTV displays suffer from crosstalk, which can lead to a decrease of both visual quality and visual comfort, and also affect perception of depth. In the absence of a perfect 3D display technology, crosstalk has to be taken into account when studying perception of 3D stereoscopic content. In order to improve 3D presentation systems and understand how to efficiently eliminate crosstalk, it is necessary to understand its impact on human perception. In this paper, we present a practical method to study the perception of crosstalk. The approach consists of four steps: (1) physical measurements of a 3DTV, (2) building of a crosstalk surface based on those measurements and representing specifically the behavior of that 3DTV, (3) manipulation of the crosstalk function and application on reference images to produce test images degraded by crosstalk in various ways, and (4) psychophysical tests. Our approach allows both a realistic representation of the behavior of a 3DTV and the easy manipulation of its resulting crosstalk in order to conduct psycho-visual experiments. Our approach can be used in all studies requiring the understanding of how crosstalk affects perception of stereoscopic content and how it can be corrected efficiently.
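Step (3) above, applying a crosstalk function to reference images, can be illustrated with a deliberately simplified model. The sketch below assumes a single constant leakage ratio from the unintended view into the intended one; this is a swapped-in simplification for illustration, not the measured crosstalk surface the paper builds, which would depend on the display's behavior. The function name and pixel values are hypothetical.

```python
def add_crosstalk(intended, unintended, ratio, max_value=255.0):
    """Simulate crosstalk on one view using a constant-ratio leakage model.

    intended:   pixel values of the view meant for this eye
    unintended: pixel values of the other eye's view
    ratio:      fraction of the unintended view that leaks through
                (assumed constant here; real displays vary with level)
    Values are clipped to max_value.
    """
    if len(intended) != len(unintended):
        raise ValueError("views must have the same number of pixels")
    return [min(max_value, i + ratio * u) for i, u in zip(intended, unintended)]


if __name__ == "__main__":
    # Hypothetical 2-pixel views with 10% leakage.
    left = [100, 200]
    right = [50, 250]
    print(add_crosstalk(left, right, 0.1))
```

Sweeping `ratio` over several values would yield a set of test images with graded ghosting, which is the kind of stimulus manipulation the abstract describes for psychophysical testing.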


Towards adapting current 3DTV for an improved 3D experience

February 2012

·

7 Reads

·

2 Citations

Proceedings of SPIE - The International Society for Optical Engineering

Recent upgrades of HDTV into 3DTV resulted in impairments in displaying stereo contents. One of the most critical flaws is probably crosstalk and the resultant ghosting effect impairing the 3D experience. The purpose of this study is to identify the primary source of crosstalk, throughout the full image generation and viewing chain, for a selection of 3D displays: Liquid Crystal Display (LCD) and Plasma Display Panel (PDP) combined with different active glasses technologies. Time measurements have been carried out on various display panels and shutter glasses technologies. For each technology, the crosstalk is a complex combination of several factors depending on display panels, shutter glasses and their synchronization, and ghost busting. The study tried to discriminate the main sources of crosstalk in each case, and to simulate the effect of various display panels or shutter glasses performance optimizations. Analysis and conclusions vary depending on the display technology. For LCD, light leakage at the panel level appears the first cause of crosstalk, and, in a second step, optimization of the shutter glasses. For PDP the use of more adapted shutter glasses can mitigate color distortion effects.


Citations (25)


... Human perception is known to be affected by low-level features, semantic information, personal preferences and personal attributes [1] [2]. Subjective quality assessment experiments conducted in controlled environments provide a way to understand the aforementioned factors. ...

Reference:

Effect of Primitive Features of Content on Perceived Quality of Light Field Visualization
Multimedia Quality assessment

... For instance, in two-stage object detection algorithms, the simplest AlexNet [30] requires 61 million parameters, 731 million floating-point operations, and 233 MB of memory. And even though some simplified one-stage object recognition methods could meet real-time requirements on some edge devices (FPS ≥ 25 [31]), there is still potential to reduce computational complexity and model size, providing more possibilities for offline deployment on terminal devices. In underwater target recognition, electromagnetic waves experience significant attenuation in the underwater environment, making it difficult to transmit data to servers for processing. ...

Perceived quality of the variation of the video temporal resolution for low bit rate coding

... There have been advances in understanding which demands VACs place on the vergence and accommodation systems from studies using other types of stereoscopic displays [7,12,13,14,15]. Even so, applying these results to AR-HMDs is not trivial since viewing in AR is fundamentally different from other types of stereoscopic displays by being more complex and may be affecting visual function differently [16,17]. ...

Effect of the accommodation-vergence conflict on vergence eye movements

Vision Research

... Table III shows the same analysis, using subjective data instead of objective data. It performs lab-to-lab comparisons using experiments that were conducted by six or more international labs: the common set from VQEG-HDTV [31] and VQEG-MM2 [6], [33]. We made lab-to-lab pairwise comparisons and then picked the median over all pairs. ...

Subjective and objective evaluation of an audiovisual subjective dataset for research and development
  • Citing Conference Paper
  • July 2013

... When the vergence angle changes due to shifts in visual attention within a shot, vergence movements are under the control of the viewer, and are driven by the understanding of the narrative and other cues in the visual scene (e.g. blur can influence where we look, Huynh-Thu, Vienne & Blondé, 2013). When a cut occurs, in contrast, the new fixation point will generally occur toward the position that is nearest the screen position of the previous fixation. ...

Visual storytelling in 2D and stereoscopic 3D video: effect of blur on visual attention

Proceedings of SPIE - The International Society for Optical Engineering

... Video quality perception is a complex subjective process that involves assessing and interpreting visual stimuli. When perceiving visual quality, humans can simultaneously obtain a variety of perceptual information [7] [8]. Our brains can easily establish connections between different modalities during perception, thus gaining a more comprehensive understanding of things [9]. ...

The Influence of Subjects and Environment on Audiovisual Subjective Tests: An International Study

IEEE Journal of Selected Topics in Signal Processing

... Although substantial efforts have been expended to model overt attention, the effect and mechanism of stereovision on saliency have not been fully examined. Reports investigating how depth information affects human eye movements (Gautier & Le Meur, 2012; Huynh-Thu & Schiatti, 2011; Jansen et al., 2009; Khaustova et al., 2013; Lang et al., 2012) showed that humans tend to fixate similar locations in situations with or without binocular information. More specifically, the fixation locations are almost the same for 3D and 2D images in long observation windows (20 seconds) but different in short observation windows (about four or five seconds) (Gautier & Le Meur, 2012; Jansen et al., 2009; Khaustova et al., 2013). ...

Examination of 3D visual attention in stereoscopic video content
  • Citing Article
  • February 2011

Proceedings of SPIE - The International Society for Optical Engineering

... Answering this question depends on the physical properties of the generated visual stimuli related to the display technology and the specifics of human visual perception 3 . Therefore, assessing the ergonomics of 3D presenter tools has become essential in terms of depth perception and deployment of visual attention 4,5 . Visual search in 3D environment depends on depth perception 6,7 because it relies on our ability to selectively attend to certain features or objects in a visual scene and perceiving them, moreover, the depth cues have an essential role in depth judgment, and applying more depth cues results in more direct depth perception 8,9 . ...

The Importance of Visual Attention in Improving the 3DTV Viewing Experience: Overview and New Perspectives

IEEE Transactions on Broadcasting

... However, when optimizing test suites, there are risks associated with the possibility that defects will be contained in fragments that have been removed from the test suite. Besides, there are various laboratory factors, including, for example, screen size, lighting, viewer-to-screen distance, and so on [8]. ...

Subjective video quality evaluation for multimedia applications - art. no. 60571D
  • Citing Conference Paper
  • January 2006

Proceedings of SPIE - The International Society for Optical Engineering

... These developments have also contributed to the reduction in temporal artifacts such as motion blur, flickering, and stuttering [8]. Several studies have examined the impact of reduced temporal resolution on Quality of Experience (QoE) [9], particularly in relation to issues such as jitter and jerkiness [10]. In cases when network conditions required the use of lower bitrates, the QoE was found to be higher when video sequences were encoded using lower SR compared to video sequences with lower temporal resolution (TR) [11]. ...

Impact of jitter and jerkiness on perceived video quality