
Patrick Le Callet
Nantes Université | UNIV Nantes · LS2N UMR 6004, Polytech Nantes
PhD, HDR
627 Publications
99,744 Reads
13,372 Citations
Introduction
Additional affiliations
September 2003 - present
Publications (627)
Image composition involves extracting a foreground object from one image and pasting it into another image through image harmonization algorithms (IHAs), which aim to adjust the appearance of the foreground object to better match the background. Existing image quality assessment (IQA) methods may fail to align with human visual preference on image...
With the rapid development of eXtended Reality (XR), egocentric spatial shooting and display technologies have further enhanced immersion and engagement for users. Assessing the quality of experience (QoE) of egocentric spatial videos is crucial to ensure a high-quality viewing experience. However, the corresponding research is still lacking. In th...
With the development of eXtended Reality (XR), head-mounted shooting and display technologies have advanced significantly and gained considerable attention. Egocentric spatial images and videos are emerging as a compelling form of stereoscopic XR content. Unlike traditional 2D images, egocentric spatial images present challenges...
Omnidirectional media formats, particularly 360° videos with spatial audio, provide new immersive experiences and introduce a novel dimension to content consumption. We explore the relationship between subjective data quality and metric performance evaluation in the context of omnidirectional videos with spatial audio. While methodologies for 360° v...
Traditional passive point cloud acquisition systems, such as lidars or stereo cameras, can be impractical in real-life and industrial use cases. Firstly, some extreme environments may preclude the use of these sensors. Secondly, they capture information from the entire scene instead of focusing on areas relevant to the end task, such as object reco...
Deploying a multi-stream fusion strategy for behavior recognition from skeletal data can extract complementary features from different information streams and improve recognition accuracy, but it suffers from high model complexity and a large number of parameters. Besides, existing multi-stream methods using a fixed adjacency matrix homogen...
Point clouds have become increasingly prevalent in representing 3D scenes within virtual environments, alongside 3D meshes. Their ease of capture has facilitated a wide array of applications on mobile devices, from smartphones to autonomous vehicles. Notably, point cloud compression has reached an advanced stage and has been standardized. However,...
Unsupervised 3D pose estimation has gained prominence due to the challenges in acquiring labeled 3D data for training. Despite promising progress, unsupervised approaches still lag behind supervised methods in performance. Two factors impede the progress of unsupervised approaches: incomplete geometric constraint and inadequate interaction among sp...
Recent work on the acceptability of solutions integrating artificial intelligence (AI) in the workplace focuses on human factors and on factors intrinsic to the tool as determinants of adoption by employees. In this article, we trace the deployment of professional tools based...
In video streaming applications, a fixed set of bitrate-resolution pairs (known as a bitrate ladder) is typically used during the entire streaming session. However, an optimized bitrate ladder per scene may result in (i) decreased storage or delivery costs and/or (ii) increased Quality of Experience. This paper introduces a Just Noticeable Differen...
In the past years, AI has seen many advances in the field of NLP. This has led to the emergence of LLMs, such as the now famous GPT-3.5, which revolutionise the way humans can access or generate content. Current studies on LLM-based generative tools are mainly interested in the performance of such tools in generating relevant content (code, text or...
Over the past decade, 3D graphics have become highly detailed to mimic the real world, exploding their size and complexity. Certain applications and device constraints necessitate their simplification and/or lossy compression, which can degrade their visual quality. Thus, to ensure the best Quality of Experience (QoE), it is important to evaluate t...
In this chapter, we present our previous study of few-shot pill recognition [1] as a case study to demonstrate how few-shot/meta learning could be applied for medical use-cases. Pill image recognition is vital for many personal/public healthcare applications and should be robust to diverse unconstrained real-world conditions. Most existing pill rec...
Screen content, which is often computer-generated, has many characteristics distinctly different from conventional camera-captured natural scene content. These differences pose major challenges for the corresponding content quality assessment, which plays a critical role in ensuring and improving the final user-perceived quality of exper...
With the development of multimedia technology, Augmented Reality (AR) has become a promising next-generation mobile platform. The primary value of AR is to promote the fusion of digital contents and real-world environments. However, studies on how this fusion will influence the Quality of Experience (QoE) of these two components are lacking. To ach...
Due to the complex and volatile lighting environment, underwater imaging can be readily impaired by light scattering, warping, and noise. To improve the visual quality, Underwater Image Enhancement (UIE) techniques have been widely studied. Recent efforts have also been devoted to evaluating and comparing UIE performance with subjective and objec...
Feature extraction from Difference of Closings (DoC) bands at multiple scales and multiple resolution levels
Emotions, and consequently facial expressions, play an essential role in communication, and thus in everyday life. With the increase in human-machine interactions, especially in multimedia applications, automatic recognition of facial expressions has emerged as a challenging task, particularly under naturalistic conditions. In the present...
The causal relationship between olfactory perception and human emotions has been widely studied and accepted by various fields including, but not limited to, health, marketing, and multimedia. In this work-in-progress paper, we present an olfactive, interactive and immersive experience taking place during the World Creativity & Innovation Week in N...
Emotions are fundamental to human experience, as they impact our cognition, perception, and daily tasks (e.g., communication). The ACM IMX'22 Emotion workshop aims to bring together researchers and practitioners from various fields (including, but not limited to, computer science, design, and cognitive science) to discuss challenges in crafting and...
The human eye cannot perceive small pixel changes in images or videos below a certain distortion threshold. In the context of video compression, Just Noticeable Difference (JND) is the smallest distortion level at which the human eye can perceive the difference between the reference video and the distorted/compressed one. Satisfied-User-Ratio (SUR...
Images synthesized using Depth-Image-Based Rendering (DIBR) techniques are characterized by complex structural distortion. A multi-resolution, multi-scale sparse image representation generated using the morphological Difference of Closings (DoC) operator is used to efficiently capture structure-related distortion of synthesized images in the no-reference...
Immersive geospatial visualization finds increasing application for navigation, exploration, and analysis. Many such applications require the display of data at different scales, often in views with three-dimensional geometry. Multi-view solutions, such as focus+context, overview+detail, and distorted projections can show different scales at the same time, and...
The Just Noticeable Difference (JND) model, developed based on the Human Vision System (HVS) through subjective studies, is valuable for many multimedia use cases. In the streaming industry, it is commonly applied to reach a good balance between compression efficiency and perceptual quality when selecting video encoding recipes. Nevertheless, recent state-...
To open up new possibilities to assess the multimodal perceptual quality of omnidirectional media formats, we proposed a novel open source 360 audiovisual (AV) quality dataset. The dataset consists of high-quality 360 video clips in equirectangular (ERP) format and higher-order ambisonic (4th order) along with the subjective scores. Three subjectiv...
Widespread image applications have greatly promoted vision-based tasks, in which Image Quality Assessment (IQA) has become an increasingly significant issue. For user enjoyment in multimedia systems, IQA exploits image fidelity and aesthetics to characterize user experience; while for other tasks such as popular object rec...