João Ascenso
University of Lisbon | UL · Instituto Superior Técnico

Ph.D.

About

142
Publications
16,411
Reads
3,324
Citations
Additional affiliations
January 2008 - present
January 2007 - present
Polytechnic Institute of Lisbon
January 2005 - December 2009
Instituto Politécnico de Lisboa
Position
  • Instituto Superior de Engenharia de Lisboa

Publications (142)
Preprint
This short paper proposes a new database, NeRF-QA, containing 48 videos synthesized with seven NeRF-based methods, along with their perceived quality scores, resulting from subjective assessment tests; for the video selection, both real and synthetic 360-degree scenes were considered. This database will allow evaluating the suitability, to Ne...
Article
The Joint Photographic Experts Group (JPEG) AI learning-based image coding system is an ongoing joint standardization effort between International Organization for Standardization (ISO), International Electrotechnical Commission (IEC), and International Telecommunication Union - Telecommunication Sector (ITU-T) for the development of the first imag...
Article
This is an updated report of the work of the JPEG Committee formed by the International Telecommunication Union (ITU), the International Organization for Standardization (ISO), and the International Electrotechnical Commission (IEC), known as the Joint Photographic Experts Group (JPEG). A detailed introduction to JPEG’s work was already presente...
Article
Full-text available
To visualize omnidirectional (or 360°) visual content, a sphere-to-plane projection is employed that maps pixels from the observed sphere region to a 2D image, called the viewport. However, this projection introduces geometrical distortions on the rendered image, such as object shape stretching, or shearing, and bending of straight lines, which...
Article
Full-text available
Point cloud coding solutions have been recently standardized to address the needs of multiple application scenarios. The design and assessment of point cloud coding methods require reliable objective quality metrics to evaluate the level of degradation introduced by compression or any other type of processing. Several point cloud objective quality...
Preprint
Full-text available
Point cloud coding solutions have been recently standardized to address the needs of multiple application scenarios. The design and assessment of point cloud coding methods require reliable objective quality metrics to evaluate the level of degradation introduced by compression or any other type of processing. Several point cloud objective quality...
Preprint
Full-text available
Point cloud coding solutions have been recently standardized to address the needs of multiple application scenarios. The design and assessment of point cloud coding methods require reliable objective quality metrics to evaluate the level of degradation introduced by compression or any other type of processing. Several point cloud objective quality...
Preprint
Full-text available
Point clouds (PCs) are a powerful 3D visual representation paradigm for many emerging application domains, especially virtual and augmented reality, and autonomous vehicles. However, the large amount of PC data required for highly immersive and realistic experiences makes the availability of efficient, lossy PC coding solutions critical. Rec...
Article
Full-text available
Recently, data-driven algorithms such as deep neural networks have attracted a lot of attention and have become a popular area of research and development. This interest is driven by several factors, such as recent advances in processing power (cheap and powerful hardware), the availability of large datasets (big data), and several small but import...
Article
Recently, point clouds have been shown to be a promising way to represent 3D visual data for a wide range of immersive applications, from augmented reality to autonomous cars. Emerging imaging sensors have made it easier to perform richer and denser point cloud acquisition, notably with millions of points, thus raising the need for efficient point cloud...
Conference Paper
An increased interest in immersive applications has drawn attention to emerging 3D imaging representation formats, notably light fields and point clouds (PCs). Nowadays, PCs are one of the most popular 3D media formats, due to recent developments in PC acquisition, namely with new depth sensors and signal processing algorithms. To obtain high fidel...
Article
Full-text available
Due to the increasing shortage of fish in the seas, Vessel Monitoring Systems (VMS) play a very important role in fishing activity monitoring, control and surveillance. In this context, the detection of fishing activities in prohibited zones is a critical task. Although position, speed and other information are provided by the VMS system, detecting...
Article
Nowadays, point clouds (PCs) are a promising representation format for immersive content and target several emerging applications, notably in virtual and augmented reality. However, efficient coding solutions are critically needed due to the large amount of PC data required for high quality user experiences. To address these needs, several PC codin...
Preprint
Full-text available
An increased interest in immersive applications has drawn attention to emerging 3D imaging representation formats, notably light fields and point clouds (PCs). Nowadays, PCs are one of the most popular 3D media formats, due to recent developments in PC acquisition, namely with new depth sensors and signal processing algorithms. To obtain high fidel...
Article
Humans mainly communicate among themselves and with the world around them using light and vision, thus implying that visual representation technologies play a central role in human societies. While visual representation has been based on the 2D representation paradigm for many decades, multiple developments are nowadays pressing towards the adoption of m...
Preprint
Full-text available
Reliable quality assessment of decoded point cloud geometry is essential to evaluate the compression performance of emerging point cloud coding solutions and guarantee some target quality of experience. This paper proposes a novel point cloud geometry quality assessment metric based on a generalization of the Hausdorff distance. To achieve this goa...
Article
In recent years, visual sensors have been quickly improving, notably targeting richer acquisitions of the light present in a visual scene. In this context, the so-called lenslet light field (LLF) cameras are able to go beyond the conventional 2D visual acquisition models, by enriching the visual representation with directional light measures for...
Article
To render omnidirectional (or 360°) visual content, a projection that maps the pixels from a portion of the viewing sphere to a 2D plane must be employed; this projection creates the viewport image shown to the user and thus has an important role in the quality of experience for this type of content. However, a sphere to planar projection always i...
Preprint
Full-text available
Recently, point clouds have been shown to be a promising way to represent 3D visual data for a wide range of immersive applications, from augmented reality to autonomous cars. Emerging imaging sensors have made it easier to perform richer and denser point cloud acquisition, notably with millions of points, thus raising the need for efficient point cloud co...
Article
In this paper, a novel quality metric to evaluate depth-based synthesized views is proposed. This metric relies on a hybrid approach that uses features extracted in different phases of the image synthesis procedure, namely from the bitstream, from intermediate data produced during the synthesis process, and from the final synthesized view; these fe...
Conference Paper
Omnidirectional video, also known as 360° video, is becoming quite popular since it provides a more immersive and natural representation of the real world. However, to fulfill the expectation of a high quality-of-experience (QoE), the video content delivered to the end users must also have high quality. To automatically evaluate the video quality...
Conference Paper
Full-text available
Omnidirectional video, also known as 360° video, is becoming quite popular since it provides a more immersive and natural representation of the real world. However, to fulfill the expectation of a high quality-of-experience (QoE), the video content delivered to the end users must also have high quality. To automatically evaluate the video quality,...
Article
Recently, 3D visual representation models such as light fields and point clouds are becoming popular due to their capability to represent the real world in a more complete and immersive way, paving the road for new and more advanced visual experiences. The point cloud representation model is able to efficiently represent the surface of objects/scen...
Article
Full-text available
Holography is an emerging technology to represent and display visual information with high expectations in terms of user experience. A hologram is a reproduction of a light field represented through the interference pattern between two wavefields, the reference and the object wavefields. Whatever their creation process, holograms may have a digital...
Conference Paper
Popular local feature extraction schemes, such as SIFT, are robust when changes in illumination, translation and scale occur, and play an important role in visual content retrieval. However, these solutions are not very robust to 3D object rotations and camera viewpoint changes. In such scenarios, the emerging and richer lenslet light field image r...
Article
The emerging Scalable HEVC (SHVC) video coding standard provides an efficient solution for transmission of video over heterogeneous and time dynamic networks, terminals, and usage environments. The encoding complexity and the error sensitivity associated with the efficient HEVC coding tools adopted in SHVC make this scalable codec less attractive to...
Article
The GreenEyes project aims at developing a comprehensive set of new methodologies, practical algorithms and protocols, to empower wireless sensor networks with vision capabilities. The key tenet of this research is that most visual analysis tasks can be carried out based on a succinct representation of the image, which entails both global and local...
Conference Paper
Full-text available
In the HEVC scalable extension (SHVC), the merge mode prediction plays an important role due to its high selection probability and its low associated bitrate. In the SHVC enhancement layer (EL) merge mode prediction, the motion information is selected from the merge candidates, in this case the spatial, temporal, and inter-layer candidates. Th...
Article
Considering the scarce resources in visual sensor networks, video coding solutions are necessary to address these constraints, namely the limited energy, memory, and bandwidth. The distributed video coding (DVC) paradigm is able to address these limitations by shifting most of the complexity to the decoder, typically a central location with a signi...
Article
The growing heterogeneity and dynamic nature of the networks, terminals, and usage environments has boosted the need for powerful scalable video coding engines able to efficiently adapt to changing consumption conditions. Some emerging applications such as video surveillance, visual sensor networks, and remote space transmission require scalable co...
Article
Local features represent a powerful tool which is exploited in several applications such as visual search, object recognition and tracking, etc. In this context, binary descriptors provide an efficient alternative to real-valued descriptors, due to low computational complexity, limited memory footprint and fast matching algorithms. The descriptor c...
Article
This demo showcases some of the results obtained by the GreenEyes project, whose main objective is to enable visual analysis on resource-constrained multimedia sensor networks. The demo features a multi-hop visual sensor network operated by BeagleBone Linux computers with IEEE 802.15.4 communication capabilities, and capable of recognizing and tra...
Conference Paper
Substantial rate-distortion (RD) gains have been achieved in video coding standards by increasing the encoder complexity while keeping the decoder complexity as low as possible. On the other hand, the alternative distributed video coding (DVC) approach proposes to exploit the video redundancy mostly at the decoder side, keeping the encoder as...
Conference Paper
Full-text available
In a landscape of heterogeneous networks, terminals and usage environments, a low encoding complexity scalable video coding engine is required for many emerging applications such as wireless video surveillance, visual sensor networks and remote space transmission. To fulfil this need, a distributed scalable video coding (DSVC) framework has been de...
Article
Full-text available
Distributed video coding (DVC) is a coding paradigm which exploits the redundancy of the source (video) at the decoder side, as opposed to predictive coding, where the encoder leverages the redundancy. To exploit the correlation between views, multiview predictive video codecs require the encoder to have the various views available simultaneously....
Article
As high dynamic range video is gaining popularity, video coding solutions able to efficiently provide both low and high dynamic range video, notably with a single bitstream, are increasingly important. While simulcasting can provide both dynamic range videos at the cost of some compression efficiency penalty, bit-depth scalable video coding can pro...
Article
This demo showcases some of the results obtained by the GreenEyes project, whose main objective is to enable visual analysis on resource-constrained multimedia sensor networks. The demo features a multi-hop visual sensor network operated by BeagleBone Linux computers with IEEE 802.15.4 communication capabilities, and capable of recognizing and tra...
Conference Paper
Full-text available
The growing heterogeneity of networks, devices and consumption conditions asks for flexible and adaptive video coding solutions. The compression power of the HEVC standard and the benefits of the distributed video coding paradigm allow designing novel scalable coding solutions with improved error robustness and low encoding complexity while still a...
Conference Paper
Full-text available
Distributed Video Coding (DVC) is a video coding paradigm where the source statistics are exploited at the decoder based on the availability of Side Information (SI). In a monoview video codec, the SI is generated by exploiting the temporal redundancy of the video, through motion estimation and compensation techniques. In a multiview scenario, the...
Article
In video communication systems, the video signals are typically compressed and sent to the decoder through an error-prone transmission channel that may corrupt the compressed signal, causing the degradation of the final decoded video quality. In this context, it is possible to enhance the error resilience of typical predictive video coding schemes...
Article
Full-text available
The distributed video coding (DVC) paradigm is based on two well-known information theory results: the Slepian-Wolf and Wyner-Ziv theorems. In a DVC codec, the video signal correlation is mostly exploited at the decoder, providing a flexible distribution of the computational complexity between the encoder and the decoder and error robustness to cha...
Conference Paper
In a heterogeneous landscape of networks, devices and consumption environments, scalability is one of the most important video coding features. To achieve higher scalable video compression efficiency, this paper proposes a novel scalable video coding framework based on predictive video coding but also exploiting some additional decoder side informa...
Conference Paper
Nowadays, local feature descriptors have emerged as one of the most promising and powerful visual representation solutions. In fact, with a minimal amount of computational effort, the detection and extraction of visual features can provide a reliable and compact image representation that enables a rich set of image analysis tasks. In this paper, a...
Conference Paper
Binary descriptors have recently emerged as low-complexity alternatives to state-of-the-art descriptors such as SIFT. The descriptor is represented by means of a binary string, in which each bit is the result of the pair-wise comparison of smoothed pixel values properly selected in a patch around each keypoint. Previous works have focused on the co...
Article
Video coding technologies have played a major role in the explosion of large market digital video applications and services. In this context, the very popular MPEG-x and H-26x video coding standards adopted a predictive coding paradigm, where complex encoders exploit the data redundancy and irrelevancy to ‘control’ much simpler decoders. This codec...
Conference Paper
Full-text available
Several visual feature extraction algorithms have recently appeared in the literature, with the goal of reducing the computational complexity of state-of-the-art solutions (e.g., SIFT and SURF). Therefore, it is necessary to evaluate the performance of these emerging visual descriptors in terms of processing time, repeatability and matching accurac...
Conference Paper
Nowadays, visual sensor networks have emerged as an important research area for distributed signal processing, with unique challenges in terms of performance, complexity, and resource allocation. In visual sensor networks, the energy consumption must be kept low to extend the lifetime of each battery-operated camera node. Thus, considering the larg...
Conference Paper
Full-text available
With the emergence of the High Efficiency Video Coding (HEVC) standard, significant additional video compression improvements have recently been provided, notably around 50% higher compression performance than the previously available standard video coding solutions. Moreover, the same requirements which demanded some years ago for a scalab...
Article
Full-text available
Frame rate upconversion (FRUC) is an important post-processing technique to enhance the visual quality of low frame rate video. A major, recent advance in this area is FRUC based on trilateral filtering, whose novelty mainly derives from the combination of an edge-based motion estimation block matching criterion with the trilateral filter. However,...
Conference Paper
3D and free viewpoint video (FVV) are new media types able to provide the user an experience far beyond what is currently offered by traditional 2D video systems. Among the possible formats, the so-called multi-view plus depth format has the potential to synthesize at the decoder more views than those explicitly coded at the encoder. These addition...
Conference Paper
Full-text available
Although major achievements have been reached in terms of video compression efficiency, additional gains are still needed to satisfy the needs of current and emerging applications. This trend justifies the continuous efforts to go beyond the compression capabilities of the state-of-the-art H.264/AVC standard. This paper proposes to further bridge two vide...
Conference Paper
In video streaming, bitstreams are often transmitted in best-effort IP networks where impairments such as congestion and varying delay often cause artifacts in the decoded video. In this scenario, packet losses should be detected as early as possible in the transmission chain, preferably inside the network, where perceptual video quality metrics ar...
Conference Paper
Full-text available
The so-called DIRECT coding mode plays an important role in the RD performance of predictive video coding standards such as H.264/AVC and MPEG-4 because there is typically a large probability that the DIRECT mode is selected in B-slices by the rate-distortion optimization (RDO) process. Although the current H.264/AVC DIRECT coding procedure exp...