Comparison between geometry compression codecs in (Quach et al., 2020b) and MPEG G-PCC v10.0 (MPEG, 2021c). Learning-based approaches can achieve significantly lower distortions at equivalent bitrates.

Source publication
Article
Full-text available
Point clouds are becoming essential in key applications, with advances in capture technologies leading to large volumes of data. Compression is thus essential for storage and transmission. In this work, the state of the art for geometry and attribute compression methods, with a focus on deep learning-based approaches, is reviewed. The challenges faced...

Context in source publication

Context 1
... probabilities predicted by the entropy model can then be used by an arithmetic coder to encode and decode the latent space. Quach et al. (2019) introduce the use of CNNs for point cloud geometry compression, and in (Quach et al., 2020b) the proposed approach significantly outperforms MPEG G-PCC (MPEG, 2021c), as shown in Figure 6. The decoding of a point cloud can be cast as a binary classification problem in a voxel grid. ...
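As a rough illustration of that last point (a minimal NumPy sketch under assumed shapes, not the code from Quach et al.), occupancy probabilities produced by a decoder can be thresholded into a set of voxel coordinates:

```python
import numpy as np

def decode_voxel_grid(prob_grid: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Binary classification per voxel: keep voxels whose predicted occupancy
    probability exceeds the threshold and return their integer coordinates."""
    occupied = prob_grid > threshold      # per-voxel binary decision
    return np.argwhere(occupied)          # (N, 3) voxel coordinates

# Toy usage: a random stand-in for the decoder's sigmoid output on a 64^3 grid.
rng = np.random.default_rng(0)
probs = rng.random((64, 64, 64))
points = decode_voxel_grid(probs, threshold=0.99)
print(points.shape)                       # roughly 1% of voxels classified occupied
```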

Similar publications

Preprint
Full-text available
3D face recognition systems have been widely employed in intelligent terminals, among which structured light imaging is a common method to measure the 3D shape. However, this method could be easily attacked, leading to inaccurate 3D face recognition. In this paper, we propose a novel, physically-achievable attack on the fringe structured light syst...

Citations

... In recent years, we have observed substantial progress in employing deep learning techniques to improve visual data compression [1], [2], [3]. By utilizing data-driven learning in an end-to-end fashion, these methods significantly surpass traditional, rule-based approaches for compressing various types of data, including images [4], [5], [6], videos [7], [8], [9], and point clouds [10], [11], among others. ...
... As in (10), the decoded features comprise the information from the encoded latent residual and the spatial and temporal references. In theory, the full information in the past frames can be utilized progressively, scale by scale. Proper upsampling is devised to ensure resolution consistency. ...
Preprint
The enhanced Deep Hierarchical Video Compression codec, DHVC 2.0, is introduced. This single-model neural video codec operates across a broad range of bitrates, delivering not only compression performance superior to representative methods but also impressive complexity efficiency, enabling real-time processing with a significantly smaller memory footprint on standard GPUs. These advancements stem from the use of hierarchical predictive coding. Each video frame is uniformly transformed into multiscale representations through hierarchical variational autoencoders. For a specific scale's feature representation of a frame, the corresponding latent residual variables are generated by referencing lower-scale spatial features from the same frame and are then conditionally entropy-encoded using a probabilistic model whose parameters are predicted from the same-scale temporal reference in previous frames and the lower-scale spatial reference of the current frame. This feature-space processing operates from the lowest to the highest scale of each frame, completely eliminating the need for the complexity-intensive motion estimation and compensation techniques that have been standard in video codecs for decades. The hierarchical approach facilitates parallel processing, accelerating both encoding and decoding, and supports transmission-friendly progressive decoding, making it particularly advantageous for networked video applications in the presence of packet loss. Source code will be made available.
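To make the coarse-to-fine idea concrete, here is a deliberately simplified sketch of hierarchical predictive coding in feature space: each scale is predicted by upsampling the reconstruction of the scale below it, and only the residual is retained. It is a generic Laplacian-pyramid-style toy, not DHVC 2.0 itself; it omits the learned autoencoders, the temporal references, and the conditional entropy model.

```python
import numpy as np

def downsample(x):
    """2x2 average pooling (toy stand-in for a learned analysis transform)."""
    return x.reshape(x.shape[0] // 2, 2, x.shape[1] // 2, 2).mean(axis=(1, 3))

def upsample(x):
    """Nearest-neighbour upsampling (toy stand-in for a learned synthesis transform)."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def encode_hierarchy(frame, num_scales=3):
    """Coarse-to-fine predictive coding: each scale is predicted from the
    reconstruction of the scale below and only the residual is kept."""
    pyramid = [frame]
    for _ in range(num_scales - 1):
        pyramid.append(downsample(pyramid[-1]))
    residuals = [pyramid[-1]]                  # coarsest scale sent as-is
    recon = pyramid[-1]
    for scale in reversed(pyramid[:-1]):
        prediction = upsample(recon)           # lower-scale spatial reference
        residuals.append(scale - prediction)   # entropy-coded in a real codec
        recon = prediction + residuals[-1]
    return residuals

def decode_hierarchy(residuals):
    recon = residuals[0]
    for res in residuals[1:]:
        recon = upsample(recon) + res
    return recon

frame = np.random.default_rng(1).random((64, 64)).astype(np.float32)
assert np.allclose(decode_hierarchy(encode_hierarchy(frame)), frame)
```

In the actual codec, each residual would additionally be entropy-coded with parameters predicted from both the lower-scale spatial reference and the same-scale temporal reference.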
... The simplest form of this data is a set of collected points called a point cloud [1]. The term "point cloud" refers to a collection of points consisting of x, y, z coordinates and associated properties such as color, normals, and reflectance [2]. The geometry, or position of individual points, and the attributes, or other data associated with each of those points, can be separated into two halves, or parts, of the point cloud. ...
... Those points may also be projected onto 2D planes using fitted or predetermined surfaces (such as axis-aligned planes). The point cloud becomes a point sequence when the dimensionality is reduced to one dimension [2]. ...
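As a toy illustration of such an axis-aligned 2D projection (a NumPy sketch with assumed shapes and resolution, not the method of reference [2]), points can be rasterized into a simple depth map:

```python
import numpy as np

def project_to_depth_map(points, resolution=64):
    """Project an (N, 3) point cloud onto the axis-aligned XY plane,
    keeping the smallest z per pixel (a simple depth map)."""
    mins, maxs = points.min(axis=0), points.max(axis=0)
    scale = (resolution - 1) / np.maximum(maxs[:2] - mins[:2], 1e-9)
    pix = np.floor((points[:, :2] - mins[:2]) * scale).astype(int)
    depth = np.full((resolution, resolution), np.inf)
    for (u, v), z in zip(pix, points[:, 2]):
        depth[u, v] = min(depth[u, v], z)      # keep the closest point per cell
    return depth

pts = np.random.default_rng(2).random((1000, 3))
depth_map = project_to_depth_map(pts)
print(np.isfinite(depth_map).mean())           # fraction of occupied pixels
```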
Article
Full-text available
High-density point clouds that express attractive 3D images are attracting attention. These gigantic media require large bandwidth allocations, making them problematic to stream to resource-constrained hand-held devices. This paper proposes a method for point cloud compression and streaming of large point clouds using a web server. Storing large point cloud videos on a web server allows users to publish data sets without using additional applications or sending large amounts of data ahead of time. HTTP/2 improves transfer efficiency by compressing headers into binary format and reduces latency by sending many multiplexed requests over a single TCP session. The large size of the point cloud data and the current limitations of wireless channels demand point cloud compression methods. MATLAB is used for the analysis, and results show an average compression ratio of 30:1. Our proposed system achieves lower average latency between successive frames, increasing the frame rate to 72.46 fps when streaming 4M points and 82.51 fps when streaming 800K points, compared to much lower rates in conventional work.
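For context, the sketch below shows how a client might reuse one HTTP/2 connection so that many point cloud frame requests share a single TCP session; the endpoint, file naming, and use of the httpx library are assumptions for illustration, not details from the paper.

```python
# Requires: pip install "httpx[http2]"
import httpx

# Hypothetical endpoint and file naming; the paper's actual server layout
# is not specified here.
BASE_URL = "https://example.com/pointclouds/sequence"

def fetch_frames(num_frames: int) -> list[bytes]:
    """Reuse one HTTP/2 connection so that all frame requests share a single
    TCP session, as described in the abstract."""
    frames = []
    with httpx.Client(http2=True) as client:
        for i in range(num_frames):
            resp = client.get(f"{BASE_URL}/frame_{i:04d}.bin")
            resp.raise_for_status()
            frames.append(resp.content)        # compressed frame payload
    return frames
```

This loop issues requests sequentially; the multiplexing benefit described in the abstract is most visible when requests are issued concurrently, for example with httpx.AsyncClient.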
... Recently, many deep learning-based point cloud downsampling methods have been proposed. Geometric deep learning-based point cloud compression employs techniques, such as local feature extraction and local feature aggregation, to achieve point cloud compression [32]. K-means clustering was first used to select representative points and to remove redundant points [33]. ...
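A minimal sketch of the K-means-based selection of representative points mentioned here might look as follows; scikit-learn and the nearest-to-centre selection rule are assumptions for illustration, not the implementation of [33].

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_downsample(points: np.ndarray, num_samples: int) -> np.ndarray:
    """Cluster the cloud and keep, for each cluster, the input point that is
    closest to the cluster centre as its representative."""
    km = KMeans(n_clusters=num_samples, n_init=10, random_state=0).fit(points)
    reps = []
    for c, centre in enumerate(km.cluster_centers_):
        members = points[km.labels_ == c]
        reps.append(members[np.argmin(np.linalg.norm(members - centre, axis=1))])
    return np.asarray(reps)

pts = np.random.default_rng(3).random((5000, 3))
print(kmeans_downsample(pts, 256).shape)       # (256, 3)
```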
Article
Full-text available
We develop a phase-resolved wave field reconstruction method using a learning-based downsampling network for processing large amounts of inhomogeneous data from non-contact wave optical observations. The Waves Acquisition Stereo System (WASS) extracts dense point clouds from ocean wave snapshots. We couple learning-based downsampling networks with the phase-resolved wave reconstruction algorithm, and the training task is to improve the wave reconstruction completeness ratio CR. The algorithm first achieves initial convergence and task-optimized performance on numerical ocean waves built with the linear wave theory model. Results show that the trained sampling network leads to a more uniform spatial distribution of sampling points and improves CR at the observed edge regions far from the optical camera. Finally, we apply our algorithm to a natural ocean wave dataset. The average completeness ratio is improved by over 30% at low sampling ratios (SR ∈ [2⁻⁹, 2⁻⁷]) compared to the traditional FPS method and random sampling method. Moreover, the relative residual between the final reconstructed wave and the natural wave is less than 15%, which provides an efficient tool for wave reconstruction in ocean engineering.
... In this case, data can be compressed and transmitted over a network in a way quite similar to what is done for common 2D images [17,18]. The main 2D-based techniques leverage traditional image or video compression methods, such as JPEG, MPEG, or dictionary-based compression [19]. ...
... Although depth image compression has been studied for many years, it remains an open discussion topic in the scientific community. Various lossy and lossless methods have been proposed [15,19]; however, a standard reference method has not yet been established. ...
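As a small, generic illustration of the 2D route these excerpts describe (OpenCV and the synthetic depth ramp are assumptions, not the pipeline of [19]), a projected depth map can be compressed losslessly with a standard image codec:

```python
# Requires: pip install opencv-python numpy
import cv2
import numpy as np

# Synthetic 16-bit depth ramp standing in for a projected point cloud view.
y, x = np.mgrid[0:480, 0:640]
depth = ((x + y) * 50).astype(np.uint16)

# Lossless PNG keeps depth values exact (important for geometry); lossy codecs
# such as JPEG would distort depth and are usually reserved for colour planes.
ok, png_bytes = cv2.imencode(".png", depth)
assert ok
decoded = cv2.imdecode(png_bytes, cv2.IMREAD_UNCHANGED)
assert np.array_equal(decoded, depth)
print(f"{depth.nbytes} bytes raw -> {png_bytes.size} bytes PNG")
```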
Article
Full-text available
3D modeling and reconstruction are critical to creating immersive XR experiences, providing realistic virtual environments, objects, and interactions that increase user engagement and enable new forms of content manipulation. Today, 3D data can be easily captured using off-the-shelf, specialized headsets; very often, these tools provide real-time, albeit low-resolution, integration of continuously captured depth maps. This approach is generally suitable for basic AR and MR applications, where users can easily direct their attention to points of interest and benefit from a fully user-centric perspective. However, it proves to be less effective in more complex scenarios such as multi-user telepresence or telerobotics, where real-time transmission of local surroundings to remote users is essential. Two primary questions emerge: (i) what strategies are available for achieving real-time 3D reconstruction in such systems? and (ii) how can the effectiveness of real-time 3D reconstruction methods be assessed? This paper explores various approaches to the challenge of live 3D reconstruction from typical point cloud data. It first introduces some common data flow patterns that characterize virtual reality applications and shows that achieving high-speed data transmission and efficient data compression is critical to maintaining visual continuity and ensuring a satisfactory user experience. The paper thus introduces the concept of saliency-driven compression/reconstruction and compares it with alternative state-of-the-art approaches.
... Another currently active area is 3D point cloud compression [22], especially using tensorial representations and machine learning techniques [23][24][25][26][27]. The authors have invested in these approaches mainly because they allow the representation of static and dynamic 3D objects, providing a more straightforward yet realistic representation than classic 3D meshes. ...
Article
Full-text available
Three-dimensional mesh compression is vital to support advances in many scenarios, such as 3D web-based applications. Existing 3D mesh methods usually require complex data structures and time-consuming processing. Given a mesh represented by its vertices and triangular faces, we present a novel, fast, and straightforward encoding algorithm. Our method encodes the mesh connectivity data based on an upper triangular matrix which is easily recovered by its underlying decoding process. Our technique encodes the mesh edges in linear time without losing any face in the process. Results show that our method provides a connectivity compression rate of 55.29 and an average total compression rate of 27.09. Furthermore, our approach achieves, on average, a compression rate similar to that of state-of-the-art algorithms such as OpenCTM, which consider both geometry and connectivity, while our approach considers only connectivity.
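To illustrate the general idea of storing connectivity in an upper triangular matrix (a toy sketch, not the paper's actual format), the edges of a triangle mesh can be packed into the bits of the upper triangle of a vertex adjacency matrix:

```python
import numpy as np

def encode_connectivity(num_vertices: int, faces: np.ndarray) -> np.ndarray:
    """Store mesh edges as the upper triangular part of a vertex adjacency
    matrix, packed into a bitstream. `faces` is an (F, 3) array of vertex ids."""
    adj = np.zeros((num_vertices, num_vertices), dtype=bool)
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (a, c)):
            u, v = min(u, v), max(u, v)        # keep edges above the diagonal
            adj[u, v] = True
    return np.packbits(adj[np.triu_indices(num_vertices, k=1)])

def decode_connectivity(bits: np.ndarray, num_vertices: int) -> list:
    n_upper = num_vertices * (num_vertices - 1) // 2
    upper = np.unpackbits(bits)[:n_upper].astype(bool)
    rows, cols = np.triu_indices(num_vertices, k=1)
    return [(int(u), int(v)) for u, v in zip(rows[upper], cols[upper])]

faces = np.array([[0, 1, 2], [1, 2, 3]])
stream = encode_connectivity(4, faces)
print(decode_connectivity(stream, 4))   # [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
```

This toy recovers only the edge set; the paper's decoder also reconstructs the triangular faces.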
... Sheng et al. present an approach using neural networks for point cloud attribute compression, incorporating second-order point convolution [31]. Furthermore, Quach et al. provide a comprehensive survey summarizing recent advancements in neural point cloud compression [25]. While neural compression techniques show promise, the current research emphasizes static point clouds. ...
Conference Paper
Full-text available
Volumetric videos allow six degrees of freedom (6DoF) movement for viewers, enabling numerous applications in domains such as entertainment, healthcare, and education. MPEG’s Video-based Point Cloud Compression (V-PCC) is a recent standard for volumetric video compression that achieves a considerable compression rate while maintaining the quality of the point cloud sequence. However, V-PCC is hard to fit into existing tiling-based volumetric video streaming frameworks due to the lack of proper viewing-adaptive techniques. In this paper, we propose QV4, a Quality-of-Experience (QoE) based streaming pipeline for viewpoint-aware V-PCC-encoded volumetric video. Specifically, we leverage the intermediate results produced by the V-PCC encoder to achieve effective and efficient viewpoint-aware tiling for V-PCC. We then build a QoE model and a 6DoF movement model based on real-world user data to predict the users’ viewing experience and behaviors, respectively. The proposed QoE model and 6DoF movement model are combined with viewpoint-aware V-PCC tiling to maximize the visual quality of volumetric videos. Extensive simulations show that by enabling viewpoint-aware adaptation and optimization for V-PCC-encoded volumetric videos, QV4 can achieve up to a 14.67% improvement in structural similarity index (SSIM) and a 7.39% improvement in video multi-method assessment fusion (VMAF) under highly dynamic viewing behaviors in a network with limited and fluctuating bandwidth.
... A point cloud is a ubiquitous dataset used to represent a target's spatial distribution and surface properties in a unified reference system. Point clouds acquired via laser measurement principles contain three-dimensional coordinates and laser reflection intensity, while those obtained through photogrammetry principles comprise both color information and three-dimensional coordinates [18]. The intensity information reflects various surface characteristics, including material composition, roughness, incident angle direction, emitted energy of the instrument, and laser wavelength. ...
Article
This paper investigates the problem of paint surface estimation and trajectory planning for an automated painting process using a six-degree-of-freedom (6DOF) robot. We first present the kinematic model of the 6DOF articulated spraying robot and calculate the coordinate transformation of the robot's end effector relative to the base position. Then we design the size of the robot's joints to ensure sufficient spraying workspace for the outer coverings of automobiles. Next, we stitch the acquired workpiece point cloud data (PCD) and perform noise reduction. The iterative closest point (ICP) algorithm integrates the workpiece PCD obtained from different locations into a unified coordinate system. Based on the features of the workpiece surface, we compute normal vectors for the point cloud. The original point cloud data is then segmented into several pieces according to the components of the normal vectors along the coordinate axes. By slicing the segmented PCD evenly, we approximate the spraying paths for each surface of the car cover. Finally, we validate the effectiveness of our proposed algorithm through simulation.
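A minimal sketch of the registration and normal estimation steps, assuming the Open3D library and synthetic data (the paper's actual implementation and parameters are not specified here):

```python
# Requires: pip install open3d numpy
import copy
import numpy as np
import open3d as o3d

# Two toy scans of the same surface; in the paper these would be workpiece
# point clouds captured from different robot poses.
pts = np.random.default_rng(5).random((2000, 3))
source = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(pts))
target = copy.deepcopy(source)
target.transform(np.array([[1, 0, 0, 0.05],    # small known offset for the demo
                           [0, 1, 0, 0.00],
                           [0, 0, 1, 0.00],
                           [0, 0, 0, 1.00]]))

# ICP (default point-to-point estimation) stitches the scans into a unified
# coordinate system; 0.1 is the maximum correspondence distance.
result = o3d.pipelines.registration.registration_icp(source, target, 0.1)
print(result.transformation)

# Surface normals, later used to segment the cloud and derive spray paths.
source.transform(result.transformation)
merged = source + target
merged.estimate_normals(
    search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.05, max_nn=30))
```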
... The geometry, or the location of each single point, and the attributes, or the further data associated with each of these points, can be separated into two halves, or parts, of a point cloud. The terms "dynamic point cloud" and "static point cloud" are used to describe point clouds with and without a temporal dimension [6]. ...
Conference Paper
Full-text available
Due to the high demand for Augmented Reality in civilian and military applications, it has been receiving a lot of attention recently. In our country, university campuses urgently demand such applications: newly arriving students, guests, and visiting teachers may have trouble finding buildings, facilities, and administrative offices on a vast campus with many buildings, making it difficult to move among them. This paper presents the implementation of a complete Campus Navigation System using Augmented Reality that is user-friendly and applicable to almost all hand-held smart devices. The proposed applications are built with Unity using ARCore, NavMesh, and point cloud localization techniques, and the work is carried out at the campus of the Al-Nahrain University College of Information Engineering in Baghdad, Iraq. Due to the large size of point cloud data and the current limitations of wireless channels, a method of point cloud compression is also presented. MATLAB is used for the system efficiency analysis. According to test results, the proposed application could improve pathfinding efficiency and usefulness on campus. Keywords— Point cloud compression, Augmented Reality, Campus, Indoor navigation, Mobile application
... Sheng et al. [28] proposed a neural network for point cloud attribute compression, using a second-order point convolution. Other recent advances in neural point cloud compression are summarized by Quach et al. in their survey [22]. While neural compression techniques have shown promise, current research focuses on static point clouds. ...
... For example, space-partitioning tree approaches that exploit the 3D correlation between point cloud points are widely used to compress point cloud data [4]-[9]. Recently, deep learning-based approaches have also been proposed to leverage data and learn point cloud compression [10]-[14]. Different from these frameworks, probabilistic approaches exploit the compactness of distributions to compress 3D sensor observations. ...
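As a rough illustration of the space-partitioning idea (a toy NumPy sketch, not G-PCC or any of the cited codecs), an octree can be serialized as one occupancy byte per internal node; real codecs additionally apply context modeling and entropy coding to these bytes.

```python
import numpy as np

def encode_octree(points, origin, size, depth):
    """Serialize an octree over an axis-aligned cube as one occupancy byte per
    internal node (one bit per child octant), breadth-first."""
    stream = bytearray()
    nodes = [(np.asarray(origin, dtype=float), float(size), points)]
    for _ in range(depth):
        next_nodes = []
        for node_origin, node_size, pts in nodes:
            half = node_size / 2.0
            occupancy = 0
            for child in range(8):
                offset = np.array([(child >> 2) & 1, (child >> 1) & 1, child & 1]) * half
                lo, hi = node_origin + offset, node_origin + offset + half
                mask = np.all((pts >= lo) & (pts < hi), axis=1)
                if mask.any():
                    occupancy |= 1 << child
                    next_nodes.append((lo, half, pts[mask]))
            stream.append(occupancy)
        nodes = next_nodes
    return bytes(stream)

pts = np.random.default_rng(6).random((1000, 3))
code = encode_octree(pts, origin=(0.0, 0.0, 0.0), size=1.0, depth=4)
print(len(code), "occupancy bytes for", len(pts), "points")
```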