Article

# Run-Length Encodings


... Huffman coding [3] is an extraordinary method to eliminate coding redundancy for a stream of data. Arithmetic coding [4] and Golomb coding [5] are also approaches to eliminating coding redundancy. They all require accurate probability models of input symbols. ...
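As a concrete illustration of how Huffman coding removes coding redundancy from a stream of symbols, here is a minimal sketch of the greedy two-smallest-merge construction; the toy string and the tie-breaking rule are illustrative choices, not details from the cited papers:

```python
import heapq
from collections import Counter

def huffman_code(freqs):
    """Build a Huffman code from a {symbol: frequency} map.
    Returns {symbol: bitstring}."""
    # Each heap entry: (total_freq, tie_breaker, {symbol: code_so_far}).
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)  # two least probable subtrees
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}  # prefix with 0 / 1
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

freqs = Counter("abracadabra")  # a:5, b:2, r:2, c:1, d:1
codes = huffman_code(freqs)
# More probable symbols never receive longer codewords,
words = list(codes.values())
assert len(codes["a"]) <= min(len(codes[s]) for s in "brcd")
# and the code is prefix-free, so it decodes unambiguously.
assert not any(x != y and y.startswith(x) for x in words for y in words)
```

The accurate probability model the excerpt mentions is exactly the `freqs` table: if it mismatches the true source statistics, the code lengths are no longer optimal.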
... We use Golomb coding to encode the distance difference between each location and the previous location. Golomb coding [5] was designed for non-negative integer input with geometric probability distribution. We use it in the following steps. ...
... We use the coordinate of the pixel in the upper left corner to represent the location of the whole shape. Therefore, the locations of these shapes are {(x_i, y_i)} = {(0, 0), (0, 6), (3, 5), (4, 2), (5, 4), (6, 0), (6, 5)}. Using relative values to represent locations can reduce coding costs. ...
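The delta-then-Golomb pipeline sketched in these excerpts can be illustrated as follows. The grid width of 7, the raster-scan flattening, and the parameter m = 4 are assumptions made for the example, not details from the paper:

```python
def golomb_encode(n, m):
    """Golomb code of a non-negative integer n with parameter m:
    unary quotient, then truncated-binary remainder."""
    q, r = divmod(n, m)
    out = "1" * q + "0"                                        # unary quotient
    b = m.bit_length() if m & (m - 1) else m.bit_length() - 1  # ceil(log2 m)
    cutoff = (1 << b) - m
    if r < cutoff:                          # short remainder: b-1 bits
        out += format(r, "b").zfill(b - 1) if b > 1 else ""
    else:                                   # long remainder: b bits
        out += format(r + cutoff, "b").zfill(b)
    return out

def golomb_decode(bits, m):
    """Decode one codeword; returns (value, bits_consumed)."""
    q = 0
    while bits[q] == "1":
        q += 1
    pos = q + 1
    b = m.bit_length() if m & (m - 1) else m.bit_length() - 1
    cutoff = (1 << b) - m
    r = int(bits[pos:pos + b - 1] or "0", 2) if b > 1 else 0
    if b > 1 and r < cutoff:
        pos += b - 1                        # short remainder was enough
    else:
        r = int(bits[pos:pos + b], 2) - cutoff
        pos += b
    return q * m + r, pos

# Shape locations from the excerpt, flattened in raster-scan order on a
# hypothetical 7-pixel-wide grid, then turned into distance differences.
locs = [(0, 0), (0, 6), (3, 5), (4, 2), (5, 4), (6, 0), (6, 5)]
flat = [x * 7 + y for x, y in locs]
deltas = [flat[0]] + [b - a for a, b in zip(flat, flat[1:])]
m = 4  # illustrative parameter; in practice chosen from the delta statistics
stream = "".join(golomb_encode(d, m) for d in deltas)
```

Encoding the small deltas rather than the absolute positions is what makes the relative representation cheaper, since Golomb codes assign short codewords to small values.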
Article
Full-text available
Soft compression is a lossless image compression method that is committed to eliminating coding redundancy and spatial redundancy simultaneously. To do so, it adopts shapes to encode an image. In this paper, we propose a compressible indicator function with regard to images, which gives a threshold for the average number of bits required to represent a location and can be used to illustrate the working principle. We investigate and analyze soft compression for binary, gray, and multi-component images with specific algorithms and compressible indicator values. In terms of compression ratio, the soft compression algorithm outperforms the popular classical standards PNG and JPEG2000 in lossless image compression. It is expected that the bandwidth and storage space needed when transmitting and storing the same kind of images (such as medical images) can be greatly reduced by applying soft compression.
... This technique and a range of other related strategies can be used for significantly reducing the effects of bit errors on the fidelity of the decoded data. Wen and Villasenor [18,19] proposed a new class of asymmetric RVLCs having the same codeword length distribution as the Golomb-Rice codes [20,21] as well as exp-Golomb codes [22] and applied them to the video coding framework of the H.263+ [23] and MPEG-4 [24] standards. Later on, in the quest for constructing optimal RVLCs, further improved algorithms were proposed in [25][26][27][28][29]. ...
... However, the above observations are reversed in Fig. 2.19 in the case of hard-decoding. For the RVLC-I, the symbol-level trellis-based decoding schemes perform slightly better than the corresponding bit-level trellis-based decoding schemes when using hard-decoding, while they perform similarly in the case of soft-decoding, as seen in Fig. 2.20. Finally, Table 2.12 summarises the results using BPSK transmission over AWGN channels. ...
... The EXIT function of an IRCC can be obtained from those of its subcodes. Denote the EXIT function of subcode k as T_{C1,k}(I_A(c1)). Assuming that the trellis segments of the subcodes do not significantly interfere with each other, which might change the associated transfer characteristics, the EXIT function T_{C1}(I_A(c1)) of the target IRCC is the weighted superposition of the EXIT functions T_{C1,k}(I_A(c1)) [85], yielding (4.22). For example, the EXIT functions of the 17 subcodes used in [85] are shown in Fig. 4.20. We now optimise the weighting coefficients {α_k} so that the IRCC's EXIT curve of Eq. ...
Thesis
In this thesis, we apply EXtrinsic Information Transfer (EXIT) charts for analysing iterative decoding and detection schemes. We first consider a two-stage iterative source/channel decoding scheme, which is then extended to a three-stage serially concatenated turbo equalisation scheme. As a result, useful design guidelines are obtained for optimising the system’s performance. More explicitly, in Chapter 2 various source codes such as Huffman codes, Reversible Variable-Length Codes (RVLCs) and Variable-Length Error-Correcting (VLEC) codes are introduced and a novel algorithm is proposed for the construction of efficient RVLCs and VLEC codes. In Chapter 3, various iterative source decoding, channel decoding and channel equalisation schemes are investigated and their convergence behaviour is analysed by using EXIT charts. The effects of different source codes and channel precoders are also considered. In Chapter 4, a three-stage serially concatenated turbo equalisation scheme consisting of an inner channel equaliser, an intermediate channel code and an outer channel code separated by interleavers is proposed. With the aid of the novel concept of EXIT modules, conventional 3D EXIT chart analysis may be simplified to 2D EXIT chart analysis. Interestingly, it is observed that for the three-stage scheme relatively weak convolutional codes having short memories result in lower convergence thresholds than strong codes having long memories. Additionally, it is found that by invoking the outer and the intermediate decoder more frequently the total number of decoder activations is reduced, resulting in a relatively lower decoding complexity. Furthermore, the three-stage turbo equalisation schemes employing non-unity rate intermediate codes or IRregular Convolutional Codes (IRCCs) as the outer constituent codes are investigated.
The performance of the resultant schemes is found to become gradually closer to the channel’s capacity at the expense of increased decoding complexity.
... Methods for compressing single indicators include: Shannon-Fano coding [7,8], Huffman coding [9], Golomb-Rice coding [10,11], Gamma and Delta coding [12], Fibonacci coding [13], (S,C)-dense coding [14], and Zeta coding [15]. ...
... The problem of coding sparse binary sequences is often raised in the literature, for example in the book by Salomon [23]. One of the first algorithms was proposed by Golomb [10], who introduced quotient and remainder coding for the distance between successive occurrences of ones. Then, Gallager and van Voorhis [24] derived a relationship that allows for optimal selection of compression parameters for Golomb codes. ...
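The Gallager and van Voorhis relationship mentioned above can be stated concretely: for a geometric source with P(X = n) = (1 − p)·p^n, the optimal Golomb parameter is the smallest m satisfying p^m + p^(m+1) ≤ 1, i.e. m = ⌈log(1 + p) / log(1/p)⌉. A minimal sketch:

```python
import math

def optimal_golomb_m(p):
    """Smallest m with p**m + p**(m+1) <= 1 (Gallager & van Voorhis),
    for a geometric source P(X = n) = (1 - p) * p**n, 0 < p < 1."""
    return max(1, math.ceil(math.log(1 + p) / math.log(1 / p)))

# The more slowly the geometric tail decays, the larger the group size m.
for p in (0.2, 0.5, 0.9, 0.99):
    m = optimal_golomb_m(p)
    # Optimality condition: p^m + p^(m+1) <= 1 < p^(m-1) + p^m.
    assert p**m + p**(m + 1) <= 1 < p**(m - 1) + p**m
```

For sparse binary sequences, p here is the probability that a gap between successive ones extends by one more position.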
... Underlying the new algorithm is the Golomb-Rice coding [10,26,27], which is still used in many applications that require efficient data compression. It is vital for modern image coding methods [40,41], video compression [42,43], transmission coding in wireless sensor networks [44], compression of data from exponentially distributed sources [45], Gaussian distributed sources [46], and high variability sources [47]. ...
Article
Full-text available
This article deals with compression of binary sequences with a given number of ones, which can also be considered as a list of indexes of a given length. The first part of the article shows that the entropy H of random n-element binary sequences with exactly k elements equal one satisfies the inequalities k⋅log2(0.48⋅n/k)<H<k⋅log2(2.72⋅n/k). Based on this result, we propose a simple coding using fixed length words. Its main application is the compression of random binary sequences with a large disproportion between the number of zeros and the number of ones. Importantly, the proposed solution allows for a much faster decompression compared with the Golomb-Rice coding with a relatively small decrease in the efficiency of compression. The proposed algorithm can be particularly useful for database applications for which the speed of decompression is much more important than the degree of index list compression.
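The entropy bounds quoted in this abstract are easy to sanity-check numerically, since a uniformly random n-element binary sequence with exactly k ones has entropy H = log2(C(n, k)); the (n, k) pairs below are arbitrary test values, not taken from the article:

```python
import math

def exact_entropy(n, k):
    """Entropy in bits of a uniform choice among all n-bit
    sequences containing exactly k ones."""
    return math.log2(math.comb(n, k))

# Check k*log2(0.48*n/k) < H < k*log2(2.72*n/k) for a few sparse cases.
for n, k in [(1000, 10), (10**6, 100), (256, 16)]:
    H = exact_entropy(n, k)
    assert k * math.log2(0.48 * n / k) < H < k * math.log2(2.72 * n / k)
```

The constants 0.48 and 2.72 bracket 1/e·... from below and e from above in the Stirling approximation of the binomial coefficient, which is why the bounds tighten as k/n shrinks.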
... Similarly, the probability model of symbol LR is designed, as shown in Fig. 30, where the reference region for 2-D transform kernels is reduced to the nearest three coefficients. The symbol HR is coded using Exp-Golomb code [42]. The sign bit is only needed for nonzero quantized transform coefficients. ...
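The exponential-Golomb code mentioned for symbol HR can be sketched as follows; this shows the generic order-0 code (value n encoded as n+1 in binary, preceded by bit-length − 1 zeros), not the exact AV1 bitstream syntax:

```python
def exp_golomb_encode(n):
    """Order-0 exponential-Golomb code of a non-negative integer."""
    b = format(n + 1, "b")           # n+1 in binary
    return "0" * (len(b) - 1) + b    # leading zeros announce the length

def exp_golomb_decode(bits):
    """Decode one codeword; returns (value, bits_consumed)."""
    z = 0
    while bits[z] == "0":            # count the leading zeros
        z += 1
    return int(bits[z:z + z + 1], 2) - 1, 2 * z + 1

# Codeword lengths grow logarithmically, so no upper bound on the
# input value is needed, unlike plain Golomb codes with fixed m.
assert [exp_golomb_encode(n) for n in range(5)] == \
    ["1", "010", "011", "00100", "00101"]
assert all(exp_golomb_decode(exp_golomb_encode(n))[0] == n for n in range(100))
```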
... A grain sample in the luma plane is generated using a (2L + 1) × L block above and an L × 1 block to the left, as shown in Fig. 41, which involves 2L(L + 1) reference samples, where L ∈ {0, 1, 2, 3}. The AR model is given by (42), where S_ref is the reference region and z is a pseudorandom variable drawn from a zero-mean unit-variance Gaussian distribution. The grain samples for chroma components are generated similarly to (42) with one additional input from the collocated grain sample in the luma plane. ...
... The AR model is given by (42), where S_ref is the reference region and z is a pseudorandom variable drawn from a zero-mean unit-variance Gaussian distribution. The grain samples for chroma components are generated similarly to (42) with one additional input from the collocated grain sample in the luma plane. The model parameters associated with each plane are transmitted through the bitstream to formulate the desired grain patterns. ...
Article
Full-text available
The AV1 video compression format is developed by the Alliance for Open Media consortium. It achieves more than a 30% reduction in bit rate compared to its predecessor VP9 for the same decoded video quality. This article provides a technical overview of the AV1 codec design that enables the compression performance gains with considerations for hardware feasibility.
... assumption on the indices where small indices are always more probable to be selected. For geometric sources, there are two standard entropy coding methods: unary coding and Golomb coding (Golomb, 1966; Gallager & Van Voorhis, 1975). ...
... This makes Golomb coding a promising entropy coding method for successive pruning. The construction of Golomb codes can be found in (Golomb, 1966). ...
Preprint
Full-text available
Neural network (NN) compression has become essential to enable deploying over-parameterized NN models on resource-constrained devices. As a simple and easy-to-implement method, pruning is one of the most established NN compression techniques. Although it is a mature method with more than 30 years of history, there is still a lack of good understanding and systematic analysis of why pruning works well even with aggressive compression ratios. In this work, we answer this question by studying NN compression from an information-theoretic approach and show that rate distortion theory suggests pruning to achieve the theoretical limits of NN compression. Our derivation also provides an end-to-end compression pipeline involving a novel pruning strategy. That is, in addition to pruning the model, we also find a minimum-length binary representation of it via entropy coding. Our method consistently outperforms the existing pruning strategies and reduces the pruned model's size by 2.5 times. We evaluate the efficacy of our strategy on MNIST, CIFAR-10 and ImageNet datasets using 5 distinct architectures.
... Fortunately, since our designed median predictor is very accurate in predicting the MSB plane, the prediction error map is relatively uniform; that is, most of the values in this map are 0. This implies that we can employ a lossless compression method, e.g., run-length coding [4], to further compress the auxiliary data. ...
... According to (4), the XOR processing is implemented between the (k − 1)-th bit plane and the k-th bit plane to generate a new bit plane, which is used to replace the original k-th bit plane; that is to say, the information of each bit plane is stored losslessly in the upper bit plane. Accordingly, the LSB bit plane can be vacated as free room, which is used to embed secret data (including auxiliary data and additional data). ...
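A per-pixel sketch of this cycling-XOR idea, assuming 8-bit pixels and assuming the decoder obtains the MSB from the median prediction plus the error map (here it is simply passed in); the exact plane ordering in the paper may differ:

```python
def xor_planes_forward(pix):
    """Replace each bit plane k with plane_k XOR plane_(k-1), computed
    from the original planes; plane 0 is then vacated as free room."""
    x = [(pix >> k) & 1 for k in range(8)]
    y = [x[k] ^ x[k - 1] for k in range(8)]  # y[0] is overwritten below
    y[0] = 0                                 # vacated LSB plane (free room)
    return sum(bit << k for k, bit in enumerate(y))

def xor_planes_inverse(enc, msb):
    """Recover the original pixel from the transformed planes plus the
    MSB bit (predicted by the median predictor in the actual scheme)."""
    y = [(enc >> k) & 1 for k in range(8)]
    x = [0] * 8
    x[7] = msb
    for k in range(6, -1, -1):               # x[k] = y[k+1] XOR x[k+1]
        x[k] = y[k + 1] ^ x[k + 1]
    return sum(bit << k for k, bit in enumerate(x))

# Round trip: every pixel is recoverable once the MSB is known,
# even though the LSB plane was given away as embedding room.
for pix in (0, 1, 127, 128, 200, 255):
    assert xor_planes_inverse(xor_planes_forward(pix), pix >> 7) == pix
```

Because y[k] = x[k] XOR x[k−1], each plane's information really does survive one level up: x[k] = y[k+1] XOR x[k+1], so recovery cascades down from the MSB.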
Article
Full-text available
Existing reversible data hiding work in encrypted images (RDH-EI) mostly does not attain a good balance among good visual quality, large embedding capacity and high security performance. To address this problem, we design a new reversible data hiding scheme in encrypted images by combining median prediction and bit-plane cycling-XOR. Our scheme first estimates the most significant bit (MSB) of each pixel by considering the median value of its adjacent pixels and generates a prediction error map to mark those pixels whose MSB bits are predicted incorrectly. Subsequently, we divide the cover image into bit planes and then implement plane-cyclic exclusive OR from the least significant bit (LSB) plane to the MSB plane. The LSB plane is finally vacated to be free room. Furthermore, the processed image is encrypted by a stream cipher algorithm, and the data hider can embed additional data into the LSB plane. Separable operations of data extraction, image decryption and image recovery can be achieved successfully by the receiver. Comprehensive experiments demonstrate that compared with existing methods, our scheme can attain a better balance among good visual quality, large embedding capacity and high security performance.
... Golomb. Golomb coding [15] is a widely used method for encoding integers. It is optimal if the distribution of the integer values follows a geometric distribution. ...
... A summary of the obtained results, when varying the threshold t ∈ [2, 15], is collected in Tables 4 and 5. They show that t significantly influences the compression efficiency. ...
Article
Full-text available
A new method for encoding a sequence of integers, named Binary Adaptive Sequential Coding with Return to Bias, is proposed in this paper. It extends the compressing pipeline for chain codes’ compression consisting of Burrows Wheeler Transform, Move-To-Front Transform, and Adaptive Arithmetic Coding. We also explain when to include the Zero-Run Transform into the above-mentioned pipeline. The Zero-Run Transform generates a sequence of integers corresponding to the number of zero-runs. This sequence is encoded by Golomb coding, Binary Adaptive Sequential Coding, and the new Binary Adaptive Sequential Coding with Return to Bias. Finally, a comparison is performed with the two state-of-the-art methods. The proposed method achieved similar compression efficiency for the Freeman chain code in eight directions. However, for the chain codes with shorter alphabets (Freeman chain code in four directions, Vertex Chain Code, and Three-OrThogonal chain code), the introduced method outperforms the referenced ones.
... If so, the expanded codeword set W^E_{m_0 m_1 ⋯ m_{i−1}|} has the same common prefix, and from Eq. (31), W^Q_{m_0 m_1 ⋯ m_{i−1}} must also have it. This fact conflicts with Eq. (24), and thus w^e_{m_0 m_1 ⋯ m_{i−1}} for any i > 0 and (m_0, m_1, ⋯, m_{i−1}) must be '(NULL)': w̃_{|m} = w_{|m}, w̃_{m_0 m_1 ⋯ m_{i−1}|m} = w_{m_0 m_1 ⋯ m_{i−1}|m}. ...
... The proposed codes outperformed both AIFV-m and Huffman codes at p_s[0] ≥ 0.53. We can also compare the results with Golomb-Rice codes [24,25], well-known codes effective for exponential sources, by checking the results in [12,13,26]. The proposed codes are much more efficient than Golomb-Rice codes except for the sparse sources with high p_s[0]. ...
Preprint
Full-text available
A general class of almost instantaneous fixed-to-variable-length (AIFV) codes is proposed, which contains every possible binary code we can make when allowing finite bits of decoding delay. The proposed codes, N-bit-delay AIFV codes, are represented by multiple code trees with high flexibility. The paper guarantees them to be uniquely decodable and presents a code-tree construction algorithm under a reasonable condition. The presented algorithm provides a set of code trees that achieves the minimum expected code length among a subset of N-bit-delay AIFV codes for an arbitrary source. The experiments show that the proposed codes can perform more efficiently than the conventional AIFV-m and Huffman codes. Additionally, in some reasonable cases, the proposed codes even outperform the 32-bit-precision range codes. The theoretical and experimental results in this paper are expected to be very useful for further study on AIFV codes.
... RLE (Run-Length Encoding) compression [12], sometimes written RLC (Run-Length Coding), is a very simple form of data compression in which runs of consecutive data elements with the same value are stored as a single value plus a count; that is, it exploits repetition redundancy. ...
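The value-plus-count idea can be sketched in a few lines; the toy input row is illustrative:

```python
def rle_encode(data):
    """Collapse runs of equal consecutive values into (value, count) pairs."""
    runs = []
    for v in data:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([v, 1])       # start a new run
    return [(v, c) for v, c in runs]

def rle_decode(runs):
    """Expand (value, count) pairs back into the original sequence."""
    return [v for v, c in runs for _ in range(c)]

row = list("WWWWBWWWW")
runs = rle_encode(row)
assert runs == [("W", 4), ("B", 1), ("W", 4)]
assert rle_decode(runs) == row
```

The scheme only pays off when runs are long; on data with no repetition, the counts make the output larger than the input, which is why RLE is usually paired with a transform that produces long runs first.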
Thesis
Full-text available
This thesis introduces a new multimedia compression algorithm called "LHE" (logarithmical Hopping Encoding) that does not require transformation to frequency domain, but works in the space domain. This feature makes LHE a linear algorithm of reduced computational complexity. The results of the algorithm are promising, outperforming the JPEG standard in quality and speed.
... There are many different universal or adaptive coding techniques that can be used to compress gap values whose probabilities decrease monotonically, generally close to a geometric distribution. Examples of universal coding schemes include variable-byte coding [3] and Elias coding [5]; examples of adaptive coding schemes include Golomb coding [6], Simple coding [7], and PForDelta [8]. At this point, the aim is to select the scheme that optimises the efficiency of the information retrieval system in terms of memory usage and query processing speed, considering the trade-off between index compression ratio and decoding speed offered by the chosen technique. For example, although Golomb codes show superior compression performance compared to other techniques, they can lag behind in terms of decoding speed [7], [9]. ...
Conference Paper
In this paper, an entropy coding technique, namely combinatorial encoding, is investigated in the context of document indexing as an alternative compression scheme to the conventional gap encoding methods. To this purpose, a finite state transducer is designed to reduce the complexity of calculating the lexicographic rank of the encoded bitstream, and a component that efficiently calculates the binomial coefficient for large numbers is developed. The encoding speed of the implemented solid and multi-block indexing schemes is empirically evaluated through varying term frequencies on randomly created bit strings. The method's most prominent aspects are its ability to compress a memoryless source to its entropy limit, to yield an on-the-fly indexing scheme, and to conform to document reordering by means of the transducer.
... This error is then quantized using a uniform mid-tread quantizer with a bin size δ = 2·NEAR + 1, where NEAR is a parameter chosen by the user, equal to the maximum possible error of a pixel value in the decoded image. The quantized error is then coded by a low-complexity adaptive block coder based on Golomb codes [24], which the authors call Golomb-power-of-2 (GPO2) codes. ...
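The mid-tread quantizer with bin size δ = 2·NEAR + 1 can be sketched per error sample; the bound |err − reconstructed| ≤ NEAR is exactly what makes the scheme near-lossless. This is a minimal sketch of the standard JPEG-LS-style formula, not the full codec:

```python
def quantize(err, near):
    """Uniform mid-tread quantizer with bin size delta = 2*near + 1."""
    delta = 2 * near + 1
    if err >= 0:
        return (err + near) // delta
    return -((near - err) // delta)

def dequantize(q, near):
    """Reconstruction: the centre of bin q."""
    return q * (2 * near + 1)

near = 2
for err in range(-60, 61):
    rec = dequantize(quantize(err, near), near)
    assert abs(err - rec) <= near   # per-pixel error stays within NEAR

# NEAR = 0 collapses to lossless: every error survives exactly.
assert all(dequantize(quantize(e, 0), 0) == e for e in range(-10, 11))
```

Larger NEAR widens the bins, so the quantized indices cluster near zero and the downstream Golomb coder spends fewer bits per sample.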
Article
Full-text available
Near-lossless compression is a generalization of lossless in which the codec user is able to set the maximum absolute difference (the error tolerance) between the values of an original pixel and the decoded one. This enables higher compression ratios, while still allowing control of the bounds of the quantization errors in the space domain. This feature makes such codecs attractive for applications where a high degree of certainty is required. The JPEG-LS lossless and near-lossless image compression standard combines a good compression ratio with a low computational complexity, which makes it very suitable for scenarios with strong restrictions, common in embedded systems. However, our analysis shows great coding efficiency improvement potential, especially for lower-entropy distributions, more common in near-lossless compression. In this work, we propose enhancements to the JPEG-LS standard, aimed at improving its coding efficiency at a low computational overhead, particularly for hardware implementations. The main contribution is a low-complexity and efficient coder, based on Tabled Asymmetric Numeral Systems (tANS), well suited for a wide range of entropy sources and with a simple hardware implementation. This coder enables further optimizations, resulting in great compression ratio improvements. When targeting photographic images, the proposed system is capable of achieving, on average, 1.6%, 6% and 37.6% better compression for error tolerances of 0, 1 and 10, respectively. Additional improvements are achieved by increasing the context size and image tiling, obtaining 2.3% lower bpp for lossless compression. Our results also show that our proposal compares favorably against state-of-the-art codecs like JPEG-XL and WebP, particularly in near-lossless, where it achieves higher compression ratios with a faster coding speed.
... The algorithm uses a dictionary which is fixed a priori. It uses a modified version of the exponential Golomb code [15], in which each data item is converted to a binary representation b_i, where B is the number of discrete values that can be produced. For each new data item, the algorithm finds the variation between two consecutive binary representations, Δ_i = b_i − b_{i−1}. ...
Article
The most daunting and conflicting challenges accompanying wireless sensor networks are energy and security. Data aggregation and compression techniques are two effective ways to reduce energy consumption. Since the radio transceiver consumes energy proportional to the number of bits transmitted on the network, sending fewer bits on the communication channel implies less energy consumption. This paper works on compressing the index of the secure index on distributed data (SIDD) technique [1], to reduce the number of bits of an index that is transmitted on the communication channel. The objective is to reduce the energy consumption of SIDD. In this paper, we have worked to reduce the number of bits of the index sent on the communication channel by deploying difference encoding. The compression mechanism has established an upper bound on the energy consumption when all data items are unique. The scheme is scalable and can be deployed to save energy consumption.
... The energy of the residual signal is reduced by the proposed color transformation as well as the modified DPCM prediction model, which considers both the intra-spectral and inter-spectral redundancy. The JPEG-LS algorithm uses adaptive Golomb-Rice coding [16], which is the same as the proposed encoding method. The proposed color components have different characteristics and statistical properties. ...
Article
Full-text available
This paper presents a lossless color transformation and compression algorithm for Bayer color filter array (CFA) images. In conventional CFA compression algorithms, the compression block is placed after the demosaicking stage. However, in the proposed method, the compression block is performed first. The study used a low-complexity four-channel color space aiming to reduce the correlation among the Bayer CFA color components. In the presented method, the color components showed better de-correlation compared to the raw CFA color components. After the color transformation, components are independently encoded using modified differential pulse-code modulation (DPCM) in a raster-order fashion. The compression algorithm includes adaptive Golomb-Rice and unary coding in order to generate the final bit stream. Several verifications were performed on both simulated CFA and real CFA datasets. The results show that the proposed algorithm requires fewer bits per pixel than the conventional lossless CFA compression technique. In addition, it outperforms recent works on lossless CFA compression algorithms by a considerable margin.
... If a fixed number of original bits are encrypted into a fixed number of (smaller) bits, such a case is categorized into a "fixed-to-fixed" compression scheme. Similarly, "fixed-to-variable", "variable-to-fixed", and "variable-to-variable" categories are available, while variable lengths of original and/or encrypted bits allow a higher compression ratio than fixed ones (e.g., Huffman codes (Huffman, 1952) as a fixed-to-variable scheme, Lempel-Ziv (LZ)-based coding (Ziv & Lempel, 2006) as a variable-to-fixed scheme, and Golomb codes (Golomb, 1966) as a variable-to-variable scheme). ...
Preprint
Even though fine-grained pruning techniques achieve a high compression ratio, conventional sparsity representations (such as CSR) associated with irregular sparsity degrade parallelism significantly. Practical pruning methods, thus, usually lower pruning rates (by structured pruning) to improve parallelism. In this paper, we study fixed-to-fixed (lossless) encryption architecture/algorithm to support fine-grained pruning methods such that sparse neural networks can be stored in a highly regular structure. We first estimate the maximum compression ratio of encryption-based compression using entropy. Then, as an effort to push the compression ratio to the theoretical maximum (by entropy), we propose a sequential fixed-to-fixed encryption scheme. We demonstrate that our proposed compression scheme achieves almost the maximum compression ratio for the Transformer and ResNet-50 pruned by various fine-grained pruning methods.
... For lossless image compression, a lot of research work has been carried out [25]. The state-of-the-art lossless image compression algorithms are Run-Length Encoding (RLE) [12], entropy coding [9], and dictionary-based coding [17]. In RLE, data is stored as a single value and the count of the same consecutive values, so the code length is decreased significantly. ...
Article
Full-text available
There is an increasing number of image data produced in our life nowadays, which creates a big challenge to store and transmit them. For some fields requiring high fidelity, the lossless image compression becomes significant, because it can reduce the size of image data without quality loss. To solve the difficulty in improving the lossless image compression ratio, we propose an improved lossless image compression algorithm that theoretically provides an approximately quadruple compression combining the linear prediction, integer wavelet transform (IWT) with output coefficients processing and Huffman coding. A new hybrid transform exploiting a new prediction template and a coefficient processing of IWT is the main contribution of this algorithm. The experimental results on three different image sets show that the proposed algorithm outperforms state-of-the-art algorithms. The compression ratios are improved by at least 6.22% up to 72.36%. Our algorithm is more suitable to compress images with complex texture and higher resolution at an acceptable compression speed.
... Symbol-based coding [52] is mainly designed for document storage, which takes the repeated characters in the text as a symbol. It considers both symbols and locations, which can be expressed by formula (6). ...
Article
Full-text available
In disease diagnosis, medical images play an important part. Their lossless compression is pretty critical, as it directly determines the requirement of local storage space and communication bandwidth of remote medical systems, so as to help the diagnosis and treatment of patients. There are two extraordinary properties related to medical images: losslessness and similarity. How to take advantage of these two properties to reduce the information needed to represent an image is the key point of compression. In this paper, we employ big data mining to set up the image codebook, that is, to find the basic components of images. We propose a soft compression algorithm for multi-component medical images, which can exactly reflect the fundamental structure of images. A general representation framework for image compression is also put forward, and the results indicate that our developed soft compression algorithm can outperform the popular benchmarks PNG and JPEG2000 in terms of compression ratio.
... Finally, we conduct an experiment to demonstrate the effectiveness of our context model by applying the proposed entropy coding method to the predicted signal of JPEG-LS, which originally uses the Golomb-Rice coder [40] as the entropy coder. For a fair comparison, the prediction scheme of JPEG-LS is applied after the reversible color transform mentioned in Section III-B. ...
Article
Full-text available
This paper presents a new lossless image compression method based on the learning of pixel values and contexts through multilayer perceptrons (MLPs). The prediction errors and contexts obtained by MLPs are forwarded to adaptive arithmetic encoders, like the conventional lossless compression schemes. The MLP-based prediction has long been attempted for lossless compression, and recently convolutional neural networks (CNNs) are also adopted for the lossy/lossless coding. While the existing MLP-based lossless compression schemes focused only on accurate pixel prediction, we jointly predict the pixel values and contexts. We also adopt and design channel-wise progressive learning, residual learning, and duplex network in this MLP-based framework, which leads to improved coding gain compared to the conventional methods. Experiments show that the proposed method performs better than the conventional non-learning algorithms and also recent learning-based compression methods with practical computation time.
... Among the best-known entropy coders are Huffman coding [Huffman 1952], arithmetic coding [Rissanen 1979], and Golomb coding [Golomb 1966]. ...
Thesis
Full-text available
The works presented in this thesis are part of the development of energy-aware image compression methods in the context of wireless sensor networks. The main objective is to decrease the energy consumption of sensors and thus, to maintain a long network lifetime. The contribution of this thesis focuses mainly on the reduction of the algorithmic complexity of the JPEG image compression standard. This algorithm is very greedy in time and energy because of its high complexity, and more precisely, due to the complexity of its discrete cosine transform (DCT) stage. In order to adapt this algorithm to the particular constraints of wireless sensor networks, we proposed to reduce the computational complexity of the DCT by combining an approximate DCT method with a pruning approach. The aim of the former is to reduce the computational complexity by using only additions instead of costly multiplication operations, while the latter aims at computing only the more important low-frequency coefficients. An algorithm for the fast computation of the proposed transform is developed. Only ten additions are required for both forward and backward transformations. The proposed pruned DCT transform exhibits extremely low computational complexity while maintaining competitive image compression performance in comparison with the state-of-the-art methods. Simulation experiments of the software-based implementation are provided to prove the efficiency of the proposal in terms of processing time, required memory and energy consumption in comparison with state-of-the-art methods. Experimental works are conducted using the Atmel ATmega128 processor of Mica2 and MicaZ sensor boards. Efficient parallel-pipelined hardware architecture for the proposed pruned DCT is also designed. The resulting design is implemented on Xilinx Virtex-6 XC6VSX475T-2ff1156 FPGA technology and evaluated for hardware resource utilization, power consumption and real-time performance. 
All the metrics we investigated showed clear advantages of the proposed design over the state-of-the-art competitors.
Keywords: Wireless sensor networks, visual sensor networks, image compression, DCT approximation, pruning approach, VLSI architecture, FPGA, energy conservation
... The encoder losslessly encodes the sequence of mapped quantizer indices. Section 4 describes the new entropy coding option, which combines a family of codes, equivalent to the length-limited Golomb-Power-of-2 (GPO2) codes [7] used by the sample-adaptive coding option of the Issue 1 standard, along with 16 new variable-to-variable length codes designed to provide more effective compression of low-entropy samples. This hybrid coding approach adaptively switches between these two coding methods on a sample-by-sample basis. ...
Conference Paper
Full-text available
This paper describes the emerging Issue 2 of the CCSDS-123.0-B standard for low-complexity compression of multispectral and hyperspectral imagery, focusing on its new features and capabilities. Most significantly, this new issue incorporates a closed-loop quantization scheme to provide near-lossless compression capability while still supporting lossless compression, and introduces a new entropy coding option that provides better compression of low-entropy data.
... We then replace I(m, n) by the prediction residual δI(m, n) = I(m, n) − Î(m, n), where Î(m, n) denotes the predicted value. If the predictions are accurate, the result is a residual image which is efficiently encoded using a variable-length Huffman, or arithmetic, entropy coder [21,22]. In this article we consider three predictive lossless image compression algorithms. ...
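The predict-and-subtract step described above can be sketched in one dimension with a simple previous-sample predictor (an illustrative choice; the three algorithms the article considers each use their own predictors):

```python
def prediction_residual(row):
    """Previous-sample predictor: residual[i] = row[i] - row[i-1].
    The first sample is stored verbatim so decoding is exact."""
    return [row[0]] + [row[i] - row[i - 1] for i in range(1, len(row))]

def reconstruct(residual):
    """Invert the prediction by running (prefix) summation."""
    out = [residual[0]]
    for d in residual[1:]:
        out.append(out[-1] + d)
    return out

row = [10, 11, 11, 12, 50]
res = prediction_residual(row)  # [10, 1, 0, 1, 38]: values cluster near 0
assert reconstruct(res) == row
```

Residuals concentrated around zero are exactly what a variable-length entropy coder then exploits.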
... In Corollary 3, we show that the values we need to encode closely follow a geometric distribution. These distributions allow for more effective coding; for example, a Golomb code [20] is both fast and space-optimal for such distributions, but uses > 1 bit per element. ...
Preprint
Full-text available
Filters are fast, small and approximate set membership data structures. They are often used to filter out expensive accesses to a remote set S for negative queries (that is, a query x not in S). Filters have one-sided errors: on a negative query, a filter may say "present" with a tunable false-positive probability of epsilon. Correctness is traded for space: filters use only log(1/epsilon) + O(1) bits per element. The false-positive guarantees of most filters, however, hold only for a single query. In particular, if x is a false positive of a filter, a subsequent query to x is a false positive with probability 1, not epsilon. With this in mind, recent work has introduced the notion of an adaptive filter. A filter is adaptive if each query has false-positive probability epsilon, regardless of what queries were made in the past. This requires "fixing" false positives as they occur. Adaptive filters not only provide strong false-positive guarantees in adversarial environments but also improve performance on practical query workloads by eliminating repeated false positives. Existing work on adaptive filters falls into two categories. First, there are practical filters based on cuckoo filters that attempt to fix false positives heuristically, without meeting the adaptivity guarantee. Meanwhile, the broom filter is a very complex adaptive filter that meets the optimal theoretical bounds. In this paper, we bridge this gap by designing a practical, provably adaptive filter: the telescoping adaptive filter. We provide theoretical false-positive and space guarantees for our filter, along with empirical results where we compare its false-positive performance against state-of-the-art filters. We also test the throughput of our filters, showing that they achieve performance comparable to similar non-adaptive filters.
... In particular, FLAC adopts a linear predictive coding (LPC) preprocessor for its first stage, which simply tries to estimate the next sample as a linear combination of n previous samples, and a variation of the Rice entropy coder [24], known as exponential Golomb, as its second stage. The Rice encoding is in fact a particular case of the Golomb encoding [25] for a power-of-two encoding parameter m = 2^r. The Golomb encoding itself is computationally efficient and optimal when the probability distribution of the numbers to be encoded follows a geometric distribution. ...
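The relationship noted above can be made concrete with a minimal Golomb encoder sketch; when the parameter m is a power of two (m = 2^r) it reduces to a Rice code whose remainder is simply r plain binary bits:

```python
def golomb_encode(n, m):
    """Golomb code of a non-negative integer n with parameter m:
    unary-coded quotient, then the remainder in (truncated) binary."""
    q, r = divmod(n, m)
    out = "1" * q + "0"            # unary quotient with a 0 terminator
    b = m.bit_length() - 1         # floor(log2(m))
    if m == 1 << b:                # Rice case m = 2**r: b fixed bits
        return out + format(r, f"0{b}b") if b else out
    cutoff = (1 << (b + 1)) - m    # truncated binary for general m
    if r < cutoff:
        return out + format(r, f"0{b}b")
    return out + format(r + cutoff, f"0{b + 1}b")

print(golomb_encode(9, 4))  # Rice, m=2**2: quotient 2, remainder 1 -> "11001"
print(golomb_encode(4, 3))  # general Golomb, m=3 -> "1010"
```

Small inputs get short codewords, which is why the code is optimal when the input distribution is geometric.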
Article
Full-text available
Electromyography (EMG) sensors produce a stream of data at rates that can easily saturate a low-energy wireless link such as Bluetooth Low Energy (BLE), especially if more than a few EMG channels are transmitted simultaneously. Compressing data can thus be seen as a valuable feature that could allow both longer battery life and more simultaneous channels at the same time. A lot of research has been done on lossy compression algorithms for EMG data, but, being lossy, they inevitably introduce artifacts into the signal. Some artifacts may be tolerable for current applications. Nevertheless, for some research purposes, and to enable future research on the collected data that might need to exploit currently unforeseen features discarded by lossy algorithms, lossless compression may be very important, as it guarantees that no extra artifacts are introduced into the digitized signal. The present paper aims at demonstrating the effectiveness of such approaches, investigating the performance of several algorithms and their implementation on a real EMG BLE wireless sensor node. It is demonstrated that the required bandwidth can be more than halved, even reduced to 1/4 in an average case, and that, if the complexity of the compressor is kept low, significant power savings are also ensured.
... This approach firstly extends the top-k sparsification to get the sparse ternary tensor in the flattened tensor T * ∈ {−µ, 0, µ} where µ is the average value of top-k values. Afterward, they used the Golomb code [21] to compress the position of non-zero elements (i.e., µ's value) in T * before transferring on the network. To decode this tensor, the authors also need a one-bit tensor to represent the sign of µ's value in T * . ...
Article
Full-text available
Deep learning has achieved great success in many applications. However, its deployment in practice has been hurdled by two issues: the privacy of data that has to be aggregated centrally for model training and high communication overhead due to transmission of a large amount of data usually geographically distributed. Addressing both issues is challenging and most existing works could not provide an efficient solution. In this paper, we develop FedPC, a Federated Deep Learning Framework for Privacy Preservation and Communication Efficiency. The framework allows a model to be learned on multiple private datasets while not revealing any information of training data, even with intermediate data. The framework also minimizes the amount of data exchanged to update the model. We formally prove the convergence of the learning model when training with FedPC and its privacy-preserving property. We perform extensive experiments to evaluate the performance of FedPC in terms of the approximation to the upper-bound performance (when training centrally) and communication overhead. The results show that FedPC maintains the performance approximation of the models within 8.5% of the centrally-trained models when data is distributed to 10 computing nodes. FedPC also reduces the communication overhead by up to 42.20% compared to existing works.
... Due to this property, the encoder and decoder employed in its architecture are quite complicated and costly. Specifically, for lossless compression, one of the best available techniques in the literature is JPEG-LS [8], an effective prediction-based approach built on a median edge detector (MED) with an additional entropy coding stage, namely Golomb-Rice coding (GRC) [9]. Context adaptive lossless image compression (CALIC) [10], based on the gradient adjusted predictor (GAP), has also been widely employed in the literature. ...
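The MED predictor mentioned here is small enough to sketch; this follows the standard LOCO-I/JPEG-LS formulation, with a the left neighbor, b the one above, and c the one above-left:

```python
def med_predict(a, b, c):
    """Median edge detector (MED) prediction from JPEG-LS (LOCO-I):
    a = left neighbor, b = above neighbor, c = above-left neighbor."""
    if c >= max(a, b):
        return min(a, b)   # likely edge: pick the neighbor across it
    if c <= min(a, b):
        return max(a, b)
    return a + b - c       # smooth region: planar prediction

# A vertical edge: left=100, above=30, above-left=30 -> predict 100
print(med_predict(100, 30, 30))
```

The residual (actual pixel minus this prediction) is then fed to the Golomb-Rice entropy coder.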
Preprint
Full-text available
In this work, we propose a two-stage autoencoder-based compressor-decompressor framework for compressing malaria RBC cell image patches. Medical images used for disease diagnosis can run to multiple gigabytes in size, a substantial storage burden. The proposed residual-based dual autoencoder network is trained to extract the unique features which are then used to reconstruct the original image through the decompressor module. The two latent space representations (the first for the original image and the second for the residual image) are used to rebuild the final original image. Color-SSIM has been used exclusively to check the quality of the chrominance part of the cell images after decompression. The empirical results indicate that the proposed work outperformed other neural-network-based compression techniques for medical images by approximately 35%, 10% and 5% in PSNR, Color-SSIM and MS-SSIM respectively. The algorithm exhibits a significant improvement in bit savings of 76%, 78%, 75% and 74% over JPEG-LS, JP2K-LM, CALIC and a recent neural network approach respectively, making it a good compression-decompression technique.
... 24 The GR coding algorithm assumes that larger integer values have a lower probability of occurrence. 25,26 However, the performance of GR coding is reduced by the large error values found in high-contrast images. Moreover, the algorithm requires additional work, a weakness stemming from the fact that its only integer parameter, the divisor, needs to be constantly updated. ...
Article
The data produced in today's Internet and computer world grows day by day, and over time the storage and archiving of this data is becoming a significant problem. To overcome this problem, attempts have been made to reduce data sizes using compression methods; compression algorithms have therefore received great attention. In this study, two efficient encoding algorithms are presented and explained clearly. Both algorithms use frequency modulation: the characters that most frequently follow each character are determined, and the Huffman encoding algorithm is applied to them. In this study, the compression ratio (CR) is 49.44%. Moreover, 30 randomly selected images from three datasets, USC-SIPI, UCID, and STARE, have been used to evaluate the performance of the algorithms. Excellent results have been obtained on all test images compared with well-known algorithms such as the Huffman encoding algorithm, the arithmetic coding algorithm, and LPHEA.
... Some applications are interval intersection in computational biology [48], web-graph compression [7,8], IR and query processing in reordered databases [3,38], valid-time joins in temporal databases [11,22,28,52], ancestor checking in trees [12,13], data structures for set intersection [14], and bit vectors of wavelet trees [25,32] of the Burrows-Wheeler transform of highly-repetitive texts [27,34,39]. Although GAP(S) addresses this kind of non-uniform distribution, in the presence of runs, run-length encoding (RLE) [29] is more appropriate. Here, a set S with characteristic bit vector $$C_S[1..u] = 0^{z_1} 1^{\ell_1} 0^{z_2} 1^{\ell_2} \cdots 0^{z_g} 1^{\ell_g}$$ is represented through the sequences $$z_1, \ldots$$ ...
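The run decomposition described above, splitting a characteristic bit vector into alternating zero-gap lengths z_i and one-run lengths l_i, can be sketched as:

```python
from itertools import groupby

def runs(bits):
    """Decompose a 0/1 string into (value, length) runs, i.e. the
    z_i / l_i lengths of C_S[1..u] = 0^z1 1^l1 0^z2 1^l2 ..."""
    return [(v, len(list(g))) for v, g in groupby(bits)]

def expand(pairs):
    """Rebuild the bit vector from its runs."""
    return "".join(v * n for v, n in pairs)

bv = "0001100000111110"
print(runs(bv))  # [('0', 3), ('1', 2), ('0', 5), ('1', 5), ('0', 1)]
assert expand(runs(bv)) == bv
```

An RLE-based representation stores only these lengths, so its size scales with the number of runs g rather than with n or u.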
Article
Full-text available
Representing a static set of integers S, $$|S| = n$$, from a finite universe $$U = [1..u]$$ is a fundamental task in computer science. Our concern is to represent S in small space while supporting the operations of $$\mathsf{rank}$$ and $$\mathsf{select}$$ on S; if S is viewed as its characteristic vector, the problem becomes that of representing a bit-vector, which is arguably the most fundamental building block of succinct data structures. Although there is an information-theoretic lower bound of $${\mathcal{B}}(n, u) = \lg \binom{u}{n}$$ bits on the space needed to represent S, this applies to worst-case (random) sets S, and sets found in practical applications are compressible. We focus on the case where elements of S contain runs of $$\ell > 1$$ consecutive elements, one that occurs in many practical situations. Let $${\mathcal{C}}^{(n)}$$ denote the class of $$\binom{u}{n}$$ distinct sets of $$n$$ elements over the universe $$[1..u]$$. Let also $${\mathcal{C}}^{(n)}_{g} \subset {\mathcal{C}}^{(n)}$$ contain the sets whose $$n$$ elements are arranged in $$g \le n$$ runs of $$\ell_i \ge 1$$ consecutive elements from U for $$i = 1, \ldots, g$$, and let $${\mathcal{C}}^{(n)}_{g,r} \subset {\mathcal{C}}^{(n)}_{g}$$ contain all sets that consist of g runs, such that $$r \le g$$ of them have at least 2 elements. 
This paper yields the following insights and contributions related to $$\mathsf{rank}$$/$$\mathsf{select}$$ succinct data structures. We introduce new compressibility measures for sets, including $${\mathcal{B}}_1(g,n,u) = \lg |{\mathcal{C}}^{(n)}_{g}| = \lg \binom{u-n+1}{g} + \lg \binom{n-1}{g-1}$$ and $${\mathcal{B}}_2(r,g,n,u) = \lg |{\mathcal{C}}^{(n)}_{g,r}| = \lg \binom{u-n+1}{g} + \lg \binom{n-g-1}{r-1} + \lg \binom{g}{r}$$, such that $${\mathcal{B}}_2(r,g,n,u) \le {\mathcal{B}}_1(g,n,u) \le {\mathcal{B}}(n,u)$$. We give data structures that use space close to the bounds $${\mathcal{B}}_1(g,n,u)$$ and $${\mathcal{B}}_2(r,g,n,u)$$ and support $$\mathsf{rank}$$ and $$\mathsf{select}$$ in $$\mathrm{O}(1)$$ time. We provide additional measures involving entropy-coding of run lengths and gaps between items, and data structures to support $$\mathsf{rank}$$ and $$\mathsf{select}$$ using space close to these measures.
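For reference, the semantics of rank and select on a bit vector can be pinned down with a naive O(n) sketch; the paper's data structures answer both queries in O(1) time within compressed space:

```python
def rank(bits, i):
    """rank(i): number of 1s among bits[0..i], inclusive (0-based)."""
    return bits[: i + 1].count("1")

def select(bits, k):
    """select(k): position of the k-th 1 (k >= 1), or -1 if there is none."""
    seen = 0
    for pos, b in enumerate(bits):
        seen += b == "1"
        if seen == k:
            return pos
    return -1

bv = "0110100"
print(rank(bv, 3), select(bv, 3))  # 2 4
```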
... Once the blocks are identified, the SPV node can download only those blocks from the full nodes. In this second version, Golomb-coded sets [33] are used instead of Bloom filters for the approximate membership checking. ...
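The core of a Golomb-coded set can be sketched as follows: hash every element into a range proportional to the inverse false-positive rate, sort, and keep only the deltas, which are approximately geometric and hence a good fit for Golomb-Rice coding. The hash choice and the parameter name fp_inv below are illustrative assumptions, not the exact construction used in the cited scheme:

```python
import hashlib

def gcs_deltas(items, fp_inv=64):
    """Hash items into [0, n*fp_inv); the sorted values are roughly
    uniform, so consecutive deltas are roughly geometric with mean
    ~fp_inv, which Golomb-Rice coding compresses to ~log2(fp_inv)+O(1)
    bits each (fp_inv is an illustrative parameter name)."""
    n = len(items)
    space = n * fp_inv
    hashed = sorted(
        int.from_bytes(hashlib.sha256(x.encode()).digest()[:8], "big") % space
        for x in items
    )
    # Delta-encode: first value, then the gaps between consecutive values.
    return [hashed[0]] + [b - a for a, b in zip(hashed, hashed[1:])]

deltas = gcs_deltas(["alice", "bob", "carol", "dave"])
assert all(d >= 0 for d in deltas)
```

Membership is then tested by re-hashing the query and scanning the decoded values, with false positives only on hash collisions.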
Preprint
Full-text available
In the last decade, significant efforts have been made to reduce the false positive rate of approximate membership checking structures. This has led to the development of new structures such as cuckoo filters and xor filters. Adaptive filters that can react to false positives as they occur to avoid them for future queries to the same elements have also been recently developed. In this paper, we propose a new type of static filters that completely avoid false positives for a given set of negative elements and show how they can be efficiently implemented using xor probing filters. Several constructions of these filters with a false positive free set are proposed that minimize the memory and speed overheads introduced by avoiding false positives. The proposed filters have been extensively evaluated to validate their functionality and show that in many cases both the memory and speed overheads are negligible. We also discuss several use cases to illustrate the potential benefits of the proposed filters in practical applications.
... The majority of lossless compression algorithms can be classified as either statistical-based approaches or dictionary-based approaches [61,71]. Statistical-based approaches take advantage of each character's frequency; some popular examples are Run-length encoding [24], Shannon-Fano encoding [19,62], Arithmetic encoding [49], and Huffman encoding [31]. Run-length encoding can effectively compress consecutive repeated characters in a text. ...
Preprint
Full-text available
Mobile and wearable technologies have promised significant changes to the healthcare industry. Although cutting-edge communication and cloud-based technologies have allowed for these upgrades, their implementation and popularization in low-income countries have been challenging. We propose ODSearch, an On-device Search framework equipped with a natural language interface for mobile and wearable devices. To implement search, ODSearch employs compression and a Bloom filter, providing near real-time search query responses without network dependency. Our experiments were conducted on a mobile phone and a smartwatch. We compared ODSearch with current state-of-the-art search mechanisms; it outperformed them on average by 55 times in execution time, 26 times in energy usage, and 2.3% in memory utilization.
Article
Full-text available
Entropy coding is the essential block of transform coders that losslessly converts the quantized transform coefficients into the bit‐stream suitable for transmission or storage. Usually, the entropy coders exhibit less compression capability than the lossy coding techniques. Hence, in the past decade, several efforts have been made to improve the compression capability of the entropy coding technique. Recently, a symbol reduction technique (SRT) based Huffman coder is developed to achieve higher compression than the existing entropy coders at similar complexity of the regular Huffman coder. However, the SRT‐based Huffman coding is not popular for the real‐time applications due to the improper negative symbol handling and the additional indexing issues, which restrict its compression gain at most 10–20% over the regular Huffman coder. Hence, in this paper, an improved SRT (ISRT) based Huffman coder is proposed to properly alleviate the deficiencies of the recent SRT‐based Huffman coder and to achieve higher compression gains. The proposed entropy coder is extensively evaluated on the ground of compression gain and the time complexity. The results show that the proposed ISRT‐based Huffman coder provides significant compression gain against the existing entropy coders with lower time consumptions.
Article
We treat scalar data compression in sensor network nodes in streaming mode (compressing data points as they arrive, with no pre-compression buffering). Several experimental algorithms based on linear predictive coding (LPC) combined with run-length encoding (RLE) are considered. In the entropy coding stage we evaluated (a) variable-length coding with dynamic prefixes generated with the MTF transform, (b) adaptive-width binary coding, and (c) adaptive Golomb-Rice coding. We provide a comparison of known and experimental compression algorithms on 75 sensor data sources. Compression ratios achieved in the tests are about 1.5/4/1000000 (min/med/max), with a compression context size of about 10 bytes.
Chapter
A natural number “n” is represented by a collection of n tokens placed at a node. Thus, the unary coding of numbers is adopted, being their basic representation in the Peano standard model of arithmetic [Pea1889].
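The token representation described here is exactly unary coding; a minimal sketch using a 0 as terminator:

```python
def unary_encode(n):
    """Represent n as n 'tokens' (1s) followed by a 0 terminator."""
    return "1" * n + "0"

def unary_decode(bits):
    """Count 1s up to the first 0; return the value and the leftover bits."""
    n = bits.index("0")
    return n, bits[n + 1:]

print(unary_encode(5))  # "111110"
assert unary_decode("111110") == (5, "")
```

Unary is optimal only for very steep distributions, but it reappears as the quotient part of Golomb and Rice codes.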
Chapter
This chapter describes methods for coding images without loss; this means that, after compression, the input signal can be exactly reconstructed. These have two primary uses. Firstly, they are useful when it is a requirement of the application that loss of any kind is unacceptable. Examples include some medical applications, where compression artifacts could be considered to impact upon diagnoses, or perhaps in legal cases where documents are used as evidence. Secondly, and more commonly, they are used as a component in a lossy compression technique, in combination with transformation and quantization, to provide entropy-efficient representations of the resulting coefficients. We first review the motivations and requirements for lossless compression and then focus our attention on the two primary methods, Huffman coding and arithmetic coding. Huffman coding is used widely, for example in most JPEG codecs, and we show it can be combined with the discrete cosine transform (DCT) to yield a compressed bitstream with minimum average codeword length. We explore the properties and limitations of Huffman coding in Section 7.3 and then introduce an alternative method – arithmetic coding – in Section 7.6. This elegant and flexible, yet simple, method supports encoding strings of symbols rather than single symbols as in Huffman coding. Arithmetic coding has now been universally adopted as the entropy coder of choice in all modern compression standards. Finally, Section 7.7 presents some performance comparisons between Huffman and arithmetic coding.
Article
Full-text available
Nowadays, huge amounts of biological data are produced due to advancements in high-throughput sequencing technology. These enormous volumes of sequence data require effective storage, fast transmission, and quick access to any record for analysis. Standard general-purpose lossless compression techniques fail to compress these sequences well and may even increase their size, so researchers continually try to develop new algorithms for this purpose. Existing algorithms indicate that there is room for new ones that compress groups of genomes more time- and space-effectively. In this review paper, we analyze and present genomic compression algorithms both for single genomes, i.e. non-referential algorithms exploiting intra-sequence similarity, and for sets of related or non-related genomes, i.e. referential compression algorithms exploiting inter-sequence similarity. We also discuss the different data formats on which those algorithms are applied. The main focus of this review paper is the different data structures for huge sequence representation, such as compressed suffix tries, suffix trees, and suffix arrays; algorithms such as dynamic programming approaches; and different indexing techniques for searching similar subsequences using pattern recognition methods. We also discuss MapReduce using HDFS, Yarn and Spark for searching, and streaming concepts for reference sequence selection.
Article
Full-text available
Picture and video coding have progressed by leaps and bounds in recent years. However, as image and video acquisition devices become more prevalent, the growth rate of image and video data is outpacing the increase in compression ratio. It has been widely acknowledged that seeking further coding performance enhancement within the conventional hybrid coding paradigm is becoming increasingly difficult. The deep convolutional neural network (CNN), a form of neural network that has seen a resurgence in recent years and much success in artificial intelligence and signal processing, offers a novel and promising image and video compression solution. We present a systematic, thorough, and up-to-date analysis of neural-network-based image and video compression techniques in this paper. The evolution and advancement of neural-network-based compression methodologies are discussed for both images and video. More precisely, cutting-edge video coding techniques based on deep learning and the HEVC framework are introduced and addressed, with the goal of significantly improving state-of-the-art video coding efficiency. In addition, end-to-end image and video coding frameworks based on neural networks are examined, revealing intriguing explorations of next-generation image and video coding frameworks. The most important research works on image and video coding using neural networks are highlighted, along with future developments. In particular, the joint compression of semantic and visual information is tentatively investigated in order to formulate high-efficiency signal representation structures for both human and machine vision, the two most popular signal receptors in the age of artificial intelligence.
Chapter
Time series mining is an important branch of data mining, as time series data is ubiquitous and has many applications in several domains. The main task in time series mining is classification. Time series representation methods play an important role in time series classification and other time series mining tasks. One of the most popular representation methods of time series data is the Symbolic Aggregate approXimation (SAX). The secret behind its popularity is its simplicity and efficiency. SAX has however one major drawback, which is its inability to represent trend information. Several methods have been proposed to enable SAX to capture trend information, but this comes at the expense of complex processing, preprocessing, or post-processing procedures. In this paper we present a new modification of SAX that we call Trending SAX (TSAX), which only adds minimal complexity to SAX, but substantially improves its performance in time series classification. This is validated experimentally on 50 datasets. The results show the superior performance of our method, as it gives a smaller classification error on 39 datasets compared with SAX.
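The plain SAX pipeline the chapter builds on (z-normalize, reduce with piecewise aggregate approximation, then discretize the segment means) can be sketched as follows; the breakpoints below are the standard-normal quartiles, an assumption appropriate for a 4-letter alphabet, and the series length is assumed divisible by the number of segments:

```python
import statistics

def sax(series, segments=4, breakpoints=(-0.6745, 0.0, 0.6745)):
    """SAX sketch: z-normalize, apply piecewise aggregate approximation
    (PAA), then map each segment mean to a letter via the breakpoints."""
    mu = statistics.mean(series)
    sd = statistics.pstdev(series) or 1.0
    z = [(x - mu) / sd for x in series]
    step = len(z) // segments               # assumes len(z) % segments == 0
    paa = [sum(z[i * step:(i + 1) * step]) / step for i in range(segments)]
    alphabet = "abcd"
    return "".join(alphabet[sum(m > b for b in breakpoints)] for m in paa)

print(sax([1, 1, 2, 2, 8, 8, 9, 9]))  # low-low-high-high -> "aadd"
```

Because each letter only records a segment's mean level, a rising and a falling segment with the same mean get the same symbol, which is exactly the trend information TSAX sets out to recover.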
Article
In this paper, we propose an intelligent compression system that addresses the energy limitations of wireless video capsule endoscopy. The principle is to include a classification feedback loop, based on deep learning, to determine the importance of the images being transmitted. This classification is used with a simple prediction-based compression algorithm to allow intelligent management of the capsule's limited energy. The capsule starts by transmitting a subsampled version of each image at a small rate. The images are decoded and classified automatically to detect any possible lesions. Following the classification result, the images considered important for diagnosis are enhanced with additional content, whereas the less important ones are recorded with low quality. In this way, large amounts of bits are saved without affecting the diagnosis. The saved energy can be used to extend the life of the capsule or to increase the resolution and frame rate of some WCE images. The classification results show an accuracy of more than 99%, which allowed us to code losslessly almost all the important images in our test sequences. Our results also show that many additional images can be transmitted, their number depending on the subsampling used and the number of important images.
Article
An optimum method of coding an ensemble of messages consisting of a finite number of members is developed. A minimum-redundancy code is one constructed in such a way that the average number of coding digits per message is minimized.
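The construction this abstract describes, repeatedly merging the two least-frequent subtrees until one tree remains, can be sketched with a heap (a standard formulation, not Huffman's original presentation):

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Build a minimum-redundancy prefix code from symbol frequencies."""
    heap = [[w, i, {s: ""}] for i, (s, w) in enumerate(Counter(text).items())]
    heapq.heapify(heap)
    tie = len(heap)                  # tie-breaker so dicts are never compared
    while len(heap) > 1:
        lo = heapq.heappop(heap)     # two least-frequent subtrees
        hi = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in lo[2].items()}
        merged.update({s: "1" + c for s, c in hi[2].items()})
        heapq.heappush(heap, [lo[0] + hi[0], tie, merged])
        tie += 1
    return heap[0][2]

codes = huffman_codes("aaaabbc")
print(codes)  # the most frequent symbol 'a' gets the shortest codeword
assert len(codes["a"]) < len(codes["b"]) == len(codes["c"])
```

Each merge prepends one bit to every codeword in the two subtrees, so rarer symbols end up deeper in the tree, minimizing the average codeword length.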