Available via license: CC BY 4.0


Piecewise linear approximation of sensor signals is a well-known technique in the fields of data mining and activity recognition. In this context, several algorithms have been developed, some of them designed to run on the resource-constrained microcontroller architectures of wireless sensor nodes. While microcontrollers are usually constrained in computational power and memory, all state-of-the-art piecewise linear approximation techniques either need to buffer sensor data or have an execution time that depends on the segment's length. In the paper at hand, we propose a novel piecewise linear approximation algorithm with constant computational complexity as well as constant memory complexity. In our experiments, our algorithm's worst-case execution time is one to three orders of magnitude smaller, and its average execution time three to seventy times smaller, than those of state-of-the-art Piecewise Linear Approximation (PLA) algorithms. Our evaluations show that the algorithm is time- and memory-efficient without sacrificing approximation quality compared to other state-of-the-art piecewise linear approximation techniques, while providing a maximum error guarantee per segment, a small parameter space of only one parameter, and a maximum latency of one sample period plus its worst-case execution time.
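As a rough illustration of how an error-bounded online PLA with O(1) work per sample can operate, the following sketch maintains only a segment's start point and a feasible slope interval (a "swing filter"-style cone). It is a generic illustration under our own simplifying assumptions (class name, segment restart policy), not the paper's exact algorithm:

```python
class SwingPLA:
    """Sketch of error-bounded online PLA (swing-filter style).

    Keeps only the segment's start point and two slope bounds,
    so each incoming sample costs O(1) time and O(1) memory.
    """

    def __init__(self, eps):
        self.eps = eps                  # max allowed deviation per segment
        self.x0 = self.y0 = None        # current segment's start point
        self.lo = self.hi = None        # feasible slope interval (the cone)
        self.last = None                # most recent accepted point
        self.segments = []              # finished segments: (x0, y0, x1, y1)

    def push(self, x, y):
        if self.x0 is None:             # first point of a new segment
            self.x0, self.y0 = x, y
            self.lo, self.hi = float("-inf"), float("inf")
            self.last = (x, y)
            return
        dx = x - self.x0
        # slopes that keep this point within +/- eps of the segment line
        lo_new = max(self.lo, (y - self.eps - self.y0) / dx)
        hi_new = min(self.hi, (y + self.eps - self.y0) / dx)
        if lo_new <= hi_new:            # point fits: narrow the slope cone
            self.lo, self.hi = lo_new, hi_new
            self.last = (x, y)
        else:                           # cone empty: close segment, restart
            self._close()
            self.x0, self.y0 = x, y
            self.lo, self.hi = float("-inf"), float("inf")
            self.last = (x, y)

    def _close(self):
        slope = (self.lo + self.hi) / 2.0   # any slope in the cone is valid
        x1, _ = self.last
        self.segments.append((self.x0, self.y0, x1,
                              self.y0 + slope * (x1 - self.x0)))

    def finish(self):
        if self.x0 is not None and self.last[0] != self.x0:
            self._close()
        return self.segments
```

Because the cone update touches a fixed number of variables regardless of how long the current segment grows, both execution time per sample and memory usage stay constant, which is the property the abstract emphasizes.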


... To this end, single-pass, continuous compression algorithms that spend a constant time per ingested data point can offer rapid compression at a small computational footprint. While single-pass compression algorithms cannot outperform multi-pass alternatives in terms of compressed data size (as multiple passes allow continued optimization of the compressed representation), single-pass lossy compression presents a viable option in VCPSs by trading off accuracy and computational demands [81]. ...

... However, given the constraints posed by the latter in the form of small computational capabilities (see Section 2.2), such compression must have a low overhead; and faced with fast-paced continuous data streams, compression that uses only a single pass over the data is advantageous. A compression technique optimized specifically for edge devices is, for example, presented in [81]. Optimizing compression and resource usage, the authors opt for lossy PLA compression (see Section 3.2) with a small memory footprint and instruction count per operation. ...

... Any deviations from the raw data introduced by compression will lead to inaccuracies downstream in the analysis pipeline, and such inaccuracies may render the use of boundlessly lossy compression impossible. In [81], the authors' technique bounds the maximum deviation per segment and thus gives some precision guarantees, but an evaluation of the downstream effect of the overall reduced precision was not presented. ...

... As exemplified in the work of Xu et al. [52], formulating a non-convex problem and transforming it into a convex problem followed by optimization can lead to solutions that suggest superior performance. [Table: algorithms that have been considered thus far for edge computing — Karhunen-Loeve transform (KLT), lossless [82]; piecewise linear approximation (PLA), lossy [47,83-85]; Huffman algorithm, lossless [14]; k-means clustering (vector quantization), lossy [13,67].] ...

... An example is the work of Duvignau et al. in the context of processing streaming data, that is, how to efficiently approximate input data streams in a piecewise-linear fashion [47]. Other works include that of Grützmacher et al., who achieve both a computational and a memory complexity of O(1), yielding a competitive algorithm for online PLA of sensor signals [85]. Introducing the so-called swing resolution reduction in their piecewise linear approximation, which imposes additional constraints on the endpoints of the approximating segments, Lin et al. obtained a more effective compression ratio for the (smart-factory) sensor data considered [84]. ...

... [Table: time-series compression techniques.] Compression by extracting major extrema (time domain): the process of compressing a time series by extracting its major extrema, that is, minima and maxima [173]. Principal component analysis (PCA) (time domain): one of the most famous techniques for unsupervised data compression; it helps identify patterns in the dataset based on the correlation between the elements, finding the directions of maximum variance and projecting the data into lower dimensions [174]. Piecewise linear approximation (PLA) (time domain) [47,83-85]. Critical aperture (CA) (time domain): a system for compressing a set of data points that includes a critical-aperture compression module configured to discard one or more data points from the set [72]. Walsh-Hadamard transform (WHT) (transform domain): WHT-based image compression relies on two techniques to reduce the data required to represent the image. ...

Edge computing aims to address the challenges associated with communicating and transferring large amounts of data generated remotely to a data center in a timely and efficient manner. A central pillar of edge computing is local (i.e., at- or near-source) data processing capability so that data transfer to a data center for processing can be minimized. Data compression at the edge is therefore a natural component of edge workflows. We present a survey of data compression algorithms with a focus on edge computing. Not all compression algorithms can accommodate the data type heterogeneity, tight processing and communication time constraints, or energy efficiency requirement characteristics of edge computing. We discuss specific examples of compression algorithms that are being explored in the context of edge computing. We end our review with a brief survey of emerging quantum compression techniques that are of importance in quantum information processing, including the proposed concept of quantum edge computing.

... In [4], the whole domain was evenly divided into several intervals, and each interval was approximated by a cubic polynomial using the least squares method with the constraint that it had continuity at all boundaries. The studies cited in [11-15] proposed an approximation method using piecewise linear approximation (PLA), while ref. [16] proposed a method that uses several linear functions and then moves the endpoints of each interval appropriately to reduce the approximation error of the interval. ...

... Thus, the order and the number of polynomials should be balanced and optimized. In order to accomplish this, an optimization scheme is proposed in contrast with [4-16], which utilized a predetermined polynomial order and number of polynomials. ...

In this paper, an optimal approximation algorithm is proposed to simplify non-linear functions and/or discrete data as piecewise polynomials using constrained least squares. In time-sensitive applications or in embedded systems with limited resources, the runtime of the approximate function is as crucial as its accuracy. The proposed algorithm searches for the optimal piecewise polynomial (OPP) with the minimum computational cost while ensuring that the error stays below a specified threshold. This was accomplished by using smooth piecewise polynomials with an optimal order and number of intervals. The computational cost depended only on the polynomial complexity, i.e., the order and the number of intervals, at each runtime function call. In previous studies, the user had to decide some or all of the orders and the number of intervals; in contrast, the OPP approximation algorithm determines both. For the optimal approximation, the computational costs of all possible combinations of piecewise polynomials were calculated and tabulated in ascending order for the specific target CPU off-line. Each combination was optimized through constrained least squares and a random selection method for the given sample points. Afterward, it was examined whether the approximation error was below the predetermined value. When the error was permissible, the combination was selected as the optimal approximation; otherwise, the next combination was examined. To verify the performance, several representative functions were examined and analyzed.
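The cost-ordered search over candidate combinations can be illustrated with a deliberately simplified sketch: pieces are fixed to first order and equal width, so "cheaper" simply means fewer pieces, and the first candidate whose maximum error stays below the threshold is accepted. Function names and the equal-width split are our own assumptions, not the paper's formulation:

```python
def linfit(xs, ys):
    """Least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx else 0.0
    return a, my - a * mx

def cheapest_piecewise_fit(xs, ys, tol, max_pieces=8):
    """Try 1, 2, ... equal-width linear pieces (cheapest first) and
    return the first fit whose max absolute error is <= tol.
    Simplified stand-in for the paper's order/interval search."""
    for k in range(1, max_pieces + 1):
        bounds = [i * len(xs) // k for i in range(k + 1)]
        pieces, worst = [], 0.0
        for i in range(k):
            sl = slice(bounds[i], bounds[i + 1])
            a, b = linfit(xs[sl], ys[sl])
            pieces.append((xs[sl][0], a, b))   # (piece start x, slope, intercept)
            worst = max(worst, max(abs(a * x + b - y)
                                   for x, y in zip(xs[sl], ys[sl])))
        if worst <= tol:
            return k, pieces
    return None
```

In the paper's actual scheme the candidates are full (order, interval-count) combinations ranked by their tabulated runtime cost on the target CPU, and each fit additionally enforces smoothness constraints; the sketch only shows the accept-the-first-feasible-candidate structure of that search.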

... This study is based on the methodology outlined in [13,14], which is used to adaptively linearize sensor transfer functions. This approach simplifies the design and improves the measurement accuracy of sensors and Internet of Things (IoT) devices, especially those with limited resources. ...

In this work, an innovative numerical approach for polylinear approximation (polylinearization) of non-self-intersecting compact sensor characteristics (transfer functions), specified either pointwise or analytically, is introduced. The goal is to optimally partition the sensor characteristic, i.e., to select the vertices of the approximating polyline (approximant) along with their positions on the sensor characteristic, so that the distance (i.e., the separation) between the approximant and the characteristic is rendered below a certain problem-specific tolerance. To achieve this goal, two alternative non-linear optimization problems are solved, whose essential difference lies in the adopted quantitative measure of the separation between the transfer function and the approximant. In the first problem, which relates to absolutely integrable sensor characteristics (their energy is not necessarily finite, but they can be represented in terms of convergent Fourier series), the polylinearization is constructed by numerical minimization of the L1-metric (a distance-based separation measure) with respect to the number of polyline vertices and their locations. In the second problem, which covers the quadratically integrable sensor characteristics (whose energy is finite, but they do not necessarily admit a representation in terms of convergent Fourier series), the polylinearization is constructed by numerically minimizing the L2-metric (an area- or energy-based separation measure) for the same set of optimization variables: the locations and the number of polyline vertices.

... This research built upon the technique outlined in [1,18], which was used to adaptively linearize sensor characteristics, make the design simpler, and improve the measurement accuracy of sensors and IoT devices that are resource-limited. ...

The popularity of smart sensors and the Internet of Things (IoT) is growing in various fields and applications. Both collect and transfer data to networks. However, due to limited resources, deploying IoT in real-world applications can be challenging. Most of the algorithmic solutions proposed so far to address these challenges were based on linear interval approximations and were developed for resource-constrained microcontroller architectures, i.e., they need buffering of the sensor data and either have a runtime dependency on the segment length or require the sensor inverse response to be analytically known in advance. Our present work proposed a new algorithm for the piecewise-linear approximation of differentiable sensor characteristics with varying algebraic curvature, maintaining the low fixed computational complexity as well as reduced memory requirements, as demonstrated in a test concerning the linearization of the inverse sensor characteristic of type K thermocouple. As before, our error-minimization approach solved the two problems of finding the inverse sensor characteristic and its linearization simultaneously while minimizing the number of points needed to support the characteristic.
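Once the supporting points of such a piecewise-linear inverse characteristic have been chosen, applying it on a device reduces to a table lookup plus one interpolation. The sketch below (with hypothetical names; not the authors' code, and the vertex selection itself is the optimization the work describes) shows only this evaluation step:

```python
from bisect import bisect_right

def make_polyline_eval(knots):
    """Given polyline vertices [(x0, y0), ...] sorted by x, return a
    function evaluating the piecewise-linear approximant at any x.
    Inputs outside the table are clamped to the outermost segments."""
    xs = [x for x, _ in knots]

    def f(x):
        i = bisect_right(xs, x) - 1          # segment whose x-range holds x
        i = max(0, min(i, len(knots) - 2))   # clamp to valid segment index
        (x0, y0), (x1, y1) = knots[i], knots[i + 1]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

    return f
```

On an actual microcontroller this would typically be a fixed-point table in flash plus the same two-step lookup/interpolation, which is why minimizing the number of supporting points matters for memory-limited IoT devices.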

Increasing amounts of data are sensed at the edge of the Edge-to-Cloud (E2C) continuum, enabling the rapid development of data-driven applications based on, e.g., Machine Learning. This is especially true for Vehicular Cyber-Physical Systems (VCPSs), networks of connected vehicles equipped with high-bandwidth sensors, where Big Data originating on the vehicles is crucial for the advancement of autonomous drive, developing new cars, and more. Limited bandwidth and storage mean that moving this vehicular Big Data from the edge to central processing increasingly poses challenges. In this work, we present our research on how to alleviate these through efficiently localizing data on the edge, selecting relevant data in a data stream, and distributing the processing of data in a VCPS.

This study introduces an innovative numerical approach for polylinear approximation (polylinearization) of non-self-intersecting compact sensor characteristics (transfer functions) specified either pointwise or analytically. The goal is to partition the sensor characteristic optimally, i.e., to select the vertices of the approximating polyline (approximant) along with their positions, on the sensor characteristics so that the distance (i.e., the separation) between the approximant and the characteristic is rendered below a certain problem-specific tolerance. To achieve this goal, two alternative nonlinear optimization problems are solved, which differ in the adopted quantitative measure of the separation between the transfer function and the approximant. In the first problem, which relates to absolutely integrable sensor characteristics (their energy is not necessarily finite, but they can be represented in terms of convergent Fourier series), the polylinearization is constructed by the numerical minimization of the L1-metric (a distance-based separation measure), concerning the number of polyline vertices and their locations. In the second problem, which covers the quadratically integrable sensor characteristics (whose energy is finite, but they do not necessarily admit a representation in terms of convergent Fourier series), the polylinearization is constructed by numerically minimizing the L2-metric (area- or energy-based separation measure) for the same set of optimization variables—the locations and the number of polyline vertices.

Long-term activity recognition relies on wearable sensors that log the physical actions of the wearer, so that these can be analyzed afterwards. Recent progress in this field has made it feasible to log high-resolution inertial data, resulting in increasingly large data sets. We propose the use of piecewise linear approximation techniques to facilitate this analysis. This paper presents a modified version of SWAB to approximate human inertial data as efficiently as possible, together with a matching algorithm to query for similar subsequences in large activity logs. We show that our proposed algorithms are faster on human acceleration streams than the traditional ones while being comparable in accuracy to spot similar actions, benefitting post-analysis of human activity data.

In this paper, two tools are presented: an execution-driven cache simulator which relates event metrics to a dynamically built-up call graph, and a graphical front end able to visualize the generated data in various ways. To obtain a general-purpose, easy-to-use tool suite, the simulation approach allows us to take advantage of runtime instrumentation, i.e., no preparation of application code is needed, and enables sophisticated preprocessing of the data already in the simulation phase. In an ongoing project, research on advanced cache analysis is based on these tools. Taking a multigrid solver as an example, we present the results obtained from the cache simulation together with real data measured by hardware performance counters.

Given a time series S = ((x1, y1), (x2, y2), ...) and a prescribed error bound ε, the piecewise linear approximation (PLA) problem with max-error guarantees is to construct a piecewise linear function f such that |f(xi)-yi| ≤ ε for all i. In addition, we would like to have an online algorithm that takes the time series as the records arrive in a streaming fashion, and outputs the pieces of f on-the-fly. This problem has applications wherever time series data is being continuously collected, but the data collection device has limited local buffer space and communication bandwidth, so that the data has to be compressed and sent back during the collection process. Prior work addressed two versions of the problem, where either f consists of disjoint segments, or f is required to be a continuous piecewise linear function. In both cases, existing algorithms can produce a function f that has the minimum number of pieces while meeting the prescribed error bound ε. However, we observe that neither minimizes the true representation size of f, i.e., the number of parameters required to represent f. In this paper, we design an online algorithm that generates the optimal PLA in terms of representation size while meeting the prescribed max-error guarantee. Our experiments on many real-world data sets show that our algorithm can reduce the representation size of f by around 15% on average compared with the current best methods, while still requiring O(1) processing time per data record and small space.

This paper proposes an activity inference system that has been designed for deployment in mood disorder research, which aims at accurately and efficiently recognizing selected leisure activities in week-long continuous data. The approach to achieve this relies on an unobtrusive and wrist-worn data logger, in combination with a custom data mining tool that performs early data abstraction and dense motif discovery to collect evidence for activities. After presenting the system design, a feasibility study on weeks of continuous inertial data from 6 participants investigates both accuracy and execution speed of each of the abstraction and detection steps. Results show that our method is able to detect target activities in a large data set with a comparable precision and recall to more conventional approaches, in approximately the time it takes to download and visualize the logs from the sensor.

We present a method for spotting sporadically occurring gestures in a continuous data stream from body-worn inertial sensors. Our method is based on a natural partitioning of continuous sensor signals and uses a two-stage approach for the spotting task. In a first stage, signal sections likely to contain specific motion events are preselected using a simple similarity search. Those preselected sections are then further classified in a second stage, exploiting the recognition capabilities of hidden Markov models. Based on two case studies, we discuss implementation details of our approach and show that it is a feasible strategy for the spotting of various types of motion events.

Many sensor network applications observe trends over an area by regularly sampling slow-moving values such as humidity or air pressure (for example in habitat monitoring). Another well-published type of application aims at spotting sporadic events, such as sudden rises in temperature or the presence of methane, which are tackled by detection on the individual nodes. This paper focuses on a zone between these two types of applications, where phenomena that cannot be detected on the nodes need to be observed by relatively long sequences of sensor samples. An algorithm that stems from data mining is proposed that abstracts the raw sensor data on the node into smaller packet sizes, thereby minimizing the network traffic and keeping the essence of the information embedded in the data. Experiments show that, at the cost of slightly more processing power on the node, our algorithm performs a shape abstraction of the sensed time series which, depending on the nature of the data, can extensively reduce network traffic and nodes' power consumption.

With sensors becoming smaller and more power efficient, wearable sensors that anyone could wear are becoming a feasible concept. We demonstrate a small lightweight module, called Porcupine, which aims at continuously monitoring human activities as long as possible, and as fine-grained as possible. We present initial analysis of a set of abstraction algorithms that combine and process raw accelerometer data and tilt switch states, to get descriptors of the user's motion-based activities. The algorithms are running locally, and the information they produce is stored in on-board memory for later analysis.

Dynamic binary instrumentation (DBI) frameworks make it easy to build dynamic binary analysis (DBA) tools such as checkers and profilers. Much of the focus on DBI frameworks has been on performance; little attention has been paid to their capabilities. As a result, we believe the potential of DBI has not been fully exploited. In this paper we describe Valgrind, a DBI framework designed for building heavyweight DBA tools. We focus on its unique support for shadow values-a powerful but previously little-studied and difficult-to-implement DBA technique, which requires a tool to shadow every register and memory value with another value that describes it. This support accounts for several crucial design features that distinguish Valgrind from other DBI frameworks. Because of these features, lightweight tools built with Valgrind run comparatively slowly, but Valgrind can be used to build more interesting, heavyweight tools that are difficult or impossible to build with other DBI frameworks such as Pin and DynamoRIO.