
# A Fast Approximation of the Hyperbolic Tangent When Using Posit Numbers and Its Application to Deep Neural Networks

## Abstract and Figures

Deep Neural Networks (DNNs) are being used in more and more fields. Among them, automotive is one where DNNs are exploited the most. An important aspect to consider is the real-time constraint that this kind of application puts on neural network architectures. This calls for a fast and hardware-friendly information representation. The recently proposed Posit format has proved to be extremely efficient as a low-bit replacement for traditional floats. The format has already made it possible to construct a fast approximation of the sigmoid function, an activation function frequently used in DNNs. In this paper we present a fast approximation of another activation function widely used in DNNs: the hyperbolic tangent. In the experiments, we show that the approximated hyperbolic tangent outperforms its approximated sigmoid counterpart. The implication is that the Posit format again proves to be DNN friendly, with important outcomes.
Marco Cococcioni (1), Federico Rossi (1), Emanuele Ruffaldi (2), Sergio Saponara (1)
(1) Dept. of Information Engineering, University of Pisa, 56122 Italy
(2) MMI spa, Calci, Pisa, 56011 Italy
Keywords. Deep Neural Networks (DNNs), Posit, Activation functions
1 Introduction
The use of deep neural networks (DNNs) as a general tool for signal and data
processing is increasing in both industry and academia. One of the key challenges is
the cost-effective computation of DNNs, so that these techniques can be implemented
at low cost, at low power and in real time for embedded applications in IoT devices,
robots, autonomous cars and so on. To this aim, an open research field is devoted to
the cost-effective implementation of the main operators used in DNNs, among them
the activation function. The basic node of a DNN computes the sum of products of its
inputs (X) and their corresponding weights (W), and then applies an activation
function f(·) to this sum to obtain the output of that layer, which is fed as input to
the next layer. Without an activation function, the output would simply be a linear
function of the inputs, which has low complexity but is not powerful enough to learn
complex (typically non-linear) mappings from data. This is why the most used
activation functions, such as Sigmoid, Tanh (hyperbolic tangent) and ReLu (rectified
linear unit), introduce non-linear properties into DNNs [1,2]. The choice of the
activation function for a DNN model must take into account various aspects of both
the data distribution under consideration and the underlying information
representation. Moreover, for decision-critical applications such as machine
perception for robots and autonomous cars, the accuracy of the implementation is
also important.
Indeed, one of the main trends in industry for keeping the complexity of DNN
computation low is to avoid complex arithmetic such as double-precision floating
point (64-bit) and to rely instead on much more compact formats such as BFLOAT or
Flexpoint [3, 4] (i.e. a revised version of the 16-bit IEEE-754 floating point format,
adopted by Google Tensor Processing Units and Intel AI processors), or on
transprecision computing [5, 6] (e.g. the recent Turing GPU from NVIDIA supports
INT32, INT8, INT4, fp32 and fp16 computation [5]). To this aim, this paper presents a
fast approximation of the hyperbolic tangent activation function, combined with a new
hardware-friendly information representation based on the Posit numerical format.
Hereafter, Section 2 introduces the Posit format and the CppPosit library, implemented
at the University of Pisa for computing with the new numerical format. Section 3
introduces the hyperbolic tangent and its approximation. Section 4 reports
implementation results obtained when the proposed technique is applied to a DNN on
known benchmark datasets, and compares it with other known activation functions,
such as the sigmoid. Conclusions are drawn in Section 5.
2 Posit Arithmetic and the CppPosit Library
The Posit format as proposed in [7-9] is a fixed-length representation composed of at
most four fields, as shown in Fig. 1: a 1-bit sign field, a variable-length regime field,
a variable-length exponent field (up to es bits) and a variable-length fraction field.
The overall length and the maximum exponent length are decided a priori. The length
and content of the regime field are determined by a run of consecutive zeroes or ones,
terminated respectively by a single one (negative regime) or a single zero (positive
regime) (see Fig. 2).
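The field layout described above can be made concrete with a small decoder. The following is a minimal Python sketch, not code from the cppPosit library: the function name `decode_posit` and its parameters are hypothetical, and it assumes the usual posit conventions of [9] (two's complement for negative values, the all-zero pattern for 0, a lone sign bit for NaR, and missing exponent bits read as zero).

```python
import math

def decode_posit(bits, nbits=8, esbits=0):
    """Decode an nbits-wide posit bit pattern (given as an int) into a float."""
    mask = (1 << nbits) - 1
    sign_bit = 1 << (nbits - 1)
    bits &= mask
    if bits == 0:
        return 0.0
    if bits == sign_bit:              # NaR ("not a real")
        return math.nan
    negative = bool(bits & sign_bit)
    if negative:                      # negative posits are stored in two's complement
        bits = (-bits) & mask
    body = format(bits, f"0{nbits}b")[1:]        # everything after the sign bit
    first = body[0]
    run = len(body) - len(body.lstrip(first))    # length of the regime run
    k = run - 1 if first == "1" else -run        # regime value
    rest = body[run + 1:]                        # skip the terminating bit
    exp_bits = rest[:esbits].ljust(esbits, "0")  # truncated exponent bits count as 0
    exponent = int(exp_bits, 2) if esbits else 0
    frac_bits = rest[esbits:]
    fraction = int(frac_bits, 2) / (1 << len(frac_bits)) if frac_bits else 0.0
    scale = k * (1 << esbits) + exponent         # useed**k * 2**exponent as a power of 2
    value = 2.0 ** scale * (1.0 + fraction)
    return -value if negative else value
```

For example, with 8 bits and esbits=0 the pattern `0x40` (sign 0, regime `10`, empty fraction) decodes to 1.0, and `0x20` (regime `01`) decodes to 0.5.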
In this work we use the cppPosit library, a modern C++14 implementation of the
original Posit number system. The library identifies four different operational
levels (L1-L4):
- L1 operations are the ones involving bit manipulation of the posit, without decoding
it, treating it as an integer. L1 operations are thus performed on the ALU and are fast.
- L2 operations involve unpacking the Posit into its four different fields, with no
exponent computation.
- L3 operations instead involve full exponent unpacking, but without the need to
perform arithmetic operations on the unpacked fields (examples are converting
to/from float, posit or fixed point).
- L4 operations require the unpacked version to perform software/hardware floating
point computation using unpacked fields.
L1 operations are the most interesting, since they are the most efficient ones. L1
operations include inversion, negation, comparisons and absolute value. Moreover,
when esbits=0, L1 operations also include doubling/halving, the complement to one
(1-x) when the specific Posit value falls within the range [0,1], and an approximation
of the sigmoid function, called here FastSigmoid and described in [9]. Table 1 lists
some implemented L1 operations, stating whether the formula is exact or an
approximation, and the requirements of each operation in terms of Posit configuration
and value. It is important to underline that every effort put into finding an L1
expression for a function or operation has two advantages: faster execution when
using a software-emulated PPU (Posit Processing Unit), and a smaller area (i.e. fewer
transistors) when the PPU is implemented in hardware.
Table 1. L1 operations summary

| Operation | Approximation | Requirements |
|---|---|---|
| 2*x | no | esbits=0 |
| x/2 | no | esbits=0 |
| 1/x | no | none |
| 1-x | no | esbits=0, x in [-1,1] |
| FastSigmoid [9] | yes | esbits=0 |
| FastTanh (see below) | yes | esbits=0 |
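To illustrate what "bit manipulation without decoding" means in practice, here is a minimal Python sketch of three of these operations on integer bit patterns with esbits=0. The function names are hypothetical and are not the cppPosit API. The sketch relies on two facts: FastSigmoid in [9] is obtained by flipping the sign bit and shifting right by two positions, and with esbits=0 a pattern whose value lies in [-1, 1] equals its two's complement integer divided by 2^(nbits-2), so that 1-x becomes an integer subtraction.

```python
def neg(bits, nbits=8):
    """Negation: two's complement of the bit pattern."""
    return (-bits) & ((1 << nbits) - 1)

def compl1(bits, nbits=8):
    """1 - x for a pattern that encodes x in [0, 1] (esbits = 0)."""
    one = 1 << (nbits - 2)              # bit pattern of +1.0
    if not 0 <= bits <= one:
        raise ValueError("compl1 needs a pattern encoding x in [0, 1]")
    return one - bits

def fast_sigmoid(bits, nbits=8):
    """FastSigmoid of [9]: flip the sign bit, then shift right by two."""
    mask = (1 << nbits) - 1
    sign = 1 << (nbits - 1)
    return ((bits ^ sign) & mask) >> 2

def unit_value(bits, nbits=8):
    """Real value of a pattern known to lie in [-1, 1] (esbits = 0 only)."""
    sign = 1 << (nbits - 1)
    signed = bits - (1 << nbits) if bits & sign else bits
    return signed / (1 << (nbits - 2))
```

For instance, `fast_sigmoid(0x00)` returns `0x20`, the pattern of 0.5, and `compl1(0x10)` maps 0.25 to `0x30`, the pattern of 0.75.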
3 The Hyperbolic Tangent and its Approximation FastTanh
The hyperbolic tangent is a non-linear activation function typically adopted as a
replacement for the sigmoid activation function. Its advantage over the sigmoid is
the greater emphasis it gives to negative values: the output of the hyperbolic tangent
spans [-1, 1], while the sigmoid output covers only half of that range, lying in [0, 1].
This difference in output range heavily impacts performance when using small-sized
number representations, such as Posits with 10 or 8 bits. If we apply the sigmoid
function to a Posit with n bits, we are actually using, as output, a Posit with n-1 bits,
since we are discarding the range [-1,0], which is significantly dense in the Posit
format (see Fig. 3).
Fig. 3 The posit circle when the total number of bits is 5. The hyperbolic tangent uses all the
numbers in [-1, 1], while the sigmoid function only the ones in [0, 1].
However, as already mentioned, the sigmoid function

$$\operatorname{sigmoid}(x) = \frac{1}{1 + e^{-x}}$$

has a fast and efficient L1 approximation when using Posits with 0 exponent bits [9]
(FastSigmoid). In order to exploit a similar trick for the hyperbolic tangent, we first
introduce the scaled sigmoid function:

$$\operatorname{sSigmoid}_k(x) = k \cdot \operatorname{sigmoid}(kx) - \frac{k}{2} \tag{1}$$
Particularly interesting is the case k=2, when the scaled sigmoid coincides with the
hyperbolic tangent:

$$\operatorname{sSigmoid}_2(x) = \frac{e^{2x} - 1}{e^{2x} + 1} = \tanh(x) \tag{2}$$
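As a worked check of Eq. (2), the case k=2 of Eq. (1) can be expanded step by step, using only the definition of the sigmoid given above:

$$
\begin{aligned}
\operatorname{sSigmoid}_2(x) &= 2 \cdot \operatorname{sigmoid}(2x) - 1
 = \frac{2}{1 + e^{-2x}} - 1
 = \frac{1 - e^{-2x}}{1 + e^{-2x}} \\
 &= \frac{e^{2x} - 1}{e^{2x} + 1}
 = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}
 = \tanh(x)
\end{aligned}
$$

The fourth expression is obtained by multiplying numerator and denominator by $e^{2x}$, and the fifth by dividing both by $e^{x}$.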
Now that we can express the hyperbolic tangent as a linear function of the sigmoid,
we must rework the expression in order to provide a fast and efficient approximation
to be used with Posits.
We know that Posit properties guarantee that, when using the 0 exponent bits format,
doubling the Posit value and computing its sigmoid approximation are just a matter of
bit manipulation, so they can be obtained efficiently. The subtraction in Equation (1),
however, does not come with an efficient bit-manipulation implementation as-is. In
order to transform it into an L1 operation, we have to rewrite it (for k=2) as:

$$\tanh(x) = 2 \cdot \operatorname{sigmoid}(2x) - 1 = -\bigl(1 - 2 \cdot \operatorname{sigmoid}(2x)\bigr) \tag{3}$$
Let us now focus on negative values of x only. For these values, the expression
2·sigmoid(2x) lies inside the unit interval [0, 1]. Therefore, the L1 complement to one
(1-x) can be applied. Finally, negation is always an L1 operation, so for all negative
values of x the hyperbolic tangent approximation can be computed using L1 operations
only. Moreover, thanks to the anti-symmetry of the hyperbolic tangent, this approach
can also be extended to positive values. The following is a possible pseudo-code
implementation:
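FastTanh(x) → y
    s   = x > 0                 // remember whether the input was positive
    x_n = s ? -x : x            // fold the input onto the non-positive half-line
    y_n = neg(compl1(twice(FastSigmoid(twice(x_n)))))
    y   = s ? -y_n : y_n        // restore the sign using anti-symmetry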
where twice is an L1 operation which computes 2x, neg is the L1 negation and compl1
is the function that computes the complement to one (1-x), again as an L1 operation.
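The pseudo-code can be turned into a runnable sketch for Posit<8,0> bit patterns. The Python below is an illustration under stated assumptions, not the cppPosit implementation: all function names are hypothetical, `decode` is included only to compare against the exact tanh, and `twice` is a piecewise construction of our own that truncates the last fraction bit when |x| >= 1 instead of rounding it.

```python
import math

MASK, NAR = 0xFF, 0x80      # 8-bit posit, esbits = 0; NaR is the lone sign bit
ONE, HALF = 0x40, 0x20      # bit patterns of +1.0 and +0.5

def decode(bits):
    """Reference decoder, used only to inspect results (not an L1 operation)."""
    bits &= MASK
    if bits == 0:
        return 0.0
    if bits == NAR:
        return math.nan
    negative = bits > NAR
    mag = (-bits) & MASK if negative else bits
    body = format(mag, "08b")[1:]
    run = len(body) - len(body.lstrip(body[0]))
    k = run - 1 if body[0] == "1" else -run
    frac = body[run + 1:]
    value = 2.0 ** k * (1 + (int(frac, 2) / (1 << len(frac)) if frac else 0.0))
    return -value if negative else value

def neg(bits):
    return (-bits) & MASK

def compl1(bits):
    return ONE - bits                    # valid for patterns encoding [0, 1]

def fast_sigmoid(bits):
    return ((bits ^ NAR) & MASK) >> 2    # flip the sign bit, shift right by two [9]

def twice(bits):
    """2*x on the bit pattern (assumed piecewise form, truncating for |x| >= 1)."""
    bits &= MASK
    if bits in (0, NAR):
        return bits
    negative = bits > NAR
    mag = (-bits) & MASK if negative else bits
    if mag <= HALF:                      # |x| <= 0.5: linear region, plain shift
        mag <<= 1
    elif mag < ONE:                      # 0.5 < |x| < 1: result lands in (1, 2)
        mag += HALF
    else:                                # |x| >= 1: regime grows by one bit
        mag = (mag >> 1) | ONE
    return (-mag) & MASK if negative else mag

def fast_tanh(bits):
    """FastTanh following the pseudo-code: fold to x <= 0, apply Eq. (3), unfold."""
    bits &= MASK
    if bits == NAR:
        return NAR
    positive = 0 < bits < NAR
    x_n = neg(bits) if positive else bits
    y_n = neg(compl1(twice(fast_sigmoid(twice(x_n)))))
    return neg(y_n) if positive else y_n
```

As a usage example, `fast_tanh(0xC0)` (x = -1) returns `0xD0`, which decodes to -0.75, against an exact value of tanh(-1) ≈ -0.7616.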
Since we are also interested in training neural networks, we also need an efficient
implementation of the derivative of the hyperbolic tangent:

$$\frac{d\,\tanh(x)}{dx} = 1 - \tanh^2(x)$$

Let y = tanh²(x). We know that 1-y is always an L1 operation when esbits=0, since
tanh²(x) is always in [0,1]. In order to compute the square of the hyperbolic tangent
efficiently, we can tabulate the square operator for all Posit values. This approach
does not come at a great cost, since the square is a unary operator, and it can be
applied even to Posits with 16 bits.
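The lookup-table idea can be sketched as follows for Posit<8,0>. This is a simplified illustration and not the library's code: the names are hypothetical, and the table covers only the patterns whose value lies in [-1, 1] (the range of tanh), where with esbits=0 the value equals the two's complement integer divided by 64. Rounding uses Python's `round` (ties to even), and tiny non-zero squares are mapped to the smallest positive pattern on the assumption that posits do not round non-zero values to zero.

```python
NBITS = 8
MASK = (1 << NBITS) - 1
ONE = 1 << (NBITS - 2)          # 0x40, bit pattern of +1.0 in Posit<8,0>

def build_square_table():
    """Map each pattern y in [-1, 1] to the pattern of y*y."""
    table = {}
    for s in range(-ONE, ONE + 1):       # s is the signed integer view of the pattern
        sq = round(s * s / ONE)          # y*y = (s/64)**2 = (s*s/64) / 64
        if s != 0 and sq == 0:
            sq = 1                       # assumed posit rule: never round a non-zero to 0
        table[s & MASK] = sq
    return table

SQUARE = build_square_table()

def fast_tanh_derivative(y_bits):
    """1 - y**2 for y = tanh(x): one table lookup plus the L1 complement to one."""
    return ONE - SQUARE[y_bits & MASK]
```

For example, for y = 0.5 (pattern `0x20`) the function returns `0x30`, the pattern of 0.75 = 1 - 0.25.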
4 Experimental Results
We compared the approximated hyperbolic tangent with the original version in terms
of execution time and precision. Figure 4 shows the precision comparison, also
reporting, for Posit8 and Posit16, the mean squared error between the approximated
and the original form (for both types, we used 0 bits of exponent). Figure 5 shows the
execution time comparison over several repetitions. Each repetition consists in
computing about 60,000 hyperbolic tangents with both the approximated formula and
the exact one. As reported, the precision degradation is on the order of $10^{-3}$,
while the gain in speed is around a factor of 6. In Figs. 4 and 5, "fast appr tanh" is
the Posit-based implementation of the Tanh function using L1 operations only, based
on the FastTanh formula in Eq. 3. It corresponds to the column labeled FastTanh in
Table 2.
We then tested the approximated hyperbolic tangent as the activation function of the
LeNet-5 convolutional neural network, replacing the exact hyperbolic tangent used in
the original implementation proposed in [10,11] and comparing the results against the
original activation. The network model was trained on the MNIST [11] and
Fashion-MNIST [12] datasets.
Table 2 compares the performance of the two activation functions (FastTanh and
Tanh) on the two datasets. The results obtained with Sigmoid and ReLu are also
reported, since both are widely adopted in the literature as activation functions for
DNNs. In terms of accuracy, the results in Table 2 show that FastTanh outperforms
both ReLu and FastSigmoid (a well-known approximation of the sigmoid function),
which are widely used in the state of the art to implement activation functions in
DNNs.
Fig. 4. Comparison between exact hyperbolic tangent (True tanh, in blue) and FastTanh (fast
appr. tanh, in black), for Posit<8,0> (top) and Posit<16,0> (bottom). For Posit<8,0> the mean
squared error is $2.816 \cdot 10^{-3}$, while for Posit<16,0> it is $2.947 \cdot 10^{-3}$.
Fig. 5. Comparison of execution time of multiple consecutive executions between exact
hyperbolic tangent (True tanh, in blue) and FastTanh (fast appr tanh, in black).
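Table 2. Accuracy (%) and inference time (ms) comparison between different activation
functions and different Posit configurations (MNIST and Fashion-MNIST datasets)

MNIST

| Configuration | FastTanh (this paper) % | FastTanh (this paper) ms | True Tanh % | True Tanh ms | FastSigmoid [9] % | FastSigmoid [9] ms | ReLu % | ReLu ms |
|---|---|---|---|---|---|---|---|---|
| Posit<16,0> | 98.5 | 3.2 | 98.8 | 5.28 | 97.1 | 3.31 | 89 | 2 |
| Posit<14,0> | 98.5 | 2.9 | 98.8 | 4.64 | 97.1 | 3.09 | 89 | 1.9 |
| Posit<12,0> | 98.5 | 2.9 | 98.8 | 4.66 | 97.1 | 3.04 | 89 | 1.9 |
| Posit<10,0> | 98.6 | 2.9 | 98.7 | 4.62 | 96.9 | 3.08 | 89 | 1.9 |
| Posit<8,0> | 98.6 | 3.01 | 98.4 | 4.84 | 94.2 | 3.01 | 88 | 1.9 |

FASHION-MNIST

| Configuration | FastTanh (this paper) % | FastTanh (this paper) ms | True Tanh % | True Tanh ms | FastSigmoid [9] % | FastSigmoid [9] ms | ReLu % | ReLu ms |
|---|---|---|---|---|---|---|---|---|
| Posit<16,0> | 89.6 | 3.4 | 90.0 | 5.5 | 85.2 | 3.4 | 85 | 2.1 |
| Posit<14,0> | 89.6 | 2.9 | 90.0 | 5.0 | 85.2 | 3.2 | 85 | 1.9 |
| Posit<12,0> | 89.7 | 2.9 | 90.0 | 5.1 | 85.2 | 3.1 | 85 | 1.9 |
| Posit<10,0> | 89.7 | 2.9 | 89.7 | 5.1 | 85.1 | 3.2 | 85 | 1.9 |
| Posit<8,0> | 89.6 | 3.1 | 89.3 | 5.2 | 84.3 | 3.0 | 84 | 1.9 |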
5 Conclusions
In this work we have introduced FastTanh, a fast approximation of the hyperbolic
tangent for numbers represented in the Posit format, which uses only L1 operations.
We have used this approximation to speed up the training phase of deep neural
networks. The proposed approximation has been tested on common deep neural
network benchmarks. In most configurations, its use resulted in a slightly less
accurate neural network than the slower true hyperbolic tangent, but with a shorter
network inference time. In our experiments, FastTanh also outperforms, in terms of
accuracy, both ReLu and FastSigmoid, a well-known approximation of the sigmoid
function, which is a de facto standard activation function in neural networks. We are
now working on deriving fast L1 approximations for other activation functions, such
as the softplus.
Acknowledgements
Work partially supported by H2020 European Project EPI (European Processor
Initiative) and by the Italian Ministry of Education and Research (MIUR) in the
framework of the CrossLab project (Departments of Excellence program), granted to
the Department of Information Engineering of the University of Pisa.
References
1. D. Pedamonti, Comparison of non-linear activation functions for deep neural networks on
2. V. Nair, G. E. Hinton, “Rectified linear units improve restricted Boltzmann machines”, 27th
International Conference on Machine Learning (ICML), 2010, pp. 807-814
3. U. Köster et al., “Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep
Neural Networks”, NIPS 2017, pp. 1740-1750
4. V. Popescu et al., “Flexpoint: predictive numerics for deep learning”, IEEE Symposium on
Computer Arithmetic, 2018
5. NVIDIA, “NVIDIA Turing GPU Architecture: Graphics Reinvented”, White paper n.
WP-09183-001_v01, pp. 1-80, 2018
6. A. Malossi et al., “The transprecision computing paradigm: concept, design, and
applications”, IEEE DATE 2018, pp. 1105-1110
7. M. Cococcioni, F. Rossi, E. Ruffaldi, S. Saponara, “Novel Arithmetics to Accelerate
Machine Learning Classifiers in Autonomous Driving Applications”, submitted to IEEE
ICECS 2019
8. M. Cococcioni, E. Ruffaldi, S. Saponara, “Exploiting Posit arithmetic for Deep Neural
Networks in Autonomous Driving Applications”, IEEE Automotive 2018
9. J. L. Gustafson and I. T. Yonemoto, “Beating floating point at its own game: Posit
arithmetic”, Supercomputing Frontiers and Innovations, vol. 4, no. 2, pp. 71-86, 2017
10. Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to
document recognition”, Proceedings of the IEEE, 1998
11. Y. LeCun, L. Jackel, L. Bottou, A. Brunot, C. Cortes, J. Denker, H. Drucker, I. Guyon, U.
Muller, E. Sackinger, P. Simard, and V. Vapnik, “Comparison of learning algorithms for
handwritten digit recognition”, in International Conference on Artificial Neural Networks,
Paris, F. Fogelman and P. Gallinari, Eds. EC2 and Cie, 1995, pp. 53-60
12. H. Xiao, K. Rasul, R. Vollgraf, “Fashion-MNIST: a novel image dataset for benchmarking
machine learning algorithms”, arXiv:1708.07747, 2017
... The posit TM format ( [8][9][10]) is one of the most promising representations that deviates from the IEEE 754 standard. In machine learning, this kind has been shown to be a great drop-in replacement for 32-bit IEEE 754 floats, using only 16 bits [11][12][13][14][15][16]. Furthermore, it has been successfully used in low-precision inference down to 8-bit posit representation with minimal network inference accuracy degradation. ...
... Furthermore, it has been successfully used in low-precision inference down to 8-bit posit representation with minimal network inference accuracy degradation. Moreover, as explained in [12], this number system can be used to create quick, approximated, and efficient activation functions for neural networks such as the sigmoid function by simply using the already existing CPU integer arithmetic operations. ...
... -FPU back-end; -Fixed back-end, exploiting big-integer support (64 or 128 bits) for operations; -Tabulated back-end, generating lookup tables for most of the operations (suitable for Posit [8,12], * due to table sizes). ...
Chapter
Full-text available
With the pervasiveness of deep neural networks in scenarios that bring real-time requirements, there is the increasing need for optimized arithmetic on high performance architectures. In this paper we adopt two key visions: i) extensive use of vectorization to accelerate computation of deep neural network kernels; ii) adoption of the posit compressed arithmetic in order to reduce the memory transfers between the vector registers and the rest of the memory architecture. Finally, we present our first results on a real hardware implementation of the ARM Scalable Vector Extension.
... BFloat16 (Brain Floating Point) is used in upcoming Intel AI processors (NERVANA), XEON processors, Google Cloud TPU and ARMv8.6-A, as well as in RISC-V extensions [21]. Posit are a new compressed floating-point data format for which University of Pisa has developed a SW library called CppPosit [22], [23]. From the first results of applying the CppPosit library to AI/DNN problems, Posit can lead to the same processing accuracy of float but with a data compression from a factor 2 to 4 [22], [23]. ...
... Posit are a new compressed floating-point data format for which University of Pisa has developed a SW library called CppPosit [22], [23]. From the first results of applying the CppPosit library to AI/DNN problems, Posit can lead to the same processing accuracy of float but with a data compression from a factor 2 to 4 [22], [23]. This means that applying Posit to the application cases (HPC, HPDA and AI/CNN) has the potential to reduce data storage issues and allows for fast data movement. ...
... The novel Posit binary arithmetic format can offer higher precision while using less bits than standard IEEE floating-point numbers. Recent literature shows [22], [23] that 16bit Posit can leverage comparable results like fp32 and Posit with 8bit precision outperform in terms of accuracy fp16 (for CNN 8bit Posit can leverage comparable results like fp32). Calculations can be done even with simple bit manipulations on the Posit format without extraction, further decreasing the complexity of the operations. ...
Conference Paper
Full-text available
To achieve high performance and high energy efficiency on near-future exascale computing systems, three key technology gaps needs to be bridged. These gaps include: energy efficiency and thermal control; extreme computation efficiency via HW acceleration and new arithmetics; methods and tools for seamless integration of reconfigurable accelerators in heterogeneous HPC multi-node platforms. TEXTAROSSA aims at tackling this gap through a co-design approach to heterogeneous HPC solutions, supported by the integration and extension of HW and SW IPs, programming models and tools derived from European research.
... Another promising representation that diverges from the floating-point standard is the posit number system [5][6][7]. This type has been proven to be a perfect drop-in replacement of 32-bit IEEE 754 floats in machine learning, using just 16 bits [8][9][10][11][12][13]. Moreover, it has been productively exploited in low-precision inference down to 8-bit posit representation, with very little degradation of network inference accuracy. ...
... Moreover, it has been productively exploited in low-precision inference down to 8-bit posit representation, with very little degradation of network inference accuracy. Furthermore, as also explained in Sect. 2 and in [9], this number system can be exploited to build fast, approximated and efficient activation functions for neural networks like the sigmoid function by only using the already existent arithmetic logic unit (ALU) within the CPU. On the side of target-specific platform accelerators, the ubiquity of operations such as dot products, matrix multiplications and filter convolutions points out the need for optimized routines able to increase the throughput for these operations. ...
... Moreover, some interesting nonlinear activation functions in DNNs can be approximated with this format. Some of the most important approximated functions that can be implemented are the Sigmoid (see [7]), hyperbolic tangent and the extended linear unit function (see [9]), as also explained in Sect. 2.1.6. ...
Article
Full-text available
With the advent of image processing and computer vision for automotive under real-time constraints, the need for fast and architecture-optimized arithmetic operations is crucial. Alternative and efficient representations for real numbers are starting to be explored, and among them, the recently introduced posit$$^{\mathrm{TM}}$$ number system is highly promising. Furthermore, with the implementation of the architecture-specific mathematical library thoroughly targeting single-instruction multiple-data (SIMD) engines, the acceleration provided to deep neural networks framework is increasing. In this paper, we present the implementation of some core image processing operations exploiting the posit arithmetic and the ARM scalable vector extension SIMD engine. Moreover, we present applications of real-time image processing to the autonomous driving scenario, presenting benchmarks on the tinyDNN deep neural network (DNN) framework.
... The posit format [7][8][9]19] is a fixed length format that can be configured in the number of overall bits (nbits) and the maximum number of exponent bits (esbits). ...
Conference Paper
Full-text available
Real-time processing of images and videos is becoming considerably crucial in modern applications of machine learning (ML) and deep neural networks. Having a faster and compressed floating point arithmetic can significantly increase the performance of such applications optimizing memory occupation and transfer of information. In this field, the novel posit number system is very promising. In this paper we exploit posit numbers to evaluate the performance of several machine learning algorithms in real-time image and video processing applications. Future steps will involve further hardware accelerations for native posit operations.
... Level 3 (L3) operations require the unpacked Posit version to be built thus including full computation of regime and exponent as added cost with respect to L2. Level 4 (L4) operations require conversion to Float format, exploiting either software or hardware back-ends. [5] yes esbits=0 FastTanh(x) [6] yes esbits=0 ...
Conference Paper
Full-text available
Nowadays, real-time applications are exploiting DNNs more and more for computer vision and image recognition tasks. Such kind of applications are posing strict constraints in terms of both fast and efficient information representation and processing. New formats for representing real numbers have been proposed and among them the Posit format appears to be very promising, providing means to implement fast approximated version of widely used activation functions in DNNs. Moreover, information processing performance are continuously improved thanks to advanced vectorized SIMD (single-instruction multiple-data) processor architectures and instructions like ARM SVE (Scalable Vector Extension). This paper explores both approaches (Posit-based implementation of activation functions and vectorized SIMD processor architectures) to obtain faster DNNs. The two proposed techniques are able to speed up both DNN training and inference steps.
... At level 3 we have the unpacked version that is completely built (including sign, exponent, fraction). In addition to level [43] yes esbits=0 FastELU yes esbits=0 Table II shows a summary of the requirements support on two common architectures (both the architectures have been used for the benchmarks executed in the next sections, respectively Intel i7560u and ARM Cortex A72). The two architectures do not differ in terms of hardware requirements for the aforementioned phases. ...
Article
Full-text available
This paper focuses on trends, opportunities and challenges of novel arithmetics for DNN signal processing, with particular reference to assisted and autonomous drivingapplications. Due to strict constrains in terms of latency, dependability and security of autonomous driving, machine perception (i.e. detection or decisions tasks) based on DNN cannot be implemented relying on a remote cloud access. These tasks must be performed in real-time on embedded systems on-board the vehicle, particularly for the inference phase (considering the use of DNNs pre-trained during an off-line step). When developing a DNN computing platform, the choice of the computing arithmetics matters. Moreover, functional safe applications like autonomous driving pose severe constraints on the effect that signal processing accuracy has on final rate of wrong detection/decisions. Hence, after reviewing the different choices and trade-off concerning arithmetics, both in academia and industry, we highlight the issues in implementing DNN accelerators to achieve accurate and low-complex processingof automotive sensor signals (the latter coming from diversesources like cameras, radars, lidars, ultrasonics). The focus ison both on general-purpose operations massively used in DNN like multiply, accumulation, compare, or on specific functionslike for example sigmoid or hyperbolic tangent, used for neuron activation.
Chapter
The pervasiveness of deep neural networks (DNNs) in edge devices enforces new requirements on information representation. Low precision formats from 16 bits down to 1 or 2 bits have been proposed in the last years. In this paper we aim to illustrate a general view of the possible approaches of optimizing neural networks for DNNs at the edge. In particular we focused on these key points: i) limited non-volatile storage ii) limited volatile memory iii) limited computational power. Furthermore we explored the state-of-the-art of alternative representations for real numbers comparing their performance in recognition and detection tasks, in terms of accuracy and inference time. Finally we present our results using posits in several neural networks and datasets, showing the small accuracy degradation between 32-bit floats and 16-bit (or even 8-bit) posits, comparing the results also against the bfloat family.
Article
Growing constraints on memory utilization, power consumption, and I/O throughput have increasingly become limiting factors to the advancement of high performance computing (HPC) and edge computing applications. IEEE-754 floating-point types have been the de facto standard for floating-point number systems for decades, but the drawbacks of this numerical representation leave much to be desired. Alternative representations are gaining traction, both in HPC and machine learning environments. Posits have recently been proposed as a drop-in replacement for the IEEE-754 floating-point representation. We survey the state-of-the-art and state-of-the-practice in the development and use of posits in edge computing and HPC. The current literature supports posits as a promising alternative to traditional floating-point systems, both as a stand-alone replacement and in a mixed-precision environment. Development and standardization of the posit type is ongoing, and much research remains to explore the application of posits in different domains, how to best implement them in hardware, and where they fit with other numerical representations.
Article
Full-text available
With the arrival of the open-source RISC-V processor architecture, there is the chance to rethink Deep Neural Networks (DNNs) and information representation and processing. In this work we will exploit the following ideas: i) reduce the number of bits needed to represent the weights of the DNNs using our recent findings and implementation of the posit number system, ii) exploit RISC-V vectorization as much as possible to speed up the format encoding/decoding, the evaluation of activations functions (using only arithmetic and logic operations, exploiting approximated formulas) and the computation of core DNNs matrix-vector operations. The comparison with the well-established architecture ARM Scalable Vector Extension (SVE) is natural and challenging due to its closedness and mature nature. The results show how it is possible to vectorize posit operations on RISC-V, gaining a substantial speed-up on all the operations involved. Furthermore, the experimental outcomes highlight how the new architecture can catch up, in terms of performance, with the more mature ARM architecture. Towards this end, the present study is important because it anticipates the results that we expect to achieve when we will have an open RISC-V hardware co-processor capable to operate natively with posits.
Conference Paper
Full-text available
This paper discusses the introduction of an integrated Posit Processing Unit (PPU) as an alternative to Floating-point Processing Unit (FPU) for Deep Neural Networks (DNNs) in automotive applications. Autonomous Driving tasks are increasingly depending on DNNs. For example, the detection of obstacles by means of object classification needs to be performed in real-time without involving remote computing. To speed up the inference phase of DNNs the CPUs on-board the vehicle should be equipped with co-processors, such as GPUs, which embed specific optimization for DNN tasks. In this work, we review an alternative arithmetic that could be used within the co-processor. We argue that a new representation for floating point numbers called Posit is particularly advantageous, allowing for a better trade-off between computation accuracy and implementation complexity. We conclude that implementing a PPU within the co-processor is a promising way to speed up the DNN inference phase.
Article
Full-text available
Activation functions play a key role in neural networks, so it is fundamental to understand their advantages and disadvantages in order to achieve better performance. This paper first introduces common types of non-linear activation functions that are alternatives to the well-known sigmoid function and then evaluates their characteristics. Deeper neural networks are also analysed, because they positively influence the final performance compared to shallower networks. They also strictly depend on the weight initialisation, hence the effect of drawing weights from Gaussian and uniform distributions is analysed, paying particular attention to how the number of incoming and outgoing connections to a node influences the whole network.
Article
A new data type called a posit is designed as a direct drop-in replacement for IEEE Standard 754 floating-point numbers (floats). Unlike earlier forms of universal number (unum) arithmetic, posits do not require interval arithmetic or variable size operands; like floats, they round if an answer is inexact. However, they provide compelling advantages over floats, including larger dynamic range, higher accuracy, better closure, bitwise identical results across systems, simpler hardware, and simpler exception handling. Posits never overflow to infinity or underflow to zero, and "Not-a-Number" (NaN) indicates an action instead of a bit pattern. A posit processing unit takes less circuitry than an IEEE float FPU. With lower power use and smaller silicon footprint, the posit operations per second (POPS) supported by a chip can be significantly higher than the FLOPS using similar hardware resources. GPU accelerators and Deep Learning processors, in particular, can do more per watt and per dollar with posits, yet deliver superior answer quality. A comprehensive series of benchmarks compares floats and posits for decimals of accuracy produced for a set precision. Low precision posits provide a better solution than "approximate computing" methods that try to tolerate decreased answer quality. High precision posits provide more correct decimals than floats of the same size; in some cases, a 32-bit posit may safely replace a 64-bit float. In other words, posits beat floats at their own game.
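To make the posit layout described in this abstract more concrete, the following minimal Python sketch decodes an n-bit posit bit pattern (sign, regime, exponent, fraction) into a float. It is an illustration only: the function name `decode_posit` and its parameters `n` and `es` are hypothetical choices made here, and the reserved pattern 100...0 is mapped to a float NaN purely for convenience.

```python
def decode_posit(bits: int, n: int = 8, es: int = 0) -> float:
    """Decode an n-bit posit with es exponent bits (illustrative sketch)."""
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return 0.0
    if bits == 1 << (n - 1):
        return float("nan")  # reserved pattern, mapped to NaN in this sketch

    sign = bits >> (n - 1)
    if sign:
        bits = (-bits) & mask  # negative posits are stored in two's complement

    body = bits & ((1 << (n - 1)) - 1)  # the n-1 bits after the sign

    # Regime: a run of identical bits, terminated by the opposite bit.
    first = (body >> (n - 2)) & 1
    run = 0
    i = n - 2
    while i >= 0 and ((body >> i) & 1) == first:
        run += 1
        i -= 1
    k = run - 1 if first else -run
    i -= 1  # skip the terminating bit (if any)

    remaining = max(i + 1, 0)  # bits left for exponent and fraction
    rest = body & ((1 << remaining) - 1)

    # Exponent: up to es bits; missing low bits are treated as zero.
    exp_bits = min(es, remaining)
    exp = ((rest >> (remaining - exp_bits)) << (es - exp_bits)) if exp_bits else 0

    # Fraction: whatever is left, with an implicit leading 1.
    frac_bits = remaining - exp_bits
    frac = rest & ((1 << frac_bits) - 1)
    fraction = 1.0 + (frac / (1 << frac_bits) if frac_bits else 0.0)

    value = fraction * 2.0 ** (k * (1 << es) + exp)
    return -value if sign else value
```

For example, with `n=8` and `es=0` the pattern `0x40` decodes to 1.0, while `0x7F` and `0x01` give the largest and smallest positive values (64 and 1/64), which shows the tapered dynamic range without any overflow to infinity or underflow to zero.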
Article
We present Fashion-MNIST, a new dataset comprising 28x28 grayscale images of 70,000 fashion products from 10 categories, with 7,000 images per category. The training set has 60,000 images and the test set has 10,000 images. Fashion-MNIST is intended to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms, as it shares the same image size, data format and the structure of training and testing splits. The dataset is freely available at https://github.com/zalandoresearch/fashion-mnist.
Conference Paper
This paper compares the performance of classifier algorithms on a standard database of handwritten digits. We consider not raw accuracy, but rejection, training time, recognition time, and memory requirements. ("Comparison of Learning Algorithms for Handwritten Digit Recognition", 1995.)
Conference Paper
Guaranteed numerical precision of each elementary step in a complex computation has been the mainstay of traditional computing systems for many years. This era, fueled by Moore's law and the constant exponential improvement in computing efficiency, is at its twilight: from tiny nodes of the Internet-of-Things, to large HPC computing centers, sub-picoJoule/operation energy efficiency is essential for practical realizations. To overcome the power wall, a shift from traditional computing paradigms is now mandatory. In this paper we present the driving motivations, roadmap, and expected impact of the European project OPRECOMP. OPRECOMP aims to (i) develop the first complete transprecision computing framework, (ii) apply it to a wide range of hardware platforms, from the sub-milliWatt up to the MegaWatt range, and (iii) demonstrate impact in a wide range of computational domains, spanning IoT, Big Data Analytics, Deep Learning, and HPC simulations. By combining together into a seamless design transprecision advances in devices, circuits, software tools, and algorithms, we expect to achieve major energy efficiency improvements, even when there is no freedom to relax end-to-end application quality of results. Indeed, OPRECOMP aims at demolishing the ultra-conservative “precise” computing abstraction, replacing it with a more flexible and efficient one, namely transprecision computing.
Article
Deep neural networks are commonly developed and trained in 32-bit floating point format. Significant gains in performance and energy efficiency could be realized by training and inference in numerical formats optimized for deep learning. Despite advances in limited precision inference in recent years, training of neural networks in low bit-width remains a challenging problem. Here we present the Flexpoint data format, aiming at a complete replacement of 32-bit floating point format training and inference, designed to support modern deep network topologies without modifications. Flexpoint tensors have a shared exponent that is dynamically adjusted to minimize overflows and maximize available dynamic range. We validate Flexpoint by training AlexNet, a deep residual network and a generative adversarial network, using a simulator implemented with the neon deep learning framework. We demonstrate that 16-bit Flexpoint closely matches 32-bit floating point in training all three models, without any need for tuning of model hyperparameters. Our results suggest Flexpoint as a promising numerical format for future hardware for training and inference.
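The shared-exponent idea mentioned in this abstract can be sketched in a few lines of Python. This is not the Flexpoint algorithm itself: the exponent here is simply chosen from the largest magnitude in the current tensor, whereas the abstract describes an exponent that is adjusted dynamically during training. The function names and the 16-bit mantissa width are assumptions made for illustration.

```python
import math


def to_shared_exponent(values, mantissa_bits=16):
    """Encode floats as integer mantissas plus one shared power-of-two exponent."""
    max_abs = max((abs(v) for v in values), default=0.0)
    if max_abs == 0.0:
        return [0] * len(values), 0

    limit = 2 ** (mantissa_bits - 1) - 1  # largest signed mantissa
    # Smallest exponent such that the largest value still fits in the mantissa.
    exponent = math.ceil(math.log2(max_abs / limit))
    scale = 2.0 ** exponent

    mantissas = [max(-limit, min(limit, round(v / scale))) for v in values]
    return mantissas, exponent


def from_shared_exponent(mantissas, exponent):
    """Decode integer mantissas back to floats using the shared exponent."""
    scale = 2.0 ** exponent
    return [m * scale for m in mantissas]
```

Because every element shares one exponent, the arithmetic on the mantissas is plain integer arithmetic; the price is that small values in a tensor with a large maximum lose relative precision.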
Conference Paper
Restricted Boltzmann machines were developed using binary stochastic hidden units. These can be generalized by replacing each binary unit by an infinite number of copies that all have the same weights but have progressively more negative biases. The learning and inference rules for these “Stepped Sigmoid Units” are unchanged. They can be approximated efficiently by noisy, rectified linear units. Compared with binary units, these units learn features that are better for object recognition on the NORB dataset and face verification on the Labeled Faces in the Wild dataset. Unlike binary units, rectified linear units preserve information about relative intensities as information travels through multiple layers of feature detectors.
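The relation between the stacked sigmoid copies and a rectified linear unit can be checked numerically. The sketch below sums the mean activations of a finite number of sigmoid copies with bias offsets -0.5, -1.5, -2.5, ... and compares the result with the softplus function log(1 + e^x) and with max(0, x). The specific offsets, the finite number of copies, and all function names are assumptions chosen for illustration; the noise term of the noisy rectified linear unit is left out.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic sigmoid."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def stepped_sigmoid_sum(x: float, copies: int = 100) -> float:
    """Sum of sigmoid copies with progressively more negative biases."""
    return sum(sigmoid(x - i + 0.5) for i in range(1, copies + 1))


def softplus(x: float) -> float:
    """Smooth approximation of the rectifier: log(1 + e^x)."""
    return math.log1p(math.exp(x))


def relu(x: float) -> float:
    """Rectified linear unit."""
    return max(0.0, x)


if __name__ == "__main__":
    for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
        print(x, stepped_sigmoid_sum(x), softplus(x), relu(x))
```

For moderate inputs the sum and the softplus stay close to each other, and both approach max(0, x) away from the origin, which is why a single rectified unit can stand in for the whole stack of binary copies.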