Formulation, analysis, and hardware implementation of chaotic
dynamics based algorithm for compression and feature recognition in
digital images
Chance M. Glenn (a, b), Srikanth Mantha (b), Sajin George (b), Deepti Atluri (b),
and Antonio F. Mondragon-Torres (b)
(a) Alabama A&M University, 4900 Meridian Street, Huntsville, AL, USA 35762;
(b) Rochester Institute of Technology, 78 Lomb Memorial Dr., Rochester, NY, USA 14623
ABSTRACT
In this paper we discuss the use of a set of waveforms derived from chaotic dynamical systems for
compression and feature recognition in digital images. We also describe the design and testing of an embedded
systems implementation of the algorithm. We show that a limited set of combined chaotic oscillations is sufficient
to form a basis for the compression of thousands of digital images. We demonstrate this in the analysis of images
extracted from the Solar and Heliospheric Observatory (SOHO), showing that we are able to detect coronal mass ejections
(CMEs) in quadrants of the image data during a severe solar event. We undertake hardware design in order to optimize
the speed of the algorithm, taking advantage of its parallel nature. We compare the calculation speed of the algorithm in
compiled C, enhanced Matlab, Simulink, and in hardware.
Keywords: compression, coronal mass ejections, chaos, transformation, embedded systems, feature recognition,
anomaly detection, hardware.
1. INTRODUCTION
The Digital Age has afforded us a great number of technological benefits. The ability to extract important information
from digital images is one such benefit. In addition, the study and understanding of chaotic processes has opened
a window of knowledge that is now being translated into many applications. This work brings together image
processing and chaotic dynamical systems theory in a way that yields unexpected benefits. First, we have utilized the
waveform diversity inherent in chaotic processes to create a digital signal compression algorithm. Essentially, this
algorithm transforms the digital signal into a unique representation of itself. We call this dynamical systems
transformation a D-transform. In the sections that follow we describe it, show examples of its application, and discuss
optimized implementations in hardware.
2. THE D-TRANSFORM
2.1 Concept
The fundamental basis of the D-transform is the relationship that a finite set of waveforms derived from nonlinear
dynamical systems theory, or chaos theory, has with digital sequences. We have found that audio, image, and video data
have sequential relationships that are identifiable and can be mimicked by mathematically generated waveforms. In this
respect the D-transform has a similarity to the wavelet transform [1].
Chaotic behavior is not purely random behavior; it only appears to be random over a long time because of its
exponential sensitivity to small changes. The present behavior progressively loses its relation to past behavior, and
predictability erodes. However, there is a finite time over which prediction is possible. For mathematical systems, such as
the Lorenz equations [2], a deterministic set of equations creates the behavior, so short-term future behavior can
be known. It may sound paradoxical, but a signal produced by a chaotic system is in fact formally a random process.
This view emerges from the area of mathematics known as ergodic theory, which deals with random processes produced
by deterministic maps on a measure space [3].
Image Processing: Algorithms and Systems XI, edited by
Karen O. Egiazarian, Sos S. Agaian, Atanas P. Gotchev, Proc. of SPIE-IS&T Electronic Imaging, SPIE
Vol. 8655, 86550C · © 2013 SPIE-IS&T · CCC code: 0277-786/13/$18 · doi: 10.1117/12.2001152
SPIE-IS&T/ Vol. 8655 86550C-1
Downloaded From: http://spiedigitallibrary.org/ on 02/23/2013 Terms of Use: http://spiedl.org/terms
Figure 1. The time dependent waveforms for the inductor current, emitter voltage, and collector voltage for the Colpitts oscillator.
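The sensitivity to small changes described in Section 2.1 can be illustrated numerically. The following is a minimal sketch, not part of the original work: it integrates the Lorenz equations with the commonly used parameters (sigma = 10, rho = 28, beta = 8/3, assumed here) from two initial conditions that differ by 1e-9 and records their separation. The step size, starting point, and function names are illustrative choices.

```python
import math

def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations (classic parameter values assumed)."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

def rk4_step(state, dt):
    """Advance the state by one fourth-order Runge-Kutta step."""
    def shifted(base, slope, scale):
        return tuple(b + scale * s for b, s in zip(base, slope))
    k1 = lorenz(state)
    k2 = lorenz(shifted(state, k1, dt / 2))
    k3 = lorenz(shifted(state, k2, dt / 2))
    k4 = lorenz(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def separation_over_time(eps=1e-9, dt=0.01, steps=4000):
    """Distance between two trajectories whose initial x values differ by eps."""
    a = (1.0, 1.0, 1.0)
    b = (1.0 + eps, 1.0, 1.0)
    separations = []
    for _ in range(steps):
        a = rk4_step(a, dt)
        b = rk4_step(b, dt)
        separations.append(math.dist(a, b))
    return separations

if __name__ == "__main__":
    seps = separation_over_time()
    for step in (10, 500, 1000, 2000, 4000):
        print(f"t = {step * 0.01:5.1f}  separation = {seps[step - 1]:.3e}")
```

The separation stays tiny at first, which is the finite window in which prediction is possible, and then grows until the two trajectories are unrelated.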
2.2 Formulation
The D-transform is a dynamics-based algorithm that performs a transformation on a sequence of numbers and returns a
relationship to a finite set of chaotic waveforms that form the basis functions. The basic foundation of the process lies in
the realizations that (a) chaotic oscillators are dynamical systems that can be governed by mathematical expressions, and
(b) chaotic oscillators are capable of producing diverse waveform shapes. The premise is this: a segment of a digital
sequence, such as that derived from the natural lateral fluctuations of a digital image sequence, can be replaced by the
initial conditions of a chaotic oscillation that matches it within an acceptable error tolerance. We have successfully
demonstrated this algorithm for digital audio, digital images, and digital streaming video [4].
Figure 2. Block diagram for the D-transform algorithm.
Instead of being constrained to only one chaotic oscillator, we can consider utilizing several systems. For example, the
Lorenz equations produce waveforms with as much diversity as the Colpitts system [5]. We realized that it was not
necessary to solve sets of equations numerically from stored initial conditions if we pre-stored the solutions in
computer memory. This led us to the further realization that we could create even more diverse waveform shapes by
intelligently combining chaotic waveforms using various mathematical operations. We postulated that the
greater the chaotic oscillation waveform diversity, the higher the probability of matching arbitrary digital sequences to
an acceptable degree of error. Figure 3 depicts the arrangement of the combined chaotic oscillation (CCO) matrix that
we produced. It consists of 32 independent oscillations of 2^16 (65,536) 16-bit sample points each.
Figure 3. Representation of a 32 x 65,536 combined chaotic oscillation matrix.
The brute force method is to select a segment, s[ns], of length Ns, from a test set S. The CCO matrix is swept through
and segments c[nc] of length Nc are extracted for comparison with s[ns]. Ns does not necessarily equal Nc initially;
however, the two segments must be made the same length before they are compared. Therefore Nc = mc·Ns, where mc is an
integer. This allows for increased waveform diversity for the selected waveforms from the CCO matrix. The segment c[nc]
having the smallest Euclidean distance from s[ns] is selected as the replacement segment sr[ns].
The selected segment c can be represented by an address that consists of the waveform type number (0-31), the position
along the waveform ni, and the length of the segment Nc. The scaling information must also be preserved so that the
replacement segment can be fully reproduced. We call the complete set of information needed to reconstruct sr a d-bite,
and it is represented by the symbol d.
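The brute force search described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the text does not say how a segment of length Nc is reduced to Ns samples or how the scaling is computed, so the sketch assumes simple decimation by mc and a least-squares gain and offset. The names `fit_scale` and `best_match` are hypothetical.

```python
import numpy as np

def fit_scale(c, s):
    """Least-squares gain and offset mapping c onto s (an assumed scaling model)."""
    c_mean, s_mean = c.mean(), s.mean()
    var = np.sum((c - c_mean) ** 2)
    gain = np.sum((c - c_mean) * (s - s_mean)) / var if var > 0 else 0.0
    return gain, s_mean - gain * c_mean

def best_match(s, cco, m_values=(1,), stride=1):
    """Sweep the CCO matrix and return the closest segment in the Euclidean sense.

    Returns (distance, waveform_type, n_i, m_c, gain, offset); the last five
    values play the role of the d-bite in this sketch.
    """
    s = np.asarray(s, dtype=float)
    n_s = s.size
    best = None
    for m_c in m_values:
        n_c = m_c * n_s                      # Nc = mc * Ns
        for wtype, row in enumerate(cco):
            for n_i in range(0, row.size - n_c + 1, stride):
                # Assumption: decimate the Nc-sample segment down to Ns samples.
                c = row[n_i:n_i + n_c:m_c].astype(float)
                gain, offset = fit_scale(c, s)
                dist = np.linalg.norm(gain * c + offset - s)
                if best is None or dist < best[0]:
                    best = (dist, wtype, n_i, m_c, gain, offset)
    return best

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cco = rng.integers(0, 65536, size=(4, 256))   # small stand-in for the 32 x 65,536 matrix
    segment = 2.0 * cco[2, 40:48] + 3.0           # a scaled copy of a known segment
    print(best_match(segment, cco))
```

With the full 32 x 65,536 matrix this exhaustive sweep is slow in pure Python, which is consistent with the processing times reported for the unoptimized implementations.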
2.3 Compression Results
For a 24-bit color image, d can be represented by a 40-bit word. Therefore the compression ratio achieved for segments
extracted from the three 8-bit channels is 8Ns/40:1 or Ns/5:1. We selected five 512x512 images from the USC Viterbi
School of Engineering’s Signal and Image Processing Institute image database. Figure 4 shows the original images, the
decompressed images, along with their respective peak signal-to-noise ratios for Ns = 32, or a compression ratio of 6.4:1.
A non-optimized C implementation of the algorithm took approximately 496 seconds to complete.
                Image 1   Image 2   Image 3   Image 4   Image 5
Original        (image)   (image)   (image)   (image)   (image)
Decompressed    (image)   (image)   (image)   (image)   (image)
PSNR (dB)       26.1      30.3      22.4      28.4      28.5
Figure 4. D-Transform compression process implemented on a set of images from the USC-SIPI image database.
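As a quick check of the compression ratio stated in Section 2.3, each segment of Ns 8-bit samples is replaced by one 40-bit d-bite. The symbol CR is introduced here only for illustration.

```latex
\[
\mathrm{CR}(N_s) = \frac{8 N_s}{40} = \frac{N_s}{5}, \qquad
\mathrm{CR}(16) = 3.2, \quad \mathrm{CR}(32) = 6.4, \quad \mathrm{CR}(64) = 12.8
\]
```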
In figure 5 we show the processing results on image #4 for Ns = 16, 32, and 64. We include the processing time and the
PSNR for each Ns value. No smoothing or error correction algorithm is applied to the decompressed image.
Ns                      16        32        64
Decompressed image      (image)   (image)   (image)
Compression ratio       3.2:1     6.4:1     12.8:1
PSNR (dB)               31.2      28.4      25.0
Processing time (sec)   528       496       456
Figure 5. Comparison of D-Transform compression for Ns = 16, 32, and 64.
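The paper reports PSNR values without stating the formula. The sketch below assumes the standard definition for 8-bit data, 10·log10(255² / MSE); the function name `psnr` and the `peak` parameter are illustrative.

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB (standard definition, assumed here)."""
    original = np.asarray(original, dtype=float)
    reconstructed = np.asarray(reconstructed, dtype=float)
    mse = np.mean((original - reconstructed) ** 2)
    if mse == 0:
        return float("inf")      # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.integers(0, 256, size=(64, 64))
    noisy = np.clip(image + rng.normal(0, 10, size=image.shape), 0, 255)
    print(f"PSNR = {psnr(image, noisy):.1f} dB")
```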
2.4 D-Transform Profiles
It is important to note that one CCO matrix has been sufficient to process thousands of different images, audio samples,
and video samples. We engaged in an extensive optimization process to determine the best matrix to use [6] and
concluded that our original matrix was optimal. During the course of analyzing the performance of the algorithm we
began to look at the collection of waveforms that made up each image as it was processed. As a result we produced
D-transform profiles, which are simply histograms of the number of waveforms of each type that are selected to
make up the complete image reconstruction. We found that if we held the processing parameters constant, then there
were distinctive profiles for each image. Figure 6 shows image 1 and its associated D-transform profile.
Figure 6. (a) An original image and (b) its associated D-transform profile for Ns = 32, and mc = 4.
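A D-transform profile, as described above, is a histogram over the 32 waveform types. A minimal sketch, assuming the waveform type number of every selected segment is available as a list of integers from 0 to 31 (the function name and the `normalize` option are hypothetical):

```python
import numpy as np

NUM_WAVEFORM_TYPES = 32   # number of rows in the CCO matrix

def d_transform_profile(waveform_types, normalize=False):
    """Count how many selected segments came from each CCO waveform type."""
    types = np.asarray(waveform_types, dtype=np.int64)
    if types.size and (types.min() < 0 or types.max() >= NUM_WAVEFORM_TYPES):
        raise ValueError("waveform type numbers must lie in 0..31")
    counts = np.bincount(types, minlength=NUM_WAVEFORM_TYPES)
    if normalize and counts.sum() > 0:
        return counts / counts.sum()   # fractions instead of raw counts
    return counts

if __name__ == "__main__":
    selected = [0, 0, 3, 31, 3, 3]     # waveform types chosen for six segments
    print(d_transform_profile(selected))
```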
Figure 7 shows the five test images along with their D-transform profiles. The image sizes were 512x512, and the
processing parameters were Ns = 32 and mc = 4. We included the PSNR values as well. Note that they are slightly better
than the results in Figure 4. This is because we used mc = 1, 2, 3, and 4, which increases the diversity of the waveforms
selected from the CCO matrix. This did, however, increase the processing time by roughly a factor of four (2,091 seconds).
                      Image 1   Image 2   Image 3   Image 4   Image 5
Decompressed image    (image)   (image)   (image)   (image)   (image)
PSNR (dB)             26.1      30.3      22.4      28.4      28.5
D-transform profile   (plot)    (plot)    (plot)    (plot)    (plot)
Figure 7. Comparison of the D-transform profiles of the test images.
Furthermore, we discovered that the images of similar subject matter had similar profiles. This became a method of
feature or anomaly detection. In the next section we will show how we used the D-transform profiles to detect solar
coronal mass ejections in series of images.
3. ANALYSIS OF SOLAR CORONAL MASS EJECTIONS
The Solar and Heliospheric Observatory (SOHO) is a project that is the result of an international collaboration between
NASA and the European Space Agency (ESA). Its mission is to study the Sun from its deep core to the outer corona
and the solar wind [7]. The observatory is a satellite that orbits around the first Lagrangian point (L1) and is thus
locked to the Earth-Sun line. This provides an uninterrupted view of the Sun.
SOHO has twelve different instruments on board that were provided by various international space agencies. The data
that we have processed come from the Large Angle and Spectrometric Coronagraph (LASCO), which was provided by and is
monitored by the Naval Research Laboratory. LASCO provides images from its two telescopes, C2 and C3, in real time
as well as in archives that go back many years.
A coronal mass ejection (CME) is a large magnetic bubble of plasma that erupts from the Sun's corona, sending solar
energetic particles through space at speeds that can exceed 3,200 km/s [8]. These CMEs feed the solar
wind, and if the eruptions occur from a suitable position on the Sun, they can send these particles towards the Earth. The
Aurora Borealis is a result of the interaction of the Earth's magnetic field with particles ejected from the Sun [9]. At high
intensity levels, major disruptions to communication equipment and power systems can occur [10]. These particles can
reach the Earth in a matter of days, or hours, depending upon the speed at which they travel. NASA has long sought
methods of providing automated warnings for severe events derived from data that are already available.
Our analysis has come from LASCO C2. Figure 8 shows an example of a 512x512 image retrieved from the SOHO data
archive. For our analysis, we have separated the image into quadrants. Image (a) is taken from a quiet period, while
image (b) shows a major CME in quadrant 2.
Figure 8. Samples of images taken from the SOHO LASCO C2 image data archive (http://lasco-www.nrl.navy.mil). Image (a)
represents a quiet period while image (b) shows a major CME in quadrant 2.
The data archive provided a large set of images that represented mild, moderate, and severe CMEs, in addition to periods
of non-CME activity. We created a statistically viable set of images from the four quadrants when the solar activity was
relatively quiet, and created an average D-transform profile from them. We called these quiet zone profiles. A test
profile having a large Euclidean distance from the quiet zone profile indicates a high probability that an eruption is
occurring. We called this the quiet zone differential method. Figure 9 shows the average quiet zone profiles for each
quadrant. Some elements of the profile were redundant across all images and degraded the resolution of
the comparison. Therefore, elements 1, 8, and 11 were consistently zeroed out in all profiles.
Figure 9. Quiet zone profiles for quadrants 1 – 4.
Figure 10 shows an image selected from quadrant 3 that has a CME forming. The D-transform profile shows a marked
difference from the quiet zone profile of quadrant 3. The difference is calculated as the Euclidean distance
$$\Delta p = \sqrt{\sum_{i=0}^{31} \left(p_t[i] - \bar{p}_q[i]\right)^2}, \qquad (1)$$
where $p_t$ is the profile of the image under test, and $\bar{p}_q$ is the average quiet zone profile. When $\Delta p$ is high, it
indicates a possible eruption.
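The quiet zone differential method can be sketched in a few lines. This is an illustration under assumptions: the difference is taken to be the Euclidean distance mentioned in the text, elements 1, 8, and 11 are treated as zero-based indices, and the threshold is a hypothetical parameter that the paper does not specify.

```python
import numpy as np

MASKED_ELEMENTS = (1, 8, 11)   # profile elements zeroed out in all profiles (zero-based indexing assumed)

def mask_profile(profile, masked=MASKED_ELEMENTS):
    """Return a float copy of the profile with the redundant elements set to zero."""
    p = np.array(profile, dtype=float)
    p[list(masked)] = 0.0
    return p

def quiet_zone_profile(quiet_profiles):
    """Average the profiles of quiet-period images for one quadrant."""
    return mask_profile(np.mean(np.asarray(quiet_profiles, dtype=float), axis=0))

def quiet_zone_difference(test_profile, quiet_profile):
    """Euclidean distance between a test profile and the quiet zone profile."""
    return float(np.linalg.norm(mask_profile(test_profile) - mask_profile(quiet_profile)))

def is_possible_eruption(test_profile, quiet_profile, threshold):
    """Flag a possible CME when the difference exceeds a chosen (hypothetical) threshold."""
    return quiet_zone_difference(test_profile, quiet_profile) > threshold

if __name__ == "__main__":
    quiet = quiet_zone_profile([np.ones(32), 3 * np.ones(32)])
    test = quiet.copy()
    test[0] += 3.0
    test[2] += 4.0
    print(quiet_zone_difference(test, quiet))   # distance for this synthetic profile
```

In practice one quiet zone profile would be built per quadrant, and the threshold would have to be tuned on archived events.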
Figure 10. (a) Image of a CME in quadrant 2 and (b) its associated D-transform profile.
We created a user interface that can process a sequential series of images extracted from the SOHO archive.
Figures 11 and 12 show examples of the analysis of CMEs that occurred around November 24, 2000 and
April 8, 2012, respectively. This user interface shows peaks in the D-transform profile analysis using the quiet zone
differential method. These results suggest that establishing thresholds for the profile difference can be the basis for an
alert system that can provide hours, or even days, of warning time for severe events.
Figure 11. An example of the user interface for the analysis of solar images from the SOHO archive using D-transform profiles
(November 24, 2000).
Figure 12. An example of the user interface for the analysis of solar images from the SOHO archive using D-transform profiles (April
8, 2012).
A severe drawback of the algorithm is the time it takes to process an image. In the next section we
discuss the implementation of the algorithm in hardware.
4. HARDWARE IMPLEMENTATION OF THE D-TRANSFORM
Two approaches were taken to accelerate the algorithm. First, a Matlab model already existed
that validates the algorithm implementation at the system level. Second, a C language program was written
in an attempt to reduce the processing time. In this section we
describe how both system level implementations were leveraged to target a Field Programmable Gate Array
(FPGA), an Application Specific Integrated Circuit (ASIC), or an Application Specific Standard Product (ASSP).
The implementations described in this section are as follows:
From Matlab to Simulink to Xilinx System Generator to FPGA
From C Language to Vivado HLS to FPGA
Both methodologies rely on what are commonly known in the Electronic Design Automation
(EDA) industry as Electronic System Level (ESL) tools. These have gained popularity in the last ten years for rapid
prototyping. They do not replace a Register Transfer Level (RTL) design methodology for a final product, but the results
obtained with these ESL tools have been approaching the quality of results of hand coded RTL, with the additional ability to
explore microarchitectures, share resources, and use a complete verification suite from behavioral level to RTL and gate
level designs [12].
4.1 Basic Algorithm Structure
The algorithm performs an exhaustive search of the complete CCO matrix for each image segment to find the
closest N sample vector in the Euclidean distance sense. The result is a series of indexes X and Y that are required to
regenerate the image from its compressed form by accessing the Yth N sample vector within the Xth channel. The value Sum is
used during the search to determine which set of indexes gives the minimum.
$$[X_l, Y_l, Sum_l] = \min_{j,\,k} \sum_{i=0}^{N-1} \mathrm{abs}\left(cco[j][i + kN] - img[l][i]\right), \quad j = 0..M-1;\; k = 0..O-1;\; l = 0..P-1 \qquad (2)$$
where cco[M][N×O] is an M × (N·O) array representing the CCO matrix, img[P][N] is a P×N matrix representing one
layer of the image, N is the number of samples per block, M is the number of channels in the CCO matrix (M = 32 in this
work), O is the number of blocks of size N in each row of the CCO matrix (2048 in this work), and P is the number of N
sample blocks in the processed image (P = 2048 in this work).
In summary, the CCO matrix is composed of 32 orthogonal channels of 65536 samples each. The processed image is
256x256, and each layer (RGB) is processed identically and in parallel. For this work the details of a single layer are
presented; they can easily be extended to the complete RGB image.
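The search of equation (2) can be written compactly with array operations. The sketch below is a software reference under assumptions, not the hardware design: it uses the sum of absolute differences that appears in the equation, treats the CCO matrix as non-overlapping blocks of N samples, and uses the hypothetical function name `sad_search`.

```python
import numpy as np

def sad_search(cco, img):
    """For each image block l, find the channel X and block Y minimizing the
    sum of absolute differences, together with the minimum Sum.

    cco: array of shape (M, N*O); img: array of shape (P, N).
    """
    cco = np.asarray(cco, dtype=np.int64)
    img = np.asarray(img, dtype=np.int64)
    m, total = cco.shape
    p, n = img.shape
    o = total // n
    # blocks[j, k, i] corresponds to cco[j][i + k*N] in equation (2).
    blocks = cco[:, :o * n].reshape(m, o, n)
    x = np.empty(p, dtype=np.int64)
    y = np.empty(p, dtype=np.int64)
    s = np.empty(p, dtype=np.int64)
    for l in range(p):
        sad = np.abs(blocks - img[l]).sum(axis=2)          # shape (M, O)
        j, k = np.unravel_index(np.argmin(sad), sad.shape)
        x[l], y[l], s[l] = j, k, sad[j, k]
    return x, y, s

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    cco = rng.integers(0, 256, size=(4, 8 * 16))     # M = 4, N = 8, O = 16 stand-in
    img = np.stack([cco[3, 40:48], cco[1, 0:8]])     # two blocks copied from the matrix
    print(sad_search(cco, img))
```

Decompression then only needs X and Y per block: block l is regenerated from `cco[X[l], Y[l]*N : (Y[l]+1)*N]`.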
4.2 Traditional RTL design flow methodology
As mentioned previously, the traditional methodology calls for selecting the best microarchitecture to implement a particular
algorithm. For this purpose, a Hardware Description Language (HDL) is used; common languages are VHDL, Verilog,
and most recently SystemVerilog.
The architect must analyze the code implementation very thoroughly to arrive at the
implementation that best meets the speed, area, and power consumption tradeoffs. This is a daunting task, and it
is very difficult to make substantial changes once the microarchitecture has been selected. For many years this has been the
approach taken to design FPGA, ASIC, and ASSP devices.
On the other hand, most algorithms are implemented as a subsystem of a System on a Chip (SoC). This means
that there is a central processor, traditionally an ARM processor, and the algorithm is implemented as a hardware
accelerator or a coprocessor. These types of implementation have been very popular because of the flexibility of
hardware/software partitioning, which allows the designer to place in hardware only the functions that are
constrained by speed, power, and area, leaving the control and non-priority tasks to the main processor in software. The
processor acts more as a state machine controller and provides an interface to the user; in some advanced
cases an operating system can be integrated to sequence the tasks, in real time or not depending on the
application.
Since this methodology is very well known and used in practice, we focus the remainder of this section on
methodologies for architecture exploration and rapid prototyping. Another advantage of these methodologies is that the
algorithm system level designer can contribute to determining the final architecture, whereas in the traditional methodology
algorithm design and implementation are sometimes separate tasks performed by distinct types of design engineers.
4.3 Matlab to Simulink to Xilinx System Generator to FPGA
As mentioned at the start of this section, a Matlab behavioral model was the first step for algorithm validation. A Matlab
representation does not contain timing information since it is basically a high level floating point representation of the
algorithm. A hardware implementation needs to be driven by a clock signal, and the digital design is sequential, so the
RTL methodology is employed.
The design was converted from Matlab m-code to Simulink blocks. Simulink adds a time domain representation by
performing continuous or discrete evaluation of the evolution of the algorithm over time. At this stage the model is still
in floating point mode and the results should be identical to those of the Matlab simulation.
The next stage is to convert the different sub-systems and blocks to ones that can be implemented directly in hardware.
There are two approaches for this: one is to use Matlab HDL Coder, which can generate HDL code directly from the
Simulink blocks and some Matlab m-code; the other is to use blocks provided directly by the FPGA manufacturer. Both
Altera and Xilinx provide such blocks.
A hardware implementation can be done in floating point, but these operations are not optimal in any sense (power,
speed, or area), so the design needs to be converted to a fixed point representation. For this particular hardware
implementation, most of the values used are already represented as finite fixed quantities, so the conversion is not a
problem. The simulation is also discrete by the nature of the algorithm.
The decision at this point was to implement the inner two loops as shown in equation (3) below, which compares in
parallel an image vector of N samples with a block of N samples in each of the M channels. This implementation requires that
the internal CCO N×M sub-matrix be loaded at each iteration, so loading this sub-matrix becomes the bottleneck.
This process has to be done O times for each of the P image vectors (indexed by l). In Appendix A, a more detailed view of
the hardware implementation is shown.
$$[X_l, Y_l, Sum_l] = \min_{k = 0..O-1} \left\{ \min_{j = 0..M-1} \sum_{i=0}^{N-1} \mathrm{abs}\left(cco[j][i + kN] - img[l][i]\right) \right\}, \quad l = 0..P-1 \qquad (3)$$
For this particular implementation the RTL code was generated and later added to a SoC design running on a
Zedboard, whose Zynq device contains dual ARM Cortex A9 processors. The Zedboard was chosen because it
is one of the state of the art boards available for university design. As can be seen in Table 1, it contains
the equivalent of 1.3 million ASIC gates, which in theory should be enough for
implementing the design.
[Figure 13 block diagram: ARM processor, AXI bus, memory-mapped registers, CCO N×M co-processor, memory, and peripherals]
Figure 13. Xilinx System Generator FPGA implementation.
Table 1. Zedboard Zynq platform Characteristics
Item                                                  Value
FPGA                                                  Xilinx XC7Z020-1CLG484CES Zynq-7000 AP SoC
Embedded Processor                                    Dual ARM Cortex A9 MP Core (800 MHz)
Programmable Logic Cells (Approximate ASIC Gates)     85K Logic Cells (~1.3M) (Basic speed 100 MHz)
Look Up Tables (LUT)                                  53,200
Flip-Flops                                            106,400
Extensible RAM (#36 Kb Blocks)                        560 KB (140)
4.4 C Language to Vivado HLS to FPGA
As mentioned in the introduction to the hardware implementation, in the present work we also adopted a different ESL
methodology, which converts an untimed sequential C language golden model to an RTL representation,
including simulation and verification, all the way to integration in the SoC platform. At first sight it seems the same
methodology as before; the difference lies in that the Vivado High Level Synthesis (HLS) tool allows architecture
exploration from the same behavioral C code and a less painful integration of the code into the SoC.
With this tool we were able to experiment with all levels of parallelization, pipelining, data streaming, etc. The main
challenge was that when a large degree of parallelization was obtained, the resources on the Zynq XC7Z020 were not
enough to hold the hardware implementation. Additional FPGA platforms were tested as well and resulted in a better
integration, but those were not SoCs, so a soft core had to be instantiated using the FPGA resources. In addition,
the amount of memory needed to hold the CCO matrix is 2 MB and, as can be observed in Table 1, the amount of Block RAM is
limited to about one fourth of the required memory.
The algorithm is highly parallelizable, as can be observed in equation (2): it involves N×M×O×P (2^32) operations.
There were not many obvious intermediate solutions to partition the algorithm. On the other hand, the order of the
operations can be interchanged while obtaining almost the same results. That is, the perceived decompressed
image is almost the same to the human eye for different orderings of the operations.
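The operation count quoted above follows from the parameter values given in Section 4.1, with N = 32 inferred here from O = 65536/N = 2048:

```latex
\[
N \times M \times O \times P
  = 2^{5} \cdot 2^{5} \cdot 2^{11} \cdot 2^{11}
  = 2^{32}
  = 4{,}294{,}967{,}296
\]
```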
After extensive testing, the best combination resulted in the following form:
$$[X_l, Y_l, Sum_l] = \min_{j = 0..M-1} \left\{ \min_{k = 0..O-1} \sum_{i=0}^{N-1} \mathrm{abs}\left(cco[j][i + kN] - img[l][i]\right) \right\}, \quad l = 0..P-1 \qquad (4)$$
As can be observed, the indexes k and j were interchanged. As a result, an image vector of size N is compared to a
complete channel in a single step. The portion of the CCO matrix for one channel, of size N×O, can be preloaded and then O
comparisons can be done in the coprocessor.
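One way to read the reordering of equation (4) is as a channel-at-a-time search with a running minimum per image block. The following software sketch illustrates that loop order only; the function name is hypothetical, and the per-channel reshape stands in for preloading one channel into the coprocessor.

```python
import numpy as np

def channel_major_search(cco, img):
    """Search one CCO channel at a time, keeping a running minimum per image block.

    cco: array of shape (M, N*O); img: array of shape (P, N).
    Returns arrays X (channel), Y (block), Sum (minimum distance) of length P.
    """
    cco = np.asarray(cco, dtype=np.int64)
    img = np.asarray(img, dtype=np.int64)
    m, total = cco.shape
    p, n = img.shape
    o = total // n
    best_sum = np.full(p, np.iinfo(np.int64).max, dtype=np.int64)
    best_x = np.zeros(p, dtype=np.int64)
    best_y = np.zeros(p, dtype=np.int64)
    for j in range(m):                                   # outer loop: channels
        channel = cco[j, :o * n].reshape(o, n)           # "preloaded" N x O channel
        for l in range(p):
            sad = np.abs(channel - img[l]).sum(axis=1)   # O comparisons in one step
            k = int(np.argmin(sad))
            if sad[k] < best_sum[l]:                     # update the running minimum
                best_sum[l], best_x[l], best_y[l] = sad[k], j, k
    return best_x, best_y, best_sum

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    cco = rng.integers(0, 256, size=(4, 8 * 16))
    img = np.stack([cco[2, 24:32], cco[0, 120:128]])
    print(channel_major_search(cco, img))
```

Only one channel needs to be resident at a time in this ordering, which fits the limited Block RAM discussed in Section 4.4.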
The vector comparison against one CCO channel requires 50,204,675 clock
cycles; assuming a 100 MHz clock, the time required is ~0.5 seconds. This has to be multiplied by 32 to obtain an
approximation of the time to complete the compression process, plus an overhead for loading the 64K vector
into memory. Measured manually, with the Zedboard running as a hardware accelerator under the control of a C program
on one of the ARM Cortex A9 cores, the time is around 23 seconds.
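The estimate above can be worked through explicitly. The symbol $t_{ch}$ is introduced here for the per-channel time; attributing the remaining gap to loading and control overhead is an interpretation of the text rather than a measured breakdown.

```latex
\[
t_{ch} = \frac{50{,}204{,}675\ \text{cycles}}{100 \times 10^{6}\ \text{cycles/s}} \approx 0.502\ \text{s},
\qquad
32\, t_{ch} \approx 16.07\ \text{s},
\qquad
23\ \text{s} - 16.07\ \text{s} \approx 6.9\ \text{s}
\]
```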
Table 2. Hardware Implementation Performance Comparisons
Implementation Methodology                               Hardware Speed                             Processing Speed
C Language to Vivado HLS to FPGA                         ARM Core A9 (800 MHz) FPGA (100 MHz)       23 Seconds
Matlab to Simulink to Xilinx System Generator to FPGA    ARM Core A9 (800 MHz) FPGA (100 MHz)
C Language gcc RHEL 6                                    AMD Phenom II X6 1100T (3 GHz)             45 Seconds
4.5 Performance Comparisons
Here we show the results of different implementations of the algorithm on a 256x256 image processed for Ns = 32 and
mc = 1.
Table 3. D-Transform processing time comparison for 256x256 RGB image.
Method                  Processing Time
Matlab                  40 min
FSWC Enhanced Matlab    7 min
Compiled C              3 min
Simulink                34 min
FPGA                    1 min
5. CONCLUSION AND FUTURE WORK
We have shown that an algorithm based upon chaotic dynamical systems theory can be used to compress digital images and to
perform a level of feature analysis on them. We have directly demonstrated that this D-transform can be used to
distinguish coronal mass ejections in images stored in the Solar and Heliospheric Observatory data archive. One of the
challenges to overcome in the algorithm is the large amount of processing time required. We have developed a hardware
design solution that takes advantage of the parallel nature of the algorithm and significantly improves the processing
speed. This is a step toward real-time analysis of image streams and video.
In other work we have shown that cancerous tumors in digital x-rays can be distinguished utilizing D-transform profiles.
We have also used the algorithm to determine anomalies in digital audio sequences and to classify music into distinct
categories. We have yet to compare it with other methods of accomplishing the same tasks to determine whether it is indeed
superior.
More work can be done to discover if the combined chaotic oscillation set is optimal in size and population. Many more
data types can be processed to discover the advantages that the algorithm can provide for anomaly detection, feature
extraction, subject recognition, and classification.
Rapid advances in hardware architectures and memory will only serve to allow better hardware implementation.
6. REFERENCES
[1] Blahut, Richard E, Principles and Practice of Information Theory, Addison-Wesley Publishing Company, New
York, (1987).
[2] Ott, Edward, Chaos in Dynamical Systems, Cambridge Univ. Press, Canada, (1993).
[3] Crutchfield, J. P., Packard, N. H., Symbolic Dynamics of One-Dimensional Maps: Entropies, Finite Precision, and
Noise, International Journal of Theoretical Physics 21, 433-466 (1982).
[4] Glenn, C., Eastman, M., Paliwal, G., “A New Digital Image Compression Algorithm Based on Nonlinear
Dynamical Systems”, IADAT International Conference on Multimedia, Image Processing and Computer Vision,
Conference Proceedings, March (2005).
[5] Moon, Francis, Chaotic Vibrations, Wiley and Sons, New York, (1987).
[6] Sinha, Anurag R., “Optimization of a New Digital Image Compression Algorithm based on Nonlinear Dynamical
Systems”, EE Masters Thesis, Rochester Institute of Technology, May (2007).
[7] “About the SOHO Mission”, http://sohowww.nascom.nasa.gov/about.
[8] “Coronal Mass Ejections: Scientists Unlock the Secrets of Exploding Plasma Clouds on the Sun”, Science Daily,
November 14, (2010), http://www.sciencedaily.com.
[9] Hargreaves J. K., The Solar-Terrestrial Environment: An Introduction to Geospace, Cambridge University Press,
(1992).
[10] Wall, Mike, “Catastrophe Looming? The Risks of Rising Solar Storm Activity”, Space.com, February 17, (2011).
[11] Booth, N., Smith, A. S., Infrared Detectors, Goodwin House Publishers, New York & Boston, 241-248 (1997).
[12] Mondragon-Torres, A. F., “Hardware Implementation of Wireless Communications Algorithms”, in Wireless
Communications, D. E. Ali, Ed., INTECH, Croatia, (2012).
7. APPENDIX A
Figure 14. Section of the top-level Xilinx System Generator design. The three main components are the Gateway,
FIFO, and Minimum/Channel Selection blocks. Signals are converted from floating point to fixed point at the gateway;
each FIFO then accumulates N samples, and a parallel computation is triggered once the FIFOs for all channels are full.
Figure 15. FIFO structure. Each channel FIFO is filled with N samples. It asserts a valid signal once it is
full, and the signal remains asserted until the FIFO is completely empty.
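The valid-signal behavior described in the Figure 15 caption can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the class name `ChannelFifo`, its methods, and the use of a software queue are hypothetical and do not represent the System Generator block itself.

```python
from collections import deque


class ChannelFifo:
    """Behavioral model of one channel FIFO of depth n (hypothetical name)."""

    def __init__(self, n):
        self.n = n              # FIFO depth: N samples per channel
        self.buf = deque()
        self.valid = False      # asserted when full, held until empty

    def push(self, sample):
        if len(self.buf) >= self.n:
            raise OverflowError("FIFO is full")
        self.buf.append(sample)
        if len(self.buf) == self.n:
            self.valid = True   # FIFO just became full

    def pop(self):
        sample = self.buf.popleft()
        if not self.buf:
            self.valid = False  # FIFO just became empty
        return sample


def all_channels_ready(fifos):
    """The parallel computation starts only when every channel is valid."""
    return all(f.valid for f in fifos)
```

In this model the valid flag is sticky: it is set on the push that fills the FIFO and cleared only on the pop that empties it, which matches the caption's description of a signal that stays valid while the samples are drained.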
Figure 16. The MinCh unit takes in N samples and outputs the minimum Euclidean distance together with
the information identifying the channel that produced it. A tree structure is used to select the minimum over a set of 8
channels.
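The tree-structured minimum selection described in the Figure 16 caption can be sketched in software as a pairwise reduction. The function names `slct_min` and `min_ch` mirror the unit names in the captions but are otherwise hypothetical, and the tie-breaking rule (keep the lower channel index) is an assumption, not something stated in the design.

```python
def slct_min(a, b):
    """Compare two (distance, channel) pairs and return the smaller one.

    Ties keep the first argument (an assumed tie-breaking rule).
    """
    return a if a[0] <= b[0] else b


def min_ch(distances):
    """Tree reduction over per-channel distances.

    Returns (minimum_distance, channel_index). For 8 channels the loop
    runs three levels: 8 -> 4 -> 2 -> 1 candidates.
    """
    if not distances:
        raise ValueError("at least one channel is required")
    # Tag every distance with its channel index before reducing.
    level = [(d, ch) for ch, d in enumerate(distances)]
    while len(level) > 1:
        nxt = [slct_min(level[i], level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:          # odd count: carry the last entry up
            nxt.append(level[-1])
        level = nxt
    return level[0]
```

In hardware, every comparison within one level of the tree can run concurrently, so the depth of the reduction grows with the logarithm of the channel count rather than linearly.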
Figure 17. Euclidean distance computation unit. It takes N samples and generates the highlighted term in equation
(33).
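As a rough software analogue of the distance unit, the sketch below computes a squared Euclidean distance between an N-sample data segment and a candidate waveform of the same length. Exactly which term of equation (33) the hardware unit produces is not restated in this caption, so the squared form, the function name `squared_distance`, and its arguments are all assumptions made for illustration.

```python
def squared_distance(segment, waveform):
    """Sum of squared sample differences between two equal-length sequences.

    Assumed form: sum_i (segment[i] - waveform[i])**2, with no square root,
    since the minimum of the squared distance selects the same channel.
    """
    if len(segment) != len(waveform):
        raise ValueError("both sequences must contain N samples")
    return sum((s - w) ** 2 for s, w in zip(segment, waveform))
```

Omitting the square root does not change which candidate is closest, which is why a minimum search can operate on the squared value directly.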
Figure 18. SlctMin unit. This unit forms the base of the tree structure; its purpose is to find the minimum of a
pair of values together with the corresponding channel information. The channel information is hardwired in the present design.
Figure 19. Zynq layout after Place and Route.