All content in this area was uploaded by Khaled Elleithy on May 04, 2014
A Reliable and Fast Real-Time Hardware Engine for Text Steganography
Ammar Odeh, Khaled Elleithy, and Miad Faezipour
Dept. of Computer Science and Engineering
University Of Bridgeport
Bridgeport, CT 06604, USA
aodeh@bridgeport.edu, elleithy@bridgeport.edu, mfaezipo@bridgeport.edu
Abstract
Various strategies have been introduced in the literature to
protect data. Some techniques change the form of the data,
while others hide the data inside another
file. Steganography techniques conceal information
inside digital media such as image, audio, and
text files. Most existing techniques use
software implementations to embed secret data inside
the carrier file. However, software implementations
are not sufficiently fast for real-time applications. In
this paper, we present a new real-time Steganography
technique that hides data inside a text file using a
hardware engine that can achieve a hidden data rate
of up to 9.6 Gbps.
1. Introduction
Steganography is the ancient art of hiding data inside
a carrier file. Steganography techniques are classified
into three categories depending on the carrier file:
image, audio, and text.
Image files are among the most popular media used to
carry sensitive information. The simplest method used
with an image file is to replace the least significant bit
of each pixel. In audio signals, the data can be shifted
right or left before being hidden in the audio file. Inserting
data in a text file, however, is the most challenging
Steganography method, as text files have the least
redundant data compared to other carrier files.
Text Steganography is classified into different
categories depending on how the secret data is
inserted:-
1. Linguistic Text Steganography [1] :-
Linguistic methods automatically conceal
information by replacing some words with their
synonyms without changing the sentence
meaning. Because synonyms are semantically
related, the substitution avoids arousing
suspicion among attackers.
This technique creates 3-tuples of <word,
syntax, semantic> to establish a very large
database collection. As the database collection
increases in size, there is a better success ratio
to hide the secret data without being
discovered.
2. Font Format techniques [2]:-
Font Format techniques depend on changing
the file format. These techniques suffer from
their dependency on language features and
characteristics: some techniques work well
with some languages but not with
others. These methods analyze the length
of the hidden data and then analyze the font
attributes of each character in the text.
3. Random and statistical Generation Methods [3]:-
These techniques modify the carrier file based
on certain statistical properties of the carrier
file to avoid any comparison between the
created Stego_object and a known carrier
file.
4. Other Techniques [4]:-
Other proposed techniques use file properties
such as feature coding, abbreviation, or
control word spelling to hide the secret
message.
Most text Steganography techniques process the
carrier file and the hidden data in software. Software
implementations are slow and cannot support high
real-time data rates. In this paper, we provide a highly
efficient and very fast hardware implementation for
text Steganography. We propose a hardware
implementation of the Remark Algorithm, similar to
the one we developed earlier
in [5]. An ALTERA DE II Field Programmable Gate
Array (FPGA) board is used for the hardware engine
implementation.
2. Related Work
In this section, we review prior steganography
techniques that were implemented in hardware.
In [6], a novel hardware design was proposed for
image steganography using the least significant bit
(LSB) algorithm. The implementation was carried out
using a Cyclone II FPGA of the ALTERA family. The
technique employed a 2/3 LSB design to produce good
image quality and avoid arousing an attacker's suspicion. It also
provided high memory access performance to
speed up the system. In [7], an FPGA
hardware architecture was introduced to hide the secret
information by exploiting the noise regions in an
image. This strategy improved system transparency,
which made the hidden data hard to detect. In [8], an
implementation of audio/video Steganography using
FPGAs was discussed. The proposed algorithm speeds
up the secret data embedding rate in the hardware
implementation for real-time Steganography. Another
hardware architecture was introduced in [9] to simulate
the ability to hide information inside image and video
carrier files. Two schemes were applied to speed up
real-time video applications. The main drawback of
this system is its need for a high-speed memory buffer.
In [10], the proposed algorithm used an image as the
carrier file, applying multilayer embedding in parallel
with a three-stage pipeline on an FPGA. Promising results
showed high throughput while maintaining the image
quality. In [11] and [12], the authors employed perturbed
quantization to hide data inside a JPEG image. The main
feature of perturbed quantization is that it is
undetectable with current steganalysis tools.
As can be seen, all of the prior steganography
algorithms implemented in hardware focus on image,
video, or audio as the carrier file for the secret
message. In contrast, text steganography has not been
considered for implementation in hardware engines
and/or digital signal processors.
3. Proposed Model
In this paper, a novel hardware engine
implementation is presented that uses text as the carrier file.
Hiding is carried out by inserting two invisible
symbols of MS Word, the Right-to-Left and Left-to-Right
Remarks, to hide the
sensitive data inside a text file. The suggested model
depends on space to embed secret information by
adding symbols into the carrier file. The Hiding Data
Algorithm describes the scenario at the hiding stage.
A software implementation of this technique would
consist of two stages. The first stage is the searching
process and the second stage is insertion. Searching
and insertion are slow, and their cost depends on whether
the best or worst case occurs. The sequential search applied
in this algorithm examines N/2 elements on average, which is
O(N) time, and this might increase
the ability of attackers to capture a secret message.
In this section, the software-based algorithm is
described first. In the hardware implementation,
high-speed performance can be achieved while increasing
the robustness and transparency of the Steganography.
A. Software Implementation:-
In this work, the idea is to hide data inside a Word
file without any change in the file format. A
steganalyst will try to analyze the file content and
formatting. If there is any change in the file format, the
steganalyst can detect the hidden data. In our algorithm, we use
the Right-to-Left Remark (U200F) symbol and the
Left-to-Right Remark (U200E) symbol, both of which are
invisible, to hide
bits inside the message. (The Unicode standard names these
characters RIGHT-TO-LEFT MARK, U+200F, and LEFT-TO-RIGHT
MARK, U+200E.) Our method does not change
the format of the file and can also be applied to
different languages regardless of the UNICODE or
ASCII coding. Moreover, it is easy to apply this
method using the Microsoft Office Word application to hide
data.
To avoid the retyping problem that the attacker may
employ, we convert our file to PDF, which prevents
anyone from editing it.
Scenarios to hide the data are as follows:
1. (00) add nothing
2. (01) add Right-to-Left Remark (U200F)
3. (10) add Left-to-Right Remark (U200E)
4. (11) add Left-to-Right Remark (U200E) and
Right-to-Left Remark (U200F)
By applying one of these four cases, we can hide
data without any changes in the file information.
Algorithm I: Hiding Data
Input: Carrier file, hidden bits file
Output: Stego file (carrier file with embedded U200E and
U200F symbols)
Step 1: Choose any DOC file as the carrier file
Step 2: Repeat while not EOF // repeat until the end of the
hidden file
Step 3: Embed hidden data in the selected file
Step 3a. Move to the next letter of the carrier file
(starting from the first letter)
Step 3b. Read the next two hidden bits
If 00 then add neither U200F nor U200E
Else if 01 then add U200F
Else if 10 then add U200E
Else add U200F and U200E.
Step 4: Go to Step 2
Step 5: Save the file as PDF, then send it to the other side.
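The hiding steps can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it works on an in-memory string instead of a DOC file, it omits the PDF conversion, and the names `embed` and `PAIR_TO_MARKS` are hypothetical. It also assumes that one carrier letter is consumed per two hidden bits, that the carrier contains no Remark symbols of its own, and that the 11 case inserts U200E before U200F.

```python
RLM = "\u200f"  # Right-to-Left Remark (U200F)
LRM = "\u200e"  # Left-to-Right Remark (U200E)

# Two hidden bits -> invisible symbols placed after one carrier letter.
# The symbol order for "11" is an assumption of this sketch.
PAIR_TO_MARKS = {"00": "", "01": RLM, "10": LRM, "11": LRM + RLM}


def embed(carrier: str, hidden_bits: str) -> str:
    """Return the carrier text with the hidden bits embedded as Remark symbols."""
    if len(hidden_bits) % 2 != 0 or set(hidden_bits) - {"0", "1"}:
        raise ValueError("hidden_bits must be an even-length string of 0s and 1s")
    pairs = [hidden_bits[i:i + 2] for i in range(0, len(hidden_bits), 2)]
    if len(pairs) > len(carrier):
        raise ValueError("carrier too short: one letter is needed per two bits")

    out = []
    for index, letter in enumerate(carrier):
        out.append(letter)
        if index < len(pairs):
            # Append zero, one, or two invisible symbols after this letter.
            out.append(PAIR_TO_MARKS[pairs[index]])
    return "".join(out)


stego = embed("hello", "0110")
# The visible text is unchanged once the invisible symbols are ignored.
visible = stego.replace(RLM, "").replace(LRM, "")
```

Because the inserted characters have no visible glyph, `stego` displays like the original carrier text in most viewers, while its length in code points grows by one for each 01 or 10 pair and by two for each 11 pair.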
Algorithm II: Data Extraction
Input: Stego file
Output: Secret data, original file
Step 1: Open the PDF message
Step 2: Repeat while not EOF // repeat until the end of the
Stego file
Step 3: Extract hidden data from the Stego file
Step 3a. Separate each letter
Step 3b.
If there is nothing after the letter then 00
Else if only U200F then 01
Else if only U200E then 10
Else 11
Step 4: Go to Step 2
Step 5: Read the hidden data.
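A matching extraction sketch in Python is given below. It is illustrative only: the function name `extract` is hypothetical, the input is a plain string instead of a PDF, and the sketch assumes that the text starts with a visible letter and that every Remark symbol in it was inserted by the hiding step. Note that every letter with no symbol after it decodes to 00, so in this sketch the receiver would need to know the message length to discard trailing zero pairs.

```python
RLM = "\u200f"  # Right-to-Left Remark (U200F)
LRM = "\u200e"  # Left-to-Right Remark (U200E)
MARKS = {RLM, LRM}


def extract(stego: str) -> tuple[str, str]:
    """Return (hidden_bits, original_text) recovered from a stego string."""
    bits = []
    letters = []
    i = 0
    while i < len(stego):
        # Step 3a: take the next visible letter.
        letters.append(stego[i])
        i += 1
        # Collect the invisible symbols that follow this letter.
        found = set()
        while i < len(stego) and stego[i] in MARKS:
            found.add(stego[i])
            i += 1
        # Step 3b: map the symbols found to two bits.
        if not found:
            bits.append("00")
        elif found == {RLM}:
            bits.append("01")
        elif found == {LRM}:
            bits.append("10")
        else:
            bits.append("11")
    return "".join(bits), "".join(letters)


hidden_bits, original = extract("h\u200fe\u200ello")
```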
The main advantages of our algorithm are listed
below. Other advantages are also provided in
Section IV.
a) The file format is not affected by embedding the
Stego data.
b) The algorithm can be applied to any language.
B. Hardware Implementation:-
For implementing this algorithm in hardware, a
state transition diagram must be constructed that
reflects the algorithm procedure. Figure 1 shows the
state diagram of the system. This system consists of
five states, where each state depends on the input value
(e.g. character in the text file) and the hidden data.
State A represents the initial state of the search. The
hidden information serves as the input that drives the
transition from one state to another. As shown in Figure 1, the
system reads the hidden bits and input data bytes at
each state. If the hidden bits are 00, the output is Null
and the next state is B. If the hidden bits are 01, the next
state is C and the symbol inserted in the output file is
U200F; and so on.
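The state transitions can be illustrated with a small software model. Only the initial state A and the transitions for 00 (next state B, output Null) and 01 (next state C, output U200F) are stated in the text. The entries for 10 and 11 (states D and E), and the assumption that the same table applies from every state, are guesses made for this sketch; the names `TRANSITIONS` and `run_fsm` are hypothetical.

```python
# Hidden bit pair -> (next state, symbol written to the output).
# The rows for "10" and "11" are assumptions; the text says only "and so on".
TRANSITIONS = {
    "00": ("B", "Null"),
    "01": ("C", "U200F"),
    "10": ("D", "U200E"),
    "11": ("E", "U200E+U200F"),
}


def run_fsm(hidden_bits: str) -> list[tuple[str, str]]:
    """Trace (state, output) for each pair of hidden bits, starting from state A."""
    if len(hidden_bits) % 2 != 0:
        raise ValueError("hidden_bits must contain an even number of bits")
    state = "A"  # initial state
    trace = []
    for i in range(0, len(hidden_bits), 2):
        pair = hidden_bits[i:i + 2]
        # In this model the next state depends only on the hidden bits read.
        state, output = TRANSITIONS[pair]
        trace.append((state, output))
    return trace


trace = run_fsm("0001")
```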
Figure 2 represents the main components of the
hardware engine in RTL view. The system consists of
four comparison units to check the hidden information
in order to choose a suitable data path based on what
the hidden data bits are. Table I provides a list of the
used components representing the device utilization of
the FPGA for this hardware engine.
In our hardware implementation, we process the
hidden file two bits at a time: in each state, two bits are
hidden and the current state then transitions to another
state based on the
conditions. Figure 3 shows the signal analysis of the data
inserted and the system state transitions using a
waveform. This timing waveform depicts how the
hardware system functions in terms of states and state
transitions. In our simulations, we use R as the
Right-to-Left Remark symbol output, L as the Left-to-Right
Remark symbol output, and B for both Remark
symbols. Null represents "Insert nothing", indicating
that no change is required to construct the output, so the
same character is inserted at the output. The state transitions occur
based on the input data characters and the
hidden data bits.
The critical path time reported by the tool (Quartus
II) is 1.669 ns, and we process 16 bits in each
clock cycle. Hence, the maximum clock frequency is
$$f_{\max} = \frac{1}{1.669 \times 10^{-9}\ \text{s}} \approx 599161174.36\ \text{Hz}.$$
Therefore, the system has an overall throughput of
16 bits × 599161174.36 Hz ≈ 9.6 Gbits/second. In contrast, a
software implementation would require O(n) time to process
any file, and its speed is governed by the file size and
processor speed.
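The throughput figure can be checked with a few lines of Python. The function names below are hypothetical; the only inputs are the critical path time and the number of bits per clock cycle reported above.

```python
def max_clock_hz(critical_path_s: float) -> float:
    """Maximum clock frequency is the reciprocal of the critical path time."""
    return 1.0 / critical_path_s


def throughput_bps(critical_path_s: float, bits_per_cycle: int) -> float:
    """Bits processed per second when bits_per_cycle bits are handled each cycle."""
    return bits_per_cycle * max_clock_hz(critical_path_s)


f_max = max_clock_hz(1.669e-9)        # roughly 599 MHz
rate = throughput_bps(1.669e-9, 16)   # roughly 9.59e9 bits per second

print(f"f_max = {f_max:.2f} Hz")
print(f"throughput = {rate / 1e9:.2f} Gbit/s")
```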
Table I. Device Utilization of the FPGA
Components | Name
Family | Stratix
Met timing requirements | Yes
Total Logic elements | 12/10570 (<1%)
Total pins | 20/336 (6%)
Total virtual pins | 0
Device | EP1S10F484C5
Timing Models | Final
Table II lists our algorithm and five other hardware
implementations reported in the literature.
Table II. Steganography Hardware Engines
Algorithm | Strategy/Carrier file
[6] | 2/3 LSB image Steganography
[7] | Noisy region of image
[8] | Audio, video
[9] | Image, video
[10] | Multilayer and parallel image
Proposed System | Text file / insert invisible symbols
All the other hardware implementations listed process
audio, image, or video files. To the best of our
knowledge, no hardware systems for processing text
have been reported in
the literature. The Remarks
Algorithm hardware implementation is therefore a
unique text Steganography implementation, as it
provides very high-speed processing for real-time
applications while keeping memory
buffer use to a minimum.
In summary, the Remarks algorithm implementation
has the following advantages:
1. Language independent: - The Remarks algorithm
can be applied in any language. This feature
enables users to hide data in different file
encodings (Unicode, ASCII). In contrast, other
algorithms depend on language characteristics,
which limits their flexibility.
2. Improved transparency: - This algorithm
improves transparency since the
Stego file format looks like the original file.
3. File format: - Our method does not depend on
any special format. This allows the use of
carrier text in different formats such as HTML
pages, Microsoft Word documents, or even plain
text.
4. Algorithm optimization: - Our method suggests
optimization steps to reduce the change in file
size.
5. Hiding capacity: - The Remarks algorithm
enables users to hide a large amount of data
between two letters. Any two users can
agree on the most suitable place to insert bits.
In our simulations, we used the space
between two words to hide one word, although the
whole message can also be hidden in one space.
6. High-speed Text Steganography Hardware
Engine: - This stand-alone
hardware engine can be integrated within the
Network Interface Card (NIC) of a router for
actual networking scenarios on internet packets
(e.g. websites) to perform high-speed
steganography, if and when needed.
4. Simulation results
Our simulation results are divided into two parts. The
first part concerns which optimization step is
employed, as shown in Table III. We created secret
messages, converted the messages into ASCII, and
computed the number of ones and zeros in each
message.
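The bit-counting step can be sketched as follows. This is only an illustration of converting a message to 8-bit ASCII and counting its ones and zeros, either for the whole message or word by word; the function names are hypothetical and the sketch does not reproduce the optimization scenarios themselves.

```python
def ascii_bits(message: str) -> str:
    """Return the message as a string of bits, 8 bits per ASCII character."""
    return "".join(format(byte, "08b") for byte in message.encode("ascii"))


def count_bits(message: str) -> tuple[int, int]:
    """Return (number of ones, number of zeros) in the ASCII encoding."""
    bits = ascii_bits(message)
    ones = bits.count("1")
    return ones, len(bits) - ones


def count_bits_per_word(message: str) -> list[tuple[str, int, int]]:
    """Return (word, ones, zeros) for each space-separated word."""
    return [(word, *count_bits(word)) for word in message.split(" ")]


for word, ones, zeros in count_bits_per_word("See You At 10"):
    print(f"{word!r}: ones={ones}, zeros={zeros}")
```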
From our simulation results, we conclude that the
best way to optimize the embedded message with
respect to file size is to separate the message word
by word (where the space binary code is 00100000)
and apply formulation (1) to each word. For example,
if our secret message is “See You At 10” and we
apply Scenario 1 to the whole message, the file size would
increase, which may violate one of the important
steganography concepts: transparency. In contrast, if
we split the message into parts and apply the best
scenario to each part, the message “See
You At 10” could, for instance, be divided into two parts, where
“See You” uses scenario 4 and “At 10” uses
scenario 1. By switching scenarios in this way,
as much storage space as possible is saved, which
improves transparency.
In Table IV, we analyze the ability of a few websites
to hide bits and also compute the capacity ratio for
each (see Equation 1). In our experiments, we assume
that hidden bits are inserted between any two words,
which makes extraction easier: the receiver finds the
spaces in the file and then finds the Remarks.
[Equation 1: capacity ratio, expressed as a percentage (%)]
5. Conclusions and Future Directions
In this paper, we presented a fast, real-time
hardware implementation for secure and safe
communications over networks, namely a
hardware implementation of the Remarks algorithm.
The proposed design is one of the fastest
text Steganography techniques in hardware. Previous
work provided efficient hardware
implementations for other carriers such as image,
video, or audio files. To the best of our knowledge, this is
the first hardware
implementation presented in the literature for text
Steganography. In the future, we plan to
present a parallel processing design to optimize the
embedding speed and power consumption of
the Remarks algorithm as well as other text
Steganography algorithms.
6. References
[1] C. S, G. D, and D. NC, "Linguistic approach
for text steganography through Indian text," in
Computer Technology and Development
(ICCTD), 2nd International Conference on,
2010, pp. 318-322.
[2] X. Lingyun, S. Xingming, L. Gang, and G.
Can, "Research on steganalysis for text
steganography based on font format," in
Information Assurance and Security, 2007.
IAS 2007. Third International Symposium on,
2007, pp. 490-495.
[3] B. Souvik, B. Indradip, and S. Gautam, "A
novel approach of secure text based
steganography model using word mapping
method (WMM)," International Journal of
Computer and Information Engineering, vol.
4, pp. 96-102, 2010.
[4] A. Odeh, A. Alzubi, Q. Hani, and K. Elleithy,
"Steganography by multipoint Arabic letters,"
in Systems, Applications and Technology
Conference (LISAT), 2012 IEEE Long Island,
2012, pp. 1-7.
[5] A. Odeh, K. Elleithy, and M. Faezipour, "Text
Steganography Using Language Remarks," in
The American Society of Engineering
Education, Northfield, VT, USA., 2013.
[6] M. Jamil, A. Saed, A.-H. Thaier, and S.
Alouneh, "FPGA hardware of the LSB
steganography method," in Computer,
Information and Telecommunication Systems
(CITS), 2012 International Conference on,
2012, pp. 1-4.
[7] G.-H. Edgar, F.-U. Claudia, and C. Rene,
"FPGA hardware architecture of the
steganographic context technique," in
Electronics, Communications and Computers,
2008. CONIELECOMP 2008, 18th
International Conference on, 2008, pp. 123-
128.
[8] F. Hala and S. Magdy, "Design and
implementation of a secret key steganographic
micro-architecture employing FPGA," in
Design, Automation and Test in Europe
Conference and Exhibition, 2004.
Proceedings, 2004, pp. 212-217.
[9] L. HY, C. LM, C. LL, and C. Chi-Kwong,
"Hardware realization of steganographic
techniques," in Intelligent Information Hiding
and Multimedia Signal Processing, 2007.
IIHMSP 2007. Third International
Conference on, 2007, pp. 279-282.
[10] A. Fadhil and N. Abdul, "An FPGA
Implementation of Secured Steganography
Communication System," Tikrit Journal of
Engineering Science, vol. 19, pp. 14-23, 2012.
[11] G. Gökhan and D. A. Emir, "Steganalytic
features for JPEG compression-based
perturbed quantization," Signal Processing
Letters, IEEE, vol. 14, pp. 205-208, 2007.
[12] F. Jessica, G. Miroslav, and S. David,
"Perturbed quantization steganography,"
Multimedia Systems, vol. 11, pp. 98-107,
2005.
Figure 1. Finite State Machine Diagram
Figure 2. RTL view of the hardware engine
Figure 3. Timing simulation of the hiding data algorithm and state transitions where the input file is a space.
Table III. The capacity of articles in web pages for hiding data
# | Website Article | Number of words that can be embedded | Text Size (Kilo Byte) | Capacity Ratio
1 | www.nydailynews.com | 826 | 8.8 | 674
2 | www.aljazeera.com | 1658 | 18.7 | 637
3 | www.englisharticles.info | 1351 | 15.9 | 610
4 | www.latimes.com | 1208 | 14.8 | 586