
Steganography by multipoint Arabic letters

Ammar Odeh, Aladdin Alzubi, Qassim Bani Hani, Khaled Elleithy
Department of Computer Science & Engineering,
University of Bridgeport
Bridgeport, CT 06604, USA
aodeh@bridgeport.edu
Abstract-Security methodologies are taken into consideration for
many applications, where sensitive data transferred over a
network must be protected from any intermediate attacker.
Privacy of data can be achieved using encryption, by changing
transmitted data into cipher form. Apart from encryption, hiding
data represents another technique to transfer data without being
noticed by an attacker. This technique is called
Steganography. In this paper, we discuss the main concepts
of Steganography and the carrier media used for this goal.
Employing text as a mask for other text represents the most
difficult method that can be used to hide data. We discuss
some algorithms that use Arabic text. We then describe our dotted
space methodology to enhance data hiding.
Keywords: Steganography, carrier file, text steganography, image
steganography, audio steganography, information hiding,
Persian/Arabic Text, steganalysis, stego medium, stego_key.
I. INTRODUCTION
A. Background
Steganography is a word of Greek origin meaning covered writing:
"Stegano" means hidden and "Graptos" means writing. In
steganography, the secret data is embedded into another
object so that an attacker in the middle cannot detect it [1].
Invisible ink is an example of Steganography: a readable message
is transferred between source and destination, and anyone in the
middle can read that message without having any clue about the
hidden data. On the other hand, authorized persons can read the
hidden data depending on the features of the substance used [2][3].
Ancient Greeks used to shave the messenger's head, write the
message on the scalp, and then wait until the hair grew back.
Only then was the messenger sent to the destination [1]. With
this method, there are two possibilities:
1. The message arrives, so the receiver can read the message
and recognize whether it has been changed or not.
2. The message does not arrive, which means the attacker has
detected the message.
B. Motivation
Steganography algorithms depend on three techniques to
embed the hidden data in the carrier file.
1. Substitution: Replace a small part of the carrier file with
the hidden message so that an attacker in the middle cannot
observe the changes to the carrier file. When choosing what to
replace, it is very important to avoid any suspicion, which
means selecting insignificant parts of the carrier file and
replacing them. For instance, if the carrier file is an RGB
image, then the least significant bit (LSB) can be used as the
replaced bit [4].
2. Injection: Adding hidden data to the carrier file increases
the file size, and this increases suspicion. So the main goal is
to find techniques that add hidden data while avoiding attacker
suspicion [4].
3. Propagation: There is no need for a cover object. It
depends on using a generation engine fed by the input (hidden
data) to produce a file that mimics a normal one (graphic, music,
or text document).
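The substitution technique can be illustrated with a minimal Python sketch of LSB replacement. It is only an illustration under stated assumptions: the carrier is treated as a flat sequence of bytes (for example colour-channel values), and the function names embed_lsb and extract_lsb and the sample values are hypothetical, not taken from [4].

```python
def embed_lsb(carrier: bytes, bits: str) -> bytes:
    """Return a copy of carrier whose first len(bits) bytes carry one bit each in their LSB."""
    if len(bits) > len(carrier):
        raise ValueError("carrier is too small for the message")
    stego = bytearray(carrier)
    for i, bit in enumerate(bits):
        # Clear the least significant bit, then set it to the message bit.
        stego[i] = (stego[i] & 0xFE) | int(bit)
    return bytes(stego)


def extract_lsb(stego: bytes, n_bits: int) -> str:
    """Read the LSB of the first n_bits bytes."""
    return "".join(str(byte & 1) for byte in stego[:n_bits])


pixels = bytes([200, 13, 77, 54, 255, 0])  # hypothetical colour-channel values
stego = embed_lsb(pixels, "1011")
print(list(stego), extract_lsb(stego, 4))
```

Each carrier byte changes by at most 1 and the carrier length stays the same, which is why substitution attracts less attention than injection.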
The Steganography process consists of three main
components, as shown in Figure 1.
Figure 1. General components of Steganography
Different types of cover media, including image, sound,
video, and text, can be used in Steganography, as shown in
Figure 2. The choice of carrier file is very sensitive, since it
plays a key role in protecting the embedded message. Successful
Steganography depends on avoiding suspicion. A steganalyst
starts by checking the file, and any suspicion compromises the
main goal of Steganography [3][4].
Figure 2. Stego Media
Text Steganography represents the most difficult type,
because there is generally a lack of data redundancy in a text
file in comparison with other carrier files [5]. The existence of
such redundancy helps increase the capacity for hidden data.
Furthermore, text Steganography depends on the
language, as each language has its own unique characteristics
that differ from those of other languages. For
example, the shape of a letter in English does not depend
on its position in the word, while Persian/Arabic letters have
different forms depending on their position [6].
In our proposed algorithm, we hide text inside text by
employing the Arabic language and applying a random algorithm
to distribute the hidden bits inside the message. The main
reasons for choosing the Arabic language are:
1. The proposed algorithm depends on letters with multiple
points (dots). Therefore, the algorithm must employ a
language that has as many such letters as possible. For
example, the Arabic language has 5 multipoint letters and
the Persian/Farsi language has 8 [7], while English does not
have any.
2. Electronic textual information is widely available.
3. There is little research on other languages compared to
English.
4. The approach can be extended to other languages such as Urdu
and Kurdish.
C. Main Contribution and Paper Organization
An efficient algorithm is presented in this paper. The main
idea is to use multipoint characters in Arabic, which
enables us to hide two bits per letter. The rationale
behind this approach is that most of the previous algorithms
reported in the literature hide one bit per letter. Furthermore,
we merge our algorithm with the vertical point shifting
algorithm to increase the size of the hidden file. The size of
the carrier remains constant. After we add the
data, we convert the file into an image to avoid the retyping
problem.
The rest of this paper is organized as follows. In section II,
we discuss some text Steganography techniques as well as
their advantages and disadvantages. Our multipoint hidden
algorithm is discussed in section III. In section IV, we present
experimental results of our algorithm. Finally, conclusions are
presented in section V.
II. PRIOR WORK
Text Steganography is divided into two categories. The
first one is the semantic method, and the second is the
formatting method, as shown in Figure 2. In this Section we
will briefly explain some Steganography examples. In Table I,
we present a simple comparison between semantic and
formatting methods.
Table I. Comparison between text Steganography methods
                        Semantic Method             Format Method
Amount of hidden data   Small amount                More than semantic
Flaws                   Sentence meaning may change Noticeable from OCR or retyping
Steganography criteria will depend on the amount of data
that can be hidden and the main problem facing the method.
We describe ten algorithms that hide data inside text
documents. The last two algorithms deal with Arabic and
Persian languages.
1. Word Synonym
The word synonym technique, also called the semantic method,
depends on replacing some words by their synonyms (see
Table II). This technique conveys data without raising any
suspicion. It is limited by the fact that the hidden data
will be small relative to other methods. Moreover, it may
change the sentence meaning [7][10][12].
2. Punctuation
This method uses punctuation marks such as (.) and (;) to
represent hidden text. For example, "NY, CT, and NJ" is similar
to "NY, CT and NJ", where the comma before the "and" represents
1 and its absence represents 0. The amount of hidden data in
this method is very small compared to the amount of cover media.
Inconsistent use of punctuation will be noticeable from
a steganalysis point of view [9].
Table II. Using Word Synonym
Word           Synonym
Big            Large
Find           Observe
Familiar       Popular
Dissertation   Thesis
Chilly         Cool
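As a rough illustration of the word synonym method, the following Python sketch hides bits by choosing between a word and its synonym, using three pairs from Table II. The bit convention (the left word encodes 0, its synonym encodes 1), the function names, and the sample sentence are assumptions made for this example only.

```python
# Word pairs taken from Table II; the left word encodes 0, its synonym 1 (assumed convention).
PAIRS = {"big": "large", "dissertation": "thesis", "chilly": "cool"}
BASE_OF = {synonym: word for word, synonym in PAIRS.items()}


def embed(words, bits):
    """Replace each substitutable word according to the next hidden bit."""
    pending = list(bits)
    stego = []
    for word in words:
        base = BASE_OF.get(word, word)
        if base in PAIRS and pending:
            stego.append(PAIRS[base] if pending.pop(0) == "1" else base)
        else:
            stego.append(word)
    return stego


def extract(words, n_bits):
    """Read one bit from every word that belongs to a synonym pair."""
    bits = ""
    for word in words:
        if word in PAIRS:
            bits += "0"
        elif word in BASE_OF:
            bits += "1"
    return bits[:n_bits]


cover = "the big dissertation was written in a chilly room".split()
stego = embed(cover, "101")
print(" ".join(stego), extract(stego, 3))
```

A nine-word sentence carries only three bits here, which matches the small capacity noted for semantic methods in Table I.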
3. Line Shifting
Line shifting means vertically shifting a line by a small amount
to hide information, creating a unique shape of the text.
Unfortunately, line shifting can be detected by a character
recognition program. Moreover, retyping removes all hidden
data [7][10].
In Figure 3, we present an example of line shifting
where the vertical shift is very small (1/300 inch). This is
not noticeable by the human eye.
Figure 3. Line shifting; second line is shifted up 1/300 inch [10].
4. Word Shifting
In this method, changing the spaces between words enables us
to hide information. Word shifting is noticeable by OCR
through detecting the sequence of spaces between words [7][10].
5. SMS Abbreviations
Most SMS messages now use abbreviations for
simplicity and security, and abbreviations are used in different
applications such as internet chatting, email, and mobile
messaging. The main advantages of this method are faster typing,
shorter messages, and working around the keypad's character
limitations [13].
Other algorithms use numbers to convey specific
information. As mentioned above, SMS abbreviations can be
used in specific applications, while using them in others raises
the suspicion of any entity that monitors the ongoing transmission.
Table III. Some SMS Abbreviations
Abbreviation Meaning
ADR Address
ABT About
URW You are welcome
ILY I love you
EOL End of lecture
AYS Are you serious?
6. Text Abbreviations
Text abbreviation is similar to SMS abbreviation: a
dictionary is created that maps each abbreviation to its
meaning, and the dictionary is shared between the
communicating parties. Abbreviation represents one method to
hide data. For example, if you send (see), it means (do you
understand) [13].
7. HTML Spam Text
This method depends on HTML pages, whose tags
and attribute names are case-insensitive. For example, <BR> is
equal to <Br>, <br>, and <bR>. The hidden data is embedded by
using upper case or lower case letters to represent 0 or 1.
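The tag-case idea can be sketched in Python as follows. This is a simplified illustration, not the published method: it only recases the letters of tag names, it assumes that upper case encodes 1 and lower case encodes 0, and the regular expression and function names are hypothetical.

```python
import re

# Matches "<" or "</" followed by the alphabetic part of a tag name.
TAG = re.compile(r"(</?)([A-Za-z]+)")


def embed(html: str, bits: str) -> str:
    """Recase tag-name letters: upper case for 1, lower case for 0 (assumed convention)."""
    pending = list(bits)

    def recase(match):
        letters = []
        for ch in match.group(2):
            if pending:
                ch = ch.upper() if pending.pop(0) == "1" else ch.lower()
            letters.append(ch)
        return match.group(1) + "".join(letters)

    return TAG.sub(recase, html)


def extract(html: str, n_bits: int) -> str:
    """Read one bit per tag-name letter from its case."""
    bits = "".join("1" if ch.isupper() else "0"
                   for match in TAG.finditer(html) for ch in match.group(2))
    return bits[:n_bits]


page = "<p>Hello<br>world</p>"
stego = embed(page, "1011")
print(stego, extract(stego, 4))
```

A browser renders the cover page and the stego page identically, because tag names are compared without regard to case.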
8. TeX Ligatures
In TeX ligatures, some special groups of letters can be
joined together to create a single glyph, as shown in Figure 4.
The algorithm finds the available ligatures in the text and hides
a single bit in each one. For example, to hide 1 we
write fi as f{}i, which creates some space between f and i.
Otherwise, we encode 0 [5].
The same algorithm can be applied to the Arabic "La" ligature
("لا"). This algorithm has two problems. The first problem is
that the file size increases when we apply extension in our text.
The second problem is that if OCR notices the font change,
it can detect the hidden message [6][5].
9. Arabic Diacritics
The Arabic language uses different marks, mainly to distinguish
between words that have the same letters. This method depends on
Arabic diacritics (Harakat), which are optional: most Arabic
novels can be read without diacritics, relying on the language's
grammar. The most frequent diacritic is Fatha " َ ", which is
used to encode 1; otherwise 0 is encoded. This algorithm enhances
the reuse of cover media. Furthermore, the carrier file size
might be reduced depending on the hidden message. On the
other hand, when OCR detects the same message with
different diacritics, it might conclude that there is hidden
data. In addition, retyping will remove the embedded message
[8].
Figure 4. Join between characters [5]
Table IV. Some letters with marks and their pronunciation
Pronunciation   Letter with Haraka   Haraka
Do              دُ                   Damma
De              دِ                   Kasra
Da              دَ                   Fatha
10. Vertical Displacement of the Points
This algorithm achieves excellent performance as it is
applied to pointed (dotted) letters. English has only two dotted
letters, {i, j}, which limits the application of this algorithm.
In contrast, languages such as Arabic and Persian have many
pointed letters, which makes them a better fit for this technique.
Arabic has 28 letters, of which 15 are pointed,
and Persian has 32 letters, of which 18 are pointed. In this
algorithm, a point is shifted up to encode 1; otherwise
0 is encoded. This method can encode a huge number of bits and
needs a strong OCR to recognize the changes. Meanwhile,
retyping will remove the entire message [7].
Figure 5. Vertical shifting point [7]
III. PROPOSED ALGORITHM
Pointed letters represent one of the important
characteristics of the Arabic and Persian languages. Table V
classifies Arabic letters with respect to the number of points.
Table V. Arabic letters with respect to the number of points
Letter Number of points
ا,ح,د,ر,س,ص,ط,ع,ك,ل,م,ه,و 0
ب,ج,خ,ذ,ز,ض,ظ,غ,ف,ن 1
ت,ق,ي 2
ث,ش 3
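The classification in Table V can be expressed as a small lookup table. The Python sketch below is illustrative only; the zero-point row is written out with all 13 undotted letters of the 28-letter alphabet, and the names POINTS and multipoint_positions are hypothetical.

```python
# Number of points per Arabic letter, following the rows of Table V.
POINTS = {}
for letters, count in (
    ("احدرسصطعكلمهو", 0),
    ("بجخذزضظغفن", 1),
    ("تقي", 2),
    ("ثش", 3),
):
    for letter in letters:
        POINTS[letter] = count


def multipoint_positions(text: str) -> list:
    """Indices of letters that have two or more points; other characters are skipped."""
    return [i for i, ch in enumerate(text) if POINTS.get(ch, 0) >= 2]


word = "كتاب"  # only its second letter (ت) is multipoint
print(multipoint_positions(word))
```

Only the positions returned by multipoint_positions can carry data in the proposed algorithm; every other character is left untouched.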
Our algorithm hides data in multipoint Arabic/Persian
letters like (ث, tha). In the Arabic language, there are five
multipoint letters, and in Persian there are eight. Each character
can be used to hide 2 bits, which determine the shifting of and
the distance between the letter's points. Table VI shows the
relation between shifting, distance, and encoding.
Table VI. Relation of shifting and distance to letter format
Point shifting   Distance   Code   Letter effect
0                0          00     No change
0                1          01     Only the distance between points increases
1                0          10     Only a slight upward shift
1                1          11     Upward shift and increased distance between points
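Table VI amounts to a two-bit code in which the first bit selects point shifting and the second bit selects the distance change. A minimal Python sketch of this mapping follows; the type name Effect and the function names are hypothetical and serve only to illustrate the table.

```python
from typing import NamedTuple


class Effect(NamedTuple):
    shift_up: bool   # first bit of the code: points are shifted upward
    widen: bool      # second bit of the code: distance between points is increased


def encode_pair(bits: str) -> Effect:
    """Map a 2-bit code from Table VI to the glyph effect it selects."""
    if len(bits) != 2 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly two binary digits")
    return Effect(shift_up=bits[0] == "1", widen=bits[1] == "1")


def decode_effect(effect: Effect) -> str:
    """Recover the 2-bit code from an observed glyph effect."""
    return f"{int(effect.shift_up)}{int(effect.widen)}"


for code in ("00", "01", "10", "11"):
    print(code, encode_pair(code))
```

Because the two effects are independent, the receiver can recover each bit separately by checking the vertical position and the spacing of the points.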
A. Pseudo Code and Flow Chart
In this subsection we present the pseudo-code and
flowchart of the proposed Arabic multipoint steganography
algorithm. The flowchart of the algorithm is shown in Figure
7. The pseudo-code follows:
1. Enter the cover text, the hidden file, and the hidden file's size
2. Search for multipoint letters
3. Hide the size of the embedded data at the beginning
4. For I = start to EOF
       If hidden data = "00" then call Nochange();
       Else if hidden data = "01" then call distance();
       Else if hidden data = "10" then call shifting();
       Else if hidden data = "11" then call distance_shifting();
       Else call any one of them at random // for padding purposes
   End for
5. Convert the file to an image file and send it to the other side
6. End
As can be seen in the above pseudo-code, in step 5 the file
must be converted to an image file for transfer. The receiver
scans the image file, finds the multipoint letters, and then
classifies the function applied to each of them.
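The pseudo-code above can be sketched in Python at a symbolic level. This is a simplified illustration under several assumptions that are not in the paper: glyph changes are represented as a mapping from letter position to 2-bit code rather than as rendered points, the size field is a fixed 8 bits (HEADER_BITS), only the five Arabic multipoint letters are used, and image conversion (step 5) is omitted.

```python
import random

MULTIPOINT = set("تقيثش")   # the five Arabic multipoint letters
HEADER_BITS = 8              # assumed fixed-width field holding the payload length
CODES = ["00", "01", "10", "11"]


def embed(cover: str, payload_bits: str, rng=random) -> dict:
    """Assign a 2-bit code to every multipoint letter position (steps 2 to 4)."""
    if len(payload_bits) >= 2 ** HEADER_BITS:
        raise ValueError("payload is too long for the size field")
    slots = [i for i, ch in enumerate(cover) if ch in MULTIPOINT]
    bits = format(len(payload_bits), f"0{HEADER_BITS}b") + payload_bits
    if len(bits) % 2:
        bits += "0"                      # complete the last pair
    if len(bits) > 2 * len(slots):
        raise ValueError("cover text has too few multipoint letters")
    marks = {}
    for n, position in enumerate(slots):
        pair = bits[2 * n: 2 * n + 2]
        # Letters beyond the payload receive a random code as padding.
        marks[position] = pair if pair else rng.choice(CODES)
    return marks


def extract(cover: str, marks: dict) -> str:
    """Read the size field, then return exactly that many payload bits."""
    slots = [i for i, ch in enumerate(cover) if ch in MULTIPOINT]
    bits = "".join(marks[i] for i in slots)
    size = int(bits[:HEADER_BITS], 2)
    return bits[HEADER_BITS:HEADER_BITS + size]


cover = "تقي ثش تقي ثش"   # 10 multipoint letters, so 20 bits of capacity
marks = embed(cover, "1011")
print(extract(cover, marks))
```

Storing the size first lets the receiver ignore the random padding, so the same formatting is applied to every multipoint letter regardless of the message length.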
B. Advantages and Disadvantages
Our multipoint algorithm has many advantages, as one
character can hide 2 bits, compared to other algorithms that
hide 1 bit per letter. This implies that the amount of hidden
data per letter is doubled. Furthermore, the number of changed
characters for a given message decreases, which
leads to less suspicion. Moreover, the file size will change
less, since the number of changes in character format is
smaller.
On the other hand, any retyping process removes all the
hidden data, as the hidden data depends on the file format. The
consistent format used in the system might raise the level of
suspicion of an attacker.
IV. EXPERIMENTAL RESULTS
A. Multipoint algorithm
Our algorithm depends on multipoint letters to carry
hidden data. For this reason, we tested different websites that
contain text and pictures.
We process two files in parallel, the carrier file and the hidden
data file. In this process we use two bits for each letter by
applying the distance_shifting algorithm.
Figure 6. Parallel operation carrier file and hidden data
As shown in Figure 6, after we merge the hidden data, we
convert the result to an image file to prevent the issues caused
by retyping. After that, we can compress the file to produce a
compressed image. The receiver decompresses the image and
then extracts the hidden data.
B. Experimental Results of the Multipoint Algorithm
Table VII. Capacity of webpages for different Arabic websites
#   Page Name               Page Size   Characters with 2 or more points   Capacity Ratio (Bit/Kilobyte)
1   aljazeera.net           23.8 KB     1245                               105
2   daralhayat.com          15.4 KB     968                                126
3   salahws.com             10.3 KB     535                                104
4   holyquran.net/tadabur   13.8 KB     516                                75
5   khayma.com              21.8 KB     499                                46
Figure 7. Flowchart of multipoint algorithm (HD: Hidden Data; EOF: End of File)
As shown in Table VII, we used different Arabic news
websites to observe the data ratio that can be hidden in each
website.
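The capacity ratios in Table VII appear to be consistent with two hidden bits per multipoint character divided by the page size in kilobytes. That formula is inferred from the table values rather than stated explicitly, so the following Python sketch should be read as a consistency check; the function name capacity_ratio is hypothetical.

```python
# (page name, page size in KB, characters with two or more points) from Table VII
PAGES = [
    ("aljazeera.net", 23.8, 1245),
    ("daralhayat.com", 15.4, 968),
    ("salahws.com", 10.3, 535),
    ("holyquran.net/tadabur", 13.8, 516),
    ("khayma.com", 21.8, 499),
]


def capacity_ratio(size_kb: float, multipoint_chars: int, bits_per_char: int = 2) -> float:
    """Hidden bits per kilobyte of cover page (inferred formula)."""
    return bits_per_char * multipoint_chars / size_kb


for name, size_kb, count in PAGES:
    print(f"{name}: {capacity_ratio(size_kb, count):.1f} bit/KB")
```

Rounded to the nearest integer, the computed values match the last column of Table VII for all five pages.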
C. Analysis of the Algorithm
Our algorithm is compared to the vertical point shifting
algorithm in terms of the number of changed letters and the
number of bits that can be hidden. Table VIII shows the result
of testing one paragraph. The total number of letters is
115, of which 50 (43%) are pointed. The number of multipoint
letters is 29 (25%). The data that can be hidden by the multipoint
algorithm is more than by vertical point shifting. We calculate
the efficiency using Equation 1:
E = (number of hidden bits / number of characters) × 100% (1)
where E is the efficiency.
So, the efficiency of the multipoint algorithm is 50% (58/115),
while that of vertical point shifting is 43% (50/115).
Table VIII. Vertical point shifting versus multipoint algorithm
أثبتت دراسة حديثة أجراها مجموعة من الباحثين البريطانيين أنه يمكن التعرف
على مفاتيح شخصية بعض الأشخاص من خلال التدقيق في نوع سيارته
الخاصة [13]
                    Number of letters   Number of hidden bits
Pointed letter      50                  50
Multipoint letter   29                  58
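The efficiency figures can be reproduced from the values in Table VIII. The short Python sketch below takes the numerator as the number of hidden bits, which is the reading that reproduces the reported percentages; the function name efficiency is hypothetical.

```python
TOTAL_LETTERS = 115  # letters in the test paragraph


def efficiency(hidden_bits: int, total_letters: int = TOTAL_LETTERS) -> float:
    """Hidden bits per cover letter, as a percentage."""
    return 100.0 * hidden_bits / total_letters


vertical = efficiency(50)     # 50 pointed letters, 1 bit each
multipoint = efficiency(58)   # 29 multipoint letters, 2 bits each
print(f"vertical: {vertical:.1f}%  multipoint: {multipoint:.1f}%")
```

The multipoint algorithm reaches the higher figure while altering only 29 letters instead of 50, which is the basis of the reduced-suspicion argument.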
D. Merge with Vertical Point Shifting
In this subsection, we calculate the ratio when we
combine the vertical point shifting algorithm discussed in [7]
with the multipoint algorithm.
Figure 8. Merging vertical and multipoint algorithms
In Figure 8, we note the ability of the vertical point
algorithm to hide a huge amount of data compared to the
multipoint algorithm. On the other hand, an observer can
notice the vertical changes more easily than the multipoint
changes, because the number of changes is larger. Consequently,
merging both algorithms gives us more flexibility in balancing
the amount of hidden data against the changes that can be
detected by an observer.
V. CONCLUSION
In this paper we introduce a new text Steganography algorithm
for Arabic multipoint letters. The new algorithm hides two
bits in each multipoint letter. We combine our strategy with
vertical point shifting [7] to increase the amount of hidden
data.
The retyping process is a challenging problem for similar
algorithms, as it removes all the hidden data. We address this
challenge, and mitigate any font format changes, by converting
the file to an image, which unifies all data into a homogeneous
file. Finally, the results reported for this implementation
outperform similar results reported in the literature in terms
of hiding capacity and the practicality of using such a
steganography mechanism for hiding information.
REFERENCES
[1] Aelphaeis Mangarae, "Steganography FAQ," Zone-H.Org, March
18, 2006.
[2] S. Dickman, "An Overview of Steganography," July 2007.
[3] V. Potdar, E. Chang, "Visibly Invisible: Ciphertext as a
Steganographic Carrier," Proceedings of the 4th International
Network Conference (INC2004), pp. 385-391, Plymouth, U.K.,
July 2004.
[4] M. Al-Husainy, "Image Steganography by Mapping Pixels to
Letters," Science Publications, 2009.
[5] M. Shahreza, S. Shahreza, "Steganography in TeX Documents,"
Proceedings of Intelligent System and Knowledge Engineering,
ISKE 2008, 3rd International Conference, Nov. 2008.
[6] M. S. Shahreza, M. H. Shahreza, "An Improved Version of
Persian/Arabic Text Steganography Using 'La' Word,"
Proceedings of IEEE 6th National Conference on
Telecommunication Technologies, 2008.
[7] M. H. Shahreza, M. S. Shahreza, "A New Approach to
Persian/Arabic Text Steganography," Proceedings of 5th
IEEE/ACIS International Conference on Computer and
Information Science, 2006.
[8] M. Aabed, S. Awaideh, A. Elshafei, A. Gutub, "Arabic
Diacritics Based Steganography," Proceedings of IEEE
International Conference on Signal Processing and
Communications (ICSPC'07), 2007.
[9] W. Bender, D. Gruhl, N. Morimoto, A. Lu, "Techniques for Data
Hiding," IBM Systems Journal, Vol. 35, Nos. 3&4, 1996.
[10] J. Brassil, S. Low, N. Maxemchuk, L. Gorman, "Electronic
Marking and Identification Techniques to Discourage Document
Copying," in Proceedings of the 13th IEEE INFOCOM
Networking for Global Communications Conference, Oct. 1995.
[11] K. Bennett, "Linguistic Steganography: Survey, Analysis, and
Robustness Concerns for Hiding Information in Text," Center for
Education and Research in Information Assurance and Security,
Purdue University, 2004.
[12] M. Nosrati, R. Karimi, M. Hariri, "An Introduction to
Steganography Methods," World Applied Programming, Vol. 1,
No. 3, pp. 191-195, Aug. 2011.
[13] M. H. Shirali-Shahreza, M. Shirali-Shahreza, "Text
Steganography in Chat," Proceedings of 3rd IEEE/IFIP
International Conference in Central Asia, Sept. 2007.
Ammar Odeh is a Ph.D. student at the University of
Bridgeport. He earned the M.S. degree in Computer
Science from the King Abdullah II School for
Information Technology (KASIT) at the University
of Jordan in Dec. 2005 and the B.Sc. in Computer
Science from the Hashemite University. He has
worked as a Lab Supervisor in Philadelphia
University (Jordan) and Lecturer in Philadelphia
University for the ICDL courses and as technical
support for online examinations for two years. He
served as a Lecturer at the IT, (ACS,CIS ,CS)
Department of Philadelphia University in Jordan,
and also worked at the Ministry of Higher
Education (Oman, Sur College of Applied Science)
for two years. Ammar joined the University of
Bridgeport as a PhD student of Computer Science
and Engineering in August 2011. His area of
concentration is reverse software engineering,
computer security, and wireless networks.
Specifically, he is working on the enhancement of
computer security for data transmission over
wireless networks. He is also actively involved in
academic community, outreach activities and
student recruiting and advising.
Qassim Bani Hani is a Ph.D. candidate in the Computer
Science and Engineering department at the
University of Bridgeport. His current research
interests include the design and development of
learning environments to support learning about
heterogeneous domains, collaborative discovery
learning, the development of mobile
applications to support mobile collaborative
learning (MCL), the congestion mechanisms of the
Transmission Control Protocol including its various
existing variants, and the delivery of multimedia
applications. He completed his Bachelor degree in
computer science from Irbid National University in
2004 and Master degree in computer science from
Al-Balqa' Applied University in 2007. Qassim has
been directly involved in design and development of
mobile applications to support learning
environments to meet pedagogical needs of schools,
colleges, universities and various organizations.
Aladdin Alzubi received the B.Sc. in Software
Engineering from Philadelphia University, Amman,
Jordan in 2004, and the Master of Computer
Sciences from Universiti Sains Malaysia,
Malaysia in 2006. In 2011 he joined the University of
Bridgeport, Connecticut, USA, as a Ph.D. student in
computer science and engineering. From 2000 to 2004.
Dr. Elleithy is the Associate Dean for Graduate
Studies in the School of Engineering at the
University of Bridgeport. His research interests
are in the areas of network security, mobile
communications, and formal approaches for design
and verification. He has published more than one
hundred fifty research papers in international
journals and conferences in his areas of expertise.
Dr. Elleithy is the co-chair of the International Joint
Conferences on Computer, Information, and
Systems Sciences, and Engineering (CISSE). CISSE
is the first Engineering/Computing and Systems
Research E-Conference in the world to be
completely conducted online in real-time via the
internet and was successfully running for four years.
Dr. Elleithy is the editor or co-editor of 10 books
published by Springer for advances on Innovations
and Advanced Techniques in Systems, Computing
Sciences and Software.
... An elliptic curve-based key-encryption method was created by Koblitz and Miller [56]. ...
... The constant parameter optimization employed in encryption cryptography highlights the significance of this approach. Population-based methods underpin EA's capacity to generate several acceptable outputs in a single development [56]. This method, however, also has drawbacks, like slow convergence speed, etc. ...
Chapter
Biomedical image security is necessary for securing sensitive information and files when computerized images and their appropriate patient records are communicated throughout the public domain networks. In this chapter, we shed light on the biomedical image security concepts, elements, structure, and other aspects of biomedical image security. Specifically, this chapter will extend the elaboration on biomedical imaging, its techniques, methods of medical image security, med images attack types, Cryptography in the spatial domain such as chaotic-based encryption, deoxyribonucleic acid (DNA)-based encryption, elliptic curve-based encryption, metaheuristics-based encryption, cellular automata), cryptography in the transform domain, steganography including spatial domain steganography, transform domain steganography, adaptive domain image steganography, and watermarking techniques, phases, requirements, and performance metrics (such as PSNR, MSE, NC, NPCR, SSIM, UACI, BER). This chapter profounds the medical image security knowledge and provides more insights into readers about medical image attacks, malware, and their countermeasures.
... The percentage of this movement is about (1/300), and it is so small that it is nearly invisible to the naked eye. Still, it can be determined using a specific program designed to extract hidden text from the text cover [10], [11], as shown in table (11). 2.Extended letters (Kashida): Persian and Arabic languages differ from English in that their letters are connected rather than separated in printing, with one letter joining the next, with the exception of seven letters that cannot be joined to the next, which are ‫‪).The‬ا،د،ر،و،ذ،ز،ژ(‬ extension can be found in the Persian and Arabic languages as a kind of embellishment and adjustment to equalize the length of all the lines of the text where the extension is made between any two letters except for the end of the previous seven letters. ...
... The first group carries a value (0) bit, and the second group represents the value of (1) bit. The rest of the letters represented by (10) letters are considered to have a neutral value to complete the (28) letters used in the Arabic language. When generating words, they are tested one by one. ...
Article
Full-text available
Steganography is one of the oldest methods for securely sending and transferring secret information between two people without raising suspicion. Recently, the use of Artificial Intelligence (AI) has become simpler and more widely used. Since the emergence of natural language processing (NLP), building language models using deep learning has become more Furthermore, because of the importance of concealing secret information in delivered messages, Artificial Intelligence theories along with Natural Language Processing algorithms were employed to conceal hidden information within the text cover. The Arabic language was used because of its large number of words, vocabulary, and linguistic meanings, and its most significant feature is Arabic poetry. This study discovered a new way to hide secret data inside newly formulated Arabic poetry based on previous Arabic poetic texts and a database of a number of Arab poets from the ancient and modern eras using Artificial Intelligence and Long-Short Term Memory (LSTM) theories to increase storage capacity by 45 percent. The linguistic accuracy and volume of secret data hidden within the formulated poetry were increased using a Baudot Code algorithm, where the secret data is hidden at the level of letters rather than words, and the linguistic accuracy and volume of secret data hidden within the formulated poetry were increased to eliminate the negatives found in previous studies.
... And to do that, the user must increase the size of the file by using vertical shift point algorithm, this way the carrier's size will be increased without alteration or modification, after adding the secret information inside, a process of converting the file into a picture in order to avoid and prevent any re-typing issues [12]. Also, not being able to re-type can be a disadvantage that can cause problems because the hidden information is dependent on the format of the file, this makes the attacker suspicious about these different formats [3]. ...
... To prove that the program will use any non-printable letters and not only zero width-joiner and zero width non-joiner, but we also used two other non-printable letters which gave the same results as can be seen in Figure 10 and in Figure 11 the output gave the secret message without any changes. After reviewing the program, we decided to change the values that are appointed to each letter to random values, each letter in the program was assigned a 0 or 1 value based on its appearance, the pointed letters were assigned a 0 value and nonpointed letters were assigned a 1 value, table 2 shows the letters that were assigned a "1" value and table [3] shows the letters that were assigned a "0" value, the values were changed to a random sequence to eliminate any sequences in the values of the letters. Each secret letter in the non-printable letters technique needs 9 bits since 8 bits will be assigned to one secret letter and an extra bit will be used to separate each secret letter from the next secret letter, so to get the number if secret letters that can be hidden in each text, we can use this equation, Hidden letters = number of letters in a text/9, as demonstrated in Figure 12, the capacity of this technique is high compared to the other techniques that had far more problems and obstacles compared to the non-printable letters technique. ...
... Otherwise, unpointed letters will be used. Inspired by the work in [6], a new text steganography for Arabic multipoint letters is introduced in [14]. Compared to other literature schemes, the proposed algorithm in [14] can hide 2 bits per letter rather than 1 bit. ...
... Inspired by the work in [6], a new text steganography for Arabic multipoint letters is introduced in [14]. Compared to other literature schemes, the proposed algorithm in [14] can hide 2 bits per letter rather than 1 bit. The main idea is to only use multipoint Arabic/Persian letters like ( ) to conceal binary data. ...
Conference Paper
Full-text available
Because of the great evolution of communication technologies, the subject of information security has become a sensitive topic that must be given great importance in order to protect confidential information. Steganography is one of the effective ways to improve security and information protection while transferring data over the Internet. It consists of hiding information within an unremarkable cover object that can be either of text, image, audio, or video. Arabic language and other similar languages such as Urdu and Persian, possess some special features that make them excellent covers for steganography. In this work, we propose to use for the first time in the literature, the position of points in dotted Arabic letters combining with other Arabic text features to create a new steganography technique. Based on this idea, we introduce new secret bit patterns that could allow us to hide three bits of secret information into a cover object instead of one bit, and hence greatly enhance the capacity performance. Experiments show the effectiveness of our proposed technique by outperforming other existing steganography methods.
... Unfortunately, there is no research on coverless information hiding in Arabic. By contrast, there has been much research on conventional information hiding in the Arabic language, such as concealing a secret text based on the dots of the letters [36], and concealing textual data within the visual allure of elongated letters or Kashida [26], [37], [38]. Harakats also serve a secondary purpose of concealing confidential information within the text [3], [39], [40]. ...
Article
Full-text available
Text steganography is crucial in information security due to the limited redundancy in text. The Arabic language features offer a new method for data concealment. In this paper, the researchers propose a new coverless text information hiding method based on built-in features of Arabic scripts. The first word of each row in the dataset is tested against eight features to produce one byte whose bits are 1 or 0 according to the presence or absence of the following features: mahmoze, diacritics, isolated, two sharp edges, vowels, dotted, looping, and high frequency. Then, each byte is converted to a decimal number (ASCII code) to implement a dynamic mapping protocol with the most frequent letter. In the hiding process, each character in the secret message is converted to ASCII code and matched in the dataset. After matching, the candidate text is sent to the receiver. On the receiving side, the pre-agreed dynamic mapping protocol is applied to extract the secret message. Three Arabic datasets are used in this paper: SANAD (Single-Label Arabic News Articles Dataset), which includes 45500 articles; the Arabic Poem Comprehensive Dataset (APCD), which contains 1,831,770 poetic verses in total; and the Arabic Poetry Dataset, which contains more than 58000 poems. The suggested approach withstands existing detection methods because it involves no modification or generation. Moreover, there is an enhancement in hiding capacity, as the method can conceal one character per word. Hence, all the messages are embedded successfully using dynamic mapping.
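The feature-to-byte step described in this abstract can be sketched in Python as follows. This is only an illustration: the abstract does not state the bit order or how each feature is detected, so the sketch assumes the features are listed from most significant to least significant bit and takes the presence flags as ready-made input. The names `FEATURES` and `features_to_code` are hypothetical.

```python
from typing import Mapping

# Feature names from the abstract; the ordering (first = most significant bit)
# is an assumption made for this sketch.
FEATURES = (
    "mahmoze", "diacritics", "isolated", "two_sharp_edges",
    "vowels", "dotted", "looping", "high_frequency",
)


def features_to_code(flags: Mapping[str, bool]) -> int:
    """Pack eight presence/absence flags into one byte and return its decimal value."""
    bits = "".join("1" if flags[name] else "0" for name in FEATURES)
    return int(bits, 2)


if __name__ == "__main__":
    # Hypothetical word whose first word has only diacritics and high frequency.
    flags = {name: False for name in FEATURES}
    flags["diacritics"] = True
    flags["high_frequency"] = True
    code = features_to_code(flags)
    print(code, chr(code))
```

With this assumed ordering, a word showing only the diacritics and high-frequency features yields the byte 01000001, which is the ASCII code 65 for the character "A".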
... In order to increase the amount of information that may be hidden, two secret bits per character instead of one may be used [61] to generate four possible positions (i.e., 00, 01, 10, & 11). The two scenarios in hiding the secret bit are point shifting and distance between points. ...
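The two-bits-per-character idea in the excerpt above can be sketched in Python as follows. The sketch only shows how a secret message breaks into 2-bit groups and how each group selects one of four positions; the numeric position indices, the 8-bit character encoding, and the function names are assumptions for illustration, not details taken from [61].

```python
# Hypothetical mapping from a 2-bit group to one of four point positions.
POSITIONS = {"00": 0, "01": 1, "10": 2, "11": 3}


def message_to_pairs(message: str) -> list:
    """Convert a message to 8-bit codes and split the bit string into 2-bit groups."""
    bits = "".join(format(byte, "08b") for byte in message.encode("latin-1"))
    return [bits[i:i + 2] for i in range(0, len(bits), 2)]


def pairs_to_positions(pairs: list) -> list:
    """Map each 2-bit group to the position index a cover letter would take."""
    return [POSITIONS[pair] for pair in pairs]


if __name__ == "__main__":
    pairs = message_to_pairs("A")
    print(pairs, pairs_to_positions(pairs))
```

Under these assumptions, each 8-bit secret character needs four cover letters, half as many as a scheme that hides one bit per letter.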
Article
Full-text available
Despite the rapidly growing number of studies on Arabic text steganography (ATS) in recent years, systematic, in-depth, and critical reviews are scarce, owing to high overlap or a low level of segregation among the existing review articles in this research area. As such, the objective of this paper is to present an extensive systematic literature review (SLR) on the techniques and algorithms used to analyse ATS. Data were retrieved from three primary databases, namely ScienceDirect, IEEE Xplore Digital Library, and Scopus. As a result, 214 publications from the past 5 years regarding methods of analysing ATS were identified. A comprehensive SLR was executed to detect a range of unique characteristics of the algorithms, which led to the discovery of a new structure of ATS categories. Essentially, a hybrid method for ATS was identified that combines it with other sub-disciplines, especially cryptography, which leads to a new branch for enhancing the security of ATS. Other relevant findings included the key performance and evaluation criteria used to measure the performance of the algorithms (i.e., capacity, invisibility, robustness, and security). Capacity was the performance measure in 87% of the reviewed articles. This reveals a large potential for the other criteria (i.e., invisibility, robustness, and security) to set a benchmark for future research endeavours.
... In a 2012 study by Odeh et al. [24], a secret message is hidden using multi-pointed letters from the Arabic and Persian languages. In this method, two bits of secret data are hidden in each letter by shifting the distance between the character points. ...
Article
This paper introduces a novel method for Arabic text steganography for irreversible practical usage that involves using the spaces in Unicode standard. The method also benefits from several characters in the Arabic language that can be changed based on their locations within words. The main innovation of this proposed method is the utilization of the different typescripts of Unicode standard for each letter shape. The technique also utilizes spaces and other Unicode features of Arabic e-text to hide secret data. This method depends on the use of contextual forms of Arabic characters and whitespaces to irreversibly hide certain specific secret bits. Additionally, the method employs the extra characters zero-width-joiner and Kashida to further enhance embedding capacity while preserving maximum security. The research implementation experiments show that this technique outperforms most available existing irreversible methods in both capacity and security. Moreover, this procedure can be widely adopted in related languages, such as Urdu and Farsi, due to their use of similar Unicode features, which is the encoding standard used in most of the world’s writing systems.
Article
Mega events attract mega crowds, and many data exchange transactions are involved among organizers, stakeholders, and individuals, which increases the risk of covert eavesdropping. Data hiding is essential for safeguarding the security, confidentiality, and integrity of information during mega events. It plays a vital role in reducing cyber risks and ensuring the seamless execution of these extensive gatherings. In this paper, a steganographic approach suitable for mega event communication is proposed. The proposed method utilizes the characteristics of Arabic letters and invisible Unicode characters to hide secret data, where each Arabic letter can hide two secret bits. The secret messages hidden using the proposed technique can be exchanged via emails, text messages, and social media, as these are the main communication channels in mega events. The proposed technique demonstrated notable performance, with a high capacity ratio averaging 178% and a perfect imperceptibility ratio of 100%, outperforming most of the previous work. In addition, its security performance is comparable to previous approaches, with an average ratio of 72%. Furthermore, it is more robust than all related work, withstanding 70% of the possible attacks.
Article
Full-text available
Protecting sensitive information transmitted via public channels is a significant issue faced by governments, militaries, organizations, and individuals. Steganography protects the secret information by concealing it in a transferred object such as video, audio, image, text, network, or DNA. As text uses low bandwidth, it is commonly used by Internet users in their daily activities, resulting in a vast number of text messages sent daily as social media posts and documents. Accordingly, text is the ideal object to be used in steganography, since hiding a secret message in a text makes it difficult for the attacker to detect the hidden message among the massive text content on the Internet. The characteristics of a language are utilized in text steganography. Despite the richness of the Arabic language in linguistic characteristics, only a few studies have been conducted on Arabic text steganography. To draw further attention to the prospects of Arabic text steganography, this paper reviews the classifications of these methods from their inception. For analysis, this paper presents a comprehensive study based on the key evaluation criteria (i.e., capacity, invisibility, robustness, and security). It opens new areas for further research based on the trends in this field.
Article
Full-text available
In this paper, we introduce different types of steganography according to the cover data. As the first step, we discuss text steganography and investigate its details. Then, image steganography and its techniques are investigated, including Least Significant Bits, Masking and Filtering, and Transformations. Finally, audio steganography, which comprises the LSB Coding, Phase Coding, Spread Spectrum, and Echo Hiding techniques, is described.
Conference Paper
Full-text available
Conveying information secretly and establishing hidden relationships has long been of interest. Text documents have been widely used for a very long time. Therefore, we have witnessed different methods of hiding information in texts (text steganography) from the past to the present. In this paper we introduce a new approach for steganography in Persian and Arabic texts. Considering the large number of points in Persian and Arabic phrases, in this approach we hide information in the texts by vertical displacement of the points. This approach can be categorized under feature coding methods. This method can be used for Persian/Arabic watermarking. Our method has been implemented in the Java programming language.
Article
Full-text available
The goal of steganography is to avoid drawing suspicion to the transmission of a hidden message. If suspicion is raised then this goal is defeated. The success of steganography, to a certain extent, depends on the secrecy of the cover medium. Once the steganographic carrier is disclosed then the security depends on the robustness of the algorithm used. Hence, to maintain secrecy either we have to make the cover medium more robust against steganalysis or discover new and better cover mediums. We consider the latter approach much more effective, since old techniques get prone to steganalysis. In this paper, we present one such cover medium. We propose to use ciphertext as a steganographic carrier.
Conference Paper
Full-text available
New steganography methods are being proposed to embed secret information into text cover media in order to search for new possibilities employing languages other than English. This paper utilizes the advantages of diacritics in Arabic to implement text steganography. Diacritics - or Harakat - in Arabic are used to represent vowel sounds and can be found in many formal and religious documents. The proposed approach uses eight different diacritical symbols in Arabic to hide binary bits in the original cover media. The embedded data are then extracted by reading the diacritics from the document and translating them back to binary.
Article
Full-text available
Data hiding, a form of steganography, embeds data into digital media for the purpose of identification, annotation, and copyright. Several constraints affect this process: the quantity of data to be hidden, the need for invariance of these data under conditions where a "host" signal is subject to distortions, e.g., lossy compression, and the degree to which the data must be immune to interception, modification, or removal by a third party. We explore both traditional and novel techniques for addressing the data-hiding process and evaluate these techniques in light of three applications: copyright protection, tamper-proofing, and augmentation data embedding.
Conference Paper
Full-text available
Modern computer networks make it possible to distribute documents quickly and economically by electronic means rather than by conventional paper means. However, the widespread adoption of electronic distribution of copyrighted material is currently impeded by the ease of illicit copying and dissemination. The authors propose techniques that discourage illicit distribution by embedding each document with a unique codeword. The encoding techniques are indiscernible by readers, yet enable one to identify the sanctioned recipient of a document by examination of a recovered document. The authors propose three coding methods, describe one in detail, and present experimental results showing that their identification techniques are highly reliable, even after documents have been photocopied
Article
Full-text available
Modern computer networks make it possible to distribute documents quickly and economically by electronic means rather than by conventional paper means. However, the widespread adoption of electronic distribution of copyrighted material is currently impeded by the ease of unauthorized copying and dissemination. In this paper we propose techniques that discourage unauthorized distribution by embedding each document with a unique codeword. Our encoding techniques are indiscernible by readers, yet enable us to identify the sanctioned recipient of a document by examination of a recovered document. We propose three coding methods, describe one in detail, and present experimental results showing that our identification techniques are highly reliable, even after documents have been photocopied
Article
Steganography is a useful tool that allows covert transmission of information over an overt communications channel. Combining covert channel exploitation with the encryption methods of substitution ciphers and/or one-time pad cryptography, steganography enables the user to transmit information masked inside a file in plain view. The hidden data is both difficult to detect and, when combined with known encryption algorithms, equally difficult to decipher. This paper provides a general overview of the following subject areas: historical cases and examples using steganography; how steganography works; what steganography software is commercially available and what data types are supported; what methods and automated tools are available to aid computer forensic investigators and information security professionals in detecting the use of steganography; whether, after detection has occurred, the embedded message can be reliably extracted; whether the embedded data can be separated from the carrier, revealing the original file; and finally, what methods exist to defeat the use of steganography even if it cannot be reliably detected.
Conference Paper
With the expansion of communication, in some cases there is a need for hidden communication. Steganography is one of the methods used for the hidden exchange of information. Steganography is a method to hide information under a cover medium such as an image or text. One of the text steganography methods for Persian and Arabic texts is the "La" steganography method. But that method increases the file size and changes the appearance of the text. In this paper a method for solving these problems is proposed. In Persian and Arabic, each letter can have four different shapes depending on its position in the word. By using this feature of the Persian and Arabic languages and the way in which documents are saved in the Unicode Standard, this method solves the above problems.
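The four positional shapes mentioned in this abstract have their own code points in Unicode's Arabic Presentation Forms-B block, alongside the base letter. The Python sketch below illustrates this for the letter beh (U+0628) using the standard `unicodedata` module; it shows only how the shapes are represented, not the embedding scheme of the cited paper, which the abstract does not detail. The names `BEH_FORMS` and `base_letter` are hypothetical.

```python
import unicodedata

BEH = "\u0628"  # ARABIC LETTER BEH (base code point)

# Explicit positional shapes of beh in Arabic Presentation Forms-B.
BEH_FORMS = {
    "isolated": "\uFE8F",
    "final": "\uFE90",
    "initial": "\uFE91",
    "medial": "\uFE92",
}


def base_letter(ch: str) -> str:
    """Fold a presentation-form character back to its base letter via NFKC."""
    return unicodedata.normalize("NFKC", ch)


if __name__ == "__main__":
    for position, form in BEH_FORMS.items():
        print(position, hex(ord(form)), "->", hex(ord(base_letter(form))))
```

Because all four shapes fold back to the same base letter under compatibility normalization, a text can store either the base code point or an explicit shape for the same letter, which is the kind of Unicode redundancy a method of this type can draw on.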
Conference Paper
Steganography is a method for the hidden exchange of information by hiding data in a cover medium such as an image or sound. Text steganography is one of the most difficult methods because a text file is not a suitable medium in which to hide data. In this paper we propose a new text steganography method. In this method, we hide data in TeX documents. This method hides the data in places where there is a ligature such as "fi".