International Journal of Computer Science & Information Technology (IJCSIT) Vol 4, No 3, June 2012
DOI: 10.5121/ijcsit.2012.4301
STEGANOGRAPHY IN ARABIC TEXT USING ZERO
WIDTH AND KASHIDHA LETTERS
Ammar Odeh¹ and Khaled Elleithy²
¹Department of Computer Science & Engineering, University of Bridgeport,
Bridgeport, CT 06604, USA
aodeh@bridgeport.edu
²Department of Computer Science & Engineering, University of Bridgeport,
Bridgeport, CT 06604, USA
elleithy@bridgeport.edu
ABSTRACT
The need for secure communication methods has increased significantly with the explosive growth of the
internet and mobile communications. The usage of text documents has doubled several times over the past
years, especially on mobile devices. In this paper we propose a new steganography algorithm for Arabic
text. The algorithm employs characters that can be joined to other letters: the extension letter
(Kashida) and the zero width character. The Kashida does not change the meaning of a word when it is
joined to other letters, and neither does the zero width character (Ctrl+Shift+1). The proposed
algorithm, Zero Width and Kashidha Letters (ZKS), reduces the chance of discovery by steganalysis by
using parallel connection and a permutation function.
KEYWORDS
Steganography, Kashida, Carrier file, Zero width character, text steganography, image steganography,
audio steganography, Information Hiding, Persian/Arabic Text, Steganalysis, stego_medium, stego_key.
1. INTRODUCTION
1.1. BACKGROUND
The word steganography comes from Greek and means "covered writing": "stegano" means hidden and
"graptos" means writing. In steganography, the secret data is embedded in another object so that an
attacker in the middle cannot detect it [1]. Invisible ink is an example: a readable message is
transferred between source and destination, and anyone in the middle can read that message without
having any clue about the hidden data. The authorized recipient, on the other hand, can reveal the
hidden data by exploiting the properties of the ink [2][3].
The ancient Greeks used to shave a messenger's head, write the message on the scalp, and wait until
the hair grew back before sending the messenger to the destination [1]. With this method there are
two possible outcomes:
1. The message arrives, so the receiver can read it and check whether it has been altered.
2. The message does not arrive, which means an attacker has detected it.
1.2. MOTIVATION
Steganography algorithms rely on three techniques to embed hidden data in carrier files.
1. Substitution: A small part of the carrier file is exchanged for the hidden message in such a way
that an attacker in the middle cannot observe the change. To avoid suspicion, the replaced parts
must be insignificant parts of the carrier file. For instance, if the carrier file is an RGB image,
the least significant bit (LSB) can be used as the exchange bit [4].
2. Injection: Hidden data is added to the carrier file. This increases the file size, which in turn
increases suspicion. The main goal is therefore to find techniques that add hidden data while
avoiding attacker suspicion [4].
3. Propagation: No cover object is needed. A generation engine is fed the hidden data as input and
produces a file (graphic, music, or text document) that mimics an ordinary one.
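The substitution technique in item 1 can be illustrated with a minimal Python sketch. It treats the carrier as a plain sequence of byte values (for example, RGB channel values) and overwrites the least significant bit of each byte. The function names `embed_lsb` and `extract_lsb` are hypothetical and do not come from [4].

```python
def embed_lsb(carrier: bytes, bits: str) -> bytes:
    """Overwrite the least significant bit of the first len(bits) carrier bytes."""
    if len(bits) > len(carrier):
        raise ValueError("carrier is too small for the hidden bits")
    out = bytearray(carrier)
    for i, bit in enumerate(bits):
        # Clear the LSB, then set it to the hidden bit.
        out[i] = (out[i] & 0xFE) | int(bit)
    return bytes(out)


def extract_lsb(stego: bytes, n_bits: int) -> str:
    """Read the hidden bits back from the least significant bits."""
    return "".join(str(byte & 1) for byte in stego[:n_bits])
```

Each carrier byte changes by at most one unit, which is why the change is hard to observe in an image.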
The steganography process consists of three main components, as shown in Figure 1.
Figure 1. General components of Steganography
Different types of cover media, including image, sound, video, and text, can be used in
steganography, as shown in Figure 2. The choice of carrier file is critical because it plays a key
role in protecting the embedded message. Successful steganography depends on avoiding suspicion:
once anything looks suspicious, steganalysis will start examining the file, and this compromises
the main goal of steganography [3][4].
Figure 2. Stego Media
Text steganography is the most difficult type because text files generally lack the data
redundancy found in other carrier files [5]. Such redundancy helps increase the capacity for hidden
data. Furthermore, text steganography depends on the language, as each language has its own
characteristics that differ from those of other languages. For example, the letter shape in the
English language does
not depend on its position in the word, while Persian/Arabic letters have different forms
depending on letter positioning [6].
In our proposed algorithm, we hide text inside text by employing the Arabic language and
applying a random algorithm to distribute the hidden bits inside the message. The main reasons
for choosing the Arabic language are:
1. The proposed algorithm depends on letters with multiple dots. Therefore, the algorithm
must employ a language that has as many dotted letters as possible. For example, the Arabic
language has 5 multipoint letters and the Persian/Farsi language has 8 [7], while English does
not have any.
2. The wealth of available electronic textual information.
3. Little research has addressed languages other than English.
4. The approach can be extended to other languages such as Urdu and Kurdish.
1.3. MAIN CONTRIBUTION AND PAPER ORGANIZATION
An efficient algorithm is presented in this paper. The main idea is to use the Kashida and zero
width characters in Arabic, which enables us to hide two bits per letter, whereas most previous
algorithms hide one bit per letter. In addition, we use parallel connection and a randomization
strategy to reduce the chance of detection.
The rest of this paper is organized as follows. In Section 2 we discuss some text steganography
techniques. The proposed Kashida and zero width hiding algorithm is discussed in Section 3. Finally,
concluding remarks are given in Section 4.
2. PRIOR WORK
Text Steganography is divided into two categories. The first one is the semantic method, and the
second is the formatting method, as shown in Figure 2. In this Section, we will briefly explain
some Steganography examples. In Table I, we present a simple comparison between semantic and
formatting methods.
Table I. Comparison between text steganography methods
                        Semantic method      Format method
Amount of hidden data   Small amount         More than semantic
Flaws                   Sentence meaning     Noticed by OCR or retyping
The comparison criteria are the amount of data that can be hidden and the main problem facing
each method.
We describe thirteen algorithms that hide data inside text documents. The last five algorithms
deal with the Arabic and Persian languages.
2.1. WORD SYNONYM
Word synonym, also called the semantic method, replaces some words with their
synonyms (see Table II). This technique conveys data without raising suspicion. It is
limited in that the amount of hidden data is small relative to other methods. Moreover, it
may change the sentence meaning [7][10][12].
2.2. PUNCTUATION
This method uses punctuation marks such as (.) and (;) to represent hidden text. For example,
"NY, CT, and NJ" reads the same as "NY, CT and NJ", where the comma before "and" represents 1 and
its absence represents 0. The amount of hidden data in this method is very small compared to the
size of the cover media. Inconsistent use of punctuation is also noticeable from a steganalysis
point of view [9].
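As a minimal sketch of the comma example, the Python code below sets or removes the comma before every " and " in the cover text. This is an assumption-laden simplification: it treats every occurrence of "and" as a carrier position, not only those in lists, and the names `embed_commas` and `extract_commas` are hypothetical.

```python
import re

# An optional comma followed by " and "; the comma carries the hidden bit.
AND = re.compile(r",? and ")


def embed_commas(cover: str, bits: str) -> str:
    """Write ', and ' for bit 1 and ' and ' for bit 0 at each 'and' in the cover."""
    remaining = iter(bits)

    def rewrite(match: re.Match) -> str:
        bit = next(remaining, None)
        if bit is None:                 # no bits left: keep the text as it is
            return match.group(0)
        return ", and " if bit == "1" else " and "

    stego = AND.sub(rewrite, cover)
    if next(remaining, None) is not None:
        raise ValueError("cover text has too few 'and' positions")
    return stego


def extract_commas(stego: str, n_bits: int) -> str:
    """Read one bit per 'and': 1 if a comma precedes it, 0 otherwise."""
    marks = AND.findall(stego)
    return "".join("1" if mark.startswith(",") else "0" for mark in marks[:n_bits])
```

One carrier position per "and" illustrates why the capacity of this method is so small.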
Table II. Using word synonyms
Word            Synonym
Big             Large
Find            Observe
Familiar        Popular
Dissertation    Thesis
Chilly          Cool
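The synonym pairs of Table II can drive a small embedding sketch in Python. The bit assignment (left column encodes 0, right column encodes 1) is an assumption made for illustration, as are the names `embed_synonyms` and `extract_synonyms`. Punctuation and capitalization handling are omitted.

```python
# Word/synonym pairs from Table II; which side means 0 or 1 is an assumption.
SYNONYMS = {"big": "large", "find": "observe", "familiar": "popular",
            "dissertation": "thesis", "chilly": "cool"}
BASE_OF = {synonym: word for word, synonym in SYNONYMS.items()}


def embed_synonyms(cover: str, bits: str) -> str:
    """Pick the word (bit 0) or its synonym (bit 1) at each replaceable position."""
    out, k = [], 0
    for word in cover.split():
        key = word.lower()
        base = key if key in SYNONYMS else BASE_OF.get(key)
        if base is not None and k < len(bits):
            out.append(base if bits[k] == "0" else SYNONYMS[base])
            k += 1
        else:
            out.append(word)
    if k < len(bits):
        raise ValueError("cover text has too few replaceable words")
    return " ".join(out)


def extract_synonyms(stego: str, n_bits: int) -> str:
    """Read 0 for a left-column word and 1 for a right-column word."""
    bits = []
    for word in stego.split():
        key = word.lower()
        if key in SYNONYMS:
            bits.append("0")
        elif key in BASE_OF:
            bits.append("1")
    return "".join(bits[:n_bits])
```

The receiver must know how many bits were embedded, because unused replaceable words would otherwise be read as extra bits.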
2.3. LINE SHIFTING
Line shifting hides information by shifting lines vertically by a small amount, which gives the
text a unique shape. Unfortunately, line shifting can be detected by a character recognition
program. Moreover, retyping removes all hidden data [7][10].
In Figure 3, we present an example of line shifting where the vertical shift is very small
(1/300 inch). This is not noticeable by the human eye.
Figure 3. Line shifting; second line is shifted up 1/300 inch [10].
2.4. WORD SHIFTING
In this method, information is hidden by changing the spaces between words. Word shifting can be
detected by OCR, which measures the sequence of spaces between words [7][10].
2.5. SMS ABBREVIATIONS
SMS messages commonly use abbreviations for simplicity and security, and the same abbreviations
appear in applications such as internet chatting, email, and mobile messaging. The main advantages
of this method are faster typing, shorter messages, and working around keyboard character
limitations [13].
Other algorithms use numbers to convey specific information. As mentioned above, SMS
abbreviations are natural in specific applications, but using them elsewhere raises the suspicion
of any entity that monitors the ongoing transmission.
2.6. TEXT ABBREVIATIONS
Text abbreviation is similar to SMS abbreviation: a dictionary is created that lists each
abbreviation and its meaning, and the dictionary is shared between the communicating parties.
Abbreviation is thus one method of hiding data. For example, sending (see) means (do you
understand) [13].
Table III. Some SMS abbreviations
Abbreviation    Meaning
ADR             Address
ABT             About
URW             You are welcome
ILY             I love you
EOL             End of lecture
AYS             Are you serious?
2.7. HTML SPAM TEXT
This method relies on HTML pages, whose tags and tag members (attributes) are case-insensitive.
For example, <BR> is equivalent to <Br>, <br>, and <bR>. The hidden data is embedded by choosing
upper-case or lower-case letters to represent 0 or 1.
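A minimal Python sketch of the tag-case idea follows. It assumes that an upper-case letter in a tag name encodes 1 and a lower-case letter encodes 0 (the text does not fix this assignment), and it only touches tag names, not attributes. The names `embed_tag_case` and `extract_tag_case` are hypothetical.

```python
import re

# "<" or "</" followed by the tag name; only the name letters carry bits.
TAG_NAME = re.compile(r"(</?)([A-Za-z]+)")


def embed_tag_case(html: str, bits: str) -> str:
    """Encode one bit per tag-name letter: upper case = 1, lower case = 0."""
    k = 0

    def recase(match: re.Match) -> str:
        nonlocal k
        letters = []
        for ch in match.group(2):
            if k < len(bits):
                ch = ch.upper() if bits[k] == "1" else ch.lower()
                k += 1
            letters.append(ch)
        return match.group(1) + "".join(letters)

    stego = TAG_NAME.sub(recase, html)
    if k < len(bits):
        raise ValueError("page has too few tag letters for the hidden bits")
    return stego


def extract_tag_case(html: str, n_bits: int) -> str:
    """Read the case of the first n_bits tag-name letters."""
    letters = "".join(m.group(2) for m in TAG_NAME.finditer(html))
    return "".join("1" if ch.isupper() else "0" for ch in letters[:n_bits])
```

Because browsers ignore tag case, the rendered page is identical before and after embedding.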
2.8. TEX LIGATURES
In TeX, some special groups of letters can be joined together to create a single glyph (a
ligature), as shown in Figure 4. The algorithm finds the available ligatures in the text and hides
a single bit in each one. For example, to hide 1 we write fi as f{}i, which creates a small space
between f and i. Otherwise, the ligature is left intact, which encodes 0 [5].
Figure 4. Join between characters [5]
The same algorithm can be applied to the Arabic "La" ligature (Lam followed by Alef). This
algorithm has two problems. The first problem is that the file size increases when the extension
is applied to the text. The second problem is that if OCR notices the font change, it can detect
the hidden message [6][5].
2.9. ARABIC DIACRITICS
The Arabic language uses different marks. The main reason to use these symbols is to distinguish
between words that have the same letters. This method relies on Arabic diacritics (Harakat), which
are optional: most Arabic novels can be read without diacritics because the reader infers them
from the grammar. The most frequent diacritic is the Fatha " َ", which is used to encode 1;
otherwise 0 is encoded. This algorithm allows the cover media to be reused. Furthermore, the
carrier file size might be reduced, depending on the hidden message. On the other hand, when
OCR detects the same message with different diacritics, it might conclude that hidden data is
present. In addition, retyping removes the embedded message [8].
Table IV. Some letters with diacritics and their pronunciation
Pronunciation    Letter with Haraka    Haraka
Do               دُ                     Damma
De               دِ                     Kasra
Da               دَ                     Fatha
2.10. VERTICAL DISPLACEMENT OF THE POINTS
This algorithm performs well on languages with many pointed (dotted) letters. Languages such as
English have only two dotted letters, {i, j}, which limits the application of this algorithm.
Languages such as Arabic and Persian have many pointed letters, which makes them a better fit for
this technique.
Arabic has 28 letters, of which 15 are pointed, and Persian has 32 letters, of which 18 are
pointed. In this algorithm, a point is shifted up to encode 1 and left in place to encode 0. This
method can encode a large number of bits, and a strong OCR is needed to recognize the changes.
However, retyping removes the entire message [7].
Figure 5. Vertical shifting point [7]
2.11. USING THE EXTENSION 'KASHIDA' CHARACTER
This method relies on the letter extension character (Kashida). A Kashida cannot be added at
the beginning or at the end of a word; it can only be added between connected letters within a
word. An un-pointed letter followed by an extension hides a zero, and a pointed letter followed by
an extension hides a one.
The message content is not affected. On the other hand, a new Unicode character (U+0640) is added.
Figure 6. Vertical shifting point [6]
As the figure shows, not all characters hide a bit. Steganalysis may therefore suspect the
message, which would violate the main goal of steganography [13].
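The pointed/un-pointed rule can be sketched in Python as follows. The letter sets are assumptions: they cover only the basic Arabic letters, they are written as Unicode escapes to avoid rendering issues, and the cover text is assumed to contain no Kashida or diacritics. The names `embed_kashida` and `extract_kashida` are hypothetical.

```python
KASHIDA = "\u0640"
# Letters that connect to the following letter (assumed basic Arabic set).
DUAL_JOINING = set("\u0626\u0628\u062A\u062B\u062C\u062D\u062E\u0633\u0634\u0635\u0636"
                   "\u0637\u0638\u0639\u063A\u0641\u0642\u0643\u0644\u0645\u0646\u0647\u064A")
# Letters that connect only to the preceding letter (alef, dal, reh, waw, ...).
RIGHT_JOINING = set("\u0622\u0623\u0624\u0625\u0627\u0629\u062F\u0630\u0631\u0632\u0648\u0649")
LETTERS = DUAL_JOINING | RIGHT_JOINING
# Dotted letters among the forward-connecting ones.
POINTED = set("\u0628\u062A\u062B\u062C\u062E\u0634\u0636\u0638\u063A\u0641\u0642\u0646\u064A")


def embed_kashida(cover: str, bits: str) -> str:
    """Add a Kashida after a pointed letter for 1, after an un-pointed letter for 0."""
    out, k = [], 0
    for i, ch in enumerate(cover):
        out.append(ch)
        connects = ch in DUAL_JOINING and cover[i + 1:i + 2] in LETTERS
        if k < len(bits) and connects and (ch in POINTED) == (bits[k] == "1"):
            out.append(KASHIDA)
            k += 1
    if k < len(bits):
        raise ValueError("cover text has too few suitable letters")
    return "".join(out)


def extract_kashida(stego: str) -> str:
    """Each Kashida yields 1 if the letter before it is pointed, else 0."""
    return "".join("1" if stego[i - 1] in POINTED else "0"
                   for i, ch in enumerate(stego) if ch == KASHIDA and i > 0)
```

Letters that do not match the next bit are skipped, which is why not every character hides a bit.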
2.12. UTILIZATION OF THE EXTENSION 'KASHIDA' CHARACTER
This algorithm uses the Kashida as follows: one Kashida represents 0 and two Kashidas
represent 1. Moreover, the Arabic letters used add up to 32 symbols. Since each letter needs
16 bits to be represented, the algorithm uses a mapping table to which each character is mapped.
Instead of 16 bits, each letter can then be represented by only 6 bits, which saves 10 bits [14].
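The two ideas in this method can be sketched in Python under stated assumptions. `pack_6bit` uses the position of each character in a hypothetical mapping table as its 6-bit code (the actual table of [14] is not given here), and `embed_runs` writes one Kashida for 0 and two for 1 at each connecting position. The letter sets are assumed basic Arabic sets, and the cover is assumed to contain no Kashida.

```python
import re

KASHIDA = "\u0640"
# Letters that connect to the following letter (assumed basic Arabic set).
DUAL_JOINING = set("\u0626\u0628\u062A\u062B\u062C\u062D\u062E\u0633\u0634\u0635\u0636"
                   "\u0637\u0638\u0639\u063A\u0641\u0642\u0643\u0644\u0645\u0646\u0647\u064A")
# Letters that connect only to the preceding letter.
RIGHT_JOINING = set("\u0622\u0623\u0624\u0625\u0627\u0629\u062F\u0630\u0631\u0632\u0648\u0649")
LETTERS = DUAL_JOINING | RIGHT_JOINING


def pack_6bit(message: str, table: str) -> str:
    """Map each character to its 6-bit index in a shared table (hypothetical order)."""
    return "".join(format(table.index(ch), "06b") for ch in message)


def embed_runs(cover: str, bits: str) -> str:
    """Insert one Kashida for 0 and two Kashidas for 1 at each connecting position."""
    out, k = [], 0
    for i, ch in enumerate(cover):
        out.append(ch)
        connects = ch in DUAL_JOINING and cover[i + 1:i + 2] in LETTERS
        if k < len(bits) and connects:
            out.append(KASHIDA * (2 if bits[k] == "1" else 1))
            k += 1
    if k < len(bits):
        raise ValueError("cover text has too few connecting letters")
    return "".join(out)


def extract_runs(stego: str) -> str:
    """A run of two Kashidas reads as 1, a single Kashida as 0."""
    return "".join("1" if len(run) == 2 else "0"
                   for run in re.findall(KASHIDA + "+", stego))
```

Unlike the pointed-letter rule, every connecting position carries a bit, at the cost of more inserted characters.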
2.13. USING PSEUDO-SPACE AND PSEUDO CONNECTION CHARACTERS
These characters are also called the zero width non-joiner (ZWNJ) and the zero width joiner (ZWJ).
First, letters are classified as joining or non-joining letters. To hide 1, a zero width character
is added; otherwise nothing is added, which hides 0 [15].
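A possible reading of this scheme is sketched below in Python. It assumes that a ZWJ (U+200D) is placed after a letter that already connects to the next letter and a ZWNJ (U+200C) after a letter that does not, so that neither changes the rendered text. The letter sets and the names `embed_zero_width` and `extract_zero_width` are assumptions, not taken from [15].

```python
ZWNJ, ZWJ = "\u200C", "\u200D"
# Letters that connect to the following letter (assumed basic Arabic set).
DUAL_JOINING = set("\u0626\u0628\u062A\u062B\u062C\u062D\u062E\u0633\u0634\u0635\u0636"
                   "\u0637\u0638\u0639\u063A\u0641\u0642\u0643\u0644\u0645\u0646\u0647\u064A")
# Letters that connect only to the preceding letter.
RIGHT_JOINING = set("\u0622\u0623\u0624\u0625\u0627\u0629\u062F\u0630\u0631\u0632\u0648\u0649")
LETTERS = DUAL_JOINING | RIGHT_JOINING


def embed_zero_width(cover: str, bits: str) -> str:
    """One bit per letter-to-letter boundary: a zero width character means 1."""
    out, k = [], 0
    for i, ch in enumerate(cover):
        out.append(ch)
        if k < len(bits) and ch in LETTERS and cover[i + 1:i + 2] in LETTERS:
            if bits[k] == "1":
                # ZWJ where the letters join anyway, ZWNJ where they do not.
                out.append(ZWJ if ch in DUAL_JOINING else ZWNJ)
            k += 1
    if k < len(bits):
        raise ValueError("cover text has too few letter boundaries")
    return "".join(out)


def extract_zero_width(stego: str, n_bits: int) -> str:
    """Read 1 where a zero width character sits between two letters, else 0."""
    bits = []
    for i, ch in enumerate(stego):
        if ch not in LETTERS:
            continue
        rest = stego[i + 1:i + 3]
        if rest[:1] in LETTERS:
            bits.append("0")
        elif rest[:1] in (ZWJ, ZWNJ) and rest[1:2] in LETTERS:
            bits.append("1")
    return "".join(bits[:n_bits])
```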
3. PROPOSED ALGORITHM
Several features of Arabic characters support different steganography algorithms.
3.1. ARABIC LETTERS CHARACTERISTICS
A. The Arabic language has 28 letters, and each of them has up to four different shapes depending
on the position of the letter in the word. English letters have the same shape regardless of
position. Table V shows some Arabic letter shapes.
Table V. Some letters, their shapes by position, and their pronunciation
Letter    Beginning    Middle    End    Pronunciation
ت         تـ           ـتـ       ـت     t
ث         ثـ           ـثـ       ـث     th
B. Most Arabic letters can be connected to each other, as in (يعلمون), whereas English words
consist of separate letters.
C. Each Arabic letter is encoded in Unicode, where each letter is represented by 2 bytes.
D. Any steganography algorithm applied to Arabic text can be extended to other languages that use
the same script (Persian, Pashto, Sindhi, Kurdish, Urdu).
3.2. ZKS ALGORITHM
The ZKS algorithm employs letter connectivity and the extension character to hide 1 bit, and
additionally uses the zero width character so that 2 bits are hidden per connective character.
Table VI. Steganography extension algorithm
Cover object:
ﺔﯾروطارﺑﻣﻻاو رﺻﻣ ءاذﻏ ﺔَﻠﺳ ﻲﻟﺎﻣﺷﻟا رﺻﻣ لﺣﺎﺳ نﺎﻛ
ءﺎﻧﺑ لﺑﻘﻓ .مﻼﺳﻹا لﺑﻗ رﺻﻣ لﺗﺣﺗ تﻧﺎﻛ ﻲﺗﻟا ﺔﯾﻧﺎﻣورﻟا
ﻰﻓ لﯾﻧﻟا هﺎﯾﻣ ﻰﻠﻋ نوﯾرﺻﻣﻟا دﻣﺗﻋا ،ﻲﻟﺎﻌﻟا دﺳﻟا
هﺎﯾﻣ ﻰﻠﻋ اودﻣﺗﻋا ﺎﻣﻛ ﺎﺗﻟدﻟاو يداوﻟا ﻲﻓ ﺔﯾﻔﯾﺻﻟا تﺎﻋارزﻟا
ﺢﻣﻘﻟا ﺔﻋارز ﻲﻓ رﺎطﻣﻷا
Stego object:
رـﺻـﻣ ءاذﻏ ﺔ َـﻠـﺳ ﻲﻟﺎـﻣﺷـﻟا رﺻـﻣ لﺣﺎـﺳ نﺎـﻛ
رـﺻـﻣ لـﺗـﺣـﺗ تـﻧﺎﻛ ﻲﺗـﻟا ﺔــﻧﺎـﻣورﻟا ﺔﯾروـطارﺑﻣﺎـﻟاو
دﻣﺗﻋا ،ﻲـﻟﺎـﻌﻟا دـﺳـﻟا ءﺎﻧـﺑ لﺑﻘـﻓ .مﺎـﻠـﺳﻹا لـﺑـﻗ
ﺔــﻔــﺻﻟا تﺎﻋارزﻟا ﻰـﻓ لـﻧـﻟا هﺎــﻣ ﻰـﻠـﻋ نوـﯾرﺻﻣـﻟا
رﺎـطﻣﻷا هﺎﯾـﻣ ﻰـﻠـﻋ اودـﻣـﺗـﻋا ﺎـﻣﻛ ﺎـﺗﻟدﻟاو يداوـﻟا ﻲـﻓ
ﺢـﻣـﻘﻟا ﺔﻋارز ﻲﻓ
Hidden bits:
1101010101101110010011110011111110110101
1101100010011111110100011111110010001111
11100100011
As the table above shows, a large number of bits can be added to the message.
By also applying the zero width character (U+200D, typed with Ctrl+Shift+1), we can increase the
hidden bit capacity.
Table VII. Steganography extension algorithm
Extension    Zero width    Code    Letter effect
No           No            00      No effect
Yes          No            01      Extension
No           Yes           10      Zero width
Yes          Yes           11      Extension + zero width
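The two-bit codes of Table VII can be sketched in Python as shown below. This is a minimal illustration under several assumptions: the letter sets cover only basic Arabic letters, the cover contains no Kashida or zero width characters, and for code 11 the Kashida is written before the ZWJ (the ordering is not stated in the table). The names `embed_zks` and `extract_zks` are hypothetical, and the parallel and permutation stages are not included.

```python
KASHIDA, ZWJ = "\u0640", "\u200D"
# Letters that connect to the following letter (assumed basic Arabic set).
DUAL_JOINING = set("\u0626\u0628\u062A\u062B\u062C\u062D\u062E\u0633\u0634\u0635\u0636"
                   "\u0637\u0638\u0639\u063A\u0641\u0642\u0643\u0644\u0645\u0646\u0647\u064A")
# Letters that connect only to the preceding letter.
RIGHT_JOINING = set("\u0622\u0623\u0624\u0625\u0627\u0629\u062F\u0630\u0631\u0632\u0648\u0649")
LETTERS = DUAL_JOINING | RIGHT_JOINING

# Table VII: first bit = zero width present, second bit = extension present.
CODE_TO_MARK = {"00": "", "01": KASHIDA, "10": ZWJ, "11": KASHIDA + ZWJ}
MARK_TO_CODE = {mark: code for code, mark in CODE_TO_MARK.items()}


def embed_zks(cover: str, bits: str) -> str:
    """Hide two bits after each connective letter using the codes of Table VII."""
    if len(bits) % 2:
        raise ValueError("this sketch expects an even number of bits")
    pairs = [bits[i:i + 2] for i in range(0, len(bits), 2)]
    out, k = [], 0
    for i, ch in enumerate(cover):
        out.append(ch)
        if k < len(pairs) and ch in DUAL_JOINING and cover[i + 1:i + 2] in LETTERS:
            out.append(CODE_TO_MARK[pairs[k]])
            k += 1
    if k < len(pairs):
        raise ValueError("cover text has too few connective letters")
    return "".join(out)


def extract_zks(stego: str, n_bits: int) -> str:
    """Read the marks that follow each connective letter back into two-bit codes."""
    codes, i = [], 0
    while i < len(stego) and 2 * len(codes) < n_bits:
        ch = stego[i]
        i += 1
        if ch not in DUAL_JOINING:
            continue
        mark = ""
        while i < len(stego) and stego[i] in (KASHIDA, ZWJ):
            mark += stego[i]
            i += 1
        if stego[i:i + 1] in LETTERS:       # only connective positions carry a code
            codes.append(MARK_TO_CODE[mark])
    return "".join(codes)
```

With this encoding, a six-letter word with four connective positions carries eight bits, compared with at most four under a one-bit-per-letter scheme.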
Table VIII. Simulated results of applying the ZKS algorithm
Cover object:
ﺔﯾروطارﺑﻣﻻاو رﺻﻣ ءاذﻏ ﺔَﻠﺳ ﻲﻟﺎﻣﺷﻟا رﺻﻣ لﺣﺎﺳ نﺎﻛ
ءﺎﻧﺑ لﺑﻘﻓ .مﻼﺳﻹا لﺑﻗ رﺻﻣ لﺗﺣﺗ تﻧﺎﻛ ﻲﺗﻟا ﺔﯾﻧﺎﻣورﻟا
ﻰﻓ لﯾﻧﻟا هﺎﯾﻣ ﻰﻠﻋ نوﯾرﺻﻣﻟا دﻣﺗﻋا ،ﻲﻟﺎﻌﻟا دﺳﻟا
هﺎﯾﻣ ﻰﻠﻋ اودﻣﺗﻋا ﺎﻣﻛ ﺎﺗﻟدﻟاو يداوﻟا ﻲﻓ ﺔﯾﻔﯾﺻﻟا تﺎﻋارزﻟا
ﺢﻣﻘﻟا ﺔﻋارز ﻲﻓ رﺎطﻣﻷا
Stego object:
رـﺻـﻣ ءاذﻏ ﺔ َـﻠـﺳ ﻲﻟﺎـﻣﺷـﻟا رﺻـﻣ لﺣﺎﺳ نﺎـﻛ
رـﺻـﻣ لـﺗـﺣـﺗ تـﻧﺎﻛ ﻲﺗـﻟا ﺔــﻧﺎـﻣورﻟا ﺔﯾروـطارﺑﻣﺎـﻟاو
دﻣﺗﻋا ،ﻲـﻟﺎـﻌﻟا دـﺳـﻟا ءﺎﻧـﺑ لﺑﻘـﻓ .مﺎـﻠـﺳﻹا لـﺑـﻗ
ﺔــﻔــﺻﻟا تﺎﻋارزﻟا ﻰـﻓ لـﻧـﻟا هﺎــﻣ ﻰـﻠـﻋ نوـﯾرﺻﻣـﻟا
رﺎـطﻣﻷا هﺎﯾـﻣ ﻰـﻠـﻋ اودـﻣـﺗـﻋا ﺎـﻣﻛ ﺎـﺗﻟدﻟاو يداوـﻟا ﻲـﻓ
ﺢـﻣـﻘﻟا ﺔﻋارز ﻲﻓ
Hidden bits:
1001001000001110101100010111101100011111
1100111011111001111101000001001011100011
00000001011010001111111000000101001
As Table VIII shows, a large number of bits can be hidden inside a text message, while the
change in the text is unnoticeable.
3.3. PSEUDO CODE AND FLOW CHART
The ZKS algorithm hides the message in two stages to avoid arousing an attacker's suspicion.
Figure 7. State diagram for ZKS
The first stage fragments the stego cover media so that different strategies can be applied to
the embedded message.
In the figure, we suggest that eight messages can be sent in parallel, so 3 bits are embedded to
identify the sequence number of each message. Depending on the message to be hidden, bits can be
added to increase the number of parallel messages, as shown in Figure 8.
Figure 8. Parallel connection
In the second stage, the algorithm permutes the fragmented messages, and a randomization function
chooses which algorithm is applied.
Therefore, the first four most significant bits determine the message sequence number and the
last bit selects the applied algorithm.
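The fragmentation and permutation stages can be sketched in Python as follows. This is only an illustration under assumptions: it uses the eight-fragment, 3-bit sequence number example, a seeded `random.Random` stands in for the permutation function, and the extra bit that selects the applied algorithm is omitted. The names `fragment`, `permute`, and `reassemble` are hypothetical.

```python
import random


def fragment(bits: str, n_frag: int = 8) -> list[str]:
    """Split the hidden bits into n_frag pieces, each prefixed by its sequence number."""
    seq_len = max(1, (n_frag - 1).bit_length())      # 3 bits for eight fragments
    size = -(-len(bits) // n_frag)                   # ceiling division
    return [format(i, f"0{seq_len}b") + bits[i * size:(i + 1) * size]
            for i in range(n_frag)]


def permute(fragments: list[str], key: int) -> list[str]:
    """Shuffle the fragments with a keyed generator (stand-in for the permutation)."""
    shuffled = list(fragments)
    random.Random(key).shuffle(shuffled)
    return shuffled


def reassemble(fragments: list[str], n_frag: int = 8) -> str:
    """Sort the fragments by sequence number and join their payloads."""
    seq_len = max(1, (n_frag - 1).bit_length())
    ordered = sorted((int(f[:seq_len], 2), f[seq_len:]) for f in fragments)
    return "".join(payload for _, payload in ordered)
```

Because each fragment carries its own sequence number, the receiver can restore the original order without knowing the permutation that was applied.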
Figure 9. State diagram for ZKS
Each message has a different stego key regardless of its routing path, which increases the
confusion of steganalysis.
4. CONCLUSION
In this paper, we introduced a new text steganography algorithm for Arabic letters. Our algorithm
deals with connected letters by adding the Kashida character and the zero width character. The ZKS
algorithm improves on previous ones by using concepts such as parallel connection, permutation,
and randomization to complicate the steganalysis process.
REFERENCES
[1] Aelphaeis Mangarae, "Steganography FAQ," Zone-H.Org, March 18, 2006.
[2] S. Dickman, "An Overview of Steganography," July 2007.
[3] V. Potdar, E. Chang, "Visibly Invisible: Ciphertext as a Steganographic Carrier," Proceedings of the
4th International Network Conference (INC2004), pp. 385-391, Plymouth, U.K., July 6-9, 2004.
[4] M. Al-Husainy, "Image Steganography by Mapping Pixels to Letters," Science Publications, 2009.
[5] M. Shahreza, S. Shahreza, "Steganography in TeX Documents," Proceedings of Intelligent System
and Knowledge Engineering, ISKE 2008, 3rd International Conference, Nov. 2008.
[6] M. S. Shahreza, M. H. Shahreza, "An Improved Version of Persian/Arabic Text Steganography Using
'La' Word," Proceedings of IEEE 2008 6th National Conference on Telecommunication
Technologies.
[7] M. H. Shahreza, M. S. Shahreza, "A New Approach to Persian/Arabic Text Steganography,"
Proceedings of 5th IEEE/ACIS International Conference on Computer and Information Science,
2006.
[8] M. Aabed, S. Awaideh, A. Elshafei, and A. Gutub, "Arabic Diacritics Based
Steganography," Proceedings of IEEE International Conference on Signal Processing and
Communications (ICSPC 2007).
[9] W. Bender, D. Gruhl, N. Morimoto, A. Lu, "Techniques for Data Hiding," IBM
Systems Journal, Vol. 35, Nos. 3&4, 1996.
[10] K. Bennett, "Linguistic Steganography: Survey, Analysis, and Robustness Concerns for Hiding
Information in Text," Center for Education and Research in Information Assurance and Security,
Purdue University, 2004.
[11] M. Nosrati, R. Karimi, and M. Hariri, "An Introduction to Steganography Methods," World Applied
Programming, Vol. 1, No. 3, August 2011, pp. 191-195.
[12] M. H. Shirali-Shahreza, M. Shirali-Shahreza, "Text Steganography in Chat," Proceedings of 3rd
IEEE/IFIP International Conference in Central Asia, Sept. 2007.
[13] Adnan Abdul-Aziz Gutub, Wael Al-Alwani, and Abdulelah Bin Mahfoodh, "Improved Method of
Arabic Text Steganography Using the Extension 'Kashida' Character," Bahria University Journal of
Information & Communication Technology, Vol. 3, Issue 1, December 2010.
[14] Adnan Abdul-Aziz Gutub and Manal Mohammad Fattani, "A Novel Arabic Text Steganography
Method Using Letter Points and Extensions," World Academy of Science, Engineering and
Technology 27 200
[15] Hassan Shirali-Shahreza, Mohammad Shirali-Shahreza, "Steganography in Persian and
Arabic Unicode Texts Using Pseudo-Space and Pseudo Connection
Characters," Journal of Theoretical and Applied Information Technology.
Authors
Ammar Odeh is a PhD student at the University of Bridgeport. He earned the M.S. degree
in Computer Science from the King Abdullah II School for Information Technology
(KASIT) at the University of Jordan in Dec. 2005 and the B.Sc. in Computer Science
from the Hashemite University. He has worked as a Lab Supervisor at Philadelphia
University (Jordan), as a Lecturer at Philadelphia University for the ICDL courses, and as
technical support for online examinations for two years.
He served as a Lecturer at the IT (ACS, CIS, CS) Department of Philadelphia University in Jordan, and also
worked at the Ministry of Higher Education (Oman, Sur College of Applied Science) for two years. Ammar
joined the University of Bridgeport as a PhD student of Computer Science and Engineering in August 2011.
His areas of concentration are reverse software engineering, computer security, and wireless networks.
Specifically, he is working on the enhancement of computer security for data transmission over wireless
networks. He is also actively involved in the academic community, outreach activities, and student
recruiting and advising.
Dr. Elleithy is the Associate Dean for Graduate Studies in the School of Engineering at the
University of Bridgeport. His research interests are in the areas of network security,
mobile communications, and formal approaches for design and verification. He has
published more than one hundred fifty research papers in international journals and
conferences in his areas of expertise.
Dr. Elleithy is the co-chair of the International Joint Conferences on Computer, Information, and Systems
Sciences, and Engineering (CISSE). CISSE is the first Engineering/Computing and Systems Research
E-Conference in the world to be completely conducted online in real-time via the internet and has
run successfully for four years. Dr. Elleithy is the editor or co-editor of 10 books published by Springer
for advances on Innovations and Advanced Techniques in Systems, Computing Sciences and Software.
... Vowels in Arabic can be defined as animated sounds which help to determine word pronunciation, and writing them is the same as normal but their placement gives them different pronunciations. Choosing these letters supposedly helps find the positions for embedding the secret data as shown above in Fig. 2. A watermarking process based on kashida with a dotting property has been presented in [10]. This approach is achieved via inserting the kashida before or after letters containing points to indicate bit 1. ...
... These writings were gradually developed through antiquity and are used even until today. For instance, the kashida is a method to lengthen some of the letter in Arabic text and connect it to other letters on the right side (Arabic script is written from right to the left), while some letters have accepted the insertion of the kashida after it [9][10][11][12][13][14]. The Quran was written in a distinct Arabic language and has unique characteristics [11][12][13][14][15]. ...
... In addition to diacritics, the Quranic text includes "tajweed" symbols-symbols used in the recitation rules that direct readers to correctly recite the Quran. These tajweed symbols are not used in Arabic literature and are, therefore, unique to the Holy Quran [10][11][12][13][14][15][16]. ...
Article
Full-text available
The most sensitive Arabic text available online is the digital Holy Quran. This sacred Islamic religious book is recited by all Muslims worldwide including non-Arabs as part of their worship needs. Thus, it should be protected from any kind of tampering to keep its invaluable meaning intact. Different characteristics of Arabic letters like the vowels (), Kashida (extended letters), and other symbols in the Holy Quran must be secured from alterations. The cover text of the Quran and its watermarked text are different due to the low values of the Peak Signal to Noise Ratio (PSNR) and Embedding Ratio (ER). A watermarking technique with enhanced attributes must, therefore, be designed for the Quran’s text using Arabic vowels with kashida. The gap addressed by this paper is to improve the security of Arabic text in the Holy Quran by using vowels with kashida. The purpose of this paper is to enhance the Quran text watermarking scheme based on a reversing technique. The methodology consists of four phases: The first phase is a pre-processing followed by the second phase-the embedding process phase—which will hide the data after the vowels. That is, if the secret bit is “1”, then the kashida is inserted; however, the kashida is not inserted if the bit is “0”. The third phase is the extraction process and the last phase is to evaluate the performance of the proposed scheme by using PSNR (for the imperceptibility) and ER (for the capacity). The experimental results show that the proposed method of imperceptibility insertion is also optimized with the help of a reversing algorithm. The proposed strategy obtains a 90.5% capacity. Furthermore, the proposed algorithm attained 66.1% which is referred to as imperceptibility.
... This section explain the process cover and secret message in one level, as depicted in Figure Figure (5)(6)(7)(8)(9)(10)(11)(12)(13)(14)(15)(16)(17)(18)(19): The process cove and secret message using single double quotation 7 th . ...
... cover without stego Table ( [6][7][8][9][10][11][12][13][14][15][16][17]: Similarity between cover and stego cover single quotation in level two 6 th . cover without stego Table ( [6][7][8][9][10][11][12][13][14][15][16][17][18]: Similarity between cover and stego cover double quotation in level two 6 th . cover without stego ...
... This section discusses cases to ensure the proposed technique of embedding level one using kashida, is as depicted in Figure (6-29).  The Jaro-Winkler method measures distance, in the similarity between two strings, as depicted in Table (6-20), Table (6-21), Table(6-22), and Table If the word is " " ‫ال‬ ‫سرد‬ stego in level one, dj= 1/3(5/7+5/7+5-2/7) = 0.6190 Table ( [6][7][8][9][10][11][12][13][14][15][16][17][18][19][20]: Similarity between cover and stego cover kashida in level one 7 th . cover without stego Table ( [6][7][8][9][10][11][12][13][14][15][16][17][18][19][20][21]: Similarity between cover and stego cover single quotation in level one 7 th . ...
... However, this method uses a pattern-based technique for the embedding process, which might lead to secret predictability, while [34] used only the dots. The previous algorithms were modified by adding bits through the connection of the characters and their extension to hide one bit, as well as using a zero-width character to embed two bits per cursive [35]. ...
... Approach Used Imperceptibility Capacity Robustness Security [17] Hybrid high high high high [29] Kashida unknown high unknown unknown [32] Diacritics unknown high unknown unknown [54] Kashida unknown high unknown unknown [30] Hybrid high unknown unknown unknown [31] Kashida unknown high unknown high [33] Kashida unknown high unknown unknown [34] Kashida unknown high unknown unknown [35] Kashida and Unicode unknown unknown unknown unknown [36] Kashida unknown unknown unknown unknown [37] shape unknown high unknown unknown [55] Kashida unknown high unknown unknown [38] Unicode unknown high unknown unknown [39] Unicode unknown high unknown unknown [40] Unicode unknown high unknown unknown [41] Unicode unknown high unknown unknown [42] Unicode unknown unknown unknown unknown [43] Unicode high high unknown [44] Diacritics unknown high unknown unknown [45] Diacritics unknown high unknown unknown [56] Diacritics unknown high unknown unknown [46] Diacritics unknown high unknown unknown [47] Diacritics unknown high unknown unknown [48] Diacritics Low high unknown unknown [4] Diacritics unknown high high high our approach Diacritics and Hamzas high high high high ...
Article
Full-text available
Steganography is a widely used technique for concealing confidential data within images, videos, and audio. However, using text for steganography has not been sufficiently explored. Text-based steganography has the advantage of a low bandwidth overhead, making it a promising alternative for protecting sensitive information. Among languages, Arabic is known for its linguistic richness, making it ideal for text-based steganography. This paper proposes a robust, dynamic, and multi-layered steganography approach that uses text, encryption algorithms, and images. This approach utilizes Arabic diacritic features to hide limited-size and highly classified information. The algorithm uses several scenarios and is extensively tested to ensure the required level of security and user performance. The experimental results on actual data demonstrate the robustness of the proposed algorithm, with no noticeable impact on the carrier message (original text). Furthermore, no known potential attack can break the proposed algorithm, making it a promising solution for text-based steganography.
... Several methods have been proposed, each depending on the maximum number of Kashida that may be added per word: one, two or even three. Meanwhile, features of Zero Width and Kashida Letters (ZKS) were introduced to enlarge capacity [56]. Both characters do not change the word meaning when they are connected to other letters. ...
... In [70], the Kashida stego-cover technique was modified by merging with a new way of embedding sensitive data within white spaces. Besides, some studies conducted by [56] were extended by [71], whereby Kashida is combined with ZWJ Unicode. 4.2.1.6. ...
Article
Full-text available
Despite the rapid growth of studies on Arabic text steganography (ATS) in recent years, systematic, in-depth, and critical reviews remain scarce, owing to high overlap or low segregation among the existing review articles in this research area. The objective of this paper is therefore to present an extensive systematic literature review (SLR) of the techniques and algorithms used to analyse ATS. Data were retrieved from three primary databases: ScienceDirect, IEEE Xplore Digital Library, and Scopus. In total, 214 publications from the past 5 years on methods of analysing ATS were identified. A comprehensive SLR was carried out to detect a range of unique characteristics of the algorithms, which led to a new structure of ATS categories. Notably, hybrid methods combining ATS with other sub-disciplines, especially cryptography, were identified, opening a new branch for enhancing ATS security. Other relevant findings include the key criteria used to evaluate the performance of the algorithms (i.e., capacity, invisibility, robustness, and security). Capacity is the performance measure in 87% of the reviewed articles. This reveals considerable potential for the remaining criteria (i.e., invisibility, robustness, and security) to set a benchmark for future research.
... This was achieved by unifying all the data, leading to a homogeneous file. The study in [12] presents an approach that uses eight different Arabic diacritical symbols to hide binary bits in the original cover medium. The embedded information is extracted by reading the diacritics from the text and converting them back to a binary representation. ...
... The usability ratio measures how many of the characters in a cover text can be used to embed the hidden message [12], [20]. This analysis is very important for a steganographer who needs to understand the capacity of the cover text to hide a message. ...
Article
Full-text available
The enormous growth in the use of the Internet has driven continuous improvement in the field of security. Enhanced security embedding techniques are applied to protect intellectual property. There are numerous types of security mechanisms. Steganography is the art and science of concealing secret information inside a cover medium such as image, audio, video or text without arousing the suspicion of an eavesdropper. Text is well suited to steganography because of its ubiquity. Many steganographic embedding techniques use the Arabic language to embed a hidden message in the cover text. Kashida, Shifting Point and Sharp-edges are the three Arabic steganographic embedding techniques with high capacity. However, each of these three techniques performs poorly when embedding the hidden message into the cover text. This paper presents the Traid-bit method, which integrates these three Arabic text steganography embedding techniques. It is an effective way to evaluate many embedding techniques at the same time, and it offers one solution for various cases.
... To increase the data hiding capacity, Ahmadoh and Gutub (2015) improved the work of Aabed et al. (2007) and proposed a new data hiding approach that uses two diacritics, 'Fateh' and 'Kasrah', on the basis of the even and odd bits of the secret message. Considerable effort has gone into hiding secret data using more than one Arabic diacritic, such as Fatha, reverse Fatha, or diacritic shift (vertically upward) (Odeh and Elleithy, 2012). However, these methods have recently faced different challenges. ...
Article
Full-text available
Recently, information security has become a very important topic for researchers as well as military and government officials. For secure communication, it is necessary to develop novel ways to hide information. For this purpose, steganography is usually used to send secret information to its destination using different techniques. In this article, our main focus is on text-based steganography. Hidden information in text files is difficult to discover because text data has low redundancy in comparison with other steganographic media. Hence, our proposed algorithm uses Arabic text to hide secret information through a combination of the Unicode zero-width character, the zero-width joiner, and a pseudo-space. The experimental results show that the hidden data capacity per word is significantly increased in comparison with recently proposed algorithms. The major advantage of our proposed algorithm over previous research is the high visual similarity between the cover text and the stego-text, which can reduce the attention of intruders.
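The abstract above does not state how bits are mapped to the invisible characters, so the following is only a minimal sketch of the general idea under an assumed mapping: one bit per word, with a zero-width non-joiner (U+200C) appended for 0 and a zero-width joiner (U+200D) appended for 1. The names `hide` and `reveal` and the one-bit-per-word rule are hypothetical and are not taken from the cited article; the pseudo-space is left out.

```python
# Assumed mapping (not from the cited article): one invisible mark per word.
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER, stands for bit 0 in this sketch
ZWJ = "\u200d"   # ZERO WIDTH JOINER, stands for bit 1 in this sketch

BIT_TO_MARK = {"0": ZWNJ, "1": ZWJ}
MARK_TO_BIT = {mark: bit for bit, mark in BIT_TO_MARK.items()}


def hide(cover: str, bits: str) -> str:
    """Append one invisible mark to each of the first len(bits) words."""
    words = cover.split(" ")
    if len(bits) > len(words):
        raise ValueError("cover text has too few words for the message")
    marked = [word + BIT_TO_MARK[bit] for word, bit in zip(words, bits)]
    return " ".join(marked + words[len(bits):])


def reveal(stego: str) -> str:
    """Read the marks back in order and rebuild the bit string."""
    return "".join(MARK_TO_BIT[ch] for ch in stego if ch in MARK_TO_BIT)


def strip_marks(stego: str) -> str:
    """Remove the marks to recover the original cover text."""
    return "".join(ch for ch in stego if ch not in MARK_TO_BIT)
```

Under this mapping the capacity is one bit per word. Rendering should be checked on the target platform, since a joiner can alter the shape of an adjacent Arabic letter.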
Article
Full-text available
With the rapid development of the Internet, safe covert communication in network environments has become an essential research direction. Steganography is an important means of embedding secret information imperceptibly into cover data for transmission, so that others cannot easily become aware of it. Text steganography offers low redundancy and is bound by natural-language rules, which limits how text can be manipulated; both concealing a message in text properly and detecting such concealment are therefore great challenges. This paper presents a novel algorithm that hides a large amount of text in a cover text without affecting the cover format by using many types of pointers (characters that can be interpreted either as invisible characters or as part of the cover). Pointers are used either singly or as a set of pointers that represents a new single pointer. The suggested algorithm can hide more than 40% of the cover size, which is more than four times the capacity of the best known method for hiding text in text.
Article
Full-text available
This paper presents a new steganography approach suitable for Arabic texts. It can be classified under steganography feature coding methods. The approach hides secret information bits within the letters, taking advantage of their inherent points. To mark the specific letters holding secret bits, the scheme considers two features: the existence of points in the letters and the redundant Arabic extension character. We use pointed letters with an extension to hold the secret bit 'one' and un-pointed letters with an extension to hold 'zero'. This steganography technique is also attractive for other languages whose scripts are similar to Arabic, such as Persian and Urdu.
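The rule in this abstract (a pointed letter followed by the extension character carries 'one', an un-pointed letter followed by it carries 'zero') can be sketched as below. This is an illustrative reading, not the authors' implementation: the letter sets, the joining check, and the names `embed` and `extract` are assumptions, and the cover text is assumed to contain no Kashida and no diacritics beforehand.

```python
KASHIDA = "\u0640"  # ARABIC TATWEEL, the extension character

# Assumed letter sets: only letters that connect to a following letter.
POINTED = set("بتثجخشضظغفقني")
UNPOINTED = set("حسصطعكلمه")
# Letters that join only to the preceding letter; they may follow a Kashida.
RIGHT_JOINING = set("اأإآدذرزوؤةى")
JOINABLE_NEXT = POINTED | UNPOINTED | RIGHT_JOINING


def embed(cover: str, bits: str) -> str:
    """Insert a Kashida after a letter whose pointedness matches the next bit."""
    if KASHIDA in cover:
        raise ValueError("cover text must not already contain Kashida")
    out = []
    i = 0  # index of the next secret bit to place
    for pos, ch in enumerate(cover):
        out.append(ch)
        if i >= len(bits):
            continue
        nxt = cover[pos + 1] if pos + 1 < len(cover) else ""
        if nxt not in JOINABLE_NEXT:
            continue  # an extension is only possible between joined letters
        if (bits[i] == "1" and ch in POINTED) or (bits[i] == "0" and ch in UNPOINTED):
            out.append(KASHIDA)
            i += 1
    if i < len(bits):
        raise ValueError("cover text is too short for the message")
    return "".join(out)


def extract(stego: str) -> str:
    """Read one bit from the letter that precedes each Kashida."""
    bits = []
    for prev, ch in zip(stego, stego[1:]):
        if ch == KASHIDA:
            if prev in POINTED:
                bits.append("1")
            elif prev in UNPOINTED:
                bits.append("0")
    return "".join(bits)
```

Letters whose pointedness does not match the current bit are left without an extension, so the number of bits a cover can hold depends on both the text and the message.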
Article
Full-text available
In this paper, we introduce different types of steganography according to the cover data. First, we discuss text steganography and investigate its details. Then, image steganography and its techniques are investigated, including Least Significant Bits, Masking and Filtering, and Transformations. Finally, audio steganography is described, including the LSB Coding, Phase Coding, Spread Spectrum and Echo Hiding techniques.
Conference Paper
Full-text available
Conveying information secretly and establishing hidden relationships have long been of interest. Text documents have been widely used for a very long time, and different methods of hiding information in texts (text steganography) have therefore appeared from the past to the present. In this paper we introduce a new approach to steganography in Persian and Arabic texts. Because Persian and Arabic phrases contain a great many points, this approach hides information in the text by displacing the points vertically. The approach can be categorized under feature coding methods. It can also be used for Persian/Arabic watermarking. Our method has been implemented in the Java programming language.
Article
Full-text available
The goal of steganography is to avoid drawing suspicion to the transmission of a hidden message. If suspicion is raised, this goal is defeated. The success of steganography depends, to a certain extent, on the secrecy of the cover medium. Once the steganographic carrier is disclosed, security depends on the robustness of the algorithm used. Hence, to maintain secrecy we must either make the cover medium more robust against steganalysis or discover new and better cover media. We consider the latter approach much more effective, since old techniques become prone to steganalysis. In this paper, we present one such cover medium: we propose to use ciphertext as a steganographic carrier.
Article
Full-text available
Data hiding, a form of steganography, embeds data into digital media for the purpose of identification, annotation, and copyright. Several constraints affect this process: the quantity of data to be hidden, the need for invariance of these data under conditions where a "host" signal is subject to distortions, e.g., lossy compression, and the degree to which the data must be immune to interception, modification, or removal by a third party. We explore both traditional and novel techniques for addressing the data-hiding process and evaluate these techniques in light of three applications: copyright protection, tamper-proofing, and augmentation data embedding.
Article
Steganography is a useful tool that allows covert transmission of information over an overt communications channel. Combining covert channel exploitation with the encryption methods of substitution ciphers and/or one-time pad cryptography, steganography enables the user to transmit information masked inside a file in plain view. The hidden data is both difficult to detect and, when combined with known encryption algorithms, equally difficult to decipher. This paper provides a general overview of the following subject areas: historical cases and examples of steganography; how steganography works; what steganography software is commercially available and what data types are supported; what methods and automated tools are available to aid computer forensic investigators and information security professionals in detecting the use of steganography; whether, after detection has occurred, the embedded message can be reliably extracted; whether the embedded data can be separated from the carrier to reveal the original file; and finally, what methods can defeat the use of steganography even if it cannot be reliably detected.
Conference Paper
As communication expands, there is in some cases a need for hidden communication. Steganography is one of the methods used for the hidden exchange of information: it hides information under a cover medium such as an image or text. One of the text steganography methods for Persian and Arabic texts is the "La" steganography method, but that method increases the file size and changes the appearance of the text. In this paper a method for solving these problems is proposed. In Persian and Arabic, each letter can have four different shapes depending on its position in the word. By using this feature of the Persian and Arabic languages and the way documents are saved in the Unicode Standard, the proposed method solves the above problems.
Conference Paper
Steganography is a method for the hidden exchange of information by hiding data in a cover medium such as an image or sound. Text steganography is one of the most difficult methods because a text file is not a suitable medium for hiding data. In this paper we propose a new text steganography method that hides data in TeX documents, in places where there is a ligature such as "fi".
Conference Paper
The invention of the Internet and its spread around the world have changed various aspects of human life, including human relations. Chat is one of the new practices that emerged after the Internet and has been welcomed by users, especially young people. In chat rooms, people talk with each other using text messages. Because words must be typed quickly and a large number of sentences is exchanged between users, new abbreviations have been invented for various words and phrases in chat rooms. This new language is known as SMS-texting. At the same time, concern for the safety and security of information, and especially of secret relationships, has led to the introduction of numerous methods for secret communication. Among these methods, steganography is a rather new one. The present paper offers a new method for the secret exchange of information through chat by using and developing abbreviation text steganography with the SMS-texting language. The method has been implemented in the Java programming language.