Use of HOG Descriptors in Phishing Detection
Ahmet Selman Bozkir
Hacettepe University Dept. of Computer Engineering
Ankara, Turkey
selman@cs.hacettepe.edu.tr
Ebru Akcapinar Sezer
Hacettepe University Dept. of Computer Engineering
Ankara, Turkey
ebru@hacettepe.edu.tr
Abstract— Phishing is a scamming activity that creates a visual illusion for computer users by providing fake web pages which mimic their legitimate targets in order to steal valuable digital data such as credit card information or e-mail passwords. In contrast to other anti-phishing attempts, this paper proposes to evaluate and solve this problem with a purely computer vision based method built on the concept of web page layout similarity. The proposed approach employs the Histogram of Oriented Gradients (HOG) descriptor to capture cues of page layout without the need for a time-consuming intermediate segmentation stage. Moreover, the histogram intersection kernel is used as the similarity metric. Thus, an efficient and fast phishing page detection scheme is developed to combat zero-day phishing attacks. To verify the efficiency of our phishing page detection mechanism, 50 unique phishing pages and their legitimate targets were collected; furthermore, 100 pairs of legitimate pages were gathered. The similarity scores within these two groups were then computed and compared. The promising results indicate that a similarity degree of around 75% and above is adequate for raising an alarm.
Keywords— Phishing; anti-phishing; computer vision; HOG; page layout
I. INTRODUCTION
With the advent of e-commerce and online payment systems, traditional banking transactions have evolved into online banking operations. This progress has not only eased daily life but also made private personal data (e.g. credit card information, online governmental credentials) an attractive target for scammers. Thus, a web based attack called 'phishing' emerged. As for the term itself, [1] reports that it derives from the concept of 'fishing' for a target. In essence, phishing is a scamming activity that creates a visual illusion for computer users by providing fake web pages which mimic their legitimate counterparts in order to steal valuable digital data such as usernames or e-mail passwords.
Even though various attempts exist to protect internet users from phishing attacks, the amount of financial loss and the number of new phishing web sites are still rising. According to the quarterly reports of the Anti-Phishing Working Group (APWG) [2], 123,741 unique phishing web pages were reported in the first half of 2014, while the second half of 2013 saw 115,565 phishing cases. According to [2], the average lifetime of a phishing site was around 32 hours, with a median of 8 hours and 42 minutes. This implies that a typical phishing web site lives for less than one day. Therefore, the need for zero-day phishing detection mechanisms has emerged in recent years, since the well-known blacklist approaches have proved incapable of combating these state-of-the-art attacks.
Phishing detection works can be classified in more than one way. In general, however, the studies in the anti-phishing literature fall into four groups: (i) real-time blacklist, whitelist, e-mail filtering and related reactive solutions; (ii) content based works; (iii) page structure similarity based attempts; and (iv) computer vision based studies.
Blacklist approaches rely on gathering phishing web site URLs from various sources. In general, whenever a user visits a web page, its URL is queried against the generated blacklist corpus and the visit is allowed or blocked according to the result. For instance, Google Safe Browsing for Firefox [3], a blacklist based Firefox browser extension, warns users whenever sensitive financial information is requested by a suspicious page. However, due to the fast take-down cycles stated previously, blacklist based solutions have lost their effectiveness. On the other hand, as stated in [4], whitelist approaches are based on building a feature library from those legitimate web pages which are likely to be imitated by phishers. Zhang et al. in [4] pointed out the limitation of this technique: "as the white list approach is based on similarity search instead of exact matching, its detection speed is greatly affected by the feature library size and searching strategy".
Anti-phishing studies based on page structure similarity seek similarities among structural features of web pages such as DOM trees and the style and size of HTML elements in corresponding blocks. Liu et al. in [5] proposed an anti-phishing system that computes block-level, layout and style similarities by considering the size and style information embedded in HTML elements. Similarly, Medvet et al. in [6] suggested an approach based on page features extracted from the DOM tree representation, such as the style of text pieces (e.g. font color, font size) and the 2D Haar wavelet transform of images.
Although structure based studies achieve good results, they suffer from three main shortcomings. First, different DOM organizations can be rendered in the same way, which makes these types of attempts vulnerable to attackers. Second, DOM trees are not a fully reliable source of information, since scammers are able to create dynamically loading phishing web pages by utilizing client side programming techniques. Third, as of 2016, instead of inline coding, the style information of page elements is usually stored in CSS (Cascading Style Sheets) files, which forces a DOM based anti-phishing mechanism into complex and time consuming deeper analyses. These facts reduce the effectiveness of DOM based solutions.
In recent years, there has been a growing trend toward using computer vision techniques in phishing detection, for several reasons. First of all, since a web page is itself a kind of visual stimulus, computer vision techniques are well suited to analyzing and evaluating visual similarity via appropriate features. Second, pure vision based approaches are classified as proactive solutions that are robust to zero-day attacks. On the other hand, as the main trick of phishing is mimicking legitimate web sites, scammers have started to create polymorphic web pages to breach the defenses of anti-phishing mechanisms. To evade phishing detection systems, attackers apply different representation techniques to create visually similar web pages, a practice called phishing page polymorphism [7]. As one or several portions of a fake web page can be composed of images or other types of interactive content, polymorphic pages cannot be detected by traditional DOM based methods. Furthermore, the aforementioned drawbacks and limitations of structure based methods strengthen the case for computer vision methods in phishing detection.
In the literature, there exists a limited but growing number of studies on phishing detection via image processing or computer vision methods. For instance, Maurer and Herzner in [8] employed texture and color histograms gathered from phishing and legitimate web pages in order to detect phishing. As another work, Lam et al. in [7] proposed a layout-aware similarity metric for phishing detection. They first segmented the screenshots of web pages to reveal page blocks and then measured block pair matches by considering properties such as size, location and symmetry. Although it achieves good detection rates, Lam et al.'s approach struggles to segment web pages having complex background textures. On the other hand, in [9] Fu et al. applied the Earth Mover's Distance (EMD) algorithm in an image-based anti-phishing framework in order to compute the similarity between legitimate and fake web pages. However, as stated by Lam et al. in [7], the EMD based approach has some shortcomings: (1) all the web pages must have the same aspect ratio; (2) EMD can cause false alarms when the color distributions of two legitimate pages are similar. As another study, Wang et al. in [10] presented a phishing detection strategy based on capturing logo similarity via scale and rotation invariant SIFT [11] features. While their approach differs from others by focusing only on logos, it leaves open some important issues related to the diversity and typography of logos. The approach closest to our study was proposed in Rao and Ali's paper [1]. They employed the SURF detector to extract visual features from suspicious and genuine web pages and then measured the overall visual similarity using these scale and rotation invariant features. Nevertheless, while their work obtains scale and rotational invariance, it lacks support for partial similarity.
In this paper, we aim to detect phishing web pages considering the following aspects:
a. A well designed phishing page must mimic the legitimate one. Therefore, the layout of the phishing page must be similar or identical to its target in order to deceive even expert users. The importance and role of layout in visual perception have been emphasized in [12].
b. Smart attackers who are aware of state-of-the-art anti-phishing solutions consciously make modifications (e.g. adding or removing minor content) to the legitimate pages while developing their phishing pages. Therefore, an efficient anti-phishing solution must compute partial similarities as well as overall similarity.
c. Considering complex backgrounds, the segmentation stage should be eliminated in order to build a robust, efficient and effective anti-phishing mechanism.
d. In contrast to approaches such as EMD, an efficient phishing detection solution should generate fast and easy-to-compute page signatures that can be employed for real-time corpus querying at later stages.
With these design considerations in mind, we propose a phishing detection scheme which employs Histogram of Oriented Gradients (HOG) descriptors to capture the visual cues of page layout. By use of HOG, efficient and effective layout signatures belonging to suspicious and legitimate web pages were generated and compared. The main motivation of this study is to verify whether HOG descriptors are suitable for phishing detection. According to the promising results of the conducted experiments, a similarity value of 75% appears to be a good threshold for a phishing alarm.
The remainder of this paper is organized as follows. Section 2 gives an overview of the HOG descriptor and its features. Section 3 describes the proposed approach and its application environment. Section 4 reports the results of the conducted experiments, and Section 5 concludes the paper.
II. METHODOLOGY
Introduced by Dalal and Triggs [13], the Histogram of Oriented Gradients is a powerful computer vision method for characterizing and capturing local object appearance or shape by means of the distribution of intensity gradients or edge directions. In essence, HOG descriptors are designed to represent and reveal the orientations within a local patch of an image. Since their introduction, HOG descriptors have been employed in various fields such as moving object classification [14] and shape representation [15]. HOG descriptors were preferred in this study for the following reasons: (i) they are able to capture visual cues of the overall page layout; (ii) they provide a certain degree of rotation and translation invariance.
Extracting HOG descriptors requires three main steps: (i) gradient computation, (ii) orientation binning and (iii) block normalization. In the first stage, a grid of equal sized cells is obtained by dividing the image. In the second stage, the gradient vector of each pixel is converted to an angle, and orientation bins are built according to angle ranges. Next, a normalization stage operating on groups of cells (blocks) is carried out in order to compensate for illumination variations and obtain more robust results. In the final stage, the normalized histograms are concatenated to form the final descriptor.
During the gradient computation, different kinds of derivative masks (e.g. Sobel or 2×2 diagonal masks) can be employed. In [13], for the particular case of human detection, it was discovered that the simplest $[-1, 0, 1]$ and $[-1, 0, 1]^T$ kernels give the best classification results. Given an image $I$, the gradients along the two axes are first computed by applying these kernels: $G_x = I * [-1, 0, 1]$ and $G_y = I * [-1, 0, 1]^T$. Due to the limited number of pages, HOG descriptors are not comprehensively explained in this paper; for further reading, Dalal and Triggs' paper [13] can be consulted.
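As an illustration of this step, the following minimal OpenCV sketch (our reconstruction, not the authors' code; the input file name is hypothetical) computes the gradient images with the simple kernels above:

```cpp
// Gradient computation with the [-1, 0, 1] kernels recommended in [13].
#include <opencv2/opencv.hpp>

int main() {
    // Hypothetical input: a grayscale page screenshot.
    cv::Mat img = cv::imread("page.jpg", cv::IMREAD_GRAYSCALE);
    img.convertTo(img, CV_32F); // float image keeps signed gradient values

    cv::Mat dx = (cv::Mat_<float>(1, 3) << -1.f, 0.f, 1.f); // D_x = [-1 0 1]
    cv::Mat dy = dx.t();                                    // D_y = D_x^T

    cv::Mat gx, gy, mag, angle;
    cv::filter2D(img, gx, CV_32F, dx); // G_x = I * D_x
    cv::filter2D(img, gy, CV_32F, dy); // G_y = I * D_y

    // Per-pixel magnitude and orientation, later used for orientation binning.
    cv::cartToPolar(gx, gy, mag, angle, /*angleInDegrees=*/true);
    return 0;
}
```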
After the concatenated feature vector has been built, the similarity of two pages is calculated via the histogram intersection kernel expressed in equation (1), where $h_1$ and $h_2$ are the two descriptors and $n$ is their length:

$$S(h_1, h_2) = \sum_{i=1}^{n} \min\big(h_1(i), h_2(i)\big) \qquad (1)$$

The application of HOG descriptors and the histogram intersection kernel is depicted in Fig. 1.
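The following function is a minimal sketch of equation (1); we assume here that the two descriptors are normalized so that the result can be read directly as a similarity degree in [0, 1]:

```cpp
// Histogram intersection kernel of equation (1).
#include <algorithm>
#include <cstddef>
#include <vector>

double histogramIntersection(const std::vector<float>& h1,
                             const std::vector<float>& h2) {
    const std::size_t n = std::min(h1.size(), h2.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        sum += std::min(h1[i], h2[i]); // bin-wise overlap of the histograms
    return sum; // in [0, 1] for normalized descriptors (our assumption)
}
```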
III. PROPOSED APPROACH
The main trick of phishing attacks is to deceive users by creating visually similar fake pages that seem identical to their legitimate counterparts. In this way, even expert users can be deceived, since the visual appearance of the fake page and its target cannot easily be differentiated. Our proposed approach is primarily designed to detect this type of zero-day phishing attack by using HOG descriptors to capture layout cues.
Our system consists of two modules. The first module, called "Wrapper", was designed and implemented to find the effective page boundaries and take a screenshot of the web page. After the target ROI (Region of Interest) is revealed, the effective portion is cropped and prepared as input for the next module. The "Wrapper" module was coded in C# using the Mozilla GeckoFX API [16].

The second module, called "Hogger", was implemented to take a JPEG file and output a concatenated HOG feature vector. The "Hogger" module was coded in native C++ for high performance. According to the given parameters, "Hogger" outputs a concatenated feature vector of varying size.
A. Identifying Region of Interest
Web pages are currently designed primarily for wide screens. However, due to backwards compatibility concerns, most web pages still conform to a 1024 pixel wide screen resolution. On the other hand, a web page may be wider than 1024 pixels, and there is no limit on its absolute height. This makes it essential to determine an effective and discriminative region of interest on web pages. Bozkir and Akcapinar Sezer in [12] have pointed out that the most significant and discriminative visual information in web pages is found in the topmost 1024 pixels, which are visible without scrolling. Therefore, it was decided to crop and use the topmost 1024 pixels. For simplicity of computation, and following the convention addressed above, the width of the ROI was also set to 1024 pixels.
To take a proper screenshot, we employed the Mozilla GeckoFX .NET browser API. The "Wrapper" window was precisely set to take 1024 pixel wide screenshots, and the portion below the topmost 1024 pixels was cropped away. For cases where the height of the web page is less than 1024 pixels, we applied a dominant color detection method to fill the empty lower part and obtain a full square input image. In this way, input images were generated in accordance with the dominant color of the web page. Finally, the output image was converted to grayscale in order to increase the gradient computation accuracy.
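The following OpenCV sketch illustrates this preparation step under our assumptions: the screenshot is already 1024 pixels wide and in BGR format, and the mean page color stands in for the paper's unspecified dominant color detection method.

```cpp
// ROI preparation: crop/pad to a 1024x1024 square and convert to grayscale.
#include <opencv2/opencv.hpp>

cv::Mat prepareRoi(const cv::Mat& screenshot) { // assumed 1024 px wide, BGR
    const int side = 1024;
    cv::Mat square;
    if (screenshot.rows >= side) {
        // Keep only the topmost, non-scrolling 1024x1024 region.
        square = screenshot(cv::Rect(0, 0, side, side)).clone();
    } else {
        // Short page: fill the empty lower part with an approximated
        // dominant color (here, simply the mean color of the page).
        square = cv::Mat(side, side, screenshot.type(), cv::mean(screenshot));
        screenshot.copyTo(square(cv::Rect(0, 0, side, screenshot.rows)));
    }
    cv::Mat gray;
    cv::cvtColor(square, gray, cv::COLOR_BGR2GRAY); // grayscale HOG input
    return gray;
}
```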
B. Revealing the Cues of Page Layout via HOG Descriptors
As stated before, the main idea behind this approach is to represent the page layout with the distribution of oriented gradients over a grid of equal sized cells. To determine an appropriate cell size, we applied two different grid configurations. In the first configuration (HOG-128), the 1024×1024 pixel input image was divided into a grid of 8×8 cells with a side length of 128 pixels. In the second configuration (HOG-64), the side length of the square cells was reduced to 64 pixels, yielding a 16×16 grid. These two grid configurations were used to examine different levels of detail; the translational and rotational invariance properties of HOG were also examined for both configurations.

Fig. 1. HOG feature generation and similarity computation
For block normalization, the L2-norm scheme, $v \rightarrow v / \sqrt{\|v\|_2^2 + \epsilon^2}$, was selected. For general use, "Hogger" accepts parameters such as cell size, block size and bin number. The native HOG implementation was adopted from the open source OpenCV [17] project. According to our measurements, HOG feature extraction takes less than one second on a computer with an Intel® Core™ i5-2430M processor and 4 GB of RAM.
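A sketch of the two configurations using OpenCV's HOGDescriptor is given below. The paper fixes only the 1024×1024 input and the 64/128 pixel cell sizes; the 2×2-cell blocks, one-cell stride and 9 orientation bins are our assumptions (note also that OpenCV's implementation normalizes blocks with L2-Hys rather than plain L2).

```cpp
// HOG extraction for the HOG-64 (cellSide = 64) and HOG-128 (cellSide = 128)
// configurations over a 1024x1024 grayscale ROI.
#include <opencv2/opencv.hpp>
#include <vector>

std::vector<float> computeHog(const cv::Mat& gray1024, int cellSide) {
    cv::HOGDescriptor hog(
        cv::Size(1024, 1024),                 // window: the whole ROI
        cv::Size(2 * cellSide, 2 * cellSide), // block: 2x2 cells (assumed)
        cv::Size(cellSide, cellSide),         // block stride: one cell (assumed)
        cv::Size(cellSide, cellSide),         // cell: 64 or 128 px
        9);                                   // orientation bins (assumed)
    std::vector<float> descriptor;
    hog.compute(gray1024, descriptor);        // concatenated feature vector
    return descriptor;
}
```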
C. Use Case Scenario
As mentioned before, the proposed system is designed to detect zero-day phishing attacks. To achieve this goal, we first collect the URLs of legitimate pages LPi which carry a potential phishing risk, and the layout signature of each LPi is stored in a legitimate corpus database along with its root domain. Once all the pages that require phishing protection have been loaded into the central corpus, a suspicious page SPj can be checked against the corpus to determine whether it has a highly similar legitimate target. During the verification process, the Histogram Intersection Kernel (HIK) is employed as the similarity metric. If a corresponding legitimate page is found, the root domains of LPi and SPj are compared; if the root domains differ, the system notifies the user of a phishing page. The proposed system is depicted in Fig. 2.
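A hypothetical sketch of this verification loop follows; it reuses the histogramIntersection function sketched in Section II, and the corpus layout (root domain mapped to layout signature) is our assumption:

```cpp
// Check a suspicious page signature against the legitimate corpus.
#include <map>
#include <string>
#include <vector>

// From the Section II sketch.
double histogramIntersection(const std::vector<float>& h1,
                             const std::vector<float>& h2);

struct Verdict { bool phishing; std::string matchedDomain; };

Verdict checkSuspiciousPage(const std::vector<float>& spSignature,
                            const std::string& spRootDomain,
                            const std::map<std::string,
                                           std::vector<float>>& corpus) {
    for (const auto& [domain, lpSignature] : corpus) {
        double sim = histogramIntersection(spSignature, lpSignature); // eq. (1)
        // Highly similar layout served from a different root domain:
        // raise a phishing alarm (0.75 threshold from the experiments).
        if (sim >= 0.75 && domain != spRootDomain)
            return {true, domain};
    }
    return {false, ""};
}
```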
IV. EXPERIMENT AND RESULTS
A. Experiments
To verify whether the HOG method is a suitable feature extraction method for phishing detection, we conducted an experiment using two test data sets. To establish the first test set, we collected 50 unique phishing pages reported on Phishtank [18] between 14 December 2015 and 5 January 2016. The adjective 'unique' here means that the pages have unique visual appearances. As expected, most of the gathered phishing pages target e-commerce, online payment and banking web sites. We also gathered the legitimate targets of these pages. Since the take-down cycles of phishing pages are very short, we stored the phishing pages in a local folder; the web page pairs were saved in HTML format using a freeware utility.

For the second test set, we collected 18 legitimate home pages from the Alexa [19] top 500 web site directory. We then shuffled the page URLs to obtain 100 distinct legitimate home page pairs.
To assess whether HOG descriptors are applicable features for phishing detection, we followed this procedure:
1. For test set 1, we computed the similarity scores of the pairs with the HOG-64 and HOG-128 configurations. The results are depicted in Fig. 3 and the related statistics are listed in Table I.
2. For test set 2, we computed the similarity scores of the unique legitimate page pairs with the HOG-64 and HOG-128 configurations. The results are depicted in Fig. 4 and the related statistics are listed in Table II.
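For completeness, a small sketch of how the per-set statistics reported in Tables I and II (min, max, mean, standard deviation) can be computed from the pair scores; whether the paper used the population or sample standard deviation is not stated, so the population form is assumed here.

```cpp
// Summary statistics over a set of similarity scores (as in Tables I-II).
#include <algorithm>
#include <cmath>
#include <numeric>
#include <vector>

struct Stats { double min, max, mean, stddev; };

Stats summarize(const std::vector<double>& scores) {
    const auto [mn, mx] = std::minmax_element(scores.begin(), scores.end());
    const double mean =
        std::accumulate(scores.begin(), scores.end(), 0.0) / scores.size();
    double var = 0.0;
    for (double s : scores) var += (s - mean) * (s - mean);
    return {*mn, *mx, mean, std::sqrt(var / scores.size())}; // population stddev
}
```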
TABLE I. STATISTICS OF PHISHING PAIRS IN HOG-64 AND HOG-128

Similarity of pairs of phishing pages (50 pages)

Statistics           HOG-64 px cells   HOG-128 px cells
min                  51.873 %          49.910 %
max                  98.861 %          98.390 %
mean                 78.868 %          78.637 %
standard deviation   12.147 %          10.963 %
TABLE II. STATISTICS OF UNIQUE LEGITIMATE PAGE PAIRS IN HOG-64 AND HOG-128

Similarity of pairs of legitimate pages (100 unique pairs)

Statistics           HOG-64 px cells   HOG-128 px cells
min                  38.420 %          45.683 %
max                  74.459 %          77.092 %
mean                 60.739 %          66.012 %
standard deviation   11.026 %          9.492 %
Fig. 2. Module design of proposed system
B. Results and Discussion
According to the obtained results, the following conclusions can be drawn:

• The similarity scores of the pairs in the test 1 and test 2 sets were found to be notably different, which indicates that HOG descriptors are suitable for phishing detection tasks.

• An inspection of the charts in Fig. 3 and Fig. 4 clearly shows that a similarity value of around 75% is a good threshold for a phishing alarm.

• While the HOG-64 configuration yields slightly higher scores for phishing page-legitimate page pairs (test set 1), it produces lower scores for legitimate page pairs (test set 2) than the HOG-128 configuration. Therefore, it can be deduced that feature intersection over smaller local patches yields more robust and discriminative results.

• It is observed that in most phishing pages the scammers use different images (e.g. via the IMG tag) while keeping the page layout identical to that of the legitimate target. Since HOG features are affected by image content, as future work we plan to detect image content and replace it with a placeholder in order to improve detection accuracy.

Fig. 3. Similarity scores of phishing pages and their legitimate targets
Fig. 4. Similarity scores of unique legitimate page pairs
V. CONCLUSION
In this paper, the use of an efficient and effective computer vision method, the Histogram of Oriented Gradients descriptor, is proposed and verified in the field of phishing detection. The primary aim is to detect zero-day attacks by capturing the visual cues of a web page's layout via HOG features and creating an easy-to-compare page layout signature. To compute the similarity, the histogram intersection kernel is employed. According to the results, a similarity value of 75% was found to be an appropriate threshold for a phishing alarm. However, we believe the proposed approach can be enhanced by providing image content invariance. Therefore, we plan to detect image content with computer vision and image processing techniques and represent it with a virtually created gradient bin.
REFERENCES
[1] R.S. Rao and S.T. Ali, "A Computer Vision Technique to Detect Phishing Attacks", Fifth International Conference on Communication Systems and Network Technologies, 2015.
[2] APWG, Phishing Activity Trends Report. [Online]. Available: http://www.antiphishing.org/resources/apwg-reports/
[3] Google Safe Browsing for Firefox. [Online]. Available: http://www.google.com/tools/firefox/safebrowsing
[4] W. Zhang, H. Lu, B. Xu and H. Yang, "Web Phishing Detection Based on Page Spatial Layout Similarity", Informatica, vol. 37, pp. 231-244, 2013.
[5] W. Liu, X. Deng, G. Huang and A.Y. Fu, "An Antiphishing Strategy Based on Visual Similarity Assessment", IEEE Internet Computing, vol. 10, pp. 58-65, March 2006.
[6] E. Medvet, E. Kirda and C. Kruegel, "Visual-Similarity-Based Phishing Detection", SecureComm '08 International Conference on Security and Privacy in Communication Networks, 2008.
[7] I.F. Lam, W.C. Xiao, S.C. Wang and K.T. Chen, "Counteracting Phishing Page Polymorphism: An Image Layout Analysis Approach", Third International Conference and Workshops, ISA 2009, 2009.
[8] M.E. Maurer and D. Herzner, "Using visual website similarity for phishing detection and reporting", CHI '12 Extended Abstracts on Human Factors in Computing Systems, 2012.
[9] A.Y. Fu, L. Wenyin and X. Deng, "Detecting Phishing Web Pages with Visual Similarity Assessment Based on Earth Mover's Distance (EMD)", IEEE Transactions on Dependable and Secure Computing, pp. 301-311, 2006.
[10] G. Wang, H. Liu, S. Becerra and K. Wang, "Verilogo: Proactive Phishing Detection via Logo Recognition", Technical Report CS2011-0669, UC San Diego, 2011.
[11] D.G. Lowe, "Distinctive image features from scale-invariant keypoints", International Journal of Computer Vision, vol. 60, 2004.
[12] A.S. Bozkir and E. Akcapinar Sezer, "SimiLay: A Developing Web Page Layout Based Visual Similarity Search Engine", 10th International Conference on Machine Learning and Data Mining, St. Petersburg, 2014.
[13] N. Dalal and B. Triggs, "Histograms of Oriented Gradients for Human Detection", IEEE Conference on Computer Vision and Pattern Recognition, 2005.
[14] C.W. Liang and C.F. Juang, "Moving object classification using local shape and HOG features in wavelet-transformed space with hierarchical SVM classifiers", Applied Soft Computing, vol. 28, 2015.
[15] A. Bosch, A. Zisserman and X. Munoz, "Representing shape with a spatial pyramid kernel", CIVR '07, Netherlands, 2007.
[16] GeckoFX. [Online]. Available: https://code.google.com/p/geckofx/
[17] OpenCV. [Online]. Available: http://opencv.org/
[18] PhishTank. [Online]. Available: https://www.phishtank.com/
[19] Alexa. [Online]. Available: http://www.alexa.com/topsites/
The objective of this paper is classifying images by the object categories they contain, for example motorbikes or dolphins. There are three areas of novelty. First, we introduce a descriptor that represents local image shape and its spatial layout, together with a spatial pyramid kernel. These are designed so that the shape correspondence between two images can be measured by the distance between their descriptors using the kernel. Second, we generalize the spatial pyramid kernel, and learn its level weighting parameters (on a validation set). This significantly improves classification performance. Third, we show that shape and appearance kernels may be combined (again by learning parameters on a validation set). Results are reported for classification on Caltech-101 and retrieval on the TRECVID 2006 data sets. For Caltech-101 it is shown that the class specific optimization that we introduce exceeds the state of the art performance by more than 10%.