Preprint

Peek-a-boo, I Can See You, Forger: Influences of Human Demographics, Brand Familiarity and Security Backgrounds on Homograph Recognition

Abstract

A homograph attack deceives victims about which domain they are communicating with by exploiting the fact that many characters look alike. The attack has drawn broad attention recently, as the domains of many brands have been targeted, including Apple Inc., Adobe Inc., and Lloyds Bank. We first design a survey of human demographics, brand familiarity, and security backgrounds and administer it to 2,067 participants. We build a regression model to study which factors affect participants' ability to recognize homograph domains. We find that participants' ability differs across kinds of homographs. For instance, female participants tend to be able to recognize homographs, while male participants tend to be able to recognize non-homographs. Furthermore, 16.59% of participants can recognize homographs whose visual similarity to the target brand domains is under 99.9%; however, when the similarity increases to 99.9%, the share of participants who can recognize homographs drops significantly to merely 0.19%; and for homographs with 100% visual similarity, participants have no way to recognize them. We also find that people working or educated in computer science or computer engineering tend to exhibit the best ability to recognize all kinds of homographs and non-homographs. Surprisingly to us, brand familiarity does not influence the ability to recognize either homographs or non-homographs. Stated differently, people who frequently use the brand domains but lack sufficient knowledge are still vulnerable.
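
To make the "characters look alike" point concrete, the following minimal sketch compares a genuine domain with a lookalike whose first letter is a Cyrillic character. The domain names and the helper names `describe` and `to_ascii` are illustrative assumptions, not material from the study; the sketch relies only on Python's standard `unicodedata` module and built-in `idna` codec.

```python
import unicodedata


def describe(domain):
    """Return (character, code point, Unicode name) for each character."""
    return [(ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")) for ch in domain]


def to_ascii(domain):
    """Encode a domain to its ASCII-compatible form using the built-in IDNA codec."""
    return domain.encode("idna").decode("ascii")


# Hypothetical example pair: the lookalike starts with CYRILLIC SMALL LETTER A.
genuine = "apple.com"
lookalike = "\u0430pple.com"

print(genuine == lookalike)        # the strings differ although they render alike
print(describe(lookalike)[0])      # the first character is not a Latin letter
print(to_ascii(genuine))           # pure ASCII domains are left unchanged
print(to_ascii(lookalike))         # non-ASCII labels get an "xn--" prefix
```

Displaying the ASCII-compatible form is one way a client can expose such a substitution, since the two domains are visibly different once encoded.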

Article
Users must regularly distinguish between secure and insecure cyber platforms in order to preserve their privacy and safety. Mouse tracking is an accessible, high-resolution measure that can be leveraged to understand the dynamics of perception, categorization, and decision-making in threat detection. Researchers have begun to utilize measures like mouse tracking in cyber security research, including in the study of risky online behavior. However, it remains an empirical question to what extent real-time information about user behavior is predictive of user outcomes and demonstrates added value compared to traditional self-report questionnaires. Participants navigated through six simulated websites, which resembled either secure “non-spoof” or insecure “spoof” versions of popular websites. Websites also varied in terms of authentication level (i.e., extended validation, standard validation, or partial encryption). Spoof websites had modified Uniform Resource Locator (URL) and authentication level. Participants chose to “login” to or “back” out of each website based on perceived website security. Mouse tracking information was recorded throughout the task, along with task performance. After completing the website identification task, participants completed a questionnaire assessing their security knowledge and degree of familiarity with the websites simulated during the experiment. Despite being primed to the possibility of website phishing attacks, participants generally showed a bias for logging in to websites versus backing out of potentially dangerous sites. Along these lines, participant ability to identify spoof websites was around the level of chance. Hierarchical Bayesian logistic models were used to compare the accuracy of two-factor (i.e., website security and encryption level), survey-based (i.e., security knowledge and website familiarity), and real-time measures (i.e., mouse tracking) in predicting risky online behavior during phishing attacks. 
Participant accuracy in identifying spoof and non-spoof websites was best captured using a model that included real-time indicators of decision-making behavior, as compared to two-factor and survey-based models. Findings validate three widely applicable measures of user behavior derived from mouse tracking recordings, which can be utilized in cyber security and user intervention research. Survey data alone are not as strong at predicting risky Internet behavior as models that incorporate real-time measures of user behavior, such as mouse tracking.
Conference Paper
Despite the plethora of security advice and online education materials offered to end-users, there exists no standard measurement tool for end-user security behaviors. We present the creation of such a tool. We surveyed the most common computer security advice that experts offer to end-users in order to construct a set of Likert scale questions to probe the extent to which respondents claim to follow this advice. Using these questions, we iteratively surveyed a pool of 3,619 computer users to refine our question set such that each question was applicable to a large percentage of the population, exhibited adequate variance between respondents, and had high reliability (i.e., desirable psychometric properties). After performing both exploratory and confirmatory factor analysis, we identified a 16-item scale consisting of four sub-scales that measures attitudes towards choosing passwords, device securement, staying up-to-date, and proactive awareness.
Article
User education must focus on challenging and correcting the misconceptions that guide current user behavior. To date, user education on phishing has tried to persuade them to check URLs and a number of other indicators, with limited success. The authors evaluate a novel antiphishing tool in a realistic setting—participants had to buy tickets under time pressure and lost money if they bought from bad sites. Although none of the participants bought from sites the tool clearly identified as bad, 40 percent risked money with sites flagged as potentially risky, but offering bargains. When tempted by a good deal, participants didn't focus on the warnings; rather, they looked for signs they thought confirmed a site's trustworthiness.
Conference Paper
This paper reports findings from a multi-method set of four studies that investigate why we continue to fall for phish. Current security advice suggests poor spelling and grammar in emails can be signs of phish. But a content analysis of a phishing archive indicates that many such emails contain no obvious spelling or grammar mistakes and often use convincing logos and letterheads. An online survey of 224 people finds that although phish are detected approximately 80% of the time, those with logos are significantly harder to detect. A qualitative interview study was undertaken to better understand the strategies used to identify phish. Blind users were selected because it was thought they may be more vulnerable to phishing attacks, however they demonstrated robust strategies for identifying phish based on careful reading of emails. Finally an analysis was undertaken of phish as a literary form. This identifies the main literary device employed as pastiche and draws on critical theory to consider why security based pastiche may be currently very persuasive.
Conference Paper
In this paper we present the results of a roleplay survey instrument administered to 1001 online survey respondents to study both the relationship between demographics and phishing susceptibility and the effectiveness of several anti-phishing educational materials. Our results suggest that women are more susceptible than men to phishing and participants between the ages of 18 and 25 are more susceptible to phishing than other age groups. We explain these demographic factors through a mediation analysis. Educational materials reduced users' tendency to enter information into phishing webpages by 40%; however, some of the educational materials we tested also slightly decreased participants' tendency to click on legitimate links.
Article
Computing veterans remember an old habit of crossing zeros in program listings to avoid confusing them with the letter O, in order to make sure the operator would type the program correctly into the computer. This habit, once necessary, has long been rendered obsolete by the increased availability of editing tools. However, the underlying problem of character resemblance is still there. Today it seems we may have to acquire a similar habit, this time to address an issue much more threatening than mere typos: security. Let us begin with a short recourse to history. On April 7, 2000 an anonymous site published a bogus story intimating that the company PairGain Technologies (NASDAQ:PAIR) was about to be acquired for approximately twice its market value. The site employed the look and feel of the Bloomberg news service, and thus appeared quite authentic to unsuspecting users. To disseminate the "news", a message containing a link to the story was simultaneously posted to the Yahoo message board dedicated to PairGain. The link referred to the phony site by its numerical IP address rather than by name, and thus obscured its true identity. Many readers were convinced by the Bloomberg look and feel, and accepted the story at face value despite its suspicious address. As a result, PairGain stock first jumped 31%, and then fell drastically, incurring severe losses to investors. Attacks like this are relatively easy to detect. A stronger variant of this hoax might have used a domain named bl00mberg.com (with zeros replacing o's), but even the latter is easily distinguishable from the real thing. However, forthcoming Internet technologies have the potential to make such attacks much more elusive and devastating. A new initiative, promoted by a number of Internet standards bodies including IETF and IANA, allows one to register domain names in national alphabets. This way, for example, Russian news site "gazeta.ru" ("gazeta" means "newspaper" in Russian) might register a more appealing "[…]". Far from buzzword compliance, the initiative caters to the genuine needs of non-English-speaking Internet users, who currently find it difficult to access Web sites otherwise. Several alternative implementations are currently being considered, and we can expect the standardization process to be completed soon. The benefits of this initiative are indisputable. Yet the very idea of such an infrastructure is compromised by the peculiarities of world alphabets. Revisiting our newspaper example, one can observe that Russian letters "[…]" are indistinguishable in writing from their English counterparts. Some of the letters (such as "a") are close etymologically, while others look similar by sheer coincidence. For instance, Russian letter "p" is actually pronounced like "r", but the glyphs of the two letters are identical. As it happens, Russian is not the only such language; other Cyrillic languages may cause similar collisions. With the proposed infrastructure in place, numerous English domain names may be homographed, that is, maliciously misspelled by substitution of non-Latin letters. For example, the Bloomberg attack could have been crafted much more skillfully, by registering a domain name bloomberg.com, where the letters "o" and/or "e" have been faked with Russian substitutes. Without adequate safety mechanisms, this scheme can easily mislead even the most cautious reader.
1. Incidentally, this domain has actually been registered.
2. According to Global Reach's report, the English-speaking population of the Internet was about 62% in 1998, and is forecasted to be as low as 37% by the end of 2002.
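The letter substitution described above can be flagged with a simple mixed-script check. The sketch below is an illustration only and is not taken from the article: the function names are hypothetical, and the script of each letter is approximated by the first word of its Unicode character name, which is a heuristic and not the official Unicode Script property.

```python
import unicodedata


def scripts_in_label(label):
    """Approximate the scripts used in a label via the first word of each letter's Unicode name."""
    scripts = set()
    for ch in label:
        if ch.isalpha():  # skip digits, hyphens and other non-letters
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def is_mixed_script(domain):
    """Return True if any dot-separated label mixes letters from more than one script."""
    return any(len(scripts_in_label(label)) > 1 for label in domain.split("."))


# Hypothetical test strings: the second one replaces both "o" letters with CYRILLIC SMALL LETTER O.
print(is_mixed_script("bloomberg.com"))
print(is_mixed_script("bl\u043e\u043emberg.com"))
```

A check of this kind does not catch a label written entirely in one non-Latin script, so it covers only the mixed-letter case.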
Conference Paper
Many computer-security defenses are reactive: they operate only when security incidents take place, or immediately thereafter. Recent efforts have attempted to predict security incidents before they occur, to enable defenders to proactively protect their devices and networks. These efforts have primarily focused on long-term predictions. We propose a system that enables proactive defenses at the level of a single browsing session. By observing user behavior, it can predict whether they will be exposed to malicious content on the web seconds before the moment of exposure, thus opening a window of opportunity for proactive defenses. We evaluate our system using three months' worth of HTTP traffic generated by 20,645 users of a large cellular provider in 2017 and show that it can be helpful, even when only very low false positive rates are acceptable, and despite the difficulty of making "on-the-fly" predictions. We also engage directly with the users through surveys asking them demographic and security-related questions, to evaluate the utility of self-reported data for predicting exposure to malicious content. We find that self-reported data can help forecast exposure risk over long periods of time. However, even on the long-term, self-reported data is not as crucial as behavioral measurements to accurately predict exposure.
Conference Paper
Computer security tools usually provide universal solutions without taking user characteristics (origin, income level, ...) into account. In this paper, we test the validity of using such universal security defenses, with a particular focus on culture. We apply the previously proposed Security Behavior Intentions Scale (SeBIS) to 3,500 participants from seven countries. We first translate the scale into seven languages while preserving its reliability and structure validity. We then build a regression model to study which factors affect participants' security behavior. We find that participants from different countries exhibit different behavior. For instance, participants from Asian countries, and especially Japan, tend to exhibit less secure behavior. Surprisingly to us, we also find that actual knowledge influences user behavior much less than user self-confidence in their computer security knowledge. Stated differently, what people think they know affects their security behavior more than what they do know.
Article
Objective methods for assessing perceptual image quality have traditionally attempted to quantify the visibility of errors between a distorted image and a reference image using a variety of known properties of the human visual system. Under the assumption that human visual perception is highly adapted for extracting structural information from a scene, we introduce an alternative framework for quality assessment based on the degradation of structural information. As a specific example of this concept, we develop a Structural Similarity Index and demonstrate its promise through a set of intuitive examples, as well as comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000. A MatLab implementation of the proposed algorithm is available online at http://www.cns.nyu.edu/~lcv/ssim/.
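For reference, the Structural Similarity Index named in the abstract above is commonly written in the form below for two image patches x and y. The symbols are the usual ones: the means, variances, and covariance of the patches, and two small stabilizing constants. Whether the visual similarity percentages quoted in the preprint's abstract were computed with exactly this index is not stated in the text shown here.

```latex
\mathrm{SSIM}(x, y) =
  \frac{(2\mu_x \mu_y + C_1)\,(2\sigma_{xy} + C_2)}
       {(\mu_x^2 + \mu_y^2 + C_1)\,(\sigma_x^2 + \sigma_y^2 + C_2)}
```

The index equals 1 when the two patches are identical, which is consistent with the intuition that a homograph rendered with the same glyphs as the genuine domain cannot be told apart by sight.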
Article
Research problem: Phishing is an email-based scam where a perpetrator camouflages emails to appear as a legitimate request for personal and sensitive information. Research question: How do individuals process a phishing email, and determine whether to respond to it? Specifically, this study examines how users' attention to “visual triggers” and “phishing deception indicators” influence their decision-making processes and consequently their decisions. Literature review: This paper draws upon the theory of deception and the literature on mediated cognition and learning, including the critical role of attention and elaboration in deception detection. From this literature, we developed a research model to suggest that overall cognitive effort expended in email processing decreases with attention to visual triggers and phishing deception indicators. The likelihood to respond to phishing emails increases with attention to visceral cues, but decreases with attention to phishing deception indicators and cognitive effort. Knowledge of email-based scams increases attention to phishing deception indicators, and directly decreases response likelihood. It also moderates the impact of attention to visceral triggers and that of phishing deception indicators on likelihood to respond. Methodology: Using a real phishing email as a stimulus, a survey of 321 members of a public university community in the Northeast US, who were intended victims of a spear phishing attack that took place, was conducted. The survey used validated measures developed in prior literature for the most part and tested results using the partial least-squares regression. Results and discussion: Our research model and hypotheses were supported by the data except that we did not find that cognitive effort significantly affects response likelihood. 
The implication of the study is that attention to visceral triggers, attention to phishing deception indicators, and phishing knowledge play critical roles in phishing detection. The limitations of the study were that the data were drawn from students, and the study explored one phishing attack, relied on some single-item measures, cognitive effort measure, and a one-round survey. Future research would examine the impact of a varying degree of urgency and a varying level of phishing deception indicators, and actual victims of phishing attacks.