Conference Paper

Browser-Based Deep Behavioral Detection of Web Cryptomining with CoinSpy


... Most of these techniques are based on the analysis of dynamic features. These include fixed thresholds [16], semantic instruction-count-based signature matching [17], CPU, memory and network traffic features [18], [19], [20], runtime, network, mining-related and browser-based features [12], block-level profiling and dynamic instrumentation [15], [21], [22], hardware performance counters [23], and instruction-count-based analysis and memory events [10]. Although such existing detection systems are promising and report high detection accuracy, there are several issues to consider. ...
... Bian et al. [15] proposed MineThrottle, which uses a block-level profiler and dynamic instrumentation of the Wasm code that is pointed by the profiler at compile time. Kelton et al. [19] proposed CoinSpy which utilizes computation, memory, and network features. Conti et al. [23] used Hardware Performance Counters (HPC) data to detect cryptojacking. ...
... However, only a small portion of prior studies [17], [10], [12] take this change into account. Secondly, cryptojacking detection systems [17], [10], [12], [18], [15], [19] relying on dynamic analysis features can suffer from high computational overhead, reduced measurement accuracy due to noise caused by other processes, and false positives resulting from benign websites using the same technologies. For this reason, practical applications of such schemes may cause quality-of-experience issues for end-users. ...
... MineSweeper [45] SEISMIC [46] MinerRay [47] RAPID [48] MINOS [49] OutGuard [50] MineThrottle [51] CoinSpy [52] Detecting vulnerabilities in WebAssembly binaries (Section 5.2) ...
... CoinSpy. CoinSpy [52] is a method for detecting cryptojacking by monitoring compute, memory, and network usage from within the browser. The computational behavior is monitored using the JavaScript stack profiler, and memory usage is measured by monitoring the JavaScript heap and WebWorker threads. ...
Article
WebAssembly is a low-level bytecode language that enables high-level languages like C, C++, and Rust to be executed in the browser at near-native performance. In recent years, WebAssembly has gained widespread adoption and is now natively supported by all modern browsers. Despite its benefits, WebAssembly has introduced significant security challenges, primarily due to vulnerabilities inherited from memory-unsafe source languages. Moreover, the use of WebAssembly extends beyond traditional web applications to smart contracts on blockchain platforms, where vulnerabilities have led to significant financial losses. WebAssembly has also been used for malicious purposes, like cryptojacking, where website visitors’ hardware resources are used for crypto mining without their consent. To address these issues, several analysis techniques for WebAssembly binaries have been proposed. This paper presents a systematic review of these analysis techniques, focusing on vulnerability analysis, cryptojacking detection, and smart contract security. The analysis techniques are categorized into static, dynamic, and hybrid methods, evaluating their strengths and weaknesses based on quantitative data. Our findings reveal that static techniques are efficient but may struggle with complex binaries, while dynamic techniques offer better detection at the cost of increased overhead. Hybrid approaches, which merge the strengths of static and dynamic methods, are not extensively used in the literature and emerge as a promising direction for future research. Lastly, this paper identifies potential future research directions based on the state of the current literature.
... For example, the Firefox browser supports the detection of cryptomining by using deny lists [38]. The academic community also provides related work on detecting or preventing WebAssembly cryptojacking [52,42,25,5,26,48]. Yet, it is known that black-hats can use evasion techniques to bypass detection. ...
... On the same topic, MinerRay [48] detects cryptojacking in WebAssembly binaries by analyzing their control flow graph at runtime, searching for structures that are characteristic of encryption algorithms commonly used for cryptojacking. CoinSpy is another malware detector based on dynamic analysis [25]. It uses a convolutional neural network to analyse the computation, network, and memory information caused by cryptojackers running in client browsers. ...
Article
WebAssembly has become a crucial part of the modern web, offering a faster alternative to JavaScript in browsers. While boosting rich applications in the browser, this technology is also very efficient for developing cryptojacking malware. This has triggered the development of several methods to detect cryptojacking malware. However, these defenses have not considered the possibility of attackers using evasion techniques. This paper explores how automatic binary diversification can support the evasion of WebAssembly cryptojacking detectors. We experiment with a dataset of 33 WebAssembly cryptojacking binaries and evaluate our evasion technique against two malware detectors: VirusTotal, a general-purpose detector, and MINOS, a WebAssembly-specific detector. Our results demonstrate that our technique can automatically generate variants of WebAssembly cryptojacking that evade the detectors in 90% of cases for VirusTotal and 100% for MINOS. Our results emphasize the importance of meta-antiviruses and diverse detection techniques and provide new insights into which WebAssembly code transformations are best suited for malware evasion. We also show that the variants introduce limited performance overhead, making binary diversification an effective technique for evasion.
... These approaches do not prevent cryptojacking malware after its detection. Kelton et al. [18] also presented an in-browser tool called CoinSpy based on deep learning. It is a cryptojacking classifier for the detection of cryptomining activities within a web page. ...
... Caprolu et al. [24] analyzed the network features and achieved the true positive rate (TPR) of 92%. Similarly, Kelton et al. [18] used CPU memory features from the Alexa 100k dataset by applying a neural network and achieved an accuracy of 97.9%. ...
Article
Cryptojacking is a type of computer piracy in which a hacker uses a victim’s computer resources, without their knowledge or consent, to mine for cryptocurrency. This is made possible by new memory-based cryptomining techniques and the growth of new web technologies such as WebAssembly, allowing mining to occur within a browser. Most of the research in the field of cryptojacking has focused on detection methods rather than prevention methods. Some of the detection methods proposed in the literature include using static and dynamic features of in-browser cryptojacking malware, along with machine learning algorithms such as Support Vector Machine (SVM), Random Forest (RF), and others. Although these methods can be effective in detecting known cryptojacking malware, they may not be able to detect new or unknown variants. The existing prevention methods are shown to be effective only against WebAssembly (WASM)-based cryptojacking malware and cannot handle mining service-providing scripts that use non-WASM modules. This paper proposes a novel hybrid approach for detecting and preventing web-based cryptojacking. The proposed approach performs the real-time detection and prevention of in-browser cryptojacking malware, using the blacklisting technique and statistical code analysis to identify unique features of non-WASM cryptojacking malware. The experimental results show positive performance in ease of use and efficiency, with the detection accuracy improved from 97% to 99.6%. Moreover, the time required to prevent already known malware in real time can be decreased by 99.8%.
... Most detection systems do not focus on preventing or interrupting cryptojacking malware; there are still many studies (Kelton., 2020) that focus on both detection and prevention. The preventive approaches differ even for techniques that use similar dynamic traits to detect ongoing cryptojacking malware attacks. ...
... The preventive approaches differ even for techniques that use similar dynamic traits to detect ongoing cryptojacking malware attacks. ) raise a notification, Bian et al. (Kelton., 2020) put the mining process to sleep, and kill the process immediately. There are various technologies available on the market to prevent cryptojacking. ...
Chapter
More than 2000 different cryptocurrencies are currently available in business and FinTech applications. Cryptocurrency is a digital payment system that does not rely on banks to verify their financial transactions and can enable anyone anywhere to send and receive their payments. Crypto mining attracts investors to mine and gets some coins as a reward for using the cryptocurrency. However, hackers can exploit the computing power without the explicit authorization of a user by launching a cryptojacking attack and then using it to mine cryptocurrency. The detection and protection of cryptojacking attacks are essential, and thus, miners are continuously working to find innovative ways to overcome this issue. This chapter provides an overview of the cryptojacking landscape. It offers recommendations to guide researchers and practitioners to overcome the identified challenges faced while realizing a mitigation strategy to combat cryptojacking malware attacks.
... Antivirus vendors and browser vendors provide support for detecting cryptomalware. The academic community also provides related work on detecting or preventing WebAssembly cryptomalware [34,27,16,4,17,31]. Yet, it is known that black-hats can use evasion techniques to bypass detection [3]. ...
... On the same topic, MinerRay [31] detects cryptojacking in WebAssembly binaries by analyzing their control flow graph at runtime, searching for structures that are characteristic of encryption algorithms commonly used for cryptojacking. CoinSpy is another malware detector based on dynamic analysis [16]. It uses a convolutional neural network to analyse the computation, network, and memory information caused by cryptojackers running in client browsers. ...
Preprint
WebAssembly is a binary format that has become an essential component of the web nowadays. Providing a faster alternative to JavaScript in the browser, this new technology has been embraced from its early days to create cryptomalware. This has triggered a solid effort to propose defenses that can detect WebAssembly malware. Yet, no defensive work has assumed that attackers would use evasion techniques. In this paper, we study how to evade WebAssembly cryptomalware detectors. We propose a novel evasion technique based on a state-of-the-art WebAssembly binary diversifier. We use the worldwide authoritative VirusTotal as malware detector to evaluate our technique. Our results demonstrate that it is possible to automatically generate variants of WebAssembly cryptomalware, which evade the considered strong detector. Remarkably, the variants introduce limited performance overhead. Our experiments also provide novel insights about which WebAssembly code transformations are the best suited for malware evasion. This provides insights for the community to improve the state of the art of WebAssembly malware detection.
... Examining dynamic characteristics is the foundation of most of these strategies. These include block-level analysis and dynamic detection [4][5][6], hardware performance counters [7], signature matching based on hardware performance counters, instruction count-based analysis, and memory events, CPU, memory, and network traffic features [8][9][10], runtime, network, mining-related, and browser-based features [11], and fixed thresholds. While promising and reporting excellent detection accuracy, the current detection method. ...
... This situation may become worse because even when a user leaves a mining page and closes his browser, the CPU is still used for mining without the user's knowledge. A Cisco report [10] shows that "in 2020, almost 70 percent of its customers were victims of crypto mining software". ...
Article
Coinhive released its browser-based cryptocurrency mining code in September 2017, and vicious web page writers, called vicious miners hereafter, began to embed mining JavaScript code into their web pages, called mining pages hereafter. As a result, browser users surfing these web pages unwittingly mine cryptocurrencies for the vicious miners using the CPU resources of their devices. This activity, called cryptojacking, has become one of the most common threats to web browser users. As mining pages reduce the execution efficiency of regular programs and increase the electricity bills of victims, security specialists have started to provide methods to block mining pages. Nowadays, using a blocklist to filter out mining scripts is the most common solution to this problem. However, when the number of new mining pages increases quickly, and vicious miners apply obfuscation and encryption to bypass detection, the detection accuracy of blacklist-based or feature-based solutions decreases significantly. This paper proposes a solution, called MinerGuard, to detect mining pages. MinerGuard was designed based on the observation that mining JavaScript code consumes a lot of CPU resources because it needs to execute plenty of computation. MinerGuard does not need to update data used for detection frequently. In contrast, blacklist-based or feature-based solutions must update their blocklists frequently. Experimental results show that MinerGuard is more accurate than blacklist-based or feature-based solutions in mining page detection. MinerGuard’s detection rate for mining pages is 96%, whereas that of MinerBlock, a blacklist-based solution, is 42.85%. Moreover, MinerGuard can detect 0-day mining pages and scripts, but the blacklist-based and feature-based solutions cannot.
... Existing cryptojacking detection approaches mainly use the following four techniques: blacklisting-based [14], [16], [25], [37], [42], [45], resource monitoring-based [38], [40], thread count-based [38], [41], and WebAssembly-based techniques [35], [42]. Although they all provide insights into detecting cryptojacking, they have limitations in terms of the precise detection of cryptojacking. ...
... (2) Resource monitoring-based approach. A resource monitoring approach is based on the fact that cryptojacking is a resource-intensive task [38], [40]. This method detects a website as a cryptojacking website if the computer resources (e.g., CPU usage) exceed a predetermined threshold when visiting the website. ...
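The threshold rule described in this excerpt can be illustrated with a minimal Python sketch. It is not taken from any of the cited systems: the function name `exceeds_cpu_threshold`, the 0.8 threshold, and the minimum window of 5 samples are all hypothetical choices made for illustration, and CPU samples are assumed to be utilisation fractions between 0 and 1 collected while the page is open.

```python
from statistics import mean
from typing import Sequence


def exceeds_cpu_threshold(
    cpu_samples: Sequence[float],
    threshold: float = 0.8,   # hypothetical cut-off, not from the source
    min_samples: int = 5,     # hypothetical minimum window length
) -> bool:
    """Flag a page when mean CPU utilisation over the window exceeds the threshold.

    Windows shorter than min_samples are never flagged, so a single
    spike during page load does not trigger a detection.
    """
    if len(cpu_samples) < min_samples:
        return False
    return mean(cpu_samples) > threshold


# A sustained high load is flagged; a mostly idle page is not.
mining_like = [0.95, 0.97, 0.93, 0.96, 0.94, 0.95]
idle_like = [0.05, 0.10, 0.04, 0.08, 0.06, 0.07]
flags = (exceeds_cpu_threshold(mining_like), exceeds_cpu_threshold(idle_like))
```

A rule of this shape would also flag any benign page that happens to be CPU-intensive, and a miner that throttles itself below the threshold would pass, which is consistent with the limitations the excerpt attributes to this family of approaches.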
Article
Cryptojacking is often used by attackers as a means of gaining profits by exploiting users’ resources without their consent, despite the anticipated positive effect of browser-based cryptomining. Previous approaches have attempted to detect cryptojacking websites, but they have the following limitations: (1) they failed to detect several cryptojacking websites either because of their evasion techniques or because they cannot detect JavaScript-based cryptojacking and (2) they yielded several false alarms by focusing only on limited characteristics of cryptojacking, such as counting computer resources. In this paper, we propose CIRCUIT, a precise approach for detecting cryptojacking websites. We primarily focus on the JavaScript memory heap, which is resilient to script code obfuscation and provides information about the objects declared in the script code and their reference relations. We then extract a reference flow that can represent the script code behavior of the website from the JavaScript memory heap. Hence, CIRCUIT determines that a website is running cryptojacking if it contains a reference flow for cryptojacking. In our experiments, we found 1,813 real-world cryptojacking websites among 300K popular websites. Moreover, we provided new insights into cryptojacking by modeling the identified evasion techniques and considering the fact that characteristics of cryptojacking websites now appear on normal websites as well.
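The idea of searching a heap's reference relations for a characteristic flow can be sketched in Python as follows. This is only an illustration of path matching over an object graph and should not be read as CIRCUIT's actual algorithm, which the abstract does not spell out. The heap encoding, the function names `has_reference_flow` and `page_matches`, and the object labels (`Worker`, `WebSocket`, `WasmModule`) are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical heap encoding: object id -> (type label, ids of referenced objects).
Heap = Dict[int, Tuple[str, List[int]]]


def has_reference_flow(heap: Heap, node: int, pattern: List[str]) -> bool:
    """Return True if a chain of references starting at `node` carries the
    labels in `pattern`, in order and without gaps."""
    if not pattern:
        return True
    label, refs = heap[node]
    if label != pattern[0]:
        return False
    if len(pattern) == 1:
        return True
    # Recursion depth is bounded by len(pattern), so reference cycles are safe.
    return any(has_reference_flow(heap, ref, pattern[1:]) for ref in refs)


def page_matches(heap: Heap, pattern: List[str]) -> bool:
    """Return True if the flow starts at any object in the heap snapshot."""
    return any(has_reference_flow(heap, node, pattern) for node in heap)


# Toy snapshot with a cycle: Worker -> WebSocket -> WasmModule -> Worker.
toy_heap: Heap = {
    0: ("Worker", [1]),
    1: ("WebSocket", [2]),
    2: ("WasmModule", [0]),
}
found = page_matches(toy_heap, ["Worker", "WebSocket", "WasmModule"])
missing = page_matches(toy_heap, ["WebSocket", "Worker"])
```

Matching on object types and their references, rather than on identifiers or source text, is what makes an approach of this kind insensitive to renaming-style obfuscation, in line with the resilience the abstract claims for the memory heap.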
Preprint
WebAssembly is revolutionizing the approach to developing modern applications. Although this technology was born to create portable and performant modules in web browsers, currently, its capabilities are extensively exploited in multiple and heterogeneous use-case scenarios. With the extensive effort of the community, new toolkits make the use of this technology more suitable for real-world applications. In this context, it is crucial to study the liaisons between the WebAssembly ecosystem and software security. Indeed, WebAssembly can be a medium for improving the security of a system, but it can also be exploited to evade detection systems or for performing cryptomining activities. In addition, programs developed in low-level languages such as C can be compiled in WebAssembly binaries, and it is interesting to evaluate the security impacts of executing programs vulnerable to attacks against memory in the WebAssembly sandboxed environment. Also, WebAssembly has been designed to provide a secure and isolated environment, but such capabilities should be assessed in order to analyze their weaknesses and propose new mechanisms for addressing them. Although some research works have provided surveys of the most relevant solutions aimed at discovering WebAssembly vulnerabilities or detecting attacks, at the time of writing, there is no comprehensive review of security-related literature in the WebAssembly ecosystem. We aim to fill this gap by proposing a comprehensive review of research works dealing with security in WebAssembly. We analyze 121 papers by identifying seven different security categories. We hope that our work will provide insights into the complex landscape of WebAssembly and guide researchers, developers, and security professionals towards novel avenues in the realm of the WebAssembly ecosystem.
Chapter
The booming of cryptocurrencies in the last decade brought about the burst of cryptomining for obtaining cryptocurrencies in recent years. Only those users with plenty of computing resources are able to gain profits according to the design of block chain. As a result, this brings out more and more criminal attacks to maliciously plunder private and public computing resources through networks. Consequently, the detection of malicious cryptomining behavior is particularly important for network security and management. In this paper, we designed Mining Vanguard, realizing the recognition of mining behavior through the detection of DNS behavior. By constructing a comprehensive feature set that includes both traditional DNS resolution features and morpheme features, we combine network characteristics with semantic characteristics, aiming to achieve early recognition. Through a large number of targeted experiments, it is verified that Mining Vanguard is promising for detecting mining behaviors on the Internet.
Article
The increasing development of cryptocurrencies has brought cryptojacking as a new security threat in which attackers steal computing resources for cryptomining. The digitization of the supply chain is a potential major target for cryptojacking due to the large number of different infrastructures involved. These different infrastructures provide information sources that can be useful to detect cryptojacking, but with a wide variety of data formats and encodings. This paper describes the semantic data aggregator (SDA), a normalization and aggregation system based on data modelling and low-latency processing of data streams that facilitates the integration of heterogeneous information sources. As a use case, the paper describes a cryptomining detection system (CDS) based on network traffic flows processed by a machine learning engine. The results show how the SDA is leveraged in this use case to obtain aggregated information that improves the performance of the CDS.
Article
With the increasing value of digital cryptocurrency in recent years, the digital cryptocurrency mining industry is becoming prosperous. However, this industry has also gained attention from adversaries who exploit users’ computers to mine cryptocurrency covertly. To detect cryptojacking attacks, many static and dynamic methods are proposed. However, the existing solutions still have some limitations in terms of effectiveness, performance, and transparency. To address these issues, we present CJSpector, a novel hardware-based approach for cryptojacking detection. This method first leverages the Intel Processor Trace mechanism to collect the run-time control flow information of a web browser. Next, CJSpector makes use of two optimization approaches based on the library functionality and information gain to preprocess the control flow information. Finally, it leverages Recurrent Neural Network (RNN) for cryptojacking detection. The evaluation shows that our method can detect in-browser covert cryptocurrency mining effectively and transparently with a small performance cost.
Conference Paper
Mining is the foundation of blockchain-based cryptocurrencies such as Bitcoin rewarding the miner for finding blocks for new transactions. The Monero currency enables mining with standard hardware in contrast to special hardware (ASICs) as often used in Bitcoin, paving the way for in-browser mining as a new revenue model for website operators. In this work, we study the prevalence of this new phenomenon. We identify and classify mining websites in 138M domains and present a new fingerprinting method which finds up to a factor of 5.7 more miners than publicly available block lists. Our work identifies and dissects Coinhive as the major browser-mining stakeholder. Further, we present a new method to associate mined blocks in the Monero blockchain to mining pools and uncover that Coinhive currently contributes 1.18% of mined blocks having turned over 1293 Moneros in June 2018. CCS CONCEPTS • Security and privacy → Malware and its mitigation; • Networks → Network measurement;
Conference Paper
Recent advances in cloud computing have simplified the way that both software development and testing are performed. Unfortunately, this is not true for battery testing for which state of the art test-beds simply consist of one phone attached to a power meter. These test-beds have limited resources, access, and are overall hard to maintain; for these reasons, they often sit idle with no experiment to run. In this paper, we propose to share existing battery testing setups and build BatteryLab, a distributed platform for battery measurements. Our vision is to transform independent battery testing setups into vantage points of a planetary-scale measurement platform offering heterogeneous devices and testing conditions. In the paper, we design and deploy a combination of hardware and software solutions to enable BatteryLab's vision. We then preliminarily evaluate BatteryLab's accuracy of battery reporting, along with some system benchmarking. We also demonstrate how BatteryLab can be used by researchers to investigate a simple research question.
Article
TensorFlow is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms. A computation expressed using TensorFlow can be executed with little or no change on a wide variety of heterogeneous systems, ranging from mobile devices such as phones and tablets up to large-scale distributed systems of hundreds of machines and thousands of computational devices such as GPU cards. The system is flexible and can be used to express a wide variety of algorithms, including training and inference algorithms for deep neural network models, and it has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields, including speech recognition, computer vision, robotics, information retrieval, natural language processing, geographic information extraction, and computational drug discovery. This paper describes the TensorFlow interface and an implementation of that interface that we have built at Google. The TensorFlow API and a reference implementation were released as an open-source package under the Apache 2.0 license in November, 2015 and are available at www.tensorflow.org.
Conference Paper
In recent years, attacks targeting web browsers and their plugins have become a prevalent threat. Attackers deploy web pages that contain exploit code, typically written in HTML and JavaScript, and use them to compromise unsuspecting victims. Initially, static techniques, such as signature-based detection, were adequate to identify such attacks. The response from the attackers was to heavily obfuscate the attack code, rendering static techniques insufficient. This led to dynamic analysis systems that execute the JavaScript code included in web pages in order to expose malicious behavior. However, today we are facing a new reaction from the attackers: evasions. The latest attacks found in the wild incorporate code that detects the presence of dynamic analysis systems and try to avoid analysis and/or detection. In this paper, we present Revolver, a novel approach to automatically detect evasive behavior in malicious JavaScript. Revolver uses efficient techniques to identify similarities between a large number of JavaScript programs (despite their use of obfuscation techniques, such as packing, polymorphism, and dynamic code generation), and to automatically interpret their differences to detect evasions. More precisely, Revolver leverages the observation that two scripts that are similar should be classified in the same way by web malware detectors (either both scripts are malicious or both scripts are benign); differences in the classification may indicate that one of the two scripts contains code designed to evade a detector tool. Using large-scale experiments, we show that Revolver is effective at automatically detecting evasion attempts in JavaScript, and its integration with existing web malware analysis systems can support the continuous improvement of detection techniques.
Conference Paper
The proliferation of computers in any domain is followed by the proliferation of malware in that domain. Systems, including the latest mobile platforms, are laden with viruses, rootkits, spyware, adware and other classes of malware. Despite the existence of anti-virus software, malware threats persist and are growing as there exist a myriad of ways to subvert anti-virus (AV) software. In fact, attackers today exploit bugs in the AV software to break into systems. In this paper, we examine the feasibility of building a malware detector in hardware using existing performance counters. We find that data from performance counters can be used to identify malware and that our detection techniques are robust to minor variations in malware programs. As a result, after examining a small set of variations within a family of malware on Android ARM and Intel Linux platforms, we can detect many variations within that family. Further, our proposed hardware modifications allow the malware detector to run securely beneath the system software, thus setting the stage for AV implementations that are simpler and less buggy than software AV. Combined, the robustness and security of hardware AV techniques have the potential to advance state-of-the-art online malware detection.
Conference Paper
Bots are the root cause of many security problems on the Internet, as they send spam, steal information from infected machines, and perform distributed denial-of-service attacks. Many approaches to bot detection have been proposed, but they either rely on end-host installations, or, if they operate on network traffic, require deep packet inspection for signature matching. In this paper, we present BotFinder, a novel system that detects infected hosts in a network using only high-level properties of the bot's network traffic. BotFinder does not rely on content analysis. Instead, it uses machine learning to identify the key features of command-and-control communication, based on observing traffic that bots produce in a controlled environment. Using these features, BotFinder creates models that can be deployed at network egress points to identify infected hosts. We trained our system on a number of representative bot families, and we evaluated BotFinder on real-world traffic datasets – most notably, the NetFlow information of a large ISP that contains more than 25 billion flows. Our results show that BotFinder is able to detect bots in network traffic without the need of deep packet inspection, while still achieving high detection rates with very few false positives.
Conference Paper
Malicious web pages that host drive-by-download exploits have become a popular means for compromising hosts on the Internet and, subsequently, for creating large-scale botnets. In a drive-by-download exploit, an attacker embeds a malicious script (typically written in JavaScript) into a web page. When a victim visits this page, the script is executed and attempts to compromise the browser or one of its plugins. To detect drive-by-download exploits, researchers have developed a number of systems that analyze web pages for the presence of malicious code. Most of these systems use dynamic analysis. That is, they run the scripts associated with a web page either directly in a real browser (running in a virtualized environment) or in an emulated browser, and they monitor the scripts' executions for malicious activity. While the tools are quite precise, the analysis process is costly, often requiring in the order of
Conference Paper
Illicit crypto-mining leverages resources stolen from victims to mine cryptocurrencies on behalf of criminals. While recent works have analyzed one side of this threat, i.e., web-browser cryptojacking, only commercial reports have partially covered binary-based crypto-mining malware. In this paper, we conduct the largest measurement of crypto-mining malware to date, analyzing approximately 4.5 million malware samples (1.2 million malicious miners), over a period of twelve years from 2007 to 2019. Our analysis pipeline applies both static and dynamic analysis to extract information from the samples, such as wallet identifiers and mining pools. Together with OSINT data, this information is used to group samples into campaigns. We then analyze publicly-available payments sent to the wallets from mining-pools as a reward for mining, and estimate profits for the different campaigns. All this together is done in a fully automated fashion, which enables us to leverage measurement-based findings of illicit crypto-mining at scale. Our profit analysis reveals campaigns with multi-million earnings, associating over 4.4% of Monero with illicit mining. We analyze the infrastructure related with the different campaigns, showing that a high proportion of this ecosystem is supported by underground economies such as Pay-Per-Install services. We also uncover novel techniques that allow criminals to run successful campaigns.
Conference Paper
In-browser cryptojacking is a form of resource abuse that leverages end-users' machines to mine cryptocurrency without obtaining the users' consent. In this paper, we design, implement, and evaluate Outguard, an automated cryptojacking detection system. We construct a large ground-truth dataset, extract several features using an instrumented web browser, and ultimately select seven distinctive features that are used to build an SVM classification model. Outguard achieves a 97.9% TPR and 1.1% FPR and is reasonably tolerant to adversarial evasions. We utilized Outguard in the wild by deploying it across the Alexa Top 1M websites and found 6,302 cryptojacking sites, of which 3,600 are new detections that were absent from the training data. These cryptojacking sites paint a broad picture of the cryptojacking ecosystem, with particular emphasis on the prevalence of cryptojacking websites and the shared infrastructure that provides clues to the operators behind the cryptojacking phenomenon.
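For readers less familiar with the two metrics quoted in this abstract, the following Python sketch shows how a true positive rate (TPR) and false positive rate (FPR) are computed from confusion-matrix counts. The function name `detection_rates` and the counts in the example are hypothetical; they were chosen only so that the arithmetic lands on 97.9% and 1.1%, and they are not the paper's data.

```python
from typing import Tuple


def detection_rates(tp: int, fn: int, fp: int, tn: int) -> Tuple[float, float]:
    """Return (TPR, FPR) for a binary detector.

    TPR = TP / (TP + FN): share of cryptojacking pages that were flagged.
    FPR = FP / (FP + TN): share of benign pages that were wrongly flagged.
    """
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one positive and one negative sample")
    return tp / (tp + fn), fp / (fp + tn)


# Hypothetical counts: 1,000 mining pages and 1,000 benign pages.
tpr, fpr = detection_rates(tp=979, fn=21, fp=11, tn=989)
```

Because benign pages vastly outnumber mining pages on the open web, even a low FPR can translate into a large absolute number of false alarms when a detector is run across a list such as the Alexa Top 1M.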
Conference Paper
Four major search engines, Google in particular, hold a unique position in enabling the use of the Internet, as they alone direct over 98% of Internet users to the content they seek, using proprietary indices. While the contribution of these companies is undeniable, their design is necessarily affected by their economic interests, which may or may not align with those of the users, raising concerns regarding their effect on the availability of information around the globe. While multiple academic and commercial projects aimed to distribute and democratize Web search, they failed to gain much traction, mostly due to inferior results and lack of incentives for participation. In this paper, we show how complex networking-intensive tasks can be crowdsourced using Bitcoin's incentive model. We present Webcoin, a novel distributed digital currency which utilizes networking resources rather than computational ones, and can only be mined through Web indexing. Webcoin provides both the incentives and the means to create Google-scale indices, freely available to competing services and the public. Webcoin's design overcomes numerous unique challenges, such as index verification, scalability, and nodes' ability to actively manipulate webpages. We deploy 200 fully-functioning Webcoin nodes and demonstrate their low bandwidth requirements.
Conference Paper
A wave of alternative coins that can be effectively mined without specialized hardware, and a surge in cryptocurrencies' market value, have led to the development of cryptocurrency mining (cryptomining) services, such as Coinhive, which can be easily integrated into websites to monetize the computational power of their visitors. While legitimate website operators are exploring these services as an alternative to advertisements, they have also drawn the attention of cybercriminals: drive-by mining (also known as cryptojacking) is a new web-based attack, in which an infected website secretly executes JavaScript code and/or a WebAssembly module in the user's browser to mine cryptocurrencies without her consent. In this paper, we perform a comprehensive analysis on Alexa's Top 1 Million websites to shed light on the prevalence and profitability of this attack. We study the websites affected by drive-by mining to understand the techniques being used to evade detection, and the latest web technologies being exploited to efficiently mine cryptocurrency. As a result of our study, which covers 28 Coinhive-like services that are widely being used by drive-by mining websites, we identified 20 active cryptomining campaigns. Motivated by our findings, we investigate possible countermeasures against this type of attack. We discuss how current blacklisting approaches and heuristics based on CPU usage are insufficient, and present MineSweeper, a novel detection technique that is based on the intrinsic characteristics of cryptomining code, and, thus, is resilient to obfuscation. Our approach could be integrated into browsers to warn users about silent cryptomining when visiting websites that do not ask for their consent.
Conference Paper
As a new mechanism to monetize web content, cryptocurrency mining is becoming increasingly popular. The idea is simple: a webpage delivers extra workload (JavaScript) that consumes computational resources on the client machine to solve cryptographic puzzles, typically without notifying users or having explicit user consent. This new mechanism, often heavily abused and thus considered a threat termed "cryptojacking", is estimated to affect over 10 million web users every month; however, only a few anecdotal reports exist so far and little is known about its severity, infrastructure, and technical characteristics behind the scenes. This is likely due to the lack of effective approaches to detect cryptojacking at a large scale (e.g., VirusTotal). In this paper, we take a first step towards an in-depth study of cryptojacking. By leveraging a set of inherent characteristics of cryptojacking scripts, we build CMTracker, a behavior-based detector with two runtime profilers for automatically tracking Cryptocurrency Mining scripts and their related domains. Surprisingly, our approach successfully discovered 2,770 unique cryptojacking samples from 853,936 popular web pages, including 868 among the top 100K in the Alexa list. Leveraging these samples, we gain a more comprehensive picture of the cryptojacking attacks, including their impact, distribution mechanisms, obfuscation, and attempts to evade detection. For instance, a diverse set of organizations benefit from cryptojacking based on the unique wallet IDs. In addition, to stay under the radar, they frequently update their attack domains (fast flux) on the order of days. Many attackers also apply evasion techniques, including limiting the CPU usage, obfuscating the code, etc.
Conference Paper
Performing detailed forensic analysis of real-world web security incidents targeting users, such as social engineering and phishing attacks, is a notoriously challenging and time-consuming task. To reconstruct web-based attacks, forensic analysts typically rely on browser cache files and system logs. However, cache files and logs provide only sparse information often lacking adequate detail to reconstruct a precise view of the incident. To address this problem, we need an always-on and lightweight (i.e., low overhead) forensic data collection system that can be easily integrated with a variety of popular browsers, and that allows for recording enough detailed information to enable a full reconstruction of web security incidents, including phishing attacks. To this end, we propose WebCapsule, a novel record and replay forensic engine for web browsers. WebCapsule functions as an always-on system that aims to record all non-deterministic inputs to the core web rendering engine embedded in popular browsers, including all user interactions with the rendered web content, web traffic, and non-deterministic signals and events received from the runtime environment. At the same time, WebCapsule aims to be lightweight and introduce low overhead. In addition, given a previously recorded trace, WebCapsule allows a forensic analyst to fully replay and analyze past web browsing sessions in a controlled isolated environment. We design WebCapsule to also be portable, so that it can be integrated with minimal or no changes into a variety of popular web-rendering applications and platforms. To achieve this goal, we build WebCapsule as a self-contained instrumented version of Google's Blink rendering engine and its tightly coupled V8 JavaScript engine. We evaluate WebCapsule on numerous real-world phishing attack instances, and demonstrate that such attacks can be recorded and fully replayed. 
In addition, we show that WebCapsule can record complex browsing sessions on popular websites and different platforms (e.g., Linux and Android) while imposing reasonable overhead, thus making always-on recording practical.
Article
As smartphones and mobile devices are rapidly becoming indispensable for many network users, mobile malware has become a serious threat to network security and privacy. Especially on the popular Android platform, many malicious apps are hiding among a large number of normal apps, which makes malware detection more challenging. In this paper, we propose an ML-based method that utilizes more than 200 features extracted from both static analysis and dynamic analysis of Android apps for malware detection. The comparison of modeling results demonstrates that the deep learning technique is especially suitable for Android malware detection and can achieve a high level of 96% accuracy with real-world Android application sets.
Conference Paper
Identifying malicious web sites has become a major challenge in today's Internet. Previous work focused on detecting if a web site is malicious by dynamically executing JavaScript in instrumented environments or by rendering web sites in client honeypots. Both techniques bear a significant evaluation overhead, since the analysis can take up to tens of seconds or even minutes per sample. In this paper, we introduce a novel, purely static analysis approach, the Delta-system, that (i) extracts change-related features between two versions of the same website, (ii) uses a machine-learning algorithm to derive a model of web site changes, (iii) detects if a change was malicious or benign, (iv) identifies the underlying infection vector campaign based on clustering, and (v) generates an identifying signature. We demonstrate the effectiveness of the Delta-system by evaluating it on a dataset of over 26 million pairs of web sites by running next to a web crawler for a period of four months. Over this time span, the Delta-system successfully identified previously unknown infection campaigns, including a campaign that targeted installations of the Discuz!X Internet forum software, injected infection vectors into these forums, and redirected to an installation of the Cool Exploit Kit.
Article
A new method of dimensionality reduction for time series data mining is proposed. Each time series is compressed with wavelet or Fourier decomposition. Instead of using only the first coefficients, a new method of choosing the best coefficients for a set of time series is presented. A criterion function is evaluated using all values of a coefficient position to determine a good set of coefficients. The optimal criterion function with respect to energy preservation is given. For many real life data sets much more energy can be preserved, which is advantageous for data mining tasks. All time series to be mined, or at least a representative subset, need to be available a priori.
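The coefficient-selection idea in the abstract above can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not state the criterion function, so the sketch assumes the criterion is the summed squared magnitude (energy) of each Fourier coefficient position across all series. The function names and the toy data set are hypothetical, and a naive DFT is used so the example needs only the Python standard library.

```python
import cmath
import math


def dft(series):
    """Naive discrete Fourier transform, O(n^2); adequate for short series."""
    n = len(series)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(series))
        for k in range(n)
    ]


def position_energies(dataset):
    """Assumed criterion: energy of each coefficient position summed over all series."""
    spectra = [dft(series) for series in dataset]
    n = len(spectra[0])
    return [sum(abs(spec[pos]) ** 2 for spec in spectra) for pos in range(n)]


def best_coefficient_positions(dataset, k):
    """Pick the k positions with the highest summed energy, returned in ascending order."""
    energies = position_energies(dataset)
    ranked = sorted(range(len(energies)), key=lambda pos: energies[pos], reverse=True)
    return sorted(ranked[:k])


def preserved_energy_fraction(dataset, positions):
    """Fraction of the data set's total spectral energy kept by the chosen positions."""
    energies = position_energies(dataset)
    return sum(energies[pos] for pos in positions) / sum(energies)


# Hypothetical toy data: two zero-mean series of length 16, both at frequency 3.
N = 16
toy_dataset = [
    [math.cos(2 * math.pi * 3 * t / N) for t in range(N)],
    [2 * math.sin(2 * math.pi * 3 * t / N) for t in range(N)],
]

if __name__ == "__main__":
    chosen = best_coefficient_positions(toy_dataset, 2)
    print("chosen positions:", chosen)
    print("energy kept by chosen:", preserved_energy_fraction(toy_dataset, chosen))
    print("energy kept by first two:", preserved_energy_fraction(toy_dataset, [0, 1]))
```

For this toy data the energy sits at positions 3 and 13 (the spectrum of a real-valued series is conjugate-symmetric), so keeping only the first two coefficients would preserve almost none of it. The selection needs the whole data set up front, which matches the abstract's remark that the series must be available a priori.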
Conference Paper
Numerous attacks, such as worms, phishing, and botnets, threaten the availability of the Internet, the integrity of its hosts, and the privacy of its users. A core element of defense against these attacks is anti-virus (AV) software--a service that detects, removes, and characterizes these threats. The ability of these products to successfully characterize these threats has far-reaching effects--from facilitating sharing across organizations, to detecting the emergence of new threats, and assessing risk in quarantine and cleanup. In this paper, we examine the ability of existing host-based anti-virus products to provide semantically meaningful information about the malicious software and tools (or malware) used by attackers. Using a large, recent collection of malware that spans a variety of attack vectors (e.g., spyware, worms, spam), we show that different AV products characterize malware in ways that are inconsistent across AV products, incomplete across malware, and that fail to be concise in their semantics. To address these limitations, we propose a new classification technique that describes malware behavior in terms of system state changes (e.g., files written, processes created) rather than in sequences or patterns of system calls. To address the sheer volume of malware and diversity of its behavior, we provide a method for automatically categorizing these profiles of malware into groups that reflect similar classes of behaviors and demonstrate how behavior-based clustering provides a more direct and effective way of classifying and analyzing Internet malware.
A. Cardaci, "Chrome remote interface," https://github.com/cyrus-and/chrome-remote-interface.
Chrome Debug Team, "Chrome remote debugging," http://bit.ly/2rnmsZx.
Crypto-Loot, "Crypto-loot becomes the number one web miner," https://cryptolootminer.com/news, February 2019.
CryptoNote, "The cryptonight proof of work algorithm," https://cryptonote.org/inside.php.
C. Curtsinger, B. Livshits, B. Zorn, and C. Seifert, "Zozzle: Fast and precise in-browser javascript malware detection," in Proceedings of the 20th USENIX Conference on Security, ser. SEC'11. Berkeley, CA, USA: USENIX Association, 2011, pp. 3-3. [Online]. Available: http://dl.acm.org/citation.cfm?id=2028067.2028070
E. Elliot, "What is web assembly?" June 2015.
S. Eskandari, A. Leoutsarakos, T. Mursch, and J. Clark, "A first look at browser-based cryptojacking," CoRR, vol. abs/1803.02887, 2018. [Online]. Available: http://arxiv.org/abs/1803.02887
JSEcoin, "Digital currency - designed for the web," https://jsecoin.com/en/home/.
C. Kelton, J. Ryoo, A. Balasubramanian, and S. R. Das, "Improving user perceived page load times using gaze," ser. NSDI '17. USENIX Association, 2017, pp. 545-559. [Online]. Available: https://www.usenix.org/conference/nsdi17/technical-sessions/presentation/kelton
R. Keramidas, "Nocoin: a tiny browser extension aiming to block coin miners," https://github.com/keraf/NoCoin, September 2017.
B. Krebs, "Who and what is coinhive?" https://krebsonsecurity.com/2018/03/who-and-what-is-coinhive/, March 2018.
Monero, "Private digital currency," https://www.getmonero.org/.
R. Netravali, V. Nathan, J. Mickens, and H. Balakrishnan, "Vesper: Measuring time-to-interactivity for web pages," in 15th USENIX Symposium on Networked Systems Design and Implementation (NSDI 18). Renton, WA: USENIX Association, Apr. 2018, pp. 217-231. [Online]. Available: https://www.usenix.org/conference/nsdi18/presentation/netravali-vesper
M. D. Network, "The onload property of the GlobalEventHandlers," https://developer.mozilla.org/en-US/docs/Web/API/GlobalEventHandlers/onload.
D. Oksnevad, "Time to interact: A new metric for measuring user experience," http://bit.ly/2Fz0mvd.
S. Panicker, "W3C paint timing working draft," http://bit.ly/2f2CGSk.
PublicWWW, "Search engine for source code," https://publicwww.com/.
D. Sillars, "The performance impact of cryptocurrency mining on the web," https://bit.ly/2SOPQmv, November 2017.
S. S. R. Team, "Insights into the cyber security threat landscape," https://www.symantec.com/blogs/threat-intelligence/istr-23-cyber-security-threat-landscape, March 2018.
G. Wood, "Ethereum: A secure decentralised generalised transaction ledger," https://ethereum.github.io/yellowpaper/paper.pdf, December 2018.
A. Zlatkov, "How javascript works: the building blocks of web workers + 5 cases when you should use them," https://bit.ly/2Hs3eqP, January 2018.