Yuto Otsuki’s scientific contributions

Publications (9)


Script Tainting Was Doomed From The Start (By Type Conversion): Converting Script Engines into Dynamic Taint Analysis Frameworks
  • Conference Paper

October 2022 · 16 Reads · 2 Citations

Yuto Otsuki · [...] · Kanta Matsuura

Automatic Reverse Engineering of Script Engine Binaries for Building Script API Tracers

January 2021 · 22 Reads · 2 Citations

Digital Threats Research and Practice

Script languages are designed to be easy to use and to require low learning costs. These features give attackers many options when choosing a script language for developing malicious scripts. This diversity of choice on the attacker side unexpectedly imposes a significant cost on preparing analysis tools on the defense side. That is, we have to prepare for multiple script languages to analyze malicious scripts written in them. We call this unbalanced cost for script languages the asymmetry problem. To solve this problem, we propose a method for automatically detecting the hook and tap points in a script engine binary that are essential for building a script Application Programming Interface (API) tracer. Our method allows us to reduce the cost of reverse engineering a script engine binary, which is the largest portion of the development of a script API tracer, and to build a script API tracer for a script language with minimum manual intervention. This advantage results in solving the asymmetry problem. The experimental results showed that our method generated the script API tracers for the three script languages popular among attackers (Visual Basic for Applications (VBA), Microsoft Visual Basic Scripting Edition (VBScript), and PowerShell). The results also demonstrated that these script API tracers successfully analyzed real-world malicious scripts.


ROPminer: Learning-Based Static Detection of ROP Chain Considering Linkability of ROP Gadgets

July 2020 · 211 Reads · 4 Citations

IEICE Transactions on Information and Systems

Return-oriented programming (ROP) has been crucial for attackers to evade the security mechanisms of recent operating systems. Although existing ROP detection approaches mainly focus on host-based intrusion detection systems (HIDSes), network-based intrusion detection systems (NIDSes) are also desired to protect various hosts, including IoT devices, on the network. However, existing approaches are not sufficient for network-level protection due to two problems: (1) Dynamic approaches take seconds or minutes on average for inspection, whereas NIDSes require millisecond-order inspection to achieve near-real-time detection. (2) Static approaches generate false positives because they use heuristic patterns, whereas NIDSes must minimize false positives to suppress false alarms. In this paper, we propose a method for statically detecting ROP chains in malicious data by learning the target libraries (i.e., the libraries that are used for ROP gadgets). Our method accelerates inspection by exhaustively collecting feasible ROP gadgets in the target libraries and learning them separately from the inspection step. In addition, we reduce the false positives inevitable in existing static inspection by statically verifying whether a suspicious byte sequence can link properly when executed as a ROP chain. Experimental results showed that our method achieved millisecond-order ROP chain detection with high precision.
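
As a rough illustration of the linkability idea in this abstract, the Python sketch below checks whether each candidate gadget address in a byte sequence sits exactly where the previous gadget would leave the stack pointer. It assumes a 32-bit little-endian target; the gadget table, its addresses, and the helper names `chain_length_at` and `find_rop_chain` are hypothetical. It is a simplified stand-in, not the ROPminer implementation, which learns its model from the target libraries.

```python
import struct

# Hypothetical gadget table (an assumption for illustration):
# address -> bytes the gadget pops from the stack before its final `ret`.
GADGETS = {
    0x10001000: 4,  # e.g. pop eax; ret
    0x10002000: 0,  # e.g. add eax, ecx; ret
    0x10003000: 8,  # e.g. pop ecx; pop edx; ret
}

def chain_length_at(data: bytes, offset: int) -> int:
    """Count how many gadgets link properly starting at `offset`."""
    count = 0
    while offset + 4 <= len(data):
        (addr,) = struct.unpack_from("<I", data, offset)
        if addr not in GADGETS:
            break
        count += 1
        # `ret` consumed this 4-byte address; the gadget then pops its operands,
        # so the next gadget address must sit right after those operands.
        offset += 4 + GADGETS[addr]
    return count

def find_rop_chain(data: bytes, min_gadgets: int = 3):
    """Return the first offset where at least `min_gadgets` link, else None."""
    for offset in range(len(data) - 3):
        if chain_length_at(data, offset) >= min_gadgets:
            return offset
    return None

def pack32(value: int) -> bytes:
    return struct.pack("<I", value)

# Three gadgets laid out so that each one hands control to the next.
linked = (b"AAAA" + pack32(0x10001000) + b"BBBB"
          + pack32(0x10003000) + b"CCCCDDDD" + pack32(0x10002000))
# The same three addresses packed back to back: they do not link.
unlinked = pack32(0x10001000) + pack32(0x10003000) + pack32(0x10002000)
```

Here `find_rop_chain(linked)` locates the chain, while `find_rop_chain(unlinked)` rejects a sequence that a purely pattern-based check (three known gadget addresses in a row) would have flagged.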


EIGER: automated IOC generation for accurate and interpretable endpoint malware detection

December 2019 · 248 Reads · 26 Citations

A malware signature including behavioral artifacts, namely an Indicator of Compromise (IOC), plays an important role in security operations such as endpoint detection and incident response. While building IOCs enables us to detect malware efficiently and perform incident analysis in a timely manner, the process has not been fully automated yet. To address this issue, there are two promising lines of approach: regular expression-based signature generation and machine learning. However, these approaches are limited in accuracy and interpretability, respectively. In this paper, we propose EIGER, a method to generate interpretable yet accurate IOCs from given malware traces. The key idea of EIGER is enumerate-then-optimize. That is, we enumerate representations of potential artifacts as candidates for IOCs. Then, we optimize the combination of these candidates to maximize the two essential properties, i.e., accuracy and interpretability, toward the generation of reliable IOCs. Through an experiment using 162K malware samples collected over five months, we evaluated the accuracy of EIGER-generated IOCs. We achieved a high True Positive Rate (TPR) of 91.98% and a very low False Positive Rate (FPR) of 0.97%. Interestingly, EIGER achieved an FPR of less than 1% even when we used a completely different dataset. Furthermore, we evaluated the interpretability of the IOCs generated by EIGER through a user study, in which we recruited 15 professional security analysts working at a security operation center. The results allow us to conclude that our IOCs are as interpretable as manually generated ones. These results demonstrate that EIGER is practical and deployable in real-world security operations.
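
To make the TPR and FPR figures in this abstract concrete, the Python sketch below scores regular-expression IOC candidates against labeled trace lines and keeps those within a false-positive budget. The trace strings, the candidate patterns, and the helper names `rates` and `pick_candidates` are made up for illustration. The filter-and-rank step is only a crude stand-in for the optimization stage: it is not EIGER's algorithm, which also optimizes for interpretability.

```python
import re

def rates(pattern, malicious, benign):
    """Return (TPR, FPR) of a regex IOC candidate over labeled trace lines."""
    rx = re.compile(pattern)
    true_pos = sum(1 for trace in malicious if rx.search(trace))
    false_pos = sum(1 for trace in benign if rx.search(trace))
    return true_pos / len(malicious), false_pos / len(benign)

def pick_candidates(candidates, malicious, benign, max_fpr=0.01):
    """Keep candidates whose FPR is within budget, highest TPR first."""
    scored = [(p, *rates(p, malicious, benign)) for p in candidates]
    kept = [s for s in scored if s[2] <= max_fpr]
    return sorted(kept, key=lambda s: s[1], reverse=True)

# Hypothetical trace lines (assumptions, not data from the paper).
malicious = [
    r"CreateFile C:\Users\a\AppData\evil.exe",
    r"RegSetValue HKCU\Run\evil",
    r"CreateFile C:\Temp\x.tmp",
]
benign = [
    r"CreateFile C:\Temp\y.tmp",
    r"RegSetValue HKCU\Software\App",
]
candidates = [r"evil", r"CreateFile", r"\\Run\\"]

selected = pick_candidates(candidates, malicious, benign)
```

In this toy data, `CreateFile` matches half of the benign traces and is dropped, while `evil` ranks above `\\Run\\` because it covers more malicious traces.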


My script engines know what you did in the dark: converting engines into script API tracers

December 2019 · 27 Reads · 3 Citations

Malicious scripts have been crucial attack vectors in recent attacks such as malware spam (malspam) and fileless malware. Since malicious scripts are generally obfuscated, statically analyzing them is difficult due to reflections. Therefore, dynamic analysis, which is not affected by obfuscation, is used for malicious script analysis. However, despite its wide adoption, some problems remain unsolved. Current designs of script analysis tools do not fulfill the following three requirements important for malicious script analysis: (1) universally applicable to various script languages, (2) capable of outputting analysis logs that can precisely recover the behavior of malicious scripts, and (3) applicable to proprietary script engines. In this paper, we propose a method for automatically generating a script API tracer by analyzing the target script engine binaries. The method mines the knowledge of script engine internals that is required to append behavior analysis capability. This enables the addition of analysis functionalities to arbitrary script engines and the generation of script API tracers that can fulfill the above requirements. Experimental results showed that we can apply this method for building malicious script analysis tools.


Toward the Analysis of Distributed Code Injection in Post-mortem Forensics

July 2019 · 90 Reads · 4 Citations

Lecture Notes in Computer Science

Distributed code injection is a new type of malicious code injection technique. It makes existing forensics techniques for injected code detection infeasible by splitting malicious code into several code snippets, injecting them into multiple running processes, and executing them in each process space. In spite of its impact on practical forensics, there has been no discussion of countermeasures against this threat. In this paper, we present a memory forensics method for finding all code snippets distributively injected into multiple processes to defeat distributed code injection attacks. Our method is designed on the following observation about distributed code injection attacks: even though malicious code is split and distributed across multiple processes, the split code snippets have to synchronize with each other at runtime to maintain the execution order of the original malicious code. We exploit this characteristic of distributed code injection attacks with our method. The experimental results showed that our method successfully found all distributed code snippets and assisted in reconstructing the original code from them. We believe that we are the first to present a countermeasure against distributed code injection attacks. We also believe that our method is able to improve the efficiency of forensics, especially for a host compromised with distributed code injection attacks.


Fig. 1 Three patterns of API redirection. The top is the case of a normal Windows executable before applying API redirection. (a) Pattern in which the reference of the call instruction is modified, (b) that in which the entry of the IAT is modified, and (c) that in which API redirection is conducted with stolen code.
Fig. 2 How Stealth Loader works and its components. (a) The file layout of an executable before Stealth Loader is embedded, (b) that after Stealth Loader is embedded and the components of Stealth Loader are also described, and (c) the process memory layout after Bootstrap resolves the dependencies of an executable and stealth-loaded DLLs.
Stealth Loader: Trace-free Program Loading for Analysis Evasion
  • Article

September 2018 · 83 Reads · 2 Citations

Journal of Information Processing

Understanding how application programming interfaces (APIs) are used in a program plays an important role in malware analysis. This, however, has resulted in an endless battle between malware authors and malware analysts around the development of API [de]obfuscation techniques over the last few decades. Our goal in this paper is to show the limit of existing API de-obfuscation techniques. To do that, we first analyzed existing API [de]obfuscation techniques and clarified that an attack vector commonly exists in these techniques; then, we present Stealth Loader, which is a program loader to bypass all existing API de-obfuscation techniques. The core idea of Stealth Loader is to load a dynamic link library (DLL) and resolve its dependency without leaving any traces on memory to be detected. We demonstrated the effectiveness of Stealth Loader by analyzing a set of Windows executables and malware protected with Stealth Loader using major dynamic and static analysis tools. The results indicate that among other obfuscation tools, only Stealth Loader is able to successfully bypass all analysis tools.

Building stack traces from memory dump of Windows x64

March 2018 · 1,497 Reads · 14 Citations

Digital Investigation

Stack traces play an important role in memory forensics as well as program debugging. This is because stack traces provide a history of executed code in a malware-infected host, and this history could become a clue for forensic analysts to uncover the cause of an incident, i.e., what malware has actually done on the host. Nevertheless, existing research and tools for building stack traces for memory forensics are not well designed for x64 environments, even though these have already become the most popular environment. In this paper, we introduce the design and implementation of our method for building stack traces from a memory dump of the Windows x64 environment. To build a stack trace, we retrieve the user context of the target thread from a memory dump to determine the start point of the stack trace, and then emulate stack unwinding by referencing the metadata for exception handling to build the call stack of the thread. Even if the metadata are unavailable, which often occurs in the case of malicious software, we manage to produce the equivalent data by scanning the stack with a flow-based verification method. In this paper, we evaluate our method by comparing the stack traces built with it against those built with WinDbg to show its accuracy. We also present case studies using real malware to show the practicability of our method.
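
The fallback described in this abstract, scanning the stack when unwind metadata is unavailable, can be illustrated with a much simpler heuristic than the paper's flow-based verification. The Python sketch below treats an 8-byte stack value as a likely return address if it points into a code region and is immediately preceded by a direct near `call` (opcode `E8` followed by a 4-byte displacement). The code bytes, the base address, and the helper name `scan_return_addresses` are assumptions for illustration, and indirect calls are deliberately ignored.

```python
import struct

def scan_return_addresses(stack: bytes, code: bytes, code_base: int):
    """Return (stack_offset, address) pairs that look like return addresses.

    Simplified heuristic (not the paper's method): the value must point into
    `code`, and the 5 bytes before it must start with E8, i.e. `call rel32`.
    """
    results = []
    for off in range(0, len(stack) - 7, 8):
        (value,) = struct.unpack_from("<Q", stack, off)
        rel = value - code_base
        if 5 <= rel < len(code) and code[rel - 5] == 0xE8:
            results.append((off, value))
    return results

# Hypothetical code region: padding, one `call rel32` at offset 16, padding.
CODE_BASE = 0x140001000
code = b"\x90" * 16 + b"\xE8\x00\x00\x00\x00" + b"\x90" * 11

# Stack slots: a plain data value, a real return address, a pointer into
# the code region that does not follow a call instruction.
stack = struct.pack("<QQQ", 0x1234, CODE_BASE + 21, CODE_BASE + 8)

candidates = scan_return_addresses(stack, code, CODE_BASE)
```

Only the second slot survives the check. A scan like this still produces false candidates on real stacks, which is why a verification step of the kind the abstract mentions is needed.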


Stealth Loader: Trace-Free Program Loading for API Obfuscation

October 2017 · 208 Reads · 10 Citations

Lecture Notes in Computer Science

Understanding how application programming interfaces (APIs) are used in a program plays an important role in malware analysis. This, however, has resulted in an endless battle between malware authors and malware analysts around the development of API [de]obfuscation techniques over the last few decades. Our goal in this paper is to show a limit of existing API de-obfuscation techniques. To do that, we first analyze existing API [de]obfuscation techniques and clarify an attack vector that commonly exists in API de-obfuscation techniques, and then we present Stealth Loader, which is a program loader using our API obfuscation technique to bypass all existing API de-obfuscation techniques. The core idea of this technique is to load a dynamic link library (DLL) and resolve its dependency without leaving any traces on memory to be detected. We demonstrate the effectiveness of Stealth Loader by analyzing a set of Windows executables and malware protected with Stealth Loader using major dynamic and static analysis tools and techniques. The results show that, among obfuscation techniques, only Stealth Loader is able to successfully bypass all analysis tools and techniques.

Citations (9)


... When the data flow reaches a sink point, the simulator knows it. Most works in this area can be divided into three categories: static taint analysis, dynamic taint analysis, and hybrid taint analysis [15][16][17]. As stated in [11], dynamic taint analyses propagate taints while applications are running. ...

Reference:

Cryptocurrency Security Study based on Static Taint Analysis
Script Tainting Was Doomed From The Start (By Type Conversion): Converting Script Engines into Dynamic Taint Analysis Frameworks
  • Citing Conference Paper
  • October 2022

... Macros can also be treated as Visual Basic scripts. With this respect, Usui et al. [36,37] proposed to trace API calls in scripting languages. Their work aims to be universally suitable for a plethora of scripting languages, including Visual Basic. ...

Automatic Reverse Engineering of Script Engine Binaries for Building Script API Tracers
  • Citing Article
  • January 2021

Digital Threats Research and Practice

... Researchers train a convolutional neural network (CNN) or a recurrent neural network (RNN) on a customized dataset consisting of benign samples and crafted gadget chains [31][32][33]. ROPminer [34] statically detects ROP chains by learning the order of ROP components and the byte patterns of each component. The authors build a Hidden Markov Model (HMM) by exhaustively collecting feasible ROP gadgets in the target libraries. ...

ROPminer: Learning-Based Static Detection of ROP Chain Considering Linkability of ROP Gadgets
  • Citing Article
  • July 2020

IEICE Transactions on Information and Systems

... The manual process of verifying false positives can be a daunting exercise; therefore, ML algorithms must achieve high performance and minimize false positives. The validation of false positives can also be incorporated into ML algorithms, as emphasized by Otsuki et al. [55]. ...

EIGER: automated IOC generation for accurate and interpretable endpoint malware detection
  • Citing Conference Paper
  • December 2019

... Macros can also be treated as Visual Basic scripts. With this respect, Usui et al. [36,37] proposed to trace API calls in scripting languages. Their work aims to be universally suitable for a plethora of scripting languages, including Visual Basic. ...

My script engines know what you did in the dark: converting engines into script API tracers
  • Citing Conference Paper
  • December 2019

... The frequent appearance of intrusive advertisements also serves as an indicator of a malware infection. Moreover, there may be an unexplained disk space expansion, resulting in a significant loss of storage capacity [47]. Additionally, certain types of malware grant unauthorized access to the attacker, enabling them to download secondary ...

Toward the Analysis of Distributed Code Injection in Post-mortem Forensics
  • Citing Chapter
  • July 2019

Lecture Notes in Computer Science

... These self-destruction behaviors put forward higher requirements for memory data sampling and detection. Otsuki et al. [13] proposed a method of extracting stack traces from the memory images in a 64-bit Windows system. They demonstrated the effectiveness of Stealth Loader by analyzing a set of Windows executables and malware protected with Stealth Loader using major dynamic and static analysis tools. ...

Stealth Loader: Trace-free Program Loading for Analysis Evasion

Journal of Information Processing

... However, this more performant method may miss artefacts outside the Kernel's current virtual address space. Otsuki et al. (2018) introduced a technique for reconstructing stack traces in Windows x64 memory dumps, with case studies on malware analysis. Fernández-Álvarez and Rodríguez (2023) discussed DLL injection in Windows, a malware capability, and proposed a solution to locate DLLs in a process's memory by combining pages of the same DLL across multiple processes and dumps. ...

Building stack traces from memory dump of Windows x64

Digital Investigation

... First, the call space obfuscation process prunes low-level function nodes and restores special function nodes (2-8); second, it adds bogus low-level function nodes (9-13); then adds function nodes on the call path according to the API internal function move requisites (14-18); and finally returns the obfuscated API internal call graph. ...

Stealth Loader: Trace-Free Program Loading for API Obfuscation
  • Citing Conference Paper
  • October 2017

Lecture Notes in Computer Science