Conference Paper

A Study of Malcode-Bearing Documents

DOI: 10.1007/978-3-540-73614-1_14
Conference: Detection of Intrusions and Malware, and Vulnerability Assessment, 4th International Conference, DIMVA 2007, Lucerne, Switzerland, July 12-13, 2007, Proceedings
Source: DBLP


By exploiting the object-oriented dynamic composability of modern document applications and formats, malcode hidden in otherwise inconspicuous documents can reach third-party applications that may harbor exploitable vulnerabilities otherwise unreachable by network-level service attacks. Such attacks can be very selective and difficult to detect compared to the typical network worm threat, owing to the complexity of these applications and data formats, as well as the multitude of document-exchange vectors. As a case study, this paper focuses on Microsoft Word documents as malcode carriers. We investigate the possibility of detecting embedded malcode in Word documents using two techniques: static content analysis using statistical models of typical document content, and run-time dynamic tests on diverse platforms. The experiments demonstrate these approaches can not only detect known malware, but also most zero-day attacks. We identify several problems with both approaches, representing both challenges in addressing the problem and opportunities for future research.
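The abstract does not say which statistical model is used for static content analysis. As a rough illustration only, the sketch below assumes a byte n-gram model: it records the n-grams seen in benign training documents and scores a new document by the fraction of its n-grams that were never seen. The function names (`train`, `novelty_score`), the n-gram length, and the scoring rule are all hypothetical and are not taken from the paper.

```python
from typing import Iterable, Iterator, Set


def ngrams(data: bytes, n: int = 3) -> Iterator[bytes]:
    """Yield every overlapping n-byte slice of data."""
    return (data[i:i + n] for i in range(len(data) - n + 1))


def train(benign_docs: Iterable[bytes], n: int = 3) -> Set[bytes]:
    """Build a model: the set of n-grams observed in benign documents.

    Assumption: a plain set of seen n-grams stands in for the
    "statistical model of typical document content".
    """
    seen: Set[bytes] = set()
    for doc in benign_docs:
        seen.update(ngrams(doc, n))
    return seen


def novelty_score(model: Set[bytes], data: bytes, n: int = 3) -> float:
    """Return the fraction of n-grams in data that the model has never seen.

    0.0 means every n-gram was seen in training; 1.0 means none were.
    Inputs shorter than n bytes have no n-grams and score 0.0.
    """
    grams = list(ngrams(data, n))
    if not grams:
        return 0.0
    unseen = sum(1 for g in grams if g not in model)
    return unseen / len(grams)


if __name__ == "__main__":
    model = train([b"hello world hello world"])
    print(novelty_score(model, b"hello world"))
    print(novelty_score(model, b"\x90\x90\x90\x90"))
```

A real detector would have to parse the Word file structure, train on a large benign corpus, and pick a threshold. None of that is shown here.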


Available from: Salvatore J. Stolfo
    • "The proposal of Polychronakis et al. [4] for extracting shellcode does not have a narrowing-down method, because it uses network packet extraction. The proposal of Li et al. [5] analyzes file structures, but not for the purpose of extracting shellcode. "
    ABSTRACT: We propose a method for the dynamic analysis of malicious documents that can exploit various types of vulnerability in applications. Static analysis of a document can be used to identify the type of vulnerability involved. However, it can be difficult to identify unknown vulnerabilities, and the application may not be available even if we could identify the vulnerability. In fact, malicious code that is executed after the exploitation may not have a relationship with the type of vulnerability in many cases. In this paper, we propose a method that extracts and executes “shellcode” to analyze malicious documents without requiring identification of the vulnerability or the application. Our system extracts shellcode by executing byte sequences to observe the features of a document file in a priority order decided on the basis of entropy. Our system was used to analyze 88 malware samples and was able to extract shellcode from 74 samples. Of these, 51 extracted shellcodes behaved as malicious software according to dynamic analysis.
    Proceedings of the 4th International Conference on Security Science and Technology (ICSST2015); 01/2015
    • "[...] Office and other productivity suites from the mid-1990s to the early 2000s [17]. One of the factors that led to the extinction of macro viruses was the additional security measures and protections that were gradually being applied to newer versions of the affected applications."
    ABSTRACT: The widespread adoption of the PDF format for document exchange has given rise to the use of PDF files as a prime vector for malware propagation. As vulnerabilities in the major PDF viewers keep surfacing, effective detection of malicious PDF documents remains an important issue. In this paper we present MDScan, a standalone malicious document scanner that combines static document analysis and dynamic code execution to detect previously unknown PDF threats. Our evaluation shows that MDScan can detect a broad range of malicious PDF documents, even when they have been extensively obfuscated.
    • "Due to their heavy instrumentation and security risks associated with dynamic analysis, the practical applicability of such approaches is limited to malware research systems. For the end-user systems, some early work on the detection of potential exploits in PDF documents [13] [21] has gone largely unnoticed, and in practice the detection of malicious PDF documents still hinges upon signatures provided by security vendors."
    ABSTRACT: Despite the recent security improvements in Adobe's PDF viewer, its underlying code base remains vulnerable to novel exploits. A steady flow of rapidly evolving PDF malware observed in the wild substantiates the need for novel protection instruments beyond the classical signature-based scanners. In this contribution we present a technique for detection of JavaScript-bearing malicious PDF documents based on static analysis of extracted JavaScript code. Compared to previous work, mostly based on dynamic analysis, our method incurs an order of magnitude lower run-time overhead and does not require special instrumentation. Due to its efficiency we were able to evaluate it on an extremely large real-life dataset obtained from the VirusTotal malware upload portal. Our method has proved to be effective against both known and unknown malware and suitable for large-scale batch processing.
    Twenty-Seventh Annual Computer Security Applications Conference, ACSAC 2011, Orlando, FL, USA, 5-9 December 2011; 01/2011
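The ICSST2015 abstract above describes trying byte sequences of a document in a priority order decided on the basis of entropy. A minimal sketch of that ranking step follows, assuming fixed-size windows and Shannon entropy over byte values. The window size, the function names, and the choice to rank higher entropy first are assumptions made for illustration. The abstract does not state them, and the extraction and execution of shellcode is not shown.

```python
import math
from collections import Counter
from typing import List, Tuple


def shannon_entropy(chunk: bytes) -> float:
    """Shannon entropy of the byte-value distribution, in bits per byte (0 to 8)."""
    if not chunk:
        return 0.0
    total = len(chunk)
    counts = Counter(chunk)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def rank_windows(data: bytes, window: int = 256, step: int = 256) -> List[Tuple[float, int]]:
    """Score each window of data and return (entropy, offset) pairs.

    Assumption: candidates are ordered from highest to lowest entropy.
    The cited abstract only says the order is based on entropy.
    """
    scored = [
        (shannon_entropy(data[offset:offset + window]), offset)
        for offset in range(0, len(data), step)
    ]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    # Hypothetical input: 256 zero bytes followed by 256 distinct byte values.
    sample = b"\x00" * 256 + bytes(range(256))
    for entropy, offset in rank_windows(sample):
        print(f"offset={offset:4d} entropy={entropy:.2f}")
```

Reversing the sort gives the opposite priority if low-entropy regions are the more promising candidates for a given document format.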