Conference Paper

An Approach to Detect Executable Content for Anomaly Based Network Intrusion Detection.

DOI: 10.1109/IPDPS.2007.370614
Conference: 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), Proceedings, 26-30 March 2007, Long Beach, California, USA
Source: DBLP

ABSTRACT Since current internet threats include not only malicious code such as Trojans and worms, but also spyware and adware that have no explicitly illegal content, a mechanism is needed to prevent hidden executable files from being downloaded in network traffic. In this paper, we present a new solution for identifying executable content in anomaly-based network intrusion detection systems (NIDS), based on file byte frequency distribution. First, a brief introduction to application-level anomaly detection is given, along with some typical examples of how recent attacks compromise user computers. In addition to reviewing the related research on malicious code identification and file type detection in Section 2, we also discuss the drawbacks of applying those methods to NIDS. After that, the background of our approach is presented with examples, and the details of how we create the profile and how we perform the detection are thoroughly discussed. The experimental results are crucial to our research because they provide the essential support for the implementation. In the final experiment, which simulates uploading executable files to an FTP server, our approach performs well in both accuracy and stability.
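To make the idea of a byte frequency profile concrete, the following is a minimal sketch of the general technique, not the authors' implementation. The abstract does not say how the profile is built or compared, so the averaging step, the Manhattan distance, the threshold value, and all function names here are assumptions chosen for illustration.

```python
from collections import Counter
from typing import Iterable, List, Sequence


def byte_frequency(data: bytes) -> List[float]:
    """Return the normalized frequency of each of the 256 possible byte values."""
    if not data:
        return [0.0] * 256
    counts = Counter(data)  # iterating over bytes yields integers 0..255
    total = len(data)
    return [counts.get(value, 0) / total for value in range(256)]


def build_profile(samples: Iterable[bytes]) -> List[float]:
    """Average the byte frequency distributions of known executable samples.

    Assumption: a simple mean is used as the profile; the paper may differ.
    """
    distributions = [byte_frequency(sample) for sample in samples]
    if not distributions:
        raise ValueError("at least one sample is required")
    return [sum(column) / len(distributions) for column in zip(*distributions)]


def manhattan_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Sum of absolute differences between two distributions (range 0.0 to 2.0)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def looks_executable(payload: bytes, profile: Sequence[float],
                     threshold: float = 0.5) -> bool:
    """Flag a payload whose distribution lies within `threshold` of the profile.

    Assumption: the threshold is hypothetical and would need tuning on real data.
    """
    return manhattan_distance(byte_frequency(payload), profile) <= threshold
```

In use, `build_profile` would be given the contents of known executable files, and `looks_executable` would then be applied to each reassembled payload seen on the network. Any realistic deployment would need a threshold chosen from measured false positive and false negative rates.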

  •
    ABSTRACT: Over 20 studies have been published in the past decade involving file and data type classification for digital forensics and information security applications. Methods using n-grams as inputs have proven the most successful across a wide variety of types; however, there are mixed results regarding the utility of unigrams and bigrams as inputs independently. In this study, we use support vector machines (SVMs) consisting of unigrams and bigrams, as well as complexity and other byte frequency-based measures, as inputs. Using concatenated unigrams and bigrams as input and a linear kernel SVM, we achieve significantly improved results over those previously reported (73.4% classification rate across 38 file and data types). We are the first to use concatenated n-grams as the sole input, and we show their superiority over inputs used previously. We also found that too many different types of features as inputs result in overfitting and poor generalization properties. We include several types seldom or not studied in the past (Microsoft Office 2010 files, file system data, base64, base85, URL encoding, flash video, M4A, MP4, WMV, and JSON records). The “winning” approach is instantiated in an open source software tool called Sceadan - Systematic Classification Engine for Advanced Data ANalysis.
    IEEE Transactions on Information Forensics and Security 01/2013; 8(9):1519-1530. · 1.90 Impact Factor
  •
    ABSTRACT: Digital information is packed into files when it is stored on storage media. Each computer file is associated with a type. Type detection of computer data is a building block in different applications of computer forensics and security. Traditional methods were based on file extensions and metadata. The content-based method is a newer approach with the lowest probability of being spoofed and is the only way to detect the type of data packets and file fragments. In this paper, a content-based method that deploys principal component analysis and neural networks for automatic feature extraction is proposed. The extracted features are then applied to a classifier for type detection. Our experiments show that the proposed method works very well for type detection of computer files when considering the whole content of a file. Its accuracy and speed are also significant for the case of file fragments, where data is captured from random starting points within files, but the accuracy differs according to the lengths of the file fragments. Copyright © 2012 John Wiley & Sons, Ltd.
    Security and Communication Networks 01/2013; 6(1):115-128. · 0.43 Impact Factor
