Conference Paper

An Efficient Piecewise Hashing Method for Computer Forensics

SouthWest JiaoTong Univ., Chengdu
DOI: 10.1109/WKDD.2008.80
Conference: Knowledge Discovery and Data Mining, 2008. WKDD 2008. International Workshop on
Source: IEEE Xplore

ABSTRACT Hashing, a basic tool in computer forensics, is used to ensure data integrity and to identify known data objects efficiently. Unfortunately, a file that has been intentionally modified, even slightly, cannot be identified using this traditional technique. Context triggered piecewise hashing separates a file into pieces using local context characteristics and produces a sequence of hashes as the hash signature. The hash signature can be used to identify similar files that differ by small modifications such as insertions, replacements, and deletions. The algorithm of the currently available scheme was designed for junk mail detection; it is inefficient and not well suited to file system investigation. In this paper, an improved algorithm based on the Store-Hash and Rehash idea is developed for the context triggered piecewise hashing technique. Experimental results show that the new scheme outperforms spamsum in both speed and similarity detection ability. It is valuable for forensic practice.
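
The general idea of context triggered piecewise hashing described in the abstract can be illustrated with a minimal Python sketch. It is a simplified illustration only: it is neither spamsum's actual rolling hash nor the Store-Hash and Rehash algorithm proposed in the paper. The names `context_hash` and `piecewise_signature`, the 7-byte window, the default block size of 64, and the choice of FNV-1a for the piece hash are all assumptions made for this example.

```python
# Illustrative sketch of context triggered piecewise hashing (CTPH).
# All constants and function names here are assumptions for the example.
B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
WINDOW = 7                # bytes of local context examined at each position
FNV_OFFSET = 0x811C9DC5   # 32-bit FNV-1a offset basis
FNV_PRIME = 0x01000193    # 32-bit FNV-1a prime


def context_hash(window: bytes) -> int:
    """Hash of the last few bytes only. Recomputed from scratch here for
    clarity; practical implementations update it incrementally."""
    h = 0
    for b in window:
        h = (h * 31 + b) & 0xFFFFFFFF
    return h


def piecewise_signature(data: bytes, block_size: int = 64) -> str:
    """Split data at context-triggered points and emit one character
    per piece, derived from that piece's own hash."""
    chars = []
    piece = FNV_OFFSET
    pending = False  # True while the current piece has unhashed-out bytes
    for i, b in enumerate(data):
        piece = ((piece ^ b) * FNV_PRIME) & 0xFFFFFFFF
        pending = True
        window = data[max(0, i - WINDOW + 1): i + 1]
        # Trigger: the local context alone decides where a piece ends.
        if context_hash(window) % block_size == block_size - 1:
            chars.append(B64[piece & 63])
            piece = FNV_OFFSET
            pending = False
    if pending:  # emit the trailing piece, if any
        chars.append(B64[piece & 63])
    return "".join(chars)
```

Because piece boundaries depend only on nearby bytes, an insertion or deletion changes the signature characters for the pieces it touches, while pieces before it (and, once the boundaries resynchronize, pieces after it) keep their characters. Two signatures can then be compared with a string similarity measure such as edit distance.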

  • ABSTRACT: In this paper we consider the problem of improving Web performance and propose an efficient differencing and merging system (DMS) based on an HTTP protocol extension. To provide for faster information exchange over the Web, the system tries to transfer only computed differences between requested documents and previously retrieved documents from the same site. Analysis and experimental results prove the effectiveness of DMS, but also show bigger processor and memory load on servers and clients. DMS is compatible with most of the existing solutions for improving Web performance. Moreover, SSL security system may be used to provide Web privacy and authenticity. The DMS model is simple to use and can be relatively easily integrated in Web servers and browsers.
  • ABSTRACT: This book constitutes the refereed proceedings of the Third International ICST Conference, e-Democracy 2009, held in Athens, Greece, in September 2009. The 40 revised full papers presented were carefully reviewed. The papers are organized in topical sections on politics - legislation - regulatory framework I; enhancing quality of life through e-services; politics - legislation - regulatory framework II; supporting democracy through e-services; identity management, privacy and trust; security, attacks and crime; e-government & local e-government; education and training; collaboration, social networking, blogs; pervasive, ubiquitous, and intelligent computing.
    Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, edited by Alexander B. Sideridis, Charalampos Z. Patrikakis, 01/2010; Springer Berlin Heidelberg, ISBN: 978-3-642-11629-2
  • ABSTRACT: Bytewise approximate matching is a relatively new area within digital forensics, but its importance is growing quickly as practitioners are looking for fast methods to screen and analyze the increasing amounts of data in forensic investigations. The essential idea is to complement the use of cryptographic hash functions to detect data objects with bytewise identical representation with the capability to find objects with bytewise similar representations. Unlike cryptographic hash functions, which have been studied and tested for a long time, approximate matching ones are still in their early development stages and evaluation methodology is still evolving. Broadly, prior approaches have used either a human in the loop to manually evaluate the goodness of similarity matches on real world data, or controlled (pseudo-random) data to perform automated evaluation. This work's contribution is to introduce automated approximate matching evaluation on real data by relating approximate matching results to the longest common substring (LCS). Specifically, we introduce a computationally efficient LCS approximation and use it to obtain ground truth on the t5 set. Using the results, we evaluate three existing approximate matching schemes relative to LCS and analyze their performance.
    Digital Investigation 05/2014; 11. DOI:10.1016/j.diin.2014.03.002 · 0.99 Impact Factor