Weiwei Zhuang’s research while affiliated with Xiamen University and other places

Publications (3)


Ensemble Clustering for Internet Security Applications
  • Article
  • November 2012
  • 175 Reads
  • 69 Citations

IEEE Transactions on Systems Man and Cybernetics Part C (Applications and Reviews)

Weiwei Zhuang · Yanfang Ye · Yong Chen · Tao Li

Because of the damage they cause to Internet security, malware and phishing websites have become detection topics of great interest. Compared with malware attacks, phishing website fraud is a relatively new Internet crime. However, the two share some common properties: 1) both malware samples and phishing websites are created at a rate of thousands per day, driven by economic benefits; and 2) phishing websites represented by the term frequencies of the webpage content share similar characteristics with malware samples represented by the instruction frequencies of the program. Over the past few years, many clustering techniques have been employed for automatic malware and phishing website detection. In these techniques, the detection process is generally divided into two steps: 1) feature extraction, where representative features are extracted to capture the characteristics of the file samples or the websites; and 2) categorization, where intelligent techniques are used to automatically group the file samples or websites into different classes based on computational analysis of the feature representations. However, few of these techniques have been applied in real industry products. In this paper, we develop an automatic categorization system that groups phishing websites or malware samples using a cluster ensemble, which aggregates the clustering solutions generated by different base clustering algorithms. We propose a principled cluster ensemble framework that combines individual clustering solutions based on the consensus partition, and that can be applied not only to malware categorization but also to phishing website clustering. In addition, domain knowledge in the form of sample-level/website-level constraints can be naturally incorporated into the ensemble framework. Case studies on large, real daily collections of phishing websites and malware from the Kingsoft Internet Security Laboratory demonstrate the effectiveness and efficiency of our proposed method.
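The idea of aggregating base clusterings into a consensus partition can be illustrated with a small sketch. The code below is not the framework from the paper: it assumes a simple co-association scheme, in which two samples are linked when more than a `threshold` fraction of the base clusterings put them in the same cluster, and it treats analyst constraints as overrides on individual pairs. The function name, parameters, and toy labels are all hypothetical.

```python
from itertools import combinations


def consensus_partition(partitions, threshold=0.5, must_link=(), cannot_link=()):
    """Combine several base clusterings into one consensus labelling.

    partitions  : list of label lists, one per base clustering algorithm
    threshold   : fraction of base clusterings that must agree to link a pair
    must_link   : pairs (i, j) assumed to belong together (hypothetical input)
    cannot_link : pairs (i, j) assumed to stay apart (hypothetical input)
    """
    n = len(partitions[0])
    forced_together = {frozenset(p) for p in must_link}
    forced_apart = {frozenset(p) for p in cannot_link}

    # Union-find structure: linked samples end up sharing a root.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        pair = frozenset((i, j))
        if pair in forced_apart:
            continue  # never link this pair directly
        agreement = sum(p[i] == p[j] for p in partitions) / len(partitions)
        if pair in forced_together or agreement > threshold:
            parent[find(i)] = find(j)

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    labels, seen = [], {}
    for i in range(n):
        labels.append(seen.setdefault(find(i), len(seen)))
    return labels


# Three toy base clusterings of six samples (made-up labels).
base = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 2, 2],
    [1, 1, 1, 0, 0, 0],
]
print(consensus_partition(base))
print(consensus_partition(base, must_link=[(2, 3)]))
```

One limitation of this sketch: a cannot-link pair is only protected from a direct merge, so the two samples can still end up together through a chain of other links. A constraint-aware ensemble would need to handle that case explicitly.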


Combining file content and file relations for cloud based malware detection

  • August 2011
  • 213 Reads
  • 95 Citations

Yanfang Ye · Tao Li · [...]

Because of the damage it causes to Internet security, malware (such as viruses, worms, trojans, spyware, backdoors, and rootkits) and its detection have held the attention of both the anti-malware industry and researchers for decades. Building on the analysis of file contents extracted from file samples, such as Application Programming Interface (API) calls, instruction sequences, and binary strings, data mining methods such as Naive Bayes and Support Vector Machines have been used for malware detection. However, besides file contents, relations among file samples (for example, a "Downloader" is always associated with many Trojans) can provide invaluable information about the properties of file samples. In this paper, we study how file relations can be used to improve malware detection results and develop a file verdict system (named "Valkyrie") that builds on a semi-parametric classifier model to combine file content and file relations for malware detection. To the best of our knowledge, this is the first work to use both file content and file relations for malware detection. A comprehensive experimental study on a large collection of PE files obtained from the clients of anti-malware products of Comodo Security Solutions Incorporation is performed to compare various malware detection approaches. Promising experimental results demonstrate that our Valkyrie system outperforms, in accuracy and efficiency, other popular anti-malware software tools such as Kaspersky AntiVirus and McAfee VirusScan, as well as alternative data mining based detection systems.
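To make the intuition concrete, here is a minimal sketch of how a content-based score might be adjusted by the scores of related files. This is not the Valkyrie model: the abstract does not describe the semi-parametric classifier, so the sketch assumes a plain weighted blend instead. The function name, the weight `alpha`, the file names, and the scores are all hypothetical.

```python
def combined_score(file_id, content_scores, relations, alpha=0.7):
    """Blend a file's content-based score with the mean score of related files.

    content_scores : dict mapping file id -> maliciousness score in [0, 1],
                     assumed to come from any content-based classifier
    relations      : dict mapping file id -> iterable of related file ids
    alpha          : weight given to the file's own content score (assumed value)
    """
    neighbours = relations.get(file_id, ())
    known = [content_scores[n] for n in neighbours if n in content_scores]
    if not known:
        # No relation evidence available: fall back to content alone.
        return content_scores[file_id]
    relation_score = sum(known) / len(known)
    return alpha * content_scores[file_id] + (1 - alpha) * relation_score


# Made-up scores: the downloader looks ambiguous on content alone,
# but it is related to two files that score as highly malicious.
content_scores = {
    "downloader.exe": 0.4,
    "trojan_a.exe": 0.9,
    "trojan_b.exe": 1.0,
    "notepad.exe": 0.1,
}
relations = {"downloader.exe": ["trojan_a.exe", "trojan_b.exe"]}

print(combined_score("downloader.exe", content_scores, relations))
print(combined_score("notepad.exe", content_scores, relations))
```

In this toy setup the downloader's score rises above its content-only value because of its neighbours, while a file with no recorded relations keeps its content score unchanged.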


Associative Classification and Post-processing Techniques used for Malware Detection

  • September 2008
  • 27 Reads
  • 9 Citations

Numerous malware attacks have presented serious threats to the security of computer users. Unfortunately, as malware writing techniques develop, the number of file samples that need to be analyzed increases daily. An automatic and robust tool to analyze and classify the file samples is urgently needed. In this paper, building on the analysis of Windows API execution sequences called by PE files, we use associative classification and post-processing techniques for malware detection. Promising experimental results demonstrate that our malware detection method outperforms, in accuracy and efficiency, popular anti-virus scanners such as Norton AntiVirus and Dr. Web, as well as previous data mining based detection systems that employed Naive Bayes, Support Vector Machine (SVM), and Decision Tree techniques. In particular, the post-processing techniques we adopt can greatly reduce the number of generated rules, which makes it easy for human analysts to identify the useful ones.
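As an illustration of associative classification followed by rule post-processing, the sketch below mines class association rules of the form "set of API calls → label" and then drops redundant rules. It is a generic example, not the method from the paper: it assumes each sample is reduced to an unordered set of API calls rather than an execution sequence, uses brute-force enumeration, and applies one common pruning heuristic (remove a rule when a strictly more general rule for the same class is at least as confident). The function names, thresholds, and sample data are hypothetical.

```python
from itertools import combinations


def mine_rules(samples, min_support=0.3, min_confidence=0.8, max_len=2):
    """Enumerate class association rules (API set -> label).

    samples : list of (set_of_api_calls, label) pairs
    Returns a list of (antecedent, label, support, confidence) tuples.
    """
    n = len(samples)
    apis = sorted({api for calls, _ in samples for api in calls})
    rules = []
    for size in range(1, max_len + 1):
        for itemset in combinations(apis, size):
            items = frozenset(itemset)
            # Labels of every sample whose API set contains the antecedent.
            covered = [label for calls, label in samples if items <= calls]
            for label in sorted(set(covered)):
                hits = covered.count(label)
                support = hits / n
                confidence = hits / len(covered)
                if support >= min_support and confidence >= min_confidence:
                    rules.append((items, label, support, confidence))
    return rules


def prune_redundant(rules):
    """Drop rules covered by a more general, equally confident rule."""
    kept = []
    for items, label, support, confidence in rules:
        redundant = any(
            g_items < items and g_label == label and g_conf >= confidence
            for g_items, g_label, _, g_conf in rules
        )
        if not redundant:
            kept.append((items, label, support, confidence))
    return kept


# Made-up training samples: API call sets with a known verdict.
samples = [
    ({"CreateRemoteThread", "WriteProcessMemory"}, "malicious"),
    ({"CreateRemoteThread", "WriteProcessMemory", "RegSetValue"}, "malicious"),
    ({"CreateRemoteThread", "OpenProcess"}, "malicious"),
    ({"CreateFile", "ReadFile"}, "benign"),
    ({"CreateFile", "RegSetValue"}, "benign"),
]

rules = mine_rules(samples)
kept = prune_redundant(rules)
for items, label, support, confidence in kept:
    print(sorted(items), "->", label, support, confidence)
```

On this toy data the two-item rule is pruned because the single-item rule on `CreateRemoteThread` already predicts the same class with the same confidence, which mirrors the goal of leaving analysts a shorter rule list to review.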

Citations (3)


... In [13], the authors create an automatic categorization system to automatically group phishing websites or malware samples into families with common characteristics using a cluster ensemble. Their approach combines the individual clustering solutions produced by different algorithms using a cluster ensemble. ...

Reference:

Online Clustering of Known and Emerging Malware Families
Ensemble Clustering for Internet Security Applications
  • Citing Article
  • November 2012

IEEE Transactions on Systems Man and Cybernetics Part C (Applications and Reviews)

... malware detection [56], text classification [4], phishing websites detection [3,5,19], and breast cancer prediction [7]. In [56], where AC was used for malware detection, the analyzed data set represents Windows portable executable files, having two class labels: benign or malicious executables. ...

Associative Classification and Post-processing Techniques used for Malware Detection
  • Citing Conference Paper
  • September 2008

... Broadly, the complexity of malware detection and the impossibility of unbiased sampling of the population at large [22] have led to a wide variety of papers looking at alternative featurization techniques to meet different challenges. This includes custom dynamic analysis for less-common malware vectors like C# [9], byte features, and miss-assumptions about packing [1], using the file path [38,29,27], and using cooccurrence of other files [57,53,26], amongst many others. Our work continues this long-term trend of looking at alternative means of obtaining predictive information for malware detectors, targeting simplicity in fitting into existing processes. ...

Combining file content and file relations for cloud based malware detection
  • Citing Conference Paper
  • August 2011