Study of Malware Threats Faced by the Typical Email User

Anthony Ayodele, James Henrydoss, Walter Schrier, T.E. Boult
Department of Computer Science,
University of Colorado at Colorado Springs
Colorado Springs, CO, USA
{aayodele, jhenrydo, wscheire, tboult@vast }@uccs.edu
Abstract. Understanding malware behavior helps in implementing robust
intrusion detection and prevention systems. In this paper, we study the
behavioral characteristics of different malware types affecting the Internet and
enterprise email systems. The research was carried out on spam email
data received by a single user's test email account over a period of six
months. We built a sandbox test environment using virtual machines
to simulate real-life malware behavior and to determine each sample's
signature at the point of execution for proper analysis. Analyzing the email
data in this sandbox setup produces comprehensive data
about botnet behavior. We describe in detail the design and implementation of
the sandbox test environment, including the challenges faced in building it.
As a cost-saving measure, we used VMware-based virtual
platforms built on Linux PC-class hardware. We present results of our
behavioral measurement of the most active botnets. Our study found that,
for a single email user over a period of six months, two active Trojans
contributed around 20 percent of the total identified malware received within
this period, while the remaining 80 percent of malware binaries were
distributed over many different types of botnets; the email malware shows a
classic long-tail distribution. During this experiment, we also observed very
strong polymorphic behavior in these malware samples, ostensibly
intended to help malware authors and hackers penetrate and bypass
enterprise intrusion detection systems. Finally, we are releasing the repository
of collected malware as a data set for evaluation by other researchers.
Keywords: Malware, Intrusion Detection, Botnet
1 INTRODUCTION
Malware is a malicious software agent that runs hidden and uses a secret communication
channel to communicate with its Command and Control (C&C) center, which is
typically a server that uses either Internet Relay Chat (IRC) or web-based access to
control the remote machines. There is an ongoing quest to find mechanisms to protect
the current and next-generation Internet against botnets and their malicious,
fraudulent activities [15]. A bot is a compromised machine that can be controlled
remotely by C&C servers; the main intent is to inflict damage on the target client or
end user for fraudulent purposes. The terms bot and malware are used interchangeably
in this paper, even though some consider malware to be the software and bots to be
the malware-infested machines. The name 'bot' is derived from the term 'robot', which
means an automatic worker, and refers to a single malware agent that functions
automatically and autonomously. 'Botnet' refers to a collection of bots or a group of
software agents. It is a network of malicious computers, also known as zombies,
which are remote machines hacked, compromised, and controlled over the Internet by
command and control (C&C) servers called 'botmasters'. Botnets often spread by
infecting tens of thousands of computers that lie dormant until commanded to action
by the attacker via the botmaster. The attacker is the main malware originator, who
instructs the bots or zombie army to carry out various malicious activities on remote
machines and then report back to the command centers. The remote computer's
owner is completely unaware of these bot agents, which hide their behavior.
Botnet-infected machines open a backdoor to listen for commands issued by
attackers. By sending commands, the botmaster can take control of all or part of the
infected systems and direct them to perform tasks such as distributing spam,
spying on computers to steal identities and personal information, launching attacks
on other computers using distributed Denial of Service (DoS) and SYN attacks, and
carrying out commercial advertising activities (i.e., adware). In addition, attackers
can use these compromised machines to attack newer machines by
seeding new bots, scanning for new victims, stealing confidential information from
users, performing DDoS (Distributed Denial of Service) attacks, hosting web servers
and phishing content, and propagating updates to the botnet software itself [1],[4].
1.1 Malicious Activities
The following summarizes the malicious activities performed by malware on
compromised machines: Denial of Service (DoS) attacks, TCP SYN attacks,
DDoS attacks using a distributed framework, phishing for fraudulent activities (e.g.,
identity theft, credit card fraud and e-fraud), coordinated spam attacks, recruiting
other bots and zombies, and upgrades of the existing software by malware authors to
achieve polymorphic behavior. Even though the majority of bots inflict damage, a
few bots can be used for good purposes and business use. Googlebot, the web
crawler that finds and retrieves pages on the web before handing
them off to the Google indexer, is categorized as a good bot [4] and does not infect
end user machines.
1.2 Propagation Method
Bots can automatically scan their environment and propagate themselves using
vulnerabilities and weak passwords. The more vulnerabilities a bot can scan for and
propagate through, the more valuable it becomes to the botnet controller community.
Stealing the computing resources of a computer system by joining it to a botnet
is sometimes referred to as “scrumping”. Bots typically spread
through the following mechanisms: Internet and email downloads and attachments,
software installations from untrusted sources, scanning, exploiting and planting, IM
and social networking sites (e.g., Facebook, Twitter and Skype), and SMS, MMS and
mobile email targeting smartphones (e.g., BlackBerry, iPhone and Android-based
phones).
1.3 Types of Bots
Malware comes in different disguises and formats, such as annoying pop-up
ads, spyware and viruses that can be used to steal your identity or track your
activities [1]. There are many different types of malware, and they can be categorized
by the method they use to propagate on the Internet or by how they communicate.
Bots can use various topologies, such as centralized, peer-to-peer (P2P) or randomized,
and can operate on different communication protocols, such as HTTP, IRC, and P2P.
These popular methods are used to control and issue commands to a large
number of bots at a time via various kinds of controlling mechanisms [3],[4],[5].
Internet Relay Chat (IRC) Bots.
Internet Relay Chat (IRC) is a popular Internet chat system defined in Internet
Engineering Task Force (IETF) RFC 1459. IRC is based on a client-server model
and enables communication between server and clients. For botnets, it enables direct
communication between clients (bots) and the C&C server using a command and
control method. An IRC botnet is based on a centralized server architecture. These
botnets can be very large and powerful, consisting of thousands of bots. Their large
size, however, makes them more vulnerable to detection and takedown. Communication
between the bots and the C&C server is often encrypted. Examples include darknet.org,
cyberarmy.net, eggdrop and winbot. The eggdrop (open source IRC bot) and Win
bots (Windows bots) were built for good purposes and not for fraudulent activities
[2].
Peer-to-Peer (P2P) Bots.
P2P is a new and emerging technology for bot communication. These bots are
based on a distributed network and produce smaller traffic patterns than large IRC
botnets, which makes them very difficult to detect and destroy; examples include
Nugache and Sinit. These bots communicate with each other using P2P protocols.
P2P bots use only TCP or UDP server ports to open connections between bots and
the C&C servers [4].
Web based HTTP Bots.
Web-based bots are primarily used for launching DDoS attacks using the HTTP protocol.
These HTTP botnets use HTML over HTTP to communicate between the C&C server and
the client bots. They naturally blend into normal Internet web traffic, so
detecting these bots is a daunting task. They form the basis of the next-generation
botnet architecture, providing 'botnet 2.0' for the herders. They are more accessible
to those who didn't grow up with IRC. Examples include Machbot, Barracuda, and
BlackEnergy [4].
The rest of this paper is organized as follows. Section 2 covers the relevant
research work. Section 3 covers the sandbox test environment used for this
experiment. Section 4 provides a detailed analysis and the findings of this research
work. Section 5 summarizes the work presented in this paper, and section 6
discusses potential future directions of this work.
2 RELATED WORK
Botlab gathers multiple real-time streams of information about botnets taken from
distinct perspectives. By combining and analyzing these streams, Botlab produces
accurate, timely, and comprehensive data about spam botnet behavior [1]. The
Honeypot project yields a rich source of new malware samples and analysis
methods [15]. Our current implementation does not follow the real-time streaming
approach of these systems, but rather focuses on email malware analysis. Work on the
design and architecture of malware analysis environments explores effective lab
environments that go beyond the traditional two- or three-system VM-based lab [11].
The Nepenthes platform is a framework for large-scale collection of
information on self-replicating malware in the wild. Using this platform, several other
organizations were able to greatly broaden the empirical analysis of self-replicating
malware and provide thousands of samples of previously unknown malware to
anti-virus system vendors [7].
3 EXPERIMENTS
This section provides a detailed explanation of the sandbox test environment setup
and of how we collected the malware sample binaries, along with the test results.
3.1 Malware Sample Collection
The malware samples used for this research were collected using a single user account
created in the lab email system, with the goal of seeing the types of malware a
single user faces. The Apache open source spam filter “SpamAssassin” was used to
collect the spam files from the email MTA (Mail Transfer Agent) systems, which run
on a Windows mail server and filter the spam before it reaches the user's mailbox [9].
We used the spam folder of this dedicated test email account to collect six months of
spam email binaries (January - June 2010) in zipped form for analysis. Each
month's binary malware data is stored in a separate month_name.zip file for further
malware processing. We chose this method over a honeypot setup so that we could
collect the spam emails received by a real email user. A honeypot automatically
feeds spam data directly from the Internet, but it does not provide the filtered,
email-only traffic needed for our analysis, which focuses on spam messages received
by a single email user. Most users use email; few run unprotected machines as honeypots.
3.2 Experiment Setup
A closed network with reverse firewall functionality was used in this research for
better analysis of malware behavior. This closed network of “sandboxes” consists of a
total of five machines: one gateway machine and four user-assigned test
machines. The sandbox test environment was built using VMware virtual machines
running on the Ubuntu 10 Linux operating system on single-server hardware. The
malware analysis platforms used for testing the malware binaries were installed with
Windows XP. Running multiple virtual operating systems
simultaneously on a single physical computer was useful for real-time analysis of
malware that seeks to interact with other systems, perhaps for the purpose of leaking
data, obtaining instructions from the attacker, or upgrading itself. Since the testing
and execution of the malware was restricted to a single machine, network interaction
of the machines was tightly controlled and isolated. In addition,
virtualization reduces the number of physical boxes needed to conduct the research,
which makes the research effort more affordable. By adopting
this approach, we were able to run the malware on different virtual machines for easy
analysis and monitoring of its behavior. Fig. 2 below provides
a detailed view of the sandbox test setup.
Fig. 1. Test Bed Using VMware Servers
Fig. 2. Sandbox Test Environment
The main purpose of the gateway machine, GOGOL, is to filter traffic to and from the
network, passing only specific user IPs, by implementing firewall
IP filtering to thwart any unwanted access from outside the network. GOGOL uses
OpenBSD to block traffic and to pass the user IPs to the user's assigned test machine.
TOLSTOY, PUSHKIN, BABEL and GORKY are the malware analysis sandbox
test machines used in this research work, and they are built on Linux machines. Each
test machine uses Ubuntu 10 as the
primary OS and contains a virtual machine via VMware for user testing purposes.
Figures 1 and 2 show how each user machine has a test bed via VMware and how each
user interacts directly with the test bed. In addition, each computer acted as a local
web server in order to allow the use of a custom-designed PHP (Hypertext
Preprocessor) script that contained commands for reloading the operating system on
VMware and uploading trace capture files. As a precautionary measure, we used a
reverse firewall to disable external Internet access from these servers, blocking any
accidental spread of malware via the Internet during testing. The web server
installation allowed any captured data and reports to be 'uploaded' to the main server
on which the VM (i.e., the virtual machine hosted on VMware Server) is hosted, and
then allowed the image of the VM to be reverted to a clean state. This let the test
machines revert to the original image after being infected with the malware binaries
under test, before moving on to the next test. GORKY also acted as an FTP server to
allow test data or any other data and software to be passed from host to VM. The VM
on each test bed hosts the Windows XP operating system for the purpose of testing
malware in a Windows environment. The VM also has a defined IP that places it on
the same network as the main host machines, which allows the user traffic to reach
the machine and facilitates data transfer between host and VM. The Wireshark network
monitoring tool on each VM was used to capture the traffic data generated by each
malware sample for each test run.
3.3 Data Collection Procedures
The malware data processing started with a user connecting via a remote desktop client
to GOGOL, where each user uses a unique port, for example <GOGOL.ip: port>.
The gateway GOGOL then checks the source IP address to make sure that it is a
pre-defined IP belonging to one of the individual test bed machines. If
it matches, GOGOL routes the traffic to the appropriate IP of the test bed via its host.
The user logs on to the VM and performs data collection on an individual malware
file. This involves launching Wireshark, releasing the malware, and then saving the
Wireshark report. Using a web browser to access the custom PHP script, the user
uploads the Wireshark report to the host machine and then reverts the image
of the VM to a clean state. Finally, the whole process is repeated for each malware file.
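The gateway check described above can be illustrated with a minimal Python sketch. It is not the OpenBSD firewall configuration actually used on GOGOL; the addresses, port numbers, and the `route` helper are hypothetical and serve only to show the idea of admitting a pre-defined source IP on its assigned port and mapping it to one test bed.

```python
from ipaddress import ip_address
from typing import Optional

# Hypothetical allowlist: user source IP -> (assigned gateway port, test-bed host).
# The addresses come from documentation ranges and are not the lab's real values.
ALLOWED = {
    ip_address("198.51.100.10"): (5001, "TOLSTOY"),
    ip_address("198.51.100.11"): (5002, "PUSHKIN"),
    ip_address("198.51.100.12"): (5003, "BABEL"),
    ip_address("198.51.100.13"): (5004, "GORKY"),
}


def route(source_ip: str, port: int) -> Optional[str]:
    """Return the test-bed host for an allowed (source IP, port) pair, else None."""
    entry = ALLOWED.get(ip_address(source_ip))
    if entry is None or entry[0] != port:
        return None  # unknown source or wrong port: the connection is dropped
    return entry[1]
```

Any connection that does not match both the source address and its assigned port is rejected, which mirrors the default-deny behavior the text attributes to the gateway.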
4 ANALYSIS
The following section explains the results of the dynamic malware behavior that was
studied using the sandbox infrastructure set up at our lab. This study includes all
malware samples collected from the incoming spam email binaries received over a
period of six months at the lab email MTA servers, using a test account specifically
created for this study.
4.1 Behavioral Characteristics
To study the malware behavior and its malicious, fraudulent activities, every
malware sample collected from the spam folder was run on our sandbox analysis
platform. The malware behavior was logged with Wireshark, a network analysis tool
used to log the messages sent by the active bots from the analysis
platform to the outside network and its Command and Control (C&C) centers. First, we
examined the actions of the malware binaries run in the sandbox to understand their
dynamic behavior while recording outgoing email, Internet or relay chat
accesses. Briefly, this research work analyzed the following malware behaviors:
networking characteristics, activity duration, polymorphism, and the correlation
between malware activity and polymorphic behavior. This paper focuses only on
the dynamic behavior of malware. Static behavior, which involves
studying the code and reverse engineering the malware binary, is out of the scope of
this research work and will be studied in future phases.
4.2 Malware Identification Method
The ClamAV software tool kit installed on the virtual Windows XP platforms helped us
to identify the malware types. It helped us to categorize the malware into groups and
set aside the unknown suspect types. ClamAV's antivirus engine is a
shared library; it accepted the malware samples directly and
identified the malware types. In addition to malware
identification, an MD5 checksum was computed for every malware binary to verify its
software instances [16]. We used approximately 396 source email binaries spanning
a period of six months for our testing. A total of 214 Trojan malware binaries
were identified using the ClamAV antivirus software. The tool kit could not identify
approximately 179 malware binary signatures, and these unidentified binaries were
tagged as suspects. We shortlisted a total of 42 types of malware Trojans
that were active during this six-month period. Based on the collected data, we found
that the malware received by an individual email user is widely distributed and that
no malware type was dominant except Agent-165149 (23 attacks in six
months) and Generic FakeAV (25 attacks in six months). For our detailed analysis we
shortlisted the following fourteen Trojans, which made a significant
contribution in affecting email users. These 14 malware types, selected from the 214
identified samples, each showed a significant average
contribution (i.e., 5% and above of malware attacks in the six-month period) to the
single email user and were used for further analysis and reporting. The following data
summarizes the Trojans identified as making a significant contribution to
the email user, with each Trojan's attack count: Agent-165149 (23), Agent-165380 (8),
Downloader 93419 (8), Downloader Bredolab-1414 (8), Downloader Bredolab-1415 (7),
Downloader Bredolab-1416 (7), Downloader Bredolab-1417 (11), Downloader
Bredolab-1418 (8), Downloader Bredolab-1419 (10), Downloader Bredolab-1420 (9),
Downloader Bredolab-1421 (11), Downloader Bredolab-1423 (11), Downloader
Bredolab-1424 (7) and Generic FakeAV (25).
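As an illustration of the checksum step, the following minimal Python sketch computes an MD5 digest for a sample using the standard `hashlib` module. The paper does not state which MD5 tool was used, so the helper names `md5_of_file` and `md5_of_bytes` are assumptions made for this example only.

```python
import hashlib
from pathlib import Path


def md5_of_bytes(data: bytes) -> str:
    """Return the hexadecimal MD5 digest of an in-memory sample."""
    return hashlib.md5(data).hexdigest()


def md5_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large binaries are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two binaries with the same digest can be treated as the same software instance, while binaries that an antivirus engine assigns to the same family but that have different digests are distinct instances.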
4.3 Monthly Malware Attacks
We have summarized the number of malware attacks per month, which includes
both the malware binaries identified as Trojans and the suspects. The suspects list
consists of binaries that do not match the malware signature database of the ClamAV
tool kit. A single email user received an average of 66 malware hits per month in their
spam folder that were identified as Trojans. The data below covers only the malware
identified as Trojans; the actual number received by a single user is
approximately twice the listed values, because it would include both identified and
unidentified malware. Total malware attacks per month are as follows: Jan - 27 hits,
Feb - 160 hits, Mar - 41 hits, Apr - 46 hits, May - 46 hits and June - 76 hits.
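The monthly figures can be cross-checked with a short Python calculation. The counts are taken directly from the list above; the variable names are ours. The six monthly values sum to 396, which gives the stated average of 66 hits per month.

```python
from statistics import mean

# Monthly hit counts as listed in the text (January - June 2010).
monthly_hits = {"Jan": 27, "Feb": 160, "Mar": 41, "Apr": 46, "May": 46, "Jun": 76}

total_hits = sum(monthly_hits.values())
average_hits = mean(monthly_hits.values())

# The two busiest months, highest first.
busiest = sorted(monthly_hits, key=monthly_hits.get, reverse=True)[:2]

print(total_hits, average_hits, busiest)
```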
The highest numbers of malware attacks were received during
February and June, compared with the other four months in the six-month
window. Based on our findings, the high volume of malware traffic in February can
be attributed to the peak in spam activity around Valentine's Day [13].
Malware authors exploit the high volume of email activity during the Valentine's Day
period to mix malware emails in with regular emails to users. Though
we confirm the highest number of email malware attacks in
February, we were unable to confirm the exact digital signatures of either Waledac
or Storm in the email data received during that month. This could be
because malware authors, once detected, change the code to produce a new
signature that can penetrate intrusion detection systems, which is
referred to as the polymorphic behavior of malware [13].
4.4 Average Contribution
Even though we identified close to 42 types of malware using ClamAV, the
monthly malware attacks were very evenly distributed. Two malware types,
Agent-165149 and Generic FakeAV, contributed over 20 percent of the overall
malware attacks in the six-month period, and the rest of the malware data is widely
distributed. The data below summarizes the average malware contribution for a single
user over the six-month period: Agent-165149 11%, Generic FakeAV 12%, other
malware 77%.
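The paper does not spell out how these percentages were derived. One reading that reproduces them, offered here as an assumption, divides each Trojan's six-month attack count from section 4.2 (23 and 25) by the 214 identified Trojan binaries and rounds to the nearest percent:

```python
# Assumption: shares are computed against the 214 identified Trojan binaries.
identified_total = 214
attack_counts = {"Agent-165149": 23, "Generic FakeAV": 25}

# Percentage share of each Trojan, rounded to the nearest whole percent.
shares = {name: round(100 * count / identified_total)
          for name, count in attack_counts.items()}

# Everything else is whatever remains after the two dominant Trojans.
other_share = 100 - sum(shares.values())

print(shares, other_share)
```

Under this assumption the two dominant Trojans account for 23 percent of identified malware together, consistent with the "over 20 percent" stated above.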
4.5 Active Duration
One of the key behavioral characteristics of malware is the number of days or months
a particular malware type is active and spamming the email user's account. This
characteristic measures how long a particular malware type was in the active state and
attacking the Internet by regularly sending malware spam. We found that
the following four malware types were active for longer periods than the other
malware selected for detailed analysis: Trojan Agent-165149 (141 days / 4 months
and 20 days), Downloader Bredolab-1417 (79 days / 2 months and 19 days),
Downloader Bredolab-1419 (121 days / 4 months and 1 day), and Generic FakeAV
(64 days / 2 months and 4 days). Figure 8 shows directly that Trojan
Agent-165149 (4½ months) and Generic FakeAV (2 months) were
active for longer durations and sent frequent spam emails. These two malware types
exhibited active characteristics and produced a large volume of email malware spam:
Agent-165149 (141 attacks) and Generic FakeAV (64 attacks). From these figures,
Trojan Agent-165149 produced 141 attacks in its recorded
activity period of 4½ months, an average of about one spam a day (i.e., 141
spam / 135 days = 1.04 spam/day). For Generic FakeAV we also recorded
an average of about one spam a day (i.e., 64 spam / 60 days = 1.07 spam/day), but
over a shorter period. Downloader.Bredolab-1417, which was active for
2½ months, and Downloader.Bredolab-1419, which was active for 4 months, exhibited
very dormant behavior. Since these two Trojans were active for long periods without
sending frequent spam emails, they are considered to exhibit dormant
characteristics.
4.6 C&C Server Identification
We recorded all the networking behaviors exhibited by the malware using the
Wireshark tool. Initially, the malware sends a Domain Name System (DNS) query to
resolve the IP address of its command and control center. Table 1 summarizes
the networking behavior of the active Trojan Agent-165149 and Generic FakeAV
malware. These initial queries were sent as regular UDP/IP packets. The
malware analyzed did not exhibit relay chat or P2P behavior.
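To make the DNS step concrete, the sketch below builds and parses a minimal DNS query payload in the standard wire format (RFC 1035), which is what a capture tool shows inside such UDP packets. It is an illustration only: the helper names and the domain `cc.example.com` are hypothetical, name compression is not handled, and the sketch does not reproduce any query observed in the experiment.

```python
import struct


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Encode a minimal DNS A-record query for the given name."""
    # Header: ID, flags (recursion desired), 1 question, 0 answer/authority/additional.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in name.split(".")
    ) + b"\x00"
    # QTYPE = 1 (A record), QCLASS = 1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def parse_query_name(payload: bytes) -> str:
    """Extract the queried name from an uncompressed DNS query payload."""
    position = 12  # the question section starts after the 12-byte header
    labels = []
    while payload[position] != 0:
        length = payload[position]
        labels.append(payload[position + 1 : position + 1 + length].decode("ascii"))
        position += 1 + length
    return ".".join(labels)
```

Applying `parse_query_name` to the UDP payload of a captured query yields the domain the sample tried to resolve, which is the first clue to the location of its C&C center.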
Table 1. Networking Behavior by Malware Type
4.7 Polymorphic Behavior
In the context of malware analysis, polymorphism refers to the same behavior
being exhibited by malware with different signatures in order to evade anti-virus and
anti-spam programs (e.g., Norton, Symantec and McAfee). Malware authors change the
malware source code and its attributes to make it undetectable by the signature- and
behavior-based antivirus and intrusion detection defenses implemented in corporate
firewalls and malware detection engines. Typical morphing methods include changing
malware filenames and changing compression and encryption methods by using
different keys. Polymorphic malware is a very destructive and intrusive kind of
computer program. This self-mutating malware constantly changes its signature to
penetrate detection engines. Because it morphs frequently (changing signature,
filename, etc.), it is very difficult for anti-malware programs to detect these attacks.
Even though the appearance of the code in polymorphic malware varies with each
mutation, the main function of the software usually remains the same. In our malware
analysis, we recorded the polymorphic nature of all the malware binary instances
collected during the six-month period. The following six malware types exhibited
strong polymorphic behavior: Generic FakeAV, Downloader Bredolab-1423, Downloader
Bredolab-1421, Downloader Bredolab-1419, Downloader Bredolab-1417 and
Trojan Agent-165149. We recorded the polymorphic behavior in Table 2. The rank
(i.e., number of polymorphic instances) of polymorphic behavior for these six
malware types ranges between three and nine. The highest polymorphic behavior was
exhibited by Generic FakeAV and Trojan Agent-165149, which were also active for the
most days and sent the maximum number of Trojan attacks to the
single email user. Our research confirms that if a botnet does not exhibit polymorphic
behavior, it can be removed from the network by identifying its source and
its activities [12],[13].
Table 2. Polymorphic Malware Instance and Checksum Signature
By exhibiting polymorphic behavior, malware comes up with many alternate
signatures and, by spreading its bots around, keeps attacking users even after it has
been destroyed at one location. The following data gives the number of polymorphic
instances exhibited by each malware type within the six-month period: Trojan
Agent-165149 - 7, Downloader Bredolab-1417 - 4, Downloader Bredolab-1419 - 5,
Downloader Bredolab-1421 - 3, Downloader Bredolab-1423 - 5 and Generic FakeAV - 9.
Table 2 lists the polymorphic malware and its different digital signatures. We found that
the malware samples identified as polymorphic were sent out using different email file
names or subject lines, had different MD5 checksums and therefore different
signatures, and were part of the same malware groups identified by the ClamAV engine
used in this research.
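The counting of polymorphic instances can be sketched as grouping samples by the family name reported by the antivirus engine and counting the distinct MD5 checksums in each group. This is a minimal illustration of that idea, not the procedure used to produce Table 2; the function name and the input format of (family, checksum) pairs are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def polymorphic_instances(samples: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Map each malware family to its number of distinct checksums.

    `samples` is an iterable of (family_name, md5_hex) pairs, one per received binary.
    """
    checksums_by_family = defaultdict(set)
    for family, md5_hex in samples:
        # A set keeps each checksum once, so repeated deliveries are not double-counted.
        checksums_by_family[family].add(md5_hex)
    return {family: len(hashes) for family, hashes in checksums_by_family.items()}
```

A family whose count is greater than one was delivered with more than one binary signature, which is the sense of "polymorphic instance" used in this section.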
4.8 Correlation Analysis
A correlation analysis was performed on the two selected active malware types,
Trojan Agent-165149 and Generic FakeAV, which together
contributed around twenty percent of the overall malware attacks on a single user.
Trojan Agent-165149 was detected with its initial signature during
January and was then dormant for about five months, relating to the problems in
the active behavior of the malware. It came back with a different malware signature,
exhibiting polymorphic behavior, and was active during the entire month of June.
Trojan Agent-165149 also generated many hits to the single email user while changing
only the file names and not the signatures. Eventually, the polymorphic behavior
demonstrated by this malware involved changing both the binary signature and the
file names. In the case of Generic FakeAV, we recorded a very distinctive polymorphic
characteristic involving both file name and binary signature. This malware poses a
serious threat to intrusion detection engines because it changes the file name and the
binary digital signature every time it is sent to the user's email account.
Fig. 3. Malware Activity Correlations
4.9 Unidentified Malwares
The ClamAV tool kit could not identify approximately 174 malware binaries that we
suspect are Trojans. Figure 4 shows detailed data on the number of
suspect/unidentified malware binaries versus the number of occurrences.
Fig. 4. Unidentified Malware Chart
This non-identification may also be due to shortcomings in the open source tool kit
ClamAV; however, this has not been verified. This alarming number of unidentified
malware binaries received within a short time period by a single user email account
emphasizes the importance of further malware study and the need for in-depth analysis
of malware signatures. The behavior of these binaries could not be analyzed, so further
investigation is needed to verify their signatures.
5 CONCLUSION
In this work, we have studied the malware behavioral characteristics using its binary
signature from the spam email collected for a period of six months by means of an
experimental sandbox test environment. The key aspect of this design is that by using
this malware sandbox testing environment, we can analyze the dynamic behavioral
characteristics of the spam binaries directly received from the corporate email MTA
12 Anthony Ayodele,James Henrydoss,Walter Schrier,T.E. Boult
servers comprehensively. We identified approximately 42 types of malware using the
ClamAV tool kit. Our study finds that during this six-month period only two active
Trojans, Agent-165149 and Generic Fake AV, contributed around 20 percent of
the malware received, and the remaining 80 percent was distributed
over many different types of botnets. In addition, a strong surge of spam
activity was recorded during the months of February and June. This study also
recorded strong polymorphic behavior exhibited by the malware, intended to overcome the
up-to-date malware detection mechanisms employed by corporate email security
firewalls and intrusion detection systems.
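The 20 percent figure is the share of all identified malware held by the two most frequent labels. A minimal sketch of that arithmetic follows, assuming one scanner label per received binary. The helper `top_k_share` and the label counts are hypothetical placeholders chosen to make the calculation visible; they are not the measured data.

```python
from collections import Counter


def top_k_share(labels, k=2):
    """Return the fraction of all detections held by the k most frequent labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(n for _, n in counts.most_common(k))
    return top / total


# Hypothetical labels with made-up counts (not the counts measured in the study).
labels = ["family-a"] * 2 + ["family-b"] * 2 + [f"other-{i}" for i in range(16)]
share = top_k_share(labels, k=2)
print(f"top-2 share: {share:.0%}, remainder: {1 - share:.0%}")
```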
6 FUTURE WORK
One dimension of future work is to expand the malware data coverage to a period of
up to one year, to record a more complete picture of malware behavior over an
extended period of time. In addition, this research needs to be extended to
multiple email users, with an automated data feed mechanism and measurement method
for processing large volumes of malware samples in real time. We also want to use
malware static code analysis tools to study static and dynamic behavioral
characteristics at the same time. This will help to improve the analysis technique
and, in turn, help build better malware signatures and intrusion detection engines.
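One possible first step of such an automated feed is sketched below, using only the Python standard library: each raw message is parsed, and every attachment is reduced to its file name and a SHA-256 digest. This is an assumption about how a feed could be built, not a description of the tooling used in this study. The function name `attachment_digests` and the synthetic demo message are hypothetical.

```python
import hashlib
from email import message_from_bytes, policy
from email.message import EmailMessage


def attachment_digests(raw_message: bytes):
    """Return (filename, SHA-256 hex digest) for each attachment in one raw message."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    results = []
    for part in msg.iter_attachments():
        # decode=True undoes the transfer encoding (e.g. base64) and yields bytes.
        payload = part.get_payload(decode=True) or b""
        results.append((part.get_filename(), hashlib.sha256(payload).hexdigest()))
    return results


# Build a small synthetic message so the sketch runs without real spam data.
demo = EmailMessage()
demo["Subject"] = "demo"
demo.set_content("body text")
demo.add_attachment(b"placeholder bytes", maintype="application",
                    subtype="octet-stream", filename="attachment.bin")

for name, digest in attachment_digests(demo.as_bytes()):
    print(name, digest)
```

Recording the file name alongside the digest keeps both of the attributes that the polymorphic samples were observed to change between deliveries.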
7 REFERENCES
1. P. John, M. Alexander, G. Steven, K. Arvind, Studying Spamming Botnets using Botlab”,
In NSDI:6th USENIX Symposium on Networked Systems Design and Implementation
(2009).
2. T. Holz, M. Steiner, F. Dahl, E. Biersack and F. Freiling, Measurements and Mitigation of
Peer-to-Peer-based Botnets: A Case Study on Storm Worm”, First USENIX workshop on
Large Scale Exploits and Emergent Threats, USENIX (2008).
3. W. Lu, M. Tavallaee, G. Rammidi and A. Ghorbani, "BotCop: An Online Botnets Traffic
Classifier." In Proceedings of the 7th Annual Conference on Communication Networks and
Services Research (CNSR 2009), Moncton, New Brunswick, Canada, May 11 - 13, pp. 70-
77 (2009).
4. Botnet - An Overview, CERT In white paper CIWP 2005-05.
5. W. Lu, A. Ghorbani. “Botnets Detection Based on IRC-Community”, In Proceedings of the
IEEE Global Communications Conference (GLOBECOM 2008), Nov 30 - Dec 4, New
Orleans, LA, USA, pp 2067-2071 (2008).
6. A. Karasaridis, B. Rexroad and D. Hoeflin, “Wide-Scale Botnet Detection and
Characterization” , In Proceedings of the First Workshop on Hot Topics in Understanding
Botnets (2007).
7. P. Baecher, M. Koetter, T. Holz, M. Dornseif, and F. Freiling, The nepenthes platform: An
efficient approach to collect malware”, In Proceedings of International Symposium on
Recent Advances in Intrusion Detection (RAID‟06), Hamburg, September (2006).
8. W. Lu, M. Tavallaee and A. Ghorbani, Automatic Discovery of Botnet Communities on
Large-Scale Communication Networks,” In Proceedings of the 2009 ACM Symposium on
Information, Computer and Communications Security, ASIACCS 2009, Sydney, Australia,
March 10-12, ACM 2009, pp. 1-10 (2009).
9. Apache Spam Assassin Project, Open Source Windows Spam Filter”.
http://spamassassin.apache.org
13
10. A. Sanabria, Malware Analysis Environment Design Architecture, SANS Institute, SANS
Institute InfoSec Reading Room, January 18th, (2007).
11. K. Higgins, DDos Botnets, Thriving and Threatening.
http://www.darkreading.com/security/vulnerabilities/208803800/index.html
12. Polymorphic Malware: A Threat That Changes on the Fly -Polymorphic malware changes
shape to fool detection schemes http://www.csoonline.com/article/221190/polymorphic-
malware-a-threat-that-changes-on-the-fly
13. Malware Writers Use Multiple Botnets to Spread Valentine‟s Day Heartache.
http://www.eweek.com/c/a/Security/Malware-Writers-Use-Multiple-Botnets-to-Spread-
Valentines-Day-Love/
14. HoneyNet Project, Know Your Enemy Tracking Botnets.
http://www.honeynet.org/papers/bots/
15. The Crime ware Landscape: Malware, Phishing, Identity Theft and Beyond, A Joint Report
of the US Department of Homeland Security SRI International Identity Theft Technology
Council and the Anti-Phishing Working Group October, (2006).
http://www.antiphishing.org/reports/APWG_CrimewareReport.pdf
16. Clam anti virus tool kit www.clamav.net