RELATED TO VoIP CALLS
David Irwin and Jill Slay
Abstract The Voice over Internet Protocol (VoIP) is designed for voice commu-
nications over IP networks. To use a VoIP service, an individual only
needs a user name for identiﬁcation. In comparison, the public switched
telephone network requires detailed information from a user before cre-
ating an account. The limited identity information requirement makes
VoIP calls appealing to criminals. In addition, due to VoIP call en-
cryption, conventional eavesdropping and wiretapping methods are in-
eﬀective. Forensic investigators thus require alternative methods for
recovering evidence related to VoIP calls. This paper describes a digital
forensic tool that extracts and analyzes VoIP packets from computers
used to make VoIP calls.
Keywords: VoIP calls, packet extraction, packet analysis
1. Introduction
Voice over Internet Protocol (VoIP) telephony is an inexpensive and
increasingly popular alternative to using traditional telephone networks.
The use of VoIP in U.S. businesses is expected to reach 79% by 2013 [2].
Meanwhile, the lack of technology for law enforcement to monitor VoIP
calls, the low barrier for entry and the anonymity provided by VoIP
service are making it very attractive to criminals [3].
Fortunately, the remnants of a VoIP call remain in the physical mem-
ory of the computers used for the call. The information available in-
cludes signaling information, the digitized call, and information about
the VoIP client. The signaling information is related to the setup and
initialization of the VoIP call. The digitized call comprises packets that
contain the encapsulated voice component. Information speciﬁc to the
VoIP application being used, such as the contact list, is also saved. It
is possible to manually search for known VoIP remnants, but this is a
time-consuming process that requires considerable expertise.
G. Peterson and S. Shenoi (Eds.): Advances in Digital Forensics VII, IFIP AICT 361, pp. 221–228, 2011.
IFIP International Federation for Information Processing 2011
McKemmish [4] defines digital forensics as “the process of identifying,
preserving, analyzing and presenting digital evidence in a manner that
is legally acceptable.” The digital forensic search tool described in this
paper is designed to support all four steps. A byte-for-byte copy of the
original memory is created without modifying the original digital evi-
dence. This evidence is processed and formatted into a human-readable
format for presentation in a court of law.
Several researchers have investigated memory forensic techniques for
extracting evidence related to VoIP calls [6, 7, 9]. This paper builds
on this work by describing a forensic tool that detects and reconstructs
VoIP packet sequences from a computer memory capture. In addition,
it provides a means for extracting user information and VoIP client in-
formation. Experimental tests demonstrate that the tool locates more
than 97% of the packets in VoIP calls.
2. Internet Protocol
VoIP is a collection of several protocols that set up, maintain and tear
down calls involving the encapsulation and transportation of voice pack-
ets over the Internet. The two most prominent protocols used are the
User Datagram Protocol (UDP) and the Real-Time Transport Protocol
(RTP). UDP is a transport layer protocol used by Skype [8]. RTP is an
application layer protocol that, in the case of X-Lite [1], uses UDP as
the transport layer protocol. Packets of both protocols are carried in an
Ethernet frame at the link layer and carry an Internet Protocol (IP)
header at the Internet layer.
IP packets are commonly carried in Ethernet II (version 2) frames. An
Ethernet frame consists of a seven-byte preamble, a one-byte start of
frame delimiter, and two six-byte Media Access Control (MAC) addresses,
one each for the destination and source. Following the MAC addresses are
the two-byte Ethertype, the IP/UDP/RTP data payload of 46 to 1,500 bytes,
and a four-byte cyclic redundancy check for frame integrity.
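The frame layout described above can be illustrated with a short Python sketch. It is not part of the forensic tool; it assumes the preamble and start of frame delimiter have already been stripped, as is usual for captured frames, so that the header is the fourteen bytes made up of the two MAC addresses and the Ethertype. The function name parse_ethernet_header is hypothetical, and the sample bytes are the first fourteen bytes of the frame in Figure 4.

```python
import struct


def parse_ethernet_header(frame: bytes) -> dict:
    """Split a 14-byte Ethernet II header into its three fields."""
    if len(frame) < 14:
        raise ValueError("an Ethernet II header needs at least 14 bytes")
    # Network byte order: 6-byte destination, 6-byte source, 2-byte Ethertype.
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return {
        "dst_mac": dst.hex(":"),
        "src_mac": src.hex(":"),
        "ethertype": ethertype,
    }


# First fourteen bytes of the frame shown in Figure 4.
header = bytes.fromhex("000C29B65776" "00216A4AD626" "0800")
fields = parse_ethernet_header(header)
```

An Ethertype of 0x0800 indicates that an IPv4 packet follows the header.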
IP provides Internet addresses in its headers, allowing packets to be
routed from their source to a destination IP address. However, an IP
address is not suﬃcient to deliver an IP packet from a source IP address
to a destination IP address. The port numbers of the source and des-
tination computers must also be known for a VoIP call to take place.
UDP maintains the port information. While UDP does not guarantee IP
packet delivery, it is well-suited to VoIP because of the real-time nature
of voice communications. Thus, VoIP uses the IP/UDP protocol stack
shown in Figure 1.
Figure 1. IP and UDP packet headers.
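To make the roles of the two headers concrete, the following Python sketch decodes the IP addresses and UDP port numbers from a raw packet. It is a minimal illustration under stated assumptions, not the tool's implementation: the function name parse_ipv4_udp is hypothetical, only IPv4 is handled, and the sample bytes are the IP and UDP headers of the frame in Figure 4.

```python
import ipaddress
import struct


def parse_ipv4_udp(packet: bytes) -> dict:
    """Decode the addressing fields of an IPv4 header and the UDP header after it."""
    version = packet[0] >> 4
    ihl_bytes = (packet[0] & 0x0F) * 4  # header length is stored in 32-bit words
    if version != 4 or len(packet) < ihl_bytes + 8:
        raise ValueError("not an IPv4 packet with a complete UDP header")
    protocol = packet[9]  # 17 (0x11) means that UDP follows
    if protocol != 17:
        raise ValueError("the IP payload is not UDP")
    src_port, dst_port, udp_length, _checksum = struct.unpack(
        "!HHHH", packet[ihl_bytes:ihl_bytes + 8]
    )
    return {
        "src_ip": str(ipaddress.IPv4Address(packet[12:16])),
        "dst_ip": str(ipaddress.IPv4Address(packet[16:20])),
        "protocol": protocol,
        "src_port": src_port,
        "dst_port": dst_port,
        "udp_length": udp_length,
    }


# IP header (20 bytes) and UDP header (8 bytes) taken from Figure 4.
sample = bytes.fromhex(
    "4500007D5877000080115FDDC0A80066C0A80065"
    "A10153CD006989E6"
)
info = parse_ipv4_udp(sample)
```

The IP addresses locate the two computers, while the port numbers locate the VoIP client on each computer.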
RTP [5] provides transport for real-time applications that transmit
audio over packet-switched networks. The protocol incorporates infor-
mation such as packet sequence numbers and timestamps. This allows a
receiving application to buﬀer and sequence packets in the correct order
for audio playback. Thus, the complete VoIP stack is IP/UDP/RTP.
Figure 2. RTP packet header.
Figure 2 presents the RTP packet header format. In RTP, the Syn-
chronization Source (SSRC) ﬁeld identiﬁes the source of the synchro-
nization (e.g., computer clock). The Contributing Source (CSRC) ﬁeld
identiﬁes the source of the individual contributions that make up the
single data stream payload for the packet. It is not necessary to use
RTP to participate in a VoIP call. VoIP applications such as Skype
do not use RTP; X-Lite, on the other hand, uses RTP. RTP provides a
means for a VoIP client to reassemble and synchronize packets.
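The fixed part of the RTP header is twelve bytes long (RFC 3550), and its fields can be unpacked as in the following Python sketch. The function name parse_rtp_header is hypothetical and the example ignores any CSRC entries or header extension that may follow the fixed header. The sample bytes are the twelve bytes that follow the UDP header in Figure 4.

```python
import struct


def parse_rtp_header(data: bytes) -> dict:
    """Decode the 12-byte fixed RTP header defined in RFC 3550."""
    if len(data) < 12:
        raise ValueError("the fixed RTP header needs 12 bytes")
    first, second, sequence, timestamp, ssrc = struct.unpack("!BBHII", data[:12])
    return {
        "version": first >> 6,            # top two bits
        "padding": (first >> 5) & 1,
        "extension": (first >> 4) & 1,
        "csrc_count": first & 0x0F,       # number of CSRC entries that follow
        "marker": second >> 7,
        "payload_type": second & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# RTP header bytes from Figure 4 (offsets 0x2A to 0x35).
rtp = parse_rtp_header(bytes.fromhex("806B2301002B591C2214AD31"))
```

The sequence number and timestamp are the fields that allow a receiver, or an investigator, to place recovered packets in playback order.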
-To: "david"< sip:firstname.lastname@example.org>
SIP Display info: "david"
-SIP to address:sip:email@example.com
SIP to address User Part: 8889215862
SIP to address Host Part: sip.pennytel.com
Figure 3. SIP invite request.
3. VoIP Packet Identiﬁcation
After an individual registers with a VoIP service provider, the individ-
ual uses the provider’s client to make calls. To initiate a call, the client
connects to the provider using the Session Initiation Protocol (SIP).
Figure 3 shows a SIP invite request. Elements of a SIP invite request
that are important to a forensic investigator include the user's registered
name (david), unique SIP user identifier (8889215862) and host
(sip.pennytel.com).
SIP contains information about the call participants based on their
unique SIP identiﬁers. A regular expression search can be used to iden-
tify VoIP packets in a memory capture. In our experiments, we used a
hex editor to search two physical memory captures. The ﬁrst capture
was made after a Skype call that only uses UDP. The second was made
after an X-Lite VoIP call that uses RTP and SIP.
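A regular expression search for SIP identifiers can be sketched as follows. This is an illustration only: the pattern is a simplified assumption about what a SIP URI looks like and is not the expression used in the experiments, the function name find_sip_uris is hypothetical, and the sample buffer is synthetic, assembled from the user part and host part listed in Figure 3.

```python
import re

# Simplified SIP URI pattern: "sip:", a user part, "@", a host part.
SIP_URI = re.compile(rb"sip:([A-Za-z0-9_.+\-]+)@([A-Za-z0-9.\-]+)")


def find_sip_uris(memory: bytes) -> list:
    """Return (offset, user part, host part) for each SIP URI found."""
    return [
        (m.start(), m.group(1).decode("ascii"), m.group(2).decode("ascii"))
        for m in SIP_URI.finditer(memory)
    ]


# Synthetic memory fragment with binary noise around a SIP "To" header.
sample = (
    b"\x00\x17junk"
    + b'To: "david"<sip:8889215862@sip.pennytel.com>'
    + b"\x00\xff"
)
hits = find_sip_uris(sample)
```

Each hit records the offset in the capture, so that the surrounding bytes can be examined for the rest of the SIP message.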
0000 00 0C 29 B6 57 76 00 21 6A 4A D6 26 08 00 45 00
0010 00 7D 58 77 00 00 80 11 5F DD C0 A8 00 66 C0 A8
0020 00 65 A1 01 53 CD 00 69 89 E6 80 6B 23 01 00 2B
0030 59 1C 22 14 AD 31 3C 64 7B 82 29 6C E0 18 DD A9
0040 25 EA 44 65 61 9A C1 66 D3 A1 B9 09 BC 38 B1 86
0050 89 66 63 11 D2 44 5F 88 A3 2D E4 63 8E A5 B8 73
0060 26 41 09 BD 90 99 65 1D E7 1B 85 D6 A3 A6 5A 09
0070 DC 21 5C C0 A8 39 05 BB F1 A5 1B E6 A2 29 4A E0
0080 6C 56 92 47 9D CA 65 00
Figure 4. VoIP frame with the search expression highlighted.
Figure 4 shows a single VoIP packet capture with the search pattern
highlighted. The headers are segmented by vertical lines (Ethernet,
IP, UDP and RTP).
The Ethernet header is fourteen bytes in length (the preamble and
start of frame delimiter are not shown) with six bytes each for the
source and destination MAC addresses. The last two bytes identify the
Ethertype, which, in the case of VoIP, is 0x0800 for an IP packet.
Figure 5. User interface.
During our analysis of a 4 GB memory capture, a search using the
IP identiﬁer (0x0800) and the ﬁrst byte of the IP header (0x45) corre-
sponding to the byte search pattern 0x080045 yielded 8,881 hits.
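The byte 0x45 encodes IP version 4 in its upper four bits and a header length of five 32-bit words (20 bytes) in its lower four bits. A literal search for the three-byte pattern can be sketched in Python as shown below. The function names find_all and search_capture are hypothetical, and memory-mapping the capture file is one possible way to avoid loading a multi-gigabyte file into memory; it is not necessarily how the search in the experiments was performed.

```python
import mmap


def find_all(buffer, needle: bytes) -> list:
    """Return every offset at which needle occurs in buffer."""
    offsets = []
    pos = buffer.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = buffer.find(needle, pos + 1)
    return offsets


def search_capture(path: str, needle: bytes) -> list:
    """Search a capture file through a read-only memory map."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return find_all(view, needle)


# Ethertype 0x0800 followed by the first IP header byte 0x45.
PATTERN = bytes.fromhex("080045")

# Synthetic buffer containing the pattern twice.
sample = b"\x00" * 16 + bytes.fromhex("0800450000") + b"\xff" * 8 + PATTERN
offsets = find_all(sample, PATTERN)
```

Because the pattern is only three bytes long, it also matches IP packets that carry no voice data, which is why the longer patterns that follow are needed.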
The UDP header is eight bytes long and does not form part of the
search pattern. The identiﬁcation of UDP veriﬁes the use of an Eth-
ernet/IP/UDP stack and the port numbers identify the two parties in-
volved in a call. The UDP protocol is identiﬁed by byte 10 of the IP
header (0x11), indicating that the next protocol is in fact UDP. The
search pattern 0x080045--------11 yielded 559 hits. If the VoIP client
uses RTP, then the RTP header follows the UDP header.
The identiﬁcation of RTP is accomplished by examining the ﬁrst byte
that follows the UDP header. The ﬁrst byte of RTP contains the version
number, padding bit, extension bit and CSRC count. In the example
in Figure 4, the RTP version is 2 and the padding bit, extension bit
and CSRC count are all zero; thus, the byte has the value 0x80.
A search of the 4 GB memory capture using the complete pattern
0x080045--------11-----------------80 yielded no false positive errors.
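The three search patterns can be expressed as byte-level regular expressions, as in the following Python sketch. It assumes that the dashes in the patterns stand for wildcard bytes and that the IP header is 20 bytes long with no options. Under that assumption, the gaps follow from the header layout: eight bytes lie between the first IP header byte and the protocol field, and 18 bytes (the remaining ten bytes of the IP header plus the eight-byte UDP header) lie between the protocol field and the first RTP byte. The test buffer is synthetic, built around the header bytes of Figure 4.

```python
import re

# re.DOTALL lets "." match every byte value, including 0x0A.
ETH_IP = re.compile(rb"\x08\x00\x45", re.DOTALL)
ETH_IP_UDP = re.compile(rb"\x08\x00\x45.{8}\x11", re.DOTALL)
ETH_IP_UDP_RTP = re.compile(rb"\x08\x00\x45.{8}\x11.{18}\x80", re.DOTALL)

# Ethernet, IP, UDP and RTP headers from Figure 4 (54 bytes).
voip_frame = bytes.fromhex(
    "000C29B6577600216A4AD6260800"
    "4500007D5877000080115FDDC0A80066C0A80065"
    "A10153CD006989E6"
    "806B2301002B591C2214AD31"
)
# Decoy 1: an IP header whose protocol field is 0x06 (TCP).
tcp_decoy = b"\x08\x00\x45" + b"\x00" * 8 + b"\x06"
# Decoy 2: UDP, but the byte after the UDP header is not 0x80.
udp_decoy = b"\x08\x00\x45" + b"\x00" * 8 + b"\x11" + b"\x01" * 18 + b"\x17"

memory = b"\x00" * 100 + voip_frame + b"\xff" * 50 + tcp_decoy + udp_decoy

counts = (
    len(ETH_IP.findall(memory)),
    len(ETH_IP_UDP.findall(memory)),
    len(ETH_IP_UDP_RTP.findall(memory)),
)
rtp_offsets = [m.start() for m in ETH_IP_UDP_RTP.finditer(memory)]
```

In this synthetic buffer, each longer pattern eliminates one decoy, which mirrors the drop in hits reported for the memory capture.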
4. Forensic Tool
The forensic tool implemented to extract and analyze VoIP packets
has a simpliﬁed interface with tabbed browsing and asynchronous func-
tionality (Figure 5). The user ﬁrst selects the memory capture ﬁle to
be searched. By default, the Ethernet Protocol, IP version 4, UDP and
RTP are automatically selected and searched. The option to search for
IP version 6 is also available. The search looks for VoIP packets that
match the pattern with and without the RTP component.
Figure 6. Results interface.
Figure 6 presents the results of an analysis of a memory capture using
the forensic tool. The top data grid displays the RTP packets recovered.
When a user selects an individual packet, the bottom grid updates to
display detailed packet information. The displayed information includes
the Ethernet source and destination addresses, IP source and destina-
tion addresses and sequence number, UDP source and destination port
numbers and sequence number, timestamp, and synchronization source
identiﬁer. The recovered packets can be saved for further analysis in an
SQL Server database. For example, the payloads can be recombined into a
single stream and attempts can be made to decrypt the payloads either
by brute force or with the assistance of the VoIP provider.
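Recombining the payloads into a single stream can be sketched as follows. The example is a simplification under stated assumptions: packets are grouped by their synchronization source identifier and ordered by RTP sequence number, sequence number wraparound is ignored, and the RtpPacket class and rebuild_stream function are hypothetical names that do not come from the tool. The payload bytes are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RtpPacket:
    ssrc: int
    sequence: int
    timestamp: int
    payload: bytes


def rebuild_stream(packets, ssrc: int) -> bytes:
    """Concatenate the payloads of one source in sequence number order."""
    selected = sorted(
        (p for p in packets if p.ssrc == ssrc),
        key=lambda p: p.sequence,
    )
    return b"".join(p.payload for p in selected)


# Packets as they might be found in memory: out of order and mixed with
# a packet from a different source.
recovered = [
    RtpPacket(0x2214AD31, 8963, 320, b"CC"),
    RtpPacket(0x2214AD31, 8961, 0, b"AA"),
    RtpPacket(0x0BADF00D, 17, 0, b"zz"),
    RtpPacket(0x2214AD31, 8962, 160, b"BB"),
]
stream = rebuild_stream(recovered, 0x2214AD31)
```

The resulting byte string is still encoded, and possibly encrypted, audio; it would have to be decoded before playback.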
Table 1. Wireshark and RAM recovery results.
VoIP     Duration   Wireshark      RAM Packets   RAM Packets   % Call
Client   (seconds)  Packet Count   (Total)       (Unique)      Recovered
Skype    180        18,701         41,959        18,208        97.4%
X-Lite   30         3,097          4,759         3,093         99.9%
X-Lite   30         3,290          5,488         3,274         99.5%
X-Lite   180        9,089          17,695        9,063         99.7%
The forensic tool can also be used to train forensic analysts. An indi-
vidual packet may be expanded and each protocol highlighted in a diﬀer-
ent color to facilitate the interpretation of individual protocol ﬁelds. The
color-coded graphical representation of the VoIP protocol stack greatly
simpliﬁes the interpretation and understanding of the overall frame and
the individual protocols.
5. Experimental Results
Two memory captures were performed after VoIP calls. The ﬁrst was
a 4 GB RAM capture performed after a Skype call that lasted three
minutes. The second memory capture occurred after a clean restart on
the following day and three successive X-Lite calls, the ﬁrst lasting 30
seconds, the second 30 seconds and the third three minutes.
Table 1 compares the remnants of the calls recovered by the digital
forensic tool (RAM capture) versus the Wireshark capture of the VoIP
calls. Note that the total number of packets recovered by the RAM cap-
ture (Total) exceeds the actual number of packets in the call (Wireshark
Packet Count). It was found that duplicate packets exist in up to six
diﬀerent locations in memory. Filtering these duplicate packets provides
a more accurate measure of the number of packets recovered (Unique).
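The duplicate filtering and the recovery percentage in Table 1 can be illustrated with a short Python sketch. How the tool identifies duplicates is not specified here, so the example assumes that two packets are duplicates when their bytes are identical. The function names are hypothetical and the packet contents are placeholders; the counts passed to recovery_percentage are taken from Table 1.

```python
def unique_packets(packets):
    """Drop byte-identical copies, keeping the first occurrence of each."""
    return list(dict.fromkeys(packets))


def recovery_percentage(unique_count: int, reference_count: int) -> float:
    """Unique recovered packets as a percentage of the reference count."""
    return round(100.0 * unique_count / reference_count, 1)


# The same packet may be found at several locations in memory.
recovered = [b"pkt-1", b"pkt-2", b"pkt-1", b"pkt-3", b"pkt-2", b"pkt-1"]
unique = unique_packets(recovered)

# Skype call in Table 1: 18,208 unique packets versus 18,701 in Wireshark.
skype_rate = recovery_percentage(18208, 18701)
```

Dividing the unique count, rather than the total count, by the Wireshark packet count keeps the percentage below 100%.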
The forensic tool locates nearly all the VoIP packets corresponding to
the two types of calls. Note that the recovery percentage is lower for the
Skype call because it does not use RTP and, therefore, does not beneﬁt
from the use of a longer search expression.
6. Conclusions
The forensic tool presented in this paper successfully recovers VoIP
packets from memory captures. The tool also helps extract user details
from VoIP application control signals. The ability to analyze, store and
format VoIP packets is particularly valuable in forensic investigations.
Several opportunities exist to improve the tool. For example, be-
fore transmission and during playback, the call is in an unencrypted
form. Therefore, the potential exists to extract unencrypted audio from
memory. Another enhancement involves the creation of a database with
contact list structures and control signal information associated with
commonly used VoIP clients.
Acknowledgements
This research was supported by the Australian Research Council via
Linkage Grant LP0989890 and by the Australian Federal Police.
References
[1] CounterPath Corporation, X-Lite, Vancouver, Canada (www.count
[2] In-Stat, VoIP penetration forecast to reach 79% of U.S. businesses
by 2013, Scottsdale, Arizona (www.instat.com/newmk.asp?ID=2721),
February 2, 2010.
[3] R. Koch, Criminal activity through VoIP: Addressing the misuse
of your network, Technology Marketing Corporation, Norwalk,
[4] R. McKemmish, What is forensic computing? Trends and Issues in
Crime and Criminal Justice, no. 118, pp. 1–6, 1999.
[5] H. Schulzrinne, S. Casner, R. Frederick and V. Jacobson, RTP: A
Transport Protocol for Real-Time Applications, RFC 3550, Internet
Engineering Task Force, Fremont, California (tools.ietf.org/html
[6] M. Simon and J. Slay, Voice over IP: Forensic computing implications,
Proceedings of the Fourth Australian Digital Forensics Conference,
pp. 1–6, 2006.
[7] M. Simon and J. Slay, Enhancement of forensic computing investigations
through memory forensic techniques, Proceedings of the
International Conference on Availability, Reliability and Security,
pp. 995–1000, 2009.
[8] Skype, Luxembourg (www.skype.com).
[9] J. Slay and M. Simon, Voice over IP forensics, Proceedings of the
First International Conference on Forensic Applications and Techniques
in Telecommunications, Information and Multimedia, pp.