Journal of Computational Information Systems 11: 9 (2015) 3181–3194
Available at http://www.Jofcis.com
IMLogVis: An Interactive Visualization Method for
Analyzing Instant Messaging Log
Ang ZENG 1, Min ZHU 1,∗, Yabo SU 1, Mingzhao LI 2
1 Vision-Computing Lab, College of Computer Science, Sichuan University, Chengdu 610064, China
2 HITLab Australia, University of Tasmania, Launceston, 7250, Australia
Abstract
An Instant Messaging (IM) log contains abundant time-varying information that closely reflects users’
behavior patterns. Visualizing this kind of data is valuable for discovering, analyzing and understanding
the patterns hidden in the dataset. In this paper, we propose IMLogVis, an interactive visualization
method for analyzing IM logs. An enhanced radial visualization is designed to illustrate the distribution
of conversations at multiple levels and on multiple time scales. In the interior of the enhanced radial
visualization, a scatterplot incorporating a new time-based node ordering method depicts the proximity
relation between the user and different buddies. In addition, a trend curve shows how communication
frequency changes over time. Coupled with rich interactions such as keyword location
and time navigation, these views let users explore an IM log from different aspects. Case studies demonstrate
the effectiveness and usefulness of IMLogVis in discovering useful and interesting patterns.
Keywords: Instant Messaging Log; Interactive Visualization; Temporal Information; Radial
Visualization
1 Introduction
Instant Messaging (IM) is a type of communication over the Internet that transmits messages
in real time. According to the survey in [1], IM has the highest utilization rate
among Internet applications, and by June 2014 the number of IM users in China had exceeded 500 million.
An IM log is a record of point-to-point communication between users of the same
IM system. The log contains abundant time-varying information that closely reflects
users’ behavior patterns. However, this data is voluminous and semi-structured. Effective visual
analysis of IM logs is therefore necessary for understanding the information intuitively and
discovering hidden patterns interactively. Visualizing the log also offers practical benefits:
users’ behavior patterns can be detected and analyzed, and the results can serve as a reference for friend
or application recommendation. For the police, anomalies and critical clues may be found
in the social behaviors of suspects.
∗Corresponding author.
Email address: zhumin@scu.edu.cn (Min ZHU).
1553–9105 / Copyright ©2015 Binary Information Press
DOI: 10.12733/jcis14163
May 1, 2015
IM users often communicate with their colleagues and friends (“buddies” in IM terminology),
who can be divided into groups. There are numerous conversations between them, each of which
is a sequence of messages. A conversation contains detailed temporal information such as the start,
end and duration. Much effort has been dedicated to visualizing temporal information. Line
charts, stacked graphs and animation have achieved some success. These techniques perform
well in displaying trends and the degree of change over time, but poorly in detecting
periodic patterns hidden in the data. Many researchers [3, 6, 13] have utilized radial visualization
to illustrate the periodicity. However, traditional radial visualization methods normally cannot
convey the periodicity of the conversation occurrences well at multiple levels. Besides, these
methods generally do not support the identification of patterns and trends on multiple time
scales.
To tackle these issues we design IMLogVis, a novel visualization method to enable effective and
interactive exploration of IM log. Considering the analytical tasks and data characteristics, we
combine and enhance familiar visual representations including radial visualization, scatterplot,
tag clouds and Heat Map. Firstly, we extend the radial visualization with sectors along the
circumference and the scatterplot in the interior. The enhanced radial visualization is suited to
display the distribution of conversations at multiple levels (single-buddy, grouped-buddy and all-
buddy level). Meanwhile, multiple time scales (yearly, monthly and daily scale) are provided to
reveal the dependency on a specific time range. To depict the proximity relation between the user
and different buddies, we also propose a scatterplot which incorporates a new time-based node
ordering method. Additionally, we design a trend curve to provide an overview of communication
frequency changes over time. Finally, rich interactions are offered to explore IM log from different
aspects.
The major contributions of our work are as follows:
• Extend radial visualization with sectors along the circumference and an embedded scatterplot
to convey the distribution of conversations at multiple levels and on multiple time
scales.
• Propose a new time-based node ordering method, which is efficient for understanding the
proximity relation.
• Provide rich interactions that allow users to explore IM logs from different aspects.
• Develop a powerful interactive exploration tool with a seamless integration of four enhanced
classic visualization techniques.
2 Related Work
2.1 Visualizing data with temporal information
A lot of research has been conducted on the visualization of data with temporal information [2].
One of the most common methods to visualize this kind of data is the line chart. For instance, a
polyline-based visualization [19] is proposed to present time-varying data with tags for each time
point. Braided graph [12] and horizon graphs [11] are also the variants of line charts. They make
it easy to understand and compare multiple values over time, but clutter often arises
when the dataset contains a large number of time points.
The stacked graph is another popular approach, which shows individual and overall trend changes.
ThemeRiver [9] and Heat Map Scope [8] are representative stacked graphs. ThemeRiver
uses a river metaphor to exhibit thematic changes over time. Heat Map Scope, which integrates
ThemeRiver and Heat Map, displays the overall trend as well as detailed patterns in the data.
Apart from the line chart and stacked graph, another common technique for displaying temporal
information is animation. Animation directly illustrates the movements to make changes in the
data transparent. Gapminder Trendalyzer [4] is an animated bubble chart to show trends in
three dimensions. TimeRider [14] is an animated scatterplot, which visually analyzes cohorts
of diabetes patients. However, tracking the changes in scenes seems difficult as human beings
usually only concentrate on what they see.
Potential cyclic behaviors in an IM log are helpful for analyzing users’ online behaviors. The above
visualization techniques have their own advantages in visualizing data with temporal information,
but they fail to reveal previously unknown periodic patterns in the data.
2.2 Radial visualization
Radial visualization has become an increasingly popular technique [7]. Circular, elliptical
or spiral representations all belong to this visualization paradigm. Radial visualization was first
introduced by Salton et al. [15] for visualizing text data. Since then, many successful visualization
tools have used radial layout to visualize various kinds of data.
Among them, some designs have been used to illustrate the periodicity hidden in a dataset.
Dragicevic et al. [6] design SpiraClock to display bus schedules, which have cyclic departure times.
Bertini et al. [3] present SpiralView for monitoring the routine activity of a network, enabling a
retrospective view on anomalies. However, SpiraClock and SpiralView are not suitable for IM
logs, since the narrow space in the center area cannot show additional time-related
information.
TelogViz [13] and ChronoView [16] adopt hollow circles to solve the problem. TelogViz is
proposed to visualize cellphone communication logs, depicting the periodicity with which the cellphone
user contacts friends. However, the design only supports one month of data, whereas the
time ranges of IM log files are usually longer than one month, and most exceed one year.
ChronoView is a visualization tool for displaying the relationships between events and time
stamps. However, the tool follows a 24-hour scale, which focuses only on the time of events and
ignores the date.
The aforementioned radial visualization methods put emphasis on fixed time scales such as a one-month
scale or a 24-hour scale, while the time ranges of IM log files vary across users. Meanwhile,
these methods can detect the periodicity of conversation occurrences only at the
single-buddy level or the all-buddy level, and fail to display it at multiple levels.
IMLogVis originates from radial visualization. We augment it with scatterplot in the interior to
intuitively convey the proximity relation between the user and different buddies. The enhanced
radial visualization displays the distribution of conversations at multiple levels, enabling users to
understand and compare the periodicity of the conversation occurrences. Moreover, IMLogVis
provides multiple time scales, allowing users to flexibly choose different scales according to their
requirements.
3 System Overview
Fig. 1: System overview of our visual tool
We design IMLogVis and develop a tool based on it. The system overview is shown in Fig. 1.
It contains three components: a log processing component, a log visualization component and
an interaction component. The log processing component prepares the structured data for the
subsequent procedures.
The log visualization component transforms the formatted data into visualization views. There
are four visualization views including the Summary View, Word Cloud View, Calendar View
and Detail View. The Summary View is a major view, which consists of an enhanced radial
visualization with embedded scatterplot and a trend curve on the bottom. The Summary View
provides an overview of communication frequency changes and the distribution of conversations.
As for the other three views, they display more details from different aspects. The Word Cloud
View provides the summarization of chat contents by a set of keywords. The Calendar View
reveals the communication frequency for the specific buddy per day. The Detail View shows
some key statistical information. Moreover, connections among multiple views help users better
understand the correlations among attributes.
The interaction component provides rich interactions for users to visually explore and comprehend
the data. For example, a user can hover over a buddy node, so that the distribution of
conversations for the buddy and his/her username will be shown. If a user would like to look over
the information in a specific time range, he/she can drag the time window over the trend curve
and all the visualization views will update synchronously.
4 Visualization Design
4.1 Data preparation
An IM log is generally semi-structured and updated over time. The raw data is stored in a series of
text files, which comprise buddies’ information, chat messages, temporal information, etc. We
group chat messages into conversations according to the time intervals between messages. When the
user is idle for a long time, the conversation session times out.
Fig. 2: The interface of IMLogVis. (a) The Summary View. (b) The Calendar View. (c) The Word
Cloud View. (d) The Detail View
Definition 1 A conversation is a sequence of messages in which the time interval between
two consecutive messages is no more than a threshold δ.
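Definition 1 can be sketched in a few lines of Python. The paper does not give a value for δ, so the 30-minute threshold used below is only an assumption, as are the function and variable names.

```python
from datetime import datetime, timedelta

def group_into_conversations(timestamps, delta):
    """Split message time stamps into conversations (Definition 1)."""
    conversations = []
    for ts in sorted(timestamps):
        # Extend the current conversation while the gap to the previous
        # message is at most delta; otherwise start a new conversation.
        if conversations and ts - conversations[-1][-1] <= delta:
            conversations[-1].append(ts)
        else:
            conversations.append([ts])
    return conversations

# Illustrative time stamps (not data from the paper).
messages = [
    datetime(2014, 9, 19, 11, 1, 23),
    datetime(2014, 9, 19, 11, 5, 0),
    datetime(2014, 9, 19, 15, 30, 0),
]
# Assumed threshold: 30 minutes of idle time ends a conversation.
sessions = group_into_conversations(messages, timedelta(minutes=30))
```

Each inner list then yields the start, end and duration of one conversation from its first and last time stamps.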
4.2 Summary view
4.2.1 Enhanced radial visualization
Temporal information is one of the most important attributes of IM log. In order to illustrate
the temporal characteristics and detect the potential periodic patterns, we propose an enhanced
radial visualization. Considering an analog clock, we map the conversations with time stamps on
the spiral axis surrounding the clock.
Time ranges of IM log files vary across users. Some span several years while
others cover less than a year. Therefore, IMLogVis provides multiple time scales rather than a
fixed scale. The time scales comprise yearly, monthly and daily scales, each of which has
a different period. The daily scale, for example, follows a 24-hour period. Considering the
analog clock, the north is midnight. Moving clockwise, the east is 6 am, the south is 12 pm (noon) and
the west is 6 pm.
To help users understand and compare the periodicity of conversation occurrences, IMLogVis
displays the distribution of conversations at multiple levels: the single-buddy,
grouped-buddy and all-buddy levels. The sectors along the circumference count the number of
conversations for all buddies during the corresponding time intervals, so that they show the
distribution of conversations at the all-buddy level.
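As an illustration of how the sector values could be computed on the daily scale, the following sketch bins conversation start times by time of day. The number of sectors (one per hour here) and the name `sector_counts` are assumptions; the paper does not state the sector width.

```python
from collections import Counter
from datetime import datetime

def sector_counts(start_times, n_sectors=24):
    """Count conversations per sector over a 24-hour period (daily scale)."""
    counts = Counter()
    for t in start_times:
        seconds = t.hour * 3600 + t.minute * 60 + t.second
        # Map the time of day onto one of n_sectors equal slices of the day.
        counts[seconds * n_sectors // 86400] += 1
    return [counts.get(i, 0) for i in range(n_sectors)]

# Illustrative start times (not data from the paper).
starts = [
    datetime(2014, 9, 19, 21, 10, 0),
    datetime(2014, 9, 20, 21, 50, 0),
    datetime(2014, 9, 21, 9, 5, 0),
]
hourly = sector_counts(starts)
```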
Several equidistant rings are placed on the spiral axis. A color bar between two rings represents
a conversation, whose color denotes a group. We cluster the conversations by the groups
their buddies belong to, in order to observe the distribution of conversations and detect periodicities
at the grouped-buddy level. To avoid overcrowding the color bars, the more conversations
a group has, the farther out on the spiral axis it is placed. The colors of the bars match the
colors displayed in the scatterplot for visual correlation. Limited by the color recognition
of human beings [18], IMLogVis allows users to observe at most six groups simultaneously. If
there are more than six groups, the system displays the top six groups ranked by the
number of their messages.
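The group selection and ring placement described above might be implemented along the following lines. The function name, the dictionary inputs and the reading of each group's position as a 1-based ring index are assumptions made for this sketch; the counts are placeholders, not data from the paper.

```python
def select_groups(message_counts, conversation_counts, limit=6):
    """Choose the groups to display and give each a ring index (1 = innermost)."""
    # Keep the `limit` groups with the most messages.
    top = sorted(message_counts, key=message_counts.get, reverse=True)[:limit]
    # Groups with more conversations are placed on outer rings.
    ordered = sorted(top, key=lambda g: conversation_counts.get(g, 0))
    return {group: ring for ring, group in enumerate(ordered, start=1)}

# Placeholder counts for seven hypothetical groups.
message_counts = {"g1": 900, "g2": 700, "g3": 650, "g4": 300,
                  "g5": 200, "g6": 120, "g7": 15}
conversation_counts = {"g1": 80, "g2": 95, "g3": 60, "g4": 30,
                       "g5": 25, "g6": 10, "g7": 3}
rings = select_groups(message_counts, conversation_counts)
```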
Let $T = \{t_1, t_2, \ldots, t_n\}$ be a set of time stamps, and $G = \{g_1, g_2, \ldots, g_n\}$ be a set of groups.
The set of radii $R$ can be expressed as:
$$R = \{\, r_i \mid r_i = r_{Origin} + d_{Gap} \times (g_i - 1),\ 1 \le i \le n \,\} \qquad (1)$$
where $r_{Origin}$ is the radius of the analog clock, and $d_{Gap}$ is the distance between adjacent rings on
the spiral axis.
Suppose that the north of the analog clock is $t_{Start}$. The positions of color bars are defined
by the function $f: T \to \mathbb{R}^2$:
$$f(t) = (r\cos\alpha,\ r\sin\alpha) \qquad (2)$$
where $\alpha$ is obtained from:
$$\alpha = \frac{5}{2}\pi - 2\pi\,\frac{t - t_{Start}}{m} \qquad (3)$$
Here $t$ is the beginning time of a conversation. When $m$ is 24 h, the daily scale is provided (see Fig. 2),
and $t_{Start}$ corresponds to 0 o’clock. When $m$ is 720 h, the monthly scale is provided. The yearly
scale is defined in a similar way.
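Eqs. (1) to (3) can be checked numerically with a short Python sketch. It assumes mathematical axes (y pointing up) and measures t in hours; the default values for rOrigin and dGap are arbitrary, and the function names are hypothetical.

```python
import math

def ring_radius(g, r_origin, d_gap):
    """Eq. (1): radius of the ring for group index g (1-based)."""
    return r_origin + d_gap * (g - 1)

def bar_angle(t, t_start, m):
    """Eq. (3): angle for start time t, with period m in the same unit as t."""
    return 2.5 * math.pi - 2 * math.pi * (t - t_start) / m

def bar_position(t, g, t_start=0.0, m=24.0, r_origin=100.0, d_gap=10.0):
    """Eq. (2): Cartesian position of the color bar for one conversation."""
    r = ring_radius(g, r_origin, d_gap)
    alpha = bar_angle(t, t_start, m)
    return (r * math.cos(alpha), r * math.sin(alpha))

# Daily scale: midnight, 6 am and noon on the innermost ring.
north = bar_position(0.0, 1)
east = bar_position(6.0, 1)
south = bar_position(12.0, 1)
```

With m = 24 and tStart = 0, the three sample points land at the top, the right and the bottom of the clock, which agrees with the clockwise layout described in Subsection 4.2.1. On a screen coordinate system whose y axis points down, the sign of the second component would need to be flipped.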
4.2.2 Scatterplot
Buddies play a key role in IM; they may be users’ college classmates, business partners, close
friends, etc. These buddies are divided into different groups based on personal preference.
Compared with visualization forms such as Vizster [10], which show many-to-many relationships,
IM mainly involves a one-to-many relationship. Consequently, we adopt the scatterplot,
incorporating our new time-based node ordering method, to display the proximity relation between
the user and buddies.
In the scatterplot, a node encodes a buddy. Color encodes the group that buddies belong to.
We use size of the node to represent communication frequency, and distance from the center of the
analog clock to represent the last chat time. Links between the selected buddy and the spiral axis
depict conversations with the buddy, displaying the distribution of conversations at single-buddy
level.
A time stamp contains date and time (e.g. 2014/9/19 11:01:23). Therefore, a time stamp $t$ can
be represented as $(t_D, t_T)$. Let $D = \{d_1, d_2, \ldots, d_n\}$ be a set of distances between the center and
the buddy nodes. $d_i$ can be expressed as:
$$d_i = \frac{t_{D_i}}{t_{D_{max}} - t_{D_{min}}} \times r_{Origin} \qquad (4)$$
Suppose the communication frequency is $f$ and the size of node is $s$. The detail of the time-based
node ordering method is as follows:
Step 1 Extract username, group, communication frequency and the last chat time for each buddy
from the structured table.
Step 2 According to Eq. (4), compute the distance of each node away from the center.
Step 3 According to Eq. (3), compute the angle of each node.
Step 4 Compute the size of node $s = \lambda f$, where $\lambda$ is a constant.
Step 5 Render and place the nodes in sequence.
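Steps 2 to 4 can be sketched as follows. The paper does not say how the date part of a time stamp is turned into a number, so this sketch assumes it is the number of days elapsed since the buddy's last chat (0 for a chat on the most recent day), which places recently contacted buddies near the center. It also assumes that the angle in Step 3 uses the time-of-day part of the last chat with m = 24 h. The `Buddy` class, `layout_nodes`, and the values of rOrigin and λ are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class Buddy:
    username: str
    group: str
    frequency: int  # communication frequency f
    t_d: float      # date part of the last chat, in days (assumed encoding)
    t_t: float      # time-of-day part of the last chat, in hours

def layout_nodes(buddies, r_origin=100.0, lam=0.5, m=24.0, t_start=0.0):
    """Compute position (Eqs. 3 and 4) and size (s = lam * f) for each node."""
    days = [b.t_d for b in buddies]
    span = (max(days) - min(days)) or 1.0  # guard against a zero denominator
    nodes = []
    for b in buddies:
        d = b.t_d / span * r_origin                                   # Eq. (4)
        alpha = 2.5 * math.pi - 2 * math.pi * (b.t_t - t_start) / m   # Eq. (3)
        nodes.append({
            "username": b.username,
            "group": b.group,
            "x": d * math.cos(alpha),
            "y": d * math.sin(alpha),
            "size": lam * b.frequency,
        })
    return nodes

# Placeholder buddies (not data from the paper).
buddies = [
    Buddy("a", "g1", 130, t_d=0.0, t_t=14.0),
    Buddy("b", "g2", 40, t_d=30.0, t_t=6.0),
    Buddy("c", "g2", 10, t_d=60.0, t_t=0.0),
]
nodes = layout_nodes(buddies)
```

Rendering (Step 5) is left out; the returned dictionaries carry everything a drawing routine would need.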
4.2.3 Trend curve
To reveal how communication frequency changes over time, we design a trend curve.
Each time point denotes a day. The height at each time point represents the corresponding
communication frequency, indicating how often conversations occur. A
height of zero means that no conversation took place on that day. Additionally, there is a time window
over the trend curve. The extent of the time window on the time axis specifies the time range of
interest. The start date and end date of the time window are displayed below the trend curve.
The trend curve has two advantages. Firstly, users get an overview of communication frequency
changes. Secondly, users can update the whole view by choosing a time range of interest to
explore more information.
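The data behind the trend curve and its time window could be prepared roughly as shown below. The sketch assumes that the communication frequency of a day is the number of conversations that start on it; the function names are hypothetical and the dates are placeholders.

```python
from collections import Counter
from datetime import date, timedelta

def daily_frequency(conversation_dates):
    """Return (day, count) pairs for every day in the log range, zeros included."""
    counts = Counter(conversation_dates)
    first, last = min(counts), max(counts)
    n_days = (last - first).days + 1
    days = [first + timedelta(days=i) for i in range(n_days)]
    return [(day, counts.get(day, 0)) for day in days]

def in_window(series, start, end):
    """Keep only the points inside the selected time window (inclusive)."""
    return [(day, count) for day, count in series if start <= day <= end]

# Placeholder conversation dates.
dates = [date(2012, 8, 5)] * 3 + [date(2012, 8, 7)] + [date(2012, 8, 9)] * 2
series = daily_frequency(dates)
window = in_window(series, date(2012, 8, 6), date(2012, 8, 8))
```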
4.3 Word cloud view
Apart from the detailed temporal information of IM log, users also care about the chat contents.
To help them intuitively grasp the main information, the chat contents are summarized by
a set of keywords $W = \{w_1, w_2, \ldots, w_n\}$, which are represented as word clouds. Font size encodes
the weight of keywords. Color does not yet carry any particular meaning.
We design a word cloud layout based on Wordle [17]. Each keyword is regarded as a rectangle.
Candidate positions for keywords are selected along a spiral whose radius increases gradually.
Boundary detection and intersection detection then determine the final position of each
keyword.
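A much simplified version of this placement loop is sketched below. It is not the Wordle algorithm itself, only an illustration of the three ingredients named above: an Archimedean spiral of candidate positions, a boundary test and a rectangle intersection test. All names and parameter values are assumptions.

```python
import math

def overlaps(a, b):
    """Intersection test for axis-aligned rectangles given as (x, y, w, h)."""
    return (a[0] < b[0] + b[2] and b[0] < a[0] + a[2] and
            a[1] < b[1] + b[3] and b[1] < a[1] + a[3])

def inside(rect, width, height):
    """Boundary test: the rectangle must lie within the canvas."""
    x, y, w, h = rect
    return x >= 0 and y >= 0 and x + w <= width and y + h <= height

def place_keywords(sizes, width, height, step=0.1, growth=1.0, max_steps=20000):
    """Place rectangles of the given (w, h) sizes, heaviest keyword first."""
    cx, cy = width / 2, height / 2
    placed, result = [], []
    for w, h in sizes:
        spot = None
        for k in range(max_steps):
            theta = k * step
            radius = growth * theta  # radius grows with the angle
            rect = (cx + radius * math.cos(theta) - w / 2,
                    cy + radius * math.sin(theta) - h / 2, w, h)
            if inside(rect, width, height) and not any(overlaps(rect, p) for p in placed):
                spot = rect
                break
        if spot is not None:
            placed.append(spot)
        result.append(spot)  # None marks a keyword that could not be placed
    return result

rects = place_keywords([(60, 20), (40, 15), (30, 10)], width=200, height=100)
```

The first keyword lands at the center of the canvas, and later keywords take the first spiral position that passes both tests.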
4.4 Calendar view and detail view
In contrast to the overview of the distribution of conversations and communication frequency
changes in the Summary View, the Calendar View illustrates the communication frequency for
the specific buddy per day. The layout is designed based on [5]. Each cell represents a day. Cells
are arranged in columns by week and then grouped by month and year. Their colors denote the
magnitude of the values behind each cell, as displayed in Fig. 2. The Calendar View can
help users understand chat patterns, and even enables an in-depth visual analysis of
relationships.
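One possible way to map a date to its cell is shown below. The exact layout of [5] is not reproduced here; the sketch assumes weeks that start on Monday and one block of week columns per month, and `calendar_cell` is a hypothetical name.

```python
from datetime import date

def calendar_cell(day):
    """Map a date to (year, month, week_column, weekday_row) in a calendar grid."""
    first_weekday = date(day.year, day.month, 1).weekday()  # Monday = 0
    week_column = (day.day - 1 + first_weekday) // 7
    return (day.year, day.month, week_column, day.weekday())

cell = calendar_cell(date(2014, 2, 14))
```

The color of each cell would then be looked up from the per-day communication frequency of the selected buddy.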
When users find something interesting, they are usually curious about the details. The Detail
View provides key statistical information (e.g. the communication frequency, the last chat time)
and the detailed chat contents.
4.5 User interactions
IMLogVis provides a series of rich interactions for users to uncover insights. Besides some basic
interactions such as zooming and panning, we also propose several particular interactions for the
system.
Time Navigation Users can drag the time window to select a time range of interest (see Fig.
2). As the time range is altered, all the visualization views update at the same time. Moreover,
users who want to explore the information on a specific day can do so through time navigation.
Keyword Location IM users often encounter an annoying problem: when they need to
find crucial information in the log, such as a meeting time or place, it is difficult
to pick it out from a mass of messages, especially when their memory of it
is vague. Therefore, we design keyword location to solve this problem. Users can choose
keywords related to the critical information in the Word Cloud View. Then, the messages containing
the highlighted keyword are presented in the Detail View.
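At its core, keyword location is a filter over the messages of the selected buddy. A minimal sketch, assuming messages are plain strings and using square brackets as a stand-in for visual highlighting, might look like this (the function name and sample messages are made up):

```python
def locate_keyword(messages, keyword):
    """Return (index, highlighted_text) for every message containing the keyword."""
    hits = []
    for i, text in enumerate(messages):
        if keyword in text:
            # Brackets stand in for the highlight shown in the Detail View.
            hits.append((i, text.replace(keyword, f"[{keyword}]")))
    return hits

# Made-up messages for illustration.
chat = ["see you at the meeting", "ok", "the meeting moved to 3 pm"]
found = locate_keyword(chat, "meeting")
```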
Multiple Time Scales Exploration As illustrated in Subsection 4.2.1, IMLogVis provides
multiple time scales. When importing the raw data, the system will determine the most appro-
priate scale according to the time range. Users have options to choose one scale flexibly based on
their requirements.
Linking IMLogVis supports automatic linking in the following two situations: 1) when users
select a node in the scatterplot, the Word Cloud View, Calendar View and Detail View will
display the corresponding information; 2) when users hover over the node, the radial lines depicting
conversations between the user and the selected buddy are automatically connected to the
spiral axis.
Data Filtering The system allows users to filter data by groups and keyword frequencies.
Users can manually choose several groups of interest in the Summary View. Meanwhile, users can
filter the keywords by adjusting the parameter.
5 Case Study
To show the usability of IMLogVis, several cases are presented in this section. We illustrate how
to use our system to analyze IM logs and explore hidden patterns. The datasets were obtained from
the IM system Tencent QQ and belong to different users. The behavior patterns of three users
are displayed in Fig. 3.
5.1 Personal habits exploration
From the behavior patterns of these users, we can explore and analyze their personal habits.
IMLogVis provides multiple time scales. When the daily scale is selected, users’ daily routines can be
observed (see Fig. 3(a), Fig. 3(b) and Fig. 3(c)). For instance, user A, user B and user C all
go to sleep late, while B gets up earlier than the others. Also, the daily routines of A and C are more regular
than that of B, since B is clearly still active even late at night. Moreover, there are
fewer conversations at 12 pm and 1 pm (as shown in Fig. 3(c)), and the same situation occurs
at 6 pm. Consequently, we can deduce that C has a habit of taking a nap at noon, and usually
has dinner at 6 pm. This deduction was later confirmed.
Fig. 3: Comparison of three users’ behavior patterns. (a) User A’s behavior patterns at daily scale. (b)
User B’s behavior patterns at daily scale. (c) User C’s behavior pattern at daily scale. (d) User C’s
behavior pattern at yearly scale
The system also makes it possible to discover the chat patterns between the user and his/her buddies. For
example, user A often chats with friends from 9 pm to 11 pm, while there are fewer conversations
during the rest of the day. C tends to contact friends in three time periods:
9 am to 11 am, 2 pm to 5 pm, and 8 pm to 9 pm. B keeps in touch with friends almost
the whole day.
Furthermore, some patterns can be found from the trend curve. As shown in Fig. 3(a), there
are two peaks on Aug. 5 and Aug. 9 in 2012. Interestingly, we notice that the communication
frequency drops to zero in three time periods in Fig. 3(c): from Sep. 28 to Oct. 16 in
2013, and from Jan. 9 to Feb. 17 and Jul. 20 to Aug. 7 in 2014. Hence, we switch to the yearly
scale for further exploration.
As shown in Fig. 3(d), the distribution of conversations on the spiral axis matches the same
pattern as we find from the trend curve. It is clear that there is no conversation in the three time
periods. We can also find that there are fewer conversations in January, February, and August
than in other months. Therefore, we can infer that C was on vacation from Jan. 9 to Feb. 17 and from
Jul. 20 to Aug. 7 in 2014. This inference was later confirmed. Oct. 1 to Oct. 7 is China’s National
Day holiday. C took a trip from Sep. 28 to Oct. 7 in 2013 and then went away for a
meeting from Oct. 8 to Oct. 16.
5.2 Social relationship exploration
Users communicate with various people by IM software. Based on the log files, their social
relationship can be explored and analyzed.
As shown in the scatterplots of Fig. 3, user B has more friends than user A and user C. However,
A and C have contacted most of their friends recently. Combining this with the conclusion in
Subsection 5.1 that B keeps in touch with friends almost the whole day, we conclude that B is
more sociable than A and C. Thus we target B for further exploration.
Fig. 4: The visualization of user B’s IM log
The figure presents the related information after selecting the buddy node “Zhang Chenlei”.
When clicking the keyword “work” in the Word Cloud View, the chat messages
including this keyword are shown in the Detail View.
We can clearly see that B communicates more with buddies from “postgraduate” (blue),
“318lab” (green) and “college” (fuchsia) than with other groups. Comparing the node distribution of
the three groups, buddies from “318lab” are mainly around the center, while buddies from “postgraduate”
and “college” are spread everywhere. Therefore, B keeps in close touch with “318lab”. By
hovering over the nodes, we discover that B contacts friends from “318lab” mainly during
working hours. Therefore, the buddies from this group may be B’s colleagues.
Then, we choose a buddy of B to explore the relationship between them (see Fig. 4). From the
Summary View, we can observe that B communicates with the buddy “Zhang Chenlei”
at almost any time, even after 2 am. Meanwhile, statistics in the Detail View show that the
communication frequency is 130 per year and that the last chat took place on the day we obtained the
log. Based on the above observations, we can conclude that they are close friends.
Fig. 5: The Calendar View displaying the first and the last chat time between user B and the buddy
“Zhang Chenlei”
More details can be discovered from the Calendar View and the Word Cloud View. The first
and the last chat time are shown in Fig. 5. From the Calendar View, we can see that there
is no communication from June to August in 2013 or in July and August in 2014. Besides these five
months, there are sporadic conversations in January and February 2014, and frequent chats in the
other months. Combining this with the distribution of conversations in the Summary View, we guess that
B and “Zhang Chenlei” are students or teachers.
The keywords in the Word Cloud View (see Fig. 4) are mainly about time and people’s
names. We then select several particular keywords such as “work”, “practice” and
“communication” to see the corresponding conversations. From the chat contents, we
infer that they may be members of the same organization. Afterwards, we learned that B and her
friend are postgraduates, both of whom worked in the student union.
5.3 Interesting discovery
During the process of exploration, we have discovered some interesting patterns. There are two
examples.
As shown in Fig. 6, the conversations between user C and the selected buddy are evenly distributed over
three time periods. As the dashed box in the Calendar View shows, they communicated regularly
every Thursday during the three months. The user was surprised by the pattern. Through time
navigation, we found that the exact time period was from Feb. 14 to Apr. 16 in 2014. Afterwards, the
user recalled that they had worked together during that period and needed to hand over
the work every Thursday.
The visualization of user D’s IM log is displayed in Fig. 7. For the selected node, we can see
that D communicates with “[13] Liu Hanqing” at almost any time before midnight,
except for one conversation that took place after 1 am. The user was curious about this
conversation. It took place on Dec. 23, 2013, and was about how to celebrate
Christmas Eve. Moreover, when D caught sight of the keywords “Xi Jinping” and
Fig. 6: The visualization of user C’s IM log. The figure displays the related information after selecting
the buddy node “Guo Juan”. In the dashed rectangle, we can observe that C and “Guo Juan”
have fixed communication every Thursday during the three months
Fig. 7: The visualization of user D’s IM log. The figure shows the related information after selecting
the buddy node “[13] Liu Hanqing”. When clicking the keyword “Wang Qishan” in
the Word Cloud View, the corresponding chat messages are displayed in the Detail View
“Wang Qishan” (both are leaders of China), he was surprised, because he believed
that he had never talked about China’s leaders with his friend. We then clicked the keyword
“Wang Qishan” in the Word Cloud View, and a corresponding chat message was
shown in the Detail View. We found that they had indeed talked about China’s leaders, and that the
chat message was a joke.
5.4 Discussion
The case studies have demonstrated that our visualization method is effective for analyzing
users’ behavior patterns and revealing hidden information in the IM log. Firstly, the periodicity of
the conversation occurrences can be figured out. Secondly, personal habits such as daily routine
and chat patterns can be easily discovered. Thirdly, the social relationship between the user and
various friends can be detected or deduced from the cross analysis of multiple views. Additionally,
coupled with some interactions like multiple time scales exploration and keyword location, more
interesting patterns and information can be discerned.
On the whole, the visualization method helps users get an overview of their online life. Users
may recall special events they discussed with their friends, or realize that they should contact
some distant friends. Furthermore, the discovery and analysis of users’ behavior patterns can
serve as a reference for friend or application recommendation. The police may also obtain the IM
log from the IM service provider and use IMLogVis to discover a suspect’s daily routine, chat patterns and
social relationships, which are helpful for finding critical clues and anomalies.
6 Conclusion and Future Work
In this paper, we have proposed IMLogVis, a novel method to interactively visualize and analyze
IM log. We propose an enhanced radial visualization to display the distribution of conversations
at multiple levels and on multiple time scales. The embedded scatterplot shows the user’s one-to-
many relationship based on the temporal information. Combining a trend curve and Heat Map,
our design illustrates the communication frequency changes from different facets. Moreover,
connections among multiple views and rich interactions support comprehensive analysis of the
log. The case studies show that our method is effective in discovering personal habits, social relationships
and even some interesting patterns in the IM log.
There are two avenues for future work. From the perspective of application, IMLogVis can be
used to analyze other data with temporal features, such as mobile data. From the perspective
of visualization design, the layout of tag clouds can be improved by assigning meanings to
keywords’ colors and positions, so that users can understand how content evolves over time.
Acknowledgement
This work is supported by the Science and Technology Support Program of Sichuan Province, China
(No. 2013GZ0015), and the Foundation of the Department of Science and Technology of Sichuan Province,
China (No. 2013DTPY0010). The authors also wish to thank the reviewers for their helpful
comments.
References
[1] (Dec. 2014), Website of China Internet Network Information Center, http://www.cnnic.net.cn.
[2] W. Aigner, S. Miksch, W. Muller, H. Schumann, and C. Tominski. Visual Methods for Analyzing
Time-Oriented Data. IEEE Transactions on Visualization and Computer Graphics, 14(1): 47–60,
2008.
[3] E. Bertini, P. Hertzog, and D. Lalanne. Spiralview: Towards Security Policies Assessment through
Visual Correlation of Network Resources with Evolution of Alarms. IEEE Symposium on Visual
Analytics Science and Technology, pages 139–146. IEEE, 2007.
[4] (Dec. 2014), H. Rosling, Website of Gapminder, http://www.gapminder.org.
[5] (Dec. 2014), R. Wicklin and R. Allison, Website of Data Expo 2009,
http://stat-computing.org/dataexpo/2009/posters/.
[6] P. Dragicevic and S. Huot. Spiraclock: A Continuous and Non-Intrusive Display for Upcoming
Events. CHI’02 extended abstracts on Human factors in computing systems, pages 604–605. ACM,
2002.
[7] G. Draper, Y. Livnat, and R. F. Riesenfeld. A Survey of Radial Methods for Information Visual-
ization. IEEE Transactions on Visualization and Computer Graphics, 15(5): 759–776, 2009.
[8] Y. Hashimoto and R. Matsushita. Heat Map Scope Technique for Stacked Time-Series Data
Visualization. IEEE 16th International Conference on Information Visualisation (IV), pages 270–
273. IEEE, 2012.
[9] S. Havre, B. Hetzler, and L. Nowell. Themeriver: Visualizing Theme Changes Over Time. IEEE
Symposium on Information Visualization, pages 115–123. IEEE, 2000.
[10] J. Heer and D. Boyd. Vizster: Visualizing Online Social Networks. IEEE Symposium on Informa-
tion Visualization, pages 32–39. IEEE, 2005.
[11] J. Heer, N. Kong, and M. Agrawala. Sizing the Horizon: The Effects of Chart Size and Layering
on the Graphical Perception of Time Series Visualizations. Proceedings of the SIGCHI Conference
on Human Factors in Computing Systems, pages 1303–1312. ACM, 2009.
[12] W. Javed, B. McDonnel, and N. Elmqvist. Graphical Perception of Multiple Time Series. IEEE
Transactions on Visualization and Computer Graphics, 16(6): 927–934, 2010.
[13] M. LI, M. ZHU, Q. GAN, and T. LIANG. A Design to Visualize Cellphone Communication Log
in an Interesting Way. Journal of Computational Information Systems, 9(22): 9165–9176, 2013.
[14] A. Rind, W. Aigner, S. Miksch, S. Wiltner, M. Pohl, F. Drexler, B. Neubauer, and N. Suchy. Visu-
ally Exploring Multivariate Trends in Patient Cohorts Using Animated Scatter Plots. Ergonomics
and Health Aspects of Work with Computers, pages 139–148. Springer, 2011.
[15] G. Salton, J. Allan, C. Buckley, and A. Singhal. Automatic Analysis, Theme Generation, and
Summarization of Machine-Readable Texts. Information retrieval and hypertext, pages 51–73.
Springer, 1996.
[16] S. Shiroi, K. Misue, and J. Tanaka. Chronoview: Visualization Technique for Many Temporal
Data. IEEE 16th International Conference on Information Visualisation (IV), pages 112–117.
IEEE, 2012.
[17] F. B. Viegas, M. Wattenberg, and J. Feinberg. Participatory Visualization with Wordle. IEEE
Transactions on Visualization and Computer Graphics, 15(6): 1137–1144, 2009.
[18] C. Ware. Information Visualization: Perception for Design. Elsevier, 2013.
[19] S. Yagi, Y. Uchida, and T. Itoh. A Polyline-Based Visualization Technique for Tagged Time-
Varying Data. IEEE 16th International Conference on Information Visualisation, pages 106–111.
IEEE, 2012.