Visual analytics: designing flexible filtering in parallel coordinate graph

Authors: Z. Idrus1,*, H. Zainuddin2 and A. D. M. Ja’afar1

1Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, 40450 Shah Alam, Selangor, Malaysia
2Faculty of Applied Sciences, Universiti Teknologi MARA, 40450 Shah Alam, Selangor, Malaysia

Journal of Fundamental and Applied Sciences
ISSN 1112-9867
Available online at http://www.jfas.info
Journal of Fundamental and Applied Sciences is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Published online: 17 October 2017

ABSTRACT
Data visualization is a technique of creating visual images to help users quickly understand big
data. The visualization reveals hidden knowledge through data patterns and relationships. The
process is known as visual analytics. Nevertheless, complex and huge data create visualizations
that are congested and hardly reveal the data patterns. Thus, filtering is a technique that gives
users the flexibility to control the data view so as to focus only on the part of interest and hide
the others. However, most filtering is not studied in a structured manner. Thus, this research has
designed a structured process for formulating filtering techniques in a parallel coordinate graph,
since it is a widely used technique for visualizing multivariate data. With the process, flexible
visual filtering presentations for the parallel coordinate graph have been produced. The findings
support a wide range of visual analytics needs in parallel coordinate graphs.

Keywords: data visualization; visual analytics; parallel coordinate graph; data filtering.

Author Correspondence, e-mail: zainura@tmsk.uitm.edu.my
doi: http://dx.doi.org/10.4314/jfas.v9i5s.3

Research Article
Special Issue
Z. Idrus et al. J Fundam Appl Sci. 2017, 9(5S), 23-32 23
1. INTRODUCTION
Data visualization is a term used for visual images that help users to understand complex data [1]
through their patterns and relationships [2]. Others define data visualization as the art of
conveying information visually instead of presenting it in numerical format [3]. Some common
forms of data visualization are tables, diagrams, images [4], plots, graphs and charts.
The visual image is also essential for complex analyses, and the process is defined as visual
analytics [5]. It is a science of analytical reasoning and must be built with integrative features [6].
Visual analytics assists in the process of extracting patterns and relationships that exist within the
data [7]. However, in complex and huge data, it is common for patterns to be cluttered, which leads
to unjustifiable analysis. Thus, filtering is one of the techniques used to reduce visual clutter [8-9]
and data complexity and to simplify data patterns, thereby making relationships apparent.
Filtering is a process that limits the number of data entities displayed as a means of reducing data
congestion. Filtering gives users flexible control over the visualization so that they can focus on
their interest and hide insignificant items. It is a vital process in visual analytics since it is
unusual to visualize and analyze huge data at once [10]. However, most filtering is not studied in a
structured manner. Thus, this research studies filtering of the parallel coordinate graph in a
structured approach. The graph is chosen as it is a widely used visualization technology for
analyzing multivariate data. The research introduces a systematic process for designing flexible
filtering techniques. At the end of the process, a range of flexible visual filtering designs is
produced. The findings will support visual analytics needs in parallel coordinate graphs.
The remainder of this paper is structured as follows. Section 2 highlights the background study of
visual analytics in parallel coordinate graphs. Section 3 describes the process of visual analytics
design. It also explores the parallel coordinate graph as part of the process. Then, an abstraction
design of the parallel coordinate graph via a goal directed approach is undertaken. By adopting the
abstract scene analysis method, Section 4 discusses the seven filtering designs extracted from the
graph abstraction. Section 5 details the evaluation of the filtering designs. Finally, Section 6
concludes the findings with suggestions for future enhancement.
2. BACKGROUND STUDY
The rationale behind data visualization is to convey information through visual images such as
graphical tools. Since data are huge and have different levels, data visualization must be equipped
with visual analytics features. Visual analytics is also commonly known as visual exploration
[11-12]. Some of the common visual analytics techniques are filter, zoom, sort, brush, bind and
range [13]. The Information Seeking Mantra, introduced by [14], highlights a three-step path of
visual analytics. The first step is overview first, where data is viewed in a single graph.
Visualization technology is useful in this step. The second step is zoom and filter, which is to
identify interesting patterns and relationships. The final step is details on demand, which is to
focus on the parts of the data that are of interest. This is the stage where visual analytics
technology becomes vital. Visual exploration activities are performed to view information from
different perspectives.
The parallel coordinate graph is one of the visualization technologies. The main strength of the
graph is its ability to reveal the relationships in multivariate data [15-16] in a single graph
through its multi-dimensional axes. When compared to other visual technologies, the parallel
coordinate graph is well ahead in terms of the time taken to analyze its data [17]. Parallel
coordinates also work well with many visual analytics techniques such as filtering, sorting,
zooming, slicing, dicing and brushing [18-19].
Besides data, the edges of the parallel coordinate graph also attract researchers' interest. For
example, in the study by [11], the edges are divided into three types: within, between and
background. The edges can be selected to meet users’ needs. For every selection, within and between
edges are shown while background edges are hidden.
On the other hand, edges in [8] are viewed from a visual interaction perspective. The research
introduces a force model that reduces visual clutter through edge interaction. The force is the
main function responsible for reducing the interference between edges by allowing them to curve
and adjust their shape.
In terms of methodology, the Visualization Pipeline methodology is introduced in [20]. Filtering is
one of its steps and it is user-centered. Users decide which variables to focus on while hiding the
others during exploration.
Similarly, a new methodology to retain the visual exploration abilities of filtering in the parallel
coordinate graph has been formulated by [9]. The model extends the capability of current methods
by introducing parallel processing. With the new capability, the typical web-based visualization
library becomes a client-server model that can make plot rendering 20 times faster.
Similarly, a model of four functions has been designed for computer forensics investigators [15].
One of the model’s functions is filtering. It is applied to sort out data unrelated to the crime
scene. The aim is to speed up analysis and discover evidence through interactive visual exploration.
Even though the filtering technique is widely applied to big data analysis, its various modes of
presentation are hardly studied in detail.
3. ABSTRACT DESIGN
We propose two new structured steps for deriving variations in visual analytics presentations.
The first step is to understand the graph’s behaviors. This approach is adopted from the goal
directed approach, which can be traced back to 1990 [21]. Graph behaviors such as dimension,
concept, attribute types and attribute behaviors are studied. The behaviors and their
relationships form an abstraction of the graph. The second step is to identify the activities that
can be performed on the abstraction. The process is adopted from the abstract scene analysis
approach [22]. With such a focus in mind, various exploration designs can be derived and expanded
by domain experts. Thus, the design can be adapted to many applications.
The two-step visual analytics design is applied to the parallel coordinate graph. The first step is
to derive the behavior of the parallel coordinate graph. Let $G$ be the parallel coordinate graph.
The main entities of the graph are its axes, which are denoted as $C_x$, where
\[ C = \{1, \dots, C_x\} \tag{1} \]
and $C_x$ indicates the total number of axes in the graph.
During exploration, each axis has to be in one of two states: it is either selected or not selected
to be part of the graph visualization. The state is denoted as
$S = \{\text{selected}, \text{not-selected}\}$. An axis appears on the graph if it is selected.
At any given time, $G$ has at least one axis selected.
\[ G : C_x \to S \tag{2} \]
Plotted on the axes is the dataset, denoted by $D_y$, where $D = \{1, \dots, D_y\}$. The data can
also be in one of two modes, selected or not-selected; thus
$S = \{\text{selected}, \text{not-selected}\}$. At any given time, at least one data item is
selected. Thus, $G$ is a function indicating the state of the axes $C_x$ and the data $D_y$:
\[ G : C_x \times D_y \to S \tag{3} \]
The position of the axes, $P_x$, is not fixed, and the total number of positions is equal to the
total number of axes $C_x$; thus $P = \{1, \dots, P_x\}$, where
\[ G : C_x \times D_y \to S \times P \tag{4} \]
To refer to the elements of $S$ and $P$, “.s” and “.p” are used respectively. $G$ is said to be
filtered by axis if at least one of its axes $C_x$ is in not-selected mode (regardless of $C_x$
position and $D_y$ mode):
\[ \exists i \in C_x \, \bigl(G(i).s = \text{not-selected}\bigr) \tag{5} \]
or at least one of the axes in $G$ has been assigned to a new position (regardless of $C_x$ mode
and $D_y$ mode):
\[ \exists j \in C_x \, \bigl(G(j).p = P_x\bigr) \tag{6} \]
or at least one of the data $D_y$ in $G$ is in not-selected mode (regardless of $C_x$ position and
mode):
\[ \exists k \in D_y \, \bigl(G(k).s = \text{not-selected}\bigr) \tag{7} \]
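The three conditions (5) to (7) can be read as simple predicates over a snapshot of the graph. The following Python sketch is only an illustration under stated assumptions: the class `GraphState`, its field names and the `home_position` bookkeeping are hypothetical and do not come from the paper, and "assigned to a new position" is interpreted here as "no longer at its starting position".

```python
from dataclasses import dataclass, field

SELECTED = "selected"
NOT_SELECTED = "not-selected"


@dataclass
class GraphState:
    """Hypothetical snapshot of G: a state and position per axis, a state per data item."""
    axis_state: dict       # axis id -> SELECTED or NOT_SELECTED
    axis_position: dict    # axis id -> current position, 1..P_x
    data_state: dict       # data id -> SELECTED or NOT_SELECTED
    home_position: dict = field(default_factory=dict)  # layout before any move

    def __post_init__(self):
        # Keep a copy of the starting layout so that a moved axis can be detected later.
        if not self.home_position:
            self.home_position = dict(self.axis_position)


def is_valid(g):
    """The text requires at least one selected axis and one selected data item."""
    return SELECTED in g.axis_state.values() and SELECTED in g.data_state.values()


def filtered_by_axis_state(g):
    """Condition (5): some axis is in not-selected mode."""
    return any(s == NOT_SELECTED for s in g.axis_state.values())


def filtered_by_axis_position(g):
    """Condition (6), read here as: some axis no longer sits at its starting position."""
    return any(g.axis_position[a] != g.home_position[a] for a in g.axis_position)


def filtered_by_data(g):
    """Condition (7): some data item is in not-selected mode."""
    return any(s == NOT_SELECTED for s in g.data_state.values())


def is_filtered(g):
    """G counts as filtered when any one of the three conditions holds."""
    return filtered_by_axis_state(g) or filtered_by_axis_position(g) or filtered_by_data(g)


g = GraphState(
    axis_state={1: SELECTED, 2: SELECTED, 3: SELECTED},
    axis_position={1: 1, 2: 2, 3: 3},
    data_state={1: SELECTED, 2: SELECTED},
)
print(is_filtered(g))            # nothing hidden or moved yet
g.axis_state[2] = NOT_SELECTED   # hide one axis
print(filtered_by_axis_state(g), is_valid(g))
```

Keeping the three predicates separate mirrors the "regardless of" wording in the text: each condition is checked on its own, and the graph is filtered as soon as one of them is true.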
Let the range $R$ be the set $R = \{\text{solo}, \text{individual}, \text{continuous}\}$.
Thus,
\[ G : C_x \times D_y \to S \times P \times R \tag{8} \]
Since this research is interested in the changes of the parallel coordinate graph over time, a time
variable is added to the function $G$, which then becomes
\[ G(t) : T \times C \times D \to S \times P \times R, \quad \text{where } T = [t_{start}, \infty) \tag{9} \]
where $t_{start}$ is the starting time of the visual analytics process. $G(t)$ is the state of the
parallel coordinate graph at time $t$.
At any particular time $t$, an action can be made on the graph. There are three possible actions.
The first is to focus on the axes, which can be in one of two states, selected or not selected. The
second is to select some of the data to be viewed while the others are not selected. The third is
to change the position of an axis. The action is denoted by the function act:
\[ \mathit{act} : T \times C \times D \to S \cup P \cup R \cup \{\text{no\_change}\} \tag{10} \]
where no_change indicates that no action has been made on the graph.
Let $A(t_x, t_y)$ be the graph of act with time $t$ limited to $[t_x, t_y)$, such that
\[ A(t_x, t_y) = \{ \langle t, c, d, \mathit{act}(t, c, d) \rangle \mid t_x \le t < t_y,\ c \in C,\ d \in D \} \tag{11} \]
Thus, the flexible filtering design is a temporal function, written here as $\Phi$, that computes
all the filtering performed on the graph:
\[ G(t) = \Phi(t, A(t_{start}, t)) \tag{12} \]
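Equations (10) to (12) describe the graph state at time t as the result of applying every recorded action since the start. A minimal Python sketch of that idea follows. It is an illustration, not the authors' implementation: the `Action` tuple, its extra `target` field (needed to tell an axis state change from a data state change), the dictionary layout of the state and the swap rule for moving an axis are all assumptions made for this example.

```python
import copy
from typing import NamedTuple

NO_CHANGE = "no_change"
STATES = {"selected", "not-selected"}           # the set S
RANGES = {"solo", "individual", "continuous"}   # the set R


class Action(NamedTuple):
    """One tuple <t, c, d, act(t, c, d)>; `target` is an added, hypothetical field."""
    t: float        # time of the action
    target: str     # "axis" or "data": which entity a state change applies to
    c: object       # axis id (None for a pure data action)
    d: object       # data id (None for a pure axis action)
    value: object   # a member of S, P (an int position), R, or NO_CHANGE


def graph_of_act(log, t_x, t_y):
    """A(t_x, t_y): the actions with t_x <= t < t_y, in time order."""
    return sorted((a for a in log if t_x <= a.t < t_y), key=lambda a: a.t)


def replay(initial, log, t_start, t):
    """G(t): fold every action in A(t_start, t) over a copy of the initial state."""
    state = copy.deepcopy(initial)
    for a in graph_of_act(log, t_start, t):
        if a.value == NO_CHANGE:
            continue
        if a.value in STATES:
            key, item = ("axis_state", a.c) if a.target == "axis" else ("data_state", a.d)
            state[key][item] = a.value
        elif a.value in RANGES:
            state["range"] = a.value
        else:
            # A position: swap with the axis currently there so positions stay unique.
            pos = state["axis_position"]
            other = next(ax for ax, p in pos.items() if p == a.value)
            pos[other], pos[a.c] = pos[a.c], a.value
    return state


initial = {
    "axis_state": {1: "selected", 2: "selected", 3: "selected"},
    "axis_position": {1: 1, 2: 2, 3: 3},
    "data_state": {1: "selected", 2: "selected"},
    "range": None,
}
log = [
    Action(1.0, "axis", 2, None, "not-selected"),   # hide axis 2
    Action(2.0, "axis", 1, None, 3),                # move axis 1 to position 3
    Action(3.0, "data", None, 2, "not-selected"),   # hide data item 2
]
print(replay(initial, log, 0.0, 2.5))   # only the first two actions fall in [0.0, 2.5)
```

Storing the log rather than the current state is one way to keep the half-open interval of (11) meaningful: the same log can be replayed up to any time t to recover the view at that moment.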
The function is in fact an abstraction of the parallel coordinate graph that captures two vital
attributes: the behaviors that make up the graph and the filtering activities to be performed on
those behaviors. Thus, the next section identifies various filtering designs through the abstract
scene analysis technique [22].
4. VARIATION OF FILTERING DESIGN
From the abstraction of the parallel coordinate graph above, the second step is to extract the
variations in filtering to produce flexible filtering designs.
4.1. Scenario 1: Zero Filtering
The parallel coordinate graph displays all the axes and the whole dataset.
Abstract Scene Analysis:
All axes $C_x$ and dataset $D_y$ are set to selected mode. The position $P_x$ is unchanged.
\[ \forall i \in C_x,\ \forall j \in D_y \, \bigl(G(t_i)(i, j) = \langle s \rangle\bigr) \tag{13} \]
where $s = \text{selected} \in S$, for the duration of $t_i$ to $t_j$, where
$t_{start} \le t_i \le t_j < t_{end}$ and $t_i, t_j, t_{start}, t_{end} \in T$.
4.2. Scenario 2: Axis Choice
Axes are active with open selection. However, their positions are static.
Abstract Scene Analysis:
\[ \exists i \in C_x \, \bigl(G(t_i)(i) = \langle s, p, r \rangle\bigr) \tag{14} \]
where $s = \text{selected} \in S$, the position $G(i).p$ stays at a fixed $p \in P$, and $r \in R$,
for the duration of $t_i$ to $t_j$, where $t_{start} \le t_i \le t_j < t_{end}$ and
$t_i, t_j, t_{start}, t_{end} \in T$.
4.3. Scenario 3: Axis Ultimate
Given an axis, it can be viewed and its position can also be changed.
Abstract Scene Analysis:
\[ \exists i \in C_x \, \bigl(G(t_i)(i) = \langle p, r \rangle\bigr) \tag{15} \]
where $s = \text{not-selected} \in S$, $G(i).p = p$, $p \in P$, $r \in R$, for the duration of
$t_i$ to $t_j$, where $t_{start} \le t_i \le t_j < t_{end}$ and $t_i, t_j, t_{start}, t_{end} \in T$.
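The difference between Scenario 2 (choose axes, positions static) and Scenario 3 (choose axes and move them) can be sketched as a small projection step that decides which axes are drawn and in what order. The function names and the two sample records below are hypothetical; the values are made up for illustration and are not taken from the paper's dataset.

```python
def visible_axes(axis_state, axis_position):
    """Selected axes only, ordered left to right by their current position."""
    shown = [a for a, s in axis_state.items() if s == "selected"]
    return sorted(shown, key=lambda a: axis_position[a])


def polylines(records, axis_state, axis_position):
    """One polyline per record: the (axis, value) pairs to draw, in display order."""
    axes = visible_axes(axis_state, axis_position)
    return [[(a, rec[a]) for a in axes] for rec in records]


# Two made-up records; the numbers are illustrative only.
records = [
    {"time": "12:00", "ambient temperature": 31.0, "wind speed": 2.4},
    {"time": "12:05", "ambient temperature": 31.2, "wind speed": 1.9},
]
axis_state = {
    "time": "selected",
    "ambient temperature": "not-selected",
    "wind speed": "selected",
}

# Scenario 2: axes are chosen, positions keep their original order.
static = {"time": 1, "ambient temperature": 2, "wind speed": 3}
print(polylines(records, axis_state, static))

# Scenario 3: same choice, but wind speed is moved in front of time.
moved = {"wind speed": 1, "ambient temperature": 2, "time": 3}
print(polylines(records, axis_state, moved))
```

In this sketch a hidden axis keeps its position, so showing it again restores it to the same place in the layout.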
4.4. Scenario 4: Dataset Ultimate
Only data are selected. There are two types of selection, range and solo. Range selects two or
more data items continuously, while solo selects only one set of data.
4.5. Scenario 5: Dataset Range
It supports interactive selection of the dataset.
Abstract Scene Analysis:
\[ \exists j \in D_y \, \bigl(G(t_i)(j) = \langle s, r \rangle\bigr) \tag{16} \]
where $s = \text{selected} \in S$, $r \in \{\text{individual}, \text{continuous}\} \subseteq R$, for
the duration of $t_i$ to $t_j$, where $t_{start} \le t_i \le t_j < t_{end}$ and
$t_i, t_j, t_{start}, t_{end} \in T$.
4.6. Scenario 6: Dataset Solo
Only one dataset is allowed to be highlighted, while the others are dimmed and appear as data
background. The data background is important as it creates awareness [23].
Abstract Scene Analysis:
\[ \exists j \in D_y \, \bigl(G(t_i)(j) = \langle s, r \rangle\bigr) \tag{17} \]
where $s = \text{selected} \in S$, $r = \text{solo} \in R$, for the duration of $t_i$ to $t_j$,
where $t_{start} \le t_i \le t_j < t_{end}$ and $t_i, t_j, t_{start}, t_{end} \in T$.
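As a rough illustration of the solo design, the sketch below assigns every data item a drawing role: the one item in focus is highlighted and all others stay visible as background. The function name `solo_view` and the role labels are assumptions chosen for this example.

```python
def solo_view(data_ids, focus):
    """Map each data id to 'highlighted' (the item in focus) or 'background' (still drawn)."""
    if focus not in data_ids:
        raise ValueError("the focused data item must be part of the dataset")
    return {d: ("highlighted" if d == focus else "background") for d in data_ids}


view = solo_view([1, 2, 3, 4], focus=3)
print(view)
```

Returning a role for every item, instead of dropping the unselected ones, keeps the background data available to the renderer, which is the point the text makes about awareness.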
4.7. Scenario 7: Continuous
The selection of data is nested within the axes. The range of selection is continuous.
Abstract Scene Analysis:
\[ \exists i \in C_x,\ \exists j \in D_y \, \bigl(G(t_i)(i, j) = \langle s, r \rangle\bigr) \tag{18} \]
where $s = \text{selected} \in S$, $r = \text{continuous} \in R$, for the duration of $t_i$ to
$t_j$, where $t_{start} \le t_i \le t_j < t_{end}$ and $t_i, t_j, t_{start}, t_{end} \in T$.
4.8. Scenario 8: Split
Unlike Scenario 7, the range selection is not continuous.
Abstract Scene Analysis:
\[ \exists i \in C_x,\ \exists j \in D_y \, \bigl(G(t_i)(i, j) = \langle s, r \rangle\bigr) \tag{19} \]
where $s = \text{selected} \in S$, $r = \text{individual} \in R$, for the duration of $t_i$ to
$t_j$, where $t_{start} \le t_i \le t_j < t_{end}$ and $t_i, t_j, t_{start}, t_{end} \in T$.
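The three members of R can be told apart from the selected data ids alone. The sketch below assumes, as an interpretation rather than something stated in the paper, that data items are numbered with consecutive integers as in D = {1, ..., Dy}, that "continuous" means an unbroken run of ids, and that "individual" means any selection of two or more ids with gaps. The function name `range_mode` is hypothetical.

```python
def range_mode(selected_ids):
    """Classify a selection as 'solo', 'continuous' or 'individual'."""
    ids = sorted(set(selected_ids))
    if not ids:
        raise ValueError("at least one data item must be selected")
    if len(ids) == 1:
        return "solo"
    # An unbroken run of n ids spans exactly n consecutive integers.
    if ids[-1] - ids[0] + 1 == len(ids):
        return "continuous"
    return "individual"


print(range_mode([4]))          # a single item
print(range_mode([3, 4, 5]))    # an unbroken run, as in Scenario 7
print(range_mode([1, 3, 7]))    # a split selection, as in Scenario 8
```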
5. EVALUATION
This paper has presented abstractions of various data filtering designs for the parallel coordinate
graph. For evaluation, the dataset used is from a research center called the Green Energy Research
Centre (GERC). The center is located at a higher education institution, MARA University of
Technology, in Malaysia [24].
There are 11150 sets of data, collected every five minutes for a duration of 39 days from various
types of log sensors. The sensors used comply with an international standard as a means of quality
control [25]. A total of nine types of variables have been collected from these log sensors, which
are date, time, temperature, solar irradiance, ambient temperature, relative humidity, module
temperature and wind speed. The dataset is complex enough to be used to generate abstractions of
various visual filtering designs. It has a variety of variable types, a high volume of data, and
velocity, as the data change with time.
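As a quick sanity check of the reported volume, the sampling figures can be multiplied out. The sketch assumes, purely for illustration, that logging ran around the clock for 39 full days; the paper does not say this, nor does it explain any difference from the reported count.

```python
SAMPLE_INTERVAL_MIN = 5     # one record every five minutes (from the text)
DAYS = 39                   # logging duration (from the text)
REPORTED_RECORDS = 11150    # number of data sets reported in the text

samples_per_day = 24 * 60 // SAMPLE_INTERVAL_MIN    # records per full day
expected_records = DAYS * samples_per_day           # upper bound under the assumption
shortfall = expected_records - REPORTED_RECORDS

print(samples_per_day, expected_records, shortfall)  # 288 11232 82
```

Under that assumption the reported 11150 records are within about one percent of the 11232 that uninterrupted sampling would give, so the stated interval, duration and volume are mutually consistent.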
In this case study, C = {date, time, solar radiation, gust speed, wind direction, ambient
temperature, relative humidity, module temperature, wind speed}.
Fig.1. The solar photovoltaic data at the overview level
Fig.2. A filtered version of the solar photovoltaic data
All seven filtering designs have been successfully implemented in the parallel coordinate
graphs.
6. CONCLUSION
The findings support the overview of multivariate data as well as drill-down capabilities
through filtering in the parallel coordinate graph. All the new variations in filtering have been
embedded into the graph and have successfully supported the various needs of multivariate data.
This research can be extended to support wider visual analytics features such as zoom, brushing
and sorting.
7. ACKNOWLEDGEMENTS
The authors would like to thank Universiti Teknologi MARA, Malaysia and Ministry of Higher
Education Malaysia for the facilities and financial support under the national grant
FRGS/1/2017/1CT04/UITM/02/2.
8. REFERENCES
[1] Stefanowski J. Data visualization: Or graphical data presentation. 2013.
http://www.cs.put.poznan.pl/jstefanowski/sed/DM14-visualisation.pdf
[2] Biswas A, Dutta S, Shen H W, Woodring J. An information-aware framework for exploring
multivariate data sets. IEEE Transactions on Visualization and Computer Graphics, 2013,
19(12):83-92
[3] Borup R. Data visualization for the database developer. 2015.
http://www.ita-software.com/papers/Borup_DataVisualization_Published.pdf
[4] Wang L, Wang G, Alexander C A. Big data and visualization: Methods, challenges and
technology progress. Digital Technologies, 2015, 1(1):33-38
[5] Andrienko N, Andrienko G. Visual analytics of movement: An overview of methods, tools and
procedures. Information Visualization, 2013, 12(1):3-24
[6] Thomas J J, Cook K A. Illuminating the path: The research and development agenda for
visual analytics. New York: IEEE Press, 2005
[7] Idrus Z, Abidin S Z Z, Omar N, Idrus Z, Sofee N S A M. Geovisualization of non-resident
students’ tabulation using line clustering. In Regional Conference on Sciences, Technology and
Social Sciences, 2016, pp. 251
[8] Zhou H, Yuan X, Qu H, Cui W, Chen B. Visual clustering in parallel coordinates. Computer
Graphics Forum, 2008, 27(3):1047-1054
[9] Glendenning K, Wischgoll T, Harris J, Vickery R, Blaha L. Parameter space visualization for
large-scale datasets using parallel coordinate plots. Journal of Imaging Science and Technology,
2016, 60(1):104061-104068
[10] Heer J, Shneiderman B, Park C. A taxonomy of tools that support the fluent and flexible use
of visualizations. Interactive Dynamics for Visual Analysis, 2012, 10(1):1-26
[11] Van Den Elzen S, Van Wijk J J. Multivariate network exploration and presentation: From
detail to overview via selections and aggregations. IEEE Transactions on Visualization and
Computer Graphics, 2014, 20(12):2310-2319
[12] Luo W, MacEachren A M. Geo-social visual analytics. Journal of Spatial Information
Science, 2014, 8(8):27-66
[13] Khan N, Yaqoob I, Abaker I, Hashem T, Inayat Z, Kamaleldin W, Ali M, Alam M, Shiraz
M, Gani A. Big data: Survey, technologies, opportunities, and challenges. Scientific World
Journal, 2014, 1-18
[14] Shneiderman B. The eyes have it: A task by data type taxonomy for
information visualizations. In IEEE Symposium on Visual Languages, 1996, pp. 336-343
[15] Wang W B, Huang M L, Lu L F, Zhang J. Improving performance of forensics investigation
with parallel coordinates visual analytics. In 17th IEEE International Conference
on Computational Science and Engineering, 2014, pp. 1838-1843
[16] Idrus Z, Razak N H A, Talib N H A, Tajuddin T. Using three layer model (TLM) in web
form design: WeFDeC checklist development. In 2nd International Conference on Computer
Engineering and Applications, 2010, pp. 385-389
[17] Raidou R G, Eisemann M, Breeuwer M, Eisemann E, Vilanova A. Orientation-enhanced
parallel coordinate plots. IEEE Transactions on Visualization and Computer Graphics, 2016,
22(1):589-598
[18] Sacco D, Motta G, You L I, Ma T Y. Chapter 5: Smart cities, urban sensing, and big data:
Mining geo-location in social networks. In X. Liu, R. Anand, G. Xiong, X. Shang & X. Liu (Eds.),
Big data and smart service systems. New York: Academic Press, 2016, pp. 59-84
[19] Shiravi H, Shiravi A, Ghorbani A A. A survey of visualization systems for network security.
IEEE Transactions on Visualization and Computer Graphics, 2012, 18(8):1313-1329
[20] Hilda J J, Srimathi C, Bonthu B. A review on the development of big data analytics and
effective data visualization techniques in the context of massive and multidimensional data. Indian
Journal of Science and Technology, 2016, 9(27):1-13
[21] Kravetz S, Katz S. Goal directed approach to training parents of children with a
developmental disability. British Journal of Mental Subnormality, 1990, 36(70):17-29
[22] Omar N. Modelling complexities of learner’s in handling web texts via abstract scene
analysis. Malaysian Journal of Computing, 2014, 2(1):13-26
[23] Idrus Z, Abidin S Z Z, Hashim R, Omar N. Social awareness: The power of digital elements
in collaborative environment. WSEAS Transactions on Computers, 2010, 9(6):644-653
[24] Green Energy Research Centre. Home. 2014,
http://gerc.uitm.edu.my/gerc/gerc.php?site=home
[25] International Electrotechnical Commission. Photovoltaic system performance monitoring-
Guidelines for measurement, data exchange and analysis. Geneva: International Standard IEC
61724, 1998
How to cite this article:
Idrus Z, Zainuddin H, Ja’afar A D M. Visual analytics: designing flexible filtering in parallel
coordinate graph. J. Fundam. Appl. Sci., 2017, 9(5S), 23-32.
... x [83] x x x [84] x [62] x x x x [70] x x x x [121] x ...
Article
Full-text available
Multidimensional data visualization is one of the primary foundations supporting data analysis used for understanding the hidden relationships between items and dimensions of complex data. The line-based visualization techniques are a fundamental class of multidimensional visualization techniques and cover an important set of methods that are relevant to the visual exploratory analysis. Recently, General Line Coordinates (GLCs) were introduced. These are losslessly line-based visualization techniques for multidimensional data. Particular cases of GLCs are the non-paired GLCs, which generalize the radial and parallel coordinates and have proved to be highly suitable for visualizing multidimensional data. In this context, we conduct a systematic paper review of the 2D non-paired GLC (2D-NP-GLC) visualization techniques present in the literature. We organize the 2D-NP-GLC contributions in a unified reference framework in which both the representations and the associated interactions are considered. Focusing jointly on these two criteria, we provide a useful common space for the design and development of 2D-NP-GLC techniques. Besides, this framework integrates the 2D-NP-GLC contributions and helps to identify under-explored areas that may be candidates for further research.
... during the same period was significantly higher (over 990 citations). Similarly, Idrus et al. (2017), with zero citations on the WoS, earned four on Google Scholar. The difference in the citation count among different bibliometrics databases is typical, with Google Scholar more likely to record higher citations, compared to WoS (webofknowledge.com). ...
Article
This study employs Google trends and data from the web of science to evaluate the popularity, growth, and impacts of visual analytics (VA) as a research field and data science technique. The paper undertakes quantitative analyses and visualization of the temporal trends, from VA's emergence in 2000 to the end of 2019. The trend analysis helps to forecast future growth in the research and practice of VA. The study highlights four outcomes. First, there is a robust direct relationship among the variables, including VA's growth on the Google trends, the scientific literature production (SLP), and usage of the published documents. Second, the SLP's growth pattern highlights VA's popularity as an emerging field with an overall annual increase of 17.4%. The high citation counts of the published scholarship indicate a significant impact and a continuous growth of the VA field. Third, VA contributes to diverse disciplines other than computer science and information systems, from business and economics to engineering, healthcare, biomedical and chemical sciences, and arts and humanities. VA helps researchers and practitioners in multidisciplinary fields analyze multidimensional data, enhance data visualization, knowledge discovery, generating insights, and make informed decisions. On the reverse, other disciplines contribute to propelling VA's popularity through research productivity, usage, and citation impacts. Finally, a trend analysis predicts sustained future growth of VA technology in research and practice to dissect and sensemaking of the increasingly massive and complex data structures, which is now the norm in many fields.
... Peng et al. [8] designed a metric to compare the degree of confusion in different dimensional sequences, and finally chose the best arrangement scheme. Z Idrus et al. [9] extracted seven filtering designs by abstracting the graph, and proposed a structured process using filtering techniques in parallel coordinates. ...
Article
Full-text available
With the advent of cloud computing and big data era, high-dimensional data sets are widely available in real life. Because of the increase of data dimension and complexity, it is difficult to carry out comprehensive analysis and exploration for high-dimensional data. Therefore, we propose a new visual analysis approach to high-dimensional data. Firstly, we use the principal component analysis (PCA) to reduce the dimension of the high-dimensional data. Then, we use clustering algorithm to classify the reduced dimension data, and each class is rendered independently with different color. Finally, we use the edge binding algorithm to perform a visual clustering. The curves in the same classes are converged and in different classes are separated to alleviate visual confusion. In order to analyze the visualization results better, we also provide the visual interaction technology of "brush". The experimental result shows that our approach can help users to explore the implicit feature patterns quickly and is effective for visual analysis of large-scale high-dimensional data.
Chapter
Full-text available
Analytical Reasoning is the foundation of visual analytics, assisted via interactive and dynamic visualization representation. The main concern of visual analytics is the analytics process itself, it is important to facilitate the human mental space during the analysis process by embedding the analytical reasoning in the visual analytics representation. This paper aims to introduce and describe the essential analytical reasoning features within visual analytics representation. The framework describes analytical reasoning features from three parts of visual analytics representation which are higher-level structure, interconnection and lower-level structure. For higher-level structure, we proposed the features of big picture, analytics goal and insights through storytelling to ensure the analytics output becomes knowledge and applicable to facilitate the business decision. For interconnection, the features of trend, pattern and relevancy induce a relationship between higher and lower-level structures. Finally, analytical reasoning features for lower-level structure are quite straightforward which are benchmarking, ranking, decluttering, clueing and filtering. It is hoped that this framework could help to shed some light in terms of understanding analytical reasoning features that can facilitate the business decision.
Chapter
Full-text available
Solar energy supplies pure environmental-friendly and limitless energy resource for human. Although the cost of solar panels has declined rapidly, technology gaps still exist for achieving cost-effective scalable deployment combined with storage technologies to provide reliable, dispatchable energy. However, it is difficult to analyze a solar data, in which data was added in every 10 min by the sensors in a short time. These data can be analyzed easier and faster with the help of data visualization. One of the popular data visualization methods for displaying massive quantity of data is parallel coordinates plot (PCP). The problem when using this method is this abundance of data can cause the polylines to overlap on each other and clutter the visualization. Thus, it is difficult to comprehend the relationship that exists between the parameters of solar data such as power rate produced by solar panel, duration of daylight in a day, and surrounding temperature. Furthermore, the density of overlapped data also cannot be determined. The solution is to implement clutter-reduction technique to parallel coordinate plot. Even though there are various clutter-reduction techniques available for visualization, they are not suitable for every situation of visualization. Thus this research studies a wide range of clutter-reduction techniques that has been implemented in visualization, identifies the common features available in clutter-reduction technique, produces a conceptual framework of clutter-reduction technique as well as proposes the suitable features to be added in parallel coordinates plot of solar energy data to reduce visual clutter.
Article
Full-text available
Objectives: Data visualization, the use of images to represent information, is now becoming properly appreciated due to the benefits it can bring to business. This paper focuses on the general background of data visualization and visualization techniques. Methods: Data visualization has the prospective to assist humans in analysing and comprehending large volumes of data, and to detect patterns, clusters and outliers that are not obvious using non-graphical forms of presentation. For this reason, data visualizations have an important role to play in a diverse range of applied problems, including data exploration and mining, Information retrieval and intelligence analysis. In real time various techniques have been used of which Geometric projection techniques, Iconographic display techniques, Pixel-oriented, Hierarchical techniques, Graphbased techniques are discussed. Findings: The major difficultly in big data visualization is to preserve any of the original dimensional information. The taxonomy detailed here show that the local and global structure of the data can be visualized in an interactive manner and has a massive advantage.
Article
Full-text available
Parallel Coordinate Plots (PCPs) is one of the most powerful techniques for the visualization of multivariate data. However, for large datasets, the representation suffers from clutter due to overplotting. In this case, discerning the underlying data information and selecting specific interesting patterns can become difficult. We propose a new and simple technique to improve the display of PCPs by emphasizing the underlying data structure. Our Orientation-enhanced Parallel Coordinate Plots (OPCPs) improve pattern and outlier discernibility by visually enhancing parts of each PCP polyline with respect to its slope. This enhancement also allows us to introduce a novel and efficient selection method, the Orientation-enhanced Brushing (O-Brushing). Our solution is particularly useful when multiple patterns are present or when the view on certain patterns is obstructed by noise. We present the results of our approach with several synthetic and real-world datasets. Finally, we conducted a user evaluation, which verifies the advantages of the OPCPs in terms of discernibility of information in complex data. It also confirms that O-Brushing eases the selection of data patterns in PCPs and reduces the amount of necessary user interactions compared to state-of-the-art brushing techniques.
Conference Paper
Computer forensics investigators aim to analyse and present facts through the examination of digital evidence in a short time. As the volume of suspicious data grows, it becomes increasingly difficult to capture digital evidence within a legally acceptable time. This paper proposes an effective method for reducing investigation time and redundancy by normalizing data on hard disk drives (HDD) for computer forensics. We use a visualization technique, parallel coordinates, to analyse data instead of relying on data analysis algorithms alone, and choose a Red-Black tree structure to de-duplicate data. This reduces the time complexity, including the time spent searching, adding and deleting data. We show the advantages of our approach and demonstrate how this method can enhance the efficiency and quality of computer forensics tasks.
Article
Big Data has gained much attention from academia and the IT industry. In the digital and computing world, information is generated and collected at a rate that rapidly exceeds the boundary range. Currently, over 2 billion people worldwide are connected to the Internet, and over 5 billion individuals own mobile phones. By 2020, 50 billion devices are expected to be connected to the Internet. At this point, predicted data production will be 44 times greater than that in 2009. As information is transferred and shared at light speed on optic fiber and wireless networks, the volume of data and the speed of market growth increase. However, the fast growth rate of such large data generates numerous challenges, such as the rapid growth of data, transfer speed, diverse data, and security. Nonetheless, Big Data is still in its infancy, and the domain has not been reviewed in general. Hence, this study comprehensively surveys and classifies the various attributes of Big Data, including its nature, definitions, rapid growth rate, volume, management, analysis, and security. This study also proposes a data life cycle that uses the technologies and terminologies of Big Data. Future research directions in this field are determined based on opportunities and several open issues in the Big Data domain. These research directions facilitate the exploration of the domain and the development of optimal techniques to address Big Data.
Article
Spatial analysis and social network analysis typically consider social processes in their own specific contexts, either geographical or network space. Both approaches demonstrate strong conceptual overlaps. For example, actors close to each other tend to have greater similarity than those far apart; this phenomenon has different labels in geography (spatial autocorrelation) and in network science (homophily). In spite of those conceptual and observed overlaps, the integration of geography and social network context has not received the attention needed in order to develop a comprehensive understanding of their interaction or their impact on outcomes of interest, such as population health behaviors, information dissemination, or human behavior in a crisis. In order to address this gap, this paper discusses the integration of geographic with social network perspectives applied to understanding social processes in place at two levels: the theoretical level and the methodological level. At the theoretical level, this paper argues that the concepts of nearness and relationship in terms of a possible extension of the First Law of Geography are a matter of both geographical and social network distance, relationship, and interaction. At the methodological level, the integration of geography and social network contexts is framed within a new interdisciplinary field: visual analytics, in which three major application-oriented subfields (data exploration, decision-making, and predictive analysis) are used to organize discussion. In each subfield, this paper presents a theoretical framework first, and then reviews what has been achieved regarding geo-social visual analytics in order to identify potential future research.
Article
Analysis of movement is currently a hot research topic in visual analytics. A wide variety of methods and tools for analysis of movement data has been developed in recent years. They allow analysts to look at the data from different perspectives and fulfil diverse analytical tasks. Visual displays and interactive techniques are often combined with computational processing, which, in particular, enables analysis of larger amounts of data than would be possible with purely visual methods. Visual analytics leverages methods and tools developed in other areas related to data analytics, particularly statistics, machine learning and geographic information science. We present an illustrated structured survey of the state of the art in visual analytics concerning the analysis of movement data. Besides reviewing the existing works, we demonstrate, using examples, how different visual analytics techniques can support our understanding of various aspects of movement.
Article
This paper describes the rationale and techniques of a general approach to setting up a series of related training programmes for parents of children with a developmental disability. The goal directed approach described in this paper offers practical guidelines for setting up and evaluating parent training programmes and was implemented in a research and service project entitled A School for Parents of Children with a Developmental Disability. It also can provide data for testing the validity and usefulness of a number of comprehensive statements about human behaviour and development. This approach can also be viewed as a partial response to divergent criticisms of existing parent teaching programmes.
Article
Visualization is an important task in data analytics, as it allows researchers to view patterns within the data instead of reading through extensive raw data. The ability to interact with the visualizations is essential, since it lets users intuitively explore data to find meaning and patterns more efficiently. Interactivity, however, becomes progressively more difficult as the size of the dataset increases. This project begins by leveraging existing web-based data visualization technologies, and extends their functionality through the use of parallel processing. This methodology utilizes state-of-the-art technologies, such as Node.js, to split the visualization rendering and user interactivity controls across a client-server infrastructure without having to rebuild the visualization technologies. The approach minimizes data transfer by performing the rendering step on the server while allowing for the use of high-performance computing systems to render the visualizations more quickly. In order to improve the scaling of the system with larger datasets, parallel processing and visualization optimization techniques are used. This work uses parameter space data generated from mindmodeling.org to showcase the authors’ methodology for handling large-scale datasets while retaining interactivity and user friendliness.
Article
Network data is ubiquitous; e-mail traffic between persons, telecommunication, transport and financial networks are some examples. Often these networks are large and multivariate: besides the topological structure of the network, multivariate data on the nodes and links is available. Currently, exploration and analysis methods focus on a single aspect, either the network topology or the multivariate data. In addition, tools and techniques are highly domain specific and require expert knowledge. We focus on the non-expert user and propose a novel solution for multivariate network exploration and analysis that tightly couples structural and multivariate analysis. In short, we go from Detail to Overview via Selections and Aggregations (DOSA): users are enabled to gain insights by creating selections of interest (manually or automatically) while simultaneously producing high-level, infographic-style overviews. Finally, we present example explorations on real-world datasets that demonstrate the effectiveness of our method for the exploration and understanding of multivariate networks where presentation of findings comes for free.