Variable binned scatter plots.

Information Visualization (Impact Factor: 0.77). 01/2010; 9:194-203. DOI: 10.1057/ivs.2010.4
Source: DBLP

ABSTRACT The scatter plot is a well-known method of visualizing pairs of continuous variables. Scatter plots are intuitive and easy to use, but often have a high degree of overlap, which may occlude a significant portion of the data. Analyzing a dense, non-uniform data set in detail requires a recursive drill-down. In this article, we propose variable binned scatter plots to allow the visualization of large amounts of data without overlap. The basic idea is to use a non-uniform (variable) binning of the x and y dimensions and to plot all data points that are located within each bin into the corresponding square. In the visualization, each data point is then represented by a small cell (pixel). Users are able to interact with individual data points for record-level information. To analyze an interesting area of the scatter plot, a variable binned scatter plot with a refined scale for the subarea can be generated recursively as needed. Furthermore, we map a third attribute to color to obtain a visual clustering. We have applied variable binned scatter plots to real-world problems in the areas of credit card fraud and data center energy consumption to visualize their data distributions and the cause-effect relationships among multiple attributes. A comparison of our method with two recent scatter plot variants is included.
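
The binning idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how bin boundaries are chosen, so quantile-based edges are assumed here, and the function names (`variable_bin_edges`, `assign_bins`, `layout_cells`) and the row-by-row filling of each square are hypothetical. The color mapping of a third attribute is omitted.

```python
import numpy as np

def variable_bin_edges(values, n_bins):
    """Non-uniform bin edges. ASSUMPTION: quantiles, so dense ranges get narrower bins."""
    return np.quantile(values, np.linspace(0.0, 1.0, n_bins + 1))

def assign_bins(values, edges):
    """Bin index in [0, n_bins - 1] for each value, using only the interior edges."""
    return np.searchsorted(edges[1:-1], values, side="right")

def layout_cells(x, y, n_bins=4):
    """Give every data point its own cell (pixel) inside the square of its (x, y) bin."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    bx = assign_bins(x, variable_bin_edges(x, n_bins))
    by = assign_bins(y, variable_bin_edges(y, n_bins))

    # Count points per bin; the fullest bin fixes the side length of every square.
    counts = np.zeros((n_bins, n_bins), dtype=int)
    np.add.at(counts, (bx, by), 1)
    side = int(np.ceil(np.sqrt(counts.max())))

    # Fill each square row by row in order of appearance (hypothetical ordering).
    seen = np.zeros((n_bins, n_bins), dtype=int)
    col = np.empty(len(x), dtype=int)
    row = np.empty(len(x), dtype=int)
    for i in range(len(x)):
        slot = seen[bx[i], by[i]]
        seen[bx[i], by[i]] += 1
        col[i] = bx[i] * side + slot % side
        row[i] = by[i] * side + slot // side
    return col, row, side
```

Because every point receives a distinct (column, row) pair, no two points overlap; running the same function on the points of a single bin would correspond to the recursive refinement of a subarea mentioned in the abstract.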

Available from: Halldor Janetzko, Jun 12, 2014
  • ABSTRACT: This thesis focuses on the visual exploration of a specific type of moving geoobjects, namely spatially extended objects or phenomena. Visual analytical approaches are developed and implemented to study the dynamics of spatio-temporally evolving polygons. Lightning data are chosen as a real-world case. In addition to a generic concept for the movement analysis of spatially extended objects, the thesis puts forward a number of synchronized cartographic and non-cartographic visual analytical approaches for the exploration of dynamic lightning clusters.
    12/2014, Degree: Doctoral thesis (PhD), Supervisors: Meng, Liqiu (Prof. Dr.); Andrienko, Gennady (Prof. Dr.); Betz, Hans-Dieter (Prof. Dr.)
  • ABSTRACT: Background: In recent years, more and more biological data have become available. Besides the sheer amount of new data, its dimensionality (the number of different attributes per data point) has also increased. Recently, the amount of data on chromatin and its modifications in particular has increased considerably. In the field of epigenetics, appropriate visualization tools designed for highlighting the different aspects of epigenetic data are currently not available. Results: We present a tool called TiBi-Scatter that enables correlation analysis in 2D. This approach allows for analyzing multidimensional data while keeping the use of resources such as memory small. Thus, it is particularly applicable to large data sets. Conclusions: TiBi-Scatter is a resource-friendly and easy-to-use tool that allows for the hypothesis-free analysis of large multidimensional biological data sets.
    4th Symposium on Biological Data Visualization, Boston, MA, USA; 07/2014
  • ABSTRACT: Regression models play a key role in many application domains for analyzing or predicting a quantitative dependent variable based on one or more independent variables. Automated approaches for building regression models are typically limited with respect to incorporating domain knowledge in the process of selecting input variables (also known as feature subset selection). Other limitations include the identification of local structures, transformations, and interactions between variables. The contribution of this paper is a framework for building regression models addressing these limitations. The framework combines a qualitative analysis of relationship structures by visualization and a quantification of relevance for ranking any number of features and pairs of features which may be categorical or continuous. A central aspect is the local approximation of the conditional target distribution by partitioning 1D and 2D feature domains into disjoint regions. This enables a visual investigation of local patterns and largely avoids structural assumptions for the quantitative ranking. We describe how the framework supports different tasks in model building (e.g., validation and comparison), and we present an interactive workflow for feature subset selection. A real-world case study illustrates the step-wise identification of a five-dimensional model for natural gas consumption. We also report feedback from domain experts after two months of deployment in the energy sector, indicating a significant effort reduction for building and improving regression models.
    12/2013; 19(12):1962-71. DOI:10.1109/TVCG.2013.125