Conference Paper

Visualizing Text Classification Models with Voronoi Word Clouds

... The polygonal boundaries for each tag cloud are generated by applying Voronoi subdivision. The initial points for generating this subdivision can either be set manually (as in the example figure) or can be the result of a similarity layout of the category content (for an example, see [43]). ...
Chapter
Providing means for effectively accessing and exploring large textual data sets is a problem attracting the attention of text mining and information visualization experts alike. The rapid growth of the data volume and heterogeneity, as well as the richness of metadata and the dynamic nature of text repositories, add to the complexity of the task. This chapter provides an overview of data visualization methods for gaining insight into large, heterogeneous, dynamic textual data sets. We argue that visual analysis, in combination with automatic knowledge discovery methods, provides several advantages. Besides introducing human knowledge and visual pattern recognition into the analytical process, it provides the possibility to improve the performance of automatic methods through user feedback.
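The Voronoi subdivision mentioned in the excerpt above can be illustrated with a minimal sketch. It assumes a small raster grid and three manually placed seed points, and assigns each grid cell to its nearest seed, which yields a discrete approximation of the polygonal regions. The function names, grid size, and seed coordinates are hypothetical and chosen only for illustration; this is not the layout code of the cited work.

```python
from math import hypot


def nearest_seed(x, y, seeds):
    """Return the index of the seed closest to (x, y) by Euclidean distance."""
    return min(
        range(len(seeds)),
        key=lambda i: hypot(x - seeds[i][0], y - seeds[i][1]),
    )


def voronoi_labels(width, height, seeds):
    """Label each cell of a width x height grid with its nearest seed index.

    Cells are sampled at their centers, so the result is a rasterized
    approximation of the Voronoi regions, not exact polygons.
    """
    return [
        [nearest_seed(x + 0.5, y + 0.5, seeds) for x in range(width)]
        for y in range(height)
    ]


# Hypothetical, manually chosen seed points (one per category tag cloud).
seeds = [(2.0, 2.0), (7.0, 2.0), (4.5, 7.0)]
labels = voronoi_labels(10, 10, seeds)

# Print the grid; each digit is the index of the region owning that cell.
for row in labels:
    print("".join(str(i) for i in row))
```

Each contiguous block of identical digits corresponds to one region in which a category's tag cloud would be laid out. Replacing the manual seeds with positions from a similarity layout would give the second variant the excerpt describes.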
Conference Paper
Tag clouds are widely applied, popular visualization techniques, as they summarize textual data in an intuitive, lucid manner. Many layout algorithms for tag clouds have been developed in recent years, but none of these approaches is designed to reflect the notion of hierarchical distance. For that purpose, we introduce a novel tag cloud layout called TagSpheres. By arranging tags on various hierarchy levels and applying appropriate colors, the importance of individual tags to the observed topic becomes assessable. To explore relationships among various hierarchy levels, we aim to place related tags close together. Various usage scenarios from the digital humanities, sports, aviation and natural disaster management point out the benefit of TagSpheres for different domains. In addition, we highlight that TagSpheres is also a novel layout approach for tree structures.
Conference Paper
Automated text categorization is an important technique for many web applications, such as document indexing, document filtering, and cataloging web resources. Many different approaches have been proposed for the automated text categorization problem. Among them, centroid-based approaches have the advantages of short training time and testing time due to their computational efficiency. As a result, centroid-based classifiers have been widely used in many web applications. However, the accuracy of centroid-based classifiers is inferior to SVM, mainly because centroids found during construction are far from perfect locations. We design a fast Class-Feature-Centroid (CFC) classifier for multi-class, single-label text categorization. In CFC, a centroid is built from two important class distributions: inter-class term index and inner-class term index. CFC proposes a novel combination of these indices and employs a denormalized cosine measure to calculate the similarity score between a text vector and a centroid. Experiments on the Reuters-21578 corpus and 20-newsgroup email collection show that CFC consistently outperforms the state-of-the-art SVM classifiers on both micro-F1 and macro-F1 scores. Particularly, CFC is more effective and robust than SVM when data is sparse.
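To make the centroid-based family of classifiers concrete, the following is a minimal sketch of a plain centroid classifier: each class centroid is the mean term-frequency vector of its training documents, and a new text is assigned to the class whose centroid has the highest standard cosine similarity. This is deliberately not CFC itself; the abstract does not give the inter-class and inner-class weighting or the denormalized cosine formula, so none of that is reproduced here. All function names and the toy training data are hypothetical.

```python
from collections import Counter
from math import sqrt


def tf_vector(text):
    """Build a raw term-frequency vector from whitespace-separated tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Standard (normalized) cosine similarity between two sparse vectors."""
    dot = sum(value * b.get(term, 0.0) for term, value in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_centroids(labeled_docs):
    """Average the term-frequency vectors of each class into one centroid."""
    sums, counts = {}, Counter()
    for text, label in labeled_docs:
        sums.setdefault(label, Counter()).update(tf_vector(text))
        counts[label] += 1
    return {
        label: {term: v / counts[label] for term, v in vec.items()}
        for label, vec in sums.items()
    }


def classify(text, centroids):
    """Return the label of the centroid most similar to the text."""
    vec = tf_vector(text)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Toy training set, invented purely for illustration.
training = [
    ("the team won the match", "sports"),
    ("the striker scored a goal", "sports"),
    ("stock prices fell sharply", "finance"),
    ("the bank raised interest rates", "finance"),
]
centroids = train_centroids(training)
print(classify("the goal decided the match", centroids))
print(classify("interest rates and stock prices", centroids))
```

Training is a single pass over the documents and classification is one similarity computation per class, which is the efficiency advantage the abstract attributes to centroid-based approaches.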
Christin Seifert, Barbara Kump, Wolfgang Kienreich, Gisela Granitzer, and Michael Granitzer. On the beauty and usability of tag clouds. In Proceedings of the 12th International Conference on Information Visualisation (IV), pages 17-25, Los Alamitos, CA, USA, July 2008. IEEE Computer Society.