Topic-Based Coordination for Visual Analysis of Evolving Document Collections.
ABSTRACT: Document interpretation is a crucial task in many visual analytics applications, made harder by the growing volume of freely available textual files. In this paper we propose an approach based on topic detection coupled with multiple coordinated views to assist the analysis of time-varying document collections. Given multiple document maps built from a set of text files, we define a strategy that helps users track how the topics addressed by the documents evolve across time steps. The approach is supported by a new topic extraction algorithm, also introduced here. Finally, we show several examples illustrating how the proposed strategy may be applied to the analysis of document collections.
ABSTRACT: Scientific articles are the major mechanism for researchers to report their results, and a collection of papers on a discipline can reveal a lot about its evolution, such as the emergence of new topics. Nonetheless, given a broad collection of papers, it is typically very difficult to grasp the important information that could help readers globally interpret, navigate and then focus on the items relevant to their task. Content-based document maps are visual representations created by evaluating the (dis)similarity amongst the documents, and have been shown to support exploratory tasks in this scenario. Documents are represented by visual markers placed in 2D space so that documents placed close together share similar content. Although the maps allow users to visually identify groups of related documents and the frontiers between groups, they do not explicitly convey the temporal evolution of a collection. We propose a technique for creating content-based similarity maps of document collections that highlight changes over time. Our solution constructs a sequence of maps from time-stamped subsets of the data. It adopts a cumulative backwards strategy to preserve user context across successive time stamps, i.e., maps do not change drastically from one time stamp to the next, favouring user perception of changes. 04/2002
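The (dis)similarity evaluation that underlies a content-based document map can be illustrated with a minimal sketch. It is not the method of either paper: it assumes a plain bag-of-words representation and cosine dissimilarity, and the function names (`term_counts`, `cosine_dissimilarity`, `dissimilarity_matrix`) and sample documents are hypothetical. A projection technique would then take the resulting matrix and place the markers in 2D; that step is omitted here.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercase the text and count its alphabetic tokens (assumed representation)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_dissimilarity(a, b):
    """Return 1 - cosine similarity of two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # a document with no tokens is treated as maximally dissimilar
    return 1.0 - dot / (norm_a * norm_b)


def dissimilarity_matrix(documents):
    """Pairwise dissimilarities; a 2D projection step would consume this matrix."""
    vectors = [term_counts(d) for d in documents]
    return [[cosine_dissimilarity(u, v) for v in vectors] for u in vectors]


# Hypothetical sample documents, for illustration only.
docs = [
    "topic detection in document collections",
    "topic extraction from document collections",
    "volume rendering of medical images",
]
matrix = dissimilarity_matrix(docs)
```

In this toy input the first two documents share three of their five terms, so their dissimilarity is lower than that between the first and third documents, which share none. On a map they would therefore be placed closer together.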