lingvis.io — A Linguistic Visual Analytics Framework
Mennatallah El-Assady¹, Wolfgang Jentner¹, Fabian Sperrle¹,
Rita Sevastjanova¹, Annette Hautli-Janisz², Miriam Butt², and Daniel Keim¹
¹Department of Computer Science, University of Konstanz, Germany
²Department of Linguistics, University of Konstanz, Germany
Abstract
We present a modular framework for the rapid-
prototyping of linguistic, web-based, visual
analytics applications. Our framework gives
developers access to a rich set of machine
learning and natural language processing
steps, through encapsulating them into micro-
services and combining them into a compu-
tational pipeline. This processing pipeline is
auto-configured based on the requirements of
the visualization front-end, making the lin-
guistic processing and visualization design de-
tached, independent development tasks. This
paper describes the constellation and modality
of our framework, which continues to support
the efficient development of various human-
in-the-loop, linguistic visual analytics research
techniques and applications.
1 Introduction
Research at the intersection of computational
linguistics, visual analytics, and explainable machine
learning is a vibrant, interesting field that broadens
the horizons of all disciplines involved. Over
the last few years, a team of computer scientists,
linguists, and social scientists from different
areas at the University of Konstanz have
come together to push their disciplinary boundaries
through collaborative research. This collaboration
resulted in the development of several
mixed-initiative visual analytics approaches, ranging
from generating high-level corpus overviews
using Lexical Episode Plots (Gold et al., 2015) to
sophisticated human-in-the-loop topic refinement
techniques (El-Assady et al., 2018b, 2019).
This effort has helped establish the subarea
of Linguistic Visualization (short: LingVis)
research (Butt et al., 2019). Within this subarea,
application topics we worked on include content
analysis, e.g., NEREx (El-Assady et al., 2017b);
discourse analysis, e.g., ThreadReconstructor
(El-Assady et al., 2018a); language change, e.g.,
HistoBankVis (Schätzle et al., 2017) or COHA
Vis (Schneider et al., 2017); readability analysis,
e.g., literature fingerprinting (Oelke et al., 2012);
language modeling, e.g., LTMA (El-Assady et al.,
2018c); argumentation analysis, e.g., ConToVi
(El-Assady et al., 2016); explainable machine
learning, e.g., verbalization and active learning
(Sevastjanova et al., 2018a,b); interactive model
refinement, e.g., SpecEx (Sperrle et al., 2018);
multi-corpora analysis, e.g., Alignment Vis
(Jentner et al., 2017); and modeling of speech
features, e.g., SOMFlow (Sacha et al., 2018).
To make our linguistic visualization techniques
accessible to a wider public, we strive to
implement them as web-based applications. However,
this is only possible on a larger scale using
a framework architecture that accommodates
the need for rapid prototyping and hides the
engineering complexity involved from application
developers. Hence, we established the lingvis.io
framework as a common platform, facilitating the
sharing and reuse of implementation components. A
prominent application powered by our framework
is VisArgue (El-Assady et al., 2017a), an approach
for multi-party discourse analysis.
In this paper, we report on our shared framework
and infrastructure that drives a multitude
of linguistic visualization projects, as depicted in
Figure 1. The core of our framework is a flexible
pipeline with automatic dependency resolution
that enables application developers to request
natural language processing (NLP) steps for their
visualizations, which, in turn, are auto-configured
based on user-defined parameters. These are chosen
in a user interface that is designed to enable
experts and non-experts alike to adapt the NLP
processing to their tasks and data. The results
of this processing are closely intertwined with the
interactive visual analytics components, enabling,
for instance, visual debugging for linguists, or
insights for domain experts, such as writers, political
scientists, etc. To address the trade-offs between
tailored and expressive interface design, rapid
prototyping, and processing flexibility, our framework
architecture strictly separates and modularizes
tasks into atomic components that are
compartmentalized in subdomains (i.e., auto-scaling
cluster environments). Developers work on their
designated feature branches and efficiently test
their prototypes through continuous deployment.
Figure 1: The lingvis.io framework driving various linguistic visualization projects based on rich NLP pipelines.
Related Work – Other notable frameworks related
to ours include Stanford CoreNLP¹, GATE²,
and WebLicht³. Facebook has recently released a
deep-learning-based framework for various NLP
tasks, called PyText⁴. While they provide
state-of-the-art models, they are code-only platforms that
require developers to write processing pipelines
from scratch every time. More general (deep
learning) frameworks, including TensorFlow⁵ and
PyTorch⁶, can also be used for text processing
or to generate rich feature vectors like sentence
or word embeddings. KNIME⁷ and Tableau⁸
are platforms for intuitively creating data science
workflows with reusable components, but are not
tailored to NLP tasks specifically. While we
can communicate with those frameworks through
APIs to enrich our own NLP pipeline, these toolkits
are solely tailored to linguistic analysis and
offer no, or only very limited, visualization possibilities.
¹ stanfordnlp.github.io/CoreNLP
² gate.ac.uk
³ weblicht.sfs.uni-tuebingen.de
⁴ github.com/facebookresearch/pytext
⁵ tensorflow.org
⁶ pytorch.org
⁷ knime.com
⁸ tableau.com
2 Auto-Configured Processing Pipeline
Our framework is based on the assumption that
the individual processing steps can and should be
atomic in nature. Each step holds a well-defined
list of dependencies that it requires
to execute its task successfully.
This allows us to model a processing pipeline for
a given type of data input as a directed acyclic graph
whose independent steps can be processed in parallel.
For example, as shown in Figure 2, to retrieve the result
of a topic model, the visualization requests one
or more models. Based on their dependencies on
other steps, a pipeline is generated that takes into
account all user-defined parameters. Here, the
topic modeling is based on descriptor vectors
extracted for each document in the corpus, as well as
word embedding results.
Figure 2: Dependency graph for topic model relations.
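The dependency resolution described above can be sketched in a few lines of Python. This is a minimal sketch, not the framework's implementation: the step names and their dependencies are hypothetical (loosely modeled on Figure 2), and the standard-library graphlib module (Python 3.9+) stands in for whatever resolver the framework actually uses. The sketch first collects the transitive dependencies of the requested step and then groups them into batches whose members could run in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical step names and dependencies, loosely following Figure 2.
DEPENDENCIES = {
    "tokenizer": set(),
    "pos_tagger": {"tokenizer"},
    "word_embedding": {"tokenizer"},
    "descriptor_vectors": {"tokenizer", "pos_tagger"},
    "topic_model": {"descriptor_vectors", "word_embedding"},
    "sentiment": {"tokenizer"},  # not needed for the topic model
}


def required_steps(requested, dependencies):
    """Collect the requested steps plus all of their transitive dependencies."""
    needed, stack = set(), list(requested)
    while stack:
        step = stack.pop()
        if step not in needed:
            needed.add(step)
            stack.extend(dependencies[step])
    return needed


def parallel_batches(requested, dependencies):
    """Group the required steps into batches whose members can run in parallel."""
    needed = required_steps(requested, dependencies)
    graph = {step: dependencies[step] for step in needed}
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = parallel_batches({"topic_model"}, DEPENDENCIES)
for number, batch in enumerate(batches, start=1):
    print(number, batch)
```

Only steps that the requested visualization actually needs end up in the pipeline; the unrelated sentiment step is left out.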
A successful implementation requires a consistent,
flexible, and well-defined data model such
that each step can use its transformation capabilities
to semantically enrich the data. We therefore
do not allow any step to modify or delete data;
each step can only add further metadata. This section
describes the modeling of processing steps in our
pipeline, as well as the underlying data structure.
2.1 Processing Steps
Our framework allows for progressive steps where
intermediate results can be investigated and further
steered by the user. As shown in Figure 3, in
the user interface (UI), users first upload and select
the data they want to process. Based on their
intended tasks, they then select suitable visualization
components. Internally, every visualization
defines a list of atomic processing steps as
dependencies that need to run in order to generate the
desired information to visualize. In addition to a
list of dependencies, visualizations define one or
multiple controller endpoints. These serve as a
communication medium between the processing steps
and the UI, and are characterized by the fact that
they do not further enrich the data and cannot be
defined as dependencies by any other processing
step. This implies that visualization steps terminate
the acyclic graph and, thus, the resulting
processing pipeline.
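The rule that visualizations terminate the graph can be checked mechanically. The following sketch assumes a hypothetical Step record with a flag marking visualization steps; none of these names come from the framework itself. It rejects any configuration in which a step lists a visualization among its dependencies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    # Hypothetical description of a pipeline step.
    name: str
    dependencies: frozenset = frozenset()
    is_visualization: bool = False
    controller_endpoints: tuple = ()


def validate_terminal_visualizations(steps):
    """Raise ValueError if any step declares a visualization as a dependency."""
    by_name = {step.name: step for step in steps}
    for step in steps:
        for dep in step.dependencies:
            if by_name[dep].is_visualization:
                raise ValueError(f"{step.name} depends on visualization {dep}")
    return True


steps = [
    Step("tokenizer"),
    Step("topic_model", frozenset({"tokenizer"})),
    Step("topic_view", frozenset({"topic_model"}), True, ("/topics",)),
]
validate_terminal_visualizations(steps)
```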
For the initial processing, a controller in the
streaming-control-layer handles the communication
with every specific processing step and provides
the parameter-configuration interface to the
UI. This enables users to parameterize the
processing for increased flexibility. For example, a
POS tagger step can be parameterized with different
tagger models. The tagset of such a model does not need to
be static but can depend on a language or be based
on a user’s selection. It is only constrained by the
necessity of having a standardized tagset, as later
steps use these tags to further process the data.
The endpoints in the two control layers sepa-
rate static (default) from streaming controllers. In
the default case, controllers are used to commu-
nicate the results of a completed processing step
to the visualization. Streaming controllers, on the
other hand, intercept a processing step while it is
running to support direct user interactions. Here,
progressive visualizations are shown while the re-
spective processing step is running. The users can,
therefore, directly observe, adapt, and refine the
underlying machine learning models. This enables
the design of tightly-coupled, human-in-the-loop
interfaces for interactive model refinement and ex-
plainable machine learning.
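The difference between the two controller types can be illustrated with a Python generator that yields intermediate results. This is a toy sketch under stated assumptions: the "training" loop, the learning-rate parameter, and the feedback dictionary are all invented for illustration and do not reflect the framework's controllers or models.

```python
def train_topic_model(iterations, state):
    """Toy iterative step: yields an intermediate result after every iteration."""
    for i in range(iterations):
        # Stand-in for one training iteration; reads the user-adjustable state.
        state["score"] += state["learning_rate"]
        yield {"iteration": i, "score": round(state["score"], 3)}


def static_controller(iterations):
    """Default case: run the step to completion and report only the final result."""
    state = {"learning_rate": 0.1, "score": 0.0}
    result = None
    for result in train_topic_model(iterations, state):
        pass
    return result


def streaming_controller(iterations, feedback):
    """Streaming case: collect every intermediate result and apply user feedback.

    `feedback` maps an iteration number to a new learning rate and stands in
    for interactions arriving from the UI while the step is running.
    """
    state = {"learning_rate": 0.1, "score": 0.0}
    updates = []
    for update in train_topic_model(iterations, state):
        updates.append(update)  # a real controller would push this to the UI
        if update["iteration"] in feedback:
            state["learning_rate"] = feedback[update["iteration"]]
    return updates


final = static_controller(3)
updates = streaming_controller(3, {0: 0.5})
```

In the streaming case, the adjustment made after the first iteration changes every later intermediate result, which is the behavior a human-in-the-loop interface relies on.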
2.2 Data Structure Modeling
We represent a corpus hierarchy as a recursively
stacked data structure consisting of so-called
‘document objects’. These are a modular abstraction
of all levels of the hierarchy, including
corpora, documents, paragraphs, sentences, etc. The
highest level of our data structure consists of a
collection of document objects that typically
represents all analyzed corpora, whereas the lowest
level consists of single word tokens. Hence, from an
ordered list of all corpora, we can descend the data
structure to find a list of documents for each corpus,
all the way to sub-sentence structures, multi-word
objects, and finally words consisting of an
ordered list of tokens. This flexible data structure
allows us to model arbitrarily complex object
hierarchies, with each object level containing an
ordered list of the objects on the next level, while
tokens define the terminal level.
Figure 3: Schematic overview of the framework.
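A recursive structure of this kind can be sketched with a single Python dataclass. The class and field names below are assumptions chosen for illustration, not the framework's data model. The annotate method refuses to overwrite existing entries, mirroring the rule that steps may add data but never modify or delete it.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentObject:
    level: str                                       # e.g. "corpus", "sentence", "token"
    text: str = ""                                   # only set on token objects here
    children: list = field(default_factory=list)     # ordered objects of the next level
    annotations: dict = field(default_factory=dict)  # results added by processing steps

    def annotate(self, key, value):
        """Add a result; refuse to overwrite, mirroring the add-only rule."""
        if key in self.annotations:
            raise KeyError(f"{key!r} already set; steps may only add data")
        self.annotations[key] = value

    def tokens(self):
        """Descend the hierarchy and return all terminal tokens in order."""
        if self.level == "token":
            return [self]
        return [tok for child in self.children for tok in child.tokens()]


def sentence(words):
    """Build a sentence object from a list of word strings."""
    return DocumentObject("sentence", children=[DocumentObject("token", w) for w in words])


corpus = DocumentObject("corpus", children=[
    DocumentObject("document", children=[
        sentence(["We", "agree", "."]),
        sentence(["Topics", "shift", "."]),
    ]),
])
print([tok.text for tok in corpus.tokens()])
```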
Each processing step in the pipeline has access
to the full data hierarchy. Throughout the processing,
steps append additional data elements to the
hierarchy objects to enrich them with computational
results and metadata, making their processing
results accessible to other steps downstream.
Hence, by defining the pipeline dependencies,
a processing step can request input data that is
provided by its preceding steps in defined formats,
which ensures atomicity and encapsulation.
The format of this appended data is independent of a step’s
sophistication, which can range from simple word-list
lookups to complex deep neural network models.
In the following, we describe the three data formats
that can be appended to document objects.
(1) Weighted Feature Vectors (FV) – One of
the key structures attached to document objects are
feature vectors. These represent the transformation
of text from a semi-structured data source to
a high-dimensional feature space. Feature vectors
represent the discretized elements of the text,
often weighted descriptors extracted from the
underlying text. These vectors are defined by a global
signature vector that prescribes an ordered reference
for the numerical weights contained in
individual FVs for each document object. For example,
to build a frequency-based bag-of-words
model, we enable users to choose from a set of
token classes including POS tags, named entities,
lexical chains, n-grams, stop-words, etc. These are
scored based on several weighting schemes, such
as tf-idf, ttf-idf, log-likelihood ratio, and other
metrics, as described by El-Assady et al. (2018b).
Such weighted feature vectors can describe the
importance of keywords on a global level (e.g., for all
analyzed corpora) or on an individual object level
(e.g., for a single document). Other types of feature
vectors include ones extracted using word or
sentence embeddings, as well as vectors based on
linguistic annotation pipelines.
Figure 4: UI components for parameterizing processing steps. Available settings depend on the underlying models. (a) Named Entity and Measure Settings UI. (b) Topic Modeling Settings UI.
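To make the role of the global signature vector concrete, the following sketch builds tf-idf weighted feature vectors that all share one ordered signature. It is an illustration under assumptions: the function name is hypothetical, the formula is one common tf-idf variant (relative term frequency times the natural log of inverse document frequency), and documents are assumed to be non-empty token lists.

```python
import math
from collections import Counter


def build_feature_vectors(documents):
    """Return (signature, vectors): tf-idf weights aligned to a global signature."""
    # The signature fixes one position per descriptor for every vector.
    signature = sorted({tok for doc in documents for tok in doc})
    doc_freq = Counter(tok for doc in documents for tok in set(doc))
    n_docs = len(documents)
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        vectors.append([
            counts[tok] / len(doc) * math.log(n_docs / doc_freq[tok])
            for tok in signature
        ])
    return signature, vectors


docs = [["topic", "model", "topic"], ["model", "pipeline"]]
signature, vectors = build_feature_vectors(docs)
for vector in vectors:
    print(dict(zip(signature, vector)))
```

Because every vector follows the same signature, weights at the same index are directly comparable across document objects.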
(2) Attributes (A) – As opposed to numeric
feature vectors, attributes consist of labels or
pointers attached during processing. Both of these
attribute types can be used to aggregate feature
vectors or measures in the data hierarchy. For
example, for dynamically computing all measure
values related to a particular speaker and topic in
a conversation, these attributes are utilized.
Labels (L) can be single flags, such as POS tags,
or could consist of n-tuples, for example, to describe
the types of arguments contained in the underlying
text. To accommodate labels that describe only
parts of a hierarchy, we also feature window
labels. These are stored in the hierarchy level above
the targeted level and contain the beginning and
ending indices of the children. For example, the
sentence hierarchy level may contain a label
consensus with a beginning index of 0 and an ending index
of 6, pointing to a sub-sentence structure that
encodes that the first six tokens of that sentence
indicate a consensus.
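A window label can be sketched as a small record stored on the parent object. The names below are hypothetical, and the sketch assumes that the ending index is exclusive, which is the reading under which indices 0 and 6 cover exactly the first six tokens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WindowLabel:
    name: str
    begin: int  # index of the first covered child
    end: int    # index one past the last covered child (assumed exclusive)


def covered_children(children, label):
    """Return the children of a hierarchy object that a window label spans."""
    return children[label.begin:label.end]


sentence_tokens = ["We", "all", "agree", "on", "this", "point", ",", "however", "."]
consensus = WindowLabel("consensus", begin=0, end=6)
covered = covered_children(sentence_tokens, consensus)
print(covered)
```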
Pointers (P), on the other hand, are attributes
that point to external structures, such as topics,
speakers, or other entities. Such structures are usually
modeled by specific processing steps and contain
descriptive features of the elements they represent.
For example, a topic might contain a list
of descriptive keywords, whereas a speaker object
would contain metadata and biographical information
on a speaker.
(3) Measures (M) – As a counterpart to nominal
attributes, measures are numeric or Boolean values
attached to the document objects. These are used
to describe linguistic features of various types. As
with the labels, a measure consists of a class name
and a single value. They are typically used to
quantify properties of objects and, thus, can be
aggregated through the data hierarchy. In addition,
measures can be normalized, for example, based
on the number of tokens in a document. We
distinguish three types of measures: Boolean; numeric
continuous; and numeric bi-polar. Such measures
can be extracted through a variety of processing
steps, ranging from simple word-list-based taggers,
statistical analysis steps, and rule-based annotators
to sophisticated machine-learning-based
measure calculators. We use such measures to
extract semantically relevant information or to monitor
the quality of document objects with respect to
selected criteria. Hence, such measures inform the
visual analytics methods and expand the
dimensionality of the underlying objects.
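Aggregation and normalization of a measure can be sketched on a plain nested dictionary. The measure name, the dictionary layout, and the choice of summation as the aggregation function are assumptions made for this example only.

```python
def aggregate_measure(node, name):
    """Sum a numeric measure over a node and all of its descendants."""
    own = node.get("measures", {}).get(name, 0)
    return own + sum(aggregate_measure(c, name) for c in node.get("children", []))


def count_tokens(node):
    """Count terminal objects (nodes without children) below a node."""
    children = node.get("children", [])
    if not children:
        return 1
    return sum(count_tokens(c) for c in children)


def normalized_measure(node, name):
    """Normalize the aggregated measure by the number of tokens."""
    return aggregate_measure(node, name) / count_tokens(node)


# A document with two sentences of four and six tokens (token content omitted).
document = {"children": [
    {"measures": {"politeness_markers": 1}, "children": [{}, {}, {}, {}]},
    {"measures": {"politeness_markers": 2}, "children": [{}, {}, {}, {}, {}, {}]},
]}
total = aggregate_measure(document, "politeness_markers")
per_token = normalized_measure(document, "politeness_markers")
```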
3 User Interface
“Simplicity comes at the cost of flexibility”
(Jentner et al., 2018). The dependency-based processing
model automates many decisions a user has
to make in other frameworks. However, in order to
allow domain experts to use their knowledge and
influence the underlying models, parameterization
is necessary. We run a linearization of the acyclic
graph prior to executing the pipeline. This allows
us to display the steps and their parameters in the
order of the processing flow to support the users
in their parameter estimation. To further support
users, we deploy guidance in the form of information
pop-ups and built-in tutorials. This includes
explaining how a respective processing step transforms
the data and the value it adds to the task, as well as
descriptions of the parameters
and their estimated impact.
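The linearization step amounts to a topological ordering of the dependency graph. As a hedged illustration, the sketch below uses the static_order method of the standard-library graphlib.TopologicalSorter (Python 3.9+) to list hypothetical steps and their parameters in processing order; the step names, parameter names, and default values are assumptions.

```python
from graphlib import TopologicalSorter

# Hypothetical steps: name -> (dependencies, user-facing parameters).
STEPS = {
    "named_entity_recognizer": ((), {"min_distance": 1, "min_similarity": 0.5}),
    "document_feature_extractor": (("named_entity_recognizer",), {"scoring": "tf-idf"}),
    "topic_modeling": (("document_feature_extractor",), {"num_topics": 10}),
    "measure_calculator": (("topic_modeling",), {}),
}


def parameter_forms(steps):
    """Return (step, parameters) pairs in a valid processing order."""
    graph = {name: set(deps) for name, (deps, _) in steps.items()}
    order = TopologicalSorter(graph).static_order()
    return [(name, steps[name][1]) for name in order]


forms = parameter_forms(STEPS)
for name, params in forms:
    print(name, params)
```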
To exemplify this process, we describe a partial
pipeline that is commonly used in our framework
to demonstrate its expressiveness and flexibility.
Let this partial pipeline be: (1) Named
Entity Recognizer (A-L); (2) Document Feature
Extractor (FV); (3) Topic Modeling (A-P);
(4) Measure Calculator (M). The (1) NER step labels
(A-L) tokens with Named Entities. As shown
in Figure 4a, the user can define parameters such
as the minimum distance and similarity score. The
(2) DFE creates feature vectors (FV) on all data
hierarchies. Based on the data and task, the user
selects and weights the features and selects an
appropriate scoring scheme. In the (3) TM step,
the user selects one or multiple of the available
topic models (e.g., LDA, IHTM) and parameterizes
them, for example, with the number of desired
topics (Figure 4b). Note that this step uses
only the feature vectors extracted in the previous
step. It assigns additional pointer attributes (A-P)
to each document, reflecting its probability of
belonging to a certain topic. The (4) MC then uses,
for example, the topic labels to calculate measures
(M) such as Topic Shift, where the topic of discussion
is changed within a document, or Topic Persistence,
where a given topic continues to be pursued
by the author or speaker.
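One simple way to operationalize the two measures named above is to compare the topic assigned to consecutive utterances. The following sketch is an assumption about how such Boolean measures could be derived; it is not the definition used by the framework's Measure Calculator, and the topic labels are invented.

```python
def topic_measures(topic_sequence):
    """Derive Boolean shift/persistence measures from consecutive topic labels."""
    measures = []
    for previous, current in zip(topic_sequence, topic_sequence[1:]):
        measures.append({
            "topic_shift": current != previous,        # the topic changed
            "topic_persistence": current == previous,  # the topic was continued
        })
    return measures


# Hypothetical topic assignment for five consecutive utterances.
topics = ["budget", "budget", "traffic", "traffic", "budget"]
measures = topic_measures(topics)
shifts = sum(m["topic_shift"] for m in measures)
persistences = sum(m["topic_persistence"] for m in measures)
```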
Such a pipeline is part of multiple visualization
creation cycles. For example, we utilize the results
of topic modeling to analyze the dynamics
of speakers in a conversation transcript in ConToVi
(El-Assady et al., 2016). Hence, to build such
visualization approaches, we rely on the
auto-configuration of the processing pipeline, as well
as the familiarity of users with their analyzed data
and tasks, enabling application developers to focus
on their encapsulated implementation environment
without worrying about the complexity of
the underlying linguistic processing.
4 Microservice Architecture
The modularity of our framework and the atomicity of
the steps are further emphasized by the use of
microservices (Figure 5, s1, s2, s3). A microservice
is a small, single-purpose service that exposes an
API. Because our microservices are dockerized,⁹
each microservice is independent of any programming
language and environment, which provides
us with great flexibility. Additionally,
individual microservices are easier to maintain than
a large, monolithic framework. An example
microservice from our framework returns POS tags
for tokenized texts. Requests to that service contain
a list of tokens and are parameterized with
the tagger model to use. The middleware handles
the user authentication, the (processed) data, the
pipeline steps, and the controllers (Figure 3). In
addition, it coordinates the microservices and handles
communication with their respective APIs to
obtain results to add to the data.
Figure 5: Multiple environments (env1, env2) of the
lingvis.io framework are managed in a Kubernetes cluster.
Microservices (s1, s2, s3) are tailored to a specific
task and correspond to different steps of the pipeline. Any
microservice can be redefined in a specific environment
if variations of the functionality are needed.
⁹ docker.com
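The request format of the POS-tagging microservice described above (a list of tokens plus the tagger model to use) can be mimicked with a small handler function. Everything in this sketch is hypothetical: the JSON field names, the model identifiers, and the toy taggers are stand-ins, and a real service would sit behind an HTTP API and load trained models.

```python
import json

# Toy stand-ins for tagger models; a real service would load trained models.
TOY_LEXICON = {"the": "DET", "cat": "NOUN", "sleeps": "VERB"}
TAGGER_MODELS = {
    "toy-lexicon": lambda token: TOY_LEXICON.get(token.lower(), "X"),
    "toy-all-noun": lambda token: "NOUN",
}


def handle_pos_request(body):
    """Handle a JSON request of the form {"tokens": [...], "model": "..."}."""
    request = json.loads(body)
    model = request.get("model", "toy-lexicon")
    if model not in TAGGER_MODELS:
        return json.dumps({"error": f"unknown tagger model: {model}"})
    tagger = TAGGER_MODELS[model]
    tags = [tagger(token) for token in request["tokens"]]
    return json.dumps({"model": model, "tags": tags})


request_body = json.dumps({"tokens": ["The", "cat", "sleeps"], "model": "toy-lexicon"})
response = json.loads(handle_pos_request(request_body))
print(response)
```

Keeping the model name in the request, rather than in the service configuration, is what lets the middleware pass a user's parameter choice through to the service unchanged.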
Our framework lives in a Kubernetes cluster,¹⁰
which effectively manages and orchestrates
Docker containers. This allows us to scale
microservices, running multiple instances and
balancing their load automatically, even across
physical servers (Figure 5, see s2). We are further
able to run multiple environments of the lingvis.io
framework (middleware & frontend) in our cluster,
allowing our researchers to deploy a tailored version,
for example, for a user evaluation. Kubernetes,
in combination with the reverse proxy Traefik,¹¹
automatically assigns a URL to the frontend
and the middleware to make them accessible from
anywhere in the world.
¹⁰ kubernetes.io
¹¹ traefik.io
5 Conclusion
A demo of our framework is available under
https://demo.lingvis.io/. Currently available
visualizations with attendant NLP microservices are
presented via the demo video or can be found under
the “Visualizations” button. To the best of our
knowledge, lingvis.io represents the first scalable
and modular web-based framework that combines
NLP with visual analytics applications. Its unique
contribution lies in combining these applications
in a novel way on the one hand, but in separating
NLP processing and visualizations on the operational
level through an auto-configured pipeline on
the other hand. This enables developers to focus
on the individual task at hand, rather than being
distracted by needing to solve general NLP or visual
analytics problems. As such, the framework
is ideal for rapid prototyping and should serve as
a productive base for more developments within
LingVis, the interdisciplinary combination of
linguistics, NLP, and visual analytics.
References
Miriam Butt, Annette Hautli-Janisz, and Verena Lyding. 2019. LingVis: Visual Analytics for Linguistics. CSLI lecture notes. CSLI Publications, to appear.
Mennatallah El-Assady, Valentin Gold, Carmela Acevedo, Christopher Collins, and Daniel Keim. 2016. ConToVi: Multi-Party Conversation Exploration using Topic-Space Views. Computer Graphics Forum, 35(3):431–440.
Mennatallah El-Assady, Annette Hautli-Janisz, Valentin Gold, Miriam Butt, Katharina Holzinger, and Daniel Keim. 2017a. Interactive Visual Analysis of Transcribed Multi-Party Discourse. In Proceedings of ACL 2017, System Demonstrations, pages 49–54, Stroudsburg, PA. ACL.
Mennatallah El-Assady, Rita Sevastjanova, Bela Gipp, Daniel Keim, and Christopher Collins. 2017b. NEREx: Named-Entity Relationship Exploration in Multi-Party Conversations. Computer Graphics Forum, 36(3):213–225.
Mennatallah El-Assady, Rita Sevastjanova, Daniel Keim, and Christopher Collins. 2018a. ThreadReconstructor: Modeling Reply-Chains to Untangle Conversational Text through Visual Analytics. Computer Graphics Forum, 37(3):351–365.
Mennatallah El-Assady, Rita Sevastjanova, Fabian Sperrle, Daniel Keim, and Christopher Collins. 2018b. Progressive Learning of Topic Modeling Parameters: A Visual Analytics Framework. IEEE Transactions on Visualization and Computer Graphics, 24(1):382–391.
Mennatallah El-Assady, Fabian Sperrle, Oliver Deussen, Daniel Keim, and Christopher Collins. 2019. Visual Analytics for Topic Model Optimization based on User-Steerable Speculative Execution. IEEE Transactions on Visualization and Computer Graphics, 25(1):374–384.
Mennatallah El-Assady, Fabian Sperrle, Rita Sevastjanova, Michael Sedlmair, and Daniel Keim. 2018c. LTMA: Layered Topic Matching for the Comparative Exploration, Evaluation, and Refinement of Topic Modeling Results. In International Symposium on Big Data Visual and Immersive Analytics (BDVA), pages 1–10.
Valentin Gold, Christian Rohrdantz, and Mennatallah El-Assady. 2015. Exploratory Text Analysis using Lexical Episode Plots. In Proc. of EuroVis, pages 85–89. The Eurographics Association.
Wolfgang Jentner, Mennatallah El-Assady, Bela Gipp, and Daniel A. Keim. 2017. Feature Alignment for the Analysis of Verbatim Text Transcripts. In EuroVis Workshop on Visual Analytics, EuroVA 2017, Barcelona, Spain, 12-13 June 2017, pages 13–17. Eurographics Association.
Wolfgang Jentner, Dominik Sacha, Florian Stoffel, Geoffrey P. Ellis, Leishi Zhang, and Daniel A. Keim. 2018. Making machine intelligence less scary for criminal analysts: reflections on designing a visual comparative case analysis tool. The Visual Computer, 34(9):1225–1241.
Daniela Oelke, David Spretke, Andreas Stoffel, and Daniel A. Keim. 2012. Visual readability analysis: How to make your writings easier to read. IEEE Transactions on Visualization and Computer Graphics, 18(5):662–674.
Dominik Sacha, Matthias Kraus, Jürgen Bernard, Michael Behrisch, Tobias Schreck, Yuki Asano, and Daniel A. Keim. 2018. SOMFlow: Guided exploratory cluster analysis with self-organizing maps and analytic provenance. IEEE Transactions on Visualization and Computer Graphics, 24(1):120–130.
Christin Schätzle, Michael Hund, Frederik L. Dennig, Miriam Butt, and Daniel A. Keim. 2017. HistoBankVis: Detecting Language Change via Data Visualization. In Proceedings of the NoDaLiDa 2017 Workshop on Processing Historical Language, 133, pages 32–39. University of Konstanz, Germany.
Gerold Schneider, Mennatallah El-Assady, and Hans Martin Lehmann. 2017. Tools and Methods for Processing and Visualizing Large Corpora. Studies in Variation, Contacts and Change in English, 19.
Rita Sevastjanova, Fabian Beck, Basil Ell, Cagatay Turkay, Rafael Henkin, Miriam Butt, Daniel Keim, and Mennatallah El-Assady. 2018a. Going beyond Visualization: Verbalization as Complementary Medium to Explain Machine Learning Models.
Rita Sevastjanova, Mennatallah El-Assady, Annette Hautli-Janisz, Aikaterini-Lida Kalouli, Rebecca Kehlbeck, Oliver Deussen, Daniel Keim, and Miriam Butt. 2018b. Mixed-initiative active learning for generating linguistic insights in question classification. In Workshop on Data Systems for Interactive Analysis (DSIA) at IEEE VIS.
Fabian Sperrle, Jürgen Bernard, Michael Sedlmair, Daniel Keim, and Mennatallah El-Assady. 2018. Speculative Execution for Guided Visual Analytics. In Proc. of IEEE VIS Workshop on Machine Learning from User Interaction for Visualization and Analytics.
... Example: "This sentence is about cats." and "This sentence is not about cats, but about dogs." If we consider the two sentences above and a model that is mapping them to different topics [8], transformation uncertainty can arise if the word cats in the second sentence causes it to be partly considered as belonging to the topic cats, ignoring the negation. This example is trivial, but more nuanced issues arise in many NLP pipelines. ...
Preprint
Full-text available
Current visual text analysis approaches rely on sophisticated processing pipelines. Each step of such a pipeline potentially amplifies any uncertainties from the previous step. To ensure the comprehensibility and interoperability of the results, it is of paramount importance to clearly communicate the uncertainty not only of the output but also within the pipeline. In this paper, we characterize the sources of uncertainty along the visual text analysis pipeline. Within its three phases of labeling, modeling, and analysis, we identify six sources, discuss the type of uncertainty they create, and how they propagate.
... We show the applicability of the workspace through expert case studies, confirm findings from the related work, and generate new insights into adapter learning properties. A demo is available as part of the LingVis framework [13] under: https://adapters.demo.lingvis.io/. ACKNOWLEDGMENTS This paper was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) within projects BU 1806/10-2 "Questions Visualized" of the FOR2111, and the ETH AI Center. ...
Preprint
Full-text available
Neural language models are widely used; however, their model parameters often need to be adapted to the specific domains and tasks of an application, which is time- and resource-consuming. Thus, adapters have recently been introduced as a lightweight alternative for model adaptation. They consist of a small set of task-specific parameters with a reduced training time and simple parameter composition. The simplicity of adapter training and composition comes along with new challenges, such as maintaining an overview of adapter properties and effectively comparing their produced embedding spaces. To help developers overcome these challenges, we provide a twofold contribution. First, in close collaboration with NLP researchers, we conducted a requirement analysis for an approach supporting adapter evaluation and detected, among others, the need for both intrinsic (i.e., embedding similarity-based) and extrinsic (i.e., prediction-based) explanation methods. Second, motivated by the gathered requirements, we designed a flexible visual analytics workspace that enables the comparison of adapter properties. In this paper, we discuss several design iterations and alternatives for interactive, comparative visual explanation methods. Our comparative visualizations show the differences in the adapted embedding vectors and prediction outcomes for diverse human-interpretable concepts (e.g., person names, human qualities). We evaluate our workspace through case studies and show that, for instance, an adapter trained on the language debiasing task according to context-0 (decontextualized) embeddings introduces a new type of bias where words (even gender-independent words such as countries) become more similar to female than male pronouns. We demonstrate that these are artifacts of context-0 embeddings.
... We integrated this functionality to be able to generalize the approach, as many models with varying sets of features for the same use-case can be trained and analyzed. The system supports a set of word-level features (i.e., word attributes or labels) presented in Reference [26]. These include content features such as tokens and Part-of-Speech (POS) tags, and labels extracted based on various word-lists (e.g., WH-question words, speech acts, discourse particles). ...
Article
Linguistic insight in the form of high-level relationships and rules in text builds the basis of our understanding of language. However, the data-driven generation of such structures often lacks labeled resources that can be used as training data for supervised machine learning. The creation of such ground-truth data is a time-consuming process that often requires domain expertise to resolve text ambiguities and characterize linguistic phenomena. Furthermore, the creation and refinement of machine learning models is often challenging for linguists as the models are often complex, in-transparent, and difficult to understand. To tackle these challenges, we present a visual analytics technique for interactive data labeling that applies concepts from gamification and explainable Artificial Intelligence (XAI) to support complex classification tasks. The visual-interactive labeling interface promotes the creation of effective training data. Visual explanations of learned rules unveil the decisions of the machine learning model and support iterative and interactive optimization. The gamification-inspired design guides the user through the labeling process and provides feedback on the model performance. As an instance of the proposed technique, we present QuestionComb , a workspace tailored to the task of question classification (i.e., in information-seeking vs. non-information-seeking questions). Our evaluation studies confirm that gamification concepts are beneficial to engage users through continuous feedback, offering an effective visual analytics technique when combined with active learning and XAI.
... The system will be included into the lingvis.io framework [59]. ...
Article
We present VisInReport, a visual analytics tool that supports the manual analysis of discourse transcripts and generates reports based on user interaction. As an integral part of scholarly work in the social sciences and humanities, discourse analysis involves an aggregation of characteristics identified in the text, which, in turn, involves a prior identification of regions of particular interest. Manual data evaluation requires extensive effort, which can be a barrier to effective analysis. Our system addresses this challenge by augmenting the users' analysis with a set of automatically generated visualization layers. These layers enable the detection and exploration of relevant parts of the discussion supporting several tasks, such as topic modeling or question categorization. The system summarizes the extracted events visually and verbally, generating a content-rich insight into the data and the analysis process. During each analysis session, VisInReport builds a shareable report containing a curated selection of interactions and annotations generated by the analyst. We evaluate our approach on real-world datasets through a qualitative study with domain experts from political science, computer science, and linguistics. The results highlight the benefit of integrating the analysis and reporting processes through a visual analytics system, which supports the communication of results among collaborating researchers.
Article
Language models, such as BERT, construct multiple, contextualized embeddings for each word occurrence in a corpus. Understanding how the contextualization propagates through the model's layers is crucial for deciding which layers to use for a specific analysis task. Currently, most embedding spaces are explained by probing classifiers; however, some findings remain inconclusive. In this paper, we present LMFingerprints, a novel scoring‐based technique for the explanation of contextualized word embeddings. We introduce two categories of scoring functions, which measure (1) the degree of contextualization, i.e., the layerwise changes in the embedding vectors, and (2) the type of contextualization, i.e., the captured context information. We integrate these scores into an interactive explanation workspace. By combining visual and verbal elements, we provide an overview of contextualization in six popular transformer‐based language models. We evaluate hypotheses from the domain of computational linguistics, and our results not only confirm findings from related work but also reveal new aspects about the information captured in the embedding spaces. For instance, we show that while numbers are poorly contextualized, stopwords have an unexpectedly high contextualization in the models' upper layers, where their neighborhoods shift from tokens with similar functionality to tokens that contribute to the meaning of the surrounding sentences.
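To make the notion of "layerwise changes in the embedding vectors" concrete, the following is a minimal sketch that measures the cosine distance between a token's embeddings in consecutive layers. It is an illustrative assumption, not the scoring function defined by LMFingerprints: the function names `cosine_distance` and `layerwise_change` and the toy vectors are hypothetical, and in practice the per-layer vectors would come from a language model's hidden states.

```python
import math


def cosine_distance(u, v):
    """Cosine distance (1 - cosine similarity) between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_u * norm_v)


def layerwise_change(layer_embeddings):
    """Distance between each pair of consecutive layers for one token occurrence.

    layer_embeddings: list of vectors, one per model layer (hypothetical input).
    Returns a list with one score per layer transition.
    """
    return [
        cosine_distance(lower, upper)
        for lower, upper in zip(layer_embeddings, layer_embeddings[1:])
    ]


# Toy example: the token's vector is unchanged between layers 0 and 1,
# then rotates by 90 degrees between layers 1 and 2.
token_layers = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
changes = layerwise_change(token_layers)
```

Under this simplified score, a transition with a value near 0 leaves the token's representation largely unchanged, while larger values indicate that the layer altered it more strongly.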
Conference Paper
Language models, such as BERT, construct multiple, contextualized embeddings for each word occurrence in a corpus. Understanding how the contextualization propagates through the model's layers is crucial for deciding which layers to use for a specific analysis task. Currently, most embedding spaces are explained by probing classifiers; however, some findings remain inconclusive. In this paper, we present LMFingerprints, a novel scoring-based technique for the explanation of contextualized word embeddings. We introduce two categories of scoring functions, which measure (1) the degree of contextualization, i.e., the layerwise changes in the embedding vectors, and (2) the type of contextualization, i.e., the captured context information. We integrate these scores into an interactive explanation workspace. By combining visual and verbal elements, we provide an overview of contextualization in six popular transformer-based language models. We evaluate hypotheses from the domain of computational linguistics, and our results not only confirm findings from related work but also reveal new aspects about the information captured in the embedding spaces. For instance, we show that while numbers are poorly contextualized, stopwords have an unexpectedly high contextualization in the models' upper layers, where their neighborhoods shift from tokens with similar functionality to tokens that contribute to the meaning of the surrounding sentences.
Preprint
Grounded theory (GT) is a research methodology that entails a systematic workflow for theory generation grounded in emergent data. In this paper, we juxtapose GT workflows with typical workflows in visualization and visual analytics (VIS for short) and underline the characteristics shared by these workflows. We explore the research landscape of VIS to observe where GT has been applied to generate VIS theories, explicitly as well as implicitly. We propose a "why" typology for characterizing aspects in VIS where GT can potentially play a significant role. We outline a "how" methodology for conducting GT research in VIS, which addresses the need for theoretical advancement in VIS while benefitting from other methods and techniques in VIS. We exemplify this "how" methodology by adopting GT approaches in studying the messages posted on VisGuides - an Open Discourse Forum for discussing visualization guidelines.
Conference Paper
Despite the success of contextualized language models on various NLP tasks, it is still unclear what these models really learn. In this paper, we contribute to the current efforts of explaining such models by exploring the continuum between function and content words with respect to contextualization in BERT, based on linguistically-informed insights. In particular, we utilize scoring and visual analytics techniques: we use an existing similarity-based score to measure contextualization and integrate it into a novel visual analytics technique, presenting the model’s layers simultaneously and highlighting intra-layer properties and inter-layer differences. We show that contextualization is neither driven by polysemy nor by pure context variation. We also provide insights on why BERT fails to model words in the middle of the functionality continuum.
Conference Paper
CodaLab is an open-source web-based platform for collaborative computational research. Although CodaLab has gained popularity in the research community, its interface has limited support for creating reusable tools that can be easily applied to new datasets and composed into pipelines. In the clinical domain, natural language processing (NLP) on medical notes generally involves multiple steps, such as tokenization and named entity recognition. Since these steps require different tools, which are usually scattered across different publications, it is not easy for researchers to use them to process their own datasets. In this paper, we present BENTO, a workflow management platform with a graphical user interface (GUI) that is built on top of CodaLab, to facilitate the process of building clinical NLP pipelines. BENTO comes with a number of clinical NLP tools that have been pre-trained using medical notes and expert annotations and can be readily used for various clinical NLP tasks. It also allows researchers and developers to create their custom tools (e.g., pre-trained NLP models) and use them in a controlled and reproducible way. In addition, the GUI enables researchers with a limited computing background to compose tools into NLP pipelines and then apply the pipelines to their own datasets in a "what you see is what you get" (WYSIWYG) way. Although BENTO is designed for clinical NLP applications, the underlying architecture is flexible enough to be tailored to other domains.
Conference Paper
We propose the concept of Speculative Execution for Visual Analytics and discuss its effectiveness for model exploration and optimization. Speculative Execution enables the automatic generation of alternative, competing model configurations that do not alter the current model state unless explicitly confirmed by the user. These alternatives are computed based on either user interactions or model quality measures and can be explored using delta-visualizations. By automatically proposing modeling alternatives, systems employing Speculative Execution can shorten the gap between users and models, reduce confirmation bias, and speed up optimization processes. In this paper, we have assembled five application scenarios showcasing the potential of Speculative Execution, as well as directions for further research.
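The core idea described above, alternatives that leave the current model state untouched until the user confirms one, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class `SpeculativeModel`, its methods, and the parameter names `topics` and `alpha` are all hypothetical, and a dictionary of parameters stands in for a real model.

```python
import copy


class SpeculativeModel:
    """Toy model whose state is a dictionary of parameters (hypothetical)."""

    def __init__(self, params):
        self.params = dict(params)

    def speculate(self, changes):
        """Compute candidate configurations on copies; self.params is not modified."""
        candidates = []
        for change in changes:
            candidate = copy.deepcopy(self.params)
            candidate.update(change)
            candidates.append(candidate)
        return candidates

    def delta(self, candidate):
        """Map each differing parameter to its (current, proposed) values."""
        return {
            key: (self.params.get(key), value)
            for key, value in candidate.items()
            if self.params.get(key) != value
        }

    def confirm(self, candidate):
        """Adopt a candidate only after explicit user confirmation."""
        self.params = copy.deepcopy(candidate)


model = SpeculativeModel({"topics": 10, "alpha": 0.1})
candidates = model.speculate([{"topics": 12}, {"alpha": 0.5}])
state_after_speculation = dict(model.params)  # still the original state
first_delta = model.delta(candidates[0])      # input for a delta-visualization
model.confirm(candidates[1])                  # the user picks the second alternative
```

Working on deep copies keeps the speculation free of side effects; the `delta` dictionary is the kind of information a delta-visualization would present to the user before confirmation.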
Conference Paper
We propose a mixed-initiative active learning system to tackle the challenge of building descriptive models for under-studied linguistic phenomena. Our particular use case is the linguistic analysis of question types, in particular in understanding what characterizes information-seeking vs. non-information-seeking questions (i.e., whether the speaker wants to elicit an answer from the hearer or not) and how automated methods can assist with the linguistic analysis. Our approach is motivated by the need for an effective and efficient human-in-the-loop process in natural language processing that relies on example-based learning and provides immediate feedback to the user. In addition to the concrete implementation of a question classification system, we describe general paradigms of explainable mixed-initiative learning, allowing for the user to access the patterns identified automatically by the system, rather than being confronted by a machine learning black box. Our user study demonstrates the capability of our system in providing deep linguistic insight into this particular analysis problem. The results of our evaluation are competitive with the current state-of-the-art.
Article
To effectively assess the potential consequences of human interventions in model-driven analytics systems, we establish the concept of speculative execution as a visual analytics paradigm for creating user-steerable preview mechanisms. This paper presents an explainable, mixed-initiative topic modeling framework that integrates speculative execution into the algorithmic decision-making process. Our approach visualizes the model-space of our novel incremental hierarchical topic modeling algorithm, unveiling its inner workings. We support the active incorporation of the user's domain knowledge in every step through explicit model manipulation interactions. In addition, users can initialize the model with expected topic seeds, the backbone priors. For a more targeted optimization, the modeling process automatically triggers a speculative execution of various optimization strategies, and requests feedback whenever the measured model quality deteriorates. Users compare the proposed optimizations to the current model state and preview their effect on the next model iterations, before applying one of them. This supervised human-in-the-loop process targets maximum improvement for minimum feedback and has proven to be effective in three independent studies that confirm topic model quality improvements.
Article
A fundamental task in criminal intelligence analysis is to analyze the similarity of crime cases, called comparative case analysis (CCA), to identify common crime patterns and to reason about unsolved crimes. Typically, the data are complex and high dimensional, and the use of complex analytical processes would be appropriate. State-of-the-art CCA tools lack flexibility in interactive data exploration and fall short of computational transparency in terms of revealing alternative methods and results. In this paper, we report on the design of the Concept Explorer, a flexible, transparent and interactive CCA system. During this design process, we observed that most criminal analysts are not able to understand the underlying complex technical processes, which decreases the users’ trust in the results and leads to a reluctance to use the tool. Our CCA solution implements a computational pipeline together with a visual platform that allows the analysts to interact with each stage of the analysis process and to validate the result. The proposed visual analytics workflow iteratively supports the interpretation of the results of clustering with the respective feature relations, the development of alternative models, as well as cluster verification. The visualizations offer an understandable and usable way for the analyst to provide feedback to the system and to observe the impact of their interactions. Expert feedback confirmed that our user-centered design decisions made this computational complexity less scary to criminal analysts.
Article
Topic modeling algorithms are widely used to analyze the thematic composition of text corpora but remain difficult to interpret and adjust. Addressing these limitations, we present a modular visual analytics framework, tackling the understandability and adaptability of topic models through a user-driven reinforcement learning process which does not require a deep understanding of the underlying topic modeling algorithms. Given a document corpus, our approach initializes two algorithm configurations based on a parameter space analysis that enhances document separability. We abstract the model complexity in an interactive visual workspace for exploring the automatic matching results of two models, investigating topic summaries, analyzing parameter distributions, and reviewing documents. The main contribution of our work is an iterative decision-making technique in which users provide document-based relevance feedback that allows the framework to converge to a user-endorsed topic distribution. We also report feedback from a two-stage study which shows that our technique results in topic model quality improvements on two independent measures.
Conference Paper
We present HistoBankVis, a novel visualization system designed for the interactive analysis of complex, multidimensional data to facilitate historical linguistic work. In this paper, we illustrate the visualization's efficacy and power by means of a concrete case study investigating the diachronic interaction of word order and subject case in Icelandic.
Article
We present ThreadReconstructor, a visual analytics approach for detecting and analyzing the implicit conversational structure of discussions, e.g., in political debates and forums. Our work is motivated by the need to reveal and understand single threads in massive online conversations and verbatim text transcripts. We combine supervised and unsupervised machine learning models to generate a basic structure that is enriched by user‐defined queries and rule‐based heuristics. Depending on the data and tasks, users can modify and create various reconstruction models that are presented and compared in the visualization interface. Our tool enables the exploration of the generated threaded structures and the analysis of the untangled reply‐chains, comparing different models and their agreement. To understand the inner‐workings of the models, we visualize their decision spaces, including all considered candidate relations. In addition to a quantitative evaluation, we report qualitative feedback from an expert user study with four forum moderators and one machine learning expert, showing the effectiveness of our approach.
Article
We present several approaches and methods which we develop or use to create workflows from data to evidence. They start with looking for specific items in large corpora, exploring overuse of particular items, and using off-the-shelf visualization such as GoogleViz. Second, we present the advanced visualization tools and pipelines which the Visualization Group at University of Konstanz is developing. After an overview, we apply statistical visualizations, Lexical Episode Plots and Interactive Hierarchical Modeling to the vast historical linguistics data offered by the Corpus of Historical American English (COHA), which ranges from 1800 to 2000. We investigate on the one hand the increase of noun compounds and visually illustrate correlations in the data over time. On the other hand we compute and visualize trends and topics in society from 1800 to 2000. We apply an incremental topic modeling algorithm to the extracted compound nouns to detect thematic changes throughout the investigated time period of 200 years. In this paper, we utilize various tailored analysis and visualization approaches to gain insight into the data from different perspectives.
Article
Clustering is a core building block for data analysis, aiming to extract otherwise hidden structures and relations from raw datasets, such as particular groups that can be effectively related, compared, and interpreted. A plethora of visual-interactive cluster analysis techniques has been proposed to date; however, arriving at useful clusterings often requires several rounds of user interactions to fine-tune the data preprocessing and algorithms. We present a multi-stage Visual Analytics (VA) approach for iterative cluster refinement together with an implementation (SOMFlow) that uses Self-Organizing Maps (SOM) to analyze time series data. It supports exploration by offering the analyst a visual platform to analyze intermediate results, adapt the underlying computations, iteratively partition the data, and reflect on previous analytical activities. The history of previous decisions is explicitly visualized within a flow graph, allowing the analyst to compare earlier cluster refinements and to explore relations. We further leverage quality and interestingness measures to guide the analyst in the discovery of useful patterns, relations, and data partitions. We conducted two pair analytics experiments together with a subject matter expert in speech intonation research to demonstrate that the approach is effective for interactive data analysis, supporting enhanced understanding of clustering results as well as the interactive process itself.