Project

LingVIS.io

Project log

Rita Sevastjanova
added 2 research items
Language models, such as BERT, construct multiple, contextualized embeddings for each word occurrence in a corpus. Understanding how the contextualization propagates through the model's layers is crucial for deciding which layers to use for a specific analysis task. Currently, most embedding spaces are explained by probing classifiers; however, some findings remain inconclusive. In this paper, we present LMFingerprints, a novel scoring-based technique for the explanation of contextualized word embeddings. We introduce two categories of scoring functions, which measure (1) the degree of contextualization, i.e., the layerwise changes in the embedding vectors, and (2) the type of contextualization, i.e., the captured context information. We integrate these scores into an interactive explanation workspace. By combining visual and verbal elements, we provide an overview of contextualization in six popular transformer-based language models. We evaluate hypotheses from the domain of computational linguistics, and our results not only confirm findings from related work but also reveal new aspects about the information captured in the embedding spaces. For instance, we show that while numbers are poorly contextualized, stopwords have an unexpectedly high contextualization in the models' upper layers, where their neighborhoods shift from tokens with similar functionality to tokens that contribute to the meaning of the surrounding sentences.
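As an illustration of what a layerwise contextualization score can look like in practice, the following is a minimal sketch, not the paper's exact scoring functions: it loads a BERT model through the Hugging Face transformers library and reports, for each token, the mean cosine distance between its representations in consecutive layers. The choice of model, sentence, and distance measure are assumptions made only for illustration.

```python
# Illustrative sketch (not LMFingerprints' exact scoring): measure how much each
# token's embedding changes from layer to layer as a proxy for its
# "degree of contextualization".
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

sentence = "The river bank was flooded after the storm."
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    hidden_states = model(**inputs).hidden_states  # tuple: embedding layer + 12 encoder layers

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for pos, token in enumerate(tokens):
    # Cosine distance between consecutive layers, averaged over all layer transitions.
    changes = [
        1 - torch.nn.functional.cosine_similarity(
            hidden_states[layer][0, pos], hidden_states[layer + 1][0, pos], dim=0
        ).item()
        for layer in range(len(hidden_states) - 1)
    ]
    print(f"{token:>12}  mean layerwise change: {sum(changes) / len(changes):.3f}")
```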
Language models learn and represent language differently than humans; they learn the form and not the meaning. Thus, to assess the success of language model explainability, we need to consider the impact of its divergence from a user's mental model of language. In this position paper, we argue that in order to avoid harmful rationalization and achieve truthful understanding of language models, explanation processes must satisfy three main conditions: (1) explanations have to truthfully represent the model behavior, i.e., have a high fidelity; (2) explanations must be complete, as missing information distorts the truth; and (3) explanations have to take the user's mental model into account, progressively verifying a person's knowledge and adapting their understanding. We introduce a decision tree model to showcase potential reasons why current explanations fail to reach their objectives. We further emphasize the need for human-centered design to explain the model from multiple perspectives, progressively adapting explanations to changing user expectations.
Rita Sevastjanova
added a research item
Many interactive machine learning workflows in the context of visual analytics encompass the stages of exploration, verification, and knowledge communication. Within these stages, users perform various types of actions based on different human needs. In this position paper, we postulate expanding this workflow by introducing gameful design elements. These can increase a user’s motivation to take actions, to improve a model’s quality, or to exchange insights with others. By combining concepts from visual analytics, human psychology, and gamification, we derive a model for augmenting the visual analytics processes with game mechanics. We argue for automatically learning a parametrization of these game mechanics based on a continuous evaluation of the users’ actions and analysis results. To demonstrate our proposed conceptual model, we illustrate how three existing visual analytics techniques could benefit from incorporating tailored game dynamics. Lastly, we discuss open challenges and point out potential implications for future research.
Mennatallah El-Assady
added 3 research items
We propose the concept of Speculative Execution for Visual Analytics and discuss its effectiveness for model exploration and optimization. Speculative Execution enables the automatic generation of alternative, competing model configurations that do not alter the current model state unless explicitly confirmed by the user. These alternatives are computed based on either user interactions or model quality measures and can be explored using delta-visualizations. By automatically proposing modeling alternatives, systems employing Speculative Execution can narrow the gap between users and models, reduce confirmation bias, and speed up optimization processes. In this paper, we have assembled five application scenarios showcasing the potential of Speculative Execution, as well as the potential for further research.
We present a framework that allows users to incorporate the semantics of their domain knowledge for topic model refinement while remaining model-agnostic. Our approach enables users to (1) understand the semantic space of the model, (2) identify regions of potential conflicts and problems, and (3) readjust the semantic relation of concepts based on their understanding, directly influencing the topic modeling. These tasks are supported by an interactive visual analytics workspace that uses word-embedding projections to define concept regions which can then be refined. The user-refined concepts are independent of a particular document collection and can be transferred to related corpora. All user interactions within the concept space directly affect the semantic relations of the underlying vector space model, which, in turn, change the topic modeling. In addition to direct manipulation, our system guides the users' decision-making process through recommended interactions that point out potential improvements. This targeted refinement aims at minimizing the feedback required for an efficient human-in-the-loop process. We confirm the improvements achieved through our approach in two user studies that show topic model quality improvements through our visual knowledge externalization and learning process.
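To make the idea of user-refined concept regions concrete, here is a small, hypothetical sketch of how a refinement interaction could be written back into a word-embedding space; the framework's actual update rule is not reproduced here, and the toy vectors and the linear interpolation toward a concept centroid are assumptions.

```python
# Hedged sketch: a user decision about a concept region is mapped back onto the
# word-embedding space by pulling a word toward the region's centroid.
import numpy as np

embeddings = {                      # toy 3-d word vectors (placeholders)
    "court":  np.array([0.9, 0.1, 0.0]),
    "judge":  np.array([0.8, 0.2, 0.1]),
    "tennis": np.array([0.1, 0.9, 0.2]),
}

def refine(word, concept_words, strength=0.5):
    """Pull `word` toward the centroid of a user-defined concept region."""
    centroid = np.mean([embeddings[w] for w in concept_words], axis=0)
    embeddings[word] = (1 - strength) * embeddings[word] + strength * centroid

# The analyst decides "court" belongs with the sports concept, not the legal one.
refine("court", ["tennis"])
print(embeddings["court"])  # now closer to "tennis"; topic models built on the space change accordingly
```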
Argumentation Mining addresses the challenging tasks of identifying boundaries of argumentative text fragments and extracting their relationships. Fully automated solutions do not reach satisfactory accuracy due to their insufficient incorporation of semantics and domain knowledge. Therefore, experts currently rely on time-consuming manual annotations. In this paper, we present a visual analytics system that augments the manual annotation process by automatically suggesting which text fragments to annotate next. The accuracy of those suggestions is improved over time by incorporating linguistic knowledge and language modeling to learn a measure of argument similarity from user interactions. Based on a long-term collaboration with domain experts, we identify and model five high-level analysis tasks. We enable close reading and note-taking, annotation of arguments, argument reconstruction, extraction of argument relations, and exploration of argument graphs. To avoid context switches, we transition between all views through seamless morphing, visually anchoring all text- and graph-based layers. We evaluate our system with a two-stage expert user study based on a corpus of presidential debates. The results show that experts prefer our system over existing solutions due to the speedup provided by the automatic suggestions and the tight integration between text and graph views.
Mennatallah El-Assady
added a research item
We present a modular framework for the rapid-prototyping of linguistic, web-based, visual analytics applications. Our framework gives developers access to a rich set of machine learning and natural language processing steps, through encapsulating them into micro-services and combining them into a computational pipeline. This processing pipeline is auto-configured based on the requirements of the visualization front-end, making the linguistic processing and visualization design detached, independent development tasks. This paper describes the constellation and modality of our framework, which continues to support the efficient development of various human-in-the-loop, linguistic visual analytics research techniques and applications.
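A hedged sketch of the auto-configuration idea: the visualization front-end only declares which linguistic artifacts it needs, and the pipeline is assembled by resolving the dependencies of the micro-services that produce them. The service names, configuration format, and resolution logic below are hypothetical illustrations, not the framework's real API.

```python
# Hypothetical illustration of an auto-configured NLP pipeline; service names
# and the configuration format are assumptions, not the framework's real API.
SERVICES = {
    "tokenizer":  {"produces": "tokens",         "needs": []},
    "pos_tagger": {"produces": "pos_tags",       "needs": ["tokens"]},
    "ner":        {"produces": "named_entities", "needs": ["tokens", "pos_tags"]},
}

def configure_pipeline(required):
    """Order the micro-services so that the front-end's requirements are met."""
    by_product = {spec["produces"]: name for name, spec in SERVICES.items()}
    ordered = []

    def resolve(product):
        name = by_product[product]
        if name in ordered:                      # already scheduled
            return
        for dependency in SERVICES[name]["needs"]:
            resolve(dependency)                  # schedule prerequisites first
        ordered.append(name)

    for product in required:
        resolve(product)
    return ordered

# The visualization front-end states what it needs; the pipeline follows.
print(configure_pipeline(["named_entities"]))  # ['tokenizer', 'pos_tagger', 'ner']
```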
Mennatallah El-Assady
added 3 research items
We present several approaches and methods which we develop or use to create workflows from data to evidence. These start with searching for specific items in large corpora, exploring the overuse of particular items, and using off-the-shelf visualization tools such as GoogleViz. Second, we present the advanced visualization tools and pipelines that the Visualization Group at the University of Konstanz is developing. After an overview, we apply statistical visualizations, Lexical Episode Plots, and Interactive Hierarchical Modeling to the vast historical linguistics data offered by the Corpus of Historical American English (COHA), which ranges from 1800 to 2000. On the one hand, we investigate the increase of noun compounds and visually illustrate correlations in the data over time; on the other hand, we compute and visualize trends and topics in society from 1800 to 2000. We apply an incremental topic modeling algorithm to the extracted compound nouns to detect thematic changes throughout the investigated time period of 200 years. In this paper, we utilize various tailored analysis and visualization approaches to gain insight into the data from different perspectives.
Mennatallah El-Assady
added a research item
Explainable Artificial Intelligence describes a process to reveal the logical propagation of operations that transform a given input to a certain output. In this paper, we investigate the design space of explanation processes based on factors gathered from six research areas, namely, Pedagogy, Story-telling, Argumentation, Programming, Trust-Building, and Gamification. We contribute a conceptual model describing the building blocks of explanation processes, including a comprehensive overview of explanation and verification phases, pathways, mediums, and strategies. We further argue for the importance of studying effective methods of explainable machine learning, and discuss open research challenges and opportunities.
Figure 1: The proposed explanation process model. On the highest level, each explanation consists of different phases that structure the whole process into defined elements. Each phase contains explanation blocks, i.e., self-contained units to explain one phenomenon based on a selected strategy and medium. At the end of each explanation phase, an optional verification block ensures the understanding of the explained aspects. Lastly, to transition between phases and building blocks, different pathways are utilized.
Mennatallah El-Assady
added a research item
To effectively assess the potential consequences of human interventions in model-driven analytics systems, we establish the concept of speculative execution as a visual analytics paradigm for creating user-steerable preview mechanisms. This paper presents an explainable, mixed-initiative topic modeling framework that integrates speculative execution into the algorithmic decision-making process. Our approach visualizes the model space of our novel incremental hierarchical topic modeling algorithm, unveiling its inner workings. We support the active incorporation of the user's domain knowledge in every step through explicit model manipulation interactions. In addition, users can initialize the model with expected topic seeds, the backbone priors. For a more targeted optimization, the modeling process automatically triggers a speculative execution of various optimization strategies, and requests feedback whenever the measured model quality deteriorates. Users compare the proposed optimizations to the current model state and preview their effect on the next model iterations, before applying one of them. This supervised human-in-the-loop process targets maximum improvement for minimum feedback and has proven to be effective in three independent studies that confirm topic model quality improvements.
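The following is a control-flow sketch of the speculative, quality-triggered optimization loop described above, under the assumption that the model, quality measure, optimization strategies, and user-feedback hook are supplied as callables; it illustrates the idea only and is not the paper's implementation.

```python
# Sketch of a speculative-execution loop: alternatives are computed on copies
# of the model state and applied only after explicit user confirmation.
# All callables (iterate, quality, strategies, ask_user) are placeholders.
import copy

def speculative_step(model, strategies, quality, ask_user):
    """Propose alternative model states without altering `model` unless confirmed."""
    candidates = []
    for strategy in strategies:
        preview = copy.deepcopy(model)              # speculate on a copy
        strategy(preview)
        candidates.append((strategy.__name__, preview, quality(preview)))
    chosen = ask_user(quality(model), candidates)   # e.g., via delta-visualizations
    return chosen if chosen is not None else model  # otherwise keep the current state

def optimize(model, iterate, strategies, quality, ask_user, steps=10):
    previous = quality(model)
    for _ in range(steps):
        iterate(model)                              # one incremental modeling step
        current = quality(model)
        if current < previous:                      # deterioration triggers speculation
            model = speculative_step(model, strategies, quality, ask_user)
            current = quality(model)
        previous = current
    return model
```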
Fabian Sperrle
added a research item
We propose the concept of Speculative Execution for Visual Analytics and discuss its effectiveness for model exploration and optimization. Speculative Execution enables the automatic generation of alternative, competing model configurations that do not alter the current model state unless explicitly confirmed by the user. These alternatives are computed based on either user interactions or model quality measures and can be explored using delta-visualizations. By automatically proposing modeling alternatives, systems employing Speculative Execution can narrow the gap between users and models, reduce confirmation bias, and speed up optimization processes. In this paper, we have assembled five application scenarios showcasing the potential of Speculative Execution, as well as the potential for further research.
Rita Sevastjanova
added 3 research items
We propose a mixed-initiative active learning system to tackle the challenge of building descriptive models for under-studied linguistic phenomena. Our particular use case is the linguistic analysis of question types, specifically understanding what characterizes information-seeking vs. non-information-seeking questions (i.e., whether the speaker wants to elicit an answer from the hearer or not) and how automated methods can assist with the linguistic analysis. Our approach is motivated by the need for an effective and efficient human-in-the-loop process in natural language processing that relies on example-based learning and provides immediate feedback to the user. In addition to the concrete implementation of a question classification system, we describe general paradigms of explainable mixed-initiative learning, allowing the user to access the patterns identified automatically by the system, rather than being confronted with a machine learning black box. Our user study demonstrates the capability of our system in providing deep linguistic insight into this particular analysis problem. The results of our evaluation are competitive with the current state of the art.
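For readers unfamiliar with the general pattern, here is a generic, hedged sketch of an example-based active-learning loop with uncertainty sampling (using scikit-learn); it illustrates the kind of human-in-the-loop process described above, not the paper's actual system, and the toy questions and labels are placeholders.

```python
# Generic active-learning sketch: train on a few labeled questions, ask the
# analyst to label the question the model is least certain about, repeat.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

labeled = [("what time is the meeting", 1),             # 1 = information-seeking
           ("could you please close the window", 0)]    # 0 = non-information-seeking
unlabeled = ["who wrote this report", "why don't we just start", "where is the station"]

vectorizer = TfidfVectorizer()
vectorizer.fit([text for text, _ in labeled] + unlabeled)

for _ in range(2):                                      # a few feedback rounds
    X = vectorizer.transform([text for text, _ in labeled])
    y = [label for _, label in labeled]
    clf = LogisticRegression().fit(X, y)

    # Pick the unlabeled question the model is least certain about ...
    proba = clf.predict_proba(vectorizer.transform(unlabeled))
    idx = int(np.argmin(np.abs(proba[:, 1] - 0.5)))
    query = unlabeled.pop(idx)

    # ... and ask the analyst for a label (the prompt stands in for the interface).
    label = int(input(f"Is '{query}' information-seeking? (0/1) "))
    labeled.append((query, label))
```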
In this position paper, we argue that a combination of visualization and verbalization techniques is beneficial for creating broad and versatile insights into the structure and decision-making processes of machine learning models. Explainability of machine learning models is emerging as an important area of research. Hence, insights into the inner workings of a trained model allow users and analysts, alike, to understand the models, develop justifications, and gain trust in the systems they inform. Explanations can be generated through different types of media, such as visualization and verbalization. Both are powerful tools that enable model interpretability. However, while their combination is arguably more powerful than each medium separately, they are currently applied and researched independently. To support our position that the combination of the two techniques is beneficial to explain machine learning models, we describe the design space of such a combination and discuss arising research questions, gaps, and opportunities.
We present ThreadReconstructor, a visual analytics approach for detecting and analyzing the implicit conversational structure of discussions, e.g., in political debates and forums. Our work is motivated by the need to reveal and understand single threads in massive online conversations and verbatim text transcripts. We combine supervised and unsupervised machine learning models to generate a basic structure that is enriched by user-defined queries and rule-based heuristics. Depending on the data and tasks, users can modify and create various reconstruction models that are presented and compared in the visualization interface. Our tool enables the exploration of the generated threaded structures and the analysis of the untangled reply chains, comparing different models and their agreement. To understand the inner workings of the models, we visualize their decision spaces, including all considered candidate relations. In addition to a quantitative evaluation, we report qualitative feedback from an expert user study with four forum moderators and one machine learning expert, showing the effectiveness of our approach.
Mennatallah El-Assady
added 2 research items
Topic modeling algorithms are widely used to analyze the thematic composition of text corpora but remain difficult to interpret and adjust. Addressing these limitations, we present a modular visual analytics framework, tackling the understandability and adaptability of topic models through a user-driven reinforcement learning process which does not require a deep understanding of the underlying topic modeling algorithms. Given a document corpus, our approach initializes two algorithm configurations based on a parameter space analysis that enhances document separability. We abstract the model complexity in an interactive visual workspace for exploring the automatic matching results of two models, investigating topic summaries, analyzing parameter distributions, and reviewing documents. The main contribution of our work is an iterative decision-making technique in which users provide document-based relevance feedback that allows the framework to converge to a user-endorsed topic distribution. We also report feedback from a two-stage study which shows that our technique results in topic model quality improvements on two independent measures.
Mennatallah El-Assady
added 4 research items
An important factor of argument mining in dialog or multilog data is the framing with which interlocutors put forth their arguments. By using rhetorical devices such as hedging or reference to the Common Ground, speakers relate themselves to their interlocutors, their arguments, and the ongoing discourse. Capitalizing on theoretical linguistic insights into the semantics and pragmatics of discourse particles in German, we propose a categorization of rhetorical information that is highly relevant in natural transcribed speech. In order to shed light on the rhetorical strategies of different interlocutors in large amounts of real mediation data, we use a method from Visual Analytics which allows for an exploration of the rhetorical patterns via an interactive visual interface. With this innovative combination of theoretical linguistics, argument mining and information visualization, we offer a novel way of analyzing framing strategies in large amounts of multi-party argumentative discourse in German.
In the research of deliberative democracy, political scientists are interested in analyzing the communication models of discussions, debates, and mediation processes with the goal of extracting reoccurring discourse patterns from the verbatim transcripts of these conversations. To enhance the time-consuming manual analysis of such patterns, we introduce a visual analytics approach that enables the exploration and analysis of repetitive feature patterns over parallel text corpora using feature alignment. Our approach is tailored to the requirements of our domain experts. In this paper, we discuss our visual design and workflow, and we showcase the applicability of our approach using an experimental parallel corpus of political debates.
We present NEREx, an interactive visual analytics approach for the exploratory analysis of verbatim conversational transcripts. By revealing different perspectives on multi-party conversations, NEREx gives an entry point for the analysis through high-level overviews and provides mechanisms to form and verify hypotheses through linked detail-views. Using a tailored named-entity extraction, we abstract important entities into ten categories and extract their relations with a distance-restricted entity-relationship model. This model complies with the often ungrammatical structure of verbatim transcripts, relating two entities if they are present in the same sentence within a small distance window. Our tool enables the exploratory analysis of multi-party conversations using several linked views that reveal thematic and temporal structures in the text. In addition to distant-reading, we integrated close-reading views for a text-level investigation process. Beyond the exploratory and temporal analysis of conversations, NEREx helps users generate and validate hypotheses and perform comparative analyses of multiple conversations. We demonstrate the applicability of our approach on real-world data from the 2016 U.S. Presidential Debates through a qualitative study with three domain experts from political science.
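The distance-restricted relation extraction lends itself to a compact sketch: two extracted entities are related if they appear in the same sentence within a small token-distance window. The input format, entity categories, and window size below are illustrative assumptions, not NEREx's actual implementation.

```python
# Sketch of a distance-restricted entity relation: two entities in the same
# sentence are related if their token positions lie within a small window.
from itertools import combinations

WINDOW = 7  # maximum token distance within a sentence (illustrative value)

# One sentence, pre-tagged: (token position, entity text, entity category)
sentence_entities = [
    (2, "Clinton", "PERSON"),
    (5, "taxes", "TOPIC"),
    (14, "Ohio", "LOCATION"),
]

relations = [
    (a, b)
    for a, b in combinations(sentence_entities, 2)
    if abs(a[0] - b[0]) <= WINDOW
]

for (pos_a, ent_a, cat_a), (pos_b, ent_b, cat_b) in relations:
    print(f"{ent_a} ({cat_a}) -- {ent_b} ({cat_b})")   # Clinton (PERSON) -- taxes (TOPIC)
```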
Private Profile
added a research item
We introduce a novel visual analytics approach to analyze speaker behavior patterns in multi-party conversations. We propose Topic-Space Views to track the movement of speakers across the thematic landscape of a conversation. Our tool is designed to assist political science scholars in exploring the dynamics of a conversation over time to generate and prove hypotheses about speaker interactions and behavior patterns. Moreover, we introduce a glyph-based representation for each speaker turn based on linguistic and statistical cues to abstract relevant text features. We present animated views for exploring the general behavior and interactions of speakers over time and interactive steady visualizations for the detailed analysis of a selection of speakers. Using a visual sedimentation metaphor we enable the analysts to track subtle changes in the flow of a conversation over time while keeping an overview of all past speaker turns. We evaluate our approach on real-world datasets and the results have been insightful to our domain experts.
Private Profile
added a research item
In this paper, we present Lexical Episode Plots, a novel automated text-mining and visual analytics approach for exploratory text analysis. In particular, we first describe an algorithm for automatically annotating text regions to examine prominent themes within natural language texts. The algorithm is based on lexical chaining to find spans of text in which the frequency of a term is significantly higher than its average in the document. In a second step we present an interactive visualization supporting the exploration and interpretation of Lexical Episodes. The visualization links higher-level thematic structures with content-level details. The methodological capabilities of our approach are illustrated by analyzing the televised US presidential election debates.
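A rough, self-contained sketch of the underlying intuition: a term whose local frequency clearly exceeds its document-wide average marks an episode. The paper's actual lexical-chaining algorithm and significance criterion are not reproduced here, and the window size and threshold below are assumptions.

```python
# Illustrative episode detection: flag terms that are locally over-represented
# in a text window compared to their average frequency in the whole document.
from collections import Counter

def lexical_episodes(tokens, window=50, factor=3.0, min_count=3):
    """Return (term, start, end) spans where a term is locally over-represented."""
    doc_freq = Counter(tokens)
    n = len(tokens)
    episodes = []
    for start in range(0, n - window + 1, window):
        local = Counter(tokens[start:start + window])
        for term, count in local.items():
            expected = doc_freq[term] * window / n   # average count per window
            if count >= min_count and count > factor * expected:
                episodes.append((term, start, start + window))
    return episodes

# Toy document: a short burst about "health care" inside a longer text.
tokens = ("the debate turned to health care " * 5 + "the economy grew slowly " * 40).split()
print(lexical_episodes(tokens))
```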
Private Profile
added 2 research items
For three years, political scientists, linguists, and computer scientists at the University of Konstanz collaborated across disciplines in the "VisArgue" project to develop software that analyzes and visualizes political communication and allows conclusions to be drawn about its effectiveness. The tools were tested on the example of Stuttgart 21.
This article reports on a Digital Humanities research project which is concerned with the automated linguistic and visual analysis of political discourses with a particular focus on the concept of deliberative communication. According to the theory of deliberative communication as discussed within political science, political debates should be inclusive, and stakeholders participating in these debates are required to justify their positions rationally and respectfully and should eventually defer to the better argument. The focus of the article is on the novel interactive visualizations that combine linguistic and statistical cues to analyze the deliberative quality of communication automatically. In particular, we quantify the degree of deliberation for four dimensions of communication: Participation, Respect, Argumentation and Justification, and Persuasiveness. So far, these four dimensions have not been linked within a combined linguistic and visual framework; instead, each dimension helps to determine the degree of deliberation independently of the others. Since at its core, deliberation requires sustained and appropriate modes of communication, our main contribution is the automatic annotation and disambiguation of causal connectors and discourse particles.
Private Profile
added an update
Private Profile
added 4 project references
Mennatallah El-Assady
added a project goal