Huamin Qu’s research while affiliated with Hong Kong University of Science and Technology and other places


Publications (425)


Exploring Spatial Hybrid User Interface for Visual Sensemaking
  • Preprint

February 2025 · Haobo Li · [...]
We built a spatial hybrid system that combines a personal computer (PC) and virtual reality (VR) for visual sensemaking, addressing limitations in both environments. Although VR offers immense potential for interactive data visualization (e.g., large display space and spatial navigation), it can also present challenges such as imprecise interactions and user fatigue. At the same time, a PC offers precise and familiar interactions but has limited display space and interaction modality. Therefore, we iteratively designed a spatial hybrid system (PC+VR) to complement these two environments by enabling seamless switching between PC and VR environments. To evaluate the system's effectiveness and user experience, we compared it to using a single computing environment (i.e., PC-only and VR-only). Our study results (N=18) showed that spatial PC+VR could combine the benefits of both devices to outperform user preference for VR-only without a negative impact on performance from device switching overhead. Finally, we discussed future design implications.


Kebbi can be programmed to control seven parts of its body, including a neck (1), two shoulders (2), elbows (3), and fists (4). Minibo has four parts, including a pair of ears (5), a head (6), two auxiliary wheels (7), and swivel wheels (8)
(1) and (2) illustrate the animation of facial expressions and gesture movements of Kebbi. (3) and (4) demonstrate the dancing function of Kebbi and Minibo, respectively
The interaction process of Minibo
The interaction process of Kebbi
A student learning with Kebbi in a classroom setting

Exploring the impact of robot interaction on learning engagement: a comparative study of two multi-modal robots
  • Article
  • Full-text available

January 2025 · Smart Learning Environments

In recent years, there has been a growing interest in using robots within educational environments due to their potential to augment student engagement and motivation. However, current research has not adequately addressed the effectiveness of these robots in facilitating inclusive learning for diverse student populations, particularly those with dyslexia. This study proposes an inclusive learning system developed on two multi-modal robots, Kebbi and Minibo, with interactive (i.e., movable hands) and straightforward features. The system integrates various interactive elements, such as animations, songs, dance, gestures, and touch, to enhance students’ learning engagement, interaction, and motivation and cater to their diverse needs. The study aims to examine the influence of different features from two unique multi-modal robots on the engagement levels of students with/without dyslexia and their needs when engaging with robot learning. Two research questions are posed: (1) What are the features of multi-modal robots that could effectively improve the learning engagements of students with/without dyslexia? (2) What are the needs of students with/without dyslexia when engaging with robot learning? To this end, a comparative study is conducted where 64 students participate in a five-day robot-led training program, while another 73 students receive traditional teacher-led training. Pre/post questionnaires are administered to evaluate students’ engagement levels, and semi-structured interviews are conducted to obtain additional insights. The findings reveal that students with dyslexia are better suited to the interactive and multi-modal features of Kebbi. In contrast, students without dyslexia may prefer the more straightforward features of Minibo, which can still effectively promote engagement and learning. Multi-modal robots can boost engagement and motivation in students with and without dyslexia through novelty and cognitive load management. 
Emotional connections and interactive elements, such as empathetic and customizable features, enhance engagement and improve learning outcomes.


Figure 2: The pipeline of the naïve chatbot.
Figure 3: The Memory Tree organizes information in a photo collection into a three-level structure.
Table: Scene details asked by PVI.
Table: Demographic information of the participants (columns: PID, Age, Gender, Visual Condition, Onset, Image Accessibility Tools, Chatbot Usage, Themes of Photos in Study).
Memory Reviver: Supporting Photo-Collection Reminiscence for People with Visual Impairment via a Proactive Chatbot

January 2025

Reminiscing with photo collections offers significant psychological benefits but poses challenges for people with visual impairment (PVI). Their current reliance on sighted help restricts the flexibility of this activity. In response, we explored using a chatbot in a preliminary study. We identified two primary challenges that hinder effective reminiscence with a chatbot: the scattering of information and a lack of proactive guidance. To address these limitations, we present Memory Reviver, a proactive chatbot that helps PVI reminisce with a photo collection through natural language communication. Memory Reviver incorporates two novel features: (1) a Memory Tree, which uses a hierarchical structure to organize the information in a photo collection; and (2) a Proactive Strategy, which actively delivers information to users at proper conversation rounds. Evaluation with twelve PVI demonstrated that Memory Reviver effectively facilitated engaging reminiscence, enhanced understanding of photo collections, and delivered natural conversational experiences. Based on our findings, we distill implications for supporting photo reminiscence and designing chatbots for PVI.


DanmuA11y: Making Time-Synced On-Screen Video Comments (Danmu) Accessible to Blind and Low Vision Users via Multi-Viewer Audio Discussions

January 2025

By overlaying time-synced user comments on videos, Danmu creates a co-watching experience for online viewers. However, its visual-centric design poses significant challenges for blind and low vision (BLV) viewers. Our formative study identified three primary challenges that hinder BLV viewers' engagement with Danmu: the lack of visual context, the speech interference between comments and videos, and the disorganization of comments. To address these challenges, we present DanmuA11y, a system that makes Danmu accessible by transforming it into multi-viewer audio discussions. DanmuA11y incorporates three core features: (1) Augmenting Danmu with visual context, (2) Seamlessly integrating Danmu into videos, and (3) Presenting Danmu via multi-viewer discussions. Evaluation with twelve BLV viewers demonstrated that DanmuA11y significantly improved Danmu comprehension, provided smooth viewing experiences, and fostered social connections among viewers. We further highlight implications for enhancing commentary accessibility in video-based social media and live-streaming platforms.


InclusiViz: Visual Analytics of Human Mobility Data for Understanding and Mitigating Urban Segregation

January 2025

Urban segregation refers to the physical and social division of people, often driving inequalities within cities and exacerbating socioeconomic and racial tensions. While most studies focus on residential spaces, they often neglect segregation across "activity spaces" where people work, socialize, and engage in leisure. Human mobility data offers new opportunities to analyze broader segregation patterns, encompassing both residential and activity spaces, but challenges existing methods in capturing the complexity and local nuances of urban segregation. This work introduces InclusiViz, a novel visual analytics system for multi-level analysis of urban segregation, facilitating the development of targeted, data-driven interventions. Specifically, we developed a deep learning model to predict mobility patterns across social groups using environmental features, augmented with explainable AI to reveal how these features influence segregation. The system integrates innovative visualizations that allow users to explore segregation patterns from broad overviews to fine-grained detail and evaluate urban planning interventions with real-time feedback. We conducted a quantitative evaluation to validate the model's accuracy and efficiency. Two case studies and expert interviews with social scientists and urban analysts demonstrated the system's effectiveness, highlighting its potential to guide urban planning toward more inclusive cities.


Composing Data Stories with Meta Relations

January 2025

To facilitate the creation of compelling and engaging data stories, AI-powered tools have been introduced to automate the three stages in the workflow: analyzing data, organizing findings, and creating visuals. However, these tools rely on data-level information to derive inflexible relations between findings. Therefore, they often create one-size-fits-all data stories. Differently, our formative study reveals that humans heavily rely on meta relations between these findings from diverse domain knowledge and narrative intent, going beyond datasets, to compose their findings into stylized data stories. Such a gap indicates the importance of introducing meta relations to elevate AI-created stories to a satisfactory level. Though necessary, it is still unclear where and how AI should be involved in working with humans on meta relations. To answer the question, we conducted an exploratory user study with Remex, an AI-powered data storytelling tool that suggests meta relations in the analysis stage and applies meta relations for data story organization. The user study reveals various findings about introducing AI for meta relations into the storytelling workflow, such as the benefit of considering meta relations and their diverse expected usage scenarios. Finally, the paper concludes with lessons and suggestions about applying meta relations to compose data stories to hopefully inspire future research.


Narrative Player: Reviving Data Narratives with Visuals

January 2025 · IEEE Transactions on Visualization and Computer Graphics

Data-rich documents are commonly found across various fields such as business, finance, and science. However, a general limitation of these documents for reading is their reliance on text to convey data and facts. Visual representation of text aids in providing a satisfactory reading experience in comprehension and engagement. However, existing work emphasizes presenting the insights within phrases or sentences, rather than fully conveying data stories within the whole paragraphs and engaging readers. To provide readers with satisfactory data stories, this paper presents Narrative Player, a novel method that automatically revives data narratives with consistent and contextualized visuals. Specifically, it accepts a paragraph and corresponding data table as input and leverages LLMs to characterize the clauses and extract contextualized data facts. Subsequently, the facts are transformed into a coherent visualization sequence with a carefully designed optimization-based approach. Animations are also assigned between adjacent visualizations to enable seamless transitions. Finally, the visualization sequence, transition animations, and audio narration generated by text-to-speech technologies are rendered into a data video. The evaluation results showed that the automatic-generated data videos were well-received by participants and experts for enhancing reading.


Exploring Spatial Hybrid User Interface for Visual Sensemaking

January 2025 · IEEE Transactions on Visualization and Computer Graphics

We built a spatial hybrid system that combines a personal computer (PC) and virtual reality (VR) for visual sensemaking, addressing limitations in both environments. Although VR offers immense potential for interactive data visualization (e.g., large display space and spatial navigation), it can also present challenges such as imprecise interactions and user fatigue. At the same time, a PC offers precise and familiar interactions but has limited display space and interaction modality. Therefore, we iteratively designed a spatial hybrid system (PC+VR) to complement these two environments by enabling seamless switching between PC and VR environments. To evaluate the system's effectiveness and user experience, we compared it to using a single computing environment (i.e., PC-only and VR-only). Our study results (N=18) showed that spatial PC+VR could combine the benefits of both devices to outperform user preference for VR-only without a negative impact on performance from device switching overhead. Finally, we discussed future design implications.




Citations (47)


... Other research has developed advanced techniques for automating various stages of data science workflows [13,95,113]. AI systems are now increasingly used to generate insights autonomously [56,60]. ...

Reference:

Jupybara: Operationalizing a Design Space for Actionable Data Analysis and Storytelling with LLMs
PyGWalker: On-the-fly Assistant for Exploratory Visual Data Analysis
  • Citing Conference Paper
  • October 2024

... In recent years, the rapid development and widespread adoption of generative large language models (LLMs) have revolutionized various domains, including natural language processing [1], [2], content creation [3], [4], and code generation [5], [6]. These powerful AI tools have demonstrated remarkable abilities in generating coherent, contextually relevant, and human-like text, leading to their application in diverse fields such as chip design [7]–[9], medical research [10]–[12], and software development [13], [14]. ...

From Data to Story: Towards Automatic Animated Data Video Creation with LLM-Based Multi-Agent Systems
  • Citing Conference Paper
  • October 2024

... In addition to enabling rich, immersive experiences, XR technologies also offer the capability to transition seamlessly between different virtualities within a single application. This flexibility has spurred a growing interest amongst researchers, leading to a number of studies on cross-virtuality experiences and multi-user collaboration [23,39,44,62,74,78]. As user roles, interaction patterns, and behaviors in XR environments become increasingly complex and diverse, there is a pressing need for a unified standard that accommodates various types of immersive experiences and enables consistent, systematic evaluation, analytics, and visualization across them. ...

Evaluating Layout Dimensionalities in PC+VR Asymmetric Collaborative Decision Making
  • Citing Article
  • October 2024

Proceedings of the ACM on Human-Computer Interaction

... Intent Formalization. ChartEditor bridges the gap between vague user intentions and actionable outcomes through its natural language prompting system, which formalizes user intent and reduces cognitive load [40], [41], [58]. Experts E2 and E4 observed that users often paused or struggled with other tools due to unclear workflows, whereas ChartEditor streamlined their thought processes. ...

Data Playwright: Authoring Data Videos With Annotated Narration
  • Citing Article
  • October 2024

IEEE Transactions on Visualization and Computer Graphics

... The emergence of new types of visual content continues to challenge BLV users in fully participating on social media platforms [72,73,80]. In response, researchers have explored methods to help BLV users access various forms of visual content, including images [26,89,91], memes [25], GIFs [24,100], emojis [77,99], comics [35], and videos [56,78,84]. When engaging with social media content, BLV users seek more than just factual descriptions (e.g., identifying objects in an image); they also aim to establish social and emotional connections with other users [25]. ...

Memory Reviver: Supporting Photo-Collection Reminiscence for People with Visual Impairment via a Proactive Chatbot
  • Citing Conference Paper
  • October 2024

... To conclude, [81] aims to make performing data analysis tasks easier; thanks to LLMs, people who are not from a programming background are now able to complete data analysis tasks. However, non-coding personnel find it difficult, at times, to understand the output of the LLM and to discern when it makes errors. ...

WaitGPT: Monitoring and Steering Conversational LLM Agent in Data Analysis with On-the-Fly Code Visualization

... However, prior studies indicate that older adults generally exhibit similar attitudes toward robots to those of younger individuals, challenging the stereotype of their lower robot-related receptiveness [29]. Furthermore, older adults often prefer human-like robots, such as the Kebbi robot, highlighting the importance of robot design and appearance in fostering acceptance [30,31]. These findings indicate that technology-enhanced educational interventions can effectively enhance HL and disease knowledge, which are associated with improved CKD-related outcomes. ...

Correction: Humanoid robot-empowered language learning based on self-determination theory

Education and Information Technologies

... Furthermore, they did not account for the influence of answer choices or the order in which options were presented [32]. Another recent work by Lo et al. [28] investigated the capability of LLMs to identify misleading charts, but they also relied on the datasets directly collected from websites. To the best of our knowledge, no prior work has addressed a critical issue in LLMs' performance on visualization tasks: whether their responses are based on pre-existing knowledge or solely on the visualizations presented. ...

How Good (Or Bad) Are LLMs at Detecting Misleading Visualizations?
  • Citing Article
  • September 2024

IEEE Transactions on Visualization and Computer Graphics

... DS-Agent, proposed by Guo et al. [7], integrates large language models with case-based reasoning to automate data science workflows, significantly reducing manual effort. Shen et al. [13] extended this concept with an LLM-based multi-agent system for creating animated data videos, pushing the boundaries of automated narrative generation. ...

From Data to Story: Towards Automatic Animated Data Video Creation with LLM-based Multi-Agent Systems

... Interpretative interfaces or tools: in the context of model use, interpretative interfaces or tools must be implemented to allow students and educators to understand the model's reasoning and decisions in real time. For example, visual tools can reveal how the model processes input data and generate outputs, thereby enhancing the model's transparency and interpretability [61]. ...

NFTracer: Tracing NFT Impact Dynamics in Transaction-flow Substitutive Systems with Visual Analytics
  • Citing Article
  • May 2024

IEEE Transactions on Visualization and Computer Graphics