Soon-Gyo Jung

  • Software Engineer at Qatar Computing Research Institute

About

  • Publications: 170
  • Reads: 139,306
  • Citations: 3,408
Introduction
Current institution
Qatar Computing Research Institute
Current position
  • Software Engineer

Publications (170)
Article
Full-text available
Online reviews significantly influence consumer decision-making in digital marketplaces, yet the proliferation of fake reviews threatens their credibility. This study investigates the psycholinguistic features that differentiate human-written fake reviews from genuine ones and explores how these features, along with distributional semantics, can be...
Article
Large Language Models (LLMs) are emerging as a powerful tool for AI-generated personas. This study evaluates the usability of AI-generated personas, comparing chat and profile formats. The findings indicate chat personas tend to be perceived more favourably, and profile personas exhibit greater variability in user perception. The increased difficul...
Article
Full-text available
The wide use of social media raises numerous privacy concerns, with limited studies of the Middle East and North Africa (MENA) region. This study presents an in-depth analysis of social media privacy concerns in sixteen MENA countries, a timely and important topic in an under-studied region. A census-representative sample (N=8140) was collected usi...
Conference Paper
Full-text available
When using deepfake technology to represent users, there is a need to convey a reasonable range of emotions to be able to portray different circumstances ranging from positive to negative experiences (e.g., personal struggles). Because it is not known how well deepfake avatars embody emotional diversity, we investigated this aspect among 202 deepfa...
Conference Paper
As the integration of artificial intelligence into social media continues to attract attention, the key impacts on content marketing are still undefined. Initial studies have shown that language models are capable of producing content that is competitive with content created by humans. However, how can such content be tailored for different social...
Conference Paper
Cipherbot, an educational chatbot using large language models to answer student questions concerning learning materials uploaded by the educator, was pilot tested in a classroom setting. Forty-four students used Cipherbot for seven weeks, sending 8077 messages. The average number of messages sent per student was 184 (SD = 80), with an average lengt...
Conference Paper
Full-text available
We investigate the impact of user demographics (age, gender) and experience (with personas and chatbots) on users’ perceptions of interactive personas. A within-subjects study was conducted with 54 participants, mostly engineers and computer scientists. Each participant used interactive personas with two interfaces: a web-based profile persona and...
Article
Full-text available
Technology-mediated group toxicity polarization is a major socio-technological issue of our time. For better large-scale monitoring of polarization among social media news content, we quantify the toxicity of news video comments using a Toxicity Polarization Score. For polarizing news videos, our premise is that the comments’ toxicity approximates...
Article
HCI research is facing a vital question of the effectiveness of AI-generated personas. Addressing this question, this research explores user perceptions of AI-generated personas for textual content (GPT-4) and two image generation models (DALL-E and Midjourney). We evaluate whether the inclusion of images in AI-generated personas impacts user perce...
Conference Paper
In this article, we present METRIC. Measuring Engagement Through Remote Interactions of Customers (METRIC) (https://metric.qcri.org/) is a tool for collecting, measuring, analyzing, and reporting the engagement of online systems through actual interactions of customers or users, either remote or in the lab. METRIC enables system stakeholders to enh...
Conference Paper
Full-text available
Although deepfakes have a negative connotation in human-computer interaction (HCI) due to their risks, they also involve many opportunities, such as communicating user needs in the form of a “living, talking” deepfake persona. To scope and better understand these opportunities, we present a qualitative analysis of 46 participants’ think-aloud trans...
Article
Personas inform design by representing diverse user needs. Since their initial application in commercial technology contexts, personas have been adopted in several research domains for public good, such as health, accessibility, politics and civic society, education, sustainability, cybersecurity, and criminology. In this review paper, we analyzed...
Chapter
Social media analytics is the process of deriving meaning from social media data to make better business decisions. Social media platforms offer businesses new insights into their strategies through social media analytics. Social media data analytics involves extracting, cleansing, transforming, and loading social data for further analysis. The dat...
Chapter
Personas represent distinct user or customer groups in design, human-computer interaction, and marketing. Persona analytics refers to creating data-driven personas, often employing data science and machine learning algorithms, to address customer analytics use cases. Through persona analytics, various customer segments can become more salient, obse...
Chapter
Web analytics is the practice of analyzing digital data about online visitors to websites and apps to gain insights that inform business decisions. Web analytics focuses on collecting, measuring, analyzing, and reporting digital data to improve insights concerning web traffic patterns and user behavior. Many organizations use web analytics to monit...
Chapter
Social media platforms are powerful for businesses to gain insight into how their customers feel about their company, product, or service. This chapter discusses the different types of social media analytics methods available to businesses to track their social media performance. With the help of natural language processing, businesses can understa...
Chapter
This chapter presents the process of web analytics for web platforms, web systems, and web apps. The process outlines how basic visitor information, such as the number of visitors and visit duration, can be collected using log files and page tagging. This basic visitor information is then combined to create meaningful key performance indicators tha...
Chapter
This chapter provides an overview of the following data-gathering methods: online surveys, crowdsourcing, eye tracking, mouse tracking, search logs, triangulation, and social media APIs. This information is essential for anyone interested in understanding the methods and best practices for gathering data in web and social media analytics. With this...
Chapter
Validity is often understood as the “correctness” of a method or instrument. However, many data collection issues can degrade the validity of findings about people. Data validity is also a crucial issue for web and social media analytics. Data validity is data accuracy, where accuracy is the degree to which the data conforms to actual values. If th...
Chapter
This chapter provides a comprehensive overview of data preprocessing techniques and tools in the context of web and social media analytics. As data volume and complexity from various sources grow, effective data preprocessing becomes crucial for extracting valuable insights and knowledge. This chapter covers vital steps in data preprocessing, inclu...
Chapter
Search log analytics is a specific case of analytics that analyzes data from search logs. Using data stored in the search logs of online search engines, intranet search services, and online search sites can provide essential insights into the information searching processes, needs, and tactics of people looking for information online. Und...
Chapter
An analytics strategy helps an organization make better decisions using data and analytics. While web analytics helps organizations identify how to improve their websites, social media analytics help them improve how social media channels are used by offering strategic KPI use and monitoring. This is essential for maintaining an effective design an...
Chapter
User studies, involving both qualitative and quantitative data about user experiences, can complement and contextualize the insights and data afforded by web and social media analytics, providing an extension to these analytics areas, which we refer to as user study analytics. In this chapter, we discuss some of the dos and don’ts of user studies,...
Chapter
This chapter explores data quality assessment in data analytics. Emphasis is placed on the importance of ensuring you have high-quality data for effective decision making and successful outcomes in data analytics. Various aspects of data quality, such as completeness, consistency, validity, accuracy, and timeliness, are examined, along with the met...
Article
This article discusses the promising potential of employing large language models (LLMs) for survey research, including generating responses to survey items. LLMs can address some of the challenges associated with survey research regarding question-wording and response bias. They can address issues relating to a lack of clarity and understanding bu...
Article
Purpose The “what is beautiful is good” (WIBIG) effect implies that observers tend to perceive physically attractive people in a positive light. The authors investigate how the WIBIG effect applies to user personas, measuring designers' perceptions and task performance when employing user personas for the design of information technology (IT) solut...
Article
Deepfakes, realistic portrayals of people that do not exist, have garnered interest in research and industry. Yet, the contributions of deepfake technology to human-computer interaction remain unclear. One possible value of deepfake technology is to create more immersive user personas. To test this premise, we use a commercial-grade service to gene...
Article
Full-text available
Although the effect of hyperparameters on algorithmic outputs is well known in machine learning, the effects of hyperparameters on information systems that produce user or customer segments are relatively unexplored. This research investigates the effect of varying the number of user segments on the personification of user engagement data in a real...
Article
Employing customer information from one of the world's largest airline companies, we develop a price elasticity model (PREM) using machine learning to identify customers likely to purchase an upgrade offer from economy to premium class and predict a customer's acceptable price range. A simulation of 64.3 million flight bookings and 14.1 million ema...
Conference Paper
Although personas have been applied for two decades, not much is known about why a designer chooses a specific persona for a given design task. This question matters because if designers prefer one persona over another, then the needs and attributes of that persona would be favored in the design process, resulting in possible “blind spots” and bias...
Conference Paper
Human-computer interaction (HCI) and natural language processing (NLP) can engage in mutually beneficial collaboration. This article summarizes previous literature to identify grand challenges for the application of NLP in quantitative user personas (QUPs), which exemplifies such collaboration. Grand challenges provide a collaborative starting poin...
Article
Full-text available
Studies in human-computer interaction recommend creating fewer than ten personas, based on stakeholders’ limitations to cognitively process and use personas. However, no existing studies offer empirical support for having fewer rather than more personas. Investigating this matter, thirty-seven participants interacted with five and fifteen personas...
Article
Full-text available
Compensating crowdworkers for their research participation often entails paying a flat rate to all participants, regardless of the amount of time they spend on the task or skill level. If the actual time required varies considerably between workers, flat rates may yield unfair compensation. To study this matter, we analyzed three survey studies wit...
Article
Full-text available
Background: Constructing a sample of real users as participants in user studies is considered by most researchers to be vital for the validity, usefulness, and applicability of research findings. However, how often user studies reported in information technology academic literature sample real users or surrogate users is unknown. Therefore, it is u...
Conference Paper
Full-text available
Personas represent distinct user types. However, while online user data can be demographically and behaviorally heterogeneous, most studies generate fewer than ten personas, regardless of how heterogeneous the data is. Because all persona creation efforts need to assign a number of personas to create, assigning this number evokes a fundamental quest...
Conference Paper
Full-text available
Algorithmically generated personas can help organizations understand their social media audiences. However, when using algorithms to create personas from social media user data, the resulting personas may contain toxic quotes that negatively affect content creators’ perceptions of the personas. To address this issue, we have implemented toxicity de...
Article
Full-text available
User-centric design within organizations is crucial for developing information technology that offers optimal usability and user experience. Personas are a central user-centered design technique that puts people before technology and helps decision makers understand the needs and wants of the end-user segments of their products, systems, and servic...
Article
This reflection article addresses a difficulty faced by scholars and practitioners working with numbers about people, which is that those who study people want numerical data about these people. Unfortunately, time and time again, this numerical data about people is wrong. Addressing the potential causes of this wrongness, we present examples of an...
Article
Full-text available
Derived from the notion of algorithmic bias, it is possible that creating user segments such as personas from data results in over- or under-representing certain segments (FAIRNESS), does not properly represent the diversity of the user populations (DIVERSITY), or produces inconsistent results when hyperparameters are changed (CONSISTENCY). Collect...
Conference Paper
Full-text available
Integrating artificial intelligence (AI) technologies into customer service is of great interest to firms that face a mass of feedback originating from multiple channels (Poser et al., 2022), including phone calls, emails, and social media. Particularly social media channels have increased their popularity in recent years as a notable channel of cu...
Conference Paper
Data-driven persona generation can benefit from stakeholder inputs while offloading the complexities of high-dimensional datasets. To this end, we present Survey2Persona (S2P), an interactive web interface for real-time persona generation from survey data. The users of the web interface—the designers—can upload survey data and have the interface au...
Conference Paper
Using data from a major international news organization, we investigate the effect of hiding the count of dislikes from YouTube viewers on the propensity to use the video like/dislike features. We compare one entire month of videos before (n = 478) and after (n = 394) YouTube began hiding the dislikes counts. Collectively, these videos had received...
Article
Large commercial sentiment analysis tools are often deployed in software engineering due to their ease of use. However, it is not known how accurate these tools are, and whether the sentiment ratings given by one tool agree with those given by another tool. We use two datasets - (1) NEWS consisting of 5,880 news stories and 60K comments from four s...
Article
Full-text available
Artificial intelligence, particularly machine learning, carries high potential to automatically detect customers’ pain points, which are particular concerns that customers express and that the company can address. However, unstructured data scattered across social media make detection a nontrivial task. Thus, to help firms gain deeper insights into cus...
Article
Full-text available
This research compares four standard analytics metrics from Google Analytics with SimilarWeb using one year’s average monthly data for 86 websites from 26 countries and 19 industry verticals. The results show statistically significant differences between the two services for total visits, unique visitors, bounce rates, and average session duration....
Article
Full-text available
There has been little research into whether a persona's picture should portray a happy or unhappy individual. We report a user experiment with 235 participants, testing the effects of happy and unhappy image styles on user perceptions, engagement, and personality traits attributed to personas using a mixed-methods analysis. Results indicate that th...
Conference Paper
Full-text available
Personas represent the needs of users in diverse populations and impact design by engendering empathy and improving communication. While personas have been lauded for their benefits, we could locate no prior review of persona use cases in design, prompting the question: how are personas actually used to achieve these benefits? To address this questio...
Conference Paper
Personas have evolved since Alan Cooper coined the term in 1999, moving into new domains, new ways of collecting data, and novel ways of presenting the persona profiles. From the beginning, personas were linked to software design, expressing the need for empathy with end-users. This is still the case today, but we want to show how this is execut...
Article
We investigate how the Proteus effect, which is players changing their way of communication based on characters with which they play, is associated with players’ champion usage in the popular online game League of Legends, where champions are the characters that the players control. First, we create two sets of variables: (a) objective champion cha...
Conference Paper
Full-text available
Much of the reported work on personas suffers from the lack of empirical evidence. To address this issue, we introduce Persona Analytics (PA), a system that tracks how users interact with data-driven personas. PA captures users’ mouse and gaze behavior to measure users’ interaction with algorithmically generated personas and use of system features...
Article
Full-text available
Customers increasingly rely on reviews for product information. However, the usefulness of online reviews is impeded by fake reviews that give an untruthful picture of product quality. Therefore, detection of fake reviews is needed. Unfortunately, so far, automatic detection has only had partial success in this challenging task. In this research, w...
Article
Full-text available
When algorithms create personas from social media data, the personas can become noxious via automatically including toxic comments. To investigate how users perceive such personas, we conducted a 2 × 2 user experiment with 496 participants that showed participants toxic and non-toxic versions of data-driven personas. We found that participants gave...
Article
Full-text available
In this work, we build on research on data-driven personas to present what might be “wrong with them”. From wrong assumptions by the client and wrong applications of methods to imbalanced, messy, or superficial data; a lack of communication regarding how these personas are created; and issues with usability, there are a plethora of issues that plag...
Article
Full-text available
We explore the effects of hyperparameter selections on the personification accuracy of customer analytics data from a corporate YouTube channel with an audience in the hundreds of thousands and customer interactions in the tens of millions. Using non-negative matrix factorization, we generate persona sets from 5 to 15 using the customer analytics...
Chapter
Full-text available
During exceptional times when researchers do not have physical access to users of technology, the importance of remote user studies increases. We provide recommendations based on lessons learned from conducting online user studies utilizing four online research platforms (Appen, MTurk, Prolific, and Upwork). Our recommendations aim to help those in...
Conference Paper
Full-text available
We report results using n-grams to model user actions with only aggregated data and knowing little about the user. Employing a data set of 33,860 flight bookings from 4,221 passengers, we evaluate the n-gram model for the precision of predicting next likely actions. Results show that our approach can achieve a precision of 21% overall and 88% for s...
Conference Paper
Full-text available
Attacks against media channels are increasing in social media. The concept of fake news has been weaponized to label and discredit content with which one does not agree. Using data collected from Facebook and YouTube, we analyze attacks against online news channels to understand the logic behind them. Based on programmatic data collection of 4,980...
Conference Paper
Full-text available
Automated online hate detection has garnered interest from various stakeholders to make online platforms safer. Despite this interest, there remain a plethora of unresolved issues that hinder advancement. We review fourteen state-of-the-art articles discussing these challenges, and present a meta-synthesis. Six themes are identified: (1) Dataset se...
Conference Paper
Full-text available
User needs inform designers and developers of essential functionalities for requirements engineering. In this work, we summarize key concepts and challenges relating to manual and automatic user needs detection methods. We discuss six challenges with manual and eight challenges with automated methods. Despite the promise of automated methods, the c...
Chapter
Full-text available
Personas are often created based on user interviews. Yet, researchers rarely make their interview questions publicly available or justify how they were chosen. We manually extract 276 interview questions and categorize them into 10 themes, making this list publicly available for researchers and practitioners. We also demonstrate an approach of usin...
Conference Paper
Full-text available
Controlling the quality of social media feeds poses an issue for many users. Platforms such as Twitter give users some options to influence their feeds. Still, the selection of content predominantly relies on implicit rather than explicit user actions, as manual options for "cleaning the feed" are often cumbersome and difficult to use for most user...
Chapter
In a user experiment, we tried out a novel data collection approach that combines surveys with the think-aloud method. We coin the phrase “think-aloud survey method”, where participants think aloud while completing a questionnaire. We analyzed the transcripts and found that the think-aloud survey provides deeper insights into the reasonin...
Conference Paper
Full-text available
We compare a data-driven persona system and an analytics system for efficiency and effectiveness for a user identification task. Findings from the 34-participant experiment show that the data-driven persona system affords faster task completion, is easier for users to engage with, and provides better user identification accuracy. Eye-tracking data...
Article
Personified big data and rapidly developing data science techniques enable previously unforeseen methodological developments for longitudinal analysis of online audiences. Applying data-driven persona generation on online customer statistics from a real organizational social media channel, we demonstrate how personas can be deployed to understand o...
Article
We develop a framework to reduce the number of customer segments to the smallest quantity without losing essential information of the underlying population in the electronic marketplace. As a use case of this approach, we create personas for these segments to enhance customer understanding. We use (a) matrix factorization to identify customer behav...
Conference Paper
Full-text available
Investigating users’ engagement with interactive persona systems can yield crucial insights for the design of such systems. Using eye-tracking, researchers can address the scarcity of behavioral user studies, even during times when physical user studies are difficult or impossible to carry out. In this research, we implement a webcam-based eye-trac...
Conference Paper
Full-text available
Our research goal is to summarize the body of persona knowledge by identifying knowledge claims. This can aid HCI researchers to (a) navigate persona knowledge to form an understanding of what is known about personas quickly, (b) identify central research gaps of what is not known (or said) about personas, and (c) identify claims that are not subst...
Conference Paper
Full-text available
User studies have found persona application challenging. We argue that a potential reason for the challenges is the organization's readiness to apply personas. This research reports the on-going effort of developing the Persona Readiness Scale, a survey instrument for organizations’ readiness for personas. The scale involves twenty-two items from s...
Conference Paper
Full-text available
Though photographs of real people are typically used to portray personas, there is little research into the potential advantages or disadvantages of using such images, relative to other image styles. We conducted an experiment with 149 participants, testing the effects of six different image styles on user perceptions and personality traits that ar...
Article
Full-text available
Data-driven persona development unifies methodologies for creating robust personas from the behaviors and demographics of user segments. Data-driven personas have gained popularity in human-computer interaction due to digital trends such as personified big data, online analytics, and the evolution of data science algorithms. Even with its increasin...
Article
False preconceptions about users can result in poor design, product development, and marketing decisions, so rectifying these preconceptions is essential for organizations. This research quantitatively evaluates the ability of data-driven personas to alter decision makers’ preconceptions about their online social media users. We conduct a within-pa...
Article
Full-text available
Data-driven personas are a significant advancement in the fields of human-centered informatics and human-computer interaction. Data-driven personas enhance user understanding by combining the empathy inherent with personas with the rationality inherent in analytics using computational methods. Via the employment of these computational methods, the...
Conference Paper
Full-text available
Practitioners in user-centric industries have increasingly recognized the applicability of personas. However, the methods used to create personas in different domains remain inconsistent and unsystematic. We analyzed 51 studies focused on designing personas for professional purposes and find the practice most prevalent in the user experience (UX) d...
Conference Paper
Full-text available
Persona is a technique for enhancing user understanding and improving the user-centered design of digital products. Persona creation has traditionally been divided into Qualitative, Quantitative, and Mixed Methods approaches. However, no literature systematically contrasts the strengths and weaknesses of these approaches. We review the literature t...
Conference Paper
Full-text available
Targeting criteria in online advertising differ across platforms and frequently change. Because advertisers are increasingly taking a multi-channel approach to online marketing, there is a need to automatically map the targeting criteria between ad platforms. In this research, we test two algorithmic approaches − Word2Vec and WordNet − for mapping...
Conference Paper
Full-text available
We develop a method for assigning demographically appropriate names to data-driven entities, such as personas, chatbots, and virtual agents. The value of this method is removing the time-consuming human effort in this task. To demonstrate our method, we collect four million user profiles with gender, age, and country information from an internation...
Article
Full-text available
Using 27 million flight bookings for 2 years from a major international airline company, we built a Next Likely Destination model to ascertain customers’ next flight booking. The resulting model achieves an 89% predictive accuracy using historical data. A unique aspect of the model is the incorporation of self-competence, where the model defers whe...
Chapter
Full-text available
In this concluding chapter, we address three myths concerning data-driven personas. In some respects, these myths are the foundational drivers for the grand challenges in the data-driven persona domain. We then present and discuss the grand challenges that must be addressed, in a multidisciplinary manner, to take data-driven personas to the next le...
Chapter
Full-text available
This chapter discusses the challenging process of getting data-driven personas integrated within an organization. More precisely, it requires effort to bring the stakeholders within an organization to understand, develop, and productively employ data-driven personas for enhanced user understanding. We outline and discuss details of a three-step dat...
Chapter
Full-text available
The persona technique came into widespread use and acceptance shortly after its inception (Cooper, 1999), although it was sometimes criticized along the way (Laubheimer, 2017; McKeen, 2019) for shortcomings relative to other user research methods. There are different approaches apart from personas that can be used during a user-centered design proc...
Chapter
Full-text available
In this chapter, we discuss the aspects of getting data that is useful for creating data-driven personas. We start by introducing the concept of persona information needs, which refers to stakeholders’ requests for information. We then proceed to persona information display and design, which refers to how the selected information is displayed to en...
Chapter
Full-text available
This chapter reviews issues and challenges of data-driven-persona development. We introduce an analytical framework for ethical dimensions to assess if data-driven personas are fair and representative of minority groups while also countering the perpetuation of stereotypes. The complicated intersections of people and technology form the basis of th...
Chapter
Full-text available
In this chapter, we briefly introduce personas and, more specifically, data-driven personas. We compare data-driven personas with the concept of analytics. Then, we present a short history of data-driven personas. This concise history is followed by a brief discussion of user segmentation, which is the conceptual foundation of personas. We end the...
Chapter
Full-text available
In this chapter, we review the literature on data-driven persona development. We discuss popular data-driven persona development algorithms to show the diversity of algorithmic approaches. We then summarize the primary challenges of data-driven persona development, pointing to the road ahead with general data-driven persona creation methods. We the...
Chapter
Full-text available
In this chapter, we briefly introduce the critically needed task of data-driven persona evaluation. Although we specifically focus on data-driven, nearly all of the evaluation content applies to other types of personas as well. We highlight the need for evaluation in both persona research and practice, and we introduce techniques for persona evalua...
