Figure 5 - available via license: Creative Commons Attribution 4.0 International
Content may be subject to copyright.
Distribution of meta-evaluation scores assigned by individual meta-evaluators. Text counts are for all generators combined.
Source publication
The capabilities of recent large language models (LLMs) to generate high-quality content indistinguishable by humans from human-written texts rises many concerns regarding their misuse. Previous research has shown that LLMs can be effectively misused for generating disinformation news articles following predefined narratives. Their capabilities to...
Context in source publication
Similar publications
Modern large language models (LLMs) are optimized for human-aligned responses using Reinforcement Learning from Human Feedback (RLHF). However, existing RLHF approaches assume a universal preference model and fail to account for individual user preferences, limiting their effectiveness in personalized applications. We introduce a framework that ext...
While Reinforcement Learning from Human Feedback (RLHF) is widely used to align Large Language Models (LLMs) with human preferences, it typically assumes homogeneous preferences across users, overlooking diverse human values and minority viewpoints. Although personalized preference learning addresses this by tailoring separate preferences for indiv...
Citations
Purpose
This study aims to investigate how safe large language model (LLM)-based artificial intelligence (AI) Chatbots are for young consumers of Generation Z for use in their purchase decisions. The research findings intend to inform potential security issues with LLM-based AI chatbots putting at risk the well-being of such consumers.
Design/methodology/approach
The study adopted the JAILBREAKHUB framework to evaluate the effectiveness of LLM guardrails against negative prompts for purchase-related decisions. The guardrails of LLM-based AI chatbots such as OpenAI’s ChatGPT, Google’s Gemini, Anthropic’s Claude, X’s Grok, Meta’ Llama and Mistral were evaluated.
Findings
There are variations in the effectiveness of LLM guardrails against negative purchase-related prompts. While there is general effectiveness of existing guardrails for some LLM-based AI chatbots, they are not impervious to manipulation by users.
Research limitations/implications
The study involved a limited set of LLM-based AI chatbots and on existing guardrails.
Practical implications
The landscape of prompt engineering techniques to bypass LLM guardrails is in a constant state of evolution. Developers of LLMs need to ensure more robust and adaptive safety protocols to ensure responsible purchase decisions among Generation Z consumers. Weaknesses in age verification mechanisms are also highlighted.
Social implications
The findings highlight safety concerns with the use of LLM-based chatbots by Generation Z consumers for their purchase decisions. While the use of prompt manipulation techniques on LLMs may be uncommon among these young consumers, such acts are implicative of the young consumer having already developed a precarious attitude or opinion toward a purchase decision leading to the use of LLMs to guide such decisions. There are factors of social concern that could instigate the precarious use of LLMs.
Originality/value
The current study is among the first to evaluate the use of LLM-based AI chatbots by young consumers of Generation Z for their purchase decisions.