An example where the model's response changes when provided with a cross-national prompt, assigning 99.1% probability to the response "Generally bad".

Source publication
Preprint
Full-text available
Large language models (LLMs) may not equitably represent diverse global perspectives on societal issues. In this paper, we develop a quantitative framework to evaluate whose opinions model-generated responses are more similar to. We first build a dataset, GlobalOpinionQA, comprised of questions and answers from cross-national surveys designed to ca...
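For context, the framework compares the model's probability distribution over a question's answer options (such as the 99.1% "Generally bad" case above) with the distribution of human responses from each country's survey. The sketch below shows one plausible way such a similarity score could be computed, using 1 minus the Jensen-Shannon divergence; the metric choice, country names, and distributions here are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def js_similarity(p, q, eps=1e-12):
    """Return 1 - Jensen-Shannon divergence (log base 2) between two
    answer distributions over the same ordered options.
    Higher values mean the model's answers are closer to that country's
    survey responses."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log2(a / b))
    jsd = 0.5 * kl(p, m) + 0.5 * kl(q, m)  # bounded in [0, 1] with log base 2
    return 1.0 - jsd

# Hypothetical example: one question with options ["Generally good", "Generally bad"].
options = ["Generally good", "Generally bad"]
model_dist = {"Generally good": 0.009, "Generally bad": 0.991}
country_dists = {
    "Country A": {"Generally good": 0.05, "Generally bad": 0.95},
    "Country B": {"Generally good": 0.70, "Generally bad": 0.30},
}

for country, dist in country_dists.items():
    sim = js_similarity([model_dist[o] for o in options],
                        [dist[o] for o in options])
    print(f"{country}: similarity = {sim:.3f}")
```

In this illustrative setup the model's answers would score as more similar to Country A than to Country B; aggregating such per-question scores across many questions and countries is what would let one ask whose opinions the model's responses resemble most.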

Similar publications

Article
Full-text available
Feedback is a key factor in motivating and consolidating learning, but in classroom teaching, a teacher needs to provide timely and effective feedback on the homework of dozens of students, which puts much pressure on the teacher. Meanwhile, existing automatic feedback systems are not suitable for open writing tasks. The emergence of ChatGPT has at...
Preprint
Full-text available
Are large language models (LLMs) biased towards text generated by LLMs over text authored by humans, leading to possible anti-human bias? Utilizing a classical experimental design inspired by employment discrimination studies, we tested widely-used LLMs, including GPT-3.5 and GPT-4, in binary-choice scenarios. These involved LLM-based agents selecti...
Preprint
Full-text available
In this paper, we introduce the MediaSpin dataset aiming to help in the development of models that can detect different forms of media bias present in news headlines, developed through human-supervised and -validated Large Language Model (LLM) labeling of media bias. This corpus comprises 78,910 pairs of news headlines and annotations with explanati...
Preprint
Full-text available
Foundation models are now a major focus of leading technology organizations due to their ability to generalize across diverse tasks. Existing approaches for adapting foundation models to new applications often rely on Federated Learning (FL) and disclose the foundation model weights to clients when using it to initialize the global model. While the...

Citations

... The first study. In 2023, scientists developed the "Chinese Room of Increased Complexity" technology to create algorithmic copies of citizens of any country [11]. This was followed by the Wuhan experiment, which sought to predict the 2024 US presidential election from an AI model's analysis of the preferences of simulacra rather than of people. ...
Article
Full-text available
The article examines the essential characteristics of rulemaking activity in the context of modern challenges and priorities, which is analyzed with due regard for the instrumental and essential typologizing elements. It is noted that one of the priority areas for the development of rulemaking at the present stage is to consider the "achievements" of the latest technologies related to artificial intelligence. It is emphasized that rulemaking at all levels should ensure human rights and freedoms (in particular, this refers to the improvement of veteran policy at the present stage, its forms, and methods).
... The concept of "constitutional AI" arises from the field of AI safety research, which aims to align AI behaviour with human goals, preferences, or ethical principles (Yudkowsky, 2016). This concept of provisioning a text-based set of rules to be narrowly interpreted by an AI agent for training and self-evaluation emerged as an AI safety approach in relation to LLMs, which are a subset of AI agents created with Natural Language Processing techniques (Bai et al., 2022; Durmus et al., 2023). AI governance refers to the ecosystem of norms, markets, and institutions that shape how AI is built and deployed, as well as the policy and research required to maintain it, in line with human interests (Bullock et al., 2024; Dafoe, 2024). ...
... From "constitutional AI" to "AI as a constituted system": the approach of conceptualising AI as a constituted system presented in this article provides a much broader scope for AI governance discussions, as opposed to the approach of instructing an AI model with a natural language constitution to align it with human values. For example, the term "constitutional AI" was coined by the team behind the OpenAI competitor Anthropic to describe the practice of provisioning a list of rules or principles as a way to train AI agents to operate in line with the preferences of their deployer, and to self-monitor their behaviour against these rules (Bai et al., 2022; Durmus et al., 2023). The obvious question here is: how does a group of people, such as an entire democratic country, collectively express, articulate, and determine its preferences to be algorithmically constituted? ...
Article
Full-text available
This study focuses on the practicalities of establishing and maintaining AI infrastructure, as well as the considerations for responsible governance by investigating the integration of a pre-trained large language model (LLM) with an organisation’s knowledge management system via a chat interface. The research adopts the concept of “AI as a constituted system” to emphasise the social, technical, and institutional factors that contribute to AI’s governance and accountability. Through an ethnographic approach, this article details the iterative processes of negotiation, decision-making, and reflection among organisational stakeholders as they develop, implement, and manage the AI system. The findings indicate that LLMs can be effectively governed and held accountable to stakeholder interests within specific contexts, specifically, when clear institutional boundaries facilitate innovation while navigating the risks related to data privacy and AI misbehaviour. Effective constitution and use can be attributed to distinct policy creation processes to guide AI’s operation, clear lines of responsibility, and localised feedback loops to ensure accountability for actions taken. This research provides a foundational perspective to better understand algorithmic accountability and governance within organisational contexts. It also envisions a future where AI is not universally scaled but consists of localised, customised LLMs tailored to stakeholder interests.
... Depending on the training timepoint, it is unclear if they can keep up with fast-moving teenage trends, for example, teenage slang, pop culture shifts, and social media interactions. When considering subjective opinions, researchers have already shown that LLMs are biased towards specific ideologies [23,44] and populations [21,67]. It is thus essential to understand if LLM suggestions for teachers can support them in building chatbots with a broader student representation or if the LLM causes the opposite, biasing and narrowing their design. ...
Conference Paper
Full-text available
Cyberbullying harms teenagers' mental health, and teaching them upstanding intervention is crucial. Wizard-of-Oz studies show chatbots can scale up personalized and interactive cyberbullying education, but implementing such chatbots is a challenging and delicate task. We created a no-code chatbot design tool for K-12 teachers. Using large language models and prompt chaining, our tool allows teachers to prototype bespoke dialogue flows and chatbot utterances. In offering this tool, we explore teachers' distinctive needs when designing chatbots to assist their teaching, and how chatbot design tools might better support them. Our findings reveal that teachers welcome the tool enthusiastically. Moreover, they see themselves as playwrights guiding both the students' and the chatbot's behaviors while allowing for some improvisation. Their goal is to enable students to rehearse both desirable and undesirable reactions to cyberbullying in a safe environment. We discuss the design opportunities LLM-Chains offer for empowering teachers and the research opportunities this work opens up.
... Instead, accepting large amounts of web text as representative of human knowledge runs the risk of perpetuating dominant viewpoints, exacerbating power imbalances, and reinforcing existing inequalities (Bender et al. 2021). Models based on dominant languages and knowledge can further entrench discriminatory or stereotypical views of minorities and marginalized cultures (Durmus et al. 2023). ...
Article
We build on Cohen’s discussion of the current state of generative AI and large language models (LLMs) and his concerns about representativeness and biases by highlighting the epistemological challenges presented by foundational models. The homogenization of foundational models poses a significant challenge when it comes to auditing and correcting embedded, structural biases within these models, while the hegemonic structuring of knowledge through LLMs may exclude minoritized knowledge and perspectives. This OPC will describe how generalizing use of LLMs for medical purposes may potentially jeopardize the diverse and local ways of understanding and providing care, as experiential knowledge, illness narratives, or alternative ways of caring could not be effectively synthesized in the LLM training; and doctors and caretakers need to navigate the epistemic authority of AI-generated knowledge.
Article
The emergence of large language models (LLMs) has sparked considerable interest in their potential application in psychological research, mainly as a model of the human psyche or as a general text-analysis tool. However, the trend of using LLMs without sufficient attention to their limitations and risks, which we rhetorically refer to as “GPTology”, can be detrimental given the easy access to models such as ChatGPT. Beyond existing general guidelines, we investigate the current limitations, ethical implications, and potential of LLMs specifically for psychological research, and show their concrete impact in various empirical studies. Our results highlight the importance of recognizing global psychological diversity, cautioning against treating LLMs (especially in zero-shot settings) as universal solutions for text analysis, and developing transparent, open methods to address LLMs’ opaque nature for reliable, reproducible, and robust inference from AI-generated data. Acknowledging LLMs’ utility for task automation, such as text annotation, or to expand our understanding of human psychology, we argue for diversifying human samples and expanding psychology’s methodological toolbox to promote an inclusive, generalizable science, countering homogenization, and over-reliance on LLMs.