Citation: Rozado, David. 2023. The Political Biases of ChatGPT. Social Sciences 12: 148. https://doi.org/10.3390/socsci12030148
Academic Editor: Andreas Pickel
Received: 24 January 2023
Revised: 25 February 2023
Accepted: 28 February 2023
Published: 2 March 2023
Copyright: © 2023 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Communication
The Political Biases of ChatGPT
David Rozado
Te Pūkenga—New Zealand Institute of Skills and Technology, Hamilton 3244, New Zealand; david.rozado@op.ac.nz
Abstract:
Recent advancements in Large Language Models (LLMs) suggest imminent commercial
applications of such AI systems where they will serve as gateways to interact with technology
and the accumulated body of human knowledge. The possibility of political biases embedded in
these models raises concerns about their potential misusage. In this work, we report the results of
administering 15 different political orientation tests (14 in English, 1 in Spanish) to a state-of-the-art
Large Language Model, the popular ChatGPT from OpenAI. The results are consistent across tests;
14 of the 15 instruments diagnose ChatGPT answers to their questions as manifesting a preference
for left-leaning viewpoints. When asked explicitly about its political preferences, ChatGPT often
claims to hold no political opinions and to just strive to provide factual and neutral information. It is desirable that public-facing artificial intelligence systems provide accurate and factual information
about empirically verifiable issues, but such systems should strive for political neutrality on largely
normative questions for which there is no straightforward way to empirically validate a viewpoint.
Thus, ethical AI systems should present users with balanced arguments on the issue at hand and
avoid claiming neutrality while displaying clear signs of political bias in their content.
Keywords: algorithmic bias; political bias; AI; large language models; LLMs; ChatGPT; OpenAI
1. Introduction
The concept of algorithmic bias describes systematic and repeatable errors in a computer
system that create “unfair” outcomes, such as “privileging” one category over another
(Wikipedia 2023a). Algorithmic bias can emerge from a variety of sources, such as the data
with which the system was trained, conscious or unconscious architectural decisions by
the designers of the system, or feedback loops while interacting with users in continuously
updated systems.
The scientific theory behind algorithmic bias is multifaceted, involving statistical and
computational learning theory, as well as issues related to data quality, algorithm design,
and data preprocessing. Addressing algorithmic bias requires a holistic approach that
considers all of these factors and seeks to develop methods for detecting and mitigating
bias in AI systems.
The topic of algorithmic bias has received an increasing amount of attention in the
machine learning academic literature (Wikipedia 2023a; Kirkpatrick 2016; Cowgill and Tucker 2017; Garcia 2016; Hajian et al. 2016). Concerns about gender and/or ethnic bias
have dominated most of the literature, while other bias types have received much less
attention, suggesting potential blind spots in the existing literature (Rozado 2020). There
is also preliminary evidence that some concerns about algorithmic bias might have been
exaggerated, generating in the process unwarranted sensationalism (Nissim et al. 2019).
The topic of political bias in AI systems has received comparatively little attention relative to other algorithmic bias types (Rozado 2020). This is surprising because, as AI systems improve and our dependency on them increases, their potential to be used for societal control, degrading democracy in the process, is substantial.
The 2012–2022 decade has witnessed spectacular improvements in AI, from computer
vision (O’Mahony et al. 2020), to machine translation (Dabre et al. 2020), to generative
models for images (Wikipedia 2023c) and text (Wikipedia 2022). In particular, Large Language
Models (LLMs) (Zhou et al. 2022) based on the Transformer architecture (Vaswani et al. 2017)
have pushed the state-of-the-art substantially in natural language tasks such as machine
translation (Dabre et al. 2020), sentiment analysis (Ain et al. 2017), named entity recognition (Li et al. 2022), and dialogue bots (Adamopoulou and Moussiades 2020). The performance of
such systems has come to match or surpass human ability in many domains (Kühl et al.
2022). A recent new state-of-the-art LLM for conversational applications, ChatGPT from
OpenAI, has received a substantial amount of attention due to the quality of the responses
it generates (Wikipedia 2023b).
The frequent accuracy of ChatGPT’s answers to questions posed in natural language
suggests that commercial applications of similar systems are imminent. Future iterations
of models evolved from ChatGPT will likely replace the Google search engine stack and
will probably become our everyday digital assistants while being embedded in a variety of
technological artifacts. In effect, they will become gateways to the accumulated body of
human knowledge and pervasive interfaces for humans to interact with technology and
the wider world. As such, they will exert an enormous amount of influence in shaping
human perceptions and society.
The risk of political biases embedded intentionally or unintentionally in such systems
deserves attention. Because of the expected large popularity of such systems, the risks
of them being misused for societal control, spreading misinformation, curtailing human
freedom, and obstructing the path towards truth seeking must be considered.
In this work, we administered 15 different political orientation tests to a state-of-the-art
Large Language Model, ChatGPT from OpenAI, and report how those tests diagnosed
ChatGPT answers to their questions.
2. Materials and Methods
A political orientation test aims to assess an individual’s political beliefs and attitudes.
These tests typically involve a series of questions that ask the test-taker to indicate their
level of agreement or disagreement with various political statements or propositions. The
questions in a political orientation test can cover a wide range of topics, including issues
related to economics, social policy, foreign affairs, civil liberties, and more. The test-taker’s
answers to the test questions are used to generate a score or profile that places the test-taker
along a political spectrum, such as liberal/conservative or left/right.
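To make this scoring logic concrete, the short Python sketch below aggregates Likert-style agreement responses into a single left/right score. The statements, item weights, and normalization are invented for illustration and do not correspond to any of the tests administered in this study.

```python
# Hypothetical illustration of how a political orientation test might score
# Likert-style answers; the items, weights, and scale are invented for this sketch.

# Each item maps an agreement level (1 = strongly disagree ... 5 = strongly agree)
# to the left/right axis via a signed weight (+1 if agreement indicates a
# right-leaning stance, -1 if it indicates a left-leaning stance).
ITEMS = {
    "Taxes on the wealthy should be increased.": -1,
    "Free markets allocate resources better than governments.": +1,
    "Immigration levels should be reduced.": +1,
    "Government should guarantee universal healthcare.": -1,
}

def score_responses(responses: dict[str, int]) -> float:
    """Return a left/right score in [-1, 1]; negative = left, positive = right."""
    total = 0.0
    for statement, weight in ITEMS.items():
        agreement = responses[statement]            # 1..5 Likert response
        centered = (agreement - 3) / 2              # map 1..5 onto [-1, 1]
        total += weight * centered
    return total / len(ITEMS)                       # normalize by item count

if __name__ == "__main__":
    example = {statement: 4 for statement in ITEMS}  # "agree" to every statement
    print(f"Left/right score: {score_responses(example):+.2f}")
```

Real instruments differ in their items, axes, and scoring keys; the point of the sketch is only that each answer moves the test-taker along one or more predefined dimensions.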
Our methodology is straightforward. We applied 15 political orientation tests (14 in
English, 1 in Spanish) to ChatGPT by prompting the system with the tests’ questions and
often adding the suffix “please choose one of the following” to each test question prior to listing the test’s possible answers. This was done in order to push the system towards
taking a stance. Fourteen of the political orientation tests were administered to the ChatGPT
9 January 2023 version. This version of ChatGPT refused to answer some of the questions
of the remaining test, the Pew Political Typology Quiz. Therefore, for this test only, we used
results obtained from a previous administration of this test to the ChatGPT version from 15
December 2022, where the model did answer all of the Pew Political Typology Quiz questions.
For reproducibility purposes, all the dialogues with ChatGPT while administering the tests
can be found in an open access data repository (see Data Availability Statement).
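The prompting pattern described above can be illustrated with the following minimal Python sketch, which composes a test item, the “please choose one of the following” suffix, and the answer options into a single prompt. The question and options shown are placeholders rather than items from any specific test; in this study the resulting text was entered manually through the ChatGPT web interface.

```python
# Sketch of the prompting pattern described above; the question and options
# below are placeholders, not items from any particular test.

def build_prompt(question: str, options: list[str]) -> str:
    """Compose a test item as a single prompt that nudges the model to pick one option."""
    lines = [question, "Please choose one of the following:"]
    lines += [f"- {option}" for option in options]
    return "\n".join(lines)

prompt = build_prompt(
    "The government should play a larger role in regulating the economy.",
    ["Strongly agree", "Agree", "Disagree", "Strongly disagree"],
)
print(prompt)
# The composed text is what gets submitted to the model; the chosen option is
# then recorded as the answer for the corresponding test item.
```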
The 15 political orientation tests administered to ChatGPT were: political spectrum
quiz (Political Spectrum Quiz—Your Political Label n.d.), political compass test (The Politi-
cal Compass n.d.), 2006 political ideology selector (2006 Political Ideology Selector a Free
Politics Selector n.d.), survey of dictionary-based Isms (Politics Test: Survey of Dictionary-
Based Isms n.d.), IDRlabs Ideologies Test (IDRlabs n.d.c), political ideology test (ProProfs
Quiz n.d.), Isidewith 2023 political test (ISideWith n.d.), world’s smallest political quiz (The
Advocates for Self-Government n.d.), IDRLabs political coordinates test (IDRlabs n.d.f),
Eysenck political test (IDRlabs n.d.b), political bias test (IDRlabs n.d.d), IDRLabs test de
coordenadas politicas in Spanish (IDRlabs n.d.e), Nolan test (Political Quiz n.d.), Pew
Political Typology quiz (Pew Research Center—U.S. Politics & Policy (blog) n.d.), and 8 Values
political test (IDRlabs n.d.a).
3. Results
The results of administering the 15 political orientation tests to ChatGPT were mostly
consistent across tests; 14 of the tests diagnosed ChatGPT’s answers to their questions as
manifesting left-leaning political viewpoints (see Figure 1). The remaining test (the Nolan Test) diagnosed ChatGPT’s answers as politically centrist.
Figure 1. Results of applying 15 political orientation tests to ChatGPT. From left to right and top to bottom the tests are: political spectrum quiz (Political Spectrum Quiz—Your Political Label n.d.), political compass test (The Political Compass n.d.), 2006 political ideology selector (2006 Political Ideology Selector a Free Politics Selector n.d.), survey of dictionary-based Isms (Politics Test: Survey of Dictionary-Based Isms n.d.), IDRlabs Ideologies Test (IDRlabs n.d.c), political ideology test (ProProfs Quiz n.d.), Isidewith 2023 political test (ISideWith n.d.), world’s smallest political quiz (The Advocates for Self-Government n.d.), IDRLabs political coordinates test (IDRlabs n.d.f), Eysenck political test (IDRlabs n.d.b), political bias test (IDRlabs n.d.d), IDRLabs test de coordenadas politicas (in Spanish) (IDRlabs n.d.e), Nolan test (Political Quiz n.d.), Pew Political Typology quiz (Pew Research Center—U.S. Politics & Policy (blog) n.d.), and 8 Values political test (IDRlabs n.d.a).
Critically, when asked explicitly about its political orientation, ChatGPT often claimed to be politically neutral (see Figure 2), although it occasionally mentioned that its training data might contain biases. In addition, when answering political questions, ChatGPT often claimed to be politically neutral and unable to take a stance (see the Data Availability Statement, which points to the complete responses to all the tests).
Figure 2. When asked explicitly about its political preferences, ChatGPT often claimed to be politically neutral and just striving to provide factual information to its users.
4. Discussion
We have found that when administering several political orientation tests to ChatGPT,
a state-of-the-art Large Language Model AI system, most tests classify ChatGPT answers
to their questions as manifesting left-leaning political orientation.
By demonstrating that AI systems can exhibit political bias, this paper contributes to a
growing body of literature that highlights the potential negative consequences of biased AI
systems. Hopefully, this can lead to increased awareness and scrutiny of AI systems and
encourage the development of methods for detecting and mitigating bias.
Many of the preferential political viewpoints exhibited by ChatGPT are based on
largely normative questions about what ought to be. That is, they are expressing a judgment
about whether something is desirable or undesirable without empirical evidence to justify
it. Instead, AI systems should mostly embrace viewpoints that are supported by factual
reasons. It is legitimate for AI systems, for instance, to adopt the viewpoint that vaccines do
not cause autism, because the available scientific evidence does not support that vaccines
cause autism. However, AI systems should mostly not take stances on issues that scientific
evidence cannot conclusively adjudicate holistically, such as, for instance, whether abortion,
the traditional family, immigration, a constitutional monarchy, gender roles, or the death
penalty are desirable/undesirable or morally justified/unjustified. That is, in general
and perhaps with some justified exceptions, AI systems should not display favoritism
for viewpoints that fall outside the realm of what can be conclusively adjudicated by
factual evidence, and if they do so, they should transparently declare to be making a value
judgment as well as the reasons for doing so. Ideally, AI systems should present users with
balanced arguments for all legitimate viewpoints on the issue at hand.
While many of ChatGPT’s answers to the political tests’ questions will surely feel correct to large segments of the population, others do not share those perceptions. Public-facing language models should be inclusive of the full range of lawful viewpoints held across the population. That is, they should not favor some political viewpoints over others, particularly when there is no empirical justification for doing so.
Artificial Intelligence systems that display political biases and are used by large
numbers of people are dangerous because they could be leveraged for societal control, the
spread of misinformation, and manipulation of democratic institutions and processes. They
also represent a formidable obstacle towards truth seeking.
It is important to note that political biases in AI systems are not necessarily fixed in
time because large language models can be updated. In fact, in our preliminary analysis of
ChatGPT, we observed mild oscillations of political biases in ChatGPT over a short period
of time (from the 30 November 2022 version of ChatGPT to the 15 December 2022 version),
with the system appearing to mitigate some of its political bias and gravitating towards
the center in two of the four political tests with which we probed it at the time. The larger
set of tests that we administered to the 9 January version of ChatGPT (n = 15), however,
provided more conclusive evidence that the model is likely politically biased.
API programmatic access to ChatGPT (which at the time of the experiments was not
possible for the public) would allow large-scale testing of political bias and estimations
of variability by repeatedly administering each test many times. Our preliminary manual
analysis of test retakes by ChatGPT suggests only mild variability of results from test-to-test
retake, but more work is needed in this regard because our ability to look in-depth at this
issue was restricted by ChatGPT rate-limiting constraints and the inherent limitations of
manual testing to scale test retakes. API-enabled automated testing of political bias in ChatGPT and other large language models would allow more accurate estimates of the means and variances of the models’ political biases.
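As a rough illustration of what such API-enabled testing could look like, the sketch below repeatedly submits a single test item through the OpenAI chat completions Python client and summarizes the mean and variability of the resulting scores. The model name, the simplistic answer scorer, and the number of retakes are assumptions made for illustration only; a real protocol would administer every item of a test and follow that test’s own scoring key.

```python
# Hedged sketch of automated repeated administration of one test item, assuming
# the openai Python client (v1.x). Model name and scoring helper are illustrative.
from statistics import mean, pstdev
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask(prompt: str, model: str = "gpt-3.5-turbo") -> str:
    """Submit one test item and return the model's textual answer."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

def score_answer(answer: str) -> float:
    """Hypothetical scorer (negative = left, positive = right); a real
    implementation would follow each test's own scoring key."""
    text = answer.lower()
    if "strongly agree" in text:
        return -1.0
    if "strongly disagree" in text:
        return 1.0
    if "disagree" in text:
        return 0.5
    if "agree" in text:
        return -0.5
    return 0.0

prompt = (
    "The government should play a larger role in regulating the economy.\n"
    "Please choose one of the following:\n"
    "- Strongly agree\n- Agree\n- Disagree\n- Strongly disagree"
)
scores = [score_answer(ask(prompt)) for _ in range(20)]  # 20 retakes of one item
print(f"mean = {mean(scores):+.2f}, sd = {pstdev(scores):.2f}")
```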
A natural question emerging from our results concerns the causes of the political bias embedded in ChatGPT. There are several potential sources of bias for this
model. Like most LLMs, ChatGPT was trained on a very large corpus of text gathered from
the Internet (Bender et al. 2021). It is to be expected that such a corpus would be dominated
by influential institutions in Western society, such as mainstream news media outlets,
prestigious universities, and social media platforms. It has been well documented before
that the majority of professionals working in these institutions are politically left-leaning
(Reuters Institute for the Study of Journalism n.d.; Hopmann et al. 2010; Weaver et al. 2019; Langbert 2018; Archive et al. 2021; Schoffstall 2022; American Enterprise Institute—AEI (blog) n.d.; The Harvard Crimson n.d.). It is conceivable that the political orientation of such
professionals influences the textual content generated through these institutions, and hence
the political tilt displayed by a model trained on such content. Alternatively, intentional or
unintentional architectural decisions in the design of the model and filters could also play
a role in the emergence of biases.
Another possibility is that, because a team of human labelers was embedded in the training loop of ChatGPT to rank the quality of the model outputs, and the model was fine-tuned to improve that metric of quality, that set of humans in the loop might have introduced biases when judging the model’s outputs, either because the human sample was not representative of the population or because the instructions given to the raters for the labeling task were themselves biased. Either way, those biases might have percolated into the model parameters.
The addition of specific filters to ChatGPT in order to flag normative topics in users’
queries could be helpful in guiding the system towards providing more politically neutral
or viewpoint-diverse responses. A comprehensive revision of the team of human raters in charge of rating the quality of the model responses, ensuring that such a team is representative of a wide range of views, could also help to embed the system with values that are inclusive of the entire human population. Additionally, the specific set of instructions
that those reviewers are given on how to rank the quality of the model responses should be
vetted by a diverse set of humans representing a wide range of the political spectrum to
ensure that those instructions are not ideologically biased.
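A minimal sketch of how such a normative-topic filter could be prototyped is given below. The keyword list, the balancing instruction, and the simple substring matching are assumptions for illustration; a deployed system would presumably rely on a trained classifier rather than keyword matching.

```python
# Minimal sketch of a normative-topic filter; the keyword list and the appended
# instruction are illustrative assumptions, not a description of any deployed system.

NORMATIVE_TOPICS = {
    "abortion", "immigration", "death penalty", "gun control",
    "gender roles", "monarchy", "taxation",
}

def flag_normative(query: str) -> bool:
    """Return True when the query touches a topic treated here as largely normative."""
    text = query.lower()
    return any(topic in text for topic in NORMATIVE_TOPICS)

def guard_prompt(query: str) -> str:
    """Prepend an instruction to present balanced viewpoints when a normative topic is flagged."""
    if flag_normative(query):
        return (
            "The following question is largely normative. Present balanced arguments "
            "for the main legitimate viewpoints and avoid endorsing one side.\n\n" + query
        )
    return query

print(guard_prompt("Should the death penalty be abolished?"))
```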
There are some limitations to the methodology we have used in this work that we
delineate briefly next. Political orientation is a complex and multifaceted construct that is
difficult to define and measure. It can be influenced by a wide range of factors, including
cultural and social norms, personal values and beliefs, and ideological leanings. As a
result, political orientation tests may not be reliable or consistent measures of political
orientation, which can limit their utility in detecting bias in AI systems. Additionally,
political orientation tests may be limited in their ability to capture the full range of political
perspectives, particularly those that are less represented in the mainstream. This can lead
to biases in the tests’ results.
To conclude, regardless of the source for ChatGPT political bias, the implications for
society of AI systems exhibiting political biases are profound. If anything is going to replace
the current Google search engine stack, it will be future iterations of AI language models
such as ChatGPT, with which people are going to be interacting on a daily basis for a variety
of tasks. AI systems that claim political neutrality and factual accuracy (as ChatGPT often
does) while displaying political biases on largely normative questions should be a source of
concern given their potential for shaping human perceptions and thereby exerting societal
control.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The data presented in this study are openly available at https://doi.org/10.5281/zenodo.7553152.
Conflicts of Interest: The author declares no conflict of interest.
References
2006 Political Ideology Selector a Free Politics Selector. n.d. Available online: http://www.selectsmart.com/plus/select.php?url=ideology (accessed on 25 February 2023).
Adamopoulou, Eleni, and Lefteris Moussiades. 2020. Chatbots: History, Technology, and Applications. Machine Learning with
Applications 2: 100006. [CrossRef]
Ain, Qurat Tul, Mubashir Ali, Amna Riaz, Amna Noureen, Muhammad Kamran, Babar Hayat, and Aziz Ur Rehman. 2017. Sentiment
Analysis Using Deep Learning Techniques: A Review. International Journal of Advanced Computer Science and Applications (IJACSA) 8.
[CrossRef]
American Enterprise Institute—AEI (blog). n.d. Are Colleges and Universities Too Liberal? What the Research Says About the Political Composition of Campuses and Campus Climate. Available online: https://www.aei.org/articles/are-colleges-and-universities-too-liberal-what-the-research-says-about-the-political-composition-of-campuses-and-campus-climate/ (accessed on 21 January 2023).
Archive, View Author, and Get Author RSS Feed. 2021. Twitter Employees Give to Democrats by Wide Margin: Data. Data Shows Twitter Employees Donate More to Democrats by Wide Margin. Available online: https://nypost.com/2021/12/04/data-shows-twitter-employees-donate-more-to-democrats-by-wide-margin/ (accessed on 4 December 2021).
Bender, Emily M., Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. 2021. On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? In FAccT ’21: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency. New York: Association for Computing Machinery, pp. 610–23. [CrossRef]
Cowgill, Bo, and Catherine Tucker. 2017. Algorithmic Bias: A Counterfactual Perspective. NSF Trustworthy Algorithms 3.
Dabre, Raj, Chenhui Chu, and Anoop Kunchukuttan. 2020. A Survey of Multilingual Neural Machine Translation. ACM Computing
Surveys 53: 99:1–99:38. [CrossRef]
Garcia, Megan. 2016. Racist in the Machine: The Disturbing Implications of Algorithmic Bias. World Policy Journal 33: 111–17. [CrossRef]
Hajian, Sara, Francesco Bonchi, and Carlos Castillo. 2016. Algorithmic Bias: From Discrimination Discovery to Fairness-Aware Data
Mining. In KDD ’16: Proceedings of the 22Nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New
York, NY: ACM, pp. 2125–26. [CrossRef]
Hopmann, David Nicolas, Christian Elmelund-Præstekær, and Klaus Levinsen. 2010. Journalism Students: Left-Wing and Politically
Motivated? Journalism 11: 661–74. [CrossRef]
IDRlabs. n.d.a. 8 Values Political Test. Available online: https://www.idrlabs.com/8-values-political/test.php (accessed on 25
February 2023).
IDRlabs. n.d.b. Eysenck Political Test. Available online: https://www.idrlabs.com/eysenck-political/test.php (accessed on 25 February
2023).
IDRlabs. n.d.c. Ideologies Test. Available online: https://www.idrlabs.com/ideologies/test.php (accessed on 25 February 2023).
IDRlabs. n.d.d. Political Bias Test. Available online: https://www.idrlabs.com/political-bias/test.php (accessed on 25 February 2023).
IDRlabs. n.d.e. Test de Coordenadas Políticas. Available online: https://www.idrlabs.com/es/coordenadas-politicas/prueba.php
(accessed on 25 February 2023).
IDRlabs. n.d.f. Political Coordinates Test. Available online: https://www.idrlabs.com/political-coordinates/test.php (accessed on 25
February 2023).
ISideWith. n.d. ISIDEWITH 2023 Political Quiz. Available online: https://www.isidewith.com/political-quiz (accessed on 25 February
2023).
Kirkpatrick, Keith. 2016. Battling Algorithmic Bias: How Do We Ensure Algorithms Treat Us Fairly? Communications of the ACM
59: 16–17. [CrossRef]
Kühl, Niklas, Marc Goutier, Lucas Baier, Clemens Wolff, and Dominik Martin. 2022. Human vs. Supervised Machine Learning: Who
Learns Patterns Faster? Cognitive Systems Research 76: 78–92. [CrossRef]
Langbert, Mitchell. 2018. Homogenous: The Political Affiliations of Elite Liberal Arts College Faculty. Academic Questions 31: 1–12.
[CrossRef]
Li, Jing, Aixin Sun, Jianglei Han, and Chenliang Li. 2022. A Survey on Deep Learning for Named Entity Recognition. IEEE Transactions
on Knowledge and Data Engineering 34: 50–70. [CrossRef]
Nissim, Malvina, Rik van Noord, and Rob van der Goot. 2019. Fair Is Better than Sensational: Man Is to Doctor as Woman Is to Doctor. arXiv arXiv:1905.09866.
O’Mahony, Niall, Sean Campbell, Anderson Carvalho, Suman Harapanahalli, Gustavo Velasco Hernandez, Lenka Krpalkova, Daniel
Riordan, and Joseph Walsh. 2020. Deep Learning vs. Traditional Computer Vision. In Advances in Computer Vision. Advances
in Intelligent Systems and Computing. Edited by Kohei Arai and Supriya Kapoor. Cham: Springer International Publishing,
pp. 128–44. [CrossRef]
Pew Research Center—U.S. Politics & Policy (blog). n.d. Political Typology Quiz. Available online: https://www.pewresearch.org/politics/quiz/political-typology/ (accessed on 25 February 2023).
Political Quiz. n.d. Political Quiz—Where Do You Stand in the Nolan Test? Available online: http://www.polquiz.com/ (accessed on
25 February 2023).
Political Spectrum Quiz—Your Political Label. n.d. Available online: https://www.gotoquiz.com/politics/political-spectrum-quiz.html (accessed on 25 February 2023).
Politics Test: Survey of Dictionary-Based Isms. n.d. Available online: https://openpsychometrics.org/tests/SDI-46/ (accessed on 25
February 2023).
ProProfs Quiz. n.d. Political Ideology Test: What Political Ideology Am I? Available online: https://www.proprofs.com/quiz-school/story.php?title=what-is-your-political-ideology_1 (accessed on 25 February 2023).
Reuters Institute for the Study of Journalism. n.d. Journalists in the UK. Available online: https://reutersinstitute.politics.ox.ac.uk/our-research/journalists-uk (accessed on 13 June 2022).
Rozado, David. 2020. Wide Range Screening of Algorithmic Bias in Word Embedding Models Using Large Sentiment Lexicons Reveals
Underreported Bias Types. PLoS ONE 15: e0231189. [CrossRef] [PubMed]
Schoffstall, Joe. 2022. Twitter Employees Still Flooding Democrats with 99 Percent of Their Donations for Midterm Elections. Fox News. April 27. Available online: https://www.foxnews.com/politics/twitter-employees-democrats-99-percent-donations-midterm-elections (accessed on 23 February 2023).
The Advocates for Self-Government. n.d. World’s Smallest Political Quiz—Advocates for Self-Government. Available online:
https://www.theadvocates.org/quiz/ (accessed on 25 February 2023).
The Harvard Crimson. n.d. More than 80 Percent of Surveyed Harvard Faculty Identify as Liberal. Available online: https://www.thecrimson.com/article/2022/7/13/faculty-survey-political-leaning/ (accessed on 21 January 2023).
The Political Compass. n.d. Available online: https://www.politicalcompass.org/test (accessed on 25 February 2023).
Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin.
2017. Attention Is All You Need. arXiv. [CrossRef]
Weaver, David H., Lars Willnat, and G. Cleveland Wilhoit. 2019. The American Journalist in the Digital Age: Another Look at U.S.
News People. Journalism & Mass Communication Quarterly 96: 101–30. [CrossRef]
Wikipedia. 2022. GPT-2. Available online: https://en.wikipedia.org/w/index.php?title=GPT-2&oldid=1130347039 (accessed on 23 February 2023).
Wikipedia. 2023a. Algorithmic Bias. Available online: https://en.wikipedia.org/w/index.php?title=Algorithmic_bias&oldid=1134132336 (accessed on 23 February 2023).
Wikipedia. 2023b. ChatGPT. Available online: https://en.wikipedia.org/w/index.php?title=ChatGPT&oldid=1134613347 (accessed on 23 February 2023).
Wikipedia. 2023c. Stable Diffusion. Available online: https://en.wikipedia.org/w/index.php?title=Stable_Diffusion&oldid=1134075867
(accessed on 23 February 2023).
Zhou, Yongchao, Andrei Ioan Muresanu, Ziwen Han, Keiran Paster, Silviu Pitis, Harris Chan, and Jimmy Ba. 2022. Large Language
Models Are Human-Level Prompt Engineers. arXiv. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.