Running head: THEMATIC ANALYSIS IN CHATGPT 1
Thematic Analysis of Interview Data with ChatGPT: Designing and Testing a Reliable
Research Protocol for Qualitative Research
Goyanes, Manuel1, Lopezosa, Carlos2, & Jordá, Beatriz1
1Carlos III University of Madrid
2Pompeu Fabra University
Abstract
In recent years, artificial intelligence has developed into a powerful tool for processing and
generating human-like texts, unlocking innovative possibilities for quantitative and
qualitative data analysis. In the case of qualitative research, artificial intelligence in general,
and ChatGPT in particular, represent promising avenues to explore and examine textual
transcriptions from interview data. This study is a step in this direction, advancing a reliable
research protocol for using ChatGPT to conduct thematic analysis, which includes the
following standard steps: 1) data preparation, 2) defining the analysis process, 3) chatbot
interaction, 4) iterative process, 5) review and validation, and 6) analysis and interpretation.
Results of the analysis revealed that ChatGPT may significantly facilitate qualitative data
analysis exploration, especially during initial research stages and when dealing with extensive
transcription material. Additionally, our protocol design is able to reliably identify different
thematic patterns emerging from the text, although the granularity of the output may vary
depending on the quality of the prompt and human intelligence interpretation. Accordingly,
we conclude that despite its vast power, the ChatGPT model in its current state is unable to
substitute the contextual insights and subtle metaphorical nuances associated with human
qualitative analysis, interpretation and reflexivity.
Keywords: thematic analysis, ChatGPT, qualitative research, qualitative data analysis,
interview data, transcriptions.
Thematic Analysis of Interview Data with ChatGPT: Designing and Testing a Reliable
Research Protocol for Future Qualitative Research
In the dynamic landscape of contemporary research, the integration of new technologies has
revolutionized the methodologies employed for data analysis (e.g., Hariri, 2023; Morgan,
2023; Gil de Zúñiga et al., 2024). Among these technologies, ChatGPT, a cutting-edge
language model developed by OpenAI, stands out as a powerful tool capable of processing
and generating human-like text (Hariri, 2023; Lingard, 2023). In qualitative research,
ChatGPT, when used ethically, can assist in coding and thematic analysis by generating
initial codes or offering alternative interpretations, thus streamlining the qualitative coding
process. Its versatility and adaptability make ChatGPT a promising tool for qualitative
researchers looking for innovative ways to collect, analyze, and interpret qualitative data in a
more interactive and dynamic manner (Morgan, 2023). Due to its potential to
facilitate the research process, developing a comprehensive research protocol for qualitative
researchers is essential.
Qualitative research is of great importance because of its unique ability to delve
into the depth and nuance of human experiences, behaviors, and interactions (Taylor et al.,
2015; Lapan et al., 2011). In contrast to quantitative approaches that rely on numerical data,
qualitative research employs flexible methodologies, such as interviews, focus groups and
participant observation, that allow researchers to capture the richness and complexity of
real-world contexts (Hammersley, 2012; Lacey & Luff, 2001; Thompson, 2017). This form of
inquiry is particularly valuable in social science disciplines such as sociology, psychology,
anthropology or communication studies, where understanding the subjective dimensions of
phenomena is essential (Lapan et al., 2011). Qualitative research not only explores the
intricacies of social phenomena, but also contributes to the development of theories and sheds
light on the intricacies of culture, identity and social structures (Patton, 2014; Taylor et al.,
2015). By providing a holistic view of the human experience, qualitative research enriches
our understanding of the multiple aspects of the human condition and ultimately promotes a
more nuanced and comprehensive understanding of the world around us (Creswell & Poth,
2016; Taylor et al., 2015).
In the framework of qualitative research, thematic analysis is a widely used method
that provides a systematic and flexible approach to identifying, analyzing and reporting
patterns or themes in data (Braun & Clarke, 2006; Javadi & Zarea, 2016). Specifically, it
provides researchers with a means to uncover underlying meanings, patterns and nuances in
qualitative data, enabling a deeper understanding of participants’ experiences, perceptions
and perspectives (Vaismoradi et al., 2016). Furthermore, thematic analysis promotes
interpretative richness and allows researchers to capture the diversity and depth inherent in
qualitative data, making it an indispensable tool for obtaining meaningful results in
qualitative studies (Terry et al., 2017). With the advent of ChatGPT, researchers now have
the opportunity to explore novel avenues in understanding
and extracting meaning from diverse textual datasets (Morgan, 2023). However, the
complexities and nuances associated with the use of ChatGPT for thematic analysis require
the development of a robust research protocol to guide future researchers.
This paper aims to address the growing need for a research guide tailored to
conducting thematic analysis with ChatGPT. By addressing the unique challenges and
opportunities presented by this state-of-the-art language model, researchers can improve the
rigor and efficiency of their qualitative research efforts. This research protocol not only
provides a step-by-step method for using ChatGPT in thematic analysis, but also provides
insights into ethical considerations, potential pitfalls, and best practices related to this
innovative approach.
As we delve into the realm where artificial intelligence intersects with qualitative
research, this paper serves as a pioneering effort to bridge the gap between human and
technology-driven rigor. By fostering a deeper understanding of the capabilities and
limitations of ChatGPT in thematic analysis, researchers can harness the full potential of this
technology while adhering to standards of methodological rigor and ethical conduct.
Qualitative Research and Data Analysis
Qualitative research encompasses a wide range of data collection techniques that aim to
provide a detailed and socio-contextual interpretation of social phenomena, such as in-depth
interviews, focus groups or participant observation (Vaismoradi et al., 2016). These
methods are conducted in a flexible, open-ended manner, drawing on rich narrative material
that allows researchers to get to know the interpretations, thoughts and feelings of the people
studied in an intimate way (Denscombe, 2009; Patton, 2014). When analyzing data,
qualitative researchers develop concepts, categories and insights from the patterns in data,
which are also flexible and open-ended in character (Hammersley, 2012; Taylor et al., 2015).
Qualitative research emerged as an anti-positivist movement that rejected quantitative
methods and their reduction of social reality to variables and that moved instead towards
theorizing (Alasuutari, 2010). It is often associated with the interpretivist paradigm and has
sometimes been labeled “unscientific” by some quantitative scholars due to its lack of
generalizability (Adler, 2022; Sarma, 2015). However, qualitative methods are frequently
used in contemporary research and their importance for a more comprehensive understanding
of social and cultural phenomena is often acknowledged (Mohajan, 2018). Unlike
quantitative research, whose strength lies in its ability to generalize and replicate data in other
settings, qualitative methods have the unique ability to develop new theoretical insights.
Moreover, by examining the context of social phenomena and the meanings people attach to
them, qualitative methods reach the parts of knowledge that quantitative methods cannot.
They investigate the questions of “how” and “why” rather than “when,” “what” and “where”
(Creswell & Poth, 2016; Green & Thorogood, 2004). These methods therefore examine
phenomena as a whole and not as individual variables, which leads to a complex and holistic
picture of them (Taylor et al., 2015).
The process of qualitative data analysis follows inductive reasoning. This means that
qualitative researchers do not base their analysis on existing theories, ideas and hypotheses,
as is the case with quantitative methods, but begin by examining the empirical observations
they have collected (Madondo, 2021). They first delve into participants'
“thick descriptions,” that is, their “views, intents, circumstances, motives, meanings, and
understandings” (Younas et al., 2023, p. 1), and begin to think in increasingly abstract ways
until their vague ideas become precise theoretical concepts and propositions (Neuman, 2019).
It is these theoretical insights that make it into the final report and ensure the criterion of
transferability and suitability of the results for other contexts (Younas et al., 2023).
Qualitative data analysis has been described as one of the most challenging parts of
qualitative research (Castleberry & Nolen, 2018; O’Kane et al., 2021). Very often, qualitative
researchers are faced with large amounts of textual data to analyze, not quite knowing how to
proceed. This occurs because, unlike quantitative methods, there are no explicit rules for data
analysis in qualitative research (Graue, 2015): “There is no single, accepted way of carrying
out qualitative research” (Ormston et al., 2014, p. 2). The confusion about this process is
compounded by the fact that qualitative researchers generally do not describe their analytic
processes very clearly or systematically (Kekez, 2019; Tuckett, 2005). Analysis tends to be
murky and, as a result, some scholars have questioned its rigor, sometimes accusing
qualitative scholars of adopting an “anything goes” mentality (Braun & Clarke, 2006;
O’Kane et al., 2021; Sarma, 2015). However, when done right, the flexible nature of the
qualitative approach does not preclude it from being rigorous and systematic.
Accordingly, qualitative scholars have focused for many years on developing
guidelines and techniques to analyze and elucidate patterns in textual data in a more
standardized way. Some of these techniques are widely used, such as grounded theory
(Strauss & Corbin, 1990), narrative analysis (Polkinghorne, 1995) or thematic analysis
(Braun & Clarke, 2006). The latter has become one of the best-known and most popular
analytical strategies in recent decades, which can be traced back to the much-cited
methodological paper by Virginia Braun and Victoria Clarke. Thematic analysis is a method
for identifying, analyzing and reporting themes in data and represents “the more systematic
and explicit form” of data analysis in qualitative research (Javadi & Zarea, 2016, p. 30). It
ensures standardization and transparency while maintaining the flexibility that is so
characteristic of qualitative methods.
Braun and Clarke (2006) proposed a six-phase guideline for thematic analysis. These six stages
provide a systematic but theoretically open-ended procedure for coding and examining
descriptions through the discovery of themes (Braun & Clarke, 2006; Vaismoradi et al.,
2016). This analytic procedure goes as follows. First, researchers need to familiarize
themselves with their data by transcribing, reading and re-reading it and writing down initial
ideas. Second, they have to generate initial codes for the entire data set and select data
relevant to the codes. Next, researchers need to look for themes by finding common patterns
in codes. Themes may be described as “the subjective meaning and cultural-contextual
message of data” (Vaismoradi & Snelgrove, 2019, p. 2) and are the end product of thematic
analysis. Researchers must then review, define and name themes, and finally, conduct a final
analysis and write the report.
In addition to introducing clearer guidelines for analyzing textual data, qualitative
scholars have increasingly integrated new technologies into these processes. Computer-based
programs such as NVivo or Atlas.ti are now commonly used by researchers and serve as tools
to find, categorize, and retrieve data much faster than manual searching (Liamputtong, 2009).
These programs support and enhance the overall analysis process and allow for more
complex and in-depth analysis of data, especially in studies working with large datasets
(Castleberry & Nolen, 2018; Hwang, 2008; Liamputtong, 2009). More importantly, they are
proving useful in demonstrating more transparent and trustworthy work (O’Kane et al.,
2021).
Now, recent technological advances in the field of artificial intelligence have opened
up new possibilities for the systematization and transparency of qualitative research. This is
the case with ChatGPT, the state-of-the-art language model developed by OpenAI, which is
gradually being implemented in analytical procedures, research and even writing. Although
some concerns have been raised about its use in research and the risk of bias, plagiarism or
lack of originality (Lingard, 2023), ChatGPT, when used ethically, can, like the software programs
mentioned above, support research processes such as qualitative data analysis as a
powerful tool and further enhance their rigor. In a recent review of this AI-based tool for
thematic analysis, Morgan (2023) notes that ChatGPT is successful in finding concrete,
descriptive themes and points to its “substantial potential” (p. 9) to contribute to qualitative
analysis. Accordingly, in the following section we address the growing need for a step-by-
step protocol for using ChatGPT in thematic analysis, in the hope that it will serve as a guide
for future researchers interested in its use.
Thematic Analysis with ChatGPT
To better illustrate the research protocol, we use a dataset of 30 semi-structured, in-depth
interviews drawn from a qualitative study conducted by a student who gave her consent. This
study explored citizens’ use of social media, personality traits and consumption of political
content and its influence on political opinions. The aim was to examine citizens’
understanding of the influence they are exposed to on social media and to determine the role
that their personality traits play in these processes. Three research questions were posed in
the study:
RQ1: What are the personality traits of users who change their political opinions on
social media?
RQ2: What kind of political content influences social media users’ political opinions?
RQ3: On which topics do social media users change their political opinions?
The interview guide was structured around three topic areas. The first one examined
citizens’ use of social media as an information source for public affairs and politics. The
second part examined personality traits and the openness of participants to opinion change.
Finally, the third part inquired about citizens’ exposure to political content and discussions in
social media, political opinion expression, and the influence that such political content may
have on citizens’ political opinions. The interviews were transcribed manually by the student.
The content addressed in this study mainly revolves around the use of social media to change
citizens' opinions.
Data Preparation
First, before accessing ChatGPT, researchers must prepare the data they wish to analyze. This
could be interview transcripts, open-ended survey responses, social media comments or other
text-based datasets. Ideally, all data should be organized in the same document with the
respective headings, labels or questions/answers in the case of interviews. It is also important
to clean up the text and remove information that is irrelevant for the thematic analysis, such
as special characters or HTML tags. Preparing the data in this way will facilitate
the following steps of the protocol.
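As an illustration, the cleaning step could be scripted before any text is pasted into ChatGPT. The following Python sketch is not part of the protocol as tested; the function name `clean_transcript` and the specific cleaning rules (dropping HTML tags, decoding HTML entities, removing control characters and collapsing repeated spaces) are assumptions chosen for the example, and researchers may need different rules for their own material.

```python
import html
import re

# Hypothetical helper: names and rules are illustrative assumptions.
TAG_RE = re.compile(r"<[^>]+>")  # HTML-style tags such as <p> or <br>
CONTROL_RE = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")  # control characters
SPACE_RE = re.compile(r"[ \t]+")  # runs of spaces or tabs


def clean_transcript(raw: str) -> str:
    """Return the transcript without tags, control characters or extra spaces."""
    # Caution: this also removes bracketed notes such as <laughs>.
    text = TAG_RE.sub(" ", raw)
    # Decode entities such as &amp; and turn non-breaking spaces into spaces.
    text = html.unescape(text).replace("\xa0", " ")
    text = CONTROL_RE.sub("", text)
    # Normalize spacing line by line and drop empty lines,
    # keeping one question or answer per line.
    lines = [SPACE_RE.sub(" ", line).strip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)
```

Keeping line breaks intact preserves the question/answer layout of each interview, which the later prompts rely on.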
Preparing the Chatbot
Researchers should upload their text data to ChatGPT, following certain steps to ensure that
the Generative Artificial Intelligence (GenAI) processes it correctly.
Integration of Textual Content into ChatGPT
Due to the general length of qualitative data and the inherent word count limitation of
ChatGPT, it is advisable to load the material gradually to avoid potential errors in the GenAI
system. Studies suggest that ChatGPT is prone to problems when processing inputs that
exceed a certain number of words, even though there is no specific word limit defined
(Araújo et al., 2023; Lubiana et al., 2023). Therefore, it is advisable not to upload very long
datasets in a single input, to ensure optimal performance. To solve this technical problem,
researchers can structure the content based on a predetermined word count or upload each
interview independently. We recommend uploading the interviews independently, making
sure that their overall length stays within a range that does not affect ChatGPT's performance.
For example, you could implement an organizational structure based on individual
interviewees referred to as Interviewee 1, Interviewee 2 and so on. However, it is important
that you adhere to a set word limit of one thousand words for each interview. If an interview
exceeds this word limit, we recommend uploading the transcript in stages, splitting the
content into segments of fewer than 1,000 words until the entire upload is complete.
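The staged upload described above can be prepared in advance. The sketch below is only one possible way to do it, under stated assumptions: the function name `split_transcript` is hypothetical, words are counted by splitting on whitespace, and the default of 999 words is chosen simply to stay below the 1,000-word limit recommended in the text. Lines are kept whole where possible so that questions and answers are not cut apart.

```python
def split_transcript(text: str, max_words: int = 999) -> list[str]:
    """Split a transcript into consecutive segments of at most max_words words."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    segments: list[str] = []
    current: list[str] = []  # lines collected for the segment being built
    count = 0                # number of words in `current`
    for line in text.splitlines():
        words = line.split()
        if not words:
            continue
        # A single line longer than the limit is cut into word slices.
        pieces = [
            " ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)
        ]
        for piece in pieces:
            n = len(piece.split())
            # Start a new segment when this piece would exceed the limit.
            if current and count + n > max_words:
                segments.append("\n".join(current))
                current, count = [], 0
            current.append(piece)
            count += n
    if current:
        segments.append("\n".join(current))
    return segments
```

Each returned segment can then be pasted as a separate message, in order, until the whole interview has been uploaded.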
Once this is done, the next step is to access ChatGPT and provide the first instruction
(prompt) to contextualize the platform. To illustrate this process, we will perform a practical
demonstration with our 30 interviews. All the prompts shown here and the corresponding
results can be found in the online appendix. The first prompt could look like this:
Prompt 1. Next, we are going to offer you a set of interviews. Specifically, 30. So each
instruction will be an interview. Once we've uploaded all the interviews, we'll let you
know and start asking you for specific instructions, okay?
This instruction allows us to upload the interviews one after the other, bypassing the
word limit mentioned above. After this first prompt, ChatGPT will mark this interaction as a
new chat, which will appear in the left history window of the user interface. Users can keep
the default name assigned by ChatGPT or rename it to their preference. We advise
researchers to choose a name that reflects the nature of the ongoing project and that makes it
easier to find. The advantages of using the chat within the history feature are manifold: first,
it allows working with the entire dataset in a single conversation; second, users can leave
ChatGPT and pick up where they left off later; and third, the entire conversation can be
shared via a link so that different researchers or co-authors have access to it.
The next step is to feed ChatGPT with the 30 interviews. The purpose of the following prompt is
to upload each interview individually so that the AI can recognize the origin of each answer,
that is, whether it belongs to interviewee 1, 2, and so forth, up to 30. Uploading the dataset per
interviewee offers two advantages: first, researchers can create prompts that relate to specific
interviewees, and second, ChatGPT can recognize that it is working with a complete dataset
consisting of a specific number of interviews, with each response associated with a specific
interviewee. This helps reduce possible errors in the processing and interpretation of the data.
Prompt 2. Below we attach the following interview. Interview (number). Please do not
take any action, just indicate whether or not you have incorporated the interviews:
“paste the interview (important: The interview should start with Interview or
Participant 1, 2, 3...)”
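For larger datasets, the upload messages can be assembled beforehand and then pasted into the chat one at a time. The following sketch reuses the wording of Prompt 2; the function name `build_upload_prompts` and the choice of the label "Interview N" (rather than "Participant N") are assumptions made for the example.

```python
import re

# Wording taken from Prompt 2; {number} and {body} are filled in per interview.
PROMPT_2 = (
    "Below we attach the following interview. Interview {number}. "
    "Please do not take any action, just indicate whether or not you have "
    "incorporated the interviews:\n\n{body}"
)


def build_upload_prompts(interviews: list[str]) -> list[str]:
    """Return one upload message per interview, numbered from 1."""
    prompts = []
    for number, body in enumerate(interviews, start=1):
        label = f"Interview {number}"
        # Make sure the pasted text itself starts with its label,
        # as the prompt requires.
        if not re.match(rf"\s*{re.escape(label)}\b", body):
            body = f"{label}\n{body.strip()}"
        prompts.append(PROMPT_2.format(number=number, body=body))
    return prompts
```

Because every message carries its own label, later prompts can refer to a specific interviewee by number.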
After uploading each interview, ChatGPT will respond in the following way:
Reply from ChatGPT. I have incorporated the information from Interview 1 into the
analysis. If you have any specific questions or if there's anything else you'd like me to
do with this information, feel free to let me know!
Once all interviews are integrated (30 in our case), it is important to ensure that
ChatGPT has loaded them all. To verify this, we present the following prompt:
Prompt 3. Please confirm how many interviews we have provided you in this chat.
If the interviews have been uploaded correctly, ChatGPT will return the following
answer:
Reply from ChatGPT. You have provided 30 interviews in this chat. If you have any more
interviews or specific tasks, you'd like assistance with, feel free to let me know!
This response concludes the data loading phase. Then, we continue by writing a new
prompt to prepare ChatGPT for engaging with the interviews:
Prompt 4. Next, we will no longer give you more interviews, but rather we will make
requests for the 30 interviews. OK?
If the content has been uploaded successfully, ChatGPT will respond with something along
these lines:
Reply from ChatGPT. Certainly! Feel free to make requests or ask questions related to
the 30 interviews you've provided. I'm here to help!
Defining the Analysis Process
The second phase of the process involves the development of a series of questions or prompts
that guide ChatGPT through the thematic analysis. These prompts are used to get the GenAI
to identify themes, patterns or key ideas in the text data. To do this, we enter the chat thread
where we have uploaded our 30 interviews and ask OpenAI’s GenAI to identify common
and/or recurring themes, taking into account all the uploaded content.
Prompt 5. Identify as exhaustively as possible the most recurring themes taking into
account the content of the 30 interviews we have uploaded.
This instruction will allow ChatGPT to track the central themes of the 30 interviews.
The GenAI achieves this by identifying patterns of repetition in the interviewees’ answers based on
all the data. As a result, ChatGPT provides a set of specific themes that can vary depending
on the characteristics of the dataset. In our case, this results in twelve thematic axes around
which we can structure the thematic analysis of our 30 interviews. In addition, each theme
contains some representative statements. It is also worth noting that researchers can ask
ChatGPT to expand the themes found or specify the number of themes to display depending
on their research interest.
Once we have a general understanding of the central themes of the interviews, we can
ask ChatGPT to help us find and list terms related to these twelve identified central themes:
Prompt 6. Search for and list key phrases or terms related to the specific topics listed
in the previous prompt.
Overall, these terms represent the diversity of discussions and topics raised in the 30
interviews. In our case, we identified 4 key terms per topic, but this number may vary from
study to study. In this sense, ChatGPT combines freedom with standardization, as it
gives researchers the freedom to direct their focus on specific topics or ideas while adhering
to systematic text patterns. When we speak of “freedom,” we refer to the autonomy
of the researcher to engage with specific aspects or concepts of interest in their
dataset. Conversely, the term “standardization” serves as a guiding principle to reconcile the
researcher's will with textual coherence, rigor and structure.
Once we have a general understanding of the central themes and keywords from the
main topics of the interviews, we can ask ChatGPT to summarize each of the themes
identified in the first prompt. To do this, we need to address the twelve identified themes
individually. The reason we focus on the individual topics and not the entire dataset is that
ChatGPT will return an error on this request for the 30 interviews, stating that it cannot
process this request. As an example, we use the topic “Social Media Influence” for the
instruction, which was identified as the first topic in the first query.
Prompt 7. Summarize the main points of the topic Social Media Influence resulted in
the previous prompt and taking for this purpose the 30 interviews
This instruction reflects the main points of the topic “Social Media Influence” and
provides a summary for each of these main points. This step should be repeated in our case
for the remaining eleven themes identified.
The result shows the most important related subtopics within the thematic axis “Social
Media Influence” on the one hand and a summary for each of these topics on the other.
ChatGPT concludes with an overview of the most important elements in the interviews. Once
we have these summaries, we can delve deeper into them. To do this, we can ask ChatGPT to
help us find and list various verbatims on specific topics.
Prompt 8. Provide the best examples or quotes that illustrate the topic Social Media
Influence resulted in the previous prompt and using the 30 interviews.
To get more targeted results, we recommend that you ask for “the best examples”;
otherwise, ChatGPT may provide random and unrepresentative statements. The result is a set
of specific quotes categorized by subtopics. The collected statements can be expanded with
an iteration prompt asking ChatGPT to explain the result. Furthermore, this prompt should
be repeated for each of the topics identified by ChatGPT, in our case twelve. The result will
be a set of core statements that gives us a comprehensive overview of the
most representative verbatims for each topic and their respective subtopics according to
ChatGPT. After this second phase, we will move on to another phase focusing on deep
interactions with the GenAI.
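Since Prompts 7 and 8 have to be repeated for every theme, the per-theme instructions can be generated from templates. The sketch below copies the wording of the two prompts; the function name `build_theme_prompts` is an assumption, and any theme name other than “Social Media Influence” used with it is a placeholder to be replaced by the themes ChatGPT actually returns.

```python
# Wording taken from Prompt 7 and Prompt 8; {theme} is filled in per theme.
PROMPT_7 = (
    "Summarize the main points of the topic {theme} resulted in "
    "the previous prompt and taking for this purpose the 30 interviews"
)
PROMPT_8 = (
    "Provide the best examples or quotes that illustrate the topic {theme} "
    "resulted in the previous prompt and using the 30 interviews."
)


def build_theme_prompts(themes: list[str]) -> list[str]:
    """Return the summary prompt and the quotes prompt for each theme, in order."""
    prompts = []
    for theme in themes:
        prompts.append(PROMPT_7.format(theme=theme))
        prompts.append(PROMPT_8.format(theme=theme))
    return prompts


# Example call; the second theme name is a hypothetical placeholder.
for prompt in build_theme_prompts(["Social Media Influence", "Placeholder Theme B"]):
    print(prompt)
```

Generating the prompts this way keeps their wording identical across themes, so that differences in the output are less likely to stem from differences in phrasing.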
Chatbot Interaction
The third phase of the analysis focuses, on the one hand, on identifying convergences and
divergences between the interviewees' statements and, on the other hand, on extracting direct
quotes on the main and secondary themes that emerge from the previous phase.
To do this, we will enter the chat thread where we uploaded our 30 interviews and ask
ChatGPT to find convergences and divergences for each of the general topics identified in
phase two. In the following example, we will continue with the theme “Social Media
Influence”.
Prompt 9. Identify convergences and divergences in the topic Social Media Influence
resulted in the previous prompt and taking for this purpose the 30 interviews. Identify the
approximate percentage of such convergences and divergences. Start with an intro where you
give an overview of this result.
This prompt identifies statements related to a specific topic and assigns a percentage
based on the similarity of responses. ChatGPT accomplishes this task by analyzing the 30
interviews and finding patterns of dichotomous responses. The result allows researchers to
firstly get an overview of the convergences and divergences in respondents' statements on a
particular topic, secondly to determine the percentage of similarity between different
statements, and thirdly to identify direct, contextual quotes.
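Because the percentages returned by ChatGPT are approximate, researchers may wish to compare them with a simple tally from their own coding. The following sketch shows only the arithmetic; it assumes that the researcher has manually assigned one stance label to each interviewee for a given topic, and both the function name `stance_percentages` and the labels used in the example are hypothetical.

```python
from collections import Counter


def stance_percentages(codes: dict[int, str]) -> dict[str, float]:
    """Return the share of interviewees (in percent) for each stance label."""
    total = len(codes)
    if total == 0:
        return {}
    counts = Counter(codes.values())
    return {stance: round(100 * n / total, 1) for stance, n in counts.items()}


# Hypothetical manual coding of three interviewees for one topic.
example_codes = {1: "convergent", 2: "divergent", 3: "convergent"}
print(stance_percentages(example_codes))
```

A large gap between such a tally and the percentage reported by ChatGPT would indicate that the corresponding statements deserve closer manual reading.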
Once we know the different perspectives of the interviewees based on the 12 thematic
axes, we can explore them in more depth by searching for key terms or categories. This
categorization aims to represent the central ideas of the entire group of interviewees for each
of the 12 themes. Below, we illustrate this prompt by continuing with the theme “Social
Media Influence”:
Prompt 10. Find and list terms related to the topic Social Media Influence resulted in
the previous prompt and taking for this purpose the 30 interviews. Start with an intro where
you give an overview of this result.
This instruction allows the researchers to get an overview of each theme considering
the statements of all interviewees. The result is a series of specific categories that reveal the
central elements of the themes. This compilation not only shows the variety of terms used by
respondents on the topic of “Social Media Influence,” but also provides a general description
of the terms that can help researchers understand the meaning of the categories. A subsequent
“manual” or AI-assisted analysis can be performed.
Once this is done, we can ask ChatGPT to provide the most representative direct
quotes for each of the themes, with the aim of complementing the previous instruction.
Prompt 11. Identify the most important quotes supporting the Social Media Influence
thematic axis resulting from the first prompt. To do this, include the direct quotes associated
with each of the interviewees that are part of this thematic axis. Start with an intro where you
give an overview of this result.
The result of this instruction is a series of specific quotes collected from some of the
30 interviewees that represent the central theme of Social Media Influence. If the researchers
are looking for a higher level of thematic accuracy in the direct quotes, they can also use the
following prompt.
Prompt 12. Give me literal statements from the 30 interviewees that support the social
issue resulting from the first prompt and more specifically the result: "Social media,
especially Instagram and Twitter, is a primary source of information for the majority." There
should be one statement for each interviewee, if an interviewee does not talk about it,
represent the answer as N/A. Start with an intro where you give an overview of this result.
This instruction is based on several key elements that enable the AI to achieve an
optimal result. The first sentence is specific, instructing ChatGPT to focus not on the topic
“Social Media Influence” but on its specific subtheme “Instagram and Twitter, as the primary
source of information for the majority” (which follows from the previous prompts in phase
2). The second sentence instructs the AI to go through the 30 interviews and also mark those
interviewees who do not make a statement on the subtheme.
Now, we will move on to a subsequent stage focused on conducting iterative
conversations with the AI to achieve a deeper integration between themes, recurring words,
and direct quotations.
Iterative Process
This step consists of engaging in an iterative conversation with the chatbot to acquire specific
themes, associated keywords, and relevant excerpts from the identified thematic axes. This
phase is crucial as it is during this stage that the AI will systematize the final outcome of the
thematic analysis for each of the 12 core axes obtained in the previous phase.
Prompt 13. Perform a thematic analysis on the Social Media Influence axis identified
in the first prompt.
The result of this prompt is a detailed thematic analysis consisting of six elements
focusing on the “Social Media Influence.” First, the AI provides an overview of the topic,
highlighting the complex relationship between interviewees and social media. Second, it lists
key themes, including the importance of social media as a primary source of information,
influence on political opinions, convenience and accessibility, trust and reliability of
information. Third, key terms related to this central axis are presented. Fourth, convergences,
mainly related to the importance of social media as a source of information, and divergences,
mainly focused on trust in information, are analyzed. Fifth, ChatGPT includes direct quotes
from interviewees to illustrate specific examples of individual opinions on social media use
and its influence on opinions. Finally, the output closes with a conclusion and an overall
summary.
Another feature that adds systematization to the thematic analysis is the
use of tables. Accordingly, at this stage, it is advisable to incorporate the following prompt.
Prompt 14. Make a table where a thematic analysis is carried out on the "Social
media influence" axis following all the steps proposed in the previous prompts. The chatbot
should respond with identified topics, associated keywords, and relevant text excerpts.
The result is a table that summarizes information on specific themes, associated
keywords, and relevant excerpts, providing a structured and clear overview of the thematic
analysis conducted on the Social Media Influence topic. In this case, ChatGPT mentions
that the table incorporates steps from previous prompts, suggesting that this information is the
cumulative result of a broader process. This table offers a quick and accessible
overview of the key aspects derived from the thematic analysis so far. After applying these
two prompts to all thematic axes, the next step will be to carefully review the result of
ChatGPT’s thematic analysis.
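Because the same two prompts are repeated for every thematic axis, their wording can be generated from a template rather than retyped. The following Python sketch is only an illustration: the function name and the list of axes passed to it are assumptions, and only the prompt wording is taken from Prompts 13 and 14.

```python
# Hypothetical helper: names are illustrative; the wording follows Prompts 13 and 14.
PROMPT_13 = (
    "Perform a thematic analysis on the {axis} axis identified in the first prompt."
)
PROMPT_14 = (
    'Make a table where a thematic analysis is carried out on the "{axis}" axis '
    "following all the steps proposed in the previous prompts. The chatbot should "
    "respond with identified topics, associated keywords, and relevant text excerpts."
)


def build_iterative_prompts(axes):
    """Return (axis, prompt) pairs: Prompt 13 followed by Prompt 14 for each axis."""
    prompts = []
    for axis in axes:
        prompts.append((axis, PROMPT_13.format(axis=axis)))
        prompts.append((axis, PROMPT_14.format(axis=axis)))
    return prompts
```

Calling `build_iterative_prompts(["Social Media Influence"])` yields the two prompts for that axis, which can then be pasted into the conversation in order.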
Review and Validation
This phase is about carefully reviewing the final result from the previous phase. Specifically,
this step aims to check the accuracy and replicability of the AI results. This review goes
beyond the thematic axes discussed here and includes any form of research that utilizes the
thematic analysis methodology in ChatGPT. We assume that the same study with the same
data may offer different perspectives depending on the researcher’s point of view and
reflexivity, as it is a qualitative analysis. Accordingly, if a researcher performs the thematic
analysis instead of a machine, will the results be similar or different? The answer is that they
will almost certainly differ, but that does not mean that the results provided by the AI are wrong.
They are one of many possible views and perspectives that can be obtained from the same
data set. Therefore, this phase of thematic analysis has a twofold goal: 1) to confirm whether
the results obtained are valid and reliable, and 2) to assess whether they can serve as a
starting point for researchers who want to extend the results through their own critical
thinking.
To check the reliability of the results obtained, we propose a three-step checklist using the
most common methods to perform thematic analysis: (1) manually, (2) using CAQDAS
software (NVivo), and (3) using an AI-powered CAQDAS (in this case, Atlas.ti's AI Coding
Beta Service). It is important to note that this checklist contains a set of core questions that
can be adapted for different studies and outcomes within the thematic analysis methodology.
In this case, the checklist focuses on the thematic axis of Social Media Influence, specifically
the results of prompts 13 and 14 (see Appendix).
At the first level, critical thinking is applied through a series of questions on each
thematic axis. The control questions must be answered by the researchers. In case of a
negative result, an explanation and corrective action in the form of a new question is
required. For the thematic axis of the influence of social media, the control questions are as
follows:
Q1. Does the exploration of the 30 interviews provide a rich tapestry of perspectives, revealing
commonalities and distinctions in how people perceive and engage with these digital spaces
as sources of information?
Q2. After reviewing the 30 interviews, can we confirm that the identified themes could be:
primary information source, influence on political opinions, convenience and accessibility,
trust and reliability, and filter bubbles and echo chambers?
Q3. Considering the responses of the interviewees, are the convergences and divergences
proposed by ChatGPT correct?
Q4. Do the direct quotes provided by ChatGPT exist, and do they match the statements of
the interviewees?
Q5. Do the keywords identified by ChatGPT accurately represent the statements of the 30
interviewees in the thematic axis of social media influence?
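The check behind Q4 (whether the quotes returned by ChatGPT actually exist in the transcripts) can be partly supported by a simple script. The sketch below is an assumption on our part, not a step of the tested protocol: it presumes the transcripts are available as plain text keyed by a hypothetical interviewee identifier, and it only detects verbatim matches after normalizing case, whitespace and curly quotation marks. A quote that is not found still has to be inspected manually, since it may be a paraphrase.

```python
import re


def normalize(text):
    """Lowercase, straighten curly quotes, and collapse whitespace."""
    text = text.replace("\u2019", "'").replace("\u2018", "'")
    text = text.replace("\u201c", '"').replace("\u201d", '"')
    return re.sub(r"\s+", " ", text).strip().lower()


def verify_quotes(quotes, transcripts):
    """Map each quote to the interviewee ids whose transcript contains it verbatim.

    `transcripts` is assumed to be a dict {interviewee_id: transcript_text}.
    An empty list means the quote was not found and needs manual review.
    """
    normalized = {pid: normalize(text) for pid, text in transcripts.items()}
    report = {}
    for quote in quotes:
        needle = normalize(quote)
        report[quote] = [pid for pid, text in normalized.items() if needle in text]
    return report
```

Quotes mapped to an empty list are the ones that would trigger a negative answer to Q4 and, following the checklist, an explanation and a corrective action.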
After the critical examination of the questions for the 12 thematic axes, the qualitative
analysis tool NVivo (one of the most frequently used CAQDAS services for thematic
analysis) is used at the second control level. Specifically, two of its functions are used: (1)
word frequency query and (2) word tree.
The word frequency query lists the words that occur most frequently in the selected
material. Therefore, in this task, researchers need to confirm whether, after uploading the
30 interviews to NVivo and using the 'word frequency query' tool, they can obtain a tag cloud
that can directly or at least partially represent the 12 axes identified by ChatGPT.
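For readers without access to NVivo, the idea behind a word frequency query can be approximated in a few lines of Python. This is a rough sketch and not NVivo's implementation: the stop-word list, the minimum word length and the function name are illustrative assumptions, and no stemming or grouping of similar words is applied.

```python
import re
from collections import Counter

# Illustrative stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "on",
              "is", "it", "for", "that", "this", "with", "was", "are"}


def word_frequencies(transcripts, top_n=20, min_length=3):
    """Return the `top_n` most frequent words across a list of transcript strings."""
    counts = Counter()
    for text in transcripts:
        for word in re.findall(r"[a-z']+", text.lower()):
            if len(word) >= min_length and word not in STOP_WORDS:
                counts[word] += 1
    return counts.most_common(top_n)
```

The resulting list can be compared by eye with the axes proposed by ChatGPT, in the same spirit as the tag cloud described above.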
The word tree, in turn, is a graph that helps to identify frequent words and phrases in the
data and to show how themes and insights were identified. To illustrate, we use the word
tree with the keyword 'Social Media Influence'. This review task aims to determine whether
the direct quotes obtained from the
word tree for the chosen thematic axis match or are similar to the quotes provided by
ChatGPT. The word tree should be used for each of the thematic axes to check the
consistency of the results obtained by ChatGPT.
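A word tree is essentially a visual form of a keyword-in-context listing. The sketch below is a simplified stand-in for it rather than a reproduction of NVivo's tool: it assumes transcripts stored as a dict keyed by a hypothetical interviewee identifier and returns the words surrounding each occurrence of a phrase, so that the excerpts can be compared with the quotes provided by ChatGPT.

```python
import re


def keyword_in_context(transcripts, phrase, window=8):
    """Return (interviewee_id, excerpt) pairs around each occurrence of `phrase`.

    Each excerpt keeps up to `window` words before and after the phrase.
    Matching ignores case and punctuation attached to words.
    """
    target = phrase.lower().split()
    hits = []
    for pid, text in transcripts.items():
        words = text.split()
        lowered = [re.sub(r"[^\w']", "", w).lower() for w in words]
        for i in range(len(lowered) - len(target) + 1):
            if lowered[i:i + len(target)] == target:
                start = max(0, i - window)
                end = min(len(words), i + len(target) + window)
                hits.append((pid, " ".join(words[start:end])))
    return hits
```

A small `window` gives short, easily scannable excerpts, while a larger one returns enough context to judge whether an excerpt matches a ChatGPT quote.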
In the third and final phase of the inspection, the AI Coding Beta Tool of the
CAQDAS software Atlas.ti is used. This tool was developed to automate the coding of
interviews through its AI-driven support. In this context, two questions are asked in the
review checklist. The first question aims to determine whether, after uploading the 30
interviews to Atlas.ti and using the 'AI Coding Beta' tool, any of the automatically generated
codes match the key themes identified by ChatGPT. The second question aims to see if the
codes generated by Atlas.ti use the same or similar key phrases or terms that we obtained
through our prompts. These questions aim to assess the consistency of the results.
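The comparison between automatically generated codes and the themes identified by ChatGPT remains a judgment for the researchers, but a first pass can be made by measuring lexical overlap between labels. The following sketch is an assumption of ours and not part of either tool: it uses Jaccard similarity on word sets with an arbitrary threshold, so it will miss codes that express the same idea in different words.

```python
import re

# Illustrative list of connective words ignored when comparing labels.
IGNORED = {"and", "of", "the", "on", "in", "as", "a"}


def token_set(label):
    """Split a label into a set of lowercase content words."""
    return {w for w in re.findall(r"[a-z]+", label.lower()) if w not in IGNORED}


def match_codes(themes, codes, threshold=0.25):
    """For each theme, list the codes whose Jaccard word overlap reaches `threshold`."""
    matches = {}
    for theme in themes:
        theme_tokens = token_set(theme)
        scored = []
        for code in codes:
            code_tokens = token_set(code)
            union = theme_tokens | code_tokens
            score = len(theme_tokens & code_tokens) / len(union) if union else 0.0
            if score >= threshold:
                scored.append((code, round(score, 2)))
        matches[theme] = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return matches
```

Themes left with an empty list are candidates for a closer manual comparison before answering the two checklist questions.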
All checklist questions must be carefully analyzed by the co-authors. If, after analysis,
the researchers can confirm that ChatGPT has given a correct answer, they should also justify
why the result is correct. On the other hand, if the answer is incorrect or ambiguous, the co-
authors should review earlier stages, refine the instructions or continue the conversation to
ensure the accuracy of the results.
Data Organization
In this phase, researchers should organize the identified themes, associated keywords, and
relevant text excerpts into a structured format, such as a thematic coding framework or a
table. Below we provide a template (Table 1) for a possible three-phase verification guide
with which to review the results obtained with ChatGPT.
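As one possible way of holding the identified themes, keywords and excerpts in a structured format, the sketch below uses Python dataclasses. The class and field names are hypothetical and chosen only for illustration; any spreadsheet or CAQDAS project can serve the same purpose.

```python
from dataclasses import dataclass, field


@dataclass
class Excerpt:
    interviewee_id: str  # hypothetical identifier, e.g. "I01"
    quote: str


@dataclass
class Theme:
    name: str
    keywords: list[str] = field(default_factory=list)
    excerpts: list[Excerpt] = field(default_factory=list)


@dataclass
class ThematicAxis:
    name: str
    themes: list[Theme] = field(default_factory=list)

    def to_rows(self):
        """Flatten the axis into table rows: axis, theme, keywords, interviewee, quote."""
        rows = []
        for theme in self.themes:
            for excerpt in theme.excerpts:
                rows.append((self.name, theme.name, "; ".join(theme.keywords),
                             excerpt.interviewee_id, excerpt.quote))
        return rows
```

The rows returned by `to_rows()` can be written to a spreadsheet, giving one line per excerpt with its theme and keywords alongside.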
<Insert Table 1>
After completing this three-phase protocol for the social media influence thematic
axis, we can proceed with the following prompt.
Prompt 15. Repeat the thematic analysis for the social media influence axis as done in
previous prompts and incorporate the following improvements: do a more exhaustive job,
expand and better clarify the Additional Key Phrases identified.
The outcome of this prompt (see appendix) will serve as the foundation for the
development of the next phase.
Analysis and Interpretation
At this point in the process, researchers should collect and take ownership of the results of
prompt 15 in order to more thoroughly and reflectively examine the results obtained through
ChatGPT. Below is an example that focuses on the 'social media influence' axis.
When examining the convergences and divergences between interviewees' statements
on the thematic axis of social media influence, there is consensus on the central role of social
media as a primary source of information. The recognition of the convenience and quick
accessibility offered by platforms such as Instagram and Twitter stands out as a common
perspective among participants. However, in terms of trust in the information obtained via
these platforms, there are differing views that reveal varying levels of trust and a degree of
skepticism.
In this sense, and taking into account the thematic axis “Social media influence” with
its themes, keywords and interviewees' statements, we can confirm that in today's digital age,
social networks have become the primary source of information for many users, with
Instagram and Twitter being the most well-known platforms. “I update Instagram and get the
news. On Twitter, it has to be something specific that I look for,” claims one of the
interviewees. In this context, real-time updates and diverse content are the most important
aspects of this process, according to the users interviewed.
As far as political opinions are concerned, active opinion-forming can be observed,
even if some respondents have difficulty articulating their views. In general, the convenience
and accessibility of obtaining information quickly and effortlessly are highlighted as
advantages of the digital environment. However, some statements confirm that concerns arise
regarding trust in the information, with varying degrees of skepticism and the need to fact-
check. In addition, interviewees acknowledge the existence of filter bubbles and echo
chambers and understand that the algorithmic provision of content potentially limits access to
different perspectives. In this context, the importance of reviewing and searching for diverse
information becomes clear.
On the other hand, in this digital scenario presented by the interviewees, social media
platforms are metaphorically understood as a digital agora “where people come together,
share and discuss ideas”, as one of the interviewees acknowledges. Despite concerns about
echo chambers, respondents generally value social media because it offers them the
opportunity to learn about different worldviews on social and political issues. In addition, the
instant gratification that comes with the rapid dissemination of information on social media is
highlighted as a key factor that attracts users. However, there are also concerns about how
metrics such as likes and shares can influence perceptions of news relevance, leading to
considerations about the authenticity of assigned relevance: “The number of likes and shares
impacts how I perceive news importance, and that's a bit concerning,” said one interviewee.
Another key point of this thematic axis focuses on the challenges of media literacy and
emphasizes the crucial need to evaluate and interpret the information found on these
platforms. Finally, interviewees acknowledge that media literacy on social media is both
important and challenging, as not everyone critically evaluates what they see.
Discussion
In recent years, the emergence and development of language models as powerful tools for
processing and generating human-like text has significantly transformed the experience of
conducting both quantitative and qualitative data analysis (Hariri, 2023; Morgan, 2023;
Lubiana et al., 2023). In the case of qualitative research, artificial intelligence in general, and
ChatGPT in particular, represent promising avenues to explore text transcriptions from
interview data and facilitate the work of qualitative researchers (Morgan, 2023). However,
these promises have not yet been practically explored and empirically validated. This study is
a step in this direction by providing a reliable research protocol for the use of ChatGPT and
similar language models for qualitative data analysis and subsequent theory building steps.
In particular, the use of ChatGPT paved the way for the application of thematic
analysis following the standard procedures associated with this technique as explained by the
authors (Braun & Clarke, 2006) and practically implemented by the qualitative research
community around the globe (Javadi & Zarea, 2016; Tuckett, 2005; Vaismoradi et al.,
2016). Although ChatGPT is not the ultimate platform for qualitative data analysis (various
software packages have been launched and empirically tested with great success over the
decades, such as NVivo, Atlas.ti, etc.), its powerful language model and its ability, albeit with
relative success, to summarize, interpret, and solve various research problems make this
artificial intelligence model a potential candidate to assist researchers in many phases of
empirical investigation (Morgan, 2023). One possible implementation, as proposed in this
manuscript, is to explore interview data following the standard research protocol summarized
in this study, which the qualitative research community can subsequently test and
refine.
The tested research protocol for thematic analysis in ChatGPT was developed
following the standard procedure proposed by Braun and Clarke (2006). Although the
protocol was not directly replicated for obvious reasons, it is intended to resemble the
original proposal, particularly in terms of the approach and spirit of data preparation, data
analysis and data validation. Accordingly, this protocol attempts to capture the nuances of
thematic analysis by proposing the following steps: 1) data preparation, 2) defining the
analysis process, 3) chatbot interaction, 4) iterative process, 5) review and validation, and 6)
analysis and interpretation. This six-step research protocol provides the necessary steps to
conduct a systematic and flexible thematic analysis of the interview data.
After the iterative process of data analysis, perhaps the hardest test for data validation
is to examine the reliability of the proposed thematic patterns that emerge from the data. Our
approach in this case was to test the reliability of the coding patterns using two other
qualitative data analysis software programs, Atlas.ti and NVivo, and then assess the reliability
of the data using the checklist suggested in Table 1 and human intelligence. Any significant
deviation from the results of the two software programs and the natural conclusions
suggested by human intelligence must be explained in detail and the next best course of
action indicated. This design, which aims to evaluate the thematic patterns and verbatims
emerging from ChatGPT using two independent qualitative data analysis software programs
in addition to human intelligence, demonstrates the nature of this proposal and the transparent
orientation of the data analysis.
Once the research protocol and the potential effectiveness of ChatGPT in supporting
qualitative researchers in the exploratory stages of thematic analysis have been tested, various
advances and pitfalls to avoid can be discussed. In terms of advances offered by AI, ChatGPT
provides an easy-to-use chatbot that can facilitate the exploration of qualitative data analysis,
especially in the initial stages of the data analysis process and with large transcripts. In the
context of this advance, the proposed protocol helps qualitative researchers to provide more
transparency and systematization of emerging thematic patterns, a point that has traditionally
been criticized in qualitative research (O’Kane et al., 2021; Sarma, 2015). The chatbot and
retrieval affordances also facilitate research collaboration between co-authors, as emerging
patterns can be discussed and shared interactively as they emerge in the chatbot. All in all,
ChatGPT in its current form can be a suitable tool for preliminary exploration of data and for
assisting qualitative researchers in the primary understanding of their material while selecting
the best examples of verbatims. However, this implementation is no different from other
qualitative data analysis software programs, some of which were incorporated into this study
to validate the results of ChatGPT.
In terms of limitations and pitfalls to avoid, after testing, we must acknowledge that
ChatGPT is far from being able to replace the human skills of qualitative researchers. In
its current state of development, ChatGPT is able to provide an explicit but generally descriptive analysis of
qualitative data. Therefore, it is highly recommended that human intelligence monitors and
simultaneously analyzes the material to derive more meaningful and implicit theoretical
insights (Morgan, 2023). In our examination of the platform, the quality and
granularity of thematic patterns proved rather unsatisfactory when it comes to the subtle nuances
and contextual insights generally associated with qualitative research. To generate more
granular insights with contextual meaning, the protocol needs to specify and design the
prompt in detail, which may not make up for the time spent, as this could already be done by
human intelligence. In short, perfecting the prompt is only worthwhile if the quality of the
output, including nuance and contextual explanations, can improve the quality of human
reasoning, which has been difficult to achieve.
Finally, the ethos of qualitative research has always been linked to the creative
process and the 'magical conclusions' of human intelligence. The standardization of research
craft in qualitative research can improve the rigor and logical coherence of findings, but it can
also suppress the metaphorical thinking and writing craft of qualitative scholars. Qualitative
research is essentially based on the hunches, highs and lows of researchers immersing
themselves in the data and understanding the meaningful but sometimes subtle nuances of
context and reflexivity. Summarizing the qualitative process of data analysis may be
appropriate for studies that focus on structure and description, but extending AI-powered
qualitative analysis to all qualitative research could ultimately generate a more insipid, yet
solid science.
References
Adler, R. H. (2022). Trustworthiness in qualitative research. Journal of Human Lactation,
38(4), 598-602.
Alasuutari, P. (2010). The rise and relevance of qualitative research. International journal of
social research methodology, 13(2), 139-155.
Araújo, S., & Aguiar, M. (2023, June). Simplifying Specialized Texts with AI: A ChatGPT-
Based Learning Scenario. In International Conference in Information Technology and
Education (pp. 599-609). Singapore: Springer Nature Singapore.
Braun, V., & Clarke, V. (2006). Using thematic analysis in psychology. Qualitative research
in psychology, 3(2), 77-101.
Castleberry, A., & Nolen, A. (2018). Thematic analysis of qualitative research data: Is it as
easy as it sounds? Currents in pharmacy teaching and learning, 10(6), 807-815.
Corbin, J. M., & Strauss, A. (1990). Grounded theory research: Procedures, canons, and
evaluative criteria. Qualitative sociology, 13(1), 3-21.
Creswell, J. W., & Poth, C. N. (2016). Qualitative inquiry and research design: Choosing
among five approaches. Sage publications.
Denscombe, M. (2009). Ground rules for social research: Guidelines for good practice.
McGraw-Hill Education (UK).
Gil de Zúñiga, H., Goyanes, M., & Durotoye, T. (2023). A scholarly definition of artificial
intelligence (AI): advancing AI as a conceptual framework in communication
research. Political Communication, 41(2), 317-334.
Graebner, M. E., Martin, J. A., & Roundy, P. T. (2012). Qualitative data: Cooking without a
recipe. Strategic Organization, 10(3), 276-284.
Graue, C. (2015). Qualitative data analysis. International Journal of Sales, Retailing &
Marketing, 4(9), 5-14.
Hammersley, M. (2012). What is qualitative research? Bloomsbury Academic.
Hariri, W. (2023). Unlocking the Potential of ChatGPT: A Comprehensive Exploration of its
Applications, Advantages, Limitations, and Future Directions in Natural Language
Processing. arXiv preprint arXiv:2304.02017.
Hwang, S. (2008). Utilizing qualitative data analysis software: A review of Atlas.ti. Social
Science Computer Review, 26(4), 519-527.
Javadi, M., & Zarea, K. (2016). Understanding thematic analysis and its pitfall. Journal of
client care, 1(1), 33-39.
Kekez, A. (2019). Qualitative data analysis in implementation and street-level
bureaucracy research. Research Handbook on Street-Level Bureaucracy, 317.
Lacey, A., & Luff, D. (2001). Qualitative data analysis. Trent focus Sheffield.
Lapan, S. D., Quartaroli, M. T., & Riemer, F. J. (2011). Qualitative research: An introduction
to methods and designs (Vol. 37). John Wiley & Sons.
Liamputtong, P. (2009). Qualitative data analysis: Conceptual and practical considerations.
Health promotion journal of Australia, 20(2), 133-139.
Lingard, L. (2023). Writing with ChatGPT: An illustration of its capacity, limitations &
implications for academic writers. Perspectives on Medical Education, 12(1), 261.
Lubiana, T., Lopes, R., Medeiros, P., Silva, J. C., Goncalves, A. N. A., Maracaja-Coutinho,
V., & Nakaya, H. I. (2023). Ten quick tips for harnessing the power of ChatGPT in
computational biology. PLOS Computational Biology, 19(8).
Madondo, S. M. (2021). Data Analysis and Methods of Qualitative Research: Emerging
Research and Opportunities.
Mohajan, H. K. (2018). Qualitative research methodology in social sciences and related
subjects. Journal of economic development, environment and people, 7(1), 23-48.
Morgan, D. L. (2023). Exploring the Use of Artificial Intelligence for Qualitative Data
Analysis: The Case of ChatGPT. International Journal of Qualitative Methods, 22,
16094069231211248.
Neuman, W. L. (2019). Social research methods.
O’Kane, P., Smith, A., & Lerman, M. P. (2021). Building transparency and trustworthiness in
inductive research through computer-aided qualitative data analysis software.
Organizational Research Methods, 24(1), 104-139.
Ormston, R., Spencer, L., Barnard, M., & Snape, D. (2014). The foundations of qualitative
research. Qualitative research practice: A guide for social science students and
researchers, 2(7), 52-55.
Patton, M. Q. (2014). Qualitative research & evaluation methods: Integrating theory and
practice. Sage publications.
Polkinghorne, D. E. (1995). Narrative configuration in qualitative analysis. International
journal of qualitative studies in education, 8(1), 5-23.
Sarma, S. K. (2015). Qualitative research: Examining the misconceptions. South Asian
Journal of Management, 22(3), 176.
Taylor, S. J., Bogdan, R., & DeVault, M. (2015). Introduction to qualitative research
methods: A guidebook and resource. John Wiley & Sons.
Terry, G., Hayfield, N., Clarke, V., & Braun, V. (2017). Thematic analysis. The SAGE
handbook of qualitative research in psychology, 2, 17-37.
Thompson, K. (2017). Qualitative research rules: Using qualitative and ethnographic methods
to access the human dimensions of technology. In Evaluation of Rail Technology (pp.
75-110). CRC Press.
Thorogood, N., & Green, J. (2018). Qualitative methods for health research. Qualitative
methods for health research, 1-440.
Tuckett, A. G. (2005). Applying thematic analysis theory to practice: A researcher’s
experience. Contemporary nurse, 19(1-2), 75-87.
Vaismoradi, M., Jones, J., Turunen, H., & Snelgrove, S. (2016). Theme development in
qualitative content analysis and thematic analysis.
Younas, A., Fàbregues, S., Durante, A., Escalante, E. L., Inayat, S., & Ali, P. (2023).
Proposing the “MIRACLE” narrative framework for providing thick description in
qualitative research. International Journal of Qualitative Methods, 22,
16094069221147162.
Zhang, L., & Zhang, L. (2022). Artificial intelligence for remote sensing data analysis: A
review of challenges and opportunities. IEEE Geoscience and Remote Sensing
Magazine, 10(2), 270-294.
Tables
Table 1. Checklist proposal for the review and validation phase.
Phase 1. Verification of the results from prompts 13 and 14: Manual review

| Checklist question | Yes/No | Justification | Proposed improvement if necessary |
|---|---|---|---|
| Does the exploration of the 30 interviews provide a rich tapestry of perspectives, revealing commonalities and distinctions in how people perceive and engage with these digital spaces as sources of information? | Yes | The exploration of 30 interviews offers a valuable and comprehensive approach to understanding individuals' perspectives and interactions with digital spaces as sources of information. | N/A |
| After reviewing the 30 interviews, can we confirm that the identified themes could be: primary information source, influence on political opinions, convenience and accessibility, trust and reliability, and filter bubbles and echo chambers? | Yes | Although the themes may differ to some extent when derived manually, we can confirm that they are indeed appropriate. Yes, the identification of themes such as primary information source, influence on political opinions, convenience and accessibility, trust and reliability, and filter bubbles and echo chambers is plausible and justified after reviewing the 30 interviews. | |
| After analyzing the 30 interviews, can we confirm that the key phrases or terms could be the same as those obtained with ChatGPT? | No | Yes, after analyzing the 30 interviews, we can confirm that the key phrases or terms could be the same as those obtained with ChatGPT. However, further exploration would be recommendable, since terms such as "digital agora," "variety of perspectives," "instant gratification," and "likes and shares impact" appear to be too broad. | It is recommended to use a broader prompt or manually utilize "digital agora," "variety of perspectives," "instant gratification," and "likes and shares impact" as a foundation to achieve more specific results. |
| Considering the responses of the interviewees, are the convergences and divergences proposed by ChatGPT correct? | Yes | After analyzing the interviews, it is confirmed that the convergences and divergences identified by ChatGPT for the social media influence thematic axis are valid. | N/A |
| Do the direct quotes provided by ChatGPT exist, and do they match the statements of the interviewees? | Yes | After conducting a search for the obtained direct quotes, it is confirmed that they exist and align with the thematic axis of social media influence. | N/A |
| Do the keywords identified by ChatGPT accurately represent the statements of the 30 interviewees in the thematic axis of social media influence? | Yes | After analyzing the keywords provided by ChatGPT, we confirm that these are indeed valid and align with both the statements of the interviewees and the thematic axis of social media influence. | N/A |
Phase 2. Verification of the results from prompts 13 and 14: Semiautomatic review with NVivo

| Checklist question | Yes/No | Justification | Proposed improvement if necessary |
|---|---|---|---|
| After uploading the 30 interviews to the NVivo software and using the “word frequency query” tool, do we obtain a tag cloud in which a term is identified that directly aligns with the Social Media Influence axis identified by ChatGPT? | No | The results are fairly different. After examining both ChatGPT and NVivo, we can confirm that those provided by ChatGPT are more accurate. | It is advisable to discuss with the co-authors whether to incorporate additional queries to NVivo, and then analyze it with ChatGPT. |
| After uploading the 30 interviews to the NVivo software and using the 'word tree' tool applied to the concept of Social Media Influence, do we obtain direct quotes that align or are similar to those obtained by ChatGPT? | No | The outcomes vary significantly. After examining the outcomes of both ChatGPT and NVivo, we can conclude that those of ChatGPT are more precise. | It is recommended to analyze whether there is any element that NVivo could additionally highlight and include. In our case, there are none, which further confirms that ChatGPT’s precision is better than NVivo’s. |
Phase 3. Verification of the results from prompts 13 and 14: Automatic review with Atlas.ti

| Checklist question | Yes/No | Justification | Proposed improvement if necessary |
|---|---|---|---|
| After uploading the 30 interviews to the Atlas.ti software and using the 'AI Coding Beta' tool, do any of the automatically generated codes align with the central themes identified by ChatGPT? | Yes | Although the coding results from Atlas.ti differ from those obtained by ChatGPT, there are direct connections between them. | N/A |
| After inspecting the different codes created by AI Coding Beta, can we confirm that they use the same or similar key phrases or terms obtained with ChatGPT? | No | It differs in some aspects. Atlas.ti provides a greater number of codes. However, ChatGPT offers more efficient coding. | Although the results obtained by ChatGPT are correct, researchers could consider the possibility of including an additional code for the analyzed axis, either manually or using a new prompt. |
... Harnessing the power of natural language processing (NLP), computer vision, big data, and deep learning, coupled with its intuitive chatbot interface, ChatGPT and other generative AI have the potential to transform many areas such as climate change (Biswas, 2023a), public health (Biswas, 2023b), and especially education (Firat, 2023). Capable of understanding natural language, ChatGPT can analyze qualitative unstructured data, suggesting codes, formulating codebooks and categories, and generating themes (Goyanes et al., 2024;Morgan, 2023;Turobov et al., 2024). ...
... AI-powered tools have shown their capacities for analyzing large sets of qualitative information and generating summaries, codes, and themes (Goyanes et al., 2024;Morgan, 2023;Turobov et al., 2024). Applying ChatGPT in Template Analysis, Nguyen-Trung (2024) found that under researchers' thorough methodological guidance, supervision, and decision making, generative AI can play the role of a research assistant in data analysis. ...
Preprint
Full-text available
Artificial Intelligence (AI) tools have been used to improve the productivity of evidence review and synthesis since at least 2016, with EPPI-Reviewer and Abstrackr being two prominent examples. However, since the release of ChatGPT by OpenAI in late 2022, a large language model with an intuitive chatbot interface, the use of AI-powered tools for research-especially those that deal with text-based data-has exploded. In this working paper, we describe how we used the AI-powered tools such as ChatGPT, ChatGPT for Sheets and Docs, Casper AI, and ChatPDF to assist several stages of an evidence review. Our goal is to demonstrate how AI-powered tools can boost research productivity, identify their current weaknesses, and provide recommendations for researchers looking to utilize them.
Article
The potential use of artificial intelligence programs such as ChatGPT to analyze qualitative data raises any number of questions, most notably whether it is possible to produce similar results without the demanding process of manual coding. In addition, there are questions about both the simplicity of using ChatGPT for qualitative data analysis and the potential time savings that it might provide. This article addresses these questions by using ChatGPT to reinvestigate two qualitative datasets that were previously analyzed by more traditional methods. In particular, it examines the extent to which the responses from ChatGPT can recreate the themes that were originally chosen to summarize the two previous analyses. The results show that ChatGPT performed reasonably well, but in both cases it was less successful at locating subtle, interpretive themes, and more successful at reproducing concrete, descriptive themes. In doing so, the program was quite easy to use and required very little effort in comparison to approaches that rely on manual coding. It is important to recognize, however, that both coding and approaches based on artificial intelligence are simply tools that must be applied within a larger analytic process. Overall, this exploration suggests that artificial intelligence may well have the power to disrupt the coding of data segments as a dominant paradigm for qualitative data analysis.
Article
Thick description of qualitative findings is critical to improving the transferability of qualitative research findings as it allows researchers to assess their applicability to other contexts and settings. However, what thick description entails and how it should be carried out is often missing or insufficiently described. While expert qualitative researchers may be familiar with the concept, the wide variety of meanings and interpretations of thick description in the literature may make it difficult for novice qualitative researchers to understand this concept when reporting qualitative findings. The purpose of this paper is to propose the "MIRACLE" narrative framework for providing thick description in qualitative research. We developed this framework based on a critical review of theoretical literature about thick description and writing in qualitative research, as well as our personal experiences conducting, writing, and publishing qualitative studies. The proposed framework can be valuable for improving the reporting quality and transferability of qualitative research findings.
Article
Many scholars have called for qualitative research to demonstrate transparency and trustworthiness in the data analysis process. Yet these processes, particularly within inductive research, often remain shrouded in mystery. We suggest that computer-aided/assisted qualitative data analysis software (CAQDAS) can support qualitative researchers in their efforts to present their analysis and findings in a transparent way, thus enhancing trustworthiness. To this end, we propose, describe, and illustrate working examples of six CAQDAS building blocks, three combined CAQDAS techniques, and two coder consistency checks. We argue that these techniques give researchers the language to write about their methods and findings in a transparent manner and that their appropriate use enhances a research project’s trustworthiness. Specific CAQDAS techniques are rarely discussed across an array of inductive research processes. Thus, we see this article as the beginning of a conversation about the utility of CAQDAS to support inductive qualitative research.
Book
This book focuses on the key ideas and principles that underlie contemporary approaches to social research and identifies 12 basic ground rules for good research. In clear language it provides a user-friendly resource for people doing small-scale social research projects.
Article
Issue: We are seeing the use of qualitative research methods more regularly in health professions education as well as pharmacy education. Often, the term “thematic analysis” is used in research studies and subsequently labeled as qualitative research, but saying that one did this type of analysis does not necessarily equate with a rigorous qualitative study. This methodology review will outline how to perform rigorous thematic analyses on qualitative data to draw interpretations from the data. Methodological Literature Review: Despite not having an analysis guidebook that fits every research situation, there are general steps that you can take to make sure that your thematic analysis is systematic and thorough. A model of qualitative data analysis can be outlined in five steps: compiling, disassembling, reassembling, interpreting, and concluding. My Recommendations and Their Applications: Nine practical recommendations are provided to help researchers implement rigorous thematic analyses. Potential Impact: As researchers become comfortable in properly using qualitative research methods, the standards for publication will be elevated. By using these rigorous standards for thematic analysis and making them explicitly known in your data process, your findings will be more valuable.