Better by You, Better than Me? ChatGPT-3 as Writing Assistance in Students' Essays
Željana Bašić, Ana Banovac*, Ivana Kružić, Ivan Jerković
University Department of Forensic Sciences, University of Split
Ruđera Boškovića 33, 21000 Split, Croatia
Correspondence to: Ana Banovac, ana.banovac@forenzika.unist.hr
Abstract
Aim: To compare students' essay writing performance with or without employing ChatGPT-3 as
a writing assistant tool.
Materials and methods: Eighteen students participated in the study (nine in the control group and nine in the experimental group, which used ChatGPT-3). We scored essay elements with grades (A–D) and corresponding numerical values (4–1). We compared essay scores against students' GPAs, writing time, text authenticity, and content similarity.
Results: The average grade was C in both groups: 2.39 ± 0.71 in the control group and 2.00 ± 0.73 in the experimental group. None of the predictors affected essay scores: group (P = 0.184), writing duration (P = 0.669), module (P = 0.388), and GPA (P = 0.532). Text unauthenticity was slightly higher in the experimental group (11.87% ± 13.45% vs. 9.96% ± 9.81%), but similarity among essays was generally low in the overall sample (Jaccard similarity index ranging from 0 to 0.054). In the experimental group, the AI classifier flagged more texts as potentially AI-generated.
Conclusions: This study found no evidence that using GPT as a writing tool improves essay quality, since the control group outperformed the experimental group on most parameters.
Keywords: ChatGPT, OpenAI, short-form essay, academic writing, education
Introduction
November 30, 2022, will go down in history as the date when a free version of ChatGPT-3, the AI language model created by OpenAI 1, was made available for public use. This language model's functions encompass text generation, answering questions, and completing tasks such as translation and summarization 2.
ChatGPT can be employed as assistance in the world of academia. It can improve writing skills since it is trained to deliver feedback on style, coherence, and grammar 3, extract key points, and provide citations 4. This could increase the efficiency of researchers, allowing them to concentrate on more crucial activities (e.g., analysis and interpretation). This is supported by studies showing that ChatGPT can generate abstracts 5,6, high-quality research papers 7, dissertations, and essays 3. Previous studies showed that ChatGPT could create quality essays on different topics 8–13. For example, this program, together with davinci-003, generated high-quality short-form Physics essays that would be awarded First Class, the highest grade in the UK higher education system 14. These capabilities also led to questions about the ethics of using ChatGPT in different forms of academic writing and about AI authorship 7,15–18, and raised issues in evaluating academic tasks such as students' essays 19–21. Unavoidable content plagiarism issues were discussed, and solutions for adapting essay settings and guidelines were reviewed 8,14,20,22.
However, it is still unknown how ChatGPT performs as a writing assistant in a student environment and whether it enhances students' performance. This research therefore investigated whether ChatGPT would improve students' essay grades, reduce writing time, and affect text authenticity.
Materials and methods
We invited second-year master's students from the University Department of Forensic Sciences, University of Split, Croatia, to voluntarily participate in research on essay writing as part of the course Forensic Sciences Seminar. Out of 50 students enrolled in the course, 18 applied via a web form and participated in the study. Before the experiment, we divided them into two groups according to the study module (Crime Scene Investigation; Forensic Chemistry and Molecular Biology; and Forensics and National Security) and the weighted grade point average (GPA) to ensure a similar composition of the groups. The control group (n = 9, GPA = 3.92 ± 0.46) wrote the essay traditionally, while the experimental group (n = 9, GPA = 3.92 ± 0.57) used ChatGPT (version 2.1.0) 1 as assistance.
Before the study, the students signed an informed consent form and were given a separate sheet on which to write their name and a password. This enabled anonymity while grading the essays and during further analysis of student-specific variables. We explained the essay scoring methodology 23 to both groups and provided written instructions on the essay title (The advantages and disadvantages of biometric identification in forensic sciences), essay length (800–1000 words, in Croatian), formatting, and citation style (Vancouver). We introduced the experimental group to the ChatGPT tool. All students had four hours to finish the task and could leave whenever they wanted. The control group was additionally supervised to ensure they did not use ChatGPT.
Two teachers graded the essays (ŽB, associate professor, and IJ, assistant professor). They compared their grades, and if the scoring differed, the final grade was decided by consensus. We used the essay rubric from the Schreyer Institute for Teaching Excellence, Pennsylvania State University, which includes four criteria (mechanics, style, content, and format) and grades from A to D 23. We converted the categorical grades to numbers (A = 4, B = 3, C = 2, D = 1) for further analysis. For each student, we also recorded the writing time.
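As a minimal illustration (hypothetical code, not part of the study's pipeline), the grade conversion and the final essay score, which is the average of the four rubric elements as described under the statistical analysis below, could look as follows:

```python
# Hypothetical sketch of the grade conversion described above; not the
# authors' actual code. Each essay receives four rubric grades
# (mechanics, style, content, format), which are mapped to numbers and
# averaged into the final essay score.
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1}

def final_score(rubric_grades):
    """Average the numeric values of the four rubric grades."""
    return sum(GRADE_POINTS[g] for g in rubric_grades) / len(rubric_grades)

# Example: grades B, C, C, D average to (3 + 2 + 2 + 1) / 4 = 2.0.
print(final_score(["B", "C", "C", "D"]))  # 2.0
```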
We checked the authenticity of each document using PlagScan (PlagScan GmbH, Germany, 2023) and conducted a pairwise comparison of document similarity using RStudio (ver. 1.2.5033) and the textreuse package 24, based on the Jaccard similarity index. We also ran the texts through an AI text classifier to test whether a human or an AI created them. This classifier scores a text as very unlikely, unlikely, unclear, possibly, or likely AI-generated 25. We opted for this classifier after similar programs 25–27 failed to recognize a ChatGPT-generated text in Croatian as AI-assisted.
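For illustration, the pairwise comparison can be sketched in Python as below. The Jaccard similarity index of two documents is J(A, B) = |A ∩ B| / |A ∪ B| over their sets of shingles (word n-grams). The study itself used the R textreuse package, so the word-trigram shingling and whitespace tokenization here are illustrative assumptions, not its exact configuration:

```python
# Minimal sketch of a pairwise Jaccard comparison (illustrative; the study
# used the R textreuse package with its own tokenization defaults).

def shingles(text: str, n: int = 3) -> set:
    """Return the set of word n-grams (shingles) in a text."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def jaccard(a: str, b: str, n: int = 3) -> float:
    """Jaccard index: |intersection| / |union| of the two shingle sets."""
    sa, sb = shingles(a, n), shingles(b, n)
    return len(sa & sb) / len(sa | sb) if (sa or sb) else 0.0

# Compare every pair of essays once, as in the upper triangle of
# Supplementary table 1 (placeholder strings stand in for the essays).
essays = {"Control1": "text of essay one ...",
          "Control2": "text of essay two ...",
          "GPT1": "text of essay three ..."}
names = list(essays)
for i, x in enumerate(names):
    for y in names[i + 1:]:
        print(x, y, round(jaccard(essays[x], essays[y]), 4))
```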
Statistical analysis and visualization were conducted using Excel (Microsoft Office ver. 2301) and RStudio (ver. 1.2.5033). The final essay score was calculated as the average of the four grading elements. Linear regression was used to test the effects of group, writing duration, module, and GPA on the overall essay score. The level of statistical significance was set at P ≤ 0.05.
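A minimal sketch of this regression model, assuming fabricated data and hypothetical column and module labels (the actual analysis was run in Excel and RStudio):

```python
# Illustrative sketch only: fabricated values and hypothetical column/module
# labels, not the study's data. The overall essay score is regressed on
# group, writing duration, module, and GPA.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "score":    [2.50, 2.00, 3.00, 1.75, 2.25, 2.00, 2.75, 1.50],
    "group":    ["control", "gpt"] * 4,
    "duration": [180, 150, 200, 170, 160, 190, 210, 140],  # minutes
    "module":   ["CSI", "FCMB", "FNS", "CSI", "FCMB", "FNS", "CSI", "FCMB"],
    "gpa":      [3.9, 4.1, 4.5, 3.6, 3.8, 4.0, 4.3, 3.5],
})

# Categorical predictors (group, module) are dummy-coded automatically.
model = smf.ols("score ~ group + duration + module + gpa", data=df).fit()
print(model.summary())  # per-predictor P values, as reported in Results
```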
Results
The duration of essay writing was 172.22 ± 31.59 minutes for the GPT-assisted group and 179.11 ± 31.93 minutes for the control group. Both groups, on average, obtained grade C, with a slightly higher average score in the control group (2.39 ± 0.71) than in the GPT group (2.00 ± 0.73) (Figure 1A). The mean text unauthenticity was 11.87% ± 13.45% in the GPT-assisted group and 9.96% ± 9.81% in the control group. Text similarity in the overall sample was low (Supplementary table 1), with a median Jaccard similarity index of 0.002 (range 0–0.054). The AI text classifier showed that, in the control group, two texts were possibly and one likely generated by AI, two were unlikely created by AI, and four cases were unclear. The ChatGPT group had three cases labeled possibly and five labeled likely produced by AI, while one case was labeled unclear.
Figure 1 (A and B) suggests a positive association of duration and GPA with essay scores. However, students with higher GPAs in the control group achieved higher scores than those in the GPT group. An association between essay scores and the proportion of non-authentic text (Figure 1C) was detected only in the GPT group, where students with more non-authentic text achieved lower essay scores.
The linear regression model showed a moderate positive relationship between the four predictors
and the overall essay score (R = 0.573; P = 0.237). However, none of the predictors had a
significant effect on the outcome: group (P = 0.184), writing duration (P = 0.669), module (P =
0.388), and GPA (P = 0.532).
Figure 1. Essay scores by A) group, B) duration, C) average grades, D) proportion of non-
authentic text.
Supplementary table 1. Pairwise comparison of essay similarity (Jaccard similarity index; the lower triangle is not shown).

           Control1 Control2 Control3 Control4 Control5 Control6 Control7 Control8 Control9 GPT1    GPT3    GPT4    GPT9
Control1   NA       0.0013   0.0000   0.0537   0.0014   0.0018   0.0041   0.0023   0.0044   0.0031  0.0005  0.0000  0.0110
Control2   NA       NA       0.0015   0.0031   0.0030   0.0024   0.0031   0.0024   0.0010   0.0023  0.0000  0.0010  0.0009
Control3   NA       NA       NA       0.0009   0.0011   0.0015   0.0021   0.0015   0.0000   0.0015  0.0000  0.0010  0.0000
Control4   NA       NA       NA       NA       0.0051   0.0054   0.0099   0.0100   0.0014   0.0047  0.0005  0.0022  0.0117
Control5   NA       NA       NA       NA       NA       0.0046   0.0054   0.0041   0.0033   0.0038  0.0016  0.0051  0.0051
Control6   NA       NA       NA       NA       NA       NA       0.0117   0.0095   0.0021   0.0098  0.0020  0.0015  0.0074
Control7   NA       NA       NA       NA       NA       NA       NA       0.0118   0.0057   0.0128  0.0000  0.0013  0.0104
Control8   NA       NA       NA       NA       NA       NA       NA       NA       0.0021   0.0085  0.0010  0.0010  0.0074
Control9   NA       NA       NA       NA       NA       NA       NA       NA       NA       0.0020  0.0000  0.0000  0.0042
GPT1       NA       NA       NA       NA       NA       NA       NA       NA       NA       NA      0.0019  0.0037  0.0112
GPT2       NA       NA       NA       NA       NA       NA       NA       NA       NA       NA      0.0022  0.0054  0.0043
GPT3       NA       NA       NA       NA       NA       NA       NA       NA       NA       NA      NA      0.0030  0.0035
GPT4       NA       NA       NA       NA       NA       NA       NA       NA       NA       NA      NA      NA      0.0015
GPT5       NA       NA       NA       NA       NA       NA       NA       NA       NA       NA      NA      NA      0.0030
GPT6       NA       NA       NA       NA       NA       NA       NA       NA       NA       NA      NA      NA      0.0019
GPT7       NA       NA       NA       NA       NA       NA       NA       NA       NA       NA      NA      NA      0.0085
GPT8       NA       NA       NA       NA       NA       NA       NA       NA       NA       NA      NA      NA      0.0062
GPT9       NA       NA       NA       NA       NA       NA       NA       NA       NA       NA      NA      NA      NA
Discussion
To the best of our knowledge, this is the first study to test ChatGPT-3 as an essay-writing assistance tool in a student population. It showed that the aid of ChatGPT did not necessarily improve the quality of students' essays. The ChatGPT group did not perform better on any of the indicators: the students did not deliver higher-quality content, did not write faster, and did not produce a higher proportion of authentic text.
The overall essay score was slightly better in the control group, which could result from over-reliance on the tool or students' unfamiliarity with it. This is in line with Fyfe's study on writing students' essays using GPT-2, where students reported that it was harder to write with the tool than by themselves. Students also raised the issue of not knowing the sources of the generated text, which additionally distracted them from the writing task 28. Some studies did show more promising results 8–14, but unlike our study, they were mainly based on interaction between GPT and experienced researchers. This could be a reason for the lower performance of our GPT group, as experienced researchers are more skilled in formulating questions, guiding the program to obtain better-quality information, and critically evaluating the content.
Another interesting finding is that the use of ChatGPT did not accelerate essay writing: the students of both groups required a similar amount of time to complete the task. As expected, longer writing time was related to better essay scores in both groups. This finding could also be explained by students' feedback from Fyfe's study, where they specifically reported difficulties combining the generated text with their own style 28. So, although ChatGPT could accelerate writing in the first phase, more time is then required to finalize the task and assemble the content.
Our experimental group had slightly more problems with plagiarism than the control group. Fyfe also showed that his students felt uncomfortable writing and submitting the task, since they felt they were cheating and plagiarizing 28. However, a pairwise comparison of essays in our study did not reveal remarkable similarities, indicating that students had different reasoning and style regardless of whether they were using ChatGPT. This could also imply that applying the tool for writing assistance produces different outcomes for the same task, depending on the user's input 14.
The available ChatGPT text detector 25 did not perform well, giving false-positive results in the control group. Most classifiers are intended for English and usually carry disclaimers about their performance in other languages. This highlights the need to improve existing algorithms for different languages or to develop language-specific ones.
The main concern about using GPT in academic writing has been unauthenticity 8,14,22, but we believe that such tools will not increase the non-originality of published content or students' assignments. Detectors of AI-generated text are developing daily, and it is only a matter of time before highly reliable tools are available. Considering the prospects of detection tools and our finding that students with GPT assistance did not outperform the control group, we see no reason for major concern about its application in academic writing.
The main drawback of this study is the limited sample size, which does not permit generalization of the findings or a more comprehensive statistical approach. Another limitation is language specificity (our students wrote in Croatian for their convenience), which prevented the full application of AI detection tools. We should also consider that ChatGPT is predominantly trained on English content, so we cannot exclude the possibility that writing in English would have generated higher-quality information. Lastly, this was our students' first interaction with ChatGPT, so their lack of experience may also have affected their performance. Future studies should therefore expand the sample size, the number and conditions of experiments, include students of different profiles, and extend the number of variables that could relate to writing skills in general.
It seems that the concern of academia and the media about this tool might be unjustified: in our example, ChatGPT performed similarly to any web-based search, where the more you know, the more you will find. In some ways, instead of providing structure and facilitating writing, it could distract students and make them underperform.
References
1. OpenAI. Optimizing Language Models for Dialogue. https://openai.com/blog/chatgpt/
(2022).
2. Agomuoh, F. ChatGPT: how to use the viral AI chatbot that took the world by storm.
Digital Trends https://www.digitaltrends.com/computing/how-to-use-openai-chatgpt-text-
generation-chatbot/ (2023).
3. Aljanabi, M., Ghazi, M., Ali, A. H. & Abed, S. A. ChatGpt: Open Possibilities. Iraqi J. Comput. Sci. Math. 4, 62–64 (2023).
4. Aydın, Ö. & Karaarslan, E. OpenAI ChatGPT generated literature review: Digital twin in healthcare. Emerg. Comput. Technol. 2, 22–31 (2022).
5. Gao, C. A. et al. Comparing scientific abstracts generated by ChatGPT to original
abstracts using an artificial intelligence output detector, plagiarism detector, and blinded
human reviewers. bioRxiv (2022) doi:https://doi.org/10.1101/2022.12.23.521610.
6. Ma, Y., Liu, J. & Yi, F. Is This Abstract Generated by AI? A Research for the Gap
between AI-generated Scientific Text and Human-written Scientific Text. arXiv (2023)
doi:https://doi.org/10.48550/arXiv.2301.10416.
7. Kung, T. H. et al. Performance of ChatGPT on USMLE: Potential for AI-Assisted
Medical Education Using Large Language Models. medRxiv (2022)
doi:https://doi.org/10.1101/2022.12.19.22283643.
8. Susnjak, T. ChatGPT: The End of Online Exam Integrity? arXiv (2022)
doi:https://doi.org/10.48550/arXiv.2212.09292.
9. Hoang, G. Academic writing and AI: Day-5 experiment with cultural additivity.
https://osf.io/u3cjx/download (2023).
10. Nguyen, Q. & La, V. Academic writing and AI: Day-4 experiment with mindsponge
theory. OSF Prepr. awysc, Cent. Open Sci. (2023) doi:10.31219/osf.io/awysc.
11. Hoang, G., Nguyen, M. & Le, T. Academic writing and AI: Day-3 experiment with
environmental semi- conducting principle. https://osf.io/2qbea/download (2023).
12. Nguyen, M. & Le, T. Academic writing and AI: Day-2 experiment with Bayesian
Mindsponge Framework. https://osf.io/kr29c/download (2023).
13. Nguyen, M. & Le, T. Academic writing and AI: Day-1 experiment.
https://osf.io/kr29c/download (2023).
14. Yeadon, W., Inyang, O.-O., Mizouri, A., Peach, A. & Testrow, C. The Death of the Short-
Form Physics Essay in the Coming AI Revolution. arXiv (2022)
doi:https://doi.org/10.48550/arXiv.2212.11661.
15. Xiao, Y. Decoding Authorship: Is There Really no Place for an Algorithmic Author Under Copyright Law? IIC-International Rev. Intellect. Prop. Compet. Law 54, 5–25 (2023).
16. Bishop, L. A Computer Wrote this Paper: What ChatGPT Means for Education, Research, and Writing. Res. Writ. (January 26, 2023). doi:https://dx.doi.org/10.2139/ssrn.4338981.
17. Pourhoseingholi, M. A., Hatamnejad, M. R. & Solhpour, A. Does chatGPT (or any other
artificial intelligence language tools) deserve to be included in authorship list? chatGPT
and authorship. Gastroenterol. Hepatol. from Bed to Bench 16, (2023).
18. Grimaldi, G. & Ehrler, B. AI et al.: Machines Are About to Change Scientific Publishing Forever. ACS Energy Lett. 878–880 (2023) doi:10.1021/acsenergylett.2c02828.
19. Whitford, E. Here's How Forbes Got The ChatGPT AI To Write 2 College Essays In 20 Minutes. Forbes https://www.forbes.com/sites/emmawhitford/2022/12/09/heres-how-forbes-got-the-chatgpt-ai-to-write-2-college-essays-in-20-minutes/?sh=7be402d956ad (2022).
20. Stokel-Walker, C. AI bot ChatGPT writes smart essays – should professors worry? Nature (2022) doi:https://doi.org/10.1038/d41586-022-04397-7.
21. Hern, A. AI bot ChatGPT stuns academics with essay-writing skills and usability. The Guardian https://www.theguardian.com/technology/2022/dec/04/ai-bot-chatgpt-stuns-academics-with-essay-writing-skills-and-usability (2022).
22. Cotton, D. R. E., Cotton, P. A. & Shipway, J. R. Chatting and Cheating: Ensuring
academic integrity in the era of ChatGPT. EdArXiv (2023)
doi:https://doi.org/10.35542/osf.io/mrz8h.
23. Schreyer Institute for Teaching Excellence. Writing Rubric Example.
http://www.schreyerinstitute.psu.edu/pdf/suanne_general_resource_WritingRubric.pdf.
24. Mullen, L. Package textreuse. https://mran.revolutionanalytics.com/snapshot/2016-03-
22/web/packages/textreuse/textreuse.pdf (2015).
25. OpenAI. AI Text Classifier. https://platform.openai.com/ai-text-classifier.
26. Draft & Goal. ChatGPT - GPT3 Content Detector. https://detector.dng.ai/.
27. Debut, L., Kim, J. W. & Wu, J. RoBERTa-based GPT-2 Output Detector from OpenAI.
https://openai-openai-detector.hf.space/.
28. Fyfe, P. How to cheat on your final paper: Assigning AI for student writing. AI Soc. 1–11 (2022) doi:https://doi.org/10.17613/0h18-5p41.