Preprint

A Strategic Experiment in Identifying Machine Learning Algorithms for Automated Grading: Leveraging LLMs through Precise Prompt Crafting

Authors:
  • Zentrum für Medienpsychologie und Verhaltensforschung

Abstract

This paper presents an experimental approach that employs LLM technology, through sophisticated crafting of machine instructions, to evaluate data from machine learning algorithms and work out intersections suitable for automated grading frameworks. It can be seen as a demonstration of LLM capability in identifying cross-disciplinary data and bridging suitable datasets for further implementation in a different scientific field. The author aims to demonstrate how, from the perspective of educational science, artificial intelligence can be harnessed to create frameworks for grading automation, as an exemplary showcase of the overall usability of advanced machines in educational science.


Preprint
This paper aims to demonstrate a successful attempt at bridging highly specific information from large amounts of complex data to the tailored purposes of adjacent scientific fields through the employment of deep learning tools. The technique described consists of gradually narrowing down and precisely specifying prompts in order to approach the desired outcome. In layman's terms, we have to be able to determine if and when a machine has comprehended the specifics of our target, document the approach, and backtest its properties. Ultimately, this paper is no more than a field note from my own investigations in the realm of educational science, but it can serve as an important representation of intermediate steps, and a demonstration of how important harnessing the bridging capabilities of LLMs can eventually become. The showcased example of SLA serves exceptionally well for that purpose, hence it will be further explained below; a minimal sketch of the prompt-narrowing loop follows.
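To make the described procedure concrete, here is a minimal sketch of such an iterative prompt-narrowing loop with a comprehension check and a documented history for later backtesting. This is not the author's actual pipeline: `query_llm`, the term-based comprehension test, and the refinement rule are hypothetical placeholders standing in for whatever model interface and domain criteria are actually used.

```python
# Minimal sketch of iterative prompt narrowing (hypothetical interface).

def query_llm(prompt: str) -> str:
    """Placeholder: send `prompt` to an LLM and return its text response."""
    raise NotImplementedError("plug in your model API here")

def comprehended(response: str, required_terms: list[str]) -> bool:
    """Toy comprehension check: has the model surfaced every target concept?"""
    return all(term.lower() in response.lower() for term in required_terms)

def narrow_prompt(base_prompt: str, required_terms: list[str],
                  max_rounds: int = 5) -> tuple[str, str, list]:
    """Iteratively specify the prompt until the response covers the target,
    logging every round so the approach can be documented and backtested."""
    prompt, response = base_prompt, ""
    history = []  # (round, prompt, response) triples for backtesting
    for round_no in range(max_rounds):
        response = query_llm(prompt)
        history.append((round_no, prompt, response))
        if comprehended(response, required_terms):
            break
        # Narrow the prompt: restate the still-missing specifics explicitly.
        missing = [t for t in required_terms
                   if t.lower() not in response.lower()]
        prompt += "\nBe specific about: " + ", ".join(missing)
    return prompt, response, history
```

The loop terminates either when the comprehension criterion is met or after a fixed budget of rounds; the recorded history is what allows the properties of the approach to be backtested afterwards.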
Article
Full-text available
Massive open online courses (MOOCs) are among the latest e-learning initiatives to attain widespread popularity among many universities. In this paper, a review of the current published literature focusing on the use of MOOCs by instructors or students was conducted. Our primary goal is to summarize the accumulated state of knowledge concerning the main motivations and challenges of using MOOCs, as well as to identify issues that have yet to be fully addressed or resolved. Our findings suggest four reasons why students sign up for MOOCs: the desire to learn about a new topic or to extend current knowledge, curiosity about MOOCs, personal challenge, and the desire to collect as many completion certificates as possible. Up to 90% drop out, for reasons including a lack of incentive, failure to understand the content material with no one to turn to for help, and having other priorities to fulfill. The findings suggest three main reasons why instructors wish to teach MOOCs: a sense of intrigue, the desire to gain personal (egoistic) rewards, or a sense of altruism. Four key challenges of teaching MOOCs also surfaced: difficulty in evaluating students' work, a sense of speaking into a vacuum due to the absence of immediate student feedback, the heavy demands on time and money, and a lack of student participation in online forums. We conclude by discussing two issues that have yet to be fully resolved: the quality of MOOC education, and the assessment of student work.
Article
Full-text available
Context: The number of students enrolled in standard and online university programming courses is rapidly increasing. This calls for automated evaluation of students' assignments. Objective: We aim to develop methods and tools for objective and reliable automated grading that can also provide substantial and comprehensible feedback. Our approach targets introductory programming courses, which have a number of specific features and goals. The benefits are twofold: reducing the workload for teachers, and providing helpful feedback to students in the process of learning. Method: For sophisticated automated evaluation of students' programs, our grading framework combines the results of three approaches: (i) testing, (ii) software verification, and (iii) control flow graph similarity measurement. We present our tools for software verification and control flow graph similarity measurement, which are publicly available and open source. The tools are based on an intermediate code representation, so they can be applied to a number of programming languages. Results: Empirical evaluation of the proposed grading framework is performed on a corpus of programs written by university students in the programming language C within an introductory programming course. The results show that the synergy of the proposed approaches improves the quality and precision of automated grading, and that automatically generated grades are highly correlated with instructor-assigned grades. The results also show that our approach can be trained to adapt to a teacher's grading style. Conclusions: In this paper we integrate several techniques for the evaluation of students' assignments. The obtained results suggest that the presented tools can find real-world applications in automated grading.
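As a rough illustration of how such a synergy of signals might be combined (this is not the authors' actual tools, which operate on an intermediate code representation), the sketch below fuses three normalized scores and fits the combination weights to instructor-assigned grades, mirroring the paper's claim that the framework can be trained to a teacher's grading style. All scores and grades here are invented for illustration.

```python
import numpy as np

# Hypothetical per-submission scores in [0, 1] from the three approaches:
# (i) test-case pass rate, (ii) software verification, (iii) CFG similarity.
# Rows = student submissions, columns = the three signals.
signals = np.array([
    [1.00, 0.90, 0.95],   # near-perfect solution
    [0.60, 0.50, 0.70],   # partially correct
    [0.20, 0.10, 0.40],   # mostly wrong but structurally similar
])
instructor_grades = np.array([9.5, 6.0, 3.0])  # grades on a 0-10 scale

# Fit combination weights by least squares so that automated grades track
# the instructor's style (a stand-in for "training" the framework).
weights, *_ = np.linalg.lstsq(signals, instructor_grades, rcond=None)

def automated_grade(test_score, verification_score, cfg_similarity):
    """Weighted combination of the three signals, clipped to the grade scale."""
    raw = np.dot(weights, [test_score, verification_score, cfg_similarity])
    return float(np.clip(raw, 0.0, 10.0))

print(automated_grade(0.8, 0.7, 0.9))
```

In practice the fit would use many graded submissions per assignment; the point of the sketch is only that a linear, teacher-calibrated combination is one simple way to realize the reported adaptation to grading style.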
Conference Paper
Automated marking of assignments consisting of written text would doubtless be of advantage to teachers and education administrators alike. When large numbers of assignments are submitted at once, teachers find themselves bogged down in their attempt to provide consistent evaluations and high-quality feedback to students within as short a timeframe as is reasonable, usually a matter of days rather than weeks. Educational administrators are also concerned with quality and timely feedback, but in addition must manage the cost of doing this work. Clearly an automated system would be a highly desirable addition to the educational tool-kit, particularly if it can provide a less costly and more effective outcome. In this paper we present a description and evaluation of four automated essay grading systems. We then report on our trial of one of these systems, undertaken at Curtin University of Technology in the first half of 2001. The purpose of the trial was to assess whether automated essay grading was feasible, economically viable, and as accurate as manually grading the essays. Within the Curtin Business School we have not previously used automated grading systems, but the benefit could be enormous given the very large numbers of students in some first-year subjects. As we evaluate the results of our trial, a research and development direction is indicated which we believe will result in improvement over existing systems.
Article
We look at a controversy: the use of computers for automated and semi-automated grading of exams. K. Kukich, the director of the Natural Language Processing group at Educational Testing Service, provides an insider's view of the history of the field of automated essay grading and describes how ETS is currently using computer programs to supplement human judges in the grading process. T. Landauer, D. Laham, and P. Foltz describe the use of latent semantic analysis in a commercial essay-scoring system called IEA. They also address important ethical questions. L. Hirschman, E. Breck, J. Burger, and L. Ferro report on MITRE's current efforts toward automated grading of short-answer questions and discuss the ramifications for the design of general question-answering systems. Finally, R. Calfee places these developments in the framework of current educational theory and practice.
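Latent semantic analysis, as used in systems like IEA, projects essays into a low-dimensional semantic space and scores a new essay by its similarity to already-graded ones. The sketch below is a generic, hypothetical reconstruction of that idea using scikit-learn, not ETS's actual implementation; the training essays and grades are invented.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical pre-graded essays (in practice: many per essay prompt).
train_essays = [
    "Photosynthesis converts light energy into chemical energy in plants.",
    "Plants use sunlight, water and CO2 to make glucose and oxygen.",
    "Photosynthesis is when animals eat plants for energy.",
]
train_grades = np.array([5.0, 4.5, 1.5])

# LSA = TF-IDF term weighting followed by truncated SVD into a
# low-rank semantic space.
vectorizer = TfidfVectorizer()
svd = TruncatedSVD(n_components=2, random_state=0)
train_lsa = svd.fit_transform(vectorizer.fit_transform(train_essays))

def score_essay(essay: str, k: int = 2) -> float:
    """Grade a new essay as the similarity-weighted mean grade of its
    k most semantically similar pre-graded essays."""
    vec = svd.transform(vectorizer.transform([essay]))
    sims = cosine_similarity(vec, train_lsa)[0]
    top = np.argsort(sims)[-k:]
    return float(np.average(train_grades[top],
                            weights=np.maximum(sims[top], 1e-9)))

print(round(score_essay("Plants turn sunlight into chemical energy."), 2))
```

The design choice to score by neighborhood in the reduced space, rather than raw word overlap, is what lets an LSA grader credit essays that use different vocabulary for the same concepts.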
Messick, S. (1994). The Interplay of Evidence and Consequences in the Validation of Performance Assessments. Educational Researcher, 23(2), 13-23. doi:10.3102/0013189x023002013
Page, E. B. (1966). Toward the Machine Scoring of Essays.