Conference Paper

Leveraging Open Source LLMs for Software Engineering Education and Training

... By offering worked-out examples and best coding practices, AI ensures that students develop robust programming habits and reinforce essential software development skills [6]. Furthermore, AI supports a comprehensive understanding of the software engineering lifecycle, guiding students through documentation standards, software development processes, and real-world engineering workflows [46]. Exposure to industry-relevant software engineering methodologies through AI-driven insights prepares students for practical applications of their learning, ensuring they are equipped with the knowledge and skills necessary for professional success. ...
Preprint
Full-text available
The increasing adoption of Large Language Models (LLMs) in software engineering education presents both opportunities and challenges. While LLMs offer benefits such as enhanced learning experiences, automated assessments, and personalized tutoring, their integration also raises concerns about academic integrity, student over-reliance, and ethical considerations. In this study, we conducted a preliminary literature review to identify motivators and demotivators for using LLMs in software engineering education. We applied a thematic mapping process to categorize and structure these factors (motivators and demotivators), offering a comprehensive view of their impact. In total, we identified 25 motivators and 30 demotivators, which are further organized into four high-level themes. This mapping provides a structured framework for understanding the factors that influence the integration of LLMs in software engineering education, both positively and negatively. As part of a larger research project, this study serves as a feasibility assessment, laying the groundwork for future systematic literature review and empirical studies. Ultimately, this project aims to develop a framework to assist Finnish higher education institutions in effectively integrating LLMs into software engineering education while addressing potential risks and challenges. CCS Concepts: • Software and its engineering → Software creation and management.
... These models offer many possible use cases, ranging from code generation and bug detection to requirements analysis and software maintenance. For instance, LLM-based tools were able to generate logging statements [4], generate test cases [5], and support education [6]. ...
Preprint
In the short period since the release of ChatGPT in November 2022, large language models (LLMs) have changed the software engineering research landscape. While there are numerous opportunities to use LLMs for supporting research or software engineering tasks, solid science needs rigorous empirical evaluations. However, so far, there are no specific guidelines for conducting and assessing studies involving LLMs in software engineering research. Our focus is on empirical studies that either use LLMs as part of the research process (e.g., for data annotation) or studies that evaluate existing or new tools that are based on LLMs. This paper contributes the first set of guidelines for such studies. Our goal is to start a discussion in the software engineering research community to reach a common understanding of what our community standards are for high-quality empirical studies involving LLMs.
Article
This study explores the feasibility of using open-source large language models (LLMs) to generate automatic feedback on physics problem-solving tasks in educational settings. A quantised version of the open-source LLM OpenChat 3.6 was employed to generate German-language feedback for high school students on standard school hardware. The study procedure involved five stages: data preparation, model selection, prompt design, response evaluation, and quality analysis of feedback. OpenChat 3.6 achieved an accuracy of 0.84 in classifying student answers. In comparison, GPT4-o achieved an accuracy of 0.85. The open-source LLM provided accurate and suitable feedback in 69% of cases, with substantial interrater agreement (κ = 0.89) on feedback quality. However, performance varied across task types, highlighting areas for improvement in prompt specificity, especially in handling physics terminology. These findings suggest that, with optimisation, open-source LLMs can offer a locally controlled and effective solution for formative assessment in physics education, enabling real-time, targeted feedback to support student learning.
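The interrater agreement reported above (κ = 0.89) is Cohen's kappa, which corrects raw agreement for the agreement two raters would reach by chance. A minimal self-contained sketch of the statistic, with invented rating labels rather than the study's data:

```python
# Cohen's kappa: agreement between two raters corrected for chance.
# The rating labels below are invented for illustration.
from collections import Counter

def cohens_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Return (observed - expected) / (1 - expected) agreement."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of each label's marginal frequencies.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

a = ["good", "good", "bad", "good", "bad", "bad"]
b = ["good", "good", "bad", "good", "good", "bad"]
print(round(cohens_kappa(a, b), 2))  # 0.67
```

Values above roughly 0.8, as in the study, are conventionally read as substantial to almost-perfect agreement.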
Article
VivaBot is an intelligent, fully computerized viva examination system designed to enhance the efficiency, accuracy, and flexibility of viva voce examinations. It reduces cumbersome faculty effort through face-recognition-based verification, automated roll marking, AI-based question generation, adaptive questioning, and intelligent answer marking. Professors can schedule viva sessions by selecting the class and week, while students securely log in using their face and roll number for effective authentication and to prevent impersonation. The system implements an adaptive questioning mechanism in which questions start at a medium level of difficulty and dynamically adjust based on the student's responses, yielding a customized and impartial test. VivaBot uses Ollama Mistral AI to generate questions automatically: educators can upload a set of pre-designed questions in PDF or type in a subject for AI-generated question creation. The AI model also grades student responses impartially, substantially lightening the workload of teaching staff and reducing grading bias. VivaBot generates performance reports with confidence levels and answer-quality scores, providing detailed feedback and explanations for each question to promote understanding and future improvement. Supporting both text-based and speech-based viva tests, the system utilizes DeepFace technology to facilitate accurate recognition and response analysis, making the viva interactive and effective. By integrating machine learning, AI, and automation, VivaBot streamlines viva tests with structured, fair, and effective assessment, reducing faculty workload while offering students an immersive, adaptive, and informative learning experience.
Keywords: Artificial Intelligence and Machine Learning, DeepFace, Large Language Models, Adaptive Questioning System, Automated Answer Evaluation, Student Performance Analysis, Confidence Scoring.
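The adaptive questioning mechanism described above (start at medium difficulty, adjust with each answer) can be sketched in a few lines. The three-level scale and the one-step update rule are assumptions for illustration, not VivaBot's actual implementation:

```python
# Illustrative sketch of an adaptive questioning rule: step one difficulty
# level up on a correct answer, one level down on a wrong one.
# The three-level scale is an assumption, not VivaBot's code.

LEVELS = ["easy", "medium", "hard"]

def next_level(current: str, answered_correctly: bool) -> str:
    """Return the difficulty level for the next question."""
    i = LEVELS.index(current)
    i = min(i + 1, len(LEVELS) - 1) if answered_correctly else max(i - 1, 0)
    return LEVELS[i]

level = "medium"                      # sessions begin at medium difficulty
for correct in [True, True, False]:   # simulated sequence of answers
    level = next_level(level, correct)
print(level)  # medium -> hard -> hard -> "medium"
```

Clamping at the ends of the scale keeps the test fair: a struggling student is never pushed below the easiest level, and a strong student stays at the hardest.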
Article
Full-text available
The emergence of large language models (LLMs) such as ChatGPT has revolutionized many fields. In particular, recent advances in LLMs have triggered various studies examining the use of these models for software development tasks, such as program repair, code understanding, and code generation. Prior studies have shown the capability of ChatGPT in repairing conventional programs. However, debugging deep learning (DL) programs poses unique challenges since the decision logic is not directly encoded in the source code. This requires LLMs to not only parse the source code syntactically but also understand the intention of DL programs. Therefore, ChatGPT’s capability in repairing DL programs remains unknown. To fill this gap, our study aims to answer three research questions: (1) Can ChatGPT debug DL programs effectively? (2) How can ChatGPT’s repair performance be improved by prompting? (3) In which way can dialogue help facilitate the repair? Our study analyzes the typical information that is useful for prompt design and suggests enhanced prompt templates that are more efficient for repairing DL programs. On top of them, we summarize the dual perspectives (i.e., advantages and disadvantages) of ChatGPT’s ability, such as its handling of API misuse and recommendation, and its shortcomings in identifying default parameters. Our findings indicate that ChatGPT has the potential to repair DL programs effectively and that prompt engineering and dialogue can further improve its performance by providing more code intention. We also identified the key intentions that can enhance ChatGPT’s program repairing capability.
Article
Full-text available
The introduction of large language models (LLMs) that allow iterative “chat” in late 2022 is a paradigm shift that enables generation of text often indistinguishable from that written by humans. LLM-based chatbots have immense potential to improve academic work efficiency, but the ethical implications of their fair use and inherent bias must be considered. In this editorial, we discuss this technology from the academic’s perspective with regard to its limitations and utility for academic writing, education, and programming. We end with our stance with regard to using LLMs and chatbots in academia, which is summarized as (1) we must find ways to effectively use them, (2) their use does not constitute plagiarism (although they may produce plagiarized text), (3) we must quantify their bias, (4) users must be cautious of their poor accuracy, and (5) the future is bright for their application to research and as an academic tool.
Article
Full-text available
Innovation and challenges are significant factors that lead to the improvement of technology across various sectors, including education. New methods and techniques have been introduced into teaching and learning for learners and educators alike. Modern technology generates an effective learning process that increases students' interest in and understanding of learning activities. Hence, software engineers need to develop high-quality educational applications that include the required elements, such as learning materials and types of assessments. In addition, such applications should align with stipulated guidelines, timelines, budgets, and policies. Adopting a checklist approach in developing educational software can significantly improve its quality. The checklist approach appears to be especially effective for developers who are novices with limited knowledge of relevant evidence-based principles. This study investigates the checklist approach for novice software developers building educational applications. The results show that 89.19% of them understood and benefited from the provided checklist.
Conference Paper
Full-text available
Open source is widely used for educational purposes in higher education around the world. While many educators use open source resources for teaching, there seem to be few contributions by students to such projects as part of their university courses. In this work we present our experience of establishing open source development by student contributors as part of their university curriculum. Since 2010, more than 300 students from Graz University of Technology have been involved in the presented Catrobat project and have gained knowledge about agile software development as well as several related domains, e.g., project management, marketing, or graphical design. In this paper we provide detailed insights into the project's organization and evaluate in a study how students feel in this setting. We conclude that bringing open source into university courses is an effective practical approach based on social learning and provides benefits for students and researchers.
Article
Full-text available
Teaching software engineering is a challenging task. This paper presents problems encountered while teaching the software engineering course to computer science and computer engineering students over several offerings. We describe problems related to the course's title and contents and suggest solutions.
Article
Full-text available
The higher education community is concerned about the cost and performance of commercial software products. A common view is that existing proprietary options do not have the features required by instructors and students or allow for cost-effective customization. One way to address these problems in poorer countries, and hence improve their quality of education and access to knowledge, would be to consider the modern educational tools available with no license fees through open-source software. This paper presents an initial development of a complete open-source software platform called the Open University Project, which contains software that precisely fulfills user requirements in the higher education sector. The paper also highlights the financial advantages of introducing open-source software in developing countries and its positive impact on educational quality.
Chapter
Large language models such as OpenAI’s GPT and Google’s Bard offer new opportunities for supporting software engineering processes. Large language model assisted software engineering promises to support developers in a conversational way with expert knowledge over the whole software lifecycle. Current applications range from requirements extraction, ambiguity resolution, code and test case generation, code review and translation to verification and repair of software vulnerabilities. In this paper we present our position on the potential benefits and challenges associated with the adoption of language models in software engineering. In particular, we focus on the possible applications of large language models for requirements engineering, system design, code and test generation, code quality reviews, and software process management. We also give a short review of the state-of-the-art of large language model support for software construction and illustrate our position by a case study on the object-oriented development of a simple “search and rescue” scenario.
Chapter
Grammatical error correction aims to correct ungrammatical sentences automatically. Recently, some work has demonstrated the excellent capabilities of closed-source Large Language Models (LLMs, e.g., ChatGPT) in grammatical error correction. However, the potential of open-source LLMs remains unexplored. In this paper, we introduced GrammarGPT, an open-source LLM, to preliminarily explore its potential for native Chinese grammatical error correction. The core recipe of GrammarGPT is to leverage a hybrid dataset of ChatGPT-generated and human-annotated data. For grammatical errors with clues, we proposed a heuristic method to guide ChatGPT to generate ungrammatical sentences by providing those clues. For grammatical errors without clues, we collected ungrammatical sentences from publicly available websites and manually corrected them. In addition, we employed an error-invariant augmentation method to enhance the ability of the model to correct native Chinese grammatical errors. We ultimately constructed about 1k parallel data and utilized these data to fine-tune open-source LLMs (e.g., Phoenix, released by The Chinese University of Hong Kong, Shenzhen) with instruction tuning. The experimental results show that GrammarGPT outperforms the existing SOTA system significantly. Although its model parameters are 20x larger than the SOTA baseline's, the required amount of data for instruction tuning is 1200x smaller, illustrating the potential of open-source LLMs on native CGEC. Our GrammarGPT ranks 3rd on NLPCC2023 SharedTask1, demonstrating our approach's effectiveness. The code and data are available at https://github.com/FreedomIntelligence/GrammarGPT.
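The clue-guided construction of ungrammatical sentences can be illustrated with a small sketch. GrammarGPT targets native Chinese; the English stand-in below, the clue pair, and the function name are invented for illustration only:

```python
# Illustrative sketch of clue-guided data construction: given a grammatical
# sentence and a redundant-expression clue pair, inject the second clue word
# to create an ungrammatical variant, yielding a parallel training example.
# The clue list and sentence are invented English stand-ins, not the paper's
# (Chinese) dataset.

REDUNDANT_CLUES = [("about", "approximately")]  # using both together is redundant

def make_ungrammatical(sentence: str, clue: tuple[str, str]) -> str:
    """Insert the second clue word after the first occurrence of the first."""
    first, second = clue
    return sentence.replace(first, f"{first} {second}", 1)

grammatical = "The bridge is about 100 meters long."
ungrammatical = make_ungrammatical(grammatical, REDUNDANT_CLUES[0])
print(ungrammatical)  # The bridge is about approximately 100 meters long.
pair = (ungrammatical, grammatical)  # one (source, target) parallel example
```

In the paper's actual pipeline, the clues guide ChatGPT's generation rather than a string rewrite, but the output format is the same: pairs of ungrammatical sources and grammatical targets for instruction tuning.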
Article
Large language models (LLMs) can respond to free-text queries without being specifically trained in the task in question, causing excitement and concern about their use in healthcare settings. ChatGPT is a generative artificial intelligence (AI) chatbot produced through sophisticated fine-tuning of an LLM, and other tools are emerging through similar developmental processes. Here we outline how LLM applications such as ChatGPT are developed, and we discuss how they are being leveraged in clinical settings. We consider the strengths and limitations of LLMs and their potential to improve the efficiency and effectiveness of clinical, educational and research work in medicine. LLM chatbots have already been deployed in a range of biomedical contexts, with impressive but mixed results. This review acts as a primer for interested clinicians, who will determine if and how LLM technology is used in healthcare for the benefit of patients and practitioners.
Chapter
Learning new programming skills requires tailored guidance. With the emergence of advanced Natural Language Generation models like the ChatGPT API, there is now a possibility of creating a convenient and personalized AI tutoring system for computer science education. This paper presents GPTutor, a ChatGPT-powered programming tool, which is a Visual Studio Code extension using the ChatGPT API to provide programming code explanations. By integrating the Visual Studio Code API, GPTutor can comprehensively analyze the provided code by referencing the relevant source code. As a result, GPTutor can use designed prompts to explain the selected code in a pop-up message. GPTutor is published on the Visual Studio Code Extension Marketplace, and its source code is openly accessible on GitHub. Preliminary evaluation indicates that GPTutor delivers the most concise and accurate explanations compared to vanilla ChatGPT and GitHub Copilot. Moreover, feedback from students and teachers indicated that GPTutor is user-friendly and can explain given code satisfactorily. Finally, we discuss possible future research directions for GPTutor, including enhancing its performance and personalization via further prompt programming, as well as evaluating its effectiveness with real users. Keywords: ChatGPT, Tutoring System, Developer Tool, Prompt Engineering, Natural Language Generation
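The core idea of combining a user's selected code with relevant surrounding source before querying the model can be sketched as a prompt builder. The template text and function names below are assumptions for illustration, not GPTutor's actual prompts or API:

```python
# Hypothetical sketch of a designed code-explanation prompt in the style
# described above: the selected code is paired with relevant context from
# the source file before being sent to an LLM. Template wording and names
# are illustrative assumptions, not GPTutor's implementation.

def build_explain_prompt(selected_code: str, context_code: str, language: str) -> str:
    """Combine the user's selection with surrounding source for context."""
    return (
        f"Explain the following {language} code concisely for a student.\n"
        f"Relevant context from the same file:\n{context_code}\n"
        f"Code to explain:\n{selected_code}\n"
    )

prompt = build_explain_prompt(
    selected_code="return n * fact(n - 1)",
    context_code="def fact(n):\n    if n == 0:\n        return 1",
    language="Python",
)
print(prompt)
```

Including the enclosing definitions is what lets the model explain a fragment like `return n * fact(n - 1)` as part of a recursive factorial rather than in isolation.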
Article
Has the day we all have been waiting for really arrived? Have advances in deep learning and machine learning (ML) finally reached a turning point and have started to produce “accurate enough” assistants to help us in a variety of tasks, including software development? Are large language models (LLM) going to turn us all into better writers, artists, translators, programmers, health-care workers, not to mention software engineers? Or are we at a risky turning point where we will not be able to separate artificial intelligence (AI)-generated content from user-created ones, drowning in misinformation and perfect sounding yet fake and incorrect information and AI-generated faulty programs?
Conference Paper
Teaching software engineers presents some specific problems. There are modern approaches that may make the education process easier and more appealing. A remarkably promising example of this is the use of computer games in the teaching and learning process. We suggest a methodology of two-fold use of learning games for teaching software engineers. Students experienced in programming develop learning games, and we then use the developed games for teaching the next generation of students. After gaining skills in basic subjects, these students are involved in the development of new learning games. Teachers in our department play the role of customer, as they are interested in getting new effective tools for teaching and are ready to participate in our work. Students developing games learn all the software development life cycle phases, including testing, deployment, and maintenance; they contact real customers (teachers of the corresponding subjects) and real users (students learning those subjects). Student teams have developed several games that are used for teaching new students. Students who participated in game development gained important professional skills in software design, testing, debugging, and development, as well as experience working with open source libraries, version control systems, other modern tools, and domain-specific programming languages. Students were also trained in soft skills such as teamwork, project management, and conflict resolution. Teachers using games as learning systems notice an improvement in students' motivation to learn and their willingness to be involved in the learning process. Our experience of using computer games shows that this approach is very effective for improving students' skills and motivation to study.
M. Firat, "How ChatGPT can transform autodidactic experiences and open education," Department of Distance Education, Open Education Faculty, Anadolu University, 2023.
J. Zhang, Y. Chen, N. Niu, and C. Liu, "A preliminary evaluation of ChatGPT in requirements information retrieval," 2023.
S. Murthy, "Difficulty of Software Engineering and Ways to Overcome Common Challenges," Feb. 2023. [Online]. Available: https://tinyurl.com/wguedublog
R. Plant, V. Giuffrida, and D. Gkatzia, "You are what you write: Preserving privacy in the era of large language models," arXiv preprint arXiv:2204.09391, 2022.
J. White, Q. Fu, S. Hays, M. Sandborn, C. Olea, H. Gilbert, A. Elnashar, J. Spencer-Smith, and D. C. Schmidt, "A prompt pattern catalog to enhance prompt engineering with ChatGPT," 2023.
A. Cheshkov, P. Zadorozhny, and R. Levichev, "Evaluation of ChatGPT model for vulnerability detection," 2023.
IEEE Computer Society, SWEBOK: Guide to the Software Engineering Body of Knowledge. Piscataway, NJ: IEEE Press, 2014.