
A Comparison of Sandbox Technologies Used in Online Judge Systems

Trans Tech Publications Ltd
Applied Mechanics and Materials

Chao Yi 1,a, Su Feng 1,b, Zhi Gong 1,c
1 College of Information Science and Technology, Beijing Normal University, 100875, Beijing, China
a yichao@mail.bnu.edu.cn, b fengsu@bnu.edu.cn, c gz@mail.bnu.edu.cn
Keywords: Online Judge; sandbox; system security.
Abstract. This paper discusses the security issues in online judge (OJ) systems. It analyzes the
pros and cons of the different sandbox approaches used in open-source OJs, and then explores
other possibilities for building a sandbox better suited to online judging.
Introduction
Online judge (OJ) systems are receiving growing attention in modern education, especially in
programming-related subjects such as Data Structures and Algorithms. Teachers increasingly
publish their material on such systems rather than in traditional handouts, because the systems
return results much more quickly and accurately.
Online judge systems were originally designed for programming contests such as ACM/ICPC and OI
(Olympiad in Informatics). Universidad de Valladolid launched the first public online judge
system, "UVa Online Judge", in 1997 [1]. In 2001, the application of OJs to computer science
education was discussed [2][3].
A typical online judge system contains three modules: Web, Judger and Database. Users submit
their source code through the Web module, which stores it in the database. The Judger module then
reads the submission details from the database, compiles and runs the code, and writes the
verdict back to the database for users to see.
Security issues arise [2] because the Judger module must run the programs that users submit,
which could lead to serious problems such as stolen passwords or formatted disk drives.
Fortunately, almost every OJ now checks users' submissions and prevents dangerous code from
executing. The core technology behind these security measures is known as a "sandbox".
In this paper, we examine several open-source online judge systems, analyze the sandbox
technologies they use, and discuss the advantages and disadvantages of each.
Sandbox Technologies
A sandbox is a security mechanism for separating running programs, often used to execute
untrusted code. It is a specific form of virtualization that provides a controlled set of
resources for the guest program to run on. In this section we introduce the sandboxing
technologies used in current OJs.
A. Linux
Linux offers developers many security-focused utilities to choose from. Since Linux is open
source, developers can also build kernel-level sandboxes if they want. Here we list three
sandbox technologies widely used in current OJs.
1) ptrace
ptrace is the most common choice among open-source OJs. BNUOJ, HUSTOJ, HITOJ and some others
use ptrace to build their sandboxes. The basic idea is as follows:
First, fork a child process to execute the guest program. In the child, use ptrace to flag the
process itself as traceable, and use setrlimit to set the basic limits, such as time, memory
usage and file size. In the parent process, load the pre-defined blacklist (or whitelist, if you
prefer), then trace every system call that the guest program makes.
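The steps above can be sketched in C as follows. This is a minimal sketch, not the code of any of the OJs named here, and it rests on several assumptions: Linux on x86-64 (the system call number is read from orig_rax; on other architectures the lookup returns -1 and nothing is blocked), a hypothetical four-entry blacklist, and arbitrary limit values. The names run_guest, is_forbidden and the guest_result codes are illustrative.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/resource.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

enum guest_result { GUEST_ERROR = -1, GUEST_EXITED = 0,
                    GUEST_FORBIDDEN = 1, GUEST_KILLED = 2 };

/* Hypothetical blacklist; a real judge would tune it per language. */
static const long BLACKLIST[] = { SYS_clone, SYS_socket, SYS_kill, SYS_ptrace };

int is_forbidden(long nr) {
    for (size_t i = 0; i < sizeof BLACKLIST / sizeof BLACKLIST[0]; i++)
        if (BLACKLIST[i] == nr) return 1;
    return 0;
}

static void set_limit(int resource, rlim_t value) {
    struct rlimit rl = { value, value };   /* soft limit, hard limit */
    setrlimit(resource, &rl);
}

/* Read the number of the system call at which the guest is stopped. */
static long syscall_number(pid_t pid) {
#if defined(__x86_64__)
    struct user_regs_struct regs;
    if (ptrace(PTRACE_GETREGS, pid, NULL, &regs) == -1) return -1;
    return (long)regs.orig_rax;
#else
    (void)pid;
    return -1;  /* other architectures need their own register lookup */
#endif
}

enum guest_result run_guest(char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0) return GUEST_ERROR;
    if (pid == 0) {                                /* child: the guest */
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);     /* flag itself as traceable */
        set_limit(RLIMIT_CPU, 1);                  /* CPU time, in seconds */
        set_limit(RLIMIT_AS, (rlim_t)256 << 20);   /* address space: 256 MiB */
        set_limit(RLIMIT_FSIZE, (rlim_t)1 << 20);  /* file size: 1 MiB */
        execvp(argv[0], argv);
        _exit(127);                                /* exec failed */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0) return GUEST_ERROR;
    if (!WIFSTOPPED(status)) return GUEST_ERROR;   /* no stop after exec */
    ptrace(PTRACE_SETOPTIONS, pid, NULL,
           (void *)(long)(PTRACE_O_TRACESYSGOOD | PTRACE_O_EXITKILL));
    int sig = 0;                                   /* signal to pass on, if any */
    for (;;) {
        /* Resume the guest until its next system call entry or exit. */
        if (ptrace(PTRACE_SYSCALL, pid, NULL, (void *)(long)sig) < 0)
            return GUEST_ERROR;
        if (waitpid(pid, &status, 0) < 0) return GUEST_ERROR;
        if (WIFEXITED(status)) return GUEST_EXITED;
        if (WIFSIGNALED(status)) return GUEST_KILLED;  /* e.g. a limit was hit */
        if (WSTOPSIG(status) == (SIGTRAP | 0x80)) {    /* system call stop */
            sig = 0;
            if (is_forbidden(syscall_number(pid))) {
                kill(pid, SIGKILL);
                waitpid(pid, &status, 0);
                return GUEST_FORBIDDEN;
            }
        } else {
            sig = WSTOPSIG(status);                /* ordinary signal: forward it */
        }
    }
}
```

A caller would pass a NULL-terminated argument vector, for example one naming the compiled guest binary, and map the returned code to a verdict. Because the parent stops the guest at every system call, this design adds a context switch per call, which is the usual cost of a ptrace-based sandbox.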
Applied Mechanics and Materials Online: 2014-01-28
ISSN: 1662-7482, Vols. 490-491, pp 1201-1204
doi:10.4028/www.scientific.net/AMM.490-491.1201
© 2014 Trans Tech Publications Ltd, All Rights Reserved