Using Physical Object Detection as User Interface in Video Games
Lunar Ludologists Studio
The camera of a high-end smartphone can be used as an input device
for object recognition. This paper explores using this technology as a
user interface in video games and presents a playable prototype of a video
game that uses object recognition as its user interface, as a proof of
concept. Several RPG players played the prototype and were interviewed
about their overall experience, which they described as satisfying.
Keywords: computer vision, user interface design, user experience design, video games, mixed
The user interface (UI) is key to making players feel that they are entering and acting in the virtual
world of a video game, and it plays a significant role in shaping the player’s gameplay experience [8].
A video game may contain various objects, and the UI allows users to interact with them. UI is a broad
concept, and video games have numerous UI elements, from semi-transparent overlays to electronic input
devices. High-end smartphones can use their cameras to detect 3D physical objects in their surroundings.
This paper presents a framework that uses object recognition to let the player interact with the game;
this interaction can serve as an extra user interface element in designing video games.
Recent advances in computer vision and machine learning have made object identification easier and
more user friendly. Trained machine learning models allow applications to recognize real-world objects
without attaching extra components such as RFID labels, QR codes or markers. Users can therefore
interact with applications using common everyday objects, with no further effort or expense such as
purchasing RFID-tagged toys.
To recognize objects without adding electronic identifiers, computer vision techniques must be used.
According to [3], computer vision “is concerned with the understanding of useful information” from
images. Computer vision is a broad field and includes different approaches for different problems.
This research uses image recognition, a subset of computer vision that refers to the process of
“identifying and detecting an object or a feature in a digital image or video” [5]. To detect objects,
machine learning models must be trained for the required object detection features. This is not a
critical issue, because a variety of open-source machine learning models are available, which speeds up
game development.
This study first reviews related work, then presents the game concept and discusses its
implementation.
Although a great deal of research is dedicated to object recognition techniques and algorithms and
their applications in augmented reality, only a few studies investigate object recognition from the
user interface point of view.
An early study by Rekimoto [6] introduced a technique for building augmented reality systems that
simultaneously identify real-world objects and estimate their location, but the model is restricted to
pre-printed 2D matrix markers.
Bradski [4] studied the role of real-time face tracking and object recognition as part of a perceptual
user interface. The proposed system sought to understand human pose and gestures so that they could be
used as an interface in video games. However, this work did not use machine learning techniques and
cannot recognize everyday objects.
A major attempt to use object recognition in video games was made by Albarelli et al. [1]. The
researchers presented an object recognition pipeline, based on a matching game, for use on known
objects. Their system compares key points of predefined 3D CAD models with distinctive key points in
the scene.
Schwank [7] developed a game called UIRRIG, which uses 3D object recognition for real-time
interaction. In this game concept, users scan known real-world objects for the game's crafting in order
to solve its puzzles. The game uses Microsoft Kinect to scan 3D objects. For example, if one task is to
give a Non-Player Character (NPC) an apple, the user has to present an apple to the Kinect. In other
words, the object recognition system classifies objects that the game can then display and use.
In the system proposed in this research, an iPhone 8 is used for taking photos and for computer vision
processing. The main game is played on a Mac personal computer or on the iPhone itself. Instead of
buying extra accessories, players can therefore use their phones to scan real-world objects. Scanning
real-world objects in this game has two goals:
1. Reaching a more immersive experience by combining real-world objects with the virtual environment
of the game.
2. Increasing the freedom of players in the game.
To achieve both goals, the game avoids relying on known 3D objects, which may burden the development
process. 3D object scanning could be used instead, especially in AR games that need to locate the exact
coordinates of objects; in this game, however, the coordinates of objects are not used.
In this game, players fill their inventory with various items. The player with the most inventory
items earns the highest score and ranks higher on the leaderboards. Players can also make new friends
by exchanging inventory items. To collect items, users scan real-world objects. In addition, instead of
browsing the user interface with scroll bars or search fields, users can scan a real-world object to
select the matching item. Both importing objects into the inventory and selecting objects are thus
carried out with the help of computer vision.
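
The inventory rules described above can be sketched in a few lines of Swift. This is a minimal illustration rather than the prototype's code: the `Player` type, its method names, the label normalization, and the choice to count each distinct item once are all assumptions, since the paper does not specify them.

```swift
import Foundation

/// Hypothetical model of a player and the items collected by scanning.
struct Player {
    let name: String
    private(set) var inventory: Set<String> = []

    init(name: String) {
        self.name = name
    }

    /// Normalizes a classifier label so "Banana" and " banana " match (assumed rule).
    private static func key(for label: String) -> String {
        return label.trimmingCharacters(in: .whitespaces).lowercased()
    }

    /// Imports a scanned object into the inventory.
    /// Returns false when the item is already owned.
    @discardableResult
    mutating func collect(scannedLabel: String) -> Bool {
        let key = Player.key(for: scannedLabel)
        return inventory.insert(key).inserted
    }

    /// Selects an owned item by scanning the matching real-world object,
    /// replacing a scroll bar or search field. Returns nil if not owned.
    func select(scannedLabel: String) -> String? {
        let key = Player.key(for: scannedLabel)
        return inventory.contains(key) ? key : nil
    }

    /// Score is the number of distinct items owned (assumed scoring rule).
    var score: Int {
        return inventory.count
    }
}

/// Orders players by score, highest first; ties are broken by name.
func leaderboard(_ players: [Player]) -> [Player] {
    return players.sorted { lhs, rhs in
        if lhs.score != rhs.score { return lhs.score > rhs.score }
        return lhs.name < rhs.name
    }
}
```

Using a set keeps repeated scans of the same object from inflating the score; a design that rewards duplicates would use a dictionary of counts instead.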
The scanning system is implemented in Swift 3.0 for iOS 12, and the gaming system is implemented in
Swift for iOS 12 and macOS Mojave. The Core ML framework is used to classify the camera input from
the player’s phone. With Core ML, developers can use trained machine learning models to classify input
data. The Vision framework works with Core ML to apply classification models to images [2]. This
prototype uses the MobileNet model, an open-source machine learning model that can recognize objects
in images from 1000 categories such as food and
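
The classification step can be sketched as follows. This is a hedged illustration and not the prototype's code: the `Guess` type, the `bestGuesses` and `classify` functions, the 0.3 confidence threshold, the limit of three labels, and the center-crop option are assumptions. The Vision calls are wrapped in `#if canImport(Vision)` so that the filtering logic can be read and tested on its own.

```swift
import Foundation
#if canImport(Vision)
import Vision
import CoreGraphics
#endif

/// A label with its confidence, kept independent of Vision (hypothetical type).
struct Guess: Equatable {
    let label: String
    let confidence: Float
}

/// Keeps the most confident guesses at or above a threshold.
/// The default threshold and limit are illustrative assumptions.
func bestGuesses(_ guesses: [Guess],
                 minConfidence: Float = 0.3,
                 limit: Int = 3) -> [Guess] {
    let kept = guesses
        .filter { $0.confidence >= minConfidence }
        .sorted { $0.confidence > $1.confidence }
        .prefix(limit)
    return Array(kept)
}

#if canImport(Vision)
/// Classifies one image with a Core ML image classifier through Vision.
/// `model` is assumed to wrap a MobileNet-style classifier.
/// The call is synchronous, so it should run off the main thread.
func classify(_ image: CGImage, with model: VNCoreMLModel) throws -> [Guess] {
    let request = VNCoreMLRequest(model: model)
    request.imageCropAndScaleOption = .centerCrop
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])
    let observations = request.results as? [VNClassificationObservation] ?? []
    let guesses = observations.map {
        Guess(label: $0.identifier, confidence: $0.confidence)
    }
    return bestGuesses(guesses)
}
#endif
```

Assuming the project bundles a `MobileNet.mlmodel` file, for which Xcode generates a `MobileNet` class, the model could be created with `try VNCoreMLModel(for: MobileNet().model)` and passed to `classify`.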
To improve the visual experience of the prototype and make the game more appealing, the
classification results are looked up through the Google Custom Search API.
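
As a rough sketch of this lookup, the request URL for the Custom Search JSON API can be built with `URLComponents`. The function name `imageSearchURL` and the choice to request a single image result are assumptions; the paper does not state which parameters the prototype sends. The API key and search engine ID are placeholders supplied by the caller.

```swift
import Foundation

/// Builds a Custom Search JSON API request for an image of a recognized label.
/// `apiKey` and `engineID` are placeholders for the caller's own credentials.
func imageSearchURL(for label: String, apiKey: String, engineID: String) -> URL? {
    var components = URLComponents(string: "https://www.googleapis.com/customsearch/v1")
    components?.queryItems = [
        URLQueryItem(name: "key", value: apiKey),         // API key
        URLQueryItem(name: "cx", value: engineID),        // search engine ID
        URLQueryItem(name: "q", value: label),            // label from the classifier
        URLQueryItem(name: "searchType", value: "image"), // restrict to image results
        URLQueryItem(name: "num", value: "1")             // assumed: one result is enough
    ]
    return components?.url
}
```

The returned URL could then be fetched with `URLSession` and the JSON response decoded to obtain an image link for the inventory item.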
Conclusion and Future work
This work presents a foundation for using object detection in video games. Object recognition is used to add
an extra element to the user interface of video games. A game was designed using this framework and tested
with a group of RPG players, who were then interviewed. The testers found the general concept satisfying,
but they asked for richer attributes for inventory items to increase the gameplay possibilities, and for a
more complex game mechanism. In other words, the only current downside is the lack of entertaining content,
which prevents researchers from comparing the object recognition user interface element with the interfaces
of common mainstream games. In the future, advanced game usability tests such as questionnaires and EEG
should be applied to evaluate player engagement and satisfaction.
The research is limited to MobileNet models. Using other machine learning models and custom data could
extend the game opportunities for players, and the game should be evaluated by more players.
1- Albarelli, A., Rodola, E., Bergamasco, F., and Torsello, A. A non-cooperative game for 3d object recognition in cluttered scenes: (IEEE,
2011, edn.), pp. 252-259
2- “Classifying images with Vision and Core ML” Internet:
https://developer.apple.com/documentation/vision/classifying_images_with_vision_and_core_ml [Oct. 10,
3- “What is computer vision” Internet: http://www.bmva.org/visionoverview [Oct. 10, 2018]
4- Bradski, G.R. Real time face and object tracking as a component of a perceptual user interface: (IEEE, 1998,
edn.), pp. 214-219
5- “Recognition methods in image processing” Internet:
https://www.mathworks.com/discovery/image-recognition.html [Sep. 9, 2018]
6- Rekimoto, J. Matrix: A realtime object identification and registration method for augmented reality: (IEEE,
1998, edn.), pp. 63-68
7- Schwank, A. Using 3D Object Recognition for Real-Time Interactive Games: HTW Berlin, 2014
8- Adams, E. Fundamentals of game design: Pearson Education, 2014