Glide: Towards an intelligent kinesthetic learning space for teaching dance

Bond University
Research Repository
Glide: Towards an intelligent kinesthetic learning space for teaching dance
De Byl, Penny Baillie; Birt, James R.; Khan, Muqeem
Published in:
13th Middle Eastern Simulation and Modelling Conference
Published: 01/01/2012
Document Version:
Publisher's PDF, also known as Version of record
Recommended citation(APA):
De Byl, P., Birt, J., & Khan, M. (2012). Glide: Towards an intelligent kinesthetic learning space for teaching
dance. In M. Al-Akaidi (Ed.), 13th Middle Eastern Simulation and Modelling Conference : MESM 2012
Proceedings (pp. 81-82). EUROSIS.
GLIDE:
TOWARDS AN INTELLIGENT KINESTHETIC LEARNING SPACE FOR
TEACHING DANCE
Penny de Byl and James Birt
Bond University, Australia
E-mail: pdebyl@bond.edu.au, jbirt@bond.edu.au
Muqeem Khan
Northwestern University in Qatar
muqeem@usa.net
KEYWORDS
Learning, Teaching, Motion-Capture, Dance, Artificial
Intelligence, XBox Kinect
ABSTRACT
In this paper we present an overview of a proposed
artificially intelligent learning environment that instructs
dance. The Guided Learning and Immersive Dance
Environment (GLIDE) will teach dance via a virtual
instructor that senses student movement in the real world via
a Microsoft Kinect. As a student attempts to mimic the
movements of the instructor, the system will extract skeleton
movement and joint rotations to evaluate the dance
performance. The analysis of the student's movements will
be fed into the system's artificial intelligence, which will
provide real-time feedback and customised, targeted
instruction to help improve the student's performance.
This technology has far-reaching applications, from
traditional dance instruction to the preservation and
dissemination of intangible cultural heritage.
I INTRODUCTION
Dance can be learned from verbal description, spatial guides
or instructor imitation. Verbal description works well if the
dance isn't too complicated; however, as the movements
become more complex and increase in speed, words are too
deliberate, general and linear to embody the complexity of
precise movement. The spatial guide method used in
old-fashioned dancing, which forms the basis of numerous
computer games such as Dance Dance Revolution, instructs
players through tasks of hitting a sequence of spatial targets
in rhythmic time. This provides a greater kinesthetic
learning experience than verbal description alone. In this
paper we outline the proposed GLIDE system being
developed by Bond University in Australia, Northwestern
University in Qatar and Hanze University, The Netherlands.
Through it we will evaluate motion-sensing technologies,
developed for use in contemporary digital computer games,
through the design of an intelligent kinesthetic learning
space. This project aims to produce the following outcomes:
1) create a proof-of-concept intelligent kinesthetic learning
space; 2) evaluate and explore knowledge transfer
opportunities afforded by kinesthetic peripheral games
technology; and 3) perpetuate the notion of serious games in
the realm of kinesthetic learning space.
GLIDE: A Guided Learning and Immersive Dance
Environment
In the hierarchy of visual and kinesthetic experiences with
GLIDE, cognitive subprocesses play the leading role.
Peripheral and focused vision constantly scan the
application on the screen for learning cues and feedback.
The information arising from the user's interaction engages
short-term (working) memory, depending on the user's
choices and on sudden emotional or structural changes in the
digital elements presented by the GLIDE application.
Research shows that humans have a limited capacity for
encoding, storage and retrieval of information from the
screen (Lang, 2006). The limited capacity model treats
television viewers as information processors: viewers process
the information on the screen through the parallel cognitive
subprocesses of encoding, storage and retrieval of the
messages to which they have been exposed.
The main difference in our approach to designing GLIDE
with respect to existing systems is in determining the optimal
configuration between the artificial intelligence (AI)
techniques used to provide student feedback and the
accuracy with which motion is being captured and analysed.
In short, we endeavour to determine a point in the design at
which learning effectiveness does not improve with more
complex processing and feedback.
GLIDE will include the projection of a virtual dance
instructor character programmed with real world dance
routines captured from expert dancers and a dance area for
students (as illustrated in Figure 1). Students entering the
space will receive dance instruction from the character. The
motion-sensing devices will track the students' movements
and the character will give them feedback on their
performance.
© EUROSIS-ETI
[Figure 1 (labels recovered from the diagram): "Instructor
teaches by example." "Student follows and attempts to mimic
the instructor." "Infrared cameras capture and track student
movement." "Intelligent software compares instructor
movement with student movement and adapts instruction for
correcting and progressing lesson."]
Pose Evaluation
Pose evaluation identifies, in real time or in post-processing,
how a human body and/or individual limbs and joints are
configured in a given scene. There are many techniques
described in contemporary literature for determining the
accuracy of human pose capture (Moeslund & Granum,
2001). These techniques can be coarsely grouped into two
classes: 2D image sequencing and 3D data point analysis
(Raptis, Kirovski, & Hoppe, 2011). Methods using 2D image
sequences focus on extracting image and pattern features of
the movement in space and time. Examples include the use
of contours to identify similar motions between the
recorded movement and the user (Aaron, James, & IEEE
Computer, 2001), and systems that match against pre-recorded
poses or templates of human actions such as walking, waving
and running (Schuldt, Laptev, & Caputo, 2004). The use of
3D information provides many benefits over pure 2D
image-based approaches. The data is not sensitive to changes
in vantage point, scale and lighting (Raptis et al., 2011).
The 3D approach allows tracking of individual parts of the
human body, enabling temporal analysis of human body
dynamics or, more simply, of dance movement in time.
The method used for determining pose accuracy in our
project will depend on the level of detail required to achieve
the optimal learning situation. Using the pose evaluation
examples above as a baseline, we propose to experiment
using: 1) distinct time intervals and 3D joints; 2) musical
beat intervals and 3D joints; 3) real-time 3D joints; 4)
approximate 3D joints in real time with simple Gaussian
analysis; and 5) approximate 3D joints in real time with
inverse kinematics and Euclidean distance. The goal is to
generate a series of algorithms from these pose evaluations
that range in complexity and can adapt to the user.
Proposed Efficacy and Affordance
The current prototype, shown in Figure 2, is implemented
with the Unity 3D game engine. It has been developed to
begin evaluation of the first configuration of visual accuracy
feedback and the use of Euclidean distance to determine
player pose accuracy. Although the image illustrates an
Australian Aboriginal dancer, the design will change to
accommodate our first study, which will evaluate the use of
GLIDE in teaching Hip Hop. This decision is primarily
based on the research team's access to a Hip Hop dance
instructor and a large cohort of secondary school children.
Figure 2: GLIDE illustrating the instructor on the left, the
student's avatar and an accuracy display highlighting the
joints of the student which are not in line with those of the
instructor.
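A Euclidean-distance check of pose accuracy, of the kind
mentioned for the first prototype configuration, might look
roughly like the Python sketch below. It is an illustration
only and not the prototype's code (the prototype is built in
Unity). The joint names, the sample coordinates, the 0.15
tolerance and all function names are hypothetical, and the
sketch assumes that both skeletons have already been
expressed in the same coordinate frame and scale.

```python
import math
from typing import Dict, List, Tuple

# A pose maps a joint name to an (x, y, z) position.
# Joint names and units (metres) are assumptions for this sketch.
Pose = Dict[str, Tuple[float, float, float]]


def joint_distances(instructor: Pose, student: Pose) -> Dict[str, float]:
    """Euclidean distance between each joint present in both poses."""
    return {
        name: math.dist(instructor[name], student[name])
        for name in instructor
        if name in student
    }


def misaligned_joints(instructor: Pose, student: Pose,
                      tolerance: float = 0.15) -> List[str]:
    """Joints whose distance from the instructor's exceeds the tolerance."""
    distances = joint_distances(instructor, student)
    return sorted(name for name, d in distances.items() if d > tolerance)


def pose_accuracy(instructor: Pose, student: Pose,
                  tolerance: float = 0.15) -> float:
    """Fraction of shared joints that lie within the tolerance."""
    distances = joint_distances(instructor, student)
    if not distances:
        return 0.0
    within = sum(1 for d in distances.values() if d <= tolerance)
    return within / len(distances)


# Made-up sample data: the student's left hand is 0.3 m too low.
instructor_pose: Pose = {
    "head": (0.0, 1.7, 2.0),
    "left_hand": (0.5, 1.5, 2.0),
    "right_hand": (-0.5, 1.5, 2.0),
}
student_pose: Pose = {
    "head": (0.0, 1.7, 2.05),
    "left_hand": (0.5, 1.2, 2.0),
    "right_hand": (-0.5, 1.55, 2.0),
}

flagged = misaligned_joints(instructor_pose, student_pose)
accuracy = pose_accuracy(instructor_pose, student_pose)
print(flagged, round(accuracy, 2))
```

The list of flagged joints corresponds to the kind of
accuracy display described for Figure 2, where joints that
are not in line with the instructor's are highlighted. The
tolerance is the parameter one would vary when exploring how
strict the feedback should be.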
SUMMARY
GLIDE offers a well-crafted, playful kinesthetic learning
space that can foster active, purposeful and entertaining
user interaction with observational learning
opportunities. With the marketplace entry of the Nintendo
Wii, PlayStation Move and Microsoft Kinect, motion capture
technology is now widely available to consumers. While
there is a plethora of serious applications of this technology
being examined, there is little research into its efficacy with
respect to teaching dance. This project represents a
cutting-edge and necessary examination of motion-sensing
technologies and their potential for kinesthetic knowledge
transfer to a wider international community. While the
project outlined herein will examine the design parameters
for a truly effective virtual dance tutor, the applications
stemming from it are further reaching. For example, the
preservation and teaching of cultural dances is an important
focal area in intangible cultural heritage (ICH) research.
The use of these technologies not only provides tremendous
showcasing opportunities for dance and related kinesthetic
domains but also offers spontaneous, undirected
learning experiences for people of all ages (Tanenbaum &
Bizzocchi, 2009).
REFERENCES
Lang, A. (2006). Using the limited capacity model of motivated
mediated message processing to design effective cancer
communication messages. Journal of Communication, 56,
S57-S80.
Aaron, F. B., James, W. D., & IEEE Computer, S. (2001). The
Recognition of Human Movement Using Temporal Templates.
Moeslund, T. B., & Granum, E. (2001). A Survey of Computer
Vision-Based Human Motion Capture. Computer Vision and
Image Understanding, 81(3), 231-268. doi:
10.1006/cviu.2000.0897
Raptis, M., Kirovski, D., & Hoppe, H. (2011). Real-time
classification of dance gestures from skeleton animation.
Tanenbaum, J., & Bizzocchi, J. (2009). Rock Band: a case study in
the design of embodied interface experience. Proceedings of the
2009 ACM SIGGRAPH Symposium on Video Games (pp.
127-134).