February 2019
Human action classification is a significant problem in the computer vision field. Understanding the content of videos is essential for retrieving relevant information from large video collections. In this study, we propose an approach that classifies human actions based on the coordinate information of body parts. Key coordinate points extracted from each frame by a real-time pose estimation algorithm are accumulated into a matrix. These accumulated coordinates are then fed into a convolutional neural network (CNN) to classify human actions, which is the main contribution of this study. The approach is designed to ignore the background and to consider only the movement of the joints of the extracted poses. The CNN consists of three convolutional layers, a pooling layer, and a linear layer, which extract the features most relevant for classifying human actions. We use two benchmark datasets to validate the performance of the proposed approach. On six action types from the KTH dataset, the proposed approach achieves very high accuracy (100%), exceeding competing approaches.
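The pipeline described above (stacked per-frame keypoint coordinates classified by a small CNN) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the joint count, frame count, channel widths, and kernel sizes are assumptions, and the real-time pose estimator is replaced by random placeholder coordinates.

```python
import torch
import torch.nn as nn

class PoseActionCNN(nn.Module):
    """Hypothetical sketch: classify actions from a matrix of
    accumulated pose keypoints (num_frames x 2*num_joints)."""

    def __init__(self, num_joints=18, num_frames=32, num_classes=6):
        super().__init__()
        # Treat the accumulated coordinate matrix as a 1-channel "image":
        # rows are frames, columns are (x, y) coordinates of each joint.
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),  # single pooling layer, as in the abstract
        )
        # Flattened feature size after one 2x2 pooling step.
        feat = 64 * (num_frames // 2) * (2 * num_joints // 2)
        self.classifier = nn.Linear(feat, num_classes)  # linear layer

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

model = PoseActionCNN()
# Placeholder batch: 4 clips, each a 32x36 matrix of (x, y) joint
# coordinates; in practice these come from a pose estimation algorithm.
clips = torch.randn(4, 1, 32, 36)
logits = model(clips)
print(tuple(logits.shape))  # (4, 6): one score per action class
```

Because only joint coordinates enter the network, the input is far smaller than raw video frames and carries no background information, which is the design choice the abstract emphasizes.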