Authors: Nikolova, D. V., Vladimirov, I. H., Terneva, Z. A.
Title: Human Action Recognition for Pose-based Attention: Methods on the Framework of Image Processing and Deep Learning
Keywords: Deep Learning; Feature Extraction; Human Action Recognition; Image Processing; Pose-based Attention

Abstract: This paper presents an overview of several approaches to human action recognition (HAR) for pose-based attention. It focuses on algorithms that apply video processing to a given dataset. A list of the best HAR datasets is given to show the variety of videos available online. Local and global feature extraction are reviewed, along with some of the most common deep learning methods: recurrent neural networks (RNN), convolutional neural networks (CNN) and generative adversarial networks (GAN). All of the methods aim to recognise the pose and the focus of the person in a recording.


ICEST Conference, issue 56, pp. 23 - 26, 2021, Bulgaria, IEEE, DOI 10.1109/ICEST52640.2021.9483503

Copyright IEEE


Type: poster/presentation at an international forum, publication in a refereed edition, indexed in Scopus