Journal articles

Exploring a rich spatial–temporal dependent relational model for skeleton-based action recognition by bidirectional LSTM-CNN

Abstract : With the rapid development of effective, low-cost human skeleton capture systems, skeleton-based action recognition has attracted much attention recently. Most existing methods based on Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks have achieved promising performance on this task. However, these approaches are limited in their ability to explore rich spatial–temporal relational information. In this paper, we propose a new spatial–temporal model with an end-to-end bidirectional LSTM-CNN (BiLSTM-CNN). First, a hierarchical spatial–temporal dependent relational model is used to explore rich spatial–temporal information in the skeleton data. Then, a new framework is proposed to fuse CNN and LSTM. In this framework, the skeleton data are constructed by the dependent relational model and serve as the input to the proposed network. The LSTM extracts temporal features, and a standard CNN then explores spatial information from the LSTM output. Finally, experimental results on the NTU RGB+D, SBU Interaction, and UTD-MHAD datasets demonstrate the effectiveness of the proposed model.
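The data flow described in the abstract (a bidirectional LSTM over the frame sequence, followed by a convolution over the LSTM output) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the sizes (8 frames, 5 joints, hidden size 4), the random untrained weights, and the single 3×3 kernel are all assumptions chosen for illustration, and the hierarchical dependent relational model, pooling, and classification layers are omitted.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_lstm(input_size, hidden_size, rng):
    """Random, untrained weights for the four LSTM gates (illustrative only)."""
    def gate():
        rows = [[rng.uniform(-0.1, 0.1) for _ in range(input_size + hidden_size)]
                for _ in range(hidden_size)]
        return rows, [0.0] * hidden_size
    return {name: gate() for name in ("i", "f", "o", "g")}


def lstm_forward(frames, params, hidden_size):
    """Run one LSTM direction over a list of frame vectors; return every hidden state."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    states = []
    for x in frames:
        z = x + h  # concatenate the input with the previous hidden state
        pre = {name: [sum(w * v for w, v in zip(row, z)) + b
                      for row, b in zip(weights, bias)]
               for name, (weights, bias) in params.items()}
        i = [sigmoid(v) for v in pre["i"]]    # input gate
        f = [sigmoid(v) for v in pre["f"]]    # forget gate
        o = [sigmoid(v) for v in pre["o"]]    # output gate
        g = [math.tanh(v) for v in pre["g"]]  # candidate cell state
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        states.append(h)
    return states


def bilstm(frames, fwd, bwd, hidden_size):
    """Concatenate forward and time-reversed backward hidden states per frame."""
    forward = lstm_forward(frames, fwd, hidden_size)
    backward = lstm_forward(frames[::-1], bwd, hidden_size)[::-1]
    return [a + b for a, b in zip(forward, backward)]


def conv2d_valid(fmap, kernel):
    """Single-channel 2D cross-correlation with no padding and stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    rows, cols = len(fmap), len(fmap[0])
    return [[sum(kernel[a][b] * fmap[r + a][col + b]
                 for a in range(kh) for b in range(kw))
             for col in range(cols - kw + 1)]
            for r in range(rows - kh + 1)]


# Assumed toy sizes: 8 frames, 5 joints with (x, y, z) coordinates, hidden size 4.
rng = random.Random(0)
num_frames, num_joints, hidden_size = 8, 5, 4
frames = [[rng.uniform(-1, 1) for _ in range(num_joints * 3)]
          for _ in range(num_frames)]
fwd = init_lstm(num_joints * 3, hidden_size, rng)
bwd = init_lstm(num_joints * 3, hidden_size, rng)

# Temporal features: one row per frame, 2 * hidden_size columns.
features = bilstm(frames, fwd, bwd, hidden_size)

# Treat the (frames x features) matrix as a one-channel image for the CNN stage.
kernel = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(3)]
fmap = conv2d_valid(features, kernel)

print(len(features), len(features[0]))  # shape of the BiLSTM output
print(len(fmap), len(fmap[0]))          # shape after the 3x3 convolution
```

With these assumed sizes, the BiLSTM output is an 8×8 matrix (frames by concatenated hidden units), and the unpadded 3×3 convolution reduces it to 6×6. In practice a deep learning framework would replace these hand-written loops.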
Document type :
Journal articles

https://hal-utt.archives-ouvertes.fr/hal-03320682
Contributor : Jean-Baptiste Vu Van
Submitted on : Monday, August 16, 2021 - 11:52:25 AM
Last modification on : Friday, August 27, 2021 - 3:14:06 PM

Collections

UTT | CNRS

Citation

Aichun Zhu, Qianyu Wu, Ran Cui, Tian Wang, Wenlong Hang, et al. Exploring a rich spatial–temporal dependent relational model for skeleton-based action recognition by bidirectional LSTM-CNN. Neurocomputing, Elsevier, 2020, 414, pp.90-100. ⟨10.1016/j.neucom.2020.07.068⟩. ⟨hal-03320682⟩
