Interaction-Aware Spatio-Temporal Pyramid Attention Networks for Action Classification


CNN-based action recognition can be improved by focusing on local key action regions. Self-attention concentrates on important features while suppressing irrelevant ones, making it well suited to action recognition. However, existing self-attention methods largely ignore the correlations among the local feature vectors at spatial positions in CNN feature maps. In this paper, we propose an efficient interaction-aware self-attention model that learns attention maps from the interactions between feature vectors. Because different layers of a network capture feature maps at different scales, we introduce a spatial pyramid containing feature maps from multiple layers for attention modeling; this multi-scale information yields more accurate attention scores. The attention scores are used to weight the local feature vectors of the feature maps, from which attentional feature maps are computed. Since no constraint is placed on the number of feature maps fed into the spatial pyramid attention layer, we readily extend it to a spatio-temporal version. Our model can be embedded in any general CNN to form a video-level, end-to-end attention network for action recognition. We also investigate several ways of fusing the RGB and flow streams to obtain accurate predictions of human actions. Experiments show that our method achieves state-of-the-art results on UCF101, HMDB51, Kinetics-400, and the untrimmed Charades dataset.
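To make the mechanism concrete, below is a minimal NumPy sketch of the idea, not the authors' implementation: each spatial position of a feature map is treated as a local feature vector, pairwise dot products serve as the "interactions" from which a per-position attention score is derived, and scores from coarser pyramid levels are upsampled (nearest neighbour) and averaged with the finest level. The dot-product interaction, the softmax normalization, and the upsampling scheme are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def interaction_attention(feat):
    """Attention from interactions among local feature vectors.

    feat: (C, H, W) feature map. Each of the N = H*W spatial positions
    is a C-dim local vector; pairwise dot products model interactions,
    and each position's score aggregates its interactions with all others.
    Returns the attention-weighted map and the (N,) score vector.
    """
    C, H, W = feat.shape
    X = feat.reshape(C, -1).T                      # (N, C) local vectors
    inter = X @ X.T                                # (N, N) interactions
    scores = softmax(inter.sum(axis=1) / np.sqrt(C))  # one score per position
    weighted = X * scores[:, None]                 # reweight local vectors
    return weighted.T.reshape(C, H, W), scores

def pyramid_attention(feats):
    """Fuse attention scores across pyramid levels.

    feats: list of (C, Hi, Wi) maps from different layers, finest first;
    coarser grids must divide the finest. Each level's scores are
    nearest-neighbour upsampled to the finest grid, averaged, and used
    to reweight the finest map.
    """
    C, H, W = feats[0].shape
    fused = np.zeros((H, W))
    for f in feats:
        _, h, w = f.shape
        _, s = interaction_attention(f)
        s = s.reshape(h, w)
        s = np.repeat(np.repeat(s, H // h, axis=0), W // w, axis=1)
        fused += s
    fused /= len(feats)
    fused /= fused.sum()                           # renormalise to a distribution
    X = feats[0].reshape(C, -1)
    return (X * fused.reshape(1, -1)).reshape(C, H, W)
```

Because the attention layer only reshapes positions into a set of local vectors, stacking the (C, H, W) maps of several frames along the position axis extends the same computation to the spatio-temporal case, as the abstract notes.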

