Multimodal Pedestrian Detection Using Spatio-Contextual Deep Networks for Autonomous Driving

PROJECT TITLE : Spatio-Contextual Deep Network Based Multimodal Pedestrian Detection For Autonomous Driving

ABSTRACT: Pedestrian detection is one of the most important components of an autonomous driving system. A camera is commonly used for this task, but camera image quality degrades significantly in low-light conditions such as night driving, whereas the quality of a thermal-camera image is largely unaffected by illumination. This paper presents an end-to-end multimodal fusion model for pedestrian detection that uses both RGB and thermal images. Its novel spatio-contextual deep network architecture exploits the multimodal input effectively. Two separate deformable ResNeXt-50 encoders extract features from the two modalities. These encoded features are then fused by a multimodal feature embedding module (MuFEm), which consists of several groups of a pair of Graph Attention Networks and a feature fusion unit. The output of the final feature fusion unit of MuFEm is passed to two CRFs for spatial refinement. The features are further enhanced through channel-wise attention and contextual information extracted by four RNNs that traverse the feature map in four different directions. Finally, a single-stage decoder uses these feature maps to produce a score map and a bounding box for each pedestrian. The proposed framework has been evaluated extensively on three publicly available multimodal pedestrian detection benchmark datasets, namely KAIST, CVC-14, and UTokyo.
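To make the fusion idea concrete, the sketch below shows a minimal gated fusion of an RGB feature map and a thermal feature map in plain NumPy. This is only an illustration of the general concept, not the paper's MuFEm: the sigmoid gate here is computed directly from the feature difference, whereas the actual module uses Graph Attention Networks and learned fusion units.

```python
import numpy as np

def gated_fusion(rgb_feat, thermal_feat):
    """Fuse two modality feature maps with a per-element sigmoid gate.

    rgb_feat, thermal_feat: arrays of shape (C, H, W).
    The gate is derived from the feature difference purely for
    illustration; a real model would compute it with learned layers.
    """
    gate = 1.0 / (1.0 + np.exp(-(rgb_feat - thermal_feat)))  # sigmoid in [0, 1]
    # Convex combination: each output element lies between the two inputs.
    return gate * rgb_feat + (1.0 - gate) * thermal_feat

rng = np.random.default_rng(0)
rgb = rng.random((8, 4, 4))
thermal = rng.random((8, 4, 4))
fused = gated_fusion(rgb, thermal)
```

Because the gate produces a convex combination, every fused value stays between the corresponding RGB and thermal values, so neither modality can be amplified beyond its own response.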
On each of these datasets, the proposed framework improves on the state-of-the-art performance. A brief video overview of this work and its qualitative results is available at https://youtu.be/FDJdSifuuCs. The source code will be made publicly available when the paper is published.
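The four-directional traversal mentioned in the abstract can be illustrated with a minimal linear recurrence that sweeps a feature map from left, right, top, and bottom, so that each position accumulates context from the whole map. This is a simplified stand-in with a fixed decay coefficient (an assumption for illustration), not the paper's learned RNNs.

```python
import numpy as np

def directional_sweep(feat, decay=0.5):
    """Left-to-right context accumulation h[x] = feat[x] + decay * h[x-1].

    feat: (H, W) single-channel feature map. `decay` is a fixed,
    illustrative coefficient; a real RNN would use learned weights.
    """
    out = np.zeros_like(feat, dtype=float)
    h = np.zeros(feat.shape[0])
    for x in range(feat.shape[1]):
        h = feat[:, x] + decay * h
        out[:, x] = h
    return out

def four_direction_context(feat, decay=0.5):
    """Sum sweeps from all four directions so every position sees
    context from left, right, top, and bottom."""
    left = directional_sweep(feat, decay)
    right = directional_sweep(feat[:, ::-1], decay)[:, ::-1]
    down = directional_sweep(feat.T, decay).T
    up = directional_sweep(feat.T[:, ::-1], decay)[:, ::-1].T
    return left + right + down + up

feat = np.arange(12.0).reshape(3, 4)
ctx = four_direction_context(feat)
```

With `decay=0` each sweep reduces to the identity, so the output is exactly four times the input; larger decay values spread information further across the map.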