Underwater Acoustic Networks Using a MAC Protocol Based on Deep Reinforcement Learning

PROJECT TITLE: Deep Reinforcement Learning Based MAC Protocol for Underwater Acoustic Networks

ABSTRACT: The long propagation delay in underwater acoustic networks (UWANs) can reduce throughput, which makes it one of the most important considerations when designing a medium access control (MAC) protocol for them. This paper develops a deep reinforcement learning (DRL) based MAC protocol for UWANs, referred to as delayed-reward deep-reinforcement learning multiple access (DR-DLMA). The goal of the protocol is to maximize network throughput by making judicious use of the available time slots that result from propagation delays or are left unused by other nodes. In the design of DR-DLMA, we first propose a new DRL algorithm, the delayed-reward deep Q-network (DR-DQN). We then realize the DR-DLMA protocol by recasting the multiple access problem in UWANs as a reinforcement learning (RL) problem, defining state, action, and reward in the language of RL, so that it can be solved more efficiently. In traditional DRL algorithms, such as the original DQN algorithm, the agent can access the "reward" from the environment immediately after performing an action. In our design, by contrast, the "reward" (i.e., the ACK packet) becomes available only after twice the one-way propagation delay has elapsed since the agent took an action (i.e., transmitted a data packet). The core idea behind DR-DQN is to modify the DRL algorithm so that the propagation delay is incorporated into the DRL framework. In addition, to reduce the cost of online training of deep neural networks (DNNs), we develop a nimble training mechanism for DR-DQN.
As a benchmark, the optimal network throughputs in a variety of scenarios are presented. Simulation results show that the DR-DLMA protocol with the nimble training mechanism is able to (i) find the optimal transmission strategy when coexisting with other protocols in a heterogeneous environment; (ii) outperform state-of-the-art MAC protocols (such as slotted FAMA and DOTS) in a homogeneous environment; and (iii) significantly reduce energy consumption and run-time in comparison to DR-DLMA with the traditional training mechanism.
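
The delayed-reward idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' DR-DQN implementation. It only shows one plausible way to hold an action until its ACK outcome arrives two one-way propagation delays later, and then store the completed experience for training. The class and function names, the slot-based delay, the action encoding (1 = transmit, 0 = wait), and the 0/1 reward are all assumptions made for illustration; the state is treated as an opaque tuple.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Optional, Tuple

State = Tuple[int, ...]  # opaque state representation (assumption)


@dataclass
class Experience:
    state: State
    action: int          # assumed encoding: 1 = transmit, 0 = wait
    reward: float
    next_state: State


class DelayedRewardBuffer:
    """Hypothetical helper: pairs each action with the ACK outcome that
    arrives 2 * one-way propagation delay (in slots) after the action."""

    def __init__(self, one_way_delay_slots: int) -> None:
        # Assumption: the propagation delay is a whole number of time slots.
        self.round_trip = 2 * one_way_delay_slots
        self.pending: Deque[Tuple[int, State, int, State]] = deque()
        self.replay: List[Experience] = []  # completed experiences

    def record_action(self, slot: int, state: State, action: int,
                      next_state: State) -> None:
        # The reward is unknown at this point, so only park the transition.
        self.pending.append((slot, state, action, next_state))

    def observe(self, slot: int, ack_received: bool) -> Optional[Experience]:
        """Call once per slot with whether an ACK was heard in that slot."""
        if not self.pending or self.pending[0][0] + self.round_trip > slot:
            return None  # oldest action's feedback has not arrived yet
        _, state, action, next_state = self.pending.popleft()
        # Assumed reward: 1 for an acknowledged transmission, else 0.
        reward = 1.0 if (action == 1 and ack_received) else 0.0
        exp = Experience(state, action, reward, next_state)
        self.replay.append(exp)
        return exp


def demo() -> List[Experience]:
    buf = DelayedRewardBuffer(one_way_delay_slots=2)  # round trip = 4 slots
    actions = [1, 0, 1, 0, 0, 0, 0]
    acks = {4: True}  # only the slot-0 transmission is acknowledged
    for slot, action in enumerate(actions):
        buf.record_action(slot, (slot,), action, (slot + 1,))
        buf.observe(slot, acks.get(slot, False))
    return buf.replay


if __name__ == "__main__":
    for exp in demo():
        print(exp)
```

In this sketch the transmission in slot 0 receives its reward only in slot 4, which mirrors the two-way delay described above. A learning agent would sample from `replay` to update its Q-network, but that step is omitted because the abstract does not specify it.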