PROJECT TITLE : Online Reinforcement Learning Control by Direct Heuristic Dynamic Programming: From Time-Driven to Event-Driven

ABSTRACT: In this research, "time-driven learning" refers to machine learning that updates the parameters of a prediction model continuously as new data arrive. Direct heuristic dynamic programming (dHDP) is one such algorithm from the approximate dynamic programming (ADP) and reinforcement learning (RL) literature, and it has proven effective on several complex learning control problems. Because time-driven dHDP updates both the control policy and the critic at every time step in response to continuously changing system states, it may perform an update even when a new measurement carries little information, for example when the change is dominated by noise. It is therefore desirable to prevent such insignificant system events from triggering learning. To this end, we propose a new event-driven dHDP. By constructing a Lyapunov function candidate, we prove that the system states and the weights of the critic and control policy networks are uniformly ultimately bounded (UUB), and that the approximate control policy and cost-to-go function approach the Bellman optimality within a finite bound. We also illustrate how the event-driven dHDP operates in comparison with the original time-driven dHDP.
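To make the time-driven versus event-driven distinction concrete, the sketch below shows a minimal event-driven actor-critic loop in the spirit of dHDP. It is an illustrative assumption, not the paper's implementation: the network sizes, learning rates, the norm-based trigger rule (update only when the state has drifted beyond a threshold THRESH from its value at the last event), the placeholder plant, and the stage cost are all made up for demonstration.

import numpy as np

# Sketch of an event-driven actor-critic (dHDP-style) update loop.
# All sizes, gains, the trigger rule, and the plant are assumptions.

rng = np.random.default_rng(0)
S, A, H = 4, 1, 8            # state, action, hidden-layer sizes (assumed)
GAMMA = 0.95                 # discount on the cost-to-go J
LR_C, LR_A = 0.01, 0.01      # critic/actor learning rates (assumed)
THRESH = 0.1                 # event-triggering threshold (assumed)

Wc1 = rng.normal(scale=0.1, size=(H, S + A))   # critic hidden weights
Wc2 = rng.normal(scale=0.1, size=(1, H))       # critic output weights
Wa1 = rng.normal(scale=0.1, size=(H, S))       # actor hidden weights
Wa2 = rng.normal(scale=0.1, size=(A, H))       # actor output weights

def actor(x):
    h = np.tanh(Wa1 @ x)
    return np.tanh(Wa2 @ h), h

def critic(x, u):
    h = np.tanh(Wc1 @ np.concatenate([x, u]))
    return float(Wc2 @ h), h

x = rng.normal(size=S)
x_evt = x.copy()             # state sampled at the last triggering instant
u, _ = actor(x_evt)          # control held constant between events

for t in range(500):
    # Placeholder plant: stable linear dynamics with small noise.
    x = 0.9 * x + 0.1 * float(u[0]) + 0.01 * rng.normal(size=S)
    r = float(x @ x + u @ u)                     # stage cost

    # Event trigger: learn only when the state has drifted far enough
    # from its last-event value; noise-level changes trigger nothing.
    if np.linalg.norm(x - x_evt) <= THRESH:
        continue

    # Critic: semi-gradient descent on 0.5 * delta^2, where
    # delta = r + GAMMA * J(x', u') - J(x, u) is the TD error on J.
    J_prev, hc_prev = critic(x_evt, u)
    u_next, _ = actor(x)
    J_next, _ = critic(x, u_next)
    delta = r + GAMMA * J_next - J_prev
    g = Wc2.ravel() * (1.0 - hc_prev**2)         # backprop through tanh
    Wc2 += LR_C * delta * hc_prev[None, :]
    Wc1 += LR_C * delta * np.outer(g, np.concatenate([x_evt, u]))

    # Actor: descend J at the new state by chaining dJ/du through the critic.
    u_new, ha = actor(x)
    J_val, hc = critic(x, u_new)
    dJ_du = (Wc2.ravel() * (1.0 - hc**2)) @ Wc1[:, S:]   # shape (A,)
    dJ_dv = dJ_du * (1.0 - u_new**2)                     # through output tanh
    Wa2 -= LR_A * np.outer(dJ_dv, ha)
    Wa1 -= LR_A * np.outer((Wa2.T @ dJ_dv) * (1.0 - ha**2), x)

    x_evt = x.copy()                             # record the event instant
    u, _ = actor(x_evt)                          # new held control

The design point of the trigger is the one the abstract motivates: between events the control is held constant and no weights are updated, so computation is spent only when the state change is informative, while noise-level fluctuations below the threshold cause no learning at all.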