Intelligent virtual agent, learning how to reach a goal by making the least number of compromises

Budakova, D. V.; Petrova-Dimitrova, V. S.; Dakovski, L. G.

Authors: Budakova, D. V.; Petrova-Dimitrova, V. S.; Dakovski, L. G.
Title: Intelligent virtual agent, learning how to reach a goal by making the least number of compromises
Keywords: Intelligent system, Reinforcement learning, Smart shopping-c

Abstract: Learning in the Q-learning algorithm is characterized by the maximization of a single numerical reward signal. However, some tasks place complex requirements on how the goal is reached. This paper proposes a modification of the Q-learning algorithm. So that the Q-learning agent can find the optimal path to the goal while meeting particular complex criteria, a measures model (a model of the environment criteria), represented as a new memory matrix, is introduced. If the goal cannot be reached while following the pre-set criteria, the learning agent can compromise on a given criterion. The agent makes the fewest possible trade-offs needed to reach the goal. If the criteria are ranked by importance, the agent can instead choose a larger number of more acceptable compromises. The aim of the modification is to give the learning agent control over how a goal is reached. The modified algorithm has been applied to the training of smart agents.
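
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: this record does not give the details of the algorithm, so everything below is an assumption made for illustration. The environment is a small directed graph (`edges`), the measures model is reduced to a hypothetical table `violations` that counts how many criteria each transition breaks, and the agent receives a compromise budget that starts at zero and is raised one step at a time until tabular Q-learning finds a path to the goal. The learning rate is fixed at 1, which is adequate for a deterministic environment. Ranking the criteria by importance is not modelled.

```python
import random


def learn_path(edges, violations, start, goal, max_budget=3,
               episodes=500, gamma=0.8, seed=0):
    """Return (path, compromises) with the fewest criteria violations, or None.

    edges:      dict mapping a state to the list of states reachable from it.
    violations: hypothetical stand-in for the measures model; maps a
                transition (state, next_state) to the number of criteria
                that the transition breaks (missing entries mean 0).
    """
    rng = random.Random(seed)

    def moves(state, used, budget):
        # Transitions that keep the running violation count within budget.
        options = []
        for nxt in edges.get(state, []):
            total = used + violations.get((state, nxt), 0)
            if total <= budget:
                options.append((nxt, total))
        return options

    def best_value(q, state, used, budget):
        # Highest Q-value currently known from (state, used); 0.0 if none.
        return max((q.get((state, used, nxt), 0.0)
                    for nxt, _ in moves(state, used, budget)), default=0.0)

    for budget in range(max_budget + 1):
        q = {}  # Q-table keyed by (state, compromises used, next state)

        # Training: random exploration from the start state.
        for _ in range(episodes):
            state, used = start, 0
            for _ in range(50):  # step cap, in case the graph has cycles
                options = moves(state, used, budget)
                if state == goal or not options:
                    break
                nxt, total = rng.choice(options)
                if nxt == goal:
                    q[(state, used, nxt)] = 100.0
                else:
                    q[(state, used, nxt)] = gamma * best_value(q, nxt, total, budget)
                state, used = nxt, total

        # Greedy read-out: follow the largest positive Q-value.
        path, state, used = [start], start, 0
        while state != goal:
            scored = [(q.get((state, used, nxt), 0.0), nxt, total)
                      for nxt, total in moves(state, used, budget)]
            if not scored or max(scored)[0] <= 0.0:
                break  # goal not reachable within this budget
            _, state, used = max(scored)
            path.append(state)
        if state == goal:
            return path, used
    return None


# Two routes from 0 to 4: 0-1-3-4 breaks one criterion, 0-2-4 breaks two.
edges = {0: [1, 2], 1: [3], 2: [4], 3: [4]}
violations = {(1, 3): 1, (2, 4): 2}
print(learn_path(edges, violations, start=0, goal=4))
```

In this toy graph no route satisfies every criterion, so the search fails with a budget of zero and succeeds with a budget of one. It returns the longer route 0-1-3-4 with a single compromise rather than the shorter route 0-2-4, which would need two.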

References

Sutton R. S., Barto A. G., 2014, Reinforcement Learning: An Introduction, Online, England, MIT Press
Gosavi A., 2008, Reinforcement Learning: A Tutorial Survey and Recent Advances, INFORMS Journal on Computing, Vol. 21, No. 2, pp. 178-192
Torrado R. R., Bontrager P., Togelius J., Liu J., Perez-Liebana D., 2018, Deep Reinforcement Learning for General Video Game AI, Maastricht, Netherlands, August 2018, IEEE Conference on Computational Intelligence and Games
Argall B., 2009, Learning Mobile Robot Motion Control from Demonstration and Corrective Feedback, Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 15213, pp. 8
Amor H. B., Vogt D., Ewerton M., Berger E., Jung B., Peters J., 2013, Learning Responsive Robot Behavior by Imitation, Tokyo, Japan, November 3-7, 2013, IEEE/RSJ International Conference on Intelligent Robots and Systems
Takahashi K., Kim K., Ogata T., Sugano S., 2017, Tool-body assimilation model considering grasping motion through deep learning, Robotics and Autonomous Systems, Elsevier, Vol. 91, pp. 115-127
Moffaert K. V., 2016, Reinforcement Learning for Sequential Decision Making Problems, thesis for the degree of Doctor of Science, Computer Science, Brussels University Press
Natarajan S., Tadepalli P., 2005, Dynamic Preferences in Multi-Criteria Reinforcement Learning, Bonn, Germany, 2005, International Conference on Machine Learning
Budakova D., Dakovski L., 2019, Smart shopping system, Plovdiv, Bulgaria, May 2019, TECHSYS 2019
Budakova D., Dakovski L., Petrova-Dimitrova V., 2019, Smart Shopping Cart Learning Agents Development, Sozopol, Bulgaria, 26-28 September 2019, TECIS 2019, 19th IFAC-PapersOnLine
Budakova D., Dakovski L., Petrova-Dimitrova V., 2019, Smart Shopping Cart Learning Agents, International Journal on Advances in Internet Technology, IARIA, Vol. 12, No. 3&4, pp. 109-121
Shakev N. G., Ahmed S. A., Topalov A. V., Popov V. L., Shiev K. B., 2018, Autonomous Flight Control and Precise Gestural Positioning of a Small Quadrotor, Learning Systems: From Theory to Practice, Springer, pp. 179-197

Issue

TechSys 2020, vol. 878, 2020, Bulgaria, IOP Publishing, ISSN 1757-899X/1757-8981

Type: publication in an international forum, publication in a refereed edition, indexed in Scopus

Publication details from the database of the Technical University of Sofia