Authors: Harbaliev, G., Vasilev, V. E., Budakova, D. V.
Title: An Approach to Modeling and Studying the Behavior of Firefighting Drones Using Unity ML-Agents
Keywords: behavior, firefighting, ML-Agents, PPO, SAC
Abstract: This paper presents an approach to modeling and studying the behavior of a firefighting drone. The Unity ML-Agents Toolkit is used together with two of the advanced Deep Reinforcement Learning algorithms it supports: Soft Actor-Critic (SAC) and Proximal Policy Optimization (PPO). Their effectiveness in training a virtual drone agent to put out fires is studied. For the experiment, fires are spawned at random locations in the virtual training environment. The results show that the trained agents can detect and extinguish most of the fires and exhibit adaptive behavior. It can therefore be concluded that these algorithms are applicable to real-world tasks and that the Unity ML-Agents Toolkit is suitable for conducting such experiments.
References
- www.unity.com - Unity Real-Time Development Platform | 3D, 2D, VR & AR Engine.
- www.unrealengine.com The most powerful real-time 3D creation tool - Unreal Engine.
- https://microsoft.github.io/AirSim/ AirSim is an open-source, cross-platform simulator for drones and cars, built on Unreal Engine.
- https://bisimulations.com/products/vbs-blue-ig VBS Blue IG is a visualization simulator used by the military for tactical training and missions.
- https://www.ossovr.com/ - Osso VR, a surgical training platform that uses virtual reality.
- V. Vasilev, S. Stefanov, “Modeling an outdoor substation with dynamically occurring faults and conducting a preventative inspection,” 12th International Conference “TechSys 2023” – Engineering, Technologies and Systems AIP Conference Proceedings, Volume 3078, Issue 1, 040003 2024. https://doi.org/10.1063/5.0209284
- E. Bondi, A. Kapoor, D. Dey, J. Piavis, Sh. Shah, R. Hannaford, A. Iyer, L. Joppa, M. Tambe, “Near Real-Time Detection of Poachers from Drones in AirSim,” Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI 2018) Demos, pp. 5814-5816, 2018. https://doi.org/10.24963/ijcai.2018/847
- M. G. Cianci, D. Calisi, S. Botta, S. Colaceci, M. Molinari, “Virtual Reality in Future Museums,” Representation Challenges: New Frontiers of AR and AI Research for Cultural Heritage and Innovative Design, 2022. https://doi.org/10.3280/oa-845-c220
- I. Heldal, C. Hammar Wijkmark, L. Pareto, “Simulation and serious games for firefighter training: challenges for effective use,” NOKOBIT, ISSN 1892-0748, E-ISSN 1894-7719, Vol. 24, No. 1, pp. 1-12, 2016.
- https://unity-technologies.github.io/ml-agents/ML-Agents-Overview/ - ML-Agents Overview.
- Bot Academy, Train your first A.I. in Unity | ML-Agents Tutorial 2020, https://www.youtube.com/watch?v=1bn9Lx2DDa0&list=PL8fePt58xRPY1-pkhMPus3GlUGXNdqMH5
- The Ash Bot, How to show ML Agents what to do! || Imitation learning!, https://www.youtube.com/watch?v=1raDh6rpg8U
- Immersive Limit, Unity ML-Agents - Demonstration Recorder for Imitation Learning, https://www.youtube.com/watch?v=Dhr4tHY3joE
- Code Monkey, Teach your AI! Imitation Learning with Unity ML-Agents!, https://www.youtube.com/watch?v=supqT7kqpEI
- Immersive Limit LLC, ML-Agent: Hummingbirds, https://learn.unity.com/course/ml-agents-hummingbirds
- Immersive Limit LLC, Reinforcement Learning: AI Flight with Unity ML-Agents, https://www.udemy.com/course/aiflight/
- J. Schulman, F. Wolski, P. Dhariwal, A. Radford, O. Klimov, “Proximal Policy Optimization Algorithms”, arXiv:1707.06347, 2017, https://doi.org/10.48550/arXiv.1707.06347
- Y. Bai, A. Jones, K. Ndousse, A. Askell, A. Chen, N. DasSarma, D. Drain, S. Fort, D. Ganguli, T. Henighan, et al. “Training a helpful and harmless assistant with reinforcement learning from human feedback.” arXiv preprint arXiv:2204.05862, 2022.
- H. Lee, S. Phatale, H. Mansoor, K. Lu, T. Mesnard, C. Bishop, V. Carbune, A. Rastogi, “RLAIF: Scaling reinforcement learning from human feedback with AI feedback,” CoRR, abs/2309.00267, 2023. https://doi.org/10.48550/arXiv.2309.00267
- Y. Wang, H. Kasaei, “IPPO: Obstacle avoidance for robotic manipulators in joint space via improved proximal policy optimization,” arXiv:2210.00803 [cs.RO], 2023.
- H. Tang and T. Haarnoja, “Learning Diverse Skills via Maximum Entropy Deep Reinforcement Learning,” Berkeley Artificial Intelligence Research (BAIR) Blog, 2017. https://bair.berkeley.edu/blog/2017/10/06/soft-q-learning/
- T. Haarnoja, A. Zhou, K. Hartikainen, G. Tucker, S. Ha, J. Tan, V. Kumar, H. Zhu, A. Gupta, P. Abbeel, S. Levine, “Soft Actor-Critic Algorithms and Applications,” arXiv:1812.05905 [cs.LG], 2019.
- T. Haarnoja, A. Zhou, P. Abbeel, S. Levine, “Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor.” In International Conference on Machine Learning (ICML), 2018.
- L. J. McCunn, R. Gifford, “Place imageability, sense of place, and spatial navigation: A community investigation,” Cities, Volume 115, 103245, ISSN 0264-2751, 2021. https://doi.org/10.1016/j.cities.2021.103245
- J. Zhang, X. Xia, R. Liu, N. Li, “Enhancing human indoor cognitive map development and wayfinding performance with immersive augmented reality-based navigation systems,” Advanced Engineering Informatics, Volume 50, 101432, ISSN 1474-0346, 2021. https://doi.org/10.1016/j.aei.2021.101432
Issue: 2024 12th International Scientific Conference on Computer Science, COMSCI 2024 - Proceedings, 2025, Albania, https://doi.org/10.1109/COMSCI63166.2024.10778513
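The training setup the abstract describes (PPO or SAC on a drone agent via the ML-Agents Toolkit) is driven by a YAML trainer configuration. A minimal sketch is shown below; the behavior name `FirefighterDrone` and all hyperparameter values are illustrative assumptions, not the authors' actual settings:

```yaml
behaviors:
  FirefighterDrone:        # must match the Behavior Name set on the Agent in Unity
    trainer_type: ppo      # change to "sac" to train with Soft Actor-Critic
    hyperparameters:
      batch_size: 1024
      buffer_size: 10240
      learning_rate: 3.0e-4
      beta: 5.0e-3         # entropy regularization strength (PPO)
      epsilon: 0.2         # PPO clipping parameter
      lambd: 0.95          # GAE lambda
      num_epoch: 3
    network_settings:
      hidden_units: 128
      num_layers: 2
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
    max_steps: 500000
    summary_freq: 10000
```

Training is then launched with `mlagents-learn config.yaml --run-id=firefighter_ppo` while the Unity scene runs; switching to SAC additionally replaces the PPO-specific hyperparameters with SAC ones (e.g. `buffer_init_steps`, `tau`) as described in the ML-Agents documentation.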