Authors: Harbaliev, G., Vasilev, V. E., Budakova, D. V.
Title: An Approach to Modeling and Studying the Behavior of Firefighting Drones Using Unity ML-Agents
Keywords: behavior, firefighting, ML-Agents, PPO, SAC
Abstract: This paper presents an approach to modeling and studying the behavior of a firefighting drone. The Unity ML-Agents Toolkit is used together with two of the advanced Deep Reinforcement Learning algorithms it supports: Soft Actor-Critic (SAC) and Proximal Policy Optimization (PPO). Their effectiveness in training a virtual drone agent to put out fires is studied. For the experiment, fires are spawned at random locations in the virtual training environment. The results show that the trained agents can detect and extinguish most of the fires and exhibit adaptive behavior. It can therefore be concluded that these algorithms are applicable to real-world tasks and that the Unity ML-Agents Toolkit is suitable for conducting such experiments.
References
- www.unity.com - Unity Real-Time Development Platform | 3D, 2D, VR & AR Engine.
- www.unrealengine.com The most powerful real-time 3D creation tool - Unreal Engine.
- https://microsoft.github.io/AirSim/ AirSim is an open-source, cross-platform simulator for drones and cars, built on Unreal Engine.
- https://bisimulations.com/products/vbs-blue-ig VBS Blue IG is a visualization simulator used by the military for tactical training and missions.
- https://www.ossovr.com/ - Osso VR, a surgical training platform that uses virtual reality.
- V. Vasilev, S. Stefanov, “Modeling an outdoor substation with dynamically occurring faults and conducting a preventative inspection,” 12th International Conference “TechSys 2023” – Engineering, Technologies and Systems AIP Conference Proceedings, Volume 3078, Issue 1, 040003 2024. https://doi.org/10.1063/5.0209284
- E. Bondi, A. Kapoor, D. Dey, J. Piavis, Sh. Shah, R. Hannaford, A. Iyer, L. Joppa, M. Tambe, “Near Real-Time Detection of Poachers from Drones in AirSim,” Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI 2018) Demos, pp. 5814-5816, 2018. https://doi.org/10.24963/ijcai.2018/847
- M. G. Cianci, D. Calisi, S. Botta, S. Colaceci, M. Molinari, “Virtual Reality in Future Museums,” Representation Challenges: New Frontiers of AR and AI Research for Cultural Heritage and Innovative Design, 2022. https://doi.org/10.3280/oa-845-c220
- I. Heldal, C. Hammar Wijkmark, L. Pareto, “Simulation and serious games for firefighter training: challenges for effective use,” NOKOBIT, ISSN 1892-0748, E-ISSN 1894-7719, Vol. 24, No. 1, pp. 1-12, 2016.
- https://unity-technologies.github.io/ml-agents/ML-Agents-Overview/ - ML-Agents Overview.
- Bot Academy, Train your first A.I. in Unity | ML-Agents Tutorial 2020, https://www.youtube.com/watch?v=1bn9Lx2DDa0&list=PL8fePt58xRPY1-pkhMPus3GlUGXNdqMH5
- The Ash Bot, How to show ML Agents what to do! || Imitation learning!, https://www.youtube.com/watch?v=1raDh6rpg8U
- Immersive Limit, Unity ML-Agents - Demonstration Recorder for Imitation Learning, https://www.youtube.com/watch?v=Dhr4tHY3joE
- Code Monkey, Teach your AI! Imitation Learning with Unity ML-Agents!, https://www.youtube.com/watch?v=supqT7kqpEI
- Immersive Limit LLC, ML-Agent: Hummingbirds, https://learn.unity.com/course/ml-agents-hummingbirds
- Immersive Limit LLC, Reinforcement Learning: AI Flight with Unity ML-Agents, https://www.udemy.com/course/aiflight/
- J. Schulman, F. Wolski, P. Dhariwal, A. Radford, O. Klimov, “Proximal Policy Optimization Algorithms”, arXiv:1707.06347, 2017, https://doi.org/10.48550/arXiv.1707.06347
- Y. Bai, A. Jones, K. Ndousse, A. Askell, A. Chen, N. DasSarma, D. Drain, S. Fort, D. Ganguli, T. Henighan, et al. “Training a helpful and harmless assistant with reinforcement learning from human feedback.” arXiv preprint arXiv:2204.05862, 2022.
- H. Lee, S. Phatale, H. Mansoor, K. Lu, T. Mesnard, C. Bishop, V. Carbune, A. Rastogi, “RLAIF: Scaling reinforcement learning from human feedback with AI feedback,” CoRR, abs/2309.00267, 2023. https://doi.org/10.48550/arXiv.2309.00267
- Y. Wang, H. Kasaei, “IPPO: Obstacle avoidance for robotic manipulators in joint space via improved proximal policy optimization,” arXiv:2210.00803 [cs.RO], 2023.
- H. Tang and T. Haarnoja, “Learning Diverse Skills via Maximum Entropy Deep Reinforcement Learning,” Berkeley Artificial Intelligence Research (BAIR) Blog, 2017. https://bair.berkeley.edu/blog/2017/10/06/soft-q-learning/
- T. Haarnoja, A. Zhou, K. Hartikainen, G. Tucker, S. Ha, J. Tan, V. Kumar, H. Zhu, A. Gupta, P. Abbeel, S. Levine, “Soft Actor-Critic Algorithms and Applications,” arXiv:1812.05905 [cs.LG], 2019.
- T. Haarnoja, A. Zhou, P. Abbeel, S. Levine, “Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor.” In International Conference on Machine Learning (ICML), 2018.
- L. J. McCunn, R. Gifford, “Place imageability, sense of place, and spatial navigation: A community investigation,” Cities, Volume 115, 103245, ISSN 0264-2751, 2021. https://doi.org/10.1016/j.cities.2021.103245
- J. Zhang, X. Xia, R. Liu, N. Li, “Enhancing human indoor cognitive map development and wayfinding performance with immersive augmented reality-based navigation systems,” Advanced Engineering Informatics, Volume 50, 101432, ISSN 1474-0346, 2021. https://doi.org/10.1016/j.aei.2021.101432
Issue: 2024 12th International Scientific Conference on Computer Science, COMSCI 2024 - Proceedings, 2025, Albania, https://doi.org/10.1109/COMSCI63166.2024.10778513
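The training setup the abstract describes (PPO or SAC on a drone agent via the ML-Agents Toolkit) is driven by a YAML trainer configuration. A minimal sketch is shown below; the behavior name `FirefighterDrone` and all hyperparameter values are illustrative assumptions, not the authors' actual settings:

```yaml
behaviors:
  FirefighterDrone:        # must match the Behavior Name set on the Agent in Unity
    trainer_type: ppo      # change to "sac" to train with Soft Actor-Critic
    hyperparameters:
      batch_size: 1024
      buffer_size: 10240
      learning_rate: 3.0e-4
      beta: 5.0e-3         # entropy regularization strength (PPO)
      epsilon: 0.2         # PPO clipping parameter
      lambd: 0.95          # GAE lambda
      num_epoch: 3
    network_settings:
      hidden_units: 128
      num_layers: 2
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
    max_steps: 500000
    summary_freq: 10000
```

Training is then launched with `mlagents-learn config.yaml --run-id=firefighter_ppo` while the Unity scene runs; switching to SAC additionally replaces the PPO-specific hyperparameters with SAC ones (e.g. `buffer_init_steps`, `tau`) as described in the ML-Agents documentation.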