Deep Reinforcement Learning Implementation for Vessel Stability Calculations
Szynalewski, Mateusz Zdzislaw (2024)
All rights reserved. This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:amk-202404106186
Abstract
This thesis focused on combining ship stability calculations with deep reinforcement learning methods. For the thesis, the author developed proof-of-concept software that enables users to solve vessel stability tasks in manual mode and also allows the use of the Proximal Policy Optimization (PPO) algorithm. The PPO implementation uses the Python 3.11 programming language together with the Stable Baselines 3 package.
Deep reinforcement learning, an optimization framework in which an agent learns from experience and refines a policy to maximize future reward, was discussed in further detail. Attention was also paid to a detailed explanation of the whole development process, the neural network architectures, and the hyperparameters defining the PPO algorithm used in this project.
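To make "refining a policy to maximize a future reward" concrete, the following minimal sketch shows two standard quantities behind PPO: the discounted return and the clipped surrogate objective for a single sample. It is an illustration under stated assumptions, not the thesis's code: the function names are hypothetical, and the values `gamma=0.99` and `clip_range=0.2` are common defaults rather than the hyperparameters chosen in this project.

```python
def discounted_return(rewards, gamma=0.99):
    """Return the sum of rewards, each discounted by gamma per time step.

    `gamma` is an assumed value; the thesis's setting is not given here.
    """
    total = 0.0
    # Walk backwards so each step adds its reward to the discounted future.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def ppo_clipped_objective(ratio, advantage, clip_range=0.2):
    """Return the PPO clipped surrogate term for one sample.

    `ratio` is the new-policy probability divided by the old-policy
    probability for the taken action; `advantage` estimates how much better
    that action was than average. `clip_range` is an assumed value.
    """
    # Keep the ratio inside [1 - clip_range, 1 + clip_range].
    clipped_ratio = max(1.0 - clip_range, min(ratio, 1.0 + clip_range))
    # Taking the minimum removes the incentive for overly large policy updates.
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))
    print(ppo_clipped_objective(ratio=1.5, advantage=2.0))
```

During training, the objective is averaged over a batch of samples and maximized. The clipping limits how far a single update can move the policy away from the one that collected the data.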
In the final chapters, the author demonstrated the software in manual mode on a few example tasks and presented the results of the trained neural network, which achieved an efficiency of 42 percent. The conclusions emphasized the advantages and disadvantages of using machine learning in stability calculations. Finally, the potential for further development of this kind of software, and of the autonomous maritime industry in general, was discussed.