Analyzing the performance of AI summarization with limited resource allocation
Georgiev, Georgi (2023)
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:amk-202304044708
Abstract
The advances in AI text processing in recent years have been significant. However, the resources required to achieve these results have also risen considerably. Natural language processing (NLP) is the field of machine learning that works with text, and AI summarization is a subfield of NLP. The aim of this thesis was to assess whether the NLP summarization task is feasible at all with limited hardware resources and, if so, what its capabilities are.
To achieve that, 11 models in total were prepared for examination. The selection was made after reviewing the available architectures and choosing the best examples of each. Of those 11 models, 6 were fine-tuned for the summarization task, while the rest were kept in their base state; among the fine-tuned models, two versions of T5-Small were used to examine in more detail the optimizations that can be achieved with it.
The collected models were then evaluated by comparing their hardware utilization and ROUGE scores, and an efficiency score for each model was calculated from those values. Finally, the produced summaries were reviewed, possible improvements were proposed, and potential applications were examined.
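To make the evaluation concrete, the ROUGE comparison described above can be sketched in a few lines. This is a minimal, simplified ROUGE-1 (unigram overlap, no stemming), and the memory figure and quality-per-gigabyte efficiency formula are illustrative assumptions, not values or methods taken from the thesis.

```python
from collections import Counter

def rouge1_f(reference: str, candidate: str) -> float:
    """ROUGE-1 F-measure from unigram overlap (simplified, no stemming)."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

reference = "the cat sat on the mat"
candidate = "the cat was on the mat"
score = rouge1_f(reference, candidate)

# Hypothetical efficiency metric: summary quality per gigabyte of peak memory.
peak_memory_gb = 2.5  # assumed hardware-utilization measurement
efficiency = score / peak_memory_gb
print(round(score, 3), round(efficiency, 3))
```

In practice a library such as rouge-score would be used instead of a hand-rolled metric, since it also handles ROUGE-2, ROUGE-L, and stemming.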