Topic Modeling of StormFront Forum Posts
Nazarko, Grigorii (2021)
All rights reserved. This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:amk-2021060313973
Abstract
Research on radical communities is crucial for preventing violent acts and for steering those communities away from further radicalisation. In this thesis, we propose a way to analyse the semantic topics discussed on StormFront, the oldest right-wing forum.
We obtained two million forum posts from 2015 to 2020 and applied several NLP techniques for topic modelling. Of these, Latent Dirichlet Allocation (LDA) gave the best results. To validate the value and sensibility of the framework, human experts assessed the connection between real-world events and the model’s output. The validation showed that the framework is correct and valuable for analysing topics and the associated discussions on StormFront.
The thesis consists of two parts: the formal thesis and an associated conference paper (Nazarko, Frank & Westerlund 2021).