Using Machine Learning to Predict On-Time Delivery
Guimaraes de Sousa, Debora (2022)
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:amk-2022120125496
Abstract
The objective of this study is to create an ML algorithm to predict On-Time Delivery (OTD) for Wärtsilä Global Logistics (WGLS). WGLS uses OTD to measure service efficiency and improve customer satisfaction, but it has no advanced tools to predict OTD.
This thesis followed an applied research approach with mixed research methods. It relied mainly on qualitative data gathered through internal document analysis, observations, and interviews, but it also used quantitative data to develop the ML algorithm.
The proposal was built on the literature review and the current state analysis. The data was extracted from EDW. The pre-processing stage included data collection, outlier exclusion, missing data handling, feature encoding, data imbalance analysis, feature scaling, and correlation analysis. After pre-processing, the data was used to train seven baseline ML algorithms: Random Forest, Neural Network, Bagging Classifier, Support Vector Machine, Gradient Boosting Machines, Logistic Regression, and K-nearest neighbour. The three best algorithms were then tuned by testing different hyperparameters and feature selections. The best model after tuning was the Random Forest.
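Some of the pre-processing steps named above can be illustrated with a minimal sketch. The abstract does not say how each step was carried out, so everything here is an assumption made for illustration: the column names (`transit_days`, `mode`, `on_time`), the toy records, the 1.5 x IQR outlier rule, min-max scaling, and one-hot encoding are not taken from the thesis.

```python
from collections import Counter
from statistics import quantiles

def exclude_outliers_iqr(rows, key, k=1.5):
    """Drop rows whose `key` value lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles([r[key] for r in rows], n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [r for r in rows if low <= r[key] <= high]

def min_max_scale(rows, key):
    """Rescale a numeric feature to the range [0, 1]."""
    values = [r[key] for r in rows]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant features
    return [{**r, key: (r[key] - lo) / span} for r in rows]

def one_hot(rows, key):
    """Replace a categorical feature with one 0/1 column per category."""
    categories = sorted({r[key] for r in rows})
    encoded_rows = []
    for r in rows:
        encoded = {k: v for k, v in r.items() if k != key}
        for c in categories:
            encoded[f"{key}_{c}"] = 1 if r[key] == c else 0
        encoded_rows.append(encoded)
    return encoded_rows

def class_balance(rows, label):
    """Share of each class, used to judge how imbalanced the target is."""
    counts = Counter(r[label] for r in rows)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}

# Hypothetical shipment records; on_time = 1 means delivered on time.
shipments = [
    {"transit_days": 3, "mode": "air", "on_time": 1},
    {"transit_days": 4, "mode": "sea", "on_time": 1},
    {"transit_days": 5, "mode": "air", "on_time": 1},
    {"transit_days": 6, "mode": "road", "on_time": 0},
    {"transit_days": 7, "mode": "sea", "on_time": 1},
    {"transit_days": 90, "mode": "sea", "on_time": 0},  # extreme value
]

clean = exclude_outliers_iqr(shipments, "transit_days")
scaled = min_max_scale(clean, "transit_days")
prepared = one_hot(scaled, "mode")
balance = class_balance(prepared, "on_time")
```

In practice these steps would more likely be done with a data-science library, and the prepared table would then be passed to each baseline classifier for comparison.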
The final model was validated using ML metrics and stakeholder feedback. The algorithm was loaded into a Power BI file and shared with the case company.
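The abstract does not list which ML metrics were used for validation. As a hedged illustration only, the sketch below computes four metrics commonly reported for a binary classifier (accuracy, precision, recall, and F1) from true and predicted labels. The label vectors are made up, and treating "on time" (1) as the positive class is an assumption.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn

def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 derived from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(y_true) if y_true else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical labels: 1 = delivered on time, 0 = late.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

When the classes are imbalanced, precision and recall on the late-delivery class are usually more informative than accuracy alone; passing `positive=0` to `classification_metrics` reports them for that class.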
The tool is expected to reduce the number of late deliveries. Late delivery is a problem for both customers and operations, because it leads to more claims and higher costs for the business. On-time delivery is therefore crucial for satisfying customers and maintaining high customer retention rates.