Data analysis theory and practice : Case: Python and Excel Tools
Khadka, Birendra (2019)
All rights reserved. This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:amk-202004235733
Abstract
Data science is a multifaceted field used to gain insights from complex data. The aim of this thesis is to discover how to analyse data with Python and Excel using different datasets. The thesis analyses coronavirus datasets with Python libraries, analyses Excel datasets, and presents a comparative study between the two. Data analytics is the examination of raw data in order to draw conclusions about the information it contains, whereas data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data, and is related to data mining and big data.
This thesis comprises three parts. The theoretical part introduces data analysis, the difference between data analysis and data science, data wrangling, normalization, formatting, binning and exploratory data analysis of the Titanic datasets. The practical part presents a Python-based analysis of the spread of the coronavirus across China, an Excel-based analysis of Excel datasheets, and a comparative study of two datasets using both approaches. Finally, the thesis identifies the preferable data analysis method by describing its features, recommends it to individuals who want to develop a career in data science, and considers the future of data analysis.