Kelly, Mark (2021) Data Analytics Project: Identifying Irish County with Best Quality of Life. Undergraduate thesis, Dublin, National College of Ireland.
Abstract
The Covid-19 pandemic has led several large employers, including Indeed, Siemens, Twitter, Salesforce and Spotify, to allow their employees to work remotely on a more permanent basis. As more companies recognise the benefits of large-scale remote working (for both employee and employer), this list will grow as companies move from forced remote working to smarter working in a post-Covid-19 environment. Many employees no longer need to live in Dublin near their employer and are looking for alternative locations to call home.
The aim of the project was to identify which county in the Republic of Ireland offers the best quality of life, using variables that a typical family would consider. These variables include crime rate, classroom size, property prices and monthly rent costs, and distance to an emergency department, among others. The datasets used are the most recent available.
The project also compares the actual number of crimes with the number predicted by decision tree and random forest algorithms for a sample of counties. The purpose was to determine whether the eight independent variables are sufficient to predict crime rates. The random forest proved the more accurate of the two, but not accurate enough to be considered a good model.
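The comparison of actual and predicted crime counts can be illustrated with a minimal Python sketch. The abstract does not say which error metric the thesis used, so mean absolute error and root mean squared error are assumptions here, and all numbers are hypothetical placeholders rather than results from the thesis.

```python
import math
from typing import Sequence


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error between actual and predicted values."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical placeholder values, NOT figures from the thesis.
actual_crimes = [100.0, 200.0, 300.0]
tree_predictions = [130.0, 170.0, 330.0]    # stand-in for decision tree output
forest_predictions = [110.0, 190.0, 310.0]  # stand-in for random forest output

for name, preds in [("decision tree", tree_predictions),
                    ("random forest", forest_predictions)]:
    print(f"{name}: MAE={mae(actual_crimes, preds):.1f}, "
          f"RMSE={rmse(actual_crimes, preds):.1f}")
```

A lower error means predictions closer to the actual counts. Whether an error is low enough to call the model good depends on the scale of the data.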
The results of the analysis and the associated charts and graphs show the top counties a user should consider for relocation. When all nine variables are used in making the decision, the county with the best quality of life is Dublin. This result is not a true reflection of the best county, as including all nine variables is unrealistic. A more realistic scenario is that an individual would select only the variables relevant to them. This option is explored in the results section using an R Shiny application.
The R Shiny application allows users to pick the variables they would like to include in the analysis, and then computes the county with the best quality of life for that selection. The application also provides summaries per county and per variable.
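One way such a variable-selection ranking could work is sketched below in Python. This is not the thesis's R Shiny code. The scoring method (min-max normalisation averaged over the selected variables), the function names, the county labels and all values are hypothetical assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical placeholder data, NOT figures from the thesis.
# Each variable maps to (higher_is_better, {county: value}).
Variable = Tuple[bool, Dict[str, float]]

DATA: Dict[str, Variable] = {
    "crime_rate": (False, {"County A": 30.0, "County B": 10.0, "County C": 20.0}),
    "monthly_rent": (False, {"County A": 2000.0, "County B": 1000.0, "County C": 1500.0}),
    "class_size": (False, {"County A": 24.0, "County B": 26.0, "County C": 22.0}),
}


def normalise(values: Dict[str, float], higher_is_better: bool) -> Dict[str, float]:
    """Min-max scale values to [0, 1] so that 1 is always the best score."""
    lo, hi = min(values.values()), max(values.values())
    span = hi - lo
    scores = {}
    for county, value in values.items():
        scaled = 0.5 if span == 0 else (value - lo) / span
        scores[county] = scaled if higher_is_better else 1.0 - scaled
    return scores


def rank_counties(data: Dict[str, Variable],
                  selected: List[str]) -> List[Tuple[str, float]]:
    """Rank counties by their mean normalised score over the selected variables."""
    if not selected:
        raise ValueError("select at least one variable")
    totals: Dict[str, float] = {}
    for name in selected:
        higher_is_better, values = data[name]
        for county, score in normalise(values, higher_is_better).items():
            totals[county] = totals.get(county, 0.0) + score
    averaged = [(county, total / len(selected)) for county, total in totals.items()]
    return sorted(averaged, key=lambda pair: pair[1], reverse=True)


# A user who cares only about crime and rent.
print(rank_counties(DATA, ["crime_rate", "monthly_rent"]))
```

Changing the selected variables changes the ranking, which is why a result based on every variable may not suit an individual user.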
Item Type: | Thesis (Undergraduate) |
---|---|
Subjects: | H Social Sciences > HA Statistics; Q Science > QA Mathematics > Electronic computers. Computer science; T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science |
Divisions: | School of Computing > Bachelor of Science (Honours) in Computing |
Depositing User: | Clara Chan |
Date Deposited: | 02 Sep 2021 11:01 |
Last Modified: | 20 Sep 2021 09:49 |
URI: | https://norma.ncirl.ie/id/eprint/5002 |