NORMA eResearch @NCI Library

Analysis of Wildfire Risk Using Machine Learning and Distributed Computing in Canadian Regions

Amadi, Emmanuel (2019) Analysis of Wildfire Risk Using Machine Learning and Distributed Computing in Canadian Regions. Masters thesis, Dublin, National College of Ireland.

PDF (Master of Science), 1MB
PDF (Configuration manual), 2MB

Wildfires present a great danger to human lives and the environment. Detecting wildfires early and limiting their rapid spread are major challenges for some countries, especially during the summer, and both must be addressed to prevent economic, ecological and social damage. Data mining algorithms can be applied to historic and near real-time data to gain insight that helps fire managers predict wildfires, reduce the cost of moving water tankers and heavy fire equipment, and limit the tendency of a fire to spread if it is not quenched in time. The aim of this research is to investigate the use of unsupervised and supervised machine learning algorithms, built on distributed computing, to predict wildfires and stage firefighting assets as close as possible to where wildfires are likely to occur, based on a wildfire dataset. The method followed a knowledge discovery and data mining approach, extracting insight from the NASA wildfire dataset to predict the occurrence of wildfire and reduce computational time. The methodology was implemented by building wildfire models from remote sensing satellite data acquired from the Moderate Resolution Imaging Spectroradiometer (MODIS). Experimental results showed that K-means clustering achieved a silhouette score of 65% and that random forest achieved a reduced RMSE of 0.13 when the task was treated as regression; when treated as classification, the model gave a high prediction accuracy of 97% with a training time of 7 seconds. The results and performance of these models were evaluated using cross-validation, root mean square error (RMSE), R-squared and classification metrics.
Keywords: Wildfire prediction, Machine learning, Random forest, Classification algorithms, K-means clustering, Distributed computing, Regression algorithms.
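
For readers unfamiliar with the evaluation metrics named in the abstract, the following is a minimal sketch of how RMSE, R-squared and the silhouette score are commonly defined. It is not the thesis code: the function names and the toy data are illustrative assumptions, and it uses plain Python on a single machine rather than the distributed setup the thesis describes.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def silhouette_score(points, labels):
    """Mean silhouette coefficient (Euclidean distance).

    Assumes at least two distinct cluster labels are present.
    """
    scores = []
    for i, (p, label) in enumerate(zip(points, labels)):
        # Distances from point i to every other point, grouped by cluster.
        by_cluster = {}
        for j, (q, other) in enumerate(zip(points, labels)):
            if i != j:
                by_cluster.setdefault(other, []).append(math.dist(p, q))
        if label not in by_cluster:
            scores.append(0.0)  # singleton cluster: coefficient taken as 0
            continue
        # a: mean distance to the point's own cluster.
        a = sum(by_cluster[label]) / len(by_cluster[label])
        # b: mean distance to the nearest other cluster.
        b = min(sum(d) / len(d) for c, d in by_cluster.items() if c != label)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the MODIS dataset.
    observed = [1.0, 2.0, 3.0]
    predicted = [1.0, 2.0, 4.0]
    print("RMSE:", rmse(observed, predicted))
    print("R-squared:", r_squared(observed, predicted))

    # Two well-separated groups of 2-D points with their cluster labels.
    points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
    labels = [0, 0, 1, 1]
    print("Silhouette:", silhouette_score(points, labels))
```

A silhouette score close to 1 indicates compact, well-separated clusters, while a lower RMSE and an R-squared closer to 1 indicate a better regression fit.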

Item Type: Thesis (Masters)
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
Q Science > QA Mathematics > Computer software
T Technology > T Technology (General) > Information Technology > Computer software
S Agriculture > SD Forestry
G Geography. Anthropology. Recreation > GE Environmental Sciences > Environment
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Dan English
Date Deposited: 17 Jun 2020 17:13
Last Modified: 17 Jun 2020 17:13
