NORMA eResearch @NCI Library

Outlier Visualization in N-Dimensional Categorical Data Sets

Rogers, John (2016) Outlier Visualization in N-Dimensional Categorical Data Sets. Undergraduate thesis, Dublin, National College of Ireland.

Humans interpret visual images more easily and quickly than the same data in text form. Knowledge contained in Big Data sets would be nearly inaccessible to the casual, or even moderately interested, viewer if it were not visualized. The primary use case of this project is server logs provided by IBM from their SameTime test systems.

Providing a meaningful visualization for a high-dimensional categorical case, such as the primary use case, is particularly challenging for outlier detection. This is because, in high dimensions, the data becomes sparse and all pairs of data points become almost equidistant from one another. By using open-source R and cutting-edge technologies such as Tableau and IBM Watson, this project addresses the challenge of displaying high-dimensional data in a meaningful and informative format.
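
The equidistance effect described above can be illustrated with a minimal sketch. It is not the method used in the thesis: it assumes synthetic, uniformly random categorical records (not the IBM server logs) and uses Hamming distance as a stand-in dissimilarity. The function names, the number of records, and the number of category levels are hypothetical choices made for illustration only. The sketch reports the spread of pairwise distances relative to their mean, which shrinks as dimensions are added.

```python
import itertools
import random
import statistics


def hamming(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def distance_spread(n_dims, n_points=60, n_levels=5, seed=0):
    """Relative spread (population std / mean) of all pairwise Hamming
    distances between uniformly random categorical records.

    Assumption: synthetic data only; n_points, n_levels and seed are
    illustrative defaults, not values taken from the thesis.
    """
    rng = random.Random(seed)
    points = [
        [rng.randrange(n_levels) for _ in range(n_dims)]
        for _ in range(n_points)
    ]
    dists = [hamming(a, b) for a, b in itertools.combinations(points, 2)]
    return statistics.pstdev(dists) / statistics.mean(dists)


if __name__ == "__main__":
    # A smaller spread means the pairwise distances are closer to equal.
    for d in (2, 10, 100, 1000):
        print(f"dims={d:5d}  spread={distance_spread(d):.3f}")
```

When the relative spread is small, an outlier's distance to its neighbours is hard to tell apart from any other pair's distance, which is one reason a plain distance-based view of such data tends to be uninformative.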

Item Type: Thesis (Undergraduate)
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
Q Science > QA Mathematics > Computer software
T Technology > T Technology (General) > Information Technology > Computer software
Divisions: School of Computing > Bachelor of Science (Honours) in Computing
Depositing User: Caoimhe Ní Mhaicín
Date Deposited: 16 Nov 2016 14:27
Last Modified: 16 Nov 2016 14:27
