Abstract
Data mining techniques are used to extract potentially useful knowledge from big data. However, data mining can produce incorrect results when the analyzed data contain outliers. The literature indicates that detecting these outliers can improve the quality of a dataset. Time series data are an important type of data that requires outlier detection for accurate prediction and better decision making. Time series data are valuable because they help explain past behavior, which in turn supports future predictions; it is therefore important to detect outliers in time series datasets. This paper proposes an algorithm for outlier detection in Multivariate Time Series (MTS) data based on a fusion of K-medoid, Standard Euclidean Distance (SED), and Z-score. Apart from SED, experiments were also performed with two other distance metrics, City Block and Euclidean Distance. The performance of Z-score was compared with that of the inter-quartile technique. The results showed that the Z-score technique produced better outlier detection, with an F-measure of 0.9978 compared with 0.8571 for inter-quartile. Furthermore, SED performed better than City Block and Euclidean Distance when combined with either Z-score or inter-quartile.
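The abstract does not spell out how the three components are fused, so the following is only a minimal sketch of one plausible reading: cluster the observations with K-medoid under a variance-standardized Euclidean distance, then flag observations whose distance to their medoid has a high Z-score. The function names, the evenly spaced initial medoids, the threshold of 3.0, and the synthetic data are all assumptions made for illustration, not details taken from the paper.

```python
import math
from statistics import mean, pstdev, pvariance


def standardized_euclidean(a, b, variances):
    """Euclidean distance with each squared difference scaled by the feature variance."""
    return math.sqrt(sum((x - y) ** 2 / v for x, y, v in zip(a, b, variances)))


def k_medoids(points, k, dist, max_iter=100):
    """Alternating k-medoids; initial medoids are evenly spaced indices (an assumption)."""
    n = len(points)
    medoids = [i * n // k for i in range(k)]

    def assign(meds):
        # Label each point with the position of its nearest medoid.
        return [min(range(k), key=lambda c: dist(p, points[meds[c]])) for p in points]

    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # keep the old medoid if a cluster emptied out
                new_medoids.append(medoids[c])
                continue
            # The medoid is the member with the smallest total distance to its cluster.
            new_medoids.append(
                min(members, key=lambda i: sum(dist(points[i], points[j]) for j in members))
            )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(medoids)


def detect_outliers(points, k=2, z_threshold=3.0):
    """Return indices whose distance to their medoid has a Z-score above z_threshold."""
    n_features = len(points[0])
    # Guard against constant features, whose variance would be zero.
    variances = [pvariance([p[f] for p in points]) or 1.0 for f in range(n_features)]

    def dist(a, b):
        return standardized_euclidean(a, b, variances)

    medoids, labels = k_medoids(points, k, dist)
    d = [dist(p, points[medoids[lab]]) for p, lab in zip(points, labels)]
    mu, sigma = mean(d), pstdev(d)
    if sigma == 0:  # all points equally far from their medoids: nothing stands out
        return []
    return [i for i, di in enumerate(d) if (di - mu) / sigma > z_threshold]


# Synthetic two-regime series with one injected anomaly at index 20 (illustrative data only).
series = [(1.0 + 0.1 * i, 2.0 - 0.1 * i) for i in range(10)]
series += [(8.0 + 0.1 * i, 9.0 - 0.1 * i) for i in range(10)]
series.append((1.5, 6.0))
outliers = detect_outliers(series, k=2)
print(outliers)
```

Swapping `standardized_euclidean` for a City Block or plain Euclidean function would give the other two distance variants the abstract mentions, and replacing the Z-score rule with a quartile-based fence would give the inter-quartile comparison.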
| Original language | English |
|---|---|
| Title of host publication | Information and Communication Technology and Applications - Third International Conference, ICTA 2020, Revised Selected Papers |
| Editors | Sanjay Misra, Bilkisu Muhammad-Bello |
| Publisher | Springer Science and Business Media Deutschland GmbH |
| Pages | 259-271 |
| Number of pages | 13 |
| ISBN (Print) | 9783030691424 |
| State | Published - 2021 |
| Externally published | Yes |
| Event | 3rd International Conference on Information and Communication Technology and Applications, ICTA 2020 - Virtual, Online. Duration: 24 Nov 2020 → 27 Nov 2020 |
Publication series
| Name | Communications in Computer and Information Science |
|---|---|
| Volume | 1350 |
| ISSN (Print) | 1865-0929 |
| ISSN (Electronic) | 1865-0937 |
Conference
| Conference | 3rd International Conference on Information and Communication Technology and Applications, ICTA 2020 |
|---|---|
| City | Virtual, Online |
| Period | 24/11/20 → 27/11/20 |
Bibliographical note
Publisher Copyright: © 2020, Springer Nature Switzerland AG.
Keywords
- City block
- Euclidean distance
- K-Medoid
- Multivariate
- Outlier detection
- Outliers
- Time series data
- Z-scores
ASJC Scopus subject areas
- General Computer Science
- General Mathematics