An ML Based Anomaly Detection System in real-time data streams

  • Javier Jose Diaz Rivera
  • Talha Ahmed Khan
  • Waleed Akbar
  • Muhammad Afaq
  • Wang Cheol Song*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

9 Scopus citations

Abstract

Advances in applied machine learning and artificial intelligence have moved network anomaly detection systems beyond traditional signature-based intrusion detection. Nonetheless, as security measures evolve, attackers keep developing more sophisticated attacks. A robust anomaly detection algorithm is therefore not enough on its own; a real-time data feeding mechanism is also required to minimize reaction time. Moreover, DDoS attacks can flood network data channels with thousands of packets per second, overloading most traditional monitoring systems that rely on data storage. For this reason, the research presented in this paper focuses on implementing a real-time data streaming system for network anomaly detection that can operate under a high volume of traffic. The solution deploys a flow collector platform connected to Apache Kafka to receive NetFlow data from network switches. Real-time big data processing is applied through Apache Spark, where the ML anomaly detection is triggered. Anomalies are detected by combining the unsupervised clustering algorithm k-means with the supervised classifier KNN (k-nearest neighbors). Finally, a monitoring system consisting of an ELK stack collects historical data for further evolution of the ML algorithms.
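
The abstract does not say how k-means and KNN are combined, so the following is only a minimal sketch of one plausible arrangement, not the authors' method: k-means groups unlabeled flow records, the smallest cluster is treated as anomalous (a hypothetical heuristic), and KNN then classifies new flows against those pseudo-labels. The feature pair (packets per second, mean bytes per packet), the sample values, and all function names are assumptions made for illustration. Kafka, Spark, and the ELK stack are left out so the example stays self-contained.

```python
import math

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from every centroid chosen so far.
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    assign = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        assign = [min(range(k), key=lambda i: dist(p, centroids[i])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for i in range(k):
            members = [p for p, a in zip(points, assign) if a == i]
            if members:  # keep the old centroid if a cluster empties
                centroids[i] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign

def pseudo_label(assign, k):
    """ASSUMPTION: the smallest cluster is the anomalous one."""
    sizes = [assign.count(i) for i in range(k)]
    rare = sizes.index(min(sizes))
    return ["anomaly" if a == rare else "normal" for a in assign]

def knn_predict(train, labels, query, k=3):
    """Majority vote among the k training points nearest to the query."""
    nearest = sorted(range(len(train)), key=lambda i: dist(train[i], query))[:k]
    votes = [labels[i] for i in nearest]
    return max(set(votes), key=votes.count)

# Hypothetical flow features: (packets per second, mean bytes per packet).
flows = [
    (40, 480), (55, 510), (60, 495), (45, 520),
    (50, 500), (65, 505), (52, 490), (48, 515),   # ordinary traffic
    (5000, 60), (5200, 64), (4900, 58),           # flood-like traffic
]

labels = pseudo_label(kmeans(flows, 2), 2)
print(knn_predict(flows, labels, (4800, 70)))
print(knn_predict(flows, labels, (58, 500)))
```

In a streaming deployment the clustering would typically be refreshed offline on historical data while only the KNN lookup runs per incoming flow; that split is likewise an assumption of this sketch. The features here are left unscaled, so packets per second dominates the distance. Real flow features would normally be normalized first.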

Original language: English
Title of host publication: Proceedings - 2021 International Conference on Computational Science and Computational Intelligence, CSCI 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1329-1334
Number of pages: 6
ISBN (Electronic): 9781665458412
State: Published - 2021
Externally published: Yes
Event: 2021 International Conference on Computational Science and Computational Intelligence, CSCI 2021 - Las Vegas, United States
Duration: 15 Dec 2021 to 17 Dec 2021

Publication series

Name: Proceedings - 2021 International Conference on Computational Science and Computational Intelligence, CSCI 2021

Conference

Conference: 2021 International Conference on Computational Science and Computational Intelligence, CSCI 2021
Country/Territory: United States
City: Las Vegas
Period: 15/12/21 to 17/12/21

Bibliographical note

Publisher Copyright:
© 2021 IEEE.

Keywords

  • Anomaly Detection
  • Big Data
  • Data Streams
  • Machine Learning
  • NetFlow

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications
  • Computer Vision and Pattern Recognition
  • Signal Processing
  • Safety, Risk, Reliability and Quality
