Big Data Forensics: Hadoop Distributed File Systems as a Case Study

Mohammed Asim, Dean Richard McKinnel, Ali Dehghantanha*, Reza M. Parizi, Mohammad Hammoudeh, Gregory Epiphaniou

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Chapter; peer-review

9 Scopus citations

Abstract

Big Data has fast become one of the most widely adopted paradigms in computer science and is considered equally challenging for forensic investigators. The Hadoop Distributed File System (HDFS) is one of the most favoured big data platforms on the market, providing an unparalleled service with regard to parallel processing and data analytics. However, HDFS is not without its risks, having reportedly been targeted by cyber criminals as a means of stealing and exfiltrating confidential data. Using HDFS as a case study, we aim to detect remnants of malicious users’ activities within the HDFS environment. Our examination involves a thorough analysis of different areas of the HDFS environment, including a range of log files and disk images. Our experimental environment comprised four virtual machines, all running Ubuntu. This research provides a thorough understanding of the types of forensically relevant artefacts that are likely to be found during a forensic investigation of HDFS.

Original language: English
Title of host publication: Handbook of Big Data and IoT Security
Publisher: Springer International Publishing
Pages: 179-210
Number of pages: 32
ISBN (Electronic): 9783030105433
ISBN (Print): 9783030105426
State: Published - 1 Jan 2019
Externally published: Yes

Bibliographical note

Publisher Copyright:
© Springer Nature Switzerland AG 2019.

Keywords

  • Big data
  • Digital forensics
  • Distributed file systems
  • HDFS
  • Hadoop

ASJC Scopus subject areas

  • General Computer Science
