Mining criminal networks from unstructured text documents

  • Rabeah Al-Zaidy
  • Benjamin C.M. Fung*
  • Amr M. Youssef
  • Francis Fortin

*Corresponding author for this work

Research output: Contribution to journal; Article; peer-review

49 Scopus citations

Abstract

Digital data collected for forensics analysis often contain valuable information about the suspects' social networks. However, most collected records are in the form of unstructured textual data, such as e-mails, chat messages, and text documents. An investigator often has to manually extract the useful information from the text and then enter the important pieces into a structured database for further investigation by using various criminal network analysis tools. Obviously, this information extraction process is tedious and error-prone. Moreover, the quality of the analysis varies with the experience and expertise of the investigator. In this paper, we propose a systematic method to discover criminal networks from a collection of text documents obtained from a suspect's machine, extract useful information for investigation, and then visualize the suspect's criminal network. Furthermore, we present a hypothesis generation approach to identify potential indirect relationships among the members in the identified networks. We evaluated the effectiveness and performance of the method on a real-life cybercrime case and some other datasets. The proposed method, together with the implemented software tool, has received positive feedback from the digital forensics team of a law enforcement unit in Canada.

Original language: English
Pages (from-to): 147-160
Number of pages: 14
Journal: Digital Investigation
Volume: 8
Issue number: 3-4
State: Published - Feb 2012
Externally published: Yes

Bibliographical note

Funding Information:
The authors would like to thank the anonymous reviewers for their constructive comments that greatly helped improve this paper. The research is supported in part by research grants from Le Fonds québécois de la recherche sur la nature et les technologies (FQRNT) new researchers start-up program, Concordia ENCS seed funding program, and the National Cyber-Forensics and Training Alliance Canada (NCFTA Canada).

Keywords

  • Criminal network
  • Data mining
  • Forensic analysis
  • Hypothesis generation
  • Information retrieval

ASJC Scopus subject areas

  • Pathology and Forensic Medicine
  • Information Systems
  • Computer Science Applications
  • Medical Laboratory Technology
  • Law
