Context-Aware Spam Detection Using BERT Embeddings with Multi-Window CNNs

  • Sajid Ali
  • Qazi Mazhar Ul Haq*
  • Ala Saleh Alluhaidan*
  • Muhammad Shahid Anwar
  • Sadique Ahmad
  • Leila Jamel

* Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

Abstract

Spam emails remain one of the most persistent threats to digital communication, necessitating effective detection solutions that safeguard both individuals and organisations. We propose a spam email classification framework that uses Bidirectional Encoder Representations from Transformers (BERT) for contextual feature extraction and a multi-window Convolutional Neural Network (CNN) for classification. BERT embeddings capture semantic nuances in email content, while CNN filters extract discriminative n-gram patterns at various levels of detail, enabling accurate spam identification. On a sample of 5728 labelled emails, the proposed model outperformed Word2Vec-based baselines, achieving an accuracy of 98.69%, an AUC of 0.9981, an F1 score of 0.9724, and a Matthews correlation coefficient (MCC) of 0.9639. A medium kernel size of (6, 9), combined with a compact multi-window CNN architecture, improves performance. Cross-validation indicates stability and generalization across folds. By balancing high recall with minimal false positives, our method provides a reliable and scalable deep learning solution for spam detection. By combining contextual embeddings with a convolutional neural architecture, this study contributes a method for security analysis.
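
To make the multi-window idea concrete, the following is a minimal pure-Python sketch and not the authors' implementation. It assumes that the kernel size (6, 9) denotes two convolution branches with window widths of 6 and 9 tokens, that each branch is followed by ReLU and max-over-time pooling, and that the pooled features are concatenated and passed to a single sigmoid unit. The token embeddings are random stand-ins for BERT outputs, the weights are untrained, and all function names and dimensions are hypothetical.

```python
import math
import random


def conv1d_max_pool(embeddings, filters, biases, window):
    """Slide each filter over `window` consecutive token vectors,
    apply ReLU, and keep the maximum activation over all positions."""
    seq_len = len(embeddings)
    pooled = []
    for filt, bias in zip(filters, biases):
        best = 0.0  # ReLU floor; also the result if the text is shorter than the window
        for start in range(seq_len - window + 1):
            act = bias
            for offset in range(window):
                token = embeddings[start + offset]
                act += sum(w * x for w, x in zip(filt[offset], token))
            best = max(best, act)  # same as max over relu(act) because best starts at 0
        pooled.append(best)
    return pooled


def multi_window_features(embeddings, branches):
    """Concatenate pooled features from every (window, filters, biases) branch."""
    features = []
    for window, filters, biases in branches:
        features.extend(conv1d_max_pool(embeddings, filters, biases, window))
    return features


def spam_probability(features, weights, bias):
    """Single sigmoid output unit over the concatenated features."""
    logit = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-logit))


def random_branch(rng, window, n_filters, dim):
    """Untrained branch: each filter has shape (window, dim)."""
    filters = [[[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                for _ in range(window)]
               for _ in range(n_filters)]
    return window, filters, [0.0] * n_filters


rng = random.Random(0)
SEQ_LEN, DIM, N_FILTERS = 20, 8, 4  # toy sizes, chosen only for illustration

# Stand-in for contextual embeddings: one vector per token of an email.
token_embeddings = [[rng.gauss(0.0, 1.0) for _ in range(DIM)]
                    for _ in range(SEQ_LEN)]

# Assumption: window widths 6 and 9, read from the "(6, 9)" kernel size.
branches = [random_branch(rng, w, N_FILTERS, DIM) for w in (6, 9)]

features = multi_window_features(token_embeddings, branches)
dense_weights = [rng.uniform(-0.1, 0.1) for _ in features]
p_spam = spam_probability(features, dense_weights, 0.0)
print(len(features), p_spam)
```

Each branch sees n-grams of a different length, so concatenating the pooled outputs gives the classifier evidence at more than one granularity. In practice the embeddings would come from a pretrained BERT encoder and the filters would be learned with a deep learning framework rather than sampled at random.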

Original language: English
Article number: 43
Journal: CMES - Computer Modeling in Engineering and Sciences
Volume: 146
Issue number: 1
DOIs
State: Published - 2026

Bibliographical note

Publisher Copyright:
Copyright © 2026 The Authors. Published by Tech Science Press.

Keywords

  • BERT embedding
  • CNN
  • cybersecurity
  • E-mail spam detection
  • text classification

ASJC Scopus subject areas

  • Software
  • Modeling and Simulation
  • Computer Science Applications
