Abstract
Countering online hate speech is essential for creating a safer digital space where positive interactions can thrive. As central hubs of global communication, social media platforms require effective moderation through explainable and affective computing approaches. This study introduces a novel artificial intelligence-driven system for detecting misogynistic discourse. We collected 11,245 YouTube video uniform resource locators using specific keywords, then extracted audio to create Urdu transcripts and transliterated them into Roman Urdu, resulting in two distinct datasets. Various feature sets were explored using classical machine learning and deep learning algorithms. The results showed that classical models achieved 0.90 accuracy on the Urdu dataset, while deep learning models reached 0.96 accuracy on the Roman Urdu dataset. The corpus is publicly available to promote transparency and further research. Comparative evaluations against an existing English hate speech dataset demonstrate the effectiveness of the proposed approach. This work lays the foundation for more ethical and transparent content moderation systems.
| Original language | English |
|---|---|
| Pages (from-to) | 29-40 |
| Number of pages | 12 |
| Journal | IEEE Intelligent Systems |
| Volume | 40 |
| Issue number | 6 |
| State | Published - 2025 |
Bibliographical note
Publisher Copyright: © 2001-2011 IEEE.
ASJC Scopus subject areas
- Computer Networks and Communications
- Artificial Intelligence