Enhancing security in text-to-SQL systems: A novel dataset and agent-based framework

  • Salmane Chafik*
  • Saad Ezzini
  • Ismail Berrada

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

This paper explores the significant advancements in generating Structured Query Language (SQL) from natural language, primarily driven by Large Language Models (LLMs). These advancements have led to the development of sophisticated text-to-SQL integrated applications, which let users unfamiliar with SQL syntax query a database (DB) in natural language. However, reliance on LLMs exposes these applications to attacks, either through malicious prompts or through models compromised with malicious data during the training phase. A successful attack poses severe risks, including unauthorized data access or even complete DB destruction. To address these concerns, we introduce a novel large-scale dataset comprising malicious and safe prompts along with their corresponding SQL queries, enabling model fine-tuning on malicious query detection tasks. Moreover, we propose two transformer-based classification solutions to aid in the detection of malicious attacks. Finally, we present a secure agent-based text-to-SQL architecture that incorporates these solutions, resulting in a 70% security enhancement compared to relying solely on a conventional text-to-SQL model.
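The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of screening both the incoming prompt and the generated SQL before anything reaches the database. Every name here (`prompt_is_malicious`, `sql_is_malicious`, `guarded_query`, `toy_text_to_sql`) is hypothetical, and the two regular-expression checks are simple stand-ins for the transformer-based classifiers the paper describes, not the authors' models.

```python
import re
import sqlite3
from dataclasses import dataclass
from typing import Callable, Optional

# ASSUMPTION: keyword heuristics standing in for the paper's two
# transformer-based classifiers (one for prompts, one for SQL).
_SUSPICIOUS_PROMPT = re.compile(
    r"ignore (all|previous) instructions|drop\s+table|delete\s+all",
    re.IGNORECASE,
)
_NON_READ_ONLY_SQL = re.compile(
    r"^\s*(drop|delete|update|insert|alter|truncate)\b", re.IGNORECASE
)


def prompt_is_malicious(prompt: str) -> bool:
    """Stand-in prompt classifier: flag known injection phrases."""
    return bool(_SUSPICIOUS_PROMPT.search(prompt))


def sql_is_malicious(sql: str) -> bool:
    """Stand-in SQL classifier: flag stacked or non-read-only statements."""
    body = sql.rstrip().rstrip(";")  # tolerate one trailing semicolon
    return ";" in body or bool(_NON_READ_ONLY_SQL.match(body))


@dataclass
class Result:
    allowed: bool
    reason: str
    rows: Optional[list] = None


def guarded_query(
    prompt: str,
    text_to_sql: Callable[[str], str],
    conn: sqlite3.Connection,
) -> Result:
    """Run the prompt check, generate SQL, run the SQL check, then execute."""
    if prompt_is_malicious(prompt):
        return Result(False, "prompt flagged")
    sql = text_to_sql(prompt)
    if sql_is_malicious(sql):
        return Result(False, "generated SQL flagged")
    return Result(True, "ok", conn.execute(sql).fetchall())


def toy_text_to_sql(prompt: str) -> str:
    """Hypothetical generator used only to exercise the pipeline."""
    if "remove" in prompt.lower():
        return "DROP TABLE users"
    return "SELECT name FROM users ORDER BY name"
```

Checking the generated SQL as well as the prompt matters in this sketch because a harmless-looking prompt can still yield a destructive query, for example when the model itself has been compromised during training. The semicolon test is deliberately conservative and would also reject a legitimate query that contains a semicolon inside a string literal.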

Original language: English
Pages (from-to): 1399-1422
Number of pages: 24
Journal: Natural Language Processing
Volume: 31
Issue number: 6
State: Published - 1 Nov 2025

Bibliographical note

Publisher Copyright:
© 2025 Cambridge University Press. All rights reserved.

Keywords

  • prompt injection
  • text-to-SQL
  • text-to-SQL integrated applications

ASJC Scopus subject areas

  • Software
  • Language and Linguistics
  • Linguistics and Language
  • Artificial Intelligence
