Abstract
Software vulnerabilities pose significant threats to on-premise and cloud servers and applications. Numerous studies have focused on identifying and addressing software vulnerabilities at the binary level. Traditional approaches often involve highly complicated static and dynamic analysis techniques, while current intelligent methods are not explainable, leaving reverse engineers unable to validate the detected vulnerabilities. In this paper, we propose VulEXplaineR, an XAI method for vulnerability detection on assembly code. It uses BERT for block embedding, augmented with the TFIDF of blocks and operand type information, to provide an effective vulnerability detection and explanation framework. VulEXplaineR takes a trained GCNN and its predictions and returns an explanation in the form of a small subgraph of the input graph. It is based on PGExplainer, a perturbation-based global explanation model for GNNs. We augment the edge distribution with edge features that encode intra-function jumps between blocks or inter-function calls between functions. Experimental results on the NDSS2018 and Juliet Test datasets demonstrate that VulEXplaineR outperforms current state-of-the-art baselines in vulnerability detection. Unlike the baseline models, VulEXplaineR provides a high level of explainability as a complementary aid to reverse engineers, supporting more accurate function analysis. We measure fidelity to quantify how closely the prediction on the extracted subgraph matches the prediction on the original graph. Furthermore, we conduct a case study showing that VulEXplaineR not only identifies the functions and basic blocks that cause a vulnerability, but also highlights the interdependencies between those functions and blocks.
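The fidelity measure mentioned in the abstract can be illustrated with a minimal sketch. It assumes fidelity is the fraction of samples for which the model's predicted label on the explanation subgraph equals its predicted label on the original graph; the abstract does not give the exact definition used in the paper, and the function name `fidelity` and the example labels below are hypothetical.

```python
from typing import Sequence


def fidelity(original_preds: Sequence[int], subgraph_preds: Sequence[int]) -> float:
    """Fraction of samples whose predicted label is unchanged when the
    model sees only the explanation subgraph (assumed definition)."""
    if len(original_preds) != len(subgraph_preds):
        raise ValueError("prediction lists must have the same length")
    if not original_preds:
        raise ValueError("at least one prediction is required")
    # Count positions where both predictions agree.
    matches = sum(1 for o, s in zip(original_preds, subgraph_preds) if o == s)
    return matches / len(original_preds)


# Hypothetical labels: 1 = vulnerable, 0 = benign.
original = [1, 0, 1, 1, 0]   # predictions on the full input graphs
explained = [1, 0, 0, 1, 0]  # predictions on the extracted subgraphs
score = fidelity(original, explained)
print(score)
```

A score close to 1 under this definition would indicate that the small subgraph alone is usually enough to reproduce the model's decision.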
| Original language | English |
|---|---|
| Title of host publication | Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track - European Conference, ECML PKDD 2024, Proceedings |
| Editors | Albert Bifet, Tomas Krilavičius, Ioanna Miliou, Slawomir Nowaczyk |
| Publisher | Springer Science and Business Media Deutschland GmbH |
| Pages | 3-20 |
| Number of pages | 18 |
| ISBN (Print) | 9783031703775 |
| State | Published - 2024 |
| Externally published | Yes |
| Event | European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2024 - Vilnius, Lithuania. Duration: 9 Sep 2024 → 13 Sep 2024 |
Publication series
| Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
|---|---|
| Volume | 14949 LNAI |
| ISSN (Print) | 0302-9743 |
| ISSN (Electronic) | 1611-3349 |
Conference
| Conference | European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2024 |
|---|---|
| Country/Territory | Lithuania |
| City | Vilnius |
| Period | 9/09/24 → 13/09/24 |
Bibliographical note
Publisher Copyright: © Crown 2024.
Keywords
- BERT
- TFIDF
- Vulnerability
- assembly code
- block embedding
- explainability
- graph neural network
- subgraph
ASJC Scopus subject areas
- Theoretical Computer Science
- General Computer Science
Fingerprint
Dive into the research topics of 'VulEXplaineR: XAI for Vulnerability Detection on Assembly Code'. Together they form a unique fingerprint.