BETAC: Bidirectional Encoder Transformer for Assembly Code Function Name Recovery

  • Guillaume Breyton
  • Mohd Saqib
  • Benjamin C.M. Fung
  • Philippe Charland

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Recovering function names from stripped binaries is a crucial and time-consuming task for software reverse engineering, particularly in enhancing network reliability, resilience, and security. This paper tackles the challenge of recovering function names in stripped binaries, a fundamental step in reverse engineering. The absence of syntactic information and the possibility of different code producing identical behavior complicate this task. To overcome these challenges, we introduce a novel model, the Bidirectional Encoder Transformer for Assembly Code (BETAC), leveraging a transformer-based architecture known for effectively processing sequential data. BETAC utilizes self-attention mechanisms and feed-forward networks to discern complex relationships within assembly code for precise function name prediction. We evaluated BETAC against various existing encoder and decoder models on diverse binary datasets, including benign and malicious code in multiple formats. Our model demonstrated superior performance over previous techniques in certain metrics and showed resilience against code obfuscation.

Original language: English
Title of host publication: 20th International Conference on the Design of Reliable Communication Networks, DRCN 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350348972
DOIs
State: Published - 2024
Externally published: Yes
Event: 20th International Conference on the Design of Reliable Communication Networks, DRCN 2024 - Montreal, Canada
Duration: 6 May 2024 to 9 May 2024

Publication series

Name: 20th International Conference on the Design of Reliable Communication Networks, DRCN 2024

Conference

Conference: 20th International Conference on the Design of Reliable Communication Networks, DRCN 2024
Country/Territory: Canada
City: Montreal
Period: 6/05/24 to 9/05/24

Bibliographical note

Publisher Copyright:
© 2024 IEEE.

Keywords

  • assembly code
  • binaries
  • CodeBERT
  • Reverse engineering automation
  • summarization
  • Transformers

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Science Applications
  • Hardware and Architecture
  • Information Systems and Management
  • Safety, Risk, Reliability and Quality
  • Health Informatics
