Static malware detection and attribution in android byte-code through an end-to-end deep system

Muhammad Amin*, Tamleek Ali Tanveer, Mohammad Tehseen, M. Khan, Fakhri Alam Khan, S. Anwar

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

80 Scopus citations

Abstract

Android represents a revolution in handheld and mobile devices. It is an open-source, virtual-machine-based mobile platform that powers millions of smartphones and other devices, and an even larger number of applications in its ecosystem. In its short lifespan, Android has also seen a colossal expansion in application malware, with 99% of all smartphone malware found in the Android ecosystem. Consequently, quite a few techniques have been proposed in the literature for analyzing and detecting malicious applications on the Android platform. The growing and increasingly diverse nature of Android malware has greatly reduced the usefulness of prevailing malware detectors, leaving Android users susceptible to novel malware. As a remedy to this problem, we propose in this paper an anti-malware system that uses customized, sufficiently deep learning models: end-to-end deep learning architectures that detect and attribute Android malware via opcodes extracted from application bytecode. Our results show that bidirectional long short-term memory (BiLSTM) neural networks can detect the static behavior of Android malware, beating state-of-the-art models without using handcrafted features. In our experiments we also work with distinct and independent deep learning models, leveraging sequence specialists such as recurrent neural networks, long short-term memory networks, and their bidirectional variant, as well as more conventional neural architectures such as fully connected networks, deep convolutional networks, Diabolo networks (autoencoders), and generative graphical models such as deep belief networks, for static malware analysis on Android. To test our system, we have also built an augmented bytecode dataset from three open and independently maintained state-of-the-art datasets. Our bytecode dataset, which is larger by an order of magnitude, is sufficient for our experiments. Our results suggest that the proposed system can lead to better design of malware detectors, as we report an accuracy of 0.999 and an F1-score of 0.996 on a large dataset of more than 1.8 million Android applications.
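
The abstract does not describe how opcode sequences are turned into model input or how the reported scores are computed. The following is a minimal Python sketch of one common way to do both, not the authors' pipeline. The function names, the reserved `PAD`/`UNK` IDs, the fixed `max_len`, and the toy opcode lists and labels are all assumptions made for illustration; only the Dalvik mnemonics themselves (`const/4`, `invoke-virtual`, and so on) are real.

```python
from collections import Counter

PAD, UNK = 0, 1  # assumed reserved IDs: padding and opcodes unseen at training time


def build_vocab(sequences):
    """Map each opcode mnemonic to an integer ID (2 and up), most frequent first."""
    counts = Counter(op for seq in sequences for op in seq)
    # Sort by descending frequency, then alphabetically, so the mapping is deterministic.
    ordered = sorted(counts, key=lambda op: (-counts[op], op))
    return {op: i + 2 for i, op in enumerate(ordered)}


def encode(seq, vocab, max_len):
    """Turn one opcode sequence into a fixed-length ID list (truncate, then pad)."""
    ids = [vocab.get(op, UNK) for op in seq[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


def accuracy_and_f1(y_true, y_pred):
    """Binary accuracy and F1, with label 1 meaning 'malware'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when there are no true positives.
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return accuracy, f1


# Toy opcode sequences standing in for two disassembled applications (hypothetical data).
apps = [
    ["const/4", "invoke-virtual", "move-result", "return-void"],
    ["invoke-virtual", "if-eqz", "invoke-virtual", "return-void"],
]
vocab = build_vocab(apps)
encoded = [encode(seq, vocab, max_len=6) for seq in apps]

# Toy labels and predictions (hypothetical), three malware samples out of eight.
acc, f1 = accuracy_and_f1([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 0, 1])
```

Fixed-length integer sequences like `encoded` are the usual input to an embedding layer followed by a recurrent model such as a BiLSTM. Reporting F1 alongside accuracy matters when benign and malicious samples are imbalanced: in the toy labels above, accuracy is 0.75 while F1 is about 0.67.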

Original language: English
Pages (from-to): 112-126
Number of pages: 15
Journal: Future Generation Computer Systems
Volume: 102
State: Published - Jan 2020
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2019 Elsevier B.V.

Keywords

  • Android and big data
  • Deep neural networks
  • End-to-end architecture
  • Malware analysis

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications
