Memoized semantics-based binary diffing with application to malware lineage inference

Jiang Ming, Dongpeng Xu, Dinghao Wu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Scopus citations

Abstract

Identifying differences between two executable binaries (binary diffing) has compelling security applications, such as software vulnerability exploration, “1-day” exploit generation and software plagiarism detection. Recently, binary diffing based on symbolic execution and constraint solving has been proposed to find code pairs with the same semantics even when they differ syntactically. Such a logic-based method captures the intrinsic differences of binary code, making it a natural choice for analyzing highly obfuscated malicious programs. However, semantics-based binary diffing suffers from a significant performance slowdown, which hinders its use on large-scale malware samples. In this paper, we attempt to mitigate the high overhead of semantics-based binary diffing with application to malware lineage inference. We first study the key obstacles that contribute to the performance bottleneck. Then we propose fast basic block matching to speed up semantics-based binary diffing. We introduce a union-find set structure that records semantically equivalent basic blocks. Managing the union-find structure during successive comparisons allows direct reuse of previously computed results. Moreover, we propose to concretize symbolic formulas and cache equivalence queries to further reduce the number of constraint solver invocations. We have implemented our technique on top of iBinHunt and evaluated it on 12 malware families with respect to the performance improvement when performing intra-family comparisons. Our experimental results show that our methods accelerate symbolic execution by 2.8x to 5.3x (4.0x on average) and reduce constraint solver invocations by a factor of 3.0 to 6.0 (4.3 on average).
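
The memoization idea in the abstract (a union-find structure for blocks already proven equivalent, plus a cache of earlier equivalence queries) can be illustrated with a minimal sketch. This is not the authors' iBinHunt implementation: the class `BlockMatcher`, the `SEMANTICS` table, and the oracle `toy_solver_equiv` are hypothetical names introduced here, and the oracle is only a stand-in for the expensive symbolic-execution and constraint-solver check.

```python
# Hypothetical stand-in data: block id -> label for what the block computes.
# In a real system this knowledge would come from symbolic execution.
SEMANTICS = {"a": "double", "b": "double", "c": "double", "d": "increment"}


def toy_solver_equiv(x, y):
    """Placeholder for the costly solver query (an assumption, not a real API)."""
    return SEMANTICS[x] == SEMANTICS[y]


class BlockMatcher:
    """Union-find over basic blocks, with a cache of negative query results."""

    def __init__(self, solver_equiv):
        self.solver_equiv = solver_equiv
        self.parent = {}
        self.size = {}
        self.known_different = set()  # unordered pairs proven non-equivalent
        self.solver_calls = 0

    def find(self, block):
        # Register unseen blocks as singleton sets.
        self.parent.setdefault(block, block)
        self.size.setdefault(block, 1)
        root = block
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path at the root.
        while self.parent[block] != root:
            self.parent[block], block = root, self.parent[block]
        return root

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        if self.size[rx] < self.size[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx  # attach the smaller set under the larger one
        self.size[rx] += self.size[ry]

    def equivalent(self, x, y):
        if self.find(x) == self.find(y):
            return True  # reuse of an earlier positive result, no solver call
        key = frozenset((x, y))
        if key in self.known_different:
            return False  # reuse of an earlier negative result
        self.solver_calls += 1
        if self.solver_equiv(x, y):
            self.union(x, y)
            return True
        self.known_different.add(key)
        return False


matcher = BlockMatcher(toy_solver_equiv)
queries = [("a", "b"), ("b", "c"), ("a", "c"), ("a", "d"), ("d", "a")]
results = [matcher.equivalent(x, y) for x, y in queries]
print(results, "solver calls:", matcher.solver_calls)
```

In this toy run, the third query is answered from the union-find structure by transitivity and the fifth from the negative cache, so only three of the five queries reach the stand-in solver. Concretizing symbolic formulas, the other optimization the abstract names, is not modeled here.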

Original language: English (US)
Title of host publication: ICT Systems Security and Privacy Protection - 30th IFIP TC 11 International Conference, SEC 2015, Proceedings
Editors: Hannes Federrath, Dieter Gollmann
Publisher: Springer New York LLC
Pages: 416-430
Number of pages: 15
ISBN (Print): 9783319184661
DOIs: https://doi.org/10.1007/978-3-319-18467-8_28
State: Published - Jan 1 2015
Event: 30th IFIP TC 11 International Information Security and Privacy Conference, SEC 2015 - Hamburg, Germany
Duration: May 26 2015 to May 28 2015

Publication series

Name: IFIP Advances in Information and Communication Technology
Volume: 455
ISSN (Print): 1868-4238

Other

Other: 30th IFIP TC 11 International Information Security and Privacy Conference, SEC 2015
Country: Germany
City: Hamburg
Period: 5/26/15 to 5/28/15

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Computer Networks and Communications
  • Information Systems and Management

Cite this

Ming, J., Xu, D., & Wu, D. (2015). Memoized semantics-based binary diffing with application to malware lineage inference. In H. Federrath & D. Gollmann (Eds.), ICT Systems Security and Privacy Protection - 30th IFIP TC 11 International Conference, SEC 2015, Proceedings (pp. 416-430). (IFIP Advances in Information and Communication Technology; Vol. 455). Springer New York LLC. https://doi.org/10.1007/978-3-319-18467-8_28