VMhunt: A verifiable approach to partially-virtualized binary code simplification

Dongpeng Xu, Jiang Ming, Yu Fu, Dinghao Wu

Research output: Chapter in Book/Report/Conference proceedingConference contribution

4 Citations (Scopus)

Abstract

Code virtualization is a highly sophisticated obfuscation technique adopted by malware authors to stay under the radar. However, the increasing complexity of code virtualization also becomes a “double-edged sword” for practical application. Due to its performance limitations and compatibility problems, code virtualization is seldom used on an entire program. Rather, it is mainly used only to safeguard the key parts of code such as security checks and encryption keys. Many techniques have been proposed to reverse engineer the virtualized code, but they share some common limitations. They assume the scope of virtualized code is known in advance and mainly focus on the classic structure of code emulator. Also, few work verifies the correctness of their deobfuscation results. In this paper, with fewer assumptions on the type and scope of code virtualization, we present a verifiable method to address the challenge of partially-virtualized binary code simplification. Our key insight is that code virtualization is a kind of process-level virtual machine (VM), and the context switch patterns when entering and exiting the VM can be used to detect the VM boundaries. Based on the scope of VM boundary, we simplify the virtualized code. We first ignore all the instructions in a given virtualized snippet that do not affect the final result of that snippet. To better revert the data obfuscation effect that encodes a variable through bitwise operations, we then run a new symbolic execution called multiple granularity symbolic execution to further simplify the trace snippet. The generated concise symbolic formulas facilitate the correctness testing of our simplification results. We have implemented our idea as an open source tool, VMHunt, and evaluated it with real-world applications and malware. The encouraging experimental results demonstrate that VMHunt is a significant improvement over the state of the art.

Original languageEnglish (US)
Title of host publicationCCS 2018 - Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security
PublisherAssociation for Computing Machinery
Pages442-458
Number of pages17
ISBN (Electronic)9781450356930
DOIs
StatePublished - Oct 15 2018
Event25th ACM Conference on Computer and Communications Security, CCS 2018 - Toronto, Canada
Duration: Oct 15 2018 → …

Publication series

NameProceedings of the ACM Conference on Computer and Communications Security
ISSN (Print)1543-7221

Other

Other25th ACM Conference on Computer and Communications Security, CCS 2018
CountryCanada
CityToronto
Period10/15/18 → …

Fingerprint

Binary codes
Cryptography
Radar
Switches
Virtualization
Engineers
Virtual machine
Testing

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Networks and Communications

Cite this

Xu, D., Ming, J., Fu, Y., & Wu, D. (2018). VMhunt: A verifiable approach to partially-virtualized binary code simplification. In CCS 2018 - Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security (pp. 442-458). (Proceedings of the ACM Conference on Computer and Communications Security). Association for Computing Machinery. https://doi.org/10.1145/3243734.3243827
Xu, Dongpeng ; Ming, Jiang ; Fu, Yu ; Wu, Dinghao. / VMhunt : A verifiable approach to partially-virtualized binary code simplification. CCS 2018 - Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. Association for Computing Machinery, 2018. pp. 442-458 (Proceedings of the ACM Conference on Computer and Communications Security).
@inproceedings{314c6b66ce5b49468442144b39b4cac7,
title = "VMhunt: A verifiable approach to partially-virtualized binary code simplification",
abstract = "Code virtualization is a highly sophisticated obfuscation technique adopted by malware authors to stay under the radar. However, the increasing complexity of code virtualization also becomes a “double-edged sword” for practical application. Due to its performance limitations and compatibility problems, code virtualization is seldom used on an entire program. Rather, it is mainly used only to safeguard the key parts of code such as security checks and encryption keys. Many techniques have been proposed to reverse engineer the virtualized code, but they share some common limitations. They assume the scope of virtualized code is known in advance and mainly focus on the classic structure of code emulator. Also, few work verifies the correctness of their deobfuscation results. In this paper, with fewer assumptions on the type and scope of code virtualization, we present a verifiable method to address the challenge of partially-virtualized binary code simplification. Our key insight is that code virtualization is a kind of process-level virtual machine (VM), and the context switch patterns when entering and exiting the VM can be used to detect the VM boundaries. Based on the scope of VM boundary, we simplify the virtualized code. We first ignore all the instructions in a given virtualized snippet that do not affect the final result of that snippet. To better revert the data obfuscation effect that encodes a variable through bitwise operations, we then run a new symbolic execution called multiple granularity symbolic execution to further simplify the trace snippet. The generated concise symbolic formulas facilitate the correctness testing of our simplification results. We have implemented our idea as an open source tool, VMHunt, and evaluated it with real-world applications and malware. The encouraging experimental results demonstrate that VMHunt is a significant improvement over the state of the art.",
author = "Dongpeng Xu and Jiang Ming and Yu Fu and Dinghao Wu",
year = "2018",
month = "10",
day = "15",
doi = "10.1145/3243734.3243827",
language = "English (US)",
series = "Proceedings of the ACM Conference on Computer and Communications Security",
publisher = "Association for Computing Machinery",
pages = "442--458",
booktitle = "CCS 2018 - Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security",

}

Xu, D, Ming, J, Fu, Y & Wu, D 2018, VMhunt: A verifiable approach to partially-virtualized binary code simplification. in CCS 2018 - Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. Proceedings of the ACM Conference on Computer and Communications Security, Association for Computing Machinery, pp. 442-458, 25th ACM Conference on Computer and Communications Security, CCS 2018, Toronto, Canada, 10/15/18. https://doi.org/10.1145/3243734.3243827

VMhunt : A verifiable approach to partially-virtualized binary code simplification. / Xu, Dongpeng; Ming, Jiang; Fu, Yu; Wu, Dinghao.

CCS 2018 - Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. Association for Computing Machinery, 2018. p. 442-458 (Proceedings of the ACM Conference on Computer and Communications Security).

Research output: Chapter in Book/Report/Conference proceedingConference contribution

TY - GEN

T1 - VMhunt

T2 - A verifiable approach to partially-virtualized binary code simplification

AU - Xu, Dongpeng

AU - Ming, Jiang

AU - Fu, Yu

AU - Wu, Dinghao

PY - 2018/10/15

Y1 - 2018/10/15

N2 - Code virtualization is a highly sophisticated obfuscation technique adopted by malware authors to stay under the radar. However, the increasing complexity of code virtualization also becomes a “double-edged sword” for practical application. Due to its performance limitations and compatibility problems, code virtualization is seldom used on an entire program. Rather, it is mainly used only to safeguard the key parts of code such as security checks and encryption keys. Many techniques have been proposed to reverse engineer the virtualized code, but they share some common limitations. They assume the scope of virtualized code is known in advance and mainly focus on the classic structure of code emulator. Also, few work verifies the correctness of their deobfuscation results. In this paper, with fewer assumptions on the type and scope of code virtualization, we present a verifiable method to address the challenge of partially-virtualized binary code simplification. Our key insight is that code virtualization is a kind of process-level virtual machine (VM), and the context switch patterns when entering and exiting the VM can be used to detect the VM boundaries. Based on the scope of VM boundary, we simplify the virtualized code. We first ignore all the instructions in a given virtualized snippet that do not affect the final result of that snippet. To better revert the data obfuscation effect that encodes a variable through bitwise operations, we then run a new symbolic execution called multiple granularity symbolic execution to further simplify the trace snippet. The generated concise symbolic formulas facilitate the correctness testing of our simplification results. We have implemented our idea as an open source tool, VMHunt, and evaluated it with real-world applications and malware. The encouraging experimental results demonstrate that VMHunt is a significant improvement over the state of the art.

AB - Code virtualization is a highly sophisticated obfuscation technique adopted by malware authors to stay under the radar. However, the increasing complexity of code virtualization also becomes a “double-edged sword” for practical application. Due to its performance limitations and compatibility problems, code virtualization is seldom used on an entire program. Rather, it is mainly used only to safeguard the key parts of code such as security checks and encryption keys. Many techniques have been proposed to reverse engineer the virtualized code, but they share some common limitations. They assume the scope of virtualized code is known in advance and mainly focus on the classic structure of code emulator. Also, few work verifies the correctness of their deobfuscation results. In this paper, with fewer assumptions on the type and scope of code virtualization, we present a verifiable method to address the challenge of partially-virtualized binary code simplification. Our key insight is that code virtualization is a kind of process-level virtual machine (VM), and the context switch patterns when entering and exiting the VM can be used to detect the VM boundaries. Based on the scope of VM boundary, we simplify the virtualized code. We first ignore all the instructions in a given virtualized snippet that do not affect the final result of that snippet. To better revert the data obfuscation effect that encodes a variable through bitwise operations, we then run a new symbolic execution called multiple granularity symbolic execution to further simplify the trace snippet. The generated concise symbolic formulas facilitate the correctness testing of our simplification results. We have implemented our idea as an open source tool, VMHunt, and evaluated it with real-world applications and malware. The encouraging experimental results demonstrate that VMHunt is a significant improvement over the state of the art.

UR - http://www.scopus.com/inward/record.url?scp=85056907796&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85056907796&partnerID=8YFLogxK

U2 - 10.1145/3243734.3243827

DO - 10.1145/3243734.3243827

M3 - Conference contribution

AN - SCOPUS:85056907796

T3 - Proceedings of the ACM Conference on Computer and Communications Security

SP - 442

EP - 458

BT - CCS 2018 - Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security

PB - Association for Computing Machinery

ER -

Xu D, Ming J, Fu Y, Wu D. VMhunt: A verifiable approach to partially-virtualized binary code simplification. In CCS 2018 - Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. Association for Computing Machinery. 2018. p. 442-458. (Proceedings of the ACM Conference on Computer and Communications Security). https://doi.org/10.1145/3243734.3243827