DeepVSA: Facilitating value-set analysis with deep learning for postmortem program analysis

Wenbo Guo, Dongliang Mu, Xinyu Xing, Min Du, Dawn Song

Research output: Chapter in Book/Report/Conference proceedingConference contribution

2 Citations (Scopus)

Abstract

Value set analysis (VSA) is one of the most powerful binary analysis tools, which has been broadly adopted in many use cases, ranging from verifying software properties (e.g., variable range analysis) to identifying software vulnerabilities (e.g., buffer overflow detection). Using it to facilitate data flow analysis in the context of postmortem program analysis, it however exhibits an insufficient capability in handling memory alias identification. Technically speaking, this is due to the fact that VSA needs to infer memory reference based on the context of a control flow, but accidental termination of a running program left behind incomplete control flow information, making memory alias analysis clueless. To address this issue, we propose a new technical approach. At the high level, this approach first employs a layer of instruction embedding along with a bi-directional sequence-to-sequence neural network to learn the machine code pattern pertaining to memory region accesses. Then, it utilizes the network to infer the memory region that VSA fails to recognize. Since the memory references to different regions naturally indicate the non-alias relationship, the proposed neural architecture can facilitate the ability of VSA to perform better alias analysis. Different from previous research that utilizes deep learning for other binary analysis tasks, the neural network proposed in this work is fundamentally novel. Instead of simply using off-the-shelf neural networks, we introduce a new neural network architecture which could capture the data dependency between and within instructions. In this work, we implement our deep neural architecture as DEEPVSA, a neural network assisted alias analysis tool. To demonstrate the utility of this tool, we use it to analyze software crashes corresponding to 40 memory corruption vulnerabilities archived in Offensive Security Exploit Database. We show that, DEEPVSA can significantly improve VSA with respect to its capability in analyzing memory alias and thus escalate the ability of security analysts to pinpoint the root cause of software crashes. In addition, we demonstrate that our proposed neural network outperforms state-of-the-art neural architectures broadly adopted in other binary analysis tasks. Last but not least, we show that DEEPVSA exhibits nearly no false positives when performing alias analysis.

Original languageEnglish (US)
Title of host publicationProceedings of the 28th USENIX Security Symposium
PublisherUSENIX Association
Pages1787-1804
Number of pages18
ISBN (Electronic)9781939133069
StatePublished - Jan 1 2019
Event28th USENIX Security Symposium - Santa Clara, United States
Duration: Aug 14 2019Aug 16 2019

Publication series

NameProceedings of the 28th USENIX Security Symposium

Conference

Conference28th USENIX Security Symposium
CountryUnited States
CitySanta Clara
Period8/14/198/16/19

Fingerprint

Data storage equipment
Neural networks
Flow control
Data flow analysis
Deep learning
Network architecture

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Information Systems
  • Safety, Risk, Reliability and Quality

Cite this

Guo, W., Mu, D., Xing, X., Du, M., & Song, D. (2019). DeepVSA: Facilitating value-set analysis with deep learning for postmortem program analysis. In Proceedings of the 28th USENIX Security Symposium (pp. 1787-1804). (Proceedings of the 28th USENIX Security Symposium). USENIX Association.
Guo, Wenbo ; Mu, Dongliang ; Xing, Xinyu ; Du, Min ; Song, Dawn. / DeepVSA : Facilitating value-set analysis with deep learning for postmortem program analysis. Proceedings of the 28th USENIX Security Symposium. USENIX Association, 2019. pp. 1787-1804 (Proceedings of the 28th USENIX Security Symposium).
@inproceedings{39b221b6af3848d3adc3a903f6a6df0c,
title = "DeepVSA: Facilitating value-set analysis with deep learning for postmortem program analysis",
abstract = "Value set analysis (VSA) is one of the most powerful binary analysis tools, which has been broadly adopted in many use cases, ranging from verifying software properties (e.g., variable range analysis) to identifying software vulnerabilities (e.g., buffer overflow detection). Using it to facilitate data flow analysis in the context of postmortem program analysis, it however exhibits an insufficient capability in handling memory alias identification. Technically speaking, this is due to the fact that VSA needs to infer memory reference based on the context of a control flow, but accidental termination of a running program left behind incomplete control flow information, making memory alias analysis clueless. To address this issue, we propose a new technical approach. At the high level, this approach first employs a layer of instruction embedding along with a bi-directional sequence-to-sequence neural network to learn the machine code pattern pertaining to memory region accesses. Then, it utilizes the network to infer the memory region that VSA fails to recognize. Since the memory references to different regions naturally indicate the non-alias relationship, the proposed neural architecture can facilitate the ability of VSA to perform better alias analysis. Different from previous research that utilizes deep learning for other binary analysis tasks, the neural network proposed in this work is fundamentally novel. Instead of simply using off-the-shelf neural networks, we introduce a new neural network architecture which could capture the data dependency between and within instructions. In this work, we implement our deep neural architecture as DEEPVSA, a neural network assisted alias analysis tool. To demonstrate the utility of this tool, we use it to analyze software crashes corresponding to 40 memory corruption vulnerabilities archived in Offensive Security Exploit Database. We show that, DEEPVSA can significantly improve VSA with respect to its capability in analyzing memory alias and thus escalate the ability of security analysts to pinpoint the root cause of software crashes. In addition, we demonstrate that our proposed neural network outperforms state-of-the-art neural architectures broadly adopted in other binary analysis tasks. Last but not least, we show that DEEPVSA exhibits nearly no false positives when performing alias analysis.",
author = "Wenbo Guo and Dongliang Mu and Xinyu Xing and Min Du and Dawn Song",
year = "2019",
month = "1",
day = "1",
language = "English (US)",
series = "Proceedings of the 28th USENIX Security Symposium",
publisher = "USENIX Association",
pages = "1787--1804",
booktitle = "Proceedings of the 28th USENIX Security Symposium",

}

Guo, W, Mu, D, Xing, X, Du, M & Song, D 2019, DeepVSA: Facilitating value-set analysis with deep learning for postmortem program analysis. in Proceedings of the 28th USENIX Security Symposium. Proceedings of the 28th USENIX Security Symposium, USENIX Association, pp. 1787-1804, 28th USENIX Security Symposium, Santa Clara, United States, 8/14/19.

DeepVSA : Facilitating value-set analysis with deep learning for postmortem program analysis. / Guo, Wenbo; Mu, Dongliang; Xing, Xinyu; Du, Min; Song, Dawn.

Proceedings of the 28th USENIX Security Symposium. USENIX Association, 2019. p. 1787-1804 (Proceedings of the 28th USENIX Security Symposium).

Research output: Chapter in Book/Report/Conference proceedingConference contribution

TY - GEN

T1 - DeepVSA

T2 - Facilitating value-set analysis with deep learning for postmortem program analysis

AU - Guo, Wenbo

AU - Mu, Dongliang

AU - Xing, Xinyu

AU - Du, Min

AU - Song, Dawn

PY - 2019/1/1

Y1 - 2019/1/1

N2 - Value set analysis (VSA) is one of the most powerful binary analysis tools, which has been broadly adopted in many use cases, ranging from verifying software properties (e.g., variable range analysis) to identifying software vulnerabilities (e.g., buffer overflow detection). Using it to facilitate data flow analysis in the context of postmortem program analysis, it however exhibits an insufficient capability in handling memory alias identification. Technically speaking, this is due to the fact that VSA needs to infer memory reference based on the context of a control flow, but accidental termination of a running program left behind incomplete control flow information, making memory alias analysis clueless. To address this issue, we propose a new technical approach. At the high level, this approach first employs a layer of instruction embedding along with a bi-directional sequence-to-sequence neural network to learn the machine code pattern pertaining to memory region accesses. Then, it utilizes the network to infer the memory region that VSA fails to recognize. Since the memory references to different regions naturally indicate the non-alias relationship, the proposed neural architecture can facilitate the ability of VSA to perform better alias analysis. Different from previous research that utilizes deep learning for other binary analysis tasks, the neural network proposed in this work is fundamentally novel. Instead of simply using off-the-shelf neural networks, we introduce a new neural network architecture which could capture the data dependency between and within instructions. In this work, we implement our deep neural architecture as DEEPVSA, a neural network assisted alias analysis tool. To demonstrate the utility of this tool, we use it to analyze software crashes corresponding to 40 memory corruption vulnerabilities archived in Offensive Security Exploit Database. We show that, DEEPVSA can significantly improve VSA with respect to its capability in analyzing memory alias and thus escalate the ability of security analysts to pinpoint the root cause of software crashes. In addition, we demonstrate that our proposed neural network outperforms state-of-the-art neural architectures broadly adopted in other binary analysis tasks. Last but not least, we show that DEEPVSA exhibits nearly no false positives when performing alias analysis.

AB - Value set analysis (VSA) is one of the most powerful binary analysis tools, which has been broadly adopted in many use cases, ranging from verifying software properties (e.g., variable range analysis) to identifying software vulnerabilities (e.g., buffer overflow detection). Using it to facilitate data flow analysis in the context of postmortem program analysis, it however exhibits an insufficient capability in handling memory alias identification. Technically speaking, this is due to the fact that VSA needs to infer memory reference based on the context of a control flow, but accidental termination of a running program left behind incomplete control flow information, making memory alias analysis clueless. To address this issue, we propose a new technical approach. At the high level, this approach first employs a layer of instruction embedding along with a bi-directional sequence-to-sequence neural network to learn the machine code pattern pertaining to memory region accesses. Then, it utilizes the network to infer the memory region that VSA fails to recognize. Since the memory references to different regions naturally indicate the non-alias relationship, the proposed neural architecture can facilitate the ability of VSA to perform better alias analysis. Different from previous research that utilizes deep learning for other binary analysis tasks, the neural network proposed in this work is fundamentally novel. Instead of simply using off-the-shelf neural networks, we introduce a new neural network architecture which could capture the data dependency between and within instructions. In this work, we implement our deep neural architecture as DEEPVSA, a neural network assisted alias analysis tool. To demonstrate the utility of this tool, we use it to analyze software crashes corresponding to 40 memory corruption vulnerabilities archived in Offensive Security Exploit Database. We show that, DEEPVSA can significantly improve VSA with respect to its capability in analyzing memory alias and thus escalate the ability of security analysts to pinpoint the root cause of software crashes. In addition, we demonstrate that our proposed neural network outperforms state-of-the-art neural architectures broadly adopted in other binary analysis tasks. Last but not least, we show that DEEPVSA exhibits nearly no false positives when performing alias analysis.

UR - http://www.scopus.com/inward/record.url?scp=85075897488&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85075897488&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85075897488

T3 - Proceedings of the 28th USENIX Security Symposium

SP - 1787

EP - 1804

BT - Proceedings of the 28th USENIX Security Symposium

PB - USENIX Association

ER -

Guo W, Mu D, Xing X, Du M, Song D. DeepVSA: Facilitating value-set analysis with deep learning for postmortem program analysis. In Proceedings of the 28th USENIX Security Symposium. USENIX Association. 2019. p. 1787-1804. (Proceedings of the 28th USENIX Security Symposium).