Bidirectional Grammars for Machine-Code Decoding and Encoding

Gang Tan, Greg Morrisett

Research output: Contribution to journalArticle

1 Citation (Scopus)

Abstract

Binary analysis, which analyzes machine code, requires a decoder for converting bits into abstract syntax of machine instructions. Binary rewriting requires an encoder for converting instructions to bits. We propose a domain-specific language that enables the specification of both decoding and encoding in a single bidirectional grammar. With dependent types, a bigrammar enables the extraction of an executable decoder and encoder as well as a correctness proof showing their consistency. The bigrammar DSL is embedded in Coq with machine-checked proofs. We have used the bigrammar DSL to specify the decoding and encoding of subsets of the x86-32 and MIPS instruction sets. We have also extracted an executable decoder and encoder from the x86 bigrammar with competitive performance.

Original languageEnglish (US)
Pages (from-to)257-277
Number of pages21
JournalJournal of Automated Reasoning
Volume60
Issue number3
DOIs
StatePublished - Mar 1 2018

Fingerprint

DSL
Decoding
Specifications

All Science Journal Classification (ASJC) codes

  • Software
  • Computational Theory and Mathematics
  • Artificial Intelligence

Cite this

@article{a3f43c395a184e4c98eb37a201a7899c,
title = "Bidirectional Grammars for Machine-Code Decoding and Encoding",
abstract = "Binary analysis, which analyzes machine code, requires a decoder for converting bits into abstract syntax of machine instructions. Binary rewriting requires an encoder for converting instructions to bits. We propose a domain-specific language that enables the specification of both decoding and encoding in a single bidirectional grammar. With dependent types, a bigrammar enables the extraction of an executable decoder and encoder as well as a correctness proof showing their consistency. The bigrammar DSL is embedded in Coq with machine-checked proofs. We have used the bigrammar DSL to specify the decoding and encoding of subsets of the x86-32 and MIPS instruction sets. We have also extracted an executable decoder and encoder from the x86 bigrammar with competitive performance.",
author = "Gang Tan and Greg Morrisett",
year = "2018",
month = "3",
day = "1",
doi = "10.1007/s10817-017-9429-1",
language = "English (US)",
volume = "60",
pages = "257--277",
journal = "Journal of Automated Reasoning",
issn = "0168-7433",
publisher = "Springer Netherlands",
number = "3",

}

Bidirectional Grammars for Machine-Code Decoding and Encoding. / Tan, Gang; Morrisett, Greg.

In: Journal of Automated Reasoning, Vol. 60, No. 3, 01.03.2018, p. 257-277.

Research output: Contribution to journalArticle

TY - JOUR

T1 - Bidirectional Grammars for Machine-Code Decoding and Encoding

AU - Tan, Gang

AU - Morrisett, Greg

PY - 2018/3/1

Y1 - 2018/3/1

N2 - Binary analysis, which analyzes machine code, requires a decoder for converting bits into abstract syntax of machine instructions. Binary rewriting requires an encoder for converting instructions to bits. We propose a domain-specific language that enables the specification of both decoding and encoding in a single bidirectional grammar. With dependent types, a bigrammar enables the extraction of an executable decoder and encoder as well as a correctness proof showing their consistency. The bigrammar DSL is embedded in Coq with machine-checked proofs. We have used the bigrammar DSL to specify the decoding and encoding of subsets of the x86-32 and MIPS instruction sets. We have also extracted an executable decoder and encoder from the x86 bigrammar with competitive performance.

AB - Binary analysis, which analyzes machine code, requires a decoder for converting bits into abstract syntax of machine instructions. Binary rewriting requires an encoder for converting instructions to bits. We propose a domain-specific language that enables the specification of both decoding and encoding in a single bidirectional grammar. With dependent types, a bigrammar enables the extraction of an executable decoder and encoder as well as a correctness proof showing their consistency. The bigrammar DSL is embedded in Coq with machine-checked proofs. We have used the bigrammar DSL to specify the decoding and encoding of subsets of the x86-32 and MIPS instruction sets. We have also extracted an executable decoder and encoder from the x86 bigrammar with competitive performance.

UR - http://www.scopus.com/inward/record.url?scp=85028970859&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85028970859&partnerID=8YFLogxK

U2 - 10.1007/s10817-017-9429-1

DO - 10.1007/s10817-017-9429-1

M3 - Article

VL - 60

SP - 257

EP - 277

JO - Journal of Automated Reasoning

JF - Journal of Automated Reasoning

SN - 0168-7433

IS - 3

ER -