TY - JOUR
T1 - Deviation-based obfuscation-resilient program equivalence checking with application to software plagiarism detection
AU - Ming, Jiang
AU - Zhang, Fangfang
AU - Wu, Dinghao
AU - Liu, Peng
AU - Zhu, Sencun
N1 - Funding Information:
This work was supported in part by the National Science Foundation under Grant CCF-1320605 and Grant CNS-1223710 and by the Office of Naval Research under Grant N00014-13-1-0175 and Grant N00014-16-1-2265. The work of P. Liu was supported in part by the ARO under Grant W911NF-09-1-0525 and Grant W911NF-13-1-0421 and by the National Science Foundation under Grant CNS-1422594 and Grant CNS-1505664.
Publisher Copyright:
© 1963-2012 IEEE.
PY - 2016/12
Y1 - 2016/12
N2 - Software plagiarism, the act of illegally copying others' code, has become a serious concern for honest software companies and the open source community. Considerable research effort has been dedicated to searching for evidence of software plagiarism. In this paper, we continue this line of research and propose LoPD, a deviation-based program equivalence checking approach, which is an ideal fit for whole-program plagiarism detection. Instead of directly comparing the similarity between two programs, LoPD searches for any dissimilarity between them by finding an input that causes the two programs to behave differently, either with different output states or with semantically different execution paths. If we can find even one dissimilarity, the programs are semantically different; if we cannot find any, a plagiarism case is more likely. We leverage dynamic symbolic execution to capture the semantics of execution paths and to find path deviations. Compared with existing detection approaches, LoPD's formal program semantics-based method is more resilient to automatic obfuscation schemes. Our evaluation results indicate that LoPD is effective in detecting whole-program plagiarism. Furthermore, we demonstrate that LoPD can be applied to partial software plagiarism detection as well. The encouraging experimental results show that LoPD is an appealing complement to existing software plagiarism detection approaches.
UR - http://www.scopus.com/inward/record.url?scp=84974829842&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84974829842&partnerID=8YFLogxK
U2 - 10.1109/TR.2016.2570554
DO - 10.1109/TR.2016.2570554
M3 - Article
AN - SCOPUS:84974829842
SN - 0018-9529
VL - 65
SP - 1647
EP - 1664
JO - IEEE Transactions on Reliability
JF - IEEE Transactions on Reliability
IS - 4
M1 - 7490384
ER -