Program logic based software plagiarism detection

Fangfang Zhang, Dinghao Wu, Peng Liu, Sencun Zhu

Research output: Contribution to journal > Conference article

14 Citations (Scopus)

Abstract

Software plagiarism, an act of illegally copying others' code, has become a serious concern for honest software companies and the open source community. In this paper, we propose LoPD, a program logic based approach to software plagiarism detection. Instead of directly comparing the similarity between two programs, LoPD searches for any dissimilarity between two programs by finding an input that will cause these two programs to behave differently, either with different output states or with semantically different execution paths. As long as we can find one dissimilarity, the programs are semantically different, but if we cannot find any dissimilarity, it is likely a plagiarism case. We leverage symbolic execution and weakest precondition reasoning to capture the semantics of execution paths and to find path dissimilarities. LoPD is more resilient to current automatic obfuscation techniques, compared to the existing detection mechanisms. In addition, since LoPD is a formal program semantics-based method, it can provide a guarantee of resilience against many known obfuscation attacks. Our evaluation results indicate that LoPD is both effective and efficient in detecting software plagiarism.
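The dissimilarity search described in the abstract can be illustrated with a minimal sketch. It is not the LoPD implementation: it handles only straight-line paths, and it replaces symbolic execution and a constraint solver with bounded enumeration over a small input domain. The names `wp`, `run`, `find_dissimilarity`, and the three example paths are hypothetical, introduced only for illustration. The sketch applies the standard assignment rule for weakest preconditions, wp(x := e, Q) = Q[e/x], and reports the first input for which the second path fails to reproduce the output of the first.

```python
from itertools import product

# A straight-line path is a list of assignments (variable, expression).
# Each expression is a function from a state dict to a value.

def wp(path, post):
    """Weakest precondition of `post` over a straight-line path.

    Applies wp(x := e, Q) = Q[e/x] from the last statement back to the first.
    Predicates are represented semantically, as functions from state to bool.
    """
    for var, expr in reversed(path):
        post = (lambda q, v, e: lambda s: q({**s, v: e(s)}))(post, var, expr)
    return post

def run(path, state):
    """Execute the path concretely and return the final state."""
    state = dict(state)
    for var, expr in path:
        state[var] = expr(state)
    return state

def find_dissimilarity(path_a, path_b, inputs, domain, out="out"):
    """Return an input on which path_b's output differs from path_a's, or None.

    Assumption: bounded enumeration stands in for a theorem prover, so a
    result of None only means no difference exists within `domain`.
    """
    for values in product(domain, repeat=len(inputs)):
        s0 = dict(zip(inputs, values))
        expected = run(path_a, s0)[out]
        # Postcondition: the output equals what path_a computes for this input.
        post = lambda s, expected=expected: s[out] == expected
        if not wp(path_b, post)(s0):
            return s0
    return None

# Hypothetical example paths.
original = [("out", lambda s: (s["x"] + s["y"]) * 2)]
obfuscated = [  # same semantics, different instructions
    ("t", lambda s: s["x"] << 1),
    ("u", lambda s: s["y"] + s["y"]),
    ("out", lambda s: s["t"] + s["u"]),
]
different = [("out", lambda s: s["x"] * 2 + s["y"])]

domain = range(-4, 5)
same = find_dissimilarity(original, obfuscated, ["x", "y"], domain)
diff = find_dissimilarity(original, different, ["x", "y"], domain)
print("obfuscated copy:", same)
print("independent code:", diff)
```

In this sketch the obfuscated path yields no distinguishing input within the domain, which corresponds to the "likely a plagiarism case" outcome, while the semantically different path is separated by a single counterexample.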

Original language: English (US)
Article number: 6982615
Pages (from-to): 66-77
Number of pages: 12
Journal: Proceedings - International Symposium on Software Reliability Engineering, ISSRE
DOI: 10.1109/ISSRE.2014.18
State: Published - Dec 11 2014
Event: 25th IEEE International Symposium on Software Reliability Engineering, ISSRE 2014 - Naples, Italy
Duration: Nov 3 2014 to Nov 6 2014

Fingerprint

Semantics
Copying
Industry

All Science Journal Classification (ASJC) codes

  • Software
  • Safety, Risk, Reliability and Quality

Cite this

@article{2d062253549a4fe2ad8174449edff471,
title = "Program logic based software plagiarism detection",
author = "Fangfang Zhang and Dinghao Wu and Peng Liu and Sencun Zhu",
year = "2014",
month = "12",
day = "11",
doi = "10.1109/ISSRE.2014.18",
language = "English (US)",
pages = "66--77",
journal = "Proceedings - International Symposium on Software Reliability Engineering, ISSRE",
issn = "1071-9458",

}

TY - JOUR

T1 - Program logic based software plagiarism detection

AU - Zhang, Fangfang

AU - Wu, Dinghao

AU - Liu, Peng

AU - Zhu, Sencun

PY - 2014/12/11

Y1 - 2014/12/11

N2 - Software plagiarism, an act of illegally copying others' code, has become a serious concern for honest software companies and the open source community. In this paper, we propose LoPD, a program logic based approach to software plagiarism detection. Instead of directly comparing the similarity between two programs, LoPD searches for any dissimilarity between two programs by finding an input that will cause these two programs to behave differently, either with different output states or with semantically different execution paths. As long as we can find one dissimilarity, the programs are semantically different, but if we cannot find any dissimilarity, it is likely a plagiarism case. We leverage symbolic execution and weakest precondition reasoning to capture the semantics of execution paths and to find path dissimilarities. LoPD is more resilient to current automatic obfuscation techniques, compared to the existing detection mechanisms. In addition, since LoPD is a formal program semantics-based method, it can provide a guarantee of resilience against many known obfuscation attacks. Our evaluation results indicate that LoPD is both effective and efficient in detecting software plagiarism.

UR - http://www.scopus.com/inward/record.url?scp=84928689035&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84928689035&partnerID=8YFLogxK

U2 - 10.1109/ISSRE.2014.18

DO - 10.1109/ISSRE.2014.18

M3 - Conference article

SP - 66

EP - 77

JO - Proceedings - International Symposium on Software Reliability Engineering, ISSRE

JF - Proceedings - International Symposium on Software Reliability Engineering, ISSRE

SN - 1071-9458

M1 - 6982615

ER -