Reconciling malware labeling discrepancy via consensus learning

Ting Wang, Xin Hu, Shicong Meng, Reiner Sailer

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Anti-virus systems developed by different vendors often demonstrate strong discrepancy in the labels they assign to given malware, which significantly hinders threat intelligence sharing. The key challenge of addressing this discrepancy stems from the difficulty of re-standardizing already-in-use systems. In this paper we explore a non-intrusive alternative. We propose to leverage the correlation between the malware labels of different anti-virus systems to create a 'consensus' classification system, through which different systems can share information without modifying their own labeling conventions. To this end, we present a novel classification integration framework Latin which exploits the correspondence between participating anti-virus systems as reflected in heterogeneous information at instance-instance, instance-class, and class-class levels. We provide results from extensive experimental studies using real datasets and concrete use cases to verify the efficacy of Latin in reconciling the malware labeling discrepancy.

Original language: English (US)
Title of host publication: 2014 IEEE 30th International Conference on Data Engineering Workshops, ICDEW 2014
Publisher: IEEE Computer Society
Pages: 84-89
Number of pages: 6
ISBN (Print): 9781479934805
DOIs: https://doi.org/10.1109/ICDEW.2014.6818308
Publication status: Published - Jan 1 2014
Event: 2014 IEEE 30th International Conference on Data Engineering Workshops, ICDEW 2014 - Chicago, IL, United States
Duration: Mar 31 2014 to Apr 4 2014

Publication series

Name: Proceedings - International Conference on Data Engineering
ISSN (Print): 1084-4627

All Science Journal Classification (ASJC) codes

  • Software
  • Signal Processing
  • Information Systems

Cite this

Wang, T., Hu, X., Meng, S., & Sailer, R. (2014). Reconciling malware labeling discrepancy via consensus learning. In 2014 IEEE 30th International Conference on Data Engineering Workshops, ICDEW 2014 (pp. 84-89). [6818308] (Proceedings - International Conference on Data Engineering). IEEE Computer Society. https://doi.org/10.1109/ICDEW.2014.6818308