Log2vec: A heterogeneous graph embedding based approach for detecting cyber threats within enterprise

Fucheng Liu, Xihe Jiang, Yu Wen, Xinyu Xing, Dongxue Zhang, Dan Meng

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Conventional insider attacks and emerging advanced persistent threats (APTs) are both major threats to organizational information systems. Existing detection methods concentrate mainly on user behavior and usually analyze logs that record users' operations in an information system. Most of these methods consider the sequential relationships among log entries and model users' sequential behavior. However, they ignore other relationships, which leads to unsatisfactory performance across diverse attack scenarios. We propose log2vec, a modularized method based on heterogeneous graph embedding. First, a heuristic approach converts log entries into a heterogeneous graph according to the diverse relationships among them. Next, an improved graph embedding suited to this heterogeneous graph automatically represents each log entry as a low-dimensional vector. The third component of log2vec is a practical detection algorithm that separates malicious and benign log entries into different clusters and identifies the malicious ones. We implement a prototype of log2vec. Our evaluation demonstrates that log2vec remarkably outperforms state-of-the-art approaches, such as deep learning and hidden Markov models (HMMs). In addition, log2vec can detect malicious events in a variety of attack scenarios.
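
The three components described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the log entries, the two graph-construction rules, the edge-type weights, the neighbour-propagation embedding, and the rule that flags small clusters are all assumptions made for illustration. They stand in for the paper's heuristic rules, improved graph embedding, and detection algorithm, none of which are specified in the abstract.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical log entries: (entry_id, user, host, hour, operation).
LOGS = [
    (0, "alice", "pc1", 9, "logon"),
    (1, "alice", "pc1", 10, "file_open"),
    (2, "alice", "pc1", 11, "logoff"),
    (3, "bob", "pc2", 9, "logon"),
    (4, "bob", "pc2", 10, "file_open"),
    (5, "bob", "pc2", 17, "logoff"),
    (6, "alice", "pc9", 2, "usb_connect"),
]

# Assumed per-edge-type weights; the abstract does not specify any.
EDGE_WEIGHT = {"sequence": 1.0, "same_op": 0.5}


def build_graph(logs):
    """Step 1: link log entries with typed edges using two toy rules."""
    graph = defaultdict(dict)  # entry_id -> {neighbour_id: edge_type}

    def link(a, b, kind):
        graph[a][b] = kind
        graph[b][a] = kind

    by_user, by_op = defaultdict(list), defaultdict(list)
    for entry in logs:
        _ = graph[entry[0]]  # register the node even if it stays isolated
        by_user[entry[1]].append(entry)
        by_op[entry[4]].append(entry)
    # Rule A (assumed): consecutive entries of one user on the same host.
    for entries in by_user.values():
        entries.sort(key=lambda e: e[3])
        for prev, cur in zip(entries, entries[1:]):
            if prev[2] == cur[2]:
                link(prev[0], cur[0], "sequence")
    # Rule B (assumed): entries of different users sharing an operation type.
    for entries in by_op.values():
        for i, a in enumerate(entries):
            for b in entries[i + 1:]:
                if a[1] != b[1]:
                    link(a[0], b[0], "same_op")
    return graph


def embed(graph, dim=4, rounds=30):
    """Step 2: propagate vectors along weighted edges (stand-in embedding)."""
    nodes = sorted(graph)
    # Deterministic start: a basis vector picked by node id, no randomness.
    vecs = {n: [1.0 if k == n % dim else 0.0 for k in range(dim)] for n in nodes}
    for _ in range(rounds):
        nxt = {}
        for n in nodes:
            acc = list(vecs[n])
            for m, kind in graph[n].items():
                w = EDGE_WEIGHT[kind]
                acc = [a + w * x for a, x in zip(acc, vecs[m])]
            nxt[n] = acc
        vecs = nxt
    return vecs


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))


def detect(vecs, threshold=0.9, max_suspicious_size=1):
    """Step 3: greedy cosine clustering; small clusters are flagged (assumption)."""
    clusters = []
    for n in sorted(vecs):
        for cluster in clusters:
            if cosine(vecs[n], vecs[cluster[0]]) >= threshold:
                cluster.append(n)
                break
        else:
            clusters.append([n])
    flagged = [n for c in clusters if len(c) <= max_suspicious_size for n in c]
    return clusters, flagged


clusters, flagged = detect(embed(build_graph(LOGS)))
print(clusters, flagged)
```

On this hypothetical data the six routine entries end up in one cluster, and the isolated usb_connect entry forms a cluster of its own and is flagged. A realistic system would need far richer relationship rules and a learned embedding; the sketch only shows how the three stages hand data to one another.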

Original language: English (US)
Title of host publication: CCS 2019 - Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security
Publisher: Association for Computing Machinery
Pages: 1777-1794
Number of pages: 18
ISBN (Electronic): 9781450367479
DOIs: https://doi.org/10.1145/3319535.3363224
State: Published - Nov 6 2019
Event: 26th ACM SIGSAC Conference on Computer and Communications Security, CCS 2019 - London, United Kingdom
Duration: Nov 11 2019 → Nov 15 2019

Publication series

Name: Proceedings of the ACM Conference on Computer and Communications Security
ISSN (Print): 1543-7221

Conference

Conference: 26th ACM SIGSAC Conference on Computer and Communications Security, CCS 2019
Country: United Kingdom
City: London
Period: 11/11/19 → 11/15/19

Fingerprint

Information systems
Hidden Markov models
Industry
Personnel
Deep learning

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Networks and Communications

Cite this

Liu, F., Jiang, X., Wen, Y., Xing, X., Zhang, D., & Meng, D. (2019). Log2vec: A heterogeneous graph embedding based approach for detecting cyber threats within enterprise. In CCS 2019 - Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security (pp. 1777-1794). (Proceedings of the ACM Conference on Computer and Communications Security). Association for Computing Machinery. https://doi.org/10.1145/3319535.3363224
Liu, Fucheng ; Jiang, Xihe ; Wen, Yu ; Xing, Xinyu ; Zhang, Dongxue ; Meng, Dan. / Log2vec : A heterogeneous graph embedding based approach for detecting cyber threats within enterprise. CCS 2019 - Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security. Association for Computing Machinery, 2019. pp. 1777-1794 (Proceedings of the ACM Conference on Computer and Communications Security).
@inproceedings{7eb37db9e4f242eda8f388c9731a0c68,
title = "Log2vec: A heterogeneous graph embedding based approach for detecting cyber threats within enterprise",
abstract = "Conventional attacks of insider employees and emerging APT are both major threats for the organizational information system. Existing detections mainly concentrate on users' behavior and usually analyze logs recording their operations in an information system. In general, most of these methods consider sequential relationship among log entries and model users' sequential behavior. However, they ignore other relationships, inevitably leading to an unsatisfactory performance on various attack scenarios. We propose log2vec, a heterogeneous graph embedding based modularized method. First, it involves a heuristic approach that converts log entries into a heterogeneous graph in the light of diverse relationships among them. Next, it utilizes an improved graph embedding appropriate to the above heterogeneous graph, which can automatically represent each log entry into a low-dimension vector. The third component of log2vec is a practical detection algorithm capable of separating malicious and benign log entries into different clusters and identifying malicious ones. We implement a prototype of log2vec. Our evaluation demonstrates that log2vec remarkably outperforms state-of-the-art approaches, such as deep learning and hidden markov model (HMM). Besides, log2vec shows its capability to detect malicious events in various attack scenarios.",
author = "Fucheng Liu and Xihe Jiang and Yu Wen and Xinyu Xing and Dongxue Zhang and Dan Meng",
year = "2019",
month = "11",
day = "6",
doi = "10.1145/3319535.3363224",
language = "English (US)",
series = "Proceedings of the ACM Conference on Computer and Communications Security",
publisher = "Association for Computing Machinery",
pages = "1777--1794",
booktitle = "CCS 2019 - Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security",

}

Liu, F, Jiang, X, Wen, Y, Xing, X, Zhang, D & Meng, D 2019, Log2vec: A heterogeneous graph embedding based approach for detecting cyber threats within enterprise. in CCS 2019 - Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security. Proceedings of the ACM Conference on Computer and Communications Security, Association for Computing Machinery, pp. 1777-1794, 26th ACM SIGSAC Conference on Computer and Communications Security, CCS 2019, London, United Kingdom, 11/11/19. https://doi.org/10.1145/3319535.3363224

Log2vec : A heterogeneous graph embedding based approach for detecting cyber threats within enterprise. / Liu, Fucheng; Jiang, Xihe; Wen, Yu; Xing, Xinyu; Zhang, Dongxue; Meng, Dan.

CCS 2019 - Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security. Association for Computing Machinery, 2019. p. 1777-1794 (Proceedings of the ACM Conference on Computer and Communications Security).


TY - GEN

T1 - Log2vec

T2 - A heterogeneous graph embedding based approach for detecting cyber threats within enterprise

AU - Liu, Fucheng

AU - Jiang, Xihe

AU - Wen, Yu

AU - Xing, Xinyu

AU - Zhang, Dongxue

AU - Meng, Dan

PY - 2019/11/6

Y1 - 2019/11/6

AB - Conventional attacks of insider employees and emerging APT are both major threats for the organizational information system. Existing detections mainly concentrate on users' behavior and usually analyze logs recording their operations in an information system. In general, most of these methods consider sequential relationship among log entries and model users' sequential behavior. However, they ignore other relationships, inevitably leading to an unsatisfactory performance on various attack scenarios. We propose log2vec, a heterogeneous graph embedding based modularized method. First, it involves a heuristic approach that converts log entries into a heterogeneous graph in the light of diverse relationships among them. Next, it utilizes an improved graph embedding appropriate to the above heterogeneous graph, which can automatically represent each log entry into a low-dimension vector. The third component of log2vec is a practical detection algorithm capable of separating malicious and benign log entries into different clusters and identifying malicious ones. We implement a prototype of log2vec. Our evaluation demonstrates that log2vec remarkably outperforms state-of-the-art approaches, such as deep learning and hidden markov model (HMM). Besides, log2vec shows its capability to detect malicious events in various attack scenarios.

UR - http://www.scopus.com/inward/record.url?scp=85075931180&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85075931180&partnerID=8YFLogxK

U2 - 10.1145/3319535.3363224

DO - 10.1145/3319535.3363224

M3 - Conference contribution

AN - SCOPUS:85075931180

T3 - Proceedings of the ACM Conference on Computer and Communications Security

SP - 1777

EP - 1794

BT - CCS 2019 - Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security

PB - Association for Computing Machinery

ER -

Liu F, Jiang X, Wen Y, Xing X, Zhang D, Meng D. Log2vec: A heterogeneous graph embedding based approach for detecting cyber threats within enterprise. In CCS 2019 - Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security. Association for Computing Machinery. 2019. p. 1777-1794. (Proceedings of the ACM Conference on Computer and Communications Security). https://doi.org/10.1145/3319535.3363224