Unsupervised signature extraction from forensic logs

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

Abstract

Signature extraction is a key part of forensic log analysis. It involves recognizing patterns in log lines such that log lines that originated from the same line of code are grouped together. A log signature consists of immutable parts and mutable parts. The immutable parts define the signature, and the mutable parts are typically variable parameter values. In practice, the number of log lines and signatures can be quite large, and the task of detecting and aligning immutable parts of the logs to extract the signatures becomes a significant challenge. We propose a novel method based on a neural language model that outperforms the current state-of-the-art on signature extraction. We use an RNN auto-encoder to create an embedding of the log lines. Log lines embedded in such a way can be clustered to extract the signatures in an unsupervised manner.
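
To make the notion of a signature concrete, the following is a minimal sketch of the final step only: given log lines that have already been grouped into one cluster, it keeps the tokens that are identical across all lines (the immutable parts) and replaces the rest with a placeholder (the mutable parts). It is not the RNN auto-encoder method proposed in the paper. The function name `extract_signature`, the `<*>` placeholder, whitespace tokenization, the equal-length restriction, and the sample log lines are all assumptions made for illustration.

```python
from typing import List

# Hypothetical placeholder marking a mutable (variable) position.
WILDCARD = "<*>"


def extract_signature(lines: List[str]) -> str:
    """Derive a signature from log lines assumed to belong to one cluster.

    Assumption: lines are whitespace-tokenized and have equal token counts.
    Real logs often violate this, which is part of why alignment is hard.
    """
    token_rows = [line.split() for line in lines]
    if len({len(row) for row in token_rows}) != 1:
        raise ValueError("expected a non-empty cluster of equal-length lines")

    signature = []
    for column in zip(*token_rows):
        # A token shared by every line is immutable; anything else is mutable.
        signature.append(column[0] if len(set(column)) == 1 else WILDCARD)
    return " ".join(signature)


# Illustrative (made-up) log lines from the same print statement.
cluster = [
    "Accepted password for alice from 10.0.0.5 port 22",
    "Accepted password for bob from 10.0.0.9 port 22",
]
print(extract_signature(cluster))
# prints: Accepted password for <*> from <*> port 22
```

In the approach described in the abstract, the clusters themselves would come from grouping the learned embeddings of the log lines; this sketch simply takes a cluster as given.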
Original language: English
Title of host publication: Machine Learning and Knowledge Discovery in Databases
Subtitle of host publication: European Conference, ECML PKDD 2017, Skopje, Macedonia, September 18–22, 2017, Proceedings, Part III
Editors: Michelangelo Ceci, Saso Dzeroski, Donato Malerba, Yasemin Altun, Kamalika Das, Jesse Read, Marinka Zitnik, Jerzy Stefanowski, Taneli Mielikäinen
Place of Publication: Dordrecht
Publisher: Springer
Pages: 305-316
Number of pages: 12
ISBN (Electronic): 978-3-319-71273-4
ISBN (Print): 978-3-319-71272-7
DOI: https://doi.org/10.1007/978-3-319-71273-4_25
Publication status: Published - 2017
Event: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, September 18–22, 2017, Skopje, Macedonia, The Former Yugoslav Republic of
Duration: 18 Sep 2017 to 22 Sep 2017
Event website: http://ecmlpkdd2017.ijs.si/index.html

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10536 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, September 18–22, 2017, Skopje, Macedonia
Abbreviated title: ECML PKDD 2017
Country: Macedonia, The Former Yugoslav Republic of
City: Skopje
Period: 18/09/17 to 22/09/17
Internet address: http://ecmlpkdd2017.ijs.si/index.html

Keywords

  • Information forensics
  • Log clustering
  • Neural language model
  • RNN auto-encoder
  • Signature extraction

Cite this

Thaler, S. M., Menkovski, V., & Petkovic, M. (2017). Unsupervised signature extraction from forensic logs. In M. Ceci, S. Dzeroski, D. Malerba, Y. Altun, K. Das, J. Read, M. Zitnik, J. Stefanowski, ... T. Mielikäinen (Eds.), Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2017, Skopje, Macedonia, September 18–22, 2017, Proceedings, Part III (pp. 305-316). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 10536 LNAI). Dordrecht: Springer. https://doi.org/10.1007/978-3-319-71273-4_25