Unsupervised signature extraction from forensic logs

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Academic; peer-review


Signature extraction is a key part of forensic log analysis. It involves recognizing patterns in log lines such that log lines that originated from the same line of code are grouped together. A log signature consists of immutable parts and mutable parts. The immutable parts define the signature, and the mutable parts are typically variable parameter values. In practice, the number of log lines and signatures can be quite large, and the task of detecting and aligning immutable parts of the logs to extract the signatures becomes a significant challenge. We propose a novel method based on a neural language model that outperforms the current state-of-the-art on signature extraction. We use an RNN auto-encoder to create an embedding of the log lines. Log lines embedded in such a way can be clustered to extract the signatures in an unsupervised manner.
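
To make the notion of immutable and mutable parts concrete, the following is a minimal sketch of how a signature could be read off once log lines have already been grouped. It is not the RNN auto-encoder method described in the abstract: it assumes the clustering step is done, and it substitutes a naive position-wise token comparison that only works when all lines in a cluster have the same number of whitespace-separated tokens. The function name `extract_signature`, the `<*>` wildcard, and the sample log lines are all hypothetical and chosen for illustration only.

```python
from typing import List

# Hypothetical placeholder marking a mutable part (a variable parameter value).
WILDCARD = "<*>"


def extract_signature(cluster: List[str]) -> str:
    """Derive a signature from log lines assumed to belong to one cluster.

    Assumption: every line splits into the same number of tokens, so tokens
    can be compared position by position without any alignment step.
    """
    tokenized = [line.split() for line in cluster]
    if not tokenized:
        raise ValueError("cluster must contain at least one log line")

    length = len(tokenized[0])
    if any(len(tokens) != length for tokens in tokenized):
        raise ValueError("this sketch only handles lines with equal token counts")

    signature = []
    for column in zip(*tokenized):
        # A position is immutable when every line carries the same token there;
        # otherwise it is treated as a mutable parameter.
        signature.append(column[0] if len(set(column)) == 1 else WILDCARD)
    return " ".join(signature)


# Invented example lines, not taken from any dataset used in the paper.
cluster = [
    "Connection from 10.0.0.5 closed after 12 seconds",
    "Connection from 10.0.0.9 closed after 347 seconds",
]
signature = extract_signature(cluster)
print(signature)  # Connection from <*> closed after <*> seconds
```

Real logs rarely satisfy the equal-length assumption, which is part of why detecting and aligning the immutable parts becomes difficult at scale and why the abstract proposes clustering learned embeddings instead.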
Original language: English
Title of host publication: Machine Learning and Knowledge Discovery in Databases
Subtitle of host publication: European Conference, ECML PKDD 2017, Skopje, Macedonia, September 18–22, 2017, Proceedings, Part III
Editors: Michelangelo Ceci, Saso Dzeroski, Donato Malerba, Yasemin Altun, Kamalika Das, Jesse Read, Marinka Zitnik, Jerzy Stefanowski, Taneli Mielikäinen
Place of Publication: Dordrecht
Number of pages: 12
ISBN (Electronic): 978-3-319-71273-4
ISBN (Print): 978-3-319-71272-7
Publication status: Published - 2017
Event: 2017 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2017) - Skopje, Macedonia, The Former Yugoslav Republic of
Duration: 18 Sept 2017 to 22 Sept 2017

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10536 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 2017 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2017)
Abbreviated title: ECML PKDD 2017
Country/Territory: Macedonia, The Former Yugoslav Republic of


  • Information forensic
  • Log clustering
  • Neural language model
  • RNN auto-encoder
  • Signature extraction

