Abstract
Almost all activities observed in today's applications are
tied to a temporal sequence. Users mainly look
for interesting sequences in such data, and sequential pattern
mining algorithms aim to find the frequent ones.
The mined activities usually have durations, that is,
time intervals between their start and end
points. Most sequential pattern mining approaches treat
such activities as single point events and thus lose
much valuable information in the collected patterns. We
present PIVOTMiner, an efficient interval-based sequential
pattern mining algorithm that uses a geometric representation
of intervals. However, the interestingness of a pattern is not
necessarily positively correlated with its frequency. In
many applications, users seek rare patterns that
deviate considerably from the majority. Simply delivering
the bottom-k patterns does not guarantee high outlierness
(or deviation) from the frequent ones. We additionally
propose PIVOTRanker, the first scalable algorithm
for ranking rare interval-based sequential patterns by
their outlierness. Our experimental results on both synthetic
and real-world datasets show that PIVOTMiner takes considerably
less time than two state-of-the-art competitors,
and that PIVOTRanker delivers a meaningful and useful
ranking of rare patterns.
Original language | English |
---|---|
Title of host publication | Advances in Database Technology - EDBT 2016. 19th International Conference on Extending Database Technology, Bordeaux, France, March 15-16, 2016. Proceedings |
Editors | E. Pitoura, S. Maabout, G. Koutrika, A. Marian, L. Tanca, I. Manolescu, K. Stefanides |
Place of Publication | Konstanz |
Publisher | OpenProceedings.org |
Pages | 688-689 |
Number of pages | 2 |
ISBN (Electronic) | 978-3-89318-070-7 |
DOIs | |
Publication status | Published - 2016 |
Externally published | Yes |
Event | 19th International Conference on Extending Database Technology (EDBT 2016) - Bordeaux, France; Duration: 15 Mar 2016 → 18 Mar 2016 |
Publication series
Name | Open Proceedings |
---|---|
ISSN (Electronic) | 2367-2005 |
Conference
Conference | 19th International Conference on Extending Database Technology (EDBT 2016) |
---|---|
Country/Territory | France |
City | Bordeaux |
Period | 15/03/16 → 18/03/16 |