On complexity and efficiency of mutual information estimation on static and dynamic data

Michael Vollmer, Ignaz Rutter, Klemens Böhm

Research output: Chapter in Book/Report/Conference proceedingConference contributionAcademicpeer-review

5 Citations (Scopus)
10 Downloads (Pure)

Abstract

Mutual Information (MI) is an established measure for the dependence of two variables and is often used as a generalization of correlation measures. Existing methods to estimate MI focus on static data. However, dynamic data is ubiquitous as well, and MI estimates on it are useful for stream mining and advanced monitoring tasks. In dynamic data, small changes (e.g., insertion or deletion of a value) may often invalidate the previous estimate. In this article, we study how to efficiently adjust an existing MI estimate when such a change occurs. As a first step, we focus on the well-known nearest-neighbor based estimators for static data and derive a tight lower bound for their computational complexity, which is unknown so far. We then propose two dynamic data structures that can update existing estimates asymptotically faster than any approach that computes the estimates independently, i.e., from scratch. Next, we infer a lower bound for the computational complexity of such updates, irrespective of the data structure and the algorithm, and present an algorithm that is only a logarithmic factor slower than this bound. In absolute numbers, these solutions offer fast and accurate estimates of MI on dynamic data as well.

Original languageEnglish
Title of host publicationAdvances in Database Technology - EDBT 2018
Subtitle of host publication21st International Conference on Extending Database Technology, Proceedings
EditorsMichael Bohlen, Reinhard Pichler, Norman May, Erhard Rahm, Shan-Hung Wu, Katja Hose
Place of PublicationKonstanz
PublisherOpenProceedings.org
Pages49-60
Number of pages12
ISBN (Electronic)978-3-89318-078-3
Publication statusPublished - 2018
Event21st International Conference on Extending Database Technology, EDBT 2018 - Vienna, Austria
Duration: 26 Mar 201829 Mar 2018

Conference

Conference21st International Conference on Extending Database Technology, EDBT 2018
CountryAustria
CityVienna
Period26/03/1829/03/18

Fingerprint Dive into the research topics of 'On complexity and efficiency of mutual information estimation on static and dynamic data'. Together they form a unique fingerprint.

Cite this