Abstract
Mutual Information (MI) is an established measure for the dependence of two variables and is often used as a generalization of correlation measures. Existing methods to estimate MI focus on static data. However, dynamic data is ubiquitous as well, and MI estimates on it are useful for stream mining and advanced monitoring tasks. In dynamic data, small changes (e.g., insertion or deletion of a value) may often invalidate the previous estimate. In this article, we study how to efficiently adjust an existing MI estimate when such a change occurs. As a first step, we focus on the well-known nearest-neighbor based estimators for static data and derive a tight lower bound for their computational complexity, which is unknown so far. We then propose two dynamic data structures that can update existing estimates asymptotically faster than any approach that computes the estimates independently, i.e., from scratch. Next, we infer a lower bound for the computational complexity of such updates, irrespective of the data structure and the algorithm, and present an algorithm that is only a logarithmic factor slower than this bound. In absolute numbers, these solutions offer fast and accurate estimates of MI on dynamic data as well.
Original language | English |
---|---|
Title of host publication | Advances in Database Technology - EDBT 2018 |
Subtitle of host publication | 21st International Conference on Extending Database Technology, Proceedings |
Editors | Michael Bohlen, Reinhard Pichler, Norman May, Erhard Rahm, Shan-Hung Wu, Katja Hose |
Place of Publication | Konstanz |
Publisher | OpenProceedings.org |
Pages | 49-60 |
Number of pages | 12 |
ISBN (Electronic) | 978-3-89318-078-3 |
Publication status | Published - 2018 |
Event | 21st International Conference on Extending Database Technology, EDBT 2018 - Vienna, Austria Duration: 26 Mar 2018 → 29 Mar 2018 |
Conference
Conference | 21st International Conference on Extending Database Technology, EDBT 2018 |
---|---|
Country/Territory | Austria |
City | Vienna |
Period | 26/03/18 → 29/03/18 |