Abstract
Recent attempts at using visual analytics to interpret Recurrent Neural Networks (RNNs) mainly focus on natural language processing (NLP) tasks that take symbolic sequences as input. However, many real-world problems, such as environmental pollution forecasting, apply RNNs to sequences of multi-dimensional data where each dimension represents an individual feature with semantic meaning, such as PM2.5 and SO2. RNN interpretation on multi-dimensional sequences is challenging because users need to analyze which features are important at different time steps to better understand model behavior and gain trust in its predictions. This requires effective and scalable visualization methods to reveal the complex many-to-many relations between hidden units and features. In this work, we propose a visual analytics system to interpret RNNs on multi-dimensional time-series forecasts. Specifically, to provide an overview that reveals the model mechanism, we propose a technique to estimate the hidden unit response by measuring how different feature selections affect the hidden unit output distribution. We then cluster the hidden units and features based on the response embedding vectors. Finally, we propose a visual analytics system that allows users to visually explore the model behavior at the global and individual levels. We demonstrate the effectiveness of our approach with case studies using air pollutant forecast applications.
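To make the idea of a hidden unit response more concrete, the following is a minimal sketch and not the method of the paper. It assumes that recorded hidden unit outputs and input features are available as plain lists, that a "feature selection" means keeping the samples with the largest values of one feature, and that the response is a simple difference in mean output. The grouping step is a deliberately simple stand-in for clustering: each unit is assigned to the feature it responds to most strongly. The names `response_matrix`, `group_units_by_dominant_feature`, and `top_fraction` are hypothetical.

```python
from statistics import mean


def response_matrix(hidden, features, top_fraction=0.5):
    """Estimate how each hidden unit responds to each feature.

    hidden[i][u]   : output of hidden unit u for sample i
    features[i][f] : value of feature f for sample i
    Returns response[u][f]: mean output of unit u over the samples with the
    largest values of feature f, minus the unit's mean over all samples.
    (Assumption: a difference in means stands in for a distribution shift.)
    """
    n_samples = len(hidden)
    n_units = len(hidden[0])
    n_features = len(features[0])
    k = max(1, int(n_samples * top_fraction))

    # Baseline: mean output of each unit over every sample.
    baseline = [mean(row[u] for row in hidden) for u in range(n_units)]

    response = [[0.0] * n_features for _ in range(n_units)]
    for f in range(n_features):
        # Select the k samples where feature f takes its largest values.
        order = sorted(range(n_samples), key=lambda i: features[i][f], reverse=True)
        selected = order[:k]
        for u in range(n_units):
            response[u][f] = mean(hidden[i][u] for i in selected) - baseline[u]
    return response


def group_units_by_dominant_feature(response):
    """Group hidden units by the feature with the largest absolute response."""
    groups = {}
    for u, row in enumerate(response):
        dominant = max(range(len(row)), key=lambda f: abs(row[f]))
        groups.setdefault(dominant, []).append(u)
    return groups


# Toy data (fabricated for illustration only): 4 samples, 2 features, 3 units.
features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hidden = [
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0],
]

R = response_matrix(hidden, features)
groups = group_units_by_dominant_feature(R)
```

In this toy setup, units 0 and 2 track the first feature (with opposite signs) and unit 1 tracks the second, so each row of `R` acts as a small response vector that a real clustering algorithm could consume in place of the dominant-feature rule.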
Original language | English |
---|---|
Title of host publication | 2020 IEEE Pacific Visualization Symposium (PacificVis) |
Editors | Fabian Beck, Jinwook Seo, Chaoli Wang |
Publisher | Institute of Electrical and Electronics Engineers |
Pages | 61-70 |
Number of pages | 10 |
ISBN (Electronic) | 9781728156972 |
Publication status | Published - Jun 2020 |
Event | 13th IEEE Pacific Visualization Symposium, PacificVis 2020 - Tianjin, China; Duration: 14 Apr 2020 → 17 Apr 2020 |
Conference
Conference | 13th IEEE Pacific Visualization Symposium, PacificVis 2020 |
---|---|
Country/Territory | China |
City | Tianjin |
Period | 14/04/20 → 17/04/20 |
Keywords
- air pollutant forecast
- interpretable machine learning
- multi-dimensional time series
- recurrent neural networks