Abstract
The multi-objective multi-armed bandit (MOMAB) problem is an extension of the multi-armed bandit (MAB) framework that considers reward vectors instead of scalar reward values. Scalarization functions transform the reward vectors into scalar reward values so that standard MAB algorithms can be applied. However, for many applications it is not obvious how to choose a good set of scalarization functions, and there is therefore a need for MAB algorithms that discover the whole Pareto set of arms. Our approach to this multi-objective MAB problem is twofold: i) identify the set of Pareto optimal arms and ii) identify the minimum subset of scalarization functions that optimize the set of Pareto optimal arms. We experimentally compare the proposed MOMAB algorithms on a multi-objective Bernoulli problem.
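To make the two notions in the abstract concrete, the following minimal sketch shows Pareto dominance between arms and a linear scalarization of reward vectors. It is not the paper's algorithm: it assumes the mean reward vectors are already known (a bandit algorithm would have to estimate them from samples), it uses linear weights as one possible scalarization family, and the arm means, function names, and weights are all hypothetical values chosen for illustration.

```python
from typing import List, Sequence


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """u dominates v if u is at least as good in every objective
    and strictly better in at least one (maximization)."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def pareto_optimal_arms(means: Sequence[Sequence[float]]) -> List[int]:
    """Indices of arms whose mean vector is not dominated by any other arm."""
    return [
        i
        for i, u in enumerate(means)
        if not any(dominates(v, u) for j, v in enumerate(means) if j != i)
    ]


def linear_scalarization(reward: Sequence[float], weights: Sequence[float]) -> float:
    """Collapse a reward vector into a scalar via a weighted sum."""
    return sum(w * r for w, r in zip(weights, reward))


def best_arm(means: Sequence[Sequence[float]], weights: Sequence[float]) -> int:
    """Arm with the highest scalarized mean for one weight vector."""
    return max(range(len(means)), key=lambda i: linear_scalarization(means[i], weights))


# Hypothetical two-objective Bernoulli arms: each entry is a success probability.
means = [
    (0.55, 0.50),
    (0.53, 0.51),
    (0.52, 0.54),
    (0.50, 0.57),
    (0.51, 0.51),  # dominated by arm 1
    (0.40, 0.40),  # dominated by every arm above
]

front = pareto_optimal_arms(means)
first_objective_only = best_arm(means, (1.0, 0.0))
second_objective_only = best_arm(means, (0.0, 1.0))
```

Each weight vector singles out one arm, which illustrates why a set of scalarization functions is needed to cover several Pareto optimal arms at once.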
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the International Joint Conference on Neural Networks |
| Place of Publication | Piscataway |
| Publisher | Institute of Electrical and Electronics Engineers |
| Pages | 2690-2697 |
| Number of pages | 8 |
| ISBN (Electronic) | 978-1-4799-1484-5 |
| Publication status | Published - 3 Sept 2014 |
| Externally published | Yes |
| Event | 2014 International Joint Conference on Neural Networks, IJCNN 2014 - Beijing International Convention Center, Beijing, China; Duration: 6 Jul 2014 → 11 Jul 2014; http://www.ieee-wcci2014.org |
Conference
| Conference | 2014 International Joint Conference on Neural Networks, IJCNN 2014 |
|---|---|
| Abbreviated title | IJCNN 2014 |
| Country/Territory | China |
| City | Beijing |
| Period | 6/07/14 → 11/07/14 |
| Other | International Joint Conference on Neural Networks |