TY - JOUR
T1 - Runtime evaluation of cognitive systems for non-deterministic multiple output classification problems
AU - Gopalakrishna, Aravind Kota
AU - Ozcelebi, Tanir
AU - Lukkien, Johan J.
AU - Liotta, Antonio
PY - 2019/11/1
Y1 - 2019/11/1
N2 - Cognitive applications that involve complex decision making, such as smart lighting, have non-deterministic input–output relationships, i.e., more than one output may be acceptable for a given input. We refer to these as non-deterministic multiple output classification (nDMOC) problems, for which it is particularly difficult for machine learning (ML) algorithms to predict outcomes accurately. Evaluating ML algorithms for such problems based on commonly used metrics such as Classification Accuracy (CA) is not appropriate. Relevance Score (RS), which determines how relevant a predicted output is to a given context, was proposed as a better alternative in a batch setting. We introduce two variants of RS to evaluate ML algorithms in an online setting. Furthermore, we evaluate the algorithms using different metrics on two datasets that have non-deterministic input–output relationships. We show that instance-based learning provides superior RS performance and that RS performance keeps improving as the number of observed samples increases, even after the CA performance has converged to its maximum. This is a crucial result, as it illustrates that RS is able to capture the performance of ML algorithms on nDMOC problems while CA cannot.
KW - Classification problems
KW - Cognitive systems
KW - Human factors
KW - Machine learning
KW - Non-deterministic multiple output classification
KW - Performance metric
KW - Relevance score
KW - Smart lighting
UR - http://www.scopus.com/inward/record.url?scp=85067041870&partnerID=8YFLogxK
U2 - 10.1016/j.future.2019.05.043
DO - 10.1016/j.future.2019.05.043
M3 - Article
AN - SCOPUS:85067041870
SN - 0167-739X
VL - 100
SP - 1005
EP - 1016
JO - Future Generation Computer Systems
JF - Future Generation Computer Systems
ER -