In machine learning applications, missing data has a significant effect on prediction performance. For a given missing data problem, selecting a treatment approach in combination with a classification model is not straightforward, because the choice depends on factors such as the pattern of the data and the nature of the missing data. The selection is more difficult for applications such as intelligent lighting, where the pattern of the data has a high degree of randomness. In this paper, we study pairs of probabilistic missing data treatment methods and classification models to identify the best pair for a dataset gathered from an office environment for intelligent lighting. We evaluate the performance in simulations using a new metric called Relevance Score. Experimental results show that the CPOF (Conditional Probability based only on the Outcome and other Features) method in combination with the DecisionTable (DT) classifier is the most suitable pair for implementation.
Keywords: intelligent lighting; missing data; machine learning; classification models; relevance metric
|Name||Advances in Intelligent Systems and Computing|
|Conference||3rd International Conference on Man-Machine Interaction, 2013-10-22 to 2013-10-25|
|Period||22/10/13 → 25/10/13|