Recently, the following problem of discrimination aware classification was introduced: given a labeled dataset and an attribute B, find a classifier with high predictive accuracy that at the same time does not discriminate on the basis of the given attribute B. This problem is motivated by the fact that often available historic data is biased due to discrimination, e.g., when B denotes ethnicity. Using the standard learners on this data may lead to wrongfully biased classifiers, even if the attribute B is removed from training data. Existing solutions for this problem consist of "cleaning away" the discrimination from the dataset before a classifier is learned. In this paper we study an alternative approach in which the non-discriminatory constraint is pushed deeply into a decision tree learner by changing its splitting criterion and pruning strategy by using a novel leaf relabeling approach. Experimental evaluation shows that the proposed approach advances the state-of-the-art in the sense that the learned decision trees have a lower discrimination than models provided by previous methods with only little loss in accuracy.
|Naam||Computer science reports|
|ISSN van geprinte versie||0926-4515|