Real-world K-Anonymity applications: The KGEN approach and its evaluation in fraudulent transactions

Research output: Contribution to journal › Journal article › Academic › peer-reviewed

7 Citations (Scopus)
76 Downloads (Pure)

Abstract

K-anonymity is a property for measuring, managing, and governing data anonymization. Many implementations of k-anonymity have been described in the state of the art, but most of them are not practically usable over a large number of attributes in a “Big” dataset, i.e., a dataset drawn from Big Data. To address this significant shortcoming, we introduce and evaluate KGEN, an approach to k-anonymity that uses meta-heuristics, specifically Genetic Algorithms, to compute a permutation of the dataset that is both k-anonymized and still usable for further processing, e.g., for private-by-design analytics. KGEN adopts a meta-heuristic approach because it can find a pseudo-optimal solution in a reasonable time over a considerable input load. KGEN allows the data manager to guarantee a high anonymity level while preserving usability and preventing loss of information entropy over the data. Unlike other approaches, which provide globally optimal solutions but are only compatible with smaller datasets, KGEN also works on Big datasets while still providing a good-enough k-anonymized yet processable dataset. Evaluation results show that our approach works efficiently on a real-world dataset provided by the Dutch Tax Authority, with 47 attributes (i.e., the columns of the dataset to be anonymized) and more than 1.5K observations (i.e., the rows of that dataset), as well as on a dataset with 97 attributes and over 3942 observations.
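As a minimal illustration of the property the abstract refers to, the sketch below computes the anonymity level of a small table: the size of the smallest group of rows that share the same quasi-identifier values. It is not KGEN and does not implement its genetic search; the function names, the `age`/`zip`/`amount` columns, the sample rows, and the age-range generalization are all hypothetical and chosen only for the example.

```python
from collections import Counter


def anonymity_level(rows, quasi_identifiers):
    """Return the largest k for which `rows` is k-anonymous, i.e. the size of
    the smallest group of rows sharing the same quasi-identifier values."""
    if not rows:
        return 0
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values())


def generalize_age(row, width):
    """Hypothetical generalization step: replace an exact age with a range
    that is `width` years wide, leaving the other columns untouched."""
    low = (row["age"] // width) * width
    return {**row, "age": f"{low}-{low + width - 1}"}


# Invented sample data; "age" and "zip" are assumed to be the quasi-identifiers.
rows = [
    {"age": 31, "zip": "1011", "amount": 120},
    {"age": 34, "zip": "1011", "amount": 75},
    {"age": 38, "zip": "1011", "amount": 300},
    {"age": 42, "zip": "1012", "amount": 50},
    {"age": 47, "zip": "1012", "amount": 910},
]
qi = ["age", "zip"]

raw_k = anonymity_level(rows, qi)                # every row is unique
generalized = [generalize_age(r, 10) for r in rows]
generalized_k = anonymity_level(generalized, qi)  # rows now fall into two groups
```

Generalizing attributes raises the anonymity level but discards detail, which is the trade-off between anonymity and usability that the abstract describes. A search procedure such as a genetic algorithm would explore many candidate generalizations instead of the single fixed one used here.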

Original language: English
Article number: 102193
Number of pages: 13
Journal: Information Systems
Volume: 115
Status: Published - May 2023
