Discovering reliable evidence of data misuse by exploiting rule redundancy

L. Genga (Corresponding author), Nicola Zannone, Anna Squicciarini

Research output: Contribution to journal, Article, Academic, peer-reviewed

Abstract

Big Data offers opportunities for in-depth data analytics and advanced personalized services. Yet, while valuable, data analytics might rely on data that should not have been used due to, e.g., privacy constraints from the data subject or regulations. As decision makers and data controllers often act outside any control mechanism and with no requirement of transparency, it is challenging to verify whether constraints on data usage are actually satisfied. In this work, we relate the problem of finding evidence of data misuse to the identification of unique decision rules, i.e. rules that have likely been used for decision making. Accordingly, we propose an approach to find reliable evidence of data misuse in the context of classification problems using association rule mining, along with novel metrics to assess the level of redundancy among decision rules. Our proposed approach is able to identify the use of sensitive information in decisional processes along with their context. We evaluated our approach through both controlled experiments and two case studies using real-life event data. The results show that our approach finds more reliable evidence of data misuse compared to previous work.

Language: English
Article number: 101577
Number of pages: 17
Journal: Computers and Security
Volume: 87
DOI: 10.1016/j.cose.2019.101577
Status: Published - 1 Nov 2019

Fingerprint

Association rules
Redundancy
Transparency
Decision making
Controllers
Evidence
Experiments
Big data
Privacy
Decision maker
Regulation

Keywords

Classification rules, Data mining, Data misuse detection, Redundancy reduction, Rule evaluation

    Cite this

    @article{b18ea31150304cceb5744a680c54a544,
    title = "Discovering reliable evidence of data misuse by exploiting rule redundancy",
    abstract = "Big Data offers opportunities for in-depth data analytics and advanced personalized services. Yet, while valuable, data analytics might rely on data that should not have been used due to, e.g., privacy constraints from the data subject or regulations. As decision makers and data controllers often act outside any control mechanism and with no requirement of transparency, it is challenging to verify whether constraints on data usage are actually satisfied. In this work, we relate the problem of finding evidence of data misuse to the identification of unique decision rules, i.e. rules that have likely been used for decision making. Accordingly, we propose an approach to find reliable evidence of data misuse in the context of classification problems using association rule mining, along with novel metrics to assess the level of redundancy among decision rules. Our proposed approach is able to identify the use of sensitive information in decisional processes along with their context. We evaluated our approach through both controlled experiments and two case studies using real-life event data. The results show that our approach finds more reliable evidence of data misuse compared to previous work.",
    keywords = "Classification rules, Data mining, Data misuse detection, Redundancy reduction, Rule evaluation",
    author = "L. Genga and Nicola Zannone and Anna Squicciarini",
    year = "2019",
    month = "11",
    day = "1",
    doi = "10.1016/j.cose.2019.101577",
    language = "English",
    volume = "87",
    journal = "Computers and Security",
    issn = "0167-4048",
    publisher = "Elsevier",
    }

    Discovering reliable evidence of data misuse by exploiting rule redundancy. / Genga, L. (Corresponding author); Zannone, Nicola; Squicciarini, Anna.

    In: Computers and Security, Vol. 87, 101577, 01.11.2019.

    TY - JOUR

    T1 - Discovering reliable evidence of data misuse by exploiting rule redundancy

    AU - Genga, L.

    AU - Zannone, Nicola

    AU - Squicciarini, Anna

    PY - 2019/11/1

    Y1 - 2019/11/1

    AB - Big Data offers opportunities for in-depth data analytics and advanced personalized services. Yet, while valuable, data analytics might rely on data that should not have been used due to, e.g., privacy constraints from the data subject or regulations. As decision makers and data controllers often act outside any control mechanism and with no requirement of transparency, it is challenging to verify whether constraints on data usage are actually satisfied. In this work, we relate the problem of finding evidence of data misuse to the identification of unique decision rules, i.e. rules that have likely been used for decision making. Accordingly, we propose an approach to find reliable evidence of data misuse in the context of classification problems using association rule mining, along with novel metrics to assess the level of redundancy among decision rules. Our proposed approach is able to identify the use of sensitive information in decisional processes along with their context. We evaluated our approach through both controlled experiments and two case studies using real-life event data. The results show that our approach finds more reliable evidence of data misuse compared to previous work.

    KW - Classification rules

    KW - Data mining

    KW - Data misuse detection

    KW - Redundancy reduction

    KW - Rule evaluation

    UR - http://www.scopus.com/inward/record.url?scp=85069971314&partnerID=8YFLogxK

    U2 - 10.1016/j.cose.2019.101577

    DO - 10.1016/j.cose.2019.101577

    M3 - Article

    VL - 87

    JO - Computers and Security

    T2 - Computers and Security

    JF - Computers and Security

    SN - 0167-4048

    M1 - 101577

    ER -