A machine learning solution to assess privacy policy completeness

E. Costante, Y. Sun, M. Petkovic, J.I. den Hartog

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Academic; peer-review

37 Citations (Scopus)

Abstract

A privacy policy is a legal document, used by websites to communicate how the personal data that they collect will be managed. By accepting it, the user agrees to release his data under the conditions stated by the policy. Privacy policies should provide enough information to enable users to make informed decisions. Privacy regulations support this by specifying what kind of information has to be provided. As privacy policies can be long and difficult to understand, users tend not to read them. Because of this, users generally agree with a policy without knowing what it states and whether aspects important to them are covered at all. In this paper we present a solution to assist the user by providing a structured way to browse the policy content and by automatically assessing the completeness of a policy, i.e. the degree of coverage of privacy categories important to the user. The privacy categories are extracted from privacy regulations, while text categorization and machine learning techniques are used to verify which categories are covered by a policy. The results show the feasibility of our approach; an automatic classifier, able to associate the right category to paragraphs of a policy with an accuracy approximating that obtainable by a human judge, can be effectively created.

Keywords: privacy, privacy policy, natural language, machine learning

Original language: English
Title of host publication: Proceedings of the 2012 ACM Workshop on Privacy in the Electronic Society (WPES co-located with CCS 2012), October 15, 2012, Raleigh NC, USA
Publisher: Association for Computing Machinery, Inc
Pages: 91-96
ISBN (Print): 978-1-4503-1663-7
Publication status: Published - 2012
Event: 2012 ACM Workshop on Privacy in the Electronic Society
Duration: 15 Oct 2012 to 15 Oct 2012

Conference

Conference: 2012 ACM Workshop on Privacy in the Electronic Society
Period: 15/10/12 to 15/10/12
Other: 2012 ACM Workshop on Privacy in the Electronic Society
