HiDER : query-driven entity resolution for historical data

B. Ranjbar-Sahraei, I. Efremova, H. Rahmani, T.G.K. Calders, K.P. Tuyls

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

Abstract

Entity Resolution (ER) is the task of finding references that refer to the same entity across different data sources. Cleaning a data warehouse and applying ER to it is computationally demanding, particularly for large data sets that change dynamically. A query-driven approach, which analyses only a small subset of the entire data set and integrates the results in real time, is therefore highly beneficial. Here, we present an interactive tool, called HiDER, which allows for query-driven ER in large collections of uncertain, dynamic historical data. The input data includes civil registers such as birth, marriage and death certificates in the form of structured data, and notarial acts such as estate tax and property transfers in the form of free text. The outputs are family networks and event timelines visualized in an integrated way. HiDER is being used and tested at the BHIC center (Brabant Historical Information Center, https://www.bhic.nl); despite the uncertainties of the BHIC input data, the extracted entities have high certainty and are enriched by extra information.
Original language: English
Title of host publication: Machine learning and knowledge discovery in databases, European Conference, ECML PKDD 2015, Porto, Portugal, September 7-11, 2015, Proceedings, Part III
Editors: E. Bifet, M. May, B. Zadrozny, R. Gavalda, D. Pedreschi, F. Bonchi, J. Cardoso, M. Spiliopoulou
Publisher: Springer
Pages: 281-284
ISBN (Print): 978-3-319-23461-8
DOIs: https://doi.org/10.1007/978-3-319-23461-8_30
Publication status: Published - 2015
Event: 2015 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2015) - Porto, Portugal
Duration: 7 Sep 2015 → 11 Sep 2015
http://www.ecmlpkdd2015.org/

Publication series

Name: Lecture Notes in Computer Science
Volume: 9286
ISSN (Print): 0302-9743

Conference

Conference: 2015 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2015)
Abbreviated title: ECML PKDD 2015
Country: Portugal
City: Porto
Period: 7/09/15 → 11/09/15
Internet address: http://www.ecmlpkdd2015.org/

Cite this

Ranjbar-Sahraei, B., Efremova, I., Rahmani, H., Calders, T. G. K., & Tuyls, K. P. (2015). HiDER : query-driven entity resolution for historical data. In E. Bifet, M. May, B. Zadrozny, R. Gavalda, D. Pedreschi, F. Bonchi, J. Cardoso, & M. Spiliopoulou (Eds.), Machine learning and knowledge discovery in databases, European Conference, ECML PKDD 2015, Porto, Portugal, September 7-11, 2015, Proceedings, Part III (pp. 281-284). (Lecture Notes in Computer Science; Vol. 9286). Springer. https://doi.org/10.1007/978-3-319-23461-8_30#page-1