Towards a general framework for effective solutions to the data mapping problem

G.H.L. Fletcher, C.M. Wyss

Research output: Chapter in Book/Report/Conference proceeding › Chapter › Academic

4 Citations (Scopus)

Abstract

Automating the discovery of mappings between structured data sources is a long-standing and important problem in data management. We discuss the rich history of the problem and the variety of technical solutions advanced in the database community over the previous four decades. Based on this discussion, we develop a basic statement of the data mapping problem and a general framework for reasoning about the design space of system solutions to the problem. We then concretely illustrate the framework with the Tupelo system for data mapping discovery, focusing on the important common case of relational data sources. Treating mapping discovery as example-driven search in a space of transformations, Tupelo generates queries encompassing the full range of structural and semantic heterogeneities encountered in relational data mapping. Hence, Tupelo is applicable in a wide range of data mapping scenarios. Finally, we present the results of extensive empirical validation, on both synthetic and real-world datasets, indicating that the system is both viable and effective.
Original language: English
Title of host publication: Journal on Data Semantics XIV
Editors: S. Spaccapietra, L. Delcambre
Place of Publication: Berlin
Publisher: Springer
Pages: 37-73
Publication status: Published - 2009

Publication series

Name: Lecture Notes in Computer Science
Volume: 5880
ISSN (Print): 0302-9743
