An approach towards context-sensitive and user-adapted access to heterogeneous data sources, illustrated in the television domain

P.A.E. Bellekens

    Research output: Thesis, Phd Thesis 1 (Research TU/e / Graduation TU/e)

    Abstract

    In a variety of domains, developers are struggling with how to provide a more personal service to their users. Such a service can be facilitated by, for example, user-adapted search, recommendations, personalized content navigation and personalized user interfaces. However, providing such functionality on top of a particular data set requires good knowledge of the relevant domain items (which can represent, for example, books, songs, TV programs or art pieces) as well as of the relevant users (in terms of their behavior, interests and preferences with respect to those domain items). In this dissertation, and more specifically in Chapter 3 and Chapter 4, we describe the requirements and a domain-independent approach, respectively, for providing context-sensitive and user-adapted access to heterogeneous data sources. This approach consists of three main parts: 1) Data Integration, 2) User Modeling and 3) User-Adapted Data Access. Chapter 5 focusses on the integration of information from various heterogeneous data sources. To provide user-adapted access, a good description of the relevant domain items is key. The more descriptive information we have about every item, the more raw material there is to, for example, compare different items, compare items with user profiles and deduce new information. Unfortunately, in the real world, items often come poorly described. However, with the immense growth of available information on the Web, many different data sources (like IMDb, Wikipedia and social networks) exist and offer free access to their data.
We describe how Semantic Web techniques can enrich the descriptive metadata of those domain items, on the one hand by integrating and matching information from different external sources providing instance metadata, and on the other hand by taking relevant ontological background information into account. Chapter 6 concentrates on the second part of our approach: the creation of an extensive model of the end-user. Such a user model is the user’s digital representation and encompasses all valuable user data we can obtain. Information can be provided explicitly by the user himself (e.g. the user states that he is 45 years old, male, capable of speaking three languages and fond of tennis) but also implicitly. Implicit feedback includes all the information the user gives away without realizing it, by means of his behavioral patterns (e.g. the user watches the news every day at 8, or he always adds books from the same author to his favorites). However, user feedback (both explicit and implicit) can be hard to interpret since it depends on a wide variety of parameters. Numerous influences like mood, location, time, environment and health cause people to behave very differently at any given time. Therefore, every statement in the user model is contextualized. In other words, the constrained setting in which a specific user statement was valid, which we call the statement’s context, is saved and is used later to predict the user’s interests accurately in any given situation. Further, since our approach depends on the quality and richness of the user model and new users usually start with an empty profile, we suffer from the so-called cold start problem. To alleviate it, we provide a number of strategies based on user statistics and stereotypes.
The third and last part of our approach encompasses the strategies to adapt any user request and provide a personalized set of results, based on both the integrated data structure describing the domain and the user model. Chapter 7 describes a processing pipeline, consisting of three steps, which takes a user request as input and delivers a personal response. The first step involves cleaning and conceptualizing the user’s request with respect to the current domain. In the second step, the updated query is sent to the database to retrieve matching results. To find not only exactly matching results but also highly related results, the database automatically broadens the result space of the query in a controlled fashion. It does so by reasoning over well-chosen semantic relations including, for example, transitivity and synonymity. Once matching results are retrieved, the last step filters them by following a set of rules. These rules are predefined and can include restrictions based on both ontological and user model information. This pipeline is employed both to provide user-adapted search and to generate personal recommendations. However, systems that deal with potentially large amounts of data and, on top of that, provide complex functionality like reasoning, user-adapted search, integration of data and recommendations require extra care in terms of their database setup. Moreover, efficiency in terms of querying speed is vital for any system’s long-term success. Therefore, in Chapter 8, we introduce a number of optimizations to improve the efficiency of the database in terms of size and querying speed. To illustrate our approach, we apply it in the television domain, which we introduce in Chapter 2. Together with Stoneroos we developed a cross-platform application called iFanzy, which tries to bring personalized access to television programs to the user via a set-top box interface, a Web site and an iPhone application.
All three of these platforms are synchronized and behave as one ubiquitous application that supports the user in putting together the best possible television experience by finding exactly those TV programs that fit the user best. In Chapter 9, we give an overview of these three platforms in terms of functionality and user interface. Furthermore, we evaluate the interface of the iFanzy Web portal with experiments including a Cognitive Walkthrough, the Thinking Aloud method and a Heuristic Evaluation. Through the commercial availability of iFanzy we were able to further evaluate our approach, focussing on the recommendation quality and user satisfaction. In Chapter 10, we describe our evaluation, in which a group of 60 people used the iFanzy Web interface for about two full weeks. From the data of this evaluation we can investigate the influence of both explicit and implicit user feedback on the predictive power of the system and the accuracy of generated recommendations. To further improve the quality of the recommendation strategy, Chapter 10 concludes with an approach to improve the serendipity of the recommender system, which leads to more surprising or serendipitous discoveries. Reusing the data from the previous evaluation, we describe a number of measurements to quantify the degree of serendipity in the recommendations of a recommender system.
    Original language: English
    Qualification: Doctor of Philosophy
    Awarding Institution
    • Mathematics and Computer Science
    Supervisors/Advisors
    • De Bra, Paul M.E., Promotor
    • Houben, Geert-Jan, Promotor
    • Aroyo, Lora, Copromotor
    Award date: 7 Oct 2010
    Place of Publication: Eindhoven
    Print ISBNs: 978-90-386-2336-8
    Publication status: Published - 2010