TY - JOUR
T1 - Where are the search engines for handwritten documents?
AU - Zant, T.
AU - Schomaker, L.
AU - Zinger, S.
AU - Schie, H.
PY - 2009
Y1 - 2009
N2 - Although the problems of optical character recognition for contemporary printed text have been resolved, for historical printed and handwritten connected cursive text (i.e., Western-style writing), they have not. This does not mean that scanning historical documents is not useful. This article describes our research on retrieving digitized handwritten documents containing the information that the user is looking for. This task is essential for optimizing the archive's work. We investigated how to process historical documents and their transcriptions, so that a supercomputer could learn how to read. We applied artificial intelligence techniques to a large amount of image data and created a search engine. Our methods often require that the computer learns in interaction with a human. We have studied the requests of archive users in order to bring our research as close as possible to the current information needs. Our system learns continuously, allowing the constant improvement of search results. User requests stimulated us to delve into an unsolved topic: to search for the most elusive knowledge in text, namely the names of people and places. The solutions are described in this article.
AB - Although the problems of optical character recognition for contemporary printed text have been resolved, for historical printed and handwritten connected cursive text (i.e., Western-style writing), they have not. This does not mean that scanning historical documents is not useful. This article describes our research on retrieving digitized handwritten documents containing the information that the user is looking for. This task is essential for optimizing the archive's work. We investigated how to process historical documents and their transcriptions, so that a supercomputer could learn how to read. We applied artificial intelligence techniques to a large amount of image data and created a search engine. Our methods often require that the computer learns in interaction with a human. We have studied the requests of archive users in order to bring our research as close as possible to the current information needs. Our system learns continuously, allowing the constant improvement of search results. User requests stimulated us to delve into an unsolved topic: to search for the most elusive knowledge in text, namely the names of people and places. The solutions are described in this article.
U2 - 10.1179/174327909X441126
DO - 10.1179/174327909X441126
M3 - Article
SN - 0308-0188
VL - 34
SP - 228
EP - 235
JO - Interdisciplinary Science Reviews
JF - Interdisciplinary Science Reviews
IS - 2
ER -