An incremental prefix filtering approach for all pairs similarity search

T.L. Hoang, V.D. Dinh, R. Perego, F. Silvestri

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

    4 Citations (Scopus)
    140 Downloads (Pure)


    Given a set of records, a threshold value t, and a similarity function, we investigate the problem of finding all pairs of records whose similarity is above t. We propose several optimizations to existing approaches to this problem. Our algorithm outperforms state-of-the-art algorithms on large, high-dimensional datasets. The speedup we achieved ranged from 30% to 4x, depending on the similarity threshold and the dataset properties.
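
    The problem statement in the abstract can be illustrated with a minimal Python sketch. This is not the authors' incremental algorithm. It assumes records are sets of tokens and that the similarity function is Jaccard, neither of which the abstract specifies, and it uses only the standard, non-incremental prefix filtering idea: order tokens by global frequency, index only a short prefix of each record, and verify the candidate pairs that share a prefix token. The function names are hypothetical, and a brute-force baseline is included for comparison.

```python
from collections import defaultdict
from itertools import combinations
from math import ceil


def jaccard(a, b):
    """Jaccard similarity of two sets (assumed similarity function)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def all_pairs_brute_force(records, t):
    """Compare every pair of records; return index pairs with similarity > t."""
    return {
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(records), 2)
        if jaccard(a, b) > t
    }


def all_pairs_prefix_filter(records, t):
    """Same result as the brute-force version, using basic prefix filtering.

    If jaccard(x, y) > t, then x and y share at least ceil(t * |x|) tokens,
    so under one global token order their prefixes of length
    |x| - ceil(t * |x|) + 1 must have a token in common.
    """
    # Global order: rare tokens first, ties broken by the token itself.
    freq = defaultdict(int)
    for record in records:
        for token in record:
            freq[token] += 1

    index = defaultdict(list)  # token -> indices of records with it in their prefix
    result = set()
    for i, record in enumerate(records):
        tokens = sorted(record, key=lambda tok: (freq[tok], tok))
        prefix_len = len(tokens) - ceil(t * len(tokens)) + 1
        candidates = set()
        for token in tokens[:prefix_len]:
            candidates.update(index[token])  # earlier records sharing this token
            index[token].append(i)
        # Verification step: compute the exact similarity for candidates only.
        for j in candidates:
            if jaccard(records[j], record) > t:
                result.add((j, i))
    return result
```

    Both functions take a list of sets and a threshold, for example `all_pairs_prefix_filter([{"a", "b"}, {"a", "b", "c"}], 0.5)`, and return a set of index pairs `(i, j)` with `i < j`. The prefix filter only reduces the number of pairs that need exact verification; it is not meant to change the result.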
    Original language: English
    Title of host publication: Proceedings of the 12th International Asia-Pacific Web Conference (APWeb, Busan, Korea, April 6-8, 2010)
    Publisher: IEEE Computer Society
    ISBN (Print): 978-1-7695-4012-2
    Publication status: Published - 2010

