Approximating (k,ℓ)-center clustering for curves

Kevin Buchin, Anne Driemel, Joachim Gudmundsson, Michael Horton, Irina Kostitsyna, Maarten Löffler, Martijn Struijs

Research output: Chapter in Book/Report/Conference proceedingChapterAcademicpeer-review

1 Citation (Scopus)

Abstract


The Euclidean k-Center problem is a classical problem that has been extensively studied in computer science. Given a set G of n points in Euclidean space, the problem is to determine a set C of k centers (not necessarily part of G) such that the maximum distance between a point in G and its nearest neighbor in C is minimized. In this paper we study the corresponding (k, ℓ)-CENTER problem for polygonal curves under the Fréchet distance, that is, given a set G of n polygonal curves in ℝd, each of complexity m, determine a set C of k polygonal curves in ℝd, each of complexity ℓ, such that the maximum Fréchet distance of a curve in G to its closest curve in C is minimized. In their 2016 paper, Driemel, Krivošija, and Sohler give a near-linear time (1 + ε-approximation algorithm for one-dimensional curves, assuming that k and ℓ are constants. In this paper, we substantially extend and improve the known approximation bounds for curves in dimension 2 and higher. Our analysis thus extends to application-relevant input data such as GPS-trajectories and protein backbones. We show that, if ℓ is part of the input, then there is no polynomial-time approximation scheme unless P = NP. Our constructions yield different bounds for one and two-dimensional curves and the discrete and continuous Fréchet distance. In the case of the discrete Fréchet distance on two-dimensional curves, we show hardness of approximation within a factor close to 2.598. This result also holds when k = 1, and the NP-hardness extends to the case that ℓ = ∞, i.e., for the problem of computing the minimum-enclosing ball under the Fréchet distance. Finally, we observe that a careful adaptation of Gonzalez’ algorithm in combination with a curve simplification yields a 3-approximation in any dimension, provided that an optimal simplification can be computed exactly. We conclude that our approximation bounds are close to being tight.


Read More: https://epubs.siam.org/doi/10.1137/1.9781611975482.181
LanguageEnglish
Title of host publication30th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)
EditorsTimothy M. Chan
PublisherSociety for Industrial and Applied Mathematics (SIAM)
Pages2922-2938
Number of pages17
ISBN (Electronic)978-1-61197-548-2
DOIs
StatePublished - 2 Jan 2019
EventACM-SIAM Symposium on Discrete Algorithms (SODA2019) - San Diego, United States
Duration: 6 Jan 20199 Jan 2019
https://www.siam.org/Conferences/CM/Conference/soda19

Conference

ConferenceACM-SIAM Symposium on Discrete Algorithms (SODA2019)
Abbreviated titleSODA2019
CountryUnited States
CitySan Diego
Period6/01/199/01/19
Internet address

Fingerprint

Clustering
Curve
Simplification
Approximation
Hardness of Approximation
Center Problem
Polynomial Time Approximation Scheme
NP-hardness
Backbone
Euclidean space
Linear Time
Approximation Algorithms
Euclidean
Nearest Neighbor
Computer Science
Ball
Trajectory
Protein
Computing

Cite this

Buchin, K., Driemel, A., Gudmundsson, J., Horton, M., Kostitsyna, I., Löffler, M., & Struijs, M. (2019). Approximating (k,ℓ)-center clustering for curves. In T. M. Chan (Ed.), 30th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) (pp. 2922-2938). Society for Industrial and Applied Mathematics (SIAM). DOI: 10.1137/1.9781611975482.181
Buchin, Kevin ; Driemel, Anne ; Gudmundsson, Joachim ; Horton, Michael ; Kostitsyna, Irina ; Löffler, Maarten ; Struijs, Martijn. / Approximating (k,ℓ)-center clustering for curves. 30th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA). editor / Timothy M. Chan. Society for Industrial and Applied Mathematics (SIAM), 2019. pp. 2922-2938
@inbook{2aaf61c0a0164fa18984490e784a475a,
title = "Approximating (k,ℓ)-center clustering for curves",
abstract = "The Euclidean k-Center problem is a classical problem that has been extensively studied in computer science. Given a set G of n points in Euclidean space, the problem is to determine a set C of k centers (not necessarily part of G) such that the maximum distance between a point in G and its nearest neighbor in C is minimized. In this paper we study the corresponding (k, ℓ)-CENTER problem for polygonal curves under the Fr{\'e}chet distance, that is, given a set G of n polygonal curves in ℝd, each of complexity m, determine a set C of k polygonal curves in ℝd, each of complexity ℓ, such that the maximum Fr{\'e}chet distance of a curve in G to its closest curve in C is minimized. In their 2016 paper, Driemel, Krivošija, and Sohler give a near-linear time (1 + ε-approximation algorithm for one-dimensional curves, assuming that k and ℓ are constants. In this paper, we substantially extend and improve the known approximation bounds for curves in dimension 2 and higher. Our analysis thus extends to application-relevant input data such as GPS-trajectories and protein backbones. We show that, if ℓ is part of the input, then there is no polynomial-time approximation scheme unless P = NP. Our constructions yield different bounds for one and two-dimensional curves and the discrete and continuous Fr{\'e}chet distance. In the case of the discrete Fr{\'e}chet distance on two-dimensional curves, we show hardness of approximation within a factor close to 2.598. This result also holds when k = 1, and the NP-hardness extends to the case that ℓ = ∞, i.e., for the problem of computing the minimum-enclosing ball under the Fr{\'e}chet distance. Finally, we observe that a careful adaptation of Gonzalez’ algorithm in combination with a curve simplification yields a 3-approximation in any dimension, provided that an optimal simplification can be computed exactly. We conclude that our approximation bounds are close to being tight.Read More: https://epubs.siam.org/doi/10.1137/1.9781611975482.181",
author = "Kevin Buchin and Anne Driemel and Joachim Gudmundsson and Michael Horton and Irina Kostitsyna and Maarten L{\"o}ffler and Martijn Struijs",
year = "2019",
month = "1",
day = "2",
doi = "10.1137/1.9781611975482.181",
language = "English",
pages = "2922--2938",
editor = "Chan, {Timothy M.}",
booktitle = "30th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)",
publisher = "Society for Industrial and Applied Mathematics (SIAM)",
address = "United States",

}

Buchin, K, Driemel, A, Gudmundsson, J, Horton, M, Kostitsyna, I, Löffler, M & Struijs, M 2019, Approximating (k,ℓ)-center clustering for curves. in TM Chan (ed.), 30th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA). Society for Industrial and Applied Mathematics (SIAM), pp. 2922-2938, ACM-SIAM Symposium on Discrete Algorithms (SODA2019), San Diego, United States, 6/01/19. DOI: 10.1137/1.9781611975482.181

Approximating (k,ℓ)-center clustering for curves. / Buchin, Kevin; Driemel, Anne; Gudmundsson, Joachim; Horton, Michael; Kostitsyna, Irina; Löffler, Maarten; Struijs, Martijn.

30th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA). ed. / Timothy M. Chan. Society for Industrial and Applied Mathematics (SIAM), 2019. p. 2922-2938.

Research output: Chapter in Book/Report/Conference proceedingChapterAcademicpeer-review

TY - CHAP

T1 - Approximating (k,ℓ)-center clustering for curves

AU - Buchin,Kevin

AU - Driemel,Anne

AU - Gudmundsson,Joachim

AU - Horton,Michael

AU - Kostitsyna,Irina

AU - Löffler,Maarten

AU - Struijs,Martijn

PY - 2019/1/2

Y1 - 2019/1/2

N2 - The Euclidean k-Center problem is a classical problem that has been extensively studied in computer science. Given a set G of n points in Euclidean space, the problem is to determine a set C of k centers (not necessarily part of G) such that the maximum distance between a point in G and its nearest neighbor in C is minimized. In this paper we study the corresponding (k, ℓ)-CENTER problem for polygonal curves under the Fréchet distance, that is, given a set G of n polygonal curves in ℝd, each of complexity m, determine a set C of k polygonal curves in ℝd, each of complexity ℓ, such that the maximum Fréchet distance of a curve in G to its closest curve in C is minimized. In their 2016 paper, Driemel, Krivošija, and Sohler give a near-linear time (1 + ε-approximation algorithm for one-dimensional curves, assuming that k and ℓ are constants. In this paper, we substantially extend and improve the known approximation bounds for curves in dimension 2 and higher. Our analysis thus extends to application-relevant input data such as GPS-trajectories and protein backbones. We show that, if ℓ is part of the input, then there is no polynomial-time approximation scheme unless P = NP. Our constructions yield different bounds for one and two-dimensional curves and the discrete and continuous Fréchet distance. In the case of the discrete Fréchet distance on two-dimensional curves, we show hardness of approximation within a factor close to 2.598. This result also holds when k = 1, and the NP-hardness extends to the case that ℓ = ∞, i.e., for the problem of computing the minimum-enclosing ball under the Fréchet distance. Finally, we observe that a careful adaptation of Gonzalez’ algorithm in combination with a curve simplification yields a 3-approximation in any dimension, provided that an optimal simplification can be computed exactly. We conclude that our approximation bounds are close to being tight.Read More: https://epubs.siam.org/doi/10.1137/1.9781611975482.181

AB - The Euclidean k-Center problem is a classical problem that has been extensively studied in computer science. Given a set G of n points in Euclidean space, the problem is to determine a set C of k centers (not necessarily part of G) such that the maximum distance between a point in G and its nearest neighbor in C is minimized. In this paper we study the corresponding (k, ℓ)-CENTER problem for polygonal curves under the Fréchet distance, that is, given a set G of n polygonal curves in ℝd, each of complexity m, determine a set C of k polygonal curves in ℝd, each of complexity ℓ, such that the maximum Fréchet distance of a curve in G to its closest curve in C is minimized. In their 2016 paper, Driemel, Krivošija, and Sohler give a near-linear time (1 + ε-approximation algorithm for one-dimensional curves, assuming that k and ℓ are constants. In this paper, we substantially extend and improve the known approximation bounds for curves in dimension 2 and higher. Our analysis thus extends to application-relevant input data such as GPS-trajectories and protein backbones. We show that, if ℓ is part of the input, then there is no polynomial-time approximation scheme unless P = NP. Our constructions yield different bounds for one and two-dimensional curves and the discrete and continuous Fréchet distance. In the case of the discrete Fréchet distance on two-dimensional curves, we show hardness of approximation within a factor close to 2.598. This result also holds when k = 1, and the NP-hardness extends to the case that ℓ = ∞, i.e., for the problem of computing the minimum-enclosing ball under the Fréchet distance. Finally, we observe that a careful adaptation of Gonzalez’ algorithm in combination with a curve simplification yields a 3-approximation in any dimension, provided that an optimal simplification can be computed exactly. We conclude that our approximation bounds are close to being tight.Read More: https://epubs.siam.org/doi/10.1137/1.9781611975482.181

U2 - 10.1137/1.9781611975482.181

DO - 10.1137/1.9781611975482.181

M3 - Chapter

SP - 2922

EP - 2938

BT - 30th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)

PB - Society for Industrial and Applied Mathematics (SIAM)

ER -

Buchin K, Driemel A, Gudmundsson J, Horton M, Kostitsyna I, Löffler M et al. Approximating (k,ℓ)-center clustering for curves. In Chan TM, editor, 30th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA). Society for Industrial and Applied Mathematics (SIAM). 2019. p. 2922-2938. Available from, DOI: 10.1137/1.9781611975482.181