Approximating (k,ℓ)-center clustering for curves

Kevin Buchin, Anne Driemel, Joachim Gudmundsson, Michael Horton, Irina Kostitsyna, Maarten Löffler, Martijn Struijs

Onderzoeksoutput: Hoofdstuk in Boek/Rapport/CongresprocedureHoofdstukAcademicpeer review

Uittreksel


The Euclidean k-Center problem is a classical problem that has been extensively studied in computer science. Given a set G of n points in Euclidean space, the problem is to determine a set C of k centers (not necessarily part of G) such that the maximum distance between a point in G and its nearest neighbor in C is minimized. In this paper we study the corresponding (k, ℓ)-CENTER problem for polygonal curves under the Fréchet distance, that is, given a set G of n polygonal curves in ℝd, each of complexity m, determine a set C of k polygonal curves in ℝd, each of complexity ℓ, such that the maximum Fréchet distance of a curve in G to its closest curve in C is minimized. In their 2016 paper, Driemel, Krivošija, and Sohler give a near-linear time (1 + ε-approximation algorithm for one-dimensional curves, assuming that k and ℓ are constants. In this paper, we substantially extend and improve the known approximation bounds for curves in dimension 2 and higher. Our analysis thus extends to application-relevant input data such as GPS-trajectories and protein backbones. We show that, if ℓ is part of the input, then there is no polynomial-time approximation scheme unless P = NP. Our constructions yield different bounds for one and two-dimensional curves and the discrete and continuous Fréchet distance. In the case of the discrete Fréchet distance on two-dimensional curves, we show hardness of approximation within a factor close to 2.598. This result also holds when k = 1, and the NP-hardness extends to the case that ℓ = ∞, i.e., for the problem of computing the minimum-enclosing ball under the Fréchet distance. Finally, we observe that a careful adaptation of Gonzalez’ algorithm in combination with a curve simplification yields a 3-approximation in any dimension, provided that an optimal simplification can be computed exactly. We conclude that our approximation bounds are close to being tight.


Read More: https://epubs.siam.org/doi/10.1137/1.9781611975482.181
TaalEngels
Titel30th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)
RedacteurenTimothy M. Chan
UitgeverijSociety for Industrial and Applied Mathematics (SIAM)
Pagina's2922-2938
Aantal pagina's17
ISBN van elektronische versie978-1-61197-548-2
DOI's
StatusGepubliceerd - 2 jan 2019
EvenementACM-SIAM Symposium on Discrete Algorithms (SODA2019) - San Diego, Verenigde Staten van Amerika
Duur: 6 jan 20199 jan 2019
https://www.siam.org/Conferences/CM/Conference/soda19

Congres

CongresACM-SIAM Symposium on Discrete Algorithms (SODA2019)
Verkorte titelSODA2019
LandVerenigde Staten van Amerika
StadSan Diego
Periode6/01/199/01/19
Internet adres

Vingerafdruk

Clustering
Curve
Simplification
Approximation
Hardness of Approximation
Center Problem
Polynomial Time Approximation Scheme
NP-hardness
Backbone
Euclidean space
Linear Time
Approximation Algorithms
Euclidean
Nearest Neighbor
Computer Science
Ball
Trajectory
Protein
Computing

Citeer dit

Buchin, K., Driemel, A., Gudmundsson, J., Horton, M., Kostitsyna, I., Löffler, M., & Struijs, M. (2019). Approximating (k,ℓ)-center clustering for curves. In T. M. Chan (editor), 30th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) (blz. 2922-2938). Society for Industrial and Applied Mathematics (SIAM). DOI: 10.1137/1.9781611975482.181
Buchin, Kevin ; Driemel, Anne ; Gudmundsson, Joachim ; Horton, Michael ; Kostitsyna, Irina ; Löffler, Maarten ; Struijs, Martijn. / Approximating (k,ℓ)-center clustering for curves. 30th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA). redacteur / Timothy M. Chan. Society for Industrial and Applied Mathematics (SIAM), 2019. blz. 2922-2938
@inbook{2aaf61c0a0164fa18984490e784a475a,
title = "Approximating (k,ℓ)-center clustering for curves",
abstract = "The Euclidean k-Center problem is a classical problem that has been extensively studied in computer science. Given a set G of n points in Euclidean space, the problem is to determine a set C of k centers (not necessarily part of G) such that the maximum distance between a point in G and its nearest neighbor in C is minimized. In this paper we study the corresponding (k, ℓ)-CENTER problem for polygonal curves under the Fr{\'e}chet distance, that is, given a set G of n polygonal curves in ℝd, each of complexity m, determine a set C of k polygonal curves in ℝd, each of complexity ℓ, such that the maximum Fr{\'e}chet distance of a curve in G to its closest curve in C is minimized. In their 2016 paper, Driemel, Krivošija, and Sohler give a near-linear time (1 + ε-approximation algorithm for one-dimensional curves, assuming that k and ℓ are constants. In this paper, we substantially extend and improve the known approximation bounds for curves in dimension 2 and higher. Our analysis thus extends to application-relevant input data such as GPS-trajectories and protein backbones. We show that, if ℓ is part of the input, then there is no polynomial-time approximation scheme unless P = NP. Our constructions yield different bounds for one and two-dimensional curves and the discrete and continuous Fr{\'e}chet distance. In the case of the discrete Fr{\'e}chet distance on two-dimensional curves, we show hardness of approximation within a factor close to 2.598. This result also holds when k = 1, and the NP-hardness extends to the case that ℓ = ∞, i.e., for the problem of computing the minimum-enclosing ball under the Fr{\'e}chet distance. Finally, we observe that a careful adaptation of Gonzalez’ algorithm in combination with a curve simplification yields a 3-approximation in any dimension, provided that an optimal simplification can be computed exactly. We conclude that our approximation bounds are close to being tight.Read More: https://epubs.siam.org/doi/10.1137/1.9781611975482.181",
author = "Kevin Buchin and Anne Driemel and Joachim Gudmundsson and Michael Horton and Irina Kostitsyna and Maarten L{\"o}ffler and Martijn Struijs",
year = "2019",
month = "1",
day = "2",
doi = "10.1137/1.9781611975482.181",
language = "English",
pages = "2922--2938",
editor = "Chan, {Timothy M.}",
booktitle = "30th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)",
publisher = "Society for Industrial and Applied Mathematics (SIAM)",
address = "United States",

}

Buchin, K, Driemel, A, Gudmundsson, J, Horton, M, Kostitsyna, I, Löffler, M & Struijs, M 2019, Approximating (k,ℓ)-center clustering for curves. in TM Chan (redactie), 30th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA). Society for Industrial and Applied Mathematics (SIAM), blz. 2922-2938, San Diego, Verenigde Staten van Amerika, 6/01/19. DOI: 10.1137/1.9781611975482.181

Approximating (k,ℓ)-center clustering for curves. / Buchin, Kevin; Driemel, Anne; Gudmundsson, Joachim; Horton, Michael; Kostitsyna, Irina; Löffler, Maarten; Struijs, Martijn.

30th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA). redactie / Timothy M. Chan. Society for Industrial and Applied Mathematics (SIAM), 2019. blz. 2922-2938.

Onderzoeksoutput: Hoofdstuk in Boek/Rapport/CongresprocedureHoofdstukAcademicpeer review

TY - CHAP

T1 - Approximating (k,ℓ)-center clustering for curves

AU - Buchin,Kevin

AU - Driemel,Anne

AU - Gudmundsson,Joachim

AU - Horton,Michael

AU - Kostitsyna,Irina

AU - Löffler,Maarten

AU - Struijs,Martijn

PY - 2019/1/2

Y1 - 2019/1/2

N2 - The Euclidean k-Center problem is a classical problem that has been extensively studied in computer science. Given a set G of n points in Euclidean space, the problem is to determine a set C of k centers (not necessarily part of G) such that the maximum distance between a point in G and its nearest neighbor in C is minimized. In this paper we study the corresponding (k, ℓ)-CENTER problem for polygonal curves under the Fréchet distance, that is, given a set G of n polygonal curves in ℝd, each of complexity m, determine a set C of k polygonal curves in ℝd, each of complexity ℓ, such that the maximum Fréchet distance of a curve in G to its closest curve in C is minimized. In their 2016 paper, Driemel, Krivošija, and Sohler give a near-linear time (1 + ε-approximation algorithm for one-dimensional curves, assuming that k and ℓ are constants. In this paper, we substantially extend and improve the known approximation bounds for curves in dimension 2 and higher. Our analysis thus extends to application-relevant input data such as GPS-trajectories and protein backbones. We show that, if ℓ is part of the input, then there is no polynomial-time approximation scheme unless P = NP. Our constructions yield different bounds for one and two-dimensional curves and the discrete and continuous Fréchet distance. In the case of the discrete Fréchet distance on two-dimensional curves, we show hardness of approximation within a factor close to 2.598. This result also holds when k = 1, and the NP-hardness extends to the case that ℓ = ∞, i.e., for the problem of computing the minimum-enclosing ball under the Fréchet distance. Finally, we observe that a careful adaptation of Gonzalez’ algorithm in combination with a curve simplification yields a 3-approximation in any dimension, provided that an optimal simplification can be computed exactly. We conclude that our approximation bounds are close to being tight.Read More: https://epubs.siam.org/doi/10.1137/1.9781611975482.181

AB - The Euclidean k-Center problem is a classical problem that has been extensively studied in computer science. Given a set G of n points in Euclidean space, the problem is to determine a set C of k centers (not necessarily part of G) such that the maximum distance between a point in G and its nearest neighbor in C is minimized. In this paper we study the corresponding (k, ℓ)-CENTER problem for polygonal curves under the Fréchet distance, that is, given a set G of n polygonal curves in ℝd, each of complexity m, determine a set C of k polygonal curves in ℝd, each of complexity ℓ, such that the maximum Fréchet distance of a curve in G to its closest curve in C is minimized. In their 2016 paper, Driemel, Krivošija, and Sohler give a near-linear time (1 + ε-approximation algorithm for one-dimensional curves, assuming that k and ℓ are constants. In this paper, we substantially extend and improve the known approximation bounds for curves in dimension 2 and higher. Our analysis thus extends to application-relevant input data such as GPS-trajectories and protein backbones. We show that, if ℓ is part of the input, then there is no polynomial-time approximation scheme unless P = NP. Our constructions yield different bounds for one and two-dimensional curves and the discrete and continuous Fréchet distance. In the case of the discrete Fréchet distance on two-dimensional curves, we show hardness of approximation within a factor close to 2.598. This result also holds when k = 1, and the NP-hardness extends to the case that ℓ = ∞, i.e., for the problem of computing the minimum-enclosing ball under the Fréchet distance. Finally, we observe that a careful adaptation of Gonzalez’ algorithm in combination with a curve simplification yields a 3-approximation in any dimension, provided that an optimal simplification can be computed exactly. We conclude that our approximation bounds are close to being tight.Read More: https://epubs.siam.org/doi/10.1137/1.9781611975482.181

U2 - 10.1137/1.9781611975482.181

DO - 10.1137/1.9781611975482.181

M3 - Chapter

SP - 2922

EP - 2938

BT - 30th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)

PB - Society for Industrial and Applied Mathematics (SIAM)

ER -

Buchin K, Driemel A, Gudmundsson J, Horton M, Kostitsyna I, Löffler M et al. Approximating (k,ℓ)-center clustering for curves. In Chan TM, redacteur, 30th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA). Society for Industrial and Applied Mathematics (SIAM). 2019. blz. 2922-2938. Beschikbaar vanaf, DOI: 10.1137/1.9781611975482.181