# Probabilistic clustering of high dimensional norms

Research output: Chapter in Book/Report/Conference proceedingConference contribution

2 Scopus citations

## Abstract

Separating decompositions of metric spaces are an important randomized clustering paradigm that was formulated by Bartal in [Bar96] and is defined as follows. Given a metric space (X; dX), its modulus of separated decomposability, denoted SEP(X; dX), is the in mum over those 2 (0;1] such that for every finite subset S X and every there exists a distribution over random partitions P of S into sets of diameter at most such that for every x; y 2 S the probability that both x and y do not fall into the same cluster of the random partition P is at most dX(x; y) Here we obtain new bounds on SEP(X; kkX) when (X; k kX) is a finite dimensional normed space, yielding, as a special case, that n minfp; log ng for every p 2 [2;1]. This improves over the work [CCG+98] of Charikar, Chekuri, Goel, Guha, and Plotkin, who obtained this bound when p = 2, yet for p 2 (2;1] they obtained the asymptotically weaker estimate SEP(n p ) . n11=p. One should note that it was claimed in [CCG+98] that the bound SEP(n p ) . n11=p is sharp for every p 2 [2;1], and in particular it was claimed in [CCG+98] that SEP(n 1) n. However, the above results show that this claim of [CCG+98] is incorrect for every p 2 (2;1]. Our new bounds on the modulus of separated decomposability rely on extremal results for orthogonal hyperplane projections of convex bodies, specifically using the work [BN02] of Barthe and the author. This yields additional refined estimates, an example of which is that for every n 2 N and k 2 f1; : : : ; ng we have SEP 2 )6k denotes the subset of Rn consisting of all those vectors that have at most k nonzero entries, equipped with the Euclidean metric. The above statements have implications to the Lipschitz extension problem through its connection to random partitions that was developed by Lee and the author in [LN04, LN05]. Given a metric space (X; dX), let e(X) denote the in mum over those K 2 (0;1] such that for every Banach space Y and every subset S X, every 1-Lipschitz function f : S ! Y has a K-Lipschitz extension to all of X. Johnson, Lindenstrauss and Schechtman proved in [JLS86] that e(X) . dim(X) for every finite dimensional normed space (X; k kX). It is a longstanding open problem to determine the correct asymptotic dependence on dim(X) in this context, with the best known lower bound, due to Johnson and Lindenstrauss [JL84], being that the quantity e(X) must sometimes be at least a constant multiple of p dim(X). In particular, the previously best known upper bound on e(n was the O(n) estimate of [JLS86]. It is shown here that for every n 2 N we have n log n, thus answering (up to logarithmic factors) a question that was posed by Brudnyi and Brudnyi in [BB05, Problem 2]. More generally, n minfp; log ng for every p 2 [2;1], thus resolving (negatively) a conjecture of Brudnyi and Brudnyi in [BB05, Conjecture 5].

Original language English (US) 28th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017 Philip N. Klein Association for Computing Machinery 690-709 20 9781611974782 https://doi.org/10.1137/1.9781611974782.44 Published - 2017 28th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017 - Barcelona, SpainDuration: Jan 16 2017 → Jan 19 2017

### Publication series

Name Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms 0

### Other

Other 28th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017 Spain Barcelona 1/16/17 → 1/19/17

## All Science Journal Classification (ASJC) codes

• Software
• Mathematics(all)