TY - JOUR
T1 - Linear boolean classification, coding and the critical problem
AU - Abbe, Emmanuel Auguste
AU - Alon, Noga
AU - Bandeira, Afonso S.
AU - Sandon, Colin
N1 - Funding Information:
E. Abbe was supported in part by the Army Research Office under Grant W911NF-16-1-0051 and in part by NSF within the Directorate for Biological Sciences under Grant CIF-1706648. N. Alon was supported in part by the Israel Science Foundation, in part by a USAIsraeli BSF Grant, in part by the Oswald Veblen Fund for Mathematics, and in part by the Israeli I-Core Program. A. S. Bandeira was supported by the Air Force Office of Scientific Research under Grant FA9550-12-1-0317. This paper was presented at the ISIT 2014 [1].
Publisher Copyright:
© 1963-2012 IEEE.
PY - 2016/4
Y1 - 2016/4
N2 - This paper considers the problem of linear Boolean classification, where the goal is to determine in which set, among two given sets of Boolean vectors, an unknown vector belongs to by making linear queries. Finding the least number of queries is equivalent to determining the minimal rank of a matrix over GF(2) , whose kernel does not intersect a given set S. In the case where $S$ is a Hamming ball, this reduces to finding linear codes of largest dimension. For a general set $S$ , this is an instance of the critical problem posed by Crapo and Rota in 1970, open in general. This paper focuses on the case where $S$ is an annulus. As opposed to balls, it is shown that an optimal kernel is composed not only of dense but also of sparse vectors, and the optimal mixture is identified in various cases. These findings corroborate a proposed conjecture that for an annulus of inner and outer radius $nq$ and $np$ respectively, the optimal relative rank is given by the normalized entropy $(1-q)H(p/(1-q))$ , an extension of the Gilbert-Varshamov bound.
AB - This paper considers the problem of linear Boolean classification, where the goal is to determine in which set, among two given sets of Boolean vectors, an unknown vector belongs to by making linear queries. Finding the least number of queries is equivalent to determining the minimal rank of a matrix over GF(2) , whose kernel does not intersect a given set S. In the case where $S$ is a Hamming ball, this reduces to finding linear codes of largest dimension. For a general set $S$ , this is an instance of the critical problem posed by Crapo and Rota in 1970, open in general. This paper focuses on the case where $S$ is an annulus. As opposed to balls, it is shown that an optimal kernel is composed not only of dense but also of sparse vectors, and the optimal mixture is identified in various cases. These findings corroborate a proposed conjecture that for an annulus of inner and outer radius $nq$ and $np$ respectively, the optimal relative rank is given by the normalized entropy $(1-q)H(p/(1-q))$ , an extension of the Gilbert-Varshamov bound.
KW - Linear codes
KW - classification
KW - combinatorics
KW - learning theory
UR - http://www.scopus.com/inward/record.url?scp=84963808129&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84963808129&partnerID=8YFLogxK
U2 - 10.1109/TIT.2016.2531085
DO - 10.1109/TIT.2016.2531085
M3 - Article
AN - SCOPUS:84963808129
SN - 0018-9448
VL - 62
SP - 1667
EP - 1673
JO - IRE Professional Group on Information Theory
JF - IRE Professional Group on Information Theory
IS - 4
M1 - 7410095
ER -