TY - GEN

T1 - Balanced hashing, color coding and approximate counting

AU - Alon, Noga

AU - Gutner, Shai

PY - 2009/12/24

Y1 - 2009/12/24

N2 - Color Coding is an algorithmic technique for deciding efficiently if a given input graph contains a path of a given length (or another small subgraph of constant tree-width). Applications of the method in computational biology motivate the study of similar algorithms for counting the number of copies of a given subgraph. While it is unlikely that exact counting of this type can be performed efficiently, as the problem is #W[1]-complete even for paths, approximate counting is possible, and leads to the investigation of an intriguing variant of families of perfect hash functions. A family of functions from [n] to [k] is an (ε,k)-balanced family of hash functions, if there exists a positive T so that for every K ⊂ [n] of size |K| = k, the number of functions in the family that are one-to-one on K is between (1 - ε)T and (1 + ε)T. The family is perfectly k-balanced if it is (0,k)-balanced. We show that every such perfectly k-balanced family is of size at least c(k)n [k/2], and that for every ε > 1/poly(k) there are explicit constructions of (ε,k)-balanced families of hash functions from [n] to [k] of size e(1+o(1))k logn. This is tight up to the o(1)-term in the exponent, and supplies deterministic polynomial time algorithms for approximately counting the number of paths or cycles of a specified length k (or copies of any graph H with k vertices and bounded tree-width) in a given input graph of size n, up to relative error ε, for all k ≤ O(logn).

AB - Color Coding is an algorithmic technique for deciding efficiently if a given input graph contains a path of a given length (or another small subgraph of constant tree-width). Applications of the method in computational biology motivate the study of similar algorithms for counting the number of copies of a given subgraph. While it is unlikely that exact counting of this type can be performed efficiently, as the problem is #W[1]-complete even for paths, approximate counting is possible, and leads to the investigation of an intriguing variant of families of perfect hash functions. A family of functions from [n] to [k] is an (ε,k)-balanced family of hash functions, if there exists a positive T so that for every K ⊂ [n] of size |K| = k, the number of functions in the family that are one-to-one on K is between (1 - ε)T and (1 + ε)T. The family is perfectly k-balanced if it is (0,k)-balanced. We show that every such perfectly k-balanced family is of size at least c(k)n [k/2], and that for every ε > 1/poly(k) there are explicit constructions of (ε,k)-balanced families of hash functions from [n] to [k] of size e(1+o(1))k logn. This is tight up to the o(1)-term in the exponent, and supplies deterministic polynomial time algorithms for approximately counting the number of paths or cycles of a specified length k (or copies of any graph H with k vertices and bounded tree-width) in a given input graph of size n, up to relative error ε, for all k ≤ O(logn).

KW - Approximate counting of subgraphs

KW - Color-coding

KW - Derandomization

KW - Expanders

KW - K-wise independence

KW - Perfect hashing

UR - http://www.scopus.com/inward/record.url?scp=72249116155&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=72249116155&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-11269-0_1

DO - 10.1007/978-3-642-11269-0_1

M3 - Conference contribution

AN - SCOPUS:72249116155

SN - 3642112684

SN - 9783642112683

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 1

EP - 16

BT - Parameterized and Exact Computation - 4th International Workshop, IWPEC 2009, Revised Selected Papers

T2 - 4th International Workshop on Parameterized and Exact Computation, IWPEC 2009

Y2 - 10 September 2009 through 11 September 2009

ER -