TY - GEN
T1 - Collaborative, privacy-preserving data aggregation at scale
AU - Applebaum, Benny
AU - Ringberg, Haakon
AU - Freedman, Michael Joseph
AU - Caesar, Matthew
AU - Rexford, Jennifer L.
PY - 2010
Y1 - 2010
AB - Combining and analyzing data collected at multiple administrative locations is critical for a wide variety of applications, such as detecting malicious attacks or computing an accurate estimate of the popularity of Web sites. However, legitimate concerns about privacy often inhibit participation in collaborative data aggregation. In this paper, we design, implement, and evaluate a practical solution for privacy-preserving data aggregation (PDA) among a large number of participants. Scalability and efficiency are achieved through a "semi-centralized" architecture that divides responsibility between a proxy that obliviously blinds the client inputs and a database that aggregates values by (blinded) keywords and identifies those keywords whose values satisfy some evaluation function. Our solution leverages a novel cryptographic protocol that provably protects the privacy of both the participants and the keywords, provided that the proxy and the database do not collude, even if either party may be individually malicious. Our prototype implementation can handle over a million suspect IP addresses per hour when deployed across only two quad-core servers, and its throughput scales linearly with additional computational resources.
UR - http://www.scopus.com/inward/record.url?scp=77955452807&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77955452807&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-14527-8_4
DO - 10.1007/978-3-642-14527-8_4
M3 - Conference contribution
AN - SCOPUS:77955452807
SN - 3642145264
SN - 9783642145261
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 56
EP - 74
BT - Privacy Enhancing Technologies - 10th International Symposium, PETS 2010, Proceedings
T2 - 10th International Symposium on Privacy Enhancing Technologies, PETS 2010
Y2 - 21 July 2010 through 23 July 2010
ER -