TY - GEN
T1 - Structure preserving anonymization of router configuration data
AU - Maltz, David A.
AU - Zhan, Jibin
AU - Xie, Geoffrey
AU - Zhang, Hui
AU - Hjálmtýsson, Gísli
AU - Greenberg, Albert
AU - Rexford, Jennifer
PY - 2004
Y1 - 2004
N2 - A repository of router configuration files from production networks would provide the research community with a treasure trove of data about network topologies, routing designs, and security policies. However, configuration files have been largely unobtainable precisely because they provide detailed information that could be exploited by competitors and attackers. This paper describes a method for anonymizing router configuration files by removing all information that connects the data to the identity of the originating network, while still preserving the structure of information that makes the data valuable to networking researchers. Anonymizing configuration files has unusual requirements, including preserving relationships between elements of data, anonymizing regular expressions, and robustly coping with more than 200 versions of the configuration language, that mean conventional tools and techniques are poorly suited to the problem. Our anonymization method has been validated with a major carrier, earning unprivileged researchers access to the configuration files of more than 7600 routers in 31 networks. Through example analysis, we demonstrate that the anonymized data retains the key properties of the network design. We believe that applying our single-blind methodology to a large number of production networks from different sources would be of tremendous value to both the research and operations communities.
KW - Data anonymization
KW - Router configuration
KW - Security
UR - http://www.scopus.com/inward/record.url?scp=14944358967&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=14944358967&partnerID=8YFLogxK
U2 - 10.1145/1028788.1028819
DO - 10.1145/1028788.1028819
M3 - Conference contribution
AN - SCOPUS:14944358967
SN - 1581138210
SN - 9781581138214
T3 - Proceedings of the 2004 ACM SIGCOMM Internet Measurement Conference, IMC 2004
SP - 239
EP - 244
BT - Proceedings of the 2004 ACM SIGCOMM Internet Measurement Conference, IMC 2004
PB - Association for Computing Machinery
T2 - Proceedings of the 2004 ACM SIGCOMM Internet Measurement Conference, IMC 2004
Y2 - 25 October 2004 through 27 October 2004
ER -