TY - GEN
T1 - Sensitivity of PCA for traffic anomaly detection
AU - Ringberg, Haakon
AU - Soule, Augustin
AU - Rexford, Jennifer L.
AU - Diot, Christophe
PY - 2007
Y1 - 2007
N2 - Detecting anomalous traffic is a crucial part of managing IP networks. In recent years, network-wide anomaly detection based on Principal Component Analysis (PCA) has emerged as a powerful method for detecting a wide variety of anomalies. We show that tuning PCA to operate effectively in practice is difficult and requires more robust techniques than have been presented thus far. We analyze a week of network-wide traffic measurements from two IP backbones (Abilene and Geant) across three different traffic aggregations (ingress routers, OD flows, and input links), and conduct a detailed inspection of the feature time series for each suspected anomaly. Our study identifies and evaluates four main challenges of using PCA to detect traffic anomalies: (i) the false positive rate is very sensitive to small differences in the number of principal components in the normal subspace, (ii) the effectiveness of PCA is sensitive to the level of aggregation of the traffic measurements, (iii) a large anomaly may inadvertently pollute the normal subspace, (iv) correctly identifying which flow triggered the anomaly detector is an inherently challenging problem.
AB - Detecting anomalous traffic is a crucial part of managing IP networks. In recent years, network-wide anomaly detection based on Principal Component Analysis (PCA) has emerged as a powerful method for detecting a wide variety of anomalies. We show that tuning PCA to operate effectively in practice is difficult and requires more robust techniques than have been presented thus far. We analyze a week of network-wide traffic measurements from two IP backbones (Abilene and Geant) across three different traffic aggregations (ingress routers, OD flows, and input links), and conduct a detailed inspection of the feature time series for each suspected anomaly. Our study identifies and evaluates four main challenges of using PCA to detect traffic anomalies: (i) the false positive rate is very sensitive to small differences in the number of principal components in the normal subspace, (ii) the effectiveness of PCA is sensitive to the level of aggregation of the traffic measurements, (iii) a large anomaly may inadvertently pollute the normal subspace, (iv) correctly identifying which flow triggered the anomaly detector is an inherently challenging problem.
KW - Network traffic analysis
KW - Principal component analysis
KW - Traffic engineering
UR - http://www.scopus.com/inward/record.url?scp=36349029177&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=36349029177&partnerID=8YFLogxK
U2 - 10.1145/1269899.1254895
DO - 10.1145/1269899.1254895
M3 - Conference contribution
AN - SCOPUS:36349029177
SN - 1595936394
SN - 9781595936394
T3 - Performance Evaluation Review
SP - 109
EP - 120
BT - SIGMETRICS'07 - Proceedings of the 2007 International Conference on Measurement and Modeling of Computer Systems
T2 - SIGMETRICS'07 - 2007 International Conference on Measurement and Modeling of Computer Systems
Y2 - 12 June 2007 through 16 June 2007
ER -