TY - GEN
T1 - Analyzing the robustness of open-world machine learning
AU - Sehwag, Vikash
AU - Bhagoji, Arjun Nitin
AU - Song, Liwei
AU - Sitawarin, Chawin
AU - Cullina, Daniel
AU - Chiang, Mung
AU - Mittal, Prateek
N1 - Publisher Copyright:
© 2019 Copyright held by the owner/author(s).
PY - 2019/11/11
Y1 - 2019/11/11
AB - When deploying machine learning models in real-world applications, an open-world learning framework is needed to deal with both normal in-distribution inputs and undesired out-of-distribution (OOD) inputs. Open-world learning frameworks include OOD detectors that aim to discard input examples that are not from the same distribution as the training data of machine learning classifiers. However, our understanding of current OOD detectors is limited to the setting of benign OOD data, and it remains an open question whether they are robust in the presence of adversaries. In this paper, we present the first analysis of the robustness of open-world learning frameworks in the presence of adversaries by introducing and designing OOD adversarial examples. Our experimental results show that current OOD detectors can be easily evaded by slightly perturbing benign OOD inputs, revealing a severe limitation of current open-world learning frameworks. Furthermore, we find that OOD adversarial examples also pose a strong threat to adversarial-training-based defense methods despite their effectiveness against in-distribution adversarial attacks. To counteract these threats and ensure the trustworthy detection of OOD inputs, we outline a preliminary design for a robust open-world machine learning framework.
KW - Adversarial example
KW - Deep learning
KW - Open-world recognition
UR - http://www.scopus.com/inward/record.url?scp=85075879823&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85075879823&partnerID=8YFLogxK
U2 - 10.1145/3338501.3357372
DO - 10.1145/3338501.3357372
M3 - Conference contribution
AN - SCOPUS:85075879823
T3 - Proceedings of the ACM Conference on Computer and Communications Security
SP - 105
EP - 116
BT - AISec 2019 - Proceedings of the 12th ACM Workshop on Artificial Intelligence and Security
PB - Association for Computing Machinery
T2 - 12th ACM Workshop on Artificial Intelligence and Security, AISec 2019, co-located with CCS 2019
Y2 - 15 November 2019
ER -