TY - JOUR
T1 - Attacker-Centric View of a Detection Game against Advanced Persistent Threats
AU - Xiao, Liang
AU - Xu, Dongjin
AU - Mandayam, Narayan B.
AU - Poor, H. Vincent
N1 - Funding Information:
This work was supported in part by the Natural Science Foundation of China under Grants 61671396, 61472335, and 91638204, in part by the Open Research Fund of the National Mobile Communications Research Laboratory, Southeast University (No. 2018D08), in part by the Science and Technology Innovation Project of Foshan City, China (Grant No. 2015IT100095), and in part by the U.S. National Science Foundation under Grants CMMI-1435778, ECCS-1549881, ECCS-1647198, and ACI-1541069.
Publisher Copyright:
© 2002-2012 IEEE.
PY - 2018/11/1
Y1 - 2018/11/1
AB - Advanced persistent threats (APTs) are a major threat to cyber-security, causing significant financial and privacy losses each year. In this paper, cumulative prospect theory (CPT) is applied to study the interactions between a cyber system and an APT attacker when each of them makes subjective decisions to choose their scan interval and attack interval, respectively. Both the probability distortion effect and the framing effect are applied to model the deviation of subjective decisions of end-users from the objective decisions governed by expected utility theory, under uncertain attack durations in a pure-strategy game and scan interval in a mixed-strategy game. The CPT-based APT detection game incorporates both the probability weighting distortion and the framing effect of the subjective attacker and security agent of the cyber system, rather than discrete decision weights, as in earlier prospect theoretic study of APT detection. The Nash equilibria of the APT detection game are derived, showing that a subjective attacker becomes risk-seeking if the frame of reference for evaluating the utility is large, and becomes risk-averse if the frame of reference for evaluating the utility is small. A policy hill-climbing (PHC) based detection scheme is proposed to increase the policy uncertainty to fool the attacker in the dynamic game, and a 'hotbooting' technique that exploits experiences in similar scenarios to initialize the quality values is developed to accelerate the learning speed of PHC-based detection. A practical example of a mobile network is presented to evaluate the performance of the proposed detection strategy. Simulation results show that the proposed strategy can improve detection performance with a higher data protection level and utilities of the cloud in the presence of an attacker compared with a standard Q-learning strategy.
KW - Reinforcement learning
KW - advanced persistent threat
KW - cumulative prospect theory
KW - data protection
KW - game theory
UR - http://www.scopus.com/inward/record.url?scp=85043471411&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85043471411&partnerID=8YFLogxK
U2 - 10.1109/TMC.2018.2814052
DO - 10.1109/TMC.2018.2814052
M3 - Article
AN - SCOPUS:85043471411
SN - 1536-1233
VL - 17
SP - 2512
EP - 2523
JO - IEEE Transactions on Mobile Computing
JF - IEEE Transactions on Mobile Computing
IS - 11
M1 - 8310016
ER -