Learning Bounded Rationality Models of the Adversary in Repeated Stackelberg Security Games


Debarun Kar, Fei Fang, Francesco Delle Fave, Nicole Sintov, Arunesh Sinha, Aram Galstyan, Bo An, and Milind Tambe. 2015. “Learning Bounded Rationality Models of the Adversary in Repeated Stackelberg Security Games.” In Adaptive and Learning Agents Workshop at the International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2015).


Several competing human behavior models have been proposed to model and protect against boundedly rational adversaries in repeated Stackelberg security games (SSGs). However, these existing models fail to address three main issues that are extremely detrimental to defender performance. First, while they attempt to learn adversary behavior models from adversaries’ past actions (“attacks on targets”), they fail to take into account adversaries’ future adaptation based on the successes or failures of these past actions. Second, they assume that sufficient data in the initial rounds will lead to a reliable model of the adversary. However, our analysis reveals that the issue is not the amount of data, but that not enough of the attack surface is exposed to the adversary to learn a reliable model. Third, current leading approaches have failed to include probability weighting functions, even though it is well known that human beings’ weighting of probability is typically nonlinear. Moreover, the performance of these models may depend critically on the learning algorithm used to learn their parameters. The first contribution of this paper is a new human behavior model, SHARP, which mitigates these three limitations as follows: (i) SHARP reasons based on the success or failure of the adversary’s past actions on exposed portions of the attack surface to model adversary adaptiveness; (ii) SHARP reasons about similarity between exposed and unexposed areas of the attack surface, and also incorporates a discounting parameter to mitigate the adversary’s lack of exposure to enough of the attack surface; and (iii) SHARP integrates a non-linear probability weighting function to capture the adversary’s true weighting of probability. Our second contribution is a comparison of two different approaches for learning the parameters of the bounded rationality models.
Our third contribution is a first “longitudinal study” – at least in the context of SSGs – of competing models in settings involving repeated interaction between the attacker and the defender. This study, in which each experiment ran for multiple weeks with individual sets of human subjects, illustrates the strengths and weaknesses of different models and shows the advantages of SHARP.
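To make the idea of a non-linear probability weighting function in point (iii) concrete, here is a minimal sketch. The abstract does not state which functional form SHARP uses, so the two-parameter form below, the function name weight_probability, and the parameter values delta and gamma are assumptions chosen purely for illustration.

```python
def weight_probability(p, delta=0.8, gamma=0.6):
    """Map an objective probability p to a perceived (weighted) probability.

    Illustrative two-parameter form (an assumption, not taken from the abstract):
        w(p) = delta * p**gamma / (delta * p**gamma + (1 - p)**gamma)
    gamma controls curvature, delta controls elevation.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    numerator = delta * p ** gamma
    denominator = numerator + (1.0 - p) ** gamma
    return numerator / denominator


if __name__ == "__main__":
    # Compare objective and perceived probabilities (hypothetical values).
    for p in (0.0, 0.1, 0.5, 0.9, 1.0):
        print(f"p={p:.1f} -> w(p)={weight_probability(p):.3f}")
```

With these hypothetical parameters, small probabilities are weighted above their objective value and large ones below it, while setting delta = gamma = 1 recovers linear weighting. In a fitted model the parameters would be learned from attack data rather than fixed by hand.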