Improved Computational Models of Human Behavior in Security Games


Rong Yang, Christopher Kiekintveld, Fernando Ordonez, Milind Tambe, and Richard John. 2011. “Improved Computational Models of Human Behavior in Security Games.” In International Conference on Autonomous Agents and Multiagent Systems (Extended Abstract).


Security games are a special class of attacker-defender Stackelberg games. In these non-zero-sum games, the attacker’s utility for attacking a target decreases as the defender allocates more resources to protect it (and vice versa for the defender). The defender (leader) first commits to a mixed strategy, assuming the attacker (follower) decides on a pure strategy after observing the defender’s strategy. This models the situation where an attacker conducts surveillance to learn the defender’s mixed strategy and then launches an attack on a single target. Given that the defender has limited resources, she must design her mixed strategy optimally against the adversaries’ response to maximize effectiveness. One leading family of algorithms for computing such mixed strategies is DOBSS and its successors [3, 5], which are used in the deployed ARMOR [5] and IRIS [8] applications. One key set of assumptions these systems make concerns how attackers choose strategies based on their knowledge of the security strategy. Typically, such systems apply the standard game-theoretic assumption that attackers are perfectly rational. This is a reasonable proxy for the worst case of a highly intelligent attacker, but it can lead to a defense strategy that is not robust against attackers using different decision procedures, and it fails to exploit known weaknesses in human decision-making. Indeed, it is widely accepted that standard game-theoretic assumptions of perfect rationality are not ideal for predicting the behavior of humans in multi-agent decision problems [1]. Thus, integrating more realistic models of human decision-making has become necessary in solving real-world security problems. The current leading contender that accounts for human behavior in security games is COBRA [6], which assumes that adversaries can deviate to ε-optimal strategies and that they have an anchoring bias when interpreting a probability distribution.
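The perfectly rational attacker model described above can be illustrated with a minimal sketch. It is not taken from DOBSS or COBRA; the function names, the per-target `reward` and `penalty` payoffs, and the example numbers are all hypothetical. It assumes the defender's mixed strategy is summarized as a coverage probability for each target.

```python
def attacker_expected_utility(coverage, reward, penalty):
    """Expected attacker utility per target.

    coverage[i]: probability that target i is protected (hypothetical input).
    reward[i]:   attacker payoff if target i is attacked while unprotected.
    penalty[i]:  attacker payoff if target i is attacked while protected.
    """
    return [(1 - x) * r + x * p for x, r, p in zip(coverage, reward, penalty)]


def best_response(coverage, reward, penalty):
    """Index of the target a perfectly rational attacker would choose."""
    utils = attacker_expected_utility(coverage, reward, penalty)
    return max(range(len(utils)), key=utils.__getitem__)


# Illustrative numbers only: more coverage lowers the attacker's utility.
coverage = [0.5, 0.25, 0.0]
reward = [8, 4, 2]
penalty = [-4, -2, -1]
utils = attacker_expected_utility(coverage, reward, penalty)
target = best_response(coverage, reward, penalty)
```

In this toy instance the most valuable target is also the most heavily covered, so the rational attacker shifts to the second target.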
It remains an open question whether other models yield better solutions than COBRA against human adversaries. The literature has introduced a multitude of candidate models, but there is an important empirical question of which model best represents the salient features of human behavior in applied security contexts. We address these open questions by developing three new algorithms to generate defender strategies in security games, based on two fundamental theories of human behavior for predicting an attacker’s decision: Prospect Theory (PT) [2] and Quantal Response Equilibrium (QRE) [4]. PT is a Nobel Prize-winning theory that describes human decision making as a process of maximizing ‘prospect’, defined as the weighted sum of the benefit of all possible outcomes for each action. QRE suggests that instead of strictly maximizing utility, individuals respond stochastically in games: the chance of selecting a non-optimal strategy increases as the cost of such an error decreases.
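The two behavioral models can be sketched as follows. This is an illustration under stated assumptions, not the algorithms developed in the paper. The logit form of quantal response and its precision parameter `lam` are one common parametrization of QRE, assumed here. The weighting and value functions use the functional forms and parameter estimates published by Tversky and Kahneman (1992), which the abstract does not specify. All function names and example numbers are hypothetical.

```python
import math


def quantal_response(utilities, lam):
    """Logit quantal response (assumed form): P(i) proportional to exp(lam * U_i).

    lam = 0 gives uniformly random choice; large lam approaches best response.
    """
    scaled = [lam * u for u in utilities]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def weight(p, gamma=0.61):
    """Probability weighting: overweights small p, underweights large p."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)


def value(c, alpha=0.88, beta=0.88, theta=2.25):
    """Value function: concave for gains, steeper for losses (loss aversion)."""
    return c ** alpha if c >= 0 else -theta * (-c) ** beta


def prospect(coverage, reward, penalty):
    """Prospect of attacking one target: weighted sum over its two outcomes."""
    return weight(1 - coverage) * value(reward) + weight(coverage) * value(penalty)


# Illustrative attacker utilities for three targets.
utils = [2.0, 2.5, 2.0]
probs = quantal_response(utils, lam=1.0)
```

Under quantal response the higher-utility target is the most likely choice, but the other targets keep positive probability, which matches the description of errors becoming more likely as they become less costly.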