Restless Poachers: Handling Exploration-Exploitation Tradeoffs in Security Domains

Citation:

Yundi Qian, Chao Zhang, Bhaskar Krishnamachari, and Milind Tambe. 2016. “Restless Poachers: Handling Exploration-Exploitation Tradeoffs in Security Domains .” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS).

Abstract:

The success of Stackelberg Security Games (SSGs) in counterterrorism domains has inspired researchers’ interest in applying game-theoretic models to other security domains with frequent interactions between defenders and attackers, e.g., wildlife protection. Previous research optimizes defenders’ strategies by modeling this problem as a repeated Stackelberg game, capturing the special property in this domain — frequent interactions between defenders and attackers. However, this research fails to handle exploration-exploitation tradeoff in this domain caused by the fact that defenders only have knowledge of attack activities at targets they protect. This paper addresses this shortcoming and provides the following contributions: (i) We formulate the problem as a restless multi-armed bandit (RMAB) model to address this challenge. (ii) To use Whittle index policy to plan for patrol strategies in the RMAB, we provide two sufficient conditions for indexability and an algorithm to numerically evaluate indexability. (iii) Given indexability, we propose a binary search based algorithm to find Whittle index policy efficiently.
See also: 2016