Publications

2016
Shahrzad Gholami, Bryan Wilder, Matthew Brown, Dana Thomas, Nicole Sintov, and Milind Tambe. 2016. “Divide to Defend: Collusive Security Games.” In Conference on Decision and Game Theory for Security (GameSec 2016).
Research on security games has focused on settings where the defender must protect against either a single adversary or multiple, independent adversaries. However, there are a variety of real-world security domains where adversaries may benefit from colluding in their actions against the defender, e.g., wildlife poaching, urban crime and drug trafficking. Given such adversary collusion may be more detrimental for the defender, she has an incentive to break up collusion by playing off the self-interest of individual adversaries. As we show in this paper, breaking up such collusion is difficult given bounded rationality of human adversaries; we therefore investigate algorithms for the defender assuming both rational and boundedly rational adversaries. The contributions of this paper include (i) collusive security games (COSGs), a model for security games involving potential collusion among adversaries, (ii) SPECTRE-R, an algorithm to solve COSGs and break collusion assuming rational adversaries, (iii) observations and analyses of adversary behavior and the underlying factors including bounded rationality, imbalanced-resource-allocation effect, coverage perception, and individualism / collectivism attitudes within COSGs with data from 700 human subjects, (iv) a learned human behavioral model that incorporates these factors to predict when collusion will occur, (v) SPECTRE-BR, an enhanced algorithm which optimizes against the learned behavior model to provide demonstrably better performing defender strategies against human subjects compared to SPECTRE-R.
2016_38_teamcore_dividetodefend.pdf
Eric Shieh, Albert Xin Jiang, Amulya Yadav, Pradeep Varakantham, and Milind Tambe. 2016. “An Extended Study on Addressing Defender Teamwork while Accounting for Uncertainty in Attacker Defender Games using Iterative Dec-MDPs.” Multiagent and Grid Systems (MAGS) Journal.
Multi-agent teamwork and defender-attacker security games are two areas that are currently receiving significant attention within multi-agent systems research. Unfortunately, despite the need for effective teamwork among multiple defenders, little has been done to harness the teamwork research in security games. The problem that this paper seeks to solve is the coordination of decentralized defender agents in the presence of uncertainty while securing targets against an observing adversary. To address this problem, we offer the following novel contributions in this paper: (i) New model of security games with defender teams that coordinate under uncertainty; (ii) New algorithm based on column generation that utilizes Decentralized Markov Decision Processes (Dec-MDPs) to generate defender strategies that incorporate uncertainty; (iii) New techniques to handle global events (when one or more agents may leave the system) during defender execution; (iv) Heuristics that help scale up in the number of targets and agents to handle real-world scenarios; (v) Exploration of the robustness of randomized pure strategies. The paper opens the door to a potentially new area combining computational game theory and multi-agent teamwork.
Nicole Sintov, Debarun Kar, Thanh Nguyen, Fei Fang, Kevin Hoffman, Arnaud Lyet, and Milind Tambe. 2016. “From the Lab to the Classroom and Beyond: Extending a Game-Based Research Platform for Teaching AI to Diverse Audiences.” In Symposium on Educational Advances in Artificial Intelligence (EAAI) 2016.
Recent years have seen increasing interest in AI from outside the AI community. This is partly due to applications based on AI that have been used in real-world domains, for example, the successful deployment of game theory-based decision aids in security domains. This paper describes our teaching approach for introducing the AI concepts underlying security games to diverse audiences. We adapted a game-based research platform that served as a testbed for recent research advances in computational game theory into a set of interactive role-playing games. We guided learners in playing these games as part of our teaching strategy, which also included didactic instruction and interactive exercises on broader AI topics. We describe our experience in applying this teaching approach to diverse audiences, including students of an urban public high school, university undergraduates, and security domain experts who protect wildlife. We evaluate our approach based on results from the games and participant surveys.
2016_7_teamcore_eaai16_teamcore.pdf
Andrew Plucker. 2016. “The Future of Counterinsurgency Modeling: Decision Aids for United States Army Commanders.”
2016_2_teamcore_thefuture_of_counterinsurgency_modeling_final.pdf
Shahrzad Gholami, Bryan Wilder, Matthew Brown, Arunesh Sinha, Nicole Sintov, and Milind Tambe. 2016. “A Game Theoretic Approach on Addressing Collusion among Human Adversaries.” In Workshop on Security and Multiagent Systems, International Conference on Autonomous Agents and Multiagent Systems (AAMAS).
Several models have been proposed for Stackelberg security games (SSGs) and protection against perfectly rational and boundedly rational adversaries; however, none of these existing models addresses the collusion mechanism between adversaries. In a large number of studies related to SSGs, there is one leader and one follower in the game such that the leader takes action and the follower responds accordingly. These studies fail to take into account the possibility that a group of adversaries can collude and cause synergistic loss to the security agents (defenders). The first contribution of this paper is formulating a new type of Stackelberg security game involving a beneficial collusion mechanism among adversaries. The second contribution of this paper is to develop a parametric human behavior model which is able to capture the bounded rationality of adversaries in this type of collusive game. This model is proposed based on human subject experiments with participants on Amazon Mechanical Turk (AMT).
2016_20_teamcore_sgholami_secmas_aamas2016.pdf
Aaron Schlenker, Matthew Brown, Arunesh Sinha, Milind Tambe, and Ruta Mehta. 2016. “Get Me to My GATE On Time: Efficiently Solving General-Sum Bayesian Threat Screening Games.” In 22nd European Conference on Artificial Intelligence (ECAI).
Threat Screening Games (TSGs) are used in domains where there is a set of individuals or objects to screen with a limited amount of screening resources available to screen them. TSGs are broadly applicable to domains like airport passenger screening, stadium screening, cargo container screening, etc. Previous work on TSGs focused only on the Bayesian zero-sum case and provided the MGA algorithm to solve these games. In this paper, we solve Bayesian general-sum TSGs which we prove are NP-hard even when exploiting a compact marginal representation. We also present an algorithm based upon an adversary type hierarchical tree decomposition and an efficient branch-and-bound search to solve Bayesian general-sum TSGs. With this we provide four contributions: (1) GATE, the first algorithm for solving Bayesian general-sum TSGs, which uses hierarchical type trees and a novel branch-and-bound search, (2) the Branch-and-Guide approach which combines branch-and-bound search with the MGA algorithm for the first time, (3) heuristics based on properties of TSGs for accelerated computation of GATE, and (4) experimental results showing the scalability of GATE needed for real-world domains.
2016_35_teamcore_gate_generalsumtsgs.pdf
Yundi Qian. 2016. “Handling Attacker's Preference in Security Domains: Robust Optimization and Learning Approaches.”
Stackelberg security games (SSGs) are now established as a powerful tool in security domains. In order to compute the optimal strategy for the defender in an SSG model, the defender needs to know the attacker’s preferences over targets so that she can predict how the attacker would react under a certain defender strategy. Uncertainty over attacker preferences may cause the defender to suffer significant losses. Motivated by that, my thesis focuses on addressing uncertainty in attacker preferences using robust and learning approaches. In security domains with one-shot attacks, e.g., counter-terrorism domains, the defender is interested in robust approaches that can provide performance guarantees in the worst case. The first part of my thesis focuses on handling attacker’s preference uncertainty with robust approaches in these domains. My work considers a new dimension of preference uncertainty that has not been taken into account in previous literature: the risk preference uncertainty of the attacker, and proposes an algorithm to efficiently compute the defender’s robust strategy against uncertain risk-aware attackers. In security domains with repeated attacks, e.g., the green security domain of protecting natural resources, the attacker “attacks” (illegally extracts natural resources) frequently, so it is possible for the defender to learn the attacker’s preferences from their previous actions.
2016_33_teamcore_yundi_thesis.pdf
G. R. Martins, M. Escarce Junior, and L. S. Marcolino. 2016. “A Hands-on Musical Experience in AI, Games and Art (Demonstration).” In 30th AAAI Conference on Artificial Intelligence (AAAI 2016).
AI is typically applied in video games in the creation of artificial opponents, in order to make them strong, realistic or even fallible (for the game to be “enjoyable” by human players). We offer a different perspective: we present the concept of “Art Games”, a view that opens up many possibilities for AI research and applications. Conference participants will play Jikan to Kukan, an art game where the player dynamically creates the soundtrack with the AI system, while developing her experience in the unconscious world of a character.
2016_22_teamcore_aaai16_demo.pdf
Chao Zhang, Shahrzad Gholami, Debarun Kar, Arunesh Sinha, Manish Jain, Ripple Goyal, and Milind Tambe. 2016. “Keeping Pace with Criminals: An Extended Study of Designing Patrol Allocation against Adaptive Opportunistic Criminals.” Games Journal.
Game theoretic approaches have recently been used to model the deterrence effect of patrol officers’ assignments on opportunistic crimes in urban areas. One major challenge in this domain is modeling the behavior of opportunistic criminals. Compared to strategic attackers (such as terrorists) who execute a well-laid-out plan, opportunistic criminals are less strategic in planning attacks and more flexible in executing well-laid plans based on their knowledge of patrol officers’ assignments. In this paper, we aim to design an optimal police patrolling strategy against opportunistic criminals in urban areas. Our approach comprises two major parts: learning a model of the opportunistic criminal (and how he or she responds to patrols) and then planning optimal patrols against this learned model. The planning part, by using information about how criminals respond to patrols, takes into account the strategic game interaction between the police and criminals. In more detail, first, we propose two categories of models for modeling opportunistic crimes. The first category of models learns the relationship between defender strategy and crime distribution as a Markov chain. The second category of models represents the interaction of criminals and patrol officers as a Dynamic Bayesian Network (DBN) with the number of criminals as the unobserved hidden states. To this end, we: (i) apply standard algorithms, such as Expectation Maximization (EM), to learn the parameters of the DBN; (ii) modify the DBN representation to allow for a compact representation of the model, resulting in better learning accuracy and increased speed of learning of the EM algorithm when used for the modified DBN. These modifications exploit the structure of the problem and use independence assumptions to factorize the large joint probability distributions. Next, we propose an iterative learning and planning mechanism that periodically updates the adversary model.
We demonstrate the efficiency of our learning algorithms by applying them to a real dataset of criminal activity obtained from the police department of the University of Southern California (USC) situated in Los Angeles, CA, USA. We project a significant reduction in crime rate using our planning strategy as compared to the actual strategy deployed by the police department. We also demonstrate the improvement in crime prevention in simulation when we use our iterative planning and learning mechanism when compared to just learning once and planning. Finally, we introduce a web-based software for recommending patrol strategies, which is currently deployed at USC. In the near future, our learning and planning algorithm is planned to be integrated with this software. This work was done in collaboration with the police department of USC.
Arunesh Sinha, Debarun Kar, and Milind Tambe. 2016. “Learning Adversary Behavior in Security Games: A PAC Model Perspective.” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS).
Recent applications of Stackelberg Security Games (SSG), from wildlife crime to urban crime, have employed machine learning tools to learn and predict adversary behavior using available data about defender-adversary interactions. Given these recent developments, this paper commits to an approach of directly learning the response function of the adversary. Using the PAC model, this paper lays a firm theoretical foundation for learning in SSGs and provides utility guarantees when the learned adversary model is used to plan the defender’s strategy. The paper also aims to answer practical questions such as how much more data is needed to improve an adversary model’s accuracy. Additionally, we explain a recently observed phenomenon that prediction accuracy of learned adversary behavior is not enough to discover the utility maximizing defender strategy. We provide four main contributions: (1) a PAC model of learning adversary response functions in SSGs; (2) PAC-model analysis of the learning of key, existing bounded rationality models in SSGs; (3) an entirely new approach to adversary modeling based on a non-parametric class of response functions with PAC-model analysis and (4) identification of conditions under which computing the best defender strategy against the learned adversary behavior is indeed the optimal strategy. Finally, we conduct experiments with real-world data from a national park in Uganda, showing the benefit of our new adversary modeling approach and verification of our PAC model predictions.
2016_13_teamcore_aamas2016pac.pdf
Yasaman Dehghani Abbasi. 2016. “Modeling Human Bounded Rationality in Opportunistic Security Games.”
Security has been an important, worldwide concern over the past decades. Security agencies have been established to prevent different types of crimes in various domains, such as illegal poaching, human trafficking, terrorist attacks on ports and airports, and urban crimes. Unfortunately, in all these domains, security agencies have limited resources and cannot protect all potential targets at all times. Therefore, it is critical for the security agencies to allocate their limited resources optimally to protect potential targets from the adversary. Recently, game-theoretic decision support systems have been applied to assist defenders (e.g. security agencies) in allocating and scheduling their limited resources. Stackelberg Security Game (denoted as SSG) is an example of a game-theoretic model that has been deployed to assign the security resources to the potential targets. Indeed, decision-support systems based on SSG models have been successfully implemented to assist real-world security agencies in protecting critical infrastructure such as airports, ports, or suppressing crime in urban areas. SSG provides an approach for generating randomized protection strategies for the defender using a mathematical representation of the interaction between the defender and the attacker. Therefore, one of the key steps in applying the SSG algorithm to real-world security problems is to model the adversary’s decision-making process. Building upon the success of SSG applications, game theory is now being applied to adjacent domains such as Opportunistic Security. In this domain, the defender is faced with adversaries with special characteristics. Opportunistic criminals carry out repeated and frequent illegal activities (attacks), and they generally do not conduct extensive surveillance before performing an attack and spend less time and effort in planning each attack.
To that end, in my thesis, I focus on modeling the opportunistic criminals’ behavior, in which modeling the adversary’s decision-making process is particularly crucial to develop efficient patrolling strategies for the defenders. I provide an empirical investigation of adversary behavior in opportunistic crime settings by conducting extensive human subject experiments and analyzing how participants make their decisions, in order to create adversary behavior prediction models to be deployed in many opportunistic crime domains. More specifically, this thesis provides (i) a comprehensive answer to the question “which of the proposed human bounded rationality models best predicts adversaries’ behavior in the Opportunistic Crime domain?”, (ii) enhanced human behavior models which outperform existing state-of-the-art models, (iii) a detailed comparison between human behavior models and a well-known Cognitive Science model, the Instance-Based Learning model, (iv) an extensive study on the heterogeneity of adversarial behavior, (v) a thorough study of human behavior changing over time, and (vi) how to improve human behavior models to account for adversaries’ behavior evolving over time.
2016_31_teamcore_yasi_abbasis_phd_thesis.pdf
L. S. Marcolino, H. Xu, D. Gerber, B. Kolev, S. Price, E. Pantazis, and M. Tambe. 2016. “Multi-agent Team Formation for Design Problems.” In Coordination, Organizations, Institutions and Norms in Agent Systems XI. Springer-Verlag Lecture Notes in AI.
Design imposes a novel social choice problem: using a team of voting agents, maximize the number of optimal solutions; allowing a user to then take an aesthetical choice. In an open system of design agents, team formation is fundamental. We present the first model of agent teams for design. For maximum applicability, we envision agents that are queried for a single opinion, and multiple solutions are obtained by multiple iterations. We show that diverse teams composed of agents with different preferences maximize the number of optimal solutions, while uniform teams composed of multiple copies of the best agent are in general suboptimal. Our experiments study the model in bounded time; and we also study a real system, where agents vote to design buildings.
2016_25_teamcore_coin2015book.pdf
Haifeng Xu. 2016. “The Mysteries of Security Games: Equilibrium Computation Becomes Combinatorial Algorithm Design.” In The 17th ACM Conference on Economics and Computation (ACM-EC).
The security game is a basic model for resource allocation in adversarial environments. Here there are two players, a defender and an attacker. The defender wants to allocate her limited resources to defend critical targets and the attacker seeks his most favorable target to attack. In the past decade, there has been a surge of research interest in analyzing and solving security games that are motivated by applications from various domains. Remarkably, these models and their game-theoretic solutions have led to real-world deployments in use by major security agencies like the LAX airport, the US Coast Guard and Federal Air Marshal Service, as well as non-governmental organizations. Across all this research and these applications, equilibrium computation serves as a foundation. This paper examines security games from a theoretical perspective and provides a unified view of various security game models. In particular, each security game can be characterized by a set system E which consists of the defender’s pure strategies; the defender’s best response problem can be viewed as a combinatorial optimization problem over E. Our framework captures most of the basic security game models in the literature, including all the deployed systems; the set system E arising from various domains encodes standard combinatorial problems like bipartite matching, maximum coverage, min-cost flow, packing problems, etc. Our main result shows that equilibrium computation in security games is essentially a combinatorial problem. In particular, we prove that, for any set system E, the following problems can be reduced to each other in polynomial time: (0) combinatorial optimization over E; (1) computing the minimax equilibrium for zero-sum security games over E; (2) computing the strong Stackelberg equilibrium for security games over E; (3) computing the best or worst (for the defender) Nash equilibrium for security games over E.
Therefore, the hardness [polynomial solvability] of any of these problems implies the hardness [polynomial solvability] of all the others. Here, by “games over E” we mean the class of security games with arbitrary payoff structures, but a fixed set E of defender pure strategies. This shows that the complexity of a security game is essentially determined by the set system E. We view drawing these connections as an important conceptual contribution of this paper.
2016_37_teamcore_mysteries.pdf
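The abstract above treats the defender's best response as combinatorial optimization over the set system E. A minimal Python sketch of such an oracle follows, assuming additive target weights and a hypothetical E made of all k-subsets of targets. The function names and the sample weights are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations


def best_response_bruteforce(weights, pure_strategies):
    """Return the pure strategy (set of covered targets) with the largest total weight."""
    return max(pure_strategies, key=lambda e: sum(weights[t] for t in e))


def best_response_top_k(weights, k):
    """Same oracle when E is every k-subset of targets: keep the k heaviest targets."""
    ranked = sorted(range(len(weights)), key=lambda t: weights[t], reverse=True)
    return frozenset(ranked[:k])


def coverage_value(weights, strategy):
    """Total weight of the targets covered by a pure strategy."""
    return sum(weights[t] for t in strategy)


# Hypothetical instance (assumption): 4 targets, 2 defender resources.
weights = [3.0, 1.0, 4.0, 1.5]
E = [frozenset(c) for c in combinations(range(len(weights)), 2)]
exhaustive = best_response_bruteforce(weights, E)
greedy = best_response_top_k(weights, 2)
```

For this simple E the exhaustive search and the top-k shortcut agree. Richer set systems (matchings, flows, packings) would need their own optimization routine in place of `best_response_top_k`.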
Matthew Brown, Arunesh Sinha, Aaron Schlenker, and Milind Tambe. 2016. “One Size Does Not Fit All: A Game-Theoretic Approach for Dynamically and Effectively Screening for Threats.” In AAAI Conference on Artificial Intelligence (AAAI).
An effective way of preventing attacks in secure areas is to screen for threats (people, objects) before entry, e.g., screening of airport passengers. However, screening every entity at the same level may be both ineffective and undesirable. The challenge then is to find a dynamic approach for randomized screening, allowing for more effective use of limited screening resources, leading to improved security. We address this challenge with the following contributions: (1) a threat screening game (TSG) model for general screening domains; (2) an NP-hardness proof for computing the optimal strategy of TSGs; (3) a scheme for decomposing TSGs into subgames to improve scalability; (4) a novel algorithm that exploits a compact game representation to efficiently solve TSGs, providing the optimal solution under certain conditions; and (5) an empirical comparison of our proposed algorithm against the current state-of-the-art optimal approach for large-scale game-theoretic resource allocation problems.
2016_5_teamcore_aaai_darms_camera.pdf
Chao Zhang. 2016. “Opportunistic Crime Security Games: Assisting Police to Control Urban Crime Using Real World Data.”
Crime in urban areas plagues every city in all countries. A notable characteristic of urban crime, distinct from organized terrorist attacks, is that most urban crimes are opportunistic in nature, i.e., criminals do not plan their attacks in detail, rather they seek opportunities for committing crime and are agile in their execution of the crime. In order to deter such crimes, police officers conduct patrols with the aim of preventing crime. However, by observing on the spot the actual presence of patrol units, the criminals can adapt their strategy by seeking crime opportunities in less effectively patrolled locations. The problem of where and how much to patrol is therefore important. My thesis focuses on addressing such opportunistic crime by introducing a new game-theoretic framework and algorithms. I first introduce the Opportunistic Security Game (OSG), a computational framework to recommend deployment strategies for defenders to control opportunistic crimes. I propose a new exact algorithm EOSG to optimize defender strategies given our opportunistic adversaries. Then I develop a fast heuristic algorithm to solve large-scale OSG problems, exploiting a compact representation. The next contribution in my thesis is a Dynamic Bayesian Network (DBN) to learn the OSG model from real-world criminal activity. Standard algorithms such as EM can be applied to learn the parameters. Also, I propose a sequence of modifications that allows for a compact representation of the model resulting in better learning accuracy and increased speed of learning of the EM algorithm. Finally, I propose a game abstraction framework that can handle opportunistic crimes in large-scale urban areas. I propose a planning algorithm that recommends a mixed strategy against opportunistic criminals in this abstraction framework. As part of our collaboration with local police departments, we apply our model in two large-scale urban problems: USC campus and the city of Nashville.
Our approach provides high prediction accuracy on the real datasets; furthermore, we project significant crime rate reduction using our planning strategy compared to the current police strategy.
2016_28_teamcore_chao_defense.pdf
Ayan Mukhopadhyay, Chao Zhang, Yevgeniy Vorobeychik, Milind Tambe, Kenneth Pence, and Paul Speer. 2016. “Optimal Allocation of Police Patrol Resources Using a Continuous-Time Crime Model.” In Decision and Game Theory for Security (GameSec 2016).
Police departments worldwide are eager to develop better patrolling methods to manage the complex and evolving crime landscape. Surprisingly, the problem of spatial police patrol allocation to optimize expected crime response time has not been systematically addressed in prior research. We develop a bi-level optimization framework to address this problem. Our framework includes novel linear programming patrol response formulations. Benders decomposition is then utilized to solve the underlying optimization problem. A key challenge we encounter is that criminals may respond to police patrols, thereby shifting the distribution of crime in space and time. To address this, we develop a novel iterative Benders decomposition approach. Our validation involves a novel spatio-temporal continuous-time model of crime based on survival analysis, which we learn using real crime and police patrol data for Nashville, TN. We demonstrate that our model is more accurate, and much faster, than state-of-the-art alternatives. Using this model in the bi-level optimization framework, we demonstrate that our decision theoretic approach outperforms alternatives, including actual police patrol policies.
2016_39_teamcore_dividetodefend.pdf
Haifeng Xu, Long Tran Thanh, and Nick Jennings. 2016. “Playing Security Games with No Prior Knowledge.” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS).
This paper investigates repeated security games with unknown (to the defender) game payoffs and attacker behaviors. As existing work assumes prior knowledge about either the game payoffs or the attacker’s behaviors, it is not suitable for tackling our problem. Given this, we propose the first efficient defender strategy, based on an adversarial online learning framework, that can provably achieve good performance guarantees without any prior knowledge. In particular, we prove that our algorithm can achieve low performance loss against the best fixed strategy in hindsight (i.e., having full knowledge of the attacker’s moves). In addition, we prove that our algorithm can achieve an efficient competitive ratio against the optimal adaptive defender strategy. We also show that for zero-sum security games, our algorithm achieves efficient results in approximating a number of solution concepts, such as algorithmic equilibria and the minimax value. Finally, our extensive numerical results demonstrate that, without having any prior information, our algorithm still achieves good performance, compared to state-of-the-art algorithms from the literature on security games, such as SUQR [19], which require a significant amount of prior knowledge.
2016_26_teamcore_mab.pdf
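To make the adversarial online learning setting in the abstract above concrete, here is a minimal Python sketch of a generic Exp3-style bandit update. It is a stand-in technique, not the algorithm from the paper. It assumes a single defender resource, rewards in [0, 1], and feedback only at the protected target. The class name, the `simulate` helper, and the fixed attacker are hypothetical.

```python
import math
import random


class Exp3Defender:
    """Generic Exp3 learner: one resource, bandit feedback (illustrative assumption)."""

    def __init__(self, n_targets, gamma=0.1, seed=0):
        self.n = n_targets
        self.gamma = gamma                      # exploration rate
        self.weights = [1.0] * n_targets
        self.rng = random.Random(seed)

    def probabilities(self):
        """Mix the weight-proportional distribution with uniform exploration."""
        total = sum(self.weights)
        return [(1 - self.gamma) * w / total + self.gamma / self.n
                for w in self.weights]

    def choose(self):
        """Sample the target to protect this round."""
        return self.rng.choices(range(self.n), weights=self.probabilities())[0]

    def update(self, target, reward):
        """Importance-weighted update for the protected target only."""
        p = self.probabilities()[target]
        self.weights[target] *= math.exp(self.gamma * (reward / p) / self.n)
        top = max(self.weights)                 # rescale to avoid overflow
        self.weights = [w / top for w in self.weights]


def simulate(rounds=2000, attacked=2, n_targets=4):
    """Hypothetical attacker that always hits the same target."""
    defender = Exp3Defender(n_targets)
    for _ in range(rounds):
        protected = defender.choose()
        defender.update(protected, 1.0 if protected == attacked else 0.0)
    return defender.probabilities()
```

Against this fixed attacker the learner concentrates its protection probability on the attacked target while keeping a small exploration floor on the others.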
Benjamin Ford, Matthew Brown, Amulya Yadav, Amandeep Singh, Arunesh Sinha, Biplav Srivastava, Christopher Kiekintveld, and Milind Tambe. 2016. “Protecting the NECTAR of the Ganga River through Game-Theoretic Factory Inspections.” In International Conference on Practical Applications of Agents and Multi-Agent Systems (PAAMS).
Leather is an integral part of the world economy and a substantial income source for developing countries. Despite government regulations on leather tannery waste emissions, inspection agencies lack adequate enforcement resources, and tanneries’ toxic wastewaters wreak havoc on surrounding ecosystems and communities. Previous works in this domain stop short of generating executable solutions for inspection agencies. We introduce NECTAR - the first security game application to generate environmental compliance inspection schedules. NECTAR’s game model addresses many important real-world constraints: a lack of defender resources is alleviated via a secondary inspection type; imperfect inspections are modeled via a heterogeneous failure rate; and uncertainty, in traveling through a road network and in conducting inspections, is addressed via a Markov Decision Process. To evaluate our model, we conduct a series of simulations and analyze their policy implications.
2016_17_teamcore_ford16_paams_cameraready_ben.pdf
Yundi Qian, Chao Zhang, Bhaskar Krishnamachari, and Milind Tambe. 2016. “Restless Poachers: Handling Exploration-Exploitation Tradeoffs in Security Domains.” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS).
The success of Stackelberg Security Games (SSGs) in counterterrorism domains has inspired researchers’ interest in applying game-theoretic models to other security domains with frequent interactions between defenders and attackers, e.g., wildlife protection. Previous research optimizes defenders’ strategies by modeling this problem as a repeated Stackelberg game, capturing the special property in this domain — frequent interactions between defenders and attackers. However, this research fails to handle exploration-exploitation tradeoff in this domain caused by the fact that defenders only have knowledge of attack activities at targets they protect. This paper addresses this shortcoming and provides the following contributions: (i) We formulate the problem as a restless multi-armed bandit (RMAB) model to address this challenge. (ii) To use Whittle index policy to plan for patrol strategies in the RMAB, we provide two sufficient conditions for indexability and an algorithm to numerically evaluate indexability. (iii) Given indexability, we propose a binary search based algorithm to find Whittle index policy efficiently.
2016_15_teamcore_aamas2016_eve_yundi.pdf
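The binary search for a Whittle index mentioned in contribution (iii) above can be sketched in a generic form. The Python below is a textbook-style illustration under stated assumptions, not the paper's algorithm. It assumes a fully observed, discounted single arm, a subsidy `m` paid for the passive action, and indexability. All names and the demo arm are hypothetical.

```python
def q_values(P, r, m, beta, iters=500):
    """Value iteration for one arm; action 0 (passive) earns an extra subsidy m.

    P[a][s][t] is the transition probability, r[s][a] the reward.
    """
    n = len(r)
    V = [0.0] * n
    Q = [[0.0, 0.0] for _ in range(n)]
    for _ in range(iters):
        Q = [[r[s][a] + (m if a == 0 else 0.0)
              + beta * sum(P[a][s][t] * V[t] for t in range(n))
              for a in (0, 1)] for s in range(n)]
        V = [max(q) for q in Q]
    return Q


def whittle_index(P, r, s, beta=0.9, lo=-10.0, hi=10.0, tol=1e-6):
    """Binary search for the subsidy that makes both actions equally good in state s."""
    while hi - lo > tol:
        m = (lo + hi) / 2
        Q = q_values(P, r, m, beta)
        if Q[s][0] >= Q[s][1]:
            hi = m      # passive already optimal: the index is at most m
        else:
            lo = m      # active still preferred: the index exceeds m
    return (lo + hi) / 2


# Demo arm (assumption): both actions share the same transitions, so the
# index of state s reduces to the reward gap r[s][1] - r[s][0].
shared = [[0.7, 0.3], [0.4, 0.6]]
P = [shared, shared]
r = [[0.0, 0.2], [0.0, 0.9]]
indices = [whittle_index(P, r, s) for s in range(2)]
```

On the demo arm the search returns values close to 0.2 and 0.9. A Whittle index policy would then activate the arms whose current states carry the largest indices.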
Haifeng Xu, Rupert Freeman, Vincent Conitzer, Shaddin Dughmi, and Milind Tambe. 2016. “Signaling in Bayesian Stackelberg Games.” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS).
Algorithms for solving Stackelberg games are used in an ever-growing variety of real-world domains. Previous work has extended this framework to allow the leader to commit not only to a distribution over actions, but also to a scheme for stochastically signaling information about these actions to the follower. This can result in higher utility for the leader. In this paper, we extend this methodology to Bayesian games, in which the leader, the follower, or both have payoff-relevant private information. This leads to novel variants of the model, for example by imposing an incentive compatibility constraint for each type to listen to the signal intended for it. We show that, in contrast to previous hardness results for the case without signaling [5, 16], we can solve unrestricted games in time polynomial in their natural representation. For security games, we obtain hardness results as well as efficient algorithms, depending on the settings. We show the benefits of our approach in experimental evaluations of our algorithms.
2016_12_teamcore_draft.pdf
