Zhengyu Yin, Albert Jiang, Matthew Johnson, Milind Tambe, Christopher Kiekintveld, Kevin Leyton-Brown, Tuomas Sandholm, and John Sullivan. 2012. “TRUSTS: Scheduling Randomized Patrols for Fare Inspection in Transit Systems .” In Conference on Innovative Applications of Artificial Intelligence (IAAI) .Abstract
In proof-of-payment transit systems, passengers are legally required to purchase tickets before entering but are not physically forced to do so. Instead, patrol units move about the transit system, inspecting the tickets of passengers, who face fines if caught fare evading. The deterrence of such fines depends on the unpredictability and effectiveness of the patrols. In this paper, we present TRUSTS, an application for scheduling randomized patrols for fare inspection in transit systems. TRUSTS models the problem of computing patrol strategies as a leader-follower Stackelberg game where the objective is to deter fare evasion and hence maximize revenue. This problem differs from previously studied Stackelberg settings in that the leader strategies must satisfy massive temporal and spatial constraints; moreover, unlike in these counterterrorism-motivated Stackelberg applications, a large fraction of the ridership might realistically consider fare evasion, and so the number of followers is potentially huge. A third key novelty in our work is deliberate simplification of leader strategies to make patrols easier to be executed. We present an efficient algorithm for computing such patrol strategies and present experimental results using realworld ridership data from the Los Angeles Metro Rail system. The Los Angeles Sheriff’s department has begun trials of TRUSTS.
Zhengyu Yin and Milind Tambe. 2012. “A Unified Method for Handling Discrete and Continuous Uncertainty in Bayesian Stackelberg Games .” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS).Abstract
Given their existing and potential real-world security applications, Bayesian Stackelberg games have received significant research interest [3, 12, 8]. In these games, the defender acts as a leader, and the many different follower types model the uncertainty over discrete attacker types. Unfortunately since solving such games is an NP-hard problem, scale-up has remained a difficult challenge. This paper scales up Bayesian Stackelberg games, providing a novel unified approach to handling uncertainty not only over discrete follower types but also other key continuously distributed real world uncertainty, due to the leader’s execution error, the follower’s observation error, and continuous payoff uncertainty. To that end, this paper provides contributions in two parts. First, we present a new algorithm for Bayesian Stackelberg games, called HUNTER, to scale up the number of types. HUNTER combines the following five key features: i) efficient pruning via a best-first search of the leader’s strategy space; ii) a novel linear program for computing tight upper bounds for this search; iii) using Bender’s decomposition for solving the upper bound linear program efficiently; iv) efficient inheritance of Bender’s cuts from parent to child; v) an efficient heuristic branching rule. Our experiments show that HUNTER provides orders of magnitude speedups over the best existing methods to handle discrete follower types. In the second part, we show HUNTER’s efficiency for Bayesian Stackelberg games can be exploited to also handle the continuous uncertainty using sample average approximation. We experimentally show that our HUNTER-based approach also outperforms latest robust solution methods under continuously distributed uncertainty.
Manish Jain, Kevin Leyton-Brown, and Milind Tambe. 2012. “Which Security Games are Hard to Solve? .” In AAAI Spring Symposium on Game Theory for Security, Sustainability and Health .Abstract
Stackelberg security games form the backbone of systems like ARMOR, IRIS and PROTECT, which are in regular use by the Los Angeles International Police, US Federal Air Marshal Service and the US Coast Guard respectively. An understanding of the runtime required by algorithms that power such systems is critical to furthering the application of game theory to other real-world domains. This paper identifies the concept of the deployment-to-saturation ratio in random Stackelberg security games, and shows that in a decision problem related to these games, the probability that a solution exists exhibits a phase transition as the ratio crosses 0.5. We demonstrate that this phase transition is invariant to changes both in the domain and the domain representation. Moreover, problem instances at this phase transition point are computationally harder than instances with other deployment-tosaturation ratios for a wide range of different equilibrium computation methods, including (i) previously published different MIP algorithms, and (ii) different underlying solvers and solution mechanisms. Our findings have at least two important implications. First, it is important for new algorithms to be evaluated on the hardest problem instances. We show that this has often not been done in the past, and introduce a publicly available benchmark suite to facilitate such comparisons. Second, we provide evidence that this phase transition region is also one where optimization would be of most benefit to security agencies, and thus requires significant attention from researchers in this area.
Manish Jain, Bo An, and Milind Tambe. 2012. “An Overview of Recent Application Trends at the AAMAS conference: Security, Sustainability and Safety.” AI Magazine, 33, 3, Pp. 14-28.Abstract
A key feature of the AAMAS conference is its emphasis on ties to real-world applications. The focus of this article is to provide a broad overview of application-focused papers published at the AAMAS 2010 and 2011 conferences. More specifically, recent applications at AAMAS could be broadly categorized as belonging to research areas of security, sustainability and safety. We outline the domains of applications, key research thrusts underlying each such application area, and emerging trends.
Jason Tsai, Thanh H. Nguyen, and Milind Tambe. 2012. “Security Games for Controlling Contagion.” In Conference on Artificial Intelligence (AAAI).Abstract
Many strategic actions carry a ‘contagious’ component beyond the immediate locale of the effort itself. Viral marketing and peacekeeping operations have both been observed to have a spreading effect. In this work, we use counterinsurgency as our illustrative domain. Defined as the effort to block the spread of support for an insurgency, such operations lack the manpower to defend the entire population and must focus on the opinions of a subset of local leaders. As past researchers of security resource allocation have done, we propose using game theory to develop such policies and model the interconnected network of leaders as a graph. Unlike this past work in security games, actions in these domains possess a probabilistic, non-local impact. To address this new class of security games, we combine recent research in influence blocking maximization with a double oracle approach and create novel heuristic oracles to generate mixed strategies for a real-world leadership network from Afghanistan, synthetic leadership networks, and a real social network. We find that leadership networks that exhibit highly interconnected clusters can be solved equally well by our heuristic methods, but our more sophisticated heuristics outperform simpler ones in less interconnected social networks.
Manish Jain, Zhengyu Yin, Milind Tambe, and Fernando Ordonez. 2011. “Addressing Execution and Observation Error in Security Games .” In AAAI'11 Workshop on Applied Adversarial Reasoning and Risk Modeling (AARM).Abstract
Attacker-defender Stackelberg games have become a popular game-theoretic approach for security with deployments for LAX Police, the FAMS and the TSA. Unfortunately, most of the existing solution approaches do not model two key uncertainties of the real-world: there may be noise in the defender’s execution of the suggested mixed strategy and/or the observations made by an attacker can be noisy. In this paper, we analyze a framework to model these uncertainties, and demonstrate that previous strategies perform poorly in such uncertain settings. We also analyze RECON, a novel algorithm that computes strategies for the defender that are robust to such uncertainties, and explore heuristics that further improve RECON’s efficiency.
Matthew Brown, Emma Bowring, Shira Epstein, Mufaddal Jhaveri, Rajiv Maheswaran, Parag Mallick, Shannon Mumenthaler, Michelle Povinelli, and Milind Tambe. 2011. “Applying Multi-Agent Techniques to Cancer Modeling .” In Workshop on Multiagent Sequential Decision Making in Uncertain Domains(MSDM) at AAMAS 2011 .Abstract
Each year, cancer is responsible for 13% of all deaths worldwide. In the United States, that percentage increases to 25%, translating to an estimated 569,490 deaths in 2010 [1]. Despite significant advances in the fight against cancer, these statistics make clear the need for additional research into new treatments. As such, there has been growing interest in the use of computer simulations as a tool to aid cancer researchers. We propose an innovative multi-agent approach in which healthy cells and cancerous cells are modeled as opposing teams of agents using a decentralized Markov decision process (DEC-MDP). We then describe changes made to traditional DEC-MDP algorithms in order to better handle the complexity and scale of our domain. We conclude by presenting and analyzing preliminary simulation results. This paper is intended to introduce the cancer modeling domain to the multi-agent community with the hope of fostering a discussion about the opportunities and challenges it presents. Given the complexity of the domain, we do not claim our approach to be a definitive solution but rather a first step toward the larger goal of creating realistic simulations of cancer.
Christopher Kiekintveld, Janusz Marecki, and Milind Tambe. 2011. “Approximation Methods for Infinite Bayesian Stackelberg Games: Modeling Distributional Payoff Uncertainty.” In International Conference on Autonomous Agents and Multiagent Systems.Abstract
Game theory is fast becoming a vital tool for reasoning about complex real-world security problems, including critical infrastructure protection. The game models for these applications are constructed using expert analysis and historical data to estimate the values of key parameters, including the preferences and capabilities of terrorists. In many cases, it would be natural to represent uncertainty over these parameters using continuous distributions (such as uniform intervals or Gaussians). However, existing solution algorithms are limited to considering a small, finite number of possible attacker types with different payoffs. We introduce a general model of infinite Bayesian Stackelberg security games that allows payoffs to be represented using continuous payoff distributions. We then develop several techniques for finding approximate solutions for this class of games, and show empirically that our methods offer dramatic improvements over the current state of the art, providing new ways to improve the robustness of security game models.
Zhengyu Yin and Milind Tambe. 2011. “Continuous Time Planning for Multiagent Teams with Temporal Constraints .” In International Joint Conference on Artificial Intelligence (IJCAI).Abstract
Continuous state DEC-MDPs are critical for agent teams in domains involving resources such as time, but scaling them up is a significant challenge. To meet this challenge, we first introduce a novel continuous-time DEC-MDP model that exploits transition independence in domains with temporal constraints. More importantly, we present a new locally optimal algorithm called SPAC. Compared to the best previous algorithm, SPAC finds solutions of comparable quality substantially faster; SPAC also scales to larger teams of agents.
Matthew E. Taylor, Manish Jain, Prateek Tandon, Milind Tambe, and Makoto Yokoo. 2011. “Distributed On-line Multi-Agent Optimization Under Uncertainty: Balancing Exploration and Exploitation .” In Advances in Complex Systems.Abstract
A significant body of work exists on effectively allowing multiple agents to coordinate to achieve a shared goal. In particular, a growing body of work in the Distributed Constraint Optimization (DCOP) framework enables such coordination with different amounts of teamwork. Such algorithms can implicitly or explicitly trade-off improved solution quality with increased communication and computation requirements. However, the DCOP framework is limited to planning problems; DCOP agents must have complete and accurate knowledge about the reward function at plan time. We extend the DCOP framework, defining the Distributed Coordination of Exploration and Exploitation (DCEE) problem class to address real-world problems, such as ad-hoc wireless network optimization, via multiple novel algorithms. DCEE algorithms differ from DCOP algorithms in that they (1) are limited to a finite number of actions in a single trial, (2) attempt to maximize the on-line, rather than final, reward, (3) are unable to exhaustively explore all possible actions, and (4) may have knowledge about the distribution of rewards in the environment, but not the rewards themselves. Thus, a DCEE problem is not a type of planning problem, as DCEE algorithms must carefully balance and coordinate multiple agents’ exploration and exploitation. Two classes of algorithms are introduced: static estimation algorithms perform simple calculations that allow agents to either stay or explore, and balanced exploration algorithms use knowledge about the distribution of the rewards and the time remaining in an experiment to decide whether to stay, explore, or (in some algorithms) backtrack to a previous location. These two classes of DCEE algorithms are compared in simulation and on physical robots in a complex mobile ad-hoc wireless network setting. Contrary to our expectations, we found that increasing teamwork in DCEE algorithms may lower team performance. In contrast, agents running DCOP algorithms improve their reward as teamwork increases. We term this previously unknown phenomenon the team uncertainty penalty, analyze it in both simulation and on robots, and present techniques to ameliorate the penalty.
Manish Jain, Dmytro Korzhyk, Ondrej Vanek, Vincent Conitzer, Michal Pechoucek, and Milind Tambe. 2011. “A Double Oracle Algorithm for Zero-Sum Security Games on Graphs .” In International Conference on Autonomous Agents and Multiagent Systems.Abstract
In response to the Mumbai attacks of 2008, the Mumbai police have started to schedule a limited number of inspection checkpoints on the road network throughout the city. Algorithms for similar security-related scheduling problems have been proposed in recent literature, but security scheduling in networked domains when targets have varying importance remains an open problem at large. In this paper, we cast the network security problem as an attackerdefender zero-sum game. The strategy spaces for both players are exponentially large, so this requires the development of novel, scalable techniques. We first show that existing algorithms for approximate solutions can be arbitrarily bad in general settings. We present RUGGED (Randomization in Urban Graphs by Generating strategies for Enemy and Defender), the first scalable optimal solution technique for such network security games. Our technique is based on a double oracle approach and thus does not require the enumeration of the entire strategy space for either of the players. It scales up to realistic problem sizes, as is shown by our evaluation of maps of southern Mumbai obtained from GIS data.
Jason Tsai, Emma Bowring, Stacy Marsella, and Milind Tambe. 2011. “Empirical Evaluation of Computational Emotional Contagion Models .” In International Conference on Intelligent Virtual Agents (IVA).Abstract
In social psychology, emotional contagion describes the widely observed phenomenon of one person’s emotions being influenced by surrounding people’s emotions. While the overall effect is agreed upon, the underlying mechanism of the spread of emotions has seen little quantification and application to computational agents despite extensive evidence of its impacts in everyday life. In this paper, we examine computational models of emotional contagion by implementing two models ([2] and [8]) that draw from two separate lines of contagion research: thermodynamics-based and epidemiological-based. We first perform sensitivity tests on each model in an evacuation simulation, ESCAPES, showing both models to be reasonably robust to parameter variations with certain exceptions. We then compare their ability to reproduce a real crowd panic scene in simulation, showing that the thermodynamics-style model ([2]) produces superior results due to the ill-suited contagion mechanism at the core of epidemiological models. We also identify that a graduated effect of fear and proximity-based contagion effects are key to producing the superior results. We then reproduce the methodology on a second video, showing that the same results hold, implying generality of the conclusions reached in the first scene.
Jason Tsai, Natalie Fridman, Emma Bowring, Matthew Brown, Shira Epstein, Gal Kaminka, Stacy Marsella, Andrew Ogden, Inbal Rika, Ankur Sheel, Matthew Taylor, Xuezhi Wang, Avishay Zilka, and Milind Tambe. 2011. “ESCAPES - Evacuation Simulation with Children, Authorities, Parents, Emotions, and Social comparison .” In International Conference on Autonomous Agents and Multiagent Systems.Abstract
In creating an evacuation simulation for training and planning, realistic agents that reproduce known phenomenon are required. Evacuation simulation in the airport domain requires additional features beyond most simulations, including the unique behaviors of firsttime visitors who have incomplete knowledge of the area and families that do not necessarily adhere to often-assumed pedestrian behaviors. Evacuation simulations not customized for the airport domain do not incorporate the factors important to it, leading to inaccuracies when applied to it. In this paper, we describe ESCAPES, a multiagent evacuation simulation tool that incorporates four key features: (i) different agent types; (ii) emotional interactions; (iii) informational interactions; (iv) behavioral interactions. Our simulator reproduces phenomena observed in existing studies on evacuation scenarios and the features we incorporate substantially impact escape time. We use ESCAPES to model the International Terminal at Los Angeles International Airport (LAX) and receive high praise from security officials.
Rong Yang, Milind Tambe, Manish Jain, Jun-young Kwak, James Pita, and Zhengyu Yin. 2011. “Game Theory and Human Behavior: Challenges in Security and Sustainability .” In Algorithmic Decision Theory (ADT) .Abstract
Security and sustainability are two critical global challenges that involve the interaction of many intelligent actors. Game theory provides a sound mathematical framework to model such interactions, and computational game theory in particular has a promising role to play in helping to address key aspects of these challenges. Indeed, in the domain of security, we have already taken some encouraging steps by successfully applying game-theoretic algorithms to real-world security problems: our algorithms are in use by agencies such as the US coast guard, the Federal Air Marshals Service, the LAX police and the Transportation Security Administration. While these applications of game-theoretic algorithms have advanced the state of the art, this paper lays out some key challenges as we continue to expand the use of these algorithms in real-world domains. One such challenge in particular is that classical game theory makes a set of assumptions of the players, which may not be consistent with real-world scenarios, especially when humans are involved. To actually model human behavior within game-theoretic framework, it is important to address the new challenges that arise due to the presence of human players: (i) human bounded rationality; (ii) limited observations and imperfect strategy execution; (iii) large action spaces. We present initial solutions to these challenges in context of security games. For sustainability, we lay out our initial efforts and plans, and key challenges related to human behavior in the loop.
James Pita, Milind Tambe, Chris Kiekintveld, Shane Cullen, and Erin Steigerwald. 2011. “GUARDS - Game Theoretic Security Allocation on a National Scale.” In International Conference on Autonomous Agents and Multiagent Systems .Abstract
Building on research previously reported at AAMAS conferences, this paper describes an innovative application of a novel gametheoretic approach for a national scale security deployment. Working with the United States Transportation Security Administration (TSA), we have developed a new application called GUARDS to assist in resource allocation tasks for airport protection at over 400 United States airports. In contrast with previous efforts such as ARMOR and IRIS, which focused on one-off tailored applications and one security activity (e.g. canine patrol or checkpoints) per application, GUARDS faces three key issues: (i) reasoning about hundreds of heterogeneous security activities; (ii) reasoning over diverse potential threats; (iii) developing a system designed for hundreds of end-users. Since a national deployment precludes tailoring to specific airports, our key ideas are: (i) creating a new game-theoretic framework that allows for heterogeneous defender activities and compact modeling of a large number of threats; (ii) developing an efficient solution technique based on general purpose Stackelberg game solvers; (iii) taking a partially centralized approach for knowledge acquisition and development of the system. In doing so we develop a software scheduling assistant, GUARDS, designed to reason over two agents — the TSA and a potential adversary — and allocate the TSA’s limited resources across hundreds of security activities in order to provide protection within airports. The scheduling assistant has been delivered to the TSA and is currently under evaluation and testing for scheduling practices at an undisclosed airport. If successful, the TSA intends to incorporate the system into their unpredictable scheduling practices nationwide. In this paper we discuss the design choices and challenges encountered during the implementation of GUARDS. GUARDS represents promising potential for transitioning years of academic research into a nationally deployed system.
James Pita, Milind Tambe, Christopher Kiekintveld, Shane Cullen, and Erin Steigerwald. 2011. “GUARDS - Innovative Application of Game Theory for National Airport Security .” In International Joint Conference on Artificial Intelligence (IJCAI).Abstract
We describe an innovative application of a novel game-theoretic approach for a national scale security deployment. Working with the United States Transportation Security Administration (TSA), we have developed a new application called GUARDS to allocate the TSA’s limited resources across hundreds of security activities to provide protection at over 400 United States airports. Similar security applications (e.g., ARMOR and IRIS) have focused on one-off tailored applications and one security activity (e.g. checkpoints) per application, GUARDS on the other hand faces three new key issues: (i) reasoning about hundreds of heterogeneous security activities; (ii) reasoning over diverse potential threats; (iii) developing a system designed for hundreds of end-users. Since a national deployment precludes tailoring to specific airports, our key ideas are: (i) creating a new game-theoretic framework that allows for heterogeneous defender activities and compact modeling of a large number of threats; (ii) developing an efficient solution technique based on general purpose Stackelberg game solvers; (iii) taking a partially centralized approach for knowledge acquisition. The scheduling assistant has been delivered to the TSA and is currently undergoing evaluation for scheduling practices at an undisclosed airport. If successful, the TSA intends to incorporate the system into their unpredictable scheduling practices nationwide.
Rong Yang, Christopher Kiekintveld, Fernando Ordonez, Milind Tambe, and Richard John. 2011. “Improved Computational Models of Human Behavior in Security Games.” In International Conference on Autonomous Agents and Multiagent Systems (Extended Abstract).Abstract
Security games refer to a special class of attacker-defender Stackelberg games. In these non zero-sum games, the attacker’s utility of attacking a target decreases as the defender allocates more resources to protect it (and vice versa for the defender). The defender (leader) first commits to a mixed strategy, assuming the attacker (follower) decides on a pure strategy after observing the defender’s strategy. This models the situation where an attacker conducts surveillance to learn the defender’s mixed strategy and then launches an attack on a single target. Given that the defender has limited resources, she must design her mixed-strategy optimally against the adversaries’ response to maximize effectiveness. One leading family of algorithms to compute such mixed strategies are DOBSS and its successors [3, 5], which are used in the deployed ARMOR [5] and IRIS [8] applications. One key set of assumptions these systems make is about how attackers choose strategies based on their knowledge of the security strategy. Typically, such systems apply the standard game-theoretic assumption that attackers are perfectly rational. This is a reasonable proxy for the worst case of a highly intelligent attacker, but it can lead to a defense strategy that is not robust against attackers using different decision procedures, and it fails to exploit known weaknesses in human decision-making. Indeed, it is widely accepted that standard game-theoretic assumptions of perfect rationality are not ideal for predicting the behavior of humans in multi-agent decision problems [1]. Thus, integrating more realistic models of human decision-making has become necessary in solving real-world security problems. The current leading contender that accounts for human behavior in security games is COBRA [6], which assumes that adversaries can deviate to −optimal strategies and that they have an anchoring bias when interpreting a probability distribution. It remains an open question whether other models yield better solutions than COBRA against human adversaries. The literature has introduced a multitude of candidate models, but there is an important empirical question of which model best represents the salient features of human behavior in applied security contexts. We address these open questions by developing three new algorithms to generate defender strategies in security games, based on using two fundamental theories of human behavior to predict an attacker’s decision: Prospect Theory (PT) [2] and Quantal Response Equilibrium (QRE) [4]. PT is a noble-prize-winning theory, which describes human decision making as a process of maximizing ‘prospect’. ‘Prospect’ is defined as the weighted sum of the benefit of all possible outcomes for each action. QRE suggests that instead of strictly maximizing utility, individuals respond stochastically in games: the chance of selecting a non-optimal strategy increases as the cost of such an error decreases.
Rong Yang, Christopher Kiekintveld, Fernando Ordonez, Milind Tambe, and Richard John. 2011. “Improving Resource Allocation Strategy Against Human Adversaries in Security Games .” In International Joint Conference on Artificial Intelligence (IJCAI) .Abstract
Recent real-world deployments of Stackelberg security games make it critical that we address human adversaries’ bounded rationality in computing optimal strategies. To that end, this paper provides three key contributions: (i) new efficient algorithms for computing optimal strategic solutions using Prospect Theory and Quantal Response Equilibrium; (ii) the most comprehensive experiment to date studying the effectiveness of different models against human subjects for security games; and (iii) new techniques for generating representative payoff structures for behavioral experiments in generic classes of games. Our results with human subjects show that our new techniques outperform the leading contender for modeling human behavior in security games.
Marcos A. M. Vieira, Matthew E. Taylor, Prateek Tandon Manish Jain, Ramesh Govindan, Gaurav S. Sukhatme, and Milind Tambe. 2011. “Mitigating Multi-path Fading in a Mobile Mesh Network.” In Ad-Hoc Networks.Abstract
By using robots as routers, a team of networked robots can provide a communication substrate to establish a wireless mesh network. The mobile mesh network can autonomously optimize its configuration, increasing performance. One of the main sources of radio signal fading in such a network is multi-path propagation, which can be mitigated by moving the senders or the receivers on the distance of the order of a wavelength. In this paper, we measure the performance gain when robots are allowed to make such small movements and find that it may be as much as 270%. Our main contribution is the design of a system that allows robots to cooperate and improve the real-world network throughput via a practical solution. We model the problem of which robots to move as a distributed constraint optimization problem (DCOP). Our study includes four local metrics to estimate global throughput.
Bo An, Manish Jain, Milind Tambe, and Christopher Kiekintveld. 2011. “Mixed-Initiative Optimization in Security Games: A Preliminary Report.” In AAAI Spring Symposium on Help me help you:Bridging the Gaps in Human-Agent Collaboration.Abstract
Stackelberg games have been widely used to model patrolling or monitoring problems in security. In a Stackelberg security game, the defender commits to a strategy and the adversary makes its decision with knowledge of the leader’s commitment. Algorithms for computing the defender’s optimal strategy are used in deployed decision-support tools in use by the Los Angeles International Airport (LAX), the Federal Air Marshals Service, and the Transportation Security Administration (TSA). Those algorithms take into account various resource usage constraints defined by human users. However, those constraints may lead to poor (even infeasible) solutions due to users’ insufficient information and bounded rationality. A mixed-initiative approach, in which human users and software assistants (agents) collaborate to make security decisions, is needed. Efficient human-agent interaction process leads to models with higher overall solution quality. This paper preliminarily analyzes the needs and challenges for such a mixed-initiative approach.