Publications

2010
Christopher Kiekintveld, Janusz Marecki, and Milind Tambe. 2010. “Methods and Algorithms for Infinite Bayesian Stackelberg Security Games (Extended Abstract).” In Conference on Decision and Game Theory for Security.
Recently there has been significant interest in applications of game-theoretic analysis to security resource allocation decisions. Two examples of deployed systems based on this line of research are the ARMOR system in use at the Los Angeles International Airport [20], and the IRIS system used by the Federal Air Marshals Service [25]. Game analysis always begins by developing a model of the domain, often based on inputs from domain experts or historical data. These models inevitably contain significant uncertainty—especially in security domains where intelligence about adversary capabilities and preferences is very difficult to gather. In this work we focus on developing new models and algorithms that capture this uncertainty using continuous payoff distributions. These models are richer and more powerful than previous approaches that are limited to small finite Bayesian game models. We present the first algorithms for approximating equilibrium solutions in these games, and study these algorithms empirically. Our results show dramatic improvements over existing techniques, even in cases where there is very limited uncertainty about an adversary’s payoffs.
2010_17_teamcore_gamesec.pdf
Manish Jain, Erim Kardes, Christopher Kiekintveld, Fernando Ordonez, and Milind Tambe. 2010. “Optimal defender allocation for massive security games: A branch and price approach.” In AAMAS 2010 Workshop on Optimisation in Multi-Agent Systems (OptMas).
Algorithms to solve security games, an important class of Stackelberg games, have seen successful real-world deployment by LAX police and the Federal Air Marshal Service. These algorithms provide randomized schedules to optimally allocate limited security resources for infrastructure protection. Unfortunately, these state-of-the-art algorithms fail to scale up or to provide a correct solution for massive security games with arbitrary scheduling constraints. This paper provides ASPEN, a branch-and-price algorithm to overcome this limitation based on two key contributions: (i) A column-generation approach that exploits an innovative compact network flow representation, avoiding a combinatorial explosion of schedule allocations; (ii) A branch-and-bound approach with novel upper-bound generation via a fast algorithm for solving under-constrained security games. ASPEN is the first known method for efficiently solving real-world-sized security games with arbitrary schedules. This work contributes to a very new area of work that applies techniques used in large-scale optimization to game-theoretic problems—an exciting new avenue with the potential to greatly expand the reach of game theory.
2010_19_teamcore_optmas10jain.pdf
James Pita, Christopher Kiekintveld, Michael Scott, and Milind Tambe. 2010. “Randomizing Security Activities with Attacker Circumvention Strategies.” In AAMAS 2010 Workshop on Optimisation in Multi-Agent Systems (OptMas).
Game-theoretic methods for making resource allocation decisions in security domains have attracted growing attention from both researchers and security practitioners, including deployed applications at both the LAX airport and the Federal Air Marshals Service. We develop a new class of security games designed to model decisions faced by the Transportation Security Administration and other agencies in protecting airports, ports, and other critical infrastructure. Our model allows for a more diverse set of security activities for the defensive resources than previous work, which has generally focused on interchangeable resources that can only defend against possible attacks in one way. Here, we are concerned in particular with the possibility that adversaries can circumvent specific security activities if they are aware of common security measures. The model we propose takes this capability into account and generates more unpredictable, diverse security policies as a result—without resorting to an external value for entropy or randomness. Solving these games is a significant computational challenge, and existing algorithms are not capable of solving realistic games. We introduce a new method that exploits common structure in these problems to reduce the size of the game representation and enable faster solution algorithms. These algorithms scale to larger games than existing solvers, as we show in our experimental results.
2010_14_teamcore_optimas_pita.pdf
Christopher Kiekintveld, Milind Tambe, and Janusz Marecki. 2010. “Robust Bayesian Methods for Stackelberg Security Games (Extended Abstract).” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS) [Short Paper].
Recent work has applied game-theoretic models to real-world security problems at the Los Angeles International Airport (LAX) and Federal Air Marshals Service (FAMS). The analysis of these domains is based on input from domain experts intended to capture the best available intelligence information about potential terrorist activities and possible security countermeasures. Nevertheless, these models are subject to significant uncertainty—especially in security domains where intelligence about adversary capabilities and preferences is very difficult to gather. This uncertainty presents significant challenges for applying game-theoretic analysis in these domains. Our experimental results show that standard solution methods based on perfect information assumptions are very sensitive to payoff uncertainty, resulting in low payoffs for the defender. We describe a model of Bayesian Stackelberg games that allows for general distributional uncertainty over the attacker’s payoffs. We conduct an experimental analysis of two algorithms for approximating equilibria of these games, and show that the resulting solutions give much better results than the standard approach when there is payoff uncertainty.
2010_5_teamcore_aamas10_robustness.pdf
James Pita, Manish Jain, Fernando Ordonez, Milind Tambe, and Sarit Kraus. 2010. “Robust Solutions to Stackelberg Games: Addressing Bounded Rationality and Limited Observations in Human Cognition.” Artificial Intelligence Journal, 174, 15, Pp. 1142-1171.
How do we build algorithms for agent interactions with human adversaries? Stackelberg games are natural models for many important applications that involve human interaction, such as oligopolistic markets and security domains. In Stackelberg games, one player, the leader, commits to a strategy and the follower makes her decision with knowledge of the leader’s commitment. Existing algorithms for Stackelberg games efficiently find optimal solutions (leader strategy), but they critically assume that the follower plays optimally. Unfortunately, in many applications, agents face human followers (adversaries) who — because of their bounded rationality and limited observation of the leader strategy — may deviate from their expected optimal response. In other words, human adversaries’ decisions are biased due to their bounded rationality and limited observations. Not taking into account these likely deviations when dealing with human adversaries may cause an unacceptable degradation in the leader’s reward, particularly in security applications where these algorithms have seen deployment. The objective of this paper therefore is to investigate how to build algorithms for agent interactions with human adversaries. To address this crucial problem, this paper introduces a new mixed-integer linear program (MILP) for Stackelberg games to consider human adversaries, incorporating: (i) novel anchoring theories on human perception of probability distributions and (ii) robustness approaches for MILPs to address human imprecision. Since this new approach considers human adversaries, traditional proofs of correctness or optimality are insufficient; instead, it is necessary to rely on empirical validation. 
To that end, this paper considers four settings based on real deployed security systems at Los Angeles International Airport [43], and compares 6 different approaches (three based on our new approach and three previous approaches), in 4 different observability conditions, involving 218 human subjects playing 2960 games in total. The final conclusion is that a model which incorporates both the ideas of robustness and anchoring achieves statistically significant higher rewards and also maintains equivalent or faster solution speeds compared to existing approaches.
2010_16_teamcore_aij3.pdf
Manish Jain, Erim Kardes, Christopher Kiekintveld, Milind Tambe, and Fernando Ordonez. 2010. “Security Games with Arbitrary Schedules: A Branch and Price Approach.” In National Conference on Artificial Intelligence (AAAI).
Security games, an important class of Stackelberg games, are used in deployed decision-support tools in use by LAX police and the Federal Air Marshals Service. The algorithms used to solve these games find optimal randomized schedules to allocate security resources for infrastructure protection. Unfortunately, the state-of-the-art algorithms either fail to scale or to provide a correct solution for large problems with arbitrary scheduling constraints. We introduce ASPEN, a branch-and-price approach that overcomes these limitations based on two key contributions: (i) A column-generation approach that exploits a novel network flow representation, avoiding a combinatorial explosion of schedule allocations; (ii) A branch-and-bound algorithm that generates bounds via a fast algorithm for solving security games with relaxed scheduling constraints. ASPEN is the first known method for efficiently solving massive security games with arbitrary schedules.
2010_10_teamcore_aaai10_jain.pdf
Manish Jain, James Pita, Jason Tsai, Christopher Kiekintveld, Shyamsunder Rathi, Fernando Ordonez, and Milind Tambe. 2010. “Software Assistants for patrol planning at LAX and Federal Air Marshals Service.” Interfaces, 40, 4, Pp. 267-290.
Security at major locations of economic or political importance is a key concern around the world, particularly given the increasing threat of terrorism. Limited security resources prevent full security coverage at all times, which allows adversaries to observe and exploit patterns in patrolling or monitoring, e.g. they can plan an attack that avoids existing patrols. An important method of countering the surveillance capabilities of an adversary is to use randomized security policies that are more difficult to predict and exploit. We describe two deployed applications that assist security forces in randomizing their operations based on fast algorithms for solving large instances of Bayesian Stackelberg games. The first is the ARMOR system (Assistant for Randomized Monitoring over Routes), which has been successfully deployed since August 2007 at the Los Angeles International Airport (LAX). This system is used by airport police to randomize the placement of checkpoints on roads entering the airport, and the routes of canine unit patrols in the airport terminals. The IRIS system (Intelligent Randomization in Scheduling) is designed to randomize flight schedules for the Federal Air Marshals Service (FAMS). IRIS has been deployed in a pilot program by FAMS since October 2009 to randomize schedules of air marshals on international flights. These assistants share several key features: (i) they are based on Stackelberg game models to intelligently weight the randomized schedules, (ii) they use efficient mixed-integer programming formulations of the game models to enable fast solutions for large games, and (iii) they allow for interactive manipulation of the domain constraints and parameters by the users. This paper examines the design choices, information, and evaluation that went into building these effective applications.
2010_2_teamcore_09interfaces.pdf
Zhengyu Yin, Dmytro Korzhyk, Christopher Kiekintveld, Vincent Conitzer, and Milind Tambe. 2010. “Stackelberg vs. Nash in Security Games: Interchangeability, Equivalence, and Uniqueness.” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS).
There has been significant recent interest in game theoretic approaches to security, with much of the recent research focused on utilizing the leader-follower Stackelberg game model; for example, these games are at the heart of major applications such as the ARMOR program deployed for security at the LAX airport since 2007 and the IRIS program in use by the US Federal Air Marshals (FAMS). The foundational assumption for using Stackelberg games is that security forces (leaders), acting first, commit to a randomized strategy; while their adversaries (followers) choose their best response after surveillance of this randomized strategy. Yet, in many situations, the followers may act without observation of the leader’s strategy, essentially converting the game into a simultaneous-move game model. Previous work fails to address how a leader should compute her strategy given this fundamental uncertainty about the type of game faced. Focusing on the complex games that are directly inspired by real-world security applications, the paper provides four contributions in the context of a general class of security games. First, exploiting the structure of these security games, the paper shows that the Nash equilibria in security games are interchangeable, thus alleviating the equilibrium selection problem. Second, resolving the leader’s dilemma, it shows that under a natural restriction on security games, any Stackelberg strategy is also a Nash equilibrium strategy; and furthermore, the solution is unique in a class of real-world security games of which ARMOR is a key exemplar. Third, when faced with a follower that can attack multiple targets, many of these properties no longer hold. Fourth, our experimental results emphasize positive properties of games that do not fit our restrictions. Our contributions have major implications for the real-world applications.
2010_6_teamcore_aamas10_obs.pdf
Jun-young Kwak, Rong Yang, Zhengyu Yin, Matthew E. Taylor, and Milind Tambe. 2010. “Teamwork and Coordination under Model Uncertainty in DEC-POMDPs.” In AAAI 2010 Workshop on Interactive Decision Theory and Game Theory (IDTGT).
Distributed Partially Observable Markov Decision Processes (DEC-POMDPs) are a popular planning framework for multiagent teamwork to compute (near-)optimal plans. However, these methods assume a complete and correct world model, which is often violated in real-world domains. We provide a new algorithm for DEC-POMDPs that is more robust to model uncertainty, with a focus on domains with sparse agent interactions. Our STC algorithm relies on the following key ideas: (1) reduce planning-time computation by shifting some of the burden to execution-time reasoning, (2) exploit sparse interactions between agents, and (3) maintain an approximate model of agents’ beliefs. We empirically show that STC is often substantially faster than existing DEC-POMDP methods without sacrificing reward performance.
2010_18_teamcore_pomdp_10aaaiws.pdf
Scott Alfeld, Matthew E. Taylor, Prateek Tandon, and Milind Tambe. 2010. “Towards a Theoretic Understanding of DCEE.” In DCR 10.
Common wisdom says that the greater the level of teamwork, the higher the performance of the team. In teams of cooperative autonomous agents, working together rather than independently can increase the team reward. However, recent results show that in uncertain environments, increasing the level of teamwork can actually decrease overall performance. Coined the team uncertainty penalty, this phenomenon has been shown empirically in simulation, but the underlying mathematics are not yet understood. By understanding the mathematics, we could develop algorithms that reduce or eliminate this penalty of increased teamwork. In this paper we investigate the team uncertainty penalty on two fronts. First, we provide results of robots exhibiting the same behavior seen in simulations. Second, we present a mathematical foundation by which to analyze the phenomenon. Using this model, we present findings indicating that the team uncertainty penalty is inherent to the level of teamwork allowed, rather than to specific algorithms.
2010_8_teamcore_alfeld_dcr10cr.pdf
Jason Tsai, Zhengyu Yin, Jun-young Kwak, David Kempe, Christopher Kiekintveld, and Milind Tambe. 2010. “Urban Security: Game-Theoretic Resource Allocation in Networked Physical Domains.” In National Conference on Artificial Intelligence (AAAI).
Law enforcement agencies frequently must allocate limited resources to protect targets embedded in a network, such as important buildings in a city road network. Since intelligent attackers may observe and exploit patterns in the allocation, it is crucial that the allocations be randomized. We cast this problem as an attacker-defender Stackelberg game: the defender’s goal is to obtain an optimal mixed strategy for allocating resources. The defender’s strategy space is exponential in the number of resources, and the attacker’s exponential in the network size. Existing algorithms are therefore useless for all but the smallest networks. We present a solution approach based on two key ideas: (i) A polynomial-sized game model obtained via an approximation of the strategy space, solved efficiently using a linear program; (ii) Two efficient techniques that map solutions from the approximate game to the original, with proofs of correctness under certain assumptions. We present in-depth experimental results, including an evaluation on part of the Mumbai road network.
2010_9_teamcore_aaai_tsai.pdf
Matthew E. Taylor, Manish Jain, Yanquin Jin, Makoto Yokoo, and Milind Tambe. 2010. “When Should There be a "Me" in "Team"? Distributed Multi-Agent Optimization Under Uncertainty.” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS).
Increasing teamwork between agents typically increases the performance of a multi-agent system, at the cost of increased communication and higher computational complexity. This work examines joint actions in the context of a multi-agent optimization problem where agents must cooperate to balance exploration and exploitation. Surprisingly, results show that increased teamwork can hurt agent performance, even when communication and computation costs are ignored, which we term the team uncertainty penalty. This paper introduces this phenomenon, analyzes it, and presents algorithms to reduce the effect of the penalty in our problem setting.
2010_3_teamcore_dcee2010.pdf
2009
Matthew E. Taylor, Chris Kiekintveld, Craig Western, and Milind Tambe. 7/2009. “Is There a Chink in Your ARMOR? Towards Robust Evaluations for Deployed Security Systems.” In IJCAI 2009 Workshop on Quantitative Risk Analysis for Security Applications (QRASA).
A growing number of security applications, designed to reduce risk from adversaries’ actions, are being developed and deployed. However, there are many challenges when attempting to evaluate such systems, both in the lab and in the real world. Traditional evaluations used by computer scientists, such as runtime analysis and optimality proofs, may be largely irrelevant. The primary contribution of this paper is to provide a preliminary framework which can guide the evaluation of such systems and to apply the framework to the evaluation of ARMOR (a system deployed at LAX since August 2007). This framework helps determine what experiments could, and should, be run in order to measure a system’s overall utility. A secondary contribution of this paper is to help familiarize our community with some of the difficulties inherent in evaluating deployed applications, focusing on those in security domains.
2009_11_teamcore_qrasa09_taylor.pdf
Jason Tsai, Emma Bowring, Shira Epstein, Natalie Fridman, Prakhar Garg, Gal Kaminka, Andrew Ogden, Milind Tambe, and Matthew Taylor. 2009. “Agent-based Evacuation Modeling: Simulating the Los Angeles International Airport.” In Workshop on Emergency Management: Incident, Resource, and Supply Chain Management (EMWS09).
In the aftermath of a terrorist attack on a large public venue such as an airport, a train station, or a theme park, rapid but safe evacuation is critical. For example, multiple IED explosions at the Los Angeles International Airport (LAX) will require evacuation of thousands of travelers and airport staff, while mitigating the risks from possible secondary explosions. In such cases, it is crucial to obtain information on the fastest evacuation routes, the time needed to evacuate people, the lag between the disaster and the arrival of a bomb squad or other emergency personnel, and what tactics and procedures authorities can use to avoid stampedes, confusion, and loss of life. By understanding possible scenarios in simulation beforehand, emergency personnel can be trained so that they respond in the event of an actual evacuation. Unfortunately, conducting live exercises for evacuating thousands of people is generally impossible. It would be time-consuming and would result in tremendous lost revenue. Moreover, a staged evacuation would necessarily miss crucial aspects of the required response (e.g. fear, confusion) on the part of both emergency personnel and crowd evacuees, or the exercise would be considered unethical. Smaller scale evacuation exercises miss the most important feature of an evacuation: its massive scale. Simulations provide one attractive answer. Evacuation simulations allow us to meet the required training, evacuation planning, and tactics development goals; by running large numbers of “faster than real-time” simulations, we can obtain data from large numbers of scenarios that would be near-impossible in live exercises. This information can be used by policy-makers to predetermine optimal solutions to time-critical situations such as those involving injuries, IEDs, or active shooters.
Additionally, real-time simulations can be created for officers on the ground who may only see a handful of real emergencies in their careers and thus could benefit immensely from repeated scenario-style training tools to learn with. In building these simulations, we bring to bear over two decades of experience in agent-based simulations, including battlefield simulations of battalions of virtual helicopter pilots or teams of virtual fighter pilots for DARPA’s Synthetic Theater of War program and disaster response simulations in collaboration with the Los Angeles Fire Department.
2009_13_teamcore_workshop_v3.pdf
J. Pita, M. Jain, C. Western, P. Paruchuri, J. Marecki, M. Tambe, F. Ordonez, and S. Kraus. 2009. “ARMOR Software: A game theoretic approach to airport security.” In Protecting Airline Passengers in the Age of Terrorism, edited by P. Seidenstat. Praeger Publishers.
Protecting national infrastructure such as airports is a challenging task for police and security agencies around the world; a challenge that is exacerbated by the threat of terrorism. Protection of these important locations includes, but is not limited to, tasks such as monitoring all entrances or inbound roads and checking inbound traffic. However, limited resources imply that it is typically impossible to provide full security coverage at all times. Furthermore, adversaries can observe security arrangements over time and exploit any predictable patterns to their advantage. Randomizing schedules for patrolling, checking, or monitoring is thus an important tool in the police arsenal to avoid the vulnerability that comes with predictability. This chapter focuses on a deployed software assistant agent that can aid police or other security agencies in randomizing their security schedules. We face at least three key challenges in building such a software assistant. First, the assistant must provide quality guarantees in randomization by appropriately weighing the costs and benefits of the different options available. For example, if an attack on one part of an infrastructure will cause economic damage while an attack on another could potentially cost human lives, we must weigh the two options differently – giving higher weight (probability) to guarding the latter. Second, the assistant must address the uncertainty in information that security forces have about the adversary. Third, the assistant must enable a mixed-initiative interaction with potential users rather than dictating a schedule; the assistant may be unaware of users’ real-world constraints and hence users must be able to shape the schedule development. We have addressed these challenges in a software assistant agent called ARMOR (Assistant for Randomized Monitoring over Routes). Based on game-theoretic principles, ARMOR combines three key features to address each of the challenges outlined above.
Game theory is a well-established foundational principle within multi-agent systems to reason about multiple agents each pursuing their own interests (Fudenberg & Tirole 1991). We build on these game theoretic foundations to reason about two agents – the police force and their adversary – in providing a method of randomization. In particular, the main contribution of our work is mapping the problem of security scheduling as a Bayesian Stackelberg game (Conitzer & Sandholm 2006) and solving it within our software system using the fastest optimal algorithm for such games (Paruchuri et al. 2008), addressing the first two challenges. While a Bayesian game allows us to address uncertainty over adversary types, by optimally solving such Bayesian Stackelberg games (which yield optimal randomized strategies as solutions), ARMOR provides quality guarantees on the schedules generated. The algorithm used builds on several years of research regarding multi-agent systems and security (Paruchuri et al. 2005; 2006; 2007). ARMOR employs an algorithm that is a logical culmination of this line of research; in particular, ARMOR relies on an optimal algorithm called DOBSS (Decomposed Optimal Bayesian Stackelberg Solver) (Paruchuri et al. 2008). The third challenge is addressed by ARMOR’s use of a mixed-initiative based interface, where users are allowed to graphically enter different constraints to shape the schedule generated. ARMOR is thus a collaborative assistant that iterates over generated schedules rather than a rigid one-shot scheduler. ARMOR also alerts users in case overrides may potentially deteriorate schedule quality. ARMOR therefore represents a very promising transition of multi-agent research into a deployed application.
ARMOR has been successfully deployed since August 2007 at the Los Angeles International Airport (LAX) to assist the Los Angeles World Airports (LAWA) police in randomized scheduling of checkpoints, and since November 2007 for generating randomized patrolling schedules for canine units. In particular, it assists police in determining where to randomly set up checkpoints and where to randomly allocate canines to terminals. Indeed, February 2008 marked the successful end of the six-month trial period of ARMOR deployment at LAX. The feedback from police at the end of this six-month period was extremely positive; ARMOR will continue to be deployed at LAX and expand to other police activities at LAX.
2009_18_teamcore_armor_book_chapter.pdf
Emma Bowring and Milind Tambe. 2009. “Bridging the Gap: Introducing Agents and Multiagent Systems to Undergraduate Students.” In Educational Uses of Multi-Agent Systems (EDUMAS).
The field of “intelligent agents and multiagent systems” is maturing; no longer is it a special topic to be introduced to graduate students after years of training in computer science and many introductory courses in Artificial Intelligence. Instead, the time is ripe to face the challenge of introducing agents and multiagents directly to undergraduate students, whether majoring in computer science or not. This paper focuses on exactly this challenge, drawing on the co-authors’ experience of teaching several such undergraduate courses on agents and multiagents, over the last three years at two different universities. The paper outlines three key issues that must be addressed. The first issue is facilitating students’ intuitive understanding of fundamental concepts of multiagent systems; we illustrate uses of science fiction materials and classroom games to provide students not only with the necessary intuitive understanding but also with the excitement and motivation for studying multiagent systems. The second is in selecting the right material — either science-fiction material or games — for providing students the necessary motivation and intuition; we outline several criteria that have been useful in selecting such material. The third issue is in educating students about the fundamental philosophical, ethical and social issues surrounding agents and multiagent systems: we outline course materials and classroom activities that allow students to obtain this “big picture” futuristic vision of our science. We conclude with feedback received, lessons learned and impact on both the computer science students and non computer-science students.
2009_8_teamcore_bowring_tambe2.pdf
Christopher Kiekintveld, Manish Jain, Jason Tsai, James Pita, Fernando Ordóñez, and Milind Tambe. 2009. “Computing Optimal Randomized Resource Allocations for Massive Security Games.” In The Eighth International Conference on Autonomous Agents and Multiagent Systems.
Predictable allocations of security resources such as police officers, canine units, or checkpoints are vulnerable to exploitation by attackers. Recent work has applied game-theoretic methods to find optimal randomized security policies, including a fielded application at the Los Angeles International Airport (LAX). This approach has promising applications in many similar domains, including police patrolling for subway and bus systems, randomized baggage screening, and scheduling for the Federal Air Marshal Service (FAMS) on commercial flights. However, the existing methods scale poorly when the security policy requires coordination of many resources, which is central to many of these potential applications. We develop new models and algorithms that scale to much more complex instances of security games. The key idea is to use a compact model of security games, which allows exponential improvements in both memory and runtime relative to the best known algorithms for solving general Stackelberg games. We develop even faster algorithms for security games under payoff restrictions that are natural in many security domains. Finally, we introduce additional realistic scheduling constraints while retaining comparable performance improvements. The empirical evaluation comprises both random data and realistic instances of the FAMS and LAX problems. Our new methods scale to problems several orders of magnitude larger than the fastest known algorithm.
2009_7_teamcore_computing_aamas_09.pdf
Manish Jain, Matthew Taylor, Milind Tambe, and Makoto Yokoo. 2009. “DCOPs Meet the Real World: Exploring Unknown Reward Matrices with Applications to Mobile Sensor Networks.” In International Joint Conference on Artificial Intelligence (IJCAI).
Buoyed by recent successes in the area of distributed constraint optimization problems (DCOPs), this paper addresses challenges faced when applying DCOPs to real-world domains. Three fundamental challenges must be addressed for a class of real-world domains, requiring novel DCOP algorithms. First, agents may not know the payoff matrix and must explore the environment to determine rewards associated with variable settings. Second, agents may need to maximize total accumulated reward rather than instantaneous final reward. Third, limited time horizons disallow exhaustive exploration of the environment. We propose and implement a set of novel algorithms that combine decision-theoretic exploration approaches with DCOP-mandated coordination. In addition to simulation results, we implement these algorithms on robots, deploying DCOPs on a distributed mobile sensor network.
2009_10_teamcore_09ijcailandroids.pdf
James Pita, Manish Jain, Fernando Ordóñez, Milind Tambe, Sarit Kraus, and Reuma Magori-Cohen. 2009. “Effective Solutions for Real-World Stackelberg Games: When Agents Must Deal with Human Uncertainties.” In The Eighth International Conference on Autonomous Agents and Multiagent Systems.
How do we build multiagent algorithms for agent interactions with human adversaries? Stackelberg games are natural models for many important applications that involve human interaction, such as oligopolistic markets and security domains. In Stackelberg games, one player, the leader, commits to a strategy and the follower makes their decision with knowledge of the leader’s commitment. Existing algorithms for Stackelberg games efficiently find optimal solutions (leader strategy), but they critically assume that the follower plays optimally. Unfortunately, in real-world applications, agents face human followers who — because of their bounded rationality and limited observation of the leader strategy — may deviate from their expected optimal response. Not taking into account these likely deviations when dealing with human adversaries can cause an unacceptable degradation in the leader’s reward, particularly in security applications where these algorithms have seen real-world deployment. To address this crucial problem, this paper introduces three new mixed-integer linear programs (MILPs) for Stackelberg games to consider human followers, incorporating: (i) novel anchoring theories on human perception of probability distributions and (ii) robustness approaches for MILPs to address human imprecision. Since these new approaches consider human followers, traditional proofs of correctness or optimality are insufficient; instead, it is necessary to rely on empirical validation. To that end, this paper considers two settings based on real deployed security systems, and compares 6 different approaches (three new with three previous approaches), in 4 different observability conditions, involving 98 human subjects playing 1360 games in total. The final conclusion was that a model which incorporates both the ideas of robustness and anchoring achieves statistically significant better rewards and also maintains equivalent or faster solution speeds compared to existing approaches.
2009_2_teamcore_cobra.pdf
Pradeep Varakantham, Jun-young Kwak, Matthew Taylor, Janusz Marecki, Paul Scerri, and Milind Tambe. 2009. “Exploiting Coordination Locales in Distributed POMDPs via Social Model Shaping.” In International Conference on Automated Planning and Scheduling (ICAPS).
Distributed POMDPs provide an expressive framework for modeling multiagent collaboration problems, but NEXP-Complete complexity hinders their scalability and application in real-world domains. This paper introduces a subclass of distributed POMDPs, and TREMOR, an algorithm to solve such distributed POMDPs. The primary novelty of TREMOR is that agents plan individually with a single agent POMDP solver and use social model shaping to implicitly coordinate with other agents. Experiments demonstrate that TREMOR can provide solutions orders of magnitude faster than existing algorithms while achieving comparable, or even superior, solution quality.
2009_12_teamcore_icaps09_tremor.pdf