Stackelberg security games (SSGs) are now established as a powerful tool in security domains. In this paper, we consider a new
dimension of security games: the risk preferences of the attacker.
Previous work assumes a risk-neutral attacker that maximizes his
expected reward. However, extensive studies show that the attackers in some domains are in fact risk-averse, e.g., terrorist groups
in counter-terrorism domains. The failure to incorporate the risk
aversion in SSG models may lead the defender to suffer significant
losses. Additionally, defenders are uncertain about the degree of the attacker’s risk aversion. Motivated by this challenge, this paper provides the following five contributions: (i) we propose a novel model
for security games against risk-averse attackers with uncertainty in
the degree of their risk aversion; (ii) we develop an intuitive MIBLP formulation based on previous security games research, but find that it yields only locally optimal solutions and is unable to scale up;
(iii) based on insights from our MIBLP formulation, we develop
our scalable BeRRA algorithm that finds globally ε-optimal solutions; (iv) our BeRRA algorithm can also be extended to handle
other risk-aware attackers, e.g., risk-seeking attackers; (v) we show
that we do not need to consider the attacker’s risk attitude in zero-sum games.
Most models of Stackelberg security games assume that the attacker only knows the defender’s mixed strategy, but is not able to observe (even partially) the instantiated pure strategy. Such partial observation of the deployed pure strategy – an issue we refer to as information
leakage – is a significant concern in practical applications. While previous research on patrolling
games has considered the attacker’s real-time surveillance, our setting, and therefore our models and
techniques, are fundamentally different. More specifically, after describing the information leakage model, we start with an LP formulation to compute the defender’s optimal strategy in the
presence of leakage. Perhaps surprisingly, we show that a key subproblem to solve this LP
(more precisely, the defender oracle) is NP-hard even for the simplest of security game models.
We then approach the problem from three possible directions: efficient algorithms for restricted
cases, approximation algorithms, and heuristic algorithms for sampling that improve upon the
status quo. Our experiments confirm the necessity of handling information leakage and the
advantage of our algorithms.
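As background for the leakage LP, the no-leakage zero-sum baseline it extends can be computed without LP machinery: binary search on the defender's value, checking whether the marginal coverage needed to guarantee that value fits within the resource budget. A simplified sketch (this baseline ignores leakage entirely, and the utility values in the example are illustrative):

```python
def maximin_coverage(u_unc, u_cov, resources, iters=60):
    """Defender's maximin value in a zero-sum security game with marginal
    coverage: on an attacked target t the defender gets u_unc[t] if the
    target is uncovered and u_cov[t] (> u_unc[t]) if covered.

    A value v is achievable iff the coverage needed to lift every target's
    defender-facing value to at least v fits in the budget, so we binary
    search on v. Illustrative baseline only; it does not model leakage.
    """
    def needed(v):
        total = 0.0
        for uu, uc in zip(u_unc, u_cov):
            if v <= uu:
                continue              # no coverage needed on this target
            if v > uc:
                return float('inf')   # unreachable even with full coverage
            total += (v - uu) / (uc - uu)
        return total

    lo, hi = min(u_unc), max(u_cov)
    for _ in range(iters):
        mid = (lo + hi) / 2
        if needed(mid) <= resources:
            lo = mid
        else:
            hi = mid
    return lo
```

With two identical targets (uncovered utility -10, covered utility 0) and one resource, the optimum splits coverage evenly and yields value -5.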
Teams of voting agents are a powerful tool for solving complex problems. When forming such teams, there are three
fundamental issues that must be addressed: (i) Selecting
which agents should form a team; (ii) Aggregating the opinions of the agents; (iii) Assessing the performance of a team.
In this thesis we address all these points.
Teams of voting agents have great potential in finding optimal solutions. However, there are fundamental challenges to effectively use such teams: (i)
selecting agents; (ii) aggregating opinions; (iii) assessing performance. I address all these challenges,
with theoretical and experimental contributions.
While human behavior models based on repeated Stackelberg games have been proposed for domains such as “wildlife crime” where there is repeated interaction between the defender and the adversary, there has been no empirical study with human subjects to show the effectiveness of such models. This paper presents an initial study based on extensive human subject experiments with participants on Amazon Mechanical Turk (AMT). Our findings include: (i) attackers may view the defender’s coverage probability in a non-linear fashion; specifically it follows an S-shaped curve, and (ii) there are significant losses in defender utility when strategies generated by existing models are deployed in repeated Stackelberg game settings against human subjects.
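One standard two-parameter family with exactly this S-shape is the Gonzalez-Wu probability weighting function from behavioral decision theory; a minimal sketch (not necessarily the exact form fitted in the paper, and the parameter values here are illustrative):

```python
def weight(p, gamma=0.6, delta=0.8):
    """Gonzalez-Wu weighting: w(p) = d*p^g / (d*p^g + (1-p)^g).

    gamma < 1 produces the S-shaped (inverse-S) curve: small coverage
    probabilities are overweighted and large ones underweighted, while
    w(0) = 0 and w(1) = 1. Parameter values are illustrative, not fitted.
    """
    num = delta * p ** gamma
    return num / (num + (1 - p) ** gamma)

print(weight(0.1) > 0.1)   # True: small probability overweighted
print(weight(0.9) < 0.9)   # True: large probability underweighted
```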
Several competing human behavior models have been proposed to model and protect against boundedly rational adversaries in repeated Stackelberg security games (SSGs). However, these existing models fail to address three main issues which are extremely detrimental to defender performance. First, while they attempt to learn adversary behavior models from adversaries’ past actions (“attacks on targets”), they fail to take into account adversaries’ future adaptation based on successes or failures of these past actions. Second, they assume that sufficient data in the initial rounds will lead to a reliable model of the adversary. However, our analysis reveals that the issue is not the amount of data, but that there just is not enough of the attack surface exposed to the adversary to learn a reliable model. Third, current leading approaches have failed to include probability weighting functions, even though it is well known that human beings’ weighting of probability is typically nonlinear. The first contribution of this paper is a new human behavior model, SHARP, which mitigates these three limitations as follows: (i) SHARP reasons based on success or failure of the adversary’s past actions on exposed portions of the attack surface to model adversary adaptiveness; (ii) SHARP reasons about similarity between exposed and unexposed areas of the attack surface, and also incorporates a discounting parameter to mitigate adversary’s lack of exposure to enough of the attack surface; and (iii) SHARP integrates a non-linear probability weighting function to capture the adversary’s true weighting of probability. Our second contribution is a first “longitudinal study” – at least in the context of SSGs – of competing models in settings involving repeated interaction between the attacker and the defender. 
This study, where each experiment lasted a period of multiple weeks with individual sets of human subjects, illustrates the strengths and weaknesses of different models and shows the advantages of SHARP.
Recent research on Green Security Games (GSG), i.e., security games for the protection of wildlife, forest and fisheries, relies on the promise of an abundance of available data in these domains to learn adversary behavioral models and determine game payoffs. This research suggests that adversary behavior models (capturing bounded rationality) can be learned from real-world data on where adversaries have attacked, and that game payoffs can be determined precisely from data on animal densities. However, previous work has, as yet, failed to demonstrate the usefulness of these behavioral models in capturing adversary behaviors based on real-world data in GSGs. Previous work has also been unable to address situations where available data is insufficient to accurately estimate behavioral models or to obtain the required precision in the payoff values. In addressing these limitations, as our first contribution, this paper, for the first time, provides validation of the aforementioned adversary behavioral models based on real-world data from a wildlife park in Uganda. Our second contribution addresses situations where real-world data is not precise enough to determine exact payoffs in GSG, by providing the first algorithm to handle payoff uncertainty in the presence of adversary behavioral models. This algorithm is based on the notion of minimax regret. Furthermore, in scenarios where the data is not even sufficient to learn adversary behaviors, our third contribution is to provide a novel algorithm to address payoff uncertainty assuming a perfectly rational attacker (instead of relying on a behavioral model); this algorithm allows for a significant scaleup for large security games. Finally, to reduce the problems due to paucity of data, given mobile sensors such as Unmanned Aerial Vehicles (UAV), we introduce new payoff elicitation strategies to strategically reduce uncertainty.
Homeless youth are prone to HIV due to their engagement in high risk behavior. Many agencies conduct interventions to educate/train a select group of homeless youth about HIV prevention practices and rely on word-of-mouth spread of information through their social network. Previous work in strategic selection of intervention participants does not handle uncertainties in the social network’s structure and in the evolving network state, potentially causing significant shortcomings in spread of information. Thus, we developed PSINET, a decision support system to aid the agencies in this task. PSINET includes the following key novelties: (i) it handles uncertainties in network structure and evolving network state; (ii) it addresses these uncertainties by using POMDPs in influence maximization; (iii) it provides algorithmic advances to allow high quality approximate solutions for such POMDPs. Simulations show that PSINET achieves ∼60% more information spread over the current state-of-the-art. PSINET was developed in collaboration with My Friend’s Place (a drop-in agency serving homeless youth in Los Angeles) and is currently being reviewed by their officials.
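The influence-maximization objective that PSINET optimizes under uncertainty can be illustrated, in its simplest known-network form, by the classic greedy algorithm under the independent cascade model (this sketch omits the POMDP and all network uncertainty; the graph and propagation probability are toy assumptions):

```python
import random

def simulate_cascade(graph, seeds, p=0.3, trials=200, rng=None):
    """Estimate expected spread under the independent cascade model:
    each newly informed node informs each neighbor with probability p."""
    rng = rng or random.Random(0)
    total = 0
    for _ in range(trials):
        active, frontier = set(seeds), list(seeds)
        while frontier:
            node = frontier.pop()
            for nb in graph.get(node, []):
                if nb not in active and rng.random() < p:
                    active.add(nb)
                    frontier.append(nb)
        total += len(active)
    return total / trials

def greedy_seeds(graph, k, p=0.3):
    """Greedily pick k intervention participants maximizing estimated
    marginal spread. Assumes the network is fully known, unlike PSINET."""
    seeds = []
    nodes = set(graph) | {v for nbs in graph.values() for v in nbs}
    for _ in range(k):
        best = max(nodes - set(seeds),
                   key=lambda v: simulate_cascade(graph, seeds + [v], p))
        seeds.append(best)
    return seeds
```

On a toy star network, the greedy choice is the hub, as expected.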
Building on the successful applications of Stackelberg Security Games (SSGs) to protect infrastructure, researchers have begun focusing on applying game theory to green security domains such as protection of endangered animals and fish stocks. Previous efforts in these domains optimize defender strategies based on the standard Stackelberg assumption that the adversaries become fully aware of the defender’s strategy before taking action. Unfortunately, this assumption is inappropriate since adversaries in green security domains often lack the resources to fully track the defender strategy. This paper (i) introduces Green Security Games (GSGs), a novel game model for green security domains with a generalized Stackelberg assumption; (ii) provides algorithms to plan effective sequential defender strategies — such planning was absent in previous work; (iii) proposes a novel approach to learn adversary models that further improves defender performance; and (iv) provides detailed experimental analysis of proposed approaches.
Stackelberg games form the core of a number of tools deployed for computing optimal patrolling strategies in adversarial domains, such as the US Federal Air Marshall Service
and the US Coast Guard. In traditional Stackelberg security
game models the attacker knows only the probability that
each target is covered by the defender, but is oblivious to the
detailed timing of the coverage schedule. In many real-world
situations, however, the attacker can observe the current location of the defender and can exploit this knowledge to reason
about the defender’s future moves. We show that this general modeling framework can be captured using adversarial
patrolling games (APGs) in which the defender sequentially
moves between targets, with moves constrained by a graph,
while the attacker can observe the defender’s current location
and his (stochastic) policy concerning future moves. We offer a very general model of infinite-horizon discounted adversarial patrolling games. Our first contribution is to show that
defender policies that condition only on the previous defense
move (i.e., Markov stationary policies) can be arbitrarily suboptimal for general APGs. We then offer a mixed-integer nonlinear programming (MINLP) formulation for computing optimal randomized policies for the defender that can condition on history of bounded, but arbitrary, length, as well as
a mixed-integer linear programming (MILP) formulation to
approximate these, with provable quality guarantees. Additionally, we present a non-linear programming (NLP) formulation for solving zero-sum APGs. We show experimentally
that MILP significantly outperforms the MINLP formulation,
and is, in turn, significantly outperformed by the NLP specialized to zero-sum games.
Endangered species around the world are in danger of extinction from poaching. From the start of the 20th century, the African rhino population has dropped over 98% and the global tiger population has dropped over 95%, resulting in multiple species extinctions in both groups. Species extinctions have negative consequences on local ecosystems, economies, and communities. To protect these species, countries have set up conservation agencies and national parks, such as Uganda’s Queen Elizabeth National Park (QENP). However, a common lack of funding for these agencies results in a lack of law enforcement resources to protect these large, rural areas. As an example of the scale of disparity, one wildlife crime study in 2007 reported an actual coverage density of one ranger per 167 square kilometers. Because of the hazards involved (e.g., armed poachers, wild animals), rangers patrol in groups, further increasing the amount of area they are responsible for patrolling. Security game research has typically been concerned with combating terrorism, and this field has indeed benefited from a range of successfully deployed applications [1, 6]. These applications have enabled security agencies to make more efficient use of their limited resources. In this previous research, adversary data has been absent during the development of these solutions, and thus, it has been difficult to make accurate adversary behavior models during algorithm development. In a domain such as wildlife crime, interactions with the adversary are frequent and repeated, thus enabling conservation agencies to collect data. This presence of data enables security game researchers to begin developing algorithms that incorporate this data into, potentially, more accurate behavior models and consequently better security solutions. Developed in conjunction with staff at QENP, the Protection Assistant for Wildlife Security (PAWS) generates optimized defender strategies for use by park rangers.
Due to the repeated nature of wildlife crime, PAWS is able to leverage crime event data - a previously unrealized capability in security games research. Thus, PAWS implements a novel adaptive algorithm that processes crime event data, builds multiple human behavior models, and, based on those models, predicts where adversaries will attack next. These predictions are then used to generate a patrol strategy for the rangers (i.e., a set of patrol waypoints) that can be viewed on a GPS unit. Against this background, the demonstration presented in this paper introduces two contributions. First, we present the PAWS system, which incorporates the algorithm into a scheduling system and a GPS visualizer. Second, we present a software interface to run a number of human subject experiments (HSE) to evaluate and improve the efficacy of PAWS before its deployment in QENP. By conducting these HSEs, we can (i) test the PAWS algorithms with repeated interactions with humans, thus providing a more realistic testing environment than in its previous simulations; (ii) generate data that can be used to initialize PAWS’s human behavior models for deployment; and (iii) compare the current PAWS algorithms’ performance to alternatives and determine if additional improvements are needed prior to deployment. To provide proper context for the presentation, this paper also presents a brief overview of the PAWS system data flow and its adaptive algorithms. The demonstration will engage audience members by having them participate in the HSEs and using the GPS unit to visualize a patrol schedule in QENP.
Boundedly rational human adversaries pose a serious challenge to security because they deviate from the classical assumption of
perfect rationality. An emerging trend in security game research addresses this challenge by using behavioral models such as quantal response (QR) and subjective utility quantal response (SUQR). These
models improve the quality of the defender’s strategy by more accurately
modeling the decisions made by real human adversaries. Work on incorporating human behavioral models into security games has typically followed two threads. The first thread, scalability, seeks to develop efficient
algorithms to design patrols for large-scale domains that protect against
a single adversary. However, this thread cannot handle the common situation of multiple adversary types with heterogeneous behavioral models.
Having multiple adversary types introduces considerable uncertainty into
the defender’s planning problem. The second thread, robustness, uses either Bayesian or maximin approaches to handle this uncertainty caused
by multiple adversary types. However, the robust approach has so far
not been able to scale up to complex, large-scale security games. Thus,
each of these two threads alone fails to work in key real world security
games. Our present work addresses this shortcoming and merges these
two research threads to yield a scalable and robust algorithm, MIDAS
(MaxImin Defense Against SUQR), for generating game-theoretic patrols
to defend against multiple boundedly rational human adversaries. Given
the size of the defender’s optimization problem, the key component of
MIDAS is incremental cut and strategy generation using a master/slave
optimization approach. Innovations in MIDAS include (i) a maximin
mixed-integer linear programming formulation in the master and (ii) a
compact transition graph formulation in the slave. Additionally, we provide a theoretical analysis of our new model and report its performance
in simulations. In collaboration with the United States Coast Guard
(USCG), we consider the problem of defending fishery stocks from illegal fishing in the Gulf of Mexico and use MIDAS to handle heterogeneity
in adversary types (i.e., illegal fishermen) in order to construct robust
patrol strategies for USCG assets.
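For reference, the SUQR prediction at MIDAS's core is a logit over a weighted subjective utility of each target's coverage, reward, and penalty; a minimal sketch (the weight values below are illustrative assumptions, not values fitted from USCG data):

```python
import math

def suqr_attack_probs(coverage, rewards, penalties, w=(-9.0, 0.8, 0.6)):
    """Subjective Utility Quantal Response: the probability of attacking
    target t is proportional to exp(w1*x_t + w2*R_t + w3*P_t), where x_t
    is defender coverage, R_t the attacker's reward if successful, and
    P_t the (negative) penalty if caught. Weights are illustrative.
    """
    w1, w2, w3 = w
    scores = [math.exp(w1 * x + w2 * r + w3 * p)
              for x, r, p in zip(coverage, rewards, penalties)]
    z = sum(scores)
    return [s / z for s in scores]

# Higher coverage on a target lowers its predicted attack probability:
probs = suqr_attack_probs([0.8, 0.2], [5, 5], [-1, -1])
```

Under uncertainty, MIDAS plans against many such adversary types at once (many weight vectors), maximizing the worst case; this sketch shows only a single type's prediction.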
In this research-in-progress paper we present a new real
world domain for studying the aggregation of different
opinions: early stage architectural design of buildings.
This is an important real world application, not only
because building design and construction is one of the
world’s largest industries measured by global expenditures, but also because the early stage design decision
making has a significant impact on the energy consumption of buildings. We present a mapping between the domain of architecture and engineering research and that
of the agent models present in the literature. We study
the importance of forming diverse teams when aggregating the opinions of different agents for architectural
design, and also the effect of having agents optimizing
for different factors of a multi-objective optimization
design problem. We find that a diverse team of agents is
able to provide a higher number of top ranked solutions
for the early stage designer to choose from. Finally, we
present the next steps for a deeper exploration of our new domain.
This paper presents THINC, an agent developed for saving energy
in real-world commercial buildings. While previous work has presented techniques for computing energy-efficient schedules, it fails
to address two issues, centered on human users, that are essential in
real-world agent deployments: (i) incentivizing users for their energy saving activities and (ii) interacting with users to reschedule
key “energy-consuming” meetings in a timely fashion, while handling the uncertainty in such interactions. THINC addresses these
shortcomings by providing four new major contributions. First,
THINC computes fair division of credits from energy savings. For
this fair division, THINC provides novel algorithmic advances for
efficient computation of the Shapley value. Second, THINC includes a
novel robust algorithm to optimally reschedule identified key meetings while addressing user interaction uncertainty. Third, THINC provides an end-to-end integration within a single agent of energy-efficient scheduling, rescheduling and credit allocation. Finally, we
deploy THINC in the real-world as a pilot project at one of the main
libraries at the University of Southern California and present results
illustrating the benefits in saving energy.
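The Shapley value that THINC computes efficiently is defined as each user's average marginal contribution over all join orders; the brute-force definition, whose factorial cost motivates THINC's algorithmic advances, can be sketched as follows (the savings numbers in the example are made up):

```python
import math
from itertools import permutations

def shapley_values(players, value):
    """Exact Shapley value by averaging each player's marginal
    contribution over every ordering of the players.
    value(frozenset) -> total worth of a coalition (e.g., energy saved).
    Costs n! coalition evaluations, hence illustrative only.
    """
    phi = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            phi[p] += value(coalition | {p}) - value(coalition)
            coalition |= {p}
    n_fact = math.factorial(len(players))
    return {p: v / n_fact for p, v in phi.items()}

# Toy example: two users whose joint saving exceeds their solo savings.
savings = {frozenset(): 0, frozenset({'a'}): 2,
           frozenset({'b'}): 4, frozenset({'a', 'b'}): 10}
shares = shapley_values(['a', 'b'], lambda S: savings[S])
```

Here each user gets its solo saving plus half the synergy: `shares` is `{'a': 4.0, 'b': 6.0}`, and the shares always sum to the grand coalition's value.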
Security is a critical concern around the world that arises in
protecting our ports, airports, transportation and other critical national infrastructure from adversaries, in protecting our wildlife
and forests from poachers and smugglers, and in curtailing the illegal flow of weapons, drugs and money; and it arises in problems
ranging from physical to cyber-physical systems. In all of these
problems, we have limited security resources which prevent full
security coverage at all times; instead, security resources must be
deployed intelligently, taking into account differences in priorities of targets requiring security coverage, the responses of the attackers to the security posture, and potential uncertainty over the
types, capabilities, knowledge and priorities of attackers faced.
Game theory, which studies interactions among multiple self-interested agents, is well-suited to the adversarial reasoning required for security resource allocation and scheduling problems.
Casting the problem as a Bayesian Stackelberg game, we have
developed new algorithms for efficiently solving such games that
provide randomized patrolling or inspection strategies. These algorithms have led to some initial successes in this challenging
problem arena, leading to advances over previous approaches in
security scheduling and allocation, e.g., by addressing key weaknesses of predictability of human schedulers. These algorithms
are now deployed in multiple applications: ARMOR has been
deployed at the Los Angeles International Airport (LAX) since
2007 to randomize checkpoints on the roadways entering the airport and canine patrol routes within the airport terminals;
IRIS, a game-theoretic scheduler for randomized deployment of
the US Federal Air Marshals (FAMS) requiring significant scale-up in underlying algorithms, has been in use since 2009;
PROTECT, which schedules the US Coast Guard’s randomized
patrolling of ports using a new set of algorithms based on modeling boundedly rational human attackers, has been deployed in the port of Boston since April 2011 and has been in use at the port of New York since February 2012, and is headed for nationwide deployment; another application for deploying escort boats to protect ferries has been deployed by the US Coast Guard since April 2013; GUARDS is under evaluation for national deployment by the US Transportation Security Administration (TSA); and TRUSTS has been evaluated in field trials by the Los Angeles Sheriff’s Department (LASD) in the LA Metro system, and a nationwide deployment is now being evaluated at TSA. These
initial successes point the way to major future applications in a
wide range of security domains; with major research challenges
in scaling up our game-theoretic algorithms, in addressing human
adversaries’ bounded rationality and uncertainties in action execution and observation, as well as in multiagent learning.
This paper will provide an overview of the models and algorithms, key research challenges and a brief description of our successful deployments.
The goal of this paper is to (re)introduce a real-world challenge problem for researchers in multiagent systems and beyond, where our collective efforts may have a significant impact on activities in the real-world. The challenge is in applying game theory for security: our goal is to not only introduce the research challenges for algorithmic and behavioral
game theory in service of this problem, but also to provide
initial exemplars of successes of deployed systems, and to describe challenges introduced by these deployments of computational game theory in the field. We also wish to provide an overview
of key open research challenges and pointers to getting started in this research.
Among the many deployment areas of Stackelberg Security games, a
major area involves games played out in space and time, which includes applications in multiple mobile defender resources protecting multiple mobile targets.
Previous algorithms for such spatio-temporal security games fail to scale up, and
little is known of the computational complexity properties of these problems.
This paper provides a novel oracle-based algorithmic framework for a systematic
study of different problem variants of computing optimal (minimax) strategies
in spatio-temporal security games. Our framework enables efficient computation
of a minimax strategy when the problem admits a polynomial-time oracle. Furthermore, for the cases in which efficient oracles are difficult to find, we propose
approximations or prove hardness results.
This paper introduces a new game-theoretic framework and
algorithms for addressing opportunistic crime. The Stackelberg Security Game (SSG), which models highly strategic and resourceful adversaries, has become an important computational framework within
multiagent systems. Unfortunately, SSG is ill-suited as a framework for
handling opportunistic crimes, which are committed by criminals who
are less strategic in planning attacks and more flexible in executing them
than SSG assumes. Yet, opportunistic crime is what is commonly seen
in most urban settings. We therefore introduce the Opportunistic Security Game (OSG), a computational framework to recommend deployment strategies for defenders to control opportunistic crimes. Our first
contribution in OSG is a novel model for opportunistic adversaries, who
(i) opportunistically and repeatedly seek targets; (ii) react to real-time
information at execution time rather than planning attacks in advance;
and (iii) have limited observation of defender strategies. Our second
contribution to OSG is a new exact algorithm EOSG to optimize defender strategies given our opportunistic adversaries. Our third contribution is the development of a fast heuristic algorithm to solve largescale OSG problems, exploiting a compact representation.We use urban
transportation systems as a critical motivating domain, and provide detailed experimental results based on a real-world system.
Leandro Soriano Marcolino, Chao Zhang, Albert Xin Jiang, and Milind Tambe. 2014. “A Detailed Analysis of a Multi-agent Diverse Team.” In Coordination, Organizations, Institutions and Norms in Agent Systems IX. Springer-Verlag Lecture Notes in AI, edited by T. Balke, A. Chopra, F. Dignum, and B. van Riemsdijk.
In an open system we can have many different kinds of agents. However, it is a challenge to decide which agents to pick when forming multi-agent
teams. In some scenarios, agents coordinate by voting continuously. When forming such teams, should we focus on the diversity of the team or on the strength
of each member? Can a team of diverse (and weak) agents outperform a uniform
team of strong agents? We propose a new model to address these questions. Our
key contributions include: (i) we show that a diverse team can overcome a uniform team and we give the necessary conditions for it to happen; (ii) we present
optimal voting rules for a diverse team; (iii) we perform synthetic experiments
that demonstrate that both diversity and strength contribute to the performance of
a team; (iv) we show experiments that demonstrate the usefulness of our model
in one of the most difficult challenges for Artificial Intelligence: Computer Go.
We investigate the power of voting among diverse, randomized software agents.
With teams of computer Go agents in mind, we develop a novel theoretical model
of two-stage noisy voting that builds on recent work in machine learning. This
model allows us to reason about a collection of agents with different biases (determined by the first-stage noise models), which, furthermore, apply randomized
algorithms to evaluate alternatives and produce votes (captured by the second-stage noise models). We analytically demonstrate that a uniform team, consisting
of multiple instances of any single agent, must make a significant number of mistakes, whereas a diverse team converges to perfection as the number of agents
grows. Our experiments, which pit teams of computer Go agents against strong
agents, provide evidence for the effectiveness of voting when agents are diverse.
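The uniform-versus-diverse contrast at the heart of this result can be illustrated with a toy simulation: a uniform team's identical members share one noise draw (perfectly correlated errors), while a diverse team's independent errors cancel under majority vote. The team size and per-agent accuracy below are arbitrary illustrative choices, not parameters from the model:

```python
import random

def team_accuracy(n_agents, p_correct, correlated, trials=5000, seed=0):
    """Majority-vote accuracy over repeated binary decisions.

    correlated=True models a uniform team: identical copies of one agent,
    so every member casts the same (possibly wrong) vote and voting gains
    nothing. correlated=False models a diverse team with independent
    errors, which the majority vote can cancel.
    """
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        if correlated:
            votes = [rng.random() < p_correct] * n_agents
        else:
            votes = [rng.random() < p_correct for _ in range(n_agents)]
        if sum(votes) > n_agents / 2:
            correct += 1
    return correct / trials

uniform = team_accuracy(11, 0.6, correlated=True)    # stays near 0.6
diverse = team_accuracy(11, 0.6, correlated=False)   # well above 0.6
```

This mirrors the analytic result in spirit only: with fully correlated errors the team is no better than one agent, while independent errors drive majority accuracy up with team size (the Condorcet jury effect).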