2015
Chao Zhang, Arunesh Sinha, and Milind Tambe. 2015. “
Keeping pace with criminals: Designing patrol allocation against adaptive opportunistic criminals .” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2015).
AbstractPolice patrols are used ubiquitously to deter crimes in urban areas.
A distinctive feature of urban crimes is that criminals react opportunistically to patrol officers’ assignments. Compared to strategic
attackers (such as terrorists) with a well-laid out plan, opportunistic criminals are less strategic in planning attacks and more flexible
in executing them. In this paper, our goal is to recommend optimal police patrolling strategy against such opportunistic criminals.
We first build a game-theoretic model that captures the interaction
between officers and opportunistic criminals. However, while different models of adversary behavior have been proposed, their exact form remains uncertain. Rather than simply hypothesizing a
model as done in previous work, one key contribution of this paper is to learn the model from real-world criminal activity data.
To that end, we represent the criminal behavior and the interaction
with the patrol officers as parameters of a Dynamic Bayesian Network (DBN), enabling application of standard algorithms such as
EM to learn the parameters. Our second contribution is a sequence
of modifications to the DBN representation, that allows for a compact representation of the model resulting in better learning accuracy and increased speed of learning of the EM algorithm when
used for the modified DBN. These modifications use marginalization approaches and exploit the structure of this problem. Finally,
our third contribution is an iterative learning and planning mechanism that keeps updating the adversary model periodically. We
demonstrate the efficiency of our learning algorithm by applying
it to a real data set of criminal activity obtained from the police
department of University of Southern California (USC) situated in
Los Angeles, USA. We project a significant reduction in crime rate
using our planning strategy as opposed to the actual strategy deployed by the police department. We also demonstrate the improvement in crime prevention in simulations when we use our iterative
planning and learning mechanism compared to just learning once
and planing. This work was done in collaboration with the police
department of USC.
2015_13_teamcore_keep_pace_with_criminal.pdf Chao Zhang, Manish Jain, Ripple Goyal, Arunesh Sinha, and Milind Tambe. 2015. “
Keeping pace with criminals: Learning, Predicting and Planning against Crime: Demonstration Based on Real Urban Crime Data (Demonstration) .” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2015).
AbstractCrime in urban areas plagues every
city in all countries. This demonstration will show a novel approach for
learning and predicting crime patterns
and planning against such crimes using
real urban crime data. A notable characteristic of urban crime, distinct from
organized terrorist attacks, is that most
urban crimes are opportunistic in nature, i.e., criminals do not plan their attacks in detail, rather they seek opportunities for committing crime and are
agile in their execution of the crime [6,
7, 1, 4]. Police officers conduct patrols
with the aim of preventing crime. However, criminals can adapt their strategy in response of police deployment by
seeking crime opportunity in less effectively patrolled location. The problem
of where and how much to patrol is therefore important.
There are two approaches to solve this problem. The first
approach is to schedule patrols manually by human planners,
which is followed in various police departments. However,
it has been demonstrated that manual planning of patrols
is not only time-consuming but also highly ineffective in related scenarios of protecting airport terminals [3] and ships
in ports [5]. The second approach is to use automated planners to plan patrols against urban crime. This approach
has either focused on modeling the criminal explicitly [7,6] (rational, bounded rational, etc.) in a game model or
to learn the adversary behavior using machine learning [2].
However, the proposed mathematical models of criminal behavior have not been validated with real data. Also, prior
machine learning approaches have either only focused on the
adversary actions ignoring their adaptation to the defenders’
actions [2]. Hence, in this presentation we propose
a novel approach to
learn and update the
criminal behavior from
real data [8]. We
model the interaction
between criminals and
patrol officers as a Dynamic Bayesian Network (DBN). Figure 1
shows an example of such DBN. Next, we apply a dynamic
programming algorithm to generate optimal patrol strategy
against the learned criminal model. By iteratively updating
the criminals’ model and computing patrol strategy against
them, we help patrol officers keep up with criminals’ adaptive behavior and execute effective patrols. This process is
shown as a flow chart in Figure 2.
With this context, the demonstration presented in this
paper introduces a web-based software with two contributions. First, our system collects and analyzes crime reports
and resources (security camera, emergency supplies, etc.)
data, presenting them in various forms. Second, our patrol
scheduler incorporates the algorithm in [8] in a scheduling
recommendation system. The demonstration will engage audience members by having them participate as patrol officers
and using the software to ‘patrol’ the University of Southern
California (USC) campus in USA.
2015_15_teamcore_de005_zhang_learning_demo.pdf Debarun Kar, Fei Fang, Francesco Delle Fave, Nicole Sintov, Arunesh Sinha, Aram Galstyan, Bo An, and Milind Tambe. 2015. “
Learning Bounded Rationality Models of the Adversary in Repeated Stackelberg Security Games .” In Adaptive and Learning Agents Workshop at the International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2015).
AbstractSeveral competing human behavior models have been proposed
to model and protect against boundedly rational adversaries in repeated Stackelberg security games (SSGs). However, these existing
models fail to address three main issues which are extremely detrimental to defender performance. First, while they attempt to learn
adversary behavior models from adversaries’ past actions (“attacks
on targets”), they fail to take into account adversaries’ future adaptation based on successes or failures of these past actions. Second,
they assume that sufficient data in the initial rounds will lead to
a reliable model of the adversary. However, our analysis reveals
that the issue is not the amount of data, but that there just is not
enough of the attack surface exposed to the adversary to learn a
reliable model. Third, current leading approaches have failed to include probability weighting functions, even though it is well known
that human beings’ weighting of probability is typically nonlinear.
Moreover, the performances of these models may be critically dependent on the learning algorithm used to learn the parameters of
these models. The first contribution of this paper is a new human
behavior model, SHARP, which mitigates these three limitations as
follows: (i) SHARP reasons based on success or failure of the adversary’s past actions on exposed portions of the attack surface to
model adversary adaptiveness; (ii) SHARP reasons about similarity between exposed and unexposed areas of the attack surface, and
also incorporates a discounting parameter to mitigate adversary’s
lack of exposure to enough of the attack surface; and (iii) SHARP
integrates a non-linear probability weighting function to capture the
adversary’s true weighting of probability. Our second contribution
is a comparison of two different approaches for learning the parameters of the bounded rationality models.
Our third contribution is a first “longitudinal study” – at least in
the context of SSGs – of competing models in settings involving
repeated interaction between the attacker and the defender. This
study, where each experiment lasted a period of multiple weeks
with individual sets of human subjects, illustrates the strengths and
weaknesses of different models and shows the advantages of SHARP.
2015_27_teamcore_ijcai2015_gsg.pdf D. J. Gerber, E. Pantazis, L. S. Marcolino, and A. Heydarian. 2015. “
A Multi-Agent Systems for Design Simulation Framework: Experiments with Virtual-Physical-Social Feedback for Architecture .” In Symposium on Simulation for Architecture and Urban Design (SimAUD 2015).
AbstractThis paper presents research on the development of multiagent systems (MAS) for integrated and performance driven
architectural design. It presents the development of a
simulation framework that bridges architecture and
engineering, through a series of multi-agent based
experiments. The research is motivated to combine multiple
design agencies into a system for managing and optimizing
architectural form, across multiple objectives and contexts.
The research anticipates the incorporation of feedback from
real world human behavior and user preferences with physics
based structural form finding and environmental analysis
data. The framework is a multi-agent system that provides
design teams with informed design solutions, which
simultaneously optimize and satisfy competing design
objectives. The initial results for building structures are
measured in terms of the level of lighting improvements and
qualitatively in geometric terms. Critical to the research is
the elaboration of the system and the feedback loops that are
possible when using the multi-agent systems approach.
2015_18_teamcore_simaud2015.pdf Eric Shieh. 2015. “
Not a Lone Ranger: Unleashing Defender Teamwork in Security Games ”.
AbstractGame theory has become an important research area in handling complex security resource allocation and patrolling problems. Stackelberg Security Games (SSGs) have been used in modeling these types of problems via a defender and an attacker(s). Despite recent successful real-world deployments of SSGs, scale-up to handle defender teamwork remains a fundamental challenge in this field. The latest techniques do not scale-up to domains where multiple defenders must coordinate time-dependent joint activities. To address this challenge, my thesis presents algorithms for solving defender teamwork in SSGs in two phases. As a first step, I focus on domains without execution uncertainty, in modeling and solving SSGs that incorporate teamwork among defender resources via three novel features: (i) a column-generation approach that uses an ordered network of nodes (determined by solving the traveling salesman problem) to generate individual defender strategies; (ii) exploitation of iterative reward shaping of multiple coordinating defender units to generate coordinated strategies; (iii) generation of tighter upper-bounds for pruning by solving security games that only abide by key scheduling constraints. In the second stage of my thesis, I address execution uncertainty among defender resources that arises from the real world by integrating the powerful teamwork mechanisms offered by decentralized Markov Decision Problems (Dec-MDPs) into security games. My thesis offers the following novel contributions: (i) New model of security games with defender teams that coordinate under uncertainty; (ii) New algorithm based on column generation that utilizes Decentralized Markov Decision Processes (Dec-MDPs) to generate defender strategies that incorporate uncertainty; (iii) New techniques to handle global events (when one or more agents may leave the system) during defender execution; (iv) Heuristics that help scale up in the number of targets and resources to handle real-world scenarios; (v) Exploration of the robustness of randomized pure strategies. Different mechanisms, from both solving situations with and without execution uncertainty, may be used depending on the features of the domain. This thesis opens the door to a powerful combination of previous work in multiagent systems on teamwork and security games.
2015_19_teamcore_shieh_thesis_20150324.pdf L. S. Marcolino, H. Xu, A.X. Jiang, M. Tambe, and E. Bowring. 2015. “
The Power of Teams that Disagree: Team Formation in Large Action Spaces .” In Coordination, Organizations, Institutions and Norms in Agent Systems X. Springer-Verlag Lecture Notes in AI, 2015.
AbstractRecent work has shown that diverse teams can outperform a uniform
team made of copies of the best agent. However, there are fundamental questions
that were never asked before. When should we use diverse or uniform teams?
How does the performance change as the action space or the teams get larger?
Hence, we present a new model of diversity, where we prove that the performance
of a diverse team improves as the size of the action space increases. Moreover, we
show that the performance converges exponentially fast to the optimal one as we
increase the number of agents. We present synthetic experiments that give further
insights: even though a diverse team outperforms a uniform team when the size
of the action space increases, the uniform team will eventually again play better
than the diverse team for a large enough action space. We verify our predictions
in a system of Go playing agents, where a diverse team improves in performance
as the board size increases, and eventually overcomes a uniform team.1
2015_32_teamcore_coin2014book.pdf Amulya Yadav, Leandro Soriano Marcolino, Eric Rice, Robin Petering, Hailey Winetrobe, Harmony Rhoades, Milind Tambe, and Heather Carmichael. 2015. “
PSINET: Aiding HIV Prevention Amongst Homeless Youth by Planning Ahead .” AI Magazine.
AbstractHomeless youth are prone to Human Immunodeficiency
Virus (HIV) due to their engagement in high risk behavior such as unprotected sex, sex under influence of
drugs, etc. Many non-profit agencies conduct interventions to educate and train a select group of homeless
youth about HIV prevention and treatment practices and
rely on word-of-mouth spread of information through
their social network. Previous work in strategic selection of intervention participants does not handle uncertainties in the social network’s structure and evolving network state, potentially causing significant shortcomings in spread of information. Thus, we developed
PSINET, a decision support system to aid the agencies
in this task. PSINET includes the following key novelties: (i) it handles uncertainties in network structure
and evolving network state; (ii) it addresses these uncertainties by using POMDPs in influence maximization;
and (iii) it provides algorithmic advances to allow high
quality approximate solutions for such POMDPs. Simulations show that PSINET achieves ∼60% more information spread over the current state-of-the-art. PSINET
was developed in collaboration with My Friend’s Place
(a drop-in agency serving homeless youth in Los Angeles) and is currently being reviewed by their officials.
2015_25_teamcore_aimag_yadav.pdf Amulya Yadav, Leandro Marcolino, Eric Rice, Robin Petering, Hailey Winetrobe, Harmony Rhoades, Milind Tambe, and Heather Carmichael. 2015. “
PSINET - An Online POMDP Solver for HIV Prevention in Homeless Populations .” In In AAAI-15 Workshop on Planning, Search, and Optimization (PlanSOpt-15).
AbstractHomeless youth are prone to Human Immunodeficiency
Virus (HIV) due to their engagement in high risk behavior such as unprotected sex, sex under influence of
drugs, etc. Many non-profit agencies conduct interventions to educate and train a select group of homeless
youth about HIV prevention and treatment practices and
rely on word-of-mouth spread of information through
their social network. Previous work in strategic selection of intervention participants does not handle uncertainties in the social network’s structure and evolving network state, potentially causing significant shortcomings in spread of information. Thus, we developed
PSINET, a decision support system to aid the agencies
in this task. PSINET includes the following key novelties: (i) it handles uncertainties in network structure
and evolving network state; (ii) it addresses these uncertainties by using POMDPs in influence maximization;
and (iii) it provides algorithmic advances to allow high
quality approximate solutions for such POMDPs. Simulations show that PSINET achieves ∼60% more information spread over the current state-of-the-art. PSINET
was developed in collaboration with My Friend’s Place
(a drop-in agency serving homeless youth in Los Angeles) and is currently being reviewed by their officials.
2015_4_teamcore_yadav.pdf Arjun Tambe and Thanh Nguyen. 2015. “
Robust Resource Allocation in Security Games and Ensemble Modeling of Adversary Behavior .” In ACM Symposium on Applied Computing (ACM SAC 2015) Track on Intelligent Robotics and Multi-Agent Systems (IRMAS).
AbstractGame theoretic algorithms have been used to optimize the
allocation of security resources to improve the protection of critical
infrastructure against threats when limits on security resources
prevent full protection of all targets. Past approaches have assumed
adversaries will always behave to maximize their expected utility,
failing to address real-world adversaries who are not perfectly
rational. Instead, adversaries may be boundedly rational, i.e., they
generally act to increase their expected value but do not
consistently maximize it. A successful approach to addressing
bounded adversary rationality has been a robust approach that does
not explicitly model adversary behavior. However, these robust
algorithms implicitly rely on an efficiently computable weak model
of adversary behavior, which does not necessarily match adversary
behavior trends. We therefore propose a new robust algorithm that
provides a more refined model of adversary behavior that retains
the advantage of efficient computation. We also develop an
ensemble method used to tune the algorithm’s parameters, and
compare this method’s accuracy in predicting adversary behavior
to previous work. We test these contributions in security games
against human subjects to show the advantages of our approach.
2015_5_teamcore_arjun_paper.pdf Yundi Qian, William B. Haskell, and Milind Tambe. 2015. “
Robust Strategy against Unknown Risk-averse Attackers in Security Games .” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2015).
AbstractStackelberg security games (SSGs) are now established as a powerful tool in security domains. In this paper, we consider a new
dimension of security games: the risk preferences of the attacker.
Previous work assumes a risk-neutral attacker that maximizes his
expected reward. However, extensive studies show that the attackers in some domains are in fact risk-averse, e.g., terrorist groups
in counter-terrorism domains. The failure to incorporate the risk
aversion in SSG models may lead the defender to suffer significant
losses. Additionally, defenders are uncertain about the degree of
attacker’s risk aversion. Motivated by this challenge this paper provides the following five contributions: (i) we propose a novel model
for security games against risk-averse attackers with uncertainty in
the degree of their risk aversion; (ii) we develop an intuitive MIBLP formulation based on previous security games research, but
find that it finds locally optimal solutions and is unable to scale up;
(iii) based on insights from our MIBLP formulation, we develop
our scalable BeRRA algorithm that finds globally ǫ-optimal solutions; (iv) our BeRRA algorithm can also be extended to handle
other risk-aware attackers, e.g., risk-seeking attackers; (v) we show
that we do not need to consider attacker’s risk attitude in zero-sum
games.
2015_12_teamcore_yundi_aamas2015.pdf Haifeng Xu, Albert X. Jiang, Arunesh Sinha, Zinovi Rabinovich, Shaddin Dughmi, and Milind Tambe. 2015. “
Security Games with Information Leakage: Modeling and Computation .” In International Joint Conference on Artificial Intelligence (IJCAI 2015).
AbstractMost models of Stackelberg security games assume that the attacker only knows the defender’s mixed strategy, but is not able to observe (even partially) the instantiated pure strategy. Such partial observation of the deployed pure strategy – an issue we refer to as information
leakage – is a significant concern in practical applications. While previous research on patrolling
games has considered the attacker’s real-time surveillance, our settings, therefore models and
techniques, are fundamentally different. More specifically, after describing the information leakage model, we start with an LP formulation to compute the defender’s optimal strategy in the
presence of leakage. Perhaps surprisingly, we show that a key subproblem to solve this LP
(more precisely, the defender oracle) is NP-hard even for the simplest of security game models.
We then approach the problem from three possible directions: efficient algorithms for restricted
cases, approximation algorithms, and heuristic algorithms for sampling that improves upon the
status quo. Our experiments confirm the necessity of handling information leakage and the
advantage of our algorithms.
2015_22_teamcore_infor_leakage.pdf L. S. Marcolino and M. Tambe. 2015. “
Three Fundamental Pillars of Multi-agent Team Formation (Doctoral Consortium) .” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2015).
AbstractTeams of voting agents are a powerful tool for solving complex problems. When forming such teams, there are three
fundamental issues that must be addressed: (i) Selecting
which agents should form a team; (ii) Aggregating the opinions of the agents; (iii) Assessing the performance of a team.
In this thesis we address all these points.
2015_17_teamcore_aamas15dc.pdf L. S. Marcolino and M. Tambe. 2015. “
Unleashing the Power of Multi-agent Voting Teams (Doctoral Consortium) .” In International Joint Conference on Artificial Intelligence (IJCAI 2015).
AbstractTeams of voting agents have great potential in finding optimal solutions. However, there are fundamental challenges to effectively use such teams: (i)
selecting agents; (ii) aggregating opinions; (iii) assessing performance. I address all these challenges,
with theoretical and experimental contributions.
2015_26_teamcore_ijcai15dc.pdf D. Kar, F. Fang, F. Delle Fave, N. Sintov, M. Tambe, and A. Van Wissen. 2015. “
Effectiveness of Probability Perception Modeling and Defender Strategy Generation Algorithms in Repeated Stackelberg Games: An Initial Report.” In Computational Sustainability Workshop at AAAI15, Texas, Austin.
AbstractWhile human behavior models based on repeated Stackelberg games have been proposed for domains such as “wildlife crime” where there is repeated interaction between the defender and the adversary, there has been no empirical study with human subjects to show the effectiveness of such models. This paper presents an initial study based on extensive human subject experiments with participants on Amazon Mechanical Turk (AMT). Our findings include: (i) attackers may view the defender’s coverage probability in a non-linear fashion; specifically it follows an S-shaped curve, and (ii) there are significant losses in defender utility when strategies generated by existing models are deployed in repeated Stackelberg game settings against human subjects.
2015_7_teamcore_debarun_kar_aaai15_cs_workshop.pdf Debarun Kar, Fei Fang, Francesco Delle Fave, Nicole Sintov, and Milind Tambe. 2015. “
A Game of Thrones: When Human Behavior Models Compete in Repeated Stackelberg Security Games.” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2015).
AbstractSeveral competing human behavior models have been proposed to model and protect against boundedly rational adversaries in repeated Stackelberg security games (SSGs). However, these existing models fail to address three main issues which are extremely detrimental to defender performance. First, while they attempt to learn adversary behavior models from adversaries’ past actions (“attacks on targets”), they fail to take into account adversaries’ future adaptation based on successes or failures of these past actions. Second, they assume that sufficient data in the initial rounds will lead to a reliable model of the adversary. However, our analysis reveals that the issue is not the amount of data, but that there just is not enough of the attack surface exposed to the adversary to learn a reliable model. Third, current leading approaches have failed to include probability weighting functions, even though it is well known that human beings’ weighting of probability is typically nonlinear. The first contribution of this paper is a new human behavior model, SHARP, which mitigates these three limitations as follows: (i) SHARP reasons based on success or failure of the adversary’s past actions on exposed portions of the attack surface to model adversary adaptiveness; (ii) SHARP reasons about similarity between exposed and unexposed areas of the attack surface, and also incorporates a discounting parameter to mitigate adversary’s lack of exposure to enough of the attack surface; and (iii) SHARP integrates a non-linear probability weighting function to capture the adversary’s true weighting of probability. Our second contribution is a first “longitudinal study” – at least in the context of SSGs – of competing models in settings involving repeated interaction between the attacker and the defender. This study, where each experiment lasted a period of multiple weeks with individual sets of human subjects, illustrates the strengths and weaknesses of different models and shows the advantages of SHARP.
2015_11_teamcore_aamas15_fp85_crc.pdf Thanh H. Nguyen, Francesco M. Delle Fave, Debarun Kar, Aravind S. Lakshminarayanan, Amulya Yadav, Milind Tambe, Noa Agmon, Andrew J. Plumptre, Margaret Driciru, Fred Wanyama, and Aggrey Rwetsiba. 2015. “
Making the most of Our Regrets: Regret-based Solutions to Handle Payoff Uncertainty and Elicitation in Green Security Games.” In Conference on Decision and Game Theory for Security.
AbstractRecent research on Green Security Games (GSG), i.e., security games for the protection of wildlife, forest and fisheries, relies on the promise of an abundance of available data in these domains to learn adversary behavioral models and determine game payoffs. This research suggests that adversary behavior models (capturing bounded rationality) can be learned from real-world data on where adversaries have attacked, and that game payoffs can be determined precisely from data on animal densities. However, previous work has, as yet, failed to demonstrate the usefulness of these behavioral models in capturing adversary behaviors based on real-world data in GSGs. Previous work has also been unable to address situations where available data is insufficient to accurately estimate behavioral models or to obtain the required precision in the payoff values. In addressing these limitations, as our first contribution, this paper, for the first time, provides validation of the aforementioned adversary behavioral models based on real-world data from a wildlife park in Uganda. Our second contribution addresses situations where real-world data is not precise enough to determine exact payoffs in GSG, by providing the first algorithm to handle payoff uncertainty in the presence of adversary behavioral models. This algorithm is based on the notion of minimax regret. Furthermore, in scenarios where the data is not even sufficient to learn adversary behaviors, our third contribution is to provide a novel algorithm to address payoff uncertainty assuming a perfectly rational attacker (instead of relying on a behavioral model); this algorithm allows for a significant scaleup for large security games. Finally, to reduce the problems due to paucity of data, given mobile sensors such as Unmanned Aerial Vehicles (UAV), we introduce new payoff elicitation strategies to strategically reduce uncertainty.
2015_35_teamcore_gamesec2015_arrow.pdf Amulya Yadav, Leandro Marcolino, Eric Rice, Robin Petering, Hailey Winetrobe, Harmony Rhoades, Milind Tambe, and Heather Carmichael. 2015. “
Preventing HIV Spread in Homeless Populations Using PSINET.” In Conference on Innovative Applications of Artificial Intelligence (IAAI-15).
AbstractHomeless youth are prone to HIV due to their engagement in high risk behavior. Many agencies conduct interventions to educate/train a select group of homeless youth about HIV prevention practices and rely on word-of-mouth spread of information through their social network. Previous work in strategic selection of intervention participants does not handle uncertainties in the social network’s structure and in the evolving network state, potentially causing significant shortcomings in spread of information. Thus, we developed PSINET, a decision support system to aid the agencies in this task. PSINET includes the following key novelties: (i) it handles uncertainties in network structure and evolving network state; (ii) it addresses these uncertainties by using POMDPs in influence maximization; (iii) it provides algorithmic advances to allow high quality approximate solutions for such POMDPs. Simulations show that PSINET achieves ∼60% more information spread over the current state-of-the-art. PSINET was developed in collaboration with My Friend’s Place (a drop-in agency serving homeless youth in Los Angeles) and is currently being reviewed by their officials.
2015_6_teamcore_iaai15.pdf Fei Fang, Peter Stone, and Milind Tambe. 2015. “
When Security Games Go Green: Designing Defender Strategies to Prevent Poaching and Illegal Fishing.” In International Joint Conference on Artificial Intelligence (IJCAI).
AbstractBuilding on the successful applications of Stackelberg Security Games (SSGs) to protect infrastructure, researchers have begun focusing on applying game theory to green security domains such as protection of endangered animals and fish stocks. Previous efforts in these domains optimize defender strategies based on the standard Stackelberg assumption that the adversaries become fully aware of the defender’s strategy before taking action. Unfortunately, this assumption is inappropriate since adversaries in green security domains often lack the resources to fully track the defender strategy. This paper (i) introduces Green Security Games (GSGs), a novel game model for green security domains with a generalized Stackelberg assumption; (ii) provides algorithms to plan effective sequential defender strategies — such planning was absent in previous work; (iii) proposes a novel approach to learn adversary models that further improves defender performance; and (iv) provides detailed experimental analysis of proposed approaches.
2015_21_teamcore_ijcai2015_gsg_cameraready_withappendix.pdf 2014
Y. Vorobeychik, B. An, M. Tambe, and S. Singh. 6/2014. “
Computing Solutions in Infinite-Horizon Discounted Adversarial Patrolling Game .” In International Conference on Automated Planning and Scheduling (ICAPS).
AbstractStackelberg games form the core of a number of tools deployed for computing optimal patrolling strategies in adversarial domains, such as the US Federal Air Marshall Service
and the US Coast Guard. In traditional Stackelberg security
game models the attacker knows only the probability that
each target is covered by the defender, but is oblivious to the
detailed timing of the coverage schedule. In many real-world
situations, however, the attacker can observe the current location of the defender and can exploit this knowledge to reason
about the defender’s future moves. We show that this general modeling framework can be captured using adversarial
patrolling games (APGs) in which the defender sequentially
moves between targets, with moves constrained by a graph,
while the attacker can observe the defender’s current location
and his (stochastic) policy concerning future moves. We offer a very general model of infinite-horizon discounted adversarial patrolling games. Our first contribution is to show that
defender policies that condition only on the previous defense
move (i.e., Markov stationary policies) can be arbitrarily suboptimal for general APGs. We then offer a mixed-integer nonlinear programming (MINLP) formulation for computing optimal randomized policies for the defender that can condition on history of bounded, but arbitrary, length, as well as
a mixed-integer linear programming (MILP) formulation to
approximate these, with provable quality guarantees. Additionally, we present a non-linear programming (NLP) formulation for solving zero-sum APGs. We show experimentally
that MILP significantly outperforms the MINLP formulation,
and is, in turn, significantly outperformed by the NLP specialized to zero-sum games.
2014_12_teamcore_apg_icaps_1.pdf Benjamin Ford, Debarun Kar, Francesco M. Delle Fave, Rong Yang, and Milind Tambe. 5/2014. “
PAWS: Adaptive Game-theoretic Patrolling for Wildlife Protection (Demonstration).” In Conference on Autonomous Agents and Multiagent Systems (AAMAS).
AbstractEndangered species around the world are in danger of extinction from poaching. From the start of the 20th century, the African rhino population has dropped over 98% [4] and the global tiger population has dropped over 95% [5], resulting in multiple species extinctions in both groups. Species extinctions have negative consequences on local ecosystems, economies, and communities. To protect these species, countries have set up conservation agencies and national parks, such as Uganda’s Queen Elizabeth National Park (QENP). However, a common lack of funding for these agencies results in a lack of law enforcement resources to protect these large, rural areas. As an example of the scale of disparity, one wildlife crime study in 2007 reported an actual coverage density of one ranger per 167 square kilometers [2]. Because of the hazards involved (e.g., armed poachers, wild animals), rangers patrol in groups, further increasing the amount of area they are responsible for patrolling. Security game research has typically been concerned with combating terrorism, and this field has indeed benefited from a range of successfully deployed applications [1, 6]. These applications have enabled security agencies to make more efficient use of their limited resources. In this previous research, adversary data has been absent during the development of these solutions, and thus, it has been difficult to make accurate adversary behavior models during algorithm development. In a domain such as wildlife crime, interactions with the adversary are frequent and repeated, thus enabling conservation agencies to collect data. This presence of data enables security game researchers to begin developing algorithms that incorporate this data into, potentially, more accurate behavior models and consequently better security solutions. Developed in conjunction with staff at QENP, the Protection Assistant for Wildlife Security (PAWS) generates optimized defender strategies for use by park rangers [7]. Due to the repeated nature of wildlife crime, PAWS is able to leverage crime event data - a previously unrealized capability in security games research. Thus, PAWS implements a novel adaptive algorithm that processes crime event data, builds multiple human behavior models, and, based on those models, predicts where adversaries will attack next. These predictions are then used to generate a patrol strategy for the rangers (i.e., a set of patrol waypoints) that can be viewed on a GPS unit. Against this background, the demonstration presented in this paper introduces two contributions. First, we present the PAWS system which incorporates the algorithm in [7] into a scheduling system and a GPS visualizer. Second, we present a software interface to run a number of human subject experiments (HSE) to evaluate and improve the efficacy of PAWS before its deployment in QENP. By conducting these HSEs, we can (i) test the PAWS algorithms with repeated interactions with humans, thus providing a more realistic testing environment than in its previous simulations; (ii) generate data that can be used to initialize PAWS’s human behavior models for deployment, and (iii) compare the current PAWS algorithms’ performance to alternatives and determine if additional improvements are needed prior to deployment. To provide proper context for the presentation, this paper also presents a brief overview of the PAWS system data flow and its adaptive algorithms. The demonstration will engage audience members by having them participate in the HSEs and using the GPS unit to visualize a patrol schedule in QENP.
2014_13_teamcore_de12_ford.pdf