Despite significant research in Security Games, limited efforts have been made to handle game domains with continuous space. Addressing such limitations, in this paper we propose: (i) a continuous space security game model that considers infinite-size action spaces for players; (ii) OptGradFP, a novel and general algorithm that searches for the optimal defender strategy in a parametrized search space; (iii) OptGradFP-NN, a convolutional neural network based implementation of OptGradFP for continuous space security games; (iv) experiments and analysis with OptGradFP-NN. This is the first time that neural networks have been used for security games, and it shows the promise of applying deep learning to complex security games which previous approaches fail to handle.
This paper focuses on a topic that is insufficiently addressed in the literature, i.e., challenges faced in transitioning agents from an emerging phase in the lab, to a deployed application in the field. Specifically, we focus on challenges faced in transitioning HEALER and DOSIM, two agents for social influence maximization, which assist service providers in maximizing HIV awareness in real-world homeless-youth social networks. These agents recommend key "seed" nodes in social networks, i.e., homeless youth who would maximize HIV awareness in their real-world social network. While prior research on these agents published promising simulation results from the lab, this paper illustrates that transitioning these agents from the lab into the real-world is not straightforward, and outlines three major lessons. First, it is important to conduct real-world pilot tests; indeed, due to the health-critical nature of the domain and complex influence spread models used by these agents, it is important to conduct field tests to ensure the real-world usability and effectiveness of these agents. We present results from three real-world pilot studies, involving 173 homeless youth in an American city. These are the first such pilot studies which provide head-to-head comparison of different agents for social influence maximization, including a comparison with a baseline approach. Second, we present analyses of these real-world results, illustrating the strengths and weaknesses of different influence maximization approaches we compare. Third, we present research and deployment challenges revealed in conducting these pilot tests, and propose solutions to address them. These challenges and proposed solutions are instructive in assisting the transition of agents focused on social influence maximization from the emerging to the deployed application phase.
This paper presents HEALER, a software agent that recommends sequential intervention plans for use by homeless shelters, who organize these interventions to raise awareness about HIV among homeless youth. HEALER’s sequential plans (built using knowledge of social networks of homeless youth) choose intervention participants strategically to maximize influence spread, while reasoning about uncertainties in the network. While previous work presents influence maximizing techniques to choose intervention participants, they do not address two real-world issues: (i) they completely fail to scale up to real-world sizes; and (ii) they do not handle deviations in execution of intervention plans. HEALER handles these issues via two major contributions: (i) HEALER casts this influence maximization problem as a POMDP and solves it using a novel planner which scales up to previously unsolvable real-world sizes; and (ii) HEALER allows shelter officials to modify its recommendations, and updates its future plans in a deviation-tolerant manner. HEALER was deployed in the real world in Spring 2016 with considerable success.
Poaching is considered a major driver for the population drop of key species such as tigers, elephants, and rhinos, which can be detrimental to whole ecosystems. While conducting foot patrols is the most commonly used approach in many countries to prevent poaching, such patrols often do not make the best use of the limited patrolling resources. This paper presents PAWS, a game-theoretic application deployed in Southeast Asia for optimizing foot patrols to combat poaching. In this paper, we report on the significant evolution of PAWS from a proposed decision aid introduced in 2014 to a regularly deployed application. We outline key technical advances that led to PAWS’s regular deployment: (i) incorporating complex topographic features, e.g., ridgelines, in generating patrol routes; (ii) handling uncertainties in species distribution (game theoretic payoffs); (iii) ensuring scalability for patrolling large-scale conservation areas with fine-grained guidance; and (iv) handling complex patrol scheduling constraints.
Conservation agencies worldwide must make the most efficient use of their limited resources to protect natural resources from over-harvesting and animals from poaching. Predictive modeling, a tool to increase efficiency, is seeing increased usage in conservation domains such as to protect wildlife from poaching. Many works in this wildlife protection domain, however, fail to train their models on real-world data or test their models in the real world. My thesis proposes novel poacher behavior models that are trained on real-world data and are tested via first-of-their-kind tests in the real world. First, I proposed a paradigm shift in traditional adversary behavior modeling techniques from Quantal Response-based models to decision tree-based models. Based on this shift, I proposed an ensemble of spatially-aware decision trees, INTERCEPT, that outperformed the prior state-of-the-art and then also presented results from a one-month pilot field test of the ensemble’s predictions in Uganda’s Queen Elizabeth Protected Area (QEPA). This field test represented the first time that a machine learning-based poacher behavior modeling application was tested in the field. Second, I proposed a hybrid spatio-temporal model that led to further performance improvements. To validate this model, I designed and conducted a large-scale, eight-month field test of this model’s predictions in QEPA. This field test, where rangers patrolled over 450 km in the largest and longest field test of a machine learning-based poacher behavior model to date in this domain, successfully demonstrated the selectiveness of the model’s predictions; the model successfully predicted, with statistical significance, where rangers would find more snaring activity and also where rangers would not find as much snaring activity. I also conducted detailed analysis of the behavior of my predictive model.
Third, beyond wildlife poaching, I also provided novel graph-aware models for modeling human adversary behavior in wildlife or other contraband smuggling networks and tested them against human subjects. Lastly, I examined human considerations of deployment in new domains and the importance of easily-interpretable models and results. While such interpretability has been a recurring theme in all my thesis work, I also created a game-theoretic inspection strategy application that generated randomized factory inspection schedules and also contained visualization and explanation components for users.
Worldwide, conservation agencies employ rangers to protect conservation areas from poachers. However, agencies lack the manpower to have rangers effectively patrol these vast areas frequently. While past work has modeled poachers’ behavior so as to aid rangers in planning future patrols, those models’ predictions were not validated by extensive field tests. In this paper, we present a hybrid spatio-temporal model that predicts poaching threat levels and results from a five-month field test of our model in Uganda’s Queen Elizabeth Protected Area (QEPA). To our knowledge, this is the first time that a predictive model has been evaluated through such an extensive field test in this domain. We present two major contributions. First, our hybrid model consists of two components: (i) an ensemble model which can work with the limited data common to this domain and (ii) a spatio-temporal model to boost the ensemble’s predictions when sufficient data are available. When evaluated on real-world historical data from QEPA, our hybrid model achieves significantly better performance than previous approaches with either temporally-aware dynamic Bayesian networks or an ensemble of spatially-aware models. Second, in collaboration with the Wildlife Conservation Society and Uganda Wildlife Authority, we present results from a five-month controlled experiment where rangers patrolled over 450 sq km across QEPA. We demonstrate that our model successfully predicted (1) where snaring activity would occur and (2) where it would not occur; in areas where we predicted a high rate of snaring activity, rangers found more snares and snared animals than in areas of lower predicted activity. These findings demonstrate that (1) our model’s predictions are selective, (2) our model’s superior laboratory performance extends to the real world, and (3) these predictive models can aid rangers in focusing their efforts to prevent wildlife poaching and save animals.
This paper focuses on new challenges in influence maximization inspired by non-profits’ use of social networks to effect behavioral change in their target populations. Influence maximization is a multiagent problem where the challenge is to select the most influential agents from a population connected by a social network. Specifically, our work is motivated by the problem of spreading messages about HIV prevention among homeless youth using their social network. We show how to compute solutions which are provably close to optimal when the parameters of the influence process are unknown. We then extend our algorithm to a dynamic setting where information about the network is revealed at each stage. Simulation experiments using real world networks collected by the homeless shelter show the advantages of our approach.
Advances in computational game theory have led to several successfully deployed applications in security domains. These game-theoretic approaches and security applications learn game payoff values or adversary behaviors from annotated input data provided by domain experts and practitioners in the field, or collected through experiments with human subjects. Beyond these traditional methods, unmanned aerial vehicles (UAVs) have become an important surveillance tool used in security domains to collect the required annotated data. However, collecting annotated data from videos taken by UAVs efficiently, and using these data to build datasets that can be used for learning payoffs or adversary behaviors in game-theoretic approaches and security applications, is an under-explored research question. This paper presents VIOLA, a novel labeling application that includes (i) a workload distribution framework to efficiently gather human labels from videos in a secured manner; (ii) a software interface with features designed for labeling videos taken by UAVs in the domain of wildlife security. We also present the evolution of VIOLA and analyze how the changes made in the development process relate to the efficiency of labeling, including when seemingly obvious improvements surprisingly did not lead to increased efficiency. VIOLA enables collecting massive amounts of data with detailed information from challenging security videos such as those collected aboard UAVs for wildlife security. VIOLA will lead to the development of a new generation of game-theoretic approaches for security domains, including approaches that integrate deep learning and game theory for real-time detection and response.
Whereas previous real-world game-theoretic applications in security focused on protection of critical infrastructure in the absence of past attack data, more recent work has focused on data-driven security and sustainability applications for protecting the environment, including forests, fish and wildlife. One key challenge in such “Green Security Game” (GSG) domains is to model the adversary’s decision making process based on available attack data. This thesis, for the first time, explores the suitability of different adversary behavior modeling approaches in such domains that differ in the type and amount of historical data available. The first contribution is to provide a detailed comparative study, based on actual human subject experiments, of competing adversary behavior models in domains where attack data is available in plenty (e.g., via a large number of sensors). This thesis demonstrates a new human behavior model, SHARP, which mitigates the limitations of previous models in three key ways. First, SHARP reasons based on successes or failures of the adversary’s past actions to model adversary adaptivity. Second, SHARP reasons about similarity between exposed and unexposed areas of the attack surface to handle the adversary’s lack of exposure to enough of the attack surface. Finally, SHARP integrates a non-linear probability weighting function to capture the adversary’s true weighting of probabilities. The second contribution relates to domains requiring predictions over a large set of targets by learning from limited (and in some cases, noisy) data. One example dataset on which we demonstrate our approaches to handle such challenges is a real-world poaching dataset collected over a large geographical area at the Queen Elizabeth National Park in Uganda. This data is too sparse to construct a detailed model.
The second contribution of this thesis delivers a surprising result by presenting an adversary behavior modeling system, INTERCEPT, which is based on an ensemble of decision trees (i) that effectively learns and predicts poacher attacks based on limited noisy attack data over a large set of targets, and (ii) has fast execution speed. This has led to a successful month-long test of INTERCEPT in the field, a first for adversary behavior modeling applications in the wildlife conservation domain. Finally, for my third contribution, we examine one common assumption in adversary behavior modeling that the adversary perfectly observes the defender’s randomized protection strategy. However, in domains such as wildlife conservation, the adversary only observes a limited sequence of defender patrols and forms beliefs about the defender’s strategy. In the absence of a comparative analysis and a principled study of the strengths and weaknesses of belief models, no informed decision could be made to incorporate belief models in adversary behavior models such as SHARP and INTERCEPT. This thesis provides the first-of-its-kind systematic comparison of existing and new proposed belief models and demonstrates based on human subjects experiments data that identifying heterogeneous belief update behavior is essential in making effective predictions. We also propose and evaluate customized models for settings that differ in the type of belief data available and quantify the value of having such historical data on the accuracy of belief prediction.
Stackelberg Security Games (SSG) have been
widely applied for solving real-world security problems —
with a significant research emphasis on modeling attackers’
behaviors to handle their bounded rationality. However, access
to real-world data (used for learning an accurate behavioral
model) is often limited, leading to uncertainty in attacker’s
behaviors while modeling. This paper therefore focuses on
addressing behavioral uncertainty in SSG with the following
main contributions: 1) we present a new uncertainty game
model that integrates uncertainty intervals into a behavioral
model to capture behavioral uncertainty; and 2) based on
this game model, we propose a novel robust algorithm that
approximately computes the defender’s optimal strategy in the
worst-case scenario of uncertainty. We show that our algorithm
guarantees an additive bound on its solution quality.
Security is an important concern worldwide. Stackelberg
Security Games have been used successfully in a variety
of security applications, to optimally schedule limited
defense resources by modeling the interaction between
attackers and defenders. Prior research has suggested
that it is possible to classify adversary behavior into
distinct groups of adversaries based on the ways humans
explore their decision alternatives. However, despite the
widespread use of Stackelberg Security Games, there has
been little research on how adversaries adapt to defense
strategies over time (i.e., dynamics of behavior). In this
paper, we advance this work by showing how
adversaries’ behavior changes as they learn the
defenders’ behavior over time. Furthermore, we show
how behavioral game theory models can be modified to
capture learning dynamics using a Bayesian Updating
modeling approach. These models perform similarly to a
cognitive model known as Instance-Based-Learning to
predict learning patterns.
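As a rough illustration of what Bayesian updating of an adversary's beliefs could look like, the sketch below keeps a Beta belief over how often a single target is covered and updates it after each observed round. The Beta-Bernoulli form, the class name `CoverageBelief`, and the uniform prior are all assumptions made for illustration; the abstract does not specify the model actually used.

```python
from dataclasses import dataclass


@dataclass
class CoverageBelief:
    """Hypothetical Beta belief about how often one target is covered."""

    covered_count: float = 1.0    # prior pseudo-count of "covered" observations
    uncovered_count: float = 1.0  # prior pseudo-count of "uncovered" observations

    def update(self, covered: bool) -> None:
        # Bayesian update for a Bernoulli observation with a Beta prior:
        # add one to the count that matches what was observed.
        if covered:
            self.covered_count += 1.0
        else:
            self.uncovered_count += 1.0

    def mean(self) -> float:
        # Posterior mean estimate of the coverage probability.
        return self.covered_count / (self.covered_count + self.uncovered_count)


# Illustrative observation sequence (made up): covered in 3 of 4 rounds.
belief = CoverageBelief()
for observation in [True, True, False, True]:
    belief.update(observation)
```

After these four observations, `belief.mean()` moves from the prior value of 0.5 toward the observed coverage frequency, which is the kind of learning dynamic the abstract refers to.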
Security is a global concern. Real-world security problems range from domains such as the protection of ports, airports, and transportation from terrorists to protecting forests, wildlife, and
fisheries from smugglers, poachers, and illegal fishermen. A key challenge in solving these security problems is that security resources are limited; not all targets can be protected all the time.
Therefore, security resources must be deployed intelligently, taking into account the responses
of adversaries and potential uncertainties over their types, priorities, and knowledge. Stackelberg
Security Games (SSG) have drawn a significant amount of interest from security agencies by
capturing the strategic interaction between security agencies and human adversaries. SSG-based
decision aids are in widespread use (both nationally and internationally) for the protection of
assets such as major ports in the US, airport terminals, and wildlife and fisheries.
My research focuses on addressing uncertainties in SSGs — one recognized area of weakness.
My thesis provides innovative techniques and significant advances in addressing these uncertainties in SSGs. First, in many security problems, human adversaries are known to be boundedly
rational, and often choose targets with non-highest expected value to attack. I introduce novel
behavioral models of adversaries which significantly advance the state-of-the-art in capturing the
adversaries’ decision making. More specifically, my new model for predicting poachers’ behavior in wildlife protection is the first game-theoretic model which takes into account key domain
challenges including imperfect poaching data and complex temporal dependencies in poachers’
behavior. The superiority of my new models over the existing ones is demonstrated via extensive experiments based on the biggest real-world poaching dataset, collected in a national park in
Uganda over 12 years. Second, my research also focuses on developing new robust algorithms
which address uncertainties in real-world security problems. I present the first unified maximin-based robust algorithm (a single algorithm) to handle all different types of uncertainties
explored in SSGs. Furthermore, I propose a less conservative decision criterion, minimax regret, for generating new candidate defensive strategies that handle uncertainties in SSGs. In fact, minimax regret and maximin can be used in different security situations which may demand different
robust criteria. I then present novel robust algorithms to compute minimax regret for addressing
A contribution of particular significance is that my work is deployed in the real world; I have
deployed my robust algorithms and behavioral models in the PAWS system, which is currently
being used by NGOs (Panthera and Rimba) in a conservation area in Malaysia.
Several competing human behavior models have been proposed to model boundedly
rational adversaries in repeated Stackelberg Security Games (SSG). However, these existing models fail to address three main issues which are detrimental to defender
performance. First, while they attempt to learn adversary behavior models from adversaries’ past actions (“attacks on targets”), they fail to take into account adversaries’
future adaptation based on successes or failures of these past actions. Second, existing
algorithms fail to learn a reliable model of the adversary unless there exists sufficient
data collected by exposing enough of the attack surface — a situation that often arises
in initial rounds of the repeated SSG. Third, current leading models have failed to include probability weighting functions, even though it is well known that human beings’
weighting of probability is typically nonlinear.
To address these limitations of existing models, this article provides three main
contributions. Our first contribution is a new human behavior model, SHARP, which
mitigates these three limitations as follows: (i) SHARP reasons based on success or
failure of the adversary’s past actions on exposed portions of the attack surface to
model adversary adaptivity; (ii) SHARP reasons about similarity between exposed and
unexposed areas of the attack surface, and also incorporates a discounting parameter to
mitigate adversary’s lack of exposure to enough of the attack surface; and (iii) SHARP
integrates a non-linear probability weighting function to capture the adversary’s true
weighting of probability. Our second contribution is a first “repeated measures study”
– at least in the context of SSGs – of competing human behavior models. This study,
where each experiment lasted a period of multiple weeks with individual sets of human subjects on the Amazon Mechanical Turk platform, illustrates the strengths and
weaknesses of different models and shows the advantages of SHARP. Our third major
contribution is to demonstrate SHARP’s superiority by conducting real-world human
subjects experiments at the Bukit Barisan Selatan National Park in Indonesia against
wildlife security experts.
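To make the idea of a non-linear probability weighting function concrete, the sketch below implements one commonly used two-parameter form from the decision-making literature, f(p) = δp^γ / (δp^γ + (1 − p)^γ). Whether SHARP uses exactly this form, and the parameter values shown, are assumptions for illustration only; the function name `weight_probability` is hypothetical.

```python
def weight_probability(p: float, delta: float = 1.0, gamma: float = 1.0) -> float:
    """Map an actual probability p to a perceived (weighted) probability.

    delta scales the overall elevation of the curve and gamma controls its
    curvature; delta = gamma = 1 gives the identity (no distortion).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if delta <= 0.0 or gamma <= 0.0:
        raise ValueError("delta and gamma must be positive")
    numerator = delta * p ** gamma
    return numerator / (numerator + (1.0 - p) ** gamma)


# Illustrative parameter values (assumed): with gamma < 1, a small coverage
# probability is perceived as larger than it really is.
perceived_low = weight_probability(0.1, delta=1.0, gamma=0.5)
```

With `gamma` below 1 the curve overweights small probabilities and underweights large ones; values of `gamma` above 1 reverse that pattern.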
Real-world deployed applications of Stackelberg Security
Games (Shieh et al. 2012; Basilico, Gatti, and Amigoni
2009; Letchford and Vorobeychik 2011) have led to significant research emphasis on modeling the attacker’s bounded
rationality (Yang et al. 2011; Nguyen et al. 2013). One key
assumption in behavioral modeling is the availability of a
significant amount of data to obtain an accurate prediction.
However, in real-world security domains such as wildlife
protection, this assumption may be inapplicable due to the
limited access to real-world data (Lemieux 2014), leading
to uncertainty in the attacker’s behaviors — a key research
challenge of security problems.
Recent research has focused on addressing uncertainty in
behavioral modeling, following two different approaches: 1)
one approach assumes a known distribution of multiple attacker types, each following a certain behavioral model, and
attempts to solve the resulting Bayesian games (Yang et
al. 2014); and 2) another considers the existence of multiple attacker types whose behavioral models are perfectly
known, but without a known distribution over the types. It
then only considers the worst attacker type for the defender
(Brown, Haskell, and Tambe 2014). These two approaches
have several limitations. First, both still require a sufficient
amount of data to precisely estimate either the distribution
over attacker types (the former approach) or the model parameters for each individual type (the latter approach). Second, solving the resulting Bayesian games in the former case
is computationally expensive. Third, the latter approach
tends to be overly conservative as it only focuses on the
worst-case attacker type.
This paper remedies these shortcomings of state-of-the-art approaches when addressing behavioral uncertainty in
SSG by providing three key contributions. First, we present
a new game model with uncertainty in which we consider a
single behavioral model to capture decision making of the
whole attacker population (instead of multiple behavioral
models); uncertainty intervals are integrated with the chosen
model to capture behavioral uncertainty. The idea of uncertainty interval is commonly used in literature (Aghassi and
Bertsimas 2006) and has been shown to effectively represent uncertainty in SSG (Kiekintveld, Islam, and Kreinovich
2013). Second, based on this game model, we propose a new
efficient robust algorithm that computes the defender’s optimal strategy which is robust to the uncertainty.
Overall, the resulting robust optimization problem for
computing the defender’s optimal strategy against the worst
case of behavioral uncertainty is a non-linear non-convex
fractional maximin problem. Our algorithm efficiently
solves this problem based on the following key insights: 1)
it converts the problem into a single maximization problem
via a non-linear conversion for fractional terms and the dual
of the inner minimization in maximin; 2) a binary search is
then applied to remove the fractional terms; and 3) the algorithm explores extreme points of the feasible solution region
and uses a piece-wise linear approximation to convert the
problem into a Mixed Integer Linear Program (MILP). Our
new algorithm provides an $O(\epsilon + \frac{1}{K})$-optimal solution, where $\epsilon$
is the convergence threshold for the binary search and $K$
is the number of segments in the piecewise approximation.
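The role of the binary search in removing fractional terms can be illustrated on a toy problem. The sketch below maximizes a ratio N(x)/D(x) with D(x) > 0 by searching over a threshold r and asking, at each step, whether some candidate x satisfies N(x) − r·D(x) ≥ 0. This is a generic parametric technique, not the paper's algorithm: the finite candidate set stands in for the MILP the paper solves at each step, and the function name `max_ratio` and all inputs are hypothetical.

```python
from typing import Callable, Sequence


def max_ratio(
    candidates: Sequence[float],
    numer: Callable[[float], float],
    denom: Callable[[float], float],
    lo: float,
    hi: float,
    eps: float = 1e-6,
) -> float:
    """Approximate max over candidates of numer(x) / denom(x) to within eps.

    Assumes denom(x) > 0 for every candidate and lo <= optimum <= hi.
    """
    while hi - lo > eps:
        mid = (lo + hi) / 2.0
        # Non-fractional feasibility check: is a ratio of at least mid achievable?
        if max(numer(x) - mid * denom(x) for x in candidates) >= 0.0:
            lo = mid  # some candidate reaches ratio >= mid
        else:
            hi = mid  # no candidate reaches ratio mid
    return lo


# Toy instance (made up): maximize (1 + 2x) / (1 + x) over a small grid.
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
best = max_ratio(grid, lambda x: 1.0 + 2.0 * x, lambda x: 1.0 + x, lo=0.0, hi=10.0)
```

The returned value is within `eps` of the true maximum ratio, which mirrors how a convergence threshold for the binary search contributes an additive term to the solution-quality bound.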
We address the challenge of detecting and addressing advanced persistent threats
(APTs) in a computer network, focusing in particular on the challenge of detecting data exfiltration over Domain Name System (DNS) queries, where existing detection sensors are
imperfect and lead to noisy observations about the network’s security state. Data exfiltration
over DNS queries involves unauthorized transfer of sensitive data from an organization to a
remote adversary through a DNS data tunnel to a malicious web domain. Given the noisy
sensors, previous work has illustrated that standard approaches fail to satisfactorily rise to the
challenge of detecting exfiltration attempts. Instead, we propose a decision-theoretic technique
that sequentially plans to accumulate evidence under uncertainty while taking into account
the cost of deploying such sensors. More specifically, we provide a fast scalable POMDP formulation to address the challenge, where the efficiency of the formulation is based on two
key contributions: (i) we use a virtually distributed POMDP (VD-POMDP) formulation, motivated by previous work in distributed POMDPs with sparse interactions, where individual
policies for different sub-POMDPs are planned separately but their sparse interactions are
only resolved at execution time to determine the joint actions to perform; (ii) we allow for
abstraction in planning for speedups, and then use a fast MILP to implement the abstraction
while resolving any interactions. This allows us to determine optimal sensing strategies, leveraging information from many noisy detectors, and subject to constraints imposed by network
topology, forwarding rules and performance costs on the frequency, scope and efficiency of
sensing we can perform.
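The core step in accumulating evidence from a noisy detector is a Bayesian belief update over the hidden security state. The sketch below shows that update for a two-state case (exfiltration present or absent) and a single binary sensor. It is a minimal illustration of the belief update that any POMDP relies on, not the VD-POMDP formulation itself; the function name `update_belief` and the detection rates are assumptions.

```python
def update_belief(
    prior: float,
    alert: bool,
    true_positive_rate: float,
    false_positive_rate: float,
) -> float:
    """Return P(exfiltration | observation) for one noisy binary sensor reading."""
    # Likelihood of the observation under each hidden state.
    if alert:
        like_attack = true_positive_rate
        like_benign = false_positive_rate
    else:
        like_attack = 1.0 - true_positive_rate
        like_benign = 1.0 - false_positive_rate
    evidence = like_attack * prior + like_benign * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("observation has zero probability under the model")
    return like_attack * prior / evidence


# Illustrative numbers (assumed): 10% prior, sensor with 80% detection rate
# and 20% false-alarm rate, two consecutive alerts.
belief = 0.1
for _ in range(2):
    belief = update_belief(belief, alert=True,
                           true_positive_rate=0.8, false_positive_rate=0.2)
```

A single alert from such a sensor leaves the belief well below certainty, which is why a planner may choose to query further sensors, at some cost, before acting.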
The conservation of key wildlife species such as tigers and
elephants is threatened by poaching activities. In many conservation areas, foot patrols are conducted to prevent poaching but they may not be well-planned to make the best use of
the limited patrolling resources. While prior work has introduced PAWS (Protection Assistant for Wildlife Security) as
a game-theoretic decision aid to design effective foot patrol
strategies to protect wildlife, the patrol routes generated by
PAWS may be difficult to follow in areas with complex terrain. Subsequent research has worked on the significant evolution of PAWS, from an emerging application to a regularly
deployed software. A key advance of the deployed version of
PAWS is that it incorporates the complex terrain information
and generates a strategy consisting of easy-to-follow routes.
In this demonstration, we provide 1) a video introducing the
PAWS system; 2) an interactive visualization of the patrol
routes generated by PAWS in an example area with complex
terrain; and 3) a machine-human competition in designing patrol strategy given complex terrain and animal distribution.
Research on security games has focused on settings where the defender must protect against either a single adversary or multiple, independent
adversaries. However, there are a variety of real-world security domains where
adversaries may benefit from colluding in their actions against the defender, e.g.,
wildlife poaching, urban crime and drug trafficking. Given such adversary collusion may be more detrimental for the defender, she has an incentive to break
up collusion by playing off the self-interest of individual adversaries. As we show
in this paper, breaking up such collusion is difficult given bounded rationality of
human adversaries; we therefore investigate algorithms for the defender assuming both rational and boundedly rational adversaries. The contributions of this
paper include (i) collusive security games (COSGs), a model for security games
involving potential collusion among adversaries, (ii) SPECTRE-R, an algorithm
to solve COSGs and break collusion assuming rational adversaries, (iii) observations and analyses of adversary behavior and the underlying factors including bounded rationality, imbalanced-resource-allocation effect, coverage perception, and individualism / collectivism attitudes within COSGs with data from
700 human subjects, (iv) a learned human behavioral model that incorporates
these factors to predict when collusion will occur, (v) SPECTRE-BR, an enhanced
algorithm which optimizes against the learned behavior model to provide demonstrably better performing defender strategies against human subjects compared
Multi-agent teamwork and defender-attacker security games are two areas that are currently receiving significant attention within multi-agent systems research. Unfortunately, despite the need for effective teamwork among multiple defenders, little has been done to harness the teamwork research in security games. The problem that this paper seeks to solve is the coordination of decentralized defender agents in the presence of uncertainty while securing targets against an observing adversary. To address this problem, we offer the following novel contributions in this paper: (i) New model of security games with defender teams that coordinate under uncertainty; (ii) New algorithm based on column generation that utilizes Decentralized Markov Decision Processes (Dec-MDPs) to generate defender strategies that incorporate uncertainty; (iii) New techniques to handle global events (when one or more agents may leave the system) during defender execution; (iv) Heuristics that help scale up in the number of targets and agents to handle real-world scenarios; (v) Exploration of the robustness of randomized pure strategies. The paper opens the door to a potentially new area combining computational game theory and multi-agent teamwork.
Recent years have seen increasing interest in AI from
outside the AI community. This is partly due to applications based on AI that have been used in real-world domains, for example, the successful deployment of game
theory-based decision aids in security domains. This paper describes our teaching approach for introducing the
AI concepts underlying security games to diverse audiences. We adapted a game-based research platform
that served as a testbed for recent research advances
in computational game theory into a set of interactive
role-playing games. We guided learners in playing these
games as part of our teaching strategy, which also included didactic instruction and interactive exercises on
broader AI topics. We describe our experience in applying this teaching approach to diverse audiences, including students of an urban public high school, university
undergraduates, and security domain experts who protect wildlife. We evaluate our approach based on results
from the games and participant surveys.