There has been a significant amount of research in Stackelberg Security Games (SSG), and a common
assumption in that literature is that the adversary perfectly observes the defender’s mixed strategy.
However, in real-world settings the adversary can only observe a sequence of defender pure strategies sampled from the actual mixed strategy. Therefore, a key challenge is modeling the adversary’s belief formation based on such limited observations. The SSG literature lacks a comparative
analysis of these models and a principled study of their strengths and weaknesses. In this paper, we
study the following shortcomings of previous work and introduce new models that address these
shortcomings. First, we address the lack of empirical evaluation or head-to-head comparison of
existing models by conducting a first-of-its-kind systematic comparison of existing and newly proposed models on belief data collected from human subjects on Amazon Mechanical Turk. Second,
we show that assuming a homogeneous population of adversaries, a common assumption in the
literature, is unrealistic based on our experiments, which highlight four heterogeneous groups of
adversaries with distinct belief update mechanisms. We present new models that address this shortcoming by clustering and learning these disparate behaviors from data when available. Third, we
quantify the value of having historical data on the accuracy of belief prediction.
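As a rough illustration of belief formation from sampled pure strategies, the sketch below has an adversary keep per-target counts of where the defender was observed and smooth them with a symmetric pseudo-count. This is not one of the models evaluated in the paper; the function name `belief_after_observations`, the single-resource setting, and the `prior` parameter are all assumptions made for the example.

```python
from collections import Counter


def belief_after_observations(observations, targets, prior=1.0):
    """Estimate the defender's coverage probabilities from observed pure strategies.

    observations: targets the defender was seen protecting, one per observation
                  (assumes a single defender resource; hypothetical setting).
    targets:      all targets in the game.
    prior:        assumed symmetric pseudo-count added to every target.
    """
    counts = Counter(observations)
    total = len(observations) + prior * len(targets)
    # Posterior-mean style estimate: smoothed empirical frequency per target.
    return {t: (counts[t] + prior) / total for t in targets}


# Three observations over three targets, with one pseudo-count per target.
belief = belief_after_observations(["A", "A", "B"], ["A", "B", "C"])
```

With few observations the estimate stays close to uniform, and it moves toward the empirical frequencies as the observation sequence grows.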
In recent years, there have been a number of successful cyber attacks on enterprise networks by malicious actors. These attacks generate alerts which
must be investigated by cyber analysts to determine
whether they indicate an attack. Unfortunately, there are orders of magnitude more alerts than cyber analysts, a trend expected to continue into the future, creating a need
to find optimal assignments of the incoming alerts
to analysts in the presence of a strategic adversary.
We address this challenge with the following four
contributions: (1) a cyber allocation game (CAG)
model for the cyber network protection domain, (2)
an NP-hardness proof for computing the optimal
strategy for the defender, (3) techniques to find the
optimal allocation of experts to alerts in CAG in the
general case and key special cases, and (4) heuristics to achieve significant scale-up in CAGs with
minimal loss in solution quality.
Voting among different agents is a powerful tool in problem solving, and it has
been widely applied to improve the performance in finding the correct answer to complex
problems. We present a novel benefit of voting that has not been observed before: we can
use the voting patterns to assess the performance of a team and predict their final outcome.
This prediction can be executed at any moment during problem-solving and it is completely
domain independent. Hence, it can be used to identify when a team is failing, allowing an
operator to take remedial procedures (such as changing team members, the voting rule, or
increasing the allocation of resources). We present three main theoretical results: (i) we
show a theoretical explanation of why our prediction method works; (ii) contrary to what
would be expected based on a simpler explanation using classical voting models, we show
that we can make accurate predictions irrespective of the strength (i.e., performance) of the
teams, and that in fact, the prediction can work better for diverse teams composed of different agents than uniform teams made of copies of the best agent; (iii) we show that the quality
of our prediction increases with the size of the action space. We perform extensive experimentation in two different domains: Computer Go and Ensemble Learning. In Computer
Go, we obtain high quality predictions about the final outcome of games. We analyze the
prediction accuracy for three different teams with different levels of diversity and strength,
and show that the prediction works significantly better for a diverse team. Additionally, we
show that our method still works well when trained with games against one adversary, but
tested with games against another, showing the generality of the learned functions. Moreover, we evaluate four different board sizes, and experimentally confirm better predictions
in larger board sizes. We analyze in detail the learned prediction functions, and how they change according to each team and action space size. In order to show that our method is
domain independent, we also present results in Ensemble Learning, where we make online
predictions about the performance of a team of classifiers, while they are voting to classify
sets of items. We study a set of classical classification algorithms from machine learning on
a dataset of handwritten digits, and we are able to make high-quality predictions about the
final performance of two different teams. Since our approach is domain independent, it can
be easily applied to a variety of other domains.
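To make the idea of using voting patterns as a signal more concrete, here is a minimal sketch of plurality voting together with one simple pattern feature: the fraction of agents that agree with the winning action, averaged over the rounds seen so far. The feature and the function names are assumptions for illustration; the abstract does not specify which features the learned prediction functions use.

```python
from collections import Counter


def plurality_vote(votes):
    """Return the action with the most votes; ties go to the earliest vote cast.

    Assumes `votes` is a non-empty list of hashable actions.
    """
    counts = Counter(votes)
    best = max(counts.values())
    for action in votes:
        if counts[action] == best:
            return action


def agreement_fraction(votes):
    """Fraction of agents whose vote matches the plurality winner."""
    winner = plurality_vote(votes)
    return sum(v == winner for v in votes) / len(votes)


def mean_agreement(history):
    """Average agreement over a non-empty list of voting rounds."""
    return sum(agreement_fraction(votes) for votes in history) / len(history)
```

A feature like `mean_agreement` can be computed at any point during problem solving, which is what would allow a prediction to be made online; how such features map to the final outcome would have to be learned from data.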
Could an AI decision aid improve housing systems that assist homeless youth? There are nearly 2 million homeless
youth in the United States each year. Coordinated entry systems are being used to provide homeless youth with housing
assistance across the nation. Despite these efforts, the number of homeless youth still living on the street remains very
high. Motivated by this fact, we initiate a first study to create
AI decision aids for improving the current housing systems
for homeless youth. First, we determine whether the current
rubric for prioritizing youth for housing assistance can be
used to predict youth’s homelessness status after receiving
housing assistance. We then consider building better AI decision aids and predictive models using other components of
the rubric. We believe there is much potential for effective
human-machine collaboration in the context of housing allocation. We plan to work with HUD and local communities to
develop such systems in the future.
In recent years, AI-based applications have increasingly been used in real-world domains. For example, game-theory-based decision aids have been successfully deployed in various security settings to protect ports, airports, and wildlife.
This paper describes our unique problem-to-project educational approach that used games rooted in real-world issues
to teach AI concepts to diverse audiences. Specifically, our educational program began by presenting real-world
security issues, and progressively introduced complex AI concepts using lectures, interactive exercises, and ultimately
hands-on games to promote learning. We describe our experience in applying this approach to several audiences,
including students of an urban public high school, university undergraduates, and security domain experts who protect
wildlife. We evaluated our approach based on results from the games and participant surveys.
Motivated by the problem of protecting endangered animals,
there has been a surge of interest in optimizing patrol planning for conservation area protection. Previous efforts in these domains have mostly
focused on optimizing patrol routes against a specific boundedly rational
poacher behavior model that describes poachers’ choices of areas to attack. However, these planning algorithms do not apply to other poaching
prediction models, particularly the complex machine learning models
that have recently been shown to provide better predictions than traditional
bounded-rationality-based models. Moreover, previous patrol planning
algorithms do not handle the important concern whereby poachers infer the patrol routes by partially monitoring the rangers’ movements. In
this paper, we propose OPERA, a general patrol planning framework
that: (1) generates optimal implementable patrolling routes against a
black-box attacker which can represent a wide range of poaching prediction models; (2) incorporates entropy maximization to ensure that the
generated routes are more unpredictable and robust to poachers’ partial monitoring. Our experiments on a real-world dataset from Uganda’s
Queen Elizabeth Protected Area (QEPA) show that OPERA results in
better defender utility, more efficient coverage of the area and more unpredictability than benchmark algorithms and the past routes used by
rangers at QEPA.
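The unpredictability that entropy maximization targets can be illustrated with the Shannon entropy of a probability distribution over patrol routes. The sketch below only computes that quantity; it is not OPERA's optimization procedure, and the function name and the natural-log convention are assumptions.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of a probability distribution over routes.

    Zero-probability routes contribute nothing, following the usual
    convention that 0 * log(0) = 0.
    """
    return -sum(p * math.log(p) for p in probs if p > 0)


# A uniform mix over four routes is maximally unpredictable for four routes,
# while always using the same route has zero entropy.
uniform_entropy = shannon_entropy([0.25, 0.25, 0.25, 0.25])
fixed_entropy = shannon_entropy([1.0, 0.0, 0.0, 0.0])
```

A planner that adds an entropy term to its objective trades some expected utility for route distributions that are harder to infer from partial monitoring.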
F. Fang, T. H. Nguyen, A. Sinha, S. Gholami, A. Plumptre, L. Joppa, M. Tambe, M. Driciru, F. Wanyama, A. Rwetsiba, R. Critchlow, and C. M. Beale. 2017. “Predicting Poaching for Wildlife Protection.” IBM Journal of Research and Development (To appear).
Wildlife species such as tigers and elephants are under the threat of poaching. To combat
poaching, conservation agencies (“defenders”) need to (1) anticipate where the poachers are
likely to poach and (2) plan effective patrols. We propose an anti-poaching tool CAPTURE
(Comprehensive Anti-Poaching tool with Temporal and observation Uncertainty REasoning),
which helps the defenders achieve both goals. CAPTURE builds a novel hierarchical model for
poacher-patroller interaction. It considers the patroller’s imperfect detection of signs of
poaching, the complex temporal dependencies in the poacher's behaviors and the defender’s lack
of knowledge of the number of poachers. Further, CAPTURE uses a new game-theoretic
algorithm to compute the optimal patrolling strategies and plan effective patrols. This paper
investigates the computational challenges that CAPTURE faces. First, we present a detailed
analysis of parameter separation and target abstraction, two novel approaches used by
CAPTURE to efficiently learn the parameters in the hierarchical model. Second, we propose two
heuristics – piece-wise linear approximation and greedy planning – to speed up the computation
of the optimal patrolling strategies. We discuss in this paper the lessons learned from using
CAPTURE to analyze real-world poaching data collected over 12 years in Queen Elizabeth
National Park in Uganda.
This paper focuses on new challenges in influence maximization inspired by non-profits’ use of social networks to effect behavioral
change in their target populations. Influence maximization is a multiagent problem where the challenge is to select the most influential agents
from a population connected by a social network. Specifically, our work is
motivated by the problem of spreading messages about HIV prevention
among homeless youth using their social network. We show how to compute solutions which are provably close to optimal when the parameters
of the influence process are unknown. We then extend our algorithm to
a dynamic setting where information about the network is revealed at
each stage. Simulation experiments using real world networks collected
by the homeless shelter show the advantages of our approach.
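For readers unfamiliar with influence maximization, the sketch below shows the standard greedy baseline under an independent cascade model with a known, uniform propagation probability `p`. It is not the algorithm from this paper, which handles unknown influence parameters; the graph format, function names, and Monte Carlo settings are assumptions for illustration.

```python
import random


def simulate_cascade(graph, seeds, p, rng):
    """Run one independent-cascade simulation and return the number of active nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        newly_active = []
        for u in frontier:
            for v in graph.get(u, ()):
                # Each newly active node gets one chance to activate each neighbor.
                if v not in active and rng.random() < p:
                    active.add(v)
                    newly_active.append(v)
        frontier = newly_active
    return len(active)


def expected_spread(graph, seeds, p, runs=200, seed=0):
    """Monte Carlo estimate of the expected number of influenced nodes."""
    rng = random.Random(seed)
    return sum(simulate_cascade(graph, seeds, p, rng) for _ in range(runs)) / runs


def greedy_seeds(graph, k, p, runs=200):
    """Pick k seeds one at a time, each maximizing the estimated spread.

    Assumes k does not exceed the number of nodes and that node labels are sortable.
    """
    nodes = set(graph) | {v for nbrs in graph.values() for v in nbrs}
    chosen = []
    for _ in range(k):
        candidates = sorted(nodes - set(chosen))
        best = max(candidates,
                   key=lambda n: expected_spread(graph, chosen + [n], p, runs))
        chosen.append(best)
    return chosen


# Hypothetical directed network given as adjacency lists.
toy_graph = {"a": ["b", "c"], "b": ["d"], "e": []}
```

When `p` is uncertain, as in the setting the abstract describes, a single estimate like `expected_spread` is no longer enough, which is what motivates methods with guarantees across the possible parameter values.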
We consider the problem of dynamically allocating screening resources of different efficacies (e.g.,
magnetic or X-ray imaging) at checkpoints (e.g., at
airports or ports) to successfully avert an attack by
one of the screenees. Previously, the Threat Screening Game model was introduced to address this
problem under the assumption that screenee arrival
times are perfectly known. In reality, arrival times
are uncertain, which severely impedes the implementability and performance of this approach. We
thus propose a novel framework for dynamic allocation of threat screening resources that explicitly accounts for uncertainty in the screenee arrival
times. We model the problem as a multistage robust
optimization problem and propose a tractable solution approach using compact linear decision rules
combined with robust reformulation and constraint
randomization. We perform extensive numerical
experiments which showcase that our approach outperforms (a) exact solution methods in terms of
tractability, while incurring only a very minor loss
in optimality, and (b) methods that ignore uncertainty in terms of both feasibility and optimality.
Many homeless shelters conduct interventions to raise awareness about HIV (human
immunodeficiency virus) among homeless youth. Due to human and financial resource
shortages, these shelters need to choose intervention attendees strategically, in order to maximize
awareness through the homeless youth social network. In this work, we propose HEALER
(hierarchical ensembling based agent which plans for effective reduction in HIV spread), an
agent that recommends sequential intervention plans for use by homeless shelters. HEALER's
sequential plans (built using knowledge of homeless youth social networks) select intervention
participants strategically to maximize influence spread by solving POMDPs (partially
observable Markov decision processes) on social networks using heuristic ensemble methods. This
paper explores the motivations behind HEALER’s design, and analyzes HEALER’s performance
in simulations on real-world networks. First, we provide a theoretical analysis of the DIME
(dynamic influence maximization under uncertainty) problem, the main computational problem
that HEALER solves. HEALER relies on heuristic methods for solving the DIME problem due
to its computational hardness. Second, we explain why heuristics used inside HEALER work
well on real-world networks. Third, we present results comparing HEALER to baseline
algorithms augmented by HEALER’s heuristics. HEALER is currently being tested in real-world
pilot studies with homeless youth in Los Angeles.
Wildlife conservation organizations task rangers to deter and capture wildlife poachers. Since rangers are responsible for patrolling vast areas, adversary behavior modeling can help more effectively direct future patrols. In this innovative application track paper, we present an adversary behavior modeling system, INTERCEPT (INTERpretable Classification Ensemble to Protect Threatened species), and provide the most extensive evaluation in the AI literature of one of the largest poaching datasets from Queen Elizabeth National Park (QENP) in Uganda, comparing INTERCEPT with its competitors; we also present results from a month-long test of INTERCEPT in the field. We present three major contributions. First, we present a paradigm shift in modeling and forecasting wildlife poacher behavior. Some of the latest work in the AI literature (and in Conservation) has relied on models similar to the Quantal Response model from Behavioral Game Theory for poacher behavior prediction. In contrast, INTERCEPT presents a behavior model based on an ensemble of decision trees (i) that more effectively predicts poacher attacks and (ii) that is more effectively interpretable and verifiable. We augment this model to account for spatial correlations and construct an ensemble of the best models, significantly improving performance. Second, we conduct an extensive evaluation on the QENP dataset, comparing 41 models in prediction performance over two years. Third, we present the results of deploying INTERCEPT for a one-month field test in QENP - a first for adversary behavior modeling applications in this domain. This field test has led to finding a poached elephant and more than a dozen snares (including a roll of elephant snares) before they were deployed, potentially saving the lives of multiple animals - including endangered elephants.
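For context on the baseline family mentioned above, the Quantal Response model assigns each target a choice probability proportional to the exponential of its utility scaled by a rationality parameter. The sketch below is a generic implementation of that formula, not INTERCEPT and not the specific models compared in the paper; the parameter name `lam` and the example utilities are assumptions.

```python
import math


def quantal_response(utilities, lam):
    """Attack probabilities under a Quantal Response model.

    utilities: adversary utility for attacking each target.
    lam:       rationality parameter; 0 gives uniform random choice,
               larger values concentrate probability on the best target.
    """
    scaled = [lam * u for u in utilities]
    shift = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - shift) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical utilities for three targets.
probs = quantal_response([1.0, 2.0, 3.0], lam=1.0)
```

A decision-tree ensemble replaces this single parametric form with learned splits on features, which is the source of the interpretability claim in the abstract.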
The field of influence maximization (IM) has made rapid advances, resulting in many sophisticated algorithms for identifying “influential” members in social networks. However, in order to engender trust in IM algorithms, the rationale behind their choice of “influential” nodes needs to be explained to its users. This is a challenging open problem that needs to be solved before these algorithms can be deployed on a large scale. This paper attempts to tackle this open problem via four major contributions: (i) we propose a general paradigm for designing explanation systems for IM algorithms by exploiting the tradeoff between explanation accuracy and interpretability; our paradigm treats IM algorithms as black boxes, and is flexible enough to be used with any algorithm; (ii) we utilize this paradigm to build XplainIM, a suite of explanation systems; (iii) we illustrate the usability of XplainIM by explaining solutions of HEALER (a recent IM algorithm) among ∼200 human subjects on Amazon Mechanical Turk (AMT); and (iv) we provide extensive evaluation of our AMT results, which shows the effectiveness of XplainIM.
Despite significant research in Security Games, limited efforts have been made to handle game domains with continuous space. Addressing such limitations, in this paper we propose: (i) a continuous space security game model that considers infinite-size action spaces for players; (ii) OptGradFP, a novel and general algorithm that searches for the optimal defender strategy in a parametrized search space; (iii) OptGradFP-NN, a convolutional neural network based implementation of OptGradFP for continuous space security games; (iv) experiments and analysis with OptGradFP-NN. This is the first time that neural networks have been used for security games, and it shows the promise of applying deep learning to complex security games which previous approaches fail to handle.
This paper focuses on a topic that is insufficiently addressed in the literature, i.e., challenges faced in transitioning agents from an emerging phase in the lab to a deployed application in the field. Specifically, we focus on challenges faced in transitioning HEALER and DOSIM, two agents for social influence maximization, which assist service providers in maximizing HIV awareness in real-world homeless-youth social networks. These agents recommend key "seed" nodes in social networks, i.e., homeless youth who would maximize HIV awareness in their real-world social network. While prior research on these agents published promising simulation results from the lab, this paper illustrates that transitioning these agents from the lab into the real world is not straightforward, and outlines three major lessons. First, it is important to conduct real-world pilot tests; indeed, due to the health-critical nature of the domain and complex influence spread models used by these agents, it is important to conduct field tests to ensure the real-world usability and effectiveness of these agents. We present results from three real-world pilot studies, involving 173 homeless youth in an American city. These are the first such pilot studies which provide a head-to-head comparison of different agents for social influence maximization, including a comparison with a baseline approach. Second, we present analyses of these real-world results, illustrating the strengths and weaknesses of different influence maximization approaches we compare. Third, we present research and deployment challenges revealed in conducting these pilot tests, and propose solutions to address them. These challenges and proposed solutions are instructive in assisting the transition of agents focused on social influence maximization from the emerging to the deployed application phase.
This paper presents HEALER, a software agent that recommends sequential intervention plans for use by homeless shelters, which organize these interventions to raise awareness about HIV among homeless youth. HEALER’s sequential plans (built using knowledge of social networks of homeless youth) choose intervention participants strategically to maximize influence spread, while reasoning about uncertainties in the network. While previous work presents influence-maximizing techniques to choose intervention participants, these techniques do not address two real-world issues: (i) they completely fail to scale up to real-world sizes; and (ii) they do not handle deviations in execution of intervention plans. HEALER handles these issues via two major contributions: (i) HEALER casts this influence maximization problem as a POMDP and solves it using a novel planner which scales up to previously unsolvable real-world sizes; and (ii) HEALER allows shelter officials to modify its recommendations, and updates its future plans in a deviation-tolerant manner. HEALER was deployed in the real world in Spring 2016 with considerable success.
Poaching is considered a major driver for the population drop of key species such as tigers, elephants, and rhinos, which can be detrimental to whole ecosystems. While conducting foot patrols is the most commonly used approach in many countries to prevent poaching, such patrols often do not make the best use of the limited patrolling resources. This paper presents PAWS, a game-theoretic application deployed in Southeast Asia for optimizing foot patrols to combat poaching. In this paper, we report on the significant evolution of PAWS from a proposed decision aid introduced in 2014 to a regularly deployed application. We outline key technical advances that led to PAWS’s regular deployment: (i) incorporating complex topographic features, e.g., ridgelines, in generating patrol routes; (ii) handling uncertainties in species distribution (game theoretic payoffs); (iii) ensuring scalability for patrolling large-scale conservation areas with fine-grained guidance; and (iv) handling complex patrol scheduling constraints.
Conservation agencies worldwide must make the most efficient use of their limited resources to protect natural resources from over-harvesting and animals from poaching. Predictive modeling, a tool to increase efficiency, is seeing increased usage in conservation domains such as protecting wildlife from poaching. Many works in this wildlife protection domain, however, fail to train their models on real-world data or test their models in the real world. My thesis proposes novel poacher behavior models that are trained on real-world data and are tested via first-of-their-kind tests in the real world. First, I proposed a paradigm shift in traditional adversary behavior modeling techniques from Quantal Response-based models to decision tree-based models. Based on this shift, I proposed an ensemble of spatially-aware decision trees, INTERCEPT, that outperformed the prior state-of-the-art and then also presented results from a one-month pilot field test of the ensemble’s predictions in Uganda’s Queen Elizabeth Protected Area (QEPA). This field test represented the first time that a machine learning-based poacher behavior modeling application was tested in the field. Second, I proposed a hybrid spatio-temporal model that led to further performance improvements. To validate this model, I designed and conducted a large-scale, eight-month field test of this model’s predictions in QEPA. This field test, where rangers patrolled over 450 km in the largest and longest field test of a machine learning-based poacher behavior model to date in this domain, successfully demonstrated the selectiveness of the model’s predictions; the model successfully predicted, with statistical significance, where rangers would find more snaring activity and also where rangers would not find as much snaring activity. I also conducted detailed analysis of the behavior of my predictive model.
Third, beyond wildlife poaching, I also provided novel graph-aware models for modeling human adversary behavior in wildlife or other contraband smuggling networks and tested them against human subjects. Lastly, I examined human considerations of deployment in new domains and the importance of easily-interpretable models and results. While such interpretability has been a recurring theme in all my thesis work, I also created a game-theoretic inspection strategy application that generated randomized factory inspection schedules and also contained visualization and explanation components for users.
Worldwide, conservation agencies employ rangers to protect conservation areas from poachers. However, agencies lack the manpower to have rangers effectively patrol these vast areas frequently. While past work has modeled poachers’ behavior so as to aid rangers in planning future patrols, those models’ predictions were not validated by extensive field tests. In this paper, we present a hybrid spatio-temporal model that predicts poaching threat levels and results from a five-month field test of our model in Uganda’s Queen Elizabeth Protected Area (QEPA). To our knowledge, this is the first time that a predictive model has been evaluated through such an extensive field test in this domain. We present two major contributions. First, our hybrid model consists of two components: (i) an ensemble model which can work with the limited data common to this domain and (ii) a spatio-temporal model to boost the ensemble’s predictions when sufficient data are available. When evaluated on real-world historical data from QEPA, our hybrid model achieves significantly better performance than previous approaches with either temporally-aware dynamic Bayesian networks or an ensemble of spatially-aware models. Second, in collaboration with the Wildlife Conservation Society and Uganda Wildlife Authority, we present results from a five-month controlled experiment where rangers patrolled over 450 sq km across QEPA. We demonstrate that our model successfully predicted (1) where snaring activity would occur and (2) where it would not occur; in areas where we predicted a high rate of snaring activity, rangers found more snares and snared animals than in areas of lower predicted activity. These findings demonstrate that (1) our model’s predictions are selective, (2) our model’s superior laboratory performance extends to the real world, and (3) these predictive models can aid rangers in focusing their efforts to prevent wildlife poaching and save animals.