Lily Xu, Arpita Biswas, Fei Fang, and Milind Tambe. 7/23/2022. “Ranked Prioritization of Groups in Combinatorial Bandit Allocation.” International Joint Conference on Artificial Intelligence (IJCAI) 31. Vienna, Austria.
Preventing poaching through ranger patrols is critical for protecting endangered wildlife. Combinatorial bandits have been used to allocate limited patrol resources, but existing approaches overlook the fact that each location is home to multiple species in varying proportions, so a patrol benefits each species to differing degrees. When some species are more vulnerable, we ought to offer more protection to these animals; unfortunately, existing combinatorial bandit approaches do not offer a way to prioritize important species. To bridge this gap, (1) we propose a novel combinatorial bandit objective, which we call ranked prioritization, that trades off reward maximization against prioritization over species. We show this objective can be expressed as a weighted linear sum of Lipschitz-continuous reward functions. (2) We provide RankedCUCB, an algorithm to select combinatorial actions that optimize our prioritization-based objective, and prove that it achieves asymptotic no-regret. (3) We demonstrate empirically that RankedCUCB leads to up to 38% improvement in outcomes for endangered species using real-world wildlife conservation data. Along with adapting to other challenges such as preventing illegal logging and overfishing, our no-regret algorithm addresses the general combinatorial bandit problem with a weighted linear objective.
Lily Xu. 10/24/2021. “Learning, Optimization, and Planning Under Uncertainty for Wildlife Conservation.” INFORMS Doing Good with Good OR.

Wildlife poaching fuels the multi-billion dollar illegal wildlife trade and pushes countless species to the brink of extinction. To aid rangers in preventing poaching in protected areas around the world, we have developed PAWS, the Protection Assistant for Wildlife Security. We present technical advances in multi-armed bandits and robust sequential decision-making using reinforcement learning, with research questions that emerged from on-the-ground challenges. We also discuss bridging the gap between research and practice, presenting results from field deployment in Cambodia and large-scale deployment through integration with SMART, the leading software system for protected area management used by over 1,000 wildlife parks worldwide.

Lily Xu. 8/21/2021. “Learning and Planning Under Uncertainty for Green Security.” 30th International Joint Conference on Artificial Intelligence (IJCAI).
Lily Xu, Andrew Perrault, Fei Fang, Haipeng Chen, and Milind Tambe. 7/27/2021. “Robust Reinforcement Learning Under Minimax Regret for Green Security.” Conference on Uncertainty in Artificial Intelligence (UAI).
Green security domains feature defenders who plan patrols in the face of uncertainty about the adversarial behavior of poachers, illegal loggers, and illegal fishers. Importantly, the deterrence effect of patrols on adversaries' future behavior makes patrol planning a sequential decision-making problem. Therefore, we focus on robust sequential patrol planning for green security following the minimax regret criterion, which has not been considered in the literature. We formulate the problem as a game between the defender and nature, who controls the parameter values of the adversarial behavior, and design an algorithm, MIRROR, to find a robust policy. MIRROR uses two reinforcement learning-based oracles and solves a restricted game considering limited defender strategies and parameter values. We evaluate MIRROR on real-world poaching data.
Lily Xu, Andrew Perrault, Fei Fang, Haipeng Chen, and Milind Tambe. 5/5/2021. “Robustness in Green Security: Minimax Regret Optimality with Reinforcement Learning.” AAMAS Workshop on Autonomous Agents for Social Good.
Lily Xu, Elizabeth Bondi, Fei Fang, Andrew Perrault, Kai Wang, and Milind Tambe. 2/2021. “Dual-Mandate Patrols: Multi-Armed Bandits for Green Security.” In Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21).
Conservation efforts in green security domains to protect wildlife and forests are constrained by the limited availability of defenders (i.e., patrollers), who must patrol vast areas to protect from attackers (e.g., poachers or illegal loggers). Defenders must choose how much time to spend in each region of the protected area, balancing exploration of infrequently visited regions and exploitation of known hotspots. We formulate the problem as a stochastic multi-armed bandit, where each action represents a patrol strategy, enabling us to guarantee the rate of convergence of the patrolling policy. However, a naive bandit approach would compromise short-term performance for long-term optimality, resulting in animals poached and forests destroyed. To speed up performance, we leverage smoothness in the reward function and decomposability of actions. We show a synergy between Lipschitz-continuity and decomposition as each aids the convergence of the other. In doing so, we bridge the gap between combinatorial and Lipschitz bandits, presenting a no-regret approach that tightens existing guarantees while optimizing for short-term performance. We demonstrate that our algorithm, LIZARD, improves performance on real-world poaching data from Cambodia.
Rachel Guo, Lily Xu, Drew Cronin, Francis Okeke, Andrew Plumptre, and Milind Tambe. 12/12/2020. “Enhancing Poaching Predictions for Under-Resourced Wildlife Conservation Parks Using Remote Sensing Imagery.”
Lily Xu, Andrew Perrault, Andrew Plumptre, Margaret Driciru, Fred Wanyama, Aggrey Rwetsiba, and Milind Tambe. 7/20/2020. “Game Theory on the Ground: The Effect of Increased Patrols on Deterring Poachers.” Harvard CRCS Workshop on AI for Social Good.
Applications of artificial intelligence for wildlife protection have focused on learning models of poacher behavior based on historical patterns. However, poachers' behaviors are described not only by their historical preferences, but also by their reaction to ranger patrols. Past work applying machine learning and game theory to combat poaching has hypothesized that ranger patrols deter poachers, but has been unable to find evidence to identify how or even if deterrence occurs. Here, for the first time, we demonstrate a measurable deterrence effect on real-world poaching data. We show that increased patrols in one region deter poaching in the next timestep, but poachers then move to neighboring regions. Our findings offer guidance on how adversaries should be modeled in realistic game-theoretic settings.
Lily Xu*, Shahrzad Gholami*, Sara Mc Carthy, Bistra Dilkina, Andrew Plumptre, Milind Tambe, Rohit Singh, Mustapha Nsubuga, Joshua Mabonga, Margaret Driciru, Fred Wanyama, Aggrey Rwetsiba, Tom Okello, and Eric Enyel. 4/20/2020. “Stay Ahead of Poachers: Illegal Wildlife Poaching Prediction and Patrol Planning Under Uncertainty with Field Test Evaluations.” In IEEE International Conference on Data Engineering (ICDE-20).
Illegal wildlife poaching threatens ecosystems and drives endangered species toward extinction. However, efforts for wildlife protection are constrained by the limited resources of law enforcement agencies. To help combat poaching, the Protection Assistant for Wildlife Security (PAWS) is a machine learning pipeline that has been developed as a data-driven approach to identify areas at high risk of poaching throughout protected areas and compute optimal patrol routes. In this paper, we take an end-to-end approach to the data-to-deployment pipeline for anti-poaching. In doing so, we address challenges including extreme class imbalance (up to 1:200), bias, and uncertainty in wildlife poaching data to enhance PAWS, and we apply our methodology to three national parks with diverse characteristics. (i) We use Gaussian processes to quantify predictive uncertainty, which we exploit to improve robustness of our prescribed patrols and increase detection of snares by an average of 30%. We evaluate our approach on real-world historical poaching data from Murchison Falls and Queen Elizabeth National Parks in Uganda and, for the first time, Srepok Wildlife Sanctuary in Cambodia. (ii) We present the results of large-scale field tests conducted in Murchison Falls and Srepok Wildlife Sanctuary which confirm that the predictive power of PAWS extends promisingly to multiple parks. This paper is part of an effort to expand PAWS to 800 parks around the world through integration with SMART conservation software. 
Andrew Perrault, Bryan Wilder, Eric Ewing, Aditya Mate, Bistra Dilkina, and Milind Tambe. 2020. “End-to-End Game-Focused Learning of Adversary Behavior in Security Games.” AAAI Conference on Artificial Intelligence. New York.
Ayan Mukhopadhyay, Kai Wang, Andrew Perrault, Mykel Kochenderfer, Milind Tambe, and Yevgeniy Vorobeychik. 2020. “Robust Spatial-Temporal Incident Prediction.” Conference on Uncertainty in Artificial Intelligence (UAI). Toronto.
Shahrzad Gholami, Sara Mc Carthy, Bistra Dilkina, Andrew Plumptre, Milind Tambe, Margaret Driciru, Fred Wanyama, and Aggrey Rwetsiba. 2018. “Adversary models account for imperfect crime data: Forecasting and planning against real-world poachers (Corrected Version).” In International Conference on Autonomous Agents and Multi-agent Systems (AAMAS 2018).
Poachers are engaged in extinction-level wholesale slaughter, so it is critical to harness historical data for predicting poachers’ behavior. However, in these domains, data collected about adversarial actions are remarkably imperfect, where reported negative instances of crime may be mislabeled or uncertain. Unfortunately, past attempts to develop predictive and prescriptive models to address this problem suffer from shortcomings from a modeling perspective as well as in the implementability of their techniques. Most notably, these models (i) neglect the uncertainty in crime data, leading to inaccurate and biased predictions of adversary behavior, (ii) use coarse-grained crime analysis, and (iii) do not provide a convincing evaluation as they only look at a single protected area. Additionally, they (iv) proposed time-consuming techniques which cannot be directly integrated into low-resource outposts. In this innovative application paper, we (I) introduce iWare-E, a novel imperfect-observation aWare Ensemble technique designed to handle the uncertainty in crime information efficiently. This approach leads to superior accuracy and efficiency for adversary behavior prediction compared to the previous state-of-the-art. We also demonstrate the country-wide efficiency of the models and are the first to (II) evaluate our adversary behavioral model across different protected areas in Uganda, i.e., Murchison Falls and Queen Elizabeth National Parks (totaling about 7500 km²), as well as (III) on fine-grained temporal resolutions. Lastly, (IV) we provide a scalable planning algorithm to design fine-grained patrol routes for the rangers, which achieves up to 150% improvement in number of predicted attacks detected.
Fei Fang, Thanh H. Nguyen, Rob Pickles, Wai Y. Lam, Gopalasamy R. Clements, Bo An, Amandeep Singh, Brian C. Schwedock, Milind Tambe, and Andrew Lemieux. 2017. “PAWS: A Deployed Game-Theoretic Application to Combat Poaching.” AI Magazine 38(1):23-36.
Poaching is considered a major driver for the population drop of key species such as tigers, elephants, and rhinos, which can be detrimental to whole ecosystems. While conducting foot patrols is the most commonly used approach in many countries to prevent poaching, such patrols often do not make the best use of the limited patrolling resources. This paper presents PAWS, a game-theoretic application deployed in Southeast Asia for optimizing foot patrols to combat poaching. In this paper, we report on the significant evolution of PAWS from a proposed decision aid introduced in 2014 to a regularly deployed application. We outline key technical advances that led to PAWS’s regular deployment: (i) incorporating complex topographic features, e.g., ridgelines, in generating patrol routes; (ii) handling uncertainties in species distribution (game-theoretic payoffs); (iii) ensuring scalability for patrolling large-scale conservation areas with fine-grained guidance; and (iv) handling complex patrol scheduling constraints.
S Gholami, B Ford, F Fang, A Plumptre, M Tambe, M Driciru, F Wanyama, A Rwetsiba, M Nsubaga, and J Mabonga. 2017. “Taking it for a Test Drive: A Hybrid Spatio-temporal Model for Wildlife Poaching Prediction Evaluated through a Controlled Field Test.” In The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2017 Applied Data Science Track).
Worldwide, conservation agencies employ rangers to protect conservation areas from poachers. However, agencies lack the manpower to have rangers effectively patrol these vast areas frequently. While past work has modeled poachers’ behavior so as to aid rangers in planning future patrols, those models’ predictions were not validated by extensive field tests. In this paper, we present a hybrid spatio-temporal model that predicts poaching threat levels and results from a five-month field test of our model in Uganda’s Queen Elizabeth Protected Area (QEPA). To our knowledge, this is the first time that a predictive model has been evaluated through such an extensive field test in this domain. We present two major contributions. First, our hybrid model consists of two components: (i) an ensemble model which can work with the limited data common to this domain and (ii) a spatio-temporal model to boost the ensemble’s predictions when sufficient data are available. When evaluated on real-world historical data from QEPA, our hybrid model achieves significantly better performance than previous approaches with either temporally-aware dynamic Bayesian networks or an ensemble of spatially-aware models. Second, in collaboration with the Wildlife Conservation Society and Uganda Wildlife Authority, we present results from a five-month controlled experiment where rangers patrolled over 450 sq km across QEPA. We demonstrate that our model successfully predicted (1) where snaring activity would occur and (2) where it would not occur; in areas where we predicted a high rate of snaring activity, rangers found more snares and snared animals than in areas of lower predicted activity. These findings demonstrate that (1) our model’s predictions are selective, (2) our model’s superior laboratory performance extends to the real world, and (3) these predictive models can aid rangers in focusing their efforts to prevent wildlife poaching and save animals.
Thanh H. Nguyen, Arunesh Sinha, Shahrzad Gholami, Andrew Plumptre, Lucas Joppa, Milind Tambe, Margaret Driciru, Fred Wanyama, Aggrey Rwetsiba, Rob Critchlow, and Colin Beale. 2016. “CAPTURE: A New Predictive Anti-Poaching Tool for Wildlife Protection.” In 15th International Conference on Autonomous Agents and Multiagent Systems (AAMAS).
Wildlife poaching presents a serious extinction threat to many animal species. Agencies (“defenders”) focused on protecting such animals need tools that help analyze, model and predict poacher activities, so they can more effectively combat such poaching; such tools could also assist in planning effective defender patrols, building on the previous security games research. To that end, we have built a new predictive anti-poaching tool, CAPTURE (Comprehensive Anti-Poaching tool with Temporal and observation Uncertainty REasoning). CAPTURE provides four main contributions. First, CAPTURE’s modeling of poachers provides significant advances over previous models from behavioral game theory and conservation biology. This accounts for: (i) the defender’s imperfect detection of poaching signs; (ii) complex temporal dependencies in the poacher’s behaviors; (iii) lack of knowledge of numbers of poachers. Second, we provide two new heuristics, parameter separation and target abstraction, to reduce the computational complexity in learning the poacher models. Third, we present a new game-theoretic algorithm for computing the defender’s optimal patrolling given the complex poacher model. Finally, we present detailed models and analysis of real-world poaching data collected over 12 years in Queen Elizabeth National Park in Uganda to evaluate our new model’s prediction accuracy. This paper thus presents the largest dataset of real-world defender-adversary interactions analyzed in the security games literature. CAPTURE will be tested in Uganda in early 2016.
Fei Fang, Thanh H. Nguyen, Rob Pickles, Wai Y. Lam, Gopalasamy R. Clements, Bo An, Amandeep Singh, Milind Tambe, and Andrew Lemieux. 2016. “Deploying PAWS: Field Optimization of the Protection Assistant for Wildlife Security.” In Twenty-Eighth Innovative Applications of Artificial Intelligence Conference.
Poaching is a serious threat to the conservation of key species and whole ecosystems. While conducting foot patrols is the most commonly used approach in many countries to prevent poaching, such patrols often do not make the best use of limited patrolling resources. To remedy this situation, prior work introduced a novel emerging application called PAWS (Protection Assistant for Wildlife Security); PAWS was proposed as a game-theoretic (“security games”) decision aid to optimize the use of patrolling resources. This paper reports on PAWS’s significant evolution from a proposed decision aid to a regularly deployed application, reporting on the lessons from the first tests in Africa in Spring 2014, through its continued evolution since then, to current regular use in Southeast Asia and plans for future worldwide deployment. In this process, we have worked closely with two NGOs (Panthera and Rimba) and incorporated extensive feedback from professional patrolling teams. We outline key technical advances that led to PAWS’s regular deployment: (i) incorporating complex topographic features, e.g., ridgelines, in generating patrol routes; (ii) handling uncertainties in species distribution (game-theoretic payoffs); (iii) ensuring scalability for patrolling large-scale conservation areas with fine-grained guidance; and (iv) handling complex patrol scheduling constraints.
D. Kar, F. Fang, F. Delle Fave, N. Sintov, M. Tambe, and A. Van Wissen. 2015. “Effectiveness of Probability Perception Modeling and Defender Strategy Generation Algorithms in Repeated Stackelberg Games: An Initial Report.” In Computational Sustainability Workshop at AAAI’15, Austin, Texas.
While human behavior models based on repeated Stackelberg games have been proposed for domains such as “wildlife crime” where there is repeated interaction between the defender and the adversary, there has been no empirical study with human subjects to show the effectiveness of such models. This paper presents an initial study based on extensive human subject experiments with participants on Amazon Mechanical Turk (AMT). Our findings include: (i) attackers may view the defender’s coverage probability in a non-linear fashion; specifically it follows an S-shaped curve, and (ii) there are significant losses in defender utility when strategies generated by existing models are deployed in repeated Stackelberg game settings against human subjects.
Thanh H. Nguyen, Francesco M. Delle Fave, Debarun Kar, Aravind S. Lakshminarayanan, Amulya Yadav, Milind Tambe, Noa Agmon, Andrew J. Plumptre, Margaret Driciru, Fred Wanyama, and Aggrey Rwetsiba. 2015. “Making the most of Our Regrets: Regret-based Solutions to Handle Payoff Uncertainty and Elicitation in Green Security Games.” In Conference on Decision and Game Theory for Security.
Recent research on Green Security Games (GSG), i.e., security games for the protection of wildlife, forest and fisheries, relies on the promise of an abundance of available data in these domains to learn adversary behavioral models and determine game payoffs. This research suggests that adversary behavior models (capturing bounded rationality) can be learned from real-world data on where adversaries have attacked, and that game payoffs can be determined precisely from data on animal densities. However, previous work has, as yet, failed to demonstrate the usefulness of these behavioral models in capturing adversary behaviors based on real-world data in GSGs. Previous work has also been unable to address situations where available data is insufficient to accurately estimate behavioral models or to obtain the required precision in the payoff values. In addressing these limitations, as our first contribution, this paper, for the first time, provides validation of the aforementioned adversary behavioral models based on real-world data from a wildlife park in Uganda. Our second contribution addresses situations where real-world data is not precise enough to determine exact payoffs in GSG, by providing the first algorithm to handle payoff uncertainty in the presence of adversary behavioral models. This algorithm is based on the notion of minimax regret. Furthermore, in scenarios where the data is not even sufficient to learn adversary behaviors, our third contribution is to provide a novel algorithm to address payoff uncertainty assuming a perfectly rational attacker (instead of relying on a behavioral model); this algorithm allows for a significant scaleup for large security games. Finally, to reduce the problems due to paucity of data, given mobile sensors such as Unmanned Aerial Vehicles (UAV), we introduce new payoff elicitation strategies to strategically reduce uncertainty.
Fei Fang, Peter Stone, and Milind Tambe. 2015. “When Security Games Go Green: Designing Defender Strategies to Prevent Poaching and Illegal Fishing.” In International Joint Conference on Artificial Intelligence (IJCAI).
Building on the successful applications of Stackelberg Security Games (SSGs) to protect infrastructure, researchers have begun focusing on applying game theory to green security domains such as protection of endangered animals and fish stocks. Previous efforts in these domains optimize defender strategies based on the standard Stackelberg assumption that the adversaries become fully aware of the defender’s strategy before taking action. Unfortunately, this assumption is inappropriate since adversaries in green security domains often lack the resources to fully track the defender strategy. This paper (i) introduces Green Security Games (GSGs), a novel game model for green security domains with a generalized Stackelberg assumption; (ii) provides algorithms to plan effective sequential defender strategies — such planning was absent in previous work; (iii) proposes a novel approach to learn adversary models that further improves defender performance; and (iv) provides detailed experimental analysis of proposed approaches.
Benjamin Ford, Debarun Kar, Francesco M. Delle Fave, Rong Yang, and Milind Tambe. 5/2014. “PAWS: Adaptive Game-theoretic Patrolling for Wildlife Protection (Demonstration).” In Conference on Autonomous Agents and Multiagent Systems (AAMAS).
Endangered species around the world are in danger of extinction from poaching. From the start of the 20th century, the African rhino population has dropped over 98% [4] and the global tiger population has dropped over 95% [5], resulting in multiple species extinctions in both groups. Species extinctions have negative consequences on local ecosystems, economies, and communities. To protect these species, countries have set up conservation agencies and national parks, such as Uganda’s Queen Elizabeth National Park (QENP). However, a common lack of funding for these agencies results in a lack of law enforcement resources to protect these large, rural areas. As an example of the scale of disparity, one wildlife crime study in 2007 reported an actual coverage density of one ranger per 167 square kilometers [2]. Because of the hazards involved (e.g., armed poachers, wild animals), rangers patrol in groups, further increasing the amount of area they are responsible for patrolling. Security game research has typically been concerned with combating terrorism, and this field has indeed benefited from a range of successfully deployed applications [1, 6]. These applications have enabled security agencies to make more efficient use of their limited resources. In this previous research, adversary data has been absent during the development of these solutions, and thus, it has been difficult to make accurate adversary behavior models during algorithm development. In a domain such as wildlife crime, interactions with the adversary are frequent and repeated, thus enabling conservation agencies to collect data. This presence of data enables security game researchers to begin developing algorithms that incorporate this data into, potentially, more accurate behavior models and consequently better security solutions. Developed in conjunction with staff at QENP, the Protection Assistant for Wildlife Security (PAWS) generates optimized defender strategies for use by park rangers [7]. 
Due to the repeated nature of wildlife crime, PAWS is able to leverage crime event data - a previously unrealized capability in security games research. Thus, PAWS implements a novel adaptive algorithm that processes crime event data, builds multiple human behavior models, and, based on those models, predicts where adversaries will attack next. These predictions are then used to generate a patrol strategy for the rangers (i.e., a set of patrol waypoints) that can be viewed on a GPS unit. Against this background, the demonstration presented in this paper introduces two contributions. First, we present the PAWS system which incorporates the algorithm in [7] into a scheduling system and a GPS visualizer. Second, we present a software interface to run a number of human subject experiments (HSE) to evaluate and improve the efficacy of PAWS before its deployment in QENP. By conducting these HSEs, we can (i) test the PAWS algorithms with repeated interactions with humans, thus providing a more realistic testing environment than in its previous simulations; (ii) generate data that can be used to initialize PAWS’s human behavior models for deployment, and (iii) compare the current PAWS algorithms’ performance to alternatives and determine if additional improvements are needed prior to deployment. To provide proper context for the presentation, this paper also presents a brief overview of the PAWS system data flow and its adaptive algorithms. The demonstration will engage audience members by having them participate in the HSEs and using the GPS unit to visualize a patrol schedule in QENP.