Lily Xu. 10/24/2021. “Learning, Optimization, and Planning Under Uncertainty for Wildlife Conservation.” INFORMS Doing Good with Good OR.
Abstract
Wildlife poaching fuels the multi-billion dollar illegal wildlife trade and pushes countless species to the brink of extinction. To aid rangers in preventing poaching in protected areas around the world, we have developed PAWS, the Protection Assistant for Wildlife Security. We present technical advances in multi-armed bandits and robust sequential decision-making using reinforcement learning, with research questions that emerged from on-the-ground challenges. We also discuss bridging the gap between research and practice, presenting results from field deployment in Cambodia and large-scale deployment through integration with SMART, the leading software system for protected area management used by over 1,000 wildlife parks worldwide.
Lily Xu. 8/21/2021. “Learning and Planning Under Uncertainty for Green Security.” 30th International Joint Conference on Artificial Intelligence (IJCAI).
Lily Xu, Andrew Perrault, Fei Fang, Haipeng Chen, and Milind Tambe. 7/27/2021. “Robust Reinforcement Learning Under Minimax Regret for Green Security.” Conference on Uncertainty in Artificial Intelligence (UAI).
Abstract
Green security domains feature defenders who plan patrols in the face of uncertainty about the adversarial behavior of poachers, illegal loggers, and illegal fishers. Importantly, the deterrence effect of patrols on adversaries' future behavior makes patrol planning a sequential decision-making problem. We therefore focus on robust sequential patrol planning for green security following the minimax regret criterion, which has not previously been considered in the literature. We formulate the problem as a game between the defender and nature, who controls the parameter values of the adversarial behavior, and design an algorithm, MIRROR, to find a robust policy. MIRROR uses two reinforcement-learning-based oracles and solves a restricted game that considers a limited set of defender strategies and parameter values. We evaluate MIRROR on real-world poaching data.
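To make the minimax regret criterion concrete, here is a minimal sketch of the restricted game over a toy payoff table. All numbers below are hypothetical: in MIRROR, the defender strategies and nature's parameter values come from the two RL-based oracles rather than a fixed enumeration, and the defender may mix over strategies.

```python
import numpy as np

# Toy sketch of the minimax-regret criterion over a restricted game.
# The payoff table is hypothetical; in the paper, defender strategies and
# nature's parameter values are generated by two RL-based oracles.
rng = np.random.default_rng(0)

# U[i, j] = defender utility of policy i when nature picks parameter value j.
num_policies, num_params = 4, 3
U = rng.uniform(0, 10, size=(num_policies, num_params))

# Regret of policy i under parameters j: shortfall vs. the best policy for j.
best_per_param = U.max(axis=0)   # max_i U[i, j] for each j
regret = best_per_param - U      # regret[i, j] >= 0

# Minimax-regret (pure-strategy) choice: minimize the worst-case regret
# over the restricted set of parameter values that nature controls.
worst_case = regret.max(axis=1)
best_policy = int(worst_case.argmin())
print(f"policy {best_policy}, worst-case regret {worst_case[best_policy]:.2f}")
```

Minimizing worst-case regret, rather than worst-case utility, avoids the excessive conservatism of maximin planning when some parameter values are simply bad for every policy.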
Lily Xu, Andrew Perrault, Fei Fang, Haipeng Chen, and Milind Tambe. 5/5/2021. “Robustness in Green Security: Minimax Regret Optimality with Reinforcement Learning.” AAMAS Workshop on Autonomous Agents for Social Good.
Lily Xu, Elizabeth Bondi, Fei Fang, Andrew Perrault, Kai Wang, and Milind Tambe. 2/2021. “Dual-Mandate Patrols: Multi-Armed Bandits for Green Security.” In Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21).
Abstract
Conservation efforts in green security domains to protect wildlife and forests are constrained by the limited availability of defenders (i.e., patrollers), who must patrol vast areas to protect them from attackers (e.g., poachers or illegal loggers). Defenders must choose how much time to spend in each region of the protected area, balancing exploration of infrequently visited regions and exploitation of known hotspots. We formulate the problem as a stochastic multi-armed bandit, where each action represents a patrol strategy, enabling us to guarantee the rate of convergence of the patrolling policy. However, a naive bandit approach would compromise short-term performance for long-term optimality, resulting in animals poached and forests destroyed. To speed up performance, we leverage smoothness in the reward function and decomposability of actions. We show a synergy between Lipschitz-continuity and decomposition, as each aids the convergence of the other. In doing so, we bridge the gap between combinatorial and Lipschitz bandits, presenting a no-regret approach that tightens existing guarantees while optimizing for short-term performance. We demonstrate that our algorithm, LIZARD, improves performance on real-world poaching data from Cambodia.
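The two ingredients the abstract names, decomposability and Lipschitz-continuity, can be sketched in a few lines. Everything below (the effort grid, Lipschitz constant, reward model, and confidence bound) is an illustrative assumption, not the paper's exact formulation of LIZARD.

```python
import numpy as np

# Sketch of a decomposed UCB bandit with Lipschitz tightening:
#  1. decomposability: total reward is a sum over regions, so each
#     (region, effort level) arm is estimated separately;
#  2. Lipschitz-continuity: arms at nearby effort levels tighten each
#     other's upper confidence bounds.
rng = np.random.default_rng(0)
levels = np.linspace(0.0, 1.0, 5)   # discretized patrol effort per region
R, K = 3, len(levels)               # regions, effort levels
L = 1.0                             # assumed Lipschitz constant
steps_per_round = 4                 # budget = 4 effort increments of 0.25

counts = np.zeros((R, K))
sums = np.zeros((R, K))

def true_reward(r, effort):
    # Hypothetical smooth ground truth standing in for poacher response.
    return effort * (r + 1) / R

for t in range(1, 301):
    # Optimistic estimates (default 1.0 for unplayed arms).
    mean = np.where(counts > 0, sums / np.maximum(counts, 1), 1.0)
    conf = np.where(counts > 0,
                    np.sqrt(2 * np.log(t) / np.maximum(counts, 1)), 1.0)
    ucb = mean + conf
    # Lipschitz tightening: ucb[r,k] <= ucb[r,k'] + L * |levels[k]-levels[k']|.
    dist = np.abs(levels[:, None] - levels[None, :])
    ucb = (ucb[:, None, :] + L * dist[None, :, :]).min(axis=2)
    # Greedy allocation of effort increments across regions -- a simple
    # heuristic enabled by the reward decomposing as a sum over regions.
    alloc = np.zeros(R, dtype=int)
    for _ in range(steps_per_round):
        gains = [ucb[r, alloc[r] + 1] - ucb[r, alloc[r]]
                 if alloc[r] + 1 < K else -np.inf for r in range(R)]
        alloc[int(np.argmax(gains))] += 1
    # Pull the chosen arms and update per-region statistics.
    for r in range(R):
        obs = true_reward(r, levels[alloc[r]]) + rng.normal(0, 0.1)
        counts[r, alloc[r]] += 1
        sums[r, alloc[r]] += obs

print("learned allocation:", levels[alloc])
```

The Lipschitz step caps each arm's upper confidence bound using its neighbors, so an observation at one effort level also shrinks uncertainty at nearby levels, while decomposability lets the budget be allocated region by region instead of over exponentially many joint patrol strategies.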