Andrew Perrault, Bryan Wilder, Eric Ewing, Aditya Mate, Bistra Dilkina, and Milind Tambe. 2020. “End-to-End Game-Focused Learning of Adversary Behavior in Security Games.” AAAI Conference on Artificial Intelligence. New York. 1903.00958.pdf
Ilkin Bayramli, Elizabeth Bondi, and Milind Tambe. 2020. “In the Shadow of Disaster: Finding Shadows to Improve Damage Detection.” Harvard CRCS Workshop on AI for Social Good. 2020_07_teamcore_harvard_crcs_shadow_disaster_detection_2020.pdf
Harshavardhan Kamarthi, Priyesh Vijayan, Bryan Wilder, Balaraman Ravindran, and Milind Tambe. 2020. “Influence maximization in unknown social networks: Learning Policies for Effective Graph Sampling.” In International Conference on Autonomous Agents and Multiagent Systems.Abstract
A serious challenge when finding influential actors in real-world social networks is the lack of knowledge about the structure of the underlying network. Current state-of-the-art methods rely on hand-crafted sampling algorithms; these methods sample nodes and their neighbours in a carefully constructed order and choose opinion leaders from this discovered network to maximize influence spread in the (unknown) complete network. In this work, we propose a reinforcement learning framework for network discovery that automatically learns useful node and graph representations that encode important structural properties of the network. At training time, the method identifies portions of the network such that the nodes selected from this sampled subgraph can effectively influence nodes in the complete network. The realization of such transferable network structure based adaptable policies is attributed to the meticulous design of the framework that encodes relevant node and graph signatures driven by an appropriate reward scheme. We experiment with real-world social networks from four different domains and show that the policies learned by our RL agent provide a 10-36% improvement over the current state-of-the-art method.
Elizabeth Bondi, Andrew Perrault, Fei Fang, Benjamin L. Rice, Christopher D. Golden, and Milind Tambe. 2020. “Mapping for Public Health: Initial Plan for Using Satellite Imagery for Micronutrient Deficiency Prediction.” KDD 2020 Workshop on Humanitarian Mapping. kdd_humanitarian_mapping_workshop_2020_4.pdf
Aaron Ferber, Bryan Wilder, Bistra Dilkina, and Milind Tambe. 2020. “MIPaaL: Mixed Integer Program as a Layer.” In AAAI Conference on Artificial Intelligence.Abstract
Machine learning components commonly appear in larger decision-making pipelines; however, the model training process typically focuses only on a loss that measures average accuracy between predicted values and ground truth values. Decision-focused learning explicitly integrates the downstream decision problem when training the predictive model, in order to optimize the quality of decisions induced by the predictions. It has been successfully applied to several limited combinatorial problem classes, such as those that can be expressed as linear programs (LP), and submodular optimization. However, these previous applications have uniformly focused on problems with simple constraints. Here, we enable decision-focused learning for the broad class of problems that can be encoded as a mixed integer linear program (MIP), hence supporting arbitrary linear constraints over discrete and continuous variables. We show how to differentiate through a MIP by employing a cutting planes solution approach, an algorithm that iteratively tightens the continuous relaxation by adding constraints removing fractional solutions. We evaluate our new end-to-end approach on several real world domains and show that it outperforms the standard two phase approaches that treat prediction and optimization separately, as well as a baseline approach of simply applying decision-focused learning to the LP relaxation of the MIP. Lastly, we demonstrate generalization performance in several transfer learning tasks.
Ayan Mukhopadhyay, Kai Wang, Andrew Perrault, Mykel Kochenderfer, Milind Tambe, and Yevgeniy Vorobeychik. 2020. “Robust Spatial-Temporal Incident Prediction.” Conference on Uncertainty in Artificial Intelligence (UAI). Toronto. uai_1.pdf
Kai Wang, Andrew Perrault, Aditya Mate, and Milind Tambe. 2020. “Scalable Game-Focused Learning of Adversary Models: Data-to-Decisions in Network Security Games.” In International Conference on Autonomous Agents and Multi-agent Systems (AAMAS).Abstract
Previous approaches to adversary modeling in network security games (NSGs) have been caught in the paradigm of first building a full adversary model, either from expert input or historical attack data, and then solving the game. Motivated by the need to disrupt the multibillion dollar illegal smuggling networks, such as wildlife and drug trafficking, this paper introduces a fundamental shift in learning adversary behavior in NSGs by focusing on the accuracy of the model using the downstream game that will be solved. Further, the paper addresses technical challenges in building such a game-focused learning model by i) applying graph convolutional networks to NSGs to achieve tractability and differentiability and ii) using randomized block updates of the coefficients of the defender's optimization in order to scale the approach to large networks. We show that our game-focused approach yields scalability and higher defender expected utility than models trained for accuracy only.
Sanket Shah, Arunesh Sinha, Pradeep Varakantham, Andrew Perrault, and Milind Tambe. 2020. “Solving Online Threat Screening Games Using Constrained Action Space Reinforcement Learning.” AAAI Conference on Artificial Intelligence. New York.
Elizabeth Bondi, Hoon Oh, Haifeng Xu, Fei Fang, Bistra Dilkina, and Milind Tambe. 2020. “To Signal or Not To Signal: Exploiting Uncertain Real-Time Information in Signaling Games for Security and Sustainability.” In AAAI conference on Artificial Intelligence. 2020_02_teamcore_aaai_signaluncertainty.pdf
Elizabeth Bondi. 2020. “Vision for Decisions: Utilizing Uncertain Real-Time Information and Signaling for Conservation.” In AAMAS Doctoral Consortium. 2020_12_teamcore_aamas_doctoral_consortium.pdf
Edward A. Cranford, Cleotilde Gonzalez, Palvi Aggarwal, Milind Tambe, and Christian Lebiere. 2020. “What Attackers Know and What They Have to Lose: Framing Effects on Cyber-attacker Decision Making.” In 64th Human Factors and Ergonomics Society (HFES) Annual Conference. cranford_hfes2020_final_20200602.pdf
Jackson Killian, Bryan Wilder, Amit Sharma, Vinod Choudhary, Bistra Dilkina, and Milind Tambe. 8/4/2019. “Learning to Prescribe Interventions for Tuberculosis Patients Using Digital Adherence Data.” In The 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 8/4/2019. Abstract
Digital Adherence Technologies (DATs) are an increasingly popular method for verifying patient adherence to many medications.
We analyze data from one city served by 99DOTS, a phone-callbased DAT deployed for Tuberculosis (TB) treatment in India where
nearly 3 million people are afflicted with the disease each year. The
data contains nearly 17,000 patients and 2.1M dose records. We lay
the groundwork for learning from this real-world data, including
a method for avoiding the effects of unobserved interventions in
training data used for machine learning. We then construct a deep
learning model, demonstrate its interpretability, and show how it
can be adapted and trained in three different clinical scenarios to
better target and improve patient care. In the real-time risk prediction setting our model could be used to proactively intervene with
21% more patients and before 76% more missed doses than current
heuristic baselines. For outcome prediction, our model performs
40% better than baseline methods, allowing cities to target more
resources to clinics with a heavier burden of patients at risk of failure. Finally, we present a case study demonstrating how our model
can be trained in an end-to-end decision focused learning setting to
achieve 15% better solution quality in an example decision problem
faced by health workers.
Elizabeth Bondi. 2019. “Bridging the Gap Between High-Level Reasoning in Strategic Agent Coordination and Low-Level Agent Development .” In International Conference on Autonomous Agents and Multiagent Systems (Doctoral Consortium) (AAMAS-19).Abstract
Recent advances in fields such as computer vision and natural language processing have paved the way for developing agents capable of automatically interpreting their surrounding environment. Concurrently, advances in artificial intelligence have made the coordination of many such agents possible. However, there is little work considering both the low-level reasoning that allows agents to interpret their environment, such as deep learning techniques, and the high-level reasoning that coordinates such agents. By considering both together, we can better handle real-world scenarios, for example by planning at a high level with low-level uncertainty in mind, or even by improving low-level processing by using high level reasoning to place the agent in the best scenario for success.
Elizabeth Bondi, Hoon Oh, Haifeng Xu, Fei Fang, Bistra Dilkina, and Milind Tambe. 2019. “Broken Signals in Security Games: Coordinating Mobile Patrollers and Sensors in the Real World (Extended Abstract).” In International Conference on Autonomous Agents and Multiagent Systems (Extended Abstract) (AAMAS-19).Abstract
Mobile sensors, e.g., unmanned aerial vehicles (UAVs), are becoming increasingly important in security domains and can be used for tasks such as searching for poachers in conservation areas. Such mobile sensors augment human patrollers by assisting in surveillance and in signaling potentially deceptive information to adversaries, and their coordinated deployment could be modeled via the well-known security games framework. Unfortunately, real-world uncertainty in the sensor’s detection of adversaries and adversaries’ observation of the sensor’s signals present major challenges in the sensors’ use. This leads to significant detriments in security performance. We first discuss the current shortcomings in more detail, and then propose a novel game model that incorporates uncertainty with sensors. The defender strategy in this game model will consist of three interdependent stages: an allocation stage, a signaling stage, and a reaction stage
Omkar Thakoor, Milind Tambe, Phebe Vayanos, Haifeng Xu, Christopher Kiekintveld, and Fei Feng. 2019. “Cyber Camouflage Games for Strategic Deception.” In Conference on Decision and Game Theory for Security, 2019.Abstract
 The rapid increase in cybercrime, causing a reported annual economic
loss of $600 billion (Lewis, 2018), has prompted a critical need for effective cyber defense. Strategic criminals conduct network reconnaissance prior to executing attacks to avoid detection and establish situational awareness via scanning
and fingerprinting tools. Cyber deception attempts to foil these reconnaissance
efforts by camouflaging network and system attributes to disguise valuable information. Game-theoretic models can identify decisions about strategically deceiving attackers, subject to domain constraints. For effectively deploying an optimal
deceptive strategy, modeling the objectives and the abilities of the attackers, is a
key challenge. To address this challenge, we present Cyber Camouflage Games
(CCG), a general-sum game model that captures attackers which can be diversely
equipped and motivated. We show that computing the optimal defender strategy is
NP-hard even in the special case of unconstrained CCGs, and present an efficient
approximate solution for it. We further provide an MILP formulation accelerated
with cut-augmentation for the general constrained problem. Finally, we provide
experimental evidence that our solution methods are efficient and effective.
Andrew Perrault, Bryan Wilder, Eric Ewing, Aditya Mate, Bistra Dilkina, and Milind Tambe. 2019. “Decision-Focused Learning of Adversary Behavior in Security Games.” In GAIW: Games, Agents and Incentives Workshop at International Conference on Autonomous Agents and Multiagent Systems (AAMAS-19).Abstract
Stackelberg security games are a critical tool for maximizing the
utility of limited defense resources to protect important targets
from an intelligent adversary. Motivated by green security, where
the defender may only observe an adversary’s response to defense
on a limited set of targets, we study the problem of defending
against the same adversary on a larger set of targets from the same
distribution. We give a theoretical justification for why standard
two-stage learning approaches, where a model of the adversary is
trained for predictive accuracy and then optimized against, may
fail to maximize the defender’s expected utility in this setting. We
develop a decision-focused learning approach, where the adversary behavior model is optimized for decision quality, and show
empirically that it achieves higher defender expected utility than
the two-stage approach when there is limited training data and a
large number of target features.
Nitin Kamra, Umang Gupta, Kai Wang, Fei Fang, Yan Liu, and Milind Tambe. 2019. “Deep Fictitious Play for Games with Continuous Action Spaces .” In International Conference on Autonomous Agents and Multiagent Systems (Extended Abstract) (AAMAS-19).Abstract
Fictitious play has been a classic algorithm to solve two-player adversarial games with discrete action spaces. In this work we develop an approximate extension of fictitious play to two-player games with high-dimensional continuous action spaces. We use generative neural networks to approximate players’ best responses while also learning a differentiable approximate model to the players’ rewards given their actions. Both these networks are trained jointly with gradient-based optimization to emulate fictitious play. We explore our approach in zero-sum games, non zero-sum games and security game domains.
Nitin Kamra, Umang Gupta, Kai Wang, Fei Fang, Yan Liu, and Milind Tambe. 2019. “DeepFP for Finding Nash Equilibrium in Continuous Action Spaces .” In Conference on Decision and Game Theory for Security, 2019.Abstract
Finding Nash equilibrium in continuous action spaces is a
challenging problem and has applications in domains such as protecting geographic areas from potential attackers. We present DeepFP, an
approximate extension of fictitious play in continuous action spaces.
DeepFP represents players’ approximate best responses via generative
neural networks which are highly expressive implicit density approximators. It additionally uses a game-model network which approximates
the players’ expected payoffs given their actions, and trains the networks
end-to-end in a model-based learning regime. Further, DeepFP allows using domain-specific oracles if available and can hence exploit techniques
such as mathematical programming to compute best responses for structured games. We demonstrate stable convergence to Nash equilibrium
on several classic games and also apply DeepFP to a large forest security domain with a novel defender best response oracle. We show that
DeepFP learns strategies robust to adversarial 
Shahrzad Gholami, Amulya Yadav, Long Tran-Thanh, Bistra Dilkina, and Milind Tambe. 2019. “Don’t Put All Your Strategies in One Basket: Playing Green Security Games with Imperfect Prior Knowledge .” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS-19).Abstract
Security efforts for wildlife monitoring and protection of endangered species (e.g., elephants, rhinos, etc.) are constrained by limited resources available to law enforcement agencies. Recent progress in Green Security Games (GSGs) has led to patrol planning algorithms for strategic allocation of limited patrollers to deter adversaries in environmental settings. Unfortunately, previous approaches to these problems suffer from several limitations. Most notably, (i) previous work in GSG literature relies on exploitation of error-prone machine learning (ML) models of poachers’ behavior trained on (spatially) biased historical data; and (ii) online learning approaches for repeated security games (similar to GSGs) do not account for spatio-temporal scheduling constraints while planning patrols, potentially causing significant shortcomings in the effectiveness of the planned patrols. Thus, this paper makes the following novel contributions: (I) We propose MINION-sm, a novel online learning algorithm for GSGs which does not rely on any prior error-prone model of attacker behavior, instead, it builds an implicit model of the attacker on-the-fly while simultaneously generating schedulingconstraint-aware patrols. MINION-sm achieves a sublinear regret against an optimal hindsight patrol strategy. (II) We also propose MINION, a hybrid approach where our MINION-sm model and an ML model (based on historical data) are considered as two patrol planning experts and we obtain a balance between them based on their observed empirical performance. (III) We show that our online learning algorithms significantly outperform existing state-of-theart solvers for GSGs.
Omkar Thakoor, Milind Tambe, Phebe Vayanos, Haifeng Xu, and Christopher Kiekintveld. 2019. “General-Sum Cyber Deception Games under Partial Attacker Valuation Information.” In International Conference on Autonomous Agents and Multiagent Systems (Extended Abstract) (AAMAS-19).Abstract
The rapid increase in cybercrime, causing a reported annual economic loss of $600 billion [20], has prompted a critical need for
effective cyber defense. Strategic criminals conduct network reconnaissance prior to executing attacks to avoid detection and establish
situational awareness via scanning and fingerprinting tools. Cyber
deception attempts to foil these reconnaissance efforts; by disguising network and system attributes, among several other techniques.
Cyber Deception Games (CDG) is a game-theoretic model for optimizing strategic deception, and can apply to various deception
methods. Recently introduced initial model for CDGs assumes zerosum payoffs, implying directly conflicting attacker motives, and
perfect defender knowledge on attacker preferences. These unrealistic assumptions are fundamental limitations of the initial zero-sum
model, which we address by proposing a general-sum model that
can also handle uncertainty in the defender’s knowledge.