2020

Rachel Guo, Lily Xu, Drew Cronin, Francis Okeke, Andrew Plumptre, and Milind Tambe. 12/12/2020. “Enhancing Poaching Predictions for Under-Resourced Wildlife Conservation Parks Using Remote Sensing Imagery”. Publisher's Version
Xinrun Wang, Tarun Nair, Haoyang Li, Rueben Wong, Nachiket Kelkar, Srinivas Vaidyanathan, Rajat Nayak, Bo An, Jagdish Krishnaswamy, and Milind Tambe. 12/6/2020. “Efficient Reservoir Management through Deep Reinforcement Learning.” In NeurIPS 2020 Workshop on AI for Earth Sciences. google_dam_arxiv.pdf
Kai Wang, Bryan Wilder, Andrew Perrault, and Milind Tambe. 12/5/2020. “Automatically Learning Compact Quality-aware Surrogates for Optimization Problems.” In NeurIPS 2020 (spotlight). Vancouver, Canada. Abstract:
Solving optimization problems with unknown parameters often requires learning a predictive model to predict the values of the unknown parameters and then solving the problem using these values. Recent work has shown that including the optimization problem as a layer in the model training pipeline results in predictions of the unobserved parameters that lead to higher decision quality. Unfortunately, this process comes at a large computational cost because the optimization problem must be solved and differentiated through in each training iteration; furthermore, it may also sometimes fail to improve solution quality due to non-smoothness issues that arise when training through a complex optimization layer. To address these shortcomings, we learn a low-dimensional surrogate model of a large optimization problem by representing the feasible space in terms of meta-variables, each of which is a linear combination of the original variables. By training a low-dimensional surrogate model end-to-end, and jointly with the predictive model, we achieve: i) a large reduction in training and inference time; and ii) improved performance by focusing attention on the more important variables in the optimization and learning in a smoother space. Empirically, we demonstrate these improvements on a non-convex adversary modeling task, a submodular recommendation task and a convex portfolio optimization task.
automatically_learning_compact_quality_aware_surrogates_for_optimization_problems.pdf
Aditya Mate*, Jackson A. Killian*, Haifeng Xu, Andrew Perrault, and Milind Tambe. 12/5/2020. “Collapsing Bandits and Their Application to Public Health Interventions.” In Advances in Neural Information Processing Systems (NeurIPS). Vancouver, Canada. Publisher's Version. Abstract:
We propose and study Collapsing Bandits, a new restless multi-armed bandit (RMAB) setting in which each arm follows a binary-state Markovian process with a special structure: when an arm is played, the state is fully observed, thus “collapsing” any uncertainty, but when an arm is passive, no observation is made, thus allowing uncertainty to evolve. The goal is to keep as many arms in the “good” state as possible by planning a limited budget of actions per round. Such Collapsing Bandits are natural models for many healthcare domains in which health workers must simultaneously monitor patients and deliver interventions in a way that maximizes the health of their patient cohort. Our main contributions are as follows: (i) Building on the Whittle index technique for RMABs, we derive conditions under which the Collapsing Bandits problem is indexable. Our derivation hinges on novel conditions that characterize when the optimal policies may take the form of either “forward” or “reverse” threshold policies. (ii) We exploit the optimality of threshold policies to build fast algorithms for computing the Whittle index, including a closed form. (iii) We evaluate our algorithm on several data distributions including data from a real-world healthcare task in which a worker must monitor and deliver interventions to maximize their patients’ adherence to tuberculosis medication. Our algorithm achieves a 3-order-of-magnitude speedup compared to state-of-the-art RMAB techniques, while achieving similar performance.
collapsing_bandits_arXiv.pdf
Ankit Bhardwaj*, Han Ching Ou*, Haipeng Chen, Shahin Jabbari, Milind Tambe, Rahul Panicker, and Alpan Raval. 11/2020. “Robust Lock-Down Optimization for COVID-19 Policy Guidance.” In AAAI Fall Symposium. robust_lock-down_optimization_for_covid-19_policy_guidance.pdf
Daniel B. Larremore, Bryan Wilder, Evan Lester, Soraya Shehata, James M. Burke, James A. Hay, Milind Tambe, Michael J. Mina, and Roy Parker. 11/2020. “Test sensitivity is secondary to frequency and turnaround time for COVID-19 screening.” Science Advances. Publisher's Version. Abstract:
The COVID-19 pandemic has created a public health crisis. Because SARS-CoV-2 can spread from individuals with pre-symptomatic, symptomatic, and asymptomatic infections, the re-opening of societies and the control of virus spread will be facilitated by robust population screening, for which virus testing will often be central. After infection, individuals undergo a period of incubation during which viral titers are usually too low to detect, followed by an exponential viral growth, leading to a peak viral load and infectiousness, and ending with declining viral levels and clearance. Given the pattern of viral load kinetics, we model the effectiveness of repeated population screening considering test sensitivities, frequency, and sample-to-answer reporting time. These results demonstrate that effective screening depends largely on frequency of testing and the speed of reporting, and is only marginally improved by high test sensitivity. We therefore conclude that screening should prioritize accessibility, frequency, and sample-to-answer time; analytical limits of detection should be secondary.
scienceadvances_full_.pdf
Omkar Thakoor, Shahin Jabbari, Palvi Aggarwal, Cleotilde Gonzalez, Milind Tambe, and Phebe Vayanos. 10/2020. “Exploiting Bounded Rationality in Risk-based Cyber Camouflage Games.” Conference on Decision and Game Theory for Security. Abstract:
Recent work has increasingly shown that cyber deception can effectively impede the reconnaissance efforts of intelligent cyber attackers. Recently proposed models to optimize a deceptive defense based on camouflaging network and system attributes have shown effective numerical results on simulated data. However, these models possess a fundamental drawback due to the assumption that an attempted attack is always successful; as a direct consequence of the deceptive strategies being deployed, the attacker runs a significant risk that the attack fails. Further, this risk or uncertainty in the rewards magnifies the boundedly rational behavior in humans, which the previous models do not handle. To that end, we present Risk-based Cyber Camouflage Games, a general-sum game model that captures the uncertainty in the attack's success. First, in the case of rational attackers, we show that optimal defender strategy computation is NP-hard even in the zero-sum case. We provide an MILP formulation for the general problem with constraints on cost and feasibility, along with a pseudo-polynomial time algorithm for the special unconstrained setting. Second, for risk-averse attackers, we present a solution based on prospect-theoretic modeling along with a robust variant that minimizes regret. Third, we propose a solution that does not rely on the attacker behavior model or past data, and is effective for the broad setting of strictly competitive games where previous solutions against bounded rationality prove ineffective. Finally, we provide numerical results showing that our solutions effectively lower the defender loss.
bounded_rationality_ccg.pdf
Palvi Aggarwal, Omkar Thakoor, Aditya Mate, Milind Tambe, Edward A. Cranford, Christian Lebiere, and Cleotilde Gonzalez. 10/2020. “An Exploratory Study of a Masking Strategy of Cyberdeception Using CyberVAN.” In 64th Human Factors and Ergonomics Society (HFES) Annual Conference. Abstract:
During the network reconnaissance process, attackers scan the network to gather information before launching an attack. This is a good chance for defenders to use deception and disrupt the attacker’s learning process. In this paper, we present an exploratory experiment to test the effectiveness of a masking strategy (compared to a random masking strategy) to reduce the utility of attackers. A total of 30 human participants (in the role of attackers) are randomly assigned to one of the two experimental conditions: Optimal or Random (15 in each condition). Attackers appeared to be more successful in launching attacks in the optimal condition compared to the random condition but the total score of attackers was not different from the random masking strategy. Most importantly, we found a generalized tendency to act according to the certainty bias (or risk aversion). These observations will help to improve the current state-of-the-art masking algorithms of cyberdefense.
Jackson A. Killian, Aditya Mate, Haifeng Xu, Andrew Perrault, and Milind Tambe. 10/2020. “Personalized Adherence Management in TB: Using AI to Schedule Targeted Interventions.” 51st Union World Conference on Lung Health - Accepted Oral Abstract.
Bryan Wilder, Marie Charpignon, Jackson A Killian, Han-Ching Ou, Aditya Mate, Shahin Jabbari, Andrew Perrault, Angel Desai, Milind Tambe, and Maimuna S. Majumder. 9/24/2020. “Modeling between-population variation in COVID-19 dynamics in Hubei, Lombardy, and New York City.” Proceedings of the National Academy of Sciences. Publisher's Version pnas_full.pdf
Anthony Fulginiti, Avi Segal, Jennifer Wilson, Chyna Hill, Milind Tambe, Carl Castro, and Eric Rice. 8/31/2020. “Getting to the root of the problem: A decision-tree analysis for suicide risk among young people experiencing homelessness.” Journal of the Society for Social Work and Research (to appear). Abstract:
Objective: The assessment and prediction of suicide risk among young people experiencing homelessness (YEH) has proven difficult. Although a large number of suicide risk factors have been identified, there is limited guidance about their relative importance and the combinations of factors (i.e., profiles) that heighten risk. Method: Using survey and social network methods, we gathered information about 940 YEH and their relationships. We then used a machine learning approach to construct Classification and Regression Tree models to predict suicidal ideation and suicide attempts. Results: Thirteen variables were important correlates in the decision tree models. This included prominent individual risk factors (e.g., trauma, depression), but over half of them were social network factors (e.g., hard drug use). For suicidal ideation, the model had an area under the receiver operating characteristic curve (AUC) value of 0.79, with Accuracy of 68%, Sensitivity of 48%, and Specificity of 73%. For suicide attempt, the model had an AUC value of 0.86, with Accuracy of 71%, Sensitivity of 68%, and Specificity of 72%. Conclusions: Effective suicide prevention programming should target the syndemic that threatens YEH (i.e., co-occurrence of trauma-depression-substance use-violence), including social norms in their environments. With refinement, our decision trees may be useful aids for suicide risk screening and guiding targeted intervention.
rootoftheproblem_acceptedversion.pdf
Jackson A. Killian, Marie Charpignon, Bryan Wilder, Andrew Perrault, Milind Tambe, and Maimuna S. Majumder. 8/24/2020. “Evaluating COVID-19 Lockdown and Business-Sector-Specific Reopening Policies for Three US States.” In KDD 2020 Workshop on Humanitarian Mapping. Publisher's Version. Abstract:
Background: The United States has been particularly hard-hit by COVID-19, accounting for approximately 30% of all global cases and deaths from the disease that have been reported as of May 20, 2020. We extended our agent-based model for COVID-19 transmission to study the effect of alternative lockdown and reopening policies on disease dynamics in Georgia, Florida, and Mississippi. Specifically, for each state we simulated the spread of the disease had the state enforced its lockdown approximately one week earlier than it did. We also simulated Georgia's reopening plan under various levels of physical distancing if enacted in each state, making projections until June 15, 2020.

Methods: We used an agent-based SEIR model that uses population-specific age distribution, household structure, contact patterns, and comorbidity rates to perform tailored simulations for each region. The model was first calibrated to each state using publicly available COVID-19 death data as of April 23, then implemented to simulate given lockdown or reopening policies.

Results: Our model estimated that imposing lockdowns one week earlier could have resulted in hundreds fewer COVID-19-related deaths in the context of all three states. These estimates quantify the effect of early action, a key metric to weigh in developing prospective policies to combat a potential second wave of infection in each of these states. Further, when simulating Georgia’s plan to reopen select businesses as of April 27, our model found that a reopening policy that includes physical distancing to ensure no more than 25% of pre-lockdown contact rates at reopened businesses could allow limited economic activity to resume in any of the three states, while also eventually flattening the curve of COVID-19-related deaths by June 15, 2020.
covid_19_us_states.pdf
Aditya Mate, Jackson A. Killian, Bryan Wilder, Marie Charpignon, Ananya Awasthi, Milind Tambe, and Maimuna S. Majumder. 8/2020. “Evaluating COVID-19 Lockdown Policies for India: A Preliminary Modeling Assessment for Individual States.” KDD 2020 Workshop on Humanitarian Mapping. Abstract:
Background: On March 24, India ordered a 3-week nationwide lockdown in an effort to control the spread of COVID-19. While the lockdown has been effective, our model suggests that completely ending the lockdown after three weeks could have considerable adverse public health ramifications. We extend our individual-level model for COVID-19 transmission [1] to study the disease dynamics in India at the state level for Maharashtra and Uttar Pradesh to estimate the effect of further lockdown policies in each region. Specifically, we test policies which alternate between total lockdown and simple physical distancing to find "middle ground" policies that can provide social and economic relief as well as salutary population-level health effects. Methods: We use an agent-based SEIR model that uses population-specific age distribution, household structure, contact patterns, and comorbidity rates to perform tailored simulations for each region. The model is first calibrated to each region using publicly available COVID-19 death data, then implemented to simulate a range of policies. We also compute the basic reproduction number R0 and case documentation rate for both regions. Results: After the initial lockdown, our simulations demonstrate that even policies that enforce strict physical distancing while returning to normal activity could lead to widespread outbreaks in both states. However, "middle ground" policies that alternate weekly between total lockdown and physical distancing may lead to much lower rates of infection while simultaneously permitting some return to normalcy.
kdd_workshop_india_modeling_cam_ready.pdf
Edward A. Cranford, Cleotilde Gonzalez, Palvi Aggarwal, Sarah Cooney, Milind Tambe, and Christian Lebiere. 7/30/2020. “Toward Personalized Deceptive Signaling for Cyber Defense Using Cognitive Models.” Topics in Cognitive Science, 12, 3, Pp. 992-1011. Publisher's Version. Abstract:
Recent research in cybersecurity has begun to develop active defense strategies using game-theoretic optimization of the allocation of limited defenses combined with deceptive signaling. These algorithms assume rational human behavior. However, human behavior in an online game designed to simulate an insider attack scenario shows that humans, playing the role of attackers, attack far more often than predicted under perfect rationality. We describe an instance-based learning cognitive model, built in ACT-R, that accurately predicts human performance and biases in the game. To improve defenses, we propose an adaptive method of signaling that uses the cognitive model to trace an individual’s experience in real time. We discuss the results and implications of this adaptive signaling method for personalized defense.
Siddharth Nishtala, Harshavardhan Kamarthi, Divy Thakkar, Dhyanesh Narayanan, Anirudh Grama, Aparna Hegde, Ramesh Padmanabhan, Neha Madhiwala, Suresh Chaudhary, Balaram Ravindran, and Milind Tambe. 7/23/2020. “Missed calls, Automated Calls and Health Support: Using AI to improve maternal health outcomes by increasing program engagement.” In Harvard CRCS Workshop on AI for Social Good. armman1.pdf
Lily Xu, Andrew Perrault, Andrew Plumptre, Margaret Driciru, Fred Wanyama, Aggrey Rwetsiba, and Milind Tambe. 7/20/2020. “Game Theory on the Ground: The Effect of Increased Patrols on Deterring Poachers.” Harvard CRCS Workshop on AI for Social Good. Publisher's Version. Abstract:
Applications of artificial intelligence for wildlife protection have focused on learning models of poacher behavior based on historical patterns. However, poachers' behaviors are described not only by their historical preferences, but also by their reaction to ranger patrols. Past work applying machine learning and game theory to combat poaching has hypothesized that ranger patrols deter poachers, but has been unable to find evidence to identify how or even if deterrence occurs. Here, for the first time, we demonstrate a measurable deterrence effect on real-world poaching data. We show that increased patrols in one region deter poaching in the next timestep, but poachers then move to neighboring regions. Our findings offer guidance on how adversaries should be modeled in realistic game-theoretic settings.
poaching_deterrence.pdf
Lindsay Young, Jerome Mayaud, Sze-Chuan Suen, Milind Tambe, and Eric Rice. 7/7/2020. “Modeling the dynamism of HIV information diffusion in multiplex networks of homeless youth.” Social Networks, 63, Pp. 112-121. 1-s2.0-s0378873320300393-main.pdf
Daniel B Larremore, Bryan Wilder, Evan Lester, Soraya Shehata, James M Burke, James A Hay, Milind Tambe, Michael J Mina, and Roy Parker. 6/25/2020. “Surveillance testing of SARS-CoV-2.” medRxiv. Publisher's Version. Abstract:
The COVID-19 pandemic has created a public health crisis. Because SARS-CoV-2 can spread from individuals with pre-symptomatic, symptomatic, and asymptomatic infections, the re-opening of societies and the control of virus spread will be facilitated by robust surveillance, for which virus testing will often be central. After infection, individuals undergo a period of incubation during which viral titers are usually too low to detect, followed by an exponential growth of virus, leading to a peak viral load and infectiousness, and ending with declining viral levels and clearance. Given the pattern of viral load kinetics, we model surveillance effectiveness considering test sensitivities, frequency, and sample-to-answer reporting time. These results demonstrate that effective surveillance, including time to first detection and outbreak control, depends largely on frequency of testing and the speed of reporting, and is only marginally improved by high test sensitivity. We therefore conclude that surveillance should prioritize accessibility, frequency, and sample-to-answer time; analytical limits of detection should be secondary.
2020.06.22.20136309v1.full_.pdf
Aida Rahmattalabi, Shahin Jabbari, Himabindu Lakkaraju, Phebe Vayanos, Eric Rice, and Milind Tambe. 6/16/2020. “Fair Influence Maximization: A Welfare Optimization Approach.” In AAAI 2020 Workshop on Health Intelligence, preliminary version. Abstract:
Several social interventions (e.g., suicide and HIV prevention) leverage social network information to maximize outreach. Algorithmic influence maximization techniques have been proposed to aid with the choice of “influencers” (often referred to as “peer leaders”) in such interventions. Traditional algorithms for influence maximization have not been designed with social interventions in mind. As a result, they may disproportionately exclude minority communities from the benefits of the intervention. This has motivated research on fair influence maximization. Existing techniques require committing to a single domain-specific fairness measure. This makes it hard for a decision maker to meaningfully compare these notions and their resulting trade-offs across different applications. We address these shortcomings by extending the principles of cardinal welfare to the influence maximization setting, which is underlain by complex connections between members of different communities. We generalize the theory regarding these principles and show under what circumstances these principles can be satisfied by a welfare function. We then propose a family of welfare functions that are governed by a single inequity aversion parameter, which allows a decision maker to study task-dependent trade-offs between fairness and total influence and effectively trade off quantities like influence gap by varying this parameter. We use these welfare functions as a fairness notion to rule out undesirable allocations. We show that the resulting optimization problem is monotone and submodular and can be solved with optimality guarantees. Finally, we carry out a detailed experimental analysis on synthetic and real social networks and show that high welfare can be achieved without sacrificing the total influence significantly. Interestingly, we can show there exist welfare functions that empirically satisfy all of the principles.
2020_teamcore_jabbari_2006_070906.pdf
Han-Ching Ou, Arunesh Sinha, Sze-Chuan Suen, Andrew Perrault, Alpan Raval, and Milind Tambe. 5/9/2020. “Who and When to Screen: Multi-Round Active Screening for Network Recurrent Infectious Diseases Under Uncertainty.” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS-20). who_and_when_to_screen.pdf
