Publications

2024
Sanket Shah, Arun Suggala, Milind Tambe, and Aparna Taneja. 5/1/2024. “Efficient Public Health Intervention Planning Using Decomposition-Based Decision-Focused Learning.” International Conference on Autonomous Agents and Multiagent Systems (AAMAS). Auckland, New Zealand. Abstract

The declining participation of beneficiaries over time is a key concern in public health programs. A popular strategy for improving retention is to have health workers 'intervene' on beneficiaries at risk of dropping out. However, the availability and time of these health workers are limited resources. As a result, there has been a line of research on optimizing these limited intervention resources using Restless Multi-Armed Bandits (RMABs). The key technical barrier to using this framework in practice lies in estimating the beneficiaries' RMAB parameters from historical data. Recent research on Decision-Focused Learning (DFL) has shown that estimating parameters that maximize beneficiaries' cumulative returns, rather than predictive accuracy, is essential to good performance.

Unfortunately, these gains come at a high computational cost because of the need to solve and evaluate the RMAB in each DFL training step. Consequently, past approaches may not be sustainable for the NGOs that manage such programs in the long run, given that they operate under resource constraints. In this paper, we provide a principled way to exploit the structure of RMABs to speed up DFL by decoupling intervention planning for different beneficiaries. We use real-world data from an Indian NGO, ARMMAN, to show that our approach is up to two orders of magnitude faster than the state-of-the-art approach while also yielding superior model performance. This enables computationally efficient solutions, giving NGOs the ability to deploy such solutions to serve potentially millions of mothers, ultimately advancing progress toward UNSDG 3.1.

aamas-24-exact_dfl_for_rmabs_camera_ready.pdf
Soumyabrata Pal, Milind Tambe, Arun Suggala, Karthikeyan Shanmugam, and Aparna Taneja. 5/1/2024. “Improving Mobile Maternal and Child Health Care Programs: Collaborative Bandits for Time Slot Selection.” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS). kilikari_aamas_camera_ready_1.pdf
Jackson A. Killian, Manish Jain, Yugang Jia, Jonathan Amar, Erich Huang, and Milind Tambe. 3/15/2024. “New Approach to Equitable Intervention Planning to Improve Engagement and Outcomes in a Digital Health Program: Simulation Study.” JMIR Diabetes, 9. Publisher's Version
Arshika Lalan, Paula Rodriguez Diaz, Panayiotis Danassis, Amrita Mahale, Kumar Madhu Sudan, Aparna Hegde, Milind Tambe, and Aparna Taneja. 2/28/2024. “Improving Health Information Access in the World’s Largest Maternal Mobile Health Program via Bandit Algorithms.” In Innovative Applications of Artificial Intelligence (IAAI). iaai_2024_kilkari_camera_ready_1.pdf
Sanket Shah, Bryan Wilder, Andrew Perrault, and Milind Tambe. 2/20/2024. “Leaving the Nest: Going Beyond Local Loss Functions for Predict-Then-Optimize.” AAAI Conference on Artificial Intelligence (AAAI). Vancouver, BC. Abstract

Predict-then-Optimize is a framework for using machine learning to perform decision-making under uncertainty. The central research question it asks is, “How can we use the structure of a decision-making task to tailor ML models for that specific task?” To this end, recent work has proposed learning task-specific loss functions that capture this underlying structure. However, current approaches make restrictive assumptions about the form of these losses and their impact on ML model behavior. These assumptions lead to approaches with high computational cost and, when they are violated in practice, poor performance. In this paper, we propose solutions to these issues, avoiding the aforementioned assumptions and utilizing the ML model’s features to increase the sample efficiency of learning loss functions. We empirically show that our method achieves state-of-the-art results in four domains from the literature, often requiring an order of magnitude fewer samples than comparable methods from past work. Moreover, our approach outperforms the best existing method by nearly 200% when the localness assumption is broken.

aaai_2024_egl_paper_camera-ready.pdf
2023
Shresth Verma, Gargi Singh, Aditya Mate, Paritosh Verma, Sruthi Gorantla, Neha Madhiwalla, Aparna Hegde, Divy Thakkar, Manish Jain, Milind Tambe, and Aparna Taneja. 9/5/2023. “Expanding Impact of Mobile Health Programs: SAHELI for Maternal and Child Care.” AI Magazine (to appear). Abstract
Underserved communities face critical health challenges due to lack of access to timely and reliable information. Non-governmental organizations are leveraging the widespread use of cellphones to combat these healthcare challenges and spread preventative awareness. The health workers at these organizations reach out individually to beneficiaries; however, such programs still suffer from declining engagement.
We have deployed SAHELI, a system to efficiently utilize the limited availability of health workers for improving maternal and child health in India. SAHELI uses the Restless Multi-armed Bandit (RMAB) framework to identify beneficiaries for outreach. It is the first deployed application for RMABs in public health, and is already in continuous use by our partner NGO, ARMMAN. We have already reached ∼130K beneficiaries with SAHELI, and are on track to serve 1 million beneficiaries by the end of 2023. This scale and impact have been achieved through multiple innovations in the RMAB model and its development, in preparation of real-world data, and in deployment practices; and through careful consideration of responsible AI practices. Specifically, in this paper, we describe our approach to learn from past data to improve the performance of SAHELI’s RMAB model, the real-world challenges faced during deployment and adoption of SAHELI, and the end-to-end pipeline.
Expanding Impact of Mobile Health Programs: SAHELI for Maternal and Child Care
Haipeng Chen, Bryan Wilder, Wei Qiu, Bo An, Eric Rice, and Milind Tambe. 8/2023. “Complex Contagion Influence Maximization: A Reinforcement Learning Approach.” In International Joint Conference on AI (IJCAI). Abstract
In influence maximization (IM), the goal is to find a set of seed nodes in a social network that maximizes the influence spread. While most IM problems focus on classical influence cascades (e.g., Independent Cascade and Linear Threshold) which assume individual influence cascade probability is independent of the number of neighbors, recent studies by sociologists show that many influence cascades follow a pattern called complex contagion (CC), where influence cascade probability is much higher when more neighbors are influenced. Nonetheless, there are very limited studies for complex contagion influence maximization (CCIM) problems. This is partly because CC is non-submodular, the solution of which has been an open challenge. In this study, we propose the first reinforcement learning (RL) approach to CCIM. We find that a key obstacle in applying existing RL approaches to CCIM is the reward sparseness issue, which comes from two distinct sources. We then design a new RL algorithm that uses the CCIM problem structure to address the issue. Empirical results show that our approach achieves state-of-the-art performance on 9 real-world networks.
Complex Contagion Influence Maximization: A Reinforcement Learning Approach
Panayiotis Danassis, Shresth Verma, Jackson A. Killian, Aparna Taneja, and Milind Tambe. 8/2023. “Limited Resource Allocation in a Non-Markovian World: The Case of Maternal and Child Healthcare.” International Joint Conference on Artificial Intelligence (IJCAI). Abstract
The success of many healthcare programs depends on participants' adherence. We consider the problem of scheduling interventions in low resource settings (e.g., placing timely support calls from health workers) to increase adherence and/or engagement. Past works have successfully developed several classes of Restless Multi-armed Bandit (RMAB) based solutions for this problem. Nevertheless, all past RMAB approaches assume that the participants' behaviour follows the Markov property. We demonstrate significant deviations from the Markov assumption on real-world data on a maternal health awareness program from our partner NGO, ARMMAN. Moreover, we extend RMABs to continuous state spaces, a previously understudied area. To tackle the generalised non-Markovian RMAB setting we (i) model each participant's trajectory as a time-series, (ii) leverage the power of time-series forecasting models to learn complex patterns and dynamics to predict future states, and (iii) propose the Time-series Arm Ranking Index (TARI) policy, a novel algorithm that selects the RMAB arms that will benefit the most from an intervention, given our future state predictions. We evaluate our approach on both synthetic data, and a secondary analysis on real data from ARMMAN, and demonstrate a significant increase in engagement compared to the state-of-the-art, deployed Whittle index solution. This translates to 16.3 hours of additional content listened to, 90.8% more engagement drops prevented, and reaching more than twice as many high dropout-risk beneficiaries.
danassis_ijcai23.pdf
Lucia Gordon, Nikhil Behari, Samuel Collier, Elizabeth Bondi-Kelly, Jackson A. Killian, Catherine Ressijac, Peter Boucher, Andrew Davies, and Milind Tambe. 8/2023. “Find Rhinos without Finding Rhinos: Active Learning with Multimodal Imagery of South African Rhino Habitats.” Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence (IJCAI). Publisher's Version. Abstract

Much of Earth's charismatic megafauna is endangered by human activities, particularly the rhino, which is at risk of extinction due to the poaching crisis in Africa. Monitoring rhinos' movement is crucial to their protection but has unfortunately proven difficult because rhinos are elusive. Therefore, instead of tracking rhinos, we propose the novel approach of mapping communal defecation sites, called middens, which give information about rhinos' spatial behavior valuable to anti-poaching, management, and reintroduction efforts. This paper provides the first-ever mapping of rhino midden locations by building classifiers to detect them using remotely sensed thermal, RGB, and LiDAR imagery in passive and active learning settings. As existing active learning methods perform poorly due to the extreme class imbalance in our dataset, we design MultimodAL, an active learning system employing a ranking technique and multimodality to achieve competitive performance with passive learning models with 94% fewer labels. Our methods could therefore save over 76 hours in labeling time when used on a similarly-sized dataset. Unexpectedly, our midden map reveals that rhino middens are not randomly distributed throughout the landscape; rather, they are clustered. Consequently, ranger patrols should be targeted at areas with high midden densities to strengthen anti-poaching efforts, in line with UN Target 15.7.

rhino_midden_detector_paper.pdf
Arshika Lalan, Shresth Verma, Kumar Madhu Sudan, Amrita Mahale, Aparna Hegde, Milind Tambe, and Aparna Taneja. 7/2023. “Analyzing and Predicting Low-Listenership Trends in a Large-Scale Mobile Health Program: A Preliminary Investigation.” In KDD Workshop on Data Science for Social Good.
Aditya Mate, Bryan Wilder, Aparna Taneja, and Milind Tambe. 7/2023. “Improved Policy Evaluation for Randomized Trials of Algorithmic Resource Allocation.” In International Conference on Machine Learning (ICML 2023). Honolulu, Hawaii. Abstract

We consider the task of evaluating policies of algorithmic resource allocation through randomized controlled trials (RCTs). Such policies are tasked with optimizing the utilization of limited intervention resources, with the goal of maximizing the benefits derived. Evaluation of such allocation policies through RCTs proves difficult, notwithstanding the scale of the trial, because the individuals’ outcomes are inextricably interlinked through resource constraints controlling the policy decisions. Our key contribution is to present a new estimator leveraging our proposed novel concept, that involves retrospective reshuffling of participants across experimental arms at the end of an RCT. We identify conditions under which such reassignments are permissible and can be leveraged to construct counterfactual trials, whose outcomes can be accurately ascertained, for free. We prove theoretically that such an estimator is more accurate than common estimators based on sample means — we show that it returns an unbiased estimate and simultaneously reduces variance. We demonstrate the value of our approach through empirical experiments on synthetic, semisynthetic as well as real case study data and show improved estimation accuracy across the board.

assignment-permutation-cameraready.pdf
Panayiotis Danassis, Aris Filos-Ratsikas, Haipeng Chen, Milind Tambe, and Boi Faltings. 6/1/2023. “AI-driven Prices for Externalities and Sustainability in Production Markets (Extended Abstract).” International Conference on Autonomous Agents and Multiagent Systems (AAMAS). London, United Kingdom. Abstract
Markets do not account for negative externalities: indirect costs that some participants impose on others, such as the cost of over-appropriating a common-pool resource (which diminishes future stock, and thus harvest, for everyone). Quantifying appropriate interventions to market prices has proven to be quite challenging. We propose a practical approach to computing market prices and allocations via a deep reinforcement learning policymaker agent, operating in an environment of other learning agents. Our policymaker allows us to tune the prices with regard to diverse objectives such as sustainability and resource wastefulness, fairness, buyers' and sellers' welfare, etc. As a highlight of our findings, our policymaker is significantly more successful in maintaining resource sustainability, compared to the market equilibrium outcome, in scarce resource environments.
danassis_aamas2023.pdf
Arpita Biswas, Jackson A. Killian, Paula Rodriguez Diaz, Susobhan Ghosh, and Milind Tambe. 6/1/2023. “Fairness for Workers Who Pull the Arms: An Index Based Policy for Allocation of Restless Bandit Tasks.” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS). Abstract

Motivated by applications such as machine repair, project monitoring, and anti-poaching patrol scheduling, we study intervention planning of stochastic processes under resource constraints. This planning problem has previously been modeled as restless multi-armed bandits (RMAB), where each arm is an intervention-dependent Markov Decision Process. However, the existing literature assumes all intervention resources belong to a single uniform pool, limiting their applicability to real-world settings where interventions are carried out by a set of workers, each with their own costs, budgets, and intervention effects. In this work, we consider a novel RMAB setting, called multi-worker restless bandits (MWRMAB) with heterogeneous workers. The goal is to plan an intervention schedule that maximizes the expected reward while satisfying budget constraints on each worker as well as fairness in terms of the load assigned to each worker. Our contributions are two-fold: (1) we provide a multi-worker extension of the Whittle index to tackle heterogeneous costs and per-worker budgets and (2) we develop an index-based scheduling policy to achieve fairness. Further, we evaluate our method on various cost structures and show that our method significantly outperforms other baselines in terms of fairness without sacrificing much in reward accumulated.

aamas23_fair_workers_rmab_cr.pdf
Kai Wang. 5/30/2023. “Integrating Machine Learning and Optimization with Applications in Public Health and Sustainability.” Harvard University. kaiwangthesis.pdf
Shresth Verma, Aditya Mate, Kai Wang, Neha Madhiwala, Aparna Hegde, Aparna Taneja, and Milind Tambe. 5/28/2023. “Restless Multi-Armed Bandits for Maternal and Child Health: Results from Decision-Focused Learning.” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS). dfl_2022_study_aamas_2023_camera_ready.pdf
Abheek Ghosh, Dheeraj Nagraj, Manish Jain, and Milind Tambe. 5/27/2023. “Indexability is Not Enough for Whittle: Improved, Near-Optimal Algorithms for Restless Bandits.” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS). mean_field.pdf
Sanket Shah, Shresth Verma, Amrita Mahale, Kumar Madhu Sudan, Aparna Hegde, Aparna Taneja, and Milind Tambe. 5/24/2023. “Preliminary Results in Low-Listenership Prediction in One of the Largest Mobile Health Programs in the World.” In AAMAS 2023 Workshop on Autonomous Agents for Social Good. Publisher's Version aasg_2023_kilkari_paper_4.pdf
Aditya Mate. 5/23/2023. “Actualizing Impact of AI in Public Health: Optimization of Scarce Health Intervention Resources in the Real World.” Computer Science. dissertation-adityamate-final.pdf
Jackson A. Killian. 5/2023. “Activity Allocation in an Under-Resourced World: Toward Improving Engagement with Public Health Programs via Restless Bandits.” killian_dissertation_2023.pdf