Publications by Year: 2023

Shresth Verma, Gargi Singh, Aditya Mate, Paritosh Verma, Sruthi Gorantla, Neha Madhiwalla, Aparna Hegde, Divy Thakkar, Manish Jain, Milind Tambe, and Aparna Taneja. 9/5/2023. “Expanding Impact of Mobile Health Programs: SAHELI for Maternal and Child Care.” AI Magazine (to appear).
Underserved communities face critical health challenges due to lack of access to timely and reliable information. Non-governmental organizations are leveraging the widespread use of cellphones to combat these healthcare challenges and spread preventative awareness. The health workers at these organizations reach out individually to beneficiaries; however, such programs still suffer from declining engagement.
We have deployed SAHELI, a system to efficiently utilize the limited availability of health workers for improving maternal and child health in India. SAHELI uses the Restless Multi-armed Bandit (RMAB) framework to identify beneficiaries for outreach. It is the first deployed application for RMABs in public health, and is already in continuous use by our partner NGO, ARMMAN. We have already reached ∼130K beneficiaries with SAHELI, and are on track to serve 1 million beneficiaries by the end of 2023. This scale and impact have been achieved through multiple innovations in the RMAB model and its development, in preparation of real-world data, and in deployment practices; and through careful consideration of responsible AI practices. Specifically, in this paper, we describe our approach to learn from past data to improve the performance of SAHELI’s RMAB model, the real-world challenges faced during deployment and adoption of SAHELI, and the end-to-end pipeline.
Haipeng Chen, Bryan Wilder, Wei Qiu, Bo An, Eric Rice, and Milind Tambe. 8/2023. “Complex Contagion Influence Maximization: A Reinforcement Learning Approach.” In International Joint Conference on AI (IJCAI).
In influence maximization (IM), the goal is to find a set of seed nodes in a social network that maximizes the influence spread. While most IM problems focus on classical influence cascades (e.g., Independent Cascade and Linear Threshold) which assume individual influence cascade probability is independent of the number of neighbors, recent studies by sociologists show that many influence cascades follow a pattern called complex contagion (CC), where influence cascade probability is much higher when more neighbors are influenced. Nonetheless, there are very limited studies for complex contagion influence maximization (CCIM) problems. This is partly because CC is non-submodular, the solution of which has been an open challenge. In this study, we propose the first reinforcement learning (RL) approach to CCIM. We find that a key obstacle in applying existing RL approaches to CCIM is the reward sparseness issue, which comes from two distinct sources. We then design a new RL algorithm that uses the CCIM problem structure to address the issue. Empirical results show that our approach achieves the state-of-the-art performance on 9 real-world networks.
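The complex-contagion dynamic the abstract describes can be sketched in a few lines: unlike an independent cascade, a node activates only after a threshold number of its neighbors are active. The toy graph, threshold, and node labels below are invented for illustration and are not the paper's model or benchmarks.

```python
# Toy simulation of complex contagion: a node activates only once at
# least `threshold` of its neighbors are active.
def complex_contagion(adj, seeds, threshold=2, steps=10):
    """Return the set of nodes active after the cascade stabilizes."""
    active = set(seeds)
    for _ in range(steps):
        newly = {v for v in adj
                 if v not in active
                 and sum(n in active for n in adj[v]) >= threshold}
        if not newly:
            break
        active |= newly
    return active

# A small graph with dense ties, which complex contagion needs to spread.
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
print(sorted(complex_contagion(adj, seeds={0, 1})))  # -> [0, 1, 2, 3]
print(sorted(complex_contagion(adj, seeds={0})))     # -> [0]: one seed cannot spread
```

Note how seed placement matters non-submodularly: a single seed spreads nowhere, while two adjacent seeds reach the whole graph, which is the property that breaks classical greedy IM guarantees.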
Panayiotis Danassis, Shresth Verma, Jackson A. Killian, Aparna Taneja, and Milind Tambe. 8/2023. “Limited Resource Allocation in a Non-Markovian World: The Case of Maternal and Child Healthcare.” International Joint Conference on Artificial Intelligence (IJCAI).
The success of many healthcare programs depends on participants' adherence. We consider the problem of scheduling interventions in low resource settings (e.g., placing timely support calls from health workers) to increase adherence and/or engagement. Past works have successfully developed several classes of Restless Multi-armed Bandit (RMAB) based solutions for this problem. Nevertheless, all past RMAB approaches assume that the participants' behaviour follows the Markov property. We demonstrate significant deviations from the Markov assumption on real-world data on a maternal health awareness program from our partner NGO, ARMMAN. Moreover, we extend RMABs to continuous state spaces, a previously understudied area. To tackle the generalised non-Markovian RMAB setting we (i) model each participant's trajectory as a time-series, (ii) leverage the power of time-series forecasting models to learn complex patterns and dynamics to predict future states, and (iii) propose the Time-series Arm Ranking Index (TARI) policy, a novel algorithm that selects the RMAB arms that will benefit the most from an intervention, given our future state predictions. We evaluate our approach on both synthetic data, and a secondary analysis on real data from ARMMAN, and demonstrate a significant increase in engagement compared to the state-of-the-art deployed Whittle index solution. This translates to 16.3 hours of additional content listened, 90.8% more engagement drops prevented, and reaching more than twice as many high dropout-risk beneficiaries.
danassis_ijcai23.pdf
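The general forecast-then-rank idea behind the paper (not the TARI index itself) can be illustrated with a toy sketch: forecast each participant's next engagement from their history and intervene on the arms with the lowest forecast. The forecaster (exponential smoothing), the histories, and the smoothing parameter are invented stand-ins.

```python
# Hypothetical forecast-then-rank sketch for selecting intervention arms.
def forecast_next(history, alpha=0.5):
    """Exponential-smoothing forecast of the next engagement value."""
    level = history[0]
    for x in history[1:]:
        level = alpha * x + (1 - alpha) * level
    return level

def pick_arms(histories, k):
    """Intervene on the k arms with the lowest forecast engagement."""
    ranked = sorted(histories, key=lambda arm: forecast_next(histories[arm]))
    return ranked[:k]

histories = {
    "A": [0.9, 0.8, 0.4, 0.2],   # declining: likely to drop out
    "B": [0.7, 0.8, 0.9, 0.9],   # steadily engaged
    "C": [0.5, 0.5, 0.5, 0.5],   # flat
}
print(pick_arms(histories, k=1))  # -> ['A']
```

The point of forecasting over raw last-observed values is that a trend-aware model can flag participant A before the engagement level itself becomes the lowest in the cohort.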
Lucia Gordon, Nikhil Behari, Samuel Collier, Elizabeth Bondi-Kelly, Jackson A. Killian, Catherine Ressijac, Peter Boucher, Andrew Davies, and Milind Tambe. 8/2023. “Find Rhinos without Finding Rhinos: Active Learning with Multimodal Imagery of South African Rhino Habitats.” Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence (IJCAI). Publisher's Version.

Much of Earth's charismatic megafauna is endangered by human activities, particularly the rhino, which is at risk of extinction due to the poaching crisis in Africa. Monitoring rhinos' movement is crucial to their protection but has unfortunately proven difficult because rhinos are elusive. Therefore, instead of tracking rhinos, we propose the novel approach of mapping communal defecation sites, called middens, which give information about rhinos' spatial behavior valuable to anti-poaching, management, and reintroduction efforts. This paper provides the first-ever mapping of rhino midden locations by building classifiers to detect them using remotely sensed thermal, RGB, and LiDAR imagery in passive and active learning settings. As existing active learning methods perform poorly due to the extreme class imbalance in our dataset, we design MultimodAL, an active learning system employing a ranking technique and multimodality to achieve competitive performance with passive learning models with 94% fewer labels. Our methods could therefore save over 76 hours in labeling time when used on a similarly-sized dataset. Unexpectedly, our midden map reveals that rhino middens are not randomly distributed throughout the landscape; rather, they are clustered. Consequently, ranger patrols should be targeted at areas with high midden densities to strengthen anti-poaching efforts, in line with UN Target 15.7.

rhino_midden_detector_paper.pdf
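As a loose illustration of ranking-based active learning under severe class imbalance (a hypothetical stand-in, not MultimodAL itself): score each unlabeled point by its distance to the centroid of the few labeled positives and request labels for the closest points first, so scarce label budget goes toward likely positives. The 2-D features below are invented.

```python
import numpy as np

def rank_pool(pool, positives):
    """Rank unlabeled points by distance to the labeled-positive centroid,
    so likely positives surface first despite extreme class imbalance."""
    centroid = positives.mean(axis=0)
    dists = np.linalg.norm(pool - centroid, axis=1)
    return np.argsort(dists)  # closest (most promising) first

# Hypothetical features: a few labeled positives, and a pool dominated
# by negatives near the origin with one hidden positive.
positives = np.array([[5.0, 5.0], [6.0, 5.0]])
pool = np.array([[0.0, 0.0], [1.0, 0.0], [5.5, 5.2], [0.0, 1.0]])
order = rank_pool(pool, positives)
print(order[0])  # -> 2: the hidden positive is queried first
```

In a real loop, the newly labeled points would be added to the training set and the ranking recomputed each round.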
Arshika Lalan, Shresth Verma, Kumar Madhu Sudan, Amrita Mahale, Aparna Hegde, Milind Tambe, and Aparna Taneja. 7/2023. “Analyzing and Predicting Low-Listenership Trends in a Large-Scale Mobile Health Program: A Preliminary Investigation.” In KDD Workshop on Data Science for Social Good.
Aditya Mate, Bryan Wilder, Aparna Taneja, and Milind Tambe. 7/2023. “Improved Policy Evaluation for Randomized Trials of Algorithmic Resource Allocation.” In International Conference on Machine Learning (ICML 2023). Honolulu, Hawaii.

We consider the task of evaluating policies of algorithmic resource allocation through randomized controlled trials (RCTs). Such policies are tasked with optimizing the utilization of limited intervention resources, with the goal of maximizing the benefits derived. Evaluation of such allocation policies through RCTs proves difficult, notwithstanding the scale of the trial, because the individuals’ outcomes are inextricably interlinked through resource constraints controlling the policy decisions. Our key contribution is to present a new estimator leveraging our proposed novel concept, that involves retrospective reshuffling of participants across experimental arms at the end of an RCT. We identify conditions under which such reassignments are permissible and can be leveraged to construct counterfactual trials, whose outcomes can be accurately ascertained, for free. We prove theoretically that such an estimator is more accurate than common estimators based on sample means — we show that it returns an unbiased estimate and simultaneously reduces variance. We demonstrate the value of our approach through empirical experiments on synthetic, semisynthetic as well as real case study data and show improved estimation accuracy across the board.

assignment-permutation-cameraready.pdf
Panayiotis Danassis, Aris Filos-Ratsikas, Haipeng Chen, Milind Tambe, and Boi Faltings. 6/1/2023. “AI-driven Prices for Externalities and Sustainability in Production Markets (Extended Abstract).” International Conference on Autonomous Agents and Multiagent Systems (AAMAS). London, United Kingdom.
Markets do not account for negative externalities: indirect costs that some participants impose on others, such as the cost of over-appropriating a common-pool resource (which diminishes future stock, and thus harvest, for everyone). Quantifying appropriate interventions to market prices has proven to be quite challenging. We propose a practical approach to computing market prices and allocations via a deep reinforcement learning policymaker agent, operating in an environment of other learning agents. Our policymaker allows us to tune the prices with regard to diverse objectives such as sustainability and resource wastefulness, fairness, buyers' and sellers' welfare, etc. As a highlight of our findings, our policymaker is significantly more successful in maintaining resource sustainability, compared to the market equilibrium outcome, in scarce resource environments.
danassis_aamas2023.pdf
Arpita Biswas, Jackson A. Killian, Paula Rodriguez Diaz, Susobhan Ghosh, and Milind Tambe. 6/1/2023. “Fairness for Workers Who Pull the Arms: An Index Based Policy for Allocation of Restless Bandit Tasks.” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS).

Motivated by applications such as machine repair, project monitoring, and anti-poaching patrol scheduling, we study intervention planning of stochastic processes under resource constraints. This planning problem has previously been modeled as restless multi-armed bandits (RMAB), where each arm is an intervention-dependent Markov Decision Process. However, the existing literature assumes all intervention resources belong to a single uniform pool, limiting their applicability to real-world settings where interventions are carried out by a set of workers, each with their own costs, budgets, and intervention effects. In this work, we consider a novel RMAB setting, called multi-worker restless bandits (MWRMAB) with heterogeneous workers. The goal is to plan an intervention schedule that maximizes the expected reward while satisfying budget constraints on each worker as well as fairness in terms of the load assigned to each worker. Our contributions are two-fold: (1) we provide a multi-worker extension of the Whittle index to tackle heterogeneous costs and per-worker budget and (2) we develop an index-based scheduling policy to achieve fairness. Further, we evaluate our method on various cost structures and show that our method significantly outperforms other baselines in terms of fairness without sacrificing much in reward accumulated.

aamas23_fair_workers_rmab_cr.pdf
Kai Wang. 5/30/2023. “Integrating Machine Learning and Optimization with Applications in Public Health and Sustainability.” Harvard University. kaiwangthesis.pdf
Shresth Verma, Aditya Mate, Kai Wang, Neha Madhiwalla, Aparna Hegde, Aparna Taneja, and Milind Tambe. 5/28/2023. “Restless Multi-Armed Bandits for Maternal and Child Health: Results from Decision-Focused Learning.” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS). dfl_2022_study_aamas_2023_camera_ready.pdf
Abheek Ghosh, Dheeraj Nagraj, Manish Jain, and Milind Tambe. 5/27/2023. “Indexability is Not Enough for Whittle: Improved, Near-Optimal Algorithms for Restless Bandits.” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS). mean_field.pdf
Sanket Shah, Shresth Verma, Amrita Mahale, Kumar Madhu Sudan, Aparna Hegde, Aparna Taneja, and Milind Tambe. 5/24/2023. “Preliminary Results in Low-Listenership Prediction in One of the Largest Mobile Health Programs in the World.” In AAMAS 2023 Workshop on Autonomous Agents for Social Good. Publisher's Version. aasg_2023_kilkari_paper_4.pdf
Aditya Mate. 5/23/2023. “Actualizing Impact of AI in Public Health: Optimization of Scarce Health Intervention Resources in the Real World.” Computer Science. dissertation-adityamate-final.pdf
Jackson A. Killian. 5/2023. “Activity Allocation in an Under-Resourced World: Toward Improving Engagement with Public Health Programs via Restless Bandits”. killian_dissertation_2023.pdf
Kai Wang*, Lily Xu*, Aparna Taneja, and Milind Tambe. 2/14/2023. “Optimistic Whittle Index Policy: Online Learning for Restless Bandits.” AAAI Conference on Artificial Intelligence (AAAI). arXiv link.
Restless multi-armed bandits (RMABs) extend multi-armed bandits to allow for stateful arms, where the state of each arm evolves restlessly with different transitions depending on whether that arm is pulled. Solving RMABs requires information on transition dynamics, which are often unknown upfront. To plan in RMAB settings with unknown transitions, we propose the first online learning algorithm based on the Whittle index policy, using an upper confidence bound (UCB) approach to learn transition dynamics. Specifically, we estimate confidence bounds of the transition probabilities and formulate a bilinear program to compute optimistic Whittle indices using these estimates. Our algorithm, UCWhittle, achieves sublinear $O(H \sqrt{T \log T})$ frequentist regret to solve RMABs with unknown transitions in $T$ episodes with a constant horizon~$H$. Empirically, we demonstrate that UCWhittle leverages the structure of RMABs and the Whittle index policy solution to achieve better performance than existing online learning baselines across three domains, including one constructed from a real-world maternal and childcare dataset.
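For readers unfamiliar with the Whittle index that several papers on this page build on, here is a minimal sketch of the classic computation for a single two-state arm: binary-search for the passive-action subsidy at which acting and not acting are equally attractive. The transition probabilities and rewards are invented, and this is not the paper's UCWhittle algorithm, which additionally learns unknown transitions via confidence bounds.

```python
import numpy as np

def value_iteration(P, R, subsidy, gamma=0.95, eps=1e-6):
    """Solve the single-arm MDP where the passive action earns `subsidy`.
    P[a][s][s'] are transition probabilities, R[s] the state reward."""
    V = np.zeros(len(R))
    while True:
        Q_passive = R + subsidy + gamma * P[0] @ V
        Q_active = R + gamma * P[1] @ V
        V_new = np.maximum(Q_passive, Q_active)
        if np.max(np.abs(V_new - V)) < eps:
            return Q_passive, Q_active
        V = V_new

def whittle_index(P, R, state, lo=-10.0, hi=10.0, tol=1e-4):
    """Binary search for the subsidy making the agent indifferent
    between acting and not acting in `state`."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        Qp, Qa = value_iteration(P, R, subsidy=mid)
        if Qa[state] > Qp[state]:   # still worth acting: raise subsidy
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical 2-state arm (0 = disengaged, 1 = engaged); acting raises
# the chance of becoming or staying engaged.
P = np.array([[[0.9, 0.1], [0.4, 0.6]],    # passive action
              [[0.6, 0.4], [0.1, 0.9]]])   # active action
R = np.array([0.0, 1.0])
print(round(whittle_index(P, R, state=0), 2))
```

The Whittle index policy then simply pulls, each round, the k arms whose current-state indices are largest, which is what makes it scale to the beneficiary counts these deployments involve.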
Kai Wang*, Shresth Verma*, Aditya Mate, Sanket Shah, Aparna Taneja, Neha Madhiwalla, Aparna Hegde, and Milind Tambe. 2/14/2023. “Scalable Decision-Focused Learning in Restless Multi-Armed Bandits with Application to Maternal and Child Health.” AAAI Conference on Artificial Intelligence (AAAI).
This paper studies restless multi-armed bandit (RMAB) problems with unknown arm transition dynamics but with known correlated arm features. The goal is to learn a model to predict transition dynamics given features, where the Whittle index policy solves the RMAB problems using predicted transitions. However, prior works often learn the model by maximizing the predictive accuracy instead of final RMAB solution quality, causing a mismatch between training and evaluation objectives. To address this shortcoming, we propose a novel approach for decision-focused learning in RMAB that directly trains the predictive model to maximize the Whittle index solution quality. We present three key contributions: (i) we establish differentiability of the Whittle index policy to support decision-focused learning; (ii) we significantly improve the scalability of decision-focused learning approaches in sequential problems, specifically RMAB problems; (iii) we apply our algorithm to a previously collected dataset of maternal and child health to demonstrate its performance. Indeed, our algorithm is the first for decision-focused learning in RMAB that scales to real-world problem sizes.
10767.wangk_full.pdf
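The training/evaluation mismatch the abstract describes can be seen in a tiny invented example: of two hypothetical predictive models, the one with the lower predictive error can still induce the worse arm choice, which is exactly what decision-focused learning trains against. All numbers below are made up for illustration.

```python
# Toy illustration of the predict-then-optimize mismatch motivating
# decision-focused learning.
true_rewards = [1.0, 1.1]          # arm 1 is truly better

model_a = [1.05, 1.0]              # small error, but ranks the arms wrongly
model_b = [0.5, 0.6]               # large error, yet ranks them correctly

def mse(pred):
    """Predictive accuracy: mean squared error against the true rewards."""
    return sum((p - t) ** 2 for p, t in zip(pred, true_rewards)) / len(pred)

def decision_reward(pred):
    """Decision quality: pull the predicted-best arm, earn its true reward."""
    best = max(range(len(pred)), key=lambda i: pred[i])
    return true_rewards[best]

print(mse(model_a) < mse(model_b))                          # True: A predicts better
print(decision_reward(model_b) > decision_reward(model_a))  # True: B decides better
```

Decision-focused learning replaces the MSE training objective with (a differentiable surrogate of) the decision reward, which in the RMAB setting requires the differentiable Whittle index policy the paper establishes.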
Shresth Verma, Gargi Singh, Aditya Mate, Paritosh Verma, Sruthi Gorantla, Neha Madhiwalla, Aparna Hegde, Divy Thakkar, Manish Jain, Milind Tambe, and Aparna Taneja. 2/10/2023. “Increasing Impact of Mobile Health Programs: SAHELI for Maternal and Child Care.” In Innovative Applications of Artificial Intelligence (IAAI). iaai_2023_armman_rmab_deployment_5.pdf
Jackson A. Killian*, Arpita Biswas*, Lily Xu*, Shresth Verma*, Vineet Nair, Aparna Taneja, Aparna Hegde, Neha Madhiwalla, Paula Rodriguez Diaz, Sonja Johnson-Yu, and Milind Tambe. 2/9/2023. “Robust Planning over Restless Groups: Engagement Interventions for a Large-Scale Maternal Telehealth Program.” In AAAI Conference on Artificial Intelligence. armman_groups.pdf
Paula Rodriguez Diaz, Jackson A Killian, Lily Xu, Arun Sai Suggala, Aparna Taneja, and Milind Tambe. 2/7/2023. “Flexible Budgets in Restless Bandits: A Primal-Dual Algorithm for Efficient Budget Allocation.” In AAAI Conference on Artificial Intelligence (AAAI). aaai23_frmab_cr.pdf