Fighting and Preventing Tuberculosis

2022
Aditya Mate, Arpita Biswas, Christoph Siebenbrunner, Susobhan Ghosh, and Milind Tambe. 5/2022. “Efficient Algorithms for Finite Horizon and Streaming RestlessMulti-Armed Bandit Problems.” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS). streamingbandits-camready-main.pdf
2021
Aditya Mate, Andrew Perrault, and Milind Tambe. 5/7/2021. “Risk-Aware Interventions in Public Health: Planning with Restless Multi-Armed Bandits.” In 20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS). London, UK.Abstract
Community Health Workers (CHWs) form an important component of health-care systems globally, especially in low-resource settings. CHWs are often tasked with monitoring the health of and intervening on their patient cohort. Previous work has developed several classes of Restless Multi-Armed Bandits (RMABs) that are computationally tractable and indexable, a condition that guarantees asymptotic optimality, for solving such health monitoring and intervention problems (HMIPs).
However, existing solutions to HMIPs fail to account for risk-sensitivity considerations of CHWs in the planning stage and may run the danger of ignoring some patients completely because they are deemed less valuable to intervene on.
Additionally, these also rely on patients reporting their state of adherence accurately when intervened upon. Towards tackling these issues, our contributions in this paper are as follows: 
(1) We develop an RMAB solution to HMIPs that allows for reward functions that are monotone increasing, rather than linear, in the belief state and also supports a wider class of observations.
(2) We prove theoretical guarantees on the asymptotic optimality of our algorithm for any arbitrary reward function. Additionally, we show that for the specific reward function considered in previous work, our theoretical conditions are stronger than the state-of-the-art guarantees.
(3) We show the applicability of these new results for addressing the three issues pertaining to: risk-sensitive planning, equitable allocation and reliance on perfect observations as highlighted above. We evaluate these techniques on both simulated as well as real data from a prevalent CHW task of monitoring adherence of tuberculosis patients to their prescribed medication in Mumbai, India and show improved performance over the state-of-the-art. The simulation code is available at: https://github.com/AdityaMate/risk-aware-bandits.
Risk-Aware-Bandits.pdf
Beyond "To Act or Not to Act": Fast Lagrangian Approaches to General Multi-Action Restless Bandits
Jackson A Killian, Andrew Perrault, and Milind Tambe. 5/2021. “Beyond "To Act or Not to Act": Fast Lagrangian Approaches to General Multi-Action Restless Bandits.” In 20th International Conference on Autonomous Agents and Multiagent Systems. multi_action_bandits_aamas_2021_preprint.pdf
2020
Aditya Mate*, Jackson A. Killian*, Haifeng Xu, Andrew Perrault, and Milind Tambe. 12/5/2020. “Collapsing Bandits and Their Application to Public Health Interventions.” In Advances in Neural and Information Processing Systems (NeurIPS) 12/5/2020. Vancouver, Canada. Publisher's VersionAbstract
We propose and study Collapsing Bandits, a new restless multi-armed bandit (RMAB) setting in which each arm follows a binary-state Markovian process with a special structure: when an arm is played, the state is fully observed, thus “collapsing” any uncertainty, but when an arm is passive, no observation is made, thus allowing uncertainty to evolve. The goal is to keep as many arms in the “good” state as possible by planning a limited budget of actions per round. Such Collapsing Bandits are natural models for many healthcare domains in which health workers must simultaneously monitor patients and deliver interventions in a way that maximizes the health of their patient cohort. Our main contributions are as follows: (i) Building on the Whittle index technique for RMABs, we derive conditions under which the Collapsing Bandits problem is indexable. Our derivation hinges on novel conditions that characterize when the optimal policies may take the form of either “forward” or “reverse” threshold policies. (ii) We exploit the optimality of threshold policies to build fast algorithms for computing the Whittle index, including a closed form. (iii) We evaluate our algorithm on several data distributions including data from a real-world healthcare task in which a worker must monitor and deliver interventions to maximize their patients’ adherence to tuberculosis medication. Our algorithm achieves a 3-order-of-magnitude speedup compared to state-of-the-art RMAB techniques, while achieving similar performance.
collapsing_bandits_full_paper_camready.pdf
Han-Ching Ou, Arunesh Sinha, Sze-Chuan Suen, Andrew Perrault, Alpan Raval, and Milind Tambe. 5/9/2020. “Who and When to Screen Multi-Round Active Screening for Network Recurrent Infectious Diseases Under Uncertainty.” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS-20). who_and_when_to_screen.pdf
2019
Jackson Killian, Bryan Wilder, Amit Sharma, Vinod Choudhary, Bistra Dilkina, and Milind Tambe. 8/4/2019. “Learning to Prescribe Interventions for Tuberculosis Patients Using Digital Adherence Data.” In The 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 8/4/2019. Abstract
Digital Adherence Technologies (DATs) are an increasingly popular method for verifying patient adherence to many medications.
We analyze data from one city served by 99DOTS, a phone-callbased DAT deployed for Tuberculosis (TB) treatment in India where
nearly 3 million people are afflicted with the disease each year. The
data contains nearly 17,000 patients and 2.1M dose records. We lay
the groundwork for learning from this real-world data, including
a method for avoiding the effects of unobserved interventions in
training data used for machine learning. We then construct a deep
learning model, demonstrate its interpretability, and show how it
can be adapted and trained in three different clinical scenarios to
better target and improve patient care. In the real-time risk prediction setting our model could be used to proactively intervene with
21% more patients and before 76% more missed doses than current
heuristic baselines. For outcome prediction, our model performs
40% better than baseline methods, allowing cities to target more
resources to clinics with a heavier burden of patients at risk of failure. Finally, we present a case study demonstrating how our model
can be trained in an end-to-end decision focused learning setting to
achieve 15% better solution quality in an example decision problem
faced by health workers.
killian-kdd-2019.pdf