
Prevention of Tuberculosis via Prediction and Multi-agent Planning


MOTIVATION

Tuberculosis (TB), an infectious disease primarily of the lungs, is still one of the top 10 causes of death worldwide and disproportionately affects under-resourced countries and communities. Though the disease is curable, the challenges for patients in treatment are myriad: treatment regimens require daily antibiotics for a minimum of six months, treatment side effects make it difficult to work, and patients often face social stigma. At the policy level, the disease also presents challenges, as governments must build accurate models of disease spread from incomplete information in order to decide where to spend and deploy limited health resources.

One of the biggest challenges to ending TB is ensuring medication adherence. When patients miss too many antibiotic doses, they risk reinfection or development of multi-drug resistant strains of TB. This project focuses on the development of decision support systems for TB health workers who need to plan intervention schedules targeted toward keeping patients adherent to their antibiotics. The earlier workers can detect risk of non-adherence, the sooner they can deliver preventative interventions, keeping patients from missing critical doses.

Through ongoing collaborations with the healthcare technology company Everwell and the city of Mumbai, India, we developed a machine learning system trained on millions of daily adherence logs from the 99DOTS digital adherence monitoring system. In simulation, the system detects patients’ missed doses almost twice as early as baseline methods. The work is also summarized in an accompanying video.
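As a rough sketch of what such a system looks like, the example below trains a classifier on synthetic daily-adherence histories to flag patients at risk of missing doses. The data, features, and model here are hypothetical stand-ins for illustration only, not the deployed 99DOTS pipeline.

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Illustrative sketch: synthetic adherence histories stand in for real 99DOTS logs.
rng = np.random.default_rng(5)
n_patients, history_days = 2000, 28
adherence_rate = rng.uniform(0.5, 1.0, n_patients)
# Each row is a binary dose-taken history; lower underlying adherence means a
# higher chance of being labeled "at risk" of missing doses in the coming week.
history = (rng.random((n_patients, history_days)) < adherence_rate[:, None]).astype(int)
at_risk = (rng.random(n_patients) > adherence_rate).astype(int)  # noisy label

def longest_miss_streak(row):
    """Length of the longest run of consecutive missed doses."""
    best = cur = 0
    for dose in row:
        cur = cur + 1 if dose == 0 else 0
        best = max(best, cur)
    return best

# Simple hand-crafted features: recent adherence, overall adherence, longest miss streak.
X = np.column_stack([
    history[:, -7:].mean(axis=1),
    history.mean(axis=1),
    [longest_miss_streak(r) for r in history],
])
X_tr, X_te, y_tr, y_te = train_test_split(X, at_risk, test_size=0.25, random_state=0)
model = GradientBoostingClassifier().fit(X_tr, y_tr)
print("held-out accuracy:", model.score(X_te, y_te))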

PREVIOUS WORK

Restless Bandit Algorithms For Scheduling Resource-Constrained Interventions To Support Medication Adherence

In this work, we develop a multi-agent system for situations in which daily adherence logs are not automatically available: health workers must call patients to gather adherence information and, in the same interactions, deliver interventions that boost future adherence. In such scenarios, health workers must plan their interventions while balancing the exploration/exploitation tradeoff effectively so as to maximize the overall health outcomes of their patient cohort.

We propose and develop Collapsing Bandits, a new sub-class of the Restless Multi-Armed Bandit (RMAB) setting in which each arm follows a binary-state Markovian process with a special structure: when an arm is played, the state is fully observed, thus “collapsing” any uncertainty, but when an arm is passive, no observation is made, allowing uncertainty to evolve. We use this framework to develop a planning algorithm that recommends to health workers which patients would benefit most from intervention each day. We show that our algorithm achieves a three-order-of-magnitude speedup over previous RMAB approaches without sacrificing performance, making our approach feasible to use with limited computational resources on real-world problem sizes (e.g., hundreds of patients).
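The core mechanism can be illustrated with a small simulation. The sketch below uses hypothetical transition matrices and a simple myopic top-k heuristic in place of the paper’s index-based planner; it shows how each patient’s belief drifts while passive and collapses to the observed state when they are called.

import numpy as np

def propagate_belief(belief, P):
    """One passive step: the belief that the patient is adhering (state 1)
    evolves under the 2x2 transition matrix P (rows: from-state, cols: to-state)."""
    return (1 - belief) * P[0, 1] + belief * P[1, 1]

def myopic_gain(belief, P_passive, P_active):
    """Expected one-step adherence gain from intervening now rather than staying passive."""
    return propagate_belief(belief, P_active) - propagate_belief(belief, P_passive)

rng = np.random.default_rng(0)
n_patients, budget, horizon = 100, 10, 30
P_passive = np.array([[0.8, 0.2], [0.3, 0.7]])  # hypothetical dynamics without intervention
P_active = np.array([[0.5, 0.5], [0.1, 0.9]])   # hypothetical dynamics with intervention
beliefs = rng.uniform(0.3, 0.9, n_patients)
true_states = (rng.random(n_patients) < beliefs).astype(int)

for t in range(horizon):
    gains = myopic_gain(beliefs, P_passive, P_active)
    chosen = set(np.argsort(-gains)[:budget].tolist())  # intervene on the top-k patients
    for i in range(n_patients):
        P = P_active if i in chosen else P_passive
        true_states[i] = int(rng.random() < P[true_states[i], 1])
        if i in chosen:
            beliefs[i] = float(true_states[i])           # observation collapses uncertainty
        else:
            beliefs[i] = propagate_belief(beliefs[i], P_passive)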

In previous work (NeurIPS’20), we used restless bandits to schedule adherence support interventions, but traditional restless bandits only allow the planner to choose a single intervention type. In general, workers have many options for intervening (e.g., call, text, visit), each with its own cost and patient-dependent effect. The key challenge, then, is how to act optimally (i.e., maximize patient medication adherence) while staying under budget. We extend to the little-studied “multi-action” restless bandit setting. Using techniques from Lagrangian relaxation and linear programming, we develop an algorithm with provable guarantees and improved best-case run-time scaling compared to existing baselines. In a simulated TB medication adherence intervention task, our algorithm produces state-of-the-art intervention policies in a fraction of the time required by baselines, taking a critical step toward long-term planning that weighs the cost-benefit tradeoffs of multiple interventions and scales to the real-world problem sizes we see in TB and other public health challenges.
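As a rough illustration of the Lagrangian decoupling idea, the sketch below solves a one-step version of the problem: a multiplier on cost lets each patient choose its own best action independently, and a binary search on the multiplier enforces the budget. The actions, costs, and gains are hypothetical, and the paper’s algorithm plans over full value functions rather than a single step.

import numpy as np

rng = np.random.default_rng(1)
n_patients, budget = 50, 20.0
actions = ["none", "text", "call", "visit"]
costs = np.array([0.0, 0.5, 1.0, 2.0])
# Hypothetical expected adherence gain of each action for each patient
# (sorted so costlier actions help more); doing nothing gives no gain.
values = np.sort(rng.uniform(0.0, 0.3, (n_patients, len(actions))), axis=1)
values[:, 0] = 0.0

def best_actions(lam):
    """With multiplier lam, each patient independently maximizes value - lam * cost."""
    return np.argmax(values - lam * costs, axis=1)

# Binary search the multiplier so the chosen actions just fit the budget.
lo, hi = 0.0, 1.0
for _ in range(50):
    lam = (lo + hi) / 2
    spend = costs[best_actions(lam)].sum()
    lo, hi = (lam, hi) if spend > budget else (lo, lam)

chosen = best_actions(hi)
print("total cost:", costs[chosen].sum(),
      "total expected gain:", values[np.arange(n_patients), chosen].sum())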

This work also builds on our earlier “Collapsing Bandits” framework (NeurIPS’20), which proposes an algorithm for recommending to health workers, each day, which patients they should intervene on. We address multiple challenges related to real-world deployment: 1) the previous algorithm may be insensitive to equitable resource allocation, meaning it may deem some patients “sub-optimal” to intervene on (e.g., less likely to respond to an intervention than another patient) and choose to ignore them completely; 2) it assumes that the worker can perfectly observe the patient’s adherence; and 3) it assumes that the planner is risk-neutral, while real-world health workers may be risk-averse. To address these concerns, we propose a “Risk-Aware” algorithm that admits a more general, non-linear reward function, which can be suitably shaped to capture the planner’s considerations and also accommodates imperfect observations. We also develop new theory to prove optimality guarantees for our approach. The end result is an algorithm that caters to real-world considerations, giving the planner the flexibility to specify the objectives that are meaningful in their setting.
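A toy example of reward shaping: with hypothetical beliefs and intervention effects, a concave (risk-averse) reward shifts priority toward the least-adherent patient, whereas a linear (risk-neutral) reward favors the patient with the largest expected gain. The concave form below is an illustrative choice, not the paper’s exact reward.

import numpy as np

def linear_reward(belief):
    return belief                                  # risk-neutral: expected adherence

def risk_averse_reward(belief, gamma=5.0):
    # Concave reward that weighs improvements for the least-adherent patients most heavily.
    return 1.0 - np.exp(-gamma * belief)

beliefs = np.array([0.15, 0.45, 0.80])             # adherence beliefs for three patients
lifts = np.array([0.05, 0.10, 0.15])               # assumed belief gain from intervening on each
for name, r in [("risk-neutral", linear_reward), ("risk-averse", risk_averse_reward)]:
    gains = r(np.minimum(beliefs + lifts, 1.0)) - r(beliefs)
    print(name, "planner would intervene on patient", int(np.argmax(gains)))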

Contact Tracing To Identify Undiagnosed Disease

While individuals with symptoms of infectious disease may seek treatment themselves (passive screening), public health campaigns may fund active-screening programs in which health workers seek out undiagnosed cases by canvassing at-risk individuals. Individuals at particularly high risk, such as those who have come into contact with infected individuals, may be identified through interviews with patients and encouraged to seek screening and treatment.

In an ongoing project in collaboration with the Wadhwani AI Institute in Mumbai, India, we are developing an algorithm to help inform how contact tracing might be optimized, given a limited budget for disease screening and knowledge of the contact network between individuals in a community.  Who should be screened if some individuals are confirmed to be infected while others are merely suspected to have disease?  How does this change over time, with the graph structure, and disease progression?

Active screening is a powerful but expensive means of controlling disease spread, one that passive screening cannot match because patients are cured only after they choose to seek care, introducing delays. Because health workers are a limited and valuable resource, their visits must be planned carefully.

In this work, we built a novel active-screening model based on the network SIS model commonly used in the public health literature. The problem involves sequential planning over combinatorial actions and is NP-hard. We developed the algorithm REMEDY with two variants: FULL-REMEDY considers the effects of future actions and finds policies of high solution quality, while FAST-REMEDY scales linearly in the size of the network. Both leverage the information gained from passive screening and the graph structure to provide high-quality solutions. In simulations using real-world contact networks and disease parameters, both variants outperform baselines commonly used in practice by a large margin. The details are described in the paper and video linked above.
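To make the setting concrete, the sketch below runs a network SIS simulation in which a fixed screening budget is spent each round. The degree-based screening rule and all parameters are simple illustrative stand-ins, not the REMEDY algorithm or calibrated disease values.

import numpy as np
import networkx as nx

rng = np.random.default_rng(2)
G = nx.erdos_renyi_graph(200, 0.03, seed=2)
beta, recover, budget, horizon = 0.05, 0.02, 10, 50
infected = rng.random(G.number_of_nodes()) < 0.1   # initial infections

for t in range(horizon):
    # Active screening: cure up to `budget` infected nodes, here picking the
    # highest-degree ones (a simple heuristic stand-in for planned screening).
    candidates = sorted(np.flatnonzero(infected), key=G.degree, reverse=True)
    for node in candidates[:budget]:
        infected[node] = False
    # SIS dynamics: infected neighbors transmit with probability beta,
    # and infected nodes recover spontaneously with probability `recover`.
    new_infected = infected.copy()
    for u, v in G.edges():
        if infected[u] and not infected[v] and rng.random() < beta:
            new_infected[v] = True
        if infected[v] and not infected[u] and rng.random() < beta:
            new_infected[u] = True
    new_infected &= rng.random(G.number_of_nodes()) >= recover
    infected = new_infected

print("final prevalence:", infected.mean())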

While the prior work provides high-quality solutions for short-term planning, it fails to scale to long time horizons while fully considering the future effects of current interventions. In this work, we address this challenge with a novel reinforcement learning (RL) approach based on Deep Q-Networks (DQNs). We propose an innovative two-level multi-agent framework that solves the problem hierarchically, and we speed up RL’s slow convergence by incorporating ideas from curriculum learning. The resulting approach scales to planning horizons up to 10 times longer than FULL-REMEDY can handle and outperforms FAST-REMEDY by up to 33% in solution quality.
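The curriculum idea can be sketched as a schedule that lengthens the planning horizon as training progresses. Here `train_episode` is an assumed placeholder for one DQN training episode on the screening environment, and the schedule values are hypothetical.

def curriculum_horizons(start=5, final=50, step=5):
    """Yield progressively longer planning horizons."""
    horizon = start
    while horizon <= final:
        yield horizon
        horizon += step

def train_with_curriculum(train_episode, episodes_per_stage=200):
    """`train_episode(horizon)` stands in for running one RL training episode
    on the environment truncated to `horizon` steps."""
    for horizon in curriculum_horizons():
        for _ in range(episodes_per_stage):
            train_episode(horizon)

# Example usage with a placeholder episode function:
train_with_curriculum(lambda horizon: None, episodes_per_stage=1)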

Identifying Risk Groups For Infectious Disease Outreach

Treatable infectious diseases are a critical challenge for public health. While treatment regimens may exist, and even be offered at low cost to patients, often individuals do not recognize the need to seek treatment or delay doing so, thereby increasing transmission risk to others.  Outreach and education campaigns can encourage undiagnosed patients to seek treatment. However, such programs must be carefully targeted to appropriate demographics (for the specific disease) in resource-constrained settings.

In prior work, we developed an algorithm to optimally allocate limited outreach resources among demographic groups in the population. The algorithm uses a novel multi-agent model of disease spread which both captures the underlying population dynamics and is amenable to optimization. Our algorithm extends, with provable guarantees, to a stochastic setting where we have only a distribution over parameters such as the contact pattern between agents. We evaluate our algorithm on two instances where this distribution is inferred from real world data: tuberculosis in India and gonorrhea in the United States.
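The flavor of the stochastic allocation problem can be sketched with a sample-average approximation: draw group-level parameters from their distribution, then greedily allocate outreach units where the expected marginal gain is largest. The numbers below are hypothetical, and the greedy rule is a simple stand-in for the paper’s optimization with guarantees.

import numpy as np

rng = np.random.default_rng(3)
n_groups, budget, n_samples = 4, 10, 100
# Each sample draws a per-group "cases averted per unit of outreach" curve with
# diminishing returns: value(x) = a * (1 - exp(-b * x)).
a = rng.uniform(5, 20, (n_samples, n_groups))
b = rng.uniform(0.1, 0.5, (n_samples, n_groups))

def expected_value(alloc):
    """Expected total cases averted, averaged over sampled parameter draws."""
    return (a * (1 - np.exp(-b * alloc))).mean(axis=0).sum()

alloc = np.zeros(n_groups)
for _ in range(budget):
    # Add one outreach unit to the group with the largest expected marginal gain.
    gains = [expected_value(alloc + np.eye(n_groups)[g]) for g in range(n_groups)]
    alloc[np.argmax(gains)] += 1

print("outreach units per demographic group:", alloc)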

Development Of Individual-Level Simulations Of Disease

While the medical literature offers a rich source of information about disease progression, screening, and treatment, the spread of disease through a population can be a complex process that requires simulation to understand. We develop individual-level simulations of infectious and non-infectious disease to probabilistically infer who will acquire disease, to project population disease metrics (such as prevalence and incidence) over long time horizons, and to evaluate the effectiveness (and cost) of interventions. These simulations draw on detailed knowledge about disease progression, patient behavior, and treatment outcomes from the medical literature and datasets. In prior projects, we have developed individual-level simulations of tuberculosis, and efforts to model chronic disease in elderly persons continue. Ongoing work examines how different models (with different runtimes, fidelity to the real world, and noise) can be used in combination to quickly and accurately identify optimal policies for disease control given individual heterogeneity.
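A minimal example of the individual-level approach: each person carries a discrete disease state that evolves year by year, and population metrics such as active-disease prevalence are read off the simulated cohort. The states and transition probabilities below are hypothetical, not calibrated values from the literature.

import numpy as np

rng = np.random.default_rng(4)
n_people, years = 10_000, 20
# States: 0 = susceptible, 1 = latent infection, 2 = active disease, 3 = treated/recovered
state = np.zeros(n_people, dtype=int)
state[rng.random(n_people) < 0.05] = 1          # 5% start with latent infection

p_infect, p_activate, p_treat = 0.01, 0.02, 0.7
prevalence = []
for year in range(years):
    active_frac = (state == 2).mean()
    # Susceptibles acquire latent infection at a rate scaling with active prevalence.
    newly_infected = (state == 0) & (rng.random(n_people) < p_infect + active_frac)
    # Latent infections progress to active disease with a small annual probability.
    newly_active = (state == 1) & (rng.random(n_people) < p_activate)
    # A fraction of active cases is detected and successfully treated each year.
    newly_treated = (state == 2) & (rng.random(n_people) < p_treat)
    state[newly_infected] = 1
    state[newly_active] = 2
    state[newly_treated] = 3
    prevalence.append((state == 2).mean())

print("active-disease prevalence by year:", np.round(prevalence, 4))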

Related Publications

Han-Ching Ou. 3/31/2022. “Sequential Network Planning Problems for Public Health Applications” PhD Thesis, Computer Science, Harvard University.

Aditya Mate, Arpita Biswas, Christoph Siebenbrunner, Susobhan Ghosh, and Milind Tambe. 5/2022. “Efficient Algorithms for Finite Horizon and Streaming Restless Multi-Armed Bandit Problems” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS).

Aditya Mate, Andrew Perrault, and Milind Tambe. 5/7/2021. “Risk-Aware Interventions in Public Health: Planning with Restless Multi-Armed Bandits” In 20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS). London, UK.

Jackson A Killian, Andrew Perrault, and Milind Tambe. 5/2021. “Beyond “To Act or Not to Act”: Fast Lagrangian Approaches to General Multi-Action Restless Bandits” In 20th International Conference on Autonomous Agents and Multiagent Systems.

Aditya Mate*, Jackson A. Killian*, Haifeng Xu, Andrew Perrault, and Milind Tambe. 12/5/2020. “Collapsing Bandits and Their Application to Public Health Interventions” In Advances in Neural Information Processing Systems (NeurIPS). Vancouver, Canada.

Jackson Killian, Bryan Wilder, Amit Sharma, Vinod Choudhary, Bistra Dilkina, and Milind Tambe. 8/4/2019. “Learning to Prescribe Interventions for Tuberculosis Patients Using Digital Adherence Data” In The 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 8/4/2019.

Han-Ching Ou, Arunesh Sinha, Sze-Chuan Suen, Andrew Perrault, Alpan Raval, and Milind Tambe. 5/9/2020. “Who and When to Screen: Multi-Round Active Screening for Network Recurrent Infectious Diseases Under Uncertainty” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS-20).

Sponsors

Contact us about being a sponsor for this project