Youth experiencing homelessness (YEH) are at elevated risk for HIV/AIDS and disproportionately identify as racial, ethnic, sexual, and gender minorities. We developed a new peer change agent (PCA) HIV prevention intervention with three arms: (1) an arm using an Artificial Intelligence (AI) planning algorithm to select PCAs; (2) a popularity arm, the standard PCA approach, operationalized as highest degree centrality (DC); and (3) an observation-only comparison group.
Community Health Workers (CHWs) form an important component of health-care systems globally, especially in low-resource settings. CHWs are often tasked with monitoring the health of and intervening on their patient cohort. Previous work has developed several classes of Restless Multi-Armed Bandits (RMABs) that are computationally tractable and indexable, a condition that guarantees asymptotic optimality, for solving such health monitoring and intervention problems (HMIPs). However, existing solutions to HMIPs fail to account for risk-sensitivity considerations of CHWs in the planning stage and may run the danger of ignoring some patients completely because they are deemed less valuable to intervene on. Additionally, these also rely on patients reporting their state of adherence accurately when intervened upon. Towards tackling these issues, our contributions in this paper are as follows: (1) We develop an RMAB solution to HMIPs that allows for reward functions that are monotone increasing, rather than linear, in the belief state and also supports a wider class of observations. (2) We prove theoretical guarantees on the asymptotic optimality of our algorithm for any arbitrary reward function. Additionally, we show that for the specific reward function considered in previous work, our theoretical conditions are stronger than the state-of-the-art guarantees. (3) We show the applicability of these new results for addressing the three issues pertaining to: risk-sensitive planning, equitable allocation and reliance on perfect observations as highlighted above. We evaluate these techniques on both simulated as well as real data from a prevalent CHW task of monitoring adherence of tuberculosis patients to their prescribed medication in Mumbai, India and show improved performance over the state-of-the-art. The simulation code is available at: https://github.com/AdityaMate/risk-aware-bandits.
We present the design and analysis of a multi-level game-theoretic model of hierarchical policy-making, inspired by policy responses to the COVID-19 pandemic. Our model captures the potentially mismatched priorities among a hierarchy of policy-makers (e.g., federal, state, and local governments) with respect to two main cost components that have opposite dependence on the policy strength, such as post-intervention infection rates and the cost of policy implementation. Our model further includes a crucial third fac- tor in decisions: a cost of non-compliance with the policy-maker immediately above in the hierarchy, such as non-compliance of state with federal policies. Our first contribution is a closed-form approximation of a recently published agent-based model to com- pute the number of infections for any implemented policy. Second, we present a novel equilibrium selection criterion that addresses common issues with equilibrium multiplicity in our setting. Third, we propose a hierarchical algorithm based on best response dynamics for computing an approximate equilibrium of the hierarchical policy-making game consistent with our solution concept. Finally, we present an empirical investigation of equilibrium policy strategies in this game as a function of game parameters, such as the degree of centralization and disagreements about policy priorities among the agents, the extent of free riding as well as fairness in the distribution of costs.
Global illegal wildlife trade threatens biodiversity and acts as a potential crisis of invasive species and disease spread. Despite a wide range of national and international policies and regulations designed to stop illegal wildlife trade, high profit margins and increasing demand drive a vigorous global illicit trade network. In this paper, we aim to build an adversarial model to predict the future wildlife trade based on the historical trade data. We hypothesize that the majority of illegal wildlife trade is opportunistic crime, which is highly correlated to legal wildlife trade. We can therefore leverage the abundant legal wildlife trade data to forecast the future wildlife trade, where a fixed fraction of trade volume will reflect the opportunistic wildlife trade volume. To learn a legal wildlife trade model, we propose to use graph neural networks and meta-learning to handle the network and species dependencies, respectively. Lastly, we suggest to incorporate agent-based models on top of our model to study the evolution from opportunistic to more organized illegal wildlife trade behavior.
Conservation efforts in green security domains to protect wildlife and forests are constrained by the limited availability of defenders (i.e., patrollers), who must patrol vast areas to protect from attackers (e.g., poachers or illegal loggers). Defenders must choose how much time to spend in each region of the protected area, balancing exploration of infrequently visited regions and exploitation of known hotspots. We formulate the problem as a stochastic multi-armed bandit, where each action represents a patrol strategy, enabling us to guarantee the rate of convergence of the patrolling policy. However, a naive bandit approach would compromise short-term performance for long-term optimality, resulting in animals poached and forests destroyed. To speed up performance, we leverage smoothness in the reward function and decomposability of actions. We show a synergy between Lipschitz-continuity and decomposition as each aids the convergence of the other. In doing so, we bridge the gap between combinatorial and Lipschitz bandits, presenting a no-regret approach that tightens existing guarantees while optimizing for short-term performance. We demonstrate that our algorithm, LIZARD, improves performance on real-world poaching data from Cambodia.
Several behavioral, social, and public health interventions, such as suicide/HIV prevention or community preparedness against natural disasters, leverage social network information to maximize outreach. Algorithmic influence maximization techniques have been proposed to aid with the choice of “peer leaders” or “influencers” in such interventions. Yet, traditional algorithms for influence maximization have not been designed with these interventions in mind. As a result, they may disproportionately exclude minority communities from the benefits of the intervention. This has motivated research on fair influence maximization. Existing techniques come with two major drawbacks. First, they require committing to a single fairness measure. Second, these measures are typically imposed as strict constraints leading to undesirable properties such as wastage of resources. To address these shortcomings, we pro- vide a principled characterization of the properties that a fair influence maximization algorithm should satisfy. In particular, we propose a framework based on social welfare theory, wherein the cardinal utilities derived by each community are aggregated using the isoelastic social welfare functions. Under this framework, the trade-off between fairness and efficiency can be controlled by a single inequality aversion design parameter. We then show under what circumstances our proposed principles can be satisfied by a welfare function. The resulting optimization problem is monotone and submodular and can be solved efficiently with optimality guarantees. Our frame- work encompasses as special cases leximin and proportional fairness. Extensive experiments on synthetic and real world datasets including a case study on landslide risk management demonstrate the efficacy of the proposed framework.
Active screening is a common approach in controlling the spread of recurring infectious diseases such as tuberculosis and influenza. In this approach, health workers periodically select a subset of population for screening. However, given the limited number of health workers, only a small subset of the population can be visited in any given time period. Given the recurrent nature of the disease and rapid spreading, the goal is to minimize the number of infections over a long time horizon. Active screening can be formalized as a sequential combinatorial optimization over the network of people and their connections. The main computational challenges in this formalization arise from i) the combinatorial nature of the problem, ii) the need of sequential planning and iii) the uncertainties in the infectiousness states of the population.
Previous works on active screening fail to scale to large time horizon while fully considering the future effect of current interventions. In this paper, we propose a novel reinforcement learning (RL) approach based on Deep Q-Networks (DQN), with several innovative adaptations that are designed to address the above challenges. First, we use graph convolutional networks (GCNs) to represent the Q-function that exploit the node correlations of the underlying contact network. Second, to avoid solving a combinatorial optimization problem in each time period, we decompose the node set selection as a sub-sequence of decisions, and further design a two-level RL framework that solves the problem in a hierarchical way. Finally, to speed-up the slow convergence of RL which arises from reward sparseness, we incorporate ideas from curriculum learning into our hierarchical RL approach. We evaluate our RL algorithm on several real-world networks. Results show that our RL algorithm can scale up to 10 times the problem size of state-of-the-art (the variant that considers the effect of future interventions but un-scalable) in terms of planning time horizon. Meanwhile, it outperforms state-of-the-art (the variant that scales up but does not consider the effect of future interventions) by up to 33% in solution quality.
Youth experiencing homelessness (YEH) are subject to substantially greater risk of HIV infection, compounded both by their lack of access to stable housing and the disproportionate representation of youth of marginalized racial, ethnic, and gender identity groups among YEH. A key goal for health equity is to improve adoption of protective behaviors in this population. One promising strategy for intervention is to recruit peer leaders from the population of YEH to promote behaviors such as condom usage and regular HIV testing to their social contacts. This raises a computational question: which youth should be selected as peer leaders to maximize the overall impact of the intervention? We developed an artificial intelligence system to optimize such social network interventions in a community health setting. We conducted a clinical trial enrolling 713 YEH at drop-in centers in a large US city. The clinical trial compared interventions planned with the algorithm to those where the highest-degree nodes in the youths' social network were recruited as peer leaders (the standard method in public health) and to an observation-only control group. Results from the clinical trial show that youth in the AI group experience statistically significant reductions in key risk behaviors for HIV transmission, while those in the other groups do not. This provides, to our knowledge, the first empirical validation of the usage of AI methods to optimize social network interventions for health. We conclude by discussing lessons learned over the course of the project which may inform future attempts to use AI in community-level interventions.