2019
Alan Tsang, Bryan Wilder, Eric Rice, Milind Tambe, and Yair Zick. 2019. “
Group-Fairness in Influence Maximization .” In International Joint Conference on Artificial Intelligence (IJCAI) 2019.
AbstractInfluence maximization is a widely used model for
information dissemination in social networks. Recent work has employed such interventions across
a wide range of social problems, spanning public
health, substance abuse, and international development (to name a few examples). A critical but understudied question is whether the benefits of such
interventions are fairly distributed across different
groups in the population; e.g., avoiding discrimination with respect to sensitive attributes such as race
or gender. Drawing on legal and game-theoretic
concepts, we introduce formal definitions of fairness in influence maximization. We provide an algorithmic framework to find solutions which satisfy fairness constraints, and in the process improve
the state of the art for general multi-objective submodular maximization problems. Experimental results on real data from an HIV prevention intervention for homeless youth show that standard influence maximization techniques oftentimes neglect
smaller groups which contribute less to overall utility, resulting in a disparity which our proposed algorithms substantially reduce.
2019_15_teamcore_group_fairness_in_influence_maximization.pdf Kai Wang, Bryan Wilder, Sze-Chuan Suen, Bistra Dilkina, and Milind Tambe. 2019. “
Improving GP-UCB Algorithm by Harnessing Decomposed Feedback.” In The 4th Workshop on Data Science for Social Good at ECML-PKDD, 2019.
AbstractGaussian processes (GPs) have been widely applied to machine learning and nonparametric approximation. Given existing observations, a GP allows
the decision maker to update a posterior belief over the unknown underlying function. Usually, observations from a complex system come with noise and decomposed feedback from intermediate layers. For example, the decomposed feedback
could be the components that constitute the final objective value, or the various
feedback gotten from sensors. Previous literature has shown that GPs can successfully deal with noise, but has neglected decomposed feedback. We therefore propose a decomposed GP regression algorithm to incorporate this feedback, leading
to less average root-mean-squared error with respect to the target function, especially when the samples are scarce. We also introduce a decomposed GP-UCB
algorithm to solve the resulting bandit problem with decomposed feedback. We
prove that our algorithm converges to the optimal solution and preserves the noregret property. To demonstrate the wide applicability of this work, we execute
our algorithm on two disparate social problems: infectious disease control and
weather monitoring. The numerical results show that our method provides significant improvement against previous methods that do not utilize these feedback,
showcasing the advantage of considering decomposed feedback.
2019_23_teamcore_ecml_sogood_dgpucb.pdf Qingyu Guo, Jiarui Gan, Fei Fang, Long Tran-Thanh, Milind Tambe, and Bo An. 2019. “
On the Inducibility of Stackelberg Equilibrium for Security Games .” In AAAI conference on Artificial Intelligence (AAAI-19).
AbstractStrong Stackelberg equilibrium (SSE) is the standard solution concept of Stackelberg security games. As opposed to
the weak Stackelberg equilibrium (WSE), the SSE assumes
that the follower breaks ties in favor of the leader and this is
widely acknowledged and justified by the assertion that the
defender can often induce the attacker to choose a preferred
action by making an infinitesimal adjustment to her strategy.
Unfortunately, in security games with resource assignment
constraints, the assertion might not be valid; it is possible
that the defender cannot induce the desired outcome. As a result, many results claimed in the literature may be overly optimistic. To remedy, we first formally define the utility guarantee of a defender strategy and provide examples to show that
the utility of SSE can be higher than its utility guarantee. Second, inspired by the analysis of leader’s payoff by Von Stengel and Zamir (2004), we provide the solution concept called
the inducible Stackelberg equilibrium (ISE), which owns the
highest utility guarantee and always exists. Third, we show
the conditions when ISE coincides with SSE and the fact
that in general case, SSE can be extremely worse with respect to utility guarantee. Moreover, introducing the ISE does
not invalidate existing algorithmic results as the problem of
computing an ISE polynomially reduces to that of computing
an SSE. We also provide an algorithmic implementation for
computing ISE, with which our experiment
2019_13_teamcore_main_ise.pdf Bryan Wilder, Jackson A. Killian, Amit Sharma, Vinod Choudhary, Bistra Dilkina, and Milind Tambe. 2019. “
Integrating optimization and learning to prescribe interventions for tuberculosis patients .” In 10th International Workshop on Optimization in Multiagent Systems (OptMAS).
AbstractCreating impact in real-world settings requires agents which
navigate the full pipeline from data, to predictive models, to decisions.
These components are typically approached separately: a machine learning model is first trained via a measure of predictive accuracy, and then
its predictions are used as input into an optimization algorithm which
produces a decision. However, the loss function used to train the model
may easily be misaligned with the end goal of the agent, which is to make
the best decisions possible.
We focus on combinatorial optimization problems and introduce a general framework for decision-focused learning, where the machine learning model is directly trained in conjunction with the optimization algorithm to produce high-quality decisions. Technically, our contribution is
a means of integrating common classes of discrete optimization problems
into deep learning or other predictive models, which are typically trained
via gradient descent. The main idea is to use a continuous relaxation of
the discrete problem to propagate gradients through the optimization
procedure. We instantiate this framework for two broad classes of combinatorial problems: linear programs and submodular maximization.
We then provide an application of such techniques to a real problem of
societal importance: improving interventions in tuberculosis treatment.
Using data on 17,000 Indian patients provided by the NGO Everwell,
we consider the problem of predicting which patients are likely to miss
doses of medication in the near future and optimizing interventions by
health workers to avert such treatment failures. We find the decisionfocused learning improves the number of successful interventions by approximately 15% compared to standard machine learning approaches,
demonstrating that aligning the goals of learning and decision making
can yield substantial benefits in a socially critical application.
2019_17_teamcore_optmas_submission.pdf Sarah Cooney, Kai Wang, Elizabeth Bondi, Thanh Nguyen, Phebe Vayanos, Hailey Winetrobe, Edward A. Cranford, Cleotilde Gonzalez, Christian Lebiere, and Milind Tambe. 2019. “
Learning to Signal in the Goldilocks Zone: Improving Adversary Compliance in Security Games .” In European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, (ECML-PKDD), 2019.
AbstractMany real-world security scenarios can be modeled via a
game-theoretic framework known as a security game in which there is
a defender trying to protect potential targets from an attacker. Recent
work in security games has shown that deceptive signaling by the defender can convince an attacker to withdraw his attack. For instance,
a warning message to commuters indicating speed enforcement is in
progress ahead might lead to them driving more slowly, even if it turns
out no enforcement is in progress. However, the results of this work are
limited by the unrealistic assumption that the attackers will behave with
perfect rationality, meaning they always choose an action that gives them
the best expected reward. We address the problem of training boundedly
rational (human) attackers to comply with signals via repeated interaction with signaling without incurring a loss to the defender, and offer the
four following contributions: (i) We learn new decision tree and neural
network-based models of attacker compliance with signaling. (ii) Based
on these machine learning models of a boundedly rational attacker’s response to signaling, we develop a theory of signaling in the Goldilocks
zone, a balance of signaling and deception that increases attacker compliance and improves defender utility. (iii) We present game-theoretic
algorithms to solve for signaling schemes based on the learned models
of attacker compliance with signaling. (iv) We conduct extensive human
subject experiments using an online game. The game simulates the scenario of an inside attacker trying to steal sensitive information from company computers, and results show that our algorithms based on learned
models of attacker behavior lead to better attacker compliance and improved defender utility compared to the state-of-the-art algorithm for
rational attackers with signaling.
2019_14_teamcore_ecml_camera_ready.pdf Bryan Wilder, Bistra Dilkina, and Milind Tambe. 2019. “
Melding the Data-Decisions Pipeline: Decision-Focused Learning for Combinatorial Optimization .” In AAAI conference on Artificial Intelligence (AAAI-19).
AbstractCreating impact in real-world settings requires artificial intelligence techniques to span the full pipeline from data, to
predictive models, to decisions. These components are typically approached separately: a machine learning model is
first trained via a measure of predictive accuracy, and then its
predictions are used as input into an optimization algorithm
which produces a decision. However, the loss function used to
train the model may easily be misaligned with the end goal,
which is to make the best decisions possible. Hand-tuning
the loss function to align with optimization is a difficult and
error-prone process (which is often skipped entirely).
We focus on combinatorial optimization problems and introduce a general framework for decision-focused learning,
where the machine learning model is directly trained in conjunction with the optimization algorithm to produce highquality decisions. Technically, our contribution is a means
of integrating common classes of discrete optimization problems into deep learning or other predictive models, which are
typically trained via gradient descent. The main idea is to use
a continuous relaxation of the discrete problem to propagate
gradients through the optimization procedure. We instantiate
this framework for two broad classes of combinatorial problems: linear programs and submodular maximization. Experimental results across a variety of domains show that decisionfocused learning often leads to improved optimization performance compared to traditional methods. We find that standard
measures of accuracy are not a reliable proxy for a predictive
model’s utility in optimization, and our method’s ability to
specify the true goal as the model’s training objective yields
substantial dividends across a range of decision problems.
2019_12_teamcore_aaai_diff_opt_final.pdf Sarah Cooney, Wendy Gomez, Kai Wang, Jorja Leap, P. Jefferey Brantingham, and Milind Tambe. 2019. “
Mobile Game Theory with Street Gangs.” In The 4th Workshop on Data Science for Social Good at ECML-PKDD, 2019.
AbstractGang violence remains a persistent public safety problem
in Los Angeles. Gang interventionists and community organizers are
turning to proactive peacekeeping, a process of addressing the underlying structures that cause young people to join gangs such as pervasive poverty and marginalization. Given the increasing prevalence and
decreasing cost of mobile technology, there may be opportunities for interventionists to employ technological solutions in their work. However,
before such solutions can be deployed, it is necessary to have accurate
models of the target users—in this case, gang-involved youth. Of particular interest with regard proactive peacekeeping is their propensity for
cooperation. However, given the unique circumstances surrounding the
lives of gang-involved youth, traditional laboratory-based experiments
measuring cooperation are infeasible. In this paper, we present a novel
method of collecting experimental data from gang-involved youth in the
Los Angeles area. We design a mobile application based on the classic Prisoner’s Dilemma model, which has been used to collect almost
3000 data points on cooperation from more than 20 participants. We
present initial results that show despite their unique life circumstances
gang-involved youth cooperate at roughly the same rate as university students in classic studies of cooperation. We conclude by addressing the
implications of this result for future work and proactive peacekeeping
endeavors
2019_21_teamcore_so_good2019b.pdf Shahrzad Gholami. 2019. “
Predicting and Planning against Real-world Adversaries: An End-to-end Pipeline to Combat Illegal Wildlife Poachers on a Global Scale”.
AbstractSecurity is a global concern and a unifying theme in various security projects is strategic reasoning where the mathematical framework of machine learning and game theory can be integrated
and applied. For example, in the environmental sustainability domain, the problem of protecting
endangered wildlife from attacks (i.e., poachers’ strikes) can be abstracted as a game between
defender(s) and attacker(s). Applying previous research on security games to sustainability domains (denoted as Green Security Games) introduce several novel challenges that I address in
my thesis to create computationally feasible and accurate algorithms in order to model complex
adversarial behavior based on real-world data and to generate optimal defender strategy.
My thesis provides four main contributions to the emerging body of research in using machine
learning and game theory framework for the fundamental challenges existing in the environmental sustainability domain, namely (i) novel spatio-temporal and uncertainty-aware machine
learning models for complex adversarial behavior based on the imperfect real-world data, (ii) the
first large-scale field test evaluation of the machine learning models in the adversarial settings
concerning the environmental sustainability, (iii) a novel multi-expert online learning model for
constrained patrol planning, and (iv) the first game theoretical model to generate optimal defender
strategy against collusive adversaries.
In regard to the first contribution, I developed bounded rationality models for adversaries
based on the real-world data that account for the naturally occurring uncertainty in past attack
evidence collected by defenders. To that end, I proposed two novel predictive behavioral models,
which I improved progressively. The second major contribution of my thesis is a large-scale field
test evaluation of the proposed adversarial behavior model beyond the laboratory. Particularly,
my thesis is motivated by the challenges in wildlife poaching, where I directed the defenders
(i.e., rangers) to the hotspots of adversaries that they would have missed. During these experiments across multiple vast national parks, several snares and snared animals were detected, and
poachers were arrested, potentially more wildlife saved. The algorithm I proposed, that combines
machine learning and game-theoretic patrol planning is planned to be deployed at 600 national
parks around the world in the near future to combat illegal poaching.
The third contribution in my thesis introduces a novel multi-expert online learning model for
constrained and randomized patrol planning, which benefits from several expert planners where
insufficient or imperfect historical records of past attacks are available to learn adversarial behavior. The final contribution of my thesis is developing an optimal solution against collusive
adversaries in security games assuming both rational and boundedly rational adversaries. I conducted human subject experiments on Amazon Mechanical Turk involving 700 human subjects
using a web-based game that simulates collusive security games.
2019_3_teamcore_shahrzad_thesis.pdf Aida Rahmattalabi, Phebe Vayanos, Anthony Fulginiti, and Milind Tambe. 2019. “
Robust Peer-Monitoring on Graphs with an Application to Suicide Prevention in Social Networks .” In International Conference on Autonomous Agents and Multiagent Systems (Extended Abstract) (AAMAS-19).
AbstractWe consider the problem of selecting a subset of nodes (individuals)
in a (social) network that can act as monitors capable of “watchingout” for their neighbors (friends) when the availability or performance of the chosen monitors is uncertain. Such problems arise
for example in the context of “Gatekeeper Trainings” for suicide
prevention. We formulate this problem as a two-stage robust optimization problem that aims to maximize the worst-case number of
covered nodes. Our model is capable of incorporating domain specific constraints, e.g., fairness constraints. We propose a practically
tractable approximation scheme and we provide empirical results
that demonstrate the effectiveness of our approach.
2019_6_teamcore_aamas2019_robust_monitoring_on_social_networks_extended_abstract.pdf Edward A. Cranford, Cleotilde Gonzalez, Palvi Aggarwal, Sarah Cooney, Milind Tambe, and Christian Lebiere. 2019. “
Towards Personalized Deceptive Signaling for Cyber Defense Using Cognitive Models .” In International Conference on Cognitive Modeling (ICCM), 2019.
AbstractRecent research in cybersecurity has begun to develop active
defense strategies using game-theoretic optimization of the
allocation of limited defenses combined with deceptive
signaling. While effective, the algorithms are optimized against
perfectly rational adversaries. In a laboratory experiment, we
pit humans against the defense algorithm in an online game
designed to simulate an insider attack scenario. Humans attack
far more often than predicted under perfect rationality.
Optimizing against human bounded rationality is vitally
important. We propose a cognitive model based on instancebased learning theory and built in ACT-R that accurately
predicts human performance and biases in the game. We show
that the algorithm does not defend well, largely due to its static
nature and lack of adaptation to the particular individual’s
actions. Thus, we propose an adaptive method of signaling that
uses the cognitive model to trace an individual’s experience in
real time, in order to optimize defenses. We discuss the results
and implications of personalized defense.
Keywords: cyber deception; cognitive models; instance-based
learning; knowledge-tracing; model-tracing
2019_16_teamcore_iccm2019_paper_56.pdf Elizabeth Bondi, Hoon Oh, Haifeng Xu, Fei Fang, Bistra Dilkina, and Milind Tambe. 2019. “
Using Game Theory in Real Time in the Real World: A Conservation Case Study (Demonstration) .” In International Conference on Autonomous Agents and Multiagent Systems (Demonstration) (AAMAS-19).
AbstractIn the real world, real-time data are now widely available, especially in security domains. Security cameras, aerial imagery, and even social media keep defenders informed when protecting important events, locations, and people. Further, advances in artificial intelligence have led to tools that can interpret these data automatically. Game theoretic models, for example, have shown great success in security. However, most of them ignore real-time information. In this paper, we demonstrate the potential to use real-time information from imagery to better inform our decisions in game theoretic models for security. As a concrete example, a conservation group called Air Shepherd uses conservation drones equipped with thermal infrared cameras to locate poachers at night and alert park rangers. They have also used lights aboard the drones, or signaled, to warn poachers of their presence, which often deters the poachers. We propose a system that (i) allocates drones and humans strategically throughout a protected area, (ii) detects poachers in the thermal infrared videos recorded by the conservation drones flying through the protected area in the predetermined location, and (iii) recommends moving to the location and/or signaling to the poacher that a patroller is nearby depending on real-time detections.
View the demonstration.
2019_9_teamcore_aamas_2019_demo.pdf Kai Wang, Aditya Mate, Bryan Wilder, Andrew Perrault, and Milind Tambe. 2019. “
Using Graph Convolutional Networks to Learn Interdiction Games .” In AI for Social Good workshop (AI4SG) at International Joint Conference on Artificial Intelligence (IJCAI) 2019.
AbstractIllegal smuggling is one of the most important issues across countries, where more than $10 billion
a year of illegal wildlife trafficking is conducted
within transnational criminal networks. Governments have tried to deploy inspections at checkpoints to stop illegal smuggling, though the effect
is quite limited due to vast protection areas but limited human resources. We study these problems
from the perspective of network interdiction games
with a boundedly rational attacker. In this paper, we
aim to improve the efficiency of the limited number of checkpoints. The problem involves two main
stages: i) a predictive stage to predict the attacker’s
behavior based on the historical interdiction; ii)
a prescriptive stage to optimally allocate limited
checkpoints to interdict the attacker. In this paper, we propose a novel boundedly rational model
which resolves the issue of exponentially many attacker strategies by making memoryless assumption about the attacker’s behavior. We show that
the attacker’s behavior can be reduced to an absorbing Markov chain, where the success probability of
reaching any target can be computed analytically,
thus optimized via any gradient-based optimization technique. We incorporate graph convolutional
neural networks with K-hops look-ahead to model
the attacker’s reasoning. Our proposed model provides a new perspective to study the boundedly
rationality in traditional interdiction games with
graph structure. This novel model possesses nice
analytical properties and scales up very well by
avoiding enumerating all paths in the graph.
2019_22_teamcore_ijcai_ai4sg_gcn_interdiction.pdf Elizabeth Bondi. 2019. “
Visionary Security: Using Uncertain Real-Time Information in Signaling Games .” In International Joint Conference on Artificial Intelligence (IJCAI) 2019 (Doctoral Consortium).
AbstractIn important domains from natural resource conservation to public safety, real-time information is becoming increasingly important. Strategic deployment of security cameras and mobile sensors such as drones can provide real-time updates on illegal activities. To help plan for such strategic deployments of sensors and human patrollers, as well as warning signals to ward off adversaries, the defender-attacker security games framework can be used. [Zhang et al., 2019] has shown that real-time data (e.g., human view from a helicopter) may be used in conjunction with security game models to interdict criminals. Other recent work relies on real-time information from sensors that can notify the patroller when an opponent is detected [Basilico et al., 2017; Xu et al., 2018]. Despite considering real-time information in all cases, these works do not consider the combined situation of uncertainty in real-time information in addition to strategically signaling to adversaries. In this thesis, we will not only address this gap, but also improve the overall security result by considering security game models and computer vision algorithms together. A major aspect of this work is in applying it to real-world challenges, such as conservation. Although it applies to many environmental challenges, such as protecting forests and avoiding illegal mining, we will focus particularly on reducing poaching of endangered wildlife as an example. To reduce poaching, human patrollers typically search for snares and poaching activity as they patrol, as well as intervene if poaching activity is found. Drones are useful patrolling aids due to their ability to cover additional ground, but they must interpret their environments, notify nearby human patrollers for intervention, and send potentially deceptive signals to the adversary to deter poaching. Rather than treating these as separate tasks, models must coordinate to handle challenges found in real-world conservation scenarios (Fig. 1). We will determine the success of this work both in simulated experiments and through work with conservation agencies such as Air Shepherd to implement the system in the real world.
2019_15-5_teamcore_0902.pdf Sarah Cooney, Phebe Vayanos, Thanh H. Nguyen, Cleotilde Gonzalez, Christian Lebiere, Edward A. Cranford, and Milind Tambe. 2019. “
Warning Time: Optimizing Strategic Signaling for Security Against Boundedly Rational Adversaries.” In International Conference on Autonomous Agents and Multiagent Systems (Extended Abstract) (AAMAS-19).
AbstractDefender-attacker Stackelberg security games (SSGs) have been
applied for solving many real-world security problems. Recent work
in SSGs has incorporated a deceptive signaling scheme into the
SSG model, where the defender strategically reveals information
about her defensive strategy to the attacker, in order to inuence
the attacker’s decision making for the defender’s own benet. In
this work, we study the problem of signaling in security games
against a boundedly rational attacker.
2019_7_teamcore_sample_aamas19.pdf Han-Ching Ou, Arunesh Sinha, Sze-Chuan Suen, Andrew Perrault, and Milind Tambe. 2019. “
Who and When to Screen: Multi-Round Active Screening for Recurrent Infectious Diseases Under Uncertainty .” In Joint Workshop on Autonomous Agents for Social Good (AASG) at International Conference on Autonomous Agents and Multiagent Systems (AAMAS-19).
AbstractControlling recurrent infectious diseases is a vital yet complicated problem. In this paper, we propose a novel active screening
model (ACTS) and algorithms to facilitate active screening for recurrent diseases (no permanent immunity) under infection uncertainty. Our
contributions are: (1) A new approach to modeling multi-round networkbased screening/contact tracing under uncertainty, which is a common
real-life practice in a variety of diseases [10, 30]; (2) Two novel algorithms,
Full- and Fast-REMEDY. Full-REMEDY considers the effect of future actions and finds a policy that provides high solution quality, where
Fast-REMEDY scales linearly in the size of the network; (3) We evaluate
Full- and Fast-REMEDY on several real-world datasets which emulate human contact and find that they control diseases better than the
baselines. To the best of our knowledge, this is the first work on multiround active screening with uncertainty for diseases with no permanent
immunity.
2019_1_teamcore_whoandwhentoscreenmultiroundactivescreeningforrecurrentinfectiousdiseasesunderuncertainty.pdf Bryan Wilder, Eric Ewing, Bistra Dilkina, and Milind Tambe. 2019. “
End to End Learning and Optimization on Graphs.” In Advances in Neural and Information Processing Systems (NeurIPS-19), 2019.
AbstractReal-world applications often combine learning and optimization problems on graphs. For instance, our objective may be to cluster the graph in order to detect meaningful communities (or solve other common graph optimization problems such as facility location, maxcut, and so on). However, graphs or related attributes are often only partially observed, introducing learning problems such as link prediction which must be solved prior to optimization. Standard approaches treat learning and optimization entirely separately, while recent machine learning work aims to predict the optimal solution directly from the inputs. Here, we propose an alternative decision-focused learning approach that integrates a differentiable proxy for common graph optimization problems as a layer in learned systems. The main idea is to learn a representation that maps the original optimization problem onto a simpler proxy problem that can be efficiently differentiated through. Experimental results show that our CLUSTERNET system outperforms both pure end-to-end approaches (that directly predict the optimal solution) and standard approaches that entirely separate learning and optimization. Code for our system is available at
https://github.com/bwilder0/clusternet.
graphopt_neurips.pdf Aida Rahmattalabi, Phebe Vayanos, Anthony Fulginiti, Eric Rice, Bryan Wilder, Amulya Yadav, and Milind Tambe. 2019. “
Exploring Algorithmic Fairness in Robust Graph Covering Problems.” In . Advances in Neural and Information Processing Systems (NeurIPS-19), 2019.
AbstractFueled by algorithmic advances, AI algorithms are increasingly being deployed in
settings subject to unanticipated challenges with complex social effects. Motivated
by real-world deployment of AI driven, social-network based suicide prevention
and landslide risk management interventions, this paper focuses on robust graph
covering problems subject to group fairness constraints. We show that, in the
absence of fairness constraints, state-of-the-art algorithms for the robust graph covering problem result in biased node coverage: they tend to discriminate individuals
(nodes) based on membership in traditionally marginalized groups. To mitigate
this issue, we propose a novel formulation of the robust graph covering problem
with group fairness constraints and a tractable approximation scheme applicable to
real-world instances. We provide a formal analysis of the price of group fairness
(PoF) for this problem, where we show that uncertainty can lead to greater PoF. We
demonstrate the effectiveness of our approach on several real-world social networks.
Our method yields competitive node coverage while significantly improving group
fairness relative to state-of-the-art methods.
camera-ready-faircovering.pdf Aida Rahmattalabi, Anamika Barman Adhikari, Phebe Vayanos, Milind Tambe, Eric Rice, and Robin Baker. 2019. “
Social Network Based Substance Abuse Prevention via Network Modification (A Preliminary Study).” In Strategic Reasoning for Societal Challenges (SRSC) Workshop at International Conference on Autonomous Agents and Multiagent Systems (AAMAS-19).
AbstractSubstance use and abuse is a significant public health problem in the
United States. Group-based intervention programs offer a promising
means of preventing and reducing substance abuse. While effective,
unfortunately, inappropriate intervention groups can result in an
increase in deviant behaviors among participants, a process known
as deviancy training. This paper investigates the problem of optimizing the social influence related to the deviant behavior via careful
construction of the intervention groups. We propose a Mixed Integer Optimization formulation that decides on the intervention
groups to be formed, captures the impact of the intervention groups
on the structure of the social network, and models the impact of
these changes on behavior propagation. In addition, we propose
a scalable hybrid meta-heuristic algorithm that combines Mixed
Integer Programming and Large Neighborhood Search to find nearoptimal network partitions. Our algorithm is packaged in the form
of GUIDE, an AI-based decision aid that recommends intervention groups. Being the first quantitative decision aid of this kind,
GUIDE is able to assist practitioners, in particular social workers, in
three key areas: (a) GUIDE proposes near-optimal solutions that are
shown, via extensive simulations, to significantly improve over the
traditional qualitative practices for forming intervention groups;
(b) GUIDE is able to identify circumstances when an intervention
will lead to deviancy training, thus saving time, money, and effort;
(c) GUIDE can evaluate current strategies of group formation and
discard strategies that will lead to deviancy training. In developing
GUIDE, we are primarily interested in substance use interventions
among homeless youth as a high risk and vulnerable population.
GUIDE is developed in collaboration with Urban Peak, a homelessyouth serving organization in Denver, CO, and is under preparation
for deployment.
2019_5_teamcore_aida_aamas2019_workshop_substance_abuse.pdf Xinrun Wang, Milind Tambe, Branislav Boˇsansky ́, and Bo An. 2019. “
When Players Affect Target Values: Modeling and Solving Dynamic Partially Observable Security Games.” In . Conference on Decision and Game Theory for Security, 2019.
AbstractMost of the current security models assume that the values of targets/areas are static or the changes (if any) are scheduled and
known to the defender. Unfortunately, such models are not sufficient
for many domains, where actions of the players modify the values of
the targets. Examples include wildlife scenarios, where the attacker can
increase value of targets by secretly building supporting facilities. To address such security game domains with player-affected values, we first
propose DPOS3G, a novel partially observable stochastic Stackelberg
game where target values are determined by the players’ actions; the defender can only partially observe these targets’ values, while the attacker
can fully observe the targets’ values and the defender’s strategy. Second,
we propose RITA (Reduced game Iterative Transfer Algorithm), which
is based on the heuristic search value iteration algorithm for partially observable stochastic game (PG-HSVI) and introduces three key novelties:
(a) building a reduced game with only key states (derived from partitioning the state space) to reduce the numbers of states and transitions
considered when solving the game; (b) incrementally adding defender’s
actions to further reduce the number of transitions of the game; (c) providing novel heuristics for lower bound initialization of the algorithm.
Third, we conduct extensive experimental evaluations of the algorithms
and the results show that RITA significantly outperforms the baseline
PG-HSVI algorithm on scalability while allowing for trade off in scalability and solution quality.
2019_20_teamcore_stochastic_usc_game_sec_08052019_updated.pdf