Publications

2015
L. S. Marcolino. 1/2015. “Multi-agent Team Formation: Solving Complex Problems by Aggregating Opinions.” In Conference on Artificial Intelligence (AAAI 2015). (Doctoral Consortium). Texas, USA.
Aggregating the opinions of different agents is a powerful way to find high-quality solutions to complex problems. However, when using agents in this fashion, there are two fundamental open questions. First, given a universe of agents, how can we quickly identify which ones should be used to form a team? Second, given a team of agents, what is the best way to aggregate their opinions? Many researchers value diversity when forming teams. LiCalzi and Surucu (2012) and Hong and Page (2004) propose models where the agents know the utility of the solutions, and the team converges to the best solution found by one of its members. Clearly, in complex problems the utility of solutions would not be available, and agents would have to resort to other methods, such as voting, to reach a common decision. Lamberson and Page (2012) study diversity in the context of forecasts, where the solutions are represented by real numbers and the team takes the average of the opinions of its members. Domains where the possible solutions are discrete, however, are not captured by such a model. I proposed a new model to study teams of agents that vote in discrete solution spaces (Marcolino, Jiang, and Tambe 2013), where I show that a diverse team of weaker agents can overcome a uniform team made of copies of the best agent. However, this phenomenon does not always occur, and it is still necessary to identify when we should use diverse teams and when uniform teams would be more appropriate. Hence, in Marcolino et al. (2014b), I shed new light on this problem by presenting a new, more general model of diversity for teams of voting agents. Using that model, I can predict that diverse teams perform better than uniform teams in problems with a large action space. All my predictions are verified in a real system of voting agents in the Computer Go domain.
I show that: (i) a team of diverse players achieves a higher winning rate than a uniform team made of copies of the best agent; (ii) the diverse team plays increasingly better as the board size increases. Moreover, I also performed an experimental study in the building design domain. This is a fundamentally important domain, since it is known that the design of a building has a major impact on its energy consumption throughout its whole lifespan (Lin and Gerber 2014). It is essential to design energy-efficient buildings; meanwhile, it is important to balance other factors, such as construction cost, creating a multi-objective optimization problem. I show that by aggregating the opinions of a team of agents, a higher number of 1st-ranked solutions in the Pareto frontier is found than when using a single agent. Moreover, my approach eliminates falsely reported 1st-ranked solutions (Marcolino et al. 2014a; 2015). As mentioned, studying different aggregation rules is also fundamental. In Jiang et al. (2014), I introduce a novel method to extract a ranking from agents, based on the frequency with which actions are played when sampling them multiple times. My method leads to significant improvements in the winning rate in Go games when using the Borda voting rule to aggregate the generated rankings.
2015_2_teamcore_aaai15.pdf
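The Borda rule mentioned at the end of the abstract is a classical positional voting rule, and its mechanics are easy to sketch. The following is a minimal illustration in Python (the function name and example rankings are mine, not the paper's): each agent submits a ranking of candidate moves, an action in position i of an m-item ranking earns m - 1 - i points, and the team plays the action with the highest total.

```python
def borda_aggregate(rankings):
    """Aggregate agents' rankings with the Borda rule.

    Each ranking lists actions from most to least preferred; an action
    in position i of an m-item ranking scores m - 1 - i points.
    Returns all actions sorted by total Borda score, best first.
    """
    scores = {}
    for ranking in rankings:
        m = len(ranking)
        for i, action in enumerate(ranking):
            scores[action] = scores.get(action, 0) + (m - 1 - i)
    return sorted(scores, key=scores.get, reverse=True)

# Three agents rank four candidate moves ("a".."d").
rankings = [["a", "b", "c", "d"],
            ["b", "a", "d", "c"],
            ["a", "c", "b", "d"]]
print(borda_aggregate(rankings))  # "a" wins with 3 + 2 + 3 = 8 points
```

In the paper's setting the rankings themselves are not given directly by the agents; they are extracted by sampling each agent's play many times and ranking actions by frequency.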
L. S. Marcolino, H. Xu, D. Gerber, B. Kolev, S. Price, E. Pantazis, and M. Tambe. 2015. “Agent Teams for Design Problems.” In International Workshop on Coordination, Organisations, Institutions and Norms (COIN 2015).
Design imposes a novel social choice problem: using a team of voting agents, maximize the number of optimal solutions, allowing a user to then make an aesthetic choice. In an open system of design agents, team formation is fundamental. We present the first model of agent teams for design. For maximum applicability, we envision agents that are queried for a single opinion, with multiple solutions obtained over multiple iterations. We show that diverse teams composed of agents with different preferences maximize the number of optimal solutions, while uniform teams composed of multiple copies of the best agent are in general suboptimal. Our experiments study the model in bounded time; we also study a real system, in which agents vote to design buildings.
2015_31_teamcore_coin2015.pdf
L. S. Marcolino, D. Gerber, B. Kolev, S. Price, E. Pantazis, Y. Tian, and M. Tambe. 2015. “Agents vote for the environment: Designing energy-efficient architecture.” In Workshop on Computational Sustainability (AAAI 2015).
Saving energy is a major concern. Hence, it is fundamental to design and construct buildings that are energy-efficient. It is known that the early stage of architectural design has a significant impact on this matter. However, it is complex to create designs that are optimally energy-efficient while also balancing other essential criteria such as economics, space, and safety. One state-of-the-art approach is to create parametric designs and use a genetic algorithm to optimize across different objectives. We further improve this method by aggregating the solutions of multiple agents. We evaluate diverse teams, composed of different agents, and uniform teams, composed of multiple copies of a single agent. We test our approach across three design cases of increasing complexity, and show that the diverse team provides a significantly larger percentage of optimal solutions than single agents.
2015_30_teamcore_wcs2015.pdf
Matthew Brown. 2015. “Balancing Tradeoffs in Security Games: Handling Defenders and Adversaries with Multiple Objectives.”
Stackelberg security games (SSGs) have received a significant amount of attention in the literature for modeling the strategic interactions between a defender and an adversary, in which the defender has a limited amount of security resources to protect a set of targets from a potential attack by the adversary. SSGs are at the heart of several significant decision-support applications deployed in real world security domains. All of these applications rely on standard assumptions made in SSGs, including that the defender and the adversary each have a single objective, which is to maximize their expected utility. Given the successes and real world impact of previous SSG research, there is a natural desire to push towards increasingly complex security domains, leading to a point where considering only a single objective is no longer appropriate. My thesis focuses on incorporating multiple objectives into SSGs. With multiple conflicting objectives for either the defender or adversary, there is no single solution that maximizes all objectives simultaneously, and tradeoffs between the objectives must be made. Thus, my thesis provides two main contributions by addressing the research challenges raised by considering SSGs with (1) multiple defender objectives and (2) multiple adversary objectives. These contributions consist of approaches for modeling, calculating, and analyzing the tradeoffs between objectives in a variety of different settings. First, I consider multiple defender objectives resulting from diverse adversary threats, where protecting against each type of threat is treated as a separate objective for the defender. Second, I investigate the defender’s need to balance between the exploitation of collected data and the exploration of alternative strategies in patrolling domains. Third, I explore the necessary tradeoff between the efficacy and the efficiency of the defender’s strategy in screening domains.
Fourth, I examine multiple adversary objectives for heterogeneous populations of boundedly rational adversaries that no longer strictly maximize expected utility. The contributions of my thesis provide the novel game models and algorithmic techniques required to incorporate multiple objectives into SSGs. My research advances the state of the art in SSGs and opens up the model to new types of security domains that could not have been handled previously. As a result, I developed two applications for real world security domains that either have been or will be tested and evaluated in the field.
2015_23_teamcore_brown_thesis.pdf
Benjamin Ford, Thanh Nguyen, Milind Tambe, Nicole Sintov, and Francesco Delle Fave. 2015. “Beware the Soothsayer: From Attack Prediction Accuracy to Predictive Reliability in Security Games.” In Conference on Decision and Game Theory for Security.
Interdicting the flow of illegal goods (such as drugs and ivory) is a major security concern for many countries. The massive scale of these networks, however, forces defenders to make judicious use of their limited resources. While existing solutions model this problem as a Network Security Game (NSG), they do not consider humans’ bounded rationality. Previous human behavior modeling work in Security Games, however, makes use of large training datasets that are unrealistic in real-world situations; the ability to effectively test many models is constrained by the time-consuming and complex nature of field deployments. In addition, there is an implicit assumption in these works that a model’s prediction accuracy strongly correlates with the performance of its corresponding defender strategy (referred to as predictive reliability). If the assumption of predictive reliability does not hold, this could lead to substantial losses for the defender. In the following paper, we (1) first demonstrate that predictive reliability is indeed strong for previous Stackelberg Security Game experiments. We also run our own set of human subject experiments in such a way that models are restricted to learning on dataset sizes representative of real-world constraints. In the analysis of that data, we demonstrate that (2) predictive reliability is extremely weak for NSGs. Following that discovery, however, we identify (3) key factors that influence predictive reliability results: the training set’s exposed attack surface and graph structure.
2015_36_teamcore_ford15_gamesec.pdf
B. Bosansky, A. Jiang, M. Tambe, and C. Kiekintveld. 2015. “Combining Compact Representation and Incremental Generation in Large Games with Sequential Strategies.” In AAAI Conference on Artificial Intelligence (AAAI).
Many search and security games played on a graph can be modeled as normal-form zero-sum games with strategies consisting of sequences of actions. The size of the strategy space presents a computational challenge when solving these games. This complexity is tackled either by using the compact representation of sequential strategies and linear programming, or by the incremental strategy generation of iterative double-oracle methods. In this paper, we present a novel hybrid of these two approaches: the compact-strategy double-oracle (CS-DO) algorithm, which combines the advantages of the compact representation with incremental strategy generation. We experimentally compare CS-DO with the standard approaches and analyze the impact of the size of the support on the performance of the algorithms. Results show that CS-DO dramatically improves the convergence rate in games with non-trivial support.
2015_1_teamcore_bosansky_1567.pdf
Yue Yin, Haifeng Xu, Jiarui Gan, Bo An, and Albert X. Jiang. 2015. “Computing Optimal Mixed Strategies for Security Games with Dynamic Payoffs.” In International Joint Conference on Artificial Intelligence (IJCAI).
Security agencies in the real world often need to protect targets with time-dependent values, e.g., tourist sites where the number of travelers changes over time. Since the values of different targets often change asynchronously, the defender can relocate security resources among targets dynamically to make the best use of limited resources. We propose a game-theoretic scheme to develop dynamic, randomized security strategies in consideration of the adversary’s surveillance capability. This differs from previous studies on security games by considering varying target values and the continuous strategy spaces of the security agency and the adversary. The main challenge lies in the computational intensiveness due to the continuous, and hence infinite, strategy spaces. We propose an optimal algorithm and an arbitrarily near-optimal algorithm to compute security strategies under different conditions. Experimental results show that both algorithms significantly outperform existing approaches.
2015_41_teamcore_ijcai15_dynamic.pdf
Debarun Kar, Fei Fang, Francesco Delle Fave, Nicole Sintov, and Milind Tambe. 2015. “Conducting Longitudinal Experiments with Behavioral Models in Repeated Stackelberg Security Games on Amazon Mechanical Turk.” In Human-Agent Interaction Design and Models (HAIDM) Workshop at the International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2015).
Recently, there has been an increase in interest in domains involving repeated interactions between defenders and adversaries. This has been modeled as a repeated Stackelberg Security Game (repeated SSG). Although different behavioral models have been proposed for the attackers in these games, human subjects experiments for testing these behavioral models in repeated SSGs have not been conducted previously. This paper presents the first “longitudinal study” – at least in the context of SSGs – of testing human behavior models in repeated SSG settings. We provide the following contributions in this paper. First, in order to test the behavioral models, we design a game that simulates the repeated interactions between the defender and the adversary and deploy it on Amazon Mechanical Turk (AMT). Human subjects are asked to participate in this repeated task in rounds of the game, with a break between consecutive rounds. Second, we develop several approaches to keep the human subjects motivated throughout the course of this longitudinal study so that they participate in all measurement occasions, thereby minimizing attrition. We provide results showing improvements in retention rate due to the implementation of these approaches. Third, since conducting these experiments is extremely time-consuming and we can only conduct a limited number of them, we propose a way of choosing representative payoffs that fit real-world scenarios.
2015_24_teamcore_aamas15_haidm_crc_debarunkar.pdf
Fei Fang, Peter Stone, and Milind Tambe. 2015. “Defender Strategies in Domains Involving Frequent Adversary Interaction (Extended Abstract).” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2015).
Recently, there has been an increase in interest in applying game theoretic approaches to domains involving frequent adversary interactions, such as wildlife and fishery protection. In these domains, the law enforcement agency faces adversaries who repeatedly and frequently carry out illegal activities, and thus, do not have time for extensive surveillance before taking actions. This makes them significantly different from counter-terrorism domains where game-theoretic approaches have been widely deployed. This paper presents a game-theoretic approach to be used by the defender in these Frequent Adversary Interaction (FAI) domains. We provide (i) a novel game model for FAI domains, describing the interaction between the defender and the attackers in a repeated game and (ii) algorithms that plan for the defender strategies to achieve high average expected utility over all rounds.
2015_14_teamcore_aamas2015_lead_ea_latest.pdf
D. J. Gerber, E. Pantazis, and L. S. Marcolino. 2015. “Design Agency: Prototyping Multi-Agent Systems in Architecture.” In CAAD Futures Conference.
This paper presents research on the prototyping of multi-agent systems for architectural design. It proposes a design exploration methodology at the intersection of architecture, engineering, and computer science. The motivation of the work includes exploring bottom-up generative methods coupled with the optimization of performance criteria, including geometric complexity and objective functions for environmental, structural, and fabrication parameters. The paper presents the development of a research framework and initial experiments to provide design solutions that simultaneously satisfy complexly coupled and often contradictory objectives. The prototypical experiments and initial algorithms are described through a set of different design cases and agents within this framework: for the generation of façade panels for light control; for the emergent design of shell structures; for the actual construction of reciprocal frames; and for robotic fabrication. Initial results include multi-agent-derived efficiencies for environmental and fabrication criteria and a discussion of future steps for the inclusion of human and structural factors.
2015_29_teamcore_caad_futures_195_pp03_print.pdf
L. S. Marcolino, V. Nagarajan, and M. Tambe. 2015. “Every Team Deserves a Second Chance: An Interactive 9x9 Go Experience (Demonstration).” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2015).
We show that without using any domain knowledge, we can predict the final performance of a team of voting agents, at any step towards solving a complex problem. This demo allows users to interact with our system, and observe its predictions, while playing 9x9 Go.
2015_16_teamcore_aamas15demo.pdf
V. Nagarajan, L. S. Marcolino, and M. Tambe. 2015. “Every team deserves a second chance: Identifying when things go wrong.” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2015).
Voting among different agents is a powerful tool in problem solving, and it has been widely applied to improve the performance in finding the correct answer to complex problems. We present a novel benefit of voting that has not been observed before: we can use the voting patterns to assess the performance of a team and predict its final outcome. This prediction can be executed at any moment during problem solving, and it is completely domain independent. We present a theoretical explanation of why our prediction method works. Further, contrary to what would be expected based on a simpler explanation using classical voting models, we argue that we can make accurate predictions irrespective of the strength (i.e., performance) of the teams, and that, in fact, the prediction can work better for diverse teams composed of different agents than for uniform teams made of copies of the best agent. We perform experiments in the Computer Go domain, where we obtain a high accuracy in predicting the final outcome of the games. We analyze the prediction accuracy for three different teams with different levels of diversity and strength, and we show that the prediction works significantly better for a diverse team. Since our approach is domain independent, it can be easily applied to a variety of domains.
2015_8_teamcore_aamas2015.pdf
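The "voting patterns" signal described in the abstract can be caricatured in a few lines. The sketch below is my own simplification, not the authors' estimator (the function name and example votes are hypothetical): measure, at each decision step, the fraction of agents that voted for the plurality winner, and average over steps. Persistently low agreement is the kind of pattern that would flag a team headed for failure.

```python
from collections import Counter

def agreement_rate(votes_per_step):
    """Average fraction of voters backing the plurality winner,
    over all decision steps. High agreement suggests the team is
    confident; persistently low agreement flags likely failure.
    """
    rates = []
    for votes in votes_per_step:
        winner_count = Counter(votes).most_common(1)[0][1]
        rates.append(winner_count / len(votes))
    return sum(rates) / len(rates)

# Four agents voting over three consecutive moves.
confident = [["a", "a", "a", "b"], ["c", "c", "c", "c"], ["d", "d", "a", "d"]]
divided   = [["a", "b", "c", "d"], ["a", "a", "b", "c"], ["b", "c", "c", "d"]]
print(agreement_rate(confident))  # ~0.83
print(agreement_rate(divided))    # ~0.42
```

Note that this statistic needs no knowledge of the domain or of the ground-truth quality of any vote, which is what makes the paper's approach domain independent.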
V. Nagarajan, L. S. Marcolino, and M. Tambe. 2015. “Every team deserves a second chance: Identifying when things go wrong (Student Abstract Version).” In Conference on Artificial Intelligence (AAAI 2015). Texas, USA.
We show that without using any domain knowledge, we can predict the final performance of a team of voting agents, at any step towards solving a complex problem.
2015_9_teamcore_aaai_student_abstract2015.pdf
V. Nagarajan, L. S. Marcolino, and M. Tambe. 2015. “Every Team Makes Mistakes: An Initial Report on Predicting Failure in Teamwork.” In AAAI Workshop on Learning for General Competency in Video Games (AAAI 2015). Texas, USA.
Voting among different agents is a powerful tool in problem solving, and it has been widely applied to improve performance in machine learning. However, the potential of voting has been explored only for improving the ability to find the correct answer to a complex problem. In this paper we present a novel benefit of voting that has not been observed before: we show that we can use the voting patterns to assess the performance of a team and predict its final outcome. This prediction can be executed at any moment during problem solving and it is completely domain independent. We present a preliminary theoretical explanation of why our prediction method works, where we show that the accuracy is better for diverse teams composed of different agents than for uniform teams made of copies of the same agent. We also perform experiments in the Computer Go domain, where we show that we can obtain a high accuracy in predicting the final outcome of the games. We analyze the prediction accuracy for three different teams, and we show that the prediction works significantly better for a diverse team. Since our approach is completely domain independent, it can be easily applied to a variety of domains, such as the video games in the Arcade Learning Environment.
2015_10_teamcore_aaai_workshop2015.pdf
L. S. Marcolino, V. Nagarajan, and M. Tambe. 2015. “Every Team Makes Mistakes, in Large Action Spaces.” In Multidisciplinary Workshop on Advances in Preference Handling (M-PREF 2015).
Voting is applied to better estimate an optimal answer to complex problems in many domains. We recently presented a novel benefit of voting that has not been observed before: we can use the voting patterns to assess the performance of a team and predict whether it will be successful in problem solving. Our prediction technique is completely domain independent, and it can be executed at any time during problem solving. In this paper we present a novel result about our technique: we show that the prediction quality increases with the size of the action space. We present a theoretical explanation for this phenomenon, and experiments in Computer Go with a variety of board sizes.
2015_33_teamcore_mpref15.pdf
H. Xu, Z. Rabinovich, S. Dughmi, and M. Tambe. 2015. “Exploring Information Asymmetry in Two-Stage Security Games.” In AAAI Conference on Artificial Intelligence (AAAI).
Stackelberg security games have been widely deployed to protect real-world assets. The main solution concept there is the Strong Stackelberg Equilibrium (SSE), which optimizes the defender’s random allocation of limited security resources. However, solely deploying the SSE mixed strategy has limitations. In the extreme case, there are security games in which the defender is able to defend all the assets “almost perfectly” at the SSE, but she still sustains significant loss. In this paper, we propose an approach for improving the defender’s utility in such scenarios. Perhaps surprisingly, our approach is to strategically reveal to the attacker information about the sampled pure strategy. Specifically, we propose a two-stage security game model, where in the first stage the defender allocates resources and the attacker selects a target to attack, and in the second stage the defender strategically reveals local information about that target, potentially deterring the attacker’s attack plan. We then study how the defender can play optimally in both stages. We show, theoretically and experimentally, that the two-stage security game model allows the defender to achieve strictly better utility than SSE.
2015_3_teamcore_two_stage_game.pdf
Shahrzad Gholami, Chao Zhang, Arunesh Sinha, and Milind Tambe. 2015. “An extensive study of Dynamic Bayesian Network for patrol allocation against adaptive opportunistic criminals.” In IJCAI'15 Workshop on Behavioral, Economic and Computational Intelligence for Security (BECIS).
Police patrols are used ubiquitously to deter crimes in urban areas. A distinctive feature of urban crime is that criminals react opportunistically to patrol officers’ assignments. Different models of adversary behavior have been proposed, but their exact form remains uncertain. Recent work [Zhang et al., 2015] has explored learning the model from real-world criminal activity data. To that end, criminal behavior and the interaction with patrol officers are represented as parameters of a Dynamic Bayesian Network (DBN), enabling the application of standard algorithms such as EM to learn the parameters. More specifically, the EMC2 algorithm applies a sequence of modifications to the DBN representation that allows for a compact representation, resulting in better learning accuracy and increased learning speed. In this paper, we perform additional experiments showing the efficacy of the EMC2 algorithm. Furthermore, we explore different variations of Markov models. Unlike DBNs, Markov models do not have hidden states, which represent the distribution of criminals, and are therefore easier to learn using standard MLE techniques. We compare all the approaches by learning from a real data set of criminal activity obtained from the police department of the University of Southern California (USC), situated in Los Angeles, USA. We demonstrate significantly better accuracy in predicting crime using the EMC2 algorithm compared to other approaches. This work was done in collaboration with the police department of USC.
2015_34_teamcore_keep_pace_with_criminal_workshop.pdf
Amulya Yadav, Thanh Nguyen, Francesco Delle Fave, Milind Tambe, Noa Agmon, Manish Jain, Widodo Ramono, and Timbul Batubara. 2015. “Handling Payoff Uncertainty in Green Security Domains with Adversary Bounded Rationality.” In IJCAI-15 Workshop on Behavioral, Economic and Computational Intelligence for Security (BECIS-15).
Research on Stackelberg Security Games (SSG) has recently shifted to green security domains, for example, protecting wildlife from illegal poaching. Previous research on this topic has advocated the use of behavioral (bounded rationality) models of adversaries in SSG. As its first contribution, this paper, for the first time, provides validation of these behavioral models based on real-world data from a wildlife park. The paper’s next contribution is the first algorithm to handle payoff uncertainty – an important concern in green security domains – in the presence of such adversarial behavioral models.
2015_39_teamcore_yadav_becis2015.pdf
Amulya Yadav, Thanh Nguyen, Francesco Delle Fave, Milind Tambe, Noa Agmon, Manish Jain, Widodo Ramono, and Timbul Batubara. 2015. “Handling Payoff Uncertainty with Adversary Bounded Rationality in Green Security Domains.” In IJCAI-15 Workshop on Algorithmic Game Theory (AGT-15).
Research on Stackelberg Security Games (SSG) has recently shifted to green security domains, for example, protecting wildlife from illegal poaching. Previous research on this topic has advocated the use of behavioral (bounded rationality) models of adversaries in SSG. As its first contribution, this paper, for the first time, provides validation of these behavioral models based on real-world data from a wildlife park. The paper’s next contribution is the first algorithm to handle payoff uncertainty – an important concern in green security domains – in the presence of such adversarial behavioral models.
2015_38_teamcore_yadav_agt2015.pdf
Yasaman Dehghani Abbasi, Martin Short, Arunesh Sinha, Nicole Sintov, Chao Zhang, and Milind Tambe. 2015. “Human Adversaries in Opportunistic Crime Security Games: Evaluating Competing Bounded Rationality Models.” In Conference on Advances in Cognitive Systems.
There are a growing number of automated decision aids based on game-theoretic algorithms in daily use by security agencies to assist in allocating or scheduling their limited security resources. These applications of game theory, based on the “security games” paradigm, are leading to fundamental research challenges: one major challenge is modeling human bounded rationality. More specifically, the security agency, assisted by an automated decision aid, is assumed to act with perfect rationality against a human adversary; it is important to investigate the bounded rationality of these human adversaries to improve the effectiveness of security resource allocation. This paper for the first time provides an empirical investigation of adversary bounded rationality in opportunistic crime settings, where modeling bounded rationality is particularly crucial. We conduct extensive human subject experiments, comparing ten different bounded rationality models, and illustrate that: (a) while previous research proposed the stochastic choice “quantal response” model of the human adversary, this model is significantly outperformed by more advanced “subjective utility quantal response” models; (b) combinations of the well-known prospect theory model with these advanced models lead to even better performance in modeling human adversary behavior; (c) while it is important to model the non-linear human weighting of probability, as advocated by prospect theory, our findings are the exact opposite of prospect theory in terms of how humans are seen to weight this non-linear probability.
2015_20_teamcore_yasi_cogsy_re_submitted_version.pdf
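For readers unfamiliar with the models compared in the abstract above, the baseline quantal response model assigns attack probabilities proportional to the exponential of each target's utility, scaled by a rationality parameter lambda; the "subjective utility" variants replace that utility with a weighted sum of target features. A minimal sketch of plain quantal response follows (the function and variable names are my own, not from the paper):

```python
import math

def quantal_response(utilities, lam=1.0):
    """Quantal response: probability of attacking target i is
    proportional to exp(lam * utility_i). lam = 0 yields uniform
    random play; lam -> infinity recovers perfect rationality
    (all mass on the best target)."""
    weights = [math.exp(lam * u) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical adversary utilities for three targets.
utils = [2.0, 1.0, 0.5]
print(quantal_response(utils, lam=0.0))  # uniform: each probability 1/3
print(quantal_response(utils, lam=5.0))  # mass concentrates on target 0
```

The lambda parameter is what a behavioral study like this one fits to human subject data; the paper's comparison is over richer variants of this family, including prospect-theoretic probability weighting.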
