Publications

2011
Manish Jain, Milind Tambe, and Christopher Kiekintveld. 2011. “Quality-bounded Solutions for Finite Bayesian Stackelberg Games: Scaling up .” In International Conference on Autonomous Agents and Multiagent Systems.Abstract
The fastest known algorithm for solving General Bayesian Stackelberg games with a finite set of follower (adversary) types have seen direct practical use at the LAX airport for over 3 years; and currently, an (albeit non-Bayesian) algorithm for solving these games is also being used for scheduling air marshals on limited sectors of international flights by the US Federal Air Marshals Service. These algorithms find optimal randomized security schedules to allocate limited security resources to protect targets. As we scale up to larger domains, including the full set of flights covered by the Federal Air Marshals, it is critical to develop newer algorithms that scale-up significantly beyond the limits of the current state-of-theart of Bayesian Stackelberg solvers. In this paper, we present a novel technique based on a hierarchical decomposition and branch and bound search over the follower type space, which may be applied to different Stackelberg game solvers. We have applied this technique to different solvers, resulting in: (i) A new exact algorithm called HBGS that is orders of magnitude faster than the best known previous Bayesian solver for general Stackelberg games; (ii) A new exact algorithm called HBSA which extends the fastest known previous security game solver towards the Bayesian case; and (iii) Approximation versions of HBGS and HBSA that show significant improvements over these newer algorithms with only 1- 2% sacrifice in the practical solution quality.
2011_4_teamcore_jain2011a.pdf
Bo An, Milind Tambe, Fernando Ordonez, Eric Shieh, and Christopher Kiekintveld. 2011. “Refinement of Strong Stackelberg Equilibria in Security Games .” In Conference on Artificial Intelligence (AAAI).Abstract
Given the real-world deployments of attacker-defender Stackelberg security games, robustness to deviations from expected attacker behaviors has now emerged as a critically important issue. This paper provides four key contributions in this context. First, it identifies a fundamentally problematic aspect of current algorithms for security games. It shows that there are many situations where these algorithms face multiple equilibria, and they arbitrarily select one that may hand the defender a significant disadvantage, particularly if the attacker deviates from its equilibrium strategies due to unknown constraints. Second, for important subclasses of security games, it identifies situations where we will face such multiple equilibria. Third, to address these problematic situations, it presents two equilibrium refinement algorithms that can optimize the defender’s utility if the attacker deviates from equilibrium strategies. Finally, it experimentally illustrates that the refinement approach achieved significant robustness in consideration of attackers’ deviation due to unknown constraints.
2011_25_teamcore_aaai11_yin.pdf
Meritxell Vinyals, Eric Shieh, Jesus Cerquides, Juan Antonio Rodriguez-Aguilar, Zhengyu Yin, Milind Tambe, and Emma Bowring. 2011. “Reward-based region optimal quality guarantees.” In Workshop on Optimisation in Multiagent Systems (OPTMAS) at AAMAS 2011 .Abstract
Distributed constraint optimization (DCOP) is a promising approach to coordination, scheduling and task allocation in multi agent networks. DCOP is NP- hard [6], so an important line of work focuses on developing fast incomplete solution algorithms that can provide guarantees on the quality of their local optimal solutions. Region optimality [11] is a promising approach along this line: it provides quality guarantees for region optimal solutions, namely solutions that are optimal in a specific region of the DCOP. Region optimality generalises k- and t-optimality [7, 4] by allowing to explore the space of criteria that define regions to look for solutions with better quality guarantees. Unfortunately, previous work in region-optimal quality guarantees fail to exploit any a-priori knowledge of the reward structure of the problem. This paper addresses this shortcoming by defining reward-dependent region optimal quality guarantees that exploit two different levels of knowledge about rewards, namely: (i) a ratio between the least minimum reward to the maximum reward among relations; and (ii) the minimum and maximum rewards per relation.
2011_17_teamcore_optmas2011_coptimality.pdf
Zhengyu Yin, Manish Jain, Milind Tambe, and Fernando Ordonez. 2011. “Risk-Averse Strategies for Security Games with Execution and Observational Uncertainty .” In Conference on Artificial Intelligence (AAAI).Abstract
Attacker-defender Stackelberg games have become a popular game-theoretic approach for security with deployments for LAX Police, the FAMS and the TSA. Unfortunately, most of the existing solution approaches do not model two key uncertainties of the real-world: there may be noise in the defender’s execution of the suggested mixed strategy and/or the observations made by an attacker can be noisy. In this paper, we provide a framework to model these uncertainties, and demonstrate that previous strategies perform poorly in such uncertain settings. We also provide RECON, a novel algorithm that computes strategies for the defender that are robust to such uncertainties, and provide heuristics that further improve RECON’s efficiency
2011_26_teamcore_aaai11_yin.pdf
Jun-young Kwak, Rong Yang, Zhengyu Yin, Matthew E. Taylor, and Milind Tambe. 2011. “Robust Execution-time Coordination in DEC-POMDPs Under Model Uncertainty.” In Workshop on Multiagent Sequential Decision Making in Uncertain Domains(MSDM) at AAMAS 2011 .Abstract
Despite their worst-case NEXP-complete planning complexity, DEC-POMDPs remain a popular framework for multiagent teamwork. This paper introduces effective teamwork under model uncertainty (i.e., potentially inaccurate transition and observation functions) as a novel challenge for DEC-POMDPs and presents MODERN, the first execution-centric framework for DEC-POMDPs explicitly motivated by addressing such model uncertainty. MODERN’s shift of coordination reasoning from planning-time to execution-time avoids the high cost of computing optimal plans whose promised quality may not be realized in practice. There are three key ideas in MODERN: (i) it maintains an exponentially smaller model of other agents’ beliefs and actions than in previous work and then further reduces the computationtime and space expense of this model via bounded pruning; (ii) it reduces execution-time computation by exploiting BDI theories of teamwork, and limits communication to key trigger points; and (iii) it limits its decision-theoretic reasoning about communication to trigger points and uses a systematic markup to encourage extra communication at these points – thus reducing uncertainty among team members at trigger points. We empirically show that MODERN is substantially faster than existing DEC-POMDP executioncentric methods while achieving significantly higher reward.
2011_18_teamcore_modern_aamas11_msdm.pdf
Milind Tambe. 2011. “Security and Game Theory: Algorithms, Deployed Systems, Lessons Learned .” Cambridge University Press.Abstract
Global threats of terrorism, drug-smuggling and other crimes have led to a significant increase in research on game theory for security. Game theory provides a sound mathematical approach to deploy limited security resources to maximize their effectiveness. A typical approach is to randomize security schedules to avoid predictability, with the randomization using artificial intelligence techniques to take into account the importance of different targets and potential adversary reactions. This book distills the forefront of this research to provide the first and only study of long-term deployed applications of game theory for security for key organizations such as the Los Angeles International Airport police and the US Federal Air Marshals Service. The author and his research group draw from their extensive experience working with security officials to intelligently allocate limited security resources to protect targets, outlining the applications of these algorithms in research and the real world.
2011_34_teamcore_tambebook.pdf
Zhengyu Yin, Kanna Rajan, and Milind Tambe. 2011. “Solving Continuous-Time Transition-Independent DEC-MDP with Temporal Constraints .” In AAMAS Workshop on Multiagent Sequential Decision Making in Uncertain Domains (MSDM).Abstract
Despite the impact of DEC-MDPs over the past decade, scaling to large problem domains has been difficult to achieve. The scale-up problem is exacerbated in DEC-MDPs with continuous states, which are critical in domains involving time; the latest algorithm (M-DPFP) does not scale-up beyond two agents and a handful of unordered tasks per agent. This paper is focused on meeting this challenge in continuous resource DEC-MDPs with two predominant contributions. First, it introduces a novel continuous time model for multi-agent planning problems that exploits transition independence in domains with graphical agent dependencies and temporal constraints. More importantly, it presents a new, iterative, locally optimal algorithm called SPAC that is a combination of the following key ideas: (1) defining a novel augmented CT-MDP such that solving this singleagent continuous time MDP provably provides an automatic best response to neighboring agents’ policies; (2) fast convolution to efficiently generate such augmented MDPs; (3) new enhanced lazy approximation algorithm to solve these augmented MDPs; (4) intelligent seeding of initial policies in the iterative process; (5) exploiting graph structure of reward dependencies to exploit local interactions for scalability. Our experiments show SPAC not only finds solutions substantially faster than M-DPFP with comparable quality, but also scales well to large teams of agents.
2011_21_teamcore_msdm11_mctmdp.pdf
Dmytro Korzhyk, Zhengyu Yin, Christopher Kiekintveld, Vincent Conitzer, and Milind Tambe. 2011. “Stackelberg vs. Nash in Security Games: An Extended Investigation of Interchangeability, Equivalence, and Uniqueness .” Journal of AI Research (JAIR), 41, Pp. 297-327.Abstract
There has been significant recent interest in game theoretic approaches to security, with much of the recent research focused on utilizing the leader-follower Stackelberg game model; for example, these games are at the heart of major applications such as the ARMOR program deployed for security at the LAX airport since 2007 and the IRIS program in use by the US Federal Air Marshals (FAMS). The foundational assumption for using Stackelberg games is that security forces (leaders), acting first, commit to a randomized strategy; while their adversaries (followers) choose their best response after surveillance of this randomized strategy. Yet, in many situations, the followers may act without observation of the leader’s strategy, essentially converting the game into a simultaneous-move game model. Previous work fails to address how a leader should compute her strategy given this fundamental uncertainty about the type of game faced. Focusing on the complex games that are directly inspired by real-world security applications, the paper provides four contributions in the context of a general class of security games. First, exploiting the structure of these security games, the paper shows that the Nash equilibria in security games are interchangeable, thus alleviating the equilibrium selection problem. Second, resolving the leader’s dilemma, it shows that under a natural restriction on security games, any Stackelberg strategy is also a Nash equilibrium strategy; and furthermore, the solution is unique in a class of security games of which ARMOR is a key exemplar. Third, when faced with a follower that can attack multiple targets, many of these properties no longer hold. Fourth, we show experimentally that in most (but not all) games where the restriction does not hold, the Stackelberg strategy is still a Nash equilibrium strategy, but this is no longer true when the attacker can attack multiple targets. These contributions have major implications for the real-world applications. As a possible direction for future research on cases where the Stackelberg strategy is not a Nash equilibrium strategy, we propose an extensive-form game model that makes the defender’s uncertainty about the attacker’s ability to observe explicit.
2011_27_teamcore_jair3269_final_version.pdf
Jun-young Kwak, Rong Yang, Zhengyu Yin, Matthew E. Taylor, and Milind Tambe. 2011. “Teamwork in Distributed POMDPs: Execution-time Coordination Under Model Uncertainty .” In International Conference on Autonomous Agents and Multiagent Systems (Extended Abstract) .Abstract
Despite their NEXP-complete policy generation complexity [1], Distributed Partially Observable Markov Decision Problems (DEC-POMDPs) have become a popular paradigm for multiagent teamwork [2, 6, 8]. DEC-POMDPs are able to quantitatively express observational and action uncertainty, and yet optimally plan communications and domain actions. This paper focuses on teamwork under model uncertainty (i.e., potentially inaccurate transition and observation functions) in DEC-POMDPs. In many domains, we only have an approximate model of agent observation or transition functions. To address this challenge we rely on execution-centric frameworks [7, 11, 12], which simplify planning in DEC-POMDPs (e.g., by assuming costfree communication at plan-time), and shift coordination reasoning to execution time. Specifically, during planning, these frameworks have a standard single-agent POMDP planner [4] to plan a policy for the team of agents by assuming zero-cost communication. Then, at execution-time, agents model other agents’ beliefs and actions, reason about when to communicate with teammates, reason about what action to take if not communicating, etc. Unfortunately, past work in execution-centric approaches [7, 11, 12] also assumes a correct world model, and the presence of model uncertainty exposes key weaknesses that result in erroneous plans and additional inefficiency due to reasoning over incorrect world models at every decision epoch. This paper provides two sets of contributions. The first is a new execution-centric framework for DEC-POMDPs called MODERN (MOdel uncertainty in Dec-pomdp Execution-time ReasoNing). MODERN is the first execution-centric framework for DECPOMDPs explicitly motivated by model uncertainty. It is based on three key ideas: (i) it maintains an exponentially smaller model of other agents’ beliefs and actions than in previous work and then further reduces the computation-time and space expense of this model via bounded pruning; (ii) it reduces execution-time computation by exploiting BDI theories of teamwork, thus limiting communication to key trigger points; and (iii) it simplifies its decision-theoretic reasoning about communication over the pruned model and uses a systematic markup, encouraging extra communication and reducing uncertainty among team members at trigger points. This paper’s second set of contributions are in opening up model uncertainty as a new research direction for DEC-POMDPs and emphasizing the similarity of this problem to the Belief-DesireIntention (BDI) model for teamwork [5, 9]. In particular, BDI teamwork models also assume inaccurate mapping between realworld problems and domain models. As a result, they emphasize robustness via execution-time reasoning about coordination [9]. Given some of the successes of prior BDI research in teamwork, we leverage insights from BDI in designing MODERN.
2011_6_teamcore_modern_aaamas11_short.pdf
James Pita, Rong Yang, Milind Tambe, and Richard John. 2011. “Toward Addressing Human Behavior with Observational Uncertainty in Security Games .” In AAAI'11 Workshop on Applied Adversarial Reasoning and Risk Modeling (AARM) .Abstract
Stackelberg games have recently gained significant attention for resource allocation decisions in security settings. One critical assumption of traditional Stackelberg models is that all players are perfectly rational and that the followers perfectly observe the leader’s strategy. However, in real-world security settings, security agencies must deal with human adversaries who may not always follow the utility maximizing rational strategy. Accounting for these likely deviations is important since they may adversely affect the leader’s (security agency’s) utility. In fact, a number of behavioral gametheoretic models have begun to emerge for these domains. Two such models in particular are COBRA (Combined Observability and Bounded Rationality Assumption) and BRQR (Best Response to Quantal Response), which have both been shown to outperform game-theoretic optimal models against human adversaries within a security setting based on Los Angeles International Airport (LAX). Under perfect observation conditions, BRQR has been shown to be the leading contender for addressing human adversaries. In this work we explore these models under limited observation conditions. Due to human anchoring biases, BRQR’s performance may suffer under limited observation conditions. An anchoring bias is when, given no information about the occurrence of a discrete set of events, humans will tend to assign an equal weight to the occurrence of each event (a uniform distribution). This study makes three main contributions: (i) we incorporate an anchoring bias into BRQR to improve performance under limited observation; (ii) we explore finding appropriate parameter settings for BRQR under limited observation; (iii) we compare BRQR’s performance versus COBRA under limited observation conditions.
2011_31_teamcore_aarm_pita.pdf
Jun-young Kwak, Milind Tambe, Paul Scerri, Amos Freedy, and Onur Sert. 2011. “Towards a Robust MultiAgent Autonomous Reasoning System (MAARS): An Initial Simulation Study for Satellite Defense .” In AIAA Infotech at Aerospace.Abstract
Multi-agent autonomous reasoning systems have emerged as a promising planning technique for addressing satellite defense problems. The main challenge is to extend and scale up the capabilities of current and emerging reasoning and planning methods to handle the characteristics of the satellite defense problem. This paper focuses on some key critical research issues that need to be addressed in order to perform automated planning and execution fitted to the specific nature of response to ASAT attacks, and provides MAARS, a new autonomous reasoning framework for satellite defense. As the core of MAARS, we present MODERN, a new execution-centric method for DEC-POMDPs explicitly motivated by model uncertainty. There are two key innovative features in MODERN: (i) it maintains an exponentially smaller model of other agents’ beliefs and actions than in previous work and then further reduces the computation-time and space expense of this model via bounded pruning; and (ii) it reduces execution-time computation by exploiting BDI theories of teamwork, and limits communication reasoning to key trigger points. We demonstrate a proof of concept of MAARS in the simplified ASAT mitigation scenario. We then show initial evaluation results of MAARS in ASAT domains that are critical in advancing the state-of-the-art in providing autonomous reasoning to delve into unperceived models as well as deal with exponential explosion of the computational complexity of current algorithms.
2011_19_teamcore_maars_infotech_aerospace11.pdf
Jun-young Kwak, Rong Yang, Zhengyu Yin, Matthew E. Taylor, and Milind Tambe. 2011. “Towards Addressing Model Uncertainty: Robust Execution-time Coordination for Teamwork (Short Paper) .” In International Conference on Intelligent Agent Technology (short paper).Abstract
Despite their worst-case NEXP-complete planning complexity, DEC-POMDPs remain a popular framework for multiagent teamwork. This paper introduces effective teamwork under model uncertainty (i.e., potentially inaccurate transition and observation functions) as a novel challenge for DEC-POMDPs and presents MODERN, the first executioncentric framework for DEC-POMDPs explicitly motivated by addressing such model uncertainty. MODERN’s shift of coordination reasoning from planning-time to execution-time avoids the high cost of computing optimal plans whose promised quality may not be realized in practice. There are three key ideas in MODERN: (i) it maintains an exponentially smaller model of other agents’ beliefs and actions than in previous work and then further reduces the computationtime and space expense of this model via bounded pruning; (ii) it reduces execution-time computation by exploiting BDI theories of teamwork, and limits communication to key trigger points; and (iii) it limits its decision-theoretic reasoning about communication to trigger points and uses a systematic markup to encourage extra communication at these points — thus reducing uncertainty among team members at trigger points. We empirically show that MODERN is substantially faster than existing DEC-POMDP execution-centric methods while achieving significantly higher reward.
2011_32_teamcore_iat11_modern_short.pdf
Bostjan Kaluza, Gal Kaminka, and Milind Tambe. 2011. “Towards Detection of Suspicious Behavior from Multiple Observations .” In PAIR 2011: AAAI Workshop on Plan, Activity, and Intent Recognition.Abstract
This paper addresses the problem of detecting suspicious behavior from a collection of individuals events, where no single event is enough to decide whether his/her behavior is suspicious, but the combination of multiple events enables reasoning. We establish a Bayesian framework for evaluating multiple events and show that the current approaches lack modeling behavior history included in the estimation whether a trace of events is generated by a suspicious agent. We propose a heuristic for evaluating events according to the behavior of the agent in the past. The proposed approach, tested on an airport domain, outperforms the current approaches.
2011_29_teamcore_bostjan_paper.pdf
Jun-young Kwak, Pradeep Varakantham, Milind Tambe, Laura Klein, Farrokh Jazizadeh, Geoffrey Kavulya, Burcin B. Gerber, and David J. Gerber. 2011. “Towards Optimal Planning for Distributed Coordination Under Uncertainty in Energy Domains .” In Workshop on Agent Technologies for Energy Systems (ATES) at AAMAS 2011.Abstract
Recent years have seen a rise of interest in the deployment of multiagent systems in energy domains that inherently have uncertain and dynamic environments with limited resources. In such domains, the key challenge is to minimize the energy consumption while satisfying the comfort level of occupants in the buildings under uncertainty (regarding agent negotiation actions). As human agents begin to interact with complex building systems as a collaborative team, it becomes crucial that the resulting multiagent teams reason about coordination under such uncertainty to optimize multiple metrics, which have not been systematically considered in previous literature. This paper presents a novel multiagent system based on distributed coordination reasoning under uncertainty for sustainability called SAVES. There are three key ideas in SAVES: (i) it explicitly considers uncertainty while reasoning about coordination in a distributed manner relying on MDPs; (ii) human behaviors and their occupancy preferences are incorporated into planning and modeled as part of the system; and (iii) the influence of various control strategies for multiagent teams is evaluated on an existing university building as the practical research testbed with actual energy consumption data. We empirically show the preliminary results that our intelligent control strategies substantially reduce the overall energy consumption in the actual simulation testbed compared to the existing control means while achieving comparable average satisfaction level of occupants.
2011_14_teamcore_energyaamas11_ates.pdf
Laura Klein, Geoffrey Kavulya, Farrokh Jazizadeh, Jun-young Kwak, Burcin Becerik-Gerber, Pradeep Varakantham, and Milind Tambe. 2011. “Towards Optimization Of Building Energy And Occupant Comfort Using Multi-Agent Simulation.” In International Symposium on Automation and Robotics in Construction.Abstract
The primary consumers of building energy are heating, cooling, ventilation, and lighting systems, which maintain occupant comfort, and electronics and appliances that enable occupant functionality. The optimization of building energy is therefore a complex problem highly dependent on unique building and environmental conditions as well as on time dependent operational factors. To provide computational support for this optimization, this paper presents and implements a multi-agent comfort and energy simulation (MACES) to model alternative management and control of building systems and occupants. Human and device agents are used to explore current trends in energy consumption and management of a university test bed building. Reactive and predictive control strategies are then imposed on device agents in an attempt to reduce building energy consumption while maintaining occupant comfort. Finally, occupant agents are motivated by simulation feedback to accept more energy conscious scheduling through multi-agent negotiations. Initial results of the MACES demonstrate potential energy savings of 17% while maintaining a high level of occupant comfort. This work is intended to demonstrate a simulation tool, which is implementable in the actual test bed site and compatible with real-world input to instigate and motivate more energy conscious control and occupant behaviors.
2011_28_teamcore_isarc.pdf
Matthew E. Taylor, Manish Jain, Christopher Kiekintveld, Jun-young Kwak, Rong Yang, Zhengyu Yin, and Milind Tambe. 2011. “Two Decades of Multiagent Teamwork Research: Past, Present, and Future .” In Collaborative Agents REsearch and Development (CARE) 2010 workshop.Abstract
This paper discusses some of the recent cooperative multiagent systems work in the TEAMCORE lab at the University of Southern California. Based in part on an invited talk at the CARE 2010 workshop, we highlight how and why execution-time reasoning has been supplementing, or replacing, planning-time reasoning in such systems.
2011_3_teamcore_care_taylor.pdf
Bo An, James Pita, Eric Shieh, Milind Tambe, Christopher Kiekintveld, and Janusz Marecki. 2011. “GUARDS and PROTECT: Next Generation Applications of Security Games.” In ACM SIGecom Exchanges , 1st ed. Vol. 10.Abstract
The last five years have witnessed the successful application of game theory in reasoning about complex security problems [Basilico et al. 2009; Korzhyk et al. 2010; Dickerson et al. 2010; Jakob et al. 2010; Paruchuri et al. 2008; Pita et al. 2009; Pita et al. 2010; Kiekintveld et al. 2009; Jain et al. 2010]. Stackelberg games have been widely used to model patrolling or monitoring problems in security. In a Stackelberg security game, the defender commits to a strategy and the adversary makes its decision with knowledge of the leader’s commitment. Two systems applying Stackelberg game models to assist with randomized resource allocation decisions are currently in use by the Los Angeles International Airport (LAX) [Pita et al. 2008] and the Federal Air Marshals Service (FAMS) [Tsai et al. 2009]. Two new applications called GUARDS (Game-theoretic Unpredictable and Randomly Deployed Security) [Pita et al. 2011] and PROTECT (Port Resilience Operational / Tactical Enforcement to Combat Terrorism) are under development for the Transportation Security Administration (TSA) and the United States Coast Guard respectively. Both are based on Stackelberg games. In contrast with previous applications at LAX and FAMS, which focused on one-off tailored applications and one security activity (e.g., canine patrol, checkpoints, or covering flights) per application, both GUARDS and PROTECT face new challenging issues due to the potential large scale deployment. This includes reasoning about hundreds of heterogeneous security activities, reasoning over diverse potential threats, and developing a system designed for hundreds of end-users. In this article we will highlight several of the main issues that have arisen. We begin with an overview of the new applications and then discuss these issues in turn.
2011_12_teamcore_usc2.pdf
Bostjan Kaluza, Erik Dovgan, Tea Tusar, Milind Tambe, and Matjaz Gams. 2011. “A Probabilistic Risk Analysis for Multimodal Entry Control.” Expert systems with Applications,, 28, Pp. 6696-6704.Abstract
Entry control is an important security measure that prevents undesired persons from entering secure areas. The advanced risk analysis presented in this paper makes it possible to distinguish between acceptable and unacceptable entries, based on several entry sensors, such as fingerprint readers, and intelligent methods that learn behavior from previous entries. We have extended the intelligent layer in two ways: first, by adding a meta-learning layer that combines the output of specific intelligent modules, and second, by constructing a Bayesian network to integrate the predictions of the learning and meta-learning modules. The obtained results represent an important improvement in detecting security attacks.
2011_1_teamcore_entry_control_eswa.pdf
2010
Christopher Kiekintveld, Zhengyu Yin, Atul Kumar, and Milind Tambe. 2010. “Asynchronous Algorithms for Approximate Distributed Constraint Optimization with Quality Bounds .” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS).Abstract
Distributed Constraint Optimization (DCOP) is a popular framework for cooperative multi-agent decision making. DCOP is NPhard, so an important line of work focuses on developing fast incomplete solution algorithms for large-scale applications. One of the few incomplete algorithms to provide bounds on solution quality is k-size optimality, which defines a local optimality criterion based on the size of the group of deviating agents. Unfortunately, the lack of a general-purpose algorithm and the commitment to forming groups based solely on group size has limited the use of k-size optimality. This paper introduces t-distance optimality which departs from k-size optimality by using graph distance as an alternative criteria for selecting groups of deviating agents. This throws open a new research direction into the tradeoffs between different group selection and coordination mechanisms for incomplete DCOP algorithms. We derive theoretical quality bounds for t-distance optimality that improve known bounds for k-size optimality. In addition, we develop a new efficient asynchronous local search algorithm for finding both k-size and t-distance optimal solutions — allowing these concepts to be deployed in real applications. Indeed, empirical results show that this algorithm significantly outperforms the only existing algorithm for finding general k-size optimal solutions, which is also synchronous. Finally, we compare the algorithmic performance of k-size and t-distance optimality using this algorithm. We find that t-distance consistently converges to higher-quality solutions in the long run, but results are mixed on convergence speed; we identify cases where k-size and t-distance converge faster.
2010_4_teamcore_aamas10_dcop_approximation.pdf
Emma Bowring, Milind Tambe, and Makoto Yokoo. 2010. “Balancing Local Resources and Global Goals in Multiply-Constrained DCOP .” Journal of Multiagent and Grid Systems (MAGS), 6, 4, Pp. 353-393.Abstract
Distributed constraint optimization (DCOP) is a useful framework for cooperative multiagent coordination. DCOP focuses on optimizing a single team objective. However, in many domains, agents must satisfy constraints on resources consumed locally while optimizing the team goal. Yet, these resource constraints may need to be kept private. Designing DCOP algorithms for these domains requires managing complex trade-offs in completeness, scalability, privacy and efficiency. This article defines the multiply-constrained DCOP (MC-DCOP) framework and provides complete (globally optimal) and incomplete (locally optimal) algorithms for solving MC-DCOP problems. Complete algorithms find the best allocation of scarce resources while optimizing the team objective, while incomplete algorithms are more scalable. The algorithms use four main techniques: (i) transforming constraints to maintain privacy; (ii) dynamically setting upper bounds on resource consumption; (iii) identifying the extent to which the local graph structure allows agents to compute exact bounds; and (iv) using a virtual assignment to flag problems rendered unsatisfiable by resource constraints. Proofs of correctness are presented for all algorithms. Experimental results illustrate the strengths and weaknesses of both the complete and incomplete algorithms.
2010_12_teamcore_magsbty2.pdf

Pages