Publications

2007
Jonathan Pearce, Emma Bowring, Christopher Portway, and Milind Tambe. 2007. “K-optimal algorithms for Distributed Constraint Optimization: Extending to domains with hard constraints.” In Distributed Constraint Reasoning Workshop. Abstract:
Distributed constraint optimization (DCOP) has proven to be a promising approach to coordination, scheduling and task allocation in large-scale multiagent networks, in domains such as sensor networks, teams of unmanned air vehicles, and teams of software personal assistants. Locally optimal approaches to DCOP suggest themselves as appropriate for such large-scale multiagent networks, particularly when such networks lack high-bandwidth communication among agents. K-optimal algorithms provide an important class of these locally optimal algorithms, given analytical results proving quality guarantees. Previous work on k-optimality, including its theoretical guarantees, focused exclusively on soft constraints. This paper extends the results to DCOPs with hard constraints. It focuses in particular on DCOPs where such hard constraints are resource constraints that individual agents must not violate. We provide two key results in the context of such DCOPs. First, we provide reward-independent lower bounds on the quality of k-optima in the presence of hard (resource) constraints. Second, we present algorithms for k-optimality given hard resource constraints, and present detailed experimental results on DCOP graphs of 1000 agents with varying constraint density.
2007_17_teamcore_mc_k_optimal4.pdf
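To make the k-optimality notion concrete for readers outside the DCOP literature, here is a minimal, centralized sketch in Python. The three-agent problem, the `cost`/`capacity` resource model, and the brute-force search are all illustrative assumptions, not the paper's distributed algorithms: an assignment is k-optimal if no feasible deviation by k or fewer agents strictly improves the team reward.

```python
from itertools import combinations, product

# Toy problem (all values hypothetical). Binary constraints map a pair of
# agents to a reward table over their joint values.
constraints = {
    (0, 1): {(0, 0): 5, (0, 1): 0, (1, 0): 0, (1, 1): 3},
    (1, 2): {(0, 0): 5, (0, 1): 0, (1, 0): 0, (1, 1): 3},
}
domains = {0: (0, 1), 1: (0, 1), 2: (0, 1)}
cost = {0: 1, 1: 2}            # resource cost of taking value 0 / value 1
capacity = {0: 2, 1: 2, 2: 2}  # each agent's hard resource limit

def feasible(assign):
    """Hard resource constraints: no agent may exceed its capacity."""
    return all(cost[v] <= capacity[a] for a, v in assign.items())

def reward(assign):
    return sum(tbl[(assign[i], assign[j])] for (i, j), tbl in constraints.items())

def is_k_optimal(assign, k):
    """True if no feasible deviation by <= k agents strictly improves reward."""
    if not feasible(assign):
        return False
    base = reward(assign)
    for size in range(1, k + 1):
        for group in combinations(domains, size):
            for vals in product(*(domains[a] for a in group)):
                cand = dict(assign)
                cand.update(zip(group, vals))
                if feasible(cand) and reward(cand) > base:
                    return False
    return True

print(is_k_optimal({0: 1, 1: 1, 2: 1}, k=2))  # True: 2-optimal
print(is_k_optimal({0: 1, 1: 1, 2: 1}, k=3))  # False: (0,0,0) is better
```

With these toy rewards, (1,1,1) cannot be improved by any deviation of one or two agents, but all three agents deviating together reach the better solution (0,0,0), so it is 2-optimal but not 3-optimal.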
Pradeep Varakantham, Janusz Marecki, Yuichi Yabu, Milind Tambe, and Makoto Yokoo. 2007. “Letting loose a SPIDER on a network of POMDPs: Generating quality guaranteed policies.” In International Conference on Autonomous Agents and Multiagent Systems (AAMAS-2007). Abstract:
Distributed Partially Observable Markov Decision Problems (Distributed POMDPs) are a popular approach for modeling multi-agent systems acting in uncertain domains. Given the significant complexity of solving distributed POMDPs, particularly as the number of agents scales up, one popular approach has focused on approximate solutions. Though this approach is efficient, the algorithms within it do not provide any guarantees on solution quality. A second, less popular approach focuses on global optimality, but typical results are available only for two agents, and at considerable computational cost. This paper overcomes the limitations of both these approaches by providing SPIDER, a novel combination of three key features for policy generation in distributed POMDPs: (i) it exploits the agent interaction structure given a network of agents (allowing easier scale-up to larger numbers of agents); (ii) it uses a combination of heuristics to speed up policy search; and (iii) it allows quality-guaranteed approximations, permitting a systematic tradeoff of solution quality for time. Experimental results show orders-of-magnitude improvements in performance when compared with previous globally optimal algorithms.
2007_1_teamcore_aamas2007.pdf
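The following is a minimal, generic sketch of the quality-bounded branch-and-bound idea that SPIDER builds on, not the paper's actual algorithm or heuristics: partial joint policies are expanded depth-first, and a subtree is pruned when an (assumed admissible) upper bound shows it cannot beat the incumbent by more than epsilon, so the returned value is within epsilon of optimal. `expand`, `value`, and `upper_bound` are hypothetical callbacks.

```python
def quality_bounded_search(root, expand, value, upper_bound, epsilon=0.0):
    """Depth-first branch and bound over partial joint policies.

    expand(node)      -> list of child nodes ([] if node is a complete policy)
    value(node)       -> exact value of a complete joint policy
    upper_bound(node) -> admissible (never underestimating) heuristic bound
    Returns a policy whose value is within `epsilon` of the optimum.
    """
    best_node, best_value = None, float("-inf")
    stack = [root]
    while stack:
        node = stack.pop()
        # Prune: this subtree cannot beat the incumbent by more than epsilon,
        # so discarding it loses at most epsilon in solution quality.
        if upper_bound(node) <= best_value + epsilon:
            continue
        children = expand(node)
        if not children:  # complete joint policy: evaluate it exactly
            v = value(node)
            if v > best_value:
                best_node, best_value = node, v
        else:
            stack.extend(children)
    return best_node, best_value
```

Setting `epsilon = 0` recovers exact search; increasing it prunes more aggressively while preserving the stated quality bound, which is the systematic quality-for-time tradeoff the abstract describes.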
Jonathan P. Pearce. 2007. “Local Optimization in Cooperative Agent Networks.” PhD thesis. Abstract:
My research focuses on constructing and analyzing systems of intelligent, autonomous agents. These agents may include people, physical robots, or software programs acting as assistants, teammates, opponents, or trading partners. In a large class of multi-agent scenarios, the effect of local interactions between agents can be compactly represented as a network structure, such as a distributed constraint optimization problem (DCOP) for cooperative domains. Collaboration between large groups of agents, given such a network, can be difficult to achieve; often agents can only manage to collaborate in smaller subgroups of a certain size in order to find a workable solution in a timely manner. The goal of my thesis is to provide algorithms that enable networks of agents bounded in this way to quickly find high-quality solutions, as well as theoretical results to understand key properties of these solutions. Relevant domains for my work include personal assistant agents, sensor networks, and teams of autonomous robots. In particular, this thesis considers the case in which agents optimize a DCOP by forming groups of one or more agents until no group of k or fewer agents can possibly improve the solution; we define this type of local optimum, and any algorithm guaranteed to reach such a local optimum, as k-optimal. In this document, I present four key contributions related to k-optimality. The first set of results is worst-case guarantees on the solution quality of k-optima in a DCOP. These guarantees can help determine an appropriate k-optimal algorithm, or possibly an appropriate constraint graph structure, for agents to use in situations where the cost of coordination between agents must be weighed against the quality of the solution reached. The second set of results is upper bounds on the number of k-optima that can exist in a DCOP. Because each joint action consumes resources, knowing the maximal number of k-optimal joint actions that could exist for a given DCOP allows us to allocate sufficient resources for a given level of k or, alternatively, to choose an appropriate level of k-optimality given fixed resources. The third contribution is a set of 2-optimal and 3-optimal algorithms and an experimental analysis of the performance of 1-, 2-, and 3-optimal algorithms on several types of DCOPs. The final contribution of this thesis is a case study applying k-optimal DCOP algorithms and solutions to the formation of human teams spanning multiple organizations. Given a particular specification of a human team (such as a task force to respond to an emergency) and a pool of possible team members, a DCOP can be formulated to match this specification. A set of k-optimal solutions to the DCOP represents a set of diverse, locally optimal options from which a human commander can choose the team that will be used.
2007_13_teamcore_jonathan_thesis.pdf
Janusz Marecki and Milind Tambe. 2007. “On Opportunistic Techniques for Solving Decentralized Markov Decision Processes with Temporal Constraints.” In Sixth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS). Abstract:
Decentralized Markov Decision Processes (DEC-MDPs) are a popular model of agent-coordination problems in domains with uncertainty and time constraints, but they are very difficult to solve. In this paper, we improve a state-of-the-art heuristic solution method for DEC-MDPs, called OC-DEC-MDP, that has recently been shown to scale up to larger DEC-MDPs. Our heuristic solution method, called Value Function Propagation (VFP), combines two orthogonal improvements of OC-DEC-MDP. First, it speeds up OC-DEC-MDP by an order of magnitude by maintaining and manipulating a value function for each state (as a function of time) rather than a separate value for each pair of state and time interval. Furthermore, it achieves better solution qualities than OC-DEC-MDP because, as our analytical results show, it does not overestimate the expected total reward as OC-DEC-MDP does. We test both improvements independently in a crisis-management domain as well as in other types of domains. Our experimental results demonstrate a significant speedup of VFP over OC-DEC-MDP as well as higher solution qualities in a variety of situations.
2007_7_teamcore_marecki07vfp.pdf
Jonathan P. Pearce and Milind Tambe. 2007. “Quality Guarantees on k-Optimal Solutions for Distributed Constraint Optimization Problems.” In International Joint Conference on Artificial Intelligence (IJCAI). Abstract:
A distributed constraint optimization problem (DCOP) is a formalism that captures the rewards and costs of local interactions within a team of agents. Because complete algorithms to solve DCOPs are unsuitable for some dynamic or anytime domains, researchers have explored incomplete DCOP algorithms that result in locally optimal solutions. One way to categorize such algorithms, and the solutions they produce, is k-optimality; a k-optimal solution is one that cannot be improved by any deviation by k or fewer agents. This paper presents the first known guarantees on solution quality for k-optimal solutions. The guarantees are independent of the costs and rewards in the DCOP and, once computed, can be used for any DCOP with a given constraint graph structure.
2007_3_teamcore_ijcai07pearce.pdf
Pradeep Varakantham, Janusz Marecki, Milind Tambe, and Makoto Yokoo. 2007. “SPIDER attack on a network of POMDPs: Towards quality bounded solutions.” In AAAI Spring Symposium 2007. Abstract:
Distributed Partially Observable Markov Decision Problems (Distributed POMDPs) are a popular approach for modeling multi-agent systems acting in uncertain domains. Given the significant computational complexity of solving distributed POMDPs, one popular approach has focused on approximate solutions. Though this approach provides for efficient computation, the algorithms within it do not provide any guarantees on solution quality. A second, less popular approach has focused on globally optimal solutions, but at considerable computational cost. This paper overcomes the limitations of both these approaches by providing SPIDER (Search for Policies In Distributed EnviRonments), which provides quality-guaranteed approximations for distributed POMDPs. SPIDER allows this quality guarantee to be varied, thus allowing solution quality to be traded off systematically. SPIDER and its enhancements employ heuristic search techniques for finding a joint policy that satisfies the required bound on solution quality.
2007_6_teamcore_spider_symp.pdf
Pradeep Varakantham, Rajiv T. Maheswaran, Tapana Gupta, and Milind Tambe. 2007. “Towards efficient computation of error bounded solutions in POMDPs: Expected Value Approximation and Dynamic Disjunctive Beliefs.” In International Joint Conference on Artificial Intelligence (IJCAI). Abstract:
While POMDPs (partially observable Markov decision problems) are a popular computational model with wide-ranging applications, the computational cost of optimal policy generation is prohibitive. Researchers are investigating ever-more efficient algorithms, yet many applications demand that such algorithms bound any loss in policy quality when chasing efficiency. To address this challenge, we present two new techniques. The first approximates in the value space to obtain solutions efficiently for a pre-specified error bound. Unlike existing techniques, our technique guarantees that the resulting policy will meet this bound. Furthermore, it does not require costly computations to determine the quality loss of the policy. Our second technique prunes large tracts of belief space that are unreachable, allowing faster policy computation without any sacrifice in optimality. The combination of the two techniques, which are complementary to existing optimal policy generation algorithms, provides solutions with tight error bounds efficiently in domains where competing algorithms fail to provide such tight bounds.
2007_4_teamcore_ijcai07.pdf
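For context, the techniques above operate over the belief space of a POMDP. The sketch below shows the standard Bayesian belief update that defines that space (a textbook formula, not code from the paper); `T` and `O` are hypothetical nested-dict transition and observation models.

```python
def belief_update(belief, action, obs, T, O):
    """b'(s') proportional to O(o | s', a) * sum_s T(s' | s, a) * b(s).

    belief: dict state -> probability
    T:      dict s -> a -> s2 -> P(s2 | s, a)
    O:      dict a -> s2 -> o -> P(o | s2, a)
    """
    states = list(belief)
    new_b = {
        s2: O[action][s2][obs] * sum(T[s][action][s2] * belief[s] for s in states)
        for s2 in states
    }
    norm = sum(new_b.values())
    if norm == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s: p / norm for s, p in new_b.items()}
```

The set of beliefs reachable through repeated applications of this update is exactly the belief-space region that pruning techniques such as the paper's dynamic disjunctive beliefs aim to restrict policy computation to.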
Pradeep Varakantham. 2007. “Towards Efficient Planning in Real World Partially Observable Domains.” PhD thesis. Abstract:
My research goal is to build large-scale intelligent systems (both single- and multi-agent) that reason with uncertainty in complex, real-world environments. I foresee the integration of such systems in many critical facets of human life, ranging from intelligent assistants in hospitals and offices, to rescue agents in large-scale disaster response, to sensor agents tracking weather phenomena in earth-observing sensor webs. In my thesis, I have taken steps towards achieving this goal in the context of systems that operate in partially observable domains that also have transitional uncertainty (non-deterministic outcomes to actions). Given this uncertainty, Partially Observable Markov Decision Problems (POMDPs) and Distributed POMDPs present themselves as natural choices for modeling these domains. Unfortunately, the significant computational complexity involved in solving POMDPs (PSPACE-complete) and Distributed POMDPs (NEXP-complete) is a key obstacle. Due to this complexity, existing approaches that provide exact solutions do not scale, while approximate solutions do not provide any usable guarantees on quality. My thesis addresses these issues using the following key ideas. The first is exploiting structure in the domain: utilizing the structure present in the dynamics of the domain, or in the interactions between agents, allows improved efficiency without sacrificing solution quality. The second is direct approximation in the value space: this allows for calculated approximations at each step of the algorithm, which in turn allows us to provide usable quality guarantees that may be specified in advance. In contrast, existing approaches approximate in the belief space, leading only indirectly to an approximation in the value space and thus making it difficult to compute functional bounds on the approximations. Together, these key ideas allow for the efficient computation of optimal and quality-bounded solutions to complex, large-scale problems that were beyond the reach of existing algorithms.
news_2007_11.pdf
2006
Rachel Greenstadt, Jonathan P. Pearce, and Milind Tambe. 7/2006. “Analysis of Privacy Loss in Distributed Constraint Optimization.” In The Twenty-First National Conference on Artificial Intelligence (AAAI-06). Abstract:
Distributed Constraint Optimization (DCOP) is rapidly emerging as a prominent technique for multiagent coordination. However, despite agent privacy being a key motivation for applying DCOPs in many applications, rigorous quantitative evaluations of privacy loss in DCOP algorithms have been lacking. Recently, [Maheswaran et al. 2005] introduced a framework for quantitative evaluations of privacy in DCOP algorithms, showing that some DCOP algorithms lose more privacy than purely centralized approaches and questioning the motivation for applying DCOPs. This paper addresses the question of whether state-of-the-art DCOP algorithms suffer from a similar shortcoming by investigating several of the most efficient DCOP algorithms, including both DPOP and ADOPT. Furthermore, while previous work investigated the impact of distributed constraint reasoning design decisions (e.g., constraint-graph topology, asynchrony, message contents) on efficiency, this paper examines the privacy aspect of such decisions, providing an improved understanding of privacy-efficiency tradeoffs.
2006_13_teamcore_aaai06.pdf
Nathan Schurr, Pradeep Varakantham, Emma Bowring, Milind Tambe, and Barbara Grosz. 2006. “Asimovian Multiagents: Applying Laws of Robotics to Teams of Humans and Agents.” In Workshop on Programming Multiagent Systems. Abstract:
In the March 1942 issue of “Astounding Science Fiction”, Isaac Asimov for the first time enumerated his three laws of robotics. Decades later, researchers in agents and multiagent systems have begun to examine these laws for providing a useful set of guarantees on deployed agent systems. Motivated by unexpected failures or behavior degradations in complex mixed agent-human teams, this paper for the first time focuses on applying Asimov’s first two laws to provide behavioral guarantees in such teams. However, operationalizing these laws in the context of such mixed agent-human teams raises three novel issues. First, while the laws were originally written for the interaction of an individual robot and an individual human, our systems must clearly operate in a team context. Second, key notions in these laws (e.g., causing “harm” to humans) are specified in very abstract terms and must be made concrete in implemented systems. Third, outside of science fiction, agents and humans may not have perfect information about the world, yet they must act based on these laws despite this uncertainty. Addressing this uncertainty is a key thrust of this paper, and we illustrate that agents must detect and overcome such states of uncertainty while ensuring adherence to Asimov’s laws. We illustrate results from two different domains, each taking a different approach to operationalizing Asimov’s laws.
2006_12_teamcore_schurr_promas06.pdf
Milind Tambe, Emma Bowring, Jonathan P. Pearce, Pradeep Varakantham, Paul Scerri, and David V. Pynadath. 2006. “Electric Elves: What Went Wrong and Why.” In AAAI Spring Symposium. Abstract:
Software personal assistants continue to be a topic of significant research interest. This paper outlines some of the important lessons learned from a successfully deployed team of personal assistant agents (Electric Elves) in an office environment. These lessons have important implications for similar ongoing research projects. The Electric Elves project was a team of almost a dozen personal assistant agents that were continually active for seven months. Each elf (agent) represented one person and assisted in daily activities in an actual office environment. This project led to several important observations about privacy, adjustable autonomy, and social norms in office environments. This paper outlines some of the key lessons learned and, more importantly, outlines our continued research to address some of the concerns raised.
2006_11_teamcore_elves_lessons.pdf
Rachel Greenstadt, Jonathan P. Pearce, Emma Bowring, and Milind Tambe. 2006. “Experimental analysis of privacy loss in DCOP algorithms (short paper).” In Fifth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS). Abstract:
Distributed Constraint Optimization (DCOP) is rapidly emerging as a prominent technique for multiagent coordination. Unfortunately, rigorous quantitative evaluations of privacy loss in DCOP algorithms have been lacking, despite the fact that agent privacy is a key motivation for applying DCOPs in many applications. Recently, Maheswaran et al. [3, 4] introduced a framework for quantitative evaluations of privacy in DCOP algorithms, showing that early DCOP algorithms lose more privacy than purely centralized approaches and questioning the motivation for applying DCOPs. Do state-of-the-art DCOP algorithms suffer from a similar shortcoming? This paper answers that question by investigating the most efficient DCOP algorithms, including both DPOP and ADOPT.
2006_9_teamcore_aamas06.pdf
Yoonheui Kim, Ranjit Nair, Pradeep Varakantham, Milind Tambe, and Makoto Yokoo. 2006. “Exploiting Locality of Interaction in Networked Distributed POMDPs.” In AAAI Spring Symposium on Distributed Planning and Scheduling. Abstract:
In many real-world multiagent applications such as distributed sensor nets, a network of agents is formed based on each agent’s limited interactions with a small number of neighbors. While distributed POMDPs capture the real-world uncertainty in multiagent domains, they fail to exploit such locality of interaction. Distributed constraint optimization (DCOP) captures the locality of interaction but fails to capture planning under uncertainty. In previous work, we presented a model synthesized from distributed POMDPs and DCOPs, called Networked Distributed POMDPs (ND-POMDPs), along with LID-JESP (locally interacting distributed joint equilibrium-based search for policies), a distributed policy generation algorithm based on DBA (the distributed breakout algorithm). In this paper, we present a stochastic variation of LID-JESP, based on DSA (the distributed stochastic algorithm), that allows neighboring agents to change their policies in the same cycle. Through detailed experiments, we show how this can result in speedups without a large difference in solution quality. We also introduce a technique called hyper-link-based decomposition that allows us to exploit locality of interaction further, resulting in faster run times for both LID-JESP and its stochastic variant without any loss in solution quality.
2006_3_teamcore_main.pdf
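DSA, the building block behind the stochastic LID-JESP variant, is simple enough to sketch. Below is the standard algorithm in a centralized simulation for plain DCOPs, a simplification since LID-JESP searches over policies rather than single values; the problem encoding is hypothetical. The key property from the abstract, that neighboring agents may all change in the same cycle, corresponds to every activated agent moving against the previous cycle's assignment.

```python
import random

def local_reward(agent, val, assign, constraints):
    """Reward on constraints incident to `agent` if it takes `val`,
    with all other agents held at their current assignment."""
    total = 0
    for (i, j), tbl in constraints.items():
        if i == agent:
            total += tbl[(val, assign[j])]
        elif j == agent:
            total += tbl[(assign[i], val)]
    return total

def dsa(domains, constraints, p=0.7, cycles=50, seed=0):
    rng = random.Random(seed)
    assign = {a: rng.choice(vals) for a, vals in domains.items()}
    for _ in range(cycles):
        new_assign = dict(assign)
        for agent, vals in domains.items():
            if rng.random() < p:  # activated agents all move in the same cycle
                new_assign[agent] = max(
                    vals, key=lambda v: local_reward(agent, v, assign, constraints)
                )
        assign = new_assign
    return assign
```

The activation probability `p` trades convergence speed against the thrashing that can occur when neighbors move simultaneously, which is the tradeoff the paper's experiments measure at the level of policies.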
2006. “Implementation Techniques for solving POMDPs in Personal Assistant Domains.” In Programming Multiagent Systems (PROMAS). Springer Press. Abstract:
Agents or agent teams deployed to assist humans often face the challenges of monitoring the state of key processes in their environment (including the state of their human users themselves) and making periodic decisions based on such monitoring. POMDPs appear well suited to enable agents to address these challenges, given the uncertain environment and cost of actions, but optimal policy generation for POMDPs is computationally expensive. This paper introduces two key implementation techniques (one exact and one approximate), where the policy computation is restricted to the belief space polytope that remains reachable given the progress structure of a domain. One technique uses Lagrangian methods to compute tighter bounds on belief space support in polynomial time, while the other technique is based on approximating policy vectors in dense policy regions of the bounded belief polytope. We illustrate this by enhancing two of the fastest existing algorithms for exact POMDP policy generation. The order of magnitude speedups demonstrate the utility of our implementation techniques in facilitating the deployment of POMDPs within agents assisting human users.
2006_5_teamcore_promas_varakantham.pdf
Emma Bowring, Milind Tambe, and Makoto Yokoo. 2006. “Multiply-Constrained DCOP for Distributed Planning and Scheduling.” In AAAI Spring Symposium on Distributed Planning and Scheduling. Abstract:
Distributed constraint optimization (DCOP) has emerged as a useful technique for multiagent planning and scheduling. While previous DCOP work focuses on optimizing a single team objective, in many domains agents must satisfy additional constraints on resources consumed locally (due to interactions within their local neighborhoods). Such local resource constraints may be required to be private, or shared for efficiency’s sake. This paper provides a novel multiply-constrained DCOP algorithm for addressing these domains. This algorithm is based on mutually-intervening search, i.e., using local resource constraints to intervene in the search for the optimal solution and vice versa, realized via three key ideas: (i) transforming n-ary constraints via virtual variables to maintain privacy; (ii) dynamically setting upper bounds on joint resource consumption with neighbors; and (iii) identifying whether the local DCOP graph structure allows agents to compute exact resource bounds for additional efficiency. These ideas are implemented by modifying Adopt, one of the most efficient DCOP algorithms. Both detailed experimental results and proofs of correctness are presented.
2006_1_teamcore_ss_01.pdf
Emma Bowring, Milind Tambe, and Makoto Yokoo. 2006. “Multiply-Constrained Distributed Constraint Optimization.” In Fifth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS). Abstract:
Distributed constraint optimization (DCOP) has emerged as a useful technique for multiagent coordination. While previous DCOP work focuses on optimizing a single team objective, in many domains agents must satisfy additional constraints on resources consumed locally (due to interactions within their local neighborhoods). Such resource constraints may be required to be private, or shared for efficiency’s sake. This paper provides a novel multiply-constrained DCOP algorithm for addressing these domains, based on mutually-intervening search, i.e., using local resource constraints to intervene in the search for the optimal solution and vice versa. It is realized through three key ideas: (i) transforming n-ary constraints to maintain privacy; (ii) dynamically setting upper bounds on joint resource consumption with neighbors; and (iii) identifying whether the local DCOP graph structure allows agents to compute exact resource bounds for additional efficiency. These ideas are implemented by modifying Adopt, one of the most efficient DCOP algorithms. Both detailed experimental results and proofs of correctness are presented.
2006_7_teamcore_aamas06mca.pdf
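A toy rendering of the local resource constraint that distinguishes the multiply-constrained setting (hypothetical encoding, not Adopt's internals): in addition to the team objective, each agent must keep the resources consumed within its own neighborhood under a private budget.

```python
def locally_feasible(agent, assign, resource_cost, neighbors, budget):
    """True if the resources consumed in `agent`'s neighborhood, under the
    joint assignment `assign`, stay within the agent's private budget.

    resource_cost: dict (agent, neighbor) -> dict (val, neighbor_val) -> cost
    neighbors:     dict agent -> iterable of neighboring agents
    budget:        dict agent -> private resource limit
    """
    used = sum(resource_cost[(agent, n)][(assign[agent], assign[n])]
               for n in neighbors[agent])
    return used <= budget[agent]
```

Because this check spans an agent and all its neighbors, it is effectively an n-ary constraint, which is what motivates the paper's virtual-variable transformation for keeping it private during distributed search.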
Praveen Paruchuri, Emma Bowring, Ranjit Nair, Jonathan P. Pearce, Nathan Schurr, Milind Tambe, and Pradeep Varakantham. 2006. “Multiagent Teamwork: Hybrid Approaches.” In Computer Society of India Communications (invited). Abstract:
Today within the multiagent community, we see at least four competing methods for building multiagent systems: belief-desire-intention (BDI), distributed constraint optimization (DCOP), distributed POMDPs, and auctions or game-theoretic methods. While there is exciting progress within each approach, there is a lack of cross-cutting research. This article highlights the various hybrid techniques for multiagent teamwork developed by the TEAMCORE group. In particular, for the past decade, the TEAMCORE research group has focused on building agent teams in complex, dynamic domains. While our early work was inspired by BDI, we present an overview of recent research that uses DCOPs and distributed POMDPs in building agent teams. While DCOP and distributed POMDP algorithms provide promising results, hybrid approaches allow us to use the complementary strengths of different techniques to create algorithms that perform better than either of their component algorithms alone. For example, in the BDI-POMDP hybrid approach, BDI team plans are exploited to improve POMDP tractability, and POMDPs improve BDI team plan performance.
2006_14_teamcore_tambe.pdf
Rajiv T. Maheswaran, Jonathan P. Pearce, Emma Bowring, Pradeep Varakantham, and Milind Tambe. 2006. “Privacy Loss in Distributed Constraint Reasoning: A Quantitative Framework for Analysis and its Applications.” Journal of Autonomous Agents and Multiagent Systems (JAAMAS), 13, pp. 27-60. Abstract:
It is critical that agents deployed in real-world settings, such as businesses, offices, universities and research laboratories, protect their individual users’ privacy when interacting with other entities. Indeed, privacy is recognized as a key motivating factor in the design of several multiagent algorithms, such as in distributed constraint reasoning (including both algorithms for distributed constraint optimization (DCOP) and distributed constraint satisfaction (DisCSPs)), and researchers have begun to propose metrics for analysis of privacy loss in such multiagent algorithms. Unfortunately, a general quantitative framework to compare these existing metrics for privacy loss or to identify dimensions along which to construct new metrics is currently lacking. This paper presents three key contributions to address this shortcoming. First, the paper presents VPS (Valuations of Possible States), a general quantitative framework to express, analyze and compare existing metrics of privacy loss. Based on a state-space model, VPS is shown to capture various existing measures of privacy created for specific domains of DisCSPs. The utility of VPS is further illustrated through analysis of privacy loss in DCOP algorithms, when such algorithms are used by personal assistant agents to schedule meetings among users. In addition, VPS helps identify dimensions along which to classify and construct new privacy metrics and it also supports their quantitative comparison. Second, the article presents key inference rules that may be used in analysis of privacy loss in DCOP algorithms under different assumptions. Third, detailed experiments based on the VPS-driven analysis lead to the following key results: (i) decentralization by itself does not provide superior protection of privacy in DisCSP/DCOP algorithms when compared with centralization; instead, privacy protection also requires the presence of uncertainty about agents’ knowledge of the constraint graph. (ii) one needs to carefully examine the metrics chosen to measure privacy loss; the qualitative properties of privacy loss and hence the conclusions that can be drawn about an algorithm can vary widely based on the metric chosen. This paper should thus serve as a call to arms for further privacy research, particularly within the DisCSP/DCOP arena.
2006_6_teamcore_jaamas_privacy.pdf
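A toy, hypothetical rendering of the possible-states intuition behind VPS: an observer starts with the set of states another agent might be in, rules out states inconsistent with each observed message, and privacy loss is measured by how far that set shrinks. VPS itself is far more general, attaching valuations to possible states rather than simply counting them.

```python
def privacy_loss(initial_states, messages, consistent):
    """Fraction of the other agent's possible states eliminated by inference."""
    possible = set(initial_states)
    for msg in messages:
        possible = {s for s in possible if consistent(s, msg)}
    return 1.0 - len(possible) / len(initial_states)

# Made-up example: an agent's private deadline is one of 1..8, and each
# observed message reveals the deadline is no earlier than some bound.
states = range(1, 9)
msgs = [3, 5]
print(privacy_loss(states, msgs, lambda s, m: s >= m))  # 0.5
```

Comparing this quantity across algorithms (or against a centralized baseline that reveals everything to one party) is the kind of analysis the VPS framework formalizes and generalizes.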
Praveen Paruchuri, Milind Tambe, Fernando Ordonez, and Sarit Kraus. 2006. “Security in Multiagent Systems by Policy Randomization.” In Fifth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS). Abstract:
Security in multiagent systems is commonly defined as the ability of the system to deal with intentional threats from other agents. This paper focuses on domains where such intentional threats are caused by unseen adversaries whose actions or payoffs are unknown. In such domains, action randomization can effectively degrade an adversary’s capability to predict and exploit an agent or agent team’s actions. Unfortunately, little attention has been paid to intentional randomization of agents’ policies in single-agent or decentralized (PO)MDPs in ways that do not significantly sacrifice rewards or break down coordination. This paper provides two key contributions to remedy this situation. First, it provides three novel algorithms, one based on a non-linear program and two based on linear programs (LPs), to randomize single-agent policies while attaining a certain level of expected reward. Second, it provides Rolling Down Randomization (RDR), a new algorithm that efficiently generates randomized policies for decentralized POMDPs via the single-agent LP method.
2006_10_teamcore_paruchur.pdf
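A minimal sketch of the policy-randomization tradeoff for a single decision point, using an off-the-shelf nonlinear solver rather than the paper's algorithms (all numbers made up; requires scipy): maximize the entropy of the action distribution subject to a floor on expected reward, so the policy is as unpredictable as possible while still earning enough.

```python
import numpy as np
from scipy.optimize import minimize

rewards = np.array([10.0, 6.0, 4.0])  # expected reward of each action (made up)
reward_floor = 7.0                     # minimum acceptable expected reward

def neg_entropy(p):
    p = np.clip(p, 1e-12, 1.0)  # avoid log(0)
    return float(np.sum(p * np.log(p)))

result = minimize(
    neg_entropy,                       # minimizing -H(p) maximizes entropy
    x0=np.ones(3) / 3,
    bounds=[(0.0, 1.0)] * 3,
    constraints=[
        {"type": "eq", "fun": lambda p: np.sum(p) - 1.0},          # valid distribution
        {"type": "ineq", "fun": lambda p: p @ rewards - reward_floor},  # reward floor
    ],
    method="SLSQP",
)
print(result.x)  # most-random policy still earning >= 7.0 in expectation
```

Lowering `reward_floor` admits more randomness; raising it toward the best deterministic reward forces the policy toward determinism, which is the reward-versus-unpredictability tradeoff the paper's programs navigate over full (PO)MDP policies.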
Jonathan P. Pearce, Rajiv T. Maheswaran, and Milind Tambe. 2006. “Solution Sets for DCOPs and Graphical Games.” In Fifth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS). Abstract:
A distributed constraint optimization problem (DCOP) is a formalism that captures the rewards and costs of local interactions within a team of agents, each of whom is choosing an individual action. When rapidly selecting a single joint action for a team, we typically solve DCOPs (often using locally optimal algorithms) to generate a single solution. However, in scenarios where a set of joint actions (i.e., a set of assignments to a DCOP) is to be generated, metrics are needed to help select this set appropriately and to allocate resources efficiently for the joint actions in the set. To address this need, we introduce k-optimality, a metric that captures the desirable properties of diversity and relative quality of a set of locally optimal solutions using a parameter that can be tuned based on the level of these properties required. To achieve effective resource allocation for this set, we introduce several upper bounds on the cardinalities of k-optimal joint action sets. These bounds are computable in constant time if we ignore the graph structure, but tighter, graph-based bounds are feasible at higher computation cost. The bounds help choose the appropriate level of k-optimality for settings with fixed resources and help determine appropriate resource allocation for settings where a fixed level of k-optimality is desired. In addition, our bounds for a 1-optimal joint action set for a DCOP also apply to the number of pure-strategy Nash equilibria in a graphical game of noncooperative agents.
2006_4_teamcore_aamas06pearce.pdf
