Publications

2006

Rachel Greenstadt, Jonathan P. Pearce, Emma Bowring, and Milind Tambe. 2006. “Experimental analysis of privacy loss in DCOP algorithms (short paper).” In Fifth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS).Abstract

Distributed Constraint Optimization (DCOP) is rapidly emerging as a prominent technique for multiagent coordination. Unfortunately, rigorous quantitative evaluations of privacy loss in DCOP algorithms have been lacking despite the fact that agent privacy is a key motivation for applying DCOPs in many applications. Recently, Maheswaran et al. [3, 4] introduced a framework for quantitative evaluations of privacy in DCOP algorithms, showing that early DCOP algorithms lose more privacy than purely centralized approaches and questioning the motivation for applying DCOPs. Do state-of-the art DCOP algorithms suffer from a similar shortcoming? This paper answers that question by investigating the most efficient DCOP algorithms, including both DPOP and ADOPT.

2006_9_teamcore_aamas06.pdf

Yoonheui Kim, Ranjit Nair, Pradeep Varakantham, Milind Tambe, and Makoto Yokoo. 2006. “Exploiting Locality of Interaction in Networked Distributed POMDPs .” In AAAI Spring Symposium on Distributed Planning and Scheduling.Abstract

In many real-world multiagent applications such as distributed sensor nets, a network of agents is formed based on each agent’s limited interactions with a small number of neighbors. While distributed POMDPs capture the real-world uncertainty in multiagent domains, they fail to exploit such locality of interaction. Distributed constraint optimization (DCOP) captures the locality of interaction but fails to capture planning under uncertainty. In previous work, we presented a model synthesized from distributed POMDPs and DCOPs, called Networked Distributed POMDPs (ND-POMDPs). Also, we presented LID-JESP (locally interacting distributed joint equilibrium-based search for policies: a distributed policy generation algorithm based on DBA (distributed breakout algorithm). In this paper, we present a stochastic variation of the LID-JESP that is based on DSA (distributed stochastic algorithm) that allows neighboring agents to change their policies in the same cycle. Through detailed experiments, we show how this can result in speedups without a large difference in solution quality. We also introduce a technique called hyper-linkbased decomposition that allows us to exploit locality of interaction further, resulting in faster run times for both LID-JESP and its stochastic variant without any loss in solution quality.

2006_3_teamcore_main.pdf

2006. “Implementation Techniques for solving POMDPs in Personal Assistant Domains .” In Programming Multiagent Systems (PROMAS). Springer Press.Abstract

Agents or agent teams deployed to assist humans often face the challenges of monitoring the state of key processes in their environment (including the state of their human users themselves) and making periodic decisions based on such monitoring. POMDPs appear well suited to enable agents to address these challenges, given the uncertain environment and cost of actions, but optimal policy generation for POMDPs is computationally expensive. This paper introduces two key implementation techniques (one exact and one approximate), where the policy computation is restricted to the belief space polytope that remains reachable given the progress structure of a domain. One technique uses Lagrangian methods to compute tighter bounds on belief space support in polynomial time, while the other technique is based on approximating policy vectors in dense policy regions of the bounded belief polytope. We illustrate this by enhancing two of the fastest existing algorithms for exact POMDP policy generation. The order of magnitude speedups demonstrate the utility of our implementation techniques in facilitating the deployment of POMDPs within agents assisting human users.

2006_5_teamcore_promas_varakantham.pdf

Emma Bowring, Milind Tambe, and Makoto Yokoo. 2006. “Multiply-Constrained DCOP for Distributed Planning and Scheduling .” In AAAI Spring Symposium on Distributed Planning and Scheduling .Abstract

Distributed constraint optimization (DCOP) has emerged as a useful technique for multiagent planning and scheduling. While previous DCOP work focuses on optimizing a single team objective, in many domains, agents must satisfy additional constraints on resources consumed locally (due to interactions within their local neighborhoods). Such local resource constraints may be required to be private or shared for efficiency’s sake. This paper provides a novel multiplyconstrained DCOP algorithm for addressing these domains. This algorithm is based on mutually-intervening search, i.e. using local resource constraints to intervene in the search for the optimal solution and vice versa, realized via three key ideas: (i) transforming n-ary constraints via virtual variables to maintain privacy; (ii) dynamically setting upper bounds on joint resource consumption with neighbors; and (iii) identifying if the local DCOP graph structure allows agents to compute exact resource bounds for additional efficiency. These ideas are implemented by modifying Adopt, one of the most efficient DCOP algorithms. Both detailed experimental results as well as proofs of correctness are presented.

2006_1_teamcore_ss_01.pdf

Emma Bowring, Milind Tambe, and Makoto Yokoo. 2006. “Multiply-Constrained Distributed Constraint Optimization .” In Fifth International Joint Conference on Autonomous Agents and Multi Agent Systems (AAMAS).Abstract

Distributed constraint optimization (DCOP) has emerged as a useful technique for multiagent coordination. While previous DCOP work focuses on optimizing a single team objective, in many domains, agents must satisfy additional constraints on resources consumed locally (due to interactions within their local neighborhoods). Such resource constraints may be required to be private or shared for efficiency’s sake. This paper provides a novel multiply-constrained DCOP algorithm for addressing these domains which is based on mutually-intervening search, i.e. using local resource constraints to intervene in the search for the optimal solution and vice versa. It is realized through three key ideas: (i) transforming n-ary constraints to maintain privacy; (ii) dynamically setting upper bounds on joint resource consumption with neighbors; and (iii) identifying if the local DCOP graph structure allows agents to compute exact resource bounds for additional efficiency. These ideas are implemented by modifying Adopt, one of the most efficient DCOP algorithms. Both detailed experimental results as well as proofs of correctness are presented.

2006_7_teamcore_aamas06mca.pdf

Praveen Paruchuri, Emma Bowring, Ranjit Nair, Jonathan P. Pearce, Nathan Schurr, Milind Tambe, and Pradeep Varakantham. 2006. “Mutiagent Teamwork: Hybrid Approaches .” In Computer society of India Communications (Invited).Abstract

Today within the multiagent community, we see at least four competing methods to building multiagent systems: beliefdesire-intention (BDI), distributed constraint optimization (DCOP), distributed POMDPs, and auctions or game-theoretic methods. While there is exciting progress within each approach, there is a lack of cross-cutting research. This article highlights the various hybrid techniques for multiagent teamwork developed by the teamcore group. In particular, for the past decade, the TEAMCORE research group has focused on building agent teams in complex, dynamic domains. While our early work was inspired by BDI, we will present an overview of recent research that uses DCOPs and distributed POMDPs in building agent teams. While DCOP and distributed POMDP algorithms provide promising results, hybrid approaches allow us to use the complementary strengths of different techniques to create algorithms that perform better than either of their component algorithms alone. For example, in the BDI-POMDP hybrid approach, BDI team plans are exploited to improve POMDP tractability, and POMDPs improve BDI team plan performance.

2006_14_teamcore_tambe.pdf

Rajiv T. Maheswaran, Jonathan P. Pearce, Emma Bowring, Pradeep Varakantham, and Milind Tambe. 2006. “Privacy Loss in Distributed Constraint Reasoning: A Quantitative Framework for Analysis and its Applications .” Journal of Autonomous Agents and Multiagent Systems (JAAMAS), 13, Pp. 27 - 60.Abstract

It is critical that agents deployed in real-world settings, such as businesses, offices, universities and research laboratories, protect their individual users’ privacy when interacting with other entities. Indeed, privacy is recognized as a key motivating factor in the design of several multiagent algorithms, such as in distributed constraint reasoning (including both algorithms for distributed constraint optimization (DCOP) and distributed constraint satisfaction (DisCSPs)), and researchers have begun to propose metrics for analysis of privacy loss in such multiagent algorithms. Unfortunately, a general quantitative framework to compare these existing metrics for privacy loss or to identify dimensions along which to construct new metrics is currently lacking. This paper presents three key contributions to address this shortcoming. First, the paper presents VPS (Valuations of Possible States), a general quantitative framework to express, analyze and compare existing metrics of privacy loss. Based on a state-space model, VPS is shown to capture various existing measures of privacy created for specific domains of DisCSPs. The utility of VPS is further illustrated through analysis of privacy loss in DCOP algorithms, when such algorithms are used by personal assistant agents to schedule meetings among users. In addition, VPS helps identify dimensions along which to classify and construct new privacy metrics and it also supports their quantitative comparison. Second, the article presents key inference rules that may be used in analysis of privacy loss in DCOP algorithms under different assumptions. Third, detailed experiments based on the VPS-driven analysis lead to the following key results: (i) decentralization by itself does not provide superior protection of privacy in DisCSP/DCOP algorithms when compared with centralization; instead, privacy protection also requires the presence of uncertainty about agents’ knowledge of the constraint graph. (ii) one needs to carefully examine the metrics chosen to measure privacy loss; the qualitative properties of privacy loss and hence the conclusions that can be drawn about an algorithm can vary widely based on the metric chosen. This paper should thus serve as a call to arms for further privacy research, particularly within the DisCSP/DCOP arena.

2006_6_teamcore_jaamas_privacy.pdf

Praveen Paruchuri, Milind Tambe, Fernando Ordonez, and Sarit Kraus. 2006. “Security in Multiagent Systems by Policy Randomization .” In Fifth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS).Abstract

Security in multiagent systems is commonly defined as the ability of the system to deal with intentional threats from other agents. This paper focuses on domains where such intentional threats are caused by unseen adversaries whose actions or payoffs are unknown. In such domains, action randomization can effectively deteriorate an adversary’s capability to predict and exploit an agent/agent team’s actions. Unfortunately, little attention has been paid to intentional randomization of agents’ policies in single-agent or decentralized (PO)MDPs without significantly sacrificing rewards or breaking down coordination. This paper provides two key contributions to remedy this situation. First, it provides three novel algorithms, one based on a non-linear program and two based on linear programs (LP), to randomize single-agent policies, while attaining a certain level of expected reward. Second, it provides Rolling Down Randomization (RDR), a new algorithm that efficiently generates randomized policies for decentralized POMDPs via the single-agent LP method.

2006_10_teamcore_paruchur.pdf

Jonathan P. Pearce, Rajiv T. Maheswaran, and Milind Tambe. 2006. “Solution Sets for DCOPs and Graphical Games .” In Fifth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS).Abstract

A distributed constraint optimization problem (DCOP) is a formalism that captures the rewards and costs of local interactions within a team of agents, each of whom is choosing an individual action. When rapidly selecting a single joint action for a team, we typically solve DCOPs (often using locally optimal algorithms) to generate a single solution. However, in scenarios where a set of joint actions (i.e. a set of assignments to a DCOP) is to be generated, metrics are needed to help appropriately select this set and efficiently allocate resources for the joint actions in the set. To address this need, we introduce k-optimality, a metric that captures the desirable properties of diversity and relative quality of a set of locally-optimal solutions using a parameter that can be tuned based on the level of these properties required. To achieve effective resource allocation for this set, we introduce several upper bounds on the cardinalities of k-optimal joint action sets. These bounds are computable in constant time if we ignore the graph structure, but tighter, graphbased bounds are feasible with higher computation cost. Bounds help choose the appropriate level of k-optimality for settings with fixed resources and help determine appropriate resource allocation for settings where a fixed level of k-optimality is desired. In addition, our bounds for a 1-optimal joint action set for a DCOP also apply to the number of pure-strategy Nash equilibria in a graphical game of noncooperative agents.

2006_4_teamcore_aamas06pearce.pdf

Nathan Schurr, Pratik Patil, Fred Pighin, and Milind Tambe. 2006. “Using Multiagent Teams to Improve the Training of Incident Commanders .” In Fifth International Joint Conference on Autonomous Agents and Multi Agent Systems (AAMAS) Industry Track.Abstract

The DEFACTO system is a multiagent based tool for training incident commanders for large scale disasters. In this paper, we highlight some of the lessons that we have learned from our interaction with the Los Angeles Fire Department (LAFD) and how they have affected the way that we continued the design of our training system. These lessons were gleaned from LAFD feedback and initial training exercises and they include: system design, visualization, improving trainee situational awareness, adjusting training level of difficulty and situation scale. We have taken these lessons and used them to improve the DEFACTO system’s training capabilities. We have conducted initial training exercises to illustrate the utility of the system in terms of providing useful feedback to the trainee.

2006_8_teamcore_schurr_industry_aamas_06.pdf

Pradeep Varakantham, Ranjit Nair, Milind Tambe, and Makoto Yokoo. 2006. “Winning back the CUP for distributed POMDPs: Planning over continuous belief spaces .” In Fifth International Joint Conference on Autonomous Agents and Multi Agent Systems (AAMAS).Abstract

Distributed Partially Observable Markov Decision Problems (Distributed POMDPs) are evolving as a popular approach for modeling multiagent systems, and many different algorithms have been proposed to obtain locally or globally optimal policies. Unfortunately, most of these algorithms have either been explicitly designed or experimentally evaluated assuming knowledge of a starting belief point, an assumption that often does not hold in complex, uncertain domains. Instead, in such domains, it is important for agents to explicitly plan over continuous belief spaces. This paper provides a novel algorithm to explicitly compute finite horizon policies over continuous belief spaces, without restricting the space of policies. By marrying an efficient single-agent POMDP solver with a heuristic distributed POMDP policy-generation algorithm, locally optimal joint policies are obtained, each of which dominates within a different part of the belief region. We provide heuristics that significantly improve the efficiency of the resulting algorithm and provide detailed experimental results. To the best of our knowledge, these are the first run-time results for analytically generating policies over continuous belief spaces in distributed POMDPs.

2006_2_teamcore_aamas2006.pdf

2005

Paul Scerri, A. Farinelli, Steven Okamoto, and Milind Tambe. 7/2005. “Allocating tasks in extreme teams .” In Fourth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS).Abstract

Extreme teams, large-scale agent teams operating in dynamic environments, are on the horizon. Such environments are problematic for current task allocation algorithms due to the lack of locality in agent interactions. We propose a novel distributed task allocation algorithm for extreme teams, called LA-DCOP, that incorporates three key ideas. First, LA-DCOP’s task allocation is based on a dynamically computed minimum capability threshold which uses approximate knowledge of overall task load. Second, LA-DCOP uses tokens to represent tasks and further minimize communication. Third, it creates potential tokens to deal with inter-task constraints of simultaneous execution. We show that LA-DCOP convincingly outperforms competing distributed task allocation algorithms while using orders of magnitude fewer messages, allowing a dramatic scale-up in extreme teams, upto a fully distributed, proxybased team of 200 agents. Varying threshold are seen as a key to outperforming competing distributed algorithms in the domain of simulated disaster rescue.

2005_10_teamcore_final.pdf

Milind Tambe, Emma Bowring, H. Jung, Gal Kaminka, Rajiv T. Maheswaran, Janusz Marecki, Pragnesh J. Modi, Ranjit Nair, Steven Okamoto, Jonathan P. Pearce, Praveen Paruchuri, D. V. Pynadath, Paul Scerri, Nathan Schurr, and Pradeep Varakantham. 7/2005. “Conflicts in teamwork: Hybrids to the rescue .” In Fourth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS).Abstract

Today within the AAMAS community, we see at least four competing approaches to building multiagent systems: beliefdesire-intention (BDI), distributed constraint optimization (DCOP), distributed POMDPs, and auctions or game-theoretic approaches. While there is exciting progress within each approach, there is a lack of cross-cutting research. This paper highlights hybrid approaches for multiagent teamwork. In particular, for the past decade, the TEAMCORE research group has focused on building agent teams in complex, dynamic domains. While our early work was inspired by BDI, we will present an overview of recent research that uses DCOPs and distributed POMDPs in building agent teams. While DCOP and distributed POMDP algorithms provide promising results, hybrid approaches help us address problems of scalability and expressiveness. For example, in the BDI-POMDP hybrid approach, BDI team plans are exploited to improve POMDP tractability, and POMDPs improve BDI team plan performance. We present some recent results from applying this approach in a Disaster Rescue simulation domain being developed with help from the Los Angeles Fire Department.

2005_15_teamcore_tambe.pdf

Jonathan P. Pearce, Rajiv T. Maheswaran, and Milind Tambe. 7/2005. “How Local Is That Optimum? k-optimality for DCOP .” In Fourth International Joint Conference Poster on Autonomous Agents and Multiagent Systems (AAMAS).Abstract

In multi-agent systems where sets of joint actions (JAs) are generated, metrics are needed to evaluate these sets and efficiently allocate resources for the many JAs. For the case where a JA set can be represented by multiple solutions to a DCOP, we introduce koptimality as a metric that captures desirable properties of diversity and relative quality, and apply results from coding theory to obtain upper bounds on cardinalities of k-optimal JA sets. These bounds can help choose the appropriate level of k-optimality for settings with fixed resources and help determine appropriate resource allocation for settings where a fixed level of k-optimality is desired.

2005_9_teamcore_pearce_aamas05.pdf

R. Nair, P. Varakantam, M. Tambe, and M. Yokoo. 7/2005. “Networked Distributed POMDPs: A Synergy of Distributed Constraint Optimization and POMDPs .” In International Joint Conference on Artificial Intelligence (IJCAI).Abstract

2005_19_teamcore_post_0517.pdf

Ranjit Nair, Pradeep Varakantham, Milind Tambe, and Makoto Yokoo. 7/2005. “Networked Distributed POMDPs: A Synthesis of Distributed Constraint Optimization and POMDPs.” In Twentieth AAAI-05 National Conference on Artificial Intelligence.Abstract

2005_2_teamcore_297.pdf

Rajiv T. Maheswaran, Jonathan P. Pearce, Pradeep Varakantham, Emma Bowring, and Milind Tambe. 7/2005. “Valuations of Possible States (VPS): A Quantitative Framework for Analysis of Privacy Loss Among Collaborative Personal Assistant Agents.” In Fourth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS).Abstract

For agents deployed in real-world settings, such as businesses, universities and research laboratories, it is critical that agents protect their individual users’ privacy when interacting with others entities. Indeed, privacy is recognized as a key motivating factor in design of several multiagent algorithms, such as distributed constraint optimization (DCOP) algorithms. Unfortunately, rigorous and general quantitative metrics for analysis and comparison of such multiagent algorithms with respect to privacy loss are lacking. This paper takes a key step towards developing a general quantitative model from which one can analyze and generate metrics of privacy loss by introducing the VPS (Valuations of Possible States) framework. VPS is shown to capture various existing measures of privacy created for specific domains of distributed constraint satisfactions problems (DCSPs). The utility of VPS is further illustrated via analysis of DCOP algorithms, when such algorithms are used by personal assistant agents to schedule meetings among users. In addition, VPS allows us to quantitatively evaluate the properties of several privacy metrics generated through qualitative notions. We obtain the unexpected result that decentralization does not automatically guarantee superior protection of privacy.

2005_5_teamcore_p799_maheswaran.pdf

Nathan Schurr, Janusz Marecki, Milind Tambe, J. P. Lewis, and N. Kasinadhuni. 3/21/2005. “The Future of Disaster Response: Humans Working with Multiagent Teams using DEFACTO .” In AAAI Spring Symposium on Homeland Security. Stanford, CA.Abstract

When addressing terrorist threats we must give special attention to both prevention and disaster response. Enabling effective interactions between agent teams and humans for disaster response is a critical area of research, with encouraging progress in the past few years. However, previous work suffers from two key limitations: (i) limited human situational awareness, reducing human effectiveness in directing agent teams and (ii) the agent team’s rigid interaction strategies that limit team performance. This paper focuses on a novel disaster response software prototype, called DEFACTO (Demonstrating Effective Flexible Agent Coordination of Teams through Omnipresence). DEFACTO is based on a software proxy architecture and 3D visualization system, which addresses the two limitations described above. First, the 3D visualization interface enables human virtual omnipresence in the environment, improving human situational awareness and ability to assist agents. Second, generalizing past work on adjustable autonomy, the agent team chooses among a variety of “team-level” interaction strategies, even excluding humans from the loop in extreme circumstances.

2005_16_teamcore_ss105schurrn.pdf

3/21/2005. “Optimize My Schedule but Keep It Flexible: Distributed Multi-Criteria Coordination for Personal Assistants .” In AAAI Spring Symposium on Persistent Assistants: Living and Working with AI. Menlo Park, CA.Abstract

Research projects have begun focusing on deploying personal assistant agents to coordinate users in such diverse environments as offices, distributed manufacturing or design centers, and in support of first responders for emergencies. In such environments, distributed constraint optimization (DCOP) has emerged as a key technology for multiple collaborative assistants to coordinate with each other. Unfortunately, while previous work in DCOP only focuses on coordination in service of optimizing a single global team objective, personal assistants often require satisfying additional individual userspecified criteria. This paper provides a novel DCOP algorithm that enables personal assistants to engage in such multicriteria coordination while maintaining the privacy of their additional criteria. It uses n-ary NOGOODS implemented as private variables to achieve this. In addition, we’ve developed an algorithm that reveals only the individual criteria of a link and can speed up performance for certain problem structures. The key idea in this algorithm is that interleaving the criteria searches — rather than sequentially attempting to satisfy the criteria — improves efficiency by mutually constraining the distributed search for solutions. These ideas are realized in the form of private-g and public-g Multi-criteria ADOPT, built on top of ADOPT, one of the most efficient DCOP algorithms. We present our detailed algorithm, as well as some experimental results in personal assistant domains.

2005_7_teamcore_ss705bowringe.pdf

Rajiv T. Maheswaran, Jonathan P. Pearce, Pradeep Varakantham, Emma Bowring, and Milind Tambe. 3/21/2005. “Valuations of Possible States (VPS): A Quantitative Framework for Analysis of Privacy Loss Among Collaborative Personal Assistant Agents .” In AAAI Spring Symposium on Persistent Assistants: Living and Working with AI. Menlo Park, CA.Abstract

2005_4_teamcore_aaaiss05vps.pdf

Publications

Pages

Publications: Years

Recent Publications

59f5d16cd6de1f782b70da1b8b307fdc

css_footer