The utility of the multi-agent team approach for coordinating distributed agents has been demonstrated in a number of large-scale systems for sensing and acting, such as sensor
networks for real-time tracking of moving targets (Modi et
al. 2001) and disaster rescue simulation domains, such as
the RoboCup Rescue Simulation Domain (Kitano et al. 1999;
Tadokoro et al. 2000). These domains contain tasks that can
be performed only through the collaborative actions of the agents. Incomplete or incorrect knowledge, owing to constrained sensing and the uncertainty of the environment, further motivates the
need for these agents to explicitly work in teams. A key precursor to teamwork is team formation, the problem of how
best to organize the agents into collaborating teams that perform the tasks that arise. For instance, in the disaster rescue
simulation domain, injured civilians in a burning building
may require a team of two ambulances and three nearby
fire-brigades to extinguish the fire and quickly rescue the
civilians. If there are several such fires and injured civilians, the teams must be carefully formed to optimize performance.
Our work in team formation focuses on dynamic, real-time environments, such as sensor networks (Modi et al.
2001) and RoboCup Rescue Simulation Domain (Kitano
et al. 1999; Tadokoro et al. 2000). In such domains,
teams must be formed rapidly so that tasks are performed within
given deadlines, and teams must be reformed in response
to the dynamic appearance or disappearance of tasks. The
problems with current team formation work for such
dynamic, real-time domains are two-fold: (i) most team
formation algorithms (Tidhar, Rao, & Sonenberg 1996;
Hunsberger & Grosz 2000; Fatima & Wooldridge 2001;
Horling, Benyo, & Lesser 2001; Modi et al. 2001) are static;
to adapt to a changing environment, a static algorithm would have to be run repeatedly. (ii) Team formation has largely relied on experimental work, without any
theoretical analysis of key properties of team formation algorithms, such as their worst-case complexity. Such analysis is especially important given the real-time nature of these domains.
In this paper we take initial steps toward addressing both of these problems. As tasks change and members of the team fail, the
current team needs to evolve to handle the changes. In both
the sensor network domain (Modi et al. 2001) and RoboCup
Rescue (Kitano et al. 1999; Tadokoro et al. 2000), each
re-organization of the team requires time (e.g., fire-brigades
may need to drive to a new location) and, given the need for quick response, is hence expensive. Clearly, the
current configuration of agents is relevant to how quickly
and how well they can be re-organized in the future. Each reorganization should yield a team that is effective at performing the existing tasks but
also flexible enough to adapt quickly to new scenarios. We
refer to this reorganization of the team as “Team Formation for Reformation”. To solve this problem, we present R-COM-MTDPs
(Roles and Communication in a Markov Team Decision
Process), a formal model based on communicating decentralized POMDPs, to address the above shortcomings. R-COM-MTDP significantly extends an earlier model,
COM-MTDP (Pynadath & Tambe 2002), with the important additions of roles and agents' local states, to more
closely model current complex multiagent teams. Thus, R-COM-MTDP provides decentralized optimal policies for taking
up and changing roles in a team (planning ahead to minimize
reorganization costs), and for executing such roles.
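The separation between role-taking and role-execution decisions can be sketched as follows. This is a minimal, hypothetical Python illustration of the idea only; the names, state encodings, and policies here are our own illustrative assumptions, not part of the R-COM-MTDP formalism:

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical sketch: an agent's overall policy is split into a
# role-taking policy (which role to adopt, given its local state) and a
# role-execution policy (which domain action to take in that role).
RoleTakingPolicy = Callable[[str], str]          # local state -> role
RoleExecutionPolicy = Callable[[str, str], str]  # (role, local state) -> action

@dataclass
class Agent:
    take_role: RoleTakingPolicy
    execute_role: RoleExecutionPolicy
    role: str = "unassigned"

    def step(self, local_state: str) -> str:
        # Role-taking decision: changing roles may incur a reorganization
        # cost (e.g., a fire-brigade driving to a new location).
        self.role = self.take_role(local_state)
        # Role-execution decision within the adopted role.
        return self.execute_role(self.role, local_state)

# Illustrative policies for a fire-brigade agent (names are our own).
def take_role(local_state: str) -> str:
    return "extinguisher" if local_state == "fire-visible" else "scout"

def execute_role(role: str, local_state: str) -> str:
    return {"extinguisher": "spray-water", "scout": "search-area"}[role]

agent = Agent(take_role, execute_role)
print(agent.step("fire-visible"))  # -> spray-water
print(agent.step("no-fire"))       # -> search-area
```

In the full model, both policies are optimized jointly over the decentralized POMDP, so that role-taking decisions account for future reorganization costs rather than being chosen greedily as in this sketch.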
R-COM-MTDPs provide a general tool for analyzing role-taking and role-executing policies in multiagent teams. We
show that while the generation of optimal policies in R-COM-MTDPs is NEXP-complete, different communication and
observability conditions significantly reduce this complexity. In this paper, we use the disaster rescue domain to motivate the “Team Formation for Reformation” problem. We
present real-world scenarios where such an approach would
be useful and use the RoboCup Rescue Simulation Environment (Kitano et al. 1999; Tadokoro et al. 2000) to illustrate
the workings of our model.