Many multiagent settings are appropriately modeled as Stackelberg games [Fudenberg
and Tirole 1991; Paruchuri et al. 2007], where a leader commits to a strategy first, and
then a follower selfishly optimizes its own reward, considering the action chosen by the
leader. Stackelberg games are commonly used to model attacker-defender scenarios in security domains [Brown et al. 2006] as well as in patrolling [Paruchuri et al. 2007; Paruchuri
et al. 2008]. For example, security personnel patrolling an infrastructure commit to a patrolling strategy first, before their adversaries act taking this committed strategy into account. Indeed, Stackelberg games are being used at the Los Angeles International Airport
to schedule security checkpoints and canine patrols [Murr 2007; Paruchuri et al. 2008; Pita
et al. 2008a]. They could potentially be used in network routing, pricing in transportation
systems and many other situations [Korilis et al. 1997; Cardinal et al. 2005].
Although the follower in a Stackelberg game is allowed to observe the leader’s strategy
before choosing its own strategy, there is often an advantage for the leader over the case
where both players must choose their moves simultaneously. To see the advantage of being
the leader in a Stackelberg game, consider the game with the payoff as shown in Table I.
The leader is the row player and the follower is the column player. The only pure-strategy
Nash equilibrium for this game is when the leader plays a and the follower plays c which
gives the leader a payoff of 2. However, if the leader commits to a mixed strategy of
playing a and b with equal (0.5) probability, then the follower will play d, leading to an
expected payoff for the leader of 3.5.