Security in multiagent systems is commonly defined as the ability of the system to deal with intentional threats from other agents.
This paper focuses on domains where such intentional threats are
caused by unseen adversaries whose actions or payoffs are unknown. In such domains, action randomization can effectively degrade an adversary’s ability to predict and exploit the actions of an agent or agent
team. Unfortunately, little attention has been paid to intentional randomization of agents’ policies in single-agent or decentralized (PO)MDPs without significantly sacrificing rewards or
breaking down coordination. This paper makes two key contributions to remedy this situation. First, it presents three novel
algorithms, one based on a non-linear program and two based on
linear programs (LPs), that randomize single-agent policies while still attaining a specified level of expected reward. Second, it presents
Rolling Down Randomization (RDR), a new algorithm that efficiently generates randomized policies for decentralized POMDPs
via the single-agent LP method.