My research concentrates on developing reasoning techniques for intelligent, autonomous agent systems. In particular, I focus on planning techniques for both single and multi-agent systems acting in uncertain domains. In modeling these domains, I consider two types of uncertainty: (i) the outcomes of agent actions are uncertain and (ii) the amount of resources consumed by agent actions is uncertain and only characterized by continuous probability density functions. Such rich domains, that range from the Mars rover exploration to the unmanned aerial surveillance to the automated disaster rescue operations are commonly modeled as continuous resource Markov decision processes (MDPs) that can then be solved in order to construct policies for agents acting in these domains. This thesis addresses two major unresolved problems in continuous resource MDPs. First, they are very difficult to solve and existing algorithms are either fast, but make additional restrictive assumptions about the model, or do not introduce these assumptions but are very inefficient. Second, continuous resource MDP framework is not directly applicable to multi-agent systems and current approaches all discretize resource levels or assume deterministic resource consumption which automatically invalidates the formal solution quality guarantees. The goal of my thesis is to fundamentally alter this landscape in three contributions:

I first introduce CPH, a fast analytic algorithm for solving continuous resource MDPs. CPH solves the planning problems at hand by first approximating with a desired accuracy the probability distributions over the resource consumptions with phase-type distributions, which use exponential distributions as building blocks. It then uses value iteration to solve the resulting MDPs more efficiently than its closest competitor, and allows for a systematic trade-off of solution quality for speed. Second, to improve the anytime performance of CPH and other continuous resource MDP solvers I introduce the DPFP algorithm. Rather than using value iteration to solve the problem at hand, DPFP performs a forward search in the corresponding dual space of cumulative distribution functions. In doing so, DPFP discriminates in its policy generation effort providing only approximate policies for regions of the state-space reachable with low probability yet it bounds the error that such approximation entails. Third, I introduce CR-DEC-MDP, a framework for planning with continuous resources in multi-agent systems and propose two algorithms for solving CR-DEC-MDPs: The first algorithm (VFP) emphasizes scalability. It performs a series of policy iterations in order to quickly find a locally optimal policy. In contrast, the second algorithm (M-DPFP) stresses optimality; it allows for a systematic trade-off of solution quality for speed by using the concept of DPFP in a multiagent setting. My results show up to three orders of magnitude speedups in solving single agent planning problems and up to one order of magnitude speedup in solving multi-agent planning problems. Furthermore, I demonstrate the practical use of one of my algorithms in a large-scale disaster simulation where it allows for a more efficient rescue operation.

%G eng %9 PhD thesis %0 Conference Paper %B International Joint Conference on Autonomous Agents and Multiagent Systems, 2008 %D 2008 %T Playing games with security: An efficient exact algorithm for Bayesian Stackelberg Games %A Praveen Paruchuri %A Jonathan P. Pearce %A Janusz Marecki %A Tambe, Milind %A Ordonez, Fernando %A Kraus, Sarit %X In a class of games known as Stackelberg games, one agent (the leader) must commit to a strategy that can be observed by the other agent (the follower or adversary) before the adversary chooses its own strategy. We consider Bayesian Stackelberg games, in which the leader is uncertain about the types of adversary it may face. Such games are important in security domains, where, for example, a security agent (leader) must commit to a strategy of patrolling certain areas, and a robber (follower) has a chance to observe this strategy over time before choosing its own strategy of where to attack. This paper presents an efficient exact algorithm for finding the optimal strategy for the leader to commit to in these games. This algorithm, DOBSS, is based on a novel and compact mixed-integer linear programming formulation. Compared to the most efficient algorithm known previously for this problem, DOBSS is not only faster, but also leads to higher quality solutions, and does not suffer from problems of infeasibility that were faced by this previous algorithm. Note that DOBSS is at the heart of the ARMOR system that is currently being tested for security scheduling at the Los Angeles International Airport. %B International Joint Conference on Autonomous Agents and Multiagent Systems, 2008 %G eng %0 Conference Paper %B The Seventh International Conference on Autonomous Agents and Multiagent Systems %D 2008 %T RIAACT: A robust approach to adjustable autonomy for human-multiagent teams %A Nathan Schurr %A Janusz Marecki %A Tambe, Milind %X When human-multiagent teams act in real-time uncertain domains, adjustable autonomy (dynamic transferring of decisions between human and agents) raises three key challenges. First, the human and agents may differ significantly in their worldviews, leading to inconsistencies in their decisions. Second, these human-multiagent teams must operate and plan in real-time with deadlines with uncertain duration of human actions. Thirdly, adjustable autonomy in teams is an inherently distributed and complex problem that cannot be solved optimally and completely online. To address these challenges, our paper presents a solution for Resolving Inconsistencies in Adjustable Autonomy in Continuous Time (RIAACT). RIAACT incorporates models of the resolution of inconsistencies, continuous time planning techniques, and hybrid method to address coordination complexity. These contributions have been realized in a disaster response simulation system. %B The Seventh International Conference on Autonomous Agents and Multiagent Systems %G eng %0 Conference Paper %B Association for the Advancement of Artificial Intelligence 4th Multidiciplinary Workshop on Advances in Preference Handling %D 2008 %T Robust Solutions in Stackelberg Games: Addressing Boundedly Rational Human Preference Models %A Jain, Manish %A Fernando Ord´onez %A Pita, James %A Christopher Portway %A Tambe, Milind %A Craig Western %A Praveen Paruchuri %A Kraus, Sarit %X Stackelberg games represent an important class of games in which one player, the leader, commits to a strategy and the remaining players, the followers, make their decision with knowledge of the leader’s commitment. Existing algorithms for Bayesian Stackelberg games find optimal solutions while modeling uncertainty over follower types with an a-priori probability distribution. Unfortunately, in real-world applications, the leader may also face uncertainty over the follower’s response which makes the optimality guarantees of these algorithms fail. Such uncertainty arises because the follower’s specific preferences or the follower’s observations of the leader’s strategy may not align with the rational strategy, and it is not amenable to a-priori probability distributions. These conditions especially hold when dealing with human subjects. To address these uncertainties while providing quality guarantees, we propose three new robust algorithms based on mixed-integer linear programs (MILPs) for Bayesian Stackelberg games. A key result of this paper is a detailed experimental analysis that demonstrates that these new MILPs deal better with human responses: a conclusion based on 800 games with 57 human subjects as followers. We also provide run-time results on these MILPs. %B Association for the Advancement of Artificial Intelligence 4th Multidiciplinary Workshop on Advances in Preference Handling %G eng %0 Magazine Article %D 2008 %T Solving multiagent networks using distributed constraint optimization %A Jonathan P. Pearce %A Tambe, Milind %A Rajiv T. Maheswaran %X In many cooperative multiagent domains, the effect of local interactions between agents can be compactly represented as a network structure. Given that agents are spread across such a network, agents directly interact only with a small group of neighbors. A distributed constraint optimization problem (DCOP) is a useful framework to reason about such networks of agents. Given agents’ inability to communicate and collaborate in large groups in such networks, we focus on an approach called k-optimality for solving DCOPs. In this approach, agents form groups of one or more agents until no group of k or fewer agents can possibly improve the DCOP solution; we define this type of local optimum, and any algorithm guaranteed to reach such a local optimum, as k-optimal. The article provides an overview of three key results related to k-optimality. The first set of results are worst-case guarantees on the solution quality of k-optima in a DCOP. These guarantees can help determine an appropriate k-optimal algorithm, or possibly an appropriate constraint graph structure, for agents to use in situations where the cost of coordination between agents must be weighed against the quality of the solution reached. The second set of results are upper bounds on the number of k-optima that can exist in a DCOP. These results are useful in domains where a DCOP must generate a set of solutions rather than single solution. Finally, we sketch algorithms for k-optimality and provide some experimental results for 1-, 2- and 3-optimal algorithms for several types of DCOPs. %B AI Magazine %V 29 %P 47-66 %G eng %N 3 %0 Conference Paper %B Twenty Third AAAI Conference on Artificial Intelligence %D 2008 %T Towards Faster Planning with Continuous Resources in Stochastic Domains %A Janusz Marecki %A Tambe, Milind %X Agents often have to construct plans that obey resource limits for continuous resources whose consumption can only be characterized by probability distributions. While Markov Decision Processes (MDPs) with a state space of continuous and discrete variables are popular for modeling these domains, current algorithms for such MDPs can exhibit poor performance with a scale-up in their state space. To remedy that we propose an algorithm called DPFP. DPFP’s key contribution is its exploitation of the dual space cumulative distribution functions. This dual formulation is key to DPFP’s novel combination of three features. First, it enables DPFP’s membership in a class of algorithms that perform forward search in a large (possibly infinite) policy space. Second, it provides a new and efficient approach for varying the policy generation effort based on the likelihood of reaching different regions of the MDP state space. Third, it yields a bound on the error produced by such approximations. These three features conspire to allow DPFP’s superior performance and systematic trade-off of optimality for speed. Our experimental evaluation shows that, when run stand-alone, DPFP outperforms other algorithms in terms of its any-time performance, whereas when run as a hybrid, it allows for a significant speedup of a leading continuous resource MDP solver. %B Twenty Third AAAI Conference on Artificial Intelligence %G eng %0 Conference Paper %B AAAI Spring Symposium 2008 %D 2008 %T Using science fiction in teaching Artificial Intelligence %A Tambe, Milind %A Anne Balsamo %A Bowring, Emma %X Many factors are blamed for the decreasing enrollments in computer science and engineering programs in the U.S., including the dot-com economic bust and the increase in the use of “off-shore” programming labor. One major factor is also the lack of bold new vision and excitement about computer science, which thus results in a view of computer science as a field wedded to routine programming. To address this concern, we have focused on science fiction as a means to generate excitement about Artificial Intelligence, and thus in turn in Computer Science and Engineering. In particular, since the Fall of 2006, we have used science fiction in teaching Artificial Intelligence to undergraduate students at the University of Southern California (USC), in teaching activities ranging from an undergraduate upper division class in computer science to a semester-long freshman seminar for nonengineering students to micro-seminars during the welcome week. As an interdisciplinary team of scholar/instructors, our goal has been to use science fiction not only in motivating students to learn about AI, but also to use science fiction in understanding fundamental issues that arise at the intersection of technology and culture, as well as to provide students with a more creative and well-rounded course that provided a big picture view of computer science. This paper outlines the courses taught using this theme, provides an overview of our classroom teaching techniques in using science fiction, and discusses some of the lectures in more detail as exemplars. We conclude with feedback received, lessons learned and impact on both the computer science students and noncomputer-science (and non-engineering) students. %B AAAI Spring Symposium 2008 %G eng