MASSI (MultiAgents for Speeding up Social Impact)

A meta-level multi-agent system powered by LLM Agents to accelerate the development and deployment of AI for Social Impact (AI4SI) solutions

MASSI

The MASSI (MultiAgents for Speeding up Social Impact) project proposes a meta-level multi-agent system powered by LLM agents to accelerate the development and deployment of AI for Social Impact (AI4SI) solutions. Designed to overcome the labor-intensive nature of traditional AI4SI work, MASSI aims to speed up key phases across the entire pipeline, from problem formulation and solution design to testing and impact evaluation. Demonstrations of MASSI include the Decision-Language Model (DLM), which translates a public health worker's natural-language policy goals into executable reward functions for dynamic resource allocation problems modeled as Restless Multi-Armed Bandits (RMABs); the Social Choice Language Model (SCLM), which adjudicates among candidate reward functions using social welfare functions; and DPO-PRO, a distributionally robust preference-optimization technique.

Key papers

Yunfan Zhao, Niclas Boehmer, Aparna Taneja and Milind Tambe
In Proc. of the 24th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2025), Blue Sky Ideas Track


This Blue Sky Ideas paper proposes a meta-level multi-agent system that uses foundation models (FMs) and LLMs to accelerate the development of customized AI for Social Impact (AI4SI) solutions. The system reduces labor costs by assisting with problem formulation, solution design, and testing, while incorporating crucial aspects such as fairness and human-in-the-loop oversight.

Cheol Kim, Jai Moondra, Shresth Verma, Madeleine Pollack, Lingkai Kong, Milind Tambe, Swati Gupta
In Proc. of the 42nd International Conference on Machine Learning (ICML 2025)


The paper addresses a core challenge in Multi-Objective Reinforcement Learning (MORL): selecting a policy that manages trade-offs well when objectives are aggregated by a social welfare function. It introduces the α-approximate portfolio, a small set of policies such that, for every trade-off weighting in the spectrum of generalized p-means, some policy in the set is near-optimal. This lets decision-makers efficiently navigate the trade-off space and make informed choices about balancing human preferences.
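To make the portfolio idea concrete, here is a minimal Python sketch, not the paper's actual algorithm: generalized p-mean welfare plus an illustrative greedy cover heuristic for assembling an α-approximate portfolio. The function names, the p grid, and the greedy selection rule are assumptions for exposition.

import numpy as np

def p_mean(utilities, p):
    # Generalized p-mean welfare of a positive utility vector:
    # p = 1 is utilitarian (arithmetic mean), p -> 0 is Nash welfare
    # (geometric mean), p -> -inf is egalitarian (minimum).
    u = np.asarray(utilities, dtype=float)
    if p == 0:
        return float(np.exp(np.mean(np.log(u))))
    if np.isneginf(p):
        return float(u.min())
    return float(np.mean(u ** p) ** (1.0 / p))

def greedy_alpha_portfolio(policy_utils, p_grid, alpha):
    # Greedily add policies until, for every p in p_grid, some chosen
    # policy achieves at least alpha times the best achievable p-mean.
    scores = np.array([[p_mean(u, p) for p in p_grid] for u in policy_utils])
    best = scores.max(axis=0)
    uncovered = set(range(len(p_grid)))
    portfolio = []
    while uncovered:
        covers = [{j for j in uncovered if scores[i, j] >= alpha * best[j]}
                  for i in range(len(policy_utils))]
        i_star = max(range(len(policy_utils)), key=lambda i: len(covers[i]))
        portfolio.append(i_star)
        uncovered -= covers[i_star]
    return portfolio

# Example: per-group expected returns of four candidate policies.
policies = [[0.9, 0.1], [0.5, 0.5], [0.1, 0.9], [0.7, 0.4]]
print(greedy_alpha_portfolio(policies, p_grid=[-np.inf, 0, 1], alpha=0.9))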

Cheol Kim, Shresth Verma, Mauricio Tec, Milind Tambe
Under Submission at AAAI 2026


The paper proposes DPO-PRO, a distributionally robust optimization (DRO) enhancement of Direct Preference Optimization (DPO) for fine-tuning LLMs. The method specifically addresses noisy preference signals in reward generation for public health settings by accounting for uncertainty in the preference distribution. DPO-PRO improves robustness and matches the performance of prior self-reflection baselines such as the Decision-Language Model at significantly lower inference-time cost.
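As a rough illustration of how a DRO term can wrap the DPO objective, the sketch below aggregates per-pair DPO losses via the standard KL-ball dual bound with a fixed temperature. This is a generic construction assumed here for exposition and may differ from the paper's exact formulation; the radius rho and temperature tau are hypothetical hyperparameters.

import torch
import torch.nn.functional as F

def dpo_per_pair_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    # Standard DPO loss per preference pair, computed from policy and
    # reference log-probabilities of the chosen (w) and rejected (l)
    # responses.
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -F.logsigmoid(margin)  # shape: (batch,)

def dro_aggregate(losses, rho=0.1, tau=1.0):
    # KL-ball DRO upper bound with fixed temperature tau:
    #   sup_{KL(Q||P) <= rho} E_Q[loss] <= tau*rho + tau*log E_P[exp(loss/tau)]
    # The exponential tilting places more weight on high-loss pairs,
    # hedging against adversarial reweightings of the empirical
    # preference distribution.
    return tau * rho + tau * torch.log(torch.mean(torch.exp(losses / tau)))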

Nikhil Behari, Edwin Zhang, Yunfan Zhao, Aparna Taneja, Dheeraj Nagaraj, Milind Tambe
In Proc. of the 38th Conference on Neural Information Processing Systems (NeurIPS 2024)


The paper introduces a Decision-Language Model (DLM) to translate natural language policy goals specified by health workers into executable reward functions for Restless Multi-Armed Bandits (RMABs). This allows public health resource allocation policies to be automatically generated and dynamically adapted to new priorities.
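A minimal sketch of the generate-and-evaluate loop such a system might use, assuming a generic llm_generate text-generation call and a simulate scoring function, both hypothetical stand-ins; the paper's actual prompting and reflection procedure is more involved.

REWARD_PROMPT = """You design rewards for a restless multi-armed bandit
model of a public health program. Each arm (beneficiary) has features
{feature_names}. Policy goal: "{goal}"
Write a Python function reward(state, features) -> float."""

def propose_reward(llm_generate, goal, feature_names):
    # llm_generate: any text-generation call returning Python source code.
    code = llm_generate(REWARD_PROMPT.format(goal=goal,
                                             feature_names=feature_names))
    namespace = {}
    exec(code, namespace)  # assumes sandboxed execution of generated code
    return namespace["reward"]

def best_reward(llm_generate, goal, feature_names, simulate, n=5):
    # Sample several candidate reward functions and keep the one whose
    # induced RMAB policy scores highest in simulation.
    candidates = [propose_reward(llm_generate, goal, feature_names)
                  for _ in range(n)]
    return max(candidates, key=simulate)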

Shresth Verma, Niclas Boehmer, Lingkai Kong and Milind Tambe
To appear in Proc. of the 16th Conference on Game Theory and AI for Security (GameSec 2025)


This paper proposes the Social Choice Language Model (SCLM) to handle the multi-objective trade-offs that arise when LLMs design rewards for Restless Multi-Armed Bandits (RMABs). SCLM adds a transparent external adjudicator that applies a user-selected social welfare function (e.g., utilitarian or egalitarian) to evaluate candidate reward functions and choose the one best aligned with complex, potentially conflicting human preferences.
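For intuition, here is a toy sketch of the adjudication step, assuming each candidate reward function has already been scored against every stated objective; the scoring step, which SCLM also automates, is omitted.

import numpy as np

def utilitarian(objective_scores):
    return float(np.mean(objective_scores))   # maximize the average

def egalitarian(objective_scores):
    return float(np.min(objective_scores))    # maximize the worst case

def adjudicate(candidate_scores, welfare=egalitarian):
    # candidate_scores[i][k]: alignment of the i-th LLM-proposed reward
    # function with the k-th objective; return the welfare-maximizing index.
    return max(range(len(candidate_scores)),
               key=lambda i: welfare(candidate_scores[i]))

# Two candidates, three objectives: the egalitarian choice avoids the
# candidate that sacrifices one objective entirely.
print(adjudicate([[0.9, 0.9, 0.1], [0.6, 0.7, 0.6]]))  # -> 1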