Citation:
Jackson A Killian, Arpita Biswas, Sanket Shah, and Milind Tambe. 8/2021. “Q-Learning Lagrange Policies for Multi-Action Restless Bandits.” Proceedings of the 27th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining.