Teamcore research published in 2019
Kai Wang, Bryan Wilder, Sze-Chuan Suen, Bistra Dilkina, and Milind Tambe. 2019. “Improving GP-UCB Algorithm by Harnessing Decomposed Feedback.” In The 4th Workshop on Data Science for Social Good at ECML-PKDD, 2019.
Abstract
Gaussian processes (GPs) have been widely applied to machine learning and nonparametric approximation. Given existing observations, a GP allows the decision maker to update a posterior belief over the unknown underlying function. Observations from a complex system usually come with noise and with decomposed feedback from intermediate layers. For example, the decomposed feedback could be the components that constitute the final objective value, or the various readings obtained from sensors. Previous literature has shown that GPs can successfully deal with noise, but has neglected decomposed feedback. We therefore propose a decomposed GP regression algorithm that incorporates this feedback, leading to lower average root-mean-squared error with respect to the target function, especially when samples are scarce. We also introduce a decomposed GP-UCB algorithm to solve the resulting bandit problem with decomposed feedback. We prove that our algorithm converges to the optimal solution and preserves the no-regret property. To demonstrate the wide applicability of this work, we run our algorithm on two disparate social problems: infectious disease control and weather monitoring. The numerical results show that our method provides significant improvement over previous methods that do not use this feedback, showcasing the advantage of considering decomposed feedback.
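
The idea of regressing on decomposed feedback and then applying an upper-confidence-bound rule can be illustrated with a small sketch. This is not the authors' implementation; it is one plausible reading under assumptions the abstract does not state: the target is a plain sum of components, each component is an independent GP with the same unit-variance RBF kernel, and every component is observed at the same inputs. The function names (`gp_posterior`, `decomposed_posterior`, `ucb_choice`) and the values of `beta`, `noise_var`, and `length_scale` are hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    # Squared-exponential kernel with unit prior variance (an assumption).
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise_var=1e-2, length_scale=1.0):
    # Standard GP regression: posterior mean and variance at x_test.
    K = rbf_kernel(x_train, x_train, length_scale) + noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test, length_scale)
    mean = K_s.T @ np.linalg.solve(K, y_train)
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.maximum(var, 0.0)  # clip tiny negative values from round-off

def decomposed_posterior(x_train, component_obs, x_test, **kw):
    # Fit one GP per observed component, then combine them. For a sum of
    # independent GPs, posterior means add and posterior variances add.
    means, variances = zip(*(gp_posterior(x_train, y_j, x_test, **kw)
                             for y_j in component_obs))
    return np.sum(means, axis=0), np.sum(variances, axis=0)

def ucb_choice(mean, var, x_test, beta=4.0):
    # GP-UCB style rule: pick the point maximizing mean + sqrt(beta) * std.
    return x_test[np.argmax(mean + np.sqrt(beta * var))]

# Usage: two noisy components observed at the same five inputs.
rng = np.random.default_rng(0)
x_obs = np.linspace(0.0, 1.0, 5)
components = [np.sin(3 * x_obs) + 0.1 * rng.standard_normal(5),
              np.cos(2 * x_obs) + 0.1 * rng.standard_normal(5)]
x_grid = np.linspace(0.0, 1.0, 101)
mean, var = decomposed_posterior(x_obs, components, x_grid, length_scale=0.3)
next_x = ucb_choice(mean, var, x_grid)
```

A non-decomposed baseline would instead call `gp_posterior` once on the summed observations. How the two compare depends on the kernels and noise model, which this sketch simplifies.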