TY - GEN
T1 - Federated Linear Contextual Bandits
AU - Huang, Ruiquan
AU - Wu, Weiqiang
AU - Yang, Jing
AU - Shen, Cong
N1 - Funding Information:
The work of RH and JY was supported by the US National Science Foundation under Grants CNS-1956276, CNS-2003131, CNS-2114542, and ECCS-2030026. CS acknowledges the funding support by the US National Science Foundation under Grants ECCS-2029978, ECCS-2033671, and CNS-2002902. WW's work was done before he joined Facebook.
Publisher Copyright:
© 2021 Neural Information Processing Systems Foundation. All rights reserved.
PY - 2021
Y1 - 2021
AB - This paper presents a novel federated linear contextual bandits model, where individual clients face different K-armed stochastic bandits coupled through common global parameters. By leveraging the geometric structure of the linear rewards, a collaborative algorithm called Fed-PE is proposed to cope with the heterogeneity across clients without exchanging local feature vectors or raw data. Fed-PE relies on a novel multi-client G-optimal design, and achieves near-optimal regrets for both disjoint and shared parameter cases with logarithmic communication costs. In addition, a new concept called collinearly-dependent policies is introduced, based on which a tight minimax regret lower bound for the disjoint parameter case is derived. Experiments demonstrate the effectiveness of the proposed algorithms on both synthetic and real-world datasets.
UR - http://www.scopus.com/inward/record.url?scp=85131931441&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85131931441&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85131931441
T3 - Advances in Neural Information Processing Systems
SP - 27057
EP - 27068
BT - Advances in Neural Information Processing Systems 34 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021
A2 - Ranzato, Marc'Aurelio
A2 - Beygelzimer, Alina
A2 - Dauphin, Yann
A2 - Liang, Percy S.
A2 - Wortman Vaughan, Jenn
PB - Neural Information Processing Systems Foundation
T2 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021
Y2 - 6 December 2021 through 14 December 2021
ER -