Abstract
In this paper, we propose to solve a regularized distributionally robust learning problem in the decentralized setting, taking into account the data distribution shift. By adding a Kullback-Leibler regularization term to the robust min-max optimization problem, the learning problem can be reduced to a modified robust minimization problem and solved efficiently. Leveraging the newly formulated optimization problem, we propose a robust version of Decentralized Stochastic Gradient Descent (DSGD), coined Distributionally Robust Decentralized Stochastic Gradient Descent (DR-DSGD). Under some mild assumptions, and provided that the regularization parameter is larger than one, we theoretically prove that DR-DSGD achieves a convergence rate of O(1/√(KT) + K/T), where K is the number of devices and T is the number of iterations. Simulation results show that our proposed algorithm can improve the worst distribution test accuracy by up to 10%. Moreover, DR-DSGD is more communication-efficient than DSGD, since it requires up to 20 times fewer communication rounds to achieve the same worst distribution test accuracy target. Furthermore, the conducted experiments reveal that DR-DSGD results in fairer performance across devices in terms of test accuracy.
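The abstract says the KL-regularized min-max problem reduces to a modified robust minimization. A standard way such a reduction works is that the worst-case loss becomes a log-sum-exp (softmax-weighted) aggregation of per-sample losses, with the regularization parameter controlling the weighting. Below is a minimal NumPy sketch of a DR-DSGD-style loop under that assumption; the ring mixing matrix, the linear-regression task, and all names and hyperparameters are illustrative choices, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

K, d, n, T = 4, 5, 32, 600    # devices, model dim, samples/device, iterations
lam = 2.0                     # KL regularization parameter (larger than one)
lr = 0.01                     # step size

# Synthetic per-device linear-regression data; each device's feature mean is
# shifted to mimic distribution shift across devices.
X = [rng.normal(loc=0.3 * k, size=(n, d)) for k in range(K)]
w_true = rng.normal(size=d)
y = [Xk @ w_true + 0.1 * rng.normal(size=n) for Xk in X]

# Doubly stochastic mixing matrix for a ring topology (gossip averaging).
W = np.zeros((K, K))
for k in range(K):
    W[k, k] = 0.5
    W[k, (k - 1) % K] = 0.25
    W[k, (k + 1) % K] = 0.25

def robust_grad(w, Xk, yk, lam):
    """Gradient of the assumed KL-regularized robust loss
    lam * log(mean(exp(loss_i / lam))): a softmax-weighted average of
    per-sample squared-error gradients, emphasizing high-loss samples."""
    r = Xk @ w - yk
    losses = 0.5 * r ** 2
    z = losses / lam
    p = np.exp(z - z.max())
    p /= p.sum()                      # softmax weights over samples
    return Xk.T @ (p * r)

w = np.zeros((K, d))                  # one model copy per device
for _ in range(T):
    # Full-batch local gradients for simplicity (the real algorithm is stochastic).
    grads = np.stack([robust_grad(w[k], X[k], y[k], lam) for k in range(K)])
    w = W @ (w - lr * grads)          # local step, then neighbor averaging

# How far the device copies are from consensus after training.
consensus = np.linalg.norm(w - w.mean(axis=0))
```

The gossip step `W @ (...)` is what makes the method decentralized: each device only averages with its ring neighbors, yet repeated mixing drives all copies toward a common model while the softmax weighting keeps the objective focused on the hardest samples.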
| Original language | English |
|---|---|
| Journal | Transactions on Machine Learning Research |
| Volume | 2022-August |
| State | Published - 2022 |
| Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2022, Transactions on Machine Learning Research. All rights reserved.
ASJC Scopus subject areas
- Computer Vision and Pattern Recognition
- Artificial Intelligence