TY - JOUR
T1 - Online Stochastic Optimization with Time-Varying Distributions
AU - Cao, Xuanyu
AU - Zhang, Junshan
AU - Poor, H. Vincent
N1 - Funding Information:
Manuscript received February 11, 2019; revised November 6, 2019; accepted May 15, 2020. Date of publication May 21, 2020; date of current version March 29, 2021. This work was supported in part by the U.S. Army Research Office under Grant W911NF-16-1-0448 and in part by the U.S. National Science Foundation under Grant CPS-1739344. Recommended by Associate Editor Uday V. Shanbhag. (Corresponding author: Xuanyu Cao.) Xuanyu Cao is with the Coordinated Science Lab, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA (e-mail: xyc@illinois.edu).
Publisher Copyright:
© 1963-2012 IEEE.
PY - 2021/4
Y1 - 2021/4
N2 - This article studies online stochastic optimization, where the random parameters follow time-varying distributions. In each time slot, after a control variable is determined, a sample drawn from the current distribution is revealed as feedback information. This form of stochastic optimization has broad applications in online learning and signal processing, where the underlying ground-truth is inherently time-varying, e.g., tracking a moving target. Dynamic optimal points are adopted as the performance benchmark to define the regret of algorithms, as opposed to the static optimal point used in stochastic optimization with fixed distributions. Unconstrained stochastic optimization with time-varying distributions is first examined and a projected stochastic gradient descent algorithm is presented. An upper bound on its regret is established with respect to the drift of the dynamic optima, which measures the temporal variations of the underlying distributions. In particular, the algorithm possesses sublinear regret as long as the drift of the optima is sublinear, i.e., the distributions do not vary too drastically. Further, a stochastic saddle point method involving only iterative closed-form computation is proposed for constrained stochastic optimization with time-varying distributions. Upper bounds on its regret and constraint violation are developed with respect to the drift of the optima. Analogously, sublinear regret and sublinear constraint violation can be ensured provided that the drift of the optima is sublinear. Finally, numerical results are presented to corroborate the efficacy of the proposed algorithms and the derived analytical results.
AB - This article studies online stochastic optimization, where the random parameters follow time-varying distributions. In each time slot, after a control variable is determined, a sample drawn from the current distribution is revealed as feedback information. This form of stochastic optimization has broad applications in online learning and signal processing, where the underlying ground-truth is inherently time-varying, e.g., tracking a moving target. Dynamic optimal points are adopted as the performance benchmark to define the regret of algorithms, as opposed to the static optimal point used in stochastic optimization with fixed distributions. Unconstrained stochastic optimization with time-varying distributions is first examined and a projected stochastic gradient descent algorithm is presented. An upper bound on its regret is established with respect to the drift of the dynamic optima, which measures the temporal variations of the underlying distributions. In particular, the algorithm possesses sublinear regret as long as the drift of the optima is sublinear, i.e., the distributions do not vary too drastically. Further, a stochastic saddle point method involving only iterative closed-form computation is proposed for constrained stochastic optimization with time-varying distributions. Upper bounds on its regret and constraint violation are developed with respect to the drift of the optima. Analogously, sublinear regret and sublinear constraint violation can be ensured provided that the drift of the optima is sublinear. Finally, numerical results are presented to corroborate the efficacy of the proposed algorithms and the derived analytical results.
KW - Dynamic benchmark
KW - online learning
KW - online optimization
KW - stochastic optimization
KW - time-varying distributions
UR - http://www.scopus.com/inward/record.url?scp=85103451565&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85103451565&partnerID=8YFLogxK
U2 - 10.1109/TAC.2020.2996178
DO - 10.1109/TAC.2020.2996178
M3 - Article
AN - SCOPUS:85103451565
SN - 0018-9286
VL - 66
SP - 1840
EP - 1847
JO - IEEE Transactions on Automatic Control
JF - IEEE Transactions on Automatic Control
IS - 4
M1 - 9097905
ER -
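
The abstract describes a projected stochastic gradient descent algorithm in which the sample from the current distribution is revealed only after the decision for that slot is committed, with performance measured by dynamic regret against per-slot optima. The following is a minimal sketch of that setup, assuming a quadratic loss, a random-walk drift of the distribution mean, a Euclidean-ball feasible set, and a constant step size; these modeling choices are illustrative assumptions, not taken from the paper.

```python
# A minimal, illustrative sketch (not the authors' code) of projected stochastic
# gradient descent under a time-varying distribution. The quadratic loss,
# random-walk drift model, feasible ball, and step size are all assumptions.
import numpy as np

rng = np.random.default_rng(0)

T = 1000        # number of time slots
eta = 0.05      # step size (assumed constant)
radius = 5.0    # feasible set: Euclidean ball of this radius
dim = 2


def project(x, r=radius):
    """Euclidean projection onto the ball of radius r."""
    norm = np.linalg.norm(x)
    return x if norm <= r else (r / norm) * x


x = np.zeros(dim)        # decision variable x_t
mean_t = np.zeros(dim)   # time-varying mean of the sample distribution
dyn_regret = 0.0

for t in range(T):
    # The underlying distribution drifts slowly, so the drift of the
    # per-slot optima is small (the sublinear-drift regime of the paper).
    mean_t += 0.01 * rng.standard_normal(dim)

    # A sample from the current distribution is revealed only AFTER x is fixed.
    sample = mean_t + 0.1 * rng.standard_normal(dim)

    # Instantaneous loss f_t(x) = ||x - a_t||^2 and its stochastic gradient.
    loss = float(np.sum((x - sample) ** 2))
    grad = 2.0 * (x - sample)

    # Dynamic regret: compare against the per-slot optimum (here, the current
    # mean projected onto the feasible set), evaluated on the same sample.
    x_opt = project(mean_t)
    dyn_regret += loss - float(np.sum((x_opt - sample) ** 2))

    # Projected stochastic gradient step.
    x = project(x - eta * grad)

print(f"average dynamic regret over {T} slots: {dyn_regret / T:.4f}")
```

For the constrained setting mentioned in the abstract, the projection step would instead be handled by a stochastic saddle point (primal-dual) update that also maintains dual variables for the functional constraints; that variant is not sketched here.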