Abstract
Particle gradient descent, which uses particles to represent a probability measure and performs gradient descent on particles in parallel, is widely used to optimize functions of probability measures. This paper considers particle gradient descent with a finite number of particles and establishes its theoretical guarantees to optimize functions that are displacement convex in measures. Concretely, for Lipschitz displacement convex functions defined on probability over ℝd, we prove that O(1/ϵ2) particles and O(d/ϵ4) iterations are sufficient to find the ϵ-optimal solutions. We further provide improved complexity bounds for optimizing smooth displacement convex functions. An application of our results proves the conjecture of no optimization-barrier up to permutation invariance, proposed by Entezari et al. (2022), for specific two-layer neural networks with two-dimensional inputs uniformly drawn from unit circle.
Original language | English (US) |
---|---|
Pages (from-to) | 6836-6854 |
Number of pages | 19 |
Journal | Proceedings of Machine Learning Research |
Volume | 202 |
State | Published - 2023 |
Event | 40th International Conference on Machine Learning, ICML 2023 - Honolulu, United States Duration: Jul 23 2023 → Jul 29 2023 |
All Science Journal Classification (ASJC) codes
- Artificial Intelligence
- Software
- Control and Systems Engineering
- Statistics and Probability