Abstract
Despite being a key bottleneck in many machine learning tasks, the cost of solving large linear systems has proven challenging to quantify due to problem-dependent quantities such as condition numbers. To tackle this, we consider a fine-grained notion of complexity for solving linear systems, motivated by applications where the data exhibits low-dimensional structure, including spiked covariance models and kernel machines, and where the linear system is explicitly regularized, as in ridge regression. Concretely, let κℓ be the ratio between the ℓth largest and the smallest singular value of an n×n matrix A. We give a stochastic algorithm, based on the Sketch-and-Project paradigm, that solves the linear system Ax = b in time (Formula presented) for any (Formula presented). This is a direct improvement over preconditioned conjugate gradient, and it provides a stronger separation between stochastic linear solvers and algorithms that access A only through matrix-vector products. Our main technical contribution is a new analysis of the first and second moments of the random projection matrix that arises in Sketch-and-Project.
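To make the Sketch-and-Project paradigm concrete, the following is a minimal NumPy sketch of one standard instantiation (randomized block Kaczmarz with uniform row-block sketches): at each step, a random subset S of rows is sampled and the iterate is orthogonally projected onto the solution set of the subsystem A[S] x = b[S]. This is an illustrative baseline only, not the tuned algorithm or sketch distribution analyzed in the paper; the function name and parameters are assumptions.

```python
import numpy as np

def sketch_and_project(A, b, num_iters=1500, sketch_size=10, seed=0):
    """Solve A x = b via Sketch-and-Project with row-subset sketches.

    Each iteration projects the current iterate x onto the affine set
    {z : A[S] z = b[S]} for a uniformly random row block S.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    x = np.zeros(n)
    for _ in range(num_iters):
        # Sample a random block of rows (the "sketch" S A = A[S]).
        S = rng.choice(n, size=sketch_size, replace=False)
        AS, bS = A[S], b[S]
        # Least-norm correction: x <- x - A_S^T (A_S A_S^T)^+ (A_S x - b_S).
        x -= AS.T @ (np.linalg.pinv(AS @ AS.T) @ (AS @ x - bS))
    return x
```

Each step costs only O(n·k + k³) for sketch size k, rather than the O(n²) of a full matrix-vector product, which is the structural property the fine-grained analysis in the abstract exploits.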
| Original language | English (US) |
|---|---|
| Journal | Journal of Machine Learning Research |
| Volume | 26 |
| State | Published - 2025 |
All Science Journal Classification (ASJC) codes
- Software
- Control and Systems Engineering
- Statistics and Probability
- Artificial Intelligence
Keywords
- Linear systems
- matrix sketching
- random matrix theory
- sketch-and-project
- stochastic optimization