Abstract
In this paper, we have developed a new algorithm for solving nonconvex large-scale problems. The new algorithm performs explicit matrix modifications adaptively, mimicing the implicit modifications used by trust-region methods. Thus, it shares the equivalent theoretical strength of trust-region approaches, without needing to accommodate an explicit step-size constraint. We show that the algorithm is well suited for solving very large-scale nonconvex problems whenever Hessian-vector products are available. The numerical results on the CUTEr problems demonstrate the effectiveness of this approach in the context of a line-search method for large-scale unconstrained nonconvex optimization. Moreover, applications in deep-learning problems further illustrate the usefulness of this algorithm. It does not share any of the prohibitive traits of popular matrix-free algorithms such as truncated conjugate gradient (CG) due to the difficult nature of deep-learning problems. Thus the proposed algorithm serves to bridge the gap between the needs of data-mining community and existing state-of-the-art approaches embraced foremost by the optimization community. Moreover, the proposed approach can be realized with minimal modification to the CG algorithm itself with negligible storage and computational overhead.
| Original language | English (US) |
|---|---|
| Pages (from-to) | 1-24 |
| Number of pages | 24 |
| Journal | Optimization Methods and Software |
| Volume | 34 |
| Issue number | 1 |
| DOIs | |
| State | Published - Jan 2 2019 |
| Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Software
- Control and Optimization
- Applied Mathematics
Keywords
- conjugate gradient method
- Hessian-free methods
- machine learning
- nonconvex large-scale problems
- nonlinear programming
- trust-region methods