TY - GEN
T1 - Building Efficient Neural Prefetcher
AU - Liu, Yuchen
AU - Tziantzioulis, Georgios
AU - Wentzlaff, David
N1 - Publisher Copyright:
© 2023 ACM.
PY - 2023/10/2
Y1 - 2023/10/2
AB - Data prefetching is a promising approach to mitigating the computation slowdown caused by the memory wall. As modern workloads grow increasingly complex, their memory access patterns become less organized, and rule-based prefetchers can no longer deliver improved performance, which motivates research into adopting neural networks for prefetching. However, current neural prefetchers incur high computation costs and require large storage to achieve good performance, which makes them far from practical. To this end, we address the efficiency issue in neural prefetchers and propose an effective approach to building lightweight models. Specifically, our method is aware of both machine learning and micro-architecture: we introduce a novel neural prefetcher design space with knobs from both aspects. We optimize these knobs using observations of workload characteristics, rigorous mathematical optimization, and efficient design space traversal, which yields highly efficient neural prefetchers. Our approach is evaluated on SPEC CPU 2006, where our models provide up to 60% IPC gain over no prefetching, outperforming non-neural prefetchers. Compared with the state-of-the-art neural prefetcher, our models achieve an average 15.4× reduction in multiply-accumulate operations and a 6.7× reduction in parameters, with even better IPC gains. Although delivering an implementable neural prefetcher remains challenging, the order-of-magnitude reduction in computation and storage provided by our method marks an important milestone towards practical neural prefetchers.
UR - http://www.scopus.com/inward/record.url?scp=85190673606&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85190673606&partnerID=8YFLogxK
DO - 10.1145/3631882.3631903
M3 - Conference contribution
AN - SCOPUS:85190673606
T3 - ACM International Conference Proceeding Series
BT - MEMSYS 2023 - Proceedings of the International Symposium on Memory Systems
PB - Association for Computing Machinery
T2 - 9th International Symposium on Memory Systems, MEMSYS 2023
Y2 - 2 October 2023 through 5 October 2023
ER -