TY - GEN
T1 - The Implications of Page Size Management on Graph Analytics
AU - Manocha, Aninda
AU - Yan, Zi
AU - Tureci, Esin
AU - Aragon, Juan Luis
AU - Nellans, David
AU - Martonosi, Margaret
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Graph representations of data are ubiquitous in analytic applications. However, graph workloads are notorious for having irregular memory access patterns with variable access frequency per address, which cause high translation lookaside buffer (TLB) miss rates and significant address translation overheads during workload execution. Furthermore, these access patterns sparsely span a large address space, yielding memory footprints greater than total TLB coverage by orders of magnitude. It is widely recognized that employing huge pages can alleviate some of these bottlenecks. However, in real systems, huge pages are not always available and the OS often provisions huge pages suboptimally, significantly reducing peak application performance. State-of-the-art huge page management techniques employ heuristics, such as huge page region utilization, to guide page size decisions. However, these heuristics are often only optimal for specific memory access patterns or footprint sizes, and do not sufficiently adapt to dynamic workload characteristics. This work performs a comprehensive characterization of the effects of page size allocation policy and page placement on graph application throughput. We show that when system memory is nearly full or fragmented (the common case in real systems), huge page resources available to an application are limited and their utility must be maximized. We demonstrate that (1) awareness of single-use memory can eliminate the use of precious huge page resources for data that receives little benefit and (2) coupling degree-aware preprocessing of graph data with programmer-guided use of huge pages boosts performance by 1.26-1.57× over using 4KB pages alone, while achieving 77.3-96.3% of the performance of unbounded huge page usage and requiring only 0.58-2.92% of the memory resources. This manual, domain-specific optimization of huge page efficiency in memory-constrained systems demonstrates that huge pages are a new class of resource that must be intelligently managed by programmers or next-generation OS policies to optimize application performance.
UR - http://www.scopus.com/inward/record.url?scp=85145646924&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85145646924&partnerID=8YFLogxK
U2 - 10.1109/IISWC55918.2022.00026
DO - 10.1109/IISWC55918.2022.00026
M3 - Conference contribution
AN - SCOPUS:85145646924
T3 - Proceedings - 2022 IEEE International Symposium on Workload Characterization, IISWC 2022
SP - 199
EP - 214
BT - Proceedings - 2022 IEEE International Symposium on Workload Characterization, IISWC 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 IEEE International Symposium on Workload Characterization, IISWC 2022
Y2 - 6 November 2022 through 8 November 2022
ER -