TY - GEN
T1 - Chip Architectures Under Advanced Computing Sanctions
AU - Ning, August
AU - Wentzlaff, David
N1 - Publisher Copyright:
© 2025 Copyright held by the owner/author(s).
PY - 2025/6/21
Y1 - 2025/6/21
N2 - The rise of large-scale machine learning models has generated unprecedented requirements for, and demand on, computing hardware to enable these trillion-parameter models. However, the importance of these bleeding-edge chips to the global economy, technological advancement, and strategic national interests has made them targets of sanctions. Recent advanced computing sanctions set limits on a device's Total Processing Performance, device bandwidth, and Performance Density, and placed export controls on flagship data center and consumer products. In this work, we present the first study on the architectural and economic externality implications of these advanced computing sanctions and their effects on large language model (LLM) inference. We identify which architectural parameters are limited under existing regulations and perform a thorough design space exploration of compliant designs. Optimized designs improve LLM inference prefill performance by 4% and decoding performance by 27% compared to a restricted device baseline. We then demonstrate how an architecture-first approach to computing policies allows chip designers and policymakers to craft efficient guidelines that achieve desired goals while minimizing negative externalities. We show how architectural features can unify marketing-based data center vs. non-data center regulations and how policies can be specified to create gaming-focused device architectures that are inherently limited in AI performance. Augmenting existing performance metrics with insightful architectural constraints better predicts workload performance. Combined metrics achieved up to 42.4x narrower distributions compared to using theoretical compute performance alone, enabling targeted and efficient policies.
AB - The rise of large-scale machine learning models has generated unprecedented requirements for, and demand on, computing hardware to enable these trillion-parameter models. However, the importance of these bleeding-edge chips to the global economy, technological advancement, and strategic national interests has made them targets of sanctions. Recent advanced computing sanctions set limits on a device's Total Processing Performance, device bandwidth, and Performance Density, and placed export controls on flagship data center and consumer products. In this work, we present the first study on the architectural and economic externality implications of these advanced computing sanctions and their effects on large language model (LLM) inference. We identify which architectural parameters are limited under existing regulations and perform a thorough design space exploration of compliant designs. Optimized designs improve LLM inference prefill performance by 4% and decoding performance by 27% compared to a restricted device baseline. We then demonstrate how an architecture-first approach to computing policies allows chip designers and policymakers to craft efficient guidelines that achieve desired goals while minimizing negative externalities. We show how architectural features can unify marketing-based data center vs. non-data center regulations and how policies can be specified to create gaming-focused device architectures that are inherently limited in AI performance. Augmenting existing performance metrics with insightful architectural constraints better predicts workload performance. Combined metrics achieved up to 42.4x narrower distributions compared to using theoretical compute performance alone, enabling targeted and efficient policies.
KW - Advanced Computing Sanctions
KW - Artificial Intelligence
UR - https://www.scopus.com/pages/publications/105009600411
UR - https://www.scopus.com/inward/citedby.url?scp=105009600411&partnerID=8YFLogxK
U2 - 10.1145/3695053.3731012
DO - 10.1145/3695053.3731012
M3 - Conference contribution
AN - SCOPUS:105009600411
T3 - Proceedings - International Symposium on Computer Architecture
SP - 1225
EP - 1239
BT - ISCA 2025 - Proceedings of the 52nd Annual International Symposium on Computer Architecture
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 52nd Annual International Symposium on Computer Architecture, ISCA 2025
Y2 - 21 June 2025 through 25 June 2025
ER -