TY - GEN
T1 - Bit matrix multiplication in commodity processors
AU - Hilewitz, Yedidya
AU - Lauradoux, Cédric
AU - Lee, Ruby B.
PY - 2008
Y1 - 2008
N2 - Registers in processors generally contain words or, with the addition of multimedia extensions, short vectors of subwords of bytes or 16-bit elements. In this paper, we view the contents of registers as vectors or matrices of individual bits. However, the facility to operate efficiently at the bit level is generally lacking. A commodity processor usually has only logical and shift instructions and, occasionally, population count instructions. Perhaps the most powerful primitive bit-level operation is the bit matrix multiply (BMM) instruction, currently found only in supercomputers such as those from Cray. This instruction multiplies two n × n bit matrices. In this paper, we show the power of BMM. We propose and analyze new processor instructions that implement simpler BMM primitive operations more suitable for a commodity processor. We show the impact of BMM on the performance of critical application kernels and discuss its hardware cost.
AB - Registers in processors generally contain words or, with the addition of multimedia extensions, short vectors of subwords of bytes or 16-bit elements. In this paper, we view the contents of registers as vectors or matrices of individual bits. However, the facility to operate efficiently at the bit level is generally lacking. A commodity processor usually has only logical and shift instructions and, occasionally, population count instructions. Perhaps the most powerful primitive bit-level operation is the bit matrix multiply (BMM) instruction, currently found only in supercomputers such as those from Cray. This instruction multiplies two n × n bit matrices. In this paper, we show the power of BMM. We propose and analyze new processor instructions that implement simpler BMM primitive operations more suitable for a commodity processor. We show the impact of BMM on the performance of critical application kernels and discuss its hardware cost.
UR - http://www.scopus.com/inward/record.url?scp=51649098779&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=51649098779&partnerID=8YFLogxK
U2 - 10.1109/ASAP.2008.4580146
DO - 10.1109/ASAP.2008.4580146
M3 - Conference contribution
AN - SCOPUS:51649098779
SN - 9781424418985
T3 - Proceedings of the International Conference on Application-Specific Systems, Architectures and Processors
SP - 7
EP - 12
BT - ASAP08, Conference Proceedings - IEEE 19th International Conference on Application-Specific Systems, Architectures and Processors
T2 - ASAP08 - IEEE 19th International Conference on Application-Specific Systems, Architectures and Processors
Y2 - 2 July 2008 through 4 July 2008
ER -