TY - GEN
T1 - Arbitrary bit permutations in one or two cycles
AU - Shi, Zhijie
AU - Yang, Xiao
AU - Lee, Ruby B.
N1 - Publisher Copyright:
© 2003 IEEE.
PY - 2003
Y1 - 2003
N2 - Symmetric-key block ciphers encrypt data, providing data confidentiality over the public Internet. For interoperability reasons, it is desirable to support a variety of symmetric-key ciphers efficiently. We show the basic operations performed by a variety of symmetric-key cryptography algorithms. Of these basic operations, only bit permutation is very slow using existing processors, followed by integer multiplication. New instructions have been proposed recently to accelerate bit permutations in general-purpose processors, reducing the instructions needed to achieve an arbitrary n-bit permutation from O(n) to O(log(n)). However, the serial data-dependency between these log(n) permutation instructions prevents them from being executed in fewer than log(n) cycles, even on superscalar processors. Since application specific instruction processors (ASIPs) have fewer constraints on maintaining standard processor datapath and control conventions, can we achieve even faster permutations? We propose six alternative ASIP approaches to achieve arbitrary 64 bit permutations in one or two cycles, using new BFLY and IBFLY instructions. This reduction to one or two cycles is achieved without increasing the cycle time. We compare the latencies of different permutation units in a technology independent way to estimate cycle time impact. We also compare the alternative ASIP architectures and their efficiency in performing arbitrary 64 bit permutations.
AB - Symmetric-key block ciphers encrypt data, providing data confidentiality over the public Internet. For interoperability reasons, it is desirable to support a variety of symmetric-key ciphers efficiently. We show the basic operations performed by a variety of symmetric-key cryptography algorithms. Of these basic operations, only bit permutation is very slow using existing processors, followed by integer multiplication. New instructions have been proposed recently to accelerate bit permutations in general-purpose processors, reducing the instructions needed to achieve an arbitrary n-bit permutation from O(n) to O(log(n)). However, the serial data-dependency between these log(n) permutation instructions prevents them from being executed in fewer than log(n) cycles, even on superscalar processors. Since application specific instruction processors (ASIPs) have fewer constraints on maintaining standard processor datapath and control conventions, can we achieve even faster permutations? We propose six alternative ASIP approaches to achieve arbitrary 64 bit permutations in one or two cycles, using new BFLY and IBFLY instructions. This reduction to one or two cycles is achieved without increasing the cycle time. We compare the latencies of different permutation units in a technology independent way to estimate cycle time impact. We also compare the alternative ASIP architectures and their efficiency in performing arbitrary 64 bit permutations.
UR - http://www.scopus.com/inward/record.url?scp=3042592162&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=3042592162&partnerID=8YFLogxK
U2 - 10.1109/ASAP.2003.1212847
DO - 10.1109/ASAP.2003.1212847
M3 - Conference contribution
AN - SCOPUS:3042592162
T3 - Proceedings of the International Conference on Application-Specific Systems, Architectures and Processors
SP - 237
EP - 247
BT - Proceedings - IEEE International Conference on Application-Specific Systems, Architectures, and Processors, ASAP 2003
A2 - Deprettere, Ed
A2 - Bhattacharyya, Shuvra
A2 - Cavallaro, Joseph
A2 - Darte, Alain
A2 - Thiele, Lothar
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - IEEE International Conference on Application-Specific Systems, Architectures, and Processors, ASAP 2003
Y2 - 24 June 2003 through 26 June 2003
ER -