Subword sorting with versatile permutation instructions

Zhijie Shi, Ruby B. Lee

Research output: Contribution to conference › Paper › peer-review

14 Scopus citations


Subword parallelism has succeeded in accelerating many multimedia applications. Subword permutation instructions have been proposed to efficiently rearrange subwords in or among registers. Bit-level permutation instructions have also been proposed recently for their importance in cryptography. However, some important algorithms, especially ones with lots of conditional control dependencies such as sorting, have not exploited the advantage of subword parallel instructions. In this paper, we show how one of the bit permutation instructions, GRP, can be used for fast sorting. In the process, we demonstrate the versatility of this permutation instruction for uses other than bit permutations. This versatility is important in considering the addition of a new instruction to a general-purpose processor. The results show that our sorting methods have a significant speedup even when compared with the fastest sorting algorithms. We also discuss the hardware implementation of the GRP instruction and compare its latency to a typical processor's cycle time.
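The abstract does not define GRP or spell out the sorting method, so the following is only an illustrative sketch under stated assumptions. It assumes GRP splits the data into two groups according to a control word: elements whose control bit is 0 move to the left, elements whose control bit is 1 move to the right, and relative order is preserved within each group. Under that assumption, one GRP-style step is a stable partition, and repeating it on each key bit from least to most significant gives a radix sort with no data-dependent branches. The names `grp` and `sort_subwords` are hypothetical, and this software emulation is not claimed to be the authors' algorithm.

```python
def grp(data, control):
    """Emulate an assumed GRP semantics on two equal-length lists.

    Elements whose control bit is 0 form the left group and elements
    whose control bit is 1 form the right group; each group keeps the
    original relative order (a stable partition).
    """
    if len(data) != len(control):
        raise ValueError("data and control must have the same length")
    left = [d for d, c in zip(data, control) if c == 0]
    right = [d for d, c in zip(data, control) if c == 1]
    return left + right


def sort_subwords(subwords, key_bits):
    """Sort non-negative integers of width key_bits in ascending order.

    Each pass takes one key bit (least significant first) as the
    control word and applies the stable GRP-style partition, which is
    an LSD radix sort.
    """
    for bit in range(key_bits):
        control = [(s >> bit) & 1 for s in subwords]
        subwords = grp(subwords, control)
    return subwords


if __name__ == "__main__":
    # Bit-level use: rearrange the bits of one 8-bit word.
    print(grp([1, 0, 1, 1, 0, 0, 1, 0], [0, 1, 1, 0, 1, 0, 0, 1]))
    # Subword-level use: sort eight 3-bit subwords.
    print(sort_subwords([5, 3, 7, 0, 2, 6, 1, 4], key_bits=3))
```

The same `grp` helper serves both the bit-level and the subword-level case here, which loosely mirrors the versatility argument in the abstract. A hardware instruction would operate on a whole register at once instead of looping over Python lists.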

Original language: English (US)
Number of pages: 8
State: Published - 2002
Event: International Conference on Computer Design (ICCD'02) VLSI in Computers and Processors - Freiburg, Germany
Duration: Sep 16 2002 → Sep 18 2002



All Science Journal Classification (ASJC) codes

  • Hardware and Architecture
  • Electrical and Electronic Engineering

