Fast bit gather, bit scatter and bit permutation instructions for commodity microprocessors

Yedidya Hilewitz, Ruby B. Lee

Research output: Contribution to journalArticlepeer-review

25 Scopus citations


Advanced bit manipulation operations are not efficiently supported by commodity word-oriented microprocessors. Programming tricks are typically devised to shorten the long sequence of instructions needed to emulate these complicated bit operations. As these bit manipulation operations are relevant to applications that are becoming increasingly important, we propose direct support for them in microprocessors. In particular, we propose fast bit gather (or parallel extract), bit scatter (or parallel deposit) and bit permutation instructions (including group, butterfly and inverse butterfly). We show that all these instructions can be implemented efficiently using both the fast butterfly and inverse butterfly network datapaths. Specifically, we show that parallel deposit can be mapped onto a butterfly circuit and parallel extract can be mapped onto an inverse butterfly circuit. We define static, dynamic and loop invariant versions of the instructions, with static versions utilizing a much simpler functional unit. We show how a hardware decoder can be implemented for the dynamic and loop-invariant versions to generate, dynamically, the control signals for the butterfly and inverse butterfly datapaths. The simplest functional unit we propose is smaller and faster than an ALU. We also show that these instructions yield significant speedups over a basic RISC architecture for a variety of different application kernels taken from applications domains including bioinformatics, steganography, coding, compression and random number generation.

Original languageEnglish (US)
Pages (from-to)145-169
Number of pages25
JournalJournal of Signal Processing Systems
Issue number1-2 SPEC. ISS.
StatePublished - Nov 2008

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Theoretical Computer Science
  • Signal Processing
  • Information Systems
  • Modeling and Simulation
  • Hardware and Architecture


  • Algorithm acceleration
  • Bioinformatics
  • Bit gather
  • Bit manipulations
  • Bit scatter
  • Compression
  • Cryptology
  • ISA
  • Instruction set architecture
  • Microprocessors
  • Pack
  • Parallel deposit
  • Parallel extract
  • Pattern matching
  • Permutations
  • Steganography
  • Unpack


Dive into the research topics of 'Fast bit gather, bit scatter and bit permutation instructions for commodity microprocessors'. Together they form a unique fingerprint.

Cite this