Abstract
Advanced bit manipulation operations are not efficiently supported by commodity word-oriented microprocessors. Programming tricks are typically devised to shorten the long sequence of instructions needed to emulate these complicated bit operations. As these bit manipulation operations are relevant to applications that are becoming increasingly important, we propose direct support for them in microprocessors. In particular, we propose fast bit gather (or parallel extract), bit scatter (or parallel deposit) and bit permutation instructions (including group, butterfly and inverse butterfly). We show that all these instructions can be implemented efficiently using both the fast butterfly and inverse butterfly network datapaths. Specifically, we show that parallel deposit can be mapped onto a butterfly circuit and parallel extract can be mapped onto an inverse butterfly circuit. We define static, dynamic and loop invariant versions of the instructions, with static versions utilizing a much simpler functional unit. We show how a hardware decoder can be implemented for the dynamic and loop-invariant versions to generate, dynamically, the control signals for the butterfly and inverse butterfly datapaths. The simplest functional unit we propose is smaller and faster than an ALU. We also show that these instructions yield significant speedups over a basic RISC architecture for a variety of different application kernels taken from applications domains including bioinformatics, steganography, coding, compression and random number generation.
Original language | English (US) |
---|---|
Pages (from-to) | 145-169 |
Number of pages | 25 |
Journal | Journal of Signal Processing Systems |
Volume | 53 |
Issue number | 1-2 SPEC. ISS. |
DOIs | |
State | Published - Nov 2008 |
All Science Journal Classification (ASJC) codes
- Control and Systems Engineering
- Theoretical Computer Science
- Signal Processing
- Information Systems
- Modeling and Simulation
- Hardware and Architecture
Keywords
- Algorithm acceleration
- Bioinformatics
- Bit gather
- Bit manipulations
- Bit scatter
- Compression
- Cryptology
- ISA
- Instruction set architecture
- Microprocessors
- Pack
- Parallel deposit
- Parallel extract
- Pattern matching
- Permutations
- Steganography
- Unpack