Memory performance optimizations for real-time software HDTV decoding

Han Chen, Kai Li, Bin Wei

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

Pure software HDTV video decoding remains a challenging task on entry-level to mid-range desktop and notebook PCs, even with today's microprocessor clock frequencies measured in GHz. This paper shows that the performance bottleneck in a software MPEG-2 decoder has shifted to memory operations, as microprocessor technologies, including multimedia instruction extensions, have improved rapidly in recent years. Our study exploits concurrency at the macroblock level to alleviate this bottleneck. First, the paper introduces an interleaved block-order data layout to improve CPU cache performance. Second, the paper describes an algorithm to explicitly prefetch macroblocks for motion compensation. Finally, the paper presents an algorithm to schedule interleaved decoding and output at the macroblock level. Our implementation and experiments show that these methods can effectively hide the latency of memory and frame buffer accesses. The optimizations improve the performance of a multimedia-instruction-optimized software MPEG-2 decoder by a factor of about two. On a PC with a 933 MHz Pentium III CPU, the decoder can decode and display 1280 × 720-resolution HDTV streams at over 62 frames per second.
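The abstract does not spell out the interleaved block-order layout, so the following is only a minimal sketch of the general idea behind block-ordered storage, not the authors' actual scheme. It assumes an 8 × 8 tile (the size of an MPEG-2 DCT block), one byte per sample, and a plane width that is a multiple of 8; the function names `raster_offset` and `block_order_offset` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed tile size: 8x8 samples, matching an MPEG-2 DCT block. */
enum { BLOCK_W = 8, BLOCK_H = 8 };

/* Offset of sample (x, y) in a conventional raster (row-major) plane. */
size_t raster_offset(size_t x, size_t y, size_t width)
{
    assert(x < width);
    return y * width + x;
}

/*
 * Offset of sample (x, y) in a block-ordered plane: blocks are stored
 * one after another in raster order, and the 64 samples inside each
 * block are stored row by row. width must be a multiple of BLOCK_W.
 */
size_t block_order_offset(size_t x, size_t y, size_t width)
{
    assert(width % BLOCK_W == 0 && x < width);
    size_t blocks_per_row = width / BLOCK_W;
    size_t block_index = (y / BLOCK_H) * blocks_per_row + (x / BLOCK_W);
    size_t within_block = (y % BLOCK_H) * BLOCK_W + (x % BLOCK_W);
    return block_index * (BLOCK_W * BLOCK_H) + within_block;
}
```

For a 1280-sample-wide plane, vertically adjacent samples sit 1280 bytes apart in raster order, so one 8 × 8 block is spread over eight widely separated rows. In the block-ordered sketch the same block occupies 64 contiguous bytes, which is the kind of locality a cache-oriented layout aims for.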

Original language: English (US)
Pages (from-to): 193-207
Number of pages: 15
Journal: Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology
Volume: 41
Issue number: 2
State: Published - Sep 2005

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Information Systems
  • Electrical and Electronic Engineering

Keywords

  • CPI
  • Cache
  • Concurrency
  • Decompression
  • Locality
  • MPEG-2
  • Motion compensation
  • Prefetching
