Abstract
Any successful solution to using multicore processors to scale general-purpose program performance will have to contend with rising intercore communication costs while exposing coarse-grained parallelism. Recently proposed pipelined multithreading (PMT) techniques have been demonstrated to have general-purpose applicability and are also able to effectively tolerate inter-core latencies through pipelined interthread communication. These desirable properties make PMT techniques strong candidates for program parallelization on current and future multicore processors and understanding their performance characteristics is critical to their deployment. To that end, this paper evaluates the performance scalability of a general-purpose PMT technique called decoupled software pipelining (DSWP) and presents a thorough analysis of the communication bottlenecks that must be overcome for optimal DSWP scalability.
Original language | English (US) |
---|---|
Article number | 8 |
Journal | Transactions on Architecture and Code Optimization |
Volume | 5 |
Issue number | 2 |
DOIs | |
State | Published - Aug 1 2008 |
All Science Journal Classification (ASJC) codes
- Software
- Information Systems
- Hardware and Architecture
Keywords
- Decoupled software pipelining
- Performance analysis