Abstract
Computing radiosity is a very expensive problem in computer graphics. Recent hierarchical methods have greatly sped up the computation, first of diffuse and now also of specular radiosity. We present a parallel algorithm that computes diffuse and specular radiosity together, and discuss the techniques we used to improve its performance. The algorithm is both irregular and highly unpredictable. Despite this, we obtained speedups of 26.3 on a 32-processor machine with distributed memory and 14.2 on a 16-processor machine with centralized memory. We achieved this by carefully designing the parallel algorithm to minimize synchronization and memory access overhead, and by identifying and correcting several synchronization bottlenecks that we did not anticipate. We demonstrate how execution profiles obtained at runtime, for example the time spent waiting at different locks, can be used to significantly improve the performance of complex, irregular parallel applications.
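The kind of runtime profile the abstract mentions, time spent waiting at different locks, can be illustrated with a small sketch. This is not the authors' instrumentation: `ProfiledLock`, `run_demo`, `rank_by_wait`, and the lock names `task_queue` and `patch_data` are hypothetical, and the example assumes ordinary Python threads rather than the machines described in the paper.

```python
import threading
import time


class ProfiledLock:
    """Hypothetical wrapper that records how long callers wait to acquire a lock."""

    def __init__(self, name, stats):
        self._lock = threading.Lock()
        self._name = name
        self._stats = stats
        # Entry layout: [total_wait_seconds, number_of_acquisitions]
        stats.setdefault(name, [0.0, 0])

    def __enter__(self):
        start = time.perf_counter()
        self._lock.acquire()
        waited = time.perf_counter() - start
        # The lock is held here, so updating this lock's own entry is safe.
        entry = self._stats[self._name]
        entry[0] += waited
        entry[1] += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        self._lock.release()
        return False


def run_demo(num_threads=4, iterations=50):
    """Run workers that contend on two locks and return the collected profile."""
    stats = {}
    queue_lock = ProfiledLock("task_queue", stats)   # assumed name, for illustration
    patch_lock = ProfiledLock("patch_data", stats)   # assumed name, for illustration
    counter = {"tasks": 0, "patches": 0}

    def worker():
        for _ in range(iterations):
            with queue_lock:
                counter["tasks"] += 1
                time.sleep(0.0001)  # hold the lock briefly to create contention
            with patch_lock:
                counter["patches"] += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stats, counter


def rank_by_wait(stats):
    """Return lock names ordered from most to least total wait time."""
    return [name for name, _ in sorted(stats.items(), key=lambda kv: kv[1][0], reverse=True)]


if __name__ == "__main__":
    stats, _ = run_demo()
    for name in rank_by_wait(stats):
        total_wait, acquisitions = stats[name]
        print(f"{name}: waited {total_wait:.4f}s over {acquisitions} acquisitions")
```

Ranking locks by accumulated wait time points at the synchronization points most worth restructuring, which is the general idea behind using such profiles to find unanticipated bottlenecks.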
Original language | English (US) |
---|---|
Pages | 59-69 |
Number of pages | 11 |
State | Published - 1997 |
Externally published | Yes |
Event | Proceedings of the 1997 IEEE Symposium on Parallel Rendering, PRS - Phoenix, AZ, USA; Duration: Oct 20 1997 → Oct 21 1997 |
All Science Journal Classification (ASJC) codes
- General Computer Science
- General Engineering