Abstract
Even sophisticated branch-prediction techniques necessarily suffer some mispredictions, and even relatively small mispredict rates hurt performance substantially in current-generation processors. In this paper, we investigate schemes for improving performance in the face of imperfect branch predictors by having the processor simultaneously execute code from both the taken and not-taken outcomes of a branch. This paper presents data regarding the limits of multipath execution, considers fetch-bandwidth needs for multipath execution, and discusses various dynamic confidence-prediction schemes that gauge the likelihood of branch mispredictions. Our evaluations consider executing along several (2-8) paths at once. Using 4 paths and a relatively simple confidence predictor, multipath execution garners speedups of up to 30% compared to the single-path case, with an average speedup of 14.4% for the SPECint suite. While associated increases in instruction-fetch-bandwidth requirements are not too surprising, a less expected result is the significance of having a separate return-address stack for each forked path. Overall, our results indicate that multipath execution offers significant improvements over single-path performance, and could be especially useful when combined with multithreading so that hardware costs can be amortized over both approaches.
Original language | English (US) |
---|---|
Pages | 101-108 |
Number of pages | 8 |
State | Published - 1998 |
Event | Proceedings of the 1998 International Conference on Supercomputing - Melbourne, Aust; Duration: Jul 13 1998 → Jul 17 1998 |
Other
Other | Proceedings of the 1998 International Conference on Supercomputing |
---|---|
City | Melbourne, Aust |
Period | 7/13/98 → 7/17/98 |
All Science Journal Classification (ASJC) codes
- General Computer Science