Abstract
Even the simplest hardware, running the simplest programs, can behave in the strangest of ways. Tracking down the cause of a performance anomaly without the complete hardware reference of a processor is a prime example of black-box architectural exploration. The mystery began when doubling the work of a simple benchmark program, run on a single core of Tilera's TILEPro64 processor, did not double the number of consumed cycles. After ruling out different levels of optimization for the two programs, a cycle-accurate simulation attributed the sub-optimal performance to an abnormally high number of L1 data cache misses. Further investigation showed that the processor stalled on every Read-After-Write (RAW) instruction sequence when the following two conditions were met: 1) at most one instruction separates the write from the read and 2) the read and the write instructions target distinct memory locations that share an L1 cache line. We call this performance pitfall a RAW hiccup. We describe two countermeasures that sidestep the RAW hiccup: memory padding and the explicit introduction of pipeline bubbles.
This experience paper serves as a useful troubleshooting guide for uncovering anomalous performance issues when the hardware design under study is unavailable.
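As an illustration of the memory-padding countermeasure named in the abstract, the following minimal C sketch contrasts two adjacent fields that share a cache line with a padded layout that separates them. It is not the authors' code. `CACHE_LINE_BYTES`, the struct names, and both helper functions are hypothetical, and the line size of 64 bytes is a placeholder assumption that should be replaced with the actual L1 data cache line size of the target processor.

```c
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: placeholder line size in bytes, not a value taken from the paper. */
#define CACHE_LINE_BYTES 64

/* Unpadded: a and b are adjacent, so they can land in the same cache line. */
struct unpadded {
    int32_t a;
    int32_t b;
};

/* Padded: the filler places b exactly one line size after a,
 * so the two fields can never share a line. */
struct padded {
    int32_t a;
    char pad[CACHE_LINE_BYTES - sizeof(int32_t)];
    int32_t b;
};

/* Returns 1 if both addresses fall into the same (assumed) cache line. */
static int same_line(const void *p, const void *q) {
    return ((uintptr_t)p / CACHE_LINE_BYTES) == ((uintptr_t)q / CACHE_LINE_BYTES);
}

/* The access pattern from the abstract: a write followed immediately by a
 * read of a distinct location. volatile keeps the compiler from removing
 * or reordering the two accesses. */
static int32_t write_then_read(volatile int32_t *w, volatile int32_t *r, int32_t v) {
    *w = v;     /* write */
    return *r;  /* read of a different location right after the write */
}
```

Calling `write_then_read(&s.a, &s.b, v)` on a `struct unpadded` exercises the pattern the abstract describes, while the same call on a `struct padded` touches two different lines. The second countermeasure, pipeline bubbles, would instead place unrelated instructions between the write and the read. That is hard to guarantee from C alone, since the compiler controls instruction scheduling.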
| Original language | English |
|---|---|
| Title of host publication | International Conference on Performance Engineering (ICPE 2013), Industry and Experience Track |
| Place of Publication | New York, NY, USA |
| Publisher | ACM |
| Pages | 63-70 |
| Number of pages | 8 |
| ISBN (Print) | 978-1-4503-1636-1 |
| Publication status | Published - 2013 |
| Event | Proceedings of the 4th ACM/SPEC International Conference on Performance Engineering (ICPE'13) - Prague, Czech Republic; Duration: 21 Apr 2013 → 24 Apr 2013 |
Conference
| Conference | Proceedings of the 4th ACM/SPEC International Conference on Performance Engineering (ICPE'13) |
|---|---|
| Country/Territory | Czech Republic |
| City | Prague |
| Period | 21/04/13 → 24/04/13 |
Keywords
- L1 data cache
- TILEPro64
- memory padding
- pipeline bubble