When Spatial and Temporal Locality Collide: The Case of the Missing Cache Hits

Mattias De Wael, Tom Van Cutsem, David Ungar

    Research output: Chapter in Book/Report/Conference proceeding > Conference paper

    1 Citation (Scopus)

    Abstract

    Even the simplest hardware, running the simplest programs, can behave in the strangest of ways. Tracking down the cause of a performance anomaly without the complete hardware reference of a processor is a prime example of black-box architectural exploration. The mystery began when doubling the work of a simple benchmark program, run on a single core of Tilera's TILEPro64 processor, did not double the number of consumed cycles. After we ruled out different levels of optimization for the two programs, a cycle-accurate simulation attributed the sub-optimal performance to an abnormally high number of L1 data cache misses. Further investigation showed that the processor stalled on every Read-After-Write (RAW) instruction sequence that met two conditions: 1) at most one instruction separates the write from the read, and 2) the read and the write target distinct memory locations that share an L1 cache line. We call this performance pitfall a RAW hiccup. We describe two countermeasures that sidestep the RAW hiccup: memory padding and the explicit introduction of pipeline bubbles.
    This experience paper serves as a useful troubleshooting guide for uncovering anomalous performance issues when the hardware design under study is unavailable.
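
The memory-padding countermeasure named in the abstract can be illustrated with a minimal C sketch. Everything in it is an assumption made for illustration rather than code from the paper: the 64-byte line size in `CACHE_LINE_BYTES`, the struct and field names, and the helper functions are all hypothetical, and the offset check assumes the struct itself starts on a cache-line boundary. The second countermeasure, inserting pipeline bubbles, needs target-specific instructions and is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed L1 data cache line size; NOT taken from the paper.
 * Adjust it to the line size of the actual target. */
#define CACHE_LINE_BYTES 64

/* Unpadded layout: two distinct locations that sit in the same line
 * (offsets 0 and 4), which is the layout the abstract warns about. */
struct unpadded {
    int32_t written;
    int32_t read;
};

/* Padded layout: the filler pushes 'read' one full line past 'written'. */
struct padded {
    int32_t written;
    char pad[CACHE_LINE_BYTES - sizeof(int32_t)];
    int32_t read;
};

/* Returns 1 if two byte offsets fall in the same cache line,
 * assuming the enclosing object starts on a line boundary. */
static int share_line(size_t off_a, size_t off_b)
{
    return off_a / CACHE_LINE_BYTES == off_b / CACHE_LINE_BYTES;
}

/* A write immediately followed by a read of a different location.
 * 'volatile' keeps the compiler from removing or reordering the accesses. */
static int32_t write_then_read(volatile int32_t *w,
                               const volatile int32_t *r,
                               int32_t value)
{
    assert((const volatile void *)w != (const volatile void *)r);
    *w = value;   /* the write */
    return *r;    /* the read that follows it */
}
```

Calling `write_then_read(&s.written, &s.read, v)` on a `struct unpadded` produces the back-to-back access pattern the abstract describes, while the same call on a `struct padded` touches two different lines. Padding trades memory for this separation, so it fits best where only a few hot fields are affected.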
    Original language: English
    Title of host publication: International Conference on Performance Engineering (ICPE 2013), Industry and Experience Track
    Place of Publication: New York, NY, USA
    Publisher: ACM
    Pages: 63-70
    Number of pages: 8
    ISBN (Print): 978-1-4503-1636-1
    Publication status: Published - 2013
    Event: Proceedings of the 4th ACM/SPEC International Conference on Performance Engineering (ICPE'13) - Prague, Czech Republic
    Duration: 21 Apr 2013 to 24 Apr 2013

    Conference

    Conference: Proceedings of the 4th ACM/SPEC International Conference on Performance Engineering (ICPE'13)
    Country/Territory: Czech Republic
    City: Prague
    Period: 21/04/13 to 24/04/13

    Keywords

    • L1 data cache
    • TILEPro64
    • memory padding
    • pipeline bubble
