Post: Cache


A cache gives faster access to data than main memory (RAM). The trade-off is capacity: a cache holds far less data than RAM.

How it works


  1. The CPU checks whether the address is in the cache.
  2. If it is in the cache (a hit), load the data directly from the cache.
  3. Otherwise (a miss), fetch the cache line from memory and update the cache.
    • If the cache is full, evict a cache line according to a replacement policy such as LRU or LFU.
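The lookup steps above can be sketched as a toy simulation. This is only an illustration under stated assumptions: the cache is fully associative, the line size is 64 bytes, eviction is LRU, and the names `LRUCache`, `LINE_SIZE` and `load` are made up for this example.

```python
from collections import OrderedDict

LINE_SIZE = 64  # bytes per cache line (assumed value)


class LRUCache:
    """Toy fully associative cache keyed by line number, with LRU eviction."""

    def __init__(self, capacity_lines, memory):
        self.capacity = capacity_lines
        self.memory = memory        # backing store: line number -> data
        self.lines = OrderedDict()  # line number -> data, least recent first
        self.hits = 0
        self.misses = 0

    def load(self, address):
        line_no = address // LINE_SIZE
        if line_no in self.lines:            # steps 1-2: hit, serve from cache
            self.hits += 1
            self.lines.move_to_end(line_no)  # mark as most recently used
            return self.lines[line_no]
        self.misses += 1                     # step 3: miss, fetch the line
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)   # cache full: evict the LRU line
        self.lines[line_no] = self.memory.get(line_no, 0)
        return self.lines[line_no]


memory = {n: f"line-{n}" for n in range(8)}
cache = LRUCache(capacity_lines=2, memory=memory)
cache.load(0)    # miss, brings in line 0
cache.load(64)   # miss, brings in line 1
cache.load(8)    # hit, address 8 is also in line 0
cache.load(128)  # miss, evicts line 1 (least recently used)
print(cache.hits, cache.misses, list(cache.lines))  # 1 3 [0, 2]
```

Real caches are usually set associative and approximate LRU in hardware, so treat this as a model of the idea rather than of any specific CPU.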


  • When a process writes to a memory address that is in the cache, the cache follows one of the following policies:
  • Write Through:
    • Update both the cache and memory.
    • Advantage: memory always holds the latest value, so another core that reads the same address from memory gets the updated version.
  • Write Back:
    • Update only the cache; memory is updated only when the cache line is evicted.
    • A dirty bit marks that the cache line has been modified and needs to be written back to memory.
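A minimal sketch of the two write policies, assuming memory is a plain dict and each address is its own cache line. The class names `WriteThroughCache` and `WriteBackCache` and the `evict` method are hypothetical, chosen only for this example.

```python
class WriteThroughCache:
    """Every write updates the cache and memory together."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}

    def write(self, addr, value):
        self.lines[addr] = value
        self.memory[addr] = value  # memory is never stale


class WriteBackCache:
    """Writes stay in the cache until the line is evicted."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}
        self.dirty = set()  # addresses whose dirty bit is set

    def write(self, addr, value):
        self.lines[addr] = value
        self.dirty.add(addr)  # mark the line as modified

    def evict(self, addr):
        value = self.lines.pop(addr)
        if addr in self.dirty:  # only dirty lines are written back
            self.memory[addr] = value
            self.dirty.discard(addr)


mem_wt = {0x10: 0}
wt = WriteThroughCache(mem_wt)
wt.write(0x10, 7)

mem_wb = {0x10: 0}
wb = WriteBackCache(mem_wb)
wb.write(0x10, 7)
stale = mem_wb[0x10]  # memory still holds the old value here
wb.evict(0x10)        # write-back happens on eviction

print(mem_wt[0x10], stale, mem_wb[0x10])  # 7 0 7
```

Write back saves memory traffic when the same line is written many times, at the cost of memory being out of date until eviction.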


  1. Spatial locality: when an address is fetched into the cache, the entire cache line is fetched, so nearby addresses that have not been accessed yet are already in the cache.
  2. Temporal locality: a previously fetched address stays in the cache until the cache is full and its line is evicted, so repeated accesses to it hit the cache.
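To see why both kinds of locality matter, here is a small miss counter. It assumes a 64-byte line and a cache large enough that nothing is evicted; `count_misses` is a helper invented for this sketch.

```python
LINE_SIZE = 64  # bytes per cache line (assumed value)


def count_misses(addresses, line_size=LINE_SIZE):
    """Count misses in a cache that fetches whole lines and never evicts."""
    cached_lines = set()
    misses = 0
    for addr in addresses:
        line = addr // line_size
        if line not in cached_lines:
            misses += 1
            cached_lines.add(line)
    return misses


sequential = [i * 4 for i in range(64)]  # 64 adjacent 4-byte values
strided = [i * 64 for i in range(64)]    # 64 values, one per cache line
repeated = [0] * 64                      # the same address 64 times

print(count_misses(sequential))  # 4: spatial locality, 16 values share a line
print(count_misses(strided))     # 64: every access touches a new line
print(count_misses(repeated))    # 1: temporal locality, only the first misses
```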

Cache Coherence

Problem: multiple copies of the same data can exist in different caches. This happens in multiprocessors that share an address space.

Coherence refers to multiple processors writing to the same address.


  1. Program order: the order of memory accesses in the code should be the same as the order of the actual memory accesses.
    • The sequence in the code cannot be reordered.
  2. Write Propagation: when a processor writes to a memory address, other processors should be able to read the new value it wrote.
  3. Write Serialization: writes to the same memory address must be seen in the same order by all processors.


  • Overhead of the snooping or directory-based mechanism
  • False sharing: because updates are published per block, two processes that use different addresses in the same block must still be kept coherent with each other.
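False sharing comes down to whether two addresses land in the same block. A small sketch, assuming 64-byte blocks; `same_block` and `pad_to_line` are hypothetical helpers, and the addresses are made up.

```python
LINE_SIZE = 64  # bytes per cache block (assumed value)


def same_block(addr_a, addr_b, line_size=LINE_SIZE):
    """True if both addresses fall in the same cache block."""
    return addr_a // line_size == addr_b // line_size


def pad_to_line(offset, line_size=LINE_SIZE):
    """Round an offset up to the next block boundary."""
    return (offset + line_size - 1) // line_size * line_size


counter_a = 0  # written only by core 0
counter_b = 8  # written only by core 1, but 8 bytes away

print(same_block(counter_a, counter_b))  # True: writes to one disturb the other

padded_b = pad_to_line(counter_b)        # move counter_b to its own block
print(padded_b, same_block(counter_a, padded_b))  # 64 False
```

Padding trades a little memory for independence: once the two counters sit in different blocks, a write by one core no longer forces coherence traffic for the other.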

Snooping Based

  • Each cache tracks the status of its own cache blocks
  • Caches monitor, or snoop on, the bus
  • Bus
    • All processors on the bus can observe every bus transaction
      • Write propagation: when there is a transaction, a cache can apply the change to its own copy
    • Bus transactions are visible to all processors in the same order
      • Write serialization: the writes are propagated in the same order.
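A toy model of the bus behaviour described above. It assumes a write-update scheme (snoopers copy the new value) with write-through to memory, which is only one possible design; common protocols such as MESI invalidate other copies instead. `Bus` and `SnoopingCache` are names invented for this sketch.

```python
class Bus:
    """Shared bus: every write transaction is seen by all caches, in one order."""

    def __init__(self):
        self.memory = {}
        self.caches = []
        self.log = []  # the single global order of write transactions

    def attach(self, cache):
        self.caches.append(cache)

    def broadcast_write(self, source, addr, value):
        self.log.append((source.name, addr, value))
        self.memory[addr] = value  # write-through, to keep the sketch simple
        for cache in self.caches:
            if cache is not source:
                cache.snoop(addr, value)


class SnoopingCache:
    def __init__(self, name, bus):
        self.name = name
        self.bus = bus
        self.lines = {}
        bus.attach(self)

    def read(self, addr):
        if addr not in self.lines:  # miss: fetch from memory
            self.lines[addr] = self.bus.memory.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.bus.broadcast_write(self, addr, value)

    def snoop(self, addr, value):
        if addr in self.lines:  # write propagation: refresh our copy
            self.lines[addr] = value


bus = Bus()
c0, c1 = SnoopingCache("c0", bus), SnoopingCache("c1", bus)
c0.read(0x40)
c1.read(0x40)     # both caches now hold a copy
c0.write(0x40, 1)
c1.write(0x40, 2)
print(c0.read(0x40), c1.read(0x40))  # 2 2
print(bus.log == [("c0", 0x40, 1), ("c1", 0x40, 2)])  # True
```

Because every write goes through the single `log`, both caches see the writes in the same order, which is the write serialization property.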

Directory Based

  • The sharing status of each cache block is kept in a centralized location (the directory)
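One way to picture the directory is a table from cache block to the set of cores holding a copy. The sketch below assumes writes invalidate other copies; the `Directory` class and its method names are hypothetical.

```python
class Directory:
    """Central table: which cores currently share each cache block."""

    def __init__(self):
        self.sharers = {}  # block number -> set of core ids

    def record_read(self, core, block):
        self.sharers.setdefault(block, set()).add(core)

    def record_write(self, core, block):
        """Return the cores that must be invalidated; the writer becomes the only holder."""
        to_invalidate = self.sharers.get(block, set()) - {core}
        self.sharers[block] = {core}
        return to_invalidate


directory = Directory()
directory.record_read(core=0, block=5)
directory.record_read(core=1, block=5)
directory.record_read(core=2, block=9)

invalidated = directory.record_write(core=0, block=5)
print(invalidated)           # {1}: only core 1 is notified
print(directory.sharers[5])  # {0}
print(directory.sharers[9])  # {2}: unrelated block is untouched
```

Unlike snooping, only the cores listed for that block receive a message, so no broadcast is needed.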

Memory Consistency

Problem: the order in which memory operations on different memory locations appear to happen across multiple processors.

Memory operations:

  • Write to Read: write to X completes before read from Y
  • Read to Read: read from X completes before read from Y
  • Read to Write: read from X completes before write to Y
  • Write to Write: write to X completes before write to Y
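To show what the Write to Read ordering buys, here is a small sketch of the classic two-thread test: thread 0 writes X then reads Y, thread 1 writes Y then reads X. It enumerates every interleaving that keeps each thread's program order, which is an assumed model of a machine that enforces all four orderings. `THREADS` and `run` are names made up for this example.

```python
from itertools import permutations

# Thread 0: X = 1; r0 = Y        Thread 1: Y = 1; r1 = X
THREADS = {
    0: [("write", "X"), ("read", "Y")],
    1: [("write", "Y"), ("read", "X")],
}


def run(schedule):
    """Execute one interleaving; schedule lists which thread takes each step."""
    memory = {"X": 0, "Y": 0}
    pc = {0: 0, 1: 0}  # next operation index per thread
    result = {}
    for tid in schedule:
        op, loc = THREADS[tid][pc[tid]]
        pc[tid] += 1
        if op == "write":
            memory[loc] = 1
        else:
            result[tid] = memory[loc]
    return result[0], result[1]


# Each thread appears twice, in program order; the set removes duplicates.
schedules = set(permutations([0, 0, 1, 1]))
outcomes = {run(s) for s in schedules}
print(sorted(outcomes))  # [(0, 1), (1, 0), (1, 1)]
```

The outcome `(0, 0)` never appears: with Write to Read order kept, at least one thread must see the other's write. On hardware that relaxes this ordering, for example by letting a read complete while an earlier write still sits in a store buffer, `(0, 0)` becomes possible.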