Memory Processing Units (Awarded BEST POSTER of

Memory Processing Units
Jaikrishnan Menon, Lorenzo De Carli, Vijayraghavan Thiruvengadam,
Karthikeyan Sankaralingam and Cristian Estan*
UW-Madison and *Google
MPU hardware
[Figure labels: MPU, MPU Controller]
● Compute fabric (90nm)
  o ARM Cortex M3
    ▪ 250 MHz, 1-issue
    ▪ 16KB stack, code
  o 8 cores / vault
  o 128 cores = 1.1W
● MMU can ensure sequential semantics
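
The figures in the bullets above imply a few derived quantities. The following is a back-of-the-envelope sketch only: it assumes the 1.1W is spread evenly over all 128 cores and that every core issues one instruction per cycle, neither of which the poster states. The variable names are chosen here for illustration.

```python
# Values taken directly from the hardware bullets.
CORES_TOTAL = 128
CORES_PER_VAULT = 8
TOTAL_POWER_W = 1.1
CLOCK_HZ = 250e6
ISSUE_WIDTH = 1

# Number of vaults needed to host all cores.
vaults = CORES_TOTAL // CORES_PER_VAULT

# Assumption: power is divided evenly among cores.
power_per_core_mw = TOTAL_POWER_W / CORES_TOTAL * 1000

# Upper bound only: assumes every core issues on every cycle.
peak_instr_per_s = CORES_TOTAL * CLOCK_HZ * ISSUE_WIDTH

print(f"vaults: {vaults}")
print(f"power per core: {power_per_core_mw:.2f} mW")
print(f"peak issue rate: {peak_instr_per_s / 1e9:.0f} G instr/s")
```

Under these assumptions the compute fabric spans 16 vaults, at roughly 8.6 mW per core and a peak aggregate issue rate of 32 G instructions per second.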
[Figure labels: 3D DRAM stack, MMU, Compute fabric, Scheduler, Buffer, Bank state, Compute, Scheduler, HMC slice/vault controller]
Programming MPUs
● Shard data among DRAM chips
● CPU-to-MPU memory procedure calls start MPU threads
● MPU-to-MPU continuations
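
The three bullets above can be illustrated with a toy, single-threaded simulation. This is a sketch under stated assumptions, not the authors' API: the poster shows no programming interface, so `Fabric`, `owner`, `call`, and `chase` are hypothetical names, the modulo sharding rule is an arbitrary choice, and the vault count of 16 is inferred from 128 cores at 8 cores per vault. The example sums a linked list whose nodes live in different shards: the CPU starts the computation with one memory procedure call, and each hop to the next node is a continuation sent to the vault that owns it.

```python
from collections import deque

NUM_VAULTS = 16  # assumption: 128 cores / 8 cores per vault


def owner(key):
    """Hypothetical sharding rule: map a key to the vault that stores it."""
    return key % NUM_VAULTS


class Fabric:
    """Toy stand-in for the MPU system: one shard and one task queue per vault."""

    def __init__(self):
        self.shards = [dict() for _ in range(NUM_VAULTS)]
        self.queues = [deque() for _ in range(NUM_VAULTS)]
        self.hops = []  # vault ids in the order tasks ran, for inspection

    def store(self, key, value):
        """Place a record in the shard of the vault that owns its key."""
        self.shards[owner(key)][key] = value

    def call(self, key, proc, *args):
        """Memory procedure call: queue proc on the vault that owns key."""
        self.queues[owner(key)].append((proc, key, args))

    def run(self):
        """Drain all queues; each task is handed only its own vault's shard."""
        results = []
        while any(self.queues):
            for vault, queue in enumerate(self.queues):
                while queue:
                    proc, key, args = queue.popleft()
                    self.hops.append(vault)
                    out = proc(self, self.shards[vault], key, *args)
                    if out is not None:
                        results.append(out)
        return results


def chase(fabric, shard, key, total):
    """Add this node's value, then continue on the vault holding the next node."""
    value, next_key = shard[key]
    total += value
    if next_key is None:
        return total  # end of list: report the sum
    fabric.call(next_key, chase, total)  # MPU-to-MPU continuation
    return None


# Linked list 3 -> 20 -> 7, stored as (value, next_key) records.
fabric = Fabric()
fabric.store(3, (10, 20))
fabric.store(20, (5, 7))
fabric.store(7, (1, None))

fabric.call(3, chase, 0)  # CPU-to-MPU call starts the traversal
result = fabric.run()
print(result, fabric.hops)
```

In this sketch the computation moves to the data rather than the data moving to the CPU: the running total travels with each continuation, and the CPU sees only the final sum.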