While the implications of persistent memory (PM) on CPU hardware and software are well-explored, the same is not true for GPUs (Graphics Processing Units). A recent work, GPM, demonstrated how GPU programs can benefit from the fine-grain persistence of PM. However, in the absence of a persistency model, one cannot reason about the correctness of PM-aware GPU programs. Persistency models define the order in which writes to PM are persisted.
We explore persistency models for GPUs. We demonstrate that CPU persistency models fall short for GPUs. We qualitatively and quantitatively argue that GPU persistency models should support scopes and buffering of writes to PM to leverage parallelism while adapting to higher NVM latencies. We formally specify a GPU persistency model that supports both scopes and buffers. We detail how GPU architecture can efficiently realize such a model. Finally, we quantitatively demonstrate the usefulness of scopes and buffers for PM-aware GPU programs.