Page Not Found
Page not found. Your pixels are in another canvas.
A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.
About me
Published:
I’m pretty passionate about weightlifting and try to make it a part of my daily lifestyle. I’ve learned a lot over the years, and I hope this blog post might help others starting out.
Published in 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA)
This paper proposed a hardware race detector for GPUs. Our hardware was able to efficiently support detection of scoped races in GPU programs.
Published in 2021 ACM SIGOPS 28th Symposium on Operating Systems Principles (SOSP)
This paper proposed an in-GPU software race detector. The race detector made use of NVBit, a binary instrumentation tool. Using this, we were able to detect races due to improper synchronization, scopes, or Independent Thread Scheduling (ITS). We even found races in 3 NVIDIA-supported libraries (cuML, CUB, Cooperative Groups).
Published in 2022 ACM 27th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS)
This paper explored how to utilise commercially available NVM on a GPU using real hardware. Through this process we came up with a benchmark suite (GPMBench) consisting of GPU applications that benefit from both GPU parallelism and NVM persistence. We also provide a GPU-optimised library (libGPM) that simplifies access to NVM from a GPU.
Published in 2023 ACM 28th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS)
This paper explores persistency models for GPUs, analyzing whether CPU persistency models are suitable for GPU architecture and the needs of GPU applications (spoiler: they aren’t). We investigate how to express persistency models for intra-thread and inter-thread persist memory order (PMO) for GPU programs. We then look at how to design the hardware architecture necessary to implement these operations efficiently.