International Symposium on Microarchitecture (Micro), Dec 2009.
The ability to replay a program's execution on a multi-processor system
can significantly help parallel programming. To replay a
shared-memory multi-threaded program, existing solutions record the
program input (I/O, DMA, etc.) and the shared-memory
dependencies between threads. Prior processor-based record-and-replay
solutions are efficient, but they require non-trivial modifications to
the coherency protocol and the memory sub-system for recording the shared-memory dependencies.
In this paper, we propose a processor-based record-and-replay solution that does not require detecting and logging shared-memory dependencies to enable multi-processor replay. It is based on our insight that a load-based checkpointing scheme that records the program input has sufficient information for deterministically replaying each thread. We propose an offline symbolic analysis algorithm based on an SMT solver that determines the shared-memory dependencies using just the program input logs during replay. In addition to saving log space, the proposed approach significantly reduces the hardware support required for enabling replay.
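To make the idea concrete, the following is a minimal sketch, not the algorithm of this paper. It assumes each thread's log has already been replayed in isolation into a sequence of stores and loads with known values, and it replaces the SMT solver with a plain backtracking search over interleavings, which is only practical for toy inputs. The names find_schedule and cross_thread_dependencies, the ("st"/"ld", address, value) tuple encoding, and the assumption of sequential consistency with zero-initialized memory are all illustrative choices.

```python
_MISSING = object()  # sentinel for "address not yet written"


def find_schedule(threads, initial_memory=None):
    """Search for one interleaving in which every load sees its logged value.

    threads: list of per-thread op lists; each op is (kind, addr, value)
             with kind "st" (store value) or "ld" (load that returned value).
    Returns a list of (tid, kind, addr, value) or None if no interleaving fits.
    Assumes sequential consistency and memory that reads as 0 when unwritten.
    """
    memory = dict(initial_memory or {})
    positions = [0] * len(threads)
    schedule = []

    def search():
        if all(pos == len(ops) for pos, ops in zip(positions, threads)):
            return True  # every thread has finished
        for tid, ops in enumerate(threads):
            if positions[tid] == len(ops):
                continue
            kind, addr, value = ops[positions[tid]]
            if kind == "ld" and memory.get(addr, 0) != value:
                continue  # memory does not yet hold the logged value
            old = memory.get(addr, _MISSING)
            if kind == "st":
                memory[addr] = value
            positions[tid] += 1
            schedule.append((tid, kind, addr, value))
            if search():
                return True
            # Undo this choice and try the next thread.
            schedule.pop()
            positions[tid] -= 1
            if kind == "st":
                if old is _MISSING:
                    del memory[addr]
                else:
                    memory[addr] = old
        return False

    return schedule if search() else None


def cross_thread_dependencies(schedule):
    """List read-after-write edges (writer_tid, reader_tid, addr) in a schedule."""
    last_writer = {}
    deps = []
    for tid, kind, addr, _ in schedule:
        if kind == "st":
            last_writer[addr] = tid
        elif addr in last_writer and last_writer[addr] != tid:
            deps.append((last_writer[addr], tid, addr))
    return deps


if __name__ == "__main__":
    threads = [
        [("st", "x", 1), ("ld", "y", 0)],  # thread 0
        [("st", "y", 1), ("ld", "x", 1)],  # thread 1
    ]
    schedule = find_schedule(threads)
    print(schedule)
    print(cross_thread_dependencies(schedule))
```

In this toy input, the logged load values alone force thread 0 to finish before thread 1 stores to y, so the inter-thread ordering is recovered without having been recorded. A solver-based formulation would encode the same constraints symbolically instead of enumerating interleavings.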