Application-Integrated Record-Replay of Distributed Systems
Narek Galstyan
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2024-4
January 12, 2024
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-4.pdf
This report reviews bug catalogs and debugging systems designed for distributed systems. It tries to find common patterns in distributed systems bugs, highlights the characteristics necessary in a debugging system to identify these bugs in distributed systems, and proposes the Application-Integrated Record-Replay (aiRR) system for addressing classes of these bugs. aiRR is designed specifically for distributed systems. aiRR integrates the recording into the distributed system and leverages this integration to reduce the overhead of recording in the application. To have low overhead, our approach avoids reducing application-level concurrency and avoids recording application-level data that is not necessary for replay.
Advisors: Scott Shenker and Sylvia Ratnasamy
BibTeX citation:
@mastersthesis{Galstyan:EECS-2024-4,
    Author= {Galstyan, Narek},
    Title= {Application-Integrated Record-Replay of Distributed Systems},
    School= {EECS Department, University of California, Berkeley},
    Year= {2024},
    Month= {Jan},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-4.html},
    Number= {UCB/EECS-2024-4},
    Abstract= {This report reviews bug catalogs and debugging systems designed for distributed systems. It tries to find common patterns in distributed systems bugs, highlights the characteristics necessary in a debugging system to identify these bugs in distributed systems, and proposes the Application-Integrated Record-Replay (aiRR) system for addressing classes of these bugs. aiRR is designed specifically for distributed systems. aiRR integrates the recording into the distributed system and leverages this integration to reduce the overhead of recording in the application. To have low overhead, our approach avoids reducing application-level concurrency and avoids recording application-level data that is not necessary for replay.},
}
EndNote citation:
%0 Thesis
%A Galstyan, Narek
%T Application-Integrated Record-Replay of Distributed Systems
%I EECS Department, University of California, Berkeley
%D 2024
%8 January 12
%@ UCB/EECS-2024-4
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-4.html
%F Galstyan:EECS-2024-4