Network Fault Localization for the InterEdge

Matthew Fogel

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2025-54
May 14, 2025

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-54.pdf

The modern Internet faces two challenges: architectural stagnation, which has widened the performance gap between private and public networks, and the increasing difficulty of diagnosing failures across distributed systems. The InterEdge architecture addresses the first challenge by enabling standardized in-network services without compromising Internet compatibility, while the "Where's the Fault?" (WTF) methodology tackles the second by designing cross-domain fault localization methods. This paper implements the WTF methodology within the InterEdge project to enable cross-domain and cross-layer fault identification. Lightweight scope requests traverse network paths based on historical routing states, giving users and applications a standardized mechanism for determining where faults occur without requiring extensive instrumentation or exposing proprietary information. The effectiveness of this approach is demonstrated through targeted test cases that successfully identify network path failures, node crashes, and service-level issues. The approach can reduce the time and complexity of troubleshooting distributed applications while maintaining compatibility with the InterEdge's existing service-oriented architecture. Although challenges remain and further work is needed, this work is a step toward addressing the growing complexity of diagnosing faults in modern Internet services. It also provides a guide for InterEdge service developers and a foundation for future enhancements.

Advisor: Scott Shenker


BibTeX citation:

@mastersthesis{Fogel:EECS-2025-54,
    Author = {Fogel, Matthew},
    Title = {Network Fault Localization for the InterEdge},
    School = {EECS Department, University of California, Berkeley},
    Year = {2025},
    Month = {May},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-54.html},
    Number = {UCB/EECS-2025-54},
    Abstract = {The modern Internet faces two challenges: architectural stagnation, which has widened the performance gap between private and public networks, and the increasing difficulty of diagnosing failures across distributed systems. The InterEdge architecture addresses the first challenge by enabling standardized in-network services without compromising Internet compatibility, while the "Where's the Fault?" (WTF) methodology tackles the second by designing cross-domain fault localization methods. This paper implements the WTF methodology within the InterEdge project to enable cross-domain and cross-layer fault identification. Lightweight scope requests traverse network paths based on historical routing states, giving users and applications a standardized mechanism for determining where faults occur without requiring extensive instrumentation or exposing proprietary information. The effectiveness of this approach is demonstrated through targeted test cases that successfully identify network path failures, node crashes, and service-level issues. The approach can reduce the time and complexity of troubleshooting distributed applications while maintaining compatibility with the InterEdge's existing service-oriented architecture. Although challenges remain and further work is needed, this work is a step toward addressing the growing complexity of diagnosing faults in modern Internet services. It also provides a guide for InterEdge service developers and a foundation for future enhancements.}
}

EndNote citation:

%0 Thesis
%A Fogel, Matthew
%T Network Fault Localization for the InterEdge
%I EECS Department, University of California, Berkeley
%D 2025
%8 May 14
%@ UCB/EECS-2025-54
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-54.html
%F Fogel:EECS-2025-54