Between pixels and policies: Toward interpretable representations for (inter)action
Kaylene Stocking
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2025-155
August 13, 2025
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-155.pdf
Robots and other embodied agents use an internal representation of their surrounding environment to pick actions that are appropriate for their purpose or goal. In robotics, this representation is often computed directly from camera images by using deep learning algorithms to extract relevant high-level features. However, unlike in other computer vision applications, robot representations must support closed-loop interaction, where current actions affect future observations. Furthermore, the need for safety and transparency in robotics motivates closer scrutiny of the contents and limitations of learned representations.
This dissertation argues that performant and interpretable robot representations are a goal of both scientific interest and practical importance. Two possible paths toward these representations are new representation learning algorithms engineered with interpretability in mind ("design"), and better tools for improving our understanding of the representations learned by existing algorithms ("interpretation"). However, current techniques fall far short of the goal of both strong learning performance and deep mechanistic understanding. This dissertation describes research that advances the state of the art along both paths. Part I proposes two algorithms that learn symbolic representations that facilitate a robot's ability to reason about other agents. By focusing on more realistic settings than prior work, these chapters show that interpretable-by-design algorithms need not be limited to simple toy problems. Part II introduces interpretability tools from other disciplines to robotics for the first time, leading to novel insights about the representations learned by a deep end-to-end neural network trained for autonomous driving (similar to human drivers' representations for low-level vision, but not for representing other agents) and by vision-language-action models (deeply semantic despite being fine-tuned to output only actions). A common thread throughout both parts is making connections between robot representations and the representations that support human cognition and action. Whether the design path or the interpretation path ultimately proves more fruitful, this research works toward a holistic understanding of how capable embodied agents represent their environments.
Advisor: Claire Tomlin
BibTeX citation:
@phdthesis{Stocking:EECS-2025-155,
    Author = {Stocking, Kaylene},
    Title = {Between pixels and policies: Toward interpretable representations for (inter)action},
    School = {EECS Department, University of California, Berkeley},
    Year = {2025},
    Month = {Aug},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-155.html},
    Number = {UCB/EECS-2025-155},
}
EndNote citation:
%0 Thesis
%A Stocking, Kaylene
%T Between pixels and policies: Toward interpretable representations for (inter)action
%I EECS Department, University of California, Berkeley
%D 2025
%8 August 13
%@ UCB/EECS-2025-155
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-155.html
%F Stocking:EECS-2025-155