Haozhi Qi

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2025-158

August 14, 2025

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-158.pdf

Human hands are fundamental to how we sense and act upon the physical world. Their unique ability to coordinate precise movements and adapt to changing conditions gives us the dexterity to grasp, manipulate, and interact with objects of different shapes and physical properties with remarkable ease. Replicating this level of dexterity is critical for building general-purpose robots that can operate reliably in unstructured environments. Although modern artificial intelligence has achieved significant success in vision and language, dexterous manipulation remains a major unsolved challenge. This difficulty arises from the high dimensionality of motor control, the scarcity of real-world data, and the need to integrate diverse sensory inputs into robust behaviors. This thesis aims to close this gap by developing learning-based systems that equip robots with multisensory intelligence for dexterous manipulation. It shows how to combine vision, touch, and proprioception with scalable learning methods to enable robots to perform complex, contact-rich tasks that require coordination and adaptation.

The approach follows a structured learning paradigm. First, we focus on acquiring individual manipulation skills through large-scale training in simulation. Each skill is developed independently and grounded in principles of generalization across objects and physical properties. Second, we investigate how rich multisensory feedback, including tactile sensing, visual inputs, and proprioceptive signals, can be fused to improve both perception and control. We show that these modalities are not redundant but complementary, and their integration allows robots to perform tasks that remain infeasible with simple sensors alone. Third, we propose a compositional framework that builds on previously learned skills to acquire new behaviors efficiently. The thesis concludes by identifying key challenges ahead and outlining directions for future research in developing robots that can interact with the world as fluently and flexibly as humans do.
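
To make the fusion idea in the paragraph above concrete, the sketch below shows one minimal way such a policy could be wired up in PyTorch: each modality (vision, touch, proprioception) is encoded separately, the features are concatenated, and the fused representation is mapped to an action. This is an illustrative sketch, not the thesis's implementation; all module names, feature dimensions, and the action size are placeholder assumptions, and the learned visual/tactile encoders and large-scale simulation training the abstract describes are omitted.

import torch
import torch.nn as nn

class MultisensoryPolicy(nn.Module):
    """Illustrative fusion policy (placeholder dimensions, not from the thesis):
    encode each modality separately, concatenate the features, map to actions."""

    def __init__(self, visual_dim=512, tactile_dim=64, proprio_dim=32,
                 feature_dim=128, action_dim=16):
        super().__init__()
        # One encoder per modality; a real system would use a CNN/ViT for vision
        # and temporal models for touch and proprioception.
        self.visual_enc = nn.Sequential(nn.Linear(visual_dim, feature_dim), nn.ReLU())
        self.tactile_enc = nn.Sequential(nn.Linear(tactile_dim, feature_dim), nn.ReLU())
        self.proprio_enc = nn.Sequential(nn.Linear(proprio_dim, feature_dim), nn.ReLU())
        # Fusion head: concatenated features -> action (e.g., joint targets).
        self.policy = nn.Sequential(
            nn.Linear(3 * feature_dim, feature_dim),
            nn.ReLU(),
            nn.Linear(feature_dim, action_dim),
        )

    def forward(self, visual, tactile, proprio):
        fused = torch.cat([
            self.visual_enc(visual),
            self.tactile_enc(tactile),
            self.proprio_enc(proprio),
        ], dim=-1)
        return self.policy(fused)

# Example step with random tensors standing in for real sensor readings.
policy = MultisensoryPolicy()
action = policy(torch.randn(1, 512), torch.randn(1, 64), torch.randn(1, 32))
print(action.shape)  # torch.Size([1, 16])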

Advisors: Jitendra Malik and Yi Ma


BibTeX citation:

@phdthesis{Qi:EECS-2025-158,
    Author= {Qi, Haozhi},
    Title= {Multisensory Dexterity for Robotics},
    School= {EECS Department, University of California, Berkeley},
    Year= {2025},
    Month= {Aug},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-158.html},
    Number= {UCB/EECS-2025-158},
Abstract= {Human hands are fundamental to how we sense and act upon the physical world. Their unique ability to coordinate precise movements and adapt to changing conditions gives us the dexterity to grasp, manipulate, and interact with objects of different shapes and physical properties with remarkable ease. Replicating this level of dexterity is critical for building general-purpose robots that can operate reliably in unstructured environments. Although modern artificial intelligence has achieved significant success in vision and language, dexterous manipulation remains a major unsolved challenge. This difficulty arises from the high dimensionality of motor control, the scarcity of real-world data, and the need to integrate diverse sensory inputs into robust behaviors. This thesis aims to close this gap by developing learning-based systems that equip robots with multisensory intelligence for dexterous manipulation. It shows how to combine vision, touch, and proprioception with scalable learning methods to enable robots to perform complex, contact-rich tasks that require coordination and adaptation.

The approach follows a structured learning paradigm. First, we focus on acquiring individual manipulation skills through large-scale training in simulation. Each skill is developed independently and grounded in principles of generalization across objects and physical properties. Second, we investigate how rich multisensory feedback, including tactile sensing, visual inputs, and proprioceptive signals, can be fused to improve both perception and control. We show that these modalities are not redundant but complementary, and their integration allows robots to perform tasks that remain infeasible with simple sensors alone. Third, we propose a compositional framework that builds on previously learned skills to acquire new behaviors efficiently. The thesis concludes by identifying key challenges ahead and outlining directions for future research in developing robots that can interact with the world as fluently and flexibly as humans do.},
}

EndNote citation:

%0 Thesis
%A Qi, Haozhi 
%T Multisensory Dexterity for Robotics
%I EECS Department, University of California, Berkeley
%D 2025
%8 August 14
%@ UCB/EECS-2025-158
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-158.html
%F Qi:EECS-2025-158