Kaushik Shivakumar and Ken Goldberg

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2023-129

May 12, 2023

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2023/EECS-2023-129.pdf

Probabilistic state estimation is crucial in robotics when full state reconstruction is not possible due to partial observability. Outputting distributions over the state space allows for expression of uncertainty in a useful way for a downstream planner, which can interact with the scene to increase confidence via a method called interactive perception and eventually make task progress. We investigate probabilistic state estimation and interactive perception for cable untangling and object search in semantically organized shelves. First, we introduce Tracing to Untangle Semi-planar Knots (TUSK), a learned cable tracing algorithm that resolves overcrossings and undercrossings to recognize knot structure and grasp points for untangling from a single RGB image. This work focuses on semi-planar knots, containing crossings each with at most 2 cable segments. We conduct experiments on 3-meter cables with up to 15 semi-planar crossings across 6 different knot types. We find that in scenes with multiple identical cables, TUSK can trace a single cable with 81% accuracy on 7 new knot types. In single-cable images, TUSK can trace and identify the correct knot with 77% success on 3 new knot types. We incorporate TUSK into a bimanual robot untangling system and find it successfully untangles 64% of cable configurations, including those with new knots unseen during training, across 3 levels of difficulty. Second, we introduce Semantic Spatial Search on Shelves (S^4) to improve efficiency when locating a fully occluded target object in a shelf. Shelves in pharmacies, restaurant kitchens, and grocery stores are often organized such that semantically similar objects are placed close to one another. With Semantic Spatial Search on Shelves (S^4), we use large language models (LLMs) to generate affinity matrices, where entries correspond to semantic likelihood of physical proximity between objects. 
We derive occupancy distributions by synthesizing semantics with learned spatial constraints. Simulation experiments suggest that S^4 combined with an interactive perception policy reduces search time relative to pure spatial search by 24% across three domains (pharmacy, kitchen, and office shelves), and physical experiments on a pharmacy shelf suggest a 47.1% improvement. We conclude with limitations and areas for future work.

Advisor: Ken Goldberg


BibTeX citation:

@mastersthesis{Shivakumar:EECS-2023-129,
    Author= {Shivakumar, Kaushik and Goldberg, Ken},
    Title= {Probabilistic State Estimation to Enable Manipulation and Interactive Perception for Robotic Cable Untangling and Object Search},
    School= {EECS Department, University of California, Berkeley},
    Year= {2023},
    Month= {May},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2023/EECS-2023-129.html},
    Number= {UCB/EECS-2023-129},
    Abstract= {Probabilistic state estimation is crucial in robotics when full state reconstruction is not possible due to partial observability. Outputting distributions over the state space allows for expression of uncertainty in a useful way for a downstream planner, which can interact with the scene to increase confidence via a method called interactive perception and eventually make task progress. We investigate probabilistic state estimation and interactive perception for cable untangling and object search in semantically organized shelves. First, we introduce Tracing to Untangle Semi-planar Knots (TUSK), a learned cable tracing algorithm that resolves overcrossings and undercrossings to recognize knot structure and grasp points for untangling from a single RGB image. This work focuses on semi-planar knots, containing crossings each with at most 2 cable segments. We conduct experiments on 3-meter cables with up to 15 semi-planar crossings across 6 different knot types. We find that in scenes with multiple identical cables, TUSK can trace a single cable with 81% accuracy on 7 new knot types. In single-cable images, TUSK can trace and identify the correct knot with 77% success on 3 new knot types. We incorporate TUSK into a bimanual robot untangling system and find it successfully untangles 64% of cable configurations, including those with new knots unseen during training, across 3 levels of difficulty. Second, we introduce Semantic Spatial Search on Shelves (S^4) to improve efficiency when locating a fully occluded target object in a shelf. Shelves in pharmacies, restaurant kitchens, and grocery stores are often organized such that semantically similar objects are placed close to one another. With Semantic Spatial Search on Shelves (S^4), we use large language models (LLMs) to generate affinity matrices, where entries correspond to semantic likelihood of physical proximity between objects. 
We derive occupancy distributions by synthesizing semantics with learned spatial constraints. Simulation experiments suggest that S^4 combined with an interactive perception policy reduces search time relative to pure spatial search by 24% across three domains (pharmacy, kitchen, and office shelves), and physical experiments on a pharmacy shelf suggest a 47.1% improvement. We conclude with limitations and areas for future work.},
}

EndNote citation:

%0 Thesis
%A Shivakumar, Kaushik 
%A Goldberg, Ken 
%T Probabilistic State Estimation to Enable Manipulation and Interactive Perception for Robotic Cable Untangling and Object Search
%I EECS Department, University of California, Berkeley
%D 2023
%8 May 12
%@ UCB/EECS-2023-129
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2023/EECS-2023-129.html
%F Shivakumar:EECS-2023-129