TY - JOUR
T1 - Synergistic Task and Motion Planning with Reinforcement Learning-Based Non-Prehensile Actions
AU - Liu, Gaoyuan
AU - De Winter, Joris
AU - Steckelmacher, Denis
AU - Hota, Roshan Kumar
AU - Nowe, Ann
AU - Vanderborght, Bram
PY - 2023/3/15
Y1 - 2023/3/15
N2 - Robotic manipulation in cluttered environments requires synergistic planning among prehensile and non-prehensile actions. Previous work on sampling-based Task and Motion Planning (TAMP) algorithms, e.g., PDDLStream, provides a fast and generalizable solution for multi-modal manipulation. However, these algorithms are likely to fail in cluttered scenarios where no collision-free grasping approach can be sampled without preliminary manipulations. To extend the ability of sampling-based algorithms, we integrate a vision-based Reinforcement Learning (RL) non-prehensile procedure, the pusher. The pushing actions generated by the pusher can eliminate interlocked situations and make the grasping problem solvable. Moreover, the sampling-based algorithm evaluates the pushing actions by providing rewards during training, so the pusher learns to avoid situations that lead to irreversible failures. The proposed hybrid planning method is validated on a cluttered bin-picking problem and implemented in both simulation and the real world. Results show that the pusher can effectively improve the success ratio of the previous sampling-based algorithm, while the sampling-based algorithm helps the pusher learn pushing skills.
AB - Robotic manipulation in cluttered environments requires synergistic planning among prehensile and non-prehensile actions. Previous work on sampling-based Task and Motion Planning (TAMP) algorithms, e.g., PDDLStream, provides a fast and generalizable solution for multi-modal manipulation. However, these algorithms are likely to fail in cluttered scenarios where no collision-free grasping approach can be sampled without preliminary manipulations. To extend the ability of sampling-based algorithms, we integrate a vision-based Reinforcement Learning (RL) non-prehensile procedure, the pusher. The pushing actions generated by the pusher can eliminate interlocked situations and make the grasping problem solvable. Moreover, the sampling-based algorithm evaluates the pushing actions by providing rewards during training, so the pusher learns to avoid situations that lead to irreversible failures. The proposed hybrid planning method is validated on a cluttered bin-picking problem and implemented in both simulation and the real world. Results show that the pusher can effectively improve the success ratio of the previous sampling-based algorithm, while the sampling-based algorithm helps the pusher learn pushing skills.
KW - Task and Motion Planning
KW - Reinforcement Learning
KW - Manipulation Planning
M3 - Article
JO - IEEE Robotics and Automation Letters
JF - IEEE Robotics and Automation Letters
SN - 2377-3766
ER -