TY - JOUR
T1 - Accelerating Interactive Reinforcement Learning by Human Advice for an Assembly Task by a Cobot
AU - De Winter, Joris
AU - De Beir, Albert
AU - El Makrini, Ilias
AU - Van De Perre, Greet
AU - Nowe, Ann
AU - Vanderborght, Bram
PY - 2019/12/16
Y1 - 2019/12/16
N2 - The assembly industry is shifting more towards customizable products, or requiring assembly of small batches. This requires a lot of reprogramming, which is expensive because a specialized engineer is required. It would be an improvement if untrained workers could help a cobot to learn an assembly sequence by giving advice. Learning an assembly sequence is a hard task for a cobot, because the solution space increases drastically when the complexity of the task increases. This work introduces a novel method where human knowledge is used to reduce this solution space, and as a result increases the learning speed. The method proposed is the IRL-PBRS method, which uses Interactive Reinforcement Learning (IRL) to learn from human advice in an interactive way, and uses Potential Based Reward Shaping (PBRS), in a simulated environment, to focus learning on a smaller part of the solution space. The method was compared in simulation to two other feedback strategies. The results show that IRL-PBRS converges more quickly to a valid assembly sequence policy and does this with the fewest human interactions. Finally, a use case is presented where participants were asked to program an assembly task. Here, the results show that IRL-PBRS learns quickly enough to keep up with advice given by a user, and is able to adapt online to a changing knowledge base.
AB - The assembly industry is shifting more towards customizable products, or requiring assembly of small batches. This requires a lot of reprogramming, which is expensive because a specialized engineer is required. It would be an improvement if untrained workers could help a cobot to learn an assembly sequence by giving advice. Learning an assembly sequence is a hard task for a cobot, because the solution space increases drastically when the complexity of the task increases. This work introduces a novel method where human knowledge is used to reduce this solution space, and as a result increases the learning speed. The method proposed is the IRL-PBRS method, which uses Interactive Reinforcement Learning (IRL) to learn from human advice in an interactive way, and uses Potential Based Reward Shaping (PBRS), in a simulated environment, to focus learning on a smaller part of the solution space. The method was compared in simulation to two other feedback strategies. The results show that IRL-PBRS converges more quickly to a valid assembly sequence policy and does this with the fewest human interactions. Finally, a use case is presented where participants were asked to program an assembly task. Here, the results show that IRL-PBRS learns quickly enough to keep up with advice given by a user, and is able to adapt online to a changing knowledge base.
KW - Interactive Reinforcement Learning
KW - Cobots
KW - Programming by Advice
KW - Assembly planning
UR - http://www.scopus.com/inward/record.url?scp=85079805787&partnerID=8YFLogxK
U2 - 10.3390/robotics8040104
DO - 10.3390/robotics8040104
M3 - Article
VL - 8
JO - Robotics
JF - Robotics
SN - 2218-6581
IS - 4
M1 - 104
ER -