Trajectory Generation Using Dual-Robot Haptic Interface for Reinforcement Learning from Demonstration
Published in 2023 Iberian Robotics Conference, Coimbra (Portugal), 22-24 November, 2023
In robot learning, techniques such as Learning from Demonstration (LfD) and Reinforcement Learning (RL) have become widely popular among developers. However, these approaches can result in inefficient strategies when training more than one agent interacting in the same space with several objects and unknown obstacles. To address this problem, Reinforcement Learning from Demonstration (RLfD) allows the agent to learn and evaluate its performance from a set of demonstrations provided by a human expert while generalising from them through RL training. In dual-robot applications, this approach is suitable for training agents that perform collaborative tasks. For this reason, a dual-robot haptic interface has been designed to produce dual-manipulation trajectories to feed an RLfD agent. Haptics makes it possible to perform high-quality demonstrations following an impedance control approach. The trajectories obtained will be used as positive demonstrations so that the training environment can generate automatic ones. As a result, this dual-robot haptic interface will provide a few trajectory demonstrations of dual manipulation for training agents with RL strategies. The aim of this research is to generate trajectories with this dual-robot haptic interface to train one or more agents following RLfD paradigms. Results show that trajectories performed with this interface present less error and deviation than those performed with a non-haptic interface, increasing the quality of the training data.
Keywords: Dual-robot, Haptic interface, Demonstrations, Reinforcement Learning from Demonstration
Recommended citation: Daniel Frau-Alfaro, Santiago T. Puente, Ignacio de Loyola Páez-Ubieta (2023). "Trajectory Generation Using Dual-Robot Haptic Interface for Reinforcement Learning from Demonstration." 2023 6th Iberian Robotics Conference (ROBOT), vol. 976, pp. 444-455, doi: 10.1007/978-3-031-58676-7_36
Download Paper