MoMa-Teleop

MoMa-Teleop is a teleoperation method that produces whole-body motions for mobile manipulation by delegating the base motions to a reinforcement learning agent, leaving the operator to fully focus on the task-relevant end-effector motions.

Demonstration data plays a key role in learning complex behaviors and training robotic foundation models. While effective control interfaces exist for static manipulators, data collection remains cumbersome and time-intensive for mobile manipulators due to their large number of degrees of freedom. Although specialized hardware, avatars, or motion tracking can enable whole-body control, these approaches are either expensive, robot-specific, or suffer from the embodiment mismatch between robot and human demonstrator. In this work, we present MoMa-Teleop, a novel teleoperation method that delegates the base motions to a reinforcement learning agent, leaving the operator to focus fully on the task-relevant end-effector motions. This enables whole-body teleoperation of mobile manipulators with zero additional hardware or setup costs via standard interfaces such as joysticks or hand guidance. Moreover, the operator is not bound to a tracked workspace and can move freely with the robot over spatially extended tasks. We demonstrate that our approach results in a significant reduction in task completion time across a variety of robots and tasks. As the generated data covers diverse whole-body motions without embodiment mismatch, it enables efficient imitation learning. By focusing on task-specific end-effector motions, our approach learns skills that transfer to unseen settings, such as new obstacles or changed object positions, from as little as five demonstrations.

Efficient Teleoperation

We compare our approach with static operation methods, such as joystick or hand guidance, and tracking approaches using cameras or VR controllers. All methods are evaluated on common household tasks. MoMa-Teleop not only achieves high success rates and faster completion times but also produces smoother, continuous motions.

Joystick Teleoperation

We first evaluate MoMa-Teleop with joystick inputs on the Toyota HSR robot.

HSR: Pick and Place

HSR: Microwave

HSR: Door Inwards

HSR: Toolbox

Kinesthetic Teaching

MoMa-Teleop also enables kinesthetic teaching for mobile platforms: the obstacle avoidance of the base agent ensures safe operation next to the human guiding the arm. In tasks that span large areas, we see substantial reductions in operation time compared to static hand guidance. As the imprecision of the tracking methods becomes more pronounced on the FMM robots, we do not evaluate them on all tasks to ensure the safety of the hardware.

FMM: Clean Table

FMM: Door Outwards

FMM: Folding Cabinet

FMM: Fridge Pick & Place

Imitation Learning

MoMa-Teleop enables us to rapidly learn mobile manipulation policies from only five demonstrations with TAPAS-GMM. We either fit whole-body motions or learn only the end-effector motions and reuse the base agent. The learned whole-body motions fail to generalize to new scenarios, such as objects placed at different heights. In contrast, when the base agent is reused, the whole approach becomes object-centric and automatically adapts to new settings. This allows the same motions to be reused in common scenarios, such as avoiding new obstacles added to the scene.
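
To illustrate why an object-centric representation adapts to changed object positions, the following minimal sketch stores end-effector waypoints relative to the object and maps them into the world frame for a new object pose. It is not TAPAS-GMM and not our implementation: the function `to_world`, the pose format `(x, y, z, yaw)`, and all waypoint values are hypothetical and chosen only for illustration.

```python
import math


def to_world(object_pose, local_points):
    """Map waypoints from the object frame into the world frame.

    object_pose is (x, y, z, yaw), with yaw a rotation about the world z-axis.
    This pose format is an assumption made for this sketch.
    """
    ox, oy, oz, yaw = object_pose
    c, s = math.cos(yaw), math.sin(yaw)
    return [
        (ox + c * px - s * py, oy + s * px + c * py, oz + pz)
        for px, py, pz in local_points
    ]


# Hypothetical approach motion, stored relative to the object (meters).
approach_in_object_frame = [
    (-0.30, 0.0, 0.10),
    (-0.15, 0.0, 0.05),
    (0.00, 0.0, 0.00),
]

# The same stored motion applied to two different object placements:
# a low table, and a higher shelf with the object rotated by 90 degrees.
low = to_world((1.0, 0.5, 0.4, 0.0), approach_in_object_frame)
high = to_world((2.0, -1.0, 1.1, math.pi / 2), approach_in_object_frame)
```

In both cases the final waypoint coincides with the object position, so the end-effector targets follow the object without new demonstrations. In this sketch nothing about the base is encoded in the waypoints; moving the base so that the targets stay reachable would be left to the reused base agent.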

Publications

If you find our work useful, please consider citing our paper:

Daniel Honerkamp, Harsh Mahesheka, Jan Ole von Hartz, Tim Welschehold, and Abhinav Valada
Whole-Body Teleoperation for Mobile Manipulation at Zero Added Cost
IEEE Robotics and Automation Letters (RA-L), 2025.
