SpeedTuning: Speeding Up Policy Execution with Lightweight Reinforcement Learning

Stanford University
ICRA 2025

Abstract

While learned robotic policies hold promise for advancing generalizable manipulation, their practical deployment is often hindered by suboptimal execution speeds. Imitation learning policies are inherently limited by hardware constraints and by the speed of the operator during data collection. In addition, there are no established methods for accelerating policies learned via imitation, and the empirical relationship between execution speed and task success remains underexplored. To address these issues, we introduce SpeedTuning, a reinforcement learning framework designed to increase the execution speed of manipulation policies. SpeedTuning learns to predict the optimal execution speed for each action, complementing a base policy without requiring additional data collection. We provide empirical evidence that SpeedTuning achieves a speed-up exceeding 2.4x while preserving an adequate success rate, compared to both the original task policy and straightforward speed-up methods such as linear interpolation at a fixed speed. We evaluate our approach across a diverse set of dynamic and precise tasks, including pouring, throwing, and picking, demonstrating its effectiveness and robustness in real-world robotic manipulation.

SpeedTuning is a reinforcement learning framework designed to optimize both execution speed and task success for learned manipulation policies. The left panel presents a schematic of SpeedTuning augmenting an imitation learning policy, where it outputs a speed multiplier to modulate the execution of predicted actions. This speed policy is optimized using reinforcement learning.


The right panel illustrates performance on the Tea Bag Disposal task, with the upper half depicting speed versus task progress and the lower half showing the speed multipliers predicted at different stages. For critical actions, such as grasping the tea bag (yellow marker), the policy selects a lower speed (2x) to ensure precise timing, whereas for less demanding stages, such as the final phase of discarding the tea bag (purple marker), a higher speed (4x) is applied to expedite execution.
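To make the idea of a per-stage speed multiplier concrete, the following is a minimal sketch, not the authors' implementation. The page does not say how the multiplier is applied to the predicted actions, so the sketch assumes the actions are waypoints spaced one control period apart and retimes them by linear interpolation, the same operation the abstract names for the fixed-speed baseline. The function name retime_actions, the parameter dt, and the toy 9-waypoint trajectory are all hypothetical. Passing a single scalar multiplier reproduces fixed-speed interpolation.

```python
import numpy as np


def retime_actions(actions, multipliers, dt=1.0):
    """Resample waypoints so segment i is traversed multipliers[i] times faster.

    actions:     (T, D) array of waypoints, nominally dt seconds apart.
    multipliers: scalar (fixed speed) or (T-1,) array, one value per segment.
    Returns (resampled_actions, nominal_duration, new_duration).
    This is an illustrative assumption about how a speed multiplier
    could be applied, not the method described on this page.
    """
    actions = np.asarray(actions, dtype=float)
    n_seg = actions.shape[0] - 1
    if n_seg < 1:
        raise ValueError("need at least two waypoints")
    m = np.broadcast_to(np.asarray(multipliers, dtype=float), (n_seg,))
    if np.any(m <= 0):
        raise ValueError("speed multipliers must be positive")

    # Time at which each original waypoint is reached after the speed-up.
    new_times = np.concatenate([[0.0], np.cumsum(dt / m)])

    # Sample the faster trajectory at the original control period dt,
    # clamping the last sample to the end of the trajectory.
    n_out = int(np.ceil(new_times[-1] / dt - 1e-9)) + 1
    query = np.minimum(np.arange(n_out) * dt, new_times[-1])
    resampled = np.stack(
        [np.interp(query, new_times, actions[:, d]) for d in range(actions.shape[1])],
        axis=1,
    )
    return resampled, n_seg * dt, float(new_times[-1])


# Toy 1-D trajectory with 9 waypoints (8 segments): a hypothetical "precise"
# first half executed at 2x and a "less demanding" second half at 4x.
actions = np.arange(9.0).reshape(-1, 1)
multipliers = np.array([2.0] * 4 + [4.0] * 4)

resampled, nominal, new = retime_actions(actions, multipliers)
print(resampled[:, 0])          # [0. 2. 4. 8.]
print(round(nominal / new, 2))  # 2.67
```

In this toy case the eight nominal control periods shrink to 4/2 + 4/4 = 3, an overall speed-up of about 2.67x. That number comes only from the made-up trajectory above and is not a result reported for SpeedTuning.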

BibTeX

To be released by ICRA 2025