Learning multi-step manipulation using symmetry-aware reinforcement learning
2026
Although many existing grasping models pick objects reliably in most cases, challenging scenarios persist in industrial automation where objects are difficult to grasp, such as when they are positioned in corners, occluded by other items, or tightly clustered. These challenges are prevalent in smart manufacturing and logistics systems, where current robotic systems often require costly human intervention to handle difficult cases. To address this automation gap, we propose a multi-step manipulation approach that combines non-prehensile and prehensile actions through two collaborative policies: a main policy responsible for picking, and a supporting policy that assists by rearranging the scene when the main policy lacks confidence. In particular, we leverage equivariant neural networks to encode SE(2) symmetries in both policies, enabling sample-efficient reinforcement learning from visual observations. Using a symmetry-aware DQN-based approach, we first train the main policy, then train the supporting policy to maximize the main policy's confidence through a value-based reward function. Our method significantly outperforms single-policy approaches, achieving 1.3-2× better performance across five manipulation tasks in simulation. Crucially, we demonstrate efficient learning directly on robot hardware (1-2 hours) for three tasks, showcasing the practical viability of our approach for smart industry applications where rapid deployment and adaptation are essential.
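The interaction between the two policies can be illustrated with a minimal Python sketch. It is not the implementation from this work: it assumes that the main policy's confidence is its highest Q-value over a pixel-wise action map, that the supporting policy acts when this confidence falls below a fixed threshold, and that the supporting policy's reward is the main policy's confidence in the resulting scene. The names `main_confidence`, `select_policy`, `support_reward`, and the threshold value 0.5 are hypothetical.

```python
import numpy as np


def main_confidence(q_main: np.ndarray) -> float:
    """Assumed confidence measure: the highest Q-value the main
    (picking) policy assigns to any action in the current scene."""
    return float(np.max(q_main))


def select_policy(q_main: np.ndarray, threshold: float = 0.5) -> str:
    """Let the main policy pick when it is confident enough; otherwise
    hand over to the supporting policy to rearrange the scene.
    The threshold is a hypothetical value chosen for illustration."""
    return "main" if main_confidence(q_main) >= threshold else "support"


def support_reward(q_main_next: np.ndarray) -> float:
    """Assumed value-based reward for the supporting policy: the main
    policy's confidence evaluated on the scene after the supporting
    (non-prehensile) action has been executed."""
    return main_confidence(q_main_next)


# Toy 2x2 Q-maps standing in for pixel-wise action values.
q_cluttered = np.array([[0.10, 0.20], [0.30, 0.25]])  # no promising pick
q_after_push = np.array([[0.10, 0.90], [0.20, 0.30]])  # one clear pick

print(select_policy(q_cluttered))    # supporting policy acts first
print(support_reward(q_after_push))  # reward earned by that push
print(select_policy(q_after_push))   # main policy can now pick
```

Under these assumptions, a push that exposes a graspable object raises the main policy's maximum Q-value, so the supporting policy is rewarded exactly when it makes the scene easier to pick from.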