Training Virtual Robots in Realistic Simulators

Erik Wijmans / Georgia Institute of Technology

September 2, 2021

Abstract: Recently, computer vision research has shifted from static vision tasks, e.g., object detection, toward Embodied AI, e.g., robot navigation. In this talk, I will focus on the task of PointGoal navigation (PointNav), in which an embodied agent (a virtual robot) must navigate to a point specified relative to its initial location, in an unknown environment and without a map. I will ask and answer: given only a depth camera and a localization sensor, is this task learnable with generic tools, namely model-free reinforcement learning (RL) and generic neural networks? To answer this question, my collaborators and I developed a new distributed system for RL designed to meet the demands of training in realistic simulation. We showed that the task is entirely learnable by training an agent for the equivalent of 80 years of human experience. We then designed a new simulation paradigm built around large-batch simulation, in which the simulator steps many agents in many environments at once and is responsible for its own parallelization, reducing training wall-clock time from 6 GPU-months to 36 GPU-hours, an over 100x improvement.
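The large-batch idea above can be illustrated with a minimal sketch: instead of many simulator processes each stepping one environment, a single simulator steps every agent in every environment with one batched call. The class and method names below are illustrative stand-ins, not the actual Habitat API, and the "environment" is a toy 2D PointNav task.

```python
import numpy as np

class BatchedPointNavSim:
    """Toy stand-in for a large-batch simulator: one step() call advances
    all agents in all environments at once. The real simulator handles its
    own parallelization internally; names here are illustrative only."""

    def __init__(self, num_envs, seed=0):
        self.num_envs = num_envs
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        # Agent positions and goals for every environment, one array each.
        self.pos = np.zeros((self.num_envs, 2))
        self.goal = self.rng.uniform(-5.0, 5.0, size=(self.num_envs, 2))
        return self._obs()

    def _obs(self):
        # Observation: vector from agent to goal (stand-in for GPS+compass).
        return self.goal - self.pos

    def step(self, actions):
        # actions: (num_envs, 2) displacements from a batched policy.
        # A single call advances *all* environments -- the key idea.
        self.pos += np.clip(actions, -0.25, 0.25)
        dist = np.linalg.norm(self.goal - self.pos, axis=1)
        return self._obs(), -dist, dist < 0.2

# A trivial batched "policy": head straight toward each goal.
sim = BatchedPointNavSim(num_envs=8)
obs = sim.reset()
for _ in range(200):
    obs, reward, done = sim.step(obs)  # obs already points at the goal
    if done.all():
        break
print(done.all())  # all 8 agents reach their goals
```

In a real training loop, the actions would come from one batched forward pass of the policy network, so both simulation and inference amortize their cost over the whole batch.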

Bio: Erik is a PhD student at the Georgia Institute of Technology, advised by Irfan Essa and Dhruv Batra. His work focuses on computer vision and its applications to artificial intelligence, with the long-term goal of developing fundamental techniques, algorithms, and large-scale systems for robotic assistants, with particular emphasis on sim2real transfer and embodied AI. He has worked on the AI Habitat platform for Embodied AI and was a lead organizer of the Embodied AI workshops at CVPR 2020 and 2021.