Abstract:
The world around us, and our understanding of it, is rich in compositional structure: from atoms and their interactions to objects and entities in our environments. How can we learn models of the world that take this structure into account and generalize to new compositions in systematic ways? This talk focuses on an emerging class of slot-based neural architectures that can discover, represent, and reason about abstract entities from perceptual input alone. Taking our recent work on Slot Attention as an example, I will explain the challenges for object discovery and how attention-based routing provides an elegant solution to mapping from low-level perceptual features to high-level object-centric abstractions. With Slot Attention for Video (SAVi), we extend this framework to temporally consistent modeling of objects over time and show how information about object motion can help the model find the right decomposition of a scene into its constituent components. Finally, I will discuss some of the open challenges that remain for developing and deploying structured world models.
Bio: Thomas Kipf is a Research Scientist at Google Brain in Amsterdam. His research focuses on developing machine learning models that can reason about the rich structure of the physical world, using structured abstractions such as objects, entities, and their relations. He obtained his PhD from the University of Amsterdam with a thesis on “Deep Learning with Graph-Structured Representations”, supervised by Max Welling. His work received a best paper award at ESWC 2018 and he was recently elected as an ELLIS Scholar in “Semantic, Symbolic and Interpretable Machine Learning”.