Abstract
Graphs are a useful abstraction of image content. Not only can graphs represent details about individual objects in a scene but they can capture the interactions between pairs of objects. We present a method for training a convolutional neural network such that it takes in an input image and produces a full graph definition. This is done end-to-end in a single stage with the use of associative embeddings. The network learns to simultaneously identify all of the elements that make up a graph and piece them together. We benchmark on the Visual Genome dataset, and demonstrate state-of-the-art performance on the challenging task of scene graph generation.
| Original language | English (US) |
|---|---|
| Pages (from-to) | 2172-2181 |
| Number of pages | 10 |
| Journal | Advances in Neural Information Processing Systems |
| Volume | 2017-December |
| State | Published - 2017 |
| Externally published | Yes |
| Event | 31st Annual Conference on Neural Information Processing Systems, NIPS 2017 - Long Beach, United States Duration: Dec 4 2017 → Dec 9 2017 |
All Science Journal Classification (ASJC) codes
- Computer Networks and Communications
- Information Systems
- Signal Processing
Fingerprint
Dive into the research topics of 'Pixels to graphs by associative embedding'. Together they form a unique fingerprint.Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver