Growing numbers of 3D scenes in online repositories provide new opportunities for data-driven scene understanding, editing, and synthesis. Despite the plethora of data now available online, most of it cannot be effectively used for data-driven applications because it lacks consistent segmentations, category labels, and/or functional groupings required for co-analysis. In this paper, we develop algorithms that infer such information via parsing with a probabilistic grammar learned from examples. First, given a collection of scene graphs with consistent hierarchies and labels, we train a probabilistic hierarchical grammar to represent the distributions of shapes, cardinalities, and spatial relationships of semantic objects within the collection. Then, we use the learned grammar to parse new scenes to assign them segmentations, labels, and hierarchies consistent with the collection. During experiments with these algorithms, we find that: they work effectively for scene graphs for indoor scenes commonly found online (bedrooms, classrooms, and libraries); they outperform alternative approaches that consider only shape similarities and/or spatial relationships without hierarchy; they require relatively small sets of training data; they are robust to moderate over-segmentation in the inputs; and, they can robustly transfer labels from one data set to another. As a result, the proposed algorithms can be used to provide consistent hierarchies for large collections of scenes within the same semantic class.
All Science Journal Classification (ASJC) codes
- Computer Graphics and Computer-Aided Design
- Scene collections
- Scene understanding