TY - JOUR
T1 - Discrete object generation with reversible inductive construction
AU - Seff, Ari
AU - Zhou, Wenda
AU - Damani, Farhan
AU - Doyle, Abigail
AU - Adams, Ryan P.
N1 - Funding Information:
We would like to thank Wengong Jin, Michael Galvin, Dieterich Lawson, and members of the Princeton Laboratory for Intelligent Probabilistic Systems for valuable discussion and feedback. This work was partially funded by the Alfred P. Sloan Foundation, NSF IIS-1421780, and the DataX Program at Princeton University through support from the Schmidt Futures Foundation. AS was supported by the Department of Defense through the National Defense Science and Engineering Graduate Fellowship (NDSEG) Program. We acknowledge computing resources from Columbia University’s Shared Research Computing Facility project, which is supported by NIH Research Facility Improvement Grant 1G20RR030893-01, and associated funds from the New York State Empire State Development, Division of Science Technology and Innovation (NYSTAR) Contract C090171, both awarded April 15, 2010.
Funding Information:
We would like to thank Wengong Jin, Michael Galvin, Dieterich Lawson, and members of the Princeton Laboratory for Intelligent Probabilistic Systems for valuable discussion and feedback. This work was partially funded by the Alfred P. Sloan Foundation, NSF IIS-1421780, and the DataX Program at Princeton University through support from the Schmidt Futures Foundation. AS was supported by the Department of Defense through the National Defense Science and Engineering Graduate Fellowship (NDSEG) Program. We acknowledge computing resources from Columbia University's Shared Research Computing Facility project, which is supported by NIH Research Facility Improvement Grant 1G20RR030893-01, and associated funds from the New York State Empire State Development, Division of Science Technology and Innovation (NYSTAR) Contract C090171, both awarded April 15, 2010.
Publisher Copyright:
© 2019 Neural information processing systems foundation. All rights reserved.
PY - 2019
Y1 - 2019
N2 - The success of generative modeling in continuous domains has led to a surge of interest in generating discrete data such as molecules, source code, and graphs. However, construction histories for these discrete objects are typically not unique and so generative models must reason about intractably large spaces in order to learn. Additionally, structured discrete domains are often characterized by strict constraints on what constitutes a valid object and generative models must respect these requirements in order to produce useful novel samples. Here, we present a generative model for discrete objects employing a Markov chain where transitions are restricted to a set of local operations that preserve validity. Building off of generative interpretations of denoising autoencoders, the Markov chain alternates between producing 1) a sequence of corrupted objects that are valid but not from the data distribution, and 2) a learned reconstruction distribution that attempts to fix the corruptions while also preserving validity. This approach constrains the generative model to only produce valid objects, requires the learner to only discover local modifications to the objects, and avoids marginalization over an unknown and potentially large space of construction histories. We evaluate the proposed approach on two highly structured discrete domains, molecules and Laman graphs, and find that it compares favorably to alternative methods at capturing distributional statistics for a host of semantically relevant metrics.
AB - The success of generative modeling in continuous domains has led to a surge of interest in generating discrete data such as molecules, source code, and graphs. However, construction histories for these discrete objects are typically not unique and so generative models must reason about intractably large spaces in order to learn. Additionally, structured discrete domains are often characterized by strict constraints on what constitutes a valid object and generative models must respect these requirements in order to produce useful novel samples. Here, we present a generative model for discrete objects employing a Markov chain where transitions are restricted to a set of local operations that preserve validity. Building off of generative interpretations of denoising autoencoders, the Markov chain alternates between producing 1) a sequence of corrupted objects that are valid but not from the data distribution, and 2) a learned reconstruction distribution that attempts to fix the corruptions while also preserving validity. This approach constrains the generative model to only produce valid objects, requires the learner to only discover local modifications to the objects, and avoids marginalization over an unknown and potentially large space of construction histories. We evaluate the proposed approach on two highly structured discrete domains, molecules and Laman graphs, and find that it compares favorably to alternative methods at capturing distributional statistics for a host of semantically relevant metrics.
UR - http://www.scopus.com/inward/record.url?scp=85090174193&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85090174193&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85090174193
SN - 1049-5258
VL - 32
JO - Advances in Neural Information Processing Systems
JF - Advances in Neural Information Processing Systems
T2 - 33rd Annual Conference on Neural Information Processing Systems, NeurIPS 2019
Y2 - 8 December 2019 through 14 December 2019
ER -