TY - GEN
T1 - White Box Search over Audio Synthesizer Parameters
AU - Yang, Yuting
AU - Jin, Zeyu
AU - Barnes, Connelly
AU - Finkelstein, Adam
N1 - Publisher Copyright: © Y. Yang, Z. Jin, C. Barnes, A. Finkelstein.
PY - 2023
Y1 - 2023
AB - Synthesizer parameter inference searches for a set of patch connections and parameters to generate audio that best matches a given target sound. Such optimization tasks benefit from access to accurate gradients. However, typical audio synths incorporate components with discontinuities - such as sawtooth or square waveforms, or a categorical search over discrete parameters like a choice among such waveforms - that thwart conventional automatic differentiation (AD). AD libraries in frameworks like TensorFlow and PyTorch typically ignore discontinuities, providing incorrect gradients at such locations. Thus, SOTA parameter inference methods avoid differentiating the synth directly, and resort to workarounds such as genetic search or neural proxies. Instead, we adapt and extend recent computer graphics methods for differentiable rendering to directly differentiate the synth as a white box program, and thereby optimize its parameters using gradient descent. We evaluate our framework using a generic FM synth with ADSR, noise, and IIR filters, adapting its parameters to match a variety of target audio clips. Our method outperforms baselines in both quantitative and qualitative evaluations.
UR - http://www.scopus.com/inward/record.url?scp=85199651807&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85199651807&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85199651807
T3 - 24th International Society for Music Information Retrieval Conference, ISMIR 2023 - Proceedings
SP - 190
EP - 196
BT - 24th International Society for Music Information Retrieval Conference, ISMIR 2023 - Proceedings
A2 - Sarti, Augusto
A2 - Antonacci, Fabio
A2 - Sandler, Mark
A2 - Bestagini, Paolo
A2 - Dixon, Simon
A2 - Liang, Beici
A2 - Richard, Gael
A2 - Pauwels, Johan
PB - International Society for Music Information Retrieval
T2 - 24th International Society for Music Information Retrieval Conference, ISMIR 2023
Y2 - 5 November 2023 through 9 November 2023
ER -