FFTNet: A real-time speaker-dependent neural vocoder

Zeyu Jin, Adam Finkelstein, Gautham J. Mysore, Jingwan Lu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

29 Scopus citations

Abstract

We introduce FFTNet, a deep learning approach for synthesizing audio waveforms. Our approach builds on the recent WaveNet project, which showed that it was possible to synthesize a natural-sounding audio waveform directly from a deep convolutional neural network. FFTNet offers two improvements over WaveNet. First, it is substantially faster, allowing for real-time synthesis of audio waveforms. Second, when used as a vocoder, the resulting speech sounds more natural, as measured via a 'mean opinion score' test.

Original language: English (US)
Title of host publication: 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2251-2255
Number of pages: 5
ISBN (Print): 9781538646588
DOI: https://doi.org/10.1109/ICASSP.2018.8462431
State: Published - Sep 10 2018
Event: 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Calgary, Canada
Duration: Apr 15 2018 to Apr 20 2018

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2018-April
ISSN (Print): 1520-6149

Other

Other: 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018
Country: Canada
City: Calgary
Period: 4/15/18 to 4/20/18

All Science Journal Classification (ASJC) codes

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Keywords

  • FFTNet
  • Neural networks
  • Vocoder
  • WaveNet


Cite this

Jin, Z., Finkelstein, A., Mysore, G. J., & Lu, J. (2018). FFTNet: A real-time speaker-dependent neural vocoder. In 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings (pp. 2251-2255). [8462431] (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2018-April). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP.2018.8462431