Perceptually-motivated Environment-specific Speech Enhancement

Jiaqi Su, Adam Finkelstein, Zeyu Jin

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Abstract

This paper introduces a deep learning approach to enhance speech recordings made in a specific environment. A single neural network learns to ameliorate several types of recording artifacts, including noise, reverberation, and non-linear equalization. The method relies on a new perceptual loss function that combines adversarial loss with spectrogram features. Both subjective and objective evaluations show that the proposed approach improves on state-of-the-art baseline methods.
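
The paper itself is not reproduced here, but as a rough illustration of the kind of objective the abstract describes (an adversarial term combined with a spectrogram-feature term), the following PyTorch sketch shows one plausible formulation. All names (log_spectrogram, perceptual_loss), the STFT parameters, and the term weights are hypothetical assumptions for illustration, not details taken from the paper.

    import torch
    import torch.nn.functional as F

    def log_spectrogram(x, n_fft=1024, hop=256):
        # Log-magnitude STFT features; n_fft and hop are illustrative
        # choices, not the parameters used in the paper.
        window = torch.hann_window(n_fft, device=x.device)
        spec = torch.stft(x, n_fft, hop_length=hop, window=window,
                          return_complex=True).abs()
        return torch.log(spec.clamp(min=1e-5))

    def perceptual_loss(enhanced, clean, disc_logits,
                        spec_weight=1.0, adv_weight=1.0):
        # Spectrogram-feature term: L1 distance between log-magnitude
        # spectrograms of the enhanced and clean signals.
        spec_term = F.l1_loss(log_spectrogram(enhanced),
                              log_spectrogram(clean))
        # Adversarial term: the enhancer is rewarded when the
        # discriminator scores its output as "real" clean speech.
        adv_term = F.binary_cross_entropy_with_logits(
            disc_logits, torch.ones_like(disc_logits))
        return spec_weight * spec_term + adv_weight * adv_term

Here disc_logits would come from a discriminator evaluated on the enhanced waveform; the relative weights, the spectrogram representation, and the discriminator architecture are all left unspecified by the abstract.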

Original language: English (US)
Title of host publication: 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 7015-7019
Number of pages: 5
ISBN (Electronic): 9781479981311
DOIs
State: Published - May 2019
Event: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Brighton, United Kingdom
Duration: May 12, 2019 – May 17, 2019

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2019-May
ISSN (Print): 1520-6149

Conference

Conference: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019
Country/Territory: United Kingdom
City: Brighton
Period: 5/12/19 – 5/17/19

All Science Journal Classification (ASJC) codes

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Keywords

  • denoising
  • de-reverberation
  • equalization matching
  • perceptual loss
  • speech enhancement
