Supervised non-negative matrix factorization for audio source separation

Pablo Sprechmann, Alex M. Bronstein, Guillermo Sapiro

Research output: Chapter in Book/Report/Conference proceeding > Chapter

6 Scopus citations

Abstract

Source separation is a widely studied problem in signal processing. Despite the steady progress reported in the literature, it is still considered a significant challenge. This chapter first reviews the use of non-negative matrix factorization (NMF) algorithms for solving source separation problems and then proposes a new approach to supervised training in NMF. Matrix factorization methods have received considerable attention in recent years in the audio processing community, producing particularly good results in source separation. Traditionally, NMF algorithms consist of two separate stages: a training stage, in which a generative model is learned, and a testing stage, in which the pre-learned model is used in a high-level task such as enhancement, separation, or classification. As an alternative, we propose a task-supervised NMF method that adapts the basis spectra learned in the first stage to improve performance on the specific task used in the second stage. We cast this problem as a bilevel optimization program that is solved efficiently via stochastic gradient descent. The proposed approach is general enough to handle sparsity priors on the activations and to allow non-Euclidean data terms such as β-divergences. The framework is evaluated on speech enhancement.
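
For orientation, the conventional two-stage pipeline that the abstract contrasts against can be sketched as follows. This is a minimal illustration and not the chapter's task-supervised method: it assumes magnitude spectrograms as input, a Kullback-Leibler data term (the β = 1 case of the β-divergence), standard multiplicative updates, and an l1 sparsity penalty on the activations. The function names `nmf_train`, `nmf_activations`, and `separate` are hypothetical, chosen only for this example.

```python
import numpy as np

EPS = 1e-10  # guards against division by zero


def nmf_train(V, rank, n_iter=200, seed=0):
    """Training stage: learn basis spectra W (freq x rank) from a
    non-negative spectrogram V (freq x frames) with KL multiplicative updates."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((rank, V.shape[1])) + EPS
    for _ in range(n_iter):
        WH = W @ H + EPS
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + EPS)
        WH = W @ H + EPS
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + EPS)
        # Normalise each basis vector to unit l1 norm and rescale H to match.
        scale = W.sum(axis=0, keepdims=True) + EPS
        W /= scale
        H *= scale.T
    return W


def nmf_activations(V, W, sparsity=0.0, n_iter=200, seed=0):
    """Testing stage: with W fixed, estimate activations H >= 0 that
    minimise KL(V | WH) + sparsity * sum(H)."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        WH = W @ H + EPS
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + sparsity + EPS)
    return H


def separate(V_mix, W_speech, W_noise, sparsity=0.1):
    """Decompose a mixture on the concatenated dictionary and split it
    with a Wiener-like soft mask."""
    W = np.hstack([W_speech, W_noise])
    H = nmf_activations(V_mix, W, sparsity=sparsity)
    k = W_speech.shape[1]
    S = W_speech @ H[:k]   # speech part of the model
    N = W_noise @ H[k:]    # noise part of the model
    mask = S / (S + N + EPS)
    return mask * V_mix, (1.0 - mask) * V_mix
```

In this baseline the dictionaries are learned purely generatively and then frozen. As the abstract describes it, the task-supervised alternative would instead adapt the basis spectra so that the output of the second stage (here, the masked speech estimate) improves on the target task.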

Original language: English (US)
Title of host publication: Applied and Numerical Harmonic Analysis
Publisher: Springer International Publishing
Pages: 407-420
Number of pages: 14
Edition: 9783319201870
State: Published - 2015
Externally published: Yes

Publication series

Name: Applied and Numerical Harmonic Analysis
Number: 9783319201870
ISSN (Print): 2296-5009
ISSN (Electronic): 2296-5017

All Science Journal Classification (ASJC) codes

  • Applied Mathematics

Keywords

  • Bilevel optimization
  • NMF
  • Source separation
  • Speech enhancement
  • Supervised learning
  • Task-specific learning
