Maximum likelihood and maximum a posteriori adaptation for distributed speaker recognition systems

Chin Hung Sit, Man Wai Mak, Sun Yuan Kung

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

We apply the ETSI's distributed speech recognition (DSR) standard to speaker verification over telephone networks and investigate how extracting spectral features from different stages of the ETSI's front-end affects speaker verification performance. We also evaluate two approaches to creating speaker models, namely maximum likelihood (ML) and maximum a posteriori (MAP), in the context of distributed speaker verification. In the former, random vectors with variances depending on the distance between unquantized training vectors and their closest code vector are added to the vector-quantized feature vectors extracted from client speech. The resulting vectors are then used for creating speaker-dependent GMMs based on ML techniques. In the latter, vector-quantized feature vectors extracted from client speech are used for adapting a universal background model to speaker-dependent GMMs. Experimental results based on 145 speakers from the SPIDRE corpus show that quantized feature vectors extracted from the server side can be used directly for MAP adaptation. Results also show that the best-performing system is based on the ML approach. However, the ML approach is sensitive to the number of input dimensions of the training data.
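
The two model-building approaches described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: a single full-vector codebook (rather than the split vector quantization of the ETSI front-end), a per-codeword noise variance estimated as the mean squared deviation of unquantized training vectors from their nearest code vector, mean-only MAP adaptation of a diagonal-covariance GMM, and a relevance factor of 16. All function and parameter names (`per_codeword_variance`, `add_quantization_noise`, `map_adapt_means`, `relevance`) are hypothetical.

```python
import numpy as np

def nearest_codeword(x, codebook):
    """Index of the closest code vector (squared Euclidean) for each row of x."""
    d2 = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def per_codeword_variance(train, codebook):
    """Assumed noise model: per-dimension mean squared deviation of the
    unquantized training vectors from the code vector they map to."""
    idx = nearest_codeword(train, codebook)
    var = np.zeros_like(codebook)
    for k in range(len(codebook)):
        members = train[idx == k]
        if len(members) > 0:
            var[k] = ((members - codebook[k]) ** 2).mean(axis=0)
    return var

def add_quantization_noise(quantized, codebook, var, rng):
    """ML route: add zero-mean Gaussian noise to quantized vectors before
    training a speaker GMM (the GMM training itself is not shown)."""
    idx = nearest_codeword(quantized, codebook)
    noise = rng.standard_normal(quantized.shape) * np.sqrt(var[idx])
    return quantized + noise

def map_adapt_means(x, weights, means, variances, relevance=16.0):
    """MAP route: adapt the means of a diagonal-covariance background GMM
    towards the client data x (mean-only adaptation is an assumption)."""
    diff = x[:, None, :] - means[None, :, :]
    log_p = -0.5 * (diff ** 2 / variances + np.log(2 * np.pi * variances)).sum(axis=2)
    log_p += np.log(weights)
    log_p -= log_p.max(axis=1, keepdims=True)      # numerical stability
    post = np.exp(log_p)
    post /= post.sum(axis=1, keepdims=True)        # component posteriors
    n = post.sum(axis=0)                           # soft counts per component
    ex = post.T @ x / np.maximum(n, 1e-10)[:, None]
    alpha = (n / (n + relevance))[:, None]         # data-dependent mixing weight
    return alpha * ex + (1.0 - alpha) * means
```

In this sketch, components of the background model that see little client data keep means close to their original values, while well-observed components move towards the client statistics.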

Original language: English (US)
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Editors: David Zhang, Anil K. Jain
Publisher: Springer Verlag
Pages: 640-647
Number of pages: 8
ISBN (Print): 3540221468, 9783540221463
State: Published - Jan 1 2004

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3072
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)
