Robust speaker verification over the telephone by feature recuperation

X. Li, M. W. Mak, Sun-Yuan Kung

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Scopus citations

Abstract

The performance of speaker verification systems is often compromised in real-world environments. For example, variations in handset characteristics can cause severe performance degradation. This paper presents a novel method to overcome this problem by using a non-linear handset mapper. Under this method, a mapper is constructed by training an elliptical basis function network using distorted speech features as inputs and the corresponding clean features as the desired outputs. During feature recuperation, clean features are recovered by feeding the distorted features to the feature mapper. The recovered features are then presented to a speaker model as if they were derived from clean speech. Experimental evaluations based on 258 speakers of the TIMIT and NTIMIT corpora suggest that the feature mappers improve the verification performance remarkably.
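The mapping idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`fit_mapper`, `recuperate`), the k-means initialisation, the diagonal (axis-aligned) covariances, the `width_scale` parameter, and the synthetic `tanh` distortion are all assumptions made here for illustration, and the data are random numbers rather than TIMIT/NTIMIT features.

```python
import numpy as np

def _activations(x, centres, variances):
    # Gaussian basis functions with per-dimension variances (axis-aligned
    # ellipses), plus a constant column that acts as the output bias.
    d2 = (((x[:, None, :] - centres[None, :, :]) ** 2) / variances[None, :, :]).sum(-1)
    return np.hstack([np.exp(-0.5 * d2), np.ones((len(x), 1))])

def fit_mapper(distorted, clean, n_basis=16, n_iter=20, width_scale=2.0, seed=0):
    """Fit a toy elliptical-basis-function mapper: distorted -> clean features."""
    rng = np.random.default_rng(seed)
    centres = distorted[rng.choice(len(distorted), n_basis, replace=False)].copy()
    for _ in range(n_iter):  # plain k-means on the distorted features (assumption)
        d2 = ((distorted[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(1)
        for j in range(n_basis):
            if np.any(labels == j):
                centres[j] = distorted[labels == j].mean(0)
    floor = 1e-3 * distorted.var(0) + 1e-8  # keeps every variance strictly positive
    variances = np.empty_like(centres)
    for j in range(n_basis):
        members = distorted[labels == j]
        spread = members.var(0) if len(members) > 1 else distorted.var(0)
        variances[j] = width_scale * spread + floor
    # Output weights by linear least squares against the clean targets.
    phi = _activations(distorted, centres, variances)
    weights, *_ = np.linalg.lstsq(phi, clean, rcond=None)
    return centres, variances, weights

def recuperate(distorted, mapper):
    """Recover estimated clean features by passing distorted ones through the mapper."""
    centres, variances, weights = mapper
    return _activations(distorted, centres, variances) @ weights

# Synthetic demonstration: a made-up non-linear "handset" acting on 2-D features.
rng = np.random.default_rng(1)
clean = rng.normal(size=(3000, 2))
distorted = np.tanh(0.8 * clean) + 1.0  # hypothetical distortion, not a real channel
train, test = slice(0, 2000), slice(2000, 3000)

mapper = fit_mapper(distorted[train], clean[train])
recovered = recuperate(distorted[test], mapper)

mse_before = np.mean((distorted[test] - clean[test]) ** 2)
mse_after = np.mean((recovered - clean[test]) ** 2)
print(f"MSE before mapping: {mse_before:.3f}, after: {mse_after:.3f}")
```

In a verification pipeline of the kind the abstract describes, the output of `recuperate` would be passed to the speaker model in place of the distorted features. The sketch stops at the feature level and does not model speakers or scoring.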

Original language: English (US)
Title of host publication: Proceedings of 2001 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2001
Pages: 433-436
Number of pages: 4
State: Published - Dec 1 2001
Event: 2001 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2001 - Hong Kong, Hong Kong
Duration: May 2 2001 to May 4 2001

All Science Journal Classification (ASJC) codes

  • Computer Science (all)

Cite this

Li, X., Mak, M. W., & Kung, S-Y. (2001). Robust speaker verification over the telephone by feature recuperation. In Proceedings of 2001 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2001 (pp. 433-436).