Multi-sample fusion with constrained feature transformation for robust speaker verification

Ming Cheung Cheung, Kwok Kwong Yiu, Man Wai Mak, Sun Yuan Kung

Research output: Contribution to conference (Paper)

2 Scopus citations

Abstract

This paper proposes a single-source multi-sample fusion approach to text-independent speaker verification. In conventional speaker verification systems, the scores obtained from a claimant's utterances are averaged and the resulting mean score is used for decision making. Instead of using an equal weight for all scores, this paper proposes assigning a different weight to each score, where the weights are made dependent on the difference between the score values and a speaker-dependent reference score obtained during enrollment. Because the fusion weights depend on the verification scores, a technique called constrained stochastic feature transformation is applied to minimize the mismatch between enrollment and verification data in order to enhance the scores' reliability. Experimental results based on the 2001 NIST evaluation set show that the proposed fusion approach outperforms the equal-weight approach by 22% in terms of equal error rate and 16% in terms of minimum detection cost.
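
The contrast between equal-weight averaging and score-dependent weighting can be illustrated with a minimal Python sketch. The abstract does not state the weighting function, so everything specific here is an assumption made for illustration: the function names, the exponential form of the weights, the `alpha` parameter, and the choice to give larger weights to scores that lie farther from the reference score. It is not the authors' formulation.

```python
import math


def equal_weight_fusion(scores):
    """Conventional fusion: the plain mean of the utterance scores."""
    return sum(scores) / len(scores)


def fusion_weights(scores, reference, alpha=1.0):
    """Hypothetical weights that depend on each score's distance from a
    speaker-dependent reference score (assumed form, not from the paper).

    Each raw weight is exp(alpha * (s - reference)^2); the weights are then
    normalized so that they sum to 1. With alpha = 0 all weights are equal.
    """
    raw = [math.exp(alpha * (s - reference) ** 2) for s in scores]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_fusion(scores, reference, alpha=1.0):
    """Fuse the scores as a weighted sum using the hypothetical weights."""
    weights = fusion_weights(scores, reference, alpha)
    return sum(w * s for w, s in zip(weights, scores))


if __name__ == "__main__":
    # Made-up scores for three utterances and a made-up reference score.
    utterance_scores = [0.5, 1.5, 2.5]
    reference_score = 1.0
    print("equal-weight:", equal_weight_fusion(utterance_scores))
    print("weighted:    ", weighted_fusion(utterance_scores, reference_score))
```

Setting `alpha` to 0 in this sketch recovers the equal-weight mean, which makes the conventional approach a special case of the weighted one.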

Original language: English (US)
Pages: 1813-1816
Number of pages: 4
State: Published - Jan 1 2004
Event: 8th International Conference on Spoken Language Processing, ICSLP 2004 - Jeju, Jeju Island, Korea, Republic of
Duration: Oct 4 2004 to Oct 8 2004

Other

Other: 8th International Conference on Spoken Language Processing, ICSLP 2004
Country: Korea, Republic of
City: Jeju, Jeju Island
Period: 10/4/04 to 10/8/04

All Science Journal Classification (ASJC) codes

  • Language and Linguistics
  • Linguistics and Language

  • Cite this

    Cheung, M. C., Yiu, K. K., Mak, M. W., & Kung, S. Y. (2004). Multi-sample fusion with constrained feature transformation for robust speaker verification. 1813-1816. Paper presented at 8th International Conference on Spoken Language Processing, ICSLP 2004, Jeju, Jeju Island, Korea, Republic of.