Speaker Normalization with a Mixture of Recurrent Networks

Trentin, Edmondo; Giuliani, Diego

This work introduces a multiple-model connectionist architecture, based on a mixture of recurrent neural networks, to address the problem of speaker adaptation in the acoustic feature domain (i.e., speaker normalization). Normalization is applied to a speaker-independent (SI) speech recognition system based on continuous-density hidden Markov models. The technique for combining multiple recurrent models is discussed. Recognition experiments on a large-dictionary continuous speech task show that the proposed architecture tangibly improves recognition performance, yielding a 21.9% reduction in word error rate.