Multi-channel i-vector combination for robust speaker verification in multi-room domestic environments
Brutti, Alessio
2016-01-01
Abstract
In this work we address speaker verification in domestic environments where multiple rooms are monitored by a set of distributed microphones. In particular, we focus on the mismatch among three stages: the training of the total variability feature extraction hyper-parameters, the enrolment stage, which occurs at a fixed position in the home, and the test phase, which may take place anywhere in the apartment. Building on previous work that introduced a position-independent multi-channel verification system, we investigate different i-vector combination strategies to attenuate the effects of these mismatch sources. The proposed methods implicitly select the microphones in the room where the speaker is located, without any knowledge of the speaker position. An experimental analysis on a simulated multi-channel, multi-room reverberant dataset shows that the proposed solutions are robust to changes in speaker position and orientation, achieving performance close to an upper bound based on knowledge of the speaker location.
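The abstract does not spell out the combination strategies themselves, so the following is only an illustrative sketch of two generic ways that per-microphone i-vectors could be combined before a verification decision. Everything in it is an assumption made for illustration rather than the method of the paper: the use of cosine scoring, the helper names (`cosine_score`, `average_ivectors`, `score_by_ivector_average`, `score_by_max_channel`), and the three-dimensional toy vectors.

```python
import math


def cosine_score(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def average_ivectors(ivectors):
    """Element-wise mean of a list of equal-length i-vectors."""
    n = len(ivectors)
    return [sum(column) / n for column in zip(*ivectors)]


def score_by_ivector_average(enrol, channel_ivectors):
    """Hypothetical strategy 1: average all channels, then score once."""
    return cosine_score(enrol, average_ivectors(channel_ivectors))


def score_by_max_channel(enrol, channel_ivectors):
    """Hypothetical strategy 2: score every channel, keep the best match.

    Taking the maximum favours the channel closest to the enrolment
    i-vector without using any explicit speaker position.
    """
    return max(cosine_score(enrol, iv) for iv in channel_ivectors)


# Toy data (made up): one enrolment i-vector and three test-channel i-vectors,
# where only the first channel resembles the enrolment conditions.
enrol = [1.0, 0.0, 0.0]
channels = [
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

avg_score = score_by_ivector_average(enrol, channels)
max_score = score_by_max_channel(enrol, channels)
print(f"average-then-score: {avg_score:.3f}")
print(f"max over channels:  {max_score:.3f}")
```

In this toy case the max-over-channels score is higher than the average-then-score value, because the two mismatched channels dilute the averaged i-vector. Whether either behaviour matches the strategies evaluated in the paper cannot be determined from the abstract alone.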