Model adaptation is important for the analysis of audio-visual data from body worn cameras in order to cope with rapidly changing scene conditions, varying object appearance and limited training data. In this paper, we propose a new approach for the on-line and unsupervised adaptation of deep-learning models for audio-visual target re-identification. Specifically, we adapt each mono-modal model using the unsupervised labelling provided by the other modality. To limit the detrimental effects of erroneous labels, we use a regularisation term based on the Kullback-Leibler divergence between the initial model and the one being adapted. The proposed adaptation strategy complements common audio-visual late fusion approaches and is beneficial also when one modality is no longer reliable. We show the contribution of the proposed strategy in improving the overall re-identification performance on a challenging public dataset captured with body worn cameras.
Unsupervised cross-modal deep-model adaptation for audio-visual re-identification with wearable cameras
Brutti, Alessio;
2017-01-01
Abstract
Model adaptation is important for the analysis of audio-visual data from body worn cameras in order to cope with rapidly changing scene conditions, varying object appearance and limited training data. In this paper, we propose a new approach for the on-line and unsupervised adaptation of deep-learning models for audio-visual target re-identification. Specifically, we adapt each mono-modal model using the unsupervised labelling provided by the other modality. To limit the detrimental effects of erroneous labels, we use a regularisation term based on the Kullback-Leibler divergence between the initial model and the one being adapted. The proposed adaptation strategy complements common audio-visual late fusion approaches and is beneficial also when one modality is no longer reliable. We show the contribution of the proposed strategy in improving the overall re-identification performance on a challenging public dataset captured with body worn cameras.| File | Dimensione | Formato | |
|---|---|---|---|
|
Brutti_Unsupervised_Cross-Modal_Deep-Model_ICCV_2017_paper.pdf
accesso aperto
Tipologia:
Documento in Post-print
Licenza:
Dominio pubblico
Dimensione
666.64 kB
Formato
Adobe PDF
|
666.64 kB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.
