Exploring Multitask and Transfer Learning Algorithms for Head Pose Estimation in Dynamic Multiview Scenarios

Elisa Ricci; Oswald Lanz
2017-01-01

Abstract

Considerable research progress in computer vision and multimodal analysis has made the examination of complex phenomena such as social interactions possible. An important cue for determining social interactions is the head pose of the interacting members. While most automated social interaction analysis methods have focused on round-table meetings, where head pose estimation (HPE) is easier since the captured faces are high-resolution and the analyzed targets are static (seated), recent works have examined unstructured meeting scenes such as cocktail parties. Unstructured scenes, where targets are free to move, provide additional cues such as proxemics for behavior analysis, but they are also challenging to analyze owing to (i) the need for distant, large field-of-view cameras, which can capture only low-resolution faces of targets, and (ii) the variations in targets' facial appearance as they move, caused by changing camera perspective and scale. This chapter reviews recent works addressing HPE under target motion. In particular, we examine the use of transfer learning and multitask learning for HPE. Transfer learning is particularly useful when the training and test data have different attributes (e.g., the training data contain pose annotations for static targets, but the test data involve moving targets), while multitask learning can be explicitly designed to address facial appearance variations under motion. Exhaustive experiments performed with both methodologies are presented.
ISBN: 978-012809276-7

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/312972