Common issues and human intervention in object detection from handcrafted features to deep learning: discussion

Lecca, Michela
2025-01-01

Abstract

Traditional object detection methods rely on handcrafted definitions of the relevant visual features and rules, whereas machine/deep learning methods learn both features and rules from a training set. The two workflows are often described as opposite: in the traditional framework, the visual features and rules for detecting the object of interest are provided as input, while in the machine/deep learning framework they are learned automatically from the data for the task considered and constitute the final trained model. In this work, we analyze the object detection recipe and show that the two approaches actually share three issues that require human supervision and ad hoc procedures: the design of an object model suitable for the context, devices, and task at hand; the achievement of detection robustness against factors such as noise, image quality, changes in geometry, and light variations; and the definition of an appropriate matching function. We also briefly review some common metrics for evaluating object detection performance, showing that human intervention is crucial in this task as well. Our analysis aims to foster a more informed use of object detection approaches and to stimulate new research on automating, where possible, the tasks that humans are still in charge of.
Files in this record:
File: JOSA_A_LeccaBianco_ObjDetection_Preprint (1).pdf
Access: authorized users only
Type: Pre-print document
License: Not public (private/restricted access)
Size: 2.62 MB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/364727