A dataset for illuminant- and device-invariant colour barcode decoding with cameras

Lecca, Michela
2024-01-01

Abstract

Barcodes are visual representations of data, widely used in commerce and administration to compactly encode information about objects, services, and people. Specifically, a barcode is an image composed of parallel lines of different widths, spacings and sizes. Generally, the lines are dark (usually black) on a bright background (usually white), or vice versa. Thanks to this representation, barcodes can be detected and decoded robustly under changes of light and noise. Using several colours for the lines is nevertheless attractive because it increases the barcode's data capacity. Colour barcodes remain a challenge today, even though numerous studies on the topic were conducted between 1990 and 2022. The main open problem is the development of an optical technology able to decode colour sequences regardless of the ambient light, the acquisition and printing or visualisation device, and the physical support on which the barcode is printed or displayed. To the best of our knowledge, the studies currently available in the literature do not provide the experimental data on which they are based, nor are there online databases that can be used for further studies or for training data analysis procedures based on artificial intelligence techniques. To fill this gap and foster further research on this technology, we built COCO-10, a public dataset of colour barcode images intended as a testbed for the development and testing of colour barcode decoding algorithms, taking into account the colour variability due to the light, to the printer and camera gamuts, and to the quality of the paper on which the barcode is printed. COCO-10 contains 5400 images of 150 colour barcodes, each one printed on two white papers of different density using different printers and acquired under six illuminations by three smartphone cameras. For each colour barcode image, a mask identifying the region occupied by the barcode is also released.
The 150 colour barcodes were generated by colouring the lines of black-and-white barcodes with colours randomly selected from a palette of ten colours that includes both warm and cool hues. The name COCO-10 simply refers to the fact that the dataset contains COlor BarCOdes with 10 possible colours for each line. We also provide a set of 300 images created as follows. The 150 COCO-10 colour barcodes were synthetically superimposed on 150 cluttered backgrounds, resulting in 150 images. The first 75 (group 1) were printed on thick paper, the others (group 2) on plain paper. Each group was further subdivided into 3 subgroups of 25 images, each of which was captured by 2 smartphone cameras under one of the 6 illuminants mentioned above, giving 150 × 2 = 300 images. We also provide masks for these images. These images are intended as a benchmark for testing the accuracy of barcode decoding algorithms, bearing in mind that the performance of these algorithms may be influenced by the accuracy of the preceding detection of the barcodes against the background. In total, COCO-10 contains 11700 images: the 300 synthetic images of the colour barcodes displayed on white and cluttered backgrounds, the 5700 real-world images of the colour barcodes printed on white paper and with cluttered backgrounds, and their corresponding 5700 masks. Finally, we highlight that COCO-10 can also be used for developing and testing algorithms for gamut and tone mapping, machine colour constancy, and colour correction.
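The image counts quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, and it rests on two readings of the abstract that are assumptions, namely that each of the two papers is paired with one printer, and that the 300 synthetic images consist of the 150 barcodes on a white background plus the same 150 on cluttered backgrounds.

```python
# Figures taken from the abstract; all names here are hypothetical.
N_BARCODES = 150
N_PAPERS = 2             # assumption: each paper is paired with one printer
N_ILLUMINANTS = 6
N_CAMERAS_WHITE = 3      # cameras used for barcodes printed on white paper
N_CAMERAS_CLUTTERED = 2  # cameras used for the cluttered-background prints


def coco10_counts():
    """Recompute the dataset sizes stated in the abstract."""
    # Every barcode is printed on each paper and captured under every
    # illuminant by every camera.
    real_white = N_BARCODES * N_PAPERS * N_ILLUMINANTS * N_CAMERAS_WHITE
    # Each cluttered-background print is captured under a single
    # illuminant (one per subgroup) by two cameras.
    real_cluttered = N_BARCODES * N_CAMERAS_CLUTTERED
    real = real_white + real_cluttered
    masks = real  # one mask per real-world image
    # Assumption: synthetic set = barcodes on white + barcodes on clutter.
    synthetic = 2 * N_BARCODES
    return {
        "real_white": real_white,
        "real_cluttered": real_cluttered,
        "real": real,
        "masks": masks,
        "synthetic": synthetic,
        "total": synthetic + real + masks,
    }


if __name__ == "__main__":
    for name, value in coco10_counts().items():
        print(f"{name}: {value}")
```

Under these assumptions the recomputed values agree with the abstract: 5400 white-paper images, 300 cluttered-background images, 5700 real-world images with 5700 masks, and 11700 images overall.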
Files in this item:
LecLec_DIB_2024.pdf (open access)
Type: Post-print document
Licence: Creative Commons
Size: 3.63 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/347207