While there are several ways to identify customer behaviors, few extract this value from information already in a database, much less extract relevant characteristics. This paper presents the development of a prototype using the recency, frequency, and monetary attributes for customer segmentation of a retail database. For this purpose, the standard K-means, K-medoids, and MiniBatch K-means were evaluated. The standard K-means clustering algorithm was more appropriate for data clustering than other algorithms as it remained stable until solutions with six clusters. The evaluation of the clusters’ quality was obtained through the internal validation indexes Silhouette, Calinski Harabasz, and Davies Bouldin. When consensus was not obtained, three external validation indexes were applied: global stability, stability per cluster, and segment-level stability across solutions. Six customer segments were obtained, identified by their unique behavior: lost customers, disinterested customers, recent customers, less recent customers, loyal customers, and best customers. Their behavior was evidenced and analyzed, indicating trends and preferences. The proposed method combining recency, frequency, monetary value (RFM), K-means clustering, internal indices, and external indices achieved return rates of 17.50%, indicating acceptable selectivity of the customers.

Recency, Frequency, Monetary Value, Clustering, and Internal and External Indices for Customer Segmentation from Retail Data

Stefenon, Stefano Frizzo;
2023-01-01

Abstract

While there are several ways to identify customer behaviors, few extract this value from information already in a database, much less extract relevant characteristics. This paper presents the development of a prototype using the recency, frequency, and monetary attributes for customer segmentation of a retail database. For this purpose, the standard K-means, K-medoids, and MiniBatch K-means were evaluated. The standard K-means clustering algorithm was more appropriate for data clustering than other algorithms as it remained stable until solutions with six clusters. The evaluation of the clusters’ quality was obtained through the internal validation indexes Silhouette, Calinski Harabasz, and Davies Bouldin. When consensus was not obtained, three external validation indexes were applied: global stability, stability per cluster, and segment-level stability across solutions. Six customer segments were obtained, identified by their unique behavior: lost customers, disinterested customers, recent customers, less recent customers, loyal customers, and best customers. Their behavior was evidenced and analyzed, indicating trends and preferences. The proposed method combining recency, frequency, monetary value (RFM), K-means clustering, internal indices, and external indices achieved return rates of 17.50%, indicating acceptable selectivity of the customers.
File in questo prodotto:
File Dimensione Formato  
algorithms-16-00396.pdf

accesso aperto

Descrizione: paper
Tipologia: Documento in Post-print
Licenza: Creative commons
Dimensione 2.04 MB
Formato Adobe PDF
2.04 MB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11582/339847
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
social impact