
Advanced Metrics for Class-Driven Similarity Search

Avesani, Paolo;Blanzieri, Enrico;Ricci, Francesco
1999-01-01

Abstract

This paper presents two metrics for the Nearest Neighbor Classifier that share the property of being adapted, i.e. learned, on a set of data. Both metrics can be used for similarity search when retrieval critically depends on a symbolic target feature. The first, called the Local Asymmetrically Weighted Similarity Metric (LASM), exploits reinforcement learning techniques to compute asymmetric weights. The LASM learning procedure initially extracts a set of prototypes from the training data and then iteratively optimizes its parameters on the remaining data. Experiments on benchmark datasets show that LASM maintains good accuracy while achieving high compression rates, outperforming competing editing techniques such as Condensed Nearest Neighbor. From a completely different perspective, the second metric, called the Minimum Risk Metric (MRM), is based on probability estimates. MRM is optimal in the sense that it minimizes the finite misclassification risk, and it is experimentally shown to outperform other probability-based metrics such as the Short and Fukunaga metric and the Difference Value Metrics. MRM can be implemented with different probability estimates and performs comparably to the Bayes classifier based on the same estimates. Both LASM and MRM outperform the NN classifier with the Euclidean metric.
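The abstract describes two learned metrics plugged into nearest-neighbor retrieval. As a rough illustration only — the paper's actual definitions are not given here, so the functional forms below (direction-dependent feature weights for the LASM idea, and a misclassification risk computed from posterior class-probability estimates for the MRM idea) are assumptions:

```python
import math

def asymmetric_weighted_distance(x, p, w_up, w_down):
    """Locally weighted distance from query x to prototype p.

    'Asymmetric' is illustrated here as separate weight vectors applied
    depending on the sign of each feature difference (an assumed scheme,
    not the paper's definition of LASM).
    """
    total = 0.0
    for xi, pi, wu, wd in zip(x, p, w_up, w_down):
        d = xi - pi
        w = wu if d >= 0 else wd  # pick weight by direction of difference
        total += w * d * d
    return math.sqrt(total)

def minimum_risk_distance(post_x, post_y):
    """Hypothetical risk-based 'distance': the estimated probability that
    the query and the stored case carry different class labels, given
    posterior class-probability estimates post_x and post_y."""
    return 1.0 - sum(px * py for px, py in zip(post_x, post_y))

def nn_classify(x, prototypes, labels, metric):
    """Return the label of the prototype nearest to x under `metric`."""
    dists = [metric(x, p) for p in prototypes]
    return labels[min(range(len(dists)), key=dists.__getitem__)]
```

For example, with unit weights `asymmetric_weighted_distance` reduces to the Euclidean metric, so `nn_classify([0.1, 0.2], [[0, 0], [1, 1]], ["a", "b"], ...)` retrieves label "a"; learning would then adjust the weights (and, for LASM, the prototype set) on training data.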
Files in this record:
No files are associated with this record.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/1786
