Clustering of the entities composing a Web application (static and dynamic pages) can be used to support program understanding. However, several alternative options are available when a clustering technique is designed for Web applications. The entities to be clustered can be described in different ways (e.g., by their structure, by their connectivity, or by their content), different similarity measures are possible, and alternative procedures can be used to form the clusters. The problem is how to evaluate the competing clustering techniques, in order to select the best for program understanding purposes. In this paper, two methods for clustering evaluation are considered, the gold standard and the task oriented approach. The advantages and disadvantages of both of them are analyzed in detail. Definition of a gold standard (reference clustering) is difficult and prone to subjectivity. On the other side, an evaluation based on the level of support given to task execution is expensive and requires careful experimental design. Guidelines and examples are provided for the implementation of both methods

Evaluation Methods for Web Application Clustering

Tonella, Paolo;Ricca, Filippo;Pianta, Emanuele;Girardi, Christian;
2003-01-01

Abstract

Clustering of the entities composing a Web application (static and dynamic pages) can be used to support program understanding. However, several alternative options are available when a clustering technique is designed for Web applications. The entities to be clustered can be described in different ways (e.g., by their structure, by their connectivity, or by their content), different similarity measures are possible, and alternative procedures can be used to form the clusters. The problem is how to evaluate the competing clustering techniques, in order to select the best for program understanding purposes. In this paper, two methods for clustering evaluation are considered, the gold standard and the task oriented approach. The advantages and disadvantages of both of them are analyzed in detail. Definition of a gold standard (reference clustering) is difficult and prone to subjectivity. On the other side, an evaluation based on the level of support given to task execution is expensive and requires careful experimental design. Guidelines and examples are provided for the implementation of both methods
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11582/928
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
social impact