Crawlability Metrics for Web Applications
Marchetto, Alessandro;Tiella, Roberto;Tonella, Paolo
2012-01-01
Abstract
Automated web crawlers can be used to explore and exercise portions of a web application under test. However, a crawler's ability to fully explore a web application is severely limited by the choice of input values submitted with forms. Depending on the crawler's capabilities, a larger or smaller portion of the web application will be explored automatically. In this paper, we introduce web crawlability metrics that quantify the properties of application pages and forms that affect crawlability. Moreover, we show that our metrics can be used to identify the boundaries between the parts of the application that can be crawled successfully by automated means and the parts that require manual intervention or other crawlability support. We have validated our crawlability metrics on real web applications, for which low crawlability was indeed associated with the existence of pages never exercised during automated crawling.