Several techniques and tools have been proposed for the automatic generation of test cases. Usually, these tools are evaluated in terms of fault-revealing or coverage capability, but their impact on the manual debugging activity is not considered. The question is whether automatically generated test cases are equally effective in supporting debugging as manually written tests. We conducted a family of three experiments (five replications) with humans (in total, 55 subjects) to assess whether the features of automatically generated test cases, which make them less readable and understandable (e.g., unclear test scenarios, meaningless identifiers), have an impact on the effectiveness and efficiency of debugging. The first two experiments compare different test case generation tools (Randoop vs. EvoSuite). The third experiment investigates the role of code identifiers in test cases (obfuscated vs. original identifiers), since a major difference between manual and automatically generated test cases is that the latter contain meaningless (obfuscated) identifiers. We show that automatically generated test cases are as useful for debugging as manual test cases. Furthermore, we find that, for less experienced developers, automatic tests are more useful on average due to their lower static and dynamic complexity.

Do Automatically Generated Test Cases Make Debugging Easier? An Experimental Assessment of Debugging Effectiveness and Efficiency

Ceccato, Mariano;Marchetto, Alessandro;Tonella, Paolo
2015-01-01

Abstract

Several techniques and tools have been proposed for the automatic generation of test cases. Usually, these tools are evaluated in terms of fault-revealing or coverage capability, but their impact on the manual debugging activity is not considered. The question is whether automatically generated test cases are equally effective in supporting debugging as manually written tests. We conducted a family of three experiments (five replications) with humans (in total, 55 subjects) to assess whether the features of automatically generated test cases, which make them less readable and understandable (e.g., unclear test scenarios, meaningless identifiers), have an impact on the effectiveness and efficiency of debugging. The first two experiments compare different test case generation tools (Randoop vs. EvoSuite). The third experiment investigates the role of code identifiers in test cases (obfuscated vs. original identifiers), since a major difference between manual and automatically generated test cases is that the latter contain meaningless (obfuscated) identifiers. We show that automatically generated test cases are as useful for debugging as manual test cases. Furthermore, we find that, for less experienced developers, automatic tests are more useful on average due to their lower static and dynamic complexity.
File in questo prodotto:
File Dimensione Formato  
main.pdf

non disponibili

Tipologia: Documento in Pre-print
Licenza: NON PUBBLICO - Accesso privato/ristretto
Dimensione 670 kB
Formato Adobe PDF
670 kB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11582/301861
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
social impact