Do Automatically Generated Test Cases Make Debugging Easier? An Experimental Assessment of Debugging Effectiveness and Efficiency

Ceccato, Mariano; Marchetto, Alessandro; Mariani, Leonardo; Nguyen, Cu Duy; Tonella, Paolo

doi:10.1145/2768829

Several techniques and tools have been proposed for the automatic generation of test cases. Usually, these tools are evaluated in terms of fault-revealing or coverage capability, but their impact on the manual debugging activity is not considered. The question is whether automatically generated test cases are equally effective in supporting debugging as manually written tests. We conducted a family of three experiments (five replications) with humans (in total, 55 subjects) to assess whether the features of automatically generated test cases, which make them less readable and understandable (e.g., unclear test scenarios, meaningless identifiers), have an impact on the effectiveness and efficiency of debugging. The first two experiments compare different test case generation tools (Randoop vs. EvoSuite). The third experiment investigates the role of code identifiers in test cases (obfuscated vs. original identifiers), since a major difference between manual and automatically generated test cases is that the latter contain meaningless (obfuscated) identifiers. We show that automatically generated test cases are as useful for debugging as manual test cases. Furthermore, we find that, for less experienced developers, automatic tests are more useful on average due to their lower static and dynamic complexity.