A large scale empirical comparison of state-of-the-art search-based test case generators
Panichella, Annibale; Kifetew, Fitsum Meshesha; Tonella, Paolo
2018-01-01
Abstract
Context: Replication studies and experiments form an important foundation in advancing scientific research. While their prevalence in Software Engineering is increasing, there is still more to be done.
Objective: This article aims to extend our previous replication study on search-based test generation techniques by performing a large-scale empirical comparison with further techniques from the state of the art.
Method: We designed a comprehensive experimental study involving six techniques, a benchmark composed of 180 non-trivial Java classes, and a total of 21,600 independent executions. Metrics regarding effectiveness and efficiency of the techniques were collected and analyzed by means of statistical methods.
Results: Our empirical study shows that single-target approaches are generally outperformed by multi-target approaches, while within the multi-target approaches, DynaMOSA/MOSA, which are based on many-objective optimization, outperform the others, in particular for complex classes.
Conclusion: The results obtained from our large-scale empirical investigation confirm what has been reported in previous studies, while also highlighting striking differences and novel observations. Future studies, on different benchmarks and considering additional techniques, could further reinforce and extend our findings.