Source publication
In this article, we empirically study the suitability of tests as acceptance criteria for automated program fixes by checking patches produced by automated repair tools with a bug-finding tool, as opposed to previous work that used tests or manual inspection. We develop a number of experiments in which faulty programs from IntroClass, a known...
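As a rough illustration of the methodology the abstract describes, the following is a minimal sketch of the acceptance check, assuming a hypothetical tool chain: Patch, run_bug_finder, and classify are placeholder names for illustration, not the paper's actual tools or APIs.

```python
from dataclasses import dataclass

@dataclass
class Patch:
    program: str        # patched program source
    passes_tests: bool  # verdict of the test-based acceptance criterion

def run_bug_finder(program: str) -> bool:
    """Return True iff the bug-finding tool reports no fault (placeholder stub)."""
    raise NotImplementedError

def classify(patches: list[Patch]) -> tuple[list[Patch], list[Patch]]:
    """Split test-accepted patches into genuine fixes and spurious patches."""
    fixes, spurious = [], []
    for p in patches:
        if p.passes_tests:                 # accepted by the test-based criterion...
            if run_bug_finder(p.program):  # ...and independently confirmed fault-free
                fixes.append(p)
            else:
                spurious.append(p)         # passes the tests but is not an actual fix
    return fixes, spurious
```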
Contexts in source publication
Context 1
... paper actually analyzes this in more detail and, by using independent test suites to validate the generated patches, claims that GenProg's patches pass 68.7% of independent tests, giving the non-expert reader the impression that the produced patches were of good quality when, in fact, it might be the case that none of the patches is actually a fix. Indeed, as our experiments reported in Table 5 show, only 30 out of 570 faults were correctly fixed (a fixing ratio of 5.3%, well below the 36.8% presented in [26]). ...
Context 2
... shows that Nopol, when fed with a better-quality evaluation suite, is able to produce (a few) good-quality fixes. GenProg, on the other hand, shows an interesting behaviour (see Table 5): with suite O ∪ S100 it doubles the number of patches, yet the number of fixes is reduced from 30 to 10. With suite O ∪ S1000 it produces around 50% more patches, but the number of fixes is reduced from 30 to 11. ...
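To make the quality-versus-quantity trade-off behind these figures concrete, here is a minimal sketch assuming only the numbers quoted in the two contexts above; GenProg's absolute patch counts are not given here, but relative per-patch fix rates cancel them out:

```python
# Fixing ratio from Context 1: 30 correct fixes out of 570 faults.
fixes, faults = 30, 570
print(f"fixing ratio: {fixes / faults:.1%}")  # 5.3%, vs. the 36.8% reported in [26]

# Context 2 (GenProg): relative per-patch fix rate. The absolute patch counts
# are not quoted above, but the *relative* change cancels them out:
#   precision_new / precision_old = (fixes_new / fixes_old) / patch_growth
for label, patch_growth, fixes_new in [
    ("O ∪ S100",  2.0, 10),  # "doubles the number of patches"; fixes: 30 -> 10
    ("O ∪ S1000", 1.5, 11),  # "around 50% more patches";       fixes: 30 -> 11
]:
    rel = (fixes_new / 30) / patch_growth
    print(f"{label}: per-patch fix rate falls to {rel:.0%} of the O-only value")
```

Under these quoted figures, richer evaluation suites make GenProg produce more patches while the fraction of them that are genuine fixes drops sharply.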