An Analysis of Patch Plausibility and Correctness for Generate-And-Validate Patch Generation Systems
Author(s)
Qi, Zichao; Long, Fan; Achour, Sara; Rinard, Martin
DownloadMIT-CSAIL-TR-2015-021.pdf (366.1Kb)
Other Contributors
Program Analysis and Compilation
Advisor
Martin Rinard
Metadata
Show full item recordAbstract
We analyze reported patches for three existing generate-and-validate patch generation systems (GenProg, RSRepair, and AE). The basic principle behind generate-and-validate systems is to accept only plausible patches that produce correct outputs for all inputs in the test suite used to validate the patches. Because of errors in the patch evaluation infrastructure, the majority of the reported patches are not plausible -- they do not produce correct outputs even for the inputs in the validation test suite. The overwhelming majority of the reported patches are not correct and are equivalent to a single modification that simply deletes functionality. Observed negative effects include the introduction of security vulnerabilities and the elimination of desirable standard functionality. We also present Kali, a generate-and-validate patch generation system that only deletes functionality. Working with a simpler and more effectively focused search space, Kali generates at least as many correct patches as prior GenProg, RSRepair, and AE systems. Kali also generates at least as many patches that produce correct outputs for the inputs in the validation test suite as the three prior systems. We also discuss the patches produced by ClearView, a generate-and-validate binary hot patching system that leverages learned invariants to produce patches that enable systems to survive otherwise fatal defects and security attacks. Our analysis indicates that ClearView successfully patches 9 of the 10 security vulnerabilities used to evaluate the system. At least 4 of these patches are correct.
Date issued
2015-05-29Series/Report no.
MIT-CSAIL-TR-2015-021