Michael Weber: “A #lisp conversation sparked the cunning idea to write a diff-like algorithm for s-expressions.”
Posted by jjwiseman at September 12, 2005 07:25 PMThe algorithms is based on Levenshtein-like edit distance calculations. It tries to minimize the number of atoms in the result tree (also counting the edit conditionals |#-new|, |#+new|). This can be observed in the example below: instead of blindly adding more edit conditionals, the algorithm backs up and replaces the whole subexpression.
DIFF-SEXP> (diff-sexp '(DEFUN F (X) (+ (* X 2) 4 1)) '(DEFUN F (X) (- (* X 2) 5 3 1))) ((DEFUN F (X) (|#-new| + |#+new| - (* X 2) |#-new| 4 |#+new| 5 |#+new| 3 1))) DIFF-SEXP> (diff-sexp '(DEFUN F (X) (+ (* X 2) 4 4 1)) '(DEFUN F (X) (- (* X 2) 5 5 3 1))) ((DEFUN F (X) |#-new| (+ (* X 2) 4 4 1) |#+new| (- (* X 2) 5 5 3 1)))
Espen Vestre generates automatic patches for running images by using something only slightly more sophisticated that loading the forms from two versions of a file as lists and using SET-DIFFERENCE on them.
The url is http://groups.google.com/group/comp.lang.lisp/msg/1785e3c38e289d47 if the markup doesn't survive...
Posted by: Zach Beane on September 13, 2005 06:56 AMWouldn't tree-differencing algorithms like Zhang-Shasha actually be more efficient for this purpose? I've used them in commercial code for that purpose, and I was hoping to release an open source implementation sometime.
Posted by: Julian Squires on September 14, 2005 03:38 AM