Lemonodor: Diff-Sexp

September 12, 2005

Diff-Sexp

Michael Weber: “A #lisp conversation sparked the cunning idea to write a diff-like algorithm for s-expressions.”

The algorithms is based on Levenshtein-like edit distance calculations. It tries to minimize the number of atoms in the result tree (also counting the edit conditionals |#-new|, |#+new|). This can be observed in the example below: instead of blindly adding more edit conditionals, the algorithm backs up and replaces the whole subexpression.
DIFF-SEXP> (diff-sexp  
                '(DEFUN F (X) (+ (* X 2) 4 1)) 
                '(DEFUN F (X) (- (* X 2) 5 3 1)))
((DEFUN F (X) (|#-new| + |#+new| - (* X 2) |#-new| 4 |#+new| 5 |#+new| 3 1)))

DIFF-SEXP> (diff-sexp  
                '(DEFUN F (X) (+ (* X 2) 4 4 1)) 
                '(DEFUN F (X) (- (* X 2) 5 5 3 1)))
((DEFUN F (X) |#-new| (+ (* X 2) 4 4 1) |#+new| (- (* X 2) 5 5 3 1)))

Posted by jjwiseman at September 12, 2005 07:25 PM

Comments

Espen Vestre generates automatic patches for running images by using something only slightly more sophisticated that loading the forms from two versions of a file as lists and using SET-DIFFERENCE on them.

The url is http://groups.google.com/group/comp.lang.lisp/msg/1785e3c38e289d47 if the markup doesn't survive...

Posted by: Zach Beane on September 13, 2005 06:56 AM

Wouldn't tree-differencing algorithms like Zhang-Shasha actually be more efficient for this purpose? I've used them in commercial code for that purpose, and I was hoping to release an open source implementation sometime.

Posted by: Julian Squires on September 14, 2005 03:38 AM