September 12, 2005


Michael Weber: “A #lisp conversation sparked the cunning idea to write a diff-like algorithm for s-expressions.”

The algorithms is based on Levenshtein-like edit distance calculations. It tries to minimize the number of atoms in the result tree (also counting the edit conditionals |#-new|, |#+new|). This can be observed in the example below: instead of blindly adding more edit conditionals, the algorithm backs up and replaces the whole subexpression.

DIFF-SEXP> (diff-sexp  
                '(DEFUN F (X) (+ (* X 2) 4 1)) 
                '(DEFUN F (X) (- (* X 2) 5 3 1)))
((DEFUN F (X) (|#-new| + |#+new| - (* X 2) |#-new| 4 |#+new| 5 |#+new| 3 1)))

DIFF-SEXP> (diff-sexp  
                '(DEFUN F (X) (+ (* X 2) 4 4 1)) 
                '(DEFUN F (X) (- (* X 2) 5 5 3 1)))
((DEFUN F (X) |#-new| (+ (* X 2) 4 4 1) |#+new| (- (* X 2) 5 5 3 1)))
Posted by jjwiseman at September 12, 2005 07:25 PM

Espen Vestre generates automatic patches for running images by using something only slightly more sophisticated that loading the forms from two versions of a file as lists and using SET-DIFFERENCE on them.

The url is if the markup doesn't survive...

Posted by: Zach Beane on September 13, 2005 06:56 AM

Wouldn't tree-differencing algorithms like Zhang-Shasha actually be more efficient for this purpose? I've used them in commercial code for that purpose, and I was hoping to release an open source implementation sometime.

Posted by: Julian Squires on September 14, 2005 03:38 AM
Post a comment

Email Address:


Unless you answer this question, your comment will be classified as spam and will not be posted.
(I'll give you a hint: the answer is “lisp”.)


Remember info?