Pipelines in Lisp

Posted by David Owen on 2022-03-07

Sometimes, we have a sequence of operations to perform on some data, each step using the results of the previous one:

(let* ((a (get-value))
       (b (transform-1 a))
       (c (transform-2 b))
       (d (transform-3 c)))
  (princ d))

If we don't want the intermediate variables, we're stuck writing something like this, in most languages:

(princ (transform-3 (transform-2 (transform-1 (get-value)))))

However, nesting the operations that way means that they no longer have a natural ordering when a human reads the code.

Some languages allow you to express such a sequence as a pipeline of operations, as in bash:

get-value | transform-1 | transform-2 | transform-3

We can write a simple macro in Lisp to allow the same thing. Given a form like:

(x (a) (b) (c))

it will embed the expression x into the form a, like this:

((a x) (b) (c))

and process recursively.

Here's the macro:

(defmacro pipeline (head &rest tail)
  (if tail
      (destructuring-bind ((op &rest args) &rest more) tail
        `(pipeline (,op ,head ,@args) ,@more))
      head))

Note that it places the first expression into the second's form in the position of first argument.

Now, we can write this:

(pipeline (get-value) (transform-1) (transform-2) (transform-3))

Most other similar macros seem to prefer embedding the leading expression into the second's form in the position of last argument, like this version does:

(defmacro pipeline (head &rest tail)
  (if tail
      (destructuring-bind ((op &rest args) &rest more) tail
        `(pipeline (,op ,@args ,head) ,@more))
      head))

There's a drawback to that, though: side-effects will occur out of order.

(flet ((f (n)
         (format t "Effect ~A~%" n)
         n))
  (princ (pipeline 12 (+ (f 1)) (+ (f 2)) (+ (f 3)))))

Effect 3
Effect 2
Effect 1
18

We can fix that by creating bindings for each argument to every form, then referencing those variables in the resulting form:

(defmacro pipeline (head &rest tail)
  (if tail
      (destructuring-bind ((op &rest args) &rest more) tail
        (let ((bindings (mapcar (lambda (a) `(,(gensym) ,a)) args)))
          `(let (,@bindings)
             (pipeline (,op ,@(mapcar 'first bindings) ,head) ,@more))))
      head))

(flet ((f (n)
         (format t "Effect ~A~%" n)
         n))
  (princ (pipeline 12 (+ (f 1)) (+ (f 2)) (+ (f 3)))))

Effect 1
Effect 2
Effect 3
18

That's better, but it still gets the order of evaluation wrong for the pipeline expressions:

(flet ((f (n &rest args)
         (format t "Effect ~A~%" n)))
  (pipeline (f 1) (f 4 (f 2) (f 3)) (f 7 (f 5) (f 6))))

Effect 2
Effect 3
Effect 5
Effect 6
Effect 1
Effect 4
Effect 7

The fix-up for that is simple enough:

(defmacro pipeline (head &rest tail)
  (if tail
      (destructuring-bind ((op &rest args) &rest more) tail
        (let ((head-val (gensym))
              (bindings (mapcar (lambda (a) `(,(gensym) ,a)) args)))
          `(let ((,head-val ,head)
                 ,@bindings)
             (pipeline (,op ,@(mapcar 'first bindings) ,head-val) ,@more))))
      head))

(flet ((f (n &rest args)
         (format t "Effect ~A~%" n)))
  (pipeline (f 1) (f 4 (f 2) (f 3)) (f 7 (f 5) (f 6))))

Effect 1
Effect 2
Effect 3
Effect 4
Effect 5
Effect 6
Effect 7

However, all this runs against the general philosophy of the language for two reasons. First is the need for the bindings. The original approach doesn't need them at all, and naturally preserves the order of evaluation. Second, many functions take "modifiers" as optional or keyword parameters, with the primary data being operated on coming as the first or second parameter.

I admit that functions like mapcar run counter to that tend, but substituting pipeline data into first argument seems more harmonious overall than last argument. And while we could place that data into the position of last required argument, that requires additional code for the introspection (in addition to the added bindings), and doesn't account for the ability for functions to be redefined at run-time.

Finally, you might wish to omit parentheses for forms with no explicit arguments:

      (pipeline get-value transform-1 transform-2 transform-3)

That could be done, but then you'd also want to handle this as a special case, too:

(pipeline get-value (lambda (x) (* x 2)))
;; instead of
(pipeline get-value ((lambda (x) (* x 2))))

So, the macro could transform like:

atom -> (atom previous)
(lambda args) -> ((lambda args) previous)
(other args) -> (other previous args)

But, again, this is against the philosophy of the language: it would miss macros that make, for example, (l ...) equivalent to (lambda ...), and so on.

And again, the macro could introspect to make that decision, but perhaps it's best to just accept a few extra parentheses to go with the flow.

David Owen’s blog

Pipelines in Lisp

Trackbacks

Comments

Add Comment