Couplings in Clojure testing tools

March 18, 2022

back to the land of parens

Last time around I wrote up my experiences hacking JavaScript VMs in C++. It was a fun and wild time; modern VMs and C++ are, well, damn complex beasts!

Since then I've moved back to the Clojure world, where I've joined the Nextjournal team. We generally work on tools for thought, such as the Nextjournal reproducible notebook platform and Clerk, a local-first Clojure notebook tool, but we also run some consultancy projects.

So today I wanted to talk a bit about some fun I had with Clojure recently. It has to do with test tooling; I'll start with some background experiences and then share a bit of code.

background experience with test tools in Clojure

Back when I worked at Nubank, I spent a good amount of time trying to improve the test tooling we had for Clojure.

When I joined, Nubank used Midje widely, a testing DSL inspired by Ruby's RSpec. Coming from Java, I found Midje wonderfully expressive and capable, but after some time I realized that the DSL deviates in a few ways from standards found in the Clojure ecosystem.

Regardless, I started hacking on Midje to close some of the holes I kept falling into. I eventually formed the opinion that, for the sake of maintainability and because of the deviations above, we should move to a collection of smaller and simpler testing tools.

clojure.test, being more or less pervasive in the ecosystem, seemed like a good thing to try. I found it hard to swallow when used in isolation: it was very bare-bones compared to what Midje was capable of. But clojure.test was small and extensible, and compatible with an approach of porting the best ideas from Midje into a suite of smaller, test-framework-agnostic libraries.

We ended up with clojure.test at the core, matcher-combinators for asserting over nested data-structures in a declarative way, and mockfn for mocking.

tests should produce data

During this time I often collaborated with Sophia Velten, who designed state-flow, the library Nubank uses for single-service integration tests.

One thing Sophia emphasized with state-flow is that the result of running a test should be data. It sounds like a simple and perhaps boring idea, but it has huge implications for the extensibility of a test framework.

For example, when clojure.test tests are run, they emit detailed human-readable reports but return only a very coarse-grained summary as data: {:test 7, :pass 25, :fail 0, :error 0, :type :summary}. This creates an additional burden for tool makers who might want to adapt this output. For instance, Arne Brasseur details a bit in this GitHub comment how they handled it in the Kaocha test runner.

This is workable with clojure.test, given that it uses user-overridable multimethods for its reporting.

I guess my issue with this design is that tool builders are still required to do a good bit of work. I imagine this work has unfortunately been repeated many times by different people across different dev tooling codebases.

To concretize my point, clojure.test currently works like this:

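(An illustrative REPL session rather than clojure.test's actual source; output abridged.)

```clojure
(require '[clojure.test :as t])

(t/deftest some-test
  (t/is (= 1 2)))

;; Evaluation and reporting happen together: the human-readable report is
;; printed as a side effect, and only the coarse summary comes back as data.
(t/run-tests)
;; Testing user
;;
;; FAIL in (some-test) (...)
;; expected: (= 1 2)
;;   actual: (not (= 1 2))
;;
;; Ran 1 tests containing 1 assertions.
;; 1 failures, 0 errors.
;; => {:test 1, :pass 0, :fail 1, :error 0, :type :summary}
```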

And if test evaluation and reporting were decoupled, it could look like the following, which would give nice hooks for those working on dev tooling:

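(A hypothetical API sketch: neither evaluate-tests nor this two-step flow exists in clojure.test today.)

```clojure
;; Hypothetical decoupled API: evaluation returns fine-grained data and
;; prints nothing; rendering for humans is a separate step over that data.
(def results (evaluate-tests 'my.app-test))
;; => [{:type :pass, :var #'my.app-test/some-test, ...}
;;     {:type :fail, :var #'my.app-test/other-test, :expected ..., :actual ..., ...}]

(report results)   ; or hand the same data to any other tool
```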

Clerk + testing

Okay, but why am I rambling about this at all?

Well, I wanted to get a hold of more fine-grained test result data to start playing around with building test-related tools for Clerk, a local-first Clojure notebook tool.

And what is Clerk exactly?

It is a computational notebook tool for Clojure that gives you the interactivity and visualization gains of a notebook while still embracing your existing dev flow. Notebooks are developed locally but can be published online as static pages by bundling the data generated from the Clojure code together with the front-end ClojureScript viewer code.

My colleagues at Nextjournal have been making some really cool stuff with it! They've taken to chatting with me about how Clerk could be applied to the testing dev experience.

Like, imagine being able to do your dev in Emacs or Neovim, but with a test runner that renders matcher-combinators mismatch failures with the irrelevant parts of the data-structure auto-folded away.

Or you could add tests to your notebooks, plus a button to run them, and see highlighting for assertion forms that pass or fail.

So this is why I sat down to see how I could get clojure.test to provide test result data to then send over to custom Clerk viewers.

separating test execution from test reporting

As I dove into trying to get data out of clojure.test, I realized there were no APIs for providing fine-grained report results. Getting, as data, exactly which assertion forms failed and which deftest vars they are associated with doesn't seem to be possible.

I found myself needing to solve the issue of decoupling test execution from test reporting, the thing that Sophia was so spot on about in state-flow's design years ago.

Hacking a bit, I found a solution that seemed pretty cute:

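Here's a condensed sketch of the approach (names and details simplified):

```clojure
(require '[clojure.test :as t])

;; Stash clojure.test/report's current implementations, keyed by dispatch
;; value. `methods` hands us a multimethod's dispatch-value -> fn map.
(defonce original-report-methods (methods t/report))

;; Accumulator for report events; only bound while evaluate-test-var runs.
(def ^:dynamic *report-data* nil)

(defn- record-event! [m]
  (when *report-data*
    ;; the same binding/dosync/commute pattern clojure.test uses internally
    (dosync (commute *report-data* conj m))))

;; Hijack the event types we care about: record each event as data, then
;; "call super" by invoking the stashed original, so plain clojure.test
;; usage keeps behaving as before.
(doseq [event-type [:begin-test-var :end-test-var :pass :fail :error]]
  (.addMethod ^clojure.lang.MultiFn t/report event-type
              (fn [m]
                (record-event! m)
                (when-let [original (get original-report-methods event-type)]
                  (original m)))))

(defn evaluate-test-var
  "Runs a single deftest var, muting clojure.test's printed output, and
  returns the accumulated report events as a vector of maps."
  [v]
  (binding [*report-data* (ref [])
            t/*test-out*  (java.io.StringWriter.)] ; swallow the printed reports
    (t/test-var v)
    @*report-data*))

(defn report!
  "Replays report data through the original (stashed) clojure.test reporting."
  [report-data]
  (doseq [m report-data]
    (when-let [original (get original-report-methods (:type m))]
      (original m))))
```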

What is going on here? Well, clojure.test reporting is implemented using Clojure's multimethods, which let you dispatch to different function bodies depending on some dispatch criteria defined by defmulti. For instance, clojure.test/report looks like (defmulti report :type), so the :type of the map passed into report determines the behavior.
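
As a tiny illustration of that dispatch mechanism, here's a toy multimethod (not clojure.test's own):

```clojure
;; A toy multimethod dispatching on :type, mirroring (defmulti report :type)
(defmulti notify :type)

(defmethod notify :pass [m] (str "all good: " (:expected m)))
(defmethod notify :fail [m] (str "uh oh, expected " (:expected m)
                                 " but got " (:actual m)))

(notify {:type :fail :expected 1 :actual 2})
;; => "uh oh, expected 1 but got 2"
```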

Multimethods are a nice way to bake extensibility into your libraries, because anyone can always define a new dispatch value and method body.

For example, in the matcher-combinators integration with clojure.test, we were able to display custom mismatch results by extending clojure.test/report in the matcher-combinators library itself, adding a method for :mismatch, a new test result type we created (it looked like this but was eventually migrated).
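
The extension pattern looks roughly like this (a simplified sketch, not matcher-combinators' actual method; :markup is an illustrative key):

```clojure
;; A library can teach clojure.test/report about a brand-new :type
;; simply by adding a method for it.
(defmethod clojure.test/report :mismatch [m]
  (clojure.test/with-test-out
    (clojure.test/inc-report-counter :fail)
    (println "FAIL in" (clojure.test/testing-vars-str m))
    ;; :markup stands in for the rendered mismatch description
    (println (:markup m))))
```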

What I'm doing now with the report multimethods above is stashing the old versions with methods (which I was delighted to discover exists), and redefining them with new behavior that eventually dispatches back to the stashed original. It's essentially a hacky "call super", and using it ensures that the old clojure.test API continues to work fine. The new versions also accumulate report data using the same weird binding, dosync, commute stuff, a pattern I lifted from the clojure.test code itself.

With the report definitions hijacked, we can implement evaluate-test-var to mute any results printed by clojure.test and also return the accumulated report data.

The new report! functionality then iterates through the report data produced by evaluate-test-var and calls the original clojure.test report functions (those that we stashed using methods) on the data.

You can try it out in your own REPL; you should see something like this:

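(A hypothetical session, assuming the sketch above is loaded; maps and output are abridged.)

```clojure
(require '[clojure.test :as t])

(t/deftest example-test
  (t/is (= 1 1))
  (t/is (= 1 2)))

(def results (evaluate-test-var #'example-test))
;; nothing is printed; `results` now holds something like:
;; [{:type :begin-test-var, :var #'user/example-test}
;;  {:type :pass, :expected ..., :actual ..., ...}
;;  {:type :fail, :expected ..., :actual ..., ...}
;;  {:type :end-test-var, :var #'user/example-test}]

;; Replaying the data through the stashed reporters prints the familiar output:
(report! results)
;; FAIL in ... (...)
;; expected: (= 1 2)
;;   actual: (not (= 1 2))
```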

Cute, no? And it's fun how Clojure provides all the tools needed to make such adaptations to clojure.test. It isn't the only way to achieve this, either: another approach could be to use with-redefs over clojure.test/do-report, the one place clojure.test/report is called from.
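
A rough sketch of that variant (illustrative only; evaluate-test-var-alt is a made-up name):

```clojure
(require '[clojure.test :as t])

;; Alternative: leave the report multimethods alone and instead capture every
;; event by temporarily redefining do-report, the funnel all report calls go
;; through.
(defn evaluate-test-var-alt [v]
  (let [events (atom [])]
    (with-redefs [t/do-report (fn [m] (swap! events conj m))]
      (t/test-var v))
    ;; note: do-report normally adds :file/:line info to failures, which this
    ;; simplistic replacement skips
    @events))
```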

At Nubank we'd sometimes toss around the idea of writing our own test framework that explicitly decouples execution and reporting. Yet after working out this snippet of code, I'm wondering how far we can get purely through adaptation.