Thoughts on “Testing is an Unsolved Problem”

Beware of bugs in the above code; I have only proved it correct, not tried it
~ Donald Knuth, Notes on the van Emde Boas construction of priority deques: An instructive use of recursion (1977)

I stumbled across Jason Arbon’s excellent and thought-provoking post, “Testing is an Unsolved Problem”, the other day, and it really got my noodle going. I feel pretty comfortable saying that, theoretical considerations aside, everyone in software finds non-trivial testing a challenging problem (if not a downright pain-in-the-ass).

In a certain sense, anybody can write a program that doesn’t work, and these programs only differ in how broken they really are. Moreover, I think it’s safe to say that most programs are at least partially broken in that same sense. Worse, and to (hopefully accurately) paraphrase the original, it can be hard or impossible to even quantify how broken a program is. That said, I do somewhat disagree that theoretical correctness is the challenging aspect of testing. It turns out that huge, useful classes of algorithms are decidable, and it’s very surprising to see how complex some decidable (and even efficiently decidable) algorithms can be (with some big caveats).

There’s even a cool, possibly uncoincidental parallel between decidability in formal systems and testing in informal systems, viz. that you can write huge, “complex” programs in a decidable system that loses its decidability when you try to get anything into or out of it. That’s right: introducing IO to a decidable, formal system causes it to lose its decidability. Generalizing this idea motivates concepts such as monads and effect systems that make functional programmers so insufferable after a few drinks. (The little Yoast readability face is shrieking at me right now. I’m a little concerned that it may become corporeal and exact some nightmare readability revenge.)

This mirrors the fact that most of our defects stem from just a few locations:

  1. Getting data to/from users
  2. Getting data to/from integrations
  3. Concurrency

Note that (1) and (2) are effectively the same problem, which mirrors the theoretical “IO” issue. As an aside, we’ve largely been able to eliminate concurrency issues by modeling concurrent operations as a Coffman-Graham schedule. Our implementation is open-source here. This has the added advantage that if we encounter a concurrency issue, we can easily transform it into a single-actor model by linearizing the computation graph.
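To make the “linearizing the computation graph” bit concrete, here’s a minimal sketch of just that step: a plain topological sort (Kahn’s algorithm) over a task dependency graph. To be clear, this is not Coffman-Graham scheduling and it is not our open-source implementation; the class name, method name, and the string task names are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: flattens a task dependency graph into one sequential order. */
final class TaskLinearizer {

    /**
     * @param dependencies maps each task to the tasks that must finish before it starts;
     *                     every task must appear as a key (possibly with an empty list)
     * @return the tasks in an order that respects every dependency
     */
    static List<String> linearize(Map<String, List<String>> dependencies) {
        Map<String, Integer> remaining = new LinkedHashMap<>();   // unmet prerequisites per task
        Map<String, List<String>> dependents = new LinkedHashMap<>();
        for (String task : dependencies.keySet()) {
            remaining.put(task, dependencies.get(task).size());
            dependents.put(task, new ArrayList<>());
        }
        for (Map.Entry<String, List<String>> entry : dependencies.entrySet()) {
            for (String prerequisite : entry.getValue()) {
                dependents.get(prerequisite).add(entry.getKey());
            }
        }

        // Tasks with no unmet prerequisites can run right away.
        Deque<String> ready = new ArrayDeque<>();
        remaining.forEach((task, count) -> { if (count == 0) ready.add(task); });

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String task = ready.remove();
            order.add(task);
            // Finishing a task may unblock the tasks that depend on it.
            for (String dependent : dependents.get(task)) {
                if (remaining.merge(dependent, -1, Integer::sum) == 0) {
                    ready.add(dependent);
                }
            }
        }
        if (order.size() != dependencies.size()) {
            throw new IllegalStateException("dependency cycle detected");
        }
        return order;
    }
}
```

Running the resulting list on a single thread gives you the same computation with the concurrency taken out, which is usually a much friendlier thing to debug.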

Anyhoodles, if you made it this far, I’d like to talk about how we really mitigated (1) with Vaadin and Aire::Test.

Keeping it in one process

Whenever you have a client (browser) and a server, you have a distributed system. Automated testing of distributed systems sucks, hard. It sucks so hard that this is the first place I’ve ever worked that has developed its own in-house technologies to address the problem. Tools like Selenium/Watir/etc. are fantastic, but to be perfectly honest, I haven’t seen them used very effectively in most orgs I’ve encountered. Frankly, it’s pretty hard to write tests that aren’t brittle. Why, you ask? In a word, LAYOUT.

Let me show you what I mean. I’m currently working on a new UI that provides an excellent, simple example. Check out that button outlined in blue:

Where is it at? Well, its human-navigable location is:

Main > Navbar > Zephyr > Modules > Grid[select] > Right Nav > Module Lifecycle > Stop

Not too bad, but in Selenium that’s probably gonna be something like this (I’m sure there are utilities to simplify it; I haven’t used it in a while):


  1. Open Browser
  2. Navigate to Main
  3. Locate Zephyr in top menu
  4. Click Zephyr
  5. Wait for Page Load
  6. Click Modules tab
  7. Wait for Module List to populate
  8. Locate module element
  9. Click module element
  10. Wait for side-nav element to load
  11. Locate button with CSS selector html body div#outlet aire-application-layout aire-panel aire-navigation-bar.vertical aire-drawer.verticalright vaadin-vertical-layout vaadin-menu-bar > vaadin-menu-bar-button > vaadin-context-menu-item > vaadin-context-menu-item > vaadin-button:nth-of-type(2). (This may not be the simplest possible selector, but as your pages grow in complexity you generally need to increase your selectivity)
  12. Click button
  13. Wait until the module-list contains a stopped module
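Notice how many of those steps are some flavor of “wait”. Under the hood, a wait in a browser-driven test boils down to polling a condition until it holds or a timeout expires. Here’s a minimal sketch of that idea in plain Java; it is not Selenium’s API, and the class and method names are hypothetical.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of a poll-until-true wait; not Selenium's API. */
final class PollingWait {

    /**
     * Re-checks the condition every {@code interval} until it holds or {@code timeout} elapses.
     *
     * @return true if the condition held before the deadline, false on timeout or interruption
     */
    static boolean waitUntil(BooleanSupplier condition, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // ran out of time
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }
}
```

Every one of these is a guess about timing: too short a timeout and the test flakes on a slow CI box, too long and a genuinely broken page takes forever to fail.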

I mean, in the scheme of things, and with a mature product, team, and methodology, this can be a great way to test stuff. You can reduce the test-development complexity by instituting some selector/UI conventions, refactoring out tests into pages, etc. The problem is that tiny things are going to break you. We change our components’ markup all the time, and in the Selenium model it virtually always breaks something.
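To show why small markup changes hurt so much, here’s a toy element tree with two lookups: one that needs every step of the path to match exactly, and one that just finds the first matching descendant. This is a hypothetical sketch in plain Java, assuming nothing about Selenium’s, Vaadin’s, or Aire’s actual APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical toy element tree; not Selenium's, Vaadin's, or Aire's API. */
final class ToyNode {
    final String tag;
    final List<ToyNode> children = new ArrayList<>();

    ToyNode(String tag, ToyNode... kids) {
        this.tag = tag;
        children.addAll(List.of(kids));
    }

    /** Brittle lookup: every tag in the path must match a direct child, in order. */
    Optional<ToyNode> byExactPath(String... path) {
        ToyNode current = this;
        for (String step : path) {
            ToyNode next = null;
            for (ToyNode child : current.children) {
                if (child.tag.equals(step)) {
                    next = child;
                    break;
                }
            }
            if (next == null) {
                return Optional.empty(); // one unexpected wrapper and the whole path is dead
            }
            current = next;
        }
        return Optional.of(current);
    }

    /** Looser lookup: first descendant with the tag, at any depth (depth-first). */
    Optional<ToyNode> firstDescendant(String wanted) {
        for (ToyNode child : children) {
            if (child.tag.equals(wanted)) {
                return Optional.of(child);
            }
            Optional<ToyNode> deeper = child.firstDescendant(wanted);
            if (deeper.isPresent()) {
                return deeper;
            }
        }
        return Optional.empty();
    }
}
```

Build an `aire-drawer` node containing a `vaadin-menu-bar` containing a `vaadin-button`, and both lookups find the button. Wrap the menu bar in a new `vaadin-vertical-layout` and `byExactPath("vaadin-menu-bar", "vaadin-button")` comes back empty, while `firstDescendant("vaadin-button")` still works. The longer and more exact the selector, the more innocent refactorings can break it.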

Using Aire/Zephyr

What does an equivalent test look like in Aire and Zephyr?

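@Routes(scanClassPackage = ModuleGrid.class)
class ModuleGridTest {

    void ensureStoppingAModuleThroughTheStopButtonStopsTheModule(
            @Select("vaadin-vertical-layout > aire-drawer") Drawer drawer,
            @Context TestContext context,
            @Autowired Zephyr zephyr) {
        // scope all further selections to the drawer
        val $ = context.downTo(drawer);
        val button = $.selectFirst("vaadin-button:nth-of-type(2)", Button.class)
                .get(); // throws exception if it's not there

        // NOTE: the lines between get() and allMatch(...) are missing from this listing.
        // The stream source and the click below are assumptions, not the original code.
        assertTrue(zephyr.getPlugins(State.Active).stream()
                .allMatch(plugin -> plugin.getLifecycle().getState() == State.Active));
        button.click(); // assumed: press the Stop button
        assertEquals(1, zephyr.getPlugins(State.Resolved).size());
    }
}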

Boom. Here are the tests in our CI:

It’s fully integrated with Spring Test (possibly the most useful middleware I’ve ever encountered), and because it uses Vaadin’s component model rather than the browser’s, it’s insensitive to DOM changes while being sensitive to view changes. Furthermore, it fully integrates with Spring/JTA transactions, etc., allowing you to test another challenging aspect: the data model (but that’s a topic for another time).


If you’ve made it this far, I really appreciate it! I hope it was helpful, and if this is something you’d like your org to adopt, it’s completely free and open-source. I’d be happy to lend what time I can to assist in adoption. Even though Aire::Test required an entire CSS selector engine to implement, our velocity is the highest it’s ever been.
