Over the weekend, I decided once again to see if there was anything to be learned from the mountains of wishy-washy material on object oriented programming. Object-oriented patterns are unmathematical pseudoscience1, but, perhaps, the old-school programmers have some good insights on large-scale design? So I decided to hear what they had to say about this largely language-agnostic endeavour—I found a copy of Clean Architecture by Robert C. Martin online and read it in a couple of hours2.
If you’re active in online functional programming circles, then you must remember the earlier drama over Bob’s uneducated opinions on programming language theory and functional programming. Despite that, his books make a number of valid and insightful points, which shouldn’t be ignored (once you read past all of the dogma and cult-of-personality fluff).
I actually agree with and share many of the points he makes about architecture: the art of isolating your moving parts such that the whole machine can be modified easily and independently.
Before I launch into my critique, let me recount my background to give you an idea of my perspective.
I got interested in computer science around late 2011, and became consumed with trying to find anything of proper mathematical substance behind the plethora of complicated and arbitrarily-designed languages in the mainstream landscape3. That led me to Scheme and SICP, and through that to classic PL research from the 70s, and finally to the modern work in PL: statically-typed functional programming in Haskell and beyond, and category theory, the cutting edge in our understanding of abstraction.
So far in my engineering career, the systems I’ve worked on include: a large Java project with an “Enterprise-Scale Modular Service Oriented Architecture”, and a lot of Haskell projects—streaming packet parsing, and several research/teaching/industrial programming language implementations of varying complexity.
I first encountered Bob’s opinions on design in the summer of 2015, when I was drowning in the Java sea-of-mud. That was the first real-world system (a simple CRUD data aggregator and dashboard) I worked on, and it informed a lot of my opinions on good system and programming language design.
The code was such a Mount Everest of crap (250k lines!! Not a single line of test!!4) that I spent a whole month trying to understand what it was doing. The infrastructure was so broken that I endured such pains as spending three weeks trying to merge a 20-line change, and waiting an entire 12 hours to do an svn diff in Eclipse because the “Chief Architect” didn’t know how to code and allowed some numbnut to check all the build products into version control. A highlight of that project was a 500-line god-function that was indented 20 levels in and did runtime reflection lookups into an OSGi bundle registry to find instantiated classes and coerce objects to those types. A year later I realised that the whole setup was actually an elaborate hack to do sum types in Java. Bravo.
That was when I started thinking a lot about simplicity in system design and coding to minimise noise and maximise comprehension.
At the time, I was still a beginner at Haskell, but I had one lesson deeply imprinted upon me from trying to master Lisp—the meaning of abstraction from a language point of view.
The interpreter-oriented design advocated by SICP and Paul Graham’s bottom-up design methodology both teach the same thing—separate the logical description from the mechanical details of implementation.
… Language and program evolve together. Like the border between two warring states, the boundary between language and program is drawn and redrawn, until eventually it comes to rest along the mountains and rivers, the natural frontiers of your problem. In the end your program will look as if the language had been designed for it. And when language and program fit one another well, you end up with code which is clear, small, and efficient.
– Paul Graham, Programming Bottom-up
As I got deeper into statically-typed functional programming, I learnt that Haskellers call the same technique a different name: domain-specific language embedding. The Racketeers call it language-oriented programming. The OOP people call it use-case driven domain modelling5. My friend John de Goes calls it the Onion Architecture.
It’s all the same thing.
Languages are the essence of abstraction6.
Underneath the veneer of all the OOP diagrams and jargon, Martin’s Clean Architecture is advocating the same idea: language-oriented software design.
The language-oriented approach is a familiar concept to functional programmers of dynamically and statically typed languages alike.
I think part of the reason why it is much more innate to us functional programmers is that many of us have a PL academic background or learnt FP properly by reading the same set of research papers, which present a proper, mathematically formal model of affairs. And historically the primary application of FP was in research compiler engineering, which is just implementing the mathematical formalisms as programs.
The language-oriented methodology proceeds in three steps:
In operating systems and computer architecture, one manipulates a logical address space which is then translated to physical memory addresses by the memory management unit (with a translation lookaside buffer caching the mappings). The language-oriented approach implements a similar abstraction: treat all external software components like physical machinery and provide logical abstractions over them in your embedded language.
It doesn’t matter what your actual DSL implementation technique is: deep embedding, shallow embedding, final-tagless encoding, Lisp macros, a bag of functions, external language with concrete syntax, …whatever. You just need to model your domain functionally. Different applications demand different techniques. For instance, you would not model an accounting application with a deep embedding, but you need a deep embedding for a compiler because you need actual syntax trees to manipulate.
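To make the distinction concrete, here is a minimal sketch of the same toy arithmetic language embedded twice: once deeply, once in final-tagless style. The language and every name in it (Expr, ExprSym, dropZeros and so on) are made up for illustration and assumed only for this example.

```haskell
-- Deep embedding: the DSL is a data type, so a program is a syntax tree
-- that can be inspected and rewritten before it is interpreted.
data Expr
  = Lit Int
  | Add Expr Expr
  deriving (Show, Eq)

evalDeep :: Expr -> Int
evalDeep (Lit n)   = n
evalDeep (Add a b) = evalDeep a + evalDeep b

-- Something only a deep embedding permits: a rewriting pass over the tree.
-- (A single pass that drops additions of a literal zero, not a full normaliser.)
dropZeros :: Expr -> Expr
dropZeros (Add (Lit 0) e) = dropZeros e
dropZeros (Add e (Lit 0)) = dropZeros e
dropZeros (Add a b)       = Add (dropZeros a) (dropZeros b)
dropZeros e               = e

-- Final-tagless embedding: the DSL is a type class, and every instance
-- is an interpreter. There is no tree to manipulate.
class ExprSym repr where
  lit :: Int -> repr
  add :: repr -> repr -> repr

newtype Eval = Eval { runEval :: Int }

instance ExprSym Eval where
  lit = Eval
  add (Eval a) (Eval b) = Eval (a + b)

newtype Pretty = Pretty { runPretty :: String }

instance ExprSym Pretty where
  lit n = Pretty (show n)
  add (Pretty a) (Pretty b) = Pretty ("(" ++ a ++ " + " ++ b ++ ")")

-- The same term in both styles.
deepTerm :: Expr
deepTerm = Add (Lit 1) (Add (Lit 2) (Lit 0))

finalTerm :: ExprSym repr => repr
finalTerm = add (lit 1) (add (lit 2) (lit 0))
```

Choosing the result type picks the interpreter: runEval finalTerm evaluates the term, runPretty finalTerm prints it. The compiler case is visible in dropZeros, which needs an actual tree to pattern-match on.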
In this view, all concrete machinery is an interpreter. (Dan Friedman teaches that the CPU is just an interpreter for machine code.) Interpreters are the physical boundaries that isolate domain-specific components, translating the instructions of one language into another. Languages can be decomposed into sub-languages, each with their own interpreters. It’s little interpreters and languages all the way down.
At this conceptual level, we can view the philosophy of (large-scale) functional programming as being about language building (using pure functions). Language building is just another way of saying “domain modelling”—we capture the behaviours needed by the domain as functions, functions which are in turn composed of other functions. The basis of this set of functions forms the core of your domain specific language. Haskellers like to place emphasis on data modelling as the first thing you do, but the artful design of functions is equally important. Having a clean data model drives a clean operational semantics, but the operations that you must support also drives the data model7. (In the end, it’s all about designing your types, then building terms of that type.)
If you read the chapter on the functional paradigm as a seasoned functional programmer, it is painfully clear that Bob doesn’t understand FP at this level. Nor, it seems, does he understand any of the theory8. He just thinks it’s about no mutation, which could hardly be further from the truth. No mutation is a consequence of having a clean mathematical design. Most functional languages are strict and impure anyway.
With this understanding in mind, let us examine the core ideas of the book.
I shall not consider the nitty-gritty details of the specific OO practices; suffice to say that “clean” OO as advocated by Bob is just functional programming at a macro level (e.g. on entire object-module things). Functional programmers need not care about such things.
For example, in his discussion of the Single Responsibility Principle, he takes a messy Employee class with three methods, calculatePay, reportHours, and save, required by three separate entities, and refactors it into a pure Employee data type and three object-functions, PayCalculator, HourReporter, EmployeeSaver, that take the Employee data structure and perform the specific function9. (With functional programming, this separation is automatic.) Then he bundles the things together again with an EmployeeFacade (a module signature). But, I ask, if you’re going to write OO code like this, why not just use Haskell?
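As a rough sketch of what that refactoring looks like when the separation comes for free: a plain record plus three independent functions. The fields, the pay rule, and the persistence callback are all assumptions of mine for illustration; the book does not specify them.

```haskell
-- Plain data with no behaviour attached.
-- The fields are hypothetical; the book does not list them.
data Employee = Employee
  { employeeName :: String
  , hourlyCents  :: Int   -- pay rate in cents per hour (assumed)
  , hoursWorked  :: Int
  } deriving (Show, Eq)

-- Counterpart of PayCalculator (the pay rule is an assumed placeholder).
calculatePay :: Employee -> Int
calculatePay e = hourlyCents e * hoursWorked e

-- Counterpart of HourReporter.
reportHours :: Employee -> String
reportHours e = employeeName e ++ ": " ++ show (hoursWorked e) ++ "h"

-- Counterpart of EmployeeSaver. The actual persistence action is passed in,
-- so this function does not know where the record ends up.
saveEmployee :: (String -> IO ()) -> Employee -> IO ()
saveEmployee writeRecord e = writeRecord (show e)

-- The role of EmployeeFacade would be played by a module export list, e.g.
--   module Employee (Employee(..), calculatePay, reportHours, saveEmployee) where

alice :: Employee
alice = Employee { employeeName = "Alice", hourlyCents = 2500, hoursWorked = 40 }
```

Each function can change for its own stakeholder without touching the others, which is all the Single Responsibility Principle asks for here.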
There are two parts to system architecture:
The book doesn’t say much about designing good Lego bricks, except for these two pieces of advice: (1) architecture should communicate intent, and (2) you should design for growth.
The first principle measures to what degree your DSL is crystallised at the centre, how flexible its abstractions are, and how well it fits the domain. The “design for growth” part is just the DSL version of Guy Steele’s advice from Growing a Language. You should watch the talk instead. The basic principle is just to make general-purpose combinators and good domain-specific data types. I don’t know how to make good combinators in OO languages, because the necessary facilities just aren’t available there.
However, what I found more illuminating was the material on how to split stuff up and how to organise dependencies to minimise disruption. Let’s examine this in further detail.
Once I started thinking deeper about the material, I found that the book had some folksy insights on compositionality. He does not use the proper scientific terminology, but the “use case logic” or whatever Bob calls it that sits at the centre of the system forms a language, in the formal languages sense. If you have a set of things and a means of combining them, then it’s a language.
The language also serves as a framework within which we organize our ideas about processes. … Every powerful language has three mechanisms for accomplishing this:
- primitive expressions, which represent the simplest entities the language is concerned with,
- means of combination, by which compound elements are built from simpler ones, and
- means of abstraction, by which compound elements can be named and manipulated as units.
– Structure and Interpretation of Computer Programs, Chapter 1: The Elements of Programming
The core language of an object oriented architecture, as described in the book, is implemented as a composition of DSLs, smeared across lots of Java classes and interfaces. Each interface is a little language (a final embedding10). The components that implement an interface \(L\) are interpreters for \(L\). The interfaces are the objects of interest: they define the interconnections and couplings between subsystems.
The overall core language has primitive expressions (your data types), a means of combination (your functions) and a means of abstraction (creating new functions in the host language).
I prefer to think of this as linear dataflow:
    IN1 IN2 ...
      -> format into C_i (core input language)
      -> [ -> -> ->  core logic in many sub-DSLs ]
           |  |  |
           V  V  V
      -> core output language C_o
      -> many evaluators for different outputs
    OUT1 OUT2 ...
The trick is to isolate the outer world from the core, and have the core “point outwards” in both directions. This is simple. The core is set up to communicate only through stable and semantically well-specified I/O languages, isolating it from the rest of the world.
Thus, your application logic is a compiler from your input language \(C_i\) to output language \(C_o\). Each step of computation in the core logic can be thought of as a compiler pass that transforms the core data structures from something to something else.
(Note: here I use “compiler” and “interpreter” interchangeably. If an interface specifies some output data format, then the consumer compiles from that to something else; if the interface specifies some functions, then the consumer implements an interpreter that gives some concrete operational semantics for those functions. You get the idea.)
The system is easy to change because these specific interpreters are not part of your system. If a client implementing language \(M\) wants to feed you some different input, he writes a compiler from \(M_o\) to \(C_i\) and feeds you the result. If he wants to get some different output, he implements his own compiler from \(C_o\) and munges the data to \(M_i\) himself11.
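A small sketch of this shape, assuming a made-up ledger domain: the core is a composition of pure passes from an input language to an output language, while the parser and the renderers live outside it. Request, Report, and every function name here are hypothetical.

```haskell
import Text.Read (readMaybe)

-- Core input language C_i (hypothetical).
data Request = Deposit Int | Withdraw Int
  deriving (Show, Eq)

-- Core output language C_o (hypothetical).
data Report = Report { balance :: Int, overdrawn :: Bool }
  deriving (Show, Eq)

-- Pass 1: lower each request to a signed amount.
toDelta :: Request -> Int
toDelta (Deposit n)  = n
toDelta (Withdraw n) = negate n

-- Pass 2: fold the amounts into the output language.
toReport :: [Int] -> Report
toReport ds = Report { balance = total, overdrawn = total < 0 }
  where total = sum ds

-- The core is the composition of its passes: a compiler from C_i to C_o.
core :: [Request] -> Report
core = toReport . map toDelta

-- Outside the core: one client's compiler from its own text format into C_i.
parseRequest :: String -> Maybe Request
parseRequest s = case words s of
  ["deposit", n]  -> Deposit <$> readMaybe n
  ["withdraw", n] -> Withdraw <$> readMaybe n
  _               -> Nothing

-- Also outside the core: two independent evaluators for C_o.
renderText :: Report -> String
renderText r = "balance: " ++ show (balance r)
            ++ (if overdrawn r then " (overdrawn)" else "")

renderCsv :: Report -> String
renderCsv r = show (balance r) ++ "," ++ show (overdrawn r)
```

Adding a new input format or a new output format means writing another function at the edge; core and its two passes stay untouched.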
The result of this setup is a clean isolation between the inner workings of the business logic and everything else. It works exactly like an actual compiler: the input is standardised (a source file) and the output is standardised (a binary). Do you know what the guts of ghc are doing? Probably not, and you shouldn’t care if you just want to compile your code.
Bob gives the example of avoiding the need for a database when implementing FitNesse by swapping in a sequence of different implementations for the WikiPage interface for data access. First, he had a dummy MockWikiPage, then when it came time for some actual data, he implemented InMemoryPage, and when the need for persistence arose, FileSystemWikiPage, and finally someone came along and implemented a short-lived MySqlWikiPage. That’s right, they implemented a bunch of different interpreters for the WikiPage language, and swapped them in at the component connections.
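Here is roughly how I would picture that in Haskell: the page language is a record of operations, and each storage back end is a value of that record. I don’t know the actual methods of FitNesse’s WikiPage interface, so loadPage, savePage, and the rest are assumed stand-ins.

```haskell
import Data.IORef (modifyIORef', newIORef, readIORef)

-- The page language: just the operations the core needs.
-- (Hypothetical operations; not FitNesse's real interface.)
data PageStore m = PageStore
  { loadPage :: String -> m (Maybe String)
  , savePage :: String -> String -> m ()
  }

-- Interpreter 1: a mock that always answers with canned content.
mockStore :: Applicative m => PageStore m
mockStore = PageStore
  { loadPage = \_ -> pure (Just "stub content")
  , savePage = \_ _ -> pure ()
  }

-- Interpreter 2: an in-memory store backed by a mutable association list.
-- New entries are prepended, so the latest save for a name wins on lookup.
inMemoryStore :: IO (PageStore IO)
inMemoryStore = do
  ref <- newIORef []
  pure PageStore
    { loadPage = \name -> lookup name <$> readIORef ref
    , savePage = \name body -> modifyIORef' ref ((name, body) :)
    }

-- Core logic is written against the language only; it cannot tell
-- which interpreter it has been handed.
appendLine :: Monad m => PageStore m -> String -> String -> m ()
appendLine store name line = do
  old <- loadPage store name
  savePage store name (maybe line (\o -> o ++ "\n" ++ line) old)
```

A file-backed or SQL-backed interpreter would be one more value of type PageStore IO, and appendLine would not change.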
While writing the preceding section, I ran into a difficulty that took me a while to resolve. Supposedly, it is advantageous to organise things such that the core is a sink in the dependency graph, but what does this mean for the language definition/implementation pieces?
Eventually I realised my mistake: I had confused the two kinds of language embedding an interface could be. There is no problem if something is a data type, but there is if you use a Java-style interface. An interface, as I have stated, is a tagless-final embedding: the interface defines a language, and the implementors of the interface define an interpreter. Components that are parameterised by an interface are parameterised by the interpreter. That is, in the ML sense, those components are functors, which take an instantiation of some signature to plug into their machinery.
Thus, the core of the application should consist of: a functor parameterised by a bunch of interface language signatures that it defines and controls. All external components that you wish to make communicate with the core logic must implement one of the well-defined IO signatures. That’s it.
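Haskell has no ML functors as such, so in this sketch I assume the usual stand-in: a plain function from records of operations (the signatures) to the assembled core. Gateway, Presenter, Core, and mkCore are illustrative names of my own, not taken from the book.

```haskell
-- Signatures that the core defines and controls (hypothetical).
newtype Gateway m = Gateway { fetchAmounts :: m [Int] }
newtype Presenter m = Presenter { present :: Int -> m () }

-- What the core offers once its dependencies are plugged in.
newtype Core m = Core { runReport :: m () }

-- The "functor": given an interpreter for each signature, build the core.
-- The core names only its own signatures, never a concrete database or UI.
mkCore :: Monad m => Gateway m -> Presenter m -> Core m
mkCore gw pr = Core { runReport = fetchAmounts gw >>= present pr . sum }

-- One possible instantiation: interpreters that only log, using the
-- writer-like monad ((,) [String]) from base.
logGateway :: Gateway ((,) [String])
logGateway = Gateway { fetchAmounts = (["fetch"], [1, 2, 3]) }

logPresenter :: Presenter ((,) [String])
logPresenter = Presenter { present = \n -> (["total: " ++ show n], ()) }
```

Every dependency arrow points at the signatures owned by the core, which is what makes the core a sink in the dependency graph.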
If you look at the example he gives, this is exactly what is going on:
The core is the stuff in the box labelled Interactor. It provides a language for the database module to implement (FinancialDataGateway), and defines an output language composed of some data structures (FinancialReportRequest) and a signature language
The point of view that resonated with me most is Martin’s rules on dependency management and his aversion to frameworks and libraries. Whenever I look at some framework-app I always get lost because I have no idea what the core abstractions are. Because there aren’t any. The app is implemented in the language of the framework.
What is a framework?
A framework \(F\) is an interpreter from language \(F_i\) to language \(F_o\) (the API).
The trouble with frameworks is that to use one, you have to match your application to the language of the API. If you’re not careful, you end up coupling your system to the framework. This means that you create a cycle between yourself and the framework by architecting your component as an interpreter from \(F_o\) to \(F_i\). This gives rise to all sorts of awkwardness and bugs due to the inevitable semantic mismatch. Then when the authors of \(F\) decide to change that language, you’re screwed. If you want to be in control, you design your own language.
The proper way to use a framework is as Bob describes: compartmentalise it in a module at the outer edges of the system. That means writing a compiler from your internal output language \(C_o\) to the language of \(F\), keeping that away from everything else. Framework code wiring is just an interpreter!
Everything else he says about libraries and “details” is essentially the same thing: if you want to decouple yourself from some API, define your own API for doing that thing and translate it to the actual API.
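A sketch of that compartmentalisation, under heavy assumptions: Outcome stands for the core’s output language, and FrameworkResponse is a stand-in for whatever response type some web framework expects. Neither is a real library API; both are invented for this example.

```haskell
-- Core output language (hypothetical).
data Outcome
  = Created String
  | NotFound
  | Invalid String
  deriving (Show, Eq)

-- Core logic: knows nothing about any framework.
register :: [String] -> String -> Outcome
register existing name
  | null name            = Invalid "empty name"
  | name `elem` existing = Invalid "duplicate"
  | otherwise            = Created name

-- Stand-in for a framework's response type (NOT a real library API).
data FrameworkResponse = FrameworkResponse
  { status :: Int
  , body   :: String
  } deriving (Show, Eq)

-- The edge module: the only place that mentions the framework.
-- It is a compiler from the core's output language to the framework's.
toFramework :: Outcome -> FrameworkResponse
toFramework (Created ident) = FrameworkResponse 201 ("created " ++ ident)
toFramework NotFound        = FrameworkResponse 404 "not found"
toFramework (Invalid why)   = FrameworkResponse 400 ("invalid: " ++ why)
```

If the framework changes its language, only toFramework has to be rewritten; register and Outcome are unaffected.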
Here I explain some of the example systems from the book from the view of a PL hacker. The examples are from the appendix, which is by far the best part of the book. Even if you skip the rest of the content, you should read the appendix. Bob’s war stories of programming in the 60s are very illuminating—I learnt a lot, both about his point of view and how different hacking was back in the day. I got the impression that the situation back then was quite (informally) language-oriented, and then things got complex and it all started going horribly wrong.
The Teradyne laser trim system was a control system for precision laser etching machinery. It was implemented in assembly on a custom PDP-8. Source code was stored on a system with a linear-access storage device and written using what was most likely ed. There was a custom OS/firmware layer, on top of which was hosted an interpreter. The application logic for laser control was then written in some primitive DSL.
A networked system for dispatching service jobs. It consisted of a bunch of largely independent star networks that were linked together. The central control machines communicated with the remote machines via a small command language.
Another call dispatch system, but much more complex in scope than 4-TEL. The core logic was implemented in an external DSL, embedded in a primitive data-description language (like JSON).
Martin describes a success and a failure in the process of OO design. He was commissioned to build a suite of 18 GUI applications that would share a common core framework, so he built the framework while building the first app, then discovered that it didn’t work for the rest. Next, he re-engineered the framework properly and cranked out the rest of the apps. The story sounds similar to pg’s description of Lisp macro design: you cannot know what the DSL will be ahead of time; you have to evolve it alongside the concrete implementation.
I generally liked the book at a high level and found it an enlightening read. The gem of the book is the appendix, which I highly recommend taking a look at. There are some good ideas on architecture, but they are presented with too much fluff and imprecision.
To me, the author is too engrossed in object-oriented programming and TDD dogma to truly appreciate the grander principles of what he is advocating in his own book. It took Mr Martin a career’s worth of experience to learn these lessons, which are obvious if you treat programming as a mathematically rigorous activity.
The purpose of abstracting is not to be vague, but to create a new semantic level in which one can be absolutely precise.
– Dijkstra, EWD340
This essay was expanded from my twitter thread.
1. OO patterns are codified workarounds for missing language features, namely higher-order functions, sum types, and higher-kinded polymorphism (that allows for the expression of recursion schemes).
2. There isn’t any hard stuff in this book, because, once again, OOP is not real science!
3. I had grown up being told that computers were mathematical and that you had to be good at maths to program them, so programming languages must be mathematically principled things, right?
4. I’m no TDD proponent, but you’ve got to have tests… the situation was beyond ridiculous. There was zero automation, and the only testing done was to load up the software in a mock setup connected to real hardware (I’m talking about wind turbines and electrical generators the size of a garage) and click around.
5. or something like that.
6. [Oxford English Dictionary] Abstraction: (noun) freedom from representational qualities. Once you name a group of things, you have a language.
7. I am aware that I’m starting to sound like I’m advocating OOP. No, that cannot be farther from the truth! I’m talking about the logical design of the system. The packaging of data and function is an implementation detail and an engineering concern; from a practical organisational and logistical perspective, they should be kept cleanly separated.
8. The day he says something insightful about type theory or recursion schemes will be the day I reevaluate this opinion.
9. The proper way to do OOP is to do FP. So FP is the right way and OOP is the wrong way.
10. A final (or shallow) embedding is a representation of the DSL using the constructs of the metalanguage. In the case of Java interfaces, there is not really a notion of , , there is no … of a typed is referred to the lecture notes in http://okmij.org/ftp/tagless-final.
11. There’s probably an even more abstract formalism of this in terms of category theory, something like: the system forms a category S where the subcomponents are the objects and the subcomponent interconnections are the arrows. Then consider the functor category of software artefacts where the arrows are the dependency relations between them for some set of versions. There exists a functor in the functor category whenever a version update is possible.