Dog Drives Car
Silly me shared the following picture with my sister-in-law yesterday,
as she was stuck in a massive traffic jam with her dog (in photo) in her mini.
"How did you make the picture" was the question. "PowerPoint" was my reply.
I used PowerPoint for that photo because it took me three minutes, and I had a slide deck open.
Stop, you might say, this has nothing to do with programming, or software designs! It does if we want to design a software where a dog drives. Yet how would we approach that? Here I tell you how I have designed and communicated designs of 'big and bold' softwares quickly and with good properties. And will also sketch out how to approach exotic requirements, such as dogs driving cars.
Arrows Away!
Let me show you how I work. This how I use to draw design 28 years ago (from my PhD thesis):
This is how I would have drawn that design 18 years ago (~2004):
Here I use arrows to represent flows. Arrows are used to capture the Lagrangian interpretation of flows (e.g. as stream oriented design). In this case: for a given set of inputs, there is one or more sets of models, and for each of these sets there is a solver. If this sounds confusing it is, as it was ambiguous. It is ambiguous as it is not clear which flows we are talking about. Are these flows of the actual simulation calculations (e.g. variables of the simulator, this was a simulator), are these flows of structural data dependencies (e.g. how big each array is), are these flows of types (e.g. real vs complex). The solution to that problem in 2004 was to draw multiple flows. For example, here we have the flow of configuration, and the flow of object hierarchy:
Note here I introduced these flows as Lagrangian, meaning we are 'moving along with the flow'. When implementing flows, there are always a 'outside observer' view of the flow, the Eulerian interpretation (e.g. a CPU core's view). These can also be drawn with arrows (e.g. see my 'old' stream processing video). In 2004, I would have used two different arrow types to show the difference of flow representation. For example in this diagram the 'stream processing solver engine' produces 'a stream of variable data':
Let's mention the strong influence of multi-stage programming on some of these 'notation concepts' (see Walid Taha ~2000 work referenced here) as one way to think of these arrows would be like this:
However, such a notation implies that the 'telescoping flow' has its own semantics, and that is tricky. An easier approach to draw something like this:
The idea being that the solver 'thing' is specialized with a model 'thing' (returning a new flow thing), which we can then specialize again with a variable 'thing', resulting in a 'fully formed flow thing' that includes the solver, model, and variable details. Importantly, we are no longer 'telescoping' things when we do it this way. Therefore one learning is that 'telescoping' is one representation, others are possible.
Holy Trinity Continuum
The notion of telescoping is much associated to monads. While the concept of 'specializing something with something else' is often implemented with with comonads. Yet this is not critical information to know. The reason being that monadic like logic can be expressed with continuation passing style (CPS), and again can also be expressed in defunctionalized forms (see this early ref). We have something that looks like this:
Even better, we have something that looks like this (correct me if I wrong, I am writing this fast today):
This last diagram is 'my holy trinity' of programming design belief, as I do not need to worry if a design will be implemented with types, closures or functions, as one can always be mapped into another. Instead, the key concept is that a design should express what is invariant no matter 'how it is implemented'. Said differently, designs are best when they express invariants that hold over multiple variants of implementations.
Notes:
- CPS is really functions + a more or less complex scoping logic implemented across the functions.
- Things are a little bit more tricky when we interpret things backwards. Still similar concepts hold.
Scoping Invariance
A primal invariant to mention is about 'variable scoping'. (The applicability of the concept is broader, and applicable to containers with access by indices versus access by search). Early languages noted two fundamentally different approaches to 'variable scope' semantics, the so called static and dynamic scoping. Dynamic scoping is still much a curiosity to developers in 2022. However, Richard Stallman writes: "Dynamic scope is useful". So what gives? The trick is that we need to restrict the dynamic scope semantics to make things scale. A good choice to limit dynamic scoping to have only 'one point of variable access'. With that we have the following symmetry:
- Static scoped variables are 'defined at one place' and possibly used many times.
- Dynamic scoped variables are 'defined possibly many times' used used 'at one place'.
Inspired by this, my notation rule is: arrows that go left-to-right are symmetric to those going right-to-left. Often just the direction is needed. For example, in the following diagram:
Directional Invariants
There are many software invariants. Many of them have their roots in the notion of 'sequence of effects'. For example, if we write something, then to 'read it' is just like 'to write it but backwards'. Here too we can use arrows to capture these relations, for example:
Again, the idea is: if a left-to-right arrow has a design feature, then there is matching 'reverse logic' right-to-left arrow with a 'complementary design feature'. I have about 10 of these invariant symmetries that I use 'all the time' to design, and a broader list of over 50 symmetries that I like to refer too.
Arrows and Dimensions to Help Design Software
I have been practicing to 'think backwards' for over ten years. (I sound incoherent when I forget to 'switch that mode off'). You may well wonder what 'running backwards' means. Luckily the following 'equivalence heuristics' help:
- While we 'execute forward' we can accumulate a lazy execution of reverse logic which we can run 'at the end' of the forward execution. At which point the reverse execution will be run forward, yet it will still be the reverse logic. (This is how reverse mode auto-diff is done).
- We go 'up' the call stack, after going 'down'. Similar to the previous bullet, we might say that we go forward as we go down into function calls, and go backwards as we return out of each function.
- A two phase commit has 'forward computation' in the first phase, and 'reverse computation' in the second phase.
By which I am saying: the design will verify things, accumulating a reversed execution logic which will be interpreted.
You may note that I am being 'very one dimensional', limiting myself to a left and right direction. We can in fact introduce as many dimensions as we want. Two dimensions (horizontal and vertical) often suffice to describe large architectures. (Sometime more dimensions are brought in with 45" or 30" and 60" degree arrow directions). For example, we may indicate our I/O logic 'vertically':
Which one to choose?
Both!
One way to read these designs are as follows:
- 1 top: while we write we accumulate to confirm the overall 'write commit' only when 'all writes' are valid from a 'reverse perspective'.
- 1 bottom: While we read we interpret.
- 2 top: To verify we read 'the past':
- 2 bottom: we can 're-interpret' the past by writing.
Further Directional Invariants
There are many important software design concept. Here are three more, expressed with complementary arrows:
In simple terms: A coherent design does not have complementary design concepts 'at the same level'. What the above is saying is:
- Both a local and global design must be expressed, and both are complementary.
- Providing authentication allows us to 'look back' with trust.
- If new 'things' are 'freshly' created, then at some point they become 'stale'. And two complimentary design logics must be designed to deal with this.
To note that this is how I was designing software early-2015 (see my pre-Elevence post).
(I leave the reader the reader the exercise to bring in authentication, trust, freshness and staleness).
Semantics, Tools, and Teamwork
Given a team of developers, I might suggest that design 1+2 be implemented with local monads and a global comonad (e.g. as a blockchain). Yet once I do that I lose flexibility. I also lose my audience. Instead I draw arrows, and explain what I mean with these arrows. I say what properties we 'the development team' are looking for, and sure enough we end up with a decent design that is robust and strong. Drawing design 1+2 takes just a few minutes in PowerPoint, and maybe would have needed a few tens of minutes to sketch out before (e.g. on a sheet of paper). Typically, I duplicate my 'master design slide' many times, to then add annotations to show how we might 'slip in' extra design requirements. For example, the notion of trust and authentication, and here, the logic to deal with 'staleness'.
I started this post with a picture of a dog driving a car. The important property of a software design is to be 'consistent by design'. Were I to design a 'dog driving car' software design. I would start by sketching out the complementary properties that I would need to bring in to the design. These might me:
No comments:
Post a Comment