Sunday, August 14, 2022

Software with formal properties at a low cost through monads and co-execution

Not so long ago a good friend of mine mentioned his need for 'formal properties' in his code.  

I asked him if he had already made the code purely monadic. He had. "Good", I said, "because the 'cheap route to formal properties' is first to ‘monadize' your code, and then 'reverse it' by bringing in 'co-execution'".  That last part needs some development. 

Formal properties are not only about execution, actually mostly not about execution, but more about formal properties of state. So while you do need to show execution invariants, you often much more need to show that the ‘results’ happen within stateful invariants (e.g. of the ledger). The easiest way to do this is to  deconstruct / co-execute the state model backwards. This 'backwards' logic must 'match-up' with the 'forward execution' (monadic) model. My view has always been comonads are easiest for this, especially if your stateful model (e.g. a blockchain) is already something that is easy to express as a comonad. 

You might note that my last post is all about combining 'forward' and 'backwards' arrows. One way to interpret these arrows is 'right arrow' for monads, and 'left arrows' for comonads. And sure enough, that is an easy way to sketch out higher order systems where adjoint invariant properties (e.g. rights, expectations, authorizations, ...) are assembled 'with good properties' by design. As mentioned in previous slide, monads are easy to express as CPS, and as are comonads with a bit more 'gymnatics' (see previous post).

All original content copyright James Litsios, 2022.


Friday, August 05, 2022

Why Use Arrows and PowerPoint to Express Software Designs?

Dog Drives Car 

Silly me shared the following picture with my sister-in-law yesterday,

as she was stuck in a massive traffic jam with her dog (in photo) in her mini. 
"How did you make the picture" was the question. "PowerPoint" was my reply.

I used PowerPoint for that photo because it took me three minutes, and I had a slide deck open.

Stop, you might say, this has nothing to do with programming, or software designs! It does if we want to design a software where a dog drives. Yet how would we approach that? Here I tell you how I have designed and communicated designs of 'big and bold' softwares quickly and with good properties. And will also sketch out how to approach exotic requirements, such as dogs driving cars.

Arrows Away!

Let me show you how I work. This how I use to draw design 28 years ago (from my PhD thesis):

This is how I would have drawn that design 18 years ago (~2004):

Here I use arrows to represent flows. Arrows are used to capture the Lagrangian interpretation of flows (e.g. as stream oriented design). In this case: for a given set of inputs, there is one or more sets of models, and for each of these sets there is a solver. If this sounds confusing it is, as it was ambiguous. It is ambiguous as it is not clear which flows we are talking about. Are these flows of the actual simulation calculations (e.g. variables of the simulator, this was a simulator), are these flows of structural data dependencies (e.g. how big each array is), are these flows of types (e.g. real vs complex). The solution to that problem in 2004 was to draw multiple flows. For example, here we have the flow of configuration, and the flow of object hierarchy:

Note here I introduced these flows as Lagrangian, meaning we are 'moving along with the flow'. When implementing flows, there are always a 'outside observer' view of the flow, the Eulerian interpretation (e.g. a CPU core's view). These can also be drawn with arrows (e.g. see my 'old' stream processing video). In 2004, I would have used two different arrow types to show the difference of flow representation. For example in this diagram the 'stream processing solver engine' produces 'a stream of variable data': 

Let's mention the strong influence of multi-stage programming on some of these 'notation concepts' (see Walid Taha ~2000 work referenced here) as one way to think of these arrows would be like this:


However, such a notation implies that the 'telescoping flow' has its own semantics, and that is tricky. An easier approach to draw something like this:

The idea being that the solver 'thing' is specialized with a model 'thing' (returning a new flow thing), which we can then specialize again with a variable 'thing', resulting in a 'fully formed flow thing' that includes the solver, model, and variable details. Importantly, we are no longer 'telescoping' things when we do it this way. Therefore one learning is that 'telescoping' is one representation, others are possible.

Holy Trinity Continuum

The notion of telescoping is much associated to monads. While the concept of 'specializing something with something else' is often implemented with with comonads. Yet this is not critical information to know. The reason being that monadic like logic can be expressed with continuation passing style (CPS), and again can also be expressed in defunctionalized forms (see this early ref).  We have something that looks like this:

Even better, we have something that looks like this (correct me if I wrong, I am writing this fast today):

This last diagram is 'my holy trinity' of programming design belief, as I do not need to worry if a design will be implemented with types, closures or functions, as one can always be mapped into another. Instead, the key concept is that a design should express what is invariant no matter  'how it is implemented'. Said differently, designs are best when they express invariants that hold over multiple variants of implementations.

Notes:

  • CPS is really functions + a more or less complex scoping logic implemented across the functions.
  • Things are a little bit more tricky when we interpret things backwards. Still similar concepts  hold.

Scoping Invariance

A primal invariant to mention is about 'variable scoping'. (The applicability of the concept is broader, and applicable to containers with access by indices versus access by search). Early languages noted two fundamentally different approaches to 'variable scope' semantics, the so called static and dynamic scoping. Dynamic scoping is still much a curiosity to developers in 2022. However, Richard Stallman writes: "Dynamic scope is useful".  So what gives? The trick is that we need to restrict the dynamic scope semantics to make things scale. A good choice to limit dynamic scoping to have only 'one point of variable access'. With that we have the following symmetry:

  • Static scoped variables are 'defined at one place' and possibly used many times.
  • Dynamic scoped variables are 'defined possibly many times' used used 'at one place'. 
An 'semi-formal' insight is that dynamic scoping is static scoping 'running backwards' (with the restriction above). In a simplified arrow form, this looks like this, indicating an invariance correspondence between static scope going forward and dynamic scope going backwards:

Inspired by this, my notation rule is: arrows that go left-to-right are symmetric to those going right-to-left. Often just the direction is needed. For example, in the following diagram:

If A has a static scope, B has a symmetric dynamic scope, and vice-versa.

Directional Invariants

There are many software invariants. Many of them have their roots in the notion of 'sequence of effects'. For example, if we write something, then to 'read it' is just like 'to write it but backwards'. Here too we can use arrows to capture these relations, for example:

Again, the idea is: if a left-to-right arrow has a design feature, then there is matching 'reverse logic' right-to-left arrow with a 'complementary design feature'.  I have about 10 of these invariant symmetries that I use 'all the time' to design, and a broader list of over 50 symmetries that I like to refer too.

Arrows and Dimensions to Help Design Software

I have been practicing to 'think backwards' for over ten years. (I sound incoherent when I forget to 'switch that mode off'). You may well wonder what 'running backwards' means. Luckily the following 'equivalence heuristics' help:

  • While we 'execute forward' we can accumulate a lazy execution of reverse logic which we can run 'at the end' of the forward execution. At which point the reverse execution will be run forward, yet it will still be the reverse logic. (This is how reverse mode auto-diff is done).
  • We go 'up' the call stack, after going 'down'.  Similar to the previous bullet, we might say that we go forward as we go down into function calls, and go backwards as we return out of each function. 
  • A two phase commit has 'forward computation' in the first phase, and 'reverse computation' in the second phase.
The important insight here is that reverse logic does not need to be run backwards. Yet it does need to be 'arranged in the design' to have a proper 'reversed logic'. Therefore we do not need to draw reverse logic from right-to-left if we have a way to indicate that it is 'forward execution that manages backward logic'. I use the U-turn arrow for that. The previous example can be drawn as:

Nothing imposes directions, therefore the following is a perfectly valid design sketch:

By which I am saying: the design will verify things, accumulating a reversed execution logic which will be interpreted. 
You may note that I am being 'very one dimensional', limiting myself to a left and right direction. We can in fact introduce as many dimensions as we want. Two dimensions (horizontal and vertical) often suffice to describe large architectures. (Sometime more dimensions are brought in with 45" or 30" and 60" degree arrow directions). For example, we may indicate our I/O logic 'vertically':

When we combine the two, we might indicate the following two designs:

Which one to choose?
Both!
One way to read these designs are as follows:

  • 1 top: while we write we accumulate to confirm the overall 'write commit' only when 'all writes' are valid from a 'reverse perspective'.
  • 1 bottom: While we read we interpret.
  • 2 top: To verify we read 'the past':
  • 2 bottom: we can 're-interpret' the past by writing.
A careful reader might note that while we may want 1 and 2, we cannot just have them! The issue is that when combined, we are mixing 'left-to-right' and 'right-to-left' arrows 'of the same thing'. For example, 1 has Verify as a U-turn and 2 as Verify as a left-to-right arrow. That will not work, as one is the 'reverse direction' of the other. 

Further Directional Invariants

There are many important software design concept. Here are three more, expressed with complementary arrows:

In simple terms: A coherent design does not have complementary design concepts 'at the same level'.  What the above is saying is:

  • Both a local and global design must be expressed, and both are complementary. 
  • Providing authentication allows us to 'look back' with trust.
  • If new 'things' are 'freshly' created, then at some point they become 'stale'. And two complimentary design logics must be designed to deal with this. 
With that in mind, the previous two designs can be brought together as as single coherent design:

To note that this is how I was designing software early-2015 (see my pre-Elevence post).
(I leave the reader the reader the exercise to bring in authentication, trust, freshness and staleness).

Semantics, Tools, and Teamwork

Given a team of developers, I might suggest that design 1+2 be implemented with local monads and a global comonad (e.g. as a blockchain). Yet once I do that I lose flexibility. I also lose my audience. Instead I draw arrows, and explain what I mean with these arrows. I say what properties we 'the development team' are looking for, and sure enough we end up with a decent design that is robust and strong.  Drawing design 1+2 takes just a few minutes in PowerPoint, and maybe would have needed a few tens of minutes to sketch out before (e.g. on a sheet of paper). Typically, I duplicate my 'master design slide' many times, to then add annotations to show how we might 'slip in' extra design requirements. For example, the notion of trust and authentication, and here, the logic to deal with 'staleness'. 
I started this post with a picture of a dog driving a car. The important property of a software design is to be 'consistent by design'. Were I to design a 'dog driving car' software design. I would start by sketching out the complementary properties that I would need to bring in to the design. These might me:

Then I would draw these in PowerPoint and iterate and grow to create a good design.

All original content copyright James Litsios, 2022.