Sunday, January 11, 2015

Software designs that grow with monads, comonads, and type compatibility

This post is about designing software with internal and external APIs that are robust to future changes. It is therefore about API compatibility, but more importantly it is about the compatibility of a full software design with future changes. Not surprisingly, monads and comonads are part of the presented solution, as is a careful approach to the use of types.

I had a "aha" moment last year when I watched a video (by Timothy Baldridge)
that showed how an AST for a Clojure compiler was fully based on key-value pairs (nested hash maps), therefore without typed structures nor classes, and was doing very well. The thing is, I have suffered enough times to get the types of the different phases of a compiler to fit together. Therefor the idea of giving up on types and just using key-value pairs, that can easily be added for each compiler phase, seemed really to be an attractive way to write a compiler.
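
To make this concrete, here is a minimal sketch of the idea in F# (the compiler in the video is in Clojure; the keys and phases below are invented): each AST node is just a map from string keys to boxed values, and a later phase can add keys without changing any type definition.

// Minimal sketch (invented keys and phases): an AST node is just a
// key-value map, so each compiler phase can add its own entries.
type Node = Map<string, obj>

// A hypothetical parse phase produces a node with a few keys...
let parsed : Node =
    Map.ofList [ "op", box "const"; "value", box 42 ]

// ...and a later analysis phase adds its own keys without touching any
// type definition, which is the "growth" property that attracted me.
let analysed : Node =
    parsed |> Map.add "inferred-type" (box "int") |> Map.add "constant?" (box true)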

Key-value pairs, duck typing, and many "traditional" functional patterns (think Lisp, not Haskell) all have in common their reliance on generic, almost typeless, structures. So while each "atomic element" of these patterns has a type (e.g. int, string, ...), the structures (e.g. struct, list, array, ...) are all generically typed.

Types are what capture the invariants of a design. Giving up on types is in effect giving up on capturing those invariants in a robust manner. Using types normally leads to higher quality software, yet with enough complexity, types no longer capture all the design invariants of a program. Worse, sometimes types actually hinder the design by holding it back, because they are not powerful enough to capture the desired invariant. This is the case with the difficulty of designing a typed AST that "fits" all phases of a compiler. This rigid nature of types is also the hidden problem of API compatibility.

The invariants of APIs are captured by the types of their signatures. When new features or corrections are added to an API, these invariants and associated types evolve. When APIs link different software projects, changing API types is where API compatibility becomes an issue: APIs remain compatible when their types change yet stay compatible with the older types; APIs become incompatible when the types are no longer compatible. An example of type compatibility in OO is to derive a new class from an existing class, and to give the new class a constructor that takes the old class. Unfortunately, that last example is at the source level. At the binary level, compatibility under type change is usually nonexistent, especially when focusing on forward compatibility. Note that API compatibility is not only about types: an API will also become incompatible when the interpretation given to values transferred over the API changes. Therefore to remain compatible, an API can add new value interpretations but must also ensure that past value interpretations never change.
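
To illustrate that OO example (a hedged sketch in F#, all names invented): the new class derives from the old one and adds a constructor that takes the old class, so old source keeps compiling against the new type, yet nothing is promised at the binary level.

// Hypothetical illustration of the OO example above (names invented).
// Version 1 of an API type:
type OrderV1(symbol: string, quantity: int) =
    member this.Symbol = symbol
    member this.Quantity = quantity

// Version 2 derives from version 1 and adds a constructor from the old type,
// so source written against version 1 can be migrated with little change.
type OrderV2(symbol: string, quantity: int, limitPrice: decimal option) =
    inherit OrderV1(symbol, quantity)
    new (old: OrderV1) = OrderV2(old.Symbol, old.Quantity, None)
    member this.LimitPrice = limitPrice

// Source-level compatibility only: binaries compiled against OrderV1
// know nothing of OrderV2.
let upgraded = OrderV2(OrderV1("ABC", 100))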

Serializing data for transport and for storage is largely about breaking program-defined types down into a language's standard basic types, and keeping a strict discipline of interpretation to ensure compatibility. Ensuring both backward and forward compatibility of an API therefore means maintaining a strict growth of value interpretations and using serialization packages like Protocol Buffers or Thrift. The question is then: how do we ensure the future growth of a complete software design, not just a single API, but a family of APIs? Just like with the single API, the answer also lies in the notion of serialization. The goal is to stitch together the typed lower level design with a higher level untyped design that can change over time.
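
As a hedged sketch of the "strict growth of value interpretation" rule (the tags and names below are invented, not from any real protocol): a reader keeps unknown tags instead of rejecting them, and never changes the meaning of tags it already knows.

// Hedged sketch (invented tags): value interpretations may only grow, so a
// reader must tolerate tags it does not know yet instead of rejecting them.
type Side =
    | Buy            // tag 1, interpretation fixed forever
    | Sell           // tag 2, interpretation fixed forever
    | Unknown of int // any tag added by a future version of the API

let decodeSide tag =
    match tag with
    | 1 -> Buy
    | 2 -> Sell
    | other -> Unknown other   // forward compatible: keep it, do not fail

let encodeSide side =
    match side with
    | Buy -> 1
    | Sell -> 2
    | Unknown other -> other   // round-trips a newer writer's value unchanged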

Data is serialized by walking through it in a certain order and breaking it down into its primitive types. Programs can be broken down the same way. Yet to do that, you first need to adopt a functional programming style, because it is hard to serialize procedural constructions. In a functional style, only functions need to be "broken down".

In the good old days, scalable software design was about using constructions such as data schemas, object patterns, futures, continuation style, etc. Data schemas are still good, but all these other program construction elements must be revisited in light of the fact that they can all be implemented with monads and comonads. More importantly, they must be revisited because the bind and cobind operators (and other monadic and comonadic operators) are what serialize functions! Therefore, just like you must serialize your data schema to ensure future data compatibility, you must "serialize" your functions with monads and comonads to ensure future design compatibility.
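
To make that claim concrete, here is a minimal F# sketch of the two operators I have in mind (the option and environment types are only illustrative stand-ins, not a recommended design):

// Illustrative shapes only: bind sequences effectful steps,
// extend (cobind) grows context-dependent queries.

// A monadic bind, here for the trivial "step that may fail" monad (option):
let bind (f: 'a -> option<'b>) (m: option<'a>) : option<'b> =
    match m with
    | Some a -> f a
    | None -> None

// A comonadic extract/extend, here for a trivial "value in an environment" pair:
type Env<'e, 'a> = Env of 'e * 'a
let extract (Env (_, a)) = a
let extend (f: Env<'e, 'a> -> 'b) (Env (e, a)) : Env<'e, 'b> =
    Env (e, f (Env (e, a)))

// Once steps are written against bind/extend, the wiring between them is what
// gets "serialized", and can later be re-targeted (transactions, logging,
// injection, ...) without rewriting the steps themselves.
let pipeline = Some 2 |> bind (fun x -> Some (x + 1)) |> bind (fun x -> Some (x * 10))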

Here are a few examples of existing designs that do this:
  • Injection frameworks are comonadic constructions. 
  • Transactional frameworks are monadic constructions.
  • Parallel tasks are monadic constructions.
  • Embedded queries are monadic constructions.
  • Reactive streams are monadic constructions.
  • Lenses are comonadic constructions.
  • Automatic differentiation (computing analytical derivatives of numerical code) is both monadic (forward differentiation) and comonadic (backward differentiation).
Just like data compatibility is tied to the order in which data is traversed, future design compatibility is tied to the "order" of function serialization. That is to say, each software design is strongly defined by the order in which functions are broken into monadic and comonadic constructions. While monads and comonads have a duality relationship, they fundamentally differ in "character": monads are "trace" centric, while comonads are environment centric. Said differently, conditional expressions can be monadic, while comonadic expressions can abstract away their environment. A design is then a chosen hierarchy of monads and comonads with a chosen set of API extension points.

Just like with data compatibility, future design compatibility is tied to the degree to which types can be changed and remain compatible. And again, one must differentiate between the need for source compatibility (e.g. for component designs) and binary compatibility (e.g. to design distributed systems). Use strong types to build your design when your programming language offers a type system that ensures that forward design compatibility is supported by forward type compatibility. Limit your reliance on types when these do not provide this forward compatibility. If this limited use of types, or the limits of the type system, does not allow monad and comonad constructions to be expressed with types, then use typeless/generic bind, cobind, and other monadic- and comonadic-like operators (e.g. duck typing on key-value pairs).
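
As a hedged sketch of what such a typeless/generic bind might look like in F# (all names and keys invented): the steps agree only on key names within an untyped key-value map, so new steps and new keys can be added without evolving a shared typed structure.

// Hedged sketch (invented names and keys): a "duck typed" monadic bind where
// the carried state is an untyped key-value map rather than a typed record.
type KV = Map<string, obj>
type Step<'a> = KV -> option<'a * KV>   // a step may fail, and may add keys

let bindKV (f: 'a -> Step<'b>) (m: Step<'a>) : Step<'b> =
    fun kv ->
        match m kv with
        | Some (a, kv') -> f a kv'
        | None -> None

// Two steps that only agree on key names, not on a shared type:
let readPrice : Step<decimal> =
    fun kv ->
        match kv.TryFind "price" with
        | Some (:? decimal as p) -> Some (p, kv)
        | _ -> None

let addTax (p: decimal) : Step<decimal> =
    fun kv ->
        let total = p * 1.08m                      // invented tax rate
        Some (total, kv.Add ("total", box total))

let priced : Step<decimal> = readPrice |> bindKV addTax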

Finally, use only the language features that allow you to break down your functions into monadic and comonadic constructions. For example, only use destructive assignment if you can model it within your monadic and comonadic design. Also, do not forget that you need to enforce a "growth only" rule for your interpretation of values.
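
For instance, here is a minimal F# sketch of what "modeling destructive assignment within the design" can mean (a plain state-passing monad; the names are mine):

// Minimal sketch: "assignment" modeled inside a state monad rather than as a
// raw destructive write, so it stays within the monadic design.
type State<'s, 'a> = 's -> 'a * 's

let returnS (a: 'a) : State<'s, 'a> = fun s -> a, s
let bindS (f: 'a -> State<'s, 'b>) (m: State<'s, 'a>) : State<'s, 'b> =
    fun s -> let a, s' = m s in f a s'

let getS : State<'s, 's> = fun s -> s, s
let setS (s': 's) : State<'s, unit> = fun _ -> (), s'

// "x := x + 1" expressed as a monadic step instead of a mutation:
let increment : State<int, unit> =
    getS |> bindS (fun x -> setS (x + 1))

let (), finalState = increment 41   // finalState is 42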

Now having said all this, you cannot build a design if it costs too much or if you cannot explain it to others. For example, I once wanted to rework an algorithmic trading API around a monadic construction (within a C++ framework). I communicated around prototypes and presentations (I was CTO), yet failed to get the idea across to my key team members (who were top-notch developers), and ended up canning the idea. And this brings me back to a few important issues:
  • Monadic and comonadic constructions are higher order: even if you do not implement them with higher order types, you still need to think of them as higher order invariants hidden in the back of your design. This is how a mathematician would "see" them, and it is not obvious from a classical procedural or OO background.
  • The cost to implement typeless/generic monadic and comonadic constructions within your language may simply be too high.
For example, concerning this last point, this week I wanted to implement a duck typing pattern in Java on top of Protocol Buffers objects (for all the reasons mentioned above), but Protocol Buffers is only efficient if you access fields in a typed manner. So even though I could see how to build a design with a robust algorithmic binary compatibility model, the performance cost would simply be too high on top of Protocol Buffers. (Luckily, using Protocol Buffers is not a requirement for this project, so I can move to another data serialization protocol that supports fast generic field access.)

As an end note: I understand that I am not telling you in this post "how" to write monadic and comonadic style constructions, especially without a functional language. What I am telling you is that you want to know how to do this if you want your designs to grow well.

All original content copyright James Litsios, 2015.

Tuesday, October 28, 2014

Lean Startup is about market making your product

I am a big fan of lean startup, and will therefore tell you why. Still, the method is not universal and has its limits; I will tell you about that too. If there is one thing you should remember, it is that lean startup gives you a simple recipe that can shape a new product both to meet the needs of its customers and to be profitable. To do that, the method needs customer interaction, and without enough of it, the method is likely to fail.

Lean startup is an iterative process. Yet unlike other development processes that only focus on the "how", lean startup gives you some very precise rules to follow on the "what".  Once you understand how the method works, these rules make sense. Here are the important "whats":
  • Value is defined relative to a reverse income statement (defined as the "engine of growth"). In effect you are expected to model both your product and your customers. You can use a simple model of fixed price values per event (such as client clicks), or you can be more creative and use some real financial engineering tricks to build your notion of value.
  • Risk is to be minimized, and as risk is hard to measure, it is up to you to judge your uncertainties, and then to diminish your largest known risk/uncertainty at each development iteration.
  • Customers and products are understood through the data that can be collected out of them (this is the notion of "measure"). All value assumptions must be validated through the collection of data. Risk assumptions are not validated.
  • Continuous deployment means that software changes go directly into production (with some smart test automation in between). 
  • Software is "staged" into production. So new changes go into production, but normally only to be initially used by a few. As data comes in that validates the code change and the expected value improvements, the change is made available to more users. Staged deployment makes it possible to simultaneously collect data on different versions of your product. This is the notion of split testing. If you have done enough math you will know that split testing, well implemented, gives you the most reliable measure of how customers are affected by your code changes. Other ways to collect data will include much more noise.
  •  The careful observer will have noted that split testing within a "learn and change" loop is in effect a quasi-Newtonian optimization process. That is, we are looking for the best way to improve our product (equivalent to the steepest gradient). As we do not know that best way, and only know "a way", when the split testing confirms that the new version of the product is better than its previous version, we have a "quasi" method. As Newtonian methods are susceptible to getting stuck in local minima, we need something more, and that is inspired by linear programming with the notion of a "pivot": to pivot is to swivel variables into constraints and constraints into variables. Said differently, to pivot in business terms is to change assumptions, to change your beliefs.
  • Data and data collection are core values in Lean Startup, and therefore so is the notion of pivot. In lean startup you must note all your assumptions down, and any change to these assumptions (any "pivot") must be noted down too. This level of detail may sound pedantic, but that is how smart learning happens. And more importantly, that is how good team learning happens: by making the effort to express and note down your assumptions, to validate them, and to update your notes (data) when you change these assumptions.
Expressed in this way, lean startup is one big algorithmic machine. People still have a role to play, as they decide the changes they might want to bring to their income model and to their product. The lean startup algorithm makes sure that changes really do have an impact on revenue and on product perception. The originality of the method is that it is expressed within one recipe.

I do not know what made the creator of lean startup put his method together, but if you have worked in finance, and especially in market making and algorithmic trading, then you have already been using very similar methods. The big differences are that the method is meant for material products, not immaterial financial products, and that these material products normally only have a single-sided market, unlike financial markets where you can buy and sell. These two aspects do make the method a bit tricky when the income model is not obvious. For example, when income comes from advertisement, you can equate each advertisement screen refresh with income, so it is pretty simple to map customer events to income values. But suppose you want to move to a subscription model: now your customer is paying a monthly fee, so how do you map your product's features to this income? If you have done enough math and worked in finance, you can think up a few ideas on how to do this, but nothing is obvious when you do not have this math experience. And that can make the method tricky to apply.

I like the method's smart tie into financial theory. For example, simple portfolio theory says that maximizing return under fixed variance is equivalent to minimizing variance under fixed return. Lean startup chooses the second option: minimize risk (variance), while having asked you to express return (reverse income statement, engine of growth), and then make sure that you validate that return model with your collected data. As a full notion of variance would be impossibly complicated, lean startup says: focus on the largest risk. That is the simplest approach, but also the smartest.
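
For the mathematically inclined, the equivalence I have in mind is the classic mean-variance duality; in LaTeX notation, with w the portfolio weights, \mu the expected returns, \Sigma the covariance matrix, and the levels \sigma^2 and r matched on the efficient frontier:

\max_w \; \mu^\top w \quad \text{s.t.} \quad w^\top \Sigma w \le \sigma^2
\qquad \Longleftrightarrow \qquad
\min_w \; w^\top \Sigma w \quad \text{s.t.} \quad \mu^\top w \ge r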

Another tie to financial markets is that you need to make a market in your product for lean startup to work. Making a market means that you attract customers to use your product. And because lean startup relies on a "market fitting" approach, that is, you fit your product and income model to your market of customers, the type and number of customers you have will make all the difference in how your product evolves. This is a real issue, because the good clients are not the ones that want to spend their time playing with small applications that only have a subset of functions. Therefore, a critical difficulty with lean startup is how to bring valuable customers to a product that is not yet really valuable enough to them. This bootstrap phase can be most difficult, and that explains why you need to cheat a bit and sometimes offer what you do not have in order to get the necessary feedback to validate your product's evolution. Unfortunately, customers will hesitate to come back to your product if they felt deceived the first time they came. So offering what you do not have, in order to get feedback on demand, will also tarnish your ability to grow later. This is especially true with small niche markets.

Market dynamics are very much the higher order properties of a product. Many just ignore them and hope for the best when launching their new product. Lean startup makes the effort to learn about these market forces while you develop the product. It is not easy, it is not obvious, but it is still the right thing to do.


Sunday, October 05, 2014

Recreational programming: zipper augmentations

I rarely program professionally, so when I write software, it is in the spare moments I have between home and work (these days providing consultancy services). That really means on the bus and the local tram and subway network here in Zurich.

I do not program for the fun of making my executables run; I mean, for the fun of having the program really do something useful. I program for the fun of the exploration! What I enjoy is understanding how to use new concepts. Usually these are in specific research domains which, unfortunately for this blog, are still to remain unpublished, but the last two weeks have had me travel into more classical mathematical and software territories, which allows me to talk about them here.

The way research goes is that you need to take leaps of faith. There are lots of territories to explore, and you have these gut feelings that tell you to go in one direction or another. Yet each of these explorations will take time and effort, leaving you with no other choice than to do nothing with your feelings. Yet too many unexplored feelings will make you anxious. Therefore, from time to time, you need to take the risk of following up on your desires, even if you do not know exactly where they will lead you.

This was the case for me two weeks ago: I was feeling nostalgic about KD-Trees. I could not know for sure if part of me wanted to revisit them because my first (summer) job as a developer had me adding these tree structures to an electronic CAD program, or because there was more to find in revisiting them. The only way to find out was to put in some real work. I therefore quickly implemented a KD-Tree for rectangles (it uses a four-dimensional search space), in F# under Xamarin on my MacBook.

Here I make a small parenthesis: F# on Xamarin for iOS is great. Xamarin comes out with an update every week or two, and they keep it one of the most productive environments that I know of for developing mobile apps.

Back to the trees. KD-Trees have the fancy property of allowing you to make quick proximity searches (e.g. to find the nearest neighbor to a given point). Implementing them functional style allowed me to understand how to generalize this notion of tree augmentation within the concept of zippers and focuses. A very nice and powerful concept. I try to explain this hereunder, and sketch a simplified version in code after the list.
  • Take a search tree. 
  • Augment the tree
    • Tree augmentations are secondary, key-like values that are added to each tree node and that only depend on the local subtree values.
    • For example, in KD-Trees, you can store per node a bounding box of all elements stored at that node and below it. 
  • Now create a zipper for the tree
    •   A zipper is a structure that breaks apart recursive structures, allowing you to serialize access to them.
  • Make a focus for this tree and zipper
    •   A focus allows you to travel up and down a tree with immutable data structures. It is a list of zippers and a node of the tree. The list of zippers is used to "go back up the tree".
    • See Huet's paper for a good description of the focus/cursor/location.
  • Augment the zippers to complement the augmentations of the tree.
    • For example, in the case of the KD-Tree with nodes augmented with bounding boxes, the zippers can be augmented with a "bounding hole", that is, the space around the current subtree that is known to be empty.
    • This is what my excursion into KD-Trees taught me: you can augment the zipper to complement the tree augmentation.
  •  Now create a map function that goes over the tree with a focus. With both the tree nodes and the zippers augmented, the map can do proximity searches for each element of the tree in record time.
  • Finally, wrap up this mapping function in a zipper comonad because it is cool! 
    • I tried to find a good zipper comonad example off the web to give you a reference but could not find any clean and obvious code. So just search for zipper comonad. 
(A note here that there seem to be lots of variations in terminology. Some people use the term zipper to describe the focus, while others use the terms cursor, location, or context instead of focus. I just know that the first example I read about used the terminology I use above.)
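
To make the steps above more concrete, here is a much simplified, one-dimensional F# analogue of the augmentation idea (an interval-augmented binary search tree rather than a real four-dimensional KD-Tree; all names are mine, and the actual nearest-neighbor search is left out):

// Simplified 1-D sketch of the idea above. Each node is augmented with the
// min/max interval of the keys in its subtree.
type Tree =
    | Leaf
    | Node of key: float * lo: float * hi: float * left: Tree * right: Tree

let interval t =
    match t with
    | Leaf -> None
    | Node (_, lo, hi, _, _) -> Some (lo, hi)

// Smart constructor that recomputes the augmentation.
let node k l r =
    let parts = [ Some (k, k); interval l; interval r ] |> List.choose id
    Node (k, parts |> List.map fst |> List.min, parts |> List.map snd |> List.max, l, r)

// A zipper step records where we came from; it is augmented too, here with the
// sibling subtree we did not descend into (whose interval bounds the "outside").
type Zip =
    | FromLeft of key: float * sibling: Tree    // we went left, sibling is the right subtree
    | FromRight of key: float * sibling: Tree   // we went right, sibling is the left subtree

// A focus: the subtree under the cursor plus the list of zippers above it.
type Focus = { tree: Tree; path: Zip list }
type Dir = L | R

let down dir f =
    match f.tree, dir with
    | Node (k, _, _, l, r), L -> Some { tree = l; path = FromLeft (k, r) :: f.path }
    | Node (k, _, _, l, r), R -> Some { tree = r; path = FromRight (k, l) :: f.path }
    | _ -> None

let up f =
    match f.path with
    | FromLeft (k, r) :: rest  -> Some { tree = node k f.tree r; path = rest }
    | FromRight (k, l) :: rest -> Some { tree = node k l f.tree; path = rest }
    | [] -> None

// The comonadic "extract" of a focus is simply the subtree under the cursor.
let extractFocus f = f.tree

// The two augmentations cooperate: the node interval bounds what lies inside
// the current subtree, and the zipper path bounds what lies outside it, which
// is what lets a proximity search mapped over the focus prune aggressively.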

Finally, some people have used multiple KD-Trees to do parallel searches, but with only a partial search in each tree (example). Such a parallel "sampling" approach is a way to reduce the "skewing" effect that each individual tree hierarchy has on the tree and zipper augmentations.