
Saturday, November 09, 2024

Why the struggle with functional programming?

Slow FP adoption...

Why is there no widespread adoption even though functional programming has existed for over sixty years?

Not so long ago I explained this as:

The reason is actually pretty complicated: the strength of FP is due much to its ability to enforce strong constraints. This strength in the code weakens the ability of developers to make local changes. Initially, this sounds like something good, yet it creates a new challenge: developers are bound to the constraints of code written by others, and they are not only unhappy, they are less productive! You can solve this by bringing your FP to the next level: math. A few companies do this; now a developer is not subject to constraints but to mathematics. If the developer understands the math, she/he finds this acceptable.
I am simplifying things a bit here, yet FP, or any language that is strong enough, brings in a whole complexity of technical ownership and people dynamics that does not need to be dealt with in OO. The reality is that FP allows more scaling, but maintaining the stability of people within that scaling is a major challenge.

I want to present this a bit differently here, because the above puts too much blame on people. Yes, FP is tricky for teams, yet the root cause is not the teams; the teams just highlight the root cause!

The Design - Features - Tasks trilemma

Let's start with the following trilemma:

Design - Features - Tasks

And by trilemma, I mean that you cannot simultaneously set goals to progress all three. (And by task, I mean "effective coding work".) You must specify two of these, and let the third one be determined by the outcome of your work. You can specify design and features, you can specify design and tasks, you can specify features and tasks... but you cannot specify design, features, and tasks before you start working. To be more concrete: when you focus on architecture before coding, and come up with a design, you are in fact constraining the features that will be implementable with tasks that follow that design. If you now specify additional features, before you start your tasks using the given design, you will likely fail.

Functional programming done seriously "ties" design and tasks together. By which I mean that FP that pushes its higher order design model highly constrains the design. The result is that by construction, a team that pushes its FP design culture, is also a team that is "squeezing themselves" into an over constrained design-features-tasks trilemma. To be specific, an FP team may be imposing design, features and tasks all at the same time, and not understand that this is an over-constrained setup that will most often fail.

There is a solution: Just like I mentioned in that earlier post, you get a around trilemma by splitting one of its terms. Here we split tasks into design tasks and non-design tasks. The approach is then the following:

  1. Given some desired features
  2. Work on design-tasks so that the design can implement the features
  3. Then work on non-design-tasks that implement the features with the developed design.

From a work triplet perspective, we have:

  1. features & design-tasks -> new-design
  2. features & non-design-tasks -> implemented features

Here, the new-design produced by 1 is used in 2.

However, the challenge is that now we have a two-phase work process, and by default teams, especially agile teams, run a single-phase process. Agile typically asks teams to focus on tasks that deliver MVP features, and therefore agile typically sets people up to fail, as there is no new feature in good FP without new design, and... teams often struggle to juggle the three needs to progress design, features and tasks within the same agile effort.

TDD helps

Test driven development (TDD) is one way to escape the limits of a one phase agile process. The idea is the following: 

  1. Given some desired features
  2. Develop tests for these features, AND extend the design to cover the features.
  3. Then work on non-design-tasks that implement the features and pass the tests.

However...

Yet there is still a challenge: design changes most often depend on strong opinions. FP depends on design changes. FP depends on strong opinions. Getting teams to gel around strong opinions is often not trivial. And that is why I wrote the statement shared above.

All original content copyright James Litsios, 2024.

Sunday, October 06, 2024

A software mind's eye

I have been writing higher-order functional programming in Python for the last few weekends:

  • hofppy (https://github.com/equational/hofppy) will be a Python library. For the moment it is a collection of Jupyter notebooks.
  • My initial goal was to have a handy FP toolkit which supports applied math with JAX's JIT. Yet I realise that what I am really doing is reimplementing the generic part of a compiler for a reactive (trading) language I wrote in 2010 in F#, while including a few tricks I picked up since then, the primary one being to think of code as implementing a synchronous execution model.
  • There is really very little code like this on the web, which is why I am doing this open source.

This post is written in part to mention the above, as already the first JAX JIT supporting monad and comonad models are "nice". Yet it is also meant to bring up the subject of the process of creating new technology.

My recipe to do something new, such as leading a team on a new subject or writing software that did not exist before, is the following:

  1. Create a "mind's eye" imaginary world of "usage" of the end result.
  2. Imagine how it should work.
  3. Start very quickly to implement features.
  4. Review 1, 2, and 3, as you regularly cycle through them, again, again, and again!

I use the word imagine twice above. For a few reasons...

People are constantly asking me "how do you know?", with regard to clients' requirements, technology, design, approach, etc. The reality is that I do not completely know: I know much, I have done much, written hundreds and hundreds of thousands of lines of software. Yet when something is new, no one knows fully how things will look in the end. The reality is the newer it is, the less you know. And... the more you think you know, the more you are fooling yourself. However, just like with poker, the optimal approach is to imagine much and often, and to work as hard as possible to build and maintain a mindful view of "new usage" and "new functioning technology".

The more experience, the better! As your imagination is only as good as the real details it contains. I have tons of real-life learning that I use for this higher-order FP in Python. I mention a few of them here:
  • Lazy software designs are easier to express in code (see the sketch after this list).
    • The most flexible design is code that does nothing but lazily accumulate what it could do until, finally, because output is expected, it works backwards from output expectations to pull out the desired output.
  • Multi-stack semantics is perfectly normal. 
    • For example, in the project's above monadic code, the free monads have "their" variables, "normal" Python has its variables, and these are carefully kept separate.
    • Multiple flows of causality exist in the real world; stacks of various semantics are the only cheap way to capture these cleanly in software.
  • For every Yin, there is a Yang.
    • If there is more, there is also less; If there are variables, there are constraints, etc.
  • Structure is more important than types.
    • I started in functional programming without the types. Then I embraced types. And yet I came to understand that the sweet spot is somewhere in between. The reality is that it is much easier to use code to constrain software "at just the right level" than to use types.
  • Math is the most powerful structure.
    • Think geometry, topology, and dualities!
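
To illustrate the lazy-accumulation point above, here is a minimal sketch (hypothetical names, nothing from hofppy itself): operations are recorded rather than executed, and only a request for output triggers evaluation, working backwards from what is demanded.

    # Hypothetical sketch: accumulate work lazily, evaluate only on demand.
    from typing import Callable

    class Lazy:
        def __init__(self, thunk: Callable[[], float]):
            self.thunk = thunk                 # nothing computed yet

        def map(self, f: Callable[[float], float]) -> "Lazy":
            # Record what *could* be done; defer the actual call.
            return Lazy(lambda: f(self.thunk()))

        def pull(self) -> float:
            # Output is expected: work backwards through the recorded steps.
            return self.thunk()

    pipeline = Lazy(lambda: 2.0).map(lambda x: x * x).map(lambda x: x + 1.0)
    assert pipeline.pull() == 5.0              # evaluation happens only here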

 All original content copyright James Litsios, 2024.

Wednesday, December 14, 2022

The future of software development (end 2022)

 

My end of 2022 predictions for software development:

  1. Structure of code becomes more important than code
  2. Partial code becomes more valuable than finalized code
  3. Proprietary code gains in value
  4. Contextual programming gains in traction
  5. IDEs become ever more ML driven
And therefore:
  1. Dynamic typing gains over static typing
  2. Duck-typing grows in value
  3. IDE security and spyware become an ever more important topic
  4. Functional programming is still the more powerful programming paradigm
  5. Proprietary IDE with in-house ML engines will grow in importance
All original content copyright James Litsios, 2022. 

Monday, January 10, 2022

Reversible contextual Python with strong properties

I previously mentioned my effort to learn how best to do semi-formal development in Python. Here I report my progress, done during my end of year 2021 holiday break. 

The effort is to find a 'best design' to achieve certain properties, and I am doing this by rewriting code again and again to progress a 'design' towards a 'better design'. At rewrite 21 we have:

  • A systematic way to write reversible, context aware code.
  • Context specific algorithms can be defined over containers as well as over higher order functional applications.
  • Logical and declarative 'free-variable' oriented style is possible.
  • Function applications with context can be 'saved' to state and continued later or elsewhere.
  • Quoted expressions can help define equalities and inclusion properties.

Some insights:

  • This could be Java or C++! Python is handy with a focus on 'abstract' arguments, but traits would carry us through an OO implementation.
  • What about types? The question is better expressed as: what about types within a formally strong stack? Types are good, but they too must be expressed 'with strength', otherwise they hinder you.
  • How much value is there in the strong typing of Haskell and Scala? Much, if you use it for formal properties. However, if most of your types are 'structural boilerplate', then you are not actually gaining much. Formal properties with types mean 'type level' programming, and that is not cheap, nor easy to scale.
  • This code does 'nothing real'!? It is not meant to! Its purpose is to show how to design cost-effective code with formal properties in 2022!

Here is a quick run through my efforts (a small illustrative sketch follows the list). Note that each of these efforts was a 'throw-away' effort with the goal of learning how to write 'stronger', more 'formal' code. Twenty-one rewrites later (about sixty hours of work), the effort was worth it:

  1. This effort starts with hard-coded functions to swap a positional argument to keyword arguments and vice versa (the other way around). Swapping is reversible, and our goal here is to write reversible code.
    • I built this 'keyword argument' constructor/destructor initially to bridge between the 'product' type nature of keyword arguments and the 'sum' property of positional arguments. Later, I realised that keyword arguments are named, positional arguments are not (conceptually!), and that this first focus allows me to move freely between a nameless point-free style of programming with positional args, and a more classical 'named' style. This explains why I am still point-free when I use identifiers, because these can be removed with well-chosen operators.
  2. A generic 'keyword argument' constructor/destructor function can be implemented by simply 'accessing kwargs'. However, Python has no efficient way to remove keys from a dict, and that forces us to use 'ugly' copy constructs to 'pass on' a kwargs, minus the chosen keyword. Therefore version two uses eval and lambda to generate an efficient generic 'keyword argument' constructor/destructor.
    • Retrospectively, I would have avoided the eval, and stuck with kwargs and copying the dict to remove an arg.
  3. Lambdas are a pain to debug and to type annotate, so version three uses callable objects instead of lambdas.
  4. None of the above is composable nor 'stateful' (from a 'traditional' perspective), as all happens within function arguments. Version 4 adds function composition and introduces an AKW class, to 'statefully' hold function arguments. It is 'reversible', in that we can 'call' (cocall) a function with the object's data as arguments.
    • AKW stands for Args KWargs, as it holds both the args tuple and the kwargs dictionary.
  5. This is all dynamic typing, and initially at least, with no type-conditional expressions. However, to 'feel the ground', version 5 does the following: if two complementary functions/lambdas are expressed as callable objects, then we can give their classes a class attribute which programmatically provides this complementary tie. For example, given d_k1(), an operator from keyword arg k1 to positional arg, one can find its complementary operator c_k1(), from a positional arg to a keyword arg k1.
    • This works; we do not need it (yet).
  6. Focusing on function arguments is somewhat 'stream oriented'. The AKW class introduced in 4 is more of a classical 'stateful' approach. Version 6 shows how function composition can be constructed on this stateful view.
  7. Functions and callable objects are nice. Yet by default they force us to create 'return value states'. These return states' 'semantics' are different and do not naturally fit within our simple 'call' semantics. Another aspect is that they are 'invisible' to the notion of reversible code. Therefore version seven takes a first go at using tail continuations to avoid the 'return states' of functions.
  8. Continuation passing can express a 'deeper' execution context with the help of a 'stack' of tail continuations. Version 8 introduces a continuation 'list' for this purpose.
  9. Version 9 extends operators to have something equivalent to the notion of 'quoted' expressions in Scheme/Lisp.
    • While not further developed here, 'quoting' is an 'extra dimension' and extremely valuable to define notions of equality and of inclusion (e.g. combined with matching).
  10. Version 10 introduces a basic map function. 
    • Implemented for tuples, lists, sets and dicts, the mapping is not always reversible. Which highlights that certain operations need more support to guarantee properties of reversibility. For example, mapping the set 1,2,3 to 1,1,1 is not reversible.
  11. The classical tail continuation only tells us where the output of a function should go. With a 'stack' / 'list' of continuations, we can carry more information, for example the 'index' or 'key' of the element that is currently being mapped. Version 11 shows how this is done, and shows how we can make a map able to recurse over hierarchies of containers. This rewrite also introduces a first 'partition' / 'unpartition' operator.
  12. Here we partition and unpartition classes of dataclass fields. 
    • Object orientation is stronger when 'done backwards'. That is, composing general objects quickly breaks. However, given a large object that expresses a real-world scenario, we can sometimes partition it 'with good formal results'.
  13. Logical programming allows us to perform 'or' and 'and' like operations, and often has something similar to a 'not'. Version 13 shows how logical programming is 'easy' with tail continuations.
    • Five classes/functions are defined: Success, Fail, First, All, and Fails.
      • Success always succeeds.
      • Fail always fails.
      • First succeeds when the first of its arguments succeeds.
      • All succeeds when all of its arguments succeed.
      • Fails succeeds when its single argument fails, and fails when it succeeds.
    • Here __call__ is success while 'fail' is called on failure. Later, __call__ with the named keyword 'fail'=True indicates failure.
  14. Adding a 'match' operator (with free variables) to logical programming gives us a simple way to write declarative programs (here in Python).
    • The largest Prolog program I wrote was a prototype compiler to assign 'topologically' sensitive types (e.g. edges vs faces vs elements, etc.) and perform other transformations to a domain-specific modeling language for semiconductor device simulations (applied math, mid-90s).
    • My current view about declarative programming (logical programming is a form of declarative programming): it only scales when embedded in a more general semantic stack (as done here, for example).
  15. Logical programming is more powerful with lambda expressions (or callable objects). Version 15 defines the operator SelectApplyN1, which applies multiple operators to a single argument, and keeps those that succeed. The goal is to validate that we can mix declarative and lambda-like function application.
    • Example: 
      • Given: SelectApplyN1((Above(3), Above(5), Above(7), Above(8)))(6, continuations=...
      • We get: (((7, 6), (8, 6)),), having defined an appropriate 'Above' operator.
  16. In the previous code rewrites, special keyword arguments are defined (e.g. for continuation passing, for the free-variable list) and are explicitly declared in all the function signatures. Version 16 pushes this approach in a slightly non-sensical way by adding the keyword argument 'state' to the reference call signature, and explicitly adding this 'state' argument throughout the code. This results in the question: 'why be explicit'?
  17. Version 17 only declares keyword arguments explicitly when used locally. However, the tail continuation list is an explicit keyword argument, and expressed everywhere. The 'standard' call signature looks like this:
    • def __call__(self, *args, continuations: Optional['Continuations'],  **kwargs):
  18. At which point I ask myself: "why not go all the way?" Therefore version 18 moves the continuation stack to a positional argument, as follows:
    • def __call__(self, continuations: Optional['Continuations'],  *args, **kwargs):
    • Note that we are back to a more 'classical' point-free formulation, as 'continuations' is positional and not keyword based.
  19. Context annotation is revisited in version 19, and expressed more generically. To be specific, the continuation stack gets dynamically decorated with contextual information that can be queried 'within context'. These decorators are referred to by 'type'.
    • The example code tags the logical operators First and All (see 13), so that the arguments of these functions can query 'which argument index are we?' For the example presented in 15 above, we get (((7, 6, 2), (8, 6, 3)),), because the successful operators were the third and fourth (starting from 0).
  20. Continuing on context annotation, the recursive map of version 11 is adapted in version 20 and a 'depth' context-specific function is introduced. The goal is to show that while "positional" context annotations are handy, even more valuable is the ability to algorithmically extend your contexts.
  21. Finally, we conclude this series with version 21 and a careful change that ensures that the complete context stack is captured when capturing the state of computation (e.g. with AKW as introduced in version 4). This context becomes active again when we 'continue' computations from each saved state (with cocall).
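
As a flavour of the early rewrites, here is a minimal sketch only, with hypothetical names (the real code differs): a pair of complementary callable objects that move an argument between positional and keyword form, driven by a positional list of tail continuations as in version 18.

    # Hypothetical sketch of versions 1-8: reversible positional/keyword
    # swapping with callable objects and a continuation list.
    class ToKeyword:
        """Move the first positional argument into keyword 'k'."""
        def __init__(self, k: str):
            self.k = k
        def __call__(self, continuations, *args, **kwargs):
            head, *tail = continuations
            return head(tail, *args[1:], **{**kwargs, self.k: args[0]})

    class ToPositional:
        """Move keyword 'k' back to the front of the positional arguments."""
        def __init__(self, k: str):
            self.k = k
        def __call__(self, continuations, *args, **kwargs):
            head, *tail = continuations
            rest = {n: v for n, v in kwargs.items() if n != self.k}
            return head(tail, kwargs[self.k], *args, **rest)

    collect = lambda continuations, *args, **kwargs: (args, kwargs)

    # ToKeyword('k1') then ToPositional('k1') is the identity: reversible.
    assert ToKeyword('k1')([ToPositional('k1'), collect], 1, 2) == ((1, 2), {})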

All original content copyright James Litsios, 2022.

Sunday, December 19, 2021

Lowering your continuations while lifting your functions (in Python)

A previous post describes Python code that does functional composition on reversible code. A question that one asks when starting to write compositional code is: how does one compose functions 'at different levels'? And I purposely express this a bit ambiguously, as 'different levels' can mean different things. Today we will care about levels within a hierarchy of data containers and simple objects, and note a thing or two on the way.

The map function is Python's builtin tool to apply a function over each datum of a container. Combined with a lambda or a function definition, we can convert functions on container elements to functions on containers. The jargon for this type of code is that we are lifting a function to the container. The idea being that a function on a container is 'higher' than a function on container elements.

An important detail to note is that while the mapped function is being lifted, the continuation context needs to be 'lowered'. This is because the lifted function will later be called with a continuation context for the container, yet internally the original mapped function is called, and it will need 'additional' continuation context. Therefore an extra level of continuation is added in to 'lower it' down to the level of the mapped function. Two examples are given; the first

  • "Lowers down" a detached generic continuation context. 

python

The second

  • "Lowers down" a location carrying continuation. 

python (container), python (object)

Another particularity of this mapping code is that these extended / lowered continuations are data constructors. Which makes sense, because for each element we need its return value to become data to be put into the resulting mapped container. This data-constructor nature of the continuation extension also reflects the nature of the continuation changing as it enters the data container that is being mapped.

And this brings up an anecdote:

As I wrote these mapping functions, I was immediately reminded of the Newton programming language by Prof. Rapin at EPFL. Sadly, I could not find a freely available document on Newton, yet if my memory serves me well, Newton had a higher-order container location abstraction that provided a service not very different from our subject here. (One of my CS MS semester projects had me extending the Newton compiler, in ~1985.)

This continuation 'extension' technique has strong ties to the zipper structures for container traversals.

Yet more importantly, it reminds us that within a more formal approach to code, function arguments may be expressing:

  1. What is coming in to the function.
  2. The call contexts to the function.
  3. Contexts of immediate use of the function's results.
Within a continuation passing style, it is therefore perfectly OK to decorate your continuations to communicate more context to your functions. In the example code referred to here, the 'lowered' continuation carries index positions, dictionary keys, and dataclass field names.
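
To make the lifting/lowering concrete, here is a minimal sketch (hypothetical names, much simpler than the linked code): mapping a continuation-passing function over a list requires wrapping the caller's continuation in a 'lowered' one that rebuilds the container and carries the element's index.

    # Hypothetical sketch: lift a CPS function to lists, lowering the continuation.
    def lift_to_list(f):
        """f(continuations, x) -> lifted(continuations, xs)"""
        def lifted(continuations, xs):
            head, *tail = continuations
            out = []
            for i, x in enumerate(xs):
                # The lowered continuation is a data constructor: it stores the
                # element's result, and knows its index in the container.
                lower = lambda conts, y, i=i: out.append((i, y))
                f([lower, *tail], x)
            return head(tail, [y for _, y in sorted(out)])
        return lifted

    inc = lambda continuations, x: continuations[0](continuations[1:], x + 1)
    done = lambda conts, ys: ys
    assert lift_to_list(inc)([done], [1, 2, 3]) == [2, 3, 4]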

All original content copyright James Litsios, 2021.

Monday, December 06, 2021

Data composition from functional composition in Python

A previous post describes code that does functional composition. Let's now add operators that convert this functional composition into data composition, and allow what is produced to be 'executed later'. Github drop for these code snippets here.

Key insights in writing code this way are:

  • Data objects are derived from the functions which define their semantics.
  • This is meta-programming 'done backwards'. Like with meta-programming we capture 'what could be interpreted', but we do so through the execution semantics, not by adding an additional meta-language.
  • Data objects execution semantics can be ambiguous until ambiguities are removed. Not far from a "quantum-like" execution model!
  • There are strong ties here to laziness and trampolining.
  • Blockchains are as much about composition of data, as about composition of functions.
These example code snippets are very short (<100 active lines). Two design choices make this possible: 
  1. Focus on arguments, not return value, nor object state
  2. Make everything composable 
What these examples hint at is:
  • Return values are really just arguments to further calls. 
  • Object states are really just arguments, references (e.g. function reference), and contexts of possible further calls.
This is in fact true at the most simple level, such as in the code tied to this posting. 
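
As a minimal sketch of the idea (hypothetical names, much simpler than the snippets in the Github drop): the composition is held as data, and execution is deferred until the composed object is explicitly run later.

    # Hypothetical sketch: functional composition reified as data composition.
    from dataclasses import dataclass
    from typing import Callable, Tuple

    @dataclass(frozen=True)
    class Composed:
        steps: Tuple[Callable, ...]              # the composition, held as data

        def then(self, f: Callable) -> "Composed":
            return Composed(self.steps + (f,))   # compose = extend the data

        def run(self, x):                        # 'executed later'
            for f in self.steps:
                x = f(x)
            return x

    pipeline = Composed(()).then(lambda x: x + 1).then(lambda x: x * 2)
    # Nothing has executed yet; 'pipeline' is pure data describing the work.
    assert pipeline.run(3) == 8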

All original content copyright James Litsios, 2021.

Wednesday, December 01, 2021

Point-free reversible Python

Have you ever wanted to write 'formal style' in Python?

What is a formal style of programming?

Formal "code", like formalism in mathematics, is to solidly tie what you write to what you mathematically mean. And that connection goes both ways: from math to expression, from expression to math. 

The traditional way to write formal code is to prove as much as you can 'along the way' of writing code. The difficulty of this approach is that proofs are hard, which either limits the complexity of your code, or limits your ability to be productive.

Another way to write code with mathematical properties is to assemble pieces of code that already have mathematical properties. When this assembly preserves these mathematical properties, and even better when it scales them, the resulting code has mathematical properties. A key concept here is that these "pieces of code with mathematical properties" are "at different levels", as it is the interplay between these levels that gives the code its semantic depth. (This last point explains why languages like Prolog are hard to scale.)

Dualities are a good place to find mathematical properties. An example of duality is symmetry: something can be 'one way' and 'the other', and we know "mathematically" what that means. This is why reversible code, reversible types, adjoint structures, and pretty much anything that goes 'one way' and 'the other' (e.g. like rows and columns) are a good basis on which to build systems with formal properties.

What is reversible programming?

Reversible expressions can be evaluated 'one way' and 'another'. This might mean execution that can be reversed, yet it might also mean that the interpretation of the expression can be seen from a reversed perspective at a different semantic level (e.g. types, external view).

Reversibility is also a way to approach multi-party programming: say I am party Alice, having received B from Bob, after having previously shared A with Bob. Alice can reverse compute B to A to validate that Bob computed B from A. That is a simplified view of things, but it captures the idea that reversibility enables trust.

Reversibility is much like immutability that has been weakened, but which has not lost its magical properties.

What is point-free style?

A point-free (or tacit) style of programming traditionally means coding by composition without naming arguments (and without assignments). In a Python-centric view that could mean using only 'positional arguments', and only Callable objects (so as not to have method names), but you need to ask yourself: "Is that important?" In fact, within the scope of reversible programming, point-free is best approached as composable and reversible. Whether you name things or not is of lesser importance. It is of lesser importance also because traditional point-free focuses on 'expression space'. Yet modern programming happens much at the type level, or the meta-type level, and a strict point-free approach would lead us to writing code where types are nameless, as well as meta-types. That would be tricky!

Note here: point-free style leads naturally to array-, stream- and structure-oriented programming. This is not lost here. In fact, a main reason to write code like this in Python is to write structure-oriented ML. Reversibility is also a vital piece of blockchain and smart contract theory.
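
As a tiny sketch of what this can look like in Python (a hypothetical example, positional arguments only, composition without named arguments):

    # Hypothetical sketch: point-free composition with positional args only.
    class Compose:
        def __init__(self, *fs):
            self.fs = fs
        def __call__(self, *args):
            for f in self.fs:
                args = f(*args)          # every step passes positional args on
            return args

    swap = lambda x, y: (y, x)           # reversible: swap then swap is identity
    assert Compose(swap, swap)(1, 2) == (1, 2)
    assert Compose(swap)(1, 2) == (2, 1)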

Why should you care?

Wikipedia's programming paradigm page enumerates many ways to code. Reversibility is one of the concepts that brings together all of these different ways of expressing software. Putting reversibility 'underneath' a paradigm gives it bridges to other paradigms that have also been given reversibility properties. 

All original content copyright James Litsios, 2021.

Sunday, February 21, 2021

Stream oriented processing

Here is a presentation on designing stream processing systems which I uploaded to YouTube today. Recorded in ~2014, and given in one form or another a few times since about 2005, it is how I "model with streams", something I learned designing market making and algo systems. What I like most about this old deck is how it stays aligned with modern abstractions expressed in higher-order FP semantics.

I wrote a bit on stream processing here in 2011.  I see also that I have a draft for publishing this presentation in 2013. Yet I had wanted to add Scala or Haskell code that captured the "nice figures" of this presentation, and that is why I held back from sharing it earlier. 

All original content copyright James Litsios, 2021.

Saturday, January 02, 2021

What works in higher order Python (Functional Programming, early 2021)

Here is what works for me to write higher-order typed Python:

  1. Use class with __call__ instead of lambdas or functions (aka explicitly specify each closure).
  2. Favor constraining types with Callable, ParamSpec and Concatenate (not with class constructions).
  3. Replace *args by immutable list (Tuple[a,Tuple[b, ...) when ParamSpec fails to scale. 
  4. Replace **kwargs with object using @overload with Literal typed arguments (sadly TypedDict seems not to support generic typing) .
  5. Use Generic to “store” multiple polymorphic types
  6. Use Protocol to separate semantic tiers of types (e.g. ownership in a smart contract)
  7. No nested functions, lambdas or classes.
  8. Use cast to type "unmanaged" types (e.g. eval)
  9. Use phantom typed "dummy" arguments (and type casts) to get around "wrong direction" type dependencies
  10. Visual Studio Code's Pyright works; early 2021, PyCharm fails.

The untyped application scope is a broader Python 3 stack. Note this is Python 3.10 typing (e.g. ParamSpec).
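
As a minimal sketch of items 1 and 2 above (a hypothetical example of mine, not taken from the code I wrote then): a callable class, i.e. an explicit closure, constrained with Callable, ParamSpec and Concatenate.

    # Hypothetical sketch: an explicit closure typed with ParamSpec/Concatenate.
    from typing import Callable, Concatenate, Generic, ParamSpec, TypeVar

    P = ParamSpec("P")
    R = TypeVar("R")

    class Prepend(Generic[P, R]):
        """Callable object (point 1) that fixes the first positional argument
        of f, typed with Concatenate (point 2) rather than a class construction."""
        def __init__(self, f: Callable[Concatenate[int, P], R], first: int) -> None:
            self.f = f
            self.first = first
        def __call__(self, *args: P.args, **kwargs: P.kwargs) -> R:
            return self.f(self.first, *args, **kwargs)

    def add(x: int, y: int) -> int:
        return x + y

    add_10 = Prepend(add, 10)
    assert add_10(5) == 15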

Not all is easy. Deeply typed Python seems to be magnitudes more expensive to write than "classical" Python. Also, Union types may be needed to "break out" from "tight" type relations. My current feeling is that one is sometimes forced to introduce Unions to allow complex trace joins and bifurcation, to then need to add additional layers of type constraints to control the generality of these Unions. All of this needs to be done with careful "locking" of types. Not the best of situations!

Python is known to be a dynamically typed language, which is OK, as static typing is more of a luxury than a necessity. I learned the above this 2020-2021 Xmas holiday writing lens/optic-like Python code. Initially with no types, fully lambda centric. Then I thought: let's try to make it typed in Python!

All original content copyright James Litsios, 2021.

Sunday, April 12, 2020

Thinking versus brute force "doing" in software design in 2020

About thinking versus doing

(I put a subset of this on YouTube as: https://www.youtube.com/watch?v=_kXpyctbmrQ)

Is it ok to program by "brute force" and put aside your brain?

A few years ago, I wrote on the pros and cons of programming with explicit and implicit type declarations. In 2020, I feel that this question is almost irrelevant. Therefore let me ask the question:
Should we think about software design when we program? 
There is a trend in software development which states "developing software products is not about thinking but about doing". I know many venture people who believe very much in this statement. Yet the reality is that software is a process of thinking (in part at least). Still, from a pure seed-venture-money game, the rule of "it is not about thinking" is true! Because the number one rule is "be fast", and thinking too much will very much slow you down.

Is it then both true and untrue, that we should both use and not use our brains when we write software? Yes, and yes! And it is exactly this subtle challenge that makes software management in innovation a tricky game.

Thinking can be done in different ways; different approaches can be taken. Some are efficient, some are a waste of time. Some will help your software grow, others will block you. Some will help you meet your deadline, others will invariably lead to failure. The game is then only to think in the ways that help you!

Some of my colleagues have heard me condense these concepts into one statement:
It is about thinking backwards!
There are many ways to do this. I will cover just a few here; there are in fact many more.

Play to win above all else

In recent blitz chess games, FM Lefong Hua repeats that "it is not about calculating the endgame win" (because that is not a guaranteed win), it is about "flagging your opponent". By which he means: winning is not about thinking within the classical rules of chess, it is about meeting the "win conditions", which in blitz chess is much about not running out of time (being flagged).

Transposed into software development, that means it is not about meeting the classical rules of software design, it is about thinking through how to always meet your goals and deliver results. Or to be more specific, it is about thinking backwards, to find a path back from your "runnable software" goals to your current status, a path that guarantees your development success.

The heart of your software comes first

I have mentioned before that "one of the little understood properties of programming is that at the lowest level things tend to be single dimensional". A good software design builds a "break" or "separation" in that unique dimension "just about where your main product abstraction is", to achieve a sandwich-like design where new product abstractions are added "by reassembly" of the two separated parts.

There are a few ways to do this, domain specific languages (DSLs) and functional programming (FP) being my favoured approaches. While all my major software developments had DSLs, it was only much later in my career that I understood that what was working for me was not the DSLs but the effort to maintain separation between interpreter and interpreted abstractions. This separation is naturally achieved with a DSL, yet can also be achieved with adjunctive higher-order FP. (The "ultimate tool" is then to combine the two concepts as adjunctive FP based DSLs, which was the base of Elevence's contract language.)
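
Here is a minimal sketch of that separation (a toy example of mine, emphatically not Elevence's language): the interpreted abstraction is plain data, and the interpreter gives it meaning, so each side can evolve independently.

    # Hypothetical sketch: the 'interpreted' is data, the 'interpreter' is code.
    from dataclasses import dataclass
    from typing import Union

    @dataclass
    class Lit:                         # interpreted abstraction: a tiny expression DSL
        value: float

    @dataclass
    class Add:
        left: "Expr"
        right: "Expr"

    Expr = Union[Lit, Add]

    def interpret(e: Expr) -> float:   # the interpreter, kept strictly separate
        if isinstance(e, Lit):
            return e.value
        return interpret(e.left) + interpret(e.right)

    assert interpret(Add(Lit(1.0), Add(Lit(2.0), Lit(3.0)))) == 6.0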

Be functional and agile

Agile development is a lot about prioritizing your requirements and implementing them in work iterations. That means that the first level of your design, what might be seen as the first draft of the primary keys of your database, is directly tied to the most important business concepts (taken from OO vs FP vs agile). The trick is then not to see this as a database of data, where keys are data, but as a database of code, where keys are functional properties such as APIs with specific invariants (e.g. somewhat like type classes in Haskell). The extended trick is then also to make this a normalized design (just like a normalized classical DB schema, but in the "functional space"), for example by using linear types.


Brute force yes, but with the right pieces

Let us put this all together, our first conclusion is when building new software:
Use brute force and minimal brains to assemble data and call external APIs 
Importantly, keep your program structure as flat as possible when you do this. Which pretty much means shunning OO patterns, and using as many array and container like constructions as you want.

Then, and this is the tricky part:
Incrementally invest effort to separate interpreter and interpreted abstractions in your code.
Always make sure that these abstractions fit within the plan to your deliverables. That means you are able to think backwards from your deliverable goal, and work out the path to your current development state, the path which you will follow to build your code. An agile process and a good team will make this an easier job. Also, all those arrays and containers will need to fit into a single unified concept (e.g. such as indexed tensors).

It's complicated

Sometimes things are just too complicated. Many developers do not know how to "separate interpreter and interpreted abstractions". Here I will be honest: it is not easy. And just to make this clear, the last DSL I wrote took me multiple tries until "I got it right". Also, embedding product abstractions in higher-order FP is even harder than writing DSLs.

Worse, the process is to "hack brute force" to assemble data and use external APIs, and only then to think about how you can slice your interpretive abstractions. This means that initially, software development is about doing, not thinking. Tricky...

Welcome to 2020

I have been writing this blog for almost 15 years. Interestingly, a major change over these years is that there is so much well integrated open source software (OSS) out there that it is cheaper to "just try" than to "think" (when using OSS and third party APIs). And in part it was this reality that led me to write this blog today. Enjoy, and stay safe!

All original content copyright James Litsios, 2020.

Sunday, January 11, 2015

Software designs that grow with monads, comonads, and type compatibility

This post is about designing software with internal and external APIs that are robust to future changes. It is therefore about API compatibility, but more importantly it is about the compatibility of a full software design to changes. Not surprisingly, monads and comonads are part of the presented solution, as is a careful approach to use of types.

I had a "aha" moment last year when I watched a video (by Timothy Baldridge)
that showed how an AST for a Clojure compiler was fully based on key-value pairs (nested hash maps), therefore without typed structures nor classes, and was doing very well. The thing is, I have suffered enough times to get the types of the different phases of a compiler to fit together. Therefor the idea of giving up on types and just using key-value pairs, that can easily be added for each compiler phase, seemed really to be an attractive way to write a compiler.

Key-value pairs, duck typing, and many "traditional" functional patterns (think lisp, not Haskell) have all in common their reliance on generic, almost typeless, structures. So while each "atomic element" of these patterns has a type (e.g. int, string, ...), the structures (e.g. struct, list, array, ...) are all generically typed.

Types are what capture the invariants of a design. Giving up on types is in effect giving up on capturing those invariants in a robust manner. Using types normally leads to higher quality software, yet with enough complexity, types no longer capture all the design invariants of a program. Worse, sometimes types actually hinder the design by holding it back, because they are not powerful enough to capture the desired invariant. This is the case with the difficulty of designing a typed AST that "fits" all phases of a compiler. This rigid nature of types is also the hidden problem of API compatibility.

The invariants of APIs are captured by the types of their signatures. When new features or corrections are added to an API, these invariants and associated types evolve. When APIs link different software projects, changing API types is where API compatibility becomes an issue: APIs remain compatible when types change yet remain compatible with older types; APIs become incompatible when the types are no longer compatible. An example of type compatibility in OO is to derive a new class from an existing class, and to add a constructor in this new class from the old class. Unfortunately, that last example is at the source level. At the binary level, compatibility to type change is usually nonexistent, especially when focusing on forward compatibility. Note that API compatibility is not only about types: an API will also become incompatible when the interpretation given to values transferred over the API changes. Therefore to remain compatible, an API can add new value interpretations but must also ensure that past value interpretations never change.

Serializing data for transport and for storage is much about breaking program-defined types into language-standard basic types, and keeping a strict discipline of interpretation to ensure compatibility. Therefore ensuring both backward and forward compatibility of an API is to maintain a strict growth of value interpretation and to use serializing packages like Protocol Buffers or Thrift. The question is then: how do we ensure the future growth of a complete software design, not just a single API, but a family of APIs? Just like with the single API, the answer also lies in the notion of serialization. The goal is to stitch together the typed lower-level design with a higher-level untyped design that can change over time.

Data is serialized by walking through it in a certain order and breaking it down into its primitive types. Programs can be broken down the same way. Yet to do that, you first need to adopt a functional programming style, because it is hard to serialize procedural constructions. In a functional style, only functions need to be "broken down".

In the good old days, scalable software design was about using constructions such as data schemas, object patterns, futures, continuation style, etc. Data schemas are still good, but all these other program construction elements must be revisited with the fact that they can all be implemented with monads and comonads. More importantly, they must be revisited because the bind and cobind operators (and other monadic and comonadic operators) are what serialize functions! Therefore, just like you must serialize your data schema to ensure future data compatibility, you must "serialize" your functions with monads and comonads to ensure future design compatibility.
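
As a minimal sketch of "bind serializes functions" (my illustration here, not from the designs listed below): a monadic pipeline can hold its steps as data, so the same program can be run now, inspected, or stored for later.

    # Hypothetical sketch: a pipeline "serialized" by bind into named steps,
    # i.e. functions broken down the way data is broken into primitive types.
    class Program:
        def __init__(self, steps=()):
            self.steps = tuple(steps)        # the serialized form of the program

        def bind(self, name, f):
            # bind appends a step; the program stays inspectable data.
            return Program(self.steps + ((name, f),))

        def run(self, x):
            for _, f in self.steps:
                x = f(x)
            return x

    p = Program().bind("parse", int).bind("double", lambda n: n * 2)
    assert [name for name, _ in p.steps] == ["parse", "double"]   # data view
    assert p.run("21") == 42                                      # execution view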

Here are a few examples of existing designs that do this:
  • Injection frameworks are comonadic constructions. 
  • Transactional frameworks are monadic constructions.
  • Parallel tasks are monadic constructions.
  • Embedded queries are monadic constructions.
  • Reactive streams are monadic constructions.
  • Lenses are comonadic constructions.
  • Automatic differentiation (computing analytical derivatives of numerical code) is both monadic (forward differentiation) and comonadic (backward differentiation).
Just like data compatibility is tied to the order in which data is traversed, future design compatibility is tied to the "order" of function serialization. That is to say that each software design is strongly defined by the order in which functions are broken into monadic and comonadic constructions. While monads and comonads have a duality relationship, they fundamentally differ in "character": monads are "trace" centric, while comonads are environment centric. Said differently, conditional expressions can be monadic, while comonadic expressions can abstract away their environment. A design is then a chosen hierarchy of monads and comonads with a chosen set of API extension points.

Just like with data compatibility, future design compatibility is tied to the degree to which types can be changed and remain compatible. And again, we differentiate between the need for source compatibility (e.g. for component designs) and binary compatibility (e.g. to design distributed systems). Use strong types to build your design when your programming language offers a type system that ensures that forward design compatibility is supported by forward type compatibility. Limit your reliance on types when these do not provide this forward compatibility. If this limited use of types, or the limits of the type system, do not allow monad and comonad constructions to be expressed with types, then use typeless/generic bind, cobind, and other monadic and comonadic like operators (e.g. duck typing on key-value pairs).

Finally, use only the language features that allow you to break down your functions into monadic and comonadic constructions. For example, only use destructive assignment if you can model it within your monadic and comonadic design. Also, do not forget that you also need to enforce a "growth only" rule for your interpretation of values.

Now having said all this, you cannot build a design if it costs too much or if you cannot explain it to others. For example, I once wanted to upgrade an algorithmic trading API around a monadic construction (within a C++ framework). I communicated around prototypes and presentations (I was CTO), yet failed to get across to my key team members (who were top-notch developers), and ended up canning the idea. And this brings me back to a few important issues:
  • Monadic and comonadic constructions are higher order; even if you do not implement them with higher-order types, you still need to think of them as higher-order invariants hidden in the back of your design. This is how a mathematician would "see them", and it is not obvious from a classical procedural or OO background.
  • The cost to implement typeless/generic monadic and comonadic constructions within your language may simply be too high.
For example, concerning this last point, this week I wanted to implement a duck typing pattern in Java on top of Protocol Buffers objects (for all the reasons mentioned above), but Protocol Buffers is only efficient if you access fields in a typed manner. So even while I could see how to build a design with a robust algorithmic binary compatibility model, the cost in performance would simply be too high to do it on top of Protocol Buffers. (Luckily, using Protocol Buffers is not a requirement for this project, so I can move to another data serialization protocol that supports fast generic field access.)

As an end note: I understand that I am not telling you in this post "how" to write monadic and comonadic style constructions, especially without a functional language. What I am telling you is that you want to know how to do this if you want your designs to grow well.

All original content copyright James Litsios, 2015.

Sunday, October 05, 2014

Recreational programming: zipper augmentations

I rarely program professionally, so when I write software, it is in the spare moments I have between home and work (these days providing consultancy services). That really means on the bus and the local tram and subway network here in Zurich.

I do not program for the fun of making my executables run; I mean, for the fun of having the program really do something useful. I program for the fun of the exploration! What I enjoy is understanding how to use new concepts. Usually these are in specific research domains, which unfortunately for this blog are still to remain unpublished, but the last two weeks have had me travel into more classical mathematical and software territories, which allows me to talk about them here.

The way research goes is that you need to take leaps of faith. There are lots of territories to explore, and you have these gut feelings that tell you to go in one direction or another. Yet each of these explorations will take time and effort, often leaving you with no other choice than to do nothing with your feelings. Yet too many unexplored feelings will make you anxious. Therefore, from time to time, you need to take the risk of following up on your desires, even if you do not know exactly where they will lead you.

This was the case for myself two weeks ago: I was feeling nostalgic about KD-Trees. I could not know for sure if part of me wanted to revisit them because my first (summer) job as a developer had me adding these tree structures to an electronic CAD program, or because there was more to find in revisiting them. The only way to find out was to put in some real work. I therefore quickly implemented a KD-Tree for rectangles (it uses a four-dimensional search space), in F# under Xamarin on my MacBook.

Here I make a small parenthesis: F# on Xamarin for iOS is great. Xamarin comes out with an update every week or two, and they keep it one of the most productive environments that I know of to develop mobile apps on.

Back to the trees. KD-Trees have the fancy property of allowing you to make quick proximity searches (e.g. to find the nearest neighbor to a given point). Implementing them functional style allowed me to understand how to generalize this notion of tree augmentation within the concept of zippers and focuses. A very nice and powerful concept. I try to explain this hereunder, with a small code sketch after the list.
  • Take a search tree. 
  • Augment the tree
    • Tree augmentations are secondary-key-like values that are added to each tree node and that only depend on the local subtree values.
    • For example, in KD-Trees, you can store per node a bounding box of all elements stored at that node and below it. 
  • Now create a zipper for the tree
    • A zipper is a structure that breaks recursive structures apart, allowing you to serialize access to them.
  • Make a focus for this tree and zipper
    •   A focus allows you to travel up and down a tree with immutable data structures. It is a list of zippers and a node of the tree. The list of zippers is used to "go back up the tree".
    • See Huet's paper for a good description of the focus/cursor/location.
  • Augment the zippers to complement the augmentations of the tree.
    • For example, in the case of the KD-Tree with nodes augmented with bounding boxes, the zippers can be augmented with a "bounding hole", that is, the space around the current subtree that is known to be empty.
    • This is what my excursions into KD-Tree taught me: you can augment the zipper to complement the tree augmentation.
  • Now create a map function that goes over the tree with a focus. With both the tree nodes and the zipper augmented, the map can do proximity searches for each element of the tree in record time.
  • Finally, wrap up this mapping function in a zipper comonad because it is cool! 
    • I tried to find a good zipper comonad example off the web to give you a reference but could not find any clean and obvious code. So just search for zipper comonad. 
(A note here that there seems to be a lot of variation in terminology. Some people use the term zipper to describe the focus, while others use the terms cursor, location, or context instead of focus. I just know that the first example I read about used the terminology I used above.)
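
Here is a minimal sketch of a binary-tree focus (in Python for accessibility rather than the F# I used; the names are mine): the zipper list records the path down, so moving back up rebuilds the tree immutably, and each zipper could carry extra augmentation such as a "bounding hole".

    # Hypothetical sketch: an immutable binary tree with a zipper-based focus.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass(frozen=True)
    class Node:
        value: int
        left: Optional["Node"] = None
        right: Optional["Node"] = None

    @dataclass(frozen=True)
    class Zipper:          # one step of the path; augmentations would live here
        side: str          # which child we descended into
        parent_value: int
        other: Optional[Node]

    def down(focus, side):
        zippers, node = focus
        child = node.left if side == "left" else node.right
        other = node.right if side == "left" else node.left
        return ([Zipper(side, node.value, other)] + zippers, child)

    def up(focus):
        (z, *rest), node = focus
        left, right = (node, z.other) if z.side == "left" else (z.other, node)
        return (rest, Node(z.parent_value, left, right))  # rebuild immutably

    tree = Node(2, Node(1), Node(3))
    focus = down(([], tree), "left")       # focus on the left child
    assert up(focus) == ([], tree)         # going back up loses nothing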

Finally, some people have used multiple KD-Trees to do parallel searches, but only a partial search in each tree (example). Such a parallel "sampling" approach is a way to reduce the "skewing" effect that each individual tree hierarchy has on the tree and zipper augmentations.

Wednesday, August 14, 2013

OO design patterns from a functional programming perspective

Here is my attempt to match up OO design patterns with functional programming constructions:
Chain of responsibility
Fold the request across the chain of handlers, at each step returning either (using the Either type) a request to continue the fold (possibly with a transformed request), or a handled request indicating that the fold is finished. (A minimal sketch follows.)
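
As a minimal Python sketch of this first pattern (hypothetical names; a small sum type stands in for Either):

    # Hypothetical sketch: chain of responsibility as a fold over handlers.
    from dataclasses import dataclass
    from functools import reduce
    from typing import Callable, Union

    @dataclass
    class Continue:              # keep folding, possibly with a transformed request
        request: str

    @dataclass
    class Handled:               # the fold is finished
        result: str

    Step = Union[Continue, Handled]
    Handler = Callable[[str], Step]

    def chain(handlers: list, request: str) -> Step:
        def step(acc: Step, handler: Handler) -> Step:
            return acc if isinstance(acc, Handled) else handler(acc.request)
        return reduce(step, handlers, Continue(request))

    auth = lambda r: Handled("rejected") if "anon" in r else Continue(r)
    serve = lambda r: Handled(f"served {r}")
    assert chain([auth, serve], "user:bob") == Handled("served user:bob")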

Command
Request is held in data (e.g. as a discriminated union).

Interpreter
Language is available as a typed data structure (e.g. a discriminated union) and is interpretable with function calls.
Alternative: Language is defined in monadic form (e.g. as function calls) and is interpreted as part of the monad's interpretation.

Iterator
Exposed traversal functions hide the data structure of containers. Functions typically work on a zipper focus adapted to the container's structure.
 
Mediator
Exposed functions stay at top/root of exposed data structures.

Memento
Use immutability and exposure of intermediate states to allow recovery, restart or alternate computation from previous intermediate states.

Observer
Associate a container of data with a container of queries (e.g. filter plus callbacks). Changes to the data container (insert, update, removal) trigger matching query callbacks; changes to queries may trigger callbacks. Callbacks are either in a monadic structure or pass around a state context.

State
Data context position is indexed (e.g. by one or more keys); associated state is stored in a corresponding state monad where individual states are found by index.

Strategy
Single function signature implemented in multiple ways.

Template Method
Use type classes to refine implementation by type variant.

Visitor
Break down the algorithm into sub-functions; provide these sub-functions as arguments to the algo; use them within the algo; provide different ones if you want to change the behavior of the algo.

Adapter
Adapter functions take one or more functions with given signatures and return one or more functions with adapted signatures.

Bridge
Clients access a bridge data type that holds functions.

Composite
Use recursive data types.

Decorator
Data has a dynamic map of functional operators.

Facade
API presents no internal types and is as "flat" as possible. This may mean that certain functional constructions have been "collapsed" into a purely data centric view in order to limit the "depth" of the API.

Flyweight
Flyweight data structures are often provided as "first" arguments of functions. These functions can then curry their first argument so that the flyweight shared data is provided but does not need to be "passed around" everywhere.

Proxy
A data type that "hides" another type. Possibly relies on using an index/key reference and a map-like container to store the original data.

Abstract Factory
The factory hides the finer type of created data and functions. Losing the inner type can be tricky because the exposed signatures must still be powerful enough to do everything you want with the exposed API (as we assume that no dynamic type casting is allowed). This may mean that you need powerful type features to make the abstract factory concept work (e.g. GADTs).

Builder
Define functions to construct data. This is in fact the "normal" way to write functional style.

Factory Method
Use polymorphism, types are determined by inner types of arguments.

Prototype
(Immutable data does not need to be copied).

Singleton
Encapsulate the access to the singleton as a mutable variable within a function; use this function to access the singleton. In an immutable setting, store the singleton within a state monad.

(I'll revisit this one if I have the time to put an OO definition of each pattern side by side with its functional definition. Also, it is a bit minimalistic.)

All original content copyright James Litsios, 2013.

Tuesday, October 09, 2012

Functional meets Fortran

I decided my life needed a little bit more math, and so decided to add a few differential equations to some F# code. Math in code means applied math and lots of arrays, but arrays are a pain in functional style. So here's my approach.

I have written many numerical applications, mostly in C++, yet also some in Fortran. I mention Fortran because it has the notion of equivalence, where you can map arrays onto each other. Equivalence is like C unions for arrays, and it reminds us that arrays do not need to be managed as dynamic memory.

A functional style of programming is a lot about preserving your invariants with immutability, yet immutable vectors are not a solution (yet). Still, numerical code has its own invariants, and therefore some immutability is possible, and should be exploited to write robust code. In my case, I have vectors and matrices; their sizes are invariant, and so is the structure of the matrices: they have lots of zeros, they are sparse. The approach is then to specialize a state monad to maintain size and structure in an immutable manner, while allocating mutable vectors only within the core of the numerical engine.

Long ago I wrote a matrix assembly algorithm. You can see it as sparse matrix structure inference code. It was really pretty simple: the final matrix is an assembly of many small sparse matrices; finding the final structure can be seen as solving a set of matrix structure equations; the algorithm defines an order of traversal for all these matrix points and then proceeds to "crawl" across all matrices at once, working out all the local structures. In a functional style of programming, I can use the same newVar like concept you would find in a constraint or inference solver, extend it with an equivalent newEq, and build a monadic environment to support my numerical code. The result is that "application code" just uses variables and defines equations; these are then assembled, and only later does the code allocate just a few very large arrays to store the real data associated with the problem.
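
As a minimal sketch of the newVar/newEq idea (in Python rather than F#, and without the monadic wrapping): application code declares variables and equations, the sparse structure is inferred first, and one large mutable array is allocated only at the end.

    # Hypothetical sketch: build structure immutably first, allocate storage last.
    class System:
        def __init__(self):
            self.n_vars = 0
            self.entries = set()     # sparse matrix structure, as (row, col) pairs

        def new_var(self) -> int:    # returns an index, not storage
            self.n_vars += 1
            return self.n_vars - 1

        def new_eq(self, row: int, cols: list) -> None:
            self.entries.update((row, c) for c in cols)

        def allocate(self):          # only now do we touch bulk mutable memory
            values = [0.0] * len(self.entries)   # one flat array of coefficients
            index = {rc: i for i, rc in enumerate(sorted(self.entries))}
            return values, index

    system = System()
    x, y = system.new_var(), system.new_var()
    system.new_eq(x, [x, y])         # equation row x couples variables x and y
    system.new_eq(y, [y])
    values, index = system.allocate()
    assert len(values) == 3 and index[(0, 1)] == 1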

This approach is "easy" because all invariants are built first and then used. Much modern applied math depends on hierarchical and adaptive grids. Hierarchy is probably easy to bring into a functional style; adaptive grids are maybe less obvious.

Thursday, September 06, 2012

Where is my state?

People often associate functional programming with statelessness. The reality is that there is always a notion of state; functional programming is really about stateful programming. The ambiguity is probably the result of state appearing in different ways than in a procedural style.

In a procedural style, state is associated with memory, and you change it by using the writable property of memory.

In a functional style, state is still in memory, but the “classical” approach is to disallow yourself from changing it by writing over that memory again. This is the immutable programming style. It goes with functional programming because it is very difficult to build on immutability without some functional programming concepts (I have tried!). So if you have state and you cannot write over it, the only thing you can do is read it, and if you want to change the state, you need to write your new state value to a new memory location (a non-functional, near-immutable approach is to use arrays and grow them). The problem for beginners is: “where should that new state be?” It needs to be accessible, and while you may have known how to access the old state, where should you put the new state so as to be able to work with it instead of the old state?
Recursion is then the first way to manage state. The idea is that if you have a function F that works with state, then when you create a new version of the state you can simply call F recursively with the new state, and again and again with each new version of the state.

Recursion is always somewhere nearby if you are using immutability, yet you do not need to expose it to manage state. Again, suppose a function F uses state: have it return its state after use; if the state has not changed, return it unchanged; if the state changed, return its new version. The trick is then to have a loop that calls F with the state, then takes the returned value to call F again, and so on. As this is immutable code, the loop is a small tail-recursive function.
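
A minimal sketch of this loop in F# (the state here is just an int, and all names are mine):

    // F returns a (possibly unchanged) state plus a "keep going" flag.
    let step state =
        if state < 10 then state + 1, true    // new version of the state
        else state, false                     // state unchanged: stop

    // The small tail-recursive loop that feeds each state back into F.
    let rec loop state =
        match step state with
        | next, true -> loop next             // tail call with the new state
        | final, _   -> final

    let result = loop 0                       // 10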

State can be big and heavy, and not for the wrong reasons: your application may simply be a big thing. The last thing you want to do is to copy that whole structure just to make a small change. The simple approach is to reuse parts of your old state to make a new state, changing only what needs to be changed. Yet as the state is usually a tree-like structure, if what is changing is deep in the original state, then it will cost you to create a new state, because you will need to rebuild much of the hierarchy of the state. One way around this is to take the old state, encapsulate it in a “change of state” structure, and use this as the new state. This approach is what I call a differential approach: the state change is the difference. By stacking differences you build a complete state. Usually the differential approach is combined with the “normal” partial-copy approach. The reason is that if pursued forever, the “stack” of state changes becomes too big and starts both to consume memory and to take time to access.
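
A hedged F# sketch of the differential idea (the types are mine): reads look down through the stack of changes, and flattening rebuilds a plain state once the stack grows too big.

    type State =
        | Base  of Map<string, int>
        | Delta of key: string * value: int * older: State

    // "Changing" the state is just stacking one more difference.
    let set key value state = Delta(key, value, state)

    // Reads walk the difference stack, newest first.
    let rec lookup key state =
        match state with
        | Base m -> Map.tryFind key m
        | Delta(k, v, _) when k = key -> Some v
        | Delta(_, _, older) -> lookup key older

    // Collapse the stack back into a plain map; newer changes win because
    // they are added after the older ones.
    let rec flatten state =
        match state with
        | Base m -> m
        | Delta(k, v, older) -> Map.add k v (flatten older)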

In the end your new state will either leave your function “onwards”, into a recursion, or “backwards”, by return; and often the same algorithm may use both methods.
When beginning with the functional style, state seems to be forever a pain to drag around. But with time, patterns appear. First you may notice that not all states have the same lifetime, and they do not all have the same read/write properties. It is a good idea to try to separate states that have different access and update properties. You may then note that the hierarchy of functions aligns itself with the access patterns of your state. Yet if you tie state to function in the layered fashion that is classical in procedural programming, you realize that what you have is not so much state as parameters. That is, while you are working in an inner function you have no ability to modify the outer “state”, so it is more parameter-like than state-like. Nevertheless, in situations where states have a strict hierarchy, this approach is OK.

At some point, carrying around state becomes such a standardized routine that it makes sense to look for help to go to the next level. And the next level is… monads. Here is the deal: I said before that state will either exit your function through a further recursion or by return. That means that there is a set of compatible patterns that can be used to assemble little bits of code that work on state, so that they nicely fit and the state zips around as an argument or as a return value. This pattern appears naturally when you work with states: it is the idea that the state is passed as the last argument and that the stateful functions always return a state. As you want the function to do something useful, you systematically return a tuple with a state and a function-specific return value. Before understanding monads, I wrote whole applications like this. What monads allow you to do is to standardize this call pattern so much that it becomes possible to strip the exposed code of these “peripheral” states going in and states going out. The result is code that looks stateless but that isn’t. The value of monadic state management is that you make far fewer mistakes, and you can specialize your monad further to give it special properties. Still, from a stateful perspective the main gain of monadic states is productivity: hiding the state means you can focus on more important things.
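
For example, here is a minimal state monad in F# (the computation expression is my own sketch, not anything from a library): a stateful computation is a function from state to a (result, state) pair, and the builder hides the threading just described.

    type Stateful<'s, 'a> = 's -> 'a * 's

    type StateBuilder() =
        member _.Return x = fun s -> x, s
        member _.Bind(m: Stateful<'s, 'a>, f: 'a -> Stateful<'s, 'b>) : Stateful<'s, 'b> =
            fun s ->
                let a, s' = m s
                f a s'

    let state = StateBuilder()
    let get = fun s -> s, s                  // read the current state
    let put s = fun _ -> (), s               // replace the state

    // The exposed code now "looks stateless but isn't":
    let tick =
        state {
            let! n = get
            do! put (n + 1)
            return n
        }

    let before, after = tick 41              // before = 41, after = 42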

Saturday, August 11, 2012

The case of the mysterious distracting type class

I had a magical insight recently about something that had been bugging me for almost two years. The question was: why the hell was I so unproductive creating type classes!

So here is the deal: when you develop, you combine invariants and transformation properties of your different pieces to make something new. If you are someone like me, you live your code, and that means that the relations between the different concepts throughout the code drive me "emotionally". Yesterday, it occurred to me that what is special with type classes is that they are "open". They are open because their properties are not always obvious from the definitions. Worse, new type classes often have ambiguous properties.

I say open because, by contrast, if I avoid type classes by explicitly defining type-specific functions, or if I create encapsulating types to replace the functional polymorphism by data variants, then I am in effect "closing" the design. With a "closed" design I have the whole design in my head and I "just know" what is best to do next. With an open design I find myself worrying about peripheral design issues which are mostly not so important, yet I still worry about them... and that really hurts my productivity.
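
To show what I mean by "closing", here is a hedged F# sketch (F# has no type classes, which is rather the point): the record of type-specific functions, with names I made up, is the explicit, closed replacement, and every "instance" is visible right where it is used.

    // The "dictionary" of type-specific functions, passed explicitly.
    type Show<'a> = { show : 'a -> string }

    let showInt = { show = fun (i: int) -> string i }
    let showPair sa sb = { show = fun (a, b) -> sa.show a + ", " + sb.show b }

    // Every capability display needs arrives as an explicit argument.
    let display d x = printfn "%s" (d.show x)

    display (showPair showInt showInt) (1, 2)    // prints "1, 2"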

I will therefore adopt a much stricter approach to creating new type classes. The basic rule is NOT to create new type classes. Yes, I love to think about them, but I can't afford to. Then, when I have mentally built all the invariants with the explicit function calls or encapsulating types, and more importantly when I have finished what I was originally trying to do, I can imagine refactoring and bringing in new type classes.

Friday, May 13, 2011

Immutable, mutable, higher order designs and monads


One of the great pleasures of programming functionally is to be able to rely on immutable data. With immutability, side effects need to be explicitly defined, either as return values from functions or as encapsulations in monadic states. Immutability allows two important properties: first, it increases productivity by diminishing what needs to be cared about (freeing the mind to care about the right things!); then, almost more importantly, it provides the foundation for higher order programming concepts by preserving an acyclic relationship between functions and data. By higher order, I mean that data is replaced by functions or that functions are encapsulated in data; for example, in a context like data differentiation or in the use of monads. By acyclic, I mean that all trace trees are acyclic (a trace tree can be seen as a dynamic call graph where recursion is not a loop). Let me explain…
It is OK to have cycles in the static call graph. That is because we know how to infer invariants (e.g. types) from inter-dependent and recursive functions. Likewise, recursive data types are also OK. Nevertheless, everything is a bit trickier if we look at things from a data flow dependency point of view. This is the view we need to take if we really want to understand how the functions are working on data and producing new data. And the sad reality is that it is generally hard to work out invariants from dependency graphs that have cycles because of mutable data. Nevertheless, there are cyclic dependency graphs that are easier to work with than others. More specifically, the more local, or of limited life span, a part of the dependency graph is, the easier it is to understand its scope and properties.
I mentioned trace trees above because they give me the bridge I need to wrap up this posting. The insight is the following: mutability brings in cycles in dependencies; cycles in dependencies kill the invariants that allow higher order code constructions; trace trees are a down-to-earth way to work with dependencies; monads can be seen as redefining call points in the trace tree (nodes); and as a consequence, one of the safest ways to introduce mutability is within the scope of a monad. Again, we know the local trace tree will end up with cycles when we introduce mutability, but if this trace tree is “protected” in a monad, then its impact on the rest of the trace tree is easier to understand, because of the intimate relation between monads and trace tree nodes.
You could argue that we do not need mutability. Yet real world I/O is mutable, and therefore you cannot escape working with mutable data. Not to mention that to get performance you may need to go mutable too.
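
A simplified F# illustration of "protected" mutability, without the monad machinery, just scoping (the function name is mine): the mutable array never escapes mid-construction, so from the outside the function behaves as a pure one, and the dependency cycles stay local.

    let cumulativeSums (xs: float[]) : float[] =
        let acc = Array.zeroCreate xs.Length   // mutable, but local to this call
        let mutable running = 0.0
        for i in 0 .. xs.Length - 1 do
            running <- running + xs.[i]
            acc.[i] <- running
        acc                                    // handed out only once complete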

Saturday, April 23, 2011

OO versus functional programming versus agile

Have you ever asked yourself why database tables do not have two primary keys? Or maybe you have started a spreadsheet with a choice of columns and rows, only to change them later?

One of the little understood properties of programming is that at the lowest level things tend to be single dimensional. So, for example, the balanced tree has just one key, and so on. There is a theoretical reason for this, but the general fact is that it is much easier to design a structure or algorithm to do one thing than to do two or more.

Object orientation starts with the fundamental idea that objects can do many things. And this naturally leads to the expectation that they can do them equally well. We know from the previous paragraph that this is not true, and yet many do not know this, or forget it. You would think that this mismatch between expectation and reality is not really a big problem, yet sadly it is. OO programmers are constantly trying to fit multiple dimensions of abstraction into single objects, without understanding that they will never be able to do it well without giving more weight to some dimensions over others. Not prioritizing things, not having a clear hierarchy of concepts, leads to a random choice of hierarchy at the deeper implementation level, and to redundancies that do not support each other well. The end result is that OO software tends often both to grow and to age badly.

Functional programming approaches things differently. A function has only one entry point: from the top. Primitive data structures are all simple (one dimensional) and must be composed to build anything real. The consequence is that with functional programming you need to decide "on construction" how you will interleave the different dimensions of your problem. Unlike OO, you are immediately forced to care about how you will implement things. This is both a good thing and a bad thing. The good thing is that, in the end, there is no escaping that the deep-down implementation of your structures needs a hierarchy, so you might as well take responsibility and choose how you will build your solution. By doing so you can keep things consistent, and consistent does not always mean the same: it may mean that your choice of order of composition is different for different parts of your program, but that the unrolling of one part matches the "rolling-up" of another part of your program. The bad thing about needing to choose early how you split and hierarchize things is that you need a lot of experience to do it right (think monads, for example). This is why, although OO programs tend not to grow well, they are still cheaper to produce, as you need senior people to work in a functional style.

But now I would like to bring in something different and yet very relevant. Agile development is a lot about prioritizing your requirements and implementing them in iterations. That means that the first level of your design, what might be seen as the first draft of the primary keys of your database, is directly tied to the most important business concepts. If you think of it, that is not a bad choice! So we can observe that agile development helps structure your design when you do not know better.

Monday, March 28, 2011

C++ and functional programming

What to say to C++ programmers about functional programming, especially now that many compilers have implemented C++0X features like lambda and auto? I am pretty good at C++, having started using it with the Zortech compiler V1 and the first versions of CFront (the original C++ to C translator). I have also written much fancy template code. Therefore, thinking about this post leaves me in a dilemma: there is one part of me that likes the features of C++0X that "wrap up" loose ends, and another part of me, the functional programming part, that makes me want to rant! I will not rant today, but part of the reason for being upset needs to be said, as it is relevant. The "magic" of mature functional programming is:
  • Statements, expressions, functions and types support each other
  • Type inference and polymorphism automate the relation between statements, expressions, functions and types
  • Monadic and other "advanced" types are available to define new execution models such as parallelism.
C++ templates can be seen as a primitive functional language. But I would argue against going into production focusing on C++ templates that way, as I have been burnt so many times by C++ compilers failing me with complex templates, not to mention the difficulty of hiring for this skill. So the template functional style is not where the emphasis should be.

The real issue is that although C++ expressions have a C++ type counterpart (think operators like + and -), there is no C++ type support for statements. As a result, many of the core concepts of functional programming cannot be mapped into C++, even with the new C++0X features of lambda, garbage collection, auto, etc.

My recommendation is that some concepts should be understood by C++ developers, but not necessarily with a functional emphasis. My favorite topic is, as always, monads. In C++ terms, monadic concepts define properties on the "traces" of the code. If we say: f(...); g(...); then the execution trace of the program will first have gone through f, and therefore when g is called, properties set by f may still be valid. A good C++ architect should understand enough about monads to define invariants that can be used in this type of trace-oriented context. For the moment, I would forget about the other "functional programming concepts" in C++, even if it is C++0X!