Sunday, October 18, 2020

Healthy, noisy, productive Python

Asked about my feelings about Python, “speckle noise” came to mind.  The bigger subject here is Python's limited types and inherent noisiness in capturing deeper invariant properties (and therefore my speckle remark). 


Speckle noise is the noise that arises due to the effect of environmental conditions on the imaging sensor during image acquisition. The analogy is that there is inherent noise in how one captures ideas in Python, and that this noise has "good Pythonic character". By noise, I mean something that cannot be precisely mastered.  Note that "real" speckle noise is not "good" or "bad"; it just is.


All programming languages are "noisy". Yet to the developer, the way that noise affects you varies greatly. "Messiness" of computer languages may hurt you, as it may also help you (e.g. by improving your team's productivity). Said differently, sometimes "cleanliness" is unproductive.  The main idea is the following: 


People naturally care about uncertainty.  Therefore, sometimes, we naturally focus on things that are not certain. As a bonus, we are naturally social around uncertain topics (think of the weather!), in part because we are happy to share when no one has an absolute truth, but also because sharing helps us deal with these uncertainties. Finally, there are many situations where an "external" nudge is needed to move out of a local minimum. (I mention here the suggestion that financial markets need a bit of uncertainty to be healthy, and here how I was once stuck in a bad local C++ design).


People naturally build on certainties. And when we do so, we in part lock ourselves in, because it would cost us to change what was certain and rebuild things.


This game of "certainty", "uncertainty", "building and locking ourselves in", "not finding a solid base to build", is what happens when we program.  Our choice of software language strongly affects how this happens. I have programmed and managed teams using Python, Haskell, F#, C++, Pascal, C, and Fortran (plus exotic languages like OPS5). Each of these languages is "robust" at a different level, some with more impact than others. 


Python, for example, is a language where expressions and functions come first, and types (e.g. list, object, classes, ...) are mostly a thin way to group functions and data together.  Contrast this with Haskell, where types are more important than expressions. The result is that new concepts are quickly captured in Python, and are considerably harder to capture in Haskell. However, it is quite difficult to capture deeper invariant properties of new concepts in Python, something that is easy to do in Haskell, with its strong types.
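To make this concrete, here is a small sketch of what I mean. The "interval" concept and the names width and Interval are made up for illustration; they are not from any real project. The first version captures the concept in one line of thought, but nothing enforces that the lower bound stays below the upper bound. The second version captures that invariant, but only through a hand-written runtime check.

```python
from dataclasses import dataclass


# Quick capture: an "interval" is just a tuple (hypothetical example).
# Nothing states or enforces the invariant lo <= hi.
def width(interval):
    lo, hi = interval
    return hi - lo


# Capturing the invariant takes extra code, and it is only checked at runtime.
@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __post_init__(self):
        if self.lo > self.hi:
            raise ValueError(f"invalid interval: lo={self.lo} > hi={self.hi}")

    @property
    def width(self):
        return self.hi - self.lo


loose = width((5, 2))          # accepted silently, although the bounds are reversed
strict = Interval(2, 5).width  # fine; Interval(5, 2) would raise ValueError
```

The tuple version is the one that tends to get written first, and it is the one that travels through the rest of the code.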


We might summarize by stating that Python has noisy types. At least that is often the way I feel when "dragging" concepts from one expression to another using "glue" lists, dictionaries, objects or tuples, just to make it work. There is also Python's limited dispatch logic, which forces yet more ad hoc constructions into your expressions. Yet the magic of the real world is that such noise-creating properties are not necessarily bad!
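A minimal sketch of both points, with invented names (parse, total, describe) and an invented record layout. A plain dict is the glue that carries a concept from one function to the next, and functools.singledispatch from the standard library dispatches only on the type of the first argument. Since every glue dict is "just a dict", telling one kind of record from another falls back on ad hoc checks inside the function.

```python
from functools import singledispatch


# A "glue" dict carries the concept from one expression to the next.
def parse(line):
    name, qty = line.split(",")
    return {"name": name.strip(), "qty": int(qty)}


def total(records):
    return sum(r["qty"] for r in records)


# singledispatch selects an implementation from the type of the first argument only.
@singledispatch
def describe(value):
    return f"unknown: {value!r}"


@describe.register
def _(value: dict):
    # Every glue dict lands here; telling records apart needs ad hoc key checks.
    if "qty" in value:
        return f"record {value['name']} x{value['qty']}"
    return f"dict with keys {sorted(value)}"


@describe.register
def _(value: list):
    return "; ".join(describe(v) for v in value)


records = [parse("apple, 2"), parse("pear,3")]
summary = describe(records)
```

It works, and it was quick to write, but the "record" type exists only in the heads of the people who wrote parse and describe.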


A few years ago, I hired Haskell developers to build a system with "crystalline design properties". This had been one of my goals since being responsible for a "pretty messy design" in the late 90's. Therefore I co-founded a company with, in part, the goal of building a "single coherent distributed system". It is not easy to create a system where all complementary concerns fit precisely together, and all exist within coherent contexts. In fact, it only makes sense if you need it, for example to ensure trust and security. Now imagine the developers in the team working on such a "single coherent design". In such a development, no engineer can make independent decisions. In such a design no code can be written that does not fit exactly with the rest of the code. How then to create common goals that map into personal team member tasks so as to avoid a design deadlock?  The simple answer might be: make sure you have failed before in many ways, so as to avoid repeating those failures. Yet still, that does not avoid design deadlock. The hint of the approach is that for every dimension of freedom that you need to remove to guarantee the strength of your design, make sure to add an additional free non-critical dimension to help individuals still have a form of personal independence. In addition, I will add that it took me a lot of micro-management of vision and team dynamics to make that development a great success.


With “speckle noise”, especially at the type level, no such problems! There is no single crystalline unified software design. There is no coherency that is assured across your system. Python naturally accumulates imperfections which are just too expensive to keep precisely tamed with each addition of new code. This means that developers can agree on similar and yet different Python designs. In some sense one agrees to compare naturally fuzzy design views. And by doing so, one naturally protects one’s ego, as there is always a bit of room to express individual choices.


This may sound like Python bashing. It is not. This expensive “to design right” property is common to most programming languages.  This post is in fact a “praise Python” post.  If Python only had “a certain fuzziness”, it would not be much better than Visual Basic (to be slightly nasty).  Python is not “just another language”, it is a language where design logic is cheap to change because the messy types are in fact naturally "spread apart". That is, the "noisy" Python type property results in "not too dense type relations", allowing changes to be made in one (implied) type without affecting the core of the other (implied) types. 
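One way to picture these "spread apart" implied types, as a sketch only: the order dict and the three functions below are invented for this post. Each function relies on its own small implied type, meaning the keys it happens to read. Changing what one function expects leaves the other untouched.

```python
def shipping_label(order):
    # Implied type: anything with "name" and "address".
    return f"{order['name']}, {order['address']}"


def invoice_total(order):
    # Implied type: anything with "items" as (price, quantity) pairs.
    return sum(price * qty for price, qty in order["items"])


# Change one implied type: invoices now accept an optional "discount".
def invoice_total_v2(order):
    subtotal = sum(price * qty for price, qty in order["items"])
    return subtotal * (1 - order.get("discount", 0.0))


order = {"name": "Ada", "address": "1 Main St", "items": [(2.0, 3), (1.5, 2)]}
discounted = {**order, "discount": 0.5}

# shipping_label never reads "discount", so it needs no change at all.
label = shipping_label(discounted)
```

No shared Order type had to be edited, and no other module had to be revisited. That is the cheap change I am praising; the price is that nothing checks that "discount" is spelled the same way everywhere.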


ps: I mentioned Fortran above really only because I like numpy's stride_tricks.as_strided and it reminds me of my large equivalence array structures which I used in Fortran when I was a teenager.


All original content copyright James Litsios, 2020.




