Sunday, February 21, 2021

Stream oriented processing

Here is a presentation on designing stream processing systems, which I uploaded to YouTube today. Recorded in ~2014, and given in one form or another a few times since about 2005, it shows how I "model with streams", something I learned designing market making and algo systems. What I like most about this old deck is how well it stays aligned with modern abstractions expressed in higher-order FP semantics.

I wrote a bit on stream processing here in 2011. I also see that I have a draft for publishing this presentation in 2013. Yet I had wanted to add Scala or Haskell code that captured the "nice figures" of this presentation, and that is why I held back from sharing it earlier.

All original content copyright James Litsios, 2021.

Sunday, February 07, 2021

Single class refactoring exercise (in Python)

This week I needed to extract a little piece of computational geometry, embedded in a single Python class of application-specific logic. I therefore created a second class and migrated the computational geometry code over to it. One of the pleasures of Python is the ability to do this type of refactoring at low cost, pretty much by following a recipe, which I include here:
  1. Start with class A
  2. Create new class B, derive A from B
  3. Move part of A.__init__ to B.__init__
  4. Move initial subset of member functions A to B, implicitly creating a first API between A and B (API A-B)
  5. Iterate and apply one or more of:
    1. Split / merge methods within API A-B
    2. Where B still accesses A directly (through self), extend API A-B to provide the additional A content.
    3. Find flow/bundle/fiber data of A and B, refactor these to be within new classes C0, C1, .... Alternatively refactor them to be within additional NumPy axes (e.g. for ML applications).
    4. Move selected Cs from A to B or from B to A. If appropriate, adapt the API.
    5. Move selected Cs into API A-B or take out of the API
    6. Find variants within the Cs. Bring these under new protocol classes P0, P1, ... Adapt API A-B to depend on the Ps. Alternatively, find NumPy axes with a shared "basis", refactor towards abstract shape/axes properties, then adapt the API.
  6. Stop when all traces of A have been removed from B, and the API A-B is generic enough for the desired future usage of B.
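As an illustration only, here is a minimal sketch of where steps 1 to 4 and step 5.6 of the recipe might end up. None of these names come from my actual code: `DeliveryRoute` plays the role of class A, `PolylineGeometry` the role of class B, and `Metric` is a hypothetical protocol class P.

```python
from typing import Protocol, Sequence, Tuple

Point = Tuple[float, float]


class Metric(Protocol):
    """Hypothetical protocol class (a "P" in step 5.6): any distance measure."""

    def __call__(self, p: Point, q: Point) -> float: ...


def euclidean(p: Point, q: Point) -> float:
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5


def manhattan(p: Point, q: Point) -> float:
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


class PolylineGeometry:
    """Class B: the extracted geometry, with no knowledge of the application."""

    def __init__(self, points: Sequence[Point], metric: Metric = euclidean):
        # Step 3: this part of the constructor used to live in A.__init__.
        self.points = list(points)
        self.metric = metric

    def length(self) -> float:
        # Step 4: a method moved from A to B; it only touches B's own state.
        return sum(self.metric(p, q) for p, q in zip(self.points, self.points[1:]))


class DeliveryRoute(PolylineGeometry):
    """Class A: application-specific logic, now derived from B (step 2)."""

    def __init__(self, points: Sequence[Point], cost_per_unit: float):
        super().__init__(points)
        self.cost_per_unit = cost_per_unit

    def cost(self) -> float:
        # A reaches the geometry only through API A-B (here, length()).
        return self.cost_per_unit * self.length()


route = DeliveryRoute([(0.0, 0.0), (3.0, 4.0), (3.0, 0.0)], cost_per_unit=2.0)
print(route.cost())  # 18.0
```

The stopping condition of step 6 is visible here: `PolylineGeometry` mentions neither routes nor costs, so it can be reused without `DeliveryRoute`.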
Notes:
  • "Flow data" are typically found in arguments to functions.
  • Writing generic axes logic in NumPy is super hard, as NumPy gives little help for pure pointfree-style functional code. However, it can be done: you can, for example, write 1d, 2d, and 3d logic as just one common piece of NumPy code.
  • I was going to say: always retest as you refactor. Yet in my usage above I did not, and regretted it when my first test failed after a few hours of work. Given that I had "changed much", I quickly reapplied my changes to the original code, this time running the tests after each change.
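To give a flavour of the NumPy note above, here is a small sketch of dimension-agnostic code. It assumes a convention of my choosing for this example, namely arrays shaped `(..., n_points, n_dims)`; the function names `centroid` and `path_length` are hypothetical. By always addressing axes from the end, the same code serves 1d, 2d, and 3d points.

```python
import numpy as np


def centroid(points: np.ndarray) -> np.ndarray:
    """Mean over the point axis; the last axis holds the coordinates."""
    return points.mean(axis=-2)


def path_length(points: np.ndarray) -> np.ndarray:
    """Length of the polyline through points of shape (..., n_points, n_dims)."""
    steps = np.diff(points, axis=-2)  # segment vectors between consecutive points
    return np.linalg.norm(steps, axis=-1).sum(axis=-1)


line_1d = np.array([[0.0], [2.0], [5.0]])
line_2d = np.array([[0.0, 0.0], [3.0, 4.0]])
line_3d = np.array([[0.0, 0.0, 0.0], [1.0, 2.0, 2.0]])

for pts in (line_1d, line_2d, line_3d):
    print(pts.shape[-1], path_length(pts), centroid(pts))
```

Because only trailing axes are named, leading axes remain free for batching, for example a stack of many polylines processed in one call.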