Thursday, January 21, 2021

From sparse tensor parameter space to orchestrated stream of distributed processes

In 1996, my team adopted a tensor view of its application parameter space.

Credit must be given to one of my partners at QT Optec/Actant, who presented the following view of the world: each client could "tie" each parameter to any point/slice in a "high-dimensional" Cartesian space of primary and secondary keys. It was an advanced concept which took me many years to fully master. The simple statement is: "it ain't easy"!

Higher dimensional data views are "all the rage" now, but not in 1996. I'll try to illustrate the concept as follows:

Imagine we have a parameter P1 used in formula F that goes from X to Y. Imagine that X can be queried with keys kx0 and kx1 (it is two dimensional) and that Y is three dimensional, with "axes" ky0, ky1, and ky2. P1 is a parameter, which means that it is a "given value". The question is then: is P1 just one value? Should we use a different P1 for the different values of X, tied to its 2D mapping to kx0 and kx1? We could even imagine that P1 differs depending on where F produces Y along the three dimensions ky0, ky1, ky2. We could go further still and say that P1 sometimes depends on X and sometimes depends on Y. This would especially make sense if F depended on yet another set of data Z, with the choice of P1 depending on Z.
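To make the "tie a parameter to a point or slice" idea concrete, here is a minimal Python sketch. It is not the original 1996 design: the class name `SparseParam`, the use of `None` as a wildcard for "any value on this axis", and the rule that the most specific tie wins are all assumptions made for illustration.

```python
from typing import Dict, Optional, Sequence, Tuple

# One slot per axis; None means "any value on this axis" (a slice).
Slot = Tuple[Optional[str], ...]


class SparseParam:
    """Hypothetical sparse parameter: values tied to points or slices of key space."""

    def __init__(self, axes: Sequence[str]) -> None:
        self.axes = tuple(axes)
        self.ties: Dict[Slot, float] = {}

    def tie(self, value: float, **keys: str) -> None:
        """Tie a value to the slice described by the given keys (unnamed axes are wildcards)."""
        unknown = set(keys) - set(self.axes)
        if unknown:
            raise KeyError(f"unknown axes: {sorted(unknown)}")
        slot = tuple(keys.get(axis) for axis in self.axes)
        self.ties[slot] = value

    def lookup(self, **keys: str) -> float:
        """Return the value of the most specific tie that matches a full point."""
        point = tuple(keys[axis] for axis in self.axes)
        best_value, best_score = None, -1
        for slot, value in self.ties.items():
            if all(s is None or s == p for s, p in zip(slot, point)):
                score = sum(s is not None for s in slot)  # number of fixed axes
                if score > best_score:  # assumption: on equal score, the earliest tie wins
                    best_value, best_score = value, score
        if best_score < 0:
            raise KeyError(f"no tie matches {point}")
        return best_value


# P1 over the two keys of X and the three keys of Y.
p1 = SparseParam(["kx0", "kx1", "ky0", "ky1", "ky2"])
p1.tie(1.0)                      # default: applies everywhere
p1.tie(2.0, kx0="a")             # a slice of X
p1.tie(3.0, kx0="a", ky2="z")    # a slice that spans X and Y

print(p1.lookup(kx0="a", kx1="b", ky0="u", ky1="v", ky2="z"))
print(p1.lookup(kx0="a", kx1="b", ky0="u", ky1="v", ky2="w"))
print(p1.lookup(kx0="c", kx1="b", ky0="u", ky1="v", ky2="w"))
```

The table stays sparse because only the ties that clients actually declare are stored. The hard part, which this sketch sidesteps, is deciding which tie should win when several slices overlap with equal specificity.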

With hindsight, the key learning is that no data is static. Therefore P1 is not "a parameter"; P1 is a stream of values. The simple rule is that nothing is "just data": it is always data within a process that updates that data. Parameters do not exist. To simplify, we can say that P1 is a stream of data. Yet to get the design model right, we need to say that P1 is a stream of data constrained by a very specific process.
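As a rough illustration of "a stream of data constrained by a process", the Python sketch below models P1 as an ordered sequence of updates instead of a stored value. The class `ParamStream` and the constraint it enforces (strictly increasing sequence numbers) are hypothetical; a real system would have its own, usually richer, update rules.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ParamStream:
    """Hypothetical parameter modeled as a stream of (sequence number, value) updates."""

    name: str
    history: List[Tuple[int, float]] = field(default_factory=list)
    subscribers: List[Callable[[int, float], None]] = field(default_factory=list)

    def publish(self, seq: int, value: float) -> None:
        """Accept an update only if it respects the process constraint."""
        # Assumed constraint: sequence numbers must strictly increase.
        if self.history and seq <= self.history[-1][0]:
            raise ValueError(f"{self.name}: out-of-order update {seq}")
        self.history.append((seq, value))
        for notify in self.subscribers:
            notify(seq, value)

    def latest(self) -> float:
        """The 'parameter value' is just the most recent element of the stream."""
        if not self.history:
            raise LookupError(f"{self.name}: no value published yet")
        return self.history[-1][1]


p1 = ParamStream("P1")
seen: List[Tuple[int, float]] = []
p1.subscribers.append(lambda seq, value: seen.append((seq, value)))

p1.publish(1, 0.5)
p1.publish(2, 0.7)
print(p1.latest())
print(seen)
```

Reading P1 as "the latest value" recovers the static view, while consumers that subscribe see every change in order and can recompute F as the stream advances.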

All of the examples above are about P1 being parallel streams of data. The second key learning is that, most often, this means that P1 is a distributed process of streamed data that must follow a very specific distributed process of data.

The original problem/solution formulation was somewhat OO or DB. Put in the form of an orchestrated stream of distributed processes, we actually have most of the tools we need to make this concept scale and work.

All original content copyright James Litsios, 2021.
