Sunday, December 01, 2013

Agility is both about individuals and collaboration: Analogy between agile teams and chess

What can we learn from chess to better our agile management? Here I use analogies from the game to highlight a few important concepts that all agile team members should be aware of. For example, the game of chess highlights the difference between the strength of individuals versus the strength of teams, as well as the difference between experience versus talent.

I watched a few of the 2013 world chess championship games between Carlsen and Anand. (I am not a chess player, I am just happy to be a good spectator). It was a great fight! Looking back you might say that Anand just made too many mistakes, yet during the match I think that many of us thought that Anand had a good chance to do much better.

While watching those games, I could not help seeing some of the universal principles that can also be found in management. To explain the analogy, I first present the chess concepts, and then map these into a management world.

When you learn chess, you learn that pawns are the weakest pieces, queens the strongest, that bishops and knights have similar strength, and that rooks are the second most powerful pieces. Therefore it is "normal" to make moves that exchange pieces of the same value. As a beginner, you base your moves on this principle and start by thinking that the strength of a position is determined by summing the strength of your remaining pieces. Yet with experience comes the realization that only pieces that can do something have value. And of course, it should not be easy to take them. This really means that good players build their positions as complex supporting dependencies between their pieces, and complex attack/defense relations between their pieces and the adversary's pieces. And relations are built not only for the existing position but also for selected possible future positions. Still, even though positional strength is usually more important than "piece value" strength, strong positional strength often ebbs away when it does not provide victory, and fortunes shift. In these moments, an extra piece or two may make the difference between winning and losing.
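
The beginner's "sum of the pieces" view can be sketched in a few lines of Python. This is only an illustration: the numeric values are the conventional textbook ones (an assumption, since this post only gives the ordering of the pieces), and the names PIECE_VALUES and material_score are made up for the example.

```python
# Conventional piece values: an assumption, the post only gives their ordering.
PIECE_VALUES = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}


def material_score(pieces):
    """Sum the conventional values of one side's remaining pieces."""
    return sum(PIECE_VALUES[piece] for piece in pieces)


# Two hypothetical sets of remaining pieces (kings left out).
white = ["queen", "rook", "pawn", "pawn"]
black = ["rook", "rook", "bishop", "knight"]

print(material_score(white), material_score(black))
```

Both sides come out with the same total, yet such a score says nothing about which pieces can actually do something, which is exactly what positional strength adds.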

One way to make sure you build "the right" position in chess is to learn many chess openings. With a chess opening "catalog" you are more sure "to do the right thing". The same can be said for endgames: there are techniques to learn to deal with many types of end scenarios. Note that to use such methods you do not need to be talented, you only need to have learned the skill (see recent talent blog entry for more). So (in theory) you do not need to "think" to build a solid starting position in chess, you just follow the previously learned opening lines. Still, this only works if your adversary is doing the same. And you do need to think, because there are often multiple choices possible for a given opening position. Also, opening lines in chess are not necessarily very deep. (I couldn't find how deep openings are on average, only the total number of opening moves, e.g. 16.3 million.)

Let us now try to associate team management ideas with the chess concepts introduced above.
# | Chess | Agile management
1 | Piece value strength | Team member strength
2 | Positional strength is built on relations between pieces | Agile team strength is built on relations and collaborations between people
3 | Game position | Each team member has a task activity, and relations and collaborations with other team members
4 | Piece is moved | Team member finishes a task, picks up a new one, and possibly changes how he interacts with others
5 | Chess openings build strong positions | Use of past experience builds strong and productive task assignments and collaborations within the teams
6 | Chess endgame theory helps win in difficult situations | Use of past experience allows the team to meet their delivery deadline
7 | Chess player thinks | Team interacts and prepares tasks

We should not push this analogy too far. A game of chess is not a team.  And in fact in chess, each player controls his individual pieces, which could be compared to a project manager ordering his people around. There are many industries where this comparison would make sense. The reasoning works like this:
  1. If you have acquired enough expertise so that each task is repeatable and demands little new learning.
  2. If what you need to do is known and fits within your expertise.
  3. If you find individuals that can perform the tasks needed.
  4. Then you can run a central project management, and as "everything" is known, no collaboration or individual decision making is needed.
But such a situation explicitly excludes all the new situations and new learning that appear when playing chess. That is why I have chosen to focus only on an agile view of the world here. In fact my only "starting" goal was to repeat the "mantra" that agility is based on both individuals and collaboration. And so is the game of chess (from the perspective of the pieces).

To conclude:
  • If you are a less skilled individual, you can provide critical strength to your team by supporting your teammates with valuable and well timed help.
  • If you are a very skilled individual, then you can make your team even stronger by leveraging your strength through appropriate interactions with your teammates.
  • Use past experience to choose best how to assign tasks, to interact and support your teammates, and meet your deadlines.
And if you have any doubt that this is true, you should watch more chess.

All original content copyright James Litsios, 2013.





Friday, November 29, 2013

Why analytical people sometimes fail to get emotions right

The other day I was talking project management with a development manager, and I was not careful to wear my "member of the project manager club" social persona. Instead, I did what I usually do, which is to focus more on competence and collaborations, and less on titles. A little too late, I realized that I had just offended this person. The truth is, many of us include our job title in our social identity. Therefore when I did not show respect for the "social stance of project leaders", this person felt personally attacked, and to protect himself, needed to think that I was not being coherent.

My daughter wanted to write a story recently. I gave her the following advice:
You always have two choices:
You can start with events and then describe the resulting emotions,
or start with emotions and then describe the resulting events.

Just apply this rule over and over.
 An example:  "The man hit the dog, and everyone was sad" versus  "It was a sad day: the man hit the dog, (and possibly follow with more sad events)".

Emotions tend to be the weak spot of analytical people. Not because they do not feel emotions, but because their analytical system does not pay enough attention to the fact that simple statements can generate emotions that will change the rational perceptions of others.



Monday, November 25, 2013

Are good movies for nerds still possible?

This weekend I watched Computer Chess and The Internship on Apple TV.
Sadly, I feel that both films fail my test for being great.

Computer Chess is really pretty fun. It is not "funny" but brings enough retro geeky elements into the script to touch a lot of "über nerd" soft spots. If you are like me and lived through the late seventies and early eighties micro-computer revolution, then you definitely get value out of the first three quarters of the film. The letdown is the last quarter. I will not spoil your fun by giving you the plot, but I can say that the film tries to meet up with a broader audience by bringing in deep issues of faith within a "fantastic" setting. I just didn't feel this worked.

The Internship suffers from a similar "try to broaden appeal" problem. The issue here is sex. Maybe it is the reediting for the Apple TV version, or I am just old-fashioned, but this film had way too much "open" sexual dialog (and scenes), and not enough inhibited behavior to be a "true" geeky movie.

Hopefully there is a business model out there that allows movies to be true to their spirit, without being a flop. These two movies show that these producers do not know of it.

You may ask what is my definition of a movie for nerds?
Here are a few of my favorites: Electric Dreams, Colossus - The Forbin Project, The Brain, Pi, Tron.

What is interesting in this list is the common theme: The separation of mind from body within a rational framework.

Wednesday, November 06, 2013

Managing talented people



Misfits of science

Talented people are outstanding at what they do best... and often dysfunctional in one way or another. This post is about how to manage talented people. And especially about managing them within organizations that need a lot of collaboration (e.g. software development).

I'll start with a quote from a TV show from the eighties:
"Everybody that is worth the trouble is a little weird"
(This is out of the truly silly episode three of the Misfits of Science)
 
I have worked with many talented people that were "pretty normal", yet I would argue that the majority of the "good people", those that are worth the trouble to "keep" from a management perspective, are also "quirky people". For example, I have worked with someone who would request forks at McDonald's, another that rarely washed, another that always took credit for other people's ideas, one that never could join a meeting on time, yet another that physically broke down in certain situations (and then needed real assistance), one that spent every weekend doing the same thing, one that complained and bickered forever, etc. Not all of these behaviors became a management issue, but some did. And in my experience, even those talented people that did not show such "outward" signs of "weirdness" had their special side, which within a work organization often showed up as social issues in abrupt or surprising manners, like when they decided to quit (see Bob Knowlton at Simmons Laboratories for a great example of interesting material collected on the subject in the fifties).

I will admit that as a manager I have wasted a lot of time trying to help some of these people, trying to improve their dysfunctional side. My reasoning was: I am talented, and a bit dysfunctional, yet I have learned to function within the social and work rules, so why can't some of my colleagues learn to function better with others and with themselves?

Yet changing oneself takes an incredible amount of time and focus, and is simply something that cannot be "managed at work". This brings us to rules one and two in managing talent:
Focus on your organization's culture and work process.
Ignore the dysfunctional side of talented individuals.
As a manager your first priority is to make sure that the "group thing" works, because it will cost you a fortune if it does not. A culture of healthy proactivity is worth gold, and so is a culture of efficient quality (e.g. test driven with automation). So your job as a manager, before all, is to bring in that productivity without worrying about dealing with the dysfunctional side of your talented people.

Note that I did not say that you should ignore your talented people, just their dysfunctional side. Which brings us to rule three:
Never ignore your talented people.
Here is the theory: Talented people are "special people" (see my blog entry on the subject). Which means that part of what "makes them tick" is independent from the social and group "thing". Talented people can build "a lot" on top of this independent "part of themselves", that is what gives them their speed and creativity. And they do it for different reasons: they may be driven by the purity of the building process, a form of aestheticism; they may be driven by their social imbalance, forever trying to compensate for it. But most importantly, this drive is almost always there! It does not matter if the person is doing good or bad. Therefore, if care is not taken, talented people may build "bad stuff" in this independent part of themselves, effectively growing their dysfunctionalities. This happens subconsciously or not.

Breaking bad

Talented people may build "bad stuff" and this can lead to bad behavior. Top of the list of bad behaviors with talented people is "the control thing". People with this problem may invest a lot of effort to control their surroundings with the only goal of preserving or protecting their "inner independence". This can happen at many levels of "badness": I have worked with people that lied to me and to others, a very bad behavior; I have worked with people that needed to "own" all or part of the solution, a sometimes bad behavior. This ambiguity of what is a bad behavior is part of the trickiness of managing talented people. But it also leads back to rule one: first make sure your overall process is working. Do that for the productivity reasons I mentioned above but also to set limits for your talented people. The thing is, they are experts, they know more than you in specific domains, yet their effort to "stay in control" may well be leading you somewhere other than where you want to go, and they may not even be aware of this. Therefore, as a manager you need to set (and help set) standards and make sure you have a very active monitoring process to enforce these standards. In fact you not only want that monitoring of "good organization process and culture" to be active, you want it to be visible to all. That will help remind your talents what the good rules are, and that is our rule number four:
Actively and publicly monitor "good" work process
You can be the traditional director, or project manager, that "walks the floor" and tells their workers what they see and what they like. Or this can be an agile process where the monitoring is done as part of everyone's job, for example, in Scrum's daily standup meetings. What is important is that the effort is steady, repeated and maintains expectations of simple work behaviors.

Now having said all this, you might say I am providing little help, and I am not saying how rule three, the one about not ignoring your talented people, should be implemented. I will come to that, but I really wanted to reinforce rules one, two, and four. And just to really, really reinforce this notion of "make sure the environment is right", I need to add that just like with computers, there is a bit of a "garbage in, garbage out" aspect with people: if you do not give them good input, they will mostly not produce good output. With talented people, this effect is compounded, because they "soak in" more and faster, but also because they typically are more sensitive to connections between things.

This brings us to rule five:
 Maintain clear business goals, actively evangelize your teams towards them.
Having a company vision and a good product management that works hand in hand with your development resources goes a long way to keep your talented people aligned and feeling that someone cares about them. But that is not enough. There is an alpha person like hierarchy in talent. Therefore not only do your product owners or managers need to be good, they must also be talented. Otherwise the attention they bring to your talents will not be considered as relevant.

This last rule is a little tough: I am saying you need a balance of talent in product management and development, yet if you know how hard it is to hire talent, this is not always an easy balance to achieve. Yet I am asking you for more, rule six is:
Have a head of HR, or CEO for smaller organizations, that understands talent.
Now you might be thinking: talented development, talented product management, therefore talented head of HR or CEO; a form of "alpha person" hierarchy. That is not what I am saying, although it is an option to shape your organization as an "alpha hierarchy". I am not saying that because if you build your company just by "stacking" talent, you end up with strong silo structures and a very political culture where people will put themselves before their organization. Such an approach is fine for specific business models, for example, to support a "pyramidal" sales organization. Yet it leads to organizations with limited sharing and team culture. In my business (software development) you want collaboration, therefore you want your head of HR or CEO to understand the subtle rules of managing talents. And therefore rule six says that your head of HR or CEO should be experienced enough that he/she knows how to manage talent.
 

Just say no

Even with a "talent friendly" organization framework, we do need to accept that sometimes people just go too far, and there are situations where bad behavior needs to be stopped. Therefore, rule seven is:
Stop bad behavior that goes against your organization's culture and work process.
(This rule is really just a repeat of part of rule one. Yet it is so critically important that I give it its own rule.)

Talented people need a work culture as much as everybody else, but the more talented the person, the less their culture tends to be fed by "standard" social rules. Nerds or geeks are an example, almost a caricature, of such a different cultural standard. Other examples of "special cultures" exist, and each individual has their own "special" personal culture, built upon good and bad behaviors (and remember that dysfunctionality goes with talent). Although it is nice to be fully integrated in a culture, it is also nice to be just "in" a culture, especially if there are other ways to grow. Many talented people are satisfied to work within a "simple" work process, even if this ends up being a pretty static activity, as long as they can pursue their personal growth in their talent domain. In fact, people that are "rationally" talented tend to do better with a simple and somewhat unemotional work culture. Yet even with a simple process and work culture, some people refuse to, or simply cannot, "play by the rules". Talented or not, that is not acceptable, and rule seven is there to reinforce this. In fact this is often the most critical activity of management: you need to stop people that are bringing others down when it happens. Waiting is not an option.

Now having said all that, we can address rule three: "Never ignore your talented people".

In an ideal work world, we are all equally positively collaboratively driven, analytical, social and mature people. Yet the reality is different, and the more talented the person, the more issues he or she may have. One character pattern I have seen many times is the complementarity of analytical and social skills. Not only does talent in one of these stunt the other, but worse: talent often feeds a complementary negative character. Said simply, talented rational people tend to be emotionally immature, talented emotional people tend to be "rationally immature". The best way to deal with this is to have a healthy work process and culture. That is why all those previous rules are so important: they make good process and culture happen. Yet the question is still: given that not everyone has similar talents, what more should be done? What specific "talent oriented" action should you take as a manager?

One option is to do nothing. Or more specifically, run your normal process, and if your talents are not happy, well... tough luck for them! 
This option of "do nothing special about your talented people" is a real option. You may be within a domain and business where you are getting enough talent that you do not want to do anything more than just run your company efficiently. Still, you do need to be extra careful that you bring in the right talent for each specific job. That brings us to rule eight:
On hiring, make sure the talent matches the job;
Later, make sure the job matches the talent.
You might think I am saying: "make sure that the skill set matches up with the job". Yet that is not the case! In fact that brings us to a major trickiness: talent and skill set are not the same thing!

Making sense of quirkiness

In order to stay on safe ground, I make use of the dictionary, which states that the definition of talent is "natural aptitude or skill". The dictionary is saying that talent is "natural" while by default "skill" is learned. What the dictionary is trying to express is the "special" property, which I have already talked about, and which is the fact that the talented person is developing on his/her own inner construction, independently of others. But more importantly, the difference is between learning "on one's own", and being talented at it, versus learning from others. This is the key understanding that you want to pick up here. It supports everything I said above about normal management rules, it provides a "simple" explanation for the "quirky talent" syndrome, and finally it gives us a lead into how best to use talent.

The idea is that a talented person can build new skills "on their own", while the "less talented" person will need someone else from whom to learn the new skills. (Obviously, this is a somewhat reduced view of the world, but in my experience it is a very useful view of how growth happens). We have eight possible scenarios:

Scenario | Talented | Less talented
Learns good things on their own | More likely | Less likely
Learns bad things on their own | More likely | Less likely
Learns good things from others | Possible | Likely
Learns bad things from others | Possible | Likely

To explain:
  • Talented people are more likely to learn on their own (good or bad).
  • Less talented people are less likely to learn on their own (good or bad).
  • Talented people have an inner independence that may block them from learning from others. Therefore it is possible that they learn less from others.
  • Less talented people have less inner independence and therefore are more likely to learn from others.
This helps support the statements from above:
  • Normal management  is needed to help talented people learn from others and detect situations when they are blocking this process.
  • "Quirky talent" syndrome happens because talented people are likely to have learned differently on their own. This variation from the "common social standard" feels "quirky".
It also gives us another insight: talent has some similarity with volatility in finance.
And therefore managing talent has some similarities with managing portfolios: there is really a notion of risk versus return, and a notion of steady return of value versus volatility in this return.

The science of working talent

The idea is that work is a form of investment: people get paid, and return value on this investment. (You can skip this paragraph and the next two if you do not want my "pseudoscience"). This rate of return depends on their skills and the compatibility of those skills with the needs of the project, but it also varies widely, in a volatile manner, with the complexity of each project and with the nature of each person. This is not pure science, but more of a qualitative effort of understanding what dependencies a manager needs to work with. The talent return and volatility model works like this:
  • Skills are like a rate of return: More skills bring back more value on investment. This is independent of talent.
  • Rate of return (skills) grows with time: Because people learn new skills and learn to be more productive, but it usually peaks at a limit.
  • Volatility on return grows with talent:  Because talented people accumulate both good and bad learning on their own.
  • Less talent often means less volatile: Because less talented people learn from others and this is a filtering process that reduces uncertainty.
  • Asymmetry in volatility grows with time: People that acquire good behaviors tend to grow them and same for bad behaviors. On the long term there is a trend.
  • Skills are multidimensional:  Skills depend on their usage context and on the desired goal. Some skills can be composed, or used together, others not. So even though skills are similar to the rate of return of investment into a person, this rate of return is very context dependent.
  • Talented people often find it easier to mix different skills together: Because their skills are often learned "on their own", and therefore have a "more compatible common base".
  • The volatility of talented people may explode when meeting their limits: The reasoning is, their independent part may become unstable when no longer supported by their surroundings (social or domain specific). When talented people get lost, their independent part will frantically work alone until it either finds a "new balance" that still addresses the desired "work goals", or settles into a less desirable equilibrium that is incompatible with work goals.
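
For readers who like to see the "pseudoscience" run, here is a minimal sketch of the first and third points of the model: skill sets the average return, talent sets the volatility around it. The Gaussian noise, the numbers, and the name simulate_returns are all assumptions made for this illustration, not measurements of anything.

```python
import random
import statistics


def simulate_returns(skill, volatility, periods, seed):
    """Draw one value-returned figure per period.

    Assumption for this sketch: returns are Gaussian, centered on the
    person's skill, with a spread equal to their volatility.
    """
    rng = random.Random(seed)
    return [rng.gauss(skill, volatility) for _ in range(periods)]


# Same skill (average return), different volatility (hypothetical values).
steady = simulate_returns(skill=1.0, volatility=0.1, periods=1000, seed=1)
talented = simulate_returns(skill=1.0, volatility=0.5, periods=1000, seed=1)

for name, returns in (("steady", steady), ("talented", talented)):
    print(name, round(statistics.mean(returns), 2), round(statistics.stdev(returns), 2))
```

Both people return about the same value on average, but the talented one swings much further around that average, in both the good and the bad direction. The sketch leaves out growth over time, asymmetry, and the "explosion" at the limits.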
Now that we have laid out the major dependencies in talent management, we can talk about strategy. We cannot be too complicated. For example, it would be tempting to take the above results and try to create something similar to modern portfolio optimization applied to the management of people. But that would be forgetting that we would need precise parameters for each individual, and also a precise understanding of how each individual collaborates with others. That is something that we do not have. So while hedge funds exist based on many different strategies, there are usually far fewer ways to manage talent. Still, the above results give us a few major elements of strategy that can be applied generally, without knowing whether someone is talented or not:
  • When hiring worry about talent:
    • Favor skills: The minimal risk approach. Ignore talent, just favor the skills you need.
    • Favor precise skills: Similar to favoring skills, you favor the skills you need, but as you keep the skill set very limited you are "repelling" many talented people because they would feel "boxed in". With less talent, you reduce your risk to acquire "bad talent".
    • Favor broad skills: You focus on the skills you need, but you ask for a wide enough variety of them to make the right talented person feel comfortable.
    • Favor good work culture:  Good work culture is good for everyone, but here we are interested in work culture that "filters" and "aligns" your talented people. People that do not "feel right" to your culture are not hired (and often do not want to be hired).
    • Favor engaged work culture: Talented people need to feed their inner independence. The easy option is for them to invest their talent elsewhere than work. The better option, is that the talent is invested into their work or into areas that are supporting their work. In academia, publications (and other forms of collaboration) are used for this purpose. In the "normal work world", blogs and open source/collaborative projects may be a way to judge someone else's "talent alignment". Yet many people do not have such public activities.  Unfortunately, it is not easy to judge where people invest their effort when no public material is available.
    • Favor past success: One success may be luck, but a lot of success is a sign of skills, and most likely talent. Yet past success does not indicate that the current skills are still usable, nor that bad learning has not tainted the person, so care must be taken.
    • Favor recent past success: As above, but this lowers the chance that the person has since acquired bad learning that would block his/her skill set and possible talent.
    • Share your long term vision: As described below, your company will need to provide a reason for the person you are considering to hire to stay. Better present enough of your vision up front to make sure the candidate is buying in to it.
    • Have some social diversity (e.g. gender, age, ...): Who is going to tell your quirky talented staff that they smell bad, make annoying funny sounds, or make nasty remarks? Ideally their co-workers! But for this to happen there needs to be enough social diversity, and this only happens if you hire some diversity in. The thing is, the more we are alike, the harder it is for us to chastise others. Social differences, for example man versus woman, are great enablers that help teams address social issues.
  • Monitor velocity: Faithfully sizing and estimating future work and measuring progress  (velocity in Scrum), is very important, talent management or not. Yet for talents it is a way to give them a hold when they fail to succeed. On the longer term this is important as unmanaged failures may result in more "explosive" reactions in talented people, such as burnout, breakdowns, and resignations.
  • Evangelize your staff with long term visions: There is often a latent insecurity in talented people. It is their independence that sometimes worries them. Ideally, your company vision provides long term reassurance. In the tech industry, option plans were one way to do this. Now, social visibility is a way to "provide more" to someone. Yet both of these methods have disadvantages, therefore ideally you want to get your talented people to "buy in" to your long term business strategy. This is one reason I recommend having good product owners and product managers that evangelize your teams.
Examples of strategies that do not work within a collaborative environment:
  • Focus only on good talent: If you are a gallery owner, or a consultancy firm, your business model may work by considering talent before all else in your selection process. Yet this strategy has a high risk of bringing in lots of bad behavior, and skill mismatches, something you definitely do not want in a collaborative environment.
  • Mushroom Management or "Keeping your developers in the dark and feeding them fertilizer": Such an approach does not take an active approach to product management, to communicating business vision, or to development culture. In effect, it does not provide your teams or your talents with enough "elasticity" to make the right decisions. This results in bad alignment, issues "rotting", and ultimately your talents "exploding".
Some of you may feel that I have again avoided addressing how to directly manage talented people, and again have focused on "general" management process rules. That is true! In fact, it is a key observation: I am not going to tell you how to micromanage people on a person by person basis. First because so much value can be brought by ignoring micromanagement, and second because I cannot teach you to micromanage well in a single blog posting.

Hiring comes first

Now to cruise us to the end of this blog, we need to bring the elements of strategy introduced above into at least one rule. The most important element of strategy is in the hiring process. The interview process needs to bring in productive people and filter out "bad talent", therefore you must first focus on skill, on culture, and on filtering out bad talent, and only as a second priority on good talent. Note that when I say culture, I mean "work culture": Someone who is "just nice" is not necessarily productive. Therefore rule nine in managing talent is:
Hire for skills, for work culture, for past success,  for some amount of social diversity, and for compatibility of vision; Filter out bad talent.
You need all these focuses to make a good hire. What is important is that you communicate well in your hiring process, you want to attract good talent, and repel bad talent, therefore do talk about your culture and your vision too.

You will note that I have already covered the other elements of strategy as part of the "normal" process above. Therefore we are almost ready to wrap up this "strategy section". I just want to bring one more rule, which is about levels of maturity. In a typical organization, you need different personal characteristics. You may need specific areas of professional skills, as you may need more generic attributes such as analytical and emotional skills, and also management and work culture skills. The thing is, the productivity of collaboration is directly related to the compatibility of the maturity level of the different people. Part of what I am saying is that we need to share the same work culture to work well together, yet work culture cannot "do away" with the difficulties that two "very different" people have in communicating. When your organization is confronted with the difficulty of communication between different types of people, you might be tempted to use your talented people to fill in the gaps and make communication happen. Although this might be perceived as good use of resources, it is only a local good use. Better would be to use those talented people somewhere that would be delivering new "value", not supporting the delivery of "current" value.

You may have noted that organizations tend to be split internally into groups and hierarchies by specific skill sets and maturity levels, and that these different groups and hierarchies work together with the help of cross organizational processes. This is because it is more efficient to use an "inhomogeneous" way of working than a flat system where too many different skills and maturities are mixed within the same groups. (Note that even agile processes have their separation of roles and hierarchical dependencies). Yet as organizations evolve, their structure and processes do not always follow, and one pattern I have noted is that overall company "weaknesses" due to inadequate structure and processes are resolved locally by bringing talented people in to fix "communication issues", when in fact the real issue to address is a bigger non-local one. Therefore my final rule ten is about trying to do without talent when a more scalable and sustainable solution exists to fix your organization:
Do not use talent when good organization structure and processes can do the job.

Managing talent, a summary

Talented people may bring both value and problems with them. To have good talented people in your organization, you want to hire them, and you want to keep them. To make that work, your biggest priority is to organize your company well, run good processes, encourage good culture, and stop bad behavior as soon as possible. Ideally, you want enough social diversity so that bad social behavior is quietly resolved "locally" with no need for intervention by management. Also, you have to be careful of supporting talent with talent, and you want at least one person high up in your organization who understands how talented people perceive each other. This is the concept that "alpha-person of talent" relations exist in your organization, and that some form of "hierarchy of maturity" should exist as well. Finally, although you may be relying on talented people for key positions in your organization, you need to ask yourself if that is necessary, and if you would not be better off with an improved organizational structure or with better processes.

Here are the ten rules again.
  1. Focus on your organization's culture and work process.
  2. Ignore the dysfunctional side of talented individuals.
  3. Never ignore your talented people.
  4. Actively and publicly monitor "good" work process
  5. Maintain clear business goals, actively evangelize your teams towards them.
  6. Have a head of HR, or CEO for smaller organization, that understands talent.
  7. Stop bad behavior that goes against your organization's culture and work process.
  8. On hiring, make sure the talent matches the job; Later, make sure the job matches the talent.
  9. Hire for skills, for work culture, for past success, for some amount of social diversity, and for compatibility of vision; Filter out bad talent.
  10. Do not use talent when good organization structure and processes can do the job.
All original content copyright James Litsios, 2013.


Sunday, October 27, 2013

What does the US government teach us about managing software projects?

Most of us have probably worked all-night and all-weekend sessions in order to finish a project "on time". And we all know that this is a sign of "bad management", and can also be the sign of an impossible task. Therefore, looking back at the US government working hard a few days before the deadline to get the debt ceiling raised, this can be seen as a sign that the US government is badly managed, but can also be seen as a sign that it had something impossible to do.

You might say that a government and its politicians are not the same as a software development organization. Governments are fragmented, many politicians only have a job because of these differences of opinion, politicians are continuously trying to shift blame, etc. Meanwhile, the goal of all members of a software development team is to finish a project on time and be successful. Or is it?... Hmmm... In fact, that is not always clear!

The tricky nature of people is that if they do not feel enough belonging to a common cause, they can very easily switch to supporting its "anti-cause". Politicians know this, and much "political noise" is the game of making sure that not everyone agrees with your opponents. A development team can suffer much damage when not everybody is just "happy" to work towards the common goal of finishing the project on time. This "bad behavior" is often subconscious, yet can also be a very conscious feeling, and even sometimes an open secret, referred to, for example, with flippant remarks. What happens is that dissension can feed upon itself and simply suck away much precious work time. The team may recover under the stress of the approaching deadline, but often that pressure simply comes in too late.

Luckily, difficulties in managing teams are not new, and good guidelines exist. The first recommendation is to use an agile process:
  • The "short" duration of each iterative step ensures an optimal common focus driven by the "stress" of the deadline. And by the way, that is how you decide how long your agile iterations should be: short for a less mature team, longer for a more mature team.
  • The independence of product management (e.g. product owner role in scrum) ensures that no dissension exists in the business goals presented to the team.
  • Roles like the scrum master help focus on value to all, above the value of individuals.
Yet the agile process does not keep dissension from happening on "long term" concepts, for example on design and architecture, or on product goals. Usually, technical dissensions are more problematic than business dissensions. The question is often: "How to make sure that people work together productively on long term technical goals?"

My answer is:
Get everyone to agree that common long term technical goals exist only to constrain the team; New concepts can be brought in by the team's individuals, but only within the short term "iterative" process.

For many, the idea of only being able to bring in new concepts within a short time frame is just not acceptable. They argue: we need more time! But too much time would mean too much uncertainty, and uncertainty is an area in which dissension can thrive. Another argument is that vital new concepts will never be brought in if there is no long term goal, and as a result the team may fail. This is true. But the consequence is that people need slack time: unmanaged time outside the process in which they can mature new ideas (see delivery focus versus creative slack). It is only when ideas and new concepts have matured that an attempt can be made to bring them into a project as part of a solution. And the criterion of maturity is "needs to be able to deliver within the short term iterative part of your agile process".

People may still argue that a bleak, constraining, almost pessimistic long term technical vision does not provide value, is "no fun", and is possibly not even worth working for! These are not unreasonable worries. Yet these remarks are all based on experiences where short term efforts fail to provide enough value, or fail to provide enough satisfaction. They also come out of cultures that do not understand the critical need for a "side" process in which ideas and techniques are selected, grown, and matured, until either they are rejected or they are brought into production.

US politics, like most political systems, has this concept of a parallel process that prepares material before bringing it into the short term process of getting decisions made. This is a system of influence, of key players, committees and lobbyists. It would be wrong to deny that development organizations are subject to similar games of power, yet the big difference is that a development organization should only get one source of funding (from business), while a democratic system gets multiple sources of funding with often contradicting goals.

I cannot make this posting too long, therefore to summarize:
  • Run a tight development process: only bring in ideas that are mature enough to succeed within your "short term" production cycles.
  • Encourage independent new long term ideas but outside your production process: Accept that there is some political process in how your team brings new ideas into their production effort; Make sure that new concepts are mature enough that they can be brought into production with good certainty, within the normal "short term" development cycles.
  • Encourage "constraining" long term team goals within your production process: It is OK to debate as a team on longer term plans that simplify things and restrict usage of new technology. I say this because technical entropy, the fact that things "grow more complicated with time", will kill you if you do not stay united as a team to fight it.
  • Make sure that your team has enough slack time: And make sure also that they have the culture to use it wisely, for the benefit of the project (and not for each developer's ego).
  • Have only one external driver per development team: The same for "normal" short term development process as for the long term "creative process". This will keep the "political" debate healthy. And that is again a reason why agile processes have only one business representative per team.
  • Drive software with deliverable features, business with revenue streams: It is this separation and its "double layering" within the overall development process (e.g. agile development teams versus one product management team) that keeps the political "game" sane! Technology does not generate money, only features do! (Well, within a bigger model it can, but that is why you might consider having a CTO outside the development team process).
This brings us back to US politics: they are like a very, very, bad development team. They apply none of the "good behaviors" above. Not a very good role model for their fellow citizens!

All original content copyright James Litsios, 2013.


Sunday, October 13, 2013

How to become an expert developer

I am a very decent touch typist: I never look at the keyboard. If I cannot remember where a key is, I "search" for it "blind", until I find it. The reason I do this is that I know that if I look, I will learn to "look", not to "type". This is an approach that I use over and over, and can be summarized as follows:
To learn, you need to experience learning within the same context as usage.
The thing is, the mind will learn with everything you give it. If you want to learn efficiently, give the mind only what it will normally get. You need to give yourself an environment that is not "too" different from the environment that you will experience when you actually need to use the learning.

Now all of us may sometimes "learn wrong". That is, learn within the wrong context, and then be obliged to bring this context in when we need our learning. For example, the top two lines of your keyboard are typically "badly learned". We make the effort to learn to touch type the alphabet, possibly the numbers, but not all the symbols, and definitely not the function keys (f1...f12). When you have learned "wrong", you must relearn. And to do that you must force yourself to stop using the "bad environment" (using your eyes) until you have learned enough to use the "good environment" (only using your fingers), which is the one that matches your productive environment. I had such a hard time "unlearning" looking at my top keys that I purchased a blank keyboard many years ago. Having used this blank keyboard at work for a few years, I learned that "looking down" was of no use. Now I use a normal keyboard, and yet I still do not look down. First because my first reaction is not to, but also because touch typing without help is a great feeling: it is like running or riding a bicycle, and as your fingers move freely you can concentrate on more important things. Looking down would break that great feeling.

This brings me to programming. For example, you might ask, how do I teach myself to use a new library? My first answer is:
Never use code completion!
If I want to learn a new library, first I read the documentation, or read code that uses the library. Then I usually print out the header files and "study them" a bit. I keep these printouts nearby (yes, sorry for the trees). Then I start programming. Usually, I will type in example code, which I will change to try out different features of the library. I will always try to guess the right names and usage patterns to use the library. If I feel that I am really not sure enough, I will "physically" make the effort to look at my header printout, or example code. What is important is that this use of external help be somewhat inefficient. Also important is that you make the effort to remember the names and signatures of things (I'll come back to that below).

Now I have noticed a few things. By printing the header files, or main code usages, I am giving my mind a physical reference of what needs to be remembered. And even though I may not use this reference, I will learn better because I know that this reference is there and that its physicality gives it an immutable nature. Then the fact that the printout has a structure, that the definitions follow a certain order, or are on one page or the other, is important for the memory process. This brings me back to code completion. Code completion provides little reference for your memory to work with. Yes, you will remember things in the long run, but not as quickly and not as well, and that will make you a less good developer.

Now we can get to the core of the learning process:
Make it personal!
If you want to be a good developer, you need to accept that you approach your craft like a chess grandmaster. It is a lifelong learning process, where you are building layers and layers of learning. And each of these layers of learning works off the others, in a manner that generates feelings. It is these feelings that cruise you along. Your thinking process works one level above, making the big decisions on where you are going. Therefore, when you learn a library, you really want to invest the time to learn its signatures, not only the names that provide the different features. This is because these signatures will support the feeling that you have when you use the library. Without those feelings you will be wasting your time thinking about little details, which is really not what you want to do.

In reality, you do not learn each signature of each type, function, object, module, etc. as a separate learning. Each piece of code is written with a certain style, following styles of others, and following patterns of design and coding. It would be way too hard to learn everything "in separation", so the mind will try to learn things "as a grand scheme". BUT... this only works if you have started along this path in the first place. You need to make the effort from the start to remember the little patterns and names that you meet along your programming lifetime. Of course it is never too late to start later along this way of learning.  But if you never approach it this way, then you will never become a master developer.

I am not sure it is much more complicated. The only thing I would add is to take your time. The thing is, if a piece of code/library is written in a manner that is too far away from your current learning, you will not be able to remember it, and possibly not even be able to understand it. That is when you need to be strategic. You need to have a few long term goals of "areas" to develop. Then spend a lot of your personal time programming "towards" those areas. Finally, when you are close enough, you will find that you can read code, and remember code, that you previously could not.



Wednesday, August 14, 2013

OO design patterns from a functional programming perspective

Here is my attempt to match up design patterns with functional programming constructions:
Chain of responsibility
Fold request across the chain of handlers, at each step returning either (using Either type) request to continue fold (possibly with transformed request), or returning handled request indicating that fold is finished.
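
As a rough illustration of the fold described above, here is a minimal Python sketch. The `Continue` and `Handled` classes stand in for the two sides of an Either type, and the three handlers are hypothetical examples of mine, not part of any library.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Callable, Sequence, Union


@dataclass(frozen=True)
class Continue:
    """The request is still unhandled (it may have been transformed)."""
    request: str


@dataclass(frozen=True)
class Handled:
    """A handler accepted the request; the fold carries this value to the end."""
    result: str


Step = Union[Continue, Handled]
Handler = Callable[[str], Step]


def run_chain(handlers: Sequence[Handler], request: str) -> Step:
    def step(state: Step, handler: Handler) -> Step:
        # Once the request is handled, the remaining handlers are skipped.
        if isinstance(state, Handled):
            return state
        return handler(state.request)

    return reduce(step, handlers, Continue(request))


# Hypothetical handlers, for illustration only.
def strip_whitespace(req: str) -> Step:
    return Continue(req.strip())  # transforms the request, does not handle it


def handle_ping(req: str) -> Step:
    return Handled("pong") if req == "ping" else Continue(req)


def handle_anything(req: str) -> Step:
    return Handled("default:" + req)


chain = [strip_whitespace, handle_ping, handle_anything]
```

Note that `reduce` still visits every handler; the "fold is finished" part is expressed by passing the `Handled` value through unchanged. A lazy language could stop the traversal early.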

Command
Request is held in data (e.g. as a discriminated union)

Interpreter
Language is available as a typed data structure (e.g. discriminated union) and is interpretable with function calls.
Alternative: Language is defined in monadic form (e.g. as function calls) and is interpreted as part of the monad's interpretation.
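
A minimal sketch of the first variant, assuming a tiny arithmetic language of my own invention: the language is plain data (frozen dataclasses playing the role of a discriminated union), and each interpretation is an ordinary function over that data.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Mul:
    left: "Expr"
    right: "Expr"


# The "language" is just the union of its constructors.
Expr = Union[Num, Add, Mul]


def evaluate(expr: Expr) -> int:
    """One interpretation: compute the numeric value."""
    if isinstance(expr, Num):
        return expr.value
    if isinstance(expr, Add):
        return evaluate(expr.left) + evaluate(expr.right)
    if isinstance(expr, Mul):
        return evaluate(expr.left) * evaluate(expr.right)
    raise TypeError("unknown expression: %r" % (expr,))


def show(expr: Expr) -> str:
    """A second interpretation of the same data: pretty printing."""
    if isinstance(expr, Num):
        return str(expr.value)
    if isinstance(expr, Add):
        return "(" + show(expr.left) + " + " + show(expr.right) + ")"
    if isinstance(expr, Mul):
        return "(" + show(expr.left) + " * " + show(expr.right) + ")"
    raise TypeError("unknown expression: %r" % (expr,))


program = Add(Num(1), Mul(Num(2), Num(3)))
```

Adding a new interpretation means adding a new function and leaves the data types untouched.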

Iterator
Exposed traversal functions hide data structure of containers. Functions typically work on zipper focus adapted to the container's structure.
 
Mediator
Exposed functions stay at top/root of exposed data structures.

Memento
Use immutability and exposure of intermediate states to allow recovery, restart or alternate computation from previous intermediate states.

Observer
Associate container of data with container of queries (e.g. filter plus callbacks). Changes to data container (insert, update, removal) trigger matching query callbacks, change to queries may trigger callbacks. Callbacks are either in a monadic structure or pass around a state context.

State
Data context position is indexed (e.g. by one or more keys), associated state is stored in corresponding state monad where individual states are found with index. 

Strategy
Single function signature implemented in multiple ways.

Template Method
Use type classes to refine implementation by type variant.

Visitor
Break down algorithm into sub-functions; Provide these sub-functions in argument to algo; Use them within algo; Provide different ones if you want to change behavior of algo.
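
One way to read this, sketched in Python under my own assumptions (a small binary tree type and a `fold_tree` function, both hypothetical): the traversal is written once, and the sub-functions passed as arguments decide what it computes.

```python
from dataclasses import dataclass
from typing import Callable, TypeVar, Union


@dataclass(frozen=True)
class Leaf:
    value: int


@dataclass(frozen=True)
class Node:
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]
R = TypeVar("R")


def fold_tree(on_leaf: Callable[[int], R],
              on_node: Callable[[R, R], R],
              tree: Tree) -> R:
    """The algorithm: one sub-function per kind of node, supplied by the caller."""
    if isinstance(tree, Leaf):
        return on_leaf(tree.value)
    return on_node(fold_tree(on_leaf, on_node, tree.left),
                   fold_tree(on_leaf, on_node, tree.right))


tree = Node(Node(Leaf(1), Leaf(2)), Leaf(3))

# Different sub-functions give different behavior from the same traversal.
total = fold_tree(lambda v: v, lambda l, r: l + r, tree)
depth = fold_tree(lambda v: 1, lambda l, r: 1 + max(l, r), tree)
```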

Adapter
Adapter functions that take one or more functions with given signatures and return one or more functions with adapted signatures.

Bridge
Clients access a bridge data type that holds functions.

Composite
Use recursive data types.

Decorator
Data has a dynamic map of functional operators.

Facade
API presents no internal types and is as "flat" as possible. This may mean that certain functional constructions have been "collapsed" into a purely data centric view in order to limit the "depth" of the API.

Flyweight
Flyweight data structures are often provided as "first" arguments of functions. These functions can then curry their first argument so that the flyweight shared data is provided but does not need to be "passed around" everywhere.
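
Python has no built-in currying, so this sketch uses `functools.partial` as a stand-in for it. The glyph width table and the `text_width` function are hypothetical names chosen for the example.

```python
from functools import partial
from types import MappingProxyType

# Hypothetical shared, read-only flyweight data: one table for all callers.
GLYPH_WIDTHS = MappingProxyType({"i": 1, "m": 3, " ": 1})


def text_width(widths, default_width, text):
    """The shared data comes first so that it can be fixed once."""
    return sum(widths.get(ch, default_width) for ch in text)


# Fix the flyweight arguments once; callers only pass what varies.
width_of = partial(text_width, GLYPH_WIDTHS, 2)
```

Every call to `width_of` reuses the same table object, which is the point of the pattern: the shared data is supplied once and is never copied or threaded through each call site.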

Proxy
Data type that "hides" another type. Possibly relies on using an index/key reference and a map like container to store original data.

Abstract Factory
Factory hides the finer type of created data and functions. Losing the inner type can be tricky because the exposed signatures must still be powerful enough to do everything you want with the exposed API (as we assume that no dynamic type casting is allowed). This may mean that you need powerful type features to make the abstract factory concept work (e.g. GADT).

Builder
Define functions to construct data. This is in fact the "normal" way to write functional style.

Factory Method
Use polymorphism, types are determined by inner types of arguments.

Prototype
(Immutable data does not need to be copied).

Singleton
Encapsulate the access to the singleton as a mutable variable within a function; Use this function to access singleton. In immutable setting, store singleton within a state monad.

(I'll revisit this one if I have the time to try to put side by side an OO definition of the pattern with a functional definition of design pattern. Also it is a bit minimalistic).

All original content copyright James Litsios, 2013.

Friday, July 12, 2013

Silhouettes: understanding special people

My favorite skateboarders these days are Rodney Mullen, Ben Raybourn, Chris Haslam, and Chris Cole. There is an interview of Rodney Mullen where he says he is looking for a special "silhouette" in skaters. In fact, he is saying that he is looking for skaters that have internalized their skating into their cognitive growth process. To give you a mental image of what I am saying:
We all need things we can lean on to grow. What I mean by "leaning on something" is that we use it to support us, we use it as a dependable reference. When we are young we lean on our parents. Later, most of us lean on our social/economic "fabric". Yet others lean on their own selves. If you are unlucky, this self "leaning" may make you an unpleasant narcissistic person, yet if you are more lucky you do not lean on a "synthetic" self created social reference, like a narcissistic person, but on a self created non-social reference. And for the people above, on a self created skateboard cognitive reference.
This all sounds complicated, yet in fact it is not. Rodney Mullen has already "said it all": look at the silhouette of these skaters, are they connecting to you like others, do you see them like others? In fact, neither: they are not connecting to you like others, and you do not see them like others. They are connecting to themselves, and we perceive this as a combination of a unique "character" or style of skating, plus a slight social disconnect.
You probably need to have been a skateboarder to "see" what I mean when I talk about these people. But if I say: Steve Jobs, David Bowie, Cat Stevens, Stevie Wonder, Eric Schmidt, Bill Clinton, maybe you get a feeling of what I am talking about.
(Say thanks, or not, to Splat for having asked me to explain this)
All original content copyright James Litsios, 2013.

Tuesday, July 02, 2013

Market connectivity

Financial exchanges can be seen as big computer clusters that are accessed through specialized APIs. There are usually a few different ways to access an exchange. The connectivity can be direct or indirect, and there are usually different APIs for the different services. Direct connectivity is the fastest way to interact with the exchange and pretty much implies that you are a member of the exchange or that you are working closely with an exchange member. If you cannot afford or do not want to bother with direct connectivity then you access the exchange indirectly: your system will connect to a broker’s system that is connected to the exchange. Such a broker access may bundle up the different services into one API (e.g. using the FIX protocol). Direct connectivity usually has different APIs for different services. Typically there might be a trading API, a price feed API and a clearing API. Orders and quotes are entered and updated on the trading API. This API may or may not receive confirmations of these trading commands, but will normally at least report your own trade activity. Overall market activity is reported through the price feed API: it reports the trading and quoting activity of other participants. Trades are reported as “last price”, while quoting activity usually comes in multiple levels of detail. For example, level 1 reports best order and quote activity, while level 2 reports all order and quoting activity. The clearing API tells you what happens “after” trades happen. For simple financial products like stocks, it is mostly just confirming your trading activity, but with more complex products like options, the clearing API will report exercise activity where the options are converted to stock or to cash.

In time critical trading activities like algorithmic trading or market making, the specifications of the trading API cover not only what can be sent to or received from the exchange, but also how much and how quickly these communications can be made. Trading commands usually have a very short validity lifetime: they become out of date with new information. All APIs have physical limitations that will cause them to block or queue up when used with too much data or too high a rate of data. Therefore, delays will happen if too much trading activity is attempted through a trading API, and these delays may make you lose money. Some trading APIs provide extensive details on how much trading activity they allow. They may document the throttling algorithm used within them. For example, they may specify that no more than N trading commands be sent per given period of time. Other trading APIs may simply block when overused, or worse, they may silently queue up trading commands without any form of indication. A way to avoid having the trading API queue up is to always limit your trading activity. Yet that is not so simple because although you may decide when to enter an order into the market, the decision to update or cancel that order is often triggered by new information provided by others. Therefore too much new information will either cause you to try to overuse the capacity of the trading API, or leave you with “stale” orders that will be picked up by others (and bring you loss). For this type of issue, some APIs allow “mass cancelation” commands that allow all orders and quotes to be held as soon as possible, while using a minimal amount of API bandwidth. Note however that not all trading APIs offer mass cancelation options, and that the mass cancelation command may itself be subject to delays because of previously entered commands, or may have higher priority access to the exchange but then cause trouble with previously entered but not yet processed commands.
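
To make the "no more than N trading commands per period" rule concrete, here is a minimal client-side sketch in Python. It assumes a rolling time window, which is only one possible throttling scheme; a real exchange documents its own algorithm. The `CommandThrottle` class and its method names are hypothetical.

```python
from collections import deque


class CommandThrottle:
    """Allow at most max_commands within any rolling window of window_seconds."""

    def __init__(self, max_commands: int, window_seconds: float) -> None:
        self.max_commands = max_commands
        self.window_seconds = window_seconds
        self._sent_at = deque()  # timestamps of commands still inside the window

    def try_send(self, now: float) -> bool:
        """Record a send at time `now` if capacity allows, else return False."""
        # Forget commands that have left the rolling window.
        while self._sent_at and now - self._sent_at[0] >= self.window_seconds:
            self._sent_at.popleft()
        if len(self._sent_at) >= self.max_commands:
            return False  # caller decides: drop the update, or mass cancel
        self._sent_at.append(now)
        return True


throttle = CommandThrottle(max_commands=2, window_seconds=1.0)
results = [throttle.try_send(t) for t in (0.0, 0.1, 0.2, 1.0, 1.05)]
```

The timestamp is passed in rather than read from a clock so that the behavior is deterministic. The important design point is that a refusal is reported to the caller immediately instead of being queued silently.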



All original content copyright James Litsios, 2013.

Thursday, June 20, 2013

Energize your team: analogy between agile teams and lasers

Electricity, magnetism and time are tightly intertwined. You can try to work separately with them but that will not give you the whole picture. While if you consider the dynamic relation between all three, you can do magical things. For example, you can “bounce” electrical energy back and forth into magnetic energy resulting in alternating current, which magically almost avoids the need to actually carry “real current” (which is expensive). Or you can “juggle” these energies while moving forward, thereby creating photons and light.

In project management, tasks, resources and time are also tightly intertwined. You can try to work with them separately but again at the risk of being inefficient. It is a bit of a stretch to compare project management with electromagnetism, yet it is intuitively pleasing. In electromagnetism, work is created by “applying” electrical currents and magnetic fields on each other. In project management, work is created by “applying” tasks and resources on each other.
I could argue how a waterfall process is comparable to an electrical motor with a rigid physical setup to maintain a well-calibrated cross product between the magnetic field and the flow of currents, but instead I would like to focus on the photonic aspect of light and how it can be compared to an agile process.

Here is the idea: a photon of light is a “self-encapsulation” (wavelet) of oscillating electrical and magnetic fields. (I am not a physicist, so please excuse an abuse of interpretation here). Comparing this to a work process, the photons can be seen as agile work cycles that are “bouncing” between defining the tasks and applying the resources on them. Every cycle of an agile process (e.g. Scrum), can be seen as the photons running through a single frequency cycle.

Agility is the ability to “bounce” your energy between “defining” tasks and “doing” tasks. Stretching the analogy a bit, you might say that the repetition of each agile iteration (or sprint), with its generation of client deliverables and new future tasks, is a bit like a laser where photons “hit” energized states and generate new photons. The beauty of this analogy is that we can all “feel” how the coherence of light produced by a laser is somewhat similar to an agile team efficiently bouncing past results and knowledge into the definition of new tasks, and then efficiently executing these tasks within an agile iteration.

The laser analogy goes even one step further: lasers produce light that usually goes all in the same direction with the same frequency. If you are very lucky and your product backlog is very much aligned with what the customer needs and your past work always matches up with your new needs, then yes, you too can have a “laser” agile team that goes straight and coherently over your tasks to your goals. But, if your backlog needs to undergo changes (e.g. because there is important new information to interpret), or if you have made mistakes (and we all do!), then this focus on coherence and “straightness” is actually a great risk: a “laser” like team may well end up with the wrong product and possibly with the wrong technology. Again, the issue is that if you try too much to make your new tasks match up with your previous work, you end up going straight, but not necessarily in the right direction. And although the product owner may realize that something is amiss, it is not easy to understand what. One reason is that the team does not understand that it could turn away from an obvious “coherent and straight line” future, and therefore may never indicate to the product owner that the future could be different. Another reason to be wary is that being a “laser team” feels good: it is great to feel tasks fitting together with marvelous coherence, it is great to have a fantastic history of “little refactoring”. Yet that great feeling is also blocking the team from questioning its actions, because that “would not feel as good”. Without enough self-questioning the team is too much influenced by its previous actions and may end up elsewhere than desired.

So what should a team do about this? The thing is, if you are aligned with your long-term goals, then following the feelings of coherence and alignment is the right thing to do. It is only if you are not aligned with your goals that you are fooling yourself. The product owner is therefore key here: he or she must work hard so that when the team feels that everything is fitting together, the team is also going in the right direction. Said differently, the product owner’s or product manager’s primary job is not to check that the team is aligned, but to ensure that when the team feels “right” it is also going in the right direction. To use the laser image again, the product owner wants to assure that work will be aligned with the overall goals at the creation of each photon. He/she cannot do so by physically controlling each creation and execution of task because that would simply not scale. Instead the product owner strives to provide a “higher level” guidance that ensures this alignment. In scrum, this is why the stories are not tasks. In practice good product owners evangelize their teams so that they “see the goals” and therefore produce tasks that are aligned.

Lasers work because the materials are in “energized states”. Pushing our analogy all the way, (and using scrum), we can say that the product owner energizes the team in such a manner that when the team creates and implements tasks, in a coherent and aligned manner, these tasks are also directed towards the product goals.

Who wants to work with clunky electrical motors when we can all have high-powered laser devices?

Sunday, June 09, 2013

Two algorithmic trading software requirements

This blog post is about two basic requirements when asking developers to write very fast algorithmic trading systems.
Designing a fast system seems initially an easy task. You hire developers that are in touch with the physical nature of computers and networks. And you ask them to write fast code.
Nevertheless, there are difficulties. The first is that allocating and freeing dynamic memory slows you down, so ideally you want to do without dynamic memory. Yet developers usually have a hard time with this requirement. One reason that developers are insecure in doing this is that they have been brought up with languages that only work with dynamic memory. The other reason is exactly the language issue: modern languages do not make it easy to program without dynamic memory allocation. Therefore it is best to present this requirement differently, in a form that is much easier to swallow, which could be as follows:
Make sure to preallocate your dynamic memory before activating your algos, and minimize any later reallocation.
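As a rough illustration of this requirement, here is a minimal sketch of a fixed-capacity object pool. Python is used only to show the shape of the idea (the interpreter itself allocates behind the scenes); in a real low-latency system the same pattern would be written in a language such as C++. The names Order, OrderPool, acquire and release are hypothetical and chosen for this example only.

```python
class Order:
    """A reusable order record: fields are overwritten, never reallocated."""
    __slots__ = ("slot", "symbol", "price", "quantity")

    def __init__(self, slot):
        self.slot = slot  # fixed position of this record in the pool
        self.symbol = ""
        self.price = 0.0
        self.quantity = 0


class OrderPool:
    """Hypothetical fixed-capacity pool: all records are created up front."""

    def __init__(self, capacity):
        # Everything is allocated here, before the algos are activated.
        self._orders = [Order(i) for i in range(capacity)]
        self._free = list(range(capacity))  # stack of free slot numbers
        self._free_count = capacity

    def acquire(self, symbol, price, quantity):
        """Hand out a preallocated record, or None if the pool is exhausted."""
        if self._free_count == 0:
            return None  # the caller decides what to do; the pool never grows
        self._free_count -= 1
        order = self._orders[self._free[self._free_count]]
        order.symbol = symbol
        order.price = price
        order.quantity = quantity
        return order

    def release(self, order):
        """Return a record to the pool (no double-release check in this sketch)."""
        self._free[self._free_count] = order.slot
        self._free_count += 1

    def available(self):
        return self._free_count
```

The capacity is chosen once, at startup, which is where the "lots of memory" cost comes from: the pool must be sized for the worst case you are willing to handle.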
Now this first request seems a bit heavy because it tends to use a lot of memory and brings the code back thirty years. Yet, as you expect your very fast system to make lots of money, you need to start by assuming you have no limits to your infrastructure costs, and are willing to pay for terabytes of memory if needed. In addition, languages like C++, with a wise use of templates, or functional languages with mutability (like OCaml or F#), can actually do a good job with this requirement. More importantly, this requirement leads me to the next requirement, but first I need to repeat my “motto” in trading, which is:
"It is easier to make money than to keep it".
Which leads me to remind you that there are two very distinct activities in trading: making a profit and trying to prevent a loss. This may sound pretty trivial, but it is a key concept, and not all developers understand it. (I have reviewed enough trading code to know that). What it means to an algo system is that as much effort and thinking must be put in the code that actively tries to make money as in the code that actively tries to prevent losing money. As a requirement, I would state it as:
Given that trading algos can be in three different states (profit making, idle, and preventing loss), put as much effort and optimization into the transitions from each of these states to another as you put into staying within each state.
Said emotionally:
I want no delay when I slam my foot on the brake or on the gas pedal!
If you have written a few complex real time systems, you may realize that this is not a trivial requirement. And in fact, it is even harder to implement than the first requirement. But I can tell you that it is easier to achieve if you also worry about the first "no dynamic allocation" requirement.
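One way to picture the second requirement is a small state machine in which every transition is prepared in advance, so that switching state at run time is only a lookup and a call. The sketch below is an assumption about how this could be structured, not a description of any particular trading system; AlgoState, Algo and switch_to are hypothetical names, and the handler body is a placeholder for the real work (for example cancelling open orders when entering the loss-prevention state).

```python
from enum import Enum


class AlgoState(Enum):
    PROFIT_MAKING = 1
    IDLE = 2
    PREVENTING_LOSS = 3


class Algo:
    """Hypothetical algo whose six state transitions are all built up front."""

    def __init__(self):
        self.state = AlgoState.IDLE
        pairs = [(src, dst) for src in AlgoState for dst in AlgoState
                 if src is not dst]
        # Counters and handlers are created once, before trading starts.
        self.transition_counts = {pair: 0 for pair in pairs}
        self._handlers = {pair: self._make_handler(pair) for pair in pairs}

    def _make_handler(self, pair):
        def handler():
            # Placeholder: the real transition work would go here.
            self.transition_counts[pair] += 1
            self.state = pair[1]
        return handler

    def switch_to(self, target):
        """Move to the target state through its dedicated transition handler."""
        if target is not self.state:
            self._handlers[(self.state, target)]()
```

Giving each transition its own handler makes it explicit that "idle to preventing loss" and "profit making to preventing loss" are separate pieces of code, each of which can be reviewed and optimized on its own.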

All original content copyright James Litsios, 2013.

Wednesday, May 15, 2013

Agile keeps you from being stupid

Have you noticed that people can make sense of what does not make sense when they do not understand it? That is the "soft" and "analogic" property of our brain: we can make sense out of nonsense, and this is ok, when we do not know it is nonsense!
A waterfall process will take nonsense, and then often spend the money while delivering nonsense.
An agile process will take nonsense, and fail iterations, and... either just go on, spend your money, and deliver nonsense, or... recognise the stalling as a failure to deliver the requirements (e.g. sprint stories), and then force the requirements to be rewritten for the next development iteration. The next iteration may fail, but again the requirements will need to be rewritten. Yet if you are truthfully applying an agile process, each failed delivery will again lead to an attempt to reformulate the requirements so that they are achievable. In the end, what was not understood and is blocking you becomes understood, and you either stop the project or you understand how to remove the blocker.

Sunday, February 24, 2013

Eventual consistency: dual software models, software with gaps

Maintaining consistency within parallel, distributed, high-performance, and stream oriented programming is all about accepting one thing: when it comes to consistency, that is, dealing with failures, don't promise too much too soon! And that is the whole notion of eventual consistency: you sometimes need to wait for things to get resolved. "How much you wait" typically depends on how much money you have invested in infrastructure, yet it also depends on how finely your system captures failures.

Let's start with eventual consistency. In traditional transactional models, consistency, the knowledge that your system is "doing well", is assured by approaching each operation "transactionally": if an operation fails, make sure nothing is messed up and that consistency is maintained. Yet in the "new" world of massive parallelism, the need for high performance, and the likes of streaming models, it is often simply impossible to maintain consistency. The simple explanation is that as operations are split across multiple cores and computers, when something goes wrong it takes "more" time to clean things up. The key theory is that the "cleanup time" after failed operations is "not plannable" (look up "CAP theorem"). Said differently, failed operations can get "eventually" cleaned up, if your system puts in the effort!

Eventual consistency is often brought up in the context of distributed databases. Yet in my world of real time streaming systems, system design is even more limited than with DBs. The simple statement is that streaming systems are one dimensional. And that often means that you only have one way to look at things. Or again, you can view things as different models, but often cannot bring these views back together at a reasonable cost.

One pattern to manage failure is simply to kill the process in which the failure was detected. The idea is then to let that process or another recover the operation. I remember once arguing with another developer that "life" is not always that simple. It is ok to use this pattern if the scope of "operations" is small enough. In fact the pattern is also economically good because it simplifies the whole automated QA approach. Yet, if the operations that you are working with have too long a lifespan, the cost of killing processes and restarting the failed operations may simply be too high. In these cases you can add lots of redundancy, but that too will cost you, and your solution might simply be non-competitive with a solution that uses a smaller footprint. The last resort is to refine the scope of what you kill: only kill a process if it is corrupted, kill only the broken part of an operation, keep as much of the operation as is "still good", and incrementally recover the rest.
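The "refine the scope of what you kill" idea can be sketched as follows, under the simplifying assumption that an operation is a set of independent named steps. Only the steps that failed are rerun; results that are "still good" are kept. The functions run_operation and recover and the max_attempts parameter are hypothetical names for this illustration.

```python
def run_operation(steps, completed=None):
    """Run every step not yet completed; return (results, failed step names)."""
    completed = dict(completed or {})
    failed = []
    for name, step in steps.items():
        if name in completed:
            continue  # still good: keep the earlier result, do not redo it
        try:
            completed[name] = step()
        except Exception:
            failed.append(name)  # only this part of the operation is "killed"
    return completed, failed


def recover(steps, max_attempts=3):
    """Incrementally rerun only the failed steps, up to max_attempts passes."""
    completed, failed = run_operation(steps)
    attempts = 1
    while failed and attempts < max_attempts:
        completed, failed = run_operation(steps, completed)
        attempts += 1
    return completed, failed
```

By contrast, the simpler "kill the process" pattern corresponds to throwing away the completed results and rerunning every step, which is fine when steps are cheap and costly when they are not.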

Some of you may know that duality is still one of my favorite subjects. In the context of linear programming, a dual model is one where variables are used to "measure" the "slack" associated with each constraint. In a software system, the constraints can be seen as the transactions, or the transactional streams, that are being processed. The dual model of a software system is then a model in which you track "how wrong" transactional operations are. That is what I call the "gap model". In a stream oriented model, the gap model is also a stream oriented model. The finer the gap model, the more precisely failures can be described, and the more you can limit the corrective measures needed to fix things. Yet as with all good things, if your model is too "fine", it simply becomes too expensive to work with. So compromises are needed.
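As one possible reading of the gap model, the sketch below turns a stream of transactional events into a second stream that reports, per operation, how far it still is from being resolved. The event kinds "requested" and "confirmed", the tuple layout, and the function name gap_stream are assumptions made for this example; a finer model would track more kinds of gaps, at a higher cost.

```python
def gap_stream(events):
    """For each incoming event, yield (operation id, remaining gap).

    Each event is a tuple (kind, operation id, quantity). The gap of an
    operation is what was requested minus what has been confirmed so far;
    a gap of zero means the operation is currently consistent.
    """
    gaps = {}
    for kind, op_id, quantity in events:
        if kind == "requested":
            gaps[op_id] = gaps.get(op_id, 0) + quantity
        elif kind == "confirmed":
            gaps[op_id] = gaps.get(op_id, 0) - quantity
        else:
            raise ValueError(f"unknown event kind: {kind}")
        yield op_id, gaps[op_id]
```

Because the output is itself a stream, a corrective component can subscribe to it and act only on the operations whose gap stays non-zero for too long.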

(PS: my blog output will be considerably lower this year, as I decided that it was more fun to write books (and big theories) than blogs. I will still try to keep you entertained from time to time. And as ever, requests are taken seriously! So do write comments.)