Wednesday, November 26, 2014

Issues with F# type inference

This week I continued my recent code rewrite efforts and was unhappy to note that F#'s type inference is still a bit flaky.

I had written a parser+type inference for an experimental functional language before I wrote my introduction-to-stateful-monads presentation. And although the original code was already lifting state monads (the last part of the presentation), it was doing so in a limited and heavy manner.  One of the ideas I have had for some time was to write an optimizer for specialized monadic code. Therefore, I thought it would be nice to rewrite my code to be fully "state lifting" centric because that would be closer to the type of code I would like to optimize.

To illustrate what I mean, the type inference monadic state has a counter. This counter is incremented to generate new free variables during the type inference. In my rewrite,  I tried to implement this increment function as:
let incVarCounter = liftState varCounter' (mapState inc)
let varCounter' f s = 
  let (r,x') = f s.varCounter
  (r,{s with varCounter=x'})
let mapState f = fun s -> (), f s
let liftState f m = fun s -> f m s 
(hopefully this is correct because I simplified the code for this blog).

varCounter is a field of a record with two type variables. Call it R<'a,'b>.
Unfortunately, the code above is inferred to the following by the F# compiler:
R < obj , obj > -> unit*R< obj , obj >
obj is the generic object type used when the type inference fails to deliver.

On the other hand, add a dummy unit argument to the function, as follow:
let incVarCounter() = liftState varCounter' (mapState inc)
This infers to:
unit -> R<'a,'b> -> unit*R<'a,'b>
which is has the correct type details.

These type details are needed because obj types are unusable. Adding the dummy unit argument is a work around; The question is "why do we still need workarounds for a language that is now at version 4"?  And more esthetically, if I want a monad, I want a monad, not a function that delivers a monad.

Friday, November 07, 2014

Do not confuse data logic and execution logic

I mentioned in my last post that I was looking for a bug in my autoindentation code.

Sadly, strong typing had given me a hint of my problem: I was returning a boolean that I had no use for, but I chose to ignore this by adding an operator that "swallowed" this return value. Yet the source of that boolean return value was my bug.  It just took me extra bit of time to find it.

My parser monad is like a Maybe monad: when a monadic expression fails,  the current expression branch stops and returns the failure up the call stack. If the failure happens to be within a higher order control structure, like an "or" statement or a "star" or "plus" repetition (for pattern matching), then another expression branch may be pursued which may succeed.

On the other hand, a monadic  expression may return a value. For example, a number may be parsed, and the number may be returned by the monadic expression.

It just so happened that my test to check if I was starting or not a block (for the autoindentation detection) was conceptually returning a boolean. Yet in fact it was not, it was either succeeding, or not, in its "higher order monadic logic". The correct code is something like: "if cond then return () else fail withErrorTagX" , while the false code reads "return cond" . When the second code fails, it returns a the failure (as a boolean) but it does not cause a change of execution behavior.

My autoindent code had three monadic functions: beginBlock, endBlock,  checkNotBeginOrEnd.
The two first functions had the correct logic,  the "check..." didn't. This last function was doing what its name was implying: it was checking, with pedantic code in the form of "if beginOfBlock then return false else if endOfBlock then return false else return true". But it should have said "if .. then fail ..."

I had two goals with this post. The first was to make clear the "dual level" of thinking that goes on when working with monadic constructions that have their own execution semantics. My second goal was to remark that execution logic is not the same as good old boolean logic. In my monadic operators, I have boolean operators, like "or", "and", "andNot". I come to realize that this denomination is not good because these are really not boolean operators and should not be confused with them.

Monday, November 03, 2014

Breakpoints in monadic code

I was trying to get my autoindenting parser working again in my old monadic F# code under Xamarin. To note that breakpoints are still an issue with higher order code. This post presents a simple way I get around the problem.

The problem is that higher order code, is first run to be assembled as higher order constructions, and then only later run "to do something". For example, monads are first bound, and then "run". The problem is that a breakpoint sets a stop during this construction phase, not during the execution phase. That is reasonable as the construction is what the top level function does, but it is of little use to the developer who wants to break the execution of the code and not its construction.

My parser monad is very combinatorial. I have operators for the classical parsing constructions like "and" and "or", but also a bunch of helper operators that are multiple variations of the bind operator, (e.g. combine two monads, return value of first), as well as error management helpers, and as well as monadic y-combinators to bootstrap recursive structures. When I use these, I rarely need to define functions on the fly, and therefore I end up with very few or even no anchor points for breakpoints.

To help with setting breakpoints, I have two functions both with the signature m<'a>->('a->'a)->m<'a> (give it a monad and a function and it  returns a monad). I define these as operators %>> and >>%. They are a form of restricted map function, the first calls the function before the execution of the monad, the second calls the function  after the execution of the monad. To use these, I define special breakpoint functions, for example "let breakHere x = x , on which I set breakpoints and which I slip in to my code, for example ... (m %>> breakHere) ... will break before the monad m.

This method needs a rebuild to set a new breakpoint. That does restrict its usage, but it is still very helpful.