My favorite example is the reverse state monad.
To understand what this means, we first have to look at the normal state monad. I'll give a brief overview here, but it might be hard to follow; for a clearer, more detailed explanation, take a look at the classic "You Could Have Invented Monads!".
The normal state monad lets us write computations that depend on a single piece of implicit state throughout. We can read the state (`get`) and we can change the state (`put`) and that's all. Here's an example of generating temporary variable names, something I've used in real code:
```haskell
fresh name = do subscript <- get
                put (subscript + 1)
                return (name ++ "_" ++ show subscript)
```
To generate a new name, we get the current subscript, increment the stored state and then produce a string that uses the old subscript value and the base name of the variable. There's only one counter so this isn't ideal for producing streams of differently named variables, but we could extend to that by storing a map of names to subscripts instead of a single number.
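To make this runnable, here's a small sketch using `Control.Monad.State` from the mtl package (which ships with GHC); the `threeTemps` name is just for this example:

```haskell
import Control.Monad (replicateM)
import Control.Monad.State (State, get, put, evalState)

-- Generate a fresh name by reading the counter, bumping it,
-- and appending the old value as a subscript.
fresh :: String -> State Int String
fresh name = do subscript <- get
                put (subscript + 1)
                return (name ++ "_" ++ show subscript)

-- Three temporaries with the counter started at 0:
threeTemps :: [String]
threeTemps = evalState (replicateM 3 (fresh "tmp")) 0
-- threeTemps == ["tmp_0", "tmp_1", "tmp_2"]
```

Each call to `fresh` sees the counter left behind by the previous one, so the subscripts count up without any explicit threading in our code.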
The important thing to note is how the state flows through the expression: we get the old version of the state and then put a new one in. The reverse state monad has the same logic, but the state flows in the opposite direction: in the reverse state monad, the above snippet would increment the state before reading it and so return temporary names starting with `_1` rather than `_0`. It's like time travel!
So what sort of unholy magic does this require? Oddly enough, not much: the implementation is surprisingly straightforward thanks to laziness.
The normal state monad is implemented as a function. A "stateful value" of type `a` has two properties: it can depend on an implicit input of type `s` and it can write a value of type `s`. We can make this explicit as a function: a `State s a` is `s -> (a, s)` under the hood. All the state monad machinery does is put these functions together, threading the `s` inputs and outputs through automatically. The goal is to combine two state values (functions) into one that takes in a state, runs the first function and then uses the resulting output value and new state to run the second function. This is what the bind operator (`>>=`) does. (Look at the type as a handy mnemonic of what's going on.)
```haskell
(>>=) :: State s a -> (a -> State s b) -> State s b
step1 >>= step2 = \ start ->
  let (resultA, intermediate) = step1 start
      (resultB, final)        = step2 resultA intermediate in
    (resultB, final)
-- remember that step2 is a function that takes resultA
-- and gives us a State s b
We plumb the output and intermediate state right into `step2`, giving us a single action (`s -> (a, s)`) that first executes `step1` and then executes `step2`. The state and the result value both flow through the code in the same direction.
The example program above with multiple lines in do-notation ends up being multiple `s -> (a, s)` functions strung together with this principle, giving us a single composite function of the same type: a `State s a` action. In the end, the whole state computation is a bunch of composed functions like this. Running it then involves just passing the initial state in as an argument and inspecting the result.
So far, I hope, it makes sense. A good exercise here is to figure out how to implement `get` and `put` now that you know both how `State s a` is represented and how they're strung together.
The reverse state monad has the same representation and follows the same pattern, except the state is threaded through in reverse. Here's the `>>=` operator again, but in reverse:
```haskell
step1 >>= step2 = \ start ->
  let (resultA, final)        = step1 intermediate
      (resultB, intermediate) = step2 resultA start in
    (resultB, final)
```
This logic is oddly circular: `resultA` depends on `intermediate` and `intermediate` depends on `resultA`. But, crucially, circular code can actually be meaningful in a lazy language! The following list definition, for example, is perfectly well-formed (it's the infinite sequence 0, 1, 2, 3…) but is directly circular:
```haskell
nats = 0 : map (+ 1) nats
```
And, in fact, we can use the circularity from our reverse state monad to implement a list like this:
```haskell
nats' = do nats <- get
           put (0 : map (+ 1) nats)
           return nats
```
We can then run it with an initial state of `[]` to get the same infinite lazy sequence of natural numbers as the compact `nats` above.
Unfortunately, if you want to experiment with this code yourself, the result will be a bit uglier. Functions already have a more general monad instance defined, so we need to wrap and unwrap our `s -> (a, s)` functions in a new type. The logic is exactly the same and the runtime representation is unchanged, but there's more syntactic noise obscuring it. Here's a gist with the full code for you to play around with.
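As a sketch of what that wrapping looks like (the names `RState`, `rget` and `rput` are my own; the gist's code may differ), here's a self-contained reverse state monad:

```haskell
import Control.Monad (ap)

-- Same representation as State, but bind threads the state backwards.
newtype RState s a = RState { runRState :: s -> (a, s) }

instance Functor (RState s) where
  fmap f (RState run) = RState $ \s -> let (a, s') = run s in (f a, s')

instance Applicative (RState s) where
  pure a = RState $ \s -> (a, s)
  (<*>)  = ap

instance Monad (RState s) where
  -- The incoming state goes to step2; step1 receives the state
  -- produced by step2. Lazy let bindings make the circularity work.
  RState step1 >>= step2 = RState $ \start ->
    let (resultA, final)        = step1 intermediate
        (resultB, intermediate) = runRState (step2 resultA) start
    in (resultB, final)

-- get and put look exactly like their forward counterparts:
rget :: RState s s
rget = RState $ \s -> (s, s)

rput :: s -> RState s ()
rput s' = RState $ \_ -> ((), s')

-- The circular example from above, in this wrapped form:
nats' :: RState [Integer] [Integer]
nats' = do nats <- rget
           rput (0 : map (+ 1) nats)
           return nats
```

Evaluating `take 5 (fst (runRState nats' []))` yields `[0,1,2,3,4]`: `rget` reads the state that `rput` writes later in the computation.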
There's an existing package that combines both normal state and reverse state into a single type called, appropriately enough, Tardis. While it's mostly a curiosity and a nice illustration of laziness, it actually comes in useful for real code too. An example I particularly liked from someone (I can't remember who) was writing a Markdown parser. The Markdown spec allows link references to be defined before or after they're used:
```markdown
This is some text with [a link][foo]

[foo]: http://www.example.com

And hey, I can use [another link][foo] too!
```
With a bidirectional state, we can write the logic for these references as a neat algorithm that conceptually does a single pass and can just fetch references from the future to deal with text it hasn't parsed yet.
Pretty cool!
The reverse state monad in action.