There are a lot of monad tutorials out there. Too many, probably. I'm not terribly happy with most of the explanations though, so I'll throw my hat in too!
The core problem, as I see it, is that the idea of a monad is very abstract. Trying to pin it down to a single concrete idea or analogy is doomed to fail.
I've been thinking about how to explain abstract concepts like this for a while. To me, a good explanation has to have two parts: what it is and why we care. That is: a definition and examples. Far too often, I've found people leave out one or the other, but I need both to understand! If you just give me an abstract mathematical definition without context, I won't actually understand it, even if I can understand the parts. And if you only give me a bunch of examples but forget to explicitly define the idea, I might understand how it behaves, but I won't have any idea of what it actually is.
So a good explanation has to have these two parts. The remaining question, then, is how to order them. Examples first, to motivate a definition? Start with the definition and "flesh it out" with examples? Somehow interleave the two?
I've found that for me, the best approach is to open with a definition and expand on it. I won't understand the definition immediately, but I'll have it in the back of my mind for each example, and I'll refer back to it periodically.
If you like the other order more, take a look at one of the best monad tutorials around: You Could Have Invented Monads! (And Maybe You Already Have.)
So I'll introduce monads in three parts: first, I'll tell you what a monad is; next, I'll tell you why we care; finally, I'll walk through a few concrete examples.
What
So what is a monad, at least in functional programming? Fundamentally, it's just a type m that has three particular functions defined on it. Note that m has to accept a type argument itself: you'd always use it as m Int, m String or m a. Then it just has to have these functions defined for it:
- return :: a -> m a
- fmap :: (a -> b) -> m a -> m b
- join :: m (m a) -> m a
These functions also have to follow some laws. I usually don't think about these directly—it's enough to know that they relate to each other "reasonably"—so I'll talk about them later.
The most important part is join, which captures a notion of "flattening" a structure. You go from a nested m (m a) to a single m a: you flatten out a level. If you want a single idea to hang onto, this is it: monads let you flatten them.
As I alluded earlier, the definition probably seems dry and arbitrary right now. Don't worry about it. Just look back here as you're reading the rest of my post. Most of all, remember: a monad is just a type with some functions. That's all.
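If it helps to see that as code, here's a minimal sketch of the same interface as a single typeclass. The primed names are placeholders of my own: GHC's real hierarchy spreads these functions across the Functor and Monad classes, with join living in Control.Monad.
- class Monad' m where
-   return' :: a -> m a
-   fmap'   :: (a -> b) -> m a -> m b
-   join'   :: m (m a) -> m a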
Why
So why do we care about these particular functions? What's so special about them? How do they flow together? Why is the notion of "flattening" so important?
The first two functions, return and fmap, both serve a simple role: they allow us to transition from normal code to code that uses m. return wraps a normal value, lifting it into the monad; fmap lifts normal functions to operate over these wrapped values. This becomes clearer when you add an extra set of parentheses to its type signature:
- fmap :: (a -> b) -> (m a -> m b)
But what about flattening? That one's a bit less obvious.
Composition
Perhaps the single most important idea in functional programming (or really all programming ever) is composition. And this is exactly what monads help with! In particular, the three functions let us use m as a customizable way to compose parts of our program, more versatile than just function composition.
Normally, we just compose normal functions: given a -> b and b -> c, we can first apply one then the other to get a -> c. Simple but surprisingly useful.
If a type m is a monad, it gives us a new way to compose functions: it lets us compose functions of the form a -> m b. Given a -> m b and b -> m c, we can get a -> m c. Note how the m in the middle gets swallowed up: that's the main difference with this form of composition.
Since we have an instance of our m in the middle, we can insert different behavior as things get composed. This means that, for every different monad m, we get a new regime of composition.
And this is exactly what join enables! join specifies a bit of computation that lets us "get rid" of an extra layer of m, which is precisely what happens inside the composition. To compose two monadic functions like this, we start with fmap:
- f >=> g = \ x -> fmap g (f x)
Try following along in GHCi, checking types as you go.
Remember that f :: (a -> m b) and g :: (b -> m c), which means that the type of the function above is:
- (>=>) :: (a -> m b) -> (b -> m c) -> (a -> m (m c))
We start with an a, pass it into f to get an m b and map g over that to get an m (m c). And how do we go from this to the actual type we want? We need to flatten it!
So we make the whole composition work properly, without an extra m, by throwing in a call to join:
- f >=> g = \ x -> join (fmap g (f x))
So: a monad gives us a new kind of "join-enabled composition", letting us customize what happens when things get composed by defining an appropriate join function.
We can now view a -> m b as a special kind of function between a and b which just gets composed in a different way.
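Putting the pieces together, here's the full definition as loadable code. (A note: Control.Monad exports this exact operator under the same name, so this sketch imports only join to avoid a clash.)
- import Control.Monad (join)
- (>=>) :: Monad m => (a -> m b) -> (b -> m c) -> (a -> m c)
- f >=> g = \ x -> join (fmap g (f x))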
Bind
So, we have a new idea of composition: >=>. But we can actually have a new idea of function application too! This is called bind and is written as >>=. It's very useful for programming, but I find it harder to think about than join and >=>.
What does this new sort of application mean? Well, it can't just be applying an a -> m b to a normal a, because we can already do that. Instead, it's applying an a -> m b to an m a:
- (>>=) :: m a -> (a -> m b) -> m b
In spirit, it's very similar to >=>: our new notion of application also "swallows" an intermediate m, so we can customize how it works.
In fact, the definition is just like >=>'s, but not wrapped in an extra lambda:
- x >>= f = join (fmap f x)
This operator will come up in some of the practical examples.
Laws
Now, just before jumping into the examples, let's cover the monad laws. If you can't remember the details, don't worry: most of the time, how the laws apply in any given case is pretty intuitive.
These laws are most easily described in terms of >=>:
- return >=> f = f
- f >=> return = f
- f >=> (g >=> h) = (f >=> g) >=> h
In other words, return is both a right and a left identity for >=>, and >=> is associative.
Hey, these laws look very similar to how id and . (normal function composition) behave!
- id . f = f
- f . id = f
- f . (g . h) = (f . g) . h
This is just more evidence that monads provide a special kind of composition!
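You can check the identity laws concretely in GHCi, using the >=> exported by Control.Monad and the Maybe monad from the examples below (f here is just an arbitrary test function):
- -- ghci> import Control.Monad ((>=>))
- -- ghci> let f x = if x > 0 then Just (x * 2) else Nothing
- -- ghci> (return >=> f) 3 == f 3
- -- True
- -- ghci> (f >=> return) 3 == f 3
- -- True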
Examples
Now that we have the main idea, let's go on to see how it's used in practice with some examples!
Maybe
Perhaps the simplest meaningful monad is Maybe, defined as follows:
- data Maybe a = Just a
-              | Nothing
Maybe wraps a normal type and also allows it to be null (ie Nothing): it's Haskell's option type.
How do we make it a monad? Well, we know the three functions we need, so let's just implement those. We know that return just "wraps" a normal value in Maybe; the most natural way to do this is just Just (heh):
- return :: a -> Maybe a
- return x = Just x
What about fmap? How do we apply a function to a Maybe value? Well, if we have a Just, we can just apply the function to its contents. But if we have Nothing, we don't have anything to apply the function to! We just have to return Nothing again:
- fmap :: (a -> b) -> Maybe a -> Maybe b
- fmap f (Just x) = Just (f x)
- fmap f Nothing = Nothing
join is the final function. Here, we just have to consider all the possible cases and implement them in the most natural way:
- join :: Maybe (Maybe a) -> Maybe a
- join (Just (Just a)) = Just a
- join (Just Nothing) = Nothing
- join Nothing = Nothing
Maybe is a nice example because, at every step, there was only one "reasonable" thing to do. In this case, "reasonable" meant not needlessly throwing away real values. As long as we keep values whenever we can, we only have one possible implementation of the monad functions.
The next question to ask is: what does a -> Maybe b mean? It's a function that takes an a and may or may not return a b. It's a function that can fail.
How do we compose two functions that can fail? Well, if they both succeed, they're just like normal functions. But if either of them fails (if we ever get a Nothing), we have no choice but to return Nothing for the whole thing.
So Maybe gives us a kind of composition and application for functions that can fail, propagating the failure through for you. It's very useful for avoiding deeply nested case expressions. We can transform
- case f x of
-   Nothing -> Nothing
-   Just res ->
-     case g res of
-       Nothing -> Nothing
-       Just res' ->
-         case h res' of
-           Nothing -> Nothing
-           Just res'' -> ...
into a much nicer
- x >>= f >>= g >>= h
Maybe abstracts over function application and composition, automatically piping possible Nothings through the whole computation, saving us from doing it manually and making the code much less noisy. It saves us from the equivalent of
- if (x != null) {
-     return ...;
- } else {
-     return null;
- }
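To see the plumbing in action, here's a runnable sketch; the safeDiv helper and the numbers are made up for illustration (note that the piped-in value becomes safeDiv's second argument):
- safeDiv :: Int -> Int -> Maybe Int
- safeDiv _ 0 = Nothing  -- dividing by zero fails
- safeDiv x y = Just (x `div` y)
- -- ghci> return 2 >>= safeDiv 10 >>= safeDiv 100
- -- Just 20
- -- ghci> return 0 >>= safeDiv 10 >>= safeDiv 100
- -- Nothing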
Either
Maybe is great, but sometimes we want to fail in multiple different ways. This is where Either comes in: it's like Maybe but with the Nothing case annotated with an extra argument:
- data Either err a = Left err
-                   | Right a
It uses the generic names Left and Right to show it isn't just about error handling, but we'll pretend it is.
It actually forms a monad almost exactly like Maybe. If anything, it's easier: since we can't come up with a value of type err from thin air, we can only really implement these functions one way:
- return :: a -> Either err a
- return x = Right x
- fmap :: (a -> b) -> Either err a -> Either err b
- fmap f (Right x) = Right (f x)
- fmap f (Left err) = Left err
- join :: Either err (Either err a) -> Either err a
- join (Right (Right x)) = Right x
- join (Right (Left err)) = Left err
- join (Left err) = Left err
In fact, Maybe can be seen as a special case of Either where the Left case doesn't carry extra information:
- type Maybe a = Either () a
Either lets us abstract over error checking via return types. We can avoid the Go error-handling pattern:
- res, err := f(x)
- if err != nil {
-     return nil, err
- }
- res2, err := g(res)
- if err != nil {
-     return nil, err
- }
- res3, err := h(res2)
- if err != nil {
-     return nil, err
- }
- ...
The moral equivalent of the above in Haskell could just be written as
- x >>= f >>= g >>= h
even though they're both using return types to manage errors! Either abstracts over all the plumbing for us.
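As a small sketch of what this looks like (the validateAge helper and its error strings are invented for illustration):
- validateAge :: Int -> Either String Int
- validateAge n
-   | n < 0     = Left "age must be non-negative"
-   | n > 150   = Left "age is implausibly large"
-   | otherwise = Right n
- -- ghci> Right 42 >>= validateAge
- -- Right 42
- -- ghci> Right (-1) >>= validateAge
- -- Left "age must be non-negative"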
List
So far, we've seen how monads let us compose functions that might fail. What else can we do?
Well, a simple one is the list type, which is written as [a] in Haskell. We're going to approach lists with the same heuristic as for Maybe and Either: never throw away data unnecessarily. We also don't want to copy or rearrange data unnecessarily.
With these conditions, return is pretty easy:
- return :: a -> [a]
- return x = [x]
fmap, as the name implies, is just map:
- fmap :: (a -> b) -> [a] -> [b]
- fmap = map
- -- or
- fmap f [] = []
- fmap f (x:xs) = f x : fmap f xs
Finally, we have join, which needs to flatten a list. Here's the most reasonable definition:
- join :: [[a]] -> [a]
- join [] = []
- join (ls:rest) = ls ++ join rest
Basically, we just take our list of lists and concatenate all of its inner lists together with ++.
Now our next question: what does a -> [b] mean? It's a function that can return any number of results. In math, this is often called a nondeterministic function.
What does it mean to compose nondeterministic functions like this? Well, to compose f :: a -> [b] with g :: b -> [c], we first pass a value into f to get a bunch of bs, then we pass each b into g to get a whole bunch of cs, and finally we return all of those cs. This is exactly what >=> for lists does.
So the list monad gives us clean composition of functions that have any number of results. One cool thing to note is that this corresponds very closely to list comprehensions! In fact, we can rewrite a list comprehension in terms of return and >>= pretty easily:
- [f a b | a <- as, b <- bs]
becomes
- as >>= \ a ->
- bs >>= \ b ->
- return (f a b)
As you can see, this transformation is actually not list-specific at all: you can do it for any monad. Haskell's MonadComprehensions extension takes advantage of this, letting you use list comprehension syntax for any monad, which is sometimes quite nice.
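Here's that nondeterministic composition in a GHCi session (the functions are arbitrary examples):
- -- ghci> let f x = [x, x + 1]
- -- ghci> let g x = [x * 10]
- -- ghci> [1, 2] >>= f >>= g
- -- [10,20,20,30]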
Reader
Now let's look at a much trickier one. This is the function monad, in the form of r -> a for some fixed r. (Just like Either err a had a fixed err.)
What does return mean for this? Let's look at the type we want:
- return :: a -> (r -> a)
We get a value and want to turn it into a function from r. Since we don't know anything particular about r, we can't do anything with that argument but ignore it:
- return x = (\ r -> x)
How about fmap? Figuring it out on your own is actually a good exercise. Try it before continuing.
Again, we want to look at the type:
- fmap :: (a -> b) -> (r -> a) -> (r -> b)
Hey, that type looks familiar! It's just like function composition:
- (.) :: (b -> c) -> (a -> b) -> (a -> c)
And, in fact, that's exactly what fmap is for (r -> a):
- fmap = (.)
- -- or
- fmap f g = \ x -> f (g x)
Finally, we need join. Again, the type is going to be very helpful:
- join :: (r -> (r -> a)) -> (r -> a)
We take in a function f with two arguments and need to use it to produce a function of one argument. Since we can't magically produce a value of type r, there is actually only one reasonable implementation of join:
- join f = \ x -> f x x
So now we see that, indeed, (r -> a) is a monad. But what does this mean? We can think of values of type (r -> a) as values of type a in the context of r. That is, they're normal values of a that can also depend on a value of r. They can "read" from the environment, which is why (r -> a) is often called the "reader monad".
The reader monad allows us to pipe this environment through a whole bunch of values and functions that all depend on it. The expression x >>= f >>= g >>= h lets us pass in a single value of r that is first given to x, then passed into the result of f, then into the result of g, and so on.
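Here's a small sketch; the Config synonym and both functions are invented for illustration, but the chain runs as-is because base defines this Monad instance for functions:
- type Config = String  -- say, a user name from the environment
- greet :: Config -> String
- greet name = "Hello, " ++ name
- addSig :: String -> (Config -> String)
- addSig s = \ name -> s ++ " (from " ++ name ++ ")"
- -- ghci> (greet >>= addSig) "Alice"
- -- "Hello, Alice (from Alice)"
Note how the single environment value "Alice" gets read by both steps without us having to pass it around explicitly.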
Writer
Another type to look at is (w, a) for a fixed w. This might seem a little weird but, as we'll see, it naturally forms a monad and can actually be pretty useful!
So how do we do return? Immediately, we run into a problem: we would need to manufacture a value of type w, but we can't. So, in fact, we need to add a constraint to w: it has to be a type that has a special value of some sort to use with return. This is provided by the Monoid class, which has mempty:
- mempty :: Monoid w => w
We'll just use this; as we'll see later, the other part of the monoid will also come in useful.
Given a free w value, return is straightforward:
- return :: Monoid w => a -> (w, a)
- return x = (mempty, x)
Now fmap, which we can only really do in one way:
- fmap :: (a -> b) -> (w, a) -> (w, b)
- fmap f (w, x) = (w, f x)
Finally, we need to do join:
- join :: (w, (w, a)) -> (w, a)
We have *two* values of type w. We could just throw one away but, as a rule, losing information unnecessarily is bad. So, instead, we will use the other part of Monoid:
- mappend :: Monoid w => w -> w -> w
It's just an arbitrary way to combine two values of the type into a third. We can use it to turn the two levels of w in (w, (w, a)) into one:
- join (w_1, (w_2, x)) = (mappend w_1 w_2, x)
So, as long as w is a Monoid, (w, a) is a monad. But what does this mean?
It lets us string an extra channel of output through our whole computation, combining it using mappend at every step. It's often called the "writer monad" because we can "write" to this extra channel of output at every step. To be useful, we have to actually have an extra function for writing:
- tell :: Monoid w => w -> (w, ())
- tell output = (output, ())
This lets us inject values into the output stream. This newly added output will be mappended onto the rest of the output from our computation.
This is useful for purely functional logging. If we have a bunch of functions f, g and h that all want to log some Strings in addition to returning something, we can write them using tell ["Message"] and then automatically pipe all the strings through using the same code we've already seen a few times before:
- x >>= f' >>= g' >>= h'
For lists (ie [String] in this example), mappend is just ++, so this expression becomes:
- (log ++ logF ++ logG ++ logH, (h (g (f x))))
(Where f, g and h are the function parts of f', g' and h' that only do the actual computation and not the logging.)
It's pretty neat how we assembled the log in parallel with actually applying the base functions. And thanks to laziness, we will only evaluate as much of the log as we use: we aren't wasting resources on building a log we never look at!
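To see it run, here's a sketch with two invented logging functions; newer versions of GHC's base library happen to define exactly this Monad instance for pairs with a Monoid first component, so the chain works directly in GHCi:
- f' :: Int -> ([String], Int)
- f' x = (["doubled " ++ show x], x * 2)
- g' :: Int -> ([String], Int)
- g' x = (["incremented " ++ show x], x + 1)
- -- ghci> ([], 5) >>= f' >>= g'
- -- (["doubled 5","incremented 10"],11)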
State
So, we've seen how to compose functions that read and functions that write. Can we do both at once? Why, that would be a function that can both read and write! Sounds like mutable state.
This is, in fact, exactly what the State type is: it's a combination of both reader and writer, using the same type for both:
- type State s a = s -> (s, a)
Conceptually, a value of type s -> (s, a) is an a that can depend on and/or modify a value of type s.
Since we're always going to have a value of type s passed in (that's the s -> part of the type), we don't need the Monoid constraint any more.
With that in mind, here are the monadic functions, which are largely the combinations of their reader and writer versions. Note how return and fmap don't modify the state at all; only join can affect it.
- return :: a -> (s -> (s, a))
- return x = \ s -> (s, x)
- fmap :: (a -> b) -> (s -> (s, a)) -> (s -> (s, b))
- fmap f x = \ s -> let (s', a) = x s in (s', f a)
- -- remember that the stateful value is a function itself!
- join :: (s -> (s, (s -> (s, a)))) -> (s -> (s, a))
- join x = \ s -> let (s', x') = x s in x' s'
join is rather confusing, so take the time to work out what it does on paper. Basically, we start with a nested state function that depends on s twice. To turn this into a single level of state dependency, we have to take a value of type s and string it through both levels: that's all the join code is doing.
We also need some primitive ways to access and modify the state, similar to tell. It's easiest to think about two of them: get to read the current state and set to change it:
- get :: (s -> (s, s))
- get = \ s -> (s, s)
Note how get doesn't change the state at all. It just copies the state into the value "channel" as well, exposing it to the functions in the computation.
- set :: s -> (s -> (s, ()))
- set newS = \ s -> (newS, ())
set takes the new value of the state and creates a new stateful value that ignores whatever state is passed into it, replacing it with the new state. We don't have a meaningful result for this, so we just put a () into the result channel.
We can combine get and set into some useful functions like modify:
- modify :: (s -> s) -> (s -> (s, ()))
- modify f = get >>= set . f
- -- or
- modify f = \ s -> (f s, ())
The state monad lets us compose stateful functions, and manages passing the state through each one automatically. Hopefully a pattern is now emerging: monads give us new ways of composing functions, usually with more structure.
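Here's a tiny counter sketch in this raw s -> (s, a) encoding. The names bindS and tick are invented for illustration, and we need a hand-rolled bind because the bare function type already belongs to the reader monad (which is why the real State type wraps it in a newtype):
- bindS :: (s -> (s, a)) -> (a -> (s -> (s, b))) -> (s -> (s, b))
- bindS x f = \ s -> let (s', a) = x s in f a s'
- tick :: Int -> (Int, Int)
- tick = \ s -> (s + 1, s)  -- increment the state, return the old count
- -- ghci> (tick `bindS` \ a -> tick `bindS` \ b -> \ s -> (s, a + b)) 0
- -- (2,1)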
Procedures
Now we're going to talk about the only monad here that has any magic: IO. The idea is that IO a represents a procedure that, when run, produces a value of type a. The procedure has access to the runtime system, so it can do "extra-language" things like talk to the operating system, get input, print output and so on: all these are impossible to define in terms of just pure Haskell.
Since IO is an abstract type, we don't know (and don't care) how the monadic functions are implemented. Instead, I'll just talk about what they do.
return :: a -> IO a creates the empty procedure that does nothing except return the given value; coincidentally, this is essentially where its name comes from.
fmap :: (a -> b) -> IO a -> IO b takes a procedure that gives us an a (the IO a) and creates a new one that first runs that procedure, then passes its result into the function to get a b. Since this whole thing is still a procedure itself, it's an IO b: the b never "escapes" into normal code.
Finally, we have join :: IO (IO a) -> IO a. This is a slightly odd way of sequencing procedures: we take in a procedure that returns a procedure and create one that runs both of them.
Honestly, join does not make too much sense for IO. But >>= does: it's a way to apply procedures, as if they were functions! This lets us write complicated programs by combining these procedures in a systematic way to produce a single big procedure.
In fact, this is how the entire Haskell program actually gets run. main :: IO a is just the big procedure; when you run a Haskell executable, the runtime system executes main, which likely involves both executing IO procedures inside main and evaluating normal Haskell expressions.
Just like the writer and state monads, IO needs some primitive values, like tell, get and set. But, unlike the earlier examples, it doesn't have just one or two primitives: it has hundreds. Every system call and every runtime function is a primitive IO value: getLine, getChar, print... The monad machinery just lets us combine these primitive operations in different ways, as well as letting us glue them together with normal (ie pure) Haskell code.
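For example, here's a complete program built from two of these primitives, glued together with >>= (this is ordinary Haskell and should run as written):
- main :: IO ()
- main = getLine >>= \ name -> putStrLn ("Hello, " ++ name)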
Do-notation
One thing I haven't really mentioned is do-notation, which is some syntax sugar Haskell provides to make code using monads look more like an imperative program. It works just like the list comprehension translation we saw earlier:
- do x <- xs
-    y <- ys
-    return (f x y)
becomes
- xs >>= \ x -> ys >>= \ y -> return (f x y)
It also allows you to ignore the value of an expression:
- do something
- somethingElse
is the same as
- something >>= \ _ -> somethingElse
This turns every monad into an imperative-looking DSL. Once you understand the stuff I explained above, I think do-notation is quite easy to deal with.
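For instance, here's the same little program in both styles (echo is an invented name; the two definitions are equivalent):
- echo :: IO ()
- echo = do line <- getLine
-           putStrLn line
- -- ...which desugars to:
- echo' :: IO ()
- echo' = getLine >>= \ line -> putStrLn line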
Conclusion
There are actually a whole bunch more monads I haven't talked about. We can use them for parsing, for logic programming, for mutable references, for callCC... Lots of interesting things.
Moreover, we can actually combine monads. This gives rise to "monad transformers". For example, we can layer Maybe onto another monad by wrapping the inner monad's functions in checks for Nothing.
Ultimately, just remember that a monad is a type with several functions defined on it. This type gives us a new way to compose and apply functions with custom behavior during the application/composition "step".
I hope this gives you a good understanding of monads in practice. I know the answer's a bit long, but I've thought a lot about how to explain this idea. I just hope it wasn't too much! I think this is the longest Quora answer I've ever written, by a fair margin :).