Higher Order

Philosophy and functional programming.

At Long Last

In early September 2014, we published Functional Programming in Scala. It is now available from all major booksellers, and from the publisher at manning.com/bjarnason. It’s available as a beautiful paper book, on Kindle and other e-book readers, and as a PDF file.

I just want to share my personal story of how this book came to exist. A much shorter version of this story became the preface for the finished book, but here is the long version.

Around 2006 I was working in Austin and coming up on my 8th anniversary as an enterprise Java programmer. I had started to notice that I was making a lot of the same mistakes over and over again. I had a copy of the Gang of Four’s Design Patterns on my desk that I referred to frequently, and I built what I thought were elegant object-oriented designs. Every new project started out well enough, but became a big ball of mud after a while. My once-elegant class hierarchies gathered bugs, technical debt, and unimplemented features. Making changes started to feel like trudging through a swamp. I was never confident that I wasn’t introducing defects as I went. My code was difficult to test or reuse, and impossible to reason about. My productivity plummeted, and a complete rewrite became inevitable. It was a vicious cycle.

In looking for a more disciplined approach, I came across Haskell and functional programming. Here was a community of people with a sound methodology for reasoning about their programs. In other words, they actually knew what they were doing. I found a lot of good ideas and proceeded to import them to Java. A little later I met Tony Morris, who had been doing the same, on IRC. He told me about this new JVM language, Scala. Tony had a library called Scalaz (scala-zed) that made FP in Scala more pleasant, and I started contributing to that library. One of the other people contributing to Scalaz was Paul Chiusano, who was working for a company in Boston. In 2008 he invited me to come work with him, doing Scala full time. I sold my house and everything in it, and moved to Boston.

Paul co-organized the Boston Area Scala Enthusiasts, a group that met monthly at Google’s office in Cambridge. It was a popular group, mainly among Java programmers who were looking for something better. But there was a clear gap between those who had come to Scala from an FP perspective and those who saw Scala as just a better way to write Java. In April 2010 another of the co-organizers, Nermin Serifovic, said he thought there was “tremendous demand” for a book that would bridge that gap, on the topic of functional programming in Scala. He suggested that Paul and I write that book. We had a very clear idea of the kind of book we wanted to write, and we thought it would be quick and easy. More than four years later, I think we have made a good book.

Paul and I hope to convey in this book some of the excitement that we felt when we were first discovering FP. It’s encouraging and empowering to finally feel like we’re writing comprehensible software that we can reuse and confidently build upon. We want to invite you to the world of programming as it could be and ought to be.

– Rúnar Óli Bjarnason

Boston, August 2014

Monoid Morphisms, Products, and Coproducts

Today I want to talk about relationships between monoids. These can be useful to think about when we’re developing libraries involving monoids, and we want to express some algebraic laws among them. We can then check these with automated tests, or indeed prove them with algebraic reasoning.

This post kind of fell together when writing notes on chapter 10, “Monoids”, of Functional Programming in Scala. I am putting it here so I can reference it from the chapter notes at the end of the book.

Monoid homomorphisms

Let’s take String concatenation and Int addition as example monoids that have a relationship. Note that if we take the lengths of two strings and add them up, we get the same number as if we concatenate the two strings and then take the length of the result:

(a.length + b.length) == (a + b).length

So every String maps to a corresponding Int (its length), and every concatenation of strings maps to the addition of corresponding lengths.

The length function maps from String to Int while preserving the monoid structure. Such a structure-preserving function from one monoid to another is called a monoid homomorphism. In general, for monoids M and N, a homomorphism f: M => N, and all values x:M, y:M, the following law holds (a homomorphism also maps the identity of M to the identity of N):

f(x |+| y) == (f(x) |+| f(y))

The |+| syntax is from Scalaz and is obtained by importing scalaz.syntax.monoid._. It just references the append method on the Monoid[T] instance, where T is the type of the arguments.

This law can have real practical benefits. Imagine for example a “result set” monoid that tracks the locations of a particular set of records in a database or file. This could be as simple as a Set of locations. Concatenating several thousand files and then proceeding to search through them is going to be much slower than searching through the files individually and then concatenating the result sets, particularly since we can potentially search the files in parallel. A good automated test for our result set monoid would be that it admits a homomorphism from the data file monoid.
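
One way to state such a test is to abstract over the homomorphism itself and check the law directly. A minimal sketch, assuming the Scalaz monoid instances and the |+| syntax mentioned above are in scope, using the String-length homomorphism as the concrete instance:

def homomorphismLaw[M: Monoid, N: Monoid](f: M => N)(x: M, y: M): Boolean =
  f(x |+| y) == (f(x) |+| f(y))

homomorphismLaw((s: String) => s.length)("foo", "bar") // true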

Monoid isomorphisms

Sometimes there will be a homomorphism in both directions between two monoids. If these are inverses of one another, then this kind of relationship is called a monoid isomorphism and we say that the two monoids are isomorphic. More precisely, we will have two monoids A and B, and homomorphisms f: A => B and g: B => A. If f(g(b)) == b and g(f(a)) == a, for all a:A and b:B then f and g form an isomorphism.

For example, the String and List[Char] monoids with concatenation are isomorphic. We can convert a String to a List[Char], preserving the monoid structure, and go back again to the exact same String we started with. This is also true in the inverse direction, so the isomorphism holds.
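
As a sketch, the two homomorphisms could be written like this; both preserve concatenation, and they are mutually inverse:

val f: String => List[Char] = _.toList
val g: List[Char] => String = _.mkString

// f(s1 + s2) == f(s1) ++ f(s2)   and   g(cs1 ++ cs2) == g(cs1) + g(cs2)
// g(f(s)) == s                   and   f(g(cs)) == cs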

Other examples include (Boolean, &&) and (Boolean, ||) which are isomorphic via not.

Note that there are monoids with homomorphisms in both directions between them that nevertheless are not isomorphic. For example, (Int, *) and (Int, +). These are homomorphic to one another, but not isomorphic (thanks, Robbie Gates).

Monoid products and coproducts

If A and B are monoids, then (A,B) is certainly a monoid, called their product:

def product[A:Monoid,B:Monoid]: Monoid[(A, B)] =
  new Monoid[(A, B)] {
    def append(x: (A, B), y: => (A, B)) =
      (x._1 |+| y._1, x._2 |+| y._2)
    val zero = (mzero[A], mzero[B])
  }

But is there such a thing as a monoid coproduct? Could we just use Either[A,B] for monoids A and B? What would be the zero of such a monoid? And what would be the value of Left(a) |+| Right(b)? We could certainly choose an arbitrary rule, and we may even be able to satisfy the monoid laws, but would that mean we have a monoid coproduct?

To answer this, we need to know the precise meaning of product and coproduct. These come straight from Wikipedia, with a little help from Cale Gibbard.

A product M of two monoids A and B is a monoid such that there exist homomorphisms fst: M => A, snd: M => B, and for any monoid Z and morphisms f: Z => A and g: Z => B there has to be a unique homomorphism h: Z => M such that fst(h(z)) == f(z) and snd(h(z)) == g(z) for all z:Z.

A coproduct W of two monoids A and B is the same except the arrows are reversed. It’s a monoid such that there exist homomorphisms left: A => W, right: B => W, and for any monoid Z and morphisms f: A => Z and g: B => Z there has to be a unique homomorphism h: W => Z such that h(left(a)) == f(a) and h(right(b)) == g(b) for all a:A and all b:B.

We can easily show that our product monoid above really is a monoid product. The homomorphisms are the methods _1 and _2 on Tuple2. They simply map every element of (A,B) to a corresponding element in A and B. The monoid structure is preserved because:

(p |+| q)._1 == (p._1 |+| q._1)
(p |+| q)._2 == (p._2 |+| q._2)

And for any other monoid Z, and morphisms f: Z => A and g: Z => B, we can construct a unique morphism from Z to (A,B):

def factor[A:Monoid,B:Monoid,Z:Monoid](z: Z)(f: Z => A, g: Z => B): (A, B) =
  (f(z), g(z))

And this really is a homomorphism because we just inherit the homomorphism law from f and g.

What does a coproduct then look like? Well, it’s going to be a type C[A,B] together with an instance coproduct[A:Monoid,B:Monoid]:Monoid[C[A,B]]. It will be equipped with two monoid homomorphisms, left: A => C[A,B] and right: B => C[A,B] that satisfy the following (according to the monoid homomorphism law):

(left(a1) |+| left(a2)) == left(a1 |+| a2)
(right(b1) |+| right(b2)) == right(b1 |+| b2)

And additionally, for any other monoid Z and homomorphisms f: A => Z and g: B => Z we must be able to construct a unique homomorphism from C[A,B] to Z:

def fold[A:Monoid,B:Monoid,Z:Monoid](c: C[A,B])(f: A => Z, g: B => Z): Z

The simplest thing that could possibly work

Let’s simply come up with the data structure required for the coproduct and see whether it satisfies the monoid laws. We will start with two constructors, one for the left side and another for the right side:

sealed trait These[+A,+B]
case class This[A](a: A) extends These[A,Nothing]
case class That[B](b: B) extends These[Nothing,B]

This certainly allows an embedding of both monoids A and B. But These is now basically Either, which we know doesn’t quite form a monoid. What’s needed is a zero, and a way of appending an A to a B. The simplest way to do that is to add a product constructor to These:

case class Both[A,B](a: A, b: B) extends These[A,B]

Now These[A,B] is a monoid as long as A and B are monoids:

def coproduct[A:Monoid,B:Monoid]: Monoid[These[A, B]] =
  new Monoid[These[A, B]] {
    def append(x: These[A, B], y: => These[A, B]) = (x, y) match {
      case (This(a1), This(a2)) => This(a1 |+| a2)
      case (That(b1), That(b2)) => That(b1 |+| b2)
      case (That(b), This(a)) => Both(a, b)
      case (This(a), That(b)) => Both(a, b)
      case (Both(a1, b), This(a)) => Both(a1 |+| a, b)
      case (Both(a, b1), That(b)) => Both(a, b1 |+| b)
      case (This(a1), Both(a, b)) => Both(a1 |+| a, b)
      case (That(b), Both(a, b1)) => Both(a, b |+| b1)
      case (Both(a1, b1), Both(a2, b2)) => Both(a1 |+| a2, b1 |+| b2)
    }
    val zero = Both(mzero[A], mzero[B])
  }

These[A,B] is the smallest monoid that contains both A and B as submonoids (the This and That constructors, respectively) and admits a homomorphism from both A and B. And notice that we simply added the least amount of structure possible to make These[A,B] a monoid (the Both constructor). But is it really a coproduct?

First we must prove that This and That really are homomorphisms. We need to prove the following two properties:

(This(a1) |+| This(a2)) == This(a1 |+| a2)
(That(b1) |+| That(b2)) == That(b1 |+| b2)

That’s easy. The first two cases of the append method on the coproduct monoid prove these properties.

But can we define fold? Yes we can:

def fold[A:Monoid,B:Monoid,Z:Monoid](
  these: These[A,B])(f: A => Z, g: B => Z): Z =
    these match {
      case This(a) => f(a)
      case That(b) => g(b)
      case Both(a, b) => f(a) |+| g(b)
    }

But is fold really a homomorphism? Let’s not assume that it is, but test it out. Here’s the homomorphism law:

fold(f,g)(t1 |+| t2) == fold(f,g)(t1) |+| fold(f,g)(t2)

What happens if t1 and t2 are both This, or both That?

fold(f,g)(This(a1) |+| This(a2)) ==
fold(f,g)(This(a1)) |+| fold(f, g)(This(a2))

f(a1 |+| a2) == f(a1) |+| f(a2)

That holds, because f is itself a homomorphism. But what if we introduce a Both on one side?

fold(f,g)(This(a1) |+| Both(a2, b)) ==
fold(f,g)(This(a1)) |+| fold(f,g)(Both(a2,b))

f(a1 |+| a2) |+| g(b) == f(a1) |+| f(a2) |+| g(b)

So far so good. That holds because f is a homomorphism and |+| is associative. What about the other side?

fold(f,g)(That(b1) |+| Both(a, b2)) ==
fold(f,g)(That(b1)) |+| fold(f,g)(Both(a,b2))

f(a) |+| g(b1 |+| b2) == g(b1) |+| f(a) |+| g(b2)

No! Something has gone wrong. This will only hold if the Z monoid is commutative. So in general, These[A,B] is not the coproduct of A and B. My error was in the Both constructor, which commutes all B values to the right and all A values to the left.

So what kind of thing would work? It would have to solve this case:

  case (Both(a1, b1), Both(a2, b2)) => ???

We need to preserve that a1, b1, a2, and b2 appear in that order. So clearly the coproduct will be some kind of list!

We could modify the Both constructor this way:

case class Both[A,B](a: These[A,B], b: These[A,B]) extends These[A,B]

And let’s add an empty case for the combined zero:

case class Neither[A,B]() extends These[A,B]

These[A,B] has now become a kind of tree, or an unbalanced list, of This[A] and That[B] values: a free product of the two monoids A and B. The implementation of append for the coproduct monoid could be:

def coproduct[A:Monoid,B:Monoid]: Monoid[These[A, B]] =
  new Monoid[These[A, B]] {
    def append(x: These[A, B], y: => These[A, B]) = (x, y) match {
      case (Neither(), _) => y
      case (_, Neither()) => x
      case (This(a1), This(a2)) => This(a1 |+| a2)
      case (That(b1), That(b2)) => That(b1 |+| b2)
      case (This(a1), Both(This(a2), z)) => Both(This(a1 |+| a2), z)
      case (That(b1), Both(That(b2), z)) => Both(That(b1 |+| b2), z)
      case (Both(z, This(a1)), This(a2)) => Both(z, This(a1 |+| a2))
      case (Both(z, That(b1)), That(b2)) => Both(z, That(b1 |+| b2))
      case _ => Both(x, y)
    }
    val zero = Neither[A,B]()
  }

This append normalizes the list so that consecutive values of the same type are added together.
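
For example, assuming the usual Int (addition) and String monoids, combining left to right gives a partially normalized tree. This is just a trace of the cases above:

This(1) |+| This(2) |+| That("a") |+| That("b") |+| This(3)
// == Both(Both(This(3), That("ab")), This(3))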

And we would modify fold to recurse over the tree:

def fold[A,B,Z:Monoid](these: These[A,B])(f: A => Z, g: B => Z): Z =
  these match {
    case Neither() => mzero[Z]
    case This(a) => f(a)
    case That(b) => g(b)
    case Both(a, b) => fold(a)(f, g) |+| fold(b)(f, g)
  }

This is now a homomorphism! We already know that this is so for the This and That cases. And since the Both case simply appeals to the inductive hypothesis, we know that it holds for Both as well.

Free monoids on coproducts

To better understand what’s going on, let’s try going the other way. What if we start with the coproduct of the underlying sets and get a free monoid from there?

The underlying set of a monoid A is just the type A without the monoid structure. The coproduct of types A and B is the type Either[A,B]. Having “forgotten” the monoid structure of both A and B, we can recover it by generating a free monoid on Either[A,B], which is just List[Either[A,B]]. The append operation of this monoid is list concatenation, and the identity for it is the empty list.
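
As a sketch, this free monoid is just the ordinary list monoid specialized to Either[A,B]:

def freeEitherMonoid[A,B]: Monoid[List[Either[A,B]]] =
  new Monoid[List[Either[A,B]]] {
    def append(x: List[Either[A,B]], y: => List[Either[A,B]]) = x ++ y
    val zero = List[Either[A,B]]()
  }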

Clearly List[Either[A,B]] is a monoid, but does it permit a homomorphism from both monoids A and B? If so, then the following properties should hold:

(List(Left(a1)) ++ List(Left(a2))) == List(Left(a1 |+| a2))
(List(Right(b1)) ++ List(Right(b2))) == List(Right(b1 |+| b2))

They clearly do not hold! The lists on the left of == will have two elements and the lists on the right will have one element. Can we do something about this?

Well, the fact is that List[Either[A,B]] is not exactly the monoid coproduct of A and B. It’s “too big” in a sense. But if we were to reduce the list to a normal form that approximates a “free product”, we can get a coproduct that matches our definition above. What we need is a new monoid:

sealed class Eithers[A,B](val toList: List[Either[A,B]]) {
  def ++(p: Eithers[A,B])(implicit ma: Monoid[A], mb: Monoid[B]): Eithers[A,B] =
    Eithers(toList ++ p.toList)
}

object Eithers {
  def apply[A:Monoid,B:Monoid](xs: List[Either[A,B]]): Eithers[A,B] =
    new Eithers(xs.foldRight(List[Either[A,B]]()) {
      case (Left(a1), Left(a2) :: xs) => Left(a1 |+| a2) :: xs
      case (Right(b1), Right(b2) :: xs) => Right(b1 |+| b2) :: xs
      case (e, xs) => e :: xs
    })
  def empty[A,B]: Eithers[A,B] = new Eithers(Nil)
}

Eithers[A,B] is a kind of List[Either[A,B]] that has been normalized so that consecutive As and consecutive Bs have been collapsed using their respective monoids. So it will contain alternating A and B values.
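
For example, with the Int (addition) and String monoids:

Eithers(List(Left(1), Left(2), Right("a"), Right("b"), Left(3))).toList
// == List(Left(3), Right("ab"), Left(3))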

This is now a monoid coproduct because it permits monoid homomorphisms from A and B:

(Eithers(List(Left(a1))) ++ Eithers(List(Left(a2)))) ==
  Eithers(List(Left(a1 |+| a2)))
(Eithers(List(Right(b1))) ++ Eithers(List(Right(b2)))) ==
  Eithers(List(Right(b1 |+| b2)))

And we can implement the fold homomorphism:

def fold[A:Monoid,B:Monoid,Z:Monoid](
  es: Eithers[A,B])(f: A => Z, g: B => Z): Z =
    es.toList.foldRight(mzero[Z]) {
      case (Left(a), z) => f(a) |+| z
      case (Right(b), z) => g(b) |+| z
    }

And this time fold really is a homomorphism, and we can prove it by case analysis. Here’s the law again:

(fold(e1)(f,g) |+| fold(e2)(f,g)) == fold(e1 ++ e2)(f,g)

If either of e1 or e2 is empty then the result is the fold of the other, so those cases are trivial. If they are both nonempty, then they will have one of these forms:

e1 = [..., Left(a1)]
e2 = [Left(a2), ...]

e1 = [..., Right(b1)]
e2 = [Right(b2), ...]

e1 = [..., Left(a)]
e2 = [Right(b), ...]

e1 = [..., Right(b)]
e2 = [Left(a), ...]

In the first two cases, on the right of the == sign in the law, we perform a1 |+| a2 and b1 |+| b2 respectively before concatenating. In the other two cases we simply concatenate the lists. The ++ method on Eithers takes care of doing this correctly for us. On the left of the == sign we fold the lists individually and they will be alternating applications of f and g. So then this law amounts to the fact that f(a1 |+| a2) == f(a1) |+| f(a2) in the first case, and the same for g in the second case. In the latter two cases this amounts to a homomorphism on List. So as long as f and g are homomorphisms, so is fold(_)(f,g). Therefore, Eithers[A,B] really is a coproduct of A and B.

Since we have already convinced ourselves that These is a monoid coproduct, we could also simply show that there is a homomorphism from Eithers to These:

def toThese[A:Monoid,B:Monoid](es: Eithers[A,B]): These[A,B] =
  es.toList match {
    case Nil => Neither()
    case Left(a) :: t => This(a) |+| toThese(Eithers(t))
    case Right(b) :: t => That(b) |+| toThese(Eithers(t))
  }

Which would amount to proving that (toThese(xs) |+| toThese(ys)) == toThese(xs ++ ys).

The lesson learned here is to check assumptions and test against laws. Things are not always as straightforward as they seem.

Free Monads and the Yoneda Lemma

Last week I gave a talk on Purely Functional I/O at Scala.io in Paris. The slides for the talk are available here. In it I presented a data type for IO that is supposedly a “free monad”. But the monad I presented is not exactly the same as scalaz.Free and some people have been asking me why there is a difference and what that difference means.

IO as an application of Free

The Free monad in Scalaz is given a bit like this:

sealed trait Free[F[_],A]
case class Return[F[_],A](a: A) extends Free[F,A]
case class Suspend[F[_],A](s: F[Free[F, A]]) extends Free[F,A]

And throughout the methods on Free, it is required that F is a functor because in order to get at the recursive step inside a Suspend, we need to map over the F somehow.
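
To see where the Functor requirement comes from, here is a sketch of flatMap for this encoding (not the actual Scalaz source, just the shape of it):

def flatMap[F[_]:Functor,A,B](fa: Free[F,A])(f: A => Free[F,B]): Free[F,B] =
  fa match {
    case Return(a) => f(a)
    case Suspend(s) => Suspend(Functor[F].map(s)(free => flatMap(free)(f)))
  }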

But the IO monad I gave in the talk looks more like this:

sealed trait IO[F[_],A]
case class Return[F[_],A](a: A) extends IO[F,A]
case class Req[F[_],I,A](req: F[I], k: I => IO[F,A]) extends IO[F,A]

And it could actually be stated as an application of Free:

type IO[F[_],A] = Free[({type λ[α] = (F[I], I => α) forSome {type I}})#λ, A]

So in a very superficial sense, this is how the IO monad relates to Free. The monad IO[F,_] for a given F is the free monad generated by the functor (F[I], I => _) for some type I. And do note that this is a functor no matter what F is.
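
A quick sketch of why it is a functor regardless of F: mapping never has to touch the F[I] part, it only composes with the continuation.

def mapReq[F[_],I,A,B](req: (F[I], I => A))(f: A => B): (F[I], I => B) =
  (req._1, req._2 andThen f)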

IO as equivalent to Free

There is a deeper sense in which IO and Free are actually equivalent (more precisely, isomorphic). That is, there exists a transformation from one to the other and back again. Since the only difference between IO and Free is in the functors F[_] vs ∃I. (F[I], I => _), we just have to show that these two are isomorphic for any F.

The Yoneda lemma

There is an important result in category theory known as the Yoneda lemma. What it says is that if you have a function defined like this…

def map[B](f: A => B): F[B]

…then you certainly have a value of type F[A]. All you need is to pass the identity function to map in order to get the value of type F[A] out of this function. In fact, a function like this is in practice probably defined as a method on a value of type F[A] anyway. This also means that F is definitely a functor.

The Yoneda lemma says that this goes the other way around as well. If you have a value of type F[A] for any functor F and any type A, then you certainly have a map function with the signature above.

In Scala terms, we can capture this in a type:

trait Yoneda[F[_], A] {
  def map[B](f: A => B): F[B]
}

And the Yoneda lemma says that there is an isomorphism between Yoneda[F,A] and F[A], for any functor F and any type A. Here is the proof:

import scalaz._

def toYo[F[_]:Functor,A](fa: F[A]) = new Yoneda[F,A] {
  def map[B](f: A => B) = Functor[F].map(fa)(f)
}

def froYo[F[_],A](yo: Yoneda[F,A]) = yo.map(a => a)

CoYoneda

Of course, this also means that if we have a type B, a function of type (B => A) for some type A, and a value of type F[B] for some functor F, then we certainly have a value of type F[A]. This is kind of obvious, since we can just pass the B => A and the F[B] to the map function for the functor and get our F[A].

But the opposite is also true, and that is the really interesting part. If we have a value of type F[A], for any F and A, then we can always destructure it into a value of type F[B] and a function of type B => A, at least for some type B. And it turns out that we can do this even if F is not a functor.

This is the permutation of the Yoneda lemma that we were using in IO above. That is, IO[F, A] is really Free[({type λ[α] = CoYoneda[F,α]})#λ, A], given:

trait CoYoneda[F[_], A] {
  type I
  def f: I => A
  def fi: F[I]
}

And the lemma says that CoYoneda[F,A] is isomorphic to F[A]. Here is the proof:

def toCoYo[F[_],A](fa: F[A]) = new CoYoneda[F,A] {
  type I = A
  val f = (a: A) => a
  val fi = fa
}
def froCoYo[F[_]:Functor,A](yo: CoYoneda[F,A]) = Functor[F].map(yo.fi)(yo.f)

Of course, this destructuring into CoYoneda using the identity function is the simplest and most general, but there may be others for specific F and A depending on what we know about them.

So there you have it. The scalaz.Free monad with its Suspend(F[Free[F,A]]) constructor and the IO monad with its Req(F[I], I => IO[F,A]) constructor are actually equivalent. The latter is simply making use of CoYoneda to say the same thing.

Why bother? The useful part is that CoYoneda[F,_] is a functor for any F, so it’s handy to use in a free monad since we can then drop the requirement that F is a functor. What’s more, it gives us map fusion for free, since map over CoYoneda is literally just function composition on its f component. Although this latter is, in the absence of tail call elimination, not as useful as it could be in Scala.
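
Here is a sketch of that map, showing that it really is just composition on the f component:

def mapCoYo[F[_],A,B](c: CoYoneda[F,A])(g: A => B): CoYoneda[F,B] =
  new CoYoneda[F,B] {
    type I = c.I
    val f = c.f andThen g
    val fi = c.fi
  }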

I hope that sheds a little bit of light on the Yoneda lemma as well as the different embeddings of free monads.

Free Monoids and Free Monads

In a series of old posts, I once talked about the link between lists and monoids, as well as monoids and monads. Now I want to talk a little bit more about monoids and monads from the perspective of free structures.

List is a free monoid. That is, for any given type A, List[A] is a monoid, with list concatenation as the operation and the empty list as the identity element.

def freeMonoid[A]: Monoid[List[A]] = new Monoid[List[A]] {
  def unit = Nil
  def op(a1: List[A], a2: List[A]) = a1 ++ a2
}

Being a free monoid means that it’s the minimal such structure. List[A] has exactly enough structure so that it is a monoid for any given A, and it has no further structure. This also means that for any given monoid B, there must exist a transformation, a monoid homomorphism from List[A] to B:

def foldMap[A,B:Monoid](as: List[A])(f: A => B): B

Given a mapping from A to a monoid B, we can collapse a value in the monoid List[A] to a value in B.
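
A sketch of foldMap in terms of the unit and op operations of the Monoid interface used above:

def foldMap[A,B:Monoid](as: List[A])(f: A => B): B = {
  val m = implicitly[Monoid[B]]
  as.foldRight(m.unit)((a, b) => m.op(f(a), b))
}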

Free monads

Now, if you followed my old posts, you already know that monads are “higher-kinded monoids”: a monoid in a category where the objects are type constructors (functors, actually) and the arrows between them are natural transformations. As a reminder, a natural transformation from F to G can be represented this way in Scala:

trait ~>[F[_],G[_]] {
  def apply[A](a: F[A]): G[A]
}

And it turns out that there is a free monad for any given functor F:

sealed trait Free[F[_],A]
case class Return[F[_],A](a: A) extends Free[F,A]
case class Suspend[F[_],A](f: F[Free[F,A]]) extends Free[F,A]

Analogous to how a List[A] is either Nil (the empty list) or a product of a head element and tail list, a value of type Free[F,A] is either an A or a product of F[_] and Free[F,_]. It is a recursive structure. And indeed, it has exactly enough structure to be a monad, for any given F, and no more.

When I say “product” of two functors like F[_] and Free[F,_], I mean a product like this:

trait :*:[F[_],G[_]] {
  type λ[A] = F[G[A]]
}

So we might expect that there is a monad homomorphism from a free monad on F to any monad that F can be transformed to. And indeed, it turns out that there is. The free monad catamorphism is in fact a monad homomorphism. Given a natural transformation from F to G, we can collapse a Free[F,A] to G[A], just like with foldMap when given a function from A to B we could collapse a List[A] to B.

def runFree[F[_],G[_]:Monad,A](as: Free[F,A])(f: F ~> G): G[A]
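
One way to implement this is sketched below (using Scalaz's Monad type class; note that no Functor instance for F is needed here, because the natural transformation is applied to the whole F[Free[F,A]] layer at once):

def runFree[F[_],G[_]:Monad,A](free: Free[F,A])(f: F ~> G): G[A] =
  free match {
    case Return(a) => Monad[G].point(a)
    case Suspend(s) => Monad[G].bind(f(s))(next => runFree(next)(f))
  }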

But what’s the equivalent of foldRight for Free? Remember, foldRight takes a unit element z and a function that accumulates into B so that B doesn’t actually have to be a monoid. Here, f is a lot like the monoid operation, except it takes the current A on the left:

def foldRight[A,B](as: List[A])(z: B)(f: (A, B) => B): B

The equivalent for Free takes a natural transformation as its unit element, which for a monad happens to be monadic unit. Then it takes a natural transformation as its f argument, that looks a lot like monadic join, except it takes the current F on the left:

type Id[A] = A

def foldFree[F[_]:Functor,G[_],A](
  as: Free[F,A])(
  z: Id ~> G)(
  f: (F:*:G)#λ ~> G): G[A]

In this case, G does not have to be a monad at all.
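
A sketch of an implementation along those lines; F needs to be a Functor so we can map over the recursive step, but G is unconstrained:

def foldFree[F[_]:Functor,G[_],A](
  as: Free[F,A])(
  z: Id ~> G)(
  f: (F:*:G)#λ ~> G): G[A] =
  as match {
    case Return(a) => z(a)
    case Suspend(s) => f(Functor[F].map(s)(inner => foldFree(inner)(z)(f)))
  }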

Free as well as natural transformations and product types are available in Scalaz.

What Purity Is and Isn’t

A lot of discussion about “purity” goes on without participants necessarily having a clear idea of what it means exactly. Such discussion is generally unhelpful and distracting.

What purity is

The typical definition of purity (and the one we use in our book) goes something like this:

An expression e is referentially transparent if for all programs p, every occurrence of e in p can be replaced with the result of evaluating e without changing the result of evaluating p.

A function f is pure if the expression f(x) is referentially transparent for all referentially transparent x.

Now, something needs to be made clear right up front. Like all definitions, this holds in a specific context. In particular, the context needs to specify what “evaluating” means. It also needs to define “program”, “occurrence”, and the semantics of “replacing” one thing with another.

In a programming language like Haskell, Java, or Scala, this context is pretty well established. The process of evaluation is a reduction to some normal form such as weak head or beta normal form.

A simple example

To illustrate, let’s consider programs in an exceedingly simple language that we will call Sigma. An expression in Sigma has one of the following forms:

  • A literal character string like "a", "foo", "", etc.
  • A concatenation, s + t, for expressions s and t.
  • A special Ext expression that denotes input from an external source.

Now, without an evaluator for Sigma, it is a purely abstract algebra. So let’s define a straightforward evaluator eval for it, with the following rules:

  • A literal string is already in normal form.
  • eval(s + t) first evaluates s and t and concatenates the results into one literal string.
  • eval(Ext) reads a line from standard input and returns it as a literal string.

This might seem very simple, but it is still not clear whether Ext is referentially transparent with regard to eval. It depends. What does “reads a line” mean, and what is “standard input” exactly? This is all part of a context that needs to be established.

Here’s one implementation of an evaluator for Sigma, in Scala:

sealed trait Sigma
case class Lit(s: String) extends Sigma
case class Concat(s: Sigma, t: Sigma) extends Sigma
case object Ext extends Sigma

def eval1(sig: Sigma): Sigma = sig match {
  case Concat(s, t) => {
    val Lit(e1) = eval1(s)
    val Lit(e2) = eval1(t)
    Lit(e1 + e2)
  }
  case Ext => Lit(readLine)
  case x => x
}

Now, it’s easy to see that the Ext instruction is not referentially transparent with regard to eval1. Replacing Ext with eval1(Ext) does not preserve meaning. Consider this:

val x = Ext
eval1(Concat(x, x))

versus this:

val x = eval1(Ext)
eval1(Concat(x, x))

That’s clearly not the same thing. The former will get two strings from standard input and concatenate them together. The latter will get only one string, store it as x, and return x + x.

Now consider a slightly different evaluator:

def eval2(sig: Sigma, stdin: String): Sigma = sig match {
  case Concat(s, t) => {
    val Lit(e1) = eval2(s, stdin)
    val Lit(e2) = eval2(t, stdin)
    Lit(e1 + e2)
  }
  case Ext => Lit(stdin)
  case x => x
}

In this case, the Ext instruction clearly is referentially transparent with regard to eval2, because our standard input is just a string, and it is always the same string. So you see, the purity of functions in the Sigma language very much depends on how that language is interpreted.
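
To make that concrete, the substitution that failed for eval1 now preserves meaning. A small check, using the definitions above:

val stdin = "hello"

val r1 = eval2(Concat(Ext, Ext), stdin)
val r2 = { val x = eval2(Ext, stdin); eval2(Concat(x, x), stdin) }
// r1 == r2 == Lit("hellohello")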

This is the reason why Haskell programs are considered “pure”, even in the presence of IO. A value of type IO a in Haskell is simply a function. Reducing it to normal form (evaluating it) has no effect. An IO action is of course not referentially transparent with regard to unsafePerformIO, but as long as your program does not use that it remains a referentially transparent expression.

What purity is not

In my experience there are more or less two camps into which unhelpful views on purity fall.

The first view, which we will call the empiricist view, is typically taken by people who understand “pure” as a pretentious term, meant to denigrate regular everyday programming as being somehow “impure” or “unclean”. They see purity as being “academic”, detached from reality, in an ivory tower, or the like.

This view is premised on a superficial understanding of purity. The assumption is that purity is somehow about the absence of I/O, or not mutating memory. But how could any programs be written that don’t change the state of memory? At the end of the day, you have to update the CPU’s registers, write to memory, and produce output on a display. A program has to make the computer do something, right? So aren’t we just pretending that our programs don’t run on real computers? Isn’t it all just an academic exercise in making the CPU warm?

Well, no. That’s not what purity means. Purity is not about the absence of program behaviors like I/O or mutable memory. It’s about delimiting such behavior in a specific way.

The other view, which I will call the rationalist view, is typically taken by people with overexposure to modern analytic philosophy. Expressions are to be understood by their denotation, not by reference to any evaluator. Then of course every expression is really referentially transparent, and so purity is a distinction without a difference. After all, an imperative side-effectful C program can have the same denotation as a monadic, side-effect-free Haskell program. There is nothing wrong with this viewpoint, but it’s not instructive in this context. Sure, when designing in the abstract, we can think denotationally without regard to evaluation. But when concretizing the design in terms of an actual programming language, we do need to be aware of how we expect evaluation to take place. And only then are referential transparency and purity useful concepts.

Both of these, the rationalist and empiricist views, conflate different levels of abstraction. A program written in a programming language is not the same thing as the physical machine that it may run on. Nor is it the same thing as the abstractions that capture its meaning.

Further reading

I highly recommend the paper What is a Purely Functional Language? by Amr Sabry, although it deals with the idea of a purely functional language rather than purity of functions within a language that does not meet that criterion.

New Beginnings

I have not seriously updated the old Apocalisp blog for quite some time. Mostly this is due to the fact that I have been spending all of my creative time outside of work on writing a book. It’s also partly that putting a post up on WordPress is a chore. It’s like building a ship in a bottle.

So I have decided to make posting really easy for myself by hosting the blog on GitHub. I am using a dead-simple markdown-based framework called Octopress. With this setup I can very easily write a new post from my command line and publish by pushing to GitHub. This is already part of my normal coding workflow, so it feels more friction-free.

The new blog is simply titled “Higher Order”, and is available at blog.higher-order.com. Check back soon for posts that I’ve been sitting on but have been too lazy busy to post.

All of the old content and comments will still be available at the old address, and I’ll probably cross-post to both places for a little while.

Scalaz Tutorial: Enumeration-based I/O With Iteratees

Scalaz 5.0 adds an implementation of a concept called Iteratee. This is a highly flexible programming technique for writing enumeration-based input processors that can be freely composed.

A lot of people have asked me to write a tutorial on how this works, specifically on how it is implemented in Scalaz and how to be productive with it, so here we go.

The implementation in Scalaz is based on an excellent article by John W. Lato called “Iteratee: Teaching an Old Fold New Tricks”. As a consequence, this post is also based on that article, and because I am too unoriginal to come up with my own examples, the examples are directly translated from it. The article gives code examples in Haskell, but we will use Scala here throughout.

Motivation

Most programmers have come across the problem of treating an I/O data source (such as a file or a socket) as a data structure. This is a common thing to want to do. By contrast, the usual means of reading, say, a file, is to open it, get a cursor into the file (such as a FileReader or an InputStream), and read the contents of the file as it is being processed. You must of course handle IO exceptions and remember to close the file when you are done. The problem with this approach is that it is not modular. Functions written in this way are performing one-off side-effects. And as we know, side-effects do not compose.

Treating the stream of inputs as an enumeration is therefore desirable. It at least holds the lure of modularity, since we would be able to treat a File, from which we’re reading values, in the same way that we would treat an ordinary List of values, for example.

A naive approach to this is to use iterators, or rather, Iterables. This is akin to the way that you would typically read a file in something like Ruby or Python. Basically you treat it as a collection of Strings:

def getContents(fileName: String): Iterable[String] = {
  val fr = new BufferedReader(new FileReader(fileName))
  new Iterable[String] {
    def iterator = new Iterator[String] {
      def hasNext = line != null
      def next = {
        val retVal = line
        line = getLine
        retVal
      }
      def getLine = {
        var line:String = null
        try {
          line = fr.readLine
        } catch {
          case _ => line = null
        }
        line
      }
      var line = getLine
    }
  }
}

What this is doing is a kind of lazy I/O. Nothing is read from the file until it is requested, and we only hold one line in memory at a time. But there are some serious issues with this approach. It’s not clear when you should close the file handle, or whose responsibility that is. You could have the Iterator close the file when it has read the last line, but what if you only want to read part of the file? Clearly this approach is not sufficient. There are some things we can do to make this more sophisticated, but only at the expense of breaking the illusion that the file really is a collection of Strings.

The Idea

Any red-blooded functional programmer should be thinking right about now: “Instead of getting Strings out of the file, just pass in a function that will serve as a handler for each new line!” Bingo. This is in fact the idea behind Iteratees. Instead of implementing an interface from which we request Strings by pulling, we’re going to give an implementation of an interface that can receive Strings by pushing.

And indeed, this idea is nothing new. This is exactly what we do when we fold a list:

def foldLeft[B](b: B)(f: (B, A) => B): B

The second argument is exactly that, a handler for each element in the list, along with a means of combining it with the accumulated value so far.

Now, there are two issues with an ordinary fold that prevent it from being useful when enumerating file contents. Firstly, there is no way of indicating that the fold should stop early. Secondly, a list is held all in memory at the same time.

The Iteratee Solution

Scalaz defines the following two data structures (actual implementation may differ, but this serves for illustration):

trait Input[+E]
case class El[E](e: E) extends Input[E]
case object Empty extends Input[Nothing]
case object EOF extends Input[Nothing]

trait IterV[E,A] {
  def run: A = ... // Implementation omitted
}
case class Done[E,A](a: A, e: Input[E]) extends IterV[E,A]
case class Cont[E,A](k: Input[E] => IterV[E,A]) extends IterV[E,A]

So an input to an iteratee is represented by Input[E], where E is the element type of the input source. It can be either an element (the next element in the file or stream), or it’s one of two signals: Empty or EOF. The Empty signal tells the iteratee that there is no element available yet, but to expect more elements later. The EOF signal tells the iteratee that there are no more elements to be had.

Note that this particular set of signals is kind of arbitrary. It just facilitates a particular set of use cases. There’s no reason you couldn’t have other signals for other use cases. For example, a signal I can think of off the top of my head would be Restart, which would tell the iteratee to start its result from scratch at the current position in the input.

IterV[E,A] represents a computation that can be in one of two states. It can be Done, in which case it will hold a result (the accumulated value) of type A. Or it can be waiting for more input of type E, in which case it will hold a continuation that accepts the next input.

Let’s see how we would use this to process a List. The following function takes a list and an iteratee and feeds the list’s elements to the iteratee.

def enumerate[E,A]: (List[E], IterV[E,A]) => IterV[E,A] = {
  case (Nil, i) => i
  case (_, i@Done(_, _)) => i
  case (x :: xs, Cont(k)) => enumerate(xs, k(El(x)))
}

Now let’s see some actual iteratees. As a simple example, here is an iteratee that counts the number of elements it has seen:

def counter[A]: IterV[A,Int] = {
  def step(n: Int): Input[A] => IterV[A,Int] = {
    case El(x) => Cont(step(n + 1))
    case Empty => Cont(step(n))
    case EOF => Done(n, EOF)
  }
  Cont(step(0))
}

And here’s an iteratee that discards the first n elements:

def drop[E,A](n: Int): IterV[E,Unit] = {
  def step: Input[E] => IterV[E,Unit] = {
    case El(x) => drop(n - 1)
    case Empty => Cont(step)
    case EOF => Done((), EOF)
  }
  if (n == 0) Done((), Empty) else Cont(step)
}

And one that takes the first element from the input:

def head[E]: IterV[E, Option[E]] = {
  def step: Input[E] => IterV[E, Option[E]] = {
    case El(x) => Done(Some(x), Empty)
    case Empty => Cont(step)
    case EOF => Done(None, EOF)
  }
  Cont(step)
}

Let’s go through this code. Each one defines a “step” function, which is the function that will handle the next input. Each one starts the iteratee in the Cont state, and the step function always returns a new iteratee in the next state based on the input received. Note in the last one (head), we are using the Empty signal to indicate that we want to remove the element from the input. The utility of this will be clear when we start composing iteratees.

Now, an example usage. To get the length of a list, we write:

val length: Int = enumerate(List(1,2,3), counter[Int]).run // 3

The run method on IterV just gets the accumulated value out of the Done iteratee. If it isn’t done, it sends the EOF signal to itself first and then gets the value.
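
A sketch of what run might look like as a method on IterV (the actual Scalaz implementation may differ in details):

def run: A = this match {
  case Done(a, _) => a
  case Cont(k) => k(EOF) match {
    case Done(a, _) => a
    case Cont(_) => sys.error("diverging iteratee")
  }
}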

Composing Iteratees

Notice a couple of things here. With iteratees, the input source can send the signal that it has finished producing values. And on the other side, the iteratee itself can signal to the input source that it has finished consuming values. So on one hand, we can leave an iteratee “running” by not sending it the EOF signal, so we can compose two input sources and feed them into the same iteratee. On the other hand, an iteratee can signal that it’s done, at which point we can start sending any remaining elements to another iteratee. In other words, iteratees compose sequentially.

In fact, IterV[E,A] is an instance of the Monad type class for each fixed E, and composition is very similar to the way monadic parsers compose:

def flatMap[B](f: A => IterV[E,B]) = this match {
  case Done(x, e) => f(x) match {
    case Done(y, _) => Done(y, e)
    case Cont(k) => k(e)
  }
  case Cont(k) => Cont(e => k(e) flatMap f)
}

Here then is an example of composing iteratees with a for-comprehension:

def drop1Keep1[E]: IterV[E, Option[E]] = for {
  _ <- drop(1)
  x <- head
} yield x

The iteratee above discards the first element it sees and returns the second one. The iteratee below does this n times, accumulating the kept elements into a list.

def alternates[E](n: Int): IterV[E, List[E]] =
  drop1Keep1[E].
    replicate[List](n).
    foldRight(Done(List[Option[E]](),Empty))((x, y) => for {
      h <- x
      t <- y
    } yield h :: t).map(_.flatten)

Here’s an example run:

scala> enumerate(List.range(1,15), alternates[Int](5)).run
res85: List[Int] = List(2, 4, 6, 8, 10)

File Input With Iteratees

Using the iteratees to read from file input turns out to be incredibly easy. The only difference is in how the data source is enumerated, and in order to remain lazy (and not prematurely perform any side-effects), we must return our iteratee in a monad:

def enumReader[A](r: BufferedReader,
                  it: IterV[String, A]): IO[IterV[String, A]] = {
  def loop: IterV[String, A] => IO[IterV[String, A]] = {
    case i@Done(_, _) => IO { i }
    case i@Cont(k) => for {
      s <- IO { r.readLine }
      a <- if (s == null) IO { i } else loop(k(El(s)))
    } yield a
  }
  loop(it)
}

The monad being used here is an IO monad that I’ll explain in a second. The important thing to note is that the iteratee is completely oblivious to the fact that it’s being fed lines from a BufferedReader rather than a List.

Here is the IO monad I’m using. As you can see, it’s really just a lazy identity monad:

object io {
  sealed trait IO[A] {
    def unsafePerformIO: A
  }

  object IO {
    def apply[A](a: => A): IO[A] = new IO[A] {
      def unsafePerformIO = a
    }
  }

  implicit val IOMonad = new Monad[IO] {
    def pure[A](a: => A): IO[A] = IO(a)
    def bind[A,B](a: IO[A], f: A => IO[B]): IO[B] = IO {
      implicitly[Monad[Function0]].bind(
        () => a.unsafePerformIO,
        (x:A) => () => f(x).unsafePerformIO)()
    }
  }
}

To read lines from a file, we’ll do something like this:

def bufferFile(f: File) = IO {
  new BufferedReader(new FileReader(f))
}

def closeReader(r: Reader) = IO {
  r.close
}

def bracket[A,B,C](init: IO[A],
                   fin: A => IO[B],
                   body: A => IO[C]): IO[C] =
  for { a <- init
        c <- body(a)
        _ <- fin(a) }
    yield c

def enumFile[A](f: File, i: IterV[String, A]): IO[IterV[String, A]] =
  bracket(bufferFile(f),
          closeReader(_:BufferedReader),
          enumReader(_:BufferedReader, i))

The enumFile method uses bracketing to ensure that the file always gets closed. It’s completely lazy though, so nothing actually happens until you call unsafePerformIO on the resulting IO action:

scala> enumFile(new File("/Users/runar/Documents/Iteratees.txt"), head) map (_.run)                
res2: io.IO[Option[String]] = io$IO@5f90b584

scala> res2.unsafePerformIO
res3: Option[String] = Some(Scalaz Tutorial: Enumeration-Based I/O With Iteratees)

That uses the “head” iteratee from above to get the first line of the file that I’m using to edit this blog post.

We can get the number of lines in two files combined, by composing two enumerations and using our “counter” iteratee from above:

def lengthOfTwoFiles(f1: File, f2: File): IO[Int] = for {
  l1 <- enumFile(f1, counter)
  l2 <- enumFile(f2, l1)
} yield l2.run

So what we have here is a uniform and compositional interface for enumerating both pure and effectful data sources. We can avoid holding on to the entire input in memory when we don’t want to, and we have complete control over when to stop iterating. The iteratee can decide whether to consume elements, leave them intact, or even truncate the input. The enumerator can decide whether to shut the iteratee down by sending it the EOF signal, or to leave it open for other enumerators.

There is even more to this approach, as we can use iteratees not just to read from data sources, but also to write to them. That will have to await another post.

A Critique of Impure Reason

Much has been made of the difference between “object-oriented” and “functional” programming. As much as I’ve seen people talk about this difference, or some kind of imperative/functional dichotomy, I don’t think it’s a meaningful distinction. I find that the question of OO vs FP, or imperative vs functional, is not interesting or relevant. The relevant question is one of purity, or the distinction between pure and impure programming. That is, respectively, the kind of programming that eschews data mutation and side-effects as against the kind that embraces them.

Looking at pure (referentially transparent, purely functional) programming from a perspective of having accepted the premise of impurity, it’s easy to be intimidated, and to see it as something foreign. Dijkstra once said that some programmers are “mentally mutilated beyond hope of regeneration”. I think that on one account, he was right, and that he was talking about this mental barrier of making the leap, from thinking about programming in terms of data mutation and effects, to the “pure” perspective. But I believe he was wrong on the other account. There is hope.

Contrary to what the last post indicated, this will not be a post about problems in “object-oriented” programming in particular, but about problems in impure programming in general. I will illustrate the issue of data mutation by way of two difficulties with it:

  • It blurs the distinction between values and variables, such that a term can be both a value and a variable.
  • It introduces an artificial distinction between equality and identity.

I will talk briefly about what the implications of each are, and what the alternative view is. But first I will discuss what I think motivates the impure perspective: an aversion to abstraction.

The Primacy of the Machine

In the early days of programming, there were no computers. The first programs were written, and executed, on paper. It wasn’t until later that machines were first built that could execute programs automatically.

During the ascent of computers, an industry of professional computer programmers emerged. Perhaps because early computers were awkward and difficult to use, the focus of these professionals became less thinking about programs and more manipulating the machine.

Indeed, if you read the Wikipedia entry on “Computer Program”, it tells you that computer programs are “instructions for a computer”, and that “a computer requires programs to function”. This is a curious position, since it’s completely backwards. It implies that programming is done in order to make computers do things, as a primary. I’ll warrant that the article was probably written by a professional programmer.

But why does a computer need to function? Why does a computer even exist? The reality is that computers exist solely for the purpose of executing programs. The machine is not a metaphysical primary. Reality has primacy, a program is a description, an abstraction, a proof of some hypothesis about an aspect of reality, and the computer exists to deduce the implications of that fact for the pursuit of human values.

To say that a computer “requires programs to function” is not at all like saying that an automobile requires fuel. It’s like saying that it needs a driver. While that’s true, the driver doesn’t exist for the sake of the car. The driver has somewhere to go and the car exists to serve that purpose.

Shaping Programs in the Machine’s Image

There’s a certain kind of programming that lends itself well to manipulating a computer at a low level. You must think in terms of copying data between registers and memory, and executing instructions based on the machine’s current state. In this kind of activity, the programmer serves as the interpreter between the abstract algorithm and the physical machine. But this in itself is a mechanical task. We can instead write a program, once and for all, that performs the interpretation for us, and direct that program using a general-purpose programming language.

But what kind of structure will that language have? Ideally, we would be able to express programs in a natural and concise notation without regard to the machine. Such was the motivation behind LISP. It was designed as an implementation of the lambda calculus, a minimal formal system for expressing algorithms.

This is not what has happened with languages like Java, C++, and some other of today’s most popular languages. The abstractions in those languages largely simulate the machine. You have mutable records, or objects, into which you copy data and execute instructions based on their current state. The identity of objects is based on their location in the computer’s memory. You have constructs like “factories”, “builders”, “generators”… machines!

What we have is a generation of programmers accustomed to writing programs as if they were constructing a machine. We hear about “engineering” and “architecture” when talking about software. Indeed, as users, we interact with software using pictures of buttons, knobs, and switches. More machinery.

Equality and Identity

Programmers are generally used to thinking of equality and identity as distinct. For example, in Java, (x == y) means that the variable x holds a pointer to the same memory address as the variable y, i.e. that the variable x is identical with y. However, by convention, x.equals(y) means that the object at the memory address pointed to by x is equivalent in some sense to the object at the memory address pointed to by y, but x and y are not necessarily identical.

Note that the difference between the two can only be explained in terms of the underlying machine. We have to invoke the notion of memory addresses to understand what’s going on, or else resort to an appeal to intuition with terms like “same object”, “same instance”, etc.

In a pure program, side-effects are disallowed. So a function is not allowed to mutate arguments or make calls to functions that have side-effects. So the result of every function is purely determined by the value of its arguments, and a function does nothing except compute a result. Think about the implications of that for a moment. Given that constraint, is there any reason to maintain a distinction between equality and identity? Consider this example:

String x = new String("");
String y = new String("");

If the values referenced by x and y are forever immutable, what could it possibly mean for them to not be identical? In what meaningful sense are they not the “same object”?

Note that there’s a philosophical connection here. The notion that every object carries an extradimensional identity that is orthogonal to its attributes is a very Platonic idea. The idea that an object has, in addition to its measurable attributes, a distinguished, eternal, noumenal identity. Contrast this with the Aristotelian view, in which an object’s identity is precisely the sum of its attributes.

Variables and Pseudovariables

So we find that, given the mutability premise, a term in the language may not be equal to itself even as it remains identical with itself. In a sense, this is because it’s not entirely clear whether we are comparing values or variables when we compare a term. If that sounds a bit strange, consider the following snippet:

public class Foo {
  public int v;
  public Foo(int i) {
    v = i;
  }
}
...
public Foo f(Foo x) {
  x.v = x.v + 1;
  return x;
}
...
Foo x = new Foo(0);
boolean a = f(x) == f(x); // true
boolean b = f(x).v > f(x).v; // ???

Even that short and ostensibly simple code snippet is difficult to comprehend properly, because the f method mutates its argument. It seems that since the first boolean is true, the same object appears on either side of the == operator, so f(x) is identical with f(x). They’re the “same object”. Even so, it is not clear that this “same object” is equal to itself. To figure out the value of b, you have to know some things about how the machine executes the program, not just what the program actually says. You can argue about pass-by-reference and pass-by-value, but these are terms that describe the machine, not the program.

You will note that the Foo type above is mutable in the sense that it has accessible components that can be assigned new values. But what does that mean? Is an object of type Foo a value or a variable? It seems to have a kind of pseudo-variable v. To clarify this, we can imagine that a variable holding a value of type Foo also holds a further value of type int. Assignment to a variable x of type Foo also assigns the variable x.v of type int, but the variable x.v can be assigned independently of x.

So data mutation is, ultimately, variable assignment. Above, Foo is mutated by assigning a value to the pseudo-variable v.

Now, I say pseudo-variable, because it’s not obvious whether this is a value or a variable. A function that receives a value of type Foo, also receives the variable Foo.v, to which it can assign new values. So the variable is passed as if it were a value. Stated in terms of the machine, a pointer to a pointer is being passed, and the memory at that address can be mutated by writing another pointer to it. In terms of the program, the called function is able to rebind a variable in the scope of the caller.

This is why you have bugs, why programs are hard to debug, and software is difficult to extend. It’s because programs are difficult to comprehend to the extent that they are impure. And the reason for that is that they are not sufficiently abstract. A writer of impure code is not operating on the level of abstraction of programming. He’s operating at the level of abstraction of the machine.

Thinking Pure Thoughts

In a pure program, there are no side-effects at all. Every function is just that, a function from its arguments to its result. Values are immutable and eternal. A is A, and will always remain A. The only way to get from A to B is for there to be a function that takes A and returns B.

To a lot of programmers used to thinking in terms of mutable state, this perspective is completely baffling. And yet, most of them use immutable values all the time, without finding them particularly profound. For example:

int x = 1;
int y = x;
int z = x + 1;

In that example, 1 is an immutable value. The + operator doesn’t modify the value of 1, nor of x, nor of y. It doesn’t take the value 1 and turn it into a 2. Every integer is equal to and identical with itself, always and forever.

But what’s special about integers? When adding a value to a list, a Java programmer will ordinarily do something like this:

list.add(one);

This modifies the list in place, so that every variable referencing it now sees the changed contents. But why shouldn't lists be as immutable as integers? I'm sure some readers just mumbled something about primitives and passing by reference versus passing by value. But why the distinction? If everything were immutable, you would never know if it were passed by value or reference under the covers. Nor would you care what's primitive and what isn't. For an immutable list, for example, the add function would not be able to modify the list in place, and so would return a new one:

list = list.add(one);
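
As a rough sketch of what such a list might look like (a hypothetical class written for illustration, not any particular library's API), consider a cons list whose add allocates a new cell and leaves the old list untouched:

final class IntList {
  final int head;
  final IntList tail;   // null represents the empty list

  IntList(int head, IntList tail) {
    this.head = head;
    this.tail = tail;
  }

  // "Adding" builds a new cell in front; this list itself is never modified.
  IntList add(int x) {
    return new IntList(x, this);
  }
}
...
IntList list = new IntList(1, null);
list = list.add(2);   // rebinds the variable; the old one-element list is unchanged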

From this perspective, you can write programs that perform arithmetic with complex values as if they were primitive. But why stop with lists? When functions are pure, we can understand programs themselves as objects that are subject to calculation just like numbers in arithmetic, using simple and composable operators.

Assignment in a Pure World

Let’s take a simple example of just such an operator. We’ve talked about how mutation is really assignment. Now let’s take the leap fully into the pure perspective, in which assignment is really just closure (pure functions with free variables). To illustrate, here is an example that all Java and C programmers are familiar with:

int x = 1;
int y = x;
x = y + 1;
t(x + y)

The familiar semicolon represents a combinator called “bind”. In Haskell, this operator is denoted (>>=). In Scala, it’s called flatMap. To better see how this works, here’s a rough transliteration into Haskell:

f 1
  where
    f x = return x >>= g
    g y = return (y + 1) >>= \x -> t (x + y)

Of course, this isn’t at all the kind of code you would normally write, but it serves to demonstrate the idea that assignment can be understood as binding the argument to a function. The Haskell code above roughly means:

Pass 1 to f, where f passes its argument unchanged to g. And g adds 1 to its argument and passes the result to a function that adds y to its argument and passes that to t.

So in the Java version, we can understand everything after the first semicolon to be a function that takes an argument x and we happen to be passing it 1. Everything after the second semicolon is a function that takes an argument y and we happen to be passing it x (which has the value 1). And everything after the third semicolon is a function that takes an argument called x (shadowing the first x) and passes x + y to some function t.
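
If you want to see the same shape in Java, here is a rough sketch that uses Optional.flatMap purely as a stand-in for bind (with a hypothetical t); each “assignment” becomes a function receiving the bound value:

import java.util.Optional;

class BindSketch {
  static int t(int n) { return n; }            // hypothetical stand-in for the t above

  static Optional<Integer> program() {
    return Optional.of(1).flatMap(x ->         // int x = 1;
           Optional.of(x).flatMap(y ->         // int y = x;
           Optional.of(y + 1).flatMap(x2 ->    // x = y + 1;  (x2 plays the shadowing x)
           Optional.of(t(x2 + y)))));          // t(x + y)
  }
}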

Haskell is a purely functional language, and purity is enforced by its type system. If you haven’t taken it for a spin, I challenge you to do so. For more on the benefits of purely functional programming, see “Why Haskell Matters”.

Of course, purity is not purely the prerogative of Haskell. Sure, Haskell enforces purity, but you can write pure code in any language. That Java code above is pure, you know. If this is all new to you, I challenge you to give purity a shot. In your favourite language, be it Java or something else, start using immutable data structures in preference to mutable ones. Try thinking in terms of functions from arguments to results rather than in terms of mutation and effects. For Java, you might consider using the immutable data structures provided by Functional Java or Google Collections. Scala also has immutable data structures as part of its standard library.

It’s time that we graduate from punching buttons and pulling levers — manipulating the machines. Here’s to programming as it could be and ought to be.

Objects, Identity, and Concept-formation

Coming from a background in Pascal and C, during the 1990s, like most others, I became infatuated with Object-Oriented programming. I thought its proponents were really on to something. It seemed intuitive. I read and re-read the GoF book. I became fluent in UML. I read Chris Alexander. I wrote factories and metafactories, and more class hierarchies than I'll ever count. I embraced Java eagerly, and XML even more so. But with experience came the realisation that I was solving the same problems over and over again. There was some governing principle that I was missing but knew I had to grasp if I were to progress in my chosen profession. I went back to the fundamentals. I learned about LISP, lambda calculi, type theory, relational programming, category theory, and epistemology. And I'm still learning. But in integrating all of that into my existing knowledge, I discovered something that I found liberating:

There is no such thing as Object-Oriented programming.

I realise that I might risk starting a religious war here, but I don't intend to. If you think I'm attacking you personally in some way, please stop reading and come back later with fresh eyes. Note that I'm not saying that OO is a bad thing. I'm saying that there isn't any special kind of programming that is object-oriented as against programming that isn't. It is a useless distinction, in exactly the same way that “race” is a useless distinction. And holding on to a useless concept isn't good for anybody. The term “object-oriented” is at least honest in that it says what it implies, which is a frame of mind, an orientation of the programmer. However, the practical effect of this orientation is anti-conceptual, in that it displaces actual concepts like algebra, calculus, closure, function, identity, type, value, variable, etc. It fosters an aversion to abstraction. So you could say that there are object-oriented programmers.

“Object-Oriented” as a non-concept

Remember, you cannot be called upon to prove a negative. The burden of proof falls on whoever is claiming that there is such a thing as OO. But try as you might, there's no objective definition of what “object-oriented” refers to. It means anything you want, which is to say that it's rather meaningless. Peddlers of “object methodology” readily admit that “object-oriented” means different things to different people. To quote Aristotle: “Not to have one meaning is to have no meaning, and if words have no meaning, our reasoning with one another, and indeed with ourselves, has been annihilated.”

When I say that there’s no such thing as OO, I mean, more precisely, that there exists some abstraction (or several) that is referred to as “object-oriented”, but that this abstraction has no actual referent in reality. It is an abstraction that is made in error. It is not necessary, and serves no cognitive purpose. It is “not even false”.

This is to say that it’s not like the concept of Santa Claus, which is a fiction, a falsehood. OO is not a fiction, but a misintegration. It is a term that sounds like a concept, but denotes nothing more than a vague collection of disparate elements, any of which can be omitted, and none of which are essential.

A Proper Method of Concept-Formation

How is Object-Oriented a non-concept? Let’s demonstrate that very cursorily. First, we have to explicitly identify what it means for a concept to be valid.

Valid concepts are arrived at by induction. Induction is the process of integrating facts of reality by observing similarities and omitting particulars. Or, if you will, the process of applying logic to experience. Our induction must adhere to the corollary laws of identity, noncontradiction, and the excluded middle, and it must observe causality.

Note that a correct concept of concepts is the vehicle and not the cargo of this post. In brief, a valid concept or theory must meet the following criteria (hat tip: David Harriman):

  1. It must be derived from observations by a valid method (by logical induction from experience or experiment as described above). A valid concept can have no arbitrary elements and no relations that are unsupported by the facts being integrated.
  2. It must form an integrated whole in which every part supports every other. It cannot be a conglomeration of independent parts that are freely adjusted to fit a given situation.
  3. It must be no more and no less general than required to meet criteria 1 and 2.

OO doesn’t meet criterion #1. There’s nothing about the nature of programming that gives rise to a concept of object-orientation. Rather, the notion is sparked by the desire to make an analogy between manipulating physical objects in the real world and manipulating abstract entities in the computer. The motivation is understandable, to help people think about programs in the same way that they are used to reasoning about physical objects. But the analogy isn’t warranted by the facts, and therefore it is arbitrary. The concepts simply don’t transfer, because they are abstractions over entities of a different kind. The mistake is to use the concepts anyway, choosing the illusion that software is like physical systems rather than understanding the actual nature of programs.

OO doesn’t meet criterion #2: There’s no consistent definition of object-orientation. It’s a loose grab-bag in which nothing is essential. Even if you start removing things, it’s not at all clear at what point your programming stops being object-oriented. One would think that it’s at the point where you stop programming with objects, but that’s merely begging the question since it’s not clear what kind of entity is an object and what kind of entity isn’t. We are to rely on intuition alone in discovering that. What’s more, some of the contents of the “object-oriented” bag contradict one another (both inheritance and mutation contradict encapsulation, for example).

Repairing the Disorientation

Of course, it does no good to tear down the cognitive package-deal that is “object-oriented” if you don’t replace it with something. If you entered a village of little blue creatures and convinced them that the term “smurf” had no cognitive utility, you would have to teach them a whole host of concepts to use in its stead. There are two main ideas that the anti-concept “object-oriented” annihilates:

  1. It introduces the false dichotomy of identity as apart from equivalence.
  2. It blurs the distinction between values and variables.

I will elaborate on each of these points in two follow-up posts. Stay tuned.