Tuesday, 12 November 2013

Feeling Puddled

A bit of a change of pace on the blog. I've not posted anything remotely nerdy for ages, so here is a post containing four very nerdy things: functional programming, Haskell, Scala and Twitter interview questions. The Twitter interview in question was not mine, but instead was one posted about by Michael Kozakov on his blog post "I Failed a Twitter Interview."

So, interview questions, eh? This Michael fella thinks he failed the interview by getting the question wrong, but you and I know that's not how it works. The interviewer is more interested in finding out how you go about solving problems than whether you get this particular problem right. After all, unless they have some nasty leaky roof scenario, I can't imagine there being a particular pressing need for Twitter to need their interview candidates to get this one spot on.

That all being said, I don't much care about the problem solving technique. I care about figuring it out and making a really expressive, concise version that looks like awesome code. Because I'm vain.

My first attempt, I will admit, sucked. It was an imperative, state-machine based algorithm that tried to calculate puddles by walking the landscape and figuring out if it was in, out, or around about a puddle. It was awful. It was messy, hard to grok, and about 1000 lines long. Terrible, terrible, terrible. I even fell into the "local maxima" trap described in the blog post, but didn't notice until it was too late.

But of course, a little niggly bit of me knew there must be a good, functional way to do it using a good, functional language. Seeing a pretty elegant faux-functional solution done with PHP's awful array manipulators made me even more determined that there must be a really nice way to do this.

The problem itself is simple; for a given landscape defined by integers, calculate what volume of water would stay in any puddles left due to run-off being impeded. I decided to solve it using Haskell (a safe, useless language) because that's what nerds do, and also using Scala because I happen to be messing about with it. Scala has the advantage of allowing both the imperative and functional versions to work side-by-side. Some may not see this as an advantage.

Consider the following landscape, as defined by an array of integers:

My random landscape. Stunning.
It should be fairly obvious where the puddle should end up. The problem is how to calculate it. From the imperative solution I knew that the trick lay in calculating cumulative maxima, both starting at the left and working right, and also starting at the right and working left. This would give me the "shape" of the puddle.

Haskell:

maxl w = scanl1 max w

Scala:

def maxl(w: List[Int]) = w.scanLeft(0)(max(_,_)).tail

Hmm ... it's already evident that Haskell is at least more brief than Scala, and that Scala's scanLeft function needs a tweak or two. However, the resulting list is the same for each:

Cumulative maxima, as seen from the left.

Good. This is the shape of the landscape, from the perspective of the left hand end. Not terribly useful on its own. We need its brother, maxr:

Haskell:

maxr w = scanr1 max w

Scala:

def maxr(w: List[Int]) = w.scanRight(0)(max(_,_)).init

Haskell's code looks very similar to the previous function. So does Scala's, but there are a couple of weird things about it. The use of "init" to strip the trailing 0 left over by the binary nature of the function, as opposed to the cleanliness of Haskell's unary scanr1 function. Still, the results are, again, the same:

Cumulative maxima, as seen from the right.

Stunning. It's evident that in each of the shapes the side we start at is shaped to the landscape, and the opposite side is largely flat. We can make use of this by fitting the shape to the landscape by calculating the intersect of the two shapes. Due to the nature of the shape (a list of positive integers) we can just use "min" to calculate the intersect. This gives us the maximum water levels for each column.

Haskell:

levels w = zipWith min (maxl w) (maxr w)

Scala:

def levels(w: List[Int]) = (maxl(w), maxr(w)).zipped map(min(_,_))

The code isn't terribly hard to understand, but Haskell's brevity is growing on me. But did it work? What shape do we see now?

An intersect of the left/right maxima showing the puddle.

Aha! The shape now conforms, mostly, to the shape of the ground. There is one clear exception, though. The puddle in the middle is still filled in. So what is the difference between the shape we see and the shape we want? The ground, of course. If we subtract the ground, we should be left with just the shape of the puddle.

Haskell:

water w = zipWith (-) (intersect w) w

Scala:

def water (w: List[Int]) = (levels(w), w).zipped map(_-_)

Again with the brevity, here, Haskell. Scala's weird anonymous function syntax crops up a few times, this time with the weird looking operator _-_. For the uninitiated, it means "subtract the 2nd anonymous argument from the 1st anonymous argument." So there.

Subtract the ground to just leave the puddle we wanted.

And that's it! We have the shape of our water. All we need to do is calculate the volume. Easy peasy.

Haskell:

sum $ water [ 2, 4, 5, 4, 2, 3, 4, 6, 5, 2 ]

Scala:

water(List(2, 4, 5, 4, 2, 3, 4, 6, 5, 2)).sum

Not much in it in terms of brevity, here. And the result of both, as you can probably guess, is 7.

The full listings of the two programs are here:

Haskell:

import System.Environment

main :: IO ()
main = print (sum $ water [2, 4, 5, 4, 2, 3, 4, 6, 5, 2])

maxl w = scanl1 max w
maxr w = scanr1 max w
levels w = zipWith min (maxl w) (maxr w)
water w = zipWith (-) (levels w) w

Scala:

import math._

object Water {

    def maxl(w: List[Int]) = w.scanLeft(0)(max(_,_)).tail
    def maxr(w: List[Int]) = w.scanRight(0)(max(_,_)).init
    def levels(w: List[Int]) = (maxl(w), maxr(w)).zipped map(min(_,_))
    def water(w: List[Int]) = (levels(w), w).zipped map(_-_)

    def main(args: Array[String]) {
        println(water(List(2, 4, 5, 4, 2, 3, 4, 6, 5, 2)).sum)
    }
}

Pretty similar, all in all. A nice, elegant solution that still looks elegant when written down. Arguably Scala's proliferation of types, dots and parentheses looks a bit messier.

Extra bonus content!

I did wonder if I could get the program to output an ASCII representation of the resulting ground and water. Turns out it's pretty easy. Here I've added a line and render function to the Scala version.

import math._

object Water {

    def maxl(w: List[Int]) = w.scanLeft(0)(max(_,_)).tail
    def maxr(w: List[Int]) = w.scanRight(0)(max(_,_)).init
    def levels(w: List[Int]) = (maxl(w), maxr(w)).zipped map(min(_,_))
    def water(w: List[Int]) = (levels(w), w).zipped map(_-_)

    def line(g: Int, w: Int, m: Int) = ("#" * g) + ("~" * w) + (" " * (m-w-g))
    def render(g: List[Int], w: List[Int]) = {
        val graph = (g,w).zipped.map(line(_,_,g.max+1))
        graph.transpose.reverse.map(_.mkString(""))
    }

    def main(args: Array[String]) {
        val g = List(2, 4, 5, 4, 2, 3, 4, 6, 5, 2);
        val w = water(g)
        render(g, w) foreach println
        println("\nTotal volume of water: " + w.sum)
    }
}

And the output:

$ scala Water.scala

       #
  #~~~~##
 ###~~###
 ###~####
##########
##########

Total volume of water: 7

I'll leave working out what it does as an exercise for the reader, as well as a modification to allow it to accept a list of integers on the command line.

Extra, Extra bonus content!

I just rediscovered this post after a long time of not posting anything at all. I should sort that.

Anyway I've been messing about with Python for the first time in forever and have been playing with functional style in a language that doesn't do immutable or checked types (except via maybe the wonderful Pyrsistent package). So here is my solution to this problem in Python.

from operator import sub
from itertools import izip

def _scanl(fn, xs, a):
    for x in xs:
        a = fn(a, x)
        yield a

def _scanr(fn, xs, a):
    xsl = list(xs)
    s = _scanl(fn, reversed(xsl), a)
    return reversed(list(s))

def water(xs):
    maxl = _scanl(max, xs, 0)
    maxr = _scanr(max, xs, 0)
    levels = (min(x) for x in izip(maxl, maxr))
    return (sub(*x) for x in izip(levels, xs))

Not bad, eh? The only reason it's longer than 5 lines is because Python doesn't seem to have scanl (or scanr) built in. They are trivial to build, though. Just doubles the line count.

I have, of course, added calculation of the sum of volumes and a renderer. Why wouldn't I? Here is that bit:

def render(landscape):
    depth = max(landscape) + 1
    line = lambda l, w: ["#"] * l + ["~"] * w + [" "] * (depth - w - l)
    graph = [line(l, w) for l, w in zip(landscape, water(landscape))]
    return ["".join(row) for row in reversed(zip(*graph))]

if __name__ == "__main__":
    from sys import argv

    landscape = map(int, argv[1:]) if len(argv) > 1 \
                else [2, 4, 5, 4, 2, 3, 4, 6, 5, 2]

    for line in render(landscape):
        print line

    print "Volume is %d" % sum(water(landscape))

Again, how it actually works is left as an exercise for the reader. Oh, and of course, the output:

$ python puddle.py 2 4 5 4 2 3 4 6 7 9 8 8 5 4 3 3 3 3 4 5 6 5 4 3 1 1 2 4 3 1

         #
         ###
        ####
       #####~~~~~~~~#
  #~~~~######~~~~~~###
 ###~~########~~~~#####~~~~#
 ###~###################~~~##
########################~~###
##############################
Volume is 34

No comments:

Post a Comment