Sunday, April 17, 2011

Implementing an NBA Playoff Bracket in F#

I'm a huge fan of the NBA and sports in general and for years I've been fascinated with the way the NBA structures their playoff bracket. They take the top 8 teams from each conference and seed them based on which teams have to most wins. There are 2 conferences in all so that makes 16 playoff teams each year. The top seed plays the worst team, the second seed plays the seventh team and so on and so forth. So I thought it'd be cool to actually code this setup in F#.

I started with a method that knows how to read in each team from a text file.

// Learn more about F# at http://fsharp.net

open System
open System.IO
open System.Reflection

type Conference =
    | Eastern
    | Western

type Team = {
    Name : string;
    Wins : int;
    Losses : int;
    Conference : Conference
}

let filename = (Assembly.GetExecutingAssembly().Location |> Path.GetDirectoryName)  + "\Teams.txt"
let totalgames = 82
let max = totalgames + 1
let r = new Random()
let conferencesize = 15
let playoffteamsperconference = 8
let half = playoffteamsperconference / 2

let getteams() =
    seq {
        let mapteam conference (l : string) =
            let wins = r.Next(0, max)
            { Name = l.Trim(); Wins = wins; Losses = totalgames - wins; Conference = conference;}

        let teams = filename |> File.ReadAllLines

        // first 15 teams are the eastern conference
        let eastern = teams
                        |> Seq.take conferencesize
                        |> Seq.map (mapteam Eastern)
        
        yield! eastern

        // last 15 are the western conference
        let western = teams
                        |> Seq.skip conferencesize
                        |> Seq.take conferencesize
                        |> Seq.map (mapteam Western)

        yield! western
    }

The chunk of code within the seq{...} scope is known as a computation expression. This particular type of computation expression is called a sequence expression. It's a language integrated feature of F# that allows you to use certain operators based on a set of methods you implement. In this case, the compiler will translate my call of yield! to a method call that knows how to accept a sequence and return it's values. Computations in F# are monads; a fundamental feature of all functional languages. Believe it or not, the infamous LINQ as you know it is based on monads.

Amusingly enough, you've been using monads in .NET for quite a while even if you aren't a normal user of LINQ. Ever used Nullable<T>? It's a maybe monad. Even the infamous jQuery is a monad. It either has a value or it doesn't. In F# we represent this type of construct using the generic option type. Options are a discriminated unions. They're represented by Some 'T or None; where 'T is a generic type argument.

An example is

let o = Some 5
let p = Some "string"

Now o is an int option and p is a string option.

This is cool because we never have to worry about a value being null. Null isn't even a valid value in F#, but it is however a valid .NET value.

How do we find out if o or p has a value? We have to use pattern matching.

let printifhasvalue optionvalue =
    match optionvalue with
    | Some v -> printfn "%A" v
    | _ -> ()

You can think of patterns as switch statements. And that's exactly how the compiler translates them. When using pattern matching, you have to handle all cases or the compiler will yell at you. In my case I only have to worry about Some and None. I accounted for Some with the first check, and I used the wildcard value, _, to handle all other cases. If there were a third option, we'd have a problem because there would be 2 more possibilities. The compiler knows that None is the only other option so we're ok.

So when inter operating with other .NET libraries we do have to account for it. F# libraries never return null though. As for discriminated unions, I'll talk more about those later. As a heads up, my Conference type is one.

The map function accepts a function that can take a value and transform it into another type of value. It's just like Select in LINQ. When I called Seq.map (mapteam Western), I used a concept called partial application.

The expression (mapteam Western) isn't actually invoking the mapteam function. It actually returns a function that accepts the remaining arguments for mapteam. In our case it's the actual team. This is also known as currying. If mapteam took 3 arguments, I'd get a compile time error because I'd get a function back that accepts 2 arguments as opposed to 1. In that  case I'd have to do (mapteam Western arg2), to get a function that accepts only 1 argument. Pretty cool.

Most everything is a function in F#. Event the + and - operators. Don't believe me? You can use the operators as functions by wrapping them in parenthesis like so: (+). The output of that expression is a function that accepts 2 ints and returns another. Or (int -> int -> int).

I also needed a datastructure to store each team and its properties. Specifically I used what's known as a record in F#. It's completely separate from a class and the two are not synonymous.

There are 82 games in an NBA season so I wanted to randomly generate the record for each team. I generate a number between 0 and 82 inclusively for the wins and subtract that from the remaining games in the season to compute the losses. Pretty simple.

As far as which team each conference belongs to, I got the names of each team from the NBA.com website and typed them in order into my Teams.txt text file. I tried to keep things simple on that regard.

Teams.txt


Chicago
Miami
Boston
Orlando
Atlanta
New York
Philadelphia
Indiana
Milwaukee
Charlotte
Detroit
New Jersey
Washington
Toronto
Cleveland
San Antonio
L.A. Lakers
Dallas
Oklahoma City
Denver
Portland
New Orleans
Memphis
Houston
Phoenix
Utah
Golden State
L.A. Clippers
Sacramento
Minnesota

The cool thing about F# is that it's functional. That means we should implement light weight composable functions. That's exactly the approach I've taken here. Each function builds atop the other. A simple f(g(x)) relationship if you will.

And by the way, all values in F# are immutable by default. That means we can't change the state of something once we've created it. This greatly simplifies multithreaded programming, because you don't have to worry about multiple threads changing the state of your data. You can rest assured that once you give a function a reference to a value, it'll be in that very state once the operation is completed.

F# isn't a purely functional language so we do have mutable properties and values that we can pass around. Just not by default.

So now that we have a function that knows how to fetch each team and generate their record, we'll make something that can consume the output of that function and print each team.

let printteams (teams : seq<Team>) =
    teams
    |> List.ofSeq
    |> printfn "%A"

The printfn function is a cool utility because it knows how to print a generic object. It can be a sequence, a base type, etc. It's just like printf in C. You can pass a format like %s for strings and %d for integers.

I used List.ofSeq from the List module to convert my sequence to a List. I did that because sequences can be infinite in F# so the printfn function wouldn't print out all the values. A List on the other has is finite. So printfn will iterate the entire sequence and print out each element as opposed to the first few.

The type seq<'a> in F# is equivalent to IEnumerable<T> in C#. It represents a possibly infinite list of elements. I work solely with sequences throughout my implementation. All instances of Seq.x represent a set of functions known as the Sequence Module in F#. You can think of it as a class with a bunch of static methods. The funny looking |> syntax is just an operator. It takes a function and its argument and invokes that function passing in the given argument as a value. F(x) once again. I don't have to use this operator, the forward pipe operator as it's known, but I really like it's logical syntax so I kind of abused it here. It's no different than piping output from one command to another when using bash on Linux or any other command shell. Piping input to grep is really nice by the way.

There is also a backward operator that knows how go go in the opposite direction. It accepts x first then pushes it into f. That one reads from right to left like

List.ofSeq <| x

In order to pull of the same syntax without the pipe operator, I'd have to nest all of my return values. It'd look like f(g(x)) or

printfn "%A" (List.ofSeq teams)

In this case, g if my List.ofSeq function which accepts the teams. The teams are of course x. The output of that function is then passed to printfn. That makes printfn g in this equation. It doesn't look so helpful in simple scenarios, but later on you'll witness me using the pipe operator quite aggressively; and I think you'll start to appreciate the elegant syntax it allows you to exercise.

And since F# is functional, functions are first class citizens. They don't have to belong to any particular object in general just like in JavaScript. You can pass them as normal values just like ints, GUIDS, and any other base types. That makes F# a really powerful language.

You're probably wondering how I got away without specifying types. Don't you expect to see int and string? Well the F# compiler implements what's known as type inference. It's able to infer types based on the way they're used. We rarely have to specify types in F#, but I had to do it a few times with my implementaton using type annotations. These appear in the signature of my method. Method signature syntax goes

functionname arg1[type] arg2[type] arg3[type]...argn[type]

F# interprets this as (arg1 -> arg2 -> arg3 -> argn -> returntype)

The last value is the return type. Just like Func<T> in C#.

So anytime you see (blah -> blah), that means a function. If you ever see this from intellisense when hovering over a method, you can believe that method accepts a function as an argument; so be prepared to pass one.

Arguments are delimited by spaces. The type of the argument is optional and is required based on whether the compiler can use its type inference algorithm to infer the type of the argument. We don't need curly braces also, because F# detects scope based on spaces. Four spaces to be exact.

Now that I can get all the teams in the league, I need to group them by their conference. There are 2 conferences in the NBA. The eastern and western conferences. You'd normally represent something like this an an enum in C#, but we have a more functional construct known as discriminated unions in F#. My discriminated union is called Conference. You're probably starting to notice that the functions I'm using look a lot like LINQ. That's because LINQ's roots are tied deeply to functional programming.


let getteamsbyconference() =
    getteams()
    |> Seq.groupBy (fun t -> t.Conference)

I again made a function that knows how to print the teams out. Since I grouped the conferences, they came back as a pair or tuple as we call it. So I have to drill down into the conference to get the teams. Then from their it's business as usual. I can simply reuse the initial function I created that knows how to print a sequence of teams.

let printteamsbyconference (conferences : seq<(Conference * seq<Team>)>) =
    conferences
    |> Seq.iter (fun (_, teams) -> teams |> printteams)

You probably noticed the weird syntax I used in my function to iterate the conferences. It another form of pattern matching.  As I mentioned before, the _ represents the wildcard character. That means I don't care about the result of the first value; which in this case is the conference. The syntax I used is a pattern for tuples because its wrapped in parenthesis and delimited by a comma. That means I working with a pair, but I could have easily been working with a triple, quadruple, and so on and so forth. I could have called my teams parameter whatever I like. The name you give to your parameters is completely arbitrary.

I can get each conference and its respective teams now. It's time to make something that knows how to get the best 8 teams from each conference.

let getplayoffteams() =
    getteamsbyconference()
    |> Seq.map (fun (c, teams) -> (c, teams |> 
                                      Seq.sortBy (fun t -> t.Losses)
                                      |> Seq.take playoffteamsperconference))

For each conference, I sort the teams in the conference by the number of losses they have. Logically you'd think I've order by wins, but the sortBy function orders in ascending order. That means the teams with the least number of losses will be at the front of the pack. Logically the teams with the lest number of losses are the best right? That means they have the most number of wins. After sorting the teams, I take the top 8 teams from each conference and return them as a tuple to pair them with their conference.

Anytime you see the (fun x -> ...) syntax, that represents a lambda. Lambdas are pretty big in functional languages. It's with the lambda symbol that we mathematically denote a function. That's some old school history related stuff and it's admittedly pretty boring. It is kind of nice to know though.

The last step is to order the teams and make the final bracket.

let printplayoffbracket() =
    getplayoffteams()
    |> Seq.iter (fun (c, teams) -> 
                        Console.ForegroundColor <- ConsoleColor.Red

                        printfn "%A conference matchups\n" c

                        let topfour = teams |> Seq.take half
                        let bottomfour = teams |> Seq.skip half |> Seq.take half |> Seq.sortBy (fun t -> t.Wins)
                        
                        Console.ForegroundColor <- ConsoleColor.Yellow

                        bottomfour
                        |> Seq.zip topfour 
                        |> Seq.iter (fun (topseed, bottomseed) -> 
                                        printfn "%s (%d-%d) vs %s (%d-%d) \n" topseed.Name topseed.Wins topseed.Losses bottomseed.Name bottomseed.Wins bottomseed.Losses)
                        Console.ResetColor())

We consume the playoff teams and match the best teams against the worst teams. I used closures in this case. I know the top for teams are at the front of the pack, so I simply took the first 4 teams out of the 8 available in each conference. After that, I took the last 4 teams and sorted them by the number of wins they had. Again, you'd logically then I'd sort them by the number of losses they had. But again I know the worse teams are the teams with the least number of wins. The zip function knows how to pair up each member of one sequence with a member of another sequence. It will do this for each pair it can find. If one sequence is bigger than the other, it'll stop pairing elements from each sequence once it's paired the number of elements equivalent to the smallest sequence.

Now it's time to watch our little composable puppies in action

do getteams() |> printteams
do getteamsbyconference() |> printteamsbyconference
do getplayoffteams() |> printteamsbyconference
do printplayoffbracket()


We use the do keyword in F# to execute imperative code. That's code that doesn't return a value and just executes an action. It'll have the return type unit, or void in C#. In F# we always have to return a value. It'll either be an actual value like a record or tuple, or unit. Unit is denoted by (). So to return unit from a function you just write

let f() =
    ()

The function I definte above not only returns unit, but accepts it as an argument. So event when you think you're calling a parameterless method in F#, you're really not. And when you think you're not returning anything, you actually are.

And that's it. We started out by making a function that could read in each team and generate wins and losses for it. Then we grouped each one of those teams into the right conference. Next we were able to take the top 8 teams from each conference, which were our playoff teams. And lastly, we pair up the best teams with the worse teams in each conference just as the NBA does it.

If you want to try out my code, you can download F# and fsharp.net. It's deployed as its own toolset, independent of Visual Studio. I'd recommend Visual Studio so you can have intellisense though. If you want to get down and dirty, you use the F# interactive command shell. It's an interpretor so you're allowed to execute raw code without compiling it.

Don't forget my favorite 2 books. Real World Functional Programming and Expert F# 2.0. Also checkout my 2 favorite guys, the authors of my 2 favorite books, Thomas P and Don Syme.

I think it's safe to say we implemented map reduce here.

Cheers!!


Source can be found here.

4 comments:

  1. Cool! Haha, Sacramento is 82-0 in your example. Go Hawks!

    ReplyDelete
  2. Lol @Misha. Yes. Go Hawks and Lakers.

    ReplyDelete
  3. This code may be biased to the Lakers, don't know how yet, but I know it's there.

    ReplyDelete
  4. Cool.. very nice work Antwan !!

    In the example you listed, only Heat exists in current playoffs for Eastern Conference.. So, its "Go Heat" always.. Hahaha :)

    ReplyDelete