Tuesday, February 25, 2014

SportsTribution - Gooooood day sweet world!

Hello everybody (all 13 people that will probably find this blog),
as this is my first blog entry, here are some things about me. My name is Hannes (short for Johannes) and I'm German (which will explain my grammar and love for long sentences). I previously studied math in Germany and am now doing a PhD in bioinformatics (this part explains my interest in data). I also play very unprofessional basketball and try to follow the NBA actively (as actively as is possible when you can't watch the games live due to things like time zones).

I recently dove head first into the data on http://stats.nba.com and started writing a small program that I called SportsTribution¹. SportsTribution lets you look at two kinds of data at the same time, and you will see more about it very soon. I am sure that some people will find it crowded, but I promise that the four readers that are still reading right now will quickly get used to it.
It is available for free (right now only upon request, but I promise to change this quickly) and I am happy about any kind of criticism (other than 'your stuff sucks!' of course...). Also feel free to publish content created with it.

SportsTribution is a great way to see outliers that are usually hidden because they only show up when you look at more than one type of data at the same time. My favorite example (so far) is a plot called 'Josh McRoberts treats the ball like it's a hot potato!'
Comparing minutes of ball possession with the number of passes. Both values are normalized to 36 minutes of playing time per game. Players are filtered by games played (at least 40) and minutes per game (at least 30). Data published on nba.com on 24/02/2014.
The plot shows a typical disadvantage of this kind of figure, namely that some parts are unreadable. This can be partly avoided by slightly moving the names afterwards (using a vector graphics program like Inkscape or Illustrator).

But in general, if you are interested in outliers, this problem will not occur. And you can see much more information than by simply looking at the numbers. For example, you can clearly see that all players with a high time of ball possession are point guards, followed by a second group of point forwards (James, George) and players that share ball carrying duties. Neither is surprising, but it is still stunning to see how clearly those groups differ, which is not obvious from the numbers alone.
And then for all other players the time of possession is very similar, while the number of passes varies widely.
This basically gives you ball 'sieves' at the bottom (in this case Klay 'the Luigi of the Splash Brothers' Thompson) and ball 'sources' like Pass McRoberts and Joakim 'Tornado Shot' at the top. I know that this plot doesn't answer anything, but I'll try to channel my inner Zach Lowe to get better at finding the answers.
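
For the curious, the normalization and filtering described in the plot caption can be sketched in a few lines of Python. This is only a minimal sketch and not the actual SportsTribution code: the class, field and function names are made up for illustration, the sample numbers are invented (they are not real NBA data), and it assumes that the stats come as per-game averages.

```python
from dataclasses import dataclass


@dataclass
class PlayerLine:
    """One player's season line. All field names are hypothetical."""
    name: str
    games: int
    minutes_per_game: float
    possession_min_per_game: float  # minutes of ball possession per game
    passes_per_game: float


def per_36(value_per_game: float, minutes_per_game: float) -> float:
    """Scale a per-game value to what it would be at 36 minutes per game."""
    return value_per_game * 36.0 / minutes_per_game


def scatter_points(players, min_games=40, min_minutes=30.0):
    """Apply the caption's filters and return (name, x, y) tuples.

    x = time of possession per 36 minutes, y = passes per 36 minutes.
    """
    points = []
    for p in players:
        # Keep only players with at least 40 games and 30 minutes per game.
        if p.games < min_games or p.minutes_per_game < min_minutes:
            continue
        x = per_36(p.possession_min_per_game, p.minutes_per_game)
        y = per_36(p.passes_per_game, p.minutes_per_game)
        points.append((p.name, x, y))
    return points


# Invented sample data, only to show the mechanics.
sample = [
    PlayerLine("Player A", 50, 32.0, 2.0, 64.0),
    PlayerLine("Player B", 35, 34.0, 6.0, 50.0),  # too few games
    PlayerLine("Player C", 55, 25.0, 1.5, 30.0),  # too few minutes
]

for name, possession, passes in scatter_points(sample):
    print(f"{name}: {possession:.2f} min of possession, {passes:.1f} passes per 36 min")
```

Each resulting tuple is one dot in the scatter plot: time of possession on one axis, number of passes on the other. A player far away from the main cloud in both directions at once is exactly the kind of outlier that a single sorted column of numbers hides.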

So to finish my introductory post²: SportsTribution can get messy, but it is really great if you want to see more than the Top 5 players in any statistic.

How? That's something you will see in my future posts... (Cliffhanger alert!)

¹ I would have preferred ScatterSports, but that Twitter handle was already taken.
² I guess the only people reading by now are my parents - Hallo ihr Zwei! (Hello, you two!)

