Friday, January 30, 2015

Revisiting Stats stabilization

A warning before you start reading this: you can find a more polished version at Nylon Calculus (memo to myself: add link here as soon as you have it). They also published another piece of mine and have a lot of other great stuff. But if I drew a Venn diagram of the people who read my blog and the people who read Nylon Calculus, I am pretty sure that you know all this already...
This version has a few more (probably boring) details on why I find the previously used methods impractical. It also has a few more shiny plots, which in the end were not helpful for understanding. So, if you are here for the shiny plots, scroll down to the end. There is also an R script so that you can produce shiny plots yourself. You can find a GitHub repository for the R function that I wrote here.
Side note: One reason for this blog entry is that I'm starting to move from Matlab to R. If you find technical flaws in it, let me know. :)

Hello everybody,
Over the last few years, there seems to have been one main way to estimate the stabilization of a stat ( http://nyloncalculus.com/2014/08/29/long-take-three-point-shooting-stabilize/ , http://www.fangraphs.com/blogs/stabilizing-statistics-interpreting-early-season-results/ , http://www.baseballprospectus.com/article.php?articleid=17659 ), based on the work of Prof. Dr. Pizza Cutter. While the work itself is technically sound, it has, in my opinion, several drawbacks. In short, the method is unnecessarily complicated, can easily mislead, and is as a result impractical to use. In the following, I will explain these three points of critique while introducing a simpler and more practical method that works perfectly well for a certain kind of commonly measured data.

Thursday, January 8, 2015

Scatter Plot Data

Hi everyone,

I decided to release the code that I am using for my scatter plots, mostly because I am not using it often enough to keep it to myself, so I hope that it is useful for somebody else. Use it at your own risk. Feel free to remove my name or change anything. And so on (we all know these proclamations...).

In my opinion, the output is pretty cool, especially if you combine it with Inkscape or Illustrator (which works well if you save the plots as PDF).
As an example, take the plot I have here: http://sportstribution.blogspot.de/2014/03/three-quick-points-on-kyle-korver-and.html
You just have to use the same axes settings with a different filter or different data points, and you get two plots that you can easily merge (and color differently).
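
To make the "same axes settings" idea concrete, here is a minimal sketch. It is not the released script: it assumes Python 3 with matplotlib installed, and the names shared_limits, save_scatter and demo, the coordinates and the file names are all made up for illustration.

```python
def shared_limits(*series, pad=0.05):
    """Return (low, high) spanning every value in every series, plus a relative margin."""
    values = [v for s in series for v in s]
    low, high = min(values), max(values)
    margin = (high - low) * pad
    return low - margin, high + margin


def save_scatter(xs, ys, xlim, ylim, color, path):
    """Draw one scatter plot with fixed axis limits and save it (PDF if path ends in .pdf)."""
    # matplotlib is imported here so that shared_limits stays dependency-free
    import matplotlib
    matplotlib.use("Agg")  # render to files only, no window needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(6, 6))
    ax.scatter(xs, ys, c=color, s=20)
    ax.set_xlim(xlim[0], xlim[1])
    ax.set_ylim(ylim[0], ylim[1])
    fig.savefig(path)
    plt.close(fig)


def demo():
    """Write two PDFs that share identical axes (made-up coordinates)."""
    a_x, a_y = [1.0, 2.0, 3.5], [4.0, 1.5, 2.0]
    b_x, b_y = [0.5, 2.5, 5.0], [3.0, 6.0, 0.5]
    xlim = shared_limits(a_x, b_x)
    ylim = shared_limits(a_y, b_y)
    save_scatter(a_x, a_y, xlim, ylim, "tab:blue", "group_a.pdf")
    save_scatter(b_x, b_y, xlim, ylim, "tab:red", "group_b.pdf")
```

Calling demo() should leave two PDFs with identical figure size and axis limits, so they can be stacked on top of each other in Inkscape or Illustrator and recolored there.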

Especially for those of you who have the data readily at hand (I'm looking at you, Nylon guys!), it could be worth giving it a try.
This code was my first attempt at learning Python, and it is therefore far from perfect. It is only 400 lines and does not use many packages, so it might also be interesting for those of you who want to learn Python as well.

The code should work with Python 2.7 and the right packages. If someone knows how to update this, I would be glad to hear about it.

Oh, I almost forgot to add the github:  

Cheers,
Hannes