Hello everybody!
I recently hit 1000 viewers, which means - actually nothing (memo to myself: try to be first responder to as many Grantland posts as possible). Some weeks ago, after ESPN published Real Plus-Minus (RPM), I dabbled in comparing it to 'normal' box score stats, and I have to admit the result was partly rubbish (but at least it looked cool!).
(Note: If you get bored during the next paragraphs, simply scroll down to the new fancy figures...)
The biggest problems, in my opinion, are that RPM already includes box score stats and that I used a model that compared only one stat with RPM at a time. The first problem is obvious: if I compare assists with a stat that already has 'assists are super!' baked in, then I learn nothing. The second problem is a little trickier. Imagine that assists correlate with turnovers (which is actually true, so you do not have to try very hard). This distorts the analysis: turnovers are generally seen as negative and assists as positive (for some strange reason), but looked at one at a time, both can show up as positively correlated with RPM, simply because players who collect a lot of assists also tend to collect a lot of turnovers.
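To make the second problem concrete, here is a minimal sketch with entirely made-up numbers (no real NBA or RPM data). It assumes a toy world where assists truly help, turnovers truly hurt, and turnovers rise with assists. The variable names and coefficients are illustrative assumptions, not anything from the original analysis.

```python
import numpy as np

# Synthetic toy data: NOT real player stats, just an illustration.
rng = np.random.default_rng(0)
n = 2000

assists = rng.normal(0.0, 1.0, n)
# Assumption: turnovers go up with assists (ball handlers do both).
turnovers = 0.5 * assists + rng.normal(0.0, 0.3, n)
# Assumption: the "true" value rewards assists and punishes turnovers.
value = 1.0 * assists - 1.0 * turnovers + rng.normal(0.0, 0.5, n)


def simple_slope(x, y):
    """Slope of a one-stat-at-a-time straight-line fit of y on x."""
    return np.polyfit(x, y, 1)[0]


slope_assists = simple_slope(assists, value)
slope_turnovers = simple_slope(turnovers, value)

print(f"assists alone:   {slope_assists:+.2f}")
print(f"turnovers alone: {slope_turnovers:+.2f}")
```

In this toy setup the turnover slope comes out positive even though turnovers were built to hurt, because the one-stat fit cannot separate turnovers from the assists that travel with them.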
So I started using multiple linear regression, which sounds more dangerous than it is. Linear regression is basically this: you have beans lying on the floor. Put a stick on the floor so that the beans are, on average, as close to the stick as possible. In the case of two factors, your beans are floating in space, and you have to put on your space suit and adjust a board so that the beans are as close to it as possible. I decided not to go further than two dimensions for the moment. That's probably another post.
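For the curious, the stick and the board can be sketched in a few lines of NumPy. This is a rough illustration on synthetic beans, not the analysis from this post: the data, the coefficients, and the helper name `fit_least_squares` are all assumptions made up for the example. One caveat on the analogy: ordinary least squares measures each bean's distance straight up or down from the stick (squared), not the shortest distance to it.

```python
import numpy as np

# Synthetic beans: two correlated factors and an outcome (made-up numbers).
rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(scale=0.3, size=n)
y = 2.0 + 1.0 * x1 - 1.0 * x2 + rng.normal(scale=0.5, size=n)


def fit_least_squares(columns, y):
    """Ordinary least squares; returns [intercept, slope_1, slope_2, ...]."""
    # A leading column of ones lets the fit choose an intercept.
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


stick = fit_least_squares([x1], y)       # one factor: a line
board = fit_least_squares([x1, x2], y)   # two factors: a plane

print("stick (intercept, x1):    ", np.round(stick, 2))
print("board (intercept, x1, x2):", np.round(board, 2))
```

With both factors in the fit, the board's slopes should land near the values used to generate the beans (about +1 for `x1` and -1 for `x2`), while the stick's single slope mixes the two effects together.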