Kubatko et al. (2007), many advanced basketball metrics are now normalized by possession. We can say, for example, that NBA teams scored 100.01 points per game or, just as easily, 104.5 points per 100 possessions last season. Both are statements about the volume of scoring, but the second is normalized to a standard game length of 100 possessions, which is useful when comparing players and teams with different styles of play.

However, one challenge in estimating per-possession statistics is that possession totals are nowhere to be found in box score game summaries. Instead, the best way to get accurate totals is to read the play-by-play for each game and track the possession changes as they occur. Since this would be too exhausting to do manually, we used some simple Python code to automatically grab NBA play-by-play and parse the text.

We counted possessions for the 2014-15 season. In their original work, Kubatko et al. (2007) counted possessions for four seasons in the mid-2000s: 2002-03 to 2005-06.
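To make the per-possession normalization concrete, here is a minimal sketch of the arithmetic. The helper name `per_100_possessions` is our own for illustration; the inputs are the league scoring figure quoted above and the 2014-15 possession average reported in this post.

```python
def per_100_possessions(stat_per_game, possessions_per_game):
    """Rescale a per-game total to a rate per 100 possessions."""
    if possessions_per_game <= 0:
        raise ValueError("possessions_per_game must be positive")
    return 100.0 * stat_per_game / possessions_per_game


# Figures quoted in this post (league-wide, per team per game).
points_per_game = 100.01
possessions_per_game = 95.7  # 2014-15 average from our play-by-play counts

rate = per_100_possessions(points_per_game, possessions_per_game)
print(f"{rate:.1f} points per 100 possessions")
```

With these inputs the rate comes out to about 104.5, in line with the per-100-possession figure quoted above. The same helper works for any counting stat (rebounds, turnovers, and so on).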
When comparing the total number of possessions, we find…
For the 2002-03 to 2005-06 seasons, NBA teams averaged roughly 93.1 possessions per game.
For the 2014-15 season, NBA teams averaged roughly 95.7 possessions per game!

Is this change significant? A t-test suggests that it is *highly* significant (p << 0.01): we are seeing significantly more possessions per game nowadays than in the mid-2000s.

What’s causing this change?

We fit a linear model for possessions in the same way as Kubatko et al. (2007), i.e. by ordinary least squares (OLS), expressing possessions in terms of seven box score stats: FGA, FG missed, FTA, FT missed, OREB, opponent DREB, TO.

The Kubatko 2007 model was…

Fitting with the 2014-15 data, we get the following model:

Right off the bat, we observe that all coefficients have the same sign and relatively similar values. Moreover, comparing each model's predictions with actual possession numbers from 2014-15 gives R^2 = 0.9483 for the Kubatko (2007) formula and R^2 = 0.9496 for the updated (2015) formula. So the old and new formulas have similar accuracy.
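For readers who want to try this kind of fit themselves, here is a rough sketch of an OLS regression on the seven box score stats, along with an R^2 calculation. It is not our actual analysis code. The data are randomly generated, and `TRUE_COEFS` is a set of made-up coefficients chosen only so the fit has something to recover; they are neither the Kubatko (2007) values nor our 2014-15 values. The sketch sticks to the standard library by solving the normal equations directly, whereas in practice a library routine such as `numpy.linalg.lstsq` would be the more usual choice.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return [intercept, b1, ..., bk] minimizing the sum of squared errors."""
    design = [[1.0] + list(r) for r in rows]  # prepend the constant term
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y_i for d, y_i in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, row):
    """Apply a fitted formula to one row of box score stats."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


def r_squared(coefs, rows, y):
    """Share of the variance in y explained by the formula."""
    mean_y = sum(y) / len(y)
    ss_res = sum((y_i - predict(coefs, r)) ** 2 for r, y_i in zip(rows, y))
    ss_tot = sum((y_i - mean_y) ** 2 for y_i in y)
    return 1.0 - ss_res / ss_tot


STATS = ["FGA", "FG_missed", "FTA", "FT_missed", "OREB", "opp_DREB", "TO"]

# ASSUMPTION: made-up coefficients (intercept first), for illustration only.
TRUE_COEFS = [10.0, 0.9, 0.1, 0.4, 0.05, -0.8, 0.1, 1.0]

rng = random.Random(0)
# ASSUMPTION: synthetic "games"; real rows would come from box scores.
rows = [[rng.uniform(5, 90) for _ in STATS] for _ in range(200)]
y = [predict(TRUE_COEFS, r) + rng.gauss(0, 1.0) for r in rows]

coefs = fit_ols(rows, y)
print("intercept:", round(coefs[0], 3))
for name, c in zip(STATS, coefs[1:]):
    print(name, round(c, 3))
print("R^2:", round(r_squared(coefs, rows, y), 4))
```

To compare an older formula against newer data, as we do above, you would pass the older coefficient list to `r_squared` together with the newer season's rows and possession counts.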
However, the formulas aren’t identical. The constant term is higher in the 2014-15 model, consistent with the idea that there are now more possessions per game. Also, moving from the mid-2000s to now, the contribution of OREB decreases, which implies that there are more offensive rebounds per possession now than ten years ago. To see this inverse relationship, imagine two games, one in each decade, and suppose both games have 100 possessions and identical box score stats, except for offensive rebounds. Solving each formula for the number of offensive rebounds, we see that it is inversely proportional to the OREB regression coefficient. A smaller coefficient therefore suggests there are more offensive rebounds per possession now than in the mid-2000s. While the change is not quite significant at the 95% confidence level, it suggests a trend that may become significant with more observations.

In upcoming posts, we will continue breaking down possessions and discussing the methods used in NBA analytics. We hope that those interested in collaborating will offer feedback and consider writing with us.

[post script] Brain & Basketball believes in reproducible science and open source software. Let us know how we can help and what we can provide for your own analysis.