(Updated to add full table and discussion about relative impact of the variables.)
I acquired all of the Division I team data since 2002, and from it we can observe some fascinating trends and relationships. This is a multipart series exploring some of that data.
Now that we are done with shooting stats, let's take a diversion and get a little bit into Points Per Possession. After evaluating 7,788 team seasons of college basketball, it appears that Dean Smith's three key performance indicators, Points Per Possession (PPP), Points Per Possession Allowed (ppp), and Possession Differential (POSSDIF), go a very long way toward explaining a team's winning percentage.
If we run a multiple regression using those three as explanatory variables for Winning Percentage (W%), we find an extremely high R-Squared value of 0.9855. In other words, the model explains 98.55% of the variation in W%. This certainly makes for a very interesting basis for season evaluation.
The result of this regression is an equation we can use to estimate Winning Percentage given the three explanatory variables:
W% = 2.30*PPP - 1.76*ppp + 0.0006*POSSDIF
Where:
W% = Winning Percentage (0.0-1.0)
PPP = Points Per Possession using Smith Method
ppp = Points Per Possession Allowed using Smith Method
POSSDIF = Total Possession Differential on the Season (not per game)
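As a quick illustration, the equation can be written as a small function. This is only a sketch using the rounded coefficients shown above; the function name and the example inputs are made up for illustration and are not real team data.

```python
# Rounded coefficients taken from the regression equation above.
COEF_PPP = 2.30           # offense: points per possession
COEF_PPP_ALLOWED = -1.76  # defense: points per possession allowed
COEF_POSSDIF = 0.0006     # season-total possession differential


def predict_win_pct(ppp_offense, ppp_allowed, possdif_total):
    """Estimate W% on a 0.0-1.0 scale from the three indicators.

    The result is a straight linear estimate and is not clamped,
    so extreme inputs can land outside the 0.0-1.0 range.
    """
    return (COEF_PPP * ppp_offense
            + COEF_PPP_ALLOWED * ppp_allowed
            + COEF_POSSDIF * possdif_total)


# Hypothetical team: 1.00 PPP, 0.90 ppp allowed, +26 possessions on the season.
example = predict_win_pct(1.00, 0.90, 26)
print(round(example, 3))  # prints 0.732
```

Note that POSSDIF goes in as the season total, not the per-game figure, to match the definition above.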
What does this mean? For every hundredth of a point per possession a team improves on offense, its winning percentage improves by 0.023. For every additional hundredth of a point per possession it allows its opponents, its winning percentage decreases by 0.0176. In other words, offense has a 31% bigger effect on winning percentage than defense does (2.30 / 1.76 = 1.31).
If a team has a 1 possession advantage over its opponent per game, it should by now (about 26 games in) have a +26 POSSDIF, resulting in a 0.016 winning percentage increase.
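The arithmetic behind those numbers can be checked in a few lines. This is a sketch using the rounded coefficients from the equation; the 26 games played is an assumption inferred from the +26 POSSDIF figure, and the variable names are just for illustration.

```python
# Rounded coefficient magnitudes from the regression equation.
COEF_PPP = 2.30
COEF_PPP_ALLOWED = 1.76
COEF_POSSDIF = 0.0006

GAMES_PLAYED = 26  # assumption: implied by the +26 POSSDIF example

# Effect of a one-hundredth change in points per possession.
offense_gain = COEF_PPP * 0.01          # about 0.023
defense_loss = COEF_PPP_ALLOWED * 0.01  # about 0.0176

# How much bigger the offensive coefficient is than the defensive one.
offense_edge = COEF_PPP / COEF_PPP_ALLOWED - 1  # about 0.31, i.e. 31%

# Effect of a one-possession-per-game advantage accumulated over the season.
possdif_gain = COEF_POSSDIF * 1 * GAMES_PLAYED  # about 0.016

print(round(offense_gain, 4), round(defense_loss, 4))
print(round(offense_edge, 2), round(possdif_gain, 3))
```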
To increase a team's winning percentage by 1%, it would need to increase its Points Per Possession on Offense by 0.04, decrease ppp allowed on defense by 0.06, or increase POSSDIF by 0.641 per game. Those figures represent 4.3%, 6.7%, and 5.6% of the national averages, respectively. Therefore, it is easiest to increase winning percentage by improving offense, and next easiest by improving rebounding. The most difficult way to improve a team's winning percentage is by improving the defense.
In the following table we see the data for the current season through this morning (2/20/24): the three explanatory variables, the actual W% (WLPCT), the predicted result based on those variables using the equation (WLPRED), and the error (WLDIFF). A large positive WLDIFF means that a team is winning far beyond its expectations, while a very negative number means a team is not meeting expectations.
I haven't experimented with an analysis like this before, but my hypothesis is that very negative WLDIFF values mean that the team is bound to win more games in the near future, while a high WLDIFF means that the team is bound to lose some upcoming games. We'll see how that works out over the next six weeks.
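For anyone who wants to reproduce the WLPRED and WLDIFF columns, here is a rough sketch. It assumes WLDIFF = WLPCT - WLPRED, which is inferred from the description above (positive means winning beyond expectations). The teams and their numbers are entirely hypothetical, not rows from the actual table.

```python
def predict_win_pct(ppp_offense, ppp_allowed, possdif_total):
    """Estimated W% from the rounded regression coefficients."""
    return 2.30 * ppp_offense - 1.76 * ppp_allowed + 0.0006 * possdif_total


# Hypothetical rows for illustration only (not real teams or real data).
teams = [
    {"team": "Team A", "PPP": 1.02, "ppp": 0.88, "POSSDIF": 40, "WLPCT": 0.692},
    {"team": "Team B", "PPP": 0.95, "ppp": 0.93, "POSSDIF": -10, "WLPCT": 0.640},
    {"team": "Team C", "PPP": 0.90, "ppp": 0.95, "POSSDIF": 5, "WLPCT": 0.400},
]

for row in teams:
    row["WLPRED"] = predict_win_pct(row["PPP"], row["ppp"], row["POSSDIF"])
    # Assumed sign convention: actual minus predicted.
    row["WLDIFF"] = row["WLPCT"] - row["WLPRED"]

# Most negative WLDIFF first: under the hypothesis above, these are the
# teams expected to win more often going forward.
ranked = sorted(teams, key=lambda r: r["WLDIFF"])

for row in ranked:
    print(f'{row["team"]}: WLPRED={row["WLPRED"]:.3f} WLDIFF={row["WLDIFF"]:+.3f}')
```

Sorting from most negative to most positive puts the underperforming teams at the top and the overperforming teams at the bottom, which makes it easy to check the hypothesis against results later on.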