Brownlow Medal Predictor

Using a data-driven approach to predict AFL Brownlow Medal votes

Denise Wong
Towards Data Science


Ruck Contest. Image by The-Pope, CC BY-SA 4.0, via Wikimedia Commons

The Brownlow Medal is awarded to the “best and fairest” player in the AFL during the home and away season, as determined by the umpires. After each game, the three field umpires award 3, 2 and 1 votes to the players they regard as the first, second and third best in the match respectively. On the awards night, the votes from each match are tallied and the player(s) with the highest number of votes is awarded the medal, subject to eligibility.

Predicting the Brownlow Medal winner is challenging because “best and fairest” is a subjective judgement formed by the collective opinion of the umpires, which may itself be influenced by public opinion. Additionally, some players are more likely to be within the umpires’ field of vision; historically, midfielders have won the lion’s share of votes.

The purpose of this analysis is to understand the key quantifiable predictors of player actions that contribute to Brownlow votes, and to build a reasonably predictive model that forms the basis for a betting strategy through the season.

Overview of the Baseline Model

The objective of the baseline model is to establish a framework that future iterations of the model can improve upon.

Performance Measure We define the performance measure of the baseline model as the root mean square error (RMSE) between actual and predicted votes; this measure penalises large errors more heavily than small ones.
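Concretely, for N player-match observations with actual votes v and predicted votes v̂:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(v_i - \hat{v}_i\right)^2}$$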

Data The data set comprises approximately 85,000 rows of player-match statistics for the 2012–2021 seasons from the fitzRoy data package. The 2021 season data has been set aside as the testing data set for predictions.
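As a minimal sketch of the train/test split: fitzRoy is an R package, so for the Python sketches in this article we assume the player-match statistics have been exported to a flat file (the file name and column names below are assumptions for illustration).

```python
import pandas as pd

# Hypothetical export of fitzRoy player-match statistics to CSV.
stats = pd.read_csv("player_match_stats_2012_2021.csv")

# Hold out the 2021 season as the out-of-sample test set.
train = stats[stats["season"] <= 2020].copy()
test = stats[stats["season"] == 2021].copy()
```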

Preprocessing The data undergoes some transformations as described below.

  • Player votes are assigned on a per-match basis, which requires normalising the player statistics within each match; this adjusts for expected variance due to weather, venue, etc. It also naturally accounts for the 2020 season, which had shortened game times.
  • We also initially balance the classes in a 4:1 ratio so that the training data is not overly biased towards players who do not receive votes. A sketch of both steps follows this list.
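A minimal sketch of both preprocessing steps, assuming hypothetical column names (match_id, votes) and using a within-match z-score as the normalisation (the exact transform is not specified in the article):

```python
import pandas as pd

def normalise_by_match(df: pd.DataFrame, stat_cols: list[str]) -> pd.DataFrame:
    """Z-score each statistic within its match, so performance is measured
    relative to the other players in the same game (weather, venue and the
    shortened 2020 games all wash out)."""
    out = df.copy()
    grouped = out.groupby("match_id")[stat_cols]
    out[stat_cols] = (out[stat_cols] - grouped.transform("mean")) / grouped.transform("std")
    return out

def balance_classes(df: pd.DataFrame, ratio: int = 4, seed: int = 42) -> pd.DataFrame:
    """Down-sample zero-vote rows to roughly `ratio` of them per vote-getter."""
    vote_rows = df[df["votes"] > 0]
    zero_rows = df[df["votes"] == 0].sample(n=ratio * len(vote_rows), random_state=seed)
    return pd.concat([vote_rows, zero_rows]).sort_index()
```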

Learning Algorithm The underlying learning algorithm is a tree-based framework (xgboost) for classification, which produces both probabilities and expected classes for vote counts on a player-match basis. The model uses only numeric variables and is cross-validated.
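A hedged sketch of one such cross-validated classifier, continuing the sketches above (hyperparameters are illustrative, not the article’s):

```python
import xgboost as xgb
from sklearn.model_selection import cross_val_score

X = train_balanced[numeric_cols]                # numeric predictors only
y = (train_balanced["votes"] >= 1).astype(int)  # e.g. the "at least 1 vote" target

clf = xgb.XGBClassifier(
    n_estimators=500, max_depth=4, learning_rate=0.05,
    objective="binary:logistic", eval_metric="logloss",
)
print(cross_val_score(clf, X, y, cv=5, scoring="neg_log_loss"))
clf.fit(X, y)
```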

Only three players can receive votes in each match, which makes this a classification problem in which we need to interpret predictions in probability terms, and to discern the quality difference between a 2-vote and a 3-vote game.

The model is stacked in the sense that it calculates conditional probabilities for 0–3 votes sequentially, rather than treating this as a single multi-class classification problem. For simplicity we have labelled the individual models prob_B1, prob_B2 and prob_B3, representing the probability of receiving at least 1, 2 and 3 votes respectively.

This allows a form of transfer learning: the predictions of prior stages become inputs into the subsequent model.
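A minimal sketch of the chained structure, under the same assumptions as the sketches above. For brevity it predicts back onto the training rows, although out-of-fold predictions would avoid leakage in practice:

```python
import xgboost as xgb

features = list(numeric_cols)
models = {}
for k in (1, 2, 3):
    y_k = (train_balanced["votes"] >= k).astype(int)  # target: at least k votes
    model = xgb.XGBClassifier(n_estimators=500, max_depth=4,
                              learning_rate=0.05, eval_metric="logloss")
    model.fit(train_balanced[features], y_k)
    col = f"prob_B{k}"
    # The prediction of this stage becomes an input feature for the next one.
    train_balanced[col] = model.predict_proba(train_balanced[features])[:, 1]
    models[k] = model
    features = features + [col]
```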

Model Results: Variable Importance

Variable importance allows us to identify and resolve issues that could reduce the predictive quality of the model. There are over 50 possible predictor variables.

To build a robust model that does not over-engineer the problem, we first want to reduce the number of predictors before adding complexity to the final model.
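Continuing the sketch above, gain-based importance from the fitted prob_B1 stage can be inspected to shortlist predictors:

```python
import pandas as pd

# Gain-based importance for the prob_B1 model (first stage of the chain).
importance = pd.Series(
    models[1].get_booster().get_score(importance_type="gain")
).sort_values(ascending=False)
print(importance.head(15))
```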

Variable Importance for prob_B1 model (Image by Author)

The above chart gives us a good first insight into which variables contribute the most within the prob_B1 model, i.e. whether a player receives at least one vote. Initial thoughts:

  • The top 4 variables are linear combinations of other statistics; in particular, dream team points and ranking points are commonly used aggregate measures of player performance.
  • Disposals are slightly more important than total possessions; contribution to scoring shots in the possession chain counts.
  • “Effective” variables behave much like the raw totals but rank lower in variable importance, which in turn implies that umpires do not fully assess the quality of disposals while on the field.

Hence we will remove linear combinations as well as highly correlated variables from the data set in the next iteration of the model.
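One simple way to do this, with an assumed correlation threshold of 0.9, is to drop one column from each highly correlated pair:

```python
import numpy as np
import pandas as pd

def drop_correlated(df: pd.DataFrame, cols: list[str], threshold: float = 0.9) -> list[str]:
    """Keep the first-seen column of each pair whose absolute Pearson
    correlation exceeds the threshold."""
    corr = df[cols].corr().abs()
    # Upper triangle only, so each pair is considered once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = {c for c in upper.columns if (upper[c] > threshold).any()}
    return [c for c in cols if c not in to_drop]
```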

The variable importance chart for the last conditional model (prob_B3) shows that the conditional probabilities of prior models rank highly in terms of contribution to the final predictions.

Variable Importance for prob_B3 model (Image by Author)

Overall Accuracy

The output is a series of conditional probabilities; classification accuracy is of the order of 2–5% for the overall data set. The raw conditional probabilities for selected players in the first match of 2020 (Richmond v Carlton) are as follows.

Conditional probability results for selected players 2020 Round 1 (Richmond v Carlton)

  • The initial model results are match-agnostic: they do not enforce that the probabilities within a given match sum to 3, 2 and 1 for the prob_B1, prob_B2 and prob_B3 models respectively (exactly three, two and one players per match reach each threshold), so we need to normalise the results on a match basis.
  • Additionally, we want to normalise the results so that each individual player’s probabilities across the vote classes sum to 1. A sketch of both normalisations follows this list.
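A sketch of both normalisations for a single match, assuming the cumulative definitions of prob_B1 to prob_B3 given earlier (the article’s exact procedure is not shown):

```python
def normalise_probs(match_df):
    """Normalise one match's raw outputs: first at match level, then per player."""
    df = match_df.copy()
    # Match level: exactly 3, 2 and 1 players achieve >=1, >=2 and 3 votes.
    for k, total in [(1, 3.0), (2, 2.0), (3, 1.0)]:
        df[f"prob_B{k}"] *= total / df[f"prob_B{k}"].sum()
    # Player level: cumulative -> exclusive class probabilities that sum to 1.
    df["p3"] = df["prob_B3"]
    df["p2"] = (df["prob_B2"] - df["prob_B3"]).clip(lower=0)
    df["p1"] = (df["prob_B1"] - df["prob_B2"]).clip(lower=0)
    df["p0"] = (1 - df["prob_B1"]).clip(lower=0)
    cols = ["p0", "p1", "p2", "p3"]
    df[cols] = df[cols].div(df[cols].sum(axis=1), axis=0)
    return df

# Applied per match, e.g.:
# preds = preds.groupby("match_id", group_keys=False).apply(normalise_probs)
```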

The results of the normalisation process are shown below (note that some total probabilities are not exactly 1 due to rounding).

Normalised probabilities for selected players in 2020 Round 1 (Richmond v Carlton)

Eyeballing the results on a pure probability basis, we see that the model correctly suggests the outcome for Dion Prestia (3 votes) but gets mixed up between Dustin Martin, Jack Martin and Patrick Cripps for the 1- and 2-vote classes.

We then transform the normalised probabilities into a predicted number of votes. Two methodologies are considered: one uses the raw probability values (weighted) and the other ranks them into vote classes. In both cases we ensure that the total votes assigned per match add up to exactly 6.
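A sketch of both conversions for a single match (the exact scaling and tie-breaking rules are assumptions):

```python
import pandas as pd

def votes_from_probs(match_df: pd.DataFrame) -> pd.DataFrame:
    """Convert one match's normalised class probabilities into votes."""
    df = match_df.copy()
    expected = df["p1"] + 2 * df["p2"] + 3 * df["p3"]  # expected votes per player
    # (a) Weighted ("pred-votes"): rescale so the match total is exactly 6.
    df["pred_votes"] = expected * 6.0 / expected.sum()
    # (b) Ranked ("pred-class"): 3, 2 and 1 to the three highest expected values.
    df["pred_class"] = 0
    df.loc[expected.nlargest(3).index, "pred_class"] = [3, 2, 1]
    return df
```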

Conversion of Probabilities to votes (weighted and class)

Again, Jack Martin has not received any votes and Dustin Martin has received 1 in our model (2 in actuality); we flag these outliers as an opportunity for improvement in later iterations of the baseline model.

An alternative method of assigning votes, described in an ESPN article, suggests that a 3, 2.5, 1.5 or 0.5 vote allocation results in a much lower margin of error. We will investigate this configuration in future research.

Accuracy at End of Season

We accept that there may be inaccuracies at the individual match level; however, the objective of the exercise is to achieve reasonable accuracy at (1) the end-of-season level and (2) the aggregate round-by-round level.

The accuracy of each prediction method is measured by the root mean square error (RMSE) of actual vs predicted votes, by season as well as across the overall training set.
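As a sketch, with preds holding the per-match predictions from above (player_id is a hypothetical column):

```python
import numpy as np

# Season-level RMSE: compare each player's season vote tally to the prediction.
totals = preds.groupby(["season", "player_id"]).agg(
    actual=("votes", "sum"), predicted=("pred_votes", "sum")
).reset_index()
rmse_by_season = totals.groupby("season").apply(
    lambda g: np.sqrt(((g["actual"] - g["predicted"]) ** 2).mean())
)
print(rmse_by_season)
```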

Model Accuracy by Season

While the ranked classification method appears better than the raw values method, averaging the two is superior. The following chart shows the final predicted results for the top 15 contenders for the 2020 season.

Season 2020 actual and predicted votes (Image by Author)

A couple of things stand out in the above chart:

  • Visual inspection shows that the pred-votes method generally under-predicts the number of votes, whereas the pred-class method is more balanced. Overall, the averaging model does a reasonable job of getting the order of the final predictions correct.
  • In 2020, Lachie Neale, Zach Merrett and Taylor Adams were model outliers.

Accuracy at Aggregate Round by Round Basis

The charts below allow us to analyse each round in greater detail and build confidence around the expected errors as the season progresses. The results for the top 8 contenders are examined more closely.

Season 2020 Performance Metrics (Image by Author)

At face value, the baseline model can generally produce reasonable in-sample predictions to within 1–2 votes as the season evolves.

Out of Sample Model Accuracy

The 2021 data is out-of-sample, hence predictions are expected to be less accurate than for 2020, which falls within the training set. In the training set, the RMSE in the last round is 0.9, while for the out-of-sample 2021 data it is 3.3.

Season 2021 actual and predicted votes (Image by Author)
  • In 2021, notable outliers are Marcus Bontempelli, Darcy Parish and Sam Walsh, all of whose results are understated by the model.
  • Touk Miller is an example of a player who was ineligible for the Brownlow because of a suspension during the season; it would be interesting to investigate whether he was unintentionally down-voted by umpires in subsequent games as a result.

Season 2021 Path for Top 8 Brownlow Medallists (Image by Author)

Possible explanations for this lie in factors not yet considered in our initial model:

  • the “visibility” factor, where visibility can be defined by physical characteristics (tattoos, hair colour, height) that cause players to be noticed or remembered more frequently by umpires.
  • the impact factor where some players may have a greater impact on the outcome of the game than raw statistics such as disposals might actually suggest.
  • the “flashy” factor where some players are faster on foot on the outside of the pack or consistently more dominant on the inside of the pack.
  • the influence of media on umpires’ votes; some players are more “well-loved” than others.

With the framework for assessing model quality now in place, we are in a position to build on the learnings so far and improve the prediction model further.

Reflections and Next Steps

We can benchmark the quality of our results by comparing the baseline model output to several public models on our 2021 out-of-sample data: still wood to chop, but certainly not the least accurate!

2021 Season Benchmarking Model to Expert Peers — Player Votes and Final Ranking

The next stage of the exercise develops the Brownlow predictor into a form that can be used as the basis of a round-by-round betting strategy, whose live results we can monitor through the 2022 season. We will then return to our key learnings to improve the accuracy of the predictive model.

References

  1. Wikipedia: Brownlow Medal (link)
  2. ESPN: Brownlow Medal player, every vote for 2021 (link)
  3. AFL.com.au: Brownlow Low-Down: Every club’s favourite, one-vote wonder, top three (link)
  4. AFL.com.au: Banned: The 34 players ineligible for the Brownlow (link)
  5. Foxsports.com.au: Brownlow Medal preview: Stats prove who’ll win the closest race in years and the value picks (link)
  6. Herald Sun: Champion Data Brownlow Tracker: Top 20 finishers revealed (link)
  7. Betfair.com.au: 2021 Brownlow Medal Predictor (link)
  8. Introduction to fitzRoy (link)
