How AFL Fantasy Pricing Works (Part 1)
A data-driven approach to understanding the features of the AFL Fantasy pricing mechanism
--
A few weeks into my first season as an AFL Fantasy player, I’ve learnt that the early rounds are about accumulating cash from rookies so you can trade up to higher-quality players. Understanding how player prices are calculated from week to week is therefore a key component of any cash-generation strategy.
In this article I present a model which accurately calculates the end-of-round prices for the AFL Fantasy competition. More than 33,000 prices were analysed across the 2015–2019 seasons, with data sourced from individual player pages on footywire.
- The mean absolute error is $330, which is practically a rounding error for an average player priced at $400,000.
- Approximately 99.8% of prices predicted were within +/- $2,500 of the actual prices.
Practical applications of the model include forecasting the future path of player prices and breakevens, and the model is easy to implement in a spreadsheet. The equation and parameters for the pricing model are as follows.
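Written out in my notation, the model takes roughly this form:

$$P_n \;=\; \beta\,P_{n-1} \;+\; M_n \cdot \frac{\sum_{k=1}^{K} \alpha_k\,S_{n+1-k}}{\sum_{k=1}^{K} \alpha_k} \qquad [1]$$

where Pₙ is a player's price after round n, S₍ₙ₊₁₋ₖ₎ is the score from their k-th most recent game this season, αₖ are the score weights, β is the weight on the previous round's price and Mₙ is the round's magic number. As derived below, the calibration lands on K = 5 games with weights in the ratio 5:4:3:2:1 (most recent game weighted highest), and Mₙ is backsolved each round so that aggregate prices are preserved.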
The calibrated weights are based on the premise that there is a fixed pattern behind the black box used by the AFL Fantasy gods rather than a random number generator. While the model is intuitive, it's a bit harder to use to impress family and friends at a BBQ unless you are a human calculator.
The sections that follow document the statistical techniques and key concepts used to backsolve for the model's parameters.
Linear Regression Hypothesis
The AFL Fantasy guidelines provided a few clues about the pricing mechanism.
“The Players’ Prices will change based on a formula that takes into account their past performances. All games played by the Player since the start of the season are taken into account in the calculation of their Price changes, with a sliding scale of weightings with the most recent game receiving the highest weighting, as well as a component of their performance last year — if they played last year!”
The hypothesis is that prices follow the form of equation [1] above. To test it, I used player data scraped from the footywire web pages from 2015–2019, totalling over 33,000 data points.
The approach was to backsolve for the coefficients using a series of linear regressions which progressively simplified the problem. Specifically, at the beginning of the exercise, the following calculations and parameters were not fully defined:
- the number of games (k) whose scores contribute to the price calculation
- the weighting scheme αₖ applied to each of the most recent scores
- the weighting β applied to the previous round's price
- the calculation of the magic number Mₙ
From some initial research I'd also understood that there is a normalising aspect to the magic number calculation which ensures that the aggregate of player prices going into a round is equal to the aggregate of their post-round prices.
For each of the linear regressions performed, I present the calculated model parameters and the errors for each season, which gives a guide as to how model accuracy improved with each iteration. In some cases, analysing specific large errors at the player level provided insights on how to progress the exercise.
Linear Regression #1 : determining the number of games (k) which contribute to the score
In order to determine the appropriate lookback period, I first analysed the contribution of the αₖ coefficients to the final price using k=10 (a sketch of this regression set-up follows the observations below).
- At k=10, the αₖ coefficients turn from positive to negative at k=5, implying that k=5 is a better starting point for the number of games used. There is no significant difference in the αₖ coefficients between the two cases considered.
- There was a high number of predictive errors in both cases. Notably, there was a significant difference in the proportion of error observations by the number of most recent matches played out of the last 5.
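As a rough sketch of the regression set-up in R, with hypothetical column names (price_next, price_prev, score_1 to score_10) standing in for however the scraped footywire data is arranged:

```r
# Linear regression #1: regress the post-round price on the pre-round price and the
# scores from up to the last 10 games this season, to see where the score
# coefficients stop contributing. Rows for players with fewer than 10 games would
# need separate handling, since lm() drops rows containing NA scores.
fit_k10 <- lm(
  price_next ~ price_prev + score_1 + score_2 + score_3 + score_4 + score_5 +
    score_6 + score_7 + score_8 + score_9 + score_10,
  data = player_rounds
)

summary(fit_k10)  # inspect the sign and size of each score coefficient
```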
Linear Regression #2 : adding number of games played as a model input
Based on these observations the model was updated to use the number of games played as an input — that is, for each round, up to 5 regressions would be run to calculate parameters for each group by number of games played (for k=5).
Hence in the case of R04, a maximum of 4 regressions would be run, taking into account the players who had played 1, 2, 3 or 4 games for the season and for R07, a maximum of 5 regressions would be run.
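A sketch of what this per-group regression loop might look like in R, with the same hypothetical column names plus round and games_played (the real data preparation is more involved):

```r
library(dplyr)
library(purrr)

# Run a separate regression for each (round, games-played) cohort.
# Only the score columns that exist for that cohort enter the formula.
fit_group <- function(df) {
  n_games <- min(unique(df$games_played), 5)
  score_terms <- paste0("score_", seq_len(n_games), collapse = " + ")
  f <- as.formula(paste("price_next ~ price_prev +", score_terms))
  lm(f, data = df)
}

fits <- player_rounds %>%
  group_by(round, games_played) %>%
  group_split() %>%
  map(fit_group)
```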
I was hesitant to go down this path as I thought the cohort sizes of some groups would be small, leading to less accurate forecasts, however … Eureka!! Predictive errors greater than $2,500 had fallen to less than 1% of the dataset.
Regression parameters for score (αₖ) and price (β) show excellent stability across rounds and seasons. Note from the score coefficient chart that up to 5 score coefficients are calculated, depending on the number of data points available out of the 5 most recent games played for the season to date.
Closer inspection of the alpha parameters indicated that, on average, they are almost identical regardless of how many of the last 5 games a player has played.
Okay... at this point we take a huge leap of faith by assuming that the AFL Fantasy gods are not a random number generator and theorise that the weights for each of the last 5 matches are [5 4 3 2 1] which implies the following table of values.
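For reference, if those raw weights are normalised to sum to one, they work out to:

- game 1 (most recent): 5/15 ≈ 0.333
- game 2: 4/15 ≈ 0.267
- game 3: 3/15 = 0.200
- game 4: 2/15 ≈ 0.133
- game 5 (least recent): 1/15 ≈ 0.067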
Notice how the values that I've theorised are very close to the weights that were calculated? For the time being, let's assume that the small differences are due to other factors which can be investigated later in the analysis.
Calculating the Magic Number using fixed weightings of αₖ and β
To confirm whether the theory of fixed weights made sense, I re-ran the regression using the fixed weights, this time allowing the model to imply the magic number directly.
To do this calculation more explicitly, I rearranged the pricing equation so that the previous price term sits with the new price, leaving the magic number multiplied by the weighted score isolated on the other side (equation [2]).
Noting that the magic number is intended to be a rebalancing factor between rounds, such that for a given round the aggregate of all previous prices is equal to the aggregate of all new prices, we can set ΣPₙ = ΣPₙ₋₁ and do all of our calculations on aggregates for the round.
Rewriting equation [2] and aggregating over all players gives the relationship below.
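In the notation of equation [1], writing S̄ᵢ for player i's weighted average score and summing over every player i priced in round n, the aggregation works out to roughly:

$$\sum_i P_{n-1,i} \;=\; \beta \sum_i P_{n-1,i} \;+\; M_n \sum_i \bar{S}_i \quad\Longrightarrow\quad M_n \;=\; \frac{(1-\beta)\,\sum_i P_{n-1,i}}{\sum_i \bar{S}_i}$$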
In other words, we calculate the weighted average score for each player, then take the aggregate of those weighted scores and the aggregate of the pre-game prices over the round.
- This is a very tidy calculation in that we do not have to know the post-game prices in order to find the magic number.
- This formula does not imply that the post-game prices for each player must be equal to their pre-game prices.
No linear regression required here. We only need to backsolve for the magic number given that we know all the other values.
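A minimal sketch of that backsolve in R, assuming a per-round data frame round_data with hypothetical columns price_prev and score_list (a list of each player's most recent scores, most recent first), and beta taken from the earlier regressions:

```r
# Back-solve the magic number for one round from the aggregate identity:
# sum of new prices = sum of old prices. No post-round prices are needed.
weights <- c(5, 4, 3, 2, 1)

weighted_score <- function(scores) {
  w <- weights[seq_along(scores)]  # truncate the weights if fewer than 5 games
  sum(w * scores) / sum(w)         # weighted average of the recent scores
}

round_data$wtd_score <- sapply(round_data$score_list, weighted_score)

magic_number <- (1 - beta) * sum(round_data$price_prev) / sum(round_data$wtd_score)
```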
Running the calculation, we get a magic number that appears to be of the correct magnitude. As there isn't an “official” website which publishes these numbers or documents the exact calculation, nor much easily available historical information, I've compared my values against those of another AFL Fantasy enthusiast for the 2019 season to date, checking for major deviations.
Performance of the Final Model
We now know the values for all of the component parts of the hypothesised model. Putting it all together, we can now assess the quality of model predictions for next price vs actual prices using the original equation [1].
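A sketch of that final prediction step in R, reusing beta, magic_number and the weighted scores from the previous steps (column names remain hypothetical):

```r
# Predicted post-round price for each player, followed by the headline error metrics.
round_data$price_pred <- beta * round_data$price_prev +
  magic_number * round_data$wtd_score

errors <- round_data$price_pred - round_data$price_next
mean(abs(errors))          # mean absolute error
mean(abs(errors) <= 2500)  # proportion of predictions within +/- $2,500
```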
In terms of the actual number of observations by magnitude of absolute error, out of the 33,000+ prices predicted over the 5 seasons (2015–2019):
- the mean absolute error is $330, which is practically a rounding error in prices to the nearest $1000.
- there are 75 observations in total with a prediction error greater than $2,500, which is 0.22% of the total: an accuracy of 99.8%!
Conclusion and thoughts
Linear regression was the statistical technique used to deconstruct the key components of the AFL Fantasy player pricing model. While we used it as a tool to further our understanding of the relationships between prices and scores, the final model does not use any of the actual parameters calculated by the regressions.
Given the predictive accuracy of the model proposed, there is no compelling need to add additional features to improve the results. Specifically,
- the guidelines provided by the AFL allude to previous season averages as an input; however, this variable has not been fully explored in the analysis.
- the model does not appear to hold for R22, the last round of the fantasy season. The results presented exclude this round because it produced unstable predictions. Philosophically I am comfortable with this, because in the last round the only thing that counts in the competition is the points scored, not the overall team value.
There are many practical applications for the pricing model, particularly in relation to predicting the path of prices and breakevens through the season, which will be a topic for future analysis.
Data for this model was scraped from the individual player pages on footywire. Modelling was done using R and the code and data will be made available soon on Github.
Special thanks to my fantasy footy buddies — Jack, Justin and Selby — for their guidance and encouragement which made writing this article possible.