Tokyo 2020 Betting III: From Matches to Medals… and Bookies

Bayesian Optimisation Decision Analysis Cycling

In this post I will go from making individual match predictions using the Bradley-Terry model through to predicting the Gold medal winners of the track cycling Individual Sprint, and derive an optimal betting strategy based on a spread-bet Kelly Criterion optimised under posterior uncertainty.

This is the third post in a series:

Tokyo 2020 Betting I: Predictive Models for Pairwise Matches

Tokyo 2020 Betting II: Model Refinement and Feature Engineering

Tokyo 2020 Betting III: From Matches to Medals… and Bookies (This Post)

Due to a lag in drafting vs analysis time, this post was originally published with betting stakes only. I have retrospectively added the detail of how these stakes were derived.

So far in this series I’ve focused on predicting the outcome of single matches between two athletes, and derived a bespoke Bradley-Terry model for this purpose.

To construct a betting strategy I will need to turn the probability that a given rider wins a single match into the probability that they will win the whole tournament, and hence the gold medal.

In the first section of this post I introduce the tournament format for the Tokyo 2020 Individual Sprint, and how the outputs of the Bradley-Terry model are used to derive a distribution for the gold medal winner.

In the second part I will introduce the Kelly Criterion, and derive generalisations that allow for multiple outcome bets, handling posterior uncertainty, and accounting for additional caution.

At the end of the post I’ve included the log of the bets that I ended up placing, which were originally published in a holding post whilst the event was running.

Assumptions

This post makes some assumptions about you!

I’ll assume you’ve read the previous posts in the series, though this post should work as a standalone.

I’ll also assume you have some knowledge of basic betting terminology (fractional odds, stakes). To derive my betting strategy I will use some discrete probability, and formalise a non-linear optimisation model.

If you’re interested in reading the underlying code, it is written in R.

Tournament Forecasting

The Individual Sprint has a complex tournament structure, which will see the winning athlete contest between 10 and 16 sprints before they can claim the medal!

There are four main parts to the tournament, which in Tokyo 2020 will see 30 athletes compete:

Overview

  • A qualifying round that sees all athletes competing individually to set the fastest time, with the six slowest athletes eliminated.

  • 1/32, 1/16 and 1/8 Finals that see the athletes compete in pairs to win a single sprint. The winner automatically qualifies for the next round (e.g. 1/32 Finals winners qualify for 1/16 Finals).

  • Repechage races that see the losers from the previous round competing to take any remaining places in the next round (e.g. losers of 1/32 Finals compete for four remaining places in the 1/16 Finals).

  • Quarter-, Semi-, and Finals raced between pairs of riders. At this stage each match is contested in a best of three sprints format.

Tokyo 2020 Summary

| Round | Athletes Competing | Matches x Athletes per Match | Sprints per Match | Athletes Qualifying |
|---|---|---|---|---|
| Qualifying | 30 | 30 x 1 | 1 | 24 |
| 1/32 Finals | 24 | 12 x 2 | 1 | 12 |
| Repechage 1 | 12 | 4 x 3 | 1 | 4 |
| 1/16 Finals | 16 | 8 x 2 | 1 | 8 |
| Repechage 2 | 8 | 4 x 2 | 1 | 4 |
| 1/8 Finals | 12 | 6 x 2 | 1 | 6 |
| Repechage 3 | 6 | 2 x 3 | 1 | 2 |
| Quarterfinals | 8 | 4 x 2 | Best of 3 | 4 |
| Semifinals | 4 | 2 x 2 | Best of 3 | 2 |
| Finals | 2 | 1 x 2 | Best of 3 | 1 |
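The range of 10 to 16 sprints quoted earlier can be checked against this summary. The snippet below is a small Python sketch (the post's own code is in R, so the names here are illustrative, not the author's). It counts the sprints ridden by a winner who takes every match in the fewest possible sprints, and by one who drops into all three repechages and needs a deciding sprint in every best-of-three match.

```python
# Each entry: (round, fewest sprints a gold medallist can ride, most sprints).
# Repechages are only ridden by losers of the preceding round, so their minimum is 0.
# Best-of-three matches take 2 sprints at best and 3 at worst.
rounds = [
    ("Qualifying", 1, 1),
    ("1/32 Finals", 1, 1),
    ("Repechage 1", 0, 1),
    ("1/16 Finals", 1, 1),
    ("Repechage 2", 0, 1),
    ("1/8 Finals", 1, 1),
    ("Repechage 3", 0, 1),
    ("Quarterfinals", 2, 3),
    ("Semifinals", 2, 3),
    ("Finals", 2, 3),
]

min_sprints = sum(fewest for _, fewest, _ in rounds)
max_sprints = sum(most for _, _, most in rounds)
print(min_sprints, max_sprints)  # 10 16
```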

Tokyo 2020 Detail

The tables below provide the detail that determines which riders face each other in each round; it is adapted from the table published by the UCI in their Track Regulations.

The initial rider codes N1-N24 are in order of the time posted in the qualifying round: N1 is the fastest qualifier, N24 the slowest.

1/32 Finals

round match_no rider_code_1 rider_code_2 rider_code_3 winner_code loser_code sprints
1/32 Finals 1 N1 N24 NA 1A1 1A2 1
1/32 Finals 2 N2 N23 NA 2A1 2A2 1
1/32 Finals 3 N3 N22 NA 3A1 3A2 1
1/32 Finals 4 N4 N21 NA 4A1 4A2 1
1/32 Finals 5 N5 N20 NA 5A1 5A2 1
1/32 Finals 6 N6 N19 NA 6A1 6A2 1
1/32 Finals 7 N7 N18 NA 7A1 7A2 1
1/32 Finals 8 N8 N17 NA 8A1 8A2 1
1/32 Finals 9 N9 N16 NA 9A1 9A2 1
1/32 Finals 10 N10 N15 NA 10A1 10A2 1
1/32 Finals 11 N11 N14 NA 11A1 11A2 1
1/32 Finals 12 N12 N13 NA 12A1 12A2 1

Repechage 1

round match_no rider_code_1 rider_code_2 rider_code_3 winner_code loser_code sprints
Repechage 1 13 1A2 8A2 9A2 1B NA 1
Repechage 1 14 2A2 7A2 10A2 2B NA 1
Repechage 1 15 3A2 6A2 11A2 3B NA 1
Repechage 1 16 4A2 5A2 12A2 4B NA 1

1/16 Finals

round match_no rider_code_1 rider_code_2 rider_code_3 winner_code loser_code sprints
1/16 Finals 17 1A1 4B NA 1C1 1C2 1
1/16 Finals 18 2A1 3B NA 2C1 2C2 1
1/16 Finals 19 3A1 2B NA 3C1 3C2 1
1/16 Finals 20 4A1 1B NA 4C1 4C2 1
1/16 Finals 21 5A1 12A1 NA 5C1 5C2 1
1/16 Finals 22 6A1 11A1 NA 6C1 6C2 1
1/16 Finals 23 7A1 10A1 NA 7C1 7C2 1
1/16 Finals 24 8A1 9A1 NA 8C1 8C2 1

Repechage 2

round match_no rider_code_1 rider_code_2 rider_code_3 winner_code loser_code sprints
Repechage 2 25 1C2 8C2 NA 1D1 NA 1
Repechage 2 26 2C2 7C2 NA 2D1 NA 1
Repechage 2 27 3C2 6C2 NA 3D1 NA 1
Repechage 2 28 4C2 5C2 NA 4D1 NA 1

1/8 Finals

round match_no rider_code_1 rider_code_2 rider_code_3 winner_code loser_code sprints
1/8 Finals 29 1C1 4D1 NA 1E1 1E2 1
1/8 Finals 30 2C1 3D1 NA 2E1 2E2 1
1/8 Finals 31 3C1 2D1 NA 3E1 3E2 1
1/8 Finals 32 4C1 1D1 NA 4E1 4E2 1
1/8 Finals 33 5C1 8C1 NA 5E1 5E2 1
1/8 Finals 34 6C1 7C1 NA 6E1 6E2 1

Repechage 3

round match_no rider_code_1 rider_code_2 rider_code_3 winner_code loser_code sprints
Repechage 3 35 1E2 4E2 5E2 1F1 NA 1
Repechage 3 36 2E2 3E2 6E2 2F1 NA 1

Quarterfinals

round match_no rider_code_1 rider_code_2 rider_code_3 winner_code loser_code sprints
Quarterfinals 37 1E1 2F1 NA 1G1 NA 2
Quarterfinals 38 2E1 1F1 NA 2G1 NA 2
Quarterfinals 39 3E1 6E1 NA 3G1 NA 2
Quarterfinals 40 4E1 5E1 NA 4G1 NA 2

Semifinals

round match_no rider_code_1 rider_code_2 rider_code_3 winner_code loser_code sprints
Semifinals 41 1G1 4G1 NA 1H1 1H2 2
Semifinals 42 2G1 3G1 NA 2H1 2H2 2

Finals

round match_no rider_code_1 rider_code_2 rider_code_3 winner_code loser_code sprints
Finals 43 1H1 2H1 NA Gold Silver 2
Finals 44 1H2 2H2 NA Bronze NA 2

To forecast the gold medal winner I will simulate results for each match in the tournament, using the detailed tournament structure tables above.

In short, simulating the tournament will involve:

That sounds simple enough, but there are a few things for us to unpack here.

Sampling a Single Match (Recap of the Bradley-Terry model)

The Bradley-Terry model assumes that, in a match between athletes \(r\) and \(s\),

\[\mathbf P[r \text{ beats } s] = \frac{\beta_r}{\beta_r + \beta_s}.\]

Given a set of parameters \((\beta_r)\) I can use the formula above to sample the winner of any given match, as required in step two above.
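As a minimal sketch of this sampling step, the Python snippet below draws the winner of a single sprint from the Bradley-Terry probability and extends it to a best-of-three match. It is written in Python for illustration rather than taken from the post's R code. The function names and the strengths `2.0` and `1.0` are made up, and the best-of-three logic assumes that sprints within a match are independent with the same win probability, which the post does not state explicitly.

```python
import random

def p_win(beta_r, beta_s):
    """Bradley-Terry probability that r beats s in a single sprint."""
    return beta_r / (beta_r + beta_s)

def sample_sprint(beta_r, beta_s, rng):
    """Return True if r wins one simulated sprint against s."""
    return rng.random() < p_win(beta_r, beta_s)

def sample_best_of_three(beta_r, beta_s, rng):
    """Return True if r is first to two sprint wins (assumes independent sprints)."""
    wins_r = wins_s = 0
    while wins_r < 2 and wins_s < 2:
        if sample_sprint(beta_r, beta_s, rng):
            wins_r += 1
        else:
            wins_s += 1
    return wins_r == 2

def p_win_best_of_three(beta_r, beta_s):
    """Closed form under the same assumption: p^2 + 2 p^2 (1 - p) = p^2 (3 - 2p)."""
    p = p_win(beta_r, beta_s)
    return p * p * (3 - 2 * p)

rng = random.Random(2020)
beta_r, beta_s = 2.0, 1.0  # made-up strengths, not fitted values
n = 20_000
freq = sum(sample_best_of_three(beta_r, beta_s, rng) for _ in range(n)) / n
print(p_win(beta_r, beta_s), p_win_best_of_three(beta_r, beta_s), freq)
```

With these illustrative strengths the single-sprint probability is 2/3 and the best-of-three probability is 20/27, so the longer format favours the stronger rider slightly more. The simulated frequency should land close to the closed-form value.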

The previous posts have focused on estimating the parameters \(\beta_r\), with the final model taking the form \(\beta_r = \exp \left( \alpha_r^{(m)} + \kappa t_r \right)\), where

Sampling a Tournament

Sampling a tournament is just a case of sampling match outcomes, and then using the detailed tournament information in the tables above to identify who to pair in the next round of matches. The animation below gives an example of this dynamic.
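The code-lookup logic can also be sketched in code. The Python snippet below (illustrative only, not the post's R implementation) covers matches 1 to 24: the 1/32 Finals, Repechage 1 and the 1/16 Finals. Each winner is stored under the match's winner code so that later matches can look riders up by code. The rider names and strengths are hypothetical. Treating a three-rider repechage as a draw with probability proportional to \(\beta\) is an assumption that extends the pairwise formula, not something the post confirms.

```python
import random
from collections import Counter

def sample_winner(riders, beta, rng):
    """Pick one winner with probability proportional to strength (assumed for 3-rider heats)."""
    weights = [beta[r] for r in riders]
    return rng.choices(riders, weights=weights, k=1)[0]

def build_early_rounds():
    """Matches 1-24 as (rider codes, winner code, loser code), following the tables above."""
    matches = []
    for i in range(1, 13):  # 1/32 Finals: N1 v N24, ..., N12 v N13
        matches.append(([f"N{i}", f"N{25 - i}"], f"{i}A1", f"{i}A2"))
    for i in range(1, 5):   # Repechage 1: three losers per heat, one goes through
        matches.append(([f"{i}A2", f"{9 - i}A2", f"{8 + i}A2"], f"{i}B", None))
    for i in range(1, 5):   # 1/16 Finals, matches 17-20
        matches.append(([f"{i}A1", f"{5 - i}B"], f"{i}C1", f"{i}C2"))
    for i in range(5, 9):   # 1/16 Finals, matches 21-24
        matches.append(([f"{i}A1", f"{17 - i}A1"], f"{i}C1", f"{i}C2"))
    return matches

def simulate_early_rounds(seeding, beta, rng):
    """Map every rider code up to the 1/16 Finals to the rider holding it in one simulation."""
    slots = dict(seeding)
    for codes, win_code, lose_code in build_early_rounds():
        riders = [slots[c] for c in codes]
        winner = sample_winner(riders, beta, rng)
        slots[win_code] = winner
        if lose_code is not None:  # only two-rider matches carry a loser code
            slots[lose_code] = next(r for r in riders if r != winner)
    return slots

# Hypothetical field: rider_1 is the fastest qualifier, rider_24 the slowest.
seeding = {f"N{i}": f"rider_{i}" for i in range(1, 25)}
beta = {f"rider_{i}": 1.0 / i for i in range(1, 25)}  # made-up strengths

rng = random.Random(1)
n_sims = 2000
advanced = Counter()
for _ in range(n_sims):
    slots = simulate_early_rounds(seeding, beta, rng)
    advanced.update(slots[f"{i}C1"] for i in range(1, 9))

# Share of simulations in which each rider wins a 1/16 Final.
p_advance = {rider: advanced[rider] / n_sims for rider in beta}
print(p_advance["rider_1"], p_advance["rider_24"])
```

The remaining rounds follow the same pattern: add their rows to the match list, switch the last three rounds to best of three, and tally who ends up holding the Gold code.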