Words: 1800
Time to Read: <10 minutes
I’ve recently been involved in the Women’s Rugby World Cup European Qualification tournament held in Parma, Italy. This was a very high-stakes tournament: the 1st-placed team gained direct entry to the 2022 World Cup, the 2nd-placed team gained entry into a repechage tournament, and the 3rd- and 4th-placed teams would have to wait four more years for the chance to go to a World Cup.
At the beginning of the tournament the teams were Italy 7th, Ireland 8th, Spain 9th and Scotland 11th in the World Rankings, so it was going to be a close tournament. At this stage I must make a confession: I have absolutely no rugby playing or coaching pedigree whatsoever. I played a handful of times when I was 16 years old (maybe 5 matches), and I cannot read the game with anywhere close to the level of skill the coaches and players can. However, even experts are susceptible to bias and thinking errors, so I decided to forecast the scores for each of our matches and discuss my reasoning process with the coaches to add diversity to the decision-making process. This allowed us to have some fun conversations about their reasoning processes, leading us to construct a better shared model of what we feel is valuable in performance. Given the teams were so closely matched, having predictions based on sound reasoning principles could help inform match tactics, or at the very least spark conversations. For example, in matches predicted to finish within a single try difference, choosing to kick 3-point penalties rather than go for a higher-risk kick-lineout-maul, where the conversion is also more challenging, could prove pivotal.
I’ve detailed my forecasting and reasoning process below.
Stage 1: Calculate Outside Base Rate Data
I calculated the outside base rates of “scores for” each team by taking the previous 3 years’ worth of test match points scored and averaging them for each team. This was my simple base rate algorithm. I ignored the opponent score in each match and ignored whether the fixture was played home or away. I only used test matches in an attempt to somewhat standardize the level of competition. This gave me initial outside base rate scores of Italy 13, Ireland 19, Spain 16 and Scotland 11 points. I also updated this base rate after each match during the tournament by adding any “scores for” to the calculation for each team and averaging again.
I’ve made lots of assumptions in stage one, such as deciding to use the previous 3 years’ worth of data. I chose 3 years because (1) I’ve been in my current position 3 years; (2) the turnover in our squad means that our team is over 50% different from 3 years ago, and it’s similar for the other teams; (3) it provided me with 9 matches per team to create a base rate, which I thought was a suitable dataset size. My inclusion and exclusion of certain data is also a source of error. I chose to only include “scores for” in my base rate data, mainly because I felt including “scores against” data would make this activity too large for me and I wouldn’t have time to complete it.
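The base-rate step above can be sketched in a few lines of Python. The per-match scores below are hypothetical placeholders standing in for the 9 test matches per team, not the real data:

```python
# Minimal sketch of the Stage 1 base rate: average "points for" over the
# previous 3 years of test matches, then re-average after each new match.

def base_rate(points_for):
    """Average points scored across a list of matches, rounded to whole points."""
    return round(sum(points_for) / len(points_for))

# Hypothetical "points for" from 9 test matches (placeholder values).
scotland = [10, 7, 15, 12, 8, 13, 5, 17, 12]
print(base_rate(scotland))   # initial outside base rate

# After a tournament match, append the new "points for" and re-average.
scotland.append(20)
print(base_rate(scotland))   # updated base rate
```

The same spreadsheet logic applies to each team independently, since opponent scores and venue are deliberately ignored.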


Stage 2: Calculate Inside Data
I calculated inside data, which consisted of the head-to-head scores over the previous 3 years, the same time period over which the outside base rate was calculated. I perhaps could have used this head-to-head data to form the base rate, but I felt it didn’t take into account the overall potential of each team. Also, had it formed the base rate it would have included fewer scores in the calculation, and including scores from 4, 5, 6 etc. years ago probably decreases accuracy in this context. I think using the inside head-to-head data to adjust the outside base rate is a more reasonable method for weighting data in this context.

The inside historical head-to-head scores were:
Scotland 9 – 36 Italy
Scotland 30 – 21 Spain
Scotland 11 – 17 Ireland
Stage 3: Initial Prediction
My reasoning method was to use the base rate (outside view) as the initial match forecast, and then adjust this score up or down based on the inside information learned from the head-to-head matches. The difference between the outside base rate score and the inside score was calculated. Then the base rate was adjusted up or down by -75%, -50%, -25%, +25%, +50% or +75% of the outside-inside difference.
These buckets of 25% increments introduce a few sources of error, however because they are percentages of small numbers they do not result in a huge variation in possible score predictions (which might end up making my predictions look more accurate than they actually are).

Stage 4: Final Prediction
I made the final predictions the evening before each match, to ensure I had the most up to date outside and inside information available. There is obviously a huge asymmetry in the outside vs inside information I had available, meaning I could see the strengths and weaknesses within our own squad but was completely blind to these in our opponents. This is one reason why choosing the base rate to adjust from makes more logical sense since it is less responsive to inside view biases.

I chose to use 25% increment adjustments because the percentage changes were already being applied to small numbers, and I thought it silly to predict scores to decimal places.
Prediction (1) Scotland 11 – 30 Italy (Probability 75%)
Prediction (2) Scotland 21 – 14 Spain (Probability 70%)
Prediction (3) Scotland 13 – 18 Ireland (Probability 60%)
Reasoning for adjustments:
Prediction 1: For Scotland vs Italy I adjusted Scotland down by -25%. My reasoning for doing this was (1) we had travelled; (2) we did not have much opportunity to train in the heat prior to competition; (3) the match was at 3pm and temperatures were over 30°C; (4) I felt we had massively overloaded the players with decelerations on match day -6 and match day -5, and our monitoring CMJ data was still below ideal in some key players; and (5) we had a key player with a muscle injury due to the high MD-6 and MD-5 loads. I chose to adjust Italy up by +75% because (1) Italy’s base rate “scores for” include 5 games against England and France, who are very tough opponents; (2) Italy had scored highly against Scotland in the previous 3 matches; and (3) the style of rugby Italy play is difficult for Scotland to defend, and we had limited practice scenarios against that style of play.
Prediction 2: For Scotland vs Spain I adjusted Scotland up by +50%. My reasoning for this was (1) travel fatigue was no longer an issue; (2) I felt our exposure-to-stimulus and response-to-exposure data showed the players in peak condition; (3) I felt the coaches were really confident in the game plan and the players were crystal clear on the tactics; and (4) this was a must-win match or we would not qualify for the World Cup, and I felt there were a few key players in our squad who would step up and not be beaten. I chose to adjust Spain down by -25% because (1) the Spain vs Ireland game was very physical with high collisions, which would take a long time to recover from; (2) we have good discipline, so wouldn’t give Spain many easy points through kicked penalties; and (3) the coaches were confident Spain’s style of attack would not stress our defense too much, except perhaps from quick tap penalties, so I thought they would struggle to score highly.
Prediction 3: For Scotland vs Ireland I updated my forecast every day until match day. I eventually adjusted Scotland up by +25%. My reasons for this were (1) Scotland had scored a bonus point against Spain for tries scored, showing we could score; (2) Scotland had previously scored several tries against Ireland; and (3) Ireland’s defense was giving away lots of penalties within kicking distance, so, forecasting two tries and a kicked penalty, I adjusted up. I didn’t adjust Ireland up or down because (1) most of the adjustment scores for Ireland were 18 points and I couldn’t find good justification to disprove this score; (2) I thought the match would be very close, as the previous matches had been, and I thought it would be within 5 points; and (3) their match against Italy didn’t look overly physical, so I couldn’t justify any reasons to adjust.
Stage 5: Calculating my Forecasting accuracy
I calculated the difference between my final prediction and the actual score. The trend shows a consistent direction of under-prediction, with a magnitude between 0 and -8 points.

I also calculated the accuracy of my probabilistic judgements using a Brier Score, where 0 is a perfect score and 1 is the most inaccurate score. My average Brier score was 0.17. I am delighted with being wrong and gaining a poor Brier score in prediction 3!
Brier Score 1 = (0.75 – 1)^2 = 0.0625
Brier Score 2 = (0.7 – 1)^2 = 0.09
Brier Score 3 = (0.6 – 0)^2 = 0.36
Average Brier Score = 0.17
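The Brier score arithmetic above is simply the squared difference between the forecast probability and the outcome (1 if the predicted result happened, 0 if not), averaged across the three matches:

```python
# Brier score: squared gap between forecast probability and outcome.
# Outcome is 1 when the predicted result happened, 0 when it did not.

def brier(forecast, outcome):
    return (forecast - outcome) ** 2

# Predictions 1 and 2 came true (outcome 1); prediction 3 did not (outcome 0).
scores = [brier(0.75, 1), brier(0.70, 1), brier(0.60, 0)]
print([round(s, 4) for s in scores])        # [0.0625, 0.09, 0.36]
print(round(sum(scores) / len(scores), 2))  # 0.17
```

A lower score is better, so a confident wrong forecast (prediction 3) is penalised far more heavily than a confident correct one.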

Learnings:
This was a beneficial task that provided a moderate-to-high level of objectivity in a high-pressure and very emotional tournament. It initially took me less than 30 minutes to find the data online and input it into a spreadsheet to produce the base rates and inside-data adjustment ranges; overall this was a great time investment.
It is common for people to get caught up in the asymmetry of only seeing the positives in their own team, and over-estimating the scores in their favour – leading to overconfidence and blind spots. Alternatively, only seeing the injuries or training errors within their own squad can lead to under-estimating outcomes and a loss of motivation. Having the predicted scores allowed me to challenge both over- and underconfidence within the management team and ask intelligent questions:
A: “You seem very confident we’ll beat team X, keeping them to under 12points? They seem to score a lot more than that in most matches, particularly against us – if they are going to score against us where and when will they do it?”
B: “If this game is going to be within 5 points, would you have the girls kick penalties early if within their range, rather than taking a riskier lineout option?”
C: “Our games versus team Z are very likely to be within a single score, what can we do to nudge an extra few penalties within kicking range?”
Perhaps the best outcome of using a forecasting tool like this is that it facilitates conversations and helps us ask more precise questions. Were these game-changing, tournament-winning questions? Of course not, but they form part of a complex decision-making environment where frequent conversations can nudge people to focus on highly relevant pieces of information, which can easily, and do easily, get overlooked when fatigued, under pressure, and highly emotional.
Finally, most of the data used in this prediction process is publicly available on the World Rugby and similar websites. This means that people can use this valuable process of prediction in unfamiliar situations, with high levels of uncertainty and low time-pressure without having to know much inside information.
Too often we use hindsight to evaluate whether our processes are fit for purpose, but this is fraught with errors. I would love to get your feedback on my process: where you think I have made process and reasoning errors, and what you would have done differently. Please get in contact.
Links to public data: