Wednesday, November 7, 2012

Forecasting case study

Those of you wise enough to monitor Shots on Target may have noticed an earlier post about a new case study we're doing to compare our respective models against actual results. At this stage there's not much to do other than post our projections, so here they are:

The idea of these few posts will be to try to identify areas where the respective models can be improved, specifically with regard to the process rather than just the results. With just five games to work with we're not going to make wholesale changes to the models if they say Suarez will score twice and he scores four, so long as the underlying data looks reasonable and any significant variances can be explained.

One area that has already been highlighted to me is how to translate defensive goal forecasts into points. As you know, my points model only looks at midfielders and forwards, and I then separately forecast goals conceded. The problem with that approach, now that I think about it in detail, is illustrated by the example below:

Team A forecast GPG:     2.2     0.6     0.7     1.2     0.6     Avg: 1.2
Team B forecast GPG:     1.4     1.1     1.2     1.1     1.2     Avg: 1.2

On the 8 week forecast table the above two teams will be ranked equally, but one would assume that Team A has the better chance of success in terms of clean sheets, which is after all what it's all about. You might be reading this thinking it isn't exactly a groundbreaking discovery, but I think we all have a tendency to consider goals conceded in aggregate. So I'm going to put together a new set of data that assigns a percentage chance of a clean sheet to each of those individual game forecasts, which can then be aggregated to provide better rankings. More on this in a later post though.
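One way to make the intuition concrete is a quick sketch. Assuming goals conceded in a game follow a Poisson distribution (my assumption here, not something stated above), the clean-sheet probability for a game forecast at a given goals-per-game rate is e^(-rate), and expected clean sheets over a run of fixtures is just the sum of those per-game probabilities:

```python
import math

# Per-game forecast goals conceded (GPG) from the example above.
team_a = [2.2, 0.6, 0.7, 1.2, 0.6]  # avg 1.2 (wilder swings)
team_b = [1.4, 1.1, 1.2, 1.1, 1.2]  # avg 1.2 (steadier)

def clean_sheet_prob(gpg):
    # Assumption: goals conceded are Poisson-distributed with mean = forecast GPG,
    # so the chance of conceding zero is exp(-gpg).
    return math.exp(-gpg)

def expected_clean_sheets(forecasts):
    # Expected clean sheets over the run = sum of per-game probabilities.
    return sum(clean_sheet_prob(g) for g in forecasts)

print(round(expected_clean_sheets(team_a), 2))
print(round(expected_clean_sheets(team_b), 2))
```

Under that (hypothetical) model, Team A comes out with roughly half a clean sheet more than Team B over the five games despite the identical average, because the low-GPG fixtures contribute disproportionately, which is exactly why averaging goals conceded hides the difference.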