Predicting Talent Show Winners with Machine Learning

View the entire report here (1,807 KB).

Above: Mean Absolute Error of voting share percentage predictions (by season).

For the final dissertation of my Computer Science BSc, I set out to build a machine learning model capable of predicting the winners of I'm a Celebrity. It did this by running sentiment analysis on a large bank of tweets and then applying regression techniques to the results.

As part of this project, I developed a couple of novel techniques applicable to this specific problem. One involved creating additional training data from theoretical three-way votes that never actually occurred. The other was output post-normalisation, which allowed three separate predictions to be used in tandem to generate a single prediction for a three-way vote.
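
To give a rough feel for both ideas, here is a minimal Python sketch. It is not the code from the report, and the details are assumptions made for illustration: the function names `synthetic_three_way_votes` and `post_normalise` are hypothetical, the synthetic votes are built by assuming a contestant's share within a trio is proportional to their share in the real vote, and negative regressor outputs are clipped to zero before rescaling. The numbers in the example are made up.

```python
from itertools import combinations


def synthetic_three_way_votes(shares):
    """Build hypothetical three-way votes from one real multi-way vote.

    `shares` maps contestant name -> real vote share (any positive scale).
    ASSUMPTION: a contestant's share within a trio is proportional to
    their share in the real vote. The report may construct these differently.
    """
    votes = []
    for trio in combinations(sorted(shares), 3):
        total = sum(shares[name] for name in trio)
        votes.append({name: shares[name] / total for name in trio})
    return votes


def post_normalise(raw_predictions):
    """Rescale independently predicted shares so that they sum to 1.

    ASSUMPTION: negative regressor outputs are clipped to zero first.
    """
    clipped = {name: max(value, 0.0) for name, value in raw_predictions.items()}
    total = sum(clipped.values())
    if total == 0:
        # No usable signal from any prediction: fall back to an even split.
        return {name: 1.0 / len(clipped) for name in clipped}
    return {name: value / total for name, value in clipped.items()}


# Illustrative, made-up shares from a four-way vote.
real_vote = {"A": 0.40, "B": 0.30, "C": 0.20, "D": 0.10}
for vote in synthetic_three_way_votes(real_vote):
    print(vote)

# Three separate regressor outputs that do not sum to 1 on their own.
print(post_normalise({"A": 0.50, "B": 0.30, "C": 0.40}))
```

The point of the second function is that each contestant's share can be predicted independently, with the rescaling step turning the three outputs into one coherent prediction whose shares add up to 100%.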

The ideas explored over the months of model development are far too complex to express here (in fact, they were difficult to express in a 50-page report!), so if you want to learn more, please download the whole report using the link at the top of this page.