View the entire report here (1,807 KB).
For the final dissertation in my Computer Science BSc, my aim was to build a machine learning model capable of predicting the winners of I'm a Celebrity.
It would do this by running sentiment analysis on a large bank of tweets, and then applying regression techniques to the sentiment results.
As part of this project, I developed a couple of novel techniques applicable to this specific problem set.
One involved creating additional training data from theoretical three-way votes that never actually occurred.
The other was output post-normalisation, which allowed three separate predictions to be used in tandem to generate a single prediction for a three-way vote.
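To give a rough feel for these two ideas, here is a minimal Python sketch. It is an illustration under stated assumptions, not the implementation from the report: the function names `synthetic_three_way_votes` and `post_normalise`, the example vote shares, and the choice to rescale shares proportionally are all hypothetical.

```python
from itertools import combinations


def synthetic_three_way_votes(vote_shares):
    """Build hypothetical three-way votes from one real multi-way vote.

    vote_shares maps contestant name -> share of the real vote.
    ASSUMPTION: relative support between contestants would stay the same
    in a smaller vote, so each trio's shares are simply rescaled to sum to 1.
    """
    trios = []
    for trio in combinations(sorted(vote_shares), 3):
        total = sum(vote_shares[name] for name in trio)
        if total == 0:
            continue  # nothing to rescale for this trio
        trios.append({name: vote_shares[name] / total for name in trio})
    return trios


def post_normalise(raw_predictions):
    """Rescale independent per-contestant predictions so they sum to 1.

    raw_predictions maps contestant name -> predicted vote share, each
    produced separately, so they need not add up to 1 beforehand.
    """
    # ASSUMPTION: negative regression outputs are treated as zero support.
    clipped = {name: max(0.0, p) for name, p in raw_predictions.items()}
    total = sum(clipped.values())
    if total == 0:
        # No usable signal: fall back to an even split.
        return {name: 1.0 / len(clipped) for name in clipped}
    return {name: p / total for name, p in clipped.items()}


# Made-up shares from a hypothetical four-way vote.
real_vote = {"A": 0.4, "B": 0.3, "C": 0.2, "D": 0.1}
extra_training_votes = synthetic_three_way_votes(real_vote)

# Three separate, made-up predictions combined into one three-way prediction.
combined = post_normalise({"A": 0.5, "B": 0.4, "C": 0.35})
predicted_winner = max(combined, key=combined.get)
```

Normalising after prediction means each contestant can be scored on their own, and the three scores only have to be comparable with each other rather than already forming a valid vote split.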
The ideas explored over the months of model development are far too complex to express here (in fact, they were difficult to express in a 50-page report!), so if you want to learn more, please download the whole report using the link at the top of this page.