Data says FRANCE will triumph EURO 2016
Story Board: UEFA EURO 2016
Knockout stages through the eyes of data
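[Bracket figure: predicted knockout stages]
Quarter-finalists: Poland, Wales, Portugal, Belgium, Germany, France, Italy, Iceland
Predicted semi-finalists: Portugal, Belgium, Germany, France
Predicted finalists: Portugal, France
Predicted winner: France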
Data says FRANCE will triumph
EURO 2016
Data and statistics we used:
Euro 2008, Euro 2012, the 2014 FIFA World Cup,
and all international matches played since 2014.
Features we used:
We used 8 features: FIFA rank, team cost,
cap sum of the team, total goals of the team,
total goals of the best three players, goalkeeper
rank, average age, and home/away win %.
Algorithms and methods:
We used a Naïve Bayes classifier to calculate
the probability of winning a match, given
all of the attributes listed above.
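For reference, the textbook form of the Naïve Bayes rule that such a classifier relies on can be written as below. The symbols are notation introduced here, not taken from the slides: r is a match result (for example win or lose) and x_1, ..., x_8 are the eight feature values. The rule assumes the features are conditionally independent given the result.

```latex
P(r \mid x_1, \dots, x_8)
  = \frac{P(r)\,\prod_{i=1}^{8} P(x_i \mid r)}
         {\sum_{r'} P(r')\,\prod_{i=1}^{8} P(x_i \mid r')}
```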
Our assumptions:
Belgium has not participated in the Euros since
2000, so no recent tournament data was available
for it. We made some assumptions to fill that gap.
Prediction Algorithm Example
• Let’s say France and Germany are playing a semi-final in EURO 2016. We have previous
statistics for both teams.
• Given these statistics tables, we can calculate the conditional probability of every
attribute given the match result (win/lose/draw)
• Let’s calculate the probability that a team (say Germany) has a high FIFA rank, given
that Germany won the match
• i.e. P(FIFA rank = high | result = win)
• That comes out to 0.794
• Similarly, we can calculate this for the other attributes, such as team cost, cap sum, etc.
• Next, let’s calculate the overall probability of a win for Germany, i.e.
• P(win) = total wins / total matches played = 0.76 and P(lose) = 1 − P(win) = 0.24
• Now we need to calculate the conditional probability of a particular team winning, given
all attributes.
• Probability of Germany beating France, given that Germany has a higher FIFA rank
than France, a higher team cost, more team goals, a higher Best_3 score,
a low GK rank, and a low average age.
• That comes out to 0.3330, which means Germany is less likely than France to win
according to Naïve Bayes.
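The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code behind the slides: the function names, the record layout, and every number other than 0.76, 0.24 and 0.794 are hypothetical placeholders, so the output does not reproduce the 0.3330 figure.

```python
from math import prod


def conditional_probability(matches, attribute, value, result):
    """Estimate P(attribute = value | result) by counting matching rows."""
    with_result = [m for m in matches if m["result"] == result]
    if not with_result:
        return 0.0
    hits = sum(1 for m in with_result if m[attribute] == value)
    return hits / len(with_result)


def naive_bayes_posterior(priors, likelihoods):
    """Return P(result | attributes) for each result.

    priors:      {result: P(result)}
    likelihoods: {result: [P(attribute_i | result), ...]}
    """
    # Unnormalised score: prior times the product of per-attribute likelihoods.
    scores = {r: priors[r] * prod(likelihoods[r]) for r in priors}
    total = sum(scores.values())
    # Normalise so the posteriors sum to 1.
    return {r: s / total for r, s in scores.items()}


# Hypothetical match records, only to show the counting step.
matches = [
    {"fifa_rank": "high", "result": "win"},
    {"fifa_rank": "high", "result": "win"},
    {"fifa_rank": "low", "result": "win"},
    {"fifa_rank": "high", "result": "lose"},
]
p_high_given_win = conditional_probability(matches, "fifa_rank", "high", "win")

# Priors 0.76 / 0.24 and the likelihood 0.794 come from the slides;
# the remaining likelihoods are made-up placeholders.
priors = {"win": 0.76, "lose": 0.24}
likelihoods = {
    "win": [0.794, 0.60],
    "lose": [0.50, 0.70],
}
posterior = naive_bayes_posterior(priors, likelihoods)
```

With real data, each list in `likelihoods` would hold one conditional probability per feature, estimated from the historical matches in the same way as `p_high_given_win`.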
Disclaimer:
Please note that all predictions are based purely on statistical data. Many thousands (potentially millions) of
other unpredictable factors, such as accidents, could influence the outcome of a match.