2018 Open project : Measuring Uncertainty regarding match results

Much has been said about whether a character's performance in a match is expected or unexpected. Though there can't be any perfect way of judging this, there can be a way to model the expected uncertainty.

System : Random Walk is Random!!!

Reference :

viewtopic.php?f=30&t=4506&start=138#p295666

Modification : The end of the Pot Value experiment showed that while Pot Value was good at predicting the winner, it was not a good tool for predicting the actual VF%. There also wasn't any way to generate an expected variance. These problems will be remedied by the following methods. You must read the reference above to understand them.

(1) Use Linear Expectation instead of Pythagorean. Linear Expectation does a poorer job of predicting the winner, but is more useful for generating an expected VF%. This means the exponent will be 1 instead of 5, so

Amount of Match Pot earned by character A = Match Pot Value / [ 1 + ( Vote for character B / Vote for character A ) ]

(2) Introduce variance by using three values of N: N = 0.1 (10%), N = 0.2 (20%), and N = 0.5 (50%). While every character will start off with a score of 100, after even just one match they will have 3 different Pot Values, which we will call "PV(0.1,x)" (using N=0.1), "PV(0.2,x)" (using N=0.2), and "PV(0.5,x)" (using N=0.5). Note that PV(0.1) will change slowly while PV(0.5) will change rapidly. This is the key to the whole process.
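To make the two modifications concrete, here is a rough Python sketch. The pot share formula is the one given in (1). The update rule is NOT spelled out in this post (it lives in the reference thread), so the sketch assumes that each character puts a fraction N of its current Pot Value into the match pot and then earns back its share. The names N_VALUES, START_SCORE, pot_share and update_scores are made up for illustration.

```python
N_VALUES = (0.1, 0.2, 0.5)   # the three N values from (2)
START_SCORE = 100.0          # every character starts at 100


def pot_share(match_pot, votes_a, votes_b):
    """Amount of the match pot earned by character A (linear expectation).

    Algebraically the same as match_pot / (1 + votes_b / votes_a),
    written this way so that votes_a == 0 does not divide by zero.
    """
    return match_pot * votes_a / (votes_a + votes_b)


def update_scores(pv_a, pv_b, votes_a, votes_b):
    """Return updated {N: Pot Value} dicts for characters A and B.

    ASSUMPTION: each side antes N * (its current Pot Value) into the
    match pot. Check the reference thread for the actual rule.
    """
    new_a, new_b = {}, {}
    for n in N_VALUES:
        ante_a = n * pv_a[n]
        ante_b = n * pv_b[n]
        match_pot = ante_a + ante_b
        earned_a = pot_share(match_pot, votes_a, votes_b)
        new_a[n] = pv_a[n] - ante_a + earned_a
        new_b[n] = pv_b[n] - ante_b + (match_pot - earned_a)
    return new_a, new_b


# Two fresh characters, A wins 600 votes to 400.
fresh = {n: START_SCORE for n in N_VALUES}
a, b = update_scores(fresh, fresh, votes_a=600, votes_b=400)
```

Under that assumption, a 60/40 result between two fresh characters moves A up by about 2, 4, and 10 points for N = 0.1, 0.2, and 0.5, which is the "slow versus rapid" behaviour described in (2).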

Procedure

(1) All contestants will have 3 scores. When a match is scheduled, all 3 scores of each contestant will be used in the comparison, which can generate as many as 9 different expected VF% values.

Notation : PV(i,x) = Pot Value for character x using N=i

Expected VF% for character A when matched against character B = PV(i,A) / [ PV(i,A) + PV(j,B) ] where i can be 0.1, 0.2, or 0.5, and j can independently be 0.1, 0.2, or 0.5

(2) For a match, the expected winner will be the character that the majority of the 9 different expected VF% values predict to be the winner. We will call this expected winner character A.

(3) The FINAL expected VF% for the match is the average of the 9 different expected VF% values for character A. We will call this number FEV%.

(4) The all-important variability will be measured simply by calculating the STDEV of the 9 different expected VF% values. This number will be called STDV%.
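The four steps above can be sketched in Python roughly as follows. This is only an illustration: the function names are made up, the values are returned as fractions rather than percentages, and "STDEV" is assumed to mean the sample standard deviation (as in the spreadsheet function of that name). The post does not say how to break a tie in the majority vote, so the sketch simply gives ties to the first character.

```python
from itertools import product
from statistics import mean, stdev

N_VALUES = (0.1, 0.2, 0.5)


def expected_vf_grid(pv_a, pv_b):
    """All 9 expected VF values for A: PV(i,A) / (PV(i,A) + PV(j,B))."""
    return [
        pv_a[i] / (pv_a[i] + pv_b[j])
        for i, j in product(N_VALUES, repeat=2)
    ]


def predict(pv_a, pv_b):
    """Return (expected winner, FEV, STDV) for a match between A and B.

    pv_a and pv_b are {N: Pot Value} dicts. FEV and STDV are fractions
    (multiply by 100 for percentages) and refer to the expected winner.
    """
    grid = expected_vf_grid(pv_a, pv_b)
    a_wins = sum(vf > 0.5 for vf in grid)
    b_wins = sum(vf < 0.5 for vf in grid)
    if b_wins > a_wins:
        # B is the expected winner, so report B's expected VF instead.
        grid = [1.0 - vf for vf in grid]
        winner = "B"
    else:
        # ASSUMPTION: ties go to A; the post does not cover this case.
        winner = "A"
    return winner, mean(grid), stdev(grid)


# Hypothetical Pot Values, for illustration only.
pv_a = {0.1: 102.0, 0.2: 104.0, 0.5: 110.0}
pv_b = {0.1: 98.0, 0.2: 96.0, 0.5: 90.0}
winner, fev, stdv = predict(pv_a, pv_b)
print(f"{100 * fev:.1f}% +- {100 * stdv:.1f}% for Character {winner}")
```

Note that if both characters have three identical scores (for example, two brand-new characters at 100), all 9 values coincide and STDV comes out as 0.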

Request

(1) Start calculating PV(0.1), PV(0.2) and PV(0.5) for all the characters from Aquamarine round 1.

(2) At the end of the Aquamarine period, announce PV(0.1), PV(0.2), and PV(0.5) for all 240 characters. By this time PV(0.1) and PV(0.2) will probably be very stable. PV(0.5) is designed to be volatile, so stability is not expected.

(3) Starting from the 1st round of Topaz, post FEV% and STDV% in the form of

FEV% +- STDV% for Character A

for all the matches of the day.

(4) When the results are announced, calculate the Z-score of the match by

(ACTUAL VF% - FEV%)/STDV%

for all the matches. Please alert me to any match that shows |Z-score|>3 .

(5) Update the PV(0.1), PV(0.2), and PV(0.5) for all the characters and repeat the process for the next match

(6) Continue to do this until end of 2018 season

(7) If all the participants in an exhibition match have PV(0.1), PV(0.2), and PV(0.5) scores that have matured (that is, have been updated by at least 5 match results), then please post predictions and also use the exhibition match results to update those characters' S scores.

(8) Please do this as well for all the seasonal tournament characters who are in the semi-finals and beyond. Even though the Pot Value will not have matured for these characters by the time the semi-finals start, we really need some means to guess the expected "uncertainty" involved in these crucial matches.
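For the Z-score check in step (4), a minimal Python sketch could look like this. The function names are made up, and the handling of STDV% = 0 is my own assumption: the formula is undefined there (it happens when all 9 expected values are identical), so the sketch raises an error instead of guessing.

```python
def z_score(actual_vf, fev, stdv):
    """(ACTUAL VF% - FEV%) / STDV%, all given in the same units."""
    if stdv == 0:
        # ASSUMPTION: the post does not say what to do in this case.
        raise ValueError("STDV% is 0, so the Z-score is undefined")
    return (actual_vf - fev) / stdv


def is_anomaly(actual_vf, fev, stdv, threshold=3.0):
    """True when |Z-score| is greater than the threshold (3 in step (4))."""
    return abs(z_score(actual_vf, fev, stdv)) > threshold


# Made-up numbers: predicted 53% +- 2%, actual result 60%.
print(z_score(60, 53, 2))      # 3.5
print(is_anomaly(60, 53, 2))   # True
```

The actual VF% used here must be that of the expected winner (character A), since FEV% and STDV% are defined for that character.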

GOAL

The main goal of this exercise is to help all of us better understand the results, detect anomalies better, and better gauge how much of an anomaly each one is. Too many matches in 2017 were decried as "unexpected" when they actually should have been expected. Also, too many matches slipped past our radar because we didn't realize how unusual the result actually was. With this exercise, I hope to keep better track of unusual occurrences and invest time properly in matches that should be analyzed more carefully, instead of matches that are merely talked about more.