Some thoughts on ranking systems

For topics that last throughout the whole season
User avatar
Midnight-Jasper
Intelligent device
Intelligent device
Posts: 1357
Joined: Sat Dec 31, 2011 9:47 pm
Badges:
Image
Melon Pan: 50
Wish: viva all, baby~
Location: Aloha Oe

Re: Some thoughts on ranking systems

Post by Midnight-Jasper » Sun Mar 23, 2014 10:31 pm

Why isn't data from multiple periods used to calculate strength of opponents (general strength, not SDO)? Is it because the formula needs to be consistent, or easy to understand?
I see that the necklace is an award for the best performer of the period, but you wouldn't usually expect the strength of her opponents to change largely between periods, which is what happens with SDO / VP / whatever pumping.
Image
User avatar
maglor
~Fukou da~
~Fukou da~
Posts: 8724
Joined: Thu Feb 26, 2009 8:57 pm
Badges:
ImageImage
Worships: Abriel Nei Debrusc Borl Paryun Lafiel
Melon Pan: 75
2019 Female Favorite: Akemi Homura
2019 Male Favorite: Arima Kōsei
2018 Female Favorite: Chtholly Nota Seniorious
2018 Male Favorite: Yang Wenli
2017 Female Favorite: Tomori Nao
2017 Male Favorite: Yang Wenli
Wish: More people being open to alternatives and compromises.

Re: Some thoughts on ranking systems

Post by maglor » Sun Mar 23, 2014 10:51 pm

Shmion84 wrote:
maglor wrote:OK, folks. I need some help. Here are the stats needed to advance or squash this.

For 2011, 2012, and 2013 Aquamarine, please tell me

(1) CVP
(2) SVDO = SDOVP = Sum of Vote Points of the defeated Opponents
(3) Necklace Score = SVDO + 50 * NGP, where NGP = Necklace Group Performance = (votes received by the contestant in the 7-character necklace group match) / (total number of votes cast in the necklace group match). Since a multiplier of 100 risks giving NGP too much weight, I thought 50 would be more appropriate.
The maximum was marked in bold text.

Image
maglor wrote: What we need to see is
(1) SVDO for 7-win characters is higher than for 6-win characters in almost all cases in these 3 Aquamarine periods.
Yes. SVDO for 7-win characters is fairly constant (between 45 and 50 points).
maglor wrote:(2) The Final Necklace Score ranking differs from the SVDO ranking in at least one place in at least two out of three years.
Yes. For example, only in 2011 did the character with the highest SVDO get the highest Final Necklace Score.
maglor wrote:(3) The Final Necklace Score ranking using SVDO differs from the Necklace Score ranking we used for the year in question in at least one place in at least two out of three years.
Yes. For example: In 2011 Akiyama Mio ranked 3rd under the old rules and 5th under the proposed rules. In 2012 Shana ranked 5th under the old rules and 4th under the proposed rules. In 2013 Kuroyukihime ranked 3rd under the old rules and 6th under the proposed rules.
maglor wrote:It would be interesting to see whether any of the 7 would have been different if necklace group selection were by (1) number of wins, then (2) CVP.
In 2011 Nakano Azusa (CVP 8.87895438) and Gokō Ruri (Kuroneko) (CVP 8.705217814) would have replaced Nakamura Yuri and Eucliwood Hellscythe.
In 2012 no changes (exactly 7 characters with 7 wins).
In 2013 Yūki Asuna (CVP 9.154928839) would have replaced Kōsaka Kirino.

You might double check these calculations.
Thank you very much. So, SVDO passes all of the qualifying conditions I laid out. Furthermore, I feel the CVP necklace group selection criterion might have been better than the current selection process. In a way, I am pleased to know that the 2011 situation did NOT result in the necklace group match winner getting the necklace, as this suggests there is a good balance between SVDO and NGP. So, folks, do you prefer the SVDO system to the SDO system? Here are the pros and cons that I can think of. Please point out any mistakes or missing pros or cons.

Pro

1. The relative difference in SVDO is smaller than in SDO.
2. at least it isn't SDO

Con

1. There is now even more reason to attempt SVDO manipulation: you can further increase SVDO by voting for a strong opponent that your favorite beat, so that the strong opponent beats weaker contestants even harder.
2. This greatly reduces the chance for a 6-win character to win the necklace over a 7-win character. If we want to increase that chance, we can raise the multiplier for NGP to 100, but then we risk SVDO becoming too insignificant, so that even a 5-win character would have a chance to win the necklace over a 7-win character if it gets into the necklace match through some extraordinary circumstances. The main reason for all this is that 6- or 7-win characters can have very low or very high SDO values, whereas the usual range of SVDO values is much narrower.

Possible Alternative

Instead of SVDO, we can use AVDO = average of the cumulative vote points (CVP) of defeated opponents. This will increase the chance that a 5- or 6-win character can win the necklace. The multiplier for NGP will have to be greatly reduced, likely to 2 to 4 instead of 50 or 100, since I expect the difference in AVDO between contestants of adjacent AVDO rank to be near 0.1, thus 0.1/0.035 ~ 3. The good AND bad thing about this is the increased chance that a 5- or 6-win character will win the necklace. Do note that a 7-win character will still have a higher AVDO than 6-win characters most of the time, because that 7th win would have come from someone very strong.
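
To make the quantities in this post concrete, here is a minimal Python sketch of SVDO, AVDO, NGP and the proposed necklace score. The `Match` record, the function names, and the sample numbers are made up for illustration; only the formulas themselves come from the posts above.

```python
from dataclasses import dataclass


@dataclass
class Match:
    opponent_cvp: float  # cumulative vote points (CVP) of the opponent
    won: bool            # True if the contestant won this match


def svdo(matches):
    """Sum of the CVP of defeated opponents."""
    return sum(m.opponent_cvp for m in matches if m.won)


def avdo(matches):
    """Average CVP of defeated opponents (0.0 for a winless contestant)."""
    beaten = [m.opponent_cvp for m in matches if m.won]
    return sum(beaten) / len(beaten) if beaten else 0.0


def ngp(votes_for, total_votes):
    """Share of the necklace group match votes received by the contestant."""
    return votes_for / total_votes


def necklace_score(strength, ngp_value, multiplier):
    """strength is SVDO (or AVDO); multiplier is the NGP weight under discussion."""
    return strength + multiplier * ngp_value


# Hypothetical 7-0 contestant whose opponents all had a CVP of 7.0.
season = [Match(7.0, True) for _ in range(7)]
score = necklace_score(svdo(season), ngp(1000, 5000), 50)
print(svdo(season), avdo(season), score)
```

Swapping `svdo` for `avdo` and lowering the multiplier to around 3 would give the AVDO variant described in the last paragraph.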

Post by maglor » Sun Mar 23, 2014 10:55 pm

Midnight-Jasper wrote:Why isn't data from multiple periods used to calculate strength of opponents (general strength, not SDO)? Is it because the formula needs to be consistent, or easy to understand?
I see that the necklace is an award for the best performer of the period, but you wouldn't usually expect the strength of her opponents to change largely between periods, which is what happens with SDO / VP / whatever pumping.
The necklace system is deliberately designed to be specific to a single period, so someone like Sakurazaki Setsuna (2008 Topaz), Katsura Hinagiku (2009 Ruby), or Eucliwood (2011 Diamond) won't need to worry about records from prior periods dragging them or their opponents down.
Image
User avatar
10ZHAbin
Spirit hunter
Spirit hunter
Posts: 2161
Joined: Tue Sep 04, 2012 9:13 am
Badges:
ImageImage
Worships: Leina
Melon Pan: 50
Location: Otaku Community

Re: Some thoughts on ranking systems

Post by 10ZHAbin » Sun Mar 23, 2014 11:26 pm

maglor wrote: 2. This greatly reduces the chance for a 6-win character to win the necklace over a 7-win character.
I think this is a good thing, because I don't believe a character deserves to win the title of that period if she can't even manage a 7-0.

I took another look at Emerald 2013 where the highest necklace match VF did not belong to the character with the highest SDO.
Eligible Characters:
Tachibana Kanade 10.437
Gokou Ruri 9.682
Misaka Mikoto 9.453
Shiina Mashiro 9.321
Hasegawa Kobato 8.862
Kuroyukihime 8.757
Yuuki Asuna 8.734
Aragaki Ayase 8.701
Takanashi Rikka 8.624
Eucliwood Hellscythe 8.570
Shana 8.482
Nakano Azusa 8.454
Aisaka Taiga 8.401
Akiyama Mio 8.020
Eucliwood Hellscythe would have replaced Shana for having a higher CVP.
[Image: The necklace score based on different factor onto the Necklace V…]

Post by maglor » Mon Mar 24, 2014 6:53 am

Thank you. These are very informative. One thing I keep noticing is how low SVDO is for Misaka Mikoto and Kuroyukihime. While you can significantly lower your rival's SDO with strategic voting if you get lucky, you can't really lower SVDO, since votes against a character have very little impact on that character's CVP. This means that the reason Mikoto and Kuroyukihime have such a low SDO may have more to do with their opponents losing their fanbase votes than with a faction actively trying to vote against them.
Image
User avatar
Shmion84
Moon princess
Moon princess
Posts: 3383
Joined: Mon Apr 02, 2012 6:33 pm
Badges:
ImageImage
Melon Pan: 95
2017 Female Favorite: Konjiki No Yami
2017 Male Favorite: Kyon
Wish: ISML still exists next year
Location: Central Europe

Re: Some thoughts on ranking systems

Post by Shmion84 » Mon Mar 24, 2014 7:14 am

On AVDO:
[Image: graphic for Aquamarine 2011, 2012, 2013]
(without winless characters)

You can clearly see: SVDO (7 wins) > SVDO (6 wins) > SVDO (5 wins) > ...

AVDO (7 wins) is often - not always - higher than AVDO (6 wins). So characters with 6 wins have a chance to win the necklace (which I prefer). The difference in AVDO between contestants of adjacent AVDO rank is just below 0.1.


This is not the first discussion on necklace rules, right?

PS: I missed maglor's 4444th post.

Post by maglor » Mon Mar 24, 2014 7:27 am

Thanks again. There have been many discussions on the necklace rules, but I don't think any of the alternatives proposed has received this amount of analysis by multiple people. Feel free to check out this page -> http://internationalsaimoe.com/forum/vi ... f=4&t=3906 and tell me if you see anything to your liking.

How to balance the need to properly reward 7-win characters while giving 6-win characters a chance is a tough question. We may need to reverse engineer this. The question I'd like to pose to all of you is, "What VF% margin should a 6-win character need over a 7-win character in the necklace group match in order to deserve the necklace?" We need to try to reach an agreement on this value, as it will help us narrow down the range of models and variables to use.

Assuming we have a good idea of the VF% difference to aim for, I do see some ways to lessen the gap between 6- and 7-win characters. The 2 easiest ways would be NP = SVDO + alpha*AVDO + NGP or NP = SVDO + alpha*SVAO + NGP, where SVAO = sum of CVP of all opponents. Since it is possible for 6-win characters to have a higher AVDO or SVAO, we can find the alpha value that requires 6-win characters to win the necklace group match by the desired VF% or more.

By the way, a tie should be handled as a half win in calculating the win percentage, and half of that opponent's CVP will be used in calculating SVDO or AVDO in this scoring system. For SVAO, whether you win, tie, or lose doesn't matter, so the full CVP value should be used.
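
The tie rule in the last paragraph can be sketched as follows. This is only an illustration: the function name and the sample results are hypothetical, and dividing by the effective number of wins to get AVDO is an assumption, since the post does not spell out the denominator.

```python
# Weight applied to an opponent's CVP depending on the match outcome.
OUTCOME_WEIGHT = {"win": 1.0, "tie": 0.5, "loss": 0.0}


def strength_with_ties(results):
    """results: list of (opponent_cvp, outcome) pairs.

    Returns (svdo, avdo, svao, effective_wins).
    A tie counts as half a win and contributes half of the opponent's CVP.
    SVAO ignores the outcome and always uses the full CVP.
    """
    svdo = sum(cvp * OUTCOME_WEIGHT[outcome] for cvp, outcome in results)
    effective_wins = sum(OUTCOME_WEIGHT[outcome] for _, outcome in results)
    # Assumption: AVDO divides by effective wins (1 per win, 0.5 per tie).
    avdo = svdo / effective_wins if effective_wins else 0.0
    svao = sum(cvp for cvp, _ in results)
    return svdo, avdo, svao, effective_wins


# Made-up example: one win, one tie, one loss.
example = [(8.0, "win"), (6.0, "tie"), (9.0, "loss")]
print(strength_with_ties(example))
```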

Post by maglor » Mon Mar 24, 2014 9:20 pm

I did some rough calculations with the goal of requiring a 10% gap, which is about what a strong 6-win contestant usually has over a weak 7-win contestant. The formula that seems to work for this is

NP = SVDO + SVAO + 100*NGP, where 100 is the multiplier for the necklace group match result. I'm afraid this formula means nothing if we can't agree on a hurdle height of 10%.

Post by Shmion84 » Mon Mar 24, 2014 9:40 pm

So I read some of the 'Necklace and SDO Discussion' thread. I get the feeling it is all about an arms race between ISML staff and several factions. Even though manipulation should be prevented, the system should never be too complicated or develop negative side effects (such as voting for your favourite character in a match worsening her rating).
In 2012 I collected some examples for the SDO: http://internationalsaimoe.com/forum/vi ... 93#p172165
The AVDO has a similar problem: Your favourite character's AVDO will decline, if she wins against a weak character with a low CVP. She has to lose to keep her old (better!) AVDO.
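
A small worked example of this AVDO side effect, with made-up CVP values: adding a win over a weak opponent raises SVDO but pulls the average down.

```python
def avdo(defeated_cvps):
    """Average CVP of defeated opponents."""
    return sum(defeated_cvps) / len(defeated_cvps)


# Hypothetical CVP values of the opponents beaten so far.
beaten_so_far = [8.0, 7.5, 7.0]
# The same list after one more win, this time over a weak opponent (CVP 4.0).
after_weak_win = beaten_so_far + [4.0]

avdo_before = avdo(beaten_so_far)
avdo_after = avdo(after_weak_win)
svdo_before = sum(beaten_so_far)
svdo_after = sum(after_weak_win)

print(avdo_before, avdo_after)  # the average drops after the extra win
print(svdo_before, svdo_after)  # the sum still goes up
```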

Back on topic:
maglor wrote:"What VF% margin should a 6-win character need over a 7-win character in the necklace group match in order to deserve the necklace?"
I would say about 5 %.

I think a "good" ranking system should
- not be too complicated
- not punish votes for a character (whose rank drops as a result)
- not be too vulnerable to coincidences (differences in SVDO between characters with the same number of wins were small)

Any new system should be tested intensively on earlier Periods.


Note: We can comply with the rule of heraldry if Aquamarine is the second period and Topaz is the fourth period.

Post by maglor » Mon Mar 24, 2014 10:56 pm

Thank you for your input. For a 5% margin, since the difference in SVDO between 7-win characters and 6-win characters is occasionally about 5, using a multiplier of 100 for NGP seems appropriate.

For simplicity's sake, I am now leaning towards NP = SVDO + 100*NGP

Compared to the current system, which awards the loser's points as (loser's VF)/(winner's VF), votes for the winner will have less direct impact, as this will be replaced by (loser's VF)/(average VF of all contestants on that match day).

As for differences in SVDO, their practical significance is much greater than their relative size suggests. As I mentioned before, the average NGP difference between adjacent finishing ranks is about 0.03 to 0.04. Any difference value needs to be normalized by dividing it by 0.035. In this sense, a difference in SVDO can be significant enough. This is why we also need to think about the multiplier.

I think we now need some measure of the quality of the scoring/ranking systems, so I introduce SARD = sum of absolute rank differences.

For the current necklace point system, SARD = ( sum of absolute values of (NP rank - SDO rank) + sum of absolute values of (NP rank - NGP rank) ) / 2

We want a system that maximizes SARD.

I am considering the following 3 challengers to the current NP system, NPC = SDO/3 + 100*NGP:

(1) NP1 = SVDO + 50 * NGP
(2) NP2 = SVDO + 100*NGP
(3) NP3 = SVDO + SVAO + 50* NGP

SARD for NP1 and NP2 will be ( sum of absolute values of (NP rank - SVDO rank) + sum of absolute values of (NP rank - NGP rank ) )/2
SARD for NP3 will be ( sum of absolute values of (NP rank - SVDO rank) + sum of absolute values of (NP rank - SVAO rank) + sum of absolute values of (NP rank - NGP rank ) )/3
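
The SARD definitions above can be sketched in Python as below. The helper names and the three-character sample data are invented for illustration, and this version assumes no tied scores.

```python
def ranks(scores):
    """Map each name to its rank, 1 = highest score (assumes no ties)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}


def abs_rank_diff(rank_a, rank_b):
    """Sum over all contestants of |rank_a - rank_b|."""
    return sum(abs(rank_a[name] - rank_b[name]) for name in rank_a)


def sard(np_scores, components):
    """Average, over the component scores, of the summed absolute rank
    differences between the NP ranking and that component's ranking."""
    np_rank = ranks(np_scores)
    totals = [abs_rank_diff(np_rank, ranks(c)) for c in components]
    return sum(totals) / len(totals)


# Hypothetical necklace group of three characters.
svdo = {"A": 49.0, "B": 47.0, "C": 45.0}
ngp = {"A": 0.10, "B": 0.20, "C": 0.15}
np1 = {name: svdo[name] + 50 * ngp[name] for name in svdo}

print(sard(np1, [svdo, ngp]))
```

For NP3 the same function would be called with three components, `[svdo, svao, ngp]`, which gives the division by 3.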

A nice thing about the SARD measure is that the central limit theorem will likely punish the SARD score of a more complicated model.

Since we are merely comparing fixed models without fitting any coefficients, I don't think we need more complicated validation techniques such as cross-validation, estimating uncertainty through a linear weight matrix, or bootstrapping. Still, I think we do need the average SARD value for ALL four models over our entire sample space, which is all the necklace periods in 2011, 2012, and 2013. Please note that for 2011 we will use the same NPC as in 2012 and 2013, not the NP system actually used in 2011.

To judge significance, we need the average SARD value AND the STDEV of the SARD values for each model. The standard procedure is to divide STDEV by sqrt(number of samples = 7+5+5 = 17), which gives STERR, and then construct a 95% confidence interval [average - 2*STERR, average + 2*STERR] for each of the 4 models. Let's not be too concerned with multiple comparison issues yet, since we are using only 4 models. If a model's SARD confidence interval (CI) does not overlap with any other model's SARD CI, we can say with very high confidence that its SARD value is significantly different. If one of the models turns out to have a significantly higher SARD value than the others, we really need to consider that model superior.

I know this will take a good deal of computer work, and we will fill a great amount of visual space with tables and tables of numbers, but this is a task worth taking on if we are to have any credible case for changing the necklace point system.
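
A rough sketch of that procedure in Python, using only the standard library. The function names are hypothetical, and using the sample standard deviation (what a spreadsheet STDEV function normally computes) is an assumption.

```python
import math
import statistics


def confidence_interval(samples, z=2.0):
    """Return (low, high) = mean -+ z * STERR, with STERR = STDEV / sqrt(n)."""
    mean = statistics.mean(samples)
    sterr = statistics.stdev(samples) / math.sqrt(len(samples))
    return mean - z * sterr, mean + z * sterr


def overlaps(ci_a, ci_b):
    """True if two (low, high) intervals share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]


def stands_apart(name, intervals):
    """True if the named model's CI overlaps no other model's CI."""
    return all(
        not overlaps(intervals[name], ci)
        for other, ci in intervals.items()
        if other != name
    )
```

With one list of 17 per-period SARD values per model, `intervals = {model: confidence_interval(values)}` followed by `stands_apart(model, intervals)` would answer the question posed above.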



The above method is very important. We must consider average SARD value in any discussion of changes to NP system.

P.S. : Care to tell us more about the rule of Heraldry ?

P.S. 2: I have been thinking more about SARD and believe MSARD = minimum of the sums of absolute rank differences might be a better quantity to maximize. The following are the MSARD equations for the 4 models at hand.

NPC : min( sum of absolute values of (NP rank - SDO rank) , sum of absolute values of (NP rank - NGP rank) )
NP1 & NP2 : min( sum of absolute values of (NP rank - SVDO rank) , sum of absolute values of (NP rank - NGP rank) )
NP3 : min( sum of absolute values of (NP rank - SVDO rank) , sum of absolute values of (NP rank - SVAO rank) , sum of absolute values of (NP rank - NGP rank) )

One problem with this is that using the minimum may too severely penalize a system for having more components. Still, this optimization strategy does a better job of making sure all the components in the NP system contribute in a meaningful way. If time allows, we should consider whether using MSARD leads to selecting a different system than using SARD.
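
To illustrate the difference between the two measures, here is a small sketch that takes rankings that are already computed. The function names and the example rankings are hypothetical.

```python
def abs_rank_diff(rank_a, rank_b):
    """Sum over all contestants of |rank_a - rank_b|."""
    return sum(abs(rank_a[name] - rank_b[name]) for name in rank_a)


def sard(np_rank, component_ranks):
    """Mean of the per-component sums of absolute rank differences."""
    totals = [abs_rank_diff(np_rank, r) for r in component_ranks]
    return sum(totals) / len(totals)


def msard(np_rank, component_ranks):
    """Minimum of the per-component sums of absolute rank differences."""
    return min(abs_rank_diff(np_rank, r) for r in component_ranks)


# Made-up case where the final NP ranking simply copies the NGP ranking.
np_rank = {"A": 2, "B": 1, "C": 3}
svdo_rank = {"A": 1, "B": 2, "C": 3}
ngp_rank = {"A": 2, "B": 1, "C": 3}

print(sard(np_rank, [svdo_rank, ngp_rank]))
print(msard(np_rank, [svdo_rank, ngp_rank]))
```

In this example SARD is still positive, while MSARD is zero because the NP ranking is identical to one of its components, which is the situation the minimum is meant to flag.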

P.S. 3: I do know that, since we have a quantity to optimize, we could make this a true optimization problem by using models like

NPV = SVDO + C1*NGP

and then letting the optimizer find the best C1. We could even use a cross-validation scheme to select whether the model using SDO, SVDO, or SVDO+SVAO is best, by splitting the 17 necklace periods into 4 sets and letting the optimizer find the best coefficient value for each case. The reason I avoided the temptation to do this is that it would result in coefficient values like 73.5246, which are harder to grasp than the simple 300 (for the current system, NPC), 100, or 50 being discussed here.
Image
User avatar
Vella
Gym leader
Gym leader
Posts: 173
Joined: Wed Oct 09, 2013 5:33 pm
Worships: Haibara Ai
Melon Pan: 50
Location: North American Aerospace Defense Command

Re: Some thoughts on ranking systems

Post by Vella » Tue Mar 25, 2014 11:36 am

maglor wrote:The necklace system is deliberately designed to be specific to a single period, so someone like Sakurazaki Setsuna (2008 Topaz), Katsura Hinagiku (2009 Ruby), or Eucliwood (2011 Diamond) won't need to worry about records from prior periods dragging them or their opponents down.
But if my memory is correct, Eucliwood was 5-2 with SDO 57 in 2011 Diamond, behind Suzumiya Haruhi in SDO and Nakano Azusa in SoS.

Post by Shmion84 » Tue Mar 25, 2014 8:14 pm

(1) First I calculated SARD and MSARD for the current, fractional SDO system.

SARD
Average: 7.029
Standard deviation: 1.851
Standard error: 0.449
95.5 % confidence interval: 0.953 - 2.749

MSARD
Average: 4.647
Standard deviation: 2.588
Standard error: 0.628
95.5 % confidence interval: 1.332 - 3.844
[Image: full graphic]
ABS (1) = sum of absolute values of (NP rank - SDO rank)
ABS (2) = sum of absolute values of (NP rank - NGP rank)

(2) I also calculated NP1, NP2 and NP3 for Aquamarine 2011. I have doubts about NP3:
NP3 = SVDO + SVAO + 50* NGP
For 7-win characters SVDO (= sum of CVP of defeated opponents) and SVAO (= sum of CVP of all opponents) are identical. In this case NGP has only half as much influence as in NP1!
In any case, I uploaded the spreadsheet to Google Docs. I have not taken draws into account. You can check for errors (and look for easier functions):

https://docs.google.com/spreadsheets/d/ ... sp=sharing

SARD and MSARD values in BU1:BW4.
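
The doubt about NP3 raised in (2) can be illustrated with two hypothetical 7-win characters (all numbers invented). Because SVAO equals SVDO for them, the SVDO gap counts twice under NP3 while the NGP term stays the same.

```python
def np1(svdo, ngp):
    return svdo + 50 * ngp


def np3(svdo, svao, ngp):
    return svdo + svao + 50 * ngp


# Two made-up 7-win characters; with 7 wins every opponent was defeated,
# so SVAO is the same number as SVDO.
x = {"svdo": 49.0, "ngp": 0.10}
y = {"svdo": 47.0, "ngp": 0.16}

np1_x, np1_y = np1(x["svdo"], x["ngp"]), np1(y["svdo"], y["ngp"])
np3_x = np3(x["svdo"], x["svdo"], x["ngp"])
np3_y = np3(y["svdo"], y["svdo"], y["ngp"])

print(np1_y > np1_x)  # under NP1, Y's NGP lead outweighs X's SVDO lead
print(np3_x > np3_y)  # under NP3, the doubled SVDO gap puts X back in front
```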

(3) I am no expert on heraldry. Last year I searched for the inheritance of coats of arms and found this: https://en.wikipedia.org/wiki/Tincture_(heraldry). 6 of 7 tinctures were connected to Necklace-Gemstones:

Metals
Gold/Yellow - Topaz
Silver/White - Pearl
Colours
Blue - Sapphire
Red - Ruby
Purple - Amethyst
Black - Diamond
Green - Emerald

I thought this would have been intentional and Pearl was replaced by Aquamarine. Maybe it is only a coincidence.

However, Wikipedia mentions the rule of tincture: metal must never be placed upon metal, nor colour upon colour.
This rule also applies to many flags. Look at: https://commons.wikimedia.org/wiki/Sove ... tate_flags (many tricolour flags like Austria, Belgium, France, Ireland, Italy, Netherlands, or the Nordic Cross flags of Denmark, Finland, Iceland, Norway and Sweden)

If we count Aquamarine as Silver/White (Pearl) the necklace gemstones consist of 2 metals (Aquamarine and Topaz) and 3 colours (Amethyst, Ruby and Emerald). The arrangement of necklace periods can comply with the rule using any colour - metal - colour - metal - colour combination.

Post by maglor » Tue Mar 25, 2014 8:51 pm

Thank you. You had a question about NP3, specifically why SVAO was added to the mix. The idea is that 6-win characters will likely have a lower SVDO but a slightly greater SVAO compared to 7-win characters. By adding SVAO, I am giving 6-win characters a slightly better chance to overcome the SVDO deficit compared to NP1. It is also a test to see whether our NP models are too simple. If we see a large increase in SARD value with SVAO in the mix, we may need to more seriously consider adding SVAO or some other 3rd factor to the NP formula. For 2011 Aquamarine, SARD value for NP3 is large because of Kanade. I still expect NP3's SARD value to be comparable to or below other models when we bring in other periods and other years.

One thing of note. I think you have a typo in your spreadsheet, as the CI should be Average +- 2* STERR, but you did STDEV +- 2* STERR.

By the way, nice catch about this being a 95.5% CI, since a more formal 95% CI would have used 1.96 instead of 2. I didn't bother listing 1.96, since I didn't think we needed that much formality, and since we have a multiple comparison situation, we could have considered an alpha as low as 5%/(4 choose 2 = 6) = 0.83%, which may be too conservative for our needs.
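
The two numbers in that last paragraph can be checked with a few lines of Python. The function names are made up, and the coverage formula assumes a normal distribution.

```python
import math


def pairwise_alpha(alpha, n_models):
    """Split alpha across all pairwise comparisons among n_models models."""
    return alpha / math.comb(n_models, 2)


def coverage(z):
    """Two-sided normal coverage of the interval mean +- z standard errors."""
    return math.erf(z / math.sqrt(2))


print(round(pairwise_alpha(0.05, 4) * 100, 2))  # percent per comparison
print(round(coverage(2.0) * 100, 1))            # percent covered with 2 * STERR
print(round(coverage(1.96) * 100, 1))           # percent covered with 1.96 * STERR
```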

Post by Shmion84 » Tue Mar 25, 2014 9:03 pm

maglor wrote:For 2011 Aquamarine, SARD value for NP3 is large because of Kanade. I still expect NP3's SARD value to be comparable to or below other models when we bring in other periods and other years.
I think it is because of Eucliwood Hellscythe's and Nakamura Yuri's combination of high SVAO (both ranked #1) and low NP3 (ranked #6 and #7 respectively).
maglor wrote:One thing of note. I think you have a typo in your spreadsheet, as the CI should be Average +- 2* STERR, but you did STDEV +- 2* STERR.
You are right. Thanks for your attention. The correct values:
SARD-CI: 6.132 - 7.927
MSARD-CI: 3.392 - 5.903
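
As a quick check of the correction, the intervals can be recomputed from the posted averages and standard deviations with n = 17 periods. The helper name is hypothetical, and results can differ from the posted figures in the last digit because the inputs here are already rounded.

```python
import math


def interval(center, stdev, n, z=2.0):
    """Return (center - z * STERR, center + z * STERR), STERR = stdev / sqrt(n)."""
    sterr = stdev / math.sqrt(n)
    return center - z * sterr, center + z * sterr


N_PERIODS = 17

# Centering on the standard deviation reproduces the first, mistaken intervals.
sard_mistaken = interval(1.851, 1.851, N_PERIODS)
msard_mistaken = interval(2.588, 2.588, N_PERIODS)

# Centering on the average gives the corrected intervals.
sard_corrected = interval(7.029, 1.851, N_PERIODS)
msard_corrected = interval(4.647, 2.588, N_PERIODS)

for label, ci in [("SARD", sard_corrected), ("MSARD", msard_corrected)]:
    print(label, round(ci[0], 3), round(ci[1], 3))
```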

Post by maglor » Tue Mar 25, 2014 9:10 pm

Yeah, you are right about Eucliwood and Yuri. In case of a tie in the rank, use the average. Here it would have been better to give Eucliwood and Yuri an SVAO rank of 1.5 each, because using the mean rank will likely reduce the overall variance.

Re: Some thoughts on ranking systems

Post by Shmion84 » Tue Mar 25, 2014 10:22 pm

maglor wrote:In case of a tie in the rank, use the average. Here it would have been better to give Eucliwood and Yuri an SVAO rank of 1.5 each, because using the mean rank will likely reduce the overall variance.
Yes, that is a better idea, so I replaced the RANK function with RANK.AVG.
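Outside Excel, the same tie handling can be sketched in a few lines of Python. This is only an illustration of average ranking in descending order (the RANK.AVG default); the function name and the sample scores are hypothetical.

```python
def rank_avg(values, descending=True):
    """Rank values; tied entries all get the mean of the positions they occupy."""
    order = sorted(values, reverse=descending)
    ranks = []
    for v in values:
        first = order.index(v) + 1   # best position occupied by this value
        count = order.count(v)       # number of entries sharing the value
        ranks.append(first + (count - 1) / 2)
    return ranks

# Two contestants tied for the top score share rank 1.5 instead of both getting 1.
print(rank_avg([10, 10, 7]))  # [1.5, 1.5, 3.0]
```

With plain RANK both tied entries would get rank 1, which is how two contestants both ended up at #1 for SVAO.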

For the SDO calculation:
SARD-CI changed again: 6.065 - 7.817
MSARD-CI remained unchanged: 3.392 - 5.903

The SDO "shared rank" phenomenon last appeared in 2011 (thanks to the seeding, I guess).

Re: Some thoughts on ranking systems

Post by maglor » Tue Mar 25, 2014 11:15 pm

Shmion84 wrote:
maglor wrote:In case of a tie in the rank, use the average. Here it would have been better to give Eucliwood and Yuri an SVAO rank of 1.5 each, because using the mean rank will likely reduce the overall variance.
Yes, that is a better idea, so I replaced the RANK function with RANK.AVG.

For the SDO calculation:
SARD-CI changed again: 6.065 - 7.817
MSARD-CI remained unchanged: 3.392 - 5.903

The SDO "shared rank" phenomenon last appeared in 2011 (thanks to the seeding, I guess).
Thanks. Once we have the SARD and MSARD values for NP1, NP2, and NP3, we can wrap up this phase of the discussion.
10ZHAbin
Spirit hunter
Spirit hunter
Posts: 2161
Joined: Tue Sep 04, 2012 9:13 am
Badges:
ImageImage
Worships: Leina
Melon Pan: 50
Location: Otaku Community

Re: Some thoughts on ranking systems

Post by 10ZHAbin » Wed Mar 26, 2014 4:02 am

This is completely unrelated to all the stats above, but would it be possible to have two necklaces per period, one for each division (Nova and Stella)?

Re: Some thoughts on ranking systems

Post by maglor » Wed Mar 26, 2014 6:04 am

10ZHAbin wrote:This is completely unrelated to all the stats above, but would it be possible to have two necklaces per period, one for each division (Nova and Stella)?
There are two hurdles for this.

(1) This would mean one more poster per period. How much are you willing to spend to bribe Hikari-chan for this?

(2) This also means there won't be any top-tier match between Nova and Stella before Postseason Phase II. I know some people actually want more opportunities for Nova and Stella to face each other. How much are you willing to spend to bribe these people?

Re: Some thoughts on ranking systems

Post by Shmion84 » Wed Mar 26, 2014 6:04 pm

Excel has calculated the following results:

NP1 = SVDO + 50 * NGP

CI (= confidence interval) using SARD (= sum of absolute rank differences): 5.227 - 7.008
CI using MSARD (= minimum of the sums of absolute rank differences): 2.978 - 5.257

NP2 = SVDO + 100 * NGP

CI-SARD: 5.182 - 7.053
CI-MSARD: 3.272 - 5.317

NP3 = SVDO + SVAO + 50 * NGP

CI-SARD: 7.202 - 9.464
CI-MSARD: 3.188 - 6.106

for comparison:

NPC = SDO/3 + 100 * NGP

CI-SARD: 6.065 - 7.817
CI-MSARD: 3.392 - 5.902
maglor wrote:The standard procedure would be to divide the STDEV value by sqrt(number of samples = 7+5+5 = 17), which gives STERR, and then construct a 95% confidence interval as the range [average - 2*STERR, average + 2*STERR] for each of the 4 models. Let's not be too concerned with multiple-comparison issues yet, since we are using only 4 models. If any model has a SARD confidence interval (CI) that does not overlap with any of the other SARD CIs (so that we can say with very high confidence that its SARD value is significantly different), then we really need to consider that model to be superior to the other models.
Looking at the values (both SARD and MSARD) I would say NP1, NP2 and NP3 are not superior to NPC.
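The overlap criterion from the quote can be checked mechanically. Here is a small Python sketch using the SARD intervals listed in this post; the helper names are made up for illustration, and it only tests for overlap, not for which model is better.

```python
# SARD confidence intervals as reported above (low, high).
sard_ci = {
    "NP1": (5.227, 7.008),
    "NP2": (5.182, 7.053),
    "NP3": (7.202, 9.464),
    "NPC": (6.065, 7.817),
}

def overlaps(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

def isolated_models(intervals):
    """Names of models whose interval overlaps with none of the others."""
    return [
        name for name, ci in intervals.items()
        if not any(overlaps(ci, other)
                   for other_name, other in intervals.items()
                   if other_name != name)
    ]

print(isolated_models(sard_ci))  # []
```

NP3's interval lies entirely above those of NP1 and NP2, but it still overlaps NPC's, so no model is separated from all the others.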

The best result for SVDO + C1*NGP is C1 = 90:

CI-SARD: 5.221 - 7.132
CI-MSARD: 3.327 - 5.261
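For readers following along, the SVDO + C1 * NGP family of scores can be written as one small Python function. This is only a sketch: the function names and the input numbers are hypothetical, and NGP is taken to be the contestant's share of the votes in the necklace group match, as defined earlier in the thread.

```python
def ngp(votes_received, total_votes):
    """Necklace group performance: share of the votes in the group match."""
    return votes_received / total_votes

def necklace_score(svdo, ngp_value, c1=50, svao=0.0):
    """SVDO + C1 * NGP, with an optional SVAO term (as in NP3)."""
    return svdo + svao + c1 * ngp_value

# Hypothetical inputs, for illustration only.
share = ngp(votes_received=500, total_votes=2000)   # 0.25
for c1 in (50, 90, 100):
    print(c1, necklace_score(svdo=30.0, ngp_value=share, c1=c1))
```

Changing `c1` only shifts the weight given to the group match relative to SVDO, which is the quantity being tuned in the C1 = 90 result.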