Some thoughts on ranking systems

User avatar
matchbaby
Necromancer
Necromancer
Posts: 887
Joined: Sat Jun 28, 2014 9:51 am
Badges:
ImageImage
Worships: Isla
Melon Pan: 65
Wish: Isla!!! All hail for Isla!!!
Shiro and Sora!! As least win something!!
Same as Illya~!
Location: Hong Kong, China

Re: Some thoughts on ranking systems

Post by matchbaby » Tue Aug 26, 2014 3:30 am

maglor wrote:
matchbaby wrote:I think RPI is too difficult -_-
Why don't you make an SWVDO (sum of defeated opponents' weighted VP over one period)?
As I see it, the idea of SDO is to reduce the effect of the NM vote and give some benefit to strong 1v1 characters.
However, it is very unfair to Tier 2 and Wildcard characters. (It also depends on luck, and the SDO of a single period CAN'T reliably show the strength of a character.)
SWVDO can solve the above problems.
In addition, anti-voting will happen all the way through the period if we use SWVDO instead of SDO, so more miracles can be seen. (Many people complain that Saimoe is boring because there are not many upsets.)

Also, for PSI's Swiss system, I have something to say.
I think the rules should be changed as follows:
1. Most wins in PSI.
2. If A and B have the same number of wins, the winner of A vs B is ahead (if A hasn't met B in PSI, include the regular season).
3. If A, B and C have the same number of wins and A>B, B>C, C>A, then compare their strongest opponents.
4. Regular season ranking.
For the necklace, thank you for your comments. I want to inform you that we are looking at several alternatives which may better help Tier 2 or Wildcard.

For PSI, it would take a very rare set of coincidences for your 4 line rule to give a different result from the current 3 line rule. I would appreciate it if you could give us some examples where your rule has a different outcome from the current 3 line rule, so I can better understand which cases I forgot to consider.
Well, SWVDO is surely much, much fairer to all Tier 2 and Wildcard characters than SDO.

For PSI, I will use Nova as an example.
Assume Asuna vs Kotori and Yoshino vs Kurumi in the PSI semi-finals, and Kurumi and Kotori win.
Asuna has faced KYH and Yoshino has faced Eru. In the final result KYH is 3W1L and Eru is 2W2L, so Asuna will be ahead and grab the third spot, and Yoshino will be demoted to fourth. (As Kurumi loses in the final, she must be second rather than fourth or fifth.)

However, Yoshino beat Asuna and KYH in the regular season, and KYH beat Asuna in the regular season. They are all 3W1L, but the "weakest", Asuna, is ahead.
So third should be Yoshino (beat Asuna in the regular season), fourth Asuna (beat KYH in PSI), and fifth KYH.

The same goes for 2W2L. Assume Eru, Tsu, Rikka, Tohka, Nanami and Shinka are all 2W2L.
Under the present rule, Eru (beat Nanami and Tohka) > Tsu (beat Nanami and Tohka) > Rikka (beat KUROUSAGI and Shinka).
However, in the regular season, Eru > Tsu > Rikka > Eru, which is A>B>C>A, so I think using the regular season ranking (Rikka > Eru > Tsu) is better than the current one (Eru > Tsu > Rikka)?
User avatar
maglor
~Fukou da~
~Fukou da~
Posts: 8618
Joined: Thu Feb 26, 2009 8:57 pm
Badges:
ImageImage
Worships: Abriel Nei Debrusc Borl Paryun Lafiel
Melon Pan: 75
2018 Female Favorite: Chtholly Nota Seniorious
2018 Male Favorite: Yang Wenli
2017 Female Favorite: Tomori Nao
2017 Male Favorite: Yang Wenli
Wish: More people being open to alternatives and compromises.

Re: Some thoughts on ranking systems

Post by maglor » Tue Aug 26, 2014 8:19 am

We wish to make the PS I ranking be determined mostly by what the character did in the Postseason. Regular season ranking comes into play only because we don't have enough matches within PS I to truly differentiate rank 1 through rank 16. Now let's look at your case in more detail. Here is the final ranking in your scenario, assuming NO UPSET happens other than Kurumi over Yoshino.

final rank
1 Itsuka Kotori 4w
2 Tokisaki Kurumi 3w1l 1qw
3 Yūki Asuna 3w1l 1qw
4 Yoshino 3w1l
5 Kuroyukihime 3w1l
6 Chitanda Eru 2w2l 2qw
7 Tsutsukakushi Tsukiko 2w2l 2qw
8 Takanashi Rikka 2w2l 1qw
9 Yatogami Tōka 2w2l
10 Aoyama Nanami 2w2l
11 Kurousagi 2w2l
12 Nibutani Shinka 1w3l 1qw
13 Yaya 1w3l 1qw
14 Azuki Azusa 1w3l
15 Yukinoshita Yukino 1w3l
16 Momo Belia Deviluke 4l


Let's look at each case more carefully.

1. Kotori being #1 is no problem, since she is the only unbeaten.
2. Kurumi beat 14 Yaya, 6 Takanashi Rikka, and 2 Yoshino, making it to the final. Beating Yoshino gives her a quality win, so her being 2nd is not a problem.
3. Asuna beat Azuki Azusa (1w3l), Kuroyukihime (3w1l), and Chitanda Eru (2w2l). She lost to the CHAMP, Kotori. Compare that to
4. Yoshino beat Yukinoshita Yukino (1w3l), Chitanda Eru (2w2l), and Tsutsukakushi Tsukiko (2w2l), while losing to the RUNNER UP, Kurumi. Asuna beat 2 opponents who are better than Yoshino's 2 opponents. Asuna lost to the champ while Yoshino lost to the runner up. I find it hard to say Yoshino had a better PS I performance than Asuna. Yoshino being 4th is deserved.
5. There is a quirk in the system that gives the 6th seed, Takanashi Rikka, an EASIER PATH to a top 8 spot into Phase II, but at the cost of bumping her down to the 8th seed instead of the 6th seed spot she started from, if NO UPSET happens. While some might dislike it, do remember that Rikka will have a much greater chance of avoiding upsets than the 7th or 8th seed entering PS I. The price for the 7th and 8th seeds in PS I is a GREAT chance of an upset by the 9th and 10th seed girls. If they overcome this hardship, I think their performance earns them a deserved bump up the ranks into PS II.

Consider this: Yoshino faced weaker opponents throughout her schedule than Asuna did. She had a much easier path to 3 wins, which guarantees a top 5 spot. This guarantee may be worth the danger of being bumped down a spot by Asuna, who has much less chance of winning 3 times than Yoshino. ISML has always liked girls who defeat tougher opponents more than girls who beat up cupcakes.

Finally, you always have some upsets in the postseason. Those upsets make it hard to use the regular season ranking (= seed going into PS I) as the major indicator of strength. When several upsets do happen, I believe the current 3 line rule will reflect how impressive a character's performance is much better than anything I have seen so far.
User avatar
Shmion84
Moon princess
Moon princess
Posts: 3383
Joined: Mon Apr 02, 2012 6:33 pm
Badges:
ImageImage
Melon Pan: 95
2017 Female Favorite: Konjiki No Yami
2017 Male Favorite: Kyon
Wish: ISML still exists next year
Location: Central Europe

Re: Some thoughts on ranking systems

Post by Shmion84 » Tue Aug 26, 2014 5:29 pm

maglor wrote:Please tell me whether you would prefer 2c) case or think RPI, RPI-3, or something else is better. Also feel free to take your time with other analysis. We can't implement the change for half a year anyway.
2c (PSAO) has a big advantage: You can calculate it using values of the Period Standings only. (You would need to add a column for RPI value.)
It will be fun to read the Council Directive introducing NGP * 125 + ( pts / 21 ) * SAO as the new Necklace Formula.

PSAO and the other proposed formulas need further investigation.
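To make the proposed formula concrete, here is a minimal sketch of NGP * 125 + ( pts / 21 ) * SAO. The function names and the two contenders are invented for illustration, and it assumes that pts is the character's period points (21 for a 7-0 record, so 18 is taken to mean 6-1 at 3 points per win) and that NGP is a vote share between 0 and 1.

```python
def psao(period_points: float, sao: float, max_points: float = 21.0) -> float:
    """PSAO = (pts / 21) * SAO, as written in the thread."""
    return (period_points / max_points) * sao


def necklace_score(ngp: float, period_points: float, sao: float,
                   ngp_multiplier: float = 125.0) -> float:
    """Proposed necklace formula: NGP * multiplier + PSAO."""
    return ngp * ngp_multiplier + psao(period_points, sao)


# Hypothetical contenders: (name, NGP share, period points, SAO).
contenders = [
    ("A", 0.20, 21, 90.0),  # assumed 7-0 record, weaker necklace match
    ("B", 0.24, 18, 95.0),  # assumed 6-1 record, stronger necklace match
]

for name, ngp, pts, sao in contenders:
    print(name, round(necklace_score(ngp, pts, sao), 3))
```

Changing `ngp_multiplier` is the knob the MRC ranges in this thread are about: a larger value makes the necklace match itself count for more relative to the period record.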
maglor wrote:[another late addendum]

Please tell us

(1) Optimum multiplier for NGP using current SDO, when you calculate MRC with only the 7-0 characters. For example, if a necklace match had 5 girls with 7-0 record and 2 girls with 6-1, then recalculate the ranks among 7-0 characters and use only that. If you are using "rank" function, I think you can do this by deleting all data from girls with one or more loss, and making sure that the range considered by the "rank" function is the correct one.
(2) Tell us what is the range of multiplier for NGP that will keep MRC value below 0.8 for the above case ( the case where you consider only the 7-0 girls ), and also for PSAO = ( pts / 21 ) * SAO case where you would also consider girls with one or more loss.

Sorry again, but I would appreciate it if you take care of these 2 cases of multiplier values before everything else.
(1) MRC for factor 165: 0.7436
< 0.8 range: 93 - 291.6
(using recalculated necklace match percentage: MRC for factor 165: 0.7481, < 0.8 range: 89.8 - 264)
Keep in mind only 87 of 154 contenders got a 7-0 score (seven times Shiina Mashiro and Misaka Mikoto!)

(2) PSAO, < 0.8 range: 93 - 200.7

Post by maglor » Tue Aug 26, 2014 8:54 pm

Thank you very much. It seems that we made the right decision to reduce the NGP multiplier from 300 to 220 this year. One thing that bugs me is that if we reduce the NGP multiplier further, some voters may complain that the necklace group match does not matter much. Setting aside all this MRC analysis and just looking at several necklace results, what does your gut feeling tell you is the best multiplier to use? I welcome people other than Shmion84 to answer this question as well.

Post by Shmion84 » Wed Aug 27, 2014 5:46 am

maglor wrote:Easy one first

1. Use Flat Bonus system : For example 1st in each division gets +3%, 2nd in each gets +2%, 3rd gets +1%, and the Wildcard get +0% Add these bonus to the necklace match result. Please tell me what will be the MRC in this case. ( I guess that means we have to say 1st in each division is actually rank 1.5, 2nd is 3.5, 3rd is 5.5 and the wildcard is the 7 for the bonus rank ) We can try to optimize the bonus percentage as well. ( that will be like giving the 1st +3 * C %, 2nd : +2*C% and so forth )
(a) Using 3 %, 2 %, 1%
MRC for (219.5 * NGP + SDO) + Bonus: 0.7614

(b)

Post by maglor » Wed Aug 27, 2014 5:55 am

oh! the formula is even simpler for this case. It is

Necklace score = Bonus * C + NGP * D

Both C and D can be flexible, so easiest would be to set D = 1

Post by Shmion84 » Wed Aug 27, 2014 8:36 pm

So in this case, you don't need a D.

For C=1 MRC: 0.8994

For C=3.4 MRC: 0.7659 (C=3.4 implies 0.102, 0.068 and 0.034 bonus points)

MRC <0.8 range: 2.66 - 4.35
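As a rough illustration of the flat bonus variant with D = 1, the sketch below computes Necklace score = Bonus * C + NGP. The dictionary keys and function name are made up for this example, and it assumes the bonuses are expressed as fractions (3% = 0.03) so that C = 3.4 gives the 0.102, 0.068 and 0.034 bonus points quoted above.

```python
# Base bonuses from the proposal: 1st +3%, 2nd +2%, 3rd +1%, Wildcard +0%.
BASE_BONUS = {"1st": 0.03, "2nd": 0.02, "3rd": 0.01, "wildcard": 0.0}


def flat_bonus_score(ngp: float, division_place: str, c: float = 3.4) -> float:
    """Necklace score = Bonus * C + NGP (D fixed at 1)."""
    return BASE_BONUS[division_place] * c + ngp


# Hypothetical necklace match: (name, place in division, NGP share).
field = [
    ("division winner", "1st", 0.16),
    ("wildcard", "wildcard", 0.25),
]

for name, place, ngp in field:
    print(name, round(flat_bonus_score(ngp, place), 4))
```

With these invented numbers the division winner ends up ahead despite a much lower vote share, which shows how strongly C in the 2.66 - 4.35 range can tilt a close necklace match.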

Post by maglor » Wed Aug 27, 2014 8:49 pm

Thanks. This is very informative.

Post by Shmion84 » Thu Aug 28, 2014 9:04 pm

maglor wrote:@Shmion84

Also

3. Please tell us the correlation value between SDO rank and RPI rank for each period. For this one, you can use all 72 girls if you want to or just the 7 girls in the necklace group match. To be more clear, I am looking for 1 correlation values between ( N years* 5(7) periods * 72 ( or 7 ) girl's RPI ranks) and ( N years * 5(7) periods * 72 ( or 7 ) girl's SDO ranks).
I used the ranks of all 72 (2011: 50) girls (1430 data sets total).
Correlation (SDO ranks; RPI ranks) = 0.969978639
Largest rank difference: 23 (Roromiya Karuta in 2013 Aquamarine, SDO #34, RPI #57)
Best RPI value of a winless character: 0.446428571 (Mikazuki Yozora in 2014 Topaz)
Worst RPI value of any character: 0.264455782 (Honma Meiko in 2012 Ruby, winless)
Best RPI value of any character: 0.769557823 (Shiina Mashiro in 2013 Topaz, also best SDO value of any character: 112.38)
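For anyone who wants to repeat this kind of check, here is a small sketch of a correlation between two sets of ranks. It assumes a plain Pearson correlation computed on the rank numbers (the thread does not say which correlation function was actually used), and the six SDO and RPI values are invented.

```python
from math import sqrt


def ranks(values):
    """Rank values in descending order (1 = best); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical period data for six characters.
sdo_values = [93.6, 90.4, 87.4, 81.7, 78.8, 71.7]
rpi_values = [0.71, 0.69, 0.70, 0.62, 0.64, 0.58]

print(round(pearson(ranks(sdo_values), ranks(rpi_values)), 4))
```

A value near 1 means the two stats order the characters almost identically, which is the situation reported above for SDO and RPI.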

Post by maglor » Thu Aug 28, 2014 10:53 pm

Correlation (SDO ranks; RPI ranks) = 0.969978639 <- This pretty much dooms RPI in the choice for the necklace score formula because using it will give results too close to current SDO, with perhaps some exceptional cases.

I am now leaning heavily toward PSAO, because it is actually a simple product of two existing stats, so we might argue that we are not really making up a new stat. Also, SAO is easier to calculate than SDO, since we don't need to worry about whether the character won or not when calculating SAO.
User avatar
avery-kun
Moon princess
Moon princess
Posts: 3128
Joined: Thu Oct 04, 2012 4:25 am
Badges:
ImageImageImage
Worships: Berserker!
Melon Pan: 135
2017 Female Favorite: Illyasviel Von Einzbern
2017 Male Favorite: Gilgamesh
Wish: Illya 2017!

Re: Some thoughts on ranking systems

Post by avery-kun » Fri Aug 29, 2014 4:01 am

maglor wrote:
Thank you very much. It seems that we made the right decision to reduce the NGP multiplier from 300 to 220 this year. One thing that bugs me is that if we reduce the NGP multiplier further, some voters may complain that the necklace group match does not matter much. Setting aside all this MRC analysis and just looking at several necklace results, what does your gut feeling tell you is the best multiplier to use? I welcome people other than Shmion84 to answer this question as well.
My gut feeling? A girl must work really hard to overcome an SDO disadvantage. Ruby and Topaz are excellent examples where SDO got it right. If the other girls couldn't bother to distinguish themselves from the pack in the necklace match, then the girl with 4th place in NM VF but with the SDO advantage deserves the win. So something around the current 220 is probably good. If anything it may need to be a bit lower to raise the bar a bit.

Post by Shmion84 » Fri Aug 29, 2014 8:34 pm

maglor wrote: Correlation (SDO ranks; RPI ranks) = 0.969978639 <- This pretty much dooms RPI in the choice for the necklace score formula because using it will give results too close to current SDO, with perhaps some exceptional cases.

I am now leaning heavily toward PSAO, because it is actually a simple product of two existing stats, so we might argue that we are not really making up a new stat. Also, SAO is easier to calculate than SDO, since we don't need to worry about whether the character won or not when calculating SAO.
A calculation out of curiosity. I used the ranks of all 72 (2011: 50) girls.

Correlation (SDO ranks; RPI ranks) = 0.969978639
Correlation (RPI ranks; PSAO ranks) = 0.982865413
Correlation (SDO ranks; PSAO ranks) = 0.987870751

Post by maglor » Fri Aug 29, 2014 9:16 pm

Not surprised by those. PSAO is specifically designed to improve the chances of girls with 6-1 records, as the current system places their SDO number too far back for them to have any realistic chance. It is somewhat expected that PSAO would resemble RPI more than SDO does, as both RPI and PSAO benefit from battling a strong opponent even if the character loses. With SDO, if the character loses, she gets nothing no matter who her opponent is.
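The difference can be sketched in a few lines. This is only an illustration under stated assumptions: each match is a (won, opponent_value) pair, where opponent_value stands for whatever per-opponent strength figure the standings use (not spelled out in this part of the thread), a win is worth 3 period points, and the schedule below is invented.

```python
def sdo(matches):
    """Sum of defeated opponents: only wins contribute."""
    return sum(value for won, value in matches if won)


def sao(matches):
    """Sum of all opponents: wins and losses both contribute."""
    return sum(value for _, value in matches)


def psao(matches, points_per_win=3, max_points=21):
    """PSAO = (pts / 21) * SAO, with pts assumed to be 3 per win."""
    pts = points_per_win * sum(1 for won, _ in matches if won)
    return pts / max_points * sao(matches)


# Hypothetical schedule: six wins and one loss to the strongest opponent.
six_and_one = [(True, 10), (True, 12), (True, 14), (True, 9),
               (True, 11), (True, 13), (False, 20)]
# The same schedule with the last match won instead.
seven_and_zero = [(True, value) for _, value in six_and_one]

print("6-1:", sdo(six_and_one), round(psao(six_and_one), 3))
print("7-0:", sdo(seven_and_zero), round(psao(seven_and_zero), 3))
```

Under these assumptions a 7-0 character gets the same number from both stats, while the 6-1 character keeps part of the credit for facing the strong opponent she lost to.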

Post by Shmion84 » Sat Aug 30, 2014 2:20 pm

maglor wrote:Now the hard one

2. Try the partial rewards SDO. The difference to the SDO is that if you lose, you get either (a) VF% times the opponent's total period points or (b) VF%/2 times the opponent's total period points. The aim of this system is to reduce the penalty 1 loss character has. Please tell us what will be the MRC in this case
2 a, MRC for (partial SDO) + NGP * 150: 0.7549
MRC < 0.8 range: 102.8 - 224.6

2 b, MRC for (partial SDO) + NGP * 170: 0.7614
MRC < 0.8 range: 133.8 - 225.3

I'm pretty sure maglor will ask to optimize VF% factor.
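The two partial rewards variants could be computed roughly like this. It is a sketch only: it assumes VF% is the loser's share of the vote in that match (as a fraction), that a win earns the opponent's full period points, and the example matches are invented. The `loss_factor` parameter is a hypothetical name for the factor being varied, 1.0 for case (a) and 0.5 for case (b).

```python
def partial_sdo(matches, loss_factor=1.0):
    """Partial rewards SDO.

    matches: list of (won, vf_share, opponent_period_points).
    A win earns the opponent's period points in full; a loss earns
    loss_factor * vf_share * opponent_period_points.
    """
    total = 0.0
    for won, vf_share, opponent_points in matches:
        if won:
            total += opponent_points
        else:
            total += loss_factor * vf_share * opponent_points
    return total


# Hypothetical schedule: two wins and one narrow loss to a strong opponent.
schedule = [(True, 0.60, 12), (True, 0.55, 15), (False, 0.40, 20)]

print("case (a):", partial_sdo(schedule, loss_factor=1.0))
print("case (b):", partial_sdo(schedule, loss_factor=0.5))
```

Optimizing the VF% factor would then just mean scanning `loss_factor` between 0 (plain SDO) and 1 and recomputing MRC for each value.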

Post by maglor » Sun Aug 31, 2014 2:40 am

So, neither can beat PSAO, if we use (pts/21)*SAO for PSAO. Thank you. I have to chalk that up as another advantage for PSAO.

Post by matchbaby » Sun Aug 31, 2014 6:03 am

I can't say anything about RPI, as I think it is too difficult.
But I can say something about SWVDO.

(As I didn't use arenas 19 and 20 to calculate VP, the data may differ slightly.)

If the formula changes to vote% + SWVDO...

AQ SWVDO SDO

Itsuka Kotori 67.06907939 87.383
Yoshino 70.74444123 81.741
Takanashi Rikka 67.96201907 90.397
Kuroyukihime 56.46247856 71.687
Tachibana Kanade 78.20377128 93.576
Gokō Ruri (Kuroneko) 72.99274167 93.629
Misaka Mikoto 71.17057831 78.849

winner: Misaka Mikoto -> Misaka Mikoto (but much closer)

AM

Itsuka Kotori 62.48419285 84.786
Tokisaki Kurumi 62.33647183 94.443
Yoshino 59.38827635 71.352
Tsutsukakushi Tsukiko 47.43319469 69.369
Tachibana Kanade 68.53418038 93.412
Gokō Ruri (Kuroneko) 60.95762398 78.272
Eucliwood Hellscythe 47.0527038 72.429

winner: Tachibana Kanade -> Tachibana Kanade(difference even larger)

RU

Itsuka Kotori 61.6470273 84.405
Tokisaki Kurumi 66.70404251 87.245
Yūki Asuna 50.45168771 73.45
Kuroyukihime 58.64142922 92.721
Shiina Mashiro 53.51115278 71.464
Eucliwood Hellscythe 46.7520922 64.054
Shana 47.56027472 69.774

winner: Kuroyukihime->Tokisaki Kurumi(winner changed)

EM

Itsuka Kotori 62.39222785 85.178
Tokisaki Kurumi 52.67111196 62.133
Yoshino 58.08105043 82.598
Yūki Asuna 62.07954115 90.679
Shiina Mashiro 62.12816294 84.74
Aisaka Taiga 48.76444567 69.602
Nakano Azusa 45.42947638 64.567

winner: Yūki Asuna-> Yūki Asuna (but much closer)(If Kurumi got Ruby, Mashiro will not have vote-split, so...)

TO

Itsuka Kotori 62.67807202 76.737
Yoshino 59.53190731 79.696
Takanashi Rikka 50.59001573 78.789
Kurousagi 45.63314018 54.317
Shiina Mashiro 62.87154839 85.084
Eucliwood Hellscythe 57.2690238 82.412
Shana 55.72393037 82.977

winner: Shiina Mashiro->Shiina Mashiro(difference even larger)(If Kurumi got Ruby, Mashiro is likely to get Emerald, Asuna will get Topaz if the result of Topaz is the same)

It may not seem to make much difference this year, but if you use this for ISML 2013, Mikoto will get Ruby as well as Emerald ._.
Also, SWVDO seems to be much fairer to Wildcard and Tier 2 characters.

However, SWVDO is less exciting, because a character's strength will not change rapidly if she goes into the NM with a 7-0 score. Also, it becomes easier to predict who will win a necklace BEFORE the period starts -_-

image version: (Please ignore 41 42 43 to 50 51, as these are only a calculation step)
Image


In addition, I don't think PSAO differs that much from SDO for a 7-0 character, and Wildcard and Tier 2 are still disadvantaged. If a character is 7-0, (Pts/21)*SAO = SDO; PSAO just gives a 6-1 character more of a chance, I think. But is it good for a 7-0 character to be overtaken by a 6-1 character?
Isn't it enough to keep using SDO and add a bonus for Wildcard and Tier 2?
I hope someone can explain the difference, and what "NGP" is, as I don't get it.

Post by Shmion84 » Sun Aug 31, 2014 8:48 am

matchbaby wrote:Hope someone can explain the difference and what is "NGP" as I can't get this.
maglor wrote:NGP (= Necklace Group Performance) = (votes received by the contestant in the 7 character necklace group match) / (total number of votes cast for the necklace group match)
Will comment on SWVDO later. Might also compile a "dictionary".
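In code, the NGP definition quoted above comes down to a vote share. A minimal sketch, with invented names and vote counts, assuming every ballot in the necklace group match goes to exactly one of the 7 contestants:

```python
def ngp(votes_by_contestant):
    """NGP = votes received by the contestant / total votes cast in the match."""
    total = sum(votes_by_contestant.values())
    return {name: votes / total for name, votes in votes_by_contestant.items()}


# Hypothetical 7 character necklace group match.
votes = {"A": 900, "B": 850, "C": 700, "D": 650, "E": 600, "F": 550, "G": 750}

for name, share in ngp(votes).items():
    print(name, round(share, 4))
```

The shares add up to 1, so with 7 contestants a typical NGP is around 0.14, which is why it needs a multiplier in the hundreds before it can be compared with an SDO value.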

Post by Shmion84 » Sun Aug 31, 2014 3:57 pm

Momento10 wrote:Why should a 6-1 character have a better chance at getting a necklace?
67 out of 154 contenders in the past 22 necklace matches didn't reach a 7-0 record. A system using PSAO can help to reduce this disadvantage.

==

If you think a factor is too random or can easily be manipulated, you can increase the weight of the necklace match.
User avatar
maglor
~Fukou da~
~Fukou da~
Posts: 8618
Joined: Thu Feb 26, 2009 8:57 pm
Badges:
ImageImage
Worships: Abriel Nei Debrusc Borl Paryun Lafiel
Melon Pan: 75
2018 Female Favorite: Chtholly Nota Seniorious
2018 Male Favorite: Yang Wenli
2017 Female Favorite: Tomori Nao
2017 Male Favorite: Yang Wenli
Wish: More people being open to alternatives and compromises.

Re: Some thoughts on ranking systems

Post by maglor » Tue Jan 20, 2015 7:31 am

Shmion and I worked on another score system which may be a good indicator of people's expectations of a character. After some thought I decided to call it the ISML Character's Expectation Value, or The Value for short.

Theoretical background

The score system is inspired by

(1) Pythagorean Expectations => http://en.wikipedia.org/wiki/Pythagorean_expectation
(2) Random Walk => http://rankings.amath.unc.edu/old/monkeys.htm

Calculation

(1) The base assumption is that there exists a Pythagorean Expectation formula that can connect votes for (VF) and votes against (VA) to the Win% of a character. Furthermore, this Win% may be close to what people expect from a character in an ISML match. After much trial and error, it was decided that the following formula is the simplest one that adequately fits ISML results

Expected Win% = 1 / [ 1 + ( VA/VF) ^ 5 ]

The Root Mean Square Error = sqrt [ ( sum of all of (Expected Win% - Actual Win% ) ^2 ) / ( number of data points ) ] = 0.066326

The RMSE value corresponds to an average error of about two wins in a typical ISML season, but any formula that reduced the error involved many more factors, which would have made this part very cumbersome. The form above was chosen for its simplicity, and also because the number 5 plays an important role in the next part.
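
As a rough illustration, the formula and the error measure can be sketched in Python. The function names and the sample vote counts are made up for this example, and rmse() divides by the number of data points (the usual definition), which is an assumption about how the 0.066326 figure was computed.

```python
import math


def expected_win_pct(votes_for, votes_against, exponent=5.0):
    """Expected Win% = 1 / [1 + (VA / VF) ^ exponent]."""
    if votes_for <= 0:
        return 0.0  # assumption: no votes for means no expected wins
    return 1.0 / (1.0 + (votes_against / votes_for) ** exponent)


def rmse(expected, actual):
    """Root mean square error between two equally long lists of Win%."""
    squared = [(e - a) ** 2 for e, a in zip(expected, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical vote totals, not real ISML data.
print(expected_win_pct(1000, 1000))  # equal votes for and against -> 0.5
print(expected_win_pct(1200, 1000))  # 20% more votes for than against
```

With exponent 5, even a modest vote advantage moves the expected Win% well away from 50%, which is why the exponent matters so much for the fit.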

(2) The next assumption concerns how much of the expectation a character carries into a match is at stake in that match. It is like asking how much of the "Pot" the character has will be "Bet" into the match pool. Stronger characters are likely to carry a bigger "Pot" and thus have more expectation at stake in a match. How much of the "Pot" is "bet" on each match is of great importance, as "betting" too little would make the "Expectation Value" change too slowly, while "betting" too much would make it very volatile and not mirror reality.

Shmion and I tested "bet" amounts between 0.01 (1% of the whole pot) and 0.5 (50% of the whole pot) on the 2014 Nova and Stella data. The calculation goes as follows:

i) All characters start with 100 points as their Pot
ii) In each match, each character bets N% of the Pot into the Match Pot. We explored N% = 0.01 (1%) to N% = 0.5 (50%)
iii) After the match between Character A and B, they divide up the Match Pot by the following equation

Amount of Match Pot earned by character A = Match Pot Value / [ 1 + ( Votes for character B / Votes for character A ) ^5 ]

iv) Each character's Pot value is readjusted and they go on to the next match

To give a more concrete example, here is how the first few calculations would look:

i) Character A and B would have pot value of 100 at the start of the season
ii) In the first match of the season they are against each other. Each puts up Pot * N% = 100*N% into the match pot. The Match Pot Value is ( 100 ( = Pot_A ) + 100 ( = Pot_B ) )*N%
iii) In the match Character A gets Vf_A votes and Character B gets Vf_B votes
iv) Character A's Pot after the 1st match would be: New Pot value = 100 ( = initial pot ) - 100*N% ( = amount placed into the match pot ) + ( 100 ( = Pot_A ) + 100 ( = Pot_B ) )*N% / [ 1 + ( Vf_B / Vf_A ) ^5 ]
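
The steps above can be sketched in Python roughly like this. It is only a sketch: play_match and its parameter names are made up for illustration, the vote counts are hypothetical, and it assumes Character B simply receives whatever part of the match pot A does not earn (which is what the same formula gives for B).

```python
def play_match(pot_a, pot_b, votes_a, votes_b, bet_fraction=0.2, exponent=5.0):
    """Return the new (pot_a, pot_b) after one match between A and B."""
    bet_a = pot_a * bet_fraction          # amount A places into the match pot
    bet_b = pot_b * bet_fraction          # amount B places into the match pot
    match_pot = bet_a + bet_b
    if votes_a == 0:
        share_a = 0.0                     # assumption: no votes means no share
    else:
        share_a = 1.0 / (1.0 + (votes_b / votes_a) ** exponent)
    earned_a = match_pot * share_a
    earned_b = match_pot - earned_a       # the rest of the match pot goes to B
    return pot_a - bet_a + earned_a, pot_b - bet_b + earned_b


# First match of the season: both start at 100, hypothetical vote counts.
new_a, new_b = play_match(100.0, 100.0, votes_a=1200, votes_b=1000)
print(round(new_a, 3), round(new_b, 3))
```

The total of the two pots stays the same after every match, so points only move between the two characters; nothing is created or lost.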

Shmion and I ranked the characters by their expectation value at the end of the season and compared it to our traditional point ranking. We wanted to find a case where the average rank difference between the expectation value ranking and the traditional point ranking was bigger than 1 but less than 5. We also didn't want more than two cases where the rank difference was more than 10. Another constraint I looked at was that the maximum "Pot" change resulting from a single match should be near 50, a value chosen as about half the initial pot value.
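
A small Python sketch of the rank comparison criteria described above. The function names and the four-character example are hypothetical, and it assumes "rank difference" means the absolute difference between a character's two ranks. The "Pot change near 50" constraint is not covered here.

```python
def rank_difference_stats(value_rank, point_rank, big_gap=10):
    """Compare two rankings given as {character: rank} dictionaries.

    Returns (average absolute rank difference, number of characters
    whose rank difference is larger than big_gap).
    """
    diffs = [abs(value_rank[name] - point_rank[name]) for name in value_rank]
    average = sum(diffs) / len(diffs)
    big_cases = sum(1 for d in diffs if d > big_gap)
    return average, big_cases


def meets_criteria(average, big_cases):
    """Average difference between 1 and 5, at most two differences above 10."""
    return 1 < average < 5 and big_cases <= 2


# Hypothetical rankings for four made-up characters.
value_rank = {"A": 1, "B": 2, "C": 3, "D": 4}
point_rank = {"A": 3, "B": 4, "C": 1, "D": 2}
average, big_cases = rank_difference_stats(value_rank, point_rank)
print(average, big_cases, meets_criteria(average, big_cases))
```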

What we found was that when N% is greater than 0.2, the average rank difference becomes greater than 1, but things somewhat stabilize after that, so we couldn't exceed an average rank difference of 2 even for N% = 0.5 in the case of Nova ( 2.333 for Stella, which is still near 2 ). However, the maximum Pot change was as great as 174 for Nova and 232 for Stella, which means there was potential for it to be overly volatile. After much simulation, N% = 1/5 = 0.2 was chosen as optimal. The value of 0.2 may not be a mere coincidence, as the exponent in our Pythagorean Expectation formula was also 5.

You can see the actual numbers at https://docs.google.com/spreadsheets/d/ ... 1236600324 , which was made by Shmion.

(3) Discussion

i) Illya showed a significant rise in Expectation Value rank at the end of 2014 Stella. I believe the increase is close to the apparent level of interest people gave Illya due to her Topaz wins.
ii) Depending on how much each character placed into the match pot, it is possible for a character to gain expectation value even when she loses. This is akin to a real-world sports case where a team that used to struggle suddenly starts doing well and, in an important match, nearly beats an opponent considered to be one of the best teams in the league. Many people would start paying more attention to the team that exceeded expectations.
iii) Unlike the 1%-35% point-stealing scheme or our traditional point system, every vote really matters, WIN OR LOSS. While this is great for voter participation, it also makes this "Value" system vulnerable to manipulation by small yet consistent faction voting.
iv) The value rank should move faster than the traditional point rank when a character receives a late-season push. This can be very useful for ********* .
v) The Value and Value Rank can replace the SVAO and SWVO columns in the stat table. The VP and SVDO columns should be replaced by the 1%-35% Score column and the Score Rank.
vi) It should be noted that the Value tracks expectation, not the actual strength level of a character. There are already many Actual Strength Level calculation systems that beat this Value system in predicting results.
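
Point ii) can be checked with a quick calculation. In this Python sketch the pot values and vote counts are invented for illustration, pot_change is a made-up helper name, and N% = 0.2 with exponent 5 are taken from the post above.

```python
def pot_change(own_pot, opp_pot, own_votes, opp_votes, bet_fraction=0.2, exponent=5.0):
    """Change in a character's pot after one match (positive means a gain)."""
    own_bet = own_pot * bet_fraction
    match_pot = own_bet + opp_pot * bet_fraction
    share = 1.0 / (1.0 + (opp_votes / own_votes) ** exponent)
    return match_pot * share - own_bet


# A weak character (pot 50) narrowly loses to a strong one (pot 200).
underdog = pot_change(50.0, 200.0, own_votes=1000, opp_votes=1100)
favorite = pot_change(200.0, 50.0, own_votes=1100, opp_votes=1000)
print(round(underdog, 2), round(favorite, 2))
```

The underdog loses the vote but still comes out ahead, because the favorite put four times as much into the match pot and the narrow loss still earns a sizable share of it.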

----- Conclusion ----

I believe we have found a nice formula that roughly matches the amount of hype a character may have. This hype amount can be used in various ways. It is recommended that columns for the Expectation Value and the Value Rank replace two of the columns in the current stat table so its evolution can be watched and analyzed by the public.
User avatar
avery-kun
Moon princess
Moon princess
Posts: 3128
Joined: Thu Oct 04, 2012 4:25 am
Badges:
ImageImageImage
Worships: Berserker!
Melon Pan: 135
2017 Female Favorite: Illyasviel Von Einzbern
2017 Male Favorite: Gilgamesh
Wish: Illya 2017!

Re: Some thoughts on ranking systems

Post by avery-kun » Wed Feb 04, 2015 8:35 pm

I think I'm gonna steal this as a scoring system for a game.