ASPCC Ratings
Kristo S. Miettinen
Overview of the rating system
The rating system implemented for ASPCC is Arpad Elo’s system with a fixed weight per game and a logistic scoring expectation function. For games played between players with established ratings, this means essentially that each game is a wager for 32 rating points, with each player contributing a portion of the 32 points to the “pot” in proportion to their expected score (their odds of winning, estimated from previous ratings). The 32 points then go to the winner (minus the winner’s contribution to the pot), while the loser loses their contribution to the pot. In the event of a draw, each player gets 16 points, minus their contribution to the pot. The contributions are taken out at the same time as the game is rated, i.e. there is no “pay now, win later” delay in adjusting ratings. The scoring expectation function and the contribution to the 32-point pot can be computed for each player from the following formulae:
Expected score = 1/(1 + exp[(his_rating - my_rating)/166.2])
Contribution = 32 * expected score
The expression “exp” represents the exponential function, available in Excel under that name, or on any scientific calculator with the key labeled “e^x”.
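For readers who prefer a script to a calculator, the two formulae can be sketched in Python as follows. This is only an illustration: the function names and the constants K and SCALE are labels chosen here, not names used by the ASPCC rating software.

```python
import math

K = 32.0       # fixed weight per game, in rating points (the size of the "pot")
SCALE = 166.2  # scale constant taken from the expected-score formula in the text


def expected_score(my_rating, his_rating):
    """Logistic scoring expectation for the player rated my_rating."""
    return 1.0 / (1.0 + math.exp((his_rating - my_rating) / SCALE))


def contribution(my_rating, his_rating):
    """Rating points this player stakes in the 32-point pot."""
    return K * expected_score(my_rating, his_rating)
```

The two players’ expected scores always add up to 1, so their contributions always add up to the full 32 points.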
To take an example, suppose that player A has a rating of 1450, and player B has a rating of 1320, and they play a game. What happens in each of the three cases (win/lose/draw)?
First, nothing happens until the game ends. So, if the ratings of the two players change during the course of the game because of other games finishing, then it is the most recent ratings that count. In the case of ASPCC games, all results published in any one issue of King’s Korner are rated simultaneously, using as the “previous” rating the rating list published in that same issue. For instance, the ratings list published here in KK266 is based on the formulae above, applied to each game reported by a TS in KK265, and using the ratings list in KK265 for the ratings in the formulae (“his_rating” and “my_rating”).
So, to make the example concrete, let’s stipulate that the ratings for player A and player B in the example were published in KK265, and so was the result of their game, and neither player A nor player B had any other results published in KK265.
Then, we can calculate the expected result of their game: player A’s expected score is 0.686, while player B’s expected score is 0.314. Each player contributes rating points to the wager in these proportions, so player A contributes 22 points, while player B contributes 10 points (for a total of 32 points).
If player A wins, he gains 32 minus 22 points, or 10 points, and has a new rating of 1460, while player B loses 10 points and has a new rating of 1310. On the other hand, if player B wins, then he gains 32 minus 10 points, or 22 points, and has a new rating of 1342, while player A loses 22 points and has a new rating of 1428.
If the game is drawn, then player A “gains” 16 minus 22 points, i.e. he loses 6 points, and his new rating is 1444, while player B gains 16 minus 10 points, or 6 points, and has a new rating of 1326.
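The three cases of the example can be checked with a short Python sketch. The helper rating_change is a hypothetical name introduced here; it simply pays out the player’s share of the pot (32 for a win, 16 for a draw, 0 for a loss) and subtracts the contribution. Rounding to whole points at the end is an assumption made to match the figures quoted above.

```python
import math


def expected_score(my_rating, his_rating, scale=166.2):
    """Logistic scoring expectation from the formula in the text."""
    return 1.0 / (1.0 + math.exp((his_rating - my_rating) / scale))


def rating_change(my_rating, his_rating, score, pot=32.0):
    """Rating change for one game; score is 1 for a win, 0.5 for a draw, 0 for a loss."""
    winnings = pot * score                                # share of the pot received
    stake = pot * expected_score(my_rating, his_rating)   # contribution to the pot
    return winnings - stake


for label, score_a in (("A wins", 1.0), ("draw", 0.5), ("B wins", 0.0)):
    new_a = 1450 + rating_change(1450, 1320, score_a)
    new_b = 1320 + rating_change(1320, 1450, 1.0 - score_a)
    print(label, round(new_a), round(new_b))
# prints:
# A wins 1460 1310
# draw 1444 1326
# B wins 1428 1342
```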
If, as usually happens, a player has many results in the same issue of King’s Korner, then the gains and losses are all computed based on the previous ratings, and then all gains and losses are combined into one for the rating change from one list to the next.
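A minimal Python sketch of this simultaneous rating of one issue’s results follows. The function rate_issue, its data layout, and the sample players are all hypothetical, introduced only for illustration. The text does not say at what stage ratings are rounded, so the sketch leaves them unrounded.

```python
import math


def expected_score(my_rating, his_rating, scale=166.2):
    """Logistic scoring expectation from the formula in the text."""
    return 1.0 / (1.0 + math.exp((his_rating - my_rating) / scale))


def rate_issue(previous, results, pot=32.0):
    """Rate all results from one issue against the previous rating list.

    previous: dict mapping player name to previous rating
    results:  list of (player, opponent, score), score being the first
              player's result (1 win, 0.5 draw, 0 loss)
    """
    change = {name: 0.0 for name in previous}
    for player, opponent, score in results:
        # Every game uses the PREVIOUS ratings, never a partially updated one.
        e = expected_score(previous[player], previous[opponent])
        change[player] += pot * (score - e)
        change[opponent] -= pot * (score - e)  # the opponent's change mirrors it
    # Combine each player's gains and losses into a single adjustment.
    return {name: previous[name] + change[name] for name in previous}


# Hypothetical data: A beats B and draws with C in the same issue.
old = {"A": 1450.0, "B": 1320.0, "C": 1400.0}
new = rate_issue(old, [("A", "B", 1.0), ("A", "C", 0.5)])
```

Because every game is rated from the previous list, the order in which the results are processed does not affect the outcome, and the total of all ratings is unchanged.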
In situations where a player is provisionally rated, the rating adjustment is handled differently. The games of provisionally rated players are held in a storage file until they achieve 10 results against players with established ratings. At that time, a pseudorating (for purposes of the formulae only) is assigned based on their score in those 10 games and the ratings of their opponents, and then those 10 games are rated using the formulae above in the ordinary way.
Thus, the first rating published for a player is neither their provisional rating assigned for tournament qualification purposes (which is never used in the rating system), nor their pseudorating based on a self-consistency calculation for their 10-game performance and their opponents’ ratings, but rather it is the result of updating the pseudorating using the ordinary ratings update formulae and the 10 (or more) games that have been held for their initial rating.
