lasker-2.2.3/data/help/glicko

   1
   2        +-------------------------------------------------+
   3        |   Vek-splanation of the Glicko Ratings System   |
   4        +-------------------------------------------------+
   5
   6 As you may have noticed, each FICS player now has a rating and an RD.
   7
   8 RD stands for "ratings deviation".
   9
  10 Why a new system
  11 ----------------
  12
  13 The new system with the RD improves upon the binary categorization that was
  14 used before on FICS and elsewhere, where players with fewer than 20 games were
  15 labeled "provisional" and others were labeled "established".  Instead of two
  16 separate ratings formulas for the two categories, there is now a single
  17 formula incorporating the two ratings and the two RD's to find the ratings
  18 changes for you and your opponent after a game.
  19
  20 What RD represents
  21 ------------------
  22
  23 The Ratings Deviation is used to measure how much a player's current rating
  24 should be trusted.  A high RD indicates that the player may not be competing
  25 frequently or that the player has not played very many games yet at the
  26 current rating level.  A low RD indicates that the player's rating is fairly
  27 well established.  This is described in more detail below under "RD
  28 Interpretation".
  29
  30 How RD Affects Ratings Changes
  31 ------------------------------
  32
  33 In general, if your RD is high, then your rating will change a lot each time
  34 you play.  As it gets smaller, the ratings change per game will go down.
  35 However, your opponent's RD will have the opposite effect, to a smaller
  36 extent: if his RD is high, then your ratings change will be somewhat smaller
  37 than it would be otherwise.
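
To make the two effects concrete, here is a minimal Python sketch that evaluates the K factor from "The Formulas" section (Steps 2 to 4) for a few combinations of RDs. The function name k_factor and the sample ratings and RDs are illustrative assumptions, not values taken from the server code.

```python
import math

Q = math.log(10) / 400          # q, as defined in Step 4
P = 3 * Q**2 / math.pi**2       # p, as defined in Step 2


def k_factor(my_rating, my_rd, opp_rating, opp_rd):
    """K from Step 4; the rating change for the game is K * (w - E)."""
    f = 1 / math.sqrt(1 + P * opp_rd**2)                       # Step 2
    e = 1 / (1 + 10 ** (-(my_rating - opp_rating) * f / 400))  # Step 3
    return Q * f / (1 / my_rd**2 + Q**2 * f**2 * e * (1 - e))  # Step 4


# Equal ratings (illustrative values), varying both RDs.
for my_rd in (350, 100, 50):
    for opp_rd in (50, 350):
        k = k_factor(1600, my_rd, 1600, opp_rd)
        print(f"my RD {my_rd:3d}, opponent RD {opp_rd:3d}: K = {k:6.1f}")
```

In this sketch K shrinks sharply as your own RD drops from 350 to 50, and for a fixed own RD it is somewhat smaller against the opponent with RD 350 than against the one with RD 50.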
  38
  39 A further use of RD's
  40 ---------------------
  41
  42 Vek asked Mark Glickman the following:
  43
  44 > Given player one with rating r1, error s1,
  45 > and player two with r2 and s2, do you have a formula for the probability
  46 > that player 1's "true" rating is greater than player 2's ?
  47
  48 Mark said:
  49
  50   Yes - it's:
  51
  52   1/(1 + 10^(-(r1-r2)f(sqrt(s1^2 + s2^2))/400) )
  53
  54   where f(s) is [the function applied to RD in Step 2 below].
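
The formula Mark gave can be written out as a short Python sketch. The function names are hypothetical, and the sample ratings and errors are made up for illustration; f is the attenuating factor from Step 2, applied to the combined error sqrt(s1^2 + s2^2).

```python
import math

Q = math.log(10) / 400
P = 3 * Q**2 / math.pi**2       # the constant p from Step 2


def attenuation(s):
    """f(s) = 1/Sqrt(1 + p s^2), the function from Step 2."""
    return 1 / math.sqrt(1 + P * s**2)


def prob_true_rating_higher(r1, s1, r2, s2):
    """1/(1 + 10^(-(r1-r2) * f(sqrt(s1^2 + s2^2)) / 400))."""
    f = attenuation(math.sqrt(s1**2 + s2**2))
    return 1 / (1 + 10 ** (-(r1 - r2) * f / 400))


# Same 100-point gap, first with small errors and then with large ones.
print(round(prob_true_rating_higher(1700, 50, 1600, 50), 2))
print(round(prob_true_rating_higher(1700, 300, 1600, 300), 2))
```

With these inputs the value is roughly 0.64 for the small errors and roughly 0.58 for the large ones: the less certain the two ratings are, the closer the result moves toward 0.5. With both errors set to zero, f is 1 and the expression reduces to the ordinary Elo expectation.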
  55
  56 How RD is Updated
  57 -----------------
  58
  59 In this system, the RD will decrease somewhat each time you play a game,
  60 because when you play more games there is a stronger basis for concluding what
  61 your rating should be.  However, if you go for a long time without playing any
  62 games, your RD will increase to reflect the increased uncertainty in your
  63 rating due to the passage of time.  Also, your RD will decrease more if your
  64 opponent's rating is similar to yours, and decrease less if your opponent's
  65 rating is very different.
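
The effect of the rating gap on the post-game RD can be checked with a small Python sketch of Step 6 from "The Formulas" section. The function name rd_after_game and the sample numbers are illustrative assumptions; the increase over time is left out here because it depends on the constant c, which this helpfile does not state.

```python
import math

Q = math.log(10) / 400
P = 3 * Q**2 / math.pi**2


def rd_after_game(my_rating, my_rd, opp_rating, opp_rd):
    """Step 6: RD' = 1/Sqrt(1/RD^2 + q^2 f^2 E (1-E))."""
    f = 1 / math.sqrt(1 + P * opp_rd**2)                       # Step 2
    e = 1 / (1 + 10 ** (-(my_rating - opp_rating) * f / 400))  # Step 3
    return 1 / math.sqrt(1 / my_rd**2 + Q**2 * f**2 * e * (1 - e))


# A player rated 1600 with RD 100 meets opponents (RD 50) further and further away.
for gap in (0, 200, 600):
    new_rd = rd_after_game(1600, 100, 1600 + gap, 50)
    print(f"opponent {gap:3d} points higher: RD 100 -> {new_rd:.2f}")
```

The term E * (1-E) is largest when E is 0.5, that is, when the ratings are equal, which is why the evenly matched game reduces the RD the most.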
  66
  67 Why Ratings Changes Aren't Balanced
  68 -----------------------------------
  69
  70 In the old system, except for provisional games, the ratings changes for the
  71 two players in a game would balance each other out - if A wins 16 points, B
  72 loses 16 points.  That is not the case with this system.  Here is the
  73 explanation I received from Mark Glickman:
  74
  75   The system does not conserve rating points - and with good
  76   reason!  Suppose two players both have ratings of 1700,
  77   except one has not played in awhile and the other playing
  78   constantly.  In the former case, the player's rating is not
  79   a reliable measure while in the latter case the rating is a fairly
  80   reliable measure.  Let's say the player with the uncertain rating
  81   defeats the player with the precisely measured rating.
  82   Then I would claim that the player with the imprecisely
  83   measured rating should have his rating increase a fair
  84   amount (because we have learned something informative from
  85   defeating a player with a precisely measured ability) and
  86   the player with the precise rating should have his rating
  87   decrease by a very small amount (because losing to a player
  88   with an imprecise rating contains little information).
  89   That's the intuitive gist of my extension to the Elo system.
  90
  91   On average, the system will stay roughly constant (by the
  92   law of large numbers).  In other words, the above scenario
  93   in the long run should occur just as often with the
  94   imprecisely rated player losing.
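
Mark's scenario can be run through the formulas from "The Formulas" section. The following Python sketch is only an illustration: the function name rating_change and the RD values of 300 (has not played in a while) and 50 (plays constantly) are assumptions chosen for the example.

```python
import math

Q = math.log(10) / 400
P = 3 * Q**2 / math.pi**2


def rating_change(rating, rd, opp_rating, opp_rd, w):
    """K * (w - E) from Steps 2-5; w is 1 for a win, 0.5 draw, 0 loss."""
    f = 1 / math.sqrt(1 + P * opp_rd**2)
    e = 1 / (1 + 10 ** (-(rating - opp_rating) * f / 400))
    k = Q * f / (1 / rd**2 + Q**2 * f**2 * e * (1 - e))
    return k * (w - e)


# Both rated 1700; the uncertain player (RD 300) beats the steady one (RD 50).
gain = rating_change(1700, 300, 1700, 50, 1)
loss = rating_change(1700, 50, 1700, 300, 0)
print(f"uncertain player: {gain:+.1f}")
print(f"steady player:    {loss:+.1f}")
```

Under these assumed RDs the winner gains well over 100 points while the loser drops only about 5, so the two changes do not cancel.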
  95
  96 Mathematical Interpretation of RD
  97 ---------------------------------
  98
  99 Direct from Mark Glickman:
 100
 101 Each player can be characterized as having a true (but unknown) rating that
 102 may be thought of as the player's average ability.  We never get to know that
 103 value, partly because we only observe a finite number of games, but also
 104 because that true rating changes over time as a player's ability changes.  But
 105 we can *estimate* the unknown rating.  Rather than restrict oneself to a
 106 single estimate of the true rating, we can describe our estimate as an
 107 *interval* of plausible values.  The interval is wider if we are less sure
 108 about the player's unknown true rating, and the interval is narrower if we are
 109 more sure about the unknown rating.  The RD quantifies the uncertainty in
 110 terms of probability:
 111
 112 The interval formed by Current rating +/- RD contains your true rating with
 113 probability of about 0.68.
 114
 115 The interval formed by Current rating +/- 2 * RD contains your true rating
 116 with probability of about 0.95.
 117
 118 The interval formed by Current rating +/- 3 * RD contains your true rating
 119 with probability of about 0.997.
 120
 121 For those of you who know something about statistics, these are not confidence
 122 intervals, but are called "central posterior intervals" because the derivation
 123 came from a "Bayesian" analysis of the problem.
 124
 125 These numbers are found from the cumulative distribution function of the
 126 normal distribution with mean = current rating, and standard deviation = RD.
 127 For example, CDF[ N[1600,50], 1550 ] = .159  approximately (that's shorthand
 128 Mathematica notation.)
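
For readers without Mathematica, the same numbers can be reproduced with Python's standard library. This is a minimal sketch; the helper names normal_cdf and interval_probability are made up for the example.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal(mean, sd) at x."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))


def interval_probability(rating, rd, n):
    """Probability that the true rating lies within rating +/- n * RD."""
    upper = normal_cdf(rating + n * rd, rating, rd)
    lower = normal_cdf(rating - n * rd, rating, rd)
    return upper - lower


# The CDF example from the text: rating 1600, RD 50, evaluated at 1550.
print(f"{normal_cdf(1550, 1600, 50):.3f}")

# The three intervals: +/- 1, 2 and 3 RDs.
for n in (1, 2, 3):
    print(n, f"{interval_probability(1600, 50, n):.4f}")
```

The interval probabilities do not depend on the particular rating or RD, only on how many RDs wide the interval is.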
 129
 130 The Formulas
 131 ------------
 132
 133 Algorithm to calculate ratings change for a game against a given opponent:
 134
 135 Step 1.  Before a game, calculate initial rating and RD for each player.
 136
 137   a)  If no games yet, initial rating assumed to be 1720.
 138       Otherwise, use existing rating.
 139       (The 1720 is not printed out, however.)
 140
 141   b)  If no RD yet, initial RD assumed to be 350 if you have no games,
 142       or 70 if your rating is carried over from ICC.
 143       Otherwise, calculate new RD, based on the RD that was obtained
 144       after the most recent game played, and on the amount of time (t) that
 145       has passed since that game, as follows:
 146
 147       RD' = Sqrt(RD^2 + c log(1+t))
 148
 149       where c is a numerical constant chosen so that predictions made
 150       according to the ratings from this system will be approximately
 151       optimal.
 152
 153 Step 2.   Calculate the "attenuating factor" due to your OPPONENT's RD,
 154           for use in later steps.
 155
 156        f =  1/Sqrt(1 + p RD^2)
 157
 158           Here p is the mathematical constant 3 (ln 10)^2
 159                                              -------------
 160                                               Pi^2 400^2    .
 161
 162           Note that this is between 0 and 1 - if RD is very big,
 163           then f will be closer to 0.
 164
 165 Step 3.   r1 <- your rating,
 166           r2 <- opponent's rating,
 167
 168                     1
 169       E <-  ----------------------
 170                     -(r1-r2)*f/400     <- it has f(RD) in it!
 171               1 + 10
 172
 173           E is your expected score for the game (win = 1, draw = .5, loss = 0).
 174
 175 Step 4.   K =               q*f
 176               --------------------------------------
 177                1/(RD)^2   +   q^2 * f^2 * E * (1-E)
 178
 179           where RD is your own RD and q is the constant (ln 10)/400.
 180
 181 Step 5.   This is the K factor for the game, so
 182
 183           Your new rating = (pregame rating) + K * (w - E)
 184
 185           where w is 1 for a win, .5 for a draw, and 0 for a loss.
 186
 187 Step 6.   Your new RD is calculated as
 188
 189           RD' =                     1
 190                   -------------------------------------------------
 191                   Sqrt(    1/(RD)^2   +   q^2 * f^2 * E * (1-E)   )  .
 192
 193 The same steps are done for your opponent.
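
The six steps can be collected into a short Python sketch. It is an illustration of the formulas as written in this helpfile, not the server's actual code. The function names are hypothetical, "log" in Step 1b is assumed to be the natural logarithm, and the constant c and the unit of t are not given here, so they are left as parameters.

```python
import math

Q = math.log(10) / 400          # q from Step 4
P = 3 * Q**2 / math.pi**2       # p from Step 2


def inflate_rd(rd, t, c):
    """Step 1b: RD' = Sqrt(RD^2 + c log(1+t)).

    Assumptions: log is the natural log; c and the unit of t are not
    stated in the helpfile, so the caller must supply them.
    """
    return math.sqrt(rd**2 + c * math.log(1 + t))


def attenuation(opp_rd):
    """Step 2: f = 1/Sqrt(1 + p RD^2), using the OPPONENT's RD."""
    return 1 / math.sqrt(1 + P * opp_rd**2)


def expected_score(r1, r2, f):
    """Step 3: E = 1 / (1 + 10^(-(r1-r2)*f/400))."""
    return 1 / (1 + 10 ** (-(r1 - r2) * f / 400))


def rate_game(rating, rd, opp_rating, opp_rd, w):
    """Steps 2-6 for one player; w is 1 (win), 0.5 (draw) or 0 (loss).

    rd and opp_rd are the pregame values produced by Step 1.
    Returns (new_rating, new_rd).
    """
    f = attenuation(opp_rd)
    e = expected_score(rating, opp_rating, f)
    denom = 1 / rd**2 + Q**2 * f**2 * e * (1 - e)   # shared by Steps 4 and 6
    k = Q * f / denom                               # Step 4
    return rating + k * (w - e), 1 / math.sqrt(denom)   # Steps 5 and 6


# A new player (1720, RD 350) beats an established one (1720, RD 50).
# Both updates are computed from the pregame values of both players.
newcomer = rate_game(1720, 350, 1720, 50, 1)
veteran = rate_game(1720, 50, 1720, 350, 0)
print("newcomer: rating %.0f, RD %.0f" % newcomer)
print("veteran:  rating %.0f, RD %.0f" % veteran)
```

Note that the denominator of K in Step 4 is the square of the denominator of the new RD in Step 6, so the sketch computes it once and reuses it.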
 194
 195 Further information
 196 -------------------
 197
 198 A PostScript file containing Mark Glickman's paper discussing this ratings
 199 system may be obtained via ftp.  The ftp site is hustat.harvard.edu, the
 200 directory is /pub/glickman, and the file is called "glicko.ps".  It is also
 201 available at http://hustat.harvard.edu/pub/glickman/glicko.ps.
 202
 203 Credits
 204 -------
 205
 206 The Glicko Ratings System was invented by Mark Glickman, Ph.D., who is
 207 currently at the Harvard Statistics Department, and who is bound for Boston
 208 University.
 209
 210 Vek and Hawk programmed and debugged the new ratings calculations (we may
 211 still be debugging them).  Helpful assistance was given by Surf, and Shane fixed
 212 a heinous bug that Vek invented.
 213
 214 Vek wrote this helpfile and Mark Glickman made some essential
 215 corrections and additions.
 216
 217   Last major update: April 19, 1995.
 218   Minor revisions: August 28, 1995 by Friar.
 219