Xonotic Forums
Much Ado About Elo - Printable Version



Much Ado About Elo - Antibody - 07-22-2012

I’ve had several players ask me how the ranking system for XonStat works. This is my humble attempt to explain it. If you have any particular questions that aren’t answered here, ping me in the Xonotic IRC or forums and I’ll try to answer them. If the wind is blowing in a favorable direction, I might also update this very post.

The Short Story

Did I mention that there is a short story and a long story to this? There is - how delightful! The short story is that I’m using the Elo rating system to gauge the strength of players in Xonotic. It’s similar to, though not exactly the same as, what is used in many chess ranking systems. An excellent writeup of the history and mathematical details behind the Elo algorithm can be found on Wikipedia here: http://en.wikipedia.org/wiki/Elo_rating_system. For those who want all the gory details, read that and become enlightened! Or read it and become more confused. It’s okay either way. Now for the long story.

The Long Story

Way back when I was first implementing XonStat, people would often approach me with ideas for incorporating global rankings into the system. I had no idea how to do so until someone pointed out the aforementioned Elo system to me, noting that several other popular FPS games had adopted it as well. I looked into how it works, and divVerent and I set out to implement it within our database.

The Elo system is based on wins and losses versus your opponents. You gain points from winning, and you lose points from losing. Sounds simple, right? It really is, but adapting the algorithm to Xonotic had a few nuances. The rating of your opponent also comes into play such that wins over stronger opponents will give you more points than wins over weak ones. This means that a win over a person rated higher than you yields more Elo points than a win over a person new to the game. The same principles apply to losses.

Let’s cut right to the chase and talk about how XonStat determines the points increase or decrease for each match. Unfortunately for this part we’ll have to look at the gory mathematical details. We’ll use a theoretical match between players A and B, and we’ll look specifically at the points increase/decrease for player A (forgive me for the bad screen caps):

[Image: elo.png]

In the notation above E is the expected outcome, S is the actual outcome (1 for a win, 0 for a loss), and K is the experience adjustment factor. We’ll talk more about K in a minute, but for now just think of it as a constant.
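In case the image does not display, here is a reconstruction of the formulas in LaTeX. It is an assumption, pieced together from the standard Elo formula and the numbers in the worked example later in this post, not a copy of the original image. The symbols R_A and R_B (the two players' current Elo scores) and the name Delta_A (the points change for player A) are introduced here for illustration.

```latex
E_A = \frac{10^{R_A/400}}{10^{R_A/400} + 10^{R_B/400}},
\qquad
\Delta_A = K \, (S_A - E_A)
```

With R_A = 450 and R_B = 350, the two powers of ten come out to roughly 13.33 and 7.50, which are the values used in the example further down.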

To get the points increase (or decrease) value for each match, we first have to start with what the expected outcome would be. Simply put, stronger players are more likely to defeat weaker players, and the formula for E is just expressing that in terms of a likelihood between 0 and 1. This may help you understand why winning against very weak opponents nets you very few points - if the expected outcome is near 1 (a sure win), it follows that what’s left over after subtracting it from 1 will not be very much, so the number of points gained will be low.
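To make the "expected outcome" idea concrete, here is a minimal Python sketch. It assumes the standard Elo form with a scale of 400, which matches the numbers in the example in this post but is not taken from the XonStat source. The function names are made up for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Likelihood (between 0 and 1) that player A beats player B.

    Assumption: standard Elo curve with a scale of 400.
    """
    q_a = 10 ** (rating_a / 400)
    q_b = 10 ** (rating_b / 400)
    return q_a / (q_a + q_b)


def points_change(rating_a: float, rating_b: float, s_a: float, k: float = 40.0) -> float:
    """Points gained (positive) or lost (negative) by player A: K * (S - E)."""
    return k * (s_a - expected_score(rating_a, rating_b))


# A heavy favorite gains almost nothing for winning...
print(round(points_change(1000, 100, 1), 3))  # 0.224
# ...while the underdog gains nearly the full K for the same result.
print(round(points_change(100, 1000, 1), 3))  # 39.776
```

Two evenly rated players each have an expected outcome of 0.5, so the winner gains exactly half of K.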

The Experience Factor

The more games you have under your belt, the more accurate we can say your Elo score is, and thus the less we need to adjust your score after each game. This is exactly what the K is for in the formulas above. We vary K along with how many games you’ve played. You start off with a K value of 200, but over the course of 32 games that factor decreases linearly down to 40. This means that a new player will jump up and down in point value much more dramatically than a seasoned veteran until he or she reaches the requisite number of games.
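The linear ramp can be sketched in a few lines of Python. This is an illustration under assumptions, not the XonStat code: the function and parameter names are invented, and whether the game count includes the current match is a guess.

```python
def k_for_games(games_played: int,
                k_start: float = 200.0,
                k_end: float = 40.0,
                ramp_games: int = 32) -> float:
    """K value after a given number of games: 200 down to 40 over 32 games."""
    if games_played >= ramp_games:
        return k_end
    # (200 - 40) / 32 = 5 points of K shed per game played
    step = (k_start - k_end) / ramp_games
    return k_start - step * max(games_played, 0)


for n in (0, 8, 16, 32, 100):
    print(n, k_for_games(n))
```

With the default values this works out to K = 200 - 5 x N for N below 32, and a flat 40 afterwards.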

In XonStat we also use the K value to account for players who have not played an entire match. Such players have their K value modified downwards by the percentage of the match they did not play. This is determined by comparing their alivetime value with the match’s overall duration. For example, if a player plays 800 seconds out of a 1000-second match, their K value will be 80% of what it would have been otherwise.
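The play-time adjustment is a simple scaling. Here is a minimal sketch, with invented names and an assumed clamp to the 0..1 range in case alivetime and duration disagree slightly:

```python
def scaled_k(base_k: float, alivetime: float, duration: float) -> float:
    """Scale K by the fraction of the match the player actually played."""
    if duration <= 0:
        return 0.0  # assumption: a zero-length match counts for nothing
    fraction = min(max(alivetime / duration, 0.0), 1.0)
    return base_k * fraction


# 800 seconds out of a 1000-second match keeps 80% of K.
print(round(scaled_k(40, 800, 1000), 2))  # 32.0
```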

A Real Example

Enough with this A and B business. Let’s do an example with real values! Imagine I (Elo 350) play Mirio (Elo 450) in a duel and Mirio wins. Assuming both of us are experienced players with over 32 games played, we’ll each have a K value of 40. Since Mirio won, his S value is 1 while mine is 0.

Mirio’s points value from his win will be:

PHP Code:
40*(1 - 13.33/(13.33+7.50)) = 14.40

My points from the loss will be:

PHP Code:
40*(0 - 7.50/(7.50+13.33)) = -14.40

Now let’s turn the tables and see the points values for if I win the match! This time Mirio’s S value will be 0 and mine will be 1.

Mirio’s points from his loss will be:

PHP Code:
40*(0 - 13.33/(13.33+7.50)) = -25.60

My points from the win will be:

PHP Code:
40*(1 - 7.50/(7.50+13.33)) = 25.60

Take note of the points differences when the winners are transposed. If I won the match, I was rewarded with more points because I was expected to lose. Hooray to the underdog!
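The four results above can be double-checked with a few lines of Python. This is only a sketch: the helper name is invented, and the scale of 400 is assumed because it reproduces the 13.33 and 7.50 terms.

```python
def duel_points(winner_elo: float, loser_elo: float, k: float = 40.0):
    """Return (winner's gain, loser's loss) for a duel where both share the same K."""
    q_w = 10 ** (winner_elo / 400)
    q_l = 10 ** (loser_elo / 400)
    gain = k * (1 - q_w / (q_w + q_l))   # K * (S - E) with S = 1
    loss = k * (0 - q_l / (q_w + q_l))   # K * (S - E) with S = 0
    return gain, loss


mirio_wins = duel_points(450, 350)
antibody_wins = duel_points(350, 450)
print(f"{mirio_wins[0]:.2f} {mirio_wins[1]:.2f}")        # 14.40 -14.40
print(f"{antibody_wins[0]:.2f} {antibody_wins[1]:.2f}")  # 25.60 -25.60
```

When both players have the same K, the gain and the loss always mirror each other.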

Team Matches

Thus far we’ve only discussed matches between two players. That’s fine and well, but what about team matches where there is more than one person to compare against? I handle this by running the above calculations between each winner and loser and averaging the result. For example, if a player has won a game against three individuals, the points he gets are the sum of the points from each individual calculation divided by three. The points gained are thus the average of the points gained from each individual a player has defeated, with the opposite being true for lost points.
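The averaging can be sketched like this. It is a rough illustration with invented names, assuming a single K for the player and the standard Elo curve with a scale of 400, not the actual XonStat code.

```python
def team_points(player_elo: float, opponent_elos: list, won: bool, k: float = 40.0) -> float:
    """Average the one-on-one points change against every opposing player."""
    if not opponent_elos:
        raise ValueError("need at least one opponent")
    s = 1.0 if won else 0.0
    deltas = []
    for opp_elo in opponent_elos:
        # Expected outcome of the player against this one opponent
        e = 1 / (1 + 10 ** ((opp_elo - player_elo) / 400))
        deltas.append(k * (s - e))
    return sum(deltas) / len(deltas)


# Beating three opponents: one weaker, one equal, one stronger.
print(round(team_points(400, [300, 400, 500], won=True), 2))
```

Because the result is an average, facing more opponents does not by itself increase the points at stake in a match.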

In Conclusion


In this post I’ve covered the details of the Elo implementation within XonStat, the statistics database for Xonotic. I’ve covered individual and team games as well as how the basic structure of Elo is used to determine point gains and losses from matches. I also discussed a little about how experience alters the ratings. I hope you’ve found this subject matter entertaining! As always, you can contact me in the forums at forums.xonotic.org or on IRC at #xonotic on Quakenet with any questions or comments.


RE: Much Ado About Elo - machine! - 07-22-2012

Nice, thanks for writing! :)


RE: Much Ado About Elo - frostwyrm333 - 07-22-2012

Where have my skills gone? From 100 to 150! Do you think I play just for fun?
(I'll read it tomorrow...)


RE: Much Ado About Elo - zykure - 07-23-2012

Hey, that sounds great! (And by the way, the whole Xonstat system is a true masterpiece :D)

Especially that "K-factor reflects playing time"-thing should solve a few problems. (It wasn't implemented that way before, was it?) But what about this: I've played some duels, and sometimes people will just quit mid-game (on purpose or by accident, doesn't matter), which means I wouldn't get any Elo points, although I would've won that match. With the new system, players still have to be in the game at the end of the match so that points are calculated, is that right?

And another question: This new system does not take the players' scores into account anymore, like it did before? So it doesn't matter if I play 30:1 or 30:29 against somebody? What's the reason for that? I won't say that it's a bad decision, it just feels like "one step back" imho ;)


RE: Much Ado About Elo - rocknroll237 - 07-23-2012

Thanks for taking the time to explain this, Antibody. :)


RE: Much Ado About Elo - booo - 07-23-2012

Excellent Job and explanation!

One funny note: in chess in the early 80s and 90s, Elo led the top players to very rarely join a tourney with many weaker-rated players, because their chances of losing Elo points were amplified, as they might lose to an underdog ;). I guess this system makes the top players more like a trophy.


RE: Much Ado About Elo - frostwyrm333 - 07-24-2012

3 days ago I was somewhere around 96 in CTF, today I am 188. I don't really mind because I am going to have a break from Xonotic, the longer the better. Unbalanced games are an inherent problem which is never going to be solved. I doubt that 5% of players who try the game stay for a second try. Everyone who stayed is either a pro and/or crazy.


RE: Much Ado About Elo - Micha - 07-24-2012

(07-24-2012, 12:56 PM)frostwyrm333 Wrote: I doubt that 5% of players who try the game stay for a second try. Everyone who stayed is either a pro and/or crazy.

Come to the bright land of instagib, where every noob gets some frags even against pros. Braindead and fun. <3


RE: Much Ado About Elo - Antibody - 07-28-2012

@zykure - yes, you have to be there at the end of the match (present on the scoreboard) for Elo points to count either way.

The algorithm was previously based on your points differential versus your opponents: actual score/(actual score + opponent's score). That didn't work for duel, so I changed the algorithm's S value. That's now not working the best for DM (and CTF isn't perfect either), so div and I have collaborated to create a "blended" Elo that will use win/loss for duels and score differential for other game types. Nothing is ever perfect and I'm fine-tuning the changes now, so expect them to hit the site in the next week or so.
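To illustrate the difference between the two S values, here is a rough Python sketch. The details of the "blended" version are not spelled out in this thread, so this is only a guess at its shape: win/loss for duel, score share for everything else. The function name, the game type strings, and the handling of ties and zero scores are all assumptions.

```python
def actual_outcome(game_type: str, my_score: float, opp_score: float) -> float:
    """S value for one player-versus-player comparison."""
    if game_type == "duel":
        # Win/loss only: 30:1 and 30:29 count the same.
        # Assumption: a tie is treated as a loss here.
        return 1.0 if my_score > opp_score else 0.0
    total = my_score + opp_score
    if total <= 0:
        return 0.5  # assumption: no usable score information
    # Score share: actual score / (actual score + opponent's score)
    return my_score / total


print(actual_outcome("duel", 30, 29))  # 1.0
print(actual_outcome("dm", 30, 10))    # 0.75
```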


RE: Much Ado About Elo - chooksta - 07-28-2012

i was going to make the elmo algorithm , that way everyone tops the ladder and feels warm and fuzzy inside :D

* chooksta wonders what ranking elmo would get if he played this game


t


:^

( btw anti , that was a great write up , even i got it )

yay!


RE: Much Ado About Elo - frostwyrm333 - 07-29-2012

I'm back. (I don't know about precise implementation and different game modes but if what you described above is the core, then this: )

After a week of not playing the game, my uber-pro skills of inactivity granted me CTF position of 74 (I was 200~ several days ago, ~95 before that).
I thought about elo a bit and I think that it is unfair. In other words, it doesn't reflect the "reality" of Xonotic well. It may be good for competitive and/or controlled play, but public play in Xonotic is something different. The first thing is that it only takes winning or losing into consideration, and that's quite black & white. Teams are usually not the same as when the game started; there is often rebalancing by the time the score hits 0:5. Playing well for one side means nothing if you lose. And if you play against hard odds and you manage 9:10, it means nothing. But achieving nothing and switching at the last second will give you the score? There is no incentive for oh-so-great-and-practical-voluntary-team-rebalancing. When somebody said that he was going to change teams before losing I thought he was joking...

Other aspect I now miss is playing for fun. Not worrying about the bad outcome, only if you made a difference or not.
Measuring just how well you play in a certain gamemode (average score) would be simpler and more intuitive. Anything more complex would have to take player switching, quitting, etc. into consideration.

stuff:
- Perhaps being more active should be a bit more rewarding, or not playing should erode your score. Top CTF players are quite inactive.
- quantity problem, if you play alone and lose against 5 noobs it will treat it in the same way as if you lost to a noob in a duel.
- for duels, clan battles and maybe DM, elo is still good solution
- I don't know about DM. The way it works, it should be more fair, because it doesn't matter that much how many people there are or who joins or leaves; only winning matters. A problem may arise if you are the #2 player who usually loses but still beats the worse players. For those players the score will only be going down.
- I'm also not sure about noob dilution, you can play basically a duel with a pro and having 6 other noobs present may be inconsequential to the outcome but it will change the score dramatically.
- good players leaving before the end may affect the score greatly.
- quitting might be a way of cheating the system, people tend to do this anyway, maybe they just had enough
- elo does not even tell you how good somebody is (jumping from 200 to 100 without one match?)
- why are score differences in matches no longer displayed?

also there may be a bug in your article:
we’ll each have a K value of 20. ----- but over the course of 32 games that factor decreases linearly down to 40.


RE: Much Ado About Elo - Antibody - 07-30-2012

In the week since I wrote that article I've made a lot of tweaks to the algorithm, especially for team games and DM (duel is largely unchanged). divVerent had many good suggestions that counteract some of your negative points and we came to a good compromise. I'll be writing a followup blog post to inform everyone of the changes.

Let me address a couple of your points anyway:

- I'm sorry you feel like you aren't playing for fun anymore. You can turn off your tracking and be excluded from any Elo calculations if you'd like, or you can just ignore the ranks if they aren't important to you.
- There is a reason the variance is high in the early stages - it's because we aren't yet confident that we've arrived at something close to your true score. That's the reason why you don't get ranked until >32 games.
- Quitting is a way to game the system at the moment. div and I are discussing ways to counteract that. Even without that in place, quitting is something everyone else playing the game can see. It doesn't take long for people to recognize the quitters and adjust accordingly for them. Additionally if people want to have a single minded focus on Elo, fine. IMO they are ignoring many other aspects of stats that would be beneficial to them if they were to stay in the game.
- The ending K value is indeed 40. In my haste I wrote down the kfactor (percentage reduction to K) instead. Fortunately this does not materially impact what I'm trying to show with the example, since each player had the same ending K value.


RE: Much Ado About Elo - frostwyrm333 - 07-30-2012

I don't take statistics all that seriously, I have just checked them a dozen times and realized with the help of your explanation that it's sometimes not fair. They record the state of the game at the end, which may be the opposite of how the game started out.

I just wanted to point out potential problems, I know that I can ignore my stats and that nobody gives a frag about my score anyway. Statistics are quite cool but not when people can cheat them.

Quitting is not only stats-breaking but also game-breaking but that is probably for some other thread...

- low variance: stats say I have played 391 CTF matches :-), my elo is better than rocknroll's (98) and he's a better player than I am


RE: Much Ado About Elo - zykure - 08-10-2012

Here's a funny thing I've coded lately ... thought I'd just share that with you: http://pastebin.com/BydSaRp5

It shows the elo change which results from a duel match for the two possible outcomes (A wins vs. B wins), according to Antibody's post above. The K-factor can be given too (default 40, for new players you have to calculate it from the number of games N (N < 32): K = 200 - 5 x N).

Example output:
Code:
$ python elo.py 100 1000
A has elo  100.000 at K =  40
B has elo 1000.000 at K =  40

Outcome 1: A wins
    A gains  39.776 points
    B loses  39.776 points

Outcome 2: B wins
    A loses   0.224 points
    B gains   0.224 points

Note that in duel matches, the final score doesn't affect the elo change, only the rank does. (However in dm/ctf, score is also important!)