TMM Rating Allowance Needs to Use Ladder 1v1 Matching (or close to it)

BlackYps

Well, the concrete values are what I am trying to get from this thread. 4head

The actual code is this:

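import statistics as stats  # provides pstdev

newbie_bonus = 0
time_bonus = 0
ratings = []
for team in match:
    # Teams that failed to get matched more often earn a larger bonus.
    time_bonus += team.failed_matching_attempts * config.TIME_BONUS_WEIGHT
    # The newbie bonus only applies to teams without a top player.
    if not team.has_top_player():
        newbie_bonus += team.has_newbie() * config.NEWBIE_BONUS_WEIGHT
    # Collect every player's rating as mean - 3 * dev.
    for mean, dev in team.raw_ratings:
        rating = mean - 3 * dev
        ratings.append(rating)

# Fairness: 1 for equal team totals, 0 once the gap reaches MAXIMUM_RATING_IMBALANCE.
rating_imbalance = abs(match[0].cumulated_rating - match[1].cumulated_rating)
fairness = max((config.MAXIMUM_RATING_IMBALANCE - rating_imbalance) / config.MAXIMUM_RATING_IMBALANCE, 0)
# Uniformity: 1 for identical ratings, 0 once the spread reaches MAXIMUM_RATING_DEVIATION.
deviation = stats.pstdev(ratings)
uniformity = max((config.MAXIMUM_RATING_DEVIATION - deviation) / config.MAXIMUM_RATING_DEVIATION, 0)

# The bonuses are added on top, so the quality can exceed 1.
quality = fairness * uniformity + newbie_bonus + time_bonus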

The preliminary config values are:

self.NEWBIE_MIN_GAMES = 10
self.TOP_PLAYER_MIN_RATING = 1600
self.MINIMUM_GAME_QUALITY = 0.5
self.MAXIMUM_RATING_IMBALANCE = 600
self.MAXIMUM_RATING_DEVIATION = 300
self.TIME_BONUS_WEIGHT = 0.1
self.NEWBIE_BONUS_WEIGHT = 0.2

Explanation:

@BlackYps said in TMM Rating Allowance Needs to Use Ladder 1v1 Matching (or close to it):

I am currently writing a new matching algorithm to make a 4v4 queue possible. From my understanding, TrueSkill doesn't factor in rating disparity between players when calculating the game quality, so in order to get only games with similarly rated players we need to introduce our own quality metric. I will now explain my first draft for this so you can discuss whether you think it is a good formula and make suggestions to improve it.
Currently I calculate quality = uniformity * fairness + newbie bonus + time bonus
Uniformity goes from 0 to 1: it is 1 if all players in the game have the same rating and 0 if the standard deviation of the ratings of all players is greater than 300.
Fairness works the same but looks at the difference in total team skill, so it is 0 if the rating difference between the teams is higher than 600.
The newbie bonus is there to match new players faster and is a flat bonus if a new player is in one of the teams.
The time bonus increases with every time you were not successfully matched.

Additional explanation:
The deviation is roughly one third of the biggest rating difference if we assume a somewhat even rating distribution in the team.
If both uniformity and fairness are about 2/3, the quality is 4/9 ≈ 0.44, which falls just short of the quality threshold of 0.5. That means that a game with a 200 team rating difference and a 300 difference between individual players is roughly the borderline case of what is acceptable. One of these metrics can be worse if the other one is better.
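The borderline case can be checked numerically. The sketch below plugs a 200-point team difference and a 300-point player spread into the formula, with the deviation taken as one third of the spread per the rule of thumb above. With the preliminary config values the product comes out at about 0.44, slightly under the 0.5 cutoff. Each factor would need to be about 0.71 for the product to reach 0.5.

```python
MAXIMUM_RATING_IMBALANCE = 600
MAXIMUM_RATING_DEVIATION = 300
MINIMUM_GAME_QUALITY = 0.5

team_difference = 200          # rating gap between the two teams
player_spread = 300            # biggest gap between two individual players
deviation = player_spread / 3  # rule of thumb: deviation is about a third of the spread

fairness = (MAXIMUM_RATING_IMBALANCE - team_difference) / MAXIMUM_RATING_IMBALANCE  # 2/3
uniformity = (MAXIMUM_RATING_DEVIATION - deviation) / MAXIMUM_RATING_DEVIATION      # 2/3
quality = fairness * uniformity                                                     # 4/9

# Value each factor needs if both are equal and the product should hit the cutoff.
required_each = MINIMUM_GAME_QUALITY ** 0.5

print(f"quality = {quality:.3f}, accepted = {quality >= MINIMUM_GAME_QUALITY}")
print(f"each factor must be at least {required_each:.3f}")
```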

Katharsas

Could you randomly generate some imaginary players with rating values/number of games, put them into random teams, and then calculate the quality? Without concrete examples it's really hard to judge such an algorithm.

Edit: And throw away examples where fairness is too far off.
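A rough sketch of that experiment, under stated assumptions: the ratings are drawn from a normal distribution with made-up parameters, the `MIN_FAIRNESS` cutoff for "too far off" is hypothetical, and the bonuses are left out so only fairness and uniformity are exercised.

```python
import random
import statistics

MAXIMUM_RATING_IMBALANCE = 600
MAXIMUM_RATING_DEVIATION = 300
MIN_FAIRNESS = 0.5  # hypothetical cutoff for "fairness is too far off"


def score(team_a, team_b):
    """Return (fairness, uniformity, quality) for two lists of player ratings."""
    imbalance = abs(sum(team_a) - sum(team_b))
    fairness = max((MAXIMUM_RATING_IMBALANCE - imbalance) / MAXIMUM_RATING_IMBALANCE, 0)
    deviation = statistics.pstdev(team_a + team_b)
    uniformity = max((MAXIMUM_RATING_DEVIATION - deviation) / MAXIMUM_RATING_DEVIATION, 0)
    return fairness, uniformity, fairness * uniformity


def random_examples(count, seed=0):
    """Generate random 4v4 games and keep only those with acceptable fairness."""
    rng = random.Random(seed)
    examples = []
    while len(examples) < count:
        # Made-up rating distribution; swap in real data if available.
        players = [round(rng.gauss(1000, 150)) for _ in range(8)]
        team_a, team_b = players[:4], players[4:]
        fairness, uniformity, quality = score(team_a, team_b)
        if fairness < MIN_FAIRNESS:
            continue  # throw away examples where fairness is too far off
        examples.append((team_a, team_b, fairness, uniformity, quality))
    return examples


for team_a, team_b, fairness, uniformity, quality in random_examples(5):
    print(team_a, team_b, f"fairness={fairness:.2f} uniformity={uniformity:.2f} quality={quality:.2f}")
```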

BlackYps

I don't have a lot of examples because I have just started making some, but here are a few for you:
(Keep in mind that a game quality of 0.5 is the cutoff here for a game to be considered)

A "Search" is a party of players that is searching for a game. The "pX" entries are the player names, so you can see how many players there are in the search. The number at the end is the average rating of that search party. The game quality uses the formula that I explained in my previous post.

team a: [Search(['p12'], 842), Search(['p15'], 738), Search(['p1', 'p2'], 781)] cumulated rating: 3142   average rating: 785.5
team b: [Search(['p11'], 745), Search(['p5', 'p6'], 788), Search(['p16'], 816)] cumulated rating: 3137   average rating: 784.25
bonuses: 0.0 rating disparity: 5 -> fairness: 0.9916666666666667 deviation: 44.917806881013234 -> uniformity: 0.8502739770632892 -> game quality: 0.8431883605877618

team a: [Search(['p5', 'p6'], 788), Search(['p16'], 816), Search(['p13'], 971)] cumulated rating: 3363   average rating: 840.75
team b: [Search(['p1', 'p2'], 781), Search(['p12'], 842), Search(['p17'], 951)] cumulated rating: 3355   average rating: 838.75
bonuses: 0.0 rating disparity: 8 -> fairness: 0.9866666666666667 deviation: 79.45399612354309 -> uniformity: 0.7351533462548563 -> game quality: 0.7253513016381249

team a: [Search(['p3', 'p4'], 1004.5), Search(['p17'], 951), Search(['p16'], 816)] cumulated rating: 3776   average rating: 944
team b: [Search(['p13'], 971), Search(['p12'], 842), Search(['p5', 'p6'], 788)] cumulated rating: 3389   average rating: 847.25
bonuses: 0.0 rating disparity: 387 -> fairness: 0.355 deviation: 92.79134859996378 -> uniformity: 0.6906955046667874 -> game quality: 0.24519690415670953

team a: [Search(['p7', 'p8', 'p9'], 1011.3333333333334), Search(['p12'], 842)] cumulated rating: 3876   average rating: 969
team b: [Search(['p13'], 971), Search(['p3', 'p4'], 1004.5), Search(['p17'], 951)] cumulated rating: 3931   average rating: 982.75
bonuses: 0.0 rating disparity: 55 -> fairness: 0.9083333333333333 deviation: 67.96035149261664 -> uniformity: 0.7734654950246113 -> game quality: 0.7025644913140219

team a: [Search(['p7', 'p8', 'p9'], 1011.3333333333334), Search(['p11'], 745)] cumulated rating: 3779   average rating: 944.75
team b: [Search(['p10'], 1047), Search(['p15'], 738), Search(['p14'], 1032), Search(['p13'], 971)] cumulated rating: 3788   average rating: 947
bonuses: 0.0 rating disparity: 9 -> fairness: 0.985 deviation: 125.21026066181636 -> uniformity: 0.5826324644606121 -> game quality: 0.5738929774937029

team a: [Search(['p7', 'p8', 'p9'], 925.3333333333334), Search(['p15'], 1328)] cumulated rating: 4104   average rating: 1026
team b: [Search(['p13'], 1115), Search(['p16'], 1231), Search(['p3', 'p4'], 998)] cumulated rating: 4342   average rating: 1085.5
bonuses: 0.0 rating disparity: 238 -> fairness: 0.6033333333333334 deviation: 152.77004778424336 -> uniformity: 0.4907665073858555 -> game quality: 0.2960957927894662

team a: [Search(['p7', 'p8', 'p9'], 925.3333333333334), Search(['p13'], 1115)] cumulated rating: 3891   average rating: 972.75
team b: [Search(['p3', 'p4'], 998), Search(['p5', 'p6'], 918.5)] cumulated rating: 3833   average rating: 958.25
bonuses: 0.0 rating disparity: 58 -> fairness: 0.9033333333333333 deviation: 84.13976467758869 -> uniformity: 0.719534117741371 -> game quality: 0.6499791530263718

team a: [Search(['p7', 'p8', 'p9'], 925.3333333333334), Search(['p12'], 846)] cumulated rating: 3622   average rating: 905.5
team b: [Search(['p5', 'p6'], 918.5), Search(['p1', 'p2'], 810.5)] cumulated rating: 3458   average rating: 864.5
bonuses: 0.0 rating disparity: 164 -> fairness: 0.7266666666666667 deviation: 79.85612061701971 -> uniformity: 0.733812931276601 -> game quality: 0.5332373967276633
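The printed fairness, uniformity and quality values can be recomputed from the rating disparity and deviation in each line. The sketch below does this for three of the examples above, using the preliminary config values. The individual player ratings are not listed, so the deviation is taken as given.

```python
MAXIMUM_RATING_IMBALANCE = 600
MAXIMUM_RATING_DEVIATION = 300
MINIMUM_GAME_QUALITY = 0.5

# (rating disparity, deviation, printed game quality) copied from the examples above.
examples = [
    (5, 44.917806881013234, 0.8431883605877618),
    (387, 92.79134859996378, 0.24519690415670953),
    (164, 79.85612061701971, 0.5332373967276633),
]


def quality_from(disparity, deviation):
    """Quality without bonuses, which are 0.0 in all listed examples."""
    fairness = max((MAXIMUM_RATING_IMBALANCE - disparity) / MAXIMUM_RATING_IMBALANCE, 0)
    uniformity = max((MAXIMUM_RATING_DEVIATION - deviation) / MAXIMUM_RATING_DEVIATION, 0)
    return fairness * uniformity


for disparity, deviation, printed in examples:
    quality = quality_from(disparity, deviation)
    accepted = quality >= MINIMUM_GAME_QUALITY
    print(f"disparity={disparity} quality={quality:.4f} printed={printed:.4f} accepted={accepted}")
```

Of these three, only the game with a disparity of 387 falls below the 0.5 cutoff.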

Katharsas

I assume all players in that list have a newbie bonus of 0?
Search(['p7', 'p8', 'p9'], 925.3333333333334) means those 3 players are in the queue together, and 925 is their average rating?

Questions about algorithm:
What is inside the match array?
rating_imbalance = abs(match[0].cumulated_rating - match[1].cumulated_rating)
What is match[0] / match[1] referring to?
Why is has_top_player() important and what does it do?

FtXCommando

top_player is used in the matchmaking process already. It's defined as anybody with over 1600 mu. It's used to eliminate certain players/teams from consideration when the system is just trying to throw a new player into a game after a few failed queue intervals.

BlackYps

@Katharsas said in TMM Rating Allowance Needs to Use Ladder 1v1 Matching (or close to it):

I assume all players in that list have a newbie bonus of 0?
Search(['p7', 'p8', 'p9'], 925.3333333333334) means those 3 players are in the queue together, and 925 is their average rating?

Yes to both.

The match array just holds the two teams, so the rating imbalance is just the difference between the sum of the ratings of the two teams.

As FtX explained, a top player has >1600 mu.
This way the newbie bonus is only awarded if the team has no top players. But now that I think about it, I am not sure anymore if this is a good idea.

Katharsas

Okay, what does deviation = stats.pstdev(ratings) do?

BlackYps

This is from the Python statistics module. It calculates the population standard deviation.
https://docs.python.org/3/library/statistics.html#statistics.pstdev
https://www.khanacademy.org/math/statistics-probability/summarizing-quantitative-data/variance-standard-deviation-sample/a/population-and-sample-standard-deviation-review
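As a small illustration, here is `statistics.pstdev` next to the sample version `statistics.stdev` on four made-up ratings. The population version divides by N and the sample version by N - 1, so the sample value is always somewhat larger.

```python
import statistics

ratings = [800, 900, 1000, 1100]  # made-up ratings, evenly spread over 300 points

population = statistics.pstdev(ratings)  # divides by N, about 111.8
sample = statistics.stdev(ratings)       # divides by N - 1, about 129.1

print(f"pstdev = {population:.1f}, stdev = {sample:.1f}")
```

For this evenly spread set the population deviation is a bit over one third of the 300-point gap between the lowest and highest rating.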

Morax

So guys, this is great and all (good work, seriously), but most people playing FAF neither understand the code nor have the time to translate it into layman's terms. It would be really appreciated if an explanation that is as precise as possible were given when distributing information that shows rating brackets as a requisite for what they will experience while playing the game.

BlackYps

I don't think I understand what you want to say. Are you talking about the map pools? The matchmaker won't use rating brackets at all. There will also be no further explanation of the matchmaker inner workings in the client. The end user just queues up and will automagically get some nice balanced games (hopefully).

Katharsas

OK, I think the formula makes sense, except for checking top_player. If there are a top player and a noob on the same team, it will probably already create bad uniformity, right?

However, to find good parameters I would have to code that formula up and try it with varying numbers; I cannot really tell anything from your example calculations. So yeah, I don't think you will get much use out of the forum for that^^

StormLantern

I didn't read the whole thing, but in my experience it can be frustrating in TMM when you play against two opponents that have a high rating difference, as the death (or disconnect) of the lower-rated opponent often means an auto loss for your team. If it is at all possible to still get consistent enough games when the maximum rating difference (between two teammates) is capped at, say, 400 or so, that would be a good change, I think. But I suppose that would depend on the number of players in the queue.
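Such a cap could be expressed as a simple extra check next to the quality formula. This is only a sketch of the suggestion: the function name is hypothetical and the 400 figure is the one floated in this post, not a value from the matchmaker.

```python
MAX_TEAMMATE_GAP = 400  # figure suggested above, not an actual config value


def within_teammate_cap(team_ratings, cap=MAX_TEAMMATE_GAP):
    """Return True if no two teammates are more than `cap` rating points apart."""
    return max(team_ratings) - min(team_ratings) <= cap


print(within_teammate_cap([900, 1000, 1250, 1300]))  # gap of 400
print(within_teammate_cap([600, 1100]))              # gap of 500
```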