Recapturing Armaskill

This article is divided into three parts. In the first part, I will provide a practical understanding of Armaskill and define the scope and goals of my further analysis. In part two, Armaskill and other statistics will be analyzed. In part three, I will introduce and explain a new recreation of Armaskill, which I have of course named after myself (always name new things after yourself, they might catch on someday!). It’s a long article. I’ve kept it light on the math. Enjoy.

Part 1 – Introduction

dlh maintains a set of statistics for several servers at Gridstats.com, and a couple years ago added a player evaluation system that Microsoft developed for its Xbox Live system. TrueSkill or, for our purposes, Armaskill is in its most basic form an adaptation of an Elo rating to games with more than two players. In a fortress match, each player would enter with an Armaskill rating. If they win the match, it goes up. If they lose, it goes down. The collective weight of their opponents’ initial ratings as compared to their team’s determines how much it goes up or down. If a bad team loses to a good one, the respective Armaskill ratings of the teams’ members do not change all that much, since we haven’t learned anything new.
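To make the "we haven’t learned anything new" idea concrete, here is a minimal sketch of the plain two-player Elo update that Armaskill builds on. This is not how Armaskill itself is computed; the step size `k=32` and the ratings are assumptions picked for illustration.

```python
def expected_score(rating_a, rating_b):
    """Expected score for A against B under the standard Elo formula."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating, opponent_rating, won, k=32):
    """Return the new rating after one result; k is an assumed step size."""
    score = 1.0 if won else 0.0
    return rating + k * (score - expected_score(rating, opponent_rating))


# A weak side losing to a strong side barely moves...
weak_after_loss = elo_update(1200, 1600, won=False)
# ...while the same weak side pulling off the upset moves a lot.
weak_after_win = elo_update(1200, 1600, won=True)
```

The expected loss costs only a few points, while the upset win is worth nearly the full step, which is the behavior described above.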

Figure 1a: In this simplification, each match performance is graded from 1-9, and then stacked according to its frequency. The average performance will be where the graph peaks.

Figure 1b: Smoothed over many matches, the distribution approaches some function of the mean and the standard deviation

Armaskill isn’t quite as simple as tracking a single rating, as is done for Elo. A further improvement Microsoft made was to represent players with two numbers, rather than just one. A player is represented by their mean skill and by a standard deviation [Figure 1a, 1b]. Each individual performance is thought to be an independent, random variable, and so the collection of these individual performances over a large number of matches should fall into some normal distribution. The Armaskill rating incorporates both the mean and the standard deviation, and for the purposes of this study, only the rating is important.

[A note on standard deviation: For the practical purposes of this article, think of standard deviation as representing the spread of a statistic. The higher the number, the more broadly spread the thing is. A standard deviation of 0 would be like tossing a one-sided coin. I'll often normalize standard deviations as a percentage of the mean. This just means we can compare the spread across different types of statistics.]
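As a rough sketch of that normalization, the snippet below expresses a standard deviation as a percentage of the mean. The function name and the sample values are made up for illustration, and the use of the population standard deviation (`pstdev`) is an assumption; the sample version would give slightly larger numbers.

```python
import statistics


def relative_spread(values):
    """Standard deviation expressed as a percentage of the mean."""
    mean = statistics.mean(values)
    return 100.0 * statistics.pstdev(values) / mean


one_sided_coin = [1, 1, 1, 1]            # every toss identical: no spread
kills_per_round = [0.4, 0.6, 0.5, 0.9]   # made-up sample values

print(relative_spread(one_sided_coin))
print(relative_spread(kills_per_round))
```

Because the result is a percentage, a spread in kills per round can be compared directly with a spread in, say, ratings.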

It is important to understand what events in a match factor into Armaskill. The only event that Armaskill cares about is the match outcome, win or lose. Nothing else that happens is taken into account. The individual player could die every round; all Armaskill knows is whether his team wins.

Essentially, Armaskill is a black box. It has inputs into a match, and it gets outputs, but it can’t see into the box. It can’t see into a match and describe anything about what is happening there. Over the course of many many matches, this black box works very well. The players involved in each match are the variables, and for each player’s rating, he himself is the constant, while all his teammates and opponents change. Holding him constant, we get a good sense of how good he is based on the effect he has on a match. It’s not perfect, but it’s pretty good.

In fact, it’s the best we can get without making value judgments. The nice thing about Armaskill is that all it cares about is match outcomes. If we were to try to improve on Armaskill, we would have to use our understanding of fortress to place value on various events. How much is a core dump worth? How bad is it to be killed? What about cutting? What about holing?

In the first place, a lot of these events don’t show up in the statistical record. Even if they do, there’s little context. Killing a very bad opponent doesn’t mean as much as killing a very good one. Getting killed after the round ends probably isn’t as bad as getting killed during it. You could think of a lot of these things. They all make it hard to make an assumption about the value of certain events.

But the problem is that we care about those events because that’s how we as players understand the game. We keep an inner count of how often we are dying or killing or doing all sorts of other things. Our judgment of our skill is based on how well we perform on a round-by-round basis, and, in fortress, we tend to ignore the match outcome in judging how good the players in the match are. We are much more likely to look at individual scores on the scoreboard than the team ones.

And those scores are completely ignored by Armaskill. There is a disconnect between how good Armaskill says we are and our daily, intuitive judgments. The logic behind Armaskill is sound, and we should accept its ratings as an accurate portrayal of players’ performances in casual fortress. That is an important caveat, because not all players play with the same focus in casual games. If Armaskill is skewed or flawed, it is because of how seriously players take those matches. That flaw extends to kill and die statistics as well, and it shows up on our in-game scoreboard. Unless you decide for yourself how seriously a given player takes casual fortress, it is impossible to use Armaskill or any other statistic to judge if that player is better than any other.

In this study, we are not trying to judge players against one another, because, as I said above, that’s impossible without making assumptions. For the purposes of this study, we are going to ignore the effect of that flaw. We are going to take the data set as an accurate portrayal of performance, which it is, rather than of ability.

The goal of this study is to get inside Armaskill’s black box. By the end of it, we will be able to say something about the value of kills, dying, and the accuracy of the scoreboard at the end of matches. The goal is to create a statistic that ignores match outcomes and that closely resembles Armaskill. Compiling other statistics, we want to recreate the outcome Armaskill gives us as closely as possible. That process will tell us about the value of the statistics we use.

Part Two: The Data Set and Initial Observations

In this section, we will define and analyze the data set and then proceed to analyze Armaskill’s behavior in that data set.

Picking a data set

Gridstats keeps data on several servers. For this study, we want a good set of data. A good set will have a large sample size of matches and a large number of players represented. It will also represent a relatively short timespan. Players change over time, and so a server that stretched over years is asserting a comparison between two players who may have never played each other and had only a few mutual teammates and opponents. We want to limit those types of players.

Another criterion is having as much consistency as possible with regard to the number of players involved in a match. Ladle statistics or pickup matches are the ideal for this, but the sample size for either is far too small. There are only three players with more than 500 matches played in pickup, and the Ladle statistics cover just two Ladles’ worth. Sample size requirements mean that we will have to go with one of the casual servers.

Of the four choices, G5’s Mega Fortress Pro is by far the best. Of the four, it has the most players who have played more than 500 matches, with 169. Fortress Cafe, which represents a very long timespan, only has 94 such players, despite representing about 10,000 more total matches. Both MB53’s and DS’s servers have under 40 of these players.

We need to inspect the average number of players in a match. We don’t want a data set that represents 3v3 fortress, or 13v13. Gridstats does not currently present information on match participants, but it’s fairly simple to calculate. Taking those 169 players’ statistics, we can calculate the average number of opponents using the calculation below.

Hunter Average / Percent Enemies Hunted

Expanding this, we see that it gives us number of opponents.

  ( core dumps / round ) / ( core dumps / number of enemies )
= ( core dumps / round ) * (number of enemies / core dumps )
= ( core dumps * number of enemies ) / ( core dumps * round )
=  number of enemies / round    as the core dumps cancel each other
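The cancellation above amounts to a single division per player. Here is a small sketch of it; the function name and the sample figures are hypothetical, and it assumes Percent Enemies Hunted is supplied as a fraction (0.12) rather than a percentage (12).

```python
def average_opponents(hunter_average, percent_enemies_hunted):
    """Estimate a player's average number of opponents per round.

    hunter_average: core dumps per round.
    percent_enemies_hunted: core dumps per enemy faced, as a fraction.
    """
    if percent_enemies_hunted <= 0:
        raise ValueError("player has no recorded core dumps to divide by")
    return hunter_average / percent_enemies_hunted


# Hypothetical player: 0.70 core dumps per round, hunting 12% of enemies faced.
print(round(average_opponents(0.70, 0.12), 2))
```

A player with no core dumps at all cannot be measured this way, which is one more reason to stick to players with plenty of matches.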

We see then that the average number of opponents is 5.82, with a high of 7.26 (Lackadaisical) and a low of 3.42 (aussie@forums, makes sense). Calculating the quartiles, we see that half of the players have between 5.45 and 6.25 average opponents. This looks like a fairly good data set. Two graphs of this information are below.

Graph 1: Number of opponents rounded to the tenths. One can see that only 6 players averaged below 4.8 opponents, making it truly an outlier condition. This is a fairly good data set.

Here’s another way of visualizing the same data.

Graph 2: Here is the same data, sorted and lined up.

The relative consistency of this notwithstanding, it is still important to take account of how the number of opponents might affect other statistics we care about. Some players have on average two more opponents than others. The high value is more than twice the low. If the average number of opponents has a significant effect on other stats, it might be something we need to adjust for. We’ll return to it when we start building our new statistic.

Analyzing Armaskill

First let’s look at the top 25 in this population. After this, we will ignore names, but it will be of interest to come back and compare this list with the one our new stat creates at the end. Remember that this server is a couple years old.

 1 - newbÎe (newbie@forums)
 2 - -*insa*- (-*inS*-@forums)
 3 - ~*mkay*~ (Mkay1@unk.me)
 4 - ct|dreadlord (DreadLord@ct/junior)
 5 - Lackadaisical (Lackadaisical@forums)
 6 - madmax (madmax@forums)
 7 - free kill (dlh@generalconsumption.org)
 8 - Concord (Concord@forums)
 9 - noob13 (noob13@forums)
10 - 75 (7575757575@forums)
11 - koala (Pre@forums)
12 - ~*¤kült¤*~ (helllo@forums)
13 - fingerbib (fingerbib@forums)
14 - Luzifer (Lacrymosa@forums)
15 - ¶Potter (ppotter@aagid)
16 - slash (slash@ct/public)
17 - ct_Cronix (Cronix@ct/junior)
18 - G5 (G5@forums)
19 - slash (slash@unk.me)
20 - <^v{}v^> (vov@forums)
21 - teen (teen@ct/public)
22 - .×] Hoax (Hoax@forums)
23 - ct|Puuquie (Puuquie@ct/senior)
24 - CTxGonzap (Gonzap@ct/leader)
25 - Syllabear (Syllabear@forums)

The first thing to look at is how Armaskill is distributed among the population of players. We would expect it to fit a normal distribution, a bell-shaped curve much like the one I drew in the introduction. I took all the Armaskill ratings of the 169 players and rounded them to the ones place, and graphed the frequency of a rating occurring. That graph is below.

Graph 3: Not quite a bell curve

In a general sense it looks like a bell curve, but upon inspection, we can see it really is not. What we would expect is for values to be more smoothly distributed. It is surprising, for example, that only one player has a rating of 19 or 20, bordered by 4 players with ratings of 18 and 21. What we observe is a sort of clumping around certain values: 5, 16, 24, 39. This suggests a sort of tiering is going on. There may be pockets of players that usually play together, and have slightly limited interaction with other groups of players. This makes sense, since certain groups of players play at certain times of day. It also stands to reason that certain players only play if they see other specific players in the server. It would seem that the best way to improve your Armaskill rating would be to play only in matches with highly rated players.

This then is a significant flaw in Armaskill. It is contextualized by a player’s competition: if that competition is not completely homogeneous across an entire population, then some sort of tiering or clumping, as we observe above, may occur. The graph above suggests four different groups of players, with mean ratings of 5, 16, 24, and 39. These groups intersect, certainly, but it is not clear that those numbers are accurate. The degree of heterogeneity of each population may not insulate it from comparison with the population at large.

Nonetheless, the logic behind Armaskill is sound. Some other things are also worth checking empirically. Armaskill should not depend on any statistic other than winning matches. We expect a strong correlation between Armaskill and round winning percentage, as well as statistics that capture team performance in rounds and matches. That’s all fine, and expected. What we don’t want and what we need to check for is correlation between Armaskill and non-performance based statistics, like number of matches played and number of opponents. A correlation between Armaskill and either of those items would suggest you can improve your Armaskill simply by playing more (or less) or simply by playing in more crowded (or emptier) fortress games.

Graph 4, Conclusion: No correlation between number of opponents and Armaskill

Graph 5, Conclusion: No correlation between matches played and Armaskill

These two graphs compare our two suspects against Armaskill, and show them both to be innocent. There is no correlation between either and Armaskill, which is good. A correlation coefficient is a statistic that describes the correlative aspects of two data sets. A correlation coefficient of 1 would be perfect correlation, 0 is none. For matches played and number of opponents, the coefficients were 0.25 and 0.26 respectively. For a data set of 169, randomness can account for that very weak correlation. Furthermore, we can expect that a very bad player wouldn’t stay very bad if he played a lot of matches. The graph shows that for players playing more than around 2000 matches, a skill floor is created. You play enough, you’re certain to have a minimum degree of skill. As a general rule, if you cannot see any correlation in the graph, there isn’t any worth noting.
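For anyone who wants to check these coefficients themselves, here is a minimal sketch of the usual Pearson correlation coefficient. I am assuming Pearson is the coefficient meant throughout; the function name is mine. It runs from -1 (a perfect negative relationship) through 0 (none) to 1 (a perfect positive one).

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Co-movement of the two series around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)
```

Feeding it one list of each player's Armaskill and one list of, say, matches played (in the same player order) gives the kind of figure quoted above.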

There aren’t any adjustments we need to make to Armaskill; it’s sound. The average rating is 24.8, and half of all players are rated between 13.1 and 35.8. The standard deviation is 16.0, as a percentage of the mean, 64.5%. So let’s get cracking.

Part Three: Recapturing Armaskill

In this final section, we get to start playing around with things. The first thing to be done is to find which statistics will be the most helpful in creating Concordance, our new Armaskill. Then we must determine how to weight them.

The rules

  1. No black-box statistics can be used. These include round wins, match wins, and anything about team scores. These all say something about the team, and less about the individual.
  2. Armaskill itself cannot be readjusted. We determined in part two it was valid as is.
  3. Success will be evaluated by calculating the correlation coefficient of the new statistic. The better correlation we can get, the more we have succeeded.

Finding the statistics that matter

The simplest way to do this is simply to calculate the correlation between all the numbers and Armaskill. This will help us see what numbers are already closest to resembling Armaskill. In reality, I did this. For the sake of the article, we’ll pretend I didn’t, and we had to figure it out by inspection. This will give us the opportunity to see some other interesting things.

The natural starting place is to look at kills and deaths. We would expect this to form the core of our recreation. The frequency with which a player kills and dies seems to be directly connected to their skill. Simply enough, I graphed kills per round against deaths per round.

Graph 6: A loose but definite correlation

Here we see there is a definite, if loose, correlation between how often you die and how often you kill. Generally, the players who die the least frequently also kill the most. This might be because they are better, or it might be simply that, because they are alive, they have greater opportunity to kill. They might play defense, a good position for generating kills and protecting yourself from getting killed. Across the whole population, obviously, players kill once for every death, since all deaths and kills must be accounted for. In practice, since we have only taken players with more than 500 matches played, the numbers might not line up perfectly, but at an overall kills/death figure of 1.02, we see confirmation of what we would expect. The correlation coefficient is -0.48, suggesting a definite connection between the two. What is not revealed is the importance of each one. Do kills come as a result of being alive, or is surviving more important to success?

Comparing each to Armaskill should tell us which is more meaningful.

Graph 7: Correlation, but how strong?

Each clearly correlates to Armaskill, but it’s unclear from the picture how strong that correlation is, especially in the case of death rate (in green). We can see a general trend of highly rated players dying less and killing more, but there are a number of outliers. In fact, death rate only has a correlation of 0.56, while kill or hunt rate has a rather strong 0.88. This means that if we simply stopped here, and used kill rate as Concordance, we’d get pretty close most of the time. Whatever we come up with is going to have to improve on 0.88.

The degree of randomness in death rate is somewhat surprising. In Ladle fortress we know that surviving is critical to winning rounds. That doesn’t seem to be the case in casual fortress. For our population, 24.6% of rounds ended in a 1v1. That means that at the end of those rounds, on average 11 of 12 players had died, and thus death had almost no bearing on the round’s outcome. It plays into the point differential, though, and that factors into match outcomes. Armaskill recognizes it as a correlative, just not a very strong one.

Two other intriguing numbers are situational percentages. A player’s 1v1 ability should reflect their general skill, as should their 2v2 ability. When we compare these to Armaskill, we see a decently strong correlation.

Graph 8: Two good correlations

Neither is as closely correlated as hunt rate, but they are both fairly good. 2v2 is a bit better, since almost every player faces more 2v2 situations than 1v1 situations. Those situations also happen to say less about the individual player himself, because another teammate is present. Continuing this logic, we should expect that 1v2 situations say a lot about the individual, while 2v1 says very little at all. It will be important to figure out how to weight each, but they’re definitely being included in Concordance.

The final two things that seem obvious to look at are eyeball tests. Zeroing occurs when a player doesn’t kill anyone in a round, and so zero percentage says something about a player’s consistency. It also reflects how we intuitively understand skill. The players whose names we see on the console killing people every round, we assume to be better. We want to find out the accuracy of that eyeball test. The other eyeball test is match high scores. Are the people who score the most points each match really the best? We want to test that theory as well.

Again, we graph against Armaskill.

Graph 9: Strong correlations, very strong

These are more closely correlated than the situational stats, and hit at roughly the same levels as hunt rate. Zero rate is the best, with a correlation coefficient of 0.89, while match high score is at 0.83. (Zero rate is actually -0.89, a negative relationship; the absolute value represents the strength of the correlation.) This means that for eyeball tests, the match scoreboard doesn’t really lie, but the real truth is in round by round results. A binge round where someone gets three kills says less than consistent scoring. Regardless, both ingredients will find their way into our recipe.

At this point, it’s worth recapping our findings. The best candidates for inclusion in our new statistic are hunt rate, zero rate, match high score rate, and situational win percentages. Just to make sure we didn’t miss anything, a graph of each stat’s correlation coefficient to Armaskill is below.

Graph 10: 0.89 is the number to beat

Here we see our competition. We cannot use round or match win percentage in our calculation, but we want something that will correlate just as closely as they do to Armaskill. Match win percentage comes in at 0.911, and round win percentage is at 0.926. Whatever we do, it needs to improve on Zero rate’s 0.89 correlation. Our number should give a more accurate recreation of Armaskill than simply using one of the raw stats we are using to create our statistic. Suicide, team kill and zone statistics had correlations no better than random.

Determining how to count them

It’s clear that there are going to be three main pieces of my new statistic, which I have called Concordance. The first is some combination of the situation statistics, the second is match high score rate, the third is zero rate. Hunt rate and zero rate have some overlap. We know that because of what the statistics mean, and we can see it intuitively in their almost identical correlation coefficients. I will have to figure out how to appropriately discount hunt rate from zero rate. We will also include death rate, but adjusted to take into account its randomness.

Before weighting the numbers, it is worth investigating if any of them need to be corrected for the number of opponents. As it turns out, there is a slight but notable correlation between hunt rate and number of opponents, and so adjusting for this should slightly improve the number. Adjusted hunt rate (hunt+) is calculated by the method below.

hunt+ = hunt rate / ( average number of opponents / overall population average number of opponents)

It is only the slightest of adjustments, but it does improve our accuracy, and every bit counts.
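As a quick sketch of the hunt+ adjustment: the function name is mine, the default of 5.82 is the population average number of opponents found in part two, and the player figures in the example are hypothetical.

```python
def hunt_plus(hunt_rate, avg_opponents, population_avg_opponents=5.82):
    """Scale hunt rate by how crowded a player's games are versus the norm."""
    return hunt_rate / (avg_opponents / population_avg_opponents)


# Hypothetical player with a 0.60 hunt rate in fuller-than-average games.
crowded = hunt_plus(0.60, avg_opponents=7.26)
# The same hunt rate earned in emptier-than-average games.
sparse = hunt_plus(0.60, avg_opponents=3.42)
```

Kills picked up in crowded games are marked down and kills in emptier games are marked up, while a player at exactly the population average is left unchanged.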

I also combined the situation statistics, throwing out 2v1, and normalized them by their likelihood of occurring. It makes sense for certain abilities to count more, based on that situation occurring more frequently. For example, the 1v1 element of your situational win percentage is calculated by the following:

  1v1 win percentage * ( number of 1v1 situations / total number of situations )
= 1v1 win percentage * 1v1 situation percentage

This is done for 1v1, 1v2, and 2v2. The three figures are then summed. A graph comparing the new situational win percentage and the old 1v1 and 2v2 percentages is below.
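Here is a rough sketch of that combination. The function name and the sample counts are hypothetical, and it assumes "total number of situations" means the total across the three situations being kept (1v1, 1v2, 2v2), not one that also counts 2v1.

```python
def situational_win_rate(situations):
    """Frequency-weighted win percentage across situations.

    situations maps a label to a (wins, occurrences) pair.
    """
    total = sum(count for _, count in situations.values())
    if total == 0:
        return 0.0
    # Each term is (win percentage) * (share of all situations).
    return sum(
        (wins / count) * (count / total)
        for wins, count in situations.values()
        if count > 0
    )


# Hypothetical player record.
record = {"1v1": (30, 50), "1v2": (5, 20), "2v2": (40, 80)}
print(situational_win_rate(record))
```

Under that assumption the weighted sum collapses to total wins divided by total situations, so the common situations naturally dominate the figure.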

Graph 11: Situational win percentage

Combining them and weighting them creates a figure more closely correlated to Armaskill, at 0.84.

Adding them all up

For each number we include, we will weight it by its correlation. The better a number correlates, the more it will count towards the sum. Furthermore, since we are comparing different types of statistics, each figure will be normalized to its overall population mean. Lastly, each value will be given additional weight to improve correlation. These weights are what are interesting about the project. They tell us how much to value various events. Rather than choosing their value as our first step, we are setting it solely based on how it will improve correlation. By maximizing correlation, we are solving for the various weights.

              Correlation Coefficient   Weight
Zero Rate             -0.888            -5.222
Hunt+                  0.720            -1.172
Death Rate            -0.556            -1.191
Situational Rate       0.843             3.484
Match High Score       0.823             2.013
Round High Score       0.718             1.077

There are a couple conclusions that come immediately to mind. Round High Score is worth about half of Match High Score. Zero Rate, discounted with Hunt+, is worth just a bit more than the situational rate. This could be interpreted as meaning that end of game performance is worth around the same as consistency throughout rounds, but a clear conclusion escapes me at the moment. It is notable that Death Rate shows up at about the same weight as Hunt+ is discounted. Each weight is the correlation coefficient divided by the statistic’s mean, times some constant. The constants are set by maximizing the correlation coefficient of Concordance against Armaskill. An individual player’s Concordance is each weight multiplied by their statistic in that category, summed.
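The last two steps can be sketched in a few lines. The weights are copied from the table above, but the key names and function names are mine. The sketch simply takes "each weight multiplied by their statistic, summed" at face value, and assumes the statistics passed in are already on whatever scale those weights expect.

```python
# Weights copied from the table; the key names are hypothetical labels.
WEIGHTS = {
    "zero_rate": -5.222,
    "hunt_plus": -1.172,
    "death_rate": -1.191,
    "situational_rate": 3.484,
    "match_high_score": 2.013,
    "round_high_score": 1.077,
}


def weight_from(correlation, population_mean, constant):
    """Weight = correlation coefficient / mean * a fitted constant."""
    return constant * correlation / population_mean


def concordance(stats, weights=WEIGHTS):
    """Sum of each weight times the player's statistic in that category."""
    return sum(weights[name] * stats[name] for name in weights)
```

Calling `concordance` with a dictionary of one player's six statistics gives that player's raw score; rescaling so the population average matches Armaskill's is a separate step.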

The correlation coefficient of Concordance is 0.9301, which is just a bit better than Round Winning percentage, at 0.9256. Concordance, normalized to have the same average as Armaskill, is graphed against Armaskill below.

Graph 12: Success! Correlation coefficient of 0.9301 beats all comers!

If we look at our new top 25, we see some changes. (Remember, again, this data is a couple years old.)

 1 - -*insa*- (-*inS*-@forums)               + 1
 2 - newbÎe (newbie@forums)                  - 1
 3 - ct|dreadlord (DreadLord@ct/junior)      + 1
 4 - madmax (madmax@forums)                  + 2
 5 - free kill (dlh@generalconsumption.org)  + 2
 6 - Concord (Concord@forums)                + 2
 7 - Luzifer (Lacrymosa@forums)              + 7
 8 - teen (teen@ct/public)                   + 13
 9 - slash (slash@unk.me)                    + 10
10 - 75 (7575757575@forums)                    0
11 - ¶Potter (ppotter@aagid)                 + 4
12 - ~*viper*~ (viper1@forums)               + 17
13 - Lackadaisical (Lackadaisical@forums)    - 8
14 - slash (slash@ct/public)                 + 2
15 - <^v{}v^> (vov@forums)                   + 5
16 - noob13 (noob13@forums)                  - 7
17 - koala (Pre@forums)                      - 6
18 - ct_Xyron (Xyron@ct/junior)              + 13
19 - _~R~_Luffy (Monkey.D.Luffy@forums)      + 18
20 - ~*mkay*~ (Mkay1@unk.me)                 - 17
21 - esspeenuubee (Fort.nub@forums)          + 11
22 - Syllabear (syllabear@forums)            + 3
23 - 0ma (0ma@forums)                        + 26
24 - CTxGonzap (Gonzap@ct/leader)              0
25 - CtxWoned (owned@forums)                 + 20

Obviously, both Concordance and Armaskill have the same flaw: how seriously did these people play? I would argue that Concordance depends less on effort than Armaskill, because the statistics it counts are not “effort” statistics. There is a lot that goes into winning besides what Concordance takes into account, but most of what it doesn’t is tactical types of plays that we can loosely associate with how much someone cares about the match. More casual players are likely to still play well in situational cases, and are likely to still try to kill people. But they might not help their team win in all the small ways that Armaskill takes into account. That’s just a guess. I’ll try to push an argument forward in another piece. For now, enjoy the graphs.

Yours Truly

Ladle 67 Power Rankings

As a prefatory note, I would like to say that the contribution others have made to the community recently, specifically at the Wall Grind Journal, and even before then to Fort Fix, regardless of how it ended, is directly responsible for motivating me to write what appears below. It’s fun to read what other people have to say, and it’s no fun being the only one talking.

The Reigning Champs

1. Redemption

On March 7th, 2010, Speeders won Ladle 31. It was their third since January 10th, just eight weeks prior. The next month they would again reach the finals, where a talented Crazy Tronners team dug in and pulled away in a three match final.

In the three years since, winning three Ladles in a row has become the gold standard of competitive fortress. Several teams have won back-to-back Ladles. Crazy Tronners did it in Ladles 40-41. Twixted Xats followed up 44 with 45. Team Unknown took the summer Ladles of 47 and 48. Revolver won every match of their first two Ladles, only to be ousted by Team Baylife in Ladle 57. Speeders immediately followed that with two in a row of their own in 58 and 59. Rogue Tronners won Ladles 63 and 64. That streak was stopped by Redemption, who currently sit on two consecutive Ladle victories with hopes of a third.

And so, ever since Speeders won their third, back-to-back has not at all been uncommon. Those Ladles account for more than a third of the total since Ladle 31. What’s more, in the past year of Ladles, eight can be accounted for by teams winning as part of a back-to-back. This is all to say that the position in which the Redemption Clan currently finds itself is more normal than it is abnormal. But while back-to-back means little, three in a row would mean everything.

So, will they do it? That is the obvious question.

Well, perhaps an even more obvious question that you, the reader, have, coming this far in the article, is: “why are they ranked number one?” That is generally the question that each ranking tries to answer. What makes my job easier is that Redemption has just happened to win the past two Ladles, which makes the answer to that question just as obvious as the question itself. Put simply, no team in the past two months has defeated Redemption in a best two out of three fortress match. You can’t argue with that type of track record. So, let’s get the ranking out of the way. Redemption is #1. There are more interesting questions with more interesting answers. Will Redemption three-peat? What suggests they will? What suggests they won’t? Who can beat them? If they lose, how?

I will present my thinking for each of those questions, and we’ll come back to Redemption at the end of the article, but I’ll start by running through the teams that pose the greatest threat to what Redemption is trying to achieve in Ladle 67.

The Contenders

Each of these teams has some edge over Redemption that suggests they might be able to unseat the champs. Redemption has been so strong over such a long period of time (4 straight finals) in part because they have been so balanced and so complete. Each of these teams has a distinct advantage in one area of the game, but it may not be great enough to overcome the varied arsenal that Redemption has claim to.

2. Team Unknown

Team Unknown is the only team to take a match off Redemption since the calendar year 2012. That match came in the last Ladle, even as Unk split their clan between two teams. In Ladle 67, they should have more versatility, with some substitutes. What is really important is their front line of Vogue, Gazelle, Slash and Potter. Even as the latter two have declined in the past year, this foursome can rival Redemption’s holing attack, and both teams have an edge in the offensive sumo, which suggests that they will be eager to hole.

In the Ladle 66 final, we saw what looked to be an evolution in the structure of defenses. Both Revolver and Redemption employed a 2-4 formation, with some sort of two man defense and four players free to hole, sweep or hunt down enemies. It was Team Unknown that first used this structure winningly and it employed a castle defense in both of its victories during the summer of 2011. Before that, Ironside used a slower castle through the early rounds of Ladle 42, before switching to a more traditional defense in its semifinal loss. The castle defense had first been developed way back in the Spring of 2009 by Plus team, before holing was especially prevalent. Rudy Can’t Fail adopted it to great success in Ladles 30-31 with Notorious Emoticons, and it became the effective, if notorious, sweepbox.

What is important is that Team Unknown has a familiarity with such tactics, and has the personalities that can decisively and cohesively deploy them. People have joked about League of Legends teamwork carrying over into fortress, and while that’s a bad joke, and probably isn’t entirely true, there’s something to the idea. These players do practice teamwork and coordination in LoL, and that does translate.

Trending: upward

3. Revolver

Ladle 66 was Revolver’s first final since their last victory in Ladle 61, and was their first final loss. Things looked good until about halfway through the first match, and after that things didn’t look good again until it was over. Revolver has struggled to find a consistent line-up really since their first two ladles, and truly have not been the same team without Olive as they were with him in Ladles 55 and 56. Durka should be able to play a similar role in the attack, absorbing sweepers and battling them off, but he has played only rarely.

Still, Ladle 67 should mark an improvement for Revolver. Rudy Can’t Fail has regained some of his old form; the rust was evident in matches a month ago. Ladle 66 showed glimpses of some tactical organization. Past Revolver teams had struggled to do simple things like set up their defense and block center. When they do, they generally win the round.

The edge Revolver has over Redemption is their attack. Not their holing; Revolver’s holing is worse than average among the entire field of Ladle teams. But in the attack proper, Revolver is at the very top. Attack has been an area that teams have not been able to convert into wins, since defenses simply shrink inside their tail and lock down the zone from cuts. Nonetheless, if Revolver feels confident enough trying to crack open defenses with cuts, that could mean deploying a formation with a lone attacker, and the rest either midfielders or sweepers. If that attacker can be more than a decoy or a cheap kill, they might have something.

Vov has proved particularly difficult for this team to counter, and he almost single-handedly wrecked their game in each of the past two Ladles. Center has long been a problem for this team, going back to their very first Ladle, where Over pushed Rogue Tronners to the edge of victory, but not over it. Xyron has played center for Revolver the past few times out, and that seems likely to continue, unless perhaps Durka makes an appearance.

Trending: flat

4. Rogue Tronners

Rogue Tronners have an excellent midfield and an excellent classical defense, and that’s been their foundation for quite some time. Red, Poke’Master and Luffy form that core, and if all three are present, you might as well slide the team over to the semifinals. In most Ladles. In Ladle 67, Rogue Tronners are unseeded, and if they advance to the quarterfinals, they will face Redemption. This could make for an early exit, or make them early favorites. Those times that they’ve gone further than a semifinal have been the times the attack has complemented that foundation, and it’s never quite clear if it will be an effective force until the day-of.

What we do know is that Lackadaisical will not be on the grid for this team. This absence is quite comparable to Revolver’s loss of Dreadlord. The obvious similarity is that both are single-binders and old-school players, and both are often deployed as a wingman attacker, in the classical fashion. (The author is aware that last Ladle Dreadlord got some run at sweeper as well.)

Without him, this team can still put together a pretty good threesome. Over, DGM and Shoebat is perhaps the top configuration. Replacing any one of those three with Titanoboa does not make for a significant drop off in skill, but the style does change. Shoebat has become a superb tactical player, and gives Rogue Tronners a weapon that few teams have themselves.

Rogue Tronners’ advantage over Redemption is that solid defense, and with either Over or DGM, the capability to match Vov at center. Still, they will need to create some offense of their own, and it’s not clear if that will come easily, if at all.

Trending: flat

The Wild Cards

These teams aren’t going to win the Ladle. I mean, they might, but they won’t. If they reach Redemption in the Semifinal, they might have a chance to knock them out.

5. Crazy Tronners A

This clan hasn’t reached a semifinal since November, and hasn’t reached a final since October, and hasn’t won since Ladle 53, more than a year ago in January 2012. There’s no real reason to expect any of that to change, other than the fact that it has before. Before they took the title in Ladle 53, it had been eleven months since Crazy Tronners had won. What’s more, it had been eleven months since they had been in a final at all. None of that means Crazy Tronners are due to win anything. I wouldn’t rank them fifth if I thought they were a serious contender to win. It doesn’t mean I won’t be surprised if they do end up winning. It means all that.

It is a reminder that all the top teams are quite competitive and that, as we’ve seen before, anyone can truly beat anyone. Crazy Tronners in particular have shown an ability to come out of nowhere. They were the ones who stopped Speeders short of four in Ladle 32. The personnel has changed since then (the leadership has not, incidentally).

Making the semifinals will be a challenge, though (as it will for Meet Your Maker, see below). Roadrunnerz, even without Fipp and Wolf, are still a legitimate threat. Whoever they face in the quarters will deserve to be there as well.

Trending: downward

6. Meet Your Maker

Since this clan has coalesced behind their current leadership, it hasn’t had the same bite it used to. Gazelle and Vogue tend to be a bit overrated, but they also tend to play pretty well. The period of decline has been much longer. This clan hasn’t won a match in the semifinals since Ladle 62, and before that you have to go back to Ladle 54. Now, on the other hand, they won both those Ladles.

Their greatest advantage is their leadership. Sinewav has been in the game long enough to know his team’s strengths (few) and their weaknesses (many). He has done a lot with a little before, though (see Plus, Ladles 19-21), and while many teams seem to have ignored each other’s tactical weaknesses, Meet Your Maker seem to have some tactical guidance.

Their problem is that they haven’t been active enough or organized enough to do what they know they must. That doesn’t appear to have changed in the past couple weeks, and it should mean another loss in the semifinals or earlier.

Trending: downward

The Spoilers

It is highly unlikely any of these teams make the final, but they each have a considerable chance to spoil the bid of any of the teams above. Generally we call this an upset. Last Ladle saw Roadrunnerz defeat Rogue Tronners, and that surprised some people. In fact it surprised some of the same people that had for six months been saying Roadrunnerz were on the rise. So, which is it? Upset or time come due, it can’t be both. The teams below are ranked in order of the probability I think they have to win a match in Ladle 67.

A. Phoenix Fire v. Meet Your Maker

Phoenix hasn’t gotten quite the buzz that Roadrunnerz have over the past couple months, but neither have they had the same degree of success. Their showcase win is a defeat of Redemption’s B team in Ladle 64, and since then Redemption hasn’t put forth a second squad. In a team like this, the encouraging thing is that many players are at similar levels, and a rising tide can lift all boats. Momentum can carry them especially far. This is because lesser skilled players tend to lack confidence and tend to think too highly of those they see enjoying more success. Early match success can dispel that thinking, and let them play with confidence.

Phoenix is moving the opposite direction of Meet Your Maker, and that is a good thing. The result in this opening round will mean dramatically different things to each side. For Phoenix it will push that Ladle 64 win over to the second place spot on the podium. For Meet Your Maker, a victory will mean nothing, and a defeat might be very bad for them indeed. A loss for Phoenix might be frustrating, but it will confirm nothing other than that they still have work to do.

Top 3 names to start learning: Fenith, Metal, Nagi

B. Roadrunnerz v. Crazy Tronners

A big deal was made of Fipp and Wolf leaving Roadrunnerz. While they were both towards the top of Roadrunnerz’ roster, neither of them is actually all that good. While Roadrunnerz lost contributing players, they didn’t lose anything they can’t replace, or haven’t already replaced. The clan has maintained two Ladle teams for quite some time, and players are ready to step into Fipp and Wolf’s average-sized shoes.

This team not only defeated Rogue Tronners last Ladle, but put up a fight against Revolver, crossing the 50-point median in each match. While Crazy Tronners probably lose to both those teams, they might be just as tough a match up for this clan. The beginning of the Roadrunnerz fanclub seemed to be their match win against CT in Ladle 62, where CT went on to a second place finish. That was, until last month, their best achievement. Losing Fipp and Wolf might have reduced our expectations for this clan, but they are sure to be motivated to equal their last performance.

Duly noted: Crazy Speed Friends

This isn’t a team that will exist after the Ladle ends, and so it seems silly to give them a power ranking. They might beat Team Unknown to reach the semis, they might even go further, but outside of the Ladle, they don’t even exist, and so they are here duly noted, but nothing more.

1. Redemption

So, that brings us back to the beginning. The champs. Going for three straight. A good question to lead us into this subject is to look at the jump Redemption made: first the jump into the finals, and then the jump to winning them.

Redemption had never made a semi-final before they beat Crazy Tronners in the Ladle 63 bottom semi. They went on to take a match off the champion Rogue Tronners, and followed it up the next month with another finals appearance. That’s quite a jump.

Looking at the rosters, though, it becomes easy to explain. The Ladle 63 team that won a match in the finals featured two key players that were not in the clan a month earlier: Appleseed and Eckz. They’ve since become mainstays in the top lineup.

The second jump from losing finals to winning them came as Dreadlord came over from Revolver to replace Slick. Both parts of that are important. Slick is below replacement level talent for teams playing Ladle finals, and Dreadlord is at least replacement level, on good days well above it. Getting rid of Slick patched up a hole in the six-man unit, and adding Dread gave the team another weapon.

Redemption won their two ladles with a combination of Koala and Roter and will be without both in Ladle 67. That should mean more grid time for Subfocus and Wolf, both of whom represent steps backward. Both could be deployed usefully, or cleverly. Wolf, with his experience at the front of the line for his old team, should be capable of improvising when needed, but it’s as yet unclear how this new chemistry will work.

The strengths of Redemption are foremost their defense and how quickly they move in on offense. Soul has been a blue chip defender for the past half-year, and while sweeping has been suspect at times, the offense puts enough pressure to negate that weakness. Vov has been a force in the middle, but this team is so dangerous because they can convert failed center attacks into successful ganks.

The biggest question going forward is one of chemistry. Leadership changes were made about a week ago, presumably to put an end to inner conflicts. There’s a reason three members have left in the past month, as reported in the Wall Grind Journal. But there’s also a reason one has joined.

Redemption seems for the moment to have little identity other than being a winner. That may be a good thing, but this particular configuration of players does not have any experience losing together. That lack of experience may become evident if they find themselves in a position where elimination becomes a real possibility. The most experienced player on the team, Dreadlord, knows how rare it is to have a chance to win three in a row, and is likely to be even more committed to winning than he usually is. His competitiveness is the reason he is the winningest Ladler ever. He has won 5 Ladles in the past year, and 11 of the last 19, a continued period of excellence that is unmatched by any individual or team. He has already won three in a row as an individual, but doing it with the same team would be something different altogether.

If Redemption repeats for a second time it will be because they were able to control their emotions and keep a steady hand, rather than letting the moment overwhelm them. They are likely to keep facing serious competition as they keep advancing. The best chance anyone has to knock them off may well be Rogue Tronners, if the two meet in the quarterfinals. Redemption will be coming in cold, without playing an opening round match, and Rogue Tronners will be a serious squad to face to open the day.

Regardless of the result, and regardless of the drama surrounding their recruiting practices, Redemption has built not just a contender, but a winner. Only another winner will be able to knock them off in Ladle 67. Win or lose, the direction of the clan is confirmed.

Trending: upward