DK Scaling Issues - Fact or Fiction?

**DarthMetatron** · 2014-06-18, 05:52 PM

I decided to continue a study I started last year where I was comparing the value of stats between specs as item levels increased (commonly known as stat scaling) and thought I would post the results here. Feel free to post any comments you might have, but I am hoping for constructive feedback to see if this is worth pursuing in future patches (with 6.0 on the horizon).

The initial study was started in response to an active thread at the time concerning Frost DK scaling issues which started on July 4th 2013 during patch 5.3. There was a lot of discussion which included people using the term “scaling” to describe how much less value (DPS) we receive as gear levels (stats) increase (compared to other classes or other DK specs).

During this time, I set out to investigate the assumption that Frost stat values scaled poorly as gear levels increased when compared to other Strength-based classes as well as against standard Unholy DK and Festerblight.

I started by analyzing raidbots data, but people claimed that data was invalid because of sampling bias (it didn’t help that most people referring to raidbots were using “top 100” data, which is not an optimal data set for analysis). I then performed some initial analysis and ran sims for different classes/specs at various item levels, at which point I could speak to what I felt were evident stat scaling issues with Frost.

The initial samples gave me enough confidence that scaling issues were present and that I should proceed with the study. I then settled on using a base gear profile for each DK spec and utilized the current version of SimC (v530-4 at the time).

During my study, I only utilized SimC to provide the stat values for each spec for each selected item level and did not take the DPS results into consideration. In my opinion, the DPS would be the end product of your stat values, so examining the individual value increase of each stat would be a better comparison.

Things started off a little hairy as I decided to look at the scaling of each stat instead of lumping all stats into one (Total Scale Factor), and had to work out whether to include strength or weapon DPS in the equation. I had run SimC for my character many times, but never modified any of the profiles, so I apologize for the messy start.

I believe things got clearer as the study went on and I settled on how best to present the data.

TLDR:
If you only want to look at current data, skip forward to Parts IV-V (June 1-4), where I performed the analysis with current gear and the current version of SimC; the same type of "scaling issues" seems to be present for Frost.

I did not compare other strength-based classes in the same fashion, but it could be something I look into in the future if this provides any beneficial information.
--

Here is a link to the study document which contains links to the data sheets and references:

https://drive.google.com/file/d/0Bz3...it?usp=sharing

UPDATED LINK
-Cleaned up document references
-Added Slope and Geometric Mean rows on Part I document for consistency
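
For readers unfamiliar with the two summary rows mentioned above, here is a minimal sketch of how a slope and a geometric mean could be computed for one stat across item levels. It assumes "slope" means an ordinary least-squares slope of the scale factor against item level; the sheets may compute it differently. The item levels and weights below are hypothetical placeholders, not values from the study.

```python
import math


def slope(xs, ys):
    """Least-squares slope of ys against xs (assumed meaning of the 'Slope' row)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical scale factors for one stat at four item levels (not study data).
item_levels = [533, 553, 572, 590]
crit_weight = [1.20, 1.26, 1.31, 1.37]

crit_slope = slope(item_levels, crit_weight)
crit_gmean = geometric_mean(crit_weight)
print(f"slope per ilvl: {crit_slope:.5f}, geometric mean: {crit_gmean:.3f}")
```

A positive slope would mean the stat gains value as gear improves, and the geometric mean gives a single summary value that is less sensitive to one unusually high data point than the arithmetic mean.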

EDIT - 6/24/2014
Added result sets for 1H Fury Warrior, 2H Fury Warrior, Arms Warrior and Retribution Paladin as well as a comparison chart:
http://www.mmo-champion.com/threads/...1#post27785809

**Mendenbarr** · 2014-06-18, 07:52 PM

So, scaling.

Firstly, I have to commend you for doing an excellent job detailing your data and results. You've clearly put an extraordinary amount of work and time into this, and I know just how long it would have taken to sim so many profiles. /clap

I could give you a cookie cutter response that I'm sure more than a few people will bring up, which is "you can't trust simcraft results", but that's not fully true. To quote the simcraft wiki "SimC devs work for the users that understand that our models live in space between gospel and garbage". We should not look at the results from simcraft, and by extension your data, as being wrong or right, but we should rather look at them in terms of usefulness. I'd argue that they are pretty damn useful.

However, we must first examine the necessary approximations and missing pieces of information that prevent this data from being 100% useful. To begin with, I cannot see a specification of fight length anywhere in your results, so I have to assume you used the default fight length of 7.5 minutes, which I'm sad to say breaks my festerblight 2.0 profile, which only works with times around 4 minutes. It will only apply super diseases twice, at the start of the fight and the 2 minute point, and the extension will be way off when changing the fight time too much. This makes the festerblight 2.0 results less useful, but doesn't invalidate them, because for the majority of the fight, a majority of damage sources remain the same.

To go further into that point, I have to assume you used a constant fight length for all your trials, which lessens the usefulness of the data for any fight length other than 7.5 minutes. 7.5 minutes is about 1 minute after UF would end for an unholy dk, while a 6.5 minute fight would end as UF falls off for the third time, which would decrease the value of haste compared to a 7.5 min trial (as UF has higher uptime, and therefore your average haste is higher). It would also increase stat weights for crit, mastery, and strength, as you have a higher uptime on UF and therefore get in more attacks per second, and those attacks multiply with crit, mastery (in most cases) and strength.
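
The uptime argument above can be checked with a few lines of arithmetic. This is only a sketch: it assumes UF lasts 30 seconds on a 3 minute cooldown and is used on cooldown from the pull, which matches the statement that the third use falls off at 6.5 minutes. The function name and defaults are made up for illustration.

```python
def buff_uptime(fight_seconds, duration=30.0, cooldown=180.0):
    """Fraction of the fight a cooldown buff is active when used on cooldown from t=0.

    duration and cooldown are assumed values, not taken from SimC.
    """
    active = 0.0
    start = 0.0
    while start < fight_seconds:
        # Count only the part of this use that fits before the fight ends.
        active += min(duration, fight_seconds - start)
        start += cooldown
    return active / fight_seconds


for minutes in (6.5, 7.5):
    print(f"{minutes} min fight: {buff_uptime(minutes * 60):.1%} uptime")
```

Both fight lengths fit exactly three uses, so the shorter fight spreads the same 90 seconds of buff over less total time and ends up with the higher uptime.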

We also have to take into account the strengths and weaknesses of each spec, such as target swapping (where frost has the advantage), burst AE (where frost wins yet again), sustained AE (where unholy wins), burst damage (in which unholy takes the cake with 2 minute cooldowns), short constant burst damage (in which frost wins with 40 second pillar), damage when you can't be in melee range (in which unholy reigns supreme, unless your pet is also out of range, in which case frost wins), and even things like downtime, being stunned, movement, etc. Each and every one of these cases occurs in a raiding environment, and has an effect on both damage and stat weights, albeit a small one. When added together, the effect is quite large and noticeable.

We must also view breakpoints as a source of change in stat weights; for example the DWF haste breakpoint at 3868, where you get in an extra melee swing during the 2-set's 6 second duration. Gear ilevels slightly below this haste value would have an artificially inflated stat weight for haste. The same thing would be true of hit and expertise, where starting to miss attacks would artificially lower the scale factors for other stats (unless you hard-set hit/expertise to 2550, in which case stat weights would be artificially inflated, as you gained extra stats beyond your ilevel for free).
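
To illustrate why a breakpoint distorts a stat weight, here is a toy model of swings landing inside a 6 second window. It is a sketch under loud assumptions: haste is given as a percentage rather than rating, the 2.6 second base swing speed is a made-up number, and the first swing is assumed to land the moment the window opens. It does not model the actual 2-set proc or the rating value quoted above.

```python
import math


def swings_in_window(haste_pct, window=6.0, base_speed=2.6):
    """Count melee swings inside a buff window, with the first swing at t=0.

    base_speed is a hypothetical weapon speed; haste_pct is percent haste.
    """
    swing_time = base_speed / (1.0 + haste_pct / 100.0)
    return math.floor(window / swing_time) + 1


for haste in (20, 29, 31):
    print(f"{haste}% haste: {swings_in_window(haste)} swings")
```

In this toy model, going from 20% to 29% haste adds no swing inside the window, while the small step from 29% to 31% adds a whole one. A sim that measures the value of haste with a small delta just below such a threshold picks up that jump, which is the inflation described above.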

And we must take into account that while simC results are far from garbage, they are not gospel either. As one of the devs, I can assure you that there are bugs and inaccuracies in the code. For example, until the most recent release, DW weapons did not alternate, but instead attacked in the same instant, leading to an inflated value of haste prior to the 3686 breakpoint for DWF. While that has been fixed, I have no doubt there are further issues, unknown to me, that reduce the usefulness of the results that simC produces. On top of this, we have the action lists I've written at the helm, and those are far from perfect. Players also don't generally follow the action list, not exactly, which is yet another reduction in usefulness.

All of these factors have small effects, but together the effect is annoyingly large. It's possible to overcome most of these by taking an average number of targets, movement, fight duration, etc., and weighting the distribution of the ilevel results based on those results, but there is no real way to prove simC doesn't have any errors, and that would take well over a year to do, by which point the results would be out of date.
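
The weighting idea can be sketched as a weighted average of per-scenario stat weights. Everything here is hypothetical: the scenario shares, the scenario descriptions, and the weights are placeholders for illustration, not SimC output.

```python
def weighted_stat_weights(scenarios):
    """Combine per-scenario stat weights by each scenario's share of raid time.

    scenarios is a list of (share, {stat: weight}) pairs; every scenario is
    expected to list the same stats.
    """
    total = sum(share for share, _ in scenarios)
    combined = {}
    for share, weights in scenarios:
        for stat, value in weights.items():
            combined[stat] = combined.get(stat, 0.0) + value * share / total
    return combined


# Hypothetical shares and weights for three encounter types.
scenarios = [
    (0.5, {"haste": 1.00, "crit": 1.20, "mastery": 1.40}),  # single target, no movement
    (0.3, {"haste": 0.90, "crit": 1.25, "mastery": 1.30}),  # light movement
    (0.2, {"haste": 1.10, "crit": 1.10, "mastery": 1.60}),  # two targets
]

for stat, value in weighted_stat_weights(scenarios).items():
    print(f"{stat}: {value:.3f}")
```

The hard part is not the arithmetic but choosing defensible shares for each scenario, which is where most of the disagreement would come from.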

All that being said, the usefulness of the results isn't 0. I'll be bookmarking it for later reference, as your large collection of data is very useful, as long as we understand what it means. We can pretty much definitively say that unholy scales better than frost, with current tuning, but I would not be able to produce a quantitative number I trusted enough to use as evidence, and I don't believe that there is enough evidence to say that frost is far enough behind to need a buff, or unholy to need a nerf. DKs "feel" very middle of the pack right now, and however much we dislike that, middle of the pack is technically a design goal for every class. It sucks looking through the glass at outliers like warlocks and ferals, and not being able to compete with them, but that's an issue with them, not us. And honestly, SoS has been going on for what, 8 months now? Does it really make a difference if they are overpowered for the rest of the expansion?

With the stat squish, and the change to the way AP and weapon damage interact in formulas, we're basically getting a clean slate for the first time in a long time. Every ability is being rebalanced to the new system, and I very much doubt they will be perfectly converted. WoD is an opportunity for rotations to change, for balance to change, and for dks to get better, but we need to start working with the empty slate we're being given, and not the old MoP one we're effectively throwing away.

Again, great job on putting the data together, and I hope you can find even more use out of it. I'll certainly be spending an afternoon reviewing all of it.

**DarthMetatron** · 2014-06-19, 02:35 AM

Thanks for the great response Mendenbarr.

You pretty much nailed all of the random things that were rattling around in my head while putting it together plus a lot I hadn't considered. Thanks for mentioning the fight duration. I did forget to document the 20% variation on fight length, which I am using. I will update the document.

I will admit though, I didn't read through the priority list Vereesa listed for Festerblight 2.0 or I would have seen the 4 minute limitation.

I completely understand this is only a simulation and in no way is equal to real world scenarios, but I didn't want all the what-if's and could-be's to stop me from at least attempting a high level view of how our scaling looks.

6.0 has me on the edge of my seat in anticipation to see how we will perform. I agree that this is their best chance to fix any inherent scaling "issues" and I am pretty optimistic. The hint that Frost will want Haste had me baffled for a bit until I read up on it; I hadn't considered not being GCD locked anymore.

Thanks again!

- - - Updated - - -

Just realized I wanted to separate the spreadsheets in Part III and Part IV, but the link points to the same spreadsheet which has the results of both tests on separate tabs.

Also, the last tab is named correctly as Unholy but the Scale Factor chart says Festerblight. Will tidy some things up in the AM.

**Darkedge** · 2014-06-19, 02:40 AM

The concept of scaling issues is quite a taboo subject around WoW at the moment. Back when Ghostcrawler worked for Blizzard, he mentioned something to the effect of scaling being a word that people misuse, and that he rarely takes their feedback as seriously (paraphrasing).

Ever since then you will find some people that outright refuse to believe it (scaling issues) exists, it can get quite odd.

Thanks for your efforts.

**DarthMetatron** · 2014-06-19, 02:55 PM

Originally Posted by Darkedge

The concept of scaling issues is quite a taboo subject around WoW at the moment. Back when Ghostcrawler worked for Blizzard, he mentioned something to the effect of scaling being a word that people misuse, and that he rarely takes their feedback as seriously (paraphrasing).

Ever since then you will find some people that outright refuse to believe it (scaling issues) exists, it can get quite odd.

Thanks for your efforts.

That is one of the main reasons I started doing this study in the first place. There seemed to be an elephant in the room (or better yet The Emperor's New Clothes).

I made some edits to the main document, updating links mainly and also cleaned up some of the verbiage on the sheets themselves.

I think I was so excited to share my results when I finished that some things were still a little foggy. I am a Business Analyst and I would never turn in a project in this state lol. I could probably edit for another week, but unless there is a major snafu I think it is acceptable the way it is.

**SlippyCheeze** · 2014-06-19, 08:14 PM

Originally Posted by Darkedge

The concept of scaling issues is quite a taboo subject around WoW at the moment. Back when Ghostcrawler worked for Blizzard, he mentioned something to the effect of scaling being a word that people misuse, and that he rarely takes their feedback as seriously (paraphrasing).

*nod* This is what is needed to actually talk to Celestalon (or GC) about scaling issues: an actual study, documenting the methodology, sources, etc. Hopefully you can grab Cel's attention and talk to him about it, get some feedback on how to improve, and we might actually see movement on this front.

**DarthMetatron** · 2014-06-23, 12:42 PM

*Disclaimer - I understand the pitfalls of relying on raidbots data to reflect real world results and am in no way claiming there is a correlation between the two result sets below*

That being said, I ran sets of simulations for 1H Fury Warrior, 2H Fury Warrior, Arms Warrior and Retribution Paladin and here are the results:
(I added current raidbots data for comparison purposes only)

*The values of both Frost, Unholy and Fury specs were combined into one value since you cannot separate them on raidbots

I have the sheets containing the full results of the sims here:

Paladin
https://drive.google.com/file/d/0Bz3...it?usp=sharing
*There was an extreme drop in the value of Haste after ~17K (or ilvl 575) so I changed some gems to Mastery after that point. I could probably game it a bit more to increase the overall value of the stats, but I didn't see the value in doing so for the purpose of these tests.

Warrior
https://drive.google.com/file/d/0Bz3...it?usp=sharing

**Stacie** · 2014-06-30, 09:39 AM

Would also like to say thanks for all the work put in here, very nice read

**Darkfriend** · 2014-07-08, 03:30 AM

Originally Posted by DarthMetatron

*Disclaimer - I understand the pitfalls of relying on raidbots data to reflect real world results and am in no way claiming there is a correlation between the two result sets below*

That being said, I ran sets of simulations for 1H Fury Warrior, 2H Fury Warrior, Arms Warrior and Retribution Paladin and here are the results:
(I added current raidbots data for comparison purposes only)

*The values of both Frost, Unholy and Fury specs were combined into one value since you cannot separate them on raidbots

I have the sheets containing the full results of the sims here:

Paladin
https://drive.google.com/file/d/0Bz3...it?usp=sharing
*There was an extreme drop in the value of Haste after ~17K (or ilvl 575) so I changed some gems to Mastery after that point. I could probably game it a bit more to increase the overall value of the stats, but I didn't see the value in doing so for the purpose of these tests.

Warrior
https://drive.google.com/file/d/0Bz3...it?usp=sharing

Gotta say I find those Fury numbers highly unrealistic. A better metric would be DPS gain per tertiary and secondary stat weight at various levels.

My FDK gets about the same from strength, but a third the DPS per point of crit, for example, and slightly less than half per point of mastery over my fury warrior at similar gear levels.

**DarthMetatron** · 2014-07-10, 02:25 PM

Originally Posted by Darkfriend

Gotta say I find those Fury numbers highly unrealistic. A better metric would be DPS gain per tertiary and secondary stat weight at various levels.

My FDK gets about the same from strength, but a third the DPS per point of crit, for example, and slightly less than half per point of mastery over my fury warrior at similar gear levels.

Which numbers do you find unrealistic, the stat weights from SimC or the log data?

The linked Warrior sheet splits each stat and shows their values as they increase from 533-590.

**Darkfriend** · 2014-07-11, 02:20 AM

Originally Posted by DarthMetatron

Which numbers do you find unrealistic, the stat weights from SimC or the log data?

The linked Warrior sheet splits each stat and shows their values as they increase from 533-590.

https://docs.google.com/file/d/0Bz3H...5FZWpIbUE/edit

The numbers aren't comparable to any I've seen. Crit/str both too low at all levels.

Still, don't even need to bother with that. A simple comparison of strength to strength at gear levels, with mastery/crit in SEP form for each is enough.
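
As a rough sketch of that comparison, assuming "SEP form" means expressing each secondary stat relative to the value of one point of strength: the function name and all numbers below are hypothetical and are not taken from the linked sheets.

```python
def to_sep(scale_factors, primary="strength"):
    """Express each stat as a multiple of the primary stat's scale factor."""
    base = scale_factors[primary]
    return {stat: value / base for stat, value in scale_factors.items()}


# Hypothetical DPS-per-point values for two specs at the same gear level.
frost = {"strength": 4.0, "crit": 1.6, "mastery": 2.0}
fury = {"strength": 4.0, "crit": 4.4, "mastery": 3.2}

for name, factors in (("Frost", frost), ("Fury", fury)):
    sep = to_sep(factors)
    print(name, {stat: round(value, 2) for stat, value in sep.items()})
```

Comparing raw strength values shows whether the primary stat is worth the same for both specs, and the normalized values then show how much each secondary stat adds on top of that.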

**DarthMetatron** · 2014-07-11, 12:50 PM

Originally Posted by Darkfriend

https://docs.google.com/file/d/0Bz3H...5FZWpIbUE/edit

The numbers aren't comparable to any I've seen. Crit/str both too low at all levels.

Still, don't even need to bother with that. A simple comparison of strength to strength at gear levels, with mastery/crit in SEP form for each is enough.

Interesting, so you are saying the gap is even larger than it appears in my SimC results? The gear set I am using is the Tier 16 sample.

Do we think SimC is undervaluing stats or is it a problem with the Warrior T16 gear template not being optimal?

Yes, I've done that comparison (individually and combined) but I didn't post the link:

https://drive.google.com/file/d/0Bz3...it?usp=sharing

Deleted · 2014-07-11, 01:45 PM

The differences between classes are very small, and Blizzard has said before that there are "imbalances" but players make it a bigger problem than it actually is. Every raid encounter at any difficulty level is doable with any class.

**Stacie** · 2014-07-11, 01:52 PM

Originally Posted by Retriavenger

The differences between classes are very small, and Blizzard has said before that there are "imbalances" but players make it a bigger problem than it actually is. Every raid encounter at any difficulty level is doable with any class.

Doable and optimal are different things, and it becomes more apparent when progressing. They might be close, but we don't know for sure what kind of tests they do (they have more data than us); for all we know their tests could be a standard Patchwerk-style one. When attempting a boss with some movement, or with places you can AMS soak, that gap in balancing becomes much, much larger.

Say Blizzard balances us around AMS soaking. If so, what if in one tier half the bosses don't provide much chance to AMS soak? That would likely make us far worse off than intended. The reverse is also true: if there were fights where we soaked too much and did more DPS than intended, that would be an issue as well.

I think on 80%+ (random high number) of fights it's not a real scaling issue, just an issue with how we are balanced and with fights not being built around the model Blizzard uses to balance us all.

Deleted · 2014-07-11, 01:54 PM

Originally Posted by Mr Chaos

Doable and optimal are different things, and it becomes more apparent when progressing. They might be close, but we don't know for sure what kind of tests they do (they have more data than us); for all we know their tests could be a standard Patchwerk-style one. When attempting a boss with some movement, or with places you can AMS soak, that gap in balancing becomes much, much larger.

I don't think they balance around Patchwerk bosses if the majority of the bosses are complicated. Also, DPS is not the only part of balancing.

Any class can play almost optimally and that should be enough for any player; perfect balance is not even realistic. And classes are not played by computers, they are played by humans, and that already creates such a gap, especially with latency, that ideal simulations are not realistically possible.

**Shiira** · 2014-07-11, 01:55 PM

Originally Posted by Retriavenger

The differences between classes are very small

Define "very small".

What do you think the difference between top and bottom dps is now? What do you think is acceptable? What do you think blizzard should aim for?

**DarthMetatron** · 2014-07-11, 01:58 PM

Originally Posted by Retriavenger

The differences between classes are very small, and Blizzard has said before that there are "imbalances" but players make it a bigger problem than it actually is. Every raid encounter at any difficulty level is doable with any class.

I would agree with your first statement except these are not small differences. I guess it all depends what you find to be a "very small difference" or not.

This is not an argument about whether the spec is viable; this is a study on how the stat values of Strength-based classes scale as gear levels increase.

**Stacie** · 2014-07-11, 02:00 PM

Originally Posted by Retriavenger

I don't think they balance around Patchwerk bosses if the majority of the bosses are complicated. Also, DPS is not the only part of balancing.

Any class can play almost optimally and that should be enough for any player; perfect balance is not even realistic.

OFC it's not. Like I said with AMS soaking (sorry, it's just a good example), on some fights we can soak a lot and do more DPS, on others not so much. This can make us amazing, even more so than some other DPS, on fights with lots of soaking, but on the ones with next to nothing to soak we could do worse than the others. You can't really scale around this with buffs; you will always be good at some fights and bad at others.

It's not like you can buff us just for one boss and not the next. There has to be some sort of balance, and it can never be perfect unless the boss mechanics are all the same. But even with every boss a Patchwerk, something would get in the way, like human error.

As it stands at the moment, there are a few fights we could use a buff on, but on most we are fairly balanced, I think. Playing our class perfectly could be harder than other melee, but I won't even go there as my opinion will be biased since I play DK. A lot does come down to player skill too: pushing the right buttons at the right time, not messing up, how quickly you can recover when you do mess up compared to other classes, how much you have to manage, what you are assigned during the fights, etc. (I am always asked to grip, which loses me a few seconds of DPS.)

Deleted · 2014-07-11, 02:01 PM

Originally Posted by Shiira

Define "very small".

What do you think the difference between top and bottom dps is now? What do you think is acceptable? What do you think blizzard should aim for?

I believe the bottom of the chart is still doable. I really don't think that Blizzard releases live patches where one class is 100% useless.

**Shiira** · 2014-07-11, 02:09 PM

You didn't answer any of my questions.

And how "useless" any class is depends on how serious your guild is. If you take until 2014 to get to the difficult part of this tier you've already had so much gear that which classes you have is practically irrelevant.
