Originally Posted by MMO-Champion
What are your thoughts on internal testing vs. simulation and player experience? Too often I see "our internal testing doesn't show that" or "our internal testing shows something different" without actually telling players what that is. It completely invalidates a player's concerns and first-hand experience. If a player is doing something unintended, is it not the job of the designer to either design better and redirect the player to play as intended, OR accept what the player has done and make the design cater to the unforeseen way the player is utilizing it?
I have a lot of thoughts on this topic. I’ll try to be brief.
First, I haven’t been impressed overall with simulation as an accurate predictor of real numbers (and most of my experience here is from WoW). Sims are enormously dependent on the skill of the simulator. A small error can produce terrible output and skew the whole exercise. That may not matter to the simulator herself, who is more interested in figuring out things like stat weights or novel rotations. But we were often linked a stack rank of DPS, almost inevitably with warlocks on top, as proof that something was rotten in Denmark. So when our tested numbers disagreed with sims, we almost always went with our numbers. When we started receiving real data from live, we would shift to using those numbers instead.
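To make the "small error" point concrete, here is a toy sketch (all numbers invented, not from any actual sim): if a simulator models each of several independent DPS multipliers, say buff uptime, crit rate, proc frequency, just 2% too generously, the errors compound multiplicatively, and the final estimate can drift by enough to reorder a close stack rank.

```python
# Toy illustration (invented numbers): small per-mechanic modeling errors
# in a simulator compound multiplicatively across mechanics.
per_factor_error = 1.02   # each mechanic modeled only 2% too generously
n_factors = 5             # e.g. uptime, crit, procs, haste scaling, latency

total_error = per_factor_error ** n_factors
print(f"Overall DPS overestimate: {(total_error - 1) * 100:.1f}%")  # ~10.4%
```

A ~10% skew is larger than the gap separating many classes on a typical stack rank, which is why one plausible-looking sim can put the wrong class on top.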
Second, thinking about whether or not we should share data is something that still keeps me up at night. I like to default to transparency, and showing LoL win rates is pretty transparent.
On the other hand, I haven’t had good experiences from experiments with showing data to players. It never actually settles anything. At worst you get accused of lying about the data; at best, of not considering X as part of your analysis. Part of the blessing and curse of statistics as an art/science is that you can use different tests and get different results. This doesn’t mean that statistics as a thing is inherently flawed. It just means that you need consistency from test to test and a good justification for why you chose to handle the data in certain ways.
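A toy example of "different tests, different results," with entirely invented numbers: suppose hypothetical Build B does slightly less damage than Build A in a typical game, but has a handful of outlier games with enormous damage. Summarizing by mean versus median flips the conclusion, and both are defensible choices, which is exactly why justifying how you handle the data matters.

```python
# Toy sketch (invented data): two reasonable summaries of the same
# numbers reach opposite conclusions about which build is stronger.
import random
from statistics import mean, median

random.seed(7)
# Build A: ~1000 damage per game, modest spread.
build_a = [random.gauss(1000, 50) for _ in range(1000)]
# Build B: slightly lower typical damage, plus a few huge outlier games.
build_b = [random.gauss(980, 50) for _ in range(990)] + [5000.0] * 10

print(f"mean:   A={mean(build_a):7.1f}  B={mean(build_b):7.1f}")    # B looks stronger
print(f"median: A={median(build_a):7.1f}  B={median(build_b):7.1f}")  # A looks stronger
```

Whether you exclude those outlier games (were they bugged? stomps? smurfs?) is a judgment call, and every such call is a potential debate with the community.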
The factor that usually leans me towards not showing the data is a question of focus, particularly with regard to players giving us feedback. Personally, I don’t want some large percentage of my interaction with players to be taken up with debates about whether our data handles ping correctly, or how many outliers we exclude, or whether we should run the numbers again with a different item set, or whether we divide results by region, or how many games it takes to be experienced with a certain champ, or whatever. Those are all valid discussion points, but they just don’t feel like the best use of my time interacting with players. I’d prefer to talk about the game and not debate the data.

We are pretty confident in our data. We have smart people with a background in data science and analysis, and we’re comfortable with their confidence. Having the community play armchair quarterback for our data analysis is not nearly as valuable as feedback from the community about what is working or not for them, what’s frustrating, where they’re having fun, and so on. That part can’t be replicated internally (internal tests notwithstanding) and is a large part of why I try to make such an effort to talk to players in the first place.