Our latest research had us draw yet another normal distribution graph of grades from 1 to 10. We’ve done that before, and it’s always interesting to see how uninteresting such a graph can be. Wait, what? Those bell curves, since they’re normal distributions, tend to concentrate around the average, which is usually in the 5–6–7 range. When we grade students' exams and course work, it’s on a /20 scale, making things even worse: nobody on the teaching staff likes to give very low or very high grades. In other words, grading on a big scale is useless.
The same problem occurs when trying to rate your favorite book, video game, … We’ve all seen the flame wars when Gamespot dealt out an eight to a certain Zelda game when of course it should have been nine point eight. I struggle to come up with a number out of 10 for a book or a game. What’s the difference between a 5 and a 6? When to use 1 or 2, or possibly 3? Since I love lists, and therefore grading and sorting the items on those lists, I’m always interested in how others grade their things. Let’s take a quick look.
The first thing that springs to mind is social reading website GoodReads, where the scale is quite interesting. You’re supposed to deal out stars, which of course get converted into a number for easier integer database storage (really?), but to me, the number distracts from the most interesting part, which are the labels:
- Did not like it;
- It was OK;
- Liked it;
- Liked it a lot;
- It was amazing!
Note that awarding a book a 2 out of 5 means it was OK; it does not mean it failed the test just because it falls on the left-hand side of the distribution graph. I like this scale and use it myself to grade games, because it’s much easier to think about those labels than it is to think about a number from 1 to 10. When you read a book or play a game, you instinctively know whether or not you disliked it: there’s the 1. You also instinctively know when it was one of the best books you’ve read that year: there’s the 5. Did you like it (a lot), or was it meh but okay? There are the numbers in between the 1 and the 5.
Sometimes though, the difference between a 3 and 4, a 2 and 3, or even a 4 and 5 is confusing or hard to pinpoint. Brian Bankler of The Tao of Gaming explains in a brief thought on game ratings how he rates board and card games, using a scale of only four items instead of GoodReads' five:
- Avoid—won’t play this;
- Indifferent—I’ll play this out of politeness, but won’t suggest it;
- Suggest—I like this game, and suggest it;
- Enthusiastic—Play this often, suggest it.
His explanation as to why use the above system:
The great thing about the guide (for me) is that I’m constantly thinking “Is this game a 6 or 7?” but I have no trouble at all looking lumping games into those four categories. (I’m pretty quick to avoid a game; but I have a large, varied game group where people don’t take offense …)
In fact, that’s even better, and it fixes my problem with the GoodReads system. I guess the two scales could be mapped as follows: the 4 and 5 are an enthusiastic, the 3 is a suggest, the 2 is an indifferent, and the 1 is an avoid.
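Being a programmer, I can't resist writing that mapping down. This is only a minimal sketch of my own guess above, not something Brian or GoodReads publishes: the names `Label`, `STARS_TO_LABEL`, and `label_for` are made up for illustration.

```python
from enum import Enum


class Label(Enum):
    """Brian Bankler's four categories, as listed above."""
    AVOID = "avoid"
    INDIFFERENT = "indifferent"
    SUGGEST = "suggest"
    ENTHUSIASTIC = "enthusiastic"


# My own guessed mapping from GoodReads stars to the four labels (an assumption).
STARS_TO_LABEL = {
    1: Label.AVOID,
    2: Label.INDIFFERENT,
    3: Label.SUGGEST,
    4: Label.ENTHUSIASTIC,
    5: Label.ENTHUSIASTIC,
}


def label_for(stars: int) -> Label:
    """Translate a 1-5 star rating into one of the four labels."""
    if stars not in STARS_TO_LABEL:
        raise ValueError(f"expected 1 to 5 stars, got {stars!r}")
    return STARS_TO_LABEL[stars]
```

Note that the translation only works in one direction: both 4 and 5 stars end up as enthusiastic, so you can't get the stars back from the label. That's exactly the distinction I found hard to make in the first place.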
The most important part is that Brian never uses numbers on his blog. At the end of a review, he writes “suggest”, for instance. There are no “x stars”. I still hand out stars myself, but might have to rethink my approach, as in the end, your readers will simply scan the text and remember the number, thereby throwing much-needed context into the bin.
There’s a bit of a hiccup though: everyone even remotely interested in board games ends up at the BoardGameGeek (BGG) community, which requires a score out of 10, and displays averages rounded down to one decimal place. If you’re inclined to buy something but you’re not sure, you check out BGG’s average score. Guess which ones to watch out for? Indeed, 7+ = good, 8+ = amazing, 6-ish = meh. There’s never any 9 or 10, and almost never anything below 6 (or 5). So, again, what’s the point? Even more troublesome, Brian has to translate his rating system into BGG’s, for which he has a rough mapping, explained in his article.
If you want to trim grades or ratings even more aggressively, you could go the route of many contemporary video game review sites such as EuroGamer. They ditched the classic 10-point scale a long time ago in favor of something rather minimalistic: a game is either not worth it or average, in which case there’s no grade, or it’s Recommended, which roughly translates to Brian’s Suggest. If it’s really, really good, it’ll be awarded the label Essential. The difference between avoid and indifferent can only be worked out by actually reading the review.
Some of our courses at the faculty are graded using a binary pass/fail system, but I’m not a fan. That way, there is no distinction between the average and the better. If I were to clean out my board game closet—like I promised I would—I’d have no way of expressing my enthusiasm for one game, while “just” agreeing to play another one.
And then there are specialized ranking systems, such as the cRPG Addict’s GIMLET, where each game is judged across different categories: game world, character creation, NPC interaction, encounters & foes, … Each category results in a score out of 10, and there are 10 categories: the sum is the global score out of 100. Which, again, is totally useless, as admitted by the author: it reduces the rich information from the different categories into a single context-free number. His best games, such as Ultima Underworld, are awarded a 63. But what does that say about NPC interaction? You can’t untangle those numbers once they’re summed. Of course, at university, we do this all the time, and in the end, the administration office expects a number out of 20, so we summarize and re-calibrate dutifully.
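To see why a summed score can't be untangled, here is a small sketch. The two games and all their scores are invented purely for illustration (they are not real GIMLET ratings), and I only use the four categories named above instead of all ten.

```python
# Hypothetical per-category scores out of 10; both games are made up.
game_a = {
    "game world": 8,
    "character creation": 3,
    "NPC interaction": 2,
    "encounters & foes": 7,
}
game_b = {
    "game world": 4,
    "character creation": 6,
    "NPC interaction": 7,
    "encounters & foes": 3,
}


def total(scores: dict[str, int]) -> int:
    """Collapse the category scores into a single global number."""
    return sum(scores.values())


# Both games get the same global score...
print(total(game_a), total(game_b))

# ...even though they differ in every single category.
differing = [name for name in game_a if game_a[name] != game_b[name]]
print(differing)
```

Two very different games, one identical number. If NPC interaction is what you care about, the total tells you nothing: you have to go back to the categories, or better yet, to the text of the review.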
The more I think about grading, the more I’m inclined to pass on the numbers game as well, and instead focus on labels. I’ll let this sink in for a while and come back to it to implement in my own future systems. Great conversing with you, internet! Definitely a suggest there!