The last set of slides I have to put up are those for my five minute talk about game rating systems that I gave as part of Richard Lemarchand's Microtalks session on Thursday morning.
Overall I really enjoyed the session - though the prep was a nightmare. I guess it's pretty easy to enjoy a session where you only have to do 10% of the work and then you get to listen to smart people explain to you what exactly it is that you do for a living.
Anyway, the five minute format is colossally difficult. Especially if you are trying to make a point and support it with arguments instead of being all hand-wavy and theoretical. I wish I could have brought to bear all the arguments I have for why we should switch from 10, 20, 100 or even 1000 point rating systems to a simple five point rating system, but I just didn't have time.
I guess that's what blogs are for.
Here are the arguments I made in the talk:
100 point rating systems are so nearly analog that the human brain cannot make sense of the granularity (what is the difference between an 86 and an 87 in real, perceptible, measurable terms?). These systems nobly attempt to create a threshold-less system - but when the brain is overloaded we add thresholds back in... we say '61-70 is above average' and '71-80 is good' or whatever. This effectively restores the thresholds that a 100 point system endeavors to remove - but it restores them in a less rigorous way, leading to a mysticism around certain improbable thresholds at the far reaches of the curve. I called this mysticism the Cult of 90+ and talked about it specifically in the context of wine rating.
I talked also about the relationship between the review scores, the reviewer, the publishers of the review and the developers and publishers making the games. There are pressures in these relationships that push reviewers to give higher scores. This is very easy to do in a 100 point system (because of the bracketed question in the previous paragraph), meaning a reviewer under pressure can relieve that pressure incrementally by inflating a score ever so slightly to make everyone involved a little bit happier. This systemic, distributed pressure release into the system of review scores is inflationary and increases game review scores over time.
On the subject of inflation, I presented evidence of it by looking at the aggregated game ratings taken from gamerankings.com, and I discussed why it is bad: principally because it prevents cross-generational comparison of games (in fact, this need for cross-generational comparison means we should impose and maintain a bell curve in ratings even if games are getting empirically better over time).
As an aside: Adam Sessler ranted against a different set of problems inherent in the way aggregators work at this year's Rant session. Adam is also right, in my opinion, and a global switch to a five star system would also work to alleviate his complaint to some extent.
In any case, if you want the discussion on the above points, you can download the slides and the text of the talk here.
Also of note is that I used Kent Hudson's 'How To Pick a Lock' talk from GDC 2008 as a point in the discussion and I have had a couple people ask me about it. For those interested in that discussion, Kent's slides and video support can be found here. Thanks Kent.
Now - over and above the points raised in the talk there are a couple other arguments for why we ought to undertake a global switch to a five star rating system for games that I did not have time to cover in the talk itself.
Firstly, 100 point rating systems leverage the 'fallacy of precision' to lend authority to the accuracy of the rating (the difference between accuracy and precision is explained here). Essentially the argument here is that if I claim a game is an 82, then most readers are likely to think 'well, it must be somewhere between an 80 and an 85'. In fact, this is a fundamentally invalid assumption. If I were to use a 10 point system and give the game an 8, readers would assume it to be 'probably between a 7 and a 9'. If I were to use a five star system and give a 4, they would assume it 'probably between a 3 and a 5'.
The problem is that the precision afforded by an 82 does not make the rating any more accurate. The game's 'true' score is as likely to be a 60 or a 100 as a 4 star rating is to really be a 3 or a 5. Accuracy and precision are orthogonal concepts, but the fallacy of precision leverages the fact that our brains tend to correlate them. In the highly subjective field of rating games, it is much more important that we be accurate than that we be precise, and someone who plays a lot of games can be accurate in a five star system.
To go one step further with this argument, I would point out that in a world full of aggregators we would still have 100 and 1000 point systems - and having the aggregators roll five star ratings up into percentages is useful and good (provided everyone uses a five star system and the aggregation policies are standardized, fair and well understood - cf Sessler). Having them aggregate ratings that already claim 100 points of precision makes the aggregators susceptible to the effects of the aforementioned inflation (in fact it is in the aggregators that we are able to measure the inflation).
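To make that concrete, here is a minimal sketch of what standardized aggregation of five star ratings might look like - the review data and the simple-average policy are my own invented illustration, not how any real aggregator works:

```python
# Hypothetical sketch: rolling five-star outlet ratings up into a percentage.
# The ratings list below is invented for illustration.

def aggregate_stars(star_ratings):
    """Average a list of 1-5 star ratings and rescale to a 0-100 score."""
    if not star_ratings:
        raise ValueError("need at least one rating")
    mean_stars = sum(star_ratings) / len(star_ratings)
    return round(mean_stars / 5 * 100, 1)

reviews = [4, 5, 3, 4, 4, 5]        # six hypothetical outlet ratings
print(aggregate_stars(reviews))     # mean of ~4.17 stars -> 83.3
```

The point of the sketch is that the precision lives only in the aggregate, where it belongs - each individual reviewer only ever commits to one of five coarse, honest buckets.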
For the record - this point was misquoted in IGN's article on the talk, wherein they wrote:
According to Hocking ... 35 of the 100 current top-rated games (as tracked by an aggregate site) were released after 2001. That presents a skewed view of the industry's history and points toward a recent inflation of sorts in game reviews.
Note that an astute commenter then went on to reply with this:
To say 35 of the top 100 rated games of all time came out since 2001 means that scores have been inflated since them is not consistent. Since 2001, there have been 8 years of games released. We can assume that almost all games with ratings have come out since around 1985, or 24 years ago. The past 8 years represents 1/3 of this time, and 35/100 is about 1/3 of the total game volume. It is consistent that ratings have NOT been inflated as 1/3 of the top rated games have come out in 1/3 of the time period of game releases.
The commenter is exactly right if 35 of the Top 100 rated games of all time were released since 2001. But the real quote was that 35 of the Top 41 highest rated games of all time were released since 2001. That's not the 35% we would predict, that is 85%. Get it right IGN, you kind of invalidated the entire talk by erroneously reporting the most critical and central piece of data. Maybe it was the precision that threw you off?
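The arithmetic here is easy to check. Under the commenter's (entirely reasonable) null hypothesis that ratings are not inflated, the last 8 of roughly 24 years of rated games should account for about a third of any top list. The year spans come from the comment above; the rest is just proportions:

```python
# Proportion check for the inflation claim. The "expected" share assumes
# top-rated games are spread evenly over the ~24 years of rated releases.
recent_years, total_years = 8, 24        # games since 2001, out of ~1985 onward
expected_share = recent_years / total_years   # ~0.33

# IGN's misquote: 35 of the Top 100 released since 2001
print(35 / 100)             # 0.35 -- consistent with the expected ~0.33

# The actual figure from the talk: 35 of the Top 41
print(round(35 / 41, 2))    # 0.85 -- wildly above the expected ~0.33
```

So the misquoted version really is consistent with no inflation at all, which is exactly why the Top 41 figure mattered.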
I won't go into the other horribly broken arguments in the comments thread... I should know better... but here's my personal favorite:
Clint is a joke. Far Cry was overrated trash. The 100 point system is fine. It allows a more accurate score. Period.
That's like a straw-man multiplied by the denial of the fallacy of precision. Awesome stuff.
A follow-on from the fallacy of precision is the argument that the kind of precision implied by 100 point systems feels elitist to 'ordinary' people. At a time when the industry is experiencing explosive growth - when new players of all demographics are starting to take up gaming - it serves our interest to give them reviews and rating scores they can understand. An 89.3 implies a level of refinement in perception among gamers and reviewers that not only does not exist, but is also intimidating to the mass market. It is the equivalent of terms like 'mature oak finish' and 'fruity nose' in wine discussion. These concepts are alienating and make people feel excluded when - in the end - the entire purpose of game reviews, and particularly of numerical scores, is to help inform people about the product and invite them to play.
The final point I wanted to make was that - as members of a press with increasingly important social responsibilities - the reviewers, writers and editors giving the review scores should be the ones championing the push to a five star system. For all the reasons mentioned above, finer granularity systems are broken and compromise the integrity of their work (and the utility of their work to developers). Chris Hecker also gave a mini rant this year in which he called for the gaming press to do their job well. I believe the first step in doing that lies in radically reducing the importance and weight given to numerical scores and emphasizing the importance placed on quality reviews, well written.
I am pretty sure that no matter what I say, this call is going to go unheeded... but I am also pretty sure that those who don't make the transition are going to find themselves very rapidly serving a smaller and smaller niche of an explosively growing market. I very much look forward to continuing to work with those who survive the transition.