Many people dislike the idea of assigning numerical ratings to video games. The prevailing attitude among games industry pundits and developers alike seems to range from indifference to outright hostility. Popular game review sites such as Kotaku, Rock Paper Shotgun, Ars Technica and The Verge forgo numerical ratings entirely. Eurogamer has recently decided to dispense with the traditional 10-point scoring system [footnote]Eurogamer has dropped review scores - Eurogamer - Oli Welsh - Feb 10, 2015
http://www.eurogamer.net/articles/2015-02-10-eurogamer-has-dropped-review-scores
The article lays out Eurogamer's rationale for dropping the 10-point scoring system in favor of a 3-level 'Essential / Recommended / Avoid' system. Says Oli Welsh: "This hasn't been the first time we've discussed dropping review scores. In the past, the case we've made for it internally has always been that a number is a very reductive way to represent a nuanced, subjective opinion, and that the arguments started by scores aren't productive." Welsh goes on to state: "The counter-argument was simple but powerful: as an at-a-glance guide to what we think, scores were very useful to readers. We no longer think that's true. In the present environment, scores are struggling to encompass the issues that are most important to you." In summary: "Scores are failing us, they're failing you, and perhaps most importantly, they are failing to fairly represent the games themselves." For all the talk in the article about how modern games are continuously evolving through patches and feature updates, and how this has made reviewing games more difficult, it's not made clear exactly how the proposed 3-level system is better equipped to handle these challenges. Eurogamer's manifesto becomes even more puzzling when it's revealed that the proposed rating system will only be applied to selected games, and that, barring extraordinary circumstances, they will not update reviews as games evolve over time. [/footnote], as did Joystiq shortly before closing its doors [footnote]Joystiq isn't scoring reviews anymore, and here's why - Joystiq - Richard Mitchell - Jan 13, 2015
http://www.joystiq.com/2015/01/13/joystiq-isnt-scoring-reviews-anymore-and-heres-why/
Here Joystiq explains their decision to stop scoring reviews. Unfortunately, they did not get the opportunity to gauge the decision's impact as the site was shuttered by AOL just two weeks later. Mitchell states: "The very purpose of a score is to define something entirely nebulous and subjective - fun - as narrowly as possible. The problem is that narrowing down something as broad and fluid as a video game isn't truly useful, especially in today's industry. Between pre-release reviews, post-release patching, online connectivity, server stability and myriad other unforeseeable possibilities, attaching a concrete score to a new game just isn't practical. More importantly, it's not helpful to our readers." Later in the article it is claimed that Joystiq felt compelled to modify their original five-star rating system to better align with the typical distribution of scores on Metacritic, an apparent "capitulation to the industry" they weren't happy with. However, as the only changes implemented were to start using half-stars and to add a half-star to a few old scores, it's not entirely convincing that there was ever any serious incompatibility between Joystiq's rating scale and that of the typical reviewer listed on Metacritic.[/footnote]. Chief among the complaints about review scores are that they discourage meaningful discussion of games, and that they aren't sufficiently nuanced to be an effective tool for reviewers [footnote]The Spotty Death and Eternal Life of Gaming Review Scores - Ars Technica - Kyle Orland - Feb 15, 2015
http://arstechnica.com/gaming/2015/02/the-spotty-death-and-eternal-life-of-gaming-review-scores/
The article contains some comments from Jason Schreier of Kotaku regarding his opinion of game review scores. Says Schreier: "When I read through the comments on an IGN review, for example, all I see is people talking about the score ... compare that to, say, comments on an [unscored] review from Kotaku or Rock Paper Shotgun, and it's night and day". Schreier elaborates further: "Scores strip the nuance from video game criticism, forcing reviewers to stuff games into neat little boxes labeled 'good' and 'bad'". Well, it's perhaps not surprising that Schreier believes he and his colleagues at Kotaku are cultivating a superior audience, but it remains unclear whether reality subscribes to the same theory. What's considerably more likely is that reality maintains a subscription to the eponymous subreddit known for mocking the outlet's tawdry editorials.[/footnote]. Other industry professionals prefer to direct their ire towards the practice of review score aggregation on websites such as Metacritic and GameRankings. Have a listen to old Sessler rave about how evil Metacritic is tearing the industry apart [footnote]Adam Sessler's rant about Metacritic at GDC 2009 - Youtube video - Simon LeMonkey - Mar 28, 2009
https://www.youtube.com/watch?v=0QsXrswJ-yM#t=25s
This video is of Adam Sessler giving a talk at the Game Developers Conference (GDC) 2009. By way of introduction, Sessler proclaims: "Fuck Metacritic. Who the hell made you king?" Sessler relates an anecdote of a developer he was acquainted with approaching him, upset that Sessler's 2/5 rating for his game had been translated to a score of 40/100 on Metacritic. He later clarifies: "It's just kind of odious when we in the press are seeing our work retranslated and recalibrated ... where we're really not claiming ownership and suddenly there's this number attached to it." At this point, I'm wondering in what alternative universe has a 2/5 rating ever been indicative of anything other than a steamy mound of canine feces? It should go without saying that a straight multiplicative scaling of 5-point or 10-point scores to a 100-point score is the most natural method of conversion. The talk also touches on the issue of publishers compensating developers based on Metacritic scores. Here Sessler displays an astounding proficiency at mental gymnastics when he suggests that professional reviewers should not in any way be held responsible for poor game sales, but that Metacritic (an aggregation of professional reviews) should definitely be held responsible: "You know what? If it's a good game and you know it's a good game, but it doesn't sell well, go talk to your marketing staff, all right? Don't put it on us [game critics]. I'm sorry this has happened to you [game developers]. I want to put a stop to it. Maybe somehow we'll all get together, we'll march down to CNET [Metacritic's owner], we'll flip 'em the bird, and maybe somebody in that building will take a long hard look at something they're putting online that is odious, pernicious and needs to stop now."[/footnote]. TotalBiscuit also isn't a fan, to put it mildly [footnote]Content Patch : January 3rd, 2013 - Ep. 025 [Tomb Raider, Gamestick, Elite] - Youtube video - TotalBiscuit, The Cynical Brit - Jan 3, 2013
https://www.youtube.com/watch?v=mQqzqHgvB90#t=521s
Says TotalBiscuit: "It's no secret that I'm very much a critic of Metacritic. I very much dislike the notion of the website, and I very much dislike what it has done to the industry". The commentary devolves into a meandering rant containing several dubious arguments, such as (1) Metacritic is lambasted for refusing to remove a GameSpot review score after the reviewers later admitted to doing a terrible job, (2) the use of background colors to improve readability of review score text is criticized as pandering to lazy dimwits, (3) Metacritic is accused of operating on a "business model of manipulating ratings" because they dare to translate review scores of 10/10 or A+ to 100/100. The single coherent point of the tirade revolves around Metacritic's lack of transparency in the way critic scores are weighted to determine a game's Metascore, a legitimate concern that has been raised elsewhere. However, it's hard to take this too seriously when the sinister allegations of targeted rating manipulation that follow aren't backed by any evidence or even reasonable suspicion. TotalBiscuit concludes with a strongly worded condemnation: "it just proves once again that Metacritic, in its current form, is a dangerously useless site that is actively stomping around like a bull in a china shop when it comes down to game and media reviews. And its limited usefulness is not enough to counteract the potential damage that Metacritic could actually do to the industry and is continuing to do to this very day."[/footnote].
On the other side of the table, plenty of game consumers don't see a problem with including scores as a component of written reviews. Aggregated ratings pages are viewed as a helpful, if not completely definitive, source of information about present and past titles. Games released with serious technical deficiencies will almost certainly find themselves on the business end of a numerical beatdown from critics and users alike. In fact, if attempts at coercion and bribery by publishers are even half as pervasive as games journalists tell us, turning to aggregate ratings for a reliable appraisal of game quality is a perfectly sensible course of action [footnote]The basic premise is that a minority of deliberately inflated (or deflated) review scores will not have a great impact on the overall average score. Fortunately, the premise still holds when outliers are merely a result of overzealousness rather than outright dishonesty. A quick arithmetic sketch following this paragraph illustrates the point.[/footnote]. I've also seen many comments on discussion boards that boil down to "find a critic that works for you", which suggests it isn't so much a matter of the particular review format being employed as it is the tastes of the reviewer. Currently, it seems there is still some degree of support for review scores among game critics [footnote]Review Score Guide - The Jimquisition - Jim Sterling - Nov 22, 2014
http://www.thejimquisition.com/review-score-guide/
This guide was posted shortly after Jim Sterling left The Escapist to become an independent game critic funded directly by readers. There isn't anything especially remarkable here, other than the fact that Sterling elects to continue using review scores of his own accord: "Some people don't like review scores, and do not want to see them in reviews - that is okay! Scores here are subtle in their application, casually included at the end of the review, and you can always ignore it if you don't think it's useful. I personally like using scores, and intend to continue doing so until such time as I don't." [/footnote] [footnote]The Official Destructoid Review Guide - Destructoid - Chris Carter - June 16, 2011
http://www.destructoid.com/the-official-destructoid-review-guide-2011-203909.phtml
This article explains Destructoid's system of review scoring. Yanier 'Niero' Gonzalez, founder of Destructoid, is quoted as saying: "Ad companies we've worked with have called us crazy for publishing scores. It really is like deciding to go to war. The only reason a site does not publish review scores is to sell more advertising. We have lost ad campaigns because we've given bad review scores, and frankly my dear, I don't give a damn. I'm not compromising our voice. Still, we understand the danger of a bad score. For example, some publishers giving their employees pay cuts due to scores, but in that case we push it back on them. It's not our fault you choose this method to compensate your employees. Grow a backbone, stand behind your work, make better games, and stop blaming the gaming press for having an honest opinion." Blunt and logical. A sincere defense of the review score is a rare sight to behold. [/footnote], while for others it comes across as a calculated business decision [footnote]The Spotty Death and Eternal Life of Gaming Review Scores - Ars Technica - Kyle Orland - Feb 15, 2015
http://arstechnica.com/gaming/2015/02/the-spotty-death-and-eternal-life-of-gaming-review-scores/
The article contains some comments from Arthur Gies of Polygon regarding his opinion of game review scores. Says Gies: "The anecdotal accounts and experiences we had suggested that readers want them [scores], whether they admit to it or not", adding that "I think there are people who are interested in arguing numbers and people who are more interested in discussing points raised in review text, and that neither are mutually exclusive." Damn Arthur, could you be any less enthusiastic about it? Not even a respectful nod towards the numeric tramp stamp you brandished to hit back at that pixelated perpetrator of psychosexual trauma known as Bayonetta? [/footnote].
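To make the robustness premise from the earlier footnote concrete, here's a minimal sketch in Python. All numbers below are invented for illustration (not drawn from any real publication); the point is simply that a small minority of inflated scores nudges the average of a decent-sized review pool by only a couple of points:
[code]
import random
random.seed(0)

# Hypothetical pool: 40 honest critic scores clustered around the low 70s.
honest = [random.gauss(72, 8) for _ in range(40)]

# Add a small minority of deliberately inflated ratings.
shilled = honest + [95, 98, 100]

mean = lambda xs: sum(xs) / len(xs)
print(f"honest mean : {mean(honest):.1f}")
print(f"shilled mean: {mean(shilled):.1f}")  # shifts the average by only ~2 points
[/code]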
So what exactly is the problem with review scores? I find the artsy-fartsy answer of "video games, like any form of art, are far too complex for their merits to be reduced to a number" less than convincing [footnote]Listening to some of the unabashed arrogance on display from professional games journalists over the past six months, both on forums and in the Twitterverse, one begins to wonder if it isn't in fact their creations we're expected to regard with the sort of reverence usually reserved for fine art. And there's more than a sneaking suspicion that certain critics, unable to conceal a stunning lack of awareness of their own position on the food chain of gaming content production, are deeply resentful of Metacritic for continuing to exist on the back of their hard work. As Escapist reader Thanatos2K [http://www.escapistmagazine.com/forums/read/6.870766-A-Review-Scoring-System-That-Would-Work?page=2#21827140] aptly puts it: "The real reason reviewers try to argue that "averages are useless" and "averages flatten out stuff" is that they're afraid of what averages really do - render them just a voice in a sea of other reviews, all equal. They don't want to be equal though. They want their review to mean more than everyone else's. They think they know better after all, and they want you to read THEIR review (and please click through to our site while you're at it). When I can see 50 reviews at a glance, their scores, and the average of their scores, I only need to fully read a few reviews to get the gist of why the scores fell where they did. This is poison to reviewer ego."[/footnote]. A more cynical view is that taking a non-committal approach to reviews merely serves as a way for professional critics to spare themselves the embarrassment of recommending the next big stinker, or perhaps as a way to make criticism more palatable to site advertisers with a vested interest in game sales [footnote]See reference [8].[/footnote]. This isn't quite right, though; many game reviewers adopt an even more committal approach than writing down a number, as explained below.
I think it's misguided to point fingers at neutral aggregators like Metacritic, but I can understand why some eyebrows were raised at the discovery that game developer bonuses had, at least in one instance [footnote]Obsidian missed Fallout: New Vegas Metacritic bonus by one point - Joystiq - Ben Gilbert - Mar 15, 2012
http://www.joystiq.com/2012/03/15/obsidian-missed-fallout-new-vegas-metacritic-bonus-by-one-point/
A report on Fallout: New Vegas developer Obsidian missing out on royalties because the game achieved a (critic) Metascore of 84 instead of 85 or higher. This actually seems like a generous score considering the reviews of the PC version make universal reference to a buggy experience and gameplay all too similar to its predecessor. Curiously, even though Joystiq themselves only awarded the Xbox 360 version of the game a score of 70 (tied for 4th lowest out of 81 ratings), they can't resist taking a swing at Metacritic for giving smaller outlets a seat at the table: "Leaving aside the fact that Metacritic is a woefully unbalanced aggregation of review scores from both vetted and unvetted publications, agreements like this can leave indie studios -- like Obsidian -- in the lurch should that Metacritic score just barely miss the mark." Sorry Ben Gilbert, but the gaseous emissions of major gaming publications aren't quite as fragrant as you seem to think.[/footnote], been tied to critic review scores. What I find indefensible is the idea that one can take a principled stand against review scores while at the same time being perfectly happy to issue binary recommendations of the form 'buy / don't buy' or 'play / don't play'. Kotaku, Ars Technica, and now Eurogamer explicitly engage in this practice, to say nothing of the propensity of unscored reviews to all but club the reader over the head with a final verdict. The reason I consider this position absurd is simple: a score conveys the relative weight of the pros and cons discussed by a reviewer far more precisely than a binary or ternary assessment can. That is to say, if we gauge various forms of review by the amount of information they convey, the 'yes / no' recommendation ranks at the bottom. It's about as nuanced as a chainsaw-wielding medical intern who complains that the surgical instruments aren't sharp enough.
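If "amount of information" sounds hand-wavy, it can be made literal. Treating a verdict as a choice among n categories, the most it can convey is log2(n) bits. The sketch below assumes equally likely outcomes, an idealization that flatters every format equally, but it makes the ordering plain:
[code]
import math

# Upper bound on the information content (in bits) of a verdict drawn
# from n equally likely categories: log2(n). Real score distributions
# are skewed, so these are ceilings, but the ordering is what matters.
formats = [
    ("buy / don't buy", 2),
    ("Essential / Recommended / Avoid", 3),
    ("4-star scale", 4),
    ("10-point scale", 10),
    ("100-point scale (0-100)", 101),
]
for name, n in formats:
    print(f"{name:32s} -> at most {math.log2(n):.2f} bits")
[/code]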
Meanwhile, back in the camp where nuance isn't merely a philosophical construct for peddling opinions that are anything but, there are those who aren't as dogmatically opposed to numerical ratings but feel as though the scoring system is flawed. The main point of contention is that scores on a 10-point or 100-point scale are artificially skewed towards the 70%-100% region [footnote]Video Game Reviews: A Discussion Of The Ten-Point Scale And Inflated Scores - Forbes - Erik Kain - June 14, 2013
http://www.forbes.com/sites/erikkain/2013/06/14/how-video-game-reviews-work/
In this article, Erik Kain begins by explaining how the compression of school grades into the high end of the percentage scale is mirrored by video game review scores: "First of all, the 10-point scale is deceptive. Here's what I mean by that: First, take the numbers 1-10 and graft them over to the traditional letter-grading we use at school. There are just five letter grades [which] translate roughly to A = 90-100%, B = 80-89%, C = 70-79%, D = 60-69%, F = 00-59%. Only truly awful grades would get an F even though F comprises 59% of the total scale ... the same is true with video game reviews. Only truly awful games are given an F while most games fall somewhere between a 7 and a 9." Later Kain reveals his personal preference of a 3-tier rating system ("Buy / Hold / Sell"), stating: "To me, only two scores count: ones above 9 and ones below 7. This indicates something that might be special on the one hand, and something that might be truly terrible on the other or at the very least not worth buying. Everything else just means it's okay-to-good with a margin of error based on personal taste."[/footnote] [footnote]Review Score Guide - The Jimquisition - Jim Sterling - Nov 22, 2014
http://www.thejimquisition.com/review-score-guide/
Here Jim Sterling advocates the full range of the 10-point scale: "In my prior work at Destructoid, I always aimed to use the full ten-point scale, rather than simply the higher end of it. There's a popular belief that reviews are rated from 7-10 by major outlets, instead of 1-10, and while that's an exaggeration, I certainly feel more publications could stand to utilize all the numbers a bit more readily." Sterling also recommends the use of half-points on the 10-point scale, stating "[it's] useful to have that bit of wiggle room". The post goes on to describe the 10-point system employed by The Jimquisition.[/footnote], and any game rated below 70% is more likely to be found in a GameStop bargain bin than a console disc tray. Thus, there is a perceived incompatibility with the traditional 4-star system in which scores are (presumably) more evenly distributed across the scale, and where 2/4 truly does stand for average quality [footnote]#GamerGate Wants Objective Video Game Reviews: What Would Roger Ebert Do? - Forbes - Erik Kain - Dec 28, 2014
http://www.forbes.com/sites/erikkain/2014/12/28/gamergate-wants-objective-video-game-reviews-what-would-roger-ebert-do/
After dismissing various complaints made by Gamergate as "paranoid" and "silly", Kain moves on to the general topic of objectivity in game reviews, saying: "readers need to accept that each critic will weight his or her review differently, and that the search for the 'objective' reviewer is futile. A reviewer who ignores politics or gender issues in their review entirely is simply biased in another direction. Balance is crucial." I'd argue that leaving politics and gender issues out of reviews entirely is a far cry from hamfisting them to the forefront in fictional works where they aren't remotely a main theme. Citing famed movie critic Roger Ebert's review of The Last Boy Scout as the epitome of balance, Kain states: "Part of the problem may be our scoring system for video games. There's something about the four-star system that's simpler and more honest than a ten point scale. Gone are the weird decimals. Gone is the tendency to weight scores toward the upper end of the scale. A great movie or game simply gets four stars. A good movie or game gets three. A mediocre movie or game gets two. And a bad movie or game gets one. It's nice and tidy, and it allows reviewers to give a 'good' review score to a good game while still criticizing its less savory aspects, much as Ebert does with The Last Boy Scout." I find this reasoning extremely flimsy. The 10-point scale allows equivalent penalties to be levied for "less savory aspects" while affording greater flexibility in just how large that penalty should be. Flexibility ought to be a welcome ally to the "it's just a subjective opinion, no worse than any other" crowd, a group that many game critics count themselves as proud members of and which Kain seems intent on joining.[/footnote].
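Kain's school-grade analogy is concrete enough to write down directly. Here's a sketch of the conventional US letter-grade brackets he cites, which makes the compression obvious: the F bucket alone swallows roughly 59% of the numeric scale.
[code]
def letter_grade(score: int) -> str:
    """Conventional US letter-grade brackets, as cited by Kain."""
    if score >= 90: return "A"
    if score >= 80: return "B"
    if score >= 70: return "C"
    if score >= 60: return "D"
    return "F"  # 0-59: one bucket covering ~59% of the scale

# Under this mapping, the industry-wide scoring average of ~74 is a flat 'C'.
print(letter_grade(74))
[/code]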
Well, if this 'tight scale makes for honest scores' argument holds water, surely we should be able to find some evidence to support it. The problem is that video game reviewers who employ a strict 4-star or 5-star grading system (strict = no half-stars) are a rare commodity. Among sites that could be classified as well-established, I count only Giant Bomb. So, while I'll grant that a single data point doesn't prove the general case, this particular data point might be viewed as an important one by those familiar with the origins of Giant Bomb [footnote]Jeff Gerstmann Explains His Departure From Gamespot - The Escapist - Earnest 'Nex' Cavalli - Mar 15, 2012
http://www.escapistmagazine.com/news/view/116360-Jeff-Gerstmann-Explains-His-Departure-From-Gamespot
A synopsis of the infamous dismissal of Jeff Gerstmann from GameSpot in 2007 and the subsequent formation of Giant Bomb. The conflict between Gerstmann and GameSpot management arose primarily over a low review score he awarded to Kane & Lynch: Dead Men, a game that was being heavily advertised on the site. Once the details of the affair became known, Gerstmann was lauded for refusing to cave to pressure from advertisers and he became somewhat of a symbol for ethical games journalism. Ironically, Giant Bomb was later sold to CBS Interactive, the parent company of GameSpot. More recently, Gerstmann's spotless reputation has been called into question for admitting indie game marketing baronesses onto Giant Bomb to hawk their wares under the pretext of 'Top 10 Games' lists. [/footnote]. Illustrated in the bar plot below is the probability distribution of review scores for Giant Bomb and The Escapist based on console and PC games reviewed over the past five years [footnote]A collection of data for the analysis of video game ratings - Blog post - Slandered Gamer - Dec 30, 2014
http://slanderedgamer.blogspot.com/2014/12/a-collection-of-data-for-analysis-of.html
This blog post details a software application for downloading and viewing Metacritic game reviews. It also provides a sizeable collection of review data. The collection of data used in this article includes all console and PC games reviewed by either Polygon, Joystiq, Giant Bomb or The Escapist from January 2010 to mid-December 2014. Mobile and handheld titles were excluded completely. Why does this matter? Well, I suspect this selection of games slightly inflates the score statistics of other publications, i.e., anyone who isn't Polygon, Joystiq, Giant Bomb or The Escapist, by discounting some of the lesser-known (and lower-scoring) titles they've reviewed. This is because game reviewers tend to cover all the same major titles while randomly picking and choosing among minor titles with less overlap. However, disparities between statistics given here and those listed on Metacritic - typically 1 to 5 points in average score, for example - are also influenced by the exclusion of mobile and handheld games as well as any games released prior to 2010. These exclusions were viewed as desirable in order to obtain a selection of games that is both relevant and recent.[/footnote]. Note that Metacritic uses proportional scaling to convert to a 100-point score (e.g. 3/5 = 60/100, 9/10 = 90/100). Rather than the vaunted uniform distribution across the scale, it can be observed that Giant Bomb's scores are about as heavily concentrated towards the high end as The Escapist's. The average ratings of 72 and 74 are also very similar. Even if, in a foolish attempt to appease the "five stars doesn't translate to a perfect score!!!" mouth breathers, we shift the score distributions left by a half-interval (subtract 10 points for Giant Bomb and 5 points for Escapist), what remains is that Giant Bomb rates 61% of games above the midpoint of its scale (3/5 stars) while just 13% fall below this mark.
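As an aside, the proportional scaling mentioned above fits in three lines of code, which makes Sessler's bewilderment at his 2/5 becoming a 40/100 all the more puzzling. A sketch of straight multiplicative conversion (Metacritic's exact in-house rules for letter grades and other exotic scales aren't public, so this covers only the plain numeric case):
[code]
def to_100(score: float, scale_max: float) -> float:
    """Straight multiplicative scaling onto a 100-point scale."""
    return score * 100 / scale_max

assert to_100(3, 5) == 60    # 3/5 stars -> 60/100
assert to_100(9, 10) == 90   # 9/10      -> 90/100
assert to_100(2, 5) == 40    # Sessler's 2/5 -> the contested 40/100
[/code]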
There's no denying the fact that video game ratings lean heavily towards the upper end of the scale. You need only browse the publication lists on Metacritic to discover an industry-wide scoring average currently sitting at 74, a figure that the vast majority of individual publications fall within +/-10 points of. Compare that to an overall average score of 62 for movies, 65 for television and 72 for music.
To gain a better appreciation of the status quo, the table below contains a statistical summary of review scores from some prominent gaming critics for a large selection of console and PC titles [footnote]See reference [17].[/footnote]. Average scores are in the range of 68-78. The cumulative probability data conveys the distribution of each critic's review scores across the scale [footnote]If you've never heard of a cumulative distribution function (CDF) before, here is a brief explanation sufficient for our purposes. I have a bunch of review scores for different games, say {55, 70, 75, 90, 100}. If you name a particular score threshold you're interested in, I can calculate what fraction of my scores are less than or equal to that threshold. For example, if you say 70, I count 2 out of 5 scores that are less than or equal to 70, which gives a cumulative probability of 2/5 = 0.4. If you then ask about 85, I find that 3 of my 5 scores are less than or equal to 85, giving a cumulative probability of 3/5 = 0.6. The nice thing about this system is that it allows us to efficiently summarize thousands of scores by calculating the cumulative probability at a small number of preselected thresholds, for example at 30, 40, 50, 60, 70, 80, 90 and 100 as in the table below. A short code sketch following the table makes this calculation concrete.[/footnote]. Some useful observations can be made. For example, the probability of scoring a game at 70 or lower is between 0.33 and 0.52 for most publications, but it goes as low as 0.24 (Game Informer and GameTrailers) and as high as 0.65 (Edge Magazine). The probability of scoring at 50 or below is typically between 0.09 and 0.17, but this can be as low as 0.05 in the most extreme case. Perhaps most alarming is the inclination of certain critics to use the 81-100 region of the scale for half of all games they rate (take 1.0 minus the cumulative probability value in the 80 column), whereas most gamers would agree that 81-100 territory should be reserved for truly top-notch efforts [footnote]Downfall of Gaming Journalism #9: GAME INFORMER - Youtube video - The Rageaholic (RazorFist) - Feb 15, 2015
https://www.youtube.com/watch?v=pss0hJkmLBA
I doubt the creator of this video would think much of the current article as he unequivocally condemns video game rating inflation instead of seeking rationale for it. But if we agree on one thing, it's who to point the finger at when the worst offenders are lined up. Says RazorFist: "Polygon, Kotaku and Rock Paper Shotgun didn't just wake up one day and decide "hey, let's be a **** lapping cabal of bought out bitches." ... Long before there were URLs and mailing lists, there were SKUs and mailing lists. Print motherfucking journalism, folks. And no institution is more steeped in, or emblematic of, the omnipresent orgy of corruption in gaming journalism than fucking Game Informer." Among various bombs unloaded during this blistering rant, an interesting theory is put forward concerning a tipping point in the history of game review scoring: "I'm of the opinion that Game Informer almost single-handedly skewed review scores across all websites and publications for all time ... I hold an issue [of Game Informer] in my hand from 2009 - far from a banner year for gaming - that in 25 reviews boasts not one that ranks below a 5.5 [out of 10]". RazorFist later adds: "A bad game isn't a 6 or a 7 you colluding **** flaps [reviewers], a bad game is a 1 or a 2.", going on to enumerate the many faults of the December 2009 edition of Game Informer magazine.[/footnote]. All told, the results serve as further confirmation that nearly all of the action takes place in the top half of the scale.
[table border="1" width="600"]
[tr][td rowspan="2"]Critic[/td][td rowspan="2"]Samples[/td][td rowspan="2"]Average[/td][td colspan="8" align="center"]Cumulative probability at score:[/td][/tr]
[tr][td]30[/td][td]40[/td][td]50[/td][td]60[/td][td]70[/td][td]80[/td][td]90[/td][td]100[/td][/tr]
[tr][td]Destructoid[/td][td]747[/td][td]74[/td][td]0.04[/td][td]0.08[/td][td]0.13[/td][td]0.20[/td][td]0.37[/td][td]0.63[/td][td]0.90[/td][td]1.00[/td][/tr]
[tr][td]Edge Magazine[/td][td]548[/td][td]68[/td][td]0.04[/td][td]0.08[/td][td]0.19[/td][td]0.40[/td][td]0.65[/td][td]0.88[/td][td]0.99[/td][td]1.00[/td][/tr]
[tr][td]Eurogamer[/td][td]767[/td][td]71[/td][td]0.04[/td][td]0.09[/td][td]0.16[/td][td]0.29[/td][td]0.52[/td][td]0.79[/td][td]0.96[/td][td]1.00[/td][/tr]
[tr][td]Game Informer[/td][td]1090[/td][td]78[/td][td]0.01[/td][td]0.02[/td][td]0.06[/td][td]0.13[/td][td]0.24[/td][td]0.54[/td][td]0.89[/td][td]1.00[/td][/tr]
[tr][td]GameSpot[/td][td]1334[/td][td]72[/td][td]0.01[/td][td]0.06[/td][td]0.12[/td][td]0.24[/td][td]0.45[/td][td]0.79[/td][td]0.99[/td][td]1.00[/td][/tr]
[tr][td]GamesRadar[/td][td]746[/td][td]73[/td][td]0.03[/td][td]0.06[/td][td]0.14[/td][td]0.26[/td][td]0.47[/td][td]0.76[/td][td]0.95[/td][td]1.00[/td][/tr]
[tr][td]GameTrailers[/td][td]641[/td][td]78[/td][td]0.00[/td][td]0.01[/td][td]0.05[/td][td]0.11[/td][td]0.24[/td][td]0.47[/td][td]0.88[/td][td]1.00[/td][/tr]
[tr][td]Giant Bomb[/td][td]506[/td][td]72[/td][td]0.02[/td][td]0.13[/td][td]0.13[/td][td]0.39[/td][td]0.39[/td][td]0.87[/td][td]0.87[/td][td]1.00[/td][/tr]
[tr][td]IGN[/td][td]1476[/td][td]75[/td][td]0.02[/td][td]0.04[/td][td]0.09[/td][td]0.19[/td][td]0.33[/td][td]0.59[/td][td]0.92[/td][td]1.00[/td][/tr]
[tr][td]Joystiq[/td][td]638[/td][td]73[/td][td]0.03[/td][td]0.10[/td][td]0.17[/td][td]0.29[/td][td]0.45[/td][td]0.73[/td][td]0.92[/td][td]1.00[/td][/tr]
[tr][td]PC Gamer[/td][td]347[/td][td]74[/td][td]0.01[/td][td]0.03[/td][td]0.09[/td][td]0.18[/td][td]0.33[/td][td]0.59[/td][td]0.94[/td][td]1.00[/td][/tr]
[tr][td]Polygon[/td][td]386[/td][td]71[/td][td]0.04[/td][td]0.08[/td][td]0.17[/td][td]0.27[/td][td]0.48[/td][td]0.72[/td][td]0.95[/td][td]1.00[/td][/tr]
[tr][td]The Escapist[/td][td]488[/td][td]74[/td][td]0.02[/td][td]0.07[/td][td]0.12[/td][td]0.28[/td][td]0.43[/td][td]0.77[/td][td]0.91[/td][td]1.00[/td][/tr]
[tr][td]VideoGamer[/td][td]672[/td][td]73[/td][td]0.03[/td][td]0.06[/td][td]0.12[/td][td]0.23[/td][td]0.48[/td][td]0.78[/td][td]0.97[/td][td]1.00[/td][/tr]
[/table]
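For the programmatically inclined, the cumulative probabilities in the table are nothing more exotic than the fraction of scores at or below each threshold. A minimal sketch, using the toy data from the CDF footnote above:
[code]
def cdf_at(scores, thresholds):
    """Fraction of scores less than or equal to each threshold."""
    n = len(scores)
    return {t: sum(s <= t for s in scores) / n for t in thresholds}

scores = [55, 70, 75, 90, 100]  # toy data from the footnote
print(cdf_at(scores, [30, 40, 50, 60, 70, 80, 90, 100]))
# {30: 0.0, 40: 0.0, 50: 0.0, 60: 0.2, 70: 0.4, 80: 0.6, 90: 0.8, 100: 1.0}
[/code]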
How to explain these findings? Has the sphere of professional game critics gone so thoroughly mad with corruption and fanboyism as to be incapable of delivering anything resembling an honest appraisal? I think not. At least not entirely, and allow me to explain why. First of all, there is a positive bias with respect to what quality of game even registers as a blip on the radar of reviewers. If it isn't a big studio release backed by marketing or an indie title blessed by IGF or IndieCade, it generally doesn't receive a mention, let alone a dedicated review. There isn't necessarily anything insidious about this state of affairs; I'd wager that at least a few critics regularly sample low-budget offerings only to be reminded of why they don't do so more often. Mind you, I don't assert that marketing buzz is an accurate predictor of game quality, only that the subset of games with enough traction to garner attention from reviewers is, statistically speaking, of above-average quality [footnote]Video Game Reviews: A Discussion Of The Ten-Point Scale And Inflated Scores - Forbes - Erik Kain - June 14, 2013
http://www.forbes.com/sites/erikkain/2013/06/14/how-video-game-reviews-work/
On our return to this article, I draw attention to a comment in which Erik Kain volunteers the following: "For instance, I tend to review games I want to play and play games that I think I will enjoy. So my scores tend to be a bit high, often hovering between 7 and 9 [out of 10] and rarely dropping to a 6 or below." Perhaps without fully appreciating it, Kain has supplied a partial answer to his own inquiries into why video game review scores are clustered at the top half of the scale. It isn't so much that reviewers only play games they think they'll enjoy, it's that they don't waste time formally reviewing games that their audiences won't know or care about. Pageviews pay the bills. Niche games don't attract nearly as many eyeballs as promoted titles, and they also happen to fall on the low end of the quality spectrum with greater regularity.[/footnote]. As support for this claim, consider the following graph of average critic rating (Metascore) as a function of the number of critics. The best linear fit to the data suggests that Metascore increases by an average of 11.7 points as the number of critic reviews goes from 4 (little attention) to 100 (massive attention) [footnote]The linear function is y = ax + b where y denotes score, x is the logarithm of the number of critic reviews, and a, b are the fitted parameters. The power function is y = cx^a + b where a, b, c are parameters. Whereas the linear function indicates an increase in average score from 66.5 to 78.2 (+11.7 points) as the number of reviews increases from 4 to 100, the power function produces a score increase from 65.6 to 76.0 (+10.4 points) over the same range. Some might consider 4 review scores a precariously low sample size, and you could certainly make the argument that it isn't in keeping with the perception of a Metascore as a consensus among a significant number of critics. To address this problem, I recomputed the best fits after discarding all observations with fewer than 10 critic reviews. The results showed a slightly stronger trend: going from 10 to 100 critic reviews yielded +12.3 points for the linear function and +10.6 points for the power function. Let's put things into perspective though - the data exhibits wide deviation around the fitted curves. The correlation coefficient between x and y is just 0.22 (or 0.24 if the 10 review minimum is enforced), which is a statistically significant positive correlation but by no means a dominant one. This is a good sign because it suggests that other factors (such as quality) are more closely connected to a game's overall score than the number of critics who deem it worthy of attention. Coming back around to the original point, in my opinion these results make a reasonable case for the aforementioned selection bias among games reviewers. But I imagine that one of the internet's myriad causation experts will be along to correct me shortly, arguing that all this really proves is professional games reviewers are under the influence of marketing hype if not outright bribery.[/footnote].
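For anyone wanting to replicate the fit described in the footnote, it's an ordinary least-squares fit of Metascore against the logarithm of the review count. A sketch under stated assumptions: the base-10 logarithm is my choice (the base only rescales the slope), and the sample points are fabricated stand-ins for the real dataset linked above.
[code]
import numpy as np

def fit_log_linear(n_reviews, scores):
    """Least-squares fit of score = a * log10(n_reviews) + b."""
    a, b = np.polyfit(np.log10(n_reviews), scores, 1)
    return a, b

# Fabricated (n_reviews, Metascore) pairs; substitute the real data.
n = np.array([4, 8, 15, 30, 60, 100])
y = np.array([66.0, 69.0, 70.0, 74.0, 76.0, 78.0])
a, b = fit_log_linear(n, y)
print(f"predicted score at 4 reviews  : {a * np.log10(4) + b:.1f}")
print(f"predicted score at 100 reviews: {a * np.log10(100) + b:.1f}")
[/code]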
Which brings me to a second, albeit related, point of discussion. A robust rating system can accommodate not only the good and the bad, but the extremes thereof. As a former dabbler in Microsoft XNA, a pastime that led to a minor obsession with XBLIG games, I discovered a degree of awful that simply can't be overcome by normalizing to a price point approaching zero. However, much like the stereotypical dumb jock who brings home a report card of failing grades, an overhyped video game featuring insipid gameplay in a drab universe still manages to answer half the questions correctly [footnote]My argument is straightforward: games that manage to succeed at the fundamentals - competent modelling and animation, functional game mechanics, controls that aren't unintuitive or horribly laggy, free of game-breaking bugs, price point in line with quality and quantity of gameplay - have done enough things correctly to warrant a score in the neighborhood of 50%. Even if the overall package isn't so attractive, I don't see a compelling reason to assign 1/5 or 3/10 just to satisfy some contrived quota of low scores to prove how 'honest' your opinions are. There's a useful distinction to be made between 'difficult to recommend' and 'complete and utter trash'. There isn't anything wrong with a scoring standard where 50% is regarded as a marker for acceptable product quality, as opposed to being treated as a target average score for the particular subset of (mostly high-end) products appraised by a reviewer. My general feeling is that game review scores are somewhat inflated at the moment, but perhaps only by 1 point out of 10 for most publications rather than the 2-2.5 points deduced by comparing current scoring averages to a 5/10 target. People who bleat about "average game quality" are invariably full of shit because they never elaborate on their personal selection criteria from which that nebulous average is derived.[/footnote]. The gamer who encounters a review score of 5/10 or 50/100 is no more deceived about the game's quality than the parents of the 50%-average student are deceived about his intellectual prowess, because both have a feel for the scale and don't need it excessively dumbed down into compartments labeled 'bad / fair / good'.
In conclusion, I can't help but think that certain game critics' fixation on compactness of scale is misguided. And I'm not the only one to recognize that trading a high-resolution scale for one with fewer notches doesn't immediately solve the problems inherent in judging video game quality [footnote]A Review Scoring System That Would Work - The Escapist - Ben 'Yahtzee' Croshaw - Feb 17, 2015
http://www.escapistmagazine.com/articles/view/video-games/columns/extra-punctuation/12989-Video-Game-Review-Scores-A-System-That-Would-Work
In this article, Ben Croshaw attempts to demonstrate the futility of game review scoring by emphasizing the subjective experience of the player. The introduction offers some commentary on the recent changes to Eurogamer's scoring system: "what Eurogamer doesn't seem to have realized is that it's not cured, it's just dropping from heroin to methadone. In the article, they state that they are switching from scores to awards. So rather than ditching scores entirely, they've merely switched from a 10-point system to a 4-point one. Essential, Recommended, Avoid, and presumably one that goes between the last two indicated by there being no award at all." Later, Croshaw argues that the multifaceted nature of games means they can't be adequately characterized by anything less than a detailed account of the individual experiences of 25 players, to be collected through a series of questionnaires filled out at regular intervals during the playing session. The reader would then decide which playtester(s) to heed based on the results of personality tests. Just when you're beginning to appreciate the article as a satirical take on the sort of game review system that might be devised by a drug-addled social sciences dropout from San Francisco, you're reminded that the author is trying to make a serious point against quantitative scoring. However, rather than supporting its conclusion of "perhaps alternatively you could just [ignore scores and] read the cunting review", the main text of the article contradicts it by showing exactly why no single reviewer ought to be trusted, least of all one who neglects to divulge the results of a mental status examination taken at the time of writing. Instead, the savvy game consumer must seek counsel from some 25 reviewers at minimum, which can be interpreted as a tacit endorsement of the following strategy: browse the aggregated review blurbs on Metacritic until you find something that resonates. I always suspected a measure of grudging support for review aggregation lurking beneath the tough-guy facade of the games press.[/footnote]. Major games review sites, operating under the guise of addressing the needs of their readers, are all too eager to trumpet their own special flavor of review format, apparently unaware that assigning different names to two or three rating categories doesn't suddenly make you innovative. Coupled with a growing aversion to the 10-point system, you might say that professional game critics are collectively struggling with their own variant of 'not invented here' syndrome, a phenomenon well known to those in the software industry. Barring a widespread outbreak of sanity, I'll just be sitting here waiting for the next super-duper-friend-of-consumer recommendation system to come along and caress my prefrontal cortex into submission with all the finesse of a day-one DLC offering.
http://www.eurogamer.net/articles/2015-02-10-eurogamer-has-dropped-review-scores
The article lays out Eurogamer's rationale for dropping the 10-point scoring system in favor of a 3-level 'Essential / Recommended / Avoid' system. Says Oli Welsh: "This hasn't been the first time we've discussed dropping review scores. In the past, the case we've made for it internally has always been that a number is a very reductive way to represent a nuanced, subjective opinion, and that the arguments started by scores aren't productive." Welsh goes on to state: "The counter-argument was simple but powerful: as an at-a-glance guide to what we think, scores were very useful to readers. We no longer think that's true. In the present environment, scores are struggling to encompass the issues that are most important to you." In summary: "Scores are failing us, they're failing you, and perhaps most importantly, they are failing to fairly represent the games themselves." For all the talk in the article about how modern games are continuously evolving through patches and feature updates, and how this has made reviewing games more difficult, it's not made clear exactly how the proposed 3-level system is better equipped to handle these challenges. Eurogamer's manifesto becomes even more puzzling when it's revealed that the proposed rating system will only be applied to selected games, and that, barring extraordinary circumstances, they will not update reviews as games evolve over time. [/footnote], as did Joystiq shortly before closing its doors [footnote]Joystiq isn't scoring reviews anymore, and here's why - Joystiq - Richard Mitchell - Jan 13, 2015
http://www.joystiq.com/2015/01/13/joystiq-isnt-scoring-reviews-anymore-and-heres-why/
Here Joystiq explains their decision to stop scoring reviews. Unfortunately, they did not get the opportunity to gauge the decision's impact as the site was shuttered by AOL just two weeks later. Mitchell states: "The very purpose of a score is to define something entirely nebulous and subjective ? fun ? as narrowly as possible. The problem is that narrowing down something as broad and fluid as a video game isn't truly useful, especially in today's industry. Between pre-release reviews, post-release patching, online connectivity, server stability and myriad other unforeseeable possibilities, attaching a concrete score to a new game just isn't practical. More importantly, it's not helpful to our readers." Later in the article it is claimed that Joystiq felt compelled to modify their original five star rating system to better align with the typical distribution of scores on Metacritic, an apparent "capitulation to the industry" they weren't happy with. However, as the only changes implemented were to start using half-stars and to add a half-star to a few old scores, it's not entirely convincing that there was ever any serious incompatibility between Joystiq's rating scale and that of the typical reviewer listed on Metacritic. [/footnote]. Chief among the complaints about review scores are that they discourage meaningful discussion of games, and that they aren't sufficiently nuanced to be an effective tool for reviewers [footnote]The Spotty Death and Eternal Life of Gaming Review Scores - Ars Technica - Kyle Orland - Feb 15, 2015
http://arstechnica.com/gaming/2015/02/the-spotty-death-and-eternal-life-of-gaming-review-scores/
The article contains some comments from Jason Schreier of Kotaku regarding his opinion of game review scores. Says Schreier: "When I read through the comments on an IGN review, for example, all I see is people talking about the score ... compare that to, say, comments on an [unscored] review from Kotaku or Rock Paper Shotgun, and it's night and day". Schreier elaborates further: "Scores strip the nuance from video game criticism, forcing reviewers to stuff games into neat little boxes labeled 'good' and 'bad'". Well, it's perhaps not surprising that Schreier believes he and his colleagues at Kotaku are cultivating a superior audience, but it remains unclear as to whether reality subscribes to the same theory. What's considerably more likely is that reality maintains a subscription to the eponymous subreddit known for mocking the outlet's tawdry editorials.[/footnote]. Other industry professionals prefer to direct their ire towards the practice of review score aggregation on websites such as Metacritic and GameRankings. Have a listen to old Sessler rave about how evil Metacritic is tearing the industry apart [footnote]Adam Sessler's rant about Metacritic at GDC 2009 - Youtube video - Simon LeMonkey - Mar 28, 2009
https://www.youtube.com/watch?v=0QsXrswJ-yM#t=25s
This video is of Adam Sessler giving a talk at the Game Developers Conference (GDC) 2009. By way of introduction, Sessler proclaims: "Fuck Metacritic. Who the hell made you king?" Sessler relates an anecdote of a developer he was acquainted with approaching him, upset that Sessler's 2/5 rating for his game had been translated to a score of 40/100 on Metacritic. He later clarifies: "It's just kind of odious when we in the press are seeing our work retranslated and recalibrated ... where we're really not claiming ownership and suddenly there's this number attached to it." At this point, I'm wondering in what alternative universe has a 2/5 rating ever been indicative of anything other than a steamy mound of canine feces? It should go without saying that a straight multiplicative scaling of 5-point or 10-point scores to a 100-point score is the most natural method of conversion. The talk also touches on the issue of publishers compensating developers based on Metacritic scores. Here Sessler displays an astounding proficiency at mental gymnastics when he suggests that professional reviewers should not in any way be held responsible for poor game sales, but that Metacritic (an aggregation of professional reviews) should definitely be held responsible: "You know what? If it's a good game and you know it's a good game, but it doesn't sell well, go talk to your marketing staff, all right? Don't put it on us [game critics]. I'm sorry this has happened to you [game developers]. I want to put a stop to it. Maybe somehow we'll all get together, we'll march down to CNET [Metacritic's owner], we'll flip 'em the bird, and maybe somebody in that building will take a long hard look at something they're putting online that is odious, pernicious and needs to stop now."[/footnote]. TotalBiscuit also isn't a fan, to put it mildly [footnote]Content Patch : January 3rd, 2013 - Ep. 025 [Tomb Raider, Gamestick, Elite] - Youtube video - TotalBiscuit, The Cynical Brit - Jan 3, 2013
https://www.youtube.com/watch?v=mQqzqHgvB90#t=521s
Says TotalBiscuit: "It's no secret that I'm very much a critic of Metacritic. I very much dislike the notion of the website, and I very much dislike what it has done to the industry". The commentary devolves into a meandering rant containing several dubious arguments, such as (1) Metacritic is lambasted for refusing to remove a GameSpot review score after the reviewers later admitted to doing a terrible job, (2) the use of background colors to improve readability of review score text is criticized as pandering to lazy dimwits, (3) Metacritic is accused of operating on a "business model of manipulating ratings" because they dare to translate review scores of 10/10 or A+ to 100/100. The single coherent point of the tirade revolves around Metacritic's lack of transparency in the way critic scores are weighted to determine a game's Metascore, a legitimate concern that has been raised elsewhere. However, it's hard to take this too seriously when the sinister allegations of targeted rating manipulation that follow aren't backed by any evidence or even reasonable suspicion. TotalBiscuit concludes with a strongly worded condemnation: "it just proves once again that Metacritic, in its current form, is a dangerously useless site that is actively stomping around like a bull in a china shop when it comes down to game and media reviews. And its limited usefulness is not enough to counteract the potential damage that Metacritic could actually do to the industry and is continuing to do to this very day."[/footnote].
On the other side of the table, plenty of game consumers don't see a problem with including scores as a component of written reviews. Aggregated ratings pages are viewed as as a helpful, if not completely definitive, source of information about present and past titles. Games released with serious technical deficiencies will almost certainly find themselves on the business end of a numerical beatdown from critics and users alike. In fact, if attempts at coercion and bribery by publishers are even half as pervasive as games journalists tell us, turning to aggregate ratings for a reliable appraisal of game quality is a perfectly sensible course of action [footnote]The basic premise is that a minority of deliberately inflated (or deflated) review scores will not have a great impact on the overall average score. Fortunately, the premise still holds when outliers are merely a result of overzealousness rather than outright dishonesty. [/footnote]. I've also seen many comments on discussion boards that boil down to "find a critic that works for you", which suggests it isn't so much a matter of the particular review format being employed as it is the tastes of the reviewer. Currently, it seems there is still some degree of support for review scores among game critics [footnote]Review Score Guide - The Jimquisition - Jim Sterling - Nov 22, 2014
http://www.thejimquisition.com/review-score-guide/
This guide was posted shortly after Jim Sterling left The Escapist to become an independent game critic funded directly by readers. There isn't anything especially remarkable here, other than the fact that Sterling elects to continue using review scores of his own accord: "Some people don't like review scores, and do not want to see them in reviews ? that is okay! Scores here are subtle in their application, casually included at the end of the review, and you can always ignore it if you don't think it's useful. I personally like using scores, and intend to continue doing so until such time as I don't." [/footnote] [footnote]The Official Destructoid Review Guide - Destructoid - Chris Carter - June 16, 2011
http://www.destructoid.com/the-official-destructoid-review-guide-2011-203909.phtml
This article explains Destructoid's system of review scoring. Yanier 'Niero' Gonzalez, founder of Destructoid, is quoted as saying: "Ad companies we've worked with have called us crazy for publishing scores. It really is like deciding to go to war. The only reason a site does not publish review scores is to sell more advertising. We have lost ad campaigns because we've given bad review scores, and frankly my dear, I don't give a damn. I'm not compromising our voice. Still, we understand the danger of a bad score. For example, some publishers giving their employees pay cuts due to scores, but in that case we push it back on them. It's not our fault you choose this method to compensate your employees. Grow a backbone, stand behind your work, make better games, and stop blaming the gaming press for having an honest opinion." Blunt and logical. A sincere defense of the review score is a rare sight to behold. [/footnote], while for others it comes across as a calculated business decision [footnote]The Spotty Death and Eternal Life of Gaming Review Scores - Ars Technica - Kyle Orland - Feb 15, 2015
http://arstechnica.com/gaming/2015/02/the-spotty-death-and-eternal-life-of-gaming-review-scores/
The article contains some comments from Arthur Gies of Polygon regarding his opinion of game review scores. Says Gies: "The anecdotal accounts and experiences we had suggested that readers want them [scores], whether they admit to it or not", adding that "I think there are people who are interested in arguing numbers and people who are more interested in discussing points raised in review text, and that neither are mutually exclusive." Damn Arthur, could you be any less enthusiastic about it? Not even a respectful nod towards the numeric tramp stamp you brandished to hit back at that pixelated perpetrator of psychosexual trauma known as Bayonetta? [/footnote].
So what exactly is the problem with review scores? I find the artsy-fartsy answer of "video games, like any form of art, are far too complex for their merits to be reduced to a number" less than convincing [footnote]Listening to some of the unabashed arrogance on display from professional games journalists over the past six months, both on forums and in the Twitterverse, one begins to wonder if it isn't in fact their creations we're expected to regard with the sort of reverence usually reserved for fine art. And there's more than a sneaking suspicion that certain critics, unable to conceal a stunning lack of awareness of their own position on the food chain of gaming content production, are deeply resentful of Metacritic for continuing to exist on the back of their hard work. As Escapist reader Thanatos2K [http://www.escapistmagazine.com/forums/read/6.870766-A-Review-Scoring-System-That-Would-Work?page=2#21827140] aptly puts it: "The real reason reviewers try to argue that "averages are useless" and "averages flatten out stuff" is that they're afraid of what averages really do - render them just a voice in a sea of other reviews, all equal. They don't want to be equal though. They want their review to mean more than everyone else's. They think they know better after all, and they want you to read THEIR review (and please click through to our site while you're at it). When I can see 50 reviews at a glance, their scores, and the average of their scores, I only need to fully read a few reviews to get the gist of why the scores fell where they did. This is poison to reviewer ego."[/footnote]. A more cynical view is that taking a non-committal approach to reviews merely serves as a way for professional critics to alleviate themselves of the embarrassment of recommending the next big stinker, or perhaps as a way to make criticism more palatable to site advertisers with a vested interest in game sales [footnote]See reference [8].[/footnote]. This isn't quite right though; many game reviewers adopt an even more committal approach than writing down a number, as explained below.
I think it's misguided to point fingers at neutral aggregators like Metacritic, but I can understand why some eyebrows were raised at the discovery that game developer bonuses had, at least in one instance [footnote]Obsidian missed Fallout: New Vegas Metacritic bonus by one point - Joystiq - Ben Gilbert - Mar 15, 2012
http://www.joystiq.com/2012/03/15/obsidian-missed-fallout-new-vegas-metacritic-bonus-by-one-point/
A report on Fallout: New Vegas developer Obsidian missing out on royalties because the game achieved a (critic) Metascore of 84 instead of 85 or higher. This actually seems like a generous score considering the reviews of the PC version make universal reference to a buggy experience and gameplay all too similar to its predecessor. Curiously, even though Joystiq themselves only awarded the Xbox 360 version of the game a score of 70 (tied for 4th lowest out of 81 ratings), they can't resist taking a swing at Metacritic for giving smaller outlets a seat at the table: "Leaving aside the fact that Metacritic is a woefully unbalanced aggregation of review scores from both vetted and unvetted publications, agreements like this can leave indie studios -- like Obsidian -- in the lurch should that Metacritic score just barely miss the mark." Sorry Ben Gilbert, but the gaseous emissions of major gaming publications aren't quite as fragrant as you seem to think.[/footnote], been tied to critic review scores. What I find indefensible is the idea that one can take a principled stand against review scores while at the same time being perfectly happy to issue binary recommendations of the form 'buy / don't buy' or 'play / don't play'. Kotaku, Ars Technica, and now Eurogamer explicitly engage in this practice, to say nothing of the propensity of unscored reviews to all but club the reader over the head with a final verdict. The reason I consider this position absurd is simple: a score conveys the relative weight of the pros and cons discussed by a reviewer more precisely than a binary or ternary assessment can. That is to say, if we gauge various forms of review by the amount of information they convey, the 'yes / no' recommendation ranks at the bottom. It's about as nuanced as a chainsaw-wielding medical intern who complains that the surgical instruments aren't sharp enough.
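To put a rough number on 'amount of information': a verdict drawn from k distinct, equally likely outcomes can carry at most log2(k) bits. Here's a minimal back-of-the-envelope sketch (the format names and outcome counts are my own illustrative choices; real-world scores aren't uniformly distributed, so the actual information conveyed is lower across the board):
[code]
# Upper bound on the information (in bits) conveyed by each verdict format:
# log2 of the number of distinct outcomes the format can express.
import math

formats = [
    ("yes / no", 2),                          # binary recommendation
    ("Essential / Recommended / Avoid", 3),   # ternary verdict
    ("4-star, no half-stars", 4),
    ("10-point", 10),
    ("100-point", 101),                       # 0..100 inclusive
]

for name, outcomes in formats:
    print(f"{name:35s} {math.log2(outcomes):5.2f} bits")
[/code]
Even by this crude yardstick, the binary verdict sits at the bottom of the pile with a single bit, while a 10-point score carries more than three times as much.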

Meanwhile, back in the camp where nuance isn't merely a philosophical construct for peddling opinions that are anything but, there are those who aren't as dogmatically opposed to numerical ratings but feel as though the scoring system is flawed. The main point of contention is that scores on a 10-point or 100-point scale are artificially skewed towards the 70%-100% region [footnote]Video Game Reviews: A Discussion Of The Ten-Point Scale And Inflated Scores - Forbes - Erik Kain - June 14, 2013
http://www.forbes.com/sites/erikkain/2013/06/14/how-video-game-reviews-work/
In this article, Erik Kain begins by explaining how the compression of school grades into the high end of the percentage scale is mirrored by video game review scores: "First of all, the 10-point scale is deceptive. Here's what I mean by that: First, take the numbers 1-10 and graft them over to the traditional letter-grading we use at school. There are just five letter grades [which] translate roughly to A = 90-100%, B = 80-89%, C = 70-79%, D = 60-69%, F = 00-59%. Only truly awful grades would get an F even though F comprises 59% of the total scale ... the same is true with video game reviews. Only truly awful games are given an F while most games fall somewhere between a 7 and a 9." Later Kain reveals his personal preference for a 3-tier rating system ("Buy / Hold / Sell"), stating: "To me, only two scores count: ones above 9 and ones below 7. This indicates something that might be special on the one hand, and something that might be truly terrible on the other or at the very least not worth buying. Everything else just means it's okay-to-good with a margin of error based on personal taste."[/footnote] [footnote]Review Score Guide - The Jimquisition - Jim Sterling - Nov 22, 2014
http://www.thejimquisition.com/review-score-guide/
Here Jim Sterling advocates the full range of the 10-point scale: "In my prior work at Destructoid, I always aimed to use the full ten-point scale, rather than simply the higher end of it. There's a popular belief that reviews are rated from 7-10 by major outlets, instead of 1-10, and while that's an exaggeration, I certainly feel more publications could stand to utilize all the numbers a bit more readily." Sterling also recommends the use of half-points on the 10-point scale, stating "[it's] useful to have that bit of wiggle room". The post goes on to describe the 10-point system employed by The Jimquisition.[/footnote], and any game rated below 70% is more likely to be found in a GameStop bargain bin than a console disc tray. Thus, there is a perceived incompatibility with the traditional 4-star system in which scores are (presumably) more evenly distributed across the scale, and where 2/4 truly does stand for average quality [footnote]#GamerGate Wants Objective Video Game Reviews: What Would Roger Ebert Do? - Forbes - Erik Kain - Dec 28, 2014
http://www.forbes.com/sites/erikkain/2014/12/28/gamergate-wants-objective-video-game-reviews-what-would-roger-ebert-do/
After dismissing various complaints made by Gamergate as "paranoid" and "silly", Kain moves on to the general topic of objectivity in game reviews, saying: "readers need to accept that each critic will weight his or her review differently, and that the search for the 'objective' reviewer is futile. A reviewer who ignores politics or gender issues in their review entirely is simply biased in another direction. Balance is crucial." I'd argue that leaving politics and gender issues out of reviews entirely is a far cry from hamfisting them to the forefront in fictional works where they aren't remotely a main theme. Citing famed movie critic Roger Ebert's review of The Last Boy Scout as the epitome of balance, Kain states: "Part of the problem may be our scoring system for video games. There's something about the four-star system that's simpler and more honest than a ten point scale. Gone are the weird decimals. Gone is the tendency to weight scores toward the upper end of the scale. A great movie or game simply gets four stars. A good movie or game gets three. A mediocre movie or game gets two. And a bad movie or game gets one. It's nice and tidy, and it allows reviewers to give a 'good' review score to a good game while still criticizing its less savory aspects, much as Ebert does with The Last Boy Scout." I find this reasoning extremely flimsy. The 10-point scale allows equivalent penalties to be levied for "less savory aspects" while affording greater flexibility in just how large that penalty should be. Flexibility ought to be a welcome ally to the "it's just a subjective opinion, no worse than any other" crowd, a group that many game critics count themselves as proud members of and which Kain seems intent on joining.[/footnote].
Well, if this 'tight scale makes for honest scores' argument holds water, surely we should be able to find some evidence to support it. The problem is that video game reviewers who employ a strict 4-star or 5-star grading system (strict = no half-stars) are a rare commodity. Among sites that could be classified as well-established, I count only Giant Bomb. So, while I'll grant that a single data point doesn't prove the general case, this particular data point might be viewed as an important one by those familiar with the origins of Giant Bomb [footnote]Jeff Gerstmann Explains His Departure From Gamespot - The Escapist - Earnest 'Nex' Cavalli - Mar 15, 2012
http://www.escapistmagazine.com/news/view/116360-Jeff-Gerstmann-Explains-His-Departure-From-Gamespot
A synopsis of the infamous dismissal of Jeff Gerstmann from GameSpot in 2007 and the subsequent formation of Giant Bomb. The conflict between Gerstmann and GameSpot management arose primarily over a low review score he awarded to Kane & Lynch: Dead Men, a game that was being heavily advertised on the site. Once the details of the affair became known, Gerstmann was lauded for refusing to cave to pressure from advertisers and he became something of a symbol of ethical games journalism. Ironically, Giant Bomb was later sold to CBS Interactive, the parent company of GameSpot. More recently, Gerstmann's spotless reputation has been called into question for admitting indie game marketing baronesses onto Giant Bomb to hawk their wares under the pretext of 'Top 10 Games' lists. [/footnote]. Illustrated in the bar plot below is the probability distribution of review scores for Giant Bomb and The Escapist based on console and PC games reviewed over the past five years [footnote]A collection of data for the analysis of video game ratings - Blog post - Slandered Gamer - Dec 30, 2014
http://slanderedgamer.blogspot.com/2014/12/a-collection-of-data-for-analysis-of.html
This blog post details a software application for downloading and viewing Metacritic game reviews. It also provides a sizeable collection of review data. The collection of data used in this article includes all console and PC games reviewed by either Polygon, Joystiq, Giant Bomb or The Escapist from January 2010 to mid-December 2014. Mobile and handheld titles were excluded completely. Why does this matter? Well, I suspect this selection of games slightly inflates the score statistics of other publications, i.e., anyone who isn't Polygon, Joystiq, Giant Bomb or The Escapist, by discounting some of the lesser-known (and lower-scoring) titles they've reviewed. This is because game reviewers tend to cover all the same major titles while randomly picking and choosing among minor titles with less overlap. However, disparities between statistics given here and those listed on Metacritic - typically 1 to 5 points in average score, for example - are also influenced by the exclusion of mobile and handheld games as well as any games released prior to 2010. These exclusions were viewed as desirable in order to obtain a selection of games that is both relevant and recent.[/footnote]. Note that Metacritic uses proportional scaling to convert to a 100-point score (e.g. 3/5 = 60/100, 9/10 = 90/100). Rather than the vaunted uniform distribution across the scale, Giant Bomb's scores are about as heavily concentrated towards the high end as The Escapist's. The average ratings of 72 and 74 are also very similar. Even if, in a foolish attempt to appease the "five stars doesn't translate to a perfect score!!!" mouth breathers, we shift the score distributions left by a half-interval (subtract 10 points for Giant Bomb and 5 points for Escapist), what remains is that Giant Bomb rates 61% of games above the midpoint of its scale (3/5 stars) while just 13% fall below this mark.
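For concreteness, here is a minimal sketch of the proportional scaling and half-interval shift just described; the star ratings in it are hypothetical stand-ins rather than Giant Bomb's actual data:
[code]
# Metacritic-style proportional scaling plus the half-interval shift
# discussed above. The star ratings here are hypothetical stand-ins.

def to_100_point(rating, scale_max):
    """Proportional conversion, e.g. 3/5 -> 60, 9/10 -> 90."""
    return rating / scale_max * 100

star_ratings = [3, 4, 5, 2, 4, 3, 5, 4]        # made-up 5-star review scores
scores = [to_100_point(r, 5) for r in star_ratings]

above = sum(s > 60 for s in scores) / len(scores)
below = sum(s < 60 for s in scores) / len(scores)
print(f"above 3/5 midpoint: {above:.0%}, below: {below:.0%}")

# Half of one interval on a 5-star scale is 10 points (5 points on a
# 10-point scale), so the pessimistic reading shifts everything down by 10.
shifted = [s - 10 for s in scores]
print("shifted scores:", shifted)
[/code]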

There's no denying the fact that video game ratings lean heavily towards the upper end of the scale. You need only browse the publication lists on Metacritic to discover an industry-wide scoring average currently sitting at 74, a figure that the vast majority of individual publications fall within +/-10 points of. Compare that to an overall average score of 62 for movies, 65 for television and 72 for music.
To gain a better appreciation of the status quo, the table below contains a statistical summary of review scores from some prominent gaming critics for a large selection of console and PC titles [footnote]See reference [17].[/footnote]. Average scores are in the range of 68-78. The cumulative probability data conveys the distribution of each critic's review scores across the scale [footnote]If you've never heard of a cumulative distribution function (CDF) before, here is a brief explanation sufficient for our purposes. I have a bunch of review scores for different games, say {55, 70, 75, 90, 100}. If you name a particular score threshold you're interested in, I can calculate what fraction of my scores are less than or equal to that threshold. For example, if you say 70, I count 2 out of 5 scores that are less than or equal to 70, which gives a cumulative probability of 2/5 = 0.4. If you then ask about 85, I find that 3 of my 5 scores are less than or equal to 85, giving a cumulative probability of 3/5 = 0.6. The nice thing about this system is that it allows us to efficiently summarize thousands of scores by calculating the cumulative probability at a small number of preselected thresholds, for example at 30, 40, 50, 60, 70, 80, 90, 100, as in the table below (a short code sketch after the table reproduces this calculation).[/footnote]. Some useful observations can be made. For example, the probability of scoring a game at 70 or lower is between 0.33 and 0.52 for most publications, but it goes as low as 0.24 (Game Informer and GameTrailers) and as high as 0.65 (Edge Magazine). The probability of scoring at 50 or below is typically between 0.09 and 0.17, but this can be as low as 0.05 in the most extreme case. Perhaps most alarming is the inclination of certain critics to use the 81-100 region of the scale for half of all games they rate (take 1.0 minus the cumulative probability value in the 80 column), whereas most gamers would agree that 81-100 territory should be reserved for truly top-notch efforts [footnote]Downfall of Gaming Journalism #9: GAME INFORMER - Youtube video - The Rageaholic (RazorFist) - Feb 15, 2015
https://www.youtube.com/watch?v=pss0hJkmLBA
I doubt the creator of this video would think much of the current article as he unequivocally condemns video game rating inflation instead of seeking rationale for it. But if we agree on one thing, it's who to point the finger at when the worst offenders are lined up. Says RazorFist: "Polygon, Kotaku and Rock Paper Shotgun didn't just wake up one day and decide "hey, let's be a **** lapping cabal of bought out bitches." ... Long before there were URLs and mailing lists, there were SKUs and mailing lists. Print motherfucking journalism, folks. And no institution is more steeped in, or emblematic of, the omnipresent orgy of corruption in gaming journalism than fucking Game Informer." Among various bombs unloaded during this blistering rant, an interesting theory is put forward concerning a tipping point in the history of game review scoring: "I'm of the opinion that Game Informer almost single-handedly skewed review scores across all websites and publications for all time ... I hold an issue [of Game Informer] in my hand from 2009 - far from a banner year for gaming - that in 25 reviews boasts not one that ranks below a 5.5 [out of 10]". RazorFist later adds: "A bad game isn't a 6 or a 7 you colluding **** flaps [reviewers], a bad game is a 1 or a 2.", going on to enumerate the many faults of the December 2009 edition of Game Informer magazine.[/footnote]. All told, the results serve as further confirmation that nearly all of the action takes place in the top half of the scale.
[table border="1" width="600"]
[tr][td rowspan="2"]Critic[/td][td rowspan="2"]Samples[/td][td rowspan="2"]Average[/td][td colspan="8" align="center"]Cumulative probability at score:[/td][/tr]
[tr][td]30[/td][td]40[/td][td]50[/td][td]60[/td][td]70[/td][td]80[/td][td]90[/td][td]100[/td][/tr]
[tr][td]Destructoid[/td][td]747[/td][td]74[/td][td]0.04[/td][td]0.08[/td][td]0.13[/td][td]0.20[/td][td]0.37[/td][td]0.63[/td][td]0.90[/td][td]1.00[/td][/tr]
[tr][td]Edge Magazine[/td][td]548[/td][td]68[/td][td]0.04[/td][td]0.08[/td][td]0.19[/td][td]0.40[/td][td]0.65[/td][td]0.88[/td][td]0.99[/td][td]1.00[/td][/tr]
[tr][td]Eurogamer[/td][td]767[/td][td]71[/td][td]0.04[/td][td]0.09[/td][td]0.16[/td][td]0.29[/td][td]0.52[/td][td]0.79[/td][td]0.96[/td][td]1.00[/td][/tr]
[tr][td]Game Informer[/td][td]1090[/td][td]78[/td][td]0.01[/td][td]0.02[/td][td]0.06[/td][td]0.13[/td][td]0.24[/td][td]0.54[/td][td]0.89[/td][td]1.00[/td][/tr]
[tr][td]GameSpot[/td][td]1334[/td][td]72[/td][td]0.01[/td][td]0.06[/td][td]0.12[/td][td]0.24[/td][td]0.45[/td][td]0.79[/td][td]0.99[/td][td]1.00[/td][/tr]
[tr][td]GamesRadar[/td][td]746[/td][td]73[/td][td]0.03[/td][td]0.06[/td][td]0.14[/td][td]0.26[/td][td]0.47[/td][td]0.76[/td][td]0.95[/td][td]1.00[/td][/tr]
[tr][td]GameTrailers[/td][td]641[/td][td]78[/td][td]0.00[/td][td]0.01[/td][td]0.05[/td][td]0.11[/td][td]0.24[/td][td]0.47[/td][td]0.88[/td][td]1.00[/td][/tr]
[tr][td]Giant Bomb[/td][td]506[/td][td]72[/td][td]0.02[/td][td]0.13[/td][td]0.13[/td][td]0.39[/td][td]0.39[/td][td]0.87[/td][td]0.87[/td][td]1.00[/td][/tr]
[tr][td]IGN[/td][td]1476[/td][td]75[/td][td]0.02[/td][td]0.04[/td][td]0.09[/td][td]0.19[/td][td]0.33[/td][td]0.59[/td][td]0.92[/td][td]1.00[/td][/tr]
[tr][td]Joystiq[/td][td]638[/td][td]73[/td][td]0.03[/td][td]0.10[/td][td]0.17[/td][td]0.29[/td][td]0.45[/td][td]0.73[/td][td]0.92[/td][td]1.00[/td][/tr]
[tr][td]PC Gamer[/td][td]347[/td][td]74[/td][td]0.01[/td][td]0.03[/td][td]0.09[/td][td]0.18[/td][td]0.33[/td][td]0.59[/td][td]0.94[/td][td]1.00[/td][/tr]
[tr][td]Polygon[/td][td]386[/td][td]71[/td][td]0.04[/td][td]0.08[/td][td]0.17[/td][td]0.27[/td][td]0.48[/td][td]0.72[/td][td]0.95[/td][td]1.00[/td][/tr]
[tr][td]The Escapist[/td][td]488[/td][td]74[/td][td]0.02[/td][td]0.07[/td][td]0.12[/td][td]0.28[/td][td]0.43[/td][td]0.77[/td][td]0.91[/td][td]1.00[/td][/tr]
[tr][td]VideoGamer[/td][td]672[/td][td]73[/td][td]0.03[/td][td]0.06[/td][td]0.12[/td][td]0.23[/td][td]0.48[/td][td]0.78[/td][td]0.97[/td][td]1.00[/td][/tr]
[/table]
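As a sanity check on the cumulative probability columns, the calculation from the CDF footnote takes only a few lines of code; this sketch uses the footnote's toy set of five scores:
[code]
# Reproducing the cumulative probability calculation from the CDF footnote.
scores = [55, 70, 75, 90, 100]                 # toy scores from the footnote
thresholds = [30, 40, 50, 60, 70, 80, 90, 100] # same columns as the table

for t in thresholds:
    cdf = sum(s <= t for s in scores) / len(scores)
    print(f"P(score <= {t:3d}) = {cdf:.2f}")
# At threshold 70 this prints 0.40, matching the footnote's worked example.
[/code]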
How to explain these findings? Has the sphere of professional game critics gone so thoroughly mad with corruption and fanboyism as to be incapable of delivering anything resembling an honest appraisal? I think not. At least not entirely, and allow me to explain why. First of all, there is a positive bias with respect to what quality of game even registers as a blip on the radar of reviewers. If it isn't a big studio release backed by marketing or an indie title blessed by IGF or IndieCade, it generally doesn't receive a mention let alone a dedicated review. There isn't necessarily anything insidious about this state of affairs; I'd wager that at least a few critics regularly sample low budget offerings only to be reminded of why they don't more often. Mind you, I don't assert that marketing buzz is an accurate predictor of game quality, only that the subset of games with enough traction to garner attention from reviewers is, statistically speaking, of above average quality [footnote]Video Game Reviews: A Discussion Of The Ten-Point Scale And Inflated Scores - Forbes - Erik Kain - June 14, 2013
http://www.forbes.com/sites/erikkain/2013/06/14/how-video-game-reviews-work/
On our return to this article, I draw attention to a comment in which Erik Kain volunteers the following: "For instance, I tend to review games I want to play and play games that I think I will enjoy. So my scores tend to be a bit high, often hovering between 7 and 9 [out of 10] and rarely dropping to a 6 or below." Perhaps without fully appreciating it, Kain has supplied a partial answer to his own inquiries into why video game review scores are clustered at the top half of the scale. It isn't so much that reviewers only play games they think they'll enjoy, it's that they don't waste time formally reviewing games that their audiences won't know or care about. Pageviews pay the bills. Niche games don't attract nearly as many eyeballs as promoted titles, and they also happen to fall on the low end of the quality spectrum with greater regularity.[/footnote]. As support for this claim, consider the following graph of average critic rating (Metascore) as a function of the number of critics. The best linear fit to the data suggests that Metascore increases by an average of 11.7 points as the number of critic reviews goes from 4 (little attention) to 100 (massive attention) [footnote]The linear function is y = ax + b where y denotes score, x is the logarithm of the number of critic reviews, and a, b are the fitted parameters. The power function is y = cx^a + b where a, b, c are parameters. Whereas the linear function indicates an increase in average score from 66.5 to 78.2 (+11.7 points) as the number of reviews increases from 4 to 100, the power function produces a score increase from 65.6 to 76.0 (+10.4 points) over the same range. Some might consider 4 review scores a precariously low sample size, and you could certainly make the argument that it isn't in keeping with the perception of a Metascore as a consensus among a significant number of critics. To address this problem, I recomputed the best fits after discarding all observations with fewer than 10 critic reviews. The results showed a slightly stronger trend: going from 10 to 100 critic reviews yielded +12.3 points for the linear function and +10.6 points for the power function. Let's put things into perspective though - the data exhibits wide deviation around the fitted curves. The correlation coefficient between x and y is just 0.22 (or 0.24 if the 10 review minimum is enforced), which is a statistically significant positive correlation but by no means a dominant one. This is a good sign because it suggests that other factors (such as quality) are more closely connected to a game's overall score than the number of critics who deem it worthy of attention. Coming back around to the original point, in my opinion these results make a reasonable case for the aforementioned selection bias among games reviewers. But I imagine that one of the internet's myriad causation experts will be along to correct me shortly, arguing that all this really proves is professional games reviewers are under the influence of marketing hype if not outright bribery.[/footnote].
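For anyone who wants to replicate the fits described in the footnote, here is a rough sketch; the two data arrays are fabricated placeholders (the real observations come from the data collection in reference [17]), and I'm assuming the power fit takes the raw review count as x:
[code]
# Rough sketch of the two fits described above: a linear fit in the log of
# the review count, and a power-law fit. The sample data is fabricated.
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import pearsonr

n_reviews = np.array([4, 8, 15, 25, 40, 60, 80, 100])    # hypothetical
metascore = np.array([66, 68, 70, 72, 73, 75, 77, 78])   # hypothetical

# Linear fit with x = log10(review count): y = a*x + b
x = np.log10(n_reviews)
a, b = np.polyfit(x, metascore, 1)
rise = a * (np.log10(100) - np.log10(4))
print(f"linear: y = {a:.2f}*log10(n) + {b:.2f}; 4 -> 100 reviews: {rise:+.1f} points")

# Power fit: y = c*n^p + d (assuming raw review count as the x variable)
def power_fn(n, c, p, d):
    return c * n**p + d

(c, p, d), _ = curve_fit(power_fn, n_reviews, metascore,
                         p0=(1.0, 0.5, 60.0), maxfev=5000)
print(f"power: y = {c:.2f}*n^{p:.2f} + {d:.2f}")

# Correlation between log(review count) and score. On this tidy fake data
# it comes out near 1.0; the article's real, noisier data gives about 0.22.
r, _ = pearsonr(x, metascore)
print(f"correlation coefficient: {r:.2f}")
[/code]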

Which brings me to a second, albeit related, point of discussion. A robust rating system can accommodate not only the good and the bad, but the extremes thereof. As a former dabbler in Microsoft XNA, a hobby that led to a minor obsession with XBLIG games, I discovered a degree of awful that simply can't be overcome by normalizing to a price point approaching zero. However, much like the stereotypical dumb jock who brings home a report card of failing grades, an overhyped video game featuring insipid gameplay in a drab universe still manages to answer half the questions correctly [footnote]My argument is straightforward: games that manage to succeed at the fundamentals - competent modelling and animation, functional game mechanics, controls that aren't unintuitive or horribly laggy, free of game-breaking bugs, price point in line with quality and quantity of gameplay - have done enough things correctly to warrant a score in the neighborhood of 50%. Even if the overall package isn't so attractive, I don't see a compelling reason to assign 1/5 or 3/10 just to satisfy some contrived quota of low scores to prove how 'honest' your opinions are. There's a useful distinction to be made between 'difficult to recommend' and 'complete and utter trash'. There isn't anything wrong with a scoring standard where 50% is regarded as a marker for acceptable product quality, as opposed to being treated as a target average score for the particular subset of (mostly high-end) products appraised by a reviewer. My general feeling is that game review scores are somewhat inflated at the moment, but perhaps only by 1 point out of 10 for most publications rather than the 2-2.5 points deduced by comparing current scoring averages to a 5/10 target. People who bleat about "average game quality" are invariably full of shit because they never elaborate on their personal selection criteria from which that nebulous average is derived.[/footnote]. The gamer who encounters a review score of 5/10 or 50/100 is no more deceived about the game's quality than the parents of the 50%-average student are deceived about his intellectual prowess. Both have a feel for the scale and don't need it excessively dumbed down into compartments labeled 'bad / fair / good'.
In conclusion, I can't help but think that certain game critics' fixation on compactness of scale is misguided. And I'm not the only one to recognize that trading a high-resolution scale for one with fewer notches doesn't immediately solve the problems inherent in judging video game quality [footnote]A Review Scoring System That Would Work - The Escapist - Ben 'Yahtzee' Croshaw - Feb 17, 2015
http://www.escapistmagazine.com/articles/view/video-games/columns/extra-punctuation/12989-Video-Game-Review-Scores-A-System-That-Would-Work
In this article, Ben Croshaw attempts to demonstrate the futility of game review scoring by emphasizing the subjective experience of the player. The introduction offers some commentary on the recent changes to Eurogamer's scoring system: "what Eurogamer doesn't seem to have realized is that it's not cured, it's just dropping from heroin to methadone. In the article, they state that they are switching from scores to awards. So rather than ditching scores entirely, they've merely switched from a 10-point system to a 4-point one. Essential, Recommended, Avoid, and presumably one that goes between the last two indicated by there being no award at all." Later, Croshaw argues that the multifaceted nature of games means they can't be adequately characterized by anything less than a detailed account of the individual experiences of 25 players, to be collected through a series of questionnaires filled out at regular intervals during the playing session. The reader would then decide which playtester(s) to heed based on the results of personality tests. Just when you're beginning to appreciate the article as a satirical take on the sort of game review system that might be devised by a drug-addled social sciences dropout from San Francisco, you're reminded that the author is trying to make a serious point against quantitative scoring. However, rather than supporting its conclusion of "perhaps alternatively you could just [ignore scores and] read the cunting review", the main text of the article contradicts it by showing exactly why no single reviewer ought to be trusted, least of all one who neglects to divulge the results of a mental status examination taken at the time of writing. Instead, the savvy game consumer must seek counsel from some 25 reviewers at minimum, which can be interpreted as a tacit endorsement of the following strategy: browse the aggregated review blurbs on Metacritic until you find something that resonates. I always suspected a measure of grudging support for review aggregation lurking beneath the tough guy facade of the games press.[/footnote]. Major games review sites, operating under the guise of addressing the needs of their readers, are all too eager to trumpet their own special flavor of review format, apparently unaware that assigning different names to two or three rating categories doesn't suddenly make them innovative. Couple that with a growing aversion to the 10-point system and you might say that professional game critics are collectively struggling with their own variant of 'not invented here' syndrome, a phenomenon well known to those in the software industry. Barring a widespread outbreak of sanity, I'll just be sitting here waiting for the next super-duper-friend-of-consumer recommendation system to come along and caress my prefrontal cortex into submission with all the finesse of a day-one DLC offering.