More Panel Ratings Discussion

Started by Le Woltaire, Thu 07/01/2010 20:35:25


IndieBoy

#100
How about adding a disclaimer that by posting your game in the database, then you accept that there will be a rating of your game by a panel for organisational purposes only and if you don't agree then why don't you just make a post in the completed games forum instead.

Or how about giving the option to the game makers to hide the panel comment once it has been made? Although the comment usually justifies the rating, disgruntled game makers might prefer not to remind the potential players of the games' pros and cons. Which seems to be the thing a lot of people are biting on to with these rating rants.

However, purposely unrated games in the database make the previous ratings pointless and make the search tool utterly useless.

I don't want to correct you Snarky, but commercial games aren't rated and neither are demos, so I doubt it will put potential buyers off. Unless you are commenting on the possible previous free and rated games of that same developer, which surely shouldn't be given special treatment rating wise because they are trying to sell something...

Also same thing for the awards, I understand that the game needs to be in the database for it to be eligible to vote for it, but hiding/delaying the rating/comment to win an award seems a little wrong to me.

As for the episodic games point, I understand, but I think the first game should be rated; then, if the author wants to challenge that rating, they can do so using the other instalments as evidence. Seems like the fairest way to me.
Quote from: Calin Elephantsittingonface on Tue 08/02/2011 09:00:55
The only person in favour of the mobs seems to be IndieBoy.. but he's scottish so we dont listen to him anyway.

Snarky

Quote from: IndieBoy on Mon 11/01/2010 22:37:43
How about adding a disclaimer that by posting your game in the database, then you accept that there will be a rating of your game by a panel for organisational purposes only and if you don't agree then why don't you just make a post in the completed games forum instead.

Sure. I guess the question then becomes if we'd rather have games that the creators don't want rated by the panel in the database, or not. I always thought the database was meant to be an as-complete-as-possible repository for AGS games, so players would have a single go-to location; it doesn't exist for the sake of the panel ratings. They're there "for organisational purposes only", as you say.

Also, submission to the database is currently a requirement to be eligible for the AGS Awards.

Quote
However, purposely unrated games in the database make the previous ratings pointless and make the search tool utterly useless.

Again, I don't follow this argument. Missing ratings for a few games invalidate the whole system? If that was the case, shouldn't submitted games be kept off the games db until the panel rates them?

Quote
I don't want to correct you Snarky, but commercial games aren't rated and neither are demos, so I doubt it will put potential buyers off. Unless you are commenting on the possible previous free and rated games of that same developer, which surely shouldn't be given special treatment rating wise because they are trying to sell something...

It's a good point about them already not being rated. In any case, I wasn't suggesting giving commercial games special treatment, and certainly not to manipulate the ratings to bolster sales. I was just arguing that there are many reasons beyond "I don't want people to know how bad my game is" for why game makers might want to opt out of having their game rated by the panel. Specifics would vary from case to case, depending on the situation, and I gave a few hypothetical examples.

Quote
Also same thing for the awards, I understand that the game needs to be in the database for it to be eligible to vote for it, but hiding/delaying the rating/comment to win an award seems a little wrong to me.

As for the episodic games point, I understand, but I think the first game should be rated; then, if the author wants to challenge that rating, they can do so using the other instalments as evidence. Seems like the fairest way to me.

I don't see anything underhanded in wanting AGS voters to come into your game with as few preconceptions as possible. It's probably not the decision I'd make if I had a game that was eligible for awards, but different people take different approaches.

And while challenging or appealing the rating at a later date is one way to deal with cases like the last example, wouldn't it be better to have a way of achieving the same result in a less confrontational way?

IndieBoy

#102
Ok Snarky, I'll explain my view/points a little more clearly:

From my understanding: The database is a collection of user-submitted AGS games, which are indexed by genre, length, type and cups. My point was that, for example, having half of the database entries without any of these fields completed would make them unsearchable, and therefore it wouldn't be a complete database. I know it wouldn't be as drastic as this in reality, but there would be a difference if the author of a potential 5 cup game chose not to have it rated: it would be less likely to be found, and that would be a loss to the community/potential players. To prevent this I'm suggesting adding the disclaimer. It would mean no drastic change to how the database is built and structured, and I also think it would be fair both to the game designers who don't want their work to be "judged" and to the AGS panel/community, that is all. The majority of people who submit their games already expect this, so why shouldn't it be a rule, just like the rule that a game must be submitted to the database before it can be voted for in the awards?

The point of the ratings and the adding of the new genres and things a couple of years back was to make it easier to search for games. So my point was: having entries in the database purposely left unrated would be taking a step back in the database's evolution. Also there is a difference between a waiting-to-be-rated game and a game that has been purposely left unrated. I just don't think that the community who vote and nominate games for awards are going to be swayed by the game's rating in the database. As the recent few threads have shown, the community can/will either ignore the ratings, challenge them or accept them. In addition, if any such disagreement takes place about a game, I'm sure it would be resolved before the awards, hence no damage to that game's chances of award success.

As for the preconceptions of the new player, I agree; that is why I suggested the option to hide the panel's comment.


I just don't understand why, however many pages/threads there are about this kind of thing, we never seem to have come up with the ultimate solution and just gone with it. It seems like at least one person will never agree with an idea. Also I promised myself I wouldn't post in one of these threads again.. whoops!

Snarky

Well, you'll never get 100% consensus on anything. But by the same token, no system is so perfect that it can't be improved.  :-\

I understand your argument that not having a rating for some games makes them a bit harder to find, and that this reduces the value of the db. But it's better than not having them in the db at all, which is the alternative offered by your proposal, isn't it?

Technically, classifying something as "unrated" is also a kind of categorization, so I don't think you have to see it as a "step back" for the database.

As far as I can tell, people have suggested 3 options so far:

1. Panel rating will be non-optional for games submitted to the database. Submitters are given a disclaimer to this effect. People who don't want their games rated will not add them to the db (or they'll do so without realizing/noticing that they're signing up for a review, and be unhappy).
2. Submitted games are rated by default, unless the submitter specifically opts out through some documented process. A new classification of "Not rated by request of the creator" is added (and could even be made searchable as a "0 cups" rating if desired).
3. Panel rating is generally automatic, but game makers can prevent it by taking extraordinary action (PMing Andail, for example). This option will not be mentioned on the submission page.

I'm assuming that 3. is the system currently in place, at least for as long as the game submission page doesn't mention anything about the panel.

All the options seem more or less viable, though no. 2 is my clear favorite, personally. However, I think if CJ (or whoever) decides to go with option 1., the rule that AGS Award nominees have to be in the Games DB should be changed. It doesn't seem right to me that submitting to be rated by the panel should be compulsory if one wants to be eligible for the awards.

As for your idea of letting game makers hide the panel comments (but not the rating), I think it runs into the same objection Andail raised earlier: it's just a way for game makers to suppress bad reviews, and thereby compromises the integrity of the panel.

Layabout

I still don't really understand the theory behind the five cup system. Did anyone ever consider whether two cups out of five would look attractive to the average browser? I also get confused as people have been saying contradictory things.

CJ has said "The 2-cup rating is by far the most common, and 2/5 is not a "bad" rating as it would be on some review sites."

Progzmax said " where 2 appears more clearly as 'below average' "
and on other occasions, "while 2 is a 'nice try' category that is meant to encourage people to do better.  As mentioned before, 3 is essentially the standard score which is fine since most people will produce a few average but interesting games before they reach something better."

On the 'what is this' page "2 Cups   A reasonable game, worth a try"

Andail "I think everyone needs to remember, before you jump on the complaint bandwagon, that the description for 2 cups is that the game is reasonable and worth a try."

So guys... what is it? Average usually means the medium. The middle of the road... Even progzmax, the supposed head of the ags panel, seems to be confused about the meaning of the 2 cup score.

A game that offers multiple endings, has improved graphics over the author's previous 4 starred titles, and an interesting plot gets a two cup rating. Did the reviewer only play it once? This game if you care: http://www.adventuregamestudio.co.uk/games.php?action=detail&id=1194
This is one of those situations where the panel score is at odds with the user rating.

There seem to be a lot of inconsistencies with the scoring system that make it an unreliable way of judging whether one would enjoy a game.

To make things even worse, you have decided to publish the means in which you rate games. People will still complain, saying 'ohh but I did all of that'.

BTW that bar chart makes the community look terrible. Everyone's making shit games, and only a few make games worth playing. That is what it looks like to an outsider seeing all the two cup games. NO ONE sees two cups as being in any way GOOD. It is a BAD score, to players, to creators, to anyone except the review panel that isn't Progz. Who also seems to see it as a bad score.
I am Jean-Pierre.

bicilotti

Quote from: Ryan Timothy on Mon 11/01/2010 20:52:38
I've pressed that Lucky Dip feature in the new website layout more than 100 times now, and to tell you the truth, I haven't seen a game with a lesser player rating than blue cup rating.  

It is true that the panel is more unforgiving than the players rating, but what you have experienced is probably also caused by a different scale. To me:

It was ok, play it if you got some spare time (the middle option in the "rate this game" overall enjoyment)
and
A reasonable game, worth a try (2 cups panel rating)

are pretty similar, but they lead to 3 orange cups (player) vs. 2 blue cups (panel), even though the two judgments agree.

This could be conveniently addressed by changing the options in "rate this game - overall enjoyment", bearing in mind that the effects will only show in the long run (as more votes are collected, the means will slowly change).

SSH

Maybe we should just make the scale from 2-6 cups instead of 1-5 and then everyone's egos will be artificially inflated? This suggestion sounds facetious, but if it reduces complaints, why not? 1 cup can be reserved for games that are so bad they don't even run, etc.
12

Calin Leafshade

I think we should remember that the 5 cup rating has to span a very broad spectrum.

AGS has created games which are barely playable and masterpieces of fiction.

In my eyes 3 cups can only be considered 'good for an amateur game' and things don't really start to be 'good' until the 4 and 5 cup mark.

I have only released one game and only been here around 5-6 months so my experience of AGS games is limited, but if the scale is linear and the benchmark is Trilby's Notes (which got 5 cups) and Shifter's Box (which got 4) then I would rate the vast majority of games I play as 2 cups or even lower. And that includes my own.

The resolution of the cup rating simply isn't high enough to be as useful as you guys seem to want it to be.

You should use the cup rating more like a basic technical benchmark to answer a few simple questions.

- Is this game fun
- Is this game buggy
- Does this game make basic logical sense.

beyond that I fail to see what you want from the rating system.

Iliya

Completely support Layabout's opinion. And Back Door Man is a great game - nice polished graphics, multiple endings - it's a different game! Who rated this game? We want to know names? :)

Layabout

#109
Quote from: Calin Leafshade on Tue 12/01/2010 07:46:59
I think we should remember that the 5 cup rating has to span a very broad spectrum.

AGS has created games which are barely playable and masterpieces of fiction.

In my eyes 3 cups can only be considered 'good for an amateur game' and things don't really start to be 'good' until the 4 and 5 cup mark.

I have only released one game and only been here around 5-6 months so my experience of AGS games is limited, but if the scale is linear and the benchmark is Trilby's Notes (which got 5 cups) and Shifter's Box (which got 4) then I would rate the vast majority of games I play as 2 cups or even lower. And that includes my own.

The resolution of the cup rating simply isn't high enough to be as useful as you guys seem to want it to be.

You should use the cup rating more like a basic technical benchmark to answer a few simple questions.

- Is this game fun
- Is this game buggy
- Does this game make basic logical sense.

beyond that I fail to see what you want from the rating system.

I am responsible for a 1 cup game. It was playable, but very short, very easy and by god was it ugly. I did it as a test and a challenge to create the first 'game' with the new windows roomedit. I don't think it even deserves a 1 cup rating. It was horrible. Truly awful. Yet we get a 2 cup rated game with higher production values and an engrossing story with multiple endings (here I refer to Back Door Man). That doesn't make a lot of sense, does it?

Neither does giving a game like Ben There, Dan That!, a highly enjoyable game with a very high user rating 3 cups with comments along the lines of 'lots of rooms, not much to do', and giving a game like shifters box, which has the same thing (lots of rooms, usually only 1 puzzle per room) 4 cups. Or Limey Lizard, which has about 20 useless rooms in a maze puzzle, which may lead to a walking dead if you aren't careful... no, scrap that, it's easy to come across just from exploring. (also 4 cups) Another game by the same author isn't even complete and gets 3 cups.

There seems to be a lack of consistency with the panel rating. This is what is troubling me.

You say 3 cups can only be considered 'good for an amateur game'... well what else would it be? Despite there being a handful of commercial AGS games, 99% of content created with AGS are amateur games. A clearer and more consistent panel is needed.

*edit*

And oh yay, an example of a game that doesn't really have any game play. 3 cups http://www.adventuregamestudio.co.uk/games.php?action=detail&id=1266 Yes the game has consistent graphics, animation, etc (all of which are lovely), but for anyone who has never experienced Scid and the original Red Flagg, they would be disappointed and wonder why this game was rated 3 cups, when other games, which have a far greater degree of story and game play get 2 cups and a similar consistency of graphics.

Andail

Yes, Harg and Layabout, let's fill this thread with specific titles whose rating we don't agree with.

Then let's invent a system so perfect that among the thousand-something games in the database, there wouldn't be even one or two ratings that any single one of the thousands of members wouldn't disagree on.

SSH

#111
This suggestion is probably a bit too much information but might not be hard to implement. For any given reviewer on the panel who has reviewed a statistically significant number of games (e.g. 30 or more based on six sigma principles), the reviewer's identity is kept secret, but the cup rating says: "3 cups... is better than or equal to X% of the other games reviewed by this reviewer". Then at least you'd be able to see if it was reviewed by a hardass or a softie.
12
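
SSH's per-reviewer percentile idea could be sketched roughly as below. This is only an illustration under assumptions: the function name `percentile_note`, the plain list of a reviewer's cup ratings, and the rule that the rated game itself is dropped before comparing are all made up here, not anything from the actual games database.

```python
def percentile_note(reviewer_ratings, cups, min_reviews=30):
    """Share (in %) of a reviewer's OTHER games rated at or below `cups`.

    Returns None when the reviewer has fewer than `min_reviews` ratings,
    mirroring the "30 or more" threshold suggested in the post.
    Assumes `reviewer_ratings` already includes the rating being described.
    """
    if len(reviewer_ratings) < min_reviews:
        return None
    others = list(reviewer_ratings)
    others.remove(cups)  # drop the game itself, keep the other games
    at_or_below = sum(1 for r in others if r <= cups)
    return round(100 * at_or_below / len(others))


# Hypothetical reviewer history: 5 one-cup, 15 two-cup, 8 three-cup, 2 four-cup games.
history = [1] * 5 + [2] * 15 + [3] * 8 + [4] * 2
share = percentile_note(history, 3)
print(f"3 cups... is better than or equal to {share}% of the other games reviewed by this reviewer")
```

With a history like this one, a 3-cup rating from a strict reviewer lands near the top of their range, which is the "hardass or softie" signal the post is after.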

Gilbert

As usual I'm too lazy busy to read all the pages in the thread. I'm not sure whether my idea is good or valid, but it may be useful for brainstorming towards a better system.

I'll just ignore the matter of inconsistencies about the rating standard, and whether the games get appropriate ratings at the moment; as I have never devoted time to this, I'm obviously not eligible to comment on these in detail.

What I see is, as 2-cup games are the most common, they're sort of having an average quality among all the games. If I am say, a teacher, who rates students' works I'll say I feel (so don't take me seriously) this is sort of equivalent to a 'C grade'.

So, under this criterion, 3-cup games are most likely 'good to quite good', which are considered 'B grade'. Then, obviously, 4-cup games are 'quite good to very good', which are considered to get an 'A'.

What do the 1-cup and 5-cup ratings really mean? This is my view:

5-cup games are those that are 'excellent' and are highly recommended (as suggested by the sheer number of 5-cup games at the moment), so they're something like getting an 'A+ grade'.

If you agree to the above, it's easy to say that 1-cup games are actually 'D grade or below'. Now, the problem lies here. 'D grade or below' can mean many things. It can be a game that some effort was actually put into it but the overall result isn't as enjoyable because of various reasons (say, uninteresting stories, bugs, language problems, disturbing graphics, etc.) which could be considered to belong in the 'D or a better E' department. These games are actually quite different from effortless attempts that are merely stuff messily put together (such deserve 'the lower part of E, or F') and in fact, some people are still willing to try out those 'D or better E grade' games but under the current system all of these are 1-cup games, which makes it difficult for them to filter out the games they want to try.

Because of this, my idea is like this:
F to lower E - 1 cup (worthless to try)
better E to D - 2 cups (try if you really want)
C - 3 cups (same as current 2-cup games)
B - 4 cups (same as current 3-cup games)
A - 5 cups (same as current 4-cup games)
A+ - still 5 cups, but with the cup graphics changed to gold or shiny or something like that (they can be just categorised as 'highly recommended' in the search filter)

which is actually a 6-cup system disguised as a 5-cup one (unlike the joking 2-6 cup system someone mentioned). In this way we don't have to change much and many people may be happy as apparently most games seem to be rated 1 cup higher.
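
As a rough sketch, the relabelling above could be written as a small mapping from current panel cups to the proposed display. The name `regrade` and the `worth_a_try` flag are hypothetical; the flag stands in for the human judgment needed to split today's 1-cup games into "F to lower E" and "better E to D".

```python
def regrade(current_cups, worth_a_try=False):
    """Map a current panel rating to (new_cups, gold) under the proposed scale."""
    if not 1 <= current_cups <= 5:
        raise ValueError("current_cups must be between 1 and 5")
    if current_cups == 1:
        # Today's 1-cup games split in two: 'F to lower E' stays at 1 cup,
        # 'better E to D' moves up to 2 cups. Someone has to make that call.
        return (2 if worth_a_try else 1, False)
    if current_cups == 5:
        # A+ keeps 5 cups but gets the gold/shiny graphic.
        return (5, True)
    # Current 2, 3 and 4 cups each move up by one.
    return (current_cups + 1, False)


for cups in range(1, 6):
    print(cups, "->", regrade(cups))
```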

Note that this applies only to the panel scale. I don't see that the public system needs to change. Since the public ratings are determined by votes from various people, I don't see it needing special treatment to include an 'A+ grade'.

Layabout

Quote from: Andail on Tue 12/01/2010 08:46:18
Yes, Harg and Layabout, let's fill this thread with specific titles whose rating we don't agree with.

Then let's invent a system so perfect that among the thousand-something games in the database, there wouldn't be even one or two ratings that any single one of the thousands of members wouldn't disagree on.

I'm actually pointing out how inconsistent the rating system seems to be due to the subjective nature of such a system. I personally loved shifters box. I think it deserves a 4 cup rating, which is consistent with the guidelines you posted. I loved Red Flagg, but it has a major flaw in the gameplay department and would fit more in the 2 cup rating as per the guidelines. Yes SpacePirateCaine by far exceeded the expectation of those who had played the original Red Flagg, but it felt more like an interactive cut-scene than a game.

Sure I do single out Back Door Man as a game that has been given an incorrect rating, but I challenge people to play it going through the Panel's guidelines and see how many crosses you can put against it.

The Panel system was put into place to moderate the games against being artificially jiggered with, and I totally agree with the sentiment behind the whole system. To me it seems there is too much subjectivity to the ratings. I'd like to think if one day I make a game of substance that I might get an objective panel rating... although with all the fuss I'm causing right now I'm doomed to be subjected to a 1 cup rating for everything I ever create.

I also agree with what Gilbert Cheung says. Always. It would bring it more in line with the proposed yellow cup system, whilst also being able to clearly see inconsistencies, which could then be flagged as aforementioned.

User ratings should also be brought in line with the panel rating, as this would provide a more accurate picture of what the general public thought of the game. Same guidelines, same overall 'cup' system. This way, if the user rating was seen to differ too much from the panel rating, the panel could be flagged to check whether the user rating was subjected to jiggery, or whether the original panel rating was fair and objective.

RickJ

I think Gilbert makes some good points.  His suggestion addresses the concern that the uninitiated will interpret 2 cup games as below average crap. 

Following up on his theme, allow me to suggest that perhaps the rating process should begin with the presumption that the game under review is average (i.e. a rating of 3). The reviewer would then score it higher or lower according to the process laid out by Andail and the other panel members. The object of doing this is to calibrate the rating process so that 3 cups == average, as Gilbert suggests.

I would also like to suggest that one possibility for depicting Gilbert's "A+" rating would be to have the cups with steam coming off them.



As I recall, the original problem with the user rating is that there were many games with just a few votes and undeserved high ratings. Perhaps it would be better to just have user ratings that also included a weighted panel rating.

For example the panel rating could have a 30 vote weight. So if someone makes a crappy game and then he and a couple of his friends give the game an undeserved 5 cup rating and the panel gives a 1 cup rating, the end result would be ((30*1)+(3*5))/(30+3) = 45/33 = 1.36, which would still be a 1 cup rating. There is nothing magic about the panel weight of 30; we or CJ could decide what value is appropriate.
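
The arithmetic in that example can be checked with a few lines of Python. This is just a sketch of the weighted average as described; `combined_rating` and its parameter names are made up for illustration, and the default weight of 30 is the post's placeholder value.

```python
def combined_rating(panel_cups, user_votes, panel_weight=30):
    """Average the user votes with the panel rating counted as `panel_weight` votes."""
    total = panel_weight * panel_cups + sum(user_votes)
    count = panel_weight + len(user_votes)
    return total / count


# The case from the post: panel gives 1 cup, three friendly votes of 5 cups.
score = combined_rating(panel_cups=1, user_votes=[5, 5, 5])
print(round(score, 2))  # 45 / 33, about 1.36
print(round(score))     # nearest whole cup: 1
```

One property worth noticing: the panel's influence fades as real votes accumulate. With 300 user votes of 5 cups the same formula gives (30 + 1500) / 330, about 4.64, so the weight mainly guards against ratings built on a handful of votes.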



I would also like to repeat one of the suggestions I made earlier in the other thread, as it seems that thread was already dead and some may not have seen it.

In this thread I suggested that the review should be written for the benefit of the game author rather than the game players.   How so?  I was a member of Toastmasters, for a number of years, where members learn speaking and leadership skills.  The key ingredient to the success of their program is the way members' practice speeches are critiqued, which I will share with you.

Someone is assigned to give an evaluation speech immediately after the main speech.   The evaluation is usually structured into three parts consisting of no more than three points each.   The evaluation speech begins by pointing out the best parts of the speech.   Next the evaluator points out what could  be improved.  There is no point in mentioning more than three things because

1) The speaker may become disillusioned/discouraged
2) It's difficult for people to remember more than three things at a time

The evaluation ends on a positive note by mentioning what things the evaluator would like to see more of in the future.   This essentially presents negatives in a positive light (i.e. "half full" rather than "half empty").  This is also an opportunity to encourage the speaker  to continue making speeches and improving.

I think all of this or something similar could be applied to the panel's reviews.   Game authors wouldn't be in the dark about why they got the rating they got.  They would have clear guidance and specific suggestions to improve on their next effort.    Game players would get the same or more benefit from this sort of review as before.

I also suggested that authors be given advance notice of the rating and review so that they may give the reviewer some feedback before publication. This is a professional courtesy rather than an invitation to whine. Given the panel's work load as described in these discussions, it wouldn't be surprising if a reviewer occasionally missed something an author felt strongly about.


Helme

Quote from: RickJ on Tue 12/01/2010 11:45:30
For example the panel rating could have a 30 vote weight. So if someone makes a crappy game and then he and a couple of his friends give the game an undeserved 5 cup rating and the panel gives a 1 cup rating, the end result would be ((30*1)+(3*5))/(30+3) = 45/33 = 1.36, which would still be a 1 cup rating. There is nothing magic about the panel weight of 30; we or CJ could decide what value is appropriate.

That implies that most of the users who actually rate games are unqualified to do so, whereas the panel is full of heavenly wisdom and fairness. I think this method is quite undemocratic (in the sense that every vote should count the same).

miguel

As a member of this great community and maker of crappy games I want to say that a 6 cup ranking would fit better.

All the considered demands and skills would be given a percentage by the board, like:

Art : 30%
Story : 50%
Tech skill: 20%
...and so on...

Whether the board shows those numbers or not is a different issue;

the result of combining these percentages would be automatically turned into cups:
0 - 20% - 1 cup
20 - 40% - 2 cups
40 - 55% - 3 cups
55 - 70% - 4 cups
70 - 90% - 5 cups
90 - 100% - 6 cups
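
A minimal sketch of how such a weighted score could be turned into cups, using the example weights and bands above. Several things here are assumptions rather than part of the proposal: each category is scored 0-100, a score sitting exactly on a boundary (say 20%) falls into the higher band, and the names `WEIGHTS`, `BAND_STARTS` and `cups_from_scores` are invented for the example.

```python
from bisect import bisect_right

# Example category weights from the post, in percent (they sum to 100).
WEIGHTS = {"art": 30, "story": 50, "tech": 20}

# Lower bounds of the 2-, 3-, 4-, 5- and 6-cup bands.
BAND_STARTS = [20, 40, 55, 70, 90]


def cups_from_scores(scores):
    """Return (overall_percent, cups) for per-category scores on a 0-100 scale."""
    overall = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100
    # bisect_right counts how many band boundaries the score has reached.
    cups = 1 + bisect_right(BAND_STARTS, overall)
    return overall, cups


print(cups_from_scores({"art": 80, "story": 60, "tech": 70}))  # (68.0, 4)
```

Changing the weights per game type only means swapping the `WEIGHTS` table, but someone still has to supply the per-category scores.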

Working on a RON game!!!!!

Snarky

#117
I'll take the panel's side on this topic for once. Layabout, miguel and others, it sounds like you're trying to solve the "problem" that the reviews are a subjective opinion. Someone said on SSH's blog:

Quote from: Rado
I support the AGS panel because their work is free, and the panel can not hire more people to make the review more objective.

I support the developer because I think that is not fair AGS panel (which in the case is a single person) to comment the creative work. In this case I think it's better AGS panel to review only technical areas of the game.

But what players are interested in is "should I play this game?" And that mainly comes down to intangible qualities like "fun," which are inherently subjective. So miguel, I don't think your system would produce ratings that are any more helpful than the current guidelines. You can't just feed in numbers to a formula and get out a rating, particularly since games are so different. To play along with your idea for a second, in some games the story might indeed be 50% of the enjoyment, while in others the puzzles are 50% and the story only 10%.

The original idea Andail had for how the ratings would be set also strove for perfect objectivity, by using a checklist. There was a specific list of requirements for getting a particular rating, and if you failed any of them, you didn't qualify. When he proposed it, it got a lot of criticism because it would assign counter-intuitive ratings to a lot of games (some of the most popular games would only have got one or two cups), and so the panel modified it to get the system you see today, where the guidelines are much more flexible. I think that was absolutely the right decision.

And as long as panel members have to use their own judgment, and have to quantify inherently subjective experiences like "fun", there will always be room for disagreement. And with different members of the panel having different tastes and using the scales slightly differently, there will be some inconsistency between ratings. (And I should point out that IMO the average user ratings of the games are much more inconsistent than the panel ratings.)

No, the system isn't perfect, but do you really think that there is a perfect way to turn the experience that each and every individual player will have with a particular game into a number from one to five? (Or six?)

That's why I think the most important thing is not the number of cups by itself, but to always use the panel comments to explain what factors led to the rating. Rick, your idea for how these comments might be structured sounds like one good approach; it's similar to the "Pros/Cons/Verdict" capsule-review that Adventure Gamers uses.

tzachs

Quote from: Gilbet V7000a on Tue 12/01/2010 09:27:24
Because of this, my idea is like this:
F to lower E - 1 cup (worthless to try)
better E to D - 2 cups (try if you really want)
C - 3 cups (same as current 2-cup games)
B - 4 cups (same as current 3-cup games)
A - 5 cups (same as current 4-cup games)
A+ - still 5 cups, but with the cup graphics changed to gold or shiny or something like that (they can be just categorised as 'highly recommended' in the search filter)

I like this idea. It reminds me a bit of the rating system in the "home of the underdogs" where there is a picture of a dog next to the highly recommended games.

Layabout

As far as I understood for the original 'Panel Rating' discussion, it was open to change after the first couple of years to hopefully implement a 'decent' system.
