What's a Good Mechanism of Judging??


UWCanadianGuy
01-16-03, 10:28 PM
While here in the Pacific Northwest (and, I suspect, in California as well) the most common way to judge an event is to issue a win/loss verdict and a speaker point score, many different systems have been employed around the world.

Most regions use a score-based judging mechanism where each individual debater is assigned a score out of 100 or something like that, and the team with the highest combined score wins.

One type of ballot commonly seen in Europe goes something like this:
/ 40 for Content
/ 40 for Style
/ 20 for Strategy

In Canada, ballots are often broken into:
/ 10 for Content Material
/ 10 for Refutation and Interrogation
/ 10 for Constructive Oration
/ 10 for Organization
/ 10 for Style and Flair
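
To make the "sum and compare" mechanism concrete, here is a minimal sketch in Python. It assumes the five Canadian-style categories listed above, each scored out of 10; the short category keys, the function names, and the sample scores are all made up for illustration and do not come from any real ballot or tab program.

```python
# Hypothetical category maxima, modeled on the Canadian-style ballot above.
CANADIAN_BALLOT = {
    "content": 10,
    "refutation": 10,
    "constructive": 10,
    "organization": 10,
    "style": 10,
}


def speaker_total(scores, ballot=CANADIAN_BALLOT):
    """Sum one speaker's category scores, checking each against its maximum."""
    if set(scores) != set(ballot):
        raise ValueError("scores must cover exactly the ballot categories")
    for category, value in scores.items():
        if not 0 <= value <= ballot[category]:
            raise ValueError(f"{category} score {value} is out of range")
    return sum(scores.values())


def winning_team(teams, ballot=CANADIAN_BALLOT):
    """Return (winner, totals); winner is None if the top totals are tied.

    `teams` maps a team name to a list of per-speaker score dicts.
    """
    totals = {
        name: sum(speaker_total(s, ballot) for s in speakers)
        for name, speakers in teams.items()
    }
    best = max(totals.values())
    leaders = [name for name, total in totals.items() if total == best]
    return (leaders[0] if len(leaders) == 1 else None), totals


# Invented sample round: Opp scores lower on content but higher on style.
teams = {
    "Gov": [
        {"content": 8, "refutation": 7, "constructive": 8, "organization": 7, "style": 6},
        {"content": 7, "refutation": 8, "constructive": 7, "organization": 7, "style": 7},
    ],
    "Opp": [
        {"content": 7, "refutation": 6, "constructive": 7, "organization": 8, "style": 9},
        {"content": 7, "refutation": 7, "constructive": 7, "organization": 8, "style": 8},
    ],
}
winner, totals = winning_team(teams)
print(winner, totals)
```

In this invented round Gov totals 72 and Opp totals 74, so Opp takes the ballot on the combined score even though Gov scored higher on content.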

What do people think about different systems of judging mechanisms?

Patio11
01-16-03, 11:25 PM
I personally prefer systems which allow there to be multiple axes to score the debate on. I did a lot of "sum and compare" debating in my pre-college career, and let me tell you, you have never seen frustration until you get beaten in a debate after devastating another team's arguments, only to lose because their, say, wit or sportsmanship score was higher. Similarly, I feel validated when, even though the other team outmaneuvers me or I make tactical errors sufficient to lose the round, the judge has somewhere on the ballot to say "But that POI leading to the contradiction was handled with some pizzazz -- nice job".

However, I think the breakdown is helpful for awarding speaker points, rather than the policy "I think that was about 27.5" status quo.

Patrick McKenzie

Pattybar
01-20-03, 11:49 AM
I may be wrong, but it seems to me that the reason Europe has a scoring system the way it does is that individuals as well as teams can advance in competition... so each person's individual score has more importance than it does here.

Patty

PancreasMatt
01-20-03, 08:41 PM
only problem i see with the division of speaks into categories is that when you actually do look at the categories and act as a "box checker" you end up doing some real hideous point deflation. The range ends up down around 22-24 for a relatively good round. we had some problems with this when we had new people judging a high school policy tournament, giving kids like 18 speaker points, which is just completely atrocious and painful.

thedancingbear
01-21-03, 06:35 AM
A good way to solve for the speaker-point problem Jason talks about is to use a technique that I think is called Z-scoring.

Basically, it finds a judge's mean score and standard deviation and awards points based on that. Thus if you get a 20 from a judge who is consistently handing out 18s, you are not screwed.

At that point, the judge can basically make up their own scale; as long as they stick to whatever they make up, everything will work.
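
A minimal sketch of that normalization in Python, for anyone who wants to see the arithmetic. It assumes each judge's mean and standard deviation are computed only from the points that judge handed out, and it uses the population standard deviation; a real tab program might pool data differently. The function name and sample numbers are invented for illustration.

```python
from statistics import mean, pstdev


def z_scores(raw_points):
    """Map {speaker: raw points from ONE judge} to {speaker: z-score}.

    A z-score is (raw - judge_mean) / judge_stdev, so it measures how far a
    speaker sits above or below that judge's own average.
    """
    values = list(raw_points.values())
    mu = mean(values)
    sigma = pstdev(values)  # population std dev; an assumption of this sketch
    if sigma == 0:
        # The judge gave everyone the same score, so nobody stands out.
        return {speaker: 0.0 for speaker in raw_points}
    return {speaker: (pts - mu) / sigma for speaker, pts in raw_points.items()}


# Invented data: one low-point judge and one high-point judge.
harsh = z_scores({"A": 20, "B": 18, "C": 17, "D": 17})
generous = z_scores({"E": 30, "F": 29, "G": 29, "H": 28})

# Rank everyone on the normalized scale instead of on raw points.
combined = {**harsh, **generous}
ranking = sorted(combined, key=combined.get, reverse=True)
print(ranking)
```

With these made-up numbers, speaker A's 20 from the low-point judge ends up ranked above speaker E's 30 from the high-point judge, because A stands further above that judge's own average relative to the judge's spread.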

Cheers,
Ian

UWNM
01-21-03, 08:41 AM
Originally posted by NotAnOrganMatt
only problem i see with the division of speaks into categories is that when you actually do look at the categories and act as a "box checker" you end up doing some real hideous point deflation. The range ends up down around 22-24 for a relatively good round. we had some problems with this when we had new people judging a high school policy tournament, giving kids like 18 speaker points, which is just completely atrocious and painful.

It's only a problem because someone decided that people should hand out more speaks than that. If everyone used a rubric system for handing out points, then point "deflation" wouldn't make anyone mad because it would be normal. In fact, it's a move I support as both a competitor and as a judge. I'm tired of being pressured as a judge to only hand out awards in the top five point levels (either 25+ or 45+ in the case of some LD ballots). It doesn't leave enough room to differentiate between many debaters, if you ask me. And as a debater, I hate it that, with only rare exceptions, all of my speaks are just high, pretty much regardless of whether I do well or not.

Furthermore, if there were some sort of rubric, the speaker point totals would be that much more educational because they would reflect more details about what you need to work on.

However, I would say that the "Canadian" way of awarding wins based on what are essentially speaker points bites. I have awarded a lot of low-point wins. It's easy to win on the issues while having a lesser overall performance, and I think the separation of the win/loss and the speakers reflects that.

Nathen

thedancingbear
01-21-03, 02:03 PM
Originally posted by tutakai
Ian, that depends whose "problem" you are trying to fix. I've talked to more than one team that seeks to strike low-point judges (regardless of Z-score variation) solely because they feel that will maximize their chances of speaker awards and "punish" judges who refuse to go along with inflation tendencies....

Right. That's why I suggest we base speaker point scoring on the Z-score, so getting a 20 from a judge whose average is around 18 is as good as or better than getting a 30 from a judge who likes to hover around 29.

This does have the disadvantage of making tab sheets slightly harder to understand and I don't know if any of the (currently-available) tab software does it. But mathematically, I think the idea has appeal, at least to a CS geek such as me. :)

Cheers,
Ian

thedancingbear
01-21-03, 09:53 PM
I realized I was being unclear. Sorry about that.

My advocacy is that tab software should take into account a judge's Z-score in assignment of speaker points and in determining breaks.

It does suffer a "dataset problem," but that is merely an implementation issue. Theoretically there is no reason this data couldn't be publicly available. At least, I haven't thought of any. :)

Judges who change their scale over time would be an issue. This could be dealt with in a couple of ways, although I honestly hadn't thought about this before. I guess the best way would be to average over the X most recent tournaments, X being determined by a statistician much smarter than me. :)
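
Here is one rough way the "X most recent tournaments" idea could be sketched in Python. Everything here is an assumption for illustration: the class name, the choice to pool all scores from the window rather than average per-tournament means, and the default window of 3, which is just a placeholder for whatever X a statistician would actually pick.

```python
from collections import deque
from statistics import mean, pstdev


class JudgeHistory:
    """Keeps one judge's speaker points from the most recent tournaments."""

    def __init__(self, window=3):
        # `window` is the "X" from the post; 3 is an arbitrary placeholder.
        # A deque with maxlen drops the oldest tournament automatically.
        self._recent = deque(maxlen=window)

    def add_tournament(self, points):
        """Record every speaker-point score the judge gave at one tournament."""
        self._recent.append(list(points))

    def baseline(self):
        """Return (mean, std dev) pooled over the tournaments in the window."""
        pooled = [p for tournament in self._recent for p in tournament]
        if len(pooled) < 2:
            raise ValueError("need at least two scores to estimate a baseline")
        return mean(pooled), pstdev(pooled)

    def normalize(self, raw):
        """Z-score a new raw score against the judge's recent baseline."""
        mu, sigma = self.baseline()
        return 0.0 if sigma == 0 else (raw - mu) / sigma


# Invented example: a judge who used to give 30s but has since tightened up.
history = JudgeHistory(window=2)
history.add_tournament([30, 30, 30])  # falls out of the window below
history.add_tournament([18, 20])
history.add_tournament([17, 19])
print(history.baseline())
print(history.normalize(20))
```

Because only the two most recent tournaments are kept, the old run of 30s no longer affects the baseline, which is the behavior you would want for a judge whose scale has shifted.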

Cheers,
Ian

PancreasMatt
01-21-03, 10:11 PM
i agree with everyone who says we shouldn't rate high, but as long as people keep doing it, it sucks to get the one judge rebelling against it. And Ian, that has to be absolutely the coolest idea EVER for speaker points. It's really cool. I'm excited.

WyomingJimmie
01-21-03, 10:59 PM
I believe that the best mechanism for judging is the one that the critic and debaters mutually agree upon during the course of the debate. Although this does not directly address the issue of speaker points, I think that it is the only fair way to judge a round. Ideally, this "mutually agreed upon criteria" is something that manifests itself during the course of the debate. For example, the disadvantage that both teams devote a significant portion of their speech time to should be of greater importance than the first contention of the PMC that is never mentioned again until the PMR.

In my opinion, the time spent on an issue is in direct proportion to the importance of the issue in the round and the weight it ought to carry when a judge makes their decision.

As far as speaker points go, the only way to get away from the CEDA/NDT model for determining speaker points is to adopt a different scale for points. When I was in high school, the LD debates used a 50-point scale to differentiate themselves from the 30-point scale of policy debate. It would allow someone to give a score 10 points below the maximum and still look like they were giving reasonable speaker points. Additionally, I believe that most of the tab room software allows for using a 50-point scale just as easily as a 30-point scale.

Jimmie DeVore
Wyoming Debate

kanodin
01-24-03, 03:37 AM
I have to agree with ISamuel.

The use of standard deviation to control how scores are weighted sounds like an amazing idea. One thing that I would add to this, in order to alleviate the score sampling issue, is to actually query the judges before or during the tournament (in writing) to get their personal 'scoring philosophy' and use a computer model to compensate for each judge.

It is true that having one judge use a true rubric while everyone else sticks to the norm of giving upper-echelon scores is problematic for determining speaker awards. It would be an excellent idea for a tournament to recognize this issue, meet eye-to-eye with the judges, and avoid friction.

Besides, having realistic scores will make me truly value that 'perfect 30' mark.

Natebear
01-30-03, 06:48 AM
just to let you know...

There is currently tab software that calculates judge variance and it is in wide use among CEDA circles. Since there is no difference in tabbing those types of debates, it would be applicable to parli as well. One drawback -- it only uses the data set from the tournament itself.
Anyway...

thedancingbear
01-30-03, 02:13 PM
I wish more tabbing algorithms were "open". Does anyone know if the authors of common tabbing programs would be willing to share their source code? They don't sell the software, so it seems as if they have no reason for secrecy.

Cheers,
Ian

PancreasMatt
01-30-03, 05:34 PM
Isam, try e-mailing one of our coaches. I'll give you the email, just send me a private msg thingy i can reply to, since I'm not sure if he would want his email on the board. He knows how to run all the policy software, since he was a coach for Fullerton's team last year (policy), and he could probably put you in touch with the guys that have the program, since he's all about the spread of MPJ in parli.