Would it be possible to make the annotation guidelines that were used by the annotators available to the challenge participants?
The doctors were given instructions on how to distinguish between textual and intuitive judgments. The levels of certainty that would lead to “Y”, “N”, and “Q” judgments were discussed with them, and they then refined the definitions they received. This information is posted on our web pages.
Posted on 04-17-08
Q: The intuitive annotation is fairly straightforward to interpret: for textually unknown documents, the physicians had to make a decision based on their expertise and impressions (Unmentioned to Yes/No/Questionable). Documents that caused disagreement were left out. Some label combinations seem very odd, however, e.g.:
Yes/No to Disagreement (left out, 212 docs/label pairs in total): this suggests that these documents hold textual evidence that the patient has a certain disease (or does not have it), yet the intuitive judgment caused disagreement. The probable disagreement is 1 Yes and 1 No decision, as physicians seem to avoid using the Questionable tag (infrequent in both