What do “Y”, “N”, “U”, and “Q” used in the annotation files stand for?
“Y” stands for “Yes, the patient has the disease”, “N” stands for “No, the patient does not have the disease”, “Q” stands for “it is questionable whether the patient has the disease”, and “U” stands for “disease is not mentioned in the record”. All four (“Y”, “N”, “Q”, and “U”) are valid judgments for textual annotations, but only “Y”, “N”, and “Q” are valid judgments for intuitive annotations.

Posted on 03-18-08

Q: Not all of the records have judgments for all of the co-morbidity/source combinations. Are these missing judgments supposed to be left out of the training data, or are they to be treated as “unmentioned” (“U”)?

A: We have deliberately left some of the records out of the training set for some co-morbidities, because of a lack of agreement for those co-morbidities on those records. In other words, there is no judgment for a record that has been left out for a co-morbidity/source combination. Please exclude such records from training for that combination; do not treat them as “U”.
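
The rules in the answers above can be sketched in Python. This is only an illustration, not the challenge's own tooling: it assumes the judgments have already been loaded into a dictionary keyed by (source, co-morbidity, record id), and the names `VALID_JUDGMENTS`, `is_valid`, and `training_examples` are hypothetical. The on-disk layout of the annotation files is not described here, so no parsing is shown.

```python
from typing import Dict, Iterable, List, Tuple

# Meaning of each judgment code, as stated in the answer above.
JUDGMENT_MEANING = {
    "Y": "yes, the patient has the disease",
    "N": "no, the patient does not have the disease",
    "Q": "questionable whether the patient has the disease",
    "U": "disease is not mentioned in the record",
}

# "U" is valid only for textual annotations.
VALID_JUDGMENTS = {
    "textual": {"Y", "N", "Q", "U"},
    "intuitive": {"Y", "N", "Q"},
}

# Hypothetical in-memory layout: (source, co-morbidity, record id) -> code.
JudgmentKey = Tuple[str, str, str]


def is_valid(source: str, judgment: str) -> bool:
    """Return True if the code is allowed for the given annotation source."""
    return judgment in VALID_JUDGMENTS.get(source, set())


def training_examples(
    record_ids: Iterable[str],
    judgments: Dict[JudgmentKey, str],
    source: str,
    disease: str,
) -> List[Tuple[str, str]]:
    """Collect (record id, judgment) pairs for one co-morbidity/source pair.

    Records with no judgment for this combination are skipped entirely;
    they are NOT converted to "U".
    """
    examples = []
    for record_id in record_ids:
        judgment = judgments.get((source, disease, record_id))
        if judgment is None:
            continue  # deliberately left out of training: exclude it
        if not is_valid(source, judgment):
            raise ValueError(
                f"{judgment!r} is not a valid {source} judgment "
                f"(record {record_id}, {disease})"
            )
        examples.append((record_id, judgment))
    return examples
```

With this sketch, a record that lacks a judgment for a given co-morbidity/source combination simply does not appear in the returned list, while a record explicitly judged “U” in the textual annotations is kept.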