How should unknowns be handled in calculating rates/percentages?
If we include unknowns in the total, the percentage in any category is smaller than it would be if we subtracted the unknowns from the total. For example, suppose you want to calculate the percent of births to non-smoking mothers in a county where 100 births occurred: 45 records indicate that the mother did not smoke, 33 indicate that she did smoke, and 22 have an unknown value for the mother's smoking status. Including the unknowns in the denominator gives 45/100, or 45 percent of births to non-smoking mothers. Excluding the 22 records with unknown smoking status gives 45/78, or about 58 percent. In deciding which method offers a “truer” representation of the population as a whole, one needs to consider whether the cases with an unknown characteristic are similar to or different from the cases in which the characteristic is known. If it appears likely that the cases with the unknown characteristic are similar to those with the known values, then “unk