SpamBayes doesn't seem to catch much spam. What gives?
Initially, SpamBayes cannot distinguish spam from ham. With no training input, the classifier simply marks everything as unsure. Once you start training it on a representative set of spams and hams, however, its accuracy should improve very quickly. If it doesn’t, you may have something misconfigured. Here are a few things to check:
• What are your ham and spam thresholds set to? The defaults are 0.2 and 0.9, respectively, and should be reasonable starting points. The two values should not be close together (say, 0.4 and 0.6).
• Have you trained on roughly equal numbers of ham and spam? This is quite important (don’t go above a 4:1 ratio, for example).
• Have you trained on a reasonable number of hams and spams? You should train on 10 to 20 of each to start with, just to get a decent base. After that, you should be able to train on just the mistakes and the messages classified as unsure.
• Check to be sure you haven’t made any classification mistakes
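The first three checks above can be expressed as a small sanity test. The following is only a sketch, not part of SpamBayes: the function name check_training_setup and its parameters are hypothetical, and the min_gap value of 0.3 is an arbitrary assumption chosen so that the defaults (0.2 and 0.9) pass while 0.4 and 0.6 are flagged. You would need to supply the thresholds and training counts yourself.

```python
def check_training_setup(ham_cutoff, spam_cutoff, n_ham, n_spam,
                         min_gap=0.3, max_ratio=4.0, min_each=10):
    """Return a list of warnings; an empty list means nothing looked wrong.

    Hypothetical helper, not a SpamBayes API. min_gap is an assumed value;
    max_ratio and min_each follow the rules of thumb in the text above.
    """
    warnings = []

    # Thresholds must be ordered and should not sit close together.
    if not 0.0 <= ham_cutoff < spam_cutoff <= 1.0:
        warnings.append("thresholds must satisfy 0 <= ham < spam <= 1")
    elif spam_cutoff - ham_cutoff < min_gap:
        warnings.append("ham and spam thresholds are too close together")

    # Train on at least 10 to 20 of each kind to get a decent base.
    if n_ham < min_each or n_spam < min_each:
        warnings.append("too few training messages (want at least %d of each)"
                        % min_each)

    # Keep ham and spam counts roughly balanced (no worse than 4:1).
    if n_ham > 0 and n_spam > 0:
        ratio = max(n_ham, n_spam) / min(n_ham, n_spam)
        if ratio > max_ratio:
            warnings.append("training set is imbalanced (ratio %.1f:1)" % ratio)

    return warnings


if __name__ == "__main__":
    for message in check_training_setup(0.4, 0.6, 5, 100):
        print("warning:", message)
```

An empty result does not prove the setup is correct. In particular, it cannot detect messages that were trained under the wrong label, so the last check in the list still has to be done by hand.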