
Does SpamBayes work with non-English languages?


SpamBayes was developed by English speakers and has therefore had very little testing with other languages. There are some anecdotal reports that it doesn’t work as well with Western European languages. It might work very well with them if the following default values are changed in the user’s ini file (note that for Outlook users, this means the default_bayes_customize.ini file, rather than the one called Outlook.ini or the one named after your profile):

    [Tokenizer]
    replace_nonascii_chars: True
    skip_max_word_size: 12

The first setting causes all non-ASCII characters to be replaced by a question mark. For non-English languages the setting should probably be False.

The second setting causes all words longer than 12 characters to yield a “skip: X NNN” token instead of the word itself, where X is the first letter of the word and NNN is the word length. For languages like German, this can be especially troublesome, because an inordinate number of words will yield tokens like “skip: ? 17” because th
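To see how the two settings described above interact, here is a minimal Python sketch. It is not SpamBayes's own tokenizer: the function names `replace_nonascii` and `tokenize` are hypothetical, words are split on whitespace only, and the token format simply follows the “skip: X NNN” description in this answer (the real tokenizer may format or bucket the length differently).

```python
def replace_nonascii(text):
    """Replace every non-ASCII character with a question mark."""
    return "".join(ch if ord(ch) < 128 else "?" for ch in text)


def tokenize(text, replace_nonascii_chars=True, skip_max_word_size=12):
    """Simplified illustration of the two [Tokenizer] options.

    Hypothetical helper, not the SpamBayes API: it only mimics the
    behaviour described in the answer above.
    """
    if replace_nonascii_chars:
        text = replace_nonascii(text)
    tokens = []
    for word in text.split():
        if len(word) > skip_max_word_size:
            # Long word: keep only its first character and its length.
            tokens.append("skip: %s %d" % (word[0], len(word)))
        else:
            tokens.append(word)
    return tokens


word = "Überraschungsparty"  # 18 characters, starts with a non-ASCII letter

# Defaults: the leading umlaut becomes "?" and the word is too long.
print(tokenize(word))
# Keep non-ASCII characters, but still skip long words.
print(tokenize(word, replace_nonascii_chars=False))
# Keep non-ASCII characters and raise the length limit.
print(tokenize(word, replace_nonascii_chars=False, skip_max_word_size=20))
```

With the defaults, the sketch reduces the word to a token that carries only a question mark and a length, which is the kind of information loss the answer warns about. Setting replace_nonascii_chars to False and raising skip_max_word_size lets the full word through as its own token.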
