What is Language Stemming?
Often a user types a query using one form of a word but would also like to match other forms of what is essentially the same word. In 1980 Dr Martin Porter, a member of the team working on a Probabilistic Model at Cambridge University, developed a suffix-stripping algorithm that has been very widely adopted for normalizing words in information retrieval (IR) systems.

Using Porter’s algorithm the following words can be matched:
“dangerous” with “danger”, “dangers” and “dangerous”
“attacks” with “attack”, “attacks”, “attacker”, “attackers” and “attacking”
“baby” with “baby” and “babies”

In addition, with our fuzzy stemmer the following words can also be matched:
“misspelt” with “mispelt”
“commission” with “commision”, “comission”, “commissioning” and “comisioned”
“accommodate” with “accomodate” and “acomodation”

conceptSearching uses language stemming as part of its concept matching process, although individual words and phrases may be left unstemmed by enclosing them in double quotes. This means that by default s