What's a spider trap?
A spider trap means different things to different people. To some it's just a method of identifying crawlers as they browse your site; to others it's an interactive extension of their logfiles, a way of determining whether a crawler is good or bad by monitoring where it browses, or a way to sabotage bad crawlers outright. My personal definition is the first one, "a method of identifying crawlers as they browse your site", because it is extremely flexible and can be adapted to meet a specific need.

At the heart of each of these techniques you will find the same basic concept: the need to identify crawlers in real time, and that is what I intend to cover in this article. If you took your average spider trap (also called a "bot trap" or "crawler trap") and stripped it back to its most basic components, you would find that what you are left with is a glorified sorting and filtering mechanism, something that lets you determine whether the current request is coming from a genuine human visitor or from a crawler.
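To make that idea concrete, here is a minimal sketch of such a sorting and filtering mechanism, written in Python purely for illustration. The function name, the tiny list of crawler signatures and the `/trap/` path are my own assumptions rather than part of any particular trap; the point is simply that each request is classified as it arrives.

```python
import re

# A few well-known crawler User-Agent fragments (illustrative, not exhaustive).
KNOWN_CRAWLER_SIGNATURES = [
    "googlebot",
    "bingbot",
    "slurp",        # Yahoo
    "baiduspider",
    "yandexbot",
]

def classify_request(user_agent: str, requested_path: str) -> str:
    """Return 'crawler', 'suspect' or 'human' for a single request."""
    ua = (user_agent or "").lower()

    # 1. Obvious, self-identifying crawlers.
    if any(sig in ua for sig in KNOWN_CRAWLER_SIGNATURES):
        return "crawler"

    # 2. Requests for a trap URL that no human-facing link points to
    #    (for example, a path only ever listed as disallowed in robots.txt).
    if requested_path.startswith("/trap/"):
        return "suspect"

    # 3. Empty or script-like User-Agents are worth flagging too.
    if not ua or re.match(r"^(curl|wget|python-requests)", ua):
        return "suspect"

    return "human"


if __name__ == "__main__":
    print(classify_request("Mozilla/5.0 (compatible; Googlebot/2.1)", "/index.html"))  # crawler
    print(classify_request("Mozilla/5.0 (Windows NT 10.0)", "/trap/hidden-page"))      # suspect
    print(classify_request("Mozilla/5.0 (Windows NT 10.0)", "/index.html"))            # human
```

In a real trap this classification would be run for every request (for example in a server-side handler) and the result fed into whatever you want to do with it: logging, blocking or simply tagging the visit for later analysis.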