Does Harvest support the Robot Exclusion Protocol?
Yes. Both robots.txt files and META robots tags are supported. The correct format for robots.txt files is documented at http://info.webcrawler.com/mak/projects/robots/norobots-rfc.html. Harvest may have problems gathering from sites that have malformed robots.txt files. The format for META robots tags, which give users control over indexing on a page-by-page basis, is available from http://info.webcrawler.com/mak/projects/robots/meta-user.
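
As an illustration of the two mechanisms, the following minimal Python sketch shows how a robots.txt file and a META robots tag are typically interpreted. It is not Harvest's own code: it uses Python's standard library, and the user-agent name "ExampleGatherer", the example.com URLs, and the helper function names are assumptions made up for the example.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# A well-formed robots.txt: every robot is kept out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# A page that opts out of indexing and link following via a META robots tag.
PAGE = '<html><head><meta name="ROBOTS" content="NOINDEX, NOFOLLOW"></head></html>'


def allowed_by_robots_txt(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class MetaRobotsParser(HTMLParser):
    """Collects the directives of any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def meta_robots_directives(html):
    """Return the set of META robots directives found in html."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return parser.directives


# "ExampleGatherer" is a hypothetical user-agent name.
print(allowed_by_robots_txt(ROBOTS_TXT, "ExampleGatherer",
                            "http://www.example.com/private/notes.html"))
print(allowed_by_robots_txt(ROBOTS_TXT, "ExampleGatherer",
                            "http://www.example.com/index.html"))
print(sorted(meta_robots_directives(PAGE)))
```

The robots.txt check applies to a whole site and happens before a page is fetched, while the META tag can only be read after the page has been retrieved, which is why it gives page-by-page control.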