What can HarvestMan be used for?
HarvestMan is a desktop tool for web search and data gathering that runs on the client side. As of the most recent version, HarvestMan can be used for:

• Downloading a website or a part of it.
• Downloading certain files from a website (those matching certain patterns).
• Searching a website for keywords and downloading the files that contain them.
• Scanning a website for links and downloading them selectively using filters.

1.4. What HarvestMan cannot be used for…

HarvestMan is a small-to-medium-sized web crawler, mostly intended for personal use or for use by a small group. It cannot be used for massive data harvesting from the web. However, a project to create a large-scale, distributed web crawler based on HarvestMan is underway. It is called ‘Distributed HarvestMan’, or ‘D-HarvestMan’ for short. D-HarvestMan is currently at the prototype stage. Projects like EIAO have been able to customize HarvestMan for medium-to-large-scale data gathering from the Internet. The EIAO project uses HarvestMan to download as many as 100,000 file