Can I embed Heritrix in another application?
Sure. Make sure everything in the Heritrix lib directory is on your CLASSPATH, with heritrix.jar found first. Then, using HEAD (post-1.2.0), the following should get you a long way:

    Heritrix h = new Heritrix();
    h.launch();

Your program then needs to stay alive while the crawl runs. See message1276 for an example. See also the answer to the next question and the Embedding Heritrix page on our wiki.
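One way to keep the host program alive while a crawl runs is to block the main thread on a latch that the background work releases when it finishes. The sketch below is only an illustration of that pattern: StandInCrawler, awaitFinished and runAndWait are hypothetical names invented for this example and are not part of Heritrix. How Heritrix itself signals that a crawl has ended is not covered here, so check message1276 and the wiki page for the real hook.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class EmbedSketch {

    /** Hypothetical stand-in for a crawler that works on a background thread. NOT a Heritrix class. */
    static class StandInCrawler {
        private final CountDownLatch finished = new CountDownLatch(1);

        /** Starts the pretend crawl and returns immediately. */
        void launch() {
            Thread worker = new Thread(() -> {
                try {
                    Thread.sleep(200); // pretend to crawl for a moment
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    finished.countDown(); // signal that the crawl is over
                }
            }, "stand-in-crawl");
            worker.setDaemon(true); // a daemon thread will not keep the JVM alive by itself
            worker.start();
        }

        /** Blocks until the pretend crawl ends or the timeout expires. */
        boolean awaitFinished(long timeout, TimeUnit unit) throws InterruptedException {
            return finished.await(timeout, unit);
        }
    }

    /** Launches the stand-in crawl and waits for it; returns true if it finished in time. */
    static boolean runAndWait() throws InterruptedException {
        StandInCrawler crawler = new StandInCrawler();
        crawler.launch();
        return crawler.awaitFinished(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runAndWait() ? "crawl finished" : "timed out");
    }
}
```

In a real embedding, the launch() call would be the Heritrix one, and the latch would be released from whatever end-of-crawl notification your Heritrix version provides.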