Why do I get an OutOfMemoryException ten minutes after starting a broad scoped crawl?

If you are using a 64-bit JVM, see Gordon’s note to the list of 12/19/2005, “Re: Large crawl experience (like, 500M links)”. See also the note in [ 896772 ] “Site-first”/‘frontline’ prioritization and Release Notes section 5.1.1, “Crawl Size Upper Bounds”. For how to mitigate memory use when using HostQueuesFrontier, see Kris’s note from the list (1027). That advice is less applicable if you are running a post-1.2.0 Heritrix that uses BdbFrontier; in that case, see the “Crawl Size Upper Bounds Update” sections in the Release Notes.
