Can I use WebSPHINX to crawl the entire Web, like search engines do?

April 26, 2017

1 Answer

WebSPHINX isn’t designed for crawls on that scale. Search engines typically use distributed crawlers running on farms of PCs, with a high-bandwidth network connection and a distributed filesystem or database to manage the crawl frontier and store page data. WebSPHINX is intended more for personal use, to crawl perhaps a hundred or a thousand web pages. If you do want to use WebSPHINX for large crawls, you should definitely read the next question about memory usage first.
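
To make the idea of a crawl frontier concrete, here is a minimal sketch of a small, single-process crawl with a page cap. It is written in Python for brevity and does not use WebSPHINX's API. The names `bounded_crawl`, `fetch_links`, `TOY_WEB`, and the example.org URLs are all made up for illustration, and an in-memory link graph stands in for real HTTP fetches.

```python
from collections import deque


def bounded_crawl(fetch_links, root, max_pages=1000):
    """Breadth-first crawl from root, stopping after max_pages pages.

    fetch_links is a hypothetical callable: url -> iterable of outgoing URLs.
    """
    frontier = deque([root])  # URLs discovered but not yet visited
    seen = {root}             # every URL ever queued, all held in memory
    visited = []              # pages visited, in visit order
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        visited.append(url)
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited


# Toy link graph standing in for real pages (made-up URLs).
TOY_WEB = {
    "http://example.org/": ["http://example.org/a", "http://example.org/b"],
    "http://example.org/a": ["http://example.org/b", "http://example.org/c"],
    "http://example.org/b": ["http://example.org/"],
    "http://example.org/c": [],
}


def toy_fetch_links(url):
    """Return the outgoing links of a page in the toy graph."""
    return TOY_WEB.get(url, [])


if __name__ == "__main__":
    print(bounded_crawl(toy_fetch_links, "http://example.org/", max_pages=3))
```

In this sketch both the frontier and the set of seen URLs live in one process's memory, which is workable for hundreds or thousands of pages. A search-engine-scale crawler would instead keep that state in a distributed filesystem or database, as described above.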