How does the appliance determine whether a URL has changed since the previous crawl?


Unless the URL matches a pattern configured in the “Force Recrawl” field, the appliance sends a HEAD request to read the document's Last-Modified date. If the host reports that the document has not changed in the last 20 days and the indexed version is less than 20 days old, the appliance assumes that the index already holds the most recent version of the document. When a document is downloaded, the appliance compares a content checksum to determine whether the document has actually changed.
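
The decision described above can be sketched roughly as follows. This is only an illustration, not the appliance's actual code: the function names, the use of regular expressions for the Force Recrawl patterns, the choice of SHA-256 as the checksum, and the choice to download when no usable Last-Modified header is returned are all assumptions made for the example.

```python
import hashlib
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# The 20-day window comes from the answer above.
FRESHNESS_WINDOW = timedelta(days=20)


def needs_download(url, last_modified_header, indexed_at,
                   force_recrawl_patterns, now):
    """Return True if the document should be fetched again (hypothetical helper)."""
    # URLs matching a Force Recrawl pattern skip the Last-Modified check.
    # Treating the patterns as regular expressions is an assumption.
    if any(re.search(p, url) for p in force_recrawl_patterns):
        return True

    # Assumption: with no usable Last-Modified header, download to be safe.
    if last_modified_header is None:
        return True
    try:
        last_modified = parsedate_to_datetime(last_modified_header)
    except (TypeError, ValueError):
        return True
    if last_modified.tzinfo is None:
        last_modified = last_modified.replace(tzinfo=timezone.utc)

    unchanged_for_window = now - last_modified >= FRESHNESS_WINDOW
    index_is_recent = now - indexed_at < FRESHNESS_WINDOW
    # Only when both conditions hold is the indexed copy assumed current.
    return not (unchanged_for_window and index_is_recent)


def content_changed(body, stored_checksum):
    """Compare a checksum of the downloaded bytes with the stored one.

    SHA-256 is an assumption; the answer does not name the algorithm.
    """
    return hashlib.sha256(body).hexdigest() != stored_checksum
```

In this sketch, `needs_download` would be called with the Last-Modified value from the HEAD response, and `content_changed` only after a download, so an unchanged checksum can still mark the document as not modified.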
