
SocSciBot 4 misses a lot of pages on the sites crawled – can this problem be fixed?

The problem might be due to JavaScript, Java or Flash links on the home page or at key points within the site, which the crawler may be unable to follow. To get around this, give the crawler a list of URLs of extra pages in the site to use as alternative starting points. Browse the site yourself and make a text file listing lots of URLs (including the one above). Name the file start.txt and, just before crawling, put it in the folder created by SocSciBot 4 (the folder has the same name as the domain name of your web site). Then check the "Preload start list start.txt" option just before the crawl button. This ensures that all the pages you have found are added to the crawl and, especially if you have found a page such as a site map that links to many pages, the crawler should be able to find more pages.
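
If you have collected many URLs while browsing, a small script can build the start list for you. The following is only a minimal sketch in Python, not part of SocSciBot: the function name write_start_list, the example folder path and the one-URL-per-line layout are all assumptions, so check them against your own SocSciBot 4 project folder before relying on it.

```python
from pathlib import Path
from urllib.parse import urlparse


def write_start_list(urls, crawl_folder, filename="start.txt"):
    """Write unique http(s) URLs to crawl_folder/filename, one per line.

    Assumption: SocSciBot reads the start list as plain text with one
    URL per line. The folder must already exist (SocSciBot creates it).
    """
    folder = Path(crawl_folder)
    if not folder.is_dir():
        raise FileNotFoundError(f"Crawl folder not found: {folder}")

    seen = set()
    cleaned = []
    for url in urls:
        url = url.strip()
        parts = urlparse(url)
        # Skip blank lines and anything that is not an absolute web URL.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue
        # Keep the first occurrence of each URL, preserving order.
        if url not in seen:
            seen.add(url)
            cleaned.append(url)

    path = folder / filename
    path.write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    # Hypothetical URLs and folder path, for illustration only.
    pages = [
        "http://www.example.org/",
        "http://www.example.org/sitemap.html",
        "http://www.example.org/sitemap.html",  # duplicate, dropped
    ]
    print(write_start_list(pages, "www.example.org"))
```

Run it after SocSciBot 4 has created the folder for the site and before you start the crawl, then check the preload option as described above.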
