How can I prevent TurnitinBot from accessing certain web pages on my site?

April 26, 2017

Tags: certain, pages, site, turnitinbot, web

1 Answer

The Robots Exclusion Protocol lets website maintainers tell a crawler which parts of their site it may not access. It also lets the administrator define access rules on a crawler-by-crawler basis.

It works something like this: TurnitinBot visits a website, http://www.somewhere.com. If it has not been there before, or not for a while, it tries to download http://www.somewhere.com/robots.txt. It then examines the robots.txt file for any rules that apply to it.

An example of a robots.txt file is:

    #This is an example robots.txt file
    User-agent: *
    Disallow: /secret/
    Disallow: /hide/

Lines starting with # are comments and are ignored by the crawler. The User-agent line indicates which crawler(s) should abide by the rules. In this case, * means all crawlers. If it were User-agent: turnitinbot, the rules would apply only to the TurnitinBot crawler. Please note that both tokens, “user-agent” and “turnitinbot”, are case insen