How can I prevent TurnitinBot from accessing certain web pages on my site?
The Robots Exclusion Protocol lets website maintainers tell a crawler which parts of their site it may not access. It also lets the administrator define access rules on a crawler-by-crawler basis.

It works something like this: TurnitinBot visits a website, http://www.somewhere.com. If it has not visited the site before, or has not visited in a while, it first tries to download http://www.somewhere.com/robots.txt. It then examines the robots.txt file for any rules that apply to it. An example of a robots.txt file is:

    #This is an example robots.txt file
    User-agent: *
    Disallow: /secret/
    Disallow: /hide/

Lines starting with # are comments and are ignored by the crawler. The User-agent line indicates which crawler(s) should abide by the rules that follow it. In this case, the * means all crawlers. If the line were instead User-agent: turnitinbot, the rules would apply only to the TurnitinBot crawler. Please note that both tokens, “user-agent” and “turnitinbot”, are case insen