Blocking search engine indexing with robots.txt

How to control which pages and directories search engines can index.

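robots.txt is a plain text file in your site's root directory that tells search engine crawlers which parts of the site they may crawl. Well-behaved crawlers request it before fetching any other page. Following it is voluntary, so it guides legitimate bots but does not protect private content.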

You can use it to:

  • Block indexing of specific pages or directories
  • Point search engines to your canonical domain
  • Suggest a crawl delay between page requests

The file belongs in your site's root directory, the same place as your main index.* file. For your primary domain, that's the public_html folder. If robots.txt doesn't exist there yet, just create it.
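
Crawlers look for the file at the root of the host, whichever page they start from. As a rough illustration, the sketch below derives that location from any page URL. The helper name robots_url and the example.com addresses are made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the robots.txt URL for the host that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://example.com/blog/post.html?page=2"))
# https://example.com/robots.txt
```

A robots.txt placed in a subdirectory such as /blog/ would not be picked up, which is why the file has to sit next to the main index file.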

Core directives

  • User-agent - names the crawler that the rules below it apply to. Use * to target all bots.
  • Disallow - tells the crawler not to request paths that start with the given value. An empty value means no restrictions.
  • Crawl-delay - suggests a delay (in seconds) between consecutive page requests. Not every crawler honors it; Googlebot, for example, ignores this directive.
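
One way to see how the three directives interact is to feed a small rule set to Python's standard urllib.robotparser module. This is only a sketch: the crawler name ExampleBot, the /private/ path, and the 10-second delay are assumptions chosen for illustration, and real crawlers may interpret rules differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules that use all three directives for one crawler.
RULES = """\
User-agent: ExampleBot
Crawl-delay: 10
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

private_ok = parser.can_fetch("ExampleBot", "/private/report.html")
about_ok = parser.can_fetch("ExampleBot", "/about.html")
delay = parser.crawl_delay("ExampleBot")

print(private_ok, about_ok, delay)  # False True 10
```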

Examples

Block a specific crawler:

# Block Googlebot
User-agent: Googlebot
Disallow: /

# Block Yandex
User-agent: Yandex
Disallow: /

# Block Bing (bingbot replaced the older MSNBot)
User-agent: bingbot
Disallow: /

# Block Yahoo
User-agent: Slurp
Disallow: /
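
A rule group only affects the crawler it names. The sketch below checks this for the Googlebot group with Python's urllib.robotparser. OtherBot is a made-up crawler name, and this parser's matching is only an approximation of what each search engine does.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
# Block Googlebot
User-agent: Googlebot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

googlebot_ok = parser.can_fetch("Googlebot", "/index.html")
# OtherBot is hypothetical; no group names it and there is no * group.
otherbot_ok = parser.can_fetch("OtherBot", "/index.html")

print(googlebot_ok, otherbot_ok)  # False True
```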

Block all search engines:

User-agent: *
Disallow: /

Block specific directories:

User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
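
Disallow values are matched as path prefixes, so the trailing slash matters. A minimal check with Python's urllib.robotparser, using made-up file names and a hypothetical crawler called ExampleBot:

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# The paths below are illustrative only.
paths = ("/images/logo.png", "/cgi-bin/form.cgi", "/index.html", "/images")
results = {path: parser.can_fetch("ExampleBot", path) for path in paths}

for path, allowed in results.items():
    print(path, allowed)
```

Everything under /images/ and /cgi-bin/ is reported as blocked, while /index.html and the bare /images path (no trailing slash) stay allowed.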

Allow all search engines to index everything:

User-agent: *
Disallow:

An empty Disallow value is equivalent to having no robots.txt file at all — everything is open.
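
The equivalence can be checked roughly with Python's urllib.robotparser. In this sketch a rule set with no lines stands in for a missing file, which is an assumption of the example. The helper allowed_paths, the crawler name ExampleBot, and the sample paths are likewise hypothetical.

```python
from urllib.robotparser import RobotFileParser


def allowed_paths(rules, paths):
    """Parse rules and report, per path, whether ExampleBot may fetch it."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return [parser.can_fetch("ExampleBot", path) for path in paths]


PATHS = ["/", "/images/logo.png", "/cgi-bin/test.cgi"]

open_rules = allowed_paths("User-agent: *\nDisallow:\n", PATHS)
no_rules = allowed_paths("", PATHS)  # stand-in for "no robots.txt"

print(open_rules == no_rules, open_rules)  # True [True, True, True]
```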

Allow only specific crawlers, with a crawl delay:

In the example below, the entire site is blocked for all bots except Yandex, Google, and Rambler. Each of those groups also sets a 4-second delay between page requests. Googlebot ignores Crawl-delay, so the delay only affects crawlers that honor the directive.

User-agent: *
Disallow: /

User-agent: Yandex
Crawl-delay: 4
Disallow:

User-agent: Googlebot
Crawl-delay: 4
Disallow:

User-agent: StackRambler
Crawl-delay: 4
Disallow:
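
To sanity-check a rule set like this before uploading it, the rules can be run through Python's urllib.robotparser. This is a sketch under assumptions: ExampleBot and the /catalog/ path are invented for the test, and the parser only approximates how each search engine reads the file.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: *
Disallow: /

User-agent: Yandex
Crawl-delay: 4
Disallow:

User-agent: Googlebot
Crawl-delay: 4
Disallow:

User-agent: StackRambler
Crawl-delay: 4
Disallow:
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# ExampleBot is a hypothetical crawler that falls under the * group.
report = {
    agent: (parser.can_fetch(agent, "/catalog/"), parser.crawl_delay(agent))
    for agent in ("Yandex", "Googlebot", "StackRambler", "ExampleBot")
}

for agent, (allowed, delay) in report.items():
    print(agent, allowed, delay)
```

The three named crawlers come back as allowed with a delay of 4, while ExampleBot is blocked and has no delay set.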
