Robots.txt Validator

Check robots.txt syntax and test URL access rules


How to use

Paste your robots.txt content into the editor. The validator checks every line for syntax errors, unknown directives, and AI bot rules. Use the URL tester to check whether a specific path is allowed or blocked for any user-agent.

Examples

Checking a WordPress robots.txt

Paste your WordPress robots.txt to verify that /wp-admin/ and /wp-includes/ are properly blocked while public content remains accessible. The validator flags any syntax issues and confirms your sitemap URL is valid.
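A typical WordPress robots.txt looks like the following (an illustrative sample, with an assumed example.com sitemap URL, not necessarily your file):

```
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Allow: /wp-admin/admin-ajax.php

Sitemap: https://example.com/sitemap.xml
```

The Allow line re-permits admin-ajax.php, which many WordPress themes and plugins call from public pages.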

Auditing AI bot rules

The validator detects rules targeting AI crawlers like GPTBot, ClaudeBot, and Google-Extended. See exactly which AI bots are blocked and which can still access your content.

Testing a specific URL path

Enter a URL path like /api/internal and select a user-agent to check whether it is allowed or disallowed. The tool shows which rule matched and on which line.
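The same kind of check can be reproduced with Python's standard library (a sketch with hypothetical rules; note that urllib.robotparser applies rules in file order per the original 1994 convention, which can differ from Google's longest-match behaviour when Allow and Disallow patterns overlap):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration
rules = """\
User-agent: *
Disallow: /api/internal
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("Googlebot", "/api/internal"))  # False: matched by the Disallow rule
print(rp.can_fetch("Googlebot", "/blog/post"))     # True: no rule matches
```

The tool additionally reports which rule matched and on which line, which robotparser does not expose.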

Frequently asked questions

What syntax does robots.txt use?

Robots.txt uses simple directives: User-agent (which bot), Allow and Disallow (which paths), Sitemap (where to find your sitemap), and Crawl-delay (seconds between requests). Each rule group starts with a User-agent line followed by one or more Allow/Disallow lines.
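Putting every directive together, a minimal group might look like this (an illustrative sample):

```
User-agent: Bingbot
Crawl-delay: 10
Disallow: /private/
Allow: /private/press/

Sitemap: https://example.com/sitemap.xml
```

Sitemap lines stand outside any group and apply to the whole file.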

How does the URL tester determine if a path is allowed?

It follows the Google specification: find the most specific matching rule (longest path pattern) for the selected user-agent. If no specific user-agent match exists, it falls back to the wildcard (*) rules. No matching rules means the URL is allowed.
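The longest-match resolution can be sketched as follows (a simplification: the `*` and `$` wildcards are omitted, so patterns are treated as plain path prefixes):

```python
def is_allowed(rules, path):
    """rules: list of ("allow" | "disallow", pattern) pairs for one user-agent.
    The longest matching pattern wins; on a tie, Allow wins (least restrictive).
    No match at all means the path is allowed."""
    best_len, allowed = -1, True
    for directive, pattern in rules:
        if path.startswith(pattern):
            is_allow = (directive == "allow")
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed

rules = [("disallow", "/api/"), ("allow", "/api/public/")]
print(is_allowed(rules, "/api/internal"))   # False: /api/ is the longest match
print(is_allowed(rules, "/api/public/x"))   # True: /api/public/ is longer
print(is_allowed(rules, "/blog"))           # True: no rule matches
```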

Which AI bots does the validator detect?

It detects GPTBot, ChatGPT-User, ClaudeBot, Google-Extended, CCBot, Bytespider, Amazonbot, PerplexityBot, Meta-ExternalAgent, and other known AI crawlers. The validator shows which bots have Disallow rules blocking them.
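A simplified version of that detection could look like this (a hypothetical helper, not the tool's implementation; it only catches full `Disallow: /` rules and matches bot names case-sensitively):

```python
AI_BOTS = {"GPTBot", "ChatGPT-User", "ClaudeBot", "Google-Extended",
           "CCBot", "Bytespider", "Amazonbot", "PerplexityBot",
           "Meta-ExternalAgent"}

def blocked_ai_bots(robots_txt):
    """Return the AI bots whose group contains a 'Disallow: /' rule."""
    blocked, agents, in_group = set(), [], False
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (p.strip() for p in line.split(":", 1))
        if field.lower() == "user-agent":
            if in_group:                      # a rule line ended the last group
                agents, in_group = [], False
            agents.append(value)
        elif field.lower() == "disallow":
            in_group = True
            if value == "/":
                blocked.update(a for a in agents if a in AI_BOTS)
    return blocked

sample = "User-agent: GPTBot\nDisallow: /\n\nUser-agent: ClaudeBot\nDisallow: /\n"
print(blocked_ai_bots(sample))  # {'GPTBot', 'ClaudeBot'}
```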

Does Google respect Crawl-delay?

No. Google ignores the Crawl-delay directive. Use Google Search Console to control Googlebot crawl rate. Bing and Yandex do respect Crawl-delay.

Is robots.txt case-sensitive?

Directive names (User-agent, Disallow) are case-insensitive. URL paths in Allow and Disallow rules are case-sensitive. /Admin/ and /admin/ are treated as different paths by most crawlers.
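The path case-sensitivity is easy to demonstrate with Python's standard parser (a sketch with hypothetical rules):

```python
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.parse("User-agent: *\nDisallow: /admin/".splitlines())

print(rp.can_fetch("Googlebot", "/admin/secret"))  # False: exact-case prefix match
print(rp.can_fetch("Googlebot", "/Admin/secret"))  # True: different case, no match
```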

About this tool

Validate your robots.txt file for syntax errors, AI bot rules, and crawl directives. Test if specific URLs are allowed or blocked for any user-agent.

All computation runs locally in your browser. Your data never leaves your device.