Test your robots.txt rules against any user-agent and URL. Validate Allow/Disallow directives, spot errors and see exactly which pages are blocked or permitted for crawling.
AI crawlers: GPTBot (OpenAI), anthropic-ai (Anthropic), PerplexityBot (Perplexity), CCBot (Common Crawl)
Enter your website URL to automatically fetch your live robots.txt, or paste the raw robots.txt content directly into the editor. You can edit the content to test proposed changes before deploying.
Select a user-agent from the list (Googlebot, Bingbot, GPTBot, or any custom bot name) and enter the page URL you want to test — for example /resources/private-post or /admin/. The user-agent selector ensures you test the exact rules relevant to each crawler.
The tester instantly shows whether the URL is Allowed or Disallowed for the selected bot, and highlights the exact directive that caused the result so you know precisely which rule to edit.
Disallow: / (blocks everything)
Allow: / (opens everything)
Crawl-delay: (throttles bot speed)
Sitemap: (declares XML sitemap)
A robots.txt file is a plain-text file placed at the root of your website (e.g. example.com/robots.txt) that tells search engine crawlers which pages or sections of your site they may or may not access. It is part of the Robots Exclusion Protocol and is respected by all major search crawlers, including Googlebot, Bingbot and Baiduspider.
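Blocking a page in robots.txt does not keep it out of search results: robots.txt only controls crawling, not indexing. If other sites link to a blocked page, Google may still index the URL based on those external links. To prevent a page from appearing in search results, use a noindex meta tag or X-Robots-Tag header instead, and leave the page crawlable so the crawler can see that instruction.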
Disallow: / blocks the specified user-agent from crawling your entire website. It is the broadest possible restriction. If it applies to Googlebot, Google stops crawling the site, and your pages will typically lose their snippets and rankings and drop out of search results over time. Always double-check this directive before deploying: it is one of the most common and damaging SEO mistakes.
Crawl-delay tells a bot how many seconds to wait between consecutive requests to your server. For example, Crawl-delay: 5 asks the bot to wait 5 seconds between page fetches. It is a non-standard directive, so support varies by crawler. Googlebot ignores it: Google sets its crawl rate automatically, and the crawl rate limiter that used to be available in Google Search Console was retired in early 2024.
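For crawlers that do honor Crawl-delay, reading the value might look like the sketch below. It uses Python's standard urllib.robotparser module; the bot name ExampleBot, the helper name delay_for and the sample file are assumptions made up for illustration.

```python
from typing import Optional
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content used only for this illustration.
ROBOTS_TXT = """\
User-agent: ExampleBot
Crawl-delay: 5
Disallow: /admin/
"""


def delay_for(robots_txt: str, agent: str) -> Optional[int]:
    """Return the Crawl-delay (in seconds) that applies to `agent`, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(agent)


if __name__ == "__main__":
    # A polite crawler would sleep this many seconds between requests.
    print(delay_for(ROBOTS_TXT, "ExampleBot"))
```

A bot with no matching group and no wildcard group gets None back, which a crawler would typically replace with its own default pause.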
Add a User-agent: GPTBot block followed by Disallow: / to prevent OpenAI's GPTBot from crawling your site. Similarly, User-agent: CCBot with Disallow: / blocks Common Crawl. Our tester lets you select these AI bots from the user-agent dropdown and test your rules against them specifically.
robots.txt supports multiple User-agent sections, each with its own Allow and Disallow rules. You can allow Googlebot full access while blocking other crawlers from specific directories. The wildcard User-agent: * applies to all crawlers not otherwise specified in the file.
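As a rough illustration of per-crawler groups, the sketch below checks one file against several bots with Python's standard urllib.robotparser module. The file content, the helper name can_crawl, the example.com URLs and the bot name SomeOtherBot are assumptions for the example, not part of this tool.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file: Googlebot gets full access, two AI crawlers are blocked,
# and every other bot is kept out of /private/ only.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def can_crawl(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "GPTBot", "CCBot", "SomeOtherBot"):
        for url in ("https://example.com/", "https://example.com/private/report"):
            print(agent, url, can_crawl(ROBOTS_TXT, agent, url))
```

Each bot is matched against its own group first; only bots without a named group fall back to the wildcard group.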
Allow: explicitly permits a crawler to access a URL or path, even within a broader Disallow: section. This is useful for blocking /private/ but allowing /private/sitemap.xml. In cases of conflict, the longer, more specific rule takes precedence.
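The precedence rule can be sketched as a small function. This is a simplified illustration, not the tester's actual code: it only handles plain path prefixes (no * or $ wildcards), and the names Rule, decide and RULES are made up for the example. Ties between equally long rules go to Allow here, following the recommendation in RFC 9309.

```python
from typing import List, Optional, Tuple

# (directive, path prefix), for example ("Disallow", "/private/")
Rule = Tuple[str, str]


def decide(rules: List[Rule], path: str) -> Tuple[bool, Optional[Rule]]:
    """Return (allowed, winning_rule) using longest-match precedence."""
    winner: Optional[Rule] = None
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # empty or non-matching rules never win
        if winner is None or len(prefix) > len(winner[1]):
            winner = (directive, prefix)
        elif len(prefix) == len(winner[1]) and directive == "Allow":
            winner = (directive, prefix)  # tie: prefer the less restrictive rule
    if winner is None:
        return True, None  # no rule matches, so crawling is allowed by default
    return winner[0] == "Allow", winner


RULES: List[Rule] = [
    ("Disallow", "/private/"),
    ("Allow", "/private/sitemap.xml"),
]

if __name__ == "__main__":
    for path in ("/private/sitemap.xml", "/private/notes.html", "/blog/"):
        print(path, decide(RULES, path))
```

Returning the winning rule alongside the verdict is what lets a tester point at the exact directive to edit. Note that Python's built-in urllib.robotparser applies the first matching rule in file order instead, so rule order can change its answer.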
Add a Sitemap: directive at the end of your robots.txt file, for example: Sitemap: https://example.com/sitemap.xml. You can list multiple sitemaps. This helps search engines discover your sitemap even if they arrive only at your robots.txt file.
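To see what a crawler would pick up from these lines, the sketch below reads the Sitemap entries with Python's standard urllib.robotparser module (site_maps() needs Python 3.8 or later). The helper name list_sitemaps and the second sitemap URL are assumptions for illustration.

```python
from typing import List
from urllib.robotparser import RobotFileParser

# Hypothetical file listing two sitemaps after the rule group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/news-sitemap.xml
"""


def list_sitemaps(robots_txt: str) -> List[str]:
    """Return every Sitemap URL declared in the robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    for url in list_sitemaps(ROBOTS_TXT):
        print(url)
```

Sitemap lines are independent of User-agent groups, so the parser collects them wherever they appear in the file.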
When you enter a domain URL, the tool attempts to fetch the live robots.txt from example.com/robots.txt via a CORS proxy. You can also paste your robots.txt content directly if you want to test a draft version before going live. All processing happens in your browser.
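Outside the browser, the same fetch step could be approximated as below. This is a standalone Python sketch, not the tool's implementation (which runs client-side through a proxy); the function names robots_url and fetch_robots, the HTTPS default and the 10-second timeout are assumptions.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def robots_url(site_url: str) -> str:
    """Build the robots.txt URL for the host of any page or domain URL."""
    if "://" not in site_url:
        site_url = "https://" + site_url  # assumption: default to HTTPS
    parts = urlsplit(site_url)
    # Keep scheme and host, replace path, drop query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def fetch_robots(site_url: str, timeout: float = 10.0) -> str:
    """Download the live robots.txt text (performs a network request)."""
    with urlopen(robots_url(site_url), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(robots_url("https://example.com/blog/post?x=1"))
```

Whatever page URL is entered, only the scheme and host matter, because robots.txt always lives at the root of the host.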
A valid robots.txt file starts with one or more User-agent: lines specifying which bot the following rules apply to, followed by Disallow: and/or Allow: directives. Each block should be separated by a blank line. Comments start with #. The file must be served at the exact path /robots.txt with a 200 HTTP status and text/plain content-type.
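A few of these format rules can be checked mechanically. The sketch below is a deliberately small linter, not a full validator: it flags lines without a colon, unknown directive names, and rules that appear before any User-agent line. The names KNOWN, lint, GOOD and BAD are made up for the example, and it does not check the HTTP status or content type.

```python
from typing import List, Tuple

KNOWN = {"user-agent", "allow", "disallow", "crawl-delay", "sitemap"}
NEEDS_GROUP = {"allow", "disallow", "crawl-delay"}


def lint(robots_txt: str) -> List[Tuple[int, str]]:
    """Return (line_number, problem) pairs for basic format errors."""
    problems: List[Tuple[int, str]] = []
    seen_agent = False
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue  # blank lines and comment-only lines are fine
        if ":" not in line:
            problems.append((number, "missing ':' separator"))
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field not in KNOWN:
            problems.append((number, f"unknown directive '{field}'"))
        elif field == "user-agent":
            if not value:
                problems.append((number, "empty User-agent value"))
            seen_agent = True
        elif field in NEEDS_GROUP and not seen_agent:
            problems.append((number, f"'{field}' appears before any User-agent line"))
    return problems


GOOD = "# staging rules\nUser-agent: *\nDisallow: /admin/\n\nSitemap: https://example.com/sitemap.xml\n"
BAD = "Disallow: /tmp/\nUser-agent *\nDisalow: /x\n"

if __name__ == "__main__":
    for number, problem in lint(BAD):
        print(f"line {number}: {problem}")
```

Reporting the line number with each problem mirrors how a tester can highlight the exact line that needs editing.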