robots.txt: Publish Clear Crawl Rules at the Site Root

Robots.txt is one of the first files crawlers look for when they evaluate a site.

It should be present, intentional, and free of launch-blocking mistakes.

What It Is

The robots.txt file lives at the site root and gives crawl guidance to well-behaved bots. A minimal file that allows all crawling and advertises the sitemap looks like this:

User-agent: *
Allow: /
Sitemap: https://example.com/sitemap.xml
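
To see how a compliant crawler might read the example above, here is a minimal sketch using Python's standard-library urllib.robotparser. The bot name ExampleBot is a placeholder rather than a real crawler, and the file content is pasted in as a string so nothing is fetched over the network.

```python
from urllib.robotparser import RobotFileParser

# The example file from above, inlined so the sketch runs offline.
ROBOTS_TXT = """\
User-agent: *
Allow: /
Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "ExampleBot" is a hypothetical user agent; it falls under the "*" group.
can_crawl_home = parser.can_fetch("ExampleBot", "https://example.com/")

# site_maps() is available in Python 3.8 and later.
sitemaps = parser.site_maps()

print(can_crawl_home)
print(sitemaps)
```

Real crawlers may apply additional matching rules, so treat this as an approximation of their behavior rather than a guarantee.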

Why It Matters

It communicates basic crawl rules before a crawler requests other pages.
It can point crawlers to the sitemap.
It helps you avoid accidentally blocking important public paths.

Best Practices

Publish the file at /robots.txt on each host you serve; crawlers do not look for it in subdirectories.
Keep rules simple unless you have a clear reason for complexity.
Review staging disallow rules before deploying to production.
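
One way to catch a leftover staging rule before deploying is a small pre-release check. The sketch below is a simplified assumption of how such a check could work: the function name blocks_entire_site is hypothetical, and it only looks for a bare Disallow: / inside a User-agent: * group, ignoring Allow overrides and wildcard patterns.

```python
def blocks_entire_site(robots_txt: str) -> bool:
    """Return True if a `User-agent: *` group contains a bare `Disallow: /`."""
    agents = []        # user agents named by the group currently being read
    in_rules = False   # True once the current group has seen a rule line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:
                # A user-agent line after rule lines starts a new group.
                agents, in_rules = [], False
            agents.append(value)
        elif field in ("allow", "disallow"):
            in_rules = True
            if field == "disallow" and value == "/" and "*" in agents:
                return True
    return False


staging = "User-agent: *\nDisallow: /\n"
production = "User-agent: *\nAllow: /\nSitemap: https://example.com/sitemap.xml\n"

print(blocks_entire_site(staging))
print(blocks_entire_site(production))
```

A check like this could run in a deploy pipeline and fail the release when it returns True for the production file.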

Common Mistakes

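Publishing no file at all.
Leaving a staging-only Disallow: / rule in place after launch, which tells compliant crawlers to skip the entire site.
Using robots.txt as if it were a security control. The file is public and compliance is voluntary, so it does not protect private content.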

Quick Checklist

File exists at the site root.
Important pages are not blocked.
Sitemap location is included when useful.
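
The "important pages are not blocked" item can be scripted. This sketch assumes you maintain your own list of important paths; the blocked_paths helper, the sample paths, and the example.com base URL are all illustrative. Python's urllib.robotparser uses simple prefix matching, which may differ from how a particular search engine interprets wildcard rules.

```python
from urllib.robotparser import RobotFileParser


def blocked_paths(robots_txt, base_url, important_paths, user_agent="*"):
    """Return the subset of important_paths that the rules disallow."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [
        path
        for path in important_paths
        if not parser.can_fetch(user_agent, base_url + path)
    ]


# Hypothetical rules and paths, for illustration only.
sample_rules = "User-agent: *\nDisallow: /admin/\n"
sample_paths = ["/", "/pricing", "/admin/login"]

result = blocked_paths(sample_rules, "https://example.com", sample_paths)
print(result)
```

An empty list means none of the listed paths are blocked for the chosen user agent.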

Final Takeaway

Robots.txt should guide discovery, not accidentally suppress it.
